Is there a standard database of records mapping city/state/zip to lat/lng? I found this database, and the USPS has a simple (no lat/lng) API, but is there another recommended alternative?
Update: Found this too: http://www.geopostcodes.com/
Just a small note. Most of these third-party city/state/zip to lat/lng databases are based on the US Census TIGER data. Starting with Census 2000, the ZIP Code data was replaced with ZCTAs, which are an approximation of ZIP Codes. Here's an explanation from the Census site (from 2011):
ZIP Code Tabulation Areas (ZCTAs™) are a new statistical entity developed by the U.S. Census Bureau for tabulating summary statistics from Census 2000. This new entity was developed to overcome the difficulties in precisely defining the land area covered by each ZIP Code®. Defining the extent of an area is necessary in order to accurately tabulate census data for that area.
ZCTAs are generalized area representations of U.S. Postal Service (USPS) ZIP Code service areas. Simply put, each one is built by aggregating the Census 2000 blocks, whose addresses use a given ZIP Code, into a ZCTA which gets that ZIP Code assigned as its ZCTA code. They represent the majority USPS five-digit ZIP Code found in a given area. For those areas where it is difficult to determine the prevailing five-digit ZIP Code, the higher-level three-digit ZIP Code is used for the ZCTA code. For more information, please refer to the ZCTA (FAQ) Frequently Asked Questions Web page.
The link below is an updated explanation (2013):
http://www.census.gov/geo/reference/zctas.html
The OpenGeoCode.Org team
ADDED 12/17/13: Our (FREE) state/city/zip dataset (CSV) can be found at the link below. It is derived from public domain "government" datasets:
http://www.opengeocode.org/download.php#cityzip
Google offers this as a lookup. Can you make AJAX calls from your app?
It's called web services.
http://code.google.com/apis/maps/documentation/webservices/index.html
You'd want to use the Google Geocoding API.
It's simple to use. Make a call to this URL:
http://maps.googleapis.com/maps/api/geocode/json?address=sydney&sensor=false
Change the value of "address=" to whatever you need (i.e. the city, state, and ZIP code).
It can also reply in XML; just change json to xml in the URL.
http://code.google.com/apis/maps/documentation/geocoding/
Example Result
{
  "status": "OK",
  "results": [ {
    "types": [ "locality", "political" ],
    "formatted_address": "Sydney New South Wales, Australia",
    "address_components": [ {
      "long_name": "Sydney",
      "short_name": "Sydney",
      "types": [ "locality", "political" ]
    }, {
      "long_name": "New South Wales",
      "short_name": "New South Wales",
      "types": [ "administrative_area_level_1", "political" ]
    }, {
      "long_name": "Australia",
      "short_name": "AU",
      "types": [ "country", "political" ]
    } ],
    "geometry": {
      "location": {
        "lat": -33.8689009,
        "lng": 151.2070914
      },
      "location_type": "APPROXIMATE",
      "viewport": {
        "southwest": {
          "lat": -34.1648540,
          "lng": 150.6948538
        },
        "northeast": {
          "lat": -33.5719182,
          "lng": 151.7193290
        }
      },
      "bounds": {
        "southwest": {
          "lat": -34.1692489,
          "lng": 150.5022290
        },
        "northeast": {
          "lat": -33.4245980,
          "lng": 151.3426361
        }
      }
    }
  } ]
}
Then all you need to do is read results[0].geometry.location.lat and results[0].geometry.location.lng.
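As a rough illustration of that last step, here is a minimal Python sketch that pulls the coordinates out of a response shaped like the example above. The function name extract_lat_lng and the trimmed SAMPLE_RESPONSE string are my own, not part of Google's API; a real response would come from an HTTP call to the geocoding URL.

```python
import json

# Trimmed copy of the example response above (illustrative only).
SAMPLE_RESPONSE = """
{
  "status": "OK",
  "results": [ {
    "formatted_address": "Sydney New South Wales, Australia",
    "geometry": {
      "location": { "lat": -33.8689009, "lng": 151.2070914 },
      "location_type": "APPROXIMATE"
    }
  } ]
}
"""


def extract_lat_lng(response_text):
    """Return (lat, lng) of the first result, or None if there is no result."""
    payload = json.loads(response_text)
    if payload.get("status") != "OK" or not payload.get("results"):
        return None
    location = payload["results"][0]["geometry"]["location"]
    return location["lat"], location["lng"]


print(extract_lat_lng(SAMPLE_RESPONSE))
```

Checking "status" before indexing into "results" avoids an IndexError when the address cannot be geocoded.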
[EDIT 8/3/2015]
The free non-commercial ZIP Code database I mentioned below has moved to softwaretools.com. Note: greatdata.com still has the premium ZIP Code data for enterprises.
Just a small note. Most of these 3rd party city/state/zip to lat/lng databases are based on the US Census Tiger data. [Andrew]
I'm a developer for a commercial ZIP Code database company (GreatData). For low-end data, Andrew's recommendation is correct, and if you know your way around census data, it's pretty easy to get. Just know that it may initially take some hours to get it right. If you prefer not to do the work yourself, you can get our free/non-commercial version here. It's pretty much what Andrew is suggesting, with minor enhancements, and it's updated every couple of months.
For a really good explanation of what is missing in it (and, more importantly, what's missing in almost all low-end ZIP Code data based on census ZCTA data) versus commercial-grade data, see here.
PS: regarding suggestions to use Google's API, I see this suggested a lot, but unless you're displaying the results on a Google map, this violates Google's TOS. Specifically: "The Geocoding API may only be used in conjunction with a Google map; geocoding results without displaying them on a map is prohibited." You'll find Stack Overflow has several threads from people whose sites have been blocked.
Hope this is beneficial
This is an old thread, but I was looking for the same information recently and also came across this free database:
http://federalgovernmentzipcodes.us
Check out ZCTA (ZIP Code Tabulation Areas). You can draw the geographic boundaries of a ZIP code using that data.
If you have a small number of US cities you can easily build up your own database from Google which gives you the co-ordinates straight on the search page without the need to follow any links, e.g. type:
chicago illinois longitude latitude
{
  "blogid": 11,
  "blog_authorid": 2,
  "blog_content": "(this is blog complete content: html encoded on base64 such as) PHNlY3Rpb24+PGRpdiBjbGFzcz0icm93Ij4KICAgICAgICA8ZGl2IGNsYXNzPSJjb2wtc20tMTIiIGRhdGEtdHlwZT0iY29udGFpbmVyLWNvbnRlbn",
  "blog_timestamp": "2018-03-17 00:00:00",
  "blog_title": "Amazon India Fashion Week: Autumn-",
  "blog_subtitle": "",
  "blog_featured_img_link": "link to image",
  "blog_intropara": "Introductory para to article",
  "blog_status": 1,
  "blog_lastupdated": "\"Mar 19, 2018 7:42:23 AM\"",
  "blog_type": "Blog",
  "blog_tags": "1,4,6",
  "blog_uri": "Amazon-India-Fashion-Week-Autumn",
  "blog_categories": "1",
  "blog_readtime": "5",
  "ViewsCount": 0
}
Above is one sample blog as returned by my API. I have a JsonArray of such blogs.
I am trying to predict the 3 most similar blogs based on a blog's properties (e.g. tags, categories, author, keywords in the title/subtitle) and its contents. I have no user data, i.e. there is no logged-in user data such as ratings or reviews. I know that without user data it will not be accurate, but I'm just getting started with data science and ML. Any suggestion or link is appreciated. I prefer Java, but Python, PHP, or any other language also works for me. I need an easy-to-implement model as I am a beginner. Thanks in advance.
My intuition is that this question might not be at the right address.
BUT
I would do the following:
Create a dataset of sites that would be the inventory from which to predict. For each site you will need to list one or more features: number of tags, number of posts, average time between posts in days, etc.
It sounds like this is for training and you are not too worried about accuracy, so numeric features should suffice.
Work back from a k-NN algorithm. Don't worry about the classifiers. Instead of classifying a blog, you list the 3 closest neighbors (k = 3). A good implementation of the algorithm is here. Have fun simplifying it for your purposes.
Your algorithm should be a step or two shorter than k-NN which is considered to be among simpler ML, a good place to start.
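To make the "k-NN without the classifier" idea concrete, here is a minimal Python sketch that returns the 3 nearest neighbors by Euclidean distance. The function name nearest_blogs and the feature vectors are hypothetical: blog 11 uses values read off the sample JSON (3 tags, 5 minutes read time, 1 category), and the other rows are invented for illustration. In practice you would probably scale the features so that no single one dominates the distance.

```python
import math


def nearest_blogs(target_id, features, k=3):
    """Return the ids of the k blogs whose feature vectors are closest to the target."""
    target = features[target_id]
    distances = []
    for blog_id, vector in features.items():
        if blog_id == target_id:
            continue  # a blog is not its own neighbor
        distances.append((math.dist(target, vector), blog_id))
    distances.sort()  # smallest distance first; ties broken by blog id
    return [blog_id for _, blog_id in distances[:k]]


# Hypothetical features per blog id: (tag count, read time in minutes, category count)
features = {
    11: (3, 5, 1),
    12: (3, 6, 1),
    13: (1, 20, 4),
    14: (4, 5, 1),
    15: (2, 4, 2),
}

print(nearest_blogs(11, features))
```

Note that math.dist requires Python 3.8 or later.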
Good luck.
EDIT:
You want to build a recommender engine using text, tags, numeric and maybe time series data. This is a broad request. Just like you, when faced with this request, I'd need to dive into the data and research the best approach. Different approaches require different sets of data, e.g. collaborative vs. content-based filtering.
A few things may have been missed on the user side that can be used as a sort of rating. You do not need a login feature to get information: cookie ID or IP-based DMA, geo and viewing duration should be available to the web server.
On the blog side, you need to process the texts to identify related terms. I gave examples of other blog features above.
I am aware that this is a lot of hand-waving, but there's no actual code question here. To reiterate, my intuition is that this question might not be at the right address.
I really want to help but this is the best I can do.
EDIT 2:
If I understand your new comments correctly, each blog has the following for each other blog:
A Jaccard similarity coefficient.
A set of TF-IDF generated words with scores.
A Euclidean distance based on numeric data.
I would create a heuristic from these and allow the process to adjust the importance of each statistic.
The challenge would be to quantify the words-scores TF-IDF output. You can treat those (over a certain score) as tags and run another similarity analysis, or count overlap.
You have already started on this path, and this answer assumes you will continue. IMO the best path is to see which dedicated recommender engines can help you without constructing statistics piecemeal (numeric with Euclidean, tags with Jaccard, text with TF-IDF).
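For what the piecemeal heuristic could look like, here is a small Python sketch that mixes the three statistics into one score. Everything in it is an assumption for illustration: the weights, the field names (tags, keywords, numeric), the choice to treat high-scoring TF-IDF words as a plain keyword set, and the sample blogs themselves.

```python
import math


def jaccard(a, b):
    """Jaccard similarity of two collections: |A & B| / |A | B|."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def combined_score(x, y, weights=(0.5, 0.3, 0.2)):
    """Weighted mix of tag overlap, keyword overlap and numeric closeness."""
    w_tags, w_words, w_num = weights
    tag_sim = jaccard(x["tags"], y["tags"])
    # Keywords are assumed to be the TF-IDF words above some score cutoff.
    word_sim = jaccard(x["keywords"], y["keywords"])
    # Turn a Euclidean distance into a similarity in (0, 1].
    num_sim = 1.0 / (1.0 + math.dist(x["numeric"], y["numeric"]))
    return w_tags * tag_sim + w_words * word_sim + w_num * num_sim


def top_similar(target_id, blogs, k=3):
    """Return the ids of the k blogs with the highest combined score."""
    target = blogs[target_id]
    scored = [(combined_score(target, other), blog_id)
              for blog_id, other in blogs.items() if blog_id != target_id]
    scored.sort(reverse=True)
    return [blog_id for _, blog_id in scored[:k]]


# Hypothetical data; "numeric" holds just the read time in minutes here.
blogs = {
    11: {"tags": {1, 4, 6}, "keywords": {"fashion", "week", "autumn"}, "numeric": (5,)},
    12: {"tags": {1, 4}, "keywords": {"fashion", "week", "summer"}, "numeric": (6,)},
    13: {"tags": {2}, "keywords": {"cricket"}, "numeric": (20,)},
    14: {"tags": {1, 4, 6}, "keywords": {"fashion", "autumn"}, "numeric": (5,)},
    15: {"tags": {6}, "keywords": {"autumn", "recipes"}, "numeric": (4,)},
}

print(top_similar(11, blogs))
```

Adjusting the weights tuple is the simplest way to let the process change the importance of each statistic.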
I have this GeoJSON for bike paths and I can't seem to figure out what the coordinates represent. I was expecting longitude and latitude, but that doesn't seem to be it.
Here is an example:
"geometry":{
  "type":"LineString",
  "coordinates":[
    [
      305049.6192401955,
      5061761.891977313
    ],
    [
      305038.71863293805,
      5061778.694289856
    ]
  ]
}
The source of the data can be found here: bike path data
Unfortunately, it is only in French, but the data is under the GeoJSON section; then click on "Explorer/Aller a la ressource".
Any help is appreciated,
Cheers,
Your data uses this spatial reference system: EPSG:2950, also known as "NAD83(CSRS) / MTM zone 8". The units in this system are meters.
This information can be found at the top of your geojson file here:
"name": "urn:ogc:def:crs:EPSG::2950"
The crs stands for coordinate reference system (And for the trivia value, EPSG stands for "European Petroleum Survey Group").
This is likely confirmed by the shapefile from the same data source (based on your link), whose .prj file states:
PROJCS["Ontario_MTM_Zone_8_east_of_75_degrees_W_NAD_83_datum"
MTM refers to Modified Transverse Mercator. If you want to easily convert this data (essentially unproject it) to WGS84 to get longitude/latitude pairs, you could download the shapefile, unzip it, add all of its files to mapshaper.org, open the console, enter proj wgs84, and then export as GeoJSON (or TopoJSON).
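If you would rather convert individual points in code, the unprojection can be sketched in plain Python with the standard inverse Transverse Mercator series. This is only an approximation under stated assumptions: it uses the GRS80 ellipsoid and the MTM zone 8 parameters as I understand them for EPSG:2950 (central meridian 73.5°W, scale factor 0.9999, false easting 304800 m, false northing 0), and it treats NAD83(CSRS) and WGS84 as identical, which ignores a difference on the order of a meter or two. The function name mtm8_to_lonlat is my own; a projection library would be the safer choice for production use.

```python
import math

# GRS80 ellipsoid (the ellipsoid used by NAD83)
A = 6378137.0
F = 1 / 298.257222101
E2 = 2 * F - F * F  # first eccentricity squared

# Assumed MTM zone 8 parameters for EPSG:2950
LON0 = math.radians(-73.5)
K0 = 0.9999
FALSE_EASTING = 304800.0
FALSE_NORTHING = 0.0


def mtm8_to_lonlat(easting, northing):
    """Inverse Transverse Mercator: projected meters -> (lon, lat) in degrees."""
    ep2 = E2 / (1 - E2)  # second eccentricity squared
    m = (northing - FALSE_NORTHING) / K0  # meridian arc length from the equator
    mu = m / (A * (1 - E2 / 4 - 3 * E2**2 / 64 - 5 * E2**3 / 256))
    e1 = (1 - math.sqrt(1 - E2)) / (1 + math.sqrt(1 - E2))
    # Footpoint latitude
    phi1 = (mu
            + (3 * e1 / 2 - 27 * e1**3 / 32) * math.sin(2 * mu)
            + (21 * e1**2 / 16 - 55 * e1**4 / 32) * math.sin(4 * mu)
            + (151 * e1**3 / 96) * math.sin(6 * mu)
            + (1097 * e1**4 / 512) * math.sin(8 * mu))
    sin1, cos1, tan1 = math.sin(phi1), math.cos(phi1), math.tan(phi1)
    c1 = ep2 * cos1**2
    t1 = tan1**2
    n1 = A / math.sqrt(1 - E2 * sin1**2)
    r1 = A * (1 - E2) / (1 - E2 * sin1**2) ** 1.5
    d = (easting - FALSE_EASTING) / (n1 * K0)
    lat = phi1 - (n1 * tan1 / r1) * (
        d**2 / 2
        - (5 + 3 * t1 + 10 * c1 - 4 * c1**2 - 9 * ep2) * d**4 / 24
        + (61 + 90 * t1 + 298 * c1 + 45 * t1**2 - 252 * ep2 - 3 * c1**2) * d**6 / 720)
    lon = LON0 + (
        d
        - (1 + 2 * t1 + c1) * d**3 / 6
        + (5 - 2 * c1 + 28 * t1 - 3 * c1**2 + 8 * ep2 + 24 * t1**2) * d**5 / 120) / cos1
    return math.degrees(lon), math.degrees(lat)


# First coordinate pair from the question
print(mtm8_to_lonlat(305049.6192401955, 5061761.891977313))
```

The first point from the question should come out near 45.7°N, 73.5°W, since its easting is only about 250 m east of the zone's central meridian.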
Hi, I am currently trying to use Watson's Visual Recognition service and I am getting a really weird response. After reading the documentation, I am guessing this photo doesn't meet the threshold value, but I am not actually sure. Here's a snippet of one of my responses:
{
  "classifiers": [
    {
      "classes": [ { "class": "classname", "score": 0.522029 } ],
      "classifier_id": "normalLeft_329785087",
      "name": "normalLeft"
    }
  ],
  "image": "Testing_Left.zip/80589N.jpg"
},
{
  "classifiers": [],
  "image": "Testing_Left.zip/81860Y.jpg"
},
Another issue related to this is that sometimes my zip files aren't recognized by Watson. Is there any particular reason why Watson would have difficulties with zip files?
Thanks for the help in advance.
After reading the documentation I am guessing this photo doesn't meet the threshold value but I am not actually sure.
That's exactly it. It means none of the classes in the classifiers applied to the image Testing_Left.zip/81860Y.jpg returned a score above the threshold. By default, for custom classifiers, the threshold is 0.5. You can set the threshold parameter to 0 if you would like to see every score per class per image.
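On the client side, you can spot these cases by looking for entries whose classifiers list is empty. A minimal Python sketch, assuming the response has already been parsed into a list of per-image dictionaries shaped like the snippet in the question (the helper name images_without_matches is my own):

```python
def images_without_matches(images):
    """Return the names of images for which no class scored above the threshold."""
    return [entry["image"] for entry in images if not entry.get("classifiers")]


# The two entries from the question's response snippet
images = [
    {
        "classifiers": [
            {
                "classes": [{"class": "classname", "score": 0.522029}],
                "classifier_id": "normalLeft_329785087",
                "name": "normalLeft",
            }
        ],
        "image": "Testing_Left.zip/80589N.jpg",
    },
    {"classifiers": [], "image": "Testing_Left.zip/81860Y.jpg"},
]

print(images_without_matches(images))
```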
Is there any particular reason why Watson would have difficulties with zip files?
We have observed problems with some zip files with files or directories inside which have extended character sets, such as accented letters. Could that be the case for you?
I've been researching CLDR and IANA in order to find a centralized mapping of UN/LOCODEs to Olson time zones.
Ideally I would like to have for example:
+--------------+--------------------+
|un_locode     |timezone            |
+--------------+--------------------+
|USLAX         |America/Los_Angeles |
+--------------+--------------------+
for every UN/LOCODE.
Are my newbie skills failing me in understanding how to use these sources to reach my goal? (If so, please point me towards the scripting that would allow me to automate providing these mappings.)
Or, do these sources fail to have the data correlation that I'm looking for? (If so please let me know if you have a reliable source).
We faced the exact same problem and hence had to provide a solution.
This solution involves linking the UN/LOCODES database with a geolocation/timezone database.
There are a few caveats to this approach that were captured by Matt Johnson's answer and the accompanying comments.
Namely:
The UN/LOCODE database of coordinates is not complete[1] and sometimes has inaccurate data[2].
In some cases, a one-to-one mapping between a UN/LOCODE and a time zone is impossible due to the political nature of time zones.
The two points above are worsened by the inaccuracy of free coordinates-to-timezone databases. It is helpful to get a dataset that also includes territorial waters so that ports' time zones can be properly linked to the country they belong to.
The following repository https://github.com/Portchain/un_locodes_sql contains the code to extract and link the data. It outputs a SQL file that can be imported into a PostgreSQL DB.
The geolocation/timezone data is based on the geo-tz[3] module which seems to source its data from timezone-boundary-builder[4].
Again, the list provided by our repository is of course incomplete and inaccurate. If you see any error in the data, please open a github issue and let's make an accurate, open source list of UN/LOCODE, coordinates and timezone information.
[1] For example, both Los Angeles and San Francisco, USA (USLAX & USSFO) are missing coordinates in the UN/LOCODE database.
[2] The petroleum port of Abu al Bukhoosh (AEABU) is situated in Abu Dhabi (UAE). Its coordinates in the UN/LOCODE database position the port right in the middle of the Persian Gulf (https://www.port-directory.com/ports/abu_al_bukhoosh/). When resolved, this causes the timezone to be unknown.
[3] https://github.com/evansiroky/node-geo-tz
[4] https://github.com/evansiroky/timezone-boundary-builder
The GeoNames free database of cities (which is available to download) provides: city names, latitude/longitude and, most importantly, timezone information. You can fairly quickly make your own database connecting this information with the UN/LOCODE code lists based on the name/country/coordinates.
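A rough Python sketch of that join, matching on country code plus normalized place name. The tuple layouts and the function names are assumptions for illustration, not the actual column order of the GeoNames or UN/LOCODE files, and the sample rows are a tiny hand-picked subset. Name matching alone is ambiguous when a country has several places with the same name, which is where the coordinates would help.

```python
def normalize(name):
    """Normalize a place name for comparison."""
    return name.strip().casefold()


def map_locodes_to_timezones(locode_rows, geonames_rows):
    """Join on (country code, normalized name); unmatched codes map to None."""
    tz_by_key = {
        (country, normalize(name)): tz for name, country, tz in geonames_rows
    }
    return {
        country + location: tz_by_key.get((country, normalize(name)))
        for country, location, name in locode_rows
    }


# Assumed layouts: UN/LOCODE rows are (country, location code, name),
# GeoNames rows are (name, country code, timezone).
locode_rows = [
    ("US", "LAX", "Los Angeles"),
    ("US", "SFO", "San Francisco"),
    ("AE", "ABU", "Abu al Bukhoosh"),
]
geonames_rows = [
    ("Los Angeles", "US", "America/Los_Angeles"),
    ("San Francisco", "US", "America/Los_Angeles"),
]

print(map_locodes_to_timezones(locode_rows, geonames_rows))
```

Entries that come back as None are the ones that need manual review or a coordinate-based lookup.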
I've not seen such a source. You could try to create one by mapping the lat/lon coordinates for those entries that have them, and correlating to IANA time zone by one of the methods listed here.
However, be sure to read Wikipedia's article about UN/LOCODE, especially the part describing errors with coordinates. Also note that many of the coordinates are simply not in the data. Why? I don't know.
The list of UN/LOCODEs for the US is here, and shows Los Angeles to be US LAX (not UNLAX). Its coordinates field is blank.
If you can find some other reliable source of UN/LOCODE to lat/lon, then you are in business. A quick search found that GeoNames claims to have this in their premium data subscription, but I haven't investigated further.
CLDR's map is here: https://unicode.org/reports/tr35/#Time_Zone_Identifiers
I saw CLDR tagged but not mentioned.
I want to implement a share feature in my app via a UIActivityViewController. The data I want to share is just a simple dictionary of strings similar to this example:
My stores : [
Abercrombie : [
Jeans : 32
Polo : small
]
American Eagle : [
jeans : 31
Polo : extra small
]
I would think the Parse SDK would have a prepackaged way to do this (similar to the validation email Parse gives you for free), but looking around in the iOS guide I really can't find anything. Can anyone point me in the right direction?