I would like to better understand the current IPv4 address space and need help finding data about the IP allocation for each country, and if possible going further to each city, ISP, and organization.
I understand that IPv4 addresses are controlled by 5 major Regional Internet Registries that together form the Number Resource Organization. Each RIR manages addresses for the following regions:
African Network Information Centre (AfriNIC) for Africa
American Registry for Internet Numbers (ARIN) for the United States, Canada, several parts of the Caribbean region, and Antarctica.
Asia-Pacific Network Information Centre (APNIC) for Asia, Australia, New Zealand, and neighboring countries
Latin America and Caribbean Network Information Centre (LACNIC) for Latin America and parts of the Caribbean region
Réseaux IP Européens Network Coordination Centre (RIPE NCC) for Europe, Russia, the Middle East, and Central Asia
-- from wikipedia
Since these are 5 separate organizations (each with a different commercial presentation on their websites), I could not find a centralized place with an exhaustive map of all the allocated blocks.
I found this site with the IP blocks and assigned countries. That's part of what I want, though I also don't know whether it is reliable.
Also, this xkcd comic plays with the same data that I am looking for. The comic is probably based on this interesting image. According to CAIDA (The Cooperative Association for Internet Data Analysis), the image is a result of 2 months of ICMP exploration back in 2006:
A visualization of IPv4 addresses that responded to ICMP (ping) packets during a two-month (very slow) scan of the IPv4 address space. Some hosts do not respond to the probes due to firewalls, NAT boxes, and ICMP filtering. Thus, the data and map give us a lower bound on IPv4 address utilization.
From the same site, they talk about the Census data source:
The census data was provided by Information Sciences Institute at the University of Southern California. Internet Addresses Survey dataset, DHS PREDICT ID USC-LANDER/internet_address_survey_it15w-20061108. Traces taken 2006-11-08 to 2007-01-08. Provided by the USC/LANDER project. http://www.isi.edu/ant/lander/. Additional support comes from NSF grant SCI-0427144 and ARIN but does not necessarily reflect the opinions of any of the sponsoring organizations.
I tried to find the referenced dataset and traces but had no success.
I understand that this data is essential for current geolocation solutions, so I would like to understand where their data comes from.
There are several ways to get this information:
A list of allocated or assigned IP addresses and AS number is published by each RIR, and they provide mirrors for each other's data. Take a look at ftp://ftp.ripe.net/pub/stats/. Look for the delegated-* files. They contain lines like:
ripencc|NL|ipv4|37.77.56.0|2048|20120201|allocated
ripencc|NL|ipv6|2a00:8640::|32|20120130|allocated
The country codes listed here are very rough indications of where an address is going to be used. Geolocation companies might use this as a basis, but they have much more accurate data from cooperation with big web stores etc.
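As a sketch, the delegated-* records could be parsed as below. The field layout (registry|cc|type|start|value|date|status, where value is an address count for IPv4 but a prefix length for IPv6) follows the common RIR statistics-exchange format; treat it as an assumption and check the README published next to the files.

```python
import math

def parse_delegated_line(line):
    """Parse one record from an RIR delegated-* statistics file.

    Header/version and summary lines (which have fewer fields or a
    non-resource type column) are skipped by returning None.
    """
    fields = line.strip().split("|")
    if len(fields) < 7 or fields[2] not in ("ipv4", "ipv6", "asn"):
        return None  # header, summary, or malformed line
    registry, cc, kind, start, value, date, status = fields[:7]
    record = {
        "registry": registry, "country": cc, "type": kind,
        "start": start, "date": date, "status": status,
    }
    if kind == "ipv4":
        # value is the number of addresses; convert to a CIDR prefix length
        count = int(value)
        record["count"] = count
        record["prefix_len"] = 32 - int(math.log2(count))
    elif kind == "ipv6":
        record["prefix_len"] = int(value)  # value is already a prefix length
    else:
        record["asn_count"] = int(value)
    return record

rec = parse_delegated_line("ripencc|NL|ipv4|37.77.56.0|2048|20120201|allocated")
print(rec["country"], rec["prefix_len"])  # NL 21
```

Concatenating the delegated-* files from all five RIR mirrors gives a reasonably complete country-level view of the allocated space.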
PS: the RIRs are not companies, they are membership organisations. Everybody who needs IP addresses can become a member, get voting rights at the AGM, etc. Their policies are determined by an even wider community where everyone can participate (see http://www.ripe.net/ripe/policies).
Related
We've been needing to implement some geolocation service integration for some of our products. There are a lot of 3rd-party companies that offer both free and paid geolocation services and databases, updated anywhere from constantly to once a month.
Where do these services get the geo location information for their databases they distribute?
Depends on the type of geolocation. If you mean IP <> country/city (e.g. www.maxmind.com), basic information can be found in the whois records maintained by the network operators/regional internet registries. For instance: http://tools.whois.net/whoisbyip/?host=64.34.119.12
I suppose it's possible to clean/normalize this data semi-automatically.
If you mean MAC address <> geographical coordinates, it's most probably just spying on the users (with their consent... or without): either gathering the information from applications running on mobile devices with GPS and WiFi on board, or by 'war driving' around, like the Google Street View teams.
We get our data from public entities:
ISO: International Organization for Standardization
BGN: US Board on Geographic Names
UNGEGN: United Nations Group of Experts on Geographical Names
PCGN: UK Permanent Committee on Geographical Names
FAO: United Nations Food and Agriculture Organization
FIPS: US Federal Information Processing Standards
ITU: The International Telecommunication Union
Then we update our MySQL database monthly. Our service is currently free!
http://nwstartups.com/api/geo/country.php
I'm trying to find a webservice that will allow me to get a County name (not Country) for a specific Lat/Long. I would be performing the lookup within a server application (likely a Java application). It doesn't have to be a webservice if there is some library out there, I suppose, but I would like up-to-date information. I have looked around quite a bit for an API that supports this, but so far I haven't been able to find one that works. I have tried the Yahoo APIs like so:
http://where.yahooapis.com/geocode?q=39.76144296429947,%20-104.8011589050293
But it doesn't populate the address information. I've tried with some of the "flags" options there too to no avail.
I've also looked around at Google's APIs as well, but I've read in multiple places that they don't populate the County.
So does anyone know of any APIs that will take a Lat/Long and return the County associated with that location? And if you have any examples, that would be great.
I'd also like to know which APIs allow for use in a commercial application. A lot of the data I've found says that you can't use the data to make money. I might be reading those wrong, but I'm looking to build a service that I'd likely charge for that would use this data. So I'd need options. Maybe free services while I'm exploring options, and pay services down the road.
Just for completeness, I found another API to get this data that is quite simple, so I thought I'd share.
https://geo.fcc.gov/api/census/
The FCC provides a Block API for exactly this problem, and it uses census data to perform the lookup.
Their usage limit policy is (From developer#fcc.gov)
We do not have any usage limits for the block conversion API, but we do ask that you try to spread out your requests over time if you can.
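As a minimal sketch, a lookup might look like the code below. The `block/find` endpoint path and the response shape are assumptions based on the FCC Area API as I recall it; the live request is commented out so the parsing is demonstrated against a canned response of the same shape.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

def fcc_county_url(lat, lon):
    """Build a request URL for the FCC census block lookup (endpoint path assumed)."""
    query = urlencode({"latitude": lat, "longitude": lon, "format": "json"})
    return "https://geo.fcc.gov/api/census/block/find?" + query

def county_from_response(payload):
    """Pull the county name out of a block-lookup JSON response."""
    return json.loads(payload)["County"]["name"]

url = fcc_county_url(39.76144296429947, -104.8011589050293)
# In a real application:  payload = urlopen(url).read()
# Canned response with the assumed shape, for demonstration:
payload = '{"County": {"FIPS": "08001", "name": "Adams"}}'
print(county_from_response(payload))  # Adams
```

Since the FCC asks only that requests be spread out over time, adding a small client-side delay between batched lookups is a polite default.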
Google does populate the county for your example:
http://maps.googleapis.com/maps/api/geocode/json?latlng=39.76144296429947,-104.8011589050293&sensor=false
In the response, look under the key address_components which contains this object representing "Adams" county,
{
  "long_name": "Adams",
  "short_name": "Adams",
  "types": [
    "administrative_area_level_2",
    "political"
  ]
}
Here's the relevant excerpt from the Geocoding API's docs:
administrative_area_level_2 indicates a second-order civil entity below the country level. Within the United States, these administrative levels are counties. Not all nations exhibit these administrative levels.
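As a sketch, pulling the county out of a geocoding response of this shape looks like the following (the response below is abbreviated to just the documented `results`/`address_components` structure):

```python
import json

def extract_county(geocode_json):
    """Find the administrative_area_level_2 component (a county, in the US)
    in a Google-style geocoding response; returns None if absent."""
    for result in geocode_json.get("results", []):
        for component in result.get("address_components", []):
            if "administrative_area_level_2" in component.get("types", []):
                return component["long_name"]
    return None

# Abbreviated response with the documented shape:
response = json.loads('''{
  "results": [{
    "address_components": [
      {"long_name": "Denver", "short_name": "Denver",
       "types": ["locality", "political"]},
      {"long_name": "Adams", "short_name": "Adams",
       "types": ["administrative_area_level_2", "political"]}
    ]
  }]
}''')
print(extract_county(response))  # Adams
```

Since not all nations exhibit this administrative level, callers should handle the None case rather than assume a county is always present.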
Another option:
Download the cities database from http://download.geonames.org/export/dump/
Add each city as a lat/long -> Country mapping to a spatial index such as an R-Tree (some DBs also have the functionality)
Use nearest-neighbour search to find the country corresponding to the closest human settlement for any given point
Advantages:
Does not depend on an external server being available
Much faster (easily does thousands of lookups per second)
Disadvantages:
May give wrong answers close to borders, especially in sparsely populated areas
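The steps above can be sketched as follows; the rows are a tiny hand-picked sample in the style of the GeoNames cities dump, and the brute-force loop with an equirectangular distance approximation stands in for the R-tree and great-circle distance you would want at scale:

```python
import math

# Tiny sample of (lat, lon, country) rows in the GeoNames style;
# a real index would hold thousands of rows in an R-tree or k-d tree.
CITIES = [
    (52.3676, 4.9041, "NL"),   # Amsterdam
    (50.8503, 4.3517, "BE"),   # Brussels
    (48.8566, 2.3522, "FR"),   # Paris
]

def nearest_country(lat, lon):
    """Brute-force nearest-neighbour lookup using an equirectangular
    approximation: scale longitude differences by cos(latitude) so
    degrees east-west and north-south are roughly comparable."""
    def dist(city):
        clat, clon, _ = city
        dx = (clon - lon) * math.cos(math.radians((clat + lat) / 2))
        dy = clat - lat
        return math.hypot(dx, dy)
    return min(CITIES, key=dist)[2]

print(nearest_country(51.9, 4.5))   # a point near Rotterdam -> NL
```

Swapping the linear scan for a spatial index is what makes the "thousands of lookups per second" claim realistic on a full cities dump.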
You may want to have a look at TIGER data and see if it has polygons containing the county name in an attribute. If it does, the Java GeoTools API lets you work with this data. You will be performing point-in-polygon queries for the county polygons followed by a feature attribute look-up.
Maybe this is a great solution. It is in a JSON format. I always use this in my projects.
http://maps.google.com/maps/geo?ll=10.345561,123.896932
And simply extract the information using PHP:
$x = file_get_contents("http://maps.google.com/maps/geo?ll=10.345561,123.896932");
$j_decodex = json_decode($x);
print_r($j_decodex);
Are there any open source/commercial libraries out there that can detect mailing addresses in text, just like how Apple's Mail app underlines addresses on the Mac/iPhone?
I've been doing a little online research and the ideas seem to be either to use Google, regexes, or a full-on NLP package such as Stanford's NLP, which are usually pretty massive. I doubt the iPhone has a 500MB NLP package in there, or connects to Google every time you read an email, which makes me believe there should be an easier way. Too bad UIDataDetectors is not open source.
I know this question has been asked before, but there were no conclusive answers, so here's my try.
As for Python you can try Pyap:
https://pypi.python.org/pypi/pyap
It currently supports US and Canadian addresses.
Parsing addresses isn't a science. At my office we have been dealing with address parsing for years, and the problem is that there aren't any rules about what constitutes a valid address. We use the USPS address database for cleaning addresses, which is actually pretty fast and far more accurate than anything we were ever able to do on our own. It gets us 98% accuracy, whereas before we got about 90% cleaned addresses.
The bigger problem with address parsing tends to be that people don't input the address the same way. The same address might be in all the following forms.
128 E Beaumont St
128 East Beaumont Street
128 E Bmt St
128 Beaumont Street
128 Highway 88
The third one looks totally wrong, but people will type that sometimes. Sometimes a street is also a highway. There are a bunch of possibilities. Just try to catch 90% and accept that that is as good as it gets for address parsing.
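To illustrate how loose a pattern has to be just to catch the variants above, here is a deliberately simple regex sketch (the suffix and directional lists are illustrative, not exhaustive, and real-world parsing needs far more than this):

```python
import re

# Loose pattern: house number, optional directional, street-name words,
# and a suffix, or a "Highway <n>" form. Only meant to catch the common
# shapes listed above, not to validate addresses.
STREET = re.compile(
    r"""\b
    (?P<number>\d+)\s+
    (?:
        (?P<highway>Highway\s+\d+)
      |
        (?:(?P<dir>[NSEW]|North|South|East|West)\s+)?
        (?P<name>[A-Za-z]+(?:\s+[A-Za-z]+)*?)\s+
        (?P<suffix>St|Street|Ave|Avenue|Rd|Road|Blvd|Dr|Drive|Ln|Lane)\b
    )
    """,
    re.VERBOSE | re.IGNORECASE,
)

samples = [
    "128 E Beaumont St",
    "128 East Beaumont Street",
    "128 E Bmt St",
    "128 Beaumont Street",
    "128 Highway 88",
]
for s in samples:
    print(bool(STREET.search(s)), s)
```

Note what the regex cannot do: it happily matches "128 E Bmt St" (abbreviated beyond recognition) and would also match nonsense like "999 Fake Street", which is exactly why validating against USPS data after extraction is the step that buys the accuracy.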
Extractiv provides commercial NLP powered by Language Computer Corporation that can parse entities and relations in either uploaded documents or web crawls. The former service uses a REST API. I dropped this URL in, and it extracts 4 of the 5 addresses. Note that having them strung together like that makes them especially difficult.
Search for "address" in this JSON output:
http://rest.extractiv.com/extractiv/?url=https://stackoverflow.com/questions/5099684/detect-parse-mailing-addresses-in-text&output_format=json
One of them:
{
"id": 11,
"len": 17,
"offset": 1557,
"text": "128 E Beaumont St",
"type": "ADDRESS"
},
(Note: if you use the HTML output, which is more for demos, it filters out non-sentence content, which is why I showed the JSON instead).
Disclaimer: I work at Extractiv.
Update:
Extractiv is no more.
You can actually get extremely high accuracy, as Drew mentioned, by extracting the addresses and then comparing them against the USPS data. Getting a DVD from the USPS yearly will certainly work but doesn't factor in addresses that change. For that, you would want a more up-to-date version. The USPS publishes its updated address data (in a proprietary format) monthly, so that would be a good source of authoritative addresses.
On top of that, using an address validation service (after you extract the address data) will standardize the addresses for you and then check them for deliverability and/or vacancy status. As Drew mentioned, the same address can be written in many different ways that still work. However, the USPS will always use the standardized format.
In order to do what you are looking for programmatically, you'll definitely want an API, although list processing services are also available.
SmartyStreets has a free address validation API called LiveAddress that will standardize, verify, and then validate any US postal address. In the interest of full disclosure, I'm the founder of SmartyStreets.
Is there a good physical address to geolocation conversion database in the UK? I am trying to use this to build a Globrix-style search box http://www.globrix.com/ for a web application. Any pointers would be nice. I have been searching for hours. I have found several that convert UK postcodes into geolocations, but I need the addresses listed as on Globrix.
The Google Maps API provides a geocoder webservice that you can actually use independently of Google Maps itself. You send it the address/postcode, and it responds with a lat/long plus disambiguated addresses. We use it server-side in the UK to do address lookup. It's incredibly quick, too.
http://code.google.com/apis/maps/documentation/geocoding/index.html
http://www.postcodeanywhere.co.uk should be able to help with this. Alternatively, you can buy the "PAF" (Postcode Address File) from the Royal Mail, but it is expensive.
Update with information on UK geolocation in 2020. Since 2009:
Google's Geocoder has gotten an order of magnitude more expensive in 2018. It's ~0.5c per search with no free tier
Office for National Statistics have released a free postcode directory called ONSPD. This means if you have the postcode of your address, you can resolve a geolocation accurate to the postcode centroid (this may be 10-100m or so out). There's a free public service API available at https://postcodes.io which allows you to forward or reverse geocode a postcode. There are also public docker data and application images which allow you to host this easily
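As a sketch of the postcodes.io lookup (the URL scheme and response field names are assumptions from its public docs; the live call is commented out so the parsing runs against a canned response of the same shape):

```python
import json
from urllib.request import urlopen

def postcode_url(postcode):
    """Build the postcodes.io lookup URL (public, key-free API; path assumed)."""
    return "https://api.postcodes.io/postcodes/" + postcode.replace(" ", "")

def centroid_from_response(payload):
    """Extract the postcode-centroid lat/long from a lookup response."""
    result = json.loads(payload)["result"]
    return result["latitude"], result["longitude"]

url = postcode_url("SW1A 1AA")
# Live call:  payload = urlopen(url).read()
# Canned response with the assumed shape, for demonstration:
payload = ('{"status": 200, "result": {"postcode": "SW1A 1AA",'
           ' "latitude": 51.501, "longitude": -0.1416}}')
lat, lon = centroid_from_response(payload)
print(lat, lon)
```

Remember the caveat above: this gives the postcode centroid, which may be 10-100m from the actual premise.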
If you're interested in rooftop-accurate geocodes, a change in Ordnance Survey licensing in 2020 has meant it's much simpler and cheaper to access geolocations for almost every premise in Great Britain from Ordnance Survey by combining it with Royal Mail PAF (Postcode Address File). As of September 2020, I think https://ideal-postcodes.co.uk is currently the only company to offer complete and authoritative rooftop geolocations under these new rules. It's likely other PAF vendors will catch up over the coming years.
Disclaimer: I'm the author of postcodes.io and work for ideal-postcodes.co.uk
I am working on integrating geolocation services into a website and the best source of data I've found so far is MaxMind's GeoIP API with GeoLite City data. Even this data seems to often be questionable though. For example, I am located in downtown Palo Alto, but it locates my IP as being in Portola Valley, which is about 7 miles away. Palo Alto has a population of 60k+, whereas Portola Valley has a population of less than 5k. I would think if you see an IP originating somewhere around there it would make more sense to assume it was coming from the highly populated city, not the tiny one. I've also had it locate Palo Alto IPs completely across the country in Kentucky, etc.
Does anyone know of any better sources of data, or any tools/technologies/efforts to improve the accuracy of geolocation efforts? Commercial solutions are fine.
Where an IP comes up at the wrong end of the country, you probably won't find a better match elsewhere because it's probably an ISP that uses one group of IPs for customers in a wide area. My favourite example is trains here in the UK where the on-board wifi is identified as being in Sweden because they use a satellite connection to an ISP in Sweden.
A commercial supplier may be able to afford to spend more time tracking down the hard cases, but in many cases there just won't be a good answer to give you. They may, however, give you a confidence factor to tell you when they're guessing. I've heard good things about Quova, though I've never used them.
Assuming that you've got the best latitude and longitude that you can get (or can afford), then you're left dealing with cases where they pick the closest city rather than a more likely larger city nearby. Unfortunately I don't have the code to hand, but I had some success using the data from geonames to pick a "sensible" city near a point. They list lat/long and population, so you can do something like
ORDER BY ( Distance / LOG( Population ) )
You'd need to experiment with that to get something with the right level of bias towards larger cities, but I had it working quite nicely taking the centre of a Google Maps view and displaying a heading like "Showing results near London..." that changed as you moved the map.
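A toy version of that ranking, using made-up sample rows in the GeoNames style and the same distance / log(population) score (the populations and the approximate distance formula are illustrative assumptions):

```python
import math

# Sample (name, lat, lon, population) rows in the GeoNames style.
CITIES = [
    ("Palo Alto",      37.4419, -122.1430,  60000),
    ("Portola Valley", 37.3841, -122.2352,   4500),
    ("San Jose",       37.3382, -121.8863, 945000),
]

def best_city(lat, lon):
    """Rank cities by distance / log(population) so a bigger city slightly
    further away can beat a tiny one nearby; the bias needs tuning."""
    def score(city):
        _, clat, clon, pop = city
        dx = (clon - lon) * math.cos(math.radians((clat + lat) / 2))
        dist = math.hypot(clat - lat, dx)
        return dist / math.log(pop)
    return min(CITIES, key=score)[0]

# A point slightly closer to Portola Valley than to Palo Alto:
# the population term tips the answer to the larger city.
print(best_city(37.41, -122.19))  # Palo Alto
```

With a pure nearest-neighbour rule the same point would resolve to Portola Valley, which is exactly the mislabeling described in the question.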
I am not sure if this will help, but here is a site that has done a pretty good job of IP mapping. Maybe you could ask them for help :) seomoz.org
A couple of sites I saw referenced recently for free GeoIP services are:
WIPmania
hostip.info