Exactly how accurate is IP Geolocation? - geolocation

I'm setting up an iPhone tracking system for my friends, so they can submit their location to my website from their iPhone, anywhere, anytime - over WiFi or cellular data.
The website will plot their coordinates on Google Maps so that my other friends can track where they are. What I'm concerned about is the accuracy of going from an IP address to coordinates: exactly how accurate is a location on Google Maps that was derived from an IP address?
I was thinking about 95%, but I only tested this in a village, where it was fairly accurate. What happens in a city? Would that cause inaccurate locations?
Any help is appreciated.

IP geolocation is really hit-or-miss, depending on both how the user's ISP assigns IPs and on the IP geolocation database you're using. For instance, I made a simple PHP script, IP2FireEagle, which looks up your IP. I found that the database kept placing me 10+ km to the west of where I really was. Updating my entry in Host IP wasn't the greatest, as it soon got reverted, presumably by someone also occasionally assigned that IP by my ISP! That being said, I found that Clarke has very accurate coordinates (not that it's using IP geolocation per se, but rather Skyhook's API and their WiFi geolocation database).
If it's a website for your friends and you know they have iPhones, I would suggest using the browser's support for navigator.geolocation.getCurrentPosition(). That is, get the location via JavaScript and submit it to your server via an AJAX call. Even better, since you want to use Google Maps, they give you a short tutorial on how to get your friends' locations and then update a map.
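On the receiving end, the server should not trust whatever the AJAX call sends. Here is a minimal sketch in Python of what that check might look like. The field names latitude, longitude and accuracy mirror what the browser reports in position.coords, but the JSON shape, the Position class and parse_position are my own assumptions, not anything from the tutorial.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class Position:
    latitude: float    # degrees, -90..90
    longitude: float   # degrees, -180..180
    accuracy_m: float  # uncertainty radius reported by the device, in meters


def parse_position(body: str) -> Position:
    """Parse and validate a JSON body such as
    {"latitude": 51.5, "longitude": -0.12, "accuracy": 25}.
    The payload layout is an assumption made for this sketch."""
    data = json.loads(body)
    lat = float(data["latitude"])
    lon = float(data["longitude"])
    acc = float(data["accuracy"])
    # Reject values a real device could never report (this also rejects NaN).
    if not -90.0 <= lat <= 90.0:
        raise ValueError(f"latitude out of range: {lat}")
    if not -180.0 <= lon <= 180.0:
        raise ValueError(f"longitude out of range: {lon}")
    if not acc >= 0.0:
        raise ValueError(f"accuracy must be non-negative: {acc}")
    return Position(lat, lon, acc)
```

Storing the accuracy radius alongside the coordinates lets the map show how much to trust each fix, which speaks directly to the accuracy worry in the question.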

Excerpt From:
http://www.clickz.com/822881
IP targeting has been around since the early days of ad serving. It's not very hard to write code that will strip the IP address from a request, compare it to a database, and deliver an ad accordingly. The true difficulty, as we shall see, is building and maintaining an IP database.
One of the first applications of information in an IP database was targeting to specific geographic regions. Most commercial ad management systems have IP databases that can make geographic targeting possible. However, there are a couple weaknesses in this method. The first (and biggest) problem is that, for various reasons, not all IPs can be mapped to an accurate location.
Take all the IPs associated with AOL users, for instance. Anybody who has seen a WebTrends report knows that all AOL users appear to be coming from somewhere in Virginia. This is caused by AOL's use of proxy servers to handle their web requests.
In the interest of saving space, we won't get into the reasons why AOL makes use of proxy servers. The important thing is that AOL does use them, and as a result, all its users appear to be accessing the web from Virginia. Thus, it is impossible to attach meaningful geographic location data to an AOL IP, and those IPs must be discarded from any database that wants to maintain a reasonable degree of accuracy.
Other ISPs and networks may use a method known as dynamic IP allocation for its users. In other words, a user might have a different IP address every time he visits the Internet. You can see how this might affect the accuracy of a database.
But the real difficulty in discerning geography from an IP address has to do with the level of specificity that a media planner might expect from this targeting method. The first few geo-targeted campaigns that I put together early in my career had to be accurate to the ZIP code level. This level of specificity is not practical via IP targeting.
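To illustrate the excerpt's point that the lookup code is the easy part, here is a rough Python sketch of comparing an address against a range table. The table rows are invented for illustration: the ranges are reserved documentation blocks and the region labels are placeholders. The names RANGES, STARTS and lookup are hypothetical too.

```python
import bisect
import ipaddress

# Hypothetical sample rows: (first address, last address, label).
# The ranges are reserved documentation blocks; the labels are invented.
RANGES = sorted(
    (int(ipaddress.ip_address(start)), int(ipaddress.ip_address(end)), label)
    for start, end, label in [
        ("192.0.2.0", "192.0.2.255", "Region A"),
        ("198.51.100.0", "198.51.100.127", "Region B"),
        ("203.0.113.0", "203.0.113.255", "Region C"),
    ]
)
STARTS = [row[0] for row in RANGES]


def lookup(ip: str):
    """Return the label of the range containing ip, or None if unmapped."""
    value = int(ipaddress.ip_address(ip))
    # Find the last range whose start is <= value, then check its end.
    i = bisect.bisect_right(STARTS, value) - 1
    if i >= 0 and value <= RANGES[i][1]:
        return RANGES[i][2]
    return None
```

The None branch matters as much as the hits: addresses behind proxies or in unmapped ranges have to fall through somewhere, and keeping the table itself accurate is the hard part the excerpt describes.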

Related

Where does raw geoip data come from?

This question is a general version of a more specific question asked here. However, those answers were unusable.
Question: What is the raw source for geoIP data?
Many websites will tell me where my IP is, but they all appear to be using databases from fewer than 5 companies (most are using a database from MaxMind). These companies offer limited free versions of their databases, but I'm trying to determine what they're using for their source data?
I've tried using Linux/Unix commands such as ping, traceroute, dig, whois, etc., but they don't provide predictably accurate information.
Preamble: I believe this is actually a very valid question for the SO website, as understanding how such things work is important to understanding how such datasets can be used in software. However, the answer to this question is rather complex and full of historical remarks.
First, it is worth mentioning that there is NO unified raw geoip data. Such a thing just does not exist. Second, the data comes from multiple sources and is often unreliable and/or outdated.
To understand how that came to be, one needs to know how the Internet came into existence and spread around the world. A short summary is below:
IANA is a global [non-profit] organization which manages assignment of IP blocks to regional organizations: https://www.iana.org/numbers This happens upon request, with the regional organization requesting a specified block size.
Regional organizations may assign those IP blocks either to ISPs directly or to country-level sub-organizations (which then assign them to ISPs).
ISPs assign IP addresses to local branches etc.
From the above you can easily see that:
There is no single body responsible for assigning IP blocks to this or that location.
Decisions on how (and whether) to release information about which IP belongs to which location are not made uniformly; instead, each organization decides for itself.
All of the above creates a whole lot of mess. It takes a lot of dedication and a long time to obtain, aggregate and sort this data. This is why the most up-to-date and detailed geoip datasets are commercial commodities.
Whoever takes on the challenge of building their own dataset would need to obtain this information directly from the end users of the blocks (the ISPs), because higher-level organizations do not know which location each IP address will be assigned to. Higher-level organizations only distribute IP blocks among applicants (and keep some in reserve for faster processing); it is the lowest-level organizations that decide which location gets which IP address, and they are not obligated to release this information publicly.
UPD:
To start building your own dataset you can begin with this list of blocks and how they are assigned
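As a rough idea of what a first pass over such a list could look like, here is a Python sketch. It assumes the list is in the pipe-delimited "delegated statistics" layout that the regional registries publish (registry|cc|type|start|value|date|status); that is my assumption about the format, not something stated above. The sample records are invented and use reserved documentation blocks, and parse_delegations is a hypothetical name.

```python
import ipaddress


def parse_delegations(lines):
    """Yield (country_code, network) for each IPv4 record.

    Assumes the pipe-delimited registry layout
    registry|cc|type|start|value|date|status, where for IPv4 the
    value field is the number of addresses in the block.
    """
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # blank line or comment
        fields = line.split("|")
        # Skip the version header and the per-type summary lines.
        if len(fields) < 7 or fields[2] != "ipv4" or fields[1] == "*":
            continue
        cc, start, count = fields[1], fields[3], int(fields[4])
        first = ipaddress.IPv4Address(start)
        last = first + (count - 1)
        # A block need not be a power of two, so it may span several CIDRs.
        for net in ipaddress.summarize_address_range(first, last):
            yield cc, net


# Invented records for illustration only.
SAMPLE = """\
# comment line
example|ZZ|ipv4|192.0.2.0|256|20100101|assigned
example|ZZ|ipv4|198.51.100.0|192|20100101|allocated
"""
```

Note that the finest detail this yields is a country code per block, which is exactly the limitation described above: anything more precise has to come from the organizations further down the chain.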

How do I get a company name from an IP address?

I've searched around a while and all of the IP --> Hostname things actually only end up giving an ISP. Is there something that goes beyond that? I'm only finding pay services that go further and not something that I can just tap a nice API and programmatically do it.
http://ipinfo.io/ just ends up showing ISP for many of what I've sampled. I saw that guy posts here fairly often.
whoisvisiting.com runs about $99/month for what my company site does, but at that price I'd rather code something. I'm using the free trial right now and have the IPs logging to analytics, so I'm comparing what it returns, what IIS returns as the hostname, and what a couple of sources like ipinfo.io show; whoisvisiting somehow actually shows what I'm looking for.
There's no way to do so. There's no central registry for which company has which address ranges. In fact, most companies will just be identifiable via their ISP.
Your paid services might be scams, by the way, or just work on very few select companies and universities that actually act as autonomous entities in the IP sense.
It is difficult to differentiate between an ISP and a company IP address. Some geolocation providers use the range size or level of allocation to label an address as ISP or business. However, this approach is not always accurate.

How is DNSBL connected to geo-location?

I've read on Wikipedia that one of the ways to obtain geolocation information for a given IP is by using DNSBL. The link is: http://en.wikipedia.org/wiki/Geolocation_software#Data_sources
Could someone explain to me how this is done? And in general, what is DNSBL other than a banning list?
DNSBL is a blacklist/database based on DNS. DNS is just the API used to get a specific result; other options could be HTTP or a simple local file.
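To make the "DNS is just the API" point concrete: a DNSBL-style lookup reverses the octets of the address and appends the list's zone, then resolves that name. Below is a small Python sketch of building the name. The zone geo.example.org is a placeholder, not a real service, and dnsbl_query_name is a hypothetical helper.

```python
import ipaddress


def dnsbl_query_name(ip: str, zone: str) -> str:
    """Build the hostname a DNSBL-style lookup would resolve for an IPv4 address.

    The octets are reversed and the list's zone is appended, so
    192.0.2.10 in zone geo.example.org becomes 10.2.0.192.geo.example.org.
    """
    octets = str(ipaddress.IPv4Address(ip)).split(".")  # also validates the input
    return ".".join(reversed(octets)) + "." + zone
```

What the answer to that query means (a "listed" flag, a country code, or something else) is up to whoever runs the zone; the DNS part is only the transport.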
IP needs routing, and thus the physical machines doing that are placed in certain locations. Knowing that makes it possible to collect data on where the routing points are and thus get to the closest location of a certain IP address. (Knowing that there are 5 big co
http://en.wikipedia.org/wiki/Geo_targeting
http://en.wikipedia.org/wiki/LOC_record
http://en.wikipedia.org/wiki/Regional_Internet_registry

Is it safe to assume uniform geo-ip resolution for same first-16-bit IP addresses?

I have a geo-sensitive webapp for which I send a request's IP to a remote, commercial ip-to-location service, and get back the country, city, ISP, etc. for the IP.
I currently cache the IP lookups in my database in order to make subsequent lookups faster and free (the commercial service charges per lookup).
I wonder if I can further optimize my caching by assuming that the first 16 bits (i.e. the aaa.bbb in an aaa.bbb.ccc.ddd address) always have a uniform location. That way I would have at most 2^16 (65,536) records to cache.
I don't mind so much about uniformity of ISP but that info would be helpful as well.
I'd recommend going down to at least /24 resolution. Oftentimes a /16 will tell you the ISP but not the city, or vice versa.
If you want a good idea of what the maps really look like, you can spend 49 USD on a developer license to Geobytes's GeoNetMap database. A developer license allows you to download the entire map from IP blocks to locations as a bunch of CSV files, but doesn't cover deploying it onto a production server. Geobytes has the added advantage of being entirely local, so lookups are liquid fast.
MaxMind also has a free downloadable map offering, although it is somewhat cut down from the full map, producing approximately double the error rate.
No, it's not safe. For example, if you do a GeoIP lookup on 216.34.181.45 (Slashdot) you get Mountain View, California. If you do a lookup on 216.34.1.1 you get Chesterfield, Missouri.
With respect to your caching, keep in mind that IPs can move around spatially. If an ISP goes bankrupt and its block gets bought by someone else, that block of IPs will move location.
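Putting the two suggestions together (key the cache on the /24 rather than the /16, and let entries expire because blocks move), a minimal Python sketch might look like this. The 30-day TTL is an arbitrary assumption, and remote_lookup stands in for whatever call your commercial service provides. Even a /24 is not guaranteed to be uniform, so treat this as a trade-off rather than a safe rule.

```python
import ipaddress
import time

TTL_SECONDS = 30 * 24 * 3600  # assumed refresh interval; tune to taste

_cache = {}  # network string -> (expiry timestamp, location record)


def cache_key(ip: str, prefix_len: int = 24) -> str:
    """Collapse an IPv4 address to its enclosing network, e.g. a.b.c.0/24."""
    return str(ipaddress.ip_network(f"{ip}/{prefix_len}", strict=False))


def locate(ip: str, remote_lookup, now=time.time):
    """Return the cached record for the address's /24, calling
    remote_lookup(ip) on a miss or when the entry has expired."""
    key = cache_key(ip)
    hit = _cache.get(key)
    if hit and hit[0] > now():
        return hit[1]
    record = remote_lookup(ip)  # hypothetical paid lookup
    _cache[key] = (now() + TTL_SECONDS, record)
    return record
```

With the two addresses from the example above, cache_key("216.34.181.45", 16) and cache_key("216.34.1.1", 16) both give 216.34.0.0/16, so a /16 cache would return the same city for both; at /24 they get separate entries.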

How does Google calculate my location on a desktop?

Right, this is confusing me quite a bit. I'm not sure if any of you have noticed or used the "my location" feature on Google Maps on a desktop (or another non-GPS, non-mobile device). If you have a browser with Google Gears (the easiest to use is Google Chrome), you will see a blue circle above the zoom control in Google Maps. When I click it (without being logged into my Google Account), using standard Wi-Fi to my own personal router and a normal internet connection to my ISP, it somehow manages to pinpoint my exact location with 100% accuracy (at this moment in time).
How does it do it? They briefly mention it here, but it doesn't quite explain it; it says that my browser knows where I am...
...I am baffled. How?
I am intrigued because I would love to integrate it into my future programming projects. I'd just like some background understanding, and it doesn't seem too well documented at the moment.
I am currently in Tokyo, and I used to be in Switzerland. Yet, until a few days ago, my location was not pinpointed exactly, only placed in the broad Tokyo area. Today I tried again, and I appear to be in Switzerland. How?
Well the secret is that I am now connected through wireless, and my wireless router has been identified (thanks to association to other wifis around me at that time) in a very accurate area in Switzerland. Now, my wifi moved to Tokyo, but the queried system still thinks the wifi router is in Switzerland, because either it has no information about the additional wifis surrounding me right now, or it cannot sort out the conflicting info (namely, the specific info about my wifi router against my ip geolocation, which pinpoints me in the far east).
So, to answer your question: Google, or someone on its behalf, went "wardriving" around, mapping the wifi presence. Every time a query is made to the system (probably in compliance with the W3C draft for the geolocation API), your computer sends the wifi identifiers it sees, and the system does two things:
queries its database to see whether a geolocation exists for any of the wifis you passed, and returns the "wardrived" position if found, possibly with triangulation if signal intensities are present. The more wifi networks around, the higher the accuracy of the positioning.
adds any networks you see that are not yet in its database, so they can be reused later.
As you see, the system builds up by itself. The only thing you need is good seeding. After that, it extends in "50 meters chunks" (the range of a newly found wifi connection).
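The lookup step can be sketched roughly as follows in Python. To be clear, how Google or Skyhook actually combine the readings is not public; this uses a plain signal-weighted centroid as a stand-in for "triangulation", and the data shapes (a dict of BSSID to dBm for the scan, a dict of BSSID to coordinates for the database) are assumptions made for the example.

```python
def estimate_position(scan, known_aps):
    """Estimate (lat, lon) from a wifi scan.

    scan:      {bssid: signal strength in dBm}, as seen by the client
    known_aps: {bssid: (lat, lon)}, the previously mapped access points
    Returns None when none of the scanned access points are known.
    """
    total = lat_sum = lon_sum = 0.0
    for bssid, dbm in scan.items():
        if bssid not in known_aps:
            continue  # a real service might record this AP for later use
        # Convert dBm to milliwatts so that stronger signals weigh more.
        weight = 10 ** (dbm / 10.0)
        lat, lon = known_aps[bssid]
        lat_sum += weight * lat
        lon_sum += weight * lon
        total += weight
    if total == 0.0:
        return None
    # Naive averaging of degrees; fine over tens of meters,
    # but it would misbehave across the 180-degree meridian.
    return lat_sum / total, lon_sum / total
```

This also shows why the moved router from my story confuses the system: if the only known access point in the scan is still recorded at its old coordinates, the estimate lands there.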
Of course, if you really want the system to go bananas, you can start exchanging wifi routers around the globe with fellow revolutionaries of the no-global-positioning movement.
It's a lot simpler than you think. You've signed into both your mobile and Chrome on your desktop using the same Google account. Google simply expects that you will have your mobile with you most of the time. They take the location data from your phone and assume the location of your current desktop session is the same.
I proved this by RDPing into my Windows machine at home from work and checking Google Maps remotely. It shows my location as the same as Chrome on Linux at work.
If you don't have a mobile that is signed into Google, then all they can do is look up GeoIP data for the IP address assigned by your ISP. It will typically be wildly inaccurate.
They use a combination of IP geolocation and comparing the results of a scan for nearby wireless networks with a database on their side (which is built by collecting GPS coordinates alongside wifi scan data when Android phone users use their GPS).
I've finally worked it out. The biggest questions were how they work out which wireless networks are around me and how they know where those networks are.
It "seems" to be something similar to this:
A company such as skyhookwireless.com [or similar] has mapped the location of many wireless access points, I assume by driving around in much the same way that Google Street View collected all its photos.
Using Google Gears and my browser, we can report which wireless networks I can see around me.
These wireless points are compared against their known geolocations to triangulate my position.
Reference: Slashdot
According to Google Maps' own help:
Rejecting the WiFi networks idea!
Sorry folks... I don't see it. Using WiFi networks around you seems to be a highly inaccurate and ineffective method of collecting data. WiFi networks these days simply don't stay long in one place.
Think about it, the WiFi networks change every day. Not to mention MiFi and Adhoc networks which are "designed" to be mobile and travel with the users. Equipment breaks, network settings change, people move... Relying on "WiFi Networks" in your area seems highly inaccurate and in the end may not even offer a significant improvement in granularity over IP lookup.
I think the idea that iPhone users are "scanning and sending" WiFi survey data back to Google, together with wardriving, perhaps in conjunction with the Google Maps "Street View" mapping, might seem like a very plausible method of collecting this data. In practice, however, it does not work as a business model.
Oh and btw, I forgot to mention in my prior post... when I originally pulled my location, the time I was pinpointed "precisely" on the map, I was connected to a router from my desktop over an ethernet connection. I don't have a WiFi card in my desktop.
So if that "nearby WiFi networks" theory were true... then I shouldn't have been able to pinpoint my location with such precision.
I'll call my ISP, SKyrim, and ask them whether they share their network topology to enable geolocation on their networks.
I know you can look up IP address to get approximate location, but it's not always accurate. Perhaps they're using that?
update:
Typically, your browser uses information about the Wi-Fi access points around you to estimate your location. If no Wi-Fi access points are in range, or your computer doesn't have Wi-Fi, it may resort to using your computer's IP address to get an approximate location.
It is possible to get your approximate location based on your IP address (wireless or fixed).
See for example hostip.info or maxmind, which basically provide a mapping from IP address to geographical coordinates. They probably use many kinds of heuristics and data sources. This kind of system is probably accurate enough to put you in the right major city in most cases.
Google probably uses a somewhat similar approach in addition to WiFi tricks.
So Google keeps records of Wi-Fi router locations by using the GPS of any cellphone connected to that router when you use Google Maps or location services on the cellphone. Google then knows that every device connected to that Wi-Fi router is at the same location.
When GPS is off or no cellphone is connected to the router, Google uses IP geolocation.

Resources