Major IP Address Blocks to Country Mappings - geolocation

Is there a standard of Major IP Address Blocks to Country mappings (only interested in knowing where a user is at the Country level)?
If there isn't a standard, then how often do they change and is paying for a database of IP Addresses to Regions recommended (simple application)?
If there is a standard, is there any documentation addressing this? Links? Resources?
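Whichever data source ends up being used, the lookup itself is a range match of one address against a table of blocks. A minimal sketch using Python's standard `ipaddress` module is below. The `BLOCKS` table and the `country_for` helper are hypothetical: the blocks are reserved documentation ranges and the country codes are invented for illustration, not real allocations.

```python
import ipaddress

# Hypothetical block-to-country table. The networks are reserved
# documentation ranges and the country codes are made up for illustration.
BLOCKS = [
    (ipaddress.ip_network("192.0.2.0/24"), "AA"),
    (ipaddress.ip_network("198.51.100.0/24"), "BB"),
    (ipaddress.ip_network("203.0.113.0/24"), "CC"),
    (ipaddress.ip_network("2001:db8::/32"), "DD"),
]


def country_for(ip_text):
    """Return the country code of the most specific block containing ip_text, or None."""
    ip = ipaddress.ip_address(ip_text)
    best = None
    for network, country in BLOCKS:
        # Only compare addresses and networks of the same IP version.
        if ip.version == network.version and ip in network:
            if best is None or network.prefixlen > best[0].prefixlen:
                best = (network, country)
    return best[1] if best else None


print(country_for("198.51.100.7"))
print(country_for("10.0.0.1"))
```

A linear scan is fine for a handful of blocks; a real table with many thousands of entries would probably want a sorted structure or a database index instead.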

Related

Enumerating Mainline DHT

I'm trying to understand why, historically, a DHT (distributed hash table) was a good system to use for decentralized p2p networks.
From an efficiency point-of-view: it's a fantastic way to have a bunch of nodes know how each node is reachable without complicated communication between them (using XOR distance in the case of mainline DHT).
From an anonymity point-of-view, I don't think that's the case: I'd like to know if it is possible to enumerate a DHT's nodes and whether protection from this discovery is a problem that a DHT should even solve.
For example: imagine a DHT with 100 nodes. By virtue of the DHT's design (at least Mainline DHT), a node would (please correct me if I'm wrong):
• know that resource X is in node Y
• also know how to reach node Y
I know that a DHT crawler (like https://github.com/boramalper/magnetico) would be able to enumerate all nodes.
Is my reasoning correct, or did I misunderstand the attack vector?
Many thanks
BitTorrent makes no attempt to hide the IP address of any swarm member. On top of that, some trackers expose APIs that allow fetching a list of all infohashes and then, in turn, fetching all IPs for each infohash. So in essence the set of BitTorrent peers was mostly public anyway. The DHT adds another way to get this list.
This isn't unique to the BitTorrent DHT; other p2p networks have similar properties.
Also note that participating in the DHT is not the same as participating in any particular torrent. A node may simply operate as a pure DHT node without any torrent client attached.
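To make the XOR distance and the enumeration idea from the question concrete, here is a rough sketch. It is a toy model under stated assumptions, not how any particular crawler works: IDs are small integers (Mainline DHT uses 160-bit IDs), and `TOY_NETWORK` is a hypothetical in-memory stand-in for the answers real nodes would give to network queries.

```python
def xor_distance(id_a, id_b):
    """Kademlia-style distance: bitwise XOR of two IDs, read as an integer."""
    return id_a ^ id_b


def closest_nodes(target, known_nodes, k=8):
    """Return the k known node IDs closest to target by XOR distance."""
    return sorted(known_nodes, key=lambda n: xor_distance(n, target))[:k]


def crawl(routing_tables, start):
    """Walk a toy network by asking every discovered node which nodes it knows.

    routing_tables is a hypothetical stand-in for real DHT queries:
    it maps a node ID to the node IDs that node would return.
    """
    seen = {start}
    pending = [start]
    while pending:
        node = pending.pop()
        for neighbour in routing_tables.get(node, []):
            if neighbour not in seen:
                seen.add(neighbour)
                pending.append(neighbour)
    return seen


# Toy 4-bit IDs for readability (Mainline DHT uses 160-bit IDs).
TOY_NETWORK = {1: [4], 4: [7, 1], 7: [12], 12: [1]}

print(closest_nodes(5, [1, 4, 7, 12], k=2))
print(sorted(crawl(TOY_NETWORK, 1)))
```

Because every node has to hand out contact details of other nodes for routing to work at all, a walk like `crawl` reaches the whole toy network from a single starting node.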

Dealing with Address Dimension and role playing it in multiple facts

A question in regards to Dimensional Modelling and Role Playing.
We have an Address dimension which is ‘role playing’. We receive Addresses from different sources including CRM systems. Addresses could also be of different types, such as Address of a company, individual etc. So from the Role Playing Address dimension, a single address could be tagged as the Address of a company and Address for billing in different facts.
There are different fact tables and they have different keys which would hold address data. Fact_Sales would have keys such as Customer_Address_Key, Company_Head_Office_Address_Key. So I believe we are kind of role playing the addresses in these facts.
Question:
Our lead Data Architect has a concern around this.
• We are capturing a lot of addresses from a number of systems. How would we identify where these addresses came from, and what type of addresses they are, without going to the fact tables?
I would still suggest going through the facts, but I would like to consult the wider community before committing to that approach.
Is there a better way to do this, perhaps a separate table which defines the combination of Address_Key, Address_Type_Key and Source_Key?
Please let me know if you need any further clarification or pictures etc.
Cheers
Nithin
It sounds like, in your situation, you should just include columns for the type of address and the source of the address in the address dimension itself, so that it stands alone and you don't have to go via a fact to know what kind of thing it is. You wouldn't need a separate table with keys as you mentioned; the data can safely be denormalised in the dimension.
As an aside:
Although many people do have a separate address table, the Kimball Group approach would not be to have an 'address' or location dimension as a multi-purpose dimension that stands alone; an address provides part of what describes something else (like a company, or a customer, or even a 'delivery location'). Instead you'd have the dimension (e.g. Customer), and within that dimension you'd have a number of address fields, named appropriately (CustomerAddress1, CustomerAddress2, CustomerCity). You may choose to administer the address centrally for convenience behind the scenes, with the other dimensions formed by means of views or further ETL, but in the presentation of the star schema the address table would not be seen separately. The addresses are still conformed in that they're called the same thing and mean the same thing.
However, plenty of people go with a separate Address table, as you've done.
It is very reasonable to include source as an attribute of the dimension. The bigger question is how do you select the "Current" address for a customer if you have multiple sources. That is where things will get tricky.
You need Current Customer Address to mean the same thing throughout your business regardless of the source from which it was captured. I would refer to this as a conformed dimension. You need to 'conform' all of your address sources to the same structure so you can use them as a single dimension.
In the large majority of your facts, the source of the address is irrelevant. You only need to know that it is the current address. You may have a smaller model that can provide analysis on the source of the customer address.
The hard part is deciding which source is most trustworthy when the address is in multiple sources. You need to consider the source and the date of the last update. In other words, is the primary source still preferred when a less trustworthy source has a more recent update?
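One possible way to encode that trade-off is a rule that prefers the most trusted source unless its record is too stale compared with the newest one. The sketch below is only an illustration: the source names, the `SOURCE_RANK` ordering and the `max_staleness_days` threshold are all assumptions that the business would have to decide for itself.

```python
from datetime import date

# Hypothetical trust ranking: a lower number means a more trusted source.
SOURCE_RANK = {"CRM": 0, "Billing": 1, "WebForm": 2}


def pick_current(candidates, max_staleness_days=365):
    """Pick the 'current' address from records of one customer.

    candidates: list of dicts with 'source', 'updated' (date) and 'address'.
    Records more than max_staleness_days older than the newest record are
    dropped; among the rest, the most trusted source wins, and the more
    recent update breaks ties.
    """
    newest = max(c["updated"] for c in candidates)
    fresh = [
        c for c in candidates
        if (newest - c["updated"]).days <= max_staleness_days
    ]
    return min(
        fresh,
        key=lambda c: (SOURCE_RANK[c["source"]], -c["updated"].toordinal()),
    )


records = [
    {"source": "CRM", "updated": date(2020, 1, 1), "address": "1 Old Rd"},
    {"source": "WebForm", "updated": date(2023, 6, 1), "address": "2 New St"},
]
print(pick_current(records)["address"])
```

Here the CRM record is more than a year older than the web form record, so the less trusted but fresher source wins; with a CRM update inside the staleness window, the CRM address would be kept.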
Type is usually just an attribute of the address. However, if your address can be used for multiple things (physical, shipping, billing, etc), that may need to be defined by the role-playing relationship. For other analytics on address, you can break city/state & zip into separate dimensions if you need to break things down by geographic location. I would recommend City & State be used as a single entity. If you treat City as separate from State, you'll get funny results when slicing by cities that exist in more than one state.

Which approach is better -- Multiple SSIDs or Single SSID

I am setting up a wireless network at a university where we have a broad range of user types: students (undergraduate, PG, Ph.D. and others), supporting staff, faculty, and resident staff (along with their families).
I have to design the wireless network with all of these user groups in mind.
I have two options for providing wireless access to the users.
I need input (pros and cons) on these options:
OPTION I
Separate SSID for each user category (like separate SSID for IT students, separate SSID for commerce students; and so on).
If I go with this approach, I will end up creating roughly 20 SSIDs. On the other hand, I will be able to apply policies based on user category and can also limit the time period for different user groups.
OPTION II
As a second option, I am thinking about creating a single SSID for all the users (or maybe 2 or 3 SSIDs).
With this approach, I will not need to create 'n' SSIDs and will only need to advertise ONE SSID for all the users, which will help me keep things simple.
But what I will lose with this approach is granularity: I will not be able to apply different policies to different user groups.
I am open to any other approach as well, and I want to do this in the best possible manner.
Please suggest which approach I should go with and, if possible, explain the pros and cons.
The option with a large number of SSIDs is undesirable because access points broadcast beacons for each SSID 10 times per second at the lowest mandatory rate. This may consume significant airtime, especially if you need to support the legacy 802.11b/g standards. There are recommendations to use no more than 3-5 SSIDs on any single AP (link1, link2).
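A back-of-the-envelope calculation shows why the SSID count matters. The numbers below are assumptions for illustration only: a beacon of about 300 bytes, sent at 1 Mbps with an 802.11b long preamble of 192 µs, and no allowance for contention or for neighbouring APs on the same channel. The function name and its parameters are hypothetical.

```python
def beacon_airtime_fraction(ssid_count, beacon_bytes=300, rate_mbps=1.0,
                            preamble_us=192.0, beacons_per_second=10):
    """Estimate the fraction of airtime one AP spends sending beacons.

    All defaults are illustrative assumptions, not measured values.
    """
    # Bits divided by Mbit/s gives microseconds on air for the frame body.
    frame_us = preamble_us + beacon_bytes * 8 / rate_mbps
    return ssid_count * beacons_per_second * frame_us / 1_000_000


for count in (3, 20):
    print(count, "SSIDs:", round(beacon_airtime_fraction(count) * 100, 1), "% of airtime")
```

Under these assumptions each SSID costs about 2.6% of the airtime, so 3 SSIDs use roughly 7.8% while 20 SSIDs use roughly 51.8%, before any client traffic is sent. Raising the lowest mandatory rate shrinks the figure considerably, which is why legacy 802.11b support makes the problem worse.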
Depending on the functionality of the network equipment different policies may be applied on a per-client or per-user basis.
You could differentiate user groups by using a RADIUS server and certificates. I believe some APs can even use this to set specific VLANs. You get a lot of flexibility, but you need to assign a certificate to every potential client.
Or you could assign each user group to a different subnet via the DHCP server (but that does not sound very secure, as people could manually change their IP to get more privileges).

Geo-location by Zip/Postal Code

I'm looking for a service to get a latitude/longitude dynamically by zip/postal code.
I also need this service to give me an address (city/state/country) by providing an IP Address.
I will be using this service on my website and don't want to download and maintain a database.
I looked at a few services, some too expensive and some free with a max amount of daily/monthly lookups.
What are some good free services and what are some good paid services (Not too expensive) that allow for a large amount of queries?
I am using asp.net c# with MS SQL Server 2008 or later.
Thanks in advance.
The US zip codes are free (maybe just not very well maintained):
http://www.census.gov/geo/www/tiger/tigermap.html#ZIP
Also see a (whole world) crowd sourcing project:
http://www.freethepostcode.org/
The UK postcode location database was leaked last year or so, or maybe it was made public by some government scheme; I can't remember. It is definitely available here: http://www.freepostcodes.org.uk/
For lat/long: Google and Yahoo allow for several thousand queries per day, at least the last time I used them.
For GeoIP lookup, I can't say. In the past, I've used aggregate data from Google AdWords. This may be true of other advertising networks, or some may give you info per user.

Geocoding - Grouping multiple addresses into major cities

I've been hunting through previous questions on geocoding, and while many are helpful, I'm not finding one that fits my needs.
I need to group multiple addresses to the nearest city centers. My only address information is city, country, and state (if applicable). For example, all addresses in San Francisco and within miles should be listed as San Francisco. I'll need to know the count of addresses rolled-up to San Francisco.
I'm open to suggestions on how to approach this. I don't particularly want to manually identify a list of major cities if possible. Is there a list of these I can start from?
What about using an average lat/long location of all addresses within miles? Granted the final 'center point' would move around a bit as the average is computed but perhaps that is an approximate solution. Not quite sure how to do this so again, appreciate input!
Great question. I think more generally what you want is some standard way of rolling up cities into metropolitan areas and you're exactly right that you don't want to create or maintain a list of your own.
Yahoo! GeoPlanet provides a geographic ontology with a pretty thorough hierarchy. If you were happy with standard administrative divisions (like county or state), it would be easy, but I think you're looking for something a little more general than that. But GeoPlanet also provides zones, often -- in the US -- including the town's Metropolitan Statistical Area.
If you have each city name, you could use GeoPlanet to find any MSA zones that the city belongs to and roll up to that (GeoPlanet provides a bounding box and centroid for each MSA, so you can easily place it on a map). Rural towns that aren't part of a US Census Bureau MSA may not need to be grouped with the nearest city (which may be far away anyway).
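If you do end up with coordinates for each address (from a geocoder) and a list of city centres (for example, MSA centroids), the distance-based roll-up from the question can be sketched as below. This is a rough illustration under assumptions: the coordinates are approximate, `roll_up` and the "unassigned" bucket are hypothetical names, and the 25-mile radius is an arbitrary placeholder since the question doesn't give a figure.

```python
import math
from collections import Counter

EARTH_RADIUS_MILES = 3958.8


def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles between two lat/long points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


def roll_up(points, centers, max_miles):
    """Count points per nearest centre; points beyond max_miles are 'unassigned'."""
    counts = Counter()
    for lat, lon in points:
        name, dist = min(
            ((n, haversine_miles(lat, lon, clat, clon))
             for n, (clat, clon) in centers.items()),
            key=lambda pair: pair[1],
        )
        counts[name if dist <= max_miles else "unassigned"] += 1
    return counts


# Approximate coordinates, for illustration only.
centers = {"San Francisco": (37.7749, -122.4194),
           "Los Angeles": (34.0522, -118.2437)}
points = [(37.8044, -122.2712),   # Oakland
          (37.6879, -122.4702),   # Daly City
          (34.1478, -118.1445),   # Pasadena
          (36.7378, -119.7871)]   # Fresno
print(roll_up(points, centers, max_miles=25))
```

With these inputs, Oakland and Daly City roll up to San Francisco, Pasadena to Los Angeles, and Fresno stays unassigned because it is far from both centres.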

Resources