How can I extract a location from text? - geolocation

I am working on a web mapping application and am facing the following issue.
A user can post an address, and the address can be in any format, such as
Street, City, State, Country or Country Street State City
I have mentioned just two formats, but the address could be in any format.
My task is to extract the city name, street, and country from the address. The problem is that multiple matching city names or streets may exist, so how can I do this?
I have all the information about locations in a database: city, country, street, and area code.

I don't believe there is an easy way to do what you want here. It seems the user can give the data in a way that is basically free-form, and there is no way to tell from the input alone what is a street name and what is a city name, etc. Unless you enforce some sort of format, nothing is going to work every time.
A different approach may be to take the input, remove things such as "St" and "Street", and then search the database for each of the given names against city, street, county, etc. From the results you will probably be able to determine the most likely address and get the user to confirm it.
A lot of government websites appear to use the approach I have just described when you register for things (e.g. voting). It is not perfect, however.
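The lookup described above could be sketched roughly as follows. This is only an illustration: the `KNOWN` sets stand in for the database tables, the suffix list is incomplete, and the input is assumed to be comma-separated, which real input may not be. All names here are hypothetical.

```python
import re

# Hypothetical in-memory stand-ins for the city, street and country tables.
KNOWN = {
    "city": {"springfield", "paris"},
    "street": {"main", "paris"},
    "country": {"usa", "france"},
}
# Street-type words to drop before searching (assumed, incomplete list).
SUFFIXES = {"st", "street", "rd", "road", "ave", "avenue"}


def candidates(address):
    """Return (name, possible_fields) for each comma-separated part."""
    result = []
    for part in address.split(","):
        # Keep alphabetic words only, so house numbers are dropped too.
        words = [w for w in re.findall(r"[a-z]+", part.lower())
                 if w not in SUFFIXES]
        name = " ".join(words)
        if name:
            fields = [f for f, names in KNOWN.items() if name in names]
            result.append((name, fields))
    return result
```

A part that comes back with more than one possible field (or none) is exactly the case where the user would be asked to confirm.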

Related

Country and Customer Dimension

I am unsure whether I should add a Country_Dimension, since I already have a Customer_Dimension which contains some redundant fields such as:
continent_name
country_name
postcode_#
Those could be two completely different things. A country name present in your Customer_Dimension will probably represent the country of an address your customer is using. Most likely the country he's living (or has lived) in. It can change over time since customers can switch addresses.
A dimension representing Countries will do exactly that, it represents countries. I think you first have to decide what the use of your dimension should be.

Identifying and relating cities from different sources

I have several providers, each of which sends me an Excel file with a list of cities. For each city they include a special code used for their operations, plus other data useful to my business.
The problem is that I have a mess with all these cities:
I have my own cities in my database, around 9000 records.
Provider A gives me an Excel file or web service with around 6000.
Provider B gives me another 5000.
Provider C ... etc
Some of the cities given by my providers are already in my database and I only have to update the required data I need.
Otherwise, I have to insert that new city in my database.
And this, each time a provider gives me an update of these cities.
The main problem is that I name a city differently from them, and they name it differently from each other. How can I know whether I already have that city or have to create a new one, given that we use different names?
The way I see it, I can only achieve this manually, by comparing their cities with mine.
Of course, that is too much work, so I wrote my own script. By implementing the Levenshtein function in the database, I can automatically see the closest matches and select one with a click. The script does the rest (it stores their special operation code for that city on my corresponding city in my database).
Even so, I still feel like I'm missing something. If there were a unique code for those cities this would be much easier and could be automated, but I have no code that identifies these cities other than my table identifier. The same goes for my providers, although some of them provide the postal code along with the cities, but not all.
Is there a better solution than mine for this? Is there a universal code that you usually use, or any other approach?
Edit:
Well, each city belongs to a country. Of course, I'm considering that.
In my city table I have an Id for each destination, and then a column for the operation code of each provider (I know, this could be better represented with one more relationship), plus country code, zip, url for seo...
Regarding the solution mentioned by MagnusL, creating a Synonyms table: why would I need to store the synonyms? As for the script you mentioned with Levenshtein and human interaction, that's exactly what I'm currently doing:
For each record supplied by a provider, I compare it against my destinations table and show the closest matches from my table.
But before this, I automatically link all those which match on zip code and country.
It's a lot of work to update my providers' special operation codes for each city. I am just curious about how people deal with this problem; I'm sure a lot of developers have to face this at some point.
If it is important that the cities are correctly matched, I would guess you must have some manual steps in your process. If you include names of smaller towns you will some day encounter that the same name could actually be two different places in two different countries. (Try Munich on Google Maps and you get one in Germany and one in North Dakota.)
A somewhat complicated, but I guess future proof, workflow is to use id numbers in place of city names in your main data table. Then set up a locations table with those id numbers as primary keys and your preferred name of the city followed by as many meta data columns as required for country code, zip code, WGS84 coordinates, continent name, whatever. Add another table for city name synonyms, with just id numbers and names (without UNIQUE constraint on the id column).
Let your import script try to match the city with help from as much meta data as possible (probably different meta data from different providers), together with the Levenshtein algorithm you mentioned, and let it be clever enough to ask for human interaction in those cases where no city or more than one city is matched. It can of course show you the closest possible guesses, so you can pick the right one and have it stored in the synonym table.
(Yes, it is a lot of coding to get there. If you find it worth it or not depends on how often you do these updates.)
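A minimal sketch of that workflow, assuming the locations and synonyms tables are small enough to hold in memory. The data, the distance threshold of 2 and the function names are all made up for illustration; the edit distance is the standard dynamic-programming Levenshtein algorithm.

```python
def levenshtein(a, b):
    """Classic dynamic-programming edit distance between two strings."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                 # deletion
                           cur[j - 1] + 1,              # insertion
                           prev[j - 1] + (ca != cb)))   # substitution
        prev = cur
    return prev[-1]


# Hypothetical locations table: id -> (preferred name, country code).
LOCATIONS = {1: ("Brussels", "BE"), 2: ("Munich", "DE"), 3: ("Munich", "US")}
# Hypothetical synonyms table: (lower-cased name, country code) -> id.
SYNONYMS = {("bruxelles", "BE"): 1}


def match_city(name, country, max_distance=2):
    """Return ("matched", id) or ("ask", candidate_ids) for a provider city."""
    key = (name.lower(), country)
    if key in SYNONYMS:
        return "matched", SYNONYMS[key]
    close = [loc_id for loc_id, (pref, cc) in LOCATIONS.items()
             if cc == country
             and levenshtein(name.lower(), pref.lower()) <= max_distance]
    if len(close) == 1:
        return "matched", close[0]
    return "ask", close  # none or several: a human has to decide


def confirm(name, country, loc_id):
    """Store a human decision so the same name matches automatically next time."""
    SYNONYMS[(name.lower(), country)] = loc_id
```

Storing the confirmed name is what makes the synonym table pay off: each provider spelling only needs a manual decision once.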
Tip: Wikipedia has articles listing the different names of cities, e.g. https://en.wikipedia.org/wiki/List_of_names_of_European_cities_in_different_languages
What if you used an extra table for name translation?
I.e., the table would have two columns: column A holds the name you use, and column B holds the name a provider uses. You might need to maintain this table manually, so that it looks like:
Bruxelles:Brussels
Bruxelles:Brussel
Bruxelles:Bruxelles
While importing, you would then look up the name of the city with a query such as (translation_table is only a placeholder name):
select A from translation_table where B = 'Brussels'
In your agglomerated database, names would then be consistent.
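As a rough illustration of the translation table, here is a sketch using an in-memory SQLite database. The table name `city_names` and the column names are assumptions chosen for the example, not anything prescribed above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# own_name plays the role of column A, provider_name the role of column B.
conn.execute("CREATE TABLE city_names (own_name TEXT, provider_name TEXT)")
conn.executemany(
    "INSERT INTO city_names VALUES (?, ?)",
    [("Bruxelles", "Brussels"),
     ("Bruxelles", "Brussel"),
     ("Bruxelles", "Bruxelles")],
)


def own_name(provider_name):
    """Translate a provider's city name to our own, or None if unknown."""
    row = conn.execute(
        "SELECT own_name FROM city_names WHERE provider_name = ?",
        (provider_name,),
    ).fetchone()
    return row[0] if row else None
```

A `None` result marks a name that still needs a manual entry in the table.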

Simplifying locality search

I'm working on a property portal (to be developed in ASP.NET MVC) that allows users to search for properties in a specific region or city.
I created tables for Country, State, City and Region.
These tables refer to one another from right to left, meaning Region refers to CityId, City refers to StateId, etc.
I have a web page with a single textbox which takes a state, city, region or zip code as input, or in other words any locality.
I don't want the user to select a state, then a city and then a region. The user should be able to search directly by city, region or zip code with a single textbox.
How can I get this job done with my current table structure? Do I need to change my table structure?
Either use drop-downs, making each one depend on the previous one, or, if you need a text search, let users enter a word or words and search each table for the keyword.
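The "search each table for the keyword" option might look something like the sketch below, shown in Python with SQLite purely for illustration. The `ZipCode` column on Region and the sample rows are assumptions; the question does not say where the zip code is stored.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE State  (Id INTEGER PRIMARY KEY, Name TEXT);
CREATE TABLE City   (Id INTEGER PRIMARY KEY, Name TEXT, StateId INTEGER);
CREATE TABLE Region (Id INTEGER PRIMARY KEY, Name TEXT, ZipCode TEXT,
                     CityId INTEGER);
INSERT INTO State  VALUES (1, 'California');
INSERT INTO City   VALUES (1, 'San Diego', 1);
INSERT INTO Region VALUES (1, 'La Jolla', '92037', 1);
""")


def search_locality(term):
    """Search every level for the term; returns (level, id, name) rows."""
    sql = """
        SELECT 'State', Id, Name FROM State WHERE Name LIKE :p
        UNION ALL
        SELECT 'City', Id, Name FROM City WHERE Name LIKE :p
        UNION ALL
        SELECT 'Region', Id, Name FROM Region
        WHERE Name LIKE :p OR ZipCode LIKE :p
    """
    return conn.execute(sql, {"p": f"%{term}%"}).fetchall()
```

Because each row reports the level it came from, the existing table structure can stay as it is; the caller follows CityId and StateId upwards to display the full locality.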

How to handle composite keys in Rails

I'm trying to learn Rails, and am unsure of the Rails way of (or alternative to) dealing with composite keys.
Suppose I have a table of countries -- let's say I use the ISO 3166-1 codes, and pick the Alpha2 code as the key. I think the Rails thing to do would be to call that id, right? So I have COUNTRIES = (id, alpha3_code, numeric_code, name)
Now, each country can have many ISO 3166-2 regions, so I have REGIONS = (id, country_id, name). I tell Rails that COUNTRIES has_many REGIONS and that REGIONS belongs_to COUNTRIES (am I right so far? Apart from the capitalisation, of course.) The region IDs are unique within a country, but not necessarily globally, so I need the composite key.
So now I have company offices, each of which is in a particular region. But I can only specify a region with a [country, region] pair, and I understand Rails doesn't like composite keys. So what is the Rails way to organise this, so that given an office I can, for instance, determine its region and country, and given a country can find all the states in that country?
Thanks in advance.

Location based information

I would like to show information on my website based on the user's geography. In my current design I would not want the user to enter their location/zip code.
Using the IP address I can find the user's location, but how do I leverage this information to show relevant events/information from surrounding cities/towns?
Thanks
Based on IPs, you only get limited accuracy when determining location. You should have an option that lets users enter their city/state or zip code.
Once you have long/lat, you just need to run a query to find records in your database a specific distance from that long/lat.
PHP/MySQL: Select locations close to a given location from DB
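As a sketch of the distance filter, the example below applies the haversine great-circle formula in plain Python over an in-memory list. The event data and function names are invented for illustration; in practice the same filter would normally run as a database query.

```python
import math


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two lat/long points."""
    r = 6371.0  # mean Earth radius in km
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * r * math.asin(math.sqrt(a))


# Hypothetical records: (name, latitude, longitude).
EVENTS = [
    ("Concert", 34.05, -118.24),
    ("Fair", 32.72, -117.16),
    ("Expo", 40.71, -74.01),
]


def events_within(lat, lon, radius_km):
    """Names of events no farther than radius_km from the given point."""
    return [name for name, elat, elon in EVENTS
            if haversine_km(lat, lon, elat, elon) <= radius_km]
```

Since IP-based coordinates are coarse, a generous radius is probably the safer default.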
http://www.ipinfodb.com/ip_location_api.php
This will send you an XML response with the pertinent information.
Let's say you can get their city from the IP address. You would need a database of cities with IDs that relate to other database entries. Like:
Database table cities        Database table restaurants
---------------------        --------------------------
ID  City                     ID  city_id  name
1   Los Angeles              1   1        Big Al's
Then you could search for restaurants that have the city_id of the city you got from their IP.
There are so many different approaches to relational databases. This is just one small example.
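The two tables above and the lookup by city could be sketched like this, again using an in-memory SQLite database for illustration. The helper name `restaurants_in` is made up, and the city name is assumed to come from the IP lookup.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE cities (id INTEGER PRIMARY KEY, city TEXT);
CREATE TABLE restaurants (
    id INTEGER PRIMARY KEY,
    city_id INTEGER REFERENCES cities(id),
    name TEXT
);
INSERT INTO cities VALUES (1, 'Los Angeles');
INSERT INTO restaurants VALUES (1, 1, 'Big Al''s');
""")


def restaurants_in(city):
    """Names of restaurants whose city_id points at the given city."""
    rows = conn.execute(
        "SELECT r.name FROM restaurants r "
        "JOIN cities c ON c.id = r.city_id "
        "WHERE c.city = ?",
        (city,),
    ).fetchall()
    return [name for (name,) in rows]
```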

Resources