Localized country names - localization

Where can I get the country names in all languages? I need these to localize an application.

The proper place to get this information is CLDR, the Unicode Common Locale Data Repository.
There you can find an up-to-date list of countries (core/common/main); the data is available in numerous formats.
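As a rough illustration of reading that data, here is a minimal sketch that pulls country names out of a CLDR locale file in its XML (LDML) form. The embedded snippet is only an illustrative stand-in for a file such as common/main/fr.xml, and the helper name `country_names` is made up for this example.

```python
import xml.etree.ElementTree as ET

# Illustrative stand-in for a CLDR locale file; the real file is far larger.
SAMPLE_FR_XML = """<ldml>
  <localeDisplayNames>
    <territories>
      <territory type="001">Monde</territory>
      <territory type="DE">Allemagne</territory>
      <territory type="IC">Îles Canaries</territory>
      <territory type="US">États-Unis</territory>
      <territory type="US" alt="short">É.-U.</territory>
    </territories>
  </localeDisplayNames>
</ldml>"""


def country_names(ldml_text):
    """Map two-letter region codes to display names from one LDML document."""
    root = ET.fromstring(ldml_text)
    names = {}
    for node in root.findall("./localeDisplayNames/territories/territory"):
        code = node.get("type")
        # Skip alternate forms (alt="short" etc.) and numeric region
        # groupings such as 001 (world), keeping only two-letter codes.
        if node.get("alt") is not None:
            continue
        if not (len(code) == 2 and code.isalpha()):
            continue
        names[code] = node.text
    return names


FRENCH = country_names(SAMPLE_FR_XML)
```

For a file on disk, `ET.parse(path).getroot()` could replace `ET.fromstring`. Note that two-letter codes in CLDR also cover territories that are not sovereign countries, so some filtering may still be needed.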

I recommend this site: https://github.com/umpirsky/country-list
It is a list of all countries with names and ISO 3166-1 codes in all languages and data formats.

There's probably an ISO standard document you can buy (a useful standard is ISO 3166-1, I think).
On the other hand, you might just be able to scrape the various language versions of this Wikipedia page, since it has a list of country names. I did a random check, and the entire list seemed to be available in at least one non-English language, too.

I know this is an old post, but I found something that might help others who end up viewing this post via a Google search.
This alternative to a select list gives (some) localised country names.
selectToAutocomplete by Jamie Appleseed
Take a look at the data-alternate-spelling attribute on the items within the select menu.

IP2Location provides a free CSV-formatted list of country names in 81 different languages. I've found this the most useful list for this purpose. The data can be fairly easily transformed into different formats if required:
https://www.ip2location.com/free/country-multilingual
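As an example of such a transformation, here is a minimal sketch that turns a CSV of this kind into a nested dictionary and then JSON. The column names (`lang`, `country_code`, `country_name`) and the sample rows are assumptions made for illustration; check the header of the downloaded file and adjust accordingly.

```python
import csv
import io
import json

# Illustrative sample only: the header names below are assumptions,
# not the documented layout of the IP2Location file.
SAMPLE_CSV = """lang,country_code,country_name
EN,DE,Germany
FR,DE,Allemagne
EN,ES,Spain
FR,ES,Espagne
"""


def names_by_language(csv_file):
    """Build {language: {country_code: name}} from an open CSV file."""
    table = {}
    for row in csv.DictReader(csv_file):
        language = row["lang"].lower()
        table.setdefault(language, {})[row["country_code"]] = row["country_name"]
    return table


TABLE = names_by_language(io.StringIO(SAMPLE_CSV))
# ensure_ascii=False keeps accented names readable in the JSON output.
AS_JSON = json.dumps(TABLE, ensure_ascii=False, sort_keys=True)
```

With a real file, `open(path, newline="", encoding="utf-8")` would take the place of `io.StringIO`.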

Related

What's the best way to order list of countries in French from a usability standpoint?

I have a list of countries that I would like to display in a dropdown menu. Now, because of their French translations, the list either needs to be re-ordered, or the countries need to be rewritten.
For example, Canary Islands is translated into Îles Canaries in French. Should I re-order the list so that all Îles are grouped together? Or should I write it as Canaries, Îles? Additionally, will people be able to navigate to Îles by typing the accented Î?
I agree it should be Îles first: stick with how someone would say the name. For example, United States is États-Unis (literally "States United"), but you wouldn't reverse the order to match English convention.
Why would you go for "Canaries, Îles"? Just to keep the English order?
Imagine an English list that had it as "Islands, Canary".
Translate the name the way a native speaker would expect it (in French).
Sort it following the French rules.
Find it when the user types either Î or I.
In general, internationalisation is not hard: just turn the tables and think about what you would want if German or French software were translated into English.
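The sorting and type-ahead points can be approximated with accent folding, as in the sketch below. This is not full French collation (a locale-aware collator would be the proper tool); it is only a standard-library approximation, and the function names and sample list are made up for illustration.

```python
import unicodedata


def fold(text):
    """Lowercase and strip accents, e.g. 'Îles' becomes 'iles'."""
    decomposed = unicodedata.normalize("NFD", text)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return stripped.casefold()


def sort_names(names):
    # Sort on the accent-free form so that É sorts with E and Î with I;
    # the original string is only a tie-breaker.
    return sorted(names, key=lambda name: (fold(name), name))


def matches(names, typed):
    """Names whose folded form starts with what the user typed."""
    prefix = fold(typed)
    return [name for name in names if fold(name).startswith(prefix)]


COUNTRIES = ["Suède", "Îles Canaries", "États-Unis", "Espagne", "Italie", "Égypte"]
SORTED = sort_names(COUNTRIES)
```

Because the typed text is folded in the same way as the names, `matches(COUNTRIES, "Î")` and `matches(COUNTRIES, "I")` return the same entries.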

Where is the full list of exemplar cities for English in the CLDR time zone xml?

There are a couple of exemplar cities and metazone names in core/common/main/en.xml from CLDR; however, the full list is not included in en.xml as it is for all the other languages.
Why is this and where do I find the entire list of exemplar cities for English?
I found the answer in the CLDR bug tracker.

Extract URLs automatically by industry names

I searched a lot about this, but all I found was how to extract titles from URLs.
What I want is the reverse. For example, I have the name "AB Electronics Inc."
When you type that into Google, the first thing that pops up is www.abelectronicsinc.com/
That's all I want. I want to know how to automate this, because I have about 1000 names and I don't want to type them in one by one. I have a text file with all these names, like:
ABIOMED, Inc.
Accumold®
Accuratus Lab Services
Accutron Inc.
Acme Monaco
Acorn Product Development
And so on.
So what I wish to know is how to find the URL for each of those names directly.
Thank you

How to make a classifier (training set) for countries and their capitals?

I want to detect whether a sentence contains a country name or a capital (e.g. Egypt, Cairo, USA, Washington, India, New Delhi, Kuwait, Trablos, Paris, etc.). I want to make a file containing all the country names and their capitals and run a binary search on that file to see if there's any match. Any idea on how to get a ready-made classifier that performs a binary search, or any other kind of search, on the data file would be helpful.
I am aware of a list of countries and their capitals, with some more info, but it is an SQL database file: http://www.joombah.com/downloads/view.download/23
I hope this database helps you solve your problem.
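For the binary-search part of the question, here is a minimal sketch using Python's `bisect` module on a sorted, case-folded list. The small `PLACES` list is sample data standing in for the file of countries and capitals, and the function names are made up for illustration.

```python
import bisect
import re

# Hypothetical sample data; a real list would be loaded from a file.
PLACES = sorted(name.casefold() for name in [
    "Egypt", "Cairo", "USA", "Washington", "India", "New Delhi", "France", "Paris",
])


def is_known(name):
    """Binary search for one candidate name in the sorted list."""
    key = name.casefold()
    index = bisect.bisect_left(PLACES, key)
    return index < len(PLACES) and PLACES[index] == key


def find_places(sentence, max_words=2):
    """Check every run of 1..max_words consecutive words against the list."""
    words = re.findall(r"[^\W\d_]+", sentence)  # runs of letters only
    found = set()
    for size in range(1, max_words + 1):
        for start in range(len(words) - size + 1):
            candidate = " ".join(words[start:start + size])
            if is_known(candidate):
                found.add(candidate.casefold())
    return sorted(found)
```

Checking word pairs as well as single words is what lets a two-word name such as "New Delhi" match. For a list of this size, a plain `set` lookup would be just as good as a binary search.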

User input parsing - city / state / zipcode / country

I'm looking for advice on parsing input from a user in multiple combinations of City / State / Zip Code / Country.
A common example would be what Google maps does.
Some examples of input would be:
"City, State, Country"
"City, Country"
"City, Zip Code, Country"
"City, State, Zip Code"
"Zip Code"
What would be an efficient and correct way to parse this input from a user?
If you are aware of any example implementations please share :)
The first step would be to break up the text into individual tokens using spaces or commas as the delimiting characters. For scalability, you can then hand each token to a thread or server (if using a MapReduce-like architecture) to figure out what each token is. For instance:
If we have numbers in the pattern, then it's probably a zip code.
Is the item in the list of known states?
Countries are also fairly easy to handle like states, there's a limited number.
What order are the tokens in compared to the common ways of writing an address? Most input will probably follow the local post office custom for address formats.
Once you have the individual token results, you can glue the parts back together to get a full address. In the cases where there are questions, you can prompt the user what they really meant (like Google maps) and add that information to a learned list.
The easiest way to add that support to an application, assuming you're not trying to build a map system, is to query Google or Yahoo and ask them to parse the data for you.
I am myself very fascinated with how Google handles that. I do not remember seeing anything similar anywhere else.
I believe you would try to split the input string into words using various delimiters: space, comma, semicolon, etc. That gives you several combinations. For each combination, you take each word and match it against a country, city, town and postal code database. Then you define a metric to evaluate the overall match result for each combination. There should also be cross rules: for example, if the postal code does not match well, but the country, city and town match well and in combination refer to a valid address, then the metric yields a high mark.
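A toy version of this split-and-score idea might look like the sketch below. The tiny lookup sets, the delimiter list and the scoring rule (fraction of recognised pieces) are all assumptions chosen for illustration; the cross rules mentioned above are left out.

```python
import re

# Hypothetical miniature databases; real ones would be far larger.
CITIES = {"new york", "paris", "springfield"}
STATES = {"ny", "il", "illinois"}
COUNTRIES = {"usa", "united states", "france"}
ZIP_RE = re.compile(r"\d{5}(?:-\d{4})?")  # US-style zip codes only

# Delimiters to try, as regular expressions.
DELIMITERS = [r",", r";", r"\s+"]


def label(piece):
    """Classify one piece, or return None if it is not recognised."""
    key = piece.casefold()
    if ZIP_RE.fullmatch(key):
        return "zip"
    if key in COUNTRIES:
        return "country"
    if key in STATES:
        return "state"
    if key in CITIES:
        return "city"
    return None


def candidates(text):
    """Yield one labelled split of the text per delimiter."""
    for pattern in DELIMITERS:
        pieces = [p.strip() for p in re.split(pattern, text) if p.strip()]
        yield [(piece, label(piece)) for piece in pieces]


def score(labelled):
    """Fraction of pieces that were recognised (0.0 for an empty split)."""
    if not labelled:
        return 0.0
    return sum(1 for _, kind in labelled if kind) / len(labelled)


def best_split(text):
    return max(candidates(text), key=score)
```

Here `best_split("New York, NY, USA")` prefers the comma split because all three pieces are recognised, whereas splitting on whitespace breaks "New York" apart and scores lower.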
It is certainly difficult and not an evening coding exercise. It also requires substantial computational resources: shared hosting would probably crack under just 10 requests, but a data center could serve it well.
I'm not sure whether there is an example implementation. Many geographical services are offered on a paid basis. Something as sophisticated as Google Maps would likely cost a fortune.
Correct me if I'm wrong.
I found a simple PHP implementation
http://www.eotz.com/2008/07/parsing-location-string-php/
Yahoo seems to have a webservice that offers the functionality (sort of)
http://developer.yahoo.com/geo/placemaker/
OpenStreetMap seems to offer the same search functionality on its homepage
http://www.openstreetmap.org/
Assuming you're only dealing with those four fields (City, Zip, State, Country), there is a finite set of values for every field except City, and even that is finite if you have a big enough city list. So just split the input on commas, then check each piece against each field's list.
Assuming we're talking about US addresses:
Zip is the most obvious, so check for that first.
State has 50x2 options (California or CA), so check that next.
Country has ~190x2 options, depending on how encompassing you want to be (US, United States, USA).
Whatever is left over is probably your City.
As far as efficiency goes, it might make sense to check a handful of 'standard' formats first, like Dan suggests.
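The elimination order described in this answer (zip, then state, then country, with the leftover taken as the city) could be sketched as follows. The lookup tables are tiny samples, and scanning the fields from right to left is a choice made for this example so that an ambiguous name such as "New York" is read as a city when a state has already been found after it.

```python
import re

# Hypothetical sample tables; real ones would list every state and country.
STATES = {"ca": "CA", "california": "CA", "ny": "NY", "new york": "NY"}
COUNTRIES = {"us": "US", "usa": "US", "united states": "US"}
ZIP_RE = re.compile(r"\d{5}(?:-\d{4})?")  # US zip, optionally ZIP+4


def parse_location(text):
    """Split on commas, then classify each field by elimination."""
    result = {"city": None, "state": None, "zip": None, "country": None}
    leftovers = []
    fields = [field.strip() for field in text.split(",") if field.strip()]
    # Right to left: country and state usually come after the city.
    for field in reversed(fields):
        key = field.casefold()
        if result["zip"] is None and ZIP_RE.fullmatch(key):
            result["zip"] = field
        elif result["state"] is None and key in STATES:
            result["state"] = STATES[key]
        elif result["country"] is None and key in COUNTRIES:
            result["country"] = COUNTRIES[key]
        else:
            leftovers.append(field)
    if leftovers:
        # Restore the original left-to-right order of unclassified fields.
        result["city"] = ", ".join(reversed(leftovers))
    return result
```

For example, `parse_location("New York, NY, USA")` reads "USA" as the country and "NY" as the state, leaving "New York" as the city.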

Resources