Using YQL with a craigslist apartment search, I get results in the following form. Is there any way to get latitude and longitude information from this, or do I have to geocode the address? Are there any sources other than craigslist that provide property details along with geolocation?
{
"about": "http://kolkata.craigslist.co.in/apa/2559284148.html",
"title": [
"Temporary Stay Rental (Kasba Area, Kolkata) 2bd 1100sqft",
"Temporary Stay Rental (Kasba Area, Kolkata) 2bd 1100sqft"
],
"link": "http://kolkata.craigslist.co.in/apa/2559284148.html",
"description": "FURNISHED apartment for executives, professionals, NRIs and visitors for temporary stay in Kolkata (Calcutta).<br>\n<br>\nLOCATION: Kasba area near Delhi Public School - close to Gariahat and EM Bypass<br>\n<br>\n2 bedrooms, 2 bathrooms, a spacious living room, a separate dining room and a modular kitchen. Cooking facility - cooking gas burner, utensils, refrigerator etc. <br>\n<br>\nApartment located at the second floor of a four story building. Elevator available.<br>\n<br>\nHot water, air conditioned bedrooms and one covered parking space available.<br>\n<br>\nRent is INR 2,000.00 (or USD 50.00) per day for a minimum stay of 15 days. Costs of electricity and cooking gas charged separately on actual usage.<br>\n<br>\nShorter stay possible at negotiable rates.<br>\n<br>\nAppropriate ID required for renting.<!-- START CLTAGS -->\n\n\n<br><br><ul class=\"blurbs\">\n<li> <!-- CLTAG GeographicArea=Kasba Area, Kolkata -->Location: Kasba Area, Kolkata\n<li>it's NOT ok to contact this poster with services or other commercial interests</ul>\n<!-- END CLTAGS -->",
"date": "2011-08-22T10:05:43+05:30",
"language": "en-us",
"rights": "Copyright © 2011 craigslist, inc.",
"source": "http://kolkata.craigslist.co.in/apa/2559284148.html",
"type": "text",
"issued": "2011-08-22T10:05:43+05:30"
}
No. You will need to map the regionIds to coordinates or a ZIP code.
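If you do end up geocoding, the CLTAG comment inside the description field is the most structured location hint in the item shown above. Here is a minimal Python sketch, assuming the comment always has the form seen in that sample. The helper name extract_geographic_area is hypothetical, and the returned string would still have to be passed to a geocoder of your choice.

```python
import re

# Matches the location tag that craigslist embeds as an HTML comment, e.g.
# <!-- CLTAG GeographicArea=Kasba Area, Kolkata -->
CLTAG_AREA = re.compile(r"<!--\s*CLTAG\s+GeographicArea=(.*?)\s*-->")


def extract_geographic_area(description):
    """Return the GeographicArea value from a description, or None if absent."""
    match = CLTAG_AREA.search(description)
    return match.group(1).strip() if match else None


# Shortened copy of the "description" field from the sample result.
description = (
    "Appropriate ID required for renting.<!-- START CLTAGS -->\n"
    "<li> <!-- CLTAG GeographicArea=Kasba Area, Kolkata -->"
    "Location: Kasba Area, Kolkata\n<!-- END CLTAGS -->"
)
print(extract_geographic_area(description))
```

The value is free text typed by the poster, so expect neighbourhood-level accuracy at best.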
I currently have an NLP project in which I am trying to use NLTK to recognize a PERSON name. However, the problem is more challenging than just part-of-speech tagging.
input = "Hello world, the case is complex. John Due, the plaintiff in the case has hired attorney John Smith for the case."
The challenge is that I only want the attorney's name returned from the whole document, not the other person: "John Smith", type: PERSON, occupation: attorney. The return value could look like this, or just be "John Smith".
{
"name": "John Smith",
"type": "PERSON",
"occupation": "attorney"
}
I have tried NLTK part-of-speech tagging and also the Google Cloud Natural Language API, but they only helped me detect the PERSON name. How can I detect whether that person is an attorney? Please guide me to the right approach. Do I have to train on my own data or corpus to detect "attorney"? I have thousands of court document txt files.
The thing with pre-trained Machine Learning models is that there is not much room for flexibility in what you can achieve. Tools such as Google Cloud Natural Language offer some really interesting functionality, but you cannot make them do work they were not built for. In such a case, you would need to train your own models or try a different approach, using tools such as TensorFlow, which require considerable expertise to obtain decent results.
However, regarding your precise use case, you can use the analyzeEntities method to find named entities (common nouns and proper names). It turns out that, if the word attorney is next to the name of the person who is actually an attorney (as in "I am John, and my attorney James is working on my case." or your example "Hello world, the case is complex. John Due, the plaintiff in the case has hired attorney John Smith for the case."), it will bind those two entities together.
You can test that using the API Explorer, and you will see that for the request:
{
"document": {
"content": "I am John, and my attorney James is working on my case.",
"type": "PLAIN_TEXT"
},
"encodingType": "UTF8"
}
Some of the resulting entities are:
{
"name": "James",
"type": "PERSON",
"metadata": {
},
"salience": 0.5714066,
"mentions": [
{
"text": {
"content": "attorney",
"beginOffset": 18
},
"type": "COMMON"
},
{
"text": {
"content": "James",
"beginOffset": 27
},
"type": "PROPER"
}
]
},
{
"name": "John",
"type": "PERSON",
"metadata": {
},
"salience": 0.23953272,
"mentions": [
{
"text": {
"content": "John",
"beginOffset": 5
},
"type": "PROPER"
}
]
}
In this case, you will be able to parse the JSON response and see that James is (correctly) connected to the attorney noun, while John is not. However, as per some tests I have been running, this behavior seems to be only reproducible if the word attorney is next to one of the names you are trying to identify.
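Parsing the response for this pattern can be sketched in Python as follows. This assumes the entities have already been decoded from JSON into a list of dicts shaped like the excerpt above. The function name people_with_role is hypothetical and not part of any Google client library.

```python
def people_with_role(entities, role):
    """Return names of PERSON entities that carry a COMMON mention equal to role."""
    names = []
    for entity in entities:
        if entity.get("type") != "PERSON":
            continue
        for mention in entity.get("mentions", []):
            text = mention.get("text", {}).get("content", "")
            if mention.get("type") == "COMMON" and text.lower() == role.lower():
                names.append(entity["name"])
                break  # one matching mention is enough for this entity
    return names


# Trimmed copy of the two entities from the response shown above.
entities = [
    {
        "name": "James",
        "type": "PERSON",
        "mentions": [
            {"text": {"content": "attorney", "beginOffset": 18}, "type": "COMMON"},
            {"text": {"content": "James", "beginOffset": 27}, "type": "PROPER"},
        ],
    },
    {
        "name": "John",
        "type": "PERSON",
        "mentions": [
            {"text": {"content": "John", "beginOffset": 5}, "type": "PROPER"},
        ],
    },
]
print(people_with_role(entities, "attorney"))
```

The filter only finds what the API has already bound together, so it inherits the adjacency limitation described above.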
I hope this can be of help for you, but in case your needs are more complex, you will not be able to do that with an out-of-the-box solution such as Natural Language API.
I am trying to determine the cities, states, and countries for twitter users.
The location field returns the location, but I need to parse it and store this data in a structured format.
For instance, if the location in a user's bio is "London", it should store the city as London, and the country as UK. If it's "Albany, NY", it should store the city as Albany, the state as NY, and the country as USA. If it's just "NY", it should store the state as NY, and country as USA. If it's "India", it should store the country as India (with no city or state). Obviously if the location is nonsense like "outer space", it will return nothing.
Is there a gem out there that does something like this? If not, is there any way I can do this intelligently leveraging some 3rd party?
I faced the same problem when geolocating Twitter locations. The best free service I found is OpenStreetMap.
It is really easy to use and the response is JSON.
Try it yourself: http://nominatim.openstreetmap.org/search?q=london&format=json&addressdetails=1&accept-language=en
Here is the first element that matches "london":
{
"place_id": "97592906",
"licence": "Data \u00a9 OpenStreetMap contributors, ODbL 1.0. http:\/\/www.openstreetmap.org\/copyright",
"osm_type": "relation",
"osm_id": "65606",
"boundingbox": [
"51.2867584228516",
"51.6918754577637",
"-0.510375142097473",
"0.334015518426895"
],
"lat": "51.5072759",
"lon": "-0.1276597",
"display_name": "London, Greater London, England, United Kingdom",
"class": "place",
"type": "city",
"importance": 0.9654895765402,
"icon": "http:\/\/nominatim.openstreetmap.org\/images\/mapicons\/poi_place_city.p.20.png",
"address": {
"city": "London",
"county": "London",
"state_district": "Greater London",
"state": "England",
"country": "United Kingdom",
"country_code": "gb"
}
}
As you can see, the address field contains all the information you need.
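Turning one such result into the structured city/state/country record from the question could look like the sketch below. It assumes the result has already been decoded from JSON into a dict. The function name structured_location is hypothetical, and the fallback to "town" or "village" is an assumption for smaller places where Nominatim may not return a "city" key.

```python
def structured_location(result):
    """Reduce one Nominatim search result to city / state / country fields."""
    address = result.get("address", {})
    return {
        # Assumption: smaller places may appear under "town" or "village".
        "city": address.get("city") or address.get("town") or address.get("village"),
        "state": address.get("state"),
        "country": address.get("country"),
        "country_code": address.get("country_code"),
    }


# Trimmed copy of the "london" result shown above.
london = {
    "display_name": "London, Greater London, England, United Kingdom",
    "address": {
        "city": "London",
        "county": "London",
        "state_district": "Greater London",
        "state": "England",
        "country": "United Kingdom",
        "country_code": "gb",
    },
}
print(structured_location(london))
```

Missing fields come back as None, which covers inputs like "India" where there is no city or state. An empty result list (for example for "outer space") should be handled before calling the function.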
I'm trying to find some IP addresses for testing IP geolocation functionality on a website. Does anyone know of a good way to find static IP addresses for certain cities (i.e. Seattle, Los Angeles), or a good way to find IP ranges for a U.S. city?
My service http://ipinfo.io has an API that lets you look up the city for a given IP, e.g.:
$ curl ipinfo.io/8.8.8.8
{
"ip": "8.8.8.8",
"hostname": "google-public-dns-a.google.com",
"loc": "37.385999999999996,-122.0838",
"org": "AS15169 Google Inc.",
"city": "Mountain View",
"region": "California",
"country": "US",
"phone": 650
}
There's no API to do the reverse (find IPs for a given city), but you can do a Google search restricted to site:ipinfo.io to turn up IPs from the given city. For example, searching for Seattle, US turns up the following pages:
http://ipinfo.io/97.113.203.115
http://ipinfo.io/174.21.174.240
http://ipinfo.io/54.200.79.127
http://ipinfo.io/207.171.163.31
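Once you have a few such IPs, the responses can be turned into test fixtures. Here is a minimal Python sketch, assuming the response has the shape shown above. The names parse_loc and matches_city are hypothetical helpers, not part of the ipinfo.io API.

```python
import json


def parse_loc(record):
    """Split the "loc" field ("lat,lon") into a (lat, lon) tuple of floats."""
    lat, lon = (float(part) for part in record["loc"].split(","))
    return lat, lon


def matches_city(record, expected_city):
    """Check a lookup result against the city the test expects."""
    return record.get("city") == expected_city


# Abbreviated copy of the response for 8.8.8.8 shown above.
record = json.loads(
    '{"ip": "8.8.8.8", "loc": "37.385999999999996,-122.0838", '
    '"city": "Mountain View", "region": "California", "country": "US"}'
)
print(parse_loc(record), matches_city(record, "Mountain View"))
```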
I am working on a geographically aware high score server, and would like to be able to list scores like "First place in Dutchess County" or "Third place in the State of New York". I can reverse geocode the user's location and get a placemark that lists AdministrativeArea, etc.
The reverse geocoder used by iOS and Google would return "Dutchess" and "New York" for the above examples, so I need to supply "County" and "State" for the United States.
However, the game is global, so I need to know the names of each geographic organization level in other English-speaking countries.
So, in the United States, Google / iOS placemark levels would be described as follows:
AdministrativeArea = "State"
SubAdministrativeArea = "County" (or "Parish" in Louisiana)
Locality = "City" or "Town"
Sublocality = (I'm calling this "Neighborhood")
PostalCode = "Zip Code"
What are all of these levels called in other English-speaking countries? (Canada, UK, Australia, New Zealand, Singapore, etc). If there's a resource that lists all these, I would love to know it. I think I may just not know what to search for on the web.
I'm not entirely sure of the effort/value ratio of this exercise. It could get rather difficult, especially with unitary authority areas in England.
In most of the United Kingdom,
country is "United Kingdom" [ie the country name]
administrative_area_level_1 could be "England" or "Scotland" etc [ie the country name]
administrative_area_level_2 might be "East Sussex" [county]
locality might be "Hailsham" [town]
postal_code is a postcode
However in London,
administrative_area_level_2 is "London" which isn't a county
administrative_area_level_2 might be "Greater London" too
administrative_area_level_3 might be "London Borough of Lewisham" [yay! Borough makes sense]
locality is "London" which isn't a locality
In unitary authority areas in England,
administrative_area_level_2 might be "Medway"
Unitary authorities replace county council and borough/district councils, but they are located within a "ceremonial county" which is what most people will use ordinarily. Places in Medway Council's area are in Kent. Unfortunately these county names aren't returned by the geocoder. Some counties (eg Berkshire) were abolished completely and replaced entirely by unitary authorities. However the old county name (Kent, or Berkshire) is the right name to use.
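If you go ahead anyway, one way to keep this manageable is a per-country lookup table with a fallback to no label at all. The Python sketch below is only illustrative. The US entries come from the question and the GB entries from the general case above, while the table name LEVEL_LABELS, the function level_label, and the pairing of iOS placemark fields with Google's administrative_area levels are assumptions. It does not handle London or unitary authorities.

```python
# Labels per country code and placemark level. Deliberately incomplete.
LEVEL_LABELS = {
    "US": {
        "AdministrativeArea": "State",
        "SubAdministrativeArea": "County",
        "Locality": "City",
        "PostalCode": "Zip Code",
    },
    "GB": {
        "AdministrativeArea": "Country",   # e.g. England, Scotland
        "SubAdministrativeArea": "County",  # e.g. East Sussex
        "Locality": "Town",                 # e.g. Hailsham
        "PostalCode": "Postcode",
    },
}


def level_label(country_code, level):
    """Return the local name for a placemark level, or None if unknown."""
    return LEVEL_LABELS.get(country_code.upper(), {}).get(level)


print(level_label("US", "SubAdministrativeArea"))
print(level_label("gb", "PostalCode"))
print(level_label("SG", "AdministrativeArea"))
```

Returning None for unknown combinations lets the score list fall back to the bare place name instead of printing a wrong label.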
I need to get all the buildings with the "church" function that are within 100 km of a specified point (lat, lng). I tried it this way:
[{
"id": null,
"name": null,
"type": "/architecture/building",
"building_function" : [{"name" : "church"}],
"/location/location/geolocation" : {"latitude" : 45.1603653, "longitude" : 10.7976976},
"/location/location/area" : 100
}]
but I always get an empty response
code: "/api/status/ok"
result: []
status: "200 OK"
transaction_id: "cache;cache03.p01.sjc1:8101;2011-04-16T12:32:45Z;0035"
What am I missing?
Thanks
An area isn't a distance and you probably don't want an exact match to the value "100" anyway. You've asked for things which are precisely at that long/lat and have exactly that area.
Are you looking for churches which are less than a certain distance, more than a certain distance, or exactly the given distance? You probably want to look at the Geosearch API http://api.freebase.com/api/service/geosearch?help (although it's not a long term solution since it's been deprecated)
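If you already have coordinates for the candidate buildings, the "less than a certain distance" case can also be checked on the client side. The Python sketch below uses the haversine great-circle formula. It is not part of any Freebase API, the function names are hypothetical, and the Earth radius of 6371 km is a mean-value approximation.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, an approximation


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two lat/lon points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def within_km(center, point, radius_km):
    """True if point (lat, lon) lies within radius_km of center (lat, lon)."""
    return haversine_km(center[0], center[1], point[0], point[1]) <= radius_km


center = (45.1603653, 10.7976976)  # the point from the question
print(within_km(center, center, 100))
```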
The /location/location/area property is used to query locations which cover a certain amount of area. So your query looks for buildings centered at (45.1603653, 10.7976976) which cover an area of 100km. Naturally there are no results that match.
Searching for topics within 100km of those coordinates takes a little more work. You'll need to use the Geosearch service, which is still in alpha. The following query should give you the results that you're looking for:
http://www.freebase.com/api/service/geosearch?location={%22type%22:%22Point%22,%22coordinates%22:[10.7976976,45.1603653]}&type=/architecture/building&within=100&indent=1
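Building that URL by hand is error-prone, mainly because the location parameter is a JSON object whose coordinates are given as [longitude, latitude]. Here is a small Python sketch that assembles the same query string. The function name geosearch_url is hypothetical, only the parameters that appear in the URL above are used, and urlencode percent-encodes more characters than the hand-written URL does, which should be equivalent once decoded.

```python
import json
from urllib.parse import urlencode

GEOSEARCH_ENDPOINT = "http://www.freebase.com/api/service/geosearch"


def geosearch_url(lat, lon, topic_type, within_km):
    """Build a Geosearch URL for topics of topic_type near (lat, lon)."""
    # The point is given in longitude, latitude order, as in the URL above.
    location = {"type": "Point", "coordinates": [lon, lat]}
    params = {
        "location": json.dumps(location, separators=(",", ":")),
        "type": topic_type,
        "within": within_km,
        "indent": 1,
    }
    return GEOSEARCH_ENDPOINT + "?" + urlencode(params)


url = geosearch_url(45.1603653, 10.7976976, "/architecture/building", 100)
print(url)
```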
Once you have that list of buildings, you can query the MQL Read API to find out which ones are churches like this:
[{
"id": null,
"name": null,
"type": "/architecture/building",
"building_function" : [{"name" : "church"}],
"filter:id|=":[
"/en/verona_arena",
"/en/basilica_palladiana",
"/en/teatro_olimpico",
"/en/palazzo_del_te",
"/en/villa_capra_la_rotonda",
"/en/villa_badoer",
"/en/san_petronio_basilica",
"/en/palazzo_schifanoia",
"/en/palazzo_chiericati",
"/en/basilica_di_santandrea_di_mantova",
"/en/basilica_of_san_domenico",
"/en/castello_estense",
"/en/palazzo_dei_diamanti",
"/en/villa_verdi",
"/en/cathedral_of_cremona",
"/en/monte_berico",
"/en/villa_pojana",
"/en/san_sebastiano",
"/en/cremona_baptistery",
"/en/palazzo_della_pilotta"
]
}]
Right now it's only matching 2 results, so you'll probably need to edit some of those topics to mark them as churches.