Retrieve item, its pageview and geographical attribution in one Mediawiki API call - geolocation

This is a more comprehensive and complete version of the question I asked a while ago at Get location with Wikimedia API. I spent days digging through the MediaWiki API, GeoData API and Wikidata Query SPARQL Service documentation, and posted my question on Stack Overflow and several Wikimedia talk boards, but didn't find a satisfying answer.
The question is as follows: I am trying to use the GeoData API to perform the aforementioned task, the country and city attribution of a geolocated item. A short description of my task: get a list of Wikipedia pages around a location defined by coordinates, get some page properties (page views, main image), then get the country and the city (human-readable names, not IDs) that each page's item belongs to. Example: let's imagine I have a coordinate near Sagrada Familia as input. I want to receive a list of N Wikipedia pages within a 1 km radius of this coordinate. I want the number of page views and the main image for each of these pages. I also want each item described on a page to be identified as located in Barcelona, Spain. I could do this with one Wikimedia call and N Wikibase Query Service calls, but it is crucial to do it all in one call.
I found the GeoData API very clean, simple and user-friendly for retrieving various data based on an item's geolocation. But there are difficulties with retrieving the country/city affiliation of an item. While the country can theoretically be retrieved in a single request as a parameter of the GeoData API itself (though not always: only if it is specified, and then as an alphabetic code rather than a name), the city can only be retrieved for items that are cities themselves. On the other hand, this information does exist for every geotagged item and is available, for example, through the Wikibase SPARQL query service. But then I would need to make secondary requests to Wikidata, which I would like to avoid by all means. I have tried several approaches:
To call Wikimedia API (GeoData extension) from within Wikibase SPARQL request but it doesn't seem to work.
To retrieve Wikidata items around certain coordinates with Wikibase SPARQL request but then I can't get information from Wikipedia about page views.
To produce a list of pages around geo location with "generator=geosearch" and pass it to several props and pageprops of Wikimedia API calling for related Wikidata item. But then I only get the IDs of Wikidata properties, while I need human readable labels.
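For reference, the third approach (a `generator=geosearch` query combined with several props) can be sketched as a single request URL. This is only an illustration of the call shape: the `en.wikipedia.org` endpoint, the function name and the default values are assumptions, and the response still contains only the Wikidata item ID (`wikibase_item`), not human-readable city labels.

```python
from urllib.parse import urlencode

# Assumed endpoint; any wiki with GeoData and page view data should look similar.
API_URL = "https://en.wikipedia.org/w/api.php"


def build_geosearch_url(lat, lon, radius_m=1000, limit=10):
    """Build one MediaWiki query: pages near a point plus selected props."""
    params = {
        "action": "query",
        "format": "json",
        "generator": "geosearch",      # pages around a coordinate
        "ggscoord": f"{lat}|{lon}",
        "ggsradius": radius_m,         # metres
        "ggslimit": limit,
        "prop": "coordinates|pageimages|pageprops|pageviews",
        "coprop": "country|region",    # country code only if it was specified
        "piprop": "original",          # main image
        "ppprop": "wikibase_item",     # Wikidata ID, not a label
    }
    return f"{API_URL}?{urlencode(params)}"


# Approximate coordinates of Sagrada Familia, as in the example above.
url = build_geosearch_url(41.4036, 2.1744)
print(url)
```

The country and city labels would still have to come from Wikidata, which is exactly the second request the question is trying to avoid.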

Related

Get random address/coordinates in a specified town

Is there any way to give the Google Maps API or a similar API a town name and have it return a random address inside the town? I was hoping to get the data as JSON so I could parse it with SwiftyJSON in Xcode and use it, but I can't seem to find any way to get the address in the first place. If coordinates would be easier to get, then those would work too, as long as the result is random and inside the town borders.
You can try to use Google Places API Web Service. It allows you to query for place information on a variety of categories, such as: establishments, prominent points of interest, geographic locations, and more. You can search for places either by proximity or a text string. A Place Search returns a list of places along with summary information about each place.
A Nearby Search lets you search for places within a specified area. You can refine your search request by supplying keywords or specifying the type of place you are searching for.
A Nearby Search request is an HTTP URL of the following form:
https://maps.googleapis.com/maps/api/place/nearbysearch/output?parameters
where output may be either json or xml.
And if you want either an address or coordinates, you can use Geocoding for it. Here I found a tutorial on how to use Geocoding in iOS.
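As a rough sketch, the Nearby Search URL form described above could be assembled like this. The function name is made up for illustration, `YOUR_API_KEY` is a placeholder, and the parameter set (`location`, `radius`, `key`, optional `keyword`) is the basic one; check the current Places documentation before relying on it.

```python
from urllib.parse import urlencode

BASE = "https://maps.googleapis.com/maps/api/place/nearbysearch"


def nearby_search_url(lat, lng, radius_m, api_key, output="json", keyword=None):
    """Build a Nearby Search request URL (illustrative helper)."""
    if output not in ("json", "xml"):
        raise ValueError("output must be 'json' or 'xml'")
    params = {
        "location": f"{lat},{lng}",  # centre of the search area
        "radius": radius_m,          # metres
        "key": api_key,
    }
    if keyword:
        params["keyword"] = keyword
    return f"{BASE}/{output}?{urlencode(params)}"


print(nearby_search_url(39.739, -104.985, 1500, "YOUR_API_KEY"))
```

To get a "random" result, one option would be to pick a random entry from the returned list of places, or to generate a random point inside the town's bounding box and reverse-geocode it.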

Twitter Stream api filter by location AND track

I'm using the following line in order to get geolocated tweets that contain a certain keyword. (I'm using the word Madonna)
https://stream.twitter.com/1.1/statuses/filter.json?track=Madonna&locations=-180,-90,180,90
My problem is that the result does not consist of geolocated tweets that contain the keyword Madonna, but of geolocated tweets in general.
Any help on what I'm doing wrong here?
"-180,-90,180,90" is a bounding box that covers the whole world.
Currently, to get "AND" instead of "OR" behaviour from the Twitter stream API, you need to make a request like this: https://stream.twitter.com/1.1/statuses/filter.json?locations=-74,40,-73,41 and then filter the results by "Madonna" inside your app. Unfortunately, I can't find another way for now.
Filtering by locations can contain:
If coordinates is empty but place is populated, the region defined in
place is checked for intersection against the locations bounding box.
Any overlap will match.
Another, somewhat hacky solution is to use a track keyword that would never match, such as "dkghaskldfnascjkawenaf", and add a location bounding box.
The API applies an OR relationship between tracking and location, so you'll only receive tweets from within (or very near) the bounding box.
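Either way, the keyword filtering ends up in your own app. A minimal sketch of that step, assuming each tweet has already been decoded from the stream into a dict with a `text` field (the helper names are made up for illustration):

```python
def matches_keyword(tweet, keyword):
    """Case-insensitive substring match on the tweet text."""
    text = tweet.get("text", "")
    return keyword.lower() in text.lower()


def filter_by_keyword(tweets, keyword):
    """Keep only the tweets whose text mentions the keyword."""
    return [t for t in tweets if matches_keyword(t, keyword)]


# Made-up sample tweets, standing in for location-filtered stream results.
sample = [
    {"text": "Listening to Madonna in New York"},
    {"text": "Nice weather today"},
    {"text": "madonna tickets on sale"},
]
print(filter_by_keyword(sample, "Madonna"))
```

With a real stream you would apply `matches_keyword` to each tweet as it arrives rather than to a list.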

YQL documentation for the google.news search and the "geo" key

Does anyone know of any documentation for the YQL Google News Search? I am trying to understand the "geo" key values for the search.
This link shows an example of the search.
Thanks, and sorry for my bad English.
Cleber.
For details of the usage of the different keys on the YQL google.news table, see the source API's documentation.
In this case that can be found in the Google News Search API - JSON Developer's Guide, and the geo key is described as:
This optional argument tells the News Search system to scope search results to a particular location. With this argument present, the query argument (q) becomes optional. You must supply either a city, state, province, country, or zip code as in geo=Santa%20Barbara or geo=British%20Columbia or geo=Peru or geo=93108.
It goes on to say:
When using the geo property, please note the following:
Make sure the location you supply exists within the scope of your chosen news edition. For example, if you specify geo=Quebec for the Canadian edition of Google news, you probably won't get good results.
You can't combine geo with the topic property.
Some editions of News Search don't support the geo parameter. To test if geo works with a specific edition,
Go to that edition's landing page (for example, news.google.ca)
Click Add a Section.
In the Add a Local Section box on the right side of the page, enter a search query relevant to your desired location (for example, Quebec). You should now see a Local Results pane on the edition homepage.
If the Local Results pane is populated with results, you can use the geo parameter for that region.

Retrieve most retweeted tweets for a given hashtag

I'd like to retrieve the tweets for a given hashtag and sort them from the most retweeted to the least retweeted.
The closest thing I've found is using the search call and use the type tag:
E.g.: http://search.twitter.com/search.json?q=TheHashTagHere&result_type=popular
However, I'm not sure on how "popular" option works.
For instance, if it finds 100 tweets with that hashtag I believe it should show the X most retweeted tweets, and if none of those tweets have been retweeted then it should show X of them randomly (or sorted in some other way like the most recent).
Unfortunately, it follows some kind of unknown rule to decide what's popular and what's not, and even hashtags with thousands of tweets might return only one or two results.
I hope I made myself clear. Thanks in advance :)
PS: I'll use PHP but I think that shouldn't affect the question?
Results will sometimes contain a result_type field into the metadata with a value of either "recent" or "popular". Popular results are derived by an algorithm that Twitter computes, and up to 3 will appear in the default mixed mode that the Search API operates under. Popular results include another node in the metadata called recent_retweets. This field indicates how many retweets the Tweet has had.
Source (emphasis mine)
Just call with result_type=popular and check the recent_retweets node to see how popular each tweet is. result_type=popular will become the default in an upcoming release, so beware if you omit this parameter.
Results with popular tweets aren't ordered chronologically. *
If you would like to always have results to show, use result_type=mixed: recent results will have result_type in the "metadata" section with a value of "recent", and popular results will have "popular". A small reference about result_types:
mixed: Include both popular and real time results in the response.
recent: return only the most recent results in the response
popular: return only the most popular results in the response.
If a search query has any popular results, those will be returned at the top, even if they are older than the other results. *
*[Twitter API Announcements]
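Once the results are decoded, sorting them by retweets is a small step. A sketch, assuming each result is a dict whose `metadata` may carry `result_type` and `recent_retweets` as described in the quote above (results without that node are treated as having zero retweets; the function name is illustrative):

```python
def sort_by_retweets(results):
    """Sort search results from most to least retweeted."""
    def retweets(result):
        metadata = result.get("metadata", {})
        return int(metadata.get("recent_retweets", 0))
    return sorted(results, key=retweets, reverse=True)


# Made-up sample in the shape described above.
sample = [
    {"text": "a", "metadata": {"result_type": "recent"}},
    {"text": "b", "metadata": {"result_type": "popular", "recent_retweets": 12}},
    {"text": "c", "metadata": {"result_type": "popular", "recent_retweets": 40}},
]
for result in sort_by_retweets(sample):
    print(result["text"])
```

Since the question mentions PHP, the same idea there would be a `usort` on the decoded results with the same key.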
This isn't a programmatic method but rather works in the browser with a chrome extension (HackyBird) :
Install the extension
Search for a phrase e.g. #Social (twitter.com/search?q=%23Social)
Click the extension to sort it (you can adjust the ratio of retweets/likes used for sorting in extension options).
P.S. It'll also sort your or any other user's timeline.

Lookup telephone area code by latitude and longitude

Looking for a way to get a list of telephone area codes for a given latitude and longitude (and if necessary a given intl. code.) Note, I'm not talking about international dialing prefixes but the area codes within them.
For example, Denver Colorado is covered by the area codes 303 and 720. It's at 39.739 -104.985 and is in NANP 1. So given 39.739,-104.985,1 I'd like to get back [303,720].
Libraries, web services, DB's, or raw data that needs to be parsed into a DB, e.g., a web page of shape points, are all fine and the more global coverage the better, but just NANP 1 would be a great help.
Note I already use MaxMind and could turn the lat-lng into a fake IP and use that as the lookup key, but MaxMind claims only U.S. area codes (whether they truly mean U.S. or actually NANP I haven't tested) and seemingly only 1 per location (e.g. just 303 for Denver.) So it's a possibility, just not a great one.
UPDATE: I found some more relevant information, but no definitive solutions so I'm listing it here rather than in an answer:
I was able to find two U.S. databases http://www.area-codes.com/area-code-database.asp and http://www.nationalnanpa.com/area_codes/index.html (50% down the page, MS Access file.) The former includes lat/lng for $450 and the latter would require nearest-neighbor matching as KeithS talks about (it's probably the same DB underlying the NANPA City Query he found.)
Additionally I found information that implies Teleatlas has area code boundary maps and that ESRI includes area code shape files with copies of ArcGIS. Maponics seems to have data available: there's a Google Maps implementation of Maponics' data at http://www.usnaviguide.com/areacode.htm.
Wow. You'll definitely need some sort of pre-existing database of points. My first thought was ZIPList5 Geocode. It includes lat-long data for each active U.S. ZIP code, so you can throw this data in a DB table, index the hell out of it, and search by just about any geographic info you'd have access to. You can buy one copy for $40, with enterprise-level use for $100. Only problem is that this DB has only the "primary" area code for each ZIP code, so metro areas that have more than one (Dallas, Chicago, NYC) aren't going to show all of them.
You could try a two-pronged approach with some free data I found: for a given latitude and longitude, do a nearest-neighbors search of the data in the USGS Geographic Names Information System; it includes information on every human habitation center, and every named landmark feature, with lat/long coordinates of their centers. You now have your lat/long point mapped to the nearest town/city, ZIP code, county, and state. Now, you can compare that against this list of U.S. Area Codes, to find area codes matching any or all of the identifying information from the USGS. This is all free, and will eventually get you what you need, but you'll probably have to do some work to "massage" the two sets of data into something you can efficiently cross-reference, and/or you'll need to implement a good "search engine" that will accurately find nearest-neighbor named points, and then find area codes for locations matching the names.
One more thing to look at is NANPA, which administers area code assignment to begin with. I'm sure they have a more comprehensive downloadable DB, but the only free public access I could find was this search page, which will find area codes for any city with >20k people. You could turn your lat/long data into a city and state, and then hit this search page: NANPA City Query
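The two-pronged approach above boils down to a nearest-neighbor lookup followed by a join on city and state. Here is a small sketch of that idea; the in-memory rows are illustrative stand-ins for the USGS and area code data (they are not taken from either dataset's real format), and a linear scan is used where a real implementation would want a spatial index.

```python
import math


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, in kilometres."""
    r = 6371.0  # mean Earth radius
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))


def nearest_place(lat, lon, places):
    """Linear-scan nearest neighbor over a list of place dicts."""
    return min(places, key=lambda p: haversine_km(lat, lon, p["lat"], p["lon"]))


def area_codes_for(lat, lon, places, codes_by_city):
    """Map a point to its nearest named place, then look up that place's codes."""
    place = nearest_place(lat, lon, places)
    return codes_by_city.get((place["name"], place["state"]), [])


# Illustrative rows only.
places = [
    {"name": "Denver", "state": "CO", "lat": 39.739, "lon": -104.985},
    {"name": "Colorado Springs", "state": "CO", "lat": 38.834, "lon": -104.821},
]
codes_by_city = {
    ("Denver", "CO"): [303, 720],
    ("Colorado Springs", "CO"): [719],
}
print(area_codes_for(39.74, -104.99, places, codes_by_city))
```

Matching on the nearest named place is only an approximation: a point near an area code boundary can be closer to a town on the other side of it.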
Here is an option:
http://geocoder.ca/39.739,-104.985?geoit=xml
<TimeZone>America/Denver</TimeZone>
<AreaCode>720,303</AreaCode>
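A response like that can be parsed with the standard library. In this sketch only the `TimeZone` and `AreaCode` elements come from the output shown above; the `<geodata>` wrapper element and the function name are assumptions made so the sample is well-formed XML.

```python
import xml.etree.ElementTree as ET


def parse_area_codes(xml_text):
    """Return the comma-separated AreaCode values as a list of strings."""
    root = ET.fromstring(xml_text)
    node = root.find(".//AreaCode")
    if node is None or not node.text:
        return []
    return [code.strip() for code in node.text.split(",") if code.strip()]


# Hypothetical wrapper element around the two fields shown above.
sample = (
    "<geodata>"
    "<TimeZone>America/Denver</TimeZone>"
    "<AreaCode>720,303</AreaCode>"
    "</geodata>"
)
print(parse_area_codes(sample))
```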

Resources