Foursquare - get generic user location - geolocation

I am trying to get the "generic location" of a place through Foursquare.
For example - for a specific geolat/geolong provided to the Foursquare API - is there an API/algorithm I can use to determine whether the user is at home, at a mall, in the suburbs, or in the city (generic locations)?
I could process the location type and determine this myself, but I was wondering if there is an easier way out there?

Since the vast majority of Foursquare venues have one or more categories, you could use those to help you identify where a user is. If you have an OAuth token for a user, you can access their check-in history through this endpoint: https://developer.foursquare.com/docs/users/users.html.
You'll then see on their most recent check-in a list of categories, like "home," "bar," "nightlife," "movie theatre," that you can use to get an idea of what class of place they are at.
The database actually has very few issues with misspellings, duplicates, or false positives, but you will find some issues with miscategorized homes (many college dorms, for example, classify themselves as nightclubs or strip clubs as jokes). For a majority of locations, however, the results should be useful.
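To make that concrete, here is a rough sketch of fetching the user's most recent check-in and its venue categories with Java 11's HttpClient. The v2 check-ins path, the "v" version-date parameter and the response field names are written from memory of the v2 API, so verify them against the linked docs; the token is a placeholder.

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class LastCheckinCategory {
    public static void main(String[] args) throws Exception {
        // Placeholder OAuth token obtained for the user; "v" is the API version date.
        String token = "USER_OAUTH_TOKEN";
        String url = "https://api.foursquare.com/v2/users/self/checkins"
                + "?limit=1&v=20140101&oauth_token=" + token;

        HttpClient client = HttpClient.newHttpClient();
        HttpRequest request = HttpRequest.newBuilder(URI.create(url)).GET().build();
        HttpResponse<String> response = client.send(request, HttpResponse.BodyHandlers.ofString());

        // The JSON body contains response.checkins.items[0].venue.categories[],
        // whose "name" values ("Home (private)", "Bar", "Movie Theater", ...) are
        // what you would map onto your own home/mall/suburb/city buckets.
        System.out.println(response.body());
    }
}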

Have you considered the GeoNames API for searching Wikipedia entries by location?
Using Foursquare's API is a pretty clever idea, but based on prior experience, you may get some false positives due to misspellings and/or duplicate listings.
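If you try the GeoNames route, the Wikipedia-by-location lookup is a single GET against the findNearbyWikipediaJSON endpoint. A minimal sketch (Java 11+; the username is a placeholder for a registered GeoNames account):

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class NearbyWikipedia {
    public static void main(String[] args) throws Exception {
        // Free-tier GeoNames calls require a registered username (placeholder below).
        String url = "http://api.geonames.org/findNearbyWikipediaJSON"
                + "?lat=40.7484&lng=-73.9857&username=YOUR_GEONAMES_USERNAME";

        HttpClient client = HttpClient.newHttpClient();
        HttpRequest request = HttpRequest.newBuilder(URI.create(url)).GET().build();
        HttpResponse<String> response = client.send(request, HttpResponse.BodyHandlers.ofString());

        // Returns a "geonames" array of nearby Wikipedia entries (title, summary,
        // feature type, distance) that you can use to characterise the area.
        System.out.println(response.body());
    }
}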

Related

Given an address, how can I get the congressional district?

Is there a library or service that returns the US federal congressional district given a US address?
You can do this on govtrack.us. I am not sure if there is a code framework for this, but you could probably write one from this information.
Full disclosure - I work for a company that also provides congressional district data - smartystreets.com
Keep in mind that this information changes often. I have been looking for the last 4.5 years and haven't yet pinpointed an actual, absolutely reliable source for this data. All of the companies I have looked at, including mine, say that their source is the US Postal Service, but no one at the US Postal Service has been willing to tell me how frequently the data is updated within their system or what their source might be. So, just be aware.
I'm one of the creators of Geocodio and we had this problem ourselves. So we made it part of our geolocation tools!
You can use the Geocodio API or spreadsheet upload to look up Congressional districts, Representative/Senator contact information, etc. Docs: https://www.geocod.io/docs/#congressional-districts
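A rough sketch of that lookup with Java 11's HttpClient; the v1.7 path, the fields=cd parameter and the api_key name are written from memory of the linked Geocodio docs, so treat them as assumptions and double-check them there:

import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;

public class DistrictLookup {
    public static void main(String[] args) throws Exception {
        String address = URLEncoder.encode("1600 Pennsylvania Ave NW, Washington, DC", StandardCharsets.UTF_8);
        // "fields=cd" asks Geocodio to append congressional-district data;
        // the API key is a placeholder for your own.
        String url = "https://api.geocod.io/v1.7/geocode?q=" + address
                + "&fields=cd&api_key=YOUR_API_KEY";

        HttpClient client = HttpClient.newHttpClient();
        HttpRequest request = HttpRequest.newBuilder(URI.create(url)).GET().build();
        HttpResponse<String> response = client.send(request, HttpResponse.BodyHandlers.ofString());

        // Each geocoded result carries the appended congressional-district data
        // (district number and current legislators) alongside the coordinates.
        System.out.println(response.body());
    }
}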
You can use the Sunlight Foundation Congress API, which is free to use. Go to: http://tryit.sunlightfoundation.com/congress. You can determine the congressional district by zipcode (not completely reliable, since zipcodes can traverse multiple districts) or by latitude/longitude. Just use Google geocoding to retrieve the lat/lng from your address, and then you can look up the congressional district.
The whoismyrepresentative.com/api site can translate a zipcode into a congressional district. You can make a call with the URL below, replacing 10038 with your zipcode.
http://whoismyrepresentative.com/getall_mems.php?zip=10038&output=json
Hope that helps!
Also note that the Sunlight Foundation site now returns a 404 error, and govtrack.us does not have an API for searching for congressional districts.
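For completeness, a minimal Java 11 sketch of calling the whoismyrepresentative.com URL above and dumping the JSON; the response fields aren't documented in this thread, so inspect the body for the district values you need:

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class ZipToDistrict {
    public static void main(String[] args) throws Exception {
        // Same URL as above, with the zipcode swapped in and JSON output requested.
        String zip = "10038";
        String url = "http://whoismyrepresentative.com/getall_mems.php?zip=" + zip + "&output=json";

        HttpClient client = HttpClient.newHttpClient();
        HttpRequest request = HttpRequest.newBuilder(URI.create(url)).GET().build();
        HttpResponse<String> response = client.send(request, HttpResponse.BodyHandlers.ofString());

        // The JSON lists the members matched to that zipcode, including their district;
        // remember the caveat above that a zipcode can span multiple districts.
        System.out.println(response.body());
    }
}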

Some general Twitter4J questions

I'm trying to do a write-up of Twitter4J for part of a uni project, but I'm getting hung up on a few things. From the Twitter4J API:
void sample()
Starts listening on random sample of all public statuses. The default access level provides a small proportion of the Firehose. The "Gardenhose" access level provides a proportion more suitable for data mining and research applications that desire a larger proportion to be statistically significant sample.
This implies that "default access" is provided to the stream by default, but another type of access, "Gardenhose access", is available. Is this correct? And if so, how do you get the higher Gardenhose access?
I'm asking as I've seen some answers on SO suggest that there is only one level of access - the Gardenhose - and I'm trying to clear this up once and for all.
In addition to this, I would like a reference (if possible) for the number of tweets the sample stream allows access to. I've read lots of people cite 1% for "default access" and 10% for "gardenhose access", but I can't find this anywhere in the API documentation.
So to sum up, two questions:
Does the sample stream have a "default access" and a "gardenhose access", or just one of those?
How much of the Twitter firehose stream can these levels of access gain?
If replying, please have links to reference-able API where possible.
The gardenhose is different from the default sample stream; you would have had to request access from Twitter in order to use it.
However, I am not sure if Twitter still allows access to the gardenhose, or even if it still exists. It seems the current mechanism may be to use one of Twitter's preferred data partners:
Using the Streaming API?
Every Twitter account can connect to a small sampling of the Streaming API. Accounts that need increased access for data gathering or analytical reasons should check out our preferred partners page.
(source)
It may be different for students or educational institutions, and the gardenhose may still be available to you. Previously you would have to either e-mail api-research@twitter.com or use the following form, but I have no idea if these methods still work - the post is quite old.
As for the percentage of Tweets that the default sample stream allows access to, the best reference I could find was a comment made by a Twitter employee on the developer forums - emphasis mine:
I would recommend just using the 1% sample stream from https://stream.twitter.com/1/statuses/sample.json that you can connect to with your Twitter account. It's unlikely that you'll be in a situation where you can access all of the data and will have to make do with a sample. At about 230 million tweets a day, you'd still be theoretically getting 2.3 million tweets a day.
(source)
Although, again this is an old post.
Regarding the firehose stream, as specified by the documentation, you need to be granted permission to access it; I believe very few people have full access to this stream:
GET statuses/firehose
This endpoint requires special permission to access.
Returns all public statuses. Few applications require this level of access. Creative use of a combination of other resources and various access levels can satisfy nearly every application use case.
Overall, documentation is scarce on the different access levels and what they offer; I suggest contacting Twitter directly to discuss your requirements, or contacting one of their data partners.
Apologies if this wasn't as concrete as you would have liked, good luck with your research.
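For reference, this is roughly what connecting to the default sample() stream looks like with Twitter4J (3.x/4.x assumed); the credentials are placeholders, and the access level you actually receive is whatever your account has been granted:

import twitter4j.Status;
import twitter4j.StatusAdapter;
import twitter4j.TwitterStream;
import twitter4j.TwitterStreamFactory;
import twitter4j.conf.ConfigurationBuilder;

public class SampleStreamDemo {
    public static void main(String[] args) {
        // Placeholder credentials for an ordinary (default access) Twitter account.
        ConfigurationBuilder cb = new ConfigurationBuilder()
                .setOAuthConsumerKey("CONSUMER_KEY")
                .setOAuthConsumerSecret("CONSUMER_SECRET")
                .setOAuthAccessToken("ACCESS_TOKEN")
                .setOAuthAccessTokenSecret("ACCESS_TOKEN_SECRET");

        TwitterStream stream = new TwitterStreamFactory(cb.build()).getInstance();

        // StatusAdapter provides no-op implementations of the listener callbacks,
        // so only the one we care about is overridden.
        stream.addListener(new StatusAdapter() {
            @Override
            public void onStatus(Status status) {
                System.out.println(status.getUser().getScreenName() + ": " + status.getText());
            }
        });

        // Connects to whatever sample the account is entitled to: the default level
        // unless Twitter has granted elevated (e.g. gardenhose) access.
        stream.sample();
    }
}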

Tweets, Location, Keywords and Data

I'm trying to do some analysis on the locations people go to during winter. The approach I'm following is to get tweets from a specific city (say, New York) that contain the keyword Foursquare, then use Foursquare data for those users to see their check-ins and try to trace a pattern.
So, I'm stuck in the first phase. How do I get tweets from ONE city that contain the keyword FOURSQUARE? I'm not sure I understand how to use the streaming API correctly, and the REST API isn't working (it shows NOT AUTHORISED).
Could you walk me through the process in enough detail for a rookie to follow? Also, let me know if you have a better approach for analysing trends in check-ins.
Thanks
You want to read these:
https://dev.twitter.com/docs/api/1/get/search
https://dev.twitter.com/docs/platform-objects/places
You can give Twitter a latitude/longitude coordinate and a radius, or you can use the "place" field as a filter. Either way, expect to fine-tune this a bit to fit your needs. You also need to take into account that a lot of people might tweet without location services enabled.
If you want to use the REST API, you need to get an API key from Twitter.
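Since this thread already involves Twitter4J, here is a hedged sketch of a keyword-plus-geocode search with it (Twitter4J 3.x/4.x assumed; the credentials, the New York coordinates and the 15-mile radius are placeholders to tune):

import twitter4j.GeoLocation;
import twitter4j.Query;
import twitter4j.QueryResult;
import twitter4j.Status;
import twitter4j.Twitter;
import twitter4j.TwitterException;
import twitter4j.TwitterFactory;
import twitter4j.conf.ConfigurationBuilder;

public class CityKeywordSearch {
    public static void main(String[] args) throws TwitterException {
        // Placeholder credentials; the REST search endpoint requires OAuth,
        // which is why unauthenticated calls come back "not authorised".
        ConfigurationBuilder cb = new ConfigurationBuilder()
                .setOAuthConsumerKey("CONSUMER_KEY")
                .setOAuthConsumerSecret("CONSUMER_SECRET")
                .setOAuthAccessToken("ACCESS_TOKEN")
                .setOAuthAccessTokenSecret("ACCESS_TOKEN_SECRET");
        Twitter twitter = new TwitterFactory(cb.build()).getInstance();

        // Keyword plus a centre point and radius roughly covering New York City.
        Query query = new Query("foursquare");
        query.setGeoCode(new GeoLocation(40.7128, -74.0060), 15, Query.MILES);

        QueryResult result = twitter.search(query);
        for (Status status : result.getTweets()) {
            System.out.println(status.getUser().getScreenName() + ": " + status.getText());
        }
    }
}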

Geolocation and getting a city from an input address (Rails)

The app I'm building needs to be able to match up users to events based on the city/town they're in. I'm still relatively new to Rails and completely new to Geolocation and using locations in an app. I'd figured on a design where users have one or many cities, and events would have one city which I'd hoped to extract without specifically asking the user for it, by getting it from the event address entered.
Mostly to provide some outside checking to help get the address entered correctly and consistently, but also to show a map, I installed this jquery address picker (https://github.com/sgruhier/jquery-addresspicker). Unfortunately the data returned by Google doesn't include a city but a "locality" or an "administrative area" that doesn't correlate reliably to city names. The localities being returned are more like what we in my home town would call "suburbs". What I need to procure is a city so I can allow users to search all events in their city rather than just the ones in their suburb.
Can anyone offer advice on how I could go about doing this? Many thanks.
Edit: Should maybe add that I'm wanting to do geocoding client-side so I don't run into problems with Google Maps limits or have to pay for geocoding etc.
There are some gems that provide you with that and many other geo-related features, like calculating distances.
Here are the two most famous: https://github.com/alexreisner/geocoder and https://github.com/imajes/geokit
In the future I highly recommend heading to https://www.ruby-toolbox.com/ to see what is already available as a gem and which ones are the most popular at the moment.
For raw address info, use Google Maps API Reverse Geocoding which accepts lat/lon inputs and returns street address components. Modern browsers support location awareness (geolocation), with user permission, and will give you a lat/lon that "tends to be close" to where the browser is. That will probably get you a correct city/town in most cases.
The maps API is part of Google's broad suite of API tools -- there are gems that handle any Google API (well, most of them), or check out Google Maps for Rails, which will at the very least give you a good head start on how to use the API.
But if you're looking to validate a postal code, this method will come up short, since the location awareness will vary in accuracy depending on the browser, the device (more accurate for mobile), the connection, population density, network coverage, and so on. Also, calling the
If you can get GPS-accurate lat/lon then it will be much more accurate ... except in some cases like in large cities, a single building will have its own postal code, so a few feet one way or the other might matter.
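If you do end up calling the Geocoding API server-side, the reverse lookup is a single GET. A rough Java 11 sketch follows; the API key is a placeholder (Google's quota and key requirements have changed since this was written), and the city usually sits in the "locality" address component that the question mentions:

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class ReverseGeocodeCity {
    public static void main(String[] args) throws Exception {
        // lat/lon as supplied by the browser's geolocation API or the address picker;
        // the key is a placeholder for your own Google Maps API key.
        String url = "https://maps.googleapis.com/maps/api/geocode/json"
                + "?latlng=40.7128,-74.0060&key=YOUR_API_KEY";

        HttpClient client = HttpClient.newHttpClient();
        HttpRequest request = HttpRequest.newBuilder(URI.create(url)).GET().build();
        HttpResponse<String> response = client.send(request, HttpResponse.BodyHandlers.ofString());

        // Each result has an address_components array; the entry whose "types"
        // include "locality" (or, failing that, "postal_town" or
        // "administrative_area_level_2") is usually the closest thing to a city name.
        System.out.println(response.body());
    }
}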

What POI (points of interest) DB can I use for a commercial app

I need a POI database for a startup project I am working on - there will be a free basic version and a premium paid version, in the sense that the user will pay a monthly subscription.
I would like to have Foursquare-type check-ins to places and Plancast-type functionality to search for places (one-line search). I.e. I need to:
perform a search for POIs around a location
associate users to that POI, with a time stamp
allow users to add own POIs
provide free-text search for POIs (a la Google one-line search)
The Google API allows great search, but I understand there are limits on the number of requests that can be made? This would prevent scaling, and may result in the application breaking when there are too many users. Also, what do Google's T&Cs say about using this in a paid-for service?
OpenStreetMap, I understand, does not have these constraints, but do they also provide a good one-line-search-type API? Or how else could I solve this?
Have a look at http://eventful.com/ or http://qype.com - they both have APIs you can call to find out what's happening near you. You can convert these events into POIs in your application. The APIs are free to use (your app just needs to credit Eventful or Qype).
Take a look at the API at http://compass.webservius.com - it may not have everything you need, but it has very rich data on 16+ million business locations in the USA.
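On the OpenStreetMap side of the question: OSM's Nominatim service (not mentioned in the answers above, so treat this as a suggestion to evaluate) does expose a free-text search endpoint. A rough Java 11 sketch, with the query and User-Agent as placeholders; note that Nominatim's usage policy limits how hard you can hit the public instance, so a commercial app would likely self-host it or use a commercial provider:

import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;

public class OsmOneLineSearch {
    public static void main(String[] args) throws Exception {
        String q = URLEncoder.encode("coffee near Times Square, New York", StandardCharsets.UTF_8);
        String url = "https://nominatim.openstreetmap.org/search?q=" + q + "&format=json&limit=5";

        HttpClient client = HttpClient.newHttpClient();
        // Nominatim's usage policy asks for an identifying User-Agent.
        HttpRequest request = HttpRequest.newBuilder(URI.create(url))
                .header("User-Agent", "my-poi-app/0.1 (you@example.com)")
                .GET()
                .build();
        HttpResponse<String> response = client.send(request, HttpResponse.BodyHandlers.ofString());

        // Each hit includes a display_name, lat/lon and OSM type/class that you
        // can store as a POI record in your own database.
        System.out.println(response.body());
    }
}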
