Fetching all past tweets from a user's home timeline on Twitter

I'm trying to write a program that will retrieve all of the tweets the user has seen on their Twitter home timeline (i.e., from the people they're following, as they would see at twitter.com). I realise this is a lot of data, and that the REST API has limitations.
What would be the best way to do this? Slowly retrieve tweets 200 at a time (or whatever the limit is), keeping in mind the limit of 350 requests per hour? Or is there some hard limit on how far back I can go even with that?
The streaming API only streams from the current point on, I believe, so I don't think it is an option. This is a personal project, so I can't pay much for elevated access.

Yes, there is a limit to how far back you can go:
Clients may access a theoretical maximum of 3,200 statuses via the page and count parameters for the user_timeline REST API methods. Other timeline methods have a theoretical maximum of 800 statuses. Requests for more than the limit will result in a reply with a status code of 200 and an empty result in the format requested. Twitter still maintains a database of all the tweets sent by a user. However, to ensure performance of the site, this artificial limit is temporarily in place.
Source: http://dev.twitter.com/pages/every_developer
As you mentioned, you will need to go page by page, with at most 200 results per page, until you hit that limit and get the empty result set, making sure not to exceed 350 requests per hour. There might also be gaps depending on how many tweets the user has on their timeline.
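A minimal sketch of that page-by-page walk, in Python. The `fetch_page` callable is hypothetical: it stands in for whatever client call returns one page of the timeline, newest first, as a list of dicts with an `id` key. The defaults of 200 tweets per page and 350 requests are the figures quoted in this thread, not values checked against the current API.

```python
from typing import Callable, Dict, List, Optional


def fetch_all_pages(
    fetch_page: Callable[[Optional[int], int], List[Dict]],
    page_size: int = 200,      # per-request maximum mentioned in the thread
    max_requests: int = 350,   # hourly request budget mentioned in the thread
) -> List[Dict]:
    """Walk backwards through a timeline until an empty page comes back.

    fetch_page(max_id, count) is a hypothetical helper: it should return up
    to `count` tweets whose id is <= max_id (or the newest ones if max_id is
    None), ordered newest first.
    """
    tweets: List[Dict] = []
    max_id: Optional[int] = None
    for _ in range(max_requests):
        page = fetch_page(max_id, page_size)
        if not page:
            # An empty result means we have hit the history limit.
            break
        tweets.extend(page)
        # Next time, ask only for tweets strictly older than the oldest seen.
        max_id = min(tweet["id"] for tweet in page) - 1
    return tweets
```

The loop stops either on the first empty page or when the request budget is used up, so a caller who runs out of budget can resume later from the oldest ID collected so far.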

Related

Is there a way around twitter's rate limit? A pricing plan perhaps?

I am working on a project which requires getting tweet and user information from Twitter. I can't even test the current system because I keep hitting Twitter's rate limit. Is there any way around it?
Basic information that I am looking to extract from each status is:
Status text
User follower count
User following count
Retweet count
Geo location co-ordinates
I am using Twitter4J API to do this.
Any help will be appreciated. Thanks in advance.
EDIT
I am using Twitter's search API to get the list of tweets.
One option is to use a Twitter Data Reseller (e.g. GNIP) who can sell tweets.
Another option is to maximize your use of the API. Here are some tips:
Check the Rate Limit Status for each API you use to make sure you don't go over, and to see when the rate limit resets (currently every 15 minutes).
Look at the parameters to make sure you request a count of the maximum number of tweets for that API. For example, a count can default to 20, but you can set it to 200, depending on availability and limits on the specific endpoint. This potentially reduces the number of queries you have to make.
Page your results according to Twitter's Working with Timelines guidance. Use SinceID and MaxID to make sure you're only requesting new tweets. This cuts requests in two ways: you need fewer tweets, so you are more likely to stay within the maximum count, and you avoid queries for tweets you already have.
Essentially, you want to examine endpoint parameters with a perspective for how to decrease bandwidth and reduce the number of queries you have to make.
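The SinceID/MaxID tip can be sketched as follows, in Python rather than Twitter4J so that it runs without a client library. `fetch_page` is again a hypothetical stand-in for the real API call, and the tweet shape (a dict with an `id` key) is an assumption made for illustration.

```python
from typing import Callable, Dict, List, Optional


def fetch_new_tweets(
    fetch_page: Callable[[int, Optional[int], int], List[Dict]],
    since_id: int,
    page_size: int = 200,
) -> List[Dict]:
    """Collect only tweets newer than since_id, paging backwards with max_id.

    fetch_page(since_id, max_id, count) is a hypothetical helper returning up
    to `count` tweets with since_id < id <= max_id, newest first.
    """
    new_tweets: List[Dict] = []
    max_id: Optional[int] = None
    while True:
        page = fetch_page(since_id, max_id, page_size)
        if not page:
            break
        new_tweets.extend(page)
        # Move the upper bound just below the oldest tweet received so far.
        max_id = min(tweet["id"] for tweet in page) - 1
    return new_tweets


def newest_id(tweets: List[Dict], previous: int) -> int:
    """Return the since_id to store for the next run."""
    return max((tweet["id"] for tweet in tweets), default=previous)
```

Storing the value returned by `newest_id` between runs means the next run costs a single request when nothing new has been posted.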

How to query a Twitter timeline in parallel?

I am building a Twitter app and I'll be pulling a large amount of data from the user's timeline. For speed, I need to query the timeline in parallel. My aim is to pull 1000 of a user's tweets from the API, but the Twitter API caps the number of tweets per request at 200. Pagination works by specifying the last (oldest) tweet's ID from the previous request, so I need to know the result of the previous API call to make the next call. This method is not parallelizable. Is there any alternative method for getting the user timeline from the Twitter API where I can make parallel requests? (There is the page property, but it is deprecated and will be nonfunctional in the near future.)
What you have to remember is that Twitter have a difficult relationship with external developers. Using their API for anything interesting like this is simply not allowed by them.
What you need is access to the Firehose.
However, even if you're willing to pay a million dollars a year - Twitter aren't interested.
You could try getting it from a third party like Gnip but - again - likely to be expensive.
So, essentially, you can't. Twitter just aren't interested in amateur developers doing anything innovative with their platform. Sorry.

I have IDs from many tweets; how can I fetch their full information with the Twitter API without exceeding the rate limit?

I have IDs from many tweets, and I'd like to fetch their full information from Twitter in order to do some data analysis. The obvious API method (https://dev.twitter.com/docs/api/1/get/statuses/show/:id) appears to take only one ID at a time. This is a problem because the number of tweets we need to analyze is well more than the API limit of 350 calls per hour.
Thus: is there some way to get full information for a set of tweet IDs, not just one, or alternately to submit many REST calls in the same HTTP request and have it count only once against the API limit?
There's unfortunately no bulk lookup offered for Tweets. You'll need to perform requests one at a time and scope your project to cope with the rate limitations. If you have friends who would like to help you, you could potentially ask them to authorize your application and leverage their permission to gain access to more requests.
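To scope a project around that limit, it helps to pace the calls and estimate the total run time up front. The sketch below assumes the 350 calls per hour figure from the question; `fetch_status` is a hypothetical callable standing in for the single-tweet lookup, and `sleep` is injectable so the pacing can be exercised without actually waiting.

```python
import time
from typing import Callable, Dict, Iterable


def fetch_one_by_one(
    tweet_ids: Iterable[int],
    fetch_status: Callable[[int], Dict],
    calls_per_hour: int = 350,  # limit quoted in the question
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[int, Dict]:
    """Fetch tweets one at a time, spacing calls evenly across the hour."""
    interval = 3600.0 / calls_per_hour
    results: Dict[int, Dict] = {}
    for index, tweet_id in enumerate(tweet_ids):
        if index > 0:
            # Wait between calls so the hourly budget is never exceeded.
            sleep(interval)
        results[tweet_id] = fetch_status(tweet_id)
    return results


def hours_needed(n_ids: int, calls_per_hour: int = 350) -> float:
    """Rough lower bound on how long a full run takes."""
    return n_ids / calls_per_hour
```

Under these assumptions, 10,000 tweet IDs would take roughly 29 hours on a single authorization, which is why the suggestion of extra authorizations from friends matters for larger sets.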

Get all screen names from Twitter without hitting rate limit

I have successfully gotten a bunch of Twitter user_ids using the Twitter API resource "GET friends/ids".
I need to be able to get all screen names from Twitter without hitting the rate limit. I know I can use "GET friendships/show" to get the "screen_name", but to do that, I would have to loop through all of the user_ids, each one being a request, thereby potentially hitting the rate limit.
Does anyone know of a way to send an array of user_ids, in one request? Or... any other ideas or methods?
You can use the users/lookup API. I believe you can send a list of up to 100 user IDs in one request, and it will send back a fair bit of info about each one, including their last tweet. As it says in the docs:
It's also well suited for use in tandem with friends/ids and followers/ids.
I think this will solve your problem.
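A short sketch of the batching this implies: split the IDs from friends/ids into groups of at most 100 and build one comma-separated parameter value per group. The batch size of 100 is the figure given above, and the helper names are made up for illustration; the actual HTTP call is left out.

```python
from typing import Iterator, List, Sequence


def batches(user_ids: Sequence[int], size: int = 100) -> Iterator[List[int]]:
    """Yield consecutive slices of at most `size` user IDs."""
    for start in range(0, len(user_ids), size):
        yield list(user_ids[start:start + size])


def lookup_param(batch: Sequence[int]) -> str:
    """Build the comma-separated user_id value for one lookup request."""
    return ",".join(str(user_id) for user_id in batch)
```

With this, 250 IDs cost three requests instead of 250, for example `[lookup_param(b) for b in batches(ids)]` produces the three parameter strings to send.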

How to overcome the Twitter API rate limit?

I am writing a small app that builds stats for Twitter users (number of tweets, friends, etc.). I am using this API:
http://api.twitter.com/1/users/show.json?user_id=12345
I can only make 150 calls per hour, which is very very small, given the size of twitter. How do companies that rely on Twitter's API manage to overcome this rate limit?
The 150 API calls is per user per application. Larger companies likely broker deals with Twitter.
You need to get whitelisted to get a far higher rate limit. They are open to all sorts of developers, as long as you give a good reason for what you are developing:
http://twitter.com/help/request_whitelisting
You will easily get whitelisted, just apply. They will accept more or less any reasonable application, but just don't want to leave it 'wide open'. If they don't accept you, and you still want to get your hands on the data, just scrape it.
