I have IDs from many tweets; how can I fetch their full information with the Twitter API without exceeding the rate limit? - twitter

I have IDs from many tweets, and I'd like to fetch their full information from Twitter in order to do some data analysis. The obvious API method (https://dev.twitter.com/docs/api/1/get/statuses/show/:id) appears to take only one ID at a time. This is a problem because the number of tweets we need to analyze is well more than the API limit of 350 calls per hour.
Thus: is there some way to get full information for a set of tweet IDs, not just one, or alternately to submit many REST calls in the same HTTP request and have it count only once against the API limit?

There's unfortunately no bulk lookup offered for Tweets. You'll need to perform requests one at a time and scope your project to cope with the rate limitations. If you have friends who would like to help you, you could potentially ask them to authorize your application and leverage their permission to gain access to more requests.
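If you do end up making one request per ID, spacing the calls evenly keeps you under the hourly cap. Below is a minimal sketch of that idea, not a definitive implementation: `fetch_one` is a hypothetical callable standing in for whatever client code performs the single-tweet lookup, and the 350-per-hour default is simply the limit quoted in the question.

```python
import time


def fetch_all(ids, fetch_one, max_per_hour=350, sleep=time.sleep):
    """Fetch tweets one ID at a time, pausing between calls.

    fetch_one is a hypothetical function (an assumption, not a Twitter API
    call) that takes a tweet ID and returns that tweet's data.
    """
    interval = 3600.0 / max_per_hour  # seconds to wait between requests
    results = {}
    for index, tweet_id in enumerate(ids):
        if index > 0:
            sleep(interval)  # no pause is needed before the first request
        results[tweet_id] = fetch_one(tweet_id)
    return results
```

Passing `sleep` as a parameter is only a convenience so the pacing can be checked without actually waiting.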

Related

How to surpass rate limiting in Twitter?

I am trying to extract data from Twitter. The data includes the tweets and the people who retweeted a particular tweet. I have 46,000 tweets and I need to find the retweeters for each tweet. However, the Twitter call retweet/id accepts only one ID at a time and is limited to 15 requests per 15 minutes.
Is there any way to surpass this limit and make unlimited calls?
Not through the REST API, no.
You may want to investigate Twitter's Streaming API to see if the functionality it provides meets your needs. Accessing it is a little more complex than the REST API, but it may be able to help you meet your needs.
You will find people who will tell you to do things like set up dummy accounts and dummy applications. Don't do this. Twitter actively monitors the API for use patterns like this and you will find your applications and IP addresses blacklisted.
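To put the limit in perspective, here is a rough estimate of how long the job in the question would take if you stay within it. This is only a back-of-the-envelope sketch: the function names are made up for illustration, and it assumes every 15-minute window is used in full.

```python
import math


def windows_needed(total_calls, calls_per_window):
    """Number of rate-limit windows needed to make total_calls requests."""
    return math.ceil(total_calls / calls_per_window)


def days_needed(total_calls, calls_per_window=15, window_minutes=15):
    """Upper-bound estimate of elapsed days, counting each window in full."""
    minutes = windows_needed(total_calls, calls_per_window) * window_minutes
    return minutes / (60 * 24)


if __name__ == "__main__":
    # 46,000 tweets at 15 calls per 15 minutes
    print(windows_needed(46000, 15), "windows")
    print(round(days_needed(46000), 1), "days")
```

For 46,000 tweets this works out to 3,067 windows, or roughly 32 days of continuous running.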

Is there a way around twitter's rate limit? A pricing plan perhaps?

I am working on a project which requires getting tweet and user information from Twitter. I can't even test the current system because I keep hitting Twitter's rate limit. Is there any way around it?
Basic information that I am looking to extract from each status is:
Status text
User follower count
User following count
Retweet count
Geo location co-ordinates
I am using Twitter4J API to do this.
Any help will be appreciated. Thanks in advance.
EDIT
I am using Twitter's Search API to get a list of tweets.
One option is to use a Twitter Data Reseller (e.g. GNIP) who can sell tweets.
Another option is to maximize your use of the API. Here are some tips:
Check the Rate Limit Status for each API you use, so that you know how many calls remain and when the rate limit resets (currently every 15 minutes).
Look at the parameters to make sure you request the maximum number of tweets for that API. For example, a count can default to 20, but you can set it to 200, depending on availability and limits on the specific endpoint. This potentially reduces the number of queries you have to make.
Page your results according to Twitter's Working with Timelines guidance. Use SinceID and MaxID to make sure you're only requesting new tweets. This cuts requests in two ways: you need fewer tweets overall, so you are more likely to stay within the maximum count, and you avoid querying for tweets you already have.
Essentially, you want to examine endpoint parameters with a perspective for how to decrease bandwidth and reduce the number of queries you have to make.
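The SinceID/MaxID paging described in the tips can be sketched roughly as follows. This is an illustration under assumptions, not Twitter4J or official client code: `fetch_page` is a hypothetical callable that returns a list of tweet dictionaries, newest first, each with an `"id"` key, and it is assumed to honour `since_id` (exclusive lower bound), `max_id` (inclusive upper bound) and `count`.

```python
def fetch_new_tweets(fetch_page, since_id=None, count=200):
    """Collect tweets newer than since_id, walking backwards page by page.

    fetch_page is a hypothetical function (an assumption) taking keyword
    arguments since_id, max_id and count, and returning a list of dicts
    that each carry an integer "id".
    """
    collected = []
    max_id = None  # no upper bound on the first request
    while True:
        page = fetch_page(since_id=since_id, max_id=max_id, count=count)
        if not page:
            break  # nothing older remains above since_id
        collected.extend(page)
        # Ask next time only for tweets strictly older than the oldest seen,
        # so that no tweet is fetched twice.
        max_id = min(tweet["id"] for tweet in page) - 1
    return collected
```

On the next run, passing the highest ID you already stored as `since_id` means only tweets you do not yet have are requested.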

How to query a Twitter timeline in parallel?

I am building a Twitter app and I'll be pulling a large amount of data from the user's timeline. For speed, I need to query the timeline in parallel. My aim is to pull 1000 of a user's tweets from the API, but the Twitter API caps the number of tweets per request at 200. Pagination works by specifying the last (oldest) tweet's ID from the previous request, so I need to know the result of the previous API call to make the next call. This method is not parallelizable. Is there any alternative method for getting the user timeline from the Twitter API where I can make parallel requests? (There is the page property, but it is deprecated and will be nonfunctional in the near future.)
What you have to remember is that Twitter have a difficult relationship with external developers. Using their API for anything interesting like this is simply not allowed by them.
What you need is access to the Firehose.
However, even if you're willing to pay a million dollars a year, Twitter aren't interested.
You could try getting it from a third party like Gnip but, again, that is likely to be expensive.
So, essentially, you can't. Twitter just aren't interested in amateur developers doing anything innovative with their platform. Sorry.

How do I get the follower count for thousands of Twitter users per hour, without getting rate limited?

I know that I can use "users/show" and get "followers_count" or I can do "followers/ids" and count the number of IDs returned, but both of these methods are rate-limited at 150 requests per hour when anonymous and 350 when signed with OAuth.
The program I'm writing uses the Twitter Search API to look for all mentions of a hashtag. I'm using the Search API and not the Streaming API because I need to look for historical tweets, not just real time.
When I find a tweet that contains the hashtag, I want to save the user's handle, tweet ID, time of tweet, and the number of followers that user has. Since the number of followers per user isn't returned with the Search API, I need to use another API call for that. That extra call is what's causing me trouble.
Are there any more efficient ways to get the number of followers for more than 350 users per hour? (There are a TON of tweets coming in...)
Your only option is GET users/lookup, which supports fetching up to 100 user objects in a single request. Authentication is required, so at 350 requests per hour you will be allowed 35,000 user objects per hour. If that still isn't enough, you should look into queueing the requests.
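Batching the user IDs into groups of 100 might look something like the sketch below. It makes no real API calls: `lookup_users` is a hypothetical callable standing in for your GET users/lookup request, assumed to return a list of user dictionaries with `"id"` and `"followers_count"` keys.

```python
def chunked(items, size=100):
    """Yield consecutive slices of items, each at most size long."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def follower_counts(user_ids, lookup_users):
    """Map each user ID to its follower count, 100 users per request.

    lookup_users is a hypothetical function (an assumption) that takes a
    list of up to 100 user IDs and returns one dict per user.
    """
    counts = {}
    for batch in chunked(list(user_ids), 100):
        for user in lookup_users(batch):
            counts[user["id"]] = user["followers_count"]
    return counts
```

With this shape, 250 users cost three requests instead of 250, which is where the 35,000 users per hour figure comes from (350 requests of 100 users each).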

fetching all past tweets from users home timeline on twitter

I'm trying to write a program that will retrieve all of the tweets the user has seen on their Twitter home timeline (i.e., from people they're following, as they would see at twitter.com). I realise this is a lot of data, and the REST API has limitations.
What would be the best way to do this? Slowly retrieve the last 200 tweets (or whatever the limit is) at a time, keeping in mind the 350 requests per hour limit? Or is there some hard limit to how far back I can go even with that?
The streaming API only streams from current point on I believe, so I don't think this is an option. This is a personal project so I can't pay very much for any elevated access or anything.
Yes, there is a limit to how far back you can go:
Clients may access a theoretical maximum of 3,200 statuses via the page and count parameters for the user_timeline REST API methods. Other timeline methods have a theoretical maximum of 800 statuses. Requests for more than the limit will result in a reply with a status code of 200 and an empty result in the format requested. Twitter still maintains a database of all the tweets sent by a user. However, to ensure performance of the site, this artificial limit is temporarily in place.
Source: http://dev.twitter.com/pages/every_developer
As you mentioned, you will need to go page by page, at most 200 results at a time, until you hit that limit and get the empty result set, making sure not to exceed 350 requests per hour. There might also be gaps depending on how many tweets the user has on their timeline.
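The page-by-page walk could be sketched like this. It is only an outline under assumptions: `fetch_page` is a hypothetical callable wrapping your timeline request with the page and count parameters mentioned in the quote, and `max_requests` is a budget you choose to stay under the hourly limit.

```python
def walk_timeline(fetch_page, count=200, max_requests=350):
    """Read a timeline page by page until an empty page comes back.

    fetch_page is a hypothetical function (an assumption) taking keyword
    arguments page and count and returning a list of statuses.
    """
    statuses = []
    page = 1
    while page <= max_requests:  # stop once the request budget is spent
        batch = fetch_page(page=page, count=count)
        if not batch:
            break  # the empty result marks the end of what can be read
        statuses.extend(batch)
        page += 1
    return statuses
```

At 200 statuses per page, a 3,200-status cap is reached after 16 requests plus one more that returns the empty result, and an 800-status cap after 4 plus one, so either fits comfortably inside 350 requests per hour.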
