Twitter strategy: Streaming API vs. REST API

I'm working on a kind of Twitter wall. Users can log in with Twitter and create their own wall, which will display the tweets for certain terms/hashtags.
I'm still looking for the best strategy to get the data out of the Twitter APIs.
Here are some of my thoughts:
Strategy 1: Streaming API
Open a single stream (POST statuses/filter) for all walls
Each hashtag is added to the track parameter
When new tweets arrive, they will be processed and sent to the corresponding wall
("one account, one application, one open connection" cf. https://dev.twitter.com/discussions/14935)
Problems with the Streaming API
The Streaming API is limited to 400 keywords to track
What to do if there are more than 400 keywords to track?
The Streaming API is limited to 1% of the tweets in the firehose
It's very difficult to get above 1% of the firehose, but if you're tracking a term like "apple" it'd be pretty easy to exceed the 1%. (cf. https://dev.twitter.com/discussions/6349)
How can I handle such popular terms? Blacklist them?
Strategy 2: REST Search API
Store user access tokens
Poll the Search API (GET search/tweets) on behalf of the user, respecting the rate limit of 180 queries per 15 minutes
(cf. https://dev.twitter.com/discussions/11141)
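A minimal sketch of that per-user polling (same libraries as above; the consumer credentials and token handling are placeholders) could look something like this:

    import requests
    from requests_oauthlib import OAuth1

    SEARCH_URL = "https://api.twitter.com/1.1/search/tweets.json"

    def poll_wall(user_token, user_secret, query, since_id=None):
        # sign the request with the wall owner's stored access token
        auth = OAuth1("CONSUMER_KEY", "CONSUMER_SECRET", user_token, user_secret)
        params = {"q": query, "count": 100}
        if since_id:
            params["since_id"] = since_id  # only tweets newer than the last poll
        resp = requests.get(SEARCH_URL, auth=auth, params=params)
        resp.raise_for_status()
        return resp.json()["statuses"]

    # 180 requests per 15 minutes per token means at most one poll
    # roughly every 5 seconds for a single wall's query.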
Problems with the REST Search API
Polling
It could get very expensive to poll the API for a lot of users.
Do you have any suggestions/recommendations on which strategy would fit best? Are there already solutions for these problems?
Best regards

Related

How do the userQuota limitations work on the YouTube Data API V3?

I'm building an alternative client for browsing YouTube subscriptions (folder-based subscriptions with a corresponding generated feed), and I'm making a lot of requests to YouTube to aggregate that data.
I'm caching a lot of requests, since data fetched on any previous day does not need to be refreshed.
The fact is, current-day refreshes are consuming a lot, and I reach my quota pretty fast even though those requests are read-only.
I submitted the YouTube quota increase request form, but I'm still quite worried.
Am I missing something with the userIp & quotaUser parameters?
Shouldn't those requests - as they are pretty much the same as what a normal user would make on the regular YouTube client - be counted as "Queries per 100 seconds per user"?
My main quota, "Queries per day", currently seems to absorb ALL the requests coming from my app, even though I added the quotaUser parameter on all requests made by a user on the frontend.
I think I am missing something, as my app should not be considered "data consuming": it sends almost nothing to YouTube in terms of data and only reads data that is also available on the main YouTube client, just not in the same format.
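For reference, this is roughly how I pass quotaUser on one of those read-only calls (a simplified Python sketch using the requests library; the API key, channel id and user id are placeholders):

    import requests

    API_KEY = "YOUR_API_KEY"  # placeholder

    def get_uploads_playlist(channel_id, app_user_id):
        # channels.list is a read-only call; quotaUser tags it with my app's user id,
        # which should only matter for the per-user limits, not the daily quota
        resp = requests.get(
            "https://www.googleapis.com/youtube/v3/channels",
            params={
                "part": "contentDetails",
                "id": channel_id,
                "key": API_KEY,
                "quotaUser": app_user_id,
            },
        )
        resp.raise_for_status()
        items = resp.json()["items"]
        return items[0]["contentDetails"]["relatedPlaylists"]["uploads"]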
Thanks for your help.

Handle status 429 in Rails API

I built a Twitter clone using a Rails API + React, just for study purposes.
The request logic is quite simple: click on a user, then load their information and tweets by calling the API. However, if I do this quickly, say 3 times, I receive status 429 (Too Many Requests) with the header Retry-After: 5.
Is there a way to increase the number of requests allowed in a given time? What would be the correct approach to handling this in such a common situation?
From my understanding, the error information you have shown is correct. It means the request cannot be served because the application's rate limit has been exhausted for that resource.
Rate limits are divided into 15 minute intervals. All endpoints require authentication, so there is no concept of unauthenticated calls and rate limits.
To overcome this situation, here is an example from the documentation itself.
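The documentation's snippet isn't reproduced here, but as a rough, language-agnostic illustration of the idea (sketched in Python with the requests library), the client simply waits for the Retry-After interval before retrying:

    import time
    import requests

    def get_with_retry(url, max_attempts=5, **kwargs):
        for _ in range(max_attempts):
            resp = requests.get(url, **kwargs)
            if resp.status_code != 429:
                return resp
            # honour the server's Retry-After header, defaulting to 5 seconds
            time.sleep(int(resp.headers.get("Retry-After", 5)))
        return resp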

How to retrieve a big amount of tweets

I'm trying to retrieve a large number of tweets, like 1000 or 3000 per minute. Right now I'm using the public Twitter API with the URL:
https://api.twitter.com/1.1/search/tweets.json?q=#whatever+OR+#whatever&since=15-05-2015&count=100&result_type=recent
The problem is that I need more than 100 tweets per request, because Twitter only allows 180 requests every 15 minutes and I always need more than that. My question is whether there is a Twitter API that does what the public API does but can retrieve more than 100 tweets per request.
The short answer is that there is no legitimate way to get around the rate limits of the REST API.
You can request "whitelisting" for your application, but Twitter doesn't do it very often and not for average users.
The Streaming API might provide the specific functionality that you're looking for.

Understanding POST statuses/filter Rate Limit

I need to do a keyword based data fetching on Twitter. I looked up the documentation and "POST statuses/filter" seemed like the best option. However, I do not understand how the rate limiting works. Does this mean that I can fire this request repeatedly? If yes, at what rate should I do so? Or do I have to fire the request only once and keep on getting data continuously? They have given clear explanations for the REST API. There's even a page showing the number of requests permissible in a 15 minute window for each REST API method. I was unable to find something similar for "POST statuses/filter".
From what I've been researching about the Streaming API, there aren't any rate limits because you just make the request once to open the connection; then you keep it open and you are sent a stream (hence the name) of tweets.
Once applications establish a connection to a streaming endpoint, they are delivered a feed of Tweets, without needing to worry about polling or REST API rate limits.
https://dev.twitter.com/docs/streaming-apis/streams/public

getting all tweets of a twitter user, rate limit problem

I've been trying to get all tweets of some public (unlocked) Twitter user.
I'm using the REST API:
http://api.twitter.com/1/statuses/user_timeline.json?screen_name=andy_murray&count=200&page=1
I go over the 16 pages (page param) it allows, thus getting 3200 tweets, which is OK.
BUT then I discovered the rate limit for such calls is 150 per hour(!!!), meaning fewer than 10 user queries per hour (16 pages each). (350 are allowed if you authenticate, which is still a very low number.)
Any ideas on how to solve this? The streaming/search APIs don't seem appropriate(?), and there are some web services out there that do seem to have this data.
Thanks
You can either queue up the requests and make them as the rate limit allows, or you can make authenticated requests as multiple users. Each user has 350 requests/hour.
One approach would be to use the streaming API (or perhaps the more specific user streams, if that's better suited to your application) to start collecting all tweets as they occur from your target user(s) without having to bother with the traditional rate limits, and then use the REST API to backfill those users' historical tweets.
Granted, you only have 350 authenticated requests per hour, but if you run your harvester around the clock, that's still 1,680,000 tweets per day (350 requests/hour * 24 hours/day * 200 tweets/request).
So, for example, if you decided to pull 1,000 tweets per user per day (5 API calls at 200 tweets per call), you could run through 1,680 user timelines per day (70 timelines per hour). Then, on the next day, begin where you left off by harvesting the next 1,000 tweets using the oldest status ID per user as the max_id parameter in your statuses/user_timeline request.
The streaming API will keep you abreast of any new statuses your target users tweet, and the REST API calls will pretty quickly, in about four days, start running into Twitter's fetch limit for those users' historical tweets. After that, you can add additional users to fetch going forward from the streaming endpoint by adding them to the follow list, and you can stop fetching historical tweets for those users that have maxed out, and start fetching a new target group's tweets.
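A rough sketch of that backfill step (Python with requests/requests_oauthlib against the 1.1 user_timeline endpoint; credentials and page count are placeholders) might look like this:

    import requests
    from requests_oauthlib import OAuth1

    TIMELINE_URL = "https://api.twitter.com/1.1/statuses/user_timeline.json"
    auth = OAuth1("CONSUMER_KEY", "CONSUMER_SECRET", "ACCESS_TOKEN", "ACCESS_SECRET")

    def backfill(screen_name, pages=5, max_id=None):
        # fetch up to pages * 200 historical tweets, paging back with max_id
        tweets = []
        for _ in range(pages):
            params = {"screen_name": screen_name, "count": 200}
            if max_id:
                params["max_id"] = max_id - 1  # don't re-fetch the boundary tweet
            batch = requests.get(TIMELINE_URL, auth=auth, params=params).json()
            if not batch:
                break
            tweets.extend(batch)
            max_id = batch[-1]["id"]  # oldest status id seen so far
        return tweets, max_id  # persist max_id and resume from it the next day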
The Search API would seem to be appropriate for your needs, since you can search on screen name. The Search API rate limit is higher than the REST API rate limit.
