How to retrieve a large number of tweets - twitter

I'm trying to retrieve a large number of tweets, around 1000 to 3000 per minute. Right now I'm using Twitter's public API with the URL:
https://api.twitter.com/1.1/search/tweets.json?q=#whatever+OR+#whatever&since=15-05-2015&count=100&result_type=recent
The problem is that each request returns at most 100 tweets and Twitter only allows 180 requests every 15 minutes, and I always need more than that. Is there a Twitter API that does what the public API does but can return more than 100 tweets per request?

The short answer is that there is no legitimate way to get around the rate limits of the REST API.
You can request "whitelisting" for your application, but Twitter doesn't do it very often and not for average users.
The Streaming API might provide the specific functionality that you're looking for.
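For a sense of scale, the figures in the question put a ceiling on what the Search API can deliver. A rough sketch of the arithmetic (the function name is only for illustration, and it assumes every request returns a full page of 100 tweets, which is the best case):

```python
def max_tweets_per_minute(requests_per_window: int,
                          tweets_per_request: int,
                          window_minutes: int) -> float:
    """Upper bound on tweets per minute if every request returns a full page."""
    return requests_per_window * tweets_per_request / window_minutes


# Figures quoted in the question: 180 requests per 15 minutes, count=100.
ceiling = max_tweets_per_minute(180, 100, 15)
print(ceiling)  # 1200.0
```

So the low end of the 1000 to 3000 per minute target is only reachable in the best case, and the high end is not reachable through the Search API at all.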

Related

youtube search by keyword internal working

I need a clarification on this:
According to the docs, "By default, a search result set identifies matching video, channel, and playlist resources". How does this matching take place? Does it also search comments? Any ideas?
Thanks!
The YouTube API operates on the same rate limits as the other Google APIs.
There are project-based limits and user-based limits.
You can see the limits in the Google Developer Console.
My project can make a maximum of 1,800,000 requests per minute.
It also has a quota cost limit of 10,000, which is not really what it sounds like.
Each user can then make a maximum of 180,000 requests per minute.
This is not related to the amount of data a user has on their account. It's strictly related to the number of requests, or the cost of the requests, that your application or a user can make over a period of time.
You can request additional daily quota beyond the development 10k if you want. Just submit the form in the Google Cloud Console.
I am not aware of increased abilities with YouTube APIs for big YouTube channels.

Twitter API Limits when using statuses/lookup endpoint

Simple question (I was not able to find the answer in the Twitter API docs).
The following GET request
https://api.twitter.com/1.1/statuses/lookup.json?include_entities=true&id=657208379442597888%2C657215510283730944
is a request to get 2 tweets by their IDs.
For the purposes of the Twitter API limits, does executing this request count as 1 call or 2?
Regards,
It should be counted as a single request. You can ask for up to 100 tweets per call while the rate limit for app auth is only 60, so counting each tweet separately would make no sense.
However, you can prove that by just checking the rate limit response headers. If you make the same request twice (and no other app is making requests on the user's behalf, or if you're the only one using the app's auth), you should see the X-Rate-Limit-Remaining header only decrease by one.
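A minimal sketch of that check, assuming you already have the response headers of two consecutive calls as plain mappings. The helper names and the header values below are made up for illustration; only the header name comes from the answer, and it is matched case-insensitively because HTTP header names are not case-sensitive.

```python
from typing import Mapping, Optional


def remaining_calls(headers: Mapping[str, str]) -> Optional[int]:
    """Return the remaining-call count from response headers, or None if absent."""
    for name, value in headers.items():
        if name.lower() == "x-rate-limit-remaining":
            return int(value)
    return None


def calls_consumed(before: Mapping[str, str],
                   after: Mapping[str, str]) -> Optional[int]:
    """How far the remaining count dropped between two responses."""
    first = remaining_calls(before)
    second = remaining_calls(after)
    if first is None or second is None:
        return None
    return first - second


# Made-up header values for two consecutive responses.
first_response = {"X-Rate-Limit-Remaining": "59"}
second_response = {"x-rate-limit-remaining": "58"}
print(calls_consumed(first_response, second_response))  # 1
```

If the rate limit window resets between the two calls, the difference will be negative, so repeat the check inside a single window.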

Twitter strategy: Streaming API vs. REST API

I'm working on a kind of a twitter wall. Users can login with twitter and create their own wall, which will display the tweets for certain terms/hashtags.
I'm still looking for the best strategy to get the data out of the Twitter APIs.
Here are some of my thoughts:
Strategy 1: Streaming API
Open a single stream (POST statuses/filter) for all walls
Each hashtag is added to the track parameter
When new tweets arrive, they will be processed and sent to the corresponding wall
("one account, one application, one open connection" cf. https://dev.twitter.com/discussions/14935)
Problems with the Streaming API
The Streaming API is limited to 400 keywords to track
What should I do if there are more than 400 keywords to track?
The Streaming API is limited to 1% of the tweets of the firehose
It's very difficult to get above 1% of the firehose, but if you're tracking a term like "apple" it'd be pretty easy to exceed the 1%. (cf. https://dev.twitter.com/discussions/6349)
How can I handle such popular terms? Blacklist them?
Strategy 2: REST Search API
Store user access tokens
Poll the Search API (GET search/tweets) on behalf of the user, respecting the rate limit of 180 queries per 15 minutes
(cf. https://dev.twitter.com/discussions/11141)
Problems with the REST Search API
Polling
Could get very expensive to poll the API for a lot of users.
Do you have any suggestions/recommendations which strategy would fit the best? Are there already solutions for these problems?
Best regards
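The routing step of Strategy 1 could look roughly like the sketch below. Everything in it is hypothetical: the wall names, the helper functions, and especially the matching rule, which is a plain case-insensitive substring test and not how Twitter itself matches track terms. The 400-term ceiling is the figure quoted in the question.

```python
from typing import Dict, List, Set

MAX_TRACK_TERMS = 400  # limit quoted in the question


def build_track_list(walls: Dict[str, Set[str]]) -> List[str]:
    """Merge the terms of all walls into one de-duplicated track list."""
    terms = sorted({term.lower() for wall_terms in walls.values() for term in wall_terms})
    if len(terms) > MAX_TRACK_TERMS:
        raise ValueError(f"{len(terms)} terms exceed the {MAX_TRACK_TERMS}-term limit")
    return terms


def walls_for_tweet(text: str, walls: Dict[str, Set[str]]) -> List[str]:
    """Return the walls whose terms appear in the tweet text (simplified match)."""
    lowered = text.lower()
    return sorted(
        wall for wall, wall_terms in walls.items()
        if any(term.lower() in lowered for term in wall_terms)
    )


walls = {
    "wall-a": {"#python", "#django"},
    "wall-b": {"#python", "#rust"},
}
print(build_track_list(walls))                        # ['#django', '#python', '#rust']
print(walls_for_tweet("Learning #Rust today", walls))  # ['wall-b']
```

De-duplicating terms across walls keeps the shared stream further from the keyword limit, since popular hashtags tend to be tracked by many walls at once.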

Understanding POST statuses/filter Rate Limit

I need to do a keyword based data fetching on Twitter. I looked up the documentation and "POST statuses/filter" seemed like the best option. However, I do not understand how the rate limiting works. Does this mean that I can fire this request repeatedly? If yes, at what rate should I do so? Or do I have to fire the request only once and keep on getting data continuously? They have given clear explanations for the REST API. There's even a page showing the number of requests permissible in a 15 minute window for each REST API method. I was unable to find something similar for "POST statuses/filter".
From what I've found while researching the Streaming API, there aren't any rate limits of that kind, because you make the request only once to open the connection, then you keep it open and are sent a stream (hence the name) of tweets.
Once applications establish a connection to a streaming endpoint, they
are delivered a feed of Tweets, without needing to worry about polling
or REST API rate limits.
https://dev.twitter.com/docs/streaming-apis/streams/public
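A sketch of the consuming side, to show the "request once, then keep reading" shape. It assumes the stream delivers one JSON object per line with blank lines in between as keep-alives, and it fakes the connection with a list of strings; in a real client the lines would come from the single long-lived HTTP response. The function name is illustrative only.

```python
import json
from typing import Iterable, Iterator


def iter_tweets(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one parsed tweet per non-empty line of the stream."""
    for line in lines:
        line = line.strip()
        if not line:
            continue  # blank keep-alive line, nothing to parse
        yield json.loads(line)


# Stand-in for an open streaming connection.
fake_stream = [
    '{"id": 1, "text": "first"}\r\n',
    '\r\n',
    '{"id": 2, "text": "second"}\r\n',
]

for tweet in iter_tweets(fake_stream):
    print(tweet["id"], tweet["text"])
```

There is no loop that re-sends the request: the only loop is over the lines arriving on the connection that is already open.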

getting all tweets of a twitter user, rate limit problem

I've been trying to get all tweets of a public (unlocked) Twitter user.
I'm using the REST API:
http://api.twitter.com/1/statuses/user_timeline.json?screen_name=andy_murray&count=200&page=1
I go over the 16 pages (page param) it allows, which gets me 3200 tweets, and that is OK.
BUT then I discovered that the rate limit for such calls is 150 per hour(!!!), which means fewer than 10 user queries in an hour (16 pages each). (350 are allowed if you authenticate, which is still a very low number.)
Any ideas on how to solve this? The streaming and search APIs don't seem appropriate(?), and there are some web services out there that do seem to have this data.
Thanks
You can either queue up the requests and make them as the rate limit allows, or you can make authenticated requests as multiple users. Each user has 350 requests/hour.
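One simple way to "make them as the rate limit allows" is to space requests evenly across the window. The class below is only a sketch of that idea, not anything Twitter provides; the `Throttle` name and its parameters are made up, and the clock and sleep functions are injectable so the behaviour can be checked without actually waiting.

```python
import time
from typing import Callable, Optional


class Throttle:
    """Space calls evenly so that at most max_requests happen per window."""

    def __init__(self, max_requests: int, per_seconds: float,
                 clock: Callable[[], float] = time.monotonic,
                 sleep: Callable[[float], None] = time.sleep) -> None:
        self.interval = per_seconds / max_requests
        self._clock = clock
        self._sleep = sleep
        self._next_allowed: Optional[float] = None

    def wait(self) -> float:
        """Block until the next request is allowed; return the seconds slept."""
        now = self._clock()
        delay = 0.0
        if self._next_allowed is not None and now < self._next_allowed:
            delay = self._next_allowed - now
            self._sleep(delay)
            now = self._next_allowed
        self._next_allowed = now + self.interval
        return delay


# 350 authenticated requests per hour, as stated in the answer.
throttle = Throttle(max_requests=350, per_seconds=3600)
print(round(throttle.interval, 2))  # 10.29
```

Calling `throttle.wait()` before each queued request keeps the harvester at roughly one request every ten seconds, which stays inside the hourly limit.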
One approach would be to use the streaming API (or perhaps the more specific user streams, if that's better suited to your application) to start collecting all tweets as they occur from your target user(s) without having to bother with the traditional rate limits, and then use the REST API to backfill those users' historical tweets.
Granted, you only have 350 authenticated requests per hour, but if you run your harvester around the clock, that's still 1,680,000 tweets per day (350 requests/hour * 24 hours/day * 200 tweets/request).
So, for example, if you decided to pull 1,000 tweets per user per day (5 API calls * 200 tweets per call), you could run through 1,680 user timelines per day (70 timelines per hour). Then, on the next day, begin where you left off by harvesting the next 1,000 tweets using the oldest status ID per user as the max_id parameter in your statuses/user_timeline request.
The streaming API will keep you abreast of any new statuses your target users tweet, and the REST API calls will pretty quickly, in about four days, start running into Twitter's fetch limit for those users' historical tweets. After that, you can add additional users to fetch going forward from the streaming endpoint by adding them to the follow list, and you can stop fetching historical tweets for those users that have maxed out, and start fetching a new target group's tweets.
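The max_id paging described above can be sketched as follows. The `fetch_page` callable is a hypothetical stand-in for a statuses/user_timeline request, and the fake implementation below only simulates a user with ten tweets. Because max_id is inclusive, the sketch passes the oldest ID minus one so the boundary tweet is not fetched twice.

```python
from typing import Callable, Dict, List, Optional

Page = List[Dict]


def backfill(fetch_page: Callable[[Optional[int]], Page], max_calls: int) -> Page:
    """Walk backwards through a timeline, one page per call, up to max_calls."""
    collected: Page = []
    max_id: Optional[int] = None
    for _ in range(max_calls):
        page = fetch_page(max_id)
        if not page:
            break  # no older tweets available
        collected.extend(page)
        # max_id is inclusive, so step just below the oldest tweet seen.
        max_id = min(tweet["id"] for tweet in page) - 1
    return collected


def fake_fetch(max_id: Optional[int]) -> Page:
    """Simulated timeline with IDs 10..1, returning up to 4 tweets per call."""
    ids = [i for i in range(10, 0, -1) if max_id is None or i <= max_id]
    return [{"id": i} for i in ids[:4]]


print([tweet["id"] for tweet in backfill(fake_fetch, max_calls=5)])
# [10, 9, 8, 7, 6, 5, 4, 3, 2, 1]
```

With `max_calls=5` and real pages of 200 tweets, one run corresponds to the 1,000 tweets per user per day in the example; storing the last computed max_id per user lets the next day's run continue from there.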
The Search API would seem to be appropriate for your needs, since you can search on screen name. The Search API rate limit is higher than the REST API rate limit.

Resources