Twitter gem: how to avoid searching deeply with max_id? - ruby-on-rails

I want my app to search tweets with a specific #tag on Twitter every few minutes, like this:
results = client.search("#mypopulartag")
However, I don't want to do a full search each time. While building the app, I've hit the Twitter::Error::TooManyRequests error because the search returns a lot of results (presumably the Twitter gem makes as many requests to Twitter as it needs for a single client.search() call).
I don't need it to search super deep each time. Can I pass in the max_id parameter to the client.search method, so I don't waste API calls?

Yes, if you keep track of the id of the most recent tweet that you've processed, you can get all tweets since then with something like this (using gem version 5.13):
client.search(
  "#mypopulartag",
  result_type: 'recent',
  since_id: since_id # your last processed id
).take(15)
Just keep in mind that if there are, say, 60 results, you'll need to execute more client.search calls to get all the tweets. For those calls, you'd also want to specify a max_id based on the oldest (lowest) tweet id processed in the current search. Because max_id is inclusive, pass that id minus 1 to avoid fetching the same tweet twice.
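The follow-up calls described above could be organized roughly like the sketch below. The names collect_new_tweets and fetch_page are hypothetical and not part of the twitter gem; fetch_page stands for any callable that wraps your own client.search call and returns one page of tweets, newest first.

```ruby
# Hypothetical helper: gathers tweets newer than since_id, one page per call.
# fetch_page is any callable taking since_id: and max_id: and returning an
# array of objects that respond to #id (newest first). With the twitter gem
# it might wrap something like:
#   ->(since_id:, max_id:) {
#     opts = { result_type: 'recent', since_id: since_id }
#     opts[:max_id] = max_id if max_id
#     client.search("#mypopulartag", opts).take(15)
#   }
def collect_new_tweets(fetch_page, since_id:, max_pages: 4)
  collected = []
  max_id = nil # no upper bound on the first call

  max_pages.times do
    page = fetch_page.call(since_id: since_id, max_id: max_id)
    break if page.empty?

    collected.concat(page)
    # Step just below the oldest id seen so far so the next page
    # does not repeat it.
    max_id = page.map(&:id).min - 1
  end

  collected
end
```

The max_pages cap is an assumption added here to bound the number of API calls per run; pick a value that fits your rate limit.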

Related

How to determine how many more requests my app can make before it hits the Twitter API's rate limit?

My application habitually makes a request to the Twitter API for a user's timeline (the user's tweets).
As an aside, I'm using the twitter gem and configured a Twitter client in order to enable my app to make the request:
client = Twitter::REST::Client.new do |config|
config.consumer_key = ENV["TWITTER_CONSUMER_KEY"]
config.consumer_secret = ENV["TWITTER_CONSUMER_SECRET"]
config.access_token = ENV["TWITTER_ACCESS_TOKEN"]
config.access_token_secret = ENV["TWITTER_ACCESS_TOKEN_SECRET"]
end
The endpoint that I'm making a request to -- https://api.twitter.com/1.1/statuses/user_timeline.json -- has a rate limit of 900 requests every 15 minutes.
If I understand it correctly, there are 2 common ways to determine if my app has hit its rate limit -- but neither is what I need:
First Approach:
This is a function I wrote that attempts to make the request for a user's tweets, and if the app has hit the rate limit, it'll raise an exception.
def get_tweets_from_TwitterAPI(twitter_handle)
  tweets = []
  begin
    tweets = CLIENT.user_timeline(twitter_handle)
  rescue Twitter::Error::TooManyRequests => error
    raise error
  end
  return tweets
end
The problem with that approach is I'd like to find out how many more requests my app can safely make before I even make the request to the Twitter API. I fear that this approach would have my app pinging the Twitter API long after it hit the rate limit, and that would open my app up to punitive actions by Twitter (like my app being blacklisted for instance.)
Second Approach
The second approach is to make a request to this Twitter endpoint -- https://api.twitter.com/1.1/application/rate_limit_status.json -- which sends back data about where the application stands against a given rate limit.
But again, this endpoint also has its own rate limit (180 requests every 15 minutes) -- which isn't very high. My app would blow past that limit. In an ideal world, I would like to determine my app's current rate-limit status before I make a request to the API at all. And so:
def start_tweets_fetch(twitter_handle)
  number_of_requests_in_last_15_minutes = 0 # placeholder: WHERE DO I GET THIS NUMBER FROM??
  if number_of_requests_in_last_15_minutes < 900
    get_tweets_from_TwitterAPI(twitter_handle)
  end
end
I'm imagining I would have to increment some number that I've persisted to my database to keep track of requests to the API. Or is there an easier way?
I can't speak for the gem you are using, but one way to track your request limits without an extra call to the rate_limit_status endpoint is to examine the X-Rate-Limit-Remaining header on each API response. I don't know whether that data is exposed by the Ruby gem you're using, though.
Edit
This is in response to Andy Piper's answer which I think is the simplest way to keep track of the remaining calls.
Assuming you're using this Twitter gem, it looks like each response from the gem will populate a Twitter::RateLimit object with the information from the rate limiting headers like Andy has suggested.
You should be able to access that information like this:
tweets = CLIENT.user_timeline(twitter_handle)
remaining_calls = tweets.rate_limit.remaining
From there you can save that value to check it the next time you want to make a request. How you save it and check is up to you but the rest of my answer may still be useful for that.
Note: I haven't tried this method before, but it's one of the first things I would try in your situation if I didn't want to permanently store request logs.
One way might be to use Rails' built-in Cache API. This allows you to store any value you wish in a cache store, which should be faster and lighter than a database.
number_of_requests_in_last_15 = Rails.cache.fetch("twitter_requests", expires_in: 15.minutes) { 0 }
if number_of_requests_in_last_15 < 900
  get_tweets_from_TwitterAPI(twitter_handle)
  Rails.cache.increment("twitter_requests")
end
Let's break this down
Rails.cache.fetch("twitter_requests", expires_in: 15.minutes) { 0 }:
The fetch method on Rails.cache will attempt to pull the value for the key twitter_requests.
If the key doesn't exist, it will evaluate the block and set the return value as the key's new value and return that. In this case, if the key twitter_requests doesn't exist, the new key value will be 0.
The expires_in: 15.minutes option passed to the fetch method says to automatically clear this key (twitter_requests) every 15 minutes.
Rails.cache.increment("twitter_requests"):
Increments the value in the twitter_requests key by 1.
Notes
By default, Rails will use an in memory datastore. This should work without issue but any values stored in the cache will be reset every time you restart the rails server.
The backend of the cache is configurable and can be changed to other popular systems (e.g. memcache, redis), but those will also need to be running and accessible by Rails.
You may want to increment the cache before calling the API to reduce the chance of the cache expiring between when you checked it and when you increment it. Incrementing a key that doesn't exist will return nil.
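To make the counting idea concrete without depending on Rails, here is a minimal plain-Ruby sketch of a fixed-window counter. RequestBudget and try_acquire are hypothetical names invented for this illustration, and the counter lives in process memory only, so it does not coordinate between several processes or survive a restart.

```ruby
# Hypothetical fixed-window request counter (not part of Rails or the twitter gem).
class RequestBudget
  # clock is injectable so the window logic can be exercised without waiting.
  def initialize(limit:, window_seconds:, clock: -> { Time.now.to_f })
    @limit = limit
    @window_seconds = window_seconds
    @clock = clock
    @window_started_at = nil
    @count = 0
  end

  # Records one request and returns true if the current window still has
  # room; returns false (recording nothing) once the limit is reached.
  def try_acquire
    now = @clock.call
    start_new_window(now) if window_expired?(now)
    return false if @count >= @limit

    @count += 1
    true
  end

  private

  def window_expired?(now)
    @window_started_at.nil? || now - @window_started_at >= @window_seconds
  end

  def start_new_window(now)
    @window_started_at = now
    @count = 0
  end
end
```

A caller would create one budget, for example RequestBudget.new(limit: 900, window_seconds: 15 * 60), and only call get_tweets_from_TwitterAPI when try_acquire returns true. Counting before the request, rather than after, matches the last note above.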

Twitter::Error::TooManyRequests in GetFollowersController#index

How can I solve this problem? I am using the twitter gem, but I cannot get a list of followers:
@client.followers.collect {|f| @client.user(f).screen_name } #=> Twitter::Error::TooManyRequests (Rate limit exceeded):
It looks like you're trying to collect the screen_names of your followers. However, in your collect block, you're calling the Twitter API again for every follower. This results in many API calls to Twitter and hits your rate limit.
You don't need to do it that way. When you make the @client.followers call, you already have the screen_names of your followers. Try this:
@client.followers.map { |follower| follower.screen_name }
You can look at the API documentation for GET followers/list to see what else is in there. It also tells you that it's rate limited to 15 requests in a 15 minute window. If you've been testing your code a few times, it can be pretty easy to hit that limit.
Another concern is Twitter only returns a maximum of 200 followers per call. If you have more than 200 followers, you'll need to make multiple API calls to retrieve all followers. If you have more than 3,000 followers, it may not be possible to retrieve all followers within the 15 minute rate-limit window.
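The arithmetic behind those numbers can be checked with a short sketch. The constants come from the figures quoted above (200 followers per call, 15 calls per 15-minute window); the method names are made up for this example.

```ruby
FOLLOWERS_PER_CALL = 200 # maximum followers returned per call, as quoted above
CALLS_PER_WINDOW   = 15  # GET followers/list calls allowed per 15-minute window

# Number of API calls needed to page through all followers.
def follower_calls_needed(follower_count)
  follower_count.fdiv(FOLLOWERS_PER_CALL).ceil
end

# Number of 15-minute windows those calls would span.
def rate_limit_windows_needed(follower_count)
  follower_calls_needed(follower_count).fdiv(CALLS_PER_WINDOW).ceil
end

puts follower_calls_needed(1_000)     # prints 5
puts rate_limit_windows_needed(3_000) # prints 1 (exactly 15 calls)
puts rate_limit_windows_needed(3_001) # prints 2 (16 calls no longer fit)
```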
The twitter gem handles the multiple API calls for you. For example, if you have 1,000 followers, the gem will make multiple API calls behind the scenes. The gem has a recommended way to handle rate limits. Here's what they recommend:
follower_ids = client.follower_ids('justinbieber')
begin
  follower_ids.to_a
rescue Twitter::Error::TooManyRequests => error
  # NOTE: Your process could go to sleep for up to 15 minutes but if you
  # retry any sooner, it will almost certainly fail with the same exception.
  sleep error.rate_limit.reset_in + 1
  retry
end
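If the same sleep-and-retry pattern is needed in several places, it could be wrapped in a small helper. The sketch below is one possible shape, not something the gem provides: with_rate_limit_retry, max_retries and sleeper are hypothetical names, and the retry cap is an addition so a persistent failure cannot loop forever.

```ruby
# Hypothetical wrapper around the sleep-and-retry pattern.
# error_class is expected to expose rate_limit.reset_in, as
# Twitter::Error::TooManyRequests does in the snippet above.
# sleeper is injectable so the helper can be exercised without real waiting.
def with_rate_limit_retry(error_class:, max_retries: 3,
                          sleeper: ->(seconds) { sleep(seconds) })
  retries = 0
  begin
    yield
  rescue error_class => error
    # Give up and re-raise once the retry budget is spent.
    raise if retries >= max_retries

    retries += 1
    sleeper.call(error.rate_limit.reset_in + 1)
    retry
  end
end

# Possible usage with the gem (not executed here):
#   ids = with_rate_limit_retry(error_class: Twitter::Error::TooManyRequests) do
#     client.follower_ids('justinbieber').to_a
#   end
```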
That error is saying you've made too many requests within a given time frame. You'll have to wait until your rate limit has been cleared.
This is what Twitter says:
Rate limiting of the API is primarily on a per-user basis — or more accurately described, per user access token. If a method allows for 15 requests per rate limit window, then it allows 15 requests per window per access token.
See: https://developer.twitter.com/en/docs/basics/rate-limiting

How pagination works in twitter gem?

I have integrated a Twitter feed into my Rails 4 application using this gem.
I am able to fetch the home timeline feed using the following twitter gem method:
Method:
twitter.home_timeline
This returns the last 20 tweets of the home timeline.
To fetch the first page of tweets, I added the following options to the method, which returns the first 200 tweets in page 1.
Method:
twitter.home_timeline(:page => 1, :count => 200)
Here, I have to provide the page number manually every time, like :page => 2, :page => 3, and so on, to fetch the next page of tweets.
So, is there any method or way to get the total page count for the home timeline tweets using the twitter gem?
As mentioned in the official documentation:
Note: This method can only return up to 800 Tweets, including
retweets.
Meaning that you will never be able to get tweets greater than page 4 (while using count as 200)
To answer the question in the headline: the gem simply makes a call and passes all provided parameters, including page and count, to the Twitter API - see source
def home_timeline(options = {})
  perform_get_with_objects('/1.1/statuses/home_timeline.json', options, Twitter::Tweet)
end
So whatever logic you see is the logic of the Twitter API. For the API, you can read the official doc here. Paging is additionally explained here. As indicated in the other answer, there is a limit for the statuses/home_timeline call:
Up to 800 Tweets are obtainable on the home timeline. It is more
volatile for users that follow many users or follow users who tweet
frequently.
Meanwhile, you can test the API at https://dev.twitter.com/rest/tools/console, which was a helpful link for me.

Get all liked pages sorted by general likes

I'm trying to get Pages liked by the user ordered descending by amount of likes each Page has...
It's difficult to get this using the Graph API because I'd have to make a request like this:
let request = FBSDKGraphRequest(graphPath: "me/likes", parameters: nil)
and recursively call this inside because this request will paginate response. After I get everything I'll have to sort it locally and that's how I'd get it 😬
IMHO, it's a lil bit overkill so I've looked into a method of achieving same thing but using FQL and this is the query:
SELECT name, fan_count FROM page WHERE page_id IN (SELECT page_id FROM page_fan WHERE uid = me()) ORDER BY fan_count DESC
At first I was happy with this, but after some testing my friend told me that he can't see Messi on his list. So I wonder what's the reason that not all Pages are shown in this FQL query result?
You don’t have to make separate requests for this.
The Graph API has a feature called “field expansion”, that allows you to specify that you want data from multiple “levels” in one go. https://developers.facebook.com/docs/graph-api/using-graph-api/v2.4#fieldexpansion
So requesting
/me/likes?fields=id,name,likes
will give you the id, name and number of likes for each of the user’s liked pages.
(You will still have to follow the pagination links, gather all results and do the sorting on your end afterwards, since the API doesn’t currently allow for sorting.)
FQL is deprecated and only works with older Apps using v2.0 of the Graph API. As of now, the only way to do this is to recursively get all Pages and do the sorting on your own.
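The "collect everything, then sort locally" step is language independent; here it is sketched in Ruby rather than Swift. It assumes each parsed response is a hash with a "data" array and an optional "paging" => "next" URL, and that each page carries the "likes" count requested via field expansion. collect_and_sort_pages and fetch are hypothetical names; fetch stands for whatever performs the HTTP request and parses the JSON.

```ruby
# Hypothetical helper: follows "next" links until none is left, then sorts
# the gathered pages by like count, highest first.
# fetch is any callable that takes a URL and returns the parsed response hash.
def collect_and_sort_pages(fetch, first_url)
  pages = []
  url = first_url

  while url
    response = fetch.call(url)
    pages.concat(response.fetch("data", []))
    url = response.dig("paging", "next") # nil on the last page ends the loop
  end

  # Pages without a "likes" value are treated as 0 and end up last.
  pages.sort_by { |page| -page.fetch("likes", 0) }
end
```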

How can I get all tweets relating to a specific user in Twitter?

I am new to the Twitter API and I'm having an issue with the user_timeline API.
I am using the following REST query:
https://api.twitter.com/1/statuses/user_timeline.xml?screen_name=twitterapi&count=50
which provides the user's timeline data; however, it only gives the user's tweets. I want all tweets by and about the user (i.e. the user's tweets, and mentions of the user by the user's followers).
Thanks in advance!
You can access this by searching for the user's @ handle. This will return tweets which mention @user and also tweets by @user.
Twitter API - Search
--
I've no experience about formatting for JSON calls but the following should be enough:
https://api.twitter.com/1.1/search/tweets.json?q=%40ataulm
The %40 is the URL-encoded @ symbol, and ataulm is the user name you wish to query. See the page linked for the default values of the other parameters - this will, for example, only return 15 tweets per "page" (not sure what a page refers to), but can be set to a maximum of 100 per page using the count parameter.
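Rather than encoding the handle by hand, the query string can be built with Ruby's standard library. This is only a sketch of constructing the URL (search_url is a made-up helper name); the request itself still has to be authenticated.

```ruby
require "uri"

# Builds the search URL for tweets by or mentioning the given handle.
# URI.encode_www_form percent-encodes the "@" as %40.
def search_url(handle, count: 100)
  query = URI.encode_www_form(q: "@#{handle}", count: count)
  "https://api.twitter.com/1.1/search/tweets.json?#{query}"
end

puts search_url("ataulm")
# prints https://api.twitter.com/1.1/search/tweets.json?q=%40ataulm&count=100
```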
"https://api.twitter.com/1.1/statuses/user_timeline.json?screen_name=".$twitteruser.'&count=500'
But it returns only 200 records.
