Twitter::Error::TooManyRequests in GetFollowersController#index - ruby-on-rails

How do I solve this problem? I am using the twitter gem, but I cannot get a list of followers:
@client.followers.collect { |f| @client.user(f).screen_name } #=> Twitter::Error::TooManyRequests (Rate limit exceeded)

It looks like you're trying to collect the screen_names of your followers. However, in your collect block, you're calling the Twitter API again for every follower. This results in many API calls to Twitter and hits your rate limit.
You don't need to do it that way. When you make the @client.followers call, you already have the screen_names of your followers. Try this:
@client.followers.map { |follower| follower.screen_name }
You can look at the API documentation for GET followers/list to see what else is in there. It also tells you that it's rate limited to 15 requests in a 15 minute window. If you've been testing your code a few times, it can be pretty easy to hit that limit.
Another concern is Twitter only returns a maximum of 200 followers per call. If you have more than 200 followers, you'll need to make multiple API calls to retrieve all followers. If you have more than 3,000 followers, it may not be possible to retrieve all followers within the 15 minute rate-limit window.
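The arithmetic behind those numbers can be sketched as follows. This is only a back-of-the-envelope estimate built from the limits quoted above (200 followers per call, 15 calls per 15-minute window); the method name fetch_estimate and its keyword arguments are made up for illustration and are not part of the twitter gem.

```ruby
# Rough estimate of the cost of fetching every follower, using the limits
# quoted in the answer. `fetch_estimate` is a hypothetical helper.
def fetch_estimate(follower_count, per_call: 200, calls_per_window: 15, window_minutes: 15)
  calls   = (follower_count.to_f / per_call).ceil
  windows = (calls.to_f / calls_per_window).ceil
  # Every window except the last one has to be waited out in full.
  wait_minutes = [windows - 1, 0].max * window_minutes
  { calls: calls, windows: windows, wait_minutes: wait_minutes }
end

p fetch_estimate(1_000)  # 5 calls, fits in one window
p fetch_estimate(3_000)  # 15 calls, exactly one full window
p fetch_estimate(10_000) # 50 calls, 4 windows, at least 45 minutes of waiting
```

So anything above 3,000 followers needs at least one sleep-and-retry cycle, which is what the gem's recommended rescue block handles.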
The twitter gem handles the multiple API calls for you. For example, if you have 1,000 followers, the gem will make multiple API calls behind the scenes. The gem has a recommended way to handle rate limits. Here's what they recommend:
follower_ids = client.follower_ids('justinbieber')
begin
  follower_ids.to_a
rescue Twitter::Error::TooManyRequests => error
  # NOTE: Your process could go to sleep for up to 15 minutes but if you
  # retry any sooner, it will almost certainly fail with the same exception.
  sleep error.rate_limit.reset_in + 1
  retry
end
That error is saying you've made too many requests within a given time frame. You'll have to wait until your rate limit has been cleared.
This is what Twitter says:
Rate limiting of the API is primarily on a per-user basis — or more accurately described, per user access token. If a method allows for 15 requests per rate limit window, then it allows 15 requests per window per access token.
See: https://developer.twitter.com/en/docs/basics/rate-limiting

Related

How to determine how many more requests my app can make before it hits the Twitter API's rate limit?

My application habitually makes a request to the Twitter API for a user's timeline (the user's tweets).
As an aside, I'm using the twitter gem and configured a Twitter client in order to enable my app to make the request:
client = Twitter::REST::Client.new do |config|
config.consumer_key = ENV["TWITTER_CONSUMER_KEY"]
config.consumer_secret = ENV["TWITTER_CONSUMER_SECRET"]
config.access_token = ENV["TWITTER_ACCESS_TOKEN"]
config.access_token_secret = ENV["TWITTER_ACCESS_TOKEN_SECRET"]
end
The endpoint that I'm making a request to -- https://api.twitter.com/1.1/statuses/user_timeline.json -- has a rate limit of 900 requests every 15 minutes.
If I understand it correctly, there are 2 common ways to determine if my app has hit its rate limit -- but neither is what I need:
First Approach:
This is a function I wrote that attempts to make the request for a user's tweets, and if the app has hit the rate limit, it'll raise an exception.
def get_tweets_from_TwitterAPI(twitter_handle)
  tweets = []
  begin
    tweets = CLIENT.user_timeline(twitter_handle)
  rescue Twitter::Error::TooManyRequests => error
    raise error
  end
  return tweets
end
The problem with that approach is I'd like to find out how many more requests my app can safely make before I even make the request to the Twitter API. I fear that this approach would have my app pinging the Twitter API long after it hit the rate limit, and that would open my app up to punitive actions by Twitter (like my app being blacklisted for instance.)
Second Approach:
The second approach is to make a request to this Twitter endpoint -- https://api.twitter.com/1.1/application/rate_limit_status.json -- which sends back data about the application's current status for a given rate limit.
But again, this endpoint also has its own rate limit (180 requests every 15 minutes) -- which isn't very high. My app would blow past that limit. In an ideal world, I would like to determine my app's current rate-limit status before I make a request to the API at all. And so:
def start_tweets_fetch
  number_of_requests_in_last_15_minutes = WHERE DO I GET THIS NUMBER FROM??
  if number_of_requests_in_last_15_minutes <= 900
    get_tweets_from_TwitterAPI(twitter_handle)
  end
end
I'm imagining I would have to increment some number that I've persisted to my database to keep track of requests to the API. Or is there an easier way?
I can't speak for the gem you are using, but a way to track your request limits without having to additionally call the rate_limit_status endpoint is to examine the X-Rate-Limit-Remaining headers on each API call. I don't know whether that data is available on the Ruby gem you're using, though.
Edit
This is in response to Andy Piper's answer which I think is the simplest way to keep track of the remaining calls.
Assuming you're using this Twitter gem, it looks like each response from the gem will populate a Twitter::RateLimit object with the information from the rate limiting headers like Andy has suggested.
You should be able to access that information like this:
tweets = CLIENT.user_timeline(twitter_handle)
remaining_calls = tweets.rate_limit.remaining
From there you can save that value to check it the next time you want to make a request. How you save it and check is up to you but the rest of my answer may still be useful for that.
Note: I haven't tried this method before, but it's one of the first things I would try in your situation if I didn't want to permanently store request logs.
One way might be to use Rails' built-in Cache API. This will allow you to store any value you wish in a cache store, which should be faster and lighter than a database.
number_of_requests_in_last_15 = Rails.cache.fetch("twitter_requests", expires_in: 15.minutes) { 0 }
if number_of_requests_in_last_15 < 900 # stop before the 900-request limit is exceeded
  get_tweets_from_TwitterAPI(twitter_handle)
  Rails.cache.increment("twitter_requests")
end
Let's break this down
Rails.cache.fetch("twitter_requests", expires_in: 15.minutes) { 0 }:
The fetch method on Rails.cache will attempt to pull the value for the key twitter_requests.
If the key doesn't exist, it will evaluate the block and set the return value as the key's new value and return that. In this case, if the key twitter_requests doesn't exist, the new key value will be 0.
The expires_in: 15.minutes option passed to the fetch method says to automatically clear this key (twitter_requests) every 15 minutes.
Rails.cache.increment("twitter_requests"):
Increments the value in the twitter_requests key by 1.
Notes
By default, Rails will use an in-memory datastore. This should work without issue, but any values stored in the cache will be reset every time you restart the Rails server.
The backend of the cache is configurable and can be changed to other popular systems (e.g. memcache, redis), but those will also need to be running and accessible by Rails.
You may want to increment the cache before calling the API to reduce the chance of the cache expiring between when you checked it and when you increment it. Incrementing a key that doesn't exist will return nil.
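The same counting idea can be sketched without Rails, which makes the behaviour easier to see in isolation. Everything here is an assumption for illustration: WindowCounter and try_acquire are hypothetical names, not part of Rails or the twitter gem, and the window starts at the first recorded request, which only approximates how Twitter itself counts. The sketch also increments before the API call, as suggested in the note above, and is not thread-safe.

```ruby
# Hypothetical in-memory stand-in for the cache-based request counter.
class WindowCounter
  def initialize(limit:, window_seconds:,
                 clock: -> { Process.clock_gettime(Process::CLOCK_MONOTONIC) })
    @limit = limit
    @window_seconds = window_seconds
    @clock = clock            # injectable so the window can be tested without sleeping
    @count = 0
    @window_started_at = nil
  end

  # Records one request and returns true if budget remains in the current
  # window; returns false (recording nothing) once the limit is reached.
  def try_acquire
    now = @clock.call
    if @window_started_at.nil? || now - @window_started_at >= @window_seconds
      @window_started_at = now  # the previous window expired: start a fresh one
      @count = 0
    end
    return false if @count >= @limit

    @count += 1
    true
  end
end

fake_time = 0
counter = WindowCounter.new(limit: 2, window_seconds: 900, clock: -> { fake_time })
p counter.try_acquire # true
p counter.try_acquire # true
p counter.try_acquire # false, the budget for this window is used up
fake_time = 900
p counter.try_acquire # true, a new window has started
```

In the question's terms, the call to get_tweets_from_TwitterAPI would only run when try_acquire returns true.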

How to use thread pool for Graph API requests in Rails?

The process:
I'm getting event information from a Facebook page; among the fields I ask for are the Place ID and Owner ID. I store all the info in the database, and then I make more requests for every place and owner, using the IDs I got before, to get info about those places and owners.
require 'koala'
def update
  graph = Koala::Facebook::API.new(user.facebook.accesstoken)
  events_info = graph.get_object("#{self.fb_id}/events?fields=id,name,place,owner,...")
  place_info = graph.get_object("#{self.fb_id}/events?fields=id,name,location,...")
  owner_info = graph.get_object("#{self.fb_id}/events?fields=id,name,...")
  # A LOT OF DATABASE STORING
end
The problem comes when I process multiple facebook pages like:
fbpages.each{|page| page.update}
It takes too long to execute. This is the benchmark:
#<Benchmark::Tms:0x00000007670f28 @label="", @real=49.49197926799752, @cstime=0.0, @cutime=0.0, @stime=0.29000000000000004, @utime=4.140000000000001, @total=4.430000000000001>
As I see it, most of the time the process is waiting for responses from Facebook. I've been looking and I've found information about thread pooling, promises, futures, and batch requests, but I don't get how to make every update asynchronous, or how else to avoid waiting for one update to finish before starting the others.
Please help!
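One possible direction, sketched with only Ruby's standard library: since the time goes into waiting on the network, a small fixed-size pool of threads can overlap those waits. run_in_pool is a hypothetical helper, not part of Koala or Rails, and the sleep stands in for the Graph API round trip. In a Rails app each thread would also need its own database connection, for example by wrapping the work in ActiveRecord::Base.connection_pool.with_connection.

```ruby
# Minimal fixed-size thread pool: workers pull items off a shared queue and
# store each block result at the item's original position.
def run_in_pool(items, pool_size: 4)
  queue = Queue.new
  items.each_with_index { |item, index| queue << [item, index] }
  queue.close # once drained, pop returns nil and the workers exit

  results = Array.new(items.size)
  workers = Array.new(pool_size) do
    Thread.new do
      while (pair = queue.pop)
        item, index = pair
        results[index] = yield(item)
      end
    end
  end
  workers.each(&:join) # join re-raises any exception raised inside a worker
  results
end

pages = %w[page_a page_b page_c page_d]
started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
results = run_in_pool(pages, pool_size: 4) do |page|
  sleep 0.2 # stand-in for waiting on the Graph API
  "#{page} updated"
end
elapsed = Process.clock_gettime(Process::CLOCK_MONOTONIC) - started
puts results.inspect
puts format("took %.2fs", elapsed)
```

With real pages the block would call page.update instead. The pool size caps how many requests are in flight at once, which also helps with staying under the API's own rate limits.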

Twitter gem: how to avoid searching deeply with max_id?

I want my app to search tweets with a specific #tag on Twitter every few minutes, like this:
results = client.search("#mypopulartag")
However, I don't want to do a full search each time. In building the app, I've encountered the Twitter::Error::TooManyRequests error, because the search returns a lot of results (presumably the Twitter gem makes as many requests to Twitter as needed for one client.search() call).
I don't need it to search super deep each time. Can I pass in the max_id parameter to the client.search method, so I don't waste API calls?
Yes, if you keep track of the id of the most recent tweet that you've processed, you can get all tweets since then with something like this (using gem version 5.13):
client.search(
  "#mypopulartag",
  result_type: 'recent',
  since_id: since_id # your last processed id
).take(15)
Just keep in mind that if there are, say, 60 results, you'll need to execute more client.search calls to get all the tweets. For those calls, you'd want to also specify a max_id equal to the last tweet id that was processed in the current search.
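The paging logic can be sketched against a fake data source, so it runs without credentials. fake_search is a hypothetical stand-in for client.search that returns ids newest first and treats since_id as exclusive and max_id as inclusive, which is how I understand the Search API parameters to behave. Because max_id is inclusive, the sketch steps to one below the oldest id seen so that tweet is not fetched twice.

```ruby
# Hypothetical stand-in for client.search: newest ids first, at most `count`.
def fake_search(all_ids, since_id:, max_id: nil, count: 15)
  all_ids.select { |id| id > since_id && (max_id.nil? || id <= max_id) }
         .sort.reverse.first(count)
end

# Pages backwards from the newest tweet until everything newer than
# since_id has been collected.
def fetch_new_ids(all_ids, since_id:, count: 15)
  collected = []
  max_id = nil
  loop do
    page = fake_search(all_ids, since_id: since_id, max_id: max_id, count: count)
    break if page.empty?

    collected.concat(page)
    max_id = page.last - 1 # max_id is inclusive, so go just below the oldest id seen
  end
  collected
end

ids = fetch_new_ids((1..60).to_a, since_id: 20)
puts ids.size  # 40
puts ids.first # 60
puts ids.last  # 21
```

After a run, the newest id (ids.first here) is the value to store as since_id for the next search.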

Multiple GET requests in Rails?

I'm developing an application, and on one page it requires approximately 12-15 GET requests to be made to an API in the background. My original intent was to make the requests using AJAX from jQuery, but it turns out that it is impossible to do so with the Steam Web API I am using.
Doing this in the Rails controller before the page loads is, for obvious reasons, very slow.
After I get the data from the API, I parse it and send it to the JavaScript using gon. The problem is that I don't know how to get and set the data after the page renders.
Here is what my controller would look like:
def index
  @title = "My Stats"
  if not session.key?(:current_user) then
    redirect_to root_path
  else
    gon.watch.winlossdata = GetMatchHistoryRawData(session[:current_user][:uid32])
  end
end
The function GetMatchHistoryRawData is a helper function that is calling the GET requests.
Using the whenever gem (possibly; see below):
Set a value in a queue database table before rendering the page. Using a "cron" task (whenever gem) that monitors the queue table you can make requests to the Steam API and populate a queue result table. On the rendered page you could implement a JavaScript periodic check with AJAX to monitor the queue result table and populate the page once the API returns a result.
Additional Info:
I have not used the whenever gem yet but I did some more reading on it and there might be an issue with the interval not being short enough to make it as close to real time as possible. I am currently doing my job processing with a Java application implementing a timer but have wondered about moving to whenever and CRON. So whenever might not work for you but the idea of an asynchronous processor doing the work of contacting the API is the gist of my answer. If the payload from the Steam API is small and returned fast enough then like what was stated above you could use a direct call via AJAX to the controller and then the Steam API.
Regarding the Rails code it should be pretty much standard.
controller:
def index
  # Create a Steam API Queue row in the database and save any pertinent information needed for contacting the Steam API
  @unique_id = Model.id # some unique id created for the Steam API queue row
end
# AJAX calls START
def get_api_result
  # Check for a result using params[:unique_id]
  # render partial for <div>
end
# AJAX calls END
View: index
# Display your page
# Set up an intermittent AJAX call to "controller#get_api_result" with some unique id #{@unique_id} i.e. params[:unique_id] to identify the Steam API Queue table row, populate the result of the call into a <div>
external_processor_code (Whenever Gem, Java implementation, some Job processor, etc...)
Multiple threads should be available to process the Steam API Queue table and retrieve results every few seconds and populate the result table that will be read by the controller when requested via the AJAX call.
To give a complete example of this type of implementation would take some time so I have briefly, from the conceptual level, outlined it above. There might be some other ways to do this that could be more efficient with the way technology is expanding so please do some investigation.
I hope this is helpful!

My web site need to read a slow web site, how to improve the performance

I'm writing a web site with Rails that lets visitors enter some domains and check whether they have been registered.
When the user clicks the "Submit" button, my web site posts some data to another web site and reads the result back. But that web site is slow for me; each request needs 2 or 3 seconds. So I'm worried about the performance.
For example, if my web server allows 100 processes at most, then only 30 or 40 users can visit my website at the same time. This is not acceptable. Is there any way to improve the performance?
PS:
At first, I wanted to use AJAX to read that web site, but because of the "cross-domain" problem, it doesn't work. So I have to use this "AJAX proxy" solution.
It's a bit more work, but you can use something like DelayedJob to process the requests to the other site in the background.
DelayedJob creates separate worker processes that look at a jobs table for stuff to do. When the user clicks submit, such a job is created, and starts running in one of those workers. This off-loads your Rails workers, and keeps your website snappy.
However, you will have to create some sort of polling mechanism in the browser while the job is running. Perhaps using a refresh or some simple AJAX. That way, the visitor could see a message such as “One moment, please...”, and after a while, the actual results.
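The enqueue-then-poll flow can be sketched in plain Ruby. This is not DelayedJob's API: the Queue stands in for the jobs table, the thread stands in for a separate worker process, and start_worker and job_result are hypothetical names. The sleep simulates the slow remote site.

```ruby
# Worker: drains the queue and records each result under its job id.
def start_worker(jobs, results, lock)
  Thread.new do
    while (job = jobs.pop) # pop returns nil once the queue is closed and empty
      id, domain = job
      sleep 0.1 # stand-in for the slow remote site
      lock.synchronize { results[id] = "#{domain}: checked" }
    end
  end
end

# Polling endpoint: returns the result, or nil while the job is still running.
def job_result(results, lock, id)
  lock.synchronize { results[id] }
end

jobs    = Queue.new
results = {}
lock    = Mutex.new
worker  = start_worker(jobs, results, lock)

jobs << [1, "example.com"] # the "Submit" action only enqueues and returns

until (status = job_result(results, lock, 1))
  puts "One moment, please..."
  sleep 0.05
end
puts status

jobs.close
worker.join
```

The point of the design is that the request handling the "Submit" click returns immediately, so a web process is never tied up for the 2 or 3 seconds the remote site takes.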
Rather than posting some data to the websites, you could use an HTTP HEAD request, which (I believe) should return only the header information for that URL.
I found this code by googling around a bit:
require "net/http"
req = Net::HTTP.new('google.com', 80)
p req.request_head('/')
This will probably be faster than a POST request, and you won't have to wait to receive the entire contents of that resource. You should be able to determine whether the site is in use based on the response code.
Try using typhoeus rather than AJAX to get the body. You can POST the domain names to that site using typhoeus and parse the response it fetches. It's extremely fast compared to other solutions. A snippet that I took from the wiki page of the GitHub repo http://github.com/pauldix/typhoeus shows that you can run requests in parallel (which is probably what you want, considering that each request takes 1 to 2 seconds):
hydra = Typhoeus::Hydra.new
first_request = Typhoeus::Request.new("http://localhost:3000/posts/1.json")
first_request.on_complete do |response|
  post = JSON.parse(response.body)
  third_request = Typhoeus::Request.new(post.links.first) # get the first url in the post
  third_request.on_complete do |response|
    # do something with that
  end
  hydra.queue third_request
  post # the block's value; an explicit `return` is not valid inside a block like this
end
second_request = Typhoeus::Request.new("http://localhost:3000/users/1.json")
second_request.on_complete do |response|
  JSON.parse(response.body)
end
hydra.queue first_request
hydra.queue second_request
hydra.run # this is a blocking call that returns once all requests are complete
first_request.handled_response # the value returned from the on_complete block
second_request.handled_response # the value returned from the on_complete block (parsed JSON)
Also Typhoeus + delayed_job = AWESOME!
