How can I prevent many sidekiq jobs from exceeding the API calls limit - ruby-on-rails

I am working on a Ruby on Rails application. We have many Sidekiq workers that can process multiple jobs at a time. Each job makes calls to the Shopify API, and Shopify limits us to 2 calls per second. I want to synchronize the jobs so that only two of them can call the API in any given second.
This is how I'm doing it right now:
# frozen_string_literal: true
class Synchronizer
  attr_reader :shop_id, :queue_name, :limit, :wait_time

  def initialize(shop_id:, queue_name:, limit: nil, wait_time: 1)
    @shop_id = shop_id
    @queue_name = queue_name.to_s
    @limit = limit
    @wait_time = wait_time
  end

  # This method should be called for each api call
  def synchronize_api_call
    raise "a block is required." unless block_given?
    get_api_call
    time_to_wait = calculate_time_to_wait
    sleep(time_to_wait) unless Rails.env.test? || time_to_wait.zero?
    yield
  ensure
    return_api_call
  end

  def set_api_calls
    redis.del(api_calls_list)
    redis.rpush(api_calls_list, calls_list)
  end

  private

  def get_api_call
    logger.log_message(synchronizer: 'Waiting for api call', color: :yellow)
    @api_call_timestamp = redis.brpop(api_calls_list)[1].to_i
    logger.log_message(synchronizer: 'Got api call.', color: :yellow)
  end

  def return_api_call
    redis_timestamp = redis.time[0]
    redis.rpush(api_calls_list, redis_timestamp)
  ensure
    redis.ltrim(api_calls_list, 0, limit - 1)
  end

  def last_call_timestamp
    @api_call_timestamp
  end

  def calculate_time_to_wait
    current_time = redis.time[0]
    time_passed = current_time - last_call_timestamp.to_i
    time_to_wait = wait_time - time_passed
    time_to_wait > 0 ? time_to_wait : 0
  end

  def reset_api_calls
    redis.multi do |r|
      r.del(api_calls_list)
    end
  end

  def calls_list
    redis_timestamp = redis.time[0]
    limit.times.map do |i|
      redis_timestamp
    end
  end

  def api_calls_list
    @api_calls_list ||= "api-calls:shop:#{shop_id}:list"
  end

  def redis
    Thread.current[:redis] ||= Redis.new(db: $redis_db_number)
  end
end
The way I use it is like this:
synchronizer = Synchronizer.new(shop_id: shop_id, queue_name: 'shopify_queue', limit: 2, wait_time: 1)
# This is called once when the process starts, i.e. not by the jobs themselves but by the app that kicks off the process.
synchronizer.set_api_calls # populates api_calls_list with 2 timestamps, which are used to tell when the last API call was sent
Then, when a job wants to make a call:
synchronizer.synchronize_api_call do
  # make the call
end
The problem
The problem with this is that if a job for some reason fails to return the API call it took to api_calls_list, that job and all the other jobs get stuck forever, or until we notice and call set_api_calls again. The problem doesn't affect only that particular shop but the other shops as well, because the Sidekiq workers are shared between all the shops using our app. Sometimes we don't notice until a user calls us, and we find that a process was stuck for many hours when it should have finished in a few minutes.
The Question
I recently realised that Redis is not the best tool for shared locking. So I am asking: is there a better tool for this job? If not in the Ruby world, I'd like to learn from others as well. I'm interested in the techniques as well as the tools, so every bit helps.

You may want to restructure your code and create a micro-service to process the API calls, which will use a local locking mechanism and force your workers to wait on the socket. It comes with the added complexity of maintaining the micro-service. But if you're in a hurry, Sidekiq Enterprise's rate limiting (Ent-Rate-Limiting) looks cool too.
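As a rough illustration of the "local locking mechanism" such a service could use, here is a minimal in-process sketch. The class name `LocalRateLimiter` and its parameters are made up for this example. It keeps the start times of recent calls instead of handing out tokens, so nothing has to be returned after a call and a crashed caller cannot leave the others stuck. It only limits calls made within a single Ruby process, which is why it fits a single service that owns all the API calls.

```ruby
# Sliding-window limiter: at most `limit` calls may start per `period` seconds.
# `clock` and `sleeper` are injectable so the class can be exercised without real waiting.
class LocalRateLimiter
  def initialize(limit:, period: 1.0,
                 clock: -> { Process.clock_gettime(Process::CLOCK_MONOTONIC) },
                 sleeper: ->(seconds) { sleep(seconds) })
    @limit = limit
    @period = period
    @clock = clock
    @sleeper = sleeper
    @stamps = [] # start times of the calls inside the current window
    @mutex = Mutex.new
  end

  # Blocks until a slot is free, then runs the block and returns its value.
  def synchronize
    raise ArgumentError, 'a block is required' unless block_given?
    wait_for_slot
    yield
  end

  private

  def wait_for_slot
    loop do
      delay = @mutex.synchronize do
        now = @clock.call
        @stamps.reject! { |t| now - t >= @period } # forget calls outside the window
        if @stamps.size < @limit
          @stamps << now
          nil # a slot was free, no need to wait
        else
          @period - (now - @stamps.first) # time until the oldest call expires
        end
      end
      return if delay.nil?
      @sleeper.call(delay) # sleep outside the mutex so other threads can proceed
    end
  end
end

limiter = LocalRateLimiter.new(limit: 2, period: 1.0)
3.times { |i| limiter.synchronize { puts "call #{i}" } } # the third call waits about a second
```

Because a slot is claimed by recording a timestamp and expires on its own, a job that raises inside the block does not need any cleanup step.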

Related

Speed up rake task by using typhoeus

So I stumbled across this: https://github.com/typhoeus/typhoeus
I'm wondering if this is what I need to speed up my rake task:
Event.all.each do |row|
  begin
    url = urlhere + row.first + row.second
    doc = Nokogiri::HTML(open(url))
    doc.css('.table__row--event').each do |tablerow|
      table = tablerow.css('.table__cell__body--location').css('h4').text
      next unless table == row.eventvenuename
      tablerow.css('.table__cell__body--availability').each do |button|
        buttonurl = button.css('a')[0]['href']
        if buttonurl.include? '/checkout/external'
        else
          row.update(row: buttonurl)
        end
      end
    end
  rescue Faraday::ConnectionFailed
    puts "connection failed"
    next
  end
end
I'm wondering if this would speed it up, or whether it wouldn't because I'm using .each.
If it would, could you provide an example?
Sam
If you set up Typhoeus::Hydra to run parallel requests, you might be able to speed up your code, assuming that the Kernel#open calls are what's slowing you down. Before you optimize, you might want to run benchmarks to validate this assumption.
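A quick way to check that assumption is to time the fetching and the parsing separately with Ruby's standard Benchmark module. In the sketch below, `fetch` and `parse` are hypothetical stand-ins (a short sleep and a regex scan) for the real open and Nokogiri calls, so only the measuring pattern carries over to your task.

```ruby
require 'benchmark'

# Stand-in for the network fetch; swap in the real open(url) call.
def fetch(url)
  sleep(0.01) # simulated network latency
  "<html>#{url}</html>"
end

# Stand-in for the Nokogiri parsing step.
def parse(body)
  body.scan(/<(\w+)>/).flatten
end

urls = Array.new(5) { |i| "http://example.com/#{i}" }

bodies = nil
fetch_time = Benchmark.realtime { bodies = urls.map { |url| fetch(url) } }
parse_time = Benchmark.realtime { bodies.each { |body| parse(body) } }

puts format('fetch: %.3fs, parse: %.3fs', fetch_time, parse_time)
```

If the fetch time dominates, parallel requests are likely to help; if parsing or the database updates dominate, they will not.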
If it is true, and parallel requests would speed it up, you would need to restructure your code to load events in batches, build a queue of parallel requests for each batch, and then handle them after they execute. Here's some sketch code.
class YourBatchProcessingClass
  def initialize(batch_size: 200)
    @batch_size = batch_size
    @hydra = Typhoeus::Hydra.new(max_concurrency: @batch_size)
  end

  def perform
    # Get an array of records
    Event.find_in_batches(batch_size: @batch_size) do |batch|
      # Store all the requests so we can access their responses later.
      requests = batch.map do |record|
        request = Typhoeus::Request.new(your_url_build_logic(record))
        @hydra.queue request
        request
      end
      @hydra.run # Run requests in parallel
      # Process responses from each request
      requests.each do |request|
        your_response_processing(request.response.body)
      end
    end
  rescue WhateverError => e
    puts e.message
  end

  private

  def your_url_build_logic(event)
    # TODO
  end

  def your_response_processing(response_body)
    # TODO
  end
end
# Run the service by calling this in your Rake task definition
YourBatchProcessingClass.new.perform
Ruby can be used for pure scripting, but it functions best as an object-oriented language. Decomposing your processing work into clear methods can help clarify your code and help you catch things like Tom Lord mentioned in the comments on your question. Also, instead of wrapping your whole script in a begin..rescue block, you can use method-level rescues as in #perform above, or just wrap @hydra.run.
As a note, .all.each is a memory hog, and is thus considered a bad solution to iterating over records: .all loads all of the records into memory before iterating over them with .each. To save memory, it's better to use .find_each or .find_in_batches, depending on your use case. See: http://api.rubyonrails.org/classes/ActiveRecord/Batches.html

How to use the callback after_destroy (or something similar) in Rails 4?

I'm creating an application that creates polls. Each poll has many poll pages, and each poll page has many question clusters. When a question cluster is deleted, I want to find every question cluster on the same page that had a higher position and decrease its position by 1.
This is what I tried, but it doesn't even run:
after_destroy :reassign_position

private

def reassign_position
  question_clusters = QuestionCluster.where(poll_page_id: self.poll_page_id)
  question_clusters.where("position > ?", self.position)
  quest_cluster.each do |question_cluster|
    question_cluster.position -= 1
  end
end
How can I accomplish what I want?
You are not updating the question_cluster's attribute (position): decrementing it in memory does not save it. Also note that the result of the second where has to be kept, and the loop has to iterate over question_clusters (quest_cluster is undefined). Take a look:
def reassign_position
  question_clusters = QuestionCluster.where(poll_page_id: poll_page_id)
                                     .where("position > ?", position)
  question_clusters.each do |question_cluster|
    # actually update the question_cluster
    question_cluster.update!(position: question_cluster.position - 1) # <========
  end
end
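The renumbering rule itself can be tried out without a database. The sketch below uses a plain Struct named `Cluster` as a hypothetical stand-in for the QuestionCluster model, and a made-up helper `close_gap`; it only illustrates which records move and by how much.

```ruby
# Plain-Ruby stand-in for the QuestionCluster model (no database involved).
Cluster = Struct.new(:id, :poll_page_id, :position)

# Closes the gap left by `removed`: every cluster on the same page that sat
# at a higher position moves down by one.
def close_gap(clusters, removed)
  clusters
    .select { |c| c.poll_page_id == removed.poll_page_id && c.position > removed.position }
    .each { |c| c.position -= 1 }
  clusters
end

clusters = [
  Cluster.new(1, 10, 1),
  Cluster.new(2, 10, 2),
  Cluster.new(3, 10, 3),
  Cluster.new(4, 11, 3) # different page, must stay untouched
]

removed = clusters.delete_at(1) # "destroy" the cluster at position 2 on page 10
close_gap(clusters, removed)
p clusters.map { |c| [c.id, c.position] } # => [[1, 1], [3, 2], [4, 3]]
```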

Broadcasting message every second using websockets-rails gem

I'm building an app that receives some info every second using the websockets-rails gem.
Right now, it seems like all messages are sent only after the method has fully executed.
My websockets controller:
class DbTestsController < WebsocketRails::BaseController
  def run_tests_on_all
    dbtsch = DbTestsScheduler.new
    dbtsch.run(1, 10, message['shard'], :push) do |ops|
      send_message 'db_test.run_tests_on_all', ops
      Rails.logger.info(ops)
    end
  end
end
The run method looks like this:
def run(ecfs, fafs, shard, operation)
  st = tep_t = Time.now
  while st + fafs.second > Time.now
    Octopus.using(shard) do
      send(operation)
    end
    if tep_t + ecfs.second <= Time.now
      tep_t = tep_t + 1.second
      yield(@ops) if block_given?
      @ops = 0
    end
  end
end
In the console I see that Rails.logger.info(ops) outputs a message every second, but send_message sends all 10 results at once when the method finishes executing.
I think what you want to do is use a gem like sync:
Real-time partials with Rails. Sync lets you render partials for models that, with minimal code, update in realtime in the browser when changes occur on the server.
You can check out an example here

Where does an if/then statement that needs to run constantly go in Rails?

Right now I'm building a call tracking app to learn Rails and Twilio. The app has 2 relevant models: the Plan model has_many users, and the plans table also has the value max_minutes.
I want to make it so that when a particular user goes over their max_minutes, their subaccount is disabled, and I can also warn them to upgrade in the view.
To do this, here's a method I created in the User class:
def at_max_minutes?
  time_to_bill = 0
  start_time = Time.now - (30 * 24 * 60 * 60) # 30 days
  @subaccount = Twilio::REST::Client.new(@user.twilio_account_sid, @user.twilio_auth_token)
  @subaccount.calls.list({:page => 0, :page_size => 1000, :start_time => ">#{start_time.strftime("%Y-%m-%d")}"}).each do |call|
    time_to_bill += (call.duration.to_f/60).ceil
  end
  time_to_bill >= self.plan.max_minutes
end
This allows me to run if/else statements in the view to urge them to upgrade. However, I'd also like to add an if/else statement so that if at_max_minutes? is true, then the user's Twilio subaccount is disabled; otherwise, it's enabled.
I'm not sure where I would put that in Rails, though.
It would look something like this:
@client = Twilio::REST::Client.new(@user.twilio_account_sid, @user.twilio_auth_token)
@account = @client.account
if at_max_minutes?
  @account = @account.create({:status => 'suspended'})
else
  @account = @account.create({:status => 'active'})
end
BUT, I'm not sure where I would put this code, so that it's active all the time.
How would you implement this code, for the functionality to work?
Instead of constantly computing the total minutes used in at_max_minutes?, why not keep track of a user's used minutes, and set the status to "suspended" on the transition (when used minutes goes over max_minutes). Then your view and call code would only have to check status (you may also want to store status directly on user, to save API calls over to Twilio).
Add to User model:
used_minutes
When every call ends, update minutes:
def on_call_end(call)
  # This assumes Twilio gives you a callback that includes the length of the call.
  self.used_minutes += call.duration_in_minutes
  save!
end
Add an after_save to User:
after_save :check_minutes_usage

def check_minutes_usage
  if used_minutes >= plan.max_minutes
    @account = @account.create({:status => 'suspended'})
  else
    @account = @account.create({:status => 'active'})
  end
end
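To show the "act only on the transition" idea in isolation, here is a plain-Ruby sketch with no Rails or Twilio involved. `UsageTracker` and its `on_status_change` hook are hypothetical names; the hook marks the place where the Twilio status update would go. The per-call rounding mirrors the `.ceil` used in at_max_minutes? from the question.

```ruby
# Plain-Ruby stand-in for the User model's minute tracking.
class UsageTracker
  attr_reader :used_minutes, :status

  def initialize(max_minutes:, on_status_change: ->(_status) {})
    @max_minutes = max_minutes
    @used_minutes = 0
    @status = :active
    @on_status_change = on_status_change
  end

  # Adds one finished call, rounding its duration up to whole minutes.
  def record_call(duration_seconds)
    @used_minutes += (duration_seconds / 60.0).ceil
    refresh_status
  end

  private

  # Fires the hook only when the status actually changes, so the external
  # API is not called again on every later call.
  def refresh_status
    new_status = @used_minutes >= @max_minutes ? :suspended : :active
    return if new_status == @status
    @status = new_status
    @on_status_change.call(new_status)
  end
end

changes = []
tracker = UsageTracker.new(max_minutes: 3, on_status_change: ->(s) { changes << s })
tracker.record_call(61) # counts as 2 minutes
tracker.record_call(30) # counts as 1 minute, reaches the limit
tracker.record_call(10) # already suspended, hook is not fired again
p tracker.used_minutes, tracker.status, changes
```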
You're going to have to do some sort of scheduled background job for this check if you want it to be "active all the time". I'd recommend resque with resque-scheduler, which is a pretty good scheduling solution for Rails. Basically, what you need to do is make a job that executes the second block of code you specified, and have it run at a regular interval (maybe every 2 hours).

How to test the number of database calls in Rails

I am creating a REST API in rails. I'm using RSpec. I'd like to minimize the number of database calls, so I would like to add an automatic test that verifies the number of database calls being executed as part of a certain action.
Is there a simple way to add that to my test?
What I'm looking for is some way to monitor/record the calls that are being made to the database as a result of a single API call.
If this can't be done with RSpec but can be done with some other testing tool, that's also great.
The easiest thing in Rails 3 is probably to hook into the notifications api.
This subscriber
class SqlCounter < ActiveSupport::LogSubscriber
  def self.count=(value)
    Thread.current['query_count'] = value
  end

  def self.count
    Thread.current['query_count'] || 0
  end

  def self.reset_count
    result, self.count = self.count, 0
    result
  end

  def sql(event)
    self.class.count += 1
    puts "logged #{event.payload[:sql]}"
  end
end
SqlCounter.attach_to :active_record
will print every executed sql statement to the console and count them. You could then write specs such as
expect do
  # do stuff
end.to change(SqlCounter, :count).by(2)
You'll probably want to filter out some statements, such as ones starting/committing transactions or the ones active record emits to determine the structures of tables.
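One way to do that filtering is a small predicate that the subscriber's sql method could consult before incrementing the count. The sketch below is plain Ruby; `countable_query?` and `IGNORED_SQL` are made-up names, and the list of ignored statements is an assumption to adjust for your database. It relies on Active Record labelling its schema lookups with the payload name "SCHEMA".

```ruby
# Transaction control statements that should not count as "real" queries.
IGNORED_SQL = /\A\s*(BEGIN|COMMIT|ROLLBACK|SAVEPOINT|RELEASE SAVEPOINT)\b/i

# `payload` mirrors the hash available as event.payload inside sql(event).
def countable_query?(payload)
  return false if payload[:name] == 'SCHEMA' # table structure lookups
  return false if payload[:sql] =~ IGNORED_SQL
  true
end

puts countable_query?({ name: 'User Load', sql: 'SELECT * FROM users' })
puts countable_query?({ name: nil, sql: 'BEGIN' })
```

Inside the subscriber, the increment would then become conditional, for example `self.class.count += 1 if countable_query?(event.payload)`.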
You may be interested in using explain. But that won't be automatic. You will need to analyse each action manually. But maybe that is a good thing, since the important thing is not the number of db calls, but their nature. For example: Are they using indexes?
Check this:
http://weblog.rubyonrails.org/2011/12/6/what-s-new-in-edge-rails-explain/
Use the db-query-matchers gem.
expect { subject.make_one_query }.to make_database_queries(count: 1)
Fredrick's answer worked great for me, but in my case, I also wanted to know the number of calls for each ActiveRecord class individually. I made some modifications and ended up with this in case it's useful for others.
class SqlCounter < ActiveSupport::LogSubscriber
  # Returns the number of database "Loads" for a given ActiveRecord class.
  def self.count(clazz)
    name = clazz.name + ' Load'
    Thread.current['log'] ||= {}
    Thread.current['log'][name] || 0
  end

  # Returns a list of ActiveRecord classes that were counted.
  def self.counted_classes
    log = Thread.current['log']
    loads = log.keys.select {|key| key =~ /Load$/ }
    loads.map { |key| Object.const_get(key.split.first) }
  end

  def self.reset_count
    Thread.current['log'] = {}
  end

  def sql(event)
    name = event.payload[:name]
    Thread.current['log'] ||= {}
    Thread.current['log'][name] ||= 0
    Thread.current['log'][name] += 1
  end
end
SqlCounter.attach_to :active_record
# count now takes a class, so pass the model whose loads you expect (User is a placeholder)
expect do
  # do stuff
end.to change { SqlCounter.count(User) }.by(2)
