How to handle external service failure in Open-Uri?

In my Rails app I am trying to fetch a number of currency exchange rates from an external service and store them in the cache:
require 'open-uri'

module ExchangeRate
  def self.all
    Rails.cache.fetch("exchange_rates", :expires_in => 24.hours) { load_all }
  end

  private

  def self.load_all
    hashes = {}
    CURRENCIES.each do |currency|
      begin
        hash = JSON.parse(open(URI("http://api.fixer.io/latest?base=#{currency}")).read) # what if not available?
        hashes[currency] = hash["rates"]
      rescue Timeout::Error
        puts "Timeout"
      rescue OpenURI::Error => e
        puts e.message
      end
    end
    hashes
  end
end
This works great in development but I am worried about the production environment. How can I prevent the whole thing from being cached if the external service is not available? How can I ensure ExchangeRate.all always contains data, even if it's old and can't be updated due to an external service failure?
I tried to add some basic error handling but I'm afraid it's not enough.

If you're worried about your external service not being reliable enough to keep up with caching every 24 hours, then you should disable the automatic cache expiration, let users work with old data, and set up some kind of notification system to tell you when load_all fails.
Here's what I'd do:
Make ExchangeRate.all always return the cached copy, with no expiration (Rails.cache.fetch without a block returns nil if nothing is cached):
module ExchangeRate
  def self.all
    rates = Rails.cache.fetch("exchange_rates")
    UpdateCurrenciesJob.perform_later if rates.nil?
    rates
  end
end
Create an ActiveJob that handles the updates on a regular basis:
class UpdateCurrenciesJob < ApplicationJob
  queue_as :default

  def perform(*_args)
    # Start from the previously cached rates, so a failed fetch keeps the old
    # data (and its updated_at timestamp) instead of silently dropping the currency.
    hashes = Rails.cache.read('exchange_rates') || {}
    CURRENCIES.each do |currency|
      begin
        hash = JSON.parse(open(URI("http://api.fixer.io/latest?base=#{currency}")).read)
        hashes[currency] = hash['rates'].merge('updated_at' => Time.current)
      rescue Timeout::Error
        puts 'Timeout'
      rescue OpenURI::Error => e
        puts e.message
      end

      if hashes[currency].blank? || hashes[currency]['updated_at'] < Time.current - 24.hours
        # send a mail saying "this currency hasn't been updated"
      end
    end
    Rails.cache.write('exchange_rates', hashes)
  end
end
Set the job up to run every few hours (4, 8, 12, anything less than 24). This way the currencies load in the background, the clients always have data, and you will always know when a currency isn't updating.
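For example, here is a minimal scheduling sketch using the whenever gem (an assumption on my part; any scheduler such as sidekiq-cron or clockwork would work just as well) in config/schedule.rb:

every 4.hours do
  runner "UpdateCurrenciesJob.perform_later"
end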

Related

How can I prevent many sidekiq jobs from exceeding the API calls limit

I am working on a Ruby on Rails application. We have many Sidekiq workers that can process multiple jobs at a time. Each job makes calls to the Shopify API; the call limit set by Shopify is 2 calls per second. I want to synchronize that, so that only two jobs can call the API in any given second.
The way I'm doing that right now, is like this:
# frozen_string_literal: true

class Synchronizer
  attr_reader :shop_id, :queue_name, :limit, :wait_time

  def initialize(shop_id:, queue_name:, limit: nil, wait_time: 1)
    @shop_id = shop_id
    @queue_name = queue_name.to_s
    @limit = limit
    @wait_time = wait_time
  end

  # This method should be called for each api call
  def synchronize_api_call
    raise "a block is required." unless block_given?
    get_api_call
    time_to_wait = calculate_time_to_wait
    sleep(time_to_wait) unless Rails.env.test? || time_to_wait.zero?
    yield
  ensure
    return_api_call
  end

  def set_api_calls
    redis.del(api_calls_list)
    redis.rpush(api_calls_list, calls_list)
  end

  private

  def get_api_call
    logger.log_message(synchronizer: 'Waiting for api call', color: :yellow)
    @api_call_timestamp = redis.brpop(api_calls_list)[1].to_i
    logger.log_message(synchronizer: 'Got api call.', color: :yellow)
  end

  def return_api_call
    redis_timestamp = redis.time[0]
    redis.rpush(api_calls_list, redis_timestamp)
  ensure
    redis.ltrim(api_calls_list, 0, limit - 1)
  end

  def last_call_timestamp
    @api_call_timestamp
  end

  def calculate_time_to_wait
    current_time = redis.time[0]
    time_passed = current_time - last_call_timestamp.to_i
    time_to_wait = wait_time - time_passed
    time_to_wait > 0 ? time_to_wait : 0
  end

  def reset_api_calls
    redis.multi do |r|
      r.del(api_calls_list)
    end
  end

  def calls_list
    redis_timestamp = redis.time[0]
    limit.times.map do |i|
      redis_timestamp
    end
  end

  def api_calls_list
    @api_calls_list ||= "api-calls:shop:#{shop_id}:list"
  end

  def redis
    Thread.current[:redis] ||= Redis.new(db: $redis_db_number)
  end
end
The way I use it is like this:

synchronizer = Synchronizer.new(shop_id: shop_id, queue_name: 'shopify_queue', limit: 2, wait_time: 1)

# This is called once when the process starts, i.e. it's not called by the jobs
# themselves but by the app from where the process is kicked off.
synchronizer.set_api_calls # populates api_calls_list with 2 timestamps, used to know when the last api call was sent

Then when a job wants to make a call:

synchronizer.synchronize_api_call do
  # make the call
end
The problem
The problem with this is that if for some reason a job fails to return the api_call it took back to the api_calls_list, that job and all the other jobs get stuck forever, or until we notice and call set_api_calls again. The problem doesn't affect only that particular shop but the other shops as well, because the Sidekiq workers are shared between all the shops using our app. It sometimes happens that we don't notice until a user calls us, and we find that something that should finish in a few minutes has been stuck for many hours.
The Question
I recently realised that Redis is not the best tool for shared locking. So I am asking: is there any other good tool for this job? If not in the Ruby world, I'd like to learn from others as well. I'm interested in the techniques as well as the tools, so every bit helps.
You may want to restructure your code and create a micro-service to process the API calls, one that uses a local locking mechanism and forces your workers to wait on the socket. It comes with the added complexity of maintaining the micro-service. But if you're in a hurry, then Ent-Rate-Limiting looks cool too.
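If you stay on Redis, one way to remove the stuck-forever failure mode is to hold each slot as a key with a TTL instead of popping from a list, so a slot taken by a crashed job simply expires. A minimal sketch of that idea; the key names, TTL, and helper are mine, not from the question:

require 'redis'
require 'securerandom'

# Try to take one of `limit` slots for this shop; each slot auto-expires,
# so a job that dies without releasing it only blocks others for `ttl` seconds.
def with_api_slot(redis, shop_id, limit: 2, ttl: 5)
  loop do
    limit.times do |slot|
      key = "api-slot:#{shop_id}:#{slot}"
      token = SecureRandom.hex(8)
      # SET key value NX EX: acquires the slot only if it is currently free
      if redis.set(key, token, nx: true, ex: ttl)
        begin
          return yield
        ensure
          # Release only if we still own the slot (it may already have expired)
          redis.del(key) if redis.get(key) == token
        end
      end
    end
    sleep 0.1 # all slots busy; retry shortly
  end
end

The get-then-del release is not atomic; a production version would use a Lua script or a library such as redlock-rb. The TTL is what prevents the deadlock, at the price of briefly allowing an extra call if a slot expires while its job is still running.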

How to memoize hash across requests?

In my Rails application I have a very expensive function that fetches a bunch of conversion rates from an external service once per day:
require 'open-uri'

module Currency
  def self.all
    @all ||= fetch_all
  end

  def self.get_rate(from_curr = "EUR", to_curr = "USD")
    all[from_curr][to_curr]
  end

  private

  def self.fetch_all
    hashes = {}
    CURRENCIES.keys.each do |currency|
      hash = JSON.parse(open(URI("http://api.fixer.io/latest?base=#{currency}")).read)
      hashes[currency] = hash["rates"]
    end
    hashes
  end
end
Is there a way to store the result of this function (a hash) to speed things up? Right now I am storing it in an instance variable @all, which speeds it up a little; however, it is not persisted across requests. How can I keep it across requests?
Create a file, let's say currency_rates.rb, in config/initializers with the following code:
require 'open-uri'

hashes = {}
CURRENCIES.keys.each do |currency|
  hashes[currency] = JSON.parse(open(URI("http://api.fixer.io/latest?base=#{currency}")).read)["rates"]
end
CURRENCY_RATES = hashes
Then write the following rake task and schedule it to run daily:
task update_currency_rates: :environment do
  require 'open-uri'
  hashes = {}
  CURRENCIES.keys.each do |currency|
    hashes[currency] = JSON.parse(open(URI("http://api.fixer.io/latest?base=#{currency}")).read)["rates"]
  end
  Object.const_set('CURRENCY_RATES', hashes)
end
The only drawback is that it will also run every time you deploy a new version of your app or restart it. You can go with that if you are OK with it.
You can avoid it if you use caching, like MemCachier or something; then you can do something like:
def currency_rates
  Rails.cache.fetch('currency_rates', expires_in: 24.hours) do
    # write the above code in some method and call it here; the returned hash will be cached
  end
end
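For concreteness, here is one way the block could be filled in, reusing the loop from the initializer above (fetch_all_rates is a name introduced here for illustration):

require 'open-uri'

def fetch_all_rates
  CURRENCIES.keys.each_with_object({}) do |currency, hashes|
    hashes[currency] = JSON.parse(open(URI("http://api.fixer.io/latest?base=#{currency}")).read)["rates"]
  end
end

def currency_rates
  Rails.cache.fetch('currency_rates', expires_in: 24.hours) { fetch_all_rates }
end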
I would initialize the hash lazily, like this:
require 'open-uri'
require 'json'

module Currency
  def self.get_rate(from_curr = "EUR", to_curr = "USD")
    @memoized_result ||= {}
    @memoized_result.fetch(from_curr) do |not_found_key|
      data = JSON.parse(open(URI("http://api.fixer.io/latest?base=#{not_found_key}")).read)
      @memoized_result[not_found_key] = data["rates"]
    end[to_curr]
  end
end
I think you don't need all the exchange rates all the time, so you can speed things up by fetching only the required one at a time. Over time you keep all rates in memory.
This is persisted between requests, with some edge cases: it depends on your server. For instance, Unicorn uses multiple processes, and every process has its own @memoized_result variable, which needs to be filled.
If you want to share this data between multiple processes or servers, then you need a storage for the fetched data which can be shared between multiple processes.
If you need a time to live for your entries, then I would tweak @Md. Farhan Memon's Rails cache hint like this:
def get_rate(from_curr = "EUR", to_curr = "USD")
  Rails.cache.fetch("currency_rates_#{from_curr}_#{to_curr}", expires_in: 24.hours) do
    data = JSON.parse(open(URI("http://api.fixer.io/latest?base=#{from_curr}")).read)
    data["rates"][to_curr]
  end
end
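Usage would then look like this, assuming the method replaces Currency.get_rate from the question; each currency pair gets its own cache entry:

Currency.get_rate("EUR", "USD") # first call hits the API; later calls read the 24-hour cache
Currency.get_rate("EUR", "GBP") # different pair, different cache key, separate API call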

How can I avoid deadlocks on my database when using ActiveJob in Rails?

I haven't had much experience with deadlocking issues in the past, but the more I work with ActiveJob and process those jobs concurrently, the more I run into this problem. An example of one job that is creating it is shown below. The way it operates: I start ImportGameParticipationsJob, and it queues up a bunch of CreateOrUpdateGameParticipationJobs.
In trying to stop my SQL Server from alerting me to a ton of deadlock errors, where is the cause likely to be below? Can I get a deadlock from simply selecting records to populate an object? Or can it really only happen when I'm attempting to save/update the record within my process_records method below?
ImportGameParticipationsJob
class ImportGameParticipationsJob < ActiveJob::Base
  queue_as :default

  def perform(*args)
    import_participations(args.first.presence)
  end

  def import_participations(*args)
    games = Game.where(season: 2016)
    games.each do |extract_record|
      CreateOrUpdateGameParticipationJob.perform_later(extract_record.game_key)
    end
  end
end
CreateOrUpdateGameParticipationJob
class CreateOrUpdateGameParticipationJob < ActiveJob::Base
  queue_as :import_queue

  def perform(*args)
    if args.first.present?
      game_key = args.first
      # get all participations for a given game
      game_participations = GameRoster.where(game_key: game_key)
      process_records(game_participations)
    end
  end

  def process_records(participations)
    # Loop through participations and build each record for saving
    participations.each do |participation|
      if participation.try(:player_id)
        record = create_or_find(participation)
        record = update_record(record, participation)
      end
      begin
        if record.valid?
          record.save
        else
          # handle invalid record
        end
      rescue Exception => e
        # errors are swallowed here
      end
    end
  end

  def create_or_find(participation)
    participation_record = GameParticipation.where(
        game_id: participation.game.try(:id),
        player_id: participation.player.try(:id))
      .first_or_initialize do |record|
        record.game = Game.find_by(game_key: participation.game_key)
        record.player = Player.find_by(id: participation.player_id)
        record.club = Club.find_by(club_id: participation.club_id)
        record.status = parse_status(participation.player_status)
      end
    return participation_record
  end

  def update_record(record, participation)
    old_status = record.status
    new_status = parse_status(participation.player_status)
    if old_status != new_status
      record.new_status = participation.player_status
      record.comment = "status was updated via participations import job"
    end
    return record
  end
end
They recently updated delayed_job_active_record and added an option that should help with the deadlocking. I had the same issue on 4.1, and moving to 4.1.1 fixed it for me.
https://github.com/collectiveidea/delayed_job_active_record
https://rubygems.org/gems/delayed_job_active_record
Problems locking jobs
You can try using the legacy locking code. It is usually slower but works better for certain people.
Delayed::Backend::ActiveRecord.configuration.reserve_sql_strategy = :default_sql
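This setting typically lives in an initializer; a minimal sketch (the file name is a convention, not required by the gem):

# config/initializers/delayed_job.rb
Delayed::Backend::ActiveRecord.configuration.reserve_sql_strategy = :default_sql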

Speeding up Ruby code to make faster/more API calls

I have the following code:
list_entities = [
  { :phone => '0000000000', :name => 'Test', :"#i:type" => '1' },
  { :phone => '1111111111', :name => 'Demo', :"#i:type" => '1' }
]
list_entities.each do |list_entity|
  phone_contact = PhoneContact.create(list_entity.except(:"#i:type"))
  add_record_response = api.add_record_to_list(phone_contact, "API Test")
  if add_record_response[:add_record_to_list_response][:return][:list_records_inserted] != '0'
    phone_contact.update(:loaded_at => Time.now)
  end
end
This code is taking an array of hashes and creating a new phone_contact for each one. It then makes an api call (add_record_response) to do something with that phone_contact. If that api call is successful, it updates the loaded_at attribute for that specific phone_contact. Then it starts the loop over.
I am allowed something like 7200 API calls per hour with this service. However, I'm only able to make about 1 API call every 4 seconds right now.
Any thoughts on how I could speed this code block up to make faster api calls?
I would suggest using a thread pool. You can define a unit of work to be done and the number of threads you want to process the work on. This way you can get around the bottleneck of waiting for the server to respond to each request. Maybe try something like the following (disclaimer: this was adapted from http://burgestrand.se/code/ruby-thread-pool/):
require 'thread'

class Pool
  def initialize(size)
    @size = size
    @jobs = Queue.new
    @pool = Array.new(@size) do |i|
      Thread.new do
        Thread.current[:id] = i
        catch(:exit) do
          loop do
            job, args = @jobs.pop
            job.call(*args)
          end
        end
      end
    end
  end

  def schedule(*args, &block)
    @jobs << [block, args]
  end

  def shutdown
    @size.times do
      schedule { throw :exit }
    end
    @pool.map(&:join)
  end
end

p = Pool.new(4)

list_entities.each do |list_entity|
  p.schedule do
    phone_contact = PhoneContact.create(list_entity.except(:"#i:type"))
    add_record_response = api.add_record_to_list(phone_contact, "API Test")
    if add_record_response[:add_record_to_list_response][:return][:list_records_inserted] != '0'
      phone_contact.update(:loaded_at => Time.now)
    end
    puts "Job finished by thread #{Thread.current[:id]}"
  end
end

at_exit { p.shutdown }

Rails cache fetch with failover

We use this to get a value from an external API:
def get_value
  Rails.cache.fetch "some_key", expires_in: 15.seconds do
    # hit some external API
  end
end
But sometimes the external API goes down and when we try to hit it, it raises exceptions.
To fix this we'd like to:
try updating it every 15 seconds
but if it goes offline, use the old value for up to 5 minutes, retrying every 15 seconds or so
if it's stale for more than 5 minutes, only then start raising exceptions
Is there a convenient wrapper/library for this or what would be a good solution? We could code up something custom, but it seems like a common enough use case there should be something battle tested out there. Thanks!
I didn't end up finding any good solutions, so I ended up using this:
# This helper is useful for caching a response from an API where the API is unreliable.
# It will try to refresh the value every :expires_in seconds, but if the block throws
# an exception it will use the old value for up to :fail_in seconds before actually
# raising the exception.
def cache_with_failover(key, options = nil)
  key_fail = "#{key}_fail"
  options ||= {}
  options[:expires_in] ||= 15.seconds
  options[:fail_in] ||= 5.minutes

  val = Rails.cache.read(key)
  return val if val

  begin
    val = yield
    Rails.cache.write(key, val, expires_in: options[:expires_in])
    Rails.cache.write(key_fail, val, expires_in: options[:fail_in])
    return val
  rescue Exception => e
    val = Rails.cache.read(key_fail)
    return val if val
    raise e
  end
end
# Demo
fail_at = 10.seconds.from_now
a = cache_with_failover('test', expires_in: 5.seconds, fail_in: 10.seconds) do
  if Time.now < fail_at
    Time.now
  else
    p 'failed'
    raise 'a'
  end
end
An even better solution would probably be to exponentially back off retries after the first failure. As currently written, it will pummel the API with retries (in the yield) after the first failure.
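A minimal sketch of that idea, building on cache_with_failover above; the backoff key and the :max_backoff option are additions of mine and not battle tested:

def cache_with_failover_and_backoff(key, options = {})
  options[:expires_in]  ||= 15.seconds
  options[:fail_in]     ||= 5.minutes
  options[:max_backoff] ||= 2.minutes
  key_fail = "#{key}_fail"
  key_backoff = "#{key}_backoff"

  val = Rails.cache.read(key)
  return val if val

  # If a recent failure set a backoff window, serve the stale copy without
  # touching the API until the window has passed.
  backoff = Rails.cache.read(key_backoff)
  if backoff && Time.now < backoff[:until]
    stale = Rails.cache.read(key_fail)
    return stale if stale
  end

  begin
    val = yield
    Rails.cache.delete(key_backoff)
    Rails.cache.write(key, val, expires_in: options[:expires_in])
    Rails.cache.write(key_fail, val, expires_in: options[:fail_in])
    val
  rescue Exception => e
    # Double the wait after each consecutive failure, capped at :max_backoff.
    delay = backoff ? [backoff[:delay] * 2, options[:max_backoff]].min : options[:expires_in]
    Rails.cache.write(key_backoff, { delay: delay, until: Time.now + delay },
                      expires_in: options[:fail_in])
    stale = Rails.cache.read(key_fail)
    stale || raise(e)
  end
end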
