ActiveRecord exception caught after being eaten - ruby-on-rails

I've got a GCP Pub/Sub listener that does some work and then saves a receipt via ActiveRecord. I don't want to do that work if the DB connection is down, so I've added a pre-flight check. The pre-flight check tests the DB connection and, if that fails, eats the error and raises a RuntimeError. The DB is flighty, though, so to cover the scenario where the pre-flight succeeds but the DB connection dies while the work is being done, the caller rescues ActiveRecord::ActiveRecordError and PG::Error, letting us log that the work was done but the receipt couldn't be persisted. It's more important that this work not be duplicated than that the receipt be persisted, so the RuntimeError isn't caught (causing a retry), but the DB errors are. It looks like this (snipped significantly):
# Service
def process
  begin
    WorkReceipt.do_work
  rescue ActiveRecord::ActiveRecordError, PG::Error
    Rails.logger.error("Work was done successfully, but not persisted")
  end
end

# Model
class WorkReceipt < ActiveRecord::Base
  def self.do_work
    if !ActiveRecord::Base.connection.active?
      Rails.logger.error("DB connection is inactive. Reconnecting...")
      begin
        ActiveRecord::Base.connection.reconnect!
      rescue => e
        Rails.logger.error("Could not reestablish connection: #{e}")
        raise "Could not connect to database"
      end
    end
    # Lots of hard work
    self.create!(
      # Some args
    )
  end
end
Where things get weird: while testing this, I brought down the DB and fired off 4 of these tasks. The first one is handled correctly ("Could not reestablish connection: server closed the connection unexpectedly"), but the other 3 log "DB connection is inactive. Reconnecting..." (good) followed by "Work was done successfully, but not persisted" (what?!). Even weirder, the work has logging and side effects which I don't see happening. The pre-flight appears to correctly prevent the work from being done, but the database error is showing up in the outer rescue, preventing the retry and making me sad. There is no database access other than the create at the end.
What is going on here? Why does it seem like the database error is skipping past the inner rescue to be caught by the outer one?

Maybe I don't understand how Ruby works, but changing raise "Could not connect to database" to raise RuntimeError.new "Could not connect to database" fixes the problem. I was under the impression that providing a message to raise caused it to emit a RuntimeError without needing to be explicit about it, but here we are.
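For what it's worth, the impression about raise holds in plain Ruby: Kernel#raise called with only a String raises a RuntimeError, and when it is called inside a rescue block the rescued exception is attached as the new exception's cause. A minimal check, runnable without Rails, where FakeDbError is a hypothetical stand-in for the database error classes:

```ruby
# FakeDbError is a hypothetical stand-in for PG::Error / ActiveRecord errors.
class FakeDbError < StandardError; end

def preflight
  raise FakeDbError, "server closed the connection unexpectedly"
rescue => e
  # Same shape as the pre-flight in the question: eat the error, raise a String.
  raise "Could not connect to database"
end

def caller_outcome
  preflight
  :ok
rescue FakeDbError
  :caught_as_db_error
rescue RuntimeError => e
  # e.cause is the original FakeDbError, but rescue clauses match on the
  # exception's own class, not on its cause.
  [:caught_as_runtime_error, e.cause.class]
end

p caller_outcome
```

In this sketch the string form is caught as a RuntimeError and never as the stand-in DB error, which suggests the difference observed in the app comes from something this isolated example does not reproduce.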

Related

How can I prevent any ActiveRecord::PreparedStatementCacheExpired errors immediately after running `rake db:migrate`?

I am working on a Rails 5.x application, and I use Postgres as my database.
I often run rake db:migrate on my production servers. Sometimes the migration will add a new column to the database, and this causes some controller actions to crash with the following error:
ActiveRecord::PreparedStatementCacheExpired: ERROR: cached plan must not change result type
This is happening in a critical controller action that needs to have zero downtime, so I need to find a way to prevent this crash from ever happening.
Should I catch the ActiveRecord::PreparedStatementCacheExpired error and retry the save? Or should I add some locking to this particular controller action, so that I don't start serving any new requests while a database migration is running?
What would be the best way to prevent this crash from ever happening again?
I was able to fix this issue in some places by using this retry_on_expired_cache helper:
class ApplicationRecord < ActiveRecord::Base
  self.abstract_class = true

  class << self
    # Retry automatically on ActiveRecord::PreparedStatementCacheExpired.
    # (Do not use this for transactions with side-effects unless it is acceptable
    # for these side-effects to occasionally happen twice.)
    def retry_on_expired_cache(*_args)
      retried ||= false
      yield
    rescue ActiveRecord::PreparedStatementCacheExpired
      raise if retried
      retried = true
      retry
    end
  end
end
I would use it like this:
MyModel.retry_on_expired_cache do
  @my_model.save
end
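To see the retry-once behaviour in isolation, here is a minimal sketch that runs without Rails. ExpiredCacheError and retry_once_on_expired_cache are hypothetical names standing in for ActiveRecord::PreparedStatementCacheExpired and the helper above; the control flow is the same.

```ruby
# ExpiredCacheError is a hypothetical stand-in for
# ActiveRecord::PreparedStatementCacheExpired so this runs without Rails.
class ExpiredCacheError < StandardError; end

def retry_once_on_expired_cache
  retried ||= false
  yield
rescue ExpiredCacheError
  raise if retried   # second failure: give up and re-raise
  retried = true
  retry              # re-runs the method body, so the block is called again
end

attempts = 0
result = retry_once_on_expired_cache do
  attempts += 1
  raise ExpiredCacheError, "cached plan must not change result type" if attempts == 1
  :saved
end

p [result, attempts]
```

The block runs at most twice: a first failure is retried, and a second failure propagates to the caller, which is why the block must be safe to execute twice.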
Unfortunately this was like playing "whack-a-mole", because this crash just kept happening all over my application during my rolling deploys (I'm not able to restart all the Rails processes at the same time.)
I finally learned that I can turn off prepared_statements to completely avoid this issue. (See this other question and answers on StackOverflow.)
I was worried about the performance penalty, but I found many reports from people who had set prepared_statements: false, and they hadn't noticed any problems. e.g. https://news.ycombinator.com/item?id=7264171
I created a file at config/initializers/disable_prepared_statements.rb:
db_configuration = ActiveRecord::Base.configurations[Rails.env]
db_configuration.merge!('prepared_statements' => false)
ActiveRecord::Base.establish_connection(db_configuration)
This allows me to continue setting the database configuration from the DATABASE_URL env variable, and 'prepared_statements' => false will be injected into the configuration.
This completely solves the ActiveRecord::PreparedStatementCacheExpired errors and makes it much easier to achieve high-availability for my service while still being able to modify the database.

Sidekiq Active Job database rollback on error

I'm noticing that when a Sidekiq / Active Job fails due to an error being thrown, any database changes that occurred during the job are rolled back. This seems to be an intentional feature to make jobs idempotent.
My problem is that the method run by the job can send emails to users and it uses database modifications to prevent re-sending emails. If the database change is rolled back, then the email will be resent whenever the job is retried.
Here's roughly what my job looks like:
class ProcessPaymentsJob < ApplicationJob
  queue_as :default

  def perform(*args)
    begin
      # This can send emails to users.
      PaymentProcessor.perform
    rescue StandardError => error
      puts 'PaymentsJob failed, ignoring'
      puts error
    end
  end
end
The job is scheduled to run periodically using sidekiq-scheduler. I'm using rails-api v5.
I've added a rescue to try to prevent the job from rolling back the database changes but it still happens.
It occurred to me that maybe this isn't a Sidekiq issue at all but a feature of Rails.
What's the best solution here to prevent spamming the user with emails?
It sounds like your background job is doing too much. If sending the email has no bearing on whether the job was successful or not you should break the job into two jobs: one to send the email and another to do the other bit of processing work.
Alternatively, you could use Sidekiq Batches and make the first job above dependent on the second executing successfully.
Happy Sidekiq’ing!
You could wrap the database changes in a transaction inside of the PaymentProcessor, rescue the database rollback, and only send the email if the transaction succeeds. Sort of like this:
# ../payment_processor.rb
def perform
  ActiveRecord::Base.transaction do
    # AllTheThings.save!
  end
rescue ActiveRecord::RecordInvalid => exception
  # if things fail to save, handle the exception however you like
else
  # if no exception is raised, send your email
end
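The rescue/else flow in that answer can be tried without Rails. In the sketch below, SaveFailed and fake_transaction are hypothetical stand-ins for ActiveRecord::RecordInvalid and ActiveRecord::Base.transaction, and the sent_emails array stands in for the mailer; only the control flow is meant to carry over.

```ruby
# Hypothetical stand-ins so the control flow can be run without Rails:
# SaveFailed plays the role of ActiveRecord::RecordInvalid, and
# fake_transaction the role of ActiveRecord::Base.transaction.
class SaveFailed < StandardError; end

def fake_transaction
  yield
end

def perform(save_ok:, sent_emails:)
  fake_transaction do
    raise SaveFailed, "validation failed" unless save_ok
  end
rescue SaveFailed
  :rolled_back               # nothing is sent when the save fails
else
  sent_emails << :receipt    # reached only if the block raised nothing
  :committed
end

emails = []
p perform(save_ok: false, sent_emails: emails)
p perform(save_ok: true, sent_emails: emails)
p emails
```

The else branch runs only when the transaction block finishes without raising, so the email is tied to a successful save rather than to the job as a whole.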

Sidekiq - Only handle error after x retries?

I'm using sidekiq to process thousands of jobs per hour - all of which ping an external API (Google). One out of X thousand requests will return an unexpected (or empty) result. As far as I can tell, this is unavoidable when dealing with an external API.
Currently, when I encounter such a response, I raise an exception so that the retry logic will automatically take care of it on the next try. Something is only really wrong when the same job fails over and over, many times. Exceptions are handled by Airbrake.
However, my Airbrake gets clogged up with these mini-outages that aren't really 'issues'. I'd like Airbrake to only be notified of these issues if the same job has already failed X times.
Is it possible to either:
Disable the automated Airbrake integration so that I can use sidekiq_retries_exhausted to report the error manually via Airbrake.notify?
Rescue the error somehow so it doesn't notify Airbrake, but keep retrying it?
Do this in a different way that I'm not thinking of?
Here's my code outline
class GoogleApiWorker
  include Sidekiq::Worker
  sidekiq_options queue: :critical, backtrace: 5

  def perform
    # Do stuff interacting with the google API
  rescue Exception => e
    if is_a_mini_google_outage? e
      # How do i make it so this harmless error DOES NOT get reported to Airbrake but still gets retried?
      raise e
    end
  end

  def is_a_mini_google_outage? e
    # check to see if this is a harmless outage
  end
end
As far as I know, Sidekiq has API classes for retries and jobs. You can find your current job through its arguments (comparing them cannot be effective) or through its jid (in which case you'd need to record the jid somewhere), check the number of retries, and then decide whether or not to notify Airbrake.
https://github.com/mperham/sidekiq/wiki/API
https://github.com/mperham/sidekiq/blob/master/lib/sidekiq/api.rb
(I'm not able to give more detail than that.)
If you're looking for a Sidekiq solution: https://blog.eq8.eu/til/retry-active-job-sidekiq-when-exception.html
If you're more interested in configuring Airbrake so you don't get these errors until a certain retry, check Airbrake::Sidekiq::RetryableJobsFilter:
https://github.com/airbrake/airbrake#airbrakesidekiqretryablejobsfilter
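The "only report after X failures" idea boils down to a small predicate over the job payload. The sketch below is plain Ruby, not Sidekiq or Airbrake code: report_failure?, failures_so_far and REPORT_AFTER are hypothetical names, and it assumes the payload hash has a "retry_count" key that is absent on the first run and starts at 0 once the job has failed once. That matches my understanding of Sidekiq's retry bookkeeping, but it should be checked against the version in use.

```ruby
REPORT_AFTER = 3 # hypothetical threshold: report from the 3rd failure onward

# job is a Hash shaped like a Sidekiq job payload (an assumption, see above).
# Returns how many times the job had already failed before this attempt.
def failures_so_far(job)
  count = job["retry_count"]
  count.nil? ? 0 : count + 1
end

# Meant to be called when the current attempt has just failed.
def report_failure?(job, threshold: REPORT_AFTER)
  failures_so_far(job) + 1 >= threshold
end

[nil, 0, 1, 2].each do |retry_count|
  job = retry_count.nil? ? {} : { "retry_count" => retry_count }
  puts "retry_count=#{retry_count.inspect} report=#{report_failure?(job)}"
end
```

A predicate like this would be consulted wherever the error is handed to Airbrake; how the payload is reached there depends on the Sidekiq and Airbrake versions, which is what the links above cover.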

Dealing with hotlinking and old references after rebuild in rails

I just launched a website that was completely rebuilt in Rails, and I'm using New Relic for error monitoring. I've been getting a lot of errors and alerts for what I'm guessing is people using bookmarks for pages/paths that no longer exist, and possibly some hotlinking.
What is the best way to resolve this situation so that I stop getting the alerts?
How about ignoring those errors?
ignore_errors - A comma separated list of Exception classes which will
be ignored
In addition, the error collector can be customized programmatically
for more control over filtering. In your application initialization,
you can register a block with the Agent to be called when an error is
detected. The block should return the error to record, or nil if the
error is to be ignored. For example:
config.after_initialize do
  ::NewRelic::Agent.ignore_error_filter do |error|
    if error.message =~ /gateway down/
      nil
    elsif error.class.name == "InvalidLicenseException"
      StandardError.new "We should never see this..."
    else
      error
    end
  end
end
(source)
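Applied to the stale-bookmark case, the filter body only has to return nil for the exception classes that dead URLs raise. The sketch below is plain Ruby so it can be run on its own: build_ignore_filter and StaleLinkError are hypothetical names, and which real classes to list (for example "ActionController::RoutingError" or "ActiveRecord::RecordNotFound") is an assumption to check against the errors New Relic actually reports.

```ruby
# Builds a filter with the same contract as the block quoted above:
# return nil to ignore the error, or the error itself to record it.
def build_ignore_filter(ignored_names)
  lambda do |error|
    ignored_names.include?(error.class.name) ? nil : error
  end
end

# Hypothetical stand-in for whatever the app raises on a dead URL.
class StaleLinkError < StandardError; end

filter = build_ignore_filter(["StaleLinkError"])

p filter.call(StaleLinkError.new("no route matches /old-page"))
p filter.call(ArgumentError.new("a real bug")).class
```

The first call returns nil (ignored) and the second passes the error through, so genuine failures still reach the alerting.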

Releasing a connection in Rails

I'm using Rails 2.3.8.
What's the best way to release a connection on a model to another database?
Let's say I have ModelB.establish_connection("server_b")
Would ModelB.remove_connection do the trick? How would I verify that I've successfully removed the connection?
Looks as though remove_connection is what you're looking for. To verify that you've successfully removed the connection, you could wrap a find method within a rescue block like:
begin
  ModelB.find(1)
rescue ActiveRecord::ConnectionNotEstablished
  # if we're here, then we have no connection, which is good in this case
else
  # if we're here, then we still have a connection, which is bad...
end
