Table name corruption errors in ActiveRecord - ruby-on-rails

Sporadically we get PG::UndefinedTable errors while using ActiveRecord. The association table name is somehow corrupted, and I quite often see Cancelled appended to the end of the table name.
E.g.:
ActiveRecord::StatementInvalid: PG::UndefinedTable: ERROR: relation "fooCancell" does not exist
ActiveRecord::StatementInvalid: PG::UndefinedTable: ERROR: relation "Cancelled" does not exist
ActiveRecord::StatementInvalid: PG::UndefinedTable: ERROR: relation "barC" does not exist
In the example above, I have obfuscated the table name by using foo and bar.
We see these errors when the Rails project is running inside Puma. Queue workers seem to be doing okay.
The tables in the error messages don't correspond to real tables or models. It looks like a case of memory corruption. Has anyone seen such issues? If so, how did you get around it?
puma.rb
on_worker_boot do
  ActiveRecord::Base.establish_connection
end
database.yml
production:
  url: <%= ENV["DATABASE_URL"] %>
  pool: <%= ENV['DB_CONNECTION_POOL_SIZE'] || 5 %>
  reaping_frequency: <%= ENV['DB_CONNECTION_REAPING_FREQUENCY'] || 10 %>
  prepared_statements: false

I'm hazarding a guess here, based on this possibly related error...
But you might be either:
calling fork within your application; OR
calling ActiveRecord routines (issuing database calls) before the server (Puma) forks its worker processes (during app initialization).
Either of these will break ActiveRecord's synchronization and cause multiple processes to share the database connection pool without synchronizing its use (resulting in interleaved and corrupt database commands).
If you are using fork, make sure to close all the ActiveRecord database connections and reinitialize the connection pool (there's a method call that does it, but I don't remember it off the top of my head; maybe ActiveRecord::Base.clear_all_connections! or ActiveRecord::Base.connection_pool.disconnect!).
Otherwise, before Puma forks its workers (either during the initialization process or using Puma's fork hooks), close all the ActiveRecord database connections and reinitialize the connection pool. A minimal sketch follows.
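Something along these lines in puma.rb, for example (a sketch only, assuming cluster mode with preload_app!; adapt to your own config):
workers Integer(ENV.fetch("WEB_CONCURRENCY", 2))
preload_app!

before_fork do
  # Close the master's connections so forked workers don't inherit live sockets.
  ActiveRecord::Base.connection_pool.disconnect!
end

on_worker_boot do
  # Each worker establishes its own fresh connection pool.
  ActiveRecord::Base.establish_connection
end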

It looks like reaping_frequency may be the issue. I found a couple of claims that the reaper may have a threading bug. I would try removing that option, or setting it to nil, and see if that works (see the sketch after the links below). The only other thing I can think of is if you are manually calling Thread.new and using ActiveRecord inside it.
Here are the few claims against reaping:
http://omegadelta.net/2014/03/15/the-rails-grim-reaper/
https://github.com/mperham/sidekiq/issues/1936
Search for "DO fear the Reaper" here:
https://www.google.com/amp/s/bibwild.wordpress.com/2014/07/17/activerecord-concurrency-in-rails4-avoid-leaked-connections/amp/
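If you want to try dropping the reaper, the production block from the question would simply omit that line (a sketch based on the database.yml above):
production:
  url: <%= ENV["DATABASE_URL"] %>
  pool: <%= ENV['DB_CONNECTION_POOL_SIZE'] || 5 %>
  prepared_statements: false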

Related

How can I prevent any ActiveRecord::PreparedStatementCacheExpired errors immediately after running `rake db:migrate`?

I am working on a Rails 5.x application, and I use Postgres as my database.
I often run rake db:migrate on my production servers. Sometimes the migration will add a new column to the database, and this causes some controller actions to crash with the following error:
ActiveRecord::PreparedStatementCacheExpired: ERROR: cached plan must not change result type
This is happening in a critical controller action that needs to have zero downtime, so I need to find a way to prevent this crash from ever happening.
Should I catch the ActiveRecord::PreparedStatementCacheExpired error and retry the save? Or should I add some locking to this particular controller action, so that I don't start serving any new requests while a database migration is running?
What would be the best way to prevent this crash from ever happening again?
I was able to fix this issue in some places by using this retry_on_expired_cache helper:
class ApplicationRecord < ActiveRecord::Base
  self.abstract_class = true

  class << self
    # Retry automatically on ActiveRecord::PreparedStatementCacheExpired.
    # (Do not use this for transactions with side-effects unless it is acceptable
    # for these side-effects to occasionally happen twice.)
    def retry_on_expired_cache(*_args)
      retried ||= false
      yield
    rescue ActiveRecord::PreparedStatementCacheExpired
      raise if retried
      retried = true
      retry
    end
  end
end
I would use it like this:
MyModel.retry_on_expired_cache do
  @my_model.save
end
Unfortunately this was like playing "whack-a-mole", because this crash just kept happening all over my application during my rolling deploys (I'm not able to restart all the Rails processes at the same time.)
I finally learned that I can turn off prepared_statements to completely avoid this issue. (See this other question and answers on StackOverflow.)
I was worried about the performance penalty, but I found many reports from people who had set prepared_statements: false, and they hadn't noticed any problems. e.g. https://news.ycombinator.com/item?id=7264171
I created a file at config/initializers/disable_prepared_statements.rb:
db_configuration = ActiveRecord::Base.configurations[Rails.env]
db_configuration.merge!('prepared_statements' => false)
ActiveRecord::Base.establish_connection(db_configuration)
This allows me to continue setting the database configuration from the DATABASE_URL env variable, and 'prepared_statements' => false will be injected into the configuration.
This completely solves the ActiveRecord::PreparedStatementCacheExpired errors and makes it much easier to achieve high-availability for my service while still being able to modify the database.

RSpec: How to test methods that use Parallel (PG::ConnectionBad error)

In my app I have several Builder classes that are responsible for taking data received from an external API request and building/saving resources to the database. I'm dealing with a large amount of data and have implemented the Parallel gem to speed this up by using multiple processes.
However, I'm finding that any test for a method that uses Parallel fails with the same error:
ActiveRecord::StatementInvalid:
PG::ConnectionBad: PQconsumeInput() server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
Here is an example of the code being tested:
class AirportBuilder < Resource
  def build_from_collection
    Parallel.each(object_producer, in_processes: 24) do |params|
      instance = Airport.find_or_initialize_by(fsid: params[:fs])
      build!(instance, params)
    end
  end
end
I've done some searching on this but all the results in Google have to do with using multiple threads/processes to make the test suite run faster, which is a different problem.
Any ideas on how I can test this effectively without causing the PG error? I realize I may need to stub something out but am not quite sure what to stub and still have a meaningful test.
Thanks in advance to anyone who might be able to help!
Are you using more database connections than are configured for your test database? Maybe try setting the pool size to match the needs of your script (which looks like 24)?
test:
  adapter: whatever
  host: whatever
  username: whatever
  password: whatever
  database: whatever
  pool: 24
Heads up that you may also want to do some math on the default ActiveRecord connection pool. Some good info in this Heroku dev center article.
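Separately, since in_processes forks the test process, a pattern that is often suggested for PG::ConnectionBad with the Parallel gem is to make sure forked workers don't reuse the parent's connection socket. A hedged sketch of that idea (not part of the original answer; names come from the question):
class AirportBuilder < Resource
  def build_from_collection
    # Drop the parent's connections so forked children don't inherit the socket.
    ActiveRecord::Base.connection_pool.disconnect!

    Parallel.each(object_producer, in_processes: 24) do |params|
      # Each forked worker checks out (and lazily opens) its own connection.
      ActiveRecord::Base.connection_pool.with_connection do
        instance = Airport.find_or_initialize_by(fsid: params[:fs])
        build!(instance, params)
      end
    end

    # Reconnect in the parent once the children are done.
    ActiveRecord::Base.establish_connection
  end
end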

Rollback rake db:seed if exception is raised

My seeds file runs through quite a few CSV files, does a few checks, and creates various ActiveRecord records accordingly. While testing all these files, I finally think I have it and run rake db:seed, but if something fails, I want what has been created so far to be rolled back.
Scenario that has already happened:
Seeds file requires 4 different CSVs
Only 3 of the 4 CSVs were uploaded to the staging server
rake db:seed was run and the seeds file blew up halfway through because it couldn't find a file, but over 1000 AR objects were created prior to that.
Ideally I'd like to do something like:
begin
  CSV.readlines(file1)
  CSV.readlines(file2)
  CSV.readlines(file3)
  CSV.readlines(file4)
rescue
  # raise an error
  # rollback all objects created prior to error
end
I suppose I could implement something custom, but I can't find anything in the Rails Guides regarding this.
This is the purpose of Active Record Transactions:
Transactions are protective blocks where SQL statements are only permanent if they can all succeed as one atomic action. The classic example is a transfer between two accounts where you can only have a deposit if the withdrawal succeeded and vice versa. Transactions enforce the integrity of the database and guard the data against program errors or database break-downs. So basically you should use transaction blocks whenever you have a number of statements that must be executed together or not at all.
Try this:
ActiveRecord::Base.transaction do
  ...
end
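Applied to the seeds example above, that might look something like this (a sketch only; the per-row checks and creates are whatever your seeds file already does):
require 'csv'

ActiveRecord::Base.transaction do
  [file1, file2, file3, file4].each do |file|
    CSV.readlines(file).each do |row|
      # ... your existing checks and record creation; use bang methods
      # (create!/save!) so a failure raises and rolls the whole block back.
    end
  end
end
If any of the files is missing or a create! raises, the transaction rolls back every record created inside the block.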

ruby load file async?

I have some weird issues going on (for a very weird use case, as I'll explain). I'm setting up a multi-tenant application using Postgres schemas for data multi-tenancy.
Each company in my system will get its own schema. The way I accomplish this is with an after_commit on the model, on create, that creates a new Postgres schema and then loads schema.rb into it using Ruby's load (copied from the rake db:schema:load code).
You can see the gem here
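Roughly, the pattern being described looks like this (an illustrative sketch only, not the gem's actual code; schema_name is an assumed column on the model):
class Company < ActiveRecord::Base
  after_commit :create_tenant_schema, on: :create

  private

  def create_tenant_schema
    conn = ActiveRecord::Base.connection
    conn.execute(%(CREATE SCHEMA "#{schema_name}"))
    conn.schema_search_path = schema_name
    load Rails.root.join("db", "schema.rb").to_s # mirrors rake db:schema:load
  ensure
    conn.schema_search_path = "public"
  end
end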
Anyway, all this works (in the console). Creating a company creates the new schema and I can switch to it, etc... My problem lies in my integration tests. I have an RSpec test that creates two companies like so:
before do
  @c1 = Factory :company
  @c2 = Factory :company
end
What's odd is that I start to get the logs about the db schema loading, but they're truncated. Almost as if they're happening in parallel. Here's a sample output:
>> create: database: unique_name1
-- create_table("first_table_in_schema_rb", {:force=>true})
>> create: database: unique_name2
create: database is my log line, the -- create_table is from schema.rb itself.
As you can see, the second create: database seems to happen while I'm loading schema.rb from the first company creation.
Does anyone know if load is somehow asynchronous? I know Ruby doesn't have real threads, but could it be using fibers or something? This is really messing me up because when my test comes around, the Postgres schema that was meant to be created doesn't seem to exist.
Rails 3.0.8
Ruby 1.9.2
I'm not 100% sure this is your problem, because I'm sure of what happens with require but not with load. The same thing happened to me once with require: require is not atomic, so loading code from a file with require can cause a race condition. Maybe that is what is happening with load, but I was not able to find any info about whether load is atomic or not.
Never mind... the issue had nothing to do with load. It was the fact that I was already connected to the wrong schema when importing schema.rb.
There was in fact an exception being thrown that was silently caught somewhere.

SQLite3::BusyException

Running a Rails site right now using SQLite3.
About once every 500 requests or so, I get a
ActiveRecord::StatementInvalid (SQLite3::BusyException: database is locked:...
What's the way to fix this that would be minimally invasive to my code?
I'm using SQLite at the moment because you can store the DB in source control, which makes backing up natural, and you can push changes out very quickly. However, it's obviously not really set up for concurrent access. I'll migrate over to MySQL tomorrow morning.
You mentioned that this is a Rails site. Rails allows you to set the SQLite retry timeout in your database.yml config file:
production:
  adapter: sqlite3
  database: db/mysite_prod.sqlite3
  timeout: 10000
The timeout value is specified in milliseconds. Increasing it to 10 or 15 seconds should decrease the number of BusyExceptions you see in your log.
This is just a temporary solution, though. If your site needs true concurrency then you will have to migrate to another db engine.
By default, SQLite returns immediately with a blocked/busy error if the database is busy and locked. You can ask it to wait and keep trying for a while before giving up. This usually fixes the problem, unless you do have thousands of threads accessing your db, in which case I agree SQLite would be inappropriate.
// set SQLite to wait and retry for up to 100ms if database locked
sqlite3_busy_timeout( db, 100 );
All of these things are true, but it doesn't answer the question, which is likely: why does my Rails app occasionally raise a SQLite3::BusyException in production?
@Shalmanese: what is the production hosting environment like? Is it on a shared host? Is the directory that contains the SQLite database on an NFS share? (Likely, on a shared host.)
This problem likely has to do with the phenomenon of file locking on NFS shares and SQLite's lack of concurrency.
If you have this issue but increasing the timeout does not change anything, you might have another concurrency issue with transactions. Here it is in summary:
Begin a transaction (acquires a SHARED lock)
Read some data from the DB (we are still using the SHARED lock)
Meanwhile, another process starts a transaction and writes data (acquiring the RESERVED lock).
Then you try to write; you are now trying to request the RESERVED lock
SQLite raises the SQLITE_BUSY exception immediately (independently of your timeout) because your previous reads may no longer be accurate by the time it can get the RESERVED lock.
One way to fix this is to patch the ActiveRecord SQLite adapter to acquire a RESERVED lock directly at the beginning of the transaction by passing the :immediate option to the driver. This will decrease performance a bit, but at least all your transactions will honor your timeout and occur one after the other. Here is how to do this using prepend (Ruby 2.0+); put this in an initializer:
module SqliteTransactionFix
  def begin_db_transaction
    log('begin immediate transaction', nil) { @connection.transaction(:immediate) }
  end
end

module ActiveRecord
  module ConnectionAdapters
    class SQLiteAdapter < AbstractAdapter
      prepend SqliteTransactionFix
    end
  end
end
Read more here: https://rails.lighthouseapp.com/projects/8994/tickets/5941-sqlite3busyexceptions-are-raised-immediately-in-some-cases-despite-setting-sqlite3_busy_timeout
Just for the record. In one application with Rails 2.3.8 we found out that Rails was ignoring the "timeout" option Rifkin Habsburg suggested.
After some more investigation we found a possibly related bug in Rails dev: http://dev.rubyonrails.org/ticket/8811. And after some more investigation we found the solution (tested with Rails 2.3.8):
Edit this ActiveRecord file: activerecord-2.3.8/lib/active_record/connection_adapters/sqlite_adapter.rb
Replace this:
def begin_db_transaction #:nodoc:
  catch_schema_changes { @connection.transaction }
end
with
def begin_db_transaction #:nodoc:
  catch_schema_changes { @connection.transaction(:immediate) }
end
And that's all! We haven't noticed a performance drop, and now the app supports many more requests without breaking (it waits for the timeout). SQLite is nice!
bundle exec rake db:reset
It worked for me; it will reset the database and show the pending migrations.
SQLite can allow other processes to wait until the current one finishes.
I use this line to connect when I know I may have multiple processes trying to access the SQLite DB:
import sqlite3

conn = sqlite3.connect('filename', isolation_level='exclusive')
According to the Python Sqlite Documentation:
You can control which kind of BEGIN statements pysqlite implicitly executes (or none at all) via the isolation_level parameter to the connect() call, or via the isolation_level property of connections.
I had a similar problem with rake db:migrate. The issue was that the working directory was on an SMB share.
I fixed it by copying the folder over to my local machine.
Most answers are for Rails rather than raw Ruby, and the OP's question IS for Rails, which is fine. :)
So I just want to leave this solution here, should any raw Ruby user have this problem and not be using a YAML configuration.
After instantiating the connection, you can set it like this:
db = SQLite3::Database.new "#{path_to_your_db}/your_file.db"
db.busy_timeout = 15_000 # in ms, meaning it will retry for 15 seconds before it raises an exception.
                         # This can be any number you want. Default value is 0.
Source: this link
(This example uses a Lua SQLite binding rather than Ruby; the idea is a custom busy handler that retries a fixed number of times.)
-- Open the database
db = sqlite3.open("filename")
-- Ten attempts are made to proceed, if the database is locked
function my_busy_handler(attempts_made)
  if attempts_made < 10 then
    return true
  else
    return false
  end
end
-- Set the new busy handler
db:set_busy_handler(my_busy_handler)
-- Use the database
db:exec(...)
What table is being accessed when the lock is encountered?
Do you have long-running transactions?
Can you figure out which requests were still being processed when the lock was encountered?
Argh - the bane of my existence over the last week. SQLite3 locks the db file when any process writes to the database, i.e. any UPDATE/INSERT type query (also SELECT COUNT(*) for some reason). However, it handles multiple reads just fine.
So, I finally got frustrated enough to write my own thread locking code around the database calls. By ensuring that the application can only have one thread writing to the database at any point, I was able to scale to 1000s of threads. The sketch below shows the general idea.
And yeah, it's slow as hell. But it's also fast enough and correct, which is a nice property to have.
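A minimal sketch of that kind of write lock (illustrative only, not the answerer's actual code; assumes a single process, and Post is a stand-in model):
require "thread"

# Serialize all writes through one global mutex so only one thread
# issues an INSERT/UPDATE at a time.
DB_WRITE_LOCK = Mutex.new

def with_db_write_lock
  DB_WRITE_LOCK.synchronize { yield }
end

# Usage:
with_db_write_lock { Post.create!(title: "hello") }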
I found a deadlock in the sqlite3 ruby extension and fixed it here; have a go with it and see if this fixes your problem:
https://github.com/dxj19831029/sqlite3-ruby
I opened a pull request, but there has been no response from them.
Anyway, some busy exceptions are expected, as described by SQLite itself.
Be aware of this condition, described in the SQLite docs on the busy handler:
The presence of a busy handler does not guarantee that it will be invoked when there is lock contention. If SQLite determines that invoking the busy handler could result in a deadlock, it will go ahead and return SQLITE_BUSY or SQLITE_IOERR_BLOCKED instead of invoking the busy handler. Consider a scenario where one process is holding a read lock that it is trying to promote to a reserved lock and a second process is holding a reserved lock that it is trying to promote to an exclusive lock. The first process cannot proceed because it is blocked by the second and the second process cannot proceed because it is blocked by the first. If both processes invoke the busy handlers, neither will make any progress. Therefore, SQLite returns SQLITE_BUSY for the first process, hoping that this will induce the first process to release its read lock and allow the second process to proceed.
If you meet this condition, the timeout is no longer effective. To avoid it, don't put SELECTs inside BEGIN/COMMIT, or use an exclusive lock for BEGIN/COMMIT; a sketch of the latter follows.
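For example, with the sqlite3 Ruby gem you can open the transaction in exclusive (or immediate) mode so the write lock is taken up front (a sketch, not from the original answer; the table names are placeholders):
require "sqlite3"

db = SQLite3::Database.new("db/mysite_prod.sqlite3")
db.busy_timeout = 10_000 # still useful for standalone writes

# BEGIN EXCLUSIVE takes the write lock immediately, so reads inside the
# transaction cannot later fail with SQLITE_BUSY when promoting the lock.
db.transaction(:exclusive) do |conn|
  count = conn.get_first_value("SELECT COUNT(*) FROM posts")
  conn.execute("INSERT INTO audits (note) VALUES (?)", ["count was #{count}"])
end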
Hope this helps. :)
This is often a consequence of multiple processes accessing the same database, e.g. if the "allow only one instance" flag was not set in RubyMine.
Try running the following; it may help:
ActiveRecord::Base.connection.execute("BEGIN TRANSACTION; END;")
From: Ruby: SQLite3::BusyException: database is locked:
This may clear up any transaction holding up the system.
I believe this happens when a transaction times out. You really should be using a "real" database. Something like Drizzle, or MySQL. Any reason why you prefer SQLite over the two prior options?
