How can I avoid PG::InFailedSqlTransaction when adding new columns? - ruby-on-rails

We are running two rails applications against the same database. When we deploy, we typically deploy to App A, then App B, restarting all rails processes during the deploy. App A runs on 7 servers with at least 20 processes connections to the database. App B runs on 4 servers with at least 8 connections to the database.
Today, when we deployed App A, we added a column to an existing table:
change_table :organizations do |t|
t.integer :users_count, default: 0
end
We expected this to be fine: its new column on an existing table and it has a default. Shortly after the migration ran, a number of errors showed up, from both App A (before it was restarted) and App B (before it was deployed to).
These errors were:
FATAL ActiveRecord::StatementInvalid error: PG::InFailedSqlTransaction:
ERROR: current transaction is aborted, commands ignored until end of
transaction block
In the postgres log, I have 58 errors like this:
postgres[12283]: ERROR: cached plan must not change result type
postgres[12283]: STATEMENT: SELECT "organizations".* FROM
"organizations" WHERE "organizations"."id" = $1 LIMIT $2
This repeats a number of times and goes away after all deploys have finished and all processes restarted.
It appeared that Rails bug #12330 and Rails PR 22170 addressed this in Rails 5.0, but I have that commit and am still seeing this error.
Relevant software versions
Rails 5.0.2
PG 0.19.0
Postgres 9.5
One comment on Rails bug #12330 suggests that I have to add the columns with null defaults. Another suggests performing multiple deploys, one to disable prepare statements, then another to perform the migration and and re-enable prepared statements.
Is there away to avoid this? It clears up when we restart the servers, but I feel like I'm missing something - like only using nullable columns perhaps that would avoid these errors all together. This doesn't happen on every deploy and I don't know how to reproduce it - but this wasn't the first time it has happened.

You have changed table structure and thus what your prepared SELECT returns, because you use "organizations".*, with is returning all columns. PostgreSQL apparently doesn't support updating of prepared statements, so you need to either create new session (reconnect) or use DEALLOCATE to remove that prepared statement.
EDIT: You could also stop using SELECT *.

Related

Dual booting when there is a breaking database change

I am upgrading an app from 6.0 to 6.1, using the next_rails dual boot gem. As part of the upgrade I needed to execute rails active_storage:update which adds a service_name to the ActiveStorage::Blob table. When I run my tests with next rspec spec this works fine, but rspec spec fails with tests that use active_storage with messages like
ActiveRecord::NotNullViolation:
PG::NotNullViolation: ERROR: null value in column "service_name" violates not-null constraint
Is there a way of adding some conditional code (if Rails::VERSION::MINOR) that detects the version is 6.0 and somehow removes the null constraint for the service_name column.
There are two approaches.
Migrate down to the version before the update:
if Rails.version == "6.0.0"
ActiveRecord::Base.connection.migration_context.migrate(20220816031332)
else
ActiveRecord::Base.connection.migration_context.migrate
end
# or maybe invoke the db:migrate tasks instead
You might need to turn off pending migration check: config.active_record.migration_error = false
This approach is general and should work with any migration(s).
The second approach is to call migration helpers manually. This is specific to the problem posed in this question and just changes the columns without updating migration version in the database:
if Rails.version == "6.0.0"
ActiveRecord::Migration.change_column_null(:active_storage_blobs, :service_name, true)
else
ActiveRecord::Migration.change_column_null(:active_storage_blobs, :service_name, false)
end
The code can be placed in config/environment.rb after Rails.application.initialize!. If you wish to place it earlier, you would need to establish the connection to the database manually.
Note that both methods cause persistent changes to the database, e.g. will remain after the tests have run. The changes only affect the database of which ever environment the code is run in e.g. test or development.

Memory leaks on postgresql server after upgrade to Rails 4

We are experiencing a strange problem on a Rails application on Heroku. Juste after migrate from Rails 3.2.17 to Rails 4.0.3 our postgresql server show an infinite increase of memory usage, then it returns the following error on every request :
ERROR: out of memory
DETAIL: Failed on request of size xxx
Juste after releasing the application with rails 4, postgresql memory start to increase.
As you can see in the screenshot below, It increase from 500 MO to more than 3,5Go in 3 hours
Simultaneously, commit per second doubled. It passed from 120 commit per second :
to 280 commit per second :
It is worth noting that when we restart the application, memory go down to a normal value of 600 Mo before going up to more than 3 Go few hours later (then every sql request show the 'out of memory' error). It is like if killing ActiveRecord connections were releasing memory on postgresql server.
We may well have a memory leak somewhere.
However :
It was working very well with Rails 3.2. Maybe this problem is a conjunction between changes we made to adapt our code to Rails 4 and Rails 4 code itself.
Ihe increase of the number of commit per second juste after Rails 4 upgrade seems very odd.
Our stack is :
Heroku, x2 dynos
Postgresql, Ika plan on heroku
Unicorn, 3 workers per instance
Rails 4.0.3
Redis Cache.
Noteworthy Gems : Delayed jobs (4.0.0), Active Admin (on master branch), Comfortable Mexican Sofa (1.11.2)
Nothing seems really fancy in our code.
Our postgresql config is :
work_mem : 100MB
shared_buffers : 1464MB
max_connections : 500
maintenance_work_mem : 64MB
Did someone ever experienced such a behaviour when switching to Rails 4 ? I am looking for idea to reproduce as well.
All help is very welcome.
Thanks in advance.
I don't know what is better : answer my question or update it ... so I choose to answer. Please let me know if it's better to update
We finally find out the problem. Since version 3.1, Rails added prepared statements on simple request like User.find(id). Version 4.0, added prepared statements to requests on associations (has_many, belongs_to, has_one).
For exemple following code :
class User
has_many :adresses
end
user.addresses
generate request
SELECT "addresses".* FROM "addresses" WHERE "addresses"."user_id" = $1 [["user_id", 1]]
The problem is that Rails only add prepared statement variables for foreign keys (here user_id). If you use custom sql request like
user.addresses.where("moved_at < ?", Time.now - 3.month)
it will not add a variable to the prepared statements for moved_at. So it generate a prepared statements every time the request is called. Rails handle prepared statements with a pool of max size 1000.
However, postgresql prepared statements are not shared across connection, so in one or two hours each connection has 1000 prepared statements. Some of them are very big. This lead to very high memory consumption on postgreqsl server.

ruby on rails - delayed jobs with different process id updating the same column

I discovered the problem, it had nothing to do with locks.
It seems that in production, I had a jobs:work running permanently, that was called I don't know how! So all the jobs processed by that process would do something somewhere else!
And that somewhere else is not my database, so I just killed it and all started to work fine.
Sorry, for wasting your time!!
Sorry, forgot to tell that I'm working with rails 2.3.8!
I have asynchronous updates to the same row, same column from different background process. I'm using the delayed_jobs gem.
What I want to do is:
ActiveRecord::Base.connection.execute(
"Update table_name set column = column + #{updated_number}
where id = #{self.id}").
My database is mysql and the table where I write is InnoDB.
So the problem is, running that query in different delayed_jobs will cause some data increments to be lost. please note that (column = column + #{updated_number}) I want to increment the current value on the table!
Using rails lock doesn't work because each delayed job run in a different process, I was thinking more like if the table had some locks to do updates safely.
And one more thing, using lock!, On my development code, I run 3 times the rake jobs:work, then I confirm on the delayed_job table that 3 different process locked 3 jobs, and is the development code it works perfectly.
But when put that code in production it doesn't work. The lost of increment data is still there.
Use pessimistic locking:
your_object.with_lock do
your_object.column += updated_number
your_object.save!
end
This will make sure the updates are synchronized via DB.

Rails and Heroku PGError: column does not exist

This page I have been developing for my app has been working fine locally (using sqllite3) but when I push it to Heroku, which uses PostgreSQL I get this error:
NeighborhoodsController# (ActionView::Template::Error) "PGError: ERROR: column \"isupforconsideration\" does not exist\nLINE 1: ... \"photos\" WHERE (neighborhood = 52 AND isUpForCon...\n
From this line of code:
#photos = Photo.where(["neighborhood = ? AND isUpForConsideration = ?", #neighborhood.id, 1])
isUpForConsideration is defiantly part of the Photo column. All my migrations are up to date, and when I pull the db back locally isUpForConsideration is still there, and the app still works locally.
I've also tried:
#photos = #neighborhood.photos(:conditions => {:isUpForConsideration => 1})
and
#photos = #neighborhood.photos.where(["isUpForConsideration = 1"])
Which gives me this error:
NeighborhoodsController# (ActionView::Template::Error) "PGError: ERROR: column \"isupforconsideration\" does not exist\nLINE 1: ...tos\" WHERE (\"photos\".neighborhood = 52) AND (isUpForCon...\n
Any idea what I could be doing wrong?
Your problem is that table and column names are case sensitive in PostgreSQL. This is normally hidden by automatically converting these to all-lowercase when making queries (hence why you see the error message reporting "isupforconsideration"), but if you managed to dodge that conversion on creation (by double quoting the name, like Rails does for you when you create a table), you'll see this weirdness. You need to enclose "isUpForConsideration" in double quotes when using it in a WHERE clause to fix this.
e.g.
#photos = #neighborhood.photos.where(["\"isUpForConsideration\" = 1"])
Another way to to get this error by modifying a migration and pushing changes up to Heroku without rebuilding the table.
Caveat: You'll lose data and perhaps references, so what I'm about to explain is a bad idea unless you're certain that no references will be lost. Rails provides ways to modify tables with migrations -- create new migrations to modify tables, don't modify the migrations themselves after they're created, generally.
Having said that, you can run heroku run rake db:rollback until that table you changed pops off, and then run heroku run rake db:migrate to put it back with your changes.
In addition, you can use the taps gem to back up and restore data. Pull down the database tables, munge them the way you need and then push the tables back up with taps. I do this quite often during the prototyping phase. I'd never do that with a live app though.

SQLite3::BusyException

Running a rails site right now using SQLite3.
About once every 500 requests or so, I get a
ActiveRecord::StatementInvalid (SQLite3::BusyException: database is locked:...
What's the way to fix this that would be minimally invasive to my code?
I'm using SQLLite at the moment because you can store the DB in source control which makes backing up natural and you can push changes out very quickly. However, it's obviously not really set up for concurrent access. I'll migrate over to MySQL tomorrow morning.
You mentioned that this is a Rails site. Rails allows you to set the SQLite retry timeout in your database.yml config file:
production:
adapter: sqlite3
database: db/mysite_prod.sqlite3
timeout: 10000
The timeout value is specified in miliseconds. Increasing it to 10 or 15 seconds should decrease the number of BusyExceptions you see in your log.
This is just a temporary solution, though. If your site needs true concurrency then you will have to migrate to another db engine.
By default, sqlite returns immediatly with a blocked, busy error if the database is busy and locked. You can ask for it to wait and keep trying for a while before giving up. This usually fixes the problem, unless you do have 1000s of threads accessing your db, when I agree sqlite would be inappropriate.
// set SQLite to wait and retry for up to 100ms if database locked
sqlite3_busy_timeout( db, 100 );
All of these things are true, but it doesn't answer the question, which is likely: why does my Rails app occasionally raise a SQLite3::BusyException in production?
#Shalmanese: what is the production hosting environment like? Is it on a shared host? Is the directory that contains the sqlite database on an NFS share? (Likely, on a shared host).
This problem likely has to do with the phenomena of file locking w/ NFS shares and SQLite's lack of concurrency.
If you have this issue but increasing the timeout does not change anything, you might have another concurrency issue with transactions, here is it in summary:
Begin a transaction (aquires a SHARED lock)
Read some data from DB (we are still using the SHARED lock)
Meanwhile, another process starts a transaction and write data (acquiring the RESERVED lock).
Then you try to write, you are now trying to request the RESERVED lock
SQLite raises the SQLITE_BUSY exception immediately (indenpendently of your timeout) because your previous reads may no longer be accurate by the time it can get the RESERVED lock.
One way to fix this is to patch the active_record sqlite adapter to aquire a RESERVED lock directly at the begining of the transaction by padding the :immediate option to the driver. This will decrease performance a bit, but at least all your transactions will honor your timeout and occurs one after the other. Here is how to do this using prepend (Ruby 2.0+) put this in a initializer:
module SqliteTransactionFix
def begin_db_transaction
log('begin immediate transaction', nil) { #connection.transaction(:immediate) }
end
end
module ActiveRecord
module ConnectionAdapters
class SQLiteAdapter < AbstractAdapter
prepend SqliteTransactionFix
end
end
end
Read more here: https://rails.lighthouseapp.com/projects/8994/tickets/5941-sqlite3busyexceptions-are-raised-immediately-in-some-cases-despite-setting-sqlite3_busy_timeout
Just for the record. In one application with Rails 2.3.8 we found out that Rails was ignoring the "timeout" option Rifkin Habsburg suggested.
After some more investigation we found a possibly related bug in Rails dev: http://dev.rubyonrails.org/ticket/8811. And after some more investigation we found the solution (tested with Rails 2.3.8):
Edit this ActiveRecord file: activerecord-2.3.8/lib/active_record/connection_adapters/sqlite_adapter.rb
Replace this:
def begin_db_transaction #:nodoc:
catch_schema_changes { #connection.transaction }
end
with
def begin_db_transaction #:nodoc:
catch_schema_changes { #connection.transaction(:immediate) }
end
And that's all! We haven't noticed a performance drop and now the app supports many more petitions without breaking (it waits for the timeout). Sqlite is nice!
bundle exec rake db:reset
It worked for me it will reset and show the pending migration.
Sqlite can allow other processes to wait until the current one finished.
I use this line to connect when I know I may have multiple processes trying to access the Sqlite DB:
conn = sqlite3.connect('filename', isolation_level = 'exclusive')
According to the Python Sqlite Documentation:
You can control which kind of BEGIN
statements pysqlite implicitly
executes (or none at all) via the
isolation_level parameter to the
connect() call, or via the
isolation_level property of
connections.
I had a similar problem with rake db:migrate. Issue was that the working directory was on a SMB share.
I fixed it by copying the folder over to my local machine.
Most answers are for Rails rather than raw ruby, and OPs question IS for rails, which is fine. :)
So I just want to leave this solution over here should any raw ruby user have this problem, and is not using a yml configuration.
After instancing the connection, you can set it like this:
db = SQLite3::Database.new "#{path_to_your_db}/your_file.db"
db.busy_timeout=(15000) # in ms, meaning it will retry for 15 seconds before it raises an exception.
#This can be any number you want. Default value is 0.
Source: this link
- Open the database
db = sqlite3.open("filename")
-- Ten attempts are made to proceed, if the database is locked
function my_busy_handler(attempts_made)
if attempts_made < 10 then
return true
else
return false
end
end
-- Set the new busy handler
db:set_busy_handler(my_busy_handler)
-- Use the database
db:exec(...)
What table is being accessed when the lock is encountered?
Do you have long-running transactions?
Can you figure out which requests were still being processed when the lock was encountered?
Argh - the bane of my existence over the last week. Sqlite3 locks the db file when any process writes to the database. IE any UPDATE/INSERT type query (also select count(*) for some reason). However, it handles multiple reads just fine.
So, I finally got frustrated enough to write my own thread locking code around the database calls. By ensuring that the application can only have one thread writing to the database at any point, I was able to scale to 1000's of threads.
And yea, its slow as hell. But its also fast enough and correct, which is a nice property to have.
I found a deadlock on sqlite3 ruby extension and fix it here: have a go with it and see if this fixes ur problem.
https://github.com/dxj19831029/sqlite3-ruby
I opened a pull request, no response from them anymore.
Anyway, some busy exception is expected as described in sqlite3 itself.
Be aware with this condition: sqlite busy
The presence of a busy handler does not guarantee that it will be invoked when there is
lock contention. If SQLite determines that invoking the busy handler could result in a
deadlock, it will go ahead and return SQLITE_BUSY or SQLITE_IOERR_BLOCKED instead of
invoking the busy handler. Consider a scenario where one process is holding a read lock
that it is trying to promote to a reserved lock and a second process is holding a reserved
lock that it is trying to promote to an exclusive lock. The first process cannot proceed
because it is blocked by the second and the second process cannot proceed because it is
blocked by the first. If both processes invoke the busy handlers, neither will make any
progress. Therefore, SQLite returns SQLITE_BUSY for the first process, hoping that this
will induce the first process to release its read lock and allow the second process to
proceed.
If you meet this condition, timeout isn't valid anymore. To avoid it, don't put select inside begin/commit. or use exclusive lock for begin/commit.
Hope this helps. :)
this is often a consecutive fault of multiple processes accessing the same database, i.e. if the "allow only one instance" flag was not set in RubyMine
Try running the following, it may help:
ActiveRecord::Base.connection.execute("BEGIN TRANSACTION; END;")
From: Ruby: SQLite3::BusyException: database is locked:
This may clear up the any transaction holding up the system
I believe this happens when a transaction times out. You really should be using a "real" database. Something like Drizzle, or MySQL. Any reason why you prefer SQLite over the two prior options?

Resources