transaction in activerecord - ruby-on-rails

Folks,
I am fairly new to transactions in ActiveRecord in Rails, and I have a piece of code where I do something like:
transaction do
  specimen = Specimen.find_by_doc_id(25)
  specimen.state = "checking"
  specimen.save
  result = Inventory.do_check(specimen)
  if result
    specimen.state = "PASS"
  else
    specimen.state = "FAIL"
  end
  specimen.save
end
My goal in using a transaction is that if I get an exception in Inventory.do_check (it is a client to external web services and does a bunch of HTTP calls and checks), I want specimen.state to roll back to its previous value. Will this work as written above? Also, on my development machine the lock seems to be set on the entire Specimen table: when I try to query that table/model I get a BUSY exception (I am using SQLite). I was expecting the lock to be set only on that object/record.
Any feedback is much appreciated, as I said I am really new to this so my question may be very naive.

Implementation and locking depend on the DB. I don't use SQLite, and I wouldn't be surprised if it locks the entire table in such a case. But reading should still work, so the error is probably because it doesn't allow two concurrent operations on a single connection and is waiting for your transaction to finish before allowing any other operation. See, for example, this SO answer: https://stackoverflow.com/a/7154699/2117020.
However, my main point is that you shouldn't hold a transaction open while accessing external services in any case. However it is implemented, keeping a transaction open for seconds is not what you want. It looks like all you want in your case is to recover from an exception. Do you simply want to set the state to "FAIL" or "initial" as a result, or does do_check() modify your specimen? If do_check() doesn't modify the specimen, you would be better off doing something like:
specimen = Specimen.find_by_doc_id(25)
specimen.state = "checking"
specimen.save
# or simply specimen.update_attribute(:state, "checking")
begin
  specimen.state = Inventory.do_check(specimen) ? "PASS" : "FAIL"
rescue
  specimen.state = "FAIL" # or "initial" or whatever
end
specimen.save
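If you really want the previous value back rather than "FAIL", you can remember it yourself instead of relying on a rollback. Here is a minimal sketch of that idea; the Struct is only a plain-Ruby stand-in for the Specimen model, and check_specimen is a made-up helper name, not anything from Rails:

```ruby
# Plain-Ruby stand-in for the ActiveRecord model (hypothetical, for illustration).
Specimen = Struct.new(:state) do
  def save
    true # a real model would write to the database here
  end
end

# Runs the slow external check outside any transaction and restores the
# previous state by hand if the check raises.
def check_specimen(specimen)
  previous_state = specimen.state
  specimen.state = "checking"
  specimen.save
  begin
    specimen.state = yield(specimen) ? "PASS" : "FAIL"
  rescue StandardError
    specimen.state = previous_state # manual undo instead of a rollback
  end
  specimen.save
  specimen
end

ok      = check_specimen(Specimen.new("new")) { true }
failed  = check_specimen(Specimen.new("new")) { false }
errored = check_specimen(Specimen.new("new")) { raise IOError, "service down" }
```

In real code the block would call Inventory.do_check(specimen); the three calls above just exercise the pass, fail, and exception paths.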

The locking is going to be highly dependent on your database. You could use a row lock. Something like this:
specimen = Specimen.find_by_doc_id(25)
# Reloads the record and does a SELECT ... FOR UPDATE, which locks the row
# until the block exits (it's wrapped in a transaction).
specimen.with_lock do
  result = Inventory.do_check(specimen)
  if result
    specimen.state = "PASS"
  else
    specimen.state = "FAIL"
  end
  specimen.save!
end
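Checking the external site inside a transaction is not ideal, but if you use with_lock and your database supports row locks, only this single row should be locked. Other sessions that try to update or lock the row will wait until the block exits (whether plain reads also block depends on the database), so use it carefully.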
Take a look at the pessimistic locking documentation in active record:
http://ruby-docs.com/docs/ruby_1.9.3-rails_3.2.2/Rails%203.2.2/classes/ActiveRecord/Locking/Pessimistic.html

Related

ActiveRecord and Postgres row locking

API clients in a busy application are competing for existing resources. They request 1 or 2 at a time, then attempt actions upon those records. I am trying to use transactions to protect state but am having trouble getting a clear picture of row locks, especially where nested transactions (I guess savepoints, since PG doesn't really do transactions within transactions?) are concerned.
The process should look like this:
Request N resources
Remove those resources from the pool to prevent other users from attempting to claim them
Perform action with those resources
Roll back the entire transaction and return resources to pool if an error occurs
(Assume happy path for all examples. Requests always result in products returned.)
One version could look like this:
def self.do_it(request_count)
  Product.transaction do
    locked_products = Product.where(state: 'available').lock('FOR UPDATE').limit(request_count).to_a
    Product.where(id: locked_products.map(&:id)).update_all(state: 'locked')
    do_something(locked_products)
  end
end
It seems to me that we could have a deadlock on that first line if two users request 2 and only 3 are available. So, to get around it, I'd like to do...
def self.do_it(request_count)
  Product.transaction do
    locked_products = []
    request_count.times do
      Product.transaction(requires_new: true) do
        locked_product = Product.where(state: 'available').lock('FOR UPDATE').limit(1).first
        locked_product.update!(state: 'locked')
        locked_products << locked_product
      end
    end
    do_something(locked_products)
  end
end
But from what I've managed to find online, that inner transaction's end will not release the row locks -- they'll only be released when the outermost transaction ends.
Finally, I considered this:
def self.do_it(request_count)
  locked_products = []
  request_count.times do
    Product.transaction do
      locked_product = Product.where(state: 'available').lock('FOR UPDATE').limit(1).first
      locked_product.update!(state: 'locked')
      locked_products << locked_product
    end
  end
  Product.transaction { do_something(locked_products) }
ensure
  evaluate_and_cleanup(locked_products)
end
This gives me two completely independent transactions followed by a third that performs the action, but I am forced to do a manual check (or I could rescue) if do_something fails, which makes things messier. It also could lead to deadlocks if someone were to call do_it from within a transaction, which is very possible.
So my big questions:
Is my understanding of the release of row locks correct? Will row locks within nested transactions only be released when the outermost transaction is closed?
Is there a command that will change the lock type without closing the transaction?
My smaller question:
Is there some established or totally obvious pattern here that's jumping out to someone to handle this more sanely?
As it turns out, it was pretty easy to answer these questions by diving into the PostgreSQL console and playing around with transactions.
To answer the big questions:
Yes, my understanding of row locks was correct. Exclusive locks acquired within savepoints are NOT released when the savepoint is released; they are released when the overall transaction is committed.
No, there is no command to change the lock type. What kind of sorcery would that be? Once you have an exclusive lock, all queries that would touch that row must wait for you to release the lock before they can proceed.
Other than committing the transaction, rolling back the savepoint or the transaction will also release the exclusive lock.
In the case of my app, I solved my problem by using multiple transactions and keeping track of state very carefully within the app. This presented a great opportunity for refactoring and the final version of the code is simpler, clearer, and easier to maintain, though it came at the expense of being a bit more spread out than the "throw-it-all-in-a-PG-transaction" approach.
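For anyone wondering what "multiple transactions plus state tracked in the app" can look like, here is a rough in-memory sketch of the shape: claim in one short step, do the work with no lock held, and hand the resources back if the work fails. ProductPool and its methods are made up for illustration, and the Mutex only plays the role of the short claiming transaction; this is not the actual code from my app.

```ruby
# In-memory stand-in for the products table (hypothetical, for illustration).
class ProductPool
  def initialize(ids)
    @states = ids.map { |id| [id, "available"] }.to_h
    @mutex = Mutex.new # plays the role of the short claiming transaction
  end

  # Step 1: claim up to `count` products in one short critical section.
  def claim(count)
    @mutex.synchronize do
      ids = @states.select { |_, state| state == "available" }.keys.first(count)
      ids.each { |id| @states[id] = "locked" }
      ids
    end
  end

  # Failure path: hand the products back to the pool.
  def release(ids)
    @mutex.synchronize { ids.each { |id| @states[id] = "available" } }
  end

  def available_count
    @mutex.synchronize { @states.count { |_, state| state == "available" } }
  end
end

# Step 2: do the work with no lock held; release the claim if it fails.
def do_it(pool, request_count)
  claimed = pool.claim(request_count)
  begin
    yield claimed
  rescue StandardError
    pool.release(claimed)
    raise
  end
  claimed
end

pool = ProductPool.new([1, 2, 3])
first = do_it(pool, 2) { |ids| ids }       # succeeds, keeps products 1 and 2
begin
  do_it(pool, 1) { raise "action failed" } # fails, product 3 goes back
rescue RuntimeError
  # expected in this demo
end
```

The trade-off is the one described above: the claim is visible to other clients as soon as it is made, so the application, not the database, is responsible for undoing it.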

Rails before_save method fails to execute 'else' block

Here is my before_save method:
before_save :check_postal

def check_postal
  first_three = self.postal_code[0..2]
  first_three.downcase!
  postal = Postal.find_by_postal_code(first_three)
  if postal
    self.zone_id = postal.zone_id
  else
    PostalError.create(postal_code: self.postal_code)
    return false
  end
end
Everything runs fine when self.zone_id = postal.zone_id but inside the else statement, my PostalError.create(postal_code: self.postal_code) doesn't save the record to the database..
I know it's got something to do with the return false statement, because when I remove it, it saves fine -- but then that defeats the purpose..
How can I get a new PostalError record to save while returning false to prevent the current object from saving..
You're exactly right: the problem is the before_save.
The entirety of the save process is wrapped in a transaction. If the save fails, whether because of a validation failure, an exception being raised, or something else, the transaction is rolled back. This undoes the creation of your PostalError record.
Normally this is a good thing: it's so that incomplete saves don't leave detritus around.
I can think of two ways to solve this. One is to not create the record there at all: use an after_rollback hook to execute it once the danger has passed.
The other way is to create that record using a different database connection (since transactions are a per connection thing). An easy way to do that is to use a different thread:
Thread.new { PostalError.create(...)}.join
I stuck the join on there so that this waits for the thread to complete rather than adding a degree of concurrency to your app that you might not expect.
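To make the rollback behaviour concrete, here is a toy model of it in plain Ruby. TinyStore and its Rollback error are invented for this sketch (it is not ActiveRecord, just a snapshot-and-restore imitation), but it shows why a record written inside the failing save disappears while work deferred until after the rollback survives:

```ruby
# Toy in-memory store with snapshot-based rollback (hypothetical stand-in for
# a database connection; not ActiveRecord).
class TinyStore
  class Rollback < StandardError; end

  attr_reader :rows

  def initialize
    @rows = []
    @after_rollback = []
  end

  def insert(row)
    @rows << row
  end

  # Queue work to run only after the surrounding transaction has rolled back.
  def after_rollback(&block)
    @after_rollback << block
  end

  def transaction
    snapshot = @rows.dup
    yield
    @after_rollback.clear
    true
  rescue Rollback
    @rows = snapshot # undo everything written inside the block
    callbacks = @after_rollback
    @after_rollback = []
    callbacks.each(&:call)
    false
  end
end

store = TinyStore.new

# Written inside the failing transaction: lost on rollback.
store.transaction do
  store.insert({ postal_code: "ZZZ 000" })
  raise TinyStore::Rollback
end
lost = store.rows.dup

# Deferred until after the rollback: survives.
store.transaction do
  store.after_rollback { store.insert({ postal_code: "ZZZ 000" }) }
  raise TinyStore::Rollback
end
kept = store.rows.dup
```

The Thread.new approach gets the same result a different way: the write happens on another connection, so it was never part of the transaction being rolled back.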
I don't know for sure, but here is my guess at a solution:
else
  PostalError.create(postal_code: self.postal_code)
  self.zone_id = postal.zone_id
end

Rails 3.2 ActiveRecord concurrency

I have one application that is a task manager.
Each user can select a new task to be assigned to himself.
Is there a problem of concurrency if 2 users accept the same task at the same moment?
My code looks like this:
if @user.task == nil
  @task.user = @user
  @task.save
end
If 2 different users on 2 different machines open this URL at the same time, will I have a problem?
You can use optimistic locking to prevent other "stale" records from being saved to the database. To enable it, your model needs to have a lock_version column with a default value of 0.
When the record is fetched from the database, the current lock_version comes along with it. When the record is modified and saved to the database, the database row is updated conditionally, by constraining the UPDATE on the lock_version that was present when the record was fetched. If it hasn't changed, the UPDATE will increment the lock_version. If it has changed, the update will do nothing, and an exception (ActiveRecord::StaleObjectError) will be raised. This is the default behavior for ActiveRecord unless turned off as follows:
ActiveRecord::Base.lock_optimistically = false
You can (optionally) use a column-name other than lock_version. To use a custom name, add a line like the following to your model-class:
self.locking_column = :some_column_name
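The conditional UPDATE behind optimistic locking can be imitated in a few lines of plain Ruby. This is only a sketch of the mechanism: TaskRow and the local StaleObjectError are hypothetical stand-ins (the real error is ActiveRecord::StaleObjectError), and the Mutex just models the fact that the database applies a single UPDATE atomically.

```ruby
# Hypothetical stand-ins: a one-row "table" and an error class mirroring
# ActiveRecord::StaleObjectError.
class StaleObjectError < StandardError; end

class TaskRow
  attr_reader :user, :lock_version

  def initialize
    @user = nil
    @lock_version = 0
    @mutex = Mutex.new # the database applies a single UPDATE atomically
  end

  # Mimics: UPDATE tasks SET user = ?, lock_version = lock_version + 1
  #         WHERE id = ? AND lock_version = ?
  def update_user(user, expected_version)
    @mutex.synchronize do
      unless @lock_version == expected_version
        raise StaleObjectError, "row changed since it was read"
      end
      @user = user
      @lock_version += 1
    end
  end
end

row = TaskRow.new
version_seen_by_alice = row.lock_version # both users read version 0
version_seen_by_bob   = row.lock_version

row.update_user("alice", version_seen_by_alice) # succeeds, version becomes 1

bob_result =
  begin
    row.update_user("bob", version_seen_by_bob) # stale: version is no longer 0
    :saved
  rescue StaleObjectError
    :stale
  end
```

In the task manager, the second user's save would raise, and the controller could rescue the error and tell them the task is already taken.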
An alternative to optimistic locking is pessimistic locking, which relies on table- or row-level locks at the database level. This mechanism will block out all access to a locked row, and thus may negatively affect your performance.
Never tried it but you may use http://api.rubyonrails.org/classes/ActiveRecord/Locking/Pessimistic.html
You should be able to acquire a lock on your specific task, something like this:
@task = Task.find(some_id)
@task.with_lock do
  # Then let's check if there's still no one assigned to this task
  if @task.user.nil? && @user.task.nil?
    @task.user = @user
    @task.save
  end
end
Again, I have never used this, so I'd test it with a long sleep inside the lock to make sure it actually locks everything the way you want.
Also, I'm not sure about the reload here. Since the row is locked, it may fail. But you have to make sure your object is fresh from the DB after acquiring the lock; there may be another way to do it.
Edit: no need to reload. I checked the source code, and with_lock does it for you.
https://github.com/rails/rails/blob/4c5b73fef8a41bd2bd8435fa4b00f7c40b721650/activerecord/lib/active_record/locking/pessimistic.rb#L61
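The important part of the pattern is re-checking the condition after the lock is held. Here is a small plain-Ruby imitation of that; the Task class below is made up, and its Mutex only stands in for the row lock that with_lock takes in the database:

```ruby
require "thread"

# Hypothetical in-memory task; the Mutex stands in for the database row lock.
class Task
  attr_reader :user

  def initialize
    @user = nil
    @lock = Mutex.new
  end

  def with_lock
    @lock.synchronize { yield }
  end

  def assign(user)
    @user = user
  end
end

task = Task.new
winners = Queue.new

threads = %w[alice bob carol].map do |name|
  Thread.new do
    task.with_lock do
      # Re-check inside the lock: someone may have taken the task already.
      if task.user.nil?
        sleep 0.01 # widen the window that would race without the lock
        task.assign(name)
        winners << name
      end
    end
  end
end
threads.each(&:join)
```

Whichever thread gets the lock first wins; the others see a non-nil user and do nothing.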

Database lock not working as expected with Rails & Postgres

I have the following code in a rails model:
foo = Food.find(...)
foo.with_lock do
  if bar = foo.bars.find_by_stuff(stuff)
    # do something with bar
  else
    bar = foo.bars.create!
    # do something with bar
  end
end
The goal is to make sure that a Bar of the type being created is not being created twice.
Testing with_lock at the console confirms my expectations. However, in production, it seems that in some or all cases the lock is not working as expected and creation of the redundant Bar is attempted, so with_lock doesn't (always?) result in the code waiting for its turn.
What could be happening here?
update
So sorry to everyone who was saying "locking foo won't help you"! My example initially didn't have the bar lookup. This is fixed now.
You're confused about what with_lock does. From the fine manual:
with_lock(lock = true)
Wraps the passed block in a transaction, locking the object before yielding. You can pass the SQL locking clause as argument (see lock!).
If you check what with_lock does internally, you'll see that it is little more than a thin wrapper around lock!:
lock!(lock = true)
Obtain a row lock on this record. Reloads the record to obtain the requested lock.
So with_lock is simply doing a row lock and locking foo's row.
Don't bother with all this locking nonsense. The only sane way to handle this sort of situation is to use a unique constraint in the database: no one but the database can ensure uniqueness unless you want to do absurd things like locking whole tables. Then just go ahead and blindly try your INSERT or UPDATE, and trap and ignore the exception that will be raised when the unique constraint is violated.
The correct way to handle this situation is actually right in the Rails docs:
http://apidock.com/rails/v4.0.2/ActiveRecord/Relation/find_or_create_by
begin
  CreditAccount.find_or_create_by(user_id: user.id)
rescue ActiveRecord::RecordNotUnique
  retry
end
("find_or_create_by" is not atomic; it's actually a find and then a create. So replace that with your find and then create. The docs on this page describe this case exactly.)
Why don't you use a unique constraint? It's made for uniqueness
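Here is a self-contained imitation of the unique-constraint-plus-retry idea from the answers above, in case it helps to see it run. CreditAccounts and RecordNotUnique below are in-memory stand-ins that I made up for the table with a unique index and for ActiveRecord::RecordNotUnique; only the shape of find_or_create matters.

```ruby
# Hypothetical stand-ins for a table with a unique index on user_id and for
# ActiveRecord::RecordNotUnique.
class RecordNotUnique < StandardError; end

class CreditAccounts
  def initialize
    @rows = {}
    @mutex = Mutex.new
  end

  def find_by_user_id(user_id)
    @mutex.synchronize { @rows[user_id] }
  end

  # The "unique constraint": the check and the insert happen atomically.
  def create!(user_id)
    @mutex.synchronize do
      raise RecordNotUnique, "duplicate user_id #{user_id}" if @rows.key?(user_id)
      @rows[user_id] = { user_id: user_id }
    end
  end

  def count
    @mutex.synchronize { @rows.size }
  end
end

# Find, then create; if another caller inserted in between, look again.
def find_or_create(accounts, user_id)
  accounts.find_by_user_id(user_id) || accounts.create!(user_id)
rescue RecordNotUnique
  retry # the next find will see the row the other caller inserted
end

accounts = CreditAccounts.new
results = 10.times.map { Thread.new { find_or_create(accounts, 42) } }.map(&:value)
```

However the threads interleave, only one row is ever created, and every caller ends up with that row.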
One reason a lock might not work in a Rails app is the query cache.
If you try to obtain an exclusive lock on the same row multiple times in a single request, the query cache kicks in, so subsequent locking queries never reach the DB itself.
The issue has been reported on GitHub.

Simulating race conditions in RSpec unit tests

We have an asynchronous task that performs a potentially long-running calculation for an object. The result is then cached on the object. To prevent multiple tasks from repeating the same work, we added locking with an atomic SQL update:
UPDATE objects SET locked = 1 WHERE id = 1234 AND locked = 0
The locking is only for the asynchronous task. The object itself may still be updated by the user. If that happens, any unfinished task for an old version of the object should discard its results as they're likely out-of-date. This is also pretty easy to do with an atomic SQL update:
UPDATE objects SET results = '...' WHERE id = 1234 AND version = 1
If the object has been updated, its version won't match and so the results will be discarded.
These two atomic updates should handle any possible race conditions. The question is how to verify that in unit tests.
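For readers who want to see the two updates side by side, here is a rough in-memory imitation. ObjectRow is a hypothetical class, not our production code; each method mimics one of the UPDATE statements and returns the number of rows it affected, which is how the caller learns whether it won.

```ruby
# Hypothetical in-memory row; each method mimics one atomic UPDATE and returns
# the number of rows it affected (1 or 0), as a database driver would.
class ObjectRow
  attr_reader :results
  attr_accessor :version

  def initialize
    @locked = 0
    @version = 1
    @results = nil
    @mutex = Mutex.new
  end

  # UPDATE objects SET locked = 1 WHERE id = ? AND locked = 0
  def try_lock
    @mutex.synchronize do
      next 0 unless @locked.zero?
      @locked = 1
      1
    end
  end

  # UPDATE objects SET results = ? WHERE id = ? AND version = ?
  def store_results(results, expected_version)
    @mutex.synchronize do
      next 0 unless @version == expected_version
      @results = results
      1
    end
  end
end

row = ObjectRow.new
first_lock  = row.try_lock                # this task owns the calculation
second_lock = row.try_lock                # a competing task backs off
row.version = 2                           # the user edits the object mid-calculation
stale_write = row.store_results("old", 1) # computed for version 1: discarded
fresh_write = row.store_results("new", 2) # computed for version 2: stored
```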
The first semaphore is easy to test, as it is simply a matter of setting up two different tests with the two possible scenarios: (1) where the object is locked and (2) where the object is not locked. (We don't need to test the atomicity of the SQL query as that should be the responsibility of the database vendor.)
How does one test the second semaphore? The object needs to be changed by a third party some time after the first semaphore but before the second. This would require a pause in execution so that the update may be reliably and consistently performed, but I know of no support for injecting breakpoints with RSpec. Is there a way to do this? Or is there some other technique I'm overlooking for simulating such race conditions?
You can borrow an idea from electronics manufacturing and put test hooks directly into the production code. Just as a circuit board can be manufactured with special places for test equipment to control and probe the circuit, we can do the same thing with the code.
Suppose we have some code inserting a row into the database:
class TestSubject
  def insert_unless_exists
    if !row_exists?
      insert_row
    end
  end
end
But this code is running on multiple computers. There's a race condition, then, since another process may insert the row between our test and our insert, causing a DuplicateKey exception. We want to test that our code handles the exception that results from that race condition. In order to do that, our test needs to insert the row after the call to row_exists? but before the call to insert_row. So let's add a test hook right there:
class TestSubject
  def insert_unless_exists
    if !row_exists?
      before_insert_row_hook
      insert_row
    end
  end

  def before_insert_row_hook
  end
end
When run in the wild, the hook does nothing except eat up a tiny bit of CPU time. But when the code is being tested for the race condition, the test monkey-patches before_insert_row_hook:
class TestSubject
  def before_insert_row_hook
    insert_row
  end
end
Isn't that sly? Like a parasitic wasp larva that has hijacked the body of an unsuspecting caterpillar, the test hijacked the code under test so that it will create the exact condition we need tested.
This idea is as simple as the XOR cursor, so I suspect many programmers have independently invented it. I have found it to be generally useful for testing code with race conditions. I hope it helps.
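Here is a runnable version of the whole thing, as a sketch. The in-memory Table, the DuplicateKey class, and the symbols returned by insert_unless_exists are all invented for the example, and how your code should react to losing the race is an assumption on my part. I also override the hook on a single instance with define_singleton_method rather than reopening the class, so the override can't leak into other tests.

```ruby
class DuplicateKey < StandardError; end

# Hypothetical in-memory table with a unique key.
class Table
  def initialize
    @keys = []
  end

  def exists?(key)
    @keys.include?(key)
  end

  def insert(key)
    raise DuplicateKey, key.to_s if exists?(key)
    @keys << key
  end

  def size
    @keys.size
  end
end

class TestSubject
  def initialize(table, key)
    @table = table
    @key = key
  end

  # Returns :inserted, :already_there, or :lost_race.
  def insert_unless_exists
    return :already_there if @table.exists?(@key)
    before_insert_row_hook
    @table.insert(@key)
    :inserted
  rescue DuplicateKey
    :lost_race
  end

  def before_insert_row_hook
    # no-op in production; tests override it
  end
end

table = Table.new
subject = TestSubject.new(table, "row-1")

# Override the hook on this one instance so the "other process" inserts the
# row in the gap between the existence check and our own insert.
subject.define_singleton_method(:before_insert_row_hook) { table.insert("row-1") }

outcome = subject.insert_unless_exists
```

The same hook placement works for the versioned UPDATE in the question: put the hook between the calculation and the write, and have the test bump the object's version from inside it.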

Resources