Rails: Thread won't affect database unless joined to main Thread

I have a background operation that I would like to run every 20 seconds in Rails, as long as some condition is true. It is kicked off when a certain controller route is hit, and it looks like this:
def startProcess
  argId = self.id
  t = Thread.new do
    while (Argument.isRunning(argId)) do
      Argument.update(argId)
      Argument.markVotes(argId)
      puts "Thread ran"
      sleep 20
    end
  end
end
However, this code does absolutely nothing to my database unless I call t.join, in which case my whole server is blocked for a long time (but it works).
Why can't the thread commit ActiveRecord objects without being joined to the main thread?
The thread calls methods that look something like this:
def sample
  model = Model.new
  model.save
end
but the models are not saved to the DB unless the thread is joined to the main thread. Why is this? I have been banging my head against this for hours.
EDIT:
The answer marked correct is technically correct; however, this edit outlines the solution I eventually used. The issue is that Ruby (MRI) does not have truly parallel threading, so even once I got my DB connection working, the thread couldn't get processor time unless there was little traffic to the server.
Solution: start a new Heroku worker instance, point it at the same database, and make it execute a rake task that has the same functionality as the thread. Now everything works great.
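For reference, a sketch of what that rake task might look like (the file, task, and environment variable names here are hypothetical; the loop mirrors startProcess above):
# lib/tasks/argument_worker.rake (hypothetical path and names)
namespace :argument do
  desc "Background loop formerly run in an in-process Thread"
  task update_loop: :environment do
    argId = ENV["ARGUMENT_ID"] # assumption: which Argument to watch
    while Argument.isRunning(argId)
      Argument.update(argId)
      Argument.markVotes(argId)
      puts "Worker ran"
      sleep 20
    end
  end
end
The Heroku worker then runs it via a Procfile entry along the lines of worker: bundle exec rake argument:update_loop.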

You need to re-establish the database connection:
ActiveRecord::Base.establish_connection Rails.env
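For illustration, a minimal sketch of where that call could live, reusing the question's startProcess; an alternative shown here is leasing a connection from the pool with with_connection, which also returns it when the loop ends:
def startProcess
  argId = self.id
  Thread.new do
    # Each thread needs its own DB connection; leasing one from the pool
    # returns it automatically when the block exits.
    ActiveRecord::Base.connection_pool.with_connection do
      while Argument.isRunning(argId)
        Argument.update(argId)
        Argument.markVotes(argId)
        puts "Thread ran"
        sleep 20
      end
    end
  end
end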

Related

How to avoid Race Condition and Lock wait timeout updating views of a page

Inside a Rails application, users visit a page where I show a popup.
I want to update a record every time users see that popup.
To avoid race conditions I use optimistic locking (so I added a field called lock_version to the popups table).
The code is straightforward:
# inside pages/show.html.erb
<%= render @popup %>
# and inside the popup partial...
...
<%
  Popup.transaction do
    begin
      popup.update_attributes(:views => popup.views + 1)
    rescue ActiveRecord::StaleObjectError
      retry
    end
  end
%>
The problem is that lots of users access the page, and MySQL exceeds its lock wait timeout.
So the website freezes and I get lots of these errors:
Lock wait timeout exceeded; try restarting transaction
That's because there are lots of pending requests trying to update the record with an outdated lock_version value.
How can I solve my problem?
You can use increment_counter because it produces a single SQL UPDATE query without locking.
But I think it would be better in your case to use a key-value store like Redis to hold and update your popup counter, because it can do this faster than a SQL database.
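For reference, a quick sketch of the increment_counter approach; it issues a single atomic UPDATE (views = views + 1 WHERE id = ...) without reading the row first, so lock_version never enters the picture:
# Atomic counter bump; bypasses validations, callbacks, and optimistic locking.
Popup.increment_counter(:views, popup.id)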
If you cannot go with an approach like @maxd noted in their reply, you can utilize an asynchronous library such as Sidekiq to process these sorts of requests (they'll just get backed up in the job queue).
lib/some_made_up_class.rb
class SomeMadeUpClass
  def self.increment_popup(popup)
    Popup.transaction do
      begin
        popup.update_attributes(:views => popup.views + 1)
      rescue ActiveRecord::StaleObjectError
        retry
      end
    end
  end
end
Then, in another piece of code (your controller, a service, or the view, though it's less ideal to put logic in the view layer):
SomeMadeUpClass.delay.increment_popup(popup)
# OR you can schedule it
SomeMadeUpClass.delay_for(1.second).increment_popup(popup)
This would have the effect of essentially queueing up your updates while freeing your page and, in theory, helping to reduce the timeouts you're hitting.
While there is certainly more to it than just adding a library (gem) like Sidekiq and the sample code I have here, I think asynchronous libraries/tools will help tremendously.

How to prevent parallel Sidekiq jobs from executing code in Rails

I have around 10 workers that performs a job that includes the following:
user = User.find_or_initialize_by(email: 'some-email@address.com')
if user.new_record?
  # ... some code here that does something taking around 5 seconds or so
elsif user.persisted?
  # ... some code here that does something taking around 5 seconds or so
end
user.save
The problem is that at certain times two or more workers run this code at exactly the same time, and I later found out that two or more Users ended up with the same email, when I should always end up with unique emails.
It is not possible in my situation to create a DB unique index on email, as unique emails are conditional: some Users should have unique emails, some do not.
It is worth mentioning that my User model has uniqueness validations, but they still don't help because, between .find_or_initialize_by and .save, there is code that depends on whether the user object has already been created or not.
I tried pessimistic and optimistic locking, but neither helped me, or maybe I just didn't implement them properly; I would welcome suggestions there.
The only solution I can think of is to lock the other threads (Sidekiq jobs) whenever these lines of code are executed, but I am not sure how to implement this, nor do I know if it is even an advisable approach.
I would appreciate any help.
EDIT
In my specific case, it is going to be hard to put the email parameter in the job, as this job is a little more complex than what was described above. The job is actually an export script, and the code above is only one section of it. I don't think it's possible to separate this functionality into another worker either, as the whole job flow should be serial and no parts should be processed in parallel or asynchronously. This job is just one of the jobs managed by another job, which is ultimately managed by the master job.
Pessimistic locking is what you want, but it only works on a record that already exists - you can't use it with new_record? because there's nothing in the DB to lock yet.
I managed to solve my problem with the following:
I found out that I can actually add a where clause to a Rails DB unique index (a partial index), which lets me set up conditional uniqueness for the different types of Users at the database level, so concurrent jobs will now raise an ActiveRecord::RecordNotUnique error if the record was already created.
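For illustration, a sketch of such a migration (assuming Rails 5+ migration syntax and a hypothetical user_type condition; adapt the where clause to your own rules):
class AddPartialUniqueIndexToUsersEmail < ActiveRecord::Migration[5.2]
  def change
    # Partial unique index: the constraint only applies to rows matching the where clause.
    add_index :users, :email, unique: true, where: "user_type = 'member'"
  end
end
Note that partial indexes require database support (PostgreSQL has them; MySQL does not).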
The only remaining problem is the code between .find_or_initialize_by and .save: only one of the concurrent jobs should ever see .new_record? == true, and the others should see .persisted? == true, since one job is always first to create the record. None of that works by itself, though, because the database uniqueness check only fires at the .save line. So I moved .save above those conditions (capturing whether the record was new just before saving, since a successful save flips new_record? to false), and added a rescue block around .save that enqueues another copy of the job should it raise ActiveRecord::RecordNotUnique, so that async jobs won't conflict. The code now looks like this:
user = User.find_or_initialize_by(email: 'some-email@address.com')
begin
  is_new_record = user.new_record? # capture before save; a successful save flips this to false
  user.save
  is_persisted = user.persisted?
rescue ActiveRecord::RecordNotUnique => exception
  MyJob.perform_later(params_hash) # re-enqueue and let the retry handle it
  is_new_record = is_persisted = false # skip both branches on this run
end
if is_new_record
  # do something if not yet created
elsif is_persisted
  # do something if already created
end
I would suggest a different architecture to bypass the problem.
How about a producer-worker model, where one master Sidekiq process gets a list of email addresses and then spawns a worker Sidekiq process for each email? Sidekiq makes this easy, with a dedicated queue for the master and workers to communicate.
Doing so, the email address becomes an input parameter of each worker, so we know by construction that workers will not stomp on each other's data.
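A rough sketch of that layout, with hypothetical MasterJob and UserWorker names (the source of the email list is an assumption):
class MasterJob
  include Sidekiq::Worker

  def perform
    # Assumption: fetch_emails returns the list of addresses to process.
    fetch_emails.each { |email| UserWorker.perform_async(email) }
  end
end

class UserWorker
  include Sidekiq::Worker

  def perform(email)
    # Each email is routed to exactly one worker, so no two jobs race on the same user.
    user = User.find_or_initialize_by(email: email)
    # ... per-user work here ...
    user.save
  end
end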

Rails 4 Multithreaded App - ActiveRecord::ConnectionTimeoutError

I have a simple rails app that scrapes JSON from a remote URL for each instance of a model (let's call it A). The app then creates a new data-point under an associated model of the 1st. Let's call this middle model B and the data point model C. There's also a front end that lets users browse this data graphically/visually.
Thus the hierarchy is A has many -> B which has many -> C. I scrape a URL for each A which returns a few instances of B with new Cs that have data for the respective B.
While attempting to test/scale this app I have encountered a problem where rails will stop processing, hang for a while, and finally throw an "ActiveRecord::ConnectionTimeoutError: could not obtain a database connection within 5.000 seconds". Obviously the 5 is just the default.
I can't understand why this is happening when 1) there are no DB calls being made explicitly, 2) the log doesn't show any under-the-hood DB calls happening when it does work, and 3) it works sometimes and not others.
What's going on with rails 4 AR and the connection pool?!
A couple of notes:
The general algorithm is to spawn a thread for each model A, scrape the data, create in memory new instances of model C, save all the C's in one transaction at the end.
Sometimes this works, other times it doesn't, i can't figure out what causes it to fail. However, once it fails it seems to fail more and more.
I eager load all the model A's and B's to begin with.
I use a transaction at the end to insert all the newly created C instances.
I currently use resque and resque scheduler to do this work but I highly doubt they are the source of the problem as it persists even if I just do "rails runner Class.do_work"
Any suggestions and or thoughts greatly appreciated!
I believe I have found the cause of this problem. When you loop through an association via
model.association.each do |a|
  # work here
end
Rails does some behind-the-scenes work that "uses" a DB connection. I put "uses" in quotes because in my case I think the result is actually returned from memory: I eager loaded the association, so the DB is never actually hit.
Preliminary testing of wrapping my block in a
ActiveRecord::Base.connection_pool.with_connection do
  # work here
end
seems to have resolved the issue.
I uncovered this by adding a backtrace to my thread's error message that was printing out.
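Put together, the working shape looks roughly like this (a sketch, assuming one scraper thread per A record; ModelA is a stand-in name):
threads = ModelA.all.map do |a|
  Thread.new do
    # Lease a pooled connection for this thread; it is returned when the block exits.
    ActiveRecord::Base.connection_pool.with_connection do
      # scrape the URL for this A, build the C instances, save them in one transaction
    end
  end
end
threads.each(&:join)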
-----For those using resque----
I also had to add a bit in my resque.rake file to get this fully working as intended.
task 'resque:setup' => :environment do
  Resque.after_fork do |job|
    ActiveRecord::Base.establish_connection
  end
end
If you are using
ActiveRecord::Base.transaction do
  # ... code
end
to accomplish faster transactions in a thread, note that this locks the database. I had an app that did this for a hugely expensive process in a thread, and it would lock the DB for over 5 seconds. It is faster, but it will lock your database for the duration.

Can ruby exceptions be handled asynchronously outside of a Thread::handle_interrupt block?

At first glance, I thought the new ruby 2.0 Thread.handle_interrupt was going to solve all my asynchronous interrupt problems, but unless I'm mistaken I can't get it to do what I want (my question is at the end and in the title).
From the documentation, I can see how I can avoid receiving interrupts in a certain block, deferring them to another block. Here's an example program:
duration = ARGV.shift.to_i
t = Thread.new do
  Thread.handle_interrupt(RuntimeError => :never) do
    5.times { putc '-'; sleep 1 }
    Thread.handle_interrupt(RuntimeError => :immediate) do
      begin
        5.times { putc '+'; sleep 1 }
      rescue
        puts "received #{$!}"
      end
    end
  end
end
sleep duration
puts "sending"
t.raise "Ka-boom!"
if t.join(20 + duration).nil?
  raise "thread failed to join"
end
When run with argument 2 it outputs something like this:
--sending-
--received Ka-boom!
That is, the main thread sends a RuntimeError to the other thread after two seconds, but that thread doesn't handle it until it gets into the inner Thread.handle_interrupt block.
Unfortunately, I don't see how this can help me if I don't know where my thread is getting created, because I can't wrap everything it does in a block. For example, in Rails, what would I wrap the Thread.handle_interrupt or begin...rescue...end blocks around? And wouldn't this differ depending on what webserver is running?
What I was hoping for is a way to register a handler, like the way Kernel.trap works. Namely, I'd like to specify handling code that's context-independent and that will handle all exceptions of a certain type:
register_handler_for(SomeExceptionClass) do
  ... # handle the exception
end
What precipitated this question was how the RabbitMQ gem, Bunny, sends connection-level errors to the thread that opened the Bunny::Session, using Thread#raise. These exceptions could end up anywhere, and all I want to do is log them, flag that the connection is unavailable, and continue on my way.
Ideas?
Ruby provides for this with the Ruby Queue object (not to be confused with an AMQP queue). It would be nice if Bunny required you to create a Ruby Queue before opening a Bunny::Session and you passed it that Queue object, to which it would send connection-level errors instead of using Thread#raise to send them back to wherever. You could then simply provide your own Thread to consume messages from the Queue.
It might be worth looking inside the RabbitMQ gem code to see if you could do this, or asking the maintainers of that gem about it.
In Rails this is not likely to work unless you can establish a server-wide thread to consume from the Ruby Queue, which of course would be web-server specific. I don't see how you can do this from within a short-lived object, e.g. code for a Rails view, where the threads are reused but Bunny doesn't know that (or care).
I'd like to raise (ha-ha!) a pragmatic workaround. Here be dragons. I'm assuming you're building an application and not a library to be redistributed; if not, then don't use this.
You can patch Thread#raise, specifically on your session thread instance.
module AsynchronousExceptions
  @exception_queue = Queue.new

  class << self
    attr_reader :exception_queue
  end

  def raise(*args)
    # We do this dance to capture an actual error instance, because
    # raise may be called with no arguments, a string, 3 arguments,
    # an error, or any object really. We want an actual error.
    # NOTE: This might need to be adjusted for proper stack traces.
    error = begin
      Kernel.raise(*args)
    rescue => error
      error
    end
    AsynchronousExceptions.exception_queue.push(error)
  end
end
session_thread = Thread.current
session_thread.singleton_class.prepend(AsynchronousExceptions)
Bear in mind that exception_queue is essentially a global. We also patch raise for everybody, not just the reader loop. Luckily there are few legitimate reasons to call Thread#raise, so you might just get away with this safely.
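To complete the picture, a sketch of the consuming side: a dedicated thread that drains the queue and logs each error (Queue#pop blocks until something is pushed):
Thread.new do
  loop do
    error = AsynchronousExceptions.exception_queue.pop
    # Log and flag the connection as unavailable, per the question's goal.
    Rails.logger.error("asynchronous error: #{error.class}: #{error.message}")
  end
end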

Connection pool issue with ActiveRecord objects in rufus-scheduler

I'm using rufus-scheduler to run a number of frequent jobs that do various tasks with ActiveRecord objects. If there is any sort of network or PostgreSQL hiccup, even after recovery, all the threads will throw the following error until the process is restarted:
ActiveRecord::ConnectionTimeoutError (could not obtain a database connection within 5 seconds (waited 5.000122687 seconds). The max pool size is currently 5; consider increasing it.)
The error can easily be reproduced by restarting Postgres. I've tried playing with the pool size (up to 15), but no luck there.
That leads me to believe the connections are just in a stale state, which I thought would be fixed by the call to clear_stale_cached_connections!.
Is there a more reliable pattern for this?
The block that is passed in is a simple ActiveRecord select and update, and it happens no matter what the AR object is.
The rufus job:
scheduler.every '5s' do
  db do
    DataFeed.update # standard AR select/update
  end
end
wrapper:
def db(&block)
  begin
    ActiveRecord::Base.connection_pool.clear_stale_cached_connections!
    # ActiveRecord::Base.establish_connection # this didn't help either way
    yield
  rescue Exception => e
    raise e
  ensure
    ActiveRecord::Base.connection.close if ActiveRecord::Base.connection
    ActiveRecord::Base.clear_active_connections!
  end
end
Rufus scheduler starts a new thread for every job.
ActiveRecord on the other hand cannot share connections between threads, so it needs to assign a connection to a specific thread.
When your thread doesn't have a connection yet, it will get one from the pool.
(If all connections in the pool are in use, it will wait until one is returned from another thread, eventually timing out and throwing ConnectionTimeoutError.)
It is your responsibility to return the connection to the pool when you are done with it. In a Rails app this is done automatically, but if you are managing your own threads (as rufus does), you have to do it yourself.
Luckily, there is an API for this:
If you put your code inside a with_connection block, it will get a connection from the pool and release it when it is done:
ActiveRecord::Base.connection_pool.with_connection do
  # your code here
end
In your case:
def db
  ActiveRecord::Base.connection_pool.with_connection do
    yield
  end
end
That should do the trick.
http://api.rubyonrails.org/classes/ActiveRecord/ConnectionAdapters/ConnectionPool.html#method-i-with_connection
The reason may be that you have many threads using up all the connections: if the DataFeed.update method takes more than 5 seconds, your blocks can overlap.
Try:
scheduler.every("5s", :allow_overlapping => false) do
#...
end
Also try releasing the connection instead of closing it:
ActiveRecord::Base.connection_pool.release_connection
I don't really know about rufus-scheduler, but I got some ideas.
The first problem could be a bug in rufus-scheduler where it does not handle database connection checkout properly. If that's the case, the only solution is to clear stale connections manually (as you already do) and to inform the author of rufus-scheduler about your issue.
Another possibility is that your DataFeed operation takes a really long time and, because it is performed every 5 seconds, Rails runs out of database connections - but that's rather unlikely.
