Race Condition in Ruby on Rails

I'm running into a weird bug on Heroku, which I believe may be a race condition, and I'm looking for any sort of advice for solving it.
My application has a model that calls an external API (Twilio, if you're curious) after it's created. In this call, it passes a url to be called back once the third party completes their work. Like this:
class TextMessage < ActiveRecord::Base
  after_create :send_sms

  def send_sms
    call.external.third.party.api(
      :callback => sent_text_message_path(self)
    )
  end
end
Then I have a controller to handle the callback:
class TextMessagesController < ActionController::Base
  def sent
    @textmessage = TextMessage.find(params[:id])
    @textmessage.sent = true
    @textmessage.save
  end
end
The problem is that the third party is reporting that they're getting a 404 error on the callback because the model hasn't been created yet. Our own logs confirm this:
2014-03-13T18:12:10.291730+00:00 app[web.1]: ActiveRecord::RecordNotFound (Couldn't find TextMessage with id=32)
We checked and the ID is correct. What's even crazier is that we threw in a puts to log when the model is created and this is what we get:
2014-03-13T18:15:22.192733+00:00 app[web.1]: TextMessage created with ID 35.
2014-03-13T18:15:22.192791+00:00 app[web.1]: ActiveRecord::RecordNotFound (Couldn't find TextMessage with id=35)
Notice the timestamps. These two log lines are only about 58 microseconds apart, so I'm thinking it's a race condition. We're using Heroku (at least for staging), so I think it might be their virtual Postgres databases that are the problem.
Has anyone had this sort of problem before? If so, how did you fix it? Are there any recommendations?

after_create is processed within the database transaction that saves the text message. Therefore the callback request that hits the other controller cannot read the text message yet. It is also not a good idea to make an external call inside a database transaction, because the transaction keeps parts of the database locked for the whole time the slow external request takes.
The simplest solution is to replace after_create with after_commit (see: http://apidock.com/rails/ActiveRecord/Transactions/ClassMethods/after_commit)
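For the model in the question the change is small. A minimal sketch (the on: :create option keeps the callback limited to newly created records):

class TextMessage < ActiveRecord::Base
  # runs only after the INSERT has actually been committed,
  # so the callback URL can find the record
  after_commit :send_sms, on: :create

  def send_sms
    call.external.third.party.api(
      :callback => sent_text_message_path(self)
    )
  end
end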
Since callbacks tend to become hard to understand (and may lead to problems when testing), I would prefer to make the call explicit by calling another method. Perhaps something like this:
# use instead of .save
def save_and_send_sms
  save and send_sms
end
Perhaps you want to send the SMS in the background, so it does not slow down the web request for the user. Search for the gems delayed_job or resque for more information.
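With delayed_job, for example, the external call could be handed to a worker from the commit callback. A rough sketch (queue_sms is just an illustrative name):

class TextMessage < ActiveRecord::Base
  after_commit :queue_sms, on: :create

  def queue_sms
    # delayed_job's delay proxy runs send_sms in a background worker;
    # with resque you would enqueue a job class instead
    delay.send_sms
  end
end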

Do you have a master/slave database setup where you always write to the master but read from a slave? This sounds like database replication lag.
We solved such problems by forcing the read to be made from the master database in the specific controller action.
Another way would be to call send_sms only after the replication has finished.
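How you force a read from the master depends on your stack. On Rails 6+ with the built-in multiple-database support it could look roughly like this (a sketch only; this API was not available at the time of the question, where gems such as Octopus filled that role):

class TextMessagesController < ActionController::Base
  def sent
    # route this lookup to the writer connection so replication lag
    # can't make the freshly created record invisible
    ActiveRecord::Base.connected_to(role: :writing) do
      @textmessage = TextMessage.find(params[:id])
      @textmessage.update(sent: true)
    end
  end
end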

Related

Sidekiq mailer job access to db before model has been saved

Probably the title is not self-explanatory; the situation is this:
# user.points: 0
user.update!(points: 1000)
UserMailer.notify(user).deliver_later # user.points = 0 => Error !!!!
The user instance is updated, and after that the Mailer is called with the user as a parameter, but in the email the change is missing: user.points is 0 instead of 1000.
But with a sleep 1 just after the update, the email is sent with the updated value, so it seems the email job runs faster than the write to the database.
# user.points: 0
user.update!(points: 1000)
sleep 1
UserMailer.notify(user).deliver_later # user.points = 1000 => OK
What's the best approach to solve this while avoiding these two possible solutions?
One solution could be calling UserMailer.notify not with the user instance but with the user's values
Another solution could be sending the mail from an after_commit callback on the user
So, is there another way to solve this keeping the user instance as the parameter and avoiding the after_commit callback?
Thanks
Remember, Sidekiq runs a copy of your Rails app in a separate process, using Redis as the medium. When you call deliver_later, it does not actually 'pass' user to the mailer job. It enqueues the job in Redis, passing along a serialized reference to user (essentially its class and ID).
When the mailer job runs in the Sidekiq process, it loads a fresh copy of user from the database. If the transaction containing your update! in the main Rails app has not yet finished committing, Sidekiq gets the old record from the database. So, it's a race condition.
(update! already wraps an implicit transaction around itself if there isn't one, so wrapping it in your own transaction is redundant, and doesn't help the race condition since nested ActiveRecord transactions commit only when the outermost transaction commits.)
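A small illustration of why the extra transaction doesn't help (not from the original code):

ActiveRecord::Base.transaction do
  user.update!(points: 1000)             # joins the surrounding transaction; nothing committed yet
  UserMailer.notify(user).deliver_later  # the job is pushed to Redis right away
end
# only here does the transaction commit -- by which time the Sidekiq worker
# may already have loaded the user and seen points = 0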
In a pinch, you could delay enqueuing the job with something hacky like .deliver_later(wait_until: 10.seconds.from_now), but your best bet is to put the mailer notification in an after_commit callback on your model.
class User < ApplicationRecord
  after_commit :send_points_mailer

  def send_points_mailer
    return unless previous_changes.include?(:points)
    UserMailer.notify(self).deliver_later
  end
end
A model's after_commit callbacks are guaranteed to run after the final transaction is committed, so, like nuking from orbit, it's the only way to be sure.
You didn't mention it, but I'm assuming you are using ActiveRecord? If so, you likely need to ensure the database transaction has been committed before your Sidekiq job is scheduled.
https://api.rubyonrails.org/v6.1.4/classes/ActiveRecord/Transactions/ClassMethods.html

how to solve NoMethodError efficiently (rails)?

I am new to Rails and notice a very odd pattern. I thought some error messages in Django were obscenely cryptic, but building my second app in Rails I notice my errors come up as NoMethodError more than 90% of the time.
How do rails people tell the difference between all these errors with the same name?
What does NoMethodError mean at its core? It seems like what you're calling in the template is misspelled, or you're accessing attributes that don't exist?
Where can this error happen? Is the only possible cause in the template/view (the html.erb file)? Or can a bad call in the controller and whatnot cause same error?
Also, what is the best debugger gem to alleviate these issues? Reading the full trace isn't too helpful for beginners, at least for a while. I'd like a first-hand account of what debugger someone uses instead of reading hype
Thank you kindly
NoMethodError means you are calling a method on an object, but that object doesn't provide such a method.
This is a fairly bad error in the sense that it reveals a poorly designed and tested application. It generally means your code is not behaving as you expect.
The way to reduce such errors is:
Make sure that when you are writing the code you are taking care of the various edge cases that may happen, not just the happy path. In other words, validate what's going on and make sure that if something is not successful (e.g. the user does not supply all the requested input) your application handles the case gracefully.
Make sure you write automated test cases that cover the behavior of each method in your codebase
Keep track of the errors. When an error occurs, write a test to reproduce the behavior, fix the code and check the test passes. That will reduce the risk of regression.
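For the last point, a regression test can be very small. A sketch using Minitest, where full_name is a hypothetical helper that once raised NoMethodError on a nil value:

require "test_helper"

class UserTest < ActiveSupport::TestCase
  test "full_name copes with a missing last name" do
    user = User.new(first_name: "Ada", last_name: nil)
    # before the fix this raised NoMethodError (nil has no strip/capitalize/etc.)
    assert_equal "Ada", user.full_name
  end
end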
This is not a Rails specific error actually. I'll try to explain what's happening at its core.
Ruby is a language that functions through message passing. Objects communicate by sending messages to each other.
For an object to respond to a message, a matching method needs to be defined. This can be defined directly on the object itself, on the object's class, on the class's parents/ancestors, or through included modules.
class MyObject
  def some_method
    puts "Yay!"
  end
end

> MyObject.new.some_method
Yay!
Objects can define method_missing to handle unexpected messages.
class MyObject
  def method_missing(name, *args, &block)
    puts name
  end
end

> MyObject.new.some_undefined_method
some_undefined_method
Without the method_missing handler, the object will raise a NoMethodError
class MyObject
end

> MyObject.new.some_undefined_method
NoMethodError: undefined method 'some_undefined_method' for #<MyObject...>
So what does this look like in Rails?
$ rails generate model User name:string
Produces this
# models/user.rb
class User < ActiveRecord::Base
end
Which has the following methods implemented among others (by ActiveRecord)
def name
end
def name=(value)
end
When you do the following:
# create a new User object
> user = User.new
#<User ... >
# call the 'name=' method on the User object
> user.name = "name"
"name"
# call the 'name' method on the User object. Note that these are 2 different methods
> user.name
"name"
> user.some_undefined_method
NoMethodError: undefined method 'some_undefined_method' for #<User...>
You'll see the same results whether you're calling it in your console, your model, your controller or in the view as they're all running the same Ruby code.
ERB view templates are slightly different in that what you enter is only evaluated as Ruby code when it's between <% %> or <%= %>. Otherwise, it gets written out to the page as text.
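A plain-Ruby illustration of that last point, using a Struct as a stand-in for the model:

require "erb"

FakeUser = Struct.new(:name)
user = FakeUser.new("Ada")

ERB.new("Hello <%= user.name %>").result(binding)
# => "Hello Ada"

ERB.new("Hello <%= user.nmae %>").result(binding)
# NoMethodError: undefined method 'nmae' for #<struct FakeUser name="Ada">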
How do rails people tell the difference between all these errors with the same name?
We usually look at the stack trace that comes back with the response (in development mode) as well as looking in the logs. In a dev environment, I am running my server in a console where I can scroll through the request.
What does NoMethodError mean at its core? It seems like what you're calling in the template is misspelled, or you're accessing attributes that don't exist?
This comes from the dynamic nature of Ruby. When Ruby determines that an object doesn't have a method with the name that was called, it looks for a method named method_missing on that object. If that isn't defined, the default implementation further up the ancestor chain is called, which raises the exception. Rails leverages this mechanism heavily in its internal dispatching.
Where can this error happen? Is the only possible cause in the template/view (the html.erb file)? Or can a bad call in the controller and whatnot cause the same error?
This error can happen wherever you have Ruby code; it has nothing to do with Rails specifically.
Also, what is the best debugger gem to alleviate these issues? Reading the full trace isn't too helpful for beginners, at least for a while. I'd like a first-hand account of what debugger someone uses instead of reading hype
An invaluable tool for debugging is the gem pry. It has many useful pluggable tools that greatly simplify debugging.
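A minimal way to use it (the pry-rails gem and the controller body here are only illustrative):

# Gemfile
gem "pry-rails", group: :development

# anywhere in a model, controller or helper:
def show
  @user = User.find(params[:id])
  binding.pry # execution stops here; inspect @user, params, call methods, etc.
end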

before_validation not being called?

I'm having a weird issue for which I can't find a logical explanation.
I'm investigating a bug and put some logging in place (through Rollbar) so I can see the evolution of some instances of one of my models.
Here is what I got:
class Connexion < ActiveRecord::Base
  before_validation :save_info_in_rollbar
  after_save :save_info_in_rollbar

  def save_info_in_rollbar
    Rollbar.log("debug", "Connexion save", :connexion_id => self.id, :connexion_details => self.attributes)
  end
end
Now I am getting loads of data in Rollbar (pretty much 2 rows every time a connexion is created/updated). But here is the weird thing: for some connexions (exactly the ones with the faulty data I am investigating), I am getting no data at all!
I don't get how it's possible for a connexion to be created and persisted to the DB, and not have any trace of the before_validation logging.
It looks like the callback is not called, but unless I'm mistaken it's supposed to be the first one in the callback order, so what could prevent it from being called?
EDIT >> Copied from a reply, as it might be relevant:
There are 3 cases in which a connexion is created or updated, and those cases are:
.connexions.create()
connexion.attr = "value"; connexion.save!
connexion.update_attributes(attr: "value")
The only cases in which the callback won’t be run are:
Explicitly skipping validations (e.g. with save(validate: false))
Using an update method that skips Ruby-land (either partially or entirely, see each method's docs) and just runs the SQL directly (e.g. update_columns, update_attribute, update_all).
But: I might be missing a case. Also, I’m assuming there isn’t a bug in ActiveRecord/ActiveModel causing this.
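To make those cases concrete (the attribute names are just placeholders):

connexion.save(validate: false)               # skips validations, so before_validation never runs
connexion.update_columns(foo: "bar")          # straight SQL UPDATE: no validations, no callbacks
connexion.update_attribute(:foo, "bar")       # skips validations (save callbacks still run)
Connexion.where(id: 1).update_all(foo: "bar") # bulk SQL UPDATE, bypasses the model entirely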
Sorry about the stupid question guys, the explanation was that we had 2 apps working on the same database, and the modification was made by the other app (which of course was not sending the Rollbar updates).
Sometimes the toughest issues have the simplest answers haha
Firstly, you don't need self in instance methods, since they already run in the scope of the instance.
Secondly, you need to check how you are saving the data to the database. You can skip callbacks in Rails: Rails 3 skip validations and callbacks
Thirdly, double check the data.

How to prevent parallel Sidekiq jobs from executing code in Rails

I have around 10 workers that performs a job that includes the following:
user = User.find_or_initialize_by(email: 'some-email@address.com')

if user.new_record?
  # ... some code here that does something taking around 5 seconds or so
elsif user.persisted?
  # ... some code here that does something taking around 5 seconds or so
end

user.save
The problem is that at certain times, two or more workers run this code at exactly the same time, and I later found out that two or more Users had the same email, when I should always end up with unique emails only.
It is not possible in my situation to create a DB unique index on email, because unique emails are conditional -- some Users should have a unique email, some do not.
It is worth mentioning that my User model has uniqueness validations, but they still don't help me because, between .find_or_initialize_by and .save, there is code whose behaviour depends on whether the user object has already been created or not.
I tried pessimistic and optimistic locking, but they didn't help me -- or maybe I just didn't implement them properly. I'd welcome suggestions on that.
The only solution I can think of is to lock out the other threads (Sidekiq jobs) while these lines of code are being executed, but I'm not sure how to implement this, nor whether it is even an advisable approach.
I would appreciate any help.
EDIT
In my specific case it is going to be hard to put the email parameter in the job, as this job is a little more complex than what was described above. The job is actually an export script, and the code above is just one section of it. I also don't think it's possible to split that functionality into a separate worker, as the whole job flow should be serial and no part should be processed in parallel or asynchronously. This job is just one of the jobs managed by another job, which is ultimately managed by the master job.
Pessimistic locking is what you want but only works on a record that exists - you can't use it with new_record? because there's nothing to lock in the DB yet.
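For reference, a pessimistic lock on an already-existing user would look roughly like this (a sketch; the attribute in the block is made up):

user = User.find_by(email: 'some-email@address.com')
user&.with_lock do
  # with_lock opens a transaction and reloads the row with SELECT ... FOR UPDATE,
  # so other workers block on this row until the block finishes
  user.update!(last_export_at: Time.current)
end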
I managed to solve my problem with the following:
I found out that I can actually add a where clause to a Rails database unique index (a partial index), so I can now enforce the conditional uniqueness for the different types of Users at the database level; other concurrent jobs will then raise ActiveRecord::RecordNotUnique if the record has already been created.
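The migration for such a partial unique index could look like this (a sketch; the where condition and migration version are placeholders for whatever defines which users must be unique):

class AddPartialUniqueIndexToUsersEmail < ActiveRecord::Migration[5.2]
  def change
    # only rows matching the condition take part in the uniqueness check (PostgreSQL)
    add_index :users, :email, unique: true, where: "requires_unique_email = true"
  end
end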
The only remaining problem is the code between .find_or_initialize_by and .save, since it depends on whether the User object already exists: only one of the concurrent jobs should ever see .new_record? == true, and the others should see .persisted? == true, because one job will always be the first to create the record. That alone doesn't work, though, because the database uniqueness check is only triggered at the .save line. I solved this by moving .save before those conditions, and adding a rescue around .save that re-enqueues the same job whenever ActiveRecord::RecordNotUnique is raised, so concurrent jobs won't conflict. The code now looks like this:
user = User.find_or_initialize_by(email: 'some-email@address.com')

# capture this before saving: new_record? flips to false once the save succeeds
is_new_record = user.new_record?
is_persisted  = false

begin
  user.save
  is_persisted = user.persisted?
rescue ActiveRecord::RecordNotUnique => exception
  # another job created the same user first; try again later
  MyJob.perform_later(params_hash)
end

if is_new_record && is_persisted
  # do something if not yet created
elsif is_persisted
  # do something if already created
end
I would suggest a different architecture to bypass the problem.
How about a producer-worker model, where one master Sidekiq job gets a list of email addresses and then spawns a worker Sidekiq job for each email? Sidekiq makes this easy, with a dedicated queue for the master and workers to communicate.
Doing so, the email address becomes an input parameter of the workers, so we know by construction that workers will not stomp on each other's data.
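A rough sketch of that split (class and method names are made up):

class ExportMasterJob
  include Sidekiq::Worker

  def perform
    # fetch_emails_to_process is a hypothetical helper returning the list of emails
    fetch_emails_to_process.each do |email|
      ExportUserJob.perform_async(email)
    end
  end
end

class ExportUserJob
  include Sidekiq::Worker

  def perform(email)
    user = User.find_or_initialize_by(email: email)
    # ... per-user work; no other worker handles this email concurrently
    user.save
  end
end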

Catching errors with Ruby Twitter gem, caching methods using delayed_job: What am I doing wrong?

What I'm doing
I'm using the twitter gem (a Ruby wrapper for the Twitter API) in my app, which is run on Heroku. I use Heroku's Scheduler to periodically run caching tasks that use the twitter gem to, for example, update the list of retweets for a particular user. I'm also using delayed_job so scheduler calls a rake task, which calls a method that is 'delayed' (see scheduler.rake below). The method loops through "authentications" (for users who have authenticated twitter through my app) to update each authorized user's retweet cache in the app.
My question
What am I doing wrong? For example, since I'm using Heroku's Scheduler, is delayed_job redundant? Also, you can see I'm not catching (rescuing) any errors, so if Twitter is unreachable or a user's auth token has expired, everything chokes. That's obviously dumb and terrible, because a single error creates a failed delayed_job, which causes ripple effects for my app. I can see this is bad, but I'm not sure what the best solution is. How and where should I be catching errors?
I'll put all my code (from the scheduler down to the method being called) for one of my cache methods. I'm really just hoping for a bulleted list (and maybe some code or pseudo-code) berating me for poor coding practice and telling me where I can improve things.
I have seen this SO question, which helps me a little with the begin/rescue block, but I could use more guidance on catching errors, and on the higher-level "is this a good way to do this?" plane.
Code
Heroku Scheduler job:
rake update_retweet_cache
scheduler.rake (in my app)
task :update_retweet_cache => :environment do
  Tweet.delay.cache_retweets_for_all_auths
end
Tweet.rb, cache_retweets_for_all_auths method:
def self.cache_retweets_for_all_auths
  @authentications = Authentication.find_all_by_provider("twitter")
  @authentications.each do |authentication|
    authentication.user.twitter.retweeted_to_me(include_entities: true, count: 200).each do |tweet|
      # Actually build the cache - this is good - removing to keep this short
    end
  end
end
User.rb, twitter method:
def twitter
  authentication = Authentication.find_by_user_id_and_provider(self.id, "twitter")
  if authentication
    @twitter ||= Twitter::Client.new(:oauth_token => authentication.oauth_token, :oauth_token_secret => authentication.oauth_secret)
  end
end
Note: As I was posting this, I noticed that I'm finding all "twitter" authentications in the "cache_retweets_for_all_auths" method, then calling the "User.twitter" method, which specifically limits to "twitter" authentications. This is obviously redundant, and I'll fix it.
First, what is the exact error you are getting, and what do you want to happen when there is an error?
Edit:
If you just want to catch the errors and log them then the following should work.
def self.cache_retweets_for_all_auths
  @authentications = Authentication.find_all_by_provider("twitter")
  @authentications.each do |authentication|
    begin
      authentication.user.twitter.retweeted_to_me(include_entities: true, count: 200).each do |tweet|
        # Actually build the cache - this is good - removing to keep this short
      end
    rescue => e
      # Either create an object where the error is logged, or output it to whatever log you wish.
    end
  end
end
This way when it fails it will keep moving on to the next user while still making a note of the error. Most of the time with Twitter it's just better to do something like this than to try to deal with each error on its own. I have seen so many weird things out of the Twitter API, and so many random errors, that trying to track down every error almost always turns into a wild goose chase, though it is still good to keep track just in case.
Next, for when you should use what.
You should use a scheduler when you need something to happen based on time only, and delayed jobs when it's based on a user action but the "action" you are going to delay would take too long for a normal response. Sometimes you can just put the thing plainly in the controller.
So in other words
The scheduler will be fine as long as the time between updates, X, is greater than the time it takes for the update to happen, Y.
If X < Y, then you might want to look at calling the logic from the controller when each individual entry is accessed, instead of trying to do them all at once. The idea is that you would only update after a certain amount of time has passed. You could store the last update time either on the model itself, in a field like twitter_update_time, or in a Redis or memcache instance under a unique key for the user/auth.
But if the individual update itself still takes too long, that's when you should do the above, but instead of doing the actual update, call a delayed job.
You could even set it up so that it only updates or calls the delayed job after a certain number of views, to further limit things.
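That "update on access, at most every so often" idea could look roughly like this on the User model (a sketch; twitter_update_time is the hypothetical timestamp column mentioned above, and the 15-minute window is arbitrary):

def refresh_retweet_cache_if_stale
  return if twitter_update_time.present? && twitter_update_time > 15.minutes.ago

  # hand the slow Twitter call to a delayed_job worker instead of the request cycle
  Tweet.delay.cache_retweets_for_user(self) # hypothetical per-user variant of the cache method
  touch(:twitter_update_time)
end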
Possible Fancy Pants
Or if you want to get really fancy, you could still do it as a cron job, but have a point system based on views that weights which entries should be updated. The idea is that certain actions would add points to certain users, and if their points are over a certain amount you update them and then remove their points. That way you could target the ones you think are the most important, have the most traffic, show up in the most search results, etc.
Next, a nitpicky thing.
http://api.rubyonrails.org/classes/ActiveRecord/Batches.html
You should be using
@authentications.find_each do |authentication|
instead of
@authentications.each do |authentication|
find_each pulls in only 1000 entries at a time, so if you end up with a lot of Authentications you don't pull a crazy number of records into memory.
