Race conditions in AR destroy callbacks - ruby-on-rails

I seem to have a race condition in my Rails app. While deleting a user and all of the associated models that depend on it, new associated models are sometimes created by the user. User deletions can take a while if we're deleting a lot of content, so it makes sense that race conditions would exist here. This ends up creating models that point to a user that doesn't exist.
I've tried fixing this by creating a UserDeletion model, which acts as a sort of mutex lock. Before it starts deleting the user, it'll create a new UserDeletion record. When a user tries to create new content, it checks to make sure an associated UserDeletion record doesn't exist. After it's done, it deletes it.
This hasn't solved the problem, though, so I'm wondering how other people have handled similar issues with AR callbacks and race conditions.

First of all, when there is a lot of associated content, we moved to a manual delete process using SQL DELETE instead of Rails destroy. (This may not work for you if you have introduced a lot of callback dependencies that do something after a record is destroyed.)
def custom_delete
  self.class.transaction do
    related_objects.delete_all
    related_objects_2.delete_all
    delete
  end
end
If you find yourself writing this all the time, you can wrap it in a class method that accepts a list of related-object association names to delete.
class ActiveRecord::Base
  class << self
    def bulk_delete_related(*args)
      define_method "custom_delete" do
        ActiveRecord::Base.transaction do
          args.each do |field|
            send(field).delete_all
          end
          delete
        end
      end
    end
  end
end
class SomeModel < ActiveRecord::Base
  bulk_delete_related :related_objects, :related_objects_2, :related_object
end
I defined the class method directly on ActiveRecord::Base, but you should probably extract it into a module. Also, this only speeds things up; it does not resolve the original race condition.
Secondly, you can introduce foreign-key constraints (we did that to ensure integrity, as we run a lot of custom SQL). This way the User cannot be deleted as long as linked objects exist, though that might not be what you want. To make this approach more effective, you can delegate user deletion to a background job that retries until the user can actually be deleted (i.e., until no new objects have dropped in).
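For concreteness, here is a minimal sketch of that retry idea, assuming Sidekiq, Rails 4.2+ (for add_foreign_key constraints), and the custom_delete method from above; the job name and retry interval are illustrative, not from the original answer.
# Hypothetical Sidekiq worker: retries until the FK constraints stop
# blocking the delete (i.e., no new dependent rows are being created).
class DeleteUserJob
  include Sidekiq::Worker

  def perform(user_id)
    user = User.find_by(id: user_id)
    return if user.nil? # already deleted

    user.custom_delete
  rescue ActiveRecord::InvalidForeignKey
    # A dependent record slipped in mid-delete; try again shortly.
    DeleteUserJob.perform_in(30.seconds, user_id)
  end
end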
Alternatively, you can go the other way around, as we did in some cases at my previous job: if deleting the user matters more than guaranteeing there are no zombie records, run a sweeper process that cleans them up from time to time.
Finally, the truth is somewhere in the middle: apply constraints to the relations that definitely need to be cleaned up before removing a user, and rely on the sweeper for the less important ones that shouldn't block user deletion.
The problem is not trivial, but it should be solvable to some extent.

Related

Ruby On Rails: Disable `delete_all` when table name is present

delete_all is useful, but I never want to see it called on the same line as a table name. I'd like to disable things like TableName.destroy_all in both console and code.
One interesting issue happened earlier this month:
Application.destroy_all was called on a model instead of applications.destroy_all
(the model has_many applications)
For somebody new to Rails, the two look very similar, but the results were disastrous.
I'm open to some form of lint/code-style tool, but that wouldn't catch it in the console scenario. (Plus, I haven't been able to get RuboCop to do something like this yet.)
Basically, I'm asking for a way to make the console and codebase more secure so that newer developers can't inadvertently delete everything in a table.
I'm not entirely clear on what you are trying to accomplish, but you could try overriding the method in your ApplicationModel with something like this (assuming Rails 5 or greater, or an existing root model otherwise).
class ApplicationModel < ActiveRecord::Base
  def self.destroy_all(*args)
    raise('Cannot destroy all records of a model this way. Did you mean to delete a subset of records instead?')
  end
end
Possibly make this method private if you'd like it even harder to run...
def self.destroy_all(*args)
  raise('Cannot destroy all records of a model this way. Did you mean to delete a subset of records instead?')
end
private_class_method :destroy_all
You could get fancy and allow this to be bypassed with a special argument that you check for, but give this a try and see how it goes.
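For instance, a hedged sketch of that bypass idea (the keyword argument name is made up):
class ApplicationModel < ActiveRecord::Base
  self.abstract_class = true

  # Blocks Model.destroy_all unless the caller explicitly opts in.
  def self.destroy_all(*args, i_really_mean_it: false)
    unless i_really_mean_it
      raise 'Cannot destroy all records of a model this way. ' \
            'Pass i_really_mean_it: true if you are certain.'
    end
    super(*args)
  end
end

# SomeModel.destroy_all                         # raises
# SomeModel.destroy_all(i_really_mean_it: true) # actually destroys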

How to avoid a circular loop

I think I'm being dense here because I keep getting a stack level too deep error...
I have Child and Parent relational objects. I want 2 things to happen:
if you try to update the Child, you cannot update its status_id to 1 unless it has a Parent association
if you create a Parent and then attach it to the Child, then the Child's status should be auto-set to 1.
Here's how the Parent association gets added:
parent = Parent.new
if parent.save
  child.update_attributes(parent_id: 1)
end
I have these callbacks on the Child model:
validate :mark_complete
after_update :set_complete
# this callback is here because there is a way to update the Child model attributes
def mark_complete
  if self.status_id == 1 && self.parent.blank?
    errors[:base] << ""
  end
end

def set_complete
  if self.logistic.present?
    self.update_attribute(:status_id, 1)
  end
end
The code above is actually not that efficient, because it makes 2 db hits when ideally it would be 1, done all at once. But I find it too brain-draining to figure out why... I'm not sure why it's not even working, and therefore can't even begin to think about making this a single db transaction.
EXAMPLE
Hopefully this helps clarify. Imagine a Charge model and an Item model. Each Item has a Charge. The Item also has an attribute paid. Two things:
If you update the Item, you cannot update paid to true until the Item has been associated with a Charge object
If you link a Charge object to an Item by updating the charge_id attribute on the Item, then the code should save you time and auto-set paid to true
There's a lot that I find confusing here, but it seems to me that you call :set_complete after_update and within set_complete you are updating attributes, thus you seem to have a perpetual loop there. There might be other loops that I can't see but that one stands out to me.
One way to avoid a circularly recursive situation like this is to provide a flag as a parameter (or otherwise) that will stop the loop from continuing.
In this case (though I am not sure about the case entirely), I think you could provide a flag indicating the origin of the call. If the origin of the update is a charge being attached, pass a flag that stops the check from happening, or modify the check to keep the loop from happening. Perhaps a secondary set of logic is in order for such a case?
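A minimal sketch of that flag idea, applied to the models from the question (the accessor name and the attach_parent! helper are made up for illustration):
class Child < ActiveRecord::Base
  belongs_to :parent

  # Transient, non-persisted flag marking that this update originates
  # from a parent being attached, so the validation can stand down.
  attr_accessor :via_parent_attachment

  validate :mark_complete, unless: :via_parent_attachment

  def attach_parent!(parent)
    self.via_parent_attachment = true
    # One save sets both attributes: a single db write, no after_update needed.
    update_attributes(parent_id: parent.id, status_id: 1)
  ensure
    self.via_parent_attachment = false
  end

  private

  def mark_complete
    errors[:base] << 'cannot be complete without a parent' if status_id == 1 && parent.blank?
  end
end
Setting parent_id and status_id in a single call also collapses the two writes mentioned in the question into one.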
I faced a stack level too deep problem some time back when working with ActiveRecord callbacks.
In my case the problem was with update_attribute: after the update goes through, the callback (set_complete in your case) is called again, which triggers update_attribute again, and this repeats endlessly.
I got around it by using update_column instead, which does not trigger any callbacks or validations; however, setting a flag is what was advised more often online.
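Applied to the code in the question, that looks something like this:
# update_column writes straight to the database, skipping callbacks and
# validations, so set_complete cannot re-trigger itself.
def set_complete
  if self.logistic.present?
    self.update_column(:status_id, 1)
  end
end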
At this point I do not have an answer for reducing your database write operations, and will add to this answer if I can think of anything.
Hope this helps

Observing conditions across multiple models

I'm building a flow whereby a user can administer an event, specifically doing the following:
Register attendees
Attach photos
Attach fitness information
Each of these currently happens in a separate controller, and can happen in any order.
Having completed all three, I'd then like to generate an email out to all attendees with links to the photos, etc.
I'm having trouble finding the best approach to check the three conditions listed above. Currently, I'm approaching it with a service called GenerateEmailsToAttendees with a method .try. This checks the conditions and, if all are met, generates the emails, e.g.:
class GenerateEmailsToAttendees
  def try(event)
    if event.has_some_fitness_activities? and event.has_some_attendees? and event.has_some_photos?
      event.attendances.each do |attendance|
        attendance.notify_user_about_write_up
      end
    end
  end
end
The problem now is that I have this GenerateEmailsToAttendees scattered across three controllers (AttendeesController#register, PhotosController#attach and FitnessInfoController#attach). I also run the risk of duplicating the notifications to the users.
Is there a better way? Could I use an observer to watch for the three conditions being met?
I can provide more information on the model structure if it's useful.
Thanks!
How about moving your observer to a cron job? I.e., remove it from all three controllers, put it in a rake task, and schedule it to run every week/day/hour etc. on all events that have met the conditions. You should probably set a flag on the event once the email has been generated so you don't spam the same user twice. I understand this might not be realtime, but it'll definitely solve your problem. I would recommend https://github.com/javan/whenever for managing your cron jobs.
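For example, with whenever that could look roughly like this, reusing the predicate methods from the question (the emails_sent flag column and the task name are made up):
# config/schedule.rb
every 1.hour do
  rake 'events:notify_attendees'
end

# lib/tasks/events.rake
namespace :events do
  desc 'Email attendees of events that have met all three conditions'
  task notify_attendees: :environment do
    Event.where(emails_sent: false).find_each do |event|
      next unless event.has_some_fitness_activities? &&
                  event.has_some_attendees? &&
                  event.has_some_photos?

      event.attendances.each(&:notify_user_about_write_up)
      event.update(emails_sent: true) # flag so nobody is emailed twice
    end
  end
end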
I would put this into an after_save callback: then Rails will just take care of it automatically. You will probably need some system to ensure that this only happens once. I would do something like this:
add a new boolean field to track whether the event has all of the required "stuff" done in order to send out the email, eg "published"
when the various things that can make an event "published" happen, call a method in the Event model which tests if the event is ready to be published and currently NOT published: if it is, then update the model to be published and send the email.
e.g. (I'm guessing at your join table names here):
#app/models/event_attendance.rb
after_create :is_event_publishable?

def is_event_publishable?
  self.event.publishable?
end

#app/models/event_fitness_activity.rb
after_create :is_event_publishable?

def is_event_publishable?
  self.event.publishable?
end

#app/models/event_photo.rb
after_create :is_event_publishable?

def is_event_publishable?
  self.event.publishable?
end

#app/models/event.rb
def publishable?
  if !self.published && self.fitness_activities.size > 0 && self.attendances.size > 0 && self.photos.size > 0
    # mark as published first so the notifications only ever fire once
    self.update_attribute(:published, true)
    self.attendances.each do |attendance|
      attendance.notify_user_about_write_up
    end
  end
end
Now you don't need anything to do with this at all in your controllers. Generally I'm in favour of keeping controllers as absolutely standard as possible.
Yes, you can create an observer that watches multiple models with a single after_save callback, using something like:
class MultiModelObserver < ActiveRecord::Observer
  observe :account, :balance

  def after_save(record)
    # make your checks here
  end
end

Record changes pending approval by a privileged user; it's like versioning combined with approvals

I have a requirement that certain attribute changes to records are not reflected in the user interface until those changes are approved. Further, if a change is made to an approved record, the user will be presented with the record as it exists before approval.
My first try...
was to go to a versioning plugin such as paper_trail, acts_as_audited, etc. and add an approved attribute to their version model. Doing so would not only give me the ability to 'rollback' through versions of the record, but also SHOULD allow me to differentiate between whether a version has been approved or not.
I have been working down this train of thought for a while now, and the problem I keep running into is on the user side. That is, how do I query for a collection of approved records? I could (and did try to) write some helper methods that get a collection of records and then loop over them to find an "approved" version of each record. My primary gripe with this is how quickly the number of database hits can grow. My next attempt was to do something as follows:
Version.
  where(:item_type => MyModel.name, :approved => true).
  group(:item_type).collect do |v|
    # like the 'reify' method of paper_trail
    v.some_method_that_converts_the_version_to_a_record
  end
So assuming that the some_method... call doesn't hit the database, we kind of end up with the data we're interested in. The main problem I ran into with this method is that I can't use this "finder" as a scope. That is, I can't append additional scopes to this lookup to narrow my results further. For example, my records may also have a cool scope that only shows records where :cool => true. Ideally, I would want to look up my records as MyModel.approved.cool, but here I guess I would have to get my collection of approved models and then loop over them for cool ones, which at the very least would result in a bunch of records being initialized in memory for no reason.
My next try...
involved creating a special type of "pending record" that basically holds "potential" changes to a record. On the user end you would look up whatever you wanted as you normally would. Whenever a pending record is apply!(ed), it simply makes those changes to the actual record, and all's well... Except about 30 minutes in I realized it all breaks down if an "admin" wishes to go back and contribute more to his change before approving it. I guess my only options would be either to:
Force the admin to approve all changes before making additional ones (that won't go over well... nor should it).
Try to read the changes out of the "pending record" model and apply them to the existing record without saving. Something about this idea just doesn't quite sound "right".
I would love someone's input on this issue. I have been wrestling with it for some time, and I just can't seem to find the way that feels right. I like to live by the "if it's hard to get your head around it, you're probably doing it wrong" mantra.
And this is kicking my tail...
How about creating an association:
class MyModel < AR::Base
  belongs_to :my_model
  has_one :new_version, :class_name => 'MyModel'
  # ...
end
When an edit is made, you basically clone the existing object to a new one. Associate the existing object and the new one, set a has_edits attribute on the existing object, and set a pending_approval attribute on the new one.
How you treat the objects once the admin approves it depends on whether you have other associations that depend on the id of the original model.
In any case, you can reduce your queries to:
objects_pending_edits = MyModel.where("has_edits = true").all
then with any given one, you can access the new edits with obj.new_version. If you're really wanting to reduce database traffic, eager-load that association.
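E.g., a quick sketch of the eager-loaded lookup:
# includes(:new_version) loads all the pending edits in one extra query
# instead of one query per object (the classic N+1).
objects_pending_edits = MyModel.includes(:new_version).where(has_edits: true)
objects_pending_edits.each do |obj|
  obj.new_version # already in memory, no additional query
end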

What is the best way to override Rails ActiveRecord destroy behavior?

I have an application where I would like to override the behavior of destroy for many of my models. The use case is that users may have a legitimate need to delete a particular record, but actually deleting the row from the database would destroy referential integrity that affects other related models. For example, a user of the system may want to delete a customer with whom they no longer do business, but transactions with that customer need to be maintained.
It seems I have at least two options:
Duplicate data into the necessary models, effectively denormalizing my data model so that deleted records won't affect related data.
Override the "destroy" behavior of ActiveRecord to do something like set a flag indicating the user "deleted" the record and use this flag to hide the record.
Am I missing a better way?
Option 1 seems like a horrible idea to me, though I'd love to hear arguments to the contrary.
Option 2 seems somewhat Rails-ish but I'm wondering the best way to handle it. Should I create my own parent class that inherits from ActiveRecord::Base, override the destroy method there, then inherit from that class in the models where I want this behavior? Should I also override finder behavior so records marked as deleted aren't returned by default?
If I did this, how would I handle dynamic finders? What about named scopes?
If you're not actually interested in seeing those records again, but only care that the children still exist when the parent is destroyed, the job is simple: add :dependent => :nullify to the has_many call to set references to the parent to NULL automatically upon destruction, and teach the view to deal with that reference being missing. However, this only works if you're okay with not ever seeing the row again, i.e. viewing those transactions shows "[NO LONGER EXISTS]" under company name.
If you do want to see that data again, then what you want has nothing to do with actually destroying records; destroying a record means you never need to refer to it again. Hiding seems to be the way to go.
Instead of overriding destroy, since you're not actually destroying the record, it seems significantly simpler to put your behavior in a hide method that triggers a flag, as you suggested.
From there, whenever you want to list these records and only include visible records, one simple solution is to include a visible scope that doesn't include hidden records, and not include it when you want to find that specific, hidden record again. Another path is to use default_scope to hide hidden records and use Model.with_exclusive_scope { find(id) } to pull up a hidden record, but I'd recommend against it, since it could be a serious gotcha for an incoming developer, and fundamentally changes what Model.all returns to not at all reflect what the method call suggests.
I understand the desire to make the controllers look like they're doing things the Rails way, but when you're not really doing things the Rails way, it's best to be explicit about it, especially when it's really not that much of a pain to do so.
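To make that concrete, a minimal sketch of the hide-plus-scope approach, assuming a boolean hidden column (the column and method names are illustrative):
class Customer < ActiveRecord::Base
  # Everyday listings use this scope; omit it to reach hidden records.
  scope :visible, -> { where(hidden: false) }

  # Soft-"destroy": flip the flag instead of deleting the row, so the
  # customer's transactions keep their referential integrity.
  def hide!
    update!(hidden: true)
  end
end

# Customer.visible.order(:name)  # normal listing, hidden records excluded
# Customer.find(params[:id])     # still reaches a hidden record explicitly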
I wrote a plugin for this exact purpose, called paranoia. I "borrowed" the idea from acts_as_paranoid and basically re-wrote AAP using much less code.
When you call destroy on a record, it doesn't actually delete it. Instead, it will set a deleted_at column in your database to the current time.
The README on the GitHub page should be helpful for installation & usage. If it isn't, then let me know and I'll see if I can fix that for you.
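For reference, basic usage looks roughly like this (see the README for the authoritative, current API):
class Client < ActiveRecord::Base
  acts_as_paranoid # requires a deleted_at datetime column on the table
end

client = Client.first
client.destroy        # soft-delete: stamps deleted_at instead of deleting
Client.all            # default scope excludes soft-deleted rows
Client.with_deleted   # includes soft-deleted rows again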
