Rails: Object destroy performance - ruby-on-rails

In my Rails app I have a base model, User, which has a lot of associated objects.
class User < ActiveRecord::Base
  has_many :contents, dependent: :destroy
  has_many :messages, dependent: :destroy
  has_many :ratings, dependent: :destroy
  has_many :groups, dependent: :destroy
  ...
end
When I want to remove a user from my system (destroy the user object), it takes about a minute to destroy all of its associated objects. What is the best way to handle such cases?
One way that comes to my mind is:
Destroying in delayed_job:
But until the user object is actually destroyed by the delayed job, this user should not be visible to others. I could handle this with a flag, say deleted, on the User model and exclude flagged users from results. But I use Sphinx as well, and need to make sure this user does not come up in Sphinx results either.
Is there a better way to handle such cases?

The challenge is that, as you probably already know, the .destroy method will load each of the children objects and then call their .destroy methods.
The value in this is that any callbacks on the children are evaluated before doing the final destroy. So if a child needs to clear up anything elsewhere then it will do so. Also if the dependent objects throw an error during the destroy method, the entire destroy operation will rollback and you won't end up with some half-dead object limping around.
.delete will remove the objects directly in the database without loading them into memory, which (obviously) also means their callbacks are not run.
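To make the cost difference concrete, here is a minimal plain-Ruby sketch that only models the two strategies. FakeTable and its counters are hypothetical stand-ins (not ActiveRecord APIs), and the statement counts are a simplification of what a real database session would show.

```ruby
# Plain-Ruby model of the two strategies; no Rails required.
# FakeTable is a hypothetical stand-in, not an ActiveRecord class.
class FakeTable
  attr_reader :rows, :statements, :callbacks_run

  def initialize(rows)
    @rows = rows.dup
    @statements = 0     # how many "SQL statements" were issued
    @callbacks_run = 0  # how many per-record callbacks fired
  end

  # Modelled on destroy_all: load every row, run its callbacks,
  # then delete it with its own statement.
  def destroy_all
    @statements += 1               # one SELECT to load the rows
    @rows.dup.each do |row|
      @callbacks_run += 1          # before/after_destroy hooks would fire here
      @rows.delete(row)
      @statements += 1             # one DELETE per row
    end
  end

  # Modelled on delete_all: a single DELETE, nothing loaded, no callbacks.
  def delete_all
    @rows.clear
    @statements += 1
  end
end

slow = FakeTable.new((1..1000).to_a)
slow.destroy_all

fast = FakeTable.new((1..1000).to_a)
fast.delete_all

puts "destroy_all: #{slow.statements} statements, #{slow.callbacks_run} callbacks"
puts "delete_all:  #{fast.statements} statements, #{fast.callbacks_run} callbacks"
```

The trade-off is the same one described above: the per-record path is the only one in which callbacks get a chance to run.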
If you want to speed things up you could simply use dependent: :delete_all (the has_many form of the dependent: :delete that Octopus-Paul suggests). This will be fine; however, it won't destroy dependents of those objects, so for instance if groups had messages associated with them, or ratings had comments associated with them, none of those would be destroyed.
To ensure that all downstream dependents get destroyed and any necessary callbacks are honoured, I think the best you can do is to write a custom before_destroy method which does all the clearing up but uses .delete and .delete_all in order to speed things up.
This will create legacy issues, in that someone writing code downstream won't necessarily anticipate your method; however, you can judge the risk of that. The alternative (as you say) is to use a flag and do the job asynchronously. I'd imagine that this has fewer risks in the future but may be more expensive to implement today.
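The flag-plus-background-job alternative can be sketched in plain Ruby as follows. Everything here is a hypothetical stand-in: the User struct and UserStore play the role of a model with a boolean deleted column, and the Queue and Thread play the role of delayed_job. A search index such as Sphinx would presumably need the same flag as a filter, which this sketch does not cover.

```ruby
# Hypothetical in-memory stand-ins; in a Rails app these would be a User
# model with a `deleted` column and a delayed_job (or similar) worker.
User = Struct.new(:id, :name, :deleted)

class UserStore
  def initialize(users)
    @users = users
    @lock = Mutex.new
  end

  # What every listing query should go through: flagged users are hidden.
  def visible
    @lock.synchronize { @users.reject(&:deleted) }
  end

  # Fast path, run during the request: only set the flag.
  def soft_delete(id)
    @lock.synchronize { @users.find { |u| u.id == id }.deleted = true }
  end

  # Slow path, run by the worker: actually remove the record.
  def purge(id)
    @lock.synchronize { @users.reject! { |u| u.id == id } }
  end

  def size
    @lock.synchronize { @users.size }
  end
end

store = UserStore.new([User.new(1, "alice", false), User.new(2, "bob", false)])
jobs = Queue.new

# Background worker: pops user ids and performs the expensive removal.
worker = Thread.new do
  while (id = jobs.pop) != :stop
    store.purge(id)
  end
end

store.soft_delete(1)   # the user disappears from results immediately
jobs << 1              # the expensive destroy is deferred
visible_now = store.visible.map(&:name)

jobs << :stop
worker.join

puts visible_now.inspect
puts store.size
```

The point of the design is that the user-facing request only pays for one flag update, while the slow cleanup happens whenever the worker gets to it.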

:dependent
Controls what happens to the associated object when its owner is destroyed:
:destroy causes the associated object to also be destroyed
:delete causes the associated object to be deleted directly from the database (so callbacks will not execute)
Delete will be much faster because it will simply run a database query for each association of the deleted user.
Find more options here: http://guides.rubyonrails.org/association_basics.html#options-for-has-one-dependent

Related

Prevent object destruction except by parent

Ruby 2.1.2 and Rails 4: I have a parent object and a child object. I have a before_destroy callback for the child object that may prevent its destruction based on a flag. However, I also need its parent to be able to destroy it via a dependent: :destroy relationship.
How can I check the source of its destruction in my validation?
I found marked_for_destruction? and a host of related questions here, but none seem concerned with the before_destroy callback, which runs before the object (or even its parent) are marked for destruction. I've been prying through what's accessible in the callback for a while now and can't seem to find anything.
I could obviously go with dependent: :delete instead, although that seems like it misses the point. I'm sure I could come up with something else like doing a before_destroy on the parent, and then calling a monkey-patched destroy method with some arguments or some such thing, but it also seems to miss the point.
Any suggestions? Is there some property on the parent that I'm missing, or a way to trace the destroy call's source or something? Thanks in advance!

Race conditions in AR destroy callbacks

I seem to have a race condition in my Rails app. While deleting a user and all of the associated models that depend on it, new associated models are sometimes created by the user. User deletions can take a while if we're deleting a lot of content, so it makes sense that race conditions would exist here. This ends up creating models that point to a user that doesn't exist.
I've tried fixing this by creating a UserDeletion model, which acts as a sort of mutex lock. Before it starts deleting the user, it'll create a new UserDeletion record. When a user tries to create new content, it checks to make sure an associated UserDeletion record doesn't exist. After it's done, it deletes it.
This hasn't solved the problem, though, so I'm wondering how other people have handled similar issues with AR callbacks and race conditions.
First of all, when there is a lot of associated content, we moved to a manual delete process using SQL DELETE instead of Rails destroy. (Though this may not work for You if You have carelessly introduced a lot of callback dependencies that do something after a record is destroyed.)
def custom_delete
  self.class.transaction do
    related_objects.delete_all
    related_objects_2.delete_all
    delete
  end
end
If You find Yourself writing this all the time, You can simply wrap it inside a class method that accepts a list of related_objects keys to delete.
class ActiveRecord::Base
  class << self
    # Defines #custom_delete, which bulk-deletes the listed associations
    # and then the record itself, all inside one transaction.
    def bulk_delete_related(*args)
      define_method "custom_delete" do
        ActiveRecord::Base.transaction do
          args.each do |field|
            send(field).delete_all
          end
          delete
        end
      end
    end
  end
end
class SomeModel < ActiveRecord::Base
  bulk_delete_related :related_objects, :related_objects2, :related_object
end
I inserted the class method inside ActiveRecord::Base class directly, but probably You should better extract it to module. Also this only speeds things up, but does not resolve the original problem.
Secondly, You can introduce FK constraints (we did that to ensure integrity, as we do a lot of custom SQL). With them, the User can't be deleted as long as there are linked objects, though that might not be what You want. To make this solution more effective, You can always delegate user deletion to a background job that retries deleting the user until it can actually be deleted (no new objects dropped in).
Alternatively, You can do it the other way around, as we did at my previous job in some cases. If it's more important to delete the user than to be sure that there are no zombie records, use a sweeper process to clean up from time to time.
Finally, the truth is somewhere in the middle: apply constraints to relations that definitely need to be cleaned up before removing the user, and just rely on the sweeper to remove less important ones that shouldn't interfere with user deletion.
Problem is not trivial but it should be solvable to some extent.
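A rough plain-Ruby sketch of the "FK constraint plus retrying background job" idea follows. FakeDb and ForeignKeyViolation are hypothetical stand-ins that only mimic how a database with foreign keys would refuse the delete; the block passed to delete_user_with_retry is just a hook to simulate a row being created mid-deletion.

```ruby
# Hypothetical stand-ins: ForeignKeyViolation mimics the error a database
# raises when a parent row still has children; FakeDb is not a real adapter.
class ForeignKeyViolation < StandardError; end

class FakeDb
  def initialize
    @children = Hash.new(0)   # user_id => number of dependent rows
    @users = {}
  end

  def add_user(id)
    @users[id] = true
  end

  # Children can only point at an existing user, as an FK would enforce.
  def add_child(user_id)
    raise ForeignKeyViolation, "no such user" unless @users.key?(user_id)
    @children[user_id] += 1
  end

  def delete_children(user_id)
    @children.delete(user_id)
  end

  # The user cannot be deleted while dependent rows exist.
  def delete_user(user_id)
    raise ForeignKeyViolation, "user still has children" if @children[user_id] > 0
    @users.delete(user_id)
  end

  def user?(id)
    @users.key?(id)
  end
end

# Job body: clear children, try to delete the user, and start over if
# new rows slipped in between the two steps.
def delete_user_with_retry(db, user_id, max_attempts: 5)
  attempts = 0
  begin
    attempts += 1
    db.delete_children(user_id)
    yield attempts if block_given?   # hook to simulate concurrent writes
    db.delete_user(user_id)
    attempts
  rescue ForeignKeyViolation
    retry if attempts < max_attempts
    raise
  end
end

db = FakeDb.new
db.add_user(1)
db.add_child(1)

attempts = delete_user_with_retry(db, 1) do |n|
  db.add_child(1) if n == 1   # a row sneaks in during the first attempt
end

puts attempts
puts db.user?(1)
```

Once the user row is gone, the constraint also rejects any late insert, so no zombie records can appear afterwards.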

delete_all vs destroy_all?

I am looking for the best approach to delete records from a table. For instance, I have a user whose user ID is across many tables. I want to delete this user and every record that has his ID in all tables.
u = User.find_by_name('JohnBoy')
u.usage_indexes.destroy_all
u.sources.destroy_all
u.user_stats.destroy_all
u.delete
This works and removes all references of the user from all tables, but I heard that destroy_all was very process heavy, so I tried delete_all. It only removes the user from his own user table; the id in all the other tables is set to null, but the records themselves are left intact. Can someone share what the correct process is for performing a task like this?
I see that destroy_all calls the destroy function on all associated objects but I just want to confirm the correct approach.
You are right. If you want to delete the User and all associated objects, use destroy_all.
However, if you just want to delete the User without removing all associated objects, use delete_all.
According to this post : Rails :dependent => :destroy VS :dependent => :delete_all
destroy / destroy_all: The associated objects are destroyed alongside this object by calling their destroy method
delete / delete_all: All associated objects are destroyed immediately without calling their :destroy method
delete_all is a single SQL DELETE statement and nothing more. destroy_all calls destroy() on all matching results of :conditions (if you have one) which could be at least NUM_OF_RESULTS SQL statements.
If you have to do something drastic such as destroy_all() on large dataset, I would probably not do it from the app and handle it manually with care. If the dataset is small enough, you wouldn't hurt as much.
To avoid destroy_all instantiating all the records and destroying them one at a time, you can run delete_all directly from the model class.
So instead of:
u = User.find_by_name('JohnBoy')
u.usage_indexes.destroy_all
You can do:
u = User.find_by_name('JohnBoy')
UsageIndex.where(user_id: u.id).delete_all
The result is a single DELETE query that removes all the associated records, without instantiating them or running their callbacks.
I’ve made a small gem that can alleviate the need to manually delete associated records in some circumstances.
This gem adds a new option for ActiveRecord associations:
dependent: :delete_recursively
When you destroy a record, all records that are associated using this option will be deleted recursively (i.e. across models), without instantiating any of them.
Note that, just like dependent: :delete or dependent: :delete_all, this new option does not trigger the around/before/after_destroy callbacks of the dependent records.
However, it is possible to have dependent: :destroy associations anywhere within a chain of models that are otherwise associated with dependent: :delete_recursively. The :destroy option will work normally anywhere up or down the line, instantiating and destroying all relevant records and thus also triggering their callbacks.

Is it possible to manipulate a has_many through in memory only?

Platform: Rails 3
Requirement:
A form (businesses#edit) that allows a user to submit an update to a Business record (businesses#update.) When submitted businesses#update will not persist the changes to the database but will instead send an email with the new information for manual review.
How it's accomplished:
Load the Business model, update the properties in memory, pass it to an ActionMailer, finish without ever saving the model.
Problem:
The Business model has a has_many :business_categories, :through => :business_categories_businesses (which is itself a has_many :business_categories_businesses), and whenever I manipulate Business.business_categories (for example, @business.business_categories = BusinessCategory.where(:id => params[:business][:business_categories])), the changes are immediately persisted to the database.
I cannot find any way of manipulating that collection in memory only so that it can be passed to an action mailer. In the short term I'm going to hack it by not doing the assignment and simply passing the new collection of BusinessCategories to the ActionMailer to work out itself, but now this is just annoying me and I defer to the collective wisdom of the crowd:
Is it possible to manipulate has_many relations in memory only, and if so, how? Please save my sanity.
Thank you in advance!

Evaluating :dependent => :destroy

In Rails 2.2.2 (ruby 1.8.7-p72), I'd like to evaluate the impact of destroying an object before actually doing it. I.e. I would like to be able to generate a list of all objects that will be affected by :dependent => :destroy (via an object's associations). The real problem I'm trying to solve is to give a user a list of everything that will be deleted and having them confirm the action.
Can anyone recommend a good way to go about this? I've just started looking into ActiveRecord::Associations, but I haven't made much headway.
Update: In my particular case, I've got various levels of objects (A --> B --> C).
This should help get you started... Obviously you'll have to customize it but this lists all association names that are dependent destroy on the class BlogEntry:
BlogEntry.reflect_on_all_associations.map do |association|
  if association.options[:dependent] == :destroy
    # do something here...
    association.name
  end
end.compact
# => [:taggings, :comments]
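For several levels of objects (A --> B --> C), the same idea has to be applied recursively. Below is a plain-Ruby sketch of that walk. The DEPENDENT_DESTROY hash is a hypothetical stand-in for what the reflection call would report for each model, so the traversal can be shown without a Rails app; wiring it to real reflection data is left as an assumption.

```ruby
require "set"

# Hypothetical association map standing in for reflection data:
# model name => models associated with :dependent => :destroy.
DEPENDENT_DESTROY = {
  "A" => ["B"],
  "B" => ["C"],
  "C" => []
}.freeze

# Walk the map depth-first and return every model reachable from `root`.
def affected_models(root, map, seen = Set.new)
  map.fetch(root, []).each do |child|
    next if seen.include?(child)   # guards against association cycles
    seen << child
    affected_models(child, map, seen)
  end
  seen.to_a
end

puts affected_models("A", DEPENDENT_DESTROY).inspect
```

In a real app each step would also need to load the affected records themselves (not just the model names) in order to show them to the user for confirmation.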
Just manually maintain a list of associated objects with dependent destroy (probably a good thing to do anyway) and then have named_scopes for each to pull in the included objects to display.
As mentioned, I'd have a way of displaying the affected records to the user, and then two buttons/links: one that lists all the records that will be affected, and one that performs the delete, perhaps with a confirm alert asking whether they have checked that list.
Then, if you want to be really sure, you could also do a soft delete by marking the records as deleted in the database (for example with a deleted_at timestamp) instead of actually deleting them, which may well come in handy. I don't know how you would handle that with the automatic dependent delete; maybe with acts_as_paranoid, or some kind of self-rolled version with a callback on the parent model.
Recently I wrote a simple Rails plugin that solves this problem.
Check it out on github: http://github.com/murbanski/affected_on_destroy/tree

Resources