delete_all vs destroy_all? - ruby-on-rails

I am looking for the best approach to delete records from a table. For instance, I have a user whose ID appears across many tables. I want to delete this user and every record that references his ID in those tables.
u = User.find_by_name('JohnBoy')
u.usage_indexes.destroy_all
u.sources.destroy_all
u.user_stats.destroy_all
u.delete
This works and removes all references to the user from all tables, but I heard that destroy_all was very process heavy, so I tried delete_all. It only removes the user from his own users table; in the other tables the user's ID is set to null, but the records themselves are left intact. Can someone share what the correct process is for performing a task like this?
I see that destroy_all calls the destroy function on all associated objects but I just want to confirm the correct approach.

You are right. If you want to delete the User and all associated objects -> destroy_all
However, if you just want to delete the User without deleting its associated objects -> delete_all
According to this post: Rails :dependent => :destroy VS :dependent => :delete_all
destroy / destroy_all: The associated objects are destroyed alongside this object by calling their destroy method
delete / delete_all: All associated objects are deleted immediately without calling their :destroy method
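To make the difference concrete, here is a small hedged sketch (the model and association names are illustrative, not from the question):

class User < ActiveRecord::Base
  # :destroy - each comment is instantiated and its destroy method (and
  # callbacks) runs, one record at a time
  has_many :comments, :dependent => :destroy

  # :delete_all - the associated rows are removed with a single SQL DELETE,
  # skipping their callbacks entirely
  has_many :log_lines, :dependent => :delete_all
end

Calling user.destroy would then destroy each comment individually but remove all the log lines in one statement.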

delete_all is a single SQL DELETE statement and nothing more. destroy_all calls destroy() on all matching results of :conditions (if you have one), which could be at least NUM_OF_RESULTS SQL statements.
If you have to do something drastic such as destroy_all() on a large dataset, I would probably not do it from the app and would instead handle it manually, with care. If the dataset is small enough, it won't hurt as much.
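Roughly, the SQL generated looks like this (a hedged sketch using the Rails 3+ where syntax; the IDs are made up for illustration):

UsageIndex.where(:user_id => 1).delete_all
# DELETE FROM usage_indexes WHERE user_id = 1

UsageIndex.where(:user_id => 1).destroy_all
# SELECT * FROM usage_indexes WHERE user_id = 1   -- load every matching row
# DELETE FROM usage_indexes WHERE id = 10         -- then one DELETE per record,
# DELETE FROM usage_indexes WHERE id = 11         -- plus each record's callbacks
# ...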

Rather than loading the parent record and walking each association, you can call destroy_all directly on the model class.
So instead of :
u = User.find_by_name('JohnBoy')
u.usage_indexes.destroy_all
You can do :
u = User.find_by_name('JohnBoy')
UsageIndex.destroy_all "user_id = #{u.id}"
The result is a single call that finds and destroys all of the associated records (note that destroy_all still instantiates and destroys each record individually).
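Note that passing a conditions string to destroy_all is the older Rails API; on newer Rails versions (where that form was removed) the equivalent would be roughly:

UsageIndex.where(:user_id => u.id).destroy_all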

I’ve made a small gem that can alleviate the need to manually delete associated records in some circumstances.
This gem adds a new option for ActiveRecord associations:
dependent: :delete_recursively
When you destroy a record, all records that are associated using this option will be deleted recursively (i.e. across models), without instantiating any of them.
Note that, just like dependent: :delete or dependent: :delete_all, this new option does not trigger the around/before/after_destroy callbacks of the dependent records.
However, it is possible to have dependent: :destroy associations anywhere within a chain of models that are otherwise associated with dependent: :delete_recursively. The :destroy option will work normally anywhere up or down the line, instantiating and destroying all relevant records and thus also triggering their callbacks.
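A minimal sketch of how the option is declared, assuming models like those in the original question (the Entry model is illustrative):

class User < ActiveRecord::Base
  has_many :sources, dependent: :delete_recursively
end

class Source < ActiveRecord::Base
  has_many :entries, dependent: :delete_recursively
end

# user.destroy
# => deletes the user's sources and those sources' entries in bulk,
#    without instantiating them or running their destroy callbacks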

Related

Rails: Object destroy performance

In my Rails app I have my base model, User. There are a lot of objects associated with the user.
class User < ActiveRecord::Base
  has_many :contents, dependent: :destroy
  has_many :messages, dependent: :destroy
  has_many :ratings, dependent: :destroy
  has_many :groups, dependent: :destroy
  ...
end
When I want to remove a user from my system (destroy the user object), it takes about a minute to destroy all the associated objects. What is the best way to handle such cases?
One way that comes to my mind is:
Destroying in delayed_job:
But until the user object gets destroyed in the delayed job, this user should not be visible to others. I could handle this case by having a flag, say deleted, in the user model and excluding flagged users from results. But I use Sphinx as well, and I need to make sure this user does not come up in Sphinx results either.
Is there a better way to handle such cases?
The challenge is that, as you probably already know, the .destroy method will load each of the child objects and then call their .destroy methods.
The value in this is that any callbacks on the children are evaluated before doing the final destroy. So if a child needs to clear up anything elsewhere then it will do so. Also, if the dependent objects throw an error during the destroy method, the entire destroy operation will roll back and you won't end up with some half-dead object limping around.
.delete will remove the objects without loading them into memory, so (obviously) it won't perform their callbacks.
If you want to speed things up you could simply use dependent: :delete_all, as Octopus-Paul suggests. This will be fine; however, it won't destroy the dependents of those objects, so for instance if groups had messages associated with them, or perhaps ratings had comments associated with them, none of those would be destroyed.
To ensure that all downstream dependents get destroyed and any necessary callbacks are honoured, I think the best you can do is to write a custom before_destroy method which does all the clearing up but uses .delete and .delete_all in order to speed things up.
This will create legacy issues in that someone writing code downstream won't necessarily anticipate your method; however, you can judge the risk of that. The alternative (as you say) is to use a flag and do the job asynchronously. I'd imagine that this has fewer risks in the future but may be more expensive to implement today.
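A hedged sketch of that before_destroy approach (association names are taken from the question; the callback name is illustrative):

class User < ActiveRecord::Base
  has_many :contents
  has_many :messages
  has_many :ratings
  has_many :groups

  before_destroy :fast_cleanup

  private

  def fast_cleanup
    # one bulk DELETE per association, skipping per-record callbacks;
    # anything those callbacks would normally do (counter caches, search
    # index updates, nested dependents, ...) has to be handled here too
    contents.delete_all
    messages.delete_all
    ratings.delete_all
    groups.delete_all
  end
end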
:dependent
Controls what happens to the associated object when its owner is destroyed:
:destroy causes the associated object to also be destroyed
:delete causes the associated object to be deleted directly from the database (so callbacks will not execute)
Delete will be much faster because it will simply run a database query for each association of the deleted user.
Find more options here: http://guides.rubyonrails.org/association_basics.html#options-for-has-one-dependent
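As a reminder (not part of the quoted guide): :delete is the option for singular associations, while collections use :delete_all:

class User < ActiveRecord::Base
  has_one  :profile,  dependent: :delete      # one row, removed with a plain DELETE
  has_many :messages, dependent: :delete_all  # whole collection, one bulk DELETE
end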

model relations and bulk deletes

I have a model relation with :dependent => :destroy that has to do 50K+ deletes when the destroy is triggered. Looking at the console, Rails is trying to do an explicit delete with the ID for every single row, which is taking a while. Is there a way for me to force Rails to do a bulk delete? Or, if I remove the model dependency, is there a way to do this kind of bulk delete from the code?
Thanks
You should be able to set dependent: :delete_all
If you can't get that to work, you might want to use delete_all in your own callback.
To be clear, delete_all should generate a single statement to delete all child objects.
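A hedged sketch of both suggestions (the Post/Comment models are illustrative stand-ins for the models in the question):

# Option 1: let the association issue one bulk DELETE for the children
class Post < ActiveRecord::Base
  has_many :comments, dependent: :delete_all
end

# Option 2: keep the association as-is and do the bulk delete in a callback
class Post < ActiveRecord::Base
  has_many :comments

  before_destroy :bulk_delete_comments

  def bulk_delete_comments
    comments.delete_all   # single "DELETE FROM comments WHERE post_id = ?" statement
  end
end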

dependent deletes on polymorphic relations

I'm using a polymorphic relation with multiple tables. Object Window has ChartWindow, PluginWindow or PortletWindow. I used a class_eval (relate_to_details) technique to define detail tables so that each object can have its own table with distinct attributes.
PluginWindowDetail is the detail table for PluginWindow. PluginWindow has a plugin_id (plugin_window_details.plugin_id), so I defined a has_one association in PluginWindow (has_one :plugin_window_detail, :dependent => :delete) because I want the window to be deleted when the plugin is deleted.
However, I realized that this isn't getting me what I want. Deleting the PluginWindowDetail won't delete the PluginWindow, and since I'm using the class_eval technique instead of a regular ActiveRecord association, I'm not sure how I can do this without coding it myself (which maybe I should).
Anyway, gists with code are here: https://gist.github.com/3206666 . Any help would be appreciated.
I think the simplest way to do it is to use the before_destroy callback. It will be more flexible.
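A hedged sketch of that idea, using the model names from the question and assuming the detail model has (or can be given) a regular belongs_to back to its window; the callback body is illustrative, not from the gist:

class PluginWindowDetail < ActiveRecord::Base
  belongs_to :plugin_window

  before_destroy :remove_parent_window

  private

  def remove_parent_window
    # use delete rather than destroy so the window's own callbacks don't
    # fire and try to remove this detail record a second time
    plugin_window.delete if plugin_window
  end
end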

Rails Associations: HABTM?

Hey guys, I'm at a deadlock here after thinking about this for too long.
Context: Given the following models:
User
Item
Lock
Here's the scenario: a lock is basically like a 'hold'. A user can place a 'lock' on any given item to signal to the system that the item should not be deleted. Items won't be deleted until the lock is cleared.
Here's the tricky part. The lock is its own model because I want multiple users to be able to lock any given item. So let's say Bob locks an item, one didn't already exist so it creates a lock for that item, and information stating that Bob is currently associated with that lock. John comes and locks the same item, but a lock already exists, so John is simply 'added under' the same lock. The lock won't be removed until all users choose to 'unlock', or disassociate themselves with that lock.
My confusion is how I should model these relationships. A user can of course have many locks, each associated with a different item (since any given item can have at most one lock). The locks themselves can have many users. From the point of view of the item, each item can have one lock associated with many users.
So in other words, I would like to access the information a little something like this:
item.lock.users # get the users 'locking' the item
user.locks # get the items the user is currently 'locking'
Perhaps the separate Lock model isn't required, but I figured it would be in order to signify that multiple users can be locking a particular item.
I think what further complicates things is that items are added by users, so I would want to have a way to access the items by a user for example user.items or item.user.
Right now I have:
user has and belongs to many locks
lock has and belongs to many users
user has many items
item belongs to user
item has one lock
lock belongs to item
Does this seem correct?
I think what you're doing will work, though you may not have to use the habtm. What if an item can have many locks and can only be deleted when it has no locks? (A guard callback for this is sketched below, after the example.) That way you could add a date/reason/comment for each lock by user.
user
  has_many :locks
  has_many :items

lock
  belongs_to :user
  belongs_to :item

item
  belongs_to :user
  has_many :locks
This will still allow you to do user.locks, though item.lock.users won't work; but by looking at each lock you'll easily be able to get the users.
item.locks.each do |lock|
puts lock.user
end
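Building on the models above, a hedged sketch of the "can only be deleted when it has no locks" guard mentioned at the start of this answer (the callback name is illustrative):

class Item < ActiveRecord::Base
  belongs_to :user
  has_many :locks

  before_destroy :ensure_not_locked

  private

  def ensure_not_locked
    # returning false from before_destroy cancels the destroy in older Rails
    # (Rails 5+ would use `throw :abort` instead)
    locks.empty?
  end
end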
You don't need the lock model. You can simply set up a habtm relationship between users and items:
class User < ActiveRecord::Base
has_many :items
has_and_belongs_to_many :locks, :class_name => "Item"
end
User.first.items # => [<#Item>, <#Item>, ...] # Items created by user
User.first.locks # => [<#Item>, <#Item>, ...] # Items locked by user
class Item < ActiveRecord::Base
belongs_to :user
has_and_belongs_to_many :lock_users, :class_name => "User"
end
Item.first.user # => <#User> # Creator of the item
Item.first.lock_users # => [<#User>, <#User>, ...] # "Lockers" of the item
You'll have to create a join table, of course, and be mindful of what Rails expects the join table to be named. You may be better off specifying the :join_table option for the habtm.
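For example, a hedged migration sketch (items_users should be the default name Rails derives for these two models, but it could instead be set explicitly with :join_table on both habtm declarations):

class CreateItemsUsers < ActiveRecord::Migration
  def self.up
    create_table :items_users, :id => false do |t|
      t.integer :user_id, :null => false
      t.integer :item_id, :null => false
    end
    add_index :items_users, [:item_id, :user_id], :unique => true
  end

  def self.down
    drop_table :items_users
  end
end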
The key here is that relationships in Rails are very flexible. You can have multiple relationships between the two tables; you can have both the 'created by' and the lock relationships, independently of each other. All you have to do is use different names for the relationships.
I can relate to "thinking about this for too long". When that thought creeps into my mind I back away and work on other parts of the code. Relationships seem to reveal themselves over time as they're really a convenience to spare us writing a bunch of code. They're not essential, so, at least during the development phase, we can postpone the declaration of the relationships and see what we need a bit later.
Yeah, we're supposed to always know ahead of time according to the pragmatists, but in real life we're often having to use our common sense and experience and build a working prototype, then fine tune it. It's during that fine-tuning stage I tweak my relationships that weren't exactly clear before, and adjust my code.
Sniff... sniff... jeez, now my bosses know I'm not perfect... sniff....
Back to the question at hand: normally, for locks to prevent accidental (or on-purpose) deletion, I create a boolean field in my main table and set it to true if that record should be purged. For what you're doing I'd probably get rid of the flag field altogether and have a separate table holding the IDs of the records to lock, along with the IDs of the users who want to keep each record. Delete those users' rows if/when they think it's time to delete the record. When it's time to do some purging I'd check against that table. Something similar to:
DELETE FROM table1 WHERE id NOT IN (SELECT DISTINCT(table1_id) FROM table2)
The thing I don't like about it is there's potential to have another table full of "keep this record" records, and that's when I add another table for users to terminate who can't decide what things need to be deleted.

Evaluating :dependent => :destroy

In Rails 2.2.2 (ruby 1.8.7-p72), I'd like to evaluate the impact of destroying an object before actually doing it. I.e. I would like to be able to generate a list of all objects that will be affected by :dependent => :destroy (via an object's associations). The real problem I'm trying to solve is to give a user a list of everything that will be deleted and having them confirm the action.
Can anyone recommend a good way to go about this? I've just started looking into ActiveRecord::Associations, but I haven't made much headway.
Update: In my particular case, I've got various levels of objects (A --> B --> C).
This should help get you started... Obviously you'll have to customize it, but this lists all association names that are :dependent => :destroy on the class BlogEntry:
BlogEntry.reflect_on_all_associations.map do |association|
if association.options[:dependent] == :destroy
# do something here...
association.name
end
end.compact
=> [:taggings, :comments]
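To cover multiple levels (the A --> B --> C case from the question), here is a hedged sketch that walks :dependent => :destroy associations recursively and collects every record that would go away; the method name and structure are illustrative:

def records_affected_by_destroy(record, collected = [])
  record.class.reflect_on_all_associations.each do |association|
    next unless association.options[:dependent] == :destroy

    children =
      if association.macro == :has_many
        record.send(association.name).to_a
      else
        [record.send(association.name)].compact
      end

    children.each do |child|
      next if collected.include?(child)
      collected << child
      records_affected_by_destroy(child, collected)  # recurse A --> B --> C
    end
  end
  collected
end

# records_affected_by_destroy(BlogEntry.find(1))
# => [#<Tagging ...>, #<Comment ...>, ...]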
Just manually maintain a list of the associations with :dependent => :destroy (probably a good thing to do anyway) and then have named_scopes for each to pull in the included objects to display.
I'd say that, as mentioned, you should have a way of displaying the affected records to the user, and then have two buttons/links: one that deletes (maybe with a confirm alert asking whether they have checked the other link), and one that lists all the records they will be affecting.
Then if you want to be really sure you could also do a soft delete by setting a deleted_at timestamp in the database instead of actually deleting the records, which may well come in handy. I don't know how you would handle that with the automatic dependent delete; maybe with acts_as_paranoid, or some kind of self-rolled version with a callback on the parent model.
Recently I wrote a simple Rails plugin that solves this problem.
Check it out on github: http://github.com/murbanski/affected_on_destroy/tree
