Over a period of time my Rails app has had various rewrites, and in some cases incorrect model associations.
Currently my User model has_many :posts and its destroy method correctly removes all dependent Posts, but at times when things were improperly written this did not happen. I'm now left with a handful of Post records that cause errors all over because their User does not exist.
What would be the most efficient way to go through all Post records, check whether each one's User actually exists, and destroy the Post if it does not?
I'm imagining something like:
Post.all.select { |post| post.user.nil? }.each(&:destroy)
But that seems incredibly inefficient for thousands of records. I'd love to know the best way to do this. Thank you!
Any reason why you cannot do it directly in the database?
delete from posts where user_id not in (select id from users);
The fastest way would probably be to do it directly in the db console, but if you've got other dependent relationships and ActiveRecord callbacks that need to fire, you could do something like:
Post.where("id in (select p.id from posts p left outer join users u on p.user_id = u.id where u.id is null)").destroy_all
You can delete the orphaned posts in one of two ways.
If you want to destroy the records that depend on each Post and run the callbacks:
Post.where("user_id NOT IN (?)", User.pluck(:id)).destroy_all
If you just want to delete the posts, skipping callbacks:
Post.where("user_id NOT IN (?)", User.pluck(:id)).delete_all
Here is one good post about finding and deleting orphan records
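One difference between the answers above is easy to miss: the NOT IN forms and the LEFT OUTER JOIN form treat posts with a NULL user_id differently. Here is a rough sketch of that set logic in plain Ruby rather than SQL, so it runs without a database. The hashes and both method names are made up for illustration and are not part of anyone's app.

```ruby
require "set"

# Mirrors "user_id NOT IN (select id from users)": in SQL a NULL user_id
# makes the comparison NULL, so that row is not matched.
def orphans_not_in(posts, user_ids)
  posts.select { |post| !post[:user_id].nil? && !user_ids.include?(post[:user_id]) }
end

# Mirrors the LEFT OUTER JOIN ... WHERE u.id IS NULL form: a NULL user_id
# joins to no user row, so that post is matched as well.
def orphans_left_join(posts, user_ids)
  posts.reject { |post| user_ids.include?(post[:user_id]) }
end

# Hypothetical in-memory rows standing in for the users and posts tables.
user_ids = Set[1, 2]
posts = [
  { id: 10, user_id: 1 },   # owner exists
  { id: 11, user_id: 99 },  # owner was deleted
  { id: 12, user_id: nil }  # never had an owner
]

puts orphans_not_in(posts, user_ids).map { |post| post[:id] }.inspect
puts orphans_left_join(posts, user_ids).map { |post| post[:id] }.inspect
```

If posts.user_id can be NULL in your table, decide which of the two behaviours you want before running either delete.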
Related
I have a problem - I do not know how to get associated records only if condition is met.
I have Posts model and Comments, Post has_many :comments, Comment belongs_to :post.
Now, I want to retrieve all of the Posts, but only with specific comments (let's say with user_id = 1).
How can I achieve that?
Query like
Post.includes(:comments).where("comments.user_id = ?", "1") will retrieve only some Posts, I want all of them, but only with comments with user_id equal to 1.
I guess I should use LEFT JOIN of some sort, maybe something like
Post.joins("LEFT JOIN comments ON posts.id = comments.post_id")
but I am not sure how to put condition restricting right table results.
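One way to write this with a string condition is below; references(:comments) tells Rails to join the comments table so the condition can refer to it. Note that the condition still ends up in the WHERE clause, so posts with no comments from that user are filtered out as well.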
Post.includes(:comments).where("comments.user_id = ?", "1").references(:comments)
For more information go here
Consider the following:
class User < ActiveRecord::Base
  has_many :posts
end
Now I want to get posts for some banned users.
User.where(is_banned: true).posts
This produces a NoMethodError as posts is not defined on ActiveRecord::Relation.
What is the slickest way of making the code above work?
I can think of
User.where(is_banned: true).map(&:posts).flatten.uniq
But this is inefficient.
I can also think of
user_scope = User.where(is_banned: true)
Post.where(user: user_scope)
This requires the user association to be set up in the Post model and it appears to generate a nested select. I don't know about the efficiency.
Ideally, I would like a technique that allows traversing multiple relations, so I can write something like:
User.where(is_banned: true).posts.comments.votes.voters
which should give me every voter (user) who has voted for a comment on a post written by a banned user.
Why not just use joins?
Post.joins(:user).where(users: {is_banned: true})
This will generate SQL to the effect of
SELECT *
FROM posts
INNER JOIN users ON posts.user_id = users.id
WHERE users.is_banned = true
This seems to be exactly what you are looking for. As far as your long chain goes you can do the same thing just with a much deeper join.
In your code:
User.where(is_banned: true)
will be an ActiveRecord::Relation, and you need a single record to call posts on. So doing it from the User model would be more complicated. Depending on how the relationship is set up, you could add a scope to your Post model.
scope :banned_users, -> { joins(:user).where(users: { is_banned: true }) }
Then you would just call Post.banned_users to get all the posts created by banned users.
Here's the start of a solution for your ideal technique. It probably doesn't work as written with extended chaining, and performance would probably be pretty bad. It would also require that you define inverse_of for each association:
module LocalRelationExtensions
  def method_missing(meth, *args, &blk)
    if (assoc = self.klass.reflect_on_association(meth)) && (inverse = assoc.inverse_of)
      assoc.klass.joins(inverse.name).merge(self)
    else
      super
    end
  end
end
ActiveRecord::Relation.include(LocalRelationExtensions)
But really you should go with @engineersmnky's suggestion.
I have the following code which when run creates / deletes records on table project_users using a has many through relationship and this works fine.
@project.users = @account.users.find(params[:users_ids])
However, I'm now in a situation where I need to set a third foreign key, role_id, on this middle project_users table.
Are there any suggestions for efficiently setting role_id at the same time as the code above?
I've come up with a couple of solutions such as the one below, but I don't like the idea of deleting all records if only an update is required.
users = @account.users.find(params[:users_granted].keys)
@project.users = []
users.each do |user|
  @project.project_users.create(:user_id => user.id, :project_role_id => params[:users_granted][user.id.to_s])
end
Any thoughts / suggestions would be welcome, chances are I'm thinking about this the completely wrong way!
Thanks,
Scott
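One possible direction, sketched in plain Ruby so it runs without Rails: instead of clearing the association and recreating every row, compare what is already in project_users with what the form submitted and touch only the rows that differ. The helper name plan_membership_changes and the user_id => role_id hash shape are assumptions made for this sketch, not an existing API.

```ruby
# Hypothetical helper: plan the minimal changes needed to move the join
# rows from `current` to `desired`. Both are hashes of user_id => role_id.
def plan_membership_changes(current, desired)
  {
    # users in the submitted form that have no join row yet
    create: desired.reject { |user_id, _role_id| current.key?(user_id) },
    # users that already have a row, but with a different role
    update: desired.select { |user_id, role_id| current.key?(user_id) && current[user_id] != role_id },
    # users that have a row but were not submitted
    delete: current.keys - desired.keys
  }
end

current = { 1 => 10, 2 => 10, 3 => 20 }  # rows already in project_users
desired = { 2 => 30, 3 => 20, 4 => 10 }  # e.g. built from params[:users_granted]

plan = plan_membership_changes(current, desired)
puts plan.inspect
```

Each bucket would then map onto one kind of write against @project.project_users (create, update, or destroy), so a row whose role has not changed is left alone. The keys in params arrive as strings, so they would need converting to integers before the comparison.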
I have a fairly straightforward query that I can't seem to get right....
My model:
User - has many Friendships (with other users)
User - has many submissions
User - has many comments
User - has many votes
I need a count that represents
All current_user's friends, whose submissions, comments or votes created_at dates are > current_user.last_refresh_date
Right now, I am building up an array by iterating over friendships and adding all submissions, comment and votes. I then re-iterate this built-up array while comparing the dates to determine if the count should be incremented. Not the most ideal solution.
Edit:
@TobiasCohen
Efficient solution. Thanks!
Followup:
I wish to add yet one more count to the present query. I need to count all new comments & votes on the current_user.submissions that are not part of the original count (ie. not a friend).
Pseudo-code:
current_user.submissions.join(:comments, :votes, :friends).where('last_activity >? AND friend_id != ?', current_user.last_refresh_date, current_user.id).count
I can't quite get the query correct (new to complex queries via active record).
I was going to make it a separate query and then add it to the original count. Can it be absorbed into one query instead of two?
I think you'd get the best results by adding a cache column on User, let's call it :last_activity_at, then update this with an after_create callback on Submission, Comment and Vote.
class Submission < ActiveRecord::Base
  belongs_to :user
  after_create :update_user_last_activity_at

  private

  def update_user_last_activity_at
    user.update_attribute :last_activity_at, Time.now
  end
end
You could then fetch users simply with:
current_user.friends.where('last_activity_at > ?', current_user.last_refresh_date)
I am trying to write a method that would apply directly to several models with HABTM relations to clean up any unused relations.
def cleanup
  self.all.each do |f|
    if f.videos.count == 0
      self.destroy(f)
    end
  end
end
Where do I save this method to and is this even the correct syntax for such a method? It would theoretically be run as:
>>Tag.cleanup
Write an external module and include it in each model that needs it.
Sadly people keep on using has_and_belongs_to_many even though it leads to all kinds of orphans like this. A has_many ..., :through relationship can be flagged :dependent => :destroy to clean up unused children automatically. It's common that you'll have unused join records and they are obnoxious to remove.
What you might do is approach this from a SQL angle since has_and_belongs_to_many records are inaccessible if their parent records are no longer defined. They simply do not exist as far as ActiveRecord is concerned. Using a join model means you can always access this data since they are issued their own ids.
has_and_belongs_to_many relationships are based on a compound key which makes removing them a serious nuisance. Normally you'd do a DELETE FROM table WHERE id IN (...) AND ... and be confident that only the target records are removed. With a compound key you can't do this.
You may find this works for an example Tag to Item relationship:
DELETE FROM item_tags, tags, items WHERE item_tags.tag_id=tags.id AND item_tags.item_id=items.id AND tags.id IS NULL AND items.id IS NULL
The DELETE statement can be really particular about how it operates and does not give the same latitude as a SELECT with joins that can be defined as left or right, inner or outer as required.
If you had a primary ID key in your join table you could do it easily:
DELETE FROM item_tags WHERE id IN (SELECT item_tags.id FROM item_tags LEFT JOIN tags ON item_tags.tag_id=tags.id LEFT JOIN items ON item_tags.item_id=items.id WHERE tags.id IS NULL AND items.id IS NULL)
In fact, it might be advantageous to add a primary key to your relationship table even if ActiveRecord ignores it.
Edit:
As for your module issue, if you're stuck with that approach:
module CleanupMethods
  def cleanup
    # ...
  end
end
class Tag
  # Import the module methods as class methods
  extend CleanupMethods
end
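To make the extend versus include distinction concrete, here is a small plain-Ruby sketch. The cleanup body just returns its receiver as a placeholder, and the Category class is invented for contrast; neither comes from the question.

```ruby
module CleanupMethods
  # Placeholder body: returns whatever object the method was called on.
  def cleanup
    self
  end
end

class Tag
  # extend adds the module's methods to the class object itself,
  # so this is callable as Tag.cleanup
  extend CleanupMethods
end

class Category
  # include adds them as instance methods instead,
  # so this is callable as Category.new.cleanup but not Category.cleanup
  include CleanupMethods
end

puts Tag.cleanup.inspect                   # the Tag class itself
puts Category.respond_to?(:cleanup)        # false
puts Category.new.respond_to?(:cleanup)    # true
```

Since the question wants to call Tag.cleanup on the class, extend is the one that fits.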
If you use a counter cache column you can do this a lot more easily, but you will also have to ensure your counter caches are accurate.
You want to add a class method to the Tag class. Instead of iterating through all the tag objects (requiring Rails to load each one) and then checking for videos through Active Record, it's faster to load only the orphaned records using a query and then destroy those.
Guessing that you have tags and videos, here, and that tag_videos is your join table, in Rails 2.x you might write
def self.cleanup
  # find(:all, ...) returns a plain Array, so destroy each record in turn
  find(:all, :conditions => "id NOT IN (select tag_id from tag_videos)").each(&:destroy)
end
In Rails 3 you'd write
def self.cleanup
  where("id NOT IN (select tag_id from tag_videos)").destroy_all
end