ActiveRecord - find records that its association count is 0 - ruby-on-rails

In my Ruby on Rails app I have the following model:
class SlideGroup < ApplicationRecord
has_many :survey_group_lists, foreign_key: 'group_id'
has_many :surveys, through: :survey_group_lists
end
I want to find all orphaned slide groups. Orphaned slide group is slide group which is not connected to any survey. I've been trying following query but it does not return anything and I'm sure that I have orphaned records in my test database:
SlideGroup.joins(:surveys).group("slide_groups.id, surveys.id").having("count(surveys.id) = ?",0)
this generates following sql query:
SlideGroup Load (9.3ms) SELECT "slide_groups".* FROM "slide_groups" INNER JOIN "survey_group_lists" ON "survey_group_lists"."group_id" = "slide_groups"."id" INNER JOIN "surveys" ON "surveys"."id" = "survey_group_lists"."survey_id" GROUP BY slide_groups.id, surveys.id HAVING (count(surveys.id) = 0)

You're using joins, which is INNER JOIN, whereas what you need is an OUTER JOIN -
includes:
SlideGroup.includes(:surveys).group("slide_groups.id, surveys.id").having("count(surveys.id) = ?",0)
A bit cleaner query:
SlideGroup.includes(:surveys).where(surveys: { id: nil })

Finding orphan records has been explained by others.
I see problems with this approach:
There should not be any orphan in the first place
The presence of a survey.id does not guarantee the presence of a Survey
What about SurveyGroupList that are orphan?
So the proper solution would be to ensure that no orphans are left in the DB. By implementing the proper logic AND adding foreign keys with on delete cascade to the DB. You can also add dependent: :destroy option to your associations but this only works if you use #destroy on your models (not delete) and of course does not work if you delete directly via SQL.

Related

Disable Rails STI for certain ActiveRecord queries only

I can disable STI for a complete model but I'd like to just disable it when doing certain queries. Things were working swell in Rails 3.2, but I've upgraded to Rails 4.1.13 and a new thing is happening that's breaking the query.
I have a base class called Person with a fairly complex scope.
class Person < ActiveRecord::Base
include ActiveRecord::Sanitization
has_many :approvals, :dependent => :destroy
has_many :years, through: :approvals
scope :for_current_or_last_year, lambda {
joins(:approvals).merge(Approval.for_current_or_last_year) }
scope :for_current_or_last_year_latest_only, ->{
approvals_sql = for_current_or_last_year.select('person_id AS ap_person_id, MAX(year_id) AS max_year, year_id AS ap_year_id, status_name AS ap_status_name, active AS ap_active, approved AS ap_approved').group(:id).to_sql
approvals_sql = select("*").from("(#{approvals_sql}) AS ap, approvals AS ap2").where('ap2.person_id = ap.ap_person_id AND ap2.year_id = ap.max_year').to_sql
select("people.id, ap_approved, ap_year_id, ap_status_name, ap_active").
joins("JOIN (#{approvals_sql}) filtered_people ON people.id =
filtered_people.person_id").uniq
}
end
And inheriting classes called Member and Staff. The only thing related to this I had to comment out to get my tests to pass with Rails 4. It may be the problem, but uncommenting it hasn't helped in this case.
class Member < Person
#has_many :approvals, :foreign_key => 'person_id', :dependent => :destroy, :class_name => "MemberApproval"
end
The problem happens when I do the query Member.for_current_or_last_year_latest_only
I get the error unknown column 'people.type'
When I look at the SQL, I can see the problem line but I don't know how to remove it or make it work.
Member.for_current_or_last_year_latest_only.to_sql results in.
SELECT DISTINCT people.id, ap_approved, ap_year_id, ap_status_name, ap_active
FROM `people`
JOIN (SELECT * FROM (SELECT person_id AS ap_person_id, MAX(year_id) AS max_year, year_id AS ap_year_id, status_name AS ap_status_name, active AS ap_active, approved AS ap_approved
FROM `people` INNER JOIN `approvals` ON `approvals`.`person_id` = `people`.`id`
WHERE `people`.`type` IN ('Member') AND ((`approvals`.`year_id` = 9 OR `approvals`.`year_id` = 8))
GROUP BY `people`.`id`) AS ap, approvals AS ap2
WHERE `people`.`type` IN ('Member') AND (ap2.person_id = ap.ap_person_id AND ap2.year_id = ap.max_year)) filtered_people ON people.id = filtered_people.person_id
WHERE `people`.`type` IN ('Member')
If I remove people.type IN ('Member') AND from the beginning of the second to last WHERE clause the query runs successfully. And btw, that part isn't in the query generated from the old Rails 3.2 code, neither is the one above it (only the last one matches the Rails 3.2 query). The problem is, that part is being generated from rails Single Table Inheritance I assume, so I can't just delete it from my query. It's not the only place that is getting added into the original query, but that's the only one that is causing it to break.
Does anybody have any idea how I can either disable STI for only certain queries or add something to my query that will make it work? I've tried putting people.type in every one of the SELECT queries to try and make it available but to no avail.
Thanks for taking the time to look at this.
I was apparently making this harder than it really was...I just needed to add unscoped to the front of the two approval_sql sub-queries. Thanks for helping my brain change gears.

Rails includes association with condition in left join

Can I add some condition to the LEFT JOIN sql that Rails generate for the includes method? (Rails 4.2.1, postresql).
I need to get all(!) the users with preloading ( not N+1 when I will puts in a view count of comments, posts and etc) of associations, but associations need to be filtered by some conditions.
Example:
User.includes(:comments)
# => SELECT * FROM users LEFT JOIN comments ON ...
This will return all the users and preload comments if they exists.
If I will add some conditions for the "comments" association in where, then SQL doesn't return ALL the users, for example:
User.includes(:comments).where(comments: {published_at: Date.today})
# => SELECT * FROM users LEFT JOIN comments ON ... WHERE comments.published_at = ...
This will return only users, that have comments, published today.
I need to put conditions inside LEFT JOIN AND save preloading (load objects to the memory - simple left join with joins method doesn't preload associations).
SELECT * FROM users LEFT JOIN comments ON (... AND comments.published_at = ...)
Those SQL will return right what I need (all the users, and their comments, published in requested date, if they exists)! But ... I cant generate it with the Rails includes method, and `joins' doesn't preload associations.
What do you advice me? Thanks!
Rails doesn't have methods in the framework library to do what you want.
This might work, though
class User < ActiveRecord::Base
has_many :comments
has_many :recent_comments, -> { where(published_at: Date.today) }, class_name: "Comment"
end
Then query for Users and preload recent_comments
#users = User.preload(:recent_comments)

Rails 4 Left outer join multiple tables with Group By and count

Is it possible to do a left outer join in Rails4 with group by and counts. I am trying to write a scope which will do a left outer join of users with the messages, comments and likes tables and then group by id to get total count. In case there is no association, the count should be zero.
So the final result set would be cuuser.*, message_count, likes_count and comments_count. Any idea how this can be accomplished? Thanks in Advance!
class Cuuser < ActiveRecord::Base
has_and_belongs_to_many :groups
has_many :messages
has_many :comments
has_many :likes
validates :username, format: { without: /\s/ }
scope :superusers, -> { joins(:comments, :likes).
select('cuusers.id').
group('cuusers.id').
having('count(comments.id) + count(likes.id) > 2')}
end
You could drop to SQL strings:
joins("LEFT OUTER JOIN comments ON comments.cuuser_id = cuuser.id LEFT OUTER JOIN likes ON likes.cuuser_id = cuuser.id LEFT OUTER JOIN messages on messages.cuuser_id = cuuser.id")
This isn't great. You're sacrificing some portability and the ability of ActiveRecord to guess the association.
If you used the Squeel gem you could use the following format:
joins{ [comments.outer, likes.outer, messages.outer] }
Squeel exposes Arel in a slightly more sane format, so you can do things like left outer joins while still guessing associations from the model definitions.
You could use Arel of course, but things get very clunky very quickly.
To get your counts, try:
select('cuusers.id, count(messages.id) as message_count, count(likes.id) as likes_count, count(comments.id) as comments_count').
They should be available as attributes on the returned objects, just like ordinary database fields.

Specifying conditions on eager loaded associations returns ActiveRecord::RecordNotFound

The problem is that when a Restaurant does not have any MenuItems that match the condition, ActiveRecord says it can't find the Restaurant. Here's the relevant code:
class Restaurant < ActiveRecord::Base
has_many :menu_items, dependent: :destroy
has_many :meals, through: :menu_items
def self.with_meals_of_the_week
includes({menu_items: :meal}).where(:'menu_items.date' => Time.now.beginning_of_week..Time.now.end_of_week)
end
end
And the sql code generated:
Restaurant Load (0.0ms)←[0m ←[1mSELECT DISTINCT "restaurants".id FROM "restaurants"
LEFT OUTER JOIN "menu_items" ON "menu_items"."restaurant_id" = "restaurants"."id"
LEFT OUTER JOIN "meals" ON "meals"."id" = "menu_items"."meal_id" WHERE
"restaurants"."id" = ? AND ("menu_items"."date" BETWEEN '2012-10-14 23:00:00.000000'
AND '2012-10-21 22:59:59.999999') LIMIT 1←[0m [["id", "1"]]
However, according to this part of the Rails Guides, this shouldn't be happening:
Post.includes(:comments).where("comments.visible", true)
If, in the case of this includes query, there were no comments for any posts, all the posts would still be loaded.
The SQL generated is a correct translation of your query. But look at it,
just at the SQL level (i shortened it a bit):
SELECT *
FROM
"restaurants"
LEFT OUTER JOIN
"menu_items" ON "menu_items"."restaurant_id" = "restaurants"."id"
LEFT OUTER JOIN
"meals" ON "meals"."id" = "menu_items"."meal_id"
WHERE
"restaurants"."id" = ?
AND
("menu_items"."date" BETWEEN '2012-10-14' AND '2012-10-21')
the left outer joins do the work you expect them to do: restaurants
are combined with menu_items and meals; if there is no menu_item to
go with a restaurant, the restaurant is still kept in the result, with
all the missing pieces (menu_items.id, menu_items.date, ...) filled in with NULL
now look aht the second part of the where: the BETWEEN operator demands,
that menu_items.date is not null! and this
is where you filter out all the restaurants without meals.
so we need to change the query in a way that makes having null-dates ok.
going back to ruby, you can write:
def self.with_meals_of_the_week
includes({menu_items: :meal})
.where('menu_items.date is NULL or menu_items.date between ? and ?',
Time.now.beginning_of_week,
Time.now.end_of_week
)
end
The resulting SQL is now
.... WHERE (menu_items.date is NULL or menu_items.date between '2012-10-21' and '2012-10-28')
and the restaurants without meals stay in.
As it is said in Rails Guide, all Posts in your query will be returned only if you will not use "where" clause with "includes", cause using "where" clause generates OUTER JOIN request to DB with WHERE by right outer table so DB will return nothing.
Such implementation is very helpful when you need some objects (all, or some of them - using where by base model) and if there are related models just get all of them, but if not - ok just get list of base models.
On other hand if you trying to use conditions on including tables then in most cases you want to select objects only with this conditions it means you want to select Restaurants only which has meals_items.
So in your case, if you still want to use only 2 queries (and not N+1) I would probably do something like this:
class Restaurant < ActiveRecord::Base
has_many :menu_items, dependent: :destroy
has_many :meals, through: :menu_items
cattr_accessor :meals_of_the_week
def self.with_meals_of_the_week
restaurants = Restaurant.all
meals_of_the_week = {}
MenuItems.includes(:meal).where(date: Time.now.beginning_of_week..Time.now.end_of_week, restaurant_id => restaurants).each do |menu_item|
meals_of_the_week[menu_item.restaurant_id] = menu_item
end
restaurants.each { |r| r.meals_of_the_week = meals_of_the_week[r.id] }
restaurants
end
end
Update: Rails 4 will raise Deprecation warning when you simply try to do conditions on models
Sorry for possible typo.
I think there is some misunderstanding of this
If there was no where condition, this would generate the normal set of two queries.
If, in the case of this includes query, there were no comments for any
posts, all the posts would still be loaded. By using joins (an INNER
JOIN), the join conditions must match, otherwise no records will be
returned.
[from guides]
I think this statements doesn't refer to the example Post.includes(:comments).where("comments.visible", true)
but refer to one without where statement Post.includes(:comments)
So all work right! This is the way LEFT OUTER JOIN work.
So... you wrote: "If, in the case of this includes query, there were no comments for any posts, all the posts would still be loaded." Ok! But this is true ONLY when there is NO where clause! You missed the context of the phrase.

ActiveRecord find categories which contain at least one item

Support I have two models for items and categories, in a many-to-many relation
class Item < ActiveRecord::Base
has_and_belongs_to_many :categories
class Category < ActiveRecord::Base
has_and_belongs_to_many :items
Now I want to filter out categories which contain at least one items, what will be the best way to do this?
I would like to echo #Delba's answer and expand on it because it's correct - what #huan son is suggesting with the count column is completely unnecessary, if you have your indexes set up correctly.
I would add that you probably want to use .uniq, as it's a many-to-many you only want DISTINCT categories to come back:
Category.joins(:items).uniq
Using the joins query will let you more easily work conditions into your count of items too, giving much more flexibility. For example you might not want to count items where enabled = false:
Category.joins(:items).where(:items => { :enabled => true }).uniq
This would generate the following SQL, using inner joins which are EXTREMELY fast:
SELECT `categories`.* FROM `categories` INNER JOIN `categories_items` ON `categories_items`.`category_id` = `categories`.`id` INNER JOIN `items` ON `items`.`id` = `categories_items`.`item_id` WHERE `items`.`enabled` = 1
Good luck,
Stu
Category.joins(:items)
More details here: http://guides.rubyonrails.org/active_record_querying.html#joining-tables
please notice, what the other guys answererd is NOT performant!
the most performant solution:
better to work with a counter_cache and save the items_count in the model!
scope :with_items, where("items_count > 0")
has_and_belongs_to_many :categories, :after_add=>:update_count, :after_remove=>:update_count
def update_count(category)
category.items_count = category.items.count
category.save
end
for normal "belongs_to" relation you just write
belongs_to :parent, :counter_cache=>true
and in the parent_model you have an field items_count (items is the pluralized has_many class name)
http://api.rubyonrails.org/classes/ActiveRecord/Associations/ClassMethods.html
in a has_and_belongs_to_many relation you have to write it as your own as above
scope :has_item, where("#{table_name}.id IN (SELECT categories_items.category_id FROM categories_items")
This will return all categories which have an entry in the join table because, ostensibly, a category shouldn't have an entry there if it does not have an item. You could add a AND categories_items.item_id IS NOT NULL to the subselect condition just to be sure.
In case you're not aware, table_name is a method which returns the table name of ActiveRecord class calling it. In this case it would be "categories".

Resources