Using Rails includes with conditions on children - ruby-on-rails

I have a model Parent that has many children Child. I want to get all Parent models and show every Child of the Parent as well. This is a classic use case for Rails' includes method, as far as I can tell.
However, I can't get Rails to add conditions to the child models without limiting the Parent models to those that have children.
For example, this only outputs parents that have children:
Parent.includes(:children).where(children: {age: 10}).each do |parent|
  # output parent info
  parent.children.where("age = 10").each do |child|
    # output child info
  end
end
I've looked at Rails includes with conditions, but it seems I'm having the same trouble as that question's OP, and neither part of the accepted answer solves it (it either returns only some parents, or resorts to multiple queries).

You need to use LEFT JOIN.
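# The child condition goes in the ON clause; in WHERE it would discard parents without a matching child.
# Child columns are aliased so they do not overwrite the parent's id and age attributes.
Parent.joins("LEFT JOIN children ON children.parent_id = parents.id AND children.age = 10")
      .select("parents.*, children.id AS child_id, children.age AS child_age")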
If you want to select rows from the parent table which may or may not have corresponding rows in the children table, you use the LEFT JOIN clause. In case there is no matching row in the children table, the values of the columns in the children table are substituted by the NULL values.

I ran into this issue, thus stumbling across this question. Sadly, none of the answers so far are solutions. Happily, I have found the solution! Thanks in part to the docs :) http://apidock.com/rails/ActiveRecord/QueryMethods/includes
As the docs suggest, simply including the association and then adding a condition to it is not sufficient; you must also "reference" the association with references(:children).
Now, additionally you can see that I'm using some syntactic sugar that I recommend for merging in your conditions, versus re-writing them. Use this when possible.
Parent.includes(:children).merge(Child.at_school).references(:children).first
So what I did, and what I suggest doing is setting up a scope for this:
class Parent < ActiveRecord::Base
  has_many :children
  scope :with_children_at_school, -> { includes(:children).merge(Child.at_school).references(:children) }
  # ...
end
And then you can just call Parent.with_children_at_school.first (or whatever else you want to chain on to the end)!
I hope this helps!

This is a limitation of the includes method. What you need is an outer join, and unfortunately Rails doesn't have a good way to force an outer join without using raw SQL (#joins defaults to an inner join and #includes eager loads). Rails 5 and later also provide left_outer_joins for this.
Try something along the lines of
Parent.joins('LEFT OUTER JOIN children ON children.parent_id = parents.id').where(...)
This should grab all parents, even those without children.

This is not a 100% answer, but one approach is to accept that you will get all child records returned by the eager loading, and then to pick the ones you want using a non-ActiveRecord method.
You will include more child records in the eager loading than you need, so that's less efficient than a perfect solution, but you still get the records you want:
Parent.includes(:children).each do |parent|
  # Array#select filters the already-loaded children in Ruby, so no extra query is issued.
  parent.children.select { |child| child.age == 10 }.each do |child|
    # blah blah...
  end
end
I'm assuming here that you need a lot of flexibility on your select criteria, and that an association based on a scope would not offer such flexibility.
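To make the trade-off concrete, here is a minimal plain-Ruby sketch of the same in-memory filtering. It uses Struct stand-ins rather than ActiveRecord models; Child, Parent and the sample data are made up for illustration. The point is that every parent stays in the result, including those with no children or no matching children.

```ruby
# Plain-Ruby stand-ins for the models; these are NOT ActiveRecord classes.
Child  = Struct.new(:name, :age)
Parent = Struct.new(:name, :children)

parents = [
  Parent.new("Ann", [Child.new("Al", 10), Child.new("Amy", 7)]),
  Parent.new("Bob", []),                   # no children at all
  Parent.new("Cat", [Child.new("Cy", 4)])  # children, but none aged 10
]

# Keep every parent; filter only the already-loaded children in memory.
report = parents.map do |parent|
  matching = parent.children.select { |child| child.age == 10 }
  [parent.name, matching.map(&:name)]
end

report.each { |name, kids| puts "#{name}: #{kids.inspect}" }
```

The cost is the one described above: all children are loaded, even those that are thrown away afterwards.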

The parents who don't have children will have a children.age of NULL, but you are only filtering for children.age = 10.
Try
where('children.age = 10 or children.age is null')
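One caveat with this condition, sketched in plain Ruby: a parent whose children all have some other age produces joined rows that are neither NULL nor 10, so that parent is still filtered out. The rows array below is a hand-written stand-in for what a LEFT JOIN would return; the names and data are hypothetical.

```ruby
# Hand-written rows, shaped like the result of a LEFT JOIN of parents to children.
# child_age is nil where the parent has no children (SQL NULL).
rows = [
  { parent: "Ann", child_age: 10 },
  { parent: "Ann", child_age: 7 },
  { parent: "Bob", child_age: nil },  # Bob has no children
  { parent: "Cat", child_age: 4 }     # Cat has one child, aged 4
]

# Equivalent of: WHERE children.age = 10 OR children.age IS NULL
kept = rows.select { |row| row[:child_age] == 10 || row[:child_age].nil? }

surviving_parents = kept.map { |row| row[:parent] }.uniq
puts surviving_parents.inspect
```

Cat disappears from the result even though the goal was to list every parent. Putting the age test into the join's ON clause instead of the WHERE clause avoids that.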

Related

How to remove some items from a relation?

I am loading data from two models, and once the data are loaded in the variables, then I need to remove those items from the first relation, that are not in the second one.
A sample:
users = User.all
articles = Article.order('created_at DESC').limit(100)
I have these two variables filled with relational data. Now I need to remove from articles every item whose user_id is not among the records in users, so that articles keeps only the items whose user_id belongs to one of those users.
I tried it with a loop, but it was very slow. How do I do it effectively?
EDIT:
I know there's a way to avoid doing this by building a better query, but in my case I cannot do that (although I agree that in the example above it's possible). The thing is that I already have the data loaded from the database in two variables, and I need to process them in Ruby. Is there a command for doing that?
Thank you

Assuming you have a belongs_to :user association on the Article model:
articles.where(user: users)
This would give you at most 100, but probably fewer. If you want to return 100 with the condition (I haven't tested, but the idea is the same: put the conditions for users in the where statement):
Article.where(user: users).order('created_at DESC').limit(100)

The best way to do this would probably be with a SQL join. Would this work?
Article.joins(:user).order('created_at DESC').limit(100)
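If the records really are already loaded and the filtering has to happen in Ruby, as the question's edit says, a slow loop is usually one that scans users once per article. Here is a sketch of a faster version, using a Set of ids so each lookup is constant time. UserRow and ArticleRow are hypothetical Struct stand-ins, not the real models.

```ruby
require "set"

# Stand-ins for loaded records; only the fields needed for the filter.
UserRow    = Struct.new(:id)
ArticleRow = Struct.new(:id, :user_id)

users    = [UserRow.new(1), UserRow.new(3)]
articles = [
  ArticleRow.new(10, 1),
  ArticleRow.new(11, 2),   # user 2 is not in users, so this one is dropped
  ArticleRow.new(12, 3),
  ArticleRow.new(13, nil)  # no user at all
]

# Build the id lookup once instead of scanning users for every article.
user_ids = users.map(&:id).to_set

kept = articles.select { |article| user_ids.include?(article.user_id) }
puts kept.map(&:id).inspect
```

With real records, the same two steps (collect the ids into a Set, then select with a block) should apply to the loaded collections, at the cost of having loaded articles that are then discarded.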

Rails select in scope

In my model User, I have scope set up:
scope :count_likes, lambda {
  select("(SELECT count(*) from another_model) AS count")
}
If I want to get all attributes of my User + count_likes, I have to do:
Model.count_likes.select("users.*")
because calling select() will override the default "*".
I use the count_likes scope a lot in my application, and my issue is that I have to append select("users.*") everywhere.
I know about the default scope; however, I don't think doing select("users.*") in the default scope is a good idea.
Is there a DRY / better way of doing this?
Thanks

This isn't really another answer. I wanted to leave a comment about the joins, but comments cannot run long and I wanted to provide code examples.
What you need is to sometimes get all the fields and the counts of a related table, and other times get the counts without the users.* fields (and maybe sometimes just the users.* fields without the counts). So you are going to have to tell the code which one you want. I think what you are looking for is an "except" type of thing, where by default you get the users.* fields and the counts, but can turn off the select('users.*') when you only want the counts. I don't think there is such a solution, except maybe using the default scope. I suggest having one scope for just the counts, and one scope for the users fields and the counts.
Here is what I would do:
class User < ActiveRecord::Base
  has_many :likes

  # All users columns plus the likes count.
  def self.with_count_likes
    joins(:likes)
      .select('users.*, count(likes.id) as count')
      .group('users.id')
  end

  # Only a few identifying columns plus the likes count.
  def self.count_likes
    joins(:likes)
      .select('users.id, users.username, count(likes.id) as count')
      .group('users.id')
  end
  # ...
end
Call with_count_likes (or chain it into a query) when you want all the users fields and the likes counts. Call count_likes when you want just the counts and a few identifying fields.
I'm assuming here that whenever you want the counts, you want some users fields to identify what/(who) the counts are for.
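Note that some databases (like Oracle) require every non-aggregated column in the select list to appear in GROUP BY, so with users.* you would have to list all the users columns. That is the stricter, traditional SQL rule; other databases, such as MySQL, accept grouping by the primary key alone.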

You may simply add users.* to the scope.
scope :count_likes, lambda {
  select("(SELECT count(*) from another_model) AS count, users.*")
}
HTH
EDIT: I am not sure of exactly what you are trying to achieve, but you should consider using joins and get the data by joining tables appropriately.
EDIT: Usually I am not a big fan of making such changes, but as situation suggests sometimes we need to get our hands dirty. In this case, I would try to reduce the number of operations in terms of making changes. Consider:
scope :count_likes, Proc.new { |all|
  s = select("(SELECT count(*) from another_model) AS count")
  s = s.select("users.*") unless all == false
  s
}
Now you will get users.* everywhere. For specific places where you just need the count, you may replace it like User.count_likes(false) and it will give you just the counts. Thus minimal changes.
There may be another possibility of appending multiple scopes together, one for counts, one for users.* and use them to achieve the above effect.

[Rails query] difference between joins and includes

@teachers = User.joins(:students).where("student_id IS NOT NULL")
The above works, the below doesn't.
@teachers = User.includes(:students).where("student_id IS NOT NULL")
As far as I understand, joins and includes should both bring the same result with different performance. According to this, you use includes to load the associated records of the objects returned by the model, whereas joins simply joins the two tables together. Using includes can also prevent N+1 queries.
First question: why does my second line of code not work?
Second question: should anyone always use includes in a case similar to above?

You use joins when you want to query against the joined model. This does an inner join between your tables.
Use includes when you want to eager load the associated model along with the end result.
This allows you to call the association on any of the results without having to do another db lookup.
With a string condition like the one above, includes on its own does not add the associated table to the query, so you cannot filter on it. If you want to query against it, use joins, or add references(:students) to the includes (you can also use joins and includes together!).

Rails Hide Specific Record From All Select * Type Queries

I have a record in a table that serves as a placeholder of sorts, and doesn't represent actual data. It's bad design, I know, but I have some very awkward requirements that I have to deal with and I saw no other solutions so it's a bit of a hotfix per se.
Now let's say I have a series of SELECT *s throughout my application and I don't want to have to explicitly exclude that single record in each of them. Is there anything I can drop into my model to exclude it from all queries except the ones where it's explicitly called? Or perhaps some logic I can put directly into my PG database?
It's the very first record in the table with an ID of 0.

Add a default scope
default_scope { where('id != 0') }
to your model.
Whenever you want to bypass that default scope in a query, use Model.unscoped.

One solution would be to define a default_scope that excludes that record (see the docs).
So when doing YourModel.all, if the default_scope on YourModel excludes the right records, you'll get what you want.
But as you said, it's bad design!

Create a view excluding it:
create view v as
  select *
  from t
  where id != 0;
Now select from the view:
select *
from v;

How do I get Rails to eager load counts?

This is related to a question a year and change ago.
I put up an example of the question that should work out of the box, provided you have sqlite3 available: https://github.com/cairo140/rails-eager-loading-counts-demo
Installation instructions (for the main branch)
git clone git://github.com/cairo140/rails-eager-loading-counts-demo.git
cd rails-eager-loading-counts-demo
rails s
I have a fuller write-up in the repository, but my general question is this.
How can I make Rails eager load counts in a way that minimizes db queries across the board?
The n+1 problem emerges whenever you use #count on an association, despite having included that association via #includes(:associated) in the ActiveRelation. A workaround is to use #length, but this works well only when the object it's being called on has already been loaded up, not to mention that I suspect it duplicates something that the Rails internals have done already. Also, an issue with using #length is that it results in an unfortunate over-loading when the association was not loaded to begin with and the count is all you need.
From the readme:
We can dodge this issue by running #length on the posts array (see appendix), which is already loaded, but it would be nice to have count readily available as well. Not only is it more consistent; it provides a path of access that doesn't necessarily require posts to be loaded. For instance, if you have a partial that displays the count no matter what, but half the time, the partial is called with posts loaded and half the time without, you are faced with the following scenario:
Using #count:
  n COUNT-style queries when posts are already loaded
  n COUNT-style queries when posts are not already loaded
Using #length:
  zero additional queries when posts are already loaded
  n SELECT *-style queries when posts are not already loaded
Between these two choices, there is no dominant option. But it would be nice to revise #count to defer to #length, or to access a length that is stored some other way behind the scenes, so that we can have the following scenario:
Using revised #count:
  zero additional queries when posts are already loaded
  n COUNT-style queries when posts are not already loaded
So what's the correct approach here? Is there something I've overlooked (very, very likely)?
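The desired behaviour can be sketched with a toy collection in plain Ruby. ToyAssociation below is a made-up class, not Rails code; it only records which simulated queries each method would trigger. As far as I know, ActiveRecord's own size method on associations follows the same idea (the in-memory length when loaded, a COUNT query otherwise), though that is worth checking against the documentation for your Rails version.

```ruby
# Toy stand-in for an association; it records simulated queries instead of hitting a database.
class ToyAssociation
  attr_reader :queries

  def initialize(records)
    @records = records
    @loaded  = false
    @queries = []
  end

  # Simulates a SELECT * and remembers that the records are now loaded.
  def load_target
    unless @loaded
      @queries << :select_all
      @loaded = true
    end
    @records
  end

  # Always issues a COUNT query, like #count in the question.
  def count
    @queries << :count
    @records.length
  end

  # Loads everything if needed, like #length in the question.
  def length
    load_target.length
  end

  # The "revised count": free when loaded, a COUNT query otherwise.
  def size
    @loaded ? @records.length : count
  end
end

loaded = ToyAssociation.new(%w[a b c])
loaded.load_target
loaded.size
puts loaded.queries.inspect

unloaded = ToyAssociation.new(%w[a b c])
unloaded.size
puts unloaded.queries.inspect
```

In the loaded case only the original load is recorded; in the unloaded case a single COUNT is recorded and nothing is loaded.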

As @apneadiving suggested, counter_cache works well because the counter column gets automatically updated when records are added or removed. So when you load the parent object, the count is included in the object without needing to access the other table.
However, if for whatever reason you don't like that approach, you could do this:
Post.find(:all,
  :select => "posts.*, count(comments.id) `comments_count`",
  :joins => "left join comments on comments.post_id = posts.id",
  :group => "posts.id") # group per post; without it the aggregate collapses everything into one row

An alternative to Zubin's approach (note that joins(:comments) is an inner join, so posts without comments are left out):
Post.select('posts.*, count(comments.id) `comments_count`').joins(:comments).group('posts.id')

It appears that the best way to implement this sort of facility might be to create SQL Views (ref: here and here) for the separate model-and-child-count objects that you want, along with their associated ActiveRecord models.
You might be able to be very clever and use subclassing on the original model combined with set_table_name :sql_view_name to retain all the original methods on the objects, and maybe even some of their associations.
For instance, say we were to add 'Post.has_many :comments' to your example, like in @Zubin's answer above; then one might be able to do:
class CreatePostsWithCommentsCountsView < ActiveRecord::Migration
  def self.up
    # Create SQL View called posts_with_comments_counts which maps over
    #   select posts.*, count(comments.id) as comments_count from posts
    #   left outer join comments on comments.post_id = posts.id
    #   group by posts.id
    # (As zubin pointed out above.)
    # *Except* this is in SQL so perhaps we'll be able to do further
    # reducing queries against it *as though it were any other table.*
  end
end

class PostWithCommentsCount < Post # Here there be cleverness.
  # The class definition sets up PWCC with all the regular methods of
  # Post (pointing to the posts table due to Rails' STI facility.)
  # But then we point it to the SQL view instead.
  set_table_name :posts_with_comments_counts
  # If you don't really care about the methods of Post being in PWCC
  # then you could just make it a normal subclass of AR::Base.
end

# Obviously, this sort of "upward looking" include is best used in big lists
# like "latest posts" rather than "These posts for this user." But hopefully
# it illustrates the improved activerecordiness of this style of solution.
PostWithCommentsCount.all(:include => :user)

# And I'm pretty sure you should be able to do this without issue as well.
# And it _should_ only be the two queries.
PostWithCommentsCount.all(:include => :comments)
I have set up a small gem that adds an includes_count method to ActiveRecord, that uses a SELECT COUNT to fetch the number of records in an association, without resorting to a JOIN which might be expensive (depending on the case).
See https://github.com/manastech/includes-count
Hope it helps!
