I've been trying to figure out some odd behavior when combining a has_one association and includes.
class Post < ApplicationRecord
has_many :comments
has_one :latest_comment, -> { order('comments.id DESC').limit(1) }, class_name: 'Comment'
end
class Comment < ApplicationRecord
belongs_to :post
end
To test this I created two posts with two comments each. Here are some rails console commands that show the odd behavior. When we use includes then it ignores the order of the latest_comment association.
posts = Post.includes(:latest_comment).references(:latest_comment)
posts.map {|p| p.latest_comment.id}
=> [1, 3]
posts.map {|p| p.comments.last.id}
=> [2, 4]
I would expect these commands to have the same output. posts.map {|p| p.latest_comment.id} should return [2, 4]. I can't use the second command because of n+1 query problems.
If you call the latest comment individually (similar to comments.last above) then things work as expected.
[Post.first.latest_comment.id, Post.last.latest_comment.id]
=> [2, 4]
If you have another way of achieving this behavior I'd welcome the input. This one is baffling me.
I think the cleanest way to make this work with PostgreSQL is to use a database view to back your has_one :latest_comment association. A database view is, more or less, a named query that acts like a read-only table.
There are three broad choices here:
Use lots of queries: one to get the posts and then one for each post to get its latest comment.
Denormalize the latest comment into the post or its own table.
Use a window function to peel off the latest comments from the comments table.
(1) is what we're trying to avoid. (2) tends to lead to a cascade of over-complications and bugs. (3) is nice because it lets the database do what it does well (manage and query data) but ActiveRecord has a limited understanding of SQL so a little extra machinery is needed to make it behave.
We can use the row_number window function to find the latest comment per-post:
select *
from (
select comments.*,
row_number() over (partition by post_id order by created_at desc) as rn
from comments
) dt
where dt.rn = 1
Play with the inner query in psql and you should see what row_number() is doing.
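If psql isn't handy, the same numbering can be modelled in plain Ruby. This is only a sketch over made-up rows (the hashes and integer timestamps are assumptions, not your real comments table), but it shows how partitioning by post_id and ordering by created_at desc leaves rn = 1 on the newest comment of each post:

```ruby
# Plain-Ruby model of:
#   row_number() over (partition by post_id order by created_at desc)
# The rows below are invented sample data.
comments = [
  { id: 1, post_id: 1, created_at: 100 },
  { id: 2, post_id: 1, created_at: 200 },
  { id: 3, post_id: 2, created_at: 150 },
  { id: 4, post_id: 2, created_at: 250 }
]

# Inner query: number the rows within each post, newest first.
numbered = comments.group_by { |c| c[:post_id] }.flat_map do |_post_id, rows|
  rows.sort_by { |c| -c[:created_at] }
      .each_with_index
      .map { |c, i| c.merge(rn: i + 1) }
end

# Outer query: where dt.rn = 1
latest = numbered.select { |c| c[:rn] == 1 }

p latest.map { |c| c[:id] } # => [2, 4]
```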
If we wrap that query in a latest_comments view and stick a LatestComment model in front of it, you can has_one :latest_comment and things will work. Of course, it isn't quite that easy:
ActiveRecord doesn't understand views in migrations so you can try to use something like scenic or switch from schema.rb to structure.sql.
Create the view:
class CreateLatestComments < ActiveRecord::Migration[5.2]
def up
connection.execute(%q(
create view latest_comments (id, post_id, created_at, ...) as
select id, post_id, created_at, ...
from (
select id, post_id, created_at, ...,
row_number() over (partition by post_id order by created_at desc) as rn
from comments
) dt
where dt.rn = 1
))
end
def down
connection.execute('drop view latest_comments')
end
end
That will look more like a normal Rails migration if you're using scenic. I don't know the structure of your comments table, hence all the ...s in there; you can use select * if you prefer and don't mind the stray rn column in your LatestComment. You might want to review your indexes on comments to make this query more efficient but you'd be doing that sooner or later anyway.
Create the model and don't forget to manually set the primary key or includes and references won't preload anything (but preload will):
class LatestComment < ApplicationRecord
self.primary_key = :id
belongs_to :post
end
Simplify your existing has_one to just:
has_one :latest_comment
Maybe add a quick test to your test suite to make sure that Comment and LatestComment have the same columns. The view won't automatically update itself as the comments table changes but a simple test will serve as a reminder.
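A rough sketch of what that check could compare. The column_drift helper and the sample column lists (including the body column) are hypothetical; in a real test you would pass Comment.column_names and LatestComment.column_names instead of the stand-in arrays:

```ruby
# Hypothetical helper: reports how the table's and the view's column lists differ.
def column_drift(table_columns, view_columns)
  {
    missing_from_view: table_columns - view_columns,
    extra_in_view: view_columns - table_columns
  }
end

# Stand-in data; in a Rails test these would come from
# Comment.column_names and LatestComment.column_names.
comment_columns = %w[id post_id body created_at updated_at]
view_columns    = %w[id post_id body created_at]

drift = column_drift(comment_columns, view_columns)
p drift
```

The test would then assert that both lists in the result are empty (or that extra_in_view contains only rn if you went with select *).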
When someone complains about "logic in the database", tell them to take their dogma elsewhere as you have work to do.
Just so it doesn't get lost in the comments, your main problem is that you're abusing the scope argument in the has_one association. When you say something like this:
Post.includes(:latest_comment).references(:latest_comment)
the scope argument to has_one ends up in the join condition of the LEFT JOIN that includes and references add to the query. ORDER BY doesn't make sense in a join condition so ActiveRecord doesn't include it and your association falls apart. You could try making the scope instance-dependent (i.e. ->(post) { some_query_with_post_in_a_where... }) to get a WHERE clause into the join condition, but then ActiveRecord will raise an ArgumentError because it doesn't know how to use an instance-dependent scope with includes and references.
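To see why losing the ORDER BY produces [1, 3], here is a simplified plain-Ruby model of the joined result. It is not ActiveRecord's actual code, and the row order is an assumption (a database is free to return joined rows in any order without an ORDER BY); it only illustrates that a has_one can hold a single record, so whichever comment row shows up first per post wins:

```ruby
# Simplified model of the LEFT JOIN result, not ActiveRecord internals.
posts = [{ id: 1 }, { id: 2 }]
comments = [
  { id: 1, post_id: 1 }, { id: 2, post_id: 1 },
  { id: 3, post_id: 2 }, { id: 4, post_id: 2 }
]

# LEFT JOIN with only the foreign-key condition: every comment row comes
# back, here assumed to arrive in id order.
joined = posts.flat_map do |post|
  matches = comments.select { |c| c[:post_id] == post[:id] }
  matches.empty? ? [[post, nil]] : matches.map { |c| [post, c] }
end

# A has_one holds one record, so keep the first comment seen per post.
first_seen = {}
joined.each { |post, comment| first_seen[post[:id]] ||= comment }

p first_seen.values.map { |c| c[:id] } # => [1, 3]
```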
Related
I already know how to use Rails to create subquery within a where condition, like so:
Order.where(item_id: Item.select(:id).where(user_id: 10))
However, my case is a little bit more tricky as you'll see. I'm trying to convert this query:
Post.find_by_sql(
<<-SQL
SELECT posts.*
FROM posts
WHERE (
SELECT name
FROM moderation_events
WHERE moderable_id = posts.id
AND moderable_type = 'Post'
ORDER BY created_at DESC
LIMIT 1
) = 'reported'
SQL
)
into an ActiveRecord/Arel-style call but haven't found a way so far, hence the raw SQL and the use of find_by_sql.
I'm wondering if anyone has already faced the same issue and if there's a better way to write this query?
EDIT
The raw query above is working and returns exactly the result I want. I'm using PostgreSQL.
Post model
class Post < ApplicationRecord
has_many :moderation_events, as: :moderable, dependent: :destroy, inverse_of: :moderable
end
ModerationEvent model
class ModerationEvent < ApplicationRecord
belongs_to :moderable, polymorphic: true
belongs_to :post, foreign_key: :moderable_id, inverse_of: :moderation_events
end
EDIT 2
I tried to use Rails associations to query it, using includes, joins and the like. However, the query above is very specific and works well in that form. Rewriting it as a JOIN query does not return the expected results.
The ORDER BY and LIMIT clauses are very important here and cannot be moved outside of the subquery.
A post can have multiple moderation_events. A moderation event has one of several names (a.k.a. types): reported, validated, moved and deleted.
Here is what the query is doing:
Getting all posts whose latest moderation event is a 'reported' event
I'm not trying to alter the query above because it works well and is fast in our case. I'm just trying to express it in a more ActiveRecord fashion without changing it, if possible.
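For clarity, this is the behavior the correlated subquery has, modelled in plain Ruby over invented sample rows (integer timestamps are an assumption for brevity). It also shows why a plain join filtered on name = 'reported' gives a different answer:

```ruby
# Made-up sample data standing in for the moderation_events table.
events = [
  { moderable_id: 1, moderable_type: "Post", name: "reported",  created_at: 10 },
  { moderable_id: 1, moderable_type: "Post", name: "validated", created_at: 20 },
  { moderable_id: 2, moderable_type: "Post", name: "validated", created_at: 10 },
  { moderable_id: 2, moderable_type: "Post", name: "reported",  created_at: 30 },
  { moderable_id: 3, moderable_type: "Post", name: "moved",     created_at: 5 }
]
post_ids = [1, 2, 3, 4]

# Per post: ORDER BY created_at DESC LIMIT 1, then compare the name.
reported_ids = post_ids.select do |post_id|
  latest = events
           .select { |e| e[:moderable_type] == "Post" && e[:moderable_id] == post_id }
           .max_by { |e| e[:created_at] }
  !latest.nil? && latest[:name] == "reported"
end

p reported_ids # => [2]

# A plain join on name = 'reported' also matches post 1, whose latest
# event is 'validated'.
joined_ids = events.select { |e| e[:name] == "reported" }
                   .map { |e| e[:moderable_id] }
                   .uniq

p joined_ids # => [1, 2]
```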
I've got two basic models with a join table. I've added a scope to compute a count through the relation and expose it as an attribute/pseudo-column. Everything works fine, but I'd now like to query a subset of columns and include the count column, and I don't know how to reference it.
tl;dr: How can I include an aggregate such as a count in my Arel query while also selecting a subset of columns?
Models are Employer and Employee, joined through Job. Here's the relevant code from Employer:
class Employer < ApplicationRecord
belongs_to :user
has_many :jobs
has_many :employees, through: :jobs
scope :include_counts, -> do
left_outer_joins(:employees).
group("employers.id").
select("employers.*, count(employees.*) as employees_count")
end
end
This allows me to load an employer with counts:
employers = Employer.include_counts.where(id: 1)
And then reference the count:
count = employers[0].employees_count
I'm loading the record in my controller, which then renders it. I don't want to render more fields than I need to, though. Prior to adding the count, I could do this:
employers = Employer.where(id: 1).select(:id, :name)
When I add my include_counts scope, it basically ignores the select(). It doesn't fail, but it ends up including ALL the columns, because of this line in my scope:
select("employers.*, count(employees.*) as employees_count")
If I remove employers.* from the scope, then I don't get ANY columns in my result, with or without a select() clause.
I tried this:
employers = Employer.include_counts.where(id: 1).select(:id, :name, :employees_count)
...but that produces the following SQL:
SELECT employers.*, count(employees.*) as employees_count, id, name, employees_count FROM
...and an SQL error because column employees_count doesn't exist and id and name are ambiguous.
The only thing that sort of works is this:
employers = Employer.include_counts.where(id: 1).select("employers.id, employers.name, count(employees.*) as employees_count")
...but that actually selects ALL the columns in employers, due to the scope clause again.
I also don't want that raw SQL leaking into my controller if I can avoid it. Is there a more idiomatic way to do this with Rails/Arel?
If I can't find another way to do the query, I'll probably create another scope or custom finder in the model, so that the controller code is cleaner. I'm open to suggestions for doing that as well, but I'd like to know if there's a simple way to reference computed aggregate columns like this as though they were any other column.
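One possible shape for that custom scope is to build the SELECT list from the requested columns instead of hard-coding employers.*. The employer_select_list helper below is hypothetical (it is not a Rails API) and only builds the string; it interpolates column names directly, so it should only ever receive names from trusted code:

```ruby
# Hypothetical helper, not part of Rails: qualifies the requested employer
# columns and appends the aggregate from the include_counts scope.
def employer_select_list(columns)
  qualified = columns.map { |column| "employers.#{column}" }
  (qualified + ["count(employees.*) as employees_count"]).join(", ")
end

puts employer_select_list(%i[id name])
# prints: employers.id, employers.name, count(employees.*) as employees_count
```

A scope that accepts the column list could pass this string to select in place of the fixed "employers.*, ..." string, which would keep the raw SQL inside the model.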
Comment belongs to Post.
Post belongs to Category.
How would I get a collection of the most recently updated comment for each post, for all posts belonging to a single category?
I've tried this but it just gives me one post:
category.posts.joins(:comments).order('updated_at DESC').first
Update
What I want is to fetch one comment per post: the most recently updated comment for each post.
Rails doesn't do this particularly well, especially with Postgres which forbids the obvious solution (as given by @Jon and @Deefour).
Here's the solution I've used, translated to your example domain:
class Comment < ActiveRecord::Base
scope :most_recent, -> { joins(
"INNER JOIN (
SELECT DISTINCT ON (post_id) post_id,id FROM comments ORDER BY post_id,updated_at DESC,id
) most_recent ON (most_recent.id=comments.id)"
)}
...
(DISTINCT ON is a Postgres extension to the SQL standard so it won't work on other databases.)
Brief explanation: the DISTINCT ON gets rid of all the rows except the first one for each post_id. It decides which row is first by using the ORDER BY, which has to start with post_id, then orders by updated_at DESC to get the most recent, and then by id as a tie-breaker (usually not necessary).
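The same idea as a plain-Ruby sketch over invented rows (integer timestamps are an assumption): sort the way the ORDER BY does, then keep the first row per post_id, which is what DISTINCT ON (post_id) does to the sorted rows:

```ruby
# Made-up sample rows standing in for the comments table.
comments = [
  { id: 1, post_id: 1, updated_at: 100 },
  { id: 2, post_id: 1, updated_at: 300 },
  { id: 3, post_id: 2, updated_at: 200 },
  { id: 4, post_id: 2, updated_at: 200 } # ties with id 3; id breaks the tie
]

# ORDER BY post_id, updated_at DESC, id
ordered = comments.sort_by { |c| [c[:post_id], -c[:updated_at], c[:id]] }

# DISTINCT ON (post_id): keep the first row of each post_id group.
most_recent = ordered.uniq { |c| c[:post_id] }

p most_recent.map { |c| c[:id] } # => [2, 3]
```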
Then you would use it like this:
Comment.most_recent.joins(:post).where("posts.category_id" => category.id)
The query it generates is something like:
SELECT *
FROM comments
INNER JOIN posts ON (posts.id=comments.post_id)
INNER JOIN (
SELECT DISTINCT ON (post_id) post_id,id FROM comments ORDER BY post_id,updated_at DESC,id
) most_recent ON (most_recent.id=comments.id)
WHERE
posts.category_id=#{category.id}
Single query, pretty efficient. I'd be ecstatic if someone could give me a less complex solution though!
If you want a collection of every last updated Comment, you need to base your query on Comment, not Category.
Comment.joins(:post).
where("posts.category_id = ?", category.id).
group("posts.id").
order("comments.updated_at desc")
What you're basically asking for is a has_many :through association.
Try setting up your Category model something like this:
class Category < ActiveRecord::Base
has_many :posts
has_many :comments, through: :posts
end
Then you can simply do this to get the last 10 updated comments:
category.comments.order('updated_at DESC').limit(10)
You could make this more readable with a named scope on your Comment model:
class Comment < ActiveRecord::Base
scope :recently_updated, -> { order('updated_at DESC').limit(10) }
end
Giving you this query to use to get the same 10 comments:
category.comments.recently_updated
EDIT
So, a similar solution for what you actually wanted to ask for, however it requires you to approach your associations from the Comment end of things.
First of all, set up an association on Comment so that it has knowledge of its Category:
class Comment < ActiveRecord::Base
belongs_to :post
has_one :category, through: :post
end
Now you can query your comments like so:
Comment.order('updated_at desc').joins(:post).where('posts.category_id' => category.id).group(:post_id)
Somewhat long-winded, but it works.
.first returns only a single record (the first one), so drop it and do this instead:
category.posts.joins(:comments).order('updated_at DESC')
This may be a simple question, but I seem to be pulling my hair out to find an elegant solution here. I have two ActiveRecord model classes, with a has_one and belongs_to association between them:
class Item < ActiveRecord::Base
has_one :purchase
end
class Purchase < ActiveRecord::Base
belongs_to :item
end
I'm looking for an elegant way to find all Item objects, that have no purchase object associated with them, ideally without resorting to having a boolean is_purchased or similar attribute on the Item.
Right now I have:
purchases = Purchase.all
Item.where('id not in (?)', purchases.map(&:item_id))
Which works, but seems inefficient to me, as it's performing two queries (and purchases could be a massive record set).
Running Rails 3.1.0
It's quite a common task, and a SQL OUTER JOIN usually works fine for it. Take a look here, for example.
In your case try to use something like
not_purchased_items = Item.joins("LEFT OUTER JOIN purchases ON purchases.item_id = items.id").where("purchases.id IS null")
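What the LEFT OUTER JOIN plus IS NULL filter does can be sketched in plain Ruby. The hashes below are invented sample data, not real models; the point is that items without a matching purchase still appear in the join, paired with nil, and the filter keeps exactly those:

```ruby
# Made-up sample rows standing in for the items and purchases tables.
items = [{ id: 1 }, { id: 2 }, { id: 3 }]
purchases = [{ id: 10, item_id: 1 }, { id: 11, item_id: 3 }]

# LEFT OUTER JOIN purchases ON purchases.item_id = items.id
by_item = purchases.group_by { |purchase| purchase[:item_id] }
joined = items.flat_map do |item|
  (by_item[item[:id]] || [nil]).map { |purchase| [item, purchase] }
end

# WHERE purchases.id IS NULL
not_purchased = joined.select { |_item, purchase| purchase.nil? }.map(&:first)

p not_purchased.map { |item| item[:id] } # => [2]
```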
Found two other railsey ways of doing this:
Item.includes(:purchase).references(:purchase).where("purchases.id IS NULL")
Item.includes(:purchase).where(purchases: { id: nil })
Technically the first example works without the 'references' clause but Rails 4 spits deprecation warnings without it.
A more concise version of @dimuch's solution is to use the left_outer_joins method introduced in Rails 5:
Item.left_outer_joins(:purchase).where(purchases: {id: nil})
Note that in the left_outer_joins call :purchase is singular (it is the name of the method created by the has_one declaration), and in the where clause :purchases is plural (here it is the name of the table that the id field belongs to.)
Rails 6.1 has added a query method called missing in the ActiveRecord::QueryMethods::WhereChain class.
It returns a new relation with a left outer join and where clause between the parent and child models to identify missing relations.
Example:
Item.where.missing(:purchase)
I am using Rails v2.3.2.
I have a model called UsersCar:
class UsersCar < ActiveRecord::Base
belongs_to :car
belongs_to :user
end
This model maps to the database table users_cars, which contains only two columns: user_id and car_id.
I would like to use the Rails way to count the number of car_id values where user_id=3. I know that in plain SQL I can achieve this with:
SELECT COUNT(*) FROM users_cars WHERE user_id=3;
Now I would like to get it the Rails way. I know I can do:
UsersCar.count()
but how can I add the ...where user_id=3 clause the Rails way?
According to the Ruby on Rails Guides, you can pass conditions to the count() method. For example:
UsersCar.count(:conditions => ["user_id = ?", 3])
will generate:
SELECT count(*) AS count_all FROM users_cars WHERE (user_id = 3)
If you have the User object, you could do
user.cars.size
or
user.cars.count
Another way would be to do:
UsersCar.find_all_by_user_id(3).size
And the last way that I can think of is the one mentioned above, i.e. 'UsersCar.count(conditions)'.
With the belongs_to association, you get several "magic" methods on the model to reference its associated records.
In your case:
users_car = UsersCar.find(1) #=> one record of users_car with id = 1.
users_car.user #=> the associated user.
users_car.car #=> the associated car.
However, I think you are understanding the associations wrong, based on the fact that your UsersCar is named awkwardly.
It seems you want
User has_and_belongs_to_many :cars
Car has_and_belongs_to_many :users
Please read the above-mentioned guide on associations if you want to know more about many-to-many associations in Rails.
I managed to find the way to count with condition:
UsersCar.count(:conditions => "user_id=3")