Environment: Rails 3.2.22
Question:
Let's say I have the models Topic, Post, and User.
Post belongs to Topic
User has many Posts
I want to query Topic.all, but include only the posts associated with a given user.
I've tried includes and eager_load with a where condition on the user id, but only topics with a post that meets the condition are returned.
What I want is all topics returned, including only the posts that match the user_id condition.
After playing around with ActiveRecord I figured out how to do the query. It requires the left join as pointed out by @pshoukry, but that answer is missing two items.
An AND condition is required in the join to include only posts for a specific user.
The ActiveRecord select method needs to be appended at the end to include the fields you want.
To include all fields:
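# joins does not bind ? placeholders from extra arguments, so the id is coerced to an integer and interpolated
Topic.joins("LEFT JOIN posts ON posts.topic_id = topics.id AND posts.user_id = #{user.id.to_i}").select('topics.*, posts.*')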
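To see why the user condition has to sit in the ON clause rather than in a WHERE, here is a minimal plain-Ruby sketch that models the two joins with in-memory arrays. It does not touch ActiveRecord or a database; the sample rows and method names are hypothetical and only illustrate the join semantics.

```ruby
# Hypothetical in-memory rows standing in for the topics and posts tables.
FORUM_TOPICS = [
  { id: 1, title: "Ruby" },
  { id: 2, title: "Rails" },
  { id: 3, title: "SQL" }
].freeze

FORUM_POSTS = [
  { id: 10, topic_id: 1, user_id: 7 },
  { id: 11, topic_id: 1, user_id: 8 },
  { id: 12, topic_id: 2, user_id: 8 }
].freeze

# Models: LEFT JOIN posts ON posts.topic_id = topics.id AND posts.user_id = <user_id>
# Every topic survives; a topic without a matching post is paired with nil.
def left_join_filter_in_on(user_id)
  FORUM_TOPICS.flat_map do |topic|
    matches = FORUM_POSTS.select do |post|
      post[:topic_id] == topic[:id] && post[:user_id] == user_id
    end
    matches.empty? ? [[topic, nil]] : matches.map { |post| [topic, post] }
  end
end

# Models: LEFT JOIN posts ON posts.topic_id = topics.id WHERE posts.user_id = <user_id>
# The WHERE runs after the join, so rows paired with nil are discarded.
def left_join_filter_in_where(user_id)
  joined = FORUM_TOPICS.flat_map do |topic|
    matches = FORUM_POSTS.select { |post| post[:topic_id] == topic[:id] }
    matches.empty? ? [[topic, nil]] : matches.map { |post| [topic, post] }
  end
  joined.select { |_topic, post| post && post[:user_id] == user_id }
end

p left_join_filter_in_on(7).map { |topic, post| [topic[:id], post && post[:id]] }
# => [[1, 10], [2, nil], [3, nil]]
p left_join_filter_in_where(7).map { |topic, post| [topic[:id], post[:id]] }
# => [[1, 10]]
```

With the filter in ON, all three topics come back; with the filter in WHERE, only the topic that has a post by user 7 remains, which matches the behaviour described in the question.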
Now for the caveat. For those using Postgres on Rails 3.2.*, there is a bug where the columns of the joined table are all returned as strings, regardless of their declared data type. This issue is not present in Rails 4. There was an issue posted in the Rails GitHub repo, but I can't seem to locate it. Since 3.2 is no longer supported, they have no intention of fixing it.
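One possible workaround, which is my own assumption and not something from the answer above, is to cast the affected values by hand after loading. The sketch below is plain Ruby over a hash of strings; the column names and the caster table are hypothetical.

```ruby
# Hypothetical map from joined column name to a casting lambda.
JOINED_COLUMN_CASTS = {
  "user_id"   => ->(value) { Integer(value) },
  "published" => ->(value) { value == "t" } # Postgres renders booleans as "t" / "f"
}.freeze

# Returns a copy of the row with known columns cast to Ruby types.
# nil values and columns without a caster are passed through unchanged.
def cast_joined_row(row)
  row.each_with_object({}) do |(column, value), typed|
    caster = JOINED_COLUMN_CASTS[column]
    typed[column] = (value.nil? || caster.nil?) ? value : caster.call(value)
  end
end

p cast_joined_row("user_id" => "7", "published" => "t", "title" => "Hello")
# => {"user_id"=>7, "published"=>true, "title"=>"Hello"}
```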
Try using a left join in your relation:
Topic.joins("LEFT JOIN posts ON topics.id = posts.topic_id")
Related
I have Posts and Comments tables. They have belongs_to and has_many relation. Everything works great.
What I need to do is write the SQL to pull posts that have comments. How can I do that in the controller?
I need some sort of Join I guess. Right?
Thank you
Post.joins(:comments) by itself will give you all the posts that have related comments.
The INNER JOIN does that work. An inner join between two tables returns one row for every pair of records that meets the join condition; records with no match are left out.
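A minimal plain-Ruby sketch of what the inner join computes, using hypothetical in-memory rows rather than a database. It also shows a side effect worth knowing: a post with two comments appears twice in the joined rows, so the result usually needs de-duplicating (on a relation that would be uniq in Rails 3.2 or distinct in Rails 4 and later).

```ruby
# Hypothetical in-memory rows standing in for the posts and comments tables.
BLOG_POSTS = [
  { id: 1, title: "First" },
  { id: 2, title: "Second" },
  { id: 3, title: "Third" }
].freeze

BLOG_COMMENTS = [
  { id: 1, post_id: 1 },
  { id: 2, post_id: 1 },
  { id: 3, post_id: 3 }
].freeze

# Models: posts INNER JOIN comments ON comments.post_id = posts.id
# One row per (post, comment) pair; posts without comments produce no row.
def inner_join_rows
  BLOG_POSTS.flat_map do |post|
    BLOG_COMMENTS
      .select { |comment| comment[:post_id] == post[:id] }
      .map { |comment| [post, comment] }
  end
end

# Collapses the joined rows back to each post once.
def posts_with_comments
  inner_join_rows.map(&:first).uniq
end

p inner_join_rows.map { |post, _comment| post[:id] } # => [1, 1, 3]
p posts_with_comments.map { |post| post[:id] }       # => [1, 3]
```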
I'm writing a search for a project I'm working on. It is meant to be able to search the body of articles and produce a list of their authors, ordered by the number of matching articles and including the relevant articles only, not all of their articles.
I currently have the following query:
Author.includes(:articles).where('articles.body ilike ?', '%foo%').references(:articles)
The use of includes in this case means that only the relevant articles (not all articles) are preloaded, which is exactly what I want. However, when it comes to ordering by the number of included articles, I'm not sure how to proceed.
I should note I want to do this in ActiveRecord because pagination will be applied after the query, so I'm not after a pure-Ruby solution.
I should note I'm using PostgreSQL 9.3.
Edit: using raw SQL
This seems to work on its own like so:
Author.includes(:articles).where('articles.body ilike ?', '%foo%').references(:articles).select('authors.*, (SELECT COUNT(0) FROM articles WHERE articles.author_id = authors.id) AS article_count').order('article_count DESC')
This works fine. However, if I add .limit(1) it breaks.
PG::UndefinedColumn: ERROR: column "article_count" does not exist
Any idea why adding limit breaks it? The generated query looks very different too:
SELECT DISTINCT "authors"."id", article_count AS alias_0 FROM "authors" LEFT OUTER JOIN "articles" ON "articles"."author_id" = "authors"."id" WHERE (articles.body ilike '%microsoft%') ORDER BY article_count DESC LIMIT 1
I don't think there's an out of the box solution for this. You have to write raw sql to do this but you can combine it with existing ActiveRecord queries.
Author
.includes(:articles)
.select('authors.*, (SELECT COUNT(0) FROM articles WHERE articles.author_id = authors.id) AS article_count')
.order('article_count DESC')
So the only thing to explain here is the select part. The first part, authors.*, selects all fields under the authors table and this is the default. Since we want to also count the number of articles, we create a subquery and pass its result as one of the pseudo columns of authors (we called it article_count). The last part is to just call order using article_count.
This solution assumes a couple of things which you'll have to fine tune depending on your setup.
Author by convention in Rails maps to an authors table. If it uses STI (inherits from a User class and uses the users table), you'll need to change authors to users.
articles.author_id assumes that the foreign key is author_id (and essentially, an article is only written by a single author). Change to whatever the foreign key is.
So given that, you'll have an array of authors ordered by the number of articles they've written.
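A plain-Ruby sketch of what the subquery and ORDER BY compute, over hypothetical in-memory rows. Note that the subquery as written counts every article by an author; to rank by matching articles only, the same body filter would have to be repeated inside the subquery. The optional term argument below models that variant and is my own addition.

```ruby
# Hypothetical in-memory rows standing in for the authors and articles tables.
SAMPLE_AUTHORS = [
  { id: 1, name: "Ann" },
  { id: 2, name: "Bob" },
  { id: 3, name: "Cy" }
].freeze

SAMPLE_ARTICLES = [
  { author_id: 1, body: "foo one" },
  { author_id: 1, body: "foo two" },
  { author_id: 2, body: "Foo" },
  { author_id: 2, body: "bar" },
  { author_id: 2, body: "baz" }
].freeze

# Attaches article_count to each author and sorts by it, descending.
# With a term, only articles whose body contains it (case-insensitively,
# roughly like ILIKE '%term%') are counted.
def authors_by_article_count(term = nil)
  relevant = SAMPLE_ARTICLES
  relevant = relevant.select { |a| a[:body].downcase.include?(term.downcase) } if term

  counts = relevant.each_with_object(Hash.new(0)) do |article, tally|
    tally[article[:author_id]] += 1
  end

  SAMPLE_AUTHORS
    .map { |author| author.merge(article_count: counts[author[:id]]) }
    .sort_by { |author| [-author[:article_count], author[:id]] } # id breaks ties
end

p authors_by_article_count.map { |a| [a[:name], a[:article_count]] }
# => [["Bob", 3], ["Ann", 2], ["Cy", 0]]
p authors_by_article_count("foo").map { |a| [a[:name], a[:article_count]] }
# => [["Ann", 2], ["Bob", 1], ["Cy", 0]]
```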
The query below gets me the user's next question based on the status of that question. It gets all the questions for that specific section and then the scope does a LEFT JOIN on the statuses that belong to that user.
My question is: this doesn't seem like a very Railsy way to do it. Is there a better way of filtering my table than this clumsy AND and .to_s business? My issue is that, without the AND, if any user has answered that question then the left join will fill up with that user's answer, whereas I require it to be null.
Essentially the query works but is ugly and I can't figure out if it's the most efficient way!
scope :next_for_user, lambda { |user|
joins("LEFT JOIN user_question_statuses ON user_question_statuses.question_id = questions.id AND user_question_statuses.user_id = ", user.id.to_s).
reorder("user_question_statuses.answered ASC NULLS FIRST").
order("user_question_statuses.updated_at ASC NULLS FIRST").
limit(1)
}
Edit:
I realise this method is particularly vulnerable to SQL injection so I've replaced the main line in the query with:
joins(sanitize_sql_array(["LEFT JOIN user_question_statuses ON user_question_statuses.question_id = questions.id AND user_question_statuses.user_id = %d", user.id]))
which seems to work and forces the input to be an integer only.
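As far as I can tell, when the statement contains no ? or named placeholders, sanitize_sql_array falls through to Ruby's own String#% formatting, so the %d behaves like Kernel#format. The plain-Ruby sketch below shows that behaviour without Rails; the constant and method names are hypothetical.

```ruby
# The join template from the edit above, with %d as the only placeholder.
STATUS_JOIN_TEMPLATE =
  "LEFT JOIN user_question_statuses " \
  "ON user_question_statuses.question_id = questions.id " \
  "AND user_question_statuses.user_id = %d".freeze

# Builds the join fragment. %d accepts integers and integer-like strings
# and raises ArgumentError for any other string.
def status_join_sql(user_id)
  format(STATUS_JOIN_TEMPLATE, user_id)
end

puts status_join_sql(42)

begin
  status_join_sql("42 OR 1=1")
rescue ArgumentError => e
  puts "rejected: #{e.message}"
end
```

A malicious string therefore raises before any SQL is built, which is the "integer only" behaviour described above.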
Edit 2:
My other option is to use find_each and then use first_or_create to create empty question statuses for that particular section of questions for the current user. This could happen as and when they need them, before looking for a question. This would allow me to do a RIGHT JOIN from the questions on to those statuses, knowing they exist. But if the first method is efficient and safe (and as Railsy as it can be), then there's no reason to change that.
Edit 3:
I have structured this query in this way because - from the section model that has_many questions - I want to find the next question that should be passed to a user.
To find this I need to join all of the user_question_statuses on to all of the section model's questions. The only way this can be done is on question.id. However, there are many user_question_statuses with that question id for different users. So when joining I need the AND clause to filter down the user_question_statuses to only the ones from that user before the join happens. A user can obviously only have one status per question.
I use a LEFT JOIN so that if a status does not yet exist (they only get created after a user attempts a question for the first time) the question still produces a row, with NULLs in the status columns, which can then move to the top (hence NULLS FIRST) and potentially be served to the user.
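The selection logic described above can be modelled in plain Ruby as a sketch, with hypothetical in-memory rows and no database: left-join each question to the given user's status, then take the first row under the ordering "no status first, then unanswered before answered, then oldest updated_at".

```ruby
# Hypothetical in-memory rows for one section's questions and all users' statuses.
SECTION_QUESTIONS = [{ id: 1 }, { id: 2 }, { id: 3 }].freeze

QUESTION_STATUSES = [
  { question_id: 1, user_id: 7, answered: true,  updated_at: 100 },
  { question_id: 2, user_id: 7, answered: false, updated_at: 50 },
  { question_id: 1, user_id: 9, answered: true,  updated_at: 5 },
  { question_id: 2, user_id: 9, answered: false, updated_at: 10 },
  { question_id: 3, user_id: 9, answered: false, updated_at: 20 }
].freeze

# Models the LEFT JOIN ... AND user_id = <user_id> followed by
# ORDER BY answered ASC NULLS FIRST, updated_at ASC NULLS FIRST LIMIT 1.
def next_question_for(user_id)
  joined = SECTION_QUESTIONS.map do |question|
    status = QUESTION_STATUSES.find do |s|
      s[:question_id] == question[:id] && s[:user_id] == user_id
    end
    [question, status] # status is nil when this user has none yet
  end

  question, _status = joined.min_by do |_question, status|
    if status.nil?
      [0, 0, 0] # NULLS FIRST
    else
      [1, status[:answered] ? 1 : 0, status[:updated_at]]
    end
  end
  question
end

p next_question_for(7) # => {:id=>3}  (no status yet for question 3)
p next_question_for(9) # => {:id=>2}  (unanswered, oldest update)
```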
This may all be extremely unclear!
What you did was skip the ORM layer offered by Active Record and construct the queries yourself. Your feeling is correct that this approach has many limitations and is not a good fit for the MVC model that Rails follows. I would suggest reading through the Active Record Query Interface guide to get the concepts for doing it the Rails/OOP way.
@teachers = User.joins(:students).where("student_id IS NOT NULL")
The above works, the below doesn't.
@teachers = User.includes(:students).where("student_id IS NOT NULL")
As far as I understand, joins and includes should both bring the same result with different performance. According to this, you use includes to load associated records of the objects called by Model, whereas joins simply joins two tables together. Using includes can also prevent N+1 queries.
First question: why does my second line of code not work?
Second question: should anyone always use includes in a case similar to above?
You use joins when you want to query against the joined model. This is doing an inner join between your tables.
Includes is when you want to eager load the associated model to the end result.
This allows you to call the association on any of the results without having to again do the db lookup.
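You cannot query against a model loaded via includes unless Rails also joins its table, which a bare string condition like yours does not trigger. If you want to query against it, use joins (you can do both!) or, on Rails 4 and later, add references(:students).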
How can I write this SQL statement the 'Rails way'?
SELECT users.*, count(invitations.id) AS invitations_count
FROM users
LEFT OUTER JOIN invitations ON invitations.sender_id = users.id
GROUP BY users.id
HAVING count(invitations.sender_id) < 5 AND users.email_address NOTNULL
ORDER BY invitations_count
Should I be using squeel gem for these kinds of queries?
This is available if you want to just transfer the SQL:
User.find_by_sql("...")
However, if you follow Rails conventions with your naming (invitations.user_id), and store the invitation count on users (and update it when you add invitations) rather than doing a join to get it each time, you could do this:
On users:
scope :emailable, lambda { where('users.email_address IS NOT NULL') }
scope :low_invitations, lambda { where('users.invitation_count < 5') }
Then to query users with under 5 invites and an email address, ordered by no of invites:
@users = User.emailable.low_invitations.order('invitation_count asc')
Then access the invitations for a user with something like:
@user.invitations
@user.invitations.count
etc
For the above, you would have to add an invitation_count column to users, change sender_id to user_id, and add some scopes to the user model. You could also probably use joins to get a count without having a denormalised invitation_count.
If you are going to use Rails it's far easier to go with these conventions than against them, and you might find it worthwhile setting up a small experimental app and playing with relations, plus reading the associations guide:
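For reference, here is a plain-Ruby sketch of what the original SQL computes (count invitations per sender, keep users with fewer than five and a non-null email address, order by the count). It uses hypothetical in-memory rows and no database, and is only meant to pin down the expected result before choosing between the join and the denormalised column.

```ruby
# Hypothetical in-memory rows standing in for the users and invitations tables.
SAMPLE_USERS = [
  { id: 1, email_address: "a@example.com" },
  { id: 2, email_address: nil },
  { id: 3, email_address: "c@example.com" },
  { id: 4, email_address: "d@example.com" }
].freeze

# User 1 sent 2 invitations, user 2 sent 1, user 3 sent 6, user 4 sent none.
SAMPLE_INVITATIONS = ([1] * 2 + [2] + [3] * 6).map { |id| { sender_id: id } }.freeze

# Models the LEFT OUTER JOIN + GROUP BY + HAVING + ORDER BY from the question.
def users_with_few_invitations(max = 5)
  counts = SAMPLE_INVITATIONS.each_with_object(Hash.new(0)) do |invitation, tally|
    tally[invitation[:sender_id]] += 1
  end

  SAMPLE_USERS
    .map { |user| user.merge(invitations_count: counts[user[:id]]) } # 0 when none
    .select { |user| user[:invitations_count] < max && !user[:email_address].nil? }
    .sort_by { |user| [user[:invitations_count], user[:id]] }
end

p users_with_few_invitations.map { |u| [u[:id], u[:invitations_count]] }
# => [[4, 0], [1, 2]]
```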
http://guides.rubyonrails.org/association_basics.html
You need to define proper models for Invitations and Users, then connect them via many-to-one or many-to-many relationships (using belongs_to, has_many, etc.). Then you'll be able to write code that will generate this or a similar query "under the hood".
When using Rails you have to stop thinking SQL and start thinking Models, Views, Controllers.