I am struggling to use Rails' ActiveRecord query interface to replicate a query that has an inner join subquery. How would I replicate the following:
SELECT ass.name, COUNT(DISTINCT a.question_id) AS
answered_questions, tq.total_questions
FROM assessments AS ass
INNER JOIN (SELECT ass.id, COUNT(q.id) AS total_questions FROM
questions AS q INNER JOIN assessments AS ass ON ass.id=q.assessment_id
GROUP BY
ass.id) as tq ON tq.id=ass.id
INNER JOIN questions AS q ON q.assessment_id=ass.id
INNER JOIN answers AS a ON a.assessment_id=ass.id AND a.question_id=q.id
INNER JOIN org_assessments AS oa ON ass.id=oa.assessment_id
INNER JOIN users AS u ON oa.org_id=u.org_id AND
a.user_id=u.id
WHERE u.id=1
GROUP BY ass.name, tq.total_questions
ORDER BY ass.created_at DESC
LIMIT 10
I don't seem to be able to get this to work with the subquery using the query builder. Without the subquery I have this, which works and gives me the assessment title and number of questions answered:
Question.joins(:assessment => {:org_assessments => {:org => :users}}).joins(:answers)
.where(answers:{:user_id => params[:id]})
.distinct('answers.question_id').group(['assessments.name']).count()
How can I write this to include the subquery as in the original SQL above?
You can pass the subquery to the joins method as a string:
subquery =
Question.
joins(:assessment).
group('assessments.id').
select('assessments.id, COUNT(questions.id) AS total_questions').to_sql
Question.joins("INNER JOIN (#{subquery}) AS tq ON tq.id = questions.assessment_id")
And you can combine it with the other parts of the query:
Question.
joins(:assessment => {:org_assessments => {:org => :users}}).joins(:answers).
joins("INNER JOIN (#{subquery}) AS tq ON tq.id = assessments.id").
where(answers: {:user_id => params[:id]}).
distinct('answers.question_id').group(['assessments.name']).count()
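The interpolation step is plain string building, so it can be checked without a database. Here is a minimal sketch, assuming a hypothetical helper named subquery_join (it is not part of Rails) and a hand-written SQL string standing in for what to_sql would return. It uses the real table name assessments in the ON clause, since Active Record does not create the ass alias from the original SQL:

```ruby
# Hypothetical helper (not a Rails API): wraps a subquery's SQL in an
# INNER JOIN fragment that can be passed to the string form of `joins`.
def subquery_join(subquery_sql, as:, on:)
  "INNER JOIN (#{subquery_sql}) AS #{as} ON #{on}"
end

# Stand-in for the string that `relation.to_sql` would return.
totals_sql = "SELECT assessments.id, COUNT(questions.id) AS total_questions " \
             "FROM questions INNER JOIN assessments " \
             "ON assessments.id = questions.assessment_id " \
             "GROUP BY assessments.id"

fragment = subquery_join(totals_sql, as: "tq", on: "tq.id = assessments.id")
puts fragment
```

Printing the fragment before handing it to joins is a cheap way to spot a missing INNER JOIN keyword or a misspelled alias.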
Related
I have two tables, users and posts, with a has_many association between them. I want to fetch details of both users and posts in a single query. I can write the SQL query, but I don't want to run raw SQL in the code (using the execute method), because I think this is simple enough to be written with Active Record.
Here is the SQL query:
SELECT a.id, a.name, a.timestamp, b.id, b.user_id, b.title
FROM users a
INNER JOIN (SELECT id, user_id, title, from, to FROM posts) b on b.user_id = a.id
WHERE a.id IN (1, 2, 3);
I think includes does not help here because I'm dealing with a large amount of data.
Can anyone help me?
If you just want those specific columns and nothing else, then this will work:
User.joins(:posts)
.where(id: [1,2,3])
.select("users.id, users.name, users.timestamp,
posts.id as post_id, posts.user_id as post_user_id,
posts.title as post_title")
This will return an ActiveRecord::Relation of User objects with virtual attributes for post_id, post_user_id (Not sure why you need this one since you already selected users.id), and post_title.
The query produced will be
SELECT users.id,
users.name,
users.timestamp,
posts.id as post_id,
posts.user_id as post_user_id,
posts.title as post_title
FROM users
INNER JOIN posts on posts.user_id = users.id
where users.id IN ( 1, 2, 3);
Please note you may have multiple User objects, one for each Post, just as the SQL query does.
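If one entry per user is wanted instead, the repeated rows can be collapsed in Ruby after the query returns. This is a minimal plain-Ruby sketch using made-up hashes in place of the User objects; the attribute names mirror the aliases in the select above, and nothing here touches the database:

```ruby
# Rows shaped like the joined result: one row per (user, post) pair.
# The data is made up for illustration.
rows = [
  { id: 1, name: "Ann", post_id: 10, post_title: "First" },
  { id: 1, name: "Ann", post_id: 11, post_title: "Second" },
  { id: 2, name: "Bob", post_id: 12, post_title: "Hello" }
]

# Collapse the repeated user rows into one entry per user id.
posts_by_user = rows.group_by { |row| row[:id] }.map do |user_id, user_rows|
  {
    id: user_id,
    name: user_rows.first[:name],
    posts: user_rows.map { |r| { id: r[:post_id], title: r[:post_title] } }
  }
end

posts_by_user.each { |u| puts "#{u[:name]}: #{u[:posts].size} post(s)" }
```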
You can execute your exact query using the string version of joins e.g.
User.joins("INNER JOIN (SELECT id, user_id, title, from, to FROM posts) b on b.user_id = users.id")
.where(id: [1,2,3])
.select("users.id, users.name, users.timestamp,
b.id as post_id, b.user_id as post_user_id,
b.title as post_title")
Additionally, to avoid some of the overhead of instantiating model objects, you can use Arel instead, e.g.
users_table = User.arel_table
posts_table = Post.arel_table
query = users_table.project(Arel.star)
.join(posts_table)
.on(posts_table[:user_id].eq(users_table[:id]))
.where(users_table[:id].in([1,2,3]))
ActiveRecord::Base.connection.exec_query(query.to_sql)
This will return an ActiveRecord::Result with two useful methods: columns (the columns selected) and rows. You can convert this to an Array of Hashes (#to_hash; newer Rails versions use #to_a instead), but note that any columns with duplicate names (id, for instance) will overwrite one another.
You could fix this by specifying the columns you want selected in the project portion, e.g. your current query would be:
query = users_table.project(
users_table[:id],
users_table[:name],
users_table[:timestamp],
posts_table[:id].as('post_id'),
posts_table[:user_id].as('post_user_id'),
posts_table[:title].as('post_title')
).join(posts_table)
.on(posts_table[:user_id].eq(users_table[:id]))
.where(users_table[:id].in([1,2,3]))
ActiveRecord::Base.connection.exec_query(query.to_sql).to_hash
Since none of the names collide now, each row can be structured into a nice Hash where the keys are the column names and the values are the row values for that record.
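The name collision is easy to reproduce in plain Ruby. The sketch below only illustrates the effect with made-up column lists and one made-up row; it is not how ActiveRecord::Result is implemented internally:

```ruby
row = [1, "Ann", 10, 1, "First"]

# Column names as a SELECT * over both tables would report them: "id" twice.
star_columns    = ["id", "name", "id", "user_id", "title"]
# Column names after aliasing the posts columns.
aliased_columns = ["id", "name", "post_id", "post_user_id", "post_title"]

collided = star_columns.zip(row).to_h     # the later "id" overwrites the earlier one
distinct = aliased_columns.zip(row).to_h  # every value keeps its own key

puts collided.size  # number of keys that survived with duplicate names
puts distinct.size  # number of keys with aliased names
```

With duplicate names the users id is lost and only the posts id remains under "id"; with aliases both values are kept.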
users = User.joins(:posts).includes(:posts).where(id: [1, 2, 3])
This will give you all the users with their posts.
You can then do whatever you want with them. For example, to access the posts of the first retrieved user:
first_user_posts = users.first.posts # this will not make additional DB queries as you used includes and data is already added
We use joins to get an INNER JOIN in the SQL.
We use includes to load all the posts into memory.
I have two tables users and posts and they have association of
has_many. I want to fetch details of both users and posts in a single
query.
This can be done with includes, like so:
users = User.includes(:posts).where({posts: {user_id: [1,2,3]}})
The other options are eager_load and preload, which you can use depending on your requirements. For more, see https://blog.arkency.com/2013/12/rails4-preloading/
I want to produce the following SQL using Active Record.
WHERE (column_name1, column_name2) IN (SELECT ....)
I don't know how to do this in Active Record.
I've tried these so far:
where('column_name1, column_name2' => {})
where([:column_name1, :column_name2] => {})
This is the full query I'd like to create
SELECT a, Count(1)
FROM table
WHERE ( a, b ) IN (SELECT a,
Max(b)
FROM table
GROUP BY a)
GROUP BY a
HAVING Count(1) > 1
I've already written a scope for the subquery
Thanks in advance.
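WHERE (column_name1, column_name2) IN (SELECT ....) is a row-value comparison. Some databases accept it (PostgreSQL and MySQL do, for example), but Active Record's hash conditions have no syntax for it, so it has to be written as a SQL string.
If you need a form that avoids row values, you can test each column separately, although this is looser than the pair match because the two columns no longer have to come from the same subquery row:
WHERE column_name1 IN (select ....) OR column_name2 IN (select ...)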
The same query can be used directly in Active Record:
where("column_name1 IN (select ...) OR column_name2 IN (select...)")
Avoiding duplication:
selected_values = select ...
where("column_name1 IN (?) OR column_name2 IN (?)", selected_values, selected_values)
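Before relying on the OR form, it is worth checking whether it selects the same rows as the pair comparison in the question. Here is a minimal plain-Ruby sketch with made-up [a, b] rows, assuming each IN checks its own column against the matching column of the subquery:

```ruby
# Made-up rows of [a, b], and the pairs that a subquery like
# "SELECT a, MAX(b) ... GROUP BY a" would return for them.
rows  = [[1, 5], [1, 9], [2, 9], [2, 3]]
pairs = rows.group_by(&:first).map { |a, group| [a, group.map(&:last).max] }

# Pair match: both columns must come from the same subquery row.
pair_match = rows.select { |row| pairs.include?(row) }

# Column-by-column match: two independent IN checks combined with OR.
a_values = pairs.map(&:first)
b_values = pairs.map(&:last)
or_match = rows.select { |a, b| a_values.include?(a) || b_values.include?(b) }

puts "pair match: #{pair_match.size} rows, OR match: #{or_match.size} rows"
```

In this data the pair match keeps only the row with the largest b for each a, while the OR form keeps every row, because every value of a appears somewhere in the subquery result.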
So I decided to use an inner join to gain the same functionality. Here is my solution.
select(:column1, 'Count(1)').
joins("INNER JOIN (#{subquery.to_sql}) AS table2 ON
table1.column1=table2.column1
AND table1.column2=table2.column2")
I have a User model and an Item model. I want to rank users according to the value of the items they have. I want to do the equivalent of this query:
SELECT rank() OVER (ORDER BY grand_total DESC), u.*, grand_total
FROM users AS u
JOIN
(SELECT user_id, SUM(amount) AS grand_total FROM items WHERE EXTRACT(YEAR FROM sold_at)='2012' GROUP BY user_id) AS i
ON u.id = i.user_id;
Specifically, I don't know how to join on my select.
Given the problem as you describe it, I would write the query thus:
select users.*, sum(items.amount) as rank
from users
join items on items.user_id = users.id
group by users.id
order by rank desc;
Which would translate into Active Record as:
User.select('users.*, sum(items.amount) as rank').joins('join items on items.user_id = users.id').group('users.id').order('rank desc')
This has the handy side-effect that you can call .rank on the resulting User objects and get the value of the rank column from the query, in case you need to display it.
Is there something about your situation I'm not grasping here, or would this work?
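Note that the rank column in this query holds the summed amount rather than an ordinal position like the one rank() OVER returns in the question. If the position is needed, it can be derived from the totals. This is a minimal plain-Ruby sketch on made-up [user_id, amount] pairs, with no database involved:

```ruby
# Made-up items: [user_id, amount].
items = [[1, 50], [2, 120], [1, 30], [3, 120], [2, 10]]

# Equivalent of SUM(amount) ... GROUP BY user_id.
totals = items.group_by(&:first).transform_values { |rows| rows.sum(&:last) }

# Equivalent of ORDER BY grand_total DESC.
ordered = totals.sort_by { |_user_id, total| -total }

# rank(): one plus the number of users with a strictly greater total.
ranked = ordered.map do |user_id, total|
  position = 1 + totals.values.count { |other| other > total }
  { user_id: user_id, grand_total: total, rank: position }
end

ranked.each { |r| puts "##{r[:rank]} user #{r[:user_id]} (#{r[:grand_total]})" }
```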
Let's say that I have 4 models which are related in the following ways:
Schedule has foreign key to Project
Schedule has foreign key to User
Project has foreign key to Client
In my Schedule#index view I want the most optimized SQL so that I can display links to the Schedule's associated Project, Client, and User. So, I should not pull all of the columns for the Project, Client, and User; only their IDs and Name.
If I were to manually write the SQL it might look like this:
select
s.id,
s.schedule_name,
s.schedule_type,
s.project_id,
p.name project_name,
p.client_id client_id,
c.name client_name,
s.user_id,
u.login user_login,
s.created_at,
s.updated_at,
s.data_count
from
Users u inner join
Clients c inner join
Schedules s inner join
Projects p
on p.id = s.project_id
on c.id = p.client_id
on u.id = s.user_id
order by
s.created_at desc
My question is: What would the ActiveRecord code look like to get Rails 3 to generate that SQL? For example, something like:
@schedules = Schedule. # ?
I already have the associations setup in the models (i.e. has_many / belongs_to).
I think this will get you (or at least help you get) what you're looking for:
Schedule.select("schedules.id, schedules.schedule_name, projects.name as project_name").joins(:user, :project=>:client).order("schedules.created_at DESC")
should yield:
SELECT schedules.id, schedules.schedule_name, projects.name as project_name FROM `schedules` INNER JOIN `users` ON `users`.`id` = `schedules`.`user_id` INNER JOIN `projects` ON `projects`.`id` = `schedules`.`project_id` INNER JOIN `clients` ON `clients`.`id` = `projects`.`client_id` ORDER BY schedules.created_at DESC
The main problem I see in your approach is that you're looking for schedule objects but basing your initial "FROM" clause on "User" and your associations given are also on Schedule, so I built this solution based on the plain assumption that you want schedules!
I also didn't include all of your selects to save some typing, but you get the idea. You will simply have to add each one qualified with its full table name.
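To save the typing, the qualified select list can be assembled from a small table-to-columns map. This is only a sketch: the wanted hash is a hypothetical structure, filled in with some of the columns from the question's SQL, and the resulting string is what would be passed to select:

```ruby
# Hypothetical map: table name => list of [column, alias or nil].
wanted = {
  "schedules" => [["id", nil], ["schedule_name", nil], ["project_id", nil], ["user_id", nil]],
  "projects"  => [["name", "project_name"], ["client_id", "client_id"]],
  "clients"   => [["name", "client_name"]],
  "users"     => [["login", "user_login"]]
}

# Qualify every column with its table name and append the alias if there is one.
select_list = wanted.flat_map do |table, cols|
  cols.map do |column, column_alias|
    qualified = "#{table}.#{column}"
    column_alias ? "#{qualified} AS #{column_alias}" : qualified
  end
end.join(", ")

puts select_list
```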
Short version: How do I write this query in squeel?
SELECT OneTable.*, my_count
FROM OneTable JOIN (
SELECT DISTINCT one_id, count(*) AS my_count
FROM AnotherTable
GROUP BY one_id
) counts
ON OneTable.id=counts.one_id
Long version: rocket_tag is a gem that adds simple tagging to models. It adds a method tagged_with. Supposing my model is User, with an id and name, I could invoke User.tagged_with ['admin','sales']. Internally it uses this squeel code:
select{count(~id).as(tags_count)}
.select("#{self.table_name}.*").
joins{tags}.
where{tags.name.in(my{tags_list})}.
group{~id}
Which generates this query:
SELECT count(users.id) AS tags_count, users.*
FROM users INNER JOIN taggings
ON taggings.taggable_id = users.id
AND taggings.taggable_type = 'User'
INNER JOIN tags
ON tags.id = taggings.tag_id
WHERE tags.name IN ('admin','sales')
GROUP BY users.id
Some RDBMSs are happy with this, but postgres complains:
ERROR: column "users.name" must appear in the GROUP BY
clause or be used in an aggregate function
I believe a more agreeable way to write the query would be:
SELECT users.*, tags_count FROM users INNER JOIN (
SELECT DISTINCT taggable_id, count(*) AS tags_count
FROM taggings INNER JOIN tags
ON tags.id = taggings.tag_id
WHERE tags.name IN ('admin','sales')
GROUP BY taggable_id
) tag_counts
ON users.id = tag_counts.taggable_id
Is there any way to express this using squeel?
I wouldn't know about Squeel, but the error you see could be fixed by upgrading PostgreSQL.
Some RDBMSs are happy with this, but postgres complains:
ERROR: column "users.name" must appear in the GROUP BY clause or be
used in an aggregate function
Starting with PostgreSQL 9.1, once you list a primary key in the GROUP BY you can skip additional columns for this table and still use them in the SELECT list. The release notes for version 9.1 tell us:
Allow non-GROUP BY columns in the query target list when the primary
key is specified in the GROUP BY clause
BTW, your alternative query can be simplified; the additional DISTINCT is redundant:
SELECT o.*, c.my_count
FROM onetable o
JOIN (
SELECT one_id, count(*) AS my_count
FROM anothertable
GROUP BY one_id
) c ON o.id = c.one_id
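The reason DISTINCT adds nothing here is that GROUP BY one_id already yields exactly one row per one_id. A minimal plain-Ruby sketch of the same idea, using made-up rows in place of anothertable:

```ruby
# Made-up rows standing in for anothertable.
rows = [{ one_id: 1 }, { one_id: 2 }, { one_id: 1 }, { one_id: 1 }]

# Equivalent of SELECT one_id, count(*) AS my_count ... GROUP BY one_id.
counts = rows.group_by { |r| r[:one_id] }.map do |one_id, group|
  { one_id: one_id, my_count: group.size }
end

# Equivalent of adding DISTINCT on top: nothing is left to remove.
deduped = counts.uniq

puts counts == deduped
```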