I'm using Rails 5 and I'm trying to fetch all Users ordered by the number of Books they have read.
User has_many :books
I have tried
users = User.joins(:books).where('books.read = ?', 1).group('users.id').order("count(books.user_id)")
But this removes all the users who haven't read any books instead of showing them last.
The column books.read can have several values. Users can have many or 0 books.
I want all users, with or without books, ordered by the number of their books where books.read = 1.
How can I achieve this?
You probably need a LEFT JOIN with the condition in the ON clause instead of in a WHERE clause:
User.
joins('LEFT JOIN books on books.user_id = users.id AND books.read = 1').
group(:id).
order('COUNT(books.id) DESC')
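To see why moving the condition out of WHERE keeps the users without read books, here is a minimal in-memory sketch in plain Ruby. It does not touch a database; the sample users and books are made up for illustration, and the loops only mimic what the two kinds of join do.

```ruby
# Hypothetical sample data standing in for the users and books tables.
users = [
  { id: 1, name: "Ann" },
  { id: 2, name: "Bob" },
  { id: 3, name: "Cid" }
]
books = [
  { id: 10, user_id: 1, read: 1 },
  { id: 11, user_id: 1, read: 1 },
  { id: 12, user_id: 2, read: 1 },
  { id: 13, user_id: 2, read: 0 },
  { id: 14, user_id: 3, read: 2 }
]

# INNER JOIN + WHERE books.read = 1: users with no matching book disappear.
inner_join_names = users
  .select { |u| books.any? { |b| b[:user_id] == u[:id] && b[:read] == 1 } }
  .map { |u| u[:name] }

# LEFT JOIN ... ON books.user_id = users.id AND books.read = 1:
# every user is kept, and COUNT(books.id) is 0 when nothing matches.
read_counts = users.map do |u|
  [u[:name], books.count { |b| b[:user_id] == u[:id] && b[:read] == 1 }]
end

# ORDER BY COUNT(books.id) DESC
ordered_names = read_counts.sort_by { |_name, count| -count }.map(&:first)
```

In this sample, Cid has no book with read = 1, so he is missing from inner_join_names but appears last in ordered_names.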
In the following book club example with associations:
class User
has_and_belongs_to_many :clubs
has_and_belongs_to_many :books
end
class Club
has_and_belongs_to_many :users
has_and_belongs_to_many :books
end
class Book
has_and_belongs_to_many :users
has_and_belongs_to_many :clubs
end
given a specific club record:
club = Club.find(params[:id])
how can I find all the users in the club who have all the books in a given array of books?
club.users.where_has_all_books(books)
In PostgreSQL it can be done with a single query. (Maybe in MySQL too, I'm just not sure.)
So, some basic assumptions first. 3 tables: clubs, users and books, every table has id as a primary key. 3 join tables, books_clubs, books_users, clubs_users, each table contains pairs of ids (for books_clubs it will be [book_id, club_id]), and those pairs are unique within that table. Quite reasonable conditions IMO.
Building a query:
First, let's get ids of books from given club:
SELECT book_id
FROM books_clubs
WHERE club_id = 1
ORDER BY book_id
Then get users from given club, and group them by user.id:
SELECT CU.user_id
FROM clubs_users CU
JOIN users U ON U.id = CU.user_id
JOIN books_users BU ON BU.user_id = CU.user_id
WHERE CU.club_id = 1
GROUP BY CU.user_id
Join these two queries by adding a HAVING clause to the 2nd query:
HAVING array_agg(BU.book_id ORDER BY BU.book_id) @> ARRAY(##1##)
where ##1## is the 1st query.
What's going on here: the array_agg function on the left-hand side creates a sorted list (of array type) of book_ids. These are the user's books. ARRAY(##1##) on the right-hand side returns the sorted list of the club's books. The @> operator checks whether the 1st array contains all elements of the 2nd (i.e. whether the user has all the books of the club).
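The containment check in the HAVING clause can be pictured with a small plain-Ruby sketch. The ids below are invented, and the contains_all lambda is only a stand-in for PostgreSQL's array containment test, not something Rails or PostgreSQL provides.

```ruby
# What the first query would return for the club (hypothetical ids).
club_book_ids = [1, 2, 3]

# What array_agg(BU.book_id) would produce per user_id (hypothetical data).
user_book_ids = {
  10 => [1, 2, 3, 7], # has every club book plus an extra one
  11 => [1, 3],       # is missing book 2
  12 => [3, 2, 1]     # has every club book, in a different order
}

# Array containment: true when every element of `needed` is also in `owned`.
contains_all = ->(owned, needed) { (needed - owned).empty? }

matching_user_ids = user_book_ids
  .select { |_user_id, owned| contains_all.call(owned, club_book_ids) }
  .keys
```

User 12 matches even though the ids are not sorted, which is the same behaviour the containment operator has on unordered arrays.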
Since 1st query needs to be performed only once, it can be moved to WITH clause.
Your complete query:
WITH club_book_ids AS (
SELECT book_id
FROM books_clubs
WHERE club_id = :club_id
ORDER BY book_id
)
SELECT CU.user_id
FROM clubs_users CU
JOIN users U ON U.id = CU.user_id
JOIN books_users BU ON BU.user_id = CU.user_id
WHERE CU.club_id = :club_id
GROUP BY CU.user_id
HAVING array_agg(BU.book_id ORDER BY BU.book_id) @> ARRAY(SELECT * FROM club_book_ids);
It can be verified in this sandbox: https://www.db-fiddle.com/f/cdPtRfT2uSGp4DSDywST92/5
Wrap it in find_by_sql and that's it.
Some notes:
Ordering by book_id is not necessary; the @> operator works with unordered arrays too. I just have a suspicion that comparing ordered arrays is faster.
JOIN users U ON U.id = CU.user_id in the 2nd query is only necessary for fetching user properties; if you only need user ids, it can be removed.
It appears to work by grouping and counting.
club.users.joins(:books).where(books: { id: club.books.pluck(:id) }).group('users.id').having('count(*) = ?', club.books.count)
If anyone knows how to run the query without intermediate queries that would be great and I will accept the answer.
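The grouping-and-counting idea can be checked on paper with a plain-Ruby sketch. The pairs below are a made-up stand-in for the books_users join table, and the sketch assumes each [user_id, book_id] pair is unique, which the count comparison depends on.

```ruby
# Hypothetical ids of the club's books (club.books.pluck(:id)).
club_book_ids = [1, 2, 3]

# Hypothetical rows of the books_users join table: [user_id, book_id].
books_users = [[10, 1], [10, 2], [10, 3], [10, 7], [11, 1], [11, 3]]

# WHERE books.id IN (...) then GROUP BY users.id with COUNT(*).
owned_counts = books_users
  .select { |_user_id, book_id| club_book_ids.include?(book_id) }
  .group_by { |user_id, _book_id| user_id }
  .transform_values(&:size)

# HAVING count(*) = <number of club books>
matching_user_ids = owned_counts
  .select { |_user_id, count| count == club_book_ids.size }
  .keys
```

If the join table allowed duplicate pairs, a user could reach the required count without owning every book, so the uniqueness assumption matters here.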
This looks like a situation where you'd make two queries: one to get all the ids you need, the other to select with a WHERE IN.
I have two tables, users and posts, linked by a has_many association. I want to fetch details of both users and posts in a single query. I can manage the SQL query, but I don't want to use a raw query in the code (using the execute method), as I think it is a fairly simple thing that can be written using Active Record.
Here is the sql query
SELECT a.id, a.name, a.timestamp, b.id, b.user_id, b.title
FROM users a
INNER JOIN (SELECT id, user_id, title, from, to FROM posts) b on b.user_id = a.id
WHERE a.id IN (1, 2, 3);
I think includes does not help here because I'm dealing with large data.
Can anyone help me?
If you just want those specific columns and nothing else, then this will work:
User.joins(:posts)
.where(id: [1,2,3])
.select("users.id, users.name, users.timestamp,
posts.id as post_id, posts.user_id as post_user_id,
posts.title as post_title")
This will return an ActiveRecord::Relation of User objects with virtual attributes for post_id, post_user_id (Not sure why you need this one since you already selected users.id), and post_title.
The query produced will be
SELECT users.id,
users.name,
users.timestamp,
posts.id as post_id,
posts.user_id as post_user_id,
posts.title as post_title
FROM users
INNER JOIN posts on posts.user_id = users.id
where users.id IN ( 1, 2, 3);
Please note you may have multiple User objects, one for each Post, just as the SQL query does.
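A small plain-Ruby sketch of that last point, using invented rows rather than a real database: an inner join yields one result row per matching post, so a user with two posts shows up twice.

```ruby
# Hypothetical rows standing in for the users and posts tables.
users = [
  { id: 1, name: "Ann" },
  { id: 2, name: "Bob" }
]
posts = [
  { id: 100, user_id: 1, title: "First" },
  { id: 101, user_id: 1, title: "Second" },
  { id: 102, user_id: 2, title: "Third" }
]

# INNER JOIN posts ON posts.user_id = users.id, with aliased post columns.
rows = users.flat_map do |user|
  posts
    .select { |post| post[:user_id] == user[:id] }
    .map do |post|
      { id: user[:id], name: user[:name],
        post_id: post[:id], post_title: post[:title] }
    end
end

user_ids_in_result = rows.map { |row| row[:id] }
```

Here Ann (id 1) appears in two rows, one per post, which mirrors getting two User objects back from the relation.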
You can execute your exact query using the string version of joins e.g.
User.joins("INNER JOIN (SELECT id, user_id, title, from, to FROM posts) b on b.user_id = users.id")
.where(id: [1,2,3])
.select("users.id, users.name, users.timestamp,
b.id as post_id, b.user_id as post_user_id,
b.title as post_title")
Additionally, to avoid some of the overhead, you can use Arel instead, e.g.
users_table = User.arel_table
posts_table = Post.arel_table
query = users_table.project(Arel.star)
.join(posts_table)
.on(posts_table[:user_id].eq(users_table[:id]))
.where(users_table[:id].in([1,2,3]))
ActiveRecord::Base.connection.exec_query(query.to_sql)
This will return an ActiveRecord::Result with 2 useful methods columns (the columns selected) and rows. You can convert this to a Hash(#to_hash) but note that any columns with duplicate names (id for instance) will overwrite one another.
You could fix this by specifying the columns you want selected in the project portion, e.g. your current query would be:
query = users_table.project(
users_table[:id],
users_table[:name],
users_table[:timestamp],
posts_table[:id].as('post_id'),
posts_table[:user_id].as('post_user_id'),
posts_table[:title].as('post_title')
).join(posts_table)
.on(posts_table[:user_id].eq(users_table[:id]))
.where(users_table[:id].in([1,2,3]))
ActiveRecord::Base.connection.exec_query(query.to_sql).to_hash
Since none of the names collide now, each row can be structured into a nice Hash where the keys are the column names and the values are the row values for that record.
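The overwrite problem and the effect of aliasing can be reproduced without a database. This is only a sketch: the column lists and the row are made up, and zipping them into a Hash imitates what converting a result row to a Hash does with its column names.

```ruby
# A hypothetical result row for SELECT * over users joined with posts.
columns = %w[id name id title]
row     = [1, "Ann", 100, "Hello"]

# Duplicate keys collapse: the later "id" (the post id) wins.
colliding = columns.zip(row).to_h

# With aliases such as post_id and post_title, every value survives.
aliased_columns = %w[id name post_id post_title]
aliased = aliased_columns.zip(row).to_h
```

In the colliding Hash the user's id is lost, while the aliased version keeps all four values.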
users = User.joins(:posts).includes(:posts).where(id: [1, 2, 3])
This will give you all the users with their posts.
You can then do whatever you want with them. For example, to access the posts of the first retrieved user:
first_user_posts = users.first.posts # this will not make additional DB queries, as you used includes and the data is already loaded
We use joins to get an INNER JOIN statement in the SQL.
We use includes to load all the posts into memory.
I have two tables users and posts and they have association of
has_many. I want to fetch details of both users and posts in a single
query.
This can be done with includes, like
users = User.includes(:posts).where({posts: {user_id: [1,2,3]}})
Other options are eager_load and preload, which you can use depending on your requirements; for more, see https://blog.arkency.com/2013/12/rails4-preloading/
I have an object rides, and ride belongs_to company.
I get a list of rides
@rides = Ride.where(...)
What I need now is to store all companies of those rides in @companies, where I want to have every company only once, even if two rides have the same company.
You can get all unique companies of all rides as below:
@rides = Ride.includes(:company).where(...)
@companies = @rides.map(&:company).uniq
Note: includes will load all companies in single query which are associated to resulting rides (prevents N+1 query problem).
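What map(&:company).uniq does can be seen in isolation with plain Ruby objects. The Struct definitions and sample data below are hypothetical stand-ins for the Ride and Company models, not Active Record classes.

```ruby
# Hypothetical stand-ins for the Company and Ride models.
SketchCompany = Struct.new(:id, :name)
SketchRide    = Struct.new(:id, :company)

acme   = SketchCompany.new(1, "Acme")
globex = SketchCompany.new(2, "Globex")

rides = [
  SketchRide.new(1, acme),
  SketchRide.new(2, globex),
  SketchRide.new(3, acme) # same company as ride 1
]

# Collect each ride's company, then drop the duplicates.
companies = rides.map(&:company).uniq
```

Two rides share Acme, yet the resulting list contains it only once.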
PostgreSQL is very efficient at doing that kind of work, much more so than Ruby.
You could write:
@rides = Ride.includes(:company).where(foo: "bar")
@companies = @rides.distinct.pluck('companies.name')
Which will result in a SQL query along these lines:
SELECT DISTINCT companies.name FROM "rides" LEFT OUTER JOIN "companies" ON "companies"."id" = "rides"."company_id" WHERE "rides"."foo" = 'bar'
In Rails, is there a way to fetch transitive models? We have the following model structure.
A customer has many purchases and a purchase has many orders. There is no direct relation between the customer and order models; they can only be linked through the purchase model. Now I want to fetch all orders belonging to a customer. Is there a way of achieving this with a single query? Our current models look something like this:
Customer
- customer_id
Purchase
- purchase_id
- customer_id
Order
- order_id
- purchase_id
- status
My use case is, given a customer object, to list all orders of that customer which are in a specific state (e.g. status = 'Complete').
Raw SQL would look something like
SELECT purchase_id, order_id FROM Customer c INNER JOIN Purchase p ON p.customer_id = c.customer_id INNER JOIN Order o ON o.purchase_id = p.purchase_id WHERE o.status = 'Complete';
You can do it like this:
Order.select('purchases.id AS purchase_id, orders.id AS order_id').joins(purchase: :customer).where('orders.status = ?', 'Complete')
I hope this helps.
I am looking for help creating an advanced query in Rails/Active Record (or SQL using Postgres) for a contest.
I have a contest table, a users table and an activities table. Contests have many users (participants) and users have many activities. Each activity has points, a 'trackable_type', and other attributes.
What I want selected:
- first_name from the users table
- last_name from the users table
- a sum of the points for each user as total_contest_points
- a count of activities that have the type 'Course' for each user as total_contest_courses
- the results returned ranked by total_contest_points and then total_contest_courses
I have taken a look at the postgres_ext gem and tried writing sql, but I can't seem to get the ranking to work. Here's what I have so far:
time_range = start_time..end_time
contestants = participants.joins(:activities).where('activities.created_at' => time_range).select("
users.id,
users.first_name,
users.last_name,
sum(activities.points) as total_contest_points,
sum(
case
when
activities.trackable_type='Course'
then
1
else
0
end
) as total_contest_courses,
rank() OVER (ORDER BY total_contest_points DESC) as contest_rank
").group('users.id').limit(limit)
Thanks for your help and suggestions.
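For reference, the intended result can be spelled out in plain Ruby on a few made-up activities. This is only a sketch of the expected output, not the SQL itself; the data and the sort_key lambda are assumptions. As far as I know, PostgreSQL does not let a window function's ORDER BY refer to an alias defined in the same SELECT list, so repeating the aggregate inside OVER (...) may be what the query above is missing.

```ruby
# Hypothetical activities inside the contest's time range.
activities = [
  { user_id: 1, points: 10, trackable_type: "Course" },
  { user_id: 1, points: 5,  trackable_type: "Quiz" },
  { user_id: 2, points: 15, trackable_type: "Course" },
  { user_id: 2, points: 0,  trackable_type: "Course" },
  { user_id: 3, points: 7,  trackable_type: "Quiz" }
]

# GROUP BY users.id with the two aggregates from the select list.
totals = activities.group_by { |a| a[:user_id] }.map do |user_id, rows|
  { user_id: user_id,
    total_contest_points: rows.sum { |a| a[:points] },
    total_contest_courses: rows.count { |a| a[:trackable_type] == "Course" } }
end

# Order by points, then by courses, both descending.
sort_key = ->(t) { [-t[:total_contest_points], -t[:total_contest_courses]] }

# rank() semantics: 1 + the number of rows that sort strictly ahead.
ranked = totals.sort_by(&sort_key).map do |t|
  ahead = totals.count { |o| (sort_key.call(o) <=> sort_key.call(t)) < 0 }
  t.merge(contest_rank: ahead + 1)
end
```

Users 1 and 2 tie on points (15 each), so the course count breaks the tie and user 2 ranks first.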