Sequel -- How To Construct This Query? - ruby-on-rails

I have a users table, which has a one-to-many relationship with a user_purchases table via the foreign key user_id. That is, each user can make many purchases (or may have none, in which case he will have no entries in the user_purchases table).
user_purchases has only one other field that is of interest here, which is purchase_date.
I am trying to write a Sequel ORM statement that will return a dataset with the following columns:
user_id
date of the user's SECOND purchase, if it exists
So users who have not made at least 2 purchases will not appear in this dataset. What is the best way to write this Sequel statement?
Please note I am looking for a dataset with ALL users returned who have >= 2 purchases
Thanks!
EDIT FOR CLARITY
Here is a similar statement I wrote to get users and their first purchase date (as opposed to 2nd purchase date, which I am asking for help with in the current post):
DB[:users].join(:user_purchases, :user_id => :id)
.select{[:user_id, min(:purchase_date)]}
.group(:user_id)

You don't seem to be worried about the dates, just the counts so
DB[:user_purchases].group_and_count(:user_id).having { count > 1 }.all
will return a list of user_ids and counts where the count (of purchases) is >= 2. Something like
[{:count=>2, :user_id=>1}, {:count=>7, :user_id=>2}, {:count=>2, :user_id=>3}, ...]
If you want to get the users with that, the easiest way with Sequel is probably to extract just the list of user_ids and feed that back into another query:
DB[:users].where(:id => DB[:user_purchases].group_and_count(:user_id).
having { count > 1 }.all.map{|row| row[:user_id]}).all
Edit:
I felt like there should be a more succinct way and then I saw this answer (from Sequel author Jeremy Evans) to another question using select_group and select_more : https://stackoverflow.com/a/10886982/131226
This should do it without the subselect:
DB[:users].
left_join(:user_purchases, :user_id=>:id).
select_group(:id).
select_more{count(:purchase_date).as(:purchase_count)}.
having { purchase_count > 1 }
It generates this SQL
SELECT `id`, count(`purchase_date`) AS 'purchase_count'
FROM `users` LEFT JOIN `user_purchases`
ON (`user_purchases`.`user_id` = `users`.`id`)
GROUP BY `id` HAVING (`purchase_count` > 1)
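For anyone without a database at hand, the effect of grouping by user, counting, and then filtering on the count can be sketched in plain Ruby. This is not Sequel code: the sample user IDs and the method names purchase_counts and repeat_buyers are made up for illustration.

```ruby
# Hypothetical sample: the user_id of every row in user_purchases.
PURCHASE_USER_IDS = [1, 1, 2, 3, 3, 3].freeze

# Counts purchases per user, like GROUP BY user_id with count(*).
def purchase_counts(user_ids)
  user_ids.each_with_object(Hash.new(0)) { |id, counts| counts[id] += 1 }
end

# Keeps only users with more than one purchase, like HAVING count > 1.
def repeat_buyers(user_ids)
  purchase_counts(user_ids).select { |_id, count| count > 1 }
end

REPEAT_BUYERS = repeat_buyers(PURCHASE_USER_IDS)
```

User 2 has a single purchase and is dropped, which is the same filtering the HAVING clause performs.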

In general, the SQL you need looks like this (a purchase is the user's second one when exactly one earlier purchase exists for the same user):
SELECT u.id, up1.purchase_date FROM users u
LEFT JOIN user_purchases up1 ON u.id = up1.user_id
LEFT JOIN user_purchases up2 ON u.id = up2.user_id AND up2.purchase_date < up1.purchase_date
GROUP BY u.id, up1.purchase_date
HAVING COUNT(up2.purchase_date) = 1;
Try converting that to Sequel if you don't get any better answers.
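What the query computes can be sketched in plain Ruby over in-memory rows. The sample data and the method name second_purchase_dates are assumptions for illustration only, and unlike the SQL above this sketch does nothing special when two purchases share the same date.

```ruby
require 'date'

# Hypothetical in-memory rows standing in for the user_purchases table.
PURCHASES = [
  { user_id: 1, purchase_date: Date.new(2012, 1, 5) },
  { user_id: 1, purchase_date: Date.new(2012, 3, 1) },
  { user_id: 1, purchase_date: Date.new(2012, 2, 1) },
  { user_id: 2, purchase_date: Date.new(2012, 4, 9) },
  { user_id: 3, purchase_date: Date.new(2012, 6, 2) },
  { user_id: 3, purchase_date: Date.new(2012, 5, 20) }
].freeze

# Returns { user_id => date of the second purchase } for users
# with at least two purchases; users with fewer are dropped.
def second_purchase_dates(purchases)
  purchases
    .group_by { |row| row[:user_id] }
    .transform_values { |rows| rows.map { |r| r[:purchase_date] }.sort[1] }
    .compact
end

SECOND_PURCHASES = second_purchase_dates(PURCHASES)
```

Sorting each user's dates and taking index 1 yields nil for single-purchase users, and compact removes them, so only users with two or more purchases remain.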

The date of the user's second purchase would be the second row retrieved if you do an order_by(:purchase_date) as part of your query.
To access that, do a limit(2) to constrain the query to two results, load them with all, and take the second element ([1]), which is nil when the user has fewer than two purchases. Calling [] with an integer directly on a dataset raises an error, because Dataset#[] expects filter conditions. So, if you're not using models and are working with datasets only, and know the user_id you're interested in, your (untested) query would be:
DB[:user_purchases].where(:user_id => user_id).order_by(:purchase_date).limit(2).all[1]
Here's some output from Sequel's console:
DB[:user_purchases].where(:user_id => 1).order_by(:purchase_date).limit(2).sql
=> "SELECT * FROM user_purchases WHERE (user_id = 1) ORDER BY purchase_date LIMIT 2"
Add the appropriate select clause:
.select(:user_id, :purchase_date)
and you should be done:
DB[:user_purchases].select(:user_id, :purchase_date).where(:user_id => 1).order_by(:purchase_date).limit(2).sql
=> "SELECT user_id, purchase_date FROM user_purchases WHERE (user_id = 1) ORDER BY purchase_date LIMIT 2"

Related

How to write sub query in active record?

I have two tables, users and posts, with a has_many association between them. I want to fetch details of both users and posts in a single query. I can manage this with a raw SQL query, but I don't want to use raw SQL in the code (via the execute method), as I think this is a fairly simple thing that can be written using Active Record.
Here is the sql query
SELECT a.id, a.name, a.timestamp, b.id, b.user_id, b.title
FROM users a
INNER JOIN (SELECT id, user_id, title, from, to FROM posts) b on b.user_id = a.id
where id IN ( 1, 2, 3);
I think includes does not help here because I'm dealing with a large amount of data.
Can anyone help me?
If you just want those specific columns and nothing else then this will work
User.joins(:posts)
.where(id: [1,2,3])
.select("users.id, users.name, users.timestamp,
posts.id as post_id, posts.user_id as post_user_id,
posts.title as post_title")
This will return an ActiveRecord::Relation of User objects with virtual attributes for post_id, post_user_id (Not sure why you need this one since you already selected users.id), and post_title.
The query produced will be
SELECT users.id,
users.name,
users.timestamp,
posts.id as post_id,
posts.user_id as post_user_id,
posts.title as post_title
FROM users
INNER JOIN posts on posts.user_id = users.id
where users.id IN ( 1, 2, 3);
Please note you may have multiple User objects, one for each Post, just as the SQL query does.
You can execute your exact query using the string version of joins e.g.
User.joins("INNER JOIN (SELECT id, user_id, title, from, to FROM posts) b on b.user_id = users.id")
.where(id: [1,2,3])
.select("users.id, users.name, users.timestamp,
b.id as post_id, b.user_id as post_user_id,
b.title as post_title")
Additionally to avoid some of the overhead you can use arel instead e.g.
users_table = User.arel_table
posts_table = Post.arel_table
query = users_table.project(Arel.star)
.join(posts_table)
.on(posts_table[:user_id].eq(users_table[:id]))
.where(users_table[:id].in([1,2,3]))
ActiveRecord::Base.connection.exec_query(query.to_sql)
This will return an ActiveRecord::Result with 2 useful methods columns (the columns selected) and rows. You can convert this to a Hash(#to_hash) but note that any columns with duplicate names (id for instance) will overwrite one another.
You could fix this by specifying the columns you want selected in the project portion, e.g. your current query would be:
query = users_table.project(
users_table[:id],
users_table[:name],
users_table[:timestamp],
posts_table[:id].as('post_id'),
posts_table[:user_id].as('post_user_id'),
posts_table[:title].as('post_title')
).join(posts_table)
.on(posts_table[:user_id].eq(users_table[:id]))
.where(users_table[:id].in([1,2,3]))
ActiveRecord::Base.connection.exec_query(query.to_sql).to_hash
Since none of the names collide now, it can be structured into a nice Hash where the keys are the column names and the values are the row values for that record.
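The column-name collision described above can be reproduced in plain Ruby. This is only an approximation of how a result row becomes a Hash, not ActiveRecord's actual implementation; the method row_to_hash and the sample values are hypothetical.

```ruby
# Builds a column-name => value hash for one row by pairing names with values.
def row_to_hash(columns, row)
  columns.zip(row).to_h
end

# Hypothetical columns and values for a users/posts join selected with "*".
STAR_COLUMNS = %w[id name id title].freeze
ROW = [1, 'alice', 42, 'First post'].freeze

# Same values, but with the post columns aliased as in the query above.
ALIASED_COLUMNS = %w[id name post_id post_title].freeze

COLLIDED = row_to_hash(STAR_COLUMNS, ROW)    # the second "id" replaces the first
DISTINCT = row_to_hash(ALIASED_COLUMNS, ROW) # every value keeps its own key
```

With the duplicated id column only three keys survive and the user's id is lost, while the aliased version keeps all four values.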
users = User.joins(:posts).includes(:posts).where(id: [1, 2, 3])
This will give you all the users with their posts.
Then you can do whatever you want with them. For example, to access the posts of the first retrieved user:
first_user_posts = users.first.posts # no extra query for the posts: includes has already loaded them
We use joins to get an INNER JOIN statement in the SQL.
We use includes to load all the posts into memory.
I have two tables users and posts and they have association of
has_many. I want to fetch details of both users and posts in a single
query.
This can be done with includes, like
users = User.includes(:posts).where({posts: {user_id: [1,2,3]}})
The other options are eager_load and preload, which you can use as your requirements dictate. For more, see https://blog.arkency.com/2013/12/rails4-preloading/

How can I apply LIMIT/ORDER conditions to an ActiveRecord Join?

Currently I have a Conversation has many Messages relationship, and a User has many Conversations relationship.
I would like to create an ActiveRecord Query to get The Last Message of each conversation that a user has.
Let's say I have the conversations ids in an array...
ids = [24, 22, 23]
This query:
Message.where(conversation_id: ids).joins(:conversation).order(created_at: :desc)
... is correct in terms that it returns ALL the Messages across all the user's conversations.
Using the same query above, If I map an array of the conversation_ids:
Message.where(conversation_id: ids).joins(:conversation).order(created_at: :desc).map(&:conversation_id)
I get an array like this: [24, 24, 22, 22, 23, 22] that tells me there are 3 messages in conversation with conversation_id=22, 2 messages with conversation_id=24, 1 with conversation_id=23.
This is good, but my question now is: how can I create an ActiveRecord query to get just one message from each conversation (the last one that was created)?
I assume I have to use the limit()/order() methods, but I have no idea how to do it, it's a little too advanced for me.
Thanks for all your help in advance.
joins can accept a string, so you can specify any join you want as plain SQL. See the documentation.
Example:
User.joins("LEFT JOIN bookmarks ON bookmarks.bookmarkable_type = 'Post' AND bookmarks.user_id = users.id")
# SELECT "users".* FROM "users" LEFT JOIN bookmarks ON bookmarks.bookmarkable_type = 'Post' AND bookmarks.user_id = users.id
As for the problem of joining to the latest record, that's another question in its own right, and it has an answer on Stack Overflow here.
Example:
SELECT c.*, p.*
FROM customer c INNER JOIN
(
SELECT customer_id,
MAX(date) MaxDate
FROM purchase
GROUP BY customer_id
) MaxDates ON c.id = MaxDates.customer_id INNER JOIN
purchase p ON MaxDates.customer_id = p.customer_id
AND MaxDates.MaxDate = p.date
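The same "latest row per group" idea, applied to messages and conversations, can be sketched in plain Ruby. MessageRow is a hypothetical stand-in for the Message model and the sample rows are invented. On a real table the SQL join above is the better tool, since it avoids loading every message into memory.

```ruby
# Hypothetical stand-in for a Message record.
MessageRow = Struct.new(:id, :conversation_id, :created_at)

MESSAGES = [
  MessageRow.new(1, 24, Time.utc(2015, 1, 1, 10, 0)),
  MessageRow.new(2, 22, Time.utc(2015, 1, 1, 11, 0)),
  MessageRow.new(3, 24, Time.utc(2015, 1, 2, 9, 0)),
  MessageRow.new(4, 23, Time.utc(2015, 1, 2, 12, 0)),
  MessageRow.new(5, 22, Time.utc(2015, 1, 3, 8, 0)),
  MessageRow.new(6, 22, Time.utc(2015, 1, 2, 8, 0))
].freeze

# Groups messages by conversation and keeps the newest one in each group,
# mirroring the MAX(date) ... GROUP BY subquery.
def latest_per_conversation(messages)
  messages
    .group_by(&:conversation_id)
    .map { |_conversation_id, group| group.max_by(&:created_at) }
end

LATEST = latest_per_conversation(MESSAGES)
```

Each conversation contributes exactly one message, the one with the greatest created_at.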

How to UNION tables and make results accessible in a Ruby view

I'm quite new to RoR and creating a student project for a course I'm taking. I'm wanting to construct a type of query we didn't cover in the course and which I know I could do in a snap in .NET and SQL. I'm having a heck of a time though getting it implemented the Ruby way.
What I'd like to do: Display a list on a user's page of all "posts" by that user's friends.
"Posts" are found in both a questions table and in a blurbs table that users contribute to. I'd like to UNION these two into a single recordset to sort by updated_at DESC.
The table column names are not the same however, and this is my sticking point since other successful answers I've seen have hinged on column names being the same between the two.
In SQL I'd write something like (emphasis on like):
SELECT b.Blurb AS 'UserPost', b.updated_at, u.username as 'Author'
FROM Blurbs b
INNER JOIN Users u ON b.User_ID = u.ID
WHERE u.ID IN
(SELECT f.friend_id FROM Friendships f WHERE f.User_ID = [current user])
ORDER BY b.updated_at DESC
UNION
SELECT q.Question, q.updated_at, u.username
FROM Questions q
INNER JOIN Users u ON q.User_ID = u.ID
WHERE u.ID IN
(SELECT f.friend_id FROM Friendships f WHERE f.User_ID = [current user])
ORDER BY b.updated_at DESC
The User model's (applicable) relationships are:
has_many :friendships
has_many :friends, through: :friendships
has_many :questions
has_many :blurbs
And the Question and Blurb models both have belongs_to :user
In the view I'd like to display the contents of the 'UserPost' column and the 'Author'. I'm sure this is possible, I'm just too new still to ActiveRecord and how statements are formed. Happy to have some input or review any relevant links that speak to this specifically!
Final Solution
Hopefully this will assist others in the future with Ruby UNION questions. Thanks to @Plamena's input the final implementation ended up as:
def friend_posts
  sql = "...the UNION statement seen above..."
  ActiveRecord::Base.connection.select_all(ActiveRecord::Base.send("sanitize_sql_array", [sql, self.id, self.id]))
end
Currently Active Record lacks union support. You can use SQL:
sql = <<-SQL
  -- your SQL query goes here
  SELECT b.created_at ...
  UNION (
    SELECT q.created_at
    ....
  )
SQL
posts = ActiveRecord::Base.connection.select_all(sql)
Then you can iterate the result:
posts.each do |post|
# post is a hash
p post['created_at']
end
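The reason differing column names do not block a UNION is that each side is mapped onto the same output columns first. A plain-Ruby sketch of that mapping, with invented sample rows and a hypothetical method name combined_posts, might look like this. It keeps duplicate rows, so it behaves like UNION ALL rather than UNION.

```ruby
# Hypothetical rows as they might come back from the blurbs and questions tables.
BLURB_ROWS = [
  { blurb: 'Off to the lake', updated_at: Time.utc(2014, 5, 1), username: 'ann' },
  { blurb: 'Back again',      updated_at: Time.utc(2014, 5, 4), username: 'bob' }
].freeze

QUESTION_ROWS = [
  { question: 'Which ORM do you use?', updated_at: Time.utc(2014, 5, 3), username: 'ann' }
].freeze

# Maps both row shapes onto shared keys, then sorts newest first.
def combined_posts(blurbs, questions)
  posts =
    blurbs.map { |b| { 'UserPost' => b[:blurb], 'updated_at' => b[:updated_at], 'Author' => b[:username] } } +
    questions.map { |q| { 'UserPost' => q[:question], 'updated_at' => q[:updated_at], 'Author' => q[:username] } }
  posts.sort_by { |post| post['updated_at'] }.reverse
end

COMBINED = combined_posts(BLURB_ROWS, QUESTION_ROWS)
```

A view can then iterate over the combined list and read 'UserPost' and 'Author' from each hash, regardless of which table a row came from.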
Your best way to do this is to just use the power of Rails
If you want all of something belonging to a user's friend:
current_user.friends.find(id_of_friend).questions
This would get all of the questions from a certain friend.
Now, it seems that you have writings in multiple places (this is hard to visualise without your providing a model of how writings is connected to everywhere else). Can you provide this?
@blurbs = Blurb.includes(:user)
@blurbs.each do |blurb|
  p blurb.blurb, blurb.user.username
end

How to get records based on an offset around a particular record?

I'm building a search UI which searches for comments. When a user clicks on a search result (comment), I want to show the surrounding comments.
My model:
Group (id, title) - A Group has many comments
Comment (id, group_id, content)
For example:
When a user clicks on a comment with comment.id equal to 26. I would first find all the comments for that group:
comment = Comment.find(26)
comments = Comment.where(:group_id => comment.group_id)
I now have all of the group's comments. What I then want to do is show comment.id 26, with a max of 10 comments before and 10 comments after.
How can I modify comments to show that offset?
Sounds simple, but it's tricky to get the best performance for this. In any case, you must let the database do the work. That will be faster by an order of magnitude than fetching all rows and filtering and sorting them on the client side.
If by "before" and "after" you mean smaller / bigger comment.id, and we further assume that there can be gaps in the id space, this one query should do all:
WITH x AS (SELECT id, group_id FROM comment WHERE id = 26) -- enter value once
(
SELECT *
FROM x
JOIN comment c USING (group_id)
WHERE c.id > x.id
ORDER BY c.id
LIMIT 10
)
UNION ALL
(
SELECT *
FROM x
JOIN comment c USING (group_id)
WHERE c.id < x.id
ORDER BY c.id DESC
LIMIT 10
)
I'll leave paraphrasing that in Ruby syntax to you, that's not my area of expertise.
Returns 10 earlier comments and 10 later ones. Fewer if fewer exist. Use <= in the 2nd leg of the UNION ALL query to include the selected comment itself.
If you need the rows sorted, add another query level on top with ORDER BY.
Should be very fast in combination with these two indexes for the table comment:
one on (id) - probably covered automatically by the primary key.
one on (group_id, id)
For read-only data you could create a materialized view with a gap-less row-number that would make this even faster.
More explanation about parentheses, indexes, and performance in this closely related answer.
Something like:
comment = Comment.find(26)
before_comments = Comment.
where('created_at <= ?', comment.created_at).
where('id != ?', comment.id).
where(group_id: comment.group_id).
order('created_at DESC').limit(10)
after_comments = Comment.
where('created_at >= ?', comment.created_at).
where('id != ?', comment.id).
where(group_id: comment.group_id).
order('created_at ASC').limit(10)
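The windowing logic itself (up to 10 neighbours on each side, fewer near the ends, gaps in the id space allowed) can be checked with a plain-Ruby sketch. The method name surrounding_ids and the sample ids are hypothetical, and it orders by id rather than created_at. For real data the database queries above remain the better choice, since this version needs every id in memory.

```ruby
# Returns the target id plus up to `window` ids on each side, in ascending order.
# Returns an empty array when the target id is not in the list.
def surrounding_ids(ids, target_id, window = 10)
  sorted = ids.sort
  index = sorted.index(target_id)
  return [] if index.nil?

  first = [index - window, 0].max       # clamp at the start of the list
  sorted[first..(index + window)]       # a range past the end is clipped by Array#[]
end

# Hypothetical comment ids for one group, with gaps in the id space: 1, 3, 5, ..., 59.
GROUP_COMMENT_IDS = (1..60).step(2).to_a.freeze
```

For a comment in the middle of the group this yields 21 ids, and for one near the start it yields fewer, matching the "fewer if fewer exist" behaviour of the SQL version.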

How do I get Rails ActiveRecord to generate optimized SQL?

Let's say that I have 4 models which are related in the following ways:
Schedule has foreign key to Project
Schedule has foreign key to User
Project has foreign key to Client
In my Schedule#index view I want the most optimized SQL so that I can display links to the Schedule's associated Project, Client, and User. So, I should not pull all of the columns for the Project, Client, and User; only their IDs and Name.
If I were to manually write the SQL it might look like this:
select
s.id,
s.schedule_name,
s.schedule_type,
s.project_id,
p.name project_name,
p.client_id client_id,
c.name client_name,
s.user_id,
u.login user_login,
s.created_at,
s.updated_at,
s.data_count
from
Users u inner join
Clients c inner join
Schedules s inner join
Projects p
on p.id = s.project_id
on c.id = p.client_id
on u.id = s.user_id
order by
s.created_at desc
My question is: What would the ActiveRecord code look like to get Rails 3 to generate that SQL? For example, something like:
@schedules = Schedule. # ?
I already have the associations setup in the models (i.e. has_many / belongs_to).
I think this will get you (or at least help you get) what you're looking for:
Schedule.select("schedules.id, schedules.schedule_name, projects.name as project_name").joins(:user, :project=>:client).order("schedules.created_at DESC")
should yield:
SELECT schedules.id, schedules.schedule_name, projects.name as project_name FROM `schedules` INNER JOIN `users` ON `users`.`id` = `schedules`.`user_id` INNER JOIN `projects` ON `projects`.`id` = `schedules`.`project_id` INNER JOIN `clients` ON `clients`.`id` = `projects`.`client_id` ORDER BY schedules.created_at DESC
The main problem I see in your approach is that you're looking for schedule objects but basing your initial "FROM" clause on "User" and your associations given are also on Schedule, so I built this solution based on the plain assumption that you want schedules!
I also didn't include all of your selects to save some typing, but you get the idea. You will simply have to add each one qualified with its full table name.

Resources