How to get records based on an offset around a particular record? - ruby-on-rails

I'm building a search UI which searches for comments. When a user clicks on a search result (comment), I want to show the surrounding comments.
My model:
Group (id, title) - A Group has many comments
Comment (id, group_id, content)
For example:
When a user clicks on a comment with comment.id equal to 26. I would first find all the comments for that group:
comment = Comment.find(26)
comments = Comment.where(:group_id => comment.group_id)
I now have all of the group's comments. What I then want to do is show comment.id 26, with a max of 10 comments before and 10 comments after.
How can I modify comments to show that offset?

Sounds simple, but it's tricky to get the best performance for this. In any case, you must let the database do the work. That will be faster by an order of magnitude than fetching all rows and filtering / sorting on the client side.
If by "before" and "after" you mean smaller / bigger comment.id, and we further assume that there can be gaps in the id space, this one query should do all:
WITH x AS (SELECT id, group_id FROM comment WHERE id = 26) -- enter value once
(
SELECT *
FROM x
JOIN comment c USING (group_id)
WHERE c.id > x.id
ORDER BY c.id
LIMIT 10
)
UNION ALL
(
SELECT *
FROM x
JOIN comment c USING (group_id)
WHERE c.id < x.id
ORDER BY c.id DESC
LIMIT 10
)
I'll leave paraphrasing that in Ruby syntax to you; that's not my area of expertise.
Returns 10 earlier comments and 10 later ones. Fewer if fewer exist. Use <= in the 2nd leg of the UNION ALL query to include the selected comment itself (and raise that leg's LIMIT to 11 if you still want 10 earlier comments).
If you need the rows sorted, add another query level on top with ORDER BY.
Should be very fast in combination with these two indexes for the table comment:
one on (id) - probably covered automatically by the primary key.
one on (group_id, id)
For read-only data you could create a materialized view with a gap-less row-number that would make this even faster.
More explanation about parentheses, indexes, and performance in this closely related answer.
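As a starting point for the Ruby side, the query above could be wrapped in a small helper that builds the SQL string. This is only a sketch under assumptions: the method name surrounding_comments_sql, the window parameter, and the table name comments (the Rails default; the SQL above uses comment) are not part of the original answer. It uses <= in the second leg to include the selected comment, raises that leg's limit by one to compensate, and sorts everything in an outer query.

```ruby
# Builds the "N before / N after" query for one comment id.
# Assumes a table named `comments` with integer columns `id` and `group_id`.
def surrounding_comments_sql(comment_id, window = 10)
  id = Integer(comment_id) # raises on non-numeric input, so interpolation is safe
  n  = Integer(window)

  <<~SQL
    WITH x AS (SELECT id, group_id FROM comments WHERE id = #{id})
    SELECT *
    FROM (
      (
        SELECT c.*
        FROM x
        JOIN comments c USING (group_id)
        WHERE c.id > x.id
        ORDER BY c.id
        LIMIT #{n}
      )
      UNION ALL
      (
        SELECT c.*
        FROM x
        JOIN comments c USING (group_id)
        WHERE c.id <= x.id
        ORDER BY c.id DESC
        LIMIT #{n + 1}
      )
    ) sub
    ORDER BY id
  SQL
end
```

In a Rails app the resulting string could then be handed to something like Comment.find_by_sql(surrounding_comments_sql(26)).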

Something like:
comment = Comment.find(26)
before_comments = Comment.
where('created_at <= ?', comment.created_at).
where('id != ?', comment.id).
where(group_id: comment.group_id).
order('created_at DESC').limit(10)
after_comments = Comment.
where('created_at >= ?', comment.created_at).
where('id != ?', comment.id).
where(group_id: comment.group_id).
order('created_at ASC').limit(10)
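The sort direction matters here: the limit keeps the first 10 rows in sort order, so the rows after the selected comment have to be sorted ascending to get the nearest ones, while the rows before it have to be sorted descending. A minimal in-memory sketch of that effect, using a hypothetical Row struct and integer timestamps in place of real records:

```ruby
Row = Struct.new(:id, :created_at)

rows   = (1..50).map { |i| Row.new(i, i) } # created_at grows with id
target = rows.find { |r| r.id == 26 }

later   = rows.select { |r| r.created_at >= target.created_at && r.id != target.id }
earlier = rows.select { |r| r.created_at <= target.created_at && r.id != target.id }

# Ascending order + limit keeps the 10 comments right after the target.
nearest_after = later.sort_by(&:created_at).first(10)

# Descending order + limit would instead keep the 10 newest comments overall.
newest = later.sort_by { |r| -r.created_at }.first(10)

# For the "before" side, descending order is what keeps the nearest rows.
nearest_before = earlier.sort_by { |r| -r.created_at }.first(10)
```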

Related

How to get the top 5 per enum of a model in rails?

Let's say I have a model named post, which has an enum named post_type which can either be
admin, public or user
#app/models/post.rb
class Post < ApplicationRecord
enum post_type: [ :admin, :public, :user ]
end
How can I select 5 last created posts from each category?
I can't think of any other solution than this:
PER_GROUP = 5
admin_posts = Post.admin.order(created_at: :desc).limit(PER_GROUP)
user_posts = Post.user.order(created_at: :desc).limit(PER_GROUP)
public_posts = Post.public.order(created_at: :desc).limit(PER_GROUP)
Is there any way I could fetch all the rows in the required manner from just a single query to the database.
STACK
RAILS : 6
PostgreSQL: 9.4
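I am not sure how to translate this into Rails, but it is a straightforward Postgres query. Use the row_number window function in a sub-select, then keep only rows with row_number less than or equal to 5 in the outer select. Note that without an ORDER BY inside the OVER clause the numbering within each group is arbitrary; to get the most recent rows, order the window by the creation timestamp descending.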
select *
from (select post_txt
, posted_type
, row_number() over (partition by posted_type) rn
from enum_table
) pt
where rn <= 5
order by posted_type;
One thing to look out for is the sorting on an enum. Doing so gives results in order of the definition, not a "natural order" (alphanumeric in this case). See example here.
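To make the row_number logic concrete, here is a rough in-memory equivalent in plain Ruby. It only illustrates what the window function computes and is not a replacement for the query; the Post struct, the integer timestamps, and the sample data are made up, and the rows are numbered by created_at descending, which in SQL requires an ORDER BY inside OVER (...).

```ruby
Post = Struct.new(:id, :post_type, :created_at)

# Number rows within each post_type, newest first, and keep rn <= per_group.
def top_per_group(posts, per_group)
  posts
    .group_by(&:post_type)                                  # PARTITION BY post_type
    .flat_map do |_type, group|
      group
        .sort_by { |p| -p.created_at }                      # ORDER BY created_at DESC
        .each_with_index
        .select { |_post, index| index + 1 <= per_group }   # WHERE rn <= per_group
        .map(&:first)
    end
end

posts = [
  Post.new(1, :admin, 10), Post.new(2, :admin, 30), Post.new(3, :admin, 20),
  Post.new(4, :user, 5),   Post.new(5, :user, 15)
]
result = top_per_group(posts, 2)
```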
Thanks to @Belayer I was able to come up with a solution.
PER_GROUP = 5
sub_query = Post.select('*', 'row_number() over (partition by "posts"."post_type" ORDER BY posts.created_at DESC ) rn').to_sql
@posts = Post.from("(#{sub_query}) inner_query")
.where('inner_query.rn <= ?', PER_GROUP)
.order(:post_type, created_at: :desc)
.group_by(&:post_type)
Since I am only loading 5 records for each of just a few different types, group_by will work just fine for me.

Rails .where query chained to sql function, is there a way to call it on the results without converting them to an array?

I have a method that ranks user's response rates in our system called ranked_users
def ranked_users
User.joins(:responds).group(:id).select(
"users.*, SUM(CASE WHEN answers.response != 3 THEN 1 ELSE 0 END ) avg, RANK () OVER (
ORDER BY SUM(CASE WHEN answers.response != 3 THEN 1 ELSE 0 END ) DESC, CASE WHEN users.id = '#{
current_user.id
}' THEN 1 ELSE 0 END DESC
) rank"
)
.where('users.active = true')
.where('answers.created_at BETWEEN ? AND ?', Time.now - 12.months, Time.now)
end
result = ranked_users
I then take the top three with top_3 = ranked_users.limit(3)
If the user is not in the top 3, I want to append them with their rank to the list:
user_rank = result.find_by(id: current_user.id)
Whenever I call user_rank.rank it returns 1. I know this is because it's applying the find_by clause first and then ranking them. Is there a way to enforce the find_by clause happens only on the result of the first query? I tried doing result.load.find_by(...) but had the same issue. I could convert the entire result into an array but I want the solution to be highly scalable.
If you expect lots of users with lots of answers and a high load on your rating system, you can create a materialized view for the ranking query with (user_id, avg, rank, etc.) and refresh it periodically (say, a few times per day or even less often) instead of calculating the rank every time. There's the scenic gem for this.
You can even have indexes on rank and user id on the view and your query will be two simple fast reads from it.
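The underlying issue in the question is order of operations: the rank has to be computed over all users before the row for one user is picked out. A plain-Ruby sketch of that difference follows. The Score struct and the numbers are invented for illustration; with a materialized view, the ranking step would already be stored and the lookup would be a simple read.

```ruby
Score = Struct.new(:user_id, :responses)

# RANK() semantics: ties share a rank, and the next rank skips ahead.
def with_rank(scores)
  sorted = scores.sort_by { |s| -s.responses }
  sorted.map do |s|
    rank = sorted.count { |other| other.responses > s.responses } + 1
    { user_id: s.user_id, responses: s.responses, rank: rank }
  end
end

scores = [Score.new(1, 40), Score.new(2, 75), Score.new(3, 75), Score.new(4, 10)]

# Rank first, then filter: the user keeps their position among all users.
ranked_then_found = with_rank(scores).find { |row| row[:user_id] == 4 }

# Filter first, then rank: only one row is left, so the rank is always 1.
found_then_ranked = with_rank(scores.select { |s| s.user_id == 4 }).first
```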

How to find records with maximum value (defined in associated model) per day

I would like to fetch all records per day with highest priority (as defined in associated model)
I'm struggling to build this with activerecord (rails 4.2)
The problem is very similar to this one
Get records with max value for each group of grouped SQL results
except that the age would come from the second model
or also this one
with activerecord how can I select records based on the highest value of a field?
Model 1: Workduration:
date, duration
belongs_to :timerule
Model 2: Timerule:
priority
has_many :workdurations
I put together the data as follows (all in Workduration)
def self.withPrio
select("workdurations.*, timerules.prio AS prio").joins(:timerule)
end
I couldn't find the proper way to build the LEFT OUTER JOIN (self-join) on it.
Try-And-Error-Code:
Workduration.withPrio.joins("left join ? workdurations.date = wd2.date and workdurations.prio < wd2.prio", Workduration.withPrio)
Any help is appreciated!
I ended up doing this with (a big) find_by_sql and a second query to keep the scopes chainable:
scope :maxPrioIds, ->{find_by_sql('SELECT o.*
FROM
(SELECT workdurations.*, timerules.prio AS prio FROM "workdurations" INNER JOIN "timerules" ON "timerules"."id" = "workdurations"."timerule_id") o
LEFT JOIN (SELECT workdurations.*, timerules.prio AS prio FROM "workdurations" INNER JOIN "timerules" ON "timerules"."id" = "workdurations"."timerule_id") b
ON o.date = b.date AND o.prio < b.prio
WHERE b.prio is NULL').map(&:id)}
scope :relevant, -> {where(id: Workduration.maxPrioIds)}
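The LEFT JOIN ... WHERE b.prio is NULL pattern keeps every row for which no other row on the same date has a higher prio. A small in-memory sketch of the same selection, with a hypothetical Entry struct and made-up dates standing in for the joined workdurations/timerules rows:

```ruby
Entry = Struct.new(:id, :date, :prio)

# Keep the rows whose prio is the highest for their date. Ties are all kept,
# just as the anti-join keeps every row that has no higher-prio partner.
def max_prio_per_day(entries)
  entries.group_by(&:date).flat_map do |_date, same_day|
    top = same_day.map(&:prio).max
    same_day.select { |e| e.prio == top }
  end
end

entries = [
  Entry.new(1, "2015-03-01", 1),
  Entry.new(2, "2015-03-01", 3),
  Entry.new(3, "2015-03-02", 2),
  Entry.new(4, "2015-03-02", 2)
]
relevant_ids = max_prio_per_day(entries).map(&:id)
```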

Sequel -- How To Construct This Query?

I have a users table, which has a one-to-many relationship with a user_purchases table via the foreign key user_id. That is, each user can make many purchases (or may have none, in which case he will have no entries in the user_purchases table).
user_purchases has only one other field that is of interest here, which is purchase_date.
I am trying to write a Sequel ORM statement that will return a dataset with the following columns:
user_id
date of the users SECOND purchase, if it exists
So users who have not made at least 2 purchases will not appear in this dataset. What is the best way to write this Sequel statement?
Please note I am looking for a dataset with ALL users returned who have >= 2 purchases
Thanks!
EDIT FOR CLARITY
Here is a similar statement I wrote to get users and their first purchase date (as opposed to 2nd purchase date, which I am asking for help with in the current post):
DB[:users].join(:user_purchases, :user_id => :id)
.select{[:user_id, min(:purchase_date)]}
.group(:user_id)
You don't seem to be worried about the dates, just the counts so
DB[:user_purchases].group_and_count(:user_id).having(:count > 1).all
will return a list of user_ids and counts where the count (of purchases) is >= 2. Something like
[{:count=>2, :user_id=>1}, {:count=>7, :user_id=>2}, {:count=>2, :user_id=>3}, ...]
If you want to get the users with that, the easiest way with Sequel is probably to extract just the list of user_ids and feed that back into another query:
DB[:users].where(:id => DB[:user_purchases].group_and_count(:user_id).
having(:count > 1).all.map{|row| row[:user_id]}).all
Edit:
I felt like there should be a more succinct way and then I saw this answer (from Sequel author Jeremy Evans) to another question using select_group and select_more : https://stackoverflow.com/a/10886982/131226
This should do it without the subselect:
DB[:users].
left_join(:user_purchases, :user_id=>:id).
select_group(:id).
select_more{count(:purchase_date).as(:purchase_count)}.
having(:purchase_count > 1)
It generates this SQL
SELECT `id`, count(`purchase_date`) AS 'purchase_count'
FROM `users` LEFT JOIN `user_purchases`
ON (`user_purchases`.`user_id` = `users`.`id`)
GROUP BY `id` HAVING (`purchase_count` > 1)
Generally, this could be the SQL query that you need:
SELECT u.id, up1.purchase_date FROM users u
LEFT JOIN user_purchases up1 ON u.id = up1.user_id
LEFT JOIN user_purchases up2 ON u.id = up2.user_id AND up2.purchase_date < up1.purchase_date
GROUP BY u.id, up1.purchase_date
HAVING COUNT(up2.purchase_date) = 1;
Try converting that to sequel, if you don't get any better answers.
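As a sanity check on what that self-join selects (for each user, the purchase that has exactly one earlier purchase), here is the same result computed in plain Ruby over made-up rows. The Purchase struct and the sample dates are assumptions for illustration only, and the sketch assumes no two purchases by the same user share a date.

```ruby
require "date"

Purchase = Struct.new(:user_id, :purchase_date)

# Returns { user_id => date of second purchase } for users with >= 2 purchases.
def second_purchase_dates(purchases)
  purchases
    .group_by(&:user_id)
    .transform_values { |rows| rows.map(&:purchase_date).sort[1] } # nil if only one
    .compact
end

purchases = [
  Purchase.new(1, Date.new(2012, 5, 3)),
  Purchase.new(1, Date.new(2012, 1, 9)),
  Purchase.new(1, Date.new(2012, 8, 20)),
  Purchase.new(2, Date.new(2012, 2, 14)),
  Purchase.new(3, Date.new(2012, 3, 1)),
  Purchase.new(3, Date.new(2012, 3, 30))
]
result = second_purchase_dates(purchases)
```

User 2 has a single purchase and is therefore absent from the result, matching the requirement that only users with at least two purchases appear.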
The date of the user's second purchase would be the second row retrieved if you do an order_by(:purchase_date) as part of your query.
To access that, do a limit(2) to constrain the query to two results then take the [-1] (or last) one. So, if you're not using models and are working with datasets only, and know the user_id you're interested in, your (untested) query would be:
DB[:user_purchases].where(:user_id => user_id).order_by(:purchase_date).limit(2).all[-1]
Here's some output from Sequel's console:
DB[:user_purchases].where(:user_id => 1).order_by(:purchase_date).limit(2).sql
=> "SELECT * FROM user_purchases WHERE (user_id = 1) ORDER BY purchase_date LIMIT 2"
Add the appropriate select clause:
.select(:user_id, :purchase_date)
and you should be done:
DB[:user_purchases].select(:user_id, :purchase_date).where(:user_id => 1).order_by(:purchase_date).limit(2).sql
=> "SELECT user_id, purchase_date FROM user_purchases WHERE (user_id = 1) ORDER BY purchase_date LIMIT 2"

Find overlapping seasons where seasons have_many date_ranges

I have the following setup:
class Season < AR::Base
has_many :date_ranges
end
class DateRange < AR::Base
# has a :starts_at & :ends_at
end
How would I find all overlapping seasons from a season instance? I have already tried a couple of different queries (below). But the problem I keep hitting is that the season I'm checking may itself have multiple date_ranges. I could solve it with a loop, but I'd rather use a single query.
This query looks up all the seasons that overlap but it only does that for 1 input date_range
Season.joins(:date_ranges).where("starts_at <= ? AND ends_at >= ?", ends_at, starts_at)
Maybe I need something to chain a couple of ORs together for each date_range on the instance, but where() only uses AND.
So in short, finding the overlap is not the problem, but how do I find overlap of multiple date_ranges to the entire database?
The easiest way to do this is through straight SQL. Something like this:
DateRange.find_by_sql(%q{
select a.*
from date_ranges a
join date_ranges b on
a.id < b.id
and (
(a.ends_at >= b.starts_at and a.ends_at <= b.ends_at)
or (a.starts_at >= b.starts_at and a.starts_at <= b.ends_at)
or (a.starts_at <= b.starts_at and a.ends_at >= b.ends_at)
)
where a.season_id = ?
}, season_id)
The basic idea is to join the table to itself so that you can easily compare the ranges. The a.id < b.id is there to get unique results and filter out "ranges matches itself" cases. The inner or conditions check for both types of overlaps:
[as-----ae]           [as-----ae]
    [bs-----be]   [bs-----be]
and
[as--------------ae]        [as----ae]
    [bs----be]         [bs--------------be]
You might want to think about the end points, though: that query considers two intervals to overlap if they only match at an endpoint, and that might not be what you want.
Presumably you already have a unique constraint on the (season_id, starts_at, ends_at) triples and presumably you're already ensuring that starts_at <= ends_at.
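For closed intervals with starts_at <= ends_at, the three OR conditions reduce to the single test already used in the question's query: a.starts_at <= b.ends_at AND a.ends_at >= b.starts_at. A short Ruby sketch of that predicate, applied to seasons that each have several ranges, follows. The Span struct, the method names, and the sample dates are hypothetical.

```ruby
require "date"

Span = Struct.new(:starts_at, :ends_at)

# Closed-interval overlap: touching end points count as overlapping.
def overlap?(a, b)
  a.starts_at <= b.ends_at && a.ends_at >= b.starts_at
end

# Two seasons overlap if any range of one overlaps any range of the other.
def seasons_overlap?(ranges_a, ranges_b)
  ranges_a.any? { |a| ranges_b.any? { |b| overlap?(a, b) } }
end

summer = [Span.new(Date.new(2013, 6, 1), Date.new(2013, 6, 30)),
          Span.new(Date.new(2013, 8, 1), Date.new(2013, 8, 31))]
july   = [Span.new(Date.new(2013, 7, 1), Date.new(2013, 7, 31))]
late   = [Span.new(Date.new(2013, 8, 15), Date.new(2013, 9, 15))]

summer_vs_july = seasons_overlap?(summer, july)
summer_vs_late = seasons_overlap?(summer, late)
```

Switching the comparisons to strict < and > would treat ranges that merely touch at an end point as non-overlapping.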
