How to get the top 5 per enum of a model in rails? - ruby-on-rails

Let's say I have a model named post, which has an enum named post_type which can either be
admin, public or user
#app/models/post.rb
class Post < ApplicationRecord
enum post_type: [ :admin, :public, :user ]
end
How can I select 5 last created posts from each category?
I can't think of any other solution than this:
PER_GROUP = 5
admin_posts = Post.admin.order(created_at: :desc).limit(PER_GROUP)
user_posts = Post.user.order(created_at: :desc).limit(PER_GROUP)
public_posts = Post.public.order(created_at: :desc).limit(PER_GROUP)
Is there any way I could fetch all the rows in the required manner from just a single query to the database.
STACK
RAILS : 6
PostgresSQL: 9.4

I am not sure how to translate into RAILS, but it is straight forward Postgres query. You use the row_number window function in a sub-select then keep only rows with row_number less than or equal 5 on the outer select.
select *
from (select post_txt
, posted_type
, row_number() over (partition by posted_type) rn
from enum_table
) pt
where rn <= 5
order by posted_type;
One thing to look out for is the sorting on an enum. Doing so gives results in order of the definition, not a "natural order" (alphanumeric in this case). See example here.

Thanks to #Belayer i was able to come up with a solution.
PER_GROUP = 5
sub_query = Post.select('*', 'row_number() over (partition by "posts"."post_type" ORDER BY posts.created_at DESC ) rn').to_sql
#posts = Post.from("(#{sub_query}) inner_query")
.where('inner_query.rn <= ?', PER_GROUP')
.order(:post_type, created_at: :desc)
.group_by(&:post_type)
Since i am only loading 5 records across just a few different types group_by will work just fine for me.

Related

Update models based on association record value

I have thousands of price comparators where each of them have many products. The comparator has an attribute :minimum_price which is the minimum price of it's products. What would be the fastest way to update all comparators :minimum_price
Comparator.rb
has_many :products
Product.rb
belongs_to :comparator
Let's imagine the following:
comparator_1 have 3 products with a price of 3, 5, 7
comparator_2 have 2 products with a price of 2, 4
How could I update all comparators :minimum_price in one query ?
Updating all in one query will require the use of a CTE which are not supported by default by ActiveRecord. There are libraries that provide you with tools to use them in Rails (e.g. this) or you can also do it with a direct query like this:
ActiveRecord::Base.connection.execute("
update comparators set minimum_price = min_vals.min_price
from (
select comparators.id as comp_id, min(products.price) as min_price
from comparators inner join products on comparators.id = products.comparator_id
group by comparators.id
) as min_vals
where comparators.id = min_vals.comp_id
")
NOTE: This is a postgresql query, so the syntax may vary slightly if it's a different database.
Try this, but I don't know how you store your minimum values.
Comparator.update_all([
'minimum_price = ?, updated_at = ?',
Product.find(self.product_id).price, Time.now
])

ActiveRecord sort model on attribute of last has_many relation

I've been digging around for this for awhile... I can't find a graceful solution. I have loans and loans has_many :decisions. decisions has an attribute that I care about, called risk_rating.
I'd like to sort loans based on the most recent decision (based on created_at, per usual), but by the risk_rating.
Loan.includes(:decisions).references(:decisions).order('decisions.risk_rating DESC') doesn't work...
I want loans... sorted by their most recent decision's risk_rating. This seems like it should be easier than it is.
I'm currently doing this outside of the database like this, but it's chewing up time and memory:
Loan.all.sort do |x,y|
x.decisions.last.try(:risk_rating).to_f <=> y.decisions.last.try(:risk_rating).to_f
end
I'd like to show the performance I'm getting with the proposed answer, along with an inaccuracy...
Benchmark.bm do |x|
x.report{ Loan.joins('LEFT JOIN decisions ON decisions.loan_id = loans.id').group('loans.id').order('MAX(decisions.risk_rating) DESC').limit(10).map{|l| l.decisions.last.try(:risk_rating)} }
end
user system total real
0.020000 0.000000 0.020000 ( 20.573096)
=> [0.936775, 0.934465, 0.932088, 0.922352, 0.921882, 0.794724, 0.919432, 0.918385, 0.916952, 0.914938]
The order isn't right. That 0.794724 is out of place.
To that extent... I'm only seeing one attribute in the proposed answer. I don't see the connection =/
Alright, it looks like I'm working late tonight because I couldn't help but jump in:
class Loan < ApplicationRecord
has_many :decisions
has_one :latest_decision, -> { merge(Decision.latest) }, class_name: 'Decision'
end
class Decision < ApplicationRecord
belongs_to :loan
def latest
t1 = arel_table
t2 = arel_table.alias('t2')
# Self join based on `loan_id` prefer latest `created_at`
join_on = t1[:loan_id].eq(t2[:loan_id]).and(
t1[:created_at].lt(t2[:created_at]))
where(t2[:loan_id].eq(nil)).joins(
t1.create_join(t2, t1.create_on(join_condition), Arel::Nodes::OuterJoin)
)
end
end
Loan.includes(:latest_decision)
This doesn't sort, just provides the latest decision for each loan. Throwing an order that references access_codes messes things up because of the table aliasing. I don't have the time to work that kink out now, but I bet you can figure it out if you check out some of the great resources on Arel and how to use it with ActiveRecord. I really enjoy this one.
At first let's write sql-query which will select necessary data. SO contains a question which may helps here: Select most recent row with GROUP BY in MySQL. My best version:
SELECT loans.*
FROM loans
LEFT JOIN (
SELECT loan_id, MAX(id) as id
FROM decisions
GROUP BY loan_id) d ON d.loan_id = loans.id
LEFT JOIN decisions ON decisions.id = d.id
ORDER BY decisions.risk_rating DESC
This code suppose MAX(id) gives id of the recent row in group.
You may do the same query by this Rails code:
sub_query =
Decision.select('loan_id, MAX(id) as id').
group(:loan_id).to_sql
Loan.
joins("LEFT JOIN (#{sub_query}) d ON d.loan_id = loans.id").
joins("LEFT JOIN decisions ON decisions.id = d.id").
order("decisions.risk_rating DESC")
Unfortunately, I don't have MySQL at hand and I can't try this code. Hope it will work.

Sorting 2 arrays that have been added together

In my app, users can create galleries that their work may or may not be in. Users have and belong to many Galleries, and each gallery has a 'creator' that is designated by the gallery's user_id field.
So to get the 5 latest galleries a user is in, I can do something like:
included_in = #user.galleries.order('created_at DESC').uniq.first(5)
# SELECT DISTINCT "galleries".* FROM "galleries" INNER JOIN "galleries_users" ON "galleries"."id" = "galleries_users"."gallery_id" WHERE "galleries_users"."user_id" = 10 ORDER BY created_at DESC LIMIT 5
and to get the 5 latest galleries they've created, I can do:
created = Gallery.where(user_id: id).order('created_at DESC').uniq.first(5)
# SELECT DISTINCT "galleries".* FROM "galleries" WHERE "galleries"."user_id" = 10 ORDER BY created_at DESC LIMIT 5
I want to display these two together, so that it's the 5 latest galleries that they've created OR they're in. Something like the equivalent of:
(included_in + created).order('created_at DESC').uniq.first(5)
Does anyone know how to construct an efficient query or post-query loop that does this?
order isn't available, but you can use sort - or sort_by as suggested by jvnill in this answer
As you've stated, you can use the following code:
(included_in + created).sort_by(&:created_at).uniq.first(5)
How about:
Gallery.joins("LEFT JOIN galleries_users ON galleries.id = galleries_users.gallery_id")
.where("galleries_users.user_id = :user_id OR galleries.user_id = :user_id", user_id: id)
.order("galleries.created_at DESC")
.limit(5)
I'm assuming the name of your HABTM join table is galleries_users (which is the default).
Union the two queries and select the order from the union.

How to get records based on an offset around a particular record?

I'm building a search UI which searches for comments. When a user clicks on a search result (comment), I want to show the surrounding comments.
My model:
Group (id, title) - A Group has many comments
Comment (id, group_id, content)
For example:
When a user clicks on a comment with comment.id equal to 26. I would first find all the comments for that group:
comment = Comment.find(26)
comments = Comment.where(:group_id => comment.group_id)
I now have all of the group's comments. What I then want to do is show comment.id 26, with a max of 10 comments before and 10 comments after.
How can I modify comments to show that offset?
Sounds simple, but it's tricky to get the best performance for this. In any case, you must let the database do the work. That will be faster by an order of magnitude than fetching all rows and filter / sort on the client side.
If by "before" and "after" you mean smaller / bigger comment.id, and we further assume that there can be gaps in the id space, this one query should do all:
WITH x AS (SELECT id, group_id FROM comment WHERE id = 26) -- enter value once
(
SELECT *
FROM x
JOIN comment c USING (group_id)
WHERE c.id > x.id
ORDER BY c.id
LIMIT 10
)
UNION ALL
(
SELECT *
FROM x
JOIN comment c USING (group_id)
WHERE c.id < x.id
ORDER BY c.id DESC
LIMIT 10
)
I'll leave paraphrasing that in Ruby syntax to you, that's not my area of expertise.
Returns 10 earlier comments and 10 later ones. Fewer if fewer exist. Use <= in the 2nd leg of the UNION ALL query to include the selected comment itself.
If you need the rows sorted, add another query level on top with ORDER BY.
Should be very fast in combination with these two indexes for the table comment:
one on (id) - probably covered automatically the primary key.
one on (group_id, id)
For read-only data you could create a materialized view with a gap-less row-number that would make this even faster.
More explanation about parenthesis, indexes, and performance in this closely related answer.
Something like:
comment = Comment.find(26)
before_comments = Comment.
where('created_at <= ?', comment.created_at).
where('id != ?', comment.id).
where(group_id: comment.group_id).
order('created_at DESC').limit(10)
after_comments = Comment.
where('created_at >= ?', comment.created_at).
where('id != ?', comment.id).
where(group_id: comment.group_id).
order('created_at DESC').limit(10)

Sequel -- How To Construct This Query?

I have a users table, which has a one-to-many relationship with a user_purchases table via the foreign key user_id. That is, each user can make many purchases (or may have none, in which case he will have no entries in the user_purchases table).
user_purchases has only one other field that is of interest here, which is purchase_date.
I am trying to write a Sequel ORM statement that will return a dataset with the following columns:
user_id
date of the users SECOND purchase, if it exists
So users who have not made at least 2 purchases will not appear in this dataset. What is the best way to write this Sequel statement?
Please note I am looking for a dataset with ALL users returned who have >= 2 purchases
Thanks!
EDIT FOR CLARITY
Here is a similar statement I wrote to get users and their first purchase date (as opposed to 2nd purchase date, which I am asking for help with in the current post):
DB[:users].join(:user_purchases, :user_id => :id)
.select{[:user_id, min(:purchase_date)]}
.group(:user_id)
You don't seem to be worried about the dates, just the counts so
DB[:user_purchases].group_and_count(:user_id).having(:count > 1).all
will return a list of user_ids and counts where the count (of purchases) is >= 2. Something like
[{:count=>2, :user_id=>1}, {:count=>7, :user_id=>2}, {:count=>2, :user_id=>3}, ...]
If you want to get the users with that, the easiest way with Sequel is probably to extract just the list of user_ids and feed that back into another query:
DB[:users].where(:id => DB[:user_purchases].group_and_count(:user_id).
having(:count > 1).all.map{|row| row[:user_id]}).all
Edit:
I felt like there should be a more succinct way and then I saw this answer (from Sequel author Jeremy Evans) to another question using select_group and select_more : https://stackoverflow.com/a/10886982/131226
This should do it without the subselect:
DB[:users].
left_join(:user_purchases, :user_id=>:id).
select_group(:id).
select_more{count(:purchase_date).as(:purchase_count)}.
having(:purchase_count > 1)
It generates this SQL
SELECT `id`, count(`purchase_date`) AS 'purchase_count'
FROM `users` LEFT JOIN `user_purchases`
ON (`user_purchases`.`user_id` = `users`.`id`)
GROUP BY `id` HAVING (`purchase_count` > 1)"
Generally, this could be the SQL query that you need:
SELECT u.id, up1.purchase_date FROM users u
LEFT JOIN user_purchases up1 ON u.id = up1.user_id
LEFT JOIN user_purchases up2 ON u.id = up2.user_id AND up2.purchase_date < up1.purchase_date
GROUP BY u.id, up1.purchase_date
HAVING COUNT(up2.purchase_date) = 1;
Try converting that to sequel, if you don't get any better answers.
The date of the user's second purchase would be the second row retrieved if you do an order_by(:purchase_date) as part of your query.
To access that, do a limit(2) to constrain the query to two results then take the [-1] (or last) one. So, if you're not using models and are working with datasets only, and know the user_id you're interested in, your (untested) query would be:
DB[:user_purchases].where(:user_id => user_id).order_by(:user_purchases__purchase_date).limit(2)[-1]
Here's some output from Sequel's console:
DB[:user_purchases].where(:user_id => 1).order_by(:purchase_date).limit(2).sql
=> "SELECT * FROM user_purchases WHERE (user_id = 1) ORDER BY purchase_date LIMIT 2"
Add the appropriate select clause:
.select(:user_id, :purchase_date)
and you should be done:
DB[:user_purchases].select(:user_id, :purchase_date).where(:user_id => 1).order_by(:purchase_date).limit(2).sql
=> "SELECT user_id, purchase_date FROM user_purchases WHERE (user_id = 1) ORDER BY purchase_date LIMIT 2"

Resources