Get most recent distinct records - ruby-on-rails

Every resort has many snow reports.
I want to get the most recent snow report for every resort where the field snow_summit in the table snow_reports is > 0.
So I tried to select distinct on resort_id, which is the foreign key in snow_reports, and order by updated_at, but this is not possible since updated_at does not occur in the select.
So how do I get only the most recent records of an associated model in Rails 4 (on Postgres)?
SnowReport belongs_to Resort
Resort has_many snow_reports
Table snow_reports has id,resort_id,updated_at,snow_summit
Ideally the result is joined for performance reasons.
My approach fails
SnowReport.includes(:resort).select(:resort_id).group(:resort_id).having('max(snow_summit)> 0').order('max(snow_reports.updated_at) DESC')
since SnowReport.id is nil
#<ActiveRecord::Relation [#<SnowReport id: nil, resort_id: 1735>, ...
edit:
I found a solution in plain SQL.
How can I translate this to Rails?
select * from resorts
where id in (
  select distinct(resort_id) from snow_reports
  where snow_summit > 0
    and created_at > (now() - interval '3 days')
    and created_at in (select max(created_at) from snow_reports group by resort_id)
);

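Try this (PostgreSQL only, since it relies on DISTINCT ON):
SnowReport.select("DISTINCT ON (resort_id) snow_reports.*").where("snow_summit > ?", 0).order("resort_id, updated_at DESC").includes(:resort)
Selecting id while grouping only by resort_id raises an error on Postgres, because id is neither grouped nor aggregated. DISTINCT ON keeps the first row per resort_id in the given order, which here is the most recently updated report with snow_summit > 0.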
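To make the intended result concrete, here is a minimal in-memory sketch in plain Ruby, with no database involved. The Report struct and the sample rows are hypothetical, and the sketch assumes that "most recent" means the latest updated_at among the reports with snow_summit > 0, which is only one reading of the question.

```ruby
# Hypothetical stand-in for a snow_reports row (not the real ActiveRecord model).
Report = Struct.new(:id, :resort_id, :updated_at, :snow_summit)

reports = [
  Report.new(1, 10, Time.utc(2015, 1, 1), 40),
  Report.new(2, 10, Time.utc(2015, 1, 3), 55),
  Report.new(3, 11, Time.utc(2015, 1, 2), 0),
  Report.new(4, 12, Time.utc(2015, 1, 4), 20),
  Report.new(5, 12, Time.utc(2015, 1, 5), 0)
]

# Keep reports with snow, pick the newest one per resort, newest resort first.
latest_per_resort = reports
  .select { |r| r.snow_summit > 0 }
  .group_by(&:resort_id)
  .map { |_resort_id, rows| rows.max_by(&:updated_at) }
  .sort_by(&:updated_at)
  .reverse

latest_ids = latest_per_resort.map(&:id)
```

Resort 11 drops out entirely because its only report has no snow, and resort 12 is represented by report 4 rather than its newer, snowless report 5. Whether that is the desired behaviour depends on how the requirement is read.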

Related

How to delete all logs except last 100 for each user in single table?

I have a single logs table which contains entries for users. I want to prune (delete) all but the last 100 entries for each user. I'd like to do this in the most efficient way (one statement using ActiveRecord if possible).
I know I can use the following:
.order(created_at: :desc) to get the records sorted
.offset(100) to get all records except the ones I want to keep
.ids to pluck the record ids
select(:user_id).distinct to get a list of all users in the table
The table has id, user_id, created_at columns (and others not pertinent to this question).
Each user should have at least the last 100 log entries remaining in the logs table.
Not really sure how to do this using ruby syntax with my Log model. If it can't be done efficiently using ruby then I'll resort to using the SQL equivalent.
Any help much appreciated.
In SQL, you could do this:
DELETE FROM logs
USING (SELECT id
       FROM (SELECT id,
                    row_number()
                       OVER (PARTITION BY user_id
                             ORDER BY created_at DESC)
                       AS rownr
             FROM logs
            ) AS a
       WHERE rownr > 100
      ) AS b
WHERE logs.id = b.id;
If the table is large, this will be slow.
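The selection logic of that statement can be sketched in plain Ruby on in-memory rows. This is only an illustration of what the window function picks, not a replacement for the SQL: the LogRow struct, the ids_to_prune helper and the integer timestamps are all hypothetical simplifications.

```ruby
# Hypothetical stand-in for a logs row; created_at is an integer for brevity.
LogRow = Struct.new(:id, :user_id, :created_at)

# Returns the ids that fall outside the newest `keep` rows of each user,
# mirroring row_number() OVER (PARTITION BY user_id ORDER BY created_at DESC) > keep.
def ids_to_prune(rows, keep)
  rows.group_by(&:user_id).flat_map do |_user_id, user_rows|
    user_rows.sort_by(&:created_at).reverse.drop(keep).map(&:id)
  end
end

rows = [
  LogRow.new(1, 1, 100), LogRow.new(2, 1, 200), LogRow.new(3, 1, 300),
  LogRow.new(4, 2, 150), LogRow.new(5, 2, 250)
]

# Keep the 2 newest rows per user; only user 1 has a surplus row.
prune = ids_to_prune(rows, 2).sort
```

In a Rails application the SQL statement itself would more likely be run as a raw string, for example through ActiveRecord::Base.connection.execute, so that the database does the work in one pass.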

Order by a date of the first hasMany child of a model

I have the following structure for events and their recurrences:
Event: (id, name, venue)
has_many :occurences
Occurence(id, date, event_id)
belongs_to :event
I want to get events (only event data) that have a recurrence later than today
(occurences.date > Date.today).
Events should be ordered chronologically by the date of their next recurrence (later than today).
The following query gives me event data alright but it doesn't let me order
Event.joins(:occurences).where("occurences.date>?",DateTime.now).distinct#.order('occurences.date')
but I can't order it since it says
Expression #1 of ORDER BY clause is not in SELECT list, references
column 'eventdatabase.occurences.starts'
I need to use distinct to ensure I get only one event regardless of how many occurences it has
I am using MySQL and Rails 5.
I'm interpreting
Events should be ordered by the date of their next recurrence(greater than today) in chronological order
as the earliest occurences.date.
Event.joins(:occurences)
.where("occurences.date>?",DateTime.now)
.group(:id)
.select('events.*, min(occurences.date) as next_recurrence')
.order('next_recurrence')
A GROUP BY is like a more powerful DISTINCT: you get one row per group. Strictly speaking we are selecting events.*, so every selected column should be covered by the grouping, but MySQL (like PostgreSQL 9.1 and later) accepts grouping by the primary key alone, because all other columns of events are functionally dependent on it.
With a GROUP BY, aggregate functions such as MIN operate on each group.
For this task the SQL query may look like:
SELECT E.id AS event_id, MIN(OC.date) AS next_date
FROM events E
JOIN occurences OC ON OC.event_id = E.id
WHERE OC.date > NOW()
GROUP BY E.id
ORDER BY MIN(OC.date);
Here is a sandbox: http://rextester.com/HCTE57488
So I guess the Ruby code will be:
Event.joins(:occurences)
.where('occurences.date > ?', DateTime.now)
.group('events.id')
.order('min(occurences.date)')
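For readers who want to check what both queries are expected to return, here is a small plain-Ruby sketch of the same logic on in-memory data. The structs, the upcoming_events helper and the sample dates are hypothetical; the spelling "occurence" follows the question.

```ruby
require 'date'

# Hypothetical stand-ins for the two tables.
Event = Struct.new(:id, :name)
Occurence = Struct.new(:event_id, :date)

# Events with at least one occurrence after `today`,
# ordered by their earliest such occurrence (the MIN per group).
def upcoming_events(events, occurences, today)
  next_dates = occurences
    .select { |o| o.date > today }
    .group_by(&:event_id)
    .transform_values { |rows| rows.map(&:date).min }

  events
    .select { |e| next_dates.key?(e.id) }
    .sort_by { |e| next_dates[e.id] }
end

events = [Event.new(1, "A"), Event.new(2, "B"), Event.new(3, "C")]
occurences = [
  Occurence.new(1, Date.new(2018, 1, 5)),
  Occurence.new(1, Date.new(2018, 1, 20)),
  Occurence.new(2, Date.new(2018, 1, 12)),
  Occurence.new(2, Date.new(2018, 1, 30)),
  Occurence.new(3, Date.new(2018, 1, 1))
]

upcoming = upcoming_events(events, occurences, Date.new(2018, 1, 10))
upcoming_ids = upcoming.map(&:id)
```

Event 3 is excluded because its only occurrence is in the past, and event 2 sorts before event 1 because its next occurrence (12 January) is earlier.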

PostgreSQL in Rails: sorting object by two date attributes in descending order

I have an index of active job positions. Currently, they're sorted by the most recent, i.e. by created_at. However, I've recently added a renewal feature that updates a renewal_date attribute without updating created_at.
What I want to achieve is to sort the list in descending order using both renewal_date and created_at.
jobs = Job.where(active: true).reorder("renewal_date DESC NULLS LAST", "created_at DESC")
With this code, the renewed job will always be at the top regardless of how many new jobs are created. How do I sort it so it checks for the date for both attributes and sorts it according to most recent?
Your code will order first by renewal_date with nulls at the end, and then will look at the created_at if two records have the same renewal_date.
I assume that what you want to do is something like "max(renewal_date, created_at)", which will take the "last modification date", or another custom way to compare the two fields.
If so, you can find the answer here: merge and order two columns in the same model
Job.where(active: true).reorder('GREATEST(renewal_date, created_at) DESC')
Let's try standard SQL, so that it works with any type of database:
Job.where(active: true).order('CASE WHEN renewal_date IS NULL THEN created_at ELSE renewal_date END DESC')
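The difference between the original two-key ordering and a "latest activity" ordering can be seen on a few in-memory rows. This is a plain-Ruby sketch with a hypothetical Job struct and integer timestamps. It models PostgreSQL's GREATEST, which ignores NULL arguments; the CASE variant gives the same order as long as a renewal_date is never earlier than the job's created_at.

```ruby
# Hypothetical stand-in for a jobs row; timestamps are integers for brevity.
Job = Struct.new(:id, :created_at, :renewal_date)

jobs = [
  Job.new(1, 10, 50),   # old job, renewed at t=50
  Job.new(2, 60, nil),  # newer job, never renewed
  Job.new(3, 20, nil)   # older job, never renewed
]

# Models "renewal_date DESC NULLS LAST, created_at DESC":
# any renewed job sorts ahead of every non-renewed one.
two_key = jobs.sort_by do |j|
  [j.renewal_date ? 0 : 1, -(j.renewal_date || 0), -j.created_at]
end

# Models "GREATEST(renewal_date, created_at) DESC" on PostgreSQL:
# sort by whichever timestamp is later, ignoring a missing renewal_date.
latest_activity = jobs.sort_by { |j| -[j.created_at, j.renewal_date].compact.max }

two_key_ids = two_key.map(&:id)
latest_activity_ids = latest_activity.map(&:id)
```

With the two-key ordering the renewed job 1 stays on top, while the latest-activity ordering lets the newer job 2 (created at t=60) overtake it.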

Order with DISTINCT ids in rails with postgres

I have the following code to join two tables microposts and activities with micropost_id column and then order based on created_at of activities table with distinct micropost id.
Micropost.joins("INNER JOIN activities ON
(activities.micropost_id = microposts.id)").
where('activities.user_id= ?',id).order('activities.created_at DESC').
select("DISTINCT (microposts.id), *")
which should return all micropost columns. This is not working in my development environment:
PG::InvalidColumnReference: ERROR: for SELECT DISTINCT, ORDER BY expressions must appear in select list
If I add activities.created_at to the SELECT DISTINCT, I get repeated micropost ids because they have distinct activities.created_at values. I have done a lot of searching to get this far, but the problem always persists because of this Postgres rule, which exists to avoid an arbitrary selection.
I want to select by order of activities.created_at with distinct micropost_id.
Please help.
To start with, we need to quickly cover what SELECT DISTINCT is actually doing. It looks like just a nice keyword to make sure you only get back distinct values, which shouldn't change anything, right? Except as you're finding out, behind the scenes, SELECT DISTINCT is actually acting more like a GROUP BY. If you want to select distinct values of something, you can only order that result set by the same values you're selecting -- otherwise, Postgres doesn't know what to do.
To explain where the ambiguity comes from, consider this simple set of data for your activities:
CREATE TABLE activities (
  id INTEGER PRIMARY KEY,
  created_at TIMESTAMP WITH TIME ZONE,
  micropost_id INTEGER REFERENCES microposts(id)
);
INSERT INTO activities (id, created_at, micropost_id)
VALUES (1, current_timestamp, 1),
       (2, current_timestamp - interval '3 hours', 1),
       (3, current_timestamp - interval '2 hours', 2);
You stated in your question that you want "distinct micropost_id" "based on order of activities.created_at". It's easy to order these activities by descending created_at (1, 3, 2), but both 1 and 2 have the same micropost_id of 1. So if you want the query to return just micropost IDs, should it return 1, 2 or 2, 1?
If you can answer the above question, you need to take your logic for doing so and move it into your query. Let's say that, and I think this is pretty likely, you want this to be a list of microposts which were most recently acted on. In that case, you want to sort the microposts in descending order of their most recent activity. Postgres can do that for you, in a number of ways, but the easiest way in my mind is this:
SELECT micropost_id
FROM activities
JOIN microposts ON activities.micropost_id = microposts.id
GROUP BY micropost_id
ORDER BY MAX(activities.created_at) DESC
Note that I've dropped the SELECT DISTINCT bit in favor of using GROUP BY, since Postgres handles them much better. The MAX(activities.created_at) bit tells Postgres to, for each group of activities with the same micropost_id, sort by only the most recent.
You can translate the above to Rails like so:
Micropost.select('microposts.*')
.joins("JOIN activities ON activities.micropost_id = microposts.id")
.where('activities.user_id' => id)
.group('microposts.id')
.order('MAX(activities.created_at) DESC')
Hope this helps! You can play around with this sqlFiddle if you want to understand more about how the query works.
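The grouping rule described in this answer can also be reproduced on the three sample activities in plain Ruby, without a database. The Activity struct and the fixed timestamp are hypothetical stand-ins for the rows inserted above.

```ruby
# Hypothetical stand-in for an activities row.
Activity = Struct.new(:id, :created_at, :micropost_id)

now = Time.utc(2020, 1, 1, 12)
activities = [
  Activity.new(1, now, 1),
  Activity.new(2, now - 3 * 3600, 1),
  Activity.new(3, now - 2 * 3600, 2)
]

# One entry per micropost, ordered by that micropost's most recent activity,
# i.e. GROUP BY micropost_id ORDER BY MAX(created_at) DESC.
micropost_ids = activities
  .group_by(&:micropost_id)
  .map { |micropost_id, rows| [micropost_id, rows.map(&:created_at).max] }
  .sort_by { |_id, latest| latest }
  .reverse
  .map(&:first)
```

Micropost 1 comes first because its newest activity is the most recent overall; its older activity (id 2) no longer influences the order once the maximum per group is taken.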
Try the below code
Micropost.select('microposts.*, activities.created_at')
.joins("INNER JOIN activities ON (activities.micropost_id = microposts.id)")
.where('activities.user_id= ?',id)
.order('activities.created_at DESC')
.uniq

Ordering records by frequency with Arel

How do I retrieve a set of records, ordered by count, in Arel? I have a model which tracks how many views a product gets. I want to find the X most frequently viewed products over the last Y days.
This problem has cropped up while migrating to PostgreSQL from MySQL, due to MySQL being a bit forgiving in what it will accept. This code, from the View model, works with MySQL, but not PostgreSQL due to non-aggregated columns being included in the output.
scope :popular, lambda { |time_ago, freq|
where("created_on > ?", time_ago).group('product_id').
order('count(*) desc').limit(freq).includes(:product)
}
Here's what I've got so far:
View.select("id, count(id) as freq").where('created_on > ?', 5.days.ago).
order('freq').group('id').limit(5)
However, this returns the single ID of the model, not the actual model.
Update
I went with:
select("product_id, count(id) as freq").
where('created_on > ?', time_ago).
order('freq desc').
group('product_id').
limit(freq)
On reflection, it's not really logical to expect a complete model when the results are made up of GROUP BY and aggregate functions results, as returned data will (most likely) match no actual model (row).
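A plain-Ruby sketch of the same aggregation shows why the result is a list of (product_id, frequency) pairs rather than model rows. The ViewRow struct, the popular_product_ids helper and the integer timestamps are hypothetical; ties are broken by product_id here only to keep the output deterministic.

```ruby
# Hypothetical stand-in for a views row; created_on is an integer for brevity.
ViewRow = Struct.new(:id, :product_id, :created_on)

# Returns [product_id, freq] pairs for the `limit` most viewed products
# among views created after `since`.
def popular_product_ids(views, since, limit)
  views
    .select { |v| v.created_on > since }
    .group_by(&:product_id)
    .map { |product_id, rows| [product_id, rows.size] }
    .sort_by { |product_id, freq| [-freq, product_id] }
    .first(limit)
end

views = [
  ViewRow.new(1, 7, 10), ViewRow.new(2, 7, 11), ViewRow.new(3, 7, 1),
  ViewRow.new(4, 8, 12), ViewRow.new(5, 9, 13), ViewRow.new(6, 9, 14)
]

top_two = popular_product_ids(views, 5, 2)
```

Each pair summarises several view rows, so there is no single view id that could sensibly be attached to it. The products themselves would have to be loaded separately by product_id.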
You have to extend your select clause with every column you wish to retrieve, or use:
select("views.*, count(id) as freq")
The SQL would be:
SELECT product_id, product, count(*) AS freq
FROM views
WHERE created_on > '$5_days_ago'::timestamp
GROUP BY product_id, product
ORDER BY count(*) DESC, product
LIMIT 5;
Extrapolating from your example, it should be:
View.select("product_id, product, count(*) as freq").where('created_on > ?', 5.days.ago).
order("count(*) DESC" ).group('product_id, product').limit(5)
Disclaimer: Ruby syntax is a foreign language to me.

Resources