I have a table of Projects. The same project may appear multiple times with the same name, but each record has a different created_at month. I'm trying to select the most recent record of each project in my table. Using select works, but I need the entire record so that I can loop through the records and print out different attributes, e.g. price.
I've tried:
Project.distinct(:project_name) – Prints all records (to check this I copied the project name and did a find and all projects with the identical name would still print out)
Project.order(project_name: :asc, created_at: :desc).uniq(:project_name) – Same result as above
Project.select(:project_name).distinct – Pulls only 1 of each, however it only selects the project name and no other data from the record.
This is a case where PostgreSQL's DISTINCT ON comes to the rescue.
This should work:
Project.select("DISTINCT ON (project_name) *").order("project_name, created_at DESC")
To select only particular columns, specify them instead of *:
Project.select("DISTINCT ON (project_name) project_name, created_at, price").order("project_name, created_at DESC")
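To make the behaviour of DISTINCT ON easier to picture, here is a minimal plain-Ruby sketch of the same logic, with no database involved. The Struct and the sample rows are made up for illustration; only the column names come from the question.

```ruby
require "date"

# Hypothetical stand-in for rows of the projects table.
Project = Struct.new(:project_name, :created_at, :price)

rows = [
  Project.new("alpha", Date.new(2023, 1, 1), 100),
  Project.new("alpha", Date.new(2023, 3, 1), 120),
  Project.new("beta",  Date.new(2023, 2, 1), 80),
  Project.new("beta",  Date.new(2023, 1, 1), 75),
]

# ORDER BY project_name, created_at DESC
sorted = rows.sort_by { |r| [r.project_name, -r.created_at.jd] }

# DISTINCT ON (project_name): keep the first row for each name
latest = sorted.uniq(&:project_name)

latest.each { |r| puts "#{r.project_name} #{r.created_at} #{r.price}" }
```

Because the rows are sorted newest-first within each name, the first row kept per name is the most recent one, which is why the ORDER BY has to start with the DISTINCT ON column.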
Related
I have a single logs table which contains entries for users. I want to (prune) delete all but the last 100 for each user. I'd like to do this in the most efficient way (one statement using ActiveRecord if possible).
I know I can use the following:
.order(created_at: :desc) to get the records sorted
.offset(100) to get all records except the ones I want to keep
.ids to pluck the record ids
select(:user_id).distinct to get a list of all users in the table
The table has id, user_id, created_at columns (and others not pertinent to this question).
Each user should have at least the last 100 log entries remaining in the logs table.
I'm not really sure how to do this using Ruby syntax with my Log model. If it can't be done efficiently in Ruby, I'll resort to the SQL equivalent.
Any help much appreciated.
In SQL, you could do this:
DELETE FROM logs
USING (SELECT id
       FROM (SELECT id,
                    row_number()
                       OVER (PARTITION BY user_id
                             ORDER BY created_at DESC)
                       AS rownr
             FROM logs
            ) AS a
       WHERE rownr > 100
      ) AS b
WHERE logs.id = b.id;
If the table is large, this will be slow.
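As a rough illustration of which rows the row_number() filter picks, the following plain-Ruby sketch applies the same partition-and-rank idea to in-memory data. LogRow, ids_to_prune and the integer timestamps are hypothetical names and values chosen for the example; this is not an ActiveRecord solution.

```ruby
# Hypothetical stand-in for rows of the logs table.
LogRow = Struct.new(:id, :user_id, :created_at)

# Returns the ids of the rows beyond the newest `keep` per user,
# i.e. the rows for which row_number() > keep in the SQL.
def ids_to_prune(rows, keep)
  rows.group_by(&:user_id).flat_map do |_user_id, user_rows|
    user_rows.sort_by { |r| -r.created_at }.drop(keep).map(&:id)
  end
end

rows = [
  LogRow.new(1, 10, 100),
  LogRow.new(2, 10, 200),
  LogRow.new(3, 10, 300),
  LogRow.new(4, 20, 150),
]

# With keep = 2, only the oldest row of user 10 is selected for deletion.
p ids_to_prune(rows, 2)
```

Users with fewer rows than the limit contribute nothing to the result, so they keep all of their entries.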
Looking for a way to get all of the related records for a resource and have them all distinct by a column(:author_id) and then ordered by the created_at column.
There are several similar questions on here, however I do not see any that include this ordering that I need. Trying to avoid converting to an array.
Post.find(id)
.ratings
.select(
"DISTINCT on (author_id) author_id, rating_number, created_at, id")
.order(created_at: :desc)
The first part works, but when I introduce the order, it gives me an error telling me to add created_at to the DISTINCT clause. With that change it runs, but I then get records that are not distinct on just :author_id.
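The desired result is really two steps: first pick one row per author, then order those rows by created_at. In SQL that usually means wrapping the DISTINCT ON query in a subquery and ordering the outer query. The plain-Ruby sketch below only illustrates those two steps on made-up data; the Struct and sample values are assumptions, and it assumes the row wanted per author is the latest one.

```ruby
# Hypothetical stand-in for rows of the ratings table.
Rating = Struct.new(:id, :author_id, :rating_number, :created_at)

ratings = [
  Rating.new(1, 7, 4, 100),
  Rating.new(2, 7, 5, 300),
  Rating.new(3, 8, 2, 200),
  Rating.new(4, 9, 3, 400),
]

# Inner step: DISTINCT ON (author_id) ... ORDER BY author_id, created_at DESC
latest_per_author = ratings
  .sort_by { |r| [r.author_id, -r.created_at] }
  .uniq(&:author_id)

# Outer step: ORDER BY created_at DESC over the already-distinct rows
result = latest_per_author.sort_by { |r| -r.created_at }

p result.map(&:id)
```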
I am trying to query my PostgreSQL database to get the latest (by created_at) and distinct (by user_id) Activity objects, where each user has multiple activities in the database. The activity object is structured as such:
Activity(id, user_id, created_at, ...)
I first tried to get the below query to work:
Activity.order('created_at DESC').select('DISTINCT ON (activities.user_id) activities.*')
however, kept getting the below error:
ActiveRecord::StatementInvalid: PG::InvalidColumnReference: ERROR: SELECT DISTINCT ON expressions must match initial ORDER BY expressions
According to this post (PG::Error: SELECT DISTINCT, ORDER BY expressions must appear in select list), it looks like the ORDER BY clause can only be applied after the DISTINCT has been applied. This does not help me, as I want the distinct activities by user_id, but I also want each of them to be the most recently created activity. Thus, I need the activities to be sorted before the distinct ones are picked.
I have come up with a solution that works, by first grouping the activities by user id and then ordering the activities within the groups by created_at. However, this takes two queries.
I was wondering if what I want is possible in just one query?
This should work, try the following
Solution 1
Activity.select('DISTINCT ON (activities.user_id) activities.*').order('activities.user_id, activities.created_at DESC')
Solution 2
If Solution 1 does not work, it may help to create a scope for this.
In the Activity model:
scope :latest, -> {
  select("distinct on(user_id) activities.user_id, activities.*").
    order("user_id, created_at desc")
}
Now you can call this anywhere like below
Activity.latest
Hope it helps
I have an index of active job positions. Currently, they're sorted by the most recent, i.e. by created_at. However, I've recently added a renewal feature that updates a renewal_date attribute without updating created_at.
What I want to achieve is to sort the list in descending order using both renewal_date and created_at.
jobs = Job.where(active: true).reorder("renewal_date DESC NULLS LAST", "created_at DESC")
With this code, the renewed job will always be at the top regardless of how many new jobs are created. How do I sort it so it checks for the date for both attributes and sorts it according to most recent?
Your code will order first by renewal_date with nulls at the end, and then will look at the created_at if two records have the same renewal_date.
I assume that what you want to do is something like "max(renewal_date, created_at)", which will take the "last modification date", or another custom way to compare the two fields.
If so, you can find your answer here: merge and order two columns in the same model
Job.where(active: true).reorder('GREATEST(renewal_date, created_at) DESC')
Let's try standard SQL, so that it works with all types of database:
Job.where(active: true).order('CASE WHEN renewal_date IS NULL THEN created_at ELSE renewal_date END DESC')
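To see how ordering by a single "effective date" differs from ordering by renewal_date first, here is a small plain-Ruby sketch of the CASE expression applied to in-memory data. The Struct and the integer timestamps are invented for the example.

```ruby
# Hypothetical stand-in for rows of the jobs table.
Job = Struct.new(:id, :created_at, :renewal_date)

jobs = [
  Job.new(1, 100, 500), # old job, renewed at 500
  Job.new(2, 600, nil), # created after job 1's renewal, never renewed
  Job.new(3, 300, nil),
]

# CASE WHEN renewal_date IS NULL THEN created_at ELSE renewal_date END DESC
ordered = jobs.sort_by { |j| -(j.renewal_date || j.created_at) }

p ordered.map(&:id)
```

Job 2 is newer than job 1's renewal, so it sorts first, which is the behaviour the original two-column ordering could not give.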
I have the following code to join two tables, microposts and activities, on the micropost_id column and then order by created_at of the activities table, with distinct micropost ids.
Micropost.joins("INNER JOIN activities ON
(activities.micropost_id = microposts.id)").
where('activities.user_id= ?',id).order('activities.created_at DESC').
select("DISTINCT (microposts.id), *")
which should return all micropost columns. This is not working in my development environment:
PG::InvalidColumnReference: ERROR: for SELECT DISTINCT, ORDER BY expressions must appear in select list
If I add activities.created_at to the SELECT DISTINCT, I get repeated micropost ids because they have distinct activities.created_at values. I have searched a lot to get this far, but the problem always persists because of this Postgres rule, which exists to avoid random selection.
I want to select distinct micropost_id values, ordered by activities.created_at.
Please help.
To start with, we need to quickly cover what SELECT DISTINCT is actually doing. It looks like just a nice keyword to make sure you only get back distinct values, which shouldn't change anything, right? Except as you're finding out, behind the scenes, SELECT DISTINCT is actually acting more like a GROUP BY. If you want to select distinct values of something, you can only order that result set by the same values you're selecting -- otherwise, Postgres doesn't know what to do.
To explain where the ambiguity comes from, consider this simple set of data for your activities:
CREATE TABLE activities (
id INTEGER PRIMARY KEY,
created_at TIMESTAMP WITH TIME ZONE,
micropost_id INTEGER REFERENCES microposts(id)
);
INSERT INTO activities (id, created_at, micropost_id)
VALUES (1, current_timestamp, 1),
(2, current_timestamp - interval '3 hours', 1),
(3, current_timestamp - interval '2 hours', 2);
You stated in your question that you want "distinct micropost_id" "based on order of activities.created_at". It's easy to order these activities by descending created_at (1, 3, 2), but both 1 and 2 have the same micropost_id of 1. So if you want the query to return just micropost IDs, should it return 1, 2 or 2, 1?
If you can answer the above question, you need to take your logic for doing so and move it into your query. Let's say that, and I think this is pretty likely, you want this to be a list of microposts which were most recently acted on. In that case, you want to sort the microposts in descending order of their most recent activity. Postgres can do that for you, in a number of ways, but the easiest way in my mind is this:
SELECT micropost_id
FROM activities
JOIN microposts ON activities.micropost_id = microposts.id
GROUP BY micropost_id
ORDER BY MAX(activities.created_at) DESC
Note that I've dropped the SELECT DISTINCT bit in favor of using GROUP BY, since GROUP BY lets you order by an aggregate, which SELECT DISTINCT does not. The MAX(activities.created_at) bit tells Postgres to sort each group of activities with the same micropost_id by only its most recent activity.
You can translate the above to Rails like so:
Micropost.select('microposts.*')
.joins("JOIN activities ON activities.micropost_id = microposts.id")
.where('activities.user_id' => id)
.group('microposts.id')
.order('MAX(activities.created_at) DESC')
Hope this helps! You can play around with this sqlFiddle if you want to understand more about how the query works.
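For readers who prefer to see the grouping logic outside SQL, here is a plain-Ruby sketch of GROUP BY micropost_id with ORDER BY MAX(created_at) DESC. The Struct is made up, and the timestamps are arbitrary integers that keep the same relative order as the sample rows (activity 1 newest, then 3, then 2).

```ruby
# Hypothetical stand-in for rows of the activities table.
Activity = Struct.new(:id, :created_at, :micropost_id)

activities = [
  Activity.new(1, 3, 1),
  Activity.new(2, 0, 1),
  Activity.new(3, 1, 2),
]

# GROUP BY micropost_id, then ORDER BY MAX(created_at) DESC
micropost_ids = activities
  .group_by(&:micropost_id)
  .map { |micropost_id, group| [micropost_id, group.map(&:created_at).max] }
  .sort_by { |_micropost_id, latest| -latest }
  .map(&:first)

p micropost_ids
```

Each micropost appears once, and its position depends only on its most recent activity, which resolves the "1, 2 or 2, 1" ambiguity.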
Try the code below:
Micropost.select('microposts.*, activities.created_at')
.joins("INNER JOIN activities ON (activities.micropost_id = microposts.id)")
.where('activities.user_id= ?',id)
.order('activities.created_at DESC')
.distinct