Select unique record with latest created date - ruby-on-rails

| id | user_id | created_at (datetime) |
| 1 | 1 | 17 May 2016 10:31:34 |
| 2 | 1 | 17 May 2016 12:41:54 |
| 3 | 2 | 18 May 2016 01:13:57 |
| 4 | 1 | 19 May 2016 07:21:24 |
| 5 | 2 | 20 May 2016 11:23:21 |
| 6 | 1 | 21 May 2016 03:41:29 |
How can I get the latest record (by created_at) for each unique user_id? In the example above, that would be the records with id 5 and 6.
What I have tried so far
So far I am trying to use group_by to return a hash like this:
Table.all.group_by(&:user_id)
# => {1 => [record 1, record 2, record 4, record 6], etc}
How do I then select the record with the latest date from each group? Thanks.
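The group_by idea can be finished in plain Ruby by taking the newest row of each group. This is a minimal sketch on in-memory stand-ins for the rows above; the Record struct and the latest_per_user helper are illustrative names, not part of the original code, and doing this in Ruby loads every row, so the SQL approaches are likely preferable for large tables.

```ruby
# In-memory stand-ins for the rows of the table in the question (illustrative only).
Record = Struct.new(:id, :user_id, :created_at)

RECORDS = [
  Record.new(1, 1, Time.utc(2016, 5, 17, 10, 31, 34)),
  Record.new(2, 1, Time.utc(2016, 5, 17, 12, 41, 54)),
  Record.new(3, 2, Time.utc(2016, 5, 18, 1, 13, 57)),
  Record.new(4, 1, Time.utc(2016, 5, 19, 7, 21, 24)),
  Record.new(5, 2, Time.utc(2016, 5, 20, 11, 23, 21)),
  Record.new(6, 1, Time.utc(2016, 5, 21, 3, 41, 29))
]

# Group the rows by user, then keep the row with the greatest created_at per group.
def latest_per_user(records)
  records.group_by(&:user_id).map { |_user_id, rows| rows.max_by(&:created_at) }
end

p latest_per_user(RECORDS).map(&:id).sort # => [5, 6]
```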
Updated solution
Thanks to Gordon's answer, I am now using find_by_sql to run a raw SQL query in Rails.
@table = Table.find_by_sql("SELECT DISTINCT ON (user_id) *
FROM tables
ORDER BY user_id, created_at DESC")
# To include eager loading with find_by_sql, we can add this
ActiveRecord::Associations::Preloader.new.preload(@table, :user)

In Postgres, you can use DISTINCT ON:
SELECT DISTINCT ON (user_id) *
FROM tables
ORDER BY user_id, created_at DESC;
I am not sure how to express this in Ruby.
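To make the DISTINCT ON semantics concrete, here is a plain-Ruby emulation: sort so that each user's newest row comes first, then keep the first row per user. It is only a sketch on in-memory data; Row and distinct_on_user are hypothetical names. In Active Record the same query is often written as Table.select("DISTINCT ON (user_id) *").order(:user_id, created_at: :desc), though that form is not verified here.

```ruby
# In-memory stand-ins for the rows of the question's table (illustrative only).
Row = Struct.new(:id, :user_id, :created_at)

ROWS = [
  Row.new(1, 1, Time.utc(2016, 5, 17, 10, 31, 34)),
  Row.new(2, 1, Time.utc(2016, 5, 17, 12, 41, 54)),
  Row.new(3, 2, Time.utc(2016, 5, 18, 1, 13, 57)),
  Row.new(4, 1, Time.utc(2016, 5, 19, 7, 21, 24)),
  Row.new(5, 2, Time.utc(2016, 5, 20, 11, 23, 21)),
  Row.new(6, 1, Time.utc(2016, 5, 21, 3, 41, 29))
]

# ORDER BY user_id, created_at DESC, then keep the first row of each user_id,
# which is what DISTINCT ON (user_id) does with that ordering.
def distinct_on_user(rows)
  rows.sort_by { |r| [r.user_id, -r.created_at.to_f] }.uniq(&:user_id)
end

p distinct_on_user(ROWS).map { |r| [r.user_id, r.id] } # => [[1, 6], [2, 5]]
```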

Table
.select('user_id, MAX(created_at) AS created_at')
.group(:user_id)
.order('created_at DESC')
Notice that created_at is passed to the order call as a string, since it refers to the result of an aggregate function rather than to a column value.

1) Extract the unique user ids from the table
Table.distinct.pluck(:user_id)
2) Find all records of each user.
Table.distinct.pluck(:user_id).each {|_user_id| Table.where(user_id: _user_id)}
3) Select the latest created
Table.distinct.pluck(:user_id).each {|_user_id| Table.where(user_id: _user_id).order(:created_at).last.created_at}
4) Return the result in the form of: [[id, user_id], [id, user_id] ... ]
Table.distinct.pluck(:user_id).map{|_user_id| [Table.where(user_id: _user_id).order(:created_at).last.id, _user_id]}
This should return [[6, 1], [5, 2]]

Related

Rails 5 complex query with multiple sub queries

I'm on Rails 5 using Postgres and I have a users table and a reports table. Users have many reports, and these reports need to be created every day. I want to fetch all of the users that are not archived and have not completed a report today, and show yesterday's report notes if available.
Here are the models:
Users Model
# == Schema Information
#
# Table name: users
#
# id :bigint(8) not null, primary key
# name :string not null
# created_at :datetime not null
# updated_at :datetime not null
# archived :boolean default(FALSE)
#
class User < ApplicationRecord
has_many :reports
end
Reports Model
# == Schema Information
#
# Table name: reports
#
# id :bigint(8) not null, primary key
# notes :text
# created_at :datetime not null
# updated_at :datetime not null
# user_id :bigint(8)
#
class Report < ApplicationRecord
belongs_to :user
end
Here is an example of what I want from this query:
Users Table
------------------------------------------------------------------------------------
| id | name | archived | created_at | updated_at |
------------------------------------------------------------------------------------
| 1 | Jonn | false | 2018-05-11 00:01:36.124999 | 2018-05-11 00:01:36.124999 |
------------------------------------------------------------------------------------
| 2 | Sam | false | 2018-05-11 00:01:36.124999 | 2018-05-11 00:01:36.124999 |
------------------------------------------------------------------------------------
| 3 | Ashley | true | 2018-05-11 00:01:36.124999 | 2018-05-11 00:01:36.124999 |
------------------------------------------------------------------------------------
Reports Table (imagine this report was created yesterday)
--------------------------------------------------------------------------------------
| id | user_id | notes | created_at | updated_at |
--------------------------------------------------------------------------------------
| 1 | 1 | Nothing | 2018-06-13 16:32:05.139284 | 2018-06-13 16:32:05.139284 |
--------------------------------------------------------------------------------------
Desired output:
-------------------------------------------------------------------------------------------------------
| id | name | archived | created_at | updated_at | yesterdays_notes |
-------------------------------------------------------------------------------------------------------
| 1 | Jonn | false | 2018-05-11 00:01:36.124999 | 2018-05-11 00:01:36.124999 | Nothing |
-------------------------------------------------------------------------------------------------------
| 2 | Sam | false | 2018-05-11 00:01:36.124999 | 2018-05-11 00:01:36.124999 | NULL |
-------------------------------------------------------------------------------------------------------
I was able to get the desired query results writing raw SQL, but I have run into a lot of issues trying to convert it to an active record query. Would this be an appropriate scenario to use the scenic gem?
Here is the raw SQL query:
SELECT u.*, (
SELECT notes AS yesterdays_notes
FROM reports AS r
WHERE r.created_at >= '2018-06-13 04:00:00'
AND r.created_at <= '2018-06-14 03:59:59.999999'
AND r.user_id = u.id
)
FROM users AS u
WHERE u.archived = FALSE
AND u.id NOT IN (
SELECT rr.user_id
FROM reports rr
WHERE rr.created_at >= '2018-06-14 04:00:00'
);
Here is how I would do it:
First, select all active users with most recent reports created yesterday and assign it to a var:
users = User.where(archived: false).joins(:reports)
.where.not('DATE(reports.created_at) IN (?)', [Date.today])
.where('DATE(reports.created_at) IN (?)', [Date.yesterday])
.select('users.id', 'users.name', 'notes')
Now the users var will have the attrs listed in .select available, so you can call users.map(&:notes) to see the list of notes, including nil / null notes.
Another trick that may come in handy is the ability to alias the attrs you listed in .select. For example, if you want to store users.id as id, you can do so with
...
.select('users.id as id', 'users.name', 'reports.notes')
You can call users.map(&:attributes) to see what these final structs would look like.
More info on available Active Record querying can be found here
users = User.joins(:reports).where("(reports.created_at < ? OR reports.created_at BETWEEN ? AND ?) AND (users.archived = ?)", Date.today.beginning_of_day, Date.yesterday.beginning_of_day, Date.yesterday.end_of_day, false).select("users.id, users.name, reports.notes").uniq
users will be returned as #<ActiveRecord::Relation [....]>
The join may return duplicate records, so use uniq.
The reports are filtered with
reports.created_at < Date.today.beginning_of_day OR reports.created_at BETWEEN Date.yesterday.beginning_of_day AND Date.yesterday.end_of_day
which is meant to cover the requirement "not completed a report today, and show yesterday's report notes if available",
together with users.archived = false.
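As a plain-Ruby sanity check of the selection rule the question describes, the sketch below works on in-memory copies of the sample rows. It assumes "today" is 2018-06-14 and compares calendar dates in UTC, ignoring the 04:00 offset used in the raw SQL. The struct names and the missing_reports helper are illustrative only.

```ruby
require "date"

# In-memory stand-ins for the users and reports sample rows (illustrative only).
User   = Struct.new(:id, :name, :archived)
Report = Struct.new(:id, :user_id, :notes, :created_at)

USERS = [
  User.new(1, "Jonn", false),
  User.new(2, "Sam", false),
  User.new(3, "Ashley", true)
]

REPORTS = [
  Report.new(1, 1, "Nothing", Time.utc(2018, 6, 13, 16, 32, 5))
]

# Returns [name, yesterday's notes or nil] for every non-archived user
# that has no report dated `today`.
def missing_reports(users, reports, today)
  yesterday = today - 1
  rows = users.reject(&:archived).map do |user|
    own = reports.select { |r| r.user_id == user.id }
    next if own.any? { |r| r.created_at.to_date == today }

    previous = own.find { |r| r.created_at.to_date == yesterday }
    [user.name, previous&.notes]
  end
  rows.compact
end

p missing_reports(USERS, REPORTS, Date.new(2018, 6, 14))
# => [["Jonn", "Nothing"], ["Sam", nil]]
```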
I was able to get the desired results from the 2 answers posted here. However, after doing some more research, I think a database view built with the scenic gem is the appropriate approach here, so I am going to move forward with that.
Thank you for the input! If you want to see some of the reasoning behind my decision this stackoverflow post summarizes it nicely: https://stackoverflow.com/a/4378166/2909095
Here is what I ended up with using the scenic gem. I changed the actual query a little bit to fit my needs better, but it resolves this question:
Model
# == Schema Information
#
# Table name: missing_reports
#
# id :bigint(8)
# name :string
# created_at :datetime
# updated_at :datetime
# previous_notes :text
# previous_notes_date :datetime
class MissingReport < ApplicationRecord
end
Database View
SELECT u.*, rs.notes AS previous_notes, rs.created_at AS previous_notes_date
FROM users AS u
LEFT JOIN (
SELECT r.*
FROM reports AS r
WHERE r.created_at < TIMESTAMP 'today'
ORDER BY created_at DESC
LIMIT 1
) rs ON rs.user_id = u.id
WHERE u.archived = FALSE
AND u.id NOT IN (
SELECT rr.user_id
FROM reports rr
WHERE rr.created_at >= TIMESTAMP 'today'
);
Usage
def index
@missing_reports = MissingReport.all
end

What is the difference between count and select('DISTINCT COUNT(xxx)') in ActiveRecord?

I have two queries that are similar:
StoreQuery.group(:location).count(:name)
vs
StoreQuery.group(:location).select('DISTINCT COUNT(name)')
I was expecting the results to be exactly the same but they're not. What is the difference between the two?
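The difference is in what gets deduplicated. The first query counts every non-NULL name in each location, duplicates included. In the second query, DISTINCT applies to the rows of the result (the per-location counts), not to the names being counted, so it only collapses locations that happen to have the same count. It also returns just the counts, without the location they belong to. To count unique names per location you need COUNT(DISTINCT name) instead.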
With this sample data
id | name | location |
---+------+----------+
1 | NULL | US
2 | A | UK
3 | A | UK
4 | B | AUS
Let's check the generated queries and their results.
1st query
StoreQuery.group(:location).count(:name)
Generated query:
SELECT location, COUNT(name) AS count FROM store_queries GROUP BY location
Result:
{US => 0, UK => 2, AUS => 1}
2nd query
StoreQuery.group(:location).select('DISTINCT COUNT(name)')
Generated query:
SELECT DISTINCT COUNT(name) FROM store_queries GROUP BY location
Result:
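ActiveRecord::Relation [StoreQuery count: 0, StoreQuery count: 2, StoreQuery count: 1]
# DISTINCT only removes duplicate rows from the result; 0, 2 and 1 are all different, so every count is kept
So the differences will be:
                 | 1st query | 2nd query                 |
-----------------+-----------+---------------------------+
# returned fields| 2         | 1                         |
distinct         | no        | yes (on result rows only) |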
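Btw, to count unique names per location, older Rails versions support this (newer versions use StoreQuery.group(:location).distinct.count(:name) instead):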
StoreQuery.group(:location).count(:name, distinct: true)
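The three variants discussed in this thread can be compared in plain Ruby on the sample data. This is only an in-memory sketch of what the SQL computes; StoreRow and the helper names are illustrative and not part of Active Record.

```ruby
# In-memory stand-ins for the sample rows (illustrative only).
StoreRow = Struct.new(:id, :name, :location)

STORE_ROWS = [
  StoreRow.new(1, nil, "US"),
  StoreRow.new(2, "A", "UK"),
  StoreRow.new(3, "A", "UK"),
  StoreRow.new(4, "B", "AUS")
]

# COUNT(name) ... GROUP BY location: non-NULL names per location, duplicates included.
def count_names(rows)
  rows.group_by(&:location).transform_values { |group| group.map(&:name).compact.size }
end

# COUNT(DISTINCT name) ... GROUP BY location: each non-NULL name counted once.
def count_distinct_names(rows)
  rows.group_by(&:location).transform_values { |group| group.map(&:name).compact.uniq.size }
end

# SELECT DISTINCT COUNT(name) ... GROUP BY location: the per-location counts
# with duplicate result rows removed (the locations themselves are not returned).
def distinct_counts(rows)
  count_names(rows).values.uniq
end

p count_names(STORE_ROWS)          # => {"US"=>0, "UK"=>2, "AUS"=>1}
p count_distinct_names(STORE_ROWS) # => {"US"=>0, "UK"=>1, "AUS"=>1}
p distinct_counts(STORE_ROWS)      # => [0, 2, 1]
```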

Display latest messages from messages table, group by user

I'm trying to create an inbox for messaging between users.
Here are the following tables:
Messages
Id | Message_from | message_to | message
1 | 2 | 1 | Hi
2 | 2 | 1 | How are you
3 | 1 | 3 | Hola
4 | 4 | 1 | Whats up
5 | 1 | 4 | Just Chilling
6 | 5 | 1 | Bonjour
Users
Id | Name
1 | Paul
2 | John
3 | Tim
4 | Rob
5 | Sarah
6 | Jeff
I'd like to display an inbox showing the list of users that the person has communicated with and the last message exchanged with each of them.
Paul's Inbox:
Name | user_id | last_message
Sarah| 5 | Bonjour
Rob | 4 | Just Chilling
Tim | 3 | Hola
John | 2 | How are you
How do I do this with Active Record?
This should be rather efficient:
SELECT u.name, sub.*
FROM (
SELECT DISTINCT ON (1)
m.message_from AS user_id
, m.message AS last_message
FROM users u
JOIN messages m ON m.message_to = u.id
WHERE u.name = 'Paul' -- must be unique
ORDER BY 1, m.id DESC
) sub
JOIN users u ON sub.user_id = u.id;
Compute all users with the latest message in the subquery sub using DISTINCT ON. Then join to
table users a second time to resolve the name.
Details for DISTINCT ON:
Select first row in each GROUP BY group?
Aside: Using "id" and "name" as column names is not a very helpful naming convention.
How about this:
@received_messages = current_user.messages_to.order(created_at: :desc).uniq
If you want to include messages from the user as well, you might have to do a union query, or two queries, then merge and join them. I'm just guessing with some pseudocode, here, but this should set you on your way.
received_messages = current_user.messages_to
sent_messages = current_user.messages_from
(received_messages + sent_messages).sort_by { |message| message[:created_at] }.reverse
This type of logic belongs in a model, not the controller, so perhaps you can add this to the message model.
scope :ids_of_latest_per_user, -> { group(:user_id).select('MAX(id)') }
scope :latest_per_user, -> { where(id: Message.ids_of_latest_per_user) }
Message.latest_per_user
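The inbox rule itself (one row per conversation partner, showing the most recent message in either direction) can be sketched in plain Ruby on the sample data. The sketch assumes that a higher message id means a more recent message, since the table in the question has no timestamp column; Message here is a plain Struct and inbox_for is a hypothetical helper, not an Active Record API.

```ruby
# In-memory stand-ins for the messages and users tables (illustrative only).
Message = Struct.new(:id, :message_from, :message_to, :message)

MESSAGES = [
  Message.new(1, 2, 1, "Hi"),
  Message.new(2, 2, 1, "How are you"),
  Message.new(3, 1, 3, "Hola"),
  Message.new(4, 4, 1, "Whats up"),
  Message.new(5, 1, 4, "Just Chilling"),
  Message.new(6, 5, 1, "Bonjour")
]

USER_NAMES = { 1 => "Paul", 2 => "John", 3 => "Tim", 4 => "Rob", 5 => "Sarah", 6 => "Jeff" }

# Returns [name, user_id, last_message] rows for one user's inbox, newest first.
def inbox_for(user_id, messages, names)
  messages
    .select { |m| m.message_from == user_id || m.message_to == user_id }
    .group_by { |m| m.message_from == user_id ? m.message_to : m.message_from }
    .map { |partner_id, thread| [partner_id, thread.max_by(&:id)] }
    .sort_by { |_partner_id, last| -last.id }
    .map { |partner_id, last| [names[partner_id], partner_id, last.message] }
end

inbox_for(1, MESSAGES, USER_NAMES).each { |row| p row }
# ["Sarah", 5, "Bonjour"]
# ["Rob", 4, "Just Chilling"]
# ["Tim", 3, "Hola"]
# ["John", 2, "How are you"]
```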

Select minimum value from a group in activerecord

I need to fetch the winning bids. A bid can be for a different date (don't ask why), so I need to select the bid with the minimum price for each day.
Here are my models
Leilao
has_many :bids
Bid
belongs_to :leilao
#winner_name
#price
#date
I tried a solution already and got close to what I need. The problem is that, in some cases, when I create a new bid with lower price, I don't know why the results do not change.
Leilao.find(1).bids.having('MIN("bids"."price")').group('date').all
This seems to work, but as I said, it does not work in some cases when I create a new bid. But it worked properly once. So, if you do know what might be happening, please tell me.
I then searched for some way for doing this and I got the following
Leilao.find(1).bids.minimum(:price, :group => :date)
which works properly, but with this, I just fetch the dates and prices and I need all the bid data.
I could get it by doing this, but it feels really bad to me
winner_bids = Leilao.find(1).bids.minimum(:price, :group => :date)
winners_data = []
winner_bids.each do |date, price|
winners_data << Leilao.find(1).bids.where(price: price, date: date).first
end
winners_data
Any idea of a better way to do this? Or what's wrong with my first approach?
Performance is not an issue, since this is just for academic purposes, but it just feels nasty to me.
Also, the Leilao.find(1) calls are just for explaining it here; I'm not using them all over the place.
Thanks in advance
see this
mysql> select * from bids;
+----+-------------+-------+------------+-----------+
| id | winner_name | price | date | leilao_id |
+----+-------------+-------+------------+-----------+
| 1 | A | 1.1 | 2012-06-01 | 1 |
| 2 | A | 2.2 | 2012-06-01 | 1 |
| 3 | A | 3.3 | 2012-05-31 | 1 |
| 4 | A | 4.4 | 2012-05-31 | 1 |
+----+-------------+-------+------------+-----------+
4 rows in set (0.00 sec)
mysql> select * from bids where leilao_id = 1 group by date order by price asc;
+----+-------------+-------+------------+-----------+
| id | winner_name | price | date | leilao_id |
+----+-------------+-------+------------+-----------+
| 1 | A | 1.1 | 2012-06-01 | 1 |
| 3 | A | 3.3 | 2012-05-31 | 1 |
+----+-------------+-------+------------+-----------+
2 rows in set (0.00 sec)
in rails
1.9.2-p290 :013 > Leilao.find(1).bids.order(:price).group(:date).all
[
[0] #<Bid:0x00000003553838> {
:id => 1,
:winner_name => "A",
:price => 1.1,
:date => Fri, 01 Jun 2012,
:leilao_id => 1
},
[1] #<Bid:0x00000003553518> {
:id => 3,
:winner_name => "A",
:price => 3.3,
:date => Thu, 31 May 2012,
:leilao_id => 1
}
]
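As Amol said, it's not going to work. The SQL looks like
SELECT "whatever".* FROM "whatever" GROUP BY date ORDER BY price
But the GROUP BY is applied before the ORDER BY, so the ordering only sorts the rows that were already picked for each group; it does not choose the cheapest row within a group.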
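What the question actually needs is "group first, then pick the cheapest full row inside each group". A minimal plain-Ruby sketch of that rule follows, using the sample bids shown earlier in this thread plus one hypothetical extra bid (id 5, not in the original data) to show that a newly created lower bid is picked up. Bid is a plain Struct here and winning_bids is an illustrative helper.

```ruby
require "date"

# In-memory stand-ins for the bids table (illustrative only).
Bid = Struct.new(:id, :winner_name, :price, :date)

BIDS = [
  Bid.new(1, "A", 1.1, Date.new(2012, 6, 1)),
  Bid.new(2, "A", 2.2, Date.new(2012, 6, 1)),
  Bid.new(3, "A", 3.3, Date.new(2012, 5, 31)),
  Bid.new(4, "A", 4.4, Date.new(2012, 5, 31)),
  # Hypothetical later bid (not in the original table) that undercuts bid 1.
  Bid.new(5, "B", 0.5, Date.new(2012, 6, 1))
]

# Group the bids by date, then keep the whole row with the lowest price per date.
def winning_bids(bids)
  bids.group_by(&:date).map { |_date, group| group.min_by(&:price) }
end

p winning_bids(BIDS).map(&:id).sort # => [3, 5]
```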
I had this trouble when I had a market table that stored different types of records. I solved it by writing a class method; maybe not the best approach, but it works.
def self.zobraz_trh(user)
markets = []
area = self.group(:area).all.map &:area
area.each do |market|
markets << self.where(["area = ? and user_id != ?", market,user]).order(:price).first
end
markets
end
where self is Market class

Does it make sense to convert DB-ish queries into Rails ActiveRecord Model lingo?

mysql> desc categories;
+-------+-------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+-------+-------------+------+-----+---------+----------------+
| id | int(11) | NO | PRI | NULL | auto_increment |
| name | varchar(80) | YES | | NULL | |
+-------+-------------+------+-----+---------+----------------+
mysql> desc expenses;
+-------------+---------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+-------------+---------------+------+-----+---------+----------------+
| id | int(11) | NO | PRI | NULL | auto_increment |
| created_at | datetime | NO | | NULL | |
| description | varchar(100) | NO | | NULL | |
| amount | decimal(10,2) | NO | | NULL | |
| category_id | int(11) | NO | MUL | 1 | |
+-------------+---------------+------+-----+---------+----------------+
Now I need the top N categories like this...
Expense.find_by_sql("SELECT categories.name, sum(amount) as total_amount
from expenses
join categories on category_id = categories.id
group by category_id
order by total_amount desc")
But this is nagging at my Rails conscience. It seems that it may be possible to achieve the same thing via Expense.find and supplying options like :group and :joins.
Can someone translate this query into ActiveRecord model speak?
Is it worth it? Personally I find the SQL more readable, and it gets my job done faster, maybe because I'm still learning Rails. Are there any advantages to not embedding SQL in source code (apart from being able to change DB vendors, SQL flavor, etc.)?
It seems like find_by_sql doesn't have the bind variable provision that find has. What is the workaround, e.g. if I want to limit the number of records to a user-specified limit?
Expense.find(:all,
:select => "categories.name name, sum(amount) total_amount",
:joins => "categories on category_id = categories.id",
:group => "category_id",
:order => "total_amount desc")
Hope that helps!
Seems like find_by_sql doesn't have the bind variable provision like find.
It sure does. (from the Rails docs)
# You can use the same string replacement techniques as you can with ActiveRecord#find
Post.find_by_sql ["SELECT title FROM posts WHERE author = ? AND created > ?", author_id, start_date]
Well this is the code that finally worked for me.. (Francois.. the resulting sql stmt was missing the join keyword)
def Expense.get_top_n_categories options={}
#sQuery = "SELECT categories.name, sum(amount) as total_amount
# from expenses
# join categories on category_id = categories.id
# group by category_id
# order by total_amount desc";
#sQuery += " limit #{options[:limit].to_i}" if !options[:limit].nil?
#Expense.find_by_sql(sQuery)
query_options = {:select => "categories.name name, sum(amount) total_amount",
:joins => "inner join categories on category_id = categories.id",
:group => "category_id",
:order => "total_amount desc"}
query_options[:limit] = options[:limit].to_i if !options[:limit].nil?
Expense.find(:all, query_options)
end
find_by_sql does have Rails bind variables... I don't know how I overlooked that.
Finally, is the above use of the user-specified limit a potential entry point for SQL injection, or does the to_i method call prevent that?
Thanks for all the help. I'm grateful.
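On the to_i question: String#to_i parses only the leading digits and returns 0 when there are none, so the interpolated value is always an integer and cannot carry SQL text. A small sketch follows; limit_clause and its max cap are hypothetical names and values added for illustration, and passing the value through the :limit option or a bind variable is likely still the cleaner route.

```ruby
# Hypothetical helper: turns a user-supplied limit into a safe SQL fragment.
def limit_clause(raw_limit, max: 100)
  limit = raw_limit.to_s.to_i # "10; DROP TABLE expenses" becomes 10
  return "" unless limit.positive?

  " LIMIT #{[limit, max].min}"
end

p limit_clause("10; DROP TABLE expenses") # => " LIMIT 10"
p limit_clause("abc")                     # => ""
p limit_clause(5000)                      # => " LIMIT 100"
```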
