How do I write a Rails finder method that will return the greatest date grouped by record? - ruby-on-rails

I'm using Rails 5 with PostgreSQL 9.5. I have a table that tracks prices ...
Table "public.crypto_prices"
Column | Type | Modifiers
--------------------+-----------------------------+------------------------------------------------------------
id | integer | not null default nextval('crypto_prices_id_seq'::regclass)
crypto_currency_id | integer |
market_cap_usd | bigint |
total_supply | bigint |
last_updated | timestamp without time zone |
created_at | timestamp without time zone | not null
updated_at | timestamp without time zone | not null
I would like to get the latest price per currency (where last_updated is greatest) for a select set of currencies. I can find all the prices related to certain currencies like so:
current_prices = CryptoPrice.where(crypto_currency_id: CryptoIndexCurrency.all.pluck(:crypto_currency_id).uniq)
Then I can sort them by currency into arrays, looping through each until I find the one with the greatest last_updated value, but how can I write a finder that will return exactly one row per currency with the greatest last_updated date?
Edit: Tried Owl Max's suggestion like so
ids = CryptoIndexCurrency.all.pluck(:crypto_currency_id).uniq
crypto_price_ids = CryptoPrice.where(crypto_currency_id: ids).group(:crypto_currency_id).maximum(:last_updated).keys
puts "price ids: #{crypto_price_ids.length}"
@crypto_prices = CryptoPrice.where(crypto_currency_id: crypto_price_ids)
puts "ids: #{@crypto_prices.size}"
Although the first "puts" only reveals a size of "12", the second "puts" reveals over 38,000 results. It should only be returning 12 results, one for each currency.

We can write a finder that returns exactly one row per currency with the greatest last_updated date like this:
current_prices = CryptoPrice.where(crypto_currency_id: CryptoIndexCurrency.all.pluck(:crypto_currency_id).uniq).select("*, id as crypto_price_id, MAX(last_updated) as last_updated").group(:crypto_currency_id)
I hope this takes you closer to your goal. Thank you.

This only works with Rails 5, because it relies on the or query method:
specific_ids = CryptoIndexCurrency.distinct.pluck(:crypto_currency_id)
hash = CryptoPrice.where(crypto_currency_id: specific_ids)
.group(:crypto_currency_id)
.maximum(:last_updated)
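# Build one relation that ORs together a (currency, max date) condition per pair.
# res is declared outside the block so the combined relation survives the loop.
res = nil
hash.each do |k, v|
  condition = CryptoPrice.where(crypto_currency_id: k, last_updated: v)
  # or returns a new relation, so the result has to be reassigned
  res = res ? res.or(condition) : condition
end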
Explanation:
You can use group to group all your CryptoPrice objects by the crypto_currency_id present in your table.
Then use maximum (thanks to @artgb) to take the greatest last_updated value in each group. This outputs a Hash whose keys are crypto_currency_id values and whose values are the matching last_updated timestamps.
Finally, you can use keys to get only an Array of crypto_currency_id values.
CryptoPrice.group(:crypto_currency_id).maximum(:last_updated)
=> {2285=>2017-06-06 09:06:35 UTC,
2284=>2017-05-18 15:51:05 UTC,
2267=>2016-03-22 08:02:53 UTC}
The problem with this solution is that you get the maximum date for each currency without getting the whole records.
To get the records, you can loop over the hash pairwise, using each crypto_currency_id and last_updated pair. It's hacky, but it's the only solution I found.
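As a rough illustration of why the keys of that Hash are currency ids rather than price ids, here is a minimal plain-Ruby sketch that mimics the grouping in memory. The Price struct and the sample rows are hypothetical stand-ins for CryptoPrice records, not the real model.

```ruby
# Hypothetical in-memory stand-in for CryptoPrice rows (not the real model).
Price = Struct.new(:id, :crypto_currency_id, :last_updated)

rows = [
  Price.new(1, 2285, Time.utc(2017, 6, 1)),
  Price.new(2, 2285, Time.utc(2017, 6, 6)),
  Price.new(3, 2284, Time.utc(2017, 5, 18))
]

# Rough equivalent of group(:crypto_currency_id).maximum(:last_updated):
# a Hash of currency id => greatest last_updated.
max_by_currency = rows
  .group_by(&:crypto_currency_id)
  .transform_values { |group| group.map(&:last_updated).max }

# The keys are currency ids, so filtering by them alone matches every price
# row of those currencies instead of one row per currency.
currency_ids = max_by_currency.keys

# Pairing each currency id with its maximum date is what narrows the result.
latest_rows = rows.select do |row|
  max_by_currency[row.crypto_currency_id] == row.last_updated
end
```

If two rows of the same currency share the same maximum last_updated, this pairing returns both of them.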

Using this code you can fetch the most recent updated_at value from a particular table:
CryptoPrice.order(:updated_at).pluck(:updated_at).last
This should help you.

This is currently not easy to do in Rails in one statement/query. If you don't mind using multiple statements/queries, then this is your solution:
cc_ids = CryptoIndexCurrency.distinct.pluck(:crypto_currency_id)
result = cc_ids.map do |cc_id|
max_last_updated = CryptoPrice.where(crypto_currency_id: cc_id).maximum(:last_updated)
CryptoPrice.find_by(crypto_currency_id: cc_id, last_updated: max_last_updated)
end
The result of the map method is what you are looking for. This produces 2 queries for every crypto_currency_id and 1 query to request the crypto_currency_ids.
If you want to do this with one query you'll need to use OVER (PARTITION BY ...). More info on this in the following links:
Fetch the row which has the Max value for a column
https://learn.microsoft.com/en-us/sql/t-sql/queries/select-over-clause-transact-sql
https://blog.codeship.com/folding-postgres-window-functions-into-rails/
But in this scenario you'll have to write some SQL.
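A minimal sketch of what that SQL could look like, built as a string in Ruby. The helper name latest_price_sql and the row_num alias are made up for illustration; the table and column names are taken from the question.

```ruby
# Hypothetical helper (the name is an assumption): builds one statement that
# ranks each currency's prices by last_updated and keeps only the newest row.
def latest_price_sql(currency_ids)
  raise ArgumentError, "currency_ids must not be empty" if currency_ids.empty?

  # Integer() rejects anything that is not a whole number, so only integers
  # are ever interpolated into the SQL text.
  ids = currency_ids.map { |id| Integer(id) }.join(", ")

  <<~SQL
    SELECT *
    FROM (
      SELECT crypto_prices.*,
             ROW_NUMBER() OVER (PARTITION BY crypto_currency_id
                                ORDER BY last_updated DESC) AS row_num
      FROM crypto_prices
      WHERE crypto_currency_id IN (#{ids})
    ) ranked
    WHERE row_num = 1
  SQL
end
```

In a Rails app the string could presumably be run with CryptoPrice.find_by_sql(latest_price_sql(cc_ids)). Note that ROW_NUMBER picks a single row per currency even when two rows share the same greatest last_updated.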
EDIT 1:
If you want a nice Hash result run:
cc_ids.zip(result).to_h
EDIT 2:
If you want to halve the number of queries, you can move the max_last_updated query into the find_by as a sub-query like so:
cc_ids = CryptoIndexCurrency.distinct.pluck(:crypto_currency_id)
result = cc_ids.map do |cc_id|
CryptoPrice.find_by(<<~SQL.squish)
crypto_currency_id = #{cc_id} AND last_updated = (
SELECT MAX(last_updated)
FROM crypto_prices
WHERE crypto_currency_id = #{cc_id})
SQL
end
This produces 1 query for every crypto_currency_id and 1 query to request the crypto_currency_ids.

Related

ruby multiple where operator with jsonb column and non jsonb columns

I am trying to make a query that includes a jsonb column and 2 non jsonb columns.
Multiple attempts to combine them have failed but 1 nearly worked when I only used 1 other non jsonb column. I have a channel model with an 'options' store and several attributes within.
If I separate the queries they work just fine, but combined they return an empty array. I have made sure that there is definitely data for the queries to return if they work.
Non jsonb columns - works
Channel.where("platform_id = ? AND updated_at < ?",2,7.days.ago)
jsonb column - works
Channel.where("options @> ?", {valid_account: true}.to_json)
combined where operator - returns empty []
Channel.where("platform_id = ? AND updated_at < ?", 2, 7.days.ago).where("options @> ?", {valid_account: true}.to_json)
1 where operator with combined query - again, returns empty []
Channel.where("options = ? AND platform_id = ? AND updated_at < ?", {"valid_account" => true}.to_json, 2, 7.days.ago)
Am at a loss now and not sure how to get this all into one query... or if it's even possible.
Again... there are definitely channels that should return with the given queries above
TIA
UPDATE
Managed to get the query to work. I tried nearly every combination but missed one critical one.
Channel.where("options @> ? AND platform_id = ? AND updated_at > ?", {valid_account: true}.to_json, 2, 7.days.ago)
This worked. For some reason I had used the '@>' operator in a separate method but had not combined it with the other conditions here. All working now. Thanks for the support.
The answer was to include '@>' in the query for the jsonb column:
Channel.where("options @> ? AND platform_id = ? AND updated_at > ?", {valid_account: true}.to_json, 2, 7.days.ago)
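To see why the attempt with options = ? returns nothing while the containment operator matches, here is a small plain-Ruby sketch that compares equality with top-level containment. The stored hash is a made-up example of what a channel's options column might hold, and Hash#>= only approximates jsonb containment for flat, top-level keys.

```ruby
require "json"

# The value bound to the placeholder in the queries above.
filter = { valid_account: true }.to_json

# Hypothetical contents of one channel's options column (assumed, not real data).
stored = { "valid_account" => true, "name" => "demo" }

wanted = JSON.parse(filter)

# Equality requires the whole document to match, so extra keys break it.
equal = stored == wanted

# Containment only requires the wanted pairs to be present (superset check).
contains = stored >= wanted
```

With these values equal is false and contains is true, which matches the behaviour described in the question: the equality query finds no rows as soon as options holds any other attribute.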

How can I select distinct dates from a database, when it's formatted as DateTime

I have a database setup similarly to this
The output of
sqlite> PRAGMA table_info(mytable);
is
0|id|INTEGER
1|mydatetime|text
And it looks like this
|__id__|_____mydatetime_____|
| 0 | 2016-10-11 12:10:22|
| 1 | 2016-10-11 12:11:22|
| 2 | 2016-10-12 10:45:45|
| 3 | 2016-10-12 11:12:12|
In Ruby on Rails, I'd like to select all of the rows with the same date (ignoring time). And I'm looping through them, to do something for every date. For example:
If I had the same database, and instead of DateTime it was formatted as just the Date I would do something similar to below:
distinctDate = MyTable.select(:mydatetime).distinct.to_a
distinctDate.each do |x|
puts x
end
But how can I write the select with distinct and have it also ignore the time?
MyTable.pluck("distinct date(mydatetime)")
That will give you back an array of distinct dates, then you can process them however you need to in the application.
I'd use sqlite's date() function to get only the date part of DateTime field.
In your example:
distinctDate = MyTable.select('date(mydatetime)').distinct.to_a
And if you need only array of column values instead of rows, you can use ActiveRecord's pluck() method:
distinctDate = MyTable.distinct.pluck('date(mydatetime)')
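If the goal is to loop over the rows of each date rather than only list the dates, the grouping can also be sketched in plain Ruby once the rows are loaded. The rows array below is a hypothetical copy of the sample table, with each entry holding id and mydatetime.

```ruby
require "date"

# Hypothetical copy of the sample table: [id, mydatetime] pairs.
rows = [
  [0, "2016-10-11 12:10:22"],
  [1, "2016-10-11 12:11:22"],
  [2, "2016-10-12 10:45:45"],
  [3, "2016-10-12 11:12:12"]
]

# Date.parse drops the time part, so rows from the same day share a key.
rows_by_date = rows.group_by { |_id, mydatetime| Date.parse(mydatetime) }

# One entry per distinct date, with the ids that fall on that date.
summary = rows_by_date.map { |date, group| [date.to_s, group.map(&:first)] }
```

This does the grouping in the application instead of the database, so it is only reasonable for small result sets.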

Is it possible to convert the complex SQL into rails?

SPEC
I need to pick all rooms that don't have a single day with saleable = FALSE in the requested time period(07-09 ~ 07-19):
I have a table room with 1 row per room.
I have a table room_skus with one row per room and day (complete set for the relevant time range).
The column saleable is boolean NOT NULL and date is defined date NOT NULL
SELECT id
FROM room r
WHERE NOT EXISTS (
SELECT 1
FROM room_skus
WHERE date BETWEEN '2016-07-09' AND '2016-07-19'
AND room_id = r.id
AND NOT saleable
GROUP BY 1
);
The above SQL query is working, but I wonder how could I translate it into Rails ORM.
Let's say you have an array of room ids called room_ids:
needed_room_ids = room_ids - RoomSku.where(room_id: room_ids, date: '2016-07-09'..'2016-07-19', saleable: false).pluck(:room_id)
This assumes your model for room_skus is called RoomSku.
Updated version:
room_ids = Room.all.select { |record| record.room_skus.present? }.map(&:id)
And then:
needed_room_ids = room_ids - RoomSku.where(room_id: room_ids, date: '2016-07-09'..'2016-07-19', saleable: false).pluck(:room_id)
It won't be one query, but you avoid plain SQL like this.
I don't have any project here to test something like it, but it should work:
Room.where.not(id: RoomSku.where(date: DateTime.parse('2016-07-09').strftime("%Y-%m-%d")..DateTime.parse('2016-07-19').strftime("%Y-%m-%d"), saleable: false).pluck(:room_id))
I hope it helps!
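The logic shared by these answers (collect the rooms that have at least one unsaleable day in the period, then subtract them) can be sketched in plain Ruby. The RoomSku struct and the sample data are hypothetical stand-ins for the real tables.

```ruby
require "date"

# Hypothetical in-memory stand-in for room_skus rows (not the real model).
RoomSku = Struct.new(:room_id, :date, :saleable)

room_ids = [1, 2, 3]
room_skus = [
  RoomSku.new(1, Date.new(2016, 7, 10), true),
  RoomSku.new(2, Date.new(2016, 7, 10), false), # unsaleable inside the period
  RoomSku.new(3, Date.new(2016, 7, 25), false)  # unsaleable, but outside the period
]

period = Date.new(2016, 7, 9)..Date.new(2016, 7, 19)

# Rooms with at least one unsaleable day in the period (the NOT EXISTS subquery).
blocked_room_ids = room_skus
  .select { |sku| period.cover?(sku.date) && !sku.saleable }
  .map(&:room_id)
  .uniq

# Everything else is available for the whole period.
available_room_ids = room_ids - blocked_room_ids
```

Room 3 stays available because its unsaleable day falls outside the requested range.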

How to make Rails/ActiveRecord return unique objects using join table's boolean column

I have a Rails 4 app using ActiveRecord and Postgresql with two tables: stores and open_hours. a store has many open_hours:
stores:
Column |
--------------------+
id |
name |
open_hours:
Column |
-----------------+
id |
open_time |
close_time |
store_id |
The open_time and close_time columns represent the number of seconds since midnight of Sunday (i.e. beginning of the week).
I would like to get list of store objects ordered by whether the store is open or not, so stores that are open will be ranked ahead of the stores that are closed. This is my query in Rails:
Store.joins(:open_hours).order("#{current_time} > open_time AND #{current_time} < close_time desc")
Note that current_time is the number of seconds since midnight on the previous Sunday.
This gives me a list of stores with the currently open stores ranked ahead of the closed ones. However, I'm getting a lot of duplicates in the result.
I tried using the distinct, uniq and group methods, but none of them work:
Store.joins(:open_hours).group("stores.id").group("open_hours.open_time").group("open_hours.close_time").order("#{current_time} > open_time AND #{current_time} < close_time desc")
I've read a lot of the questions/answers already on Stackoverflow but most of them don't address the order method. This question seems to be the most relevant one but the MAX aggregate function does not work on booleans.
Would appreciate any help! Thanks.
Here is what I did to solve the issue:
In Rails:
is_open = "bool_or(#{current_time} > open_time AND #{current_time} < close_time)"
Store.select("stores.*, CASE WHEN #{is_open} THEN 1 WHEN #{is_open} IS NULL THEN 2 ELSE 3 END AS open").group("stores.id").joins("LEFT JOIN open_hours ON open_hours.store_id = stores.id").uniq.order("open asc")
Explanation:
The is_open variable is just there to shorten the select statement.
The bool_or aggregate function is needed here to group the open_hours records. Otherwise there will likely be two results for each store (one open and one closed), which is why using the uniq method alone doesn't eliminate the duplicates.
LEFT JOIN is used instead of INNER JOIN so we can include the stores that don't have any open_hours objects
The store can be open (i.e. true), closed (i.e. false) or not determined (i.e. nil), so the CASE WHEN statement is needed here: if a store is open, then it's 1, 2 if not determined and 3 if closed
Ordering the results ASC will show open stores first, then the not determined ones, then the closed stores.
This solution works but doesn't feel very elegant. Please post your answer if you have a better solution. Thanks a lot!
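For readers who want to check the ranking logic outside the database, here is a plain-Ruby sketch of the same three-way ordering. The Store struct, the open_rank helper, and the sample hours are hypothetical; they only mirror what bool_or and the CASE expression compute.

```ruby
# Hypothetical in-memory stand-in: each store carries [open_time, close_time] pairs.
Store = Struct.new(:id, :open_hours)

current_time = 100_000 # seconds since midnight on Sunday (assumed sample value)

stores = [
  Store.new(1, [[0, 50_000]]),                     # has hours, none match: closed
  Store.new(2, []),                                # no hours at all: not determined
  Store.new(3, [[0, 50_000], [90_000, 120_000]])   # one matching interval: open
]

# Mirrors the CASE expression: 1 = open, 2 = not determined, 3 = closed.
def open_rank(store, current_time)
  return 2 if store.open_hours.empty?

  # any? plays the role of bool_or across the store's open_hours rows.
  is_open = store.open_hours.any? do |open_time, close_time|
    current_time > open_time && current_time < close_time
  end
  is_open ? 1 : 3
end

# One entry per store, open stores first; id breaks ties deterministically.
ordered = stores.sort_by { |store| [open_rank(store, current_time), store.id] }
```

Because each store is ranked once from all of its intervals, there is exactly one result per store, which is the property the plain join lacked.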
Have you tried the uniq method? Just append it at the end:
Store.joins(:open_hours).order("#{current_time} > open_time AND #{current_time} < close_time desc").uniq

Order by nil value in column

I have a table with a position column, which can be nil for some collections of records. My default order option is
order('position ASC')
id| name | position
1 5 null
2 6 null
3 7 null
If all values in the position column are null for the collection I sort (as in the example above), in which order will I get this collection from the db?
I'm assuming I will get the collection ordered by id (1, 2, 3). Am I correct?
Addition #1: DB - Postgresql
According to the Postgres manual, if there is no sorting clause, rows are returned in an unspecified order, which in practice often follows their physical position on disk. It says nothing about the relative order of rows that have equal values in the sort columns. In practice such rows may come back in the order in which the table or a b-tree index stores them, but you should expect that order to change whenever the database is reorganized.
In the end, there is no guarantee about the order of records with the same values in the sort columns.
Note: in Postgres you can put the NULL values first or last (this is detailed at the referenced link).
On this related question, I agree with @macek.
You can do something like this.
Cats:
id| name | position
1 5 null
2 6 null
3 7 not_null
# nil is a reserved word in Ruby, so the relations get their own names.
nils = Cat.order("id ASC").where(position: nil)                 # ids [1, 2]
not_nils = Cat.order("id ASC").where("position is not null")    # ids [3]
not_nils + nils                                                 # ids [3, 1, 2]
This preserves order.
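The same deterministic ordering (non-null positions first, then nulls, with id as a tie-breaker) can be sketched in plain Ruby. The Cat struct and sample rows are hypothetical; in Postgres the corresponding clause would be ORDER BY position ASC NULLS LAST, id ASC.

```ruby
# Hypothetical in-memory stand-in for the cats table (not the real model).
Cat = Struct.new(:id, :position)

# Deliberately shuffled so the result does not depend on input order.
cats = [Cat.new(2, nil), Cat.new(3, 1), Cat.new(1, nil)]

# Sort key: nulls last, then position, then id as the tie-breaker.
ordered = cats.sort_by do |cat|
  [cat.position.nil? ? 1 : 0, cat.position || 0, cat.id]
end
```

Adding id as the last sort key is what makes the order of the null rows predictable instead of leaving it to the database.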

Resources