Background:
I am writing a Ruby on Rails app that connects to an MS SQL database (don't ask why, it's something I can't get around). The database is quite large and can have up to 2 to 3 million rows.
Only one column matters to me at the moment: 'TimeUTC'. The table is called ApplicationLog and the query will live in the ApplicationLogController.
Problem:
I want to write a query that takes two dates, groups all the records between them by day using the 'TimeUTC' column, and gives me the total number of records for each of those days.
I have the SQL Query:
DECLARE @StartDate DateTime = '2014-01-04 00:00:00'
DECLARE @EndDate DateTime = '2014-02-04 23:59:59'
select (dateadd(DAY,0, datediff(day,0, TimeUtc))) as [DATE], count(*)
from applicationlog (nolock)
where TimeUtc between @StartDate and @EndDate
group by dateadd(DAY,0, datediff(day,0, TimeUtc))
order by [DATE] desc
I tried starting with something like this:
@results = ApplicationLog.select((dateadd(DAY,0, datediff(day,0, TimeUtc)) as [date], count(*)).
group(dateadd(DAY,0, datediff(day,0, TimeUtc))).order(date desc)
I am a newbie at this, so I could be so far off track it's not funny, but any help would be great. Am I even going about this the right way, or is there a better way?
Try the following code, which uses Arel with some embedded SQL.
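class ApplicationLog < ActiveRecord::Base
  # Counts log rows per calendar day for TimeUtc values inside the given range.
  def self.between(range)
    # Truncate TimeUtc to midnight so that rows group by day.
    day = 'dateadd(DAY,0,datediff(day,0,TimeUtc))'
    columns = Arel.sql("#{day} as date, COUNT(*)")
    # On newer Arel versions, between(range) replaces passing a range to in.
    conditions = arel_table['TimeUTC'].in(range)
    # SQL Server does not accept a column position in GROUP BY, so repeat the expression.
    query = arel_table.project(columns).where(conditions).group(Arel.sql(day)).order('date desc')
    ActiveRecord::Base.connection.execute(query.to_sql)
  end
end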
Then use ApplicationLog.between(1.week.ago..Time.now).
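To sanity-check the result of such a query against a small sample, the same grouping can be sketched in plain Ruby. This is only an illustration of what the SQL computes, not a replacement for it: counts_per_day is a hypothetical helper, and the timestamps array stands in for TimeUtc values already loaded into memory.

```ruby
require 'date' # provides Time#to_date

# Hypothetical helper: counts timestamps per calendar day inside `range`,
# newest day first, mirroring GROUP BY day ... ORDER BY day DESC.
def counts_per_day(timestamps, range)
  timestamps
    .select { |t| range.cover?(t) }       # WHERE TimeUtc BETWEEN ...
    .group_by { |t| t.to_date }           # GROUP BY the day part
    .map { |day, rows| [day, rows.size] } # COUNT(*)
    .sort_by { |day, _count| day }
    .reverse                              # ORDER BY day DESC
end
```

Each element of the result is a [Date, count] pair, which is the same shape as the rows the SQL query returns.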
Related
In an attempt to summarise traffic data based on a time span, one cannot query on a component of a datetime object like this:
txat0 = Transaction.where(['shop_id = ? AND created_at.hour = ?', shop, 0]).count
One could go the SQL route (PostgreSQL in this case)
select shop_id, extract(hour from created_at) from transactions
and filter from there.
But what is a succinct way of achieving this with Ruby or Rails? (Performance is not a concern for this query.)
I believe you could do a mix and run the SQL part inside an ActiveRecord query.
What about:
Transaction.where("DATE_PART('hour', created_at) = ?", 0)
PS: I've ignored the shop_id clause in the above example, but you can just add it afterwards.
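Since performance is not a concern here, another option is to filter in Ruby after loading the records. The sketch below uses a hypothetical Txn struct in place of the real Transaction model; with the real model one would presumably pass in something like Transaction.where(shop_id: shop).to_a (an assumption, not tested against the asker's schema).

```ruby
# Hypothetical stand-in for a Transaction record; only the fields used here.
Txn = Struct.new(:shop_id, :created_at)

# Counts the transactions of one shop whose created_at falls in the given hour (0-23).
def count_in_hour(transactions, shop_id, hour)
  transactions.count { |t| t.shop_id == shop_id && t.created_at.hour == hour }
end
```

One caveat: Rails converts created_at to the application's time zone when it loads a record, so the hour seen in Ruby may differ from the hour the database computes with DATE_PART if the two are not both in UTC.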
I have a working SQL query for Postgres v10.
SELECT *
FROM
(
SELECT DISTINCT ON (title) products.title, products.*
FROM "products"
) subquery
WHERE subquery.active = TRUE AND subquery.product_type_id = 1
ORDER BY created_at DESC
The goal of the query is to select distinct rows based on the title column, then filter and order them. (I used the subquery in the first place because there seemed to be no way to combine DISTINCT ON with ORDER BY without one.)
I am trying to express said query in ActiveRecord.
I have been doing
Product.select("*")
.from(Product.select("DISTINCT ON (product.title) product.title, meals.*"))
.where("subquery.active IS true")
.where("subquery.meal_type_id = ?", 1)
.order("created_at DESC")
and, that works! But, it's fairly messy with the string where clauses in there. Is there a better way to express this query with ActiveRecord/Arel, or am I just running into the limits of what ActiveRecord can express?
I think the resulting ActiveRecord call can be improved.
But I would start by improving the original SQL query first.
The subquery
SELECT DISTINCT ON (title) products.title, products.* FROM products
(I think products was meant instead of meals?) selects products.title twice, which is unnecessary. Worse, it lacks an ORDER BY clause. As the PostgreSQL documentation says:
Note that the “first row” of each set is unpredictable unless ORDER BY is used to ensure that the desired row appears first
I would rewrite the subquery as:
SELECT DISTINCT ON (title) * FROM products ORDER BY title ASC
which gives us a call:
Product.select('DISTINCT ON (title) *').order(title: :asc)
In the main query, the where calls use the Rails-generated alias for the subquery. I would not rely on Rails' internal convention for aliasing subqueries, as it may change at any time. If you are willing to accept that risk, you could merge these conditions into one where call using hash-style arguments.
The final result:
Product.select('*')
.from(Product.select('DISTINCT ON (title) *').order(title: :asc))
.where(subquery: { active: true, meal_type_id: 1 })
.order('created_at DESC')
I have the following query, which calculates the average number of impressions across all teams for a given name and league:
@all_team_avg = NielsenData
  .where('name = ? and league = ?', name, league)
  .average('impressions')
  .to_i
However, there can be multiple entries for each name/league/team combination. I need to modify the query to only average the most recent records by created_at.
With the help of this answer I came up with a query which gets the result that I need (I would replace the hard-coded WHERE clause with name and league in the application), but it seems excessively complicated and I have no idea how to translate it nicely into ActiveRecord:
SELECT avg(sub.impressions)
FROM (
WITH summary AS (
SELECT n.team,
n.name,
n.league,
n.impressions,
n.created_at,
ROW_NUMBER() OVER(PARTITION BY n.team
ORDER BY n.created_at DESC) AS rowcount
FROM nielsen_data n
WHERE n.name = 'Social Media - Twitter Followers'
AND n.league = 'National Football League'
)
SELECT s.*
FROM summary s
WHERE s.rowcount = 1) sub;
How can I rewrite this query using ActiveRecord or achieve the same result in a simpler way?
When all you have is a hammer, everything looks like a nail.
Sometimes, raw SQL is the best choice. You can do something like:
@all_team_avg = NielsenData.find_by_sql("...your_sql_statement_here...")
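Note that find_by_sql returns an array of model objects rather than a single number, so the average still has to be read from the first result. If the rows for one name and league are few enough to load into memory, a simpler alternative is to pick the latest row per team in Ruby. The sketch below assumes that is acceptable; Row is a hypothetical stand-in for NielsenData records already filtered by name and league.

```ruby
# Hypothetical stand-in for NielsenData rows already filtered by name and league.
Row = Struct.new(:team, :impressions, :created_at)

# Averages impressions over the most recent row of each team, truncated to an
# integer like the original .average(...).to_i call. Returns 0 for no rows.
def latest_per_team_average(rows)
  latest = rows.group_by(&:team).map { |_team, group| group.max_by(&:created_at) }
  return 0 if latest.empty?

  (latest.sum(&:impressions).to_f / latest.size).to_i
end
```

This does the same job as the ROW_NUMBER() ... PARTITION BY team query, at the cost of transferring every matching row to the application.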
I'm trying to figure out how to do a query where created_at.year == a given year, and created_at.month equals a given month.
However I can't figure out what I'm doing wrong.
Model.where("'created_at.month' = ? AND 'created_at.year' = ?", 7,2013)
results in nothing being shown.
However, when I try Model.first.created_at.month == 7 and
Model.first.created_at.year == 2013 I get true for both.
Therefore, in theory, my query should at least be returning my first record.
Anyone know what I'm doing wrong or any alternative way to find records created on specific months?
Note that in my views the month / year will be parameters but for the purposes of this example I used actual values.
using ruby 1.9.3
rails 3.2.13
You can use the SQL extract function, which pulls the month and year out of the timestamp:
Model.where('extract(year from created_at) = ? and extract(month from created_at) = ?', '2013','7')
This query should give you the desired result.
created_at is a timestamp; it is not a set of discrete fields in the database. created_at.year and such don't exist in your DB; it's simply a single timestamp field. When you call @model.created_at.year, Rails loads the created_at field from the database and creates a Time object from it, which has a #year method you can call.
What you want is to query on a range of dates:
Model.where("created_at >= ? and created_at < ?", Time.mktime(2013, 7), Time.mktime(2013, 8))
This will find any Model with a created_at timestamp in July 2013.
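Because the month and year will come in as parameters, the bounds can be computed rather than hard-coded. The following is a minimal sketch; month_range is a hypothetical helper, and it uses UTC on the assumption that timestamps are stored in UTC (the example above uses local time via Time.mktime).

```ruby
# Hypothetical helper: half-open range covering one calendar month in UTC.
# December rolls over to January of the next year, since month 13 is not valid.
def month_range(year, month)
  start = Time.utc(year, month, 1)
  finish = month == 12 ? Time.utc(year + 1, 1, 1) : Time.utc(year, month + 1, 1)
  start...finish
end
```

The bounds would then be passed to the query above, for example Model.where("created_at >= ? and created_at < ?", range.begin, range.end) with range = month_range(2013, 7).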
How do I retrieve a set of records, ordered by count, in Arel? I have a model which tracks how many views a product gets. I want to find the X most frequently viewed products over the last Y days.
This problem has cropped up while migrating to PostgreSQL from MySQL, due to MySQL being a bit forgiving in what it will accept. This code, from the View model, works with MySQL, but not PostgreSQL due to non-aggregated columns being included in the output.
scope :popular, lambda { |time_ago, freq|
where("created_on > ?", time_ago).group('product_id').
order('count(*) desc').limit(freq).includes(:product)
}
Here's what I've got so far:
View.select("id, count(id) as freq").where('created_on > ?', 5.days.ago).
order('freq').group('id').limit(5)
However, this returns the single ID of the model, not the actual model.
Update
I went with:
select("product_id, count(id) as freq").
where('created_on > ?', time_ago).
order('freq desc').
group('product_id').
limit(freq)
On reflection, it's not really logical to expect a complete model when the results are made up of GROUP BY and aggregate function results, as the returned data will (most likely) match no actual model (row).
You have to extend your select clause with all the columns you wish to retrieve, or use:
select("views.*, count(id) as freq")
SQL would be:
SELECT product_id, product, count(*) as freq
FROM views
WHERE created_on > '$5_days_ago'::timestamp
GROUP BY product_id, product
ORDER BY count(*) DESC, product
LIMIT 5;
Extrapolating from your example, it should be:
View.select("product_id, product, count(*) as freq").where('created_on > ?', 5.days.ago).
order("count(*) DESC" ).group('product_id, product').limit(5)
Disclaimer: Ruby syntax is a foreign language to me.
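The point made in the question's update, that a grouped result is a list of (product_id, count) pairs rather than model rows, can be illustrated in plain Ruby. This is a sketch only: ViewRow is a hypothetical stand-in for the View model, and popular_product_ids is an illustrative name, not part of any library.

```ruby
# Hypothetical stand-in for a View record; only the fields used here.
ViewRow = Struct.new(:product_id, :created_on)

# Returns up to `limit` [product_id, freq] pairs for views newer than `since`,
# most viewed first; ties are broken by product_id for a stable order.
def popular_product_ids(views, since, limit)
  views
    .select { |v| v.created_on > since }
    .group_by(&:product_id)
    .map { |product_id, rows| [product_id, rows.size] }
    .sort_by { |product_id, freq| [-freq, product_id] }
    .first(limit)
end
```

The actual products would then be loaded in a second step from the returned ids, which is what the original includes(:product) was trying to achieve in one query.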