How to make all queries to model inherit a calculated sql column? - ruby-on-rails

I need every query involving my Issue model to append an extra column to its rows.
In my case this column is a boolean indicating whether more than 12 business hours (08:00 to 18:00, excluding Saturdays and Sundays) have passed since the issue's created_at date.
SELECT *, hours > 12 as alert FROM
(
SELECT *,
(SELECT count(*) - 1 AS work_hours
FROM generate_series ( created_at
, now()
, interval '1h') h
WHERE EXTRACT(ISODOW FROM h) < 6
AND h::time >= '08:00'::time
AND h::time < '18:00'::time) as hours
FROM issues
) as issues_with_alert
This will return all issue columns (id, title, description, ...) plus the calculated one: alert.
I thought of creating a SQL view, issues_view, and pointing the model's table name at it:
self.table_name = "issues_view"
Then the view would take care of it, and every query through the model would inherit the alert column.
But it seems Rails is not fully prepared for that, and monkey patches would be needed:
http://mentalized.net/journal/2006/03/27/using_sql_server_views_as_rails_tables/
among other problems I found.
Another option is to implement this method on the Issue model:
def self.default_scope
end
But I don't know how to fit the SQL inside of it.
What is the right way to achieve this with Rails? I would like to avoid bad patterns and poor readability.
Thanks!
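For reference, the SQL above can be fitted into a scope. A sketch (untested against this schema; a named scope rather than default_scope, so ordinary queries stay unaffected):

```ruby
class Issue < ActiveRecord::Base
  # Sketch: expose the computed column through a named scope. The
  # subselect is the one from the question, unchanged; the flag may
  # come back as a string ("t"/"f") on older Rails/pg combinations.
  scope :with_alert, -> {
    select(<<-SQL)
      issues.*,
      (SELECT count(*) - 1
       FROM generate_series(created_at, now(), interval '1h') h
       WHERE EXTRACT(ISODOW FROM h) < 6
         AND h::time >= '08:00'::time
         AND h::time < '18:00'::time) > 12 AS alert
    SQL
  }
end
```

Usage would then be `Issue.with_alert.first.alert`; a default_scope could call the same `select`, at the cost of running the subselect on every query.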

I believe you can solve this by adding a method to your model that performs the desired calculation and returns the boolean value.
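For that per-record route, the business-hours arithmetic itself can be sketched in plain Ruby (hour-stepping over the same Mon-Fri, 08:00-18:00 window as the question's generate_series; method names are hypothetical):

```ruby
# Count elapsed business hours (Mon-Fri, 08:00-18:00) between two times
# by stepping one hour at a time, mirroring the generate_series query.
def business_hours_between(from, to)
  hours = 0
  t = from
  while t < to
    hours += 1 if (1..5).cover?(t.wday) && t.hour >= 8 && t.hour < 18
    t += 3600 # advance one hour
  end
  hours
end

# The alert flag from the question: more than 12 business hours elapsed?
def alert?(created_at, now = Time.now)
  business_hours_between(created_at, now) > 12
end
```

This trades the SQL subselect for a Ruby loop, so it is fine for displaying single records but not for filtering large sets in the database.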

Related

Rails: showing data in a crosstab

I'm building an expenditures diary and therefore need a crosstab in my view. The app has an expenditures table. Every record in that table is a single expenditure holding (amongst others) "Value" (as integer), "Day" (as date) and "Category_ID" (as integer and foreign key).
What I want to do is to display a table that shows a user how much he spent on each category in each month.
Thus I need to do two things:
Group all expenditures by month and category
Display that in a view
In other terms, I want to go from the flat list of expenditures to a month-by-category crosstab.
How can I achieve that? I have been searching for a while now but did not find a suitable solution.
Additionally, I want to display not the Category_ID but the category's name (from the joined categories table).
UPDATE:
I tried the following so far:
@expenditures = Expenditure.where(:user_id => current_user)
@grouped_exp = @expenditures.includes(:category).group("DATE_TRUNC('month', day)", :name).sum(:value)
Which gives me now:
[[[2016-08-01 00:00:00 UTC, "housing | electricity"], -31.0], [[2016-09-01 00:00:00 UTC, "other | pharmacy & drugstore"], -9.5], [[2016-08-01 00:00:00 UTC, "financials | salary"], 2913.92], [[2016-10-01 00:00:00 UTC, "housing | internet"], -26.06], ... ]
So I have now for each category at each month a sum. However, I don't know if this is the correct way and if yes, what would be the next steps.
The table you want to create cannot be built like this: it has dynamic columns corresponding to the months of a year, which cannot be expressed with ActiveRecord. You'd have to use SQL directly and programmatically generate the query, selecting a specific date range per column. Probably not a good idea unless you are an SQL hero.
What you need is something more like this:
Category.joins(:expenditures).
where("expenditures.user_id = ?", current_user).
group("categories.id, DATE_TRUNC('month', day)").
select("categories.name, DATE_TRUNC('month', day) AS month, SUM(value) AS sum")
Which will give you records like
r = records.first
r.name => 'Some category'
r.month => '2016-10'
r.sum => 55
You can try something like this:
@expenditures
.joins(:category)
.select("DATE_TRUNC('month', day) AS day, category_id, SUM(value) AS total")
.group("DATE_TRUNC('month', day), category_id")
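The remaining step, turning the `{ [month, category] => sum }` hash that `.group(...).sum(:value)` returns into rows for the view, is plain Ruby. A sketch (key shapes assumed to match the grouping order in the question):

```ruby
# Pivot { [month, category] => sum } into [months, { category => sums }],
# one row per category with a column per month, zero-filling gaps.
def crosstab(grouped_sums)
  months = grouped_sums.keys.map(&:first).uniq.sort
  rows = grouped_sums
    .group_by { |(_month, category), _sum| category }
    .map do |category, entries|
      sums = entries.map { |(month, _), sum| [month, sum] }.to_h
      [category, months.map { |m| sums.fetch(m, 0) }]
    end.to_h
  [months, rows]
end
```

The view can then render `months` as the header row and each `rows` entry as a table row.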

Add timezone range to scope

I have this scope that pulls orders for a rolling 14-day analytics graph. The problem is that it uses plain UTC, so sales show up as coming from tomorrow if a sale happens after 5 pm PDT (there is a 7-hour difference between UTC and PDT).
Current scope:
scope :recent, -> {
  complete.joins(:line_items)
          .group("DATE_TRUNC('day', processed_at)")
          .select("DATE_TRUNC('day', processed_at) as day, sum(line_items.discounted_price) as count")
          .order('day DESC').limit(14)
}
How can I make it pull only orders whose 'processed_at' falls within the right day in the PDT time zone? Not sure how to do this syntactically, but basically I want to apply 'in_time_zone' to the 'processed_at' timestamp.
When scope is called:
Order Load (2.7ms) SELECT DATE_TRUNC('day', processed_at) as day, sum(line_items.discounted_price) as count FROM "orders" INNER JOIN "line_items" ON "line_items"."order_id" = "orders"."id" WHERE "orders"."status" = 'complete' GROUP BY DATE_TRUNC('day', processed_at) ORDER BY day DESC LIMIT 14
I run into similar situations all the time. The solution depends on which specific operation you're working with, but in all cases I prioritize writing fanatically thorough unit tests of the querying behavior (using Timecop or similar) to ensure that it's doing what I want it to be doing.
If you're doing this in a WHERE clause, it's easier because you can adjust the timestamp in Ruby. That might look something like this:
tz_adjusted_start_time = (Date.today - 14.days - 5.hours)
@result = Thingy.where("processed_at >= ?", tz_adjusted_start_time)
The above produces SQL that will look for records created after May 2nd 19:00:00 UTC, or whatever.
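That adjustment is plain arithmetic. A tiny helper as a sketch (fixed offset only, so DST transitions are ignored; a real zone library such as TZInfo handles those):

```ruby
require "date"

# UTC instant of local midnight for a fixed UTC offset.
# PDT is UTC-7, so local midnight falls at 07:00 UTC.
def local_midnight_utc(date, utc_offset_hours)
  Time.utc(date.year, date.month, date.day) - utc_offset_hours * 3600
end

# e.g. the lower bound for a 14-day window in PDT:
tz_adjusted_start_time = local_midnight_utc(Date.today - 14, -7)
```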
If you're trying to GROUP by date in a different time zone, you need to do the adjustment in raw SQL. It can be a bit hairier to think your way through, but the same principle applies (and the same testing method works just as well). I've done this before; it was messy, but I've run into no trouble with it so far. The SQL might look something like this:
...
GROUP BY (processed_at - INTERVAL '5 hours')
...
I seem to remember using this simple minus sign syntax, but a quick Google search tells me it's more common to use DATE_SUB (or find another way around the timezone issue altogether), so do your homework before considering this an end solution.
Good luck!
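If the database is PostgreSQL (as the DATE_TRUNC in the question suggests), another option is to let the database convert with AT TIME ZONE, which also follows DST, unlike a fixed interval. A sketch, assuming processed_at is a timestamptz (a plain timestamp holding UTC needs AT TIME ZONE 'UTC' first):

```ruby
scope :recent, -> {
  complete.joins(:line_items)
          .group("DATE_TRUNC('day', processed_at AT TIME ZONE 'America/Los_Angeles')")
          .select("DATE_TRUNC('day', processed_at AT TIME ZONE 'America/Los_Angeles') AS day,
                   SUM(line_items.discounted_price) AS count")
          .order("day DESC").limit(14)
}
```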

Select rows from the last 12 months starting on the last date inserted on the DB

I have a system for cars and parking tickets. I had a requirement to implement where I had to get all the tickets from the last 12 months, so I opened this question.
The requirement has changed, and now I need to get the tickets from the last 12 months counting back from the last ticket's date.
I know how to do that using SQL (postgres), it would be something like this example:
select *
from parking_tickets
where car_id = 25
AND
date > (select date from parking_tickets where car_id = 25 order by date desc limit 1) - INTERVAL '12 months'
order by date desc
But I would rather have it in ActiveRecord. Is there any way?
I could insert the subquery itself inside the where clause, but it would not be as clean as I would like.
Is there a nice way to do this, something like the following?
@cars = Car.includes(:parkingTickets)
.where('parkingTickets.date >= ?', MAX(parkingTickets.date) - 12.months)
.order('ID, parkingTickets.date desc')
I would like to have it done in a list of cars, so making the query before and then inserting this value in the query would not be an elegant solution, since I would have an array.
This solution should work:
last_date = ParkingTicket.where(car_id: 25).order(date: :desc).first.date
Car.includes(:parking_tickets)
   .where(id: 25, parking_tickets: { date: (last_date - 12.months)..last_date })
   .first.parking_tickets.order(date: :asc)
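A sketch of an alternative that keeps the subquery in SQL, close to the question's original query. Because the subquery is correlated on car_id, it also covers the list-of-cars case (Postgres syntax; `references` is needed on Rails 4+ when mixing `includes` with a string condition):

```ruby
@cars = Car.includes(:parking_tickets)
           .where("parking_tickets.date > (SELECT MAX(p2.date)
                                           FROM parking_tickets p2
                                           WHERE p2.car_id = parking_tickets.car_id)
                                          - INTERVAL '12 months'")
           .references(:parking_tickets)
           .order("cars.id, parking_tickets.date DESC")
```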

Best way to calculate a non-broken streak of days

In my app a User has_many Gestures. What is a good way to calculate how many subsequent days the User has done Gestures?
Right now I'm doing it like below, and it works as I want it to. But clearly it doesn't scale.
class User < ActiveRecord::Base
  # ...
  def calculate_current_streak
    return 0 unless gestures.done_on_day(Date.today - 1)
    i = 0
    while gestures.done_on_day(Date.today - (i + 1).days).first
      i += 1
    end
    i += 1 if gestures.done_on_day(Date.today).first
    i
  end
end
Thanks! Special points to the one who can work in a way to only track business days too :)
Think about it this way:
The length of the streak is equivalent to the number of (business) days passed since the last (business) day on which the user didn't make any gestures, right?
So the solution boils down to finding the most recent day on which the user didn't make a gesture.
The easiest way to do this uses a trick that lots of DB admins employ: create a DB table containing all dates (say, every date from 2000 to 2100). The table needs only a date field, but you can throw in a boolean field to mark non-working days such as weekends and holidays. Such a table comes in handy in lots of queries, and you only have to create and fill it once. (Here's a more detailed article on such tables, written for MS SQL, but insightful anyway.)
Having this table (let's call it dates), you can calculate your pre-streak date using something like (pseudo-SQL):
SELECT d.date
FROM dates d
LEFT JOIN gestures g ON d.date = g.date AND g.user_id = <put_user_id_here>
WHERE d.date < TODAY() AND g.date IS NULL AND d.work_day = TRUE
ORDER BY d.date DESC
LIMIT 1
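The same idea can be sketched in plain Ruby once the gesture dates are in memory (Mon-Fri treated as business days, holidays ignored; a rough sketch, not the scalable SQL route above):

```ruby
require "date"
require "set"

def business_day?(date)
  (1..5).cover?(date.wday) # Mon-Fri; holidays ignored in this sketch
end

# Walk back from today until the first business day with no gesture;
# every business day passed on the way (plus today, if already done)
# is part of the streak.
def current_streak(gesture_dates, today = Date.today)
  done = Set.new(gesture_dates)
  streak = business_day?(today) && done.include?(today) ? 1 : 0
  day = today - 1
  while !business_day?(day) || done.include?(day)
    streak += 1 if business_day?(day)
    day -= 1
  end
  streak
end
```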

Is it possible to search for dates as strings in a database-agnostic way?

I have a Ruby on Rails application with a PostgreSQL database; several tables have created_at and updated_at timestamp attributes. When displayed, those dates are formatted in the user's locale; for example, the timestamp 2009-10-15 16:30:00.435 becomes the string 15.10.2009 - 16:30 (the date format for this example being dd.mm.yyyy - hh.mm).
The requirement is that the user must be able to search for records by date, as if they were strings formatted in the current locale. For example, searching for 15.10.2009 would return records with dates on October 15th 2009, searching for 15.10 would return records with dates on October 15th of any year, searching for 15 would return all dates that match 15 (be it day, month or year). Since the user can use any part of a date as a search term, it cannot be converted to a date/timestamp for comparison.
One (slow) way would be to retrieve all records, format the dates, and perform the search on that. This could be sped up by retrieving only the id and dates at first, performing the search, and then fetching the data for the matching records; but it could still be slow for large numbers of rows.
Another (not database-agnostic) way would be to cast/format the dates to the right format in the database with PostgreSQL functions or operators, and have the database do the matching (with the PostgreSQL regexp operators or whatnot).
Is there a way to do this efficiently (without fetching all rows) in a database-agnostic way? Or do you think I am going in the wrong direction and should approach the problem differently?
Building on the answer from Carlos, this should allow all of your searches without full table scans if you have indexes on all the date and date part fields. Function-based indexes would be better for the date part columns, but I'm not using them since this should not be database-specific.
CREATE TABLE mytable (
col1 varchar(10),
-- ...
inserted_at timestamp,
updated_at timestamp);
INSERT INTO mytable
VALUES
('a', '2010-01-02', NULL),
('b', '2009-01-02', '2010-01-03'),
('c', '2009-11-12', NULL),
('d', '2008-03-31', '2009-04-18');
ALTER TABLE mytable
ADD inserted_at_month integer,
ADD inserted_at_day integer,
ADD updated_at_month integer,
ADD updated_at_day integer;
-- you will have to find your own way to maintain these values...
UPDATE mytable
SET
inserted_at_month = date_part('month', inserted_at),
inserted_at_day = date_part('day', inserted_at),
updated_at_month = date_part('month', updated_at),
updated_at_day = date_part('day', updated_at);
If the user enters only Year use WHERE Date BETWEEN 'YYYY-01-01' AND 'YYYY-12-31'
SELECT *
FROM mytable
WHERE
inserted_at BETWEEN '2010-01-01' AND '2010-12-31'
OR updated_at BETWEEN '2010-01-01' AND '2010-12-31';
If the user enters Year and Month use WHERE Date BETWEEN 'YYYY-MM-01' AND 'YYYY-MM-31' (may need adjustment for 30/29/28)
SELECT *
FROM mytable
WHERE
inserted_at BETWEEN '2010-01-01' AND '2010-01-31'
OR updated_at BETWEEN '2010-01-01' AND '2010-01-31';
If the user enters the three values use SELECT .... WHERE Date = 'YYYY-MM-DD'
SELECT *
FROM mytable
WHERE
inserted_at = '2009-11-12'
OR updated_at = '2009-11-12';
If the user enters Month and Day
SELECT *
FROM mytable
WHERE
inserted_at_month = 3
OR inserted_at_day = 31
OR updated_at_month = 3
OR updated_at_day = 31;
If the user enters Month or Day (you could optimize to not check values > 12 as a month)
SELECT *
FROM mytable
WHERE
inserted_at_month = 12
OR inserted_at_day = 12
OR updated_at_month = 12
OR updated_at_day = 12;
"Database-agnostic way" is usually a synonym for "slow way", so the solutions are unlikely to be efficient.
Parsing all records on the client side would be the least efficient solution in any case.
You can process your locale string on the client side and form a correct condition for a LIKE, RLIKE or REGEXP_SUBSTR operator. The client side, of course, needs to know which database the system uses.
Then you should apply the operator to a string formed according to the locale with database-specific formatting function, like this (in Oracle):
SELECT *
FROM mytable
WHERE TO_CHAR(mydate, 'dd.mm.yyyy - hh24.mi') LIKE '15.10%'
A more efficient way (which works only in PostgreSQL, though) would be to create a GIN index on the individual date parts:
CREATE INDEX ix_dates_parts
ON dates
USING GIN
(
(ARRAY
[
DATE_PART('year', date)::INTEGER,
DATE_PART('month', date)::INTEGER,
DATE_PART('day', date)::INTEGER,
DATE_PART('hour', date)::INTEGER,
DATE_PART('minute', date)::INTEGER,
DATE_PART('second', date)::INTEGER
]
)
)
and use it in a query:
SELECT *
FROM dates
WHERE ARRAY[11, 19, 2010] <# (ARRAY
[
DATE_PART('year', date)::INTEGER,
DATE_PART('month', date)::INTEGER,
DATE_PART('day', date)::INTEGER,
DATE_PART('hour', date)::INTEGER,
DATE_PART('minute', date)::INTEGER,
DATE_PART('second', date)::INTEGER
]
)
LIMIT 10
This will select records having all three numbers (11, 19 and 2010) in any of the date parts: all records of November 19, 2010, plus all records at 19:11 in 2010, etc.
Whatever the user enters, you should extract three values: year, month and day, using their locale as a guide. Some values may be empty.
If the user enters only Year use WHERE Date BETWEEN 'YYYY-01-01' AND 'YYYY-12-31'
If the user enters Year and Month use WHERE Date BETWEEN 'YYYY-MM-01' AND 'YYYY-MM-31' (may need adjustment for 30/29/28)
If the user enters the three values use SELECT .... WHERE Date = 'YYYY-MM-DD'
If the user enters Month and Day, you'll have to use the 'slow' way
IMHO, the short answer is No. But definitely avoid loading all rows.
Few notes:
if you had only simple queries for exact dates or ranges, I would recommend using the ISO format for DATE (YYYY-MM-DD, ex: 2010-02-01) or DATETIME. But since you seem to need queries like "all years for October 15th", you need custom queries anyway.
I suggest you create a "parser" that takes your date query and gives you the corresponding part of the SQL WHERE clause. I am certain that you will end up with fewer than a dozen cases, so you can have an optimal WHERE for each of them. This way you will avoid loading all records.
you definitely do not want to do anything locale-specific in the SQL. Therefore, convert the locale input to some standard form in the non-SQL code, then use it to perform the query (basically, separate localization/globalization from query execution)
Then you can optimize. If you see that you have a lot of queries just for the year, you might create a COMPUTED COLUMN containing only the YEAR and put an index on it.
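Such a parser might be sketched like this for the question's dd.mm.yyyy locale, returning a WHERE fragment plus bind values. The created_at_day / created_at_month columns are hypothetical precomputed datepart columns, in the spirit of the inserted_at_day columns added earlier:

```ruby
# Map a locale-formatted search fragment (dd.mm.yyyy) to a
# [where_sql, *bind_values] pair. Returns nil for unrecognized input.
def date_search_condition(query, column = "created_at")
  case query
  when /\A(\d{1,2})\.(\d{1,2})\.(\d{4})\z/ # full date: dd.mm.yyyy
    ["DATE(#{column}) = ?", format("%04d-%02d-%02d", $3.to_i, $2.to_i, $1.to_i)]
  when /\A(\d{1,2})\.(\d{1,2})\z/ # day and month: dd.mm, any year
    ["#{column}_day = ? AND #{column}_month = ?", $1.to_i, $2.to_i]
  when /\A\d{4}\z/ # year only: yyyy
    ["#{column} BETWEEN ? AND ?", "#{query}-01-01", "#{query}-12-31"]
  when /\A\d{1,2}\z/ # bare number: could be a day or a month
    ["#{column}_day = ? OR #{column}_month = ?", query.to_i, query.to_i]
  end
end
```

The result can be splatted straight into ActiveRecord, e.g. `Record.where(*date_search_condition(params[:q]))`.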
