PostgreSQL OVERLAPS operator not inclusive of end time - Rails - ruby-on-rails

I am using Rails 5 and postgresql in my application.
The query is as follows.
.where('(start_time, end_time) OVERLAPS (time without time zone ?, time without time zone ?)',
start_time, end_time)
Scenario 1:
Range 1 => 0:00 - 08:00
Range 2 => 07:59 - 12:00
This shows an overlap which is correct.
Scenario 2:
Range 1 => 0:00 - 08:00
Range 2 => 08:00 - 12:00
This does not show an overlap. This is incorrect since 08:00 falls in both time ranges.
I assume the issue is that OVERLAPS considers < end_time and not <= end_time.
Any idea on how to fix this?

The documentation makes this quite clear:
Each time period is considered to represent the half-open interval start <= time < end, unless start and end are equal in which case it represents that single time instant. This means for instance that two time periods with only an endpoint in common do not overlap.
You can get what you want with range types and the overlaps operator &&:
WHERE tsrange(start_time, end_time, '[]') && tsrange(?, ?, '[]')
Here the third argument to tsrange specifies that both ends are included.
Using ranges has the additional advantage that you can support the condition with a GiST index.
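To make the difference between the two interval conventions concrete, here is a small plain-Ruby sketch (not Rails or PostgreSQL code). The helper names are made up for illustration, times are given as minutes since midnight, and the single-instant special case of OVERLAPS (start equal to end) is deliberately ignored.

```ruby
# Hypothetical helpers; times are minutes since midnight.

# Half-open intervals [start, end): sharing only an endpoint is NOT an overlap.
def half_open_overlap?(s1, e1, s2, e2)
  s1 < e2 && s2 < e1
end

# Closed intervals [start, end]: sharing an endpoint IS an overlap.
def closed_overlap?(s1, e1, s2, e2)
  s1 <= e2 && s2 <= e1
end

# Scenario 2 from the question: 00:00-08:00 vs 08:00-12:00
puts half_open_overlap?(0, 480, 480, 720)  # false
puts closed_overlap?(0, 480, 480, 720)     # true
```

The first helper mirrors how OVERLAPS treats its periods, the second mirrors ranges built with '[]' bounds.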

Related

TIMESTAMP WITHOUT TIME ZONE, INTERVAL and DST extravaganza

I'm working on a Rails application which stores all dates to PostgreSQL as "TIMESTAMP WITHOUT TIME ZONE". (Rails handles the time zone on the application layer which for this application is "Europe/Berlin".) Unfortunately, Daylight Savings Time (DST) becomes an issue.
The simplified "projects" table has the following columns:
started_at TIMESTAMP WITHOUT TIME ZONE
duration INTEGER
Projects start at started_at and run for duration days.
Now, say there's only one project which starts on 2015-01-01 at 10:00. Since this is "Europe/Berlin" and it's January (no DST), the record looks like this on the database:
SET TimeZone = 'UTC';
SELECT started_at from projects;
# => 2015-01-01 09:00:00
It should end on 2015-06-30 at 10:00 (Europe/Berlin). But it's summer now, so DST applies and 10:00 in "Europe/Berlin" is now 08:00 in UTC.
Due to this, finding all projects for which the duration has elapsed by use of the following query does not work for projects which start/end across DST boundaries:
SELECT * FROM projects WHERE started_at + INTERVAL '1 day' * duration < NOW()
I guess it would be best if the above WHERE did the calculation in timezone "Europe/Berlin" rather than "UTC". I've tried a few things with ::TIMESTAMPTZ and AT TIME ZONE, none of which have worked.
As a side note: According to the PostgreSQL docs, + INTERVAL should deal with '1 day' intervals differently from '24 hours' intervals when it comes to DST. Adding days ignores DST, so 10:00 always stays 10:00. When adding hours on the other hand, 10:00 may become 09:00 or 11:00 if you cross the DST boundary one way or another.
Thanks a lot for any hints!
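The one-hour drift described above can be reproduced in plain Ruby. This is only a sketch: fixed UTC offsets (+01:00 in winter, +02:00 in summer) stand in for "Europe/Berlin", and the variable names are invented for the example.

```ruby
require 'date'

# Assumption: fixed offsets stand in for Europe/Berlin (+01:00 winter, +02:00 summer).
started_at_utc = Time.new(2015, 1, 1, 10, 0, 0, "+01:00").utc
duration_days  = (Date.new(2015, 6, 30) - Date.new(2015, 1, 1)).to_i  # 180

# What the query computes: the UTC timestamp plus N days, with no DST awareness.
naive_end_utc = started_at_utc + duration_days * 86_400

# What is expected: 10:00 Berlin wall-clock time on the end date.
expected_end_utc = Time.new(2015, 6, 30, 10, 0, 0, "+02:00").utc

puts naive_end_utc                     # 2015-06-30 09:00:00 UTC
puts expected_end_utc                  # 2015-06-30 08:00:00 UTC
puts naive_end_utc - expected_end_utc  # 3600.0
```

The naive end lands one hour after the intended one, so a project crossing the DST boundary is reported as finished an hour late.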
I think you've got two strategies for avoiding headache:
Let Rails handle everything to do with Timezones, so Postgres doesn't have to at all
or
Let Postgres handle everything to do with Timezones, so Rails doesn't have to at all
Mixing the two will always be a pain, and is basically what's causing your problems now. I'd go with strategy 1 (let Rails handle it). To do this, your Postgres database should store a start time and a finish time, both in UTC. duration may still be a thing in your user interface, but if a user enters a start time and a duration, then you should calculate a finish time and store that finish time in your database. The start time the user enters and the finish time that you calculate in your app will both be timezone-specific, and you just let Rails handle the conversion to UTC when it saves to the database.
Your query would then be simply:
SELECT * FROM projects WHERE finished_at < NOW()
(BTW, You could also store the duration in your database, but it's superfluous, since it can be calculated from the start time and finish time)
I've created a function which calculates ended_at by adding duration days to started_at honoring DST changes of a given time zone. Both started_at and ended_at, however, are in UTC and therefore play nice with Rails.
It turns started_at (timestamp without time zone, implicit UTC by Rails) to a timestamp with time zone UTC, then to the given time zone, adds the duration and returns the timestamp without time zone (implicit UTC).
-- ended_at(started_at, duration, time_zone)
CREATE FUNCTION ended_at(timestamp, integer, text = 'Europe/Zurich') RETURNS timestamp AS $$
SELECT (($1::timestamp AT TIME ZONE 'UTC' AT TIME ZONE $3 + INTERVAL '1 day' * $2) AT TIME ZONE $3)::timestamp
$$ LANGUAGE SQL IMMUTABLE SET search_path = public, pg_temp;
With this function, I can omit having to add ended_at as an explicit column which would have to be kept in sync. And it's easy to use:
SELECT ended_at(started_at, duration) FROM projects

ruby timezone conversion issues

I have a scenario in which I get a timestamp and need to find all bookings for the date in that timestamp. The timestamp is in the user's time zone, while all records in the database are stored in UTC, so naturally I need to convert the timestamp to UTC before searching.
Here's what I'm doing:
Booking.where("date_time >= '#{DateTime.parse(timestamp).in_time_zone('UTC').beginning_of_day}' and date_time <= '#{DateTime.parse(timestamp).in_time_zone('UTC').end_of_day}'")
which basically means: fetch all bookings from the beginning of the day until the end.
However, when I use the following query it gives me a different result:
Booking.where("date_time >= '#{DateTime.parse(timestamp).beginning_of_day.in_time_zone('UTC')}' and date_time <= '#{DateTime.parse(timestamp).end_of_day.in_time_zone('UTC')}'")
I'm wondering which one is actually the correct statement for my use case, and I would appreciate some input here.
I wouldn't use either one.
This one:
DateTime.parse(timestamp).in_time_zone('UTC').beginning_of_day
gives you the beginning of the UTC day, not the beginning of the local-time-zone-day offset to UTC. In short, it is incorrect and won't give you what you're looking for.
This one:
DateTime.parse(timestamp).beginning_of_day.in_time_zone('UTC')
is correct as it changes the time to the beginning of the day in the local time zone and then converts the timestamp to UTC.
If you let ActiveRecord deal with the quoting using a placeholder, then it will apply the UTC adjustment itself.
I'd also use < t.tomorrow.beginning_of_day rather than <= t.end_of_day to avoid timestamp truncation and precision issues; the end of the day is considered to be at 23:59:59.999... and that could leave a little tiny window for errors to creep in. I'm being pretty pedantic here, you might not care about this.
I'd probably do it more like this:
t = DateTime.parse(timestamp)
Booking.where('date_time >= :start and date_time < :end',
:start => t.beginning_of_day,
:end => t.tomorrow.beginning_of_day
)
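The effect of the call order can be checked without Rails. The following is a sketch using only Ruby's standard date library; the input timestamp is a made-up example, and adding one day assumes no DST change happens during that local day.

```ruby
require 'date'

# Hypothetical input: a timestamp carrying the user's UTC offset.
timestamp = "2013-11-10T01:30:00+05:30"
t = DateTime.parse(timestamp)

# Local midnight first, then convert to UTC.
local_midnight = DateTime.new(t.year, t.month, t.day, 0, 0, 0, t.offset)
range_start = local_midnight.new_offset(0)
range_end   = (local_midnight + 1).new_offset(0)  # exclusive upper bound

# Convert to UTC first, then take midnight: this is the start of the UTC day.
utc = t.new_offset(0)
utc_midnight = DateTime.new(utc.year, utc.month, utc.day)

puts range_start.iso8601   # 2013-11-09T18:30:00+00:00
puts range_end.iso8601     # 2013-11-10T18:30:00+00:00
puts utc_midnight.iso8601  # 2013-11-09T00:00:00+00:00
```

Only the first pair of bounds covers the user's local day; the UTC-first variant starts 18.5 hours too early in this example.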

Store the day of the week and time?

I have a two-part question about storing days of the week and time in a database. I'm using Rails 4.0, Ruby 2.0.0, and Postgres.
I have certain events, and those events have a schedule. For the event "Skydiving", for example, I might have Tuesday and Wednesday and 3 pm.
Is there a way for me to store the record for Tuesday and Wednesday in one row or should I have two records?
What is the best way to store the day and time? Is there a way to store day of week and time (not datetime) or should these be separate columns? If they should be separate, how would I store the day of the week? I was thinking of storing them as integer values, 0 for Sunday, 1 for Monday, since that's how the wday method for the Time class does it.
Any suggestions would be super helpful.
Is there a way for me to store the record for Tuesday and
Wednesday in one row or should I have two records?
There are several ways to store multiple time ranges in a single row. @bma already provided a couple of them. That might be useful to save disk space with very simple time patterns. The clean, flexible and "normalized" approach is to store one row per time range.
What is the best way to store the day and time?
Use a timestamp (or timestamptz if multiple time zones may be involved). Pick an arbitrary "staging" week and just ignore the date part while using the day and time aspect of the timestamp. Simplest and fastest in my experience, and all date and time related sanity-checks are built-in automatically. I use a range starting with 1996-01-01 00:00 for several similar applications for two reasons:
The first 7 days of the month coincide with the ISO day of the week (Monday = 1 through Sunday = 7).
It's the most recent leap year (providing Feb. 29 for yearly patterns) at the same time.
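Both properties of the 1996 staging week are easy to verify. A quick check with Ruby's standard date library (used here for illustration only, not as part of the database design):

```ruby
require 'date'

# Each of the first seven days of January 1996 has an ISO weekday number
# (Monday = 1 .. Sunday = 7) equal to its day of the month.
(1..7).each do |day|
  date = Date.new(1996, 1, day)
  puts "#{date} #{date.strftime('%a')} cwday=#{date.cwday}"
end

puts Date.leap?(1996)  # true
```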
Range type
Since you are actually dealing with time ranges (not just "day and time") I suggest to use the built-in range type tsrange (or tstzrange). A major advantage: you can use the arsenal of built-in Range Functions and Operators. Requires Postgres 9.2 or later.
For instance, you can have an exclusion constraint building on that (implemented internally by way of a fully functional GiST index that may provide additional benefit), to rule out overlapping time ranges. Consider this related answer for details:
Preventing adjacent/overlapping entries with EXCLUDE in PostgreSQL
For this particular exclusion constraint (no overlapping ranges per event), you need to include the integer column event_id in the constraint, so you need to install the additional module btree_gist. Install once per database with:
CREATE EXTENSION btree_gist; -- once per db
Or you can have one simple CHECK constraint to restrict the allowed time period using the "range is contained by" operator <@.
Could look like this:
CREATE TABLE event (event_id serial PRIMARY KEY, ...);
CREATE TABLE schedule (
   event_id integer NOT NULL REFERENCES event(event_id)
                    ON DELETE CASCADE ON UPDATE CASCADE
 , t_range  tsrange
 , PRIMARY KEY (event_id, t_range)
 , CHECK (t_range <@ '[1996-01-01 00:00, 1996-01-09 00:00)')  -- restrict period
 , EXCLUDE USING gist (event_id WITH =, t_range WITH &&)      -- disallow overlap
);
For a weekly schedule use the first seven days, Mon-Sun, or whatever suits you. Monthly or yearly schedules in a similar fashion.
How to extract day of week, time, etc?
@CDub provided a module to deal with it on the Ruby end. I can't comment on that, but you can do everything in Postgres as well, with impeccable performance.
SELECT ts::time AS t_time -- get the time (practically no cost)
SELECT EXTRACT(DOW FROM ts) AS dow -- get day of week (very cheap)
Or in similar fashion for range types:
SELECT EXTRACT(DOW FROM lower(t_range)) AS dow_from -- day of week lower bound
, EXTRACT(DOW FROM upper(t_range)) AS dow_to -- same for upper
, lower(t_range)::time AS time_from -- start time
, upper(t_range)::time AS time_to -- end time
FROM schedule;
ISODOW instead of DOW for EXTRACT() returns 7 instead of 0 for Sundays. There is a long list of what you can extract.
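Ruby's Date class has the same pair of numbering schemes, which helps when mapping values between the application and the database. A short sketch using dates from the staging week:

```ruby
require 'date'

sunday = Date.new(1996, 1, 7)   # a Sunday in the staging week
puts sunday.wday    # 0, same numbering as EXTRACT(DOW ...)
puts sunday.cwday   # 7, same numbering as EXTRACT(ISODOW ...)

monday = Date.new(1996, 1, 1)
puts monday.wday    # 1
puts monday.cwday   # 1
```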
This related answer demonstrates how to use range type operator to compute a total duration for time ranges (last chapter):
Calculate working hours between 2 dates in PostgreSQL
Check out the ice_cube gem.
It can create a schedule object for you which you can persist to your database. You need not create two separate records. For the second part, you can create schedule based on any rule and you need not worry on how that will be saved in the database. You can use the methods provided by the gem to get whatever information you want from the persisted schedule object.
Depending how complex your scheduling needs are, you might want to have a look at RFC 5545, the iCalendar scheduling data format, for ideas on how to store the data.
If your needs are pretty simple, then that is probably overkill. PostgreSQL has many functions to convert date and time to whatever format you need.
For a simple way to store relative dates and times, you could store the day of week as an integer as you suggested, and the time as a TIME datatype. If you can have multiple days of the week that are valid, you might want to use an ARRAY.
Eg.
ARRAY[2,3]::INTEGER[] = Tues, Wed as Day of Week
'15:00:00'::TIME = 3pm
[EDIT: Add some simple examples]
/* Custom range type over the time data type */
CREATE TYPE timerange AS RANGE (subtype = time);
--drop table if exists schedule;
create table schedule (
event_id integer not null, /* should be an FK to "events" table */
day_of_week integer[],
time_of_day time,
time_range timerange,
recurring text CHECK (recurring IN ('DAILY','WEEKLY','MONTHLY','YEARLY'))
);
insert into schedule (event_id, day_of_week, time_of_day, time_range, recurring)
values
(1, ARRAY[1,2,3,4,5]::INTEGER[], '15:00:00'::TIME, NULL, 'WEEKLY'),
(2, ARRAY[6,0]::INTEGER[], NULL, '(08:00:00,17:00:00]'::timerange, 'WEEKLY');
select * from schedule;
 event_id | day_of_week | time_of_day |     time_range      | recurring
----------+-------------+-------------+---------------------+-----------
        1 | {1,2,3,4,5} | 15:00:00    |                     | WEEKLY
        2 | {6,0}       |             | (08:00:00,17:00:00] | WEEKLY
The first entry could be read as: the event is valid at 3pm Mon - Fri, with this schedule occurring every week.
The second entry could be read as: the event is valid Saturday and Sunday between 8am and 5pm, occurring every week.
The custom range type "timerange" is used to denote the lower and upper boundaries of your time range.
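The leading '(' means "exclusive" and the trailing ']' means "inclusive", so the literal above reads "later than 8am and up to and including 5pm". For "greater than or equal to 8am and less than 5pm", write '[08:00:00,17:00:00)' instead.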
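In PostgreSQL range literals, square brackets mark inclusive bounds and parentheses mark exclusive ones. The following plain-Ruby sketch mimics that rule for time-of-day strings; in_time_range? is a hypothetical helper written for this illustration, not a PostgreSQL or Rails API.

```ruby
# Hypothetical helper: checks a "HH:MM:SS" string against a range literal
# such as '(08:00:00,17:00:00]'. Zero-padded times compare correctly as strings.
def in_time_range?(value, literal)
  lower_inclusive = literal.start_with?("[")
  upper_inclusive = literal.end_with?("]")
  lower, upper = literal[1..-2].split(",")

  above = lower_inclusive ? value >= lower : value > lower
  below = upper_inclusive ? value <= upper : value < upper
  above && below
end

puts in_time_range?("08:00:00", "(08:00:00,17:00:00]")  # false
puts in_time_range?("17:00:00", "(08:00:00,17:00:00]")  # true
puts in_time_range?("08:00:00", "[08:00:00,17:00:00)")  # true
puts in_time_range?("17:00:00", "[08:00:00,17:00:00)")  # false
```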
Why not just store the datestamp then use the built in functionality for Date to get the day of the week?
2.0.0p247 :139 > Date.today
=> Sun, 10 Nov 2013
2.0.0p247 :140 > Date.today.strftime("%A")
=> "Sunday"
strftime sounds like it can do everything for you. Here are the specific docs for it.
Specifically for what you're talking about, it sounds like you'd need an Event table that has_many :schedules, where a Schedule would have a start_date timestamp...

Rails 3.1: Querying Postgres for records within a time range

In my app I have a Person model. Each Person has an attribute time_zone that specifies their default time zone. I also have an Event model. Each Event has a start_time and end_time timestamp, saved in a Postgres database in UTC time.
I need to create a query that finds events for a particular person that fall between midnight of one day and midnight of the next. The @todays_events controller variable holds the results of the query.
Part of the reason that I'm taking this approach is that I may have people from other time zones looking at the list of events for a person. I want them to see the day as the person would see the day and not based on the time zone they are in as an observer.
For whatever reason, I'm still getting some events from the previous day in my result set for @todays_events. My guess is that I'm comparing a UTC timestamp with a non-UTC parameter, or something along those lines. Generally, only events that begin or end in the evening of the previous day show up on the query result list for today.
Right now, I'm setting up:
@today = Time.now.in_time_zone(@person.time_zone).midnight.to_date
@tomorrow = (@today + 1.day).to_datetime
@today = @today.to_datetime
My query looks like:
@todays_activities = @person.marks.where("(start_time >= ? AND start_time < ?) OR (end_time >= ? AND end_time < ?);", @today, @tomorrow, @today, @tomorrow).order("start_time DESC")
How should I change this so that I'm guaranteed only to receive results from today (per the @person.time_zone) in the @todays_activities query?
You're losing track of your timezones when you call to_date so don't do that:
@today = Time.now.in_time_zone(@person.time_zone).midnight.utc
@tomorrow = @today + 1.day
When you call some_date.to_datetime, you get a DateTime instance that is in UTC, so the result of something like this:
Time.now.in_time_zone(@person.time_zone).midnight.to_date.to_datetime
will have a time-of-day of 00:00:00 and a time zone of UTC; the 00:00:00 is the correct time-of-day in @person.time_zone but not right for UTC (unless, of course, @person is in the +0 time zone).
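The loss of the offset can be seen without Rails. In this sketch a fixed -08:00 offset stands in for the person's time zone (an assumption made for the example), and only Ruby's standard date library is used:

```ruby
require 'date'

# Assumption: a fixed -08:00 offset stands in for the person's time zone.
local_midnight = Time.new(2013, 11, 10, 0, 0, 0, "-08:00")

# Going through Date drops the offset; the resulting DateTime is midnight UTC.
via_date = local_midnight.to_date.to_datetime
puts via_date.iso8601        # 2013-11-10T00:00:00+00:00

# Converting the Time itself keeps the instant: local midnight is 08:00 UTC.
puts local_midnight.getutc   # 2013-11-10 08:00:00 UTC
```

The two results are eight hours apart, which is the window in which events from the previous local evening slip into the result set.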
And you could simplify your query with overlaps:
where(
  '(start_time, end_time) overlaps (timestamp :today, timestamp :tomorrow)',
  :today => @today, :tomorrow => @tomorrow
)
Note that overlaps works with half-open intervals:
Each time period is considered to represent the half-open interval start <= time < end, unless start and end are equal in which case it represents that single time instant.

Given a date, how can I efficiently calculate the next date in a given sequence (weekly, monthly, annually)?

In my application I have a variety of date sequences, such as Weekly, Monthly and Annually. Given an arbitrary date in the past, I need to calculate the next future date in the sequence.
At the moment I'm using a sub-optimal loop. Here's a simplified example (in Ruby/Rails):
def calculate_next_date(from_date)
  next_date = from_date
  while next_date < Date.today
    next_date += 1.week # (or 1.month)
  end
  next_date
end
Rather than perform a loop (which, although simple, is inefficient especially when given a date in the distant past) I'd like to do this with date arithmetic by calculating the number of weeks (or months, years) between the two dates, calculating the remainder and using these values to generate the next date.
Is this the right approach, or am I missing a particularly clever 'Ruby' way of solving this? Or should I just stick with my loop for the simplicity of it all?
Because you tagged this question as ruby-on-rails, I suppose you are using Rails.
ActiveSupport introduces the calculation module, which provides a helpful #advance method.
date = Date.today
date.advance(:weeks => 1)
date.advance(:days => 7)
# => next week
I have used the recurrence gem in the past for this purpose. There are a few other gems that model recurring events listed here.
If you are using a Time object, you can use Time.to_a to break the time into an array (with fields representing the hour, day, month, etc), adjust the appropriate field, and pass the array back to Time.local or Time.utc to build a new Time object.
If you are using the Date class, date +/- n will give you a date n days later/earlier, and date >>/<< n will give you a date n months later/earlier.
You can use the more generic Date.step instead of your loop. For example,
from_date.step(Date.today, interval) {|d|
# Each iteration of this block will be passed a value for 'd'
# that is 'interval' days after the previous 'd'.
}
where interval is a length of time in days.
If all you are doing is calculating elapsed time, then there is probably a better approach to this. If your date is stored as a Date object, doing date - Date.today will give you the number of days between that date and now. To calculate months, years, etc, you can use something like this:
# Parameter 'old_date' must be a Date object
def months_since(old_date)
  (Date.today.month + (12 * Date.today.year)) - (old_date.month + (12 * old_date.year))
end

def years_since(old_date)
  Date.today.year - old_date.year
end

def days_since(old_date)
  Date.today - old_date
end
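The loop-free arithmetic the question asks about could be sketched as follows, using only Ruby's standard date library. Both helper names are hypothetical, and each returns the first date in the sequence that falls on or after today, matching the loop in the question.

```ruby
require 'date'

# Hypothetical helpers: both return the first date in the sequence that is
# on or after `today`, without looping.

def next_weekly_date(from_date, today = Date.today)
  return from_date if from_date >= today
  days  = (today - from_date).to_i
  weeks = (days + 6) / 7            # integer ceiling of days / 7
  from_date + weeks * 7
end

def next_monthly_date(from_date, today = Date.today)
  return from_date if from_date >= today
  months = (today.year * 12 + today.month) - (from_date.year * 12 + from_date.month)
  candidate = from_date >> months   # same day of month, clamped to month end
  candidate >= today ? candidate : from_date >> (months + 1)
end

puts next_weekly_date(Date.new(2013, 11, 1), Date.new(2013, 11, 10))   # 2013-11-15
puts next_monthly_date(Date.new(2013, 1, 31), Date.new(2013, 3, 15))   # 2013-03-31
puts next_monthly_date(Date.new(2013, 1, 10), Date.new(2013, 3, 15))   # 2013-04-10
```

One design difference from the loop: the monthly helper always shifts from the original date, so a sequence starting on the 31st returns to the 31st where the month allows it, whereas repeatedly adding one month stays on the 28th after passing through February.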
