Is it possible to search for dates as strings in a database-agnostic way? - ruby-on-rails

I have a Ruby on Rails application with a PostgreSQL database; several tables have created_at and updated_at timestamp attributes. When displayed, those dates are formatted in the user's locale; for example, the timestamp 2009-10-15 16:30:00.435 becomes the string 15.10.2009 - 16:30 (the date format for this example being dd.mm.yyyy - hh.mm).
The requirement is that the user must be able to search for records by date, as if they were strings formatted in the current locale. For example, searching for 15.10.2009 would return records with dates on October 15th 2009, searching for 15.10 would return records with dates on October 15th of any year, searching for 15 would return all dates that match 15 (be it day, month or year). Since the user can use any part of a date as a search term, it cannot be converted to a date/timestamp for comparison.
One (slow) way would be to retrieve all records, format the dates, and perform the search on that. This could be sped up by retrieving only the id and dates at first, performing the search, and then fetching the data for the matching records; but it could still be slow for large numbers of rows.
Another (not database-agnostic) way would be to cast/format the dates to the right format in the database with PostgreSQL functions or operators, and have the database do the matching (with the PostgreSQL regexp operators or whatnot).
Is there a way to do this efficiently (without fetching all rows) in a database-agnostic way? Or do you think I am going in the wrong direction and should approach the problem differently?

Building on the answer from Carlos, this should allow all of your searches without full table scans if you have indexes on all the date and date part fields. Function-based indexes would be better for the date part columns, but I'm not using them since this should not be database-specific.
CREATE TABLE mytable (
col1 varchar(10),
-- ...
inserted_at timestamp,
updated_at timestamp);
INSERT INTO mytable
VALUES
('a', '2010-01-02', NULL),
('b', '2009-01-02', '2010-01-03'),
('c', '2009-11-12', NULL),
('d', '2008-03-31', '2009-04-18');
ALTER TABLE mytable
ADD inserted_at_month integer,
ADD inserted_at_day integer,
ADD updated_at_month integer,
ADD updated_at_day integer;
-- you will have to find your own way to maintain these values...
UPDATE mytable
SET
inserted_at_month = date_part('month', inserted_at),
inserted_at_day = date_part('day', inserted_at),
updated_at_month = date_part('month', updated_at),
updated_at_day = date_part('day', updated_at);
If the user enters only Year use WHERE Date BETWEEN 'YYYY-01-01' AND 'YYYY-12-31'
SELECT *
FROM mytable
WHERE
inserted_at BETWEEN '2010-01-01' AND '2010-12-31'
OR updated_at BETWEEN '2010-01-01' AND '2010-12-31';
If the user enters Year and Month use WHERE Date BETWEEN 'YYYY-MM-01' AND 'YYYY-MM-31' (may need adjustment for 30/29/28)
SELECT *
FROM mytable
WHERE
inserted_at BETWEEN '2010-01-01' AND '2010-01-31'
OR updated_at BETWEEN '2010-01-01' AND '2010-01-31';
If the user enters the three values use SELECT .... WHERE Date = 'YYYY-MM-DD'
SELECT *
FROM mytable
WHERE
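-- timestamps carry a time of day, so match the whole day rather than midnight only
inserted_at >= '2009-11-12' AND inserted_at < '2009-11-13'
OR updated_at >= '2009-11-12' AND updated_at < '2009-11-13';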
If the user enters Month and Day
SELECT *
FROM mytable
WHERE
(inserted_at_month = 3 AND inserted_at_day = 31)
OR (updated_at_month = 3 AND updated_at_day = 31);
If the user enters Month or Day (you could optimize to not check values > 12 as a month)
SELECT *
FROM mytable
WHERE
inserted_at_month = 12
OR inserted_at_day = 12
OR updated_at_month = 12
OR updated_at_day = 12;

"Database-agnostic way" is usually a synonym for "slow way", so the solutions are unlikely to be efficient.
Parsing all records on the client side would be the least efficient solution in any case.
You can process your locale string on the client side and form a correct condition for a LIKE, RLIKE or REGEXP_SUBSTR operator. The client side should of course be aware of which database the system uses.
Then you should apply the operator to a string formed according to the locale with a database-specific formatting function, like this (in Oracle):
SELECT *
FROM mytable
WHERE TO_CHAR(mydate, 'dd.mm.yyyy - hh24.mi') LIKE '%15.10%'
A more efficient way (that works only in PostgreSQL, though) would be to create a GIN index on the individual dateparts:
CREATE INDEX ix_dates_parts
ON dates
USING GIN
(
(ARRAY
[
DATE_PART('year', date)::INTEGER,
DATE_PART('month', date)::INTEGER,
DATE_PART('day', date)::INTEGER,
DATE_PART('hour', date)::INTEGER,
DATE_PART('minute', date)::INTEGER,
DATE_PART('second', date)::INTEGER
]
)
)
and use it in a query:
SELECT *
FROM dates
WHERE ARRAY[11, 19, 2010] <@ (ARRAY
[
DATE_PART('year', date)::INTEGER,
DATE_PART('month', date)::INTEGER,
DATE_PART('day', date)::INTEGER,
DATE_PART('hour', date)::INTEGER,
DATE_PART('minute', date)::INTEGER,
DATE_PART('second', date)::INTEGER
]
)
LIMIT 10
This will select records having all three numbers (11, 19 and 2010) in any of the dateparts: for example, all records from November 19, 2010, plus all records from 19:11 on any day in 2010, etc.
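To make the matching rule concrete, here is a small Ruby sketch (not part of the original answer) that mimics the array containment test from the query above outside the database. The helper names date_parts and matches_all_parts? are made up for illustration, and the sample timestamps are invented.

```ruby
# Hypothetical helper mirroring the indexed ARRAY[...] expression:
# year, month, day, hour, minute, second as integers.
def date_parts(time)
  [time.year, time.month, time.day, time.hour, time.min, time.sec]
end

# True when every wanted number appears somewhere among the date parts,
# which is what the containment test in the query checks.
def matches_all_parts?(time, wanted)
  (wanted - date_parts(time)).empty?
end

wanted = [11, 19, 2010]
samples = [
  Time.utc(2010, 11, 19, 8, 30, 0),  # 19 November 2010
  Time.utc(2010, 3, 5, 19, 11, 0),   # 19:11 on another day in 2010
  Time.utc(2010, 11, 20, 0, 0, 0)    # no 19 in any part
]
matches = samples.select { |t| matches_all_parts?(t, wanted) }
puts matches.length
```

The first two samples match and the third does not, which shows why this kind of search is deliberately loose: the position of each number is ignored.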

Whatever the user enters, you should extract three values: Year, Month and Day, using their locale as a guide. Some values may be empty.
If the user enters only Year use WHERE Date BETWEEN 'YYYY-01-01' AND 'YYYY-12-31'
If the user enters Year and Month use WHERE Date BETWEEN 'YYYY-MM-01' AND 'YYYY-MM-31' (may need adjustment for 30/29/28)
If the user enters the three values use SELECT .... WHERE Date = 'YYYY-MM-DD'
If the user enters Month and Day, you'll have to use the 'slow' way
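As a rough sketch of how those cases could be wired together on the application side, the Ruby below turns a search fragment into a SQL condition plus bind values. Everything here is an assumption made for illustration: the helper name date_search_condition, the dd.mm.yyyy input order, the rule that a four-digit last part is a year, and the extra integer columns named like inserted_at_day and inserted_at_month. It uses half-open ranges instead of BETWEEN so that month lengths and times of day need no special handling.

```ruby
require "date"

# Hypothetical helper: returns [sql_condition, bind_values] for a fragment
# typed in dd.mm.yyyy order (assumed locale format).
def date_search_condition(fragment, column = "inserted_at")
  parts = fragment.strip.split(".")
  numbers = parts.map { |p| Integer(p, 10) }
  year_last = parts.last.to_s.length == 4 # assumption: 4 digits means a year

  # Half-open range [from, to) on the timestamp column.
  range = lambda do |from, to|
    ["#{column} >= ? AND #{column} < ?", [from.iso8601, to.iso8601]]
  end

  case [parts.length, year_last]
  when [3, true]                      # day.month.year
    day, month, year = numbers
    from = Date.new(year, month, day)
    range.call(from, from + 1)
  when [2, true]                      # month.year
    month, year = numbers
    from = Date.new(year, month, 1)
    range.call(from, from >> 1)
  when [1, true]                      # year only
    from = Date.new(numbers[0], 1, 1)
    range.call(from, from >> 12)
  when [2, false]                     # day.month, any year
    day, month = numbers
    ["#{column}_day = ? AND #{column}_month = ?", [day, month]]
  else
    raise ArgumentError, "unsupported search fragment: #{fragment.inspect}"
  end
end

sql, binds = date_search_condition("15.10.2009")
puts sql
puts binds.inspect
```

The returned pair is meant to be passed to whatever parameterized query interface the application uses, so the user's input never ends up concatenated into the SQL text.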

IMHO, the short answer is No. But definitely avoid loading all rows.
A few notes:
If you had only simple queries for exact dates or ranges, I would recommend using the ISO format for DATE (YYYY-MM-DD, e.g. 2010-02-01) or DATETIME. But since you seem to need queries like "all years for October 15th", you need custom queries anyway.
I suggest you create a "parser" that takes your date query and gives you the corresponding part of the SQL WHERE clause. I am certain that you will end up with fewer than a dozen cases, so you can have an optimal WHERE for each of them. This way you will avoid loading all records.
You definitely do not want to do anything locale-specific in the SQL. Therefore convert the localized input to some standard form in the non-SQL code, then use that to perform your query (basically, separate localization/globalization from query execution).
Then you can optimize. If you see that you have a lot of queries just for the year, you might create a COMPUTED COLUMN which would contain only the YEAR and put an index on it.

Related

How do you select all the rows between two values in SQLite?

I'm building an iOS app where I want to retrieve all the values from my database between two dates that the user picks. So for example, I want all the rows from the 1st of March to the 5th of March. Would look something like
SELECT * FROM MAIN WHERE DATE = '01/03/2020' AND ENDS ='05/03/2020'
So from that I would hope to retrieve all data from the 1st,2nd,3rd,4th and 5th of march. Any ideas on how to do this?
Thank you
Try to use comparison operators like:
DATE >= '01/03/2020' AND DATE <= '05/03/2020'
There are two issues:
Date types:
As Datatypes In SQLite Version 3 says:
2.2. Date and Time Datatype
SQLite does not have a storage class set aside for storing dates and/or times. Instead, the built-in Date And Time Functions of SQLite are capable of storing dates and times as TEXT, REAL, or INTEGER values:
TEXT as ISO8601 strings ("YYYY-MM-DD HH:MM:SS.SSS").
REAL as Julian day numbers, the number of days since noon in Greenwich on November 24, 4714 B.C. according to the proleptic Gregorian calendar.
INTEGER as Unix Time, the number of seconds since 1970-01-01 00:00:00 UTC.
Applications can chose to store dates and times in any of these formats and freely convert between formats using the built-in date and time functions.
So storing dates in a dd/MM/yyyy format (using the DateFormatter capitalization convention) is problematic because in the absence of a native date type, it’s going to store them as strings, and therefore all comparisons will be done alphabetically, not chronologically, sorting values like 03/10/2009 (or nonsense strings like 02foobar, for that matter) in between the strings 01/05/2020 and 05/05/2020.
If, however you store them as yyyy-MM-dd, then it just so happens that alphabetical comparisons will yield chronologically correct comparisons, too.
SQL syntax:
Once you have your dates in your database in a format that is comparable, then if you have all of your dates in a single column, you can use the BETWEEN syntax. For example, let’s say you stored all of your dates in yyyy-MM-dd format, then you could do things like:
SELECT * FROM main WHERE date BETWEEN '2020-03-01' AND '2020-03-05';
But needless to say, you can’t use this pattern (or any comparison operators other than equality) as long as your dates are stored in dd/MM/yyyy format.
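The ordering problem is easy to reproduce outside SQLite. The Ruby sketch below compares the same three dates as dd/MM/yyyy strings and as ISO strings; the helper name to_iso and the sample values are assumptions made for this illustration.

```ruby
require "date"

# Hypothetical helper: convert a dd/MM/yyyy string to yyyy-MM-dd.
def to_iso(text)
  Date.strptime(text, "%d/%m/%Y").iso8601
end

raw = ["01/03/2020", "03/10/2009", "05/03/2020"]

# Plain string comparison on dd/MM/yyyy: the 2009 date slips into the range
# because "03/..." sorts between "01/..." and "05/...".
wrong = raw.select { |d| d >= "01/03/2020" && d <= "05/03/2020" }

# After converting to ISO, string order and chronological order agree.
iso = raw.map { |d| to_iso(d) }
right = iso.select { |d| d.between?("2020-03-01", "2020-03-05") }

puts wrong.inspect
puts right.inspect
```

Converting once, at the point where data enters the database, is usually simpler than converting inside every query.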
If you want to show all the data that has values of column "date" between these two dates then:
Select *
from MAIN
where `date` between '01.03.2020' and '05.03.2020';
If you want to show all the data that has values of column "ends" between these two dates then:
Select *
from MAIN
where ends between '01.03.2020' and '05.03.2020';
If you want to show all the data that has values of columns "date" and "ends" between these two dates then:
Select *
from MAIN
where ends between '01.03.2020' and '05.03.2020'
and `date` between '01.03.2020' and '05.03.2020';

Query influxdb for a date

I have a table in InfluxDB that has a column called 'expirydate'. In the column I have a few dates, e.g. "2016-07-14" or "2016-08-20". I want to select only the 2016-07-14 date, but I am unsure how.
My query is currently:
SELECT * FROM tablee where expirydate = '2016-07-14' limit 1000
But this does not work. Can someone please help me?
Assuming the value tablee is a valid measurement...
If you are looking at selecting all of the points for the day '2016-07-14', then your query should look something like.
Query:
SELECT * FROM tablee where time >= '2016-07-14 00:00:00' and time < '2016-07-15 00:00:00'
You might also be interested in the influx's date time string in query.
See:
https://docs.influxdata.com/influxdb/v0.9/query_language/data_exploration/#relative-time
Date time strings Specify time with date time strings. Date time
strings can take two formats: YYYY-MM-DD HH:MM:SS.nnnnnnnnn and
YYYY-MM-DDTHH:MM:SS.nnnnnnnnnZ, where the second specification is
RFC3339. Nanoseconds (nnnnnnnnn) are optional in both formats.
Note:
The LIMIT clause may be redundant in your original query; all it does is stop the query from returning more than 1,000 points.
I had to force influx to treat my 'string date' as a string. This works:
SELECT * FROM tablee where expirydate=~ /2016-07-14/ limit 1000;

Store the day of the week and time?

I have a two-part question about storing days of the week and time in a database. I'm using Rails 4.0, Ruby 2.0.0, and Postgres.
I have certain events, and those events have a schedule. For the event "Skydiving", for example, I might have Tuesday and Wednesday and 3 pm.
Is there a way for me to store the record for Tuesday and Wednesday in one row or should I have two records?
What is the best way to store the day and time? Is there a way to store day of week and time (not datetime) or should these be separate columns? If they should be separate, how would I store the day of the week? I was thinking of storing them as integer values, 0 for Sunday, 1 for Monday, since that's how the wday method for the Time class does it.
Any suggestions would be super helpful.
Is there a way for me to store the record for Tuesday and Wednesday in one row or should I have two records?
There are several ways to store multiple time ranges in a single row. @bma already provided a couple of them. That might be useful to save disk space with very simple time patterns. The clean, flexible and "normalized" approach is to store one row per time range.
What is the best way to store the day and time?
Use a timestamp (or timestamptz if multiple time zones may be involved). Pick an arbitrary "staging" week and just ignore the date part while using the day and time aspect of the timestamp. Simplest and fastest in my experience, and all date and time related sanity-checks are built-in automatically. I use a range starting with 1996-01-01 00:00 for several similar applications for two reasons:
The first 7 days of the week coincide with the day of the month (for sun = 7).
It's the most recent leap year (providing Feb. 29 for yearly patterns) at the same time.
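Since the question is about a Rails application, here is a minimal Ruby sketch of how a weekday and a time of day could be mapped onto that staging week before being saved. The helper name staging_time is hypothetical, and weekdays are assumed to be numbered 1 (Monday) to 7 (Sunday) so that they line up with the day of the month.

```ruby
# Hypothetical helper: place an ISO weekday (1 = Monday .. 7 = Sunday) and a
# time of day inside the staging week that starts on Monday 1996-01-01.
def staging_time(iso_weekday, hour, minute = 0)
  unless (1..7).cover?(iso_weekday)
    raise ArgumentError, "weekday must be 1..7, got #{iso_weekday}"
  end
  # The day of the month equals the ISO weekday in this particular week.
  Time.utc(1996, 1, iso_weekday, hour, minute)
end

tuesday_3pm = staging_time(2, 15)
sunday_3pm  = staging_time(7, 15)

puts tuesday_3pm.strftime("%A %H:%M") # Tuesday 15:00
puts sunday_3pm.strftime("%A %H:%M")  # Sunday 15:00
```

Reading the value back needs no custom decoding: the usual weekday and time accessors work on the stored timestamp.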
Range type
Since you are actually dealing with time ranges (not just "day and time") I suggest to use the built-in range type tsrange (or tstzrange). A major advantage: you can use the arsenal of built-in Range Functions and Operators. Requires Postgres 9.2 or later.
For instance, you can have an exclusion constraint building on that (implemented internally by way of a fully functional GiST index that may provide additional benefit), to rule out overlapping time ranges. Consider this related answer for details:
Preventing adjacent/overlapping entries with EXCLUDE in PostgreSQL
For this particular exclusion constraint (no overlapping ranges per event), you need to include the integer column event_id in the constraint, so you need to install the additional module btree_gist. Install once per database with:
CREATE EXTENSION btree_gist; -- once per db
Or you can have one simple CHECK constraint to restrict the allowed time period using the "range is contained by" operator <@.
Could look like this:
CREATE TABLE event (event_id serial PRIMARY KEY, ...);
CREATE TABLE schedule (
event_id integer NOT NULL REFERENCES event(event_id)
ON DELETE CASCADE ON UPDATE CASCADE
, t_range tsrange
, PRIMARY KEY (event_id, t_range)
, CHECK (t_range <@ '[1996-01-01 00:00, 1996-01-09 00:00)') -- restrict period
, EXCLUDE USING gist (event_id WITH =, t_range WITH &&) -- disallow overlap
);
For a weekly schedule use the first seven days, Mon-Sun, or whatever suits you. Monthly or yearly schedules in a similar fashion.
How to extract day of week, time, etc?
@CDub provided a module to deal with it on the Ruby end. I can't comment on that, but you can do everything in Postgres as well, with impeccable performance.
SELECT ts::time AS t_time -- get the time (practically no cost)
SELECT EXTRACT(DOW FROM ts) AS dow -- get day of week (very cheap)
Or in similar fashion for range types:
SELECT EXTRACT(DOW FROM lower(t_range)) AS dow_from -- day of week lower bound
, EXTRACT(DOW FROM upper(t_range)) AS dow_to -- same for upper
, lower(t_range)::time AS time_from -- start time
, upper(t_range)::time AS time_to -- end time
FROM schedule;
Using ISODOW instead of DOW with EXTRACT() returns 7 instead of 0 for Sundays. There is a long list of what you can extract.
This related answer demonstrates how to use range type operator to compute a total duration for time ranges (last chapter):
Calculate working hours between 2 dates in PostgreSQL
Check out the ice_cube gem.
It can create a schedule object for you which you can persist to your database. You need not create two separate records. For the second part, you can create schedule based on any rule and you need not worry on how that will be saved in the database. You can use the methods provided by the gem to get whatever information you want from the persisted schedule object.
Depending how complex your scheduling needs are, you might want to have a look at RFC 5545, the iCalendar scheduling data format, for ideas on how to store the data.
If your needs are pretty simple, then that is probably overkill. PostgreSQL has many functions to convert date and time to whatever format you need.
For a simple way to store relative dates and times, you could store the day of week as an integer as you suggested, and the time as a TIME datatype. If you can have multiple days of the week that are valid, you might want to use an ARRAY.
Eg.
ARRAY[2,3]::INTEGER[] = Tues, Wed as Day of Week
'15:00:00'::TIME = 3pm
[EDIT: Add some simple examples]
/* Custom range type over time values */
CREATE TYPE timerange AS RANGE (subtype = time);
--drop table if exists schedule;
create table schedule (
event_id integer not null, /* should be an FK to "events" table */
day_of_week integer[],
time_of_day time,
time_range timerange,
recurring text CHECK (recurring IN ('DAILY','WEEKLY','MONTHLY','YEARLY'))
);
insert into schedule (event_id, day_of_week, time_of_day, time_range, recurring)
values
(1, ARRAY[1,2,3,4,5]::INTEGER[], '15:00:00'::TIME, NULL, 'WEEKLY'),
(2, ARRAY[6,0]::INTEGER[], NULL, '(08:00:00,17:00:00]'::timerange, 'WEEKLY');
select * from schedule;
event_id | day_of_week | time_of_day | time_range | recurring
----------+-------------+-------------+---------------------+-----------
1 | {1,2,3,4,5} | 15:00:00 | | WEEKLY
2 | {6,0} | | (08:00:00,17:00:00] | WEEKLY
The first entry could be read as: the event is valid at 3pm Mon - Fri, with this schedule occurring every week.
The second entry could be read as: the event is valid Saturday and Sunday between 8am and 5pm, occurring every week.
The custom range type "timerange" is used to denote the lower and upper boundaries of your time range.
The leading '(' means "exclusive", and the trailing ']' means "inclusive", or in other words "later than 8am and up to and including 5pm".
Why not just store the datestamp then use the built in functionality for Date to get the day of the week?
2.0.0p247 :139 > Date.today
=> Sun, 10 Nov 2013
2.0.0p247 :140 > Date.today.strftime("%A")
=> "Sunday"
strftime sounds like it can do everything for you. Here are the specific docs for it.
Specifically for what you're talking about, it sounds like you'd need an Event table that has_many :schedules, where a Schedule would have a start_date timestamp...

Rails ActiveRecord Query - how to add a certain hour offset to every record in the created_at field in a query

I need to run a group_by query in Ruby on Rails, but I first want to adjust all records in the created_at column by a certain hour amount before running the query. So, for example, adding 9 hours to every record in the created_at field, and then grouping by date.
Something like the following (which is incorrect):
@foo = Bar.group("date(created_at + 9.hours)").count
How can I accomplish this in Rails?
PostgreSQL has excellent support for manipulating dates and times (see Date/Time Functions and Operators). You can express '9 hours' as an interval, add it to a timestamp, and cast to a date:
=> select (now()::timestamp + '9 hours'::interval)::date;
date
------------
2012-09-22
(1 row)
This ends up strikingly similar to your original pseudocode:
@foo = Bar.group("date(created_at + '9 hours'::interval)").count
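To sanity-check what such a query groups by, the same shift can be reproduced in plain Ruby on a handful of Time values. This is only an illustration of the semantics, assuming UTC timestamps; count_by_shifted_date is a hypothetical helper, not a Rails API, and the sample times are invented.

```ruby
require "date"

# Hypothetical helper: shift each time by offset_hours, then count per date.
def count_by_shifted_date(times, offset_hours)
  times
    .group_by { |t| (t + offset_hours * 3600).to_date }
    .transform_values(&:size)
end

created = [
  Time.utc(2012, 9, 21, 10, 0), # 19:00 on the 21st after the shift
  Time.utc(2012, 9, 21, 16, 0), # 01:00 on the 22nd after the shift
  Time.utc(2012, 9, 21, 23, 0)  # 08:00 on the 22nd after the shift
]

counts = count_by_shifted_date(created, 9)
counts.each { |date, n| puts "#{date}: #{n}" }
```

For real data the grouping belongs in the database, as in the query above; a check like this is mainly useful in a test.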

sqlite Date Sorting

I am parsing a file into a SQLite database that contains dates in the YYYY-MM-DD format. I want to store the entries in SQLite in such a way that I can sort them by date (strings are not cutting it). What is the normal protocol for storing and ordering dates in SQLite? Should I convert the dates into numbers? Is there a way to convert YYYY-MM-DD dates into timestamps?
SQLite supports "DATE" in table creation. (More about that later.)
CREATE TABLE test (dt DATE PRIMARY KEY);
INSERT INTO "test" VALUES('2012-01-01');
INSERT INTO "test" VALUES('2012-01-02');
INSERT INTO "test" VALUES('2012-01-03');
SELECT dt FROM test ORDER BY dt;
2012-01-01
2012-01-02
2012-01-03
Values in the form yyyy-mm-dd sort correctly as either a string or a date. That's one reason yyyy-mm-dd is an international standard.
But SQLite doesn't use data types in the way most database workers expect it. Data storage is based on storage classes instead. For example, SQLite allows this.
INSERT INTO test VALUES ('Oh, bugger.');
SELECT * FROM test ORDER BY dt;
2012-01-01
2012-01-02
2012-01-03
Oh, bugger.
It also allows different date "formats" (actually, values) in a single column. Its behavior is quite unlike standard SQL engines.
INSERT INTO test VALUES ('01/02/2012');
SELECT * FROM test ORDER BY dt;
01/02/2012
2012-01-01
2012-01-02
2012-01-03
Oh, bugger.
You don't have to do anything special to store a timestamp in a date column. (Although I'd rather see you declare the column as timestamp, myself.)
INSERT INTO test VALUES ('2012-01-01 11:00:00');
SELECT * FROM test ORDER BY dt;
2012-01-01
2012-01-01 11:00:00
2012-01-02
2012-01-03
Oh, bugger.
SQLite will try to do the Right Thing as long as you feed consistent data into it. And it will sort dates correctly if you use the standard format.
Instead of storing the date as "YYYY-MM-DD", store its timestamp; that will help you sort the table.
If you want the current timestamp, use
SELECT strftime('%s','now');
If you want the timestamp of a YYYY-MM-DD date, use
SELECT strftime('%s','YYYY-MM-DD');
where %s = seconds since 1970-01-01
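If the conversion has to happen in application code before inserting, a comparable value can be computed there. The Ruby sketch below assumes the date should be interpreted as midnight UTC; unix_seconds is a hypothetical helper name.

```ruby
require "date"

# Hypothetical helper: seconds since 1970-01-01 00:00:00 UTC for a
# YYYY-MM-DD string, taken as midnight UTC.
def unix_seconds(iso_date)
  d = Date.iso8601(iso_date)
  Time.utc(d.year, d.month, d.day).to_i
end

dates = ["2012-01-03", "2012-01-01", "2012-01-02"]
sorted = dates.sort_by { |s| unix_seconds(s) }

puts unix_seconds("2012-01-01") # 1325376000
puts sorted.inspect
```

Sorting the integers gives the same order as sorting the ISO strings, so the numeric form mostly pays off when date arithmetic is needed as well.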
I have the date field stored as DD/MM/YYYY.
To sort by date (the date field is a string), I have to convert it before ordering:
select (substr(date, 7, 4) || '-' || substr(date, 4, 2) || '-' || substr(date, 1, 2)) as new_date from work_hour order by new_date desc
