I have a table of Albums that has a date column named release_date.
I want to get a list of all the month + year combinations present along with the number of albums released in that month/year.
So, the output might be something like:
November 2016 - 11
October 2016 - 4
July 2016 - 19
December 2015 - 2
Ruby 2.3.1 w/ Rails 5 on Postgres 9.6, FWIW.
The database layer is where this task belongs, not Ruby:
Album.group("TO_CHAR(release_date, 'FMMonth YYYY')").count
Why use the database layer? It is much faster than doing the work in Ruby, it uses far fewer resources, and it scales: with a large number of Album records, loading them all into Ruby can exhaust memory before the processing ever finishes. (The FM prefix in the format string stops TO_CHAR from padding the month name to nine characters.)
I'm assuming your model is the singular Album per Rails convention. If not, consider changing it.
Album.all.map { |album| [Date::MONTHNAMES[album.release_date.month], album.release_date.year].join(' ') }
.each_with_object(Hash.new(0)) { |month_year, counts| counts[month_year] += 1 }
Explanation:
The .map method iterates over the albums and returns an array of strings consisting of ["month year", "month year", ... ].
The .each_with_object method is a standard counting algorithm that returns a hash with a count for each unique array item.
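A minimal, database-free sketch of the counting approach above, assuming the release dates are already loaded into a plain array. The sample dates and the method name month_year_counts are made up for illustration. Grouping on [year, month] first keeps the result sortable chronologically, which a "Month YYYY" string key is not.

```ruby
require 'date'

# Hypothetical helper: counts dates per month/year, newest month first.
def month_year_counts(dates)
  counts = dates.each_with_object(Hash.new(0)) do |date, acc|
    acc[[date.year, date.month]] += 1
  end
  # Sorting on the [year, month] key gives chronological order.
  counts.sort.reverse.map do |(year, month), count|
    ["#{Date::MONTHNAMES[month]} #{year}", count]
  end
end

# Made-up sample data standing in for Album release dates.
release_dates = [
  Date.new(2016, 11, 3), Date.new(2016, 11, 20),
  Date.new(2016, 7, 1), Date.new(2015, 12, 25)
]

month_year_counts(release_dates).each do |label, count|
  puts "#{label} - #{count}"
end
```

This still loads every record into Ruby, so it only suits small tables.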
Related
Let's say I have three users created on different dates. Now I want to have a graph of user progression. So I want to get something like:
{
Thu, 02 Nov 2017=>1,
Sat, 04 Feb 2017=>2,
Wed, 21 Mar 2018=>3
}
It's very similar to grouping by created_at::date, but I want to have number of all the records created before this date rather than number of items created exactly on this date.
How can I achieve this using group_by and aggregate functions in Postgresql? I need it for Ruby on Rails project, but I expect simple vanilla SQL, no maps and complex queries.
I am no Rails expert, and my solution requires loading a big relation like:
@u = User.all
And then find the smallest creation date:
@start = @u.minimum("created_at")
Then you can calculate the number of days between creation and now:
@days = ((Time.now - @start) / 1.day).to_i
Then you can calculate for each day the number of users already created:
(0..@days).each do |day|
  puts "day " + day.to_s
  puts @u.where("created_at < ?", @start + day.days).count
end
There are probably easier solutions in SQL, though. (You may also take only a subset of users by choosing a range of dates, so as not to exceed the server memory.)
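The loop above runs one query per day. An alternative, sketched here without a database, is to fetch the per-day counts once (for example with a grouped count) and accumulate a running total in Ruby. The daily_counts hash and the method name cumulative_counts are made-up illustrations, not part of the original question.

```ruby
require 'date'

# Hypothetical helper: turns {date => records created that day}
# into {date => records created on or before that day}.
def cumulative_counts(daily_counts)
  total = 0
  # Hash#sort orders the [date, count] pairs by date.
  daily_counts.sort.each_with_object({}) do |(date, count), acc|
    total += count
    acc[date] = total
  end
end

# Made-up sample: three users created on three different days.
daily_counts = {
  Date.new(2018, 3, 21) => 1,
  Date.new(2017, 2, 4)  => 1,
  Date.new(2017, 11, 2) => 1
}

cumulative_counts(daily_counts).each do |date, total|
  puts "#{date} => #{total}"
end
```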
I have a two-part question about storing days of the week and time in a database. I'm using Rails 4.0, Ruby 2.0.0, and Postgres.
I have certain events, and those events have a schedule. For the event "Skydiving", for example, I might have Tuesday and Wednesday and 3 pm.
Is there a way for me to store the record for Tuesday and Wednesday in one row or should I have two records?
What is the best way to store the day and time? Is there a way to store day of week and time (not datetime) or should these be separate columns? If they should be separate, how would I store the day of the week? I was thinking of storing them as integer values, 0 for Sunday, 1 for Monday, since that's how the wday method for the Time class does it.
Any suggestions would be super helpful.
Is there a way for me to store the record for Tuesday and
Wednesday in one row or should I have two records?
There are several ways to store multiple time ranges in a single row. @bma already provided a couple of them. That might be useful to save disk space with very simple time patterns. The clean, flexible and "normalized" approach is to store one row per time range.
What is the best way to store the day and time?
Use a timestamp (or timestamptz if multiple time zones may be involved). Pick an arbitrary "staging" week and just ignore the date part while using the day and time aspect of the timestamp. Simplest and fastest in my experience, and all date and time related sanity-checks are built-in automatically. I use a range starting with 1996-01-01 00:00 for several similar applications for two reasons:
The first 7 days of the month coincide with the ISO day of the week (Monday = 1 through Sunday = 7).
It is also a leap year (providing Feb. 29 for yearly patterns).
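To make the "staging week" idea concrete, here is a small plain-Ruby sketch that maps a weekday and a time of day to a timestamp in the week starting 1996-01-01. The constant STAGING_MONDAY and the method staging_time are hypothetical names chosen for this example, and UTC is assumed for simplicity.

```ruby
require 'date'

# First day of the staging week described above (a Monday).
STAGING_MONDAY = Date.new(1996, 1, 1)

# Hypothetical helper: maps an ISO weekday (1 = Monday .. 7 = Sunday)
# plus hour and minute to a timestamp inside the staging week.
def staging_time(iso_weekday, hour, minute = 0)
  day = STAGING_MONDAY + (iso_weekday - 1)
  Time.utc(day.year, day.month, day.day, hour, minute)
end

tuesday_3pm = staging_time(2, 15)
puts tuesday_3pm                       # the timestamp to store
puts tuesday_3pm.day                   # day of month equals the ISO weekday
puts tuesday_3pm.strftime('%A %H:%M')  # weekday name and time read back
```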
Range type
Since you are actually dealing with time ranges (not just "day and time") I suggest to use the built-in range type tsrange (or tstzrange). A major advantage: you can use the arsenal of built-in Range Functions and Operators. Requires Postgres 9.2 or later.
For instance, you can have an exclusion constraint building on that (implemented internally by way of a fully functional GiST index that may provide additional benefit), to rule out overlapping time ranges. Consider this related answer for details:
Preventing adjacent/overlapping entries with EXCLUDE in PostgreSQL
For this particular exclusion constraint (no overlapping ranges per event), you need to include the integer column event_id in the constraint, so you need to install the additional module btree_gist. Install once per database with:
CREATE EXTENSION btree_gist; -- once per db
Or you can have one simple CHECK constraint to restrict the allowed time period using the "range is contained by" operator <@.
Could look like this:
CREATE TABLE event (event_id serial PRIMARY KEY, ...);
CREATE TABLE schedule (
   event_id integer NOT NULL REFERENCES event(event_id)
                    ON DELETE CASCADE ON UPDATE CASCADE
 , t_range  tsrange
 , PRIMARY KEY (event_id, t_range)
 , CHECK (t_range <@ '[1996-01-01 00:00, 1996-01-09 00:00)')  -- restrict period
 , EXCLUDE USING gist (event_id WITH =, t_range WITH &&)      -- disallow overlap
);
For a weekly schedule use the first seven days, Mon-Sun, or whatever suits you. Monthly or yearly schedules in a similar fashion.
How to extract day of week, time, etc?
@CDub provided a module to deal with it on the Ruby end. I can't comment on that, but you can do everything in Postgres as well, with impeccable performance.
SELECT ts::time AS t_time -- get the time (practically no cost)
SELECT EXTRACT(DOW FROM ts) AS dow -- get day of week (very cheap)
Or in similar fashion for range types:
SELECT EXTRACT(DOW FROM lower(t_range)) AS dow_from -- day of week lower bound
, EXTRACT(DOW FROM upper(t_range)) AS dow_to -- same for upper
, lower(t_range)::time AS time_from -- start time
, upper(t_range)::time AS time_to -- end time
FROM schedule;
ISODOW instead of DOW for EXTRACT() returns 7 instead of 0 for Sundays. There is a long list of what you can extract.
This related answer demonstrates how to use range type operator to compute a total duration for time ranges (last chapter):
Calculate working hours between 2 dates in PostgreSQL
Check out the ice_cube gem.
It can create a schedule object for you which you can persist to your database. You need not create two separate records. For the second part, you can create schedule based on any rule and you need not worry on how that will be saved in the database. You can use the methods provided by the gem to get whatever information you want from the persisted schedule object.
Depending how complex your scheduling needs are, you might want to have a look at RFC 5545, the iCalendar scheduling data format, for ideas on how to store the data.
If your needs are pretty simple, then that is probably overkill. PostgreSQL has many functions to convert date and time to whatever format you need.
For a simple way to store relative dates and times, you could store the day of week as an integer as you suggested, and the time as a TIME datatype. If you can have multiple days of the week that are valid, you might want to use an ARRAY.
Eg.
ARRAY[2,3]::INTEGER[] = Tues, Wed as Day of Week
'15:00:00'::TIME = 3pm
[EDIT: Add some simple examples]
/* Custom range type over the time data type */
CREATE TYPE timerange AS RANGE (subtype = time);
--drop table if exists schedule;
create table schedule (
event_id integer not null, /* should be an FK to "events" table */
day_of_week integer[],
time_of_day time,
time_range timerange,
recurring text CHECK (recurring IN ('DAILY','WEEKLY','MONTHLY','YEARLY'))
);
insert into schedule (event_id, day_of_week, time_of_day, time_range, recurring)
values
(1, ARRAY[1,2,3,4,5]::INTEGER[], '15:00:00'::TIME, NULL, 'WEEKLY'),
(2, ARRAY[6,0]::INTEGER[], NULL, '(08:00:00,17:00:00]'::timerange, 'WEEKLY');
select * from schedule;
 event_id | day_of_week | time_of_day |     time_range      | recurring
----------+-------------+-------------+---------------------+-----------
        1 | {1,2,3,4,5} | 15:00:00    |                     | WEEKLY
        2 | {6,0}       |             | (08:00:00,17:00:00] | WEEKLY
The first entry could be read as: the event is valid at 3pm Mon - Fri, with this schedule occurring every week.
The second entry could be read as: the event is valid Saturday and Sunday between 8am and 5pm, occurring every week.
The custom range type "timerange" is used to denote the lower and upper boundaries of your time range.
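In range literals, '[' and ']' mean "inclusive" while '(' and ')' mean "exclusive". The literal '(08:00:00,17:00:00]' used above therefore reads "later than 8am and up to and including 5pm". To express "greater than or equal to 8am and less than 5pm", write '[08:00:00,17:00:00)' instead.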
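In PostgreSQL's range notation a square bracket includes the bound and a parenthesis excludes it. The following plain-Ruby sketch mimics those semantics so the four combinations can be compared side by side. The helper in_range? is hypothetical and exists only for this illustration; it is not how PostgreSQL evaluates ranges internally.

```ruby
# Hypothetical helper mimicking PostgreSQL range bound notation:
# '[' / ']' include the bound, '(' / ')' exclude it.
def in_range?(value, lower, upper, bounds)
  above = bounds[0] == '[' ? value >= lower : value > lower
  below = bounds[1] == ']' ? value <= upper : value < upper
  above && below
end

# Times as zero-padded "HH:MM" strings compare correctly as strings.
puts in_range?('08:00', '08:00', '17:00', '(]')  # lower bound excluded
puts in_range?('17:00', '08:00', '17:00', '(]')  # upper bound included
puts in_range?('08:00', '08:00', '17:00', '[)')  # lower bound included
puts in_range?('17:00', '08:00', '17:00', '[)')  # upper bound excluded
```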
Why not just store the datestamp then use the built in functionality for Date to get the day of the week?
2.0.0p247 :139 > Date.today
=> Sun, 10 Nov 2013
2.0.0p247 :140 > Date.today.strftime("%A")
=> "Sunday"
strftime sounds like it can do everything for you. Here are the specific docs for it.
Specifically for what you're talking about, it sounds like you'd need an Event table that has_many :schedules, where a Schedule would have a start_date timestamp...
I want to create an array of the number of items created each hour, each day.
I'm tracking how people are feeling, so my model is called TrackMood It just has a column called mood and the timestamps.
If I do
TrackMood.where(mood: "good").group("hour(created_at)").count
I get something like
{11=>4, 12=>2, 13=>2, 15=>1}
I've got 2 issues here
1 How do I add the day into this so it doesn't just add the items created yesterday at 11 o'clock to the items added today at 11 o'clock?
2 How do I make sure it says 0 for hours when nothing is created?
1) Instead of grouping on just the hour part of the date, you'll need to group on the part of the date that is relevant, i.e. the date up to and including the hour, but nothing more specific than that. E.g.
TrackMood.where(mood: "good").group("date_format(created_at, '%Y%m%d %H')").count
2) You're always going to get a hash back from this call even if it doesn't find any groups. If you want to check how many groups there are you can call .size or .count on it.
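Hours with no records simply have no key in the returned hash, so the zeros have to be filled in afterwards. A minimal sketch in plain Ruby, assuming the grouped count has already been fetched into a hash keyed by hour (the sample counts and the method name fill_missing_hours are made up):

```ruby
# Made-up result of a grouped count: {hour => number of records}.
counts = { 11 => 4, 12 => 2, 13 => 2, 15 => 1 }

# Hypothetical helper: returns a hash with every hour from 0 to 23,
# using 0 where the query returned no row.
def fill_missing_hours(counts)
  (0..23).to_h { |hour| [hour, counts.fetch(hour, 0)] }
end

filled = fill_missing_hours(counts)
puts filled[11]   # count taken from the query result
puts filled[14]   # hour with no records
puts filled.size  # one entry per hour of the day
```

The same idea extends to per-day results by iterating over every [date, hour] pair in the period of interest instead of only the hours.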
For PostgreSQL you can use date_part
SO-post - Rails & Postgresql: how to group queries by hour?
Sorry if that question sounds strange, but I'm diving into Rails and I'm still learning the jargon. Basically, I'm trying to create a single-pass query that uses the value of one of the model's attributes in a calculation in the query (assuming that's even possible).
I have a Tournament model that has a start_date attribute that is a DateTime object. I'm trying to create a query that returns all the Tournaments that have a start_date no older than 1 hour + the length of the tournament, or put another way, all tournaments that haven't yet started or have started, but haven't ended longer than an hour ago. My current query, which doesn't work, looks like this...
validTourneys = Tournament.where("start_date > (? - duration_in_mins)", (DateTime.now.utc - 1.hour))
where duration_in_mins is an integer attribute of the Tournament model, but this query doesn't work and it seems to be returning all the Tournaments all the time. I'd like to include duration_in_mins in the (DateTime.now.utc - 1.hour) part of the calculation, but I don't know how to reference it, which is why I included it in the string part of the query, hoping that would work. Am I at least on the right track?
I should mention I'm using SQLite for development and PostgreSQL for production.
Thanks for your wisdom!
The problem is that if you subtract a plain integer from a DateTime object, you are not subtracting minutes but days.
# This works as expected
dt = DateTime.now # Thu, 28 Apr 2011 09:55:14 +0900
an_hour_ago = dt - 1.hour # Thu, 28 Apr 2011 08:55:14 +0900
# But, this does not...
two_hours_in_minutes = 120
two_hours_ago = dt - two_hours_in_minutes # Wed, 29 Dec 2010 09:55:14 +0900
In the last example, 120 days are subtracted instead of 120 minutes. Something similar is probably happening in your query. You have to convert duration_in_mins to days and then subtract.
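A minimal sketch of that conversion in plain Ruby (no ActiveSupport), assuming the duration is an integer number of minutes. The helper name minus_minutes and the constant MINUTES_PER_DAY are made up for this example.

```ruby
require 'date'

MINUTES_PER_DAY = 24 * 60

# Hypothetical helper: subtracts a number of minutes from a DateTime by
# converting the minutes to a fraction of a day first.
def minus_minutes(datetime, minutes)
  datetime - Rational(minutes, MINUTES_PER_DAY)
end

dt = DateTime.new(2011, 4, 28, 9, 55, 14)
puts dt - 120                # plain integer: 120 days earlier
puts minus_minutes(dt, 120)  # 120 minutes earlier
```

With ActiveSupport loaded, dt - 120.minutes expresses the same thing more directly.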
I don't know enough about SQL to answer your question directly (I think this will probably also depend on what database you're using, so you might want to mention that).
Have you considered, though, having start_date and end_date as DateTime columns instead of start_date and duration_in_mins? If this is going to be a common query, that would certainly make it more performant, as well as making your code easier to read and understand.
This query will only work if your database is smart enough to know how to subtract an integer from what I am assuming is a DateTime. And I can't think of a database that will do that correctly the way you have it coded. No database will assume minutes. Some might do ticks, seconds, or days.
This part of the calculation
(? - duration_in_mins)
is going to happen on the database, not in Ruby-land.
I need to find all records that were created on a specific day of week.
I only have available to me the standard model datetime timestamps.
How would I go about doing this in activerecord?
To follow up on Justin's answer
where("extract(dow from created_at) = ?", Date.today.wday)
This is what I'm using in my application for Postgres. This will find all records that were created on the same day of the week as today. For example, if today were Tuesday, it would find all records created on Tuesdays.
You can use the DAYOFWEEK function in MySQL and pass it to the :conditions option. Supposing you have a model called Item, this would return all of the items created on Sunday:
Item.all(:conditions => ['dayofweek(created_at) = ?', 1])
Using Postgres you could do something similar with to_char.
Note that using a function like this will probably make the database do a full table scan, since MySQL versions before 8.0.13 don't support an index on an expression. You may want to consider extracting the day of week out to another column if this is something that you anticipate doing frequently.
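Obtain the seconds since the Unix epoch. Time#to_i does this in Ruby.
Divide by 86400 to get whole days, then take the result modulo 7 to obtain a day-of-week number (0 to 6). The epoch began on a Thursday, so add 4 before taking the modulus if 0 should mean Sunday, as it does for wday. The result is the day of the week in UTC.
dayOfWeek = ((epochseconds / 86400) + 4) % 7;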
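A short Ruby sketch of the epoch arithmetic, checked against Time#wday. Note that the raw day count modulo 7 starts counting at Thursday, because 1970-01-01 was a Thursday; the sketch adds an offset of 4 so that 0 means Sunday. The method name epoch_wday is hypothetical, and the result is always in UTC.

```ruby
SECONDS_PER_DAY = 86_400

# Hypothetical helper: day of week (0 = Sunday .. 6 = Saturday) from
# seconds since the Unix epoch, interpreted in UTC.
def epoch_wday(epoch_seconds)
  # Day 0 of the epoch (1970-01-01) was a Thursday, i.e. wday 4.
  ((epoch_seconds / SECONDS_PER_DAY) + 4) % 7
end

t = Time.utc(2013, 11, 10, 12, 0, 0)  # a Sunday
puts epoch_wday(t.to_i)  # computed from epoch seconds
puts t.wday              # Ruby's built-in answer for comparison
```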
If you're not opposed to using ruby you could try this.
array.select { |arr| ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday"].include?(arr.created_at.strftime('%A')) }
I originally tried using dayofweek that was suggested in another answer.
The issue I ran into was that my SQL server seemed to be using UTC time while my Rails server was using US Eastern time. Records created after 8pm would be picked up, while those created before would be considered the previous day.
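The mismatch can be reproduced in plain Ruby: the same instant falls on different weekdays depending on the offset it is viewed in. This sketch assumes a fixed -04:00 offset (US Eastern daylight time, which matches the 8pm cutoff described above), and the sample timestamp is made up.

```ruby
# One instant, viewed in UTC and at a fixed -04:00 offset
# (assumed here to stand in for US Eastern daylight time).
utc_time   = Time.utc(2013, 7, 15, 0, 30, 0)
local_time = utc_time.getlocal('-04:00')

puts utc_time.strftime('%A %H:%M')    # weekday as the database sees it
puts local_time.strftime('%A %H:%M')  # weekday as the user sees it
```

A day-of-week filter therefore needs to convert the column to the application's time zone before extracting the weekday, or compare against a weekday computed in UTC.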
Here is another related question:
How to filter by day of week in Rails 4.2 and sqlite?