Storing large amount of boolean values in Rails - ruby-on-rails

I am to store quite large amount of boolean values in database used by Rails application - it needs to store 60 boolean values in single record per day. What is best way to do this in Rails?
Queries that I will need to program or execute:
* CRUD
* summing up how many true values are for each day
* possibly (but not nessesarily) other reports like how often true is recorded in each of field
UPDATE: This is to store events that may or may not occur in 5 minute intervals between 9am and 1pm. If it occurs, then I need to set it to true, if not then false. Measurements are done manually and users will be reporting these information using checkboxes on the website. There might be small updates, but most of the time it's just one time entry and then queries as listed above.
UPDATE 2: 60 values per day is per one user, there will be between 1000-2000 users. If there isn't some library that helps with that, I will go for simplest approach and deal with it later if I will get issues with performance. Every day user reports events by checking desired checkboxes on the website, so there is normally a single data entry moment per day (or few if not done on daily basis).

This is dependent on a lot of different things. Do you need callbacks to run? Do you need AR objects instantiated? What is the frequency of these updates? Is it done frequently but not many at a time or rarely but a bunch at once? Could you represent these booleans as a mask instead? We definitely need more context.

Why do these need to be in a single record? Can't you use a 'days' table to tie them all together, then use a day_id column in your 'events' table?
Specify in the Day model that it 'has_many :events' and specify in the Event model file that it 'belongs_to :day'. Then you can find all the events for a day with just the id for the day.
For the third day record, you'd do this:
this_day = Day.find 3
Then you can you use 'this_day.events' to get all the events for that day.
You'll need to decide what you wish to use to identify each day so you query for a day's events using something that you understand. The id column I used above to find it probably won't work.
You could use the timestamp first moment of each day to do that, for example. Or you could rely upon the 'created_at' column of the table to be between the start and end of a day
And you'll want to be sure to thing about what time zone you are using and how this will be stored in the database.
And if your data will be stored close to midnight, daylight savings time could also be an issue. I find it best to use GMT to avoid that issue.
Good luck.

Related

better design for fact table where each row has a Start & End Date

My fact table contains details for clients who attend a course.
To ensure i can get a list of clients registered on any particular day, I have not related the date dimension to the fact table.
Instead i created a measure that does basic between logic (where startDate <= selectedDate && endDate >=SelectedDate)
This allows me to find all clients registered on one single selected day.
There are a few drawback to this however:
-I have to ensure the report user only selects a single day, i.e. they cannot select a date range.
-I cant easily do counts for samePeriodLastMonth or Year.
Is there a better design i should consider that will still allow me to see counts of registered clients on any given day, along with allowing me to use SamePeriodLastMonth/Year functionality?
Would you mind uploading the structure of your fact and dim tables?
Just a thought bubble: if you would like to measure counts for a program over calendar years, I believe you would definitely need to create a Date dimension. Also depending on your reporting needs you might want to consider whether you need an Accumulating Snapshot Fact table.
Please find further details on this:
http://www.kimballgroup.com/2012/05/design-tip-145-time-stamping-accumulating-snapshot-fact-tables/
Cheers
Nithin

Selecting record based on time

I have a model A which uses Rails timestamps and a attribute called ttl which represents the time in minutes between each update of the record.
I want to select all records that should be updated ergo:
WHERE (Time.now - updated_at).to_minutes > ttl)
I've been trying to use variations of Feed.where("(?-updated_at) > ttl", Time.now).to_a but I've not succeeded yet. Can someone help me with the query?
The reason is, that databases can't do time calculations just like arithmetic calculations.
You have to use the time function of the DB, unfortunately there is no standard.
In MySQL you could use
Feed.where("TIMESTAMPADD(MINUTE,ttl,updated_at)<?",Time.now)

Efficiently retrieving ice_cube schedules for a given time period

I'm looking into using Ice Cube https://github.com/seejohnrun/ice_cube for recurring events.
My question is, if I then need to get any events that fall within a given time period (say, on a day or within a week), is there any better way than to loop through them all like this:
items = Records.find(:all)
items.each do |item|
schedule = item.schedule
if schedule.occurs_on?(Date.new)
#if today is a recurrence, add to array
end
end
This seems horribly inefficient but I'm not sure how else to go about it.
That's one approach - but what people do more often is end up denormalizing their schedules into a format that is conveniently queryable.
You may have a collection called something like ScheduleOccurrences - that you build each week / and then query that instead.
Its unfortunate it has to work this way, but sticking to the iCal way of managing schedules has led IceCube to need to format its data in certain ways (specifically ways that can line up with the requirements of the iCal RFC).
I've been doing some thinking recently about what a library would look like that shook away some of those restrictions, for greater flexibility like this - but its definitely still a bit off.
Hope this helps
I faced a similar problem and here was my approach:
Create a column on Event table to store the next occurrence date, and write a method which stores that value after_save. (method available through ice_cube. Perhaps index column too for faster querying.)
Then you can query the database for occurrences happening in the timeframe you need. See below:
Event.where(next_occurrence: Date.today.all_day)
Store EventOccurrences on a separate table.
Update the next_occurrence column for the rows returned to you by your query. Or something similar. This works for me because I'm running a daily job, so that update next_occurrence will run regularly. But you may need to tweak a bit.

What's the correct way to handle optional time information in Rails?

I'm working on a Rails application that allows users to define Tasks, which require a due_date. The due_date may or may not include a time.
The way we're handling this right now feels like a hack. due_dates default to 12:00 AM, and in that case we don't display a time. The DateTime object doesn't allow for empty Time values as far as I know.
Should I split this information up into two columns in the database? How do you guys handle this?
Since your data structure needs to accommodate dates with and without a time, you have two choices:
Use a Ruby DateTime object with a flag value for the time to indicate that the date does not have a time. The usual flag value for this is 0 which then means the midnight time can't be shown. (Midnight is 0 seconds after the day has started.)
For example, parsing "Jan 1, 2010" into a DateTime will give you Jan 1, 2010 00:00.
Otherwise you'll need to invent your own data structure. Easiest would probably a Class with a DateTime and a "show_time" boolean flag. -- by using a DateTime to hold the time, you'll be able to use the DateTime output methods, and do arithmetic with them if needed.
Creating a new data structure is not such a big deal in Ruby, but if you can live without tasks due exactly at midnight, I'd recommend method 1. Note that you'd probably want to print them without a time since that's what the task definer requested. Or you could include "(any time)" in the output.
PS
Watch out for timezones! Many ways to handle, but you should be sure to choose one deliberately.
Splitting the attributes may be unnecessarily complex. Why not adopt a convention that if no time is specified then default it to midnight or noon on the date in question? Unless you need to be able to distinguish between the two cases, i.e. this is midnight because it was explicitly specified or this is midnight because no time was specified. If the latter then splitting them might be advisable or just add a boolean to disambiguate the cases.
If you thought you had further use for a separated date and time and would expend lots of energy splitting a single field for other reasons then that might also argue for splitting.

How would you build this daily class schedule?

What I want to do is very simple but I'm trying to find the best or most elegant way to do this. The Rails application I'm building now will have a schedule of daily classes. For each class the fields relevant to this question are:
Day of the week
Starting time
Ending time
A single entry could be something such as:
day of week: Wednesday
starting time: 10:00 am
ending time: Noon
Also I must mention that it's a bi-lingual Rails 2.2 app and I'm using the native i18n Rails feature. I actually have several questions.
Regarding the day of the week, should I create an extra table with list of days, or is there a built-in way to create that list on the fly? Keep in mind these days of the week will have to be rendered in English or Spanish in the schedule view depending on the locale variable.
While querying the schedule I will need to group and order the results by weekday, from Monday to Sunday, and of course order the classes within each day by starting time.
Regarding the starting time and ending time of each class would you use datetime fields or integer fields? If the latter how would you implement this exactly?
Looking forward to read the different suggestions you guys will come up with.
I would just store the day of the week as an integer. 0 => Monday ... 6 => Sunday (or any way you want. ie. 0 => Sunday). Then store the start time and end time as Time.
That would make grouping really easy. All you would have to do is sort by the day of the week and the start time.
You can display this in multiple ways, but here is what I would do.
Have functions like: #sunday_classes = DailyClass.find_sunday_classes that returns all the classes for Sunday sorted by start time. Then repeat for each day.
def find_sunday_classes
find_by_day_of_week(1, :order -> 'start_time')
end
Note: find_by probably should have id at the end but that's just preference in how you want to name the column.
If you want the full week then call all seven from the controller and loop trough them in the view. You could even create detail pages for each day.
Translation is the only tricky part. You can create a helper function that takes an integer and returns the text for the appropriate day of the week based on local.
That's very basic. Nothing complicated.
If your data is a Time then I would store that as a Time - otherwise you will always have to convert it out of the database when you do date and time related operations on it. The day is redundant data, as it will be part of the time object.
This should mean that you don't need to store a list of days.
If t is a time then
t.strftime('%A')
will always give you the day as a string in English. This could then be translated by i18n as required.
So you only need to store starting time and ending time, or starting time and duration. Both should be equivalent. I would be tempted to store ending time myself, in case you need to do data manipulations on ending times, which therefore won't have to be calculated.
I think most of the rest of what you describe should also fall out of storing time data as instances of Time.
Ordering by week day and time will just be a matter of ordering by your time column. i.e.
daily_class.find(:all, :conditions => ['whatever'], :order => :starting_time)
Grouping by day is a little more tricky. However this is an excellent post on how to group by week. Grouping by day will be analogous.
If you are dealing with non-trivial volumes of data, it may be better to do it in the database, with a find_by_sql and that may depend on your database's time and date functionality, but again storing the data as a Time will also help you here. For example in Postgresql (which I use), getting the week of a class is
date_trunc('week', starting_time)
which you can use in a Group By clause, or as a value to use in some loop logic in rails.
Re days-of-week, if you need to have e.g. classes that meet 09:00-10:00 on MWF, then you could either use a separate table for days a class meets (keyed by both class ID and DOW) or be evil (i.e. non-normalized) and keep the equivalent of an array of DOW in each class. The classic argument is this:
The separate table can be indexed in a way to support either class-oriented or DOW-oriented selects, but takes a bit more glue to put the entire picture together for a class.
The array-of-DOW is simpler to visualize for beginning programmers and slightly simpler to code about, but means that reasoning about DOW requires looking at all classes.
If this is only for your personal class schedule, do what gets you the value you're looking for, and live with the consequences; if you're trying to build a real system for multiple users, I'd go with a separate table. All those normalization rules are there for a reason.
As far as (human-readable) DOW names, that's a presentation-layer issue, and shouldn't be in the core concept of DOW. (Suppose you decided to move to Montreal, and needed French? That should be another "face" and not a change to the core implementation.)
As for starting/ending times, again the issue is your requirements. If all classes begin and end at hour (x:00) boundaries, you could certainly use 0..23 as the hours of the day. But then your life would be miserable as soon as you had to accommodate that 45-minute seminar. As the old commercial said, "Pay me now or pay me later."
One approach would be to define your own ClassTime concept and partition all reasoning about times to that class. It could start with a simplistic representation (integral hours 0..23, or integral minutes after midnight 0..1439) and then "grow" as needed.

Resources