Temporal data with Rails / Active Record - ruby-on-rails

I'm looking for ideas/information about managing temporal data with Active Record (Rails). One example would be employment history (working 100% in January, but only 80% from February up to now). This would probably be easy to tackle with a traditional 'has_many :employment_parts'. But there's another case where the user can plan something like a todo list, which the user can change over time (this todo is valid from Jan-March, another one is valid from Jan-April, and then with changed details from May to August).
I know, there's no silver bullet solution for these kind of requirements, but I'd like to collect here some ideas/documentation/plugins about this topic. It looks like not much has been done in this area for Rails, at least nothing that has been made public.
Please, if you have an idea, link or thought, drop a short answer!
Links so far:
Eric's random thoughts, blog entry about temporal data in rails
Bi-Temporal PostgreSQL, helps managing temporal data in postgres (w/o rails)
Richard T. Snodgrass's book "Developing Time-Oriented Database Applications in SQL" can be downloaded from his homepage (out of print)

We needed to keep historical data of all changes to database records, and to be able to query data as-of-time, without impacting the performance of querying current data.
The solution we envisioned is a Slowly-Changing dimension type 2, with history tables. The implementation is a Ruby gem that requires PostgreSQL 9.3 and fits nicely in Active Record extending it with the temporal framework (e.g. Foo.as_of(1.year.ago).bars).
You can find the code here: https://github.com/ifad/chronomodel :-)

You need to detail what you want to do a bit more.
For example, what "resolution" do you need? Do you plan to record every worked hour of every day? Or just the average monthly workload? Per week, maybe?
Also, what do you want to do with holidays, and sick days?
(EDIT - answering to the comment below)
In your case I'd go with a model similar to this one:
class WorkSlice < ActiveRecord::Base
  belongs_to :employee
  validates_presence_of :employee_id, :start_date, :end_date, :percentage

  def calculate_hours
    # employees might have different hours per day (e.g. UK has 7, Spain has 8)
    employee.hours_per_day * calculate_days
  end

  def calculate_days
    # days between start_date and end_date that are not sick days, holidays or weekends
    workdays = ... # this depends on how you model holidays, etc.
    workdays * percentage
  end
end
The only association you need is with "employee", as far as I know. A method on employee would allow you to see, for example, how "free" that employee is on a given date.
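A method like that could be sketched in plain Ruby as follows. This is only an illustration under assumptions: WorkSliceRow is a hypothetical stand-in for the Active Record model (a Struct, so it runs without a database), allocation_on is an invented helper name, and percentage is assumed to be stored as a whole number such as 80.

```ruby
require "date"

# Hypothetical stand-in for one employee's WorkSlice rows (no database needed).
WorkSliceRow = Struct.new(:start_date, :end_date, :percentage) do
  # True when the slice's inclusive date range contains the given date.
  def covers?(date)
    (start_date..end_date).cover?(date)
  end
end

# Sum the percentages of all slices that cover the date.
# Returns 0 when no slice covers it, i.e. the employee is completely free.
def allocation_on(slices, date)
  slices.select { |slice| slice.covers?(date) }.sum(&:percentage)
end

slices = [
  WorkSliceRow.new(Date.new(2009, 1, 1), Date.new(2009, 1, 31), 100),
  WorkSliceRow.new(Date.new(2009, 2, 1), Date.new(2009, 12, 31), 80)
]

puts allocation_on(slices, Date.new(2009, 1, 15))
puts allocation_on(slices, Date.new(2009, 3, 1))
```

With real models the same check would most likely become a query on start_date and end_date rather than an in-memory select.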

"I know, there's no silver bullet solution for these kind of requirements,"
That does depend a bit on how silver you need your bullet to be.
Anyhow. You might find the following stuff interesting too :
(a) "An Overview and Analysis of TSQL2" by Hugh Darwen and C.J. Date. A link is on www.thethirdmanifesto.com in the "Papers" section at the bottom.
(b) "Temporal Data and the Relational Model", a complete book by the same authors plus Nikos Lorentzos. Contains a different, complete, proposal plus very sound justifications why they believe that proposal to be better.
(c) You can find an implementation called SIRA_PRISE, written by me, based on the ideas from (b), on shark.armchair.mb.ca/~erwin

Related

Creating new workout journal/tracker app need help establishing database/models to get started?

I am fairly new to Ruby and Rails, and have made a few blogs etc. I am slowly learning the Ruby language and the Rails framework. I want to create a workout journal/tracker application and need help establishing the models to get me started on the right path. I basically want to be able to create a workout/different types of workouts (back, arms, legs, etc), be able to use the # of sets and reps used for that workout, how many days/which days a week, add, edit, delete the workouts, track weight loss/weight, track the workouts, reps, sets you did prior, set goals in the journal, track progress, eventually be able to share workouts etc. I know what I am looking to do; I just need help getting started and establishing what models and associations to use. I know it seems like a lot of info. Any help at all getting going would be awesome. Thanks all!
This might be a bit tricky, since there are many styles of exercises -- N sets of M reps, pyramid, max lifts, etc. You may want polymorphic associations in the final version.
But I think you'll have a more clear vision of where to take the project once you've built a few tables and classes; I think I'd start with a Workout class that has_many WOSets (don't use Set; having class names that conflict with built-in class names is way more irritating than you may think), and each WOSet has_many Reps. Then your Reps will keep track of count and weight. Store the order of the reps in the WOSet.
You'll also need a Station class for all the machines and exercises; probably your WOSet will belongs_to the Station, and the Station will has_many WOSet. (So you can retrieve all the sets ever performed on a specific station.)
I hope this quick sketch gets you to the point of playing with creating new workouts, new stations, and playing with the interface in script/console.
Models = Tables
You should have a look at database design and normalization. It's paramount you get the basics right. Otherwise you might end up with a database with common problems like performance issues and redundancy (which is a bad thing).
Once you understand what it is you need to store, mapping it to Rails is easy.
http://en.wikipedia.org/wiki/Database_normalization

Performance issues with complex nested RoR reservation system

I'm designing a Ruby on Rails reservation system for our small tour agency. It needs to accommodate a number of things, and the table structure is becoming quite complex.
Has anyone encountered a similar problem before? What sort of issues might I come up against? And are performance/ validation likely to become issues?
In simple terms, I have a customer table, and a reservations table. When a customer contacts us with an enquiry, a reservation is set up, and related information added (e.g., paid/ invoiced, transport required, hotel required, etc).
So far so good, but this is where it gets complex. Under each reservation, a customer can book different packages (e.g. day trip, long tour, training course). These are sufficiently different, require specific information, and are limited in number, such that I feel they should each have a different model.
Also, a customer may have several people in his party. This would result in links between the customer table and the reservation table, as well as between the customer table and the package tables.
So, if customer A were to make a booking for a long trip for customers A,B and C, and a training course for customer B, it would look something like this.
CUSTOMERS TABLE
CustomerA
CustomerB
CustomerC
CustomerD
CustomerE
etc
RESERVATIONS TABLE
1. CustomerA
LONG TRIP BOOKINGS
CustomerA - Reservation_ID 1
CustomerB - Reservation_ID 1
CustomerC - Reservation_ID 1
TRAINING COURSE BOOKINGS
CustomerB - Reservation_ID 1
This is a very simplified example, and omits some detail. For example, there would be a model containing details of training courses, a model containing details of long trips, a model containing long trip schedules, etc. But this detail shouldn't affect my question.
What I'd like to know is:
1) are there any issues I should be aware of in linking the customer table to the reservations model, as well as to bookings models nested under reservations.
2) is this the best approach if I need to handle information about the reservation itself (including invoicing), as well as about the specific package bookings.
On the one hand this approach seems to be complex, but on the other, simplifying everything into a single package model does not appear to provide enough flexibility.
Please let me know if I haven't explained this issue very clearly, I'm happy to provide more information. Grateful for any ideas, suggestions or comments that would help me think through this rather complex database design.
Many thanks!
I have built a large reservation system for travel operators and wholesalers, and I can tell you that it isn't easy. There seems to be similarity yet still large differences in the kinds of product booked. Also, date-sensitivity is a large difference from other systems.
1) In respect to 'customers' I have typically used different models for representing different concepts. You really have:
a. Person / Company paying for the booking
b. Contact person for emergencies
c. People travelling
a & b seem like the same, but if you have an agent booking, then you might want to separate them.
I typically use a => 'customer' table, then some simple contact-fields for b, and finally for c use a 'passengers' table. These could be setup as different associations to the same model, but I think they are different enough, and I tend to separate them - perhaps use a common address/contact model.
2) I think this is fine, but depends on your needs. If you are building up itineraries for a traveller, then it makes sense to setup 'passengers' on the 'reservation', then for individual itinerary items, with links to which passenger is travelling on/using that item.
This is more complicated, and you must be careful to track dependencies, but the alternative is to not track passenger names, and simply assign quantities to each item (1xAdult, 2xChildren). This latter method is great for small bookings, so it seems to depend on whether your bookings are simple, or typically built up of longer itineraries.
other) In addition, in respect to different models for different product types, this can work well. However, there tends to be a lot of cross over, so some kind of common 'resource' model might be better -- or some other means of capturing common behaviour.
If I haven't answered your questions, please do ask more specific database design questions, or I can add more detail about specific examples of what I've found works well.
Good luck with the design!

How to best model and search seasonal availability with Rails

So I have an app where I am tracking a number of things, including flowers. Depending on the type of flower they can be available from places during certain spans of time during the year. Sometimes they can even be available during multiple spans of time (eg domestically from Mar-Jun, but can be found internationally from Sept-Dec).
What I am looking to be able to do is search for a specific date and determine all the different flowers that would be available on that date.
My idea was to have an Availability model which had a belongs_to relationship with a Flower. It would have a start_date, an end_date, and a flower_id. The problem was that dates in rails tend to be specific points in time, eg 2009-10-13. If I said a flower was available from 2009-10-01 - 2009-12-31 when 2010 came around I wouldn't see it as available.
So then I thought maybe I could have some sort of cron job that went through daily and changed the years on availability records as their end dates came up.
Maybe this is the right approach, but it feels a bit clunky. I looked through a few gems/plugins and couldn't find anything in particular that would fit my need.
Anyone have any insight?
Thanks in advance...
Given the cyclical nature of months, it can be difficult to perform a query that quickly selects the months where a given set of flowers is available.
I like the availability model, but think you should be saving just the month number (eg: October is saved as 10 in the start_month/end_month fields). Use Date::MONTHNAMES to make things human readable. (eg: Date::MONTHNAMES[10] => "October")
This allows you to easily form a named scope in Availabilities to choose what's available now.
class Flower < ActiveRecord::Base
  has_many :availabilities

  named_scope :available_today, lambda {
    month = Date.today.month
    # First branch: range within one year (e.g. 3..6).
    # Second branch: range wrapping past December (e.g. 11..2).
    { :include => :availabilities, :conditions =>
      ["(availabilities.start_month <= availabilities.end_month AND \
        availabilities.start_month <= ? AND availabilities.end_month >= ?) \
        OR (availabilities.start_month > availabilities.end_month AND \
        (availabilities.start_month <= ? OR availabilities.end_month >= ?))",
       month, month, month, month] }
  }
  named_scope :red, :conditions => { :colour => "red" }
end
Now you can do Flower.available_today.red to get a list of red flowers available now.
Unfortunately the named scope might populate the association with availabilities (Flower.available_today.first.availabilities) that you don't want. I can't be sure without testing it, and I don't have the environment to do so right now. Worst case scenario you can use @flower.availabilities.select with similar arguments to the queries to prune the list.
I like the Availability model idea, but I'd store just a start_month and end_month (both integers) and a notes field (for stuff like "Internationally" or "Domestically"). Then when you have a date, you just get the month field and compare to the set of ranges you have for each flower.
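The month comparison described here, including ranges that wrap past December, can be sketched in plain Ruby. This is a minimal illustration, not a definitive implementation: month_in_season?, flowers_available_in and the AVAILABILITY data are all hypothetical names and values made up for the example.

```ruby
# True when `month` (1..12) lies in the inclusive range start_month..end_month.
# The range may wrap around the end of the year, e.g. 11..2 for Nov-Feb.
def month_in_season?(month, start_month, end_month)
  if start_month <= end_month
    month >= start_month && month <= end_month
  else
    month >= start_month || month <= end_month
  end
end

# Hypothetical data: flower name => list of [start_month, end_month, note].
AVAILABILITY = {
  "tulip"     => [[3, 6, "Domestically"], [9, 12, "Internationally"]],
  "amaryllis" => [[11, 2, "Domestically"]]
}

# Names of all flowers with at least one range containing the month.
def flowers_available_in(month)
  AVAILABILITY.select { |_name, ranges|
    ranges.any? { |(from, to, _note)| month_in_season?(month, from, to) }
  }.keys
end

p flowers_available_in(4)
p flowers_available_in(1)
```

Given a full date, only its month is used, for example flowers_available_in(Date.today.month) after requiring "date".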
That might be a little compute-intensive if you have a lot of flowers. You could instead store a single Availability row for each flower, with 12 integers - one for each month - and 12 notes fields. Or if you won't have a lot of notes, you have an AvailabilityNotes model. Availability then has_many :availability_notes.
If it's for seasonal items, you could just have a (start/end)month or season field.
First of all, is there a customer demand to get this "right?" Who wants this feature, and how do you know they want it? Have you modeled the user's workflow to determine how this actually fits into ordering behavior? Do you really take orders for flowers a year or more in advance? Is availability only based on the season -- there's no dependency on suppliers or other events that can change year-to-year? And are you anticipating that you'll always have the same inventory next year that you will this year? Or is pinning an availability date on a flower not a guarantee?
If this is just for general information purposes -- you'd like people to know what sorts of flowers they can order at what times of year -- then I wouldn't give the user a "pick a date" function at all. I'd just give them a dropdown: either for four values for seasons, or twelve for months. If they just want to know "Is it likely I can get lilies for my mom's birthday?" that's fully sufficient.
And you could model it very simply, with four or twelve boolean values in your model, and four or twelve checkboxes on your flower create/edit form. (Yeah, I know, it's "purer" to do a :has_many association, but unnecessary; the number of months in the year isn't going to change.)

Keeping the history of model associations

I have multiple models that need to have their history kept pretty much indefinitely, or at least for a very long time. The application would keep track of daily attendance statistics for people in different organizations, and a couple of other similar associations. I've realized that I can't ever delete users, because a deleted user would no longer show up in a query for attendance on any date before the deletion. The problem I'm having is finding a nice way of keeping track of old associations as well as querying on those old associations.
If I have a connecting table between Users and Organizations, I could just add a new row with the new associations between a User and the Organization the user belongs to, and query on the old ones based on the date, but I really don't have any elegant way of doing this, everything just feels ugly. I was just wondering if anyone has dealt with anything like this before, and maybe had a solution they had already implemented. Thanks.
From a modeling point of view, the relationship sounds like the one between Employee and Employer, namely Employment. This would hold a reference to both Employee and Employer along with some notion of the TimePeriod (i.e., startDate and endDate). If you wanted to query 'active' employees then they are all the ones with a startDate <= Now() && endDate >= Now(), 'terminated' employees have endDate < Now(), etc.
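As a rough sketch of that idea in plain Ruby: the Membership struct and members_on helper below are hypothetical names, and treating a nil end_date as "still open" is an assumption added for the example (the answer itself only compares against an end date).

```ruby
require "date"

# Hypothetical join record between a user and an organization.
# Assumption: end_date is nil while the membership is still open.
Membership = Struct.new(:user, :organization, :start_date, :end_date) do
  # True when the membership was in force on the given date (inclusive bounds).
  def active_on?(date)
    start_date <= date && (end_date.nil? || date <= end_date)
  end
end

# Users who belonged to the organization on the given date.
def members_on(memberships, organization, date)
  memberships
    .select { |m| m.organization == organization && m.active_on?(date) }
    .map(&:user)
end

history = [
  Membership.new("alice", "Acme", Date.new(2008, 1, 1), Date.new(2008, 12, 31)),
  Membership.new("alice", "Initech", Date.new(2009, 1, 1), nil),
  Membership.new("bob", "Acme", Date.new(2008, 6, 1), nil)
]

p members_on(history, "Acme", Date.new(2008, 7, 1))
p members_on(history, "Acme", Date.new(2009, 7, 1))
```

Because rows are closed by setting an end date rather than being deleted, an attendance query for any past date can still resolve who belonged where.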
I can't tell in your case what the relationship is between Users and Organizations, and what makes the relationship start and end, but a concept similar to Employment, Membership, Enrollment or Contract is likely there. When you say daily attendance, is it not daily over a period of time? The time period is the basis for your queries.
Hope that helps,
Berryl
Create an is_deleted field so that you can still query those "deleted" users, but modify your code so that they will behave everywhere else as if they are deleted. Then you never need to actually delete the row and lose data.
There are a number of plugins that keep track of revisions to models, including their associations. Take a look at this search for revision-related plugins.

Rails Caching DB Queries and Best Practices

The DB load on my site is getting really high so it is time for me to cache common queries that are being called 1000s of times an hour where the results are not changing.
So for instance on my city model I do the following:
def self.fetch(id)
  Rails.cache.fetch("city_#{id}") { City.find(id) }
end

def after_save
  Rails.cache.delete("city_#{self.id}")
end

def after_destroy
  Rails.cache.delete("city_#{self.id}")
end
So now when I call City.fetch(1) the first time I hit the DB but the next 1000 times I get the result from memory. Great. But most of the calls to city are not City.find(1) but @user.city.name where Rails does not use the fetch but queries the DB again... which makes sense but not exactly what I want it to do.
I can do City.fetch(@user.city_id) but that is ugly.
So my question to you guys. What are the smart people doing? What is
the right way to do this?
With respect to the caching, a couple of minor points:
It's worth using a slash to separate object type and id, which is the Rails convention. Even better, ActiveRecord models provide the cache_key instance method, which will provide a unique identifier of table name and id, "cities/13" etc.
One minor correction to your after_save filter. Since you have the data on hand, you might as well write it back to the cache as opposed to delete it. That's saving you a single trip to the database ;)
def after_save
  Rails.cache.write(cache_key, self)
end
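To see why writing back saves a database trip, here is a small self-contained sketch. Everything in it is a stand-in invented for illustration: TinyCache is a toy in-memory store (not Rails.cache), DB is a hash playing the role of the cities table, and STATS only counts simulated database reads.

```ruby
# Toy in-memory cache with a fetch-or-compute method; NOT Rails.cache.
class TinyCache
  def initialize
    @store = {}
  end

  # Return the cached value, or run the block once and remember its result.
  def fetch(key)
    return @store[key] if @store.key?(key)
    @store[key] = yield
  end

  def write(key, value)
    @store[key] = value
  end
end

CACHE = TinyCache.new
DB    = { 1 => "Paris" }   # hypothetical "cities" table: id => name
STATS = { db_reads: 0 }    # counts simulated database reads

def fetch_city(id)
  CACHE.fetch("cities/#{id}") do
    STATS[:db_reads] += 1
    DB.fetch(id)
  end
end

# Save and write through to the cache instead of deleting the entry.
def save_city(id, name)
  DB[id] = name
  CACHE.write("cities/#{id}", name)
end

fetch_city(1)            # first call reads the "database"
save_city(1, "Lyon")     # cache entry is replaced, not dropped
puts fetch_city(1)       # served from the cache
puts STATS[:db_reads]
```

Had save_city deleted the key instead, the next fetch_city call would have gone back to the database once more.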
As to the root of the question, if you're continuously pulling @user.city.name, there are two real choices:
Denormalize the user's city name to the user row. @user.city_name (keep the city_id foreign key). This value should be written to at save time.
-or-
Implement your User.fetch method to eager load the city. Only do this if the contents of the city row never change (i.e. name etc.), otherwise you can potentially open up a can of worms with respect to cache invalidation.
Personal opinion:
Implement basic id based fetch methods (or use a plugin) to integrate with memcached, and denormalize the city name to the user's row.
I'm personally not a huge fan of cached model style plugins, I've never seen one that's saved a significant amount of development time that I haven't grown out of in a hurry.
If you're getting way too many database queries it's definitely worth checking out eager loading (through :include) if you haven't already. That should be the first step for reducing the quantity of database queries.
If you need to speed up SQL queries on data that doesn't change much over time, then you can use materialized views.
A matview stores the results of a query into a table-like structure of
its own, from which the data can be queried. It is not possible to add
or delete rows, but the rest of the time it behaves just like an
actual table. Queries are faster, and the matview itself can be
indexed.
At the time of this writing, matviews are natively available in Oracle
DB, PostgreSQL, Sybase, IBM DB2, and Microsoft SQL Server. MySQL
doesn’t provide native support for matviews, unfortunately, but there
are open source alternatives to it.
Here are some good articles on how to use matviews in Rails:
sitepoint.com/speed-up-with-materialized-views-on-postgresql-and-rails
hashrocket.com/materialized-view-strategies-using-postgresql
I would go ahead and take a look at Memoization, which is now in Rails 2.2.
"Memoization is a pattern of
initializing a method once and then
stashing its value away for repeat
use."
There was a great Railscast episode on it recently that should get you up and running nicely.
Quick code sample from the Railscast:
class Product < ActiveRecord::Base
  extend ActiveSupport::Memoizable

  belongs_to :category

  def filesize(num = 1)
    # some expensive operation
    sleep 2
    12345789 * num
  end
  memoize :filesize
end
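The same pattern can be written in plain Ruby without any framework support, which may help show what memoize is doing. This is a sketch under assumptions: the Product class here is a standalone object rather than an Active Record model, and the computations counter exists only to make the caching visible.

```ruby
class Product
  attr_reader :computations

  def initialize
    @filesize_cache = {}   # remembers one result per distinct argument
    @computations = 0      # how many times the expensive body actually ran
  end

  # Memoized per argument: the block runs only when num has not been seen yet.
  def filesize(num = 1)
    @filesize_cache.fetch(num) do
      @computations += 1   # stands in for the expensive operation
      @filesize_cache[num] = 12345789 * num
    end
  end
end

product = Product.new
product.filesize
product.filesize           # second call is answered from the hash
puts product.computations
```

Keeping results in a hash keyed by argument, rather than in a single instance variable, is what lets different values of num be cached independently.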
More on Memoization
Check out cached_model

Resources