ActiveRecord has_n association - ruby-on-rails

I was wondering what the best way to model a relationship where an object is associated with exactly n objects of another class. I want to extend the has_one relationship to a specific value of n.
For example, a TopFiveMoviesList would belong to user and have exactly five movies. I would imagine that the underlying sql table would have fields like movie_id_1, movie_id_2, ... movie_id_5.
I know I could do a has_many relationship and limit the number of children at the model level, but I'd rather not have an intermediary table.

I think implementing this model through a join model is going to be you're best bet here. It allows the List model to worry about List logic and the Movie model to worry about Movie logic. You can create a Nomination (name isn't the greatest, but you know what I mean) model to handle the relationship between movies and lists, and when there's a limit of 5, you could just limit the number of nominations you pull back.
There are a few reasons I think this approach is better.
First, assuming you want to be able to traverse the relationships both ways (movie.lists and list.movies), the 5 column approach is going to be much messier.
While it'd be so much better for ActiveRecord to support has n relationships, it doesn't, and so you'll be fighting the framework on that one. Also, the has n relationship seems a bit brittle to me in this situation. I haven't seen that kind of implementation pulled off in ActiveRecord, though I'd be really interested in seeing it happen. :)

My first instinct would be to use a join table, but if that's not desirable User.movie[1-5]_id columns would fit the bill. (I think movie1_id fits better with Rails convention than movie_id_1.)
Since you tagged this Rails and ActiveRecord, I'll add some completely untested and probably somewhat wrong model code to my answer. :)
class User < ActiveRecord::Base
TOP_N_MOVIES = 5
(1..TOP_N_MOVIES).each { |n| belongs_to "movie#{n}".to_sym, :class_name => Movie }
end
You could wrap that line in a macro-style method, but unless if that's a common pattern for your application, doing that will probably just make your code that harder to read with little DRY benefit.
You might also want to add validations to ensure that there are no duplicate movies on a user's list.
Associating your movie class back to your users is similar.
class Movie < ActiveRecord::Base
(1..User::TOP_N_MOVIES).each do |n|
has_many "users_list_as_top_#{n}".to_sym, :class_name => User, :foreign_key => "movie#{n}_id"
end
def users_list_as_top_anything
ary = []
(1..User::TOP_N_MOVIES).each {|n| ary += self.send("users_list_as_top_#{n}") }
return ary
end
end
(Of course that users_list_as_top_anything would probably be better written out as explicit SQL. I'm lazy today.)

I assume you mean "implement" rather than "model"? The modeling's pretty easy in UML, say, where you have a Person entity that is made up of 5 Movie entities.
But the difficulty comes when you say has_one, going to has_5. If it's a simple scalar value, has_one is perhaps a property on the parent entity. Has_5 is probably 2 entities related to one another through an "is made up of" relationship in UML.
The main question to answer is probably, "Can you guarantee that it will always be 'Top 5'?" If yes, model it with columns, as you mentioned. If no, model it with another entity.
Another question is perhaps, "How easy will it be to refactor?" If it's simple, heck, start with 5 columns and refactor to separate entities if it ever changes.
As usual, "best" is dependent on the business and technical environment.

Related

Can I have a one way HABTM relationship?

Say I have the model Item which has one Foo and many Bars.
Foo and Bar can be used as parameters when searching for Items and so Items can be searched like so:
www.example.com/search?foo=foovalue&bar[]=barvalue1&bar[]=barvalue2
I need to generate a Query object that is able to save these search parameters. I need the following relationships:
Query needs to access one Foo and many Bars.
One Foo can be accessed by many different Queries.
One Bar can be accessed by many different Queries.
Neither Bar nor Foo need to know anything about Query.
I have this relationship set up currently like so:
class Query < ActiveRecord::Base
belongs_to :foo
has_and_belongs_to_many :bars
...
end
Query also has a method which returns a hash like this: { foo: 'foovalue', bars: [ 'barvalue1', 'barvalue2' } which easily allows me to pass these values into a url helper and generate the search query.
This all works fine.
My question is whether this is the best way to set up this relationship. I haven't seen any other examples of one-way HABTM relationships so I think I may be doing something wrong here.
Is this an acceptable use of HABTM?
Functionally yes, but semantically no. Using HABTM in a "one-sided" fashion will achieve exactly what you want. The name HABTM does unfortunately insinuate a reciprocal relationship that isn't always the case. Similarly, belongs_to :foo makes little intuitive sense here.
Don't get caught up in the semantics of HABTM and the other association, instead just consider where your IDs need to sit in order to query the data appropriately and efficiently. Remember, efficiency considerations should above all account for your productivity.
I'll take the liberty to create a more concrete example than your foos and bars... say we have an engine that allows us to query whether certain ducks are present in a given pond, and we want to keep track of these queries.
Possibilities
You have three choices for storing the ducks in your Query records:
Join table
Native array of duck ids
Serialized array of duck ids
You've answered the join table use case yourself, and if it's true that "neither [Duck] nor [Pond] need to know anything about Query", using one-sided associations should cause you no problems. All you need to do is create a ducks_queries table and ActiveRecord will provide the rest. You could even opt to use has_many :through relationship if you need to do anything fancy.
At times arrays are more convenient than using join tables. You could store the data as a serialized integer array and add handlers for accessing the data similar to the following:
class Query
serialize :duck_ids
def ducks
transaction do
Duck.where(id: duck_ids)
end
end
end
If you have native array support in your database, you can do the same from within your DB. similar.
With Postgres' native array support, you could make a query as follows:
SELECT * FROM ducks WHERE id=ANY(
(SELECT duck_ids FROM queries WHERE id=1 LIMIT 1)::int[]
)
You can play with the above example on SQL Fiddle
Trade Offs
Join table:
Pros: Convention over configuration; You get all the Rails goodies (e.g. query.bars, query.bars=, query.bars.where()) out of the box
Cons: You've added complexity to your data layer (i.e. another table, more dense queries); makes little intuitive sense
Native array:
Pros: Semantically nice; you get all the DB's array-related goodies out of the box; potentially more performant
Cons: You'll have to roll your own Ruby/SQL or use an ActiveRecord extension such as postgres_ext; not DB agnostic; goodbye Rails goodies
Serialized array:
Pros: Semantically nice; DB agnostic
Cons: You'll have to roll your own Ruby; you'll loose the ability to make certain queries directly through your DB; serialization is icky; goodbye Rails goodies
At the end of the day, your use case makes all the difference. That aside, I'd say you should stick with your "one-sided" HABTM implementation: you'll lose a lot of Rails-given gifts otherwise.

How to design/model a has many relationship that has a meaningful join table?

I wasn't able to put well into words (in Question title) what I'm trying to do, so in honor of the saying that an image is worth a thousand words; In a nutshell what I'm trying to do is..
Basically, what I have is A Teacher has many Appointments and A Student has many Appointments which roughly translates to:
I'm trying to stay away from using the has_and_belongs_to_many macro, because my appointments model has some meaning(operations), for instance it has a Boolean field: confirmed.
So, I was thinking about using the has_many :through macro, and perhaps using an "Appointable" join table model? What do you guys think?
The Scenario I'm trying to code is simple;
A Student requests an Appointment with a Teacher at certain Date/Time
If Teacher is available (and wants to give lesson at that Date/Time), She confirms the Appointment.
I hope you can tell me how would you approach this problem? Is my assumption of using the has_many :through macro correct?
Thank you!
Both teachers and students could inherit from a single class e.g. Person. Then create an association between Person and Appointments. This way you keep the architecture open so that if in the future you want to add 'Parents' then they could easily be integrated and may participate in appointments.
It may not be completely straightforward how you do the joins with the children classes (Students, Parents, Teachers). It may involve polymorphic relationships which I don't particularly like. You should though get away with a single join table.
In any case, you want to design so that your system can be extended. Some extra work early on will save you a lot of work later.

A database design for variable column names

I have a situation that involves Companies, Projects, and Employees who write Reports on Projects.
A Company owns many projects, many reports, and many employees.
One report is written by one employee for one of the company's projects.
Companies each want different things in a report. Let's say one company wants to know about project performance and speed, while another wants to know about cost-effectiveness. There are 5-15 criteria, set differently by each company, which ALL apply to all of that company's project reports.
I was thinking about different ways to do this, but my current stalemate is this:
To company table, add text field criteria, which contains an array of the criteria desired in order.
In the report table, have a company_id and columns criterion1, criterion2, etc.
I am completely aware that this is typically considered horrible database design - inelegant and inflexible. So, I need your help! How can I build this better?
Conclusion
I decided to go with the serialized option in my case, for these reasons:
My requirements for the criteria are simple - no searching or sorting will be required of the reports once they are submitted by each employee.
I wanted to minimize database load - where these are going to be implemented, there is already a large page with overhead.
I want to avoid complicating my database structure for what I believe is a relatively simple need.
CouchDB and Mongo are not currently in my repertoire so I'll save them for a more needy day.
This would be a great opportunity to use NoSQL! Seems like the textbook use-case to me. So head over to CouchDB or Mongo and start hacking.
With conventional DBs you are slightly caught in the problem of how much to normalize your data:
A sort of "good" way (meaning very normalized) would look something like this:
class Company < AR::Base
has_many :reports
has_many :criteria
end
class Report < AR::Base
belongs_to :company
has_many :criteria_values
has_many :criteria, :through => :criteria_values
end
class Criteria < AR::Base # should be Criterion but whatever
belongs_to :company
has_many :criteria_values
# one attribute 'name' (or 'type' and you can mess with STI)
end
class CriteriaValues < AR::Base
belongs_to :report
belongs_to :criteria
# one attribute 'value'
end
This makes something very simple and fast in NoSQL a triple or quadruple join in SQL and you have many models that pretty much do nothing.
Another way is to denormalize:
class Company < AR::Base
has_many :reports
serialize :criteria
end
class Report < AR::Base
belongs_to :company
serialize :criteria_values
def criteria
self.company.criteria
end
# custom code here to validate that criteria_values correspond to criteria etc.
end
Related to that is the rather clever way of serializing at least the criteria (and maybe values if they were all boolean) is using bit fields. This basically gives you more or less easy migrations (hard to delete and modify, but easy to add) and search-ability without any overhead.
A good plugin that implements this is Flag Shih Tzu which I've used on a few projects and could recommend.
Variable columns (eg. crit1, crit2, etc.).
I'd strongly advise against it. You don't get much benefit (it's still not very searchable since you don't know in which column your info is) and it leads to maintainability nightmares. Imagine your db gets to a few million records and suddenly someone needs 16 criteria. What could have been a complete no-issue is suddenly a migration that adds a completely useless field to millions of records.
Another problem is that a lot of the ActiveRecord magic doesn't work with this - you'll have to figure out what crit1 means by yourself - now if you wan't to add validations on these fields then that adds a lot of pointless work.
So to summarize: Have a look at Mongo or CouchDB and if that seems impractical, go ahead and save your stuff serialized. If you need to do complex validation and don't care too much about DB load then normalize away and take option 1.
Well, when you say "To company table, add text field criteria, which contains an array of the criteria desired in order" that smells like the company table wants to be normalized: you might break out each criterion in one of 15 columns called "criterion1", ..., "criterion15" where any or all columns can default to null.
To me, you are on the right track with your report table. Each row in that table might represent one report; and might have corresponding columns "criterion1",...,"criterion15", as you say, where each cell says how well the company did on that column's criterion. There will be multiple reports per company, so you'll need a date (or report-number or similar) column in the report table. Then the date plus the company id can be a composite key; and the company id can be a non-unique index. As can the report date/number/some-identifier. And don't forget a column for the reporting-employee id.
Any and every criterion column in the report table can be null, meaning (maybe) that the employee did not report on this criterion; or that this criterion (column) did not apply in this report (row).
It seems like that would work fine. I don't see that you ever need to do a join. It looks perfectly straightforward, at least to these naive and ignorant eyes.
Create a criteria table that lists the criteria for each company (company 1 .. * criteria).
Then, create a report_criteria table (report 1 .. * report_criteria) that lists the criteria for that specific report based on the criteria table (criteria 1 .. * report_criteria).

Best way to handle multiple tables to replace one big table in Rails? (e.g. 'todo_items1', 'todo_items2', etc., instead of just 'todo_items')?

Update:
Originally, this post was using Books as the example entity, with
Books1, Books2, etc. being the
separated table. I think this was a
bit confusing, so I've changed the
example entity to be "private
todo_items created by a particular
user."
This kind of makes Horace and Ryan's original comments seem a bit off, and
I apologize for that. Please know that
their points were valid when it looked
like I was dealing with books.
Hello,
I've decided to use multiple tables for an entity (e.g. todo_items1, todo_items2, todo_items3, etc.), instead of just one main table which could end up having a lot of rows (e.g. just todo_items). I'm doing this to try and to avoid a potential future performance drop that could come with having too many rows in one table.
With that, I'm looking for a good way to handle this in Rails, mainly by trying to avoid loading a bunch of unused associations for each User object. I'm guessing that other have done something similar, so there's probably a few good tips/recommendations out there.
(I know that I could use a partition for this, but, for now, I've decided to go the 'multiple tables' route.)
Each user has their todo_items placed into a specific table. The actual "todo items" table is chosen when the user is created, and all of their todo_items go into the same table. The data in their todo items collection is private, so when it comes time to process a users todo_items, I'll only have to look at one table.
One thing I don't particularly want to have is a bunch of unused associations in the User class. Right now, it looks like I'd have to do the following:
class User < ActiveRecord::Base
has_many :todo_items1, :todo_items2, :todo_items3, :todo_items4, :todo_items5
end
class todo_items1 < ActiveRecord::Base
belongs_to :user
end
class todo_items2 < ActiveRecord::Base
belongs_to :user
end
class todo_items3 < ActiveRecord::Base
belongs_to :user
end
The thing is, for each individual user, only one of the "todo items" tables would be usable/applicable/accessible since all of a user's todo_items are stored in the same table. This means only one of the associations would be in use at any time and all of the other has_many :todo_itemsX associations that were loaded would be a waste.
For example, with a user.id of 2, I'd only need todo_items3.find_by_text('search_word'), but the way I'm thinking of setting this up, I'd still have access to todo_items1, todo_items2, todo_items4 and todo_items5.
I'm thinking that these "extra associations" adds extra overhead and makes each User object's size in memory much bigger than it has to be. Also, there's a bunch of stuff that Ruby/Rails is doing in the background which may cause other performance problems.
I'm also guessing that there could be some additional method call/lookup overhead for each User object, since it has to load all of those associations, which in turn creates all of those nice, dynamic model accessor methods like User.find_by_something.
I don't really know Ruby/Rails does internally with all of those has_many associations though, so maybe it's not so bad. But right now I'm thinking that it's really wasteful, and that there may just be a better, more efficient way of doing this.
So, a few questions:
1) Is there's some sort of special Ruby/Rails methodology that could be applied to this 'multiple tables to represent one entity' scheme? Are there any 'best practices' for this?
2) Is it really bad to have so many unused has_many associations for each object? Is there a better way to do this?
3) Does anyone have any advice on how to abstract the fact that there's multiple "todo items" tables behind a single todo_items model/class? For example, so I can call todo_items.find_by_text('search_phrase') instead of todo_items3.find_by_text('search_phrase').
Thank you!
This is not the way to scale.
It would probably be better going with master-slave replication and proper indexing (besides primary key) on fields such as "title" and/or "author" if that's what you're going to be looking up books based on. Having it in n-tables, how are you going to know the best place to go looking for the book the user is after? Are you going to go looking through 4 tables?
I agree with Horace: " don't try to solve a performance issue before you have figures to prove it." I suggest, however, that you should really look into adding indexes to your table if you want lookups to be fast. If they aren't fast, then tell us how they aren't fast and we will tell you how to make it go ZOOOOOM.

Objects and Relations in Ruby on Rails

I'm attempting to create a simple web application for some personal development, but I've run into an obstacle, which I'm hoping others will find trivial. I have working classes created, but I'm not sure how define their relationships, or whether their models are even appropriate.
For the sake of argument, I have the following classes already in existence: User, Team, and Athlete
The functionality I'm striving for is for a user to create a team by adding one athlete to the team object. Other users can either add players to the team, or they can add alternatives to an existing athlete on the team roster. Essentially, I want to create some sort of class to wrap an array of athlete objects, we'll call it AthleteArray.
If my understanding is correct, these are the relationships that would be appropriate:
So a Team would have-many AthleteArrays
An AthleteArray would have-many Athletes and would belong-to a Team
Athletes would belong-to a User, and would belong-to a AthleteArray.
Users would have-many athletes, but that would be the extent of their involvement.
Since the AthleteArray class wouldn't have any attributes, is it wise to create it as an ActiveRecord object(would it merely have an ID)? Is there another way to implement this idea(can you define the team class, to have an array of arrays of athlete objects)? I have very little knowledge of RoR, but I thought it would be a good place to start with web development. Any help would be appreciated. Thanks in advance!
Edit: The names of Athletes, Teams, and AthleteArrays are unimportant. Assume that an Athlete is basically a comment, validated against a list of athletes. Duplication is acceptable. If I'm understanding the answers posted, I should basically create an intermediate class which takes the IDs of their parents?
Is response to Omar:
Eventually, I'd like to add a voting system. So after basic team is created, users can individual suggest replacements for a given player, and users can vote up the best choices. For instance, If I have a team of chess_players created:
Bobby Fischer
Garry Kasparov
Vladimir Kramnik
someone might click on Kasparov, and decide that a better athlete would be Deep Blue, so this option would appear nested under Kasparov until it received more votes. This same process would occur for every character, but unlike a comment system, you wouldn't be able to respond to "Deep Blue" and substitute another player, you would simply respond to the position number 2, and suggest another player.
Right, so if I understand the question properly, there will be many possible combinations of athletes for a team, which may or may not use the same athletes?
Something sounds off here, possibly in the naming of your models. Don't quite like the smell of AthleteArrays.
has_many_through might be what you need to access the athletes from your team e.g.
has_many :athletes, :through => :team_permuations # your AthleteArray model
Josh Susser has an old roundup of many to many associations which i guess is what you need, since according to your specification, theoretically it can be possible for an athlete to belong to any number of teams. I suppose the best thing is that you can have automagically populated audit columns (created/updated_at/on) on your many to many associations, which is a nice thing to have.
http://blog.hasmanythrough.com/2006/4/20/many-to-many-dance-off
Please comment if you think I have understood the question wrong.
If I understand your need, this is something that can be solved through has_many :through.
basically you can represent this in the database like
athletes:
id
other_fields
teams:
id
other_fields
roster_items:
id
team_id
athlete_id
position
is_active #is currently on the team or just a suggestion
adding_user_id #the user who added this roster item
class Athlete < ActiveRecord::Base
has_many :roster_items
end
class Team < ActiveRecord::Base
has_many :roster_items
has_many :athletes, :through => :roster_items
end
class RosterItem < ActiveRecord::Base
belongs_to :team
belongs_to :athlete
belongs_to :adding_user, :class_name => 'User'
end

Resources