What is the difference between the has_and_belongs_to_many and has_many :through relationships? When and where should I use which one?
As far as I can remember, has_and_belongs_to_many gives you a simple lookup table which references your two models.
For example,
Stories can belong to many categories.
Categories can have many stories.
Categories_Stories Table
story_id | category_id
has_many :through gives you a third model which can be used to store various other pieces of information which don't belong to either of the original models.
For example
Person can subscribe to many magazines.
Magazines can have many subscribers.
Thus, we can have a subscription model in the middle, which gives us a similar table to the earlier example, but with additional properties.
Subscriptions Table
person_id | magazine_id | subscription_type | subscription_length | subscription_date
And so on.
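To make the difference concrete, here is a minimal plain-Ruby sketch (no Rails, no database) of the data each approach ends up storing. The column names come from the two tables above; the ids, subscription types, and dates are made-up sample values.

```ruby
require "date"

# HABTM-style join table: each row is just a pair of foreign keys.
categories_stories = [
  { story_id: 1, category_id: 10 },
  { story_id: 1, category_id: 11 },
  { story_id: 2, category_id: 10 }
]

# has_many :through-style join model: the same two keys plus data that
# describes the relationship itself (sample values are illustrative only).
Subscription = Struct.new(
  :person_id, :magazine_id,
  :subscription_type, :subscription_length, :subscription_date,
  keyword_init: true
)

subscriptions = [
  Subscription.new(person_id: 1, magazine_id: 7, subscription_type: "print",
                   subscription_length: 12, subscription_date: Date.new(2020, 1, 15)),
  Subscription.new(person_id: 1, magazine_id: 8, subscription_type: "digital",
                   subscription_length: 6, subscription_date: Date.new(2021, 3, 1))
]

# With only key pairs, the one question you can answer is "what is linked?"
category_ids_for_story_1 =
  categories_stories.select { |row| row[:story_id] == 1 }.map { |row| row[:category_id] }

# With a join model you can also ask questions about the link itself.
digital_magazine_ids =
  subscriptions.select { |s| s.subscription_type == "digital" }.map(&:magazine_id)

p category_ids_for_story_1 # => [10, 11]
p digital_magazine_ids     # => [8]
```

In a real app the second shape is what a Subscription model would map to, which is why it can carry validations and callbacks as well as extra columns.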
From http://guides.rubyonrails.org/association_basics.html#choosing-between-has-many-through-and-has-and-belongs-to-many
The simplest rule of thumb is that you should set up a has_many :through relationship if you need to work with the relationship model as an independent entity. If you don’t need to do anything with the relationship model, it may be simpler to set up a has_and_belongs_to_many relationship (though you’ll need to remember to create the joining table in the database).
You should use has_many :through if you need validations, callbacks, or extra attributes on the join model.
My rule of thumb is, can I get by with a list of checkboxes here? If so, then it's a has-and-belongs-to-many (HABTM) association. If I need the checkbox to capture more about the relationship than simply yes/no it belongs, then use has_many :through. HABTM is as simple as using the _ids method with a simple_form collection_check_boxes. has_many :through often involves accepts_nested_attributes_for.
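As a rough illustration of why the checkbox case is so simple: assigning a list of ids only has to work out which join rows to add and which to remove. The sketch below is plain Ruby, not Rails; sync_ids is a hypothetical helper, and the blank entry in the submitted list assumes the form sends an empty value alongside the ticked boxes.

```ruby
# Plain-Ruby sketch (no Rails): what assigning a new id list amounts to
# for a join table that holds nothing but key pairs.
def sync_ids(current_ids, submitted_ids)
  # Form params arrive as strings; drop blanks and normalise to integers.
  submitted = submitted_ids.reject { |id| id.to_s.empty? }.map(&:to_i).uniq
  {
    insert: submitted - current_ids, # boxes newly ticked -> new join rows
    delete: current_ids - submitted  # boxes unticked     -> join rows removed
  }
end

changes = sync_ids([10, 11], ["", "11", "12"])
p changes[:insert] # => [12]
p changes[:delete] # => [10]
```

Once the relationship needs anything beyond present/absent, a diff of ids is no longer enough, which is where has_many :through and nested attributes come in.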
In my experience it's always better to use has_many :through because you can add timestamps to the join table. Many times, while debugging ActiveRecord objects connected through HABTM, I missed having created_at and updated_at timestamps to get a clue about what actually happened.
So keep in mind that timestamps help you debug and investigate issues with data relations in the context of time; without them you are "blind" to when relations were created or updated.
Many of the answers clarify that you should use has_and_belongs_to_many rather than has_many :through if you will not need any extra data or validations on the join table.
However, beware of taking this approach. In the early stages of application development, it is nearly impossible to know what extra features or validations you may need in the far future of your project's lifecycle. If you decide to use has_and_belongs_to_many and want to add one simple data point or validation two years down the road, migrating this change will be extremely difficult and bug-prone. To be safe, default to has_many :through.
The simplest rule of thumb is that you can go with has_many :through relationship if you need to work with the relationship model as an independent entity.
If you don't need to do anything with the relationship model, it may be simpler to set up a has_and_belongs_to_many relationship (though you'll need to remember to create the joining table in the database).
You should use has_many :through if you need validations, callbacks, or extra attributes on the join model.
Rails offers two different ways to declare a many-to-many relationship between models. The first way is to use has_and_belongs_to_many, which lets you make the association directly, with no join model.
The second way to declare a many-to-many relationship is to use has_many :through. This makes the association indirectly, through a join model.
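A sketch of what the two kinds of declaration look like, using the Story/Category and Person/Magazine/Subscription examples from earlier in this thread. Note the assumption: the ApplicationRecord class defined at the top is only a stand-in so the snippet loads in plain Ruby. In a Rails app that class already exists and you would write just the model classes.

```ruby
# Stand-in for Rails' ApplicationRecord so the declarations below load in
# plain Ruby. It only records each macro call; it does not touch a database.
class ApplicationRecord
  def self.associations
    @associations ||= []
  end

  %i[has_many belongs_to has_and_belongs_to_many].each do |macro|
    define_singleton_method(macro) do |name, **options|
      associations << [macro, name, options]
    end
  end
end

# Direct many-to-many: no join model, only a categories_stories table.
class Story < ApplicationRecord
  has_and_belongs_to_many :categories
end

class Category < ApplicationRecord
  has_and_belongs_to_many :stories
end

# Indirect many-to-many: Subscription is a model in its own right.
class Person < ApplicationRecord
  has_many :subscriptions
  has_many :magazines, through: :subscriptions
end

class Subscription < ApplicationRecord
  belongs_to :person
  belongs_to :magazine
end

class Magazine < ApplicationRecord
  has_many :subscriptions
  has_many :people, through: :subscriptions
end

# Show what each side declared (the stand-in just lists the macro calls).
p Story.associations
p Person.associations
```

The practical difference shows in the middle class: there is a Subscription you can add columns, validations, and callbacks to, while the HABTM pair has nothing in between.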
You should use has_many :through if you need validations, callbacks, or extra attributes on the join model.
Related
TL;DR: How does one create a has_one association using through join table and vice-versa a belongs_to through?
Context: I have two models, ProcessLog and Encounter. ProcessLog, as the name (somewhat) suggests, saves the log of a single run (corresponding to a row in the DB) of an external process (which is run multiple times). Encounter, on the other hand, is a model that keeps track of some information X. Encounters can be produced either internally or as a result of a successful execution of the external process mentioned earlier. This means that not all Encounters have an associated ProcessLog and not all ProcessLogs have an associated Encounter. However, if there is a ProcessLog for an Encounter, this is a 1:1 relationship. An Encounter cannot have more than one ProcessLog and a ProcessLog cannot belong to more than one Encounter. From a DB design perspective, this is an optional relationship (I hope I haven't forgotten my lessons). In a database, this would be modelled using a join table with encounter_id as the primary key and process_log_id as the foreign key.
Rails: In Rails, 1:1 relationships are generally modelled without using a join table and the belongs_to table generally having a foreign key to the other table. So in my case, this would be encounter_id in process_logs table.
Problem: With the traditional Rails approach of has_one and belongs_to, this will result in many rows in the process_logs table with NULL values in the encounter_id column. There are pros and cons to this approach, but it is not my intention to discuss them here. Yes, it keeps the table structure simple; however, in my case it breaks the semantics and introduces lots of NULL values, which I don't consider a good approach. Avoiding those NULLs is also the reason a join table exists for optional relationships.
What have I done so far?: There aren't a whole lot of helpful documents I could find on this topic, except for the following two linked documents, though they have their own issues and don't solve my problem.
SO Question: the approach here uses has_many for the join model, whereas I have only one.
Discussion on RoR: similarly, it uses has_many and yet somehow talks about has_one.
I created a join model called EncounterProcessLog (which has belongs_to for both ProcessLog and Encounter) and then a has_one ..., through: on the other two models, but Rails is looking for a many-to-many association and of course looking for encounter_id on process_logs table.
Question: How can I achieve what I intend to achieve here? Something on the lines of (non-working) code below:
class Encounter < ApplicationRecord
  has_one :process_log, through: :encounter_process_logs
end
class ProcessLog < ApplicationRecord
  belongs_to :encounter, through: :encounter_process_logs # This may be an incorrect way of specifying the relationship?
end
class EncounterProcessLog < ApplicationRecord
  belongs_to :encounter
  belongs_to :process_log # Maybe this should be a has_one?
end
I hope someone is able to guide me in the right direction. Thanks for reading so far.
One way I can think of for doing this is:
class Encounter < ApplicationRecord
  has_one :encounter_process_log
  has_one :process_log, through: :encounter_process_log
end
class ProcessLog < ApplicationRecord
  has_one :encounter_process_log
  has_one :encounter, through: :encounter_process_log
end
class EncounterProcessLog < ApplicationRecord
  belongs_to :encounter
  belongs_to :process_log
end
This would return the process_log for an encounter and vice versa, which is probably what you want.
Let's say I have a user model and a movie model as well as tables to store them. Let's say I want to add a watchlist feature by adding a third table called watchlist_movies that simply maps a user_id to a movie_id.
I now want to add a watchlist_movie model. I want to be able to query ActiveRecord with user.watchlist_movies and get a collection of movie objects. So far, I've tried (in user.rb)
has_many :watchlist_movies
and
has_many :movies, through: :watchlist_movies
The first results in user.watchlist_movies returning a collection of WatchlistMovie records, and the second returns a collection of Movie records at user.movies. Essentially, what I want is for user.watchlist_movies to return a collection of Movie records; that is, I want the accessor defined by the first relationship to return the content of the second. How do I do this?
You're defining the relationships correctly here, but you may have a case of your expectations not aligning with how Rails does things. If your structure is that of Movie being related to User through WatchlistMovie, which is a simple join model, you're on the right track.
Remember you can call your relationships anything you want:
has_many :watchlist_movie_associations,
  class_name: 'WatchlistMovie'
has_many :watchlist_movies,
  through: :watchlist_movie_associations,
  source: :movie # assumes WatchlistMovie declares belongs_to :movie
I'd advise against this since it goes against the grain of how ActiveRecord prefers to name things. If it just bugs you right now, that's understandable, but if you embrace the style instead of fighting it you'll have an application that's a lot more consistent and understandable.
As a Ruby on Rails newbie, I'm going through the Rails Guides and tonight is Active Record Migrations.
After finishing the section on Join Tables (http://guides.rubyonrails.org/active_record_migrations.html#creating-a-join-table), I'm left with the impression that using create_join_table is preferred over (and simpler than) creating the join table via rails generate model.
Is this a correct assumption? Are there nuances I should be aware of?
Using the example in the guides of categories and products:
A join table works transparently. You only work with the two existing models (Category and Product), and the join table exists only to enable the HABTM relationship between them, so you can call category.products, or product.categories, and things just work.
Generating a model, on the other hand, would only be necessary if you need to work with that association as a distinct thing in your application (e.g. if you need to do things with a Categorization directly).
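Part of why the join table can stay invisible is that Rails finds it by name. As far as I know, the default is both table names in lexical order joined by an underscore, and that is also the name create_join_table picks. The helper below is a hypothetical plain-Ruby approximation of that rule for simple names, not a Rails API.

```ruby
# Hypothetical helper (not part of Rails): approximates the default name of
# a HABTM join table, i.e. both table names in lexical order joined by "_".
def habtm_join_table_name(table_a, table_b)
  [table_a.to_s, table_b.to_s].sort.join("_")
end

puts habtm_join_table_name(:products, :categories) # prints "categories_products"
puts habtm_join_table_name(:stories, :categories)  # prints "categories_stories"
```

Argument order does not matter, so create_join_table :products, :categories and create_join_table :categories, :products should both end up with categories_products.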
Contrast the Rails Guides description of a has_and_belongs_to_many association:
A has_and_belongs_to_many association creates a direct many-to-many
connection with another model, with no intervening model. For example,
if your application includes assemblies and parts, with each assembly
having many parts and each part appearing in many assemblies, you
could declare the models this way:
with that of a has_many :through association:
A has_many :through association is often used
to set up a many-to-many connection with another model. This
association indicates that the declaring model can be matched with
zero or more instances of another model by proceeding through a third
model. For example, consider a medical practice where patients make
appointments to see physicians. The relevant association declarations
could look like this:
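The code listings that accompanied the two quoted passages are missing here. The sketch below reconstructs them from the scenarios the passages describe (assemblies and parts, then physicians, appointments, and patients), so the guide's exact listing may differ. The ApplicationRecord class at the top is only a stand-in that lets the snippet load in plain Ruby; in a Rails app it already exists.

```ruby
# Stand-in for Rails' ApplicationRecord so the declarations below load in
# plain Ruby. It only records each macro call; it does not touch a database.
class ApplicationRecord
  def self.associations
    @associations ||= []
  end

  %i[has_many belongs_to has_and_belongs_to_many].each do |macro|
    define_singleton_method(macro) do |name, **options|
      associations << [macro, name, options]
    end
  end
end

# has_and_belongs_to_many: no intervening model, only a join table.
class Assembly < ApplicationRecord
  has_and_belongs_to_many :parts
end

class Part < ApplicationRecord
  has_and_belongs_to_many :assemblies
end

# has_many :through: Appointment is the third model the others go through.
class Physician < ApplicationRecord
  has_many :appointments
  has_many :patients, through: :appointments
end

class Appointment < ApplicationRecord
  belongs_to :physician
  belongs_to :patient
end

class Patient < ApplicationRecord
  has_many :appointments
  has_many :physicians, through: :appointments
end
```

Only the second group has a class in the middle, which is the part rails generate model would create; the first group needs nothing beyond the table that create_join_table sets up.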
So, yes, you're correct that create_join_table would be simpler than creating a model for the association. You can also see this answer for another explanation of the difference between these approaches.
As the docs mention...
Migration method create_join_table creates a HABTM join table
--
When you create a has_and_belongs_to_many relationship, you only need a join table (no model).
has_and_belongs_to_many join tables differ from has_many :through join tables in that they don't have an id (primary key). A has_many :through join table is meant to represent an interim model, so each "record" should have a unique identifier.
Thus, the premise of your question, whether it's better to create the join table through rails g model, doesn't hold for has_and_belongs_to_many: it has no model backing the join table. If you were using has_many :through, you could use rails g model without any problem.
I have the following models setup:
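class Clinic < ApplicationRecord
  has_many :occupations, through: :clinic_occupations
  has_many :clinic_occupations
end
class Occupation < ApplicationRecord
  has_many :clinic_occupations
end
class ClinicOccupation < ApplicationRecord
  # belongs_to takes the singular association name
  belongs_to :occupation
  belongs_to :clinic
end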
I think the has_many :clinic_occupations in clinics is probably unnecessary, but it's what we have right now so I wanted to include it. I am trying to only call occupations that have been associated with a clinic, or occupations that have no clinic id at all. What is the correct way to do this, and what are these model associations tangibly doing?
I think the has_many :clinic_occupations in clinics is probably unnecessary
Nope, it is necessary. For any has_many _, through: _ to work, you need to supply another association. That's exactly what through stands for: "an association through an association". Under the hood, this uses JOIN clauses to connect the "lower" associated objects with the "higher" ones. Since Rails 3.1 these can even be nested, producing a JOIN clause through multiple tables at once.
What it does is create a bunch of methods for your model, with some caching underneath.
Why should you define that association explicitly? Because:
It's safer. Association name may not follow Rails convention and thus no assumptions can be made just with its name. Let's face it, sometimes this happens. More on that below.
It offers all the features has_many associations have in general. Association may contain details that do not violate conventions (like conditions).
It's more flexible. This way you can define multiple associations that point to the same class. It's impossible to follow conventions in this case, since one name can only point to one method.
It's clearer when looking at the model why a single method call queries multiple tables at once.
In fact, you may want to access the join model (ClinicOccupation) directly when speeding up certain queries. There are cases when you already know the needed object's id and want to use it in a query without fetching the entire object first with an extra query.
Can someone please explain the pros and cons of has_many :through versus has_and_belongs_to_many?
There is nothing bad about using HABTM per se. The reason many people don't use this kind of association is that they use has_many :through instead. Why? Because it's more versatile. While HABTM "hides" the intermediary table, with has_many :through the middle man is a resource in itself, which is usually a good thing (if nothing else, you can timestamp the relationship). You'll come across many situations where you need to add some behavior or attributes to such a relationship (when designing a system in a resource-oriented fashion).