ActiveRecord Associations - Where to put functionality? - ruby-on-rails

I'm looking for some best-practice advice for the following situation.
I have the following skeleton ActiveRecord models:
# user.rb
class User < ActiveRecord::Base
has_many :country_entries, dependent: :destroy
end
# country_entry.rb
class CountryEntry < ActiveRecord::Base
belongs_to :user
validates :code, presence: true
end
Now suppose I need to get a comma-separated list of CountryEntry codes for a particular user. The question is, where do I put this method? There are two options:
# user.rb
#...
def country_codes
self.country_entries.map(&:code)
end
#...
-or-
# country_entry.rb
#...
def self.codes_for_user(user)
where(user_id: user.id).map(&:code)
end
#...
And so the APIs would be: @current_user.country_codes -or- CountryEntry.codes_for_user(@current_user)
Seems like placing the code in country_entry.rb decouples everything a little more, but it makes the API a little uglier. Any general or personal-experience best practices on this issue?

Instance method vs. class method: if the method operates on a single instance, it is better as an instance method.
In the User model vs. in the CountryEntry model: the User model wins. The Law of Demeter suggests using only one dot in a call chain. If you have the chance to follow it, it is better to do so.
Conclusion: Your first method wins.
# user.rb
def country_codes
self.country_entries.map(&:code)
end
Add: Reference for Law of Demeter
http://en.wikipedia.org/wiki/Law_of_Demeter
http://rails-bestpractices.com/posts/15-the-law-of-demeter
http://devblog.avdi.org/2011/07/05/demeter-its-not-just-a-good-idea-its-the-law/

Now this is really an interesting question. And it has so many answers ;-)
From your initial question I would suggest you put the code in the association itself
class User < ActiveRecord::Base
has_many :country_entries do
def codes
proxy_association.owner.country_entries.map(&:code)
end
end
end
so you could do something like this
list_of_codes = a_user.country_entries.codes
Now obviously this is a violation of the Law of Demeter.
So you would best be advised to offer a method on the User object like this
class User < ActiveRecord::Base
has_many :country_entries do
def codes
proxy_association.owner.country_entries.map(&:code)
end
end
def country_codes
self.country_entries.codes
end
end
Obviously nobody in the Rails world cares about the Law of Demeter so take this with a grain of salt.
As for putting the code into the CountryEntry class, I am not sure why you would do this. If you can look up country codes only with the user, I don't see the need to create a class method. You are only able to look that list up if you have a User at hand anyway.
If, however, many different objects can have a country_entries association, then it makes sense to put it as a class method into CountryEntry.
My favorite would be a combination of LOD and a class method for reuse purposes.
class User < ActiveRecord::Base
has_many :country_entries
def country_codes
CountryEntry.codes_for_user(self)
end
end
class CountryEntry < ActiveRecord::Base
belongs_to :user
validates :code, presence: true
def self.codes_for_user(user)
where(user_id: user.id).map(&:code)
end
end

In terms of API developers get from the two proposals, adding to the user model seems pretty straightforward. Given the problem:
Now suppose I need to get a comma-separated list of CountryEntry codes for a particular user.
The context is made of a user, for which we want to get the code list. The natural "entry point" seems a user object.
Another way to see the problem is in terms of responsibilities (thus linking to @robkuz's answer on the Law of Demeter). A CountryEntry instance is responsible for providing its code (and maybe a few other things). The CountryEntry class is basically responsible for providing attributes and methods common to all its instances, and no more. Getting the list of comma-separated codes is a specialized usage of CountryEntry instances that apparently only User objects care about. In this case, the responsibility belongs to the current user object. Value is in the eye of the beholder...
This is in line with most answers on the thread, although the solutions so far do not give you a comma-separated list of codes, but an array of codes.
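To make the array-versus-string point concrete, here is a minimal plain-Ruby sketch. The User and CountryEntry classes below are stand-ins built without ActiveRecord, and country_codes_list is a hypothetical method name that does not appear in the question.

```ruby
# Plain-Ruby stand-ins for the ActiveRecord models (assumption: no database).
CountryEntry = Struct.new(:code)

class User
  attr_reader :country_entries

  def initialize(country_entries)
    @country_entries = country_entries
  end

  # Array of codes, as in the answers on this thread.
  def country_codes
    country_entries.map(&:code)
  end

  # Comma-separated string, which is what the question literally asks for.
  # (country_codes_list is a hypothetical name.)
  def country_codes_list
    country_codes.join(",")
  end
end

user = User.new([CountryEntry.new("DE"), CountryEntry.new("FR")])
puts user.country_codes_list
```

With a real association, country_entries.pluck(:code) would fetch only the code column instead of instantiating every record, which may matter for the performance remark below.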
In terms of performance, note that there is probably a difference too because of lazy evaluation. This is just a note; someone more deeply familiar with ActiveRecord could comment on that!

I think @current_user.country_codes is a better choice in this case because it will be easier to use in your code.

Related

Ruby on rails active record queries which one is efficient

I was recently working on a project where I faced a dilemma of choosing between two ways of getting same results. Here is the class structure:
class Book < ApplicationRecord
belongs_to :author
end
class Author < ApplicationRecord
has_many :books
end
An author has first name, last name. I want to get the full name of the author for a given book as an instance method.
In simple active record terms, since book is associated with author, we can get the author name for a book as follows:
For example in Book class, we have:
class Book < ApplicationRecord
belongs_to :author
def author_name
"#{author.first_name} #{author.last_name}"
end
end
And we get the result!
But, according to the target of minimizing dependencies (POODR Book), future ease of change and better object oriented design, the book should not know properties of an author. It should interact with an author object by interfaces.
So Book should not be the one responsible for getting the Author name. The author class should.
class Book < ApplicationRecord
belongs_to :author
def author_name
get_author_name(self.author_id)
end
private
# minimizing class dependencies by providing private methods as external interfaces
def get_author_name(author_id)
Author.get_author_name_from_id(author_id)
end
end
class Author < ApplicationRecord
has_many :books
# class method which provides a gateway for other classes to communicate through interfaces, thus reducing coupling
def self.get_author_name_from_id(id)
author = self.find_by_id(id)
author == nil ? "Author Record Not Found" : "#{author.first_name.titleize} #{author.last_name.titleize}"
end
end
Now, book is just interacting with the public interface provided by Author and Author is handling the responsibility of getting full name from its properties which is a better design for sure.
I tried running the queries as two separate methods in my console:
class Book < ApplicationRecord
def author_name
get_author_name(self.author_id)
end
def author_name2
"#{author.last_name} + #{author.first_name}"
end
end
The results are shown below:
Looks like both run the same queries.
My questions are:
1. Does Rails convert author.last_name called inside the Book class to the same SQL query as Author.find_by_id(author_id).last_name called inside the Author class (through message passing from the Book class) in case of bigger data size?
2. Which one is more performant in case of bigger data size?
3. Doesn't calling author.last_name from the Book class violate design principles?
It's actually much more common and simpler to use delegation.
class Book < ApplicationRecord
belongs_to :author
delegate :name, to: :author, prefix: true, allow_nil: true
end
class Author < ApplicationRecord
has_many :books
def name
"#{first_name.titleize} #{last_name.titleize}"
end
end
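For readers who want to see what such a delegate line amounts to, here is a rough plain-Ruby sketch using the standard library's Forwardable rather than Rails' delegate. This is a swapped-in technique for illustration: the Struct-based Author and the constructor on Book are assumptions, not the models above.

```ruby
require "forwardable"

# Stand-in for the Author model (assumption: plain Struct, no database).
Author = Struct.new(:first_name, :last_name) do
  def name
    "#{first_name} #{last_name}"
  end
end

class Book
  extend Forwardable

  # Defines Book#author_name, which forwards to author.name.
  def_delegator :author, :name, :author_name

  attr_reader :author

  def initialize(author)
    @author = author
  end
end

book = Book.new(Author.new("Jane", "Austen"))
puts book.author_name
```

Unlike delegate with allow_nil: true, this Forwardable version raises NoMethodError when the author is nil.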
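As to performance, if you eager-load the authors at the time of the book query, you avoid running a separate author query for every book.
@books = Book.includes(:author)
Now when you iterate through @books and call book.author_name on each one, no further SQL query needs to be made to the authors table. (Note that joins(:author) on its own only adds the JOIN to the books query and does not load the authors; use eager_load(:author) if you want books and authors fetched in a single JOIN query.)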
1) Obviously not, it performs a JOIN of the books and authors tables. What you've made requires 2 queries: instead of 1 join you'll have Book.find(id) and Author.find(book.author_id).
2) JOIN should be faster.
3) Since last_name is a public interface, it absolutely doesn't violate design principles. It would violate them if you were accessing the author's last name from outside, like this: Book.find(1).author.last_name. That's a bad thing. Correct is Book.find(1).authors_last_name, accessing the author's name inside the model class.
Your provided example seems to be overcomplicated to me.
According to the example you shared, you only want to get the full name of the book's author. So the idea of splitting responsibility is correct, but the Author class should simply have an instance method full_name, like:
class Author < ApplicationRecord
has_many :books
def full_name
"#{first_name.titleize} #{last_name.titleize}"
end
end
class Book < ActiveRecord::Base
belongs_to :author
def author_name
author.full_name
end
end
Note that there are no direct queries in this code. Once you need the author's name somewhere (in a view, in an API response, etc.), Rails will run the query at that point. How efficient that is depends on your use case: it may be inefficient if, for example, you iterate over books and call author in a loop.
I prefer the second approach because full_name is a property of the author, not of the book. If the book wants to access that information, it can use book.author&.full_name (& handles the case of books with no author).
But I would suggest a refactoring as below:
class Book < ApplicationRecord
belongs_to :author
end
class Author < ApplicationRecord
has_many :books
def full_name
"#{first_name} #{last_name}"
end
end
Does rails convert author.last_name called inside the Book class to the same SQL query as Author.find_by_id(author_id).last_name called inside Author class (through message passing from Book class) in case of bigger data size?
It depends on how the method is called. In your example both will generate the same query, but if you have an includes/joins clause while getting the Book/Author, they will generate different queries.
As per the Rails convention, Author.find_by_id(author_id).last_name is not recommended, as it will fire a query on the database every time the method is called. One should use the Rails association interface to call the method on the related object, which is smart enough to use the object from memory, or fetch it from the database if it is not in memory.
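The difference can be illustrated without a database. The sketch below is only a simulation under stated assumptions: FakeAuthorTable is a hypothetical in-memory table that counts lookups, and author_uncached is a hypothetical method that mimics calling find_by_id on every access. The memoized author method stands in, roughly, for how an association reader reuses an already loaded object.

```ruby
# Hypothetical in-memory "table" that counts lookups, standing in for the database.
class FakeAuthorTable
  attr_reader :query_count

  def initialize(rows)
    @rows = rows
    @query_count = 0
  end

  def find_by_id(id)
    @query_count += 1
    @rows[id]
  end
end

class Book
  def initialize(author_id, table)
    @author_id = author_id
    @table = table
  end

  # Looks the author up again on every call.
  def author_uncached
    @table.find_by_id(@author_id)
  end

  # Looks the author up once and remembers the result.
  def author
    @author ||= @table.find_by_id(@author_id)
  end
end

table = FakeAuthorTable.new({ 1 => "Sandi" })
book = Book.new(1, table)
3.times { book.author_uncached }
3.times { book.author }
puts table.query_count
```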
Which one is more performant in case of bigger data size?
author.last_name is better because it benefits from joins, includes, and association caching when they are used, which helps avoid the N+1 query problem.
Doesn't calling author.last_name from Book class violates design principles?
No, you can even use delegate like @Steve suggested.
In my experience, it's a balancing act between minimizing code complexity and minimizing scalability issues.
However, in this case, I think the simplest solution that would separate class concerns and minimize code would be to simply use: @book.author.full_name
And define full_name in author.rb:
def full_name
"#{self.first_name} #{self.last_name}"
end
This will simplify your code a lot. For example, if in the future you had another model called Magazine that has an Author, you don't have to go define author_name in the Magazine model as well. You simply use @magazine.author.full_name. This will DRY up your code nicely.

Best code structure for Rails associations

The Stage
Let's talk about the most common type of association we encounter.
I have a User which :has_many Post(s)
class User < ActiveRecord::Base
has_many :posts
end
class Post < ActiveRecord::Base
belongs_to :user
end
Problem Statement
I want to do some (very light and quick) processing on all the posts of a user. I am looking for the best way to structure my code to achieve it. Below are a couple of ways and why they work or don't work.
Method 1
Do it in the User class itself.
class User < ActiveRecord::Base
has_many :posts
def process_posts
posts.each do |post|
# code of whatever 'process' does to posts of this user
end
end
end
Post class remains the same:
class Post < ActiveRecord::Base
belongs_to :user
end
The method is called as:
User.find(1).process_posts
Why doesn't this look like the best way to do it?
The logic of doing something with the posts of the user should really belong to the Post class. In a real world scenario, a user might also have :has_many relations with a lot of other classes e.g. orders, comments, children etc.
If we start adding similar process_orders, process_comments, process_children (yikes) methods to the User class, it'll result in one giant file with lots of code much of which could (and should) be distributed to where it belongs i.e. the target associations.
Method 2
Proxy Associations and Scopes
Both of these constructs require addition of methods/code to the User class which again makes it bloated. I'd rather have all implementation shifted to the target classes.
Method 3
Class Method on target Class
Create class methods in the target class and call those methods on the User object.
class User < ActiveRecord::Base
has_many :posts
# all target specific code in target classes
end
class Post < ActiveRecord::Base
belongs_to :user
# Class method
def self.process
Post.all.each do |post| # see Note 2 below
# code of whatever 'process' does to posts of this user
end
end
end
The method is called as:
User.find(1).posts.process # See Note 1 below
Now, this looks and feels better than Method 1 and 2 because:
User model remains clutter free.
The process function is called process instead of process_posts. Now we can have a process for other classes as well and invoke them as: User.find(1).orders.process etc. instead of User.find(1).process_orders (Method 1).
Note 1:
Yes, you can call a class method like this on an association. Read why here. TL;DR is that User.find(1).posts returns a CollectionProxy object which has access to class methods of the target (Post) class. It also conveniently passes a scope_attributes which stores the user_id of the user which called posts.process. This comes in handy. See Note 2 below.
Note 2:
For people not sure what's going on when we do a Post.all.each in the class method: it returns all the posts of the user this method was called on, rather than all the posts in the database.
So when called as User.find(99).posts.process, Post.all executes:
SELECT "posts".* FROM "posts" WHERE "posts"."user_id" = $1 [["user_id", 99]]
which are all the posts for User ID: 99.
Per @Jesuspc's comment below, Post.all.each can be succinctly written as all.each. It's more idiomatic and doesn't make it look like we are querying all posts in the database.
The Answer I am looking for
Explains what is the best way to handle such associations. How do people do it normally? and if there are any obvious design flaws in Method 3.
There's a fourth option. Move this logic out of the model entirely:
class PostProcessor
def initialize(posts)
@posts = posts
end
def process
@posts.each do |post|
# ...
end
end
end
PostProcessor.new(User.find(1).posts).process
This is sometimes called the Service Object pattern. A very nice bonus of this approach is that it makes writing tests for this logic really simple. Here's a great blog post on this and other ways to refactor "fat" models: http://blog.codeclimate.com/blog/2012/10/17/7-ways-to-decompose-fat-activerecord-models/
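As a rough illustration of the testing benefit, the sketch below fills in a hypothetical process step (upcasing titles, which is not in the original) and feeds the service object plain Structs instead of ActiveRecord records. FakePost is an assumed stand-in name.

```ruby
# Same shape as the PostProcessor above, with a hypothetical "process" step
# (upcasing titles) filled in purely for illustration.
class PostProcessor
  def initialize(posts)
    @posts = posts
  end

  def process
    @posts.map { |post| post.title.upcase }
  end
end

# A Struct stands in for the Post model, so no database is needed in a test.
FakePost = Struct.new(:title)

result = PostProcessor.new([FakePost.new("hello"), FakePost.new("world")]).process
puts result.inspect
```

Because the processor only needs objects that respond to title, a unit test can pass in any collection of such objects.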
Personally, I think that Method 1 is the cleanest one. It is very clean and understandable to write something like this:
class User < ActiveRecord::Base
has_many :posts
def process_posts
posts.each do |post|
post.process
end
end
end
And put all the logic of the process method in the Post model (as an instance method):
class Post < ActiveRecord::Base
belongs_to :user
def process
# Logic of your Post process
end
end
That way, the logic of processing a Post belongs to the Post class. Even if your User model ends up with many "process" functions, these will be very basic and small. That seems very clean to me, as a developer.
Method 3 has many technical implications that are pretty complex and unintuitive (you yourself had to clarify your question).
NOTE: If you want better performance, maybe you should use eager loading to reduce ActiveRecord calls, but that is out of the scope of this question.
First of all excuse me for the opinionated answer.
ActiveRecord models are a controversial matter. Their essence goes against the Single Responsibility Principle, since they handle both database interaction (via class methods) and domain objects, which usually implement their own behaviour (via instances). At the same time they also break the Liskov Substitution Principle because the models are not sub cases of ActiveRecord::Base and implement their own set of methods. And finally the ActiveRecord paradigm often leads to code that breaks the Law of Demeter, as in your proposal for the third method:
User.find(1).posts.process
Thus, there is a trend which, in order to reduce coupling, recommends using ActiveRecord objects only to interact with the database, so that no behaviour is added to them (in your case the process method). From my point of view that is the lesser evil, even though it is still not a perfect solution.
So if I were to implement what you describe I would have a ProcessablePostsCollection object (where the name Processable can be customised to better describe what the processing is about, or even dropped completely so you would simply have a PostsCollection class) that would probably be a wrapper over a list of posts using SimpleDelegator and would have a method process.
class ProcessablePostsCollection < SimpleDelegator
def self.from_collection(collection)
new collection
end
def initialize(source)
super source
end
def process
# code of whatever 'process' does to posts
end
end
And the usage would be something like:
ProcessablePostsCollection.from_collection(User.find(1).posts).process
even though the from_collection and the call to process should happen in different classes.
Also, in case you have a big posts table it would probably be wise to process stuff in batches. For that your process method could call find_in_batches on your posts ActiveRecord::Relation.
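A minimal sketch of the batching idea, under loud assumptions: it wraps a plain Array and uses each_slice from Enumerable as a stand-in for find_in_batches, the posts are plain hashes, and both the batch_size parameter and the "mark as processed" step are hypothetical.

```ruby
require "delegate"

class ProcessablePostsCollection < SimpleDelegator
  # batch_size is a hypothetical parameter for this sketch.
  # Returns the number of batches that were processed.
  def process(batch_size: 2)
    batches = 0
    # each_slice stands in for find_in_batches on an ActiveRecord::Relation.
    __getobj__.each_slice(batch_size) do |batch|
      batch.each { |post| post[:processed] = true } # hypothetical processing step
      batches += 1
    end
    batches
  end
end

posts = [{ id: 1 }, { id: 2 }, { id: 3 }]
collection = ProcessablePostsCollection.new(posts)
puts collection.process(batch_size: 2)
```

Everything other than process is still forwarded to the wrapped collection, so collection.size and collection.first keep working.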
But as always it depends on your needs. If you are simply building a prototype, it is perfectly fine to let your models grow fat, and if you are building an enormous application, Rails itself is probably not going to be the best choice, since it discourages some OOP best practices with things such as ActiveRecord models.
You shouldn't be putting this in the User model - put it in Post (unless - of course - the scope of process involves the User model directly) :
#app/models/post.rb
class Post < ActiveRecord::Base
def process
return false if published?
# do something
end
end
Then you can use an ActiveRecord Association Extension to add the functionality to the User model:
#app/models/user.rb
class User < ActiveRecord::Base
has_many :posts do
def process
proxy_association.target.each do |post|
post.process
end
end
end
end
This will allow you to call...
@user = User.find 1
@user.posts.process

Is an Empty Table for implementing Reverse Polymorphism and ActiveRecord::Base okay?

I have spent a lot of thought on this situation and cannot figure out what the best modeling system is:
There is a Test. A test can have a variety of TestItems. These TestItems can (currently) consist of TrueFalseQuestions, MultipleChoiceQuestions, ShortAnswerQuestions, and TestInfo.
All of the models will implement some sort of Printable module. They will all be printable, but each model handles its printing in a different way. All models will also have a position as they are sortable in relation to all other models. All models can belong to a test.
All models of type XXXQuestion will print numbers when they print. The TestInfo will not do that.
MultipleChoiceQuestions will have Answers as children.
I have tried creating a TestItem class that uses reverse polymorphism and a shareable question module:
class TestItem < ActiveRecord::Base
belongs_to :test
belongs_to :item, polymorphic: true
# db fields: main_text, position, item_id, item_type
def sort(params)
...
end
end
module QuestionPrintable
def get_print_number
...
end
def print
raise NotImplementedError
end
end
module Question
def self.included(klass)
klass.class_eval do
include QuestionPrintable
has_one :test_item, as: :item, dependent: :destroy
delegate :test, :main_text, to: :test_item
end
end
end
class MultipleChoiceQuestion < ActiveRecord::Base
include Question
has_many :answers
def print
number = get_print_number
...
end
end
This would work, except that some models (like TrueFalseQuestion) would not actually expand the TestItem class. They would have no extra information in the TrueFalseQuestions table, but they would implement methods unique to TrueFalseQuestions. I realize I could also wrap a TestItem in a TrueFalseQuestion wrapper whenever it's instantiated but then I would need to store the kind of the question on the TestItem to know when to do that. So, in some sense, the TrueFalseQuestion < ActiveRecord::Base class is actually storing the kind implicitly just by existing. I don't know if that is a valid use of ActiveRecord::Base.
All the questions do share the printing features of a number (and several behaviors I anticipate needing, just not quite yet) that are not shared with other types of TestItems (i.e. TestInfo). Additionally, some Question types will store extra data right now. And I believe that all of them will store more data as this problem evolves. So I do think that abstraction is helpful. Is it okay to have a table that more or less exists to allow the implementation of a polymorphic ActiveRecord model?
Also, having the text on the TestItem prevents a crazy amount of joins to display the main text of all items for a test.
The big difficulty is that if I do this a different way (for example, not having a TestItem class and just a bunch of shared modules, or storing these all as TestItems with a :kind attribute), I need to start switching behavior on the class type or an attribute, and I try to avoid any code that tests on class type or has so much behavior switch based on an attribute value.
I think in general those solutions can be achieved with duck typing, which would work with my empty ActiveRecord class, but this one just has me puzzled.
EDIT:
Another solution that occurred to me, that would prevent switching on kind would be to use some sort of kind value in the TestItem and use it to create a wrapper:
class TestItem < ActiveRecord::Base
belongs_to :test
attr_accessor :main_text, :position, :kind
def wrapped_object
klass = kind.constantize
klass.new(_needed_params)
end
end
class TrueFalseQuestion # DO NOT INHERIT
attr_accessor :kind, :position
def print
...
end
end
I left out the various modules to not distract from the general solution, those can be easily implemented.
So now my potential debate is:
Empty Database Tables
Positives:
No wrappers needed
More extendable in the future
Negatives:
It's an empty table....
Possible YAGNI
Method that returns wrapped object
Positives:
Solves the immediate problem without introducing extra database tables
Allows for all the same abstractions in the previous solution
Negatives:
Relies on the kind attribute (maybe not bad in this case?)
If the domain changes this could easily become too complex to maintain
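The wrapper option from the EDIT can be sketched in plain Ruby as follows. This is only an illustration under assumptions: it uses Object.const_get in place of ActiveSupport's constantize, the classes are not ActiveRecord models, the print bodies are invented, and ALLOWED_KINDS is a hypothetical allowlist added so that arbitrary strings in kind cannot instantiate arbitrary classes.

```ruby
class TrueFalseQuestion
  def initialize(main_text)
    @main_text = main_text
  end

  # Hypothetical rendering, for illustration only.
  def print
    "#{@main_text} (True / False)"
  end
end

class TestInfo
  def initialize(main_text)
    @main_text = main_text
  end

  def print
    @main_text
  end
end

class TestItem
  # Hypothetical allowlist of kinds that may be wrapped.
  ALLOWED_KINDS = %w[TrueFalseQuestion TestInfo].freeze

  attr_reader :kind, :main_text

  def initialize(kind, main_text)
    @kind = kind
    @main_text = main_text
  end

  # Builds the wrapper object whose class is named by kind.
  def wrapped_object
    raise ArgumentError, "unknown kind: #{kind}" unless ALLOWED_KINDS.include?(kind)

    Object.const_get(kind).new(main_text)
  end
end

item = TestItem.new("TrueFalseQuestion", "The sky is blue.")
puts item.wrapped_object.print
```

Callers only ever send print to the wrapped object, so no code outside wrapped_object has to switch on kind.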

Law of Demeter for Views: create delegates to access attributes on associated objects or not?

The Law of Demeter seems to be a very powerful concept. I can understand how it helps writing good and maintainable object-oriented code.
Some people suggest to write a delegate method each time you need to access an attribute of an associated object in a view. Instead of writing something like this in a view
@order.customer.name
you would write this code:
# model
class Order < ActiveRecord::Base
belongs_to :customer
delegate :name, :to => :customer, :prefix => true
end
#view
@order.customer_name
On the other hand, people argue that your views should not dictate models and you should not add methods such as delegate to a model only for the sake of trading a dot for an underscore in a view.
When violating the Law of Demeter in a view, is it considered best practice to write delegate methods in models or not?
I see your customer_name auto-generated delegate method as the Simplest Thing That Works right now. Since it's one method call (and not a series of method chains) it's easy to refactor later (or, easier to refactor than some chained methods).
Imagine adding many customers to an order, one of which is the primary customer, for whatever reason. Now your order class might look like
class Order < ActiveRecord::Base
has_many :customers
def customer_name
if customers.first.primary?
customers.first.name
else
customers.last.name
end
end
end
It was easy to replace that convenience delegate generated method with one of our own.
(It's also super easy to write the first time, as delegate takes care of all the boilerplate. It's very possible you'll use customer_name in this form forever in your app. It's hard to know. But code that's easy/automatic to write the first time is cheap to throw away :))
Of course you have to avoid situations where you are writing method names like customer_streetaddress_is_united_states? (where yes, instead of encoding the object graph in dots you're encoding it in underscores.)
If your view really needs to know if the user is located in the US perhaps a method like this might work:
class Order < ActiveRecord::Base
belongs_to :customer
def shipping_to_us?
customer.shipping_country == "USA"
# Law of Demeter violation would be:
# customer.addresses.first.country == "USA"
end
end
class Customer < ActiveRecord::Base
has_many :addresses
def shipping_country
addresses.first.country
end
end
Notice here how the Order asks the Customer object for the shipping country, vs. reaching through the customer to get its first address's country. Like a boss that tells you to do something and leaves you alone vs. a boss that micromanages exactly how you do your day to day. (For additional edification, read up on the Tell, Don't Ask approach to Ruby development :) )
There is something to be said about using presenters, decorator methods, or helpers to avoid having this potentially just display logic code littering your models. I'll leave that as an exercise for the reader :)
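As one possible starting point for that exercise, here is a small presenter sketch built on SimpleDelegator from the standard library. The Struct-based Order and Customer, the OrderPresenter name, and the "Unknown customer" fallback are all assumptions made for illustration.

```ruby
require "delegate"

# Plain stand-ins for the models (assumption: no ActiveRecord here).
Customer = Struct.new(:name)
Order = Struct.new(:customer, :total)

# Wraps an order and adds display-only methods, keeping them out of the model.
class OrderPresenter < SimpleDelegator
  def customer_name
    customer = __getobj__.customer
    customer ? customer.name : "Unknown customer" # hypothetical fallback text
  end
end

presenter = OrderPresenter.new(Order.new(Customer.new("Ada"), 10))
puts presenter.customer_name
puts presenter.total
```

The view talks to the presenter with a single dot, while the Order model itself gains no view-specific methods.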

How many classes is too many? Rails STI

I am working on a very large Rails application. We initially did not use much inheritance, but we have had some eye opening experiences from a consultant and are looking to refactor some of our models.
We have the following pattern a lot in our application:
class Project < ActiveRecord::Base
has_many :graph_settings
end
class GraphType < ActiveRecord::Base
has_many :graph_settings
#graph type specific settings (units, labels, etc) stored in DB and very infrequently updated.
end
class GraphSetting < ActiveRecord::Base
belongs_to :graph_type
belongs_to :project
# Project implementation of graph type specific settings (y_min, y_max) also stored in db.
end
This also results in a ton of conditionals in views, helpers and in the GraphSetting model itself. None of this is good.
A simple refactor would be to get rid of GraphType in favor of a structure more like this:
class Graph < ActiveRecord::Base
belongs_to :project
# Generic methods and settings
end
class SpecificGraph < Graph
# Default methods and settings hard coded
# Project implementation specific details stored in db.
end
Now this makes perfect sense to me, eases testing, removes conditionals, and makes later internationalization easier. However we only have 15 to 30 graphs.
We have a very similar model (too complicated to use as an example) with probably close to 100 different 'types', and that could potentially double. They would all have relationships and methods they inherited; some would need to override more methods than others. It seems like the perfect use, but that many just seems like a lot.
Is 200 STI classes too many? Is there another pattern we should look at?
Thanks for any wisdom and I will answer any questions.
If the differences are just in the behavior of the class, then I assume it shouldn't be a problem, and this is a good candidate for STI. (Mind you, I've never tried this with so many subclasses.)
But, if your 200 STI classes each have some unique attributes, you would need a lot of extra database columns in the master table which would be NULL, 99.5% of the time. This could be very inefficient.
To create something like "multiple table inheritance", what I've done before with success was to use a little metaprogramming to associate other tables for the details unique to each class:
class SpecificGraph < Graph
include SpecificGraphDetail::MTI
end
class SpecificGraphDetail < ActiveRecord::Base
module MTI
def self.included(base)
base.class_eval do
has_one :specific_graph_detail, :foreign_key => 'graph_id', :dependent => :destroy
delegate :extra_column, :extra_column=, :to => :specific_graph_detail
end
end
end
end
The delegation means you can access the associated detail fields as if they were directly on the model instead of going through the specific_graph_detail association, and for all intents and purposes it "looks" like these are just extra columns.
You have to trade off the situations where you need to join these extra detail tables against just having the extra columns in the master table. That will decide whether to use STI or a solution using associated tables, such as my solution above.
