EDIT -- I changed the object-oriented modeling to better reflect the not-particularly-intuitive relationship in my app, (thanks, Anthony) so that this question makes more sense. Sorry about that.
In my Rails app, I want certain models to be able to delegate to other models on not just an instance level but also a class/relation level. That is to say, assuming a House model, which has_many Users and defines a class-level method "addresses" that's called on relations of houses, I could do this:
users.addresses
Behind the scenes, this would actually do two things: 1) run users.houses (which would grab all houses with an ID among those plucked from the house_id column of the relation of users), and 2) call addresses on that relation of houses.
My attempt currently looks something like this:
class House
has_many :users
def self.addresses
map(&:address)
end
def address
"#{street_address}, #{state}, #{country}"
end
end
class User
belongs_to :house
delegate :address, to: :house
class << self
delegate :addresses, to: :houses
end
def self.houses
House.where(id: pluck(:house_id))
end
...
end
This fundamentally seems to work -- almost. If I have a group of users, I can do users.houses and grab the associated relation of houses. If I call addresses on an unrelated relation of houses, the method works great. If I call users.addresses, it calls the class-level method of that name in houses.rb.
But, the method errors when I try to chain these things together.
If I call users.addresses (or users.houses.addresses, so delegation isn't the issue here per se), the House.addresses method is called, but within that method, self seems to be not the relation of houses that users.houses should (and generally does) return, but just the House class itself. Consequently, if I call any array methods on self within addresses (which would work on a relation), the method throws an error such as undefined method 'map' for <Class:0x1230101239123>.
This problem persists even if I take away my fancy class-level delegation logic and replace it with the explicit version:
class User
...
def self.addresses
houses.addresses
end
end
Same error. Also when I just try users.houses.addresses.
The only version that does work is the following:
class User
...
def self.addresses
houses.map(&:address)
end
end
In other words, the logic chain seems to be identical, but moving the map into the User method and out of the House method fixes things.
I'm super confused by this, because to my eyes, that last (successful) version should fundamentally be identical to the two failing versions above. The only difference is that in the successful version, I'm repeating logic in a way I'd like to avoid.
So I guess the questions are
1) Why is users.houses returning a relation, but users.addresses (and users.houses.addresses) suddenly calling addresses on the class rather than the relation?
2) Given these issues, is there a better, more Railsy way to set up the relationship between House and User such that I can run these class-level query methods (i.e. users.houses)?
Any opinions on the best way to go about fixing this?
Related
In our Rails application, the Post resource can be made by either a User or an Admin.
Thus, we have an ActiveRecord model class called Post, with a belongs_to :author, polymorphic: true.
However, in certain conditions, the system itself is supposed to be able to create posts.
Therefore, I'm looking for a way to add e.g. System as author.
Obviously, there will only ever be one System, so it is not stored in the database.
Naïvely attempting to just add an instance (e.g. the singleton instance) of class System; end as author returns errors like NoMethodError: undefined method `primary_key' for System:Class.
What would be the cleanest way to solve this?
Is there a way to write a 'fake' ActiveRecord model that is not actually part of the database?
There are two ways that I see that make the most sense:
Option A: Add a 'system' Author record to the DB
This isn't a horrible idea, it just shifts the burden onto you making sure certain records are present in every environment. But you can always create these records in seed files if you want to ensure they're always created.
The benefit over option B is that you can just use standard ActiveRecord queries to find all of the system's Posts.
Option B: Leave the association nil and add a new flag for :created_by_system
This is what I would opt for. If a Post was made by the system, just leave the author reference blank and set a special flag to indicate this model was created internally.
You can still have a method to quickly get a list of all of them just by making a scope:
scope :from_system, -> { where(created_by_system: true) }
Which one you choose I think depends on whether you want to be able to query Post.author and get information about the System. In that case you need to take option A. Otherwise, I would use option B. I'm sure there's some other ways to do it too but I think this makes the most sense.
Finally I ended up with creating the following 'fake' model class that does not require any changes to the database schema.
It leverages a bit of metaprogramming:
# For the cases in which the System itself needs to be given an identity.
# (such as when it does an action normally performed by a User or Admin, etc.)
class System
include ActiveModel::Model
class << self
# The most beautiful kind of meta-singleton
def class
self
end
def instance
self
end
# Calling `System.new` is a programmer mistake;
# they should use plain `System` instead.
private :new
def primary_key
:id
end
def id
1
end
def readonly?
true
end
def persisted?
true
end
def _read_attribute(attr)
return self.id if attr == :id
nil
end
def polymorphic_name
self.name
end
def destroyed?
false
end
def new_record?
false
end
end
end
Of main note here is that System is both its own class and its own instance.
This has the following advantages:
We can just pass Post.new(creator: System) rather than System.new or System.instance
There is at any point only one system.
We can define the class methods that ActiveRecord requires (polymorphic_name) on System itself rather than on Class.
Of course, whether you like this kind of metaprogramming or find it too convoluted is very subjective.
What is less subjective is that overriding ActiveRecord's _read_attribute is not nice; we are depending on an implementation detail of ActiveRecord. Unfortunately to my knowledge there is no public API exposed that could be used to do this more cleanly. (In our project, we have some specs in place to notify us immediately when ActiveRecord might change this.)
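The "class that is its own instance" part can be tried in isolation from Rails. Below is a minimal plain-Ruby sketch of just that trick. The name SystemActor is hypothetical, and ActiveModel plus all of the persistence-related methods are deliberately left out, so this only illustrates the singleton behaviour, not the polymorphic association itself:

```ruby
# Hypothetical stand-in for the System class above, without ActiveModel.
class SystemActor
  class << self
    # The class answers as its own class and its own instance.
    def class
      self
    end

    def instance
      self
    end

    # Instantiating is treated as a programmer mistake.
    private :new

    def id
      1
    end

    def polymorphic_name
      name
    end
  end
end

SystemActor.class.equal?(SystemActor)    # => true
SystemActor.instance.equal?(SystemActor) # => true
SystemActor.polymorphic_name             # => "SystemActor"

begin
  SystemActor.new
rescue NoMethodError
  # new is private, so this branch is taken
end
```

Because new is private on the singleton class, any code path that accidentally tries to build a second "system" fails loudly instead of silently creating one.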
Motivation
The motivation was that I want to embed the serialization of any models that have been included in a Relation chain. What I've done works at the relation level, but if I get a single record, the serialization can't take advantage of what I've done.
What I've achieved so far
Basically what I'm doing is using the method includes_values of the class ActiveRecord::Relation, which simply tells me what things have been included so far. For example:
> Appointment.includes(:patient).includes(:slot).includes_values
=> [:patient, :slot]
To take advantage of this, I'm overwriting the as_json method at the ActiveRecord::Relation level, with this initializer:
# config/initializers/active_record_patches.rb
module ActiveRecord
class Relation
def as_json(**options)
super(options.merge(include: includes_values)) # I could precondition this behaviour with a config
end
end
end
What it does is to add for me the option include in the as_json method of the relation.
So, the old chain:
Appointment.includes(:patient).includes(:slot).as_json(include: [:patient, :slot])
can now be written without the last include:
Appointment.includes(:patient).includes(:slot).as_json
obtaining the same results (the Patient and Slot models are embedded in the generated hash).
THE PROBLEM
The problem is that because the method includes_values belongs to the class ActiveRecord::Relation, I can't use it at the record level to know whether a call to includes has been made.
So currently, when I get a record from such queries, and call as_json on it, I don't get the embedded models.
And the actual problem is to answer:
how to know the included models in the query chain that retrieved the
current record, given that it happened?
If I could answer this question, then I could overwrite the as_json method in my own Models with:
class ApplicationRecord < ActiveRecord::Base
self.abstract_class = true
extend Associations
def as_json(**options)
super(options.merge(include: included_models_in_the_query_that_retrieved_me_as_a_record))
end
end
One Idea
One Idea I have is to overwrite the includes somewhere (could be in my initializer overwriting directly the ActiveRecord::Relation class, or my ApplicationRecord class). But once I'm there, I don't find an easy way to "stamp" arbitrary information in the Records produced by the relation.
This solution feels quite clumsy and there might be better options out there.
class ApplicationRecord < ActiveRecord::Base
def as_json(**options)
loaded_associations = _reflections.each_value
.select { |reflection| association(reflection.name).loaded? }
.map(&:name)
super(options.merge(include: loaded_associations))
end
end
Note that this only detects first-level associations. With Appointment.includes(patient: :person), only :patient will be returned, since :person is nested. If you plan on making this recursive, beware of circular loaded associations.
It is worth pointing out that you currently merge include: ... over the provided options, which gives the caller no way to pass their own include option. I recommend using reverse_merge instead, or swapping the order around: { include: ... }.merge(options).
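The difference in merge order is easy to see with plain hashes. A small sketch, where DEFAULT_INCLUDES and the two helper method names are made up for illustration (reverse_merge itself comes from ActiveSupport, so the plain-Ruby equivalent defaults.merge(options) is used here):

```ruby
# Hypothetical defaults standing in for the detected associations.
DEFAULT_INCLUDES = { include: [:patient, :slot] }.freeze

# Current behaviour: the defaults overwrite whatever the caller passed.
def forced_options(options)
  options.merge(DEFAULT_INCLUDES)
end

# Suggested behaviour: the caller's options win over the defaults.
def overridable_options(options)
  DEFAULT_INCLUDES.merge(options)
end

caller_options = { include: [:patient], only: [:id] }

forced_options(caller_options)
# => { include: [:patient, :slot], only: [:id] }

overridable_options(caller_options)
# => { include: [:patient], only: [:id] }
```

With the second form, a caller who passes no include still gets the defaults, while a caller who does pass one keeps control.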
The Stage
Let's talk about the most common type of association we encounter.
I have a User which has_many Posts:
class User < ActiveRecord::Base
has_many :posts
end
class Post < ActiveRecord::Base
belongs_to :user
end
Problem Statement
I want to do some (very light and quick) processing on all the posts of a user. I am looking for the best way to structure my code to achieve it. Below are a couple of ways and why they work or don't work.
Method 1
Do it in the User class itself.
class User < ActiveRecord::Base
has_many :posts
def process_posts
posts.each do |post|
# code of whatever 'process' does to posts of this user
end
end
end
Post class remains the same:
class Post < ActiveRecord::Base
belongs_to :user
end
The method is called as:
User.find(1).process_posts
Why this doesn't look like the best way to do it
The logic of doing something with the posts of the user should really belong to the Post class. In a real world scenario, a user might also have :has_many relations with a lot of other classes e.g. orders, comments, children etc.
If we start adding similar process_orders, process_comments, process_children (yikes) methods to the User class, it'll result in one giant file with lots of code much of which could (and should) be distributed to where it belongs i.e. the target associations.
Method 2
Proxy Associations and Scopes
Both of these constructs require addition of methods/code to the User class which again makes it bloated. I'd rather have all implementation shifted to the target classes.
Method 3
Class Method on target Class
Create class methods in the target class and call those methods on the User object.
class User < ActiveRecord::Base
has_many :posts
# all target specific code in target classes
end
class Post < ActiveRecord::Base
belongs_to :user
# Class method
def self.process
Post.all.each do |post| # see Note 2 below
# code of whatever 'process' does to posts of this user
end
end
end
The method is called as:
User.find(1).posts.process # See Note 1 below
Now, this looks and feels better than Method 1 and 2 because:
User model remains clutter free.
The process function is called process instead of process_posts. Now we can have a process for other classes as well and invoke them as: User.find(1).orders.process etc. instead of User.find(1).process_orders (Method 1).
Note 1:
Yes, you can call a class method like this on an association. Read why here. TL;DR is that User.find(1).posts returns a CollectionProxy object which has access to class methods of the target (Post) class. It also conveniently passes a scope_attributes which stores the user_id of the user which called posts.process. This comes in handy. See Note 2 below.
Note 2:
For people not sure what's going on when we do a Post.all.each in the class method: it returns all the posts of the user this method was called on, as opposed to all the posts in the database.
So when called as User.find(99).posts.process, Post.all executes:
SELECT "posts".* FROM "posts" WHERE "posts"."user_id" = $1 [["user_id", 99]]
which are all the posts for User ID: 99.
Per @Jesuspc's comment below, Post.all.each can be succinctly written as all.each. It's more idiomatic and doesn't make it look like we are querying all posts in the database.
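The behaviour described in Notes 1 and 2 can be imitated in plain Ruby to see why self inside the class method is the class while all is still scoped. This is only a rough model under stated assumptions: FakeRelation, with_scope, for_user, titles and the RECORDS constant are hypothetical names invented for this sketch, not ActiveRecord API, and the real implementation is more involved.

```ruby
# Rough model: a relation forwards unknown methods to the model class,
# installing itself as the "current scope" for the duration of the call.
class FakeRelation
  def initialize(klass, records)
    @klass = klass
    @records = records
  end

  def to_a
    @records
  end

  def method_missing(name, *args, &block)
    return super unless @klass.respond_to?(name)

    @klass.with_scope(self) { @klass.public_send(name, *args, &block) }
  end

  def respond_to_missing?(name, include_private = false)
    @klass.respond_to?(name, include_private) || super
  end
end

class Post
  RECORDS = [
    { id: 1, user_id: 99, title: "first" },
    { id: 2, user_id: 7,  title: "other" },
    { id: 3, user_id: 99, title: "second" }
  ].freeze

  class << self
    def with_scope(relation)
      previous = @current_scope
      @current_scope = relation
      yield
    ensure
      @current_scope = previous
    end

    # Like `all`: the current scope if one is active, otherwise everything.
    def all
      (@current_scope || FakeRelation.new(self, RECORDS)).to_a
    end

    def for_user(user_id)
      FakeRelation.new(self, RECORDS.select { |r| r[:user_id] == user_id })
    end

    # self is the Post class here, so the records are reached through `all`.
    def titles
      all.map { |record| record[:title] }
    end
  end
end

Post.for_user(99).titles # => ["first", "second"]
Post.titles              # => ["first", "other", "second"]
```

In this model, calling map directly on self inside titles would fail, because self is the class and not the relation; going through all is what picks up the scope.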
The Answer I am looking for
Explains what is the best way to handle such associations. How do people do it normally? and if there are any obvious design flaws in Method 3.
There's a fourth option. Move this logic out of the model entirely:
class PostProcessor
def initialize(posts)
@posts = posts
end
def process
@posts.each do |post|
# ...
end
end
end
PostProcessor.new(User.find(1).posts).process
This is sometimes called the Service Object pattern. A very nice bonus of this approach is that it makes writing tests for this logic really simple. Here's a great blog post on this and other ways to refactor "fat" models: http://blog.codeclimate.com/blog/2012/10/17/7-ways-to-decompose-fat-activerecord-models/
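To illustrate the testing point, here is a minimal sketch that exercises such a service object without a database. FakePost and the "mark as processed" behaviour are assumptions made up for this example; the real processing logic would go where the flag is set.

```ruby
# Hypothetical stand-in for a Post record; no ActiveRecord needed.
FakePost = Struct.new(:title, :processed)

class PostProcessor
  def initialize(posts)
    @posts = posts
  end

  # Marks every post as processed and returns how many were handled.
  def process
    @posts.each { |post| post.processed = true }
    @posts.size
  end
end

posts = [FakePost.new("first", false), FakePost.new("second", false)]
count = PostProcessor.new(posts).process

count                      # => 2
posts.all?(&:processed)    # => true
```

Since the processor only needs something that responds to each, a plain array is enough in a test, and User.find(1).posts can be passed in production code.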
Personally, I think that Method 1 is the cleanest one. It will be very clean and understandable to write something like this:
class User < ActiveRecord::Base
has_many :posts
def process_posts
posts.each do |post|
post.process
end
end
end
And put all the logic of the process method in the Post model (as an instance method):
class Post < ActiveRecord::Base
belongs_to :user
def process
# Logic of your Post process
end
end
That way, the logic of processing a Post belongs to the Post class. Even if your User model ends up with many "process" functions, they will be very basic and small. That seems very clean to me, as a developer.
Method 3 has many technical implications that are pretty complex and unintuitive (you yourself had to clarify your question).
NOTE: If you want better performance, maybe you should use eager loading to reduce ActiveRecord calls, but that is out of the scope of this question.
First of all excuse me for the opinionated answer.
ActiveRecord models are a controversial matter. In essence they go against the Single Responsibility Principle, since they handle both database interaction (via class methods) and domain objects (which usually implement their own behaviour) via their instances. At the same time they also break the Liskov Substitution Principle, because the models are not sub-cases of ActiveRecord::Base and implement their own set of methods. And finally, the ActiveRecord paradigm often leads to code that breaks the Law of Demeter, as in your proposal for the third method:
User.find(1).posts.process
Thus, there is a school of thought that, in order to reduce coupling, recommends using ActiveRecord objects only to interact with the database, so no behaviour should be added to them (in your case, the process method). From my point of view that is the lesser evil, even though it is still not a perfect solution.
So if I were to implement what you describe, I would have a ProcessablePostsCollection object (where the name Processable can be customised to better describe what the processing is about, or even dropped completely, so you would simply have a PostsCollection class) that would probably be a wrapper over a list of posts using SimpleDelegator and would have a process method.
class ProcessablePostsCollection < SimpleDelegator
def self.from_collection(collection)
new collection
end
def initialize(source)
super source
end
def process
# code of whatever 'process' does to posts
end
end
And the usage would be something like:
ProcessablePostsCollection.from_collection(User.find(1).posts).process
even though the from_collection and the call to process should happen in different classes.
Also, in case you have a big posts table it would probably be wise to process stuff in batches. For that your process method could call find_in_batches on your posts ActiveRecord::Relation.
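As a rough, Rails-free sketch of the wrapper plus batching idea: here the wrapped collection is a plain array of strings, and each_slice stands in for find_in_batches, which only exists on ActiveRecord relations. The batch_size keyword and the upcase "processing" are placeholders invented for this example.

```ruby
require "delegate"

class ProcessablePostsCollection < SimpleDelegator
  # Walks the wrapped collection in slices and returns the processed items.
  # each_slice is a stand-in for find_in_batches on a real relation.
  def process(batch_size: 2)
    processed = []
    each_slice(batch_size) do |batch|
      batch.each { |post| processed << post.upcase }
    end
    processed
  end
end

collection = ProcessablePostsCollection.new(%w[first second third])

collection.size     # => 3 (delegated to the wrapped array)
collection.process  # => ["FIRST", "SECOND", "THIRD"]
```

Everything the wrapper does not define itself, such as size or each_slice, is forwarded to the wrapped object by SimpleDelegator.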
But as always, it depends on your needs. If you are simply building a prototype, it is perfectly fine to let your models grow fat, and if you are building an enormous application, Rails itself is probably not going to be the best choice, since it discourages some OOP best practices with things such as ActiveRecord models.
You shouldn't be putting this in the User model - put it in Post (unless - of course - the scope of process involves the User model directly) :
#app/models/post.rb
class Post < ActiveRecord::Base
def process
return false if published?
# do something
end
end
Then you can use an ActiveRecord Association Extension to add the functionality to the User model:
#app/models/user.rb
class User < ActiveRecord::Base
has_many :posts do
def process
proxy_association.target.each do |post|
post.process
end
end
end
end
This will allow you to call...
@user = User.find 1
@user.posts.process
The question below had a good answer to grab associated values of an activerecord collection in one hit using Comment.includes(:user). What about when you have multiple associations that you want to grab in one go?
Rails have activerecord grab all needed associations in one go?
Is the best way to just chain these together, like Customer.includes(:user).includes(:sales).includes(:prices), or is there a cleaner way?
Furthermore, I am doing this in a loop on an index table. Can I add a method to the customer.rb model so that I can call @customers.table_includes etc. and have
def table_includes
self.includes(:user).includes(:sales).includes(:prices)
end
For the record, I tested the above and it didn't work because it's a method on a collection (yet to figure out how to do this).
In answering this, I'm assuming that user, sales, and prices are all associations off of Customer.
Instead of chaining, you can do something like this:
Customer.includes(:user, :sales, :prices)
In terms of creating an abstraction for this, you do have a couple options.
First, you could create a scope:
class Customer < ActiveRecord::Base
scope :table_includes, -> { includes(:user, :sales, :prices) }
end
Or if you want for it to be a method, you should consider making it a class-level method instead of an instance-level one:
def self.table_includes
self.includes(:user, :sales, :prices)
end
I would consider the purpose of creating this abstraction though. A very generic name like table_includes will likely not be very friendly over the long term.
The Law of Demeter seems to be a very powerful concept. I can understand how it helps writing good and maintainable object-oriented code.
Some people suggest to write a delegate method each time you need to access an attribute of an associated object in a view. Instead of writing something like this in a view
@order.customer.name
you would write this code:
# model
class Order < ActiveRecord::Base
belongs_to :customer
delegate :name, :to => :customer, :prefix => true
end
#view
@order.customer_name
On the other hand, people argue that your views should not dictate your models and you should not add methods such as delegate to a model only for the sake of trading a dot for an underscore in a view.
When violating the Law of Demeter in a view, is it considered best practice to write delegate methods in models or not?
I see your customer_name auto-generated delegate method as the Simplest Thing That Works right now. Since it's one method call (and not a series of method chains), it's easy to refactor later (or at least easier to refactor than some chained methods).
Imagine adding many customers to an order, one of which is the primary customer, for whatever reason. Now your order class might look like
class Order < ActiveRecord::Base
has_many :customers
def customer_name
if customers.first.primary?
customers.first.name
else
customers.last.name
end
end
end
It was easy to replace that convenience delegate generated method with one of our own.
(It's also super easy to write the first time, as delegate takes care of all the boilerplate. It's very possible you'll use customer_name in this form forever in your app. It's hard to know. But code that's easy/automatic to write the first time is cheap to throw away :))
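Outside Rails, the same shape can be sketched with Ruby's standard Forwardable module, which is a different mechanism from ActiveSupport's delegate but produces a comparable customer_name method. The Customer struct and the constructor below are assumptions made up for this example:

```ruby
require "forwardable"

# Hypothetical plain-Ruby stand-ins for the two models.
Customer = Struct.new(:name)

class Order
  extend Forwardable

  attr_reader :customer

  # Roughly what `delegate :name, to: :customer, prefix: true` generates:
  # Order#customer_name forwards to customer.name.
  def_delegator :customer, :name, :customer_name

  def initialize(customer)
    @customer = customer
  end
end

Order.new(Customer.new("Ada")).customer_name # => "Ada"
```

If the rules later change, the def_delegator line can be swapped for a handwritten customer_name method without touching any caller.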
Of course you have to avoid situations where you are writing method names like customer_streetaddress_is_united_states? (where yes, instead of encoding the object graph in dots you're encoding it in underscores.)
If your view really needs to know if the user is located in the US perhaps a method like this might work:
class Order < ActiveRecord::Base
belongs_to :customer
def shipping_to_us?
customer.shipping_country == "USA"
# Law of Demeter violation would be:
# customer.addresses.first.country == "USA"
end
end
class Customer < ActiveRecord::Base
has_many :addresses
def shipping_country
addresses.first.country
end
end
Notice here how the Order asks the Customer object for the shipping country, rather than reaching through the customer to get its first address's country. It's like a boss who tells you to do something and leaves you alone vs. a boss who micromanages exactly how you do your day-to-day work. (For additional edification, read up on the "Tell, Don't Ask" approach to Ruby development :) )
There is something to be said for using presenters, decorator methods, or helpers to keep code that is potentially just display logic from littering your models. I'll leave that as an exercise for the reader :)