Rails: how to load 2 models via join? - ruby-on-rails

I am new to rails and would appreciate some help optimizing my database usage.
Is there a way to load two models associated with each other with one DB query?
I have two models Person and Image:
class Person < ActiveRecord::Base
has_many :images
end
class Image < ActiveRecord::Base
belongs_to :person
end
I would like to load a set of people and their associated images with a single trip to the DB using a join command. For instance, in SQL, I can load all the data I need with the following query:
select * from people join images on people.id = images.person_id where people.id in (2, 3) order by timestamp;
So I was hoping that this rails snippet would do what I need:
>> people_and_images = Person.find(:all, :conditions => ["people.id in (?)", [2, 3]], :joins => :images, :order => :timestamp)
This code executes the SQL statement I am expecting and loads the instances of Person I need. However, I see that accessing a Person's images leads to an additional SQL query.
>> people_and_images[0].images
Image Load (0.004889) SELECT * FROM `images` WHERE (`images`.person_id = 2)
Using the :include option in the call to find() does load both models, however it will cost me an additional SELECT by executing it along with the JOIN.
I would like to do in Rails what I can do in SQL which is to grab all the data I need with one query.
Any help would be greatly appreciated. Thanks!

You want to use :include, like this:
Person.find(:all, :conditions => ["people.id in (?)", [2, 3]], :include => :images, :order => :timestamp)
Check out the find documentation for more details.

You can use :include to eager-load associations. It does issue exactly 2 queries rather than the single query that :joins issues: the first query loads the primary model and the second loads the associated models. This is especially helpful in solving the infamous N+1 query problem, which you will face if you don't use :include, because :joins doesn't eager-load the associations.
The difference between :joins and :include is one extra query for :include, but without :include you pay one extra query for every record whose association you touch, which is a whole lot more.
You can read more about it here: http://guides.rubyonrails.org/active_record_querying.html#eager-loading-associations
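The query counts described above can be sketched in plain Ruby, without Rails. This is only a simulation under stated assumptions: FakeDb, lazy_load and eager_load are hypothetical names, and each method call on FakeDb stands in for one SQL statement.

```ruby
# Plain-Ruby simulation (not ActiveRecord): FakeDb counts each simulated query.
class FakeDb
  attr_reader :query_count

  def initialize(people, images)
    @people = people
    @images = images
    @query_count = 0
  end

  # Stands in for: SELECT * FROM people WHERE id IN (...)
  def people_where_ids(ids)
    @query_count += 1
    @people.select { |person| ids.include?(person[:id]) }
  end

  # Stands in for: SELECT * FROM images WHERE person_id IN (...)
  def images_where_person_ids(person_ids)
    @query_count += 1
    @images.select { |image| person_ids.include?(image[:person_id]) }
  end
end

PEOPLE = [{ id: 1 }, { id: 2 }, { id: 3 }].freeze
IMAGES = [
  { id: 10, person_id: 2 },
  { id: 11, person_id: 2 },
  { id: 12, person_id: 3 }
].freeze

# Lazy loading: one query for the people, then one more query per person.
def lazy_load(db, ids)
  people = db.people_where_ids(ids)
  people.map { |person| [person[:id], db.images_where_person_ids([person[:id]])] }.to_h
end

# Eager loading in the style of :include: two queries in total.
def eager_load(db, ids)
  people = db.people_where_ids(ids)
  images = db.images_where_person_ids(people.map { |person| person[:id] })
  grouped = images.group_by { |image| image[:person_id] }
  people.map { |person| [person[:id], grouped.fetch(person[:id], [])] }.to_h
end

lazy_db = FakeDb.new(PEOPLE, IMAGES)
lazy_result = lazy_load(lazy_db, [2, 3])

eager_db = FakeDb.new(PEOPLE, IMAGES)
eager_result = eager_load(eager_db, [2, 3])

puts "lazy:  #{lazy_db.query_count} queries"
puts "eager: #{eager_db.query_count} queries"
```

With two people the lazy path issues 3 simulated queries and the eager path 2; the gap grows by one query for every additional person loaded.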

Related

Grails equivalent of Rails ARel 'includes'

I have a Rails app that I was able to speed up significantly using ARel "includes" e.g. (contrived)
class User < ActiveRecord::Base
has_many :posts
scope :eager, includes(:posts => [:rating, :author, {:tags => [:day, {:foo => :bar}]}] )
end
Calling
@posts = current_user.posts.eager
cuts the page load time dramatically and reduces the number of queries by a very large factor. Rails does this by first selecting the posts in one query
select * from posts where ...
and then selecting all the comments for all those posts in one query instead of one query per post:
select * from comments where post_id in (6,7,8,9,10,...)
Is there an equivalent in grails? I am familiar with criteria and named queries where I could write a query with a lot of joins but what I want is for Grails to produce a few queries with "IN" operator.
I found some references to this problem: Eager and Lazy Fetching and fetchMode.

Simple ActiveRecord Question

I have a database model set up such that a post has many votes, a user has many votes, and a vote belongs to both a user and a post. I'm using will_paginate and I'm trying to create a filter so that the user can sort posts by either the date or the number of votes a post has. The date option is simple and looks like this:
@posts = Post.paginate :order => "date DESC"
However, I can't quite figure out how to do the ordering for the votes. If this were SQL, I would simply use GROUP BY on the votes user_id column, along with the count function, and then I would join the result with the posts table.
What's the correct way to do this with ActiveRecord?
1) Use the counter cache mechanism to store the vote count in Post model.
# add a column called votes_count
class Post
has_many :votes
end
class Vote
belongs_to :post, :counter_cache => true
end
Now you can sort the Post model by vote count as follows:
Post.order(:votes_count)
2) Use group by.
Post.select("posts.*, COUNT(votes.post_id) votes_count").
joins(:votes).group("posts.id").order("votes_count")
If you want to include the posts without votes in the result set, then:
Post.select("posts.*, COUNT(votes.post_id) votes_count").
joins("LEFT OUTER JOIN votes ON votes.post_id=posts.id").
group("posts.id").order("votes_count")
I prefer approach 1 as it is efficient and the cost of vote count calculation is front loaded (i.e. during vote casting).
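The difference between the two group-by variants above can be illustrated in plain Ruby. This is a rough sketch on in-memory hashes, not ActiveRecord; the sample posts and votes are made up for illustration, and the sort is descending on the assumption that the most-voted posts should come first.

```ruby
# Made-up sample data standing in for the posts and votes tables.
posts = [
  { id: 1, title: "first" },
  { id: 2, title: "second" },
  { id: 3, title: "third" }
]
votes = [
  { post_id: 2, user_id: 7 },
  { post_id: 2, user_id: 8 },
  { post_id: 1, user_id: 7 }
]

# Roughly what GROUP BY post_id with COUNT(*) computes.
counts = votes.group_by { |vote| vote[:post_id] }
              .transform_values(&:size)

# INNER JOIN semantics: posts without votes drop out of the result.
inner = posts.select { |post| counts.key?(post[:id]) }
             .sort_by { |post| -counts[post[:id]] }

# LEFT OUTER JOIN semantics: posts without votes stay, counted as 0.
outer = posts.sort_by { |post| -counts.fetch(post[:id], 0) }

inner_ids = inner.map { |post| post[:id] }
outer_ids = outer.map { |post| post[:id] }

puts "inner join order: #{inner_ids.inspect}"
puts "outer join order: #{outer_ids.inspect}"
```

Post 3 has no votes, so it appears only in the outer-join ordering. A counter cache avoids recomputing these counts on every page view by storing the number on the post itself.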
Just do all the normal SQL stuff as part of the query with options.
@posts = Post.paginate :order => "date DESC", :joins => " inner join votes on posts.id..." , :group => " votes.user_id"
http://apidock.com/rails/ActiveRecord/Base/find/class
So I don't know much about your models, but you seem to know something about SQL, so here are two options.
Named scopes: you basically just put the query into a class method:
named_scope :index, :order => 'date DESC', :joins => .....
They can also take parameters:
named_scope :blah, lambda { |param| ... } # base the query on param
Or, especially if you are more familiar with SQL, you can write your own query:
@posts = Post.find_by_sql( <<-SQL )
SELECT posts.*
....
SQL

Is there an idiomatic way to cut out the middle-man in a join in Rails?

We have a Customer model, which has a lot of has_many relations, e.g. to CustomerCountry and CustomerSetting. Often, we need to join these relations to each other; e.g. to find the settings of customers in a given country. The normal way of expressing this would be something like
CustomerSetting.find :all,
:joins => {:customer => :customer_country},
:conditions => ['customer_countries.code = ?', 'us']
but the equivalent SQL ends up as
SELECT ... FROM customer_settings
INNER JOIN customers ON customer_settings.customer_id = customers.id
INNER JOIN customer_countries ON customers.id = customer_countries.customer_id
when what I really want is
SELECT ... FROM customer_settings
INNER JOIN customer_countries ON customer_settings.customer_id = customer_countries.customer_id
I can do this by explicitly setting the :joins SQL, but is there an idiomatic way to specify this join?
Setting aside that I find it hard to wrap my head around the notion of a "country" that belongs to exactly one customer:
Why don't you just add another association to your model, so that each setting has_many customer_countries? That way you can write
CustomerSetting.find(:all, :joins => :customer_countries, :conditions => ...)
If, for example, you have a 1-1 relationship between a customer and her settings, you could also select through the customers:
class Customer
has_one :customer_setting
named_scope :by_country, lambda { |country| ... }
named_scope :with_setting, :include => :customer_setting
...
end
and then
Customer.by_country('us').with_setting.each do |cust|
setting = cust.customer_setting
...
end
In general, I find it much more elegant to use named scopes, not to mention that scopes will become the default way of finding records, and the current #find API will be deprecated in future versions of Rails.
Also, don't worry too much about the performance of your queries. Only fix the things that you actually see performing badly. (If you really have a critical query in a high-load application, you'll probably end up with #find_by_sql. But if it doesn't matter, don't optimize it.)

Is it possible to delete_all with inner join conditions?

I need to delete a lot of records at once and I need to do so based on a condition in another model that is related by a "belongs_to" relationship. I know I can loop through each checking for the condition, but this takes forever with my large record set because for each "belongs_to" it makes a separate query.
Here is an example. I have a "Product" model that "belongs_to" an "Artist" and lets say that artist has a property "is_disabled".
If I want to delete all products that belong to disabled artists, I would like to be able to do something like:
Product.delete_all(:joins => :artist, :conditions => ["artists.is_disabled = ?", true])
Is this possible? I have done this directly in SQL before, but not sure if it is possible to do through rails.
The problem is that delete_all discards all the join information (and rightly so). What you want to do is capture that as an inner select.
If you're using Rails 3 you can create a scope that will give you what you want:
class Product < ActiveRecord::Base
scope :with_disabled_artist, lambda {
where("products.id IN (#{select("products.id").joins(:artist).where("artists.is_disabled = TRUE").to_sql})")
}
end
Your query call then becomes
Product.with_disabled_artist.delete_all
You can also use the same query inline but that's not very elegant (or self-documenting):
Product.where("products.id IN (#{Product.select("products.id").joins(:artist).where("artists.is_disabled = TRUE").to_sql})").delete_all
In Rails 4 (I tested on 4.2) you can do almost what the OP originally wanted:
Application.joins(:vacancy).where(vacancies: {status: 'draft'}).delete_all
will give
DELETE FROM `applications` WHERE `applications`.`id` IN (SELECT id FROM (SELECT `applications`.`id` FROM `applications` INNER JOIN `vacancies` ON `vacancies`.`id` = `applications`.`vacancy_id` WHERE `vacancies`.`status` = 'draft') __active_record_temp)
If you are using Rails 2 you can't do the above. An alternative is to use a joins clause in a find method and call delete on each item.
TellerLocationWidget.find(:all, :joins => [:widget, :teller_location],
:conditions => {:widgets => {:alt_id => params['alt_id']},
:retailer_locations => {:id => @teller_location.id}}).each do |loc|
loc.delete
end
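The "inner select" idea from the first answer comes down to wrapping an id-returning SELECT inside a DELETE. The plain-Ruby sketch below only builds the SQL string. delete_in_subselect_sql is a hypothetical helper, and the products.artist_id foreign key and products.id primary key are assumptions based on Rails naming conventions.

```ruby
# Hypothetical helper: wraps a SELECT that returns ids in a DELETE ... IN (...).
def delete_in_subselect_sql(table, id_subselect)
  "DELETE FROM #{table} WHERE #{table}.id IN (#{id_subselect})"
end

# Assumed schema: products.artist_id references artists.id.
subselect = "SELECT products.id FROM products " \
            "INNER JOIN artists ON artists.id = products.artist_id " \
            "WHERE artists.is_disabled = TRUE"

sql = delete_in_subselect_sql("products", subselect)
puts sql
```

The join lives entirely inside the subselect, so the outer DELETE needs no join information of its own. The Rails 4 output shown earlier additionally wraps the subselect in a derived table (__active_record_temp), which this sketch leaves out.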

How can I order by attributes of multiple subclasses that inherit from the same base class in Ruby on Rails?

I have 4 classes - Patient, Doctor, Person, and Appointment. Patient and Doctor are subclasses of Person. Appointment belongs to Patient and belongs to Doctor.
I want to write an AR statement to sort appointments by the patient's last name and another to sort by the doctor's last name, as all appointments will be in a sortable table. I am a bit confused as to what to put in the "order" option of the AR find statement. If I put
:order => 'patient.last_name'
I get a MySQL error: "Unknown column 'patient.last_name'".
Which makes sense because there is no patient column, it is a patient_id referring to a foreign "person" object. Of course I can sort by person.last_name but I am not sure how to specify which type of person to sort by - doctor or patient.
I should also note that I am using the include option to eager load the patient and doctor.
UPDATE
There is only a person table and an appointments table. The patient and doctor inherit from the person. patients.last_name will not work because there is no patients table.
The AR statement is:
find :all,
:include => [:doctor, :patient],
:order => 'something needs to go here'
The 'something needs to go here' should be a statement to order by either the doctor's last name or the patient's last name.
You can do this:
Appointment.find(:all, :include => :patient, :order => 'people.last_name')
What you're doing is grabbing all the appointments, and their associated patients at the same time. You don't have to worry about patients vs doctors because all the people rows retrieved will be patient records.
In order to have a doctor-centric list, just change :patient to :doctor in the above example.
EDIT: I figured out the solution when you eager load both patients and doctors. It gets a little complex. First, I recreated a simple version of your 4 models in a blank rails app, then tried to run the find with :order => 'patients.name':
Appointment.find(:all, :include => [:patient, :doctor], :order => 'patients.name')
Of course it failed, but it also spat out the SQL query it attempted:
SELECT
"appointments"."id" AS t0_r0,
"appointments"."name" AS t0_r1,
"appointments"."doctor_id" AS t0_r2,
"appointments"."patient_id" AS t0_r3,
"appointments"."created_at" AS t0_r4,
"appointments"."updated_at" AS t0_r5,
"people"."id" AS t1_r0,
"people"."name" AS t1_r1,
"people"."type" AS t1_r2,
"people"."created_at" AS t1_r3,
"people"."updated_at" AS t1_r4,
"doctors_appointments"."id" AS t2_r0,
"doctors_appointments"."name" AS t2_r1,
"doctors_appointments"."type" AS t2_r2,
"doctors_appointments"."created_at" AS t2_r3,
"doctors_appointments"."updated_at" AS t2_r4
FROM "appointments"
LEFT OUTER JOIN "people" ON "people".id = "appointments".patient_id AND ("people"."type" = 'Patient' )
LEFT OUTER JOIN "people" doctors_appointments ON "doctors_appointments".id = "appointments".doctor_id AND ("doctors_appointments"."type" = 'Doctor' )
ORDER BY patients.name
Now we can see how rails forms a query like this. The first association to use a given table gets the table name directly - "people". Subsequent associations get a combo of the association and original table - "doctors_appointments".
It may seem a little messy, but this call gives you ordered by patients:
Appointment.find(:all, :include => [:patient, :doctor], :order => 'people.name')
And this one gives you ordered by doctors:
Appointment.find(:all, :include => [:patient, :doctor], :order => 'doctors_appointments.name')
Of course, in my example I just had a simple name field for each person, and you'll be using "last_name" instead. But you get the idea. Does this work for you?
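The alias naming rule described above can be sketched in plain Ruby. join_aliases is a hypothetical helper that mimics only the behaviour observed in the generated SQL; it is not Rails' internal implementation, and it pluralizes the association name naively by appending "s".

```ruby
# Hypothetical helper mimicking the observed naming rule (not Rails internals).
# associations is a list of [association_name, table_name] pairs, in join order.
def join_aliases(parent_table, associations)
  used = Hash.new(0)
  associations.each_with_object({}) do |(name, table), aliases|
    # First use of a table keeps the table name; later uses get
    # "<pluralized association>_<parent table>" (naive pluralization).
    aliases[name] = used[table].zero? ? table : "#{name}s_#{parent_table}"
    used[table] += 1
  end
end

aliases = join_aliases("appointments", [[:patient, "people"], [:doctor, "people"]])
puts aliases.inspect
```

Swapping the order of the two associations in the list swaps which one gets the plain "people" name, which is why the :order string depends on the order given to :include.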
ONE LAST EDIT:
I would put these in finders, so you don't need to mess with the table names anywhere else in your code. I'd do it like this:
class << self
def order_by_patient(field='last_name')
find(:all, :include => [:patient, :doctor], :order => "people.#{field}")
end
def order_by_doctor(field='last_name')
find(:all, :include => [:patient, :doctor], :order => "doctors_appointments.#{field}")
end
end
Now you can call them from anywhere, and even sort by the field you want.
I think you might need to use a manual join rather than an include.
See the Active Record Querying Guide on Joining Tables.
You can then create an alias for the table names, and order on these accordingly.
Appointment.find(:all,
:joins => 'LEFT OUTER JOIN people dr ON dr.id = appointments.doctor_id',
:order => 'dr.last_name'
)
Or something similar for your database.
Alternatively, you can add a "sort" column that holds an integer. All doctors have a 1, patients have a 0. You can then ORDER BY last_name, sort_column, and the results will be arranged accordingly within the last_name group based on the doctor/patient sort value.
Note: I haven't had my coffee yet this morning, so that join is possibly all out of whack, but you get the general idea, I hope.
