Techniques for reducing database queries in a Rails app

If you have a Rails app with many complex associated models, what techniques do you employ to reduce database queries?
In fact, I'll extend that question a little further and ask: what do you consider "too many" queries for any page?
I have a page that I expect will end up hitting the database about 20 times on each page load. That concerns me, but I don't know whether it should, or what I can do to reduce the load.

Check out: bullet
It's a great way to identify N+1 queries, and it suggests ways to eliminate them.
It does slow down development mode, so be sure to disable it when you are not performance tuning.
While we are at it, also check out: rails_indexes
It's a simple way to identify which indexes your app could be missing.
Happy tuning.

One common practice is judicious use of the :include => :association option.
For instance, in a controller you might do:
def show
  @items = Item.find(:all)
end
...and the show view would do something like:
<% @items.each do |item| %>
  <%= item.product.title %>
<% end %>
This will issue a query for every call to product. But if you declare the association as included, as follows, the products are loaded eagerly up front instead of once per item:
def show
  @items = Item.find(:all, :include => :product)
end
As always, check your console for query times and such.
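To make the difference in query counts concrete, here is a minimal plain-Ruby simulation that needs no Rails. FakeDB and its methods are hypothetical stand-ins for the database, not ActiveRecord APIs; the sketch only counts how many lookups each access pattern would trigger.

```ruby
# Plain-Ruby simulation of the N+1 pattern; no Rails required.
# FakeDB and its method names are hypothetical stand-ins for the database.
class FakeDB
  attr_reader :query_count

  def initialize(items, products)
    @items = items          # array of { id:, product_id: }
    @products = products    # hash of product_id => title
    @query_count = 0
  end

  # Simulates SELECT * FROM items
  def all_items
    @query_count += 1
    @items
  end

  # Simulates SELECT * FROM products WHERE id = ?
  def product(id)
    @query_count += 1
    @products[id]
  end

  # Simulates SELECT * FROM products WHERE id IN (...)
  def products_for(ids)
    @query_count += 1
    @products.select { |id, _title| ids.include?(id) }
  end
end

# Builds a fake database with five items, each pointing at its own product.
def build_db
  items = (1..5).map { |i| { id: i, product_id: 100 + i } }
  products = items.each_with_object({}) do |item, h|
    h[item[:product_id]] = "Product #{item[:id]}"
  end
  FakeDB.new(items, products)
end

# Lazy loading: one query for the items plus one per item.
def titles_lazy(db)
  db.all_items.map { |item| db.product(item[:product_id]) }
end

# Eager loading: one query for the items plus one for all products.
def titles_eager(db)
  items = db.all_items
  products = db.products_for(items.map { |item| item[:product_id] })
  items.map { |item| products[item[:product_id]] }
end
```

With the five sample items, titles_lazy issues six simulated queries and titles_eager issues two, while both return the same titles.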

I use the :joins and :select options when I only need to display data.
I find it very useful to define a named_scope for each possible :joins, plus one select_columns named_scope. For example:
class Activity < ActiveRecord::Base
  belongs_to :event
  belongs_to :html_template
  has_many :participants
  named_scope :join_event, :joins => :event
  named_scope :join_owner, :joins => {:event => :owner}
  named_scope :left_join_html_template,
    :joins => "LEFT JOIN html_templates ON html_templates.id = activities.html_template_id"
  named_scope :select_columns, lambda { |columns| {:select => columns} }
  named_scope :order, lambda { |order| {:order => order} }
end
So now you can easily build queries like this:
columns = "activities.*,events.title,events.owner_id,owners.full_name as owner_name"
@activities = Activity.join_event.join_owner.order("date_from ASC").select_columns(columns)
I don't consider this the best or safest way, but in my case it really minimizes the number of queries executed per request, and so far no errors have been raised by wrongly generated queries.
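The chaining above works because each scope returns something that further scopes can be called on. A minimal plain-Ruby sketch of that idea follows. QuerySketch is a hypothetical class, not part of ActiveRecord, and the SQL it builds is only illustrative of how the pieces compose.

```ruby
# Hypothetical, simplified query builder; not ActiveRecord's implementation.
class QuerySketch
  def initialize(table, joins = [], columns = "#{table}.*", order = nil)
    @table = table
    @joins = joins
    @columns = columns
    @order = order
  end

  # Each method returns a new object, so partial chains can be reused.
  def join(clause)
    QuerySketch.new(@table, @joins + [clause], @columns, @order)
  end

  def select_columns(columns)
    QuerySketch.new(@table, @joins, columns, @order)
  end

  def order(clause)
    QuerySketch.new(@table, @joins, @columns, clause)
  end

  # Assembles the accumulated pieces into a single SQL string.
  def to_sql
    sql = "SELECT #{@columns} FROM #{@table}"
    sql += " " + @joins.join(" ") unless @joins.empty?
    sql += " ORDER BY #{@order}" if @order
    sql
  end
end

# Mirrors the shape of the chained call in the answer above.
def activities_query
  QuerySketch.new("activities")
    .join("INNER JOIN events ON events.id = activities.event_id")
    .order("date_from ASC")
    .select_columns("activities.*, events.title")
end
```

Because every call returns a fresh object, the order in which join, order and select_columns are chained does not matter for the final statement.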

It is really difficult to put a limit on the number of queries; it depends on the design of your application.
If you don't have to reload the whole page, I suggest you consider JavaScript (or RJS) to update only the data you need. This should also be a UI improvement; your users will love it!
Check the SQL generated by your ActiveRecord queries and make sure everything is as expected.
Consider denormalizing your database to improve performance (but be careful).
This is what I see from the "code side".

Related

Proper way to prevent ActiveRecord::ReadOnlyRecord?

I'm currently using Rails 2.3.9. I understand that specifying the :joins option in a query without an explicit :select automatically makes any records that are returned read-only. I have a situation where I would like to update the records and while I've read about different ways to approach it, I was wondering which way is the preferred or "proper" way.
Specifically, my situation is that I have the following User model with an active named scope that performs a JOIN with the subscriptions table:
class User < ActiveRecord::Base
  has_one :subscription
  named_scope :active, :conditions => { :subscriptions => { :status => 'active' } }, :joins => :subscription
end
When I call User.active.all, the user records that are returned are all read-only, so if, for instance, I call update_attributes! on a user, ActiveRecord::ReadOnlyRecord will be raised.
Through reading various sources, it seems a popular way to get around this is by adding :readonly => false to the query. However, I was wondering the following:
Is this safe? I understand the reason why Rails sets it to read-only in the first place is because, according to the Rails documentation, "they will have attributes that do not correspond to the table’s columns." However, the SQL query that is generated from this call uses SELECT `users`.* anyway, which appears to be safe, so what is Rails trying to guard against in the first place? It would appear that Rails should be guarding against the case when :select is actually explicitly specified, which is the reverse of the actual behavior, so am I not properly understanding the purpose of automatically setting the read-only flag on :joins?
Does this seem like a hack? It doesn't seem proper that the definition of a named scope should care about explicitly setting :readonly => false. I'm also afraid of side effects if the named scope is chained with other named scopes. If I try to specify it outside of the scope (e.g., by doing User.active.scoped(:readonly => false) or User.scoped(:readonly => false).active), it doesn't appear to work.
One other way I've read to get around this is to change the :joins to an :include. I understand the behavior of this better, but are there any disadvantages to this (other than the unnecessary reading of all the columns in the subscriptions table)?
Lastly, I could also retrieve the query again using the record IDs by calling User.find_all_by_id(User.active.map(&:id)), but I find this to be more of a workaround rather than a possible solution since it generates an extra SQL query.
Are there any other possible solutions? What would be the preferred solution in this situation? I've read the answer given in the previous StackOverflow question about this, but it doesn't seem to give specific guidance of what would be considered correct.
Thanks in advance!

I believe that it would be customary and acceptable in this case to use :include instead of :joins. I think that :joins is only used in rare specialized circumstances, whereas :include is pretty common.
If you're not going to be updating all of the active users, then it's probably wise to add an additional named scope or find condition to further narrow down which users you're loading so that you're not loading extra users & subscriptions unnecessarily. For instance...
User.active.some_further_limiting_scope(:with_an_argument)
#or
User.active.find(:all, :conditions => {:etc => 'etc'})
If you decide that you still want to use :joins, and are only going to update a small percentage of the loaded users, then it's probably best to reload just the user you want to update right before doing so. Such as...
readonly_users = User.active
# insert some other code that picks out a particular user to update
User.find(readonly_users[@index].id).update_attributes(:etc => 'etc')
If you really do need to load all active users, and you want to stick with :joins, and you will likely be updating most or all of the users, then your idea to reload them with an array of IDs is probably your best choice.
# no need to do find_all_by_id in this case. A simple find() is sufficient.
writable_users_without_subscriptions = User.find(User.active.map(&:id))
I hope that helps. I'm curious which option you go with, or if you found another solution more appropriate for your scenario.

I think the best solution is to use :joins as you already do, and then do a separate find().
One crucial difference is that :include uses an outer join while :joins uses an inner join! So using :include may solve the read-only problem, but the result might be wrong!
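The inner versus outer join distinction can be sketched in plain Ruby without a database. The sample users and subscriptions below are made up for illustration, and the two methods only mimic the row-matching semantics of the SQL joins.

```ruby
# Plain-Ruby sketch of inner vs. left outer join semantics; the data is made up.
USERS = [
  { id: 1, name: "alice" },
  { id: 2, name: "bob" },
  { id: 3, name: "carol" } # carol has no subscription
].freeze

SUBSCRIPTIONS = [
  { user_id: 1, status: "active" },
  { user_id: 2, status: "cancelled" }
].freeze

# INNER JOIN: only users with a matching subscription row appear.
def inner_join(users, subscriptions)
  users.flat_map do |user|
    subscriptions.select { |s| s[:user_id] == user[:id] }
                 .map { |s| [user[:name], s[:status]] }
  end
end

# LEFT OUTER JOIN: every user appears; a missing subscription becomes nil.
def left_outer_join(users, subscriptions)
  users.flat_map do |user|
    matches = subscriptions.select { |s| s[:user_id] == user[:id] }
    matches = [nil] if matches.empty?
    matches.map { |s| [user[:name], s && s[:status]] }
  end
end
```

In this sketch the two results differ only by the extra row for the user without a subscription. A condition on the subscription's status would filter that row out again, so the difference matters most when no such condition is applied.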

I ran across this same issue and was not comfortable using :readonly => false.
As a result, I used an explicit select, namely :select => 'users.*', which felt like less of a hack.
You could consider doing the following:
class User < ActiveRecord::Base
  has_one :subscription
  named_scope :active, :select => 'users.*', :conditions => { :subscriptions => { :status => 'active' } }, :joins => :subscription
end

Regarding your sub-question: so am I not properly understanding the purpose of automatically setting the read-only flag on :joins?
I believe the answer is: With a joins query, you're getting back a single record with the User + Subscription table attributes. If you tried to update one of the attributes (say "subscription_num") in the Subscription table instead of the User table, the update statement to the User table wouldn't be able to find subscription_num and would crash. So the join-scopes are read-only by default to prevent that from happening.
Reference:
1) http://blog.ethanvizitei.com/2009/05/joins-and-namedscopes-in-activerecord.html

Rails Advanced Sorting

I have three models, basically:
class Vendor
  has_many :items
end
class Item
  has_many :sale_items
  belongs_to :vendor
end
class SaleItem
  belongs_to :item
end
Essentially, each sale_item points to a specific item (but has an associated quantity and sale price which might be different from the item's base price, hence the separate model), and each item is made by a specific vendor.
I'd like to sort all sale_items by vendor name, but this means going through the associated item, because that's where the association is.
My first attempt was to change SaleItem to the following:
class SaleItem
  belongs_to :item
  has_one :vendor, :through => :item
end
Which allows me to look for SaleItem.first.vendor, but doesn't allow me to do something like:
SaleItem.joins(:vendor).all(:order => "vendors.name")
Is there an easy way to figure out these complex associations and sorting? It would be especially great if there were a plugin that could take care of this sort of thing. I have a lot of different types of tables to add sorting to in this application, and I feel like this will be a big chunk of the figuring-out work.

This could definitely be done with a more complex SQL query (possibly using find_by_sql), but you could also do it pretty easily in Ruby. Try something like the following:
SaleItem.find(:all, :include => { :item => :vendor }).sort do |first, second|
  first.vendor.name <=> second.vendor.name
end
I haven't tested it, so it might not work exactly like this, but it should give you a good idea of one possible solution.
Edit: Found an old blog post that seems to have solved this issue. Hopefully this still works in the latest version of ActiveRecord.
source: http://matthewman.net/2007/01/04/eager-loading-objects-in-a-rails-has_many-through-association/
Second Edit: Straight from the Rails documentation
To include a deep hierarchy of associations, use a hash:
for post in Post.find(:all, :include => [ :author, { :comments => { :author => :gravatar } } ])
That’ll grab not only all the comments but all their authors and gravatar pictures. You can mix and match symbols, arrays and hashes in any combination to describe the associations you want to load.
There's your explanation.
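The in-Ruby sort suggested in this answer can be tried out with plain structs. The Struct definitions below are assumptions that stand in for the ActiveRecord models; the sketch uses sort_by, which computes each sort key once rather than on every comparison.

```ruby
# Plain-Ruby sketch: sort sale items by the name of the vendor reached via item.
# The Struct definitions stand in for the ActiveRecord models and are assumptions.
Vendor   = Struct.new(:name)
Item     = Struct.new(:title, :vendor)
SaleItem = Struct.new(:item, :quantity)

def sort_by_vendor_name(sale_items)
  # sort_by computes each key once, unlike sort with a comparison block
  sale_items.sort_by { |sale_item| sale_item.item.vendor.name }
end

# Made-up sample data: two items from Acme and one from Zenith.
def sample_sale_items
  acme   = Vendor.new("Acme")
  zenith = Vendor.new("Zenith")
  [
    SaleItem.new(Item.new("Widget", zenith), 3),
    SaleItem.new(Item.new("Gadget", acme), 1),
    SaleItem.new(Item.new("Gizmo", acme), 7)
  ]
end
```

Note that sort_by is not guaranteed to be stable, so items from the same vendor may come back in any relative order unless a secondary key is added.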

Do you really need your sale_items sorted by the database, or could you wait until it is presented and do the sorting client side via javascript (there are some great sorting libraries out there) - that would save server CPU and (backend) code complexity.

Rails Thinking Sphinx:- How to select only some fields in the result and multiple tables select(association)

I am a rookie with Thinking Sphinx for Rails.
When Sphinx finds a record, it returns all the fields in the table. How can I select only the fields I need?
In my case, I also need to reference another table. How can I do that?
Thanks

This is an old thread, but as I found it whilst looking for the same information, I thought I'd share my answer.
It's not (as far as I can tell) clearly defined on the Thinking Sphinx homepage, but the search function on a model accepts the option :select - and in answer to your second question, it also accepts :joins, so if you had two related models:
class Project < ActiveRecord::Base
  attr_accessible :name
  has_many :tasks
end
class Task < ActiveRecord::Base
  attr_accessible :name
  belongs_to :project
end
You should be able to search your tasks like so:
Task.search "Fix Bug",
  :select => 'tasks.id, tasks.name, projects.name as project_name',
  :joins => [:project]
There's no doubt a slightly cleaner way to do this, so I'm happy to be corrected - the general idea works though!
EDIT (for thinking sphinx v3)
:select is used as a sphinx parameter in version 3, and instead you should add :select and :joins to a :sql hash. Otherwise you get some really strange errors that aren't that obvious!
The above example then becomes:
Task.search "Fix Bug",
  :sql => { :select => 'tasks.id, tasks.name, projects.name as project_name',
            :joins => [:project] }

Querying a polymorphic association

I have a polymorphic association like this -
class Image < ActiveRecord::Base
  has_one :approval, :as => :approvable
end
class Page < ActiveRecord::Base
  has_one :approval, :as => :approvable
end
class Site < ActiveRecord::Base
  has_one :approval, :as => :approvable
end
class Approval < ActiveRecord::Base
  belongs_to :approvable, :polymorphic => true
end
I need to find approvals where approval.approvable.deleted = false
I have tried something like this -
@approvals = Approval.find(:all,
  :include => [:approvable],
  :conditions => [":approvable.deleted = ?", false ])
This gives a "Can not eagerly load the polymorphic association :approvable" error.
How can the condition be given correctly so that I get a result set with approvals whose approvable item is not deleted?
Thanks for any help in advance

This is not possible, since the "approvables" reside in different tables. Instead you will have to fetch all approvals and then filter them with the normal array methods.
@approvals = Approval.all.select { |approval| !approval.approvable.deleted? }

What you're asking for, in terms of SQL, is projecting data from different tables for different rows in the result set. To my knowledge that is not possible.
So you'll have to be content with:
@approvals = Approval.all.reject { |a| a.approvable.deleted? }
# I assume you have a deleted? method in all the approvables

I would recommend either of the answers already presented here (they are the same thing), but I would also recommend putting that deleted flag into the Approval model if you really care about doing it all in a single query.
With a polymorphic relationship Rails can eager-load the associated records, but you can't join to them because, once again, the target tables are not known up front, so the load is actually several queries whose results are combined.
So in the end, if you REALLY need to, drop into SQL and combine all the possible joins to every type of approvable in a single query, but you will have to do lots of joining manually (manually meaning not using Rails' built-in mechanisms).
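As a rough illustration of why a polymorphic load ends up as several lookups, here is a plain-Ruby sketch that groups approvals by type, fetches each type's rows in one simulated lookup, and then filters. TABLES, APPROVALS and the method names are hypothetical; this is not how ActiveRecord is implemented.

```ruby
# Plain-Ruby sketch of filtering approvals by a flag on a polymorphic target.
# TABLES simulates three separate database tables; all names are hypothetical.
TABLES = {
  "Image" => { 1 => { deleted: false }, 2 => { deleted: true } },
  "Page"  => { 1 => { deleted: false } },
  "Site"  => { 1 => { deleted: true } }
}.freeze

APPROVALS = [
  { id: 1, approvable_type: "Image", approvable_id: 1 },
  { id: 2, approvable_type: "Image", approvable_id: 2 },
  { id: 3, approvable_type: "Page",  approvable_id: 1 },
  { id: 4, approvable_type: "Site",  approvable_id: 1 }
].freeze

# Simulates one "SELECT ... WHERE id IN (...)" against a single table.
def fetch_rows(type, ids)
  TABLES.fetch(type).select { |id, _row| ids.include?(id) }
end

# Loads targets with one lookup per type, then drops approvals whose
# target is flagged as deleted.
def undeleted_approvals(approvals)
  approvals.group_by { |a| a[:approvable_type] }.flat_map do |type, group|
    rows = fetch_rows(type, group.map { |a| a[:approvable_id] })
    group.reject { |a| rows.fetch(a[:approvable_id])[:deleted] }
  end
end
```

The number of lookups grows with the number of approvable types rather than with the number of approvals, which is the main gain over calling approvable on each record in turn.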

Thanks for your answers.
I was pretty sure that this couldn't be done, but I wanted some more confirmation. I was also hoping for a solution other than looping through the result set, to avoid performance issues later.
For the time being both reject and select are fine, but in the long run I will have to do those SQL joins manually.
Thanks again for your help!!
M

How do I make named_scope work properly with a joined table?

Here's my situation. I have two tables: pledges and pledge_transactions. When a user makes a pledge, he has only a row in the pledges table.
Later when it comes time to fulfill the pledge, each payment is logged in my pledge_transactions table.
I need to be able to query all open pledges which means that the sum of the amounts in the transactions table is less than the pledged amount.
Here's what I have so far:
named_scope :open,
  :group => 'pledges.id',
  :include => :transactions,
  :select => 'pledge_transactions.*',
  :conditions => 'pledge_transactions.id is not null or pledge_transactions.id is null',
  :having => 'sum(pledge_transactions.amount) < pledges.amount or sum(pledge_transactions.amount) is null'
You might be asking yourself why I have that superfluous and ridiculous conditions option specified. The answer is that when I don't force ActiveRecord to acknowledge the pledge_transactions table in the conditions, it omits it completely, which means my having clause becomes meaningless.
My belief is that I have run into a shortcoming of ActiveRecord.
Ultimately I need to be able to do the following:
Pledge.open
Pledge.open.count
Pledge.open.find(:all, ...)
etc.
Anybody have a more elegant answer to this problem? Please no suggestions of incrementing a pledges amount_given field each time a transaction occurs. That feels like a band-aid approach and I'm much more of a fan of keeping the pledge static after it is created and computing the difference.
If I weren't using Rails here, I'd just create a view and be done with it.
Thanks!

How is the :transactions association defined? Does it stipulate :class_name => 'PledgeTransaction' (or whatever the class is, if it uses set_table_name)?
Have you looked at the :joins parameter? I think it might be what you were looking for. Certainly that :conditions thing doesn't look right.
"If I weren't using Rails here, I'd just create a view and be done with it"
Just because it's Rails doesn't mean you can't use a view. OK, depending on the way it's constructed you may not be able to update it, but otherwise go for it. You can create and drop views in migrations, too:
class CreateReallyUsefulView < ActiveRecord::Migration
  def self.up
    # this is Oracle, I don't know if CREATE OR REPLACE is widely-supported
    sql = %{
      CREATE OR REPLACE VIEW really_usefuls AS
      SELECT
      ... blah blah SQL blah
    }
    execute sql
  end
  def self.down
    execute 'drop view really_usefuls'
  end
end
class ReallyUseful < ActiveRecord::Base
  # all the usual stuff here, considering overriding the C, U and D parts
  # of CRUD if it's supposed to be read-only and you're paranoid
end
I think the books/docs don't go into this much because implementation of, and support for views varies significantly across platforms.

I think using NOT EXISTS in your conditions will get you what you want. I'm assuming the foreign key on pledge_transactions is pledge_id. Here's how I would implement #open:
named_scope :open,
  :conditions => "
    NOT EXISTS (
      SELECT 1
      FROM pledge_transactions
      WHERE pledge_transactions.pledge_id = pledges.id
      GROUP BY pledge_transactions.pledge_id
      HAVING SUM(pledge_transactions.amount) >= pledges.amount
    )
  "
This will allow you to do Pledge.open, Pledge.open.count and Pledge.open.find_by_{what ever}.
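For checking any SQL version of the scope against expectations, the rule from the question (a pledge is open while its transactions sum to less than the pledged amount) can be written out in plain Ruby. The sample data and method names below are made up for illustration.

```ruby
# Plain-Ruby sketch of the "open pledge" rule from the question:
# a pledge is open while its transactions sum to less than the pledged amount.
# The sample data and method names are made up for illustration.
PLEDGES = [
  { id: 1, amount: 100 }, # partially paid
  { id: 2, amount: 50 },  # fully paid
  { id: 3, amount: 75 }   # no transactions yet
].freeze

PLEDGE_TRANSACTIONS = [
  { pledge_id: 1, amount: 40 },
  { pledge_id: 1, amount: 30 },
  { pledge_id: 2, amount: 50 }
].freeze

# Sums the transaction amounts per pledge.
def paid_totals(transactions)
  totals = Hash.new(0) # pledges without transactions default to 0
  transactions.each { |t| totals[t[:pledge_id]] += t[:amount] }
  totals
end

# Keeps the pledges whose paid total is still below the pledged amount.
def open_pledges(pledges, transactions)
  totals = paid_totals(transactions)
  pledges.select { |pledge| totals[pledge[:id]] < pledge[:amount] }
end
```

A pledge with no transactions counts as open here because its total defaults to zero, which matches the "is null" branch of the HAVING clause in the question.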
