How to (massively) reduce the number of SQL queries in Rails app? - ruby-on-rails

In my Rails app I have users which can have many invoices which in turn can have many payments.
Now in the dashboard view I want to summarize all the payments a user has ever received, ordered either by year, quarter, or month. The payments are also subdivided into gross, net, and tax.
user.rb:
class User < ActiveRecord::Base
has_many :invoices
has_many :payments
def years
(first_year..current_year).to_a.reverse
end
def year_ranges
years.map { |y| Date.new(y,1,1)..Date.new(y,-1,-1) }
end
def quarter_ranges
...
end
def month_ranges
...
end
def revenue_between(range, kind)
payments_with_invoice ||= payments.includes(:invoice => :items).all
payments_with_invoice.select { |x| range.cover? x.date }.sum(&:"#{kind}_amount")
end
end
invoice.rb:
class Invoice < ActiveRecord::Base
belongs_to :user
has_many :items
has_many :payments
def total
items.sum(&:total)
end
def subtotal
items.sum(&:subtotal)
end
def total_tax
items.sum(&:total_tax)
end
end
payment.rb:
class Payment < ActiveRecord::Base
belongs_to :user
belongs_to :invoice
def percent_of_invoice_total
(100 / (invoice.total / amount.to_d)).abs.round(2)
end
def net_amount
invoice.subtotal * percent_of_invoice_total / 100
end
def taxable_amount
invoice.total_tax * percent_of_invoice_total / 100
end
def gross_amount
invoice.total * percent_of_invoice_total / 100
end
end
dashboards_controller:
class DashboardsController < ApplicationController
def index
if %w[year quarter month].include?(params[:by])
range = params[:by]
else
range = "year"
end
@ranges = @user.send("#{range}_ranges")
end
end
index.html.erb:
<% @ranges.each do |range| %>
<%= render :partial => 'range', :object => range %>
<% end %>
_range.html.erb:
<%= @user.revenue_between(range, :gross) %>
<%= @user.revenue_between(range, :taxable) %>
<%= @user.revenue_between(range, :net) %>
Now the problem is that this approach works but produces an awful lot of SQL queries as well. In a typical dashboard view I get 100+ SQL queries. Before adding .includes(:invoice) there were even more queries.
I assume one of the major problems is that each invoice's subtotal, total_tax and total aren't stored anywhere in the database but instead calculated with every request.
Can anybody tell me how to speed up things here? I am not too familiar with SQL and the inner workings of ActiveRecord, so that's probably the problem here.
Thanks for any help.

Whenever revenue_between is called, it fetches the payments in the given time range and the associated invoices and items from the db. Since the time ranges have a lot of overlap (month, quarter, year), the same records are being fetched over and over again.
I think it is better to fetch all the payments of the user once, then filter and summarize them in Ruby.
To implement, change the revenue_between method as follows:
def revenue_between(range, kind)
# store all the payments in an instance variable to avoid duplicate queries
@payments_with_invoice ||= payments.includes(:invoice => :items).all
@payments_with_invoice.select{|x| range.cover? x.created_at}.sum(&:"#{kind}_amount")
end
This would eager load all the payments along with associated invoices and items.
Also change the invoice summation methods so that they use the eager-loaded items:
class Invoice < ActiveRecord::Base
def total
items.map(&:total).sum
end
def subtotal
items.map(&:subtotal).sum
end
def total_tax
items.map(&:total_tax).sum
end
end
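The load-once idea can be sketched in plain Ruby without ActiveRecord. Everything here is a stand-in invented for illustration: the PaymentRow struct, the FakeUser class, and the load counter. In the real app the memoized block would be the single payments.includes(...) query.

```ruby
require "date"

# Hypothetical stand-in for a payment row with a precomputed gross amount.
PaymentRow = Struct.new(:date, :gross_amount)

class FakeUser
  attr_reader :load_count

  def initialize(rows)
    @rows = rows
    @load_count = 0
  end

  # Simulates the one expensive query; memoized per user instance.
  def payments_with_invoice
    @payments_with_invoice ||= begin
      @load_count += 1
      @rows.dup
    end
  end

  # Filters the already-loaded rows in Ruby instead of querying per range.
  def revenue_between(range, kind)
    payments_with_invoice
      .select { |p| range.cover?(p.date) }
      .sum { |p| p.public_send("#{kind}_amount") }
  end
end

user = FakeUser.new([
  PaymentRow.new(Date.new(2013, 1, 15), 100),
  PaymentRow.new(Date.new(2013, 7, 1), 50),
  PaymentRow.new(Date.new(2012, 3, 3), 25)
])

year_2013 = Date.new(2013, 1, 1)..Date.new(2013, -1, -1)
year_2012 = Date.new(2012, 1, 1)..Date.new(2012, -1, -1)
totals = [year_2013, year_2012].map { |r| user.revenue_between(r, :gross) }
```

However many ranges the view asks for, the simulated load runs only once for that user object.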

Apart from the memoizing strategy proposed by @tihom, I suggest you have a look at the Bullet gem which, as its description says, helps you kill N+1 queries and unused eager loading.

Most of your data do not need to be real time. You can have a service calculating the stats and storing them wherever you want (Redis, cache...). Then refresh them every 10 minutes or upon user's request.
In the first place, render your page without stats and load them with ajax.
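A minimal sketch of the "compute, store, refresh every 10 minutes" idea, in plain Ruby. StatsCache and its injectable clock are hypothetical names made up for this example; a real app would more likely put the value in Rails.cache or Redis, as suggested above.

```ruby
# Hypothetical in-memory cache with a time-to-live; not a real library class.
class StatsCache
  def initialize(ttl_seconds, clock: -> { Time.now })
    @ttl = ttl_seconds
    @clock = clock
    @value = nil
    @computed_at = nil
  end

  # Returns the cached value, recomputing it when missing, stale, or forced.
  def fetch(force: false)
    now = @clock.call
    if force || @computed_at.nil? || now - @computed_at >= @ttl
      @value = yield
      @computed_at = now
    end
    @value
  end
end

calls = 0
fake_now = Time.at(0)
cache = StatsCache.new(600, clock: -> { fake_now })

first  = cache.fetch { calls += 1; :stats }
second = cache.fetch { calls += 1; :stats } # still fresh, block is not run
fake_now += 601
third  = cache.fetch { calls += 1; :stats } # stale, block runs again
```

Passing force: true covers the "upon user's request" case, since it recomputes regardless of age.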

Related

What is the best way to store a multi-dimensional counter_cache?

In my application I have a search_volume.rb model that looks like this:
search_volume.rb:
class SearchVolume < ApplicationRecord
# t.integer "keyword_id"
# t.integer "search_engine_id"
# t.date "date"
# t.integer "volume"
belongs_to :keyword
belongs_to :search_engine
end
keyword.rb:
class Keyword < ApplicationRecord
has_and_belongs_to_many :labels
has_many :search_volumes
end
search_engine.rb:
class SearchEngine < ApplicationRecord
belongs_to :country
belongs_to :language
end
label.rb:
class Label < ApplicationRecord
has_and_belongs_to_many :keywords
has_many :search_volumes, through: :keywords
end
On the label#index page I am trying to show the sum of search_volumes for the keywords in each label for the last month for the search_engine that the user has cookied. I am able to do this with the following:
<% @labels.each do |label| %>
<%= number_with_delimiter(label.search_volumes.where(search_engine_id: cookies[:search_engine_id]).where(date: 1.month.ago.beginning_of_month..1.month.ago.end_of_month).sum(:volume)) %>
<% end %>
This works well but I have the feeling that the above is very inefficient. With the current approach I also find it difficult to do operations on search volumes. Most of the time I just want to know about last month's search volume.
Normally I would create a counter_cache on the keywords model to keep track of the latest search_volume, but since there are dozens of search_engines I would have to create one for each, which is also inefficient.
What's the most efficient way to store last month's search volume for all the different search engines separately?
First of all, you can optimize your current implementation by doing one single request for all involved labels like so:
# models
class SearchVolume < ApplicationRecord
# ...
# the best place for your filters!
scope :last_month, -> { where(date: 1.month.ago.beginning_of_month..1.month.ago.end_of_month) }
scope :search_engine, ->(search_engine_id) { where(search_engine_id: search_engine_id) }
end
class Label < ApplicationRecord
# ...
# returns { label_id1 => search_volume_sum1, label_id2 => search_volume_sum2, ... }
def self.last_month_search_volumes_per_label_report(labels, search_engine_id:)
labels.
group(:id).
left_outer_joins(:search_volumes).
merge(SearchVolume.last_month.search_engine(search_engine_id)).
pluck(:id, 'SUM(search_volumes.volume)').
to_h
end
end
# controller
class LabelsController < ApplicationController
def index
@labels = Label.all
@search_volumes_report =
Label.last_month_search_volumes_per_label_report(
@labels, search_engine_id: cookies[:search_engine_id]
)
end
end
# view
<% @labels.each do |label| %>
<%= number_with_delimiter(@search_volumes_report[label.id]) %>
<% end %>
Please note that I have not tested it with the same architecture, but with similar models I have on my local machine. It may work by adjusting a few things.
My proposed approach still queries the database live. If you really need to store values somewhere because you have very large datasets, I suggest two solutions:
- using materialized views that you could refresh each month (the scenic gem offers a good way to handle views in a Rails application: https://github.com/scenic-views/scenic)
- implementing a new table with standard relations between models, which would store your calculations by ids and months and which you could populate each month using rake tasks; you would then simply eager load your calculations
Please let me know your feedback!
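To make the second option more concrete, here is a rough plain-Ruby sketch of what the monthly rake task might compute before writing to such a table. VolumeRow and monthly_totals are hypothetical names, and the rows are made-up sample data, not results from the models above.

```ruby
require "date"

# Hypothetical flattened rows: one per (label, search engine, day).
VolumeRow = Struct.new(:label_id, :search_engine_id, :date, :volume)

# Builds { [label_id, search_engine_id, "YYYY-MM"] => total volume } in one pass.
def monthly_totals(rows)
  rows.each_with_object(Hash.new(0)) do |row, totals|
    key = [row.label_id, row.search_engine_id, row.date.strftime("%Y-%m")]
    totals[key] += row.volume
  end
end

rows = [
  VolumeRow.new(1, 10, Date.new(2018, 4, 1), 300),
  VolumeRow.new(1, 10, Date.new(2018, 4, 20), 200),
  VolumeRow.new(1, 11, Date.new(2018, 4, 5), 70),
  VolumeRow.new(2, 10, Date.new(2018, 3, 9), 40)
]

totals = monthly_totals(rows)
april_label_1_engine_10 = totals[[1, 10, "2018-04"]]
```

Keying by label, search engine, and month means one stored row per combination, so dozens of search engines need no extra columns.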

How to update a column using active record concurrently?

Here are my models - User, UserDevice, Device.
The tables have the following columns:
users - id, name, email, phone, etc.
devices - id, imei, model etc.
user_devices - id, user_id, device_id etc.
Now I added a new column to devices trk, and want to update it to 1 for a huge number of users.
Here's how my models look.
class User < ActiveRecord::Base
has_many :user_devices
has_one :some_model
# ....
end
class Device < ActiveRecord::Base
has_many :user, through: :user_devices
attr_accessor :device_user_phone
# ....
# some callbacks which depend on trk & device_user_phone
end
class UserDevice < ActiveRecord::Base
belongs_to :device
belongs_to :user
# ....
end
The list of user ids is provided in a csv file like this:
1234
1235
1236
....
This is what I tried so far, but it updates one after the other and takes a lot of time. Also, I can't use update_all, as I want the callbacks to be triggered.
def update_user_trk(user_id)
begin
user = User.find_by(id: user_id)
user_devices = user.user_devices
user_devices.each do |user_device|
user_device.device.update(trk: 1, device_user_col: user.some_model.col) if user_device.device.trk.nil?
end
rescue StandardError => e
Rails.logger.error("ERROR_FOR_USER_#{user_id}::#{e}")
end
end
def read_and_update(file_path)
start_time = Time.now.to_i
unless File.file? file_path
Rails.logger.error("File not found")
end
CSV.foreach(file_path, :headers => false) do |row|
update_user_trk(row[0])
end
end_time = Time.now.to_i
return end_time-start_time
end
Since this updates one row after another in a sequence, it's taking a lot of time and I was wondering if this can be done concurrently to speed it up.
Ruby version : ruby-2.2.3
rails version : Rails 4.2.10
This is at least slightly quicker: wrapping the loop in a single transaction means you're not creating and committing a new transaction on every update.
user = User.find_by(id: user_id)
Device.transaction do
  user.user_devices.each do |user_device|
    device = user_device.device
    # update still runs the callbacks, but all writes commit together
    device.update(trk: 1, device_user_col: user.some_model.col) if device.trk.nil?
  end
end
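On the concurrency part of the question, one possible shape is a small pool of worker threads pulling ids from a shared queue. This is only a sketch under assumptions: update_user_trk is stubbed here so the example runs on its own, and process_concurrently is a made-up helper. With ActiveRecord, each thread would also need its own database connection (for example via ActiveRecord::Base.connection_pool.with_connection), so the pool size caps how many workers are useful.

```ruby
# Hypothetical stand-in for the per-user update; the real method would hit the database.
def update_user_trk(user_id)
  user_id.to_i * 2
end

# Processes ids with a fixed number of worker threads pulling from a shared queue.
def process_concurrently(user_ids, workers: 4)
  queue = Queue.new
  user_ids.each { |id| queue << id }
  workers.times { queue << :done } # one stop marker per worker

  results = Queue.new # Queue is thread-safe, so no explicit mutex is needed
  threads = Array.new(workers) do
    Thread.new do
      while (id = queue.pop) != :done
        results << [id, update_user_trk(id)]
      end
    end
  end
  threads.each(&:join)

  # All workers have finished, so draining the results queue cannot block.
  Array.new(results.size) { results.pop }.to_h
end

processed = process_concurrently(%w[1234 1235 1236], workers: 2)
```

Stop markers are used instead of Queue#close so the sketch stays compatible with older Rubies such as the 2.2.3 mentioned in the question.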

Ruby-on-Rails display issue - May 2018 only exists once but my app is showing it 7 times

I'm a beginner to coding. Apologies if this is some simple answer, but I've been looking for a couple hours and no luck.
Current problem: May 2018 exists with 1 run.
On my /months index page, there is a link to the May 2018 page, where I could create my future runs this month.
However, if I create a 2nd run, when I navigate back to my /months index page, TWO links to May 2018 show up (not one like I expect).
In the db there is only one object May 2018, and it owns both runs. (It then becomes 3, 4, 5, 6, etc. links when I create more runs...)
Quick summary: This is a running-log app. A month has_many runs.
When I create a run, it's attached to a month.
runs_controller.rb
def create
@run = @month.runs.build(run_params)
@run[:user_id] = current_user.id
@run[:pace_per_mile] = @run.format_pace_per_mile
if @run.save
redirect_to month_path(@month)
else
@month = Month.find(params[:month_id])
@runs = @month.runs
render 'months/show'
end
end
Here is my /month index.html.erb code where the error is happening:
<strong><h2>Your Previous Runs</h2></strong>
<% @months.each do |month| %>
<%= link_to(month) do %>
<h3><%= month.name %> <%= month.year %></h3>
<% end %>
<% end %>
Here is my months#index so you can see the scope.
def index
@months = current_user.months
@month = Month.new
end
I can provide more code if I'm not including something!
@xploshioOn, @fool-dev, and @moveson, thanks for your responses.
I'm including the month and user models, as well as the code where a month gets created...
month.rb
class Month < ApplicationRecord
has_many :runs
has_many :users, through: :runs
validates :name, :year, presence: true
def month_mileage
self.runs.collect {|run| run.distance}.sum
end
end
user.rb
class User < ApplicationRecord
has_secure_password
validates :email, presence: true
validates :email, uniqueness: true
validates :password, presence: true
validates :password, confirmation: true
has_many :runs
has_many :months, through: :runs
end
I'm currently creating months from the months_controller. I'm starting to get the feeling this is where my error lies?
def create
@month = Month.new(month_params)
if @month.save
redirect_to month_url(@month)
else
@months = current_user.months
render :index
end
end
Thanks again for any advice!
It may be confusing to have the relationship with a User having many months through runs. Consider whether that relationship is necessary at all.
If you want to keep your current has many through relationship between User and Month, in your MonthsController#index action, you can do this:
def index
@months = current_user.months.uniq
@month = Month.new
end
If you want to do away with that relationship, in your MonthsController#index action, I would do this:
def index
@months = current_user.runs.map(&:month).uniq
@month = Month.new
end
Accessing months through current_user.runs is more explicit and may be easier to follow. Calling .uniq on the result will eliminate duplicates.
Keep in mind that both of the above options will give you back an Array rather than an ActiveRecord::Relation. To avoid this, you could run your query directly on the Month model:
def index
@months = Month.joins(runs: :user).where(users: {id: current_user}).distinct
@month = Month.new
end
This will return an ActiveRecord::Relation, allowing you to further refine your query.
You are loading an association and it takes the month for every item that is in the relationship. So just add .uniq to your query.
@months = current_user.months.uniq
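Why the duplicates appear can be shown with a plain-Ruby sketch; MonthRec and RunRec are hypothetical stand-ins for the real records. Going through runs yields one month per run, so a month with three runs shows up three times until duplicates are removed.

```ruby
# Hypothetical stand-ins for the Month and Run records.
MonthRec = Struct.new(:id, :name, :year)
RunRec   = Struct.new(:id, :month)

may  = MonthRec.new(1, "May", 2018)
runs = [RunRec.new(1, may), RunRec.new(2, may), RunRec.new(3, may)]

# Going through runs yields one month per run, so the same month repeats.
months_via_runs = runs.map(&:month)
unique_months   = months_via_runs.uniq
```

Here months_via_runs holds three entries for the same month, while unique_months holds one.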

Calculations before_validation in Rails

I have two models with this relationship:
class Totalsold < ActiveRecord::Base
attr_accessible :qty, :total_cost, :date, :price_id, :price_attributes
belongs_to :price
accepts_nested_attributes_for :price
before_validation :calculation_total_cost
private
def calculation_total_cost
@price = Price.where(:id => price_id).first
if qty.nil?
self.qty = 0
end
self.total_cost = qty.to_f * @price.cost
end
end
class Price < ActiveRecord::Base
attr_accessible :cost
has_many :totalsolds
end
The calculation_total_cost method successfully sets total_cost to qty * cost in before_validation. Is this a bad approach? I'm creating multiple records at once, and the log shows a lot of queries when I submit the form (I pasted the app log on pastebin).
Is there a better-performing way to handle my case?
This is create method :
def create
@totalsolds = params[:totalsolds].values.collect { |ts| Totalsold.new(ts) }
if @totalsolds.all?(&:valid?)
@totalsolds.each(&:save!)
redirect_to lhg_path
else
render :action => 'new'
end
end
To make it more efficient, you'll need to do the following:
- Reduce save calls to 1 per object
- Move your function to before_save
- Remove any unnecessary queries from your callback
Create
Firstly, you need to make your create method more efficient. Currently, you're cycling through the params[:totalsolds] hash, and running validation & save requests every time. It just looks very cumbersome to me:
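def create
  # Build each record once from the submitted attributes
  @totalsolds = params[:totalsolds].values.map { |attrs| Totalsold.new(attrs) }
  # save! validates and saves in one call; the transaction rolls everything back if any record is invalid
  Totalsold.transaction { @totalsolds.each(&:save!) }
  redirect_to lhg_path
rescue ActiveRecord::RecordInvalid
  render :action => 'new'
end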
Before Save
Currently, you're calling before_validation. This means every time you validate an ActiveRecord object, your callback will be running. This is inefficient, although might be part of the way your app works
I would move this to the before_save callback:
before_save :set_qty
before_save :calculate_total_cost
private
def set_qty
  self.qty = 0 if qty.nil?
end
def calculate_total_cost
  price = Price.find(price_id).cost
  # self. is required here; a bare total_cost = ... would only set a local variable
  self.total_cost = qty * price #-> qty doesn't need to be float (I think)
end
Unnecessary Queries
Your main problem is you're using a lot of queries which you don't need. Prime example: Price.where(:id => price_id).first is HIGHLY inefficient -- just use find to pull a single record (as you're dealing with the primary key)
Hope this helps!!
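One way to drop the per-record price lookup altogether is to load the needed prices once and compute every total from that lookup table. The sketch below is plain Ruby with made-up data; PRICES, submitted, and with_total_cost are hypothetical names. In the Rails app the hash could presumably be built from something like Price.where(id: ids).index_by(&:id).

```ruby
# Hypothetical price table and submitted rows; names and values are illustrative only.
PRICES = { 1 => 2.5, 2 => 10.0 } # price_id => cost, loaded once up front

submitted = [
  { price_id: 1, qty: 4 },
  { price_id: 2, qty: nil }, # missing qty defaults to 0
  { price_id: 1, qty: 2 }
]

# Computes total_cost for every row without a per-row price lookup.
def with_total_cost(rows, prices)
  rows.map do |row|
    qty = row[:qty] || 0
    row.merge(qty: qty, total_cost: qty * prices.fetch(row[:price_id]))
  end
end

totals = with_total_cost(submitted, PRICES).map { |r| r[:total_cost] }
```

Using fetch makes an unknown price_id fail loudly instead of silently multiplying by nil.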

How to update instance variable in Rails model?

In my Rails app I have users who can have many payments.
class User < ActiveRecord::Base
has_many :invoices
has_many :payments
def year_ranges
...
end
def quarter_ranges
...
end
def month_ranges
...
end
def revenue_between(range, kind)
payments.sum_within_range(range, kind)
end
end
class Invoice < ActiveRecord::Base
belongs_to :user
has_many :items
has_many :payments
...
end
class Payment < ActiveRecord::Base
belongs_to :user
belongs_to :invoice
def net_amount
invoice.subtotal * percent_of_invoice_total / 100
end
def taxable_amount
invoice.total_tax * percent_of_invoice_total / 100
end
def gross_amount
invoice.total * percent_of_invoice_total / 100
end
def self.chart_data(ranges, unit)
ranges.map do |r| {
:range => range_label(r, unit),
:gross_revenue => sum_within_range(r, :gross),
:taxable_revenue => sum_within_range(r, :taxable),
:net_revenue => sum_within_range(r, :net) }
end
end
def self.sum_within_range(range, kind)
@sum ||= includes(:invoice => :items)
@sum.select { |x| range.cover? x.date }.sum(&:"#{kind}_amount")
end
end
In my dashboard view I am listing the total payments for the ranges depending on the GET parameter that the user picked. The user can pick either years, quarters, or months.
class DashboardController < ApplicationController
def show
if %w[year quarter month].include?(params[:by])
@unit = params[:by]
else
@unit = 'year'
end
@ranges = @user.send("#{@unit}_ranges")
@paginated_ranges = @ranges.paginate(:page => params[:page], :per_page => 10)
@title = "All your payments"
end
end
The use of the instance variable (@sum) greatly reduced the number of SQL queries here because the database won't get hit for the same queries over and over again.
The problem is, however, that when a user creates, deletes or changes one of his payments, this is not reflected in the @sum instance variable. So how can I reset it? Or is there a better solution to this?
Thanks for any help.
This is incidental to your question, but don't use #select with a block.
What you're doing is selecting all payments, and then filtering the relation as an array. Use Arel to overcome this :
scope :within_range, ->(range){ where date: range }
This will build an SQL BETWEEN statement. Using #sum on the resulting relation will build an SQL SUM() statement, which is probably more efficient than loading all the records.
Instead of storing the association as an instance variable of the Class Payment, store it as an instance variable of a user (I know it sounds confusing, I have tried to explain below)
class User < ActiveRecord::Base
has_many :payments
def revenue_between(range)
@payments_with_invoices ||= payments.includes(:invoice => :items).all
# @payments_with_invoices is an array now so cannot use Payment's class method on it
@payments_with_invoices.select { |x| range.cover? x.date }.sum(&:total)
end
end
When you defined @sum in a class method (class methods are denoted by self.) it became an instance variable of the class Payment. That means you can potentially access it as Payment.sum (with a suitable reader method). So this has nothing to do with a particular user and his/her payments. @sum is now an attribute of the class Payment and Rails would cache it the same way it caches the method definitions of a class.
Once @sum is initialized, it will stay the same, as you noticed, even after a user creates a new payment, or if a different user logs in for that matter! It will change when the app is restarted.
However, if you define @payments_with_invoices like I show above, it becomes an attribute of a particular instance of User, or in other words an instance-level instance variable. That means you can potentially access it as some_user.payments_with_invoices. Since an app can have many users, these are not persisted in Rails memory across requests. So whenever the user instance changes, its attributes are loaded again.
So if the user creates more payments, the @payments_with_invoices variable would be refreshed since the user instance is re-initialized.
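The difference between the two kinds of instance variable can be seen in a tiny plain-Ruby sketch. Report and its two methods are hypothetical names used only for this illustration.

```ruby
class Report
  # Inside a class method, @cache belongs to the Report class object itself,
  # so a single slot is shared for the lifetime of the process.
  def self.class_cache(value)
    @cache ||= value
  end

  # Inside an instance method, @cache belongs to this one Report instance.
  def instance_cache(value)
    @cache ||= value
  end
end

Report.class_cache(:first)
class_level = Report.class_cache(:second) # the first value sticks

a = Report.new
b = Report.new
a.instance_cache(:first)
instance_level = b.instance_cache(:second) # b starts with its own empty @cache
```

The class-level value survives every later call, which is why it goes stale, while each new instance starts fresh.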
Maybe you could do it with observers:
# payment.rb
def self.cached_sum(force=false)
if @sum.blank? || force
@sum = includes(:invoice => :items)
end
@sum
end
def self.sum_within_range(range)
@sum = cached_sum
@sum.select { |x| range.cover? x.date }.sum(&:total)
end
# payment_observer.rb
class PaymentObserver < ActiveRecord::Observer
# force @sum updating
def after_save(comment)
Payment.cached_sum(true)
end
def after_destroy(comment)
Payment.cached_sum(true)
end
end
You could find more about observers at http://apidock.com/rails/v3.2.13/ActiveRecord/Observer
Well, your @sum is basically a cache of the values you need. Like any cache, you need to invalidate it if something happens to the values involved.
You could use after_save or after_create filters to call a function which sets @sum = nil. It may also be useful to save the range your cache is covering and decide the invalidation by the date of the new or changed payment.
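class Payment < ActiveRecord::Base
  belongs_to :user
  after_save :invalidate_cache
  def self.sum_within_range(range)
    @cached_range = range
    @sum ||= includes(:invoice => :items)
    @sum.select { |x| range.cover? x.date }.sum(&:total)
  end
  # Class-level reset: drops the cache only if the changed date falls inside the cached range
  def self.invalidate_cache(changed_date)
    @sum = nil if @cached_range && @cached_range.cover?(changed_date)
  end
  private
  # after_save calls an instance method, so delegate to the class-level cache
  def invalidate_cache
    self.class.invalidate_cache(date)
  end
end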
