Here are my models: User, UserDevice, and Device.
The tables have the following columns:
users - id, name, email, phone, etc.
devices - id, imei, model etc.
user_devices - id, user_id, device_id etc.
I have now added a new column, trk, to devices and want to set it to 1 for the devices of a huge number of users.
Here is how my models look:
class User < ActiveRecord::Base
has_many :user_devices
has_one :some_model
# ....
end
class Device < ActiveRecord::Base
has_many :user_devices
has_many :users, through: :user_devices
attr_accessor :device_user_phone
# ....
# some callbacks which depend on trk & device_user_phone
end
class UserDevice < ActiveRecord::Base
belongs_to :device
belongs_to :user
# ....
end
The list of user ids is provided in a CSV file like this:
1234
1235
1236
....
This is what I have tried so far, but it updates the records one after another and takes a lot of time. I also can't use update_all, because I want the callbacks to be triggered.
def update_user_trk(user_id)
  begin
    user = User.find_by(id: user_id)
    user_devices = user.user_devices
    user_devices.each do |user_device|
      user_device.device.update(trk: 1, device_user_col: user.some_model.col) if user_device.device.trk.nil?
    end
  rescue StandardError => e
    Rails.logger.error("ERROR_FOR_USER_#{user_id}::#{e}")
  end
end

def read_and_update(file_path)
  start_time = Time.now.to_i
  unless File.file?(file_path)
    Rails.logger.error("File not found")
    return # nothing to process; CSV.foreach would raise on a missing file
  end
  CSV.foreach(file_path, :headers => false) do |row|
    update_user_trk(row[0])
  end
  end_time = Time.now.to_i
  end_time - start_time
end
Since this updates one row after another in a sequence, it's taking a lot of time and I was wondering if this can be done concurrently to speed it up.
Ruby version : ruby-2.2.3
rails version : Rails 4.2.10
This is at least slightly quicker: this way you're not creating and committing a new transaction on every update.
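user = User.find_by(id: user_id)
# One transaction for all of this user's devices instead of one per update.
# Each device is saved explicitly: user.save alone does not persist changes to
# already-saved records reached through user_devices unless autosave is enabled.
ActiveRecord::Base.transaction do
  user.user_devices.each do |user_device|
    device = user_device.device
    next unless device.trk.nil?
    device.trk = 1
    device.device_user_col = user.some_model.col
    device.save!
  end
end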
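On the concurrency part of the question, here is a minimal plain-Ruby sketch of reading the ids and handing them to a small pool of worker threads. The names read_ids and process_concurrently and the default thread count are hypothetical, not from the original post. In the Rails app the block would call update_user_trk, each thread would need its own database connection (for example via ActiveRecord::Base.connection_pool.with_connection), and the pool size in database.yml would have to be at least the thread count. Whether this helps depends on how much of the time is spent waiting on the database.

```ruby
require 'thread'

# Reads one id per line, skipping blank lines.
def read_ids(path)
  File.foreach(path).map(&:strip).reject(&:empty?)
end

# Runs the given block once per id, spread over thread_count worker threads.
# Returns an array of [id, error] pairs for the ids whose block raised.
def process_concurrently(ids, thread_count: 4)
  queue = Queue.new
  ids.each { |id| queue << id }
  failures = Queue.new

  workers = Array.new(thread_count) do
    Thread.new do
      loop do
        begin
          id = queue.pop(true) # non-blocking; raises ThreadError once the queue is empty
        rescue ThreadError
          break
        end
        begin
          yield id
        rescue StandardError => e
          failures << [id, e] # keep going; one bad id should not stop the run
        end
      end
    end
  end
  workers.each(&:join)

  Array.new(failures.size) { failures.pop }
end
```

Usage would look something like process_concurrently(read_ids(file_path), thread_count: 4) { |id| update_user_trk(id) }.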
I have a model RegularOpeningHour(dayOfWeek: integer) that is associated with a model OpeningTime(opens: time, closes: time). RegularOpeningHour has a 1:n relation to OpeningTime, so that a specific day can have many opening times.
(I know that I simply could have one entry with 'opens' and 'closes' included in RegularOpeningHour but for other reasons I need this splitting)
Now I want an open? method that returns whether the business is open or not. I tried the following in my model file regular_opening_hour.rb:
def open?
  RegularOpeningHour.where(dayOfWeek: Time.zone.now.wday).any? { |opening_hour| opening_hour.opening_times.where('? BETWEEN opens AND closes', Time.zone.now).any? }
end
Unfortunately, that doesn't work. Any ideas how to solve this?
How about this:
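def open?
  # joins is a class-level query method, so call it on the model rather than on the instance
  RegularOpeningHour.joins(:opening_times)
    .where(dayOfWeek: Time.current.wday)
    .where("opening_times.opens <= :time AND opening_times.closes >= :time", time: Time.current)
    .any?
end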
EDIT: Missing ':' in the join
You could create some scopes to make selecting open OpeningTimes and open RegularOpeningHours less clunky. This makes creating the given selection much easier.
class OpeningTime < ApplicationRecord
# ...
belongs_to :regular_opening_hour
def self.open
time = Time.current
where(arel_table[:opens].lteq(time).and(arel_table[:closes].gteq(time)))
end
# ...
end
class RegularOpeningHour < ApplicationRecord
# ...
has_many :opening_times
def self.open
where(
dayOfWeek: Time.current.wday,
id: OpeningTime.select(:regular_opening_hour_id).open,
)
end
# ...
end
def open?
RegularOpeningHour.open.any?
end
Since you have a has_many association from RegularOpeningHour to OpeningTime, you can use a join query like the one below:
RegularOpeningHour.joins(:opening_times).where(dayOfWeek: Time.zone.now.wday).where('? BETWEEN opening_times.opens AND opening_times.closes', Time.zone.now).any?
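The answers above differ mainly in how the query is phrased. The predicate itself is "some slot for today's weekday contains the current time of day". Here is a plain-Ruby sketch of that predicate, which can be handy for unit-testing the logic without a database. OpeningSlot, minutes_of_day, open_at? and the "HH:MM" string format are assumptions made for this illustration, not part of the models in the question.

```ruby
# Hypothetical stand-in for an OpeningTime row; opens/closes are "HH:MM" strings.
OpeningSlot = Struct.new(:opens, :closes)

# Converts "HH:MM" into minutes since midnight.
def minutes_of_day(hhmm)
  hours, minutes = hhmm.split(':').map(&:to_i)
  hours * 60 + minutes
end

# schedule maps a weekday number (Time#wday, 0 = Sunday) to that day's slots.
# Only the time of day is compared, so the date part of `time` does not matter.
def open_at?(schedule, time)
  now = time.hour * 60 + time.min
  schedule.fetch(time.wday, []).any? do |slot|
    minutes_of_day(slot.opens) <= now && now <= minutes_of_day(slot.closes)
  end
end
```

Comparing only the time of day sidesteps the question of which date a time column carries when it is compared against a full timestamp.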
Given the following models:
class Client < ApplicationRecord
has_many :preferences
validates_associated :preferences
accepts_nested_attributes_for :preferences
end
class Preference < ApplicationRecord
belongs_to :client
validates_uniqueness_of :block, scope: [:day, :client_id]
end
I'm still able to create preferences with duplicate days* when creating a batch of preferences during client creation. This is (seemingly) because the client_id foreign key isn't available when the validates_uniqueness_of validation is run. (*I have an index in place which prevents the duplicate from being saved, but I'd like to catch the error, and return a user friendly error message, before it hits the database.)
Is there any way to prevent this from happening via ActiveRecord validations?
EDIT: This appears to be a known issue.
There's not a super clean way to do this with AR validations when you're batch inserting, but you can do it manually with the following steps:
1. Make a single query to the database using a PostgreSQL VALUES list to load any potentially duplicate records.
2. Compare the records you are about to batch create and pull out any duplicates.
3. Manually generate and return your error message.
Step 1 looks a little like this
# Build an array of the unique attribute pairs we want to check for
uniq_attrs = new_collection.map do |record|
  [
    record.day,
    record.client_id,
  ]
end
# Sanitize the values and create a tuple like ('Monday', 5)
values = uniq_attrs.map do |attrs|
  safe = attrs.map { |v| ActiveRecord::Base.connection.quote(v) }
  "( #{safe.join(",")} )"
end
existing = Preference.where(%{
  (day, client_id) in
  (#{values.join(",")})
})
# The SQL looks like
# select * from preferences where (day, client_id) in (('Monday',5), ('Tuesday', 3) ....)
Then you can take the collection existing and use it in steps 2 and 3 to pull out your duplicates and generate your error messages.
When I've needed this functionality, I've generally made it a self method off my class, so something like
class Preference < ApplicationRecord
  def self.filter_duplicates(collection)
    # blah blah blah from above
    non_duplicates = collection.reject do |record|
      existing.find do |exist|
        exist.duplicate?(record)
      end
    end
    [non_duplicates, existing]
  end

  def duplicate?(record)
    # compare with ==; a single = would assign and overwrite record.client_id
    record.day == self.day &&
      record.client_id == self.client_id
  end
end
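The query above finds clashes with rows that are already in the table. Duplicates inside the submitted batch itself (the case in the question, where client_id is not set yet) can be caught in Ruby before anything reaches the database. A minimal sketch, assuming a hypothetical PreferenceRow struct as a stand-in for the unsaved records and a key that mirrors the validation (block scoped to day and client_id):

```ruby
# Hypothetical stand-in for an unsaved Preference.
PreferenceRow = Struct.new(:day, :block, :client_id)

# Splits a batch into [first_occurrences, repeats], where a repeat is any row
# whose (day, block, client_id) key was already seen earlier in the batch.
def partition_batch_duplicates(rows)
  seen = {}
  rows.partition do |row|
    key = [row.day, row.block, row.client_id]
    first_time = !seen.key?(key)
    seen[key] = true
    first_time
  end
end
```

The second array can then be used to build the user-friendly error message, for example by listing the repeated day/block pairs.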
In my Rails app I have users who can have many payments.
class User < ActiveRecord::Base
has_many :invoices
has_many :payments
def year_ranges
...
end
def quarter_ranges
...
end
def month_ranges
...
end
def revenue_between(range, kind)
payments.sum_within_range(range, kind)
end
end
class Invoice < ActiveRecord::Base
belongs_to :user
has_many :items
has_many :payments
...
end
class Payment < ActiveRecord::Base
belongs_to :user
belongs_to :invoice
def net_amount
invoice.subtotal * percent_of_invoice_total / 100
end
def taxable_amount
invoice.total_tax * percent_of_invoice_total / 100
end
def gross_amount
invoice.total * percent_of_invoice_total / 100
end
def self.chart_data(ranges, unit)
ranges.map do |r| {
:range => range_label(r, unit),
:gross_revenue => sum_within_range(r, :gross),
:taxable_revenue => sum_within_range(r, :taxable),
:net_revenue => sum_within_range(r, :net) }
end
end
def self.sum_within_range(range, kind)
@sum ||= includes(:invoice => :items)
@sum.select { |x| range.cover? x.date }.sum(&:"#{kind}_amount")
end
end
In my dashboard view I am listing the total payments for the ranges depending on the GET parameter that the user picked. The user can pick either years, quarters, or months.
class DashboardController < ApplicationController
  def show
    if %w[year quarter month].include?(params[:by])
      @unit = params[:by]
    else
      @unit = 'year'
    end
    @ranges = @user.send("#{@unit}_ranges")
    @paginated_ranges = @ranges.paginate(:page => params[:page], :per_page => 10)
    @title = "All your payments"
  end
end
The use of the instance variable (@sum) greatly reduced the number of SQL queries here because the database won't get hit for the same queries over and over again.
The problem is, however, that when a user creates, deletes or changes one of his payments, this is not reflected in the @sum instance variable. So how can I reset it? Or is there a better solution to this?
Thanks for any help.
This is incidental to your question, but don't use #select with a block.
What you're doing is loading all payments and then filtering the relation as an array in Ruby. Use Arel to overcome this:
scope :within_range, ->(range){ where date: range }
This will build an SQL BETWEEN statement. Using #sum on the resulting relation will build an SQL SUM() statement, which is probably more efficient than loading all the records.
Instead of storing the association as an instance variable of the class Payment, store it as an instance variable of a user (I know it sounds confusing; I have tried to explain below).
class User < ActiveRecord::Base
  has_many :payments

  def revenue_between(range)
    @payments_with_invoices ||= payments.includes(:invoice => :items).all
    # @payments_with_invoices is an array now, so Payment's class methods cannot be used on it
    @payments_with_invoices.select { |x| range.cover? x.date }.sum(&:total)
  end
end
When you defined @sum in a class method (class methods are denoted by self.), it became an instance variable of the Payment class itself. That means you could potentially access it as Payment.sum. So it has nothing to do with a particular user and his/her payments: @sum is now an attribute of the class Payment, and Rails caches it the same way it caches the method definitions of a class.
Once @sum is initialized, it stays the same, as you noticed, even after a user creates a new payment, or even if a different user logs in! It only changes when the app is restarted.
However, if you define @payments_with_invoices as shown above, it becomes an attribute of a particular instance of User, in other words an instance-level instance variable. That means you could potentially access it as some_user.payments_with_invoices. Since an app can have many users, these instances are not kept in memory across requests, so whenever the user instance is loaded again, its attributes are loaded again too.
So if the user creates more payments, the @payments_with_invoices variable is refreshed, because the user instance is re-initialized on the next request.
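The difference between the two kinds of instance variable can be seen without Rails at all. A minimal plain-Ruby sketch, where Ledger is a hypothetical class used only for illustration:

```ruby
class Ledger
  # Class-level instance variable: it lives on the Ledger class object,
  # so there is exactly one value for the whole process.
  def self.cached_total(values)
    @total ||= values.reduce(0, :+)
  end

  def initialize(values)
    @values = values
  end

  # Instance-level instance variable: memoized separately for each Ledger object.
  def total
    @total ||= @values.reduce(0, :+)
  end
end

Ledger.cached_total([1, 2, 3]) # => 6
Ledger.cached_total([10])      # => 6, the first result is still cached on the class
Ledger.new([10]).total         # => 10, a fresh object starts with no cached value
```

The class-level value survives for as long as the class stays loaded, which is why it outlives requests in a long-running server process.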
Maybe you could do it with observers:
# payment.rb
def self.cached_sum(force = false)
  if @sum.blank? || force
    @sum = includes(:invoice => :items)
  end
  @sum
end

def self.sum_within_range(range)
  @sum = cached_sum
  @sum.select { |x| range.cover? x.date }.sum(&:total)
end

# payment_observer.rb
class PaymentObserver < ActiveRecord::Observer
  # force @sum updating
  def after_save(payment)
    Payment.cached_sum(true)
  end

  def after_destroy(payment)
    Payment.cached_sum(true)
  end
end
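You can find more about observers at http://apidock.com/rails/v3.2.13/ActiveRecord/Observer (from Rails 4.0 on, observers are no longer part of Rails itself and come from the rails-observers gem).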
Well, your @sum is basically a cache of the values you need. Like any cache, you need to invalidate it if something happens to the values involved.
You could use after_save or after_create callbacks to call a function which sets @sum = nil. It may also be useful to save the range your cache is covering and decide on the invalidation by the date of the new or changed payment.
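class Payment < ActiveRecord::Base
  belongs_to :user
  after_save :invalidate_cache

  def self.sum_within_range(range)
    @cached_range = range
    @sum ||= includes(:invoice => :items)
    @sum.select { |x| range.cover? x.date }.sum(&:total)
  end

  # Drops the class-level cache if the given date falls inside the cached range.
  def self.invalidate_cache(date)
    @sum = nil if @cached_range && @cached_range.cover?(date)
  end

  private

  # after_save callbacks run on the instance, so delegate to the class method.
  def invalidate_cache
    self.class.invalidate_cache(date)
  end
end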
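The load-once, invalidate-on-change idea can be tried out in plain Ruby. A minimal sketch, where RangeSumCache, its loader block and the hash-based records are hypothetical stand-ins for the Payment query and rows:

```ruby
require 'date'

class RangeSumCache
  attr_reader :loads # how many times the loader has actually run

  # The block is the expensive load (the database query in the real app).
  def initialize(&loader)
    @loader = loader
    @records = nil
    @loads = 0
  end

  # Sums :amount over the cached records whose :date falls inside the range.
  def sum_within(range)
    @records ||= begin
      @loads += 1
      @loader.call
    end
    @records.select { |r| range.cover?(r[:date]) }.reduce(0) { |acc, r| acc + r[:amount] }
  end

  # Call this from whatever hook runs after a record is saved or destroyed.
  def invalidate!
    @records = nil
  end
end
```

Until invalidate! is called, sum_within keeps answering from the first load, which is exactly the stale behaviour described in the question.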
In my Rails app I have users who can have many invoices, which in turn can have many payments.
Now in the dashboard view I want to summarize all the payments a user has ever received, ordered either by year, quarter, or month. The payments are also subdivided into gross, net, and tax.
user.rb:
class User < ActiveRecord::Base
has_many :invoices
has_many :payments
def years
(first_year..current_year).to_a.reverse
end
def year_ranges
years.map { |y| Date.new(y,1,1)..Date.new(y,-1,-1) }
end
def quarter_ranges
...
end
def month_ranges
...
end
def revenue_between(range, kind)
payments_with_invoice ||= payments.includes(:invoice => :items).all
payments_with_invoice.select { |x| range.cover? x.date }.sum(&:"#{kind}_amount")
end
end
invoice.rb:
class Invoice < ActiveRecord::Base
belongs_to :user
has_many :items
has_many :payments
def total
items.sum(&:total)
end
def subtotal
items.sum(&:subtotal)
end
def total_tax
items.sum(&:total_tax)
end
end
payment.rb:
class Payment < ActiveRecord::Base
belongs_to :user
belongs_to :invoice
def percent_of_invoice_total
(100 / (invoice.total / amount.to_d)).abs.round(2)
end
def net_amount
invoice.subtotal * percent_of_invoice_total / 100
end
def taxable_amount
invoice.total_tax * percent_of_invoice_total / 100
end
def gross_amount
invoice.total * percent_of_invoice_total / 100
end
end
dashboards_controller:
class DashboardsController < ApplicationController
def index
if %w[year quarter month].include?(params[:by])
range = params[:by]
else
range = "year"
end
@ranges = @user.send("#{range}_ranges")
end
end
index.html.erb:
<% @ranges.each do |range| %>
<%= render :partial => 'range', :object => range %>
<% end %>
_range.html.erb:
<%= @user.revenue_between(range, :gross) %>
<%= @user.revenue_between(range, :taxable) %>
<%= @user.revenue_between(range, :net) %>
Now the problem is that this approach works but produces an awful lot of SQL queries as well. In a typical dashboard view I get 100+ SQL queries. Before adding .includes(:invoice) there were even more queries.
I assume one of the major problems is that each invoice's subtotal, total_tax and total aren't stored anywhere in the database but instead calculated with every request.
Can anybody tell me how to speed up things here? I am not too familiar with SQL and the inner workings of ActiveRecord, so that's probably the problem here.
Thanks for any help.
Whenever revenue_between is called, it fetches the payments in the given time range and the associated invoices and items from the db. Since the time ranges have a lot of overlap (month, quarter, year), the same records are fetched over and over again.
I think it is better to fetch all the payments of the user once, then filter and summarize them in Ruby.
To implement, change the revenue_between method as follows:
def revenue_between(range, kind)
  # store all the payments in an instance variable to avoid duplicate queries
  @payments_with_invoice ||= payments.includes(:invoice => :items).all
  @payments_with_invoice.select { |x| range.cover? x.created_at }.sum(&:"#{kind}_amount")
end
This would eager load all the payments along with associated invoices and items.
Also change the invoice summation methods so that they use the eager-loaded items:
class Invoice < ActiveRecord::Base
def total
items.map(&:total).sum
end
def subtotal
items.map(&:subtotal).sum
end
def total_tax
items.map(&:total_tax).sum
end
end
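The "fetch once, then filter and summarize in Ruby" idea, reduced to plain Ruby. PaymentRow and totals_by_range are hypothetical names for this sketch; year_ranges follows the one in the question but takes the first and last year as arguments.

```ruby
require 'date'

# Hypothetical stand-in for a loaded Payment record.
PaymentRow = Struct.new(:date, :gross_amount)

# Newest year first, each as a Jan 1..Dec 31 date range.
def year_ranges(first_year, last_year)
  (first_year..last_year).to_a.reverse.map { |y| Date.new(y, 1, 1)..Date.new(y, -1, -1) }
end

# Takes the already-loaded payments and returns [range, total] pairs.
# No further loading happens here; every range is computed from the same array.
def totals_by_range(payments, ranges)
  ranges.map do |range|
    total = payments.select { |p| range.cover?(p.date) }.map(&:gross_amount).reduce(0, :+)
    [range, total]
  end
end
```

With the payments held in one array, switching between years, quarters and months only changes the list of ranges that is passed in.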
Apart from the memoizing strategy proposed by @tihom, I suggest you have a look at the Bullet gem which, as its description says, helps you kill N+1 queries and unused eager loading.
Most of your data does not need to be real time. You can have a service calculate the stats and store them wherever you want (Redis, cache...), then refresh them every 10 minutes or upon the user's request.
To start with, render your page without the stats and load them with Ajax.
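For the "refresh them every 10 minutes" idea, here is a minimal plain-Ruby sketch of a time-based cache. TimedCache and its injectable clock are hypothetical names for illustration. In a Rails app, Rails.cache.fetch(key, expires_in: 10.minutes) { ... } covers the same ground.

```ruby
class TimedCache
  # ttl_seconds: how long a stored value stays valid.
  # clock: anything that responds to call and returns the current time
  # (injectable so that expiry can be tested without sleeping).
  def initialize(ttl_seconds, clock = -> { Time.now })
    @ttl = ttl_seconds
    @clock = clock
    @entries = {}
  end

  # Returns the cached value for key, running the block only when there is
  # no entry yet or the existing entry is at least ttl_seconds old.
  def fetch(key)
    now = @clock.call
    entry = @entries[key]
    if entry.nil? || now - entry[:stored_at] >= @ttl
      entry = { value: yield, stored_at: now }
      @entries[key] = entry
    end
    entry[:value]
  end
end
```

Something like stats_cache.fetch([user_id, :year]) { expensive_stats } would then recompute the stats at most once per TTL for each key.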
In Rails/ActiveRecord, is there a way to replace one instance with another such that all the relations/foreign keys get resolved?
I could imagine something like this:
# setup
customer1 = Customer.find(1)
customer2 = Customer.find(2)
# this would be cool
customer1.replace_with(customer2)
Supposing customer1 was badly configured and someone had gone and created customer2 without knowing about customer1, it would be nice to be able to quickly switch everything over to customer2.
So, also this would need to update any foreign keys as well
User belongs_to :customer
Website belongs_to :customer
then any Users/Websites with a foreign key customer_id = 1 would automatically get set to 2 by this 'replace_with' method
Does such a thing exist?
[I can imagine a hack involving Customer.reflect_on_all_associations(:has_many) etc]
Cheers,
J
Something like this could work, although there may be a more proper way:
Updated: Corrected a few errors in the associations example.
class MyModel < ActiveRecord::Base
...
# if needed, force logout / expire session in controller beforehand.
def replace_with(another_record)
# handles attributes and belongs_to associations
attribute_hash = another_record.attributes
attribute_hash.delete('id')
self.update_attributes!(attribute_hash)
### Begin association example, not complete.
# generic way of finding model constants
find_model_proc = Proc.new{ |x| x.to_s.singularize.camelize.constantize }
model_constant = find_model_proc.call(self.class.name)
# handle :has_one, :has_many associations
have_ones = model_constant.reflect_on_all_associations(:has_one).find_all{|i| !i.options.include?(:through)}
have_manys = model_constant.reflect_on_all_associations(:has_many).find_all{|i| !i.options.include?(:through)}
update_assoc_proc = Proc.new do |assoc, associated_record, id|
# foreign_key is the Rails 3.1+ name for what older versions call primary_key_name
foreign_key = assoc.foreign_key.to_s
attribs = associated_record.attributes
attribs[foreign_key] = id
associated_record.update_attributes!(attribs)
end
# re-point the records that currently belong to another_record at self;
# self's own associated records already carry self.id
have_ones.each do |assoc|
associated_record = another_record.send(assoc.name)
unless associated_record.nil?
update_assoc_proc.call(assoc, associated_record, self.id)
end
end
have_manys.each do |assoc|
associated_records = another_record.send(assoc.name)
associated_records.each do |associated_record|
update_assoc_proc.call(assoc, associated_record, self.id)
end
end
### End association example, not complete.
# and if desired..
# do not call :destroy if you have any associations set with :dependents => :destroy
another_record.destroy
end
...
end
I've included an example for how you could handle some associations, but overall this can become tricky.