I'm trying to cache a lot of data (100,000 rows) that I fetch with an SQL query, but the caching is not working: it takes about 30 seconds to write into the cache and the same amount of time to read it. What am I doing wrong? My config variable is already set to true.
query = "SELECT inscriptions.`id`, banners.`id`, banners.`name`, inscriptions.`registered_at`,
inscriptions.`synched_at`, inscriptions.`state`
FROM inscriptions
JOIN firm_offices
ON inscriptions.`firm_office_1_id` = firm_offices.`id`
JOIN firms
ON firm_offices.`firm_id` = firms.`id`
JOIN banners
ON firms.`banner_id` = banners.`id`
GROUP BY inscriptions.`id`"
result = ActiveRecord::Base.connection.execute(query)
Rails.cache.fetch 'huge-array' do
data = []
result.each do |r|
data.push({ :id => r[0],
:banner_id => r[1],
:banner_name => r[2],
:registered_at => r[3],
:synched_at => r[4],
:state => r[5]})
end
data
end
@data = Rails.cache.read("huge-array")
Move it all inside your fetch block:
@data ||= Rails.cache.fetch 'huge-array' do
query = "SELECT inscriptions.`id`, banners.`id`, banners.`name`, inscriptions.`registered_at`, inscriptions.`synched_at`, inscriptions.`state`
FROM inscriptions
JOIN firm_offices
ON inscriptions.`firm_office_1_id` = firm_offices.`id`
JOIN firms
ON firm_offices.`firm_id` = firms.`id`
JOIN banners
ON firms.`banner_id` = banners.`id`
GROUP BY inscriptions.`id`"
result = ActiveRecord::Base.connection.execute(query)
data = []
result.each do |r|
data.push({ :id => r[0],
:banner_id => r[1],
:banner_name => r[2],
:registered_at => r[3],
:synched_at => r[4],
:state => r[5]})
end
data
end
Notes:
You don't need to actually move all of it into the block, just the expensive parts (e.g., execute(query)).
Your big SQL query looks like it could translate pretty easily into an AR query. You might want to translate it into an AR query, and then use to_sql if that turns out to be more efficient.
There's no need to run the query and the fetch block each time this function is called. Try something like
@data = Rails.cache.read("huge-array")
# Rails.cache.read returns nil on a miss, so use blank? rather than empty?
if @data.blank?
result = ActiveRecord::Base.connection.execute(query)
@data = []
result.each do |r|
@data.push({ :id => r[0],
:banner_id => r[1],
:banner_name => r[2],
:registered_at => r[3],
:synched_at => r[4],
:state => r[5]})
end
Rails.cache.write("huge-array", @data)
end
return @data
This way you only have to do the expensive query + array creation if the data does not already exist in cache.
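The read-then-write pattern behaves much like calling fetch with a block: the block runs only on a cache miss. Here is a minimal sketch of that behaviour, assuming a hypothetical in-memory TinyCache class standing in for Rails.cache (so it runs without Rails) and a lambda standing in for the expensive query:

```ruby
# TinyCache is a hypothetical stand-in for Rails.cache, for illustration only.
class TinyCache
  def initialize
    @store = {}
  end

  # Return the cached value for key; on a miss, run the block once,
  # store its result, and return it.
  def fetch(key)
    return @store[key] if @store.key?(key)
    @store[key] = yield
  end
end

cache = TinyCache.new
query_runs = 0

# Stand-in for the expensive query + array building.
load_rows = lambda do
  query_runs += 1
  [{ :id => 1, :state => "ok" }, { :id => 2, :state => "late" }]
end

first  = cache.fetch("huge-array") { load_rows.call }
second = cache.fetch("huge-array") { load_rows.call }

puts query_runs            # how many times the expensive block actually ran
puts first.equal?(second)  # whether both calls returned the same cached object
```

If the expensive work sits outside the block, as in the question, it runs on every call regardless of whether the cache is warm.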
In our Rails 3.2.13 app (Ruby 2.0.0 + Postgres on Heroku), we are often retrieving a large amount of Order data from an API, and then we need to update or create each order in our database, as well as the associations. A single order creates/updates itself plus approx. 10-15 associated objects, and we are importing up to 500 orders at a time.
The below code works, but the problem is it's not at all efficient in terms of speed. Creating/updating 500 records takes approx. 1 minute and generates 6500+ db queries!
def add_details(shop, shopify_orders)
shopify_orders.each do |shopify_order|
order = Order.where(:order_id => shopify_order.id.to_s, :shop_id => shop.id).first_or_create
order.update_details(order,shopify_order,shop) #This calls update_attributes for the Order
ShippingLine.add_details(order, shopify_order.shipping_lines)
LineItem.add_details(order, shopify_order.line_items)
Taxline.add_details(order, shopify_order.tax_lines)
Fulfillment.add_details(order, shopify_order.fulfillments)
Note.add_details(order, shopify_order.note_attributes)
Discount.add_details(order, shopify_order.discount_codes)
billing_address = shopify_order.billing_address rescue nil
if !billing_address.blank?
BillingAddress.add_details(order, billing_address)
end
shipping_address = shopify_order.shipping_address rescue nil
if !shipping_address.blank?
ShippingAddress.add_details(order, shipping_address)
end
payment_details = shopify_order.payment_details rescue nil
if !payment_details.blank?
PaymentDetail.add_details(order, payment_details)
end
end
end
def update_details(order,shopify_order,shop)
order.update_attributes(
:order_name => shopify_order.name,
:order_created_at => shopify_order.created_at,
:order_updated_at => shopify_order.updated_at,
:status => Order.get_status(shopify_order),
:payment_status => shopify_order.financial_status,
:fulfillment_status => Order.get_fulfillment_status(shopify_order),
:payment_method => shopify_order.processing_method,
:gateway => shopify_order.gateway,
:currency => shopify_order.currency,
:subtotal_price => shopify_order.subtotal_price,
:subtotal_tax => shopify_order.total_tax,
:total_discounts => shopify_order.total_discounts,
:total_line_items_price => shopify_order.total_line_items_price,
:total_price => shopify_order.total_price,
:total_tax => shopify_order.total_tax,
:total_weight => shopify_order.total_weight,
:taxes_included => shopify_order.taxes_included,
:shop_id => shop.id,
:email => shopify_order.email,
:order_note => shopify_order.note
)
end
So as you can see, we are looping through each order, finding out if it exists or not (then either loading the existing Order or creating the new Order), and then calling update_attributes to pass in the details for the Order. After that we create or update each of the associations. Each associated model looks very similar to this:
class << self
def add_details(order, tax_lines)
tax_lines.each do |shopify_tax_line|
taxline = Taxline.find_or_create_by_order_id(:order_id => order.id)
taxline.update_details(shopify_tax_line)
end
end
end
def update_details(tax_line)
self.update_attributes(:price => tax_line.price, :rate => tax_line.rate, :title => tax_line.title)
end
I've looked into the activerecord-import gem but unfortunately it seems to be more geared towards creation of records in bulk and not update as we also require.
What is the best way that this can be improved for performance?
Many many thanks in advance.
UPDATE:
I came up with this slight improvement, which essentially removes the call to update the newly created Orders (one query less per order).
def add_details(shop, shopify_orders)
shopify_orders.each do |shopify_order|
values = {:order_id => shopify_order.id.to_s, :shop_id => shop.id,
:order_name => shopify_order.name,
:order_created_at => shopify_order.created_at,
:order_updated_at => shopify_order.updated_at,
:status => Order.get_status(shopify_order),
:payment_status => shopify_order.financial_status,
:fulfillment_status => Order.get_fulfillment_status(shopify_order),
:payment_method => shopify_order.processing_method,
:gateway => shopify_order.gateway,
:currency => shopify_order.currency,
:subtotal_price => shopify_order.subtotal_price,
:subtotal_tax => shopify_order.total_tax,
:total_discounts => shopify_order.total_discounts,
:total_line_items_price => shopify_order.total_line_items_price,
:total_price => shopify_order.total_price,
:total_tax => shopify_order.total_tax,
:total_weight => shopify_order.total_weight,
:taxes_included => shopify_order.taxes_included,
:email => shopify_order.email,
:order_note => shopify_order.note}
get_order = Order.where(:order_id => shopify_order.id.to_s, :shop_id => shop.id)
if get_order.blank?
order = Order.create(values)
else
order = get_order.first
order.update_attributes(values)
end
ShippingLine.add_details(order, shopify_order.shipping_lines)
LineItem.add_details(order, shopify_order.line_items)
Taxline.add_details(order, shopify_order.tax_lines)
Fulfillment.add_details(order, shopify_order.fulfillments)
Note.add_details(order, shopify_order.note_attributes)
Discount.add_details(order, shopify_order.discount_codes)
billing_address = shopify_order.billing_address rescue nil
if !billing_address.blank?
BillingAddress.add_details(order, billing_address)
end
shipping_address = shopify_order.shipping_address rescue nil
if !shipping_address.blank?
ShippingAddress.add_details(order, shipping_address)
end
payment_details = shopify_order.payment_details rescue nil
if !payment_details.blank?
PaymentDetail.add_details(order, payment_details)
end
end
end
and for the associated objects:
class << self
def add_details(order, tax_lines)
tax_lines.each do |shopify_tax_line|
values = {:order_id => order.id,
:price => shopify_tax_line.price,
:rate => shopify_tax_line.rate,
:title => shopify_tax_line.title}
get_taxline = Taxline.where(:order_id => order.id)
if get_taxline.blank?
taxline = Taxline.create(values)
else
taxline = get_taxline.first
taxline.update_attributes(values)
end
end
end
end
Any better suggestions?
Try wrapping your entire code in a single database transaction. Since you're on Heroku, the back end will be Postgres. With that many update statements, you can probably benefit greatly from transacting them all at once: outside an explicit transaction each statement is committed on its own, whereas inside one the 6,500 statements share a single commit. Depending on the back end, you might have to split the work into smaller chunks, but even transacting 100 at a time (closing and re-opening the transaction between chunks) would greatly improve throughput into Pg.
http://api.rubyonrails.org/classes/ActiveRecord/Transactions/ClassMethods.html
http://www.postgresql.org/docs/9.2/static/sql-set-transaction.html
So before line 2 you'd add something like:
def add_details(shop, shopify_orders)
Order.transaction do
shopify_orders.each do |shopify_order|
And then at the very end of your method add another end:
if !payment_details.blank?
PaymentDetail.add_details(order, payment_details)
end
end # shopify_orders.each
end # Order.transaction
end # method
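The "100 at a time" variant can be sketched with each_slice. FakeConnection below is a hypothetical stand-in for the database so the example runs on its own; in the Rails app the inner block would be wrapped in Order.transaction instead, and the SQL string is only a placeholder:

```ruby
# FakeConnection is a hypothetical stand-in for the database connection.
class FakeConnection
  attr_reader :commits, :statements

  def initialize
    @commits = 0
    @statements = 0
  end

  # Run the block, then count one commit for the whole block.
  def transaction
    yield
    @commits += 1
  end

  # Count each statement that would be sent to the database.
  def execute(_sql)
    @statements += 1
  end
end

db = FakeConnection.new
orders = (1..250).to_a # stand-in for shopify_orders

# One transaction (and therefore one commit) per chunk of up to 100 orders.
orders.each_slice(100) do |batch|
  db.transaction do
    batch.each { |id| db.execute("UPDATE orders SET ... WHERE id = #{id}") }
  end
end

puts db.statements # statements issued
puts db.commits    # commits performed
```

The number of statements is unchanged; what shrinks is the number of commits.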
You can monkey-patch ActiveRecord like this:
class ActiveRecord::Base
#http://stackoverflow.com/questions/15317837/bulk-insert-records-into-active-record-table?lq=1
#https://gist.github.com/jackrg/76ade1724bd816292e4e
# "UPDATE THIS SET <list_of_column_assignments> FROM <table_name> THIS JOIN (VALUES (<csv1>, <csv2>,...) VALS ( <column_names> ) ON <list_of_primary_keys_comparison>"
def self.bulk_update(record_list)
pk = self.primary_key
raise "primary_key not found" unless pk.present?
raise "record_list not an Array of Hashes" unless record_list.is_a?(Array) && record_list.all? {|rec| rec.is_a? Hash }
return nil if record_list.empty?
result = nil
#test if every hash has primary keys, so we can JOIN
record_list.each { |r| raise "Primary Keys '#{self.primary_key.to_s}' not found on record: #{r}" unless hasAllPKs?(r) }
#list of primary keys comparison
pk_comparison_array = []
if (pk).is_a?(Array)
pk.each {|thiskey| pk_comparison_array << "THIS.#{thiskey} = VALS.#{thiskey}" }
else
pk_comparison_array << "THIS.#{pk} = VALS.#{pk}"
end
pk_comparison = pk_comparison_array.join(' AND ')
#SQL
(1..record_list.count).step(1000).each do |start|
key_list, value_list = convert_record_list(record_list[start-1..start+998]) # 1000 records per batch, no overlap
#csv values
csv_vals = value_list.map {|v| "(#{v.join(", ")})" }.join(", ")
#column names
column_names = key_list.join(", ")
#list of columns assignments
columns_assign_array = []
key_list.each {|col|
unless inPK?(col)
columns_assign_array << "THIS.#{col} = VALS.#{col}"
end }
columns_assign = columns_assign_array.join(', ')
sql = "UPDATE THIS SET #{columns_assign} FROM #{self.table_name} THIS JOIN ( VALUES #{csv_vals} ) VALS ( #{column_names} ) ON ( #{pk_comparison} )"
result = self.connection.execute(sql)
return result if result<0
end
return result
end
def self.inPK?(str)
pk = self.primary_key
test = str.to_s
if pk.is_a?(Array)
(pk.include?(test))
else
(pk==test)
end
end
#test if given hash has primary keys included as hash keys and those keys are not empty
def self.hasAllPKs?(hash)
h = hash.stringify_keys
pk = self.primary_key
if pk.is_a?(Array)
(pk.all? {|k| h.key?(k) and h[k].present? })
else
h.key?(pk) and h[pk].present?
end
end
def self.convert_record_list(record_list)
# Build the list of keys
key_list = record_list.map(&:keys).flatten.map(&:to_s).uniq.sort
value_list = record_list.map do |rec|
list = []
key_list.each {|key| list << ActiveRecord::Base.connection.quote(rec[key] || rec[key.to_sym]) }
list
end
# If table has standard timestamps and they're not in the record list then add them to the record list
time = ActiveRecord::Base.connection.quote(Time.now)
for field_name in %w(created_at updated_at)
if self.column_names.include?(field_name) && !(key_list.include?(field_name))
key_list << field_name
value_list.each {|rec| rec << time }
end
end
return [key_list, value_list]
end
end
Then you can generate an array of hashes containing your models' attributes (including their primary keys) and do something like:
ActiveRecord::Base.transaction do
Model.bulk_update [ {attr1: val1, attr2: val2,...}, {attr1: val1, attr2: val2,...}, ... ]
end
It will be a single SQL command per batch of 1,000 records, without Rails callbacks and validations.
For PostgreSQL, there are several issues that the above approach does not address:
You must specify an actual table, not just an alias, in the update target table.
You cannot repeat the target table in the FROM clause. Since you are joining the target table to a VALUES table (hence there is only one table in the FROM clause), you won't be able to use JOIN; you must use WHERE instead.
You don't get the same "free" casts in a VALUES table that you do in a simple "UPDATE" command, so you must cast date/timestamp values as such (#val_cast does this).
class ActiveRecord::Base
def self.update!(record_list)
raise ArgumentError, "record_list not an Array of Hashes" unless record_list.is_a?(Array) && record_list.all? {|rec| rec.is_a? Hash }
return record_list if record_list.empty?
(1..record_list.count).step(1000).each do |start|
field_list, value_list = convert_record_list(record_list[start-1..start+998]) # 1000 records per batch, no overlap
key_field = self.primary_key
non_key_fields = field_list - [%Q["#{self.primary_key}"], %Q["created_at"]]
columns_assign = non_key_fields.map {|field| "#{field} = #{val_cast(field)}"}.join(",")
value_table = value_list.map {|row| "(#{row.join(", ")})" }.join(", ")
sql = "UPDATE #{table_name} AS this SET #{columns_assign} FROM (VALUES #{value_table}) vals (#{field_list.join(", ")}) WHERE this.#{key_field} = vals.#{key_field}"
self.connection.update_sql(sql)
end
return record_list
end
def self.val_cast(field)
field = field.gsub('"', '')
if (column = columns.find{|c| c.name == field }).sql_type =~ /time|date/
"cast (vals.#{field} as #{column.sql_type})"
else
"vals.#{field}"
end
end
def self.convert_record_list(record_list)
# Build the list of fields
field_list = record_list.map(&:keys).flatten.map(&:to_s).uniq.sort
value_list = record_list.map do |rec|
list = []
field_list.each {|field| list << ActiveRecord::Base.connection.quote(rec[field] || rec[field.to_sym]) }
list
end
# If table has standard timestamps and they're not in the record list then add them to the record list
time = ActiveRecord::Base.connection.quote(Time.now)
for field_name in %w(created_at updated_at)
if self.column_names.include?(field_name) && !(field_list.include?(field_name))
field_list << field_name
value_list.each {|rec| rec << time }
end
end
field_list.map! {|field| %Q["#{field}"] }
return [field_list, value_list]
end
end
I'm trying to arrange @plrdet by the values in arr.
When I select this way:
@plrdet = Player.find_all_by_fid(arr)
it returns the rows in table order; I want them in the order of arr.
for example:
Player contains the following attributes: address, age, uniqnum.
and:
arr
is an array of the uniqnum.
arr=[456,123,789]
player=[{NYC,32,123},{BSAS,27,456},{LND,30,789}]
The result that I'm looking for from the "find_all" should be:
player=[{BSAS,27,456},{NYC,32,123},{LND,30,789}]
If I understand the problem I would try something like this:
Hash version
players = []
@plrdet.each do |player|
players << {"address" => player.address, "age" => player.age, "fid" => player.fid}
end
players.inspect
Now the result should be [{"address" => "BSAS", "age" => 27, "fid" => 456},{"address" => "NYC", "age" => 32, "fid" => 123},{"address" => "LND", "age" => 30, "fid" => 789}]
Array version
players = []
@plrdet.each do |player|
players << [player.address, player.age, player.fid]
end
Now the result should be [["BSAS",27,456],["NYC",32,123],["LND",30,789]]
Sort
I think this solution should work, but I don't like it, and there may be a better way to solve your problem:
sorted_players = []
arr.each do |arr_fid|
# append the rows (from the Array version) that contain this fid
sorted_players.concat(players.select { |player| player.include?(arr_fid) })
end
You have two options:
Use order to sort the results with the query
Use sort to sort the results in memory
You may use 1. It will be something like:
@plrdet = Player.where(:fid => arr).order("address")
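Ordering by a column will not, in general, reproduce the order of arr, so the second option (sorting in memory) may be closer to what the question asks. A minimal sketch, assuming a hypothetical Player struct in place of the ActiveRecord model and the sample data from the question:

```ruby
# Player here is a plain Struct standing in for the ActiveRecord model.
Player = Struct.new(:address, :age, :fid)

arr = [456, 123, 789]
players = [
  Player.new("NYC", 32, 123),
  Player.new("BSAS", 27, 456),
  Player.new("LND", 30, 789)
]

# Index the rows by fid once, then walk arr in the wanted order.
by_fid = {}
players.each { |player| by_fid[player.fid] = player }

# compact drops any fid in arr that had no matching row.
ordered = arr.map { |fid| by_fid[fid] }.compact

puts ordered.map(&:address).inspect
```

Building the lookup hash first keeps this a single pass over each collection, rather than searching arr once per player.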
We start an operation by making sure a customer has enough items with which to work. So we begin by collecting all their current items in an array:
@items = SOrder.where(:user_id => current_user.id).order("order")
Then we determine how many items they should have. If someone has a free account, they should have 5 items. If it is a paid account they should have 20 items:
if current_user.paid
should_have = 19 # one less than 20 because of 0 position in the array
else
should_have = 4
end
Then, in case we need to add blank records, we figure out where we should start:
if @items.empty?
start = 0
else
start = @items.length + 1
end
If the start is less than or equal to what someone should have, then we add blank records:
if start <= should_have
value = [start .. should_have].each do |v|
SOrder.create(:user_id => current_user.id, :order => v, :item_id => 0 )
end
@items = SOrder.where(:user_id => current_user.id).order("order") # reload array
end
The records that should be added are not showing up in the database.
Where is the error?
Try
value = (start .. should_have).each do |v|
instead of
value = [start .. should_have].each do |v|
[start .. should_have] will just return an array with a single range element in it. (start .. should_have) will return a range, upon which the each enumerator will work as you expect.
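The difference is easy to see in plain Ruby. In this small sketch the values of start and should_have are made up for illustration:

```ruby
start = 3
should_have = 5

# Square brackets build an Array whose only element is the Range itself.
wrapped = []
[start..should_have].each { |v| wrapped << v }

# Parentheses just group the Range, so each yields every integer in it.
plain = []
(start..should_have).each { |v| plain << v }

puts wrapped.inspect
puts plain.inspect
```

In the original code the block therefore runs once, with v set to the whole range rather than to an integer position.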
The error may come from calling .length on an Arel relation rather than on a loaded record set.
@items = SOrder.where(:user_id => current_user.id).order("order").all
However, since you only need a count for the first query, I'd suggest using .count. If I was writing this I'd do something like:
number_of_items = SOrder.where(:user_id => current_user.id).count
number_of_blank_items_to_add = current_user.allowed_items - number_of_items
if number_of_blank_items_to_add > 0
number_of_blank_items_to_add.times do |num|
SOrder.create(:user_id => current_user.id, :order => (number_of_items + num), :item_id => 0 )
end
end
@str_order = SOrder.where(:user_id => current_user.id).order("order")
In User model:
def allowed_items
if paid
20
else
5
end
end
Better Yet
In User model:
has_many :s_orders, :order => "s_orders.order asc"
def add_extra_blank_orders
number_of_items = s_orders.count
number_of_blank_items_to_add = allowed_items - number_of_items
if number_of_blank_items_to_add > 0
number_of_blank_items_to_add.times do |num|
s_orders.create(:order => (number_of_items + num), :item_id => 0 )
end
end
end
def allowed_items
if paid
20
else
5
end
end
In controller:
current_user.add_extra_blank_orders
@str_order = current_user.s_orders
While I am sure that you have a good reason, I question why blank items need to be in the database at all, and whether an after_create hook could be used here.
To check that your code is entering the loop that creates the records, add puts "entered loop" inside the loop, like this:
if start <= should_have
(start .. should_have).each do |v|
puts "entered loop"
SOrder.create(:user_id => current_user.id, :order => v, :item_id => 0 )
end
@items = SOrder.where(:user_id => current_user.id).order("order") # reload array
end
If "entered loop" is printed, try .create! to make sure all the validations pass (if any of them fails, ActiveRecord raises an error describing the failed validation):
if start <= should_have
(start .. should_have).each do |order|
SOrder.create!(:user_id => current_user.id, :order => order, :item_id => 0 )
end
@str_order = SOrder.where(:user_id => current_user.id).order("order") # reload array
end
I don't see where you're using value and not sure why you're using it.
Can you use this?:
if start <= should_have
(start .. should_have).each do |order|
SOrder.create(:user_id => current_user.id, :order => order, :item_id => 0 )
end
end
@str_order = SOrder.where(:user_id => current_user.id).order("order") # reload
Edit: I moved @str_order outside of your if statement to make sure you'd always be reloading the array; if this is undesired, just switch it back.
I have the below set of queries, but I'm sure this isn't DRY. However, I can't find out how to filter through the deals var instead of querying again for each var. Is it possible?
deals = Deal.all
won = Deal.find( :all, :conditions => ["status = 'won'"] ).count
pending = Deal.find( :all, :conditions => ["status = 'pending'"] ).count
lost = Deal.find( :all, :conditions => ["status = 'lost'"] ).count
Use GROUP BY SQL clause:
Hash[Deal.all(:select => 'status, count(*) as count', :group => 'status').map{|e|
[e.status, e.count]
}]
Edit: I forgot that you already have all the records loaded. In that case, you can get counts per status this way:
Hash[deals.group_by(&:status).map{|k,v| [k,v.count]}]
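The same per-status count over already-loaded records can also be built in one pass with a default-valued hash. A minimal sketch, assuming a hypothetical Deal struct and made-up statuses in place of the real records:

```ruby
# Deal here is a plain Struct standing in for the ActiveRecord model.
Deal = Struct.new(:status)
deals = %w[won pending won lost won].map { |status| Deal.new(status) }

# Hash.new(0) makes missing keys start at zero, so += 1 works directly.
counts = Hash.new(0)
deals.each { |deal| counts[deal.status] += 1 }

won     = counts["won"]
pending = counts["pending"]
lost    = counts["lost"]

puts counts.inspect
```

A status that never occurs simply reads as 0, which avoids nil checks when pulling out won, pending, and lost.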
You can use the following:
Deal.find(:all, :select => 'status, count(id) as deal_count', :group => 'status')
You can use Array#select:
deals = Deal.all
won = deals.select { |deal| deal.status == 'won' }.length
# similar for pending and lost
I think you can use Ruby's inject function for this:
won = deals.inject(0) {|total, deal| deal.status == 'won' ? total + 1 : total }
If your Deal objects are ActiveRecord objects (which is typically the case for models), you can run the count in the database:
won = Deal.count_by_sql("select count(*) from deals where status = 'won'")
Another way to do it would be to write an SQL query that does all the counting for you, grouped by status:
count_by_status = Deal.find_by_sql("select status,count(*) from deals group by status;")
Then you can use the result (an array of Deal objects, each carrying the selected columns as attributes).
I have a table with columns 'id', 'resource_id', 'read_time', 'value' where 'value' is a float
What I am trying to accomplish is to return a list of records such that the 'value' of each record is the sum of all the records at a specific 'read_time' but having differing 'resource_id' values.
I am wondering if there is a clever way (ie not looping through all the entries) to accomplish this. Currently I am implementing something along these lines:
@aggregate_meters = []
@res_one_meters = Meter.find(:all, :conditions => ["resource_id = ?", 1])
@res_one_meters.each do |meter|
read_time = meter.read_time
value = meter.value
if res_two_meter = Meter.find(:first, :conditions => ["resource_id = ? AND read_time = ?", 2, read_time ])
value = value + res_two_meter.value
end
aggregate_meter = Meter.new(:read_time => read_time, :value => value, :resource_id => 3)
@aggregate_meters.push(aggregate_meter)
end
Thank you.
ActiveRecord::Calculations is your friend here: it lets you do exactly what you want with one database call. It returns a hash whose keys are the unique values of the column used in the group.
Here's the code you wrote, rewritten to use sum.
values = Meter.sum(:value, :group => :read_time)
values.each do |read_time, value|
aggregate_meter = Meter.new(:read_time => read_time, :value => value, :resource_id => 3)
@aggregate_meters.push(aggregate_meter)
end
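To make the shape of that grouped hash concrete, here is the same aggregation done in plain Ruby. This is only a sketch: Reading is a hypothetical struct standing in for Meter, and the sample rows are made up.

```ruby
# Reading is a plain Struct standing in for the Meter model.
Reading = Struct.new(:resource_id, :read_time, :value)

readings = [
  Reading.new(1, "10:00", 1.5),
  Reading.new(2, "10:00", 2.0),
  Reading.new(1, "10:15", 4.0)
]

# Sum value per read_time, mirroring a grouped SUM in the database.
totals = Hash.new(0.0)
readings.each { |reading| totals[reading.read_time] += reading.value }

# Build one aggregate record per read_time, as in the loop above.
aggregates = totals.map do |read_time, value|
  Reading.new(3, read_time, value)
end

puts totals.inspect
puts aggregates.length
```

In the real application the database does the summing, so only one row per read_time crosses the wire.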