Optimization of calculating recommendations and work with many-to-many relationship - ruby-on-rails

I have two models User and Brand, and Many-to-Many relationship between them (through UserBrand table). I have about a thousand users, a thousand brands and a hundred favorite brands of each user.
User.all.count # => 1000
Brand.all.count # => 1000
User.find(1).brands # => 100
To find the 5 users whose favorite brands are closest to the current user's, I wrote the following in the User model:
class User < ActiveRecord::Base
has_many :user_brands
has_many :brands, :through => :user_brands
def similar_users
result = {}
User.all.each do |u|
result[u] = shared_brands_with u.brands
end
result.sort{ |a, b| b[1] <=> a[1] }[1..5].map!{ |e| e[0] }
end
def shared_brands_with(brands)
(brands & self.brands).size
end
end
and the following in the users/show view
<h2>Similar users</h2>
<ul>
<% @user.similar_users.each do |user| %>
<li><%= link_to user.name, user %></li>
<% end %>
</ul>
But it takes about 30-60 seconds to see user recommendations in the browser.
So my question is "How can I speed up calculation of recommendations?"
UPD: using
User.includes(:brands).each do |u|
result[u] = shared_brands_with u.brands
end
roughly doubles performance, but even with 50 brands instead of 100, taking 10 seconds to produce recommendations is still very slow.

So, basically, you're pulling the whole users table and, for each user, grabbing that user's brands list (and presumably the brands themselves), then filtering in Ruby. I wouldn't expect it to be fast. :-)
You're going to need to rewrite that logic in SQL to make it fast. Sorry, I'm not fluent enough in Ruby to make sense of the exact criteria, but in essence: fetch the brands through a single query.
It'll be big and ugly, full of joins and in() and possibly group by/having clauses, but it'll be faster than your current approach.
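To illustrate the single-query idea, here is a minimal sketch in plain Ruby. It assumes the join-table rows have already been fetched in one query as [user_id, brand_id] pairs; how you fetch them depends on your schema and Rails version. The method name similar_user_ids and the sample data are hypothetical, not part of the original code.

```ruby
require 'set'

# rows: array of [user_id, brand_id] pairs, as one query over the join table might return them.
# Returns the ids of the `limit` users sharing the most brands with `current_user_id`.
def similar_user_ids(rows, current_user_id, limit = 5)
  my_brands = rows.select { |user_id, _| user_id == current_user_id }
                  .map { |_, brand_id| brand_id }
                  .to_set

  # Count, per other user, how many of their brands the current user also has
  shared = Hash.new(0)
  rows.each do |user_id, brand_id|
    next if user_id == current_user_id
    shared[user_id] += 1 if my_brands.include?(brand_id)
  end

  # Highest shared count first; ties broken by user id for a deterministic result
  shared.sort_by { |user_id, count| [-count, user_id] }
        .first(limit)
        .map { |user_id, _| user_id }
end

rows = [
  [1, 10], [1, 11], [1, 12],
  [2, 10], [2, 11], [2, 12],
  [3, 10], [3, 99],
  [4, 98]
]
p similar_user_ids(rows, 1)
```

Users who share no brands at all never enter the counts hash, so they are left out of the result. The same counting could be pushed into the database with a self-join and GROUP BY, which is what the answer above suggests.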

Related

How to perform addition operation for a column in a loop in rails

I have 2 models: User has_many invoice_details, and InvoiceDetail belongs_to user.
Now I need to add up a column named total_amount while looping over the records.
<% if @user.invoice_details.nil? %>
"NA"
<% else %>
<% @user.invoice_details.each do |mg| %>
<%= mg.total_amount %>
<% end %>
<%end%>
It displays the amounts as 222 333; I want to display the single value 555 (222 + 333).
In the User model you can calculate the total of invoice_details' total_amount:
#app/models/user.rb
class User < ActiveRecord::Base
has_many :invoice_details
def total_invoices_amount
#convert string to integer and sum
self.invoice_details.map{|x| x.total_amount.to_i}.sum
end
end
Then in the view you can call total_invoices_amount
<%= @user.total_invoices_amount %>
It will automatically take care of the nil case
@user.invoice_details.sum(:total_amount)
Fix 1 :
The Ruby way of doing it, but I strongly advise against it:
<% if @user.invoice_details.nil? %>
"NA"
<% else %>
<%= @user.invoice_details.map(&:total_amount).sum %>
<%end%>
Fix 2 :
Do it in your model, where the query lives. Summing with a database query will be much faster.
Put the method below in the User model.
def sum_invoices_amount
self.invoice_details.sum(:total_amount)
end
And use it in the view
Don't use map here. It loads every invoice object and then loops over them to add up the amounts, and you'll feel it as the number of invoices grows. Instead, let the database calculate the sum for you with a single query. Also put the method in the model to keep the logic out of the view.
class User < ActiveRecord::Base
def total_invoices_amount
invoice_details.sum(:total_amount)
end
end
Then in the view you'll just access that method
<%= @user.total_invoices_amount %>
You'll also notice the difference in the SQL query that runs. Assuming the user id is 1,
the ActiveRecord sum will do:
SELECT SUM(`invoice_details`.`total_amount`) AS sum_id FROM `invoice_details` WHERE `invoice_details`.`user_id` = 1
which returns a single Fixnum (zero if there are no records at all).
But the map will do
SELECT `invoice_details`.* FROM `invoice_details` WHERE `invoice_details`.`user_id` = 1
which returns an ActiveRecord::Relation that loads N records, depending on how many invoices the user has.
Update:
OK, I just noticed the field is a string, which is odd for a total amount field. In that case you'll need to tweak the method a little: we can let MySQL (assuming you are using MySQL) cast the string to an integer before doing the sum.
class User < ActiveRecord::Base
def total_invoices_amount
invoice_details.select('sum(cast(total_amount as signed)) as total_amount').first.total_amount
end
end
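As a small sketch of the in-Ruby variant with the "NA" fallback, assuming total_amount is stored as a string as in the question: the Struct stands in for the real model, and the helper name invoices_total_label is hypothetical. A has_many association returns an empty collection rather than nil, so the sketch checks empty? instead of nil?.

```ruby
# Stand-in for the InvoiceDetail model (hypothetical, for illustration only)
InvoiceDetail = Struct.new(:total_amount)

# Sums amounts stored as strings; returns "NA" when there are no invoices.
def invoices_total_label(invoice_details)
  return "NA" if invoice_details.empty?
  invoice_details.sum { |detail| detail.total_amount.to_i }.to_s
end

puts invoices_total_label([InvoiceDetail.new("222"), InvoiceDetail.new("333")])
puts invoices_total_label([])
```

With real records, the database-side sum shown in the answers above remains the better choice once the number of invoices grows.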

ActiveRecord optimization - best way to query all at once?

I am trying to reduce the number of queries using ActiveRecord 3.0.9. I generated about 200K 'dummy' customers and 500K orders.
Here are the models:
class Customer < ActiveRecord::Base
has_many :orders
end
class Order < ActiveRecord::Base
belongs_to :customer
has_many :products
end
class Product < ActiveRecord::Base
belongs_to :order
end
When you use this code in the controller:
@customers = Customer.where(:active => true).paginate(:page => params[:page], :per_page => 100)
# SELECT * FROM customers ...
and use this in the view (I removed some HAML markup to make it easier to read):
@order = @customers.each do |customer|
customer.orders.each do |order| # SELECT * FROM orders ...
%td= order.products.count # SELECT COUNT(*) FROM products ...
%td= order.products.sum(:amount) # SELECT SUM(amount) FROM products ...
end
end
The page renders a table with 100 rows per page. The problem is that it is slow to load because it fires about 3-5 queries per customer's orders; that's about 300 queries to load the page.
Is there an alternative way to reduce the number of queries and load the page faster?
Notes:
1) I have attempted to use includes(:orders), but it pulled in more than 200,000 order_ids, which is a problem.
2) The tables are already indexed.
If you're only using COUNT and SUM(amount) then what you really need is to retrieve only that information and not the orders themselves. This is easily done with SQL:
SELECT orders.customer_id, orders.id AS order_id, COUNT(products.id) AS order_count, SUM(products.amount) AS order_total FROM orders LEFT JOIN products ON orders.id = products.order_id GROUP BY orders.customer_id, orders.id
You can wrap this in a method that returns a nice, orderly hash by re-mapping the SQL results into a structure that fits your requirements:
class Order < ActiveRecord::Base
def self.totals
query = "..." # Query from above
result = { }
self.connection.select_rows(query).each do |row|
# Build out an array for each unique customer_id in the results
customer_set = result[row[0].to_i] ||= [ ]
# Add a hash representing each order to this customer order set
customer_set << { :order_id => row[1].to_i, :count => row[2].to_i, :total => row[3].to_i }
end
result
end
end
This means you can fetch all order counts and totals in a single pass. If you have an index on customer_id, which is imperative in this case, then the query will usually be really fast even for large numbers of rows.
You can save the results of this method into a variable such as @order_totals and reference it when rendering your table:
- @order = @customers.each do |customer|
- @order_totals[customer.id].each do |order|
%td= order[:count]
%td= order[:total]
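The regrouping step can be tried without a database. The sketch below assumes each row arrives as [customer_id, order_id, count, total] with string values, which is an assumption about what the query returns; group_totals is a hypothetical name. Looking up with fetch and an empty-array default avoids a nil error for customers who have no orders.

```ruby
# rows: [customer_id, order_id, product_count, product_total] per order,
# with values as strings (or nil for an empty SUM)
def group_totals(rows)
  rows.each_with_object({}) do |(customer_id, order_id, count, total), totals|
    (totals[customer_id.to_i] ||= []) << {
      :order_id => order_id.to_i, :count => count.to_i, :total => total.to_i
    }
  end
end

rows = [["1", "10", "2", "30"], ["1", "11", "1", "5"], ["2", "12", "0", nil]]
order_totals = group_totals(rows)

p order_totals.fetch(1, [])  # both orders for customer 1
p order_totals.fetch(3, [])  # customer with no orders: empty array, not nil
```

In the view, iterating over @order_totals.fetch(customer.id, []) keeps the table rendering even when a customer is missing from the hash.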
You can try something like this (yes, it looks ugly, but you want performance):
orders = Order.find_by_sql([<<-EOD, customer.id])
SELECT os.id, os.name, COUNT(ps.amount) AS count, SUM(ps.amount) AS amount
FROM orders os
JOIN products ps ON ps.order_id = os.id
WHERE os.customer_id = ? GROUP BY os.id, os.name
EOD
%td= orders.name
%td= orders.count
%td= orders.amount
Added: It is probably better to keep cached count and amount columns on orders, but you will have to maintain them (count can be a counter cache, but I doubt there is a ready-made recipe for amount).
You can join the tables with Arel (I prefer to avoid writing raw SQL when possible). I believe that for your example you would do something like:
Customer.joins(:orders => :products).select("customers.id, customers.name, count(products.id) as count, sum(products.amount) as total_amount")
The first method--
Customer.joins(:orders => :products)
--pulls in the nested association in one statement. Then the second part--
.select("customers.id, customers.name, count(products.id) as count, sum(products.amount) as total_amount")
--specifies exactly what columns you want back.
Because count and sum are aggregates, you would also need to chain .group("customers.id, customers.name") to get one row per customer.
Chain those and I believe you'll get a list of Customer instances populated only with what you've specified in the select method. You have to be careful though, because you now have in hand read-only objects that are possibly in an invalid state.
As with all the Arel methods what you get from those methods is an ActiveRecord::Relation instance. It's only when you start to access that data that it goes out and executes the SQL.
I have some basic nervousness that my syntax is incorrect but I'm confident that this can be done w/o relying on executing raw SQL.

Order Users with most Songs Desc

Hopefully a simple question. I have several models, two of them, :users and :songs, interact to get data from the database.
A User has_many :songs.
I'm trying to find the users with the most songs in the USER INDEX action, i.e list the 10 users with the most songs, user with the most songs at the top, descending.
So far I have this in the index action of users_controller:
@users = User.all
And so far all I can do in the view, users/index:
<% @users.each do |user| %>
<%= user.name %><%= user.songs.count %>
<% end %>
Works, it counts the songs but how do I order by users with most songs?
I have been looking at sort_by inside the block and I guess it could be done by listing all songs and grouping them by the user, but I feel that is not efficient enough. -as you can see I'm not an advanced developer.
Please tell me what you guys think and the answer to my solution if possible. The thing that is confusing me is that I am ordering a list generating from a table by data generated by another table. Must be simple but I haven't done it before so I can't get my head around it.
Thanks in Advance.
You should use something like this:
User.includes(:songs).sort_by { |user| -user.songs.size }.first(10)
This preloads the songs for all users up front (one query for users, one for songs), so it should be more efficient than, for example,
User.all.sort_by { |user| -user.songs.size }
which would perform a query for each user.
How about
User.all(:limit => 10, :order => "(select count(user_id) from songs where user_id = users.id) DESC")
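The count-then-order logic behind these answers can be sketched in plain Ruby. This assumes you have one user_id per song row; the method name top_users_by_song_count is hypothetical. In a real app the grouping is better left to the database, as in the subquery above.

```ruby
# song_user_ids: one entry per song, holding the owning user's id.
# Returns [user_id, song_count] pairs, most songs first.
def top_users_by_song_count(song_user_ids, limit = 10)
  counts = Hash.new(0)
  song_user_ids.each { |user_id| counts[user_id] += 1 }
  # Highest count first; ties broken by user id for a deterministic order
  counts.sort_by { |user_id, count| [-count, user_id] }.first(limit)
end

p top_users_by_song_count([3, 1, 3, 2, 3, 1], 2)
```

Users with no songs never appear in the input, so they are absent from the result; a LEFT JOIN would be needed on the SQL side if they should be listed with a count of zero.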

How do I calculate the most popular combination of a order lines? (or any similar order/order lines db arrangement)

I'm using Ruby on Rails. I have a couple of models which fit the normal order/order lines arrangement, i.e.
class Order
has_many :order_lines
end
class OrderLine
belongs_to :order
belongs_to :product
end
class Product
has_many :order_lines
end
(greatly simplified from my real model!)
It's fairly straightforward to work out the most popular individual products via order lines, but what magical ruby-fu could I use to calculate the most popular combination(s) of products ordered?
Cheers,
Graeme
My suggestion is to create an array a of Product.id numbers for each order and then do the equivalent of
h = Hash.new(0)
# for each a
h[a.sort.hash] += 1
You will naturally need to consider the scale of your operation and how much you are willing to approximate the results.
External Solution
Create a "Combination" model and index the table by the hash, then each order could increment a counter field. Another field would record exactly which combination that hash value referred to.
In-memory Solution
Look at the last 100 orders and recompute the order popularity in memory when you need it. Hash#sort will give you a sorted list of popularity hashes. You could either make a composite object that remembered what order combination was being counted, or just scan the original data looking for the hash value.
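A minimal sketch of the in-memory counting, assuming each order has already been reduced to an array of product ids. Unlike the snippet above, it uses the sorted array itself as the hash key instead of its .hash value, which keeps the combination readable and sidesteps hash collisions; the name combination_counts is hypothetical.

```ruby
# orders: one array of product ids per order.
# Returns [combination, count] pairs, most frequent combination first.
def combination_counts(orders)
  counts = Hash.new(0)
  # uniq + sort normalises each order so [2, 1] and [1, 2, 2] count as the same combination
  orders.each { |product_ids| counts[product_ids.uniq.sort] += 1 }
  counts.sort_by { |combo, count| [-count, combo] }
end

orders = [[2, 1], [1, 2], [3], [1, 2, 3], [3]]
combination_counts(orders).each do |combo, count|
  puts "#{combo.inspect} ordered #{count} time(s)"
end
```

Dropping uniq would treat repeated products within one order as a distinct combination, which may or may not be what you want.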
Thanks for the tip digitalross. I followed the external solution idea and did the following. It varies slightly from the suggestion as it keeps a record of individual order_combos, rather than storing a counter so it's possible to query by date as well e.g. most popular top 10 orders in the last week.
I created a method in my order which converts the list of order items to a comma separated string.
def to_s
order_lines.map { |ol| ol.product_id }.sort.join(",")
end
I then added a filter so the combo is created every time an order is placed.
after_save :create_order_combo
def create_order_combo
oc = OrderCombo.create(:user => user, :combo => self.to_s)
end
And finally my OrderCombo class looks something like below. I've also included a cached version of the method.
class OrderCombo
belongs_to :user
scope :by_user, lambda{ |user| where(:user_id => user.id) }
def self.top_n_orders_by_user(user,count=10)
OrderCombo.by_user(user).count(:group => :combo).sort { |a,b| a[1] <=> b[1] }.reverse[0..count-1]
end
def self.cached_top_orders_by_user(user,count=10)
Rails.cache.fetch("order_combo_#{user.id.to_s}_#{count.to_s}", :expires_in => 10.minutes) { OrderCombo.top_n_orders_by_user(user, count) }
end
end
It's not perfect as it doesn't take into account increased popularity when someone orders more of one item in an order.

ActiveRecord and SELECT AS SQL statements

I am developing in Rails an app where I would like to rank a list of users based on their current points. The table looks like this: user_id:string, points:integer.
Since I can't figure out how to do this "The Rails Way", I've written the following SQL code:
self.find_by_sql ['SELECT t1.user_id, t1.points, COUNT(t2.points) as user_rank FROM registrations as t1, registrations as t2 WHERE t1.points <= t2.points OR (t1.points = t2.points AND t1.user_id = t2.user_id) GROUP BY t1.user_id, t1.points ORDER BY t1.points DESC, t1.user_id DESC']
The thing is this: the only way to access the aliased column "user_rank" is by doing ranking[0].user_rank, which gives me a lot of headaches if I want to easily display the resulting table.
Is there a better option?
How about:
@ranked_users = User.all :order => 'users.points DESC'
then in your view you can say
<% @ranked_users.each_with_index do |user, index| %>
<%= "User ##{index + 1}, #{user.name} with #{user.points} points" %>
<% end %>
if for some reason you need to keep that numeric index in the database, you'll need to add an after_save callback to update the full list of users whenever the # of points anyone has changes. You might look into using the acts_as_list plugin to help out with that, or that might be total overkill.
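If ties should share a rank, the index alone is not enough. Here is a small sketch in plain Ruby, assuming objects that respond to name and points; the UserRow struct and the ranked method are hypothetical stand-ins, not part of the original answer.

```ruby
# Stand-in for a user record (hypothetical, for illustration only)
UserRow = Struct.new(:name, :points)

# Returns [rank, user] pairs, highest points first; equal points share a rank.
def ranked(users)
  sorted = users.sort_by { |user| -user.points }
  rank = 0
  previous_points = nil
  sorted.each_with_index.map do |user, index|
    # Only advance the rank when the points differ from the previous user
    rank = index + 1 unless user.points == previous_points
    previous_points = user.points
    [rank, user]
  end
end

users = [UserRow.new("ann", 50), UserRow.new("bob", 70), UserRow.new("cy", 50)]
ranked(users).each do |rank, user|
  puts "User ##{rank}, #{user.name} with #{user.points} points"
end
```

This uses "competition" ranking: with two users tied at rank 2, the next distinct score would get rank 4.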
Try adding a rank method to your model.
class User < ActiveRecord::Base
def rank
#determine rank based on self.points (switch statement returning a rank name?)
end
end
Then you can access it with @user.rank.
What if you did:
SELECT t1.user_id, COUNT(t1.points)
FROM registrations t1
GROUP BY t1.user_id
ORDER BY COUNT(t1.points) DESC
If you want to get all rails-y, then do
cool_users = self.find_by_sql ['(sql above)']
cool_users.each do |cool_user|
puts "#{cool_user[0]} scores #{cool_user[1]}"
end
