N+1 while enumerating self-referencing records - ruby-on-rails

I'm doing a pretty basic thing: displaying a tree of categories in topological order. ActiveRecord issues an extra query to enumerate each category's children.
class Category < ActiveRecord::Base
attr_accessible :name, :parent_id
belongs_to :parent, :class_name => 'Category'
has_many :children, :class_name => 'Category', :foreign_key => 'parent_id'
def self.in_order
all = Category.includes(:parent, :children).all # Three queries as it should be
root = all.find{|c| c.parent_id == nil}
queue = [root]
result = []
while queue.any?
current = queue.shift
result << current
current.children.each do |child| # SELECT * FROM categories WHERE parent_id = ?
queue << child
end
end
result
end
end
UPD. As far as I understand, what's going on here is that when a category is reached as a child of another category, it's not the same object as the one in the initial list, so it doesn't have its children loaded. Is there a way to implement the desired behavior without resorting to building an extra adjacency list?
UPD2: Here's the manual adjacency list solution. It uses only one query, but I'd really like to use something more idiomatic.
def self.in_order_manual
cache = {}
adj = {}
root = nil
all.each do |c|
cache[c.id] = c
if c.parent_id != nil
(adj[c.parent_id] ||= []) << c.id
else
root = c.id
end
end
queue = [root]
result = []
while queue.any?
current = queue.shift
result << current
(adj[current] || []).each{|child| queue << child}
end
result.map{|id| cache[id]}
end
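A somewhat shorter take on the same adjacency-list idea is to let group_by build the parent-to-children index. The sketch below is plain Ruby: the Struct is only a stand-in for the ActiveRecord model, and rows is assumed to be what a single Category.all would return. Unlike the version above, it also handles more than one root.

```ruby
# Stand-in for the ActiveRecord model; only the columns used here.
Category = Struct.new(:id, :name, :parent_id)

# Breadth-first ordering of a category tree built from one flat list of rows.
def in_order(rows)
  children_of = rows.group_by(&:parent_id) # parent_id => [child rows]
  queue = children_of.fetch(nil, []).dup   # roots have parent_id == nil
  result = []
  until queue.empty?
    current = queue.shift
    result << current
    queue.concat(children_of.fetch(current.id, []))
  end
  result
end

rows = [
  Category.new(1, "root", nil),
  Category.new(2, "a", 1),
  Category.new(3, "b", 1),
  Category.new(4, "a1", 2)
]
ordered_names = in_order(rows).map(&:name)
```

Because the children are looked up in the hash rather than through the association, no further queries would be triggered after the initial load.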

Related

Display Similar Items With Having Distinct Count Rails 5.1

I'm trying to display a list of gins that have a similar minimum number of botanicals on my show page. I feel I'm close, but the current output is not right. It's actually just printing the name of the gin a number of times.
Gin Load (1.6ms) SELECT "gins".* FROM "gins" INNER JOIN
"gins_botanicals" ON "gins_botanicals"."gin_id" = "gins"."id" INNER
JOIN "botanicals" ON "botanicals"."id" =
"gins_botanicals"."botanical_id" WHERE "botanicals"."id" IN (4, 10, 3)
AND ("gins"."id" != $1) GROUP BY gins.id HAVING (COUNT(distinct
botanicals.id) >= 3) [["id", 2]]
I have three models; two resources with a joins table:
gin.rb
class Gin < ApplicationRecord
belongs_to :distillery, inverse_of: :gins
accepts_nested_attributes_for :distillery, reject_if: lambda {|attributes| attributes['name'].blank?}
acts_as_punchable
has_many :gins_botanical
has_many :botanicals, through: :gins_botanical
botanical.rb
class Botanical < ApplicationRecord
has_many :gins_botanical
has_many :gins, through: :gins_botanical
gins_botanical.rb
class GinsBotanical < ApplicationRecord
belongs_to :gin
belongs_to :botanical
gins_controller
def show
@gin = Gin.friendly.find(params[:id])
@gin.punch(request)
@meta_title = meta_title @gin.name
@similiar_gins = Gin.joins(:botanicals).where("botanicals.id" => @gin.botanical_ids).where.not('gins.id' => @gin.id).having("COUNT(distinct botanicals.id) >= 3").group("gins.id")
end
So in @similiar_gins I am trying to count how many botanicals each of the other gins shares with the current @gin, and to return the gins that share at least 3.
And in my view:
show.html.erb
<% @similiar_gins.each do |gin| %>
<%= @gin.name %>
<% end %>
I'm suspecting my where is not correct...
Yes, I have a similar feature, but I implemented it like this:
@gin = Gin.find(params[:id])
if @gin.botanicals.count > 1
@botanicals = @gin.botanical_ids
@gin_ids = GinsBotanical.select('distinct gin_id').where('botanical_id IN (?)', @botanicals).limit(10)
@ids = @gin_ids.map(&:gin_id)
@similiar_gins = Gin.where('id IN (?)', @ids).where.not(id: @gin) # => all similar gins, excluding the current one
end
This is adapted from my own code, where the relation is between categories and jobs. In case it helps, here is my code for showing similar jobs:
def show
@job = Job.find(params[:id])
if @job.categories.count > 1
@category = @job.category_ids
@jobs = JobCategory.select('distinct job_id').where('category_id IN (?)', @category).limit(10)
ids = @jobs.map(&:job_id)
@releted_jobs = Job.where('id IN (?)', ids).where.not(id: @job)
end
end
end
Hope it helps
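To make the intent of the question's GROUP BY / HAVING query concrete, here is a plain-Ruby sketch of the same rule: keep every other gin that shares at least three botanicals with the current one. The gin names and botanical ids are made-up sample data, not taken from the question. It is also worth noting that the view in the question prints the instance variable's name inside the loop rather than the block variable gin, which by itself would explain seeing the same name repeated.

```ruby
# Hypothetical sample data: gin name => botanical ids.
BOTANICALS_BY_GIN = {
  "Gin A" => [3, 4, 10, 12],
  "Gin B" => [3, 4, 10],
  "Gin C" => [3, 4],
  "Gin D" => [4, 10, 3, 7]
}.freeze

# Names of gins sharing at least `minimum` botanicals with `current`.
def similar_gins(current, minimum = 3)
  wanted = BOTANICALS_BY_GIN.fetch(current)
  BOTANICALS_BY_GIN
    .reject { |name, _| name == current }               # where.not(id: current)
    .select { |_, ids| (ids & wanted).size >= minimum } # HAVING COUNT(...) >= 3
    .keys
end

similar = similar_gins("Gin A")
```

Here Gin B and Gin D qualify, while Gin C shares only two botanicals and is dropped.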

Rails association scope by method with aggregate

I'm trying to retrieve association records that are dependent on their association records' attributes. Below are the (abridged) models.
class Holding
belongs_to :user
has_many :transactions
def amount
transactions.reduce(0) { |m, t| t.buy? ? m + t.amount : m - t.amount }
end
class << self
def without_empty
includes(:transactions).select { |h| h.amount.positive? }
end
end
class Transaction
belongs_to :holding
attributes :action, :amount
def buy?
action == ACTION_BUY
end
end
The problem is my without_empty method returns an array, which prevents me from using my pagination.
Is there a way to rewrite Holding#amount and Holding#without_empty to function more efficiently with ActiveRecord/SQL?
Here's what I ended up using:
def amount
transactions.sum("CASE WHEN action = '#{Transaction::ACTION_BUY}' THEN amount ELSE (amount * -1) END")
end
def without_empty
joins(:transactions).group(:id).having("SUM(CASE WHEN transactions.action = '#{Transaction::ACTION_BUY}' THEN transactions.amount ELSE (transactions.amount * -1) END) > 0")
end
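For reference, the signed sum that the CASE expression computes can be written in plain Ruby as follows. This is only an illustration of the arithmetic, with a Struct standing in for the model; the "buy" and "sell" action values and the holding names are assumptions, since the question does not show what ACTION_BUY contains.

```ruby
# Stand-in for the Transaction model; "buy"/"sell" are assumed action values.
Transaction = Struct.new(:action, :amount) do
  def buy?
    action == "buy"
  end
end

# Net holding amount: buys add, everything else subtracts.
def net_amount(transactions)
  transactions.sum { |t| t.buy? ? t.amount : -t.amount }
end

# Hypothetical holdings keyed by name.
holdings = {
  "AAA" => [Transaction.new("buy", 10), Transaction.new("sell", 4)],
  "BBB" => [Transaction.new("buy", 5), Transaction.new("sell", 5)],
  "CCC" => []
}

# Equivalent of the HAVING ... > 0 filter.
non_empty = holdings.select { |_, txs| net_amount(txs).positive? }.keys
```

The SQL version has the advantage that it returns a relation, so pagination keeps working.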

combine keys in array of hashes

I map results of my query to create an array of hashes grouped by organisation_id like so:
results.map do |i|
{
i['organisation_id'] => {
name: capability.name,
tags: capability.tag_list,
organisation_id: i['organisation_id'],
scores: {i['location_id'] => i['score']}
}
}
end
The capability variable is defined outside the map.
The result looks like:
[{1=>{:name=>"cap1", :tags=>["tag A"], :scores=>{26=>4}}}, {1=>{:name=>"cap1", :tags=>["tag A"], :scores=>{12=>5}}}, {2 => {...}}...]
For every organisation_id there is a separate entry in the array. I would like to merge these hashes and combine the scores key like so:
[{1=>{:name=>"cap1", :tags=>["tag A"], :scores=>{26=>4, 12=>5}}}, {2=>{...}}... ]
EDIT
To create the results I use the following AR:
Valuation.joins(:membership)
.where(capability: capability)
.select("valuations.id, valuations.score, valuations.capability_id, valuations.membership_id, memberships.location_id, memberships.organisation_id")
.map(&:serializable_hash)
A Valuation model:
class Valuation < ApplicationRecord
belongs_to :membership
belongs_to :capability
end
A Membership model:
class Membership < ApplicationRecord
belongs_to :organisation
belongs_to :location
has_many :valuations
end
results snippet:
[{"id"=>1, "score"=>4, "capability_id"=>1, "membership_id"=>1, "location_id"=>26, "organisation_id"=>1}, {"id"=>16, "score"=>3, "capability_id"=>1, "membership_id"=>2, "location_id"=>36, "organisation_id"=>1}, {"id"=>31, "score"=>3, "capability_id"=>1, "membership_id"=>3, "location_id"=>26, "organisation_id"=>2}, {"id"=>46, "score"=>6, "capability_id"=>1, "membership_id"=>4, "location_id"=>16, "organisation_id"=>2}...
I'll assume that, for each organisation, the name, tag list and organisation_id remain the same.
your_hash = results.reduce({}) do |h, i|
org_id = i['organisation_id']
h[org_id] ||= {
name: capability.name,
tags: capability.tag_list,
organisation_id: org_id,
scores: {}
}
h[org_id][:scores][i['location_id']] = i['score']
# If the location scores are not strictly exclusive, you can also just +=
h
end
I believe this works, but data is needed to test it.
results.each_with_object({}) do |i,h|
h.update(i['organisation_id'] => {
name: capability.name,
tags: capability.tag_list,
organisation_id: i['organisation_id'],
scores: {i['location_id'] => i['score']}
}) { |_,o,n| o[:scores].update(n[:scores]); o }
end.values
This uses the form of Hash#update (aka merge!) that uses a block to determine the values of keys that are present in both hashes being merged. Please consult the doc for the contents of each of the block variables _, o and n.
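A small self-contained run may make the block form easier to follow. The rows are a trimmed copy of the results snippet from the question, and the name and tags keys are left out to keep it short. The block is called only when a key already exists in the receiver; it receives the key, the old value and the new value, and its return value is what gets stored.

```ruby
results = [
  { "organisation_id" => 1, "location_id" => 26, "score" => 4 },
  { "organisation_id" => 1, "location_id" => 36, "score" => 3 },
  { "organisation_id" => 2, "location_id" => 26, "score" => 3 },
  { "organisation_id" => 2, "location_id" => 16, "score" => 6 }
]

merged = results.each_with_object({}) do |row, acc|
  entry = {
    row["organisation_id"] => { scores: { row["location_id"] => row["score"] } }
  }
  # Called only for organisation ids already present in acc.
  acc.update(entry) do |_key, old, new|
    old[:scores].update(new[:scores]) # fold the new score into the old entry
    old                               # keep the old (now extended) value
  end
end
```

After the loop, merged holds one entry per organisation with all of its location scores combined.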
Assuming that result is your final array of hashes:
result.each_with_object({}) do |e, obj|
k, v = e.flatten
if obj[k]
obj[k][:scores] = obj[k][:scores].merge(v[:scores])
else
obj[k] = v
end
end

Rails ActiveRecord intersect query with has_many association

I have the following models:
class Piece < ActiveRecord::Base
has_many :instrument_pieces
has_many :instruments, through: :instrument_pieces
end
class Instrument < ActiveRecord::Base
has_many :pieces, through: :instrument_pieces
has_many :instrument_pieces
end
class InstrumentPiece < ActiveRecord::Base
belongs_to :instrument
belongs_to :piece
end
And I have the following query:
Piece
.joins(:instrument_pieces)
.where(instrument_pieces: { instrument_id: search_params[:instruments] } )
.find_each(batch_size: 20) do |p|
Where search_params[:instruments] is an array. The problem with this query is that it will retrieve all pieces that have any of the instruments, so if search_params[:instruments] = ["1","3"], the query will return pieces with an instrument association of either 1 or 3 or of both. I'd like the query to only return pieces whose instrument associations include both instruments 1 and 3. I've read through the docs, but I'm still not sure how this can be done...
It seems like what I wanted was an intersection between the two queries, so what I ended up doing was:
queries = []
query = Piece.joins(:instruments)
search_params[:instruments].each do |instrument|
queries << query.where(instruments: {id: instrument})
end
sql_str = ""
queries.each_with_index do |query, i|
sql_str += "#{query.to_sql}"
sql_str += " INTERSECT " if i != queries.length - 1
end
Piece.find_by_sql(sql_str).each do |p|
Very ugly, but ActiveRecord doesn't support INTERSECT yet. Time to wait for ActiveRecord 5, I suppose.
You can use where clause chaining to achieve this. Try:
query = Piece.joins(:instrument_pieces)
search_params[:instruments].each do |instrument|
query = query.where(instrument_pieces: { instrument_id: instrument } )
end
query.find_each(batch_size: 20) do |p|
or another version
query = Piece.joins(:instruments)
search_params[:instruments].each do |instrument|
query = query.where(instruments: { id: instrument })
end
query.find_each(batch_size: 20) do |p|
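One caveat with chaining where on a single join: each condition applies to the same joined row, so asking for instrument 1 AND instrument 3 on one row cannot match. A common alternative to INTERSECT is to group the join rows per piece and keep the pieces whose instrument set covers everything requested; in SQL this is usually written as GROUP BY on the piece id with HAVING COUNT(DISTINCT instrument_id) equal to the number of requested instruments. The plain-Ruby sketch below shows that logic on made-up join-table rows; the data and the method name are assumptions for illustration only.

```ruby
require "set"

# Hypothetical join-table rows: [piece_id, instrument_id].
INSTRUMENT_PIECES = [
  [1, 1], [1, 3],
  [2, 1],
  [3, 3], [3, 1], [3, 5],
  [4, 5]
].freeze

# Ids of pieces associated with every one of the requested instruments.
def pieces_with_all(instrument_ids)
  wanted = instrument_ids.map(&:to_i).to_set
  INSTRUMENT_PIECES
    .group_by(&:first) # piece_id => its join rows
    .select { |_, rows| wanted.subset?(rows.map(&:last).to_set) }
    .keys
end

matching = pieces_with_all(["1", "3"])
```

Pieces 1 and 3 have both instruments; piece 2 has only instrument 1 and piece 4 has neither.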

Ways to simplify and optimize my code?

I've got some code which I would like to optimize.
First, not bad at all, but maybe it can be a bit shorter or faster, mainly the update_result method:
class Round < ActiveRecord::Base
belongs_to :match
has_and_belongs_to_many :banned_champions, :class_name => "Champion", :join_table => "banned_champions_rounds"
belongs_to :clan_blue, :class_name => "Clan", :foreign_key => "clan_blue_id"
belongs_to :clan_purple, :class_name => "Clan", :foreign_key => "clan_purple_id"
belongs_to :winner, :class_name => "Clan", :foreign_key => "winner_id"
after_save {self.update_result}
def update_result
match = self.match
if match.rounds.count > 0
clan1 = match.rounds.first.clan_blue
clan2 = match.rounds.first.clan_purple
results = {clan1=>0, clan2=>0}
for round in match.rounds
round.winner == clan1 ? results[clan1] += 1 : results[clan2] += 1
end
if results[clan1] > results[clan2] then
match.winner = clan1; match.looser = clan2
match.draw_1 = nil; match.draw_2 = nil
elsif results[clan1] < results[clan2] then
match.winner = clan2; match.looser = clan1
match.draw_1 = nil; match.draw_2 = nil
else
match.draw_1 = clan1; match.draw_2 = clan2
match.winner = nil; match.looser = nil
end
match.save
end
end
end
And second, totally bad and slow in seeds.rb:
require 'faker'
champions = [{:name=>"Akali"},
{:name=>"Alistar"},
{:name=>"Amumu"},
{:name=>"Anivia"},
{:name=>"Annie"},
{:name=>"Galio"},
{:name=>"Tryndamere"},
{:name=>"Twisted Fate"},
{:name=>"Twitch"},
{:name=>"Udyr"},
{:name=>"Urgot"},
{:name=>"Veigar"}
]
Champion.create(champions)
10.times do |n|
name = Faker::Company.name
clan = Clan.create(:name=>name)
6.times do |n|
name = Faker::Internet.user_name
clan.players.create(:name=>name)
end
end
for clan in Clan.all do
2.times do
match = Match.create()
c = [clan,Clan.first(:offset => rand(Clan.count))]
3.times do
round = match.rounds.create
round.clan_blue = c[0]
round.clan_purple = c[1]
round.winner = c[0]
round.save!
end
for item in c
for p in item.players.limit(5)
rand_champion = Champion.first(:offset => rand(Champion.count))
match.participations.create!(:player => p, :champion => rand_champion)
end
end
match.save!
end
2.times do
match = Match.create()
c = [clan,Clan.first(:offset => rand(Clan.count))]
3.times do
round = match.rounds.create
round.clan_blue = c[0]
round.clan_purple = c[1]
round.winner = c[1]
round.save!
end
for item in c
for p in item.players.limit(5)
rand_champion = Champion.first(:offset => rand(Champion.count))
match.participations.create!(:player => p, :champion => rand_champion)
end
end
match.save!
end
2.times do
match = Match.create()
c = [clan,Clan.first(:offset => rand(Clan.count))]
2.times do |n|
round = match.rounds.create
round.clan_blue = c[0]
round.clan_purple = c[1]
round.winner = c[n]
round.save!
end
for item in c
for p in item.players.limit(5)
rand_champion = Champion.first(:offset => rand(Champion.count))
match.participations.create!(:player => p, :champion => rand_champion)
end
end
match.save!
end
end
Any chances to optimize them?
Don't underestimate the value of whitespace in cleaning up code readability!
class Round < ActiveRecord::Base
belongs_to :match
belongs_to :clan_blue, :class_name => "Clan", :foreign_key => "clan_blue_id"
belongs_to :clan_purple, :class_name => "Clan", :foreign_key => "clan_purple_id"
belongs_to :winner, :class_name => "Clan", :foreign_key => "winner_id"
has_and_belongs_to_many :banned_champions, :class_name => "Champion", :join_table => "banned_champions_rounds"
after_save { match.update_result }
end
class Match < ActiveRecord::Base
def update_result
return unless rounds.count > 0
clan1, clan2 = rounds.first.clan_blue, rounds.first.clan_purple
clan1_wins = rounds.inject(0) {|total, round| total += round.winner == clan1 ? 1 : 0 }
clan2_wins = rounds.length - clan1_wins
self.winner = self.loser = self.draw_1 = self.draw_2 = nil
if clan1_wins == clan2_wins
self.draw_1, self.draw_2 = clan1, clan2
else
self.winner = clan1_wins > clan2_wins ? clan1 : clan2
self.loser = clan1_wins < clan2_wins ? clan1 : clan2
end
save
end
end
For your seeds, I'd replace your fixtures with a factory pattern, if it's for tests. If you're going to stick with what you have there, though, wrap the whole block in a transaction and it should become orders of magnitude faster.
Well, on your first example, it appears that you are forcing Match behavior into your Round class, which is not consistent with abstract OOP. Your update_result method actually belongs in your Match class. Once you do that, I think the code will clean itself up a bit.
On your second example, it's hard to see what you are trying to do, but it's not surprising that it's so slow. Every single create and save generates a separate database call. At first glance your code generates over a hundred separate database saves. Do you really need all those records? Can you combine some of the saves?
Beyond that, you can cut your database calls in half by using build instead of create, like this:
round = match.rounds.build
round.clan_blue = c[0]
round.clan_purple = c[1]
round.winner = c[0]
round.save!
If you want to save some lines of code, you could replace the above with this syntax:
match.rounds.create(:clan_blue_id => c[0].id, :clan_purple_id => c[1].id, :winner_id => c[0].id)
In your seeds file:
c = [clan,Clan.first(:offset => rand(Clan.count))]
This works, but it looks like you're picking a random number in Ruby. From what I understand, if you can do something in SQL instead of Ruby, it's generally faster. Try this:
c = [clan, Clan.find(:first, :order => 'random()')]
You won't get too many gains since it's only run twice per clan (so 20x total), but there are similar lines like these two
# (runs 60x total)
rand_champion = Champion.first(:offset => rand(Champion.count))
# (runs up to 200x, I think)
c = [clan,Clan.first(:offset => rand(Clan.count))]
In general, you can almost always find something more to optimize in your program. So your time is most efficiently used by starting with the areas that are repeated the most--the most deeply nested loops. I'll leave optimizing the above 2 lines (and any others that may be similar) to you as an exercise. If you're having trouble, just let me know in a comment.
Also, I'm sure you'll get a lot of good suggestions in many of the responses, so I highly highly highly recommend setting up a benchmarker so you can measure the differences. Be sure to run it several times for each version you test, so you can get a good average (programs running in the background could potentially throw off your results).
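As a rough starting point, Ruby's standard Benchmark module is enough for this kind of comparison. The helper name below and the two toy workloads are made up for illustration; in practice the blocks would wrap the seed code variants being compared.

```ruby
require "benchmark"

# Average wall-clock seconds over several runs of the given block.
def average_runtime(runs = 5)
  total = runs.times.sum { Benchmark.realtime { yield } }
  total / runs
end

# Two toy workloads standing in for "version A" and "version B".
with_map  = average_runtime { (1..50_000).map { |n| n * 2 } }
with_each = average_runtime do
  doubled = []
  (1..50_000).each { |n| doubled << n * 2 }
end
```

Comparing the two averages gives a first impression; for database-heavy code the absolute numbers will also depend on the adapter and on what else the machine is doing.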
As far as simplicity, I think readability is pretty important. It won't make your code run any faster, but it can make your debugging faster (and your time is important!). The few things that were giving me trouble were nondescript variables like c and p. I do this too sometimes, but when you have several of these variables in the same scope, I very quickly reach a point where I think "what was that variable for again?". Something like temp_clan instead of c goes a long way.
For readability, I also prefer .each instead of for. That's entirely a personal preference, though.
btw I love League of Legends :)
Edit: (comments won't let me indent code) Upon taking a second look, I realized that this snippet can be optimized further:
for p in item.players.limit(5)
rand_champion = Champion.first(:offset => rand(Champion.count))
match.participations.create!(:player => p, :champion => rand_champion)
end
Replace the per-player Champion.first(:offset => rand(Champion.count)) with a single query before the loop:
rand_champs = Champion.find(:all, :limit => 5, :order => 'random()')
item.players.limit(5).each_with_index do |p, i|
match.participations.create!(:player => p, :champion => rand_champs[i])
end
This will reduce 5 SQL queries into 1. Since it's called 60x, this will reduce your SQL queries from 60 to 12. As an extra plus, you won't get repeated champions on the same team, (or I guess that could be a downside if that was your intention)
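The pairing step itself can be tried out without a database. In this sketch the champion names come from the seed list in the question, the player names are placeholders, and Array#sample plays the role of the random ORDER BY with a LIMIT: it picks distinct elements, so no champion repeats within a team.

```ruby
champions = %w[Akali Alistar Amumu Anivia Annie Galio Tryndamere]
players   = %w[player1 player2 player3 player4 player5] # placeholder names

# One random draw of distinct champions, then pair them up with the players.
picks = players.zip(champions.sample(players.size))

picks.each do |player, champion|
  puts "#{player} plays #{champion}"
end
```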
