Include scalar on active record relation? - ruby-on-rails

Starting with two tables
class Single
has_many :manies
def total_value
manies.sum(:value)
end
end
class Many
belongs_to :single
# also has an integer 'value'
end
singles = Single.all
singles.each {|s| s.total_value } # Makes a new sql call for each single
I need to list many singles, and for each single I need to total up all of the values for that single. If I just call total_value it ends up with N+1 SQL calls.
It is pretty trivial to do this in a single raw SQL query, but I can't figure out how to tell Active Record to do it.
Grouping kind of works:
Many.group(:single).sum(:value)
=> {#<Single id: "d80b4132-7ef1-4fe1-a9e0-7da89c00d295">=>0.171322e4}
But it returns a hash, not an ActiveRecord relation. I'm not sure how that will behave once I have thousands of singles, but it doesn't seem good.
Plus, I now want to add a second scalar on value2, which is still fairly trivial in SQL.
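For what it's worth, the hash is workable if it is keyed by the foreign key and consulted per single. Here is a minimal plain-Ruby sketch of that lookup. SingleRow, ManyRow and totals_by_single are made-up stand-ins, and the in-memory grouping only mimics the shape of a grouped SUM result such as the one `Many.group(:single_id).sum(:value)` returns; it is not Active Record code.

```ruby
# Hypothetical stand-ins for the Single and Many models (no database involved).
SingleRow = Struct.new(:id)
ManyRow   = Struct.new(:single_id, :value, :value2)

# Build { single_id => [sum of value, sum of value2] } in one pass,
# the in-memory analogue of one grouped SUM query.
def totals_by_single(manies)
  manies.each_with_object(Hash.new { |h, k| h[k] = [0, 0] }) do |m, totals|
    totals[m.single_id][0] += m.value
    totals[m.single_id][1] += m.value2
  end
end

singles = [SingleRow.new(1), SingleRow.new(2), SingleRow.new(3)]
manies  = [ManyRow.new(1, 10, 1), ManyRow.new(1, 5, 2), ManyRow.new(2, 7, 3)]

totals = totals_by_single(manies)
singles.each do |s|
  # Singles without any manies fall back to zero totals.
  value, value2 = totals.fetch(s.id, [0, 0])
  puts "single #{s.id}: value=#{value} value2=#{value2}"
end
```

The point is only that one grouped result can serve every single in the list, so the per-record query disappears.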
What I would prefer to do is use includes, but I believe it only works on associations, and I can't figure out how to make has_one work on a scalar.
For example, something like this:
class Single
has_many :manies
has_one :total_value, -> { many.sum(:value) }
has_one :total_value2, -> { many.sum(:value2) }
end
class Many
belongs_to :single
# also has an integer 'value'
end
singles = Single.all.includes(:total_value, :total_value2)
singles.each {|s| [s.total_value, s.total_value2] } # already cached, does not make SQL call

Related

Are .select and/or .where responsible for causing N+1 queries in Rails?

I have two values here, distinct_question_ids and @correct_on_first_attempt. The goal is to show a user how many distinct multiple choice questions they have answered correctly.
The second one will let me know how many of these distinct MCQs were answered correctly on the first attempt. (A user can attempt an MCQ many times.)
Now, when a user answers thousands of questions and has thousands of user answers, the page that shows their performance takes 30 seconds to a minute to load. I believe it's due to the .select method, but I don't know how to replace .select without using .each, which loops in the same way.
Is there any method that doesn't cause N+1?
distinct_question_ids = @user.user_answers.includes(:multiple_choice_question).
  where(is_correct_answer: true).
  distinct.pluck(:multiple_choice_question_id)
@correct_on_first_attempt = distinct_question_ids.select { |qid|
  @user.user_answers.
    where(multiple_choice_question_id: qid).first.is_correct_answer
}.count
.pluck returns an Array of values, not an ActiveRecord::Relation.
So when you do distinct_question_ids.select you're not calling ActiveRecord's select, but Array's select. Within that select, you're issuing a fresh new query against @user for every id you just plucked, including ones that get rejected in the select.
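To make the alternative concrete, here is a rough plain-Ruby sketch of counting first-attempt successes from rows that were loaded once, instead of querying per question id. Answer and first_attempt_correct_count are hypothetical names, and the sketch assumes the lowest id marks the earliest attempt, which may not hold in your schema.

```ruby
# Hypothetical stand-in for one user_answers row.
Answer = Struct.new(:id, :question_id, :correct)

# Count the questions whose earliest answer (lowest id, by assumption) was correct.
def first_attempt_correct_count(answers)
  answers
    .group_by(&:question_id)
    .count { |_question_id, attempts| attempts.min_by(&:id).correct }
end

answers = [
  Answer.new(1, 10, false), # question 10: wrong first, right later
  Answer.new(2, 10, true),
  Answer.new(3, 11, true),  # question 11: right first time
  Answer.new(4, 12, true),  # question 12: right first time, wrong later
  Answer.new(5, 12, false)
]

puts first_attempt_correct_count(answers) # prints 2
```

All the looping here happens in memory over a single result set, so the number of queries no longer grows with the number of questions.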
You could create a query named distinct_questions that returns a relation (no pluck!), and then build correct_on_first_attempt off of that, and I think you'll avoid the N+1 queries.
Something along these lines:
class UserAnswer < ActiveRecord::Base
scope :distinct_correct, -> { includes(:multiple_choice_question)
.where(is_correct_answer: true).distinct }
scope :first_attempt_correct, -> { distinct_correct
.first.is_correct_answer }
end
class User < ActiveRecord::Base
def good_guess_count
@correct_on_first_attempt = user_answers.distinct_correct.first_attempt_correct.count
end
end
You'll need to ensure that .first is actually getting their first attempt, probably by sorting by id or created_at.
As an aside, if you track the attempt number explicitly in UserAnswer, you can really tighten this up:
class UserAnswer < ActiveRecord::Base
scope :correct, -> { where(is_correct_answer: true) }
scope :first_attempt, -> { where(attempt: 1) }
end
class User < ActiveRecord::Base
def lucky_guess_count
@correct_on_first_attempt = user_answers.includes(:multiple_choice_question)
.correct.first_attempt.count
end
end
If you don't have an attempt number in your schema, you could .order and .group to get something similar. But...it seems that some of your project requirements depend on that sequence number, so I'd recommend adding it if you don't have it already.
P.S. For fighting N+1 queries, try the bullet gem. It is on point.

Rails: Return Records Where Array of IDs for Associated Model All Exist

So I have a model of goals and for each goal there are skaters on the ice when it is scored. I want to return the goals where specific combinations of skaters were on for a goal. The relationship looks like so:
class Goal < ApplicationRecord
has_many :on_ice_skaters
end
class OnIceSkater < ApplicationRecord
belongs_to :goal
end
I've been trying to do it via a .joins then a .where but it seems to return goals where ANY of the players in the array were present and not when ALL were present (each OnIceSkater record has a player_id):
player_ids = [6382,5635]
Goal.joins(:on_ice_skaters).where('on_ice_skaters.player_id' => player_ids)
Was wondering if there was a way to turn the above statement into an AND statement functionally (eg. Find Goals that have OnIceSkaters with player_id 6382 AND 5635)?
If you write two .where() clauses in a row, ActiveRecord ANDs them together. So try this:
player_ids = [6382,5635]
goals = Goal.joins(:on_ice_skaters)
player_ids.each do |player_id|
goals = goals.where(on_ice_skaters: { player_id: player_id })
end
I took the liberty of upgrading your string to a hash. It might be more accurate, and it certainly looks more Railsey.
I'm not sure why Phlip's answer does not work (as the OP said in a comment). If you can't make it work, here is a more complicated and less efficient approach (it makes player_ids.size + 1 queries):
player_ids = [6382,5635]
# All goal ids for player 6382 (first player in the array)
goal_ids = Goal.joins(:on_ice_skaters).where(on_ice_skaters: { player_id: player_ids[0] }).pluck(:id).uniq
player_ids.drop(1).each do |player_id|
# Keep only goals in common for each other player
goal_ids = goal_ids & Goal.joins(:on_ice_skaters).where(on_ice_skaters: { player_id: player_id }).pluck(:id).uniq
end
# Take these ids.
goals = Goal.where(id: goal_ids)
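A likely reason the chained .where version returns nothing is that every row produced by the join carries a single player_id, so no one row can equal both 6382 and 5635. The test has to happen per goal, across that goal's rows. Below is a small plain-Ruby sketch of that per-goal check. Skater and goal_ids_with_all_players are hypothetical names, not part of the models above. On the database side the same idea is usually written as a GROUP BY on the goal with a HAVING clause that compares the count of distinct matching players to player_ids.size, though I have not run that against this schema.

```ruby
require "set"

# Hypothetical stand-in for one on_ice_skaters row.
Skater = Struct.new(:goal_id, :player_id)

# Return the goal ids whose skaters include every requested player.
def goal_ids_with_all_players(skaters, player_ids)
  wanted = player_ids.to_set
  skaters
    .group_by(&:goal_id)
    .select { |_goal_id, rows| wanted.subset?(rows.map(&:player_id).to_set) }
    .keys
end

rows = [
  Skater.new(1, 6382), Skater.new(1, 5635), Skater.new(1, 7),
  Skater.new(2, 6382),
  Skater.new(3, 5635)
]

p goal_ids_with_all_players(rows, [6382, 5635]) # => [1]
```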

Rails query a has_many :through conditionally with multiple ids

I'm trying to build a filtering system for a website that has locations and features through a LocationFeature model. Basically what it should do is give me all the locations based on a combination of feature ids.
So for example if I call the method:
Location.find_by_features(1,3,4)
It should only return the locations that have all of the selected features. So if a location has the feature_ids [1, 3, 5] it should not get returned, but if it had [1, 3, 4, 5] it should. However, currently it is giving me locations that have any of them. So in this example it returns both, because some of the feature_ids are present in each of them.
Here are my models:
class Location < ActiveRecord::Base
has_many :location_features, dependent: :destroy
has_many :features, through: :location_features
def self.find_by_features(*ids)
includes(:features).where(features: {id: ids})
end
end
class LocationFeature < ActiveRecord::Base
belongs_to :location
belongs_to :feature
end
class Feature < ActiveRecord::Base
has_many :location_features, dependent: :destroy
has_many :locations, through: :location_features
end
Obviously this code isn't working the way I want it to and I just can't get my head around it. I've also tried things such as:
Location.includes(:features).where('features.id = 5 AND features.id = 9').references(:features)
but it just returns nothing. Using OR instead of AND gives me either again. I also tried:
Location.includes(:features).where(features: {id: 9}, features: {id: 1})
but this just gives me all the locations with the feature_id of 1.
What would be the best way to query for a location matching all the requested features?
When you do an include it makes a "pseudo-table" in memory which has all the combinations of table A and table B, in this case joined on the foreign key. (In this case there's also a join table involved, location_features, which complicates things.)
There won't be any rows in this table which satisfy the condition features.id = 9 AND features.id = 1. Each row will only have a single features.id value.
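A tiny plain-Ruby illustration of that point, using made-up rows in place of the joined result (JoinedRow, rows_with_both and locations_with_both are hypothetical names): the row-level AND can never match, while a check made per location across its rows can.

```ruby
# Hypothetical stand-in for one row of locations joined to location_features.
JoinedRow = Struct.new(:location_id, :feature_id)

# Row-level AND: asks one row to hold two feature ids at once, so nothing matches.
def rows_with_both(rows, a, b)
  rows.select { |r| r.feature_id == a && r.feature_id == b }
end

# Group-level check: collect each location's feature ids first, then test them.
def locations_with_both(rows, a, b)
  rows.group_by(&:location_id)
      .select { |_id, rs| ([a, b] - rs.map(&:feature_id)).empty? }
      .keys
end

rows = [JoinedRow.new(1, 1), JoinedRow.new(1, 9), JoinedRow.new(2, 9)]
p rows_with_both(rows, 1, 9)      # => []
p locations_with_both(rows, 1, 9) # => [1]
```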
What I would do for this is forget about the features table: you only need to look in the join table, location_features, to test for the presence of specific feature_id values. We need a query which will compare feature_id and location_id from this table.
One way is to get the features, then get a collection of arrays of associated location_ids (which just queries the join table), then see which location ids are in all of the arrays (I've renamed your method to be more descriptive):
#in Location
def self.having_all_feature_ids(*ids)
location_ids = Feature.find_all_by_id(ids).map(&:location_ids).inject{|a,b| a & b}
self.find(location_ids)
end
Note 1: the asterisk in *ids in the params means that it will convert a list of arguments (including a single argument, which is like a "list of one") into a single array.
Note 2: inject is a handy device. It says "do this code between the first and second elements in the array, then between the result of this and the third element, then the result of this and the fourth element, etc., till you get to the end". In this case the code I'm running between the two elements in each pair (a and b) is "&" which, when dealing with arrays, is the "set intersection operator": it returns only the elements which are in both arrays. By the time you've gone through the list of arrays doing this, only elements which are in ALL arrays will have survived. These are the ids of locations which are associated with ALL of the given features.
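Both notes can be tried out in isolation with plain arrays. In this sketch common_ids is a made-up helper name, and the `|| []` fallback for an empty argument list is my own addition rather than something the method above does.

```ruby
# The splat (*lists) gathers any number of arguments into one array of arrays.
# inject then folds them pairwise with &, the array intersection operator.
def common_ids(*lists)
  lists.inject { |a, b| a & b } || []
end

p common_ids([1, 3, 4, 5], [1, 4, 9], [4, 1, 7]) # => [1, 4]
p common_ids([1, 2])                             # => [1, 2]
p common_ids                                     # => []
```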
EDIT: i'm sure there's a way to do this with a single sql query - possibly using group_concat - which someone else will probably post shortly :)
I would do this as a set of subqueries. You can actually also do it as a scope if you wish.
scope :has_all_features, ->(*feature_ids) {
where( ( ["locations.id in (select location_id from location_features where feature_id=?)"] * feature_ids.count).join(' and '), *feature_ids)
}
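If the string arithmetic in that scope looks opaque, this plain-Ruby sketch shows just the condition-building step, one subquery clause per feature id joined with "and". FEATURE_CLAUSE and all_features_condition are names I made up for illustration.

```ruby
# One IN-subquery per requested feature; each ? is bound to one feature id.
FEATURE_CLAUSE =
  "locations.id in (select location_id from location_features where feature_id=?)"

# Repeat the clause once per id and AND the copies together.
def all_features_condition(feature_ids)
  ([FEATURE_CLAUSE] * feature_ids.count).join(" and ")
end

puts all_features_condition([1, 3, 4])
```

Multiplying a one-element array by n repeats the element n times, so three feature ids produce three clauses and three placeholders, which is why the scope can pass *feature_ids straight through as bind values.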

ActiveRecord query array intersection?

I'm trying to figure out the count of certain types of articles. I have a very inefficient query:
Article.where(status: 'Finished').select{|x| x.tags & Article::EXPERT_TAGS}.size
In my quest to be a better programmer, I'm wondering how to make this a faster query. tags is an array of strings in Article, and Article::EXPERT_TAGS is another array of strings. I want to find the intersection of the arrays, and get the resulting record count.
EDIT: Article::EXPERT_TAGS and article.tags are defined as Mongo arrays. These arrays hold strings, and I believe they are serialized strings. For example: Article.first.tags = ["Guest Writer", "News Article", "Press Release"]. Unfortunately this is not set up properly as a separate table of Tags.
2nd EDIT: I'm using MongoDB, so actually it is using a MongoWrapper like MongoMapper or mongoid, not ActiveRecord. This is an error on my part, sorry! Because of this error, it screws up the analysis of this question. Thanks PinnyM for pointing out the error!
Since you are using MongoDB, you could also consider a MongoDB-specific solution (aggregation framework) for the array intersection, so that you could get the database to do all the work before fetching the final result.
See this SO thread How to check if an array field is a part of another array in MongoDB?
Assuming that the entire tags list is stored in a single database field and that you want to keep it that way, I don't see much scope for improvement, since you need to get all the data into Ruby for processing.
However, there is one problem with your database query
Article.where(status: 'Finished')
# This translates into the following query
SELECT * FROM articles WHERE status = 'Finished'
Essentially, you are fetching all the columns whereas you only need the tags column for your process. So, you can use pluck like this:
Article.where(status: 'Finished').pluck(:tags)
# This translates into the following query
SELECT tags FROM articles WHERE status = 'Finished'
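Once only the tag arrays are loaded, the counting itself is a one-liner. One thing to watch: `x.tags & Article::EXPERT_TAGS` returns an array, and an empty array is still truthy in Ruby, so the select in the question keeps every article. The sketch below tests `.any?` on the intersection instead. EXPERT_TAGS here holds invented sample values, and expert_article_count is a hypothetical helper name.

```ruby
# Invented sample values; the real list lives in Article::EXPERT_TAGS.
EXPERT_TAGS = ["Guest Writer", "Press Release"].freeze

# Count the tag lists that share at least one tag with the expert list.
def expert_article_count(tag_lists, expert_tags = EXPERT_TAGS)
  tag_lists.count { |tags| (Array(tags) & expert_tags).any? }
end

tag_lists = [
  ["Guest Writer", "News Article"],
  ["News Article"],
  [],
  ["Press Release"]
]

puts expert_article_count(tag_lists) # prints 2
```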
I answered a question regarding general intersection like queries in ActiveRecord here.
Extracted below:
The following is a general approach I use for constructing intersection like queries in ActiveRecord:
class Service < ActiveRecord::Base
belongs_to :person
def self.with_types(*types)
where(service_type: types)
end
end
class City < ActiveRecord::Base
has_and_belongs_to_many :services
has_many :people, inverse_of: :city
end
class Person < ActiveRecord::Base
belongs_to :city, inverse_of: :people
def self.with_cities(cities)
where(city_id: cities)
end
# intersection like query
def self.with_all_service_types(*types)
types.map { |t|
joins(:services).merge(Service.with_types t).select(:id)
}.reduce(scoped) { |scope, subquery|
scope.where(id: subquery)
}
end
end
Person.with_all_service_types(1, 2)
Person.with_all_service_types(1, 2).with_cities(City.where(name: 'Gold Coast'))
It will generate SQL of the form:
SELECT "people".*
FROM "people"
WHERE "people"."id" in (SELECT "people"."id" FROM ...)
AND "people"."id" in (SELECT ...)
AND ...
You can create as many subqueries as required with the above approach based on any conditions/joins etc so long as each subquery returns the id of a matching person in its result set.
Each subquery result set will be AND'ed together thus restricting the matching set to the intersection of all of the subqueries.

How do I calculate the most popular combination of order lines? (or any similar order/order lines db arrangement)

I'm using Ruby on Rails. I have a couple of models which fit the normal order/order lines arrangement, i.e.
class Order
has_many :order_lines
end
class OrderLine
belongs_to :order
belongs_to :product
end
class Product
has_many :order_lines
end
(greatly simplified from my real model!)
It's fairly straightforward to work out the most popular individual products via order line, but what magical ruby-fu could I use to calculate the most popular combination(s) of products ordered?
Cheers,
Graeme
My suggestion is to create an array a of Product.id numbers for each order and then do the equivalent of
h = Hash.new(0)
# for each a
h[a.sort.hash] += 1
You will naturally need to consider the scale of your operation and how much you are willing to approximate the results.
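Spelled out a little more, the counting might look like the sketch below. It keys the hash on the sorted array of product ids itself rather than on `a.sort.hash`, which is a deliberate swap so the combination can be read back from the result. combination_counts and most_popular_combinations are made-up names, and the `uniq` means repeated products within one order are ignored.

```ruby
# orders is an array of product-id arrays, one per order.
# Returns { sorted unique product ids => number of orders with that combination }.
def combination_counts(orders)
  orders.each_with_object(Hash.new(0)) do |product_ids, counts|
    counts[product_ids.uniq.sort] += 1
  end
end

# The n most frequent combinations as [combination, count] pairs.
def most_popular_combinations(orders, n = 3)
  combination_counts(orders).sort_by { |_combo, count| -count }.first(n)
end

orders = [[3, 1], [1, 3], [2], [1, 3, 2], [1, 3]]
p most_popular_combinations(orders, 1) # => [[[1, 3], 3]]
```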
External Solution
Create a "Combination" model and index the table by the hash, then each order could increment a counter field. Another field would record exactly which combination that hash value referred to.
In-memory Solution
Look at the last 100 orders and recompute the order popularity in memory when you need it. Hash#sort will give you a sorted list of popularity hashes. You could either make a composite object that remembered what order combination was being counted, or just scan the original data looking for the hash value.
Thanks for the tip digitalross. I followed the external solution idea and did the following. It varies slightly from the suggestion as it keeps a record of individual order_combos, rather than storing a counter so it's possible to query by date as well e.g. most popular top 10 orders in the last week.
I created a method in my order which converts the list of order items to a comma separated string.
def to_s
order_lines.map { |ol| ol.product_id }.sort.join(",")
end
I then added a filter so the combo is created every time an order is placed.
after_save :create_order_combo
def create_order_combo
oc = OrderCombo.create(:user => user, :combo => self.to_s)
end
And finally my OrderCombo class looks something like below. I've also included a cached version of the method.
class OrderCombo
belongs_to :user
scope :by_user, lambda{ |user| where(:user_id => user.id) }
def self.top_n_orders_by_user(user,count=10)
OrderCombo.by_user(user).count(:group => :combo).sort { |a,b| a[1] <=> b[1] }.reverse[0..count-1]
end
def self.cached_top_orders_by_user(user,count=10)
Rails.cache.fetch("order_combo_#{user.id.to_s}_#{count.to_s}", :expires_in => 10.minutes) { OrderCombo.top_n_orders_by_user(user, count) }
end
end
It's not perfect as it doesn't take into account increased popularity when someone orders more of one item in an order.
