Hi, can anyone tell me how I can improve performance when an association returns a large number of records? For example, in my app:
class Restaurant < ActiveRecord::Base
has_many :inventory_items
end
class InventoryItem < ActiveRecord::Base
belongs_to :vendor
end
I am trying to find the vendors of my restaurant as follows:
current_restaurant.inventory_items.includes(:vendor).uniq
current_restaurant.inventory_items returns a large number of records, and loading them takes most of the time. How can I reduce this time? Please help.
There are a number of solutions you can use, depending on how your application is configured and what it needs to do:
Only select the columns that you want, for example, if you are only looking for the IDs, you can use the pluck or select methods.
As Chetan suggested in his answer, you can also add scopes, and in addition to that also add indexes for the columns in the scope depending on what kind of columns they are.
If you are looking at calculated values, consider caching them on the Restaurant table.
You can add a scope to your model with a condition for the records you want to fetch. For example:
scope :your_scope_name, -> { includes(:vendor).where(*some more conditions*) }
This helps the query avoid going through all the data.
Use pagination; loading every record at once is rarely a good idea.
See the will_paginate or kaminari gems.
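Pagination as suggested above comes down to fetching one LIMIT/OFFSET window at a time. As a rough, database-free sketch of that arithmetic (the helper names page_window and fetch_page are made up for illustration and are not part of will_paginate or kaminari):

```ruby
# Plain-Ruby stand-in for paginated loading; no Rails or database involved.
# `page_window` and `fetch_page` are hypothetical helper names.

# Translate a 1-based page number into the OFFSET/LIMIT pair that a
# paginated query would use.
def page_window(page, per_page)
  page = [page.to_i, 1].max # treat 0, negatives and nil as page 1
  { offset: (page - 1) * per_page, limit: per_page }
end

# Return only the requested slice instead of the whole collection.
def fetch_page(records, page, per_page)
  window = page_window(page, per_page)
  records.drop(window[:offset]).first(window[:limit])
end

items = (1..95).map { |i| "item-#{i}" }
puts fetch_page(items, 1, 20).size # a full page
puts fetch_page(items, 5, 20).size # the last, partial page
```

With a real gem the same window is applied in SQL, so only one page of rows ever leaves the database.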
Update:
class Restaurant < ActiveRecord::Base
has_many :inventory_items
has_many :vendors, through: :inventory_items
end
Then,
current_restaurant.vendors.uniq
It depends on the size of the tables and how they are indexed, but one subquery might be faster than a huge join:
Vendor.where(id: current_restaurant.inventory_items.select(:vendor_id).distinct)
Related
class Category
has_many :images
has_many :articles
end
class Image
belongs_to :category
end
class Article
belongs_to :category
end
I'm trying to understand what solutions there are in Rails for children of different models to be queried by the same parent?
E.g. I'd like to get all images and articles that belong to the same category and sort them all by created_at.
You can try 'includes' in Rails:
Article.includes(:category)
As I said, it seems to me you can eager-load multiple associations. In your case it could be something like this:
category = Category.includes(:images, :articles).find(2)
(category.images + category.articles).sort_by(&:created_at)
Basically, you pass the desired Category ID and eager-load the :images and :articles that belong to that Category. sort_by then sorts the combined list by created_at in Ruby, after the records have been loaded.
This blog post on eager loading could help you as well.
You can't simply force Active Record to fetch all of a record's dependencies in a single query (afaik), regardless of whether you use lazy or eager loading. I think your best bet is:
class Category
has_many :images, -> { order(:created_at) }
has_many :articles, -> { order(:created_at) }
end
categories = Category.includes(:images, :articles)
When you iterate over categories and read their images and articles, this makes three queries, one for each of the tables categories, images and articles, which is a good tradeoff for the ease of use of an ORM.
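Once both associations are loaded, putting images and articles into one list ordered by created_at is an in-memory step. A minimal plain-Ruby sketch, assuming Structs as stand-ins for the loaded records (the data and the combined_timeline helper are hypothetical):

```ruby
# Structs standing in for loaded Image and Article records (made-up data).
Image   = Struct.new(:title, :created_at)
Article = Struct.new(:title, :created_at)

# Combine both already-loaded collections and sort them as one list.
def combined_timeline(images, articles)
  (images + articles).sort_by(&:created_at)
end

images   = [Image.new("logo", Time.utc(2020, 1, 3)),
            Image.new("banner", Time.utc(2020, 1, 1))]
articles = [Article.new("intro", Time.utc(2020, 1, 2))]

combined_timeline(images, articles).each do |record|
  puts "#{record.class}: #{record.title}"
end
```

In a Rails app the two arrays would come from category.images and category.articles; sorting in Ruby is reasonable as long as a single category does not hold a very large number of children.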
Now, if you insist on fetching all that data in just one query, there is surely a way to do it with Arel, but think twice about whether it is worth it. The last option I see is good old SQL:
query = <<-SQL
SELECT categories.*, images.*, articles.*
FROM categories
-- and so on with joins, orders, etc...
SQL
result = ActiveRecord::Base.connection.execute(query)
I really discourage this option, as it will return A LOT of duplicated data because you are joining three tables, and it would be a real pain to sort the rows for your use.
Let's say I have a single web page form user interface with 2 sets of checkboxes. With the set 1 checkboxes, I can check off which trainers I would like ("Jason", "Alexandra", etc.). With the set 2 checkboxes, I can check off which animals I would like to see ("Tigers", "Bears", etc.). Once I submit the form with these options, I get back a list of zoos that match the criteria (let's assume all the trainers work at all the zoos and all the animals are at all the zoos, for discussion's sake).
We'll be running our database query by "name" (e.g., search using trainer names and animal names, NOT database ids)
Let's say we are using a Postgres database that has hundreds of thousands of rows (if not millions).
1. Is it more efficient to search using an "ILIKE" query, or is it better to do a standard join query (e.g., Zoo.includes(:animals, :trainers).where("animals.name IN (?) AND trainers.name IN (?)", animal_names, trainer_names))?
2. Is there a better way than what I just showed in #1 above?
Model setup:
class Zoo < ActiveRecord::Base
# The join-model associations are declared before the :through associations that use them.
has_many :zoo_animals
has_many :zoo_trainers
has_many :animals, through: :zoo_animals
has_many :trainers, through: :zoo_trainers
end
class Animal < ActiveRecord::Base
has_many :zoo_animals
has_many :zoos, through: :zoo_animals
end
class Trainer < ActiveRecord::Base
has_many :zoo_trainers
has_many :zoos, through: :zoo_trainers
end
class ZooAnimal < ActiveRecord::Base
belongs_to :animal
belongs_to :zoo
end
class ZooTrainer < ActiveRecord::Base
belongs_to :zoo
belongs_to :trainer
end
EDIT: let's suppose I don't have access to the database IDs.
LIKE '%Jason%' is much less efficient than querying for the exact string 'Jason' (or querying for an ID), because while exact comparisons and some uses of LIKE can use an index on the column being queried, LIKE with a pattern beginning with a wildcard can't use an index.
However, performance doesn't sound like the most important consideration here. LIKE '%Jason%' will still probably be fast enough on a reasonably sized database under reasonable load. If the application really needs to search for things by substring (which implies that a search might have multiple results), that requirement can't be met by simple equality.
There are an endless number of higher-powered solutions to searching text, including Postgres built-in full-text search and external solutions like Elasticsearch. Without specific requirements for scaling I'd go with LIKE until it started to slow down and only then invest in something more complicated.
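To make the index point above concrete, here is a rough plain-Ruby analogy. It is not how Postgres implements its indexes; a sorted array simply plays the role of the index, and the helper names are hypothetical. A prefix lookup can jump to its starting point with a binary search, while a substring match has to look at every entry:

```ruby
# A sorted array as a stand-in for an index on the name column (analogy only).
NAMES = %w[Alexandra Jason Jasper Mason Zoe].sort.freeze

# Like  name LIKE 'Jas%' : binary-search to the first candidate, then read
# forward only while entries still share the prefix.
def prefix_matches(sorted, prefix)
  start = sorted.bsearch_index { |name| name >= prefix }
  return [] if start.nil?

  sorted.drop(start).take_while { |name| name.start_with?(prefix) }
end

# Like  name LIKE '%aso%' : sort order does not help, so every entry is checked.
def substring_matches(names, fragment)
  names.select { |name| name.include?(fragment) }
end

p prefix_matches(NAMES, "Jas")
p substring_matches(NAMES, "aso")
```

The first lookup touches a handful of entries however long the list grows; the second always scans the whole list, which is the cost the answer describes for a pattern that begins with a wildcard.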
I have the following models, each a related child of the previous one (I excluded other model methods and declarations for brevity):
class Course < ActiveRecord::Base
has_many :questions
scope :most_answered, joins(:questions).order('questions.answers_count DESC') #this is the query causing issues
end
class Question < ActiveRecord::Base
belongs_to :course, :counter_cache => true
has_many :answers
end
class Answer < ActiveRecord::Base
belongs_to :question, :counter_cache => true
end
Right now I only have one Course populated (so when I run Course.all.count in the console, I get 1). The first Course currently has three questions populated, but when I run Course.most_answered.count (most_answered is my scope method written in Course as seen above), I get 3 as the result in the console, which is incorrect. I have tried various iterations of the query, as well as consulting the Rails guide on queries, but can't seem to figure out what I'm doing wrong. Thanks in advance.
From what I can gather, your most_answered scope is attempting to order by the sum of questions.answers_count.
As it stands there is no sum, and since there are three questions for the first course, the join onto the questions table produces three rows, one per question.
What you will need to do is something like the following:
scope :most_answered, joins(:questions)
.select("courses.id, courses.name, ..., SUM(questions.answers_count) as answers_count")
.group("courses.id, courses.name, ...")
.order("answers_count DESC")
You'll need to explicitly specify the courses fields you want to select so that you can use them in the group by clause.
Edit:
In both places where I mention courses.id, courses.name, ... (in the select and the group), you'll need to replace this with the actual columns you want to select. Since this is a scope it would be best to select all fields in the courses table, but you will need to specify them individually.
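The row multiplication described in this answer can be reproduced without a database. The following is a plain-Ruby sketch with made-up data: Structs stand in for the two tables, and the helpers joined_rows and most_answered are hypothetical names that only imitate what the SQL does.

```ruby
Course   = Struct.new(:id, :name)
Question = Struct.new(:course_id, :answers_count)

# Imitates courses INNER JOIN questions: one [course, question] row per match.
def joined_rows(courses, questions)
  courses.flat_map do |course|
    questions.select { |q| q.course_id == course.id }.map { |q| [course, q] }
  end
end

# Imitates GROUP BY courses.id with SUM(questions.answers_count),
# ordered by that sum, largest first.
def most_answered(courses, questions)
  joined_rows(courses, questions)
    .group_by { |course, _question| course }
    .map { |course, rows| [course.name, rows.sum { |_course, q| q.answers_count }] }
    .sort_by { |_name, total| -total }
end

courses   = [Course.new(1, "Rails")]
questions = [Question.new(1, 2), Question.new(1, 0), Question.new(1, 5)]

puts joined_rows(courses, questions).size   # one row per question, not per course
puts most_answered(courses, questions).size # one row per course after grouping
```

Counting the ungrouped join counts questions, which is why the original scope reported 3 for a single course.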
I have models UserVote, Comment, Edit, etc, all of which have a user_id attribute. I'm trying to create a sort of timeline of recent activity, and this has me querying all 5 of my models separately and sorting by datetime. However, with accounts that have a lot of activity, these 5 queries take a very long time to execute. I'd like to find a way to optimize the performance, and I figured combining the 5 queries might work.
I haven't been able to come up with any working query to achieve what I'd like.
Thanks for any help!
I think the best suggestion in the comments is from Steve Jorgensen, with "I have generally seen this done by adding records to an activity log, and then querying that.".
If you want to take this idea to the next level, check out Sphinx (a search engine designed for indexing database content). You can integrate it easily with Rails using Thinking Sphinx - http://freelancing-god.github.com/ts/en/.
Also, as Tim Peters brings up, you really should have indexes on all of your foreign keys, regardless of how you solve this - http://apidock.com/rails/ActiveRecord/ConnectionAdapters/SchemaStatements/add_index.
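The activity-log idea quoted at the start of this answer can be pictured as follows. This is only an in-memory sketch under assumed names (Activity, ActivityLog, record and recent_for are all hypothetical); in a real application the log would be a database table with an index on user_id and the timestamp.

```ruby
# Hypothetical in-memory stand-in for a single activities table.
Activity = Struct.new(:user_id, :kind, :created_at)

class ActivityLog
  def initialize
    @entries = []
  end

  # Called whenever a vote, comment, edit, etc. is created.
  def record(user_id, kind, created_at = Time.now)
    @entries << Activity.new(user_id, kind, created_at)
  end

  # One lookup against one collection replaces the five per-model queries.
  def recent_for(user_id, limit = 10)
    @entries.select { |activity| activity.user_id == user_id }
            .sort_by(&:created_at)
            .reverse
            .first(limit)
  end
end

log = ActivityLog.new
log.record(1, :user_vote, Time.utc(2020, 1, 1))
log.record(1, :comment,   Time.utc(2020, 1, 2))
log.record(2, :edit,      Time.utc(2020, 1, 3))

log.recent_for(1).each { |activity| puts "#{activity.created_at} #{activity.kind}" }
```

The design choice is to pay a small cost on every write so that reading the timeline never has to touch the five source tables.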
I think it is a good idea to use polymorphic associations for this problem - http://guides.rubyonrails.org/association_basics.html#polymorphic-associations
class TimeLine < ActiveRecord::Base
belongs_to :timelineable, :polymorphic => true
end
class UserVote < ActiveRecord::Base
has_many :time_lines, :as => :timelineable
end
class Comment < ActiveRecord::Base
has_many :time_lines, :as => :timelineable
end
Now you can sort time_line and access associated resources.
I have two models:
Novel has_many :pages
Page belongs_to :novel
I want to list popular Novels according to page count. Essentially, I want Novel models loaded from the outcome of this query:
select p.novel_id, count(*) as count
from pages p
GROUP BY p.novel_id
ORDER BY count DESC
I'm sure there's some cute way to do it in Rails 2.3 using named_scope, but I can't quite get it to work. Plus, if it does work, is it going to be dog slow?
I've considered keeping page_count on Novel, but that seems like a violation of something (convention, normalization, my soul).
Seems like a counter cache is the way to go. If you create a column called pages_count on the novels table (with an index), Rails will cache the number of pages on the Novel model itself, making this kind of query very easy and performant.
The named_scope on the Novel model then becomes
class Novel < ActiveRecord::Base
named_scope :popular, :order => 'pages_count desc'
end
class Page < ActiveRecord::Base
belongs_to :novel, :counter_cache => true
end
For more details check out the counter cache railscast
Yep, that's going to be pretty slow. It's not a horrible thing to cache the page_count in your Novel. Normalization is all well and good, until it impacts performance.
Caching expensive calculations is the essence of most optimizations.
Keeping a counter_cache on the Novel is deemed acceptable in this matter and should aid your query.
In page.rb do:
belongs_to :novel, :counter_cache => true
And in your novels table put a pages_count column. This will be automatically incremented when you create pages and decremented when you remove them.
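For intuition, the bookkeeping that a counter cache performs can be imitated in plain Ruby. This is not Rails code and the helper names are hypothetical; it only mirrors the increment-on-create and decrement-on-destroy behaviour described above.

```ruby
# Plain-Ruby imitation of what :counter_cache => true maintains.
Novel = Struct.new(:title, :pages_count)
Page  = Struct.new(:novel)

# Creating a page bumps the parent's cached counter.
def create_page(novel)
  novel.pages_count += 1
  Page.new(novel)
end

# Destroying a page decrements it again.
def destroy_page(page)
  page.novel.pages_count -= 1
end

# The "popular" ordering then reads one cached number per novel
# instead of counting pages.
def popular(novels)
  novels.sort_by { |novel| -novel.pages_count }
end

short = Novel.new("Short Story", 0)
epic  = Novel.new("Epic", 0)
create_page(short)
3.times { create_page(epic) }

popular([short, epic]).each { |novel| puts "#{novel.title}: #{novel.pages_count}" }
```

The trade-off is the one discussed in the answers: a little denormalization on write in exchange for a cheap ORDER BY on read.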