I have a model "Wrapper", which has_many of another model "Category", which in turn has_many of another model, "Thing".
"Thing" has the integer attributes :count and :number. It also has an instance method defined as such in models/thing.rb:
def ratio
  (self.count + self.number).to_f / Thing.all.count.to_f
end
"Category", then, has this instance method, defined in models/category.rb:
def thing_ratios
  self.things.sum(&:ratio)
end
Finally, my wrapper.html.erb view shows Categories, listed in order of thing_ratios:
<% @wrapper.categories.sort_by(&:thing_ratios).each do |category| %>
...
My question is this: every time someone reloads the page wrapper.html.erb, would every single relevant calculation have to be recalculated, all the way down to self.count for every Thing associated with every Category on the page?
In addition to the resources that @Kelseydh provided, you can also consider memoization for when you hit the same function multiple times as part of the same request. However, it will not retain its value after the request is processed.
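To make the memoization idea concrete, here is a minimal sketch in plain Ruby. The Category class below is a hypothetical stand-in, not the asker's ActiveRecord model, and the calculations counter exists only to show how often the sum really runs.

```ruby
# Hypothetical stand-in for the Category model; no Rails required.
class Category
  attr_reader :calculations

  def initialize(ratios)
    @ratios = ratios     # stand-in for the ratios of the associated things
    @calculations = 0    # counts how often the expensive sum actually runs
  end

  # Memoized: the sum is computed on the first call and reused afterwards,
  # but only for the lifetime of this object (in Rails, roughly one request).
  def thing_ratios
    @thing_ratios ||= begin
      @calculations += 1
      @ratios.sum
    end
  end
end

category = Category.new([0.25, 0.5, 0.75])
category.thing_ratios
category.thing_ratios
puts category.calculations
```

Note that `||=` recomputes whenever the stored value is nil or false, so this pattern suits results like numbers or arrays.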
Yes, it will be recalculated every time. If this is an expensive operation, you should add a counter_cache (guide: http://railscasts.com/episodes/23-counter-cache-column) for the count and look into caching the query result using a store such as Memcached.
Many caching strategies exist, but for the database/Rails app itself Russian doll caching is considered the most flexible approach. If your data doesn't update often (meaning you don't need to worry about cache expiration often) you may be able to get away with page caching -- if so, count yourself lucky.
Some resources to get you started:
DHH on Russian Doll Caching: https://signalvnoise.com/posts/3113-how-key-based-cache-expiration-works
Railscast on cache keys: http://railscasts.com/episodes/387-cache-digests
Advanced caching guide: http://hawkins.io/2012/07/advanced_caching_revised/
Not free, but I found this series was what really let me understand various forms of caching correctly:
http://www.pluralsight.com/courses/rails-4-1-performance-fundamentals
Related
I have a method on a user object, and the user has many documents (an association).
Inside the method I have to call documents in many places, where the receiver is self by default.
So I was wondering whether this queries the user's documents that many times. I thought I could call it once and keep a reference, docs = self.documents or docs = documents, and use that reference wherever the user's documents are needed, thus avoiding repeated calls to the association method documents on the user object.
But does it really query again and again, or is the result cached the first time it is called?
I checked in the console: when I call user.documents, it loads the documents (a DB call), but a later identical call does not load them again.
Please explain how this works. Is it good practice to store the first call in a variable and reuse it?
Rails automatically caches the result of database calls. From the Rails Guides:
Query caching is a Rails feature that caches the result set returned by each query. If Rails encounters the same query again for that request, it will use the cached result set as opposed to running the query against the database again.
For example:
class ProductsController < ApplicationController
  def index
    # Run a find query
    @products = Product.all
    ...
    # Run the same query again
    @products = Product.all
  end
end
The second time the same query is run against the database, it's not actually going to hit the database. The first time the result is returned from the query it is stored in the query cache (in memory) and the second time it's pulled from memory.
However, it's important to note that query caches are created at the start of an action and destroyed at the end of that action and thus persist only for the duration of the action. If you'd like to store query results in a more persistent fashion, you can with low level caching.
My recommendation is not to assign it to a variable because it does nothing to improve the readability of the code and the performance difference is negligible. It could introduce confusion; if I were reading the code and saw someone replaced all calls to documents with docs I would wonder why and would have to take time to understand why.
Ultimately, setting docs = self.documents just tells Ruby "docs should point at the same memory location as self.documents", and regardless of which one you call Ruby will return the same data from the same memory location. There is a performance difference between calling a method and calling a variable, but that performance difference is so minor in comparison to something like the speed of a database call that it can be ignored; there are much better ways to improve the performance of an app than switching method calls to variable calls.
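A quick way to see the "same memory location" point is an identity check. This is only a sketch in plain Ruby: the User class here is a hypothetical stand-in whose documents reader memoizes an array, on the assumption that a loaded association behaves similarly (which is what the console experiment in the question suggests).

```ruby
# Hypothetical stand-in for the User model; the reader memoizes its result,
# standing in for an association that has already been loaded.
class User
  def documents
    @documents ||= ["a.pdf", "b.pdf"]
  end
end

user = User.new
docs = user.documents            # no copy is made; docs is another reference

puts docs.equal?(user.documents) # equal? compares object identity
docs << "c.pdf"
puts user.documents.length       # the change is visible through both names
```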
If your concern is that you don't want to type out documents over and over again when you could just type docs, then use alias_method:
class User < ApplicationRecord
  has_many :documents
  alias_method :docs, :documents
end
Then there is no difference between calling user.documents and user.docs -- they call the same method. But again, does it do anything to improve readability of the code? In my opinion, no.
Stick with calling documents.
I want to implement low-level caching in my application, but I'm having some trouble following the documentation. This is what they have as an example:
class Product < ActiveRecord::Base
  def competing_price
    Rails.cache.fetch("#{cache_key}/competing_price", expires_in: 12.hours) do
      Competitor::API.find_price(id)
    end
  end
end
My questions are:
How am I supposed to get that cache_key variable? Is it provided somehow by the Rails cache, or should I build my own key?
I'm not sure I clearly understood how this works, so please confirm whether this logic is correct: I set this up, for example, in my controller to generate a ton of variables. Then every time a variable is requested (from the view, for example), the controller, instead of recalculating it every time (a long query), retrieves the previously stored result as long as the key hasn't changed. If the key has changed, it recalculates all the variables inside the cache block.
ActiveRecord::Integration includes the cache_key method in Rails 4. It should be included by default in the standard Rails configuration. To test this, pop open console, get a record and call cache_key on it.
Reference on the method is available here.
It will usually generate a string similar to "#{record.class.name.underscore}/#{record.to_param}-#{record.updated_at}". This key-based invalidation approach lets you avoid a lot of the effort involved by simply looking for a cache value based on whenever the record was last updated. Old cache values will be ignored because they're not being retrieved.
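As a rough illustration of key-based invalidation, here is a sketch in plain Ruby. TinyCache is a hypothetical in-memory stand-in for Rails.cache (only a fetch method is sketched), Product is a Struct rather than an ActiveRecord model, and its cache_key is a simplified version of the format described above. The hard-coded price stands in for the slow competitor lookup.

```ruby
# Hypothetical in-memory stand-in for Rails.cache; only `fetch` is sketched.
class TinyCache
  def initialize
    @store = {}
  end

  # Returns the stored value for key, or runs the block and stores its result.
  def fetch(key)
    return @store[key] if @store.key?(key)
    @store[key] = yield
  end
end

# Stand-in record with a simplified cache_key built from id and updated_at.
Product = Struct.new(:id, :updated_at) do
  def cache_key
    "products/#{id}-#{updated_at.to_i}"
  end
end

cache = TinyCache.new
lookups = 0 # counts how often the "expensive" block really runs
product = Product.new(1, Time.at(1_000))

competing_price = lambda do |record|
  cache.fetch("#{record.cache_key}/competing_price") do
    lookups += 1
    9.99 # placeholder for the slow external lookup
  end
end

competing_price.call(product)
competing_price.call(product)        # same key: the block is skipped
product.updated_at = Time.at(2_000)  # simulates the record being updated
competing_price.call(product)        # new key: the block runs again
puts lookups
```

The old entry is never deleted here; it is simply never asked for again, which is why this approach is usually paired with a store that evicts old entries on its own.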
DHH wrote a great article on the topic here.
The special_item_id_list method is responsible for returning an array of ids. The query and logic is complicated enough that I only want to have to run it once per any page request, but I'll be utilizing that resulting array of ids in many different places. The idea is to be able to use the is_special? method or the special_items scope freely without worrying about incurring overhead each time they are used, so they rely on the special_item_id_list method to do the heavy lifting and caching.
I don't want the results of this query to persist between page loads, but I'd like the query run only once per page load. I don't want to use a global variable and thought a class variable on the model might work; however, it appears that the class variable does persist between page loads. I'm guessing the Item class is part of the Rails stack and stays in memory.
So where would be the preferred place for storing my id list so that it's rebuilt on each page load?
class Item < ActiveRecord::Base
  scope :special_items, lambda { where(:id => special_item_id_list) }
  def self.special_item_id_list
    @special_item_id_list ||= ... # some complicated queries
  end
  def is_special?
    self.class.special_item_id_list.include?(id)
  end
end
UPDATE: What about using Thread? I've done this before for tracking the current user and I think it could be applied here, but I wonder if there's another way? Here's a StackOverflow conversation discussing threads, which also mentions the request_store gem as possibly a cleaner way of doing so.
This railscast covers what you're looking for. In short, you're going to want to do something like this:
after_commit :flush_cache

def self.cached_special_item_list
  Rails.cache.fetch("special_items") do
    special_item_id_list
  end
end

private

def flush_cache
  Rails.cache.delete("special_items")
end
At first I went with a form of Jonathan Bender's suggestion of utilizing Rails.cache (thanks John), but wasn't quite happy with how I was having to expire it. For lack of a better idea I thought it might be better to use Thread after all. I ultimately installed the request_store gem to store the query results. This keeps the data around for the duration I wanted (the lifetime of the request/response) and no longer, without any need for expiration.
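For anyone curious what request-scoped storage looks like in principle, here is a minimal sketch in plain Ruby built on Thread.current. The RequestScope module and its with_request helper are hypothetical names made up for this illustration, not the request_store gem's API; in a real app a gem or middleware would take care of clearing the store at the end of each request.

```ruby
# Hypothetical per-request store backed by Thread.current.
module RequestScope
  KEY = :request_scope_store

  # Hash that lives only until clear! is called.
  def self.store
    Thread.current[KEY] ||= {}
  end

  def self.clear!
    Thread.current[KEY] = nil
  end

  # Wraps one "request": whatever was stored is discarded afterwards,
  # even if the block raises.
  def self.with_request
    yield
  ensure
    clear!
  end
end

queries = 0 # counts how often the "complicated query" really runs
special_item_id_list = lambda do
  RequestScope.store[:special_item_ids] ||= begin
    queries += 1
    [1, 2, 3] # placeholder for the result of the complicated queries
  end
end

# Two simulated requests, each asking for the list twice.
2.times do
  RequestScope.with_request do
    special_item_id_list.call
    special_item_id_list.call
  end
end
puts queries
```

The query runs once per simulated request rather than once per call, and nothing carries over from one request to the next.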
Are you really sure this optimisation is necessary? Are you having performance issues because of it? Unless it's actually a problem I would not worry about it.
That said, you could create a new class, make special_item_id_list an instance method on that class, and then pass an instance of it to anything that needs to use that expensive-to-calculate data.
Or it might suffice to cache the data on instances of Item (possibly by making special_item_id_list an instance method), and not worry about different instances not being able to share the cache.
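A rough sketch of the separate-class idea, in plain Ruby. SpecialItems is a hypothetical name, and the block passed to new stands in for the complicated queries; the queries counter is there only to show that the work happens once per instance.

```ruby
# Hypothetical holder for the expensive id list. Build one per request and
# pass it to whatever needs the list.
class SpecialItems
  attr_reader :queries

  # The block stands in for the complicated queries that build the id list.
  def initialize(&loader)
    @loader = loader
    @queries = 0
  end

  # Runs the loader once per instance and reuses the result afterwards.
  def ids
    @ids ||= begin
      @queries += 1
      @loader.call
    end
  end

  def special?(item_id)
    ids.include?(item_id)
  end
end

special_items = SpecialItems.new { [2, 4, 6] }
puts special_items.special?(2)
puts special_items.special?(3)
puts special_items.queries
```

Because the cache lives on the instance, it disappears when the instance is garbage-collected, so nothing needs to be expired by hand.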
I've taken the quote below, which I can see some sense in:
"Cached pages and fragments usually depend on model states. The cache doesn't care about which actions create, change or destroy the relevant model(s). So using a normal observer seems to be the best choice to me for expiring caches."
For example, I've got a Resque worker that updates a model. I need a fragment cache to expire when a model is updated or created. This can't be done with a sweeper.
However, using an observer will mean I would need something like, either in the model or in the Resque job:
ActionController::Base.new.expire_fragment('foobar')
The model itself should not know about caching; that would also break MVC principles and lead to ugly, ugly results down the road.
Use an ActiveRecord::Observer to watch for model changes. It can expire the cache.
You can auto-expire the cache by passing the model as an argument in your view template:
<% cache @model do %>
# your code here
<% end %>
What's happening behind the scenes is a cache named [model]/[id]-[updated_at] is created. Models have a method cache_key, which returns a string containing the model id and updated_at timestamp. When a model changes, the fragment's updated_at timestamp won't match and the cache will re-generate.
This is a much nicer approach and you don't have to worry about background workers or expiring the cache in your controllers/observers.
Ryan Bates also has a paid Railscast on the topic: Fragment Caching
A good and simple solution would be not to expire but to cache it with a key that will be different if the content is different. Here is an example
<% cache ["post-#{@post.id}", @post.updated_at.to_i] do %>
When that post gets updated and you fetch it again, it will miss the cache because the key is different, so the old entry effectively expires and the new value gets cached. I think you can run into some problems doing this. For example, if you are using the Rails default cache, which creates HTML files as cache, you would end up with a lot of files in your public dir after some time. You are better off setting your application to use something like memcached, which manages memory by deleting old cached records, pages, or partials when it needs room to cache others.
I'd recommend reviewing this section on sweepers in the Rails Guide - Caching with Rails: An overview
http://guides.rubyonrails.org/caching_with_rails.html#sweepers
It looks like this can be done without specifically creating lots of cache expiration observers.
Introduction
I have a (mostly) single-page application built with BackboneJS and a Rails backend.
Because most of the interaction happens on one page of the webapp, when the user first visits the page I basically have to pull a ton of information out of the database in one large deeply joined query.
This is causing me some rather extreme load times on this one page.
NewRelic appears to be telling me that most of my problems are because of 457 individual fast method calls.
Now I've done all the eager loading I can do (I checked with the Bullet gem) and I still have a problem.
These method calls are most likely occurring in my Rabl serializer, which I use to serialize a bunch of JSON to embed into the page for initializing Backbone. You don't need to understand all this, but suffice to say it could add up to 457 method calls.
object @search
attributes :id, :name, :subscription_limit
# NOTE: Include a list of the members of this search.
child :searchers => :searchers do
  attributes :id, :name, :gravatar_icon
end
# Each search has many concepts (there could be over 100 of them).
child :concepts do |search|
  attributes :id, :title, :search_id, :created_at
  # The person who suggested each concept.
  child :suggester => :suggester do
    attributes :id, :name, :gravatar_icon
  end
  # Each concept has many suggestions (approx. 4 each).
  node :suggestions do |concept|
    # Here I'm scoping suggestions to only ones which meet certain conditions.
    partial "suggestions/show", object: concept.active_suggestions
  end
  # Add a boolean flag to tell if the concept is a favourite or not.
  node :favourite_id do |concept|
    # Another method call which occurs for each concept.
    concept.favourite_id_for(current_user)
  end
end
# Each search has subscriptions to certain services (approx. 4).
child :service_subscriptions do
  # This contains a few attributes and 2 fairly innocuous method calls.
  extends "service_subscriptions/show"
end
So it seems that I need to do something about this but I'm not sure what approach to take. Here is a list of potential ideas I have:
Performance Improvement Ideas
Dumb-Down the Interface
Maybe I can come up with ways to present information to the user which don't require the actual data to be present. I don't see why I should absolutely need to do this though, other single-page apps such as Trello have incredibly complicated interfaces.
Concept Pagination
If I paginate concepts it will reduce the amount of data being extracted from the database each time. It would produce an inferior user interface, though.
Caching
At the moment, refreshing the page just extracts the entire search out of the DB again. Perhaps I can cache parts of the app to cut down on DB hits. This seems messy, though, because not much of the data I'm dealing with is static.
Multiple Requests
It is technically bad to serve the page without embedding the JSON into the page but perhaps the user will feel like things are happening faster if I load the page unpopulated and then fetch the data.
Indexes
I should make sure that I have indexes on all my foreign keys. I should also try to think about places where it would help to have indexes (such as favourites?) and add them.
Move Method Calls into DB
Perhaps I can cache some of the results of the iteration I do in my view layer into the DB and just pull them out instead of computing them. Or I could sync things on write rather than on read.
Question
Does anyone have any suggestions as to what I should be spending my time on?
This is a hard question to answer without being able to see the actual user interface, but I would focus on loading exactly only as much data as is required to display the initial interface. For example, if the user has to drill down to see some of the data you're presenting, then you can load that data on demand, rather than loading it as part of the initial payload. You mention that a search can have as many as 100 "concepts," maybe you don't need to fetch all of those concepts initially?
Bottom line, it doesn't sound like your issue is really on the client side -- it sounds like your server-side code is slowing things down, so I'd explore what you can do to fetch less data, or to defer the complex queries until they are definitely required.
I'd recommend separating your JS code-base into modules that are dynamically loaded using an asset loader like RequireJS. This way you won't have so many XHRs firing at load time.
When a specific module is needed it can load and initialize at an appropriate time instead of every page load.
If you complicate your code a little, each module should be able to start and stop. So, if you have any polling occurring or complex code executing you can stop the module to increase performance and decrease the network load.