Rails caching a paginated collection

Just doing some research on the best way to cache a paginated collection of items. Currently using jbuilder to output JSON and have been playing with various cache_key options.
The best example I've seen uses the latest record's updated_at plus the number of items in the collection:
def cache_key
  pluck("COUNT(*)", "MAX(updated_at)").flatten.map(&:to_i).join("-")
end
defined here: https://gist.github.com/aaronjensen/6062912
However, this won't work for paginated items, where I always have 10 items in my collection.
Are there any workarounds for this?

With a paginated collection, you're just getting an array. Any attempt to monkey patch Array to include a cache key would be a bit convoluted. Your best bet is just to use the cache method to generate a key on a collection-by-collection basis.
You can pass plenty of things to the cache method to generate a key. If you always have 10 items per page, I don't think the count is very valuable. However, the page number and the most recently updated item would be.
cache ["v1/items_list/page-#{params[:page]}", #items.maximum('updated_at')] do
would generate a cache key like
v1/items_list/page-3/20140124164356774568000
With Russian doll caching, you should also cache each item in the list:
# index.html.erb
<% cache ["v1/items_list/page-#{params[:page]}", @items.maximum('updated_at')] do %>
  <!-- v1/items_list/page-3/20140124164356774568000 -->
  <%= render @items %>
<% end %>

# _item.html.erb
<% cache ['v1', item] do %>
  <!-- v1/items/15-20140124164356774568000 -->
  <!-- render item -->
<% end %>

Caching paginated collections is tricky. The usual trick of using the collection count and max updated_at mostly does not apply!
As you said, the collection count is a given, so it's of little use unless you allow dynamic per_page values.
The latest updated_at is totally dependent on the sorting of your collection.
Imagine that a new record is added and ends up on page one. This means one record, previously on page 1, now moves to page 2, and one previous page 2 record moves to page 3. If the record newly arrived on page 2 was not updated more recently than the previous max, the cache key stays the same but the collection does not! The same happens when a record is deleted.
Only if you can guarantee that new records always end up on the last page and no records will ever be deleted is the max updated_at a solid way to go.
As a solution, you could include the total record count and the total max updated_at in the cache key, in addition to the page number and the per_page value. This requires extra queries, but it can be worth it, depending on your database configuration and record count.
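A minimal sketch of such a key (Item stands in for your model; all names here are illustrative, not from the original post):

<% cache ["v1/items_list",
          params[:page],              # page number
          params[:per_page],          # per-page value
          Item.count,                 # total record count (extra query)
          Item.maximum(:updated_at)   # total max updated_at (extra query)
         ] do %>
  <%= render @items %>
<% end %>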
Another solution is to use a key that takes into account some reduced form of the actual collection content, for example all record IDs.
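A hedged sketch of that idea, folding a digest of the page's record IDs into the key so that membership or order changes invalidate the fragment (Digest is from Ruby's standard library):

require "digest"

ids_digest = Digest::MD5.hexdigest(@items.map(&:id).join("-"))
cache ["v1/items_list", params[:page], ids_digest, @items.maximum('updated_at')] do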
If you are using Postgres as a database, this gem might help you, though I've never used it myself:
https://github.com/cmer/scope_cache_key
And the Rails 4 fork:
https://github.com/joshblour/scope_cache_key

Related

How does Rails fragment caching work?

I have just started using caching in a production application to speed things up. I've read the primary Rails guide, various blogs, the source itself, etc. But my head is still not clear on one simple thing when it comes to fragment caching:
When you expire the cache after updating an object, are you expiring only the single object, or the whole class? I think just the single object.
Here's an example:
<% @jobs.each do |job| %>
  <% cache("jobs_index_table_environment_#{session[:merchant_id]}_job_#{job}") do %>
    stuff
  <% end %>
<% end %>
I use the code above on my jobs index page. Each row is rendered with some information the user wants, some CSS, a click-through to the individual job, and so on.
I wrote this in my Job class (model):
after_save do
  Rails.cache.delete("jobs_index_table_environment_#{merchant_id}_job_#{self}")
end

after_destroy do
  Rails.cache.delete("jobs_index_table_environment_#{merchant_id}_job_#{self}")
end
I want the individual job objects destroyed from the cache if they are updated or destroyed, and of course newly created jobs get their own cache key the first time they pop on the page.
I don't do the Russian doll thing with @jobs because this is my "god" object and is changing all the time. The cache would almost never be helpful, as the collection probably morphs by the minute.
Is my understanding correct that in the above view, if I rendered, say, 25 jobs on the first page, I would get 25 entries in my cache under those keys, and then if I changed only the first job, its cached value would be destroyed? The next time the jobs page loads, that one fragment would be re-cached while the other 24 would just be pulled from the cache.
I'm a novice to fragment caching as well, and I just encountered a very similar use case, so I feel my (limited) knowledge is fresh enough to be of help.
Trosborn is correct: your terminal will highlight when you READ and when you WRITE, which shows you how many "hits" you got on your cache. It should only WRITE when you've changed an object. And based on what I see above, your delete is only deleting individual records.
However, I think there is a potentially simpler way to accomplish this, which is passing the ActiveRecord object to the cache, such as:
<% @jobs.each do |job| %>
  <% cache(job) do %>
    stuff
  <% end %>
<% end %>
Read this post from DHH on how this works. In short, when an AR object is passed to cache, the key is generated not just from the model name, but also from the id and updated_at fields.
Obsolete fragments eventually get pushed out of the cache when memory runs out, so you don't need to worry about deleting old cache objects.
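For illustration, this is the kind of key Rails derives from a record (the format shown is Rails 4's; timestamp precision varies by version):

job = Job.find(15)
job.cache_key
# => "jobs/15-20140124164356774568000"   # model name / id - updated_at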

How can I allow users to choose the columns shown in Rails?

I have a model named Employee with over a dozen fields, making it impractical to display them all at once. I would like to let users choose which columns are displayed, using either a multiple select box or a list of checkboxes. The result would ideally be stored in memory, not in the model (since nothing will be saved long term), and be accessible to a loop that displays the appropriate columns.
A sample of the view might look like this:
<% @employees.each do |employee| %>
  <tr>
    <% col_list.each do |col| %>
      <td><%= employee.send(col) %></td>
    <% end %>
  </tr>
<% end %>
where col_list is the list of columns selected by the user.
A better approach might be to output all the columns server side and do the filtering with JavaScript on the client. There are several libraries for this, such as jQuery DataTables.
You can combine this with a preferences model that is persisted to the session or Redis instead of the main RDBMS, if you want to remember the user's prefs.
(Yes, you can use models for objects not stored in the DB. It gives you all the Rails awesomeness of validations, form and param binding, etc.)
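A minimal sketch of such a tableless model, assuming Rails 4's ActiveModel::Model (the class name is made up for the example):

class ColumnPreference
  include ActiveModel::Model

  attr_accessor :columns
  validates :columns, presence: true
end

# Binds to forms and params like any AR model:
pref = ColumnPreference.new(columns: %w[name title department])
pref.valid? # => true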
As described, there is nothing particularly complicated about this feature.
If your users are logged in, you can store a serialized list of columns in a per-user preference. The list of columns should be sanitized to avoid exposing private columns.
If the user is not logged in, or you want a less persistent approach, simply store the list in a cookie.
If the list is not set (i.e. the cookie is empty), fall back to a default list of attributes.
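A rough sketch of that flow (the ALLOWED_COLUMNS whitelist and column names are invented for illustration):

# app/controllers/employees_controller.rb
ALLOWED_COLUMNS = %w[name title department hire_date].freeze

def index
  @employees = Employee.all
  requested  = Array(params[:columns] || cookies[:columns].to_s.split(","))
  @col_list  = requested & ALLOWED_COLUMNS         # sanitize against the whitelist
  @col_list  = %w[name title] if @col_list.empty?  # default when nothing is set
  cookies[:columns] = @col_list.join(",")          # remember the choice
end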

Is every relevant calculation performed every time the page is loaded?

I have a model "Wrapper", which has_many of another model "Category", which in turn has_many of another model, "Thing".
"Thing" has the integer attributes :count and :number. It also has an instance method defined as such in models/thing.rb:
def ratio
  (self.count + self.number).to_f / Thing.all.count.to_f
end
"Category", then, has this instance method, defined in models/category.rb:
def thing_ratios
  self.things.sum(&:ratio)
end
Finally, my wrapper.html.erb view shows Categories, listed in order of thing_ratios:
<% @wrapper.categories.sort_by(&:thing_ratios).each do |category| %>
...
My question is this: every time someone reloads wrapper.html.erb, would every single relevant calculation have to be recalculated, all the way down to self.count for every Thing associated with every Category on the page?
In addition to the resources that @Kelseydh provided, you can also consider memoization for when you hit the same method multiple times within the same request. However, it will not retain its value after the request is processed.
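A minimal memoization sketch applied to the ratio method above (the @ratio instance variable lives only as long as the object, i.e. the request):

def ratio
  @ratio ||= (count + number).to_f / Thing.count
end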
Yes, it will be recalculated every time. If this is an expensive operation, you should add a counter cache (guide: http://railscasts.com/episodes/23-counter-cache-column) for the count, and look into caching the query result with a service like Memcached.
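A standard counter cache setup for this schema (things_count is the Rails naming convention; the migration is assumed):

class Thing < ActiveRecord::Base
  belongs_to :category, counter_cache: true  # keeps categories.things_count in sync
end

# Requires an integer column on categories, e.g.:
#   add_column :categories, :things_count, :integer, default: 0
# category.things.size then reads the cached column instead of running COUNT(*).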
Many caching strategies exist, but for the database/Rails app itself, Russian doll caching is considered the most flexible approach. If your data doesn't update often (meaning you don't need to worry about cache expiration much), you may be able to get away with page caching; if so, count yourself lucky.
Some resources to get you started:
DHH on Russian doll caching: https://signalvnoise.com/posts/3113-how-key-based-cache-expiration-works
Railscast on cache keys: http://railscasts.com/episodes/387-cache-digests
Advanced caching guide: http://hawkins.io/2012/07/advanced_caching_revised/
Not free, but this series is what really made the various forms of caching click for me:
http://www.pluralsight.com/courses/rails-4-1-performance-fundamentals

Rails fragment caching of collection

I have a Rails 4.1 app that, on a particular page, retrieves a list of orders and lists them in a table. It's important to note that the list differs depending on the logged-in user.
To improve the performance of this, I am looking to cache the partial for each order row. I am considering doing it like this:
_order_list.html.erb
<% cache(@orders) do %>
  <%= render @orders %>
<% end %>
_order.html.erb
<% cache(order) do %>
  ...view code for order here
<% end %>
However, I'm unsure about the caching of the collection (@orders). Will all users then be served the same set of cached @orders (which is not desired)?
In other words, how can I ensure that the entire collection of @orders is cached for each user individually?
Will all users then be served the same set of cached @orders (which is not desired)?
Actually, cache_digests does not cache @orders themselves. It caches the HTML part of the page for a particular given object or set of objects (e.g. @orders). Each time a user requests the page, the @orders variable is set in the controller action and its digest is compared to the cached digest.
So, assuming we retrieve @orders like this:
def index
  @orders = Order.where(:id => [1, 20, 34]).all
end
What we are going to get is a cached view with a stamp like this:
views/orders/1-20131202075718784548000/orders/20-20131220073309890261000/orders/34-20131223112753448151000/6da080fdcd3e2af29fab811488a953d0
Note that the ids of the retrieved orders are embedded in that stamp, so each user with his/her own unique set of orders should obtain his/her own individual cached view.
But here come some serious downsides of this approach:
Page caches are always stored on disk. That means the cache stamp can't be arbitrarily long: as soon as you retrieve a large batch of orders at once, you exceed your OS's limit for filenames (e.g. 255 bytes on Linux) and end up with a runtime error.
Orders are dynamic content. As soon as at least one of them updates, your cache becomes invalid. Generating a cache and saving it to disk is a fairly expensive operation, so it is better to cache each order individually. That way you only have to regenerate the cache for a single order instead of regenerating the whole collection's cache.
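One common pattern for the per-user concern is to scope the collection key to the user explicitly. A sketch, assuming current_user comes from your authentication layer:

<% cache ["orders", current_user.id, @orders.maximum(:updated_at), @orders.size] do %>
  <%= render @orders %>
<% end %>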

Show next set of results in page with Rails, like Facebook-style "older posts" link to see next set of news items

I'm still pretty new to Rails and need your help: I have been building a social fitness analytics site (zednine.com) that has an activity stream listing workouts posted on the site. Several pages currently show the 10 most recently updated workouts. I'd like to add an "Older workouts" link at the bottom. On click, this should show the next 10 workouts in the page, immediately below the first 10, with a new "Older" link below (now 20 displayed in the page) -- just like the news stream on Facebook and several other social networks.
What I've tried so far:
I'm currently using a find with :limit to get the first N results
I can set up a unique find with :limit and :offset for each set of N results with hidden divs, but that's lame and does not extend well
I also looked at:
pagination, including will_paginate, but it's not clear whether this can help with chunking within the same page?
collections...?
What is the right/a good way to do this?
Also, how can I include records from multiple tables in this sort of stream? E.g., list could include workouts from one table, journal entries from another, comments from a third, all intermixed and sorted by date?
Thank you!
will_paginate will do the job; just pass in the page you want:
<%= link_to "Older Posts", model_route_path(:page => next_page) %>
As for the second question, simply create a Feed model (or tack the behavior onto an existing one). Then have a method that fetches recent entries from the various other models and sorts them by created_at date. I would probably implement a recent class method on each of the models and call that from your Feed object.
Models:
class Feed
  def self.entries
    entries = []
    entries += Journal.recent
    entries += Comment.recent
    # etc
    entries.sort_by(&:created_at)
  end
end

class Journal < ActiveRecord::Base
  def self.recent
    order("created_at DESC").limit(10)
  end
end
Or something like that. You will have to be very careful about scalability here.
