Caching a particular partial in Rails 3.0.x - ruby-on-rails

I have a home page that renders several partials, and it also has a session header (login).
One partial contains a paginated set of books. I want to cache this partial, since it only gets updated once a week.
Question 1: How do I cache that particular partial (without hitting
the db)?
Question 2: How do I delete (expire) the cached content when I update
the Book model?

You're looking for fragment caching here, which occurs on the view layer. Fragment caching and expiration of stored contents is surprisingly easy to do. You have a list of books, so let's say your view looks a bit like this:
<ul>
  <% @books.each do |book| %>
    <li><%= book.name %></li>
  <% end %>
</ul>
To enable caching for just this bit, simply wrap it in cache:
<% cache do %>
  <ul>
    <% @books.each do |book| %>
      <li><%= book.name %></li>
    <% end %>
  </ul>
<% end %>
Of course, this doesn't name the cache or do anything really special with it... while Rails will auto-select a unique name for this cache fragment, it won't be really helpful. We can do better. Let's use DHH's key-based cache expiration technique and give the cache a name relating to its content.
<% cache ['book-list', *@books] do %>
  <ul>
    <% @books.each do |book| %>
      <li><%= book.name %></li>
    <% end %>
  </ul>
<% end %>
Passing arguments into cache builds the cache key from the supplied arguments. Strings are passed in directly -- so, here, the cache will always be prefaced with 'book-list'. This is to prevent cache collisions with other places you might be caching the same content, but with a different view. For each member of the @books array, Rails will call cache_key: for ActiveRecord objects, this yields a string composed of its model, ID, and crucially, the last time the object was updated.
This means that when you update the object, the cache key for this fragment will change. In other words, it's automatically getting expired -- when a book is updated, this cache statement will search for a nonexistent key, conclude it doesn't exist, and populate it with new content. Old, stale content will linger in your cache store until evicted by memory or age constraints (memcached does this automatically).
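To make that concrete, here is roughly what cache_key returns for an ActiveRecord object. The values below are made up and the exact format varies slightly between Rails versions; this is only a console sketch:
# rails console -- illustrative values only
book = Book.first
book.cache_key
# => "books/42-20130822193359"   # model/id-updated_at timestamp

book.touch                        # or any update that bumps updated_at
book.cache_key
# => "books/42-20130825101502"   # new key, so the old fragment is simply never read again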
I use this technique in a number of production applications and it works wonderfully. For more information, check out that 37signals post, and for general caching information in Rails, see the Ruby on Rails caching guide.

"There are only two hard problems in Computer Science:
cache invalidation and naming things."
-- Phil Karlton
Caching
The Rails Guide to caching is always a good entry point for the built-in caching strategies Rails has to offer. Anyway, here is my very simple approach to caching:
# _partial.html.erb
<% cache(some_key, :expires_in => 1.week) do %>
  <%# ... content %>
<% end %>
Now some_key can be any string, as long as it is unique. But let's try to be a bit more clever about it and make the key depend on the list of books. Say you pass in the array of books a query returned: Rails then calls cache_key on each of its entries and constructs a unique key for the collection, so when the collection changes, the key changes. That works because cache_key is implemented in ActiveRecord::Base and is thus available on all models, and it even uses the timestamps if available.
But then again this will hit the db every time a request is made.
The same in code:
# controller_method
@weekly_books = Book.where('condition')

# _partial.html.erb
<% cache(@weekly_books) do %>
  <%# ... content %>
<% end %>
To avoid hitting the db too often you can also cache the query itself by wrapping the call:
# Book.rb
def self.weeklies
  Rails.cache.fetch("book_weeklies", :expires_in => 1.day) do
    Book.where('condition').all  # .all forces the query so an array, not a lazy relation, is cached
  end
end

# controller_method
@weekly_books = Book.weeklies

# _partial.html.erb
<% cache(@weekly_books) do %>
  <%# ... content %>
<% end %>
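If you also want to expire things explicitly when a book changes (Question 2), you can delete the cached query result from a model callback. A minimal sketch, assuming the "book_weeklies" key used above:
# Book.rb -- sketch: drop the cached query whenever a book is saved or destroyed
class Book < ActiveRecord::Base
  after_save    :expire_weeklies_cache
  after_destroy :expire_weeklies_cache

  private

  def expire_weeklies_cache
    Rails.cache.delete("book_weeklies")
  end
end
The fragment keyed on the collection (cache(@weekly_books)) then rebuilds itself on the next request, because its key follows the freshly reloaded collection.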

Related

Will this type of render create memory bloats in Rails?

I have four different Rails-apps (ruby 2.3.3/rails 4.1.13) on Heroku running on Unicorn (1 worker per app).
Two of them have a few thousand visitors per day, the other two around a hundred a day. They all have a major issue in common: they are always running out of memory on Heroku! They are almost always above 100% of the 500 MB limit on Heroku, thus using the slow swap memory. As all four share the same issue, I believe there is something in my programming habits that causes this, possibly the way I render sub-items in partial lists (specifically after reading this question). I would like to hear whether this code is likely to bloat memory in my apps:
I have three files (the code obviously is quite simplified):
#show
render 'partials/product_list', :vars => { :products => Product.where(:foo => "bar"), :display_format => :grid }
#partials/product_list
<% if vars[:products].empty? %>
  No products exist
<% else %>
  <% if vars[:display_format].eql?(:grid) %>
    <div class="abc">
      <% vars[:products].each do |product| %>
        <%= render 'partials/product_item', :product => product %>
      <% end %>
    </div>
  <% elsif vars[:display_format].eql?(:wide) %>
    <ul>
      <% vars[:products].each do |product| %>
        <li><%= render 'partials/product_item', :product => product %></li>
      <% end %>
    </ul>
  <% elsif vars[:display_format].eql?(:rectangular) %>
    <div class="ghi">
      <ul class="rectangular_container">
        <% vars[:products].each do |product| %>
          <li class="rectangle"><%= render 'partials/product_item', :product => product %></li>
        <% end %>
      </ul>
    </div>
  <% else %>
    <div class="jkl">
      <% vars[:products].each do |product| %>
        <%= render 'partials/product_item', :product => product %>
      <% end %>
    </div>
  <% end %>
<% end %>
#partials/product_item
<% if vars[:display_format].eql?(:wide) %>
  <h1><%= product.name %></h1>
  <p><%= product.description %></p>
<% elsif vars[:display_format].eql?(:rectangular) %>
  <%# Similar, but with lots of divs and other html %>
<% end %>
This may seem very weird but I reference partials/product_list from here and there all over the website (with different layouts and different product sets) and if I want to change the setup of e.g. grid-layout, I want to do it in one place only.
I have started passing references instead, using Product.all.pluck(:id) in #show and then starting partials/product_item with product = Product.find(vars[:product]), but I can't really tell if this makes any difference. Edit: As max says in the comments, this is probably less efficient due to the number of .find calls I need to make.
Before I dig really deep into this, I have a few questions:
Is this something that strikes you as: "YES, this type of rendering will bloat your memory!"
If yes: How should I solve this type of rendering without memory problems?
Would there be a difference if I would use :collection, :layout etc instead of my own variables?
Any light on this issue would be highly appreciated!
One issue that stands out to me is that you're doing lookups in your views, which breaks the MVC pattern. In my opinion, you should put that code into the controller for that action,
so rewrite this line:
render 'partials/product_list', :vars => { :products => Product.where(:foo => "bar"), :display_format => :grid }
to this:
# in show controller action
@products = Product.all # <- This will get cached by rails

# in view
render 'partials/product_list', :vars => { :products => @products, :display_format => :grid }
This removes the Product call from the view and replaces it with an instance variable. Keep in mind that the default Rails cache is memory based, which may be causing some of the bloat as your data grows. If you have memcached available, changing your cache store to memcached would help too.
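Switching stores is a one-line config change; a minimal sketch, assuming the dalli gem is in your Gemfile and memcached is running at the default address:
# config/environments/production.rb -- sketch, requires the dalli gem
config.cache_store = :mem_cache_store, "localhost:11211"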
You can just chain where clauses onto your instance variable like you do now, and Arel will handle it.
# partial rendering with chained scope
render 'partials/product_list', :vars => { :products => @products.where(foo: "bar"), :display_format => :grid }
If you are finding that further filtering of items is becoming a common pattern, then you can use a scope in your Product model that is chainable on the @products instance variable. This keeps it in Arel (SQL) and maintains the lazy loading/query caching while simplifying your code a bit.
# in Product model
scope :filter_to_bar, ->(term) { where(foo: term) }

# partial rendering with chained scope
render 'partials/product_list', :vars => { :products => @products.filter_to_bar("bar"), :display_format => :grid }
Personally I say keep your data fetches in SQL when you can, as this puts the strain on SQL and is lazy loaded, i.e. it doesn't affect Ruby memory. SQL done properly is fast; Ruby is much slower and more memory intensive. SQL is only slow when your queries scale badly or when you're making lots of repeated small queries and putting unnecessary hits on the database. In that case, fetch the data once and work with it in the Rails application as suggested. While this adds to your application's memory consumption, it will be more performant.
One thing you can do to speed up SQL fetches is to implement pagination or some sort of infinite scroll mechanism. This way you only load the data you need, when you need it.
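As a sketch of that, using a pagination gem such as kaminari (an assumption; any pager works the same way), only one page of records is ever loaded per request:
# controller -- sketch using the kaminari gem
@products = Product.where(foo: "bar").page(params[:page]).per(25)

# view
<%= paginate @products %>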
Also, if you want a faster version of Rails' partial pattern, check out the cells gem as suggested. It will require some refactoring but should be faster than Rails' pure partials. However, caching will probably take you much further with less refactoring.

How do I group objects returned by a REST API based on a value inside that object?

I'm pretty new to ruby/rails so bear with me.
I'm attempting to take the results returned by the JIRA REST API and render them in a view. I can do that pretty easily using the jira-ruby gem. The problem I'm having is grouping the results by a specific object inside the object returned by the API (in this case, a "components" field object inside of an "issue" object). I've attempted to use group_by and chunk for this, but I'm basically getting the inverse of what I want. Both methods return the same result.
In my controller I have:
@issues = @jira_client.Issue.all
In my view I have:
<% @issues.chunk { |issue_comp| issue_comp.fields["components"] }.each do |comps, issues| %>
  <h2>
    <% comps.each do |comp| %>
      <%= comp["name"] %>
    <% end %>
  </h2>
  <ul>
    <% issues.each do |issue| %>
      <li><p><%= link_to issue.key, "http://localhost:2990/jira/browse/#{issue.key}" %> - <%= issue.summary %></p></li>
    <% end %>
  </ul>
<% end %>
What I end up with is:
CompA CompB
IssueA
CompC CompD
IssueB
CompA CompC CompD
IssueC
etc.
What I want is:
CompA
IssueA
IssueC
CompB
IssueA
CompC
IssueB
IssueC
CompD
IssueB
IssueC
The object returned by the API is a pretty convoluted object (i.e. giant array of hashes inside arrays inside of hashes). So, I have to dig pretty deep to get at the component name.
I get the feeling that this shouldn't be as complicated as it seems but I have a terrible habit of making things more complicated than they need to be. What am I doing wrong here?
EDIT: I created a gist of the full dump that is returned with the above call. Notice the "components" array:
jira-ruby gem dump for all issues
I took a look at the data you're getting back from Jira. This is how it looks to me:
There is an outer array of Jira Issues.
Each issue has an "attrs" hash
Each "attrs" hash contains components.
If this understanding is correct, I think you are attempting to invert that structure so that you can get a complete list of components, then iterate over each of them, and show the Issues that belong to that component.
If that understanding is correct, you have two basic choices:
Check if you can ask Jira for that information (so you don't have to generate it yourself), or
Build your own data structure (in memory or in a local DB, as you prefer):
Some sample code for building a useful structure in-memory:
# in a controller, model, or service class (as you wish)
@components = {}
@jira_issues_array.each do |jira_issue| # from your API call
  jira_issue[:components].each do |jira_component|
    @components[jira_component[:key]] ||= { name: jira_component[:name], issue_keys: [] }
    @components[jira_component[:key]][:issue_keys] << jira_issue[:key]
  end
end
In your view, you could iterate over @components like this:
# some html.erb file:
<h1>Components and Issues</h1>
<ul>
  <% @components.each do |component_key, component| %>
    <li><%= component[:name] %>
      <ul> <!-- nested -->
        <% component[:issue_keys].each do |issue_key| %>
          <li><%= @jira_issues_array.find { |issue| issue[:key] == issue_key }[:name] %></li>
        <% end %>
      </ul>
    </li>
  <% end %>
</ul>
Note: Like a typical lazy programmer, I haven't tried this out, but it's really intended to show how you might go about it. For example, each issue's name is embedded in the attrs section, so you'll need to dig that out a bit.
Finally, if anyone would find this useful, I use this to analyse and reformat JSON.
HTH - any questions or problems, post a comment.

Rails Russian Doll Caching and N+1

From what I understand of Russian doll caching in Rails, it would be detrimental to eager load related objects or object lists when we are doing RDC (Russian Doll Caching), because with RDC we just load the top-level object from the database, look up its cached rendered template, and serve that. If we were to eager load related object lists, that would be useless whenever the cache is not stale.
Is my understanding correct? If yes, how do we make sure that we eager load all the related objects on the very first call, so as not to pay the cost of N+1 queries during the very first load (when the cache is not warm)?
Correct - when loading a collection or a complicated object with many associations, a costly call to eager load all objects and associations can be avoided by doing a fast, simple call.
The Rails guide for caching does have a good example; however, it's split up a bit. Looking at the common use case of caching a collection (i.e. the index action in Rails):
<% cache("products/all-#{Product.maximum(:updated_at).try(:to_i)}") do %>
All available products:
<% Product.all.each do |p| %>
<% cache(p) do %>
<%= link_to p.name, product_url(p) %>
<% end %>
<% end %>
<% end %>
This (condensed) example does 1 simple DB call Product.maximum(:updated_at) to avoid doing a much more expensive call Product.all.
For a cold cache (the second question), it is important to avoid N+1s by eager-loading associated objects. However, we know we need to do this expensive call because the first cache read for the collection missed. In Rails, this is typically done using includes. If a Product has many Orders, then something like:
<% cache("products/all-#{Product.maximum(:updated_at).try(:to_i)}") do %>
All available products:
<% Product.includes(:orders).all.each do |p| %>
<% cache(p) do %>
<%= link_to p.name, product_url(p) %>
Bought at:
<ul>
<% p.orders.each do |o| %>
<li><%= o.created_at.to_s %></li>
<% end %>
</ul>
<% end %>
<% end %>
<% end %>
In the cold cache case we still do a cache read for the collection and for each member; however, in the partially warm cache case, we skip rendering for a portion of the members. Note that this strategy relies on a Product's associations being correctly set up to touch the product when associated objects are updated.
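For completeness, that association setup looks roughly like this (model names taken from the example above):
# app/models/order.rb -- sketch: touch bumps the product's updated_at,
# which changes its cache_key and so expires the cached fragment
class Order < ActiveRecord::Base
  belongs_to :product, touch: true
end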
Update: This blog post describes a more complex pattern to further optimize building responses for partially cached collections. Instead of rebuilding the entire collection, it bulk fetches all available cached values then does a bulk query for the remaining values (and updates the cache). This is helpful in a couple of ways: the bulk cache read is faster than N+1 cache reads and the bulk query to the DB to build the cache is also smaller.
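A rough sketch of that pattern using Rails.cache.fetch_multi (available from Rails 4.1); the key handling and the render_product helper here are illustrative, not the blog post's exact code:
# one bulk cache read; the block runs only for the keys that missed
products = Product.all.to_a
by_key   = products.index_by(&:cache_key)

fragments = Rails.cache.fetch_multi(*by_key.keys) do |key|
  render_product(by_key[key])   # hypothetical helper that rebuilds a single member's fragment
end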

Rails - How to exclude blocks of code from fragment cache

I'm using fragment caching, but I have inline code that is user specific, like:
<% cache @page do %>
  stuff here
  <% if current_user %>
    user specific
  <% end %>
  more here
<% end %>
So I want to exclude the several blocks of code that are user specific. Is there a way to do that in Rails, or should I make an if statement at the beginning and keep different caches for logged-in users and regular visitors? (I will have major duplication of code that way.)
For per-user fragments, you can put the models in an array:
<% cache [@page, current_user] do %>
Rails will make a cache-key out of them, like:
pages/page_id-page_timestamp/users/user_id-user_timestamp
This way your fragments will be invalidated on a user/page update since the time-stamps are coming from their updated_at (see cache_key for details).
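If only a small block is user specific, another option is to keep the shared parts in their own fragments and leave the per-user bit uncached (or keyed on current_user). A rough sketch; the 'shared-top' and 'shared-bottom' strings are just illustrative key suffixes:
<%# shared content, cached once for all users %>
<% cache [@page, 'shared-top'] do %>
  stuff here
<% end %>

<%# user-specific block, not cached (or keyed on current_user) %>
<% if current_user %>
  user specific
<% end %>

<% cache [@page, 'shared-bottom'] do %>
  more here
<% end %>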

Cache only the main content in rails

Using Rails 3.1.1 and Heroku.
I believe this should be a fairly easy fix but I cannot find (and easily verify) how to do this. I have a very slow controller (6 sec) Product#show, with lots of N+1 and other things I will have to solve.
The website is a two-column website (main-column and right-column) where the main content from Product#show is shown in one column and daily products are shown in the other, including a "Random Product from the Database".
What I want to do is to let the content in main-column that is created by Product#show be cached (and thus bypass the controller and win 6 seconds). I do, however, want the right column to be dynamic (and loaded for each page request).
If I use caches_page :show it will cache the entire page, including the right-column, which means I have to expire the cache every day in order to be able to load a new Daily Product. Not a good solution.
If I use cache('product-show' + @product.slug) do it only caches the view (right?) and I still have to go through the controller.
So, how can I solve this?
You can achieve this with fragment caching like below:
def show
  if !fragment_exist?("main_content")
    @products = Product.all
    @users_count = User.count
  end
  @random_products = Product.order("RANDOM()").limit(10)
end
show.html.erb
<!-- MAIN CONTENT -->
<% cache("main_content") do %>
  <%= @users_count %>
  <% @products.each do |product| %>
    <%= product.name %>
  <% end %>
<% end %>

<!-- SIDE CONTENT -->
<% @random_products.each do |product| %>
  <%= product.name %>
<% end %>
Use fragment caching, and don't load things in the controller.
If you have a very complex query, let it live in the controller as a scope, and only evaluate it in the view.
If you have a complex process that must run to build the query, use a helper method.
If you only build lazy queries in the controller, then none of them will be executed when the cache is hit.
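A minimal sketch of that idea (names and attributes are illustrative): the controller only builds relations, and the expensive one is evaluated inside the cached fragment, so a cache hit skips its query entirely:
# controller -- relations are lazy; no SQL runs for @related here
def show
  @product         = Product.find(params[:id])
  @related         = Product.where(category: @product.category)  # not executed yet
  @random_products = Product.order("RANDOM()").limit(10)         # sidebar, evaluated in the view on every request
end

# show.html.erb
<% cache ['product-show', @product] do %>
  <% @related.each do |p| %>   <%# this query runs only on a cache miss %>
    <%= p.name %>
  <% end %>
<% end %>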
