Rails 3 and Memcached - Intelligent caching without expiration - ruby-on-rails

I am implementing caching in my Rails project via Memcached, particularly to cache side column blocks (most recent photos, blogs, etc.), and currently I have them expiring the cache every 15 minutes or so. That works, but it would be better if the cache refreshed whenever new content is added or updated, rather than on a timer.
I was watching the episode of the Scaling Rails screencasts on Memcached http://content.newrelic.com/railslab/videos/08-ScalingRails-Memcached-fixed.mp4, and at 8:27 in the video, Gregg Pollack talks about intelligent caching in Memcached in a way where intelligent keys (in this example, the updated_at timestamp) are used to replace previously cached items without having to expire the cache. So whenever the timestamp is updated, the cache would refresh as it seeks a new timestamp, I would presume.
I am using my "Recent Photos" sideblock for this example, and this is how it's set up...
_side-column.html.erb:
<div id="photos">
<p class="header">Photos</p>
<%= render :partial => 'shared/photos', :collection => @recent_photos %>
</div>
_photos.html.erb
<% cache(photos) do %>
<div class="row">
<%= image_tag photos.thumbnail.url(:thumb) %>
<h3><%= link_to photos.title, photos %></h3>
<p><%= photos.photos_count %> Photos</p>
</div>
<% end %>
On the first run, Memcached caches the block as views/photos/1-20110308040600 and will reload that cached fragment when the page is refreshed, so far so good. Then I add an additional photo to that particular row in the backend and reload, but the photo count is not updated. The log shows that it's still loading from views/photos/1-20110308040600 and not grabbing an updated timestamp. Everything I'm doing appears to be the same as what the video is doing, what am I doing wrong above?
In addition, there is a part two to this question. As you see in the partial above, the @recent_photos query is called for the collection (out of a module in my lib folder). However, I noticed that even when the block is cached, this SELECT query is still being made. I attempted to wrap the entire partial in a block at first as <% cache(@recent_photos) do %>, but obviously this doesn't work, especially as there is no real timestamp on the whole collection, just its individual items of course. How can I prevent this query from being made if the results are cached already?
UPDATE
In reference to the second question, I found that unless Rails.cache.exist? may just be my ticket, but what's tricky is the wildcard nature of using the timestamp...
UPDATE 2
Disregard my first question entirely, I figured out exactly why the cache wasn't refreshing: the updated_at field wasn't being updated. The reason is that I was adding/deleting an item that is a nested resource of a parent, and I need to implement a "touch" on that association in order to update the updated_at field in the parent.
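For reference, a minimal sketch of that :touch fix (model and association names are made up, since the actual parent/child models aren't shown in the question):

```ruby
class Photo < ActiveRecord::Base
  # :touch => true bumps album.updated_at whenever a photo is
  # created, updated, or destroyed, which changes the album's
  # cache_key and causes the cached fragment to refresh.
  belongs_to :album, :touch => true
end
```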
But my second question still stands: the main @recent_photos query is still being called even if the fragment is cached. Is there a way, using Rails.cache.exist?, to target a cache that is named something like /views/photos/1-2011random ?

One of the major flaws with Rails caching is that you cannot reliably separate the controller and the view for cached components. The only solution I've found is to embed the query in the cached block directly, but preferably through a helper method.
For instance, you probably have something like this:
class PhotosController < ApplicationController
def index
# ...
@recent_photos = Photo.where(...).all
# ...
end
end
The first instinct would be to run that query only if the view will need it, for example by testing for the presence of the cached content. Unfortunately there is a small chance that the content will expire in the interval between testing for it being cached and actually rendering the page, which will lead to a template rendering error when the nil-valued @recent_photos is used.
Here's a simpler approach:
<%= render :partial => 'shared/photos', :collection => recent_photos %>
Instead of using an instance variable, use a helper method. Define the helper method to do the same load you would have done in the controller:
module PhotosHelper
def recent_photos
@recent_photos ||= Photo.where(...).all
end
end
In this case the value is memoized, so multiple calls to the same helper method only trigger the query once. This may not be necessary in your application and can be omitted. All the method is obligated to do is return a list of "recent photos", after all.
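The mechanics of that ||= memoization can be sketched in plain Ruby; here the database query is stubbed with an array and a counter, so this runs outside Rails:

```ruby
# Stand-in for a view/helper context; the "query" is stubbed.
class FakeHelperContext
  attr_reader :query_count

  def initialize
    @query_count = 0
  end

  def recent_photos
    # ||= runs the right-hand side only when @recent_photos is nil
    @recent_photos ||= begin
      @query_count += 1
      ["photo_1", "photo_2"] # stand-in for Photo.where(...).all
    end
  end
end

ctx = FakeHelperContext.new
ctx.recent_photos
ctx.recent_photos
ctx.query_count # => 1, despite two calls
```

Because the helper is lazy, the query also never runs at all when the surrounding cache block is a hit and its body is skipped.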
A lot of this mess could be eliminated if Rails supported sub-controllers with their own associated views, which is a variation on the pattern employed here.

As I've been working further with caching since asking this question, I think I'm starting to understand exactly the value of this kind of caching technique.
For example, I have an article page that needs a variety of data, including queries against other tables; I might make five to seven different queries per article. Caching the article in this way reduces all those queries to one.
I am assuming that with this technique there always needs to be at least one query, as there needs to be some way to tell whether the timestamp has been updated or not.

Related

Rails 4 Fragment Caching

I'm trying to increase performance for my app so I'm looking into fragment caching.
I'm trying to understand what to cache. For example, on all pages of my site I display a list of recent articles.
In my application controller I have a filter that sets:
@recent_articles = Article.get_recent
I have the following in my view/footer:
- cache(cache_key_for_recent_articles) do
%h3 RECENT ARTICLES
- @recent_articles.each do |article|
.recent-article
= link_to add_glyph_to_link("glyphicon glyphicon-chevron-right", article.name), article_path(article, recent: true)
- if Article.count > 4
= link_to "MORE ARTICLES", articles_path(), class: "btn btn-primary more-articles"
My question is: am I properly caching this? I'm tailing the logs, but I still see a query for the articles, so I'm assuming not. It's not clear to me what happens when I query in the controller but cache only a section of the page.
Is this a place for low level caching rather than fragment caching?
Thanks.
You're doing it right. It might seem silly, because it always has to make the db hit anyway, but the gains can be substantial. Imagine each article had threaded comments, with images. In that case, if you kept the controller exactly the same, the same caching construct would save you a tremendous amount of db effort. So yes, if you can pull from memcached instead of running through Haml with a bunch of Rails helpers (those link_tos aren't free) you'll save a bit for sure, but the real gains come when you can subtly restructure your architecture (as lazily as possible) to really take advantage. As for that initial hit on Article: your db should do a pretty good job of caching that call, and I'm not sure you'd want to cache it too aggressively anyway, given the name of the method.
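The question's cache_key_for_recent_articles helper isn't shown; a common way to build it is from the collection's count plus its newest updated_at, so the key changes (and the fragment naturally misses) whenever an article is added, removed, or touched. With a real Article model that would be roughly "recent-articles/#{Article.count}-#{Article.maximum(:updated_at).to_i}". The same idea on a plain collection, so the sketch runs anywhere (timestamps are plain integers here for illustration):

```ruby
# Build a cache key from a collection's size and newest timestamp.
def cache_key_for(name, records)
  newest = records.map { |r| r[:updated_at] }.max
  "#{name}/#{records.size}-#{newest}"
end

articles = [
  { :id => 1, :updated_at => 20140308040600 },
  { :id => 2, :updated_at => 20140309120000 },
]

cache_key_for("recent-articles", articles)
# => "recent-articles/2-20140309120000"
```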

Rails 2 cache view helper not saving to memcached

I have a block
<% cache 'unique_key', 60.minutes.from_now do %>
...
<% begin %>
...
<% rescue %>
...
<% end %>
<% end %>
and I'm trying to make the implementation more robust by only caching (and thus showing the user) the rescue message if there isn't a previous value already in the cache. Currently, if the response in the begin block raises an error for any reason, I cache the user-visible error message. I would prefer to fall back on the old cached data. The problem that I can't get past is:
Where is cache storing the data?
Every time I try Rails.cache.read 'unique_key', I get nil back. Is cache not storing the value in memcached? Is there a way that I can dump the cache to screen?
I couldn't follow the Rails source. It seemed to me that the fragment_for method in cache was a Rails 3 thing, and thus I didn't debug further.
The cache view helper constructs a cache key based on the arguments you give it. At a minimum it adds the prefix 'views/' to the key.
You can use the fragment_cache_key helper to find out what cache key rails is using for any of your calls to cache. If you just want to grab what is currently stored, read_fragment does that. Of course with your particular usage, if your block is executed again it is because the 60 minutes are up: the cached value has been deleted from memcache.
With the memcache store you can't list all of the keys currently in the store; that's just something memcached itself doesn't support.
I solved this by using the fetch method. I used
<% Rails.cache.fetch('unique_key', :expires_in => 60.minutes){
begin
...
rescue
...
end
} %>
When I did this, I could successfully find the key. I'm still not sure why I couldn't find the cached data after adding the fragment_cache_key that I found, but using Rails.cache.fetch seemed to do the trick.

Improving performance on loading a long list of facebook friends

As part of my rails project, I have a feature that allows a user to issue invites to their FB friends. I'm using fb_graph for the API calls, and the below is a sampling of the code from the controller when the user hits the invite page.
This operation gets really expensive. I've seen it take longer than 30 seconds for users with upwards of 1000 friends. Also, this code re-executes each time a user hits the invite page. While a user's FB friends list isn't exactly static, it should be okay not recalculating this on every request.
So what I want to do is improve this code and make it more efficient. I can think of a few potential ways to do this, but what makes the most sense in this case? This is a bit more open-ended than I usually like to ask on SO, but as I'm still relatively new to programming, I'm curious as much about what you would or wouldn't do as about how to do it.
Here are some ideas on optimizations I could make:
1) Only optimize within a session. The code executes the first time the page is hit and will persist for the rest of the session. I'm actually not sure how to do this.
2) Persist to the database. Add a column to the user table that will hold the friends hash. I could refresh this data periodically using a background job (perhaps once a week?)
3) Persist with caching. I'm not exactly sure what's involved with this or whether it's an appropriate use-case. My feeling is that option 2 requires a lot of manual maintenance, and that perhaps there's a nice caching solution for this that handles expirations, etc., but I'm not sure.
Other ideas? Appreciate your thoughts on the options.
# fetch full array of facebook friends
@fb_friends = current_user.facebook.fetch.friends
# strip out only id, name, and photo for each friend
@fb_friends.map! { |f| { identifier: f.identifier, name: f.name, picture: f.picture }}
# sort alphabetically by first name
@fb_friends.sort! { |a,b| a[:name].downcase <=> b[:name].downcase }
# split into two lists. those already on vs not on network
@fb_friends_on_network = Array.new
@fb_friends.each do |friend|
friend_find = Authorization.find_by_uid_and_provider(friend[:identifier], 'facebook')
if friend_find
@fb_friends_on_network << friend_find.user_id
@fb_friends.delete(friend)
end
end
UPDATE #1
Adding a bit more on an initial experiment I conducted. I added a column to my user table that holds the @fb_friends array (post-processing the transformations shown above). Basically the controller code above is replaced with simply @fb_friends = current_user.fbfriends. I thought this would cut the load significantly, as there is no more call to Facebook, not to mention all the processing done above. This did save some time, but not as much as expected: my own friends list took about 6 secs to load on my local machine, and after these changes it's down to 4 secs. I must be missing something bigger here on the load issue.
UPDATE #2
Upon further investigation, I learned that almost half of data transfer was attributed to the form I was using for the "Invite" buttons. The form would load once for each friend and looked like this:
<%= form_for([@group, @invitation], :remote => true, :html => { :'data-type' => 'html', :class => 'fbinvite_form', :id => friend[:identifier]}) do |f| %>
<%= f.hidden_field :recipient_email, :value => "facebook@meetcody.com" %>
<div class = "fbinvite btn_list_right" id = "<%= friend[:identifier] %>">
<%= f.submit "Invite", :class => "btn btn-medium btn-primary", :name => "fb" %>
</div>
<% end %>
I decided to remove the form and inside place a simple button:
<div class = "fbinvite_form" id = "<%= friend[:identifier] %>" name = "fb">
<div class = "btn btn-small">
Invite
</div>
</div>
I then used Ajax to detect a click and take the appropriate actions. This change literally cut the data transfer in half: before, loading about 500 friends took ~650kb; now it's down to ~330kb.
I'm thinking I will go back and try what I tried in Update # 1, to do pre-processing. Combined I'm hoping I can get this down to a ~2 sec operation.
UPDATE #3
I ended up installing Miniprofiler to learn more about what could be slowing this operation down, and learned that my loop above is terribly inefficient, as it makes a trip to the DB for every friend. I posted a separate question and got help reducing the trips down to just one. I then went ahead and implemented the pre-processing I mentioned in Update #1. With all these changes, I have it down to ~700ms, which is remarkable considering it was taking 20+ seconds before going down this path!
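The single-trip version of that loop probably looks something like the following. The Authorization.where(...) call is assumed (the linked question isn't shown) and its result is stubbed as a plain array of uids so the sketch runs standalone:

```ruby
fb_friends = [
  { :identifier => "100", :name => "Alice" },
  { :identifier => "200", :name => "Bob" },
  { :identifier => "300", :name => "Carol" },
]

# In Rails this would be one query instead of N, roughly:
#   on_network_uids = Authorization.where(:provider => 'facebook',
#     :uid => fb_friends.map { |f| f[:identifier] }).map(&:uid)
on_network_uids = ["200"] # stubbed query result

# Split in memory instead of mutating the array while iterating over it
on_network, off_network =
  fb_friends.partition { |f| on_network_uids.include?(f[:identifier]) }

on_network.map { |f| f[:name] }  # => ["Bob"]
off_network.size                 # => 2
```

partition also sidesteps the bug in the original loop, where @fb_friends.delete(friend) mutated the array mid-iteration.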
If you can run some queries in parallel, I suggest you take a look at the futoroscope gem.
As you can see from its announcement blog post, it tries to solve the same problem of making simultaneous API queries. It seems to have pretty good support and good test coverage.

Rails 3 Fragment Caching Output Issue

I am running into an issue when looking into fragment-level caching within my Rails 3.0.4 application with memcached. I am a bit confused with what is going on, but I think it's something to do with the way the output is being pulled from within the caching region. I am running memcached locally in -vv mode, and can see the key for the fragment getting saved/pulled correctly, the problem is the value of the item within memcached.
Here is what I'm doing:
< ... html before ... >
<%= cache("item_#{i.id}") do %>
<%= render :partial => 'shared/item', :locals => { :item => i, :functionality => [:set_as_default] } %>
<% end %>
< ... html after ... >
When I look at the value of the key within the cache, it has HTML from within that fragment cache block, but ALSO from OUTSIDE of it (from both the "html before" and "html after" areas). Here is the interesting part, though, and it's why I think this is related to capturing the output: it doesn't include the whole page, only some of the HTML before and some after.
According to the Rails fragment caching guide, I think I'm doing things correctly (http://guides.rubyonrails.org/caching_with_rails.html#fragment-caching). Does anyone have thoughts as to what could be going on?
Your help is much appreciated!
-Eric
In this case, you are using ERB incorrectly. Basically, take out the = sign. What you're doing is returning the value of the block too, which is why you are seeing the duplicated output.
<% cache("item_#{i.id}") do %>
Also, ActiveRecord objects respond to an internally baked-in #cache_key method. Try to take advantage of that. The default #cache_key for an ActiveRecord object uses the class name, the object id, and the updated_at timestamp. The cache method can take multiple args or an array, and it will in turn call cache_key on every object that responds to it. Using this method means you also get a cache miss when the object is updated, which is pretty cool stuff. So, IIRC:
<% cache("item",i) do %>
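For illustration, here is a rough, framework-free sketch of the string the default #cache_key produces (the model name "items" and the id are assumed for this example; the timestamp format matches the views/photos/1-20110308040600 keys seen earlier in this thread):

```ruby
# Roughly what item.cache_key evaluates to for an Item with id 1,
# last updated 2011-03-08 04:06:00 UTC:
#   "<pluralized model>/<id>-<updated_at in YYYYMMDDhhmmss>"
updated_at = Time.utc(2011, 3, 8, 4, 6, 0)
key = "items/1-#{updated_at.strftime('%Y%m%d%H%M%S')}"
# => "items/1-20110308040600"
```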

Rails optimization (with activerecord and view helpers)

Is there a way to do this in Rails:
I have an activerecord query
@posts = Post.find_by_id(10)
Anytime the query is called, SQL is generated and executed at the DB that looks like this
SELECT * FROM 'posts' WHERE id = 10
This happens every time the AR query is executed. Similarly with a helper method like this
<%= f.textarea :name => 'foo' %>
#=> <input type='textarea' name='foo' />
I write some Railsy code that generates text used by some other system (database, web browser). I'm wondering if there's a way to have an AR query or a helper method call generate its text ahead of time, so the rendering is only done once (each time the code changes) instead of each time the method is called.
Look at the log: the first query may go to the database, but identical ones after it may show CACHE at the start of the line, meaning they're served from ActiveRecord's query cache.
It also sounds to me like you want to cache the page, not the query. And even if it were the query, I don't think it's as simple as find_by_id(10) :)
Like Radar suggested, you should probably look into Rails caching. You can start with something simple like the memory store or file cache and then move to something better like memcached if necessary. You can throw some caching into the helper method, which will cache the result after it is queried once. So for example you can do:
id = 10 # id is probably coming in as a param/argument
cache_key = "Post/#{id}"
@post = Rails.cache.read(cache_key)
if @post.nil?
@post = Post.find_by_id(id)
# Write back to the cache for the next time
Rails.cache.write(cache_key, @post)
end
end
The only other thing left to do is put in some code to expire the cache entry if the post changes. For that, take a look at using "Sweepers" in Rails. Alternatively you can look at some of the caching gems like cache_fu and cached_model.
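Incidentally, the read/nil-check/write dance above is exactly what Rails.cache.fetch collapses into one call. Its basic semantics can be sketched in plain Ruby (a real cache store also handles expiry, serialization, etc.):

```ruby
# Minimal fetch-style cache: return the stored value, or run the
# block once and store its result.
class TinyCache
  def initialize
    @store = {}
  end

  def fetch(key)
    return @store[key] if @store.key?(key)
    @store[key] = yield
  end
end

cache = TinyCache.new
db_hits = 0

2.times do
  cache.fetch("Post/10") do
    db_hits += 1
    { :id => 10, :title => "Hello" } # stand-in for Post.find_by_id(10)
  end
end

db_hits # => 1, the block ran only on the first, missing fetch
```

In Rails itself this would just be @post = Rails.cache.fetch(cache_key) { Post.find_by_id(id) }.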
I'm not sure I understand your question fully.
If you're asking about the generated query, you can just do find_by_sql and write your own SQL if you don't want to use the active record dynamic methods.
If you're asking about caching the resultset to a file: it's already in the database, and I don't know that keeping it in a file would be much more efficient.
