I have a questions about cache method in rails. Please take a look at the below code. It's about fragment caching.
<% cache text_post do %>
<% sleep 1 %>
-- snip --
<% end %>
I put a sleep method in the code to see the effect of rails cache method. After this cache, rails doesn't execute the sleep method anymore thus improving performance. But I don't understand why the sleep method doesn't get executed after caching. As far as I see the sleep method isn't view data but a method. I think it should be executed no matter it's cached or not.
Thank you for help in advance.
The execution of the cached block depends on the cached key. When its called first time it produces a key, value pair and stores on your underlying caching storage. Here key is text_post and the value is the output of the block. So unless your cache expires or your key gets removed, the cached block will remain untouched. If the text_post is an activerecord object, if you save the object that cache will be expired and corresponding block will be executed again.
Related
We are working on a data visualization problem right now. Our customer wants us to show the last 6 months data for a honeybee hive on a graph.
Clearly it's gonna be a huge dataset. Adding indexes we overcame the database slowness problem in loading data though we still have problem in visualizing data on a graph.
Here is the related code:
def self.prepare_single_hive_messages_for_datatable_dygraph(messages, us_metric_enabled)
data = []
messages.each do |message|
record = []
record << message.occurance_time.to_s(:dygraph_format)
record << weight_according_to_metric(message.weight, us_metric_enabled)
record << temperature_according_to_metric(message.temperature, us_metric_enabled)
record << (message.humidity.nil? ? nil : message.humidity.to_f)
data << record
end
return data
end
The problem is that messages.each is very slow and takes more than 30 seconds. Is there any solution to overcome this?
Project Specification:
Rails Version: 4.1.9
Graph Library: Dygraph
Database: Postgres
There are two ways to attack a performance problem like this.
Find and correct the performance bottle neck
Break it into smaller pieces
Finding Performance issues
First, get a dataset large enough to reproduce the problem setup on your dev system. Then look at the logs so you can see how long the transaction is taking. You should be looking for a line like this:
Completed 200 OK in 432.1ms (Views: 367.7ms | ActiveRecord: 61.4ms)
Rerun the task a couple times since caching can cause variations. Write down your different times. Then remove everything in the loop and run it with just the loop. Do the numbers go back to looking reasonable? If that is the case then you know the problem is the work you are doing inside the loop. Next, add each line in the loop back on its own (or one at a time if they depend on each other). Figure out which line causes those numbers to jump the most.
This is the point where you should try to performance tune your code. Check for queries that could be smarter. Make sure you aren't querying the same data over and over. If you have a function in a model that computes something and you call it multiple times to get the same answer then use this to only compute once:
def something
return #savedvalue if #savedvalue
#savedvalue = really complex calculation
end
The goal is to find the worse offender so you can make changes that have the biggest impact. However, if you are working with a LOT of data this may only get you so far. It may be impossible to performance tune enough for all the data. In that case there is option 2.
Break it into smaller pieces
Write a second rails action who's only job is to render a single record on a graph. It will do the inner part of your loop but only on the message who's id was passed to it.
Call your original function to setup the view and pass the list of messages to the view. In the view loop through the list of messages to setup jquery ajax code to call the above action once for each message. Have this run in on document ready.
Then, the page will load with an empty graph... but as soon as it is up the individual processed records will be fed to it and appear one at a time on the page. It will still take just ask long (or even a little longer because of overhead) to complete the graph... but it will no longer time out. Each ajax call will be its own quick hit to the server instead of one big long hit.
I just used this very technique to load a rather long report on a site I work on. Ideally we'd like to fix any underlying performance issues... but what we really wanted was to have a report working right away and then fix the performance issues as we had time.
Ok you said every person sees the same set of data, which is great, means we can cache without worrying about who's logged in, first here's your method, with tiny improvements
def self.prepare_single_hive_messages_for_datatable_dygraph(messages, us_metric_enabled)
messages.inject([]) do |records, message|
records << [].tap do |record|
record << message.occurance_time.to_s(:dygraph_format)
record << weight_according_to_metric(message.weight, us_metric_enabled)
record << temperature_according_to_metric(message.temperature, us_metric_enabled)
record << (message.humidity.nil? ? nil : message.humidity.to_f)
end
end
end
Then create a caching function, that runs this method and caches it
# some class constants
CACHE_KEY = 'some_cache_key'
EXPIRY_TIME = 15.minutes
# the methods
def self.write_single_hive_messages_to_cache(messages, us_metric_enabled)
Rails.cache.write CACHE_KEY,
self.class.prepare_single_hive_messages_for_datatable_dygraph(messages, us_metric_enabled),
expires_in: EXPIRY_TIME
end
And a simple cache reading method
self.read_single_hive_messages_from_cache
Rails.cache.read CACHE_KEY
end
Then create a rake task that just fetches these messages and call the caching method, and rails will write the cache.
Create a cron job that calls this rake task, set the cron job to 5 minutes or so, the expiry time is longer just in case for some reason the cron job didn't run, the data will still be available for the next run.
This way your processing is run in the background, every 5 ( or whatever time you choose ) minutes, the page load should happen normally with no delay at all, since the array data will be loaded from the pre-calculated cache.
In case the cron stops working, the data will expire in the 15 minutes I've set, and then the read cache method will return nil, you could avoid this and set the data to never expire, but then the data will become stale and the old data will keep getting returned.
Another way to handle this is to tell the cache reading method how to generate the cache it self, so if it finds the cache empty it generates one and caches it itself before returning the data, the method would look like this
def self.read_single_hive_messages_from_cache(messages, us_metric_enabled)
Rails.cache.fetch CACHE_KEY, expires_in: EXPIRY_TIME do
self.class.write_single_hive_messages_to_cache(messages, us_metric_enabled)
end
end
But then make sure that messages is an ActiveRecord::Relation and not a processed array, because you don't want to query for 1+ million records and then find the cache already ready, if it's an ActiveRecord::Relation it will not touch the database until the array is started ( inside the caching block ), if the cache exists it will be returned before you enter the block and thus the data won't get fetched, saving you that huge query.
I know the answer got long, if you need more help tell me.
Observers and Sweepers are removed from Rails 4. Cool.
But what is the way to cache and clear cache then ?
I read about russian doll caching. It nice and all but it only concerns the view rendering cache. It doesn't prevent the database from being hit.
For instance:
<% cache #product do %>
Some HTML code here
<% end %>
You still need to get #product from the db to get its cache_key. So page or action caching can still be useful to prevent unnecessary load.
I could use some timeout to clear the cache sometimes but what for if the records didn't change ?
At least with sweepers you have control on that aspect. What is/will be the right way to do cache and to clear it ?
Thanks ! :)
Welcome to one of the two hard problems in computer science, cache invalidation :)
You would have to handle that manually since the logic for when a cached object, unlike a cached view which can be simply derived from the objects it displays, should be invalidated is application and situation dependent.
You goto method for this is the Rails.cache.fetch method. Rails.cache.fetch takes 3 arguments; the cache key, an options hash, and a block. It first tries to read a valid cache record based on the key; if that key exists and hasn’t expired it will return the value from the cache. If it can’t find a valid record it instead takes the return value from the block and stores it in the cache with your specified key.
For example:
#models = Rails.cache.fetch my_cache_key do
Model.where(condition: true).all
end
This will cache the block and reuse the result until something (tm) invalidates the key, forcing the block to be reevaluated. Also note the .all at the end of the method chain. Normally Rails would return an ActiveRecord relation object that would be cached and this would then be evaluated when you tried to use #models for the first time, neatly sidestepping the cache. The .all call forces Rails to eager load the records and ensure that it's the result that we cache, not the question.
So now that you get all your cache on and never talk to the database again we have to make sure we cover the other end, invalidating the cache. This is done with the Rails.cache.delete method that simply takes a cache key and removes it, causing a miss the next time you try to fetch it. You can also use the force: trueoption with fetch to force a re-evaluation of the block. Whichever suits you.
The science of it all is where to call Rails.cache.delete, in the naïve case this would be on update and delete for a single instance and update, delete, create on any member for a collection. There will always bee corner cases and they are always application specific, so I can't help you much there.
I assume in this answer that you will set up some sane cache store, like memcached or Redis.
Also remember to add this to config/environments/development.rb:
config.cache_store = :null_store
or you development environment will cache and you will end up hairless from frustration.
For further reference read: Everyone should be using low level caching in Rails and The rails API docs
It is also worth noting that functionality is not removed from Rails 4, merely extracted into a gem. If you need or would like the full features of the sweepers simply add it back to your app with a gem 'rails-observers' line in your Gemfile. That gem contains both the sweepers and observers that where removed from Rails 4 core.
I hope that helpt you get started.
I have a block
<% cache 'unique_key', 60.minutes.from_now do %>
...
<% begin %>
...
<% rescue %>
...
<%end>
<% end %>
and I'm trying to make the implementation more robust by only caching (and thus allowing the user to see) the rescue message if there isn't a previous value already in the cache. Currently, if the response in the begin block sends back an error for any reason, I'm caching the user viewed error message. I would prefer to fall back onto the old cached data. The problem that I can't get past is -
Where is cache storing the data?
Every time I try Rails.cache.read 'unique_key', I get nil back. Is cache not storing the value in memcached? Is there a way that I can dump the cache to screen?
I couldn't follow the rails source. It seemed to me the the fragment_for method in cache was a rails 3 thing, and thus, I didn't debug further.
The cache view helper constructs a cache key based on the arguments you give it. At a minimum it adds the prefix 'views/' to the key.
You can use the fragment_cache_key helper to find out what cache key rails is using for any of your calls to cache. If you just want to grab what is currently stored, read_fragment does that. Of course with your particular usage, if your block is executed again it is because the 60 minutes are up: the cached value has been deleted from memcache.
With the memcache store you can't list all of the keys currently in the store - it's just something thy memcached itself doesn't support.
I solved this by using the fetch method. I used
<% Rails.cache.fetch('unique_key', :expires_in => 60.minutes){
begin
...
rescue
...
end
} %>
When I did this, I could successfully find the key. I'm still not sure why I couldn't find the cached data after adding the fragment_cache_key that I found, but using Rails.cache.fetch seemed to do the trick.
I am implementing caching into my Rails project via Memcached and particularly trying to cache side column blocks (most recent photos, blogs, etc), and currently I have them expiring the cache every 15 minutes or so. Which works, but if I can do it more up-to-date like whenever new content is added, updated or whatnot, that would be better.
I was watching the episode of the Scaling Rails screencasts on Memcached http://content.newrelic.com/railslab/videos/08-ScalingRails-Memcached-fixed.mp4, and at 8:27 in the video, Gregg Pollack talks about intelligent caching in Memcached in a way where intelligent keys (in this example, the updated_at timestamp) are used to replace previously cached items without having to expire the cache. So whenever the timestamp is updated, the cache would refresh as it seeks a new timestamp, I would presume.
I am using my "Recent Photos" sideblock for this example, and this is how it's set up...
_side-column.html.erb:
<div id="photos"">
<p class="header">Photos</p>
<%= render :partial => 'shared/photos', :collection => #recent_photos %>
</div>
_photos.html.erb
<% cache(photos) do %>
<div class="row">
<%= image_tag photos.thumbnail.url(:thumb) %>
<h3><%= link_to photos.title, photos %></h3>
<p><%= photos.photos_count %> Photos</p>
</div>
</div>
<% end %>
On the first run, Memcached caches the block as views/photos/1-20110308040600 and will reload that cached fragment when the page is refreshed, so far so good. Then I add an additional photo to that particular row in the backend and reload, but the photo count is not updated. The log shows that it's still loading from views/photos/1-20110308040600 and not grabbing an updated timestamp. Everything I'm doing appears to be the same as what the video is doing, what am I doing wrong above?
In addition, there is a part two to this question. As you see in the partial above, #recent_photos query is called for the collection (out of a module in my lib folder). However, I noticed that even when the block is cached, this SELECT query is still being called. I attempted to wrap the entire partial in a block at first as <% cache(#recent_photos) do %>, but obviously this doesn't work - especially as there is no real timestamp on the whole collection, just it's individual items of course. How can I prevent this query from being made if the results are cached already?
UPDATE
In reference to the second question, I found that unless Rails.cache.exist? may just be my ticket, but what's tricky is the wildcard nature of using the timestamp...
UPDATE 2
Disregard my first question entirely, I figured out exactly why the cache wasn't refreshing. That's because the updated_at field wasn't being updated. Reason for that is that I was adding/deleting an item that is a nested resource in a parent, and I probably need to implement a "touch" on that in order to update the updated_at field in the parent.
But my second question still stands...the main #recent_photos query is still being called even if the fragment is cached...is there a way using cache.exists? to target a cache that is named something like /views/photos/1-2011random ?
One of the major flaws with Rails caching is that you cannot reliably separate the controller and the view for cached components. The only solution I've found is to embed the query in the cached block directly, but preferably through a helper method.
For instance, you probably have something like this:
class PhotosController < ApplicationController
def index
# ...
#recent_photos = Photos.where(...).all
# ...
end
end
The first instinct would be to only run that query if it will be required by the view, such as testing for the presence of the cached content. Unfortunately there is a small chance that the content will expire in the interval between you testing for it being cached and actually rendering the page, something that will lead to a template rendering error when the nil-value #recent_photos is used.
Here's a simpler approach:
<%= render :partial => 'shared/photos', :collection => recent_photos %>
Instead of using an instance variable, use a helper method. Define your helper method as you would've the load inside the controller:
module PhotosHelper
def recent_photos
#recent_photos ||= Photos.where(...).all
end
end
In this case the value is saved so that multiple calls to the same helper method only triggers the query once. This may not be necessary in your application and can be omitted. All the method is obligated to do is return a list of "recent photos", after all.
A lot of this mess could be eliminated if Rails supported sub-controllers with their own associated views, which is a variation on the pattern employed here.
As I've been working further with caching since asking this question, I think I'm starting to understand exactly the value of this kind of caching technique.
For example, I have an article and through a variety of things I need for the page which include querying other tables, maybe I need to do five-seven different queries per article. However, caching the article in this way reduces all those queries to one.
I am assuming that with this technique, there always needs to have at least "one" query, as there needs to be "some" way to tell whether the timestamp has been updated or not.
I would like to cache my fragment page in my rails application by time.
I found this plugin to do this => ici but any download is available.
I searched in the rails doc but I don't found how to cache my fragment by time.
Are you know another plugin to do this or another method to do this ?
Thanks.
Creating a time-based cache key is quite simple.
Here's an example.
Now in your app you can write
<% cache :expires => CacheKey.expirable(:hour) do %>
...
<% end %>
If you want a more accurate control (for example 5.minutes instead of simply 1 minute), you can easily adapt the module in order to dynamically generate the cache key reading the time value passed as parameter.
An other approach is to check the last-modified time of the cache file. Here's a plugin.