ActiveResource Caching - ruby-on-rails

How would you cache an ActiveResource model, preferably in memcached? Right now it pulls a model from my REST API fine, but it pulls dozens of records each time. It would be best to cache them.

I've been playing around with the same thing and I think I've found a pretty simple way to check redis for the cached object first. This will only work when you use the find method, but for my needs, I think this is sufficient.
By overriding find, I can compute a checksum of the arguments to see if I already have the response saved in redis. If I do, I pull the JSON response out of redis and create a new object right there. If I don't, I pass the call through to ActiveResource::Base's find and the normal action happens.
I haven't implemented the saving of the responses into redis with ActiveResource yet, but my plan is to populate those caches elsewhere. This way, normally I can rely on my caches being there, but if they aren't, I can fall back to the API.
class MyResource < ActiveResource::Base
  class << self
    def find(*arguments)
      checksum = Digest::MD5.hexdigest(arguments.md5key)
      cached = $redis.get "cache:#{self.element_name}:#{checksum}"
      if cached
        return self.new JSON.parse(cached)
      end
      scope = arguments.slice!(0)
      options = arguments.slice!(0) || {}
      super scope, options
    end
  end
end
and a little patch so we can get an md5key for our array:
require 'digest/md5'

class Object
  def md5key
    to_s
  end
end

class Array
  def md5key
    map(&:md5key).join
  end
end

class Hash
  def md5key
    sort.map(&:md5key).join
  end
end
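A quick way to see what the patch buys you: because Hash#md5key sorts its pairs before joining, the same find options produce the same cache key regardless of the order the options were written in. A minimal sketch (the argument values are made up for illustration):

```ruby
require 'digest/md5'

# The md5key patches from above, reproduced so this runs standalone.
class Object; def md5key; to_s; end; end
class Array;  def md5key; map(&:md5key).join; end; end
class Hash;   def md5key; sort.map(&:md5key).join; end; end

# Same options, different literal order: the sorted hash pairs yield
# identical digests, so both hit the same redis key.
a = Digest::MD5.hexdigest([:all, { page: 2, order: 'recent' }].md5key)
b = Digest::MD5.hexdigest([:all, { order: 'recent', page: 2 }].md5key)
puts a == b # => true
```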
Does that help?

Caching in Rails is configurable; you can back the cache with memcached. Typically you cache when you retrieve. It's unclear whether you are a REST consumer or a REST service, but it's really not relevant: if you cache on read and then read the cache the next time, everything will work just fine. If you are pulling the data from a database, serve up the cache, and if no cache is available, cache the read from the database.
I wrote a blog post about it here:
http://squarism.com/2011/08/30/memcached-with-rails-3/
However, what I wrote about is really pretty simple: just how to avoid an expensive operation with something roughly similar to the ||= operator. For a better example, New Relic has a Scaling Rails episode in which they show how to cache the latest 10 posts:
def self.recent
  Rails.cache.fetch("recent_posts", :expires_in => 30.minutes) do
    self.find(:all, :limit => 10)
  end
end
Rails.cache has been configured to be a memcached cache; this is the configurable part I was talking about.
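For reference, that configuration is a one-liner in your environment file. A minimal sketch, assuming memcached is running on the default local port (host and port are assumptions, adjust to your setup):

```ruby
# config/environments/production.rb
config.cache_store = :mem_cache_store, 'localhost:11211'
```

With that in place, Rails.cache.fetch/read/write all go through memcached instead of the default store.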

I would suggest looking into https://github.com/Ahsizara/cached_resource; almost all of the work is done for you by the gem.

Related

Best practice for a big array manipulation with values that never change and will be used in more than one view

What would be the best and most efficient way in Rails to use a hash of about 300-500 integers (which will never be modified) in more than one view of the application?
Should I save the data in the database? Create the hash in each action that uses it? (This is what I do now, but the code looks ugly and inefficient.) Or is there another option?
Why don't you put it in a constant? You said it will never change, so it fits either configuration or a constant.
Using the cache has the downside that entries can be evicted, triggering a reload, which seems quite useless in this case.
The overhead of keeping it always in memory is negligible; 500 integers are about 4 KB at most, so you are safe.
You can write the hash manually or load it from a YAML file (or whatever) if you prefer; your choice.
My suggestion is create a file app/models/whatever.rb and:
module Whatever
  MY_HASH = {
    1 => 241
  }.freeze
end
This will be preloaded by rails on startup (in production) and kept in memory all the time.
You can access those values in views with Whatever::MY_HASH[1], or you can write a wrapper method like:
module Whatever
  MY_HASH = {
    1 => 241
  }.freeze

  def self.get(id)
    MY_HASH.fetch(id)
  end
end
And use it as Whatever.get(1).
If the data will never be changed, why not just calculate the values beforehand and write them directly into the view?
Another option would be to put the values into a singleton and cache them there.
require 'singleton'

class MyHashValues
  include Singleton

  def initialize
    @results = calculation
  end

  def result_key_1
    @results[:result_key_1]
  end

  def calculation
    Hash.new
  end
end
MyHashValues.instance.result_key_1
Cache it, it'll do exactly what you want and it's a standard Rails component. If you're not caching yet, check out the Rails docs on caching. If you use the memory store, your data will essentially be in RAM.
You will then be able to do this sort of thing
# The block computes the value to cache when there's a miss;
# it runs initially and again after the cache expires or is cleared.
# Put this in ApplicationController and expose it as a helper method:
def integer_hash
  Rails.cache.fetch('integer_hash') { ... }
end
helper_method :integer_hash

How to avoid duplicates from saving in database parsed from external JSON file with sidekiq in Rails

I have a small to-do list in a .json file that I'm reading, parsing, and saving to a Rails app with Sidekiq. Every time I refresh the browser, the worker executes and duplicates the entries in the database. How do I keep the database synchronized with the .json file, avoiding duplicate entries, while showing the list in my browser?
Here's the worker:
class TodoWorker
  include Sidekiq::Worker

  def perform
    json_text = File.read('todo_json.json')
    json = JSON.parse(json_text, :headers => true)
    json.each do |todo|
      t = TodoList.create(name: todo["name"], done: todo["done"])
      t.save
    end
  end
end
And the controller:
class TodoListsController < ApplicationController
  def index
    @todo_lists = TodoList.all
    TodoWorker.perform_async
  end
end
Thanks
This is a terrible solution, by the way: you have a huge race condition in your read/store code, and you're not going to be able to use a large part of what Rails is good at. If you want a simple DB, why not just use SQLite?
That being said, you need some way of recognizing duplicates, in most DBs this is done with a primary key that is sent to the browser along with the rest of the data, and then back again with any changes. That primary key is used to ensure that existing data is updated, rather than duplicated.
You will need the same thing in your JSON file, and then you can change your create call to something more like ActiveRecord's find_or_create_by.
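To make the idea concrete, here is a plain-Ruby simulation of that dedup logic (TodoStore and the "id" field in the JSON are illustrative stand-ins for the real TodoList table and whatever stable key your file carries):

```ruby
require 'json'

# In-memory stand-in for the TodoList table, keyed by a stable id.
class TodoStore
  def initialize
    @rows = {}
  end

  # Mimics ActiveRecord's find_or_create_by: return the existing row
  # for the key, or create it once; re-running becomes a no-op.
  def find_or_create_by(id, attrs)
    @rows[id] ||= attrs
  end

  def count
    @rows.size
  end
end

json_text = '[{"id":1,"name":"buy milk","done":false},' \
            '{"id":2,"name":"walk dog","done":true}]'
store = TodoStore.new

# Simulate the worker firing twice (two browser refreshes):
2.times do
  JSON.parse(json_text).each do |todo|
    store.find_or_create_by(todo["id"], name: todo["name"], done: todo["done"])
  end
end
puts store.count # => 2, not 4: the second run created nothing
```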

Rails 3: Where to put a Rails.cache.fetch "cache warmer" call?

Assume you want to do some low level caching in Rails (with memcached, for example) and that you'd like to have just 1 call somewhere in your app, like...
Rails.cache.fetch('books', expires_in: 1.day) do
  Book.offset(offset)
      .limit(limit)
      .select('title, author, number_of_pages')
      .all
end
...to warm up your cache when you boot your app, so you can just use a simple call like...
Rails.cache.read('books')
...anywhere and multiple times throughout your app (in views, controllers, helpers...) to access this "books" collection.
Where should one put the initial "fetch" call to make it work?
After your comment I want to clear up a couple of things.
You should always be using fetch if you require a result to come back. Wrap the call in a class method inside Book for easy access:
class Book
  def self.cached_books
    Rails.cache.fetch < ... >
  end
end
You can have a different method forcing the cache to be recreated:
class Book
  def self.write_book_cache
    Rails.cache.write < ... >
  end
end
Then in your initializer, or in a rake task, you can just do:
Book.write_book_cache
This seems more maintainable to me, while keeping the succinct call to the cache in the rest of your code.
My first thought would be to put it in an initializer - probably one specifically for the purpose (/config/initializers/init_cache.rb, or something similar).
It should be executed automatically (by virtue of being in the initializers folder) when the app starts up.
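Concretely, that could look something like the following (the file name is an assumption, and the body just reuses the fetch call from the question with its offset/limit placeholders):

```ruby
# config/initializers/warm_cache.rb
# Runs once at boot; later Rails.cache.read('books') calls anywhere in
# the app hit the warmed entry until it expires.
Rails.cache.fetch('books', expires_in: 1.day) do
  Book.offset(offset)
      .limit(limit)
      .select('title, author, number_of_pages')
      .all
end
```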

Rails: How to handle Thread.current data under a single-threaded server like Thin/Unicorn?

As Thin/Unicorn are single-threaded, how do you handle Thread.current/per-request storage?
Just ran a simple test (set a key in one session, read it from another): it looks like reads and writes hit the same place every time. This doesn't happen on WEBrick, though.
class TestController < ApplicationController
  def get
    render text: Thread.current[:xxx].inspect
  end

  def set
    Thread.current[:xxx] = 1
    render text: "SET to #{Thread.current[:xxx]}"
  end
end
Tried adding config.threadsafe! to application.rb, no change.
What's the right way to store per-request data?
How come there are gems (including Rails itself, and tilt) that use Thread.current for storage? How do they overcome this problem?
Could it be that Thread.current is safe per request, but just doesn't clear after request and I need to do that myself?
Tested with Rails 3.2.9
Update
To sum up the discussion below with @skalee and @JesseWolgamott, and my findings:
Thread.current depends on the server the app is running on. Though the server might make sure no two requests run at the same time on the same Thread.current, the values in this hash might not get cleared between requests, so if you use it you must set an initial value to override whatever the last request left behind.
There are some well-known gems that use Thread.current, like Rails, tilt and draper. I guess that if it were forbidden or unsafe they wouldn't use it. It also seems like they all set a value before reading any key from the hash (and some even restore the original value after the request has ended).
But overall, Thread.current is not the best practice for per-request storage. In most cases a better design will do, but in some cases the env hash can help: it is available in controllers, but also in middleware, and can be injected anywhere in the app.
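The leakage described above is easy to reproduce without any server at all, since it's simply how thread-local storage behaves on a reused thread. A small sketch (handle_request is a made-up stand-in for one request cycle):

```ruby
# One "request" on a reused thread: read what the previous request
# left behind, then perform the question's "set" action.
def handle_request
  leftover = Thread.current[:xxx]
  Thread.current[:xxx] ||= 1
  leftover
end

first  = handle_request
second = handle_request
puts first.inspect  # => nil (nothing set yet)
puts second.inspect # => 1 (leaked from the previous "request")
```

This is why a per-request reset (or middleware that clears the store) is required on servers that serve many requests from one thread.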
Update 2 - it seems that as for now, draper is uses Thread.current incorrectly. See https://github.com/drapergem/draper/issues/390
Update 3 - that draper bug was fixed.
You generally want to store stuff in the session. And if you want something really short-lived, see Rails' flash; it's cleared on each request. Any approach that relies on the thread will not work consistently across different web servers.
Another option would be to modify the env hash:
env['some_number'] = 5
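Since a Rack app is just an object responding to call(env), the env approach can be sketched without any gems; SetNumber and the inner lambda below are illustrative names:

```ruby
# Rack-style middleware that stuffs a per-request value into env;
# the value dies with the env hash when the request ends.
class SetNumber
  def initialize(app)
    @app = app
  end

  def call(env)
    env['some_number'] = 5
    @app.call(env)
  end
end

# Downstream "app" reads the value back out of env.
app = ->(env) { [200, {}, ["number is #{env['some_number']}"]] }

status, _headers, body = SetNumber.new(app).call({})
puts body.first # => "number is 5"
```

In a Rails controller the same hash is reachable as request.env.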
BTW, Unicorn is not simply single-threaded, it's a preforking server: a pool of worker processes is forked up front (which sounds scary but is pretty efficient on Linux), and each worker serves one request at a time. Note that anything a worker sets in a global, or in Thread.current, stays visible to later requests served by that same worker, so it still has to be reset per request.
While people still caution against using Thread.current to store "thread global" data, the arguably correct approach in Rails is to clear up the Thread.current object between requests using Rack middleware. Steve Klabnik has written the request_store gem to do this easily. The source code of the gem is really, really small and I'd recommend reading it.
The interesting parts are reproduced below.
module RequestStore
  def self.store
    Thread.current[:request_store] ||= {}
  end

  def self.clear!
    Thread.current[:request_store] = {}
  end
end

module RequestStore
  class Middleware
    def initialize(app)
      @app = app
    end

    def call(env)
      RequestStore.clear!
      @app.call(env)
    end
  end
end
Please note, clearing up the entire Thread.current is not a good practice. What request_store is basically doing, is it's keeping track of the keys that your app stashes into Thread.current, and clears it once the request is completed.
One of the caveats of using Thread.current, is that for servers that reuse threads or have thread-pools, it becomes very important to clean up after each request.
That's exactly what the request_store gem provides, a simple API akin to Thread.current which takes care of cleaning up the store data after each request.
RequestStore[:items] = []
Be aware though, the gem uses Thread.current to save the Store, so it won't work properly in a multi-threaded environment where you have more than one thread per request.
To circumvent this problem, I have implemented a store that can be shared between threads for the same request. It's called request_store_rails, and the usage is very similar:
RequestLocals[:items] = []

pagination and memcached in rails

What's the best way to cache a paginated result set with rails and memcached?
For example, posts controller:
def index
  @posts = Rails.cache.fetch('all_posts') do
    Post.paginate(:conditions => ['xx = ?', yy], :include => [:author], :page => params[:page], :order => 'created_at DESC')
  end
end
This obviously doesn't work when params[:page] changes. I can change the key to "all_posts_#{params[:page]}_#{params[:order]}_#{last_record.created_at.to_i}", but then there could be several possible orders (recent, popular, most voted, etc.) and there will be a combination of pages and orders... lots of keys this way.
Problem #2 - It seems that when I implement this solution, the caches get written correctly and the page loads fine during the first call to a paginated action. But when I click back to the same page (i.e. page 1, with recent order), the browser does not even make a call to the server; I don't see any controller action being called in the production log.
I am using Passenger, REE, memcached, and Rails 2.3.5. Firebug shows no requests being made...
Is there a simpler/more graceful way of handling this?
When it comes to caching there is no easy solution. You might cache every variant of the result, and that's OK if you implement auto-expiration of entries. You can't just use all_posts, because that way you would have to expire dozens of keys whenever a post changes.
Every AR model instance has a .cache_key based on its updated_at, which is the preferred approach, so use that instead of the last record. Don't base your key on the last record alone anyway, because if some post in the middle gets deleted your key won't change. You can use logic like this instead:
class ActiveRecord::Base
  def self.newest
    order("updated_at DESC").first
  end

  def self.cache_key
    newest.nil? ? "0:0" : "#{newest.cache_key}:#{count}"
  end
end
Now you can use Post.cache_key, which will change whenever any post is changed, deleted, or created.
In general I would just cache Post.all and then paginate on that object. You really need to do some profiling to find the bottlenecks in your application.
Besides, if you want to cache every variant, then do fragment/page caching instead.
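If you do go the every-variant route, one way to tame the key explosion from the question is to embed the collection-wide Post.cache_key in each variant key, so all variants expire at once when any post changes and no manual sweeping is needed. A sketch of the key composition only (post_cache_key stands in for a real Post.cache_key value):

```ruby
# One key per page/order variant; the shared cache_key segment means
# every variant goes stale together when any post changes.
post_cache_key = "posts/20130101120000:42" # stand-in for Post.cache_key
page  = 2
order = "recent"

key = "all_posts:#{post_cache_key}:page:#{page}:order:#{order}"
puts key # => "all_posts:posts/20130101120000:42:page:2:order:recent"
```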
It's up to you how and where to cache; there is no one way here.
As for the second part of the question, there are far too few hints for me to figure out an answer. Check whether the browser is making a call at all (LiveHTTPHeaders, tcpdump, etc.).
