Rails per-request hash?

Is there a way to cache per-request data in Rails? For a given Rails/mongrel request I have the result of a semi-expensive operation that I'd like to access several times later in that request. Is there a hash where I can store and access such data?
It needs to be fairly global and accessible from views, controllers, and libs, like Rails.cache and I18n are.
I'm ok doing some monkey-patching if that's what it takes.
Memcached doesn't work because it'll be shared across requests, which I don't want.
A global variable similarly doesn't work because different requests would share the same data, which isn't what I want.
Instance variables don't work because I want to access the data from inside different classes.

There is also the request_store gem. From the documentation:
Add this line to your application's Gemfile:
gem 'request_store'
and use this code to store and retrieve data (confined to the request):
# Set
RequestStore.store[:foo] = 0
# Get
RequestStore.store[:foo]
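Under the hood, request_store is essentially a per-thread hash plus middleware that clears it after each request. A gem-free sketch of the same memoization pattern (TinyStore and the method names here are illustrative stand-ins, not the gem's API):

```ruby
# Gem-free sketch of the pattern request_store wraps: a hash keyed off
# Thread.current, cleared at the end of each request by middleware.
module TinyStore
  def self.store
    Thread.current[:tiny_store] ||= {}
  end

  def self.clear!
    Thread.current[:tiny_store] = {}
  end
end

# A stand-in for the semi-expensive operation from the question:
def expensive_lookup
  { result: rand }
end

# Memoize it for the duration of one request:
def cached_lookup
  TinyStore.store[:lookup] ||= expensive_lookup
end
```

Because the store hangs off `Thread.current`, it is global enough to reach from views, controllers, and libs, yet never shared across requests as long as something calls `clear!` when the request ends.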

Try PerRequestCache. I stole the design from the SQL Query Cache.
Configure it up in config/environment.rb with:
config.middleware.use PerRequestCache
then use it with:
PerRequestCache.fetch(:foo_cache){ some_expensive_foo }
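A PerRequestCache-style middleware can be sketched in a few lines of plain Rack conventions (this is an illustrative reimplementation under assumed names, not the original's source):

```ruby
# Sketch of a per-request cache as Rack middleware: a fresh hash is
# installed at the start of each request and dropped when it ends.
class PerRequestCacheSketch
  def self.fetch(key)
    cache = Thread.current[:per_request_cache]
    return yield if cache.nil?             # outside a request: no caching
    cache.fetch(key) { cache[key] = yield } # compute once per request
  end

  def initialize(app)
    @app = app
  end

  def call(env)
    Thread.current[:per_request_cache] = {} # fresh cache per request
    @app.call(env)
  ensure
    Thread.current[:per_request_cache] = nil # drop it when the request ends
  end
end
```

Fetch calls within one request share results, while separate requests never see each other's data.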

One of the most popular options is the request_store gem, which gives you a global store that you can access from any part of your code. It uses Thread.current to store your data and takes care of cleaning the data up after each request.
RequestStore[:items] = []
Be aware though, since it uses Thread.current, it won't work properly in a multi-threaded environment where you have more than one thread per request.
To circumvent this problem, I have implemented a store that can be shared between threads for the same request. It's called request_store_rails, it's thread-safe, and the usage is very similar:
RequestLocals[:items] = []

Have you considered flash? It uses Session but is automatically cleared.

Memoisation?
According to this Railscast, memoised values are stored per request.

Global variables are evil. Work out how to cleanly pass the data you want to where you want to use it.

app/models/my_cacher.rb
class MyCacher
  def self.result
    @@result ||= begin
      # do expensive stuff
      # and cache in @@result
    end
  end
end
The ||= syntax basically means "compute the following if @@result is nil" (i.e. not set to anything yet). Just make sure the last line in the begin/end block returns the result.
Then in your views/models/whatever you would just reference the function when you need it:
MyCacher.result
This will cache the expensive action. Note that a class variable actually persists for the life of the process, not just a single request, so reset it if you need fresh data on every request.
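One caveat, assuming the memoized value can be nil or false: ||= only checks truthiness, so it would re-run the expensive block every time. A defined? guard avoids that; a minimal sketch (the calls counter is added here just to make the behaviour visible):

```ruby
# Memoization that also caches falsy results: defined? checks whether the
# variable has been assigned at all, not whether its value is truthy.
class MyCacher
  @@calls = 0

  def self.calls
    @@calls
  end

  def self.result
    return @@result if defined?(@@result) # cache hit even when value is nil/false
    @@calls += 1                          # count how often the expensive work runs
    @@result = false                      # pretend the expensive result is falsy
  end
end
```

With plain `@@result ||= ...`, a legitimately false result would be recomputed on every call.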


Put class instance to class constant in initializers

In one of my old apps, I use several API connectors, such as AWS or Mandrill.
For some reason (maybe I saw it somewhere, I don't remember), I use a class constant to initialize these objects during the application's init stage.
As example:
/initializers/mandrill.rb:
require 'mandrill'
MANDRILL = Mandrill::API.new ENV['MANDRILL_APIKEY']
Now I can access the MANDRILL constant anywhere in my application and use it (full path MyApplication::Application::MANDRILL, or just MANDRILL). It all works fine, for example:
def update_mandrill
result = MANDRILL.inbound.update_route id, pattern, url
end
The question is: is it good practice to use such class constants? Or is it better to create a new class instance in every method that uses it, as in this example:
def update_mandrill
require 'mandrill'
mandrill = Mandrill::API.new ENV['MANDRILL_APIKEY']
result = mandrill.inbound.update_route id, pattern, url
end
Interesting question.
It's a very handy approach, but it may have cons in some scenarios.
Imagine you have a constant that either takes a long time to initialize or loads a lot of data into memory. When its initialization takes long, you degrade app boot time (which may or may not be a problem; it usually will be in development).
If it loads a lot of data into memory, that may turn out to be a problem when running rake tasks, for example, which load the entire environment. You may hit memory limits in use cases that don't need this data at all.
I know one application which loads a lot of data during boot, and it's done very deliberately. Sure, the use case is a bit uncommon, but still.
Another thing to consider: imagine you're trying to establish a connection to an external service like Mongo. If that service is unavailable (which happens), your application won't be able to boot. Maybe the service is essential and the app would be useless without it anyway, but it's also possible that you bring everything down just because the storage you keep logs in is unreachable.
I'm not saying you shouldn't use it as you suggested - I do it also in my apps, but you should be aware of potential drawbacks.
Yes, pre-creating a pseudo-constant object (like that API client) is usually a good idea. However, there are roughly a thousand ways to go about it, and a constant is not at the top of my personal list.
These days I usually go with setting it in the env files.
# config/environments/production.rb
config.email_client = Mandrill::API.new ENV['MANDRILL_APIKEY'] # the real thing
# config/environments/test.rb
config.email_client = a_null_object # something that conforms to the same api, but does absolutely nothing
# config/environments/development.rb
config.email_client = a_dev_object # post to local smtp, or something
Then you refer to the client like this:
Rails.application.configuration.email_client
And the correct behaviour will be picked up in each env.
If I don't need this per-env variation, then I either use some kind of singleton object (EmailClient.get) or a global variable in the initializer ($email_client). It can be argued that a constant is better than a global variable, semantically and because it warns when you try to reassign it. But I like that a global variable stands out more: you see right away that it's something special. (And then again, it's only #3 on the list, so I don't do it very often.)
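The null object mentioned for the test environment can be a tiny class that accepts any message and does nothing. A sketch (NullEmailClient is a made-up name, not a library class; it records calls so tests can still assert on them):

```ruby
# A null email client: same interface shape as the real client, no side
# effects. method_missing swallows any call and returns self so that
# chained calls like client.inbound.update_route(...) keep working.
class NullEmailClient
  attr_reader :calls

  def initialize
    @calls = []
  end

  def method_missing(name, *args)
    @calls << [name, args]
    self
  end

  def respond_to_missing?(_name, _include_private = false)
    true
  end
end
```

With `config.email_client = NullEmailClient.new` in test.rb, code like `MANDRILL.inbound.update_route(...)` rewritten against the config becomes a harmless no-op in tests.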

How do I bypass Rails.cache for a single request or code block?

I have an API endpoint that aggregates a bunch of data from code that leverages Rails.cache for small pieces of data here and there. There are times, however, when I want 100% up-to-date data, as if Rails.cache was empty. Obviously I could clear cache prior to aggregating the data, but that will affect unrelated data and requests.
Is there a way for me to have a request in rails act as if Rails.cache is empty, similar to if Rails.cache was configured to be :null_store?
The query cache in ActiveRecord has something like this: an "uncached" method that you can pass a block to, where the block runs without the query cache enabled. I need something similar, but for Rails.cache in general.
Since it does not appear there is a solution to this out of the box, I coded a solution of my own by adding the following code as config/initializers/rails_cache.rb
module Rails
  class << self
    alias :default_rails_cache :cache

    def cache
      # Allow any thread to override Rails.cache with its own cache implementation.
      RequestStore.store[:rails_cache] || default_rails_cache
    end
  end
end
This allows any thread to specify its own cache store, which will then be used for all fetches, reads, and writes. As such, it will not read from the default Rails.cache, nor will its values be written to the default Rails.cache.
If the thread is long-running and benefits from having caching enabled, you can easily set this to its own MemoryStore instance:
RequestStore.store[:rails_cache] = ActiveSupport::Cache.lookup_store(:memory_store)
And if you want caching completely off for this thread, you can use :null_store instead of :memory_store.
If you are not using the request_store gem, RequestStore.store can be replaced with Thread.current for the same effect; you just have to be more careful about thread reuse across requests.
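With the initializer above in place, the override can be made scoped and exception-safe with a small block helper. A sketch using the Thread.current fallback just mentioned (with_cache is a hypothetical helper name, not a Rails API):

```ruby
# Temporarily swap the per-thread cache override for the duration of a
# block, restoring the previous value even if the block raises.
def with_cache(cache_store)
  previous = Thread.current[:rails_cache]
  Thread.current[:rails_cache] = cache_store
  yield
ensure
  Thread.current[:rails_cache] = previous
end
```

In an app this would be called as, e.g., `with_cache(ActiveSupport::Cache.lookup_store(:null_store)) { aggregate_fresh_data }` so the bypass cannot leak into later requests served by the same thread.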

Rails - how to cache data for server use, serving multiple users

I have a class method (placed in /app/lib/) which performs some heavy calculations and sub-http requests until a result is received.
The result isn't too dynamic, and requested by multiple users accessing a specific view in the app.
So, I want to schedule a periodic run of the method (using cron and Whenever gem), store the results somewhere in the server using JSON format and, by demand, read the results alone to the view.
How can this be achieved? what would be the correct way of doing that?
What I currently have:
def heavy_method
  response = {}
  # some calculations, eventually building the response
  File.open(File.expand_path('../../../tmp/cache/tests_queue.json', __FILE__), "w") do |f|
    f.write(response.to_json)
  end
end
and also a corresponding method to read this file.
I searched but couldn't find an example of achieving this using Rails cache convention (and not some private code that I wrote), on data which isn't related with ActiveRecord.
Thanks!
Your solution should work fine, but using Rails.cache would be cleaner and a bit faster. The Rails guides provide enough information about Rails.cache and how to get it working with memcached; let me summarize how I would use it in your case.
Heavy method
def heavy_method
  response = {}
  # some calculations, eventually building the response
  Rails.cache.write("heavy_method_response", response)
end
Request
response = Rails.cache.fetch("heavy_method_response")
The only problem here is that when your server starts for the first time, the cache will be empty. The same applies if/when memcached restarts.
One advantage is that somewhere along the way, the data you pass in is marshalled into storage and unmarshalled on the way out, meaning you can pass in complex data structures and don't need to serialize to JSON manually.
Edit: memcached will evict your item if it runs out of memory. That should be very rare, since it uses an LRU (I think) algorithm to expire things, and I presume you will read this item often.
To prevent this:
- set expires_in larger than your cron period;
- change your fetch code to call heavy_method if the fetch fails, like Rails.cache.fetch("heavy_method_response") { heavy_method }, and change heavy_method to just return the object;
- use something like Redis, which will not delete items.
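The fetch-plus-fallback pattern in the second bullet works because Rails.cache.fetch treats an expired entry as a miss and runs the block. A gem-free toy illustrating those semantics (ToyCache is a deliberately simplified stand-in, not the real Rails implementation):

```ruby
# Toy cache with fetch(key, expires_in:) semantics mimicking Rails.cache:
# a hit returns the stored value; a miss or an expired entry runs the
# block, stores its result with a fresh TTL, and returns it.
class ToyCache
  Entry = Struct.new(:value, :expires_at)

  def initialize
    @data = {}
  end

  def fetch(key, expires_in:)
    entry = @data[key]
    return entry.value if entry && Time.now < entry.expires_at
    value = yield # cache miss or expired: recompute
    @data[key] = Entry.new(value, Time.now + expires_in)
    value
  end
end
```

In the cron setup above, the scheduled job keeps the entry warm, while the fallback block covers a cold cache after a server or memcached restart.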

Rails: How to handle Thread.current data under a single-threaded server like Thin/Unicorn?

As Thin/Unicorn are single threaded, how do you handle Thread.current/per-request storage?
Just ran a simple test: set a key in one session, read it from another. It looks like it writes/reads from the same place every time. That doesn't happen on WEBrick, though.
class TestController < ApplicationController
  def get
    render text: Thread.current[:xxx].inspect
  end

  def set
    Thread.current[:xxx] = 1
    render text: "SET to #{Thread.current[:xxx]}"
  end
end
Tried adding config.threadsafe! to application.rb, no change.
What's the right way to store per-request data?
How come there are gems (including Rails itself, and tilt) that use Thread.current for storage? How do they overcome this problem?
Could it be that Thread.current is safe per request, but just doesn't clear after request and I need to do that myself?
Tested with Rails 3.2.9
Update
To sum up the discussion below with @skalee and @JesseWolgamott, and my findings:
Thread.current depends on the server the app runs on. Though the server might make sure no two requests run at the same time on the same Thread.current, the values in this hash might not get cleared between requests, so when using it, an initial value must be set to override the last request's value.
There are some well-known gems that use Thread.current, like Rails, tilt, and draper. I guess that if it were forbidden or unsafe, they wouldn't use it. It also seems they all set a value before reading any key from the hash (and some even set it back to the original value after the request has ended).
But overall, Thread.current is not the best practice for per-request storage. For most cases, better design will do, but for some cases, using env can help. It is available in controllers, but also in middleware, and can be injected into any place in the app.
Update 2: it seems that, as of now, draper uses Thread.current incorrectly. See https://github.com/drapergem/draper/issues/390
Update 3: that draper bug has been fixed.
You generally want to store stuff in the session. And if you want something really short-lived, see Rails' flash; it's cleared on each request. Any method which relies on threads will not work consistently across different webservers.
Another option would be to modify env hash:
env['some_number'] = 5
BTW, Unicorn is not simply single-threaded, it's a preforking server: worker processes are forked ahead of time (despite how it sounds, forking is pretty efficient on Linux), and each worker handles one request at a time. Note that a worker lives across many requests, so anything you set in a global variable can persist into another request.
While people still caution against using Thread.current to store "thread-global" data, the arguably correct approach to doing it in Rails is to clean up the Thread.current object using Rack middleware. Steve Klabnik has written the request_store gem to do this easily. The source code of the gem is really, really small, and I'd recommend reading it.
The interesting parts are reproduced below.
module RequestStore
  def self.store
    Thread.current[:request_store] ||= {}
  end

  def self.clear!
    Thread.current[:request_store] = {}
  end
end

module RequestStore
  class Middleware
    def initialize(app)
      @app = app
    end

    def call(env)
      RequestStore.clear!
      @app.call(env)
    end
  end
end
Please note, clearing out the entire Thread.current is not good practice. What request_store basically does is keep everything your app stashes under a single key in Thread.current, and clear that key once the request is completed.
One of the caveats of using Thread.current, is that for servers that reuse threads or have thread-pools, it becomes very important to clean up after each request.
That's exactly what the request_store gem provides, a simple API akin to Thread.current which takes care of cleaning up the store data after each request.
RequestStore[:items] = []
Be aware, though: the gem uses Thread.current to save the store, so it won't work properly in a multi-threaded environment where you have more than one thread per request.
To circumvent this problem, I have implemented a store that can be shared between threads for the same request. It's called request_store_rails, and the usage is very similar:
RequestLocals[:items] = []

What is the right way to clear cache in Rails without sweepers

Observers and Sweepers are removed from Rails 4. Cool.
But what is the way to cache and clear cache then ?
I read about Russian-doll caching. It's nice and all, but it only concerns the view-rendering cache. It doesn't prevent the database from being hit.
For instance:
<% cache @product do %>
  Some HTML code here
<% end %>
You still need to get @product from the db to get its cache_key. So page or action caching can still be useful to prevent unnecessary load.
I could use some timeout to clear the cache from time to time, but what for, if the records didn't change?
At least with sweepers you have control over that aspect. What is/will be the right way to cache and to clear the cache?
Thanks ! :)
Welcome to one of the two hard problems in computer science, cache invalidation :)
You have to handle that manually: unlike a cached view, whose validity can simply be derived from the objects it displays, the logic for when a cached object should be invalidated is application- and situation-dependent.
Your go-to method for this is Rails.cache.fetch. It takes three arguments: the cache key, an options hash, and a block. It first tries to read a valid cache record based on the key; if the key exists and hasn't expired, it returns the value from the cache. If it can't find a valid record, it instead takes the return value of the block and stores it in the cache under your key.
For example:
@models = Rails.cache.fetch my_cache_key do
  Model.where(condition: true).to_a
end
This will cache the block and reuse the result until something (tm) invalidates the key, forcing the block to be reevaluated. Also note the .to_a at the end of the method chain. Normally Rails would return an ActiveRecord relation object; that relation would be cached unevaluated, and the query would only run when you first used @models, neatly sidestepping the cache. The .to_a call forces Rails to load the records and ensures that it's the result we cache, not the query.
So now that you've got your cache on and never talk to the database again, we have to make sure we cover the other end: invalidating the cache. This is done with Rails.cache.delete, which simply takes a cache key and removes it, causing a miss the next time you try to fetch it. You can also use the force: true option with fetch to force a re-evaluation of the block. Whichever suits you.
The science of it all is where to call Rails.cache.delete; in the naïve case this would be on update and delete for a single instance, and on update, delete, and create of any member for a collection. There will always be corner cases, and they are always application-specific, so I can't help you much there.
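In a Rails app, the natural hook for those delete calls is a model callback such as after_commit. The delete-on-write idea itself can be shown with a toy stand-in (plain Ruby, not ActiveRecord; the cache, class, and key format are illustrative):

```ruby
# Delete-on-write: every successful save removes the record's cache entry,
# so the next fetch recomputes from fresh data instead of serving stale.
CACHE = {}

class Product
  attr_accessor :name

  def cache_key
    "product/#{object_id}" # ActiveRecord derives this from id and updated_at
  end

  def save
    # ...persist the changes here...
    CACHE.delete(cache_key) # what an after_commit callback would do in Rails
    true
  end
end
```

The same shape covers collections: a member's save also deletes the collection's key, which is the "update, delete, and create of any member" case above.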
I assume in this answer that you will set up some sane cache store, like memcached or Redis.
Also remember to add this to config/environments/development.rb:
config.cache_store = :null_store
or your development environment will cache and you will end up hairless from frustration.
For further reference read: Everyone should be using low level caching in Rails and The rails API docs
It is also worth noting that this functionality is not removed from Rails 4, merely extracted into a gem. If you need or would like the full features of sweepers, simply add them back to your app with a gem 'rails-observers' line in your Gemfile. That gem contains both the sweepers and observers that were removed from the Rails 4 core.
I hope that helps you get started.
