Rails 2 session clobbered some time AFTER variable is stored - ruby-on-rails

I am working on an existing Rails 2 app. I have converted a few of the hash data structures into objects, and if I put one in the session store, it seems to clobber the session, clearing out the user_id, among other things, and forcing another login. I'm using dalli_store for the session.
The following code causes the session to be wiped out:
bug = MyClass.get_an_object()
session[:debug] = bug
It's not clear WHERE it is getting wiped out. I can step through in the debugger to the end of rendering the view, and the session is fine, but when I hit another link in the UI, session is empty (Hash[0]), and I am redirected to the login page.
If I change the code slightly, to this, the session is OK:
bug = MyClass.get_an_object()
session[:debug] = Marshal::dump(bug)
However, if I store the actual object (even a deep copy), the session is lost. I.e. even this does not work:
session[:debug] = Marshal::load( Marshal::dump(bug) )
bug.size is about 140K when marshalled, so it should not be overrunning memcached. At any rate, I would assume that session is serialized by Marshal::dump(), so the size should be identical. It doesn't matter if I access the object in the session after storing it. Just putting it into the session is enough to cause the problem, but as I said, the session is fine after storing the object, and all the way through the view rendering. It isn't until the start of the next request that I find out that it has been clobbered.
I'm stumped.
Do you have any recommendations on how to debug this? For the moment, I guess I can explicitly call Marshal to save the object, but I would really like to understand why the session is getting clobbered.
I know it's a Bad Thing to put big objects into the session, but fixing that part of the problem is out of scope for the current project ... maybe later. Plus, I am very curious about what is going on here.

Related

Is there a way to have a persistent object in memory I can read/write to anywhere in a rails app?

I'm working on a rails based web backend, and I've ran into a bit of an issue. I'm building a crypto trading based application, which relies on knowing the exact current price of many cryptos/stocks. To do this I seem to need a websocket to update certain data, however I can't figure out how to store this data. I need to be able to write to it on every websocket update, as well as read from it when sending out data to the front end. Both of these actions seem too fast to rely on my database, so I'm wondering if there is a better option. My idea was to use a class with a class method that is set on server startup. Then read from/write to that method when needed. The class looks something like this
class CryptoSocket
def self.start
##cryptos = {
BTC: 0,
ETH: 0,
DOGE: 0
}
end
def self.value(symbol)
##cryptos[symbol]
end
end
Inside the start method is a websocket which gets opened, and on message writes to ##cryptos with the updated value of the coin. I call CryptoSocket.start when the server boots up
To get the value for a symbol I can just call CryptoSocket.value(symbol) anywhere in my app. This seemed like it was working, however I've noticed sometimes it fails telling me NameError: uninitialized class variable ##cryptos in CryptoSocket
It seems like the issue is running reload! in the rails console, or entering a binding.pry then exiting. My guess is some garbage collection is happening, but overall it's something I'd like to avoid.
Does anyone have a suggestion for a better way to go about setting this class up? Does rails have a better way to persist an object in memory? It's fine to lose it when the server shuts down, but I would like to keep access to it when the server stays up

Save a rails console session to disk and reload later

I have a rails console session in a tmux session, and it's taking up a lot of memory. It has a lot of data, pretty deeply nested, in a few variables, and it took a long time to query for that data, so right now my plan is to serialize the data and save it to a file. That way, I can reload it later and not take up too much memory on the machine while I'm not using it. I'm wondering, though, if there's a better way. Is it possible for me to save my whole rails console session and load it up again later?
No, you can't save a whole Rails console session (the same is true for a simple irb session) for later use.
Create, or edit your ~/.irbrc file to include:
require 'irb/ext/save-history'
IRB.conf[:SAVE_HISTORY] = 200
IRB.conf[:HISTORY_FILE] = "#{ENV['HOME']}/.irb-history"

Can I use Rails.cache to store short-lived session data?

We are already using cookie based sessions, and switching off them to file store sessions in not an option. However, I need a way to store larger amounts of session data (up to 10MG or so) -- beyond the limit of cookie session and, even it weren't, round-tripping that much data on multiple requests would be slow.
I am currently attempting to solve this by using (abusing?) Rails.cache. The basic setup is like this:
I post to a route, which has the following code:
# calculate some results...
Rails.cache.write('search_results' + session.id), search_results)
redirect_to '/results'
Inside GET /results, I read the cached data and send it to the client:
#results = Rails.cache.read('search_results' + session.id)
This works fine. However, if I subsequently make a request to another route like GET /results2 that also calls Rails.cache.read('search_results' + session.id), it will return nil. Even if all calls happen within a 5-10s span.
So my questions are:
Why does this happen? What determines when Rails.cache clean itself?
Is there a way to make this work?
Is there a better approach altogether that doesn't involve using a DB or redis?
Answer to your questions:
The problem with file cache store is that it stores file locally. Thus, if you have multiple servers, cache can be written to one server while cache is read on another server which will return ‘nil’. The solution is to use cache store that can be shared among multiple servers.
Using redis-store may be a solution: https://github.com/redis-store/redis-rails

Rails how to check size of current session (cookies)

I was happily throwing things into the session instead of direct saving into the database (like in multistep forms) and thought those 4kb is more than enough. Since I use Devise I thought it uses some session storage but felt safe until I tried to p session and OH!!! My terminal couldn't event print all data from it. That data is hard to understand - there are some items that I pass to it, but also some local routes and other weird things.
So no I wonder how to check the size of it at some stages? I found similar question but following that I get #encryptor as undefined / nil..
Also tried:
#encryptor = ActiveSupport::MessageEncryptor.new(secret, cipher: encrypted_cookie_cipher, serializer: SERIALIZER)
data = session.to_hash.delete_if { |k,v| v.nil? }
data = #encryptor.encrypt_and_sign(serialize(name, data))
p data.bytesize
But then secret is undefined:
undefined local variable or method `secret'
I also tried to find a way to display the cookie-size with rails, without success.
But another way to check the size of your current session cookie is the developer tool in the browser.
There you can see all cookies with some information and its sizes.
It's maybe not the best way, but better than nothing.

How does Rails 4 Russian doll caching prevent stampedes?

I am looking to find information on how the caching mechanism in Rails 4 prevents against multiple users trying to regenerate cache keys at once, aka a cache stampede: http://en.wikipedia.org/wiki/Cache_stampede
I've not been able to find out much information via Googling. If I look at other systems (such as Drupal) cache stampede prevention is implemented via a semaphores table in the database.
Rails does not have a built-in mechanism to prevent cache stampedes.
According to the README for atomic_mem_cache_store (a replacement for ActiveSupport::Cache::MemCacheStore that mitigates cache stampedes):
Rails (and any framework relying on active support cache store) does
not offer any built-in solution to this problem
Unfortunately, I'm guessing that this gem won't solve your problem either. It supports fragment caching, but it only works with time-based expiration.
Read more about it here:
https://github.com/nel/atomic_mem_cache_store
Update and possible solution:
I thought about this a bit more and came up with what seems to me to be a plausible solution. I haven't verified that this works, and there are probably better ways to do it, but I was trying to think of the smallest change that would mitigate the majority of the problem.
I assume you're doing something like cache model do in your templates as described by DHH (http://37signals.com/svn/posts/3113-how-key-based-cache-expiration-works). The problem is that when the model's updated_at column changes, the cache_key likewise changes, and all your servers try to re-create the template at the same time. In order to prevent the servers from stampeding, you would need to retain the old cache_key for a brief time.
You might be able to do this by (dum da dum) caching the cache_key of the object with a short expiration (say, 1 second) and a race_condition_ttl.
You could create a module like this and include it in your models:
module StampedeAvoider
def cache_key
orig_cache_key = super
Rails.cache.fetch("/cache-keys/#{self.class.table_name}/#{self.id}", expires_in: 1, race_condition_ttl: 2) { orig_cache_key }
end
end
Let's review what would happen. There are a bunch of servers calling cache model. If your model includes StampedeAvoider, then its cache_key will now be fetching /cache-keys/models/1, and returning something like /models/1-111 (where 111 is the timestamp), which cache will use to fetch the compiled template fragment.
When you update the model, model.cache_key will begin returning /models/1-222 (assuming 222 is the new timestamp), but for the first second after that, cache will keep seeing /models/1-111, since that is what is returned by cache_key. Once 1 second passes, all of the servers will get a cache-miss on /cache-keys/models/1 and will try to regenerate it. If they all recreated it immediately, it would defeat the point of overriding cache_key. But because we set race_condition_ttl to 2, all of the servers except for the first will be delayed for 2 seconds, during which time they will continue to fetch the old cached template based on the old cache key. Once the 2 seconds have passed, fetch will begin returning the new cache key (which will have been updated by the first thread which tried to read/update /cache-keys/models/1) and they will get a cache hit, returning the template compiled by that first thread.
Ta-da! Stampede averted.
Note that if you did this, you would be doing twice as many cache reads, but depending on how common stampedes are, it could be worth it.
I haven't tested this. If you try it, please let me know how it goes :)
The :race_condition_ttl setting in ActiveSupport::Cache::Store#fetch should help avoid this problem. As the documentation says:
Setting :race_condition_ttl is very useful in situations where a cache entry is used very frequently and is under heavy load. If a cache expires and due to heavy load seven different processes will try to read data natively and then they all will try to write to cache. To avoid that case the first process to find an expired cache entry will bump the cache expiration time by the value set in :race_condition_ttl. Yes, this process is extending the time for a stale value by another few seconds. Because of extended life of the previous cache, other processes will continue to use slightly stale data for a just a bit longer. In the meantime that first process will go ahead and will write into cache the new value. After that all the processes will start getting new value. The key is to keep :race_condition_ttl small.
Great question. A partial answer that applies to single multi-threaded Rails servers but not multiprocess(or) environments (thanks to Nick Urban for drawing this distinction) is that the ActionView template compilation code blocks on a mutex that is per template. See line 230 in template.rb here. Notice there is a check for completed compilation both before grabbing the lock and after.
The effect is to serialize attempts to compile the same template, where only the first will actually do the compilation and the rest will get the already completed result.
Very interesting question. I searched on google (you get more results if you search for "dog pile" instead of "stampede") but like you, did I not get any answers, except this one blog post: protecting from dogpile using memcache.
Basically does it store you fragment in two keys: key:timestamp (where timestamp would be updated_at for active record objects) and key:last.
def custom_write_dogpile(key, timestamp, fragment, options)
Rails.cache.write(key + ':' + timestamp.to_s, fragment)
Rails.cache.write(key + ':last', fragment)
Rails.cache.delete(key + ':refresh-thread')
fragment
end
Now when reading from the cache, and trying to fetch a non existing cache, will it instead try to fecth the key:last fragment instead:
def custom_read_dogpile(key, timestamp, options)
result = Rails.cache.read(timestamp_key(name, timestamp))
if result.blank?
Rails.cache.write(name + ':refresh-thread', 0, raw: true, unless_exist: true, expires_in: 5.seconds)
if Rails.cache.increment(name + ':refresh-thread') == 1
# The cache didn't exists
result = nil
else
# Fetch the last cache, as the new one has not been created yet
result = Rails.cache.read(name + ':last')
end
end
result
end
This is a simplified summary of the by Moshe Bergman that i linked to before, or you can find here.
There is no protection against memcache stampedes. This is a real problem when multiple machines are involved and multiple processes on those multiple machines. -Ouch-.
The problem is compounded when one of the key processes has "died" leaving any "locking" ... locked.
In order to prevent stampedes you have to re-compute the data before it expires. So, if your data is valid for 10 minutes, you need to regenerate again at the 5th minute and re-set the data with a new expiration for 10 more minutes. Thus you don't wait until the data expires to set it again.
Should also not allow your data to expire at the 10 minute mark, but re-compute it every 5 minutes, and it should never expire. :)
You can use wget & cron to periodically call the code.
I recommend using redis, which will allow you to save the data and reload it in the advent of a crash.
-daniel
A reasonable strategy would be to:
use a :race_condition_ttl with at least the expected time it takes to refresh the resource. Setting it to less time than expected to perform a refresh is not advisable as the angry mob will end up trying to refresh it, resulting in a stampede.
use an :expires_in time calculated as the maximum acceptable expiry time minus the :race_condition_ttl to allow for refreshing the resource by a single worker and avoiding a stampede.
Using the above strategy will ensure that you don't exceed your expiry/staleness deadline and also avoid a stampede. It works because only one worker gets through to refresh, whilst the angry mob are held off using the cache value with the race_condition_ttl extension time right up to the originally intended expiry time.

Resources