Safety of Thread.current[] usage in Rails

I keep getting conflicting opinions on the practice of storing information in the Thread.current hash (e.g., the current_user, the current subdomain, etc.). The technique has been proposed as a way to simplify later processing within the model layer (query scoping, auditing, etc.).
Why are my thread variables intermittent in Rails?
Alternative to using Thread.current in API wrapper for Rails
Are Thread.current[] values and class level attributes safe to use in rails?
Many consider the practice unacceptable because it breaks the MVC pattern.
Others express concerns about reliability/safety of the approach, and my 2-part question focuses on the latter aspect.
Is the Thread.current hash guaranteed to be available and private to one and only one response, throughout its entire cycle?
I understand that a thread, at the end of a response, may well be handed over to other incoming requests, thereby leaking any information stored in Thread.current. Would clearing such information before the end of the response (e.g. by executing Thread.current[:user] = nil from a controller's after_filter) suffice to prevent such a security breach?
Thanks!
Giuseppe

There is no specific reason to stay away from thread-local variables. The main issues are:
They are harder to test: you have to remember to set the thread-local variables when testing code that uses them.
Classes that use thread locals need to know that these objects are not passed to them but live in a thread-local variable, and this kind of indirection usually breaks the Law of Demeter.
Not cleaning up thread locals can be an issue if your framework reuses threads: the thread-local variable would already be initialized, and code that relies on ||= to initialize variables might fail.
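The last point can be reproduced without Rails. The following is a minimal sketch in which one thread handles two "requests" in a row; handle_request and the :current_user key are illustrative names, not part of any framework.

```ruby
# Simulate a reused worker thread serving two requests back to back.
# handle_request is a hypothetical stand-in for per-request setup code.
def handle_request(user)
  # ||= only assigns when the slot is nil, so a leftover value wins.
  Thread.current[:current_user] ||= user
  Thread.current[:current_user]
end

results = Thread.new do
  first  = handle_request("alice")
  second = handle_request("bob") # same thread, nothing cleared in between
  [first, second]
end.value

p results
```

The second request sees the first request's user, which is exactly the kind of leak the question is worried about.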
So, while using them is not completely out of the question, the best approach is to avoid them. From time to time, though, you hit a wall where a thread local is the simplest possible solution, and you have to choose between a less-than-perfect object-oriented model with the thread local and changing quite a lot of code to achieve the same thing.
So, it's mostly a matter of working out which solution is best for your case. If you really are going down the thread-local path, I'd strongly advise you to do it with blocks that clean up after they are done, like the following:
around_filter :do_with_current_user
def do_with_current_user
  Thread.current[:current_user] = self.current_user
  begin
    yield
  ensure
    Thread.current[:current_user] = nil
  end
end
This ensures the thread-local variable is cleared before the thread is recycled for another request.
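The same cleanup pattern can be checked outside a controller. This is a small sketch, assuming a hypothetical helper named with_current_user, showing that the ensure clause clears the value even when the wrapped block raises.

```ruby
# Hypothetical helper mirroring the around filter, without Rails.
def with_current_user(user)
  Thread.current[:current_user] = user
  yield
ensure
  # Runs on normal return and when the block raises.
  Thread.current[:current_user] = nil
end

seen = nil
begin
  with_current_user("alice") do
    seen = Thread.current[:current_user]
    raise "boom" # simulate a failing action
  end
rescue RuntimeError
  # The error still propagates; only the cleanup is guaranteed.
end

after = Thread.current[:current_user]
```

An after_filter alone would likely be skipped if the action raised, which is why the block form with ensure is the safer shape.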

This little gem ensures your thread/request-local variables do not stick between requests: https://github.com/steveklabnik/request_store

The accepted answer covers the question, but Rails 5.2 and later provide an abstract superclass, ActiveSupport::CurrentAttributes, which uses Thread.current.
I thought I would provide a link to it as a possible (if unpopular) solution.
https://github.com/rails/rails/blob/master/activesupport/lib/active_support/current_attributes.rb

The accepted answer is technically accurate, but as that answer points out gently, and http://m.onkey.org/thread-safety-for-your-rails points out not so gently:
Don't use thread local storage, Thread.current if you don't absolutely have to
The request_store gem is another (better) solution, but read its README for more reasons to stay away from thread-local storage.
There is almost always a better way.

Related

In Rails, is it 'thread safe' to create a class instance variable as long as I clear it at the end of the request?

Would it be considered "thread safe" to store a class-level instance variable (i.e. MyClass.foo) as long as I clear it at the end of the request? For example, setting the value in a before_filter and clearing it in an after_filter?
My understanding is if I do not clear it.. it will exist for future requests which I do not want. But if I set it and clear it.. is that good enough? Or could two requests overlap and cause a collision and mutated data?
This is probably a really bad idea as it might bleed information between requests and sessions. If you need a short-term cache that's thread-safe, you can always spike methods on to the request object that's created for you.
If you're trying to push session information down into the global context of a model then you're going against the MVC design pattern. Any information a model needs should be passed in in a well-defined manner, especially as this makes testing orders of magnitude simpler and more reliable.
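To make the overlap concern concrete, here is a minimal sketch of two simulated requests sharing one class-level attribute. The queues only force the interleaving so the outcome is deterministic; in a real multi-threaded server the timing would be arbitrary. MyClass.foo follows the question's naming, and everything else is illustrative.

```ruby
require "thread"

class MyClass
  class << self
    attr_accessor :foo # class-level instance variable, shared by all threads
  end
end

a_has_set = Queue.new
b_has_set = Queue.new

request_a = Thread.new do
  MyClass.foo = :a   # "before_filter" of request A
  a_has_set << true
  b_has_set.pop      # request B runs while A is still in progress
  MyClass.foo        # A reads back what it believes is its own value
end

request_b = Thread.new do
  a_has_set.pop
  MyClass.foo = :b   # "before_filter" of request B
  b_has_set << true
end

request_b.join
seen_by_a = request_a.value
```

Request A ends up reading request B's value, so setting and clearing around the request does not help once two requests run in the same process at the same time.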

Threading in Rails - do params[] persist?

I am trying to spawn a thread in Rails. I am usually not comfortable using threads as I will need to have an in-depth knowledge of Rails' request/response cycle, yet I cannot avoid using one as my request times out.
In order to avoid the time out, I am using a thread within a request. My question here is simple. The thread that I've used accesses a params[] variable inside it. And things seem to work OK now. I want to know whether this is right? I'd be happy if someone can throw some light on using Threads in Rails during request/response cycle.
[Starting a bounty]
The short answer is yes, but only to a degree: the binding in which the thread was created will continue to persist. The params will still exist as long as no one (including Rails) goes out of their way to modify or delete the params hash. Rails does not do that; it relies on the garbage collector to clean up these objects. Since the thread has access to the context in which it was created (called the "binding" in Ruby), all the variables that can be reached from that scope (effectively the entire state when the thread was created) cannot be collected by the garbage collector. However, as execution continues in the main thread, the values of the variables in that context can be changed by the main thread, or even by the thread you created, if it can access them. This is the benefit, and the downfall, of threads: they share memory with everything else.
You can emulate an environment very similar to Rails to test your problem, using a function such as this one: http://gist.github.com/637719. As you can see, the code still prints 5.
However, this is not the correct way to do this. The better way to pass data to a thread is to pass it to Thread.new, like so:
# always dup objects when passing into a thread, else you really
# haven't done yourself any good; it would still be the same memory
Thread.new(params.dup) do |params|
  puts params[:foo]
end
This way, you can be sure that any modifications to params will not affect your thread. The best practice is to only use data you pass to your thread in this way, or things that the thread itself created. Relying on the state of the program outside the thread is dangerous.
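The effect of passing a copy can be sketched without Rails. In this example a plain Hash stands in for params (an assumption; the real object in a controller is not a plain Hash), and a queue delays the thread until the main thread has changed its own hash.

```ruby
require "thread"

params = { foo: "original" } # plain Hash standing in for Rails params
go = Queue.new

worker = Thread.new(params.dup) do |thread_params|
  go.pop               # wait until the main thread has changed its hash
  thread_params[:foo]
end

# The main thread carries on and changes its own hash.
params[:foo] = "changed by main thread"
params.delete(:foo)
go << true

seen = worker.value
```

Note that dup is a shallow copy: reassigning or deleting keys in the original is isolated, but mutating a value object in place (for example with String#<<) would still be visible to the thread.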
So as you can see, there are good reasons that this is not recommended. Multithreaded programming is hard, even in Ruby, and especially when you're dealing with as many libraries and dependencies as are used in Rails. In fact, Ruby seems to make it deceptively easy, but it's basically a trap. The bugs you will encounter are very subtle; they will happen "randomly" and be very hard to debug. This is why things like Resque, Delayed Job, and other background processing libraries are used so widely and recommended for Rails apps, and I would recommend the same.
The question is less whether Rails persists the value and more whether it keeps the request open while the thread is running.
The value won't persist once the request ends, and I wouldn't recommend holding the request open unless there is a real need. As other users have said, some work is just better done in a delayed job.
Having said that, we have used threading a couple of times to query multiple sources concurrently and actually reduce the response time of an app (it was only for admins, so it didn't need fast response times). If memory serves, the request stays open if you call join at the end and wait for each thread to finish before continuing.
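A rough sketch of that fan-out-and-join shape follows. The two lambdas are hypothetical stand-ins for slow remote sources; Thread#value joins the thread and returns the block's result, so the caller does not continue until every thread has finished.

```ruby
# Hypothetical stand-ins for slow remote sources.
sources = {
  inventory: -> { sleep 0.05; 3 },
  pricing:   -> { sleep 0.05; 42 }
}

# Start one thread per source; each receives its data through the block
# parameters rather than reaching into shared state.
threads = sources.map do |name, fetch|
  Thread.new { [name, fetch.call] }
end

# Thread#value waits for the thread to finish, then returns its result.
results = threads.map(&:value).to_h
```

Because every thread is joined before the response is built, nothing outlives the request, which avoids the persistence question entirely.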

Rails classes reloading in production mode

Is there a way to reload ruby model in runtime?
For example I've a model
class Model
  def self.all_models
    @@all_models ||= Model.all
  end
end
Records in this model change very rarely, but when they do, I don't want to reload the whole application, just this one class.
On a development server, this is not a problem. On a production server, it is a big one.
In reality it's not feasible without restarting the server. The best you could do is add a before filter in ApplicationController to update class variables in each worker thread, but it has to be done on every request. You can't turn this behaviour off and on easily.
If it's a resource-intensive operation, you can settle for a less intensive test, such as comparing a value in the database or the last-modified time of a file to a value recorded at runtime, to determine whether the full reload should occur. But you would still have to do this as part of every request.
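One way to picture the cheap check is the sketch below. It is plain Ruby under stated assumptions: STORE is a hypothetical stand-in for the database, its :version key is an invented change stamp, and ModelCache is an illustrative name. It is not synchronized, so a threaded server would need a lock around the reload.

```ruby
# Hypothetical stand-in for the database: records plus a cheap version stamp.
STORE = { version: 1, records: ["a", "b"] }

class ModelCache
  @version = nil
  @records = nil

  class << self
    attr_reader :loads

    # Called on every request (e.g. from a before filter): one cheap
    # comparison, and the expensive load only when the stamp has changed.
    def all_models
      current = STORE[:version]
      if @version != current
        @records = STORE[:records].dup # stands in for the expensive load
        @version = current
        @loads = (@loads || 0) + 1
      end
      @records
    end
  end
end

ModelCache.all_models
ModelCache.all_models   # stamp unchanged: no reload
STORE[:records] << "c"
STORE[:version] += 1    # a writer bumps the stamp
fresh = ModelCache.all_models
```

Each worker process would hold its own copy of the cache, so every process picks up the change on its next request rather than all at once.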
However, to the best of my knowledge modifying routes once the server has been loaded is impossible. Modifying other site wide variables may require a little more effort, such as reading from a file/database and updating in a before filter.
There may be another way, but I haven't tried it at all. So there's no guarantee.
If you're using a Ruby-based server such as Mongrel, you could in theory use hijack to update the model/routes/variables in the control thread from which worker threads are spawned.

Storing Objects in a Session in Rails

I have always been taught that storing objects in a session was a bad idea. Instead IDs should be stored that retrieve the record when needed.
However, I have an application that I wonder is an exception to this rule. I'm building a flashcard application, and the words being quizzed are in a table in the database whose schema doesn't change. I want to store the words currently being quizzed in a session, so a user can finish where they started in case they move on to a separate page.
In this case, is it possible to get away with storing these words as objects in the session? If so, why? The reason I ask is that the quiz is designed to move quickly, and I'd hate to waste a database call on retrieving a record that never changes in the first place. However, perhaps there are other negatives to a large session that I'm not aware of.
*For the record, I have tried caching it with the built-in memcache methods in Rails 2.3, but apparently that has a maximum size per item of 1MB.
The main reason not to store objects in the session is that if the object structure changes, you will get an exception. Consider the following:
class Foo
  attr_accessor :bar
end
class Bar
end
foo = Foo.new
foo.bar = Bar.new
put_in_session(foo)
Then, in a subsequent release of the project, you change Bar's name. You reboot the server, and try to grab foo out of the session. When it tries to deserialize, it fails to find Bar and explodes.
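The failure can be reproduced in plain Ruby. This sketch assumes the session store serializes with Marshal, and it simulates the rename by removing the Bar constant before loading; put_in_session from the example above is replaced by a direct Marshal.dump.

```ruby
class Foo
  attr_accessor :bar
end

class Bar
end

foo = Foo.new
foo.bar = Bar.new
stored = Marshal.dump(foo) # roughly what a Marshal-based session store keeps

# Simulate the next release, in which Bar no longer exists under that name.
Object.send(:remove_const, :Bar)

error = begin
  Marshal.load(stored)
  nil
rescue ArgumentError => e
  e # Marshal cannot resolve the class named in the stored data
end

puts error.message
```

Note that foo itself is still a valid Foo; it is the nested Bar that breaks deserialization, which is how an object can drag along more than is immediately apparent.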
It might seem like it would be easy to avoid this pitfall, but in practice, I've seen it bite a number of people. This is just because serializing an object can sometimes take more along with it than is immediately apparent (this sort of thing is supposed to be transparent) and unless you have rigorous rules about this, things will tend to get flummoxed up.
The reason it's normally frowned upon is that it's extremely common for this to bite people in ActiveRecord, since it's quite common for the structure of your app to shift over time, and sessions can be deserialized a week or longer after they were originally created.
If you understand all that and are willing to put in the energy to be sure that your model does not change and is not serializing anything extra, you're probably fine. But be careful :)
Rails tends to encourage RESTful design, and using sessions isn't very RESTful. I'd probably make a Quiz resource that has a bunch of words, as well as a current_word. This way, when they come back, you'll know where they were.
Now, REST isn't everything (depending on who you talk to), but there's a pretty good case against large sessions. Remember that sessions write things to and from disk, and the more data that you're writing, the longer it takes to read back...
Since your app is a Rails app, I would suggest either:
Using your clients' ability to cache by caching the cards in JavaScript (you'd need a fairly Ajax-y app to do this; see the latest RailsCast for some interesting points on JavaScript page caching).
Using one of the many other Rails-supported server-side caching options (e.g. Memcached) to cache this data.
A much more insidious issue you'll encounter storing objects directly in the session is when you're using CookieStore (the default in Rails 2+ I believe). It's very easy to get CookieOverflow errors which are very hard to recover from.

How to access the variables defined in environment.rb in RoR?

I want to create a thread object in environment.rb and use it in some other action of some controller.
How should I do it?
Thanks in advance.
Actually, I want three processes to be running perpetually which are fetching some data and storing it in database. That's why I am using threads. Is there any other way to do so?
To answer your initial question, constants declared in environment.rb are available throughout the entire codebase. Avoid doing so if you can, though; this can become configuration spaghetti pretty quickly.
More broadly, although Rails has been (from what I understand) thread-safe since version 2.2, threads are still quite uncommon - particularly in MRI - as a way to provide concurrent operation, and MRI's green threads are anyway not particularly helpful. Consider using a message queue like Starling that spins up other Ruby processes to perform asynchronous work.
Further to what Brian says, consider using an initializer (put in config/initializers to have it executed). I think it makes the intent more clear than using environment.rb.
Be very careful with this. To the best of my knowledge, rails is not thread-safe. And trying to use threads safely in the face of all the magic (excuse me, "meta programming") it does sounds risky as all get out.
Why do you want a thread object anyway?
In response to the comment, saying Rails is thread-safe might not mean as much as you think. I'd certainly be leery of counting on it if I didn't need to.
