Are rails controllers multithreaded? Thread.exclusive in controllers - ruby-on-rails

Are Rails controllers multithreaded?
If so, can I protect a certain piece of code (which fires only once every ten minutes) from being run from multiple threads by simply doing
require 'thread'
Thread.exclusive do
# stuff here
end
or do I need to somehow synchronize on a monitor?

Running rake middleware on a basic rails app gives the following:
use Rack::Lock
use ActionController::Failsafe
use ActionController::Reloader
use ActiveRecord::ConnectionAdapters::ConnectionManagement
use ActiveRecord::QueryCache
use ActiveRecord::SessionStore, #<Proc:0x017fb394@(eval):8>
use ActionController::RewindableInput
use ActionController::ParamsParser
use Rack::MethodOverride
use Rack::Head
run ActionController::Dispatcher.new
The first item in the Rack stack is Rack::Lock. This puts a lock around each request, so only one request is handled at a time; as such, a standard Rails app is single-threaded. You can, however, spawn new threads within a request, which would make your app multithreaded, though most people never need to do this.
If you are having issues…
require 'thread'
Thread.exclusive do
# stuff here
end
… would ensure that the stuff inside the block is never run in parallel with any other code. Creating a shared Mutex between all threads (in a class variable or something, though this could be wiped when classes reload in dev mode, so be careful) and locking on it, as Rack::Lock#call does, is preferable if you just want to ensure that no two instances of the same code are executed at the same time.
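For example, here is a minimal sketch of that shared-mutex approach; the controller, constant, and helper names are illustrative, not from the original question:

require 'thread'

class ReportsController < ApplicationController
  # One Mutex shared by every instance of this controller class.
  # Beware: class-level state can be wiped when classes reload in development mode.
  REFRESH_LOCK = Mutex.new

  def show
    REFRESH_LOCK.synchronize do
      # Only one thread can be inside this block at a time,
      # even if requests are being served concurrently.
      refresh_expensive_cache if cache_stale?   # hypothetical helpers
    end
    # ... render as usual ...
  end
end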
Also, for the record, each request cycle creates and then discards one controller instance. No two requests should see the same instance, although they may see the same class.
Setting config.threadsafe! voids almost everything I said: it removes Rack::Lock from the stack, which means you will need to set up a mutex manually to prevent double entry. Don't do it unless you have a really good reason.
Even without Rack::Lock you will still get one controller instance per request. The entry point to your controller ensures that; notice the call to new in process.
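Conceptually, that class-level entry point does something like the following. This is just a toy illustration of the idea, not the actual Rails source:

# Not real Rails code, just the shape of the idea: the class-level entry
# point instantiates a fresh controller for every request, so per-request
# instance variables die with that object.
class ToyController
  def self.process(request, response)
    new.process(request, response)   # a brand-new instance every time
  end

  def process(request, response)
    # per-request state lives in instance variables on this object
  end
end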

My understanding is that a new controller instance is created for each HTTP request that is processed by a controller.

In this setup Ruby effectively processes one request at a time, so at any moment a controller can only handle one request; if more than one request arrives, the extra requests are queued up. To avoid this, people usually run a small pack of Mongrels to get reasonable concurrency. It works like this (straight from the Mongrel wiki FAQ):
A request hits mongrel.
Mongrel makes a thread and parses the HTTP request headers
If the body is small, then it puts the body into a StringIO
If the body is large then it streams the body to a temp file
When the request is "cooked", it calls the RailsHandler.
The RailsHandler sees if the file is possibly page cached, if so then it sends the cached page.
Now you're finally ready to process the Rails request. LOCK!
Still locked, Mongrel calls the Rails Dispatcher to handle the request, passing in the headers, and StringIO or Tempfile for body.
When Rails is done, UNLOCK! . Rails has (hopefully) put all of its output into a StringIO.
Mongrel then takes this StringIO output, any output headers, and streams them back to the client super fast.
Notice that there is no locking if the page is cached.
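In Ruby terms, the LOCK/UNLOCK around the Rails dispatch amounts to roughly the following. This is an illustrative sketch, not Mongrel's actual source, and the helper names are made up:

require 'thread'

RAILS_DISPATCH_LOCK = Mutex.new   # shared by all of Mongrel's request threads

def handle_request(request, body_io)
  if cached = look_up_page_cache(request)   # hypothetical page-cache check
    return cached                           # served without ever taking the lock
  end

  RAILS_DISPATCH_LOCK.synchronize do
    # Only one thread at a time runs Rails itself.
    dispatch_to_rails(request, body_io)     # hypothetical call into the Dispatcher
  end
end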

Related

Call a rails 3 controller action after server startup

We have a Rails 3 controller action that calls a remote API and caches the response. Unfortunately it's a slow process, so the first person to call that action has to wait quite a long time for the response. How can I call this controller action when Rails starts (or at some point soon after) so the first user does not face that long delay?
Thanks in advance
If you want to load some data on startup, I recommend doing that in an initializer, setting a variable somewhere appropriate, and then having your controller just read from that variable.
But now you're hitting the remote API for each app server instance. Also refreshing the cache in each of those instances will be challenging. A better approach would be to stash the data in a shared store like Redis or Memcached, and then set it using a cron job, e.g. via the whenever gem. Rails is not really designed for inter-request caching.
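A sketch of that approach, assuming a hypothetical SlowApi client and the whenever gem for scheduling; the task and cache key names are illustrative:

# lib/tasks/remote_cache.rake
namespace :remote_cache do
  desc "Fetch the slow remote API response and store it in the shared cache"
  task :refresh => :environment do
    data = SlowApi.fetch_report   # the expensive remote call (hypothetical client)
    Rails.cache.write("remote_report", data, :expires_in => 15.minutes)
  end
end

# config/schedule.rb (whenever gem)
every 10.minutes do
  rake "remote_cache:refresh"
end

The controller then just reads Rails.cache.read("remote_report") and never makes the slow call itself.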

In Rails 3, how do I call some code via a controller but completely after the Request/Response cycle is done?

I have a very weird situation: I have a system where a client app (Client) makes an HTTP GET call to my Rails server, and that controller does some handling and then needs to make a separate call to the Client via a different pathway (i.e. it actually goes via Rabbit to a proxy and the proxy calls the Client). I can't change the pathway for that different call and I can't change the Client at all (it's a 3rd party system).
However: the issue is: the call via the different pathway fails UNLESS the HTTP GET from the client is completed.
So I'm trying to figure out: is there a way to have Rails finish the HTTP GET response and then make this additional call?
I've tried:
1) after_filter: this doesn't work because the after filter is apparently still within the Request/Response cycle so the TCP/HTTP response back to the Client hasn't completed.
2) enqueuing a worker: this works, but it is not ideal because if the workers are backed up, this call back to the client may not happen right away and it really does need to happen right after the Client calls the Rails app
3) starting a separate thread: this may work, but it makes me nervous: adding threading explicitly in Rails could be fraught with peril.
I welcome any ideas/suggestions.
Again, in short, the goal is: process the HTTP GET call to the Rails app and return a 200 OK back to the Client, completely finishing the HTTP request/response cycle, and only then call some extra code.
I can provide any further details if that would help. I've found both #1 and #2 as recommended options but neither of them are quite what I need.
Ideally, there would be some "after_response" callback in Rails that allows some code to run but after the full request/response cycle is done.
Possibly use an around filter? Around filters let us define methods that wrap around every action Rails calls. With an around filter on the above controller, I could control the execution of every action, running code before and after the action, or even skipping the action entirely under certain circumstances.
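A minimal sketch of that idea in Rails 3 syntax follows; the controller, filter, and helper names are illustrative, and note that the filter still runs inside the request/response cycle, which is exactly the limitation described above:

class WidgetsController < ApplicationController
  around_filter :notify_client_afterwards, :only => :show

  def show
    render :json => { :ok => true }   # normal action work
  end

  private

  def notify_client_afterwards
    yield   # run the action (and the rest of the filter chain)
    # The response object is populated here, but it has NOT yet been
    # written back to the client, so this is not truly "after response".
    notify_client_via_rabbit   # hypothetical out-of-band call
  end
end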
So what I ended up doing was using a gem that I had long ago helped with: Spawnling
It turns out that this works well, although it required a tweak to get it working with Rails 3.2. It lets me spawn a thread to make the extra, out-of-band callback to the Client while the normal controller request completes. And I don't have to worry about thread management or ActiveRecord connection management; Spawnling handles that.
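The usage ends up looking roughly like this; it's a sketch, and the controller and helper names (as well as the :method option value) are illustrative:

class CallbacksController < ApplicationController
  def handle
    result = do_normal_processing   # hypothetical

    # Spawnling runs the block in a separate thread (or process, depending
    # on configuration) and handles ActiveRecord connections for it.
    Spawnling.new(:method => :thread) do
      notify_client_via_rabbit(result)   # hypothetical out-of-band call
    end

    head :ok   # respond immediately; the spawned work carries on
  end
end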
It's still not ideal, but pretty close. And it's slightly better than enqueuing a Resque/Sidekiq worker as there's no risk of worker backlog causing an unexpected delay.
I still wish there was an "after_response_sent" callback or something, but I guess this is too unusual a request.

Rails or Sinatra app - how to maintain background threads

I'd like to maintain a data structure on a Sinatra or Rails server (doesn't matter) that is accessible for all HTTP requests that arrive to it (i.e. to support concurrent modification). I don't want to rely on a database or similar because that doesn't allow me to code callbacks for the modification of this data structure and put concurrent blocks on the HTTP response threads.
Since HTTP is stateless there's apparently no easy way to achieve this.
How can I make a process maintain data in the background for all the requests that arrive to an HTTP server, without relying on external programs and middleware? Does it require me to modify Rails or Sinatra to achieve this? Is there any alternative, even outside Ruby?
When using Sinatra, you can just code in a thread at the end of your application:
http://blog.markwatson.com/2011/11/ruby-sinatra-web-apps-with-background.html
Using this, you could maintain a worker that would do things even as http requests come in and out.
Sinatra also has the methods before and after which run before and after each request, respectively.
So if you wanted to add data to a data structure before each request is handled you could:
before do
puts request
end
Using these tools, you can easily achieve what you want to do.
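Putting those pieces together, here is a sketch of a Sinatra app with an in-process shared structure and a background worker thread; all names are illustrative:

require 'sinatra'
require 'thread'
require 'json'

EVENTS      = []          # shared, in-process data structure
EVENTS_LOCK = Mutex.new   # guards concurrent access from request threads

# Background worker: runs for the life of the process, alongside requests.
Thread.new do
  loop do
    EVENTS_LOCK.synchronize do
      EVENTS.shift while EVENTS.size > 1000   # e.g. periodic trimming
    end
    sleep 5
  end
end

before do
  EVENTS_LOCK.synchronize { EVENTS << { :path => request.path, :at => Time.now } }
end

get '/events' do
  EVENTS_LOCK.synchronize { EVENTS.to_json }
end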

In Rails, can an internal request be generated that behaves identically to an HTTP request?

Within my Rails application, I'd like to generate requests that behave identically to "genuine" HTTP requests.
For a somewhat contrived example, suppose I were creating a system that could batch incoming HTTP requests for later processing. The interface for it would be something like:
Create a new batch resource via the usual CRUD methodology (POST, receive a location to the newly created resource).
Update the batch resource by sending it URLs, HTTP methods, and data to be added to the collection of requests it's supposed to later perform in bulk.
"Process" the batch resource, wherein it would iterate over its collection of requests (each of which might be represented by a URL, HTTP method, and a set of data), and somehow tell Rails to process those requests in the same way as it would were they coming in as normal, "non-batched" requests.
It seems to me that there are two important pieces of work that need to happen to make this functional:
First, the incoming requests need to be somehow saved for later. This could be simply a case of saving various aspects of the incoming request, such as the path, method, data, headers, etc. that are already exposed as part of the incoming request object within a controller. It would be nice if there was a more "automatic" way of handling this--perhaps something more like object marshaling or serialization--but the brute force approach of recording individual parameters should work as well.
Second, the saved requests need to be able to be re-injected into the rails application at a later time, and go through the same process that a normal HTTP request goes through: routing, controllers, views, etc. I'd like to be able to capture the response in a string, much as the HTTP client would have seen it, and I'd also like to do this using Rails' internal machinery rather than simply using an HTTP library to have the application literally make a new request to itself.
Thoughts?
A straightforward way of storing the arguments would be to serialize the request object in your controller; this should contain all the important data.
To replay the requests later, I would consider using the Dispatcher.dispatch class method, which takes three arguments: the CGI request, the session options (CgiRequest::DEFAULT_SESSION_OPTIONS should be OK), and the stream the output is written to.
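A rough sketch of what replaying a stored request through that (Rails 2.x era) API might look like; how the request object is serialized and restored is hand-waved and hypothetical here, and exact constant names may vary by Rails version:

require 'stringio'

output = StringIO.new
cgi    = restore_saved_cgi_request(operation)   # hypothetical deserialization step

ActionController::Dispatcher.dispatch(
  cgi,
  ActionController::CgiRequest::DEFAULT_SESSION_OPTIONS,
  output
)

response_text = output.string   # the response, much as an HTTP client would have seen it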
Rack Middleware
After doing a lot of investigation after I'd initially asked this question, I eventually experimented with and successfully implemented a solution using Rack Middleware.
A Basic Methodology
In the `call' method of the middleware:
1) Check whether the request is being made as a nested resource of a transaction object, or whether it is an otherwise ordinary request. If it's ordinary, proceed as normal through the middleware by calling app.call(env), and return the status, headers, and response.
2) Unless this is a transaction commit, record the "interesting" parts of the request's env hash and save them to the database as an "operation" associated with this transaction object.
3) If this is a transaction commit, retrieve all of the relevant operations for this transaction. Either create a new request environment, or clone the existing one and populate it with the values saved for the operation. Also make a copy of the original request environment for later restoration, if control is meant to pass through the application normally post-commit. Feed the constructed environment into a call to app.call(env). Repeat for each operation.
4) If the original request environment was preserved, restore it and make one final call to app.call(env), and return the status, headers, and response from that final call as the return value of the middleware's `call' method.
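A condensed sketch of that call structure; the class name and the helper methods (transaction_request?, commit_request?, operations_for, record_operation) are illustrative, not taken from the linked example app:

class TransactionMiddleware
  def initialize(app)
    @app = app
  end

  def call(env)
    # Ordinary request: pass straight through.
    return @app.call(env) unless transaction_request?(env)

    if commit_request?(env)
      original_env = env.dup
      operations_for(env).each do |operation|
        replay_env = original_env.merge(operation.env_values)   # rebuild a request env
        @app.call(replay_env)                                   # re-inject it into the app
      end
      @app.call(original_env)   # finally let the commit request itself through
    else
      record_operation(env)     # save the interesting parts of env for later
      [202, { "Content-Type" => "text/plain" }, ["Accepted"]]
    end
  end
end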
A Sample Application
I've put together an example implementation of the methodology described here, which I've made available on GitHub. It also contains an in-depth example describing how the implementation might look from an API perspective. Be warned: it's quite rough, totally undocumented (with the exception of the README), and quite possibly in violation of Rails good coding practices. It can be obtained here:
http://github.com/mcwehner/transact-example
A Plugin/Gem
I'm also beginning work on a plugin or gem that will provide this sort of interface to any Rails application. It's in its formative stages (in fact it's completely devoid of code at the moment), and work on it will likely proceed slowly. Explore it as it develops here:
http://github.com/mcwehner/transact
See also
Railscasts - Rack Middleware
Rails Guides - Rails on Rack

Rails/Passenger/Unknown Content Type

We have the following situation:
We invoke a url which runs an action in a controller. The action is fairly long running - it builds a big string of XML, generates a PDF and is supposed to redirect when done.
After 60 seconds or so, the browser gets a 200, but with a content type of "application/x-unknown-content-type", no body, and no response headers (using Tamper to look at headers).
The controller action actually continues to run to completion, producing the PDF
This is happening in our prod environment, in staging the controller action runs to completion, redirecting as expected.
Any suggestions where to look?
We're running Rails 2.2.2 on Apache/Phusion Passenger.
Thanks,
I am not 100% sure, but your Apache is probably timing out the request to the Rails application. Could you try setting Apache's Timeout directive higher? Something like:
Timeout 120
I'd consider bumping this task off to a job queue and returning immediately rather than leaving the user to sit and wait. Otherwise you're heading for a world of problems when lots of people try to use this and you run out of available Rails app instances to handle any new connections.
One way to do this easily might be to use an Ajax post to trigger creating the document, drop the work into Delayed Job, and then run a 10-second periodic check via Ajax to inform the waiting user of the job's status. Once delayed_job has finished processing your task in the background and updated something in the database to indicate it is complete, you can redirect the user via Ajax to the newly created document.
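A rough sketch of that pattern with delayed_job (using its delay method; older versions used send_later instead); the model, actions, and column names are illustrative:

class ReportsController < ApplicationController
  # Kicks off the slow work and returns immediately.
  def create
    report = Report.create!(:status => "pending")
    report.delay.generate_pdf   # delayed_job runs generate_pdf in the background
    render :json => { :id => report.id, :status => report.status }
  end

  # Polled via Ajax every ~10 seconds by the waiting page.
  def status
    report = Report.find(params[:id])
    render :json => { :status => report.status, :url => report.pdf_url }
  end
end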
