Dealing with long requests in Unicorn + Rails app - ruby-on-rails

We have a Rails app that we run on Unicorn (2 workers) and nginx. We want to integrate a 3rd party API where processing of a single request takes between 1 and 20 seconds. If we simply create a new controller that proxies to that service the entire app suffers, because it takes only 2 people to make a request to that service via our API and for 20 seconds the rest of the users can't access the rest of our app.
We're thinking about 2 solutions.
Create a separate node.js server that will do all of the requests to the 3rd party API. We would only use Rails for authentication/authorization in this case, and we would redirect the requests to node via nginx using X-Accel-Redirect header (as described here http://blog.bitbucket.org/2012/08/24/segregating-services/)
Replace Unicorn with Thin or Rainbow! and keep proxying in our Rails app, which could then, presumably, allow us to handle many more concurrent connections.
Which solution might we be better off? Or is there something else we could do.
I personally feel that nodes even-loop is better suited for the job here, because in option 2 we would still be blocking many threads and waiting for HTTP requests to finish and in option 1, we could be doing more requests while waiting for the slow ones to finish.
Thanks!

We've been using the X-Accel-Redirect solution in production for a while now and it's working great.
In nginx config under server, we have entries for external services (written in node.js in our case), e.g.
server {
...
location ^~ /some-service {
internal;
rewrite ^/some-service/(.*)$ /$1 break;
proxy_pass http://location-of-some-service:5000;
}
}
In rails we authenticate and authorize the requests and when we want to pass it to some other service, in the controller we do something like
headers['X-Accel-Redirect'] = '/some-service'
render :nothing => true
Now, rails is done with processing the request and hands it back to nginx. Nginx sees the x-accel-redirect header and replays the request to the new url - /some-service which we configured to proxy to our node.js service. Unicorn and rails can now process new requests even if node.js+nginx is still processing that original request.
This way we're using Rails as our main entry point and gatekeeper of our application - that's where authentication and authorization happens. But we were able to move a lot of functionality into these smaller, standalone node.js services when that's more appropriate.

You can use EventMachine in your existing Rails app which would mean much less re-writing. Instead of making a net/http request to the API, you would make a EM::HttpRequest request to the API and add a callback. This is similar to node.js option but does not require a special server IMO.

Related

Is it possible for NGINX resend request to Unicorn?

I want to know if it is possible that if my NGINX send a request to Unicorn and NGINX doesn't receive an answer, can NGINX send the request to Unicorn again?
Not normally, no. I can think of a workaround, but I'm sure that I would advise it. Using the Nginx upstream module you can define multiple upstream backend ends. Nginx has no way of knowing if the the different upstream servers listed are actually the same backend, nor does it care. The docs for the upstream module say: " If an error occurs during communication with a server, the request will be passed to the next server."
So if you arrange to have your app available at two different addresses (sockets, domain names, IPs or ports), then you can set Nginx to use the upstream module to try the same request twice.
If the the app was not responding properly in time the first time the request was made, there's a good chance it won't respond successfully on the follow-up request. Using the upstream module in the intended way is recommended instead: Use two unique app servers, who might be sharing state through a backend resource like a database.

Rails - is new instance of Rails application created for every http request in nginx/passenger

I have deployed a Rails app at Engineyard in production and staging environment. I am curious to know if every HTTP request for my app initializes new instance of my Rails App or not?
Rails is stateless, which means each request to a Rails application has its own environment and variables that are unique to that request. So, a qualified "yes", each request starts a new instance[1] of your app; you can't determine what happened in previous requests, or other requests happening at the same time. But, bear in mind the app will be served from a fixed set of workers.
With Rails on EY, you will be running something like thin or unicorn as the web server. This will have a defined number of workers, let's say 5. Each worker can handle only one request at a time, because that's how rails works. So if your requests take 200ms each, that means you can handle approximately 5 requests per second, for each worker. If one request takes a long time (a few seconds), that worker is not available to take any other requests. Workers are typically not created and removed on Engineyard; they are set up and run continuously until you re-deploy, though for something like Heroku, your app may not have any workers (dynos) and if there are no requests coming in it will have to spin up.
[1] I'm defining instance, as in, a new instance of the application class. Each model and class will be re-instantiated and the #request and #session built from scratch.
According to what I have understood. No, It will definitely not initialize new instance for every request. Then again two questions might arise.
How can multiple user simultaneously login and access my system without interference?
Even though one user takes up too much processing time, how is another user able to access other features.
Answer to the first question is that HTTP is stateless, everything is stored in session, which is in cookie, which is in client machine and not in server. So when you send a HTTP request for a logged in user, browser actually sends the HTTP request with the required credentials/user information from clients cookies to the server without the user knowing it. Multiple requests are just queued and served accordingly. Since our server are very very fast, I feel its just processing instantly.
For the second query, your might might be concurrency. The server you are using (nginx, passenger) has the capacity to serve multiple request at same time. Even if our server might be busy for a particular user(Lets say for video processing), it might serve another request through another thread so that multiple user can simultaneously access our system.

AngularJS and Rails application security in AWS

I have an AngularJS front-end running on a Nginx server that sends requests to a Rails API backend running on a Puma application server. This application is running on an Amazon AWS EC2 instance.
The Rails API is listening on port 8081.
According to this architecture I had to open the HTTP port 8081 in AWS, so that I could receive the request from the front-end.
I have a domain, so It´s supposed all request should come from www.domain.com. However, I have noticed that if I use my EC2 instance name, such as, in example http://ec2-<ip>.eu-west-1.compute.amazonaws.com:8081/users the Rails API is serving all my users information.
How can I avoid this security bug. Where should I block this? In AWS configuration? In my Rails API CORS configuration? Any other place...
This seems an Authorization bug in your Rails API.
Who is the controller that answer to the route /users?
Let's say it is, for e.g., UsersController: in this case, you could have an action
def index
#users = User.all
end
or something similar, that returns the information you see.
Is difficult to give you a solution, without knowing if you need this action (maybe is just auto-generated boilerplate code) or if you want simply hiding it to who is not an Administrator...
Who wrote the Back End API should fix this for you based on your specifications.

Can Unicorn handle HTTPS requests directly (without the request going through Apache/Nginx)?

I'm sure many will consider this a dumb question but I can't find a straight answer anywhere on the web:
Can Unicorn be configured to handle HTTPS requests directly, without reverse-proxying the request through another web server such as Apache or Nginx? (And if so, how?)
Unicorn is optimized to handle fast connections and is not meant to be used for direct client communication. If you consider using https I would assume that your are transferring data through an potentially uncontrollable line. It's better to use another web server for terminating the ssl connection.

Rails' page caching vs. HTTP reverse proxy caches

I've been catching up with the Scaling Rails screencasts. In episode 11 which covers advanced HTTP caching (using reverse proxy caches such as Varnish and Squid etc.), they recommend only considering using a reverse proxy cache once you've already exhausted the possibilities of page, action and fragment caching within your Rails application (as well as memcached etc. but that's not relevant to this question).
What I can't quite understand is how using an HTTP reverse proxy cache can provide a performance boost for an application that already uses page caching. To simplify matters, let's assume that I'm talking about a single host here.
This is my understanding of how both techniques work (maybe I'm wrong):
With page caching the Rails process is hit initially and then generates a static HTML file that is served directly by the Web server for subsequent requests, for as long as the cache for that request is valid. If the cache has expired then Rails is hit again and the static file is regenerated with the updated content ready for the next request
With an HTTP reverse proxy cache the Rails process is hit when the proxy needs to determine whether the content is stale or not. This is done using various HTTP headers such as ETag, Last-Modified etc. If the content is fresh then Rails responds to the proxy with an HTTP 304 Not Modified and the proxy serves its cached content to the browser, or even better, responds with its own HTTP 304. If the content is stale then Rails serves the updated content to the proxy which caches it and then serves it to the browser
If my understanding is correct, then doesn't page caching result in less hits to the Rails process? There isn't all that back and forth to determine if the content is stale, meaning better performance than reverse proxy caching. Why might you use both techniques in conjunction?
You are right.
The only reason to consider it is if your apache sets expires headers. In this configuration, the proxy can take some of the load off apache.
Having said this, apache static vs proxy cache is pretty much an irrelevancy in the rails world. They are both astronomically fast.
The benefits you would get would be for your none page cacheable stuff.
I prefer using proxy caching over page caching (ala heroku), but thats just me, and a digression.
A good proxy cache implementation (e.g., Squid, Traffic Server) is massively more scalable than Apache when using the prefork MPM. If you're using the worker MPM, Apache is OK, but a proxy will still be much more scalable at high loads (tens of thousands of requests / second).
Varnish for example has a feature when the simultaneous requests to the same URL (which is not in cache) are queued and only single/first request actually hits the back-end. That could prevent some nasty dog-pile cases which are nearly impossible to workaround in traditional page caching scenario.
Using a reverse proxy in a setup with only one app server seems a bit overkill IMO.
In a configuration with more than one app server, a reverse proxy (e.g. varnish, etc.) is the most effective way for page caching.
Think of a setup with 2 app servers:
User 'Bob'(redirected to node 'A') posts a new message, the page gets expired and recreated on node 'A'.
User 'Cindy' (redirected to node 'B') requests the page where the new message from 'Bob' should appear, but she can't see the new message, because the page on node 'B' wasn't expired and recreated.
This concurrency problem could be solved with a reverse proxy.

Resources