How does Phusion Passenger reuse threads and processes?

I am setting up an Apache2 webserver running multiple Ruby on Rails web applications with Phusion Passenger. I know that Passenger spawns Ruby processes for handling requests. I have the following questions:
If more than one request has to be handled at the same time, will Passenger spawn multiple processes or multiple (Ruby) threads? How do I configure it so it always spawns single-threaded processes?
If I have two Rails applications, imagine that a request for app A goes to process 1, then later a request for app B arrives. Is it possible that process 1 will handle this request as well? When and how is this possible? In other words, is one process allowed to handle requests for multiple Rails applications?
I have the same Rails application exported under multiple URLs and multiple virtual hosts (such as http:// and https://). Will the same process be able to serve different virtual hosts? (The answer to this seems to be yes: I set a global variable while answering a request to virtual host A, and I was able to retrieve the value in virtual host B.)

Generally speaking, Passenger spawns new processes by forking an ApplicationSpawner, which has the framework and application code pre-loaded into memory, or a FrameworkSpawner, which just has the framework code.
Passenger, as far as I know, doesn't deal in threads. Instead, as the load increases on an application, it will fork that application's ApplicationSpawner and initialize another instance. When load decreases, one or more application instances are killed off.
If Passenger is configured in a certain way (I believe by choosing the "smart" spawn method), it will create a FrameworkSpawner, which loads the Rails code but no application code, and which can then be forked to load an application using that version of Rails.
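To make the preload-then-fork idea concrete, here is a rough conceptual sketch in Ruby. This is not Passenger's actual code; run_worker_loop is a hypothetical stand-in for a loop that accepts and handles requests.

    # Conceptual sketch of "smart" spawning (not Passenger's real implementation).
    # The expensive part (loading Rails plus the application) happens once in the
    # parent; each forked child shares that memory copy-on-write and then serves
    # requests one at a time.
    require_relative "config/environment"  # load the framework and application code

    3.times do
      fork do
        run_worker_loop  # hypothetical: accept and handle requests sequentially
      end
    end

    Process.waitall  # keep the spawner alive while the workers run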
So to answer your questions:
It will serve them sequentially in a single process, then spawn additional processes if it decides the load is high enough.
No. One process can only belong to a single Rails Application.
I'm kind of sketchy on this one, but your experiment makes sense. Passenger should be smart enough to figure out that even though it's running from different places in the server config, you're talking about the same application. It's probably based on the application's filesystem path.
EDIT: I went and read up on this a bit. Turns out I was mostly right, but the technical details were a bit off. See the Passenger documentation.

Yup, Burke is right. As for the third question: Phusion Passenger recognizes applications by their application root path. So even if you have two virtual hosts, as long as they both point to the same DocumentRoot, Phusion Passenger will treat them as the same app.

Related

How does a Rails server work in production?

I wonder, in general, is it more like PHP (the app loads into memory, executes, and dies for each connection)?
Or is it more like Node.js (a single instance stays in memory and accepts all requests)?
Technically it's the latter, but depending on the application server, it can be made to look like the former because the former is easier to manage. One example is Phusion Passenger. Take a look at https://www.phusionpassenger.com/ and http://www.modrails.com/documentation/Architectural%20overview.html
The second option.
In fact, it's Ruby that boots the application (you can have multiple instances depending on the setup, e.g. with Puma you can request multiple workers to handle requests). As soon as the application is ready (how long that takes depends on the size of your application, e.g. if the routes.rb file where you define your URLs is huge, it will of course take more time to boot), it starts handling requests.
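As a rough illustration of the Puma case mentioned above, a worker configuration might look like the sketch below; the file name and numbers are only examples.

    # config/puma.rb -- illustrative sketch, values are examples only
    workers 2                        # two worker processes, each boots the app
    threads 1, 5                     # each worker runs between 1 and 5 request threads
    preload_app!                     # boot the app once in the master, then fork workers
    port ENV.fetch("PORT") { 3000 }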

Thin vs Unicorn on Heroku

Just wanted to get people's opinions on using Unicorn vs Thin as a Rails server. Most of the articles/benchmarks I found online seem very incomplete, so it would be nice to have a centralized place to discuss it.
Unicorn is a multi-process server, while Thin is an event-based/non-blocking server. Event-based servers are great... if your code is asynchronous/non-blocking - vanilla Rails is blocking. So unless you use non-blocking Rails libraries, I really don't see the advantage of using Thin. Even worse, in a non-blocking server, if your I/O call is blocking you're going to block the entire event loop and not be able to handle any more requests until the blocking call returns. Blocking libraries are going to slow Thin down!
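To illustrate that point, here is a tiny sketch using EventMachine, the reactor library Thin is built on. The sleep stands in for any blocking I/O call, and the timings are only illustrative.

    require "eventmachine"

    EM.run do
      # A blocking call inside the reactor freezes the whole event loop:
      EM.add_timer(0) { sleep 5 }  # stands in for a slow, blocking library call
      EM.add_timer(1) { puts "only runs once the sleep has finished" }
      EM.add_timer(6) { EM.stop }
    end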
Why did Heroku choose Thin as their default server (for cedar)? They are smart guys, so I'm sure they had a reason.
Below is a link that suggests replacing Thin with 4 Unicorn workers - this makes perfect sense to me.
4 Unicorn workers on Heroku
Thin is easy to configure - not optimal, but it just works in the Heroku environment.
Unicorn can be more efficient, but it needs to be configured: How many workers? Preload App? What do you pick?
I have released Unicorn Heroku apps with workers set to 3, 5 and 8, based on how big each app is. How much code, how much memory is used and how much traffic you get all go into picking this number; you need to monitor over time to make sure you got the number right and your app isn't running out of memory.
Preload false - this will make your app start slower, but when Unicorn restarts a worker, this is 'safer' with network connections (memcache, Postgres, Mongo, etc.).
Preload true - this is better, but you need to handle re-connections correctly in the before/after fork hooks (see the sketch below).
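As a sketch of what the preload_app true setup can look like, with the fork hooks handling database reconnection (the worker count and details are only examples, not a drop-in config):

    # config/unicorn.rb -- illustrative sketch
    worker_processes Integer(ENV["WEB_CONCURRENCY"] || 3)
    timeout 30
    preload_app true

    before_fork do |server, worker|
      # The master's database connection must not be shared with forked workers.
      ActiveRecord::Base.connection.disconnect! if defined?(ActiveRecord::Base)
    end

    after_fork do |server, worker|
      # Each worker re-establishes its own connection after the fork.
      ActiveRecord::Base.establish_connection if defined?(ActiveRecord::Base)
    end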
Thin has none of these issues out of the box, but you only get a single process of execution.
Summary: It's really hard to configure Unicorn out of the box to work well (or at all) for everyone, whereas Thin can just work to get people running with fewer support requests.
Recently (only a few months ago) the folks behind Phusion Passenger added support for Heroku. This is definitely an alternative you should try to see if it fits your needs.
It is blazing fast even with 1 dyno, and the drop in response time is palpable.
A simple Passenger Ruby Heroku demo is hosted on GitHub.
The main benefits that Passenger on Heroku claims are:
Static asset acceleration through Nginx - Don't let your Ruby app serve static assets; let Nginx do it for you and free your app for the really important tasks. Nginx will do a much better job.
Multiple worker processes - Instead of running only one worker on a dyno, Phusion Passenger runs multiple workers on a single dyno, thus utilizing its resources to the fullest and giving you more bang for the buck. This approach is similar to Unicorn's. But unlike Unicorn, Phusion Passenger dynamically scales the number of worker processes based on current traffic, thus freeing up resources when they're not necessary.
Memory optimizations - Phusion Passenger uses less memory than Thin and Unicorn. It also supports copy-on-write virtual memory in combination with code preloading, thus making your app use even less memory when run on Ruby 2.0.
Request/response buffering - The included Nginx buffers requests and responses, thus protecting your app against slow clients (e.g. mobile devices on mobile networks) and improving performance.
Out-of-band garbage collection - Ruby's garbage collector is slow, but why bother your visitors with long response times? Fix this by running garbage collection outside of the normal request-response cycle! This concept, first introduced by Unicorn, has been improved upon: Phusion Passenger ensures that only one process at a time runs out-of-band garbage collection, thus eliminating the problems Unicorn's out-of-band garbage collection has. (See the sketch after this list for how the Unicorn variant is typically enabled.)
JRuby support - Unicorn's a better choice than Thin, but it doesn't support JRuby. Phusion Passenger does.
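For comparison with the out-of-band GC point above, this is roughly how the Unicorn variant is enabled in a Rack config file. This is only a sketch; Passenger's own mechanism is turned on through its configuration and is not shown here.

    # config.ru -- sketch of enabling Unicorn's out-of-band GC middleware
    require_relative "config/environment"
    require "unicorn/oob_gc"

    # Run the garbage collector between requests instead of during them,
    # here after every 5 requests.
    use Unicorn::OobGC, 5
    run Rails.application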
Hope this helps.
Heroku does not use intelligent routing - it will randomly assign requests to dynos regardless of whether the dyno is busy. Thus, if your dyno cannot handle multiple requests at once, you will get latency (perhaps massive latency) even if you are paying for lots of other dynos that are free. "That's right - if your app needs 80 dynos with an intelligent router, it needs 4,000 with a random router."
http://news.rapgenius.com/James-somers-herokus-ugly-secret-lyrics
Heroku says they are working on this, and their plan is to make it easier to use Unicorn. They basically said "Oops, we didn't notice that this was a problem for a few years... and now that we look, it's definitely a problem for Thin... so I guess you need to use a different program than the one we've been pushing all this time."
http://news.rapgenius.com/Jesper-joergensen-routing-performance-update-lyrics
From the official Heroku explanation (second link above):
"Rails, in fact, does not yet reliably support concurrent request handling. This leaves Rails developers unable to leverage the additional concurrency capabilities offered by the Cedar stack, unless they move to a concurrent web server like Puma or Unicorn.
Rails apps deployed to Cedar with Thin can rather quickly end up with request queuing problems. Because the Cedar router no longer does any queuing on behalf of the app, requests queued at the dyno must wait until the single Rails process works its way through the queue. Many customers have run into this issue and we failed to take action and provide them with a better approach to deploying Rails apps on Cedar."
Also of interest is that their performance tools, including New Relic, have not been reporting time spent in the dyno queue.
http://news.rapgenius.com/Lemon-money-trees-rap-genius-response-to-heroku-lyrics
Oops.

Why do we need an Apache server when we deploy a Rails app?

I thought we could just deploy it with WEBrick or Mongrel.
Most Ruby application servers will only run a single Ruby process (and Ruby has a global interpreter lock that makes multithreading quite pointless), which means that they can only serve one request at a time. To say the least, this will not give you very good performance.
There are two ways around this: either you run several Ruby application servers and put a load balancer or reverse proxy in front of them, e.g. Nginx or Apache in front of a pack of Mongrels or Thin servers (the number of processes you run reflects the number of requests you will be able to handle in parallel). Or you run Passenger, which is an Apache or Nginx module that manages a pool of applications that can dynamically grow and shrink as the load changes. The first option gives you more configuration options, but the second option is easier to manage. Which one you want depends on your use case.
There are of course other solutions too, but they are for more specific use cases. You can, for example, write a very performant application and deploy it with Thin -- but it requires that you write an event driven application. You can't deploy a Rails app and expect the same performance.
Before Phusion Passenger allowed Rails hosting with Apache and nginx, deploying a rails app was scary and difficult. Apache is a very mature web server which scales easily and is configurable to meet many needs. (nginx is not as mature but is very efficient, also very configurable and a great alternative to Apache for rails hosting.) Webrick and Mongrel are great for development, but unless you are an expert, it is difficult to set them up for production use.
You can technically, but you don't usually want to, because that will impose a fair bit of overhead when serving static files like css or images.
There are any number of ways you can deploy a Rails app without involving Apache, but Apache is the most popular and most mature server around, and among the most stable and scalable. WEBrick and Mongrel both have their own merits, but Apache is just the default assumption for web servers and the path of least resistance in most cases.

How to balance load in an Apache + Mongrel application

I was wondering if someone can explain how a Rails application can be load balanced.
Two questions:
Does it even help having separate Rails applications reading from the same database on the same dedicated server?
I understand Apache can balance load by installing some extra modules? Am I right? How can we accomplish this? (Please provide an explanation for dummies.)
I would have a look at using Passenger - it has largely superseded Mongrel and handles running multiple Rails instances.
Rails is single threaded, so when deploying with Mongrel it is "normal" to run several Mongrel instances in a cluster fronted by Apache with mod_proxy installed. This lets Apache dispatch multiple requests to free application instances.
Any reasonable database is designed for high levels of concurrent requests, so it should be able to handle a fair number of application instances.
Depending on your server resources there is great benefit in running multiple Mongrel instances - it is actually the only way to serve concurrent requests.
Even on a small-memory host (say 512 MB), if your Rails app uses 100 MB of memory you would easily be able to run several instances without running out of resources - for example, four instances use roughly 400 MB and can serve four concurrent requests.
Slicehost has some awesome articles like this one: http://articles.slicehost.com/2009/4/17/centos-apache-rails-and-mongrels

Recommendations (and Differences) between different Ruby on Rails Production Web Servers

Very soon I plan on deploying my first Ruby on Rails application to a production environment and I've even picked a webhost with all the managed server and Capistrano goodness you'd expect from a RoR provider.
The provider allows for Mongrel, Thin, Passenger & FastCGI web servers, which seems very flexible, but I honestly don't know the differences between them. I have looked into them some, but it all gets a bit much when they start talking about features and maximum simultaneous requests - and that this data seems to vary depending on who's publishing it.
I have looked at Passenger (on the surface) - which does seem very appealing to me - but I was under the impression that Passenger wasn't the actual webserver, and instead was more like a layer on top of Apache or nginx and managed spawned instances of the application (like a Mongrel cluster).
Can anyone please set me straight with the differences in layman's terms so as I can choose wisely (because anyone who's seen Indiana Jones and the Last Crusade knows what happens if you choose poorly).
Short answer
Go with Apache/Nginx + Passenger. Passenger is fast, reliable, easy to configure and deploy. Passenger has been adopted by a large number of big Rails applications, including Shopify.
The long answer
Forget about CGI and FastCGI. In the beginning there were no other alternatives, so the only way to run Rails was using CGI or the faster FastCGI. Nowadays almost nobody runs Rails under CGI, and the latest Rails versions no longer provide .cgi and .fcgi runners.
Mongrel has been a largely adopted solution, the best replacement for CGI and FCGI. Many sites still use Mongrel and Mongrel cluster, however the Mongrel project is almost dead and many projects have already moved to other solutions (mostly Passenger).
Also, a Mongrel-based architecture is quite hard to configure because it needs a frontend proxy (e.g. Nginx) and a backend composed of multiple Mongrel instances.
Passenger has been gaining widespread attention since it was released. Many projects switched from Mongrel to Passenger for many reasons, including (but not limited to) easy deployment, maintainability and performance. Additionally, Passenger is now available for both Apache and Nginx.
The simplest way to use Passenger is the Apache + Passenger configuration. One Apache installation and multiple Passenger processes.
If you need better performance and scalability, you can use Nginx as a frontend proxy and forward all Rails requests to multiple backend servers, each one composed of Apache + Passenger.
I'm not going into the technical details here; this solution is intended for Rails projects with a high level of traffic.
Even more complex solutions include a combination of different levels including http proxies and servers. You can have an idea of what I'm talking about reading some internal details from GitHub and Heroku.
Right now, Passenger is the best answer for most Rails projects.
Mongrel and Thin are single-process Ruby servers that you would run multiples of as a cluster behind some type of proxy (like Apache or Nginx). The proxy would manage which instance of Mongrel or Thin services each request.
Passenger provides an interface between Apache or Nginx and your application: it creates an application spawning process and then forks out processes to serve incoming requests as they come in. There are a lot of configuration options for how long those processes live, how many there can be, and how many requests they will serve before they die. This is by far the most common way to scale up and handle a high-traffic application, but it is not without drawbacks. It can only be done on a *nix operating system (Linux, Mac OS X, etc.). Also, these processes spin up on demand, so if no one accesses your site for a while, the processes die and the next request has the delay of them starting back up again. With Mongrel and Thin, the process is always running. Sometimes, though, your processes being new and fresh can be a good thing for memory usage, etc.
If it is going to be a relatively low traffic site, Mongrel or Thin provides a simple, easy to manage way to deploy the application. For higher traffic sites where you need the smart queuing and process management of something like Passenger, it is a very good solution.
As for FastCGI, you probably want to use that only as a last option.
I use Passenger + nginx. It works really, really well.
To get an instant performance boost with Passenger, I recommend using Ruby Enterprise Edition.
