why do we need an apache server when we deploy a rails app? - ruby-on-rails

i though we could just deploy it with webrick or mongrel

Most Ruby application servers will only run a single Ruby process (and Ruby has a global interpreter lock that makes multithreading quite pointless), which means that it can only serve one request at a time. To say the least, this will not give you very good performance.
There are two ways around this: either you run several Ruby application servers and put a load balancer or reverse proxy in front of them, e.g. Nginx or Apache in front of a pack of Mongrels or Thin servers (the number of processes you run reflects the number of requests you will be able to handle in parallel). Or you run Passenger, which is an Apache or Nginx module that manages a pool of applications that can dynamically grow and shrink as the load changes. The first option gives you more configuration options, but the second option is easier to manage. Which one you want depends on your use case.
There are of course other solutions too, but they are for more specific use cases. You can, for example, write a very performant application and deploy it with Thin -- but it requires that you write an event driven application. You can't deploy a Rails app and expect the same performance.

Before Phusion Passenger allowed Rails hosting with Apache and nginx, deploying a rails app was scary and difficult. Apache is a very mature web server which scales easily and is configurable to meet many needs. (nginx is not as mature but is very efficient, also very configurable and a great alternative to Apache for rails hosting.) Webrick and Mongrel are great for development, but unless you are an expert, it is difficult to set them up for production use.

You can technically, but you don't usually want to, because that will impose a fair bit of overhead when serving static files like css or images.

There are any number of ways you can deploy a Rails app without involving Apache, but Apache is the most popular server around, the most mature server around and among the most stable and scalable. WEBrick and Mongrel both have their own merits, but Apache is just the default assumption for Web servers and the path of least resistance in most cases.

Related

Is it necessary to use Nginx when deploy a Rails API ONLY app?

Currently I've already read a lot tutorials teaching about Rails App deployment, almost every one of them use Nginx or other alternatives like Apache to serve static pages. Also in this QA Why do we need nginx with thin on production setup? they said Nginx is used for load balancing.
I can understand the reasons mentioned above, but I write a Rails App as a pure API backend service, the only purpose is to serve JSON formatted data for other client-side apps, no pages rendering at all. So my questions are:
In my situation, do I really need Nginx just to deploy a pure API Rails App?
If not, how should I deploy my app? just running it (with unicorn in production env) at background is good enough? like rails server -e production -d?
I'm so curious about these two question, hope someone can explain the details or show me some good references for me, thanks in advance.
See basically, Unicorn or thin are all single threaded servers. Which in a way means they handle single requests at a time, using defering and other techniques.
For a production setup you would generally run many instances of unicorn or thin ( depending on your load etc ), you would need to load balance b/w those rails application server instances, thats why you need Nginx or something similar on top.
to serve JSON formatted data for other client-side apps, no pages rendering at all
You see, it makes no difference. These tasks are similar: format certain data into a specific text-based structure. Much like rendering a view in HAML, ERB or whatever.
The difference is, you won't be serving static assets. At least, it's not practical for pure JSON APIs.
If you aim for compact JSON responses, your best bet is Unicorn (several workers) + nginx.
Unicorn is simpler and aims for fast single-client response. That is, a slow client could force Unicorn to waste a lot of time serving him a response. When backed by nginx though, it fires the entire response into nginx's buffer and heads for the next one only waiting for nginx to accept the response (since it's usually on the same machine, it's blazing fast). nginx then hands out responses. There could possibly be multiple instances of Unicorn, if one is not enough: but using only one could eliminate any kinds of data races on app level (which are possible in multithreaded apps).
Thin is designed by itself to handle multiple clients concurrently by itself, through use of threads and workers. Keep in mind though that MRI ("classic" Ruby) doesn't have "truly concurrent threads" because of GIL. Technoligies it's based on (Ruby+C) make it inferior to nginx (pure C) in terms of resource usage efficiency. nginx is even used sometimes to counter DDoS attacks, efficiency is proven in the wild (<1> <2> <3> and many more).
You could benefit from Thin if your app implied concurrent service for multiple clients, like Server-Sent Events or WebSocket usage, that require maintaining a constant connection. This one doesn't seem to. Don't count on concurrency too much where it's not required.

Difference between Nginx and Mongrel?

I often read about Nginx and Mongrel being used together. Can someone explain to me how they are different? Why is Mongrel needed? Why is it not advisable to have Nginx directly communicate to the many Rails servers?
Both are web servers, but they do not share the same focus :
Mongrel is basically a ruby application server that presents an HTTP Interface. It does one thing, taking a request, passing it to your ruby code and serves the answer back in http. It does not handle concurrency, or any performance related feature. One mongrel means there is one ruby process that will handle requests.
Nginx is a fully featured web server, aimed at performances. It can deliver high performance on static files and can't handle Ruby, Python or any other language in a direct manner. It relies on FastGCI or proxying to other application servers to do that.
To be clear, your rails app by itself isn't directly usable, it needs what you can call a container (I suggest you read some about http://rack.github.com/), in this case Mongrel. When you run rails console, it's usually webrick, the most basic web "app" server we have in Ruby (it's part of the standard library).
Then why do we use Nginx in front ? Let's consider we use only Mongrel : we fire a mongrel instance, listening on the port 80. If your requests takes for example 500 ms to complete, you can handle 2 clients per second any nothing more. But wait that's clearly not enough. Let's fire another mongrel instance. But we can't have it listen on the port 80 since it's already used by the first instance and there's nothing we can do about it.
So we need something in front that can handle multiple Mongrel instances, by still listening the port 80. You throw in a Nginx server, that will (proxy) dispatch the requests to your many mongrel instances and you can now add more instances to serve more clients simultaneously.
Back to answering your question, having NGinx communicating to a rails server, means firing one or many Mongrel (or Thin / Unicorn, whatever server is available) and informing NGinx it has to pass the requests to them. It's a popular pattern to host rails services next to using Passenger, which basically provides a way for Apache workers to handle ruby code.
Difference between Nginx and Mongrel
Both are indeed HTTP server, but their focus is different. Mongrel is
a fast HTTP server aimed mostly at Ruby-based applications. it's easily extensible with Ruby code. However, it's not very
good at serving static files, i.e. it's slower than Apache and nginx.
Also, Rails is single threaded, meaning that during the course of a
request (calling a controller method until the actual rendering) the
mongrel is locked.
To work around the above mentioned disadvantages of Mongrel and
Rails, the preferred setup in a production app is to put either
Apache or nginx as the main webserver and if a request for a non-
static Rails page is received, to pass this to a number of underlying
mongrels, let the mongrel hand back the rendered page to Apache/nginx
and serve that page, together with static files such as images/
stylesheets/… It might seem a bit daunting and complex at first, but
once you actually implement it, it's extremely powerful and stable (I
have several apps that have been running for months to years on a
server without a single restart).
It boils down to this, let Apache/nginx do what it's best at, let the
mongrel cluster do what it's best at, everybody is happy.
Choosing nginx over Apache is mostly based on memory considerations.
Apache is quite a hefty webserver, especially if all you actually do
is serve some static files with it and balance the rest over a bunch
of mongrels. Nginx is very lightweight and performant and can do the
same job just as good as Apache. But if you're familiar with Apache,
don't want to get the grips with nginx configuration and have lots of
memory on your server, you can still go for Apache. On a basic VPS,
nginx is a more suitable approach.
for your more information
Apache vs Nginx
They're both web servers. They can serve static files but - with the right modules - can also serve dynamic web apps e.g. those written in PHP. Apache is more popular and has more features, Nginx is smaller and faster and has less features.
Neither Apache nor Nginx can serve Rails apps out-of-the-box. To do that you need to use Apache/Nginx in combination with some kind of add-on, described later.
Apache and Nginx can also act as reverse proxies, meaning that they can take an incoming HTTP request and forward it to another server which also speaks HTTP. When that server responds with an HTTP response, Apache/Nginx will forward the response back to the client. You will learn later why this is relevant.
Mongrel vs WEBrick
Mongrel is a Ruby "application server". In concrete terms this means that Mongrel is an application which:
Loads your Rails app inside its own process space.
Sets up a TCP socket, allowing it to communicate with the outside world (e.g. the Internet). Mongrel listens for HTTP requests on this socket and passes the request data to the Rails app. The Rails app then returns an object which describes how the HTTP response should look like, and Mongrel takes care of converting it to an actual HTTP response (the actual bytes) and sends it back over the socket.
WEBrick does the same thing. Differences with Mongrel:
It is written entirely in Ruby. Mongrel is part Ruby part C; mostly Ruby, but its HTTP parser is written in C for performance.
WEBrick is slower and less robust. It has some known memory leaks and some known HTTP parsing problems.
WEBrick is usually only used as the default server during development because WEBrick is included in Ruby by default. Mongrel needs to be installed separately. Nobody uses WEBrick in production environments.
Another Ruby application server that falls under the same category is Thin. While it's internally different from both Mongrel and WEBrick it falls under the same category when it comes to usage and its overall role in the server stack.

Recommendations (and Differences) between different Ruby on Rails Production Web Servers

Very soon I plan on deploying my first Ruby on Rails application to a production environment and I've even picked a webhost with all the managed server and Capistrano goodness you'd expect from a RoR provider.
The provider allows for Mongrel, Thin, Passenger & FastCGI web servers, which seems very flexible, but I honestly don't know the differences between them. I have looked into them some, but it all gets a bit much when they start talking about features and maximum simultaneous requests - and that this data seems to vary depending on who's publishing it.
I have looked at Passenger (on the surface) - which does seem very appealing to me - but I was under the impression that Passenger wasn't the actual webserver, and instead was more like a layer on top of Apache or nginx and managed spawned instances of the application (like a Mongrel cluster).
Can anyone please set me straight with the differences in layman's terms so as I can choose wisely (because anyone who's seen Indiana Jones and the Last Crusade knows what happens if you choose poorly).
Short answer
Go with Apache/Nginx + Passenger. Passenger is fast, reliable, easy to configure and deploy. Passenger has been adopted by a large number of big Rails applications, including Shopify.
(source: modrails.com)
The long answer
Forget about CGI and FastCGI. In the beginning there were no other alternatives so the only way to run Rails was using CGI or the faster browser FastCGI. Nowadays almost nobody runs Rails under CGI. The latest Rails versions no longer provides .cgi and .fcgi runners.
Mongrel has been a largely adopted solution, the best replacement for CGI and FCGI. Many sites still use Mongrel and Mongrel cluster, however Mongrel project is almost dead and many projects already moved to other solutions (mostly Passenger).
Also, a Mongrel based architecture is quite hard to configure because it needs a frontend proxy (thin, ngnix) and a backend architecture composed of multiple Mongrel instances.
Passenger has been gaining widespread attention since it was released. Many projects switched from Mongrel to Passenger for many reasons, including (but not limited to) easy deployment, maintainability and performance. Additionally, Passenger is now available for both Apache and Ngnix.
The simplest way to use Passenger is the Apache + Passenger configuration. One Apache installation and multiple Passenger processes.
If you need better performance and scalability, you can use Ngnix as a frontend proxy and forward all Rails requests to multiple backend servers, each one composed of Apache + Passenger.
I'm not going into the technical details here, this solution is intended to be used by Rails projects with an high level of traffic.
Even more complex solutions include a combination of different levels including http proxies and servers. You can have an idea of what I'm talking about reading some internal details from GitHub and Heroku.
Right now, Passenger is the best answer for most Rails projects.
Mongrel and Thin are single ruby process servers that you would run multiple of as a cluster behind some type of proxy (like Apache or Nginx). The proxy would manage which instance of Mongrel or Thin services the requests.
Passenger creates an interface between Apache or Nginx that creates an application spawning process and then forks out processes to server up incoming requests as they come in. There are a lot of configuration options for how long those processes live, how many there can be, and how many requests they will serve before they die. This is by far the most common way to scale up and handle a high traffic application, but it is not without drawbacks. This can only be done on a *nix operating system (linux, mac os x, etc). Also, these processes spin up on demand, so if no one accesses your site for a while, they processes die and the next request has the delay of it starting back up again. With Mongrel and Thin, the process is always running. Sometimes though, your processes being new and fresh can be a good thing for memory usage etc.
If it is going to be a relatively low traffic site, Mongrel or Thin provides a simple, easy to manage way to deploy the application. For higher traffic sites where you need the smart queuing and process management of something like Passenger, it is a very good solution.
As for fastcgi, you probably want to use that as a last option.
I use Passenger + nginx. It works really, really well.
To get some instant performance boast with passenger, I recommend using ruby enterprise edition.

How does Phusion Passenger reuse threads and processes?

I am setting up an Apache2 webserver running multiple Ruby on Rails web applications with Phusion Passenger. I know that Passenger spawns Ruby processes for handling requests. I have the following questions:
If more than one request has to be handled at the same time, will Passenger spawn multiple processes or multiple (Ruby) threads? How do I configure it so it always spawns single-threaded processes?
If I have two Rails applications, imagine that a request for app A goes to process 1, then later request for app B arrives. Is it possible that process 1 will handle this request as well? When and how is this possible? In other words, is one process allowed to handle requests for multiple Rails applications?
I have the same Rails application exported in multiple URLs and multiple virtual hosts (such as http:// and https://). Will the same process be able to serve different virtual hosts? (The answer to this seems to be yes, I've set a global variable in answering a request to virtual host A, and I was able to retrieve the value in virtual host B.)
Generally speaking, Passenger spawns new processes by forking an ApplicationSpawner, which has the framework and application code pre-loaded into memory, or a FrameworkSpawner, which just has the framework code.
Passenger, as far as I know, doesn't deal in threads. Instead, as the load increases on an application, it will fork that Application's ApplicationSpawner and initialize another instance. When load decreases, one or more application instances are killed off.
If Passenger is configured in a certain way (I believe by choosing the "smart" spawn method), it will create a FrameworkSpawner, which loads the rails code, but no application code, which can then be forked to load and application using that version of Rails.
So to answer your questions:
It will serve them sequentially, then spawn additional processes if it decides the load is high enough.
No. One process can only belong to a single Rails Application.
I'm kind of sketchy on this one, but your experiment makes sense. Passenger should be smart enough to figure out that even though it's running from different places in the server config, you're talking about the same application. It's probably based on the application's filesystem path.
EDIT: I went and read up on this a bit. Turns out I was mostly right, but the technical details were a bit off. See the Passenger documentation
Yup, Burke is right. In case of the third question, Phusion Passenger recognizes applications by their application root path. So even if you have two virtual hosts, if they both point to the same DocumentRoot then Phusion Passenger will think that they're the same app.

is mod_rails or Phusion Passenger finally the answer to Ruby on Rails Deployment?

I read from some books that Phusion Passenger is the answer to easy Ruby on Rails deployment. But my friend said that first there was Apache + bunch of Mongrels, and then lighttpd, and then nginx, and now Passenger, and it seems endless...
he also said he uses dreamhost which uses Passenger, and sometimes he sees his request not being processed.
So I wonder if Passenger is the final answer to RoR deployment? do you use it and used the "ab" command to test if the site is doing quite well?
short answer: yes.
long answer: yeeeeeeeeeeeeeeesssssssssssssssss.
In all seriousness, Phusion Passenger and Ruby Enterprise Edition have taken out pretty much all of the pain of moving a Rails app into production. Previous approaches, including running a suite of Mongrels, required lots of setup surrounding starting, stopping, and recycling listener processes that Passenger handles transparently, or via simple Apache (or nginx) configuration options. And REE's complementary garbage collector means that forking off a new listener uses MUCH less memory, and is faster to boot (in Passenger's "smart" spawning mode).
Edit: #srboisvert makes a very good point; Passenger isn't the final answer to RoR deployment, but for now it's my favorite by far. One day, after a lot of hard engineering problems are solved, mainstream Ruby will probably move from hosting RoR using a multi-process model to a single-process model, which would make management even easier than with Passenger.
It's the best solution so far. I started deploying with FCGI and it was a pain. Then came mongrel and it was better. Then came mod_rails and it was WAY better.
Also a lot of large cool application are migrating to mod_rails including some by 37signals, so you know that's good.
I'll just end with a quote from DHH:
The one-piece solution with Phusion
Passenger
Once you've completed the incredibly
simple installation, you get an Apache
that acts as both web server, load
balancer, application server and
process watcher. You simply drop in
your application and touch
tmp/restart.txt when you want to
bounce it and bam, you're up and
running.
But somehow the message of Passenger
has been a little slow to sink in.
There's already a ton of big sites
running off it. Including Shopify,
MTV, Geni, Yammer, and we'll be moving
over first Ta-da List shortly, then
hopefully the rest of the 37signals
suite quickly thereafter.
So while there are still reasons to
run your own custom multi-tier setup
of manually configured pieces, just
like there are people shying away from
mod_php for their particulars, I think
we've finally settled on a default
answer. Something that doesn't require
you to really think about the first
deployment of your Rails application.
Something that just works out of the
box. Even if that box is a shared
host!
In conclusion, Rails is no longer hard
to deploy. Phusion Passenger has made
it ridiculously easy.
(via)
Yes, it is the easiest, fastest and most efficient solution.
After a lot of problems with gems like soap4r etc. had been resolved in recent releases, Passenger is the answer to deployment questions now.
We're running Apache/mod_rails in a balanced environment with HAProxy in front of 2 servers. It's much more reliable than our previous setup using Mongrel/Aapache.
It's very easy to take control over
the amount of Passenger processes running in Apache
the amount of Passenger processes running per application
and all that without the pain of tweaking a number of config files like mod_proxy, Apache.
setting up a virtual host and adding 3 lines to your Apache config is basically enough to get it running
Matt
Final Answer? Nothing is ever the final answer.
I'd say Passenger is the current answer though.
Yes. I've been running Nginx/Passenger in front of Apache for whatever still needs PHP since they released 2.2.0 a few weeks back. Especially with Ruby Enterprise Edition, it approaches what I would call "perfect".
I guess that now people will stick to mod_rails for many years. The module is really good. Configuration is dead simple. It will be hard to replace it with some better solution. Similar to mod_php. The only key component which is missing: Windows port.
In some situations (enterprise, etc) the JVM can also be a good option.

Resources