What is the role of Apache/nginx in a Rails app - ruby-on-rails

This question is probably obvious, but I don't quite get it yet. To my knowledge a Rails app is deployed by using a web server like Apache or nginx or a cloud provider like Heroku.
What is the responsibility of a web server like Apache/nginx? Why is the Rails app not simply run on WEBrick alone, by running rails server?

Basically, it sounds like your question is:
Q: Why not just use WEBrick as a production server (instead of only
for RoR development)?
Here are several good discussions:
http://www.redmine.org/boards/2/topics/2656
Advantages of using Passenger + Apache over Webrick
And from the Arch Linux documentation:
While the default Ruby On Rails HTTP server (WeBrick) is convenient
for basic development it is not recommended for production use.
Generally, you should choose between installing the Phusion Passenger
module for your webserver (Apache or Nginx), or use a dedicated
application-server (such as Mongrel or Unicorn) combined with a
separate web-server acting as a reverse proxy.

I'd probably rephrase your question as "why is there a separate web and application tier in Ruby apps?"
In production deployments of Ruby applications there is typically a web tier (e.g. Apache or Nginx) and an application tier (e.g. Unicorn, Thin, Passenger). The web tier and application tiers serve different purposes:
Web tier - Manages HTTP connections, which are potentially persistent and long-lived. Usually responsible for some configuration of the production deployment (normalizing URLs through rewrites, blocking categories of bad requests, etc.). Sometimes responsible for HTTPS termination (especially in environments without a load balancer). Sometimes responsible for serving static assets, a task at which web servers excel. Most web servers can handle thousands of concurrent requests, with minimal resources needed per request. So if the web server can handle a request without hitting the app tier, it's strongly preferable for the web server to handle the request.
Application tier - Manages requests to the application itself, which usually require some amount of application logic and access to the data storage tier. Requests are generally expected to be short lived (max a few seconds and ideally a few 10s of msec, Rails Live Streaming excepted). Concurrency is far more limited in the application tier - most app servers can handle a much smaller number of concurrent requests (1 per process for Thin/Unicorn).
Note this architecture is relatively common to other languages - PHP, Java - as these differentiations largely hold true in systems running those languages as well.
It is possible to run with a unified web and application tier, but that generally requires a system that decouples requests from threads or processes - meaning that one doesn't need a thread or process for each concurrent request. It adds some complexity to the development side (see Node.js) but can have significant scalability benefits.
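As a rough illustration of the web tier / application tier split described above, here is a minimal Ruby sketch of a front tier that answers static-file requests itself and hands everything else to the application. StaticFirst, PUBLIC_FILES and APP are hypothetical names made up for this example; in a real deployment this decision is made in the Apache or nginx configuration, not in Ruby.

```ruby
# Hypothetical in-memory "public" directory standing in for files on disk.
PUBLIC_FILES = {
  "/stylesheets/site.css" => "body { margin: 0 }",
  "/robots.txt"           => "User-agent: *\n"
}.freeze

# Stand-in for the application tier: any object responding to call(env).
APP = lambda do |env|
  [200, { "content-type" => "text/html" }, ["rendered by the app for #{env['PATH_INFO']}"]]
end

# Front tier: serve static assets directly, forward everything else.
class StaticFirst
  def initialize(app, files)
    @app = app
    @files = files
  end

  def call(env)
    body = @files[env["PATH_INFO"]]
    return @app.call(env) if body.nil? # dynamic request: hit the app tier

    # Static hit: no application code runs (content type simplified here).
    [200, { "content-type" => "text/plain" }, [body]]
  end
end

front = StaticFirst.new(APP, PUBLIC_FILES)
puts front.call({ "PATH_INFO" => "/robots.txt" })[2].first
puts front.call({ "PATH_INFO" => "/posts/1" })[2].first
```

Only the second request reaches APP; the first is answered without touching the application at all, which is the case the web tier is good at.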

Related

Why is Rails' ActiveSupport::Cache::MemoryStore Not Appropriate for Large Applications?

In the Rails Documentation, it is stated that the memory store isn't good for large deployments, but it doesn't say why:
"This cache store is not appropriate for large application deployments"
In a production application deployment you will have many processes running your Rails app on many servers. Your MemoryStore cache will be unique to each process. This doesn't allow them to share cache, and compounds the work of warming and invalidation. From the doc:
If you’re running multiple Ruby on Rails server processes (which is the case if you’re using mongrel_cluster or Phusion Passenger), then your Rails server process instances won’t be able to share cache data with each other. This cache store is not appropriate for large application deployments, but can work well for small, low traffic sites with only a couple of server processes or for development and test environments.
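The per-process nature of an in-memory cache can be seen in a small sketch. This is not MemoryStore itself: a plain Hash stands in for it, and in_child_process is a hypothetical helper that uses fork, so it assumes a Unix-like system.

```ruby
# A plain Hash stands in for an in-process cache such as MemoryStore.
CACHE = {}

# Run a block in a forked child process (a separate "Rails server process")
# and return what the child wrote to the pipe. Requires a Unix-like OS.
def in_child_process
  reader, writer = IO.pipe
  pid = fork do
    reader.close
    writer.write(yield)
    writer.close
  end
  writer.close
  result = reader.read
  reader.close
  Process.wait(pid)
  result
end

# The child warms its own copy of the cache...
child_view = in_child_process do
  CACHE["greeting"] = "hello"
  CACHE.key?("greeting").to_s
end

# ...but the parent process never sees that entry.
puts "child sees key:  #{child_view}"
puts "parent sees key: #{CACHE.key?('greeting')}"
```

Each process would have to warm and invalidate its own copy, which is why a shared store is usually preferred once there are many processes or servers.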

Difference between Nginx and Mongrel?

I often read about Nginx and Mongrel being used together. Can someone explain to me how they are different? Why is Mongrel needed? Why is it not advisable to have Nginx directly communicate to the many Rails servers?
Both are web servers, but they do not share the same focus :
Mongrel is basically a Ruby application server that presents an HTTP interface. It does one thing: it takes a request, passes it to your Ruby code and serves the answer back over HTTP. It does not handle concurrency or offer any performance-related features. One Mongrel means there is one Ruby process handling requests.
Nginx is a full-featured web server aimed at performance. It delivers high performance on static files but can't run Ruby, Python or any other language directly. It relies on FastCGI or on proxying to other application servers to do that.
To be clear, your Rails app by itself isn't directly usable; it needs what you can call a container (I suggest you read a bit about http://rack.github.com/), in this case Mongrel. When you run rails server, it's usually WEBrick, the most basic web "app" server we have in Ruby (it's part of the standard library).
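To make the "container" idea concrete, here is a minimal sketch of the Rack calling convention, with no gems required. HelloApp is a hypothetical name, and the env hash shows only two of the many keys a real server would supply.

```ruby
# A Rack application is any object that responds to call(env) and returns
# a three-element array: status, headers, and a body that responds to each.
class HelloApp
  def call(env)
    body = "Hello from #{env['REQUEST_METHOD']} #{env['PATH_INFO']}\n"
    [200, { "content-type" => "text/plain", "content-length" => body.bytesize.to_s }, [body]]
  end
end

# An application server (the "container") builds a hash like this from the
# raw HTTP request and then calls the app with it.
env = { "REQUEST_METHOD" => "GET", "PATH_INFO" => "/posts" }

status, _headers, body = HelloApp.new.call(env)
puts status
body.each { |chunk| print chunk }
```

A Rails app exposes this same interface, which is why any Rack-compatible server (Mongrel, Thin, Unicorn, WEBrick) can host it.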
Then why do we use Nginx in front? Let's say we use only Mongrel: we fire up a Mongrel instance listening on port 80. If your requests take, for example, 500 ms to complete, you can handle 2 clients per second and nothing more. But wait, that's clearly not enough. Let's fire up another Mongrel instance. But we can't have it listen on port 80, since it's already used by the first instance and there's nothing we can do about it.
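The port conflict is easy to reproduce with Ruby's standard socket library. This sketch uses an OS-chosen port instead of port 80 (which normally needs root privileges); second_listener_refused? is a hypothetical helper name.

```ruby
require "socket"

# Returns true when a second listener on an already-used port is refused.
def second_listener_refused?
  first = TCPServer.new("127.0.0.1", 0) # port 0: let the OS pick a free port
  port = first.addr[1]
  begin
    TCPServer.new("127.0.0.1", port).close
    false
  rescue Errno::EADDRINUSE
    true
  ensure
    first.close
  end
end

puts "second listener refused: #{second_listener_refused?}"
```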
So we need something in front that can handle multiple Mongrel instances while still listening on port 80. You throw in an Nginx server that will dispatch (proxy) the requests to your many Mongrel instances, and you can now add more instances to serve more clients simultaneously.
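A rough sketch of that dispatching, plus the arithmetic from the 500 ms example. RoundRobinProxy, BACKEND_PORTS and max_requests_per_second are hypothetical names; the backends are in-process stand-ins rather than real Mongrel instances, and a real proxy such as Nginx offers more than plain round-robin.

```ruby
# Hypothetical front end that hands each request to the next backend in turn.
class RoundRobinProxy
  def initialize(backends)
    @backends = backends
    @next = 0
  end

  def call(env)
    backend = @backends[@next]
    @next = (@next + 1) % @backends.size
    backend.call(env)
  end
end

# Three stand-in "Mongrel instances"; each tags the response with its port.
BACKEND_PORTS = [8000, 8001, 8002].freeze
backends = BACKEND_PORTS.map do |port|
  ->(env) { [200, { "content-type" => "text/plain" }, ["#{env['PATH_INFO']} served by :#{port}"]] }
end

proxy = RoundRobinProxy.new(backends)
4.times { puts proxy.call({ "PATH_INFO" => "/" })[2].first }

# Rough throughput ceiling when each instance handles one request at a time.
def max_requests_per_second(instances, ms_per_request)
  instances * 1000.0 / ms_per_request
end

puts max_requests_per_second(1, 500) # one instance
puts max_requests_per_second(3, 500) # three instances behind the proxy
```

The fourth request wraps around to the first backend, and the ceiling scales linearly with the number of instances, as long as the database and the machine keep up.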
Back to answering your question: having Nginx communicate with a Rails server means firing up one or many Mongrels (or Thin / Unicorn, whatever server is available) and telling Nginx to pass the requests to them. It's a popular pattern for hosting Rails services, alongside using Passenger, which basically provides a way for Apache workers to handle Ruby code.
Difference between Nginx and Mongrel
Both are indeed HTTP servers, but their focus is different. Mongrel is
a fast HTTP server aimed mostly at Ruby-based applications. It's easily extensible with Ruby code. However, it's not very
good at serving static files, i.e. it's slower than Apache and nginx.
Also, Rails is single threaded, meaning that during the course of a
request (calling a controller method until the actual rendering) the
mongrel is locked.
To work around the above mentioned disadvantages of Mongrel and
Rails, the preferred setup in a production app is to put either
Apache or nginx as the main webserver and if a request for a non-
static Rails page is received, to pass this to a number of underlying
mongrels, let the mongrel hand back the rendered page to Apache/nginx
and serve that page, together with static files such as images/
stylesheets/… It might seem a bit daunting and complex at first, but
once you actually implement it, it's extremely powerful and stable (I
have several apps that have been running for months to years on a
server without a single restart).
It boils down to this, let Apache/nginx do what it's best at, let the
mongrel cluster do what it's best at, everybody is happy.
Choosing nginx over Apache is mostly based on memory considerations.
Apache is quite a hefty webserver, especially if all you actually do
is serve some static files with it and balance the rest over a bunch
of mongrels. Nginx is very lightweight and performant and can do the
same job just as well as Apache. But if you're familiar with Apache,
don't want to get to grips with nginx configuration and have lots of
memory on your server, you can still go for Apache. On a basic VPS,
nginx is a more suitable approach.
For more information:
Apache vs Nginx
They're both web servers. They can serve static files but - with the right modules - can also serve dynamic web apps, e.g. those written in PHP. Apache is more popular and has more features; Nginx is smaller and faster and has fewer features.
Neither Apache nor Nginx can serve Rails apps out-of-the-box. To do that you need to use Apache/Nginx in combination with some kind of add-on, described later.
Apache and Nginx can also act as reverse proxies, meaning that they can take an incoming HTTP request and forward it to another server which also speaks HTTP. When that server responds with an HTTP response, Apache/Nginx will forward the response back to the client. You will learn later why this is relevant.
Mongrel vs WEBrick
Mongrel is a Ruby "application server". In concrete terms this means that Mongrel is an application which:
Loads your Rails app inside its own process space.
Sets up a TCP socket, allowing it to communicate with the outside world (e.g. the Internet). Mongrel listens for HTTP requests on this socket and passes the request data to the Rails app. The Rails app then returns an object which describes what the HTTP response should look like, and Mongrel takes care of converting it to an actual HTTP response (the actual bytes) and sends it back over the socket.
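The two responsibilities above can be sketched with nothing but Ruby's socket library. This is a toy, not how Mongrel is implemented: ECHO_APP, REASONS, serve_one and request_once are hypothetical names, the "parser" only reads the request line, and exactly one request is handled.

```ruby
require "socket"

# Stand-in for the Rails app: takes request data and returns a description
# of the response (status, headers, body parts).
ECHO_APP = lambda do |env|
  [200, { "Content-Type" => "text/plain" }, ["You asked for #{env['PATH_INFO']}\n"]]
end

REASONS = { 200 => "OK", 404 => "Not Found" }.freeze

# Accept one connection, read the request, call the app, and write the
# response back over the socket as actual HTTP bytes.
def serve_one(server, app)
  client = server.accept
  verb, path = client.gets.to_s.split(" ") # request line, e.g. "GET /posts HTTP/1.1"
  loop do                                  # skip the request headers
    line = client.gets
    break if line.nil? || line == "\r\n"
  end

  status, headers, body = app.call({ "REQUEST_METHOD" => verb, "PATH_INFO" => path })
  payload = body.join
  headers = headers.merge({ "Content-Length" => payload.bytesize.to_s, "Connection" => "close" })

  client.write("HTTP/1.1 #{status} #{REASONS.fetch(status, 'Unknown')}\r\n")
  headers.each { |name, value| client.write("#{name}: #{value}\r\n") }
  client.write("\r\n#{payload}")
ensure
  client&.close
end

# Demo helper: start a server on an OS-chosen port, send one request to it,
# and return the raw response text.
def request_once(app, path)
  server = TCPServer.new("127.0.0.1", 0)
  worker = Thread.new { serve_one(server, app) }

  socket = TCPSocket.new("127.0.0.1", server.addr[1])
  socket.write("GET #{path} HTTP/1.1\r\nHost: localhost\r\n\r\n")
  response = socket.read
  socket.close
  worker.join
  server.close
  response
end

puts request_once(ECHO_APP, "/posts")
```

Note that serve_one handles a single connection from start to finish before it could accept another, which mirrors the one-request-at-a-time behaviour discussed in these answers.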
WEBrick does the same thing. Differences with Mongrel:
It is written entirely in Ruby. Mongrel is part Ruby part C; mostly Ruby, but its HTTP parser is written in C for performance.
WEBrick is slower and less robust. It has some known memory leaks and some known HTTP parsing problems.
WEBrick is usually only used as the default server during development because WEBrick is included in Ruby by default. Mongrel needs to be installed separately. Nobody uses WEBrick in production environments.
Another Ruby application server that falls under the same category is Thin. While it's internally different from both Mongrel and WEBrick it falls under the same category when it comes to usage and its overall role in the server stack.

why do we need an apache server when we deploy a rails app?

I thought we could just deploy it with WEBrick or Mongrel.
Most Ruby application servers will only run a single Ruby process (and Ruby has a global interpreter lock that makes multithreading quite pointless), which means that it can only serve one request at a time. To say the least, this will not give you very good performance.
There are two ways around this: either you run several Ruby application servers and put a load balancer or reverse proxy in front of them, e.g. Nginx or Apache in front of a pack of Mongrels or Thin servers (the number of processes you run reflects the number of requests you will be able to handle in parallel). Or you run Passenger, which is an Apache or Nginx module that manages a pool of applications that can dynamically grow and shrink as the load changes. The first option gives you more configuration options, but the second option is easier to manage. Which one you want depends on your use case.
There are of course other solutions too, but they are for more specific use cases. You can, for example, write a very performant application and deploy it with Thin -- but it requires that you write an event driven application. You can't deploy a Rails app and expect the same performance.
Before Phusion Passenger allowed Rails hosting with Apache and nginx, deploying a rails app was scary and difficult. Apache is a very mature web server which scales easily and is configurable to meet many needs. (nginx is not as mature but is very efficient, also very configurable and a great alternative to Apache for rails hosting.) Webrick and Mongrel are great for development, but unless you are an expert, it is difficult to set them up for production use.
You can technically, but you don't usually want to, because that will impose a fair bit of overhead when serving static files like css or images.
There are any number of ways you can deploy a Rails app without involving Apache, but Apache is the most popular server around, the most mature server around and among the most stable and scalable. WEBrick and Mongrel both have their own merits, but Apache is just the default assumption for Web servers and the path of least resistance in most cases.

How to balance load in a Apache + Mongrel application

I was wondering if someone can explain how can a rails application be balanced.
Two questions:
Does it even help having separate rails applications reading from the same database in the same dedicated server?
I understand Apache can balance load by installing some extra modules, am I right? How can we accomplish this? (Please provide an explanation for dummies.)
I would have a look at using Passenger - it has largely superseded Mongrel and handles running multiple Rails instances.
Rails is single threaded, so when deploying with Mongrel it is "normal" to run several Mongrel instances in a cluster fronted by Apache with mod_proxy installed. This lets Apache dispatch multiple requests to free application instances.
Any reasonable database is designed for high levels of concurrent requests, so it should be able to handle a fair number of application instances.
Depending on your server resources there is great benefit in running multiple Mongrel instances - it is actually the only way to serve concurrent requests.
Even on a small-memory host (say 512mb), if your Rails app uses 100mb of memory you would be easily able to run several instances without running out of resources - you could then serve as many concurrent requests as you have instances.
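The sizing in that example is simple arithmetic. In this sketch instances_that_fit is a hypothetical helper, and the 128 MB reserve for the OS, Apache and the database is an assumed figure, not something from the answer above.

```ruby
# How many app instances fit in memory, leaving some headroom for
# everything else on the host. All sizes are in megabytes.
def instances_that_fit(total_mb, per_instance_mb, reserved_mb: 128)
  [(total_mb - reserved_mb) / per_instance_mb, 0].max
end

puts instances_that_fit(512, 100)                 # with the assumed reserve
puts instances_that_fit(512, 100, reserved_mb: 0) # upper bound, no reserve
```

With these numbers the host fits between 3 and 5 instances, and therefore serves between 3 and 5 requests concurrently.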
Slicehost has some awesome articles like this one: http://articles.slicehost.com/2009/4/17/centos-apache-rails-and-mongrels

Recommendations (and Differences) between different Ruby on Rails Production Web Servers

Very soon I plan on deploying my first Ruby on Rails application to a production environment and I've even picked a webhost with all the managed server and Capistrano goodness you'd expect from a RoR provider.
The provider allows for Mongrel, Thin, Passenger & FastCGI web servers, which seems very flexible, but I honestly don't know the differences between them. I have looked into them some, but it all gets a bit much when they start talking about features and maximum simultaneous requests - and that this data seems to vary depending on who's publishing it.
I have looked at Passenger (on the surface) - which does seem very appealing to me - but I was under the impression that Passenger wasn't the actual webserver, and instead was more like a layer on top of Apache or nginx and managed spawned instances of the application (like a Mongrel cluster).
Can anyone please set me straight with the differences in layman's terms so as I can choose wisely (because anyone who's seen Indiana Jones and the Last Crusade knows what happens if you choose poorly).
Short answer
Go with Apache/Nginx + Passenger. Passenger is fast, reliable, easy to configure and deploy. Passenger has been adopted by a large number of big Rails applications, including Shopify.
(source: modrails.com)
The long answer
Forget about CGI and FastCGI. In the beginning there were no alternatives, so the only way to run Rails was using CGI or its faster brother, FastCGI. Nowadays almost nobody runs Rails under CGI. The latest Rails versions no longer provide .cgi and .fcgi runners.
Mongrel has been a widely adopted solution, the best replacement for CGI and FCGI. Many sites still use Mongrel and Mongrel cluster; however, the Mongrel project is almost dead and many projects have already moved to other solutions (mostly Passenger).
Also, a Mongrel-based architecture is quite hard to configure because it needs a frontend proxy (e.g. Nginx) and a backend architecture composed of multiple Mongrel instances.
Passenger has been gaining widespread attention since it was released. Many projects switched from Mongrel to Passenger for many reasons, including (but not limited to) easy deployment, maintainability and performance. Additionally, Passenger is now available for both Apache and Nginx.
The simplest way to use Passenger is the Apache + Passenger configuration. One Apache installation and multiple Passenger processes.
If you need better performance and scalability, you can use Nginx as a frontend proxy and forward all Rails requests to multiple backend servers, each one composed of Apache + Passenger.
I'm not going into the technical details here; this solution is intended for Rails projects with a high level of traffic.
Even more complex solutions combine several layers of HTTP proxies and servers. You can get an idea of what I'm talking about by reading some internal details from GitHub and Heroku.
Right now, Passenger is the best answer for most Rails projects.
Mongrel and Thin are single ruby process servers that you would run multiple of as a cluster behind some type of proxy (like Apache or Nginx). The proxy would manage which instance of Mongrel or Thin services the requests.
Passenger provides an interface between Apache or Nginx and your application: it creates an application-spawning process and then forks worker processes to serve incoming requests. There are a lot of configuration options for how long those processes live, how many there can be, and how many requests they will serve before they die. This is by far the most common way to scale up and handle a high-traffic application, but it is not without drawbacks. It can only be done on a *nix operating system (Linux, Mac OS X, etc.). Also, these processes spin up on demand, so if no one accesses your site for a while, the processes die and the next request incurs the delay of starting one back up again. With Mongrel and Thin, the process is always running. Sometimes, though, having new and fresh processes can be a good thing for memory usage etc.
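The on-demand behaviour and the resulting cold-start delay can be modelled in a few lines. This is a simulation, not Passenger's implementation: OnDemandApp is a hypothetical class, time is passed in as plain seconds, and the 300-second idle timeout is an assumed value chosen for the example.

```ruby
# Simulated application process that is started on demand and shut down
# after idle_timeout seconds without a request.
class OnDemandApp
  def initialize(idle_timeout:)
    @idle_timeout = idle_timeout
    @last_used = nil # nil means no process has been started yet
  end

  # Is a process still alive at time `now` (in seconds)?
  def running?(now)
    !@last_used.nil? && now - @last_used < @idle_timeout
  end

  # Handle a request at time `now`; returns :warm if a live process took
  # it, :cold if a process had to be started first.
  def handle(now)
    kind = running?(now) ? :warm : :cold
    @last_used = now
    kind
  end
end

app = OnDemandApp.new(idle_timeout: 300)
[0, 10, 200, 900].each { |t| puts "t=#{t}s: #{app.handle(t)} start" }
```

The first request and the one after the long quiet period pay the startup cost; an always-running Mongrel or Thin process would never report a cold start, at the price of holding its memory the whole time.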
If it is going to be a relatively low traffic site, Mongrel or Thin provides a simple, easy to manage way to deploy the application. For higher traffic sites where you need the smart queuing and process management of something like Passenger, it is a very good solution.
As for fastcgi, you probably want to use that as a last option.
I use Passenger + nginx. It works really, really well.
To get an instant performance boost with Passenger, I recommend using Ruby Enterprise Edition.
