I have read that passenger is a multi-process server which means that it can handle multiple requests at a time.
I am running passenger in a standalone mode on my local machine and have written code to check if passenger is able to run multiple requests simultaneously or not. My code is:
class Test < ApplicationController
  def index
    sleep 10
  end
end
I am sending two HTTP requests simultaneously and expecting both to return output after 10 seconds, but one request returns after 10 seconds and the other returns after 20 seconds. This suggests that Passenger is handling one request at a time and not simultaneously.
Does it mean that Passenger is a single-process server and not a multi-process server, or am I missing something?
Passenger (along with most other application servers) runs no more than one request per thread. Typically there is also only one thread per process. From the Phusion Passenger docs:
Phusion Passenger supports two concurrency models:
process: single-threaded, multi-processed I/O concurrency. Each application process only has a single thread and can only handle 1 request at a time. This is the concurrency model that Ruby applications traditionally used. It has excellent compatibility (can work with applications that are not designed to be thread-safe) but is unsuitable for workloads in which the application has to wait for a lot of external I/O (e.g. HTTP API calls), and uses more memory because each process has a large memory overhead.
thread: multi-threaded, multi-processed I/O concurrency. Each application process has multiple threads (customizable via PassengerThreadCount). This model provides much better I/O concurrency and uses less memory because threads share memory with each other within the same process. However, using this model may cause compatibility problems if the application is not designed to be thread-safe.
(Emphasis my own)
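The timings reported in the question (10 and 20 seconds) are what a single worker would produce. Here is a minimal sketch of that arithmetic, assuming all requests arrive at time 0 and each worker handles exactly one request at a time. The completion_times helper is made up for illustration and is not part of Passenger.

```ruby
# Model: each worker handles one request at a time; a queued request starts
# as soon as any worker becomes free. All requests arrive at time 0.
def completion_times(durations, workers)
  free_at = Array.new(workers, 0)      # when each worker next becomes idle
  durations.map do |duration|
    idx = free_at.index(free_at.min)   # pick the earliest available worker
    free_at[idx] += duration           # it finishes `duration` seconds later
    free_at[idx]
  end
end

p completion_times([10, 10], 1)  # => [10, 20]
p completion_times([10, 10], 2)  # => [10, 10]
```

With one process (or one thread) the second request has to queue behind the first. With two, both finish after 10 seconds.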
Try something like this:
def index
  n = params[:n].to_i
  sleep n
  render :text => "I should have taken #{n} seconds!"
end
I have a Rails application which has 100% CPU utilization most of the time.
I am not able to figure out why there is so much load on the server. I am using the Puma web server with a default configuration, and am running multiple background jobs using the sucker-punch gem. There are 7 files which are using sucker punch jobs with 5 workers:
include SuckerPunch::Job
workers 5
I ran the top -i query and found the following processes running on the server. I can see multiple Ruby commands on the server. Can someone tell me whether this is normal behavior on a server, or if something is wrong?
Some Ways to Reduce Resource Contention
Your user space load is high (~48%), so you'll probably want to reduce the number of workers in your web application, increase the number of CPUs available on your instance, move to a version of Ruby that has better concurrency and real multi-core support (e.g. Rubinius or JRuby), or some combination of these options. Depending on what your code is actually doing, you may also need to re-architect your application to offload expensive I/O from the application server.
In addition, your steal time is quite high (~41%), so your EC2 instance is probably overloaded. Simply moving your application to a less-loaded instance may free up sufficient resources to reduce application wait times.
I have a simple app and I want to use it as webservice.
My problem is that I can't receive more than 1 request at the same time.
Apparently, the requests are enqueued and executed one by one. So, if I make 2 requests on the same URL, the second has to wait for the first one.
I've already tried to use Unicorn, Puma and Thin to enable concurrency on the requests, but it seems to keep queuing the requests by URL.
Example:
I make the request 1 at localhost:3000/example
I make another request at localhost:3000/another_example
I make the last request at localhost:3000/example
The first and second requests are executed concurrently, but the last one (that has the same URL that the first) has to wait for the first to finish.
Unicorn, Puma and Thin enable concurrency, but on different URLs.
NOTES:
I added on my config/application.rb:
config.allow_concurrency = true
I'm running the app with:
rails s Puma
How can I perform my requests concurrently?
You're right, each Puma/Thin/Unicorn/Passenger/Webrick worker houses a single Rails app instance (or Sinatra app instance, etc.) per Ruby process. So it's 1 web worker = 1 app instance = 1 Ruby process.
Each request blocks the process until the response is ready. So it's usually 1 request per process.
Ruby itself has the so-called "GIL" (Global Interpreter Lock), which prevents multiple threads from executing at the same time because C extensions lack thread-safety controls such as mutexes and semaphores. It means that threads won't run in parallel. In practice, they "can": I/O operations, such as reading a file or waiting for a response from a network socket, block execution while waiting. In this case, Ruby allows another thread to run until the I/O operation of the previous thread finishes.
Rails also used to have its own lock, allowing a single block of execution per request. But in Rails 3, thread-safety controls were added throughout the Rails code to ensure it could run on JRuby, for example. And in Rails 4 they decided to have the thread-safe controls on by default.
In theory this means that more than one request can run in parallel even in Ruby MRI (as it supports native threads since 1.9). In practice one request can run while another is waiting for a database process to return for example. So you should see a few more requests running in parallel. If your example is CPU bound (more internal processing than I/O blocks) the effect should be as if the requests are running one after the other. Now, if you have more I/O blocks (as waiting for a large SQL select to return), you should see it running more in parallel (not completely though).
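The difference between I/O-bound and CPU-bound work can be measured directly. This is a rough sketch, assuming sleep is an acceptable stand-in for waiting on a database or socket; the io_bound and cpu_bound methods are invented for the illustration. The CPU-bound numbers depend on the interpreter (MRI versus JRuby) and the machine, so no expected output is given for them.

```ruby
require "benchmark"

# Stand-in for a request that waits on I/O (database, HTTP call, ...).
def io_bound
  sleep 0.1
end

# Stand-in for a request that only burns CPU.
def cpu_bound
  200_000.times.reduce(0) { |sum, i| sum + i * i }
end

# Wall-clock time for running `work` `count` times, one thread per run.
def threaded_time(count, work)
  Benchmark.realtime { Array.new(count) { Thread.new { send(work) } }.each(&:join) }
end

# Wall-clock time for running `work` `count` times, one after another.
def serial_time(count, work)
  Benchmark.realtime { count.times { send(work) } }
end

[:io_bound, :cpu_bound].each do |work|
  puts format("%-9s serial %.2fs, threaded %.2fs",
              work, serial_time(4, work), threaded_time(4, work))
end
```

On MRI the threaded I/O-bound run should take roughly as long as a single sleep, while the threaded CPU-bound run should be close to the serial one.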
You will see parallel requests more often if you use a virtual machine that has not only native threads but also no Global Interpreter Lock, as is the case with JRuby. So I recommend using JRuby with Puma.
Puma and Passenger are both multi-threaded. Unicorn is fork-based. Thin is Eventmachine based. I'd personally recommend testing Passenger as well.
http://tenderlovemaking.com/2012/06/18/removing-config-threadsafe.html
https://bearmetal.eu/theden/how-do-i-know-whether-my-rails-app-is-thread-safe-or-not/
Update:
Read "Indicate to an ajax process that the delayed job has completed" before if you have the same problem. Thanks Gene.
I have a problem with concurrency. I have a controller scraping a few web sites, but each call to my controller needs about 4-5 seconds to respond.
So if I call it 2 (or more) times in a row, the second call needs to wait for the first call before starting.
So how can I fix this problem in my controller? Maybe with something like EventMachine?
Update & Example:
application_controller.rb
def func1
  i = 0
  while i <= 2
    puts "func1 at: #{Time.now}"
    sleep(2)
    i = i + 1
  end
end
def func2
  j = 0
  while j <= 2
    puts "func2 at: #{Time.now}"
    sleep(1)
    j = j + 1
  end
end
whatever_controller.rb
puts ">>>>>>>> Started At #{Time.now}"
func1()
func2()
puts "End at #{Time.now}"
So now I need to request http://myawesome.app/whatever several times at the same time from the same user/browser/etc.
I tried Heroku (and local) with Unicorn but without success, this is my setup:
unicorn.rb http://pastebin.com/QL0wdGx0
Procfile http://pastebin.com/RrTtNWJZ
Heroku setup https://www.dropbox.com/s/wxwr5v4p61524tv/Screenshot%202014-02-20%2010.33.16.png
Requirements:
I need a RESTful solution. This is an API, so I need to respond with JSON.
More info:
I have right now 2 cloud servers running.
Heroku with Unicorn
Engineyard Cloud with Nginx + Passenger
You're probably using WEBrick in development mode. WEBrick only handles one request at a time.
You have several options: many Ruby web servers exist that can handle concurrency.
Here are a few of them.
Thin
Thin was originally based on mongrel and uses eventmachine for handling multiple concurrent connections.
Unicorn
Unicorn uses a master process that dispatches requests to web workers; 4 workers means up to 4 concurrent requests.
Puma
Puma is a relatively new Ruby server. Its shiny feature is that it handles concurrent requests in threads, so make sure your code is thread-safe!
Passenger
Passenger is a Ruby server bundled inside Nginx or Apache. It's great for production and development.
Others
These are a few alternatives. Many others exist, but I think these are the most used today.
To use any of these servers, please check their instructions. They are generally available in their GitHub README.
For any long response time controller function, the delayed job gem is a fine way to go. While it is often used for bulk mailing, it works as well for any long-running task.
Your controller starts the delayed job and responds immediately with a page that has a placeholder - usually a graphic with a progress indicator - and Ajax or a timed reload that updates the page with the full information when it's available. Some information on how to approach this is in this SO article.
Not mentioned in the article is that you can use redis or some other memory cache to store the results rather than the main database.
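The "respond immediately, then poll" flow described above can be sketched in plain Ruby. This is only an illustration: JobStore is a made-up class, a thread stands in for the delayed job worker, and a Hash stands in for redis or the database. It is not the delayed_job API.

```ruby
require "securerandom"

# In-memory stand-in for a job queue plus result store.
class JobStore
  def initialize
    @results = {}
    @threads = {}
    @lock = Mutex.new
  end

  # Starts the work in the background and returns a job id immediately,
  # which is what the controller would hand back to the browser.
  def enqueue(&work)
    id = SecureRandom.hex(4)
    @lock.synchronize { @results[id] = { status: "pending" } }
    @threads[id] = Thread.new do
      value = work.call
      @lock.synchronize { @results[id] = { status: "done", value: value } }
    end
    id
  end

  # What a polling (Ajax) endpoint would render as JSON.
  def status(id)
    @lock.synchronize { @results.fetch(id).dup }
  end

  # Helper for the demo: block until the job has finished.
  def wait(id)
    @threads.fetch(id).join
  end
end

store = JobStore.new
job_id = store.enqueue { sleep 0.1; "scraped page" }
p store.status(job_id)   # still pending while the job sleeps
store.wait(job_id)
p store.status(job_id)   # now done, with the value attached
```

A real setup would also need error handling and expiry of old results, which this sketch leaves out.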
Answers above are part of the solution: you need a server environment that can properly dispatch concurrent requests to separate workers; unicorn or passenger can both work by creating workers in separate processes or threads. This allows many workers to sit around waiting while not blocking other incoming requests.
If you are building a typical bot whose main job is to get content from other sources, these solutions may be ok. But if what you need is a simple controller that can accept hundreds of concurrent requests, all of which are sending independent requests to other servers, you will need to manage threads or processes yourself. Your goal is to have many workers waiting to do a simple job, and one or more masters whose jobs it is to send requests, then be there to receive the responses. Ruby's Thread class is simple, and works well for cases like this with ruby 2.x or 1.9.3.
You would need to provide more detail about what you need to do for help getting to any more specific solution.
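As a rough illustration of managing threads yourself with Ruby's Thread class, the sketch below fans out one thread per outbound request and collects the results in order. The fetch method is a hypothetical stand-in that sleeps instead of calling a real server, and the URLs are placeholders.

```ruby
# Hypothetical stand-in for an outbound HTTP call; a real version might use
# Net::HTTP.get(URI(url)) instead of sleeping.
def fetch(url)
  sleep 0.1                 # simulate network latency
  "body of #{url}"
end

# Start one thread per URL, then collect the results in the original order.
# Thread#value joins the thread and returns the block's result.
def fetch_all(urls)
  urls.map { |url| Thread.new { fetch(url) } }.map(&:value)
end

urls = %w[http://example.com/a http://example.com/b http://example.com/c]
started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
bodies = fetch_all(urls)
elapsed = Process.clock_gettime(Process::CLOCK_MONOTONIC) - started

puts bodies
puts format("fetched %d pages in %.2fs", bodies.size, elapsed)
```

Because the threads spend their time waiting, the total should be close to one request's latency rather than the sum of all three. For hundreds of requests a bounded pool is usually safer than one thread each.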
Try something like unicorn as it handles concurrency via workers. Something else to consider if there's a lot of work to be done per request, is to spin up a delayed_job per request.
The one issue with delayed job is that the response won't be synchronous, meaning it won't return to the user's browser.
However, you could have the delayed job save its responses to a table in the DB. Then you can query that table for all requests and their related responses.
What ruby version are you utilizing?
Ruby & Webserver
Ruby
If it's a simple application I would recommend the following. Try Rubinius (rbx) or JRuby, as they are better at concurrency. They do have a drawback: they're not mainline Ruby, so some extensions won't work. But if it's a simple app you should be fine.
Webserver
use Puma or Unicorn if you have the patience to set it up
If your app is hitting the API service
You indicate that the Global Lock is killing you when you are scraping other sites (presumably ones that allow scraping). If this is the case, something like sidekiq or delayed job should be utilized, but with caution. These should be idempotent jobs, i.e. safe to run multiple times, because they might be. If you start hitting a website multiple times, you will hit its rate limit pretty quickly, e.g. Twitter limits you to 150 requests per hour. So use background jobs with caution.
If you're the one serving the data
However reading your question it sounds like your controller is the API and the lock is caused by users hitting it.
If this is the case you should utilize dalli + memcached to serve your data. This way you won't be I/O bound by the SQL lookup as memcached is memory based. MEMORY SPEED > I/O SPEED
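The caching idea can be sketched without memcached. Below, a Hash stands in for the memcached store and ReadThroughCache is a made-up class, so this shows the read-through pattern only and is not the dalli API.

```ruby
# Minimal read-through cache. A Hash stands in for memcached here.
class ReadThroughCache
  def initialize
    @store = {}
    @lock = Mutex.new
  end

  # Returns the cached value for `key`, computing and storing it on a miss.
  def fetch(key)
    @lock.synchronize do
      return @store[key] if @store.key?(key)
      @store[key] = yield
    end
  end
end

lookups = 0
cache = ReadThroughCache.new
# Hypothetical stand-in for a slow SQL query.
expensive = -> { lookups += 1; sleep 0.05; 42 }

2.times { cache.fetch("users/count") { expensive.call } }
puts "lookups performed: #{lookups}"   # prints "lookups performed: 1"
```

Holding the lock while the value is computed keeps the sketch simple but serializes all misses. A real cache would also need expiry.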
I've read tons of material around the web about thread safety and performance in different versions of ruby and rails and I think I understand those things quite well at this point.
What seems to be oddly missing from the discussions is how to actually deploy an asynchronous Rails app. When talking about threads and synchronicity in an app, there are two things people want to optimize:
utilizing all CPU cores with minimal RAM usage
being able to serve new requests while previous requests are waiting on IO
Point 1 is where people get (rightly) excited about JRuby. For this question I am only trying to optimize point 2.
Say this is the only controller in my app:
class TheController < ActionController::Base
def fast
render :text => "hello"
end
def slow
render :text => User.count.to_s
end
end
fast has no IO and can serve hundreds or thousands of requests per second, and slow has to send a request over the network, wait for work to be done, then receive the answer over the network, and is therefore much slower than fast.
So an ideal deployment would allow hundreds of requests to fast to be fulfilled while a request to slow is waiting on IO.
What seems to be missing from the discussions around the web is which layer of the stack is responsible for enabling this concurrency. thin has a --threaded flag, which will "Call the Rack application in threads [experimental]" -- does that start a new thread for each incoming request? Spool up rack app instances in threads that persist and wait for incoming requests?
Is thin the only way or are there others? Does the ruby runtime matter for optimizing point 2?
The right approach for you depends heavily on what your slow method is doing.
In a perfect world, you could use something like the sinatra-synchrony gem to handle each request in a fiber. You'd only be limited by the maximum number of fibers. Unfortunately, the stack size on fibers is hardcoded, and it is easy to overrun in a Rails app. Additionally, I've read a few horror stories of the difficulties of debugging fibers, due to the automatic yielding after async IO has been initiated. Race conditions are still possible when using fibers, as well. Currently, fibered Ruby is a bit of a ghetto, at least on the front-end of a web app.
A more pragmatic solution that doesn't require code changes is to use a Rack server that has a pool of worker threads, such as Rainbows! or Puma. I believe Thin's --threaded flag handles each request in a new thread, but spinning up a native OS thread is not cheap. Better to use a thread pool with the pool size set sufficiently high. In Rails, don't forget to set config.threadsafe! in production.
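To make the thread-pool idea concrete, here is a toy pool in plain Ruby. TinyPool is invented for this illustration and is not how Puma or Rainbows! are actually implemented. It only shows that fast jobs can finish while a slow one is still waiting on (simulated) IO.

```ruby
# A fixed pool of worker threads pulling jobs from a shared queue.
class TinyPool
  def initialize(size)
    @jobs = Queue.new
    @workers = Array.new(size) do
      Thread.new do
        # Each worker loops until it receives the :stop marker.
        while (job = @jobs.pop) != :stop
          job.call
        end
      end
    end
  end

  # Queue a job; any idle worker will pick it up.
  def post(&job)
    @jobs << job
  end

  # Let queued jobs finish, then stop and join every worker.
  def shutdown
    @workers.size.times { @jobs << :stop }
    @workers.each(&:join)
  end
end

finished = Queue.new
pool = TinyPool.new(2)
pool.post { sleep 0.3; finished << :slow }               # simulated IO wait
3.times { |i| pool.post { finished << :"fast#{i}" } }    # no IO at all
pool.shutdown

order = Array.new(finished.size) { finished.pop }
p order   # the :slow job is expected to finish last
```

The threads are created once and reused, which avoids paying the thread start-up cost on every request.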
If you're OK with changing code, you can check out Konstantin Haase's excellent talk on real-time Rack. He discusses using the EventMachine::Deferrable class to produce a response outside of the traditional request/response cycle that Rack is built on. This seems really neat, but you have to rewrite the code in an async style.
Also take a look at Cramp and Goliath. These let you implement your slow method in a separate Rack app that is hosted alongside your Rails app, but you will probably have to rewrite your code to work in the Cramp/Goliath handlers as well.
As for your question about the Ruby runtime, it also depends on the work that slow is doing. If you're doing CPU-heavy computation, then you run the risk of the GIL giving you issues. If you're doing IO, then the GIL shouldn't get in your way. (I say shouldn't because I believe I've read about issues with the older mysql gem blocking the GIL.)
Personally, I've had success using sinatra-synchrony for a backend, mashup web service. I can issue several requests to external web services in parallel, and wait for all of them to return. Meanwhile, the frontend Rails server uses a thread pool, and makes requests directly to the backend. Not perfect, but it works well enough right now.
Sorry if this might seem obvious. I've monitored that a web request on my Rails app uses 30-33% of CPU every time. For example, if I load a web page, then 30% of CPU is used. Does that mean that my box can only handle 3 concurrent web requests, and will stall if there are more than 3 web requests (i.e. I'll get a 100% CPU)?
If so, does that also mean that if I want to handle more than 3 concurrent web requests, then I'll have to get more servers to handle the load using a load balancer? (e.g. to handle 6 concurrent web requests, I'll need 2 servers; for 9 concurrent requests, I'll need 3 servers; for 12, I'll need 4 servers -- and so on?)
I think you should start with load tests. I wouldn't trust manual testing that much.
Load tests tell you how long the response takes for each client, and how many clients simply time out.
Also you will be able to measure the improvements objectively for any changes that you make.
Look at ab, or httperf; there are many other tools available.
Stephan
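What a load test reports can be shown with a toy version in Ruby. This is a sketch only and no replacement for ab or httperf: load_test is a made-up helper, and the target block sleeps instead of calling a real server (a real run might wrap Net::HTTP.get there).

```ruby
require "timeout"

# Runs `clients` concurrent callers against `target` and reports the
# latency of each completed call plus the number of timeouts.
def load_test(clients:, timeout:, &target)
  threads = Array.new(clients) do |i|
    Thread.new do
      started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
      begin
        Timeout.timeout(timeout) { target.call(i) }
        Process.clock_gettime(Process::CLOCK_MONOTONIC) - started
      rescue Timeout::Error
        :timeout
      end
    end
  end
  results = threads.map(&:value)
  latencies, timeouts = results.partition { |r| r.is_a?(Float) }
  { latencies: latencies.sort, timeouts: timeouts.size }
end

# Hypothetical target: every third client hits a slow path.
report = load_test(clients: 6, timeout: 0.2) do |i|
  sleep(i % 3 == 0 ? 0.5 : 0.05)
end

puts "completed: #{report[:latencies].size}, timed out: #{report[:timeouts]}"
puts format("slowest completed: %.3fs", report[:latencies].last)
```

Running the same script before and after a change gives comparable numbers, which is the point of measuring rather than clicking through pages by hand.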
Your Apache or Nginx in front of the Passenger will queue requests until a Passenger worker becomes available. You can limit the number of concurrent workers so your server never stalls (but new visitors will have to wait longer until it's their turn).
It's difficult to tell based on this information. It depends very much on the web server stack you're using and which environment you're running. Different servers (Mongrel, Webrick, Apache using various mechanisms, Unicorn) all have different memory characteristics. Different environments (development vs. test vs. production) all exhibit radically different memory usage characteristics.