For example, if I assign Thread.current[:user] at the beginning of some requests, do I have to clean it up at the end of those requests? Does this differ between versions of Rails, or between server software such as Passenger, Mongrel, and JRuby + GlassFish?
Hongli Lai (http://groups.google.com/group/phusion-passenger/msg/8c3fc0ba589726bf) says that Mongrel spawns a new thread for every request, but all other app servers process subsequent requests on the same thread. Clearing Thread.current at the beginning of every request (or not using it at all) seems to be the safest way to deal with this.
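A minimal sketch of that cleanup as an around_filter (the filter name clear_thread_locals is illustrative, and this uses the Rails 2/3-era filter API):

class ApplicationController < ActionController::Base
  around_filter :clear_thread_locals

  private

  # Reset thread-local state before and after each request, so a thread
  # reused by the app server never sees the previous request's user.
  def clear_thread_locals
    Thread.current[:user] = nil
    yield
  ensure
    Thread.current[:user] = nil
  end
end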
Related
I have a simple app that I want to use as a web service.
My problem is that I can't handle more than 1 request at the same time.
Apparently, the requests are enqueued and executed one by one, so if I make 2 requests to the same URL, the second has to wait for the first one.
I've already tried Unicorn, Puma and Thin to enable concurrent requests, but the requests still seem to be queued by URL.
Example:
I make the request 1 at localhost:3000/example
I make another request at localhost:3000/another_example
I make the last request at localhost:3000/example
The first and second requests are executed concurrently, but the last one (which has the same URL as the first) has to wait for the first to finish.
Unicorn, Puma and Thin do enable concurrency, but apparently only across different URLs.
NOTES:
I added on my config/application.rb:
config.allow_concurrency = true
I'm running the app with:
rails s Puma
How can I perform my requests concurrently?
You're right: each Puma/Thin/Unicorn/Passenger/WEBrick worker houses a single Rails app instance (or Sinatra app instance, etc.) per Ruby process. So it's 1 web worker = 1 app instance = 1 Ruby process.
Each request blocks the process until the response is ready. So it's usually 1 request per process.
Ruby itself has the so-called GIL (Global Interpreter Lock), which blocks multiple threads from executing Ruby code at the same time, because C extensions lack thread-safety controls such as mutexes and semaphores. This means threads won't run Ruby code concurrently. In practice, they "can": I/O operations block execution while waiting for a response, such as reading a file or waiting on a network socket, and in those cases Ruby allows another thread to resume until the previous thread's I/O operation finishes.
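A small self-contained demonstration of that difference (timings are approximate and machine-dependent):

require 'benchmark'

# CPU-bound: the GIL serializes Ruby execution, so two threads take
# roughly as long as running the same work sequentially.
cpu = Benchmark.realtime do
  2.times.map { Thread.new { 5_000_000.times { Math.sqrt(42) } } }.each(&:join)
end

# I/O-bound: sleep releases the GIL, so both threads wait in parallel
# and the total is about 1 second, not 2.
io = Benchmark.realtime do
  2.times.map { Thread.new { sleep 1 } }.each(&:join)
end

puts "CPU-bound: #{cpu.round(2)}s, I/O-bound: #{io.round(2)}s"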
Rails used to have a single block of execution per request as well: its own lock. But in Rails 3 they added thread-safety controls throughout the Rails code so it could run on JRuby, for example, and in Rails 4 they decided to turn those thread-safety controls on by default.
In theory this means that more than one request can run in parallel even in Ruby MRI (which has supported native threads since 1.9). In practice, one request can run while another is waiting on, say, a database call, so you should see a few more requests running in parallel. If your example is CPU-bound (more internal processing than I/O), the effect will be as if the requests ran one after the other. If it has more I/O blocks (such as waiting for a large SQL SELECT to return), you should see requests running more in parallel (though not completely).
You will see parallel requests more often on a virtual machine that has not only native threads but also no Global Interpreter Lock, as is the case with JRuby. So I recommend JRuby with Puma.
Puma and Passenger are both multi-threaded. Unicorn is fork-based. Thin is EventMachine-based. I'd personally recommend testing Passenger as well.
http://tenderlovemaking.com/2012/06/18/removing-config-threadsafe.html
https://bearmetal.eu/theden/how-do-i-know-whether-my-rails-app-is-thread-safe-or-not/
Update:
Read "Indicate to an ajax process that the delayed job has completed" before if you have the same problem. Thanks Gene.
I have a problem with concurrency: I have a controller that scrapes a few web sites, and each call to the controller takes about 4-5 seconds to respond.
So if I make 2 (or more) calls in a row, the second call has to wait for the first to finish before starting.
So how I can fix this problem in my controller? Maybe with something like EventMachine?
Update & Example:
application_controller.rb
def func1
  i = 0
  while i <= 2
    puts "func1 at: #{Time.now}"
    sleep(2)
    i += 1
  end
end
def func2
  j = 0
  while j <= 2
    puts "func2 at: #{Time.now}"
    sleep(1)
    j += 1
  end
end
whatever_controller.rb
puts ">>>>>>>> Started At #{Time.now}"
func1()
func2()
puts "End at #{Time.now}"
So now I need to request http://myawesome.app/whatever several times at the same time, from the same user/browser/etc.
I tried Heroku (and locally) with Unicorn, but without success. This is my setup:
unicorn.rb http://pastebin.com/QL0wdGx0
Procfile http://pastebin.com/RrTtNWJZ
Heroku setup https://www.dropbox.com/s/wxwr5v4p61524tv/Screenshot%202014-02-20%2010.33.16.png
Requirements:
I need a RESTful solution. This is an API, so it needs to respond with JSON.
More info:
Right now I have 2 cloud servers running:
Heroku with Unicorn
Engine Yard Cloud with Nginx + Passenger
You're probably using WEBrick in development mode. WEBrick only handles one request at a time.
You have several options; many Ruby web servers exist that can handle concurrency.
Here are a few of them.
Thin
Thin was originally based on Mongrel and uses EventMachine to handle multiple concurrent connections.
Unicorn
Unicorn uses a master process that dispatches requests to web workers; 4 workers means up to 4 concurrent requests.
Puma
Puma is a relatively new Ruby server; its shiny feature is that it handles concurrent requests in threads, so make sure your code is thread-safe!
Passenger
Passenger is a Ruby server bundled inside Nginx or Apache; it's great for both production and development.
Others
These are a few alternatives; many others exist, but I think these are the most used today.
To use any of these servers, check their instructions; they are generally available in the README on their GitHub pages.
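To give a feel for the worker/thread knobs, here is a minimal Puma config sketch; the counts are arbitrary, not a recommendation:

# config/puma.rb
workers 2        # forked processes, each with its own GIL
threads 1, 16    # min and max threads per worker
preload_app!     # load the app before forking so workers share memory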
For any controller function with a long response time, the delayed_job gem is a fine way to go. While it is often used for bulk mailing, it works just as well for any long-running task.
Your controller starts the delayed job and responds immediately with a page that has a placeholder - usually a graphic with a progress indicator - and Ajax or a timed reload that updates the page with the full information when it's available. Some information on how to approach this is in this SO article.
Not mentioned in the article is that you can use redis or some other memory cache to store the results rather than the main database.
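A minimal sketch of the placeholder-plus-polling pattern with delayed_job; the Report model, its status and result columns, and the long_scrape method are illustrative, not from the answer:

class ReportsController < ApplicationController
  def create
    report = Report.create!(status: "pending")
    report.delay.long_scrape   # delayed_job runs long_scrape in a worker
    render json: { id: report.id, status: report.status }
  end

  # The Ajax poller hits this until status flips to "done".
  def show
    report = Report.find(params[:id])
    render json: { id: report.id, status: report.status, result: report.result }
  end
end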
The answers above are part of the solution: you need a server environment that can properly dispatch concurrent requests to separate workers; Unicorn or Passenger can both work by creating workers in separate processes or threads. This allows many workers to sit waiting without blocking other incoming requests.
If you are building a typical bot whose main job is to get content from other sources, these solutions may be fine. But if what you need is a simple controller that can accept hundreds of concurrent requests, each of which sends independent requests to other servers, you will need to manage threads or processes yourself. Your goal is to have many workers waiting to do a simple job, and one or more masters whose job it is to send requests and then be there to receive the responses. Ruby's Thread class is simple and works well for cases like this with Ruby 2.x or 1.9.3, as sketched below.
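A self-contained sketch of that fan-out-and-collect pattern with plain threads (the URLs are placeholders):

require 'net/http'

urls = %w[http://example.com/a http://example.com/b http://example.com/c]

# One thread per upstream request; the blocking HTTP call releases the
# GIL, so the fetches overlap.
threads = urls.map do |url|
  Thread.new { Net::HTTP.get_response(URI.parse(url)) }
end

# Thread#value joins the thread and returns the block's result.
threads.map(&:value).each { |response| puts response.code }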
You would need to provide more detail about what you're trying to do to get help with a more specific solution.
Try something like Unicorn, as it handles concurrency via workers. Something else to consider, if there's a lot of work to be done per request, is spinning up a delayed_job per request.
The one issue with delayed_job is that the response won't be synchronous, meaning it won't return directly to the user's browser.
However, you could have the delayed job save its responses to a table in the DB. Then you can query that table for all requests and their related responses.
What Ruby version are you using?
Ruby & Webserver
Ruby
If it's a simple application, I would recommend the following: try Rubinius (rbx) or JRuby, as they are better at concurrency. They have the drawback of not being mainline Ruby, so some extensions won't work, but for a simple app you should be fine.
Webserver
Use Puma, or Unicorn if you have the patience to set it up.
If your app is hitting an API service
You indicate that the Global Lock is killing you when you are scraping other sites (presumably ones that allow scraping). If this is the case, something like Sidekiq or delayed_job should be used, but with caution: the jobs must be idempotent, i.e. safe to run multiple times. Also, if you hit a website repeatedly you will reach its rate limit pretty quickly; e.g. Twitter limits you to 150 requests per hour. So use background jobs with caution.
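A minimal Sidekiq worker for the scraping case might look like this; ScrapeWorker and its body are illustrative:

require 'sidekiq'
require 'net/http'

class ScrapeWorker
  include Sidekiq::Worker
  sidekiq_options retry: 3

  # Keep this idempotent: Sidekiq may execute it more than once.
  def perform(url)
    html = Net::HTTP.get(URI.parse(url))
    # ... parse html and persist the result ...
  end
end

# Enqueue from a controller without blocking the request:
ScrapeWorker.perform_async("http://example.com/page")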
If you're the one serving the data
However, reading your question, it sounds like your controller is the API and the lock is caused by users hitting it.
If this is the case, you should use dalli + memcached to serve your data. That way you won't be I/O-bound by the SQL lookup, since memcached is memory-based. MEMORY SPEED > I/O SPEED
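A minimal sketch of that setup with the dalli gem's :dalli_store (the Record model and cache key are illustrative):

# config/environments/production.rb
config.cache_store = :dalli_store, "localhost:11211"

# In the controller: serve from memcached, fall back to SQL on a miss.
def show
  @record = Rails.cache.fetch("record/#{params[:id]}", expires_in: 10.minutes) do
    Record.find(params[:id])
  end
  render json: @record
end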
My Rails application has a route that takes a lot of time to process, and while it runs the entire site freezes.
Why does this happen? Is it Rails, or third-party gems that are not thread-safe?
Is there any way to work around this? I'm considering a process pool, just like a thread pool except heavier. It will take a lot of memory, but it will be cheaper than halting the whole app.
The first thing to note is that your Rails action should not be heavyweight: when a user requests a page, you should serve that user right away.
Now, there are cases where the user needs to wait for a result, in which case you can always use websockets or HTTP streaming.
Now, Ruby and Rails have a problem with threads, which you can read about in "Parallelism is a Myth in Ruby."
One solution in Rails is to use a server like Unicorn, which forks as many worker processes as you want, each working independently of the others, or Puma, which creates multiple threads, etc.
Now, if you have an action that does heavy processing, you may want to delegate the work to a background job queue like delayed_job. You can even build a nice UI with JavaScript that fetches the job status and shows progress to the user. You can also use a pool of tasks with RabbitMQ, where another process in the background listens for new messages, acts on them, and can even send back a response, etc.
Keep in mind that most web servers have a client timeout, and you don't want the user to wait a minute or more without a response. So it's always nice to use a streaming response to give some feedback right away while the action completes, or to answer with some JavaScript that keeps polling the server for the task's progress, or even a websocket if required.
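As one concrete shape for that streaming feedback, a sketch using Rails 4's ActionController::Live; the controller name and loop body are illustrative:

class ProgressController < ApplicationController
  include ActionController::Live

  def show
    response.headers["Content-Type"] = "text/event-stream"
    5.times do |i|
      # Server-sent events: each write reaches the client immediately.
      response.stream.write("data: step #{i + 1} of 5\n\n")
      sleep 1   # stand-in for a slice of the real work
    end
  ensure
    response.stream.close
  end
end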
Rails uses a mutex lock around the entire request in the middleware stack, so a Rails process only ever takes one request at a time.
However, you can disable this by enabling the config.threadsafe! option AND using a multithreaded server, such as Puma.
Then there is the whole roadblock of MRI, which doesn't really let two threads run Ruby code at the same time; threads only overlap while one of them is blocked on I/O.
To get around that, you would need a Ruby implementation that supports truly parallel threads, such as Rubinius or JRuby.
I'm contemplating writing a web application with Rails. Each request made by the user will depend on an external API being called. This external API can randomly be very slow (2-3 seconds), and so obviously this would impact an individual request.
During this time when the code is waiting for the external API to return, will further user requests be blocked?
Just for further clarification as there seems to be some confusion, this is the model I'm anticipating:
Alice makes request to my web app. To fulfill this, a call to API server A is made. API server A is slow and takes 3 seconds to complete.
During this wait time when the Rails app is calling API server A, Bob makes a request which has to make a request to API server B.
Is the Ruby (1.9.3) interpreter (or something in the Rails 3.x framework) going to block Bob's request, requiring him to wait until Alice's request is done?
If you use only one single-threaded, non-evented server (or don't use evented I/O with an evented server), yes. Among other solutions, using Thin with EM-Synchrony will avoid this.
Elaborating, based on your update:
No, neither Ruby nor Rails is going to cause your app to block. You left out the part that will, though: the web server. You either need multiple processes, multiple threads, or an evented server coupled with doing your web service requests with an evented I/O library.
#alexd described using multiple processes. I personally favor an evented server, because then I don't need to know or guess ahead of time how many concurrent requests I might have (or use something that spins up processes based on load). A single nginx process fronting a single Thin process can serve tons of parallel requests.
The answer to your question depends on the server your Rails application is running on. What are you using right now? Thin? Unicorn? Apache+Passenger?
I wholeheartedly recommend Unicorn for your situation -- it makes it very easy to run multiple server processes in parallel, and you can configure the number of parallel processes simply by changing a number in a configuration file. While one Unicorn worker is handling Alice's high-latency request, another Unicorn worker can be using your free CPU cycles to handle Bob's request.
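That configuration file is plain Ruby; a minimal sketch, where 4 is an arbitrary worker count:

# config/unicorn.rb
worker_processes 4   # one slow request only ties up 1 of the 4 workers
timeout 30           # kill a worker stuck longer than 30 seconds
preload_app true     # load the app once in the master, then fork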
Most likely, yes. There are ways around this, obviously, but none of them are easy.
The better question is, why do you need to hit the external API on every request? Why not implement a cache layer between your Rails app and the external API and use that for the majority of requests?
This way, with some custom logic for expiring the cache, you'll have a snappy Rails app and still be able to leverage the external API service.
I have a Ruby on Rails Website that makes HTTP calls to an external Web Service.
About once a day I get a SystemExit (stacktrace below) error email where a call to the service has failed. If I then try the exact same query on my site moments later it works fine.
It's been happening since the site went live and I've had no luck tracking down what causes it.
Ruby is version 1.8.6 and Rails is version 1.2.6.
Anyone else have this problem?
This is the error and stacktrace.
A SystemExit occurred
/usr/local/lib/ruby/gems/1.8/gems/rails-1.2.6/lib/fcgi_handler.rb:116:in `exit'
/usr/local/lib/ruby/gems/1.8/gems/rails-1.2.6/lib/fcgi_handler.rb:116:in `exit_now_handler'
/usr/local/lib/ruby/gems/1.8/gems/activesupport-1.4.4/lib/active_support/inflector.rb:250:in `to_proc'
/usr/local/lib/ruby/1.8/net/protocol.rb:133:in `call'
/usr/local/lib/ruby/1.8/net/protocol.rb:133:in `sysread'
/usr/local/lib/ruby/1.8/net/protocol.rb:133:in `rbuf_fill'
/usr/local/lib/ruby/1.8/timeout.rb:56:in `timeout'
/usr/local/lib/ruby/1.8/timeout.rb:76:in `timeout'
/usr/local/lib/ruby/1.8/net/protocol.rb:132:in `rbuf_fill'
/usr/local/lib/ruby/1.8/net/protocol.rb:116:in `readuntil'
/usr/local/lib/ruby/1.8/net/protocol.rb:126:in `readline'
/usr/local/lib/ruby/1.8/net/http.rb:2017:in `read_status_line'
/usr/local/lib/ruby/1.8/net/http.rb:2006:in `read_new'
/usr/local/lib/ruby/1.8/net/http.rb:1047:in `request'
/usr/local/lib/ruby/1.8/net/http.rb:945:in `request_get'
/usr/local/lib/ruby/1.8/net/http.rb:380:in `get_response'
/usr/local/lib/ruby/1.8/net/http.rb:543:in `start'
/usr/local/lib/ruby/1.8/net/http.rb:379:in `get_response'
Using FCGI with Ruby is known to be very buggy.
Practically everybody has moved to Mongrel for this reason, and I recommend you do the same.
It's been a while since I used FCGI, but I think an FCGI process could throw a SystemExit if a thread was taking too long. That could be the web service not responding, or even a slow DNS query. Some Google results show a similar error with Python and FCGI, so moving to Mongrel would be a good idea. This post is the reference I used to set up Mongrel, and I still refer back to it.
I used to get these all the time on Apache 1/FastCGI. I think they're caused by FastCGI hanging up before Ruby is done.
Switching to Mongrel is a good first step, but there's more to do. It's a bad idea to fetch from web services on live pages, particularly from Rails. Rails is not thread-safe, so the number of concurrent connections you can support equals the number of Mongrels (or Passenger processes) in your cluster.
If you have one Mongrel and someone accesses a page that calls a web service that takes 10 seconds to time out, every request to your website will time out during that window. Most load balancers just cycle through your Mongrels blindly, so if you have two Mongrels, every other request will time out.
Anything that can be unpredictably slow needs to happen in a job queue. The first hit to /slow/action adds the job to the queue, and /slow/action keeps refreshing via page reloads or Ajax polls until the job is finished; then you get your results from the job queue. There are a few job queues for Rails these days, but the oldest and probably most widely used is BackgroundRB.
Another alternative, depending on the nature of your app, is to fetch from the service every N minutes via cron, cache the data locally, and have your live page read from the cache, as sketched below.
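On a more recent Rails stack, that cron-plus-cache idea might look like this; the task name, URL, and cache key are all illustrative:

# lib/tasks/refresh_cache.rake
require 'net/http'

namespace :cache do
  desc "Fetch the external service and cache the result"
  task :refresh => :environment do
    data = Net::HTTP.get(URI.parse("http://example.com/feed"))
    Rails.cache.write("external_feed", data, :expires_in => 600)
  end
end

# crontab entry, every 5 minutes:
# */5 * * * * cd /path/to/app && bundle exec rake cache:refresh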
I would also take a look at Passenger. It's a lot easier to get going than the traditional solution of Apache/nginx + Mongrel.