How rails resolve multi-requests at the same time? - ruby-on-rails

Prepare for the test:
sleep 10 in a action
Test:
Open two tabs in the browser to visit the action
Result:
When the second request is running, the first request finished and began rendering the view, but the view is still blank.
After the second request finished too, the two requests finished rendering the view at the same time.
Conclusion:
Rails is just one single instance. One request can only enter the action after the previous requests finish. But how to explain the response part? Why the multi-requests finish rendering the views at the same time?

WEBrick is multi-threaded but Rails developers hard-coded a mutex, so it can handle just one request at a time. You can monkey-patch Rails::Server and you are free to run a multi-threaded WEBrick.
Just note that WEBrick will be multithreaded only when config config.cache_classes = true and config.eager_load = true, which is typical to RAILS_ENV=production. This is because class reloading in development is not thread safe.
To get WEBrick fully multi-threaded in Rails 4.0, just add this to config/initializers/multithreaded_webrick.rb:
# Remove Rack::Lock so WEBrick can be fully multi-threaded.
require 'rails/commands/server'
class Rails::Server
def middleware
middlewares = []
middlewares << [Rails::Rack::Debugger] if options[:debugger]
middlewares << [::Rack::ContentLength]
Hash.new middlewares
end
end
The offending code in rails/commands/server.rb that we got rid of is:
# FIXME: add Rack::Lock in the case people are using webrick.
# This is to remain backwards compatible for those who are
# running webrick in production. We should consider removing this
# in development.
if server.name == 'Rack::Handler::WEBrick'
middlewares << [::Rack::Lock]
end
It's not needed on Rails 4.2. It's concurrent out-of-the-box.

Are you using a WEBrick server? That must be because your server is a single threaded server and is capable of fulfilling one request at a time (because of the single worker thread). Now in case of multiple requests, it runs the action part of the request and before running the view renderer it checks to see if there are any pending requests. Now if 10 requests are lined up, it would first complete all of them before actually rendering the views. When all of these requests are completed, the views would now be rendered sequentially.
You can switch to Passenger or Unicorn server if you want multi-threaded environment.
Hope that makes sense.

under your env setup config/environments/development.rb (or in config/application.rb)
add this line :
#Enable threaded mode
config.threadsafe!

Related

Multithreading in Rails: Circular dependency detected while autoloading constant

I have a Rails app in which I have a Rake task that uses multithreading functions supplied by the concurrent-ruby gem.
From time to time I encounter Circular dependency detected while autoloading constant errors.
After Googling for a bit I found this to be related to using threading in combination with loading Rails constants.
I stumbled upon the following GitHub issues: https://github.com/ruby-concurrency/concurrent-ruby/issues/585 and https://github.com/rails/rails/issues/26847
As explained here you need to wrap any code that is called from a new thread in a Rails.application.reloader.wrap do or Rails.application.executor.wrap do block, which is what I did. However, this leads to deadlock.
The recommendation is then to use ActiveSupport::Dependencies.interlock.permit_concurrent_loads to wrap another blocking call on the main thread. However, I am unsure which code I should wrap with this.
Here's what I tried, however this still leads to a deadlock:
#beanstalk = Beaneater.new("#{ENV.fetch("HOST", "host")}:#{ENV.fetch("BEANSTALK_PORT", "11300")}")
tube_name = ENV.fetch("BEANSTALK_QUEUE_NAME", "queue")
pool = Concurrent::FixedThreadPool.new(Concurrent.processor_count * 2)
# Process jobs from tube, the body of this block gets executed on each message received
#beanstalk.jobs.register(tube_name) do |job|
ActiveSupport::Dependencies.interlock.permit_concurrent_loads do
#logger.info "Received job: #{job.id}"
Concurrent::Future.execute(executor: pool) do
Rails.application.reloader.wrap do
# Stuff that references Rails constants etc
process_beanstalk_message(job.body)
end
end
end
end
#beanstalk.jobs.process!(reserve_timeout: 10)
Can anyone shed a light as to how I should solve this? The odd thing is I encounter this in production while other information on this topic seems to imply it should normally only occur in development.
In production I use the following settings:
config.eager_load = true
config.cache_classes = true.
Autoload paths for all environments are Rails default plus two specific folders ("models/validators" & "jobs/concerns").
eager_load_paths is not modified or set in any of my configs so must be equal to the Rails default.
I am using Rails 5 so enable_dependency_loading should equal to false in production.
You likely need to change your eager_load_paths to include the path to the classes or modules that are raising the errors. eager_load_paths is documented in the Rails Guides.
The problem you're running into is that Rails is not loading these constants when the app starts; it automatically loads them when they are called by some other piece of code. In a multithreaded Rails app, two threads may have a race condition when they try to load these constants.
Telling Rails to eagerly load these constants means they will be loaded once when the Rails app is started. It's not enough to say eager_load = true; you have to specify the paths to the class or module definitions as well. In the Rails application configuration, this is an Array under eager_load_paths. For example, to eager load ActiveJob classes:
config.eager_load_paths += ["#{config.root}/app/jobs"]
Or to load a custom module from lib/:
config.eager_load_paths += ["#{config.root}/lib/custom_module"]
Changing your eager load settings will affect the behavior of Rails. For example, in the Rails development environment, you're probably used to running rails server once, and every time you reload one of the endpoints it will reflect any changes to code you've made. That will not work with config.eager_load = true, because the classes are loaded once, at startup. Therefore, you will typically only change your eager_load settings for production.
Update
You can check your existing eager_load_paths from the rails console. For example, these are the default values for a new Rails 5 app. As you can see, it does not load app/**/*.rb; it loads the specific paths that Rails is expected to know about.
Rails.application.config.eager_load_paths
=> ["/app/assets",
"/app/channels",
"/app/controllers",
"/app/controllers/concerns",
"/app/helpers",
"/app/jobs",
"/app/mailers",
"/app/models",
"/app/models/concerns"]
In my gems (i.e., in plezi and iodine) I solve this with if statements, mostly.
You'll find code such as:
require 'uri' unless defined?(::URI)
or
begin
require 'rack/handler' unless defined?(Rack::Handler)
Rack::Handler::WEBrick = ::Iodine::Rack # Rack::Handler.get(:iodine)
rescue Exception
end
I used these snippets because of Circular dependency detected warnings and errors.
I don't know if this helps, but I thought you might want to try it.
I had this issue while trying out two gems that handles parallel processing;
pmap gem
parallel gem
For pmap I kept getting an error related to Celluloid::TaskTerminated and for parallel I was getting a Circular dependency detected while autoloading constant for when I ran it with more than 1 thread. I knew this issue was related to how my classes and modules were eager loading and race to be placed on a thread. I try enabling both of the configs to true config.cache_classes = true and config.eager_load = true in the development env and that did the trick for me.

ActiveRecord::Base.connection.query_cache_enabled in sidekiq

I have a piece of code that performs the same queries over and over, and it's doing that in a background worker within a thread.
I checkout out the activerecord query cache middleware but apparently it needs to be enabled before use. However I'm not sure if it's a safe thing to do and if it will affect other running threads.
you can see the tests here: https://github.com/rails/rails/blob/3e36db4406beea32772b1db1e9a16cc1e8aea14c/activerecord/test/cases/query_cache_test.rb#L19
my question is: can I borrow and/or use the middleware directly to enable query cache for the duration of a block safely in a thread?
when I tried ActiveRecord::Base.cache do my CI started failing left and right...
EDIT: Rails 5 and later: the ActiveRecord query cache is automatically enabled even for background jobs like Sidekiq (see: https://github.com/mperham/sidekiq/wiki/Problems-and-Troubleshooting#activerecord-query-cache for information on how to disable it).
Rails 4.x and earlier:
The difficulty with applying ActiveRecord::QueryCache to your Sidekiq workers is that, aside from the implementation details of it being a middleware, it's meant to be built during the request and destroyed at the end of it. Since background jobs don't have a request, you need to be careful about when you clear the cache. A reasonable approach would be to cache only during the perform method, though.
So, to implement that, you'll probably need to write your own piece of Sidekiq middleware, based on ActiveRecord::QueryCache but following Sidekiq's middleware guide. E.g.,
class SidekiqQueryCacheMiddleware
def call(worker, job, queue)
connection = ActiveRecord::Base.connection
enabled = connection.query_cache_enabled
connection_id = ActiveRecord::Base.connection_id
connection.enable_query_cache!
yield
ensure
ActiveRecord::Base.connection_id = connection_id
ActiveRecord::Base.connection.clear_query_cache
ActiveRecord::Base.connection.disable_query_cache! unless enabled
end
end

Rails Unicorn - Delay between starting request and reaching controller

I am using Unicorn as my app server for my Rails app, and am trying to figure out why there sometimes is sometimes a non-trivial (> 5 seconds) delay between the start of a request, and when it reaches my controller.
This is what my production.log prints out:
Started GET "/search/articles.json?q=mashable.com" for 138.7.7.33 at 2015-07-23 14:59:19 -0400**
Parameters: {"q"=>"mashable.com"}
Searching articles for keyword: mashable.com, format: json, Time: 2015-07-23 14:59:26 -0400
Notice how there is a 7 second delay in between STARTED GET: and "Searching articles for keyword", which is the first thing the controller method does.
articles.json is routed to my controller method "articles" which simply does this for now:
def articles
format = params[:format]
keyword = params["q"]
Rails.logger.info "Searching articles for keyword: #{keyword}, format: #{format}, Time: #{Time.now.to_s}"
end
This is my routes.rb
MyApp::Application.routes.draw do
match '/search/articles' => 'search#articles'
#more routes here, but articles is the first route
end
What could possibly cause this delay? Is it because an Unicorn worker is busy? Is it because an Unicorn worker is taking up too much memory which leads the system to be slow?
Note: I don't believe the delay is in making any database connections but I could be wrong. The code doesn't need to make a database call, and the max connections for my database is 1000, and there are usually at most 1-2 connections.
Three thoughts:
You'll probably be better served using Puma instead of Unicorn
It could be that your system is running out of memory, or it could have plenty of memory available: install New Relic to troubleshoot where the bottleneck is
It could also be that you have more Unicorn instances than the number of connections your DB allows, in which case the instance is having to wait for others to disconnect before it can connect. This would likely manifest itself with irregular 5-second delays rather than happening every time.
Actually, it might be caused by an before_filter callback, you should check it
I think it can be because of lack of memory and thus frequent garbage collection, which freeze whole system.
If it's a production problem it could be caused by slow clients sending requests. New Relic and Monit are good options. You could consider sending signals to Unicorn workers to restart them to better understand the problem.
You could also try adding preload_app true in your Unicorn config to speed up the startup time of worker processes.

max-requests-per-worker in unicorn

I sought, but did not find, a max-requests-per-worker option in unicorn similar to gunicorn's max_requests or apache's MaxRequestsPerChild.
Does it exist?
If not, has anyone implemented it?
I'm thinking of putting it in the file where I have oobgc, since that gets control after every requests anyway. Does that sound about right?
The problem is that my unicorn workers are getting big and fat, and garbage collection is taking more and more of my CPU.
i've just released 'unicorn-worker-killer' gem. This enables you to kill Unicorn worker based on 1) Max number of requests and 2) Process memory size (RSS), without affecting the request. It's really easy to use. At first, please add this line to your Gemfile.
gem 'unicorn-worker-killer'
Then, please add the following lines to your config.ru.
# Unicorn self-process killer
require 'unicorn/worker_killer'
# Max requests per worker
use Unicorn::WorkerKiller::MaxRequests, 3072, 4096
# Max memory size (RSS) per worker
use Unicorn::WorkerKiller::Oom, (256*(1024**2)), (384*(1024**2))
It's highly recommended to randomize the threshold to avoid killing all workers at once.
Unicorn doesn't offer a max-requests.
The unicorn master will re-spawn any worker which exits and a worker will gracefully exit at the end of the current request when it receives a QUIT signal, so you could easily roll your own max request logic into your worker request life-cycle.
With Rails, something like the following in your application controller (alternatively, similar logic in a rack middleware)
after_filter do
##request_count ||= 0
Process.kill('QUIT',$$) if (##request_count += 1) > MAX_REQUESTS
end

Determine if script/server is being started

In Rails, in an initializer/environment.rb Whats the pefered way to detemrine if the webapp itself is being loaded (script/server).
All the initializers are loaded for migrations script/console and other rails task as well, but in my case some stuff only has to be loaded when the server itself is being initialized.
My ideas: checking $0
Thanks!
Reto
Because there are multiple application servers, each with their own initialization strategy, I would recommend the only way to reliably hook into the server boot process: ActionController::Dispatcher.
The dispatcher has some callbacks; namely:
prepare_dispatch (added with to_prepare)
before_dispatch
after_dispatch
The "prepare" callbacks are run before every request in development mode, and before the first request in production mode. The Rails configuration object allows you to add such callbacks via its own to_prepare method:
Rails::Initializer.run do |config|
config.to_prepare do
# do your special initialization stuff
end
end
Unfortunately, to my knowledge this callback will always be run since Rails initializer calls Dispatcher.run_prepare_callbacks regardless of if we're booting up with a server or to a script/console or even a rake task. You want to avoid this, so you might try this in your environment.rb:
Rails::Initializer.run do |config|
# your normal stuff
end
if defined? ActionController::Dispatcher
ActionController::Dispatcher.to_prepare do
# your special stuff
end
end
Now, your "special stuff" will only execute before first request in production mode, but before every request in development. If you're loading extra libraries, you might want to avoid loading something twice by putting an if statement around load or require. The require method will not load a single file twice, but I still recommend that you put a guard around it.
There is probably a better way to do this, but since I am not aware of one, I would probably alter script/server to set an environment variable of some kind.
Then I would have my initializer check for that environment variable.

Resources