Multithreading in Rails: Circular dependency detected while autoloading constant

I have a Rails app in which I have a Rake task that uses multithreading functions supplied by the concurrent-ruby gem.
From time to time I encounter "Circular dependency detected while autoloading constant" errors.
After Googling for a bit I found this to be related to using threading in combination with loading Rails constants.
I stumbled upon the following GitHub issues: https://github.com/ruby-concurrency/concurrent-ruby/issues/585 and https://github.com/rails/rails/issues/26847
As explained here you need to wrap any code that is called from a new thread in a Rails.application.reloader.wrap do or Rails.application.executor.wrap do block, which is what I did. However, this leads to deadlock.
The recommendation is then to use ActiveSupport::Dependencies.interlock.permit_concurrent_loads to wrap another blocking call on the main thread. However, I am unsure which code I should wrap with this.
Here's what I tried, however this still leads to a deadlock:
@beanstalk = Beaneater.new("#{ENV.fetch("HOST", "host")}:#{ENV.fetch("BEANSTALK_PORT", "11300")}")
tube_name = ENV.fetch("BEANSTALK_QUEUE_NAME", "queue")
pool = Concurrent::FixedThreadPool.new(Concurrent.processor_count * 2)
# Process jobs from tube, the body of this block gets executed on each message received
@beanstalk.jobs.register(tube_name) do |job|
  ActiveSupport::Dependencies.interlock.permit_concurrent_loads do
    @logger.info "Received job: #{job.id}"
    Concurrent::Future.execute(executor: pool) do
      Rails.application.reloader.wrap do
        # Stuff that references Rails constants etc
        process_beanstalk_message(job.body)
      end
    end
  end
end
@beanstalk.jobs.process!(reserve_timeout: 10)
Can anyone shed some light on how I should solve this? The odd thing is that I encounter this in production, while other information on this topic seems to imply it should normally only occur in development.
In production I use the following settings:
config.eager_load = true
config.cache_classes = true
Autoload paths for all environments are Rails default plus two specific folders ("models/validators" & "jobs/concerns").
eager_load_paths is not modified or set in any of my configs so must be equal to the Rails default.
I am using Rails 5, so enable_dependency_loading should be false in production.

You likely need to change your eager_load_paths to include the path to the classes or modules that are raising the errors. eager_load_paths is documented in the Rails Guides.
The problem you're running into is that Rails is not loading these constants when the app starts; it automatically loads them when they are called by some other piece of code. In a multithreaded Rails app, two threads may have a race condition when they try to load these constants.
Telling Rails to eagerly load these constants means they will be loaded once when the Rails app is started. It's not enough to say eager_load = true; you have to specify the paths to the class or module definitions as well. In the Rails application configuration, this is an Array under eager_load_paths. For example, to eager load ActiveJob classes:
config.eager_load_paths += ["#{config.root}/app/jobs"]
Or to load a custom module from lib/:
config.eager_load_paths += ["#{config.root}/lib/custom_module"]
Changing your eager load settings will affect the behavior of Rails. For example, in the Rails development environment, you're probably used to running rails server once, and every time you reload one of the endpoints it will reflect any changes to code you've made. That will not work with config.eager_load = true, because the classes are loaded once, at startup. Therefore, you will typically only change your eager_load settings for production.
Update
You can check your existing eager_load_paths from the rails console. For example, these are the default values for a new Rails 5 app. As you can see, it does not load app/**/*.rb; it loads the specific paths that Rails is expected to know about.
Rails.application.config.eager_load_paths
=> ["/app/assets",
"/app/channels",
"/app/controllers",
"/app/controllers/concerns",
"/app/helpers",
"/app/jobs",
"/app/mailers",
"/app/models",
"/app/models/concerns"]

In my gems (i.e., in plezi and iodine) I solve this with if statements, mostly.
You'll find code such as:
require 'uri' unless defined?(::URI)
or
begin
require 'rack/handler' unless defined?(Rack::Handler)
Rack::Handler::WEBrick = ::Iodine::Rack # Rack::Handler.get(:iodine)
rescue Exception
end
I used these snippets because of Circular dependency detected warnings and errors.
I don't know if this helps, but I thought you might want to try it.

I had this issue while trying out two gems that handle parallel processing:
pmap gem
parallel gem
With pmap I kept getting an error related to Celluloid::TaskTerminated, and with parallel I got Circular dependency detected while autoloading constant whenever I ran it with more than one thread. I knew this issue was related to how my classes and modules were being loaded and racing to be placed on a thread. I tried setting both config.cache_classes = true and config.eager_load = true in the development environment, and that did the trick for me.

Related

Access the model in production.rb rails 3

I have a model called SystemSettings with a name and a value. It is where I store the majority of the configuration for my app. I need to be able to access it in production.rb inside my Rails 3.2 app. How would you go about doing this?
Since Rails config files such as production.rb are read before ActiveRecord is initialised, you would need to use a callback:
Rails.application.configure do
ActiveSupport.on_load(:active_record) do
config.custom_variable = SystemSettings.find_by(name: "Foo").value
end
end
But since the callback executes later, once ActiveRecord is ready, you can't use its value immediately, which is why your approach may be flawed due to race conditions.
Unless you are building something like a CMS where you need to provide a user interface to edit system settings, you will be better off using environment variables. They are immediately available from memory and do not have the overhead of a database query.
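As a rough sketch of the environment-variable approach: the setting name MAX_UPLOAD_MB and the helper integer_setting below are hypothetical and are not taken from the question. The idea is simply to read the value once, with a default, and convert it explicitly.

```ruby
# Hypothetical helper: reads an integer setting from the environment,
# falling back to a default when the variable is not set.
# The env argument defaults to ENV but accepts any Hash-like object.
def integer_setting(name, default, env = ENV)
  Integer(env.fetch(name, default.to_s))
end

# MAX_UPLOAD_MB is an assumed name used only for illustration.
max_upload_mb = integer_setting("MAX_UPLOAD_MB", 25)
puts "max upload size: #{max_upload_mb} MB"
```

Inside production.rb the result could be assigned in the same way as config.custom_variable above, without waiting for ActiveRecord to load.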
http://guides.rubyonrails.org/v3.2.9/initialization.html

Autoload related race condition in Cucumber with AJAX requests

I'm using Cucumber with capybara-webkit for my app's integration tests on Ruby 2.0.0, Rails 4.1. A handful of tests in my Cucumber test suite unexpectedly began spitting out errors like this:
Circular dependency detected while autoloading constant UiValidators::ParameterFinder (RuntimeError)
/Users/kingp/.rvm/gems/ruby-2.0.0-p451@triquest/gems/activesupport-4.1.1/lib/active_support/dependencies.rb:484:in `load_missing_constant'
/Users/kingp/.rvm/gems/ruby-2.0.0-p451@triquest/gems/activesupport-4.1.1/lib/active_support/dependencies.rb:180:in `const_missing'
/Users/kingp/Projects/rails-triquest/app/controllers/contacts_controller.rb:2:in `<class:ContactsController>'
/Users/kingp/Projects/rails-triquest/app/controllers/contacts_controller.rb:1:in `<top (required)>'
/Users/kingp/.rvm/gems/ruby-2.0.0-p451@triquest/gems/activesupport-4.1.1/lib/active_support/dependencies.rb:247:in `require'
...
The error says 'circular dependency', but it is actually thrown at any time the Rails autoloader tries to load a constant that is already in its set of loaded constants. Typically this is indeed due to a circular dependency, but I'm pretty sure that's not the case in my app. A diff between the branch with the crashing test and the stable branch I forked from shows that the only changes are to coffeescript files, view templates, a migration, and the new cucumber features I was writing. I haven't touched any controller or model code.
I ended up inserting some logging code into the rails autoloader to help me figure out what's going on:
# Inserted at activesupport-4.1.1/lib/active_support/dependencies.rb:467
_thread_id_for_debug = Thread.current.object_id
STDERR.puts "*** #{loaded.count} #{from_mod} #{const_name} - #{_thread_id_for_debug}"
loaded is a set of paths to autoloaded code files, from_mod is the context where the request came from, and const_name is the constant we're trying to load. All of this ultimately gave me the following output, immediately before the crash:
*** 104 Object SitesController - 70180261360940
*** 105 Object ContactsController - 70180240113760
*** 105 SitesController UiValidators - 70180261360940
*** 105 Object UiValidators - 70180261360940
*** 105 UiValidators ParameterFinder - 70180261360940
*** 107 UiValidators ParameterFinder - 70180240113760
It looks like two threads are attempting to autoload the same constant. My guess is that the name of the constant is added to Rails' set of 'loaded' constants by the first thread before it has finished loading. The second thread can't resolve the constant (since the load hasn't finished yet), asks the autoloader to find it, and the autoloader raises when it sees the constant in its 'loaded' set.
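The guess above can be modelled in plain Ruby. This is only a toy sketch of the suspected race, not ActiveSupport's actual implementation: ToyAutoloader and its methods are made-up names, and two queues are used to force the interleaving so that the second lookup always arrives while the first load is still in progress.

```ruby
require "set"

# Toy model of the suspected behaviour; NOT the real Rails autoloader.
class ToyAutoloader
  class CircularDependency < RuntimeError; end

  def initialize
    @loaded = Set.new # names marked as loaded, possibly still mid-load
    @constants = {}   # names whose load has actually finished
  end

  def load_constant(name)
    return @constants[name] if @constants.key?(name)
    # Marked as loaded but not yet defined: the toy loader assumes a cycle.
    if @loaded.include?(name)
      raise CircularDependency,
            "Circular dependency detected while autoloading constant #{name}"
    end
    @loaded << name
    yield if block_given? # stands in for the time spent evaluating the file
    @constants[name] = Object.new
  end
end

loader = ToyAutoloader.new
mid_load = Queue.new
release = Queue.new

# First thread: starts loading and pauses halfway through.
first = Thread.new do
  loader.load_constant("UiValidators::ParameterFinder") do
    mid_load << true # signal that the load has started
    release.pop      # wait until the other lookup has been attempted
  end
end

# Second lookup (here on the main thread) arrives while the first is mid-load.
mid_load.pop
second_error =
  begin
    loader.load_constant("UiValidators::ParameterFinder")
    nil
  rescue ToyAutoloader::CircularDependency => e
    e
  end
release << true
first.join

puts second_error.message
```

No real cycle exists in this sketch, yet the second lookup still raises, which matches the symptom in the log output above.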
At this point in the test, two controllers (SitesController and ContactsController) are responding to AJAX requests, launched nearly simultaneously.
I have found a way to work around the crash, by just including a reference to the module UiValidators::ParameterFinder ahead of the AJAX. But this seems fragile, and also not very elegant. Short of turning on eager loading for the test environment, is there any other way to avoid this problem?
I had the same problem (without Cucumber, just Capybara & Poltergeist). Setting config.eager_load = true didn't even work for me (I don't quite understand why not).
I ended up using Spring and haven't had a circular dependency error since.
I have the same issue with Rails 4.1.4 when using Sidekiq. I assume that a race condition inside the threaded Sidekiq workers caused all kinds of hijinks when const_missing inside active_support was called.
In addition to making sure that my current environment performs eager loading (i.e. via config.eager_load = true), I also had to add all components from the lib directory that my workers were using to config.eager_load_paths (via config.eager_load_paths += %W(#{config.root}/lib) inside config/application.rb).
This was necessary because, as far as I can tell, setting config.eager_load = true only makes Rails eager load the contents of the app/ directory.
App::Application.config.eager_load_paths
=> [
[0] "/home/archive/releases/20140721180504/app/assets",
[1] "/home/archive/releases/20140721180504/app/controllers",
[2] "/home/archive/releases/20140721180504/app/helpers",
[3] "/home/archive/releases/20140721180504/app/mailers",
[4] "/home/archive/releases/20140721180504/app/models",
[5] "/home/archive/releases/20140721180504/app/services",
[6] "/home/archive/releases/20140721180504/app/workers"
]
The combination of both seemed to have helped with the issue.

Joining separate log to main Rails development log

This is the reverse of the question I have seen several times elsewhere, in which someone wants to see how to create another, separate Rails log from the main development log. For some reason, my Rails app is logging my DelayedJob gem's activity to a separate log (delayed_job.log), but I want it to log to the main development.log file. I am using the Workless gem and NewRelic as well, in case that is relevant (although I experimented with removing NewRelic, and the issue remained).
I'm not clear on how this happened. However, I was having some trouble earlier with seeing SQL insertions and deletions in my log, and another user kindly suggested that I use the following in an initializer file:
if defined?(Rails) && !Rails.env.nil?
logger = Logger.new(STDOUT)
ActiveRecord::Base.logger = logger
ActiveResource::Base.logger = logger
end
Once I did this, I saw the SQL statements, but no longer saw the DelayedJob information in the main development log.
So my question is: How can I make sure that DelayedJob activity logs to the main development log? I don't mind if it also logs to a separate log, but the important thing is that I see its activity in my Mac's console.
Please let me know if you'd like more code from my app - I'd be happy to provide it. Much thanks from a Rails newbie.
Try adding the following line to config/initializers/delayed_job_config.rb
Delayed::Worker.logger = Logger.new(STDOUT)
I finally got this to work, all thanks to Seamus Abshere's answer to the question here. I put what he posted below in an initializer file. This got delayed_job to log to my development.log file (huzzah!).
However, delayed_job still isn't logging into my console (for reasons I still don't understand). I solved that by opening a new console tab and entering tail -f log/development.log.
Different from what Seamus wrote, though, auto-flushing=true is deprecated in Rails 4 and my Heroku app crashed. I resolved this by removing it from my initializer file and placing it in my environments/development.rb file as config.autoflush_log = true. However, I found that neither of the two types of flushing were necessary to make this work.
Here is his code (without the auto-flushing):
file_handle = File.open("log/#{Rails.env}_delayed_jobs.log", (File::WRONLY | File::APPEND | File::CREAT))
# Be paranoid about syncing
file_handle.sync = true
# Hack the existing Rails.logger object to use our new file handle
Rails.logger.instance_variable_set :@log, file_handle
# Calls to Rails.logger go to the same object as Delayed::Worker.logger
Delayed::Worker.logger = Rails.logger
If the above code doesn't work, try replacing Rails.logger with RAILS_DEFAULT_LOGGER.
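If the goal is to see the same entries both in a log file and in the console, one plain-Ruby option is to give a single Logger a device that fans writes out to several targets. The sketch below uses only the standard library. MultiIO is a hypothetical helper name, and the two StringIO objects stand in for the real log file and STDOUT so the example can run anywhere.

```ruby
require "logger"
require "stringio"

# Hypothetical helper: forwards every write to each of its targets.
# Logger accepts any object that responds to #write and #close.
class MultiIO
  def initialize(*targets)
    @targets = targets
  end

  def write(*args)
    @targets.each { |target| target.write(*args) }
  end

  def close
    @targets.each(&:close)
  end
end

file_like = StringIO.new    # stand-in for File.open("log/development.log", "a")
console_like = StringIO.new # stand-in for STDOUT

logger = Logger.new(MultiIO.new(file_like, console_like))
logger.info("delayed job started")

puts file_like.string
puts console_like.string
```

In a Rails app, a logger built this way could presumably be assigned with Delayed::Worker.logger = logger as in the answers above, though that combination is untested here.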

Rails: "Stack level too deep" error when calling "id" primary key method

This is a repost on another issue, better isolated this time.
In my environment.rb file I changed this line:
config.time_zone = 'UTC'
to this line:
config.active_record.default_timezone = :utc
Ever since, this call:
Category.find(1).subcategories.map(&:id)
fails with a "Stack level too deep" error the second time it is run in the development environment when config.cache_classes = false. If config.cache_classes = true, the problem does not occur.
The error is a result of the following code in active_record/attribute_methods.rb around line 252:
def method_missing(method_id, *args, &block)
...
if self.class.primary_key.to_s == method_name
id
....
The call to the "id" function re-calls method_missing and there is nothing that prevents the id to be called over and over again, resulting in stack level too deep.
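The recursion described above can be reproduced in plain Ruby. This is a simplified sketch, not the ActiveRecord source: ToyRecord is a made-up class, and remove_method merely simulates the id method being wiped out between requests. It assumes a current Ruby, where Object no longer has an id method of its own.

```ruby
# Simplified stand-in for a model class; NOT ActiveRecord code.
class ToyRecord
  def self.primary_key
    "id"
  end

  def id
    42
  end

  def method_missing(method_id, *args, &block)
    if self.class.primary_key.to_s == method_id.to_s
      id # if #id itself is undefined, this call re-enters method_missing
    else
      super
    end
  end

  def respond_to_missing?(method_id, include_private = false)
    self.class.primary_key.to_s == method_id.to_s || super
  end
end

record = ToyRecord.new
first_call = record.id # the real method exists, so this returns normally

# Simulate the reload step removing the generated method.
ToyRecord.send(:remove_method, :id)

overflowed =
  begin
    record.id
    false
  rescue SystemStackError
    true
  end

puts "first call: #{first_call}, second call overflowed: #{overflowed}"
```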
I'm using Rails 2.3.8.
The Category model has_many :subcategories.
The call fails on variants of that line above (e.g. Category.first.subcategory_ids, use of "each" instead of "map", etc.).
Any thoughts will be highly appreciated.
Thanks!
Amit
Even though this is solved, I just wanted to chime in and report how I fixed this issue. I had the same symptoms as the OP: the initial request's .id() call worked fine, and subsequent requests' .id() calls would throw the "stack too deep" error message. It's a weird error, as it generally means you have an infinite loop somewhere. I fixed this by changing:
config.action_controller.perform_caching = true
config.cache_classes = false
to
config.action_controller.perform_caching = true
config.cache_classes = true
in environments/production.rb.
UPDATE: The root cause of this issue turned out to be the cache_store. The default MemoryStore will not preserve ActiveRecord models. This is a pretty old bug, and fairly severe, I'm not sure why it hasn't been fixed. Anyways, the workaround is to use a different cache_store. Try using this, in your config/environments/development.rb:
config.cache_store = :file_store
UPDATE #2: C. Bedard posted this analysis of the issue. Seems to sum it up nicely.
Having encountered this problem myself (and being stuck on it repeatedly), I have investigated the error (and hopefully found a good fix). Here's what I know about it:
It happens when ActiveRecord::Base#reset_subclasses is called by the dispatcher between requests (in dev mode only).
ActiveRecord::Base#reset_subclasses wipes out the inheritable_attributes Hash (where #skip_time_zone_conversion_for_attributes is stored).
It will not only happen on objects persisted through requests, as the "monkey test app" from #1290 shows, but also when trying to access generated association methods on AR, even for objects that live only on the current request.
This bug was introduced by this commit where the #skip_time_zone_conversion_for_attributes declaration was changed from base.cattr_accessor to base.class_inheritable_accessor. But then again, that same commit also fixed something else.
The patch initially submitted here that simply avoids clearing the instance_variables and instance_methods in reset_subclasses does introduce massive leaking, and the amounts leaked seem directly proportional to complexity of the app (i.e. number of models, associations and attributes on each of them). I have a pretty complex app which leaks nearly 1Mb on each request in dev mode when the patch is applied. So it's not viable (for me anyways).
While trying out different ways to solve this, I have corrected the initial error (skip_time_zone_conversion_for_attributes being nil on 2nd request), but it uncovered another error (which just didn't happen because the first exception would be raised before getting to it). That error seems to be the one reported in #774 (Stack overflow in method_missing for the 'id' method).
Now, for the solution, my patch (attached) does the following:
It adds wrapper methods for #skip_time_zone_conversion_for_attributes methods, making sure it always reads/writes the value as a class_inheritable_attribute. This way, nil is never returned anymore.
It ensures that the 'id' method is not wiped out when reset_subclasses is called. AR is kinda strange on that one, because it first defines it directly in the source, but redefines itself with #define_read_method when it is first called. And that is precisely what makes it fail after reloading (since reset_subclasses then wipes it out).
I also added a test in reload_models_test.rb, which calls reset_subclasses to try and simulate reloading between requests in dev mode. What I cannot tell at this point is if it really triggers the reloading mechanism as it does on a live dispatcher request cycle. I also tested from script/server and the error was gone.
Sorry for the long paste, it sucks that the rails lighthouse project is private. The patch mentioned above is private.
-- This answer is copied from my original post here.
Finally solved!
After posting a third question and with help of trptcolin, I could confirm a working solution.
The problem: I was using require to include models from within Table-less models (classes that are in app/models but do not extend ActiveRecord::Base). For example, I had a class FilterCategory that performed require 'category'. This messed up with Rails' class caching.
I had to use require in the first place since lines such as Category.find :all failed.
The solution (credit goes to trptcolin): replace Category.find :all with ::Category.find :all. This works without the need to explicitly require any model, and therefore doesn't cause any class caching problems.
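For readers unfamiliar with the leading ::, here is a plain-Ruby illustration of what it changes. The question does not show whether FilterCategory lived inside a namespace, so the Filters module below is purely an assumption made to show the difference in lookup; with Rails autoloading the details differ, but the anchoring effect of :: is the same.

```ruby
# Top-level class, analogous to the ActiveRecord model in the question.
class Category
  def self.label
    "top-level Category"
  end
end

# Hypothetical namespace that happens to contain its own Category.
module Filters
  class Category
    def self.label
      "Filters::Category"
    end
  end

  class FilterCategory
    # Relative reference: resolved through the enclosing lexical scopes first.
    def self.relative
      Category.label
    end

    # Absolute reference: always starts the lookup at the top level.
    def self.absolute
      ::Category.label
    end
  end
end

puts Filters::FilterCategory.relative
puts Filters::FilterCategory.absolute
```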
The "stack too deep" problem also goes away when using config.active_record.default_timezone = :utc

Determine if script/server is being started

In Rails, in an initializer or environment.rb, what's the preferred way to determine whether the web app itself is being loaded (script/server)?
All the initializers are loaded for migrations, script/console and other Rails tasks as well, but in my case some stuff only has to be loaded when the server itself is being initialized.
My ideas: checking $0
Thanks!
Reto
Because there are multiple application servers, each with their own initialization strategy, I would recommend the only way to reliably hook into the server boot process: ActionController::Dispatcher.
The dispatcher has some callbacks; namely:
prepare_dispatch (added with to_prepare)
before_dispatch
after_dispatch
The "prepare" callbacks are run before every request in development mode, and before the first request in production mode. The Rails configuration object allows you to add such callbacks via its own to_prepare method:
Rails::Initializer.run do |config|
config.to_prepare do
# do your special initialization stuff
end
end
Unfortunately, to my knowledge this callback will always be run since Rails initializer calls Dispatcher.run_prepare_callbacks regardless of if we're booting up with a server or to a script/console or even a rake task. You want to avoid this, so you might try this in your environment.rb:
Rails::Initializer.run do |config|
# your normal stuff
end
if defined? ActionController::Dispatcher
ActionController::Dispatcher.to_prepare do
# your special stuff
end
end
Now, your "special stuff" will only execute before first request in production mode, but before every request in development. If you're loading extra libraries, you might want to avoid loading something twice by putting an if statement around load or require. The require method will not load a single file twice, but I still recommend that you put a guard around it.
There is probably a better way to do this, but since I am not aware of one, I would probably alter script/server to set an environment variable of some kind.
Then I would have my initializer check for that environment variable.
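A minimal sketch of that environment-variable idea follows. The variable name RAILS_SERVER_BOOT and the helper booting_server? are hypothetical. script/server would need to be edited to set the variable (for example ENV["RAILS_SERVER_BOOT"] = "1") before it boots the app.

```ruby
# Hypothetical check for a flag that an edited script/server would set.
# The env argument defaults to ENV but accepts any Hash-like object.
def booting_server?(env = ENV)
  env["RAILS_SERVER_BOOT"] == "1"
end

if booting_server?
  # Server-only initialization would go here.
  puts "running server-only setup"
else
  puts "skipping server-only setup"
end
```

Because the flag is set explicitly, the check does not depend on which application server is used or on the value of $0.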
