Rails vs Ruby namespacing rules

I'm reading this really good article on Rails namespacing and module lookup, and
I don't understand what this passage means:
If constants are loaded only when they’re first encountered at
runtime, then by necessity their load order depends on the individual
execution path.
What is the individual execution path?
I think that gap in my understanding also keeps me from understanding this:
As soon as an already-loaded constant Baz is encountered, Rails knows
this cannot be the Baz it is looking for, and the algorithm raises a
NameError.
or more importantly this:
The first time, as before, is down to the loss of nesting information.
Rails can’t know that Foo::Qux isn’t what we’re after, so once it
realises that Foo::Bar::Qux does not exist, it happily loads it.
The second time, however, Foo::Qux is already loaded. So our reference can’t have been to that constant, otherwise Ruby would have
resolved it, and autoloading would never have been invoked. So the
lookup terminates with a NameError, even though our reference could
(and should) have resolved to the as-yet-unloaded ::Qux.
Why doesn't Rails use the already-loaded constant it encounters? Also, why does running:
Foo::Bar.print_qux
twice lead to two different outcomes?

By "execution path" they mean the path your code actually takes as it runs. If a reference to a class X::Y sits inside an if branch that is never executed, the execution path bypasses it, so the class is never loaded.
This is different from eagerly loading every class referenced in your code at parse time. Constants are loaded as they're exercised, if and only if the line of code that references them is executed.
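The path-dependent behavior can be sketched with Ruby's built-in autoload (the file and constant names here are made up for the demo):

```ruby
require "tempfile"

# Write a file that defines a constant, and register it for autoloading.
file = Tempfile.new(["lazy_demo", ".rb"])
file.write("LAZY_VALUE = :loaded_on_first_reference")
file.close

autoload :LAZY_VALUE, file.path

if false
  LAZY_VALUE # this branch is never executed, so the file stays unloaded
end

# The first reference on the live execution path triggers the load.
LAZY_VALUE # => :loaded_on_first_reference
```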
The autoloader's strategy is to try the most specific name first and then work outward to increasingly global ones: Qux is tested against the current module context, then the parent of that context, and so on up to the root. This is how constants are resolved.
In that example, autoloading effectively gives the Foo::Qux definition priority over ::Qux. That's the major change there.
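The lookup order can be seen in plain Ruby with no autoloading at all: a relative reference walks the lexical nesting outward before falling back to the top level, so a nested Qux shadows ::Qux.

```ruby
Qux = "top-level Qux"

module Foo
  Qux = "Foo::Qux"

  module Bar
    def self.which_qux
      # Lexical nesting here is [Foo::Bar, Foo]; Foo defines Qux,
      # so it wins over the top-level ::Qux.
      Qux
    end
  end
end

Foo::Bar.which_qux # => "Foo::Qux"
```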


How can I verify that an ActiveStorage blob is actually present?

I've got an application that's been running in production for many months now, with tens of thousands of attachments. This morning, I tried to do an operation on one of these attachments, and got the following error:
Azure::Core::Http::HTTPError: BlobNotFound (404): The specified blob does not exist.
I can easily recreate this blob, but this situation makes me want to write a script to check the integrity of all of my attachments, and verify that no others have gone missing. (I expect that this was a transitory network error, and I expect to find very few, but I need the peace of mind.)
I can see that there is a method to call that seems to do exactly what I need: exist?(key), which is documented here:
https://github.com/rails/rails/blob/master/activestorage/lib/active_storage/service/disk_service.rb
However, I can't figure out how I'm supposed to call it. According to this, it's implemented as an instance method. So how do I reference my Rails application's active ActiveStorage instance (depending on environment) to use this method?
I wondered if it were possible to backtrack to get the instance of ActiveStorage::Service::AzureStorageService (in production), and found this answer on finding each instance of a class.
From there, I found that I could:
asass = ObjectSpace.each_object(ActiveStorage::Service::AzureStorageService).first
Which then allowed me to:
2.5.5 :015 > asass.exist?(c.json.blob.key)
AzureStorage Storage (313.3ms) Checked if file exists at key: ....PTLWNbEFLgeB8Y5x.... (yes)
=> true
Further digging around in the bowels of Rails' GitHub issues led me to see that the ActiveStorage service instance can be reached through an instance of a stored object, and that I could use the send() method to call one of its methods:
2.5.5 :045 > c.json.blob.service.send(:exist?, c.json.blob.key)
AzureStorage Storage (372.4ms) Checked if file exists at key: ....PTLWNbEFLgeB8Y5x.... (yes)
=> true
I'm sure both approaches are lighting up the same code paths, but I'm not sure which one I prefer at the moment. I still think there must exist some way to traverse the Rails instance through app. or Rails.application. to get to the ActiveStorage instance, but I haven't been able to suss that out.
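The ObjectSpace trick can be sketched in plain Ruby with a stub class standing in for ActiveStorage::Service::AzureStorageService (the class and keys below are made up for the demo):

```ruby
# Stub service with the same exist?(key) shape as the ActiveStorage service.
class StubBlobService
  def initialize(known_keys)
    @known_keys = known_keys
  end

  def exist?(key)
    @known_keys.include?(key)
  end
end

service = StubBlobService.new(["abc123"])

# Walk the heap for live instances of the class, as in the answer above.
found = ObjectSpace.each_object(StubBlobService).first
found.exist?("abc123")  # => true
found.exist?("missing") # => false
```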

Constant definition in Ruby/Rails

I have initializers config in my rails application under config/initializers/my_config.rb.
What is the difference between:
A:
module MyModule
Config = "path/to/config.yml"
end
and:
B:
MyModule::Config = "path/to/config.yml"
Suppose we serve some requests, change the initializer's implementation, and hit the application again. If I define my constant the B way, I get an error:
uninitialized constant MyModule::Config
It's resolved only when I restart my Rails server. But when I use the A way, the constant is still recognized after I update my code.
What is the importance of using the A syntax in this case?
Part of this seems to have to do with rails hot code reloading, which has a bunch of caveats. If you aren't using hot code reloading, A and B are more equivalent, as long as MyModule has been defined first.
However, when code is reloaded, (particularly the file that defines MyModule), it might end up overwriting the existing module, and not running the B line.
The main difference, though, is that A doesn't depend on the order in which the rest of the project's code is loaded and run, while B must run after the code that defines MyModule.
The other difference is that code A defines MyModule itself if it doesn't already exist, whereas code B raises a NameError for MyModule unless the module has been defined previously.
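A minimal sketch of the difference (module names are made up; MyModuleB is deliberately never defined):

```ruby
# Form A: the module keyword creates MyModuleA if needed, then reopens it.
module MyModuleA
  Config = "path/to/config.yml"
end

# Form B: the qualified assignment assumes the module already exists.
begin
  MyModuleB::Config = "path/to/config.yml"
rescue NameError => e
  e.message # => "uninitialized constant MyModuleB"
end
```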

Class name conflict across modules

I'm running into an issue in a Rails 4 application related to class names and modules.
I have an Event class in my main application that inherits from ActiveRecord::Base. I also have a set of files in my /lib directory that have been grouped into a module I call LibModule. There's a class in that module that's also named Event. I noticed something interesting about referencing these classes. Here are some examples using the Rails console.
Example #1: When Event has never been referenced, the ActiveRecord version gets loaded:
> Event
=> Event(id: integer...)
Example #2: When LibModule::Event gets referenced first:
> LibModule::Event
=> LibModule::Event
> Event
=> LibModule::Event
As a result, when my server restarts (after updates, etc), I'll occasionally get the following error if a user engages in behavior that triggers server activity similar to Example #2:
superclass mismatch for class Event
I know there are a few ways to ensure that there's no conflict here. What's the best practice way of handling a situation like this?
I tried replicating the behavior from Example #2 with class names from gems and it seems like Rails completely segregates the classes in gems. Is there a way to do the same here? I think this would be my ideal situation.
Should I just change the name of LibModule::Event?
Should I ensure that the ActiveRecord class loads during initialization?
Some other Rails best practice I haven't thought of?
This has to do with the way qualified constants are resolved by the Rails autoloader. The documentation offers the following solution:
Naming conflicts of this kind are rare in practice, but if one occurs, require_dependency provides a solution by ensuring that the constant needed to trigger the heuristic is defined in the conflicting place.
The solution, in your case, is to add this just above the class definition for LibModule::Event:
require_dependency 'event'
This will inform the autoloader of the ::Event constant that references your ActiveRecord model, ensuring the appropriate constant naming for LibModule::Event.
Although nothing in Ruby precludes you from having duplicate class names, the Rails autoloader is easily confused by them, which can cause a lot of problems.
Typically I go out of my way to avoid duplication for this very reason. Sometimes duplicates work, sometimes they don't, and whether they work can depend on which entry point is taken and which things load first, making the behavior unpredictable.
You can try to circumvent this by force-loading your Event class with require_relative at the end of lib_module.rb.
To reference the main Event class, try ::Event. :: is the scope resolution operator; with nothing in front of it, it anchors the lookup at the global (top-level) scope.
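A plain-Ruby sketch of what the :: prefix does (the names below are stand-ins for Event and LibModule::Event):

```ruby
Notice = "top-level Notice"

module LibDemo
  Notice = "LibDemo::Notice"

  def self.lookup
    # Relative lookup finds the nested constant;
    # the :: prefix forces the top-level one.
    [Notice, ::Notice]
  end
end

LibDemo.lookup # => ["LibDemo::Notice", "top-level Notice"]
```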

Undefined Method errors on basic AR and AS methods when running threaded with Sidekiq on Heroku

I am getting a couple different errors at a particular line of code in one of my models when running in Sidekiq-queued jobs. The code in question is:
#lookup_fields[:asin] ||= self.book_lookups.find_by_name("asin").try(:value)
I either get undefined method 'scope' for #<ActiveRecord::Associations::AssociationScope:0x00000005f20cb0> or undefined method 'aliased_table_for' for #<ActiveRecord::Associations::AliasTracker:0x00000005bc3f90>.
At another line of code in another Sidekiq job, I get the error undefined method 'decrypt_and_verify' for #<ActiveSupport::MessageEncryptor:0x00000007143208>.
All of these errors make no sense, as they are standard methods of the Rails runtime support libraries.
The model in question has a :has_many association defined for the "book_lookups" model, "name" and "value" are fields in the "book_lookups" model. This always happens on the first 1-3 records processed. If I run the same code outside of a Sidekiq job, these errors do not occur.
I cannot reproduce the error on my development machine, only on production which is hosted at Heroku.
I may have "solved" the first set of errors by putting the code BookLookup.new() in an initializer, forcing the model to load before Sidekiq creates any threads. Only one night's work to go on, so we'll have to see if the trend continues...
Even if this solves the immediate problem, I don't think it solves the real underlying issue, which is classes not getting fully loaded before being used. Is class-loading an atomic operation? Is it possible for one thread to start loading a class and another to start using the class before it is fully loaded?
I believe I have discovered the answer: config.threadsafe!, which I had not enabled. I have now done so, and most if not all of the errors have disappeared. References: http://guides.rubyonrails.org/configuring.html, http://m.onkey.org/thread-safety-for-your-rails (especially the section "Ruby's require is not atomic").
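For reference, on Rails 3.x that flag lives in the production environment file (a config fragment, not runnable on its own; MyApp is a placeholder for your application's module):

```ruby
# config/environments/production.rb (Rails 3.x; removed in Rails 4)
MyApp::Application.configure do
  config.threadsafe! # eager-loads application code and disables the
                     # non-thread-safe dependency loader
end
```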

Rails: "Stack level too deep" error when calling "id" primary key method

This is a repost on another issue, better isolated this time.
In my environment.rb file I changed this line:
config.time_zone = 'UTC'
to this line:
config.active_record.default_timezone = :utc
Ever since, this call:
Category.find(1).subcategories.map(&:id)
Fails with a "Stack level too deep" error from the second time it is run onward in the development environment when config.cache_classes = false. If config.cache_classes = true, the problem does not occur.
The error is a result of the following code in active_record/attribute_methods.rb around line 252:
def method_missing(method_id, *args, &block)
  ...
  if self.class.primary_key.to_s == method_name
    id
  ...
The call to the id method re-enters method_missing, and nothing prevents id from being called over and over again, resulting in stack level too deep.
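The loop is easy to reproduce in plain Ruby, no Rails involved (RecordStub is a made-up class for the demo):

```ruby
# If method_missing handles :id by calling id, and id itself is not
# defined, the call re-enters method_missing until the stack overflows.
class RecordStub
  def method_missing(name, *args)
    if name == :id
      id # re-dispatches to method_missing(:id) -- infinite recursion
    else
      super
    end
  end

  def respond_to_missing?(name, include_private = false)
    name == :id || super
  end
end

begin
  RecordStub.new.id
rescue SystemStackError
  "stack level too deep"
end
```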
I'm using Rails 2.3.8.
The Category model has_many :subcategories.
The call fails on variants of that line above (e.g. Category.first.subcategory_ids, use of "each" instead of "map", etc.).
Any thoughts will be highly appreciated.
Thanks!
Amit
Even though this is solved, I just wanted to chime in and report how I fixed this issue. I had the same symptoms as the OP: the initial request to .id() worked fine, but subsequent requests to .id() would throw the "stack too deep" error. It's a weird error, as it generally means you have an infinite loop somewhere. I fixed this by changing:
config.action_controller.perform_caching = true
config.cache_classes = false
to
config.action_controller.perform_caching = true
config.cache_classes = true
in environments/production.rb.
UPDATE: The root cause of this issue turned out to be the cache_store. The default MemoryStore will not preserve ActiveRecord models. This is a pretty old and fairly severe bug; I'm not sure why it hasn't been fixed. Anyway, the workaround is to use a different cache_store. Try this in your config/environments/development.rb:
config.cache_store = :file_store
UPDATE #2: C. Bedard posted this analysis of the issue. Seems to sum it up nicely.
Having encountered this problem myself (and being stuck on it repeatedly), I have investigated the error (and hopefully found a good fix). Here's what I know about it:
It happens when ActiveRecord::Base#reset_subclasses is called by the dispatcher between requests (in dev mode only).
ActiveRecord::Base#reset_subclasses wipes out the inheritable_attributes Hash (where #skip_time_zone_conversion_for_attributes is stored).
It will not only happen on objects persisted through requests, as the "monkey test app" from #1290 shows, but also when trying to access generated association methods on AR, even for objects that live only on the current request.
This bug was introduced by this commit where the #skip_time_zone_conversion_for_attributes declaration was changed from base.cattr_accessor to base.class_inheritable_accessor. But then again, that same commit also fixed something else.
The patch initially submitted here that simply avoids clearing the instance_variables and instance_methods in reset_subclasses does introduce massive leaking, and the amounts leaked seem directly proportional to complexity of the app (i.e. number of models, associations and attributes on each of them). I have a pretty complex app which leaks nearly 1Mb on each request in dev mode when the patch is applied. So it's not viable (for me anyways).
While trying out different ways to solve this, I have corrected the initial error (skip_time_zone_conversion_for_attributes being nil on 2nd request), but it uncovered another error (which just didn't happen because the first exception would be raised before getting to it). That error seems to be the one reported in #774 (Stack overflow in method_missing for the 'id' method).
Now, for the solution, my patch (attached) does the following:
It adds wrapper methods for #skip_time_zone_conversion_for_attributes, making sure it always reads/writes the value as a class_inheritable_attribute. This way, nil is never returned anymore.
It ensures that the 'id' method is not wiped out when reset_subclasses is called. AR is kinda strange on that one: it first defines id directly in the source, but redefines it with #define_read_method when it is first called. That is precisely what makes it fail after reloading, since reset_subclasses then wipes the redefined method out.
I also added a test in reload_models_test.rb, which calls reset_subclasses to try and simulate reloading between requests in dev mode. What I cannot tell at this point is if it really triggers the reloading mechanism as it does on a live dispatcher request cycle. I also tested from script/server and the error was gone.
Sorry for the long paste; it's unfortunate that the Rails Lighthouse project is private, since the patch mentioned above lives there.
-- This answer is copied from my original post here.
Finally solved!
After posting a third question and with help of trptcolin, I could confirm a working solution.
The problem: I was using require to include models from within table-less models (classes in app/models that do not extend ActiveRecord::Base). For example, I had a class FilterCategory that performed require 'category'. This interfered with Rails' class caching.
I had to use require in the first place since lines such as Category.find :all failed.
The solution (credit goes to trptcolin): replace Category.find :all with ::Category.find :all. This works without the need to explicitly require any model, and therefore doesn't cause any class caching problems.
The "stack too deep" problem also goes away when using config.active_record.default_timezone = :utc
