I am having difficulty optimizing my Active Model Serializers to avoid the N+1 problem. Following the suggestions in their docs, I have tried eager loading the associations which I thought were causing my query bottlenecks, but my serializers are still taking forever.
Obviously, I must be doing something wrong. My application is deeply nested with associations, so I am more interested in finding a tool that shows me exactly WHICH associations are costing me. Right now, I am attaching a stack trace to every query run through ActiveRecord:
ActiveSupport::Notifications.subscribe("sql.active_record") do |_, _, _, _, details|
  puts caller.join("\n")
  puts "*" * 50
end
which gives me a ridiculous amount of output because I am running so many queries to begin with. In addition, the stack traces are not helpful at identifying which serializer is at fault. They show me which controller method called render, but from there the stack trace simply prints methods from gems/active_model_serializers, which does not help me.
I am hoping to find a method of debugging that can identify which serializers are at fault, so that I am not guessing at how to optimize my queries. Has anybody discovered anything like this? Thanks!
===================
UPDATE
Just so it is clear, I am already printing a query log in addition to a stack trace. Unfortunately, with so many associations to keep track of, the query log is not much help in identifying the source of a query. It is guesswork at best, and ineffective at the number of associations I am dealing with.
I have abandoned the stack traces altogether, finding them to be totally unhelpful. Now all I have printing are SQL logs, and I am manually sifting through them, trying to discover which association each query comes from.
The next method I will attempt (although I hate to resort to it) is commenting out associations until I see improvements in my query times. It will be more effective than trying to trace the source of the problem, but it will provide me no comfort in a production environment (where commenting out critical associations is not an option), so if anybody finds a solution that can help, I would still be very grateful.
I will continue to post updates as I move through this problem, as it may help many others in the future.
======================
UPDATE 2
It turns out that commenting out associations in my serializer and reintroducing them one at a time, while not an option in production, is an excellent way to debug in a local environment. I was able to drill down to the problem within a minute and correct it. Still, this is not an ideal solution. I would like to be able to identify the problem from a log, so that in production I could ascertain the issue without affecting the application's behavior.
The active_record_query_trace gem can do that.
Add the following to your Gemfile:
group :development do
  gem 'active_record_query_trace'
end
Create an initializer such as config/initializers/active_record_query_trace.rb to enable the gem. If you want to customize how the gem behaves, you can add any combination of the options described in the docs to the initializer as well.
if Rails.env.development?
  ActiveRecordQueryTrace.enabled = true
  # Optional: other gem config options go here
end
Restart the Rails development server.
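For a rough idea of what this kind of tracing boils down to, here is a minimal sketch that keeps only the stack frames under your application's own directories, so that frames from gems/active_model_serializers drop out. The helper name app_frames and the example paths are made up for illustration and are not part of the gem. It only helps when the query is triggered from a method defined in your own code; if the whole call chain lives inside a gem, nothing will be left after filtering.

```ruby
# Hypothetical helper (not part of any gem): keep only the frames located
# under the application's own directories, dropping gem and stdlib frames.
def app_frames(backtrace, app_root, dirs: %w[app lib])
  prefixes = dirs.map { |dir| File.join(app_root, dir) + "/" }
  backtrace.select { |frame| prefixes.any? { |prefix| frame.start_with?(prefix) } }
end

# Example with a made-up backtrace; the paths are illustrative only.
sample = [
  "/gems/active_model_serializers-0.10.0/lib/active_model/serializer.rb:10:in `attributes'",
  "/srv/myapp/app/serializers/post_serializer.rb:7:in `comments'",
  "/srv/myapp/app/controllers/posts_controller.rb:12:in `index'",
]

puts app_frames(sample, "/srv/myapp")
```

Inside a Rails app you would call it from the sql.active_record subscriber, for example puts app_frames(caller, Rails.root.to_s), instead of printing the whole of caller.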
An easy way that works with any gem is to log each line of code. For example, if you have
this serializer:
module Flats
  class IndexSerializer < Api::V2::RealtyObjects::IndexSerializer
    attributes(
      :flat_number,
      :entrance_number,
      :floor_in_house,
      :live_area,
      :room_count,
      :total_area,
    )
  end
end
add a method that logs the time to your development.log for each attribute:
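module Flats
  class IndexSerializer < Api::V2::RealtyObjects::IndexSerializer
    attribute_list = %i[
      flat_number
      entrance_number
      floor_in_house
      live_area
      room_count
      total_area
    ]
    attributes(*attribute_list)

    # Write the attribute name, its value and a timestamp to the Rails log.
    # Logger#debug takes a single message, so build one string.
    def log_attribute(name, value)
      Rails.logger.debug "#{name}: #{value.inspect} #{Time.now.strftime('%M:%S--%N')}"
    end

    attribute_list.each do |attribute_name|
      # Attribute readers on a serializer take no arguments.
      define_method attribute_name do
        # Assumes the parent serializer does not define this reader itself,
        # so the value is read from the serialized object.
        value = object.public_send(attribute_name)
        log_attribute(attribute_name, value)
        value
      end
    end
  end
end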
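If a timestamp per attribute is too noisy, a variation is to measure how long each reader takes. The following is only a sketch in plain Ruby, under the assumption that the methods you want to time are actually defined on the class (it relies on super finding them). MethodTimer and FlatPresenter are made-up names, and the sleep merely stands in for a slow association.

```ruby
# Hypothetical module: wraps the named instance methods and records how long
# each call takes. Nothing here is specific to ActiveModel::Serializer.
module MethodTimer
  def self.wrap(klass, method_names, sink)
    wrapper = Module.new do
      method_names.each do |name|
        define_method(name) do |*args, &block|
          started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
          begin
            # Explicit arguments are required for super inside define_method.
            super(*args, &block)
          ensure
            elapsed = Process.clock_gettime(Process::CLOCK_MONOTONIC) - started
            sink << [name, elapsed]
          end
        end
      end
    end
    # Prepending puts the wrapper in front of the class's own methods.
    klass.prepend(wrapper)
  end
end

# Stand-in for a serializer; sleep simulates a slow association.
class FlatPresenter
  def room_count
    3
  end

  def total_area
    sleep 0.05
    72.5
  end
end

timings = []
MethodTimer.wrap(FlatPresenter, %i[room_count total_area], timings)

flat = FlatPresenter.new
flat.room_count
flat.total_area

timings.each { |name, seconds| puts format("%-12s %.4fs", name, seconds) }
```

In a real app the sink could be replaced by a call to Rails.logger.debug, and the slowest entries point at the readers worth eager loading.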
I was wondering how you were testing the search in your application when using ElasticSearch and Tire.
How do you set up a new ElasticSearch test instance? Is there a way to mock it?
Any gems you know of that might help with that?
Some stuff I found helpful:
I found a great article answering pretty much all my questions :)
http://bitsandbit.es/post/11295134047/unit-testing-with-tire-and-elastic-search#disqus_thread
Plus, there is an answer from Karmi, Tire author.
This is useful as well: https://github.com/karmi/tire/wiki/Integration-Testing-Rails-Models-with-Tire
I can't believe I did not find these before asking...
Prefixing your index-names for the current environment
You could set a different index-name for each environment (in your case: the test environment).
For example, you could create an initializer in
config/initializers/tire.rb
with the following line:
Tire::Model::Search.index_prefix "#{Rails.application.class.parent_name.downcase}_#{Rails.env.to_s.downcase}"
A conceivable approach for deleting the indexes
Assuming that you have models named Customer, Order and Product, put the following code somewhere at your test-startup/before-block/each-run-block.
# iterate over the model types
# there are also ways to fetch all model classes of the rails app automatically, e.g.:
# http://stackoverflow.com/questions/516579/is-there-a-way-to-get-a-collection-of-all-the-models-in-your-rails-app
[Customer, Order, Product].each do |klass|
  # make sure that the current model is using tire
  if klass.respond_to? :tire
    # delete the index for the current model
    klass.tire.index.delete
    # the mapping definition must get executed again. for that, we reload the model class.
    # underscore (rather than downcase) also handles multi-word names such as OrderItem
    load File.expand_path("../../app/models/#{klass.name.underscore}.rb", __FILE__)
  end
end
Alternative
An alternative could be to set up a different ElasticSearch instance for testing on another port, let's say 1234.
In your config/environments/test.rb you could then set
Tire::Configuration.url "http://localhost:1234"
And at a suitable location (e.g. your testing startup) you can then delete all indexes on the ElasticSearch testing-instance with:
Tire::Configuration.client.delete(Tire::Configuration.url)
You may still need to make sure that the Tire mapping definitions for your model classes get executed again afterwards.
I ran into a quirky bug when deleting my elasticsearch index via tire in my rspec suite. In my Rspec configuration, similar to the Bits and Bytes blog, I have an after_each call which cleans the database and wipes out the index.
I found I needed to call Tire's create_elasticsearch_index method which is responsible for reading the mapping in the ActiveRecord class to set up the appropriate analyzers, etc. The issue I was seeing was I had some :not_analyzed fields in my model which were actually getting analyzed (this broke how I wanted faceting to work).
Everything was fine on dev, but the test suite was failing as facets were being broken down by individual words and not the entire multi word string. It seems that the mapping configuration was not being created appropriately in rspec after the index was deleted. Adding the create_elasticsearch_index call fixed the problem:
config.after(:each) do
  DatabaseCleaner.clean
  Media.tire.index.delete
  Media.tire.create_elasticsearch_index
end
Media is my model class.
I ran into similar issues and here's how I solved them. Bear in mind that my solution builds on top of @spaudanjo's solution. Since I'm using spork, I add this inside the spec_helper.rb's Spork.each_run block, but you may add this into any other each/before block.
# Define random prefix to prevent indexes from clashing
Tire::Model::Search.index_prefix "#{Rails.application.class.parent_name.downcase}_#{Rails.env.to_s.downcase}_#{rand(1000000)}"

# In order to know what all of the models are, we need to load all of them
Dir["#{Rails.root}/app/models/**/*.rb"].each do |model|
  load model
end

# Refresh Elastic Search indexes
# NOTE: relies on all app/models/**/*.rb to be loaded
# Keep the classes themselves (not their names), otherwise respond_to?(:tire)
# below would be asked of a String and never be true.
models = ActiveRecord::Base.subclasses.sort_by(&:name)
models.each do |klass|
  # make sure that the current model is using tire
  if klass.respond_to? :tire
    # delete the index for the current model
    klass.tire.index.delete
    # the mapping definition must get executed again. for that, we reload the model class.
    load File.expand_path("../../app/models/#{klass.name.underscore}.rb", __FILE__)
  end
end
It basically defines its own unique prefix for every test case so that there are no clashes between indexes. The other solutions all suffered from a problem where, even after deleting the index, Elastic Search wouldn't refresh the indexes (even after running Model.index.refresh), which is why the randomized prefix is there.
It also loads every model and checks if it responds to tire so that we no longer need to maintain a list of all of the models that respond to tire both in spec_helper.rb and in other areas.
As this method doesn't "delete" the indexes after using them, you will have to delete them manually on a regular basis. I don't imagine this to be a huge issue; you can delete them with the following command:
curl -XDELETE 'http://localhost:9200/YOURRAILSNAMEHERE_test_*/'
To find what YOURRAILSNAMEHERE is, run rails console and run Rails.application.class.parent_name.downcase. The output will be your project's name.
I have a somewhat special use case, where I'd like to create a method that accepts a block, such that anything that happens inside that block is not written to the DB.
The obvious answer is to use transactions like so:
def no_db
  ActiveRecord::Base.transaction do
    yield
    raise ActiveRecord::Rollback
  end
end
But the trouble is that if my no_db method is used inside of another transaction block, then I'll end up in the case of nested transactions. The drawback here is that nested transactions are only supported by MySQL, and I need support for PG, but more importantly SQLite (for tests). (I understand that PG is supported via savepoints; how reliable is that? Is there a performance hit?)
The other problem with this type of approach is that it seems really inefficient, writing things to a DB, and then rolling them back. It would be better if I could do something like this:
def no_db_2
  # ActiveRecord::Base.turn_off_database
  yield
  # ActiveRecord::Base.turn_on_database
end
Is there such a method? Or a similar approach to what I'm looking for? I think it needs to be fairly low level..
(Rails version is 3.0.5, but I would be happy if there were an elegant solution for Rails 3.1)
This might be one way to do it:
class Book < ActiveRecord::Base
  # the usual stuff
end

# Seems like a hack but you'll get the
# transaction behavior this way...
class ReadOnly < ActiveRecord::Base
  establish_connection "#{Rails.env}_readonly"
end
I would think that this...
ReadOnly.transaction do
  Book.delete_all
end
...should fail.
Finally, add another connection to config/database.yml
development:
  username: fullaccess
development_readonly:
  username: readonly
One downside is the lack of support for a read-only mode in the sqlite3-ruby driver. You'll notice that the mode parameter doesn't do anything yet according to the documentation. http://sqlite-ruby.rubyforge.org/classes/SQLite/Database.html#M000071
This is a repost on another issue, better isolated this time.
In my environment.rb file I changed this line:
config.time_zone = 'UTC'
to this line:
config.active_record.default_timezone = :utc
Ever since, this call:
Category.find(1).subcategories.map(&:id)
fails with a "Stack level too deep" error the second time it is run in the development environment when config.cache_classes = false. If config.cache_classes = true, the problem does not occur.
The error is a result of the following code in active_record/attribute_methods.rb around line 252:
def method_missing(method_id, *args, &block)
  ...
  if self.class.primary_key.to_s == method_name
    id
  ....
The call to "id" goes back into method_missing, and nothing prevents id from being called over and over again, which results in the "stack level too deep" error.
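The recursion can be reproduced outside Rails. The class below is a simplified stand-in, not the actual ActiveRecord source: Record and its primary_key are made up, and the only point is that calling id from inside method_missing, while no id method exists, re-enters method_missing until the stack overflows.

```ruby
# Simplified stand-in for the situation described above; this is not the
# actual ActiveRecord code.
class Record
  def self.primary_key
    "id"
  end

  def method_missing(method_id, *args, &block)
    method_name = method_id.to_s
    if self.class.primary_key.to_s == method_name
      id # no `id` method exists, so this dispatches to method_missing again
    else
      super
    end
  end

  def respond_to_missing?(method_id, include_private = false)
    method_id.to_s == self.class.primary_key.to_s || super
  end
end

begin
  Record.new.id
rescue SystemStackError => e
  puts "#{e.class}: #{e.message}"
end

# Once a real reader is defined, method_missing is no longer involved.
class Record
  def id
    42
  end
end

puts Record.new.id
```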
I'm using Rails 2.3.8.
The Category model has_many :subcategories.
The call fails on variants of that line above (e.g. Category.first.subcategory_ids, use of "each" instead of "map", etc.).
Any thoughts will be highly appreciated.
Thanks!
Amit
Even though this is solved, I just wanted to chime in and report how I fixed this issue. I had the same symptoms as the OP: on the initial request .id() worked fine, on subsequent requests .id() would throw the "stack too deep" error message. It's a weird error, as it generally means you have an infinite loop somewhere. I fixed this by changing:
config.action_controller.perform_caching = true
config.cache_classes = false
to
config.action_controller.perform_caching = true
config.cache_classes = true
in environments/production.rb.
UPDATE: The root cause of this issue turned out to be the cache_store. The default MemoryStore will not preserve ActiveRecord models. This is a pretty old bug, and fairly severe, I'm not sure why it hasn't been fixed. Anyways, the workaround is to use a different cache_store. Try using this, in your config/environments/development.rb:
config.cache_store = :file_store
UPDATE #2: C. Bedard posted this analysis of the issue. Seems to sum it up nicely.
Having encountered this problem myself (and being stuck on it repeatedly) I have investigated the error (and hopefully found a good fix). Here's what I know about it:
It happens when ActiveRecord::Base#reset_subclasses is called by the dispatcher between requests (in dev mode only).
ActiveRecord::Base#reset_subclasses wipes out the inheritable_attributes Hash (where #skip_time_zone_conversion_for_attributes is stored).
It will not only happen on objects persisted through requests, as the "monkey test app" from #1290 shows, but also when trying to access generated association methods on AR, even for objects that live only on the current request.
This bug was introduced by this commit where the #skip_time_zone_conversion_for_attributes declaration was changed from base.cattr_accessor to base.class_inheritable_accessor. But then again, that same commit also fixed something else.
The patch initially submitted here that simply avoids clearing the instance_variables and instance_methods in reset_subclasses does introduce massive leaking, and the amounts leaked seem directly proportional to complexity of the app (i.e. number of models, associations and attributes on each of them). I have a pretty complex app which leaks nearly 1Mb on each request in dev mode when the patch is applied. So it's not viable (for me anyways).
While trying out different ways to solve this, I have corrected the initial error (skip_time_zone_conversion_for_attributes being nil on 2nd request), but it uncovered another error (which just didn't happen because the first exception would be raised before getting to it). That error seems to be the one reported in #774 (Stack overflow in method_missing for the 'id' method).
Now, for the solution, my patch (attached) does the following:
It adds wrapper methods for the #skip_time_zone_conversion_for_attributes methods, making sure it always reads/writes the value as a class_inheritable_attribute. This way, nil is never returned anymore.
It ensures that the 'id' method is not wiped out when reset_subclasses is called. AR is kinda strange on that one, because it first defines it directly in the source, but redefines itself with #define_read_method when it is first called. And that is precisely what makes it fail after reloading (since reset_subclasses then wipes it out).
I also added a test in reload_models_test.rb, which calls reset_subclasses to try and simulate reloading between requests in dev mode. What I cannot tell at this point is if it really triggers the reloading mechanism as it does on a live dispatcher request cycle. I also tested from script/server and the error was gone.
Sorry for the long paste, it sucks that the rails lighthouse project is private. The patch mentioned above is private.
-- This answer is copied from my original post here.
Finally solved!
After posting a third question and with help of trptcolin, I could confirm a working solution.
The problem: I was using require to include models from within table-less models (classes that are in app/models but do not extend ActiveRecord::Base). For example, I had a class FilterCategory that performed require 'category'. This messed with Rails' class caching.
I had to use require in the first place since lines such as Category.find :all failed.
The solution (credit goes to trptcolin): replace Category.find :all with ::Category.find :all. This works without the need to explicitly require any model, and therefore doesn't cause any class caching problems.
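As a plain Ruby illustration of what the leading :: changes, here is a small sketch. It does not reproduce Rails' class loading; the Filters module and the shadowing inner Category are made up purely to show that Category is looked up through the enclosing scopes first, while ::Category always starts at the top level.

```ruby
# Top-level class, standing in for the real model.
class Category
  def self.label
    "top-level Category"
  end
end

module Filters
  # A different constant that happens to share the name (hypothetical).
  class Category
    def self.label
      "Filters::Category"
    end
  end

  class FilterCategory
    def self.relative
      Category.label   # resolved lexically: Filters::Category is found first
    end

    def self.absolute
      ::Category.label # leading :: starts the lookup at the top level
    end
  end
end

puts Filters::FilterCategory.relative
puts Filters::FilterCategory.absolute
```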
The "stack too deep" problem also goes away when using config.active_record.default_timezone = :utc
In my rails application, I have a background process runner, model name Worker, that checks for new tasks to run every 10 seconds. This check generates two SQL queries each time - one to look for new jobs, one to delete old completed ones.
The problem with this - the main log file gets spammed for each of those queries.
Can I direct the SQL queries spawned by the Worker model into a separate log file, or at least silence them? Overwriting Worker.logger does not work - it redirects only the messages that explicitly call logger.debug("something").
The simplest and most idiomatic solution
logger.silence do
  do_something
end
See Logger#silence
Queries are logged at Adapter level as I demonstrated here.
How do I get the last SQL query performed by ActiveRecord in Ruby on Rails?
You can't change this behavior without tweaking the adapter with some really, really horrible hacks.
class Worker < ActiveRecord::Base
  def run
    old_level, self.class.logger.level = self.class.logger.level, Logger::WARN
    run_outstanding_jobs
    remove_obsolete_jobs
  ensure
    self.class.logger.level = old_level
  end
end
This is a fairly familiar idiom. I've seen it many times, in different situations. Of course, if you didn't know that ActiveRecord::Base.logger can be changed like that, it would have been hard to guess.
One caveat of this solution: this changes the logger level for all of ActiveRecord, ActionController, ActionView, ActionMailer and ActiveResource. This is because there is a single Logger instance shared by all modules.
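The same idiom can be tried outside Rails with the standard library Logger. This is only a sketch: with_log_level is a made-up helper name, and the StringIO is there just so the result can be inspected. It also shows the caveat, since anything else holding the same Logger instance is quieted for the duration of the block.

```ruby
require "logger"
require "stringio"

# Hypothetical helper: raise the logger's level for the duration of the
# block and restore it afterwards, even if the block raises.
def with_log_level(logger, level)
  old_level = logger.level
  logger.level = level
  yield
ensure
  logger.level = old_level
end

output = StringIO.new
logger = Logger.new(output)
logger.level = Logger::DEBUG

logger.debug("before")
with_log_level(logger, Logger::WARN) do
  logger.debug("polling for jobs") # suppressed while the level is WARN
  logger.warn("job failed")        # warnings are still written
end
logger.debug("after")

puts output.string
```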
I have this block of code:
users = Array.new
users << User.find(:all, :conditions => ["email like ?", "%foo%"])
users << User.find(:all, :conditions => ["name like ?", "%bar%"])
users.flatten!
users.uniq!
puts users.to_json :include => [:licenses]
When I run it using script/console, it returns exactly what you would think it should, a JSON representation of the Array of users that I found, flattened, and uniquified. But running that same line of code as part of a search_for_users method, I get this error
TypeError in ControllerName#search_for_users
wrong argument type Hash (expected Data)
and the line referenced is the line with the .to_json call.
It's baffling me because the code is verbatim the same. The only difference is that when I'm running it in the console, I'm entering the conditions manually, but in my method, I'm pulling the query from params[:query]. But, I just tried hardcoding the queries and got the same result, so I don't think that is the problem. If I remove the :include, I don't see the error, but I also don't get the data I want.
Anyone have any idea what the issue might be?
There are a few plugins and gems that can cause .to_json to fail if included in your controller. I believe that the Twitter gem is one of them (ran into a problem with this awhile back).
Do you have "include [anything]" or "require [anything]" in this controller?
If not, I'd suggest temporarily removing any plugins you're using to troubleshoot, etc.
Finally, what happens if you replace that entire controller action with simply:
%w(1 2 3 4 5).to_json
That should help you pin down what is failing.
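Another quick check is to ask Ruby which module currently supplies to_json. The sketch below uses only the standard json library; in your app you would run the same lines in the console and inside the controller action and compare the two. What the owner is called in your setup depends on the gems that are loaded, so no particular output is assumed here.

```ruby
require "json"

# Ask Ruby which module currently supplies Array#to_json. If a gem has
# replaced it, the owner (and source_location, when available) will show it.
to_json_method = Array.instance_method(:to_json)
puts "owner:  #{to_json_method.owner}"
puts "source: #{to_json_method.source_location.inspect}"

# The sanity check suggested above.
puts %w(1 2 3 4 5).to_json
```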
Whenever code in tests or the console behaves differently from the production environment (which is a guess... you might be running your site in development mode), that points to a load order issue. In the production environment, all the models and controllers are preloaded; in other environments they are loaded lazily when needed.
Start your console with RAILS_ENV=production ./script/console and see if you can reproduce the error this way.
As cscotta mentioned, there are a couple of gems and libraries that can interfere with .to_json, first and foremost the functionality that you get when you require 'json'. I personally ran into several issues with that.
Hope this helps
Seb