I have a production Rails app running on Heroku, and some API endpoints are taking a long time to respond (~1-2s).
It's a normal Rails RESTful GET action, cases#index. The method looks like this:
    @cases = cases_query

    meta = {
      total: @cases.total_count,
      count: params[:count],
      page: params[:page],
      sort: params[:order],
      filter: pagination[:filter],
      params: params
    }

    render json: @cases,
           root: 'cases',
           each_serializer: CaseSerializer,
           meta: meta
The method runs an ActiveRecord query to select data, serializes each record, and renders JSON. Skylight, a Rails profiling/monitoring tool, is telling me that this endpoint, among others, spends 70% of its time in the controller method (in Ruby) and 30% in the database.
What in this code or in my app's setup is causing this to spend so much time in the app code? Could it be a gem?
[Skylight screenshot of this endpoint: the bulk of the time is spent in Ruby in the controller action.]
ActiveRecord can generate a ton of Ruby objects from queries. So you track the time it takes for the database to return results, and that may be ~20% of your request, but the rest could still be in ActiveRecord converting those results into Ruby objects.
Does your query for this request return a lot of rows? Are those rows very wide (like when you do a join of table1.*, table2.*, table3.*)?
I've had some experience in the past with serializers seriously crushing performance. Tracking that down usually ends up being a line-by-line hunt for what's responsible.
To troubleshoot this, I recommend finding a way to get real-time or near real-time feedback on your performance. The newrelic_rpm gem has a /newrelic page you can view in development mode, which should provide feedback similar to Skylight's. Skylight may have a similar development-mode page; it's worth looking into.
There's also a gem called Peek (https://github.com/peek/peek) that adds a little performance meter to each page view; you can add plugins to it that show specific slices of the request, like the DB, views, and even garbage collection. Check it out, especially the GC plugin.
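A rough sketch of that setup, following the Peek README (the plugin gems and view classes below are from memory, so double-check them against the README):

    # Gemfile
    gem 'peek'
    gem 'peek-gc'               # garbage collection slice
    gem 'peek-performance_bar'  # overall request timing

    # config/routes.rb
    mount Peek::Railtie => '/peek'

    # config/initializers/peek.rb
    Peek.into Peek::Views::GC
    Peek.into Peek::Views::PerformanceBar

You then render the 'peek/bar' partial in your application layout so the meter shows up on every page.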
Once you have that real-time feedback set up, and you can see something that maps roughly to your Skylight output, you can use a binary search through your code to isolate the performance problem.
In your controller, eliminate the rendering step by something like:
render json: {}
and look at the results of that request. If the performance improves dramatically then your issue is probably in the serialization phase.
If not, then maybe it is ActiveRecord blowing up the Ruby object space. Do a Google search for Ruby ObjectSpace profiling and you'll find ways to troubleshoot this.
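As a quick first measurement, you can count allocations around the query in a Rails console; this is plain Ruby (Ruby 2.1+ for GC.stat with a key), using the cases_query from the question:

    GC.start                                   # settle the heap first
    before = GC.stat(:total_allocated_objects)

    records = cases_query.to_a                 # force ActiveRecord to materialize every row

    allocated = GC.stat(:total_allocated_objects) - before
    puts "objects allocated by the query: #{allocated}"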
If that's your problem, try to narrow down the results returned by your query: select only the columns you need to render in this response, and try to eliminate joins where possible (by returning a foreign key instead of an object, if you can).
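For example, something along these lines, where the column list is just a placeholder for whatever your serializer actually renders:

    @cases = cases_query.select(:id, :title, :status, :created_at)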
If serialization is your problem... good luck. This one is particularly hard to troubleshoot in my experience. You could try a more efficient JSON gem like Oj, or hardcode your serializers rather than using ActiveModel::Serializer (last resort!).
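As an illustration of that last resort, a hand-rolled serializer might look something like this (the attribute names are made up; Oj's :compat mode emits standard JSON):

    require 'oj'  # gem 'oj' in the Gemfile

    payload = {
      cases: @cases.map { |c| { id: c.id, title: c.title, status: c.status } },
      meta:  meta
    }
    render json: Oj.dump(payload, mode: :compat)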
Good luck!
Database queries are normally the cause of this kind of issue. Revisit your database queries and try to optimize them; apply joins where you can.
Also try using the Puma web server on Heroku to improve your server's performance.
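A minimal Puma setup along the lines of Heroku's documented recommendation (tune the worker/thread counts to your dyno size):

    # Gemfile
    gem 'puma'

    # config/puma.rb
    workers Integer(ENV['WEB_CONCURRENCY'] || 2)
    threads_count = Integer(ENV['RAILS_MAX_THREADS'] || 5)
    threads threads_count, threads_count
    port        ENV['PORT'] || 3000
    environment ENV['RACK_ENV'] || 'development'

    # Procfile
    # web: bundle exec puma -C config/puma.rb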
Related
I have a Rails 3 project that I updated to Rails 5, which uses Mongoid instead of ActiveRecord. I'm trying to implement fragment caching.
My understanding is that with ActiveRecord, even if I have something like @films = Film.all in a controller but never use @films in the view, the query won't actually run. Hence, if I cache @films in the view, then on the second request it'll be read from cache and the query won't run.
This is how I think ActiveRecord works.
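In other words, I expect something like this (Rails 4+ behaviour; in Rails 3 itself, Film.all returned an eagerly loaded Array):

    @films = Film.all   # builds an ActiveRecord::Relation; no SQL has run yet
    @films.to_a         # the query only executes here, on first enumeration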
Now to Mongoid. I cache the variable in the view, but even though it's being read from cache, the query still hits the db.
My question is, is there a way to avoid that with Mongoid?
Or am I missing something in terms of caching?
I tried searching online, but there isn't much on Rails Mongoid caching, let alone anything written after 2012.
I have a model with through-relationships to another model via two different foreign keys. I do a lot of lookups on those tables (which are 3-4K rows) while importing a lot of data, and I'm trying to eliminate spurious repeated database lookups.
(Ideally, my database would be doing async writes/INSERTs only)
I have played with doing my own caching by ID, and recently switched to using Rails.cache (with MemoryStore for now; I have no need to sync against other instances, and I'm RAM-rich on the import machine). However, I find that I am getting multiple copies of the same associated records, and I'd like to get rid of this.
For instance:
    irb> p = Phone.includes([:site => :client, :btn => :client]).first
    irb> p.site.client.object_id  # => 67190640
    irb> p.btn.client.object_id   # => 67170780
Ideally, I'd like these to point to the same object in memory.
Rails.cache serializes things in and out, which really just makes this worse, though that surprised me. Could I override find_by_id() or some such in a way that the association proxies would make use of my cache?
Maybe there is another caching module that I'm missing?
(please note that there is no web front end involved in this process. It's all models and ORM)
Try using IdentityCache (see https://github.com/Shopify/identity_cache). We currently have a similar issue. We're using JRuby because it's fast, but mallocs are expensive in our target environment, making caching these record instances all the more necessary.
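A minimal sketch based on the IdentityCache README, reusing the model names from your question (untested; check the README for the exact API):

    class Client < ActiveRecord::Base
      include IdentityCache
    end

    class Site < ActiveRecord::Base
      include IdentityCache
      belongs_to :client
      cache_belongs_to :client
    end

    # fetch_client reads through the shared cache instead of issuing a
    # fresh query, so repeated lookups of the same client hit the cache
    client = site.fetch_client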
There used to be an IdentityMap within ActiveRecord, but it was removed due to issues with unexpected behaviour around associations.
Just noticed you asked this in August, did you find a good solution?
Introduction
I have a (mostly) single-page application built with BackboneJS and a Rails backend.
Because most of the interaction happens on one page of the webapp, when the user first visits the page I basically have to pull a ton of information out of the database in one large deeply joined query.
This is causing me some rather extreme load times on this one page.
NewRelic appears to be telling me that most of my problems are because of 457 individual fast method calls.
Now I've done all the eager loading I can do (I checked with the Bullet gem) and I still have a problem.
These method calls are most likely occurring in my Rabl serializer, which I use to serialize a bunch of JSON to embed into the page for initializing Backbone. You don't need to understand all of this, but suffice to say it could add up to 457 method calls.
    object @search
    attributes :id, :name, :subscription_limit

    # NOTE: Include a list of the members of this search.
    child :searchers => :searchers do
      attributes :id, :name, :gravatar_icon
    end

    # Each search has many concepts (there could be over 100 of them).
    child :concepts do |search|
      attributes :id, :title, :search_id, :created_at

      # The person who suggested each concept.
      child :suggester => :suggester do
        attributes :id, :name, :gravatar_icon
      end

      # Each concept has many suggestions (approx. 4 each).
      node :suggestions do |concept|
        # Here I'm scoping suggestions to only ones which meet certain conditions.
        partial "suggestions/show", object: concept.active_suggestions
      end

      # Add a boolean flag to tell if the concept is a favourite or not.
      node :favourite_id do |concept|
        # Another method call which occurs for each concept.
        concept.favourite_id_for(current_user)
      end
    end

    # Each search has subscriptions to certain services (approx. 4).
    child :service_subscriptions do
      # This contains a few attributes and 2 fairly innocuous method calls.
      extends "service_subscriptions/show"
    end
So it seems that I need to do something about this but I'm not sure what approach to take. Here is a list of potential ideas I have:
Performance Improvement Ideas
Dumb-Down the Interface
Maybe I can come up with ways to present information to the user that don't require the actual data to be present. I don't see why I should absolutely need to do this, though; other single-page apps such as Trello have incredibly complicated interfaces.
Concept Pagination
If I paginate concepts, it will reduce the amount of data being extracted from the database each time, though it would produce an inferior user interface.
Caching
At the moment, refreshing the page just extracts the entire search from the DB again. Perhaps I can cache parts of the app to reduce DB hits. This seems messy, though, because not much of the data I'm dealing with is static.
Multiple Requests
It's technically bad to serve the page without embedding the JSON into it, but perhaps the user will feel like things are happening faster if I load the page unpopulated and then fetch the data.
Indexes
I should make sure that I have indexes on all my foreign keys. I should also think about places where other indexes would help (such as favourites?) and add them.
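For example, a migration along these lines, guessing table and column names from my Rabl template above:

    class AddLookupIndexes < ActiveRecord::Migration
      def change
        add_index :concepts, :search_id
        add_index :suggestions, :concept_id
        add_index :favourites, [:user_id, :concept_id]
      end
    end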
Move Method Calls into DB
Perhaps I can cache some of the results of the iteration I do in my view layer into the DB and just pull them out instead of computing them. Or I could sync things on write rather than on read.
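For example, the per-concept favourite_id_for(current_user) call could become a single query for the whole search, run once before rendering. Roughly, assuming a Favourite model with user_id and concept_id columns:

    # one query for all favourites in this search, instead of one per concept
    favourites_by_concept = Favourite
      .where(user_id: current_user.id, concept_id: @search.concepts.map(&:id))
      .index_by(&:concept_id)

    # then, in the template, a hash lookup replaces the per-concept method call:
    # favourites_by_concept[concept.id].try(:id)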
Question
Does anyone have any suggestions as to what I should be spending my time on?
This is a hard question to answer without being able to see the actual user interface, but I would focus on loading only as much data as is required to display the initial interface. For example, if the user has to drill down to see some of the data you're presenting, you can load that data on demand rather than as part of the initial payload. You mention that a search can have as many as 100 "concepts"; maybe you don't need to fetch all of those concepts initially?
Bottom line: it doesn't sound like your issue is really on the client side. It sounds like your server-side code is slowing things down, so I'd explore what you can do to fetch less data, or to defer the complex queries until they are definitely required.
I'd recommend separating your JS code-base into modules that are dynamically loaded using an asset loader like RequireJS. This way you won't have so many XHRs firing at load time.
When a specific module is needed it can load and initialize at an appropriate time instead of every page load.
If you complicate your code a little, each module should be able to start and stop. So, if you have any polling occurring or complex code executing you can stop the module to increase performance and decrease the network load.
I am trying to build a print process, which consists of printing a batch of financial applications using Rails.
I am printing around 100 applications, which consist of multiple levels of data (the application itself, sub-models, and their sub-models).
At the moment the page is very inefficient, as it does a lot of N+1 querying, which causes poor performance.
The question is: is there an efficient way of getting this data out of the database? I've tried pulling the forms with includes() for all the sub-models, but this doesn't help with the models below that (for instance, income_line_items on a financial_history model).
Any ideas?
Have you tried using a nested hash to access the sub-sub-models? Something like:
    @app = Application.includes(:first_submodel,
                                { :second_submodel => [:first_sub_submodel, :second_sub_submodel] },
                                { :third_submodel => :third_sub_submodel })
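For the income_line_items case you mention, the nesting might look something like this (association names guessed from your wording):

    @applications = FinancialApplication.includes(:financial_history => :income_line_items)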
This feels like a really bad idea implementation-wise, and I'm not sure of the speed savings given the potential cost in memory.
Using Dalli and Memcached to store a set of data that gets pulled on every page request and needs to be paginated. It's a lot of data, so I'd really rather not keep hitting the database if possible, but I also don't know how expensive this operation is for memory, both in memcached and in system memory. To paginate, I do basically the following:
1. Call memcache to see if the given data exists.
2. If found, get the data and go to step 4.
3. If not found, get the data from the database and cache it.
4. Dup the object, because the cached object is frozen. ?? <-- seems like a really bad idea...
5. Return the paginated data for display.
WillPaginate has to perform actions on the dataset coming out of the DB, but it throws a "can't modify frozen object" error.
How bad is this? Dup-ing a large object (which is cached to save time on calls to the DB) seems like it will chew up a lot of extra memory and run the GC a lot more than necessary. Can anyone suggest some ways around this?
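One idea I'm considering: cache plain data and rebuild a fresh WillPaginate collection from it on the way out, so nothing frozen ever needs to be mutated. A sketch, with placeholder names:

    # cache the page of records plus the total count, then wrap them in a
    # brand-new (unfrozen) WillPaginate::Collection on every read
    def fetch_page(page, per_page = 25)
      records, total = Rails.cache.fetch(['item-page', page, per_page]) do
        scope = Item.order(:id)
        [scope.paginate(page: page, per_page: per_page).to_a, scope.count]
      end
      WillPaginate::Collection.create(page, per_page, total) do |pager|
        pager.replace(records)
      end
    end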
Instead of working with the objects at the model/list level, you can also cache the rendered HTML. You can even pre-cache the contents for an extra boost :)
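A sketch of that idea in a controller action, caching the final markup per page (the partial path and cache key are made up, and render html: needs a reasonably recent Rails):

    html = Rails.cache.fetch(['items-html', params[:page]]) do
      render_to_string(partial: 'items/list',
                       locals: { items: Item.paginate(page: params[:page], per_page: 25) })
    end
    render html: html.html_safe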