I am working on an "optimization" in my application and I am trying to understand the output that Rails (version 2.2.2) gives at the end of a render.
Here is the "old" way:
Rendered user/_old_log (25.7ms)
Completed in 466ms (View: 195, DB: 8) | 200 OK
And the "new" way:
Rendered user/_new_log (48.6ms)
Completed in 337ms (View: 192, DB: 33) | 200 OK
The queries were exactly the same; the difference is that the old way parses log files while the new way queries a database log table.
The actual speed of the page is not the issue (the user understands that this is a slow request) ... but I would like the page to respond as quickly as possible even though it is a "slow" page.
So, my question is, what do the numbers represent/mean? In other words, which way was the faster method and why?
This:
Rendered user/_old_log (25.7ms)
is the time to render just the _old_log partial template, and comes from an ActiveSupport::Notifications event getting processed by ActionView::LogSubscriber.
This:
Completed 200 OK in 466ms
is the HTTP status returned, as well as the total time for the entire request. It comes from ActionController::LogSubscriber.
Also, note those parenthetical items at the end:
(Views: 124.6ms | ActiveRecord: 10.8ms)
Those are the total times for rendering the entire view (partials & everything) and all database requests, respectively, and come from ActionController::LogSubscriber as well.
Jordan's answer is correct. To paraphrase: the first number is the total time the server took to handle the request. The second is how long the view took to generate. The last number is how long your database took to handle all the queries you sent it.
You can also get an estimate of how long your controller and model code took by subtracting the last two numbers from the first, but a better way is to use the Benchmark.measure method (http://www.ruby-doc.org/stdlib/libdoc/benchmark/rdoc/classes/Benchmark.html).
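For example, a minimal sketch using Benchmark.measure from the Ruby standard library; the finder call inside the block is a placeholder, written in Rails 2.x style to match the question:

require 'benchmark'

elapsed = Benchmark.measure do
  # placeholder for the controller/model work you want to time
  @users = User.find(:all, :include => :posts)
end
Rails.logger.info("controller/model work: #{elapsed.real}s")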
Your new way appears to have improved because code in the Controller/Model completes faster.
Your new way is spending less time overall but more time rendering the template.
Related
I have an AngularJS app, and there's one page in my application, only one, that takes 2 minutes to load. It does load some data, but the data itself is only 700KB, and when I benchmark the entire Rails action from the beginning until right before the render, it only takes 15-20 seconds. But when I look at the actual network call, or put a timer before the Angular http post call and another in its success callback, both show the call taking almost 2 minutes. I can't figure out what's happening between the render and the success on the Angular side that would cause this extreme time difference. Does anyone know how I could debug this further, or what could be causing it?
The Rails action just does a couple of big database calls, all optimized, then does some work on the data; the data (which is already JSONified with to_json) is then rendered out.
Rails action ends with Completed 200 OK in 20458ms (Views: 913.8ms | ActiveRecord: 139.6ms)
Edit: If I put a limit on my data, it's almost instant, so it definitely has to do with the data. But I'm not sure what could be causing the minute-and-a-half gap between when the Rails action finishes and the http post success begins.
Edit 2: An ajax call takes an equal amount of time, so there must be an issue with how the data is being parsed on the front end; there's clearly a problem between the render and the page receiving the data, and I'm not sure of the best way to track it down.
It turns out the issue was, somehow, the extremely complex hash my old coworker wrote. I think the whole thing was pretty unnecessary, so I deleted all 90 lines of code where he built the hash from scratch and replaced them with 3 lines.
I now have the two ActiveRecord queries with the proper includes, and one render statement on those ActiveRecord objects using as_json with the proper include and only parameters, and the page now loads in 25 seconds in development. I can only imagine it'll be faster in production/staging. I don't know why the hashes were so hard to render as JSON, but calling as_json on the ActiveRecord objects within the render statement completely fixed my issue.
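For illustration, a hedged sketch of that pattern; the model and attribute names (Post, Comment) are placeholders, not from the original app. Build the relations with the right includes, then call as_json inside render with only/include options instead of assembling a hash by hand:

# Hypothetical controller action; Post/Comment and the attribute lists are assumptions.
def report
  posts = Post.includes(:comments).where(published: true)
  render json: posts.as_json(
    only:    [:id, :title, :created_at],
    include: { comments: { only: [:id, :body] } }
  )
end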
We are working on a data visualization problem right now. Our customer wants us to show the last 6 months of data for a honeybee hive on a graph.
Clearly it's going to be a huge dataset. By adding indexes we overcame the database slowness in loading the data, though we still have a problem visualizing it on a graph.
Here is the related code:
def self.prepare_single_hive_messages_for_datatable_dygraph(messages, us_metric_enabled)
  data = []
  messages.each do |message|
    record = []
    record << message.occurance_time.to_s(:dygraph_format)
    record << weight_according_to_metric(message.weight, us_metric_enabled)
    record << temperature_according_to_metric(message.temperature, us_metric_enabled)
    record << (message.humidity.nil? ? nil : message.humidity.to_f)
    data << record
  end
  return data
end
The problem is that messages.each is very slow and takes more than 30 seconds. Is there any solution to overcome this?
Project Specification:
Rails Version: 4.1.9
Graph Library: Dygraph
Database: Postgres
There are two ways to attack a performance problem like this.
Find and correct the performance bottleneck
Break it into smaller pieces
Finding performance issues
First, get a dataset large enough to reproduce the problem set up on your dev system. Then look at the logs so you can see how long the transaction is taking. You should be looking for a line like this:
Completed 200 OK in 432.1ms (Views: 367.7ms | ActiveRecord: 61.4ms)
Rerun the task a couple of times, since caching can cause variations. Write down your different times. Then remove everything inside the loop and run it with just the empty loop. Do the numbers go back to looking reasonable? If so, you know the problem is the work you are doing inside the loop. Next, add each line in the loop back on its own (or one at a time if they depend on each other). Figure out which line causes those numbers to jump the most.
This is the point where you should try to performance-tune your code. Check for queries that could be smarter. Make sure you aren't querying the same data over and over. If you have a method in a model that computes something expensive and you call it multiple times expecting the same answer, memoize it so it only computes once:
def something
  return @savedvalue if @savedvalue   # already computed? reuse the cached result
  @savedvalue = really_complex_calculation
end
The goal is to find the worst offender so you can make changes that have the biggest impact. However, if you are working with a LOT of data this may only get you so far. It may be impossible to performance-tune enough for all the data. In that case there is option 2.
Break it into smaller pieces
Write a second Rails action whose only job is to render a single record for the graph. It will do the inner part of your loop, but only for the message whose id was passed to it.
Call your original function to set up the view and pass the list of messages to the view. In the view, loop through the list of messages to set up jQuery ajax code that calls the above action once for each message (see the sketch below). Have this run on document ready.
Then, the page will load with an empty graph... but as soon as it is up, the individually processed records will be fed to it and appear one at a time. It will still take just as long (or even a little longer, because of overhead) to complete the graph... but it will no longer time out. Each ajax call is its own quick hit to the server instead of one big long one.
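A minimal sketch of that per-record action; the route and names (MessagesController#graph_point, a Message model) are assumptions, while the metric helpers and date format are the ones from the question:

# Hypothetical per-record action; names are assumptions, not from the post.
class MessagesController < ApplicationController
  # GET /messages/:id/graph_point.json
  def graph_point
    message = Message.find(params[:id])
    us_metric = params[:us_metric_enabled] == "true"
    render json: [
      message.occurance_time.to_s(:dygraph_format),
      Message.weight_according_to_metric(message.weight, us_metric),
      Message.temperature_according_to_metric(message.temperature, us_metric),
      message.humidity.nil? ? nil : message.humidity.to_f
    ]
  end
end

The view would then emit one jQuery $.getJSON call per message id on document ready, appending each returned point to the Dygraph as it arrives.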
I just used this very technique to load a rather long report on a site I work on. Ideally we'd like to fix any underlying performance issues... but what we really wanted was to have a report working right away and then fix the performance issues as we had time.
OK, you said every person sees the same set of data, which is great: it means we can cache without worrying about who's logged in. First, here's your method with tiny improvements:
def self.prepare_single_hive_messages_for_datatable_dygraph(messages, us_metric_enabled)
  messages.inject([]) do |records, message|
    records << [].tap do |record|
      record << message.occurance_time.to_s(:dygraph_format)
      record << weight_according_to_metric(message.weight, us_metric_enabled)
      record << temperature_according_to_metric(message.temperature, us_metric_enabled)
      record << (message.humidity.nil? ? nil : message.humidity.to_f)
    end
  end
end
Then create a caching method that runs this method and caches the result:
# some class constants
CACHE_KEY = 'some_cache_key'
EXPIRY_TIME = 15.minutes

# the methods
def self.write_single_hive_messages_to_cache(messages, us_metric_enabled)
  # we're already in class-method context, so call the sibling class method directly
  Rails.cache.write CACHE_KEY,
                    prepare_single_hive_messages_for_datatable_dygraph(messages, us_metric_enabled),
                    expires_in: EXPIRY_TIME
end
And a simple cache reading method
def self.read_single_hive_messages_from_cache
  Rails.cache.read CACHE_KEY
end
Then create a rake task that just fetches these messages and calls the caching method, and Rails will write the cache.
Create a cron job that calls this rake task every 5 minutes or so (see the sketch below). The expiry time is longer just in case the cron job fails to run for some reason; the data will still be available for the next run.
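A minimal sketch of the rake task and the crontab entry; the namespace, task name, and Message model are assumptions, as is defaulting us_metric_enabled to false here:

# lib/tasks/hive_cache.rake
namespace :hive do
  desc "Pre-compute and cache the hive message data for the graph"
  task warm_message_cache: :environment do
    messages = Message.where("occurance_time > ?", 6.months.ago)
    Message.write_single_hive_messages_to_cache(messages, false)
  end
end

# crontab entry, every 5 minutes:
# */5 * * * * cd /path/to/app && bundle exec rake hive:warm_message_cache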
This way your processing runs in the background every 5 minutes (or whatever interval you choose), and the page load happens normally with no delay at all, since the array data is loaded from the pre-calculated cache.
In case cron stops working, the data will expire after the 15 minutes I've set, and the read method will return nil. You could avoid this by setting the data to never expire, but then the data would become stale and the old data would keep being returned.
Another way to handle this is to let the cache-reading method generate the cache itself: if it finds the cache empty, it generates the data and caches it before returning. With Rails.cache.fetch the method would look like this (note that fetch caches the block's return value, so the block should build the data rather than call the write method):
def self.read_single_hive_messages_from_cache(messages, us_metric_enabled)
  Rails.cache.fetch CACHE_KEY, expires_in: EXPIRY_TIME do
    # fetch stores whatever this block returns, so build the data here
    prepare_single_hive_messages_for_datatable_dygraph(messages, us_metric_enabled)
  end
end
But then make sure that messages is an ActiveRecord::Relation and not an already-processed array: you don't want to query for 1+ million records only to find the cache already ready. If it's an ActiveRecord::Relation, it won't touch the database until the array is materialized (inside the caching block); if the cache exists, it is returned before you enter the block, the data is never fetched, and you're saved that huge query.
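To make that distinction concrete (Message is an assumed model name):

# A relation is lazy: building it runs no query.
messages = Message.where("occurance_time > ?", 6.months.ago)        # no query yet

# Calling to_a (or iterating) materializes it and hits the database.
messages = Message.where("occurance_time > ?", 6.months.ago).to_a   # query runs now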
I know the answer got long, if you need more help tell me.
So I've been using MongoDB (with the Mongoid Ruby gem) for a while now, and I've noticed requests taking longer and longer as my data has grown. Here is what a typical request for my app looks like, but it takes about 500ms just for the DB work.
Nothing special here just some controller stuff:
Started GET "/cities/san-francisco?date_range=past_week" for 127.0.0.1 at 2011-11-15 11:13:04 -0800
Processing by CitiesController#show as HTML
Parameters: {"date_range"=>"past_week", "id"=>"san-francisco"}
Then the queries run, but what I don't understand is that before every query, it performs a MONGODB dashboard_development['system.namespaces'].find({})! Why?
MONGODB dashboard_development['system.namespaces'].find({})
MONGODB dashboard_development['users'].find({:_id=>BSON::ObjectId('4e80e0090f6d2e306f000001')})
MONGODB dashboard_development['system.namespaces'].find({})
MONGODB dashboard_development['cities'].find({:slug=>"san-francisco"})
MONGODB dashboard_development['system.namespaces'].find({})
MONGODB dashboard_development['accounts'].find({:_id=>BSON::ObjectId('4e80e0090f6d2e306f000002')})
MONGODB dashboard_development['system.namespaces'].find({})
MONGODB dashboard_development['neighborhoods'].find({"city_id"=>BSON::ObjectId('4e80e00a0f6d2e306f000005')})
Then the views get rendered. They are pretty slow too... but that is a separate problem altogether; I'll address it at another time.
Rendered cities/_title_and_scope.html.erb (109.3ms)
Rendered application/_dropdown.html.erb (0.1ms)
Rendered application/_date_range_selector.html.erb (6.2ms)
Rendered cities/show.html.erb within layouts/application (122.7ms)
Rendered application/_user_dropdown.html.erb (0.9ms)
Rendered application/_main_navigation.html.erb (5.8ms)
So minus the views, the request took about 500ms. That's too long for a really simple query, and as the app grows that time will grow as well. This example is also faster than requests usually are; sometimes they take 1000ms or more!
Completed 200 OK in 628ms (Views: 144.9ms)
Additionally, I wanted to ask which fields are most appropriate for indexes? Maybe this is my problem, as I'm not really using them at all. Any help understanding this would be really, really appreciated. Thanks!
You need to use indexes; otherwise, your mongo queries are executing what is best described as a full table scan. Mongo loads the entirety of your collection's JSON documents into memory and then evaluates each one to determine whether it should be included in the response.
Strings, dates, and numbers can all be used as indexes. The trick is to have an index on each attribute you do a "where" on.
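A hedged sketch of what those declarations could look like, using Mongoid 2.x-era syntax (current when this question was asked); the field names come from the queries above, but adjust to your actual schema:

# Illustrative index declarations only.
class City
  include Mongoid::Document
  field :slug
  index :slug, unique: true    # covers cities.find({:slug => "san-francisco"})
end

class Neighborhood
  include Mongoid::Document
  index :city_id               # covers neighborhoods.find({"city_id" => ...})
end

After declaring them, build the indexes with rake db:mongoid:create_indexes.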
You can also turn off table scans in your mongo config (for example, by starting mongod with the --notablescan option) to help find table scans and destroy them!
I have a controller that returns JSON or XML from a fairly complex relational query with some controller logic as well.
I've tuned the DB side by refining my query and making sure my indexes are correct.
In my log I see items like this:
Completed in 740ms (View: 1, DB: 50)
So if I understand correctly, this means the view took 1ms to render and the DB queries took 50ms. Is all the remaining time spent in the controller? I've tried bypassing my controller logic and just leaving the to_json and to_xml in there, and it is just as slow. As a point of reference, my average returned JSON result set is 168k.
Are there other steps that go into the Completed in time? Does it include time until last byte for the network transfer?
Update: I wrapped various parts of my controller in benchmarking blocks:
self.class.benchmark("Active Record Find") do
  # my query here
end
What I found was that even though the log line says DB: 50, my ActiveRecord find is taking almost all of the remaining time. So now I'm confused as to what that DB number means, and why the benchmark says ~600ms when the DB: time is ~50.
Thanks
Your DB number is the time actually spent in the database; it does not include the time spent loading the results into ActiveRecord objects.
So if you're loading 168,000 Ruby ActiveRecord objects to render them as JSON, this would explain your 550ms (or more!).
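One hedged way to test that theory is to skip ActiveRecord instantiation entirely and fetch raw rows; connection.select_all is standard ActiveRecord, but the table and column names here are placeholders:

# Returns an array of plain hashes instead of model objects.
rows = ActiveRecord::Base.connection.select_all(
  "SELECT id, name, created_at FROM widgets"
)
render :json => rows.to_json

If that renders quickly while the model-based version doesn't, the cost is object instantiation rather than the query itself.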
If these times are observed in the development environment, the additional time is probably because the application classes are not being cached. The application files are being reloaded on every request.
As suggested in this answer, try setting config.cache_classes = true in config/environments/development.rb and restarting your server to see what effect this has on your response times. Be sure to change it back to config.cache_classes = false and restart your server once you're done.
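For reference, the temporary change would be:

# config/environments/development.rb
# Temporary, for measurement only; revert to false when done.
config.cache_classes = true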
I'm doing some optimisation on my Rails (2.3.5) app and can't seem to find an elegant way of benchmarking the filter chain. I'm load-testing the site with ab (ApacheBench), something like:
ab -n 200 -c 3 -i -k http://localtestingserver:80/test
/test is set up with nothing in the controller and nothing in the page, so it's just loading our default filter chain plus rendering the layout. I get an average of 86ms per request, which is fine.
When I disable the filters (skip_filter filter_chain) it drops to 37ms, and without the layout (render :layout => false) it drops to 16ms. Is there a way I can benchmark, perhaps with Benchmark.realtime, each function in the filter chain before the controller is called (or indeed after)? Can I even output a list of all filters being called on a request?
Thanks,
Dan
Edit
I'm using the Hodel3000 logger and Oink, so I get output per request like:
Jan 27 17:56:55 testing rails[19611]: Memory usage: 98748 | PID: 19611
Jan 27 17:56:55 testing rails[19611]: Instantiation Breakdown: Total: 2 | Room: 1 | User: 1
Jan 27 17:56:55 testing rails[19611]: Completed in 240ms (View: 28, DB: 0) | 200 OK [/test]
I'd just like to better understand and profile what happens before the controller is called; I can profile the controllers themselves fine. For example, where the extra 212ms comes from in the request above. Obviously I could drop code into each of my own before_filters, but I hoped there was a way to wrap every filter in one go (including ones from included gems, etc.).
The Performance Testing Rails Applications guide looks like a good place to start.
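In the meantime, a rough sketch for timing the whole chain yourself with plain Rails 2.x filter APIs; it assumes your other before_filters end up ordered between these two, and it times the chain as a whole rather than per filter:

class ApplicationController < ActionController::Base
  prepend_before_filter :start_filter_timer   # runs ahead of the other filters
  # ... existing before_filters run here ...
  before_filter :log_filter_time              # declared last so it runs last

  private

  def start_filter_timer
    @filter_timer_start = Time.now
  end

  def log_filter_time
    elapsed_ms = ((Time.now - @filter_timer_start) * 1000).round
    logger.info("filter chain took #{elapsed_ms}ms")
  end
end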