My main idea is to count the number of ActiveRecord queries for every API hit in Rails. I was looking into the ActiveSupport instrumentation API, and it seems it already provides some useful data:
view_runtime
db_runtime
I also found a couple of gems that count the number of queries, but they add the count to the log output.
https://github.com/rubysamurai/query_count
https://github.com/comboy/sql_queries_count
https://github.com/makandra/query_diet
Ex: Below is a sample log line when I used the query_count gem.
Completed 200 OK in 140ms (Views: 12.7ms | ActiveRecord: 54.4ms | SQL Queries: 36 (0 cached) | Allocations: 20449)
But instead of a log line, is it possible to expose query_count via ActiveSupport events, maybe as part of the payload of process_action.action_controller?
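Here is a minimal sketch of what I have in mind (untested; QueryCounter and the thread-local key are my own names, not a Rails API, and the payload of process_action.action_controller is not officially extensible, so I just log the count next to that event):

# config/initializers/query_counter.rb
module QueryCounter
  def self.reset!
    Thread.current[:query_count] = 0
  end

  def self.increment!
    Thread.current[:query_count] = count + 1
  end

  def self.count
    Thread.current[:query_count] || 0
  end
end

# Count every SQL statement ActiveRecord executes (skip cached results and
# schema queries; payload[:cached] is available on Rails 5.1+).
ActiveSupport::Notifications.subscribe("sql.active_record") do |_name, _start, _finish, _id, payload|
  QueryCounter.increment! unless payload[:cached] || payload[:name] == "SCHEMA"
end

# Reset the counter when a request starts processing...
ActiveSupport::Notifications.subscribe("start_processing.action_controller") do |*|
  QueryCounter.reset!
end

# ...and report the count when the action completes.
ActiveSupport::Notifications.subscribe("process_action.action_controller") do |_name, _start, _finish, _id, payload|
  Rails.logger.info "#{payload[:controller]}##{payload[:action]}: #{QueryCounter.count} SQL queries"
end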
The easiest way to do Rails endpoint instrumentation, in my experience, is to use rails-panel.
It shows how long the rendering took, how long the queries took, and how many queries were executed.
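If you want to try it, RailsPanel is a Chrome extension that reads data published by the meta_request gem, so adding the gem to your development group is usually all the setup needed:

# Gemfile
group :development do
  gem 'meta_request'
end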
Related
I am generating nested JSON objects for 300 records (GET request), all eager loaded from various tables, using the fast_jsonapi gem from Netflix.
The time spent on ActiveRecord is an order of magnitude lower than it was with AMS. However, the total time is not drastically improved, taking 1484ms as shown below:
1.080000 0.020000 1.100000 ( 1.421109)
Completed 200 OK in 1484ms (Views: 9.7ms | ActiveRecord: 51.3ms)
Views and ActiveRecord are fast (only about 60ms combined), so what is the vast majority of the time spent on?
How can I investigate this?
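One thing I could do is time the serialization and the render separately, something like the sketch below (Movie and MovieSerializer are placeholders for my real classes), but is there a better way?

require "benchmark"

def index
  records = Movie.includes(:actors).limit(300).to_a   # eager loaded, as described above

  serialization = Benchmark.realtime do
    @json = MovieSerializer.new(records, include: [:actors]).serialized_json
  end
  rendering = Benchmark.realtime { render json: @json }

  Rails.logger.info "serialization=#{(serialization * 1000).round(1)}ms render=#{(rendering * 1000).round(1)}ms"
end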
I have a Rails 5 app with a model called SensorRegistry.
It currently has about 160,000 records, but I am experiencing extremely slow loading times when trying to display this data.
The application is running on a dual-core Intel(R) Xeon(R) CPU E5-2670 v3 @ 2.30GHz with 2GB of RAM.
The server logs show the following:
Started GET "/sensor_registries" for 187.220.30.180 at 2017-01-10 23:43:41 +0000
Cannot render console from 187.220.30.180! Allowed networks: 127.0.0.1, ::1, 127.0.0.0/127.255.255.255
ActiveRecord::SchemaMigration Load (1.2ms) SELECT "schema_migrations".* FROM "schema_migrations"
Processing by SensorRegistriesController#index as HTML
Rendering sensor_registries/index.html.erb within layouts/application
SensorRegistry Load (604.0ms) SELECT "sensor_registries".* FROM "sensor_registries"
Sensor Load (0.6ms) SELECT "sensors".* FROM "sensors" WHERE "sensors"."id" IN (49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 65, 61, 63, 64, 62)
Rendered sensor_registries/index.html.erb within layouts/application (54663.9ms)
Completed 200 OK in 55468ms (Views: 54827.7ms | ActiveRecord: 611.5ms)
I got rid of the N+1 problem, but I'm wondering if there is something more I can do about the database queries.
Anyway, the problem seems to be at the moment of rendering the page: it takes about 54 seconds to process it.
Is there a way to optimize CPU usage?
What would be the best solution for speeding up the process and showing the data to the user fast enough?
This certainly doesn't look like a hardware problem, but rather a problem with the implementation. It's impossible to answer perfectly without knowing more about your data structure and architecture, but here are a few thoughts that may help you track down the problem:
1) How big is the rendered page? Is it possible that the sheer size of the data is what's causing slow rendering times? If this is the problem and the rendered page is just too big, think about paginating the results.
2) How much memory is the Ruby process using at any time? When you say it "has about 160,000 records", I assume you're talking about the sensor_registries table, and I assume the subsequent sensors query with the sensors.id IN (...) bit is constructed with data from the sensor_registries table. If this entire table is loaded into memory before any further work is done, is it possible that the Ruby process is simply out of memory?
3) Also, does the entire table really need to be loaded up all at once? You might want to take a look at http://apidock.com/rails/ActiveRecord/Batches/find_in_batches. That method is great for breaking up work that needs to be done on a large table.
4) Even better (and there's really not a better way to put this): rethink your architecture. Loading a whole table into memory as part of a synchronous request (even a small one) is almost always a no-no. Can you come up with a SQL query to get all your needed records without loading the whole table?
5) If you absolutely need the whole table, what about caching the result after a single load? If you expect the result of a query to be the same across requests with identical parameters, you could construct a cache key out of the params and use it to store the result; see the rough sketch after this list. At least that way, all requests after the first one will be fast(er).
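A rough sketch of point 5 (with find_in_batches from point 3 thrown in). The cache key, the expiry, and the :sensor association are assumptions about your app, not a drop-in solution:

class SensorRegistriesController < ApplicationController
  def index
    # Build a cache key out of the params that actually affect the result.
    cache_key = ["sensor_registries/index", params[:page], params[:per_page]].compact.join("/")

    @rows = Rails.cache.fetch(cache_key, expires_in: 10.minutes) do
      rows = []
      # find_in_batches keeps memory bounded instead of materializing all 160,000 rows at once.
      SensorRegistry.includes(:sensor).find_in_batches(batch_size: 1_000) do |batch|
        rows.concat(batch.map { |r| r.as_json(include: :sensor) })
      end
      rows
    end
  end
end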
Use the following methods:
Use pagination to display a limited number of records per page and load more records on a button click.
Use page caching to put the data into the cache and load the heavy page in less time.
Load JS files after the page loads, and use compressed JS files.
Reduce the number of records you show at the same time using pagination or something similar (5-10 at most). That way your queries and rendering will be reduced by a lot. I prefer the will_paginate gem for this, but there are many more (see the sketch at the end of this answer).
In your views, reduce the amount of logic you are using to render each sensor and create separate views for single sensors (show). In the single sensor views you can add more logic.
Use the New Relic gem to monitor your app and see which requests take the most time. Maybe you have slow external resources like API calls, etc., which you can perform via Ajax instead of on the server: https://docs.newrelic.com/docs/agents/ruby-agent/installation-configuration/ruby-agent-installation
Those are the basics, once you are done with that, move on to eager loading and caching.
For eager loading read this: http://guides.rubyonrails.org/active_record_querying.html#eager-loading-associations
For caching read DHH's blog post: https://signalvnoise.com/posts/3113-how-key-based-cache-expiration-works
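A sketch of the pagination-plus-eager-loading combination from the first point (the page size and the :sensor association are assumptions about your app):

# app/controllers/sensor_registries_controller.rb
def index
  @sensor_registries = SensorRegistry.includes(:sensor)
                                     .paginate(page: params[:page], per_page: 10)
end

In the view, will_paginate @sensor_registries renders the page links, so each request only loads and renders ten registries instead of 160,000.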
Using Rails 4 and Mongoid 5.0.1
Our log output shows that almost all queries are duplicated. For a while I assumed it was just doubled output, but looking closer, the execution times are different, indicating it is actually sending two requests:
d_562b2d81a54d7550ce000031.find | STARTED | {"find"=>"deals", "filter"=>{"contact_id"=>BSON::ObjectId('563bcb9da54d75116500010b')}}
d_562b2d81a54d7550ce000031.find | SUCCEEDED | 0.001186s
d_562b2d81a54d7550ce000031.find | STARTED | {"find"=>"deals", "filter"=>{"contact_id"=>BSON::ObjectId('563bcb9da54d75116500010b')}}
d_562b2d81a54d7550ce000031.find | SUCCEEDED | 0.0013s
This behaviour seems to apply to most queries but not all; sometimes specific queries only get called once. Those seem to be queries that run against the database specified in mongoid.yml and are the first in the web request.
This behaviour happens in web requests, but any query in the Rails console outputs two log lines too. It happens on 'where' queries, and on 'find' as well.
As this is a multi-tenant app, we have the following in most models:
store_in database: -> { Machine.current.database_name }
The collection for Machine (along with Users) is stored in the master_#{Rails.env} database
The duplicate requests (in the logs) are all against the correct databases though, so this might be a red herring.
When we were on Mongoid 3 this problem was never apparent, but Mongoid 5 has significantly better logging, so the problem may have existed then too but not been noticed.
Actually, I suspect it's a gem called bullet causing the duplicate logs. Turning it off solved my problems.
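If Bullet turns out to be the cause in your app as well, you don't necessarily have to remove the gem; disabling it in the environment where it runs (usually development) should stop the duplicated lines. A sketch using Bullet's standard configuration:

# config/environments/development.rb
config.after_initialize do
  Bullet.enable = false  # or keep it enabled and disable only the notifiers you don't want
end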
I am trying to count the number of Postgres statements my Ruby on Rails application is performing against our database. I found the entry below on the DBA Stack Exchange, but it counts transactions. We have several transactions that issue very large numbers of statements, so that doesn't give a good picture. I am hoping the data is available from PG itself rather than by trying to parse a log.
https://dba.stackexchange.com/questions/35940/how-many-queries-per-second-is-my-postgres-executing
I think you are looking for ActiveSupport instrumentation. Part of Rails, this framework is used throughout Rails applications to publish certain events. For example, there's a sql.active_record event type that you can subscribe to in order to count your queries.
counter = 0
ActiveSupport::Notifications.subscribe "sql.active_record" do |*args|
  counter += 1
end
You could put this in config/initializers/ (to count across the app) or in one of the various before_ hooks of a controller (to count statements for a single request).
(The fine print: I have not actually tested this snippet, but that's how it should work AFAIK.)
PostgreSQL provides a few facilities that will help.
The main one is pg_stat_statements, an extension you can install to collect statement statistics. I strongly recommend this extension; it's very useful. It can tell you which statements run most often, which take the longest, etc. You can query it to add up the number of queries for a given database.
To get a rate over time, you should have a script sample pg_stat_statements regularly, creating a table with the values that changed since the last sample.
The pg_stat_database view tracks values including the transaction rate, but it does not track the number of queries.
There are also pg_stat_user_tables, pg_stat_user_indexes, etc., which provide usage statistics for tables and indexes. These track individual index scans, sequential scans, etc. done by queries, but again not the number of queries.
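If you want that number inside the Rails app itself (tying this back to the original question), here is a sketch that assumes pg_stat_statements has been installed and added to shared_preload_libraries:

# Total number of statement executions recorded for the current database.
total_calls = ActiveRecord::Base.connection.select_value(<<~SQL)
  SELECT COALESCE(SUM(calls), 0)
  FROM pg_stat_statements
  WHERE dbid = (SELECT oid FROM pg_database WHERE datname = current_database())
SQL

Sampling this value periodically and diffing consecutive samples gives a statements-per-interval rate, matching the "sample regularly" advice above.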
I started testing Neo4j for a program and I am facing some performance issues. As mentioned in the title, Neo4j is embedded directly in the Java code.
My graph contains about 4 million nodes and several hundred million relationships. My test is simply to send a query counting the number of inbound relationships for a node.
This program uses the ExecutionEngine execute method to send the following query:
start n=node:node_auto_index(id="United States") match s-[:QUOTES]->n return count(s)
By simply adding some prints I can see how much time this query takes: usually about 900ms, which is a lot.
What surprises me the most is that I receive a "query execution time" in the response, which is really different.
For instance a query returned:
+----------+
| count(n) |
+----------+
| 427738 |
+----------+
1 row
1 ms
According to this response, I understand that Neo4j took 1ms for the query, but when I print some log messages I can see that it actually took 917ms.
I guess that 1ms is the time required to find the indexed object "United States", which would mean that Neo4j needed about 916ms for the rest, like counting the number of relationships. In this case, how can I get better performance for this query?
Thanks in advance!
Query timers were broken in 1.8.1 and 1.9.M04, when the Cypher lazy evaluation changes were made (definitely a worthwhile trade for most use cases). But yeah, I think it will be fixed soon.
For now you'll have to time things externally.
Update:
As for your question about whether that time is reasonable... It basically needs to scan all ~400k nodes to count them. This is probably reasonable, even if the cache is warmed up and all of those fit into RAM. Having "super nodes" like this is usually not best practice if it can be avoided, although they are going to be making a lot of improvements for this case in future versions (at least, that's what I hear).
Make sure not to measure the first query b/c that one only measures how long it takes to load the data from disk into memory.
Make sure to give Neo4j enough memory to cache your data.
And try this query to see if it is faster:
start n=node:node_auto_index(id="United States")
return length(()-[:QUOTES]->n) as cnt