How to test the number of database calls in Rails - ruby-on-rails

I am creating a REST API in rails. I'm using RSpec. I'd like to minimize the number of database calls, so I would like to add an automatic test that verifies the number of database calls being executed as part of a certain action.
Is there a simple way to add that to my test?
What I'm looking for is some way to monitor/record the calls that are being made to the database as a result of a single API call.
If this can't be done with RSpec but can be done with some other testing tool, that's also great.

The easiest thing in Rails 3 is probably to hook into the notifications api.
This subscriber
class SqlCounter< ActiveSupport::LogSubscriber
def self.count= value
Thread.current['query_count'] = value
end
def self.count
Thread.current['query_count'] || 0
end
def self.reset_count
result, self.count = self.count, 0
result
end
def sql(event)
self.class.count += 1
puts "logged #{event.payload[:sql]}"
end
end
SqlCounter.attach_to :active_record
will print every executed sql statement to the console and count them. You could then write specs such as
expect do
# do stuff
end.to change(SqlCounter, :count).by(2)
You'll probably want to filter out some statements, such as ones starting/committing transactions or the ones active record emits to determine the structures of tables.

You may be interested in using explain. But that won't be automatic. You will need to analyse each action manually. But maybe that is a good thing, since the important thing is not the number of db calls, but their nature. For example: Are they using indexes?
Check this:
http://weblog.rubyonrails.org/2011/12/6/what-s-new-in-edge-rails-explain/

Use the db-query-matchers gem.
expect { subject.make_one_query }.to make_database_queries(count: 1)

Fredrick's answer worked great for me, but in my case, I also wanted to know the number of calls for each ActiveRecord class individually. I made some modifications and ended up with this in case it's useful for others.
class SqlCounter< ActiveSupport::LogSubscriber
# Returns the number of database "Loads" for a given ActiveRecord class.
def self.count(clazz)
name = clazz.name + ' Load'
Thread.current['log'] ||= {}
Thread.current['log'][name] || 0
end
# Returns a list of ActiveRecord classes that were counted.
def self.counted_classes
log = Thread.current['log']
loads = log.keys.select {|key| key =~ /Load$/ }
loads.map { |key| Object.const_get(key.split.first) }
end
def self.reset_count
Thread.current['log'] = {}
end
def sql(event)
name = event.payload[:name]
Thread.current['log'] ||= {}
Thread.current['log'][name] ||= 0
Thread.current['log'][name] += 1
end
end
SqlCounter.attach_to :active_record
expect do
# do stuff
end.to change(SqlCounter, :count).by(2)

Related

How can I count the number of accesses/queries to database through Mongoid?

I'm using the Mongoid in a Rails project. To improve the performance of large queries, I'm using the includes method to eager load the relationships.
I would like to know if there is an easy way to count the real number of queries performed by a block of code so that I can check if my includes really reduced the number of DB accesses as expected. Something like:
# It will perform a large query to gather data from companies and their relationships
count = Mongoid.count_queries do
Company.to_csv
end
puts count # Number of DB access
I want to use this feature to add Rspec tests to prove that my query remains efficient after changes (e.g; when adding data from a new relationship). In python's Django framework, for instance, one may use the assertNumQueries method to this end.
Checking on rubygems.org didn't yield anything that seems to do what you want.
You might be better off looking into app performance tools like New Relic, Scout, or DataDog. You may be able to get some out of the gate benchmarking specs with
https://github.com/piotrmurach/rspec-benchmark
I just implemented this feature to count mongo queries in my rspec suite in a small module using mongo Command Monitoring.
It can be used like this:
expect { code }.to change { finds("users") }.by(3)
expect { code }.to change { updates("contents") }.by(1)
expect { code }.not_to change { inserts }
Or:
MongoSpy.flush
# ..code..
expect(MongoSpy.queries).to match(
"find" => { "users" => 1, "contents" => 1 },
"update" => { "users" => 1 }
)
Here is the Gist (ready to copy) for the last up-to-date version: https://gist.github.com/jarthod/ab712e8a31798799841c5677cea3d1a0
And here is the current version:
module MongoSpy
module Helpers
%w(find delete insert update).each do |op|
define_method(op.pluralize) { |ns = nil|
ns ? MongoSpy.queries[op][ns] : MongoSpy.queries[op].values.sum
}
end
end
class << self
def queries
#queries ||= Hash.new { |h, k| h[k] = Hash.new(0) }
end
def flush
#queries = nil
end
def started(event)
op = event.command.keys.first # find, update, delete, createIndexes, etc.
ns = event.command[op] # collection name
return unless ns.is_a?(String)
queries[op][ns] += 1
end
def succeeded(_); end
def failed(_); end
end
end
Mongo::Monitoring::Global.subscribe(Mongo::Monitoring::COMMAND, MongoSpy)
RSpec.configure do |config|
config.include MongoSpy::Helpers
end
What you're looking for is command monitoring. With Mongoid and the Ruby Driver, you can create a custom command monitoring class that you can use to subscribe to all commands made to the server.
I've adapted this from the Command Monitoring Guide for the Mongo Ruby Driver.
For this particular example, make sure that your Rails app has the log level set to debug. You can read more about the Rails logger here.
The first thing you want to do is define a subscriber class. This is the class that tells your application what to do when the Mongo::Client performs commands against the database. Here is the example class from the documentation:
class CommandLogSubscriber
include Mongo::Loggable
# called when a command is started
def started(event)
log_debug("#{prefix(event)} | STARTED | #{format_command(event.command)}")
end
# called when a command finishes successfully
def succeeded(event)
log_debug("#{prefix(event)} | SUCCEEDED | #{event.duration}s")
end
# called when a command terminates with a failure
def failed(event)
log_debug("#{prefix(event)} | FAILED | #{event.message} | #{event.duration}s")
end
private
def logger
Mongo::Logger.logger
end
def format_command(args)
begin
args.inspect
rescue Exception
'<Unable to inspect arguments>'
end
end
def format_message(message)
format("COMMAND | %s".freeze, message)
end
def prefix(event)
"#{event.address.to_s} | #{event.database_name}.#{event.command_name}"
end
end
(Make sure this class is auto-loaded in your Rails application.)
Next, you want to attach this subscriber to the client you use to perform commands.
subscriber = CommandLogSubscriber.new
Mongo::Monitoring::Global.subscribe(Mongo::Monitoring::COMMAND, subscriber)
# This is the name of the default client, but it's possible you've defined
# a client with a custom name in config/mongoid.yml
client = Mongoid::Clients.from_name('default')
client.subscribe( Mongo::Monitoring::COMMAND, subscriber)
Now, when Mongoid executes any commands against the database, those commands will be logged to your console.
# For example, if you have a model called Book
Book.create(title: "Narnia")
# => D, [2020-03-27T10:29:07.426209 #43656] DEBUG -- : COMMAND | localhost:27017 | mongoid_test_development.insert | STARTED | {"insert"=>"books", "ordered"=>true, "documents"=>[{"_id"=>BSON::ObjectId('5e7e0db3f8f498aa88b26e5d'), "title"=>"Narnia", "updated_at"=>2020-03-27 14:29:07.42239 UTC, "created_at"=>2020-03-27 14:29:07.42239 UTC}], "lsid"=>{"id"=><BSON::Binary:0x10600 type=uuid data=0xfff8a93b6c964acb...>}}
# => ...
You can modify the CommandLogSubscriber class to do something other than logging (such as incrementing a global counter).

How to speed up a very frequently made query using raw SQL and without ORM?

I have an API endpoint that accounts for a little less than half of the average response time (on averaging taking about 514 ms, yikes). The endpoint simply returns some statistics about stored data scoped to particular time periods, such as this week, last week, this month, and so on...
There are a number of ways that we could reduce it's impact, like getting the clients to hit it less and with more particular queries such as only querying for "this week" when only that data is used. Here we focus on what can be done at the database-level first. In our current implementation we generate this data for all "time scopes" on-the-fly and the number of queries is enormous and made multiple times per second. No caching is used, but maybe there is a way to use Rails's cache_key, or the low-level Rails.cache?
The current implementation look something like this:
class FooSummaries
include SummaryStructs
def self.generate_for(user)
#user = user
summaries = Struct::Summaries.new
TimeScope::TIME_SCOPES.each do |scope|
foos = user.foos.by_scope(scope.to_sym)
summary = Struct::Summary.new
# e.g: summaries.last_week = build_summary(foos)
summaries.send("#{scope}=", build_summary(summary, foos))
end
summaries
end
private_class_method
def self.build_summary(summary, foos)
summary.all_quuz = #user.foos_count
summary.all_quux = all_quux(foos)
summary.quuw = quuw(foos).to_f
%w[foo bar baz qux].product(
%w[quux quuz corge]
).each do |a, b|
# e.g: summary.foo_quux = quux(foos, "foo")
summary.send("#{a.downcase}_#{b}=", send(b, foos, a) || 0)
end
summary
end
def self.all_quuz(foos)
foos.count
end
def self.all_quux(foos)
foos.sum(:quux)
end
def self.quuw(foos)
foos.quuwable.total_quuw
end
def self.corge(foos, foo_type)
return if foos.count.zero?
count = self.quuz(foos, foo_type) || 0
count.to_f / foos.count
end
def self.quux(foos, foo_type)
case foo_type
when "foo"
foos.where(foo: true).sum(:quux)
when "bar"
foos.bar.where(foo: false).sum(:quux)
when "baz"
foos.baz.where(foo: false).sum(:quux)
when "qux"
foos.qux.sum(:quux)
end
end
def self.quuz(foos, foo_type)
case trip_type
when "foo"
foos.where(foo: true).count
when "bar"
foos.bar.where(foo: false).count
when "baz"
foos.baz.where(foo: false).count
when "qux"
foos.qux.count
end
end
end
To avoid making changes to the model, or creating migrations to create a table to store this data (both of which may be valid and better solutions) I decided maybe it would be easier to construct one large sql query that will be executed at once in the hopes that it will be faster to build the query string and execute it without the overhead of active record set up and tear down of SQL queries.
The new approach looks something like this, it is horrifying to me and I know there must be a more elegant way:
class FooSummaries
include SummaryStructs
def self.generate_for(user)
results = ActiveRecord::Base.connection.execute(build_query_for(user))
results.each do |result|
# build up summary struct from query results
end
end
def self.build_query_for(user)
TimeScope::TIME_SCOPES.map do |scope|
time_scope = TimeScope.new(scope)
%w[foo bar baz qux].map do |foo_type|
%[
select
'#{scope}_#{foo_type}',
sum(quux) as quux,
count(*), as quuz,
round(100.0 * (count(*) / #{user.foos_count.to_f}), 3) as corge
from
"foos"
where
"foo"."user_id" = #{user.id}
and "foos"."foo_type" = '#{foo_type.humanize}'
and "foos"."end_time" between '#{time_scope.from}' AND '#{time_scope.to}'
and "foos"."foo" = '#{foo_type == 'foo' ? 't' : 'f'}'
union
]
end
end.join.reverse.sub("union".reverse, "").reverse
end
end
The funny way of replacing the last occurance of union also horrifies but it seems to work. There must be a beter way as there are probably many things that are wrong with the above implementation(s). It may be helpful to note that I use Postgresql and have no problem with writing queries that are not portable to other DB's. Any advice is truly appreciated!
Thanks for reading!
Update: I found a solution that works for me and sped up the endpoint that uses this service object by 500% ! Essentially the idea is, instead of building a query string and then executing it for each set of parameters, we create a prepared statement using prepare followed by an exec_prepared passing in parameters to the query. Since this query is made many times over this is a useful optmization because, as per the documentation:
A prepared statement is a server-side object that can be used to optimize performance. When the PREPARE statement is executed, the specified statement is parsed, analyzed, and rewritten. When an EXECUTE command is subsequently issued, the prepared statement is planned and executed. This division of labor avoids repetitive parse analysis work, while allowing the execution plan to depend on the specific parameter values supplied.
We prepare the query like so:
def prepare_query!
ActiveRecord::Base.transaction do
connection.prepare("foos_summary",
%[with scoped_foos as (
select
*
from
"foos"
where
"foos"."user_id" = $3
and ("foos"."end_time" between $4 and $5)
)
select
$1::text as scope,
$2::text as foo_type,
sum(quux)::float as quux,
sum(eggs + bacon + ham)::float as food,
count(*) as count,
round((sum(quux) / nullif(
(select
sum(quux)
from
scoped_foos), 0))::numeric,
5)::float as quuz
from
scoped_foos
where
(case $6
when 'Baz'
then (baz = 't')
else
(baz = 'f' and foo_type = $6)
end
)
])
end
You can see in this query we use a common table expression for more readability and to avoid writing the same select query twice over.
Then we execute the query, passing in the parameters we need:
def connection
#connection ||= ActiveRecord::Base.connection.raw_connection
end
def query_results
prepare_query! unless query_already_prepared?
#results ||= TimeScope::TIME_SCOPES.map do |scope|
time_scope = TimeScope.new(scope)
%w[bacon eggs ham spam].map do |foo_type|
connection.exec_prepared("foos_summary",
[scope,
foo_type,
#user.id,
time_scope.from,
time_scope.to,
foo_type.humanize])
end
end
end
Where query_already_prepared? is a simple check in the prepared statements table maintained by postgres:
def query_already_prepared?
connection.exec(%(select
name
from
pg_prepared_statements
where name = 'foos_summary')).count.positive?
end
A nice solution, I thought! Hopefully the technique illustrated here will help others with a similar problems.

Speed up rake task by using typhoeus

So i stumbled across this: https://github.com/typhoeus/typhoeus
I'm wondering if this is what i need to speed up my rake task
Event.all.each do |row|
begin
url = urlhere + row.first + row.second
doc = Nokogiri::HTML(open(url))
doc.css('.table__row--event').each do |tablerow|
table = tablerow.css('.table__cell__body--location').css('h4').text
next unless table == row.eventvenuename
tablerow.css('.table__cell__body--availability').each do |button|
buttonurl = button.css('a')[0]['href']
if buttonurl.include? '/checkout/external'
else
row.update(row: buttonurl)
end
end
end
rescue Faraday::ConnectionFailed
puts "connection failed"
next
end
end
I'm wondering if this would speed it up, Or because i'm doing a .each it wouldn't?
If it would could you provide an example?
Sam
If you set up Typhoeus::Hydra to run parallel requests, you might be able to speed up your code, assuming that the Kernel#open calls are what's slowing you down. Before you optimize, you might want to run benchmarks to validate this assumption.
If it is true, and parallel requests would speed it up, you would need to restructure your code to load events in batches, build a queue of parallel requests for each batch, and then handle them after they execute. Here's some sketch code.
class YourBatchProcessingClass
def initialize(batch_size: 200)
#batch_size = batch_size
#hydra = Typhoeus::Hydra.new(max_concurrency: #batch_size)
end
def perform
# Get an array of records
Event.find_in_batches(batch_size: #batch_size) do |batch|
# Store all the requests so we can access their responses later.
requests = batch.map do |record|
request = Typhoeus::Request.new(your_url_build_logic(record))
#hydra.queue request
request
end
#hydra.run # Run requests in parallel
# Process responses from each request
requests.each do |request|
your_response_processing(request.response.body)
end
end
rescue WhateverError => e
puts e.message
end
private
def your_url_build_logic(event)
# TODO
end
def your_response_processing(response_body)
# TODO
end
end
# Run the service by calling this in your Rake task definition
YourBatchProcessingClass.new.perform
Ruby can be used for pure scripting, but it functions best as an object-oriented language. Decomposing your processing work into clear methods can help clarify your code and help you catch things like Tom Lord mentioned in the comments on your question. Also, instead of wrapping your whole script in a begin..rescue block, you can use method-level rescues as in #perform above, or just wrap #hydra.run.
As a note, .all.each is a memory hog, and is thus considered a bad solution to iterating over records: .all loads all of the records into memory before iterating over them with .each. To save memory, it's better to use .find_each or .find_in_batches, depending on your use case. See: http://api.rubyonrails.org/classes/ActiveRecord/Batches.html

How to DRY a list of functions in ruby that are differ only by a single line of code?

I have a User model in a ROR application that has multiple methods like this
#getClient() returns an object that knows how to find certain info for a date
#processHeaders() is a function that processes output and updates some values in the database
#refreshToken() is function that is called when an error occurs when requesting data from the object returned by getClient()
def transactions_on_date(date)
if blocked?
# do something
else
begin
output = getClient().transactions(date)
processHeaders(output)
return output
rescue UnauthorizedError => ex
refresh_token()
output = getClient().transactions(date)
process_fitbit_rate_headers(output)
return output
end
end
end
def events_on_date(date)
if blocked?
# do something
else
begin
output = getClient().events(date)
processHeaders(output)
return output
rescue UnauthorizedError => ex
refresh_token()
output = getClient().events(date)
processHeaders(output)
return output
end
end
end
I have several functions in my User class that look exactly the same. The only difference among these functions is the line output = getClient().something(date). Is there a way that I can make this code look cleaner so that I do not have a repetitive list of functions.
The answer is usually passing in a block and doing it functional style:
def handle_blocking(date)
if blocked?
# do something
else
begin
output = yield(date)
processHeaders(output)
output
rescue UnauthorizedError => ex
refresh_token
output = yield(date)
process_fitbit_rate_headers(output)
output
end
end
end
Then you call it this way:
handle_blocking(date) do |date|
getClient.something(date)
end
That allows a lot of customization. The yield call executes the block of code you've supplied and passes in the date argument to it.
The process of DRYing up your code often involves looking for patterns and boiling them down to useful methods like this. Using a functional approach can keep things clean.
Yes, you can use Object#send: getClient().send(:method_name, date).
BTW, getClient is not a proper Ruby method name. It should be get_client.
How about a combination of both answers:
class User
def method_missing sym, *args
m_name = sym.to_s
if m_name.end_with? '_on_date'
prop = m_name.split('_').first.to_sym
handle_blocking(args.first) { getClient().send(prop, args.first) }
else
super(sym, *args)
end
end
def respond_to? sym, private=false
m_name.end_with?('_on_date') || super(sym, private)
end
def handle_blocking date
# see other answer
end
end
Then you can call "transaction_on_date", "events_on_date", "foo_on_date" and it would work.

Rails more idiomatic way of adding up values

I am working on a survey app, and an Organisation has 0 or more SurveyGroups which have 0 or more Members who take the survey
For a SurveyGroup I need to know how many surveys still need to be completed so I have this method:
SurveyGroup#surveys_outstanding
def surveys_outstanding
respondents_count - surveys_completed
end
But I also need to know how many surveys are outstanding at an organisational level, so I have a method like below, but is there a more idiomatic way to do this with Array#inject or Array#reduce or similar?
Organisation#surveys_pending
def surveys_pending
result = 0
survey_groups.each do |survey_group|
result += survey_group.surveys_outstanding
end
result
end
Try this:
def surveys_pending
#surveys_pending ||= survey_groups.map(&:surveys_outstanding).sum
end
I'm using memoization in case it is slow to calculate
def surveys_pending
survey_groups.inject(0) do |result, survey_group|
result + survey_group.surveys_outstanding
end
end

Resources