Rails: Benchmarking rails ActiveRecord Queries - ruby-on-rails

I'm looking to benchmark a couple of my ActiveRecord requests in my app. What's the simplest way in the console to benchmark something like
User.find_by_name("Joe").id
versus
User.find(:first, :select => :id, :conditions => ["name = ?","Joe"]).id
Thanks

This question is a bit old and needs an updated answer. The easiest way to benchmark the query outside of a production scenario is to run it in the Rails console (the benchmarker script isn't in Rails anymore). There you can simply use the Benchmark class built into Ruby. Run the following in the Rails console:
puts Benchmark.measure { User.find_by_name("Joe").id }
puts Benchmark.measure { User.find(:first, :select => :id, :conditions => ["name = ?","Joe"]).id }
I'd run the above 5 times, discard the min and max, and take the average cost of the remaining three runs to figure out which query is going to give you better performance.
This is the most accurate way to get the true cost of a query, since the Rails log doesn't show you the cost of actually constructing your objects. So while @Slobodan Kovacevic's answer is correct in that the log shows you how long the query takes, the log doesn't give you object construction time, which may be lower for your second query since you're only populating a single field instead of all the user fields.
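The five-run procedure described above (drop the fastest and slowest run, average the rest) can be sketched in plain Ruby. The helper name `trimmed_mean_runtime` and the stand-in workload are assumptions for illustration; in a Rails console the block would contain the query instead.

```ruby
require "benchmark"

# Time the block `runs` times, drop the fastest and slowest run,
# and return the mean wall-clock time (in seconds) of the rest.
# (trimmed_mean_runtime is a hypothetical helper, not part of Rails or Ruby.)
def trimmed_mean_runtime(runs = 5)
  raise ArgumentError, "need at least 3 runs" if runs < 3

  timings = Array.new(runs) { Benchmark.realtime { yield } }
  kept = timings.sort[1..-2] # discard the min and the max
  kept.sum / kept.size
end

# Stand-in workload; in a Rails console you would put the query here, e.g.
#   trimmed_mean_runtime { User.find_by_name("Joe").id }
average = trimmed_mean_runtime { (1..10_000).reduce(:+) }
puts format("average of middle runs: %.6fs", average)
```

Because the block is executed inside the timer, the measurement includes object construction as well as the database round trip.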

In development mode each query is timed and logged in log/development.log. You'll have lines like:
Ad Load (1.4ms) SELECT "ads".* FROM "ads" ORDER BY created_at DESC

Use script/performance/benchmarker:
script/performance/benchmarker 2000 "User.find_by_name('Joe').id" "User.first(:conditions => {:name => 'Joe'}, :select => 'id').id"
On my dev machine, this reports:
        user     system      total        real
#1  1.110000   0.070000   1.180000 (  1.500366)
#2  0.800000   0.050000   0.850000 (  1.078444)
Thus, the 2nd method appears to be faster, since it has less work to do. Of course, you should benchmark this on your production machine, using the production environment:
RAILS_ENV=production script/performance/benchmarker 2000 "User.find_by_name('Joe').id" "User.first(:conditions => {:name => 'Joe'}, :select => 'id').id"
The results may differ somewhat under production conditions.
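On Rails versions that no longer ship script/performance/benchmarker, a similar side-by-side table can be produced with Ruby's own Benchmark.bm. This is only a sketch: the two lambdas are hypothetical stand-ins for the two queries, and the iteration count simply mirrors the 2000 used above.

```ruby
require "benchmark"

iterations = 2000

# Hypothetical stand-ins for the two queries being compared.
# In a Rails console these would call the real finders.
full_lookup    = -> { { id: 1, name: "Joe", email: "joe@example.com" }[:id] }
id_only_lookup = -> { { id: 1 }[:id] }

# Benchmark.bm prints a user/system/total/real table and
# returns one Benchmark::Tms object per report.
results = Benchmark.bm(12) do |bm|
  bm.report("full row:") { iterations.times { full_lookup.call } }
  bm.report("id only:")  { iterations.times { id_only_lookup.call } }
end
```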

Related

How can I compose/chain queries in Rails 2?

I have one query that gets the total count of rows with one condition, and a second query that gets the total count of rows with the same condition plus another condition. Ideally, I wouldn't repeat myself in the code and could instead just chain/compose the extra condition onto the first query.
I'm thinking of something like this.
query1 = Table.find(:all, :conditions => "condition1")
query2 = query1.find(:all, :conditions => "condition2")
It'd also be nice to find out what this looks like for the Table.count use case, since that's what I'm actually trying to do at the moment.
I'm guessing that the ActiveRecord::Base has some method that will return the query object as opposed to executing it, but I haven't found that in the docs.
Although Rails 3 makes this significantly easier, you can always do it in Rails 2 with a little hack that emulates it:
# config/initializers/rails2_where_scope.rb
class ActiveRecord::Base
  named_scope :where, lambda { |conditions|
    { :conditions => conditions }
  }
end
This way you can chain together multiple conditions in a manner that's forward-compatible with Rails 3:
query2 = Table.where(condition1).where(condition2).all
Rails 3 uses Arel to build most of its SQL, which is why it's much more flexible than Rails 2.
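The idea of a query object that accumulates conditions and is only turned into SQL at the end can be illustrated without Rails. The sketch below is not how ActiveRecord or Arel is implemented; `LazyQuery` and its methods are hypothetical names, and conditions are treated as raw SQL fragments.

```ruby
# Hypothetical LazyQuery: collects conditions and builds SQL only on request.
class LazyQuery
  def initialize(table, conditions = [])
    @table = table
    @conditions = conditions
  end

  # Return a new query with one more condition; nothing is executed here.
  def where(condition)
    LazyQuery.new(@table, @conditions + [condition])
  end

  def to_sql
    sql = "SELECT * FROM #{@table}"
    sql += " WHERE " + @conditions.map { |c| "(#{c})" }.join(" AND ") unless @conditions.empty?
    sql
  end

  # Same conditions, but counting rows instead of selecting them.
  def to_count_sql
    to_sql.sub("SELECT *", "SELECT COUNT(*)")
  end
end

base     = LazyQuery.new("tables").where("condition1")
narrowed = base.where("condition2")

puts base.to_count_sql
puts narrowed.to_count_sql
```

Since `where` returns a new object each time, `base` can be reused for the first count while `narrowed` adds the second condition without repeating the first.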

Rails 3 select random follower query efficiency

I have a method that selects 5 random users who are following a certain user, and adds them to an array.
Relationship.find_all_by_followee_id( user.id ).shuffle[0,4].each do |follower|
follower = User.find(follower.user_id)
array.push follower
end
return array
I'm wondering, is this an efficient way of accomplishing this? My main concern is with the find_all_by_followee_id call. This returns a list of all the relationships where the specified user is being followed (this could be in the 100,000s). And then I shuffle that entire list, and then I trim it to the first 5. Is there a more efficient way to do this?
You can try this:
Relationship.find_all_by_followee_id( user.id, :order => 'rand()', :limit => 5 ).each do |follower|
  follower = User.find(follower.user_id)
  array.push follower
end
return array
By the way, this will work with MySQL. If you are using PostgreSQL or another database, you may need to replace rand() with a random function that your database supports (PostgreSQL, for example, uses random()).
Some minor changes to make it a little cleaner:
return Relationship.find_all_by_followee_id( user.id, :order => 'rand()', :limit => 5 ).collect {|follower| User.find(follower.user_id) }
You can also use a join in there in order to prevent the 5 selects but it won't make much difference.
Edit1:
As @mike.surowiec mentioned:
"Just for everyone's benefit, translating this to the non-deprecated active record query syntax looks like this:"
Relationship.where(:followee_id => user.id).order( "random()" ).limit( 5 ).collect {|follower| User.find(follower.user_id) }
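One detail in the question's original snippet is worth checking in plain Ruby: `shuffle[0,4]` takes four elements starting at index 0, not five. The sketch below uses a made-up array of ids to show the difference and two ways to pick five elements in memory. Letting the database do the ordering and limiting, as in the answers above, still avoids loading the full list.

```ruby
# Made-up ids standing in for loaded relationship records.
ids = (1..20).to_a

four      = ids.shuffle[0, 4]     # start at index 0, length 4
five      = ids.shuffle.first(5)  # first five of a shuffled copy
also_five = ids.sample(5)         # five random elements at distinct positions

puts four.size
puts five.size
puts also_five.size
```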

In Rails, how to get the actual Query when an ActiveRecord query executes

I am using a simple query in ActiveRecord which does something like this.
MyTable.find(:all, :conditions => {:start_date => format_time(params[:date]) })
I want to get the equivalent query that is executed in the background, perhaps using a puts statement or something similar to that. MySQL is my database.
You can see the SQL query that is executed by viewing the development log located in log/development.log. Note that the script/server command tails this log file by default.
In Rails 3 you can append a .to_sql method call to the end of the finder to output the SQL.
Alternatively, New Relic's free RPM Lite gem lets you see the SQL queries in developer mode as well as lots of other useful performance tuning information.
You can install Mongrel; it prints all SQL queries to your terminal. Alternatively, run the ActiveRecord query in script/console.
In development mode, your DB queries are printed to the server console; you can locate the query there.
You are looking for construct_finder_sql.
>> User.send(:construct_finder_sql, {:conditions => { :username => 'joe' }})
=> "SELECT * FROM \"users\" WHERE (\"users\".\"username\" = E'joe') "
>> User.scoped(:joins => :orders, :conditions => { :username => 'joe' }).send(:construct_finder_sql, {})
=> "SELECT \"users\".* FROM \"users\" INNER JOIN \"orders\" ON orders.user_id = users.id WHERE (\"users\".\"username\" = E'joe') "

Ruby on Rails: setting future "publishing" dates for blog postings

I'm trying to set blog postings to publish at certain dates in the future. I have in my Posting model:
named_scope :published, :conditions => ["publish_at <= ?", Time.now]
I'm using this in my controller to call the published postings:
@postings = Posting.published
The development server works fine, but I believe the production server needs me to refresh the cache (using "pkill -9 dispatch.fcgi") or I won't see the new postings when it's supposed to publish.
Is there any way to set future times for the postings' publishing dates correctly on the production server? Do I have to refresh the cache every time?
You are correct, because the named scope's conditions (including Time.now) are evaluated only once, when the class loads.
You should rewrite it to be dynamic or (maybe better) use the database's now() function.
Either of these should work:
named_scope :published, lambda { {:conditions => ["publish_at <= ?", Time.now]} }
Note how this uses a lambda to always return the current time in the conditions hash.
named_scope :published, :conditions => "publish_at <= now()"
This is database dependent (the above should work for MySQL) but probably a tiny bit faster.
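The difference between the original scope and the lambda version comes down to when Time.now is evaluated. A plain-Ruby sketch (no Rails required; the variable names are only for illustration):

```ruby
# Evaluated once, when this line runs. A named_scope given a plain hash
# behaves the same way: its Time.now is fixed at class load.
eager_conditions = ["publish_at <= ?", Time.now]

# Evaluated again on every call, so the timestamp is always current.
lazy_conditions = lambda { ["publish_at <= ?", Time.now] }

first_call = lazy_conditions.call
sleep 0.01
second_call = lazy_conditions.call

puts "eager timestamp:   #{eager_conditions[1].to_f}"
puts "lazy, first call:  #{first_call[1].to_f}"
puts "lazy, second call: #{second_call[1].to_f}"
```

The eager timestamp never moves, which is why a long-running production process keeps comparing publish_at against the moment it booted.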
Check to see if you have any of the following statements in your production environment:
ActionController::Base.cache_store = :memory_store
OR
ActionController::Base.cache_store = :file_store, "/path/to/cache/directory"
OR
ActionController::Base.cache_store = :mem_cache_store
OR any other setting for ActionController::Base.cache_store

Better Performance on Associations

Right now I have a table called Campaigns that has many Hits. If I call, say:
Campaign.find(30).hits
it takes about 4 seconds (4213 ms).
If I call this instead:
campaign = Campaign.find(30)
campaign.hits.count
Does it still load all of the hits, then count? Or does it see I am counting and avoids loading all of the hits? (Which is currently 300,000+ rows).
I am trying to figure out a smart way to load/count my hits. I am thinking about adding a method to my Campaign.rb model, like:
def self.total_hits
find :first, :select => 'COUNT(id) as hits', :conditions => ["campaign_id = ?", self.id]
end
I know that query won't load from the hits table, but that is just an example of counting it with a self-made query, as opposed to Ruby on Rails doing this for me.
Would this memcache query be more efficient? (I have it running, but it doesn't seem to be any better/faster/slower, just the same speed.)
def self.hits
Rails.cache.fetch("Campaign_Hits_#{self.campaign_id}", :expires_in => 40) {
find(:first, :select => 'COUNT(id) as hits', :conditions => ["campaign_id = ?", self.campaign_id]).hits
}
end
Any suggestions would be great!
How about:
Campaign.find(30).hits.count
You might also consider adding the following in hit.rb (assuming a one-to-many relationship between campaigns and hits).
belongs_to :campaign, :counter_cache => true
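You then need an integer column in the campaigns table called hits_count. This will avoid hitting the hits table altogether if you're only getting the count. Read the cached value with campaign.hits.size or campaign.hits_count; campaign.hits.count still issues a COUNT query.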
You can check the API for the full rundown.
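The bookkeeping that a counter cache performs can be sketched in plain Ruby: the parent stores a running total that is bumped whenever a child is created, so reading the count never touches the children. The classes below are hypothetical in-memory stand-ins, not ActiveRecord models, and they only cover creation (a real counter cache also decrements on destroy).

```ruby
# Hypothetical in-memory models; no database involved.
class Campaign
  attr_reader :id
  attr_accessor :hits_count

  def initialize(id)
    @id = id
    @hits_count = 0
  end
end

class Hit
  attr_reader :campaign

  # Creating a hit bumps the parent's stored counter, which mirrors
  # what :counter_cache => true arranges when a record is created.
  def initialize(campaign)
    @campaign = campaign
    campaign.hits_count += 1
  end
end

campaign = Campaign.new(30)
3.times { Hit.new(campaign) }

# Reading the count is a single attribute lookup on the campaign.
puts campaign.hits_count
```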
My ActiveRecord might be a little rusty, so forgive me if so, but IIRC Campaign.find(30).hits is at least two separate queries. How does Campaign.find(30, :include => [ :hits ]).hits do? That should perform a single query.
