ActiveRecord SQL execution time - ruby-on-rails

How can I get the SQL query execution time in rails?
I can see the time in logs, e.g.:
Posting Load (10.8ms) SELECT "postings".* FROM "postings" ORDER BY "postings"."id" ASC LIMIT 1
But how can I get that value (10.8) programmatically to use it further in my code?

You can use ActiveSupport::Notifications to retrieve the runtime of the SQL statements.
You should be able to subscribe to sql.active_record to see how long each SQL call takes:
ActiveSupport::Notifications.subscribe('sql.active_record') do |*args|
  event = ActiveSupport::Notifications::Event.new(*args)
  event.duration # how long the SQL command took to run, in milliseconds
end
This code was not tested, so I can't guarantee it works, but it should

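Try this out. Note that Post.limit(1) on its own only builds a lazy relation, so the block has to force the query (for example with to_a) for anything to be measured; .real gives the elapsed wall-clock time in seconds:
time_consumed = Benchmark.measure { Post.limit(1).to_a }.real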
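If wall-clock time around a call is enough, a small wrapper built on Ruby's standard Benchmark.realtime could return both the block's result and the elapsed milliseconds. This is only a sketch: timed_ms is a hypothetical helper name, and the figure it reports includes Ruby and ActiveRecord overhead, so it will not match the SQL runtime printed in the log exactly.

```ruby
require 'benchmark'

# Hypothetical helper: runs the block and returns [result, elapsed_ms].
def timed_ms
  result = nil
  seconds = Benchmark.realtime { result = yield }
  [result, seconds * 1000.0]
end

# In a Rails app the block would force the query to run, e.g.:
#   postings, ms = timed_ms { Posting.limit(1).to_a }
# A plain computation stands in for the query here so the sketch runs anywhere.
value, ms = timed_ms { (1..1000).sum }
puts "took #{ms.round(3)} ms"
```

Returning the result alongside the timing means the measured call does not have to be repeated just to get its value.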

Related

Deadlocks in PostgreSQL when running a simple UPDATE

update cities set cdb_data = NULL, updated_at = now() where cities.id = 1;
We loop through cities and update cities with cdb_data as a part of rails code, however we keep getting the below error.
ActiveRecord::StatementInvalid: PG::TRDeadlockDetected: ERROR: deadlock detected
DETAIL: Process 26741 waits for ShareLock on transaction 2970537161; blocked by process 26818.
Process 26818 waits for ShareLock on transaction 2970537053; blocked by process 26741.
HINT: See server log for query details.
CONTEXT: while updating tuple (39,15) in relation "cities"
UPDATE "cities" SET "cdb_data" = $1, "updated_at" = $2 WHERE "cities"."id" = $3
Ruby code that updates the city object
city = City.find_or_create_by(uuid: city_data['uuid'])
city.name = city_data['name']
city.state_id = city_data['state_id']
city.cdb_data = city_data
city.save
I have no idea which record triggers this error, or why.
Even with the production dump loaded locally or in staging, this doesn't seem to happen.
Any help would be much appreciated.
I am running the server on Heroku, so I am not sure I can see the Postgres logs.
Two such transactions can easily deadlock.
To avoid that problem make sure that when you “loop through the cities”, you always do so in the same order, using something like:
FOR c IN
   SELECT * FROM city
   WHERE /* whatever */
   ORDER BY city.id
LOOP
   /* perform the update */
END LOOP;
To find what is locking the update query, one could use
SELECT pg_blocking_pids(<pid of the query that is locked>);
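On the Rails side, the same "always in the same order" idea could be sketched by sorting the incoming payloads before looping over them. This assumes that each worker updates several cities inside one transaction and that uuid identifies exactly one row. in_lock_order is a hypothetical helper name, and any key works as long as every worker sorts by the same one.

```ruby
# Hypothetical helper: returns the payloads in one canonical order (by uuid),
# so that concurrent workers would touch the rows in the same sequence.
def in_lock_order(cities_data)
  cities_data.sort_by { |city_data| city_data['uuid'] }
end

# Sample payloads (made up for illustration).
batch = [
  { 'uuid' => 'c3', 'name' => 'Third' },
  { 'uuid' => 'a1', 'name' => 'First' },
  { 'uuid' => 'b2', 'name' => 'Second' }
]

ordered = in_lock_order(batch)

# In the Rails code from the question, the loop would then look like:
#   in_lock_order(batch).each do |city_data|
#     city = City.find_or_create_by(uuid: city_data['uuid'])
#     # ... assign attributes and save ...
#   end
puts ordered.map { |city_data| city_data['uuid'] }.inspect
```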

Using limit and offset in rails together with updated_at and find_each - will that cause a problem?

I have a Ruby on Rails project in which there are millions of products with different urls. I have a function "test_response" that checks the url and returns either a true or false for the Product attribute marked_as_broken, either way the Product is saved and has its "updated_at"-attribute updated to the current Timestamp.
Since this is a very tedious process, I have created a task which in turn starts 15 tasks, each with N/15 products to check. The first one should check, for example, the first to the 10,000th, the second one the 10,000th to the 20,000th, and so on, using limit and offset.
This script works fine in that it starts 15 processes, but one after another they complete far too early. They do not crash; each finishes with "Process exited with status 0".
My guess is that using find_each together with a condition on updated_at, while the script itself updates "updated_at", changes the result set as it runs, so the script does not go through the 10,000 items as intended, but I can't verify this.
Is there something inherently wrong with what I am doing here? For example, does "find_each" run a new SQL query once in a while, returning completely different results each time than anticipated? I expect it to return the same 10,000 to 20,000 range, just split into pieces.
task :big_response_launcher => :environment do
  nbr_of_fps = Product.where(:marked_as_broken => false).where("updated_at < '" + 1.year.ago.to_date.to_s + "'").size.to_i
  nbr_of_processes = 15
  batch_size = ((nbr_of_fps / nbr_of_processes))-2
  heroku = PlatformAPI.connect_oauth(auth_code_provided_elsewhere)
  (0..nbr_of_processes-1).each do |i|
    puts "Launching #{i.to_s}"
    current_offset = batch_size * i
    puts "rake big_response_tester[#{current_offset},#{batch_size}]"
    heroku.dyno.create('kopa', {
      :command => "rake big_response_tester[#{current_offset},#{batch_size}]",
      :attach => false
    })
  end
end
task :big_response_tester, [:current_offset, :batch_size] => :environment do |task,args|
  current_limit = args[:batch_size].to_i
  current_offset = args[:current_offset].to_i
  puts "Launching with offset #{current_offset.to_s} and limit #{current_limit.to_s}"
  Product.where(:marked_as_broken => false).where("updated_at < '" + 1.year.ago.to_date.to_s + "'").limit(current_limit).offset(current_offset).find_each do |fp|
    fp.test_response
  end
end
As many have noted in the comments, it seems like using find_each will ignore the order and limit. I found this answer (ActiveRecord find_each combined with limit and order) that seems to be working for me. It's not working 100% but it is a definite improvement. The rest seems to be a memory issue, i.e. I cannot have too many processes running at the same time on Heroku.
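One alternative to limit and offset, sketched here under assumptions, is to give each worker a fixed range of primary keys. The id of a row does not change when updated_at is touched, so a worker's slice stays stable while the script runs. id_ranges is a hypothetical helper, and the ranges will not necessarily contain equal numbers of matching products.

```ruby
# Hypothetical helper: splits the inclusive id span min_id..max_id into
# at most `workers` contiguous, non-overlapping ranges.
def id_ranges(min_id, max_id, workers)
  return [] if max_id < min_id || workers < 1

  span = max_id - min_id + 1
  size = (span.to_f / workers).ceil
  (min_id..max_id).step(size).map do |start|
    start..[start + size - 1, max_id].min
  end
end

ranges = id_ranges(1, 100, 3)
ranges.each { |r| puts "worker gets ids #{r.first}..#{r.last}" }

# Each worker task could then run something like (not executed here):
#   Product.where(id: range)
#          .where(marked_as_broken: false)
#          .where('updated_at < ?', 1.year.ago)
#          .find_each { |fp| fp.test_response }
```

The launcher would pass the first and last id of each range to the worker task in place of the offset and batch size.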

ActiveRecord includes: How to get access to the results of multiple queries that were performed by ActiveRecord?

I have the following query that loads associations:
contacts = current_user.contacts.includes(:contact_lists).where(id: subscribed_contact_ids)
[DEBUG] Contact Load (2.6ms) SELECT "contacts".* FROM "contacts" WHERE "contacts"."user_id" = 7 AND "contacts"."id" IN (4273, 4275, 4277, 4278, 4281, 4285, 4297, 4305, 4307, 4308, 4315, 4318, 4323, 4326, 4331, 4333, 4344, 4349, 4359, 4361, 4368, 4372, 4373, 4378, 4382, 4389, 4392, 4394, 4404, 4428, 4450, 4469, 4473, 4489, 4490, 4495, 4497, 4498, 4501, 4505, 4514, 4520, 4525, 4536, 4545, 4554, 4555, 4561, 4568, 4572)
[DEBUG] Subscription Load (0.8ms) SELECT "subscriptions".* FROM "subscriptions" WHERE "subscriptions"."contact_id" IN (4273, 4275, 4277, 4278, 4281, 4285, 4297, 4305, 4307, 4308, 4315, 4318, 4323, 4326, 4331, 4333, 4344, 4349, 4359, 4361, 4368, 4372, 4373, 4378, 4382, 4389, 4392, 4394, 4404, 4428, 4450, 4469, 4473, 4489, 4490, 4495, 4497, 4498, 4501, 4505, 4514, 4520, 4525, 4536, 4545, 4554, 4555, 4561, 4568, 4572)
[DEBUG] ContactList Load (0.5ms) SELECT "contact_lists".* FROM "contact_lists" WHERE "contact_lists"."id" IN (9, 8)
I also need to get the distinct contact_lists; basically I need to load the result of the last query separately.
So the question is: is there a way to load these contact lists without running the complex queries again?
The option to iterate through each record to get the contact_lists and then to remove duplicates is not attractive at all.
The other option is to run a complex join query again (which does not seem good either):
contact_list_ids = current_user.subscriptions.select('distinct contact_list_id').where(contact_id: contact_ids).pluck(:contact_list_id)
current_user.contact_lists.where(id: contact_list_ids)
Those queries are cached somewhere. Is there a way to access the query cache directly?
Here is a solution for your problem: "How do I get the last SQL query performed by ActiveRecord in Ruby on Rails?" But it requires you to monkey patch Rails, which might not be a great idea.
I would rather iterate over the contacts and store the contact_lists in a hash with their id as key. Then you do not need to worry about duplicates
contact_lists = {}
contacts.each { |c| c.contact_lists.each { |cl| contact_lists[cl.id] = cl } }
or something.
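The same de-duplication can be written more compactly with flat_map and uniq. The sketch below uses plain Struct stand-ins (FakeContact and FakeList are hypothetical names) so it runs without a database. With real records loaded through includes(:contact_lists), reading c.contact_lists should use the already preloaded association and not issue further queries.

```ruby
# Stand-ins for the ActiveRecord models, for illustration only.
FakeList = Struct.new(:id, :name)
FakeContact = Struct.new(:id, :contact_lists)

newsletter = FakeList.new(8, 'Newsletter')
promos = FakeList.new(9, 'Promos')

contacts = [
  FakeContact.new(4273, [newsletter, promos]),
  FakeContact.new(4275, [promos])
]

# Collect every associated list, then keep one entry per list id.
distinct_lists = contacts.flat_map(&:contact_lists).uniq(&:id)
puts distinct_lists.map(&:name).inspect
```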

Building an ActiveRecord relation without having it execute the query

I am trying to build a query as follows:
rel = article.where(version: 'some_version')
.joins(:categories)
.merge(Category.where(:uuid => 'some_cat_uuid'))
articles = rel.where(published: true).limit(10)
# etc.
The problem is the first query seems to execute no matter what I do. Am I doing something wrong?
When you run commands in the console, it automatically adds something similar to .inspect at the end to display the results of the command. For instance (this is in my app that I'm working on right now):
irb(main):061:0> Job.where(id: 251000)
Job Load (3.8ms) SELECT "jobs".* FROM "jobs" WHERE "jobs"."deleted_at" IS NULL AND "jobs"."id" = 251000
=> [#<Job id: 251000, {...}>]
So, your first line of code is just fine and would not normally execute the query, but since you were running it in the console it executes immediately so that it can display the results for you.
One way to get around this is to add ; nil to the end of the command, so the console won't attempt to display the results (it'll just display nil as the result of that line), i.e.:
irb(main):062:0> Job.where(id: 251000); nil
=> nil
Doing it this way you should be able to do what you were expecting (delay execution of the query until you actually need the results):
rel = article.where(version: 'some_version')
.joins(:categories)
.merge(Category.where(:uuid => 'some_cat_uuid')); nil
articles = rel.where(published: true).limit(10); nil
Then you can execute the query by using articles.all (in Rails 3) or articles.to_a (in Rails 4)
Of course if you then move this code to a rake task or model or something you can drop those ; nil bits because they look a little cluttered and would be useless at that point.
Another potential gotcha in the console is that it may see .where() followed by a newline and execute the query at that point. I tend to put the dot at the end of the previous line to remove any ambiguity about where my command ends:
rel = article.where(version: 'some_version').
joins(:categories).
merge(Category.where(:uuid => 'some_cat_uuid')); nil
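To make the mechanism concrete, here is a toy model of a lazily evaluated relation. It is a plain-Ruby stand-in, not ActiveRecord itself, and ToyRelation with its methods is made up for illustration. Chaining only builds new objects; the "query" runs when the results are needed, which includes the inspect call a console makes to print a value.

```ruby
# Toy stand-in for a lazily evaluated relation; NOT ActiveRecord itself.
class ToyRelation
  def initialize(conditions = [], log = [])
    @conditions = conditions
    @log = log # shared record of every "query" that actually ran
  end

  # Chaining only builds a new object; nothing is executed here.
  def where(condition)
    ToyRelation.new(@conditions + [condition], @log)
  end

  # Loading is the step that "runs the query".
  def to_a
    @log << @conditions.join(' AND ')
    []
  end

  # A console prints results via inspect, which forces a load.
  def inspect
    to_a.inspect
  end

  def executed_queries
    @log
  end
end

rel = ToyRelation.new.where('version = 1')
articles = rel.where('published = true')
before = articles.executed_queries.size # only chaining so far
articles.inspect                        # what the console does implicitly
after = articles.executed_queries.size
puts "queries run before inspect: #{before}, after: #{after}"
```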

"FOR UPDATE" clause is throwing error in esql program

We are developing a migration program. There are nearly 80 million records in the DB. The code is as follows:
static int mymigration(struct progargs *args)
{
    exec sql begin declare section;
    const char *selectQuery;
    const char *updateQuery;
    long cur_start;
    long cur_end;
    long serial;
    long number;
    char frequency[3];
    exec sql end declare section;
    selectQuery = "select * from mytable where number >= ? and number <= ? for update of frequency ,status";
    updateQuery = "update mytable set frequency = ?, "
                  " status = ? "
                  " where current of my_cursor";
    cur_start= args->start;
    cur_end = args->end;
    exec sql prepare my_select_query from :selectQuery;
    /* Verify the sql code for error here */
    exec sql declare my_select_cursor cursor with hold for my_select_query;
    exec sql open my_select_cursor using :cur_start, :cur_end;
    /* Verify the sql code for error here */
    exec sql prepare my_update_query from :updateQuery;
    /* Verify the sql code for error here */
    while (1)
    {
        number = 0;
        serial = 0;
        memset(frequency,0,sizeof(frequency));
        exec sql fetch my_select_cursor into number,:serial,:frequency;
        if (sqlca.sqlcode != SQL_OK)
            break;
        exec sql execute my_update_query using :frequency, :frequency;
    }
    exec sql close my_select_trade_cursor;
}
While implementing this, we are getting the error "-255". One solution we found is to add BEGIN WORK and COMMIT WORK. Since we have a large amount of data, this might clutter the transaction log.
Is there any other solution available for this problem? The IBM website for Informix shows that the usage is correct.
Appreciate the help in advance.
Thanks,
Mathew Liju
Error -255 is "Not in transaction".
I see no BEGIN WORK (or COMMIT WORK or ROLLBACK WORK) statements.
You need to add BEGIN WORK before you open the cursor with the FOR UPDATE clause. You then need to decide whether to commit periodically to avoid overlong transactions. The fact that you use a WITH HOLD cursor shows that you had thought about using sub-transactions; if you were not going to do so, you would not use that clause.
Note that Informix has 3 primary database logging modes:
Unlogged (no transaction support)
Logged (by default, each statement is a singleton transaction; an explicit BEGIN WORK starts a multi-statement transaction terminated by COMMIT WORK or ROLLBACK WORK).
Logged MODE ANSI (slightly simplistically, you are automatically in a transaction; you need an explicit COMMIT or ROLLBACK to terminate a transaction, and may then, optionally, use an explicit BEGIN, but the BEGIN is not actually necessary).
From the symptoms you describe, you have a logged but not MODE ANSI database. Therefore, you must explicitly code the BEGIN WORK statements.
