Can I combine these two #update_all lines into one line? - ruby-on-rails

I wrote a migration that does the following:
Event.update_all 'tom_cancelled = false', 'tom_cancelled IS NULL'
Event.update_all 'jerry_cancelled = false', 'jerry_cancelled IS NULL'
Can (and if I can, how do) I combine these together to dry it up? Would I use a block?

Not with a single update_all call as written, because each column has its own condition. The alternative is to load the records and loop over them in ActiveRecord, but that doesn't make sense: you'd end up with one UPDATE query for every record in the result set, plus one for the SELECT.
As written, you run only two queries, no matter how many records are in the result set.
There is no need to abstract the current code any further, especially since it runs inside a migration.
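If the repetition still bothers you, one option is to loop over the column names rather than over the records, which keeps it at two queries. This is only a sketch: it builds the argument pairs in plain Ruby, and the commented line assumes the two-argument update_all(updates, conditions) form already used in the question.

```ruby
# Column names taken from the question's migration.
columns = %w[tom_cancelled jerry_cancelled]

# One [set_clause, condition] pair per column.
update_args = columns.map do |column|
  ["#{column} = false", "#{column} IS NULL"]
end

# In the migration, each pair would be passed on like this (assumption:
# the two-argument update_all form from the question is available):
#   update_args.each { |set_clause, condition| Event.update_all(set_clause, condition) }
```

Whether this is clearer than two explicit lines is a matter of taste; for two columns the original is arguably easier to read.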

Related

Hash/Array to Active Record

I have been searching everywhere but I can't seem to find this problem anywhere. In Rails 5.0.0.beta3 I need to sort my @record = user.records using an association and its records.
The sort goes something like this.
@record = @record.sort_by { |rec|
If user.fav_record.find(rec.id)
User.fav_record(rec.id).created_at
Else
rec.created_at
End
This is just an example of what I do. But everything sorts fine.
The problem:
This returns an array and not an Active Record Class.
I've tried everything to get this to return an Active Record Class. I've pushed the sorted elements into an ID array and tried to extract them in that order, and I've tried mapping. Every result that I get turns my previous active record into an array or hash. Now I need it to go back into an active record. Does anyone know how to convert an array or hash of that active record back into an Active Record class?
Converting an ActiveRecord relation to an array is easy (call to_a), but there isn't a similarly easy way to convert an array back into an ActiveRecord relation.
If you want to optimize the performance of your app, you should try to avoid converting ActiveRecord queries to arrays. Try to keep the object as a query for as long as possible.
That being said, working with arrays is generally easier than working with queries, and it can feel like a hassle to convert a lot of array operations to ActiveRecord query (SQL) code.
It'd be better to write the sort code using the ActiveRecord::QueryMethods (order, joins and so on) or even to write it in plain SQL using find_by_sql.
I don't know what code you should specifically use here, but I do see that your code could be refactored to be clearer. First of all, If and Else should not be capitalized, but I'm assuming that this is just pseudocode and you already realize this. Second, your variable names should be pluralized if they are queries or arrays (i.e. @record.sort_by should be @records.sort_by instead).
It's worth mentioning that ActiveRecord queries are difficult to master and a lot of people just use array operations instead since they're easier to write. If "premature optimization is the root of all evil", it's really not the end of the world if you sacrifice a bit of performance and just keep your array implementation if you're just trying to make an initial prototype. Just make sure that you're not making "n+1" SQL calls, i.e. do not make a database call every iteration of your loop.
Here's an example of an array implementation which avoids the N+1 SQL issue:
# first load all the user's favorites into memory (to_a triggers the query once)
user_fav_records = user.fav_records.select(:id, :created_at).to_a
@records = @records.sort_by do |rec|
  # This is using Array#find on the preloaded array, not the ActiveRecord method
  matching_rec = user_fav_records.find { |x| x.id.eql?(rec.id) }
  if matching_rec
    matching_rec.created_at
  else
    rec.created_at
  end
end
The main difference between this implementation and the code in your question is that I'm avoiding calling ActiveRecord's find on each iteration of the loop. SQL reads and writes are expensive, and you want your code to make as few of them as possible.
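If the list of favorites is large, scanning an array on every iteration can itself get slow. A possible refinement, sketched here in plain Ruby, is to index the favorites by id once and look them up in a hash. The Struct and the integer timestamps are stand-ins for illustration only, not the real models.

```ruby
# Hypothetical stand-in for the ActiveRecord models (a plain Struct).
record_class = Struct.new(:id, :created_at)

# Integers stand in for timestamps to keep the example short.
records     = [record_class.new(1, 30), record_class.new(2, 10), record_class.new(3, 20)]
fav_records = [record_class.new(1, 50), record_class.new(3, 5)]

# Build the lookup once: record id => created_at of the matching favorite.
fav_created_at = fav_records.to_h { |fav| [fav.id, fav.created_at] }

# Sort by the favorite's timestamp when there is one, else by the record's own.
sorted = records.sort_by do |rec|
  fav_created_at.fetch(rec.id, rec.created_at)
end

sorted_ids = sorted.map(&:id)
```

The result is still an array, so this only addresses the cost of the lookup, not the conversion back to a relation.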

Breaking Association/ Relation collection objects into smaller Association/ Relation collections in Ruby on Rails

JRuby, Rails 3
I have a piece of code that queries a number of tables, related through association, returning a combined result set as an ActiveRecord::Relation. My problem is that when this function retrieves a very large result set and tries to do something with it (in my case, create a .xls file), the JVM errors, reporting a GC Memory Heap problem.
The problem is partly down to all these records being held in memory when trying to process the .xls export, as well as JRuby's questionable garbage collector. But all these records should not be processed at once anyway! So my solution is to break these records into smaller chunks, write them to the file and repeat.
However, amongst all my other constraints, the next part of code that I need to use requires a relation object passed to it. Previously, this was the entire result set, but at this point, I've broken it down into smaller bits (for argument's sake, let's say 100 records).
At this point, you're probably thinking: yeah, what's the problem? Well, see my example code below:
#result_set = relation object
result_set.scoped.each_slice(100) do |chunk|
  generic_filter = App::Filter.new(chunk, [:EXCEL_EXPORT_COLUMNS]) #<-- errors here
  #do some stuff
  generic_filter.relation.each_with_index do |work_type, index|
    xls_doc.row(index + 1).concat(generic_filter.values_for_row(work_type))
    DATE_COLUMN_INDEX.each do |column_index|
      xls_doc.row(index + 1).set_format column_index,
        ::Spreadsheet::Format.new(number_format: 'DD-MM-YYYY')
    end
  end
  [...] #some other stuff
end
As you can see, I am splitting the result_set into smaller chunks of 100 records and passing each chunk to the App::Filter class, which expects a relation object. However, splitting result_set into smaller chunks using each_slice or in_groups causes an error within the block because these two methods return an array of results, not a relation.
I'm fairly new to Ruby on Rails, so my questions are:
Is a relation in fact an object/ collection/ or something like a predefined query, much like a prepared statement?
Is it possible to return smaller relation objects using methods similar to each_slice or in_groups and process them as intended?
Any pointers/ suggestions will be well received- thanks!
A relation is a kind of helper for building SQL queries (INSERT, SELECT, DELETE, etc). In your example, you trigger the SELECT query with each_slice and you get arrays of results.
I haven't checked, but I'm not sure each_slice is doing what you want: it loads the whole result set first and only then slices the array. You should look at find_each or find_in_batches instead, which fetch the records from the database in batches.
You should probably do something like this:
# do what you need with the relation but do NOT trigger the query
generic_filter = App::Filter.new(result_set.scoped, [:EXCEL_EXPORT_COLUMNS])
# trigger the query batch by batch; find_in_batches yields an array per batch
row_index = 0
generic_filter.relation.find_in_batches(batch_size: 100) do |chunk|
  chunk.each do |work_type|
    # keep counting across batches so later batches don't overwrite earlier rows
    row_index += 1
    xls_doc.row(row_index).concat(generic_filter.values_for_row(work_type))
    DATE_COLUMN_INDEX.each do |column_index|
      xls_doc.row(row_index).set_format column_index,
        ::Spreadsheet::Format.new(number_format: 'DD-MM-YYYY')
    end
  end
end
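One detail that is easy to miss when writing the export batch by batch: the row number has to keep counting across batches, otherwise every batch writes to rows 1 to 100 again. A minimal plain-Ruby sketch of that bookkeeping, with an array of integers standing in for the records and a hash standing in for the spreadsheet (both hypothetical):

```ruby
batch_size = 100
records = (1..250).to_a   # stand-in for the database rows
rows = {}                 # row number => record, stand-in for the .xls sheet
batch_sizes = []

records.each_slice(batch_size).with_index do |chunk, batch_index|
  batch_sizes << chunk.size
  # Offset of this batch within the whole export.
  offset = batch_index * batch_size
  chunk.each_with_index do |record, index|
    # Row 0 is left free for a header, as in the question's index + 1.
    rows[offset + index + 1] = record
  end
end
```

Here each_slice is fine because the data is already an in-memory array; against the database, the batches would come from the batch finders instead.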

.map(&:dup) Calculations Slow

I have an ActiveRecord query user.loans, and am using user.loans.map(&:dup) to duplicate the result. This is so that I can loop through each Loan (100+ times) and run several calculations.
These calculations take several seconds longer compared to when I run them directly on user.loans or user.loans.dup. If I do this however, all queries user.loans are affected, even when querying with different methods.
Is there an alternative to .map(&:dup) that can achieve the same result with faster calculations? I'd like to preserve the relations so that I can retrieve associated records to each Loan.
The fastest way to achieve what you want is to make the calculations directly on ActiveRecord. That way you would not have to loop through the resulting Array.
If you still want to loop through the Array elements, maybe you should not use map to duplicate each element. You could use each instead and build the results in a separate Array, as long as the calculation does not assign to the loan's attributes. Here is what I think you should do:
def calculate_loans
  calculated_loans = Array.new
  user.loans.each do |loan|
    # Here you make your calculations. For example:
    # (use +, not +=, so the loan itself is left unchanged)
    calculated_loans.push(loan.value + 10)
  end
  calculated_loans
end
This way, you will have the original user.loans elements, plus a separate Array of calculated values in calculated_loans.
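The distinction between reading an attribute and assigning to it matters here, because an assignment changes the in-memory object that every other caller of user.loans sees. A small plain-Ruby illustration, with a Struct as a hypothetical stand-in for the Loan model:

```ruby
# Hypothetical stand-in for the Loan model (a Struct, not ActiveRecord).
loan_class = Struct.new(:value)
loans = [loan_class.new(100), loan_class.new(250)]

# Non-mutating: builds new numbers, the loans keep their values.
calculated = loans.map { |loan| loan.value + 10 }
after_plus = loans.map(&:value)

# Mutating: `loan.value += 10` expands to `loan.value = loan.value + 10`,
# so the loan objects themselves are changed.
loans.each { |loan| loan.value += 10 }
after_plus_equals = loans.map(&:value)
```

If the calculations really do need to assign to attributes, working on copies (or reloading afterwards) is what keeps the originals intact.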
Please let me know if this improves your performance :)
To resolve conflicts with other calls to user.loans, I wound up using user.loans.reload in the Presenter I have for this particular view. This way I was able to continue making calculations directly on Active Record elsewhere (per Daniel Batalla's suggestion), but without the conflicts I mentioned in my original question.

Rails. How can I make .sum method faster?

I need to calculate the total of the :total_value column in the Order model. This is what I tried:
Order.where("created_at > :day", {:day => 10.day.ago}).where(user_id: 3, state: 'collected').sum(:total_value)
It works for me. But is there any way to make it faster?
Should I add indexes on the :total_value and :created_at columns? Does that make sense?
I'm not really that well aware of Rails internal optimizations, and the first thing that came to my mind was combining the two where clauses into one. That should not matter, though: relations are lazy, so chained where calls are merged into a single SQL query with the conditions joined by AND, and no intermediate result set is built.
Creating an index on total_value makes no sense since you aren't looking for anything based on that column. An index helps on the columns you filter by, here user_id, state and created_at.

Updating several records at once in rails

In a rails 2 app I'm building, I have a need to update a collection of records with specific attributes. I have a named scope to find the collection, but I have to iterate over each record to update the attributes. Instead of making one query to update several thousand records, I'll have to make several thousand queries.
What I've found so far is something like Model.find_by_sql("UPDATE products ...)
This feels really junior, but I've googled and looked around SO and haven't found my answer.
For clarity, what I have is:
ps = Product.last_day_of_freshness
ps.each { |p| p.update_attributes(:stale => true) }
What I want is:
Product.last_day_of_freshness.update_attributes(:stale => true)
It sounds like you are looking for ActiveRecord::Base.update_all - from the documentation:
Updates all records with details given if they match a set of conditions supplied, limits and order can also be supplied. This method constructs a single SQL UPDATE statement and sends it straight to the database. It does not instantiate the involved models and it does not trigger Active Record callbacks or validations.
Product.last_day_of_freshness.update_all(:stale => true)
Actually, since this is rails 2.x (You didn't specify) - the named_scope chaining may not work, you might need to pass the conditions for your named scope as the second parameter to update_all instead of chaining it onto the end of the Product scope.
Have you tried using update_all ?
http://api.rubyonrails.org/classes/ActiveRecord/Relation.html#method-i-update_all
For those who need to update a large number of records, one million or even more, a good way is to update them in batches.
product_ids = Product.last_day_of_freshness.pluck(:id)
# round up so that a final partial batch is counted too
iterations_size = (product_ids.count / 5000.0).ceil
puts "Products to update #{product_ids.count}"
product_ids.each_slice(5000).with_index(1) do |batch_ids, i|
  puts "step #{i} of #{iterations_size}"
  Product.where(id: batch_ids).update_all(stale: true)
end
If your table has a lot of indexes, that also increases the time for such operations, because the indexes have to be updated as well. When I called update_all for all records in a table with about two million records and twelve indexes, the operation had not finished after more than an hour. With the batched approach it took about 20 minutes in the development environment and about 4 minutes in production. Of course, it depends on application settings and server hardware. You can put it in a rake task or a background worker.
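The progress counter in a batched update is easy to get slightly wrong, since integer division drops a final partial batch. A plain-Ruby sketch of the arithmetic, using a made-up list of ids and leaving the actual database call as a comment:

```ruby
batch_size = 5000
product_ids = (1..12_001).to_a   # hypothetical ids, stand-in for pluck(:id)

# Round up: 12_001 ids in batches of 5000 need 3 batches, not 2.
total_batches = (product_ids.size.to_f / batch_size).ceil

progress = []
product_ids.each_slice(batch_size).with_index(1) do |batch_ids, step|
  progress << "step #{step} of #{total_batches} (#{batch_ids.size} ids)"
  # In the Rails app this is where the update would run, e.g.:
  #   Product.where(id: batch_ids).update_all(stale: true)
end
```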
Looks like update_all is the best option... though I'll keep my hacky version here in case you're curious:
You can use just plain-ole SQL to do what you want thus:
ps = Product.last_day_of_freshness
ps_ids = ps.map(&:id).join(',') # local var just for readability
Product.connection.execute("UPDATE `products` SET `stale` = TRUE WHERE id in (#{ps_ids})")
Note that this is db-dependent - you may need to adjust quoting style to suit.
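When interpolating ids into raw SQL like this, two things are worth guarding against: values that are not integers, and an empty list (IN () is not valid SQL). A hedged plain-Ruby sketch of building the statement; build_sql is a name made up for this example, and the unquoted table and column names assume your database accepts them as-is.

```ruby
# build_sql is a hypothetical helper, not part of Rails.
build_sql = lambda do |ids|
  # Integer() raises ArgumentError on anything that is not a whole number,
  # so a stray string cannot smuggle SQL into the statement.
  id_list = ids.map { |id| Integer(id) }.join(',')
  # Nothing to update: return nil instead of emitting "IN ()".
  next nil if id_list.empty?
  "UPDATE products SET stale = TRUE WHERE id IN (#{id_list})"
end

sql = build_sql.call([3, "7", 12])
```

The resulting string would then go to Product.connection.execute as in the answer above; for most cases update_all remains the simpler choice.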

Resources