Hash/Array to Active Record - ruby-on-rails

I have been searching everywhere but I can't seem to find this problem anywhere. In Rails 5.0.0.beta3 I need to sort a my #record = user.records with an association and it's record.
The sort goes something like this.
#record = #record.sort_by { |rec|
If user.fav_record.find(rec.id)
User.fav_record(rec.id).created_at
Else
rec.created_at
End
This is just an example of what I do. But everything sorts fine.
The problem:
This returns an array and not an Active Record Class.
I've tried everything to get this to return an Active Record Class. I've pushed the sorted elements into an ID array and tried to extract it them in that order, I've tried mapping. Every result that I get turns my previous active record into an array or hash. Now I need it to go back into an active record. Does anyone know how to convert an array or hash of that active record back into an Active Record class?

There isn't a similarly easy way to convert ActiveRecord to array.
If you want to optimize the performance of your app, you should try to avoid converting arrays to ActiveRecord queries. Try and keep the object as a query as long as possible.
That being said, working with arrays is generally easier than queries, and it can feel like a hassle to convert a lot of array operations to ActiveRecord query (SQL) code.
It'd be better to write the sort code using ActiveRecord::Query methods or even writing it in plain SQL using find_by_sql.
I don't know what code you should specifically use here, but I do see that your code could be refactored to be clearer. First of all, If and Else should not be capitalized, but I'm assuming that this is just pseudocode and you already realize this. Second, your variable names should be pluralized if they are queries or arrays (i.e. #record.sort_by should be #records.sort_by instead).
It's worth mentioning that ActiveRecord queries are difficult to master and a lot of people just use array operations instead since they're easier to write. If "premature optimization is the root of all evil", it's really not the end of the world if you sacrifice a bit of performance and just keep your array implementation if you're just trying to make an initial prototype. Just make sure that you're not making "n+1" SQL calls, i.e. do not make a database call every iteration of your loop.
Here's an example of an array implementation which avoids the N+1 SQL issue:
# first load all the user's favorites into memory
user_fav_records = user.fav_records.select(:id, :created_at)
#records = #records.sort_by do |record|
matching_rec = user.fav_records.find { |x| x.id.eql?(rec.id) }
# This is using Array#find, not the ActiveRecord method
if matching_rec
matching_rec.created_at
else
rec.created_at
end
end
The main difference between this implementation and the code in your question is that I'm avoiding calling ActiveRecord's find each iteration of the loop. SQL read/writes are computationally expensive, and you want your code to make as little of them as possible.

Related

How can I disable lazy loading of active record queries?

I want to query some objects from the database using a WHERE clause similar to the following:
#monuments = Monument.where("... lots of SQL ...").limit(6)
Later on, in my view I use methods like #monuments.first, then I loop through #monuments, then I display #monuments.count.
When I look at the Rails console, I see that Rails queries the database multiple times, first with a limit of 1 (for #monuments.first), then with a limit of 6 (for looping through all of them), and finally it issues a count() query.
How can I tell ActiveRecord to only execute the query once? Just executing the query once with a limit of 6 should be enough to get all the data I need. Since the query is slow (80ms), repeating it costs a lot of time.
In your situation you'll want to trigger the query before you your call to first because while first is a method on Array, it's also a “finder method” on ActiveRecord objects that'll fetch the first record.
You can prompt this with any method that requires data to work with. I prefer using to_a since it's clear that we'll be dealing with an array after:
#moments = Moment.where(foo: true).to_a
# SQL Query Executed
#moments.first #=> (Array#first) <Moment #foo=true>
#moments.count #=> (Array#count) 42
In this case, you can also use first(6) in place of limit(6), which will also trigger the query. It may be less obvious to another developer on your team that this is intentional, however.
AFAIK, #monuments.first should not hit the db, I confirmed it on my console, maybe you have multiple instance with same variable or you are doing something else(which you haven't shared here), share the exact code and query and we might debug.
Since, ActiveRecord Collections acts as array, you can use array analogies to avoid querying the db.
Regarding first you can do,
#monuments[0]
Regarding the count, yes, it is a different query which hits the db, to avoid it you can use length as..
#monuments.length

Ruby's .where vs. detect

I'm looking for a method that is faster and uses less server processing. In my application, I can use both .where and .detect:
Where:
User.where(id: 1)
# User Load (0.5ms)
Detect:
User.all.detect{ |u| u.id == 1 }
# User Load (0.7ms). Sometimes increases more than .where
I understand that .detect returns the first item in the list for which the block returns TRUE but how does it compares with .where if I have thousands of Users?
Edited for clarity.
.where is used in this example because I may not query for the id alone. What if I have a table column called "name"?
In this example
User.find(1) # or
User.find_by(id: 1)
will be the fastest solutions. Because both queries tell the database to return exactly one record with a matching id. As soon as the database finds a matching record, it doesn't look further but returns that one record immediately.
Whereas
User.where(id: 1)
would return an array of objects matching the condition. That means: After a matching record was found the database would continue looking for other records to match the query and therefore always scan the whole database table. In this case – since id is very likely a column with unique values – it would return an array with only one instance.
In opposite to
User.all.detect { |u| u.id == 1 }
that would load all users from the database. This will result in loading thousands of users into memory, building ActiveRecord instances, iterating over that array and then throwing away all records that do not match the condition. This will be very slow compared to just loading matching records from the database.
Database management systems are optimized to run selection queries and you can improve their ability to do so by designing a useful schema and adding appropriate indexes. Every record loaded from the database will need to be translated into an instance of ActiveRecord and will consume memory - both operations are not for free. Therefore the rule of thumb should be: Whenever possible run queries directly in the database instead of in Ruby.
NB One should use ActiveRecord#find in this particular case, please refer to the answer by #spickermann instead.
User.where is executed on DB level, returning one record.
User.all.detect will return all the records to the application, and only then iterate through on ruby level.
That said, one must use where. The former is resistant to an amount of records, there might be billions and the execution time / memory consumption would be nearly the same (O(1).) The latter might even fail on billions of records.
Here's a general guide:
Use .find(id) whenever you are looking for a unique record. You can use something like .find_by_email(email) or .find_by_name(name) or similar (these finders methods are automatically generated) when searching non-ID fields, as long as there is only one record with that particular value.
Use .where(...).limit(1) if your query is too complex for a .find_by query or you need to use ordering but you are still certain that you only want one record to be returned.
Use .where(...) when retrieving multiple records.
Use .detect only if you cannot avoid it. Typical use cases for .detect are on non-ActiveRecord enumerables, or when you have a set of records but are unable to write the matching condition in SQL (e.g. if it involves a complex function). As .detect is the slowest, make sure that before calling .detect you have used SQL to narrow down the query as much as possible. Ditto for .any? and other enumerable methods. Just because they are available for ActiveRecord objects doesn't mean that they are a good idea to use ;)

.map(&:dup) Calculations Slow

I have an ActiveRecord query user.loans, and am using user.loans.map(&:dup) to duplicate the result. This is so that I can loop through each Loan (100+ times) and run several calculations.
These calculations take several seconds longer compared to when I run them directly on user.loans or user.loans.dup. If I do this however, all queries user.loans are affected, even when querying with different methods.
Is there an alternative to .map(&:dup) that can achieve the same result with faster calculations? I'd like to preserve the relations so that I can retrieve associated records to each Loan.
The fastest way you can achieve what you want is making calculations directly on ActiveRecord, this way you would not have to loop through resulting Array.
If you still want to loop through Array elements, maybe you should not use map to duplicate each Array element. You could use each instead, which does not affect original Array element. Here is what I think you should do:
def calculate_loans
calculated_loans = Array.new
user.loans.each do |loan|
# Here you make your calculations. For example:
calculated_loans.push(loan.value += 10)
end
calculated_loans
end
This way, you will have original user.loans elements, and a duplicated Array with calculated_loans.
Please, let me know if this improve your performance :)
To resolve conflicts with other calls to user.loans, I wound up using user.loans.reload in the Presenter I have for this particular view. This way I was able to continue making calculations directly on Active Record elsewhere(per Daniel Batalla's suggestion), but without the conflicts I mentioned in my original question.

why is Model.all different to Model.where('true') in rails 3

I have a query, which works fine:
ModelName.where('true')
I can chain this with other AR calls such as where, order etc. However when I use:
ModelName.all
I receive the "same" response but can't chain a where or order to it as it's an array rather than a AR collection.
Whereas I have no pragmatic problem using the first method it seems a bit ugly/unnecessary. Is there a cleaner way of doing this maybe a .to_active_record_collection or something?
There is an easy solution. Instead of using
ModelName.where('true')
Use:
ModelName.scoped
As you said:
ModelName.where('true').class #=> ActiveRecord::Relation
ModelName.all.class #=> Array
So you can make as many lazy loading as long as you don't use all, first or last which trigger the query.
It's important to catch these differences when you consider caching.
Still I can't understand what kind of situation could lead you to something like:
ModelName.all.where(foobar)
... Unless you need the whole bunch of assets for one purpose and get it loaded from the database and need a subset of it to other purposes. For this kind of situation, you'd need to use ruby's Array filtering methods.
Sidenote:
ModelName.all
should never be used, it's an anti-pattern since you don' control how many items you'll retrieve. And hopefully:
ModelName.limit(20).class #=> ActiveRecord::Relation
As you said, the latter returns an array of elements, while the former is an ActiveRecord::Relation. You can order and filter array using Ruby methods. For example, to sort by id you can call sort_by(&:id). To filter elements you can call select or reject. For ActiveRecord::Relation you can chain where or order to it, as you said.
The difference is where the sorting and processing goes. For Array, it is done by the application; for Relation - by the database. The latter is usually faster, when there is more records. It is also more memory efficient.

How to make an Active Record to Sequel transition

I'm using Rails3.rc and Active Record 3 (with meta_where) and just started to switch to Sequel, because it seems to be much faster and offers some really great features.
I'm already using the active_model plugin (and some others).
As far as I know, I should use User[params[:id]] instead of User.find(params[:id]). But this doesn't raise if no record exists and doesn't convert the value to an integer (type of PK), so it's as a string in the where clause. I'm not sure if this is causing any performance issues. Does this harm identity_map? What's the best way to solve both these issues?
Is there an easy way to flip the usage of associations like User.messages_dataset and User.messages so that User.messages behaves like in Active Record (User.messages_data_set). I guess I'd use the #..._dataset a lot but never need the array method, because I could just add .all?
I noticed some same (complex) queries are executed several times within one action sometimes. Is there something like the Active Record query cache? (identity_map doesn't seem to work for these cases).
Is there a to_sql I can call to get the raw SQL a dataset would produce?
You can either use:
User[params[:id].to_i] || raise Sequel::Error
Or write your own method that does something like that. Sequel supports non-integer primary keys, so it wouldn't do the conversion automatically. It shouldn't have any affect on an identity map. Note that Sequel doesn't use an identity map unless you are using the identity_map plugin. I guess the best way is to write your own helper method.
Not really. You can use the association_proxies plugin so that non-array methods are sent to the dataset instead of the array of objects. In general, you shouldn't be using the association dataset method much. If you are using it a lot, it's a sign that you should have an association for that specific usage.
There is and will never be a query cache. You should write your actions so that the results of the first query are cached and reused.
Dataset#sql gives you the SELECT SQL for the dataset.

Resources