I have seen the Rails find method take a block, as in:
Consumer.find do |c|
c.id == 3
end
This is similar to Consumer.find(3).
What are some use cases where passing a block to find is actually useful?
It's a shortcut for .to_a.find { ... }. Here's the method's source code:
def find(*args)
if block_given?
to_a.find(*args) { |*block_args| yield(*block_args) }
else
find_with_ids(*args)
end
end
If you pass a block, it calls .to_a (loading all records) and invokes Enumerable#find on the array.
In other words, it allows you to use Enumerable#find on an ActiveRecord::Relation. This can be useful if your condition can't be expressed or evaluated in SQL, e.g. querying serialized attributes:
Consumer.find { |c| c.preferences[:foo] == :bar }
To avoid confusion, I'd prefer the more explicit version, though:
Consumer.all.to_a.find { |c| c.preferences[:foo] == :bar }
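To make the dispatch concrete, here is a minimal plain-Ruby sketch of the same pattern. FakeRelation, its loads counter, and the Consumer struct are hypothetical stand-ins, not ActiveRecord classes; the point is only that the block form materializes every record before Enumerable#find runs, while the id form does not.

```ruby
# Hypothetical stand-ins; this is not ActiveRecord itself.
Consumer = Struct.new(:id, :preferences)

class FakeRelation
  attr_reader :loads # how many times the full record set was materialized

  def initialize(records)
    @records = records
    @loads = 0
  end

  # Stands in for loading every row from the database.
  def to_a
    @loads += 1
    @records.dup
  end

  # Same shape as the find method quoted above.
  def find(*args)
    if block_given?
      to_a.find(*args) { |*block_args| yield(*block_args) }
    else
      find_with_ids(*args)
    end
  end

  private

  # Stands in for a primary-key lookup.
  def find_with_ids(id)
    @records.find { |record| record.id == id } || raise(KeyError, "no record with id #{id}")
  end
end

consumers = FakeRelation.new([
  Consumer.new(1, { foo: :baz }),
  Consumer.new(2, { foo: :bar })
])
match = consumers.find { |c| c.preferences[:foo] == :bar }
puts match.id
```

In this sketch the block form bumps loads once per call, which mirrors the full-table SELECT the real method triggers.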
The result may be the same, but the SQL query is nothing like the one for Consumer.find(3).
It fetches all the consumers and then filters them in Ruby using the block. I can't think of a use case where this would be useful.
Here is a sample query in the console
consumer = Consumer.find {|c|c.id == 2}
# Consumer Load (0.3ms) SELECT `consumers`.* FROM `consumers`
# => #<Consumer id: 2, name: "xyz", ..>
A good example of a use-case is if you have a JSON/JSONB column and don't want to get involved in the more complex JSON SQL.
required_item = item_collection.find do |item|
item.jsondata['json_array_property'][index]['property'] == clause
end
This is useful if you can constrain the scope of the item_collection to a date-range, for example, and have a smaller set of items that require filtering further.
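The narrowing-then-filtering idea can be sketched in plain Ruby as follows. The Item struct, the created_on attribute, and the find_item helper are assumptions made for illustration; in a real application the date-range step would be a database query rather than an in-memory select.

```ruby
require "date"
require "json"

# Hypothetical in-memory stand-in for a model with a JSON column.
Item = Struct.new(:created_on, :jsondata)

# First narrow the candidates by date, then search the parsed JSON in Ruby.
# dig returns nil instead of raising when a key or index is missing.
def find_item(items, date_range, index, clause)
  candidates = items.select { |item| date_range.cover?(item.created_on) }
  candidates.find do |item|
    item.jsondata.dig("json_array_property", index, "property") == clause
  end
end

items = [
  Item.new(Date.new(2020, 1, 5), JSON.parse('{"json_array_property": [{"property": "a"}]}')),
  Item.new(Date.new(2020, 1, 20), JSON.parse('{"json_array_property": [{"property": "b"}]}')),
  Item.new(Date.new(2020, 3, 1), JSON.parse('{"json_array_property": [{"property": "b"}]}'))
]
january = Date.new(2020, 1, 1)..Date.new(2020, 1, 31)
required_item = find_item(items, january, 0, "b")
puts required_item.created_on
```

The smaller the pre-filtered set, the less it matters that the JSON comparison runs in Ruby.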
Related
What I have is currently working, but seems to be very expensive, any ideas on making it less expensive would be great!
A User has many Plans, which has many PlanDates. Each PlanDates has a certain recipe denoted by a recipe_id attribute. Each Plan has a meal_type attribute which is either Meat, Vegetarian, or Choice, the latter means mixed. Each Recipe has a type_of_meal attribute that is either Meat or Vegetarian. Each Recipe also has a friendly name attribute.
For a given PlanDate, I need to build an options_for_select in the following format:
[ [recipe_id, "recipe_name"], [recipe_id, "recipe_name"] ... ]
The options:
must remove all the recipe_ids that have previously been given to the User (regardless of Plan)
must remove all the recipe_ids with a type mismatch (i.e., if a Plan has Meat designated, the options must not have any Vegetarian recipe_ids); this does not apply if the Plan has Choice designated
Here's the code I currently have:
# builds an array of all the recipe_ids that have been given to this User on some PlanDate on some Plan
recipes_used_before_for_this_user = PlanDate.select { |pd| pd.plan.user.id == user_id }.map { |pd| pd.recipe_id }
# narrows down the world of recipes to those that do NOT have an id of a recipe_used_before_for_this_user
recipes_not_used_before = Recipe.select { |r| (recipes_used_before_for_this_user.include? r.id) == false }
# going forward, let's assume current_pd = the PlanDate object in question
if current_pd.plan.meal_type == "Choice"
# easiest: if the meal_type is "Choice" then we just take the recipes_not_used_before and map them into the appropriate format
recipe_choices_array = recipes_not_used_before.map { |r| [ r.id, r.name ] }
else
# if the plan has a "Meat" or "Vegetarian" specification, we need to first narrow the recipes_not_used_before down by the right type and then map into the appropriate format
recipe_choices_array = recipes_not_used_before.select { |r| r.type_of_meal == current_pd.plan.meal_type }.map { |r| [ r.id, r.name ] }
end
Again, working, but I have a lot of PlanDates and a lot of Recipes, so if there is any way to streamline even further, would love your ideas. Thanks!
Your queries are expensive because you're not using ActiveRecord's query interface, or SQL at all, to narrow the results; instead you're loading the entire dataset into Ruby objects and then looping over it in Ruby.
I suspect that if you inspect your logfiles you'll see something like this:
>> PlanDate.select{ |pd| pd.plan.user.id == user_id }.map { |pd| pd.recipe_id }
PlanDate Load (1.3ms) SELECT "plan_dates".* FROM "plan_dates"
=> [#<PlanDate....
What you want to do is to use ActiveRecord's query interface to build the query, something like this:
PlanDate.joins(:plan).where(plans: { user_id: user_id }).pluck(:recipe_id)
This first joins the plans table so the query can filter on it, then specifies the where condition of your SQL query, and finally pulls out only the recipe ids using pluck, so no PlanDate objects are instantiated.
See http://guides.rubyonrails.org/active_record_querying.html for more info.
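Once the used recipe ids come back as a plain array, the two filtering rules from the question can be written as a single pass. The sketch below is plain Ruby under stated assumptions: the Recipe struct and the recipe_options helper are hypothetical, and using a Set for the membership test is a swapped-in choice rather than something the answer above prescribes. Ideally both rules would also move into the database query.

```ruby
require "set"

# Hypothetical stand-in for the Recipe model.
Recipe = Struct.new(:id, :name, :type_of_meal)

# Builds [[recipe_id, "recipe_name"], ...] for a plan's meal_type,
# skipping recipes the user has already been given.
def recipe_options(recipes, used_recipe_ids, meal_type)
  used = used_recipe_ids.to_set # constant-time membership checks
  recipes
    .reject { |recipe| used.include?(recipe.id) }
    .select { |recipe| meal_type == "Choice" || recipe.type_of_meal == meal_type }
    .map { |recipe| [recipe.id, recipe.name] }
end

recipes = [
  Recipe.new(1, "Chili", "Meat"),
  Recipe.new(2, "Dal", "Vegetarian"),
  Recipe.new(3, "Steak", "Meat")
]
p recipe_options(recipes, [1], "Meat")
p recipe_options(recipes, [1], "Choice")
```

The "Choice" case falls out of the same select, so the if/else branch from the question is no longer needed.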
Given this model:
class User < ActiveRecord::Base
has_many :things
end
Then we can do this:
@user = User.find(123)
@user.things.find_each{ |t| print t.name }
@user.thing_ids.each{ |id| print id }
There are a large number of @user.things and I want to iterate through only their ids in batches, like with find_each. Is there a handy way to do this?
The goal is to:
not load the entire thing_ids array into memory at once
still only load arrays of thing_ids, and not instantiate a Thing for each id
Rails 5 introduced the in_batches method, which yields a relation and uses pluck(primary_key) internally. We can use the relation's where_values_hash method to retrieve the already-plucked ids:
@user.things.in_batches { |batch_rel| p batch_rel.where_values_hash['id'] }
Note that in_batches has order and limit restrictions similar to find_each.
This approach is a bit hacky since it depends on the internal implementation of in_batches and will fail if in_batches stops plucking ids in the future. A non-hacky method would be batch_rel.pluck(:id), but this runs the same pluck query twice.
You can try something like the code below: each_slice takes 4 elements at a time, and then you can loop over those 4.
@user.thing_ids.each_slice(4) do |batch|
batch.each do |id|
puts id
end
end
Unfortunately, there is no one-liner or helper that will do this for you, so instead:
limit = 1000
offset = 0
loop do
batch = @user.things.limit(limit).offset(offset).pluck(:id)
batch.each { |id| puts id }
break if batch.count < limit
offset += limit
end
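A variation on the loop above is to page by the last id seen rather than by offset, which is roughly the idea find_each relies on. The sketch below is plain Ruby: each_id_batch is a hypothetical helper, the fetch callable is an assumption standing in for the database call, and ids are assumed to be positive integers returned in ascending order.

```ruby
# Hypothetical helper: yields arrays of ids, batch_size at a time.
# fetch.call(after_id, limit) must return up to `limit` ids greater
# than after_id, in ascending order.
def each_id_batch(batch_size:, fetch:)
  last_id = 0 # assumes ids are positive integers
  loop do
    batch = fetch.call(last_id, batch_size)
    break if batch.empty?

    yield batch
    break if batch.size < batch_size

    last_id = batch.last
  end
end

# In-memory stand-in for the ids table. With ActiveRecord the callable
# might instead look like this (an untested assumption):
#   ->(after_id, limit) { @user.things.where("id > ?", after_id).order(:id).limit(limit).pluck(:id) }
ALL_IDS = [3, 7, 8, 15, 21].freeze
fetch_ids = ->(after_id, limit) { ALL_IDS.select { |id| id > after_id }.first(limit) }

each_id_batch(batch_size: 2, fetch: fetch_ids) do |batch|
  batch.each { |id| puts id }
end
```

Only one batch of ids is held in memory at a time, and no Thing objects are built.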
UPDATE Final EDIT:
I have updated my answer after reviewing your updated question (not sure why you would downvote after I backed up my answer with source code to prove it...but I don't hold grudges :)
Here is my solution, tested and working, so you can accept this as the answer if it pleases you.
Below, I have extended ActiveRecord::Relation, overriding the find_in_batches method to accept one additional option, :relation. When set to true, it will return the activerecord relation to your block, so you can then use your desired method 'pluck' to get only the ids of the target query.
#put this file in your lib directory:
#active_record_extension.rb
module ARAExtension
extend ActiveSupport::Concern
def find_in_batches(options = {})
options.assert_valid_keys(:start, :batch_size, :relation)
relation = self
start = options[:start]
batch_size = options[:batch_size] || 1000
unless block_given?
return to_enum(:find_in_batches, options) do
total = start ? where(table[primary_key].gteq(start)).size : size
(total - 1).div(batch_size) + 1
end
end
if logger && (arel.orders.present? || arel.taken.present?)
logger.warn("Scoped order and limit are ignored, it's forced to be batch order and batch size")
end
relation = relation.reorder(batch_order).limit(batch_size)
records = start ? relation.where(table[primary_key].gteq(start)) : relation
records = records.to_a unless options[:relation]
while records.any?
records_size = records.size
primary_key_offset = records.last.id
raise "Primary key not included in the custom select clause" unless primary_key_offset
yield records
break if records_size < batch_size
records = relation.where(table[primary_key].gt(primary_key_offset))
records = records.to_a unless options[:relation]
end
end
end
ActiveRecord::Relation.send(:include, ARAExtension)
here is the initializer
#put this file in config/initializers directory:
#extensions.rb
require "active_record_extension"
Originally, this method forced a conversion of the relation to an array of ActiveRecord objects and returned it to you. Now, I optionally allow you to return the query before the conversion to the array happens. Here is an example of how to use it:
@user.things.find_in_batches(:batch_size=>10, :relation=>true).each do |batch_query|
# do any kind of further querying/filtering/mapping that you want
# show that this is actually an activerecord relation, not an array of AR objects
puts batch_query.to_sql
# add more conditions to this query, this is just an example
batch_query = batch_query.where(:color=>"blue")
# pluck just the ids
puts batch_query.pluck(:id)
end
Ultimately, if you don't like any of the answers given on an SO post, you can roll-your-own solution. Consider only downvoting when an answer is either way off topic or not helpful in any way. We are all just trying to help. Downvoting an answer that has source code to prove it will only deter others from trying to help you.
Previous EDIT
In response to your comment (because my comment would not fit):
calling
thing_ids
internally uses
pluck
pluck internally uses
select_all
...which instantiates an activerecord Result
Previous 2nd EDIT:
This line of code within pluck returns an activerecord Result:
....
result = klass.connection.select_all(relation.arel, nil, bound_attributes)
...
I just stepped through the source code for you. Using select_all will save you some memory, but in the end, an activerecord Result was still created and mapped over even when you are using the pluck method.
I would use something like this:
@user.things.find_each(batch_size: 1000).map(&:id)
This will give you an array of the ids.
named_scope :with_country, lambda { |country_id| ...}
named_scope :with_language, lambda { |language_id| ...}
named_scope :with_gender, lambda { |gender_id| ...}
if params[:country_id]
Event.with_country(params[:country_id])
elsif params[:language_id]
Event.with_state(params[:language_id])
else
......
#so many combinations
end
If I get both country and language then I need to apply both of them. In my real application I have 8 different named_scopes that could be applied depending on the case. How to apply named_scopes incrementally or hold on to named_scopes somewhere and then later apply in one shot.
I tried holding on to values like this
tmp = Event.with_country(1)
but that fires the sql instantly.
I guess I can write something like
if !params[:country_id].blank? && !params[:language_id].blank? && !params[:gender_id].blank?
Event.with_country(params[:country_id]).with_language(..).with_gender
elsif country && language
elsif country && gender
elsif language && gender
.. you see the problem
Actually, the SQL does not fire instantly. Though I haven't bothered to look up how Rails pulls off this magic (though now I'm curious), the query isn't fired until you actually inspect the result set's contents.
So if you run the following in the console:
wc = Event.with_country(Country.first.id);nil # line returns nil, so wc remains uninspected
wc.with_state(State.first.id)
you'll note that no Event query is fired for the first line, whereas one large Event query is fired for the second. As such, you can safely store Event.with_country(params[:country_id]) as a variable and add more scopes to it later, since the query will only be fired at the end.
To confirm that this is true, try the approach I'm describing, and check your server logs to confirm that only one query is being fired on the page itself for events.
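The deferred behaviour can be illustrated without Rails. The following is a minimal sketch, not how named scopes are actually implemented: LazyScope, its where method, and the log array are hypothetical names. Each chained call only records a filter, and the "query" runs when the results are enumerated.

```ruby
# Hypothetical illustration of deferred execution; not Rails code.
class LazyScope
  include Enumerable

  def initialize(rows, filters = [], log = [])
    @rows = rows
    @filters = filters
    @log = log # records every time the "query" actually runs
  end

  # Returns a new scope with one more condition; nothing runs yet.
  def where(attrs)
    LazyScope.new(@rows, @filters + [attrs], @log)
  end

  # The "query" runs only when someone asks for the results.
  def each(&block)
    @log << @filters
    matching = @rows.select do |row|
      @filters.all? { |filter| filter.all? { |key, value| row[key] == value } }
    end
    matching.each(&block)
  end
end

query_log = []
events = LazyScope.new(
  [{ country_id: 1, language_id: 2 }, { country_id: 1, language_id: 3 }],
  [],
  query_log
)
scoped = events.where(country_id: 1) # stored in a variable, nothing has run
scoped = scoped.where(language_id: 2) # still nothing has run
puts query_log.size
puts scoped.to_a.inspect
puts query_log.size
```

The log stays empty until to_a is called, which corresponds to the single Event query seen in the server logs.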
Check Anonymous Scopes.
I had to do something similar, having many filters applied in a view. What I did was create named_scopes with conditions:
named_scope :with_filter, lambda{|filter| { :conditions => {:field => filter}} unless filter.blank?}
In the same class there is a method which receives the params from the action and returns the filtered records:
def self.filter(params)
ClassObject
.with_filter(params[:filter1])
.with_filter2(params[:filter2])
end
Like that you can add all the filters using named_scopes and they are used depending on the params that are sent.
I took the idea from here: http://www.idolhands.com/ruby-on-rails/guides-tips-and-tutorials/add-filters-to-views-using-named-scopes-in-rails
Event.with_country(params[:country_id]).with_state(params[:language_id])
will work and won't fire the SQL until the end. (If you try it in the console, the query runs right away because the console calls inspect on the result; in a real application the SQL won't fire until the results are actually used.)
I suspect you also need to be sure each named_scope tests the existence of what is passed in:
named_scope :with_country, lambda { |country_id| country_id.nil? ? {} : {:conditions=>...} }
This will be easy with Rails 3:
products = Product.where("price = 100").limit(5) # No query executed yet
products = products.order("created_at DESC") # Adding to the query, still no execution
products.each { |product| puts product.price } # That's when the SQL query is actually fired
class Product < ActiveRecord::Base
scope :pricey, where("price > 100")
scope :latest, order("created_at DESC").limit(10)
end
The short answer is to simply shift the scope as required, narrowing it down depending on what parameters are present:
scope = Example
# Only apply to parameters that are present and not empty
if params[:foo].present?
scope = scope.with_foo(params[:foo])
end
if params[:bar].present?
scope = scope.with_bar(params[:bar])
end
results = scope.all
A better approach would be to use something like Searchlogic (http://github.com/binarylogic/searchlogic) which encapsulates all of this for you.
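With eight scopes, the parameter-by-parameter narrowing can also be driven from a table instead of eight if blocks. This is a plain-Ruby sketch under stated assumptions: RecordingScope is a hypothetical stand-in that merely records which scopes were applied, and FILTERS and apply_filters are illustrative names rather than anything Rails provides.

```ruby
# Hypothetical stand-in for a model class with chainable named scopes.
class RecordingScope
  attr_reader :applied

  def initialize(applied = [])
    @applied = applied
  end

  %i[with_country with_language with_gender].each do |name|
    define_method(name) { |value| RecordingScope.new(@applied + [[name, value]]) }
  end
end

# Maps each request parameter to the scope that handles it.
FILTERS = {
  country_id: :with_country,
  language_id: :with_language,
  gender_id: :with_gender
}.freeze

# Applies only the scopes whose parameter is present and non-empty.
def apply_filters(scope, params)
  FILTERS.reduce(scope) do |current, (param, scope_name)|
    value = params[param]
    value.nil? || value.to_s.strip.empty? ? current : current.public_send(scope_name, value)
  end
end

result = apply_filters(RecordingScope.new, { country_id: "1", language_id: "", gender_id: "3" })
p result.applied
```

Adding a ninth filter then means adding one entry to the table rather than another branch.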
I'm doing this:
@snippets = Snippet.find :all, :conditions => { :user_id => session[:user_id] }
@snippets.each do |snippet|
snippet.tags.each do |tag|
@tags.push tag
end
end
But if two snippets share the same tag, it'll push the object twice.
I want to do something like if @tags.in_object(tag)[...]
Would it be possible? Thanks!
I think there are 2 ways to go about it to get a faster result.
1) Add a condition to your find statement (in MySQL, DISTINCT). This will return only unique results. Databases are generally much better at this than application code.
2) Instead of testing each time with include?, why not call uniq after you populate your array?
here is example code
ar = []
data = []
# get some random sample data
100.times do
data << rand(10)
end
# populate your result array
# 3 ways to do it.
# 1) you can modify your original array with
data.uniq!
# 2) you can populate another array with your unique data
# this doesn't modify your original array
ar.concat(data.uniq)
# 3) you can run a loop if you want to do some sort of additional processing
data.each do |i|
i = i.to_s + "some text" # do whatever you need here
ar << i
end
Depending on the situation you may use either.
But running include on each item in the loop is not the fastest thing IMHO
Good luck
Another way would be to simply concat the @tags and snippet.tags arrays and then strip it of duplicates.
@snippets.each do |snippet|
@tags.concat(snippet.tags)
end
@tags.uniq!
I'm assuming @tags is an Array instance.
Array#include? tests if an object is already included in an array. This uses the == operator, which in ActiveRecord tests for the same instance or another instance of the same type having the same id.
Alternatively, you may be able to use a Set instead of an Array. This will guarantee that no duplicates get added, but is unordered.
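Both options can be compared side by side in plain Ruby. In this sketch Tag and Snippet are hypothetical Struct stand-ins; a Struct compares by its members, whereas an ActiveRecord model compares by class and id, so the effect is similar but not identical. The two helper names are made up for illustration.

```ruby
require "set"

# Hypothetical stand-ins for the models.
Tag = Struct.new(:id, :name)
Snippet = Struct.new(:tags)

# Collect everything, then drop duplicates in one pass.
def unique_tags_with_uniq(snippets)
  snippets.flat_map(&:tags).uniq
end

# Let the Set reject duplicates as they are added.
def unique_tags_with_set(snippets)
  snippets.each_with_object(Set.new) { |snippet, set| set.merge(snippet.tags) }
end

snippets = [
  Snippet.new([Tag.new(1, "ruby"), Tag.new(2, "rails")]),
  Snippet.new([Tag.new(2, "rails"), Tag.new(1, "ruby")])
]
puts unique_tags_with_uniq(snippets).map(&:name).inspect
puts unique_tags_with_set(snippets).size
```

Either way the duplicate check happens once per tag instead of scanning the array with include? on every push.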
You can probably add a group to the query:
Snippet.find :all, :conditions => { :user_id => session[:user_id] }, :group => "tag.name"
Group will depend on how your tag data works, of course.
Or use uniq:
@tags.concat(snippet.tags.uniq)
A common idiom that my camp uses in rails is as follows:
def right_things(all_things, value)
things = []
for thing in all_things
things << thing if thing.attribute == value
end
return things
end
how can I make this better/faster/stronger?
thx
-C
def right_things(all_things, value)
all_things.select{|x| x.attribute == value}
end
If your things are ActiveRecord models and you only need the items selected for your current purpose, you may, if you're using Rails 2.0 (? definitely 2.1) or above, find named_scopes useful.
class Thing < ActiveRecord::Base
named_scope :rightness, lambda { |value| { :conditions => ['attribute = ?', value] } }
end
So you can say
Thing.rightness(123)
which is (in this case) similar to
Thing.find_by_attribute(123)
in that it boils down to a SQL query, but it's more easily chainable to modify the SQL. If that's useful to you, which it may not be, of course...