Confusing difference between `count` and `size` - ruby-on-rails

I have a has_many :through relationship between Product and Order:
So I create a new @order and assign it a product like so:
@order = Order.new(products: [my_product])
This manifests in the console like so:
>> @order.products
=> #<ActiveRecord::Associations::CollectionProxy [#<Product id: 145, title: "Some Product" ...>]>
Now, for some reason I don't understand, I get the following results:
>> @order.products.count
=> 0
>> @order.products.to_a.count
=> 1
>> @order.products.size
=> 1
>> @order.products.count
=> 0
I am going to use the size method for now, since I want to know how many products I have. But I would have expected size and count to return the same result for any type of collection.

Check this documentation out on size for Rails: Rails ActiveRecord Size Documentation
Also the documentation for count is here as well: Rails ActiveRecord Count Documentation
Ruby and ActiveRecord each define length, size, and count methods, and they behave quite differently from one another.
Your first example, @order.products.count, calls the Rails ActiveRecord count method (which counts records in the DB), while your other example, @order.products.to_a.count, calls the Ruby count method (which counts the items held in memory, with no connection to the DB).
So, to answer your question: when you use @order = Order.new(products: [my_product]), you are only creating the object in memory and not in the DB. The size documentation linked above explains why it can return either the length of the in-memory collection or the count of records in the DB, depending on the context in which it is used.
Hope this helps!

size is the in-memory size of the products collection. When you run that method, you'll see there is no SQL query in the logs. However, if you run count, you'll see it actually produces a SQL query (try this in the Rails console), and since this order is not persisted, you get back 0.
Which one should you use? Whichever you consider the source of truth, based on where the object is in its lifecycle.
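The difference described in these answers can be mimicked with a toy model in plain Ruby. This is only a sketch: FakeTable and ToyAssociation are made-up names, not Rails classes, and the real association code handles more cases than this. It just shows why count reports 0 while size and to_a.count report 1 before anything has been saved.

```ruby
# Toy model only. FakeTable and ToyAssociation are hypothetical names,
# not part of Rails; they imitate the behaviour described above.
class FakeTable
  def initialize
    @rows = []
  end

  def insert(row)
    @rows << row
  end

  # Stands in for a SELECT COUNT(*) query: it only sees inserted rows.
  def count_rows
    @rows.length
  end
end

class ToyAssociation
  def initialize(table)
    @table = table
    @owner_saved = false
    @target = [] # records currently held in memory
  end

  def <<(record)
    @target << record
    self
  end

  # Always asks the "database".
  def count
    @table.count_rows
  end

  # Uses the in-memory records while the owner is unsaved,
  # otherwise falls back to the "database" count.
  def size
    @owner_saved ? count : @target.size
  end

  def to_a
    @target.dup
  end

  def save!
    @target.each { |record| @table.insert(record) }
    @owner_saved = true
  end
end

products = ToyAssociation.new(FakeTable.new)
products << "Some Product"

puts products.count      # => 0 (nothing in the table yet)
puts products.to_a.count # => 1 (plain Array#count, in memory)
puts products.size       # => 1 (in-memory size, owner unsaved)

products.save!
puts products.count      # => 1 (row now exists in the table)
```

Once the toy owner is saved, count and size agree, which matches the advice to pick whichever one reflects the source of truth at that point in the object's lifecycle.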

Related

Rails and Arel's where function: Can I call where on objects instead of making a call to the database?

Consider the following:
budget has many objects in its another_collection.
obj has many objects of the same type as object in its another_collection.
budget and some_collection are already declared before the following loop
they've been previously saved in the database and have primary keys set.
some_collection is a collection of objs.
-
some_collection.each do |obj|
another_obj = obj.another_collection.build
budget.another_collection << another_obj
end
budget.another_collection.collect {|another_obj| another_obj.another_obj_id}
=> [1, 2, 3, 4]
some_obj_with_pk_1 = some_collection.where(obj_id: obj.id)
some_obj_with_pk_1.id
=> 1
budget.another_collection.where(another_obj_id: some_obj_with_pk_1.id)
=> []
This shouldn't happen. What is happening is that Rails queries the database for any items in another_collection with another_obj_id = 1. Since this collection hasn't been saved to the database yet, none of these items show up in the results.
Is there a function or something I can pass to Arel's where method that says to use local results only? I know I can just iterate over the items and find it but it would be great if I didn't have to do that and just use a method that does this already.
You could always use Enumerable#select, which takes a block and returns only the elements for which the block returns true. You'd want to make sure that ActiveRecord has retrieved the result set first (by calling to_a on your query).
records = Model.where(some_attribute: value).to_a
filtered_records = records.select { |record| record.something? }
Depending on your result set and your needs, a second database query may be faster, as your SQL store is better suited to these comparisons than Ruby is. But if your records have yet to be saved, you would need to do something like the above, since they aren't persisted yet.
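As a rough illustration of the Enumerable#select approach applied to unsaved objects, here is a minimal sketch of a where-like helper that only looks at records already in memory. UnsavedLine and local_where are hypothetical names invented for this example; with real models you would pass the association's loaded records instead of the Struct instances.

```ruby
# Hypothetical stand-in for an unsaved model object (not an ActiveRecord class).
UnsavedLine = Struct.new(:another_obj_id, :label)

# Filters in-memory records by attribute equality, loosely imitating
# where(attr: value) without touching the database.
def local_where(records, **conditions)
  records.select do |record|
    conditions.all? { |attr, value| record.public_send(attr) == value }
  end
end

unsaved = [
  UnsavedLine.new(1, "a"),
  UnsavedLine.new(2, "b"),
  UnsavedLine.new(1, "c")
]

matches = local_where(unsaved, another_obj_id: 1)
puts matches.map(&:label).inspect # prints ["a", "c"]
```

This is a linear scan in Ruby, so it suits small collections of not-yet-persisted objects rather than large result sets.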

Rails Mongoid model query result returns wrong size/length/count info even when using limit

When querying a certain model in my Rails application, it returns the correct results, except for the size, length or count information, even when using the limit criteria.
recipes = Recipe
.where(:bitly_url => /some.url/)
.order_by(:date => :asc)
.skip(10)
.limit(100)
recipes.size # => 57179
recipes.count # => 57179
recipes.length # => 57179
I can't understand why this is happening, it keeps showing the total count of the recipes collection, and the correct value should be 100 since I used limit.
count = 0
recipes.each do |recipe|
count += 1
end
# WAT
count # => 100
Can somebody help me?
Thanks!
--
Rails version: 3.2.3
Mongoid version: 2.4.10
MongoDB version: 1.8.4
From the fine manual:
- (Integer) length
Also known as: size
Get's the number of documents matching the query selector.
But .limit doesn't really alter the query selector, as it doesn't change what the query matches; .offset and .limit only alter which segment of the matches is returned. This doesn't match the behavior of ActiveRecord, and the documentation isn't exactly explicit about this subtle point. However, Mongoid's behaviour does match what the MongoDB shell does:
> db.things.find().limit(2).count()
23
My things collection contains 23 documents and you can see that the count ignores the limit.
If you want to know how many results are returned then you could to_a it first:
recipes.to_a.length
As mentioned in one of the comments, in newer Mongoid versions (not sure which ones), you can simply use recipes.count(true) and this will include the limit, without needing to query the result set, as per the API here.
In the current version of Mongoid (5.x), count(true) no longer works. Instead, count now accepts an options hash, which includes a :limit option:
criteria.count(limit: 10)
Or, to reuse whatever limit is already set on the criteria:
criteria.count(criteria.options.slice(:limit))

How do I use .sort() to create a relation?

I am using the kaminari gem for pagination. I have a resources controller which paginates perfectly (due to the simple nature of the ordering). That can be seen here:
@resources = Resource.order("created_at desc").page(params[:page]).per(25)
That just sorts them latest first. When I call .class on it, it appears to be an ActiveRecord::Relation.
On my tags, though, I want to sort them by a relationship (the number of resources assigned to that tag):
@tags = Tag.all.sort{|a, b| b.number_of_resources <=> a.number_of_resources}.page(params[:page]).per(50)
However, it gives me the error: undefined method `page' for #<Array:...>
Tag.all returns an Array, hence your #page call failing, as it expects an ARel relation.
If #number_of_resources maps to a DB column, then all you need to do is:
Tag.order('number_of_resources').page(params[:page]).per(50)
If it's not, you either need to add it to the Tag database table, or just do your sort/paginate in Ruby rather than using kaminari. This will be feasible if the number of tags is under ~1000 or so.
If you do add the info to the db, check out this post: Counter Cache for a column with conditions?
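For the option of sorting and paginating in Ruby, a minimal sketch might look like the following. TagRow and paginate_sorted are hypothetical names used only for illustration; TagRow stands in for a Tag that responds to number_of_resources, and the page and per arguments mirror the values used in the question.

```ruby
# Hypothetical stand-in for a Tag with a computed resource count.
TagRow = Struct.new(:name, :number_of_resources)

# Sorts by resource count (highest first) and returns one page of results.
def paginate_sorted(tags, page: 1, per: 50)
  page = [page.to_i, 1].max # treat nil or 0 as the first page
  sorted = tags.sort_by { |tag| -tag.number_of_resources }
  sorted.slice((page - 1) * per, per) || [] # nil when the page is out of range
end

tags = [
  TagRow.new("ruby", 3),
  TagRow.new("rails", 7),
  TagRow.new("arel", 1),
  TagRow.new("kaminari", 5)
]

puts paginate_sorted(tags, page: 1, per: 2).map(&:name).inspect # prints ["rails", "kaminari"]
puts paginate_sorted(tags, page: 2, per: 2).map(&:name).inspect # prints ["ruby", "arel"]
```

Every request loads and sorts all tags, which is why this only makes sense for small tables. Kaminari also provides Kaminari.paginate_array for wrapping a plain Array if you want to keep its view helpers.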
You should do something like this: 1) join the two tables, 2) group the rows by tag, 3) count how many rows belong to each group, 4) order by that new count column.
Write a good SQL statement for this, and then you can call pagination on the result.

Rails 2.3.8: Fetching objects via ActiveRecord without building all the objects

I'm wondering if there's a way to fetch objects from the DB via ActiveRecord, without having Rails build the whole objects (just a few fields).
For example,
I sometimes need to check whether a certain object contains a certain field.
Let's say I have a Student object referencing a Bag object (each student has one bag).
I need to check whether a female student exists whose bag has more than 4 pencils.
In ActiveRecord, I would have to do something like this:
exists = Student.female.find(:all, :conditions => 'bags.pencil_count > 4', :include => :bag).size > 0
The problem is that if there are 1000 students complying with this condition,
1000 objects would be built by AR, including their 1000 bags.
This reduces me to using plain SQL for this query (for performance reasons), which bypasses AR.
I won't be using the named scopes, so I would have to remember to update the SQL all around the code
if something in the named scope changes.
This is an example, but there are many more cases that for performance reasons,
I would have to use SQL instead of letting AR build many objects,
and this breaks encapsulation.
Is there any way to tell AR not to build the objects, or just build a certain field (also in associations)?
If you're only testing for the existence of a matching record, just use Model.count from ActiveRecord::Calculations, e.g.:
exists = Student.female.count( :conditions => 'bags.pencil_count > 4',
:joins => :bag
) > 0
count simply (as the name of the module implies) does the calculation and doesn't build any objects.
(And for future reference it's good to know the difference between :include and :joins. The former eager-loads the associated model, whereas the latter does not, but still lets you use those fields in your :conditions.)
Jordan gave the best answer here, especially regarding using :joins instead of :include (because a join won't actually create the bag objects).
I'll just add to it by saying that if you do still need the "Student" objects (just with a small amount of info on them), you can also use the :select keyword. It works just like in MySQL and means the DB I/O is reduced to just the info you put in the select. You can also add derived fields from the other tables, e.g.:
students = Student.female.all(
:select => 'students.id, students.name, bags.pencil_count AS pencil_count',
:conditions => "students.gender = 'F' AND bags.pencil_count > 4",
:joins => :bag
)
students.each do |student|
puts "#{student.name} has #{student.pencil_count} pencils in her bag"
end
would give, e.g.:
Jenny has 5 pencils in her bag
Samantha has 14 pencils in her bag
Jill has 8 pencils in her bag
(Note, though, that a derived field such as pencil_count will be a string, so you may need to cast it, e.g. with student.pencil_count.to_i.)

Subquery in Rails report generation

I'm building a report in a Ruby on Rails application and I'm struggling to understand how to use a subquery.
Each 'Survey' has_many 'SurveyResponses', and it is simple enough to retrieve these. However, I need to group them according to one of the fields, 'jobcode', as I only want to report the information relating to a single jobcode on one line of the report.
However I also need to know the constituent data that makes up the totals for that jobcode. The reason for this is that I need to calculate data such as medians and standard deviations and so need to know the values that make the total.
My thinking is that I retrieve the distinct jobcodes that were reported on for the survey and then as I loop through these I retrieve the individual responses for each jobcode.
Is this the correct way to do this or should I follow a different method?
You could use a named scope to simplify getting the groups of responses:
named_scope :job_group, lambda{|job_code| {:conditions => ["job_code = ?", job_code]}}
Put that in your response model, and use it like this:
job.responses.job_group('some job code')
and you'll get an array of responses. If you need the individual values of one of the attributes on the responses (for example, to compute a mean), you can use map:
r = job.responses.job_group('some job code')
r.map(&:total)
=> [1, 5, 3, 8]
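Once the values are in plain Ruby arrays, the medians and standard deviations the question asks about can be computed per job code. The following is a minimal sketch, assuming each response exposes job_code and total; SurveyRow, median, std_dev and stats_by_job_code are hypothetical names for this example, and std_dev here is the population standard deviation.

```ruby
# Hypothetical stand-in for a survey response record.
SurveyRow = Struct.new(:job_code, :total)

# Middle value of the sorted list (mean of the two middle values if even).
def median(values)
  sorted = values.sort
  mid = sorted.length / 2
  sorted.length.odd? ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2.0
end

# Population standard deviation.
def std_dev(values)
  mean = values.sum.to_f / values.length
  Math.sqrt(values.sum { |v| (v - mean)**2 } / values.length)
end

# Groups responses by job code and summarises each group's totals.
def stats_by_job_code(responses)
  responses.group_by(&:job_code).transform_values do |group|
    totals = group.map(&:total)
    { count: totals.length, median: median(totals), std_dev: std_dev(totals) }
  end
end

responses = [
  SurveyRow.new("A1", 1), SurveyRow.new("A1", 5),
  SurveyRow.new("A1", 3), SurveyRow.new("A1", 8),
  SurveyRow.new("B2", 4)
]

stats = stats_by_job_code(responses)
puts stats["A1"][:median] # prints 4.0
```

Grouping in Ruby needs only one query for all responses of a survey, instead of one query per job code inside a loop.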
Alternatively, you might find it quicker to write custom SQL in order to get the mean / average / sum of groups of attributes. Going through rails for this sort of work may cause significant lag.
ActiveRecord::Base.connection.execute("Custom SQL here")
You can also use Model.find_by_sql()
For example:
class User < ActiveRecord::Base
# Your usual AR model
end
...
def index
@users = User.find_by_sql "select * from users"
# etc
end

Resources