Sum returns value when should return 0 - ruby-on-rails

I'm having a really confusing problem. Here it is:
[12] pry(EstimatedTime)> EstimatedTime.where(user_id: User.current.id, plan_on: date).pluck(:hours) => []
[13] pry(EstimatedTime)> EstimatedTime.where(user_id: User.current.id, plan_on: date).sum(:hours) => 3.0
What kind of magic is this?
This statement resides in model method, that is being called from view. Before that, in controller action, i'm calling another method of the same model, that is performing bulk create of records within transaction.
def self.save_all(records)
transaction do
records.each do |record|
record.save!
end
end
rescue ActiveRecord::RecordInvalid
return false
end
The exception is being thrown, method returns false, view is rendered and this happens.
UPD
I found a workaround, replacing .sum with .pluck(:hours).sum, but I still have no idea why my first way of doing this fails.

As David Aldrige pointed out in comments to the question, the problem was that .sum(:hours) was using cached data, while .pluck(:hours) was actually looking into database.
Why database and query cache contained different data? Well, seems like in Rails 3 (I should probably have specified my rails version when asking question) failed transaction would rollback the database, but leave query cache intact. This led to temporary data inconsistency between database and cache. This was corrected in Rails 4 as pointed out in this issue.
Solution 1 aka the one I chose for myself
Clear the query cache after transaction fails. So, the method performing the mass insert within transaction now looks like this:
def self.save_all(records)
transaction do
records.each do |record|
record.save!
end
end
rescue ActiveRecord::RecordInvalid
self.connection.clear_query_cache
return false
end
Solution 2
I personally find this much less elegant, though I cannot compare these solutions performance-wise, so I'll post them both. One can simply write the sum statement like this:
EstimatedTime.where(user_id: User.current.id, plan_on: date).pluck(:hours).sum
It will use the database data to calculate sum, bypassing the inconsistent cache problem.

Related

Ruby on Rails #find_each and #each. is this bad practice?

I know that find_each has been designed to consume smaller memory than each.
I found some code that other people wrote long ago. and I think that it's wrong.
Think about this codes.
users = User.where(:active => false) # What this line does actually? Nothing?
users.find_each do |user|
# update or do something..
user.update(:do_something => "yes")
end
in this case, It will store all user objects to the users variable. so we already populated the full amount of memory space. There is no point using find_each later on.
Am I correct?
so in other words, If you want to use find_each, you always need to use it with ActiveRecord::Relation object. Like this.
User.where(:active => false).find_each do |user|
# do something...
end
What do you think guys?
Update
in users = User.where(:active => false) line,
Some developer insists that rails never execute query unless we don't do anything with that variable.
What if we have a class with initialize method that has query?
class Test
def initialize
#users = User.where(:active => true)
end
def do_something
#user.find_each do |user|
# do something really..
end
end
end
If we call Test.new, what would happen? Nothing will happen?
users = User.where(:active => false) doesn't run a query against the database and it doesn't return an array with all inactive users. Instead, where returns an ActiveRecord::Relation. Such a relation basically describes a database query that hasn't run yet. The defined query is only run against the database when the actual records are needed. This happens for example when you run one of the following methods on that relation: find, to_a, count, each, and many others.
That means the change you did isn't a huge improvement, because it doesn't change went and how the database is queried.
But IMHO that your code is still slightly better because when you do not plan to reuse the relation then why assign it to a variable in the first place.
users = User.where(:active => false)
users.find_each do |user|
User.where(:active => false).find_each do |user|
Those do the same thing.
The only difference is the first one stores the ActiveRecord::Relation object in users before calling #find_each on it.
This isn't a Rails thing, it applies to all of Ruby. It's method chaining common to most object-oriented languages.
array = Call.some_method
array.each{ |item| do_something(item) }
Call.some_method.each{ |item| do_something(item) }
Again, same thing. The only difference is in the first the intermediate array will persist, whereas in the second the array will be built and then eventually deallocated.
If we call Test.new, what would happen? Nothing will happen?
Exactly. Rails will make an ActiveRecord::Relation and it will defer actually contacting the database until you actually do a query.
This lets you chain queries together.
#inactive_users = User.where(active: false).order(name: :asc)
Later you can to the query
# Inactive users whose favorite color is green ordered by name.
#inactive_users.where(favorite_color: :green).find_each do |user|
...
end
No query is made until find_each is called.
In general, pass around relations rather than arrays of records. Relations are more flexible and if it's never used there's no cost.
find_each is special in that it works in batches to avoid consuming too much memory on large tables.
A common mistake is to write this:
User.where(:active => false).each do |user|
Or worse:
User.all.each do |user|
Calling each on an ActiveRecord::Relation will pull all the results into memory before iterating. This is bad for large tables.
find_each will load the results in batches of 1000 to avoid using too much memory. It hides this batching from you.
There are other methods which work in batches, see ActiveRecord::Batches.
For more see the Rails Style Guide and use rubocop-rails to scan your code for issues and make suggestions and corrections.

How can I get active record to throw an exception if there are too many records

Following the principle of fail-fast:
When querying the database where there should only ever be one record, I want an exception if .first() (first) encounters more than one record.
I see that there is a first! method that throws if there's less records than expected but I don't see anything for if there's two or more.
How can I get active record to fail early if there are more records than expected?
Is there a reason that active record doesn't work this way?
I'm used to C#'s Single() that will throw if two records are found.
Why would you expect activerecord's first method to fails if there are more than 1 record? it makes no sense for it to work that way.
You can define your own class method the count the records before getting the first one. Something like
def self.first_and_only!
raise "more than 1" if size > 1
first!
end
That will raise an error if there are more than 1 and also if there's no record at all. If there's one and only one it will return it.
It seems ActiveRecord has no methods like that. One useful method I found is one?, you can call it on an ActiveRecord::Relation object. You could do
users = User.where(name: "foo")
raise StandardError unless users.one?
and maybe define your own custom exception
If you care enough about queries performance, you have to avoid ActiveRecord::Relation's count, one?, none?, many?, any? etc, which spawns SQL select count(*) ... query.
So, your could use SQL limit like:
def self.single!
# Only one fast DB query
result = limit(2).to_a
# Array#many? not ActiveRecord::Calculations one
raise TooManySomthError if result.many?
# Array#first not ActiveRecord::FinderMethods one
result.first
end
Also, when you expect to get only one record, you have to use Relation's take instead of first. The last one is for really first record, and can produce useless SQL ORDER BY.
find_sole_by (Rails 7.0+)
Starting from Rails 7.0, there is a find_sole_by method:
Finds the sole matching record. Raises ActiveRecord::RecordNotFound if no record is found. Raises ActiveRecord::SoleRecordExceeded if more than one record is found.
For example:
Product.find_sole_by(["price = %?", price])
Sources:
ActiveRecord::FinderMethods#find_sole_by.
Rails 7 adds ActiveRecord methods #sole and #find_sole_by.
Rails 7.0 adds ActiveRecord::FinderMethods 'sole' and 'find_sole_by'.

Rails: ActiveRecord:RecordNotUnique with first_or_create

I have this index in my Model's table:
UNIQUE KEY `index_panel_user_offer_visits_on_offer_id_and_panel_user_id` (`offer_id`,`panel_user_id`)
And this code:
def get_offer_visit(offer_id, panel_user_id)
PanelUserOfferVisit.where(:offer_id => offer_id, :panel_user_id => panel_user_id).first_or_create!
end
is randomly causing an ActiveRecord::RecordNotUnique exception in our application,
the issue is:
Duplicate entry for key
'index_panel_user_offer_visits_on_offer_id_and_panel_user_id'
I've read on the rails doc that first_or_create! is not an atomic method and can cause this kind of issues in a high concurrency environment and that a solution would be just to catch the exception and retry.
I tried different approaches, including Retriable gem to retry a certain number of times to repeat the operation but RecordNotUnique is still raised.
I even tried to change the logic:
def get_offer_visit(offer_id, panel_user_id)
begin
offer_visit = PanelUserOfferVisit.where(:offer_id => offer_id, :panel_user_id => panel_user_id).first_or_initialize
offer_visit.save!
offer_visit
rescue ActiveRecord::RecordNotUnique
offer_visit = PanelUserOfferVisit.where(:offer_id => offer_id, :panel_user_id => panel_user_id).first
raise Exception.new("offer_visit not found") if offer_visit.nil?
offer_visit
end
end
But the code still raise the custom exception I made and I really don't understand how it can fails to create the record on the first try because the record exists but then when it tries to find the record it doesn't find it again.
What am I missing?
You seem to be hitting rails query cache for PanelUserOfferVisit.where(...).first query: rails have just performed identical SQL query, it thinks that result will remain the same for the duration of the request.
ActiveRecord::Base.connection.clear_query_cache before the second call to db should help
Also you do not have to save! the record if it is not new_record?
If I understand properly your problem here is due to a race condition. I didn't found the exact piece of code from Github but lets say the method find_or_create is something like (edit: here is the code):
def first_or_create!(attributes = nil, &block)
first || create!(attributes, &block)
end
You should try to solve the race conditions using optimistic or pessimistic locking
Once locking implemented within your custom method, you should continue capturing possible NotUnique exceptions but try to clean cache or use uncached block before to perform the find again.
I experienced a similar issue but the root cause was not related to a race condition.
With first_or_create!, the INSERT remove surrounding spaces in the value but the SELECT don't.
So if you try this:
Users.where(email: 'user#mail.com ')
(mind the space at the end of the string), the resulting sql query will be:
SELECT "users".* FROM "users" WHERE "users"."email" = 'user#mail.com ' ORDER BY "users"."id";
which result in not finding the potentially existing record, then the insert query will be:
INSERT INTO "users" ("email") VALUES ('user#mail.com');
which can result in ActiveRecord::RecordNotUnique error if a record with 'user#mail.com' as an email already exists.
I do not have enough details about your problem so it might be unrelated but it might helps others.

Ruby on Rails - ActiveRecord::Relation count method is wrong?

I'm writing an application that allows users to send one another messages about an 'offer'.
I thought I'd save myself some work and use the Mailboxer gem.
I'm following a test driven development approach with RSpec. I'm writing a test that should ensure that only one Conversation is allowed per offer. An offer belongs_to two different users (the user that made the offer, and the user that received the offer).
Here is my failing test:
describe "after a message is sent to the same user twice" do
before do
2.times { sending_user.message_user_regarding_offer! offer, receiving_user, random_string }
end
specify { sending_user.mailbox.conversations.count.should == 1 }
end
So before the test runs a user sending_user sends a message to the receiving_user twice. The message_user_regarding_offer! looks like this:
def message_user_regarding_offer! offer, receiver, body
conversation = offer.conversation
if conversation.nil?
self.send_message(receiver, body, offer.conversation_subject)
else
self.reply_to_conversation(conversation, body)
# I put a binding.pry here to examine in console
end
offer.create_activity key: PublicActivityKeys.message_received, owner: self, recipient: receiver
end
On the first iteration in the test (when the first message is sent) the conversation variable is nil therefore a message is sent and a conversation is created between the two users.
On the second iteration the conversation created in the first iteration is returned and the user replies to that conversation, but a new conversation isn't created.
This all works, but the test fails and I cannot understand why!
When I place a pry binding in the code in the location specified above I can examine what is going on... now riddle me this:
self.mailbox.conversations[0] returns a Conversation instance
self.mailbox.conversations[1] returns nil
self.mailbox.conversations clearly shows a collection containing ONE object.
self.mailbox.conversations.count returns 2?!
What is going on there? the count method is incorrect and my test is failing...
What am I missing? Or is this a bug?!
EDIT
offer.conversation looks like this:
def conversation
Conversation.where({subject: conversation_subject}).last
end
and offer.conversation_subject:
def conversation_subject
"offer-#{self.id}"
end
EDIT 2 - Showing the first and second iteration in pry
Also...
Conversation.all.count returns 1!
and:
Conversation.all == self.mailbox.conversations returns true
and
Conversation.all.count == self.mailbox.conversations.count returns false
How can that be if the arrays are equal? I don't know what's going on here, blown hours on this now. Think it's a bug?!
EDIT 3
From the source of the Mailboxer gem...
def conversations(options = {})
conv = Conversation.participant(#messageable)
if options[:mailbox_type].present?
case options[:mailbox_type]
when 'inbox'
conv = Conversation.inbox(#messageable)
when 'sentbox'
conv = Conversation.sentbox(#messageable)
when 'trash'
conv = Conversation.trash(#messageable)
when 'not_trash'
conv = Conversation.not_trash(#messageable)
end
end
if (options.has_key?(:read) && options[:read]==false) || (options.has_key?(:unread) && options[:unread]==true)
conv = conv.unread(#messageable)
end
conv
end
The reply_to_convesation code is available here -> http://rubydoc.info/gems/mailboxer/frames.
Just can't see what I'm doing wrong! Might rework my tests to get around this. Or ditch the gem and write my own.
see this Rails 3: Difference between Relation.count and Relation.all.count
In short Rails ignores the select columns (if more than one) when you apply count to the query. This is because
SQL's COUNT allows only one or less columns as parameters.
From Mailbox code
scope :participant, lambda {|participant|
select('DISTINCT conversations.*').
where('notifications.type'=> Message.name).
order("conversations.updated_at DESC").
joins(:receipts).merge(Receipt.recipient(participant))
}
self.mailbox.conversations.count ignores the select('DISTINCT conversations.*') and counts the join table with receipts, essentially counting number of receipts with duplicate conversations in it.
On the other hand, self.mailbox.conversations.all.count first gets the records applying the select, which gets unique conversations and then counts it.
self.mailbox.conversations.all == self.mailbox.conversations since both of them query the db with the select.
To solve your problem you can use sending_user.mailbox.conversations.all.count or sending_user.mailbox.conversations.group('conversations.id').length
I have tended to use the size method in my code. As per the ActiveRecord code, size will use a cached count if available and also returns the correct number when models have been created through relations and have not yet been saved.
# File activerecord/lib/active_record/relation.rb, line 228
def size
loaded? ? #records.length : count
end
There is a blog on this here.
In Ruby, #length and #size are synonyms and both do the same thing: they tell you how many elements are in an array or hash. Technically #length is the method and #size is an alias to it.
In ActiveRecord, there are several ways to find out how many records are in an association, and there are some subtle differences in how they work.
post.comments.count - Determine the number of elements with an SQL COUNT query. You can also specify conditions to count only a subset of the associated elements (e.g. :conditions => {:author_name => "josh"}). If you set up a counter cache on the association, #count will return that cached value instead of executing a new query.
post.comments.length - This always loads the contents of the association into memory, then returns the number of elements loaded. Note that this won't force an update if the association had been previously loaded and then new comments were created through another way (e.g. Comment.create(...) instead of post.comments.create(...)).
post.comments.size - This works as a combination of the two previous options. If the collection has already been loaded, it will return its length just like calling #length. If it hasn't been loaded yet, it's like calling #count.
It is also worth mentioning to be careful if you are not creating models through associations, as the related model will not necessarily have those instances in its association proxy/collection.
# do this
mailbox.conversations.build(attrs)
# or this
mailbox.conversations << Conversation.new(attrs)
# or this
mailbox.conversations.create(attrs)
# or this
mailbox.conversations.create!(attrs)
# NOT this
Conversation.new(mailbox_id: some_id, ....)
I don't know if this explains what's going on, but the ActiveRecord count method queries the database for the number of records stored. The length of the Relation could be different, as discussed in http://archive.railsforum.com/viewtopic.php?id=6255, although in that example, the number of records in the database was less than the number of items in the Rails data structure.
Try
self.mailbox.conversations.reload; self.mailbox.conversations.count
or perhaps
self.mailbox.reload; self.mailbox.conversations.count
or, if neither of those work, just try reloading as many of the objects as possible to see if you can get it to work (self, mailbox, conversations, etc.).
My guess is that something is messed up between memory and the DB. This is definitely a really weird error though, might wanna put in an issue on Rails to see why this would be the case.
The result of mailbox.conversations is cached after the first call. To reload it write mailbox.conversations(true)

Empty Scope with Ruby on Rails

Following Problem:
I need something like an empty scope. Which means that this scope is emtpy, but responds to all methods a scope usually responds to.
I'm currently using a little dirty hack. I simply supply "1=0" as conditions. I find this realy ugly, since it hits the database. Simply returning an empty array won't work, since the result must respond to the scoped methods.
Is there a better existing solution for this or will I need to code this myself?
Maybe some example code could help explain what i need:
class User < ActiveRecord::Base
named_scope :admins, :conditions => {:admin => true }
named_scope :none_dirty, :conditions => "1=0" # this scope is always empty
def none_broken
[]
end
def self.sum_score # okay, a bit simple, but a method like this should work!
total = 0
self.all.each do |user|
total += user.score
end
return total
end
end
User.admin.sum_score # the score i want to know
User.none_drity.sum_score # works, but hits the db
User.none_broken.sum_score # ...error, since it doesn't respond to sum_score
Rails 4 introduces the none scope.
It is to be used in instances where you have a method which returns a relation, but there is a condition in which you do not want the database to be queried.
If you want a scope to return an unaltered scope use all:
No longer will a call to Model.all execute a query immediately and return an array of records. In Rails 4, calls to Model.all is equivalent to now deprecated Model.scoped. This means that more relations can be chained to Model.all and the result will be lazily evaluated.
User.where('false')
returns an ActiveRecord::Relation with zero elements, that is a chain-able scope that won't hit the database until you actually try to access one of its elements. This is similar to PhilT's solution with ('1=0') but a little more elegant.
Sorry User.scoped is not what you want. As commented this returns everything. Should have paid more attention to the question.
I've seen where('1 = 0') suggested before and Rails should probably cache it as well.
Also, where('1 = 0') won't hit the database until you do .all, .each, or one of the calculations methods.
I thing you need User.scoped({})
How about User.where(id: nil) ?
Or User.where(_id: nil) for mongoid.
The thing you are looking for does not exist. You could implement something like this by monky patching the find method. Yet, this would be an overkill, so I recomend keeping this unless it's performance critical.
Looking at your example code indicates you may not know about aggregated queries in SQL which are exposed as calculations methods in Rails:
User.sum(:score) will give you the sum of all users' scores
Take a look at Rails Guides for more info:
http://guides.rubyonrails.org/active_record_querying.html#sum

Resources