What is the difference between using .exists?, and .present? in Ruby? - ruby-on-rails

I want to make sure I'm using them on the correct occasions and want to know of any subtleties. When I use them in the console they seem to function the same way, which is to check whether an object field has been defined, and I couldn't find much information online with a Google search. Thanks!

To clarify: neither present? nor exists? are "pure" ruby—they're both from Rails-land.
present?
present? is an ActiveSupport extension to Object. It's usually used as a test for an object's general "falsiness". From the documentation:
An object is present if it’s not blank?. An object is blank if it’s false, empty, or a whitespace string.
So, for example:
[ "", " ", false, nil, [], {} ].any?(&:present?)
# => false
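As a rough illustration of the rule quoted above, here is a plain-Ruby approximation of the blank?/present? semantics. The Presence module and its method names are hypothetical helpers for this sketch only; it is not ActiveSupport's actual implementation, which adds these methods to Object and its subclasses.

```ruby
# Plain-Ruby approximation of the blank?/present? rule (hypothetical helper,
# not ActiveSupport's code).
module Presence
  module_function

  # An object is blank if it is nil, false, empty, or a whitespace-only string.
  def blank?(obj)
    case obj
    when nil, false then true
    when String then obj.match?(/\A[[:space:]]*\z/)
    else obj.respond_to?(:empty?) ? obj.empty? : false
    end
  end

  # An object is present if it is not blank.
  def present?(obj)
    !blank?(obj)
  end
end

samples = ["", " ", false, nil, [], {}]
puts samples.any? { |o| Presence.present?(o) } # prints false
puts Presence.present?(0)                      # prints true: numbers are never blank
```

Note that present? is not the same as Ruby truthiness: an empty string is truthy in Ruby but is still blank under this rule.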
exists?
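exists? is from ActiveResource (ActiveRecord provides a method of the same name through ActiveRecord::FinderMethods). From the ActiveResource documentation: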
Asserts the existence of a resource, returning true if the resource is found.
Note.create(:title => 'Hello, world.', :body => 'Nothing more for now...')
Note.exists?(1) # => true

The big difference between the two methods is that when you call present? on a relation, it instantiates an ActiveRecord object for each record found(!), while exists? does not.
To show this, I added an after_initialize callback on User. It prints: 'You have initialized an object!'
User.where(name: 'mike').present?
User Load (8.1ms) SELECT "users".* FROM "users" WHERE "users"."name" = $1 ORDER BY users.id ASC [["name", 'mike']]
You have initialized an object!
You have initialized an object!
User.exists?(name: 'mike')
User Exists (2.4ms) SELECT 1 AS one FROM "users" WHERE "users"."name" = $1 ORDER BY users.id ASC LIMIT 1 [["name", 'mike']]

There is a huge difference in performance, and .present? can be up to 10x slower than .exists?, depending on the relation you are checking.
This article benchmarks .present? vs .any? vs .exists? and explains why they go from slower to faster, in this order.
In a nutshell, .present? (900ms in the example) will load all records returned, .any? (100ms in the example) will use a SQL COUNT to see if it's > 0, and .exists? (1ms in the example) is the smart kid that uses SQL LIMIT 1 to just check whether there's at least one record, without loading or counting them all.
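To make the three strategies concrete, here is a toy in-memory simulation in plain Ruby. It is only a sketch: Row, ROWS, and the three check_* methods are hypothetical names, no database is involved, and a real database with suitable indexes may not need to visit every row when counting.

```ruby
# Toy in-memory "table"; each strategy reports how much work it did.
Row = Struct.new(:id, :name)

# 1,000 rows: even ids are "mike", odd ids are "anna".
ROWS = (1..1_000).map { |i| Row.new(i, i.even? ? "mike" : "anna") }.freeze

# Like loading the relation: build an object for every matching row.
def check_by_loading(rows, name)
  built = 0
  loaded = rows.select { |r| r.name == name }.map { |r| built += 1; r.dup }
  { found: !loaded.empty?, rows_examined: rows.size, objects_built: built }
end

# Like COUNT(*): visit every row, but build nothing.
def check_by_counting(rows, name)
  count = rows.count { |r| r.name == name }
  { found: count > 0, rows_examined: rows.size, objects_built: 0 }
end

# Like SELECT 1 ... LIMIT 1: stop at the first match.
def check_by_first_match(rows, name)
  examined = 0
  found = rows.any? { |r| examined += 1; r.name == name }
  { found: found, rows_examined: examined, objects_built: 0 }
end

p check_by_loading(ROWS, "mike")
p check_by_counting(ROWS, "mike")
p check_by_first_match(ROWS, "mike")
```

All three report that a match was found, but only the first-match version stops early and builds no objects.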

SELECT COUNT(*) would scan the records to get a count.
SELECT 1 would stop after the first match, so their exec time would be very different.

The SQL generated by the two is also different.
present?:
Thing.where(name: "Bob").present?
# => SELECT "things".* FROM "things" WHERE "things"."name" = 'Bob';
exists?:
Thing.exists?(name: "Bob")
# => SELECT 1 AS one from things WHERE name ="Bob" limit 1;
They both seemed to run at the same speed for me, but this may vary depending on your situation.

You can avoid an extra database query by using present? when the relation has already been loaded:
all_endorsements_11 = ArtworkEndorsement.where(user_id: 11)
ArtworkEndorsement Load (0.3ms) SELECT "artwork_endorsements".* FROM "artwork_endorsements" WHERE "artwork_endorsements"."user_id" = $1 [["user_id", 11]]
all_endorsements_11.present?
=> true
all_endorsements_11.exists?
ArtworkEndorsement Exists (0.4ms) SELECT 1 AS one FROM "artwork_endorsements" WHERE "artwork_endorsements"."user_id" = $1 LIMIT 1 [["user_id", 11]]
=> true
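The caching behaviour shown in that console session can be sketched with a small plain-Ruby stand-in. FakeRelation, load_records, and the queries counter are hypothetical names for illustration; this is not how ActiveRecord is implemented, only a model of "load once, then reuse" versus "always ask the database".

```ruby
# Toy stand-in for a lazily loaded relation; counts simulated queries.
class FakeRelation
  attr_reader :queries

  def initialize(rows)
    @rows = rows
    @loaded = nil
    @queries = 0
  end

  # Simulates loading all rows once and caching the result.
  def load_records
    if @loaded.nil?
      @queries += 1
      @loaded = @rows.dup
    end
    self
  end

  # Uses the cached rows; only "queries" if nothing has been loaded yet.
  def present?
    load_records
    !@loaded.empty?
  end

  # Always issues a fresh existence check, even when rows are cached.
  def exists?
    @queries += 1
    !@rows.empty?
  end
end

relation = FakeRelation.new([{ user_id: 11 }]).load_records
relation.present?
puts relation.queries # prints 1: present? reused the loaded rows
relation.exists?
puts relation.queries # prints 2: exists? ran another query
```

So if the records are going to be loaded and used anyway, present? costs nothing extra, whereas exists? is the cheaper choice when only the yes/no answer is needed.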

Related

Rails: Why does pluck method return uniq values?

I have an Account model with a column role.
I want to select distinct roles by created_at date (for example all distinct roles created on 01.01.2018 etc) and get only values of column role.
Selection of distinct roles works fine as a query, but when it comes to getting the values, I'm getting unexpected results.
If I'm just using a map function on all the query results, everything works good and SQL query looks fine.
Account.where(id: 1..10).select(:created_at, :role).distinct.map(&:role)
Account Load (1.0ms) SELECT DISTINCT "accounts"."created_at", "accounts"."role" FROM "accounts" WHERE ("accounts"."id" BETWEEN $1 AND $2) [["id", 1], ["id", 10]]
=> ["admin", "manager", "manager", "manager", "manager", "manager", "manager", "manager", "manager"]
But if I change .map(&:role) to .pluck(:role), which I expected to be equivalent, pluck drops created_at from the DISTINCT clause and leaves only DISTINCT on role, as we can see at the beginning of the query.
Account.where(id: 1..10).select(:created_at, :role).distinct.pluck(:role)
(0.7ms) SELECT DISTINCT "accounts"."role" FROM "accounts" WHERE ("accounts"."id" BETWEEN $1 AND $2) [["id", 1], ["id", 10]]
=> ["admin", "manager"]
In the pluck documentation (apidock) it's written that pluck will use distinct only if the code looks like .pluck('distinct role').
Why does it work like this in my case? Is it some undocumented feature?
The short answer to your question "Why does it work like this in my case?" is because it is supposed to work like this. You stated:
In pluck documentation . . . it's written, that pluck will use distinct only if the code looks like .pluck('distinct role')
This is not accurate. The doc you referenced shows an example like this as a way to pluck with DISTINCT, but does not say that this is the only way to apply the DISTINCT SQL modifier. Since you have added .distinct onto your ActiveRecord relation, the resulting query will be SELECT DISTINCT. This prompts SQL to give you unique values, not the pluck method; pluck is only returning exactly what your DB gave to it.
For a way to achieve what you are after using pluck for distinct combinations of created_at and role, you can use a group instead:
Account.where(id: 1..10).group(:created_at, :role).pluck(:role)
# => SELECT "accounts"."role" FROM "accounts" WHERE ("accounts"."id" BETWEEN $1 AND $2) GROUP BY "accounts"."created_at", "accounts"."role" [["id", 1], ["id", 10]]
The .group(:created_at, :role) call (which adds a GROUP BY SQL clause) will give you unique combinations of rows based on created_at and role (the same role may appear multiple times if it is associated with multiple created_at values). Then .pluck(:role) will take only the values for role.
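The difference between the three queries can be reproduced on plain Ruby arrays. This is only a sketch with made-up rows (the accounts data below is hypothetical), and unlike Ruby's group_by, SQL does not guarantee any row order without an ORDER BY.

```ruby
# Hypothetical rows standing in for the accounts table.
accounts = [
  { created_at: "2018-01-01", role: "admin" },
  { created_at: "2018-01-01", role: "manager" },
  { created_at: "2018-01-02", role: "manager" },
  { created_at: "2018-01-02", role: "manager" }, # duplicate (created_at, role) pair
  { created_at: "2018-01-03", role: "manager" }
]

# SELECT DISTINCT created_at, role ... followed by map(&:role)
distinct_pairs = accounts.map { |a| [a[:created_at], a[:role]] }.uniq
roles_from_pairs = distinct_pairs.map(&:last)

# SELECT DISTINCT role (the query that .distinct.pluck(:role) runs)
distinct_roles = accounts.map { |a| a[:role] }.uniq

# GROUP BY created_at, role ... then keep only role
grouped_roles = accounts.group_by { |a| [a[:created_at], a[:role]] }.keys.map(&:last)

p roles_from_pairs # ["admin", "manager", "manager", "manager"]
p distinct_roles   # ["admin", "manager"]
p grouped_roles    # ["admin", "manager", "manager", "manager"]
```

The grouped version matches the map version because both deduplicate on the (created_at, role) pair before discarding created_at.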

Rails how to get entries where we ignore deleted_at property along with other properties

I have the model Store. I would like to check an existence of an entry in the database by Store.where(:google_place_id => 'XXXX'). I just want to check if it exists in the database, regardless of whether it's (soft) deleted or not.
When I try that, rails runs this SQL:
SELECT "stores".* FROM "stores" WHERE "stores"."deleted_at" IS NULL AND "stores"."google_place_id" = $1 LIMIT $2 [["google_place_id", "XXX"], ["LIMIT", 1]]
After doing some research, I stumbled upon the unscoped method, which would stop this deleted_at clause from being included, but it also gets rid of the entire WHERE clause. E.g. if I try Store.where(:google_place_id => 'XXXX').unscoped, it runs this SQL: SELECT "stores".* FROM "stores" WHERE "stores"."deleted_at" IS NULL AND "stores"."google_place_id" = $1 LIMIT $2 [["google_place_id", "XXX"], ["LIMIT", 1]]
Can someone clarify what I am doing wrong?
From the paranoia gem README:
Store.with_deleted.where(google_place_id: 'xxx')
Use unscoped to ignore the default "not deleted" scope:
Store.unscoped.where(:google_place_id => 'XXXX')
First of all, you can try changing the order of the calls, like this:
Store.unscoped.where(:google_place_id => 'XXXX')
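Why the order of the calls matters can be sketched with a toy query builder in plain Ruby. ToyRelation is a hypothetical class, not ActiveRecord, and it rests on the assumption (consistent with what the question describes) that unscoped starts over from an empty relation and discards everything chained before it.

```ruby
# Toy query builder (not ActiveRecord); it only tracks WHERE conditions.
class ToyRelation
  DEFAULT_SCOPE = ["deleted_at IS NULL"].freeze

  attr_reader :conditions

  def initialize(conditions = DEFAULT_SCOPE)
    @conditions = conditions
  end

  # Adds a condition to whatever has been built so far.
  def where(condition)
    ToyRelation.new(conditions + [condition])
  end

  # Assumption: starts from scratch, dropping the default scope
  # and any conditions chained before this call.
  def unscoped
    ToyRelation.new([])
  end

  def to_sql
    sql = "SELECT * FROM stores"
    sql += " WHERE " + conditions.join(" AND ") unless conditions.empty?
    sql
  end
end

store = ToyRelation.new
puts store.where("google_place_id = 'XXXX'").to_sql
# SELECT * FROM stores WHERE deleted_at IS NULL AND google_place_id = 'XXXX'
puts store.where("google_place_id = 'XXXX'").unscoped.to_sql
# SELECT * FROM stores
puts store.unscoped.where("google_place_id = 'XXXX'").to_sql
# SELECT * FROM stores WHERE google_place_id = 'XXXX'
```

Calling unscoped first removes only the default scope; calling it last throws away the where condition as well.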

efficient rails query includes record (returns boolean)

My people have scores and I'd like an efficient way to query if the given user is in the top X users.
# person.rb
class Person
  scope :top_score, -> { order('score DESC') }
  scope :page_limit, -> { limit(10) }

  def self.in_top_score(id)
    top_score.page_limit.something_something_soemthign?
  end
end
Previously I was doing:
user.id.in?(top_score.page_limit.pluck(:id))
but I'd prefer to move this check to the database to avoid serializing hundreds or thousands of records into objects.
Person.order('score DESC').select([:score, :id]).limit(1)
Person Load (0.5ms) SELECT score, id FROM `people` ORDER BY score DESC LIMIT 1
=> [#<Person id: "dxvrDy...", score: 35>]
Now to check whether another user exists in that list:
Person.order('score DESC').select([:score, :id]).limit(1).exists?({id: "c_Tvr6..."})
Person Exists (0.3ms) SELECT 1 AS one FROM `people` WHERE `people`.`id` = 'c_Tvr6...' LIMIT 1
=> true
This returns true but should return false.
updated answer
Sorry, my original answer was incorrect. (The exists? query evidently uses LIMIT 1 and overwrites the LIMIT 10 from the page_limit scope, and evidently throws out the ORDER BY clause, too. Totally wrong! :-p)
What about this? It's a little bit less elegant, but I actually tested the answer this time :-p, and it seems to work as desired.
def self.in_top_score?(id)
  where(id: id).where(id: Person.top_score.page_limit).exists?
end
Here's an example usage from my testing (using Rails 4.2.6) and the SQL it generates (which uses a subquery):
pry(main)> Person.in_top_score?(56)
Person Exists (0.4ms) SELECT 1 AS one FROM "people" WHERE "people"."id" = $1 AND "people"."id" IN (SELECT "people"."id" FROM "people" ORDER BY "people"."score" DESC LIMIT 10) LIMIT 1 [["id", 56]]
=> false
In my testing, this does indeed have at least a bit of a performance boost compared to your original version.
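The logic of the two versions can be compared on plain Ruby data. This is only a sketch of what each query effectively asks: ScoredPerson, the sample people array, and both helper methods are hypothetical, and the real benefit of the subquery approach is that the work stays in the database.

```ruby
# Hypothetical in-memory stand-in for the people table.
ScoredPerson = Struct.new(:id, :score)

# What the subquery version asks: is this id among the top `limit` by score?
def in_top_score?(people, id, limit: 10)
  top_ids = people.sort_by { |person| -person.score }.first(limit).map(&:id)
  top_ids.include?(id)
end

# What the original exists?(id) call effectively asked once the ordering
# and limit were dropped: does any row with this id exist at all?
def exists_ignoring_order_and_limit?(people, id)
  people.any? { |person| person.id == id }
end

# 20 people; higher ids have higher scores, so ids 11..20 are the top 10.
people = (1..20).map { |i| ScoredPerson.new(i, i * 5) }

puts in_top_score?(people, 15)                   # prints true
puts in_top_score?(people, 5)                    # prints false
puts exists_ignoring_order_and_limit?(people, 5) # prints true, the misleading result
```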
original answer
top_score.page_limit.exists?(user.id)
http://apidock.com/rails/ActiveRecord/FinderMethods/exists%3F

Rails `find_by` returning huge ID

Curious if anyone knows the intricacies of find_by since I've checked documentation and been unable to find info.
I know that find is used to find by primary keys like:
@user = User.find(params[:id]), returning the correct user.
Before I corrected my code it was @user = User.find_by(params[:id]) and returned a user with an ID way above the number of users in my DB.
Can anyone help me understand what is happening under the hood? What does find_by search by default when a parameter is omitted that is returning this strange user object?
find_by_field(value) is equivalent to where(field: value) but is not supposed to be used without appending a field name to the method like you mentioned. Moreover it returns only the first matching value. For example instead of doing User.where(name: 'John').limit(1) you can use: User.find_by_name 'John'.
On my side, using find_by this way with PostgreSQL raises an error, whereas find_by_id works:
User.find_by(1)
SELECT "users".* FROM "users" WHERE (1) ORDER BY "users"."email" ASC LIMIT 1
PG::DatatypeMismatch: ERROR: argument of WHERE must be type boolean, not type integer
User.find_by_id 1
SELECT "users".* FROM "users" WHERE "users"."id" = 1 ORDER BY "users"."email" ASC LIMIT 1
<User id: 1, ...
User.where(id: 1) # note that there is no LIMIT 1 in the generated SQL
SELECT "users".* FROM "users" WHERE "users"."id" = 1 ORDER BY "users"."email" ASC
You can use gem query_tracer to see the generated SQL or check this thread.
Please take a look here.
Here is an excerpt for find_by:
Finds the first record matching the specified conditions. There is no implied ordering so if order matters, you should specify it yourself.
If no record is found, returns nil.
Post.find_by name: 'Spartacus', rating: 4
Post.find_by "published_at < ?", 2.weeks.ago
Is that what you were looking for?
UPDATE
User.find_by(3) is equivalent to User.where(3).take
Here's the output from the console
pry(main)> User.where(3).take
#=> User Load (0.3ms) SELECT `users`.* FROM `users` WHERE (3) LIMIT 1
It looks like Rails is returning the ID of the object in memory, as if you had queried User.find(params[:id]).object_id. But why does that happen? I tried the same thing in my app on version 4.2.4 and everything worked fine.

How can I write this code so that it executes only one query?

How can I write this code so that it executes only one query instead of two and still runs validations? update_all bypasses all validations defined in the model.
model = ModelName.find(params[:id])
success = model.update_attribute(:column_name, nil)
You cannot. Running the validations requires at least one step: loading the database record into a Ruby object (which takes one query). Updating the database of course takes another query. So in any case, you will have two queries for your task.
You can use the update method:
# Updates one record
User.update(1, :name => 'testtesttest')
http://apidock.com/rails/ActiveRecord/Relation/update
but it's still two queries, as @mosch said.
User Load (0.0ms) SELECT "users".* FROM "users" WHERE "users"."id" = 1 LIMIT 1
AREL (0.0ms) UPDATE "users" SET "name" = 'testtesttest', "updated_at" = '2011-05-03 11:41:23.000000' WHERE "users"."id" = 1
