Conditionals in an Active Record query - ruby-on-rails

I have a large database of information, mostly patients. Some of them have email addresses and some of them don't. I can pull an individual patient's email address by simply running
Patient.last.email
However, I only want to see the patients who do not have email addresses. (I think it's about half of the ones that I have.)
I've tried both of the queries below, but I'm not having the best of luck.
Patient.includes(:email).where('email = ?', 'nil')
Patient.includes(:email).where('email = ?' 'nil').references(:email)
Would anyone know what I'm doing wrong?

Simple, then:
Patient.where(email: nil)
should get you your result.
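For what it's worth, that generates SQL along the lines of SELECT * FROM patients WHERE email IS NULL. A small sketch of two related variants (the empty-string case is an assumption, in case some rows store '' rather than NULL; where.not needs Rails 4+):
# Patients whose email is NULL or an empty string (empty-string case assumed)
Patient.where("email IS NULL OR email = ''")
# The inverse -- patients that do have an email (Rails 4+)
Patient.where.not(email: nil)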

Related

Multi-role check using rolify

Is it possible to do a multi-role check - user.has_role?(:admin, :moderator) - which hopefully does one query to the DB, vs. doing user.has_role?(:admin) && user.has_role?(:moderator), which obviously involves going to the DB twice?
https://github.com/EppO/rolify/issues/234
No, right now you can't pass such options.
user.has_role?(:admin, :moderator)
If you want to save yourself from performing multiple queries, you can do something like
(user.roles.map { |r| r.name.to_sym } & [:admin, :moderator]).present?
rolify has has_any_role? and has_all_roles? methods that do this quite readably. has_any_role? only hits the DB once, but has_all_roles? looks like it iterates over the arguments calling has_role? so that wouldn't entirely address your concern about hitting the DB multiple times (it does stop when it hits the first negative result, which is better than nothing).
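For reference, a quick usage sketch of those two methods:
user.has_any_role?(:admin, :moderator)   # true if the user has at least one of the roles
user.has_all_roles?(:admin, :moderator)  # true only if the user has both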
It looks to me like you could check if a user had both roles in one query with:
user.roles.where("name IN (?, ?)", "admin", "moderator").size > 1
That doesn't work if the user could have the same role on different resources, but you get the idea. The other problem is that it hard-codes the number of roles you check. You could get around that by constructing the arguments list and using send, but it would be kind of ugly. Alternatively, if you knew that the values you were checking for were safe, you could put them in an array and do:
roles_to_check = ["admin", "moderator"]
user.roles.where(name: roles_to_check).size >= roles_to_check.size

Searching the database for addresses matching a user query string

I have a Rails app on a Postgres database, and I need a search field where the user can enter a string and look up possible address matches (within a city) in the database. In the database I have a column with full addresses.
I cannot make assumptions about the input, so I am thinking that I should first try to look the address up directly in the database somehow (using a LIKE query, maybe?), and if that fails, send a request to a geocoding API (e.g. Google) to get back a well-formatted list of addresses matching the query, then search for those in my database.
I would appreciate any guidance on how to do this.
I don't think FTS (full text search) is what you want. You'll have to use an address API that can match addresses.
I've successfully and easily used SmartyStreets for something like this. They have a free account you can use.
http://smartystreets.com
Also if you did want to try going down the FTS route here is a Gist that explains how to do it.
https://gist.github.com/4365593
You may know it already, but PostgreSQL has an integrated full-text search engine, so it's a great time to take advantage of it. I suggest watching this excellent RailsCast.
Then, once implemented:
class Place < ActiveRecord::Base
  def self.search_db_or_geokit(query)
    res = db_search(query)
    res.empty? ? geokit_search(query) : res
  end

  def self.geokit_search(query)
    # ...
  end

  def self.db_search(query)
    # ...
  end
end
For the Google geocoding API, there's probably a good gem out there, like geokit.
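If you go the database-first route, here is a minimal sketch of what db_search might look like using a simple pattern match (the full_address column name is an assumption; a real implementation could instead use the to_tsvector/plainto_tsquery full-text machinery from the RailsCast):
def self.db_search(query)
  # Case-insensitive substring match on the stored address (Postgres ILIKE);
  # the query string is passed as a bind parameter.
  where("full_address ILIKE ?", "%#{query}%")
end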

Rails - given an array of Users - how to get an output of just emails?

I have the following:
@users = User.all
User has several fields including email.
What I would like to be able to do is get a list of all the @users' emails.
I tried:
@users.email.all, but that errors with undefined method.
Ideas? Thanks
(by popular demand, posting as a real answer)
What I don't like about fl00r's solution is that it instantiates a new User object per record in the DB; which just doesn't scale. It's great for a table with just 10 emails in it, but once you start getting into the thousands you're going to run into problems, mostly with the memory consumption of Ruby.
One can get around this little problem by using connection.select_values on a model, and a little bit of ARel goodness:
User.connection.select_values(User.select("email").to_sql)
This will give you the straight strings of the email addresses from the database. No faffing about with User objects, and it will scale better than a straight User.select("email") query, though I wouldn't say it scales best; there are probably better ways to do this that I am not aware of yet.
The point is: a String object uses far less memory than a User object, so you can have more of them. It's also a quicker query that doesn't go the long way about it (running the query, then mapping the values). Oh, and map would take longer too.
If you're using Rails 2.3...
Then you'll have to construct the SQL manually, I'm sorry to say.
User.connection.select_values("SELECT email FROM users")
This just provides another example of the helpers that Rails 3 gives you.
I still find connection.select_values to be a valid way to go about this, but I recently found a method built into ActiveRecord that will do this for you: pluck.
In your example, all that you would need to do is run:
User.pluck(:email)
The select_values approach can be faster on extremely large datasets, but only because it doesn't typecast the returned values; e.g., boolean values are returned as they are stored in the database (as 1s and 0s) and not as true or false.
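For instance (assuming a boolean admin column; the exact raw representation depends on the database adapter):
User.pluck(:admin)                                        # => [true, false, ...]
User.connection.select_values("SELECT admin FROM users")  # => ["1", "0", ...] on MySQL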
The pluck method works with ARel, so you can daisy-chain things:
User.order('created_at desc').limit(5).pluck(:email)
User.select(:email).map(&:email)
Just use:
User.select("email")
While I visit SO frequently, I only registered today. Unfortunately that means that I don't have enough of a reputation to leave comments on other people's answers.
Piggybacking on Ryan's answer above, you can extend ActiveRecord::Base to create a method that will allow you to use this throughout your code in a cleaner way.
Create a file in config/initializers (e.g., config/initializers/active_record.rb):
class ActiveRecord::Base
  def self.selected_to_array
    connection.select_values(scoped.to_sql)
  end
end
You can then chain this method at the end of your ARel declarations:
User.select('email').selected_to_array
User.select('email').where('id > ?', 5).limit(4).selected_to_array
Use this to get an array of all the e-mails:
@users.collect { |user| user.email }
# => ["test@example.com", "test2@example.com", ...]
Or a shorthand version:
@users.collect(&:email)
You should avoid using User.all.map(&:email) as it will create a lot of ActiveRecord objects which consume large amounts of memory, a good chunk of which will not be collected by Ruby's garbage collector. It's also CPU intensive.
If you simply want to collect only a few attributes from your database without sacrificing performance, consuming excessive memory, or burning CPU cycles, consider using Valium.
https://github.com/ernie/valium
Here's an example for getting all the emails from all the users in your database.
User.all[:email]
Or only for users that subscribed or whatever.
User.where(:subscribed => true)[:email].each do |email|
puts "Do something with #{email}"
end
Using User.all.map(&:email) is considered bad practice for the reasons mentioned above.

validates_uniqueness_of failing on Heroku?

In my User model, I have:
validates_uniqueness_of :fb_uid (I'm using facebook connect).
However, at times, I'm getting duplicate rows upon user sign up. This is Very Bad.
The creation time of the two records is within 100ms. I haven't been able to determine whether or not it happens in two separate requests (Heroku logging sucks and only goes back so far, and it's only happened twice).
Two things:
Sometimes the request takes some time, because I query FB API for name info, friends, and picture.
I'm using bigint to store fb_uid (backend is postgres).
I haven't been able to replicate it in dev.
Any ideas would be extremely appreciated.
The sign-in function:
def self.create_from_cookie(fb_cookie, remote_ip = nil)
  return nil unless fb_cookie
  return nil unless fb_hash = authenticate_cookie(fb_cookie)
  uid = fb_hash["uid"].join.to_i

  # Make user and set data
  fb_user = FacebookUser.new
  fb_user.fb_uid = uid
  fb_user.fb_authorized = true
  fb_user.email_confirmed = true
  fb_user.creation_ip = remote_ip
  fb_name_data, fb_friends_data, fb_photo_data, fb_photo_ext = fb_user.query_data(fb_hash)
  return nil unless fb_name_data
  fb_user.set_name(fb_name_data)
  fb_user.set_photo(fb_photo_data, fb_photo_ext)

  # Save user and friends to the db
  return nil unless fb_user.save
  fb_user.set_friends(fb_friends_data)
  return fb_user
end
I'm not terribly familiar with Facebook Connect, but is it possible to get two of the same uid if two separate users from two separate accounts post a request in very quick succession, before either request has completed? (Otherwise known as a race condition.) validates_uniqueness_of can still suffer from this sort of race condition; details can be found here:
http://apidock.com/rails/ActiveModel/Validations/ClassMethods/validates_uniqueness_of
Because this check is performed outside the database there is still a chance that duplicate values will be inserted in two parallel transactions. To guarantee against this you should create a unique index on the field. See add_index for more information.
You can really make sure this will never happen by adding a database constraint. Add this to a database migration and then run it:
add_index :users, :fb_uid, :unique => true
Now a user would get an error instead of being able to complete the request, which is usually preferable to generating illegal data in your database which you have to debug and clean out manually.
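A minimal sketch of such a migration (the class name is illustrative; Rails 3-era syntax):
class AddUniqueIndexOnFbUid < ActiveRecord::Migration
  def self.up
    add_index :users, :fb_uid, :unique => true
  end

  def self.down
    remove_index :users, :fb_uid
  end
end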
From Ruby on Rails v3.0.5 Module ActiveRecord::Validations::ClassMethods
http://s831.us/dK6mFQ
Concurrency and integrity
Using this [validates_uniqueness_of] validation method in conjunction with ActiveRecord::Base#save does not guarantee the absence of duplicate record insertions, because uniqueness checks on the application level are inherently prone to race conditions. For example, suppose that two users try to post a Comment at the same time, and a Comment's title must be unique. At the database level, the actions performed by these users could be interleaved in the following manner: ...
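Once the unique index exists, the database rejects the second insert, and you can recover from the race in create_from_cookie itself. A hedged sketch (ActiveRecord::RecordNotUnique is what Rails 3+ raises on a unique-constraint violation; find_by_fb_uid is the standard dynamic finder for the column):
begin
  return nil unless fb_user.save
rescue ActiveRecord::RecordNotUnique
  # Another request won the race; use the row it inserted instead.
  fb_user = FacebookUser.find_by_fb_uid(uid)
end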
It seems like there is some sort of race condition in your code. To check this, I would first change the code so that the Facebook values are extracted first, and only then create the new Facebook user object.
Then I would strongly suggest writing a test to check whether your function gets executed only once; it looks like it is being executed twice.
On top of this, there seems to be a race condition while waiting for the Facebook results.

Removing duplicates from array before saving

I periodically fetch the latest tweets with a certain hashtag and save them locally. In order to prevent saving duplicates, I use the method below. Unfortunately, it does not seem to be working... so what's wrong with this code:
def remove_duplicates
  before = @tweets.size
  @tweets.delete_if { |tweet| !Tweet.all(:conditions => { :twitter_id => tweet.twitter_id }).empty? }
  duplicates = before - @tweets.size
  puts "#{duplicates} duplicates found"
end
Where @tweets is an array of Tweet objects fetched from Twitter. I'd appreciate any solution that works, and especially one that might be more elegant...
You can add validates_uniqueness_of :twitter_id to the Tweet model (which is where this code should live). This will cause duplicates to fail to save.
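A minimal sketch of what that looks like (duplicates then simply fail validation, so saving them is a no-op):
class Tweet < ActiveRecord::Base
  validates_uniqueness_of :twitter_id
end

@tweets.each { |tweet| tweet.save } # save returns false for duplicates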
Since it sounds like you're using the Twitter search API, a better solution is to use the since_id parameter. Keep track of the last twitter status id you got from your previous query and use that as the since_id parameter on your next query.
More information is available at Twitter Search API Method: search
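A rough sketch of the bookkeeping (fetch_tweets is a hypothetical wrapper around whatever search client you use; since_id itself is a real Search API parameter):
# Highest status id we have already stored locally
last_seen_id = Tweet.maximum(:twitter_id)
# Ask Twitter only for statuses newer than that
new_tweets = fetch_tweets("#myhashtag", :since_id => last_seen_id)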
array.uniq!
Removes duplicate elements from self. Returns nil if no changes are made (that is, no duplicates are found).
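Note that uniq! compares whole objects, so it only removes duplicates within the array itself, not against rows already in the database. To dedupe Tweet objects by id within the array (the block form of uniq! needs Ruby 1.9.2+):
@tweets.uniq! { |tweet| tweet.twitter_id }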
OK, it turns out the problem was of a slightly different nature: when looking closer into it, I found that multiple tweets were being saved with the twitter_id 2147483647... This is the upper limit for integer fields :)
Changing the field to bigint solved the problem. It took me a long time to figure out, since MySQL failed silently and just clamped the value to the maximum for as long as it could (until I added the unique index). I quickly tried it out with Postgres, which returned a nice "integer out of range" error, which pointed me to the real cause of the problem.
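For reference, a sketch of that column change in a migration (in Rails, an integer column with :limit => 8 maps to BIGINT on MySQL; the table and column names follow this question):
change_column :tweets, :twitter_id, :integer, :limit => 8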
Thanks Ben for the validation and indexing tips, as they led to much cleaner code!
