Rails 5: can't order after group and count - ruby-on-rails

In my subscribers table I have duplicate emails, so I want to count these. I'm doing this in the Rails console:
Subscriber.select("email, count(*)").group(:email).order("count (*) desc")
This results in the following query to my PostgreSQL db:
SELECT email, count(*) FROM "subscribers" GROUP BY "subscribers"."email" ORDER BY count (*) desc
The strange thing is, although this query works fine when run directly from the db, it doesn't work in the Rails console, which returns:
#<ActiveRecord::Relation [#<Subscriber id: nil, email: "a@a.com">, #<Subscriber id: nil, email: "b@b.com">, #<Subscriber id: nil, email: "c@c.com">]>
Any ideas why?

It works, but inspecting the returned Subscriber objects doesn't show the aggregated field (and you aren't reading it any other way). Give it a name:
subscribers = Subscriber.select("email, COUNT(*) AS counter").group(:email).order("COUNT (*) desc")
Then read the fields from each record:
subscribers.each { |s| puts s.email, s.counter }
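To make the shape of the result concrete, here is a minimal plain-Ruby sketch of what the grouped query computes. It does not use ActiveRecord: the emails array is made-up sample data standing in for the subscribers table, and tally assumes Ruby 2.7 or later.

```ruby
# Hypothetical sample data standing in for the email column of subscribers.
emails = ["a@a.com", "b@b.com", "a@a.com", "c@c.com", "a@a.com", "b@b.com"]

# GROUP BY email + COUNT(*): tally builds a hash of email => occurrences.
counts = emails.tally

# ORDER BY COUNT(*) DESC: sort the pairs by count, largest first.
sorted = counts.sort_by { |_email, counter| -counter }

sorted.each { |email, counter| puts "#{email}: #{counter}" }
```

Each pair plays the role of one row with its email and counter values.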

Related

Array as a condition for find_or_create_by in Rails

I have something like this:
types = ['landline', 'cell']
Phone.find_or_create_by(person_id: 1, type: types) do |record|
record.number = '0'
end
This doesn't work. New records don't get created. But when I rewrite it to look like this:
types = ['landline', 'cell']
types.each do |type|
Phone.find_or_create_by(person_id: 1, type: type) do |record|
record.number = '0'
end
end
it works.
Any ideas why find_or_create_by doesn't work with an array as a condition?
Nice question, here is my explanation with an example.
find_or_create_by first runs a SELECT query and only proceeds to the create method if that query returns nothing.
There is a note in the API documentation of find_or_create_by:
Please note this method is not atomic, it runs first a SELECT, and if there are no results an INSERT is attempted. If there are other threads or processes there is a race condition between both calls and it could be the case that you end up with two similar records.
Here is the reference of find_or_create_by
So, when this command is run, it first runs a SELECT query.
For example, let me take a users table and show you the result from the console.
I have a user with the email test222@example.com in the database, but not one with test333@example.com.
User.find_or_create_by(email: ['test222@example.com','test333@example.com'])
Now, when I run find_or_create_by, this is the query generated:
User Load (175.5ms) SELECT `users`.* FROM `users` WHERE `users`.`email` IN ('test222@example.com', 'test333@example.com') LIMIT 1
The response is:
=> #<User id: 82, provider: "email", uid: "test222@example.com", name: nil, nickname: nil, image: nil, email: "test222@example.com", created_at: "2016-09-05 12:35:01", updated_at: "2016-09-05 12:35:01">
So it returned the user it found and never ran the create method, ignoring the second email, which was not found.
Now, if I run it in a loop,
emails = ['test222@example.com','test333@example.com']
emails.each do |email|
User.find_or_create_by(email: email)
end
The INSERT query will be run for the second email:
INSERT INTO `users` (`email`,`created_at`, `updated_at`) VALUES ('test@ead.com', '2016-10-07 13:16:25', '2016-10-07 13:16:25')
The same thing is happening in your case.
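The behaviour described above can be reproduced with a toy in-memory table. This is only a sketch of the SELECT-then-INSERT logic, not ActiveRecord itself; ToyTable and its methods are hypothetical names made up for the illustration.

```ruby
# A toy in-memory "table"; not ActiveRecord, only an illustration.
class ToyTable
  attr_reader :rows

  def initialize(rows = [])
    @rows = rows
  end

  # Mimics SELECT ... WHERE email IN (...) LIMIT 1:
  # an array condition matches any of its values, and only one row is returned.
  def find_by(email:)
    wanted = Array(email)
    @rows.find { |row| wanted.include?(row[:email]) }
  end

  def create(email:)
    row = { email: email }
    @rows << row
    row
  end

  # SELECT first; INSERT only when the SELECT found nothing.
  def find_or_create_by(email:)
    find_by(email: email) || create(email: email)
  end
end

emails = ["test222@example.com", "test333@example.com"]

# One call with an array: test222 is found, so nothing is created.
table = ToyTable.new([{ email: "test222@example.com" }])
table.find_or_create_by(email: emails)
array_call_count = table.rows.size

# One call per email: the second call finds nothing and creates test333.
table = ToyTable.new([{ email: "test222@example.com" }])
emails.each { |email| table.find_or_create_by(email: email) }
loop_count = table.rows.size

puts "array condition: #{array_call_count} row(s), loop: #{loop_count} row(s)"
```

With the array, a single match is enough to skip the create step for every value in the list, which is why the loop is needed.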

Active Record find_by_sql and rspec expect block. New ids are created for no reason

I am working on a complex SQL query with Rails 4.2.1:
def self.proofreader_for_job(job)
User.find_by_sql(
"SELECT * FROM users
INNER JOIN timers
ON users.id = timers.proofreader_id
INNER JOIN tasks
ON tasks.id = timers.task_id
WHERE tasks.job_id = #{job.id}")
end
My schema is: jobs has_many tasks, tasks has_many timers, and a timer belongs_to a user (role: proofreader) through the foreign key proofreader_id.
The issue is that when I call the method, it returns the correct user's email and attributes, but the id doesn't match.
For example, User.proofreader_for_job(job) returns
[#<User id: 178, email: "testemail@gmail.com">]
testemail@gmail.com is the correct email, but I don't have a user in my db with an id of 178.
User.all just returns
[#<User id: 12, email: "fakeemail@gmail.com">,
#<User id: 11, email: "testemail@gmail.com">]
I noticed the issue in my rspec tests, but it happens on both development and test environments.
Does anyone have any idea why my method is returning a user with such a high id? Is this by design, and if so, why?
Thank you.
Since you're doing SELECT *, your statement returns every column from each of the tables in the JOIN. So when the output of the SQL statement is cast to a User, I think the wrong id column is being used as the User id (most likely the one from the timers or tasks table).
Try explicitly specifying the columns to return like the below statement:
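def self.proofreader_for_job(job)
# Select only users columns so an id from timers or tasks cannot replace users.id.
User.find_by_sql(
"SELECT users.id, users.email FROM users
INNER JOIN timers
ON users.id = timers.proofreader_id
INNER JOIN tasks
ON tasks.id = timers.task_id
WHERE tasks.job_id = #{job.id}")
end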
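The suspected id collision can be illustrated with a small plain-Ruby sketch. The column list and values below are hypothetical, and the sketch assumes that when a row is turned into attributes keyed by column name, the last column with a given name wins (as it does with a plain Ruby hash).

```ruby
# Column names and values as SELECT * might return them for
# users JOIN timers JOIN tasks (hypothetical values; "id" appears three times).
columns = ["id", "email", "id", "proofreader_id", "task_id", "id", "job_id"]
values  = [11, "testemail@gmail.com", 5, 11, 178, 178, 7]

# Building a hash keyed by column name: later duplicates overwrite earlier
# ones, so the surviving "id" is tasks.id here, not users.id.
attributes = columns.zip(values).to_h

puts attributes["id"]
puts attributes["email"]
```

The email is still correct because only one column is called email, while the three id columns collapse into a single key.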

Ruby on Rails Select As doesn't work

I am trying to get a "Select As" query statement to work, but keep getting an error and am not sure why it is not working. Per the API docs, the format is correct.
User.select("firstname as fname")
Results in:
User Load (1.7ms) SELECT firstname as fname FROM "users"
=> #<ActiveRecord::Relation [#<User id: nil>, #<User id: nil>]
However, if I use:
User.select(:firstname)
I get:
User Load (2.8ms) SELECT "users"."firstname" FROM "users"
=> #<ActiveRecord::Relation [#<User id: nil, firstname: "John">, #<User id: nil, firstname: "Brian">,
So I can see from the query why it's not returning the results, but I don't understand why it's creating the incorrect query. (The actual query I need to use the select as on is more complicated than this one, but I was using the simpler query to figure out why it wasn't working properly.)
The reason I need a select as query is that I have two separate objects from two very different tables that I need to join together, and I need to change one of the column names so I can sort by that column. I'm not sure if there is an easier way to change the name prior to combining the objects.
Thanks!
You can use alias_attribute :fname, :firstname in the model (the new name comes first, then the existing column) and then use User.select(:fname) in the controller as well.

Not able to select distinct row from each group using ActiveRecord with PostgreSQL in Rails

I have a small problem with my ActiveRecord query.
I have a Product model, which looks like:
#<Product id: nil, name: nil, price: nil, order_id: nil, created_at: nil, updated_at: nil, restaurant_id: nil>
Now I want to select rows that are DISTINCT on name and get all of each Product's attributes back.
I tried:
@products = Product.where(restaurant_id: !nil).group("products.name").order("name")
but I got this error:
PG::GroupingError: ERROR: column "products.id" must appear in the GROUP BY clause or be used in an aggregate function
LINE 1: SELECT "products".* FROM "products" WHERE "products"."rest...
^
: SELECT "products".* FROM "products" WHERE "products"."restaurant_id" = 1 GROUP BY products.name
Ok, then I added product_id to my query:
@products = Product.where(restaurant_id: !nil).group("products.name, products.id").order("name")
but this query returns Products with duplicated names.
So I tried this too:
@products = Product.where(restaurant_id: !nil).select("DISTINCT(NAME)").order("name")
but in return I got Product records with only id and name (obviously). Since this query returned the correct set, I then added the attributes I need later:
@products = Product.where(restaurant_id: !nil).select("DISTINCT(NAME), restaurant_id, price").order("name")
And it also returns duplicated names.
Do you have any solution or idea, how to fix this query for PostgreSQL?
I'm used to writing queries like this (on MySQL), and there it's correct:
@products = Product.where(restaurant_id: !nil).select("DISTINCT(NAME), restaurant_id, price").order("name")
Why does PostgreSQL not accept that query?
You should write the query as:
Product.where
.not(restaurant_id: nil)
.select("DISTINCT ON(name) name, restaurant_id, price, updated_at")
.order("name, updated_at")
As per the official documentation SELECT DISTINCT ON ( expression [, ...] )
keeps only the first row of each set of rows where the given expressions evaluate to equal. The DISTINCT ON expressions are interpreted using the same rules as for ORDER BY (see above). Note that the "first row" of each set is unpredictable unless ORDER BY is used to ensure that the desired row appears first. For example:
SELECT DISTINCT ON (location) location, time, report
FROM weather_reports
ORDER BY location, time DESC;
retrieves the most recent weather report for each location. But if we had not used ORDER BY to force descending order of time values for each location, we'd have gotten a report from an unpredictable time for each location.
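As a rough illustration of the "first row of each set" rule, the weather example can be mimicked in plain Ruby. The reports array is made-up data, and sort_by plus uniq is only an in-memory analogy for ORDER BY plus DISTINCT ON, not what PostgreSQL does internally.

```ruby
# Hypothetical rows standing in for weather_reports.
reports = [
  { location: "Oslo",  time: 1, report: "rain"  },
  { location: "Oslo",  time: 3, report: "sun"   },
  { location: "Paris", time: 2, report: "cloud" },
  { location: "Paris", time: 1, report: "fog"   },
]

# ORDER BY location, time DESC
ordered = reports.sort_by { |r| [r[:location], -r[:time]] }

# DISTINCT ON (location): keep the first row of each location group.
latest = ordered.uniq { |r| r[:location] }

latest.each { |r| puts "#{r[:location]}: #{r[:report]}" }
```

Without the sort, uniq would still keep one row per location, but which one would depend on the incoming order, which is the unpredictability the documentation warns about.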

Finding all the users that have duplicate names

I have users with first_name and last_name fields, and I need to find all the users that have duplicate accounts based on first and last names. For example, I want a find that searches through all the other users and checks whether any have the same name and email. I was thinking of a nested loop like this:
User.all.each do |user|
# maybe another loop to search through all the users, and if a match occurs, put that user in an array
end
Is there a better way?
You could go a long way toward narrowing down your search by finding out what the duplicated data is in the first place. For example, say you want to find each combination of first name and email that is used more than once.
User.find(:all, :group => [:first, :email], :having => "count(*) > 1" )
That will return an array containing one of each of the duplicated records. From that, say one of the returned users had "Fred" and "fred@example.com" then you could search for only Users having those values to find all of the affected users.
The return from that find will be something like the following. Note that the array only contains a single record from each set of duplicated users.
[#<User id: 3, first: "foo", last: "barney", email: "foo@example.com", created_at: "2010-12-30 17:14:43", updated_at: "2010-12-30 17:14:43">,
#<User id: 5, first: "foo1", last: "baasdasdr", email: "abc@example.com", created_at: "2010-12-30 17:20:49", updated_at: "2010-12-30 17:20:49">]
For example, the first element in that array shows one user with "foo" and "foo@example.com". The rest of them can be pulled out of the database as needed with a find.
> User.find(:all, :conditions => {:email => "foo@example.com", :first => "foo"})
=> [#<User id: 1, first: "foo", last: "bar", email: "foo@example.com", created_at: "2010-12-30 17:14:28", updated_at: "2010-12-30 17:14:28">,
#<User id: 3, first: "foo", last: "barney", email: "foo@example.com", created_at: "2010-12-30 17:14:43", updated_at: "2010-12-30 17:14:43">]
And it also seems like you'll want to add some better validation to your code to prevent duplicates in the future.
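The two steps above (find the duplicated combinations, then collect every record that shares them) can be sketched in plain Ruby. The users array is hypothetical sample data, and group_by here only mirrors the GROUP BY / HAVING idea in memory; for a large table the database query is the better tool.

```ruby
# Hypothetical records standing in for rows of the users table.
users = [
  { id: 1, first: "foo",  email: "foo@example.com" },
  { id: 2, first: "bar",  email: "bar@example.com" },
  { id: 3, first: "foo",  email: "foo@example.com" },
  { id: 4, first: "foo1", email: "abc@example.com" },
  { id: 5, first: "foo1", email: "abc@example.com" },
]

# GROUP BY first, email
groups = users.group_by { |u| [u[:first], u[:email]] }

# HAVING count(*) > 1, keeping every member of each duplicated group
duplicates = groups.select { |_key, members| members.size > 1 }

duplicates.each do |(first, email), members|
  puts "#{first} / #{email}: ids #{members.map { |u| u[:id] }.join(', ')}"
end
```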
Edit:
If you need to use the big hammer of find_by_sql, because Rails 2.2 and earlier didn't support :having with find, the following should work and give you the same array that I described above.
User.find_by_sql("select * from users group by first,email having count(*) > 1")
After some googling, I ended up with this:
ActiveRecord::Base.connection.execute(<<-SQL).to_a
SELECT
variants.id, variants.variant_no, variants.state
FROM variants INNER JOIN (
SELECT
variant_no, state, COUNT(1) AS count
FROM variants
GROUP BY
variant_no, state HAVING COUNT(1) > 1
) tt ON
variants.variant_no = tt.variant_no
AND variants.state IS NOT DISTINCT FROM tt.state;
SQL
Note the part that says IS NOT DISTINCT FROM: it is there to handle NULL values, which can't be compared with the equals sign in Postgres.
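The difference between the two comparisons can be modelled in a few lines of Ruby, with nil standing in for NULL. The helper names sql_equal and sql_not_distinct are made up for this sketch.

```ruby
# SQL "a = b": the result is NULL (nil here) when either side is NULL.
def sql_equal(a, b)
  return nil if a.nil? || b.nil?
  a == b
end

# SQL "a IS NOT DISTINCT FROM b": two NULLs count as the same value.
def sql_not_distinct(a, b)
  return true if a.nil? && b.nil?
  return false if a.nil? || b.nil?
  a == b
end

p sql_equal(nil, nil)             # nil, so a join condition using "=" rejects the row
p sql_not_distinct(nil, nil)      # true
p sql_not_distinct("draft", nil)  # false
```

This is why rows whose state is NULL would drop out of the join if the condition were written with a plain equals sign.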
If you are going the route of @hakunin and creating a query manually, you may wish to use the following:
ActiveRecord::Base.connection.exec_query(<<-SQL).to_a
SELECT
variants.id, variants.variant_no, variants.state
FROM variants INNER JOIN (
SELECT
variant_no, state, COUNT(1) AS count
FROM variants
GROUP BY
variant_no, state HAVING COUNT(1) > 1
) tt ON
variants.variant_no = tt.variant_no
AND variants.state IS NOT DISTINCT FROM tt.state;
SQL
The change is replacing connection.execute(<<-SQL)
with connection.exec_query(<<-SQL)
There can be a memory leak problem when using execute.
Please read Clarify DataBaseStatements#execute for an in-depth understanding of the problem.
