Tag.joins(:quote_tags).group('quote_tags.tag_id').order('count desc').select('count(tags.id) AS count, tags.id, tags.name')
Generated query:
SELECT count(tags.id) AS count, tags.id, tags.name FROM `tags` INNER JOIN `quote_tags` ON `quote_tags`.`tag_id` = `tags`.`id` GROUP BY quote_tags.tag_id ORDER BY count desc
Result:
[#<Tag id: 401, name: "different">, ... , #<Tag id: 4, name: "family">]
It does not return the count column for me. How can I get it?
Have you tried calling the count method on one of the returned Tag objects? Just because inspect doesn't mention the count doesn't mean that it isn't there. The inspect output:
[#<Tag id: 401, name: "different">, ... , #<Tag id: 4, name: "family">]
will only include things that the Tag class knows about and Tag will only know about the columns in the tags table: you only have id and name in the table so that's all you see.
If you do this:
tags = Tag.joins(:quote_tags).group('quote_tags.tag_id').order('count desc').select('count(tags.id) AS count, tags.id, tags.name')
and then look at the counts:
tags.map(&:count)
You'll see the array of counts that you're expecting.
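To make the point about inspect concrete, here is a minimal plain-Ruby sketch. ToyRecord is a hypothetical stand-in written for this illustration and is not how ActiveRecord is implemented; the row values are made up. It only shows how an object can hide a value from inspect while still answering to it by name.

```ruby
# Toy stand-in for a model object. Hypothetical; NOT ActiveRecord's implementation.
class ToyRecord
  # The columns the "table" is known to have.
  KNOWN_COLUMNS = %w[id name].freeze

  def initialize(row)
    @row = row # every column the query returned, including aliases like "count"
  end

  # Mimic the behaviour described above: show only the known columns.
  def inspect
    shown = KNOWN_COLUMNS.map { |col| "#{col}: #{@row[col].inspect}" }.join(", ")
    "#<ToyRecord #{shown}>"
  end

  # Any returned column, known or not, can still be read by name.
  def method_missing(name, *args)
    key = name.to_s
    return @row[key] if args.empty? && @row.key?(key)
    super
  end

  def respond_to_missing?(name, include_private = false)
    @row.key?(name.to_s) || super
  end
end

tag = ToyRecord.new("id" => 4, "name" => "family", "count" => 12) # made-up row
puts tag.inspect # prints: #<ToyRecord id: 4, name: "family">
puts tag.count   # prints: 12
```

The count is absent from the inspect output but is still available through the reader, which mirrors what tags.map(&:count) relies on.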
Update: The original version of this answer mistakenly characterized select, and subsequent versions ended up effectively repeating the current version of the other answer from @muistooshort. I'm leaving it in its current state because it has the information about using raw SQL. Thanks to @muistooshort for pointing out my error.
Although your query is in fact working as explained by the other answer, you can always execute raw SQL as an alternative.
There are a variety of select_... methods you can choose from, but I would think you'd want to use select_all. Assuming the query that was generated for you is correct, you can just use that, as in:
ActiveRecord::Base.connection.select_all('
SELECT count(tags.id) AS count, tags.id, tags.name FROM `tags`
INNER JOIN `quote_tags` ON `quote_tags`.`tag_id` = `tags`.`id`
GROUP BY quote_tags.tag_id
ORDER BY count desc')
See http://api.rubyonrails.org/classes/ActiveRecord/ConnectionAdapters/DatabaseStatements.html for information on the various methods you can choose from.
Related
I have the following query returning duplicate titles, but :id is nil:
Movie.select(:title).group(:title).having("count(*) > 1")
[#<Movie:0x007f81f7111c20 id: nil, title: "Fargo">,
#<Movie:0x007f81f7111ab8 id: nil, title: "Children of Men">,
#<Movie:0x007f81f7111950 id: nil, title: "The Martian">,
#<Movie:0x007f81f71117e8 id: nil, title: "Gravity">]
I tried adding :id to the select and group but it returns an empty array. How can I return the whole movie record, not just the titles?
A SQL-y Way
First, let's just solve the problem in SQL, so that the Rails-specific syntax doesn't trick us.
This SO question is a pretty clear parallel: Finding duplicate values in a SQL Table
The answer from KM (second from the top, non-checkmarked, at the moment) meets your criteria of returning all duplicated records along with their IDs. I've modified KM's SQL to match your table...
SELECT
m.id, m.title
FROM
movies m
INNER JOIN (
SELECT
title, COUNT(*) AS CountOf
FROM
movies
GROUP BY
title
HAVING COUNT(*)>1
) dupes
ON
m.title=dupes.title
The portion inside the INNER JOIN ( ) is essentially what you've generated already. A grouped table of duplicated titles and counts. The trick is JOINing it to the unmodified movies table, which will exclude any movies that don't have matches in the query of dupes.
Why is this so hard to generate in Rails? The trickiest part is that, because we're JOINing movies to movies, we have to create table aliases (m and dupes in my query above).
Sadly, Rails doesn't provide any clean way of declaring these aliases. Some references:
Rails GitHub issues mentioning "join" and "alias". Misery.
SO Question: ActiveRecord query with alias'd table names
Fortunately, since we've got the SQL in-hand, we can use the .find_by_sql method...
Movie.find_by_sql("SELECT m.id, m.title FROM movies m INNER JOIN (SELECT title, COUNT(*) FROM movies GROUP BY title HAVING COUNT(*)>1) dupes ON m.title=dupes.title")
Because we're calling Movie.find_by_sql, ActiveRecord assumes our hand-written SQL can be bundled into Movie objects. It doesn't massage or generate anything, which lets us do our aliases.
This approach has its shortcomings. It returns an array and not an ActiveRecord Relation, which means it can't be chained with other scopes. And, in the documentation for the find_by_sql method, we get extra discouragement...
This should be a last resort because using, for example, MySQL specific terms will lock you to using that particular database engine or require you to change your call if you switch engines.
A Rails-y Way
Really, what is the SQL doing above? It's getting a list of titles that appear more than once. Then, it's matching that list against the original table. So, let's just do that using Rails.
titles_with_multiple = Movie.group(:title).having("count(title) > 1").count.keys
Movie.where(title: titles_with_multiple)
We call .keys because the first query returns a hash whose keys are our titles. The where() method can take an array, and we've handed it an array of titles. Winner.
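The same two-step idea can be tried without a database. The following is a plain-Ruby analogue, not ActiveRecord code: Movie here is a hypothetical Struct, the helper names are made up for the sketch, and the ids are invented sample data.

```ruby
# Hypothetical in-memory stand-in for the movies table.
Movie = Struct.new(:id, :title)

# Step 1: titles that occur more than once, analogous to
# Movie.group(:title).having("count(title) > 1").count.keys
def titles_with_multiple(movies)
  counts = movies.group_by(&:title).transform_values(&:size)
  counts.select { |_title, n| n > 1 }.keys
end

# Step 2: every record whose title is in that list, analogous to
# Movie.where(title: titles_with_multiple)
def duplicated_movies(movies)
  titles = titles_with_multiple(movies)
  movies.select { |movie| titles.include?(movie.title) }
end

SAMPLE_MOVIES = [
  Movie.new(1, "Fargo"),
  Movie.new(2, "Gravity"),
  Movie.new(3, "Fargo"),
  Movie.new(4, "The Martian"),
  Movie.new(5, "Gravity"),
].freeze

p titles_with_multiple(SAMPLE_MOVIES)        # prints: ["Fargo", "Gravity"]
p duplicated_movies(SAMPLE_MOVIES).map(&:id) # prints: [1, 2, 3, 5]
```

The first step yields only the duplicated titles; the second step is what brings back whole records, ids included.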
You could argue one line of Ruby is more elegant than two. And if that one line of Ruby has an ungodly string of SQL embedded within it, how elegant is it really?
Hope this helps!
You can try to add id in your select:
Movie.select([:id, :title]).group(:title).having("count(title) > 1")
What I want to do is to join table and sum 3 columns.
self.document_products.joins("JOIN products ON products.id = document_products.product_id").group("products.tax_id").select("sum(a), sum(b), sum(c)")
Gives me
#<ActiveRecord::Relation [#<DocumentProduct id: nil>]>
Something like that works:
self.document_products.joins("JOIN products ON products.id = document_products.product_id").group("products.tax_id").sum("a")
But I want to have 3 sums. I can't do sum("a, b, c"). Where is the problem?
So, the code is building a SQL query using the ActiveRecord chained method syntax. You can call .to_sql at the end of most such chains (basically, as long as the result is still an ActiveRecord relation rather than, for example, an Array) to see the generated SQL, or you can inspect the log if logging is on. Considering the common part of the chain:
self.document_products.joins("JOIN products ON products.id = document_products.product_id").group("products.tax_id")
This generates something like (might not be exact, because I'm guessing a little about your application):
SELECT "document_products".* FROM "document_products" JOIN products ON products.id = document_products.product_id WHERE "document_products"."document_id" = 1497 GROUP BY products.tax_id
The two final methods you list are very different; select selects which columns in the query to return, whereas sum is an aggregate function which expects a single value to be returned in each case. Considering the select, we get something like the following generated:
SELECT SUM(products.a), SUM(products.b), SUM(products.c) FROM "document_products" JOIN products ON products.id = document_products.product_id WHERE "document_products"."document_id" = 1497 GROUP BY products.tax_id
When this query is interpreted, the expected data cannot be found, leading to the problem described. Including the GROUP BY column in the SELECT list, however, yields the necessary information. Try something like this:
self.document_products.joins("JOIN products ON products.id = document_products.product_id").group("products.tax_id").select("products.tax_id, sum(a), sum(b), sum(c)")
This generates SQL something like:
SELECT products.tax_id, SUM(products.a), SUM(products.b), SUM(products.c) FROM "document_products" JOIN products ON products.id = document_products.product_id WHERE "document_products"."document_id" = 1497 GROUP BY products.tax_id
This appears to return the necessary information, and is, I think, what you're looking for (or close to it).
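To see what the grouped query is meant to compute, here is a plain-Ruby analogue on made-up rows. ProductLine, sums_by_tax_id, and the result keys sum_a, sum_b and sum_c are hypothetical names for this sketch, not part of the original application. The point it illustrates is the one above: carrying the grouping key (tax_id) alongside the three sums is what makes each result row identifiable.

```ruby
# Hypothetical joined row: a product's tax_id plus the three columns to total.
ProductLine = Struct.new(:tax_id, :a, :b, :c)

# Group rows by tax_id and total a, b and c per group, keeping the key
# in each result (the in-memory counterpart of selecting products.tax_id).
def sums_by_tax_id(lines)
  lines.group_by(&:tax_id).map do |tax_id, group|
    {
      tax_id: tax_id,
      sum_a: group.sum(&:a),
      sum_b: group.sum(&:b),
      sum_c: group.sum(&:c),
    }
  end
end

SAMPLE_LINES = [
  ProductLine.new(1, 10, 2, 1),
  ProductLine.new(2, 5, 1, 0),
  ProductLine.new(1, 20, 3, 4),
].freeze

sums_by_tax_id(SAMPLE_LINES).each { |row| p row }
```

In the ActiveRecord version, giving each aggregate an SQL alias (for example sum(a) AS sum_a, a name assumed here) would be one way to read the values back by name from the returned objects.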
I have the following example query:
source = "(SELECT DISTINCT source.* FROM (SELECT * FROM items) AS source) AS items"
items = Item.select("items.*").from(source).includes([:images])
p items # [#<Item id: 1>, #<Item id:2>]
However running:
p items.count
Results in NoMethodError: undefined method `map' for Arel::Nodes::SqlLiteral
I appreciate the query is silly; however, the non-simplified query is a bit too complicated to copy, and this was the smallest crashing version I could create. Any ideas?
Can you call all on that object to essentially cast it to an Array?
Item.select("items.*").from(source).includes([:images]).all.count
Or perhaps in that case, size would be more appropriate. In any case, this will execute the query and load all the objects into memory, which may not be desirable.
It looks like the problem is with your includes([:images]). On a similar application, I can execute this from the console:
> Category.select('categories.*').from('(SELECT DISTINCT source.* FROM (SELECT * FROM categories) AS source) AS categories').count
(0.5ms) SELECT COUNT(*) FROM (SELECT DISTINCT source.* FROM (SELECT * FROM categories) AS source) AS categories
(Notice that the count overrides the SELECT clause, even though I explicitly specified categories.*. But they're still equivalent queries.)
As soon as I add an includes scope, it fails:
> Category.select('categories.*').from('(SELECT DISTINCT source.* FROM (SELECT * FROM categories) AS source) AS categories').includes(:projects).count
NoMethodError: undefined method `left' for #<Arel::Nodes::SqlLiteral:0x131d35248>
I tried a few different means of acquiring the count, like select('COUNT(categories.*)'), but they all failed in various ways. ActiveRecord seems to be falling back on a basic LEFT OUTER JOIN to perform the eager loading, possibly because it thinks you're using some kind of condition or external table to perform the join, and this seems to confuse its normal methods of performing the count. See the end of the section on Eager Loading in the ActiveRecord::Associations docs.
My Suggestion
If the join doesn't affect the number of rows returned in the outer query, I'd say your best bet is to execute one query to get the count and one query to get the actual results. We have to do something similar in our application for paging: one query returns the current page of results, and one returns the total number of records matching the filter criteria.
The issue is Rails #24193 https://github.com/rails/rails/issues/24193 and has to do with from combined with eager loading. The workaround is to use the form: Item.select("items.*").from([Arel.sql(source)]).includes([:images])
In rails 3.0.0, the following query works fine:
Author.where("name LIKE :input",{:input => "#{params[:q]}%"}).includes(:books).order('created_at')
However, when I enter the following search string (containing a colon followed by a dot):
aa:.bb
I get the following exception:
ActiveRecord::StatementInvalid: SQLite3::SQLException: ambiguous column name: created_at
These are the SQL queries in the logs:
with aa as input:
Author Load (0.4ms) SELECT "authors".* FROM "authors" WHERE (name LIKE 'aa%') ORDER BY created_at
Book Load (2.5ms) SELECT "books".* FROM "books" WHERE ("books".author_id IN (1,2,3)) ORDER BY id
with aa:.bb as input:
SELECT DISTINCT "authors".id FROM "authors" LEFT OUTER JOIN "books" ON "books"."author_id" = "authors"."id" WHERE (name LIKE 'aa:.bb%') ORDER BY created_at DESC LIMIT 12 OFFSET 0
SQLite3::SQLException: ambiguous column name: created_at
It seems that with the aa:.bb input, an extra query is made to fetch the distinct author ids.
I thought Rails would escape all the characters. Is this expected behaviour or a bug?
Best Regards,
Pieter
The "ambiguous column" error usually happens when you use includes or joins and don't specify which table you're referring to:
"name LIKE :input"
Should be:
"authors.name LIKE :input"
Just "name" is ambiguous if your books table has a name column too.
Also: have a look at your development.log to see what the generated query looks like. This will show you if it's being escaped properly.
Replace
.includes(:books)
with
.preload(:books)
This should force activerecord to use 2 queries instead of the join.
Rails has 2 versions of includes: one which constructs a big query with joins (the 2nd of your 2 queries, and thus more likely to result in ambiguous column references) and one that avoids the joins in favour of a separate query per association.
Rails decides which strategy to use based on whether it thinks that your conditions, order etc. refer to the included tables (since in that case the joins version is required). Where a condition is a string fragment, that heuristic isn't very sophisticated: I seem to recall that it just scans the conditions for anything that might look like a column from another table (i.e. foo.bar), so having a literal of that form could fool it.
You can either qualify your column names so that it doesn't matter which includes strategy is used or you can use preload/eager_load instead of includes. These behave similarly to includes but force a specific include strategy rather than trying to guess which is most appropriate.
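The kind of string scan described above can be sketched in plain Ruby. This is an illustration of the idea only: the TABLE_LIKE pattern and the helper names tables_in_string and join_strategy? are assumptions made for this sketch, not Rails' actual code, and the real heuristic may differ between versions.

```ruby
# Illustrative pattern (an assumption for this sketch, not Rails' code):
# an identifier, optionally one more character, then a dot.
TABLE_LIKE = /([a-zA-Z_]\w*).?\./

# Collect everything in a SQL fragment that looks like a table reference.
def tables_in_string(sql_fragment)
  sql_fragment.scan(TABLE_LIKE).flatten.map(&:downcase).uniq
end

# Guess that the joins strategy is needed when the fragment appears to
# mention any table other than the model's own.
def join_strategy?(sql_fragment, base_table)
  (tables_in_string(sql_fragment) - [base_table]).any?
end

p join_strategy?("name LIKE 'aa%'", "authors")         # prints: false
p join_strategy?("name LIKE 'aa:.bb%'", "authors")     # prints: true
p join_strategy?("authors.name LIKE 'aa%'", "authors") # prints: false
```

Under this sketch, the literal aa:.bb is mistaken for a reference to a table called aa, which would be enough to switch to the joined query and surface the ambiguous created_at column.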