Rails 3: has_many :uniq issue - ruby-on-rails

Here is a sample model setup (barebone Rails 3.0.5):
class Post < ActiveRecord::Base
  has_many :comments
end
class Comment < ActiveRecord::Base
  belongs_to :post
  belongs_to :user
end
class User < ActiveRecord::Base
  has_many :comments
  has_many :commented_posts, through: :comments, source: :post, uniq: true
end
Now the following works correctly:
ruby-1.9.2-p0 > user.commented_posts.count
SQL (0.2ms) SELECT COUNT(DISTINCT "posts".id) FROM "posts" INNER JOIN "comments" ON "posts".id = "comments".post_id WHERE (("comments".user_id = 1))
=> 1
But adding a condition makes ActiveRecord 'forget' about the uniq: true bit:
ruby-1.9.2-p0 > user.commented_posts.where("posts.id != 42").count
SQL (0.2ms) SELECT COUNT(*) FROM "posts" INNER JOIN "comments" ON "posts".id = "comments".post_id WHERE (("comments".user_id = 1)) AND (posts.id != 42)
=> 2
Bug? Or what am I missing?
edit:
all works:
ruby-1.9.2-p0 > user.commented_posts.where("posts.id != 42").all
Post Load (0.3ms) SELECT DISTINCT "posts".* FROM "posts" INNER JOIN "comments" ON "posts".id = "comments".post_id WHERE (("comments".user_id = 1)) AND (posts.id != 42)
=> [#<Post id: 1, created_at: "2011-03-07 12:17:30", updated_at: "2011-03-07 12:17:30">]
explicit uniq too:
ruby-1.9.2-p0 > user.commented_posts.where("posts.id != 42").uniq.count
Post Load (0.2ms) SELECT DISTINCT "posts".* FROM "posts" INNER JOIN "comments" ON "posts".id = "comments".post_id WHERE (("comments".user_id = 1)) AND (posts.id != 42)
=> 1
edit 2
Indeed, this is a bug in Rails. I submitted a patch; please upvote it so it gets through sooner. https://github.com/rails/rails/pull/2924#issuecomment-3317185

I've run into this as well on both 3.0.9 and 3.0.10. I consider it an Arel bug, although there might be some reason why it behaves this way.
I tried overriding count on the association...
has_many :commented_posts, through: :comments, source: :post, uniq: true do
  def count(column_name = nil, opts = {})
    super(column_name || 'users.id', opts.reverse_merge(distinct: true))
  end
end
But Arel ignores the overridden count method if a condition is present, which is why I think it's a bug.
As a workaround, I'm using @comments.count('users.id', distinct: true) to force Arel to behave in situations where @comments might have a condition attached in the controller.

I think the explicit uniq causes the count to be performed on the array in memory rather than in the database. You can tell from the SQL: it is SELECT DISTINCT rather than SELECT COUNT(DISTINCT ...). This pulls all the matching rows from the DB into Rails and then counts them.
You can get the DB to do the counting by passing a couple of parameters to count:
collection.count(:id, :distinct => true) will cause the correct SQL to be generated.
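To make the difference between the two counts concrete, here is a minimal plain-Ruby sketch with no database involved. It models the rows that the posts/comments join returns for one user who commented twice on the same post. The JoinedRow struct and the sample ids are hypothetical stand-ins, not ActiveRecord objects.

```ruby
# Hypothetical stand-in for one row of: posts INNER JOIN comments.
JoinedRow = Struct.new(:post_id, :comment_id)

# One user, two comments on the same post => the join yields two rows.
rows = [
  JoinedRow.new(1, 10),
  JoinedRow.new(1, 11)
]

# COUNT(*) counts join rows, so the post is counted once per comment.
count_star = rows.size

# COUNT(DISTINCT posts.id) counts each post once.
count_distinct = rows.map(&:post_id).uniq.size

puts "COUNT(*) = #{count_star}, COUNT(DISTINCT id) = #{count_distinct}"
```

This mirrors the 2 versus 1 results in the question: the condition itself is harmless, the difference comes from which COUNT form is generated.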

Related

How to search group_by with association in Rails

I'm not sure how best to describe this question, so first I'll show my models, which are related as follows:
category.rb
class Category < ApplicationRecord
  has_many :job_categories, dependent: :destroy
  has_many :jobs, through: :job_categories
end
job.rb
class Job < ApplicationRecord
  has_many :job_categories, dependent: :destroy
  has_many :categories, through: :job_categories
end
job_category.rb
class JobCategory < ApplicationRecord
  belongs_to :category, counter_cache: :jobs_count
  belongs_to :job
end
schema.rb
create_table "categories", force: :cascade do |t|
  t.string "name"
  t.string "parent"
end
parent is a column that holds the group name, for example Technology, under which sit the Technology-related categories such as ruby, rails, programming, etc.
Below is my query for showing categories grouped by parent:
Category.select(:id, :name, :parent).group_by{|p| p.parent}
and it shows output like this:
Technology
ruby
rails
etc
Now I want to show all jobs in a group such as Technology. I have this query for it:
Job.joins(:categories).where('lower(categories.parent) LIKE lower(?)', "%#{params[:parent]}%")
but it shows wrong output: if I have a single job whose categories are ruby and rails, that job is shown twice, once for ruby and once for rails.
Thanks
Your associations are correct. You can retrieve all unique jobs for a set of categories with the following:
Job.joins(:job_categories).joins(:categories).where('lower(categories.parent) LIKE lower(?)', "%#{params[:parent]}%").distinct
This joins jobs with the intermediate table job_categories and with categories on the relevant keys. The where clause then lets you select which records to retrieve, and distinct removes the duplicate rows that the join produces when a job matches more than one category.
SELECT DISTINCT "jobs".*
FROM "jobs"
INNER JOIN "job_categories" ON "job_categories"."job_id" = "jobs"."id"
INNER JOIN "job_categories" "job_categories_jobs_join" ON "job_categories_jobs_join"."job_id" = "jobs"."id"
INNER JOIN "categories" ON "categories"."id" = "job_categories_jobs_join"."category_id"
WHERE (lower(categories.parent) LIKE lower('%Technology%'))
Update:
Actually, we don't need the explicit join to job_categories either; the following should suffice:
Job.joins(:categories).where('lower(categories.parent) LIKE lower(?)', "%#{params[:parent]}%").distinct
SELECT DISTINCT "jobs".* FROM "jobs" INNER JOIN "job_categories" ON "job_categories"."job_id" = "jobs"."id" INNER JOIN "categories" ON "categories"."id" = "job_categories"."category_id" WHERE (lower(categories.parent) LIKE lower('%Technology%'))
Here are a few other options for fetching and grouping records through a has_many :through association:
# Filtering by query
Job.joins(:categories).select('jobs.id, jobs.name, categories.parent').where('lower(categories.parent) LIKE lower(?)', "Technology").distinct.inspect
# => #<ActiveRecord::Relation [#<Job id: 1, name: "Developer">, #<Job id: 2, name: "Debugger">]>
# Grouping by categories.parent, return a hash
Job.joins(:categories).select('jobs.id, jobs.name, categories.parent').all.distinct.group_by(&:parent)
# => {"Technology"=>[#<Job id: 1, name: "Developer">, #<Job id: 2, name: "Debugger">], "Mechanics"=>[#<Job id: 3, name: "Technic">]}
# Accessing the hash by key
Job.joins(:categories).select('jobs.id, jobs.name, categories.parent').all.distinct.group_by(&:parent)["Technology"]
#=> [#<Job id: 1, name: "Developer">, #<Job id: 2, name: "Debugger">]
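The duplicate rows and the effect of distinct can be mimicked in plain Ruby. This is only a sketch: JobRow is a hypothetical struct standing in for the selected columns (jobs.id, jobs.name, categories.parent), and the sample data assumes job 1 belongs to both the ruby and rails categories.

```ruby
# Hypothetical stand-in for one row of the jobs/categories join.
JobRow = Struct.new(:id, :name, :parent)

joined_rows = [
  JobRow.new(1, "Developer", "Technology"), # matched via category "ruby"
  JobRow.new(1, "Developer", "Technology"), # matched via category "rails"
  JobRow.new(3, "Technic", "Mechanics")
]

# Structs with equal members are equal, so uniq removes the repeated row,
# playing the role of SELECT DISTINCT.
distinct_rows = joined_rows.uniq

# Group the de-duplicated rows by their parent category.
grouped = distinct_rows.group_by(&:parent)
grouped.each do |parent, jobs|
  puts "#{parent}: #{jobs.map(&:name).join(', ')}"
end
```

Without the uniq step, the Technology group would list the Developer job twice, which is the symptom described in the question.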

Two belongs_to associations referring to the same table + eager loading

First of all, based on this (Rails association with multiple foreign keys) I figured out how to make two belongs_to associations point to the same table.
I have something like this:
class Book < ApplicationRecord
  belongs_to :author, inverse_of: :books
  belongs_to :co_author, inverse_of: :books, class_name: "Author"
end
class Author < ApplicationRecord
  has_many :books, ->(author) {
    unscope(:where).
      where("books.author_id = :author_id OR books.co_author_id = :author_id", author_id: author.id)
  }
end
It's all good. I can do either
book.author
book.co_author
author.books
However, sometimes I need to eager load books for multiple authors (to avoid N queries).
I am trying to do something like:
Author.includes(books: :title).where(name: ["Lewis Carroll", "George Orwell"])
Rails 5 throws at me: "ArgumentError: The association scope 'books' is instance dependent (the scope block takes an argument). Preloading instance dependent scopes is not supported."
What should I do?
Should I go with a many-to-many association? It sounds like a solution. However, it looks like it will introduce its own problems (I need "ordering", meaning that I need to explicitly differentiate between the main author and the co-author).
Just trying to figure out whether I am missing some simpler solution...
Why not use a HABTM relation? For example:
# Author model
class Author < ApplicationRecord
  has_and_belongs_to_many :books, join_table: :books_authors
end
# Book model
class Book < ApplicationRecord
  has_and_belongs_to_many :authors, join_table: :books_authors
end
# Create books_authors table
# Rails 5 migrations declare their version; adjust [5.0] to match your app
class CreateBooksAuthorsTable < ActiveRecord::Migration[5.0]
  def change
    create_table :books_authors do |t|
      t.references :book, index: true, foreign_key: true
      t.references :author, index: true, foreign_key: true
    end
  end
end
You can then eager load as follows:
irb(main):007:0> Author.includes(:books).where(name: ["Lewis Carroll", "George Orwell"])
Author Load (0.1ms) SELECT "authors".* FROM "authors" WHERE "authors"."name" IN (?, ?) LIMIT ? [["name", "Lewis Carroll"], ["name", "George Orwell"], ["LIMIT", 11]]
HABTM_Books Load (0.1ms) SELECT "books_authors".* FROM "books_authors" WHERE "books_authors"."author_id" IN (?, ?) [["author_id", 1], ["author_id", 2]]
Book Load (0.1ms) SELECT "books".* FROM "books" WHERE "books"."id" IN (?, ?) [["id", 1], ["id", 2]]
Try this:
Author.where(name: ["Lewis Carroll", "George Orwell"]).includes(:books).select(:title)
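If the two explicit columns (author_id and co_author_id) must stay, another route is to load the books for all the authors in one query and bucket them by hand, which sidesteps the preloading restriction. The sketch below shows only the bucketing step in plain Ruby. The Book struct, the books_by_author helper, and the sample records are hypothetical, and in a real app the list of books would come from a single query over both foreign keys.

```ruby
# Hypothetical stand-in for a books row with both foreign keys.
Book = Struct.new(:id, :title, :author_id, :co_author_id)

# Returns a hash of author id => books where that author is either
# the main author or the co-author.
def books_by_author(books, author_ids)
  index = Hash.new { |hash, key| hash[key] = [] }
  books.each do |book|
    # compact drops a missing co-author; uniq avoids listing a book twice
    # when the same person fills both roles.
    [book.author_id, book.co_author_id].compact.uniq.each do |author_id|
      index[author_id] << book if author_ids.include?(author_id)
    end
  end
  index
end

books = [
  Book.new(1, "First", 1, nil),
  Book.new(2, "Second", 2, 1),
  Book.new(3, "Third", 3, nil)
]

result = books_by_author(books, [1, 2])
result.each do |author_id, list|
  puts "#{author_id}: #{list.map(&:title).join(', ')}"
end
```

Because each book keeps its author_id and co_author_id, the main author and co-author remain distinguishable, which is the "ordering" concern raised in the question.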

Has Many 'finder_sql' Replacement in Rails 4.2

I've got an association that needs a few joins / custom queries. When trying to figure out how to implement this the repeated response is finder_sql. However in Rails 4.2 (and above):
ArgumentError: Unknown key: :finder_sql
My query to do the join looks like this:
'SELECT DISTINCT "tags".*' \
' FROM "tags"' \
' JOIN "articles_tags" ON "articles_tags"."tag_id" = "tags"."id"' \
' JOIN "articles" ON "articles_tags"."article_id" = "articles"."id"' \
' WHERE "articles"."user_id" = #{id}'
I understand that this can be achieved via:
has_many :tags, through: :articles
However if the cardinality of the join is large (i.e. a user has thousands of articles - but the system only has a few tags) it requires loading all the articles / tags:
SELECT * FROM articles WHERE user_id IN (1,2,...)
SELECT * FROM articles_tags WHERE article_id IN (1,2,3...) -- a lot
SELECT * FROM tags WHERE id IN (1,2,3) -- a few
And of course also curious about the general case.
Note: I also tried using the proc syntax but can't seem to figure it out:
has_many :tags, -> (user) {
  select('DISTINCT "tags".*')
    .joins('JOIN "articles_tags" ON "articles_tags"."tag_id" = "tags"."id"')
    .joins('JOIN "articles" ON "articles_tags"."article_id" = "articles"."id"')
    .where('"articles"."user_id" = ?', user.id)
}, class_name: "Tag"
ActiveRecord::StatementInvalid: PG::UndefinedColumn: ERROR: column tags.user_id does not exist
SELECT DISTINCT "tags".* FROM "tags" JOIN "articles_tags" ON "articles_tags"."tag_id" = "tags"."id" JOIN "articles" ON "articles_tags"."article_id" = "articles"."id" WHERE "tags"."user_id" = $1 AND ("articles"."user_id" = 1)
That is, it looks like it is trying to inject the user_id onto tags automatically (and that column only exists on articles). Note: I'm preloading for multiple users, so I can't use user.tags without other fixes (the SQL pasted above is what I see when doing exactly that). Thoughts?
While this doesn't fix your problem directly, if you only need a subset of your data you can potentially preload it via a subselect:
users = User.select('"users".*').select(%q{COALESCE((SELECT ARRAY_AGG(DISTINCT "tags"."name") ... WHERE "articles"."user_id" = "users"."id"), '{}') AS tag_names})
users.each do |user|
  puts user[:tag_names].join(' ')
end
The above is specific to Postgres (due to ARRAY_AGG), but an equivalent solution probably exists for other databases.
An alternative option might be to set up a view as a fake join table (again, this requires database support):
CREATE OR REPLACE VIEW tags_users AS (
  SELECT
    "users"."id" AS "user_id",
    "tags"."id" AS "tag_id"
  FROM "users"
  JOIN "articles" ON "users"."id" = "articles"."user_id"
  JOIN "articles_tags" ON "articles"."id" = "articles_tags"."article_id"
  JOIN "tags" ON "articles_tags"."tag_id" = "tags"."id"
  GROUP BY "user_id", "tag_id"
)
Then you can use has_and_belongs_to_many :tags (untested; you may want to mark it readonly, and you can drop some of the joins if you have proper foreign key constraints set up).
My guess is that you get the error when you try to access @user.tags, since you declared that association inside user.rb.
When we access @user.tags, Rails tries to fetch the tags that belong to the user, so it searches for tags whose user_id matches the current user's id. By default Rails derives the foreign key name from the owning model as modelname_id, so even if tags has no user_id column, it still adds WHERE "tags"."user_id" = ... to the query whether you want it or not, because its goal is to find the tags that belong to the current user.
My answer may not explain it 100%. Feel free to comment with your thoughts, and let me know if you find anything wrong.
Short Answer
OK, if I understand this correctly, I think I have a solution that uses only the core ActiveRecord utilities and does not need finder_sql.
You could potentially use:
user.tags.all.distinct
Or alternatively, in the user model change the has_many :tags declaration to
has_many :tags, -> {distinct}, through: :articles
You could also create a helper method in User to retrieve this:
def distinct_tags
  self.tags.all.distinct
end
The Proof
From your question I believe you have the following scenario:
A user can have many articles.
An article belongs to a single user.
Tags can belong to many articles.
Articles can have many tags.
You want to retrieve all the distinct tags a user has associated with the articles they have created.
With that in mind I created the following migrations:
class CreateUsers < ActiveRecord::Migration
  def change
    create_table :users do |t|
      t.string :name, limit: 255
      t.timestamps null: false
    end
  end
end
class CreateArticles < ActiveRecord::Migration
  def change
    create_table :articles do |t|
      t.string :name, limit: 255
      t.references :user, index: true, null: false
      t.timestamps null: false
    end
    add_foreign_key :articles, :users
  end
end
class CreateTags < ActiveRecord::Migration
  def change
    create_table :tags do |t|
      t.string :name, limit: 255
      t.timestamps null: false
    end
  end
end
class CreateArticlesTagsJoinTable < ActiveRecord::Migration
  def change
    create_table :articles_tags do |t|
      t.references :article, index: true, null: false
      t.references :tag, index: true, null: false
    end
    add_index :articles_tags, [:tag_id, :article_id], unique: true
    add_foreign_key :articles_tags, :articles
    add_foreign_key :articles_tags, :tags
  end
end
And the models:
class User < ActiveRecord::Base
  has_many :articles
  has_many :tags, through: :articles
  def distinct_tags
    self.tags.all.distinct
  end
end
class Article < ActiveRecord::Base
  belongs_to :user
  has_and_belongs_to_many :tags
end
class Tag < ActiveRecord::Base
  has_and_belongs_to_many :articles
end
Next seed the database with a lot of data:
10.times do |tagcount|
  Tag.create(name: "tag #{tagcount+1}")
end
5.times do |usercount|
  user = User.create(name: "user #{usercount+1}")
  1000.times do |articlecount|
    article = Article.new(user: user)
    5.times do |tagcount|
      article.tags << Tag.find(tagcount+usercount+1)
    end
    article.save
  end
end
Finally in rails console:
user = User.find(3)
user.distinct_tags
results in the following output:
Tag Load (0.4ms) SELECT DISTINCT `tags`.* FROM `tags` INNER JOIN `articles_tags` ON `tags`.`id` = `articles_tags`.`tag_id` INNER JOIN `articles` ON `articles_tags`.`article_id` = `articles`.`id` WHERE `articles`.`user_id` = 3
=> #<ActiveRecord::AssociationRelation [#<Tag id: 3, name: "tag 3", created_at: "2016-10-18 22:00:52", updated_at: "2016-10-18 22:00:52">, #<Tag id: 4, name: "tag 4", created_at: "2016-10-18 22:00:52", updated_at: "2016-10-18 22:00:52">, #<Tag id: 5, name: "tag 5", created_at: "2016-10-18 22:00:52", updated_at: "2016-10-18 22:00:52">, #<Tag id: 6, name: "tag 6", created_at: "2016-10-18 22:00:52", updated_at: "2016-10-18 22:00:52">, #<Tag id: 7, name: "tag 7", created_at: "2016-10-18 22:00:52", updated_at: "2016-10-18 22:00:52">]>
It may be helpful to use eager_load to force ActiveRecord to execute joins. It works the same as includes(:tags).references(:tags).
Here is a code snippet:
users.eager_load(:tags).map { |user| user.tags.inspect }
# equivalent to
users.includes(:tags).references(:tags).map { |user| user.tags.inspect }
where users is an ActiveRecord relation.
This code will hit the database at least twice:
Select only the users' ids (hopefully, not too many)
Select the users joined with tags through articles_tags, avoiding
SELECT * FROM articles_tags WHERE article_id IN (1,2,3...) -- a lot
You are on the right path with has_many :tags, through: :articles (or even better has_many :tags, -> {distinct}, through: :articles as Kevin suggests). But you should read a bit about includes vs preload vs eager_load. You are doing this:
User.preload(:tags).each {|u| ... }
But you should do this:
User.eager_load(:tags).each {|u| ... }
or this:
User.includes(:tags).references(:tags).each {|u| ... }
When I do that I get this query:
SELECT "users"."id" AS t0_r0,
"tags"."id" AS t1_r0,
"tags"."name" AS t1_r1
FROM "users"
LEFT OUTER JOIN "articles"
ON "articles"."user_id" = "users"."id"
LEFT OUTER JOIN "articles_tags"
ON "articles_tags"."article_id" = "articles"."id"
LEFT OUTER JOIN "tags"
ON "tags"."id" = "articles_tags"."tag_id"
But that is still going to send a lot of redundant stuff from the database to your app. This will be faster:
User.eager_load(:tags).distinct.each {|u| ... }
Giving:
SELECT DISTINCT "users"."id" AS t0_r0,
"tags"."id" AS t1_r0,
"tags"."name" AS t1_r1
FROM "users"
LEFT OUTER JOIN "articles"
ON "articles"."user_id" = "users"."id"
LEFT OUTER JOIN "articles_tags"
ON "articles_tags"."article_id" = "articles"."id"
LEFT OUTER JOIN "tags"
ON "tags"."id" = "articles_tags"."tag_id"
Doing just User.first.tags.map &:name gets me joins too:
SELECT DISTINCT "tags".*
FROM "tags"
INNER JOIN "articles_tags"
ON "tags"."id" = "articles_tags"."tag_id"
INNER JOIN "articles"
ON "articles_tags"."article_id" = "articles"."id"
WHERE "articles"."user_id" = ?
For more details, please see this github repo with an rspec test to see what SQL Rails is using.
There are three possible solutions:
1) Continue to use has_many associations
Fake the user_id column by adding it to the selected columns.
class User < ActiveRecord::Base
  has_many :tags, -> (user) {
    select(%Q{DISTINCT "tags".*, #{user.id} AS user_id })
      .joins('JOIN "articles_tags" ON "articles_tags"."tag_id" = "tags"."id"')
      .joins('JOIN "articles" ON "articles_tags"."article_id" = "articles"."id"')
      .where('"articles"."user_id" = ?', user.id)
  }, class_name: "Tag"
end
2) Add an instance method on the User class
If you only use tags for queries and don't use it in joins, you can use this approach:
class User
  def tags
    # Build the query on Tag; select/joins/where are not available on a User instance
    Tag.select(%Q{DISTINCT "tags".*})
      .joins('JOIN "articles_tags" ON "articles_tags"."tag_id" = "tags"."id"')
      .joins('JOIN "articles" ON "articles_tags"."article_id" = "articles"."id"')
      .where('"articles"."user_id" = ?', id)
  end
end
Now user.tags behaves like an association for all practical purposes.
3) OTOH, using EXISTS might be more performant than using DISTINCT
class User < ActiveRecord::Base
  def tags
    exists_sql = %Q{
      SELECT 1
      FROM articles,
           articles_tags
      WHERE "articles"."user_id" = #{id} AND
            "articles_tags"."article_id" = "articles"."id" AND
            "articles_tags"."tag_id" = "tags"."id"
    }
    Tag.where(%Q{ EXISTS ( #{exists_sql} ) })
  end
end
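The difference between the join-plus-DISTINCT shape and the EXISTS shape can be sketched in plain Ruby. The structs and sample rows below are hypothetical stand-ins for the articles, articles_tags, and tags tables. The sketch only illustrates the logic of the two queries and says nothing about how a particular database will plan them.

```ruby
# Hypothetical stand-ins for rows of the three tables.
Article = Struct.new(:id, :user_id)
ArticleTag = Struct.new(:article_id, :tag_id)
Tag = Struct.new(:id, :name)

articles = [Article.new(1, 7), Article.new(2, 7), Article.new(3, 8)]
articles_tags = [ArticleTag.new(1, 1), ArticleTag.new(2, 1), ArticleTag.new(3, 2)]
tags = [Tag.new(1, "ruby"), Tag.new(2, "rails")]

user_id = 7
user_article_ids = articles.select { |a| a.user_id == user_id }.map(&:id)

# Join then DISTINCT: one row per matching link, de-duplicated afterwards.
joined = articles_tags
  .select { |link| user_article_ids.include?(link.article_id) }
  .map { |link| tags.find { |tag| tag.id == link.tag_id } }
via_distinct = joined.uniq

# EXISTS: each tag is tested once and kept if any matching link exists,
# so no duplicate rows are produced in the first place.
via_exists = tags.select do |tag|
  articles_tags.any? do |link|
    link.tag_id == tag.id && user_article_ids.include?(link.article_id)
  end
end

puts "joined rows: #{joined.size}, distinct: #{via_distinct.size}, exists: #{via_exists.size}"
```

Both approaches return the same tags; the EXISTS form simply never materializes the intermediate duplicates.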

Limit maximum associations in Rails

I have two Models Team and Match and a TeamMatch association.
class Match < ActiveRecord::Base
  has_many :teams, :through => :team_matches, :source => :team
  has_many :team_matches
  def attend(team)
    self.team_matches.create!(:team => team)
  rescue ActiveRecord::RecordInvalid
    nil
  end
end
class Team < ActiveRecord::Base
  has_many :matches, :through => :team_matches, :source => :match
  has_many :team_matches
end
class TeamMatch < ActiveRecord::Base
  belongs_to :match
  belongs_to :team
end
How do I restrict how many Teams can be assigned to a Match?
EDIT:
Update according to suggestions, with m = FactoryGirl.create(:match) and t, t1, t2 each created with FactoryGirl.create(:team):
1.9.3p194 :005 > m.attend(t)
(0.1ms) BEGIN
TeamMatch Exists (0.3ms) SELECT 1 AS one FROM `team_matches` WHERE (`team_matches`.`team_id` = BINARY 1 AND `team_matches`.`match_id` = 1) LIMIT 1
SQL (0.2ms) INSERT INTO `team_matches` (`match_id`, `team_id`) VALUES (1, 1)
(0.4ms) COMMIT
=> #<TeamMatch id: 1, match_id: 1, team_id: 1>
1.9.3p194 :006 > m.attend(t1)
(0.1ms) BEGIN
TeamMatch Exists (0.3ms) SELECT 1 AS one FROM `team_matches` WHERE (`team_matches`.`team_id` = BINARY 2 AND `team_matches`.`match_id` = 1) LIMIT 1
SQL (0.1ms) INSERT INTO `team_matches` (`match_id`, `team_id`) VALUES (1, 2)
(0.4ms) COMMIT
=> #<TeamMatch id: 2, match_id: 1, team_id: 2>
1.9.3p194 :007 > m.attend(t2)
(0.1ms) BEGIN
TeamMatch Exists (0.3ms) SELECT 1 AS one FROM `team_matches` WHERE (`team_matches`.`team_id` = BINARY 3 AND `team_matches`.`match_id` = 1) LIMIT 1
SQL (0.2ms) INSERT INTO `team_matches` (`match_id`, `team_id`) VALUES (1, 3)
(0.4ms) COMMIT
I just realized there is a much neater solution, which is possible because this is a has_many :through association:
class TeamMatch < ActiveRecord::Base
  belongs_to :match
  belongs_to :team
  validate :teams_per_match_limit
  def teams_per_match_limit
    # match.team_matches plays the role of the generic parent.children collection
    errors.add(:base, 'blah') if match.team_matches.size > 1
  end
end
You can use an association callback.
has_many :teams, :through => :team_matches, :source => :team, :before_add => :limit_number_of_teams
def limit_number_of_teams(added_team)
raise Exception.new('Team limit for the match reached') if teams.size >= 2
end
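The guard behind the before_add callback can be seen in isolation in plain Ruby. This is a sketch rather than ActiveRecord code: MatchRoster, TeamLimitReached, and MAX_TEAMS are hypothetical names, and it raises a StandardError subclass instead of Exception so that an ordinary rescue clause can catch it.

```ruby
# Hypothetical error class; a StandardError subclass is caught by a bare rescue.
class TeamLimitReached < StandardError; end

# Hypothetical stand-in for a Match and its teams collection.
class MatchRoster
  MAX_TEAMS = 2

  attr_reader :teams

  def initialize
    @teams = []
  end

  # Refuses the addition once the limit is reached, before the list changes.
  def attend(team)
    raise TeamLimitReached, "Team limit for the match reached" if teams.size >= MAX_TEAMS

    teams << team
    team
  end
end

roster = MatchRoster.new
roster.attend("t")
roster.attend("t1")

begin
  roster.attend("t2")
rescue TeamLimitReached => e
  puts e.message
end
```

The check runs before the collection is modified, which is the same ordering that before_add gives the callback in the answer above.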
You should add some kind of custom validation like this:
class Match < ActiveRecord::Base
  has_many :teams, :through => :team_matches, :source => :team
  has_many :team_matches
  validate :max_associate
  def max_associate
    errors.add(:teams, "can not be added more than 5") if teams.length > 5
  end
end

Condition on join tables - Rails

My models:
class Review < ActiveRecord::Base
  belongs_to :business
end
class Business < ActiveRecord::Base
  has_many :reviews
  has_and_belongs_to_many :categories
end
I want to get the Reviews for businesses under a certain category:
Review.joins(:business => :categories).where(:business => {:categories => [1,2,3,4]})
The resulting query:
SELECT "reviews".* FROM "reviews" INNER JOIN
"businesses" ON "businesses"."id" = "reviews"."business_id" INNER JOIN
"businesses_categories" ON "businesses_categories"."business_id" = "businesses"."id"
INNER JOIN "categories" ON "categories"."id" = "businesses_categories"."category_id"
WHERE "business"."categories" IN (1, 2, 3, 4)
However, I am getting the following error:
ActiveRecord::StatementInvalid: PG::Error: ERROR: missing FROM-clause entry
for table "business"
LINE 1: ...id" = "businesses_categories"."category_id" WHERE "business"...
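The keys of the hash passed to where are treated as table names, not association names, and the nested key must be a column. There is no table called business, hence the error. Use this: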
Review.joins(:business => :categories).where( :categories => { :id => [1,2,3,4] } )
