cannot chain joins using string to left_joins - ruby-on-rails

Summary:
I have a many to many relationship between attachments and rules, through alerts.
I have a given rule and a given selection of attachments (those with a given bug_id).
I need to go through all the selected attachments and indicate whether there is an alert for the rule or not, with a different CSS background-color.
Outer Join
I get the correct results with the following query:
SELECT attachments.*, alerts.rule_id
FROM attachments
LEFT OUTER JOIN alerts ON alerts.attachment_id = attachments.id
and alerts.rule_id = 9
WHERE attachments.bug_id = 38871;
I'm looking for something like:
bug.attachments
.left_joins(alerts: {'rules.id' => 9})
.select('attachments.*, alerts.rule_id')
Database
class Alert < ApplicationRecord
belongs_to :attachment
end
class Attachment < ApplicationRecord
has_many :alerts
end
attachments
| id | bug_id |
| 14612 | 38871 |
| 14613 | 38871 |
| 14614 | 38871 |
alerts
| attachment_id | rule_id |
| 14612 | 9 |
| 14614 | 8 |
Condition in the ON Clause
Without the alerts.rule_id = 9 condition in the join's ON clause, we get the following result:
| id | rule_id |
| 14612 | 9 |
| 14614 | 8 |
| 14613 | NULL |
So a WHERE clause such as WHERE alerts.rule_id = 9 OR alerts.rule_id IS NULL would lose the row for 14614: its only alert is for rule 8, which is neither 9 nor NULL.
So the following won't work:
bug.attachments
.joins(:alerts)
.select('attachments.*, alerts.rule_id')
.where( ??? )
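To see why the position of the rule condition matters, here is a minimal plain-Ruby sketch that emulates the LEFT OUTER JOIN on the sample rows above. The helper left_join_alerts and the in-memory arrays are hypothetical stand-ins for the database, not ActiveRecord API.

```ruby
# Sample rows copied from the tables above.
attachments = [
  { id: 14612, bug_id: 38871 },
  { id: 14613, bug_id: 38871 },
  { id: 14614, bug_id: 38871 }
]
alerts = [
  { attachment_id: 14612, rule_id: 9 },
  { attachment_id: 14614, rule_id: 8 }
]

# Hypothetical helper emulating LEFT OUTER JOIN alerts ON <block>.
# Each attachment yields at least one row; rule_id is nil when no alert matches.
def left_join_alerts(attachments, alerts)
  attachments.flat_map do |att|
    matches = alerts.select { |al| yield(att, al) }
    matches = [{ rule_id: nil }] if matches.empty?
    matches.map { |al| { id: att[:id], rule_id: al[:rule_id] } }
  end
end

# Rule filter inside the ON clause: every attachment is kept.
on_rows = left_join_alerts(attachments, alerts) do |att, al|
  al[:attachment_id] == att[:id] && al[:rule_id] == 9
end

# Rule filter moved to WHERE: 14614 first joins to its rule-8 alert,
# then the WHERE drops it because 8 is neither 9 nor NULL.
joined = left_join_alerts(attachments, alerts) do |att, al|
  al[:attachment_id] == att[:id]
end
where_rows = joined.select { |row| row[:rule_id] == 9 || row[:rule_id].nil? }

on_ids = on_rows.map { |row| row[:id] }       # all three attachments
where_ids = where_rows.map { |row| row[:id] } # 14614 is missing
```

With the condition in the ON clause, 14614 comes back with a nil rule_id, which is what the view needs in order to pick the "no alert" background colour.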
Edit
The above is a simplified and corrected version of my original question.
The original question is below:
alerts belongs to rules and attachments, and attachments belong to bugs.
class Alert < ApplicationRecord
belongs_to :attachment
belongs_to :rule
end
class Attachment < ApplicationRecord
belongs_to :bug
has_many :alerts
end
class Bug < ApplicationRecord
has_many :attachments
end
For a given rule, I need to show all the attachments for a given bug, and whether there is an alert or not. I want the following SQL:
SELECT attachments.*, alerts.id as alert_id
FROM `attachments`
LEFT OUTER JOIN `alerts` ON `alerts`.`attachment_id` = `attachments`.`id`
LEFT OUTER JOIN `rules` ON `rules`.`id` = `alerts`.`rule_id` AND rules.id = 9
WHERE `attachments`.`bug_id` = 38871
I can get this from:
bug.attachments
.joins("LEFT OUTER JOIN `alerts` ON `alerts`.`attachment_id` = `attachments`.`id`")
.joins("LEFT OUTER JOIN `rules` ON `rules`.`id` = `alerts`.`rule_id` AND rules.id = 9")
.select('attachments.*, alerts.id as alert_id')
.map{|attach| [attach.file_name, attach.alert_id]}
What I want to know is how to avoid calling joins with a string SQL fragment.
I'm looking for something like:
bug.attachments
.left_joins(alerts: {rule: {'rules.id' => 9}})
.select('attachments.*, alerts.id as alert_id')
.map{|attach| [attach.file_name, attach.alert_id]}
Is there any way to avoid passing an SQL string?

Actually, I think you will be able to get the right results by putting rules.id = 9 in the WHERE clause:
SELECT attachments.*, alerts.id as alert_id
FROM `attachments`
LEFT OUTER JOIN `alerts` ON `alerts`.`attachment_id` = `attachments`.`id`
LEFT OUTER JOIN `rules` ON `rules`.`id` = `alerts`.`rule_id`
WHERE `attachments`.`bug_id` = 38871 AND (rules.id = 9 OR rules.id IS NULL)

Related

Ruby On Rails + PostgreSQL: If not matching data found, return the row as nil

I have these two models:
class ModelA < ApplicationRecord
has_one :model_b
end
class ModelB < ApplicationRecord
belongs_to :model_a
end
Data in DB tables:
model_a
id | ...
1 | ...
2 | ...
3 | ...
model_b
id | model_a_id | value_a | value_b
1 | 1 | abc | def
2 | 2 | ghi | jkl
For every record in model_a, I want to get a record from table model_b. I can get it like this:
ModelA.joins('LEFT JOIN model_b ON model_b.model_a_id = model_a.id')
This query would return me the rows with ID 1 and 2 from the table model_a. However, I would like to get returned also the row with ID 3 from the table model_a and for this row, I would want to get returned the associated (in this case, non-existing) row from model_b with these values:
value_a: NULL
value_b: NULL
How do I do that? I tried to play with different JOINS, with CASE IF/ELSE/END, but I happened to not find the right combination.
As I need to be able to filter/query these data, I believe it would be probably better to solve this on the PSQL level, rather than on Rails.
EDIT: RIGHT JOIN returns only the first 2 rows from model_a.
EDIT2: This is the desired output:
model_a.id | model_b.value_a | model_b.value_b
1 | abc | def
2 | ghi | jkl
3 | null | null
Thank you in advance.
That's called a left outer join
ModelA.joins('LEFT OUTER JOIN model_b ON model_b.model_a_id = model_a.id')
It will return all ModelA records even if no modelB record is present.
In pure rails...
ModelA.includes(:model_b)
To explicitly include the columns that may have nil...
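# includes alone preloads in a separate query, so the model_b columns need a real join (left_joins, Rails 5+)
records = ModelA.left_joins(:model_b).select('model_a.*, model_b.value_a as model_b_value_a, model_b.value_b as model_b_value_b')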
This lets you do records.first.id to see the model_a id, and records.first.model_b_value_a etc to see the value from model_b
For records without an associated model_b record, records.first.model_b_value_a will return nil

Rails query, based on a scope from an unrelated model

I want to find all of a user's convos where there is not a connect
I have a convos table, with a sender_id and recipient_id which are both references to a user id
# app/models/user.rb
has_many :convos, ->(user) {
unscope(:where).where("sender_id = :id OR recipient_id = :id", id: user.id)
}
Note the convo can belong to a user that is either sender_id OR recipient_id.
# app/models/convo.rb
class Convo < ApplicationRecord
belongs_to :sender, :foreign_key => :sender_id, class_name: 'User'
belongs_to :recipient, :foreign_key => :recipient_id, class_name: 'User'
has_many :msgs, dependent: :destroy
validates_uniqueness_of :sender_id, :scope => :recipient_id
scope :involving, -> (user) do
where("convos.sender_id =? OR convos.recipient_id =?",user.id,user.id)
end
scope :between, -> (sender_id,recipient_id) do
where("(convos.sender_id = ? AND convos.recipient_id =?) OR (convos.sender_id = ? AND convos.recipient_id =?)", sender_id,recipient_id, recipient_id, sender_id)
end
end
Connect table has a requestor_id and requestee_id which are both references to a user id.
Connect model
class Connect < ApplicationRecord
belongs_to :requestor, :foreign_key => :requestor_id, class_name: 'User'
belongs_to :requestee, :foreign_key => :requestee_id, class_name: 'User'
scope :between, -> (requestor_id,requestee_id) do
where("(connects.requestor_id = ? AND connects.requestee_id =?) OR (connects.requestor_id = ? AND connects.requestee_id =?)", requestor_id,requestee_id, requestee_id, requestor_id)
end
end
I want to find all of a user's convos where there is not a connect
I've tried something like:
user = User.first
user.convos.where.not(Connect.between(self.requestor_id, self.requestee_id).length > 0 )
# NoMethodError (undefined method `requestor_id' for main:Object)
user.convos.where.not(Connect.between(convo.requestor_id, convo.requestee_id).length > 0 )
# undefined local variable or method `convo' for main:Object
Then I tried without referencing a user at all, and just tried to get all convos without a connect.
Convo.where("Connect.between(? ,?) < ?)", :sender_id, :recipient_id, 1)
# ActiveRecord::StatementInvalid (SQLite3::SQLException: near "between": syntax error: SELECT "convos".* FROM "convos" WHERE (Connect.between('sender_id' ,'recipient_id') < 1)))
Convo.where("Connect.between(? ,?) < ?)", self.sender_id, self.recipient_id, 1)
# NoMethodError (undefined method `sender_id' for main:Object)
What is the best way to get all the user's convos where a connect doesn't exist?
UPDATE
This works, and is what I'm looking for, but obviously this is trashy, and I'd like to understand how to get this in one call.
@og_connections = []
current_user.convos.each do |convo|
if Connect.between(convo.sender_id, convo.recipient_id).length == 0
@og_connections.push(current_user.id == convo.sender_id ? convo.recipient_id : convo.sender_id)
end
end
@connections = User.select(:id, :first_name, :slug).where(id: @og_connections, status: 'Active')
You can use LEFT JOIN to get the users rows that have a match between users.id and convos.sender_id or convos.recipient_id, but no match between users.id and connects.requestor_id or connects.requestee_id:
SELECT *
FROM users
LEFT JOIN connects
ON users.id IN (connects.requestor_id, connects.requestee_id)
LEFT JOIN convos
ON users.id IN (convos.sender_id, convos.recipient_id)
WHERE connects.requestor_id IS NULL AND
connects.requestee_id IS NULL AND
convos.sender_id IS NOT NULL AND
convos.recipient_id IS NOT NULL
AR implementation:
User.joins('LEFT JOIN connects ON users.id IN (connects.requestor_id, connects.requestee_id)
LEFT JOIN convos ON users.id IN (convos.sender_id, convos.recipient_id)')
.where(connects: { requestor_id: nil, requestee_id: nil })
.where.not(convos: { sender_id: nil, recipient_id: nil })
Considering a DB structure like this:
db=# \d+ users
Table "public.users"
Column | Type | Collation | Nullable | Default | Storage | Stats target | Description
------------+--------------------------------+-----------+----------+-----------------------------------+----------+--------------+-------------
id | bigint | | not null | nextval('users_id_seq'::regclass) | plain | |
name | character varying | | | | extended | |
created_at | timestamp(6) without time zone | | not null | | plain | |
updated_at | timestamp(6) without time zone | | not null | | plain | |
Indexes:
"users_pkey" PRIMARY KEY, btree (id)
db=# \d+ convos
Table "public.convos"
Column | Type | Collation | Nullable | Default | Storage | Stats target | Description
--------------+--------------------------------+-----------+----------+------------------------------------+---------+--------------+-------------
id | bigint | | not null | nextval('convos_id_seq'::regclass) | plain | |
sender_id | integer | | | | plain | |
recipient_id | integer | | | | plain | |
created_at | timestamp(6) without time zone | | not null | | plain | |
updated_at | timestamp(6) without time zone | | not null | | plain | |
Indexes:
"convos_pkey" PRIMARY KEY, btree (id)
db=# \d+ connects
Table "public.connects"
Column | Type | Collation | Nullable | Default | Storage | Stats target | Description
--------------+--------------------------------+-----------+----------+--------------------------------------+---------+--------------+-------------
id | bigint | | not null | nextval('connects_id_seq'::regclass) | plain | |
requestor_id | integer | | | | plain | |
requestee_id | integer | | | | plain | |
created_at | timestamp(6) without time zone | | not null | | plain | |
updated_at | timestamp(6) without time zone | | not null | | plain | |
Indexes:
"connects_pkey" PRIMARY KEY, btree (id)
With the following records:
db=# select * from users;
id | name | created_at | updated_at
----+------+----------------------------+----------------------------
1 | seb | 2019-11-27 09:59:53.762911 | 2019-11-27 09:59:53.762911
2 | sab | 2019-11-27 09:59:55.455096 | 2019-11-27 09:59:55.455096
3 | foo | 2019-11-27 10:07:19.760675 | 2019-11-27 10:07:19.760675
4 | bar | 2019-11-27 10:07:36.18696 | 2019-11-27 10:07:36.18696
5 | meh | 2019-11-27 10:07:38.465841 | 2019-11-27 10:07:38.465841
(5 rows)
db=# select * from convos;
id | sender_id | recipient_id | created_at | updated_at
----+-----------+--------------+----------------------------+----------------------------
1 | 1 | 2 | 2019-11-27 10:09:36.742426 | 2019-11-27 10:09:36.742426
2 | 1 | 3 | 2019-11-27 10:09:40.555118 | 2019-11-27 10:09:40.555118
(2 rows)
db=# select * from connects;
id | requestor_id | requestee_id | created_at | updated_at
----+--------------+--------------+----------------------------+----------------------------
1 | 1 | 2 | 2019-11-27 10:07:07.76146 | 2019-11-27 10:07:07.76146
2 | 2 | 1 | 2019-11-27 10:07:11.380084 | 2019-11-27 10:07:11.380084
3 | 1 | 4 | 2019-11-27 10:07:47.892944 | 2019-11-27 10:07:47.892944
4 | 5 | 1 | 2019-11-27 10:07:51.406224 | 2019-11-27 10:07:51.406224
(4 rows)
The following query will return only the second convo, because user with id 3 doesn't have any connect.
SELECT convos.*
FROM convos
LEFT JOIN users
ON users.id IN (convos.sender_id, convos.recipient_id)
LEFT JOIN connects
ON users.id IN (connects.requestor_id, connects.requestee_id)
WHERE connects.requestor_id IS NULL AND connects.requestee_id IS NULL
id | sender_id | recipient_id | created_at | updated_at | id | name | created_at | updated_at | id | requestor_id | requestee_id | created_at | updated_at
----+-----------+--------------+----------------------------+----------------------------+----+------+----------------------------+----------------------------+----+--------------+--------------+------------+------------
2 | 1 | 3 | 2019-11-27 10:09:40.555118 | 2019-11-27 10:09:40.555118 | 3 | foo | 2019-11-27 10:07:19.760675 | 2019-11-27 10:07:19.760675 | | | | |
(1 row)
The Rails query for that can be this:
Convo
.joins('LEFT JOIN users ON users.id IN (convos.sender_id, convos.recipient_id)
LEFT JOIN connects ON users.id IN (connects.requestor_id, connects.requestee_id)')
.where(connects: { requestor_id: nil, requestee_id: nil })
Answer with current setup
If you're looking for just current_user, you'll want to start with their convos, do a left join to connects, and select the rows where connects is NULL. With your table setup, we'll have to write this join manually over the possible user id combinations:
current_user.convos.joins("
LEFT JOIN connects ON
(connects.requestor_id = convos.sender_id AND connects.requestee_id = convos.recipient_id)
OR
(connects.requestor_id = convos.recipient_id AND connects.requestee_id = convos.sender_id)
").where(connects: {id: nil})
The left join gives you any connects that are between the same two users as the convo, which necessarily involves current_user since we started with current_user.convos. From there we filter down to only the rows where the connects fields are NULL, which leaves the convos that do not have a matching connect.
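As a rough illustration of what that join-and-filter computes, here is a plain-Ruby sketch over in-memory rows shaped like the convos and connects sample data posted earlier in this thread. The helper names connect_between? and convos_without_connect are hypothetical and only mirror the SQL logic; they are not part of Rails.

```ruby
# Sample rows shaped like the convos and connects tables in this thread.
convos = [
  { id: 1, sender_id: 1, recipient_id: 2 },
  { id: 2, sender_id: 1, recipient_id: 3 }
]
connects = [
  { requestor_id: 1, requestee_id: 2 },
  { requestor_id: 2, requestee_id: 1 },
  { requestor_id: 1, requestee_id: 4 },
  { requestor_id: 5, requestee_id: 1 }
]

# Mirrors the ON clause: a connect matches the pair in either direction.
def connect_between?(connects, user_a_id, user_b_id)
  connects.any? do |c|
    (c[:requestor_id] == user_a_id && c[:requestee_id] == user_b_id) ||
      (c[:requestor_id] == user_b_id && c[:requestee_id] == user_a_id)
  end
end

# Mirrors current_user.convos joined to connects, keeping rows with no match.
def convos_without_connect(convos, connects, user_id)
  convos
    .select { |cv| [cv[:sender_id], cv[:recipient_id]].include?(user_id) }
    .reject { |cv| connect_between?(connects, cv[:sender_id], cv[:recipient_id]) }
end

unconnected_ids = convos_without_connect(convos, connects, 1).map { |cv| cv[:id] }
```

For user 1 only the convo with user 3 remains, because users 1 and 2 already have a connect in both directions.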
Suggestion
That much raw SQL is a bit of code smell in a Rails app, and it's because of what we're trying to do here with the models set up as they are. I'd suggest refactoring the data models to make the queries easier. Two ideas come to mind:
1. Always create a symmetrical record for a connect and a convo, so you can look up by a single column instead of using all the ORs. That is, whenever you create a connect between user 1 and user 2, also create one between user 2 and user 1. This means more bookkeeping, since you'd have to destroy and edit them together as well, but it lets you define simple associations without all the hoops.
2. Use a separate table to refer to unique user pairs (order doesn't matter). To do this, create a UserPair model with user_1_id and user_2_id, where user_1_id is always set to the lower of the two user ids. That way, a convo can be more easily identified by a user_pair_id, a UserPair can has_many :convos and has_many :connects, and you can do a straight Rails join between convos -> user_pairs -> connects.
The models in 2 would look something like
class UserPair < ApplicationRecord
belongs_to :user_1, class_name: "User"
belongs_to :user_2, class_name: "User"
before_save :sort_users
scope :between, -> (user_1_id,user_2_id) do
# records are always saved in sorted id order, so sort before querying
user_1_id, user_2_id = [user_1_id, user_2_id].sort
where(user_1_id: user_1_id, user_2_id: user_2_id)
end
# always put lowest id first for easy lookup
def sort_users
if user_1.present? && user_2.present? && user_1.id > user_2.id
self.user_1, self.user_2 = user_2, user_1
end
end
end
class Convo < ApplicationRecord
belongs_to :sender, :foreign_key => :sender_id, class_name: 'User'
belongs_to :recipient, :foreign_key => :recipient_id, class_name: 'User'
belongs_to :user_pair
before_validation :set_user_pair
scope :involving, -> (user) do
where("convos.sender_id =? OR convos.recipient_id =?",user.id,user.id)
end
# since UserPair records are always user_id sorted, we can just use
# that model's scope here without need to repeat it, using `merge`
scope :between, -> (sender_id,recipient_id) do
joins(:user_pair).merge(UserPair.between(sender_id, recipient_id))
end
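def set_user_pair
# UserPair rows are stored lowest-id-first, so look them up in that order
return unless sender && recipient
first, second = [sender, recipient].sort_by(&:id)
self.user_pair = UserPair.find_or_initialize_by(user_1: first, user_2: second)
end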
end
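The lowest-id-first convention behind UserPair can be sketched in plain Ruby without a database. The pair_key helper and the in-memory rows below are hypothetical, chosen only to show that both directions of a relationship collapse onto one key.

```ruby
# Hypothetical stand-in for UserPair.between: ids are always ordered lowest-first.
def pair_key(user_a_id, user_b_id)
  [user_a_id, user_b_id].sort
end

# Illustrative rows (not taken from the question).
convos = [
  { id: 1, sender_id: 1, recipient_id: 2 },
  { id: 2, sender_id: 3, recipient_id: 1 }
]
connects = [
  { id: 1, requestor_id: 1, requestee_id: 2 },
  { id: 2, requestor_id: 2, requestee_id: 1 }
]

# Group both tables by the normalized pair, as a user_pair_id column would.
convos_by_pair = convos.group_by { |cv| pair_key(cv[:sender_id], cv[:recipient_id]) }
connects_by_pair = connects.group_by { |cn| pair_key(cn[:requestor_id], cn[:requestee_id]) }

# Pairs that have a convo but no connect become a plain key difference.
pairs_without_connect = convos_by_pair.keys - connects_by_pair.keys
```

Because (1, 2) and (2, 1) share a key, the OR-heavy join condition is no longer needed; comparing keys is enough.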
So if I understand correctly, from the list of users a user has a conversation with, you want the list of users that they do not have a connection with.
In a simple way this could be something like:
users_conversed_with = user.convos.map{|c| [c.sender_id, c.recipient_id]}.flatten.uniq
users_connected_with = user.connections.map{|c| [c.requestor_id, c.requestee_id]}.flatten.uniq
Both sets also contain the user.id, but we can ignore that, because we are interested in the difference: that would be the set of people we conversed with, without connection (and because user.id will be in both, unless one of them is empty, we do not have to separately remove user.id from those sets).
users_not_connected_with = users_conversed_with - users_connected_with
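Here is the same set difference run on plain hashes, as a small sketch. The rows are shaped like the sample convos and connects data posted elsewhere in this thread, and user_id is an assumed value standing in for user.id.

```ruby
user_id = 1 # assumed current user

# Rows that already involve user_id, as user.convos and the user's connects would return.
convos = [
  { sender_id: 1, recipient_id: 2 },
  { sender_id: 1, recipient_id: 3 }
]
connects = [
  { requestor_id: 1, requestee_id: 2 },
  { requestor_id: 2, requestee_id: 1 },
  { requestor_id: 1, requestee_id: 4 },
  { requestor_id: 5, requestee_id: 1 }
]

users_conversed_with = convos.flat_map { |c| [c[:sender_id], c[:recipient_id]] }.uniq
users_connected_with = connects.flat_map { |c| [c[:requestor_id], c[:requestee_id]] }.uniq

# user_id is in both lists here, so the difference removes it automatically.
users_not_connected_with = users_conversed_with - users_connected_with

# Caveat from the text: with no connects at all, user_id stays in the result.
without_any_connects = users_conversed_with - []
```

In the last case the user's own id would have to be removed explicitly.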
This is not an optimal approach, because we do two queries, retrieve all the user-ids from the database, to then discard probably most of the retrieved data. We could improve this by creating a custom query, and let the database do the work for us, like so
sql = <<-SQL
(select distinct user_id from
(select sender_id as user_id from convos where sender_id=#{user.id} or recipient_id=#{user.id}
union
select recipient_id as user_id from convos where sender_id=#{user.id} or recipient_id=#{user.id}
) as conversed
)
except
(select distinct user_id from
(select requestor_id as user_id from connects where requestor_id=#{user.id} or requestee_id=#{user.id}
union
select requestee_id as user_id from connects where requestor_id=#{user.id} or requestee_id=#{user.id}
) as connected
)
SQL
result = Convo.connection.execute(sql)
users_ids_in_convo_without_connection = result.to_a.map(&:values).flatten
But if performance is not an issue, your code has the advantage of being very readable and clearer in its intention.
I'll first write the SQL query to do so. In your case you perhaps want
SELECT convos.*
FROM convos
WHERE (sender_id = :user_id
AND NOT EXISTS (
SELECT 1
FROM connects
WHERE (requestor_id = sender_id AND requestee_id = recipient_id) OR (requestor_id = recipient_id AND requestee_id = sender_id)
))
OR
(recipient_id = :user_id
AND NOT EXISTS (
SELECT 1
FROM connects
WHERE (requestor_id = recipient_id AND requestee_id = sender_id) OR (requestor_id = sender_id AND requestee_id = recipient_id)
))
This can then be converted into an AR query. The join condition matches a connect between the same two users in either direction, as in the SQL above.
class Convo < ApplicationRecord
def self.no_connects(user_id = nil)
q = joins('
LEFT JOIN connects ON
(connects.requestor_id = convos.sender_id AND connects.requestee_id = convos.recipient_id)
OR
(connects.requestor_id = convos.recipient_id AND connects.requestee_id = convos.sender_id)
')
q = q.where('connects.requestor_id IS NULL AND connects.requestee_id IS NULL')
q = q.where("convos.sender_id = :user_id OR convos.recipient_id = :user_id", user_id: user_id) if user_id
q
end
end
To get all the convos without connects
Convo.no_connects
For single user
Convo.no_connects(current_user.id)

Finding objects that are not associated in has_many_through

I have simple classes like these:
class Book
has_many :book_categorizations
has_many :categories, through: :book_categorizations, source: :book_category
end
class BookCategorization
belongs_to :book
belongs_to :book_category
end
class BookCategory
has_many :book_categorizations
has_many :books, through: :book_categorizations
end
I would like to find Books that have no category. How can I query that using where?
You could add a scope with a LEFT JOIN to your model:
# in book.rb
scope :without_categories, lambda {
joins('LEFT JOIN book_categorizations ON books.id = book_categorizations.book_id').
where(book_categorizations: { book_category_id: nil })
}
Which could be used like:
Book.without_categories
#=> returns books without a category
How it works:
Imagine you have a fruits and a colors table:
fruits
id | name
1 | Apple
2 | Orange
3 | Banana
colors
id | name
1 | black
2 | red
3 | yellow
And a colors_fruits join table:
colors_fruits
color_id | fruit_id
2 | 1 # red Apple
3 | 3 # yellow Banana
Since Rails' joins method generates INNER JOIN, all joins would only return fruits that have at least one color. The orange wouldn't be in the list, because it does not have a color (therefore no join is possible):
Fruit.joins(:colors)
#=> red Apple, yellow Banana (simplified)
But when we are interested in fruits that do not have a color, we need a LEFT JOIN. A LEFT JOIN includes all rows from the left table, even if there is no match in the right table. (When this was written there was no Rails helper for this kind of join; Rails 5 added left_outer_joins, aliased as left_joins.)
Fruit.joins('LEFT JOIN colors_fruits ON colors_fruits.fruit_id = fruits.id')
This generates a result like:
id | name | fruit_id | color_id
1 | Apple | 1 | 2
2 | Orange | NULL | NULL
3 | Banana | 3 | 3
Now we just need to keep only the rows that do not have a color_id:
Fruit.joins('LEFT JOIN colors_fruits ON colors_fruits.fruit_id = fruits.id').
where(colors_fruits: { color_id: nil })
You might want to read about the different types of SQL JOINS. And there is this well known diagram about joins.

Rails query through 2 different associated models

I'm having a little trouble trying to get a query to work the way I want it, I'm not getting all the results I'm hoping for.
I have 3 models Post, Comment and Tag. Both the posts and the comments can contain tags, and both have a has_and_belongs_to_many relationship with tags. I want to be able to get all the posts that either have a specified tag or have comments with that tag, I've been doing it in the following scope on posts like so:
scope :tag, -> (tag_id) { joins(:tags, :comment_tags).where("tags_posts.tag_id = :tag_id OR comments_tags.tag_id = :tag_id", tag_id: tag_id) }
But that doesn't return all the posts, just a subset of them. It seems like it's only the ones regarding the comments. This is the query it generates:
SELECT COUNT(*) FROM "posts"
INNER JOIN "tags_posts" ON "tags_posts"."post_id" = "posts"."id"
INNER JOIN "tags" ON "tags"."id" = "tags_posts"."tag_id"
INNER JOIN "comments" ON "comments"."post_id" = "posts"."id"
INNER JOIN "comments_tags" ON "comments_tags"."comment_id" = "comments"."id"
INNER JOIN "tags" "comment_tags_posts" ON "comment_tags_posts"."id" = "comments_tags"."tag_id"
WHERE (tags_posts.tag_id = 1 OR comments_tags.tag_id = 1)
These are the models:
class Post < ActiveRecord::Base
has_and_belongs_to_many :tags
has_many :comment_tags, through: :comments, source: :tags
end
class Tag < ActiveRecord::Base
has_and_belongs_to_many :posts
has_and_belongs_to_many :comments
end
class Comment < ActiveRecord::Base
belongs_to :post
has_and_belongs_to_many :tags
end
I'm not certain whether you've already figured this out, but in case you haven't, here is a possible solution:
In plain SQL, mainly for illustration purposes:
SELECT
DISTINCT posts.*
FROM
posts
INNER JOIN
tags_posts ON tags_posts.post_id = posts.id
LEFT JOIN
comments ON comments.post_id = posts.id
LEFT JOIN
comments_tags ON comments_tags.comment_id = comments.id
INNER JOIN
tags ON (tags.id = tags_posts.tag_id OR tags.id = comments_tags.tag_id)
WHERE tags.id = 1
The primary issue in your original version was that you were making an INNER JOIN with comments and comments_tags. As a result you were probably cutting out every Post which did not have any comments. So the solution is to LEFT JOIN everything related to the comments. And then, because we are left joining, we can INNER JOIN tags on either tags_posts or comments_tags.
Converting to Active Record is not very pretty, but necessary:
Post.joins("INNER JOIN tags_posts ON tags_posts.post_id = posts.id")
.joins("LEFT JOIN comments ON comments.post_id = posts.id")
.joins("LEFT JOIN comments_tags ON comments_tags.comment_id = comments.id")
.joins("INNER JOIN tags ON (tags_posts.tag_id = tags.id OR comments_tags.tag_id = tags.id)")
.where(tags: {id: 1})
.distinct
Note the necessity of DISTINCT in the SQL and distinct in the Active Record version, as you will get duplicates because of the LEFT JOIN.
Edit
In case there's some misunderstanding of the dataset or structure, this is an example of the data I used in my test to create the above query.
posts
+----+--------------------------+
| id | text |
+----+--------------------------+
| 1 | Post about programming 1 |
| 2 | Post about programming 2 |
| 3 | Post about programming 3 |
| 4 | Post about cooking 1 |
| 5 | Post about cooking 2 |
+----+--------------------------+
tags
+----+-------------+
| id | name |
+----+-------------+
| 1 | programming |
| 2 | cooking |
| 3 | woodworking |
+----+-------------+
tags_posts
+--------+---------+
| tag_id | post_id |
+--------+---------+
| 1 | 1 |
| 1 | 2 |
| 1 | 3 |
| 2 | 4 |
| 2 | 5 |
+--------+---------+
comments
+----+----------------------------------------------+---------+
| id | comment_text | post_id |
+----+----------------------------------------------+---------+
| 1 | comment - programming on programming post 1a | 1 |
| 2 | comment - programming on programming post 1b | 1 |
| 3 | comment - programming on programming post 2a | 2 |
| 4 | comment - cooking on programming post 3a | 3 |
| 5 | comment - programming on cooking post 4a | 4 |
| 6 | comment - cooking on cooking post 4b | 4 |
| 7 | comment - cooking on cooking post 5a | 5 |
+----+----------------------------------------------+---------+
comments_tags
+--------+------------+
| tag_id | comment_id |
+--------+------------+
| 1 | 1 |
| 1 | 2 |
| 1 | 3 |
| 1 | 5 |
| 2 | 4 |
| 2 | 6 |
| 2 | 7 |
+--------+------------+
If I want to search for "programming", the above query will yield this result set:
+----+--------------------------+
| id | text |
+----+--------------------------+
| 1 | Post about programming 1 |
| 2 | Post about programming 2 |
| 4 | Post about cooking 1 |
| 3 | Post about programming 3 |
+----+--------------------------+
since we have 3 posts specifically tagged with "programming", and one comment tagged as "programming" on a differently tagged post.
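The result set above can be cross-checked with a small plain-Ruby sketch over the same sample rows. It computes the union of "posts tagged directly" and "posts with a tagged comment", which is what the query is meant to return; the helper post_ids_for_tag is a hypothetical name and this is not the Active Record query itself.

```ruby
# Sample rows from the tables above.
tags_posts = [[1, 1], [1, 2], [1, 3], [2, 4], [2, 5]]                    # [tag_id, post_id]
comments = { 1 => 1, 2 => 1, 3 => 2, 4 => 3, 5 => 4, 6 => 4, 7 => 5 }    # comment_id => post_id
comments_tags = [[1, 1], [1, 2], [1, 3], [1, 5], [2, 4], [2, 6], [2, 7]] # [tag_id, comment_id]

# Hypothetical helper: posts tagged directly plus posts reached through a tagged comment.
def post_ids_for_tag(tag_id, tags_posts, comments, comments_tags)
  direct = tags_posts.select { |t, _| t == tag_id }.map { |_, post_id| post_id }
  via_comments = comments_tags.select { |t, _| t == tag_id }
                              .map { |_, comment_id| comments.fetch(comment_id) }
  # uniq plays the role of DISTINCT: a post can be reached by several paths.
  (direct + via_comments).uniq.sort
end

programming_post_ids = post_ids_for_tag(1, tags_posts, comments, comments_tags)
```

Post 4 is included only because of its comment tagged "programming", matching the table above.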
I am not sure I understand what a yum is. Is it a post?
From your SQL query it seems it will count only the yums that have both a specific tag AND a comment with this specific tag. What you want is to count yums that have a specific tag OR comments with this specific tag.
I would either do two queries (one to count the yums with the specific tag, one to count the yums with comments carrying that tag) and add them to get the total, or make one query with a UNION.
scope :yums_tagged, -> (tag_id) { joins(:tags).where("tags_yums.tag_id = :tag_id", tag_id: tag_id) }
scope :comments_tagged, -> (tag_id) { joins(:comment_tags).where("comments_tags.tag_id = :tag_id", tag_id: tag_id) }

How to find posts tagged with more than one tag in Rails and Postgresql

I have the models Post, Tag, and PostTag. A post has many tags through post tags. I want to find posts that are exclusively tagged with more than one tag.
has_many :post_tags
has_many :tags, through: :post_tags
For example, given this data set:
posts table
--------------------
id | title |
--------------------
1 | Carb overload |
2 | Heart burn |
3 | Nice n Light |
tags table
-------------
id | name |
-------------
1 | tomato |
2 | potato |
3 | basil |
4 | rice |
post_tags table
-----------------------
id | post_id | tag_id |
-----------------------
1 | 1 | 1 |
2 | 1 | 2 |
3 | 2 | 1 |
4 | 2 | 3 |
5 | 3 | 1 |
I want to find posts tagged with tomato AND basil. This should return only the "Heart burn" post (id 2). Likewise, if I query for posts tagged with tomato AND potato, it should return the "Carb overload" post (id 1).
I tried the following:
Post.joins(:tags).where(tags: { name: ['basil', 'tomato'] })
SQL
SELECT "posts".* FROM "posts"
INNER JOIN "post_tags" ON "post_tags"."post_id" = "posts"."id"
INNER JOIN "tags" ON "tags"."id" = "post_tags"."tag_id"
WHERE "tags"."name" IN ('basil', 'tomato')
This returns all three posts because all share the tag tomato. I also tried this:
Post.joins(:tags).where(tags: { name: 'basil' }).where(tags: { name: 'tomato' })
SQL
SELECT "posts".* FROM "posts"
INNER JOIN "post_tags" ON "post_tags"."post_id" = "posts"."id"
INNER JOIN "tags" ON "tags"."id" = "post_tags"."tag_id"
WHERE "tags"."name" = 'basil' AND "tags"."name" = 'tomato'
This returns no records.
How can I query for posts tagged with multiple tags?
You may want to review the possible ways to write this kind of query in this answer for applying conditions to multiple rows in a join. Here is one possible option for implementing your query in Rails using 1B, the sub-query approach...
Define a query in the PostTag model that will grab up the Post ID values for a given Tag name:
# PostTag.rb
def self.post_ids_for_tag(tag_name)
joins(:tag).where(tags: { name: tag_name }).select(:post_id)
end
Define a query in the Post model that will grab up the Post records for a given Tag name, using a sub-query structure:
# Post.rb
def self.for_tag(tag_name)
# passing the relation makes Rails build the IN (SELECT ...) sub-query itself
where(id: PostTag.post_ids_for_tag(tag_name))
end
Then you can use a query like this:
Post.for_tag("basil").for_tag("tomato")
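Chaining for_tag ANDs one IN (sub-query) condition per tag, which amounts to intersecting the sets of post ids found for each tag. A plain-Ruby sketch on the sample data from the question shows the idea; the helper names are hypothetical and no database is involved.

```ruby
# Sample rows from the question.
tags = { 1 => 'tomato', 2 => 'potato', 3 => 'basil', 4 => 'rice' }
post_tags = [[1, 1], [1, 2], [2, 1], [2, 3], [3, 1]] # [post_id, tag_id]

# Emulates PostTag.post_ids_for_tag: ids of posts carrying one tag name.
def post_ids_for_tag(tag_name, tags, post_tags)
  tag_id = tags.key(tag_name)
  post_tags.select { |_, t| t == tag_id }.map { |post_id, _| post_id }
end

# Emulates Post.for_tag(a).for_tag(b): a post must appear in every id set.
def post_ids_for_all_tags(tag_names, tags, post_tags)
  tag_names.map { |name| post_ids_for_tag(name, tags, post_tags) }.reduce(:&) || []
end

basil_and_tomato = post_ids_for_all_tags(%w[basil tomato], tags, post_tags)
tomato_and_potato = post_ids_for_all_tags(%w[tomato potato], tags, post_tags)
```

This differs from the single IN ('basil', 'tomato') attempt in the question, which takes the union of the two sets instead of their intersection.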
Use method .includes, like this:
Item.where(xpto: "test")
.includes({:orders =>[:suppliers, :agents]}, :manufacturers)
Documentation to .includes here.
