Group models with exact same has_many through relationships - ruby-on-rails

Forgive me if this has been asked before, I had a hard time thinking of good search queries.
Let's say I have 2 models, Posts and Tags. Posts have many tags through a pivot model, PostTags.
What I'd like to do is group posts that have the exact same combination of tags. I know how to group posts that have any of the same tags, but I've been having a harder time with this.
For example, say I have a post with an ID of 1, and the post has two Tags: one with an ID of 5, another with an ID of 7. I would have 2 PostTags, one with a post_id of 1 and a tag_id of 5, and another with a post_id of 1 and a tag_id of 7. I have another Post with an id of 3, and it also has 2 PostTags: one with a post_id of 3 and a tag_id of 5, and another with a post_id of 3 and a tag_id of 7. I'd like to group these together so that I can get a count of how many posts have both of these tags, and no others.
Thanks, and I hope I was able to explain this properly.

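I think you could probably do something like this in a nested query (PostgreSQL syntax):
SELECT tag_ids,
       string_agg(post_id::text, ',') AS post_ids
FROM (
  -- ORDER BY makes identical tag combinations produce identical strings
  SELECT post_id,
         string_agg(tag_id::text, ',' ORDER BY tag_id) AS tag_ids
  FROM post_tags
  GROUP BY post_id
) AS post_tag_sets
GROUP BY tag_ids;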
Explanation:
First, in the inner query, you concatenate the tag_ids grouped by post_id, so you get the combination of tags for each post.
Then, in the outer query, you concatenate the post_ids grouped by the combination of tag_ids, so you get all the post_ids for each tag combination.
This might not be the final query: you could further process the post ids, or modify the query to fetch whatever data you need.
Hope this helps!
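The two-step grouping described above can also be sketched in plain Ruby, which may help when checking what the SQL should return. This is only an illustration: the rows array is a hypothetical in-memory stand-in for the post_tags table, and posts_by_tag_combination is an invented helper name, not part of ActiveRecord.

```ruby
# Hypothetical helper: groups post ids by their exact (sorted) tag combination.
def posts_by_tag_combination(rows)
  # Inner step: collect the sorted tag ids of every post.
  tags_per_post = rows
    .group_by { |row| row[:post_id] }
    .transform_values { |post_rows| post_rows.map { |row| row[:tag_id] }.sort }

  # Outer step: group the post ids by their tag combination.
  tags_per_post
    .group_by { |_post_id, tag_ids| tag_ids }
    .transform_values { |pairs| pairs.map(&:first).sort }
end

# Stand-in rows for the post_tags table (assumed data).
rows = [
  { post_id: 1, tag_id: 5 },
  { post_id: 1, tag_id: 7 },
  { post_id: 3, tag_id: 7 },
  { post_id: 3, tag_id: 5 },
  { post_id: 4, tag_id: 5 }
]

grouped = posts_by_tag_combination(rows)
grouped.each do |tag_ids, post_ids|
  puts "tags #{tag_ids.inspect}: #{post_ids.size} post(s) #{post_ids.inspect}"
end
```

Posts 1 and 3 end up under the same key because their tag ids are sorted before being compared, which plays the same role as a consistent ordering inside the aggregate.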

I'm assuming the model relationships are set up properly.
# Post Model
class Post
has_many :post_tags
has_many :tags, through: :post_tags
end
# Fetch Tags to match with posts collection
tag_ids = []
# Query to fetch posts
Post.joins(:tags).where(tags: { id: tag_ids }).
group("posts.id").having("count(posts.id) >= ?", tag_ids.size)
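The first line ensures that only posts having tags included in the specified tag list are fetched.
The second line ensures that each post has all of the specified tags (both of them, in your example).
Note that this does not exclude posts that also have other tags. Because the where clause has already filtered out every other tag, changing >= to = makes no difference (assuming each post/tag pair appears only once). For an exact match, you would also need to compare against each post's total number of tags.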
Happy Hacking!
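The difference between "has at least these tags" and "has exactly these tags" can be shown with a small plain-Ruby sketch. It is not ActiveRecord code: rows is a hypothetical stand-in for the post_tags table, and the helper names are made up for illustration.

```ruby
require "set"

# Hypothetical helper: maps each post id to the Set of its tag ids.
def tag_sets_by_post(rows)
  rows.group_by { |row| row[:post_id] }
      .transform_values { |post_rows| post_rows.map { |row| row[:tag_id] }.to_set }
end

# Posts that have every wanted tag, possibly along with others.
def posts_with_at_least(rows, wanted_tag_ids)
  wanted = wanted_tag_ids.to_set
  tag_sets_by_post(rows).select { |_post_id, tags| tags.superset?(wanted) }.keys
end

# Posts whose tags are exactly the wanted ones, and no others.
def posts_with_exactly(rows, wanted_tag_ids)
  wanted = wanted_tag_ids.to_set
  tag_sets_by_post(rows).select { |_post_id, tags| tags == wanted }.keys
end

# Assumed data: post 2 has an extra tag, post 3 is missing one.
rows = [
  { post_id: 1, tag_id: 5 }, { post_id: 1, tag_id: 7 },
  { post_id: 2, tag_id: 5 }, { post_id: 2, tag_id: 7 }, { post_id: 2, tag_id: 9 },
  { post_id: 3, tag_id: 5 }
]

p posts_with_at_least(rows, [5, 7]) # => [1, 2]
p posts_with_exactly(rows, [5, 7])  # => [1]
```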

Related

Find records with at least one association but exclude records where any associations match condition

In the following setup a customer has many tags through taggings.
class Customer
has_many :taggings
has_many :tags, through: :taggings
end
class Tagging
belongs_to :tag
belongs_to :customer
end
The query I'm trying to perform in Rails with Postgres is to find all customers that have at least one tag but don't have either of the tags A or B.
Performance would need to be taken into consideration as there are tens of thousands of customers.
Please try the following query.
Customer.distinct.joins(:taggings).where.not(id: Customer.joins(:taggings).where(taggings: {tag_id: [tag_id_a,tag_id_b]}).distinct )
Explanation:
joins generates an INNER JOIN, which ensures you only get customers that have at least one tag associated with them.
where.not with the subquery excludes every customer that has tag A or tag B.
Hope this helps.
Let tag_ids be an array of the A and B ids:
tag_ids = [a.id, b.id]
Then you need to find the Customers, which have either A or B tag:
except_relation = Customer.
joins(:tags).
where(tags: { id: tag_ids }).
distinct
And exclude them from the ones, which have at least one tag:
Customer.
joins(:tags).
where.not(id: except_relation).
distinct
The INNER JOIN produced by .joins removes customers without any tag, but it also produces duplicate rows, so distinct is needed.
UPD: If you need more performance, you will probably have to change your DB schema to avoid the extra joins and indexes.
You can search for examples of jsonb tag implementations.
Get the ids of tags A and B:
ids_of_tag_a_and_b = [Tag.find_by_title('A').id, Tag.find_by_title('B').id]
Then select customers through their tags, skipping the rows for tags A and B:
Customer.joins(:tags).where.not(tags: { id: ids_of_tag_a_and_b }).distinct
Note that this filters individual joined rows rather than whole customers, so a customer who has tag A plus some other tag is still returned. To exclude such customers entirely, use a subquery as in the answers above.
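To make the intended result concrete, here is a plain-Ruby sketch of the rule "at least one tag, but neither A nor B". It only models the logic: customer_ids and taggings are hypothetical in-memory stand-ins for the tables, and customers_with_tags_but_not is an invented name.

```ruby
# Hypothetical helper: customers that have at least one tag and none of the excluded ones.
def customers_with_tags_but_not(customer_ids, taggings, excluded_tag_ids)
  taggings_by_customer = taggings.group_by { |tagging| tagging[:customer_id] }

  customer_ids.select do |customer_id|
    tag_ids = (taggings_by_customer[customer_id] || []).map { |tagging| tagging[:tag_id] }
    # Must have some tag, and must share no tag with the excluded list.
    !tag_ids.empty? && (tag_ids & excluded_tag_ids).empty?
  end
end

# Assumed data: tag A has id 10, tag B has id 20, tag C has id 30.
customer_ids = [1, 2, 3, 4]
taggings = [
  { customer_id: 1, tag_id: 30 },                                 # only C
  { customer_id: 2, tag_id: 10 }, { customer_id: 2, tag_id: 30 }, # A and C
  { customer_id: 3, tag_id: 20 }                                  # only B
  # customer 4 has no taggings at all
]

p customers_with_tags_but_not(customer_ids, taggings, [10, 20]) # => [1]
```

Customer 2 is the interesting case: it has a tag other than A or B, but must still be excluded because it also has A.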

Rails ActiveRecord : search in multiple values with multiple values

So let's say I have Post, Category and Categorizations models.
A post can have many categories through categorizations.
Now, how can I pull out all the posts that match at least one item of an array of categories?
Example:
Post 1 has categories 2,5,6
Post 2 has categories 1,5,9
Post 3 has categories 2,4,8
Find posts that match 3,5
I want the posts 1 and 2 to be returned.
Thanks!
Assuming that Categorization is a join model for Post and Category:
Post.joins(:categorizations).where(:categorizations => {:category_id => [3, 5]})
If it's not, and Categorization actually has_many :categories then:
Post.joins(:categories).where(:categories=> {:id => [3, 5]})
Note that the second method will work in the first case as well, however it will require 2 SQL joins and thus may not perform as well.
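As a quick check of what either query should return for the example in the question, the "match at least one category" rule can be written in plain Ruby. The hash below is a hypothetical in-memory stand-in for the categorizations table, and posts_matching_any is an invented helper name.

```ruby
# Hypothetical helper: ids of posts that share at least one category with the wanted list.
def posts_matching_any(category_ids_by_post, wanted_category_ids)
  category_ids_by_post
    .select { |_post_id, category_ids| !(category_ids & wanted_category_ids).empty? }
    .keys
end

# Data taken from the example in the question.
category_ids_by_post = {
  1 => [2, 5, 6],
  2 => [1, 5, 9],
  3 => [2, 4, 8]
}

p posts_matching_any(category_ids_by_post, [3, 5]) # => [1, 2]
```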

Selecting posts with multiple tags

I'm implementing a tagging system on a blog app. This app has posts, and posts have many tags through taggings, more or less like RailsCasts #382 http://railscasts.com/episodes/382-tagging
I will use checkboxes to select posts with multiple tags like this:
Post.joins(:tags).where(:tags => { :id => [tag_ids] } )
But what if I want to fetch only posts that have all the required tags, instead of posts that meet just one of the requirements?
For example:
Post1 has tags "foo, bar, baz"
Post2 has tags "bar, baz"
Post3 has tags "bar"
If I search for ["bar", "baz"] my method returns posts 1, 2 and 3. What if I want to return only posts 1 and 2?
In SQL, you would do something like
GROUP BY posts.id
HAVING count(tags.name) = 2
In rails it translates to
Post.joins(:tags).select('posts.id').where(:tags => { :id => [tag_ids] }
).having("count(tags.name) = ?", tag_ids.count).group('posts.id')
The above code assumes that tag_ids is an array of ids. Also, it only loads the ids for the returned set of ActiveRecord Post objects.
If you want to load more fields, add them in the select call, or otherwise remove the select call to load all Post fields. Just remember that for every column/field from Post that is retrieved in the resulting SQL query, you will need to add that field to the group call.
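The filter, group, and count steps of that query can be mirrored in plain Ruby to see why the count has to equal the number of requested tags. This is a sketch under assumptions: joined_rows is a hypothetical stand-in for the rows produced by joining posts to tags, it matches on tag names as in the question's example rather than on ids, and post_ids_with_all_tags is an invented name.

```ruby
# Hypothetical helper mirroring WHERE ... GROUP BY posts.id HAVING count(...) = n.
def post_ids_with_all_tags(joined_rows, wanted_names)
  joined_rows
    .select { |row| wanted_names.include?(row[:tag_name]) }          # WHERE: keep wanted tags only
    .group_by { |row| row[:post_id] }                                # GROUP BY posts.id
    .select { |_post_id, rows| rows.size == wanted_names.size }      # HAVING count = number of tags
    .keys
end

# Data from the example in the question, one row per post/tag pair.
joined_rows = [
  { post_id: 1, tag_name: "foo" }, { post_id: 1, tag_name: "bar" }, { post_id: 1, tag_name: "baz" },
  { post_id: 2, tag_name: "bar" }, { post_id: 2, tag_name: "baz" },
  { post_id: 3, tag_name: "bar" }
]

p post_ids_with_all_tags(joined_rows, ["bar", "baz"]) # => [1, 2]
```

The count comparison only works if each post/tag pair appears once and the requested list has no duplicates; otherwise the row count no longer equals the number of distinct matching tags.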

How can I add a virtual "count" column to a select query in rails?

I have a post model, and post has_many :comments, :as => :commentable (polymorphic). I am looking for a way that I can fetch all posts, and have a virtual attribute on the record which will display how many comments belong to that post.
I was thinking that I could just do:
Post.select("posts.*, count(comments.id) as post_comments").joins(:comments)
However, that returns only one record, with post_comments set to ALL comments in the entire database, not just those belonging to the record...
Actually, what you are missing is a group clause. You need to group by post; otherwise the count() aggregation collapses everything into one record, as you see.
Try this:
Post.select("posts.*, count(comments.id) as post_comments")
.joins(:comments)
.group('posts.id')
I think the problem is that your count(comments.id) just does one count for the entire joined table. You can get around this with a nested query:
Post.select("posts.*, (SELECT count(comments.id) FROM comments WHERE comments.commentable_id = posts.id AND comments.commentable_type = 'Post') AS post_comments")
You don't need the join in this case, since the comments table is not used in the outer query.
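One practical difference between the two approaches is worth illustrating: an inner join followed by a group drops posts that have no comments, while a per-post subquery reports them with a count of 0. The plain-Ruby sketch below models both behaviours on hypothetical in-memory data; the helper names and the simplified post_id key are assumptions made for the illustration.

```ruby
# Hypothetical helper: like INNER JOIN + GROUP BY, only posts with comments appear.
def counts_via_join(comments)
  comments.group_by { |comment| comment[:post_id] }.transform_values(&:size)
end

# Hypothetical helper: like a correlated subquery, every post appears, with 0 if needed.
def counts_via_subquery(post_ids, comments)
  counts = counts_via_join(comments)
  post_ids.map { |post_id| [post_id, counts.fetch(post_id, 0)] }.to_h
end

# Assumed data: post 2 has no comments.
post_ids = [1, 2, 3]
comments = [{ post_id: 1 }, { post_id: 1 }, { post_id: 3 }]

puts counts_via_join(comments).keys.inspect               # => [1, 3]
puts counts_via_subquery(post_ids, comments).keys.inspect # => [1, 2, 3]
puts counts_via_subquery(post_ids, comments)[2]           # => 0
```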
I would do that with instance variables. First I would find the post I am looking for (you can find it by whichever parameter you want; the example below searches by the id param).
@post = Post.find(params[:id])
Once you have the post, getting its comment count is pretty easy. Try something along the lines of...
@comments = @post.comments.size
...which would return an integer.

Rails has_many association count child rows

What is the "rails way" to efficiently grab all rows of a parent table along with a count of the number of children each row has?
I don't want to use counter_cache as I want to run these counts based on some time conditions.
The cliche blog example:
Table of articles. Each article has 0 or more comments.
I want to be able to pull how many comments each article has in the past hour, day, week.
However, ideally I don't want to iterate over the list and make separate sql calls for each article nor do I want to use :include to prefetch all of the data and process it on the app server.
I want to run one SQL statement and get one result set with all the info.
I know I can hard code out the full SQL, and maybe could use a .find and just set the :joins, :group, and :conditions parameters... BUT I am wondering if there is a "better" way... aka "The Rails Way"
This activerecord call should do what you want:
Article.find(:all, :select => 'articles.*, count(posts.id) as post_count',
:joins => 'left outer join posts on posts.article_id = articles.id',
:group => 'articles.id'
)
This will return a list of article objects, each of which has the method post_count on it that contains the number of posts on the article as a string.
The method executes sql similar to the following:
SELECT articles.*, count(posts.id) AS post_count
FROM `articles`
LEFT OUTER JOIN posts ON posts.article_id = articles.id
GROUP BY articles.id
If you're curious, this is a sample of the MySQL results you might see from running such a query:
+----+----------------+------------+
| id | text | post_count |
+----+----------------+------------+
| 1 | TEXT TEXT TEXT | 1 |
| 2 | TEXT TEXT TEXT | 3 |
| 3 | TEXT TEXT TEXT | 0 |
+----+----------------+------------+
Rails 3 Version
For Rails 3, you'd be looking at something like this:
Article.select("articles.*, count(comments.id) AS comments_count")
.joins("LEFT OUTER JOIN comments ON comments.article_id = articles.id")
.group("articles.id")
Thanks to Gdeglin for the Rails 2 version.
Rails 5 Version
Since Rails 5 there is left_outer_joins so you can simplify to:
Article.select("articles.*, count(comments.id) AS comments_count")
.left_outer_joins(:comments)
.group("articles.id")
And because you were asking about the Rails Way: There isn't a way to simplify/railsify this more with ActiveRecord.
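The question also asked for counts limited to the past hour, day, or week. In SQL, that usually means putting the time condition into the LEFT OUTER JOIN's ON clause rather than into WHERE, so that articles without recent comments are still returned with a count of 0. The plain-Ruby sketch below models that behaviour on hypothetical in-memory data; recent_comment_counts and the hash keys are invented for the illustration.

```ruby
# Hypothetical helper: comment count per article since a cutoff, keeping articles with 0.
def recent_comment_counts(article_ids, comments, since)
  recent = comments.select { |comment| comment[:created_at] >= since }
  counts = recent.group_by { |comment| comment[:article_id] }.transform_values(&:size)
  article_ids.map { |article_id| [article_id, counts.fetch(article_id, 0)] }.to_h
end

# Assumed data, expressed relative to the current time.
now = Time.now
comments = [
  { article_id: 1, created_at: now - 60 },         # one minute ago
  { article_id: 1, created_at: now - 2 * 3600 },   # two hours ago
  { article_id: 2, created_at: now - 3 * 86_400 }  # three days ago
]

past_hour = recent_comment_counts([1, 2, 3], comments, now - 3600)
past_week = recent_comment_counts([1, 2, 3], comments, now - 7 * 86_400)
puts "past hour: article 1 has #{past_hour[1]}, article 2 has #{past_hour[2]}"
puts "past week: article 1 has #{past_week[1]}, article 2 has #{past_week[2]}"
```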
From a SQL perspective, this looks trivial: just write a new query.
From a Rails perspective, the values you mention are computed values. So if you use find_by_sql, the model class would not know about the computed fields and hence would return the computed values as strings, even if you manage to translate the query into Rails speak. See the linked question below.
The general drift (from the responses I got to that question) was to have a separate class be responsible for the rollup / computing the desired values.
How to get rails to return SUM(columnName) attributes with right datatype instead of a string?
A simple way that I used to solve this problem was to add a method to my model:
class Article < ActiveRecord::Base
has_many :posts
def count_posts
Post.where(:article_id => self.id).count
end
end
Now, you can use for example:
Article.first.count_posts
I'm not sure whether there is a more efficient way, but it's a solution and, in my opinion, more elegant than the others.
I made this work this way:
def show
section = Section.find(params[:id])
students = Student.where( :section_id => section.id ).count
render json: {status: 'SUCCESS', section: students},status: :ok
end
Here I had 2 models, Section and Student, so I had to count the number of students whose section_id matches the id of a particular section.
