I need to query an array column in Rails, but I can't find any good articles about how to do it.
I found an article that shows how to do basic queries here.
Let's follow the example in the article, where a Book covers many subjects and subjects is stored as an array column:
add_column :books, :subjects, :text, array: true, default: []
Query books that contain a certain subject - e.g. History
Book.where("'history' = ANY (subjects)")
Query books that contain all listed subjects - e.g. Finance AND Business AND Accounting
Book.where("subjects @> ?", "{Finance,Business,Accounting}")
How can I do the following?
Query books that contain any of the listed subjects - e.g. Fiction OR Biography
Query books that don't contain a certain subject - e.g. NOT Physics
Query books that don't contain ANY of the subjects - e.g. NOT (Physics OR Chemistry OR Biology)
And is there any Rails way of doing the above queries?
For:
Query books that contain any of the listed subjects - e.g. Fiction OR Biography
Book.where("subjects && ?", "{Fiction,Biography}")
Query books that don't contain a certain subject - e.g. NOT Physics
Book.where.not("subjects @> ?", "{Physics}")
Query books that don't contain ANY of the subjects - e.g. NOT (Physics OR Chemistry OR Biology)
Book.where.not("subjects && ?", "{Physics,Chemistry,Biology}")
See the PostgreSQL documentation on array functions and operators for reference.
https://www.postgresql.org/docs/8.2/functions-array.html
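To make the operator semantics concrete, here is a minimal pure-Ruby sketch of what PostgreSQL's `@>` (contains) and `&&` (overlap) decide for a single row. The method names and the sample books are hypothetical, for illustration only; in a real app the filtering happens inside PostgreSQL, and NULL handling is not modelled here.

```ruby
# Pure-Ruby model of two PostgreSQL array operators (illustration only;
# the method names are hypothetical, not part of Rails or PostgreSQL).

# Models `subjects @> wanted`: true when every wanted subject is present.
def contains_all?(subjects, wanted)
  (wanted - subjects).empty?
end

# Models `subjects && wanted`: true when at least one wanted subject is present.
def overlaps?(subjects, wanted)
  !(subjects & wanted).empty?
end

# Hypothetical sample data: title => subjects array.
BOOKS = {
  "Ledger Basics" => %w[Finance Business Accounting],
  "Life of Curie" => %w[Biography Physics Chemistry],
  "Short Stories" => %w[Fiction]
}.freeze

# Returns the titles whose subjects satisfy the given block.
def titles_where
  BOOKS.select { |_title, subjects| yield(subjects) }.keys
end

all_three   = titles_where { |s| contains_all?(s, %w[Finance Business Accounting]) }
any_of      = titles_where { |s| overlaps?(s, %w[Fiction Biography]) }
not_physics = titles_where { |s| !contains_all?(s, %w[Physics]) }
none_of     = titles_where { |s| !overlaps?(s, %w[Physics Chemistry Biology]) }
```

The last two lines mirror the negated queries: "NOT Physics" negates a one-element containment check, while "NOT (Physics OR Chemistry OR Biology)" negates an overlap check.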
Usually, associations are a preferable way of approaching the problem:
Book has_many :subjects # or has_one/has_and_belongs_to_many
Subject belongs_to :book # or has_and_belongs_to_many
And then just create a table subjects, save all your subjects there and you're set up.
Your queries:
Query books that contain any of the listed subjects - e.g. Fiction OR Biography
Book.find_by_sql "SELECT * FROM books WHERE 'Fiction' = ANY (subjects) OR 'Biography' = ANY (subjects)"
Query books that don't contain a certain subject - e.g. NOT Physics
Book.where.not("subjects @> ?", "{Physics}")
Query books that don't contain ANY of the subjects - e.g. NOT (Physics OR Chemistry OR Biology)
Book.find_by_sql "SELECT * FROM books WHERE id NOT IN (SELECT id FROM books WHERE 'Physics' = ANY (subjects) OR 'Chemistry' = ANY (subjects) OR 'Biology' = ANY (subjects))"
Related
In the following book club example with associations:
class User
has_and_belongs_to_many :clubs
has_and_belongs_to_many :books
end
class Club
has_and_belongs_to_many :users
has_and_belongs_to_many :books
end
class Book
has_and_belongs_to_many :users
has_and_belongs_to_many :clubs
end
given a specific club record:
club = Club.find(params[:id])
How can I find all the users in the club who have all the books in a given array of books?
club.users.where_has_all_books(books)
In PostgreSQL it can be done with a single query. (Maybe in MySQL too, I'm just not sure.)
So, some basic assumptions first. 3 tables: clubs, users and books, every table has id as a primary key. 3 join tables, books_clubs, books_users, clubs_users, each table contains pairs of ids (for books_clubs it will be [book_id, club_id]), and those pairs are unique within that table. Quite reasonable conditions IMO.
Building a query:
First, let's get ids of books from given club:
SELECT book_id
FROM books_clubs
WHERE club_id = 1
ORDER BY book_id
Then get users from given club, and group them by user.id:
SELECT CU.user_id
FROM clubs_users CU
JOIN users U ON U.id = CU.user_id
JOIN books_users BU ON BU.user_id = CU.user_id
WHERE CU.club_id = 1
GROUP BY CU.user_id
Combine these two queries by adding a HAVING clause to the second query:
HAVING array_agg(BU.book_id ORDER BY BU.book_id) @> ARRAY(##1##)
where ##1## is the first query.
What's going on here: the array_agg function on the left side builds a sorted array of book_ids, which are the books of the user. ARRAY(##1##) on the right side returns the sorted array of the club's books. The @> operator checks whether the first array contains all elements of the second (i.e. whether the user has all books of the club).
Since the first query needs to run only once, it can be moved into a WITH clause.
Your complete query:
WITH club_book_ids AS (
SELECT book_id
FROM books_clubs
WHERE club_id = :club_id
ORDER BY book_id
)
SELECT CU.user_id
FROM clubs_users CU
JOIN users U ON U.id = CU.user_id
JOIN books_users BU ON BU.user_id = CU.user_id
WHERE CU.club_id = :club_id
GROUP BY CU.user_id
HAVING array_agg(BU.book_id ORDER BY BU.book_id) @> ARRAY(SELECT * FROM club_book_ids);
It can be verified in this sandbox: https://www.db-fiddle.com/f/cdPtRfT2uSGp4DSDywST92/5
Wrap it in find_by_sql and that's it.
Some notes:
Ordering by book_id is not necessary; the @> operator works with unordered arrays too. I just suspect that comparing ordered arrays is faster.
JOIN users U ON U.id = CU.user_id in the second query is only necessary for fetching user properties; if you only need user ids, it can be removed.
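The GROUP BY / HAVING step can also be sketched in plain Ruby: group the books_users pairs by user, then keep the club members whose set of books contains every club book. The method name and the sample rows are hypothetical; this models the logic only and is not a substitute for running the SQL.

```ruby
require "set"

# Returns the ids of club members who own every book of the club.
# club_book_ids: book ids of the club (rows of books_clubs for one club)
# club_user_ids: user ids of the club (rows of clubs_users for one club)
# books_users:   [book_id, user_id] pairs (rows of the books_users join table)
def users_with_all_club_books(club_book_ids, club_user_ids, books_users)
  wanted = club_book_ids.to_set

  # Equivalent of GROUP BY user_id with array_agg(book_id).
  owned = Hash.new { |hash, user_id| hash[user_id] = Set.new }
  books_users.each { |book_id, user_id| owned[user_id] << book_id }

  # Equivalent of HAVING array_agg(...) @> ARRAY(club books).
  # Users with no books_users rows drop out, as they would with the inner JOIN.
  club_user_ids.select do |user_id|
    owned.key?(user_id) && wanted.subset?(owned[user_id])
  end
end

# Hypothetical sample: user 10 owns books 1, 2, 3; user 11 owns only book 1;
# user 12 owns nothing.
result = users_with_all_club_books(
  [1, 2],
  [10, 11, 12],
  [[1, 10], [2, 10], [3, 10], [1, 11]]
)
```

Using sets makes it visible that element order plays no role in the containment check.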
It appears to work by grouping and counting.
club.users.joins(:books).where(books: { id: club.books.pluck(:id) }).group('users.id').having('count(*) = ?', club.books.count)
If anyone knows how to run the query without intermediate queries that would be great and I will accept the answer.
This looks like a situation where you'd make two queries: one to get all the ids you need, and another to select with a WHERE IN.
I have a schema where product has_many articles.
I am retrieving a Mongoid criteria based on scopes I created on the Article model:
criteria = Article.published.with_image
From this criteria, I would now like to find all articles whose product has a certain subject_id (or any of a subset of subject_ids).
I tried to write:
criteria = criteria.in('product.subject_ids': data[:subjects])
where data[:subjects] is an array of subject_ids, but this doesn't work.
Is there a clean way to do this with Mongoid without having to loop over all articles from the first criteria or pluck all product_ids from the first criteria?
How about one of these?
Product.where(:subject_ids.in => data[:subjects], :article_ids.in => criteria.pluck(:id))
criteria = Article.includes(:product).published.with_image
criteria.select { |art| data[:subjects].any? { |id| art.product.subject_ids.include?(id) } }.map(&:product)
Everywhere I looked there were two examples:
either
Book.where("'history' = ANY (subjects)")
to query a book with a specific subject in the subjects array
or
Book.where("subjects @> ?", "{history,drama}")
to query books whose subjects array has both history and drama.
How do I query for books that have either history or drama or both?
At the moment I am solving this using
query = subject_list.map do |subject|
"'#{subject}' = ANY(subjects)"
end.join(" OR ")
Book.where(query)
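Interpolating subject names straight into the SQL string breaks on names containing quotes and invites SQL injection. One alternative, sketched here under assumptions, is to build a PostgreSQL array literal in Ruby and pass it as a bind parameter to the `&&` (overlap) operator. The helper name `pg_text_array_literal` is hypothetical, and the sketch does not handle nil elements.

```ruby
# Builds a PostgreSQL text-array literal such as {"history","drama"}.
# Each element is double-quoted; embedded double quotes and backslashes
# are escaped with a backslash, as the array literal syntax requires.
# (Hypothetical helper for illustration; nil elements are not handled.)
def pg_text_array_literal(values)
  quoted = values.map do |value|
    escaped = value.to_s.gsub(/["\\]/) { |char| "\\#{char}" }
    "\"#{escaped}\""
  end
  "{#{quoted.join(',')}}"
end

literal = pg_text_array_literal(%w[history drama])

# Intended use in a Rails app (not executed in this sketch):
#   Book.where("subjects && ?", literal)
```

Because the literal is passed as a bind value, Rails quotes it as a single string and the subject names never become part of the SQL text.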
I have two models as follows:
class Bookshelf < ActiveRecord::Base
has_many :books
scope :in_stock, -> { where(in_stock: true) }
end
class Book < ActiveRecord::Base
belongs_to :bookshelf
end
I would like to find all the books in a collection of bookshelves based on a column in the bookshelf table efficiently.
At the moment I have to loop through each member as follows:
available_bookshelves = Bookshelf.in_stock
This returns an ActiveRecord relation.
To retrieve all the books in the relation, I am looping through it as follows:
available_bookshelves.each do |this_bookshelf|
this_bookshelf.books.each do |this_book|
process_isbn this_book
end
end
I would like all the books from the query so that I don't have to loop through each "bookshelf" from the collection returned individually. This works but feels verbose. I have other parts of the app where similar queries-loops are being performed.
EDIT:
Some clarification: Is there a way to get all books in all bookshelves that fit a certain criteria?
For example, if there are 5 brown bookshelves, can we retrieve all the books in those bookshelves?
something like (this is not valid code)
brown_books = books where bookshelf is brown
You can use the following query to get the books in the in-stock bookshelves:
available_books = Book.where(bookshelf_id: Bookshelf.in_stock.select(:id))
That will run a single query which will look like:
SELECT books.*
FROM books
WHERE books.bookshelf_id IN (SELECT id FROM bookshelves WHERE in_stock = true)
I have two models:
Questions:
has_and_belongs_to_many :topics
Topics:
has_and_belongs_to_many :questions
And the associated joining table: questions_topics
How can I write a query to get all topics and their occurrences in the joining table, and sorted by the count (showing by order of the most active topics first)?
So basically I want to be able to do this in a single query:
list = Topics.all
Order list by list.questions.count
Update:
Is there a better Rails way to write the following query (which does appear to give the required result)?
Topic.includes(:questions).group('questions_topics.topic_id').references(:questions).order("count(questions_topics.topic_id) DESC")
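What the query is meant to compute can be modelled in plain Ruby: count the join-table rows per topic and sort by that count, descending. The sample data and variable names below are hypothetical, and the sketch only illustrates the expected result, not the SQL itself.

```ruby
# Hypothetical rows of the questions_topics join table,
# as [question_id, topic_id] pairs.
questions_topics = [[1, 1], [2, 1], [3, 1], [1, 2], [2, 3], [3, 3]]

# Hypothetical topics: id => name. Topic 4 has no questions.
topics = { 1 => "ruby", 2 => "sql", 3 => "rails", 4 => "css" }

# Count occurrences of each topic_id; topics without rows count as 0.
counts = questions_topics.map { |_question_id, topic_id| topic_id }.tally
counts.default = 0

# Most active topics first; ties are broken by topic id for a stable order.
ranked = topics.keys
               .sort_by { |id| [-counts[id], id] }
               .map { |id| [topics[id], counts[id]] }
```

Whether topics with zero questions should appear at all is a design choice; this sketch keeps them at the end of the list.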