How to order by optional association with ActiveRecord - ruby-on-rails

Scenario:
Pupils have many grade progressions.
A grade progression belongs to a polymorphic progressable (either a subject or topic), and a score (decimal).
Pupils may not yet have a grade progression recorded for all progressables.
I need:
A list of Pupils ordered by grade score (highest first) for any given progressable.
Pupils without grades for the progressable should appear last.
If a Pupil has multiple grades for the progressable then the most recent should be used.
I've tried
Pupil.left_joins(:grade_progressions).where(grade_progressions: { progressable: progressable })
.order('grade_progressions.score DESC')
however, this does not work because it excludes any pupils without a grade for the progressable.
With raw SQL this should be possible by specifying the condition on the LEFT JOIN; however, I can't find a way to do this with ActiveRecord object syntax. I'd like to avoid specifying the join as a string if possible.
How can I author an ActiveRecord query to order by an optional polymorphic association in this way?

Rails 5 has a native .left_outer_joins method (aliased as .left_joins). Previously, you needed to specify the join as a string or use .includes, which often has unintended results.
An OUTER JOIN, unlike the SQL default INNER JOIN, includes rows that have no matches in the joined table.
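To make the requested ordering concrete, here is a minimal plain-Ruby sketch of the semantics the question describes (latest grade per pupil for one progressable, highest score first, ungraded pupils last). It is not an ActiveRecord query; the structs, the sample data and the recorded_at field are assumptions made only for illustration.

```ruby
# Hypothetical in-memory stand-ins for the models in the question.
Pupil = Struct.new(:id, :name)
Grade = Struct.new(:pupil_id, :progressable, :score, :recorded_at)

PUPILS = [Pupil.new(1, "Ann"), Pupil.new(2, "Bob"), Pupil.new(3, "Cy")].freeze
GRADES = [
  Grade.new(1, "maths", 6.5, 1),
  Grade.new(1, "maths", 8.0, 2), # more recent, so this one counts for Ann
  Grade.new(2, "maths", 9.0, 1),
  Grade.new(3, "art",   7.0, 1)  # Cy has no maths grade
].freeze

# Pupils ordered by their most recent score for the progressable,
# highest first; pupils without a grade for it come last.
def ranked(pupils, grades, progressable)
  latest = grades.select { |g| g.progressable == progressable }
                 .group_by(&:pupil_id)
                 .transform_values { |gs| gs.max_by(&:recorded_at).score }
  graded, ungraded = pupils.partition { |p| latest.key?(p.id) }
  graded.sort_by { |p| -latest[p.id] } + ungraded
end

puts ranked(PUPILS, GRADES, "maths").map(&:name).inspect
```

Filtering the grades before matching them to pupils plays the role of a condition in the LEFT JOIN's ON clause: every pupil stays in the result, which a WHERE on the joined table would not guarantee.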

Related

Rails (Activerecord) - Can't make query with joins and global sum without duplicates

I'm using a query with multiple user-set filters in order to show a list of invoices in a Rails app. One of the filters adds a where condition on a column of a separate table, which needs a double join in order to be accessible (estimates -through projects-).
scope :by_seller, lambda { |user_id|
  joins(project: :estimates).where(estimates: { user_id: user_id }) unless user_id.blank?
}
Additionally, I use Rails' aggregate method sum to compute the total amount of the invoices, where total_cache is a cached database column designed specifically to make this kind of sum cheap:
@invoices.sum(:total_cache)
My problem is, given the fact that I need a double join in order to access Estimates through Projects, and that each Invoice belongs to a Project, BUT a project can have many Estimates, the join operation results in duplicate records, so my Invoices table shows some of the invoices many times (as many as the number of estimates its project has). This results in an invoices table with duplicate records, and in an incorrect sum value, as it sums some of the invoice totals N times.
The filtering behaviour is just fine, as my intention is to filter by the user who made ANY of the estimates in the invoice project. However, the issue is that when I try to avoid the duplicates by adding a group('invoices.id') -the way I always solved such situations-, the final sum operation won't return the total sum of the invoices' total, but a grouped sum of each one of them (totally useless).
The only workaround I've found is to include the group clause and perform the sum in pure ruby code, treating the collection as an array, which IMHO is terribly inefficient, as there are tons of invoices:
@invoices.map(&:total_cache).inject(0, &:+)
Is there a way I can obtain a unique ActiveRecord collection of Invoices without duplicates in a way I can then call the aggregate sum method and obtain a total calculated by Postgres?
Of course, if there is something wrong in my base idea I'm completely open to hearing it! It's quite a complex query (I simplified it for the sake of the question here) and there can be many approaches I'm sure!
Thank you everyone!
I'm not sure how much "slower" or "faster" this is than doing the sum in ruby code. But if you want to still retain an ActiveRecord::Relation object, then you can do something like below. I reproduced your setup environment in a local Rails project.
user = User.first
Invoice.where(
id: Invoice.by_seller(user.id).select(:id)
).sum(:total_cache)
# (1.2 ms) SELECT SUM("invoices"."total_cache") FROM "invoices" WHERE "invoices"."id" IN (SELECT "invoices"."id" FROM "invoices" INNER JOIN "projects" ON "projects"."id" = "invoices"."project_id" INNER JOIN "estimates" ON "estimates"."project_id" = "projects"."id" WHERE "estimates"."user_id" = $1) [["user_id", 1]]
# => 5
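Why the IN subquery helps can be illustrated without a database. The following plain-Ruby sketch uses made-up structs and data (all names are assumptions) to show that summing over joined rows counts an invoice once per matching estimate, while selecting ids first and summing afterwards counts each invoice once.

```ruby
# Hypothetical in-memory stand-ins for the invoices and estimates tables.
Invoice  = Struct.new(:id, :project_id, :total_cache)
Estimate = Struct.new(:project_id, :user_id)

INVOICES  = [Invoice.new(1, 10, 100), Invoice.new(2, 20, 50)].freeze
# Project 10 has two estimates by user 7.
ESTIMATES = [Estimate.new(10, 7), Estimate.new(10, 7), Estimate.new(20, 8)].freeze

# Equivalent of the INNER JOIN: one row per (invoice, matching estimate) pair.
def joined_rows(user_id)
  INVOICES.flat_map do |inv|
    ESTIMATES.select { |e| e.project_id == inv.project_id && e.user_id == user_id }
             .map { inv }
  end
end

# Summing the joined rows counts invoice 1 twice.
def naive_sum(user_id)
  joined_rows(user_id).sum(&:total_cache)
end

# Equivalent of WHERE id IN (subquery): collect the ids, then sum each invoice once.
def subquery_sum(user_id)
  ids = joined_rows(user_id).map(&:id).uniq
  INVOICES.select { |inv| ids.include?(inv.id) }.sum(&:total_cache)
end

puts naive_sum(7)
puts subquery_sum(7)
```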

Order by count of a model's association with a particular attribute

Say I have the two models Users and Appointments where a user has_many appointments.
An appointment can be of two different types: typeA and typeB.
How can I write a query to order the users by the amount of typeB appointments they have?
I've looked into counter_cache but it seems to just count the number of the association (so in this case the number of appointments a user would have) and does not allow for the counting of a particular type of appointment.
With joins (INNER JOIN) you'll get only those users, who have at least one appointment associated:
User.joins(:appointments)
.where(appointments: { type: 'typeB' })
.group('users.id')
.order('count(appointments.id) DESC')
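If you use a LEFT OUTER JOIN instead (left_outer_joins in Rails 5+, or includes combined with a condition on the joined table), users without any appointments become part of the join as well. Note, however, that the where condition on appointments.type still drops every user with no typeB appointment; to list those users at the end, the type condition has to go into the join's ON clause rather than the WHERE clause.
Depending on the size and complexity of the database, it may sometimes be better to run two queries instead of joining tables. To skip the join, fetch the ordered ids with one select query and then retrieve the records with a second query.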
# user ids ordered by their number of typeB appointments, highest first
ordered_user_ids = Appointment.select('user_id, COUNT(*) AS n')
  .where(type: 'typeB').group(:user_id).order('n DESC').map(&:user_id)
# fetch the users, keeping that order (FIELD() is MySQL-specific)
User.where(id: ordered_user_ids).order(Arel.sql("FIELD(id, #{ordered_user_ids.join(',')})"))
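If the database has no FIELD() function, the second step can also be done in Ruby after loading the records. This is a minimal sketch with a hypothetical User struct and hard-coded ids standing in for the two query results; it is one possible approach, not the answer's own method.

```ruby
User = Struct.new(:id, :name)

# Hypothetical result of the first query: user ids, most typeB appointments first.
ORDERED_USER_IDS = [3, 1, 2].freeze
# Hypothetical result of the second query: rows come back in arbitrary order.
FETCHED_USERS = [User.new(1, "a"), User.new(2, "b"), User.new(3, "c")].freeze

# Re-sort loaded records so they follow the order of the given id list.
def in_id_order(records, ordered_ids)
  position = ordered_ids.each_with_index.to_h # id => rank
  records.sort_by { |record| position.fetch(record.id) }
end

puts in_id_order(FETCHED_USERS, ORDERED_USER_IDS).map(&:id).inspect
```

The hash lookup keeps the sort linear in the number of ids, instead of scanning the id array for every record.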

Combining joins with other joins or regular queries

I'd like to combine Solr join queries with regular queries. As an example, suppose I want to find all stores in Jyväskylä (Finland) selling guide books. If documents for stores have the fields city and productIds and documents for products have the fields productType and productId in my index, I'd expect something like this to work:
{!join from=productIds to=productId}productType:"guide book" city:Jyväskylä
However, join queries are a particular kind of LocalParams and those are effective for the entire query. Therefore, this query would select documents that have productType=guide book and city=Jyväskylä, which doesn't make any sense.
Worse, suppose I want to look for stores that carry guide books and are located in cities with a population over 1000 people. I'd need two joins for that (to select products and cities).
Of course, I can split this into a query (q) and a filter query (fq), but that limits me to two kinds of queries (so, either one regular query and one join query or two joins), and more importantly abuses the concept of queries and filter queries.
My question therefore: how can I combine regular and join queries and how can I have multiple join queries?
I think I've got it: there can be more than one filter query and each can have its own join clause. Solr's admin interface won't let you enter more than one query in the fq field, but it's possible through the "Raw Query Parameters" field.
I'm not sure how efficient this is, though.
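As a rough sketch of what such a request could look like, the snippet below builds a query string with one regular query and two join filter queries using Ruby's standard library. The city documents and their cityName and population fields are assumptions, since the question does not describe them. In Solr's join parser, from names the field on the documents matched by the inner query and to names the field on the documents being returned, which is why the product join is written with from=productId here.

```ruby
require 'uri'

# Build Solr request parameters with one main query and any number of
# filter queries; an array value is encoded as repeated fq parameters.
def solr_params(query, filter_queries)
  URI.encode_www_form('q' => query, 'fq' => filter_queries)
end

PARAMS = solr_params(
  'city:Jyväskylä',
  [
    # stores whose productIds contain the productId of a guide book
    '{!join from=productId to=productIds}productType:"guide book"',
    # hypothetical city documents with cityName and population fields
    '{!join from=cityName to=city}population:[1000 TO *]'
  ]
)

puts PARAMS
```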

Rails : get fields from join tables in one single query

I'm starting with Rails and want to create a request with dynamic selection and dynamic sorting, like the following examples (in SQL):
select * from books join authors on authors.id = books.author_id
where books.title like '%something%'
order by authors.name, books.title
or
select * from books join authors on authors.id = books.author_id
where books.title like '%something%'
order by books.title, authors.name
Author has_many books, book belongs to author.
I code this with two nested loops. In the first case, Author (sorted by name) is read first then Book (sorted by title), in the second case, Book first then author.
I can then print together books fields and authors fields.
The loops must reflect the hierarchy of sort fields.
But many other fields exist, and dynamic selection/ordering may be any field(s).
Is there a way to write a single 'each' loop where book fields and author fields are available together, as in the SQL examples above?
My problem is to get fields from several tables on one single line.
What would the 'find' request be?
Thanks for your help.
Your basic query would be something like:
@books = Book.where("books.title LIKE ?", "%#{something}%").joins(:author).order("authors.name ASC, books.title ASC")
As for controlling the sorting, you can break that into scopes that get conditionally added depending on your params.
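One way to drive the sorting from request parameters is to map the allowed sort keys to column names and build the ORDER BY fragment from that allowlist. The sketch below is plain Ruby; the key names, the SORTABLE constant and the fallback to the title column are assumptions for illustration.

```ruby
# Allowlist mapping request-level sort keys to SQL columns (hypothetical names).
SORTABLE = {
  'author' => 'authors.name',
  'title'  => 'books.title'
}.freeze

# Turn e.g. ["author", "title"] into an ORDER BY fragment.
# Unknown keys are ignored; an empty result falls back to the title column.
def order_clause(sort_keys)
  columns = Array(sort_keys).map { |key| SORTABLE[key] }.compact
  columns = [SORTABLE['title']] if columns.empty?
  columns.map { |column| "#{column} ASC" }.join(', ')
end

puts order_clause(%w[author title])
puts order_clause(%w[title author])
```

Because only values from the allowlist ever reach the SQL string, the result can then be passed to order on the joined relation without interpolating raw user input.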

Rails - Only pull in some HABTM associations on a case-by-case basis to avoid unnecessary joins

In Rails 4, I have a project in which I've set up three models with the following many-to-many relationships:
An Item
has_and_belongs_to_many categories
has_and_belongs_to_many tags
A Category
has_and_belongs_to_many items
A Tag
has_and_belongs_to_many items
And while it's easy to select an Item and automatically get all associated categories and tags, there are some situations in which I'd want to select items AND their associated categories, but NOT their tags. In these cases, I'd like to avoid doing extra database joins against the Tags table and ItemsTags join table. Can anyone help me with the correct find syntax to only join Items to categories? (Side note: I'm also planning on adding 10 additional many-to-many relationships between items and other models, but I'm just simplifying the scenario for this question. In the end, I'm trying to avoid doing a join with an excessive number of tables whenever I can.)
Thanks!
By default, Rails does not load associated records unless you ask for them.
Item.all only fetches records from the 'items' table.
Later in your code, calling item.categories is the point at which a query runs to fetch the categories of that particular item. If you never call item.tags, no query against the 'tags' table is executed and those records are never fetched. Bottom line: you can declare as many associations as you need; as long as you don't explicitly call them, they won't be loaded.
A side note about performance: Rails offers several ways to load associated tables:
Item.includes(:categories) loads the items and their categories with separate queries (a fixed number of queries, not one per item).
Item.eager_load(:categories) loads both in a single query using a LEFT OUTER JOIN (but it may be slower than the separate queries).
So you have full control over what is loaded from the database. The same methods can be used inside scopes as well.
