I have the following relationship in Rails: Campaign -has-many- Promise(s)
And I have the following Ruby code to return verbose list of campaigns with promises count:
def campaigns
results = Campaign
.where(user_id: current_user.id)
.left_outer_joins(:promises)
.select('campaigns.*', 'COUNT(DISTINCT promises.id) AS promises_count')
.group('campaigns.id')
.ransack(params[:q])
.result(distinct: true)
render json: {
results: results.page(params[:page]).per(params[:per_page]),
total_results: results.count(:id)
}
end
Everything works fine, unless I try to sort by promises_count. Ransack (or something else?) generates the following SQL for Postgres:
SELECT DISTINCT
campaigns.*,
COUNT(DISTINCT promises.id) AS promises_count
FROM "campaigns"
LEFT OUTER JOIN "promises"
ON "promises"."campaign_id" = "campaigns"."id"
LEFT OUTER JOIN "promises" "promises_campaigns"
ON "promises_campaigns"."campaign_id" = "campaigns"."id"
WHERE "campaigns"."user_id" = 1 GROUP BY campaigns.id;
It works, but there is no ORDER BY for some reason. When I sort by other properties, it works fine. I think Ransack is missing something, and handles promises_count different way because it's generated property and not real one.
It's possible to sort in Postgres, for example, manual query with added ORDER BY works:
SELECT DISTINCT
campaigns.*,
COUNT(DISTINCT promises.id) AS promises_count
FROM "campaigns"
LEFT OUTER JOIN "promises"
ON "promises"."campaign_id" = "campaigns"."id"
LEFT OUTER JOIN "promises" "promises_campaigns"
ON "promises_campaigns"."campaign_id" = "campaigns"."id"
WHERE "campaigns"."user_id" = 1
GROUP BY campaigns.id
ORDER BY promises_count desc;
How do I make Ransack work? I tried different combinations of queries without too much luck.
I have a working SQL query for Postgres v10.
SELECT *
FROM
(
SELECT DISTINCT ON (title) products.title, products.*
FROM "products"
) subquery
WHERE subquery.active = TRUE AND subquery.product_type_id = 1
ORDER BY created_at DESC
With the goal of the query to do a distinct based on the title column, then filter and order them. (I used the subquery in the first place, as it seemed there was no way to combine DISTINCT ON with ORDER BY without a subquery.
I am trying to express said query in ActiveRecord.
I have been doing
Product.select("*")
.from(Product.select("DISTINCT ON (product.title) product.title, meals.*"))
.where("subquery.active IS true")
.where("subquery.meal_type_id = ?", 1)
.order("created_at DESC")
and, that works! But, it's fairly messy with the string where clauses in there. Is there a better way to express this query with ActiveRecord/Arel, or am I just running into the limits of what ActiveRecord can express?
I think the resulting ActiveRecord call can be improved.
But I would start improving with original SQL query first.
Subquery
SELECT DISTINCT ON (title) products.title, products.* FROM products
(I think that instead of meals there should be products?) has duplicate products.title, which is not necessary there. Worse, it misses ORDER BY clause. As PostgreSQL documentation says:
Note that the “first row” of each set is unpredictable unless ORDER BY is used to ensure that the desired row appears first
I would rewrite sub-query as:
SELECT DISTINCT ON (title) * FROM products ORDER BY title ASC
which gives us a call:
Product.select('DISTINCT ON (title) *').order(title: :asc)
In main query where calls use Rails-generated alias for the subquery. I would not rely on Rails internal convention on aliasing subqueries, as it may change anytime. If you do not take this into account you could merge these conditions in one where call with hash-style argument syntax.
The final result:
Product.select('*')
.from(Product.select('DISTINCT ON (title) *').order(title: :asc))
.where(subquery: { active: true, meal_type_id: 1 })
.order('created_at DESC')
I have this query that uses the DBContext entities I created.
var referral = entities.StudentReferrals.Where(x => x.ReferralID == p && x.SchoolYear == year).FirstOrDefault();
When I remove x.SchoolYear == year the query works fine, but with it my query times out. The opposite of what I would expect to happen, I would expect the more you narrow a query down via Where clause constraints the less likely it would time out.
SchoolYear is a field in the query and the query itself is valid, when I perform the query within SQL Studio Manager it returns results in less than a second.
My confusion is, why would adding a constraint to the Where clause cause a query to time out??
x.SchoolYear and year are both strings.
The full query is...
SELECT [Extent1].[BirthDate] AS [BirthDate],
[Extent1].[LegalFirstName] AS [LegalFirstName],
[Extent1].[LegalLastName] AS [LegalLastName],
[Extent1].[PreferredFirstName] AS [PreferredFirstName],
[Extent1].[PreferredLastName] AS [PreferredLastName],
[Extent1].[StudentNumber] AS [StudentNumber],
[Extent1].[LegacyStudentNumber] AS [LegacyStudentNumber],
[Extent1].[TranscriptSchoolCode] AS [TranscriptSchoolCode],
[Extent1].[OEN] AS [OEN],
[Extent1].[StatusIndicator] AS [StatusIndicator],
[Extent1].[SchoolYear] AS [SchoolYear],
[Extent1].[ReferralID] AS [ReferralID],
[Extent1].[PersonID] AS [PersonID],
[Extent1].[Active] AS [Active],
[Extent1].[ServiceTypeID] AS [ServiceTypeID],
[Extent1].[IsSchoolActive] AS [IsSchoolActive],
[Extent1].[Principal] AS [Principal],
[Extent1].[SchoolName] AS [SchoolName],
[Extent1].[SchoolCode] AS [SchoolCode],
[Extent1].[NearNorthSchoolCode] AS [NearNorthSchoolCode],
[Extent1].[TranscriptSchoolPrincipal] AS [TranscriptSchoolPrincipal],
[Extent1].[TranscriptSchoolName] AS [TranscriptSchoolName],
[Extent1].[TranscriptNearNorthSchoolCode] AS [TranscriptNearNorthSchoolCode],
[Extent1].[GuardianFirstName] AS [GuardianFirstName],
[Extent1].[GuardianLastName] AS [GuardianLastName],
[Extent1].[AreaCode] AS [AreaCode],
[Extent1].[ContactNo] AS [ContactNo],
[Extent1].[ReferredByFirstName] AS [ReferredByFirstName],
[Extent1].[ReferredByLastName] AS [ReferredByLastName],
[Extent1].[ReferredDate] AS [ReferredDate],
[Extent1].[Reason] AS [Reason],
[Extent1].[gender] AS [gender],
[Extent1].[grade] AS [grade],
[Extent1].[HomeroomTeacher] AS [HomeroomTeacher],
[Extent1].[IntakeTeamMember] AS [IntakeTeamMember],
[Extent1].[IntakeMemberID] AS [IntakeMemberID]
FROM (SELECT [StudentReferrals].[BirthDate] AS [BirthDate],
[StudentReferrals].[LegalFirstName] AS [LegalFirstName],
[StudentReferrals].[LegalLastName] AS [LegalLastName],
[StudentReferrals].[PreferredFirstName] AS [PreferredFirstName],
[StudentReferrals].[PreferredLastName] AS [PreferredLastName],
[StudentReferrals].[gender] AS [gender],
[StudentReferrals].[StudentNumber] AS [StudentNumber],
[StudentReferrals].[LegacyStudentNumber] AS [LegacyStudentNumber],
[StudentReferrals].[TranscriptSchoolCode] AS [TranscriptSchoolCode],
[StudentReferrals].[OEN] AS [OEN],
[StudentReferrals].[StatusIndicator] AS [StatusIndicator],
[StudentReferrals].[SchoolYear] AS [SchoolYear],
[StudentReferrals].[grade] AS [grade],
[StudentReferrals].[ReferralID] AS [ReferralID],
[StudentReferrals].[PersonID] AS [PersonID],
[StudentReferrals].[Active] AS [Active],
[StudentReferrals].[ServiceTypeID] AS [ServiceTypeID],
[StudentReferrals].[IsSchoolActive] AS [IsSchoolActive],
[StudentReferrals].[Principal] AS [Principal],
[StudentReferrals].[SchoolName] AS [SchoolName],
[StudentReferrals].[SchoolCode] AS [SchoolCode],
[StudentReferrals].[NearNorthSchoolCode] AS [NearNorthSchoolCode],
[StudentReferrals].[TranscriptSchoolPrincipal] AS [TranscriptSchoolPrincipal],
[StudentReferrals].[TranscriptSchoolName] AS [TranscriptSchoolName],
[StudentReferrals].[TranscriptNearNorthSchoolCode] AS [TranscriptNearNorthSchoolCode],
[StudentReferrals].[GuardianFirstName] AS [GuardianFirstName],
[StudentReferrals].[GuardianLastName] AS [GuardianLastName],
[StudentReferrals].[AreaCode] AS [AreaCode],
[StudentReferrals].[ContactNo] AS [ContactNo],
[StudentReferrals].[ReferredByFirstName] AS [ReferredByFirstName],
[StudentReferrals].[ReferredByLastName] AS [ReferredByLastName],
[StudentReferrals].[ReferredDate] AS [ReferredDate],
[StudentReferrals].[IntakeTeamMember] AS [IntakeTeamMember],
[StudentReferrals].[IntakeMemberID] AS [IntakeMemberID],
[StudentReferrals].[Reason] AS [Reason],
[StudentReferrals].[HomeroomTeacher] AS [HomeroomTeacher]
FROM [dbo].[StudentReferrals] AS [StudentReferrals]) AS [Extent1]
WHERE ([Extent1].[ReferralID] = #p__linq__0) AND ([Extent1].[SchoolYear] = #p__linq__1)
Here is the StudentReferral definition...
SELECT TOP (100) PERCENT p.person_id AS PersonID, p.birth_date AS BirthDate, p.legal_first_name AS LegalFirstName, p.legal_surname AS LegalLastName, p.preferred_first_name AS PreferredFirstName,
p.preferred_surname AS PreferredLastName, p.gender, p.student_no AS StudentNumber, p.legacy_student_number AS LegacyStudentNumber, p.transcript_school_code AS TranscriptSchoolCode,
p.oen_number AS OEN, s.status_indicator_code AS StatusIndicator, s.school_year AS SchoolYear, s.grade, CAST(CASE WHEN PATINDEX('%[^A-Za-z]%', s.Grade) = 0 THEN 1 ELSE CASE WHEN CAST(s.Grade AS int)
< 9 THEN 1 ELSE 0 END END AS bit) AS IsElementary, t.SchoolName, t.SchoolCode, t.NearNorthSchoolCode, pg.person_id AS GuardianID, pg.legal_first_name AS GuardianFirstName,
pg.legal_surname AS GuardianLastName, pt.area_code AS AreaCode, pt.phone_no AS ContactNo, pt.email_account AS Email
FROM Trillium.dbo.persons AS p INNER JOIN
Trillium.dbo.student_registrations AS s ON s.person_id = p.person_id INNER JOIN
dbo.Schools AS t ON t.SchoolCode = s.school_code INNER JOIN
NNDSB_AD_Routines.dbo.Students_Trillium_Guardians AS g ON s.person_id = g.student_person_id INNER JOIN
Trillium.dbo.persons AS pg ON g.contact_person_id = pg.person_id INNER JOIN
Trillium.dbo.person_telecom AS pt ON pg.person_id = pt.person_id
WHERE (s.status_indicator_code IN ('Active', 'PreReg')) AND (pt.telecom_type_name = 'home')
GROUP BY p.person_id, p.birth_date, p.legal_first_name, p.legal_surname, p.preferred_first_name, p.preferred_surname, p.gender, p.student_no, p.legacy_student_number, p.transcript_school_code, p.oen_number,
s.status_indicator_code, s.school_year, s.grade, CAST(CASE WHEN PATINDEX('%[^A-Za-z]%', s.Grade) = 0 THEN 1 ELSE CASE WHEN CAST(s.Grade AS int) < 9 THEN 1 ELSE 0 END END AS bit), t.SchoolName,
t.SchoolCode, t.NearNorthSchoolCode, pg.person_id, pg.legal_first_name, pg.legal_surname, pt.area_code, pt.phone_no, pt.email_account, g.primary_contact_priority
ORDER BY g.primary_contact_priority
I can almost guarantee that the query that EF produces and the query you're executing in SSMS are not the exact same SELECT statement. You probably wrote something like what Stephen Byrne has in his answer, i.e.
SELECT * from StudentReferrals WHERE ReferallID=1 AND SchoolYear='2015'
Right off the bat this query doesn't have a TOP qualifier on it which your EF query probably will due to the presence of the FirstOrDefault call.
Your first step should be to use something like SQL Profiler and grab the actual query that EF is generating. It's possible that with that query the optimizer is choosing to do a table scan because of the type of query that is being generated.
This likely won't make any difference, but you could also try rewriting your query as:
var referral = entities.StudentReferrals.FirstOrDefault(x => x.ReferralID == p && x.SchoolYear == year);
As an example, when I write the following query against my database:
OrganizationalNodes.FirstOrDefault(on => on.Name == "Justice League")
EF generates the following SQL:
SELECT
[Limit1].[C1] AS [C1],
[Limit1].[Id] AS [Id],
-- columns omitted for brevity
FROM ( SELECT TOP (1)
[Extent1].[Id] AS [Id],
[Extent1].[Name] AS [Name],
-- columns omitted for brevity
'0X0X' AS [C1]
FROM [dbo].[OrganizationalItems] AS [Extent1]
INNER JOIN [dbo].[OrganizationalNodes] AS [Extent2] ON [Extent1].[Id] = [Extent2].[Id]
WHERE N'Justice League' = [Extent1].[Name]
) AS [Limit1]
Well, to answer the question
why would adding a constraint to the Where clause cause a query to time out
The most likely cause is that you have a lot of data in the table, but no index covers the SchoolYear column. Therefore when you include in in a WHERE clause, this causes a Table Scan (because every row has to be checked to see if it should be included or not in the result set)
If you use SQL Server Management Studio and write the query manually for e.g
SELECT * from StudentReferrals WHERE ReferallID=1 AND SchoolYear='2015'
And then include the actual Execution Plan (Query->Include Actual Estimation Plan) then you will get the execution breakdown which will show you clearly if there is a Table Scan involved. If there is, create an index to "cover" the columns involved and it should fix your issue.
Update
Another possible solution could be to run DBCC FREEPROCCACHE to clear out any cached execution plans just in case for some reason SQL Server has picked something insane for whatever query is generated by Entity Framework.
I faced today a problem that leads me in a gotcha of ActiveRecord use.
ActiveRecord returns for a specific query (with includes) certain amount of objects in an ActiveRelation object.
If you chain on the same ActiveRecord query sum(:attribute), it includes more objects in the calculated result. To describe what I mean here my example:
Environment:
ActiveRecord 4.2.3
Postgres 9.3.5
DB-structure:
Order has_many items
My query:
#orders = Order.includes(:items).where('orders.created_at >= ? AND orders.created_at <= ?', date_from, date_to)
The produced SQL-Query:
SELECT orders.* FROM order_containers WHERE orders.created_at >= '2015-08-11' AND orders.created_at <= '2015-08-17 23:59:59.999999';
The mentioned query returns e.g. 20 orders. As you can see, the includes doesn't play any rule in the query. And if I sum the price for the result, in ruby:
#orders.to_a.sum(&:price)
it returns 20.00
The same ActiveRecord query with SUM:
Order.includes(:items).where('orders.created_at >= ? AND orders.created_at <= ?', date_from, date_to).sum(:price)
it returns 45.00
It produces a different SQL statment:
SELECT SUM(orders.price_eur) FROM orders LEFT OUTER JOIN line_items ON items.order_container_id = orders.id WHERE orders.created_at >= '2015-08-11' AND orders.created_at <= '2015-08-17 23:59:59.999999'
The summed orders in this case are much more because the produced SQL-query includes the same order more than one time (because of Join). Every order has one or more items what leads to much more orders (duplicates) than the query without the Left Outer Join.
I hope this can help you avoid this gotcha.
nabinabou
includes is generally used for eager loading. Why don't you replace it with joins?
Running the following query on a has_many association. Recommendations has_many Approvals.
I am running, rails 3 and PostgreSQL:
Recommendation.joins(:approvals).where('approvals.count = ?
AND recommendations.user_id = ?', 1, current_user.id)
This is returning the following error: https://gist.github.com/1541569
The error message tells you:
aggregates not allowed in WHERE clause
count() is an aggregate function. Use the HAVING clause for that.
The query could look like this:
SELECT r.*
FROM recommendations r
JOIN approvals a ON a.recommendation_id = r.id
WHERE r.user_id = $current_user_id
GROUP BY r.id
HAVING count(a.recommendation_id) = 1
With PostgreSQL 9.1 or later it is enough to GROUP BY the primary key of a table (presuming recommendations.id is the PK). In Postgres versions before 9.1 you had to include all columns of the SELECT list that are not aggregated in the GROUP BY list. With recommendations.* in the SELECT list, that would be every single column of the table.
I quote the release notes of PostgreSQL 9.1:
Allow non-GROUP BY columns in the query target list when the primary
key is specified in the GROUP BY clause (Peter Eisentraut)
Simpler with a sub-select
Either way, this is simpler and faster, doing the same:
SELECT *
FROM recommendations r
WHERE user_id = $current_user_id
AND (SELECT count(*)
FROM approvals
WHERE recommendation_id = r.id) = 1;
Avoid multiplying rows with a JOIN a priori, then you don't have to aggregate them back.
Looks like you have a column named count and PostgreSQL is interpreting that column name as the count aggregate function. Your SQL ends up like this:
SELECT "recommendations".*
FROM "recommendations"
INNER JOIN "approvals" ON "approvals"."recommendation_id" = "recommendations"."id"
WHERE (approvals.count = 1 AND recommendations.user_id = 1)
The error message specifically points at the approvals.count:
LINE 1: ...ecommendation_id" = "recommendations"."id" WHERE (approvals....
^
I can't reproduce that error in my PostgreSQL (9.0) but maybe you're using a different version. Try double quoting that column name in your where:
Recommendation.joins(:approvals).where('approvals."count" = ? AND recommendations.user_id = ?', 1, current_user.id)
If that sorts things out then I'd recommend renaming your approvals.count column to something else so that you don't have to worry about it anymore.