Can't paste long cypher text on Neo4j Browser - neo4j

I am using the following Neo4j versions:
Neo4j Browser version: 4.0.8
Neo4j Server version: 3.5.18 (community)
For about half a year now, I have been unable to paste long Cypher text into Neo4j Browser.
I can work around it by pasting about 10 lines at a time in several batches, but it is driving me crazy.
Until about half a year ago, pasting long Cypher text worked fine.
I'm at a loss for a solution.
Here is a sample of such long Cypher text (it is syntactically correct):
MATCH(a0:C_Patent) WHERE a0._SID IN ['the_id']
CALL apoc.cypher.run('WITH {a0} AS a0 OPTIONAL MATCH(b2:C_Country) WHERE a0.Country = b2.Name
OPTIONAL MATCH(b2:C_Country) RETURN b2._SID AS _SID, LABELS(b2)[0] AS module, b2.Name AS Name, b2.CountryName AS CountryName', {a0:a0}) YIELD value AS b2
WITH DISTINCT a0, {_SID:b2._SID, module:b2.module, Name:b2.Name, CountryName:b2.CountryName} AS Country
CALL apoc.cypher.run('WITH {a0} AS a0 OPTIONAL MATCH(c2:C_Employee) WHERE a0.LastModifiedUser = c2.Name
OPTIONAL MATCH(c2:C_Employee) RETURN c2._SID AS _SID, LABELS(c2)[0] AS module, c2.Name AS Name, c2.Fullname AS Fullname', {a0:a0}) YIELD value AS c2
WITH DISTINCT a0, Country, {_SID:c2._SID, module:c2.module, Name:c2.Name, Fullname:c2.Fullname} AS LastModifiedUser
OPTIONAL MATCH(a0:C_Patent)-[d0:RDAVAILABLE]->(e0:C_RDDivision)
WITH DISTINCT a0, Country, LastModifiedUser, {_SID:e0._SID, module:LABELS(e0)[0], _RID:d0._RID, Name:e0.Name, Fullname:e0.Fullname, Name:e0.Name} AS RDDivision ORDER BY RDDivision.Name
OPTIONAL MATCH(a0:C_Patent)-[f0:ATTACHMENT]->(g0:C_Document) WHERE g0.Type = '1'
WITH DISTINCT a0, Country, LastModifiedUser, RDDivision, {_SID:g0._SID, module:LABELS(g0)[0], _RID:f0._RID, Name:g0.Name, Date:g0.Date, Time:g0.Time} AS PrincipalFigure ORDER BY PrincipalFigure.Date ASC, PrincipalFigure.Time ASC
OPTIONAL MATCH(a0:C_Patent)-[h0:APPLICANT]->(i0) WHERE (i0:C_Company OR i0:C_Party OR i0:C_Person OR i0:C_Practitioner)
WITH DISTINCT a0, Country, LastModifiedUser, RDDivision, PrincipalFigure, {_SID:i0._SID, module:LABELS(i0)[0], _RID:h0._RID, RightShare:h0.RightShare, CostShare:h0.CostShare, ApplicantReference:h0.ApplicantReference, Type:h0.Type, ApplicantMemo:h0.ApplicantMemo, Order:h0.Order, Name:i0.Name, Fullname:i0.Fullname, Order:h0.Order} AS Applicants ORDER BY Applicants.Order ASC
OPTIONAL MATCH(a0:C_Patent)-[j0:REPRESENTOR]->(k0) WHERE (k0:C_Company OR k0:C_Party)
WITH DISTINCT a0, Country, LastModifiedUser, RDDivision, PrincipalFigure, Applicants, {_SID:k0._SID, module:LABELS(k0)[0], _RID:j0._RID, ApplicantReference:j0.ApplicantReference, ApplicantMemo:j0.ApplicantMemo, Order:j0.Order, Name:k0.Name, Fullname:k0.Fullname, Order:j0.Order} AS Representor ORDER BY Representor.Order ASC
CALL apoc.path.spanningTree(a0, {relationshipFilter: 'ORIGINAL|PRIORITY|REGIONAL', labelFilter: '+C_Design|C_Gazette|C_Patent', minLevel: 0, maxLevel: 999}) YIELD path
WITH DISTINCT a0, Country, LastModifiedUser, RDDivision, PrincipalFigure, Applicants, Representor, Inventors, NODES(path) AS _nodes UNWIND _nodes AS _node
OPTIONAL MATCH(_node)-[r:ORIGINAL|PRIORITY|REGIONAL]-(dst)
WITH DISTINCT a0, Country, LastModifiedUser, RDDivision, PrincipalFigure, Applicants, Representor, Inventors, _node, r, dst, _node=STARTNODE(r) AS outgoing ORDER BY _node.Name
WITH DISTINCT a0, Country, LastModifiedUser, RDDivision, PrincipalFigure, Applicants, Representor, Inventors, _node, CASE WHEN outgoing THEN { type:TYPE(r), _SID:dst._SID } END AS parents
WITH DISTINCT a0, Country, LastModifiedUser, RDDivision, PrincipalFigure, Applicants, Representor, Inventors, _node {._SID, `#parents`:COLLECT(parents), .Name, ._SID, .Status, .Country, .Law, .AppType, .AppRoute, .AppNumber, .AppDate, .PubNumber, .RegNumber, .RegDate, module:LABELS(_node)[0]} AS LegalFamily
OPTIONAL MATCH(a0:C_Patent)-[t0:AGENT]->(u0) WHERE (u0:C_Employee OR u0:C_Party OR u0:C_Person OR u0:C_Practitioner)
OPTIONAL MATCH(u0)-[v0:COMPANY]->(w0) WHERE (w0:C_Company OR w0:C_Party OR w0:C_Practitioner)
WITH DISTINCT a0, Country, LastModifiedUser, RDDivision, PrincipalFigure, Applicants, Representor, Inventors, LegalFamily, {_SID:u0._SID, module:LABELS(u0)[0], _RID:t0._RID, Type:t0.Type, AgentMemo:t0.AgentMemo, Order:t0.Order, Name:u0.Name, Fullname:u0.Fullname, ComName:w0.Fullname, Order:t0.Order} AS Practitioners ORDER BY Practitioners.Order ASC
OPTIONAL MATCH(a0:C_Patent)-[x0:IPREP]->(y0:C_Employee)
WITH DISTINCT a0, Country, LastModifiedUser, RDDivision, PrincipalFigure, Applicants, Representor, Inventors, LegalFamily, Practitioners, {`#IPRepFullname`:y0.Fullname, Name:y0.Name} AS y0_Pack ORDER BY y0_Pack.Name
OPTIONAL MATCH(a0:C_Patent)-[z0:REPCONTACT]->(a1:C_Employee)
WITH DISTINCT a0, Country, LastModifiedUser, RDDivision, PrincipalFigure, Applicants, Representor, Inventors, LegalFamily, Practitioners, y0_Pack, {`#ContactFullname`:a1.Fullname, Name:a1.Name} AS a1_Pack ORDER BY a1_Pack.Name
OPTIONAL MATCH(a0:C_Patent)-[b1:INVENTOR]->(c1) WHERE (c1:C_Employee OR c1:C_Person) AND b1.Order = 1
WITH DISTINCT a0, Country, LastModifiedUser, RDDivision, PrincipalFigure, Applicants, Representor, Inventors, LegalFamily, Practitioners, y0_Pack, a1_Pack, {`#RepInventorFullname`:c1.Fullname, Name:c1.Name} AS c1_Pack ORDER BY c1_Pack.Name
OPTIONAL MATCH(a0:C_Patent)-[d1:RIGHTSHARE]->(e1:C_Office)
WITH DISTINCT a0, Country, LastModifiedUser, RDDivision, PrincipalFigure, Applicants, Representor, Inventors, LegalFamily, Practitioners, y0_Pack, a1_Pack, c1_Pack, {`#RightDivisionFullnames`:e1.Fullname, _SID:e1._SID, Order:d1.Order} AS e1_Pack ORDER BY e1_Pack.Order ASC
OPTIONAL MATCH(a0:C_Patent)-[f1:COSTSHARE]->(g1:C_CostShare)
WITH DISTINCT a0, Country, LastModifiedUser, RDDivision, PrincipalFigure, Applicants, Representor, Inventors, LegalFamily, Practitioners, y0_Pack, a1_Pack, c1_Pack, e1_Pack, {`#CostDivisionFullnames`:g1.Fullname, _SID:g1._SID, Order:f1.Order} AS g1_Pack ORDER BY g1_Pack.Order ASC
OPTIONAL MATCH(a0:C_Patent)<-[h1:APPLICATION]-(i1:C_PatFamily)
WITH DISTINCT a0, Country, LastModifiedUser, RDDivision, PrincipalFigure, Applicants, Representor, Inventors, LegalFamily, Practitioners, y0_Pack, a1_Pack, c1_Pack, e1_Pack, g1_Pack, {`#FamilyNo`:i1.Name, Name:i1.Name} AS i1_Pack ORDER BY i1_Pack.Name
OPTIONAL MATCH(a0:C_Patent)-[j1:ORIGINAL|PRIORITY|REGIONAL*0..]->(k1)-[l1:PRIORITY]->(m1) WHERE (k1:C_Gazette OR k1:C_Patent) AND (m1:C_Gazette OR m1:C_Patent) AND NOT (m1)-[:PRIORITY]->()
WITH DISTINCT a0, Country, LastModifiedUser, RDDivision, PrincipalFigure, Applicants, Representor, Inventors, LegalFamily, Practitioners, y0_Pack, a1_Pack, c1_Pack, e1_Pack, g1_Pack, i1_Pack
, MIN({AppDate:m1.AppDate, Name:m1.Name}) AS _min
OPTIONAL MATCH(a0:C_Patent)-[j1:ORIGINAL|PRIORITY|REGIONAL*0..]->(k1)-[l1:PRIORITY]->(m1)
WHERE NOT (m1)-[:PRIORITY]->() AND _min.AppDate = m1.AppDate AND _min.Name = m1.Name
WITH DISTINCT a0, Country, LastModifiedUser, RDDivision, PrincipalFigure, Applicants, Representor, Inventors, LegalFamily, Practitioners, y0_Pack, a1_Pack, c1_Pack, e1_Pack, g1_Pack, i1_Pack, {`#EarliestPriorityClaimAppDate`:m1.AppDate, Name:k1.Name} AS m1_Pack ORDER BY m1_Pack.Name
OPTIONAL MATCH(a0:C_Patent)-[n1:ORIGINAL*0..]->(o1)-[p1:ORIGINAL]->(q1) WHERE (o1:C_Design OR o1:C_Gazette OR o1:C_Patent) AND (q1:C_Design OR q1:C_Gazette OR q1:C_Patent) AND NOT (q1)-[:ORIGINAL]->()
WITH DISTINCT a0, Country, LastModifiedUser, RDDivision, PrincipalFigure, Applicants, Representor, Inventors, LegalFamily, Practitioners, y0_Pack, a1_Pack, c1_Pack, e1_Pack, g1_Pack, i1_Pack, m1_Pack
, MIN({AppDate:q1.AppDate, Name:q1.Name}) AS _min
OPTIONAL MATCH(a0:C_Patent)-[n1:ORIGINAL*0..]->(o1)-[p1:ORIGINAL]->(q1)
WHERE NOT (q1)-[:ORIGINAL]->() AND _min.AppDate = q1.AppDate AND _min.Name = q1.Name
WITH DISTINCT a0, Country, LastModifiedUser, RDDivision, PrincipalFigure, Applicants, Representor, Inventors, LegalFamily, Practitioners, y0_Pack, a1_Pack, c1_Pack, e1_Pack, g1_Pack, i1_Pack, m1_Pack, {`#EarliestParentAppDate`:q1.AppDate, Name:o1.Name} AS q1_Pack ORDER BY q1_Pack.Name
OPTIONAL MATCH(a0:C_Patent)-[r1:APPOFFICE]->(s1) WHERE (s1:C_Company OR s1:C_Party OR s1:C_Practitioner)
WITH DISTINCT a0, Country, LastModifiedUser, RDDivision, PrincipalFigure, Applicants, Representor, Inventors, LegalFamily, Practitioners, y0_Pack, a1_Pack, c1_Pack, e1_Pack, g1_Pack, i1_Pack, m1_Pack, q1_Pack, {`#SupplierFullname`:s1.Fullname, Name:s1.Name} AS s1_Pack ORDER BY s1_Pack.Name
OPTIONAL MATCH(a0:C_Patent)-[t1:ASSOCIATOR]->(u1:C_Practitioner)
WITH DISTINCT a0, Country, LastModifiedUser, RDDivision, PrincipalFigure, Applicants, Representor, Inventors, LegalFamily, Practitioners, y0_Pack, a1_Pack, c1_Pack, e1_Pack, g1_Pack, i1_Pack, m1_Pack, q1_Pack, s1_Pack, {`#AssociatorFullname`:u1.Fullname, Name:u1.Name} AS u1_Pack ORDER BY u1_Pack.Name
OPTIONAL MATCH(a0:C_Patent)-[v1:AVAILABLE]->(w1:C_Product)
WITH DISTINCT a0, Country, LastModifiedUser, RDDivision, PrincipalFigure, Applicants, Representor, Inventors, LegalFamily, Practitioners, y0_Pack, a1_Pack, c1_Pack, e1_Pack, g1_Pack, i1_Pack, m1_Pack, q1_Pack, s1_Pack, u1_Pack, {`#ProductFullnames`:w1.Fullname, _SID:w1._SID, Order:v1.Order} AS w1_Pack ORDER BY w1_Pack.Order ASC
OPTIONAL MATCH(a0:C_Patent)-[x1:PRIORITY|REGIONAL*0..]->(y1)-[z1:REGIONAL]->(a2) WHERE (y1:C_Gazette OR y1:C_Patent) AND (a2:C_Gazette OR a2:C_Patent) AND (a0.Country = 'WO' OR a2.Country = 'WO')
WITH DISTINCT a0, Country, LastModifiedUser, RDDivision, PrincipalFigure, Applicants, Representor, Inventors, LegalFamily, Practitioners, y0_Pack, a1_Pack, c1_Pack, e1_Pack, g1_Pack, i1_Pack, m1_Pack, q1_Pack, s1_Pack, u1_Pack, w1_Pack, {`#IntlAppDate`:CASE WHEN a0.Country = 'WO' THEN a0.AppDate ELSE a2.AppDate END, `#IntlPubDate`:CASE WHEN a0.Country = 'WO' THEN a0.PubDate ELSE a2.PubDate END, Name:y1.Name} AS a2_Pack ORDER BY a2_Pack.Name
RETURN labels(a0)[0] AS label, a0{_SID:a0._SID, module:labels(a0)[0], public:1
, _SID:a0._SID
, Name:a0.Name
, Status:a0.Status
, AppNumber:a0.AppNumber
, InventionTitle:a0.InventionTitle
, Contracted:a0.Contracted
, NEDOContracted:a0.NEDOContracted
, Finally:a0.Finally
, Nickname:a0.Nickname
, AppDate:a0.AppDate
, PubNumber:a0.PubNumber
, PubDate:a0.PubDate
, ExamRequestDeadline:a0.ExamRequestDeadline
, NextAnnuityDueDate:a0.NextAnnuityDueDate
, RegNumber:a0.RegNumber
, RegDate:a0.RegDate
, FinallyDate:a0.FinallyDate
, TermRemainingDays:duration.inDays(date(), date(toString(a0.PatentTermLapseDate))).days
, SupplierAssignedID:a0.SupplierAssignedID
, Evaluation:a0.Evaluation
, Rank:a0.Rank
, IPContactClass:a0.IPContactClass
, IssueMemo:a0.IssueMemo
, ExpenseMemo:a0.ExpenseMemo
, ApplicationTitle:a0.ApplicationTitle
, ClaimCount:a0.ClaimCount
, ClaimCountInApp:a0.ClaimCountInApp
, AppReference:a0.AppReference
, Abstract:a0.Abstract
, Claims:a0.Claims
, IPC:a0.IPC
, ApplicationNote:a0.ApplicationNote
, AppKeyword:a0.AppKeyword
, FreeKeyward:a0.FreeKeyward
, ApplicationMemo:a0.ApplicationMemo
, ResearchDevelopmentDivision:a0.ResearchDevelopmentDivision
, ResearchDevelopmentDivisionText:a0.ResearchDevelopmentDivisionText
, ResearchDevelopmentDivision_SDK:a0.ResearchDevelopmentDivision_SDK
, ResearchDevelopmentDivisionText_SDK:a0.ResearchDevelopmentDivisionText_SDK
, m_FamilyNo:a0.m_FamilyNo
, m_OrgName:a0.m_OrgName
, OrgCompany:a0.OrgCompany
, m_Knowhow:a0.m_Knowhow
, LastModifiedTime:a0.LastModifiedTime
, CreatedTime:a0.CreatedTime
, Law:a0.Law
, AppRoute:a0.AppRoute
, Country:COLLECT(Country)[0]
, LastModifiedUser:COLLECT(LastModifiedUser)[0]
, RDDivision:CASE WHEN RDDivision._SID IS NULL THEN NULL ELSE COLLECT(DISTINCT RDDivision)[0] END
, PrincipalFigure:CASE WHEN PrincipalFigure._SID IS NULL THEN NULL ELSE COLLECT(DISTINCT PrincipalFigure)[0] END
, Applicants:CASE WHEN Applicants._SID IS NULL THEN [] ELSE COLLECT(DISTINCT Applicants) END
, Representor:CASE WHEN Representor._SID IS NULL THEN [] ELSE COLLECT(DISTINCT Representor) END
, Inventors:CASE WHEN Inventors._SID IS NULL THEN [] ELSE COLLECT(DISTINCT Inventors) END
, LegalFamily:CASE WHEN LegalFamily._SID IS NULL THEN [] ELSE COLLECT(DISTINCT LegalFamily) END
, Practitioners:CASE WHEN Practitioners._SID IS NULL THEN [] ELSE COLLECT(DISTINCT Practitioners) END
, IPRepFullname:COLLECT(DISTINCT y0_Pack.`#IPRepFullname`)
, ContactFullname:COLLECT(DISTINCT a1_Pack.`#ContactFullname`)
, RepInventorFullname:COLLECT(DISTINCT c1_Pack.`#RepInventorFullname`)
, RightDivisionFullnames:REDUCE(a=[], x IN COLLECT(DISTINCT {_SID:e1_Pack._SID, `#RightDivisionFullnames`:e1_Pack.`#RightDivisionFullnames`}) | a + COALESCE(x.`#RightDivisionFullnames`, 'null'))
, CostDivisionFullnames:REDUCE(a=[], x IN COLLECT(DISTINCT {_SID:g1_Pack._SID, `#CostDivisionFullnames`:g1_Pack.`#CostDivisionFullnames`}) | a + COALESCE(x.`#CostDivisionFullnames`, 'null'))
, FamilyNo:COLLECT(DISTINCT i1_Pack.`#FamilyNo`)
, EarliestPriorityClaimAppDate:COLLECT(DISTINCT m1_Pack.`#EarliestPriorityClaimAppDate`)
, EarliestParentAppDate:COLLECT(DISTINCT q1_Pack.`#EarliestParentAppDate`)
, SupplierFullname:COLLECT(DISTINCT s1_Pack.`#SupplierFullname`)
, AssociatorFullname:COLLECT(DISTINCT u1_Pack.`#AssociatorFullname`)
, ProductFullnames:REDUCE(a=[], x IN COLLECT(DISTINCT {_SID:w1_Pack._SID, `#ProductFullnames`:w1_Pack.`#ProductFullnames`}) | a + COALESCE(x.`#ProductFullnames`, 'null'))
, IntlAppDate:COLLECT(DISTINCT a2_Pack.`#IntlAppDate`)
, IntlPubDate:COLLECT(DISTINCT a2_Pack.`#IntlPubDate`)
} AS nodes ORDER BY nodes.Name DESC;

Maybe try the centrally hosted Browser and see whether the latest version sorts things out for you. FWIW, I couldn't reproduce the issue on the latest version of Browser (5.3.0):
http://browser.graphapp.io

Related

Cypher multiple OPTIONAL MATCH - Pattern Comprehension - COUNT DISTINCT

I have read a lot of comments about OPTIONAL MATCH and pattern comprehension, but I can't find a solution for my case.
I have Account nodes in my Neo4j database and I'd like to count the nodes that belong to each account.
The following code works with one or two optional matches, but with this many optional matches it produces a cross product and times out.
// Account
MATCH (a:Account{billingCountry: "DE", isDeleted: false})
WHERE a.id IS NOT NULL
// User
MATCH (a)<-[:CREATED]-(u:User)
// Contact
OPTIONAL MATCH (a) <-[:CONTACT_OF]- (c:Contact{isDeleted: false})
// Opportunity
OPTIONAL MATCH (a) <-[:OPPORTUNITY_OF]- (o:Opportunity{isDeleted: false, s4sMarked_For_Deletion__C: false})
// Open Opportunity
OPTIONAL MATCH (a)<-[:OPPORTUNITY_OF]-(open:Opportunity{isClosed: false, isDeleted: false})
// Attribute
OPTIONAL MATCH (a) <-[:ATTRIBUTE_OF]- (aa:Attribute_Assignment{isDeleted: false})
// Sales Planning
OPTIONAL MATCH (a) <-[:SALESPLAN_OF]- (s:Sales_Planning)
// Task
OPTIONAL MATCH (a) <-[:TASK_OF]- (t:Task{isDeleted: false})
// Event
OPTIONAL MATCH (a) <-[:EVENT_OF]- (e:Event{isDeleted: false})
// Contract
OPTIONAL MATCH (a) <-[:CONTRACT_OF]- (ct:Contract{isDeleted: false})
RETURN
a.id,
u.name AS User_Name,
u.department AS User_Department,
COUNT(DISTINCT c.id) AS Contact_Count,
COUNT(DISTINCT o.id) AS Opportunity_Count,
COUNT(DISTINCT open.id) AS OpenOpp_Count,
COUNT(DISTINCT aa.id) AS Attribute_Count,
COUNT(DISTINCT s.timeYear) AS Sales_Plan_Count,
COUNT(DISTINCT t.id) AS Task_Count,
COUNT(DISTINCT e.id) AS Event_Count,
COUNT(DISTINCT ct.id) AS Contract_Count
I can rewrite the query with pattern comprehensions, but then I just get back the non-distinct ids in arrays.
Is there a way to count the distinct values inside the arrays, or another way to count the values in a pattern comprehension?
MATCH (a:Account{billingCountry: "DE", isDeleted: false})
WHERE a.id IS NOT NULL
RETURN a.id,
[
[(a)<-[:CONTACT_OF]- (c:Contact{isDeleted: false}) | c.id],
[(a)<-[:OPPORTUNITY_OF]- (o:Opportunity{isDeleted: false, s4sMarked_For_Deletion__C: false}) | o.id],
[(a)<-[:OPPORTUNITY_OF]-(open:Opportunity{isClosed: false, isDeleted: false}) | open.id],
[(a) <-[:ATTRIBUTE_OF]- (aa:Attribute_Assignment{isDeleted: false}) | aa.id],
[(a) <-[:SALESPLAN_OF]- (s:Sales_Planning) | s.timeYear],
[(a) <-[:TASK_OF]- (t:Task{isDeleted: false}) | t.id],
[(a) <-[:EVENT_OF]- (e:Event{isDeleted: false}) | e.id],
[(a) <-[:CONTRACT_OF]- (ct:Contract{isDeleted: false}) | ct.id]
]
If I made a formal mistake in my first Stack Overflow post, I would appreciate feedback :)
The problem lies in the RETURN statement: because all the counts are calculated at the very end, Neo4j first has to build the Cartesian product of all the optional matches. If you calculate each count right after its OPTIONAL MATCH, the intermediate result stays small and the query becomes much cheaper. Like this:
MATCH (a:Account{billingCountry: "DE", isDeleted: false})
WHERE a.id IS NOT NULL
MATCH (a)<-[:CREATED]-(u:User)
OPTIONAL MATCH (a) <-[:CONTACT_OF]- (c:Contact{isDeleted: false})
WITH a, u, COUNT(DISTINCT c.id) AS Contact_Count
OPTIONAL MATCH (a) <-[:OPPORTUNITY_OF]- (o:Opportunity{isDeleted: false, s4sMarked_For_Deletion__C: false})
WITH a, u, Contact_Count, COUNT(DISTINCT o.id) AS Opportunity_Count
OPTIONAL MATCH (a)<-[:OPPORTUNITY_OF]-(open:Opportunity{isClosed: false, isDeleted: false})
WITH a, u, Contact_Count, Opportunity_Count, COUNT(DISTINCT open.id) AS OpenOpp_Count
OPTIONAL MATCH (a) <-[:ATTRIBUTE_OF]- (aa:Attribute_Assignment{isDeleted: false})
WITH a, u, Contact_Count, Opportunity_Count, OpenOpp_Count, COUNT(DISTINCT aa.id) AS Attribute_Count
OPTIONAL MATCH (a) <-[:SALESPLAN_OF]- (s:Sales_Planning)
WITH a, u, Contact_Count, Opportunity_Count, OpenOpp_Count, Attribute_Count, COUNT(DISTINCT s.timeYear) AS Sales_Plan_Count
OPTIONAL MATCH (a) <-[:TASK_OF]- (t:Task{isDeleted: false})
WITH a, u, Contact_Count, Opportunity_Count, OpenOpp_Count, Attribute_Count, Sales_Plan_Count, COUNT(DISTINCT t.id) AS Task_Count
OPTIONAL MATCH (a) <-[:EVENT_OF]- (e:Event{isDeleted: false})
WITH a, u, Contact_Count, Opportunity_Count, OpenOpp_Count, Attribute_Count, Sales_Plan_Count, Task_Count, COUNT(DISTINCT e.id) AS Event_Count
OPTIONAL MATCH (a) <-[:CONTRACT_OF]- (ct:Contract{isDeleted: false})
RETURN
a.id, u.name AS User_Name, u.department AS User_Department, Contact_Count,
Opportunity_Count, OpenOpp_Count, Attribute_Count, Sales_Plan_Count,
Task_Count, Event_Count, COUNT(DISTINCT ct.id) AS Contract_Count
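To see why aggregating after each OPTIONAL MATCH helps, here is a small Python sketch (not Cypher) that models only the row counts. The per-account match counts are made-up numbers, and `rows_if_aggregated_last` and `rows_if_aggregated_per_step` are hypothetical helper names. A real query planner may behave differently, so treat this as an illustration of the arithmetic rather than of Neo4j internals.

```python
# Hypothetical number of matching neighbours of ONE account for each
# OPTIONAL MATCH (contacts, opportunities, open opportunities, ...).
matches_per_step = [20, 15, 5, 30, 3, 40, 25, 4]


def rows_if_aggregated_last(counts):
    """Rows alive after each OPTIONAL MATCH when nothing is aggregated.

    Every existing row is combined with every new match, so the row
    count is multiplied at each step. OPTIONAL MATCH keeps a row (with
    null) even when nothing matches, hence max(c, 1).
    """
    rows, history = 1, []
    for c in counts:
        rows *= max(c, 1)
        history.append(rows)
    return history


def rows_if_aggregated_per_step(counts):
    """Peak rows per step when WITH ... COUNT(DISTINCT ...) follows each match.

    The aggregation collapses the rows back to one per account/user pair
    before the next OPTIONAL MATCH expands them again.
    """
    return [max(c, 1) for c in counts]


late = rows_if_aggregated_last(matches_per_step)
early = rows_if_aggregated_per_step(matches_per_step)
print("aggregate at the end: peak rows =", max(late))
print("aggregate per step:   peak rows =", max(early))
```

With these invented counts, the first strategy multiplies the rows at every step, while the second never holds more rows than the largest single match.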

Neo4j Cypher query and index of element in the collection

I'm trying to find the index of a Decision, given {decisionGroupId}, {decisionId} and {criteriaIds}.
This is my current Cypher query:
MATCH (dg:DecisionGroup)-[:CONTAINS]->(childD:Decision)
WHERE dg.id = {decisionGroupId}
OPTIONAL MATCH (childD)-[vg:HAS_VOTE_ON]->(c:Criterion)
WHERE c.id IN {criteriaIds}
WITH childD, vg.avgVotesWeight as weight, vg.totalVotes as totalVotes
ORDER BY weight DESC, totalVotes DESC
WITH COLLECT(childD) AS ps
RETURN REDUCE(ix = -1, i IN RANGE(0, SIZE(ps)-1)
| CASE ps[i].id WHEN {decisionId} THEN i ELSE ix END) AS ix
I have only 3 Decision nodes in the database, but this query returns the following indices:
2
3
4
while I was expecting something like this (starting from 0, and -1 if not found):
0
1
2
What is wrong with my query and how to fix it?
UPDATED
This query is working fine with COLLECT(DISTINCT childD) AS ps:
MATCH (dg:DecisionGroup)-[:CONTAINS]->(childD:Decision)
WHERE dg.id = {decisionGroupId}
OPTIONAL MATCH (childD)-[vg:HAS_VOTE_ON]->(c:Criterion)
WHERE c.id IN {criteriaIds}
WITH childD, vg.avgVotesWeight as weight, vg.totalVotes as totalVotes
ORDER BY weight DESC, totalVotes DESC
WITH COLLECT(DISTINCT childD) AS ps
RETURN REDUCE(ix = -1, i IN RANGE(0, SIZE(ps)-1)
| CASE ps[i].id WHEN {decisionId} THEN i ELSE ix END) AS ix
Please help me to refactor this query and get rid of heavy REDUCE.
Let's try to get the reduce part right with a simpler query:
WITH ['a', 'b', 'c'] AS ps
RETURN
reduce(ix = -1, i IN RANGE(0, SIZE(ps)-1) |
CASE ps[i] WHEN 'b' THEN i ELSE ix END) AS ix
As I stated in the comments, it is usually better to avoid reduce if possible. So, to express the same using a list comprehension, use WHERE for filtering.
WITH ['a', 'b', 'c'] AS ps
RETURN [i IN RANGE(0, SIZE(ps)-1) WHERE ps[i] = 'b'][0]
The list comprehension results in a list with a single element, and we will use the [0] indexer to select that element.
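For readers more at home in Python, the two approaches above can be mirrored as follows. This is only an analogy: `index_via_reduce` and `index_via_comprehension` are hypothetical names, and the two languages differ at the edges. Cypher returns null when an empty list is indexed with [0], whereas Python raises IndexError, so the sketch returns None explicitly. Note also that the REDUCE form keeps the last matching index and falls back to -1, while the comprehension form yields the first match.

```python
from functools import reduce


def index_via_reduce(ps, target):
    """Mirror of REDUCE(ix = -1, i IN RANGE(0, SIZE(ps)-1) | CASE ... END)."""
    return reduce(
        lambda ix, i: i if ps[i] == target else ix,
        range(len(ps)),
        -1,
    )


def index_via_comprehension(ps, target):
    """Mirror of [i IN RANGE(0, SIZE(ps)-1) WHERE ps[i] = target][0]."""
    hits = [i for i in range(len(ps)) if ps[i] == target]
    # Cypher would give null for an empty list; Python needs an explicit check.
    return hits[0] if hits else None


ps = ["a", "b", "c"]
print(index_via_reduce(ps, "b"), index_via_comprehension(ps, "b"))
print(index_via_reduce(ps, "z"), index_via_comprehension(ps, "z"))
```

If the calling code relies on -1 for "not found", the comprehension variant needs an explicit fallback for the null case (COALESCE in Cypher).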
After adapting this to your query, we'll get something like this:
MATCH (dg:DecisionGroup)-[:CONTAINS]->(childD:Decision)
WHERE dg.id = {decisionGroupId}
OPTIONAL MATCH (childD)-[vg:HAS_VOTE_ON]->(c:Criterion)
WHERE c.id IN {criteriaIds}
WITH childD, vg.avgVotesWeight as weight, vg.totalVotes as totalVotes
ORDER BY weight DESC, totalVotes DESC
WITH COLLECT(DISTINCT childD) AS ps
RETURN [i IN RANGE(0, SIZE(ps)-1) WHERE ps[i].id = {decisionId}][0]
If you have APOC installed, you can also use the function:
return apoc.coll.indexOf([1,2,3],2)
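One plausible explanation for the original 2, 3, 4 result (an assumption, since the question does not show the data): the OPTIONAL MATCH can emit several rows per Decision, one per matching vote, so COLLECT(childD) contains duplicates and the REDUCE keeps the last position of each. The Python sketch below uses invented rows to reproduce that pattern. It also shows how de-duplication that keeps first occurrences, which is how COLLECT(DISTINCT childD) behaves in practice, restores 0, 1, 2.

```python
# Invented (decision_id, weight) rows, already sorted by weight DESC.
# Decisions 1 and 2 each have two matching votes, decision 3 has one.
rows = [(1, 5.0), (2, 4.0), (1, 3.0), (2, 2.0), (3, 1.0)]


def last_index(ps, target):
    """Same result as the REDUCE expression: last matching index, or -1."""
    ix = -1
    for i, value in enumerate(ps):
        if value == target:
            ix = i
    return ix


collected = [decision_id for decision_id, _ in rows]   # like COLLECT(childD)
distinct = list(dict.fromkeys(collected))              # keeps first occurrences

with_duplicates = [last_index(collected, d) for d in (1, 2, 3)]
without_duplicates = [last_index(distinct, d) for d in (1, 2, 3)]
print(with_duplicates, without_duplicates)
```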

Neo4j Cypher count query performance optimization

I have the following Neo4j Cypher count() query:
MATCH (dg:DecisionGroup)-[:CONTAINS]->(childD:Decision)
WHERE dg.id = 1
MATCH (childD)-[relationshipValueRel4:HAS_VALUE_ON]-(filterCharacteristic4:Characteristic)
WHERE filterCharacteristic4.id = 4
WITH relationshipValueRel4, childD, dg
WHERE (ANY (id IN [5, 25, 106] WHERE id IN relationshipValueRel4.optionIds ))
WITH childD, dg
RETURN count(childD) as total
Right now this query is quite slow:
Cypher version: CYPHER 3.3, planner: COST, runtime: INTERPRETED. 3380782 total db hits in 2991 ms.
This is the PROFILE output:
How can I optimize this query's performance?
P.S.
The corresponding main query works pretty fast:
PROFILE MATCH (dg:DecisionGroup)-[:CONTAINS]->(childD:Decision)
WHERE dg.id = 1
MATCH (childD)-[relationshipValueRel4:HAS_VALUE_ON]-(filterCharacteristic4:Characteristic)
WHERE filterCharacteristic4.id = 4
WITH relationshipValueRel4, childD, dg
WHERE (ANY (id IN [5, 25, 106]
WHERE id IN relationshipValueRel4.optionIds ))
WITH childD, dg WITH childD , dg
SKIP 0 LIMIT 10
WITH *
MATCH (childD)-[ru:CREATED_BY]->(u:User)
OPTIONAL MATCH (childD)-[rup:UPDATED_BY]->(up:User)
RETURN ru, u, rup, up, childD AS decision,
[ (dg)<-[:DEFINED_BY]-(entity)<-[:COMMENTED_ON]-(comg:CommentGroup)-[:COMMENTED_FOR]->(childD) | {entityId: toInt(entity.id), types: labels(entity), totalComments: toInt(comg.totalComments)} ] AS commentGroups,
[ (dg)<-[:DEFINED_BY]-(c1)<-[vg1:HAS_VOTE_ON]-(childD) | {criterionId: toInt(c1.id), weight: vg1.avgVotesWeight, totalVotes: toInt(vg1.totalVotes)} ] AS weightedCriteria, [ (dg)<-[:DEFINED_BY]-(ch1:Characteristic)<-[v1:HAS_VALUE_ON]-(childD) WHERE NOT ((ch1)<-[:DEPENDS_ON]-()) | {characteristicId: toInt(ch1.id), optionIds: v1.optionIds, valueIds: v1.valueIds, value: v1.value, available: v1.available, totalHistoryValues: v1.totalHistoryValues, totalFlags: v1.totalFlags, description: v1.description, valueType: ch1.valueType, visualMode: ch1.visualMode} ] AS valuedCharacteristics
Cypher version: CYPHER 3.3, planner: COST, runtime: INTERPRETED. 11725 total db hits in 8 ms
Please help to optimize the count() query performance also.

Neo4j Cypher query structure and performance optimization

I have created a dynamic Cypher query builder. For complex cases this builder produces quite big queries, for example:
MATCH (parentD)-[:CONTAINS]->(childD:Decision)-[ru:CREATED_BY]->(u:User)
WHERE id(parentD) = {decisionId}
MATCH (childD)<-[:SET_FOR]-(filterValue415431:Value)-[:SET_ON]->(filterCharacteristic415431:Characteristic)
WHERE id(filterCharacteristic415431) = 415431
WITH filterValue415431, childD, ru, u
WHERE ({filterValue4154311} IN filterValue415431.value )
OR ({filterValue4154312} IN filterValue415431.value )
OR ({filterValue4154313} IN filterValue415431.value )
OR ({filterValue4154314} IN filterValue415431.value )
OR ({filterValue4154315} IN filterValue415431.value )
MATCH (childD)<-[:SET_FOR]-(filterValue415441:Value)-[:SET_ON]->(filterCharacteristic415441:Characteristic)
WHERE id(filterCharacteristic415441) = 415441
WITH filterValue415441, childD, ru, u
WHERE ({filterValue4154416} IN filterValue415441.value )
OR ({filterValue4154417} IN filterValue415441.value )
OR ({filterValue4154418} IN filterValue415441.value )
OR ({filterValue4154419} IN filterValue415441.value )
OR ({filterValue41544110} IN filterValue415441.value )
OR ({filterValue41544111} IN filterValue415441.value )
OR ({filterValue41544112} IN filterValue415441.value )
OR ({filterValue41544113} IN filterValue415441.value )
OR ({filterValue41544114} IN filterValue415441.value )
OR ({filterValue41544115} IN filterValue415441.value )
OR ({filterValue41544116} IN filterValue415441.value )
OR ({filterValue41544117} IN filterValue415441.value )
MATCH (childD)<-[:SET_FOR]-(filterValue416273:Value)-[:SET_ON]->(filterCharacteristic416273:Characteristic)
WHERE id(filterCharacteristic416273) = 416273
WITH filterValue416273, childD, ru, u
WHERE (filterValue416273.value >= {filterValue41627318})
AND (filterValue416273.value <= {filterValue41627319})
MATCH (childD)<-[:SET_FOR]-(filterValue417410:Value)-[:SET_ON]->(filterCharacteristic417410:Characteristic)
WHERE id(filterCharacteristic417410) = 417410
WITH filterValue417410, childD, ru, u
MATCH (childD)<-[:SET_FOR]-(filterValue416423:Value)-[:SET_ON]->(filterCharacteristic416423:Characteristic)
WHERE id(filterCharacteristic416423) = 416423
WITH filterValue416423, childD, ru, u
WHERE ({filterValue41642320} IN filterValue416423.value )
OR ({filterValue41642321} IN filterValue416423.value )
OR ({filterValue41642322} IN filterValue416423.value )
OR ({filterValue41642323} IN filterValue416423.value )
MATCH (childD)<-[:SET_FOR]-(filterValue415673:Value)-[:SET_ON]->(filterCharacteristic415673:Characteristic)
WHERE id(filterCharacteristic415673) = 415673
WITH filterValue415673, childD, ru, u
WHERE ({filterValue41567324} IN filterValue415673.value )
OR ({filterValue41567325} IN filterValue415673.value )
OR ({filterValue41567326} IN filterValue415673.value )
OR ({filterValue41567327} IN filterValue415673.value )
OR ({filterValue41567328} IN filterValue415673.value )
OR ({filterValue41567329} IN filterValue415673.value )
OR ({filterValue41567330} IN filterValue415673.value )
OR ({filterValue41567331} IN filterValue415673.value )
OR ({filterValue41567332} IN filterValue415673.value )
OR ({filterValue41567333} IN filterValue415673.value )
OR ({filterValue41567334} IN filterValue415673.value )
OR ({filterValue41567335} IN filterValue415673.value )
OR ({filterValue41567336} IN filterValue415673.value )
OR ({filterValue41567337} IN filterValue415673.value )
OR ({filterValue41567338} IN filterValue415673.value )
OR ({filterValue41567339} IN filterValue415673.value )
OPTIONAL MATCH (childD)<-[:VOTED_FOR]-(vg:VoteGroup)-[:VOTED_ON]->(c:Criterion)
WHERE id(c) IN {criteriaIds}
WITH childD, ru, u, vg.avgVotesWeight as weight, vg.totalVotes as totalVotes
WITH ru, u, childD , toFloat(sum(weight)) as weight, toInt(sum(totalVotes)) as totalVotes
ORDER BY weight DESC
SKIP 0 LIMIT 10
RETURN ru, u, childD AS decision, weight, totalVotes,
[ (parentD)<-[:DEFINED_BY]-(entity)<-[:COMMENTED_ON]-(comg:CommentGroup)-[:COMMENTED_FOR]->(childD) |
{entityId: id(entity), types: labels(entity), totalComments: toInt(comg.totalComments)} ] AS commentGroups,
[ (parentD)<-[:DEFINED_BY]-(c1:Criterion)<-[:VOTED_ON]-(vg1:VoteGroup)-[:VOTED_FOR]->(childD) |
{criterionId: id(c1), weight: vg1.avgVotesWeight, totalVotes: toInt(vg1.totalVotes)} ] AS weightedCriteria,
[ (parentD)<-[:DEFINED_BY]-(ch1:Characteristic)<-[:SET_ON]-(v1:Value)-[:SET_FOR]->(childD) |
{characteristicId: id(ch1), value: v1.value, valueType: ch1.valueType, visualMode: ch1.visualMode} ] AS valuedCharacteristics
Right now I'm not very happy with the performance. For example, a call to this query takes ~500 ms.
Could you please take a look and tell me whether there is a chance to improve this query?
UPDATED
This is pretty much the same query, but with different parameters:
MATCH (parentD)-[:CONTAINS]->(childD:Decision)-[ru:CREATED_BY]->(u:User)
WHERE id(parentD) = 415406
MATCH (childD)<-[:SET_FOR]-(filterValue416423:Value)-[:SET_ON]->(filterCharacteristic416423:Characteristic)
WHERE id(filterCharacteristic416423) = 416423
WITH filterValue416423, childD, ru, u
WHERE ('Adobe RGB' IN filterValue416423.value ) OR ('ECI RGB' IN filterValue416423.value )
MATCH (childD)<-[:SET_FOR]-(filterValue416273:Value)-[:SET_ON]->(filterCharacteristic416273:Characteristic)
WHERE id(filterCharacteristic416273) = 416273 WITH filterValue416273, childD, ru, u
WHERE (filterValue416273.value >= 4) AND (filterValue416273.value <= 53)
MATCH (childD)<-[:SET_FOR]-(filterValue415431:Value)-[:SET_ON]->(filterCharacteristic415431:Characteristic)
WHERE id(filterCharacteristic415431) = 415431 WITH filterValue415431, childD, ru, u
WHERE ('Compact' IN filterValue415431.value )
OR ('Compact SLR' IN filterValue415431.value )
OR ('Large SLR' IN filterValue415431.value )
OR ('Rangefinder-style mirrorless' IN filterValue415431.value )
OR ('SLR-like (bridge)' IN filterValue415431.value )
MATCH (childD)<-[:SET_FOR]-(filterValue415441:Value)-[:SET_ON]->(filterCharacteristic415441:Characteristic)
WHERE id(filterCharacteristic415441) = 415441 WITH filterValue415441, childD, ru, u
WHERE ('Brass' IN filterValue415441.value )
OR ('Carbon fiber' IN filterValue415441.value )
OPTIONAL MATCH (childD)<-[:VOTED_FOR]-(vg:VoteGroup)-[:VOTED_ON]->(c:Criterion)
WHERE id(c) IN [415414, 415415, 415412, 415426, 415411]
WITH childD, ru, u, vg.avgVotesWeight as weight, vg.totalVotes as totalVotes
WITH ru, u, childD , toFloat(sum(weight)) as weight, toInt(sum(totalVotes)) as totalVotes
ORDER BY weight DESC
SKIP 0 LIMIT 10
RETURN ru, u, childD AS decision, weight, totalVotes,
[ (parentD)<-[:DEFINED_BY]-(entity)<-[:COMMENTED_ON]-(comg:CommentGroup)-[:COMMENTED_FOR]->(childD) |
{entityId: id(entity), types: labels(entity), totalComments: toInt(comg.totalComments)} ] AS commentGroups,
[ (parentD)<-[:DEFINED_BY]-(c1:Criterion)<-[:VOTED_ON]-(vg1:VoteGroup)-[:VOTED_FOR]->(childD) |
{criterionId: id(c1), weight: vg1.avgVotesWeight, totalVotes: toInt(vg1.totalVotes)} ] AS weightedCriteria,
[ (parentD)<-[:DEFINED_BY]-(ch1:Characteristic)<-[:SET_ON]-(v1:Value)-[:SET_FOR]->(childD) |
{characteristicId: id(ch1), value: v1.value, valueType: ch1.valueType, visualMode: ch1.visualMode} ] AS valuedCharacteristics
Cypher version: CYPHER 3.1, planner: COST, runtime: INTERPRETED. 646192 total db hits in 390 ms.
UPDATED
This is the output of :schema
Indexes
ON :Characteristic(lowerName) ONLINE
ON :CharacteristicGroup(lowerName) ONLINE
ON :Criterion(lowerName) ONLINE
ON :CriterionGroup(lowerName) ONLINE
ON :Decision(lowerName) ONLINE
ON :FlagType(name) ONLINE (for uniqueness constraint)
ON :HistoryValue(originalValue) ONLINE
ON :Permission(code) ONLINE (for uniqueness constraint)
ON :Role(name) ONLINE (for uniqueness constraint)
ON :User(email) ONLINE (for uniqueness constraint)
ON :User(username) ONLINE (for uniqueness constraint)
ON :Value(value) ONLINE
Constraints
ON ( flagtype:FlagType ) ASSERT flagtype.name IS UNIQUE
ON ( permission:Permission ) ASSERT permission.code IS UNIQUE
ON ( role:Role ) ASSERT role.name IS UNIQUE
ON ( user:User ) ASSERT user.email IS UNIQUE
ON ( user:User ) ASSERT user.username IS UNIQUE
UPDATED
I have optimized the query as suggested in the answer below:
MATCH (parentD)-[:CONTAINS]->(childD:Decision)
WHERE id(parentD) = 415406
MATCH (childD)<-[:SET_FOR]-(filterValue416423)-[:SET_ON]->(filterCharacteristic416423)
WHERE id(filterCharacteristic416423) = 416423
WITH DISTINCT filterValue416423, childD
WHERE ('Adobe RGB' IN filterValue416423.value ) OR ('ECI RGB' IN filterValue416423.value )
MATCH (childD)<-[:SET_FOR]-(filterValue416273)-[:SET_ON]->(filterCharacteristic416273)
WHERE id(filterCharacteristic416273) = 416273
WITH DISTINCT childD, filterValue416273
WHERE (filterValue416273.value >= 4) AND (filterValue416273.value <= 53)
MATCH (childD)<-[:SET_FOR]-(filterValue415431)-[:SET_ON]->(filterCharacteristic415431)
WHERE id(filterCharacteristic415431) = 415431
WITH DISTINCT childD, filterValue415431
WHERE ('Compact' IN filterValue415431.value )
OR ('Compact SLR' IN filterValue415431.value )
OR ('Large SLR' IN filterValue415431.value )
OR ('Rangefinder-style mirrorless' IN filterValue415431.value )
OR ('SLR-like (bridge)' IN filterValue415431.value )
MATCH (childD)<-[:SET_FOR]-(filterValue415441)-[:SET_ON]->(filterCharacteristic415441)
WHERE id(filterCharacteristic415441) = 415441
WITH DISTINCT childD, filterValue415441
WHERE ('Brass' IN filterValue415441.value )
OR ('Carbon fiber' IN filterValue415441.value )
OPTIONAL MATCH (childD)<-[:VOTED_FOR]-(vg:VoteGroup)-[:VOTED_ON]->(c:Criterion)
WHERE id(c) IN [415414, 415415, 415412, 415426, 415411]
WITH DISTINCT * MATCH (childD)-[ru:CREATED_BY]->(u:User)
WITH DISTINCT childD, ru, u, vg.avgVotesWeight as weight, vg.totalVotes as totalVotes
WITH DISTINCT ru, u, childD , toFloat(sum(weight)) as weight, toInt(sum(totalVotes)) as totalVotes
ORDER BY weight DESC
SKIP 0 LIMIT 10
RETURN ru, u, childD AS decision, weight, totalVotes,
[ (parentD)<-[:DEFINED_BY]-(entity)<-[:COMMENTED_ON]-(comg:CommentGroup)-[:COMMENTED_FOR]->(childD) |
{entityId: id(entity), types: labels(entity), totalComments: toInt(comg.totalComments)} ] AS commentGroups,
[ (parentD)<-[:DEFINED_BY]-(c1:Criterion)<-[:VOTED_ON]-(vg1:VoteGroup)-[:VOTED_FOR]->(childD) |
{criterionId: id(c1), weight: vg1.avgVotesWeight, totalVotes: toInt(vg1.totalVotes)} ] AS weightedCriteria,
[ (parentD)<-[:DEFINED_BY]-(ch1:Characteristic)<-[:SET_ON]-(v1)-[:SET_FOR]->(childD) |
{characteristicId: id(ch1), value: v1.value, valueType: ch1.valueType, visualMode: ch1.visualMode} ] AS valuedCharacteristics
PROFILE output:
With DISTINCT childD the query runs quite slowly; without it, it is much better but still far from perfect.
One more try:
PROFILE MATCH (parentD)-[:CONTAINS]->(childD:Decision)
WHERE id(parentD) = 415406
MATCH (childD)<-[:SET_FOR]-(filterValue416423)-[:SET_ON]->(filterCharacteristic416423)
USING JOIN ON childD
WHERE id(filterCharacteristic416423) = 416423
AND (('Adobe RGB' IN filterValue416423.value ) OR ('ECI RGB' IN filterValue416423.value ))
WITH DISTINCT childD
MATCH (childD)<-[:SET_FOR]-(filterValue416273)-[:SET_ON]->(filterCharacteristic416273)
USING JOIN ON childD
WHERE id(filterCharacteristic416273) = 416273 AND (filterValue416273.value >= 4) AND (filterValue416273.value <= 53)
WITH DISTINCT childD
MATCH (childD)<-[:SET_FOR]-(filterValue415431)-[:SET_ON]->(filterCharacteristic415431)
USING JOIN ON childD
WHERE id(filterCharacteristic415431) = 415431
AND (('Compact' IN filterValue415431.value )
OR ('Compact SLR' IN filterValue415431.value )
OR ('Large SLR' IN filterValue415431.value )
OR ('Rangefinder-style mirrorless' IN filterValue415431.value )
OR ('SLR-like (bridge)' IN filterValue415431.value ))
WITH DISTINCT childD
MATCH (childD)<-[:SET_FOR]-(filterValue415441)-[:SET_ON]->(filterCharacteristic415441)
USING JOIN ON childD
WHERE id(filterCharacteristic415441) = 415441
AND (('Brass' IN filterValue415441.value )
OR ('Carbon fiber' IN filterValue415441.value ))
OPTIONAL MATCH (childD)<-[:VOTED_FOR]-(vg:VoteGroup)-[:VOTED_ON]->(c:Criterion)
WHERE id(c) IN [415414, 415415, 415412, 415426, 415411]
WITH DISTINCT * MATCH (childD)-[ru:CREATED_BY]->(u:User)
WITH DISTINCT childD, ru, u, vg.avgVotesWeight as weight, vg.totalVotes as totalVotes
WITH DISTINCT ru, u, childD , toFloat(sum(weight)) as weight, toInt(sum(totalVotes)) as totalVotes
ORDER BY weight DESC
SKIP 0 LIMIT 10
RETURN childD
The main problem with your query is that you are doing a lot of checks while the row count runs wild. Here are some tips to reduce the number of rows you generate at each MATCH.
1) Unless you NEED duplicates, use WITH DISTINCT instead of plain WITH. WITH can create duplicate rows (because it only drops a column), and every duplicate row you process is wasted time and extra DB hits. (In particular, every filter column you drop adds duplicate rows.)
2) :Value.value is overloaded. It has no semantic meaning, and the value isn't even guaranteed to be of any particular type. That means every :Value check has to go out and touch a bunch of :Value nodes that have nothing to do with what you're searching for, so the more :Value nodes are attached, the more expensive it becomes to find the right one. (This would be less expensive if it could be indexed, so that Cypher could find the right :Value directly and see what it is connected to. That doesn't help if you can't change the schema you're working with; by schema, I mean how your data and relationships are set up.)
3) Only check what you need to check. It might seem more efficient to say (a:A)-[:TO]->(b:B), but if all [:TO] relationships go from :A to :B, Neo4j still has to verify that the first node is an :A and the second node is a :B. Cypher doesn't know what is implicitly true, so it has to do the check, and each of these redundant checks has to hit the DB for every row. In that case it is better to say (a)-[:TO]->(b).
4) Limit variable scope. Here, you match -[ru:CREATED_BY]->(u:User) at the beginning but then don't use it until the end, with no filters on it. This multiplies your row count by the number of -[ru:CREATED_BY]->(u:User) matches on each decision, and ALL of those rows have to be checked in the later matches. Unless -[ru:CREATED_BY]->(u:User) somehow greatly limits the matched decisions (or there can only be one per decision), match this supporting information at the end.
5) Order your filters from strongest to weakest (if you can), to cut as many rows as early as possible.
6) Tricks to minimize rows. Every row pulled up makes the following steps in the query work that much harder, so minimize rows. If you are using OR to combine unrelated but similar queries (like all orgs with condition A or orgs with condition B), and the work of each half just makes things more expensive for the other half, it might be better to use UNION to combine the results of smaller, faster queries (and UNION can run in parallel up to the point where the results are merged). Note that simple queries like WHERE org.id IN [1,2,3] are still faster than UNION, since the work can all be done in one lookup.
Aside from UNION, if you are carrying nodes that you don't filter on, you can use collect(column) to reduce the 'duplicates' down to 1 row, and then UNWIND column AS column at the end of the query to get your rows back. (Here, column refers to a variable name.)
7) Doing a lot of filters on 1 node? Cypher has USING hints for that. The hint USING JOIN ON column tells Cypher that it will probably be more efficient to do this match from more starting leaves and join them. So adding USING JOIN ON childD to each match tells Cypher to do all the filters in parallel and keep the overlapping rows of all of them. Note that a USING hint is just you telling Cypher "trust me, this should go faster if we try doing this", which can actually make the query worse if you are wrong. (USING JOIN should still be useful for making large queries more parallel.)
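To make tips 1, 3 and 4 a bit more concrete, here is a rough sketch of a single filter block from the question reshaped along those lines. It is only an illustration, not a tested rewrite: it assumes that CONTAINS only ever points at decisions (so the :Decision label check can be dropped), and the short names v and ch are placeholders I made up for the filter value and characteristic.

```cypher
// Sketch only, under the assumptions stated above.
MATCH (parentD)-[:CONTAINS]->(childD)             // tip 3: no label check on childD
WHERE id(parentD) = 415406
MATCH (childD)<-[:SET_FOR]-(v)-[:SET_ON]->(ch)    // v and ch are placeholder names
WHERE id(ch) = 416273 AND v.value >= 4 AND v.value <= 53
WITH DISTINCT parentD, childD                     // tip 1: drop v and ch, collapse duplicate rows
// ... further filter blocks of the same shape would go here ...
MATCH (childD)-[ru:CREATED_BY]->(u:User)          // tip 4: supporting data is matched last
RETURN childD AS decision, ru, u
```

Whether this is actually faster on your data is something only PROFILE can tell you, so compare the db hits before and after.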
UPDATE:
First, a note on node.id = "constant" AND node.value = "constant" OR node.id = "constant2" AND node.value = "constant2" versus node.value = map[node.id]. The first form can filter nodes during the node lookup, while the latter has to filter through all of the nodes that were already looked up. Without previous filtering on that lookup, that means the map version has to pull in all nodes. While the map offers some level of (arguable) simplicity and flexibility, it is one of the least efficient ways to filter nodes.
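For illustration, the two forms being compared might look roughly like this. The values 10 and 20 and the parameter name {expected} are made up for the example; {expected} is assumed to be a map keyed by the characteristic id as a string, such as {`416273`: 10, `415431`: 20}.

```cypher
// Form 1: constants inlined, so each id comparison is visible to the planner.
// (Run the two statements one at a time.)
MATCH (childD)<-[:SET_FOR]-(v)-[:SET_ON]->(ch)
WHERE (id(ch) = 416273 AND v.value = 10)
   OR (id(ch) = 415431 AND v.value = 20)
RETURN DISTINCT childD;

// Form 2: map lookup. The comparison can only be evaluated per row,
// after v and ch have already been matched.
MATCH (childD)<-[:SET_FOR]-(v)-[:SET_ON]->(ch)
WHERE v.value = {expected}[toString(id(ch))]
RETURN DISTINCT childD;
```

Running both under PROFILE should show where the filtering happens in each plan.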
Second, the big problem with your query now is that :Value is super overloaded, and you aren't finding it by ID. :Value should be a relationship, or have an indexed ID field, so that you don't have to touch ALL <-[:SET_FOR]- and -[:SET_ON]->. I think using the join hint will at least give SET_FOR higher priority, which appears to be the more efficient of the two.
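As a sketch of what "an indexed ID field" could look like: this assumes you are able to change the data and store an identifier of the characteristic on each :Value node. The property name characteristicId and the parameter {characteristicId} are hypothetical; nothing like them exists in the schema shown in the question.

```cypher
// Hypothetical schema change: index an assumed characteristicId property.
// (Run the schema statement and the query as separate statements.)
CREATE INDEX ON :Value(characteristicId);

// The lookup can then start from the index instead of walking every
// SET_ON and SET_FOR relationship of the decision.
MATCH (v:Value)
WHERE v.characteristicId = {characteristicId}
  AND v.value >= 4 AND v.value <= 53
MATCH (v)-[:SET_FOR]->(childD)
RETURN DISTINCT childD;
```

Whether the planner actually picks the index depends on your data, so check the plan with PROFILE.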
Here is my attempt to rewrite the PROFILE query more efficiently. (v1)
MATCH (parentD)-[:CONTAINS]->(childD:Decision)
WHERE id(parentD) = 415406
MATCH (childD)<-[:SET_FOR]-(filterValue416423)-[:SET_ON]->(filterCharacteristic416423)
USING JOIN ON childD
WHERE id(filterCharacteristic416423) = 416423
AND (('Adobe RGB' IN filterValue416423.value ) OR ('ECI RGB' IN filterValue416423.value ))
WITH DISTINCT childD
MATCH (childD)<-[:SET_FOR]-(filterValue416273)-[:SET_ON]->(filterCharacteristic416273)
USING JOIN ON childD
WHERE id(filterCharacteristic416273) = 416273 AND (filterValue416273.value >= 4) AND (filterValue416273.value <= 53)
WITH DISTINCT childD
MATCH (childD)<-[:SET_FOR]-(filterValue415431)-[:SET_ON]->(filterCharacteristic415431)
USING JOIN ON childD
WHERE id(filterCharacteristic415431) = 415431
AND (('Compact' IN filterValue415431.value )
OR ('Compact SLR' IN filterValue415431.value )
OR ('Large SLR' IN filterValue415431.value )
OR ('Rangefinder-style mirrorless' IN filterValue415431.value )
OR ('SLR-like (bridge)' IN filterValue415431.value ))
WITH DISTINCT childD
MATCH (childD)<-[:SET_FOR]-(filterValue415441)-[:SET_ON]->(filterCharacteristic415441)
USING JOIN ON childD
WHERE id(filterCharacteristic415441) = 415441
AND (('Brass' IN filterValue415441.value )
OR ('Carbon fiber' IN filterValue415441.value ))
OPTIONAL MATCH (childD)<-[:VOTED_FOR]-(vg:VoteGroup)-[:VOTED_ON]->(c:Criterion)
WHERE id(c) IN [415414, 415415, 415412, 415426, 415411]
WITH DISTINCT * MATCH (childD)-[ru:CREATED_BY]->(u:User)
WITH DISTINCT childD, ru, u, vg.avgVotesWeight as weight, vg.totalVotes as totalVotes
WITH DISTINCT ru, u, childD , toFloat(sum(weight)) as weight, toInt(sum(totalVotes)) as totalVotes
ORDER BY weight DESC
SKIP 0 LIMIT 10
RETURN ru, u, childD AS decision, weight, totalVotes,
[ (parentD)<-[:DEFINED_BY]-(entity)<-[:COMMENTED_ON]-(comg:CommentGroup)-[:COMMENTED_FOR]->(childD) |
{entityId: id(entity), types: labels(entity), totalComments: toInt(comg.totalComments)} ] AS commentGroups,
[ (parentD)<-[:DEFINED_BY]-(c1:Criterion)<-[:VOTED_ON]-(vg1:VoteGroup)-[:VOTED_FOR]->(childD) |
{criterionId: id(c1), weight: vg1.avgVotesWeight, totalVotes: toInt(vg1.totalVotes)} ] AS weightedCriteria,
[ (parentD)<-[:DEFINED_BY]-(ch1:Characteristic)<-[:SET_ON]-(v1)-[:SET_FOR]->(childD) |
{characteristicId: id(ch1), value: v1.value, valueType: ch1.valueType, visualMode: ch1.visualMode} ] AS valuedCharacteristics

Neo4j Cypher query and complex sorting

I have the following Cypher query:
MATCH (t:Tenant) WHERE ID(t) in {tenantIds}
OR t.isPublic
WITH COLLECT(t) as tenants
MATCH (parentD)-[:CONTAINS]->(childD:Decision)-[ru:CREATED_BY]->(u:User)
WHERE id(parentD) = {decisionId}
AND (not (parentD)-[:BELONGS_TO]-(:Tenant)
OR any(t in tenants WHERE (parentD)-[:BELONGS_TO]-(t)))
AND (not (childD)-[:BELONGS_TO]-(:Tenant)
OR any(t in tenants WHERE (childD)-[:BELONGS_TO]-(t)))
MATCH (childD)<-[:SET_FOR]-(filterValue630:Value)-[:SET_ON]->(filterCharacteristic630:Characteristic)
WHERE id(filterCharacteristic630) = 630
WITH filterValue630, childD, ru, u
WHERE (filterValue630.value <= 799621200000)
OPTIONAL MATCH (childD)<-[:SET_FOR]->(sortValue631:Value)-[:SET_ON]->(sortCharacteristic631:Characteristic)
WHERE id(sortCharacteristic631) = 631
RETURN ru, u, childD AS decision,
[ (parentD)<-[:DEFINED_BY]-(entity)<-[:COMMENTED_ON]-(comg:CommentGroup)-[:COMMENTED_FOR]->(childD)
| {entityId: id(entity), types: labels(entity), totalComments: toInt(comg.totalComments)} ] AS commentGroups,
[ (parentD)<-[:DEFINED_BY]-(c1:Criterion)<-[:VOTED_ON]-(vg1:VoteGroup)-[:VOTED_FOR]->(childD)
| {criterionId: id(c1), weight: vg1.avgVotesWeight, totalVotes: toInt(vg1.totalVotes)} ] AS weightedCriteria,
[ (parentD)<-[:DEFINED_BY]-(ch1:Characteristic)<-[:SET_ON]-(v1:Value)-[:SET_FOR]->(childD)
| {characteristicId: id(ch1), value: v1.value, valueType: ch1.valueType, visualMode: ch1.visualMode} ] AS valuedCharacteristics
ORDER BY sortValue631.value ASC, childD.createDate DESC
SKIP 0 LIMIT 100
As a result of this query execution I receive 15 records, each of which correctly contains populated commentGroups, weightedCriteria and valuedCharacteristics collections.
But when I change my query to the following one (adding a sort condition by criteria weight):
MATCH (t:Tenant)
WHERE ID(t) in {tenantIds}
OR t.isPublic
WITH COLLECT(t) as tenants
MATCH (parentD)-[:CONTAINS]->(childD:Decision)-[ru:CREATED_BY]->(u:User)
WHERE id(parentD) = {decisionId}
AND (not (parentD)-[:BELONGS_TO]-(:Tenant)
OR any(t in tenants WHERE (parentD)-[:BELONGS_TO]-(t)))
AND (not (childD)-[:BELONGS_TO]-(:Tenant)
OR any(t in tenants WHERE (childD)-[:BELONGS_TO]-(t)))
MATCH (childD)<-[:SET_FOR]-(filterValue630:Value)-[:SET_ON]->(filterCharacteristic630:Characteristic)
WHERE id(filterCharacteristic630) = 630
WITH filterValue630, childD, ru, u
WHERE (filterValue630.value <= 799621200000)
OPTIONAL MATCH (childD)<-[:VOTED_FOR]-(vg:VoteGroup)-[:VOTED_ON]->(c:Criterion)
WHERE id(c) IN {criteriaIds}
WITH c, childD, ru, u, (vg.avgVotesWeight * (CASE WHEN c IS NOT NULL THEN coalesce({criteriaCoefficients}[toString(id(c))], 1.0) ELSE 1.0 END)) as weight, vg.totalVotes as totalVotes
OPTIONAL MATCH (childD)<-[:SET_FOR]->(sortValue631:Value)-[:SET_ON]->(sortCharacteristic631:Characteristic)
WHERE id(sortCharacteristic631) = 631
RETURN ru, u, childD AS decision, toFloat(sum(weight)) as weight, toInt(sum(totalVotes)) as totalVotes, sortValue631,
[ (parentD)<-[:DEFINED_BY]-(entity)<-[:COMMENTED_ON]-(comg:CommentGroup)-[:COMMENTED_FOR]->(childD)
| {entityId: id(entity), types: labels(entity), totalComments: toInt(comg.totalComments)} ] AS commentGroups,
[ (parentD)<-[:DEFINED_BY]-(c1:Criterion)<-[:VOTED_ON]-(vg1:VoteGroup)-[:VOTED_FOR]->(childD)
| {criterionId: id(c1), weight: vg1.avgVotesWeight, totalVotes: toInt(vg1.totalVotes)} ] AS weightedCriteria,
[ (parentD)<-[:DEFINED_BY]-(ch1:Characteristic)<-[:SET_ON]-(v1:Value)-[:SET_FOR]->(childD)
| {characteristicId: id(ch1), value: v1.value, valueType: ch1.valueType, visualMode: ch1.visualMode} ] AS valuedCharacteristics
ORDER BY weight DESC, totalVotes ASC, sortValue631.value ASC, childD.createDate DESC
SKIP 0 LIMIT 100
The query works without errors and returns the same result set of 15 records, but the commentGroups, weightedCriteria and valuedCharacteristics collections are only populated where weight > 0. The rest of them are null.
This is wrong and not what I expect. The commentGroups, weightedCriteria and valuedCharacteristics collections should be populated for all records in my result set, as they were after the first query execution.
Right now I don't understand why the following part of the new Cypher query prevents correct population of the mentioned collections:
OPTIONAL MATCH (childD)<-[:VOTED_FOR]-(vg:VoteGroup)-[:VOTED_ON]->(c:Criterion)
WHERE id(c) IN {criteriaIds}
WITH c, childD, ru, u, (vg.avgVotesWeight * (CASE WHEN c IS NOT NULL THEN coalesce({criteriaCoefficients}[toString(id(c))], 1.0) ELSE 1.0 END)) as weight, vg.totalVotes as totalVotes
What am I doing wrong in the new query, and how can I fix it?
UPDATED
This is the query which produces the issue:
MATCH (t:Tenant) WHERE ID(t) in []
OR t.isPublic
WITH COLLECT(t) as tenants
MATCH (parentD)-[:CONTAINS]->(childD:Decision)-[ru:CREATED_BY]->(u:User)
WHERE id(parentD) = 60565
AND (not (parentD)-[:BELONGS_TO]-(:Tenant)
OR any(t in tenants WHERE (parentD)-[:BELONGS_TO]-(t)))
AND (not (childD)-[:BELONGS_TO]-(:Tenant)
OR any(t in tenants WHERE (childD)-[:BELONGS_TO]-(t)))
MATCH (childD)<-[:SET_FOR]-(filterValue60639:Value)-[:SET_ON]->(filterCharacteristic60639:Characteristic)
WHERE id(filterCharacteristic60639) = 60639
WITH filterValue60639, childD, ru, u
WHERE (filterValue60639.value <= 799621200000)
OPTIONAL MATCH (childD)<-[:VOTED_FOR]-(vg:VoteGroup)-[:VOTED_ON]->(c:Criterion)
WHERE id(c) IN [60581, 60575]
WITH childD, ru, u, vg.avgVotesWeight as weight, vg.totalVotes as totalVotes
OPTIONAL MATCH (childD)<-[:SET_FOR]->(sortValue60640:Value)-[:SET_ON]->(sortCharacteristic60640:Characteristic)
WHERE id(sortCharacteristic60640) = 60640
RETURN ru, u, childD AS decision, toFloat(sum(weight)) as weight, toInt(sum(totalVotes)) as totalVotes, sortValue60640,
[ (parentD)<-[:DEFINED_BY]-(entity)<-[:COMMENTED_ON]-(comg:CommentGroup)-[:COMMENTED_FOR]->(childD)
| {entityId: id(entity), types: labels(entity), totalComments: toInt(comg.totalComments)} ] AS commentGroups,
[ (parentD)<-[:DEFINED_BY]-(c1:Criterion)<-[:VOTED_ON]-(vg1:VoteGroup)-[:VOTED_FOR]->(childD)
| {criterionId: id(c1), weight: vg1.avgVotesWeight, totalVotes: toInt(vg1.totalVotes)} ] AS weightedCriteria,
[ (parentD)<-[:DEFINED_BY]-(ch1:Characteristic)<-[:SET_ON]-(v1:Value)-[:SET_FOR]->(childD)
| {characteristicId: id(ch1), value: v1.value, valueType: ch1.valueType, visualMode: ch1.visualMode} ] AS valuedCharacteristics
ORDER BY weight DESC, totalVotes ASC, sortValue60640.value ASC, childD.createDate DESC
SKIP 0 LIMIT 100
For some reason
OPTIONAL MATCH (childD)<-[:VOTED_FOR]-(vg:VoteGroup)-[:VOTED_ON]->(c:Criterion)
WHERE id(c) IN [60581, 60575]
prevents the commentGroups, weightedCriteria and valuedCharacteristics collections from being populated for all childD that do not match this expression. How can I fix this?
Okay, this is a rather odd thing. I found something that should work, though at the moment I can't tell why it's working, just that it involves calculating weight and totalVotes before your return.
Replace the first line of your RETURN with the following, which adds a WITH clause that calculates weight and totalVotes first and then performs the RETURN:
WITH ru, u, childD, toFloat(sum(weight)) as weight, toInt(sum(totalVotes)) as totalVotes, sortValue60640
RETURN ru, u, childD AS decision, weight, totalVotes, sortValue60640,
One other thing to note: you can save some unnecessary operations by performing your ORDER BY, SKIP, and LIMIT before you perform your pattern comprehensions:
WITH ru, u, childD, toFloat(sum(weight)) as weight, toInt(sum(totalVotes)) as totalVotes, sortValue60640
ORDER BY weight DESC, totalVotes ASC, sortValue60640.value ASC, childD.createDate DESC
SKIP 0 LIMIT 100
RETURN ru, u, childD AS decision, weight, totalVotes, sortValue60640,
[ (parentD)<-[:DEFINED_BY]-(entity)<-[:COMMENTED_ON]-(comg:CommentGroup)-[:COMMENTED_FOR]->(childD)
| {entityId: id(entity), types: labels(entity), totalComments: toInt(comg.totalComments)} ] AS commentGroups,
[ (parentD)<-[:DEFINED_BY]-(c1:Criterion)<-[:VOTED_ON]-(vg1:VoteGroup)-[:VOTED_FOR]->(childD)
| {criterionId: id(c1), weight: vg1.avgVotesWeight, totalVotes: toInt(vg1.totalVotes)} ] AS weightedCriteria,
[ (parentD)<-[:DEFINED_BY]-(ch1:Characteristic)<-[:SET_ON]-(v1:Value)-[:SET_FOR]->(childD)
| {characteristicId: id(ch1), value: v1.value, valueType: ch1.valueType, visualMode: ch1.visualMode} ] AS valuedCharacteristics

Resources