Neo4j Aggregate Multiple Lines into a Map - neo4j

I have the following Cypher script:
MATCH (sy:SchoolYear)<-[:TERM_OF*]-()<-[:DAY_OF]-(d:Day)
WHERE sy.year = 2015
OPTIONAL MATCH (d)<-[:START]-(e:Enrollment)-[:AT]->(s:School)
RETURN d.date, s.abbreviation, count(e)
ORDER BY d.date
This gives me all of the dates in the range that I want and returns the number of students that have enrolled at each school on that date, or null. The only issue is that different schools come back as separate rows, so a single date can span multiple rows. I would like to aggregate those into a single row per date.
I'm given:
1/1/2000, School 1, 5
1/1/2000, School 2, 10
1/2/2000, null, null
1/3/2000, School 1, 6
What I would like:
1/1/2000, {School 1 : 5, School 2: 10}
1/2/2000, null
1/3/2000, {School 1: 6}
I've tried:
MATCH (sy:SchoolYear)<-[:TERM_OF*]-()<-[:DAY_OF]-(d:Day)
WHERE sy.year = 2015
OPTIONAL MATCH (d)<-[:START]-(e:Enrollment)-[:AT]->(s:School)
WITH d, s.abbreviation as abb, count(e) as enr
RETURN d.date, {abb:enr}
ORDER BY d.date
How should I go about this?

Here is how I would approach this: aggregate each school into a map, then collect the maps into a list per date.
MATCH (sy:SchoolYear)<-[:TERM_OF*]-()<-[:DAY_OF]-(d:Day)
WHERE sy.year = 2015
OPTIONAL MATCH (d)<-[:START]-(e:Enrollment)-[:AT]->(s:School)
WITH d, s, count(e) as students
RETURN d.date, collect({name:s.abbreviation, students:students})
ORDER BY d.date
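To make the shape of that result concrete, here is a minimal Python sketch that mimics the two aggregation steps on the sample rows from the question. It is only a model of the grouping, not Neo4j code; the function name `collect_by_date` and the row layout are assumptions made for the illustration.

```python
from collections import Counter

def collect_by_date(rows):
    """rows: (date, school_or_None) tuples, one per matched enrollment.
    A date without any enrollment appears once with school None."""
    # Step 1: count enrollments per (date, school), like WITH d, s, count(e).
    counts = Counter()
    for date, school in rows:
        # A row without an enrollment creates the group but adds nothing.
        counts[(date, school)] += 0 if school is None else 1
    # Step 2: collect one {name, students} map per school under each date.
    result = {}
    for (date, school), students in counts.items():
        result.setdefault(date, []).append({"name": school, "students": students})
    return result

rows = (
    [("1/1/2000", "School 1")] * 5
    + [("1/1/2000", "School 2")] * 10
    + [("1/2/2000", None)]
    + [("1/3/2000", "School 1")] * 6
)
by_date = collect_by_date(rows)
```

In this model a date without enrollments ends up with a single entry whose name is None and whose count is 0, rather than with a bare null.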

This is a bit ugly, but I think it returns what you are after. I tried using the school name as a key, just like you did in your example, and could not get that to work either. In the end I resorted to building the result as a string:
MATCH (sy:SchoolYear)<-[:TERM_OF*]-()<-[:DAY_OF]-(d:Day)
WHERE sy.year = 2015
OPTIONAL MATCH (d)<-[:START]-(e:Enrollment)-[:AT]->(s:School)
// count the enrollments per day and school
with d, s.abbreviation as abbreviation, count(e) as enrolled
// collect all of the [school, count] pairs together by date
with d.date as date, collect([abbreviation, enrolled]) as school_counts
// format the school counts as a string with the schools
// as keys and the counts as values
with date, reduce( out = "", s in school_counts | out + s[0] + " : " + s[1] + ", " ) as school_count_str
return date, '{ ' + left(school_count_str, size(school_count_str)-2) + ' }' as school_counts
order by date
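If the string is only there to get school names as keys, another option is to return the collected pairs as they are and build the map in the calling code. A minimal Python sketch, assuming each result row arrives as a date plus a list of [abbreviation, count] pairs (the function name `to_keyed_map` and the sample rows are made up for the illustration):

```python
def to_keyed_map(pairs):
    """pairs: list of [abbreviation, count] lists for one date.
    Returns a dict keyed by abbreviation, or None if no school matched."""
    # Skip the placeholder pair that a date without enrollments produces.
    keyed = {abbr: count for abbr, count in pairs if abbr is not None}
    return keyed or None

rows = [
    ("1/1/2000", [["School 1", 5], ["School 2", 10]]),
    ("1/2/2000", [[None, 0]]),
    ("1/3/2000", [["School 1", 6]]),
]
result = [(date, to_keyed_map(pairs)) for date, pairs in rows]
```

This yields one entry per date with the school abbreviations as keys, and None for a date on which nobody enrolled, which is the layout asked for in the question.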

Related

Neo4j: Why difference in result?

I have the Cypher query:
match(p:Product {StyleNumber : "Z94882A", Color: "Black"})--(stock:Stock {Retailer: "11"})
with sum(stock.Stockcount) as onstock, p
optional match(p)-->(s:Sale {Retailer : "11"})
where s.Date = 20170801
return p.Color,p.Size, onstock as stock, sum(s.Quantity) as sold
This gives correctly:
Color,Size,Stock,Sold
Black,M,3,0
Black,S,3,1
Black,L,1,1
Black,XL,5,2
But if I leave out the Size property in the return statement, and just return:
return p.Color, onstock as stock, sum(s.Quantity) as sold
This only returns 3 rows (Size "M" is missing):
Black,3,1
Black,1,1
Black,5,2
I can't figure out why there is a difference in the result?
Because you are using the sum() aggregation function.
Cypher doesn't have a GROUP BY clause (like traditional SQL databases), but when you use an aggregation function all non-aggregated fields are implicitly used as grouping fields.
So when you remove p.Size from the return, the first row is grouped with the second row because all of the implicit grouping values are equal (p.Color = 'Black' and onstock = 3). The Sold values of both rows are then added together by sum() (0 + 1 = 1), producing the row:
Black,3,1
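The effect can be reproduced outside Neo4j. The following Python sketch is only a model of implicit grouping, using the four rows from the question; the helper `aggregate` and the dictionary keys are names chosen for the illustration.

```python
from collections import defaultdict

def aggregate(rows, keys):
    """Sum 'sold' over rows, grouping by the given key columns."""
    totals = defaultdict(int)
    for row in rows:
        # Every non-aggregated column becomes part of the grouping key.
        totals[tuple(row[k] for k in keys)] += row["sold"]
    return dict(totals)

rows = [
    {"color": "Black", "size": "M", "stock": 3, "sold": 0},
    {"color": "Black", "size": "S", "stock": 3, "sold": 1},
    {"color": "Black", "size": "L", "stock": 1, "sold": 1},
    {"color": "Black", "size": "XL", "stock": 5, "sold": 2},
]

with_size = aggregate(rows, ["color", "size", "stock"])
without_size = aggregate(rows, ["color", "stock"])
```

With size in the key there are four groups; without it, the M and S rows share the key ("Black", 3) and collapse into one group, leaving three.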

Select 'x' returns "x"()

I want to return 3 columns: A (description), 'D' (a hard-coded value of D), and Q (date).
=query('Detailed Plan'!$A$2:$Q, "select A,'D',Q where D = date")
It returns the following results. Rows 2 and greater are exactly what I want and would be perfect if I didn't get the first row. How do I get a hard coded value into a column without "D"() showing up in the first row?
blank, "D"(), blank
Description, D, date
Description, D, date
Description, D, date
Thanks so much for any help that is provided.
You can use 'label' in the query string.
=query('Detailed Plan'!$A$2:$Q, "select A, D, Q where D = date label A 'Description', D 'Some value', Q 'Date' ", 0)
EDIT: if you don't need headers at all, try
=query('Detailed Plan'!$A$2:$Q, "select A, D, Q where D = date ", 0)

Confused about using ORDER BY correctly

I am experimenting with dates in neo4j. Now I'd like to sort my results by the ISODateString. I created a cypher query like this:
MATCH(e:Expedition {id : "BJGYmzwZb"})-[pje]-(u:User)
WHERE (e)-[:POSSIBLY_JOINS_EXPEDITION]-(u) OR (e)-[:JOINS_EXPEDITION]-(u)
WITH e, u, apoc.date.parse(pje.createdAt, 's',"yyyy-MM-dd'T'HH:mm:ss.SSS'Z'") as date
ORDER by date
OPTIONAL MATCH(u)<-[invitee:POSSIBLY_JOINS_EXPEDITION]-(e)
OPTIONAL MATCH(u)-[attendee:JOINS_EXPEDITION]->(e)
OPTIONAL MATCH(u)-[applicant:POSSIBLY_JOINS_EXPEDITION]->(e)
RETURN date, {user: properties(u), isInvitee: COUNT(invitee) > 0, isApplicant: COUNT(applicant) > 0, isAttendee: COUNT(attendee) > 0} as u
The returned results are not sorted properly, whereas the following query does return the results in the right order. I just removed the parts with OPTIONAL MATCH.
MATCH(e:Expedition {id : "BJGYmzwZb"})-[pje]-(u:User)
WHERE (e)-[:POSSIBLY_JOINS_EXPEDITION]-(u) OR (e)-[:JOINS_EXPEDITION]-(u)
WITH e, u, apoc.date.parse(pje.createdAt, 's',"yyyy-MM-dd'T'HH:mm:ss.SSS'Z'") as date
ORDER by date
RETURN date, {user: properties(u)} as u
Any suggestions what I am doing wrong? Do I need to deal differently with the OPTIONAL MATCH-additions?
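Put ORDER BY date after the RETURN clause. An ORDER BY on a WITH only sorts the rows at that point in the query; the aggregation in the final RETURN (the COUNT calls) is not guaranteed to keep that order. Like this: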
MATCH(e:Expedition {id : "BJGYmzwZb"})-[pje]-(u:User)
WHERE (e)-[:POSSIBLY_JOINS_EXPEDITION]-(u) OR (e)-[:JOINS_EXPEDITION]-(u)
WITH e, u, apoc.date.parse(pje.createdAt, 's',"yyyy-MM-dd'T'HH:mm:ss.SSS'Z'") as date
OPTIONAL MATCH(u)<-[invitee:POSSIBLY_JOINS_EXPEDITION]-(e)
OPTIONAL MATCH(u)-[attendee:JOINS_EXPEDITION]->(e)
OPTIONAL MATCH(u)-[applicant:POSSIBLY_JOINS_EXPEDITION]->(e)
RETURN date, {user: properties(u), isInvitee: COUNT(invitee) > 0, isApplicant: COUNT(applicant) > 0, isAttendee: COUNT(attendee) > 0} as u
ORDER by date

Neo4j: Conditional return/IF clause/String manipulation

This is a continuation of Neo4j: Listing node labels
I am constructing a dynamic MATCH statement to return the hierarchy structure and use the output as a Neo4j JDBC input to query the data from a Java method:
MATCH p=(:Service)<-[*]-(:Anomaly)
WITH head(nodes(p)) AS Service, p, count(p) AS cnt
RETURN DISTINCT Service.company_id, Service.company_site_id,
"MATCH srvhier=(" +
reduce(labels = "", n IN nodes(p) | labels + labels(n)[0] +
"<-[:BELONGS_TO]-") + ") WHERE Service.company_id = {1} AND
Service.company_site_id = {2} AND Anomaly.name={3} RETURN " +
reduce(labels = "", n IN nodes(p) | labels + labels(n)[0] + ".name,");
The output is as follows:
MATCH srvhier=(Service<-[:BELONGS_TO]-Category<-[:BELONGS_TO]-SubService<-
[:BELONGS_TO]-Assets<-[:BELONGS_TO]-Anomaly<-[:BELONGS_TO]-) WHERE
Service.company_id = {1} and Service.company_site_id = {21} and
Anomaly.name={3} RETURN Service.name, Category.name, SubService.name,
Assets.name, Anomaly.name,
The problem I am seeing:
The "BELONGS_TO" gets appended to my last node
Line 2: Assets<-[:BELONGS_TO]-Anomaly**<-[:BELONGS_TO]-**
Are there string functions (I have looked at Substring..) that can be used to remove it? Or can I use a CASE statement with condition n=cnt to append "BELONGS_TO"?
The same problem persists with my last line:
Line 5: Assets.name,Anomaly.name**,** - the additional "," that I need to eliminate.
Thanks.
I think you need to introduce a CASE expression into the reduce clause, something like the snippet below. If the node isn't the last element of the collection, append the "<-[:BELONGS_TO]-" relationship. If it is the last element, don't append it.
...
reduce(labels = "", n IN nodes(p) |
  CASE
    // append the relationship only when n is not the last node of the path
    WHEN n <> last(nodes(p)) THEN
      labels + labels(n)[0] + "<-[:BELONGS_TO]-"
    ELSE
      labels + labels(n)[0]
  END)
...
Cypher has a substring function that works basically like you'd expect. An example: here's how you'd return everything but the last three characters of a string:
return substring("hello", 0, size("hello")-3);
(That returns "he")
So you could use substring to trim the last separator off of your query that you don't want.
But I don't understand why you're building your query in such a complex way; you're using Cypher to write Cypher (which is OK), but (and I don't understand your data model 100%) it seems to me that there's probably an easier way to write this query.
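Since the generated text is consumed from application code anyway, one simpler route is to return just the ordered node labels and assemble the string there. A minimal Python sketch of that idea, which mirrors the text format from the question and leaves the WHERE part out for brevity; `build_query` is a hypothetical helper, not part of any Neo4j API:

```python
def build_query(labels):
    """Build the MATCH text from an ordered list of node labels."""
    # join() puts the separator between items only, so nothing trails.
    pattern = "<-[:BELONGS_TO]-".join(labels)
    columns = ", ".join(label + ".name" for label in labels)
    return "MATCH srvhier=(" + pattern + ") RETURN " + columns

labels = ["Service", "Category", "SubService", "Assets", "Anomaly"]
query = build_query(labels)
```

Joining with a separator avoids both the dangling "<-[:BELONGS_TO]-" after the last node and the extra comma after the last column, so no trimming step is needed.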

neo4j counting relations of multiple nodes

I am working with a Neo4j graph and wrote this query:
match (rec:Recipe) , (rec1:Recipe) , (rec)-[r:ContainsIngredient]->() , (rec1)-[r1:ContainsIngredient]->()
where rec.name = "a" AND rec1.name = "b"
return count(r) , count(r1)
It returns the same value for both counts, although Recipe("a") has three relationships and Recipe("b") has five.
Note: I noticed that it always returns the bigger value.
You aren't grouping by the recipe name. Try this:
MATCH (rec:Recipe)
WHERE rec.name = "a" OR rec.name = "b"
MATCH (rec)-[:ContainsIngredient]->()
RETURN rec.name, COUNT(*)
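Why the two counts in the original query cannot differ can be modelled in a few lines of Python. This is a sketch under the assumption that the two patterns of one MATCH are combined row by row as a cross product; the lists `a_rels` and `b_rels` are invented stand-ins for the relationships of the two recipes.

```python
from collections import Counter
from itertools import product

# Stand-ins for the relationships of recipe "a" (3) and recipe "b" (5).
a_rels = [("a", i) for i in range(3)]
b_rels = [("b", i) for i in range(5)]

# Two patterns in one MATCH: every a-row is paired with every b-row.
combined = list(product(a_rels, b_rels))

# Both counts run over the same combined rows, so they are always equal.
count_r = sum(1 for r, r1 in combined if r is not None)
count_r1 = sum(1 for r, r1 in combined if r1 is not None)

# Grouping by recipe name instead gives one count per recipe.
per_recipe = Counter(name for name, _ in a_rels + b_rels)
```

The exact figure depends on the data; the point of the model is only that both counts see the same set of rows, while grouping by recipe name keeps the two recipes apart.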
