I'm playing with cypher and I have some simple aggregation going on for me.
MATCH (p:Person)-[:HAS_CAR]->(n:Car)
RETURN n, count(p)
MATCH (p:Person)-[:HAS_APARTMENT]->(n:Apartment)
RETURN n, count(p)
MATCH (p:Person)-[:HAS_HOUSE]->(n:House)
RETURN n, count(p)
The problem is that I have to make 3 trips to the database to get all those results together. The problematic thing about that is that those queries are the last MATCH statement in a much bigger chain. Like this:
MATCH (:City { Id: 10})<-[:LIVES_IN]-(p:Person)
WITH p
MATCH ...
WITH p
MATCH ...
WITH p
MATCH ...
WITH p
MATCH ...
WITH p
MATCH p-[:HAS_CAR]->(n:Car)
RETURN n, count(p)
After all those MATCH ... WITH statements, only a few person nodes are matched so the last part of the query is very fast, but the initial part is not. I can't help but think that this could be improved because all three queries share a lot of statements.
I came up with this:
...
MATCH p-[:HAS_CAR|HAS_APARTMENT|HAS_HOUSE]->(n)
RETURN n, labels(n), count(p)
And I can work with that. But what if I wanted to mix in something like this:
MATCH p-[:KNOWS]->(:Person)-[:HAS_BIKE]->(n:Bike)
RETURN n, count(p)
Or even:
MATCH p-[:KNOWS]->(:Person)-[:HAS_BIKE|HAS_BOAT]->(n)
RETURN n, labels(n), count(p)
Can all of this be done in a single query and how?
Sometimes you need to use collections instead of rows to merge aggregation queries together and pass them along. This strategy might help... For example:
MATCH (p:Person)-[:HAS_CAR]->(car:Car)
WITH car, count(p) carCount
WITH collect({car:car, count:carCount}) as carCounts
MATCH (p:Person)-[:HAS_APARTMENT]->(n:Apartment)
WITH n, count(p) as apartmentCount, carCounts
RETURN collect({apartment:n, count:apartmentCount}) as apartmentCounts, carCounts
Update (see comments)--this lets you pass along the results of a filter and do a quick id lookup to find them again:
MATCH (p:Person)
WHERE p.name = "John" // or whatever else you need to filter on
WITH collect(id(p)) as pids
MATCH (p)-[:HAS_CAR]->(car:Car)
WHERE id(p) IN pids
WITH car, count(p) carCount, pids
WITH collect({car:car, count:carCount}) as carCounts, pids
MATCH (p)-[:HAS_APARTMENT]->(n:Apartment)
WHERE id(p) IN pids
WITH n, count(p) as apartmentCount, carCounts
RETURN collect({apartment:n, count:apartmentCount}) as apartmentCounts, carCounts
Related
How can I know how many nodes and edges are involved in a MATCH? Is there another way besides Explain / Profile Match?
If you mean how many nodes are matched in a path, such as a variable-length path, then you can assign a path variable for this:
MATCH p = (k:Person {name:'Keanu Reeves'})-[*..8]-(t:Person {name:'Tom Hanks'})
WITH p LIMIT 1
RETURN p, length(p) as pathLength, length(p) + 1 as numberOfNodesInPath
You can also use nodes(p) and relationships(p) to get the collection of nodes and relationships that make up the path, and you can use size() on those collections to get their size.
There exists the COUNT() function of Cypher that allows you to count the number of elements. As for example in this query:
MATCH (n)
RETURN COUNT(n);
This query will count all nodes in your database.
You can find more information in the cypher manual, under the aggregating functions. Check it out.
The following Cypher snippet should return the number of distinct nodes and relationships found by any given MATCH clause. Just replace <your code here> with your MATCH pattern.
MATCH <your code here>
WITH COLLECT(NODES(p)) AS ns, SUM(SIZE(RELATIONSHIPS(p))) AS relCount
UNWIND ns AS nodeList
UNWIND nodeList AS node
RETURN COUNT(DISTINCT node) AS nodeCount, relCount;
I am using neo4j to store data with nodes having 1 of 2 labels :Person and Organization. All nodes have a property :name
All the relationships are labeled LinkedTo and have a property :score. There might be multiple relations between one pair of Person and Organization nodes
I have used path queries to search paths between these nodes like:
MATCH (n:Person) WHERE n.name =~ "(?i)person1"
MATCH (m:Organization) WHERE m.name =~ "(?i)organization1"
WITH m,n
MATCH p = (m)-[*1..4]-(n)
RETURN p ORDER BY length(p) LIMIT 10
This returns all paths (upto 10)
Now I want to find specific paths, with all relations involved having a score=1. Not sure how to achieve this, I started with MATCH p = (m)-[f*1..4]-(n) but it got a deprecation warning. So after some googling and trials and errors, I came up with this:
MATCH (n:Person) WHERE n.name =~ "(?i)person1"
MATCH (m:Organization) WHERE m.name =~ "(?i)organization1"
WITH m,n
MATCH p = (m)-[*1..4]-(n)
WITH filter(x IN relationships(p) WHERE x.score=1) AS f
ORDER BY length(p)
UNWIND f AS ff
MATCH (a)-[ff]-(b)
RETURN a,b,ff LIMIT 10
But this is not correct and not clean and is giving me relationships and nodes that are not needed in the path.
This might be a basic cypher query but I am just a beginner and need help with this. :)
From what I have understand you are searching this query :
MATCH p = (m:Organization)-[rels*1..4]-(n:Person)
WHERE
n.name =~ "(?i)person1" AND
m.name =~ "(?i)organization1" AND
all(r IN rels WHERE r.score=1)
RETURN p
ORDER BY length(p)
Due to all(r IN rels WHERE r.score=1), Neo4j will just expand relationships in the path that have an attribute score set to 1.
Also you should note that you are using a regex condition for the node n & m, and this operator can't use indexes !
If your goal is to have a case-insensitive search, I advise you to create a sanitize field (ex _name) to store the name in lower-case, and to replace your regex with a CONTAINS or STARTS WITH
NOT RELEVANT - SKIP TO Important Edit.
I have the following query:
MATCH (n)
WHERE (n:person) AND n.id in ['af97ab48544b'] // id is our system identifier
OPTIONAL MATCH (n)-[r:friend|connected|owner]-(m)
WHERE (m:person OR m:dog OR m:cat)
RETURN n,r,m
This query returns all the persons, dogs and cats that have a relationship with a specific person. I would like to turn it over to receive all the nodes & relationships that NOT includes in this query results.
If it was SQL it would be
select * from graph where id NOT IN (my_query)
I think that the OPTIONAL MATCH is the problematic part. I How can I do it?
Any advice?
Thanks.
-- Important Edit --
Hey guys, sorry for changing my question but my requirements has been changed. I need to get the entire graph (all nodes and relationships) connected and disconnected except specific nodes by ids. The following query is working but only for single id, in case of more ids it isn't working.
MATCH (n) WHERE (n:person)
OPTIONAL MATCH (n)-[r:friend|connected|owner]-(m) WHERE (m:person OR m:dog OR m:cat)
WITH n,r,m
MATCH (excludeNode) WHERE excludeNode.id IN ['af97ab48544b']
WITH n,r,m,excludeNode WHERE NOT n.id = excludeNode.id AND (NOT m.id = excludeNode.id OR m is null)
RETURN n,m,r
Alternatively I tried simpler query:
MATCH (n) WHERE (n:person) AND NOT n.id IN ['af97ab48544b'] return n
But this one does not returns the relationships (remember I need disconnected nodes also).
How can I get the entire graph exclude specific nodes? That includes nodes and relationships, connected nodes and disconnected as well.
try this:
match (n) where not n.id = 'id to remove' optional match (n)-[r]-(m)
where not n.id in ['id to remove'] and not m.id in ['id to remove']
return n,r,m
You've gotta switch the 'perspective' of your query... start by looping over every node, then prune the ones that connect to your person.
MATCH (bad:person) WHERE bad.id IN ['af97ab48544b']
WITH COLLECT(bad) AS bads
MATCH path = (n:person) - [r:friend|:connected|:owner] -> (m)
WHERE n._id = '' AND (m:person OR m:cat OR m:dog) AND NOT ANY(bad IN bads WHERE bad IN NODES(path))
RETURN path
That said, this is a problem much more suited to SQL than to a graph. Any time you have to loop over every node with a label, you're in relational territory, the graph will be less efficient.
I wish to use the results of a UNION (n) as a filter for a subsequent match.
MATCH (n:Thing)-<<Insert valid match filters here>>
RETURN n
UNION
MATCH (n:Thing)-<<Insert a different set of match filters here>>
RETURN n;
n feeds into:
MATCH (n)-[:RELTYPE1]->(a:Artifact);
RETURN a;
I would expect to use a WITH statement, but I've struggled to figure out how structure the statement.
MATCH (n:Thing)-<<Insert valid match filters here>>
RETURN n
UNION
MATCH (n:Thing)-<<Insert a different set of match filters here>>
WITH n
MATCH (n)-[:RELTYPE1]->(a:Artifact);
RETURN a;
This was my original attempt, but the WITH is interpreted as the start of subquery of the UNION's second match (which makes sense).
I can see a few inelegant ways to make this work, but what is the proper approach?
I have been looking at your union example and it makes sense to me but I cannot see how I could make it work. But I am certainly not the guy with all of the answers. Is there a reason you couldn't do something like this though...
MATCH (n:Thing)
WHERE n.name = 'A'
WITH collect(n) as n1
MATCH (n:Thing)
WHERE n.name = 'B'
WITH n1 + collect(n) AS both
UNWIND both AS n
MATCH (n)-[:RELTYPE1]->(a:Artifact);
RETURN a;
I currently have this query:
START n=node(*)
MATCH (p:Person)-[:is_member]->(g:Group)
WHERE g.name ='FooManGroup'
RETURN p, count(p)
LIMIT 5
Say there are 42 people in FooManGroup, I want to return 5 of these people, with a count of 42.
Is this possible to do in one query?
Running this now returns 5 rows, which is fine, but a count of 104, which is the total number of nodes of any type in my DB.
Any suggestions?
You can use a WITH clause to do the counting of the persons, followed by an identical MATCH clause to do the matching of each person. Notice that you need to START on the p nodes and not just some n that will match any node in the graph:
MATCH (p:Person )-[:is_member]->(g:Group)
WHERE g.name ='FooManGroup'
WITH count(p) as personsInGroup
MATCH (p:Person)-[:is_member]->(g:Group)
WHERE g.name ='FooManGroup'
RETURN p, personsInGroup
LIMIT 5
It may not be the best or most elegant way to this, but it works. If you use cypher 2.0 it may be a bit more compact like this:
MATCH (p:Person)-[:is_member]->(g:Group {name: 'FooManGroup'})
WITH count(p) as personsInGroup
MATCH (p:Person)-[:is_member]->(g:Group {name: 'FooManGroup'})
RETURN p, personsInGroup
LIMIT 5
Relationship types are always uppercased in cypher, so :is_member should be :IS_MEMBER which I think is more readable:
MATCH (p:Person)-[:IS_MEMBER]->(g:Group {name: 'FooManGroup'})
WITH count(p) as personsInGroup
MATCH (p:Person)-[:IS_MEMBER]->(g:Group {name: 'FooManGroup'})
RETURN p, personsInGroup
LIMIT 5
Try this:
MATCH (p:Person)-[:is_member]->(g:Group)
WHERE g.name ='FooManGroup'
RETURN count(p), collect(p)[0..5]