How to get all nodes from "multi match" cypher - neo4j

I'm trying to make a cypher query to make nodes list which is using "multi match" as follows:
MATCH (N1:Node)-[r1:write]->(P1:Node),
(P1:Node)-[r2:access]->(P2:Node)<-[r3:create]-(P1)
WHERE r1.Time <r3.Time and r1.ID = r2.ID and r3.Time < r2.Time
return nodes(*)
I expect the output of the Cypher to be all nodes of the result, but Cypher doesn't support nodes(*).
I know that there is a way to resolve this like thisa;
MATCH G1=(N1:Node)-[r1:write]->(P1:Node),
G2=(P1:Node)-[r2:access]->(P2:Node)<-[r3:create]-(P1)
WHERE r1.Time <r3.Time and r1.ID = r2.ID and r3.Time < r2.Time
return nodes(G1), nodes(G2)
But the match part could be changed frequently so I want to know the way to get nodes from multi-match without handling variable for every match.
Is it possible?

Your specific example is easy to consolidate into a single path, as in:
MATCH p=(:Node)-[r1:write]->(p1:Node)-[r2:access]->(:Node)<-[r3:create]-(p1)
WHERE r1.Time < r3.Time < r2.Time AND r1.ID = r2.ID
RETURN NODES(p)
If you want the get distinct nodes, you can replace RETURN NODES(p) with UNWIND NODES(p) AS n RETURN DISTINCT n.
But, in general, you might be able to use the UNION clause to join the results of multiple disjoint statements, as in:
MATCH p=(:Node)-[r1:write]->(p1:Node)-[r2:access]->(:Node)<-[r3:create]-(p1)
WHERE r1.Time < r3.Time < r2.Time AND r1.ID = r2.ID
UNWIND NODES(p) AS n
RETURN n
UNION
MATCH p=(q0:Node)-[s1:create]->(q1:Node)-[s2:read]->(q0)
WHERE s2.Time > s1.Time AND s1.ID = s2.ID
UNWIND NODES(p) AS n
RETURN n
The UNION clause would combine the 2 results (after removing duplicates). UNION ALL can be used instead to keep all the duplicate results. The drawback of using UNION (ALL) is that variables cannot be shared between the sub-statements.

Related

How Many Nodes Are Involved in a Match

How can I know how many nodes and edges are involved in a MATCH? Is there another way besides Explain / Profile Match?
If you mean how many nodes are matched in a path, such as a variable-length path, then you can assign a path variable for this:
MATCH p = (k:Person {name:'Keanu Reeves'})-[*..8]-(t:Person {name:'Tom Hanks'})
WITH p LIMIT 1
RETURN p, length(p) as pathLength, length(p) + 1 as numberOfNodesInPath
You can also use nodes(p) and relationships(p) to get the collection of nodes and relationships that make up the path, and you can use size() on those collections to get their size.
There exists the COUNT() function of Cypher that allows you to count the number of elements. As for example in this query:
MATCH (n)
RETURN COUNT(n);
This query will count all nodes in your database.
You can find more information in the cypher manual, under the aggregating functions. Check it out.
The following Cypher snippet should return the number of distinct nodes and relationships found by any given MATCH clause. Just replace <your code here> with your MATCH pattern.
MATCH <your code here>
WITH COLLECT(NODES(p)) AS ns, SUM(SIZE(RELATIONSHIPS(p))) AS relCount
UNWIND ns AS nodeList
UNWIND nodeList AS node
RETURN COUNT(DISTINCT node) AS nodeCount, relCount;

neo4j cypher to filter multi paths based on two relationships

I have the following graph:
I need to get all the AD nodes which are related to a particular User node. If I search by a user B1, I should get all the AD nodes which are connected by HAS relation to B1 node as well as the AD nodes which are connected to its parent by HAS relation. But if any of these AD nodes are connected by an EXCLUDES relation, I should filter that one out.
For example, if I search by B1, I should get AD4,AD2
AD1 has EXCLUDES with D1 and AD3 has excludes with C1, hence filtered out.
I am using the following cypher
MATCH path=(p:AD)-[:HAS|EXCLUDES]-()<-[:CHILD_OF*]-(u:User) USING INDEX u:User(id) WHERE u.id = 'B1'
with p,
collect( filter( r in rels(path)
where type(r) = 'EXCLUDES'
)
) as test
where all( t in test where size(t) = 0 )
return p
The issue is when I search with C1, it return AD4,AD3,AD2. How can I eliminate AD3 from the result?
:CHILD_OF* doesn't include your starting node. To include that, set a lowerbound of 0:
[:CHILD_OF*0..]
That said, there are probably better ways to form your query. Try this, maybe:
MATCH (u:User)
WHERE u.id = 'B1'
WITH u, [(p:AD)-[:EXCLUDES]-()<-[:CHILD_OF*0..]-(u) | p] as excluded
MATCH (p:AD)-[:HAS]-()<-[:CHILD_OF*0..]-(u)
WHERE not p in excluded
RETURN p
EDIT
The pattern comprehension feature was released with Neo4j 3.1. You won't be able to use that in an older version. Try this instead:
MATCH (u:User)
WHERE u.id = 'B1'
OPTIONAL MATCH (p:AD)-[:EXCLUDES]-()<-[:CHILD_OF*0..]-(u)
WITH u, collect(p) as excluded
MATCH (p:AD)-[:HAS]-()<-[:CHILD_OF*0..]-(u)
WHERE not p in excluded
RETURN p

neo4j cypher query for filtering paths

I am using neo4j to store data with nodes having 1 of 2 labels :Person and Organization. All nodes have a property :name
All the relationships are labeled LinkedTo and have a property :score. There might be multiple relations between one pair of Person and Organization nodes
I have used path queries to search paths between these nodes like:
MATCH (n:Person) WHERE n.name =~ "(?i)person1"
MATCH (m:Organization) WHERE m.name =~ "(?i)organization1"
WITH m,n
MATCH p = (m)-[*1..4]-(n)
RETURN p ORDER BY length(p) LIMIT 10
This returns all paths (upto 10)
Now I want to find specific paths, with all relations involved having a score=1. Not sure how to achieve this, I started with MATCH p = (m)-[f*1..4]-(n) but it got a deprecation warning. So after some googling and trials and errors, I came up with this:
MATCH (n:Person) WHERE n.name =~ "(?i)person1"
MATCH (m:Organization) WHERE m.name =~ "(?i)organization1"
WITH m,n
MATCH p = (m)-[*1..4]-(n)
WITH filter(x IN relationships(p) WHERE x.score=1) AS f
ORDER BY length(p)
UNWIND f AS ff
MATCH (a)-[ff]-(b)
RETURN a,b,ff LIMIT 10
But this is not correct and not clean and is giving me relationships and nodes that are not needed in the path.
This might be a basic cypher query but I am just a beginner and need help with this. :)
From what I have understand you are searching this query :
MATCH p = (m:Organization)-[rels*1..4]-(n:Person)
WHERE
n.name =~ "(?i)person1" AND
m.name =~ "(?i)organization1" AND
all(r IN rels WHERE r.score=1)
RETURN p
ORDER BY length(p)
Due to all(r IN rels WHERE r.score=1), Neo4j will just expand relationships in the path that have an attribute score set to 1.
Also you should note that you are using a regex condition for the node n & m, and this operator can't use indexes !
If your goal is to have a case-insensitive search, I advise you to create a sanitize field (ex _name) to store the name in lower-case, and to replace your regex with a CONTAINS or STARTS WITH

neo4j how to use count(distinct()) over the nodes of path

I search the longest path of my graph and I want to count the number of distinct nodes of this longest path.
I want to use count(distinct())
I tried two queries.
First is
match p=(primero)-[:ResponseTo*]-(segundo)
with max(length(p)) as lengthPath
match p1=(primero)-[:ResponseTo*]-(segundo)
where length(p1) = lengthPath
return nodes(p1)
The query result is a graph with the path nodes.
But if I tried the query
match p=(primero)-[:ResponseTo*]-(segundo)
with max(length(p)) as lengthPath
match p1=(primero)-[:ResponseTo*]-(segundo)
where length(p1) = lengthPath
return count(distinct(primero))
The result is
count(distinct(primero))
2
How can I use count(distinct()) over the node primero.
Node Primero has a field called id.
You should bind at least one of those nodes, add a direction and also consider a path-limit otherwise this is an extremely expensive query.
match p=(primero)-[:ResponseTo*..30]-(segundo)
with p order by length(p) desc limit 1
unwind nodes(p) as n
return distinct n;

Find shortest path between nodes with additional filter

I'm trying to model flights between airports on certain dates. So far my test graph looks like this:
Finding shortest path between for example LTN and WAW is trivial with:
MATCH (f:Airport {code: "LTN"}), (t:Airport {code: "WAW"}),
p = shortestPath((f)-[]-(t)) RETURN p
Which gives me:
But I have no idea how to get only paths with Flights that have relation FLIES_ON with given Date.
Link to Neo4j console
Here's what I would do with your given model. The other commenters' queries don't seem right, as they use ANY() instead of ALL(). You specifically said you only want paths where all Flight nodes on the path are attached to a given Date node with a :FLIES_ON relationship:
MATCH (LTN:Airport {code:"LTN"}),
(WAW:Airport {code:"WAW"}),
p =(LTN)-[:ROUTE*]-(WAW)
WHERE ALL(x IN FILTER(x IN NODES(p) WHERE x:Flight)
WHERE (x)<-[:FLIES_ON]-(:Date {date:"130114"}))
WITH p ORDER BY LENGTH(p) LIMIT 1
RETURN p
http://console.neo4j.org/r/xgz84y
though this would not be my preferred structure for this kind of data; in answering your question i might go this way instead. get the paths, filter the path and get the first one ordered by length.
in the console tests is runs faster than the one suggested above as the query plan is simpler.
Anyhoo i hope this at least points you in a good direction :)
MATCH (f:Airport { cd: "ltn" }),(t:Airport { cd: "waw" }), p =((f)-[r*]-(t))
WHERE ANY (x IN relationships(p)
WHERE type(x)='FLIES_ON') AND ANY (x IN nodes(p)
WHERE x.cd='130114')
RETURN p
ORDER BY length(p)
LIMIT 1
The problem is that using shortestPath or allShortestPaths will never include the Date nodes.
What you need to do is to filter the pattern with the date node (I don't know however how you store the date, so I'll take Ymd format:
MATCH (f:Airport {code: "LTN"}), (t:Airport {code: "WAW"})
MATCH p=(f)-[*]-(t)
WHERE ANY (r in rels(p) WHERE type(r) = 'FLIES_ON')
AND ANY (n in nodes(p) WHERE 'Date' IN labels(n) AND n.date = 20150120)
RETURN p
ORDER BY length(p)
LIMIT 1
Another solution and less costly, is to include the date in your match and building yourself the path with it :
MATCH (n:Date {date:20150120})
MATCH (f:Airport {code:"LTN"}), (t:Airport {code:"WAW"})
MATCH p=(f)<-[*]-(n)-[*]->(t)
RETURN distinct(p)
ORDER BY length(p)

Resources