Neo4j Version: 3.0.4
Objective of the below query is to eliminate duplicate bus service and bustop in a path, it work fine if i didn't provide the relationship count -[r:CONNECTSWITH]-> but if the relationship count defined -[r:CONNECTSWITH*..3]-> ,then its throwing
Key not found: r
Working:
OPTIONAL MATCH p=(o:PORT{name:"busstop1"})-[r:CONNECTSWITH]->(d:PORT{name:"busstop2"})
WHERE ALL(r1 IN rels(p)
WHERE 1 = size(filter(r2 IN rels(p) WHERE (r1.service = r2.service))))
AND ALL(n IN nodes(p) WHERE 1 = size(filter(m IN nodes(p) WHERE id(m) = id(n))))
RETURN p
LIMIT 10
Not Working:
OPTIONAL MATCH p=(o:PORT{name:"busstop1"})-[r:CONNECTSWITH*..3]->(d:PORT{name:"busstop2"})
WHERE ALL(r1 IN rels(p)
WHERE 1 = size(filter(r2 IN rels(p) WHERE (r1.service = r2.service))))
AND ALL(n IN nodes(p) WHERE 1 = size(filter(m IN nodes(p) WHERE id(m) = id(n))))
RETURN p
LIMIT 10
Work around Solution:
OPTIONAL MATCH p=(o:PORT{name:"busstop1"})-[r:CONNECTSWITH*..3]->(d:PORT{name:"busstop2"})
WHERE ALL(r1 in rels(p)
WHERE 1 = size(filter(r2 IN rels(p) WHERE (r1.service = r2.service)))) =
ALL(n IN nodes(p) WHERE 1 = size(filter(m IN nodes(p) WHERE id(m) = id(n))))
AND ALL(r1 in rels(p)
WHERE 1 = size(filter(r2 IN rels(p) WHERE (r1.service = r2.service))))
RETURN p
LIMIT 10
Aside: This feels like a neo4j bug. If you are encountering this with the latest neo4j version, you may want to submit a neo4j issue.
As a possible workaround, since the query does not actually use the r identifier, try removing it from the query.
Related
I am working on a query on the movie database to check the shortest paths between n nodes. In this simplified example, we want all the shortest paths between 2 movies:
match p=allShortestPaths((n)-[*]-(m)) where id(n) = 87 and id(m) = 121
return p;
Now I want to have all the shortest paths that don't include Keanu Reeves in it.
I tried this:
match p=allShortestPaths((n)-[*]-(m)) where id(n) = 87 and id(m) = 121 and NONE(n in nodes(p) where n.name = "Keanu Reeves")
return p;
This however takes an eternity to load, even after I have indexed the name field of Person...
Then In tried the following:
match p=allShortestPaths((n)-[*]-(m)) where id(n) = 87 and id(m) = 121
with p WHERE NONE(n in nodes(p) where n.name = "Keanu Reeves")
return p;
This however gives me no results. I misinterpreted this by thinking it would just return those paths which don't have Keanu Reeves in between:
(no changes, no records)
I tested if I could get only those with Keanu Reeves in between with the any() function. This works perfectly:
match p=allShortestPaths((n)-[*]-(m)) where id(n) = 87 and id(m) = 121
with p WHERE ANY(n in nodes(p) where n.name = "Keanu Reeves")
return p;
What is the best approach to tackle this? I must say that my production query is way more complex than this, but it all boils down to this problem. It has to be a performant solution.
The problem is that if one of the nodes from the path does not have the property name, then the entire check will not be passed.
So or do we check for the existence of a property:
match p=allShortestPaths((n)-[*]-(m))
where
n.title = 'The Replacements' and
m.title = 'Speed Racer' and
NONE(n in nodes(p) where EXISTS(n.name) and n.name = "Keanu Reeves")
return p
Or use the COALESCE function:
match p=allShortestPaths((n)-[*]-(m))
where
n.title = 'The Replacements' and
m.title = 'Speed Racer' and
NONE(n in nodes(p) where COALESCE(n.name, '') = "Keanu Reeves")
return p
Let's say, I have a path A->B->C->D and the relationships have a property val.
Now, I have to pick any two nodes from the path and if the rel.val>0.8
and if it is true for all the pair of nodes, then return the path
Ex:
P = A-->B-->C-->D
All nodes = [A,B,C,D]
return p if{
rel.val of (A,B) >0.8
rel.val of (A,C) >0.8
rel.val of (A,D) >0.8
rel.val of (B,C) >0.8
rel.val of (B,D) >0.8
rel.val of (C,D) >0.8
}
Here is my query, (of course the query is wrong):
MATCH p=(a{word:"quality"})-[r*1..2]->(b)
WHERE NONE (n IN nodes(p) WHERE size(filter(x IN nodes(p) WHERE n = x))> 1)
MATCH q = (a)-[r:coocr]->(b) where a in nodes(p) AND b in nodes(p) AND NOT b = a AND None(rel IN rels(q) WHERE rel.val < 0.8 )
RETURN p
In summary, you want to MATCH a path and then make sure that all pairs of nodes in your path are connected by a relationship which fullfills a certain criterion (rel.val > 0.8).
Interesting question, I think this is not really straightforward. Maybe I am overlooking something obvious?
Here is an idea how to approach the problem. You first MATCH your path, then MATCH between all nodes in the path and count the number of relationships with rel.val > 0.8. This number has to be the size of the factorial of the number of nodes (num relationships == (num nodes)!, number of possible combinations of 2).
The following query returns the number of relationships, but I don't know how to compare this to the factorial of the number of nodes:
// match your path like before
MATCH p=(a:Uselabel {word:"quality"})-[r:USETYPE*1..2]->(b)
// use unwind to get the nodes from the path
UNWIND nodes(path) AS x
// do this twice to match the nodes onto themselves
UNWIND nodes(path) AS y
// match your relationship
MATCH (x)-[rel:USETYPE]-(y)
// criterion for your relationship
WHERE rel.val > 0.8
// only if two different nodes
WHERE x <> y
// get the count of pairs
WITH p, count(DISTINCT rel) AS num_pairs
// now I don't know how to get/compare the factorial of the number of nodes :)
RETURN num_pairs
I didn't find a built-in function for the factorial, so you have to look into this.
I am running the following query that is meant to compare two collections nodes set1 and set2. All nodes in set2 are in set1, and I would like to identify all the nodes in set1 that are NOT in set2. However, the query returns a set of nodes that includes some of the nodes in set1. I am running this query on v2.1.7. Suggestions?
Query:
MATCH p=(a:ObjectConcept{sctid:233604007})<-[:ISA*]-(b:ObjectConcept)
with nodes(p) as set1, p
MATCH q=(a:ObjectConcept{sctid:34020007})<-[:ISA*]-(b:ObjectConcept)
with nodes(q) as set2,set1, p
WHERE ALL(x in set2 WHERE NOT x in set1)
with nodes(p) as pneumo
UNWIND pneumo AS pneumolist
RETURN distinct pneumolist.FSN,pneumolist.sctid
Alternative query, same result:
Query:
MATCH p=(a:ObjectConcept{sctid:233604007})<-[:ISA*]-(b:ObjectConcept)
with nodes(p) as set1, p
MATCH q=(a:ObjectConcept{sctid:34020007})<-[:ISA*]-(b:ObjectConcept)
with nodes(q) as set2,set1, p
WHERE NONE(x in set2 WHERE x in set1)
with nodes(p) as pneumo
UNWIND pneumo AS pneumolist
RETURN distinct pneumolist.FSN,pneumolist.sctid
Your matches don't return just one row as you might expect but many rows,
and your comparison is done between the cross product of those many row combinations. You probably want to create a set for each of your two subtrees first with a combination of unwind + collect(distinct)
The code below will not be as fast, as cypher internally doesn't have a Set concept yet.
try this
MATCH p=(a:ObjectConcept{sctid:233604007})<-[:ISA*]-(b:ObjectConcept)
unwind nodes(p) as n
with collect(distinct n) as set1
MATCH q=(a:ObjectConcept{sctid:34020007})<-[:ISA*]-(b:ObjectConcept)
unwind nodes(q) as m
with collect(distinct m) as set2
WHERE NONE(x in set2 WHERE x in set1)
UNWIND set1 AS pneumolist
RETURN distinct pneumolist.FSN,pneumolist.sctid
The following query was successful, and addresses Michael's discussion regarding cross products (above).
MATCH p=(a:ObjectConcept{sctid:233604007})<-[:ISA*]-(b:ObjectConcept)
with distinct nodes(p) as set1
UNWIND set1 as x1
with collect(DISTINCT x1) as set11
MATCH q=(a:ObjectConcept{sctid:34020007})<-[:ISA*]-(b:ObjectConcept)
with distinct nodes(q) as set2,set11
UNWIND set2 as x2
with collect(distinct x2) as set22,set11
with REDUCE(pneumo=[],x in set11|case when x in set22 then pneumo else pneumo
+ [x] END) AS pneumo
return pneumo
Given the following query
match (a)-->(b)-->(c)
where id(a) = 0
return b.name, collect(c) as cs
How it's possible to return a collection composed of only a couple of fields, instead of the whole nodes?
A single field is easy:
match (a)-->(b)-->(c)
where id(a) = 0
return b.name, collect(c.fieldName) as cs
For multiple field names, maybe concatenate them?
match (a)-->(b)-->(c)
where id(a) = 0
return b.name, collect(c.fieldName1 + delimiter + c.fieldName2) as cs
match (a)-->(b)-->(c)
where id(a) = 0
return b.name, collect( { field1: c.field1, field2: c.field2 } ) as cs
I want to find all paths given a start node
MATCH path=(n)-[rels*1..10]-(m)
with the following 2 conditions on path inlcusion:
true if relationship between subsequent nodes in path has property PROP='true'
if type(relationship)=SENDS then true if direction of the relationship is outgoing (from one path node to the next node in the path)
Another way of phrasing this is that direction doesn't matter unless the relationship name is SENDS
I can do condition 1 with WHERE ALL (r IN rels WHERE r.PROP='true') however ive no idea how to do condition 2.
The only way I can think of to filter on relationship direction without declaring direction in the match pattern is by comparing the start node of each relationship in the path with the node at the corresponding index of the nodes() collection from the path. For this you need the relationship and node collections from the path, an index counter and some boolean evaluation equivalent to ALL(). One way to do it is to use REDUCE with a collection for the accumulator, so you can accumulate index and maintain a true/false value for the path at the same time. Here's an example, the accumulator starts at [0,1] where the 0 is the index for testing that startNode(r) equals the node at the corresponding index in the node collection (i.e. it's an outgoing relationship) and the 1 represents true, which signifies that the path has not yet failed your conditions. For each relationship the index value is incremented and the CASE/WHEN clause multiplies the 'boolean' with 1 if your conditions are satisfied, with 0 if not. The evaluation of the path is then the evaluation of the second value in the collection returned by REDUCE -- if 1 then yay, if 0 then boo.
MATCH path = (n)-[*1..10]-(m)
WITH path, nodes(path) as ns, relationships(path) as rs
WHERE REDUCE(acc = [0,1], r IN rs |
[acc[0]+1, CASE WHEN
r.PROP='true' AND
(type(r) <> "SENDS" OR startNode(r) = ns[acc[0]]) THEN acc[1]*1 ELSE acc[1]*0 END]
)[1] = 1
RETURN path
or maybe this is more readable
WHERE REDUCE(acc = [0,1], r IN rs |
CASE WHEN
r.PROP=true AND
(type(r) <> "SENDS" OR startNode(r) = ns[acc[0]])
THEN [acc[0]+1, acc[1]*1]
ELSE [acc[0]+1, acc[1]*0]
END
)[1] = 1
Here's a console: http://console.neo4j.org/?id=v3kgz9
For completeness I've answered the question using jjaderberg correct solution plus a condition to fix the start node and ensure that no zero length paths are included
MATCH p = (n)-[*1..10]-(m)
WHERE ALL(n in nodes(p) WHERE 1=length(filter(m in nodes(p) WHERE m=n)))
AND (id(n)=1)
WITH p, nodes(p) as ns, relationships(p) as rs
WHERE REDUCE(acc = [0,1], r IN rs | [acc[0]+1,
CASE WHEN r.PROP='true' AND (type(r) <> "SEND" OR startNode(r) = ns[acc[0]])
THEN acc[1]*1
ELSE acc[1]*0
END])[1] = 1
RETURN nodes(p);
Or my alternative answer based on jjaderbags answer but does not use accumulator but which is slightly slower
MATCH p=(n)-[rels*1..10]-(m)
WHERE ALL(n in nodes(p) WHERE 1=length(filter(m in nodes(p) WHERE m=n)))
AND( ALL (r IN rels WHERE r.PROP='true')
AND id(n)=1)
WITH p, range(0,length(p)-1) AS idx, nodes(p) as ns, relationships(p) as rs
WHERE ALL (i in idx WHERE
CASE type(rs[i])='SEND'
WHEN TRUE THEN startnode(rs[i])=ns[i]
ELSE TRUE
END)
RETURN nodes(p);