I am searching a collection for small subgraphs using Cypher. I'm looking for the pattern A->B->C, but I also want to see instances where A, B, or C are isolated, as well as incomplete instances of the pattern, e.g. A->B or B->C.
I know I can get isolated nodes using the following for each node type
MATCH (a:A)
WHERE NOT EXISTS ((a)--())
RETURN a
and I can match my pattern with
MATCH (a:A)-[r1]->(b:B)-[r2]->(c:C)
RETURN a,b,c,r1,r2
How can I search for parts of the pattern where there is say, A->B but no follow on relation for B? Finding incomplete patterns is required.
Note: the database is in Neo4j and it is not a fully connected graph but rather a set of many independent small graphs, hence why I can find 'part' of a pattern.
If there are only three node types, the query below returns all subsets of the pattern (A)-(B)-(C): the isolated nodes A, B, and C, and the connected pairs A-B, B-C, and A-C.
MATCH (p) WHERE NOT EXISTS ((p)--()) and labels(p) in [['A'], ['B'], ['C']]
RETURN p
UNION ALL
MATCH p=(:A)--(b:B) WHERE NOT EXISTS ((b)--(:C))
RETURN p
UNION ALL
MATCH p=(b:B)--(:C) WHERE NOT EXISTS ((:A)--(b))
RETURN p
UNION ALL
MATCH p=(:A)--(:C)
RETURN p
Sample Data:
CREATE (a1:A {title: "a1"})
CREATE (a2:A {title: "a2"})
CREATE (a3:A {title: "a3"})
CREATE (b1:B {title: "b1"})
CREATE (b2:B {title: "b2"})
CREATE (b3:B {title: "b3"})
CREATE (c1:C {title: "c1"})
CREATE (c2:C {title: "c2"})
CREATE (c3:C {title: "c3"})
CREATE (a1)-[:LINKS]->(b1)
CREATE (b2)-[:LINKS]->(c2)
CREATE (a3)-[:LINKS]->(b3)-[:LINKS]->(c3);
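For the specific case raised in the question (A->B with no follow-on relationship from B), a directed variant of the second UNION branch may be enough. This is only a sketch: it assumes the direction of the relationships matters and that "no follow-on" means "no outgoing relationship from B to any C".

```cypher
// A->B pairs whose B has no outgoing relationship to a C node.
// On the sample data above, only a1->b1 qualifies, because b3 continues to c3.
MATCH (a:A)-[r1]->(b:B)
WHERE NOT EXISTS ((b)-->(:C))
RETURN a, r1, b
```

The mirror case (B->C with no incoming A) follows the same shape, with the negated pattern written as (:A)-->(b).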
The query below should work for the schema you described; you can modify it according to your needs:
MATCH p = (x)-->() WHERE (NOT (x)-->()-->() AND NOT ()-->(x)-->()) AND labels(x) IN [['A'], ['B']]
RETURN p
UNION ALL
MATCH (p) WHERE NOT (p)--()
RETURN p
You might consider using replace. I encountered a simpler problem in genealogy: x-linked inheritance. It is characterized by lack of father to son inheritance. You can find paths by concatenating the sex and then filtering out 'MM', as follows:
match p=(n:Person{RN:1})-[:father|mother*0..99]->(x)
with n,x,reduce(s='', g in nodes(p) |s + g.sex) as cs
with n,x, cs where cs=replace(cs,'MM','')
return x.RN, cs
One example of the concatenated output:
MFFFFMFF
Perhaps you can use this to accept or exclude patterns of A, B, and C?
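Applied to the A/B/C question, the same concatenation idea could look roughly like the sketch below. It assumes every node carries exactly one label (so labels(x)[0] is well defined) and that paths of at most two hops are of interest; both are assumptions, not something stated in the question.

```cypher
// Build a "signature" string from the labels along each short directed path,
// then keep only the signatures that are sub-patterns of A->B->C.
MATCH p = (n)-[*0..2]->(m)
WITH p, reduce(s = '', x IN nodes(p) | s + labels(x)[0]) AS sig
WHERE sig IN ['A', 'B', 'C', 'AB', 'BC', 'ABC']
RETURN sig, p
```

Note that this returns every matching sub-path, including the A->B and B->C pieces of a complete A->B->C chain, so an extra filter would be needed to keep only the incomplete instances.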
I want to Traverse a PATH in neo4j (preferably using Cypher, but I can write neo4j managed extensions).
Problem -
For any starting node (:Person) I want to traverse hierarchy like
(me:Person)-[:FRIEND|:KNOWS*]->(newPerson:Person)
If a :FRIEND outgoing relationship is present, the path should traverse it and ignore any :KNOWS outgoing relationships. If no :FRIEND relationship exists but a :KNOWS relationship is present, the path should traverse the :KNOWS relationship instead.
Right now the problem with above syntax is that it returns both the paths with :FRIEND and :KNOWS - I am not able to filter out a specific direction based on above requirement.
1. Example data set
For the ease of possible further answers and solutions I note my graph creating statement:
CREATE
(personA:Person {name:'Person A'})-[:FRIEND]->(personB:Person {name: 'Person B'}),
(personB)-[:FRIEND]->(personC:Person {name: 'Person C'}),
(personC)-[:FRIEND]->(personD:Person {name: 'Person D'}),
(personC)-[:FRIEND]->(personE:Person {name: 'Person E'}),
(personE)-[:FRIEND]->(personF:Person {name: 'Person F'}),
(personA)-[:KNOWS]->(personG:Person {name: 'Person G'}),
(personA)-[:KNOWS]->(personH:Person {name: 'Person H'}),
(personH)-[:KNOWS]->(personI:Person {name: 'Person I'}),
(personI)-[:FRIEND]->(personJ:Person {name: 'Person J'});
2. Scenario "Optional Match"
2.1 Solution
MATCH (startNode:Person {name:'Person A'})
OPTIONAL MATCH friendPath = (startNode)-[:FRIEND*]->(:Person)
OPTIONAL MATCH knowsPath = (startNode)-[:KNOWS*]->(:Person)
RETURN friendPath, knowsPath;
If you do not need a path to every node along the way, but only the complete path, I recommend using shortestPath() for performance reasons.
2.2 Result
Note the missing node 'Person J': it is only reachable from 'Person I' through a FRIEND relationship, which the KNOWS-only path does not follow.
3. Scenario "Expand paths"
3.1 Solution
Alternatively you could use the Expand paths functions of the APOC user library. Depending on the next steps of your process you can choose between the identification of nodes, relationships or both.
MATCH (startNode:Person {name:'Person A'})
CALL apoc.path.subgraphNodes(startNode,
{maxLevel: -1, relationshipFilter: 'FRIEND>', labelFilter: '+Person'}) YIELD node AS friendNodes
CALL apoc.path.subgraphNodes(startNode,
{maxLevel: -1, relationshipFilter: 'KNOWS>', labelFilter: '+Person'}) YIELD node AS knowsNodes
WITH
collect(DISTINCT friendNodes.name) AS friendNodes,
collect(DISTINCT knowsNodes.name) AS knowsNodes
RETURN friendNodes, knowsNodes;
3.2 Explanation
line 1: defining your start node based on the name
line 2-3: Expand from the given startNode following the given relationships (relationshipFilter: 'FRIEND>') adhering to the label filter (labelFilter: '+Person').
line 4-5: Expand from the given startNode following the given relationships (relationshipFilter: 'KNOWS>') adhering to the label filter (labelFilter: '+Person').
line 7: aggregates all nodes by following the FRIEND relationship type (omit the .name part if you need the complete node)
line 8: aggregates all nodes by following the KNOWS relationship type (omit the .name part if you need the complete node)
line 9: render the resulting groups of nodes
3.3 Result
╒═════════════════════════════════════════════╤═════════════════════════════════════════════╕
│"friendNodes" │"knowsNodes" │
╞═════════════════════════════════════════════╪═════════════════════════════════════════════╡
│["Person A","Person B","Person C","Person E",│["Person A","Person H","Person G","Person I"]│
│"Person D","Person F"] │ │
└─────────────────────────────────────────────┴─────────────────────────────────────────────┘
// A list comprehension replaces extract(), which was removed in Neo4j 4.0
MATCH p = (me:Person)-[:FRIEND|KNOWS*]->(newPerson:Person)
WITH p, [r IN relationships(p) | type(r)] AS types
RETURN p ORDER BY types ASC LIMIT 1
This is a matter of interrogating the types of outgoing relationships for each node and then making a prioritized decision on which relationships to retain leveraging some nested case logic.
Using the small graph above
// List comprehensions replace filter(), which was removed in Neo4j 4.0
MATCH path = (a)-[r:KNOWS|FRIEND]->(b)
WITH a, COLLECT([type(r), a, r, b]) AS rels
WITH a,
[el IN rels WHERE el[0] = "FRIEND"] AS friends,
[el IN rels WHERE el[0] = "KNOWS"] AS knows
// Prefer FRIEND relationships; fall back to KNOWS only when there are none
WITH a, CASE WHEN size(friends) > 0 THEN friends ELSE knows END AS search
UNWIND search AS s
RETURN s[1] AS a, s[2] AS r, s[3] AS b
I believe this returns your expected result:
Based on your logic, there should be no traversal to Person G or Person H from Person A, as there is a FRIEND relationship from Person A to Person B that takes precedence.
However there is a traversal from Person H to Person I because of the existence of the singular KNOWS relationship, and then a subsequent traversal from Person I to Person J.
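The same per-node priority can also be checked along a variable-length path from a fixed start node. The following is a sketch rather than a tested solution; it assumes the example data set above and wraps each node in a single-element list only so that it can be used as a variable inside a pattern predicate.

```cypher
MATCH p = (me:Person {name: 'Person A'})-[:FRIEND|KNOWS*]->(:Person)
WITH p, nodes(p) AS ns, relationships(p) AS rs
// Keep a path only if every hop is a FRIEND hop, or a KNOWS hop that leaves
// a node with no outgoing FRIEND relationship at all.
WHERE all(i IN range(0, size(rs) - 1)
          WHERE type(rs[i]) = 'FRIEND'
             OR none(x IN [ns[i]] WHERE (x)-[:FRIEND]->()))
RETURN p
```

Starting from 'Person A', the KNOWS hops to 'Person G' and 'Person H' are rejected because 'Person A' has an outgoing FRIEND relationship, so only paths along the FRIEND chain remain.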
I have a large graph where some of the relationships have properties that I want to use to effectively prune the graph as I create a subgraph. For example, if I have a property called 'relevance score' and I want to start at one node and sprawl out, collecting all nodes and relationships but pruning wherever a relationship has the above property.
My attempt to do so netted this query:
start n=node(15) match (n)-[r*]->(x) WHERE NOT HAS(r.relevance_score) return x, r
My attempt has two issues I cannot resolve:
1) On reflection, I believe this will not result in a pruned graph but rather in a collection of disjoint graphs. Additionally:
2) I am getting the following error from what looks to be a correctly formed cypher query:
Type mismatch: expected Any, Map, Node or Relationship but was Collection<Relationship> (line 1, column 52 (offset: 51))
"start n=node(15) match (n)-[r*]->(x) WHERE NOT HAS(r.relevance_score) return x, r"
You should be able to use the ALL() function on the collection of relationships to enforce that for all relationships in the path, the property in question is null.
Using Gabor's sample graph, this query should work.
MATCH p = (n {name: 'n1'})-[rs1*]->()
WHERE ALL(rel in rs1 WHERE rel.relevance_score is null)
RETURN p
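If the goal is the pruned subgraph itself rather than the individual paths, the qualifying paths can be flattened into distinct nodes and relationships. This is a minimal sketch assuming the same sample graph; the column names prunedNodes and prunedRels are made up for illustration.

```cypher
MATCH (n {name: 'n1'})-[rs*]->(x)
WHERE all(rel IN rs WHERE rel.relevance_score IS NULL)
// Every kept path starts at n and contains no scored relationship,
// so the collected pieces stay connected to the start node.
UNWIND rs AS rel
RETURN collect(DISTINCT x) AS prunedNodes,
       collect(DISTINCT rel) AS prunedRels
```

Because a path is discarded as soon as any of its relationships carries relevance_score, nothing beyond a scored relationship is collected, which addresses the concern about ending up with disjoint graphs.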
One solution that I can think of is to go through all relationships (with rs*), keep only the ones without the relevance_score property, and check whether the rs "path" is still the same. (I quoted "path" because technically it is not a Neo4j path.)
I created a small example graph:
CREATE
(n1:Node {name: 'n1'}),
(n2:Node {name: 'n2'}),
(n3:Node {name: 'n3'}),
(n4:Node {name: 'n4'}),
(n5:Node {name: 'n5'}),
(n1)-[:REL {relevance_score: 0.5}]->(n2)-[:REL]->(n3),
(n1)-[:REL]->(n4)-[:REL]->(n5)
The graph contains a single relevant edge, between nodes n1 and n2.
The query (note that I used {name: 'n1'} to get the start node, you might use START node=...):
MATCH (n {name: 'n1'})-[rs1*]->(x)
UNWIND rs1 AS r
WITH n, rs1, x, r
WHERE r.relevance_score IS NULL
WITH n, rs1, x, collect(r) AS rs2
WHERE rs1 = rs2
RETURN n, x
The results:
╒══════════╤══════════╕
│n │x │
╞══════════╪══════════╡
│{name: n1}│{name: n4}│
├──────────┼──────────┤
│{name: n1}│{name: n5}│
└──────────┴──────────┘
Update: see InverseFalcon's answer for a simpler solution.
Using Neo4J and Cypher:
Given the diagram below, I want to be able to start at node 'A' and get all the children that have a 'ChildOf' relationship with 'A', but not an 'InactiveChildOf' relationship. So, in this example, I would get back A, C and G. Also, a node can get a new parent ('H' in the diagram) and if I ask for the children of 'H', I should get B, D and E.
I have tried
match (p:Item{name:'A'}) -[:ChildOf*]-(c:Item) where NOT (p)-[:InactiveChildOf]-(c) return p,c
however, that also returns D and E.
Also tried:
match (p:Item{name:'A'}) -[rels*]-(c:Item) where None (r in rels where type(r) = 'InactiveChildOf') return p,c
But that returns all.
Hopefully, this is easy for Neo4J and I am just missing something obvious. Appreciate the help!
Example data:
MERGE (a:Item {name:'A'})
MERGE (b:Item {name:'B'})
MERGE (c:Item {name:'C'})
MERGE (d:Item {name:'D'})
MERGE (e:Item {name:'E'})
MERGE (f:Item {name:'F'})
MERGE (g:Item {name:'G'})
MERGE (h:Item {name:'H'})
MERGE (b)-[:ChildOf]->(a)
MERGE (b)-[:InactiveChildOf]->(a)
MERGE (c)-[:ChildOf]->(a)
MERGE (d)-[:ChildOf]->(b)
MERGE (e)-[:ChildOf]->(b)
MERGE (f)-[:ChildOf]->(c)
MERGE (f)-[:InactiveChildOf]->(c)
MERGE (g)-[:ChildOf]->(c)
MERGE (b)-[:ChildOf]->(h)
Note, I understand that I could simply put an "isActive" property on the ChildOf relationship or remove the relationship, but I am exploring options and trying to understand if this concept would work.
If the query is interpreted as "find all nodes reachable by a path in which no node is connected to the previous node by InactiveChildOf", it might look something like this:
match path = (p:Item{name:'A'})<-[:ChildOf*]-(c:Item)
with nodes(path) as nds
unwind range(0,size(nds)-2) as i
with nds,
nds[i] as i1,
nds[i+1] as i2
where not (i1)-[:InactiveChildOf]-(i2)
with nds,
count(i1) as test
where test = size(nds)-1
return head(nds),
last(nds)
Update: I think that this version is better (check that between two nodes there is no path that will contain at least one non-active type of relationship):
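// relationships() and a list comprehension replace rels() and filter(),
// both of which were removed in Neo4j 4.0
match path = (p:Item {name:'A'})<-[:ChildOf|InactiveChildOf*]-(c)
with p, c,
collect( [ r in relationships(path)
where type(r) = 'InactiveChildOf'
]
) as test
where all( t in test where size(t) = 0 )
return p, c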
By reading and examining the graph, correct me if I'm wrong but the actual text representation of the cypher query should be
Find me nodes in a path to A, all nodes in that path cannot have an outgoing
InactiveChildOf relationship.
So, in Cypher it would be :
MATCH p=(i:Item {name:"A"})<-[:ChildOf*]-(x)
WHERE NONE( x IN nodes(p) WHERE (x)-[:InactiveChildOf]->() )
UNWIND nodes(p) AS n
RETURN distinct n
Which returns
Consider the following DB structure:
For your convenience, you can create it using:
create (p1:Person {name: "p1"}),
(p2:Person {name: "p2"}),
(p3:Person {name: "p3"}),
(e1:Expertise {title: "Exp1"}),
(e2:Expertise {title: "Exp2"}),
(e3:Expertise {title: "Exp3"}),
(p1)-[r1:Expert]->(e1),
(p1)-[r2:Expert]->(e2),
(p2)-[r3:Expert]->(e2),
(p3)-[r4:Expert]->(e3),
(p2)-[r5:Expert]->(e3)
I want to be able to find all Person nodes that are not related to a specific Expertise node, e.g. "Exp2"
I tried
MATCH (p:Person)--(e:Expertise)
WHERE NOT (e.title = "Exp2")
RETURN p
But it returns all the Person nodes (while I expected it to return only p3).
Logically, this result makes sense because each of these nodes is related to at least one Expertise that is not Exp2.
But what I want is to find all the Person nodes that are not related to Exp2, even if they are related to other nodes as well.
How can this be done?
Edit
It appears that I wasn't clear on the requirements. This is a (very) simplified way of presenting my problem with a much more complicated DB.
Consider the possibility that Expertise has more properties which I would like to use in the same query (not necessarily with negation). For example:
MATCH (p)--(e)
WHERE e.someProp > 5 AND e.anotherProp = "cookie" AND NOT e.title = "Exp2"
UPDATE
You need to restrict it a bit more, that is, match the Person nodes and the specific Expertise node separately:
MATCH (p:Person), (e:Expertise {title: "Exp2"})
WHERE NOT (p)-[]->(e)
RETURN p
I think you will be just fine with the <> operator :
MATCH (p:Person)--(e:Expertise)
WHERE e.title <> "Exp2"
RETURN p
Or you can express it in a pattern :
MATCH (p:Person)
WHERE NOT EXISTS((p)--(:Expertise {title:"Exp2"}))
RETURN p
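To see why filtering individual (p, e) rows is not enough, it can help to move the test to the level of the whole person. The sketch below assumes the sample data from the question (the Expert relationship type and the title property) and collects each person's expertise titles before filtering.

```cypher
MATCH (p:Person)
OPTIONAL MATCH (p)-[:Expert]->(e:Expertise)
// One row per person, with all of that person's titles in a list
WITH p, collect(e.title) AS titles
WHERE NOT ('Exp2' IN titles)
RETURN p.name, titles
```

On the sample data, p1 and p2 both have 'Exp2' in their lists and are dropped, leaving only p3. The OPTIONAL MATCH also keeps persons with no expertise at all, since collect() then yields an empty list.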
A small change to the query from @ChristopheWillemsen:
MATCH (e:Expertise) WHERE e.someProperty > 5 AND NOT e.title = someValue
WITH collect(e) as es
MATCH (p:Person) WHERE all(e in es WHERE NOT Exists( (p)--(e) ) )
RETURN p
UPDATE:
// Collect the `Expertise` nodes that satisfy the following conditions:
MATCH (e:Expertise) WHERE e.num > 3 AND e.title = 'Exp2'
WITH collect(e) as es
// Select the persons who are not connected to any expertise in the `es` set:
OPTIONAL MATCH (p:Person) WHERE all(e in es WHERE NOT Exists( (p)--(e) ) )
RETURN es, collect(p)
Another query with some optimization:
// Get the set of `Expertise` nodes that satisfy the following conditions:
MATCH (e:Expertise) WHERE e.num > 3 AND e.title = 'Exp2'
// Collect all `Person` nodes connected to a node from that `Expertise` set:
OPTIONAL MATCH (e)--(p:Person)
WITH collect(e) as es, collect(distinct id(p)) as eps
// Get all `Person` nodes that are not in the `eps` set:
OPTIONAL MATCH (p:Person) WHERE NOT id(p) IN eps
RETURN es, collect(p)
Suppose I have the following relationships stored in Neo4j:
A->B,A->D,C->B,C->E
Here A, C are of same label nodes and B, E also are of same label nodes.
What is the cypher query to count how many nodes A and C have in common?
Based on that, I want to create a relationship between A and C. I would like to add a relationship rank between them and give it some value, say 0.5, because they have 1 node in common. What would that query look like?
To return the number of common nodes between A and C, match a pattern that has A at one end and C at the other, with an intermediary node between them. Then count the occurrences of the intermediary node.
match (:TypeOne {name: 'A'})--(common)--(:TypeOne {name: 'C'})
return count(common)
If you want to create a relationship directly between A and C as a result of the match then use merge or create with the A and C nodes. And use set to add a value to the newly created relationship.
Something like this should satisfy your requirements.
match (a:TypeOne {name: 'A'})--(common)--(c:TypeOne {name: 'C'})
with a, c, count(common) as in_common
merge (a)-[rel:COMMON_WITH]->(c)
set rel.value = in_common * 0.5
return *
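If the same ranking is wanted for every pair of nodes rather than just A and C, the match can be generalized. This is a sketch under two assumptions: the nodes share the TypeOne label used above, and name is unique per node, so that comparing names keeps each pair only once.

```cypher
MATCH (a:TypeOne)--(common)--(c:TypeOne)
// Keep each unordered pair once and never pair a node with itself
WHERE a.name < c.name
WITH a, c, count(DISTINCT common) AS in_common
MERGE (a)-[rel:COMMON_WITH]->(c)
SET rel.value = in_common * 0.5
RETURN a.name, c.name, rel.value
```

Using MERGE rather than CREATE means rerunning the query updates the existing COMMON_WITH relationship instead of adding a duplicate.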