I have this sample data
With the sample query
CREATE (a1:A {title: "a1"})
CREATE (a2:A {title: "a2"})
CREATE (a3:A {title: "a3"})
CREATE (b1:B {title: "b1"})
CREATE (b2:B {title: "b2"})
MATCH (a:A {title: "a1"}), (b:B {title: "b1"})
CREATE (a)-[r:LINKS]->(b)
MATCH (a:A), (b:B) return a,b
What I am trying to achieve:
Find all nodes of type A that are not connected to a node of type B (ans: a2, a3)
Find all nodes of type B that are not connected to a node of type A (ans: b2)
Both of these requirements are expected to be bi-directional and to use the same query template.
Where I have reached
Get all A not connected to B: gets me a2 and a3 as expected
MATCH path=(a:A)-[r]-(b:B)
WHERE (a)-[r]-(b)
WITH collect(a) as al
MATCH (c:A)
WHERE not c IN al
RETURN c
Get all disconnected B: I get both b1 and b2, which is incorrect, and printing "al" revealed that the list is empty
MATCH path=(b:B)-[r]-(a:A)
WHERE (b)-[r]-(a)
WITH collect(b) as al
MATCH (c:B)
WHERE not c IN al
RETURN c
Somehow,
WHERE (b)-[r]-(a) != WHERE (a)-[r]-(b)
even though I left the direction unspecified, which should make the pattern bi-directional.
If I change it to WHERE (a)-[r]-(b) in the second query then it works, but I want a generic bi-directional query.
Use the path pattern in where:
MATCH (a:A) WHERE NOT (a)-[:LINKS]-(:B)
RETURN a;
MATCH (b:B) WHERE NOT (b)-[:LINKS]-(:A)
RETURN b;
Or combine into one query:
OPTIONAL MATCH (a:A) WHERE NOT (a)-[:LINKS]-(:B)
WITH collect(a) AS aNodes
OPTIONAL MATCH (b:B) WHERE NOT (b)-[:LINKS]-(:A)
WITH aNodes,
collect(b) AS bNodes
RETURN aNodes, bNodes
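Since the question asks for one query template that works in both directions, a possible sketch is to pass the two labels as parameters and test them with labels(). The parameter names $from and $to are assumptions made for illustration, and the NOT EXISTS { ... } subquery form assumes Neo4j 4.0 or later. Because the label is checked in WHERE rather than in the pattern, this likely scans every node, so treat it as a sketch for small graphs rather than a tuned query.

```cypher
// $from and $to are hypothetical parameters, e.g. {from: "A", to: "B"}
MATCH (n)
WHERE $from IN labels(n)
  AND NOT EXISTS {
    // undirected pattern, so the direction of LINKS does not matter
    MATCH (n)-[:LINKS]-(m)
    WHERE $to IN labels(m)
  }
RETURN n;
```

With {from: "A", to: "B"} this should return a2 and a3 for the sample data, and swapping the two values should return b2.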
Update: why does the original query produce an incorrect result?
I think this is a bug. The problem is that when you use a relationship variable in a pattern inside WHERE, the pattern is implicitly treated as directed from left to right, even if no direction is specified:
// Will return 0, but for test data should return 1
MATCH (b:B)-[r]-(a:A) WHERE (b)-[r]-(a)
RETURN COUNT(*);
// Will return 1
MATCH (b:B)-[r]-(a:A) WHERE (b)<-[r]-(a)
RETURN COUNT(*);
// Will return 1
MATCH (b:B)-[r]-(a:A) WHERE (b)--(a)
RETURN COUNT(*);
// Will return 1
MATCH (b:B)-[r]-(a:A) WHERE (a)-[r]-(b)
RETURN COUNT(*);
Related
I am searching a collection for small subgraphs using Cypher. I'm looking for the pattern A->B->C, but I also want to see instances where A, B, or C are isolated, as well as subsets of the pattern, i.e. instances where the pattern is incomplete, e.g. A->B or B->C.
I know I can get isolated nodes using the following for each node type
MATCH (a:A)
WHERE NOT EXISTS ((a)--())
RETURN a
and I can match my pattern with
MATCH (a:A)-[r1]->(b:B)-[r2]->(c:C)
RETURN a,b,c,r1,r2
How can I search for parts of the pattern where there is say, A->B but no follow on relation for B? Finding incomplete patterns is required.
Note: the database is in Neo4j and it is not a fully connected graph but rather a set of many independent small graphs, hence why I can find 'part' of a pattern.
If there are only three node types, then the query below will give you all subsets of the pattern (A)-(B)-(C). That is, isolated nodes A, B, C, and connected pairs A-B, B-C and A-C.
MATCH (p) WHERE NOT EXISTS ((p)--()) and labels(p) in [['A'], ['B'], ['C']]
RETURN p
UNION ALL
MATCH p=(:A)--(b:B) WHERE NOT EXISTS ((b)--(:C))
RETURN p
UNION ALL
MATCH p=(b:B)--(:C) WHERE NOT EXISTS ((:A)--(b))
RETURN p
UNION ALL
MATCH p=(:A)--(:C)
RETURN p
Sample Data:
CREATE (a1:A {title: "a1"})
CREATE (a2:A {title: "a2"})
CREATE (a3:A {title: "a3"})
CREATE (b1:B {title: "b1"})
CREATE (b2:B {title: "b2"})
CREATE (b3:B {title: "b3"})
CREATE (c1:C {title: "c1"})
CREATE (c2:C {title: "c2"})
CREATE (c3:C {title: "c3"})
CREATE (a1)-[:LINKS]->(b1)
CREATE (b2)-[:LINKS]->(c2)
CREATE (a3)-[:LINKS]->(b3)-[:LINKS]->(c3);
The query below should work for the schema you described; you can modify it according to your needs:
MATCH p = (x)-->() WHERE (NOT (x)-->()-->() AND NOT ()-->(x)-->()) AND labels(x) IN [['A'], ['B']]
RETURN p
UNION ALL
MATCH (p) WHERE NOT (p)--()
RETURN p
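Another way to look at incomplete patterns, sketched here under the assumption that B is always the middle of the pattern and that relationships point A->B->C as in the sample data, is to anchor on each B and make both neighbours optional. Missing parts of the pattern then show up as null in the corresponding column.

```cypher
// One row per (a, b, c) combination around each B; absent neighbours are null
MATCH (b:B)
OPTIONAL MATCH (a:A)-->(b)
OPTIONAL MATCH (b)-->(c:C)
RETURN a.title AS a, b.title AS b, c.title AS c;
```

For the a1..c3 sample data this should give the rows (a1, b1, null), (null, b2, c2) and (a3, b3, c3). Isolated A and C nodes are not covered by this sketch and would still need a NOT (n)--() style query.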
You might consider using replace. I encountered a simpler problem in genealogy: x-linked inheritance. It is characterized by lack of father to son inheritance. You can find paths by concatenating the sex and then filtering out 'MM', as follows:
match p=(n:Person{RN:1})-[:father|mother*0..99]->(x)
with n,x,reduce(s='', g in nodes(p) |s + g.sex) as cs
with n,x, cs where cs=replace(cs,'MM','')
return x.RN, cs
One example of the concatenated output:
MFFFFMFF
Perhaps you can use this to accept or exclude patterns of A, B, and C?
So I have a query along the lines of MATCH (g:Gene)-[r]-() RETURN DISTINCT type(r), count(r) which returns a breakdown table of the number of in/out-going edges from a gene node, per relationship type.
I want to do this on a number of nodes and instead of doing it one table at a time, it would be awesome to just return a table with relationship types in one column and counts per gene on the subsequent ones.
MATCH (g:Gene {name: "G1"})-[r]-(n)
RETURN DISTINCT type(r), count(r) as g1
UNION ALL MATCH (g:Gene {name: "G2"})-[r]-(n)
RETURN DISTINCT type(r), count(r) as g2
This doesn't work due to a syntax error: All sub queries in an UNION must have the same column names (line 3, column 1 (offset: 108)). This is likely due to the fact that some genes don't have all the relationship types that others have.
If I do the following:
MATCH (g:Gene {name: "G1"})-[r]-(n)
RETURN DISTINCT type(r), null as g2, count(r) as g1
UNION ALL MATCH (g:Gene {name: "G2"})-[r]-(n)
RETURN DISTINCT type(r), null as g1, count(r) as g2
then I get duplicate rows for relationship types, where it's null for g1 in one and for g2 in the other.
What am I misunderstanding here?
The UNION error occurs because the column names do not match. Changing it to
MATCH (g:Gene {name: "G1"})-[r]-(n)
RETURN DISTINCT type(r) as type, count(r) as count
UNION ALL
MATCH (g:Gene {name: "G2"})-[r]-(n)
RETURN DISTINCT type(r) as type , count(r) as count
will work.
You can use UNWIND, and you need to return the node (g here). Note that DISTINCT may only appear directly after RETURN, and it is redundant once count() aggregates the rows:
UNWIND ["G1", "G2"] AS name
MATCH (g:Gene {name: name})-[r]-(n)
RETURN g, type(r) AS type, count(r) AS count
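To get the layout the question actually asks for (one row per relationship type, one count column per gene), a minimal sketch is conditional aggregation with CASE. The gene names "G1" and "G2" and the column aliases g1 and g2 come from the question; the columns have to be written out by hand, because plain Cypher cannot create column names dynamically.

```cypher
// Each relationship contributes 1 to the column of the gene it belongs to
MATCH (g:Gene)-[r]-()
WHERE g.name IN ["G1", "G2"]
RETURN type(r) AS type,
       sum(CASE g.name WHEN "G1" THEN 1 ELSE 0 END) AS g1,
       sum(CASE g.name WHEN "G2" THEN 1 ELSE 0 END) AS g2;
```

A relationship type that only one gene has still gets a single row, with 0 in the other gene's column, which avoids the duplicate null rows produced by the UNION attempt.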
I have the labels A, B, and Z, where A and B each have their own relationships to Z. I use the query
MATCH (a:A)
MATCH (b:B { uuid: {id} })
MATCH (a)-[:rel1]->(z:Z)<-[:rel2]-(b)
WITH a, COLLECT(z) AS matched_z
RETURN DISTINCT a, matched_z
This returns the A nodes and, for each of them, all the Z nodes that have a relationship to both A and B.
I'm stuck on trying to ALSO return a separate array of the Z nodes that are related to B but not to A (i.e. missing_z). I am attempting to do an initial query to return all the relationships between B & Z
results = MATCH (b:B { uuid: {id} })
MATCH (b)-[:rel2]->(z:Z)
RETURN DISTINCT COLLECT(z.uuid) AS z
MATCH (a:A)
MATCH (b:B { uuid: {id} })
MATCH (a)-[:rel1]->(z:Z)<-[:rel2]-(b)
WITH a, COLLECT(z) AS matched_z, z
RETURN DISTINCT a, matched_z, filter(skill IN z.array WHERE NOT z.uuid IN {results}) AS missing_z
The results seem to have nil for missing_z where one would assume it should be populated. Not sure if filter is the correct way to go with a WHERE NOT / IN scenario. Can the above 2 queries be combined into 1?
The hard part here, in my opinion, is that any failed match will drop everything you have matched so far. But your starting point seems to be "All Z related by B.uuid", so start by collecting that and filtering/copying from there.
Use WITH + aggregation functions to copy and filter columns.
Use OPTIONAL MATCH if a failure to match shouldn't drop already collected rows.
If I understand what you are trying to do well enough, this Cypher should do the job; just adjust it as needed (let me know if you need help understanding or adapting any part of it).
// Match base set
MATCH (z:Z)<-[:rel2]-(b:B { uuid: {id} })
// Collect into single list
WITH COLLECT(z) as zs
// Match all A (ignore relation to Zs)
MATCH (a:A)
// For each a, return a, the sub-list of Zs related to a, and the sub-list of Zs not related to a
RETURN a as a, FILTER(n in zs WHERE (a)-[:rel1]->(n)) as matched, FILTER(n in zs WHERE NOT (a)-[:rel1]->(n)) as unmatched
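The FILTER() function and the {id} parameter syntax used above were removed in Neo4j 4.0. On newer versions, the same idea can be sketched with list comprehensions and a $id parameter; this is an adaptation under that assumption, not a tested drop-in replacement.

```cypher
// Collect all Z nodes related to the given B
MATCH (z:Z)<-[:rel2]-(b:B {uuid: $id})
WITH collect(z) AS zs
// For each A, split zs into the Z nodes it is related to and the rest
MATCH (a:A)
RETURN a,
       [n IN zs WHERE (a)-[:rel1]->(n)] AS matched,
       [n IN zs WHERE NOT (a)-[:rel1]->(n)] AS unmatched;
```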
This query might do what you want:
MATCH (z:Z)<-[:rel2]-(b:B { uuid: {id} })
WITH COLLECT(z) as all_zs
UNWIND all_zs AS z
MATCH (a)-[:rel1]->(z)
WITH all_zs, COLLECT(DISTINCT z) AS matched_zs
RETURN matched_zs, apoc.coll.subtract(all_zs, matched_zs) AS missing_zs;
It first stores in the all_zs variable all the Z nodes that have a rel2 relationship from b. This collection's contents remain unaffected even if the second MATCH clause matches a subset of those Z nodes.
It then stores in matched_zs the distinct all_zs nodes that have a rel1 relationship from any A node.
Finally, it returns:
the matched_zs collection, and
the unique nodes from all_zs that are not also in matched_zs, as missing_zs.
The query uses the convenient APOC function apoc.coll.subtract to generate the latter return value.
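If APOC is not installed, the subtraction can be approximated with a plain list comprehension. The sketch below also swaps the second MATCH for an OPTIONAL MATCH, so that a result row is still returned when none of the Z nodes has a rel1 relationship; it assumes Neo4j 4.0 or later for the $id parameter syntax.

```cypher
MATCH (z:Z)<-[:rel2]-(:B {uuid: $id})
WITH collect(z) AS all_zs
UNWIND all_zs AS z
OPTIONAL MATCH (a:A)-[:rel1]->(z)
// CASE yields null when no A matched, and collect() skips nulls
WITH all_zs, collect(DISTINCT CASE WHEN a IS NOT NULL THEN z END) AS matched_zs
RETURN matched_zs,
       [z IN all_zs WHERE NOT z IN matched_zs] AS missing_zs;
```

If all_zs is empty, UNWIND produces no rows and the query returns nothing, which matches the behaviour of the APOC version above.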
I opened this question here: How to find specific subgraph in Neo4j using where clause to find a path of a certain criteria. Yet when I try to do things like get the relationship type I cannot.
For example I tried
MATCH p = (n:Root)-[rs1*]->()
WHERE ALL(rel in rs1 WHERE rel.relevance is null)
RETURN nodes(p), TYPE(relationships(p))
But I get the error:
Type mismatch: expected Relationship but was Collection<Relationship>
I think I need to use a WITH clause but not sure.
Similarly I wanted the ID of a node but that also failed.
The problem is that relationships returns a collection and the type function only works on a single relationship. There are two main approaches to solve this.
Use UNWIND to get a separate row for each relationship:
MATCH p = (n:Root)-[rs1*]->()
WHERE ALL(rel in rs1 WHERE rel.relevance is null)
WITH n, relationships(p) AS rs
UNWIND rs AS r
RETURN n, type(r)
Use extract to get the results in a list (a single row per matched path):
MATCH p = (n:Root)-[rs1*]->()
WHERE ALL(rel in rs1 WHERE rel.relevance is null)
WITH n, relationships(p) AS rs
RETURN n, extract(r IN rs | type(r))
Or even shorter:
MATCH p = (n:Root)-[rs1*]->()
WHERE ALL(rel in rs1 WHERE rel.relevance is null)
RETURN n, extract(r IN relationships(p) | type(r))
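Note that extract() was removed in Neo4j 4.0; on newer versions a list comprehension does the same job. The sketch below also collects the node IDs, since the question mentions that fetching IDs failed for the same reason (id() expects a single node, not a list). The aliases rel_types and node_ids are made up for illustration.

```cypher
MATCH p = (n:Root)-[rs1*]->()
WHERE ALL(rel IN rs1 WHERE rel.relevance IS NULL)
RETURN n,
       [r IN relationships(p) | type(r)] AS rel_types,
       // id() is deprecated in Neo4j 5 in favour of elementId()
       [m IN nodes(p) | id(m)] AS node_ids;
```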
UPDATE: I've changed the graphic and example queries to make the request more clear. The basic idea is the same, but now I'm showing that there really are more than just two relationships. The idea is I want TWO of them to match, not necessarily ALL of them.
Given the following Neo4j graph:
Is it possible to specify a relationship in a query that requires that TWO specific relationships be there for a match, but not necessarily all, without simply stating each full matching path separately? I want a logical AND on the relationship types, just like we have a logical OR using the | character.
This is how you would use a logical OR with the | character:
// OR on MEMBER_OF and GRANT_GROUP_COMP
MATCH (p:Person {name:'John'})-[r:MEMBER_OF|GRANT_GROUP_COMP]->(t:Team {name:'Team 1'})
RETURN p,r,t
What I'm looking for is something like this, an AND with a & or similar that REQUIRES that both relationships be present:
// AND type functionality in the relationship I'd like
MATCH (p:Person {name:'John'})-[r:MEMBER_OF&GRANT_GROUP_COMP]->(t:Team {name:'Team 1'})
RETURN p,r,t
Without having to resort to this - which works for me just fine:
// I'd like to avoid this
MATCH (p:Person {name:'John'})-[r:MEMBER_OF]->(t:Team {name:'Team 1'}),
(p)-[r2:GRANT_GROUP_COMP]->(t)
RETURN p,r,r2,t
Any insight would be appreciated, but based on responses so far, it simply doesn't exist.
What about this?
MATCH (D:Person {name:'Donald'})-[r1:WORKS_AT]->
(o:Office {code:'279'})<-[r2:SUPPORTS]-(D)
RETURN *
A version inspired by Dave's answer:
MATCH (D:Person {name:'Donald'})-[r:WORKS_AT|SUPPORTS]->(o:Office {code:'279'})
WITH D, o, collect(r) as rels,
collect(distinct type(r)) as tmp WHERE size(tmp) >= 2
return D, o, rels
Update:
MATCH (D:Person {name:'Donald'})
- [r: MEMBER_OF
| GRANT_INDIRECT_ALERTS
| GRANT_INDIRECT_COMP
| GRANT_GROUP_ALERTS
| GRANT_GROUP_COMP
] ->
(o:Office {code:'279'})
WITH D, o, collect(r) as rels,
collect(distinct type(r)) as tmp WHERE size(tmp) >= 2 AND size(tmp) <= 5
return D, o, rels
This query will return a result if John and Team 1 have MEMBER_OF AND GRANT_GROUP_COMP relationships between them.
(This is very similar to the second answer of @stdob--, but requires the size of types to be exactly 2.)
MATCH (p:Person {name: 'John'})-[r:MEMBER_OF|GRANT_GROUP_COMP]->(t:Team {name: 'Team 1'})
WITH p, t, COLLECT(r) AS rels, COLLECT(DISTINCT type(r)) AS types
WHERE SIZE(types) = 2
RETURN p, t, rels;
You could add the second relationship type in a WHERE clause. Something like this...
MATCH (p:Person {name:'John'})-[r:GRANT_GROUP_COMP]->(t:Team {name:'Team 1'})
WHERE (p)-[:MEMBER_OF]->(t)
RETURN *
Or you could make sure that the complete set is in the collection of relationship types. Something like this...
MATCH (p:Person {name:'John'})-[r]->(t:Team {name:'Team 1'})
with p,t,collect(type(r)) as r_types
where all(r in ['MEMBER_OF','GRANT_GROUP_COMP'] where r in r_types)
RETURN p, t, r_types
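The collect-and-compare approaches above can be generalised so that the required types are supplied as a list parameter. In this sketch $required is a hypothetical parameter, e.g. ['MEMBER_OF', 'GRANT_GROUP_COMP'], and it is assumed to contain no duplicates.

```cypher
// Keep only relationships whose type is in the required list
MATCH (p:Person {name: 'John'})-[r]->(t:Team {name: 'Team 1'})
WHERE type(r) IN $required
WITH p, t, collect(r) AS rels, collect(DISTINCT type(r)) AS types
// Every required type must have been seen at least once
WHERE size(types) = size($required)
RETURN p, t, rels;
```

Because the filter only admits types from $required, comparing the two sizes is enough to show that all of them are present between p and t.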