Selecting one of two nodes in Cypher query - neo4j

all Cypher masters!
I can't figure out how to query all B nodes while choosing either B1 or B2 and B4 or B5. There is no constraint on which of them, only that one is chosen. As in the image, there's a relation (B1,B2) and (B4,B5).
In other words - I want to MATCH all nodes of type B connected to some node with type A, but excluding either B1 or B2 and B4 or B5 (using the relation between them) in the result. The nodes of type B can only be pairwise connected - that is, no (B1,B2), (B2,B3) will exists simultaneously. Although, there can be more than one pair as the image shows.
Any ideas are more than welcome!

I think this is a simple additional condition:
MATCH (A:A)--(B:B)
OPTIONAL MATCH (B)--(BT:B)--(A)
WITH B WHERE BT IS NULL OR id(B) > id(BT)
RETURN B

So for this, it will be faster to use APOC Procedures, as there are some helpful collection functions, and a procedure we'll want to easily get the relationships that exist between a group of nodes.
The idea here is that we'll match to the connected :B nodes, use the cover() procedure to get all relationships among these nodes, collect those relationships and from those take one of the nodes for those relationships (we'll use the startnode here), and then we'll just subtract those chosen nodes from our list leaving us with the :B nodes we want:
MATCH (a:A)--(b:B)
WITH collect(b) as bNodes
CALL apoc.algo.cover(bNodes) YIELD rel
WITH bNodes, [r in collect(rel) | startNode(r)] as toRemove
RETURN apoc.coll.subtract(bNodes, toRemove) as nodes
If you don't have (or don't want to use) APOC, here's a Cypher-only version:
MATCH (a:A)--(b:B)
WITH collect(b) as bNodes
UNWIND bNodes as b
OPTIONAL MATCH (b)-[r]-(other)
WHERE other IN bNodes
WITH bNodes, collect(DISTINCT startNode(r)) as toRemove
RETURN [b in bNodes WHERE NOT b in toRemove] as nodes

Related

Query to write hops and return all the properties from the middle nodes or a better way to do it and skip hops?

Node A is connected to Node E through different nodes B (B can be repeating), C and D etc as given below.
(A)--(C)--(D)--(E)
(A)--(B)--(C)--(D)--(E)
(A)--(B)--(B)--(C)--(D)--(E)
(A)--(B)--(B)--(B)--(C)--(D)--(E)
There could be up to 7 B nodes between A and C or no B node at all (like the first case above).
Question: How to get all the E1, E2, E3, E4 connected to A1 with a single query and return properties from all A, B, C, D and E nodes? I could not return the properties using the hops.
MATCH (A {Id:30})-[*1..6]-(E) RETURN DISTINCT A.Name, E.Name;
But we want to return B.Name (If there are multiple B nodes in the middle their names too), C.Name and D.Name too. Happy to skip hopping completely if required. Help please? Thanks in advance.
Try this
// in case there is always A,C and E, you can look for
// paths with length 3 to 6
MATCH path=(A)-[*3..6]-(E)
// return the name of each node in the same order
RETURN [n IN nodes(path) | n.name] AS nodeNames
Assuming A through E are node labels, this query should get all paths that match your pattern (with 0 to 7 B nodes between the A and C nodes), and return distinct lists of node Name values:
MATCH p=(:A)-[*..8]-(:C)--(:D)--(:E)
WHERE ALL(n IN NODES(p)[1..-3] WHERE 'B' IN LABELS(n))
RETURN DISTINCT [m IN NODES(p) | m.Name] AS names
In general, the query would be more efficient if you could also specify the relationship types and their directionality.

cypher how get relation between every two node and the distance from start node?

I have some nodes and relation like A -> B ->C -> D; andB->DSo the B C D is a loop, now I want get all relations and each relation distance from node A;
I expect result like:
{startNode: A, endNode: B, rel:FRIEND, distanceFromAtoEndnode: 1},
{startNode: B, endNode: C, rel:FRIEND, distanceFromAtoEndnode: 2},
{startNode: C, endNode: D, rel:FRIEND, distanceFromAtoEndnode: 3},
{startNode: B, endNode: D, rel:FRIEND, distanceFromAtoEndnode: 2}
and my cypher:
match p=(n:Person {name:"A"})-[r*1..9]-(m:Person) return last(r) as rel,length(p) as distanceFromAtoEndnode
but this always get one more item I donot need:
{startNode: D, endNode: C, rel:FRIEND, distanceFromAtoEndnode: 3},
and if there are double loop like "8" then the result is worse
how can I write the cypher?
While it's easy enough to get distance to end nodes, your requirement to get the last relationship, with the results you want, will be tough (maybe impossible?) to get with just Cypher, since since relationships may be traversed multiple times depending on the connectedness of the graph, and since the same relationships will occur as the last relationship of paths of differing lengths.
If you absolutely need this, then you can use APOC path expander procedures to ensure that each relationship is only ever traversed once, so you won't get the same relationship as a result but for a different path.
An example of usage that should give you the results you want:
MATCH (n:Person {name:"A"})
CALL apoc.path.expandConfig(n, {uniqueness:'RELATIONSHIP_GLOBAL', minLevel:1, maxLevel:9, labelFilter:'>Person'}) YIELD path
WITH last(relationships(path)) as rel, length(path) as distanceFromAtoEndnode
RETURN rel {startNode:startNode(rel).name, endNode:endNode(rel).name, rel:type(rel)} as rel, distanceFromAtoEndnode
As for pure Cypher solutions, you may get hangs on highly connected graphs, since Cypher's expansion tries to find all possible paths. But here you go.
To find shortest distance for all connected nodes:
MATCH path=(:Person {name:"A"})-[*1..9]-(other:Person)
RETURN other, min(length(path)) as shortestDistance
And to find all relationships connecting all connected nodes:
MATCH path=(:Person {name:"A"})-[*1..9]-(:Person)
WITH distinct last(relationships(path)) as rel
RETURN rel {startNode:startNode(rel).name, endNode:endNode(rel).name, rel:type(rel)} as rel

Return multiple sums of relationship weights using cypher

I have a graph with one node type 'nodeName' and one relationship type 'relName'. Each node pair has 0-1 'relName' relationships with each other but each node can be connected to many nodes.
Given an initial list of nodes (I'll refer to this list as the query subset) I want to:
Find all the nodes that connect to the query subset
I'm currently doing this (which may be overly convoluted):
MATCH (a: nodeName)-[r:relName]-()
WHERE (a.name IN ['query list'])
WITH a
MATCH (b: nodeName)-[r2:relName]-()
WHERE NOT (b.name IN ['query list'])
WITH a, b
MATCH (a)--(b)
RETURN DISTINCT b
Then for each connected node (b) I want to return the SUM of the weights that connect to the query subset
For example. If node b1 has 4 edges that connect to nodes in the query subset I would like to RETURN SUM(r2.weight) AS totalWeight for b2. I actually need a list of all the b nodes ordered by totalWeight.
No. 2 is where I'm stuck. I've been reading the docs about FOREACH and reduce() but I'm not sure how to apply them here.
Speed is important as I have 30,000 nodes and 1.5M edges if you have any suggestions regarding this please throw them into the mix.
Many thanks
Matt
Why do you need so many Match statements? You can specify a nodes and b nodes in single Match statement and select only those who have a relationship between them.
After that just return b nodes and sum of the weights. b nodes will automatically be acting as a group by if it is returned along with aggregation function such as sum.
MATCH (a:nodeName)-[r:relName]-(b:nodeName)
WHERE (a.name IN ['query list']) AND NOT((b.name IN ['query list']))
RETURN b.name, sum(r.weight) as weightSum order by weightSum
I think we can simplify that query a bit.
MATCH (a: nodeName)
WHERE (a.name IN ['query list'])
WITH collect(a) as subset
UNWIND subset as a
MATCH (a)-[r:relName]-(b)
WHERE NOT b in subset
RETURN b, sum(r.weight) as totalWeight
ORDER BY totalWeight ASC
Since sum() is an aggregating function, it will make the non-aggregation variables the grouping key (in this case b), so the sum is per b node, then we order them (switch to DESC if needed).

Find connected groups over levels

A node A has 3 connected Nodes B1, B2, B3. Those Bx Nodes have again connected Nodes C1,C2,C3 and C4. Also Node A have 2 connected nodes C5 and C6.
Starting with node A I want to collect all C-nodes. I did a query for the A node, collect the two C-Nodes, then a query for the B-nodes, collect again all C-nodes and merge both arrays. Work but is not very clever.
I tried (Pseudocode)
MATCH (g)<-[:IS_SUBGROUP_OF*1]-(i)-[:HAS_C_NODES]->(c) WHERE g = A.uuid RETURN C_NODES
But I get either all c-nodes for A or for the B-nodes
How would I do a query that collects all C-Nodes starting with Node A?
* edited *
Here is some example data:
CREATE (a:A), (b1:B1), (b2:B2), (b3:B3), (c1:C1), (c2:C2), (c3:C3), (c4:C4), (a)-[r:HAS]->(c4), (a)-[r1:HAS]->(b1), (a)-[r2:HAS]->(b2), (a)-[r3:HAS]->(b3), (b1)-[r4:HAS]->(c1), (b1)-[r5:HAS]->(c2), (b2)-[r6:HAS]->(c3)
A query should return all nodes starting with C, no matter to which node they are connected (A or B).
You can add multiple labels for each node. You should use this to your advantage and segregate all the B and C nodes into a second label.
Eg:
CREATE (a:A), (b1:B1:BType), (b2:B2:BType), (b3:B3:BType), (c1:C1:CType), (c2:C2:CType), (c3:C3:CType), (c4:C4:CType), (a)-[r:HAS]->(c4), (a)-[r1:HAS]->(b1), (a)-[r2:HAS]->(b2), (a)-[r3:HAS]->(b3), (b1)-[r4:HAS]->(c1), (b1)-[r5:HAS]->(c2), (b2)-[r6:HAS]->(c3)
I have modified your create statement to group all the B nodes as :BType label and all the C nodes as :CType label.
You can simply use the optional match keyword to selectively traverse through the relationships if they exist and obtain the results you want.
match (a:A)-[:HAS]->(b:BType)-[:HAS]->(c:CType) optional match (a:A)-[:HAS]->(xc:CType) return c,xc
If you would like both sets of nodes to be grouped together you could try this statement instead which uses collect().
match (a:A)-[:HAS]->(b:BType)-[:HAS]->(c:CType) with a,collect (distinct c) as set1 optional match (a:A)-[:HAS]->(xc:CType) return set1 + collect (distinct xc) as output

neo4j collecting nodes and relations type b-->a<--c,a<--d

I am extending maxdemarzi's excellent graph visualisation example (http://maxdemarzi.com/2013/07/03/the-last-mile/) using VivaGraph backed by neo4j.
I want to display relationships of the type
a-->b<--c,b<--d
I tried the query
MATCH p = (a)--(b:X)--(c),(b:X)--(d)
RETURN EXTRACT(n in nodes(p) | {id:ID(n), name:COALESCE(n.name, n.title, ID(n)), type:LABELS(n)}) AS nodes,
EXTRACT(r in relationships(p)| {source:ID(startNode(r)) , target:ID(endNode(r))}) AS rels
It looks like the named query picks up only a-->b<--c pattern and omits the b<--d patterns.
Am i missing something... can i not add multiple patterns in a named query?
The most immediate problem is that the comma in the MATCH clause separates the first pattern from the second. The variable 'p' only stores the first pattern. This is why you aren't getting the results you desire. Independent of that, you are at risk of having a 'loose binding' by putting a label on both of your nodes named 'b' in the two patterns. The second 'b' node should not have a label.
So here is a version of your query that should work.
MATCH p1=(a)-->(b:X)<--(c), p2=(b)<--(d)
WITH nodes(p1) + d AS ns, relationships(p1) + relationships(p2) AS rs
RETURN EXTRACT(n IN ns | {id:ID(n), name:COALESCE(n.name, n.title, ID(n)), type:LABELS(n)}) AS nodes,
EXTRACT(r in rs| {source:ID(startNode(r)) , target:ID(endNode(r))}) AS rels
Capture both paths, then build collections from the nodes and relationships of both paths. The collection of nodes actually only extracts the nodes from p1 and adds the 'd' node. You could write that part as
nodes(p1) + nodes(p2) as ns
but then the 'b' node will appear in the list twice.

Resources