Retrieve relationships between child nodes using neo4j cypher - neo4j

I have a following network result when I run this query in neo4j browser:
MATCH (n1:Item {name: 'A'})-[r]-(n2:Item) Return n1,r,n2
At the bottom of the graph, it says: Displaying 6 nodes, 7 relationships.
But when I look on the table in the neo4j browser, I only have 5 records
n1,r,n2
A,A->B,B
A,A->C,C
A,A->D,D
A,A->E,E
A,A->F,F
So in the java code, when I get the list of records using the code below:
List<Record> records = session.run(query).list();
I only get 5 records, so I only get the 5 relationships.
But I want to get all 7 relationships including the 2 below:
B->C
C->F
How can i achieve that using the cypher query?

This should work:
MATCH (n:Item {name: 'A'})-[r1]-(n2:Item)
WITH n, COLLECT(r1) AS rs, COLLECT(n2) as others
UNWIND others AS n2
OPTIONAL MATCH (n2)-[r2]-(x)
WHERE x IN others
RETURN n, others, rs + COLLECT(r2) AS rs
Unlike #FrantišekHartman's first approach, this query uses UNWIND to bind n2 (which is not specified in the WITH clause and therefore becomes unbound) to the same n2 nodes found in the MATCH clause. This query also combines all the relationships into a single rs list.

There are many ways to achieve this. One ways is to travers to 2nd level and check that the 2nd level node is in the first level as well
MATCH (n1:Item {name: 'A'})-[r]-(n2:Item)
WITH n1,collect(r) AS firstRels,collect(n2) AS firstNodes
OPTIONAL MATCH (n2)-[r2]-(n3:Item)
WHERE n3 IN firstNodes
RETURN n1,firstRels,firstNodes,collect(r2) as secondRels
Or you could do a Cartesian product between the first level nodes and match:
MATCH (n1:Item {name: 'A'})-[r]-(n2:Item)
WITH n1,collect(r) AS firstRels,collect(n2) as firstNodes
UNWIND firstNodes AS x
UNWIND firstNodes AS y
OPTIONAL MATCH (x)-[r2]-(y)
RETURN n1,firstRels,firstNodes,collect(r2) as secondRels
Depending on on cardinality of firstNodes and secondRels and other existing relationships one might be faster than the other.

Related

Cypher - given relationship, get the nodes

If I do the query
MATCH (:Label1 {prop1: "start node"}) -[relationships*1..10]-> ()
UNWIND relationships as relationship
RETURN DISTINCT relationship
How do I get nodes for each of acquired relationship to get result in format:
╒════════╤════════╤═══════╕
│"from" │"type" │"to" │
╞════════╪════════╪═══════╡
├────────┼────────┼───────┤
└────────┴────────┴───────┘
Is there a function such as type(r) but for getting nodes from relationship?
RomanMitasov and ray have working answers above.
I don't think they quite get at what you want to do though, because you're basically returning every relationship in the graph in a sort of inefficient way. I say that because without a start or end position, specifying a path length of 1-10 doesn't do anything.
For example:
CREATE (r1:Temp)-[:TEMP_REL]->(r2:Temp)-[:TEMP_REL]->(r3:Temp)
Now we have a graph with 3 Temp nodes with 2 relationships: from r1 to r2, from r2 to r3.
Run your query on these nodes:
MATCH (:Temp)-[rels*1..10]->(:Temp)
UNWIND rels as rel
RETURN startNode(rel), type(rel), endNode(rel)
And you'll see you get four rows. Which is not what you want because there are only two distinct relationships.
You could modify that to return only distinct values, but you're still over-searching the graph.
To get an idea of what relationships are in the graph and what they connect, I use a query like:
MMATCH (n)-[r]->(m)
RETURN labels(n), type(r), labels(m), count(r)
The downside of that, of course, is that it can take a while to run if you have a very large graph.
If you just want to see the structure of your graph:
CALL db.schema.visualization()
Best wishes and happy graphing! :)
Yes, such functions do exist!
startNode(r) to get the start node from relationship r
endNode(r) to get the end node
Here's the final query:
MATCH () -[relationships*1..10]-> ()
UNWIND relationships as r
RETURN startNode(r) as from, type(r) as type, endNode(r) as to

Optimize Cypher query on multiple unique labels

In Neo4j, is it faster to run a query against all nodes (AllNodesScan) and then filter on their labels with a WHERE clause, or to run multiple queries with a NodeByLabelScan?
To illustrate, I want all nodes that are labeled with one of the labels in label_list:
label_list = ['label_1', 'label_2', ...]
Which would be faster in an application (this is pseudo-code):
for label in label_list:
run.query("MATCH (n:{label}) return n")
or
run.query("MATCH (n) WHERE (n:label_1 or n:label_2 or ...)")
EDIT:
Actually, I just realized that the best option might be to run multiple NodeByLabelScan in a single query, with something looking like this:
MATCH (a:label_1)
MATCH (b:label_2)
...
UNWIND [a, b ..] as foo
RETURN foo
Could someone speak to it?
Yes, it would be better to run multiple NodeByLabelScans in a single query.
For example:
OPTIONAL MATCH (a:label_1)
WITH COLLECT(a) AS list
OPTIONAL MATCH (b:label_2)
WITH list + COLLECT(b) AS list
OPTIONAL MATCH (c:label_3)
WITH list + COLLECT(c) AS list
UNWIND list AS n
RETURN DISTINCT n
Notes on the query:
It uses OPTIONAL MATCH so that the query can proceed even if a wanted label is not found in the DB.
It uses multiple aggregation steps to avoid cartesian products (also see this).
And it uses UNWIND so that it can useDISTINCT to return distinct nodes (since a node can have multiple labels).

How to write cypher statement to combine nodes when an OPTIONAL MATCH is null?

Background
Hi all, I am currently trying to write a cypher statement that allows me to find a set of paths on a map from a starting point. I want my search result to always return connecting streets within 5 nodes. Optionally, if there's a nearby hospital, I would like my search pattern to also indicate nearby hospitals.
Main Problem
Because there isn't always a nearby hospital to the current street, sometimes my optional match search pattern comes back as null. Here's the current cypher statement I'm using:
MATCH path=(a:Street {id: 123})-[:CONNECTED_TO*..5]-(b:Street)
OPTIONAL MATCH optionalPath=(b)-[:CONNECTED_TO]->(hospital:Hospital)
WHERE ALL (x IN nodes(path) WHERE (x:Street))
WITH DISTINCT nodes(path) + nodes(optionalPath) as n
UNWIND n as nodes
RETURN DISTINCT nodes;
However, this syntax only works if optionalPath contains nodes. If it doesn't, the statement nodes(path) + nodes(optionalPath) is an operation adding null and I get no records. This is true even the nodes(path) term does contain nodes.
What's the best way to get around this problem?
You can use COALESCE to replace a NULL with some other value. For example:
MATCH path=(:Street {id: 123})-[:CONNECTED_TO*..5]-(b:Street)
WHERE ALL (x IN nodes(path) WHERE x:Street)
OPTIONAL MATCH optionalPath=(b)-[:CONNECTED_TO]->(hospital:Hospital)
WITH nodes(path) + COALESCE(nodes(optionalPath), []) as n
UNWIND n as nodes
RETURN DISTINCT nodes;
I have also made a few other improvements:
The WHERE clause was moved up right after the first MATCH. This eliminates the unwanted path values immediately. Your original query would get all path values (even unwanted ones) and always the perform the second MATCH query, and only eliminate unwanted paths afterwards. (But, it is actually not clear if you even need the WHERE clause at all; for example, if the CONNECTED_TO relationship is only used between Street nodes.)
The DISTINCT in your WITH clause would have prevented duplicate n collections, but the collections internally could have had duplicate paths. This was probably not what you wanted.
It seems you don't really want the path, just all the street nodes within 5 steps, plus any connected hospitals. So I would simplify your query to just that, and then condense the 3 columns down to 1.
MATCH (a:Street {id: 123})-[:CONNECTED_TO*..5]-(b:Street)
OPTIONAL MATCH (b)-[:CONNECTED_TO]->(hospital:Hospital)
WITH collect(a) + collect(b) + collect(hospital) as n
UNWIND n as nodez
RETURN DISTINCT nodez;
If Streets can be indirectly connected (hospital in between), Than I'd adjust like this
MATCH (a:Street {id: 123})-[:CONNECTED_TO]-(b:Street)
WITH a as nodez, b as a
MATCH (a)-[:CONNECTED_TO]-(b:Street)
WITH nodez+collect(b) as nodez, b as a
MATCH (a)-[:CONNECTED_TO]-(b:Street)
WITH nodez+collect(b) as nodez, b as a
MATCH (a)-[:CONNECTED_TO]-(b:Street)
WITH nodez+collect(b) as nodez, b as a
MATCH (a)-[:CONNECTED_TO]-(b:Street)
WITH nodez+collect(b) as nodez, b as a
OPTIONAL MATCH (b)-[:CONNECTED_TO]->(hospital:Hospital)
WITH nodez + collect(hospital) as n
UNWIND n as nodez
RETURN DISTINCT nodez;
It's a bit more verbose, but just says exactly what you want (and also adds the start node to the hospital check list)

cypher to combine nodes and relationships into a single column

So as a complication to this question, I basically want to do
MATCH (n:TEST) OPTIONAL MATCH (n)-[r]->() RETURN DISTINCT n, r
And I want to return n and r as one column with no repeat values. However, running
MATCH (n:TEST) OPTIONAL MATCH (n)-[r]->() UNWIND n+r AS x RETURN DISTINCT x
gives a "Type mismatch: expected List but was Relationship (line 1, column 47)" error. And this query
MATCH (n:TEST) RETURN DISTINCT n UNION MATCH ()-[n]->() RETURN DISTINCT n
Puts nodes and relationships in the same column, but the context from the first match is lost in the second half.
So how can I return all matched nodes and relationships as one minimal list?
UPDATE:
This is the final modified version of the answer query I am using
MATCH (n:TEST)
OPTIONAL MATCH (n)-[r]->()
RETURN n {.*, rels:collect(r {properties:properties(r), id:id(r), type:type(r), startNode:id(startNode(r)), endNode:id(endNode(r))})} as n
There are a couple ways to handle this, depending on if you want to hold these within lists, or within maps, or if you want a map projection of a node to include its relationships.
If you're using Neo4j 3.1 or newer, then map projection is probably the easiest approach. Using this, we can output the properties of a node and include its relationships as a collected property:
MATCH (n:TEST)
OPTIONAL MATCH (n)-[r]->()
RETURN n {.*, rels:collect(r)} as n
Here's what you might do if you wanted each row to be its own pairing of a node and a single one of its relationships as a list:
...
RETURN [n, r] as pair
And as a map:
...
RETURN {node:n, rel:r} as pair
EDIT
As far as returning more data from each relationship, if you check the Code results tab, you'll see that the id, relationship type, and start and end node ids are included, and accessible from your back-end code.
However, if you want to explicitly return this data, then we just need to include it in the query, using another map projection for each relationship:
MATCH (n:TEST)
OPTIONAL MATCH (n)-[r]->()
RETURN n {.*, rels:collect(r {.*, id:id(r), type:type(r), startNode:startNode(r), endNode:endNode(r)})} as n

How to find specific subgraph in Neo4j using where clause

I have a large graph where some of the relationships have properties that I want to use to effectively prune the graph as I create a subgraph. For example, if I have a property called 'relevance score' and I want to start at one node and sprawl out, collecting all nodes and relationships but pruning wherever a relationship has the above property.
My attempt to do so netted this query:
start n=node(15) match (n)-[r*]->(x) WHERE NOT HAS(r.relevance_score) return x, r
My attempt has two issues I cannot resolve:
1) Reflecting I believe this will not result in a pruned graph but rather a collection of disjoint graphs. Additionally:
2) I am getting the following error from what looks to be a correctly formed cypher query:
Type mismatch: expected Any, Map, Node or Relationship but was Collection<Relationship> (line 1, column 52 (offset: 51))
"start n=node(15) match (n)-[r*]->(x) WHERE NOT HAS(r.relevance_score) return x, r"
You should be able to use the ALL() function on the collection of relationships to enforce that for all relationships in the path, the property in question is null.
Using Gabor's sample graph, this query should work.
MATCH p = (n {name: 'n1'})-[rs1*]->()
WHERE ALL(rel in rs1 WHERE rel.relevance_score is null)
RETURN p
One solution that I can think of is to go through all relationships (with rs*), filter the the ones without the relevance_score property and see if the rs "path" is still the same. (I quoted "path" as technically it is not a Neo4j path).
I created a small example graph:
CREATE
(n1:Node {name: 'n1'}),
(n2:Node {name: 'n2'}),
(n3:Node {name: 'n3'}),
(n4:Node {name: 'n4'}),
(n5:Node {name: 'n5'}),
(n1)-[:REL {relevance_score: 0.5}]->(n2)-[:REL]->(n3),
(n1)-[:REL]->(n4)-[:REL]->(n5)
The graph contains a single relevant edge, between nodes n1 and n2.
The query (note that I used {name: 'n1'} to get the start node, you might use START node=...):
MATCH (n {name: 'n1'})-[rs1*]->(x)
UNWIND rs1 AS r
WITH n, rs1, x, r
WHERE NOT exists(r.relevance_score)
WITH n, rs1, x, collect(r) AS rs2
WHERE rs1 = rs2
RETURN n, x
The results:
╒══════════╤══════════╕
│n │x │
╞══════════╪══════════╡
│{name: n1}│{name: n4}│
├──────────┼──────────┤
│{name: n1}│{name: n5}│
└──────────┴──────────┘
Update: see InverseFalcon's answer for a simpler solution.

Resources