Get all paths neo4J Cypher - neo4j

I'm using neo4j to develop a proof of concept, and I want to get the node IDs along every path from my root node to the leaves. Example with ids:
ROOT1-->N1--->SN2--->L1
ROOT1-->N2--->SN3--->L3
What I want to get as my query result is: ROOT1,N1,SN2 and ROOT1,N2,SN3
I'm new to Cypher and I'm struggling to get this result; any help would be useful.

I assume that the ID that you mention is an id property.
To get a collection of the node ids in each full path (except for the leaf node):
MATCH p=(root {id: 'ROOT1'})-[*]->(leaf)
WHERE NOT (leaf)-->()
RETURN EXTRACT(x IN NODES(p)[..-1] | x.id) AS result;
Here is a sample result:
+----------------------+
| result               |
+----------------------+
| ["ROOT1","N1","SN2"] |
| ["ROOT1","N2","SN3"] |
+----------------------+
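For readers who want to see what the query computes, here is a rough Python sketch of the same traversal over a plain adjacency list. The GRAPH dictionary and the function name are made up for illustration, and the sketch assumes the graph has no cycles (Cypher itself avoids infinite loops by never reusing a relationship within one path).

```python
from typing import Dict, List

# Hypothetical adjacency list mirroring the example in the question.
GRAPH: Dict[str, List[str]] = {
    "ROOT1": ["N1", "N2"],
    "N1": ["SN2"],
    "N2": ["SN3"],
    "SN2": ["L1"],
    "SN3": ["L3"],
    "L1": [],
    "L3": [],
}


def paths_without_leaf(graph: Dict[str, List[str]], root: str) -> List[List[str]]:
    """Return every root-to-leaf path with the leaf itself dropped."""
    results: List[List[str]] = []

    def walk(node: str, path: List[str]) -> None:
        children = graph.get(node, [])
        if not children:
            # `node` is a leaf (no outgoing edge); `path` holds the nodes before it.
            # An empty path means the root itself is a leaf, which [*] would not match.
            if path:
                results.append(path)
            return
        for child in children:
            walk(child, path + [node])

    walk(root, [])
    return results


print(paths_without_leaf(GRAPH, "ROOT1"))
```

Dropping the leaf corresponds to the NODES(p)[..-1] slice in the Cypher query.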

Related

Cypher - loading results while optionally adding count of relationships

I'm using a graph database to establish a relationship between folders, their children and users (be it owners or sharers of the folder).
Here is an example of my structure, where orange nodes are folders and blue nodes are users.
What I want my query to achieve: It should return direct children of the folder under query, and while doing so determine if the child folder being returned is being shared.
My query
MATCH (:Folder { name: 'Nick Hamill' })-[:CHILD]->(children:Folder)
WITH children
OPTIONAL MATCH path = (children)<-[*]-(:User)
UNWIND RELATIONSHIPS(path) AS r WITH children, r
WHERE TYPE(r) = 'SHARES'
RETURN children AS model, COUNT(r) > 0 AS shared
So the query works brilliantly (perhaps a little optimisation needed?) when there is a related user (see below), however, the query fails to return any result if there is no user relationship. I personally can't see why this is because it's an optional match, and surely the count could just return empty?
╒══════════════════════════════════════════════════════════════════════╤════════╕
│"model" │"shared"│
╞══════════════════════════════════════════════════════════════════════╪════════╡
│{"name":"Dr. Denis Abshire","created_at":"2019-10-11 13:54:58","id":"c│true │
│f5e084f-d963-35d3-9c6f-fe29b86f6d43","updated_at":"2019-10-11 13:54:58│ │
│"} │ │
└──────────────────────────────────────────────────────────────────────┴────────┘
The query should be relatively self-explanatory but for the sake of clarity here's some expected outputs -
| Query Folder | Returned Folder | Shared? |
|----------------------|------------------|---------|
| Miss Dessie Oritz II | Nick Hamill | TRUE |
| Nick Hamill | Dr Denis Abshire | TRUE |
| Samara Russell | Shemar Huels PhD | FALSE |
| Shemar Huels PhD | Hazle Ward | FALSE |
I'm running neo4j 3.5.11 community edition. I feel like this should be a fairly easy solution, I'm just meeting the limits of my extremely limited cypher knowledge.
Appreciate any help!
I don't understand why you are using this in your query:
OPTIONAL MATCH path = (children)<-[*]-(:User)
UNWIND RELATIONSHIPS(path) AS r WITH children, r
WHERE TYPE(r) = 'SHARES'
With (children)<-[*]-(:User) you are searching for every path, of any length, between the children and User nodes.
And with WHERE TYPE(r) = 'SHARES' you keep only the SHARES relationships.
So your query effectively works on this kind of pattern: (children)<-[:CHILD]-(:Folder)<-[:CHILD]-(:Folder)<-[:SHARES]-(:User)
Is that what you want?
If so, can you try this query:
MATCH (:Folder { name: 'Nick Hamill' })-[:CHILD]->(children:Folder)
RETURN children AS model, size((children)<-[:CHILD*0..]-(:Folder)<-[:SHARES]-(:User)) > 0 AS shared
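One likely reason the original query returns nothing for unshared folders: when the OPTIONAL MATCH finds no path, path is null, and UNWIND over null produces no rows, so the folder is gone before RETURN. The query above avoids UNWIND entirely. As a rough illustration of what its shared flag means, here is a Python sketch over made-up data. The PARENT and SHARED_DIRECTLY structures are assumptions (which folder actually carries the SHARES relationship is not stated in the question), and the sketch assumes each folder has at most one parent.

```python
from typing import Dict, List, Optional, Set, Tuple

# Hypothetical CHILD relationships, stored as child -> parent.
PARENT: Dict[str, str] = {
    "Nick Hamill": "Miss Dessie Oritz II",
    "Dr Denis Abshire": "Nick Hamill",
    "Shemar Huels PhD": "Samara Russell",
    "Hazle Ward": "Shemar Huels PhD",
}

# Hypothetical set of folders that have an incoming SHARES relationship from a user.
SHARED_DIRECTLY: Set[str] = {"Miss Dessie Oritz II"}


def is_shared(folder: str) -> bool:
    """True if the folder or any ancestor is shared (the CHILD*0.. part)."""
    current: Optional[str] = folder
    while current is not None:
        if current in SHARED_DIRECTLY:
            return True
        current = PARENT.get(current)
    return False


def children_with_shared(folder: str) -> List[Tuple[str, bool]]:
    """Direct children of `folder`, each with its shared flag."""
    return [(child, is_shared(child)) for child, parent in PARENT.items() if parent == folder]


print(children_with_shared("Nick Hamill"))
print(children_with_shared("Samara Russell"))
```

Every child is returned exactly once, with False when no shared ancestor exists, which is the behaviour the question asks for.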

In neo4j, a query to count the number of distinct structures

In neo4j my database consists of chains of nodes. For each distinct structure/layout (does graph theory have a better word?), I want to count the number of chains. For example, the database consists of 9 nodes and 5 relationships like this:
(:a)->(:b)
(:b)->(:a)
(:a)->(:b)
(:a)->(:b)->(:b)
where (:a) is a node with label a. Properties on nodes and relationships are irrelevant.
The result of the counting should be:
------------------------
| Structure | n |
------------------------
| (:a)->(:b) | 2 |
| (:b)->(:a) | 1 |
| (:a)->(:b)->(:b) | 1 |
------------------------
Is there a query that can achieve this?
Appendix
Query to create test data:
create (:a)-[:r]->(:b), (:b)-[:r]->(:a), (:a)-[:r]->(:b), (:a)-[:r]->(:b)-[:r]->(:b)
EDIT:
Thanks for the clarification.
We can get the equivalent of what you want, a capture of the path pattern using the labels present:
MATCH path = (start)-[*]->(end)
WHERE NOT ()-->(start) and NOT (end)-->()
RETURN [node in nodes(path) | labels(node)[0]] as structure, count(path) as n
This will give you a list of the labels of the nodes (the first label present for each...remember that nodes can be multi-labeled, which may throw off your results).
As for getting it into that exact format in your example, that's a different thing. We could do this with some text functions in APOC Procedures, specifically apoc.text.join().
We would first need to add formatting around the extraction of the first label, to add the prefixed : as well as the parentheses. Then we could use apoc.text.join() to get a string where the nodes are joined by your desired '->' symbol:
MATCH path = (start)-[*]->(end)
WHERE NOT ()-->(start) and NOT (end)-->()
WITH [node in nodes(path) | labels(node)[0]] as structure, count(path) as n
RETURN apoc.text.join([label in structure | '(:' + label + ')'], '->') as structure, n
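To make the grouping logic concrete, here is a rough Python sketch of the same idea on an in-memory copy of the test data. The node ids, the LABELS and EDGES structures, and the function name are all made up for illustration; the sketch assumes one label per node and no cycles.

```python
from collections import Counter
from typing import Dict, List, Tuple

# Hypothetical copy of the test data: node id -> label, plus directed edges.
LABELS: Dict[int, str] = {1: "a", 2: "b", 3: "b", 4: "a", 5: "a", 6: "b", 7: "a", 8: "b", 9: "b"}
EDGES: List[Tuple[int, int]] = [(1, 2), (3, 4), (5, 6), (7, 8), (8, 9)]


def count_structures(labels: Dict[int, str], edges: List[Tuple[int, int]]) -> Counter:
    """Count full chains (start without incoming edge, end without outgoing edge) by label pattern."""
    successors: Dict[int, List[int]] = {}
    has_incoming = set()
    for src, dst in edges:
        successors.setdefault(src, []).append(dst)
        has_incoming.add(dst)

    counts: Counter = Counter()

    def walk(node: int, path: List[int]) -> None:
        path = path + [node]
        following = successors.get(node, [])
        if not following:
            # End of a chain; skip isolated nodes, since [*] needs at least one relationship.
            if len(path) > 1:
                counts["->".join(f"(:{labels[n]})" for n in path)] += 1
            return
        for nxt in following:
            walk(nxt, path)

    for start in labels:
        if start not in has_incoming:
            walk(start, [])
    return counts


print(count_structures(LABELS, EDGES))
```

The join step plays the role of apoc.text.join(), and the Counter plays the role of the implicit grouping that count(path) performs in Cypher.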

Is it possible to match node by id, that lies in the properties of other node or relationship?

I've got a problem with simple matching.
For example, I have some node:
start startNode = node(0)
It has a relationship with another node. One of the relationship's properties is idOfThirdNode, which holds id(thirdNode).
I found out that start point = node( ) accepts only literal digits as arguments, so something like toInt(rel.idOfThirdNode) is not allowed there, and neither is match (point:_Node) where id(point) = rel.idOfThirdNode
Finding a node by a property is not a problem, but it isn't possible to set a new duplicate id property.
Is there a solution to this problem, or is the only option to save this property in the model and start a new match that uses it like an id?
Edit:
Earlier, I ran this query:
start startNode = node({0})
optional match startNode-[r:REL]-(relNode: _Node)
return distinct startNode, id(r) as linkId, id(relNode) as nodeId,
r.idOfthirdNode as point
and got a nice table with nulls in some fields:
---------------------------------------
| StartNode | linkId | nodeId | point |
---------------------------------------
| startNode | 1      | 2      | null  |
| info      |        |        |       |
---------------------------------------
| startNode | 3      | 4      | 5     |
| info      |        |        |       |
---------------------------------------
But now this WHERE clause eliminates all the rows that had nulls:
start startNode = node({0})
optional match startNode-[r:REL]-(relNode: _Node), (pointNode:_Node)
where id(pointNode) = r.idOfthirdNode
return distinct startNode, id(r) as linkId, id(relNode) as nodeId,
collect({pointNode.name, id:id(pointNode)}) as point
and I get only the second row.
You should be able to do something like this:
MATCH (point:_Node), (node:Label)
WHERE ID(point) = node.idOfThirdNode
RETURN *
But I've never actually seen that done, because relationships are so much better than foreign keys.
This should work for you:
START startNode = node(0)
MATCH (startNode)-[rel]->(secondNode), (thirdNode:_Node)
WHERE ID(thirdNode) = rel.idOfThirdNode
RETURN startNode, secondNode, thirdNode
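If the goal is also to keep the rows where the stored id is null (as in the table from the question), the lookup has to behave like a left join. Here is a rough Python sketch of that idea only; the NODES and RELS structures, their contents, and the function name are invented for illustration, and only relationships leaving the start node are considered.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical nodes (id -> properties) and relationships shaped like the question's data.
NODES: Dict[int, dict] = {
    0: {"name": "start"},
    2: {"name": "second"},
    4: {"name": "fourth"},
    5: {"name": "point"},
}
RELS: List[dict] = [
    {"id": 1, "start": 0, "end": 2, "idOfThirdNode": None},
    {"id": 3, "start": 0, "end": 4, "idOfThirdNode": 5},
]


def rows_for(start_id: int) -> List[Tuple[int, int, Optional[int]]]:
    """One row per relationship: (linkId, nodeId, id of the third node or None)."""
    rows = []
    for rel in RELS:
        if rel["start"] != start_id:
            continue
        third_id = rel["idOfThirdNode"]
        # Look the third node up by the stored id, but keep the row when it is missing.
        third = NODES.get(third_id) if third_id is not None else None
        rows.append((rel["id"], rel["end"], third_id if third is not None else None))
    return rows


print(rows_for(0))
```

The point is that the lookup of the third node is a separate, optional step, so a missing id yields a null in that column instead of removing the whole row.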

Nodes with same relation to a third node in a graph database

I was following the Neo4J online tutorial and I came to a question while trying this query with the query tool:
match (a:Person)-[:ACTED_IN|:DIRECTED]->()<-[:ACTED_IN|:DIRECTED]-(b:Person)
return a,b;
I was expecting one of the returned pairs to have the same Person in both identifiers, but that didn't happen. Can somebody explain why? Does a MATCH clause exclude repeated elements across the different identifiers used?
UPDATE:
This question came to me in "Lesson 3 - Adding Relationships with Cypher, more" from the Neo4J online tutorial, where the query I mentioned above is presented.
I refined the query to the following one, in order to focus my question more directly:
MATCH (a:Person {name:"Keanu Reeves"})-[:ACTED_IN]->()<-[:ACTED_IN]-(b)
RETURN a,b;
The results:
|---------------|--------------------|
| a | b |
|---------------|--------------------|
| Keanu Reeves | Carrie-Anne Moss |
| Keanu Reeves | Laurence Fishburne |
| Keanu Reeves | Hugo Weaving |
| Keanu Reeves | Brooke Langton |
| Keanu Reeves | Gene Hackman |
| Keanu Reeves | Orlando Jones |
|------------------------------------|
So, why is there no row with Keanu Reeves in both a and b? Shouldn't he match on both :ACTED_IN relationships?
The behavior you observed is by design.
To quote the manual:
While pattern matching, Cypher makes sure to not include matches where
the same graph relationship is found multiple times in a single
pattern. In most use cases, this is a sensible thing to do.
I would check your data sample. Your query looks like it works just fine for me. I replicated with a simple data set, and here's verification that it does produce pairs like what you're looking for.
Joe acted in "Some Flick"
neo4j-sh (?)$ create (p:Person {name:"Joe"})-[:ACTED_IN]->(m:Movie {name:"Some Flick"});
+-------------------+
| No data returned. |
+-------------------+
Nodes created: 2
Relationships created: 1
Properties set: 2
Labels added: 2
14 ms
But Joe is so multi-talented, he also directed "Some Flick".
neo4j-sh (?)$ match (p:Person {name: "Joe"}), (m:Movie {name: "Some Flick"}) create p-[:DIRECTED]->m;
+-------------------+
| No data returned. |
+-------------------+
Relationships created: 2
23 ms
So who are the actor/director pairs that we know of?
neo4j-sh (?)$ match (a:Person)-[:ACTED_IN|:DIRECTED]->()<-[:ACTED_IN|:DIRECTED]-(b:Person)
> return a,b;
+-----------------------------------------------------+
| a | b |
+-----------------------------------------------------+
| Node[222128]{name:"Joe"} | Node[222128]{name:"Joe"} |
| Node[222128]{name:"Joe"} | Node[222128]{name:"Joe"} |
+-----------------------------------------------------+
2 rows
50 ms
Of course it's Joe.
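The two answers fit together once you note that the uniqueness rule applies to relationships, not nodes. Here is a rough Python sketch of that rule on a tiny made-up data set (the RELS list and the function name are illustrative assumptions): a pair is produced whenever two different relationships point at the same movie.

```python
from typing import List, Tuple

# Hypothetical relationships as (person, type, movie).
RELS: List[Tuple[str, str, str]] = [
    ("Keanu Reeves", "ACTED_IN", "The Matrix"),
    ("Carrie-Anne Moss", "ACTED_IN", "The Matrix"),
    ("Joe", "ACTED_IN", "Some Flick"),
    ("Joe", "DIRECTED", "Some Flick"),
]


def co_workers(rels: List[Tuple[str, str, str]]) -> List[Tuple[str, str]]:
    """Pairs (a, b) matched through two *different* relationships to the same movie."""
    pairs = []
    for i, (a, _, movie_a) in enumerate(rels):
        for j, (b, _, movie_b) in enumerate(rels):
            # i != j is the relationship-uniqueness rule: one relationship
            # cannot fill both slots of the pattern.
            if i != j and movie_a == movie_b:
                pairs.append((a, b))
    return pairs


print(co_workers(RELS))
```

Keanu has only one relationship to the movie, so he can never appear on both sides. Joe has two (ACTED_IN and DIRECTED), so he pairs with himself twice, once in each order of the two relationships.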

Get Node ID's in Neo4j using Python

I have recently begun using Neo4j and am struggling to understand how things work. I am trying to create relationships between nodes that I created earlier in my script. The Cypher query that I found looks like it should work, but I don't know how to get the IDs to replace the #'s:
START a= node(#), b= node(#)
CREATE UNIQUE a-[r:POSTED]->b
RETURN r
If you want to use plain cypher, the documentation has a lot of usage examples.
When you create nodes you can return them (or just their ids by returning id(a)), like this:
CREATE (a {name:'john doe'}) RETURN a
This way you can keep the id around to add relationships.
If you want to attach relationships later, you should not use the internal ids of the nodes to reference them from an external system. They can, for example, be re-used if you delete and create nodes.
You can either search for a node by scanning over all and filtering using WHERE or add an index to your database, e.g. if you add an auto_index on name:
START n = node:node_auto_index(name='john doe')
and continue from there. Neo4j 2.0 will support index lookup transparently so that MATCH and WHERE should be as efficient.
If you are using python, you can also take a look at py2neo which provides you with a more pythonic interface while using cypher and the REST interface to communicate with the server.
This could be what you are looking for:
START n = node(*) , x = node(*)
WHERE x<>n
CREATE UNIQUE n-[r:POSTED]->x
RETURN r
It will create a POSTED relationship between every ordered pair of nodes, like this:
+-----------------------+
| r |
+-----------------------+
| (0)-[10:POSTED]->(1) |
| (0)-[10:POSTED]->(2) |
| (0)-[10:POSTED]->(3) |
| (1)-[10:POSTED]->(0) |
| (1)-[10:POSTED]->(2) |
| (1)-[10:POSTED]->(3) |
| (2)-[10:POSTED]->(0) |
| (2)-[10:POSTED]->(1) |
| (2)-[10:POSTED]->(3) |
| (3)-[10:POSTED]->(0) |
| (3)-[10:POSTED]->(1) |
| (3)-[10:POSTED]->(2) |
And if you don't want a relationship between the reference node(0) and the other nodes, you can write the query like this:
START n = node(*), x = node(*)
WHERE x<>n AND id(n)<>0 AND id(x)<>0
CREATE UNIQUE n-[r:POSTED]->x
RETURN r
and the result will look like this:
+-----------------------+
| r |
+-----------------------+
| (1)-[10:POSTED]->(2) |
| (1)-[10:POSTED]->(3) |
| (2)-[10:POSTED]->(1) |
| (2)-[10:POSTED]->(3) |
| (3)-[10:POSTED]->(1) |
| (3)-[10:POSTED]->(2) |
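Note that this connects every node to every other node in both directions, which is n*(n-1) relationships and is probably more than the question needs. A rough Python sketch of which pairs the two queries select (the function name and the skip parameter are made up for illustration):

```python
from itertools import permutations
from typing import Iterable, List, Tuple


def posted_pairs(node_ids: Iterable[int], skip: Iterable[int] = ()) -> List[Tuple[int, int]]:
    """All ordered pairs (n, x) with n != x, leaving out any node id listed in `skip`."""
    skipped = set(skip)
    kept = [n for n in node_ids if n not in skipped]
    return list(permutations(kept, 2))


print(len(posted_pairs(range(4))))       # every pair among nodes 0..3
print(posted_pairs(range(4), skip=[0]))  # same, without the reference node 0
```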
On the client side, using JavaScript, I post the Cypher query:
start n = node(*) WHERE n.name = '" + a.name + "' return n
and then parse the id number from response "self" in the form of:
server_url:7474/db/data/node/node_id
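The parsing step is just taking the last path segment of that URL. A minimal sketch of it, written in Python here rather than JavaScript; the function name and the localhost URL in the usage line are assumptions for illustration:

```python
from urllib.parse import urlparse


def node_id_from_self(self_url: str) -> int:
    """Extract the numeric node id from a REST "self" URL ending in /node/<id>."""
    path = urlparse(self_url).path.rstrip("/")
    return int(path.rsplit("/", 1)[-1])


print(node_id_from_self("http://localhost:7474/db/data/node/42"))
```

Keep in mind the earlier warning about internal ids being re-used after deletes.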
After hours of trying to figure this out, I finally found what I was looking for. I was struggling with how nodes were getting returned and found that
userId=person[0][0][0].id
would return what I wanted. Thanks for all your help though!
Using py2neo, the approach I've found really useful is the remote function.
from py2neo import Graph, remote

graph = Graph()
# CREATE requires a directed relationship
graph.run('CREATE (a)-[r:POSTED]->(b)')
# evaluate() returns the first value of the first record
a = graph.run('MATCH (a)-[r]-(b) RETURN a').evaluate()
a_id = remote(a)._id
b = graph.run('MATCH (a)-[r]-(b) WHERE ID(a) = {num} RETURN b', num=a_id).evaluate()
b_id = remote(b)._id
graph.run('MATCH (a)-[r]-(b) WHERE ID(a)={num1} AND ID(b)={num2} CREATE (a)-[x:UPDATED]->(b)', num1=a_id, num2=b_id)
The remote function takes a py2neo Node object and returns an object whose _id attribute holds the node's current ID number in the graph database.
