I am trying to obtain the node and edge ids for the shortest path between two nodes in my Neo4j graph database.
If I do not specify which nodes I want, the code runs and returns a path:
import py2neo
graph.run("MATCH (start:Point)-[:SOURCE_POINT]->(r:Road)-[:TARGET_POINT]->(end:Point) \
CALL apoc.algo.dijkstraWithDefaultWeight(start, end, 'Road', 'length', 10.0) \
YIELD path as path, weight as weight \
UNWIND nodes(path) as n \
RETURN DISTINCT { id : id(n), labels : labels(n), data: n} as node").to_table()
But when I run the same code and specify which nodes I want, it returns empty:
graph.run("MATCH (start:Point {id: '4984061949'})-[:SOURCE_POINT]->(r:Road)-[:TARGET_POINT]->(end:Point {id: '4984061963'}) \
...
If I simply match on those node ids, the nodes are returned, so I know they are in the database.
I'm thinking it could be because my 'cost' is a string. But I'm not sure how to cast it to float before it goes through the dijkstraWithDefaultWeight function.
You seem to have a couple of issues.
1. MATCH clause is too restrictive
The following MATCH clause would only succeed if there was a path between the specified start and end nodes consisting of just one Road:
MATCH (start:Point {id: '4984061949'})-[:SOURCE_POINT]->(r:Road)-[:TARGET_POINT]->(end:Point {id: '4984061963'})
If that MATCH clause fails, then your query would return nothing.
The following MATCH clause would succeed if there was a path between any pair of Point nodes consisting of one Road:
MATCH (start:Point)-[:SOURCE_POINT]->(r:Road)-[:TARGET_POINT]->(end:Point)
If that MATCH clause succeeds, then, of course, the Dijkstra procedure will also succeed.
Instead of either of the above, you should probably just use MATCH to get the two endpoints and let the Dijkstra algorithm do the job of finding the path:
MATCH (start:Point {id: '4984061949'}), (end:Point {id: '4984061963'})
2. Wrong procedure argument(s)
The third argument passed to apoc.algo.dijkstraWithDefaultWeight is supposed to specify relationship types and directions, not node labels. Also, the last 2 arguments are supposed to be a relationship property and default relationship property value, respectively.
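Putting both fixes together, a corrected call might look like the sketch below. It is only a sketch under assumptions: the relationship filter 'SOURCE_POINT>|TARGET_POINT>' assumes paths should follow both relationship types in their outgoing direction, and 'length' is assumed to be stored on those relationships (if it lives on the Road nodes, every hop falls back to the default weight). The helper name build_shortest_path_query is hypothetical.

```python
def build_shortest_path_query(start_id, end_id, default_weight=10.0):
    """Return (cypher, params) for a Dijkstra search between two Point nodes.

    Assumptions (not confirmed by the question): the weight property
    'length' is on the relationships, and both relationship types are
    traversed in their outgoing direction.
    """
    cypher = (
        # Only bind the two endpoints; the procedure finds the path itself.
        "MATCH (start:Point {id: $start_id}), (end:Point {id: $end_id}) "
        # Third argument: relationship types and directions, not a node label.
        "CALL apoc.algo.dijkstraWithDefaultWeight("
        "start, end, 'SOURCE_POINT>|TARGET_POINT>', 'length', $default_weight) "
        "YIELD path, weight "
        "UNWIND nodes(path) AS n "
        "RETURN DISTINCT {id: id(n), labels: labels(n), data: n} AS node"
    )
    params = {
        "start_id": start_id,
        "end_id": end_id,
        "default_weight": float(default_weight),
    }
    return cypher, params


cypher, params = build_shortest_path_query("4984061949", "4984061963")
print(cypher)
print(params)
```

With py2neo, the result could then be passed on as graph.run(cypher, params).to_table(), assuming graph is an already connected py2neo Graph.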
I wrote a script to batch create a bunch of relationships in Neo4j. Here is the Cypher:
:param batch => [{startId: 'abc123', endId: 'abc321'}, {startId: 'abc456', endId: 'abc654'}]
UNWIND $batch as row
MATCH (from {id: row.startId})
MATCH (to {id: row.endId})
CREATE (from)-[rel:HAS]->(to)
RETURN rel
The problem is that there might be some startId/endId entries that don't match any nodes and are silently ignored. Is there a way to return the list of rows that don't match any nodes and still create the relationships for the nodes that do match?
I tried OPTIONAL MATCH to fail fast as soon as a startId/endId doesn't match a node; however, the query execution was really slow.
First of all, you should always try to specify a label for the node that is used to kick off a MATCH (unless the MATCH pattern uses any already-bound nodes). Otherwise, every single node in the DB must be scanned. In addition, you should consider using indexes to speed up your MATCHs (but, again, you'd need to specify the labels).
Here is a query that uses the APOC procedure apoc.do.when to create a new relationship when appropriate. It returns each row and the corresponding new relationship (or NULL if either node is not found):
UNWIND $batch as row
OPTIONAL MATCH (from:Foo {id: row.startId})
OPTIONAL MATCH (to:Foo {id: row.endId})
CALL apoc.do.when(
from IS NOT NULL AND to IS NOT NULL,
'CREATE (from)-[rel:HAS]->(to) RETURN rel',
'RETURN NULL AS rel',
{from: from, to: to}) YIELD value
RETURN row, value.rel AS rel
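Since the query above returns every row together with either a relationship or NULL, the unmatched rows can be separated on the client side. The following is a minimal sketch, assuming each result record can be read like a dict with the keys row and rel; the function name split_results and the sample data are made up for illustration.

```python
def split_results(records):
    """Partition query results into created relationships and unmatched rows.

    Each record is assumed to be dict-like with the keys 'row' and 'rel',
    matching the RETURN clause of the query above.
    """
    created, unmatched = [], []
    for record in records:
        if record["rel"] is None:
            # One or both endpoints were not found for this row.
            unmatched.append(record["row"])
        else:
            created.append(record["rel"])
    return created, unmatched


# Made-up sample: the string stands in for a real relationship object.
sample = [
    {"row": {"startId": "abc123", "endId": "abc321"}, "rel": "HAS-relationship"},
    {"row": {"startId": "abc456", "endId": "abc654"}, "rel": None},
]
created, unmatched = split_results(sample)
print(unmatched)  # [{'startId': 'abc456', 'endId': 'abc654'}]
```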
I have a graph of nodes with a relationship NEXT with 2 properties sequence (s) and position (p). For example:
N1-[NEXT{s:1, p:2}]-> N2-[NEXT{s:1, p:3}]-> N3-[NEXT{s:1, p:4}]-> N4
A node N might have multiple outgoing Next relationships with different property values.
Given a list of node names, e.g. [N2,N3,N4] representing a sequential path, I want to check if the graph contains the nodes and that the nodes are connected with relationship Next in order.
For example, if the list contains [N2,N3,N4], then check if there is a relationship Next between nodes N2,N3 and between N3,N4.
In addition, I want to make sure that the nodes are part of the same sequence, thus the property s is the same for each relationship Next. To ensure that the order maintained, I need to verify if the property p is incremental. Meaning, the value of p in the relationship between N2 -> N3 is 3 and the value p between N3->N4 is (3+1) = 4 and so on.
I tried using APOC to retrieve the possible paths from an initial node N using python (library: neo4jrestclient) and then process the paths manually to check if a sequence exists using the following query:
q = "MATCH (n:Node) WHERE n.name = 'N' CALL apoc.path.expandConfig(n, {relationshipFilter:'NEXT>', maxLevel:4}) YIELD path RETURN path"
results = db.query(q,data_contents=True)
However, the query took so long to run that I eventually stopped it. Any ideas?
This one is a bit tough.
First, pre-match the nodes in the path. We can then use the collected nodes as a whitelist for the nodes allowed in the path.
Assuming the start node is included in the list, a query might go like:
UNWIND $names as name
MATCH (n:Node {name:name})
WITH collect(n) as nodes
WITH nodes, nodes[0] as start, tail(nodes) as tail, size(nodes)-1 as depth
CALL apoc.path.expandConfig(start, {whitelistNodes:nodes, minLevel:depth, maxLevel:depth, relationshipFilter:'NEXT>'}) YIELD path
WHERE all(index in range(0, size(nodes)-1) WHERE nodes[index] = nodes(path)[index])
// we now have only paths with the given nodes in order
WITH path, relationships(path)[0].s as sequence
WHERE all(rel in tail(relationships(path)) WHERE rel.s = sequence)
// now each path only has relationships of common sequence
WITH path, apoc.coll.pairsMin([rel in relationships(path) | rel.p]) as pairs
WHERE all(pair in pairs WHERE pair[0] + 1 = pair[1])
RETURN path
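The two relationship checks in the query (same s everywhere, p increasing by exactly 1) can also be written out in plain Python, which may help when verifying results client-side. This is only a sketch: the function name is_valid_sequence is hypothetical, and relationships are assumed to be available as dicts with the keys s and p in path order.

```python
def is_valid_sequence(rels):
    """Check that relationships share one sequence and have consecutive positions.

    rels: list of dicts with keys 's' (sequence) and 'p' (position),
    given in path order. An empty list is treated as invalid here.
    """
    if not rels:
        return False
    sequence = rels[0]["s"]
    # Every relationship must belong to the same sequence.
    if any(rel["s"] != sequence for rel in rels[1:]):
        return False
    positions = [rel["p"] for rel in rels]
    # Adjacent pairs, like apoc.coll.pairsMin: each position is the previous + 1.
    return all(a + 1 == b for a, b in zip(positions, positions[1:]))


# N2 -> N3 -> N4 from the question: s is 1 on both, p goes 3 then 4.
print(is_valid_sequence([{"s": 1, "p": 3}, {"s": 1, "p": 4}]))  # True
print(is_valid_sequence([{"s": 1, "p": 3}, {"s": 2, "p": 4}]))  # False
```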
Short version: I need to get a path that can contain different relationships in different directions. However, I have a constraint on my path: if it contains successive relationships of a particular type, then both relationships must be in the same direction.
long version:
I am using the query below to get a path between two nodes:
MATCH p=shortestPath((n:Class { code: '1' })-[r*]-(m:Class { code: '4'})) WHERE NONE(x IN NODES(p) WHERE 'Ontology' in labels(x)) return p
The query correctly returns a shortest path between the two nodes. However, I need to further constrain this query so that it returns only paths where successive relationships of a particular type are in the same direction.
For example, suppose relationships of type -a-> need to be in the same direction: the query should not return (1)-a->(2)<-a-(3)-b->(4) but can return (1)-a->(6)-a->(3)-b->(7)<-c-(5)<-d-(6)-e->(4) or (3)-b->(7)<-c-(4).
The above examples are just a simplification of my real data. In my real use case, I need to find a shortest path between a node with IRI http://elite.polito.it/ontologies/dogont.owl#Actuator and another node with IRI http://elite.polito.it/ontologies/dogont.owl#StateValue. The query below is a specific query that encodes the path I need, and it returns a path, so the path exists. I need to make it more generic using shortestPath.
MATCH p=(n:Class {iri: 'http://elite.polito.it/ontologies/dogont.owl#Actuator'})-->(a:Class)<--(b:ObjectProperty{iri:'http://elite.polito.it/ontologies/dogont.owl#hasState'})-->(c:Class{iri:'http://elite.polito.it/ontologies/dogont.owl#State'})<--(d:Class{iri:'http://elite.polito.it/ontologies/dogont.owl#hasStateValue'})-->(e:Class{iri:'http://elite.polito.it/ontologies/dogont.owl#StateValue'}) return p
Is this possible with Cypher?
This query should work if you want to capture paths that are consistent in either direction (but it has to invoke shortestPath() twice):
MATCH (n:Class {code: '1'}), (m:Class {code: '4'})
OPTIONAL MATCH p1=shortestPath((n)-[*]->(m))
WHERE NONE(x IN NODES(p1) WHERE 'Ontology' in labels(x))
OPTIONAL MATCH p2=shortestPath((n)<-[*]-(m))
WHERE NONE(y IN NODES(p2) WHERE 'Ontology' in labels(y))
RETURN p1, p2
p1 and/or p2 will be null if there is no consistently rightward or leftward path, respectively.
However, if you know that you want a specific direction (say, rightward), then this should work:
MATCH p=shortestPath((:Class {code: '1'})-[*]->(:Class {code: '4'}))
WHERE NONE(x IN NODES(p) WHERE 'Ontology' in labels(x))
RETURN p
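The queries above only cover paths that point in one direction throughout, whereas the rule in the question applies per relationship type. As a rough illustration of that rule (not a replacement for shortestPath), the sketch below checks a candidate path on the client side. The function name respects_direction_rule and the (type, direction) tuple encoding are assumptions made for this example.

```python
def respects_direction_rule(steps, constrained_types):
    """Check the 'same direction for successive relationships' rule.

    steps: list of (rel_type, direction) tuples in path order, where
    direction is '>' or '<'. Two successive relationships of the same
    constrained type must point the same way.
    """
    for (type_a, dir_a), (type_b, dir_b) in zip(steps, steps[1:]):
        if type_a == type_b and type_a in constrained_types and dir_a != dir_b:
            return False
    return True


# (1)-a->(2)<-a-(3)-b->(4): two successive 'a' relationships that disagree.
bad = [("a", ">"), ("a", "<"), ("b", ">")]
# (1)-a->(6)-a->(3)-b->(7)<-c-(5)<-d-(6)-e->(4): the 'a' relationships agree.
good = [("a", ">"), ("a", ">"), ("b", ">"), ("c", "<"), ("d", "<"), ("e", ">")]

print(respects_direction_rule(bad, {"a"}))   # False
print(respects_direction_rule(good, {"a"}))  # True
```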
I have the following graph stored in csv format:
graphUnioned.csv:
a b
b c
The first column in the file denotes the source and the second column denotes the destination, so the first row denotes a path from Node:a to Node:b and the second row a path from Node:b to Node:c. The longest path in the graph is therefore: Node:a to Node:b to Node:c.
I loaded the above csv in Neo4j desktop using the following command:
LOAD CSV WITH HEADERS FROM "file:\\graphUnioned.csv" AS csvLine
MERGE (s:s {s:csvLine.s})
MERGE (o:o {o:csvLine.o})
MERGE (s)-[]->(o)
RETURN *;
And then for finding longest path I run the following command:
match (n:s)
where (n:s)-[]->()
match p = (n:s)-[*1..]->(m:o)
return p, length(p) as L
order by L desc
limit 1;
Unfortunately, this command only gives me the path from Node:a to Node:b and does not return the longest path. Can someone please help me understand where I am going wrong?
There are two mistakes in your CSV import query.
First, you need to use a type when you MERGE a relationship between nodes; the query won't compile otherwise. You likely supplied one and forgot to add it when you pasted it here.
Second, the big one, is that your query is merging nodes with different labels and different properties, and this is majorly throwing it off. Your intent was to create 3 nodes, with a longest path connecting them, but your query creates 4 nodes: two isolated groups of two nodes each.
This creates 2 b nodes: (:s {s:b}) and (:o {o:b}). Each of them is connected to a different node, and this is due to treating the nodes to be created from each variable in the CSV differently.
What you should be doing is using the same label and property key for all of the nodes involved, and this will allow the match to the b node to only refer to a single node and not create two:
LOAD CSV WITH HEADERS FROM "file:\\graphUnioned.csv" AS csvLine
MERGE (s:Node {value:csvLine.s})
MERGE (o:Node {value:csvLine.o})
MERGE (s)-[:REL]->(o)
RETURN *;
You'll also want an index on :Node(value) (or whatever your equivalent is when you import real data) so that your MERGEs and subsequent MATCHes are fast when performing lookups of the nodes by property.
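To see why the original import ends up with 4 nodes, it can help to simulate what MERGE does: it creates one node per distinct combination of label, property key and value. The sketch below is a simplified model of that behaviour in plain Python; the function name merge_nodes is made up for this illustration.

```python
def merge_nodes(rows, source_spec, target_spec):
    """Simulate MERGE on nodes: one node per distinct (label, key, value).

    rows: list of (source_value, target_value) pairs from the CSV.
    source_spec / target_spec: (label, property_key) used for each column.
    """
    nodes = set()
    for source, target in rows:
        nodes.add((source_spec[0], source_spec[1], source))
        nodes.add((target_spec[0], target_spec[1], target))
    return nodes


rows = [("a", "b"), ("b", "c")]

# Original import: different label and property key per column.
original = merge_nodes(rows, ("s", "s"), ("o", "o"))
# Corrected import: same label and property key for both columns.
fixed = merge_nodes(rows, ("Node", "value"), ("Node", "value"))

print(len(original))  # 4, because b exists once as :s and once as :o
print(len(fixed))     # 3
```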
Now, to get to your longest path query.
If you are assuming that the start node has no incoming relationships, and that your end node has no outgoing relationships, then you can use a query like this:
match (start:Node)
where not ()-->(start)
match p = (start)-[*]->(end)
where not (end)-->()
return p, length(p) as L
order by L desc
limit 1;
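The same idea (start at nodes without incoming relationships, stop at nodes without outgoing ones, keep the longest path) can be sketched in plain Python on the edge list from the CSV. This is an illustrative brute-force search that assumes a small, acyclic graph; the function name longest_path is hypothetical.

```python
def longest_path(edges):
    """Longest directed path from a source node to a sink node.

    edges: list of (source, target) pairs. Brute-force depth-first
    search, intended only for small graphs.
    """
    successors = {}
    has_incoming = set()
    nodes = set()
    for source, target in edges:
        successors.setdefault(source, []).append(target)
        has_incoming.add(target)
        nodes.update((source, target))

    best = []

    def extend(path):
        nonlocal best
        outgoing = successors.get(path[-1], [])
        # A node without outgoing edges ends a candidate path.
        if not outgoing and len(path) > len(best):
            best = list(path)
        for nxt in outgoing:
            if nxt not in path:  # guard against cycles
                extend(path + [nxt])

    # Start only from nodes that nothing points to.
    for start in sorted(nodes - has_incoming):
        extend([start])
    return best


path = longest_path([("a", "b"), ("b", "c")])
print(path)           # ['a', 'b', 'c']
print(len(path) - 1)  # 2, the number of relationships, like length(p)
```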
I have nodes with this structure
(g:Giocatore { nome, match, nazionale})
(nome:'Del Piero', match:'45343', nazionale:'ITA')
(nome:'Messi', match:'65324', nazionale:'ARG')
(nome:'Del Piero', match:'18235', nazionale:'ITA')
The property 'match' is unique (it is the ID of a match), while several nodes share the same 'nome'.
I want to merge all the nodes with the same 'nome' and create a collection of different 'match' like this
(nome:'Del Piero', match:[45343,18235], nazionale:'ITA')
(nome:'Messi', match:'65324', nazionale:'ARG')
I tried with apoc library too but nothing works.
Any idea?
Can you try this query:
MATCH (n:Giocatore)
WITH n.nome AS nome, collect(n) AS node2Merge
WITH node2Merge, [x IN node2Merge | x.match] AS matches
CALL apoc.refactor.mergeNodes(node2Merge) YIELD node
SET node.match = matches
Here I'm using APOC to merge the nodes, but then I do a map transformation on the node list to have an array of match, and I set it on the merged node.
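The grouping that the query performs (one merged node per nome, with all match values collected into a list) can be illustrated in plain Python on the sample data from the question. This is only a sketch of the intended result, not of how APOC merges nodes internally; the function name merge_players is made up.

```python
def merge_players(players):
    """Group player records by 'nome' and collect their 'match' values.

    players: list of dicts with the keys 'nome', 'match' and 'nazionale'.
    The first 'nazionale' seen for a name is kept.
    """
    merged = {}
    for player in players:
        entry = merged.setdefault(
            player["nome"],
            {"nome": player["nome"], "nazionale": player["nazionale"], "match": []},
        )
        entry["match"].append(player["match"])
    return list(merged.values())


players = [
    {"nome": "Del Piero", "match": "45343", "nazionale": "ITA"},
    {"nome": "Messi", "match": "65324", "nazionale": "ARG"},
    {"nome": "Del Piero", "match": "18235", "nazionale": "ITA"},
]
for player in merge_players(players):
    print(player)
```

Note that, as in the query, a name that occurs only once still ends up with a single-element list for match.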
I don't know whether you have a lot of Giocatore nodes; if you do, this query may fail with an OutOfMemory exception, and you will have to batch it. You can, for example, replace the first line with MATCH (n:Giocatore) WHERE n.nome STARTS WITH 'A' and repeat it for each letter, or you can use the apoc.periodic.iterate procedure:
CALL apoc.periodic.iterate(
'MATCH (n:Giocatore) WITH n.nome AS nome, collect(n) AS node2Merge RETURN node2Merge, [x IN node2Merge | x.match] AS matches',
'CALL apoc.refactor.mergeNodes(node2Merge) YIELD node
SET node.match = matches',
{batchSize:1000,parallel:true,retries:3,iterateList:true}
) YIELD batches, total