I am trying to find the two nodes that are furthest from each other in my Neo4j database. For the purposes of my analysis, I take the distance between two nodes to be the length of the shortest path between them. The two furthest nodes are therefore the pair with the longest shortest path between them. I am using the following Cypher syntax to find the shortest path.
Given two nodes as shown in the Neo4j example documentation http://docs.neo4j.org/chunked/milestone/query-match.html#match-shortest-path, I can run the following Cypher query.
MATCH p = shortestPath((martin:Person)-[*..15]-(oliver:Person))
WHERE martin.name = 'Martin Sheen' AND oliver.name = 'Oliver Stone'
RETURN p
My database has more than half a million nodes. The brute-force approach will obviously take a long time. Is there an easier or faster way to find the two nodes?
[As an extra wrinkle .. the graph is weighted but this detail can be ignored.]
If I'm reading this correctly, you want all-pairs shortest path. This will give you a list with each node as a source and the shortest path to every other node. While it does work by weight, you can simply use a weight of 1 for everything.
You'll have to implement this yourself in Java as Cypher doesn't have anything for this.
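For what it's worth, the all-pairs idea with a weight of 1 on every relationship can be sketched outside the database. The snippet below is only an illustration in plain Python, assuming the graph has been exported into an in-memory adjacency dict and is undirected; the names bfs_distances, furthest_pair and toy are made up for this example and are not part of Neo4j. It runs one breadth-first search per node, so it is still the brute-force approach and will be slow on half a million nodes.

```python
from collections import deque

def bfs_distances(adj, source):
    """Hop counts from source to every reachable node (every edge weighs 1)."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbour in adj.get(node, ()):
            if neighbour not in dist:
                dist[neighbour] = dist[node] + 1
                queue.append(neighbour)
    return dist

def furthest_pair(adj):
    """Return (distance, a, b) for a pair with the longest shortest path."""
    best = (0, None, None)
    for source in adj:
        for target, d in bfs_distances(adj, source).items():
            if d > best[0]:
                best = (d, source, target)
    return best

# Hypothetical undirected toy graph: a path a-b-c-d with a branch b-e.
toy = {
    "a": ["b"],
    "b": ["a", "c", "e"],
    "c": ["b", "d"],
    "d": ["c"],
    "e": ["b"],
}

print(furthest_pair(toy))
```

Pairs in different connected components are simply never compared here, since a BFS only reports reachable nodes.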
How do I find multiple distinct short paths between 2 nodes in a graph with 7.5M nodes and 20M relationships?
We want a feature similar to how google maps shows other alternative routes.
problems with:
Dijkstra, shortestPath and allShortestPaths:
Only returns the shortest path or paths with the shortest length.
Yen's k shortest paths:
Absurdly slow on a big graph
Iterating over the numbers 0-10 and calling allShortestPaths with a minimum path length of i:
Absurdly slow on a big graph
After some time, I figured that filtering out (blacklisting) relationships that have already been used for another path could work. However, when I tried this in practice, I noticed that filtering after running allShortestPaths wasn't a viable solution. I then figured I could run allShortestPaths multiple times and, whenever a path is found, rename the type of its relationships to currentTypeName_TEMP, then rename them back afterwards.
To find multiple distinct paths, this could be done with the algorithm shown in the picture, and given the efficiency of allShortestPaths, it will not be that computationally intensive or time-consuming (dotted lines mean found relationships).
EDIT:
It turns out this is rather difficult in Cypher, since it is very restrictive in nature. We ended up writing a plugin containing a hacked version of Dijkstra that returns N of the shortest paths.
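The blacklisting idea from the question (find a path, exclude its relationships, search again) can be illustrated outside Neo4j. This is a rough Python sketch, not the plugin mentioned in the edit: it assumes an undirected, unweighted graph held in an adjacency dict, and the names shortest_path, distinct_paths and toy are invented for the example. It is greedy, so it is not guaranteed to find the largest possible set of relationship-disjoint paths, and it does not return the k shortest paths in Yen's sense.

```python
from collections import deque

def shortest_path(adj, start, goal, banned):
    """BFS path with the fewest hops that avoids every edge in `banned`."""
    parent = {start: None}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            path = []
            while node is not None:
                path.append(node)
                node = parent[node]
            return path[::-1]
        for neighbour in adj.get(node, ()):
            edge = frozenset((node, neighbour))
            if neighbour not in parent and edge not in banned:
                parent[neighbour] = node
                queue.append(neighbour)
    return None

def distinct_paths(adj, start, goal, limit):
    """Collect up to `limit` paths that share no relationship with each other."""
    banned, found = set(), []
    while len(found) < limit:
        path = shortest_path(adj, start, goal, banned)
        if path is None:
            break
        found.append(path)
        # Blacklist every relationship of the path just found.
        banned.update(frozenset(pair) for pair in zip(path, path[1:]))
    return found

# Hypothetical toy graph with two routes from s to t.
toy = {
    "s": ["a", "b"],
    "a": ["s", "t"],
    "b": ["s", "c"],
    "c": ["b", "t"],
    "t": ["a", "c"],
}

print(distinct_paths(toy, "s", "t", 3))
```

Keeping the blacklist in a set avoids having to rename relationship types and rename them back afterwards.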
I have been trying hard to find the longest path in a complex network. I have been through many questions on Stack Overflow and elsewhere on the Internet, but none could help me. I have written a CQL query as follows:
start n=node(*)
match p = (n)-[:LinkTo*1..]->(m)
with n,MAX(length(p)) as L
match p = (n)-[:LinkTo*1..]->(m)
where length(p) = L
return p,L
I don't get any result. Neo4j just keeps running without returning an answer. I also tried executing the query on Neo4j cloud hosting. I didn't get a result there either, but got an error:
"Error undefined-undefined"
I am in dire need of a solution. The result for this answer will help me complete my project. So, anyone please help me in correcting the query.
Well for one you're doing a highly expensive operation twice when you only have to do it once.
Additionally, you are returning one path per every single node in your database, at least (as there may be multiple paths for a node that are the longest paths available for that node). But from your question it sounds like you want the single largest path in the graph, not one each for every single node.
We can also improve your match by only performing the longest-path match on nodes that are at the head of the path, and not somewhere in the middle.
Maybe try this one?
match (n)
where (n)-[:LinkTo]->() and not ()-[:LinkTo]->(n)
match p = (n)-[:LinkTo*1..]->(m)
return p, length(p) as L
order by L desc
limit 1
The problem you're trying to solve is NP-hard. On small sparse graphs a brute-force approach such as the one suggested by InverseFalcon may succeed in reasonable time, but on any reasonably large and/or densely connected graph, you will quickly run into both time and space problems.
If you have a weighted graph, you can find the longest path between 2 nodes by negating all the edge weights and running a shortest weighted path algorithm over the modified graph. However, if you want to find the longest path in the entire graph, you are effectively trying to solve the Travelling Salesman Problem, but with negative edge weights. You can't do that with Cypher.
If your graph is unweighted, I'd find an easier problem, or see if you can convert your graph to a weighted one and tackle it as described above. Alternatively, see if you can frame your requirements in such a way that you don't need to find a longest path.
Basically, I want to find the shortest paths for all (s,t) pairs, but with several considerations. For instance, the network contains several clusters/communities, or groups of nodes. These groups will be predefined and can contain a relatively large number of nodes.
I want to find the shortest paths for all s,t pairs that traverse at least one node from, e.g., group1. In the general case, if I have only one group of nodes, the problem reduces to traditional betweenness centrality. Later I would like to find, for all s,t pairs, the shortest paths that traverse at least one node from group1 and at least one from group2.
Any suggestions?
Thanks! :)
From your description, it seems to me that you want to get the shortest paths for individual groups of nodes. If so, I think you are heading in the right direction.
It will be more robust if you find the shortest paths for the individual groups first and then combine the nodes of the different groups. That should save time.
I think you can use the Particle Swarm Optimization algorithm to solve the problem. It can help you by using multiple swarms to get the shortest paths of the different groups. Then you can combine the nodes from the different groups.
Hope it helps. :)
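As a concrete alternative for the single-group case, here is a small Python sketch. It rests on assumptions that are not in the question: the graph is undirected and unweighted, it fits in an adjacency dict, and revisiting a node is acceptable (so strictly it finds the shortest walk, which can differ from a simple path when the group node sits on a dead end). The names bfs_distances, shortest_via_group and toy are hypothetical. The idea is that the shortest s-t route through a group has length min over g in the group of d(s,g) + d(g,t), which takes two breadth-first searches per (s,t) pair.

```python
from collections import deque

def bfs_distances(adj, source):
    """Hop counts from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbour in adj.get(node, ()):
            if neighbour not in dist:
                dist[neighbour] = dist[node] + 1
                queue.append(neighbour)
    return dist

def shortest_via_group(adj, s, t, group):
    """Return (length, g) for the shortest s-t walk through some g in group."""
    from_s = bfs_distances(adj, s)
    from_t = bfs_distances(adj, t)  # undirected graph, so d(g, t) == d(t, g)
    best = None
    for g in group:
        if g in from_s and g in from_t:
            total = from_s[g] + from_t[g]
            if best is None or total < best[0]:
                best = (total, g)
    return best

# Hypothetical toy graph: s-a-t is the unconstrained shortest route (2 hops).
toy = {
    "s": ["a", "g1"],
    "a": ["s", "t"],
    "g1": ["s", "x"],
    "x": ["g1", "t", "g2"],
    "g2": ["x"],
    "t": ["a", "x"],
}

print(shortest_via_group(toy, "s", "t", ["g1", "g2"]))
```

For all (s,t) pairs, the BFS result for each source can be computed once and reused. Requiring a node from group1 and a node from group2 would additionally need the distances between the nodes of the two groups.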
I am trying to find the number of relationships that stem from a parent node, and I am not sure which syntax to use to get at the returned integer. I can be sure in my code that each child node has only one relationship of a particular type, so this allows me to capture a "true" depth reading.
My attempt is this but I am hoping there is a cleaner way:
MATCH p=(n {id:'123'})-[r:Foo*]->(c)
RETURN length(p)
I am not sure this is the correct syntax because it returns an array of integers with the last index being the true tally length. I am hoping for something that just returns an int instead of this mentioned array.
I am very grateful for help that you may be able to offer.
As Nicole says, in general, finding the longest path between two nodes in a graph is not feasible in any reasonable time. If your graph is very small, it is possible that you will be able to find all paths, and select the one with the most edges but this won't scale to larger graphs.
However there is a trick that you can do in certain circumstances. If your graph contains no directed cycles, you can assign each edge a weight of -1, and then look for the shortest weighted path between the source and target nodes. Since the edge weights are negative a shortest weighted path must correspond to a path with a maximum number of edges between the desired nodes.
Unfortunately, Cypher doesn't yet support shortest weighted path algorithms; however, the Neo4j database engine does. The docs give an example of how to do this. You will also need to implement your own algorithm, such as Bellman-Ford, using the traversal API, because Dijkstra won't work with negative edge weights.
However, please be aware that this trick won't work if your graph contains cycles - it must be a DAG.
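To make the -1 weight trick concrete, here is a minimal Python sketch of Bellman-Ford over an edge list. It is not the Neo4j traversal API; it assumes the relationships have been exported as (from, to) pairs, and the names longest_path_length, dag_edges and dag_nodes are invented for this example. The final pass is the standard negative-cycle check, which in this setting detects a directed cycle reachable from the source.

```python
def longest_path_length(edges, nodes, source, target):
    """Max number of edges on a source->target path, using weight -1 per edge."""
    INF = float("inf")
    dist = {node: INF for node in nodes}
    dist[source] = 0
    # Relax every edge up to len(nodes) - 1 times, as in Bellman-Ford.
    for _ in range(len(nodes) - 1):
        changed = False
        for u, v in edges:
            if dist[u] - 1 < dist[v]:
                dist[v] = dist[u] - 1
                changed = True
        if not changed:
            break
    # Any further improvement means a negative cycle, i.e. a directed cycle.
    for u, v in edges:
        if dist[u] - 1 < dist[v]:
            raise ValueError("graph has a directed cycle reachable from source")
    return None if dist[target] == INF else -dist[target]

# Hypothetical DAG: the longest a->d route is a, b, c, d (3 edges).
dag_nodes = ["a", "b", "c", "d"]
dag_edges = [("a", "b"), ("b", "c"), ("a", "c"), ("c", "d"), ("a", "d")]

print(longest_path_length(dag_edges, dag_nodes, "a", "d"))
```

The function returns None when the target cannot be reached from the source.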
Your query:
MATCH p=(n {id:'123'})-[r:Foo*]->(c)
RETURN length(p)
is returning the length of ALL possible paths from n to c. You probably are only interested in the shortest path? You can use the shortestPath function to only consider the shortest path from n to c:
MATCH p = shortestPath((n {id:'123'})-[r:Foo*]->(c))
RETURN length(p)
My question is: is it possible to implement Dijkstra's algorithm using Cypher? The explanation on the Neo4j website only talks about the REST API, and it is very difficult to understand for a beginner like me.
Please note that I want to find the shortest path with the shortest distance between two nodes, and not the shortest path (involving least number of relationships) between two nodes. I am aware of the shortestPath algorithm that is very easy to implement using Cypher, but it does not serve my purpose.
Kindly guide me on how to proceed if I have a graph database with nodes, and relationships between the nodes that have the property 'distance'. All I want is to write code that finds the shortest distance between two nodes in the database. Or do you have any tips if I need to change my approach and use some other program for this?
In this case you can use allShortestPaths, order the paths in ascending order by the summed distance property of their relationships, and return only the first one. Based on your last post, it would be something like this:
MATCH (from: Location {LocationName:"x"}), (to: Location {LocationName:"y"}) ,
paths = allShortestPaths((from)-[:CONNECTED_TO*]->(to))
WITH REDUCE(dist = 0, rel in rels(paths) | dist + rel.distance) AS distance, paths
RETURN paths, distance
ORDER BY distance
LIMIT 1
No, it's not possible in a reasonable way unless you use transactions and basically rewrite the algorithm.
The previous answer is wrong, as longer but less expensive paths will not be returned in the allShortestPaths subset. You would be filtering a subset of paths that was chosen without considering relationship cost.
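A small example may make the difference clearer. The sketch below is plain Python rather than Cypher, assuming non-negative 'distance' values and a graph exported into an adjacency dict of (neighbour, distance) pairs; the names dijkstra and roads are made up for the illustration. In the toy data the path with the fewest relationships (x to y directly) costs 10, while a three-relationship path costs 6, and only a cost-aware search finds the latter.

```python
import heapq

def dijkstra(adj, source, target):
    """Cheapest path by summed distance; returns (total, path) or None."""
    heap = [(0, source, [source])]
    settled = set()
    while heap:
        cost, node, path = heapq.heappop(heap)
        if node in settled:
            continue
        settled.add(node)
        if node == target:
            return cost, path
        for neighbour, distance in adj.get(node, ()):
            if neighbour not in settled:
                heapq.heappush(heap, (cost + distance, neighbour, path + [neighbour]))
    return None

# Hypothetical directed toy graph: the direct hop x->y is the most expensive route.
roads = {
    "x": [("y", 10), ("m", 2)],
    "m": [("n", 2)],
    "n": [("y", 2)],
}

print(dijkstra(roads, "x", "y"))
```

Restricting the candidates to the fewest-hop paths first would only ever see the direct x to y relationship here.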