Get all Routes between two nodes neo4j - neo4j

I'm working on a project where I have to deal with graphs.
I'm using a graph to get routes by bus and bike between two stops.
Every relationship stores the time needed to travel from its start node to its end node.
To get the shortest path between two nodes, I'm using Cypher's shortestPath function. But sometimes the shortest path is not the fastest.
Is there a way to get all paths between two nodes that are not directly linked by a relationship?
Thanks
EDIT:
I have since changed my graph to make it simpler.
I still have all my nodes, but now the relationship type corresponds to the time needed to go from one node to another.
Cypher's shortestPath function returns the path with the fewest relationships. I would like it to return the path where the sum of all the types (the times) is the smallest.
Is that possible?
Thanks

In Cypher, to get all paths between two nodes that are not directly linked by a relationship, sorted by a total weight, you can use the reduce function introduced in 1.9:
start a=node(...), b=node(...) // get your start nodes
match p=a-[r*2..5]->b // match paths (best to provide maximum lengths to prevent queries from running away)
where not(a-->b) // where a is not directly connected to b
with p, relationships(p) as rcoll // just for readability, alias rcoll
return p, reduce(totalTime=0, x in rcoll: totalTime + x.time) as totalTime
order by totalTime
You can throw a limit 1 at the end, if you need only the shortest.
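To make the idea behind the query concrete, here is a minimal Python sketch of the same logic: enumerate every path of 2 to 5 hops, sum the time on each hop, and sort by the total. The toy graph, the stop names and the function name paths_by_total_time are made up for illustration; this is not how Neo4j evaluates the query internally.

```python
# Hypothetical toy graph: stop -> list of (next stop, travel time).
GRAPH = {
    "A": [("B", 10), ("C", 2)],
    "B": [("D", 1)],
    "C": [("B", 3), ("D", 20)],
    "D": [],
}

def paths_by_total_time(graph, start, end, min_hops=2, max_hops=5):
    """Return (total_time, path) pairs for paths within the hop bounds, fastest first."""
    results = []

    def walk(node, path, used_edges, total):
        hops = len(path) - 1
        if node == end and hops >= min_hops:
            results.append((total, list(path)))
        if hops == max_hops:
            return
        for nxt, time in graph.get(node, []):
            edge = (node, nxt)
            if edge in used_edges:  # never reuse a relationship within one path
                continue
            used_edges.add(edge)
            path.append(nxt)
            walk(nxt, path, used_edges, total + time)
            path.pop()
            used_edges.remove(edge)

    walk(start, [start], set(), 0)
    return sorted(results)

ranked = paths_by_total_time(GRAPH, "A", "D")
print(ranked[0])  # the fastest route, which here uses more hops than the alternatives
```

In this toy graph the fastest route has three hops while two slower routes have only two, which is exactly the case where the fewest-hops path is not the fastest.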

You can use the Dijkstra or A* algorithm, which seems to be a perfect fit for your case. Take a look at http://api.neo4j.org/1.8.1/org/neo4j/graphalgo/GraphAlgoFactory.html
Unfortunately you cannot use those from Cypher.
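To show what Dijkstra's algorithm does with the time weights, here is a small standalone Python sketch. It is only an illustration of the algorithm on a made-up graph (the stop names, times and the function name dijkstra are assumptions), not the GraphAlgoFactory API.

```python
import heapq

# Hypothetical toy graph: stop -> list of (next stop, travel time).
ROUTES = {
    "A": [("B", 10), ("C", 2)],
    "B": [("D", 1)],
    "C": [("B", 3), ("D", 20)],
    "D": [],
}

def dijkstra(graph, start, end):
    """Return (total_time, path) for the fastest route, or None if end is unreachable."""
    queue = [(0, start, [start])]  # (time so far, current stop, path taken)
    settled = set()
    while queue:
        total, node, path = heapq.heappop(queue)
        if node == end:
            return total, path
        if node in settled:
            continue
        settled.add(node)
        for nxt, time in graph.get(node, []):
            if nxt not in settled:
                heapq.heappush(queue, (total + time, nxt, path + [nxt]))
    return None

print(dijkstra(ROUTES, "A", "D"))
```

Unlike enumerating all paths, this only expands the cheapest frontier, so it scales much better. It assumes all times are non-negative.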

Related

Is neo4j suitable for searching for paths of specific length

I am a total newcomer to the world of graph databases, but let's put that aside.
I have a task to find a circular path of a certain length (or of any other measure) from a start point and back.
So for example, I need to find a path from one node and back which is 10 "nodes" long and at the same time has around 15 weights of some kind. This is just an example.
Is this somehow possible with neo4j, or is it even the right thing to use?
Hope I clarified it enough, and thank you for your answers.
Regards
Neo4j is a good choice for cycle detection.
If you need to find one path from n to n of length 10, you could try some query like this one:
MATCH p=(n:TestLabel {uuid: 1})-[rels:TEST_REL_TYPE*10]-(n)
RETURN p LIMIT 1
The match clause here is asking Cypher to find all paths from n to itself, of exactly 10 hops, using a specific relationship type. This is called variable length relationships in Neo4j. I'm using limit 1 to return only one path.
Resulting path can be visualized as a graph:
You can also specify a range of length, such as [*8..10] (from 8 to 10 hops away).
I'm not sure I understand what you mean with:
has around 15 weights of some kind
You can check relationships properties, such as weight, in variable length paths if you need to. Specific example in the doc here.
Maybe you will also be interested in shortestPath() and allShortestPaths() functions, for which you need to know the end node as well as the start one, and you can find paths between them, even specifying the length.
Since you did not provide a data model, I will just assume that your starting/ending nodes all have the Foo label, that the relevant relationships all have the BAR type, and that your circular path relationships all point in the same direction (which should be faster to process, in general). Also, I gather that you only want circular paths of a specific length (10). Finally, I am guessing that you prefer circular paths with lower total weight, and that you want to ignore paths whose total weight exceeds a bounding value (15). This query accomplishes the above, returning the matching paths and their path weights, in ascending order:
MATCH p=(f:Foo)-[rels:BAR*10]->(f)
WITH p, REDUCE(s = 0, r IN rels | s + r.weight) AS pathWeight
WHERE pathWeight <= 15
RETURN p, pathWeight
ORDER BY pathWeight;
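For readers who want to see the logic spelled out, the following Python sketch does the same thing on a tiny in-memory graph: it walks directed paths of an exact hop count that return to the start, sums the weights, and keeps those under a bound. The graph, the node names and the function name weighted_cycles are hypothetical and only serve the example.

```python
# Hypothetical toy graph: node -> list of (next node, weight).
EDGES = {
    "A": [("B", 1)],
    "B": [("C", 2), ("D", 1)],
    "C": [("A", 3)],
    "D": [("A", 10)],
}

def weighted_cycles(graph, start, length, max_weight):
    """Return (weight, cycle) pairs for cycles of exactly `length` hops, lightest first."""
    found = []

    def walk(node, path, used, weight):
        hops = len(path) - 1
        if hops == length:
            if node == start and weight <= max_weight:
                found.append((weight, list(path)))
            return
        for nxt, w in graph.get(node, []):
            edge = (node, nxt)
            if edge in used:  # a relationship may appear only once per path
                continue
            used.add(edge)
            path.append(nxt)
            walk(nxt, path, used, weight + w)
            path.pop()
            used.remove(edge)

    walk(start, [start], set(), 0)
    return sorted(found)

print(weighted_cycles(EDGES, "A", 3, 10))
```

With a bound of 10 only the lighter of the two 3-hop cycles through A is returned; raising the bound to 15 returns both, lightest first.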

Optimizing Cypher Query

I am just starting to work with Neo4j and its query language, Cypher.
I have multiple queries that follow the same pattern.
I am doing some comparison between a SQL database and Neo4j.
In my Neo4j database I have one label (person) and one relationship type (FRIENDSHIP). A person has the properties personID, name, email, phone.
Now I want to find the friends of the n-th degree. I also want to filter out those persons who are also friends of a lower degree.
For example, if I search for friends of the 3rd degree, I want to filter out those who are also friends of the first and/or second degree.
Here is my query:
MATCH (me:person {personID:'1'})-[:FRIENDSHIP*3]-(friends:person)
WHERE NOT (me:person)-[:FRIENDSHIP]-(friends:person)
AND NOT (me:person)-[:FRIENDSHIP*2]-(friends:person)
RETURN COUNT(DISTINCT friends);
I found something similar somewhere.
This query works.
My problem is that this query pattern is much too slow when I search for a higher degree of friendship and/or when the number of persons grows.
So I would really appreciate it if someone could help me optimize this.
If you just wanted to handle depths of 3, this should return the distinct nodes that are 3 degrees away but not also less than 3 degrees away:
MATCH (me:person {personID:'1'})-[:FRIENDSHIP]-(f1:person)-[:FRIENDSHIP]-(f2:person)-[:FRIENDSHIP]-(f3:person)
RETURN apoc.coll.subtract(COLLECT(f3), COLLECT(f1) + COLLECT(f2) + me) AS result;
The above query uses the APOC function apoc.coll.subtract to remove the unwanted nodes from the result. The function also makes sure the collection contains distinct elements.
The following query is more general, and should work for any given depth (by just replacing the number after *). For example, this query will work with a depth of 4:
MATCH p=(me:person {personID:'1'})-[:FRIENDSHIP*4]-(:person)
WITH NODES(p)[0..-1] AS priors, LAST(NODES(p)) AS candidate
UNWIND priors AS prior
RETURN apoc.coll.subtract(COLLECT(DISTINCT candidate), COLLECT(DISTINCT prior)) AS result;
The problem with Cypher's variable-length relationship matching is that it's looking for all possible paths to that depth. This can cause unnecessary performance issues when all you're interested in are the nodes at certain depths and not the paths to them.
APOC's path expander using 'NODE_GLOBAL' uniqueness is a more efficient means of matching to nodes at inclusive depths.
When using 'NODE_GLOBAL' uniqueness, nodes are only ever visited once during traversal. Because of this, when we set the path expander's minLevel and maxLevel to the same value, the result is the set of nodes at that level that are not present at any lower level, which is exactly the result you're trying to get.
Try this query after installing APOC:
MATCH (me:person {personID:'1'})
CALL apoc.path.expandConfig(me, {uniqueness:'NODE_GLOBAL', minLevel:4, maxLevel:4}) YIELD path
// a single path for each node at depth 4 but not at any lower depth
RETURN COUNT(path)
Of course you'll want to parameterize your inputs (personID, level) when you get the chance.
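The reason visiting each node only once gives "friends at exactly depth n" can be seen in a plain breadth-first search. The Python sketch below mirrors that idea on a made-up friendship list; the data and the function names are assumptions for illustration, and it says nothing about how APOC is implemented.

```python
# Hypothetical undirected friendships between person IDs.
FRIENDSHIPS = [(1, 2), (2, 3), (3, 4), (1, 3)]

def build_adjacency(pairs):
    """Turn undirected pairs into a dict of neighbour sets."""
    adjacency = {}
    for a, b in pairs:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)
    return adjacency

def nodes_at_exact_depth(adjacency, start, depth):
    """Return nodes whose shortest distance from start is exactly `depth`."""
    visited = {start}
    frontier = [start]
    for _ in range(depth):
        next_frontier = []
        for node in frontier:
            for neighbour in adjacency.get(node, ()):
                if neighbour not in visited:  # each node is visited once, at its lowest depth
                    visited.add(neighbour)
                    next_frontier.append(neighbour)
        frontier = next_frontier
    return set(frontier)

adjacency = build_adjacency(FRIENDSHIPS)
print(nodes_at_exact_depth(adjacency, 1, 2))
```

Person 3 is reachable in two hops via person 2, but because it was already visited at depth 1 it is not reported at depth 2. The work is proportional to the number of nodes and relationships reached rather than the number of paths.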

Neo4j and Cypher - How can I create/merge chained sequential node relationships (and even better time-series)?

To keep things simple, as part of the ETL on my time-series data, I added a sequence number property to each row corresponding to 0..370365 (370,366 nodes, 5,555,490 properties - not that big). I later added a second property and named it "outeseq" (original) and "ineseq" (second) to see if an outright equivalence to base the relationship on might speed things up a bit.
I can get both of the following queries to run properly on up to ~30k nodes (LIMIT 30000), but past that it's just an endless wait. My JVM has a 16 GB maximum (if it can even use it on a Windows box):
MATCH (a:BOOK),(b:BOOK)
WHERE a.outeseq=b.outeseq-1
MERGE (a)-[s:FORWARD_SEQ]->(b)
RETURN s;
or
MATCH (a:BOOK),(b:BOOK)
WHERE a.outeseq=b.ineseq
MERGE (a)-[s:FORWARD_SEQ]->(b)
RETURN s;
I also added these in hopes of speeding things up:
CREATE CONSTRAINT ON (a:BOOK)
ASSERT a.outeseq IS UNIQUE
CREATE CONSTRAINT ON (b:BOOK)
ASSERT b.ineseq IS UNIQUE
I can't get the relationships created for the entire data set! Help!
Alternatively, I can also get bits of the relationships built with parameters, but haven't figured out how to parameterize the sequence over all of the node-to-node sequential relationships, at least not in a semantically general enough way to do this.
I profiled the query, but didn't see any reason for it to "blow up".
Another question: I would like each relationship to have a property representing the difference between the timestamps of its two nodes (delta-t). Is there a way to take the difference between the values of two sequential nodes and assign it to the relationship, for all of the relationships at the same time?
One last question, if you have the time: I'd really like to use the raw data and just chain directed relationships from one node's timestamp to the nearest node with the minimum delta, but I didn't attempt this for fear that it would cause a scan of all the nodes in order to build each relationship.
Before anyone suggests that I look to KDB or other db's for time series, let me say I have a very specific reason to want to use a DAG representation.
It seems like this should be so easy...it probably is and I'm blind. Thanks!
Creating Relationships
Since your queries work on 30k nodes, I'd suggest running them page by page over all the nodes. This seems feasible because outeseq and ineseq are unique and numeric, so you can sort nodes by those properties and run the query against one slice at a time.
MATCH (a:BOOK),(b:BOOK)
WHERE a.outeseq = b.outeseq-1
WITH a, b ORDER BY a.outeseq SKIP {offset} LIMIT 30000
MERGE (a)-[s:FORWARD_SEQ]->(b)
RETURN s;
You will need to run the query about 13 times, changing {offset} each time, to cover all the data. It would be nice to write a script in any language that has a Neo4j client.
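As a rough sketch of the scripting part, the Python snippet below only computes the offsets to feed into the query, using the node count from the question and the 30000 slice size. The function name batch_offsets is made up, and actually sending the query through a Neo4j client is left out.

```python
def batch_offsets(total_rows, batch_size):
    """Return the SKIP offsets needed to cover total_rows in slices of batch_size."""
    return list(range(0, total_rows, batch_size))

# 370366 nodes from the question, processed 30000 at a time.
offsets = batch_offsets(370366, 30000)
print(len(offsets), offsets[:3], offsets[-1])  # 13 [0, 30000, 60000] 360000
```

Each offset would then be passed as the {offset} parameter for one run of the query.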
Updating Relationship's Properties
You can assign timestamp delta to relationships using SET clause following the MATCH. Assuming that a timestamp is a long:
MATCH (a:BOOK)-[s:FORWARD_SEQ]->(b:BOOK)
SET s.delta = abs(b.timestamp - a.timestamp);
Chaining Nodes With Minimal Delta
Once the relationships carry the delta property, the graph becomes a weighted graph, so we can apply this approach to calculate the shortest path using deltas. Then we just save the length of the shortest path (the sum of deltas) on a relationship between the first and the last node.
MATCH p=(a:BOOK)-[:FORWARD_SEQ*1..]->(b:BOOK)
WITH p AS shortestPath, a, b,
reduce(weight = 0, r IN relationships(p) | weight + r.delta) AS totalDelta
ORDER BY totalDelta ASC
LIMIT 1
MERGE (a)-[nearest:NEAREST {delta: totalDelta}]->(b)
RETURN nearest;
Disclaimer: the queries above are not guaranteed to work as written; they only hint at possible approaches to the problem.

Neo4J find shortest path using all nodes (unordered) cypher

I'm not sure if this can be done in any efficient way, but I'm hoping it can be.
I am receiving a data set that allows me to find very specific nodes. However, this data is not ordered in any way in terms of how the nodes are connected.
What I am trying to do is find all of these nodes in Neo4j (up to 7) and then, with these 7 nodes, find the path that connects them all.
The given nodes will be the only nodes on the desired path.
Basically, I'm trying to take a set that looks like
1,2,3,4,5,6,7
and to be able to find
2->7->6->3<-5<-1->4
Any help or direction would be greatly appreciated.
The way I would do it is the following:
You need a starting node from which to query the next 7 nodes. To be able to find the very first 7 nodes I would introduce a starting root node. Let's call it simply :Root.
MATCH (:Root)-[r:NEXT*1..7]->(x)<-[]-(y) RETURN x, y
or even simpler:
MATCH (:Root)-[r:NEXT*..7]->(x)<-[]-(y) RETURN x, y
:Root of course could be any other node in your set, to get the next seven nodes from there on.
Is this what you want?
Take a further look at the following neo4j cheat sheet, which has some great tips:
http://assets.neo4j.org/download/Neo4j_CheatSheet_v3.pdf
Regards
EDIT
Ok sorry, I misunderstood you.
Maybe this brings you further:
MATCH (n:Node) where n.refId in [1,2,3,4,5,6,7]
MATCH (n2:Node) where n2.refId in [1,2,3,4,5,6,7]
MATCH p=shortestPath((n)-[:NEXT*]-(n2))
return collect(distinct p)
or if those numbers are node IDs, then like this:
MATCH (n:Node) where id(n) in [1,2,3,4,5,6,7]
MATCH (n2:Node) where id(n2) in [1,2,3,4,5,6,7]
MATCH p=shortestPath((n)-[:NEXT*]-(n2))
return collect(distinct p)
This actually returns all the paths between the given nodes as a collection.
So it doesn't return a single path for all those nodes.
I am not aware of a function doing that.
However, the Neo4j browser displays just a single path between all the desired nodes, because of its auto-complete function. So I think you would have to build your own logic in code if you want to connect those paths into a single one.
Maybe this is at least a starting point for the problem
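As one possible shape for that logic in code, here is a small Python sketch. It assumes you have already fetched the relationships that directly link the given nodes, and it simply tries every ordering of the nodes until consecutive nodes are all linked, ignoring direction. With at most 7 nodes that is at most 5040 orderings, so brute force is fine. The edge list is taken from the example in the question; the function name path_through_all is made up.

```python
from itertools import permutations

# Links between the seven nodes, taken from the example 2->7->6->3<-5<-1->4.
LINKS = [(2, 7), (7, 6), (6, 3), (5, 3), (1, 5), (1, 4)]

def path_through_all(nodes, edges):
    """Return one ordering of `nodes` where every consecutive pair is linked, or None."""
    linked = {frozenset(edge) for edge in edges}  # ignore direction, as in -[:NEXT*]-
    for order in permutations(nodes):
        if all(frozenset(pair) in linked for pair in zip(order, order[1:])):
            return list(order)
    return None

print(path_through_all([1, 2, 3, 4, 5, 6, 7], LINKS))
```

If no ordering works (for example because one node is not linked to any of the others), the function returns None.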

Cypher query shortest path

I built a graph this way: the nodes represent bus stops, and the relationships represent the bus lines linking the bus stops to each other.
The relationship type corresponds to the time needed to go from one node to another.
When I query the graph with Cypher to get the shortestPath between two nodes that may not be directly linked, the result is the path that uses the fewest relationships.
I would like to change that so the shortest path is the one where the sum of all the relationship types used between the two nodes (which correspond to the time) is the smallest.
First, you are doing it wrong. Don't use a unique relationship type for each time. Use one relationship type and put a "time" property on all relationships.
Second, to calculate the sum you can use this Cypher query:
START from=node({busStopId1}), to=node({busStopId2})
MATCH p=from-[:LINE*]-to // asterisk * means any number of hops
RETURN p,reduce(total = 0, r in relationships(p): total + r.time) as tt
ORDER by tt asc;

Resources