Topological sort in Gremlin - graph-algorithm

Using the Gremlin/TinkerPop query language, is there a way to compute the topological ordering of a directed acyclic graph?
For example, given a graph with the following edges
a -> b, a -> d, b -> c, c -> d, e -> c
I would like to obtain one of the following topological orderings: a, b, e, c, d, or a, e, b, c, d, or e, a, b, c, d.

Alright, let's create your sample graph first:
g = TinkerGraph.open().traversal()
g.addV().property(id, "a").as("a").
addV().property(id, "b").as("b").
addV().property(id, "c").as("c").
addV().property(id, "d").as("d").
addV().property(id, "e").as("e").
addE("link").from("a").to("b").
addE("link").from("a").to("d").
addE("link").from("b").to("c").
addE("link").from("c").to("d").
addE("link").from("e").to("c").iterate()
And this is Kahn's algorithm implemented in Gremlin:
gremlin> g.V().not(__.inE()).store("x").
repeat(outE().store("e").inV().not(inE().where(without("e"))).store("x")).
cap("x")
==>[v[a],v[b],v[e],v[c],v[d]]
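For comparison, the same idea can be sketched outside of Gremlin. The following is a minimal Python version of Kahn's algorithm run on the sample graph from the question; the function name `kahn_topological_order` and the vertex and edge lists are illustrative assumptions, not part of the original answer. It may produce a different (but equally valid) ordering than the traversal above.

```python
from collections import deque


def kahn_topological_order(vertices, edges):
    """Return one topological ordering of a DAG (hypothetical helper)."""
    indegree = {v: 0 for v in vertices}
    successors = {v: [] for v in vertices}
    for src, dst in edges:
        successors[src].append(dst)
        indegree[dst] += 1

    # Start with every vertex that has no incoming edge.
    queue = deque(v for v in vertices if indegree[v] == 0)
    order = []
    while queue:
        v = queue.popleft()
        order.append(v)
        # "Remove" the outgoing edges of v; any successor left without
        # incoming edges becomes ready to be emitted.
        for w in successors[v]:
            indegree[w] -= 1
            if indegree[w] == 0:
                queue.append(w)

    if len(order) != len(vertices):
        raise ValueError("graph contains a cycle")
    return order


# Sample graph from the question.
vertices = ["a", "b", "c", "d", "e"]
edges = [("a", "b"), ("a", "d"), ("b", "c"), ("c", "d"), ("e", "c")]
order = kahn_topological_order(vertices, edges)
print(order)
```

With a FIFO queue and this insertion order, the sketch yields a, e, b, c, d, which is one of the orderings listed in the question.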

Related

How to get Neo4j nodes linked through another node?

I have a node A and an associated node B.
Nodes C and D are connected to node B too.
I need to get all A nodes together with the C and D nodes associated with them.
MATCH (a:A) -- (:B) -- (c:C)
RETURN a, c
This returns A and the C nodes bound to it (via B).
Is it possible to get nodes A, C and D in the same query?
You can write your query in two ways like below:
Option #1:
MATCH (a:A) -- (b:B) -- (c:C), (b) -- (d:D)
RETURN a, c, d
Option #2:
MATCH (a:A) -- (b:B) -- (c:C)
MATCH (b) -- (d:D)
RETURN a, c, d
which is basically the same as the first query.

Get end nodes when using *

Let's say I have points a, b, c, d, chained as a->b->c->d by next relationships. Below is the merge command for it.
merge (a:point{id:'a'})-[:next]-(b:point{id:'b'})-[:next]-(c:point{id:'c'})-[:next]-(d:point{id:'d'}) return a, b, c, d
[Image: Neo4j graph view of the merged nodes]
What I am looking for is a way to get all the points between a and d. Using the query below I get the relationships, but how can I get a list that contains b and c?
match p=(start:point)-[:next*]-(end:point)
with *, relationships(p) as r
where start.id ='a' and end.id = 'd'
return start, r, end
You can do
RETURN nodes(p)[1..-1]
to get the nodes except the first and last one.
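The slice semantics are the same as in Python list slicing: the start index is inclusive, the end index is exclusive, and -1 refers to the last element. A tiny sketch, assuming the path nodes are represented by their ids in a plain list (the variable names are illustrative only):

```python
# Node ids along the path a -> b -> c -> d, in path order.
path_nodes = ["a", "b", "c", "d"]

# Drop the first and the last element, like nodes(p)[1..-1] in Cypher.
inner_nodes = path_nodes[1:-1]
print(inner_nodes)
```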

Neo4j - Find missing node to complete circle

I am trying to write a query that, starting from a given node, returns the missing node that would complete a circle if a new relationship to it were created. It should also report which node will end up having a relationship with the input node once the circle is closed. Example:
Let's say I have B -> C and C -> A. In this case, if I pass A as input, I would like to receive { newRelationToMake: B, relationToInputNode: C } as a result, since connecting A -> B will result in a closed circle ABC, and the relationship that node A will have will come from C.
Ideally, this query should work up to a maximum depth of n. For example, for a depth of 4, with relations B -> C, C -> D and D -> A, if I pass A as input, I would need to receive { newRelationToMake: C, relationToInputNode: D } (since connecting A -> C closes the ACD circle) but also { newRelationToMake: B, relationToInputNode: D } (since connecting A -> B closes the ABCD circle).
Is there any query to get this information?
Thanks in advance!
You are basically asking for all distinct nodes on paths leading to A, but which are not directly connected to A.
Here is one approach (assuming the nodes all have a Foo label and the relationships all have the BAR type):
MATCH (f:Foo)-[:BAR*2..]->(a:Foo)
WHERE a.id = 'A' AND NOT (f)-[:BAR]->(a)
RETURN DISTINCT f AS missingNodes
The variable-length relationship pattern [:BAR*2..] looks for all paths of length 2 or more.
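To make the logic of that query concrete, here is a rough Python sketch of the same selection on an in-memory edge list: collect every node that reaches the target in two or more hops, then drop the nodes that already point at it directly. The function name `missing_nodes` and the edge-list representation are assumptions for illustration, and the sketch assumes the graph has no cycle through the target.

```python
def missing_nodes(edges, target):
    """Nodes with a path of length >= 2 to target but no direct edge to it."""
    # Map each node to the set of nodes that point at it.
    predecessors = {}
    for src, dst in edges:
        predecessors.setdefault(dst, set()).add(src)

    direct = predecessors.get(target, set())

    # Walk backwards from the direct predecessors; everything found
    # from here on is at least two hops away from the target.
    seen = set()
    stack = list(direct)
    while stack:
        node = stack.pop()
        for pred in predecessors.get(node, ()):
            if pred not in seen:
                seen.add(pred)
                stack.append(pred)

    return seen - direct - {target}


# Second example from the question: B -> C, C -> D, D -> A.
result = missing_nodes([("B", "C"), ("C", "D"), ("D", "A")], "A")
print(sorted(result))
```

For the question's second example this returns B and C, matching the two newRelationToMake values the asker expects.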

cypher -- missing one possible path

I'm new to Cypher.
I have this database:
CREATE
(a:City {name: 'A'}),
(b:City {name: 'B'}),
(c:City {name: 'C'}),
(d:City {name: 'D'}),
(e:City {name: 'E'}),
(a)-[:HAS_RAIL_ROAD_TO {distance : 5 }]->(b),
(b)-[:HAS_RAIL_ROAD_TO {distance : 4 }]->(c),
(c)-[:HAS_RAIL_ROAD_TO {distance : 8 }]->(d),
(d)-[:HAS_RAIL_ROAD_TO {distance : 8 }]->(c),
(d)-[:HAS_RAIL_ROAD_TO {distance : 6 }]->(e),
(a)-[:HAS_RAIL_ROAD_TO {distance : 5 }]->(d),
(c)-[:HAS_RAIL_ROAD_TO {distance : 2 }]->(e),
(e)-[:HAS_RAIL_ROAD_TO {distance : 3 }]->(b),
(a)-[:HAS_RAIL_ROAD_TO {distance : 7 }]->(e)
When I execute
MATCH(:City { name: 'A' })-[r:HAS_RAIL_ROAD_TO*4]->(:City { name: 'C' })
return count(r)
I was expecting 3 as the result:
(A, B, C, D, C); (A, D, C, D, C); (A, D, E, B, C).
But the result given is 2.
I think that in the second case, (A, D, C, D, C), it is not coming back to D.
What do you think is the reason for this?
This has to do with the uniqueness behavior in Cypher traversals, which ensures that a relationship can only be traversed once per path per MATCH pattern.
(A, D, C, D, C) will not work because there are only two relationships between D and C, and the D, C, D part traversed both of them, leaving no other relationships available to traverse again back from D to C.
This uniqueness behavior is useful for most cases, and also prevents any kind of infinite loop problems with unbounded variable-length patterns.
If you do need to consider reusing relationships in your paths, you'll need a different approach, one that lets you change the uniqueness behavior during traversal.
You can use the path expander procs from APOC Procedures to change the uniqueness and expand out, but PLEASE make sure to set an upper bound (via the maxLevel config property); otherwise you risk an infinite traversal, which will likely blow the heap.
MATCH (start:City { name: 'A' }), (end:City { name: 'C' })
CALL apoc.path.expandConfig(start, {endNodes:[end], minLevel:4, maxLevel:4, relationshipFilter:'HAS_RAIL_ROAD_TO>', uniqueness:'NONE'}) YIELD path
RETURN [node in nodes(path) | node.name] as path
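The effect of the uniqueness rule can be reproduced with a small Python sketch that enumerates all 4-step walks from A to C over the question's rail-road edges, once with each relationship usable only once per path and once without that restriction. The `walks` helper and its `unique_edges` flag are made up for this illustration; they only mimic the behavior described above and are not how Neo4j or APOC implement it.

```python
def walks(edges, start, end, length, unique_edges):
    """Enumerate walks of exactly `length` edges from start to end."""
    results = []

    def extend(path, used):
        node = path[-1]
        if len(path) - 1 == length:
            if node == end:
                results.append(path)
            return
        for i, (src, dst) in enumerate(edges):
            if src != node:
                continue
            # Relationship uniqueness: skip edges already used on this path.
            if unique_edges and i in used:
                continue
            extend(path + [dst], used | {i})

    extend([start], frozenset())
    return results


# Directed edges from the CREATE statement in the question.
rail = [("A", "B"), ("B", "C"), ("C", "D"), ("D", "C"), ("D", "E"),
        ("A", "D"), ("C", "E"), ("E", "B"), ("A", "E")]

with_uniqueness = walks(rail, "A", "C", 4, unique_edges=True)
without_uniqueness = walks(rail, "A", "C", 4, unique_edges=False)
print(len(with_uniqueness), len(without_uniqueness))
```

With uniqueness enforced, the walk A, D, C, D, C is rejected because it would use the D to C relationship twice, so two paths remain; without it, all three expected paths are found.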

Importing columns from another sheet and joining

I have a master sheet and I would like to import to another sheet all rows but only columns D, I, J, K, L, M, N, P, Q, S, T, U, V, W. But I would also like to join some of these columns.
I'd like to join the following:
D, I & J
K, L, M & N
P, Q, S, T & U
V & W.
I'd also like to have the information continue to feed through as more gets added to the master sheet. Is this actually possible?
