Longest path in a graph - graph-algorithm

Given an undirected graph with vertices from 0 to n-1, write a function that will find the longest path (by number of edges) whose vertices form an increasing sequence.
What kind of approach would you recommend for solving this puzzle?

You can transform the original graph into a Directed Acyclic Graph by replacing each of the (undirected) edges with a directed edge going towards the node with the bigger number.
Then you end up with the longest-path problem in a DAG: https://www.geeksforgeeks.org/find-longest-path-directed-acyclic-graph/

I would use a Dynamic Programming algorithm. Let L(u) denote the longest valid path starting at node u. Your base case is L(n-1) = [n-1] (i.e., the path containing only node n-1). Then, for all nodes s from n-2 down to 0, perform a BFS starting at s in which you only traverse edges (u,v) such that v > u. Once you hit a node u that you've already processed (i.e., one for which L(u) is already computed), L(s) = longest path from s to u + L(u), taken over all possible u > s.
The answer to your problem is the longest L(u) over all nodes u. The algorithm runs in O(V + E) time, where V and E are the numbers of vertices and edges in your graph, provided you store only each path's length and next node instead of copying whole lists. I don't think you can do faster than this asymptotically.
EDIT: Actually, the "BFS" isn't even a BFS: it's simply traversing the edges (s,v) such that v > s (because you have already visited all nodes v > s, so there's no traversal: you'll immediately hit a node you've already started at)
So actually, the simplified algorithm would be this:
longest_path_increasing_nodes():
    L = Hash Map whose keys are nodes and values are paths (list of nodes)
    L[n-1] = [n-1] # base case
    longest_path = L[n-1]
    for s from n-2 to 0: # recursive case
        L[s] = [s]
        for each edge (s,v):
            if v > s and length([s] + L[v]) > length(L[s]):
                L[s] = [s] + L[v]
        if length(L[s]) > length(longest_path):
            longest_path = L[s]
    return longest_path
EDIT 2022-03-01: Fixed typo in the last if-statement; thanks user650654!
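For reference, here is a minimal Python sketch of the simplified algorithm. It is only an illustration: the function name, the edge-list input format and the `length`/`successor` arrays are assumptions, not part of the answer above. It stores a length and a successor per node instead of whole paths, so the work stays linear in the number of vertices and edges.

```python
def longest_increasing_path(n, edges):
    """Return one longest path whose vertex numbers strictly increase.

    n     -- number of vertices, labelled 0 .. n-1
    edges -- iterable of undirected (u, v) pairs
    """
    if n == 0:
        return []
    # Orient every edge from the smaller endpoint to the larger one.
    higher = [[] for _ in range(n)]
    for u, v in edges:
        if u != v:
            higher[min(u, v)].append(max(u, v))

    length = [1] * n        # length[s]: vertices on the best path starting at s
    successor = [None] * n  # successor[s]: next vertex on that path, if any
    for s in range(n - 1, -1, -1):
        for v in higher[s]:
            if length[v] + 1 > length[s]:
                length[s] = length[v] + 1
                successor[s] = v

    # Rebuild the path from the best starting vertex.
    node = max(range(n), key=lambda s: length[s])
    path = []
    while node is not None:
        path.append(node)
        node = successor[node]
    return path
```

For example, `longest_increasing_path(5, [(0, 1), (1, 2), (2, 0), (1, 3), (3, 4), (2, 4)])` walks 0, 1, 2, 4.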

Shortest-path algorithms such as Dijkstra's algorithm can be adapted to find the longest instead of the shortest path, but only on acyclic graphs; on general graphs, finding the longest simple path is NP-hard.
Here is a simple approach:
Use a recursive algorithm to find all paths between 2 nodes.
Select the longest path.
If you need help with the recursive algorithm just ask.
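A rough Python sketch of that recursive idea, applied to the increasing-sequence question, could look like the following. The adjacency-dict format and both function names are made up for illustration. It enumerates every increasing path, so it can take exponential time and is only practical for small graphs.

```python
def all_increasing_paths(adjacency, start):
    """Yield every path from start whose vertex numbers strictly increase."""
    yield [start]
    for v in adjacency.get(start, ()):
        if v > start:
            for tail in all_increasing_paths(adjacency, v):
                yield [start] + tail


def longest_by_enumeration(adjacency):
    """Try every start vertex and keep the first longest path found."""
    best = []
    for s in adjacency:
        for path in all_increasing_paths(adjacency, s):
            if len(path) > len(best):
                best = path
    return best
```

The recursion always terminates because each step moves to a strictly larger vertex number.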

Related

BFS cycle detection

Could someone provide a step by step pseudocode using BFS to search a cycle in a directed/undirected graph?
Can it get O(|V|+|E|) complexity?
I have seen only DFS implementation so far.
You can take a non-recursive DFS algorithm for detecting cycles in undirected graphs where you replace the stack for the nodes by a queue to get a BFS algorithm. It's straightforward:
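q <- new queue // for DFS you use just a stack
mark the start node as visited
append (start node, no parent) to q
while q is not empty do
    (n, parent) <- remove first element of q
    for each edge (n, m) in E do
        if m = parent // skip the edge we arrived by (undirected graph)
            continue
        if m is visited
            output 'Hurray, I found a cycle' and stop
        mark m as visited
        append (m, n) to q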
Since BFS visits each node and each edge only once, you have a complexity of O(|V|+|E|).
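Here is a small Python sketch of BFS cycle detection for an undirected graph. It assumes a simple graph (no parallel edges) given as a dict of adjacency lists; the function name and input format are illustrative choices. Each queue entry remembers the node it was reached from, so the edge back to the parent is not mistaken for a cycle.

```python
from collections import deque


def has_cycle_undirected(adjacency):
    """Return True if the undirected simple graph contains a cycle."""
    visited = set()
    for root in adjacency:            # cover every connected component
        if root in visited:
            continue
        visited.add(root)
        queue = deque([(root, None)])
        while queue:
            node, parent = queue.popleft()
            for neighbour in adjacency[node]:
                if neighbour == parent:
                    continue          # the edge we arrived by
                if neighbour in visited:
                    return True       # reached a visited node by a second route
                visited.add(neighbour)
                queue.append((neighbour, node))
    return False
```

Every node is enqueued at most once and every adjacency list is scanned at most once, which gives the O(|V|+|E|) bound.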
I find BFS algorithm perfect for that.
Time complexity is the same.
You want something like this (edited from Wikipedia):
Cycle-With-Breadth-First-Search(Graph g, Vertex root):
    create empty set S
    create empty queue Q
    root.parent = null
    add root to S
    Q.enqueue(root)
    while Q is not empty:
        current = Q.dequeue()
        for each node n that is adjacent to current:
            if n is current.parent: // undirected graph: skip the edge we arrived by
                continue
            if n is not in S:
                add n to S
                n.parent = current
                Q.enqueue(n)
            else // We found a cycle
                return n and current
Compared with the Wikipedia version, I added the else block that reports a cycle (plus the check that skips the edge back to the parent, which an undirected graph needs) and removed the original "if reached target" block used for target detection. Otherwise it's the same algorithm.
To find the exact cycle you have to find a common ancestor of n and current. The lowest one can be found in O(n) time by following the parent pointers. Then the cycle is known: it consists of the tree paths from that ancestor to n and to current, plus the edge connecting current and n.
For more info about cycles and BFS read this link https://stackoverflow.com/a/4464388/6782134
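To make the common-ancestor step concrete, here is a Python sketch that returns the vertices of the cycle rather than just the two endpoints. It assumes an undirected simple graph stored as a dict of adjacency lists; `find_cycle_bfs` and `_close_cycle` are hypothetical names.

```python
from collections import deque


def find_cycle_bfs(adjacency, root):
    """Return the vertices of one cycle reachable from root, or None."""
    parent = {root: None}             # doubles as the visited set
    queue = deque([root])
    while queue:
        current = queue.popleft()
        for n in adjacency[current]:
            if n == parent[current]:
                continue              # skip the edge we arrived by
            if n not in parent:
                parent[n] = current
                queue.append(n)
            else:
                return _close_cycle(parent, n, current)
    return None


def _close_cycle(parent, a, b):
    """Join the tree paths of a and b at their lowest common ancestor."""
    ancestors_of_a = []
    node = a
    while node is not None:
        ancestors_of_a.append(node)
        node = parent[node]
    seen = set(ancestors_of_a)
    branch_b = []
    node = b
    while node not in seen:           # climb from b until the paths meet
        branch_b.append(node)
        node = parent[node]
    branch_a = ancestors_of_a[:ancestors_of_a.index(node) + 1]
    # a up to the ancestor, then down to b; the edge b-a closes the cycle.
    return branch_a + branch_b[::-1]
```

On the square 0-1-3-2-0, searching from 0 yields the cycle 3, 1, 0, 2.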

Pathfinding in Prolog

I'm trying to teach myself Prolog. Below, I've written some code that I think should return all paths between nodes in an undirected graph... but it doesn't. I'm trying to understand why this particular code doesn't work (which I think differentiates this question from similar Prolog pathfinding posts). I'm running this in SWI-Prolog. Any clues?
% Define a directed graph (nodes may or may not be "room"s; edges are encoded by "leads_to" predicates).
room(kitchen).
room(living_room).
room(den).
room(stairs).
room(hall).
room(bathroom).
room(bedroom1).
room(bedroom2).
room(bedroom3).
room(studio).
leads_to(kitchen, living_room).
leads_to(living_room, stairs).
leads_to(living_room, den).
leads_to(stairs, hall).
leads_to(hall, bedroom1).
leads_to(hall, bedroom2).
leads_to(hall, bedroom3).
leads_to(hall, studio).
leads_to(living_room, outside). % Note "outside" is the only node that is not a "room"
leads_to(kitchen, outside).
% Define the undirected version of the graph. This is what we'll work with.
neighbor(A,B) :- leads_to(A, B).
neighbor(A,B) :- leads_to(B, A).
Iff A --> B --> C --> D is a loop-free path, then
path(A, D, [B, C])
should be true. I.e., the third argument contains the intermediate nodes.
% Base Rule (R0)
path(X,Y,[]) :- neighbor(X,Y).
% Inductive Rule (R1)
path(X,Y,[Z|P]) :- not(X == Y), neighbor(X,Z), not(member(Z, P)), path(Z,Y,P).
Yet,
?- path(bedroom1, stairs, P).
is false. Why? Shouldn't we get a match to R1 with
X = bedroom1
Y = stairs
Z = hall
P = []
since,
?- neighbor(bedroom1, hall).
true.
?- not(member(hall, [])).
true.
?- path(hall, stairs, []).
true .
?
In fact, if I evaluate
?- path(A, B, P).
I get only the length-1 solutions.
Welcome to Prolog! The problem, essentially, is that when you get to not(member(Z, P)) in R1, P is still a pure variable, because the evaluation hasn't gotten to path(Z, Y, P) to define it yet. One of the surprising yet inspiring things about Prolog is that member(Ground, Var) will generate lists that contain Ground and unify them with Var:
?- member(a, X).
X = [a|_G890] ;
X = [_G889, a|_G893] ;
X = [_G889, _G892, a|_G896] .
This has the confusing side-effect that checking for a value in an uninstantiated list will always succeed, which is why not(member(Z, P)) will always fail, causing R1 to always fail. The fact that you get all the R0 solutions and none of the R1 solutions is a clue that something in R1 is causing it to always fail. After all, we know R0 works.
If you swap these two goals, you'll get the first result you want:
path(X,Y,[Z|P]) :- not(X == Y), neighbor(X,Z), path(Z,Y,P), not(member(Z, P)).
?- path(bedroom1, stairs, P).
P = [hall]
If you ask for another solution, you'll get a stack overflow. This is because after the change we're happily generating solutions with cycles as quickly as possible with path(Z,Y,P), only to discard them post-facto with not(member(Z, P)). (Incidentally, for a slight efficiency gain we can switch to memberchk/2 instead of member/2. Of course doing the wrong thing faster isn't much help. :)
I'd be inclined to convert this to a breadth-first search, which in Prolog would imply adding an "open set" argument to contain solutions you haven't tried yet, and at each node first trying something in the open set and then adding that node's possibilities to the end of the open set. When the open set is exhausted, you've tried every node you could get to. For some path finding problems it's a better solution than depth-first search anyway. Another thing you could try is separating the path into a visited and a future component, and only checking the visited component. As long as you aren't generating a cycle in the current step, you can be assured you aren't generating one at all, so there's no need to worry about future steps.
The way you worded the question leads me to believe you don't want a complete solution, just a hint, so I think this is all you need. Let me know if that's not right.
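If it helps to see the visited-component idea outside Prolog, here is a Python sketch over the same house graph. It is only an illustration: the `LEADS_TO` list mirrors the leads_to facts, and `neighbours` and `paths` are names I made up. Each step checks only the nodes already on the path, which are always known, so it never depends on the not-yet-built rest of the path.

```python
LEADS_TO = [
    ("kitchen", "living_room"), ("living_room", "stairs"),
    ("living_room", "den"), ("stairs", "hall"),
    ("hall", "bedroom1"), ("hall", "bedroom2"), ("hall", "bedroom3"),
    ("hall", "studio"), ("living_room", "outside"), ("kitchen", "outside"),
]


def neighbours(node):
    """Undirected neighbours: follow each leads_to edge in both directions."""
    for a, b in LEADS_TO:
        if a == node:
            yield b
        elif b == node:
            yield a


def paths(start, goal, visited=None):
    """Yield the intermediate nodes of every loop-free path start -> goal."""
    visited = [start] if visited is None else visited
    for z in neighbours(start):
        if z in visited:
            continue                  # would close a loop: prune right away
        if z == goal:
            yield []                  # direct step, no intermediate nodes
        else:
            for rest in paths(z, goal, visited + [z]):
                yield [z] + rest
```

Here `list(paths("bedroom1", "stairs"))` gives `[["hall"]]` and then terminates, because a branch is cut as soon as it would revisit a node.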

Push-relabel gap heuristics

I don't understand how to implement gap heuristics with push relabel. Wiki described it like this:
"In gap relabeling heuristic we maintain an array A of size n, holding in A[i]
the number of nodes for each label (up to n). If a label d is found, such that
A[d] = 0, then all nodes with label > d are relabeled to label n."
Use a gap heuristic. If there is a 'k' such that no node has height(u) = k, you can set height(u) = max(height(u), height(source) + 1) for all nodes, except the source, for which height(u) > k. This is because any such 'k' represents a minimal cut in the graph, and no more flow will go from the nodes S = {u where height(u) > k} to the nodes in T = {v where height(v) < k}. Otherwise there would be an edge (u,v) with u in S, v in T and residual capacity > 0. But then height(u) > height(v) + 1, since height(u) > k and height(v) < k, contradicting the validity of the height labelling.
Can someone explain to me in pseudocode how to add the gap heuristic to a FIFO push-relabel as shown in wiki's sample code?
Thanks,
Vince
It might be a little late but here is a link to a Stanford University notebook where you can find a push-relabel maximum flow using a Gap Heuristic in C.
I hope it helps you.
http://www.stanford.edu/~liszt90/acm/notebook.html#file3
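In case the link goes away, here is a hedged Python sketch of just the gap step, following the second description quoted in the question (lift to height(source) + 1 rather than to n). The function name and the `height`/`count` arrays are assumptions: `height[u]` is the label of node u, `count[h]` is the number of nodes with label h and needs at least 2n entries, and the source is assumed to sit at height n.

```python
def apply_gap_heuristic(height, count, k, source):
    """Lift every node stranded above an empty level k to height n + 1.

    Returns the list of nodes that were relabelled.
    """
    n = len(height)
    if not 0 < k < n or count[k] != 0:
        return []                      # no gap at level k
    lifted = []
    for u in range(n):
        if u != source and k < height[u] <= n:
            count[height[u]] -= 1
            height[u] = n + 1          # max(height(u), height(source) + 1)
            count[height[u]] += 1
            lifted.append(u)
    return lifted
```

In a FIFO push-relabel loop you would call this right after relabelling a node away from level k, passing that old level as `k`. Any lifted node still holding excess stays in the queue and is discharged from its new height.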

What data structures to use for Dijkstra's algorithm in Erlang?

Disclaimer: The author is a newbie in Erlang.
Imagine, we have a graph consisting of 1M nodes, and each node has 0-4 neighbours (the edges are emanating from each node to those neighbours, so the graph is directed and connected).
Here is my choice of data structures:
To store the graph I use digraph, which is based on ETS tables. This allows fast (O(1)) access to the neighbours of a node.
For the list of unvisited nodes, I use gb_sets:take_smallest (the node is already sorted, and it is simultaneously deleted after fetching).
For the list of predecessors I use the dict structure, which lets me store the predecessors in the following way: {Node1,Node1_predecessor},{Node2,Node2_predecessor}.
For the list of visited nodes I use a simple list.
Problems:
The code becomes very hard to read and maintain when I try to update the weight of a node both in the digraph structure and in the Unvisited_nodes structure. It doesn't seem the right way to keep one 'object' with the 'fields' that need to be updated in two data structures simultaneously. What is the right way to do that?
The same question is about predecessors list. Where should I store the predecessor 'field' of a node 'object'? Maybe in the Graph (digraph structure)?
Maybe I should rethink the whole Dijkstra's algorithm in terms of processes and messages instead of objects (nodes and edges) and their fields(weights)?
UPD:
Here is the code based on the recommendations of Antonakos:
dijkstra(Graph,Start_node_name) ->
    io:format("dijkstra/2: start~n"),
    Paths = dict:new(),
    io:format("dijkstra/2: initialized empty Paths~n"),
    Unvisited = gb_sets:new(),
    io:format("dijkstra/2: initialized empty Unvisited nodes priority queue~n"),
    Unvisited_nodes = gb_sets:insert({0,Start_node_name,root},Unvisited),
    io:format("dijkstra/2: Added start node ~w with the weight 0 to the Unvisited nodes: ~w~n", [Start_node_name, Unvisited_nodes]),
    Paths_updated = loop_through_nodes(Graph,Paths,Unvisited_nodes),
    io:format("dijkstra/2: Finished searching for shortest paths: ~w~n", [Paths_updated]).
loop_through_nodes(Graph,Paths,Unvisited_nodes) ->
    %% We need this condition to stop looping through the Unvisited nodes if it is empty
    case gb_sets:is_empty(Unvisited_nodes) of
        false ->
            {{Current_weight,Current_name,Previous_node}, Unvisited_nodes_updated} = gb_sets:take_smallest(Unvisited_nodes),
            case dict:is_key(Current_name,Paths) of
                false ->
                    io:format("loop_through_nodes: Found a new smallest unvisited node ~w~n",[Current_name]),
                    Paths_updated = dict:store(Current_name,{Previous_node,Current_weight},Paths),
                    io:format("loop_through_nodes: Updated Paths: ~w~n",[Paths_updated]),
                    Out_edges = digraph:out_edges(Graph,Current_name),
                    io:format("loop_through_nodes: Ready to iterate through the out edges of node ~w: ~w~n",[Current_name,Out_edges]),
                    Unvisited_nodes_updated_2 = loop_through_edges(Graph,Out_edges,Paths_updated,Unvisited_nodes_updated,Current_weight),
                    io:format("loop_through_nodes: Looped through out edges of the node ~w and updated Unvisited nodes: ~w~n",[Current_name,Unvisited_nodes_updated_2]),
                    loop_through_nodes(Graph,Paths_updated,Unvisited_nodes_updated_2);
                true ->
                    loop_through_nodes(Graph,Paths,Unvisited_nodes_updated)
            end;
        true ->
            Paths
    end.
loop_through_edges(Graph,[],Paths,Unvisited_nodes,Current_weight) ->
    io:format("loop_through_edges: No more out edges ~n"),
    Unvisited_nodes;
loop_through_edges(Graph,Edges,Paths,Unvisited_nodes,Current_weight) ->
    io:format("loop_through_edges: Start ~n"),
    [Current_edge|Rest_edges] = Edges,
    {Current_edge,Current_node,Neighbour_node,Edge_weight} = digraph:edge(Graph,Current_edge),
    case dict:is_key(Neighbour_node,Paths) of
        false ->
            io:format("loop_through_edges: Inserting new neighbour node ~w into Unvisited nodes~n",[Current_node]),
            Unvisited_nodes_updated = gb_sets:insert({Current_weight+Edge_weight,Neighbour_node,Current_node},Unvisited_nodes),
            io:format("loop_through_edges: The unvisited nodes are: ~w~n",[Unvisited_nodes_updated]),
            loop_through_edges(Graph,Rest_edges,Paths,Unvisited_nodes_updated,Current_weight);
        true ->
            loop_through_edges(Graph,Rest_edges,Paths,Unvisited_nodes,Current_weight)
    end.
Your choice of data structures looks OK, so it is mostly a question of what to insert in them and how to keep them up to date. I'd suggest the following (I have changed the names a bit):
Result: A dict mapping Node to {Cost, Prev}, where Cost is the total cost of the path to Node and Prev is its predecessor on the path.
Open: A gb_sets structure of {Cost, Node, Prev}.
A graph with edges of the form {EdgeCost, NextNode}.
The result of the search is represented by Result and the graph isn't updated at all. There is no multiprocessing or message passing.
The algorithm goes as follows:
Insert {0, StartNode, Nil} in Open, where Nil is something that marks the end of the path.
Let {{Cost, Node, Prev}, Open1} = gb_sets:take_smallest(Open). If Node is already in Result then do nothing; otherwise add {Cost, Node, Prev} to Result, and for every edge {EdgeCost, NextNode} of Node add {Cost + EdgeCost, NextNode, Node} to Open1, but only if NextNode isn't already in Result. Continue with Open1 until the set is empty.
Dijkstra's algorithm asks for a decrease_key() operation on the Open set. Since this isn't readily supported by gb_sets we have used the workaround of inserting a tuple for NextNode even if NextNode might be present in Open already. That's why we check if the node extracted from Open is already in Result.
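The same scheme can be sketched in a few lines of Python (used here only for illustration, with the standard heapq module standing in for gb_sets). The dict-of-lists graph format and the function name are assumptions, and the extra sequence number in each heap entry is just a tie-breaker so that nodes themselves never need to be compared.

```python
import heapq
from itertools import count


def dijkstra_lazy(graph, start):
    """graph maps node -> list of (edge_cost, next_node).

    Returns a dict mapping node -> (cost, prev), like Result above.
    """
    result = {}
    tie = count()
    open_heap = [(0, next(tie), start, None)]
    while open_heap:
        cost, _, node, prev = heapq.heappop(open_heap)
        if node in result:
            continue                  # stale duplicate: already finalised
        result[node] = (cost, prev)
        for edge_cost, next_node in graph.get(node, ()):
            if next_node not in result:
                heapq.heappush(
                    open_heap, (cost + edge_cost, next(tie), next_node, node))
    return result
```

A node may sit in the heap several times, but only its cheapest entry is ever written to the result.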
Extended discussion of the use of the priority queue
There are several ways of using a priority queue with Dijkstra's algorithm.
In the standard version on Wikipedia, a node v is inserted only once, but the position of v is updated when the cost and predecessor of v change.
alt := dist[u] + dist_between(u, v)
if alt < dist[v]:
    dist[v] := alt
    previous[v] := u
    decrease-key v in Q
Implementations often simplify by replacing decrease-key v in Q with add v to Q. This means that v can be added more than once, and the algorithm must therefore check that a u extracted from the queue hasn't already been added to the result.
In my version I am replacing the entire block above with add v to Q. The queue will therefore contain even more entries, but since they are always extracted in order it doesn't affect the correctness of the algorithm. If you don't want these extra entries, you can use a dictionary to keep track of the minimum cost for each node.
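As a sketch of that last variant, again in Python for illustration and with assumed names, a `best` dictionary records the lowest cost seen so far for each node, and an entry is pushed only when it improves on that cost.

```python
import heapq
from itertools import count


def dijkstra_pruned(graph, start):
    """graph maps node -> list of (edge_cost, next_node).

    Returns (best, prev): lowest cost per node and its predecessor.
    """
    best = {start: 0}
    prev = {start: None}
    done = set()
    tie = count()                     # tie-breaker so nodes are never compared
    heap = [(0, next(tie), start)]
    while heap:
        cost, _, node = heapq.heappop(heap)
        if node in done:
            continue                  # an older, more expensive entry
        done.add(node)
        for edge_cost, nxt in graph.get(node, ()):
            alt = cost + edge_cost
            if alt < best.get(nxt, float("inf")):
                best[nxt] = alt
                prev[nxt] = node
                heapq.heappush(heap, (alt, next(tie), nxt))
    return best, prev
```

Duplicates can still occur when a node's cost improves more than once, so the check on extraction stays, but entries that cannot win are never pushed.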

the shortest path in cycle directed Graph

I need an example of finding the shortest path in a directed graph with cycles, starting from one node (the start node is the input, and the path should reach all nodes of the graph). If there is an example, I need it in C++ or as an algorithm. Thanks very much.
You need to find the minimum spanning tree for it.
For directed graph according to wikipedia you can use this algorithm.
In Pseudocode:
//INPUT: graph G = (V,E)
//OUTPUT: shortest cycle length
min_cycle(G)
    min = ∞
    for u in V
        len = dij_cyc(G,u)
        if min > len
            min = len
    return min
//INPUT: graph G and vertex s
//OUTPUT: minimum distance back to s
dij_cyc(G,s)
    for u in V
        dist(u) = ∞
    //makequeue returns a priority queue of all V
    H = makequeue(V) //using dist-values as keys with s First In
    first = true
    while !H.empty?
        u = deletemin(H)
        d = 0 if first else dist(u) //s comes off first and acts as distance 0
        first = false
        for all edges (u,v) in E
            if dist(v) > d + l(u,v) then
                dist(v) = d + l(u,v)
                decreasekey(H,v) //no effect if v has already left the queue
    return dist(s)
This runs a slightly modified Dijkstra's algorithm from each vertex. The
modified version has a few key differences. First, all initial distances are
set to ∞, even that of the start vertex. Second, the start vertex must be put
on the queue first to make sure it comes off first, since all vertices have
the same priority; its outgoing edges are relaxed as if its distance were 0.
Finally, the modified Dijkstra's returns the distance back to the start
vertex. If there is no path back to the start vertex, the distance remains ∞.
The minimum of all these return values is the length of the shortest cycle.
Since Dijkstra's runs at worst in O(|V|^2) and min_cycle runs this form of
Dijkstra's |V| times, the final running time to find the shortest cycle is
O(|V|^3). If min_cycle returns ∞ then the graph is acyclic.
To return the actual path of the shortest cycle only slight modifications need to be made.
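A compact Python sketch of the same idea follows. It is not a translation of the pseudocode: it uses a binary heap with duplicate entries instead of decrease-key, which is a swapped-in simplification. It assumes non-negative edge weights, nodes that can be compared with each other (for example integers), and a graph stored as a dict from node to a list of (neighbour, weight) pairs. The function names are hypothetical, and it returns only the length of the cycle.

```python
import heapq
import math


def shortest_cycle_through(graph, s):
    """Length of the shortest directed cycle through s, or math.inf if none."""
    dist = {}
    # Seed the heap with the edges leaving s instead of with s itself,
    # so that popping s later means we have come back to it.
    heap = [(w, v) for v, w in graph.get(s, ())]
    heapq.heapify(heap)
    while heap:
        d, u = heapq.heappop(heap)
        if u == s:
            return d                  # first return to s is the cheapest one
        if u in dist:
            continue
        dist[u] = d
        for v, w in graph.get(u, ()):
            if v not in dist:
                heapq.heappush(heap, (d + w, v))
    return math.inf


def shortest_cycle(graph):
    """Minimum over all start vertices; math.inf means the graph is acyclic."""
    return min((shortest_cycle_through(graph, s) for s in graph),
               default=math.inf)
```

With a binary heap each search costs O(|E| log |V|), so the whole scan is O(|V| |E| log |V|) rather than O(|V|^3).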

Resources