BFS cycle detection - graph-algorithm

Could someone provide a step by step pseudocode using BFS to search a cycle in a directed/undirected graph?
Can it get O(|V|+|E|) complexity?
I have seen only DFS implementation so far.

You can take a non-recursive DFS algorithm for detecting cycles in undirected graphs where you replace the stack for the nodes by a queue to get a BFS algorithm. It's straightforward:
q <- new queue // for DFS you use just a stack
append the start node of n in q
while q is not empty do
n <- remove first element of q
if n is visited
output 'Hurray, I found a cycle'
mark n as visited
for each edge (n, m) in E do
append m to q
Since BFS visits each node and each edge only once, you have a complexity of O(|V|+|E|).

I find BFS algorithm perfect for that.
Time complexity is the same.
You want something like this(Edited from Wikipedia):
Cycle-With-Breadth-First-Search(Graph g, Vertex root):
create empty set S
create empty queue Q
root.parent = null
Q.enqueueEdges(root)
while Q is not empty:
current = Q.dequeue()
for each node n that is adjacent to current:
if n is not in S:
add n to S
n.parent = current
Q.enqueue(n)
else //We found a cycle
n.parent = current
return n and current
I added only the else its a cycle block for the cycle detection and removed the original if reached target block for target detection. In total it's the same algorithm.
To find the exact cycle you will have to find a common ancestor for n and current. The lowest one is available in O(n) time. Than the cycle is known. ancestor to n and current. current and n are connected.
For more info about cycles and BFS read this link https://stackoverflow.com/a/4464388/6782134

Related

proof of optimality in activity selection

Can someone please explain in a not so formal way how the greedy choice is the optimal solution for the activity selection problem? This is the simplest explanation that I have found but I don't really get it
How does Greedy Choice work for Activities sorted according to finish time?
Let the given set of activities be S = {1, 2, 3, ..n} and activities be sorted by finish time. The greedy choice is to always pick activity 1. How come the activity 1 always provides one of the optimal solutions. We can prove it by showing that if there is another solution B with the first activity other than 1, then there is also a solution A of the same size with activity 1 as the first activity. Let the first activity selected by B be k, then there always exist A = {B – {k}} U {1}.(Note that the activities in B are independent and k has smallest finishing time among all. Since k is not 1, finish(k) >= finish(1)).
The following is my understanding of why greedy solution always words:
Assertion: If A is the greedy choice(starting with 1st activity in the sorted array), then it gives the optimal solution.
Proof: Let there be another choice B starting with some activity k (k != 1 or finishTime(k)>= finishTime(1)) which alone gives the optimal solution.So, B does not have the 1st activity and the following relation could be written between A & B could be written as:
A = {B - {k}} U {1}
Here:
1.Sets A and B are disjoint
2.Both A and B have compatible activities in them
Since we conclude that |A|=|B|, therefore activity A also gives the optimal solution.
Let's say A is a the optimal solution which starts with 1 if the intervals are S={1,2,3,.....m} and the length of the solution is say n1. If A is not an optimal solution, then there exists another solution B which starts with k!=1 and finishTime(k)>=finishTime(1), which has length n2.
So, n2>n1.
Now, if we exclude k from solution B then we are left with n2-1 number of elements.
Since, k doesn't overlap with other intervals in B, 1 will also not overlap.
This is because all intervals in B(excluding k) will have startTime>= finishTime(k)>=finishTime(1).Hence, if we replace k with 1 in B, we still have n2 length. But optimal solution starting with 1 was A with length n1. We are getting n1=n2 , which contradicts n2>n1. Hence Solution starting with 1 is optimal.

Longest path in a graph

Given a undirected graph with vertices form 0 to n-1, write a function that will find the longest path (by number of edges) which vertices make an increasing sequence.
What kind of approach would you recommend for solving this puzzle?
You can transform the original graph into a Directed Acyclic Graph by replacing each of the (undirected) edges by a directed edge going towards the node with bigger number.
Then you end up with this: https://www.geeksforgeeks.org/find-longest-path-directed-acyclic-graph/
I would do a Dynamic Programming algorithm. Denote L(u) to be the longest valid path starting at node u. Your base case is L(n-1) = [n-1] (i.e., the path containing only node n-1). Then, for all nodes s from n-2 to 0, perform a BFS starting at s in which you only allow traversing edges (u,v) such that v > u. Once you hit a node for which you've already started at (i.e., a node u such that you've already computed L(u)), L(s) = longest path from s to u + L(u) out of all possible u > s.
The answer to your problem is the node u that has the maximum value of L(u), and this algorithm is O(E), where E is the number of edges in your graph. I don't think you can do faster than this asymptotically
EDIT: Actually, the "BFS" isn't even a BFS: it's simply traversing the edges (s,v) such that v > s (because you have already visited all nodes v > s, so there's no traversal: you'll immediately hit a node you've already started at)
So actually, the simplified algorithm would be this:
longest_path_increasing_nodes():
L = Hash Map whose keys are nodes and values are paths (list of nodes)
L[n-1] = [n-1] # base case
longest_path = L[n-1]
for s from n-2 to 0: # recursive case
L[s] = [s]
for each edge (s,v):
if v > s and length([s] + L[v]) > length(L[s]):
L[s] = [s] + L[v]
if length(L[s]) > length(longest_path):
longest_path = L[s]
return longest_path
EDIT 2022-03-01: Fixed typo in the last if-statement; thanks user650654!
There are algorithms like Dijkastras algorithm which can be modified to find the longest instead of the shortest path.
Here is a simple approach:
Use a recursive algorithm to find all paths between 2 nodes.
Select the longest path.
If you need help with the recursive algorithm just ask.

Pathfinding in Prolog

I'm trying to teach myself Prolog. Below, I've written some code that I think should return all paths between nodes in an undirected graph... but it doesn't. I'm trying to understand why this particular code doesn't work (which I think differentiates this question from similar Prolog pathfinding posts). I'm running this in SWI-Prolog. Any clues?
% Define a directed graph (nodes may or may not be "room"s; edges are encoded by "leads_to" predicates).
room(kitchen).
room(living_room).
room(den).
room(stairs).
room(hall).
room(bathroom).
room(bedroom1).
room(bedroom2).
room(bedroom3).
room(studio).
leads_to(kitchen, living_room).
leads_to(living_room, stairs).
leads_to(living_room, den).
leads_to(stairs, hall).
leads_to(hall, bedroom1).
leads_to(hall, bedroom2).
leads_to(hall, bedroom3).
leads_to(hall, studio).
leads_to(living_room, outside). % Note "outside" is the only node that is not a "room"
leads_to(kitchen, outside).
% Define the indirection of the graph. This is what we'll work with.
neighbor(A,B) :- leads_to(A, B).
neighbor(A,B) :- leads_to(B, A).
Iff A --> B --> C --> D is a loop-free path, then
path(A, D, [B, C])
should be true. I.e., the third argument contains the intermediate nodes.
% Base Rule (R0)
path(X,Y,[]) :- neighbor(X,Y).
% Inductive Rule (R1)
path(X,Y,[Z|P]) :- not(X == Y), neighbor(X,Z), not(member(Z, P)), path(Z,Y,P).
Yet,
?- path(bedroom1, stairs, P).
is false. Why? Shouldn't we get a match to R1 with
X = bedroom1
Y = stairs
Z = hall
P = []
since,
?- neighbor(bedroom1, hall).
true.
?- not(member(hall, [])).
true.
?- path(hall, stairs, []).
true .
?
In fact, if I evaluate
?- path(A, B, P).
I get only the length-1 solutions.
Welcome to Prolog! The problem, essentially, is that when you get to not(member(Z, P)) in R1, P is still a pure variable, because the evaluation hasn't gotten to path(Z, Y, P) to define it yet. One of the surprising yet inspiring things about Prolog is that member(Ground, Var) will generate lists that contain Ground and unify them with Var:
?- member(a, X).
X = [a|_G890] ;
X = [_G889, a|_G893] ;
X = [_G889, _G892, a|_G896] .
This has the confusing side-effect that checking for a value in an uninstantiated list will always succeed, which is why not(member(Z, P)) will always fail, causing R1 to always fail. The fact that you get all the R0 solutions and none of the R1 solutions is a clue that something in R1 is causing it to always fail. After all, we know R0 works.
If you swap these two goals, you'll get the first result you want:
path(X,Y,[Z|P]) :- not(X == Y), neighbor(X,Z), path(Z,Y,P), not(member(Z, P)).
?- path(bedroom1, stairs, P).
P = [hall]
If you ask for another solution, you'll get a stack overflow. This is because after the change we're happily generating solutions with cycles as quickly as possible with path(Z,Y,P), only to discard them post-facto with not(member(Z, P)). (Incidentally, for a slight efficiency gain we can switch to memberchk/2 instead of member/2. Of course doing the wrong thing faster isn't much help. :)
I'd be inclined to convert this to a breadth-first search, which in Prolog would imply adding an "open set" argument to contain solutions you haven't tried yet, and at each node first trying something in the open set and then adding that node's possibilities to the end of the open set. When the open set is extinguished, you've tried every node you could get to. For some path finding problems it's a better solution than depth first search anyway. Another thing you could try is separating the path into a visited and future component, and only checking the visited component. As long as you aren't generating a cycle in the current step, you can be assured you aren't generating one at all, there's no need to worry about future steps.
The way you worded the question leads me to believe you don't want a complete solution, just a hint, so I think this is all you need. Let me know if that's not right.

What data structures to use for Dijkstra's algorithm in Erlang?

Disclaimer: The author is a newbie in Erlang.
Imagine, we have a graph consisting of 1M nodes, and each node has 0-4 neighbours (the edges are emanating from each node to those neighbours, so the graph is directed and connected).
Here is my choice of data structures:
To store the graph I use digraph, which is based on ETS tables. This allows fast (O(1)) access to the neighbours of a node.
For the list of unvisited nodes, I use gb_sets:take_smallest (the node is already sorted, and it is simultaneously deleted after fetching).
For the list of predecessors I use the dict structure, which allows to store the predecessors in the following way: {Node1,Node1_predecessor},{Node2,Node2_predecessor}.
For the list of visited nodes I use a simple list.
Problems:
The code becomes very hard to read and maintain when I try to update the weight of a node both in the digraph structure and in the Unvisited_nodes structure. It doesn't seem the right way to keep one 'object' with the 'fields' that need to be updated in two data structures simultaneously. What is the right way to do that?
The same question is about predecessors list. Where should I store the predecessor 'field' of a node 'object'? Maybe in the Graph (digraph structure)?
Maybe I should rethink the whole Dijkstra's algorithm in terms of processes and messages instead of objects (nodes and edges) and their fields(weights)?
UPD:
Here is the code based on the recommendations of Antonakos:
dijkstra(Graph,Start_node_name) ->
io:format("dijkstra/2: start~n"),
Paths = dict:new(),
io:format("dijkstra/2: initialized empty Paths~n"),
Unvisited = gb_sets:new(),
io:format("dijkstra/2: initialized empty Unvisited nodes priority queue~n"),
Unvisited_nodes = gb_sets:insert({0,Start_node_name,root},Unvisited),
io:format("dijkstra/2: Added start node ~w with the weight 0 to the Unvisited nodes: ~w~n", [Start_node_name, Unvisited_nodes]),
Paths_updated = loop_through_nodes(Graph,Paths,Unvisited_nodes),
io:format("dijkstra/2: Finished searching for shortest paths: ~w~n", [Paths_updated]).
loop_through_nodes(Graph,Paths,Unvisited_nodes) ->
%% We need this condition to stop looping through the Unvisited nodes if it is empty
case gb_sets:is_empty(Unvisited_nodes) of
false ->
{{Current_weight,Current_name,Previous_node}, Unvisited_nodes_updated} = gb_sets:take_smallest(Unvisited_nodes),
case dict:is_key(Current_name,Paths) of
false ->
io:format("loop_through_nodes: Found a new smallest unvisited node ~w~n",[Current_name]),
Paths_updated = dict:store(Current_name,{Previous_node,Current_weight},Paths),
io:format("loop_through_nodes: Updated Paths: ~w~n",[Paths_updated]),
Out_edges = digraph:out_edges(Graph,Current_name),
io:format("loop_through_nodes: Ready to iterate through the out edges of node ~w: ~w~n",[Current_name,Out_edges]),
Unvisited_nodes_updated_2 = loop_through_edges(Graph,Out_edges,Paths_updated,Unvisited_nodes_updated,Current_weight),
io:format("loop_through_nodes: Looped through out edges of the node ~w and updated Unvisited nodes: ~w~n",[Current_name,Unvisited_nodes_updated_2]),
loop_through_nodes(Graph,Paths_updated,Unvisited_nodes_updated_2);
true ->
loop_through_nodes(Graph,Paths,Unvisited_nodes_updated)
end;
true ->
Paths
end.
loop_through_edges(Graph,[],Paths,Unvisited_nodes,Current_weight) ->
io:format("loop_through_edges: No more out edges ~n"),
Unvisited_nodes;
loop_through_edges(Graph,Edges,Paths,Unvisited_nodes,Current_weight) ->
io:format("loop_through_edges: Start ~n"),
[Current_edge|Rest_edges] = Edges,
{Current_edge,Current_node,Neighbour_node,Edge_weight} = digraph:edge(Graph,Current_edge),
case dict:is_key(Neighbour_node,Paths) of
false ->
io:format("loop_through_edges: Inserting new neighbour node ~w into Unvisited nodes~n",[Current_node]),
Unvisited_nodes_updated = gb_sets:insert({Current_weight+Edge_weight,Neighbour_node,Current_node},Unvisited_nodes),
io:format("loop_through_edges: The unvisited nodes are: ~w~n",[Unvisited_nodes_updated]),
loop_through_edges(Graph,Rest_edges,Paths,Unvisited_nodes_updated,Current_weight);
true ->
loop_through_edges(Graph,Rest_edges,Paths,Unvisited_nodes,Current_weight)
end.
Your choice of data structures looks OK, so it is mostly a question of what to insert in them and how to keep them up to date. I'd suggest the following (I have changed the names a bit):
Result: A dict mapping Node to {Cost, Prev}, where Cost is the total cost of the path to Node and Prev is its predecessor on the path.
Open: A gb_sets structure of {Cost, Node, Prev}.
A graph with edges of the form {EdgeCost, NextNode}.
The result of the search is represented by Result and the graph isn't updated at all. There is no multiprocessing or message passing.
The algorithm goes as follows:
Insert {0, StartNode, Nil} in Open, where Nil is something that marks the end of the path.
Let {{Cost, Node, Prev}, Open1} = gb_sets:take_smallest(Open). If Node is already in Result then do nothing; otherwise add {Cost, Node, Prev} to Result, and for every edge {EdgeCost, NextNode} of Node add {Cost + EdgeCost, NextNode, Node} to Open1, but only if NextNode isn't already in Result. Continue with Open1 until the set is empty.
Dijkstra's algorithm asks for a decrease_key() operation on the Open set. Since this isn't readily supported by gb_sets we have used the workaround of inserting a tuple for NextNode even if NextNode might be present in Open already. That's why we check if the node extracted from Open is already in Result.
Extended discussion of the use of the priority queue
There are several ways of using a priority queue with Dijkstra's algorithm.
In the standard version of Wikipedia a node v is inserted only once but the position of v is updated when the cost and predecessor of v is changed.
alt := dist[u] + dist_between(u, v)
if alt < dist[v]:
dist[v] := alt
previous[v] := u
decrease-key v in Q
Implementations often simplify by replacing decrease-key v in Q with add v to Q. This means that v can be added more than once, and the algorithm must therefore check that an u extracted from the queue hasn't already been added to the result.
In my version I am replacing the entire block above with add v to Q. The queue will therefore contain even more entries, but since they are always extracted in order it doesn't affect the correctness of the algorithm. If you don't want these extra entries, you can use a dictionary to keep track of the minimum cost for each node.

the shortest path in cycle directed Graph

i need an example of the shortest path of directed graph cycle bye one node (it should reach to all nodes of graph from anode will be the input) please if there is an example i need it in c++ or algorithm thanks very much.........
You require to find the minimum spanning tree for it.
For directed graph according to wikipedia you can use this algorithm.
In Pseudocode:
//INPUT: graph G = (V,E)
//OUTPUT: shortest cycle length
min_cycle(G)
min = ∞
for u in V
len = dij_cyc(G,u)
if min > len
min = len
return min
//INPUT: graph G and vertex s
//OUTPUT: minimum distance back to s
dij_cyc(G,s)
for u in V
dist(u) = ∞
//makequeue returns a priority queue of all V
H = makequeue(V) //using dist-values as keys with s First In
while !H.empty?
u = deletemin(H)
for all edges (u,v) in E
if dist(v) > dist(u) + l(u,v) then
dist(v) = dist(u) + l(u,v)
decreasekey(H,v)
return dist(s)
This runs a slightly different Dijkstra's on each vertex. The mutated Dijkstras
has a few key differences. First, all initial distances are set to ∞, even the
start vertex. Second, the start vertex must be put on the queue first to make
sure it comes off first since they all have the same priority. Finally, the
mutated Dijkstras returns the distance back to the start node. If there was no
path back to the start vertex the distance remains ∞. The minimum of all these
returns from the mutated Dijkstras is the shortest path. Since Dijkstras runs
at worst in O(|V|^2) and min_cycle runs this form of Dijkstras |V| times, the
final running time to find the shortest cycle is O(|V|^3). If min_cyc returns
∞ then the graph is acyclic.
To return the actual path of the shortest cycle only slight modifications need to be made.

Resources