Breadth-first traversal: directed vs. undirected graph

How does BFS on a directed graph differ from BFS on an undirected graph in implementation?
I found the following pseudocode on the web. I am fine with the undirected case, but I can't figure out how to implement it for a directed graph.
frontier = new Queue()
mark root visited (set root.distance = 0)
frontier.push(root)
while frontier not empty {
    Vertex v = frontier.pop()
    for each successor v' of v {
        if v' unvisited {
            frontier.push(v')
            mark v' visited (v'.distance = v.distance + 1)
        }
    }
}

The pseudocode is the same in both cases. The only difference is what "successor" means: in an undirected graph it is any neighbor of v, while in a directed graph it is only a vertex reached by an outgoing edge of v (a child, or similar).
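To make that concrete, here is a minimal Python sketch. The traversal function is identical for both kinds of graph; only the way the adjacency lists are built differs. The helper names (`bfs_distances`, `directed_adjacency`, `undirected_adjacency`) and the sample edge list are assumptions made for illustration and are not part of the original pseudocode.

```python
from collections import deque

def bfs_distances(successors, root):
    """Return {vertex: hop distance from root} for every reachable vertex."""
    distance = {root: 0}              # doubles as the "visited" set
    frontier = deque([root])
    while frontier:
        v = frontier.popleft()        # FIFO order gives breadth-first traversal
        for w in successors.get(v, ()):
            if w not in distance:
                distance[w] = distance[v] + 1
                frontier.append(w)
    return distance

def directed_adjacency(edges):
    """Edge (u, v) makes v a successor of u only."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, []).append(v)
        adj.setdefault(v, [])
    return adj

def undirected_adjacency(edges):
    """Edge (u, v) makes u and v successors of each other."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, []).append(v)
        adj.setdefault(v, []).append(u)
    return adj

edges = [("a", "b"), ("c", "b"), ("c", "d")]
print(bfs_distances(directed_adjacency(edges), "a"))    # only a and b are reachable
print(bfs_distances(undirected_adjacency(edges), "a"))  # all four vertices are reachable
```

With the same edge list, the directed run stops at b because b has no outgoing edges, while the undirected run continues through b to c and d.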

Related

Subgraph isomorphism (or even set membership) in Z3?

I'm trying to find a way to encode a sort of basic subgraph isomorphism in Z3 (preferably z3py). While I know there are papers on this in the abstract, finding any mechanism to do it has eluded me even for very trivial cases, because I'm very new to Z3 in general!
Suppose you have just about the most basic subgraph with nodes (0,1,2) and edges (0,1) with node 2 off on its own, and the supergraph has nodes (0,1,2) and edges (1,2) with node 0 off on its own. You could map the nodes of the subgraph into the supergraph with
0->1,
1->2,
2->0
...as one possible mapping that would satisfy "if these two nodes are connected in the subgraph, their mapped nodes are connected in the supergraph"
So okay :) I tried
from networkx import Graph
from networkx.linalg.graphmatrix import adjacency_matrix
subgraph = Graph()
subgraph.add_nodes_from([0,1,2])
subgraph.add_edges_from([(0,1)])
supergraph = Graph()
supergraph.add_nodes_from([0,1,2])
supergraph.add_edges_from([(1,2)])
s = Solver()
assignments = [Int(f'n{node}') for node in subgraph.nodes]
# each bit assignment in the subgraph belongs to one in the supergraph
assignment_constraint = [ And(assignments[i] >= 0, assignments[i] <= max(supergraph.nodes)) for i in subgraph.nodes ]
# subgraph bits can't be assigned to the same supergraph bits
assignment_distinct = [ Distinct([assignments[i] for i in subgraph.nodes])]
which just gets me as far as "each assignment from subgraph to supergraph should map a node in the subgraph to some node in the supergraph and no two subgraph nodes can be assigned to the same supergraph node"
...but then I get stuck because I keep thinking along the lines of
for edge in subgraph.edges:
    s.add( (assignments[edge[0]], assignments[edge[1]]) in supergraph.edges )
...but of course that doesn't work because pythonically those aren't the right sort of keys so that's always false or broken.
So how does one approach that? I can add constraints like "this_var == 1" but get very confused by things like checking membership, i.e.
>>> assignments[0] == 1.0
n0 == 1 # so that's OK then
>>> assignments[0] in [1.0, 2.0, 3.0]
False # woops, that fails horribly
and I feel like I'm missing a very basic "frame of mind" thing here.
It is relatively straightforward to encode subgraph isomorphism in z3, pretty much along the lines of how you described. However, this encoding is unlikely to scale to large graphs. As you no doubt know, subgraph isomorphism is NP-complete in general, and this encoding will cause z3 to simply enumerate all possibilities and thus will blow up exponentially.
Having said that, here's a straightforward encoding:
from z3 import *

# Subgraph, number of nodes and edges.
# Nodes will be named implicitly from 0 to noOfNodesA - 1
noOfNodesA = 3
edgesA = [(0, 1)]

# Supergraph:
noOfNodesB = 3
edgesB = [(1, 2)]

# Mapping of subgraph nodes to supergraph nodes:
mapping = Array('Map', IntSort(), IntSort())

s = Solver()

# Check that elt is between low and high, inclusive
def InRange(elt, low, high):
    return And(low <= elt, elt <= high)

# Check that (x, y) is in the list
def Contains(x, y, lst):
    return Or([And(x == x1, y == y1) for x1, y1 in lst])

# Make sure mapping is into the supergraph
s.add(And([InRange(Select(mapping, n1), 0, noOfNodesB-1) for n1 in range(noOfNodesA)]))

# Make sure we map nodes to distinct nodes
s.add(Distinct([Select(mapping, n1) for n1 in range(noOfNodesA)]))

# Make sure edges are preserved:
for x, y in edgesA:
    s.add(Contains(Select(mapping, x), Select(mapping, y), edgesB))

# Solve:
r = s.check()
if r == sat:
    m = s.model()
    for x in range(noOfNodesA):
        print("%s -> %s" % (x, m.evaluate(Select(mapping, x))))
else:
    print("Solver said: %s" % r)
I've added comments along the way, so hopefully you should be able to read the code through; feel free to ask specific questions.
When I run this, I get:
$ python a.py
0 -> 1
1 -> 2
2 -> 0
which finds exactly the mapping you alluded to in your question.
Best of luck!
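For graphs this small, the solver's answer can be cross-checked without Z3 by brute force. The sketch below is not the Z3 encoding above; it swaps in plain enumeration over injective node assignments using only the standard library. The function name `find_embeddings` is hypothetical, and it assumes edges are undirected (an edge matches in either orientation), whereas `Contains` above compares ordered pairs.

```python
from itertools import permutations

def find_embeddings(n_sub, sub_edges, n_super, super_edges):
    """Yield every injective node mapping that preserves all subgraph edges.

    Assumption: both graphs are undirected, so (u, v) and (v, u) are the same edge.
    """
    super_set = {frozenset(e) for e in super_edges}
    # Each permutation assigns distinct supergraph nodes to subgraph nodes 0..n_sub-1.
    for image in permutations(range(n_super), n_sub):
        if all(frozenset((image[x], image[y])) in super_set for x, y in sub_edges):
            yield dict(enumerate(image))

mappings = list(find_embeddings(3, [(0, 1)], 3, [(1, 2)]))
for m in mappings:
    print(m)
```

Because orientation is ignored here, the enumeration also reports the mirrored assignment 0 -> 2, 1 -> 1, 2 -> 0 alongside the mapping found by the solver. The number of candidate assignments grows factorially, so this is only useful as a sanity check on tiny inputs.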

Osmnx cannot find path between nodes in composed graph?

I am trying to use osmnx to find distances between an origin point (lat/lon) and the nearest infrastructure, such as railways, water or parks.
1) I get the entire graph from an area with network_type='walk'.
2) Get the needed infrastructure, e.g. railway for that same area.
3) Compose the two graphs into one.
4) Find the nearest node from origin point in the original graph.
5) Find the nearest node from the origin point in the infrastructure graph
6) Find the shortest route length between the two nodes.
If you run the example below, you will see that it is missing 20% of the data because it cannot find a route between the nodes. For infrastructure='way["leisure"~"park"]' or infrastructure='way["natural"~"wood"]' this is even worse, with 80-90% of nodes not being connected.
Minimal reproducible example:
import osmnx as ox
import networkx as nx

bbox = [55.5267243, 55.8467243, 12.4100724, 12.7300724]
g = ox.graph_from_bbox(bbox[0], bbox[1], bbox[2], bbox[3],
                       retain_all=True,
                       truncate_by_edge=True,
                       simplify=False,
                       network_type='walk')
points = [(55.6790884456018, 12.568493971506154),
          (55.6790884456018, 12.568493971506154),
          (55.6867418740291, 12.58232314016353),
          (55.6867418740291, 12.58232314016353),
          (55.6867418740291, 12.58232314016353),
          (55.67119624894504, 12.587201455313153),
          (55.677406927839506, 12.57651997656002),
          (55.6856574907879, 12.590500429002823),
          (55.6856574907879, 12.590500429002823),
          (55.68465359365924, 12.585474365063224),
          (55.68153666806675, 12.582594757267945),
          (55.67796979175, 12.583111746311117),
          (55.68767346629932, 12.610040871066179),
          (55.6830855237578, 12.575431380892427),
          (55.68746749645466, 12.589488615911913),
          (55.67514254640597, 12.574308210656602),
          (55.67812748568291, 12.568454119053886),
          (55.67812748568291, 12.568454119053886),
          (55.6701733527419, 12.58989203029166),
          (55.677700136266616, 12.582800629527789)]
railway = ox.graph_from_bbox(bbox[0], bbox[1], bbox[2], bbox[3],
                             retain_all=True,
                             truncate_by_edge=True,
                             simplify=False,
                             network_type='walk',
                             infrastructure='way["railway"]')
g_rail = nx.compose(g, railway)
l_rail = []
for point in points:
    nearest_node = ox.get_nearest_node(g, point)
    rail_nn = ox.get_nearest_node(railway, point)
    if nx.has_path(g_rail, nearest_node, rail_nn):
        l_rail.append(nx.shortest_path_length(g_rail, nearest_node, rail_nn, weight='length'))
    else:
        l_rail.append(-1)
Two things caught my attention.
First, the OSMnx documentation specifies that the ox.graph_from_bbox parameters be given in the order north, south, east, west (https://osmnx.readthedocs.io/en/stable/osmnx.html). Your bbox list is ordered south, north, west, east, and when I ran your code as written I got empty graphs.
Second, the parameter retain_all is the key, as you may already know. When set to True, it keeps every node in the graph, including nodes in components that are disconnected from the rest. Such disconnected pieces are common, primarily because OpenStreetMap consists of voluntarily contributed geographic information and is incomplete. I suggest you set retain_all=False, so that each graph keeps only its largest connected component. With that change, the run below returns a complete list without any -1.
I hope this helps.
g = ox.graph_from_bbox(bbox[1], bbox[0], bbox[3], bbox[2],
                       retain_all=False,
                       truncate_by_edge=True,
                       simplify=False,
                       network_type='walk')
railway = ox.graph_from_bbox(bbox[1], bbox[0], bbox[3], bbox[2],
                             retain_all=False,
                             truncate_by_edge=True,
                             simplify=False,
                             network_type='walk',
                             infrastructure='way["railway"]')
g_rail = nx.compose(g, railway)
l_rail = []
for point in points:
    nearest_node = ox.get_nearest_node(g, point)
    rail_nn = ox.get_nearest_node(railway, point)
    if nx.has_path(g_rail, nearest_node, rail_nn):
        l_rail.append(nx.shortest_path_length(g_rail, nearest_node, rail_nn, weight='length'))
    else:
        l_rail.append(-1)
print(l_rail)
Out[60]:
[7182.002999999995,
7182.002999999995,
5060.562000000002,
5060.562000000002,
5060.562000000002,
6380.099999999999,
7127.429999999996,
4707.014000000001,
4707.014000000001,
5324.400000000003,
6153.250000000002,
6821.213000000002,
8336.863999999998,
6471.305,
4509.258000000001,
5673.294999999996,
6964.213999999994,
6964.213999999994,
6213.673,
6860.350000000001]
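The reason retain_all matters can be illustrated without OSMnx. The sketch below is a standard-library stand-in, not OSMnx's implementation: it finds the connected components of a small undirected graph and keeps only the largest, which is the effect described for retain_all=False. The function names and the toy adjacency dictionary are assumptions made for illustration.

```python
from collections import deque

def connected_components(adj):
    """Return a list of vertex sets, one per connected component.

    Assumption: adj is a symmetric adjacency dict (undirected graph).
    """
    seen = set()
    components = []
    for start in adj:
        if start in seen:
            continue
        component = {start}
        seen.add(start)
        queue = deque([start])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in seen:
                    seen.add(v)
                    component.add(v)
                    queue.append(v)
        components.append(component)
    return components

def keep_largest_component(adj):
    """Drop every vertex that is not in the largest connected component."""
    largest = max(connected_components(adj), key=len)
    return {u: [v for v in adj[u] if v in largest] for u in largest}

# Two separate pieces: {1, 2, 3} and {10, 11}. No path exists between them.
toy = {1: [2], 2: [1, 3], 3: [2], 10: [11], 11: [10]}
trimmed = keep_largest_component(toy)
print(sorted(trimmed))
```

If a nearest-node lookup lands on a vertex such as 10 or 11 in the untrimmed graph, no route to the main component can be found, which is the situation that produced the -1 entries in the question.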

Recommend a good heuristic for longest Hamiltonian path in polynomial time

I have a complete weighted graph with 1000 nodes and need to find the longest possible Hamiltonian path in the graph (the sequence of nodes, to be more precise). I am supposed to fit in 5 sec (for Java), the memory limit is big enough.
Finding the longest Hamiltonian path doesn't look much different from finding solution for TSP (travelling salesman). Of course, an optimal solution is out of question, so I'm looking for a good heuristic.
My best solution so far uses the Nearest Neighbour algorithm, which is easy to implement and runs in polynomial time (it takes ~0.7 seconds for a 1000-node graph). The result is a bit far from the optimal solution, though.
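For reference, a greedy baseline of this kind can be sketched in a few lines of Python (used here for brevity; the same loop ports directly to Java). This is an assumed reading of the Nearest Neighbour idea adapted to the longest-path goal: from the current node, always move to the unvisited node with the heaviest connecting edge. The function name `greedy_longest_path` and the 4-node weight matrix are hypothetical.

```python
def greedy_longest_path(weights, start=0):
    """Greedy heuristic for a long Hamiltonian path in a complete weighted graph.

    weights is a symmetric n x n matrix. Runs in O(n^2) time and returns
    (path, total_weight). The result is a heuristic, not an optimum.
    """
    n = len(weights)
    visited = [False] * n
    visited[start] = True
    path = [start]
    total = 0
    current = start
    for _ in range(n - 1):
        # Pick the unvisited node joined to the current node by the heaviest edge.
        best = max((v for v in range(n) if not visited[v]),
                   key=lambda v: weights[current][v])
        total += weights[current][best]
        visited[best] = True
        path.append(best)
        current = best
    return path, total

weights = [
    [0, 2, 9, 4],
    [2, 0, 6, 3],
    [9, 6, 0, 8],
    [4, 3, 8, 0],
]
path, total = greedy_longest_path(weights)
print(path, total)  # [0, 2, 3, 1] 20
```

Because the path is built node by node, the sequence of nodes is available directly, unlike a dynamic program that only tracks costs.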
So I'm looking for a better heuristic that still runs relatively fast.
I see mentioned Tabu Search, Simulated Annealing, Ant Colony, Genetics, Branch and Bound, MST based algorithms and others.
The problem is that their implementations are not exactly trivial, so it's hard to work out their time complexity and decide which of them can fit in the 5 sec. time limit, i.e. which run in polynomial time.
For some algorithms like Christofides' I see that the complexity is O(V^4), where V is the number of vertices, which apparently makes it impossible to fit.
I came across the Bitonic Tour solution, usually used for finding the shortest Hamiltonian path in Euclidean graphs, but seems kind of OK for finding the longest path in non-Euclidean graphs too:
public static void minCostTour(int[][] graph) {
    int n = graph.length;
    int[][] dp = new int[n][n];
    dp[0][1] = graph[0][1];
    for (int i = 0; i < n - 1; i++) {
        for (int j = i + 1; j < n; j++)
            if (i == (j - 1) && i != 0) {
                dp[i][j] = dp[0][j-1] + graph[0][j];
                for (int k = 1; k <= j - 2; k++)
                    if ((dp[k][j-1] + graph[k][j] < dp[i][j])) {
                        dp[i][j] = dp[k][j-1] + graph[k][j];
                    }
            } else if (i != 0 || j != 1) {
                dp[i][j] = dp[i][j-1] + graph[j-1][j];
            }
    }
    System.out.println("Optimal Tour Cost: " + (dp[n-2][n-1] + graph[n-2][n-1]));
}
The standard algorithm includes an initial sorting of coordinates, which I skipped, as apparently there are no coordinates to sort (the graph is non-Euclidean).
This dynamic programming solution runs in O(V^2) so it might be good.
The problem is that it outputs the Hamiltonian path length and I need the sequence of nodes. I can't really understand how to restore the path from the above algorithm (if possible at all).
TL;DR version:
Can the Bitonic Tour algorithm above be used for finding the sequence of nodes on the longest Hamiltonian path in a complete weighted graph?
If not, can you recommend an algorithm with similar (polynomial) time complexity for that task?

DFS and BFS trees related to undirected connected simple graph

Given an undirected connected simple graph G, let Td be a DFS tree and Tb be a BFS tree of G starting at some fixed vertex.
If Tb and Td are identical then G is an acyclic graph. Is this always true?
I worked on the following example.
Let Td and Tb both be:
(A,B)
(A,C)
(B,D)
(B,E)
(C,F)
(C,G)
and then the graph could be:
(A,B)
(A,C)
(B,D)
(B,E)
(C,F)
(C,G)
(C,A)
Am I thinking in the right direction? The answer given for this problem is "true", but it seems to me that it should be "false".
Please help.
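One way to test an example like this is to compute both trees and compare them. The sketch below is a rough check under stated assumptions: the graph is undirected and simple, so (C,A) and (A,C) denote the same edge, and both traversals visit neighbors in alphabetical order (other orders can give other trees). All function names are hypothetical.

```python
from collections import deque

def build_adjacency(edges):
    """Adjacency sets of an undirected simple graph; (u, v) and (v, u) are one edge."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj

def tree_edges(parent):
    return {frozenset((v, p)) for v, p in parent.items() if p is not None}

def bfs_tree(adj, root):
    parent = {root: None}
    queue = deque([root])
    while queue:
        u = queue.popleft()
        for v in sorted(adj[u]):
            if v not in parent:
                parent[v] = u
                queue.append(v)
    return tree_edges(parent)

def dfs_tree(adj, root):
    parent = {root: None}
    def visit(u):
        for v in sorted(adj[u]):
            if v not in parent:
                parent[v] = u
                visit(v)
    visit(root)
    return tree_edges(parent)

def edge_count(adj):
    return sum(len(neighbors) for neighbors in adj.values()) // 2

example = [("A", "B"), ("A", "C"), ("B", "D"), ("B", "E"),
           ("C", "F"), ("C", "G"), ("C", "A")]
adj = build_adjacency(example)
print(edge_count(adj), len(adj))                 # 6 edges on 7 vertices
print(bfs_tree(adj, "A") == dfs_tree(adj, "A"))  # True

cyclic = build_adjacency(example + [("B", "C")])  # adds the cycle A-B-C
print(bfs_tree(cyclic, "A") == dfs_tree(cyclic, "A"))  # False
```

Under these assumptions the example graph has 6 distinct edges on 7 vertices, so it is itself a tree and contains no cycle. Once a genuine cycle is added, the two trees computed here no longer coincide.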

Relationship between the density of edges and the number of vertices in a graph

I want to understand how to compute big-O for a dense versus a sparse graph.
"Algorithms in a nutshell" says that for a sparse graph, O(E) is O(V), and for a dense graph, O(E) is closer to O(V^2). Does anyone know how that is derived?
Assuming the graph is simple, in the worst case every node is connected to all |V|-1 other nodes. In an undirected graph this gives |E| = (|V|-1) + (|V|-2) + ... + 1 = |V| * (|V|-1) / 2 = O(|V|^2), and in a directed graph |E| = |V| * (|V|-1) = O(|V|^2).
A good example of a dense graph is a clique, which has all possible edges.
For a sparse graph, we assume the number of edges connected to each vertex is bounded by a constant. Let this constant be k. Then |E| <= k * |V|, and we get |E| = O(|V|).
A good example of a sparse graph is the internet, where every URL is a node and every link is an edge.
Note that if the graph is not simple, you cannot bound |E| with any function of |V|.
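The two growth rates can be checked numerically with a small Python sketch. It counts edges in a complete undirected graph and in a bounded-degree graph as the number of vertices grows. The ring construction (each vertex linked to its k next neighbors around a circle) is just one assumed example of a sparse graph, and both function names are hypothetical.

```python
def complete_graph_edges(n):
    """All unordered pairs of n vertices: n * (n - 1) / 2 edges."""
    return [(u, v) for u in range(n) for v in range(u + 1, n)]

def ring_lattice_edges(n, k):
    """Each vertex is linked to its k next neighbors on a ring, so degree <= 2k."""
    edges = set()
    for u in range(n):
        for step in range(1, k + 1):
            v = (u + step) % n
            if u != v:
                edges.add(frozenset((u, v)))
    return edges

for n in (10, 100, 1000):
    dense = len(complete_graph_edges(n))
    sparse = len(ring_lattice_edges(n, 2))
    print(n, dense, sparse)
# 10 45 20
# 100 4950 200
# 1000 499500 2000
```

Multiplying |V| by 10 multiplies the dense edge count by roughly 100 and the sparse edge count by exactly 10, matching O(|V|^2) and O(|V|) respectively.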
It's not derived, it's a definition. In a fully connected (directed) graph with self-loops, the number of edges is |E| = |V|², so the definition of a dense graph is reasonable. A sparse graph is defined as one where |E| = O(|V|), so the average number of edges per vertex is bounded by a constant.
Note that if the number of edges is much smaller, e.g. O(lg |V|), then it's still O(|V|) as well. One could imagine a "semi-sparse" class of graphs with |E| = O(|V| lg |V|) or something like that, but I personally have never encountered such a class in practice.

Resources