Graph Algorithm: Similar to TSP

I want to solve a problem similar to the TSP (Travelling Salesman Problem).
I have N nodes (0 < N < 20) and I must visit all of them.
The cost between every pair of nodes is the same.
I can visit a node an unlimited number of times.
I want to find more than one path, and there is no restriction on the cost.
Can you suggest some effective algorithms for this problem?

Here is a solution that works with a weighted graph.
First, the naive solution: enumeration.
It runs in O(n!) because there are (n-1)! Hamiltonian paths starting from a fixed node, and you need O(n) to check each one.
There is a better algorithm, using dynamic programming, that runs in O(n^2 * 2^n): there are O(n * 2^n) states and each one takes O(n) to evaluate.
Define the state as follows: for a node x and a set of nodes S containing x,
w[S][x] = the weight of the shortest path that starts at node x, goes through all the nodes in the set S, and then finishes at node 0.
Note that 0 does not necessarily belong to S.
S = {x} is the base case: w[S][x] = weight(x,0)
Then the recursion formula:
If S is larger than {x}, iterate over the possible next steps y:
w[S][x] = min(weight(x,y) + w[S\{x}][y] for all y in S\{x})
This algorithm will output just one optimal path.
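The recurrence above can be sketched in Python as follows. This is a minimal illustration, not a tuned implementation: the function name shortest_tour and the adjacency-matrix input weight are assumptions made for the example, the set S is encoded as a bitmask, and only the cost of the tour is returned, not the path itself.

```python
from functools import lru_cache

def shortest_tour(weight):
    """Cost of the cheapest closed tour that starts and ends at node 0.

    weight is an n x n matrix (list of lists); weight[x][y] is the cost of x -> y.
    """
    n = len(weight)
    if n == 1:
        return 0

    @lru_cache(maxsize=None)
    def w(mask, x):
        # mask encodes the set S as bits; x is assumed to be a member of S.
        rest = mask & ~(1 << x)          # S \ {x}
        if rest == 0:
            return weight[x][0]          # base case S = {x}: go back to node 0
        # try every possible next node y in S \ {x}
        return min(weight[x][y] + w(rest, y)
                   for y in range(n) if (rest >> y) & 1)

    full = (1 << n) - 2                  # every node except node 0
    return min(weight[0][x] + w(full, x) for x in range(1, n))
```

If nodes may be visited more than once, as in the question, one option is to replace weight with the all-pairs shortest-path distances before calling the function.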

Related

Computing classes of maximal path-equivalent nodes in a rooted DAG

I have a rooted directed acyclic graph with a single root node r.
I'm interested in computing the following equivalence:
"Nodes v and w are maximal path-equivalent iff every maximal path from r contains either both of v and w or none of them"
In particular, I want to find all equivalence classes w.r.t. the above condition, possibly in O(n+m) time (n nodes, m edges).
I feel like this problem is not unknown but I don't know what terms to search for.
If anyone knows what this problem is called or has any ideas on how to solve it, I would appreciate it.
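To make the definition concrete, here is a brute-force Python sketch that enumerates every maximal path and groups nodes by the set of paths containing them. It is exponential in the worst case and is only meant for checking small examples, not as the O(n+m) algorithm asked for. The function name path_equivalence_classes and the successor-dictionary input are assumptions made for the illustration, and a maximal path is taken to be a path from r to a node with no outgoing edges.

```python
from collections import defaultdict

def path_equivalence_classes(succ, root):
    """Group nodes by the exact set of maximal root paths that contain them.

    succ maps each node to a list of its successors (sinks may be missing or empty).
    """
    paths = []

    def walk(node, prefix):
        prefix = prefix + [node]
        if not succ.get(node):           # sink reached: the path is maximal
            paths.append(prefix)
            return
        for nxt in succ[node]:
            walk(nxt, prefix)

    walk(root, [])

    signature = defaultdict(set)         # node -> indices of paths through it
    for i, path in enumerate(paths):
        for v in path:
            signature[v].add(i)

    classes = defaultdict(list)          # identical signature <=> equivalent
    for v, sig in signature.items():
        classes[frozenset(sig)].append(v)
    return sorted(sorted(c) for c in classes.values())
```

Two nodes have the same signature exactly when every maximal path contains both or neither, which is the equivalence stated above.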

How can we implement efficiently a maximum set coverage arc of fixed cardinality?

I am working on the following problem and implementing the solution in C++.
Let us assume that we have a directed weighted graph G = (V, A, w) and a set of persons P.
We receive a number of queries; each query gives a person p and two vertices s and d, and asks for the minimum-weight path between s and d for person p. One person can have multiple paths.
After all the queries, I am given a number k <= |A| and I must output k arcs such that the number of persons using at least one of the k arcs is maximal (this is a maximum coverage problem).
To solve the first part, I implemented Dijkstra's algorithm using priority_queue to compute the minimum weight between s and d. (Is this a good way to do it?)
To solve the second part, I store for every arc the set of persons that use it, and I use a greedy algorithm to choose the arcs (at each stage, I choose an arc used by the largest number of uncovered persons). (Is this a good way to do it?)
Finally, if my algorithms are good, how can I implement them efficiently in C++?
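The greedy step described in the question can be sketched as below. The sketch is in Python rather than C++ to keep it short; the function name greedy_max_coverage and the input format (a dictionary from arc to the set of persons using it) are assumptions made for the illustration. It recomputes the gain of every arc in each round, which is simple but not the fastest option.

```python
def greedy_max_coverage(users_of_arc, k):
    """Pick up to k arcs, each time taking the arc used by the most uncovered persons.

    users_of_arc maps an arc (any hashable, e.g. a (tail, head) tuple)
    to the set of persons whose shortest paths use that arc.
    """
    remaining = {arc: set(persons) for arc, persons in users_of_arc.items()}
    chosen, covered = [], set()
    for _ in range(k):
        # arc with the largest number of persons that are still uncovered
        best = max(remaining, key=lambda a: len(remaining[a] - covered), default=None)
        if best is None or not (remaining[best] - covered):
            break                        # nothing left to gain
        chosen.append(best)
        covered |= remaining.pop(best)
    return chosen, covered
```

Greedy selection is the standard approach for maximum coverage and is known to reach at least a (1 - 1/e) fraction of the optimal coverage, so the strategy in the question is a reasonable one.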

How to design an O(m) time algorithm to compute the shortest cycle of G(undirected unweighted graph) that contains s?

How to design an O(m) time algorithm to compute the shortest cycle of G (an undirected, unweighted graph) that contains s (s ∈ V)?
You can run a BFS with your node s as the starting point; this gives you a BFS tree. Afterwards you can build a lowest-common-ancestor (LCA) data structure on this BFS tree, for example with Tarjan's lowest-common-ancestor algorithm. I will not go into details here. Given two nodes v and w, LCA lets you find the lowest node in a tree (the BFS tree in our case) that has both v and w as descendants. The idea is that when you consider two nodes joined by an edge of G that is not in the BFS tree, you want to check whether their tree paths to the root (s in this case) plus the edge that connects them form a cycle through s. This is the case exactly when their LCA is s.
Assuming you have built the LCA structure, you run a second BFS. When expanding the neighbours of a node v, you also take into consideration the nodes already marked as explored. Suppose x is a neighbour of v such that x has already been explored. If the LCA of v and x is s, then the path from x to s and the path from v to s in the BFS tree, plus the edge xv, form a cycle of length depth(x) + depth(v) + 1. The first such pair you encounter is not necessarily the best one, so take the minimum of depth(x) + depth(v) + 1 over all such pairs. If no such x exists, then s is not contained in any cycle.
The resulting cycle is the shortest one containing s.
Both BFS passes run in O(m) and the LCA structure can also be built in linear time, hence the whole procedure can be implemented in O(m).
This might be a bit overkill. There surely is a much simpler solution.
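One simpler variant, sketched here in Python, avoids the LCA structure: during the BFS each node is labelled with the child of s whose subtree it belongs to, and two nodes have s as their LCA exactly when those labels differ. This swaps the LCA structure for a label comparison and is not the answer's exact procedure. The function name shortest_cycle_through and the adjacency-list dictionary are assumptions made for the illustration, and the graph is assumed to be simple (no self-loops or parallel edges).

```python
from collections import deque

def shortest_cycle_through(adj, s):
    """Length of the shortest simple cycle containing s, or None if there is none.

    adj maps every node to a list of its neighbours (undirected, simple graph).
    """
    dist = {s: 0}
    branch = {}                          # node -> the child of s it hangs under
    queue = deque([s])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                branch[v] = v if u == s else branch[u]
                queue.append(v)

    best = None
    for u in branch:                     # every reached node except s
        for v in adj[u]:
            # an edge joining two different subtrees of s closes a cycle through s
            if v != s and branch[v] != branch[u]:
                length = dist[u] + dist[v] + 1
                if best is None or length < best:
                    best = length
    return best
```

Both loops touch each edge a constant number of times, so the sketch stays within O(n + m).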

Feedback on algorithm for Steiner Tree with restrictions

For an assignment, I have to create a Steiner Tree. However, this is not a typical Steiner Tree, as the graph structure we're required to use does not allow insertion of new vertices. Rather, the test cases define a graph structure of N vertices and M edges while specifically marking X vertices as target nodes. These are the nodes we have to span while using some, none or all of the unmarked vertices in the graph.
My solution to this problem is:
Implement Dijkstra's Algorithm to find the shortest path between all the target vertices
For each of the shortest paths 1:n
    Extract all current selected path vertices into a set
    Extract all remaining vertices into a set
    For all vertices of the current selected path 1:m
        Execute Dijkstra to find shortest path between current vertex and other path's vertices
    If this creates a spanning tree, save path and length in priority queue sorted by length value
Pop top of priority queue and return path
My issue is that this is an exhaustive search that uses the initial application of Dijkstra to create a reduced set of possible start-end vertices for a shorter path than a minimum spanning tree.
Is there a heuristic or other algorithm that may solve this problem?
With some help, I worked out this answer for a similar problem that I had. Rather than adding new vertices as in a spatial Steiner tree problem, the new Steiner points in this graph are the vertices that lie along the paths between the marked nodes. For a graph with N vertices, M edges, X required vertices, and S found vertices (vertices along our path):
Compute All Pairs Shortest Paths (Floyd-Warshall, Johnson's, whatever)
for k in X
    remove k from X, insert k into S
    for v in (X + S) - Both sets
        find the shortest distance from k to v - path P
    for u in P (all vertices on the path)
        insert u into S
        if u exists in X, remove u from X
Now for the wall of text as to what this algorithm does. We pick a vertex k in X, and then find the minimum distance to the nearest other vertex in the target set X or in the result set S, and call that vertex v. Then we follow the path of nodes from k to v, inserting them into our result set. Finally, double check and make sure that any vertices in X that were on the path (shouldn't happen) are removed from X.
Any new vertex that you want to add, c, will have a minimum distance to some node already in your result set S. Since the nodes already in S are the minimum distance apart, it follows that c will be the minimum distance from any point in S to c. For example, if you have three nodes, A, B, and C, if A and B are already found to be a minimum distance apart, adding C fulfills the requirement that it is the minimum distance from B, and the minimum distance path from A to C goes through B.
I did some research on the discrete Steiner Tree problem (which is what this is), and this is the best simple approach that I found. Keep in mind that the Steiner tree problem in graphs is NP-hard, so a greedy construction like this one is a heuristic and is not guaranteed to return the minimum tree. The main cost is going to be the O(n^3) time it takes to do all-pairs shortest paths, but then the construction of the tree should be straightforward and quick, since you just need to look up distance information. The implementation I wound up working with is outlined nicely on Wikipedia.
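One possible reading of the steps above, sketched in Python: compute all-pairs shortest paths with Floyd-Warshall, start the tree at the first required vertex, and repeatedly attach the required vertex that is closest to the tree along its shortest path. The function name steiner_heuristic, the edge-list input, and the choice to grow a single tree are assumptions made for the illustration; edge weights are assumed positive and vertices are numbered 0 to n-1.

```python
def steiner_heuristic(n, edges, terminals):
    """Greedy shortest-path heuristic; returns tree edges as (u, v) pairs with u < v.

    edges is a list of (u, v, weight) for an undirected graph.
    """
    INF = float("inf")
    dist = [[0 if i == j else INF for j in range(n)] for i in range(n)]
    nxt = [[j if i == j else None for j in range(n)] for i in range(n)]
    for u, v, w in edges:
        if w < dist[u][v]:
            dist[u][v] = dist[v][u] = w
            nxt[u][v], nxt[v][u] = v, u
    for k in range(n):                   # Floyd-Warshall with next-hop tracking
        for i in range(n):
            for j in range(n):
                if dist[i][k] + dist[k][j] < dist[i][j]:
                    dist[i][j] = dist[i][k] + dist[k][j]
                    nxt[i][j] = nxt[i][k]

    todo = list(terminals)               # X: required vertices not yet connected
    in_tree = {todo.pop(0)}              # S: vertices already in the result
    tree_edges = set()
    while todo:
        # closest (remaining required vertex, tree vertex) pair
        k, v = min(((t, s) for t in todo for s in in_tree),
                   key=lambda p: dist[p[0]][p[1]])
        if dist[k][v] == INF:
            raise ValueError("required vertices are not connected")
        u = k
        while u != v:                    # walk the shortest path from k to v
            step = nxt[u][v]
            tree_edges.add((min(u, step), max(u, step)))
            in_tree.add(u)
            u = step
        todo = [t for t in todo if t not in in_tree]
    return tree_edges
```

Because each new path ends at the nearest vertex already in the tree, it cannot pass through another tree vertex when weights are positive, so the result stays a tree.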

Synonym chains - Efficient routing algorithm for iOS/sqlite

A synonym chain is a series of closely related words that span two anchors. For example, the English words "black" and "white" can be connected as:
black-dark-obscure-hidden-concealed-snug-comfortable-easy-simple-pure-white
Or, here's "true" and "false":
true-just=fair=beautiful=pretty-artful-artificial-sham-false
I'm working on a thesaurus iOS app, and I would like to display synonym chains also. The goal is to return a chain from within a weighted graph of word relations. My source is a very large thesaurus with weighted data, where the weights measure similarity between words. (e.g., "outlaw" is closely related to "bandit", but more distantly related to "rogue.") Our actual values range from 0.001 to ~50, but you can assume any weight range.
What optimization strategies do you recommend to make this realistic, e.g., within 5 seconds of processing on a typical iOS device? Assume the thesaurus has half a million terms, each with 20 associations. I'm sure there's a ton of prior research on these kinds of problems, and I'd appreciate pointers on what might be applied to this.
My current algorithm involves recursively descending a few levels from the start and end words, and then looking for intercepting words, but that becomes too slow with thousands of sqlite (or Realm) selects.
Since you said your source is a large thesaurus with weighted data, I'm assuming if you pick any word, you will have the weight to its successor in the similarity graph. I will always use the sequence below, when I'm giving any example:
black-dark-obscure-hidden-concealed-snug-comfortable-easy-simple-pure-white
Let's think of each word as a node on a graph; each similarity relationship a word has with another is an edge on that graph. Each edge is weighted with a cost, which is the weight you have in the source file. So a good way to find a path from one word to another is to use A* (A star) path finding.
I'm taking the minimum "cost" to travel from a word to its successor to be 1. You can adjust it accordingly. First you will need a good heuristic function, since A* uses it to decide which node to expand next. This heuristic function returns an estimated distance between any two words. You must respect the fact that the "distance" it returns can never be bigger than the real distance between the two words (that is, the heuristic must be admissible). Since I don't know any relationship between the words of a thesaurus, my heuristic function will always return the minimum cost, 1. In other words, it will always say a word is the most similar word to any other. For example, my heuristic function tells me that 'black' is the best synonym for 'white'.
You should tune the heuristic function if you can, so that it returns more accurate distances and makes the algorithm run faster. That's the tricky part, I guess.
You can see the pseudo-code for the algorithm in the Wikipedia article on A*. But here it is for a faster explanation:
function A*(start,goal)
    closedset := the empty set -- The set of nodes already evaluated.
    openset := {start} -- The set of tentative nodes to be evaluated, initially containing the start node
    came_from := the empty map -- The map of navigated nodes.
    g_score[start] := 0 -- Cost from start along best known path.
    -- Estimated total cost from start to goal through y.
    f_score[start] := g_score[start] + heuristic_cost_estimate(start, goal)
    while openset is not empty
        current := the node in openset having the lowest f_score[] value
        if current = goal
            return reconstruct_path(came_from, goal)
        remove current from openset
        add current to closedset
        for each neighbor in neighbor_nodes(current)
            if neighbor in closedset
                continue
            tentative_g_score := g_score[current] + dist_between(current,neighbor)
            if neighbor not in openset or tentative_g_score < g_score[neighbor]
                came_from[neighbor] := current
                g_score[neighbor] := tentative_g_score
                f_score[neighbor] := g_score[neighbor] + heuristic_cost_estimate(neighbor, goal)
                if neighbor not in openset
                    add neighbor to openset
    return failure
function reconstruct_path(came_from,current)
    total_path := [current]
    while current in came_from:
        current := came_from[current]
        total_path.append(current)
    return total_path
Now, for the algorithm you'll have two sets of nodes: the ones you are going to visit (open) and the ones you have already visited (closed). You will also have two maps of distances per node, which you fill in as you travel through the graph.
One map (g_score) tells you the lowest real distance travelled so far between the starting node and the specified node. For example, g_score["hidden"] returns the lowest weighted cost found to travel from 'black' to 'hidden'.
The other map (f_score) tells you the estimated total distance from the start to the goal when passing through the specified node: g_score plus the heuristic estimate of what remains. For example, f_score["snug"] returns the known cost of travelling from "black" to "snug" plus the estimated cost of travelling from "snug" to "white" given by the heuristic function. Remember, the estimated part will always be less than or equal to the real cost of travelling between the words, since our heuristic function needs to respect the aforementioned rule.
As the algorithm runs, you travel from node to node, beginning at the starting word, saving the nodes you travelled through and the costs you "used" to get there. You replace the recorded path to a node whenever you find a cheaper cost for it in g_score. You use f_score to predict which of the 'unvisited' nodes is best visited next. It's best to keep the open nodes in a min-heap ordered by f_score.
The algorithm ends when the node it picks is the goal. You then reconstruct the minimum path using the came_from map that you kept updating at each iteration. The other way the algorithm stops is when it has run out of open nodes without finding the goal. When this happens, you can say there is no path from the starting node to the goal.
This algorithm is widely used in games to find the best path between two objects in a 3D world. To improve it, you just need a better heuristic function, one that lets the algorithm try the better nodes first, leading it to the goal faster.
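For reference, a compact runnable version of the same search in Python, using a heap for the open set. This is a sketch under assumptions: the function name a_star, the neighbors callback returning (word, cost) pairs, and the tiny word graph in the usage note are all hypothetical, and how thesaurus similarity weights are turned into costs is left to the caller. The default heuristic returns 0, which makes the search behave like Dijkstra's algorithm; any replacement is assumed to be admissible and consistent.

```python
import heapq
from itertools import count

def a_star(neighbors, start, goal, heuristic=lambda a, b: 0):
    """Return the cheapest start -> goal path as a list of nodes, or None.

    neighbors(node) must return an iterable of (next_node, cost) pairs.
    """
    tie = count()                        # tie-breaker so nodes are never compared
    g_score = {start: 0}
    came_from = {}
    closed = set()
    open_heap = [(heuristic(start, goal), next(tie), start)]
    while open_heap:
        _, _, current = heapq.heappop(open_heap)
        if current == goal:
            path = [current]
            while current in came_from:  # follow the links back to the start
                current = came_from[current]
                path.append(current)
            return path[::-1]
        if current in closed:
            continue                     # stale heap entry, already expanded
        closed.add(current)
        for neighbor, cost in neighbors(current):
            tentative = g_score[current] + cost
            if tentative < g_score.get(neighbor, float("inf")):
                came_from[neighbor] = current
                g_score[neighbor] = tentative
                f = tentative + heuristic(neighbor, goal)
                heapq.heappush(open_heap, (f, next(tie), neighbor))
    return None
```

With a hypothetical graph such as graph = {"black": [("dark", 1), ("grim", 4)], "dark": [("obscure", 1)], "obscure": [("white", 5)], "grim": [("white", 1)], "white": []}, calling a_star(lambda w: graph[w], "black", "white") returns the chain through "grim". On a device, neighbors would be the place to batch or cache the database lookups.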
Here's a closely related question and answer: Algorithm to find multiple short paths
There you can see comments about Dijkstra's and A-star, Dinic's, but more broadly also the idea of maximum flow and minimum cost flow.

Resources