A new edge is inserted into a minimum spanning tree - graph-algorithm

I am trying to find an algorithm for the following question, with one difference:
the edge weights are not necessarily distinct.
Give an efficient algorithm to test if T remains the minimum-cost spanning tree with the new edge added to G.
The link below has a solution, but it does not cover the difference I described above:
the edge weights are not necessarily distinct.
Updating a Minimum spanning tree when a new edge is inserted
Does anyone have an idea?

Well, the naive approach of running Prim or Kruskal on the new graph and comparing the total cost of the resulting minimum spanning tree with that of T isn't too bad at O(|E| log |E|).
But we don't need to look at the whole graph.
Suppose your new edge connects vertices A and B. Let C be the parent of A. If B is not a descendant of A, then if A-B is lower cost than A-C, then T is no longer the MST and B should be the new parent of the subtree rooted at A.
If B is a descendant of A, then if A-B is lower cost than any of the edges in T along the path from A to B, then T is no longer the MST: the highest cost edge along that path should be removed, B becomes the root of the newly disconnected component, and B should be added as a child of A.
I believe you may need to check these things a second time with the roles of A and B swapped. The cost is proportional to the height of T: roughly log|V|, where the base of the log is the average number of children per node of T, when T is reasonably balanced, and O(|V|) when T is a straight line.

First find an MST using one of the existing efficient algorithms.
Now adding an edge (v,w) to the MST creates a cycle. If the newly added edge has the maximum cost among the edges on the cycle (ties included) then the MST remains as it is. If some other edge on the cycle costs strictly more than the new edge, then removing the most expensive edge on the cycle gives a tree with lower cost.
So we need an efficient way to find the edge with the maximum cost on the cycle. You can climb from v and from w in the tree until you reach LCA(v, w) (the lowest common ancestor of v and w), tracking the most expensive edge seen along the way. This takes linear time in the worst case.
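A minimal sketch of this single-query check, assuming the tree is given as a list of (u, v, weight) tuples. The function name and the input format are made up for illustration, and the sketch walks the tree path with a DFS instead of climbing parent pointers, which is still linear time.

```python
from collections import defaultdict

def tree_remains_mst(tree_edges, new_edge):
    """Return True if the tree is still a minimum spanning tree after new_edge is added.

    tree_edges: list of (u, v, weight) tuples forming the current MST T.
    new_edge:   (a, b, weight) tuple for the edge added to G.
    """
    adj = defaultdict(list)
    for u, v, w in tree_edges:
        adj[u].append((v, w))
        adj[v].append((u, w))

    a, b, new_w = new_edge
    # heaviest[x] = largest edge weight on the tree path from a to x
    heaviest = {a: float("-inf")}
    stack = [a]
    while stack:
        u = stack.pop()
        if u == b:
            break
        for v, w in adj[u]:
            if v not in heaviest:
                heaviest[v] = max(heaviest[u], w)
                stack.append(v)

    # With non-distinct weights a tie is fine: T stays *a* minimum spanning tree.
    return new_w >= heaviest[b]
```

With the path 1-2-3-4 and weights 1, 2, 3, adding (1, 4, 3) keeps T minimal because of the tie, while adding (1, 4, 2) does not.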
If you are going to answer multiple such queries then pre-processing the MST is probably better. You can pre-process the MST to get a sparse table data structure in O(N lg N) time and then use this data structure to answer max queries in O(lg N) time in the worst case.
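One way to get O(lg N) path-maximum queries after O(N lg N) pre-processing is binary lifting, sketched below. Note this is a swapped-in technique that plays the same role as the sparse table mentioned above, not necessarily the exact structure the answer has in mind; `build_path_max` and its input format are hypothetical.

```python
from collections import defaultdict

def build_path_max(tree_edges, root):
    """Pre-process a tree so that path_max(u, v) returns the heaviest edge weight
    on the tree path between u and v. tree_edges is a list of (u, v, weight)."""
    adj = defaultdict(list)
    for u, v, w in tree_edges:
        adj[u].append((v, w))
        adj[v].append((u, w))

    depth = {root: 0}
    # up[v][k] = (2^k-th ancestor of v, max edge weight on the way up to it)
    up = {root: [(root, float("-inf"))]}
    order = [root]
    for u in order:  # BFS; the list grows while we iterate over it
        for v, w in adj[u]:
            if v not in depth:
                depth[v] = depth[u] + 1
                up[v] = [(u, w)]
                order.append(v)

    levels = max(1, len(order).bit_length())
    for k in range(1, levels):
        for v in order:
            mid, w1 = up[v][k - 1]
            anc, w2 = up[mid][k - 1]
            up[v].append((anc, max(w1, w2)))

    def path_max(u, v):
        best = float("-inf")
        if depth[u] < depth[v]:
            u, v = v, u
        diff = depth[u] - depth[v]
        for k in range(levels):  # lift the deeper vertex to the same depth
            if (diff >> k) & 1:
                anc, w = up[u][k]
                best = max(best, w)
                u = anc
        if u == v:
            return best
        for k in reversed(range(levels)):  # lift both to just below the LCA
            if up[u][k][0] != up[v][k][0]:
                best = max(best, up[u][k][1], up[v][k][1])
                u, v = up[u][k][0], up[v][k][0]
        return max(best, up[u][0][1], up[v][0][1])

    return path_max
```

A new edge (v, w) with cost c then leaves the MST unchanged exactly when c >= path_max(v, w).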

Related

The Hamiltonian Cycle

What is the worst-case time complexity of the Hamiltonian cycle problem using backtracking?
Is it O(n!) or O(n^n)? I tried to work out the complexity and got O(n×n!), which looks more like O(n^n) than O(n!).
The brute-force solution for finding a Hamiltonian cycle requires O(n!) work (which is indeed O(n^n), but O(n^n) wouldn't be a tight upper bound).
A Hamiltonian cycle in a graph G with n nodes has the form
H = v_1,v_2,v_3,...,v_n,v_1.
Since H includes every node in G, we may start our search from any arbitrarily chosen node, say v_1. Subsequently, there are n-1 candidate nodes to be the second node v_2 (i.e., all nodes but v_1 itself); there are n-2 choices for the third node v_3 (i.e., all nodes but the chosen candidates for v_1 and v_2), and so on; at the end, with the candidates for v_1 to v_{n-1} fixed, there is exactly one remaining candidate for v_n.
(i) This results in a maximum of (n-1)(n-2)...(2)(1) = (n-1)! combinations.
(ii) In a naive implementation, checking each combination requires O(n) work; i.e., to check whether or not a given combination is a Hamiltonian cycle, we go through the whole sequence and make sure it has the required properties of a Hamiltonian cycle.
Hence, the overall complexity is O(n) x (n-1)! = O(n!).
Of course, we can reduce the required work using a variety of techniques, e.g., branch and bound approaches.
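The counting argument above can be sketched as a brute-force search: fix v_1, try all (n-1)! orderings of the remaining nodes, and spend O(n) checking each one. The function name and the edge-list input format are assumptions made for this illustration, and it assumes a simple undirected graph with at least three nodes.

```python
from itertools import permutations

def hamiltonian_cycle(nodes, edges):
    """Return a Hamiltonian cycle as a list of nodes (first node repeated at the end),
    or None if the undirected graph has none."""
    edge_set = {frozenset(e) for e in edges}
    first, rest = nodes[0], nodes[1:]
    # (n-1)! candidate orderings, since the starting node is fixed
    for order in permutations(rest):
        cycle = (first, *order, first)
        # O(n) check: every consecutive pair must be joined by an edge
        if all(frozenset((x, y)) in edge_set for x, y in zip(cycle, cycle[1:])):
            return list(cycle)
    return None
```

On the 4-cycle with edges (0,1), (1,2), (2,3), (3,0) this returns [0, 1, 2, 3, 0]; on a star graph it returns None.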

How node2vec works

I have been reading about the node2vec embedding algorithm and I am a little confused about how it works.
For reference, node2vec is parametrised by p and q and works by simulating a bunch of random walks from nodes and just running word2vec embeddings on these walks as "sentences". By setting p and q in different ways, you can get more BFS-like or more DFS-like random walks in the simulation phase, capturing different network structure in the embedding.
Setting q > 1 gives us more BFS behaviour, in that the sampled walks consist of nodes within a small locality. The thing I am confused about is that the paper says this is equivalent to embedding nodes with similar structural properties close to each other.
I don't quite understand how that works. If I have two separate, say, star/hub structured nodes in my network that are far apart, why would embedding based on the random walks from those two nodes put those two nodes close together in the embedding?
This question has occupied my mind also after reading the article, and more so after empirically seeing that it indeed does that.
I assume you refer to the part of the paper with the diagram, which states that the resulting embeddings of u and s6 will be quite similar in the space.
To understand why this indeed happens, first we must understand how the skip-gram model embeds information, which is the mechanism that consumes the random walks.
The skip-gram model eventually generates similar embeddings for tokens that can appear in similar context - but what does that really mean from the skip-gram model perspective?
If we would like to embed the structural equivalence we would favor a DFS-like walk (and additionally we would have to use an adequate window size for the skip-gram model).
So random walks would look like
1. s1 > u > s4 > s5 > s6 > s8
2. s8 > s6 > s5 > s4 > u > s1
3. s1 > s3 > u > s2 > s5 > s6
4. s7 > s6 > s5 > s2 > u > s3
...
n. ...
What will happen is that there will be many walks in which u and s6 have the same surroundings. Since their surroundings are similar, their contexts are similar, and, as stated, similar context == similar embeddings.
One might further ask: what about order? Order doesn't really matter, since the skip-gram model uses the window size to generate pairs out of every sentence; the link I provided explains this concept further.
So bottom line, if you can create walks that will create similar context for two nodes, their embeddings will be similar.
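To make the "similar context" point concrete, here is a small sketch that collects the skip-gram context set of each node from the four example walks above. The helper name and the window size of 2 are assumptions chosen for illustration; a real skip-gram model trains on the (center, context) pairs rather than comparing sets directly.

```python
from collections import defaultdict

def skipgram_contexts(walks, window):
    """Map each node to the set of nodes seen within `window` positions of it."""
    contexts = defaultdict(set)
    for walk in walks:
        for i, center in enumerate(walk):
            lo, hi = max(0, i - window), min(len(walk), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    contexts[center].add(walk[j])
    return contexts

walks = [
    ["s1", "u", "s4", "s5", "s6", "s8"],
    ["s8", "s6", "s5", "s4", "u", "s1"],
    ["s1", "s3", "u", "s2", "s5", "s6"],
    ["s7", "s6", "s5", "s2", "u", "s3"],
]
contexts = skipgram_contexts(walks, window=2)
shared = contexts["u"] & contexts["s6"]
print(sorted(shared))  # ['s2', 's4', 's5']
```

Reversing a walk does not change these sets, which is the sense in which order does not matter here.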
My understanding of the two sampling strategies goes like this:
DFS: for each node (a) the walk explores a wide context, containing not just the immediate neighbors (b), but also nodes further away (c). When optimizing the embedding and trying to get nodes closer which have similar context, the optimizer has to consider not just the relation of (a)-(b), but also (b)-(c), and so on. This is the same as trying to place nodes so that their distance in the network is conserved (each node trying to find its place based on a wide context).
BFS: for each node (a) the walk only explores the local context, but it does that extensively, so probably all neighbors (b1, b2, ...) will be included (and maybe some 2nd neighbors). Imagine trying to find a node's place in the embedding space while only having information on its neighbors. Nodes that have similarly embedded neighbors should be close, e.g. dangling nodes with only 1 neighbor (and thus a respective walk containing the source node many times), or nodes with two neighbors which have high degrees (i.e. a bridge connecting two hubs). So by only knowing the local information the embedding will not optimize for global distances, thus the result is not based on the actual graph structure, but rather on local patterns (called structural equivalence in the paper, just to make it confusing).
BUT!!! I tried reproducing the results for the network of Les Miserables with the parameters used in the original paper (p=1 q=0.5 and p=1 q=2), and couldn't get node2vec to do this 2nd type structural embedding thing. There is something fishy going on, as others also struggle with getting node2vec to embed structurally, here is a paper on it. If someone was able to reproduce their results please tell me how :)
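For anyone experimenting with p and q, this is a minimal sketch of the second-order walk bias as described in the node2vec paper: stepping back to the previous node gets weight 1/p, stepping to a node adjacent to the previous node gets weight 1, and stepping further away gets weight 1/q. It assumes an unweighted graph stored as a dict of neighbor sets; the function names are made up and this is not the reference implementation.

```python
import random

def transition_weights(graph, prev, cur, p, q):
    """Unnormalized weights for the next step, given the walk just moved prev -> cur."""
    weights = {}
    for nxt in graph[cur]:
        if nxt == prev:            # return to the previous node
            weights[nxt] = 1.0 / p
        elif nxt in graph[prev]:   # stays at distance 1 from prev (BFS-like)
            weights[nxt] = 1.0
        else:                      # moves to distance 2 from prev (DFS-like)
            weights[nxt] = 1.0 / q
    return weights

def biased_walk(graph, start, length, p, q, rng):
    """Sample one walk of up to `length` nodes; rng is a random.Random instance."""
    walk = [start]
    while len(walk) < length:
        cur = walk[-1]
        neighbors = sorted(graph[cur])
        if not neighbors:
            break
        if len(walk) == 1:
            walk.append(rng.choice(neighbors))  # first step is uniform
        else:
            w = transition_weights(graph, walk[-2], cur, p, q)
            walk.append(rng.choices(neighbors, weights=[w[n] for n in neighbors])[0])
    return walk
```

With q > 1 the 1/q branch is down-weighted, so walks tend to stay near the previous node; with q < 1 they are pushed outward.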

Shuffling on Spark cartesian product

Assume a problem where I have an RDD X, I calculate the mean m on a single worker node, and then I want to calculate X-m, e.g. to calculate standard deviations. I want this to happen in the cluster, not on the driver node, i.e. I want m to be distributed. I thought of implementing it as a cartesian product of those two RDDs so that, essentially, as soon as m gets calculated, it propagates to all workers and they calculate X-m. My fear is that Spark will shuffle X to where m lives and do the subtraction there. Is there a guarantee on which side will be shuffled in the case of X.cartesian(m)?
The mean/stdev problem above is for illustration purposes - I know it's not an excellent example, but it's simple enough.

How to design an O(m) time algorithm to compute the shortest cycle of G (undirected unweighted graph) that contains s?

How can I design an O(m) time algorithm to compute the shortest cycle of G (an undirected, unweighted graph) that contains s (s ∈ V)?
You can run a BFS with your node s as the starting point; this will give you a BFS-tree. Afterwards you can build a lowest-common-ancestor (LCA) data structure on this BFS-tree. This can be done, for example, with Tarjan's lowest-common-ancestor algorithm. I will not go into details here. Given two nodes v and w, LCA lets you find the lowest node in a tree (the BFS-tree in our case) that has v and w as descendants. The idea is that when you are considering two nodes that are connected by an edge, you want to check whether their paths to the root (s in this case) plus the edge that connects them form a cycle (with s). This is the case if their LCA is s.
Assuming you have built the LCA, you run a second BFS. When expanding the neighbours of a node v, you also take into consideration the nodes already marked as explored. Suppose x is a neighbour of v such that x has already been explored. If the LCA of v and x is s, then the path from x to s and from v to s in the BFS-tree plus the edge xv forms a cycle. The first x and v that you encounter in your second BFS give you the desired result. If no such x exists then s is not contained in any cycle.
The cycle is also the shortest containing s.
The two BFS runs take O(m) and the LCA construction can also be done in linear time, hence the whole procedure can be implemented in O(m).
This might be a bit overkill. There surely is a much simpler solution.
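One possible simplification, sketched here under the assumption of a simple graph stored as an adjacency dict: for two nodes other than s, LCA(v, x) = s exactly when v and x hang under different children of s in the BFS-tree. So instead of an LCA structure, label every node with the child of s it descends from and take the minimum of dist(v) + dist(x) + 1 over edges whose endpoints carry different labels. The function name is made up, and it returns only the length of the cycle.

```python
from collections import deque

def shortest_cycle_through(adj, s):
    """Length of the shortest cycle containing s, or None if s lies on no cycle.

    adj: dict mapping each vertex to a list of neighbours (simple undirected graph).
    """
    dist = {s: 0}
    branch = {}  # branch[v] = the child of s whose BFS subtree contains v
    queue = deque([s])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                branch[v] = v if u == s else branch[u]
                queue.append(v)

    best = None
    for v in branch:                 # every reachable vertex except s
        for x in adj[v]:
            # an edge joining two different branches closes a cycle through s
            if x != s and branch[x] != branch[v]:
                length = dist[v] + dist[x] + 1
                if best is None or length < best:
                    best = length
    return best
```

Both the BFS and the final scan touch each edge a constant number of times, so the whole thing stays O(m).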

bi-directional maximum flow using ford-fulkerson

I think this is like an undirected graph version of max flow problem.
So for every edge a->b, b->a is also valid; it is bi-directional, and both directions share the same capacity.
This means that if I have capacity 10 between two vertices a and b, and I send a flow of 5 from a to b, then the remaining capacity from a to b will be 5, and so will the remaining capacity from b to a.
My solution to this is to have one directed edge from b to a and another one from a to b.
The question is: if I decrease the residual of a->b in the residual graph, do I still increase the residual of the backward edge b->a?
Yes. For every augmenting path with available capacity, if you decrease the residual of a->b in the residual graph, you have to increase the residual of the backward edge b->a. This allows the flow to be "returned" later.
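A minimal sketch of that residual update, using the two-directed-edges construction from the question and BFS to find augmenting paths (i.e. the Edmonds-Karp variant of Ford-Fulkerson). The function name and the (a, b, capacity) edge format are assumptions for illustration, and source and sink are assumed to be different vertices.

```python
from collections import defaultdict, deque

def max_flow_undirected(edges, source, sink):
    """Max flow where each (a, b, cap) is an undirected edge of capacity cap."""
    # residual[u][v] = remaining capacity of the arc u -> v
    residual = defaultdict(lambda: defaultdict(int))
    for a, b, cap in edges:
        residual[a][b] += cap  # one directed arc a -> b
        residual[b][a] += cap  # and another one b -> a

    total = 0
    while True:
        # BFS for an augmenting path with spare capacity
        parent = {source: None}
        queue = deque([source])
        while queue and sink not in parent:
            u = queue.popleft()
            for v, cap in residual[u].items():
                if cap > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if sink not in parent:
            return total

        # find the bottleneck along the path
        bottleneck = float("inf")
        v = sink
        while parent[v] is not None:
            bottleneck = min(bottleneck, residual[parent[v]][v])
            v = parent[v]

        # decrease the forward residual and increase the backward residual
        v = sink
        while parent[v] is not None:
            u = parent[v]
            residual[u][v] -= bottleneck
            residual[v][u] += bottleneck
            v = u
        total += bottleneck
```

For example, on the edges s-a (10), a-t (5), s-b (4), b-t (8), a-b (3) this finds a maximum flow of 12 from s to t.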

Resources