Bi-directional maximum flow using Ford-Fulkerson - graph-algorithm

I think this is like an undirected-graph version of the max flow problem.
For every edge a->b, b->a is also valid: the edge is bi-directional, and both directions share the same capacity.
This means that if I have capacity 10 between two vertices a and b, and a flow of 5 from a to b, then the remaining capacity from a to b is 5, and so is the remaining capacity from b to a.
My solution is to have one directed edge from b to a and another from a to b.
The question is: if I decrease the residual of a->b in the residual graph, do I still increase the residual of the backward edge b->a?

Yes. For every augmenting path with available capacity, if you decrease the residual of a->b in the residual graph, you have to increase the residual of the backward edge b->a. This allows the flow to be "returned" (cancelled) later.
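The rule in this answer can be sketched in Python. This is only an illustration under assumptions that are not in the original posts: augmenting paths are found with BFS (the Edmonds-Karp variant of Ford-Fulkerson), each undirected edge is stored as two directed arcs with the full capacity each, and the name `max_flow_undirected` is hypothetical.

```python
from collections import defaultdict, deque


def max_flow_undirected(edges, s, t):
    """Max flow where every edge (a, b, c) can carry flow in both directions.

    Assumption for this sketch: each undirected edge becomes two arcs,
    a->b and b->a, each with capacity c, and each acts as the other's
    backward edge in the residual graph.
    """
    res = defaultdict(lambda: defaultdict(int))  # res[u][v] = residual capacity
    for a, b, c in edges:
        res[a][b] += c
        res[b][a] += c

    flow = 0
    while True:
        # BFS for an augmenting path in the residual graph.
        parent = {s: None}
        queue = deque([s])
        while queue and t not in parent:
            u = queue.popleft()
            for v, r in res[u].items():
                if r > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if t not in parent:
            return flow  # no augmenting path is left

        # Bottleneck residual capacity along the path found.
        bottleneck = float("inf")
        v = t
        while parent[v] is not None:
            bottleneck = min(bottleneck, res[parent[v]][v])
            v = parent[v]

        # Decrease the forward residual and increase the backward residual.
        v = t
        while parent[v] is not None:
            u = parent[v]
            res[u][v] -= bottleneck
            res[v][u] += bottleneck
            v = u
        flow += bottleneck
```

For example, `max_flow_undirected([("s", "a", 10), ("a", "t", 5), ("s", "t", 3)], "s", "t")` sends 5 units through a and 3 units directly.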

Related

The Hamiltonian Cycle

What is the worst case time complexity of the Hamiltonian cycle problem using backtracking?
Is it O(n!) or O(n^n)? I tried to work out the complexity and got O(n×n!), which looks more like O(n^n) than O(n!).
The brute-force solution for finding a Hamiltonian cycle requires O(n!) work (which is indeed also O(n^n), but O(n^n) is not a tight upper bound).
A Hamiltonian cycle in a graph G with n nodes has the form
H = v_1,v_2,v_3,...,v_n,v_1.
Since H includes every node in G, we may start our search from any arbitrarily chosen node, say v_1. Then there are n-1 candidate nodes for the second node v_2 (all nodes but v_1 itself), n-2 choices for the third node v_3 (all nodes but those chosen for v_1 and v_2), and so on; once the candidates for v_1 to v_{n-1} are fixed, exactly one candidate remains for v_n.
(i) This results in a maximum of (n-1)(n-2)...(2)(1) = (n-1)! combinations.
(ii) In a naive implementation, checking each combination requires O(n) work; i.e., to check whether a given combination is a Hamiltonian cycle, we go through the whole sequence and make sure that consecutive nodes (including v_n and v_1) are joined by an edge.
Hence, the overall complexity is O(n) x (n-1)! = O(n!).
Of course, we can reduce the required work using a variety of techniques, e.g., branch-and-bound approaches.
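The counting argument above can be sketched as a brute-force search in Python. This is a minimal illustration, not a tuned backtracking solver; it assumes an undirected graph whose nodes are numbered 0..n-1, and the function name `hamiltonian_cycle` is hypothetical.

```python
from itertools import permutations


def hamiltonian_cycle(n, edges):
    """Brute-force search for a Hamiltonian cycle.

    Node 0 plays the role of v_1 and is fixed as the start, so at most
    (n-1)! orderings of the remaining nodes are tried, and each one is
    verified with n edge lookups.
    Returns (cycle or None, number of orderings checked).
    """
    adjacent = {(u, v) for u, v in edges} | {(v, u) for u, v in edges}
    checked = 0
    for rest in permutations(range(1, n)):
        checked += 1
        cycle = (0,) + rest + (0,)
        # O(n) check: every consecutive pair must be joined by an edge.
        if all((cycle[i], cycle[i + 1]) in adjacent for i in range(n)):
            return list(cycle), checked
    return None, checked
```

On a graph with no Hamiltonian cycle the loop runs through all (n-1)! orderings, which is where the O(n) x (n-1)! = O(n!) bound comes from.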

Could you explain this question? I am new to ML and came across this problem, but its solution is not clear to me

The problem is in the picture
Question's image:
Question 2
Many substances that can burn (such as gasoline and alcohol) have a chemical structure based on carbon atoms; for this reason they are called hydrocarbons. A chemist wants to understand how the number of carbon atoms in a molecule affects how much energy is released when that molecule combusts (meaning that it is burned). The chemist obtains the dataset below. In the column on the right, kJ/mole is the unit measuring the amount of energy released.
You would like to use linear regression (h_a(x) = a0 + a1 x) to estimate the amount of energy released (y) as a function of the number of carbon atoms (x). Which of the following do you think will be the values you obtain for a0 and a1? You should be able to select the right answer without actually implementing linear regression.
A) a0 = −1780.0, a1 = −530.9
B) a0 = −569.6, a1 = −530.9
C) a0 = −1780.0, a1 = 530.9
D) a0 = −569.6, a1 = 530.9
Since every a0 is negative but two of the a1 values are positive, let's figure out a1 first.
As you can see, as the number of carbon atoms increases the energy becomes more and more negative, so the relation cannot be positively correlated, which rules out options C and D.
Then, for the intercept, the correct value is the one that produces the least error. For x = 1 and x = 10 (easy to calculate), the outputs are about -2300 and -7100 for A, and about -1100 and -5900 for B, so one would prefer B over A.
PS: You might think there should be obvious values for a0 and a1 from the data; there are not. The intention of the question is to give you a general understanding of the best fit. Solving it this way is also a kind of machine learning.
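The arithmetic for options A and B can be checked with a few lines of Python. This sketch only evaluates the two candidate lines at x = 1 and x = 10; it does not use the dataset, which is in the question's picture and is not reproduced here. The names `predict` and `candidates` are illustrative.

```python
def predict(a0, a1, x):
    """Hypothesis h_a(x) = a0 + a1 * x."""
    return a0 + a1 * x


# The two options left after ruling out the positive slopes (C and D).
candidates = {"A": (-1780.0, -530.9), "B": (-569.6, -530.9)}

predictions = {
    name: [round(predict(a0, a1, x), 1) for x in (1, 10)]
    for name, (a0, a1) in candidates.items()
}
# predictions["A"] holds the values for x = 1 and x = 10 under option A,
# predictions["B"] the same under option B.
```

Comparing these predictions with the energies in the dataset for small and large molecules is what separates A from B.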

Changing Kademlia Metric - Unidirectional Property Importance

Kademlia uses the XOR metric. Among other things, it has the so-called "unidirectional" property (for any given point x and distance e > 0, there is exactly one point y such that d(x,y) = e).
The first question is general: is this property of the metric critical for the functionality of Kademlia, or does it just help relieve pressure on certain nodes (as the original paper suggests)? In other words, if we want to change the metric, how important is it to come up with a metric that is also "unidirectional"?
The second question is about a concrete change of metric: assuming node identifiers (addresses) are X-bit numbers, would either of the following metrics work with Kademlia?
d(x,y) = abs(x-y)
d(x,y) = abs(x-y) + 1/(x xor y)
The first metric is simply the difference between the numbers, so for node ID 100 the nodes with IDs 90 and 110 are equally distant, and the metric is not unidirectional. In the second case we try to fix that by adding 1/(x xor y): since (x xor y) is unidirectional, adding 1/(x xor y) should preserve this property.
Thus for node ID 100, the distance to node ID 90 is d(100,90) = 10 + 1/62, while the distance to node ID 110 is d(100,110) = 10 + 1/10.
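The numbers in this example can be reproduced with a short Python sketch. It uses exact fractions to sidestep rounding; the function names are illustrative, and the value returned for x == y in the second metric is an assumption of this sketch, since 1/(x xor y) is undefined there.

```python
from fractions import Fraction


def d_xor(x, y):
    """Kademlia's XOR metric."""
    return x ^ y


def d_abs(x, y):
    """First proposed metric: plain numeric difference."""
    return abs(x - y)


def d_mixed(x, y):
    """Second proposed metric: abs(x - y) + 1 / (x xor y)."""
    if x == y:
        return Fraction(0)  # assumption: x xor y is 0 here, so define d(x, x) = 0
    return abs(x - y) + Fraction(1, x ^ y)


# All 8-bit IDs at distance 10 from node 100 under each metric.
ties_abs = [y for y in range(256) if d_abs(100, y) == 10]
unique_xor = [y for y in range(256) if d_xor(100, y) == 10]
```

Here `ties_abs` contains two IDs while `unique_xor` contains exactly one, which is the unidirectional property the question refers to.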
You wouldn't be dealing with Kademlia anymore. There are many other routing algorithms that use different distance metrics, some even non-uniform ones, but they do not rely on Kademlia-specific assumptions and sometimes incorporate other features to compensate for undesirable aspects of those metrics.
Since there can be ties in the first metric (two candidates at the same distance from a point), lookups could no longer converge on a precise set of closest nodes.
Bucket splitting and other routing table maintenance algorithms would need to be changed, since they assume that two nodes at the same distance from a point must be the same node.
I'm not sure whether it would affect Big-O properties or other guarantees of Kademlia.
Anyway, this seems like an XY problem. You want to modify the metric to serve a particular goal. Maybe you should look for routing overlays designed with that goal in mind instead.
d(x,y) = abs(x-y) + 1/(x xor y)
This seems impractical: integer division suffers from rounding, and in reality you would not be dealing with such small numbers but with much larger ones (e.g. 160-bit), which makes divisions more expensive too.

A new edge is inserted into a minimum spanning tree

I am trying to find an algorithm for the following question, with one difference: the edge costs are not necessarily distinct.
Give an efficient algorithm to test if T remains the minimum-cost spanning tree with the new edge added to G.
The linked question below has a solution, but not for the variation I described, where the edge costs are not necessarily distinct:
Updating a Minimum spanning tree when a new edge is inserted
Does anyone have an idea?
Well, the naive approach of just running Prim or Kruskal to find the minimum-cost spanning tree of the new graph, and then seeing which tree has the lower total cost, isn't too bad at O(|E| log |E|).
But we don't need to look at the whole graph.
Suppose your new edge connects vertices A and B. Let C be the parent of A. If B is not a descendant of A and A-B has lower cost than A-C, then T is no longer the MST and B should become the new parent of the subtree rooted at A.
If B is a descendant of A and A-B has lower cost than any of the edges in T along the path from A to B, then T is no longer the MST: the highest-cost edge along that path should be removed, B becomes the root of the newly disconnected component, and B should be added as a child of A.
I believe you may need to check these things a second time with the roles of A and B reversed. The cost of this check is proportional to the depth of T, which is about log|V|, where the base of the log is the average number of children per node of T. If T is a straight line it is O(|V|), but otherwise I think you could say it is O(log|V|).
First find an MST using one of the existing efficient algorithms.
Now adding an edge (v,w) creates a cycle together with the MST's path from v to w. If the newly added edge has the maximum cost among the edges on the cycle (possibly tied with others), then the MST remains as it is. If some other edge on the cycle costs strictly more than the new edge, then that's the edge to be removed to get a tree with lower cost.
So we need an efficient way to find the edge with the maximum cost on the cycle. You can climb from v and w until you reach LCA(v, w) (the lowest common ancestor of v and w), tracking the maximum edge cost along the way. This takes linear time in the worst case.
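The climb towards LCA(v, w) can be sketched in Python as follows. This is a minimal illustration under assumptions not stated in the answer: vertices are numbered 0..n-1, T is given as a list of (u, v, cost) edges, the tree is rooted at vertex 0, and the function names are hypothetical.

```python
from collections import deque


def root_tree(n, tree_edges, root=0):
    """BFS over T to record each node's parent, parent-edge cost and depth."""
    adj = [[] for _ in range(n)]
    for u, v, cost in tree_edges:
        adj[u].append((v, cost))
        adj[v].append((u, cost))
    parent = [None] * n
    up_cost = [0] * n  # cost of the edge from a node to its parent
    depth = [0] * n
    seen = [False] * n
    seen[root] = True
    queue = deque([root])
    while queue:
        u = queue.popleft()
        for v, cost in adj[u]:
            if not seen[v]:
                seen[v] = True
                parent[v], up_cost[v], depth[v] = u, cost, depth[u] + 1
                queue.append(v)
    return parent, up_cost, depth


def max_cost_on_path(v, w, parent, up_cost, depth):
    """Climb from v and w until they meet at their LCA, tracking the max cost."""
    best = float("-inf")
    while v != w:
        if depth[v] < depth[w]:
            v, w = w, v  # always move the deeper endpoint up
        best = max(best, up_cost[v])
        v = parent[v]
    return best


def tree_stays_minimum(n, tree_edges, new_edge):
    """True if T is still a minimum spanning tree after adding new_edge."""
    v, w, cost = new_edge
    parent, up_cost, depth = root_tree(n, tree_edges)
    # With non-distinct costs, a tie with the path maximum keeps T minimal.
    return cost >= max_cost_on_path(v, w, parent, up_cost, depth)
```

The comparison uses `>=` rather than `>` so that a new edge tying with the heaviest edge on the path leaves T as one of the minimum spanning trees, which covers the case where edge costs are not distinct.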
If you are going to answer multiple such queries then pre-processing the MST is probably better. You can pre-process the MST to get a sparse table data structure in O(N lg N) time and then use this data structure to answer max queries in O(lg N) time in the worst case.

*Is the minimum cut always the same in flow networks?

I've seen a way of finding a minimum cut in a flow network N=(V,E,c,s,t):
find a maximum flow f in the network N (using a Ford-Fulkerson based algorithm, for example).
set S to contain all vertices v that have a path from s to v in the residual network of f.
set T=V\S
return (S,T)
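The steps above can be sketched in Python. This is an illustration rather than a reference implementation: it assumes a directed network given as (u, v, capacity) arcs, uses BFS augmenting paths (Edmonds-Karp) as the Ford-Fulkerson based algorithm, and the name `min_cut_from_max_flow` is hypothetical.

```python
from collections import defaultdict, deque


def min_cut_from_max_flow(arcs, s, t):
    """Return (flow value, S, T) following the procedure in the question."""
    res = defaultdict(lambda: defaultdict(int))  # residual capacities
    vertices = {s, t}
    for u, v, c in arcs:
        res[u][v] += c
        res[v][u] += 0  # make sure the backward arc exists
        vertices.update((u, v))

    def reachable_from_s():
        """BFS over arcs with positive residual capacity."""
        parent = {s: None}
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for v, r in res[u].items():
                if r > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        return parent

    value = 0
    parent = reachable_from_s()
    while t in parent:
        # Find the bottleneck on the s-t path, then push flow along it.
        bottleneck = float("inf")
        v = t
        while parent[v] is not None:
            bottleneck = min(bottleneck, res[parent[v]][v])
            v = parent[v]
        v = t
        while parent[v] is not None:
            u = parent[v]
            res[u][v] -= bottleneck
            res[v][u] += bottleneck
            v = u
        value += bottleneck
        parent = reachable_from_s()

    S = set(parent)   # vertices reachable from s in the final residual network
    T = vertices - S  # T = V \ S
    return value, S, T
```

For the two-arc network `[("s", "a", 5), ("a", "t", 2)]` the sketch returns a flow of 2 with S containing s and a, and T containing only t.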
For any maximum flow f, will this cut (S,T) always be the same?
It seems true, but I'm having trouble explaining this.
(namely, if f,f' are max flows, and (S,T), (S',T') are the cuts the algorithm above outputs for them, then S=S',T=T')
*There might be other minimum cuts, but I'm referring to minimum cuts obtained this way.
It's not exactly the same, because:
in this example you can get two different cuts this way. Sometimes the upper node will be in S and sometimes in T, depending on where the algorithm routes the flow (through the upper edge from s or the lower one).
