Finding the right algorithm for undirected graph

We are given an undirected graph with a source and a sink that are never connected directly. I want to find the minimum number of vertices (not the source or the sink) whose removal disconnects the source from the sink.
Is there a specific algorithm for this problem? Links would be helpful.

Maximum Flow Minimum Cut Theorem
The minimum cut is the cheapest way to separate the graph into two parts, with the source node in the first part and the sink in the second. Because you want to remove vertices rather than edges, split every vertex v (other than the source and the sink) into v_in and v_out joined by an edge of capacity 1, and give all original edges unlimited capacity; the maximum flow then equals the minimum number of vertices to remove (this is Menger's theorem). You can learn about the theorem on Wikipedia.
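If you'd rather not implement the reduction yourself, networkx's minimum_node_cut does this vertex-splitting computation internally; a minimal sketch, on a made-up example graph:

```python
# A minimal sketch using networkx; the graph, source, and sink
# below are made-up examples.
import networkx as nx

G = nx.Graph()
G.add_edges_from([
    ("s", "a"), ("s", "b"),   # source side
    ("a", "c"), ("b", "c"),
    ("a", "d"), ("b", "d"),
    ("c", "t"), ("d", "t"),   # sink side
])

# Smallest set of vertices (excluding "s" and "t") whose removal
# disconnects "s" from "t".
cut = nx.minimum_node_cut(G, "s", "t")
print(cut)  # e.g. {"c", "d"} or {"a", "b"} -- both have size 2
```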

Related

Can I use Breadth-First-Search on weighted graphs if I modify it?

I am having a discussion with a friend about whether the following will work:
We recently learned in a lecture about Breadth-First Search. I know that it is a special case of Dijkstra's algorithm where each edge weight is set to one. Assume now we are given a graph whose edges have integer weights greater than one. I would modify this graph by introducing additional vertices and connecting them by edges of weight one: e.g., given an edge of weight 3 connecting the vertices u and v, I would introduce dummy vertices d1, d2, remove the edge connecting u and v, and instead add edges {u, d1}, {d1, d2}, {d2, v} of weight one.
If I modify my whole graph this way and then apply breadth-first search starting from one of the original vertices, wouldn't this work as well?
Thank you very much in advance!
Since BFS is guaranteed to return an optimal path on unweighted graphs, and you've created the unweighted equivalent of your original graph, you'll be guaranteed to get the shortest path.
What you lose by doing this instead of Dijkstra's algorithm is runtime optimality: the runtime of your algorithm now depends on the edge weights, whereas Dijkstra's depends only on the number of vertices and edges.
This sort of thought experiment is a great way to understand how Dijkstra's algorithm works (e.g., how would you modify your algorithm so it doesn't require creating a new graph, or doesn't take 100 steps for an edge of weight 100?). In fact, this is probably how Dijkstra discovered the algorithm to begin with.
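Here is a minimal sketch of the transformation (the weighted graph is a made-up example): each weight-w edge becomes a chain of w unit edges through dummy vertices, after which plain BFS returns the weighted shortest-path distances.

```python
# Subdivide each weight-w edge into w unit edges via dummy vertices, then BFS.
from collections import deque
from itertools import count

weighted = {("u", "v"): 3, ("v", "w"): 1, ("u", "w"): 5}  # made-up example

adj = {}
fresh = count()  # generator of dummy-vertex ids
for (a, b), w in weighted.items():
    # chain: a - d1 - d2 - ... - d_{w-1} - b, all unit edges
    chain = [a] + [("dummy", next(fresh)) for _ in range(w - 1)] + [b]
    for x, y in zip(chain, chain[1:]):
        adj.setdefault(x, set()).add(y)
        adj.setdefault(y, set()).add(x)

def bfs_dist(source):
    dist = {source: 0}
    q = deque([source])
    while q:
        x = q.popleft()
        for y in adj[x]:
            if y not in dist:
                dist[y] = dist[x] + 1
                q.append(y)
    return dist

d = bfs_dist("u")
print(d["v"], d["w"])  # 3 and 4: the same answers Dijkstra would give
```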

Understanding Inverse Kinematics pybullet

I'm trying to do Cartesian control with a simulated PR2 robot in pybullet.
In pybullet, the function calculateInverseKinematics(...) optionally takes joint lower limits, upper limits, joint ranges and rest poses in order to do null space control.
First of all, what practical benefit do you get from using null-space control instead of "regular" inverse kinematics?
Secondly, why do you need to specify joint ranges? Isn't that fully determined by the lower and upper limits? And what is the range of a continuous joint?
What exactly are rest poses? Is it just the initial pose before the robot starts to do a task?
There are often many solutions to the inverse kinematics problem. Using the null space allows you to influence the IK solution, for example to keep it closer to a rest pose.
By default, the PyBullet IK doesn't use the limits from the URDF file, hence you can explicitly specify the desired ranges for the IK solution. A continuous joint has the full 360 degree range.
Check the PyBullet user manual; there are several examples of how to use inverse kinematics with PyBullet:
https://github.com/bulletphysics/bullet3/tree/master/examples/pybullet/examples
(just run git clone https://github.com/bulletphysics/bullet3 and go to examples/pybullet/examples)
There is also an additional PyBullet IK example for the Sawyer robot here:
https://github.com/erwincoumans/pybullet_robots
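For reference, a minimal sketch of a null-space IK call, in the spirit of those examples. The KUKA iiwa model ships with pybullet_data; the limit, range, and rest-pose values below are illustrative placeholders, not tuned values.

```python
# Null-space IK sketch; the numeric limits/ranges/rest poses are
# illustrative assumptions, not values from any URDF.
import pybullet as p
import pybullet_data

p.connect(p.DIRECT)
p.setAdditionalSearchPath(pybullet_data.getDataPath())
kuka = p.loadURDF("kuka_iiwa/model.urdf")
end_effector = 6          # last link of the 7-DoF arm
n = p.getNumJoints(kuka)  # 7

ll = [-2.9] * n           # lower limits
ul = [2.9] * n            # upper limits
jr = [5.8] * n            # joint ranges (ul - ll); must be given explicitly
rp = [0.0] * n            # rest poses the null-space term pulls toward

joint_angles = p.calculateInverseKinematics(
    kuka, end_effector, [0.4, 0.2, 0.6],
    lowerLimits=ll, upperLimits=ul, jointRanges=jr, restPoses=rp)
print(joint_angles)
```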

What parameters can I play with using mcl?

I am clustering undirected graphs using mcl. To do so, I chose a threshold under which nodes are connected, a similarity measure for each edge, and the inflation parameter to tune the granularity of my graph. I have been playing around with these parameters, but so far the clusters I get seem to be too large (my visualizations suggest that the largest clusters should be cut into two or more clusters). I was therefore wondering what other parameters I can play with to improve my clustering. (I am currently working with the scheme parameter of mcl to see whether increasing the accuracy helps, but if there are other, more specific parameters that could help to get smaller clusters, please let me know.)
There are really two main things to consider. The first and most important is outside mcl (http://micans.org/mcl/) itself, namely how the network is constructed. I've written about this elsewhere, but I'll repeat it here because it is important.
If you have a weighted similarity, choose an edge-weight (similarity) cutoff
such that the topology of the network becomes informative; i.e. too many edges
or too few edges yield little discriminative information in the
absence/presence structure of edges. Choose it such that no edges connect
things you consider very dissimilar, and that edges connect things you consider
somewhat similar to quite similar. In the case of mcl, the dynamic range in
edge weight between 'a bit similar' and 'very similar' should be, as a rule of
thumb, one order of magnitude, i.e. two-fold or five-fold or ten-fold, as
opposed to varying from 0.9 to 1.0. Of course, it is possible to give simple
networks to mcl and it will just utilise the absence/presence of edges. Make sure
the network does not become very dense - a very rough rule of thumb could be to aim
for a total number of edges that is in the order of V * sqrt(V) if the number of nodes (vertices) is V; that is, each node has, on average, on the order of sqrt(V) neighbours.
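As a rough illustration of that density rule, here is a hedged sketch that keeps only the strongest similarities so each node retains on the order of sqrt(V) neighbours, and writes the result in mcl's label ("abc") input format. The sims dict is made-up data.

```python
# Threshold a made-up similarity dict so the network keeps roughly
# V * sqrt(V) edges, then write mcl's tab-separated label format.
import math

sims = {("a", "b"): 0.9, ("a", "c"): 0.4, ("b", "c"): 0.8, ("c", "d"): 0.2}
nodes = {x for pair in sims for x in pair}
V = len(nodes)
target_edges = int(V * math.sqrt(V))  # ~sqrt(V) neighbours per node

# keep only the target_edges strongest similarities
kept = sorted(sims.items(), key=lambda kv: kv[1], reverse=True)[:target_edges]

with open("graph.abc", "w") as f:
    for (a, b), w in kept:
        f.write(f"{a}\t{b}\t{w}\n")
# then, for example: mcl graph.abc --abc -I 2.0 -o clusters.out
```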
The above, network construction, is really crucial, and it is advisable
to try different approaches. Now, given a network,
there is really only one mcl parameter to vary: the inflation parameter (the -I option).
A good set of values to test with is 1.4, 2, 3, 4, 6.
In summary, if you are exploring, try different ways of network construction,
using your knowledge of the data to make the network a meaningful representation,
and combine this with trying different mcl inflation values.
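A simple way to combine the two, sketched below, is to sweep the suggested inflation values over the same input; this assumes the mcl binary is on your PATH and that graph.abc exists (e.g. from the sketch above).

```python
# Sweep the suggested -I values and report how many clusters each produces.
import subprocess

for inflation in [1.4, 2, 3, 4, 6]:
    out = f"clusters.I{int(inflation * 10)}"
    subprocess.run(
        ["mcl", "graph.abc", "--abc", "-I", str(inflation), "-o", out],
        check=True)
    # with --abc input, each output line is one cluster (tab-separated labels)
    n_clusters = sum(1 for _ in open(out))
    print(f"inflation {inflation}: {n_clusters} clusters")
```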

Graph theory - learn cost function to find optimal path

This is a supervised learning problem.
I have a directed acyclic graph (DAG). Each edge has a vector of features X, and each node (vertex) has a label 0 or 1. The task is to find a cost function w(X), so that the shortest path between any pair of nodes has the highest ratio of 1s to 0s (minimum classification error).
The solution must generalize well. I tried logistic regression, and the learned logistic function predicts the label of a node fairly well given the features of an incoming edge. However, that approach does not take the graph's topology into account, so the solution over the whole graph is suboptimal. In other words, the logistic function is not a good weight function given the problem setup above.
Although my problem setup is not the typical binary classification problem setup, here is a good intro to it:
http://en.wikipedia.org/wiki/Supervised_learning#How_supervised_learning_algorithms_work
Here are some more details:
Each feature vector X is a d-dimensional list of real numbers.
Each edge has a vector of features. That is, given the set of edges E = {e1, e2, .. en} and set of feature vectors F = {X1, X2 ... Xn}, then edge ei is associated to vector Xi.
It is possible to come up with a function f(X), so that f(Xi) gives the likelihood that edge ei points to a node labeled with a 1. An example of such a function is the one I mentioned above, found through logistic regression. However, as I mentioned above, such a function is suboptimal.
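For concreteness, here is a hedged sketch of such an f(X) with scikit-learn's LogisticRegression; the feature matrix and labels below are made-up placeholders, not the actual data.

```python
# f(X): logistic regression mapping an edge's feature vector to the
# likelihood that the node it points to is labeled 1. Placeholder data.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X_edges = rng.normal(size=(200, 5))            # one 5-dim feature vector per edge
head_labels = (X_edges[:, 0] > 0).astype(int)  # label of the node each edge enters

f = LogisticRegression().fit(X_edges, head_labels)
likelihoods = f.predict_proba(X_edges)[:, 1]   # P(head node is labeled 1)
```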
SO THE QUESTION IS:
Given the graph, a start node, and a finish node, how do I learn the optimal cost function w(X), so that the ratio of 1-nodes to 0-nodes along the path is maximized (minimum classification error)?
This is not really an answer, but we need to clarify the question. I might come back later with a possible answer, though.
Below is an example DAG.
Suppose the red node is the starting node, and the yellow one is the end node. How do you define the shortest path in terms of
the highest ratio of 1s to 0s (minimum classification error) ?
Edit: I added names for each node and two example names for the top two edges.
It seems to me you cannot learn such a cost function, one that takes feature vectors as inputs and whose output (edge weights, or whatever) can guide you to take a shortest path toward any node considering the graph topology. The reason is stated below.
Let's assume you don't have the feature vectors you stated. Given a graph as above, if you want to find all-pairs shortest paths with respect to the ratio of 1s to 0s, it is perfectly possible to use the Bellman equation, or more specifically Dijkstra plus a proper heuristic function (e.g., the percentage of 1s in the path). Another possible model-free approach is to use Q-learning, in which we get reward +1 for visiting a 1-node and -1 for visiting a 0-node. We learn a lookup Q-table for each target node, one at a time. Finally, we have all-pairs shortest paths once every node has been treated as a target node.
Now suppose you magically obtained the feature vectors. Since you are able to find the optimal solution without those vectors, how would they help when they exist?
There is one condition under which you could use the feature vectors to learn a cost function that optimizes edge weights: when the feature vectors depend on the graph topology (the links between nodes and the positions of the 1s and 0s). But I did not see this dependency in your description at all, so I guess it does not exist.
This looks like a problem where a genetic algorithm has excellent potential. If you define the desired function as, e.g. (but not limited to), a linear combination of the features (you could add quadratic terms, then cubic, ad infinitum), then the gene is the vector of coefficients. The mutator can be just a random offset of one or more coefficients within a reasonable range. The evaluation function is the average ratio of 1s to 0s along shortest paths for all pairs according to the current mutation. At each generation, pick the best few genes as ancestors and mutate them to form the next generation. Repeat until the ueber-gene is at hand.
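A hedged sketch of that idea on a made-up toy DAG, using networkx; the exponential mapping from the gene's dot product with X to a positive edge weight is one arbitrary choice among many:

```python
# GA sketch: gene = coefficient vector c, edge cost = exp(-c . X) so
# Dijkstra's positivity requirement holds, fitness = average fraction of
# 1-labeled nodes on all-pairs shortest paths. All data is made up.
import numpy as np
import networkx as nx

rng = np.random.default_rng(0)

# toy DAG: random edge features X, random 0/1 node labels
G = nx.gnp_random_graph(30, 0.2, seed=1, directed=True)
G = nx.DiGraph((u, v) for u, v in G.edges() if u < v)  # keep it acyclic
for u, v in G.edges():
    G[u][v]["X"] = rng.normal(size=4)
label = {n: int(rng.random() < 0.5) for n in G.nodes()}

def fitness(coef):
    for u, v, data in G.edges(data=True):
        data["w"] = np.exp(-coef @ data["X"])  # positive weight from c . X
    ratios = []
    for src, paths in nx.all_pairs_dijkstra_path(G, weight="w"):
        for dst, path in paths.items():
            if len(path) > 1:
                ratios.append(np.mean([label[n] for n in path]))
    return np.mean(ratios)

pop = [rng.normal(size=4) for _ in range(20)]
for generation in range(30):
    pop.sort(key=fitness, reverse=True)
    parents = pop[:5]  # keep the best few genes as ancestors
    pop = parents + [p + rng.normal(scale=0.3, size=4)  # random offsets
                     for p in parents for _ in range(3)]
print("best gene:", max(pop, key=fitness))
```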
I believe your question is very close to the field of inverse reinforcement learning, where you take certain "expert demonstrations" of optimal paths and try to learn a cost function such that your planner (A* or some reinforcement learning agent) outputs the same path as the expert demonstration. The training is done in an iterative way. I think that in your case, the expert demonstrations could be created by you as paths that go through the maximum number of 1-labeled nodes. Here is a link to a good paper on the topic: Learning to Search: Functional Gradient Techniques for Imitation Learning. It is from the robotics community, where motion planning is usually set up as a graph-search problem and learning cost functions is essential for demonstrating desired behavior.

Determining groups in a hierarchical cluster

I have an algorithm that can group data into a hierarchical cluster tree. The algorithm is the one described in Toby Segaran's Programming Collective Intelligence. The output is a binary tree with a "distance" value at each node that tells you how far apart the two child nodes are.
I can then display this as a dendrogram, which makes it fairly easy for a human to spot which values are grouped together. However, I'm having difficulty coming up with an algorithm that automatically decides what the groups should be. I'd like to be able to determine automatically:
The number of groups
Which points should be placed in each group
Is there a standard algorithm for this?
I think there is no standard way to do this. Simple 'manual' methods would be to either:
specify the number of clusters you want/expect
set a threshold for the maximum distance between two nodes; any nodes with a larger distance belong to another cluster
Both methods are shown in the sketch below.
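Both map directly onto scipy's fcluster criteria; a minimal sketch on made-up toy data:

```python
# Cut a hierarchical cluster tree either by cluster count or by a
# distance threshold. The 2-D toy points below are made up.
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

rng = np.random.default_rng(0)
points = np.vstack([rng.normal(0, 0.3, (10, 2)),
                    rng.normal(3, 0.3, (10, 2))])
Z = linkage(points, method="average")  # the binary tree with distances

k_groups = fcluster(Z, t=2, criterion="maxclust")    # ask for 2 clusters
d_groups = fcluster(Z, t=1.0, criterion="distance")  # cut at distance 1.0
print(k_groups, d_groups, sep="\n")
```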
There are some automatic methods to determine the number of clusters. R has the Dynamic Tree Cut package, which automatically deals with this problem; pvclust could also be used. Two more methods for this problem are described in Salvador (2002) and Daniels (2006).
I have found out that the Calinski-Harabasz index (also known as Variance Ratio Criterion) works well with dendrograms produced by hierarchical clustering. You can find more information (and a comparative study) in this paper.
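For instance, here is a sketch of choosing the number of clusters by sweeping cuts of the tree and scoring each with scikit-learn's calinski_harabasz_score, again on made-up toy data:

```python
# Pick the cluster count that maximizes the Calinski-Harabasz index.
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from sklearn.metrics import calinski_harabasz_score

rng = np.random.default_rng(0)
points = np.vstack([rng.normal(0, 0.3, (10, 2)),
                    rng.normal(3, 0.3, (10, 2)),
                    rng.normal(-3, 0.3, (10, 2))])
Z = linkage(points, method="average")

scores = {k: calinski_harabasz_score(points,
                                     fcluster(Z, t=k, criterion="maxclust"))
          for k in range(2, 10)}
best = max(scores, key=scores.get)
print("best number of clusters:", best)  # 3 for this toy data
```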
