I have come across a question based on spanning trees, i.e.:
what is the upper bound on the number of edge disjoint spanning trees in a
complete graph of n vertices?
(a) n (b) n-1
(c) [n/2] (d) [n/3]
What do we mean by edge disjoint spanning trees? Does that mean different trees such that no edge appears in more than one tree, since disjoint means nothing in common? Please explain, and also what should the answer be then?
Yes. Edge disjoint spanning trees are spanning trees that do not have any edges in common. The maximum number of edge disjoint spanning trees is also known as 'spanning tree packing number or STP number'. For more details regarding this, you can look at this article http://www.sciencedirect.com/science/article/pii/S0012365X00000662#.
When two spanning trees of the same graph don't have any edge in common, they are called edge disjoint spanning trees (EDSTs). And floor(n/2) is the maximum number of EDSTs possible in a complete graph on n vertices.
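To make the floor(n/2) bound concrete, here is a minimal sketch (my own illustration using the standard zig-zag Hamiltonian-path decomposition, not something from the answers above) that splits K_n, for even n, into n/2 edge-disjoint spanning trees:

```
# A standard zig-zag decomposition of K_n (n even) into n/2 edge-disjoint
# Hamiltonian paths; each path is a spanning tree, so floor(n/2) is attained.
def hamiltonian_path(start, n):
    path, jump, sign = [start], 1, 1
    while len(path) < n:
        path.append((path[-1] + sign * jump) % n)  # jumps of +1, -2, +3, -4, ...
        jump, sign = jump + 1, -sign
    return path

n = 6
paths = [hamiltonian_path(s, n) for s in range(n // 2)]
edges = [frozenset(e) for p in paths for e in zip(p, p[1:])]

# Every edge of K_n is used exactly once, so the trees are pairwise edge-disjoint.
assert len(set(edges)) == len(edges) == n * (n - 1) // 2
print(len(paths))  # n // 2 = 3 edge-disjoint spanning trees
```

For odd n a similar decomposition exists, and either way the spanning tree packing number of K_n is floor(n/2).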
Is there any domain (/ dedicated keyword) of graph theory that covers graphs where the edges represent forces?
Force is a vector. Thus, it has two attributes: weight and direction.
weight represents the magnitude of the force.
direction represents the direction in which the force is acting. This direction is different from directed graphs where only the head or tail nodes matter.
The sense of direction can be better understood by the following examples:
Example 1:
Consider a network of inelastic strings under tension. Let's say the network is under equilibrium. If we pull a node, all other nodes will be pulled. Please note, the length of the strings (~ weight) won't change. But, the locations of the nodes and thereby the direction of the strings may change to bring all the nodes back to equilibrium after the pull.
Example 2: Consider all the planets (~nodes) in the universe in the form of a graph. All of them impart gravitational force (~edges) on each other and are under equilibrium. If we dislodge (or increase the size of) a planet/sun, the others are likely to be disturbed.
The edge weight/length can represent the magnitude of the force (but what about the direction?).
In both examples, the direction component distinguishes these edges from the traditional sense of edge weights, where edges are just scalars and have no direction.
Such scalars can be analogous to a sense of distance (shortest distance, eccentricity, closeness centrality) or of flow (betweenness centrality, etc.), but not of force.
The question is: how can the direction of edges (in addition to their length/weight) be incorporated in network analysis? Is there any domain that focuses on graphs where edges have direction as well as weight?
Note: The direction of the edge can be an additional parameter like angle; or be specified by the location of the connecting nodes.
What you're describing sounds like force-directed graph drawing algorithms as discussed here. Since you tagged this with networkx, the spring_layout method uses the Fruchterman-Reingold force-directed algorithm.
The networkx documentation doesn't list an actual reference to the algorithm, but the R igraph package lists this as the reference for their layout_with_fr function:
Fruchterman, T.M.J. and Reingold, E.M. (1991). Graph Drawing by Force-directed Placement. Software - Practice and Experience, 21(11):1129-1164.
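Since the question is tagged networkx, here is a minimal sketch (my own example, assuming networkx and matplotlib are installed) of running that force-directed layout on a small weighted graph:

```
# Force-directed layout of a small weighted graph using networkx's spring_layout
# (Fruchterman-Reingold); heavier edges pull their endpoints closer together.
import networkx as nx
import matplotlib.pyplot as plt

G = nx.Graph()
G.add_weighted_edges_from([(1, 2, 1.0), (2, 3, 2.0), (1, 3, 0.5), (3, 4, 1.0)])

pos = nx.spring_layout(G, weight="weight", seed=42)  # node positions from the force simulation
nx.draw(G, pos, with_labels=True)
plt.show()
```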
I was recently going through Topological Sort and DFS from CLRS. They have this entry/exit time concept by which we can classify graph edges into:
tree edge
forward edge
back edge
cross edge
So the question is - does Topological sort using DFS try to remove forward edges from the tree keeping only tree edges to arrive at the sorted result?
Note that if a graph has a back-edge, it is not a DAG (Directed Acyclic Graph) since it contains a cycle and hence cannot be topologically sorted.
When we're topologically sorting, we are not removing any edges; we're simply providing a linear order so that edges only travel in one direction: from nodes that appear earlier in the order to nodes that appear later. Forward edges are certainly allowed to exist in such an order. What kind of topological order do you believe the following graph exhibits?
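For reference, here is a minimal sketch (my own, not from CLRS) of a DFS-based topological sort in Python; it records vertices by finish (exit) time and never removes any edges:

```
# DFS-based topological sort: vertices are output in reverse order of their
# finish (exit) times; no edges are removed anywhere.
def topological_sort(graph):
    visited, order = set(), []

    def dfs(u):
        visited.add(u)
        for v in graph.get(u, ()):
            if v not in visited:
                dfs(v)
        order.append(u)  # u finishes only after all of its descendants

    for node in graph:
        if node not in visited:
            dfs(node)
    return order[::-1]

# The forward edge a -> c coexists happily with the tree edges a -> b and b -> c.
g = {"a": ["b", "c"], "b": ["c"], "c": []}
print(topological_sort(g))  # ['a', 'b', 'c']
```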
The question is as written in the title
There is a 3x3 grid graph in the image above. We can convert it into a junction tree. Then it is possible to use message passing (the sum-product algorithm) for inference (estimating the likelihood/posterior, etc.). So I wonder why exact inference in the grid graph is so hard.
Is it impossible to find such a junction tree when the grid gets larger?
The short answer: for an n×n grid, the complexity is at least exponential in n.
A junction tree is created from the induced graph of the MRF, which depends on the elimination order (which variables you eliminate first to calculate a marginal). The elimination cost is exponential in the size of the largest clique in the induced graph. See this paper for details.
So even though we can use exact inference on the junction tree, the complexity would be exponential in size of the largest clique in the induced graph of the elimination order that was used.
The best possible elimination order will yield a largest clique size equal to the treewidth, which is n for an n×n grid. But there are no efficient algorithms for finding such an order.
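As a rough illustration (my own sketch, using a hypothetical helper induced_width rather than anything from the paper), you can simulate variable elimination on an n×n grid and watch the largest clique grow with n:

```
# Simulate variable elimination on an n x n grid: eliminate nodes row by row,
# connect each eliminated node's remaining neighbours (fill-in edges), and
# track the largest clique (node plus its current neighbours) created.
import itertools
import networkx as nx

def induced_width(graph, order):
    g = graph.copy()
    largest = 0
    for v in order:
        nbrs = list(g.neighbors(v))
        largest = max(largest, len(nbrs) + 1)
        g.add_edges_from(itertools.combinations(nbrs, 2))  # fill-in edges
        g.remove_node(v)
    return largest

n = 4
grid = nx.grid_2d_graph(n, n)
order = sorted(grid.nodes())       # row-by-row elimination order
print(induced_width(grid, order))  # n + 1 = 5: cost is exponential in this clique size
```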
What do the eigenvalues and eigenvectors in spectral clustering physically mean? I see that if λ_0 = λ_1 = 0 then we will have 2 connected components. But what do λ_2, ..., λ_k tell us? I don't understand algebraic connectivity and the role of multiplicity.
Can we draw any conclusions about the tightness of the graph or in comparison to two graphs?
The smaller the eigenvalue, the less connected the graph; 0 just means "disconnected".
Consider it a measure of what share of edges you need to cut to produce separate components. The cut is orthogonal to the eigenvector: there is supposedly some threshold t such that nodes whose eigenvector entries fall below t should go into one component, and those above t into the other.
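A minimal sketch (my own example, assuming numpy and networkx) showing the eigenvalues of the graph Laplacian and the thresholding of the second eigenvector (the Fiedler vector) described above:

```
# Laplacian eigenvalues of a graph made of two cliques joined by a single edge.
# lambda_0 ~ 0, lambda_1 is small (the weak link); thresholding the second
# eigenvector (the Fiedler vector) at 0 recovers the two clusters.
import numpy as np
import networkx as nx

G = nx.barbell_graph(5, 0)  # two K_5 cliques joined by one edge
L = nx.laplacian_matrix(G).toarray().astype(float)
vals, vecs = np.linalg.eigh(L)  # eigenvalues in ascending order

print(vals[:3])           # [~0, small, larger]
fiedler = vecs[:, 1]
print(fiedler > 0)        # first five nodes on one side, last five on the other
```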
That depends somewhat on the algorithm. For several of the spectral algorithms, the eigenstuff can be easily run through Principal Component Analysis to reduce the display dimensionality for human consumption. Power iteration clustering vectors are more difficult to interpret.
As Mr.Roboto already noted, the eigenvector is normal to the dividing plane (a hyperplane after a Gaussian kernel transformation). Spectral clustering methods are generally not sensitive to density (is that what you mean by "tightness"?) per se -- they find data gaps. For instance, it doesn't matter whether you have 50 or 500 nodes within a unit sphere forming your first cluster; the game changer is whether there's clear space (a nice gap) instead of a thin trail of "bread crumb" points (a sequence of tiny gaps) leading to another cluster.
I've been exploring and learning about KD Trees for KNN (K Nearest Neighbors problem)
When would the search not work well, or not be worth it, i.e. not improve on the naive search?
are there any drawbacks of this approach?
K-d trees don't work too well in high dimensions (where you have to visit lots and lots of tree branches). One rule of thumb is that if your data dimensionality is k, a k-d tree is only going to be any good if you have many more than 2^k data points.
In high dimensions, you'll generally want to switch to approximate nearest-neighbor searches instead. If you haven't run across it already, FLANN ( github ) is a very useful library for this (with C, C++, python, and matlab APIs); it has good implementations of k-d trees, brute-force search, and several approximate techniques, and it helps you automatically tune their parameters and switch between them easily.
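For example, a minimal sketch of the k-d tree query pattern using SciPy's cKDTree (an assumption on my part; FLANN's own API differs) on low-dimensional data:

```
# Build a k-d tree on low-dimensional data (k = 3, far more than 2**3 points)
# and query the 5 nearest neighbours of a random point.
import numpy as np
from scipy.spatial import cKDTree

rng = np.random.default_rng(0)
points = rng.random((10_000, 3))
tree = cKDTree(points)

query = rng.random(3)
dists, idx = tree.query(query, k=5)
print(idx, dists)
```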
It depends on your distance function.
You can't use k-d-trees with arbitrary distance functions. Minkowski norms should be fine though. But in a lot of applications, you will want to use more advanced distance functions.
Plus, with increasing dimensionality, k-d-trees work much less well.
The reason is simple: k-d-trees avoid looking at points where the one-dimensional distance to the boundary is already larger than the desired threshold, i.e. where, for Euclidean distances (with z the nearest border and y the closest known point):
|x_j - z_j| > sqrt(sum_i (x_i - y_i)^2)
or equivalently, but cheaper:
(x_j - z_j)^2 > sum_i (x_i - y_i)^2
You can imagine that the chance of this pruning rule holding decreases drastically with the number of dimensions. If you have 100 dimensions, there is next to no chance that a single dimension's squared difference will be larger than the sum of squared differences.
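A quick simulation (my own illustration, not from the answer) makes this concrete: estimate how often a single coordinate gap exceeds the full distance to a random neighbour as the dimensionality d grows:

```
# Estimate how often a single coordinate gap already exceeds the full squared
# distance to a known neighbour, i.e. how often the pruning rule can fire.
import numpy as np

rng = np.random.default_rng(0)
for d in (2, 10, 100):
    x, y, z = rng.random((3, 10_000, d))   # query points, known neighbours, boundaries
    one_dim = (x[:, 0] - z[:, 0]) ** 2     # squared gap in a single dimension
    full = ((x - y) ** 2).sum(axis=1)      # squared distance to the known neighbour
    print(d, (one_dim > full).mean())      # this fraction collapses as d grows
```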
Time complexity for knn: O(k * lg(n)),
where k is the number of nearest neighbours and lg(n) is the kd-tree height.
kd-trees will not work well if the dimensionality of the data set is high, because the search space becomes huge.
Let's say you have many points around the origin; for simplicity, consider the 2-D case.
If you want to find the k nearest neighbours of a point, you have to search along all 4 quadrants, because all the points are close to each other, which results in backtracking into the other branches of the kd-tree.
So for a 3-dimensional space we have to search along 8 directions.
Generalizing to d dimensions, that is 2^d directions,
so the time complexity becomes O(2^d * lg(n)).