Why does Armadillo produce NaN in eigenvectors? - armadillo

I am using the Armadillo C++ library to perform the eigendecomposition of a Hermitian matrix R of size 49x49. The resulting eigenvector matrix sometimes has NaN columns. When I perform the same decomposition in Matlab there is no such issue.
My Armadillo code is something like this:
cx_mat R = (S_n_centered * S_n_centered.t()) / S_n.n_cols;
cx_mat E;
vec d;
bool success = eig_sym(d,E,R);
For example, when R is this, the eigenvectors corresponding to the three largest eigenvalues are these. The success flag is true and the eigenvalues are correct (almost the same as in Matlab).
I am using Armadillo version 7.960.0 with blas_win64_MT.dll and lapack_win64_MT.dll, both of which are included in the examples folder.
Decomposing the same matrix in Matlab doesn't cause any issues, but the eigenvectors are all different.
Could this be a bug in Armadillo or in the LAPACK library?

Related

What is the equation for SVR inference using an RBF kernel?

I'm using sklearn for SVR (regression) with an RBF kernel. I want to know how the inference is done under the hood. I thought it was a function of the support vectors, function mean, and gamma, but it appears I'm missing one aspect (probably some scaling based on how close two points are).
Here is "my equation" that I've tried in the graphs below.
out = mean
for vect in vectors:
    out = out + (vect.y - mean) * math.exp(-(vect.x - x) ** 2 * gamma)
With just two points spaced far apart, my equation matches what sklearn reports with svr.predict.
With three training points, two of them close together, my equation does not match what svr.predict gives:
Given the support vectors, gamma, the mean, and anything else needed, what is the equation for SVR inference with an RBF kernel? Can those quantities be obtained from the sklearn SVR class?
The equation that works for me for SVR inference with an RBF kernel, using the sklearn library, is as follows in Python code:
import math
from sklearn import svm

# x and y are already defined and are the training data for the SVR
svr = svm.SVR(kernel="rbf", C=C, gamma=gamma, epsilon=epsilon, tol=tol)
svr.fit(x, y)
vectors = []
for i in svr.support_:
    vectors.append([x[i][0], y[i]])
# x_query below denotes the scalar input the prediction is evaluated at
out = svr._intercept_[0]
for vect, coef in zip(vectors, svr._dual_coef_[0]):
    out = out + coef * math.exp(-(vect[0] - x_query) ** 2 * gamma)
I found that svr._intercept_[0] contains the y offset for the function.
I found that svr._dual_coef_[0] contains the coefficients to multiply each of the exponentials by.
I found that svr.support_ contains the indexes of the elements in your training set used as the support vectors.
I realize I'm accessing attributes intended for internal use within the SVR class; however, I don't see an official API method for accessing these variables, and this is working for me for now.
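For reference, the same reconstruction can also be done through the documented public attributes (support_vectors_, dual_coef_, intercept_) instead of the private ones. Here is a minimal sketch; the toy data, C, and gamma values are made up for illustration:
import numpy as np
from sklearn import svm

# toy 1-D training data (illustrative only)
X = np.array([[0.0], [1.0], [1.1], [3.0]])
y = np.array([0.0, 1.0, 1.2, 0.5])
gamma = 0.5

svr = svm.SVR(kernel="rbf", C=10.0, gamma=gamma, epsilon=0.01)
svr.fit(X, y)

def manual_predict(x_query):
    # decision function: sum_j dual_coef_j * exp(-gamma * ||x_query - sv_j||^2) + intercept
    diffs = svr.support_vectors_ - x_query            # shape (n_SV, n_features)
    k = np.exp(-gamma * np.sum(diffs ** 2, axis=1))   # RBF kernel values
    return float(np.dot(svr.dual_coef_[0], k) + svr.intercept_[0])

x_query = np.array([0.7])
print(manual_predict(x_query), svr.predict([x_query])[0])  # the two values should match closely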

Write Dirichlet Log Likelihood with DCP ruleset

I would like to write the log likelihood of the Dirichlet density as a disciplined convex programming (DCP) optimization problem with respect to the parameters of the Dirichlet distribution alpha. However, the log likelihood
import numpy as np
import scipy.special

def dirichlet_log_likelihood(p, alpha):
    """Log of Dirichlet density.

    p: Numpy array of shape (K,) that sums to 1.
    alpha: Numpy array of shape (K,) with positive elements.
    """
    L = np.log(scipy.special.gamma(alpha.sum()))
    L -= np.log(scipy.special.gamma(alpha)).sum()
    L += np.sum((alpha - 1) * np.log(p))
    return L
despite being concave in alpha, does not follow the DCP ruleset as written, because it involves the difference of two convex functions, np.log(gamma(alpha.sum())) and np.log(gamma(alpha)).sum(). I would like, if possible, to formulate this function of alpha so that it follows the DCP ruleset, so that maximum-likelihood estimation of alpha can be performed with cvxpy.
Is this possible, and if so how might I do it?
As you note, np.log(gamma(alpha.sum())) and -np.log(gamma(alpha)).sum() have different curvature, so you need to combine them as
np.log(gamma(alpha.sum()) / gamma(alpha).prod())
to have any chance of modelling them under the DCP ruleset. The combined expression above can be recognized as the negative logarithm of the multivariate beta function, and since the multivariate beta function can be written as a product of bivariate beta functions (see here), you can expand the log-product into a sum of logs where each term is of the form
np.log(beta(x,y))
and this is the convex atom you need in your DCP ruleset. What remains for you, to use it in practice, is to feed an approximation of this atom into cvxpy. The np.log(gamma(x)) approximation here will serve as a good starting point.
Please see math.stackexchange.com for more details.
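As a quick numerical sanity check of that product identity, here is a plain numpy/scipy sketch (not a cvxpy formulation; the alpha values are made up):
import numpy as np
from scipy.special import betaln, gammaln

alpha = np.array([2.0, 3.5, 1.2, 4.0])   # arbitrary positive example values

# log of the multivariate beta function, computed directly
log_B_direct = gammaln(alpha).sum() - gammaln(alpha.sum())

# the same quantity as a sum of bivariate log-beta terms:
# B(a1,...,aK) = B(a1, a2) * B(a1 + a2, a3) * ... * B(a1 + ... + a_{K-1}, aK)
partial_sum = alpha[0]
log_B_chain = 0.0
for a in alpha[1:]:
    log_B_chain += betaln(partial_sum, a)
    partial_sum += a

print(log_B_direct, log_B_chain)  # should agree to floating-point precision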

Find negative fractional power of a matrix in Armadillo

In Matlab I do A ^ -0.5 to find a negative fractional power of matrix A. What is the equivalent in the Armadillo C++ library? The pow() function performs an element-wise operation.
You can do
expmat(-0.5 * logmat(A))
Use the powmat() function, like so:
mat A(5,5,fill::randu);
cx_mat B = powmat(A, -0.5);
Or use a combination of inv() and sqrtmat().

SVD output interpretation in mahout

I am trying to run an SVD job in Mahout. I have a document-term matrix (say A) of size 372053 x 21338 (21338 unique words, say N; 372053 documents, say M), so my matrix A is of size M x N. I ran the SVD using Mahout and got the cleaned eigenvectors (I gave the expected rank as 200, say R). Now I have an eigenvector matrix of size R x N.
Stating the SVD equation
A = U * S * V' (V' being transpose of V)
I need to convert the matrix A into the new space to get the compressed vectors of the documents (I am trying to implement LSI).
What is the output I get from Mahout SVD? (I would like to know it in terms of the equation above.) I read on the mailing list that we can get the eigenvalues from the NamedVectors in the generated eigenvector matrix.
Please guide me on how to proceed from here to generate the document-term matrix A in the new space (of size M x R).
Any help is highly appreciated :)
A good starting point for LSI with Stochastic SVD on Mahout can be found here.
The good part is that the paper also describes the folding-in process and is explicit about the output format in terms of the SVD equation.
The work is integrated in the latest version 0.8 and can be used with the SSVDCli job or through the Mahout CLI with mahout ssvd <options>.
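To make the algebra concrete, here is a small plain-numpy sketch of the folding-in step the paper describes (not Mahout-specific; the matrix sizes are made up for illustration):
import numpy as np

# toy stand-in for the M x N document-term matrix A (sizes are illustrative)
M, N, R = 100, 50, 10
A = np.random.rand(M, N)

# truncated SVD: A is approximated by U_R * S_R * V_R'
U, s, Vt = np.linalg.svd(A, full_matrices=False)
U_R = U[:, :R]          # M x R
S_R = np.diag(s[:R])    # R x R
V_R = Vt[:R, :].T       # N x R

# documents in the reduced R-dimensional space (M x R):
# fold A in via A * V_R * S_R^{-1}, which equals U_R up to floating-point error;
# scaling by S_R (i.e. using U_R * S_R) is the other common LSI convention
docs_reduced = A @ V_R @ np.linalg.inv(S_R)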

Libsvm precomputed kernels

I am using libsvm with precomputed kernels. I generated a precomputed kernel file for the example data set heart_scale and executed the function svmtrain(). It worked properly and the support vectors were identified correctly, i.e. similarly to standard kernels.
However, when I tried to run svmpredict(), it gave different results for the precomputed model file. After digging through the code, I noticed that the svm_predict_values() function requires the actual features of the support vectors, which are unavailable in precomputed mode. In precomputed mode we only have the coefficient and index of each support vector, which svmpredict() mistakes for its features.
Is this an issue, or am I missing something?
(Please let me know how to run svmpredict() in precomputed mode.)
The values of the kernel evaluation between a test set vector, x, and each training set vector should be used as the test set feature vector.
Here are the pertinent lines from the libsvm readme:
New training instance for xi:
<label> 0:i 1:K(xi,x1) ... L:K(xi,xL)
New testing instance for any x:
<label> 0:? 1:K(x,x1) ... L:K(x,xL)
The libsvm readme is saying that if you have L training set vectors, where xi is a training set vector with i from [1..L], and a test set vector, x, then the feature vector for x should be
<label of x> 0:<any number> 1:K(x^{test},x1^{train}) 2:K(x^{test},x2^{train}) ... L:K(x^{test},xL^{train})
where K(u,v) denotes the output of the kernel function with vectors u and v as the arguments.
I have included some example python code below.
The results from the original feature vector representation and the precomputed (linear) kernel are not exactly the same, but this is probably due to differences in the optimization algorithm.
from svmutil import *
import numpy as np

# original example
y, x = svm_read_problem('.../heart_scale')
m = svm_train(y[:200], x[:200], '-c 4')
p_label, p_acc, p_val = svm_predict(y[200:], x[200:], m)

##############
# train the SVM using a precomputed linear kernel

# create dense data
max_key = max(max(v.keys()) for v in x)
arr = np.zeros((len(x), max_key))
for row, vec in enumerate(x):
    for k, v in vec.items():
        arr[row][k - 1] = v
x = arr

# create a linear kernel matrix with the training data
K_train = np.zeros((200, 201))
K_train[:, 1:] = np.dot(x[:200], x[:200].T)
K_train[:, :1] = np.arange(200)[:, np.newaxis] + 1
m = svm_train(y[:200], [list(row) for row in K_train], '-c 4 -t 4')

# create a linear kernel matrix for the test data
K_test = np.zeros((len(x) - 200, 201))
K_test[:, 1:] = np.dot(x[200:], x[:200].T)
K_test[:, :1] = np.arange(len(x) - 200)[:, np.newaxis] + 1
p_label, p_acc, p_val = svm_predict(y[200:], [list(row) for row in K_test], m)
