Hello I am working on a project involving in face recognition for which I am using Linear Discriminant Analysis(LDA). LDA demands to find the generalized eigen vectors for the between class scatter matrix and with in class scatter matrix and that is where I am struck. I am using opencv with DevC++ for coding. Basically the problem looks like
A*v=lambda*B*v
where A and B are matrices for which generalized eigen vectors should be found
lambda is eigen values and v is vectors
Upon searching about this problem many people suggested to go for calculating the inverse of B and then multiplying with A*v
(inv(B)*A)*v=lambda*v
and then calculate eigen vectors for inv(B)*A.
It seems to be a good solution but in my case the scatter matrix B is almost sigular. I found its determinant is in the order of 10^-36 .So I cant find its inverse and proceed with the above solution. So Can some one suggest me a way to get out of this problem except saying to code for generalized eigen value problem separately.
I am providing a Fisherfaces implementation in my github repository at https://github.com/bytefish/opencv/tree/master/lda. This includes the implementation of an eigenvalue solver for general matrices, see: https://github.com/bytefish/opencv/blob/master/lda/include/decomposition.hpp (I've ported the great JAMA solver), which is exactely what you are looking for.
If you have problems with the code, please drop me a note on the projects page at http://www.bytefish.de/blog/fisherfaces_in_opencv.
Related
I've been studying the Machine Learning course on Coursera by Andrew Ng.
I've solved the programming assignment that is given at the end of the week 3. But there was something that I was wondering and I do not want to skip things without being able to reason them, at least I couldn't.
Why do we omit coefficients of polynomial terms when feature mapping?
I quote from the assignment's PDF (which you can view from the link).
2.2 Feature mapping
One way to fit the data better is to create more features from each data
point. In the provided function mapFeature.m, we will map the features into
all polynomial terms of x 1 and x 2 up to the sixth power.
As you can see coefficients are not included.
Considering I am not good at math I think we found all the polynomial terms doing:
since here we would get coefficients after we take the powers, why do we omit them in the vector?
By coefficients I mean coefficients of the terms
instead of in the vector.
Newbie here typesetting my question, so excuse me if this don't work.
I am trying to give a bayesian classifier for a multivariate classification problem where input is assumed to have multivariate normal distribution. I choose to use a discriminant function defined as log(likelihood * prior).
However, from the distribution,
$${f(x \mid\mu,\Sigma) = (2\pi)^{-Nd/2}\det(\Sigma)^{-N/2}exp[(-1/2)(x-\mu)'\Sigma^{-1}(x-\mu)]}$$
i encounter a term -log(det($S_i$)), where $S_i$ is my sample covariance matrix for a specific class i. Since my input actually represents a square image data, my $S_i$ discovers quite some correlation and resulting in det(S_i) being zero. Then my discriminant function all turn Inf, which is disastrous for me.
I know there must be a lot of things go wrong here, anyone willling to help me out?
UPDATE: Anyone can help how to get the formula working?
I do not analyze the concept, as it is not very clear to me what you are trying to accomplish here, and do not know the dataset, but regarding the problem with the covariance matrix:
The most obvious solution for data, where you need a covariance matrix and its determinant, and from numerical reasons it is not feasible is to use some kind of dimensionality reduction technique in order to capture the most informative dimensions and simply discard the rest. One such method is Principal Component Analysis (PCA), which applied to your data and truncated after for example 5-20 dimensions would yield the reduced covariance matrix with non-zero determinant.
PS. It may be a good idea to post this question on Cross Validated
Probably you do not have enough data to infer parameters in a space of dimension d. Typically, the way you would get around this is to take an MAP estimate as opposed to an ML.
For the multivariate normal, this is a normal-inverse-wishart distribution. The MAP estimate adds the matrix parameter of inverse Wishart distribution to the ML covariance matrix estimate and, if chosen correctly, will get rid of the singularity problem.
If you are actually trying to create a classifier for normally distributed data, and not just doing an experiment, then a better way to do this would be with a discriminative method. The decision boundary for a multivariate normal is quadratic, so just use a quadratic kernel in conjunction with an SVM.
I got Memory Error when I was running dbscan algorithm of scikit.
My data is about 20000*10000, it's a binary matrix.
(Maybe it's not suitable to use DBSCAN with such a matrix. I'm a beginner of machine learning. I just want to find a cluster method which don't need an initial cluster number)
Anyway I found sparse matrix and feature extraction of scikit.
http://scikit-learn.org/dev/modules/feature_extraction.html
http://docs.scipy.org/doc/scipy/reference/sparse.html
But I still have no idea how to use it. In DBSCAN's specification, there is no indication about using sparse matrix. Is it not allowed?
If anyone knows how to use sparse matrix in DBSCAN, please tell me.
Or you can tell me a more suitable cluster method.
The scikit implementation of DBSCAN is, unfortunately, very naive. It needs to be rewritten to take indexing (ball trees etc.) into account.
As of now, it will apparently insist of computing a complete distance matrix, which wastes a lot of memory.
May I suggest that you just reimplement DBSCAN yourself. It's fairly easy, there exists good pseudocode e.g. on Wikipedia and in the original publication. It should be just a few lines, and you can then easily take benefit of your data representation. E.g. if you already have a similarity graph in a sparse representation, it's usually fairly trivial to do a "range query" (i.e. use only the edges that satisfy your distance threshold)
Here is a issue in scikit-learn github where they talk about improving the implementation. A user reports his version using the ball-tree is 50x faster (which doesn't surprise me, I've seen similar speedups with indexes before - it will likely become more pronounced when further increasing the data set size).
Update: the DBSCAN version in scikit-learn has received substantial improvements since this answer was written.
You can pass a distance matrix to DBSCAN, so assuming X is your sample matrix, the following should work:
from sklearn.metrics.pairwise import euclidean_distances
D = euclidean_distances(X, X)
db = DBSCAN(metric="precomputed").fit(D)
However, the matrix D will be even larger than X: n_samplesĀ² entries. With sparse matrices, k-means is probably the best option.
(DBSCAN may seem attractive because it doesn't need a pre-determined number of clusters, but it trades that for two parameters that you have to tune. It's mostly applicable in settings where the samples are points in space and you know how close you want those points to be to be in the same cluster, or when you have a black box distance metric that scikit-learn doesn't support.)
Yes, since version 0.16.1.
Here's a commit for a test:
https://github.com/scikit-learn/scikit-learn/commit/494b8e574337e510bcb6fd0c941e390371ef1879
Sklearn's DBSCAN algorithm doesn't take sparse arrays. However, KMeans and Spectral clustering do, you can try these. More on sklearns clustering methods: http://scikit-learn.org/stable/modules/clustering.html
I would like to know, if there is any code or any good documentation available for implementing HOG features? I tried to read the documentation here but it's quite difficult to understand and it needs SVM..
What I need is just to implement a HOG detector for objects.... Like what it does SIFT or SURF
Btw, I'm not interesting in this work.
Thank you..
you can take a look at
http://szproxy.blogspot.com/2010/12/testtest.html
he also published "tutorial" for HOG on source forge here:
http://sourceforge.net/projects/hogtrainingtuto/?_test=beta
I know this since I'm having the same problem as you. The tutorial though isn't what i would call a tutorial, its a bunch of source codes, no documentation, but I assume that it works and can at least get you somewhere.
At the end and simplifying a bit, all that you need to detect specific objects in image is:
Localize "points of interest" to extract the patches:
In order to get points of interest, you can use some algorithms like Harris corner detector, randomly or something simply like sliding windows.
From these points get patches:
You will have to take the decission of the patch size.
From these patches compute the feature descriptor. (like HOG).
Instead of HOG you can use another feature descriptor like SIFT, SURF...
HOG's implementation is not too hard. You have to calculate the gradients of the extracted patch doing applying Sobel X and Y kernels, after that you have to divide the patch in NxM cells, 8x8 for instance, and compute an histogram of gradients, angle and magnitude. In the following link you can see it more detailed explanation:
HOG Person Detector Tutorial
Check your feature vector in the previously trained classifier
Once you got this vector, check if it is the desired object or not with a previously trained classifier like SMV. Instead SVM you could use NeuralNetworks for instance.
SVM implementation is more dificult, but there are some libraries like opencv that you can use.
There is a function extractHOGFeatures in the Computer Vision System Toolbox for MATLAB.
I need to do matrix operations (mainly multiply and inverse) of a sparse matrix SparseMat in OpenCV.
I noticed that you can only iterate and insert values to SparseMat.
Is there an external code I can use? (or am I missing something?)
It's just that sparse matrices are not really suited for inversion or matrix-matrix-multiplication, so it's quite reasonable there is no builtin function for that. They're actually more used for matrix-vector multiplication (usually when solving iterative linear systems).
What you can do is solve N linear systems (with the columns of the identity matrix as right hand sides) to get the inverse matrix. But then you need N*N storage for the inverse matrix anyway, so using a dense matrix with a usual decompositions algorithm would be a better way to do it, as the performance gain won't be that high when doing N iterative solutions. Or maybe some sparse direct solvers like SuperLU or TAUCS may help, but I doubt that OpenCV has such functionalities.
You should also think if you really need the inverse matrix. Often such problem are also solvable by just solving a linear system, which can be done with a sparse matrix quite easily and fast via e.g. CG or BiCGStab.
You can convert a SparseMat to a Mat, do what operations you need and then convert back.
you can use Eigen library directly. Eigen works together with OpenCV very well.