I have to train a Support Vector Machine model, and I'd like to use a custom kernel matrix instead of the preset ones (like RBF, poly, etc.).
How can I do that (if it is possible) with OpenCV's machine learning library?
Thank you!
AFAICT, custom kernels for SVM aren't supported directly in OpenCV. It looks like LIBSVM, which is the underlying library that OpenCV uses for this, doesn't provide a particularly easy means of defining custom kernels. So, many of the wrappers that use LIBSVM don't provide this either. There seem to be a few, e.g. scikit for python: scikit example of SVM with custom kernel
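For reference, here's roughly what that scikit approach looks like, as a minimal sketch; the kernel function itself is just a placeholder and the data is random stand-in data:

import numpy as np
from sklearn import svm

def my_kernel(X, Y):
    # Placeholder custom kernel: must return the Gram matrix of
    # shape (n_samples_X, n_samples_Y). A plain dot product here.
    return np.dot(X, Y.T)

X = np.random.rand(20, 5)           # stand-in training data
y = np.random.randint(0, 2, 20)     # stand-in binary labels
clf = svm.SVC(kernel=my_kernel).fit(X, y)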
You could also take a look at a completely different library, like SVMlight. It supports custom kernels directly. Also take a look at this SO question. The answers there include a handful of SVM libraries, along with brief reviews.
If you have compelling reasons to stay within OpenCV, you might be able to accomplish it by using kernel type CvSVM::LINEAR and applying your custom kernel to the data before training the SVM. I'm a little fuzzy on whether this direction would be fruitful, so I hope someone with more experience with SVM can chime in and comment. If it is possible to use a "precomputed kernel" by choosing "linear" as your kernel, then take a look at this answer for more ideas on how to proceed.
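To make the "precomputed kernel" idea concrete, here's the pattern as it works in scikit-learn (shown there because OpenCV doesn't expose it; whether the CvSVM::LINEAR trick above is equivalent is exactly the open question). my_kernel and the data are stand-ins:

import numpy as np
from sklearn import svm

def my_kernel(A, B):                    # stand-in custom kernel
    return np.dot(A, B.T) ** 2          # e.g. a homogeneous poly kernel

X_train = np.random.rand(30, 4)
y_train = np.random.randint(0, 2, 30)
X_test = np.random.rand(10, 4)

K_train = my_kernel(X_train, X_train)   # Gram matrix of the training set
clf = svm.SVC(kernel="precomputed").fit(K_train, y_train)
K_test = my_kernel(X_test, X_train)     # kernel between test and train points
y_pred = clf.predict(K_test)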
You might also consider including LIBSVM and calling it directly, without using OpenCV. See FAQ #418 for LIBSVM, which briefly touches on how to do custom kernels:
Q: I would like to use my own kernel. Any example? In svm.cpp, there are two subroutines for kernel evaluations: k_function() and kernel_function(). Which one should I modify ?
An example is "LIBSVM for string data" in LIBSVM Tools.
The reason why we have two functions is as follows. For the RBF kernel exp(-g |xi - xj|^2), if we calculate xi - xj first and then the norm square, there are 3n operations. Thus we consider exp(-g (|xi|^2 - 2dot(xi,xj) +|xj|^2)) and by calculating all |xi|^2 in the beginning, the number of operations is reduced to 2n. This is for the training. For prediction we cannot do this so a regular subroutine using that 3n operations is needed. The easiest way to have your own kernel is to put the same code in these two subroutines by replacing any kernel.
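For illustration, the precomputation trick the FAQ describes looks something like this in NumPy (a sketch of the idea only, not LIBSVM's actual code):

import numpy as np

def rbf_gram(X, gamma):
    # exp(-g * |xi - xj|^2) expanded as
    # exp(-g * (|xi|^2 - 2*dot(xi, xj) + |xj|^2)),
    # so the squared norms are computed once up front.
    sq_norms = (X ** 2).sum(axis=1)     # all |xi|^2, computed once
    K = -2.0 * np.dot(X, X.T)
    K += sq_norms[:, None]
    K += sq_norms[None, :]
    return np.exp(-gamma * K)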
That last option sounds like a bit of a pain, though. I'd recommend scikit or SVMlight. Best of luck to you!
If you're not married to OpenCV for the SVM stuff, have a look at the shogun toolbox ... lots of SVM voodoo in there.
I'm currently working on my FYP (final year project) on brain tumor classification. I extracted features using the wavelet transform, GLCM, polynomial transform, etc.
Is it right to append these feature vectors (column-wise) for training? That is, combinations of these feature vectors, e.g. GLCM + wavelet.
Can you suggest any papers related to this?
Thank you for the help.
Yes, this method is known as early fusion.
In other words, early fusion is when you concatenate two or more feature sets prior to model training.
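In practice, early fusion is just a column-wise concatenation; for instance, with NumPy (the feature sets here are random stand-ins for your GLCM and wavelet features):

import numpy as np

rng = np.random.default_rng(0)
glcm_feats = rng.random((100, 20))      # stand-in: 100 samples, 20 GLCM features
wavelet_feats = rng.random((100, 64))   # stand-in: 64 wavelet features per sample
fused = np.hstack([glcm_feats, wavelet_feats])   # shape (100, 84)
# `fused` is the column-wise concatenation you then feed to the classifier.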
There are a number of other methods for feature fusion as well, including model fusion and late fusion.
Take a look at these papers, which might help you:
- one specific to a health-based application
- a figure which might help you to grasp the concept
I have a dataset of MFCCs that I know is good. I know how to feed a row vector into a machine learning algorithm, but my question is how to do it with MFCCs, since each sample is a matrix. For example, how would I put this inside a machine learning algorithm?
http://archive.ics.uci.edu/ml/machine-learning-databases/00195/Test_Arabic_Digit.txt
Any algorithm will work. I am looking at a binary classifier, but will be looking into it more. Scikit seems like a good resource. For now I would just like to know how to feed MFCCs into an algorithm; a step-by-step explanation would help me a lot! I have looked in a lot of places but have not found an answer.
Thank you
In Python, you can easily flatten a matrix so it becomes a vector; for example, you can use NumPy's flatten function, as sketched below. Additionally, an idea that comes to mind (it's just an idea, it may or may not work) is to use convolutions, since convolutions work very well with 2D structures.
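A minimal sketch of the flattening approach (the matrix shape is a stand-in; real MFCC shapes depend on your extraction settings):

import numpy as np

mfcc = np.random.rand(13, 40)    # stand-in: 13 coefficients x 40 frames
vec = mfcc.flatten()             # shape (520,): one feature vector per sample
# Stacking one flattened vector per utterance gives the usual
# (n_samples, n_features) matrix -- note this assumes every sample
# has the same matrix shape (pad or truncate frames if not).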
I want to train a convolutional neural network in one language, but use it in another one (for various technical/performance related reasons). Is there a programmatic way of doing this by saving weights?
For example, I can train a multilayer perceptron in Python, then save all the weights in a CSV file, then make a new MLP in Java and use the file to set the weights. However, I'm unsure how I can do something similar with a convolutional neural network, because I don't know how to treat the convolutional layer. I think my main problem is understanding how to export/save the convolutional part of the network and then load it elsewhere.
In short: there is no general way, and there probably won't be, since this is a highly specialized type of data with no single, common idea of what it should look like.
In other words, you have to export it yourself to .txt, .csv, a database, or any other storage system of your choice. CNNs do not differ that much from MLPs: they also have layers and weights; the only difference is that their structure is a bit more complex, and this structure is just a few additional numbers that you need to save. For example, for a spatial convolution you need to know the size of the layer, the size of the kernels, and their stride/padding, so that you can reconstruct the entire object in the new language.
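As a sketch of what "structure plus weights" means in practice, a convolution layer could be dumped like this; the file names, metadata keys, and layer sizes are entirely made up, the format is up to you:

import json
import numpy as np

# Stand-in convolution layer: weights of shape
# (out_channels, in_channels, kernel_h, kernel_w).
weights = np.random.rand(16, 3, 5, 5)
meta = {"type": "conv", "out_channels": 16, "in_channels": 3,
        "kernel": [5, 5], "stride": 1, "padding": 2}

# Flatten to 2-D for CSV; the metadata tells the loader how to reshape.
np.savetxt("conv1_weights.csv", weights.reshape(16, -1), delimiter=",")
with open("conv1_meta.json", "w") as f:
    json.dump(meta, f)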
If I am understanding you correctly, you want to train a CNN in one language, save its weights, and then use those weights in another language.
If that is the case, then the answer will depend entirely on the languages you want to use. I have seen people train CNNs in C++/Python using Caffe and save the weights as a .caffemodel file. You can then use this .caffemodel from Lua via the loadcaffe interface.
So the higher-level answer is: yes, you can find a way to share weights between Python/C++ and Lua. You can also share weights between Lua and MATLAB using mattorch/matio.
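If you go the Caffe route, dumping the learned blobs from Python looks roughly like this (a sketch, assuming pycaffe is installed; the prototxt/caffemodel file names are stand-ins):

import numpy as np
import caffe

net = caffe.Net("deploy.prototxt", "model.caffemodel", caffe.TEST)
for name, blobs in net.params.items():
    # blobs[0] is the weight blob; blobs[1], if present, the bias.
    for i, blob in enumerate(blobs):
        np.save("%s_%d.npy" % (name, i), blob.data)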
I got a MemoryError when I was running scikit's DBSCAN algorithm.
My data is about 20000 x 10000; it's a binary matrix.
(Maybe it's not suitable to use DBSCAN with such a matrix. I'm a beginner in machine learning; I just want to find a clustering method which doesn't need an initial cluster count.)
Anyway, I found scikit's feature extraction and SciPy's sparse matrix documentation:
http://scikit-learn.org/dev/modules/feature_extraction.html
http://docs.scipy.org/doc/scipy/reference/sparse.html
But I still have no idea how to use them. DBSCAN's documentation says nothing about sparse matrices. Are they not allowed?
If anyone knows how to use a sparse matrix with DBSCAN, please tell me.
Alternatively, you could suggest a more suitable clustering method.
The scikit implementation of DBSCAN is, unfortunately, very naive. It needs to be rewritten to take indexing (ball trees etc.) into account.
As of now, it will apparently insist on computing a complete distance matrix, which wastes a lot of memory.
May I suggest that you just reimplement DBSCAN yourself? It's fairly easy; there is good pseudocode, e.g. on Wikipedia and in the original publication. It should be just a few lines, and you can then easily take advantage of your data representation. E.g., if you already have a similarity graph in a sparse representation, it's usually fairly trivial to do a "range query" (i.e., use only the edges that satisfy your distance threshold).
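For example, a bare-bones version driven entirely by a range-query function (so a sparse neighborhood graph plugs straight in) might look like this; a sketch following the standard pseudocode, not production code:

def dbscan(range_query, n_points, min_pts):
    # range_query(p) returns the indices of all eps-neighbors of p,
    # including p itself, per the usual convention.
    NOISE = -1
    labels = [None] * n_points
    cluster = 0
    for p in range(n_points):
        if labels[p] is not None:
            continue
        neighbors = range_query(p)
        if len(neighbors) < min_pts:
            labels[p] = NOISE            # may be relabeled as a border point
            continue
        labels[p] = cluster
        seeds = list(neighbors)
        i = 0
        while i < len(seeds):
            q = seeds[i]
            i += 1
            if labels[q] == NOISE:
                labels[q] = cluster      # border point of this cluster
            if labels[q] is not None:
                continue
            labels[q] = cluster
            q_neighbors = range_query(q)
            if len(q_neighbors) >= min_pts:
                seeds.extend(q_neighbors)  # q is a core point: keep expanding
        cluster += 1
    return labels

With a sparse adjacency matrix G where a nonzero entry means the distance is within eps, the range query is just a row lookup:

from scipy.sparse import csr_matrix
G = csr_matrix(G)   # G is your precomputed sparse similarity graph
labels = dbscan(lambda p: G.indices[G.indptr[p]:G.indptr[p + 1]],
                G.shape[0], min_pts=5)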
Here is an issue on the scikit-learn GitHub where they discuss improving the implementation. A user reports that his version using the ball tree is 50x faster (which doesn't surprise me; I've seen similar speedups with indexes before, and it will likely become more pronounced as the data set size increases).
Update: the DBSCAN version in scikit-learn has received substantial improvements since this answer was written.
You can pass a distance matrix to DBSCAN, so assuming X is your sample matrix, the following should work:
from sklearn.cluster import DBSCAN
from sklearn.metrics.pairwise import euclidean_distances

D = euclidean_distances(X, X)   # pairwise distances between all samples
db = DBSCAN(metric="precomputed").fit(D)
However, the matrix D will be even larger than X: n_samples² entries. With sparse matrices, k-means is probably the best option.
(DBSCAN may seem attractive because it doesn't need a pre-determined number of clusters, but it trades that for two parameters that you have to tune. It's mostly applicable in settings where the samples are points in space and you know how close you want points to be in order to fall in the same cluster, or when you have a black-box distance metric that scikit-learn doesn't support.)
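For what it's worth, k-means in scikit-learn accepts sparse input directly, e.g. as follows (the data and the choice of n_clusters are stand-ins):

import numpy as np
from scipy.sparse import csr_matrix
from sklearn.cluster import KMeans

X_sparse = csr_matrix(np.random.rand(100, 50))  # stand-in for your binary matrix
km = KMeans(n_clusters=8).fit(X_sparse)
print(km.labels_)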
Yes, sparse input is supported since version 0.16.1.
Here's a commit for a test:
https://github.com/scikit-learn/scikit-learn/commit/494b8e574337e510bcb6fd0c941e390371ef1879
Scikit-learn's DBSCAN algorithm doesn't take sparse arrays. However, KMeans and spectral clustering do; you can try those. More on scikit-learn's clustering methods: http://scikit-learn.org/stable/modules/clustering.html
Which LS-SVM toolbox can I use in MATLAB? Which implementation do you recommend?
I am not 100% sure what type of SVM you're referring to, but I'm assuming you're interested in an implementation of the least squares SVM of Suykens & Vandewalle (NIPS 99). If that's the case, I believe neither LIBSVM nor LIBLINEAR does that; check out http://www.esat.kuleuven.be/sista/lssvmlab/ .
If you're interested in a normal quadratic-programming formulation of the SVM with quadratic slack penalties, LIBSVM and LIBLINEAR should work. Also, the newer subgradient-based solvers, such as Pegasos, may be useful as well, but I am not sure whether there is a good MATLAB library for you to use.
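For intuition, and independent of which toolbox you pick: LS-SVM training reduces to a single linear system rather than a QP. A minimal NumPy sketch of the regression form, with an RBF kernel as a stand-in and gam/sigma chosen by the caller (this is an illustration of the idea, not any toolbox's API):

import numpy as np

def lssvm_fit(X, y, gam, sigma):
    # Solve the LS-SVM (regression-form) linear system:
    #   [ 0    1^T       ] [b]   [0]
    #   [ 1    K + I/gam ] [a] = [y]
    n = len(y)
    sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    K = np.exp(-sq / (2.0 * sigma ** 2))        # RBF kernel matrix
    A = np.zeros((n + 1, n + 1))
    A[0, 1:] = 1.0
    A[1:, 0] = 1.0
    A[1:, 1:] = K + np.eye(n) / gam
    sol = np.linalg.solve(A, np.concatenate(([0.0], y)))
    return sol[0], sol[1:]                      # bias b, dual weights alpha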
Check out both libsvm and liblinear:
http://www.csie.ntu.edu.tw/~cjlin/libsvm/
http://www.csie.ntu.edu.tw/~cjlin/liblinear/
These are the fastest SVM solvers that I know of.