I'm porting an algorithm that uses a Support Vector Machine from Python (using scikit-learn) to C++ (using the machine learning library of OpenCV).
I have access to the trained SVM in Python, and I can import SVM model parameters from an XML file into OpenCV. Since the SVM implementation of both scikit-learn and OpenCV is based on LibSVM, I think it should be possible to use the parameters of the trained scikit SVM in OpenCV.
The example below shows an XML file which can be used to initialize an SVM in OpenCV:
<?xml version="1.0"?>
<opencv_storage>
<my_svm type_id="opencv-ml-svm">
<svm_type>C_SVC</svm_type>
<kernel><type>RBF</type>
<gamma>0.058823529411764705</gamma></kernel>
<C>100</C>
<term_criteria><epsilon>0.0</epsilon>
<iterations>1000</iterations></term_criteria>
<var_all>17</var_all>
<var_count>17</var_count>
<class_count>2</class_count>
<class_labels type_id="opencv-matrix">
<rows>1</rows>
<cols>2</cols>
<dt>i</dt>
<data>
0 1</data></class_labels>
<sv_total>20</sv_total>
<support_vectors>
<_>
2.562423055146794554e-02 1.195797425735170838e-01
8.541410183822648050e-02 9.395551202204914520e-02
1.622867934926303379e-01 3.074907666176152077e-01
4.099876888234874062e-01 4.697775601102455179e-01
3.074907666176152077e-01 3.416564073529061440e-01
5.124846110293592716e-01 5.039432008455355660e-01
5.466502517646497639e-01 1.494746782168964394e+00
4.168208169705446942e+00 7.214937388193202183e-01
7.400275229357797802e-01</_>
<!-- omit 19 vectors to keep it short -->
</support_vectors>
<decision_functions>
<_>
<sv_count>20</sv_count>
<rho>-5.137523249549433402e+00</rho>
<alpha>
2.668992955678978518e+01 7.079767098112181145e+01
3.554240018130368384e+01 4.787014908624512088e+01
1.308470223155845069e+01 5.499185410034550614e+01
4.160483074010306126e+01 2.885504210853826379e+01
7.816431542954153144e+01 6.882061506693679576e+01
1.069534676985309574e+01 -1.000000000000000000e+02
-5.088050252552544350e+01 -1.101740897543916375e+01
-7.519686789702373630e+01 -3.893481464245511603e+01
-9.497774056452135483e+01 -4.688632332663718927e+00
-1.972745089701982835e+01 -8.169343841768861125e+01</alpha>
<index>
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19
</index></_></decision_functions></my_svm>
</opencv_storage>
I would now like to fill this XML file with values from the trained scikit-learn SVM. But I'm not sure how the parameters of scikit-learn and OpenCV correspond. Here is what I have so far (clf is the classifier object in Python):
<kernel><gamma> corresponds to clf.gamma
<C> corresponds to clf.C
<term_criteria><epsilon> corresponds to clf.tol
<support_vectors> corresponds to clf.support_vectors_
Is this correct so far? Now here are the items I'm not really sure about:
What about <term_criteria><iterations>?
Does <decision_functions><_><rho> correspond to clf.intercept_?
Does <decision_functions><_><alpha> correspond to clf.dual_coef_? Here I'm not sure, because the scikit-learn documentation says that dual_coef_ holds the product y_i * alpha_i. It looks like OpenCV expects only alpha_i, not y_i * alpha_i.
You don't need epsilon and iterations anymore; those are only used in the training optimization problem. You can set them to your favorite number or ignore them.
Porting the support vectors may require some fiddling, since indexing may differ between scikit-learn and OpenCV. The XML in your example is not in sparse format, for instance.
As for the other parameters:
rho should correspond to intercept_, but you may need to flip its sign.
scikit-learn's dual_coef_ corresponds to sv_coef in standard libsvm models (which holds alpha_i * y_i).
If OpenCV complains about the alpha values you provide when porting, use the absolute values of scikit-learn's dual_coef_ (i.e. make them all positive); those are the true alpha values of an SVM model.
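To make that mapping concrete, here is a rough sketch (my own variable names, untested against OpenCV, assuming the binary RBF-kernel SVC with numeric gamma from the question; clf is the trained classifier object) of pulling the values out on the Python side. The sign handling for rho and alpha is exactly the fiddly part described above:
import numpy as np

# clf: the trained sklearn.svm.SVC from the question (binary, RBF kernel).
support_vectors = clf.support_vectors_     # -> <support_vectors>
alpha = clf.dual_coef_.ravel()             # y_i * alpha_i; the XML above shows
                                           # signed values, so try these first
# alpha = np.abs(alpha)                    # fallback if OpenCV rejects signed values
rho = -float(clf.intercept_[0])            # -> <rho>; verify the sign on a known input
index = np.arange(len(alpha))              # -> <index>: for a binary model the
                                           # XML simply lists 0..sv_total-1
print(clf.gamma, clf.C, support_vectors.shape[0])  # <gamma>, <C>, <sv_total>
A cheap sanity check afterwards: run a handful of samples through both clf.decision_function and the imported OpenCV SVM and compare the predicted classes.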
I have trained a set of documents using Doc2VecC.
https://github.com/mchen24/iclr2017
I am trying to generate embedding vectors for unseen documents. I have trained on my documents as described in go.sh:
"""
time ./doc2vecc -train ./aclImdb/alldata-shuf.txt -word
wordvectors.txt -output docvectors.txt -cbow 1 -size 100 -window 10 -
negative 5 -hs 0 -sample 0 -threads 4 -binary 0 -iter 20 -min-count 10
-test ./aclImdb/alldata.txt -sentence-sample 0.1 -save-vocab
alldata.vocab
"""
I get docvectors.txt and wordvectors.txt for the training set. From here, how do I generate vectors for unseen test documents using the same model, without retraining?
As far as I can tell, the author (https://github.com/mchen24) of that doc2vecc.c code (and paper) just made minimal changes to some example 'paragraph vector' code that was itself a minimal change to the original Google/Mikolov word2vec.c (https://github.com/tmikolov/word2vec/blob/master/word2vec.c).
Neither the 'paragraph vector' changes nor the subsequent doc2vecc changes appear to include any functionality for inferring vectors for new documents.
Because these are unsupervised algorithms, for some purposes it may be appropriate to compute the document vectors for both training and test texts in one combined bulk training run. (Your ultimate task may in fact benefit from unlabeled examples helping to learn the document vectorization, even if your classifier is then trained and evaluated only on some subset of known-label texts.)
Doc2VecC is expressly designed to create document vectors as averages of the word vectors in each document. This is unlike Doc2Vec, where document embeddings are trained alongside the word embeddings, which makes it impossible to handle unseen documents; the number of trained vectors in Doc2Vec is also enormous.
To build the vector for an unseen document, just take all the words in it that appear in your vocabulary and compute the average of their word vectors.
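For instance, a small sketch of that averaging step (my assumptions: wordvectors.txt as written by the command above with -binary 0, i.e. an optional "vocab_size dim" header line followed by one word and its floats per line, and naive whitespace tokenization):
import numpy as np

# Load the word vectors written by doc2vecc (-word wordvectors.txt, -binary 0).
vectors = {}
with open("wordvectors.txt") as f:
    first = f.readline().split()
    if len(first) != 2:              # no "vocab_size dim" header: rewind
        f.seek(0)
    for line in f:
        parts = line.rstrip().split()
        if len(parts) > 2:
            vectors[parts[0]] = np.array(parts[1:], dtype=float)

def embed(document, dim=100):
    """Average the vectors of the in-vocabulary words of one document."""
    hits = [vectors[w] for w in document.split() if w in vectors]
    return np.mean(hits, axis=0) if hits else np.zeros(dim)

print(embed("this movie was surprisingly good"))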
I am trying to implement the OpenCV LBPHFaceRecognizer() and make it work for images of digits from the MNIST dataset. These images are 28 x 28 px grayscale digits (example image omitted).
But for this task I need a haarcascade.xml file that is able to recognize digits. In the OpenCV package I only find .xml files suited for face recognition and Russian plate numbers.
Here is my code; I basically just need to replace cascadePath = "haarcascade_frontalface_default.xml" with an appropriate .xml for digits, but where do I get one?
All in all, I want to test face recognition with numbers instead of faces. So an input image showing a "1" should make it possible to recognize all other "1"s in the dataset.
For this, you need to train a cascade yourself. Here are two links explaining how to do this:
1. The OpenCV documentation for opencv_traincascade, the OpenCV application that trains a cascade (and generates the .xml file).
2. A useful tutorial on training a cascade with OpenCV. It explains what to do and gives some tricks for generating the input files.
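As a side note: since MNIST digits are already cropped and centered, the detection step a cascade provides may not be needed for a pure recognition experiment. Below is a minimal sketch of training the recognizer directly on digit images (assuming the cv2.face module from opencv-contrib-python; in older OpenCV versions the factory is cv2.createLBPHFaceRecognizer, and the data below is only a placeholder):
import cv2
import numpy as np

# Placeholder data: replace with real 28x28 uint8 MNIST images and labels.
images = [np.random.randint(0, 256, (28, 28), dtype=np.uint8) for _ in range(20)]
labels = np.array([i % 10 for i in range(20)])

recognizer = cv2.face.LBPHFaceRecognizer_create()
recognizer.train(images, labels)

# Predict the digit shown in a new, already-cropped image.
label, confidence = recognizer.predict(images[0])
print(label, confidence)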
I have a project that is based on the SVM algorithm implemented by libsvm. Recently I decided to try several other classification algorithms, which is where scikit-learn comes into the picture.
The connection to scikit-learn was pretty straightforward: it supports the libsvm format via the load_svmlight_file routine, and its SVM implementation is based on the same libsvm.
When everything was done, I decided to check the consistency of the results by running libsvm directly and via scikit-learn, and the results were different. Among 18 measurements in the learning curves, 7 were different, and the differences are located at the small steps of the learning curve. The libsvm results seem much more stable, while the scikit-learn results show some drastic fluctuations.
The classifiers have exactly the same parameters of course.
I tried to check which libsvm version the scikit-learn implementation uses, but I didn't find it; the only thing I found was the libsvm.so file.
Currently I am using libsvm 3.21 version, and scikit-learn 0.17.1 version.
I would appreciate any help in addressing this issue.
size libsvm scikit-learn
1 0.1336239435355727 0.1336239435355727
2 0.08699516468193455 0.08699516468193455
3 0.32928301642777424 0.2117238289550198 #different
4 0.2835688734876902 0.2835688734876902
5 0.27846766962743097 0.26651875338163966 #different
6 0.2853854654662907 0.18898048915599963 #different
7 0.28196058132165136 0.28196058132165136
8 0.31473956032575623 0.1958710201604552 #different
9 0.33588303670653136 0.2101641630182972 #different
10 0.4075242509025311 0.2997807499800962 #different
15 0.4391771087975972 0.4391771087975972
20 0.3837789445609818 0.2713167833345173 #different
25 0.4252154334940311 0.4252154334940311
30 0.4256407777477492 0.4256407777477492
35 0.45314944605858387 0.45314944605858387
40 0.4278633233755064 0.4278633233755064
45 0.46174762022239796 0.46174762022239796
50 0.45370452524846866 0.45370452524846866
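For what it's worth, here is a minimal sketch of the scikit-learn side of such a comparison (the file name is a placeholder and the parameter values are examples; the point is to pin every hyperparameter explicitly so nothing silently falls back to a default that differs from the libsvm command line):
from sklearn.datasets import load_svmlight_file
from sklearn.svm import SVC

# Load the exact libsvm-format file used with the command-line tools.
X, y = load_svmlight_file("train.libsvm")   # placeholder file name

# Mirror the libsvm flags one-for-one: -c, -t 2, -g, -e, -h, -m.
clf = SVC(C=1.0, kernel="rbf", gamma=0.5, tol=0.001,
          shrinking=True, cache_size=100)
clf.fit(X, y)
print(clf.score(X, y))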
Here is the deal.
I am trying to make an SVM-based POS tagger.
The feature vectors for the SVM were created with the help of format converters.
Now here is a screenshot of the training file that I am using.
http://tinypic.com/r/n4fn2r/8
I have 25 labels for various POS tags. When I use the Java implementation or the command-line tools for prediction, I get the following results.
http://tinypic.com/r/2dtw5ky/8
I have tried all the available kernels, but they gave more or less the same results.
This is happening even when the training file is used as the testing file.
Please help me out here!
P.S. I cannot share more than two links, so here is a snippet of the model file:
svm_type c_svc
kernel_type rbf
gamma 0.000548546
nr_class 25
total_sv 431
rho -0.929467 1.01073 1.0531 1.03472 1.01585 0.953263 1.03027 -0.921365 0.984535 1.02796 1.01266 1.03374 0.949463 0.977925 0.986551 -0.920912 0.940926 -0.955562 0.975386 -0.981959 -0.884042 0.0516955 -0.980884 -0.966095 0.995091 1.023 1.01489 1.00308 0.948314 1.01137 -0.845876 0.968034 1.0076 1.00064 1.01335 0.942633 0.965703 0.979212 -0.861236 0.935055 -0.91739 0.970223 -0.97103 0.0743777 0.970321 -0.971215 -0.931582 0.972377 0.958193 0.931253 0.825797 0.954894 -0.972884 -0.941726 0.945077 0.922366 0.953999 -1.00503 0.840985 0.882229 -0.961742 0.791631 -0.984971 0.855911 -0.991528 -0.951211 -0.962096 -0.99213 -0.99708 -0.957557 -0.308987 -0.455442 -0.94881 -0.995319 -0.974945 -0.964637 -0.902152 -0.955258 -1.05287 -1.00614 -0.
Update: I just trained the SVM with svm_type c-SVC and a linear kernel, which gave a non-zero (although very poor) accuracy.
As mentioned by @Pedrom, parameter choice is absolutely crucial when training SVMs. I suggest you have a look at this practical guide. Also, 431 words is nowhere near enough to train a 25-class model; you will definitely need more data.
That said, 0% accuracy is indeed odd. Can you please show us the commands you are using to train and evaluate the model?
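To make the parameter-search advice concrete, here is a sketch using scikit-learn rather than the Java tools (the grid and the synthetic stand-in data are my own; substitute your real feature vectors and the 25 POS labels):
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

# Stand-in data only: replace with your real feature vectors and POS labels.
X, y = make_classification(n_samples=500, n_features=20, n_informative=10,
                           n_classes=25, random_state=0)

# Exponentially spaced grids over C and gamma, as the practical guide suggests.
param_grid = {"C": [2.0 ** k for k in range(-5, 16, 2)],
              "gamma": [2.0 ** k for k in range(-15, 4, 2)]}
search = GridSearchCV(SVC(kernel="rbf"), param_grid, cv=5)
search.fit(X, y)
print(search.best_params_, search.best_score_)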
I am using jlibsvm to do SVM regression. My data set is very small (42 samples). When I use this data set to create a model with epsilon-SVR and a sigmoid kernel, no support vectors are generated.
This is what I get in my model file :
svm_type epsilon_svr
kernel_type sigmoid
gamma 0.02380952425301075
coef0 0.0
label
rho -66.42803
total_sv 0
probA -1.0
SV
When I use some other data set from the libsvm website, I get a model file with support vectors just fine.
Can someone please suggest why no support vectors are being generated for my data set?
My data set file is formatted correctly, so there are no issues there.
This could mean that the best model found, given your data and the hyperparameters, is one that predicts the same value for all samples.
Are your samples unbalanced? What is the number of positive and negative samples? You might want to add a weighting to positive/negative samples to account for that.
It could also be that the samples are hard to fit given their structure and the kernel type. Have you tried a different kernel?
With only 42 data samples, maybe you could add them to your question and get better answers.
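One concrete thing worth checking, sketched here with scikit-learn's SVR for convenience (it wraps the same libsvm core; the data below is synthetic): in epsilon-SVR, any training point whose residual fits inside the epsilon-tube gets a zero coefficient, so if epsilon is large relative to the spread of your target values, a model with total_sv 0 (a constant prediction, which would match the rho-only model file above) is a legitimate outcome. Shrinking epsilon should then bring support vectors back:
import numpy as np
from sklearn.svm import SVR

rng = np.random.RandomState(0)
X = rng.uniform(-1, 1, size=(42, 3))        # 42 samples, as in the question
y = -66.4 + rng.normal(0, 0.01, size=42)    # targets with a tiny spread

for eps in (1.0, 0.1, 0.001):
    model = SVR(kernel="sigmoid", gamma=0.0238, coef0=0.0, epsilon=eps)
    model.fit(X, y)
    print("epsilon=%g -> %d support vectors" % (eps, len(model.support_)))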