How to train HOG and use my HOGDescriptor? - opencv

I want to train on my own data and use the HOG algorithm to detect pedestrians.
Right now I can use defaultHog.setSVMDetector(HOGDescriptor::getDefaultPeopleDetector()); in OpenCV for detection, but the results are not very good on my test video, so I want to train on my own database.
I have prepared 1000+ positive samples and 1000+ negative samples. They are cropped to size 50 * 100, and I have made the list files.
I have read some tutorials on the internet, but they are all quite complex, sometimes abstruse. Most of them analyze the source code and the algorithm of HOG, with only a few examples and little plain explanation.
Some instructions say that libsvm\windows\svm-train.exe can be used for training. Can anyone give an example based on 1000+ positive samples of size 50*100?
For example, as with haartraining, can we do it from OpenCV, like haartraining.exe –a –b with some parameters, and get a *.xml as a result which can then be used for people detection?
Or is there any other method for training and detection?
I prefer to know how to use it and the detailed procedure; the details of the algorithm are not important to me. I just want to implement it.
If anyone knows about it, please give me some tips.

I have provided some sample code and instructions for training your own HOG descriptor with OpenCV:
See https://github.com/DaHoC/trainHOG/wiki/trainHOG-Tutorial.
The algorithm is indeed too complex to cover briefly; the basic idea, however, is to:
1. Extract HOG features from negative and positive sample images of identical size and type.
2. Use the extracted feature vectors, along with their respective classes, to train an SVM classifier. In this step you can use svm-train.exe with a generated file of the correct format containing the feature vectors and their classes (or directly include and call the libsvm library from your sources).
3. Collapse the resulting SVM model and support vectors into a single descriptor vector that can be used with the OpenCV detector.
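To make steps 1 and 2 concrete, here is a minimal sketch using OpenCV's HOGDescriptor together with the newer cv::ml::SVM API (an assumption on my part; the tutorial above uses svmlight instead). The loadSampleList() helper and the file names are placeholders for your own data, and note that the HOG window size must be compatible with the block/cell geometry, so the 50x100 crops are resized to 48x96 here:

#include <opencv2/opencv.hpp>
#include <opencv2/ml.hpp>
#include <string>
#include <vector>

using namespace cv;
using namespace cv::ml;

// Hypothetical helper: read the image paths from one of your list files.
std::vector<std::string> loadSampleList(const std::string& listFile);

int main()
{
    // 48x96 window: (winSize - blockSize) must be a multiple of the block
    // stride, which is why the 50x100 crops are resized.
    HOGDescriptor hog(Size(48, 96), Size(16, 16), Size(8, 8), Size(8, 8), 9);

    Mat trainData;            // one HOG vector per row, CV_32F
    std::vector<int> labels;  // +1 = pedestrian, -1 = background

    for (int cls : {+1, -1}) {
        const std::string list = (cls > 0) ? "pos.lst" : "neg.lst";
        for (const std::string& path : loadSampleList(list)) {
            Mat img = imread(path, IMREAD_GRAYSCALE);
            if (img.empty()) continue;
            resize(img, img, hog.winSize);
            std::vector<float> desc;
            hog.compute(img, desc);
            trainData.push_back(Mat(desc).reshape(1, 1)); // append as 1 x N row
            labels.push_back(cls);
        }
    }

    Ptr<SVM> svm = SVM::create();
    svm->setType(SVM::C_SVC);
    svm->setKernel(SVM::LINEAR); // linear, so the model collapses to one vector
    svm->train(trainData, ROW_SAMPLE, Mat(labels));
    svm->save("my_people_detector.yml");
    return 0;
}

Step 3 then means converting the trained linear model into the single vector expected by HOGDescriptor::setSVMDetector; a sketch of that conversion is given under the svmlight question further down.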
Best regards

Related

Obtain negative results from a machine learning algorithm

I have a set of images of a particular object. I want to find out whether any of them have anomalies, using a machine learning algorithm. For example, if I have many photos of glasses, I want to find out if one of them is broken or has something anomalous. Something like this:
GOOD!!
BAD!!
(Obviously I will use the same kind of glasses...)
The problem is that I don't know every negative situation, so for training I have only positive images.
In other words, I want an algorithm that recognizes whether an image contains something different from the dataset. Do you have any suggestions?
In particular, is there a way to use a convolutional neural network?
What you are looking for is usually called anomaly, outlier, or novelty detection. You have lots of examples of what your data should look like, and you want to know when something doesn't look like your data.
A good approach for this problem, since you are using images, is to get a vectorized feature representation from a CNN pre-trained on ImageNet, and then run an anomaly detector on that feature set. An isolation forest should be an easy one to get working.
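If you want to stay in OpenCV for the feature-extraction part, a hedged sketch with the dnn module could look like the following. The model files and the layer name are assumptions; substitute whatever ImageNet-pretrained network you actually have. OpenCV itself has no isolation forest, so you would export the feature rows to a library that provides one:

#include <opencv2/opencv.hpp>
#include <opencv2/dnn.hpp>

using namespace cv;

// Returns a 1 x N feature row taken from a late layer of a pre-trained CNN.
// "deploy.prototxt", "model.caffemodel" and the layer name "pool5" are
// hypothetical; use the files and layer of your own network.
Mat extractFeatures(dnn::Net& net, const Mat& img)
{
    Mat blob = dnn::blobFromImage(img, 1.0, Size(224, 224),
                                  Scalar(104, 117, 123)); // ImageNet mean
    net.setInput(blob);
    Mat feat = net.forward("pool5");   // activations before the classifier
    // Flatten the output blob into a single row (continuous data assumed).
    Mat row(1, (int)feat.total(), CV_32F, feat.ptr<float>());
    return row.clone();
}

int main()
{
    dnn::Net net = dnn::readNetFromCaffe("deploy.prototxt", "model.caffemodel");
    Mat img = imread("glass_001.jpg");
    Mat features = extractFeatures(net, img);
    // Collect one such row per GOOD image, then fit the anomaly detector
    // (e.g. an isolation forest) on these rows in the library of your choice.
    return 0;
}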
This is a typical classification problem; I do not understand why you need a CNN for this.
My suggestion would be to build/train a classification model comprising only GOOD images of glasses. Here you would presumably have all kinds of glasses that are intact, with a regular shape. If the model encounters anything other than GOOD images, it will classify those as BAD. These so-called BAD images may include cracked/broken glasses with an irregular shape.
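One concrete way to train on GOOD images only (my assumption of a suitable tool, since the answer does not name one) is a one-class SVM, which OpenCV supports. A hedged sketch, assuming the feature vectors are already extracted:

#include <opencv2/opencv.hpp>
#include <opencv2/ml.hpp>

using namespace cv;
using namespace cv::ml;

// goodFeatures: one feature row per GOOD image, CV_32F.
Ptr<SVM> trainOneClass(const Mat& goodFeatures)
{
    Ptr<SVM> svm = SVM::create();
    svm->setType(SVM::ONE_CLASS);  // learn the support of the GOOD class only
    svm->setKernel(SVM::RBF);
    svm->setNu(0.05);              // rough bound on the outlier fraction; tune it
    svm->setGamma(1.0 / goodFeatures.cols);
    // One-class training ignores the labels, but train() still expects some.
    svm->train(goodFeatures, ROW_SAMPLE,
               Mat::ones(goodFeatures.rows, 1, CV_32S));
    return svm;
}

// A positive response means "looks like the training data"; verify the exact
// return convention against your OpenCV version.
bool looksGood(const Ptr<SVM>& svm, const Mat& featureRow)
{
    return svm->predict(featureRow) > 0;
}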
Another option that might work is an autoencoder.
Autoencoders are unsupervised neural networks with a bottleneck architecture that try to reconstruct their own input.
We could train a deep convolutional autoencoder with examples of good glasses so that it becomes specialized in reconstructing that type of image. You don't need to train the autoencoder with bad glasses.
I would therefore expect the trained autoencoder to produce a low error for good glasses and a high error for bad glasses. The error can be measured as the MSE between the reconstructed and original values (pixels).
From the trained autoencoder you can plot the MSEs for good vs. bad glasses to help you define the right threshold. Or you can try statistical thresholds such as avg + 2*std, median + 2*MAD, etc.
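A hedged sketch of that thresholding step, assuming the autoencoder's reconstructions are already available as cv::Mat images:

#include <opencv2/opencv.hpp>
#include <vector>

// Mean squared error between an image and its reconstruction.
double reconstructionMSE(const cv::Mat& original, const cv::Mat& reconstructed)
{
    cv::Mat a, b;
    original.convertTo(a, CV_32F);
    reconstructed.convertTo(b, CV_32F);
    return cv::norm(a, b, cv::NORM_L2SQR) / a.total();
}

// The avg + 2*std rule, computed over the MSEs of the GOOD training images;
// anything above the returned value would be flagged as anomalous.
double mseThreshold(const std::vector<double>& goodMSEs)
{
    cv::Scalar mean, stddev;
    cv::meanStdDev(goodMSEs, mean, stddev);
    return mean[0] + 2.0 * stddev[0];
}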
Autoencoder details:
http://ufldl.stanford.edu/tutorial/unsupervised/Autoencoders/
Deep autoencoder for images:
https://cds.cern.ch/record/2209085/files/Outlier%20detection%20using%20autoencoders.%20Olga%20Lyudchick%20(NMS).pdf

Using an svmlight model file in OpenCV

I have been working on training a pedestrian detection classifier based on HOG features. So far I have done the following:
a) Extracted HOG features of all files, i.e. positive and negative, and saved those features with a label, i.e. +1 for positive and -1 for negative, in a file.
b) Downloaded svmlight and extracted the binaries, i.e. svm_learn and svm_classify.
c) Passed the training file (the features file) to the svm_learn binary, which produced a model file for me.
d) Passed the test file to the svm_classify binary and got the results in a predictions file.
Now my question is: what to do next, and how? I think I know that I now need to use the model file (not the predictions file) in OpenCV for detecting pedestrians in video, but somewhere I read that OpenCV uses only 1 support vector, and I got 295 SVs. So how do I convert the model into the proper format and use it, and what further steps, if any, are compulsory?
I do appreciate your kindness!
It is not true that OpenCV (presumably you are talking about CvSVM) uses only one support vector. As pointed out by QED, what OpenCV does do is to optimize a linear SVM down to one support vector. I think the idea here is that the support vectors define the classification margin, but to do the actual classification only the separating hyperplane is needed and that can be defined with one vector.
Since you have an svmlight model file, and CvSVM can't read that, you have the following options:
train a CvSVM and save the model as a CvStatModel file that you can load later to get the support vectors.
write some code to convert an svmlight model file into a CvStatModel file (but for this you have to understand both formats).
get the source for svmlight, specifically the part that reads the model file, and integrate it into your OpenCV application.
You could use LIBSVM instead, but then you are faced with the same problems as with svmlight.
For ideas on how to combine the support vectors so you can use them with the HOG detector, see Training custom SVM to use with HOGDescriptor in OpenCV.
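If you retrain in OpenCV instead of svmlight (the first option), the conversion step looks roughly like this sketch, which follows the approach of OpenCV's own train_HOG.cpp sample (newer cv::ml::SVM API; for a linear kernel OpenCV compresses the model down to a single support vector):

#include <opencv2/opencv.hpp>
#include <opencv2/ml.hpp>
#include <cstring>
#include <vector>

using namespace cv;
using namespace cv::ml;

// Turn a trained *linear* SVM into the single vector that
// HOGDescriptor::setSVMDetector expects: the weights followed by -rho.
std::vector<float> getHogDetector(const Ptr<SVM>& svm)
{
    Mat sv = svm->getSupportVectors();  // 1 x N for a compressed linear SVM
    Mat alpha, svidx;
    double rho = svm->getDecisionFunction(0, alpha, svidx);

    CV_Assert(sv.rows == 1 && sv.type() == CV_32F);

    std::vector<float> detector(sv.cols + 1);
    std::memcpy(detector.data(), sv.ptr<float>(), sv.cols * sizeof(float));
    detector[sv.cols] = (float)-rho;    // bias term goes last
    return detector;
}

// Usage: hog.setSVMDetector(getHogDetector(svm)); hog.detectMultiScale(...);

With your 295 svmlight support vectors the same idea applies as long as the kernel is linear: the weight vector is the alpha-weighted sum of the support vectors, with the bias appended, though you have to watch the differing sign conventions between svmlight and OpenCV.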

SVM for HOG descriptors in opencv

I am trying to classify the yard digits on the football field. I am able to detect them well (by a different method). I have a minimal bounding box drawn around the tens-place digits 1, 2, 3, 4, 5. My goal is to classify them.
I've been trying to train an SVM classifier on HOG features I extract from the training set. A small subset of my training digits is here: http://ssadanand.imgur.com/all/
While training, I visualize my HOG descriptors and they look correct. I use a 64x128 training window and the other default parameters that OpenCV's HOGDescriptor uses.
Once I train my images (50 samples per class, 5 classes), I have a 250x3780 training matrix and a 1x250 label vector which holds the class label values, which I feed to a CvSVM object. Here is where I have a problem.
I tried using the default CvSVMParams() while using CvSVM: terrible performance when tested on the training set itself!
I tried customizing my CvSVMParams like this:
CvSVMParams params = CvSVMParams();
params.svm_type = CvSVM::EPS_SVR;
params.kernel_type = CvSVM::POLY;
params.C = 1; params.p = 0.5; params.degree = 1;
and different variations of these parameters, and my SVM classifier performs terribly even when I test on the training set!
Can somebody help me out with parameterizing my SVM for this 5-class classifier?
I don't understand which kernel and which SVM type I must use for this problem. Also, how in the world am I supposed to find out the values of C, p, and degree for my SVM?
I would assume this is an extremely easy classification problem, since all my objects are nicely bounded in a box with fairly good resolution, and the classes, i.e. the digits 1, 2, 3, 4, 5, are fairly distinct in appearance. I don't understand why my SVM is doing so poorly. What am I missing here?
A priori and without experimentation, it's very hard to give you some good parameters but I can give you some ideas.
First, you want to build a multi-class classifier, but you are using a regression algorithm. Not that you can't do that, but it is usually easier to start with C-SVM first.
Second, I would recommend using RBF instead of a polynomial kernel. Poly is very hard to get right, and RBF usually does a better job out of the box.
Third, I would play with several values of C. Don't be shy: try a bigger C (such as 100), which forces the algorithm to pick more SVs. It can lead to overfitting, but if you can't even make the algorithm learn the training set, that's not your immediate problem.
Fourth, I would reduce the dimension of the images at first, and then, when you have a more stable model, try that dimension again if needed.
I really recommend you read the LibSVM guide, which is very easy to follow: http://www.csie.ntu.edu.tw/~cjlin/papers/guide/guide.pdf
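Putting the first three suggestions together, a hedged starting point with the CvSVM API the question uses (C and gamma still need tuning):

CvSVMParams params;
params.svm_type = CvSVM::C_SVC;   // classification, not EPS_SVR regression
params.kernel_type = CvSVM::RBF;  // RBF instead of POLY
params.C = 100;                   // large C: make it learn the training set first
params.gamma = 0.01;              // needs tuning, e.g. by cross-validation
params.term_crit = cvTermCriteria(CV_TERMCRIT_ITER + CV_TERMCRIT_EPS, 1000, 1e-6);

CvSVM svm;
svm.train(trainData, labels, Mat(), Mat(), params);
// or let OpenCV grid-search C and gamma with built-in k-fold cross-validation:
// svm.train_auto(trainData, labels, Mat(), Mat(), params, 10);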
Hope it helps!
EDIT:
I forgot to mention that a good way to pick parameters for an SVM is to perform cross-validation: http://en.wikipedia.org/wiki/Cross-validation_(statistics)
http://www.autonlab.org/tutorials/overfit10.pdf
http://www.youtube.com/watch?v=hihuMBCuSlU
http://www.youtube.com/watch?v=m5StqDv-YlM
EDIT2:
I know it's silly, because it's in the title of the question, but I didn't realize you were using HOG descriptors until you pointed it out in the comments.

CvSVM.predict() gives 'NaN' output and low accuracy

I am using CvSVM to classify only two types of facial expression. I used LBP (Local Binary Pattern) based histograms to extract features from the images, and trained using cvSVM::train(data_mat, labels_mat, Mat(), Mat(), params), where:
data_mat is of size 200x3452, containing the normalized (0-1) feature histograms of 200 samples in row-major form, with 3452 features each (depending on the number of neighbourhood points);
labels_mat is the corresponding label matrix, containing only the two values 0 and 1.
The parameters are:
CvSVMParams params;
params.svm_type = CvSVM::C_SVC;
params.kernel_type = CvSVM::LINEAR;
params.C = 0.01;
params.term_crit = cvTermCriteria(CV_TERMCRIT_ITER, (int)1e7, 1e-7);
The problems are:
while testing I get very bad results (around 10%-30% accuracy), even after trying different kernels and the train_auto() function;
CvSVM::predict(test_data_mat, true) gives 'NaN' output.
I will greatly appreciate any help with this, it's got me stumped.
I suppose that your classes are linearly hard to separate, or non-separable, in the feature space you use. It may be better to apply PCA to your dataset before the classifier training step and to estimate the effective dimensionality of the problem.
I also think it would be useful to test your dataset with other classifiers. You can adapt the standard OpenCV example points_classifier.cpp for this purpose; it includes a lot of different classifiers with a similar interface that you can play with.
The SVM's generalization power is low. First reduce your data dimension by principal component analysis, then change your SVM kernel type to RBF.
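A hedged sketch of the PCA step both answers suggest, using cv::PCA on the 200x3452 training matrix (the 95% retained-variance figure is an assumption to tune):

// Fit PCA on the training matrix (rows = samples).
PCA pca(data_mat, Mat(), CV_PCA_DATA_AS_ROW, 0.95); // keep 95% of the variance
Mat reduced_train = pca.project(data_mat);
Mat reduced_test  = pca.project(test_data_mat);     // same projection for test
// Train and predict on the reduced matrices instead of the raw histograms;
// reduced_train.cols is an estimate of the effective dimensionality.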

Train a classifier with SVM-light for object detection

I am working with SVM-light. I would like to use SVM-light to train a classifier for object detection. I figured out the syntax to start training:
svm_learn example2/train_induction.dat example2/model
My problem: how can I build the "train_induction.dat" from a set of positive and negative pictures?
There are two parts to this question:
What feature representation should I use for object detection in images with SVMs?
How do I create an SVM-light data file with (whatever feature representation)?
For an intro to the first question, see Wikipedia's outline. Bag-of-words models based on SIFT (or sometimes SURF or HOG) features are fairly standard.
For the second, it depends a lot on what language / libraries you want to use. The features can be extracted from the images using something like OpenCV, vlfeat, or many others. You can then convert those features to the SVM-light format as described on the SVM-light homepage (no anchors on that page; search for "The input file").
If you update with what language and library you want to use, we can give more specific advice.
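For reference, SVM-light's input format is one sample per line: the label followed by sparse index:value pairs with 1-based indices, e.g. "+1 1:0.43 5:0.12". A small hedged helper (the feature extraction itself is whatever you chose above):

#include <fstream>
#include <vector>

// Append one sample in SVM-light format. Indices are 1-based and
// zero-valued features may be omitted (it is a sparse format).
void appendSvmLightSample(std::ofstream& out, int label,
                          const std::vector<float>& features)
{
    out << (label > 0 ? "+1" : "-1");
    for (size_t i = 0; i < features.size(); ++i)
        if (features[i] != 0.0f)
            out << ' ' << (i + 1) << ':' << features[i];
    out << '\n';
}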
