Haar Cascade Training - Positive Images Size - opencv

I'm trying to train my own Haar Cascade classifier for apples according to this article. I collected 1000 positive images and 500 negative images from the internet. The images have different sizes, and I cropped them to create the "info.txt" file. When I create samples like this,
createsamples.exe -info positive/info.txt -vec vector/applevector.vec -num 1000 -w 24 -h 24
there are the parameters -w and -h. What do they mean? Should I resize all my positive and negative images? I tried to train my classifier with the default parameters (-w 24 and -h 24), but its accuracy is very poor. Could that be related to these parameters? Thank you for any advice.
UPDATE
Here are some examples of my positive images. I collected them from the internet.
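On the -w and -h question above: in opencv_createsamples these set the width and height, in pixels, of every sample written to the .vec file, and the same values are then passed to training. With -info, each annotated region is rescaled to that size, so a 24x24 window distorts crops whose shape is far from square. A minimal sketch for picking values that match the annotations, assuming the usual info-file layout of `path count x y w h ...` with no spaces in the path; `parse_info_line`, `suggest_sample_size` and the example lines are hypothetical, not part of OpenCV:

```python
from statistics import median

def parse_info_line(line):
    """Return the list of (x, y, w, h) boxes found on one info-file line."""
    parts = line.split()
    count = int(parts[1])
    nums = [int(v) for v in parts[2:2 + 4 * count]]
    if len(nums) != 4 * count:
        raise ValueError(f"expected {count} boxes: {line!r}")
    return [tuple(nums[i:i + 4]) for i in range(0, len(nums), 4)]

def suggest_sample_size(lines, short_side=24):
    """Pick (w, h) whose ratio follows the median box aspect ratio."""
    ratios = [w / h
              for line in lines if line.strip()
              for (_x, _y, w, h) in parse_info_line(line)]
    ratio = median(ratios)
    if ratio >= 1.0:
        return round(short_side * ratio), short_side
    return short_side, round(short_side / ratio)

# Hypothetical annotations, made up for illustration only.
example = [
    "img/apple1.jpg 1 10 12 200 180",
    "img/apple2.jpg 1 0 0 90 100",
    "img/apple3.jpg 2 5 5 60 60 80 10 50 45",
]
w, h = suggest_sample_size(example)
print(f"-w {w} -h {h}")
```

If the suggested values come out close to 24x24, the sample size is probably not the main cause of the weak accuracy.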

Related

Training Haar classifier to detect letters/digits

I have a test answer sheet with circled answers and I am trying to detect digits/letters using OpenCV. I used a 10x10 image of '2' as the positive image and 44 other parts of the answer sheet as negatives to create a Haar classifier myself.
Apparently I am doing something wrong, as my classifier fails to detect the original '2'.
$opencv_createsamples -vec a_desc.bin -info positive.txt -bg negative.txt -num 1 -w 10 -h 10
$opencv_traincascade -data classifiers -vec a_desc.bin -bg negative.txt -numStages 20 -numPos 1 -numNeg 46 -w 10 -h 10
....
===== TRAINING 4-stage =====
<BEGIN
POS count : consumed 1 : 1
NEG count : acceptanceRatio 0 : 0
Required leaf false alarm rate achieved. Branch training terminated.
What am I doing wrong:
too few negatives
too few positives
size of negatives and positives must match
all images must be in the same format (png | jpg | gif)
Basically, what is the expected outcome of the following approach:
A page from a random book is selected
We create a 10x10 image of the letter 'e'
We create 20 negative images of other letters/digits
We create a classifier using these two datasets
Now we try to classify the original page.
Are we supposed to find ALL occurrences of 'e' and some false positives?

OpenCV Haar Classifier: training stops prematurely

I have been trying to train image databases to detect faces using Haar cascades. I have made 2 attempts:
1) I have used the following database for positive images:
http://robotics.csie.ncku.edu.tw/Databases/FaceDetect_PoseEstimate.htm#Our_Database_ (6660 images)
For negative images I have used this database:
https://github.com/sonots/tutorial-haartraining/tree/master/data/negatives (3300 images)
I have used this command to create the samples:
opencv_createsamples -info info.dat -vec samples2.vec -w 32 -h 24 -num 6660
I have used this command to train the data:
opencv_traincascade -data ./classifier3 -vec samples2.vec -bg bg.txt -numPos 6000 -numNeg 12000 -numStages 30 -precalcValBufSize 5120 -precalcIdxBufSize 5120 -numThreads 12 -acceptanceRatioBreakValue 10e-5 -w 32 -h 24 -minHitRate 0.99 -maxFalseAlarmRate 0.5 -mode ALL
The training goes on up to stage 9. Then the acceptanceRatio break value is crossed. (The required acceptanceRatio for the model has been reached to avoid over-fitting of training data. Branch training terminated.)
I don't understand the issue here. I have only used the recommended values for the parameters. I tried changing the minHitRate to 0.95, yet the result is the same. I can think of some potential reasons:
i) I had used the positive images directly without cropping. But I don't
think that should be an issue, as the background is completely plain.
ii) The image database contains faces in different poses. That could lead
to complications while training. Is it a good idea to train faces
under different poses using the same cascade classifier? Or should I
use different classifiers for each pose?
iii) My negative images might be too different compared to the positive
images. Is that the case? If yes, what kind of negative images should
I be looking for?
I tried testing the cascade.xml file on a few sample images, but nothing is detected at all.
2) Keeping in mind potential reason i), I used this already-cropped database for positive images: http://conradsanderson.id.au/lfwcrop/ (around 13000 images)
But the problem still persists. This time it trains up to stage 11. In this case I used -numPos 8000 and -numNeg 20000 (I increased the ratio to give the training more negative samples), with -w 24 and -h 24.
Can anyone please guide me here?
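The stage at which this stops can be put in rough numbers. A back-of-the-envelope sketch, assuming each stage passes at most a fraction `-maxFalseAlarmRate` of the negatives that reach it, so that the acceptanceRatio after n stages is at most that rate to the power n; `stages_until_break` is a hypothetical helper, not part of OpenCV:

```python
def stages_until_break(max_false_alarm_rate, break_value):
    """Smallest n for which max_false_alarm_rate ** n drops below break_value."""
    n = 0
    ratio = 1.0
    while ratio >= break_value:
        ratio *= max_false_alarm_rate
        n += 1
    return n

# 10e-5 is the same number as 1e-4.
print(stages_until_break(0.5, 10e-5))   # every stage exactly at the 0.5 limit
print(stages_until_break(0.35, 10e-5))  # stages that reject more than required
```

Under that assumption, -acceptanceRatioBreakValue 10e-5 is crossed after roughly 14 stages even when every stage sits exactly at the 0.5 limit, and sooner when stages reject more negatives than required. A request for 30 stages would then not complete with this break value, which is consistent with the runs ending at stage 9 and stage 11.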

Infinite loop: Haar, LBP, HOG traincascade of opencv stuck

I am trying to build a classifier to detect faces in thermal images, so I tried training Haar, LBP and HOG classifiers. I am working with OpenCV 2.4.8 on Windows.
opencv_traincascade.exe -data haarcascades -vec pos.vec -bg neg.txt -numPos 250 -numStages 24 -numNeg 900 -w 24 -h 24
I have 307 positive samples in total. The negative samples are of size 75x75. In each of the three cases the training gets stuck at a particular stage: earlier for Haar (stage 12) and later for LBP (stage 14/15). I reduced the number of negatives (down to 200), but that only means the training gets stuck at a later stage. The training has not progressed for 2 days. No negatives are being consumed and the command window looks like this:
===== TRAINING 14-stage =====
<BEGIN
POS count : consumed 255 : 262
Also:
What do "POS count : consumed" and "NEG count" signify?
When I reduce the minHitRate to, say, 0.7, why does the number of POS consumed increase?
Please let me know what I am doing wrong.
Thanks.
I had a similar problem myself. At each stage the classifier takes as negatives those examples that the previous stages classified as positive. What happens here is that none of the negative samples are classified as positive any more, so the code goes into an infinite loop trying to find one. I solved this by changing the source code so that the algorithm terminates when it can't find any negative example and just uses the previous stages for the classifier.
If you don't want to change the code, try adding more negative examples or reducing the number of stages.
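The point about negatives can be made concrete. A rough sketch, assuming every earlier stage lets through at most a fraction f of the negative windows it sees (f being `-maxFalseAlarmRate`, 0.5 by default), so that gathering numNeg fresh false positives for stage n means examining on the order of numNeg / f^n windows; `windows_to_scan` is a hypothetical helper, not an OpenCV function:

```python
def windows_to_scan(num_neg, false_alarm_rate, stage):
    """Estimated negative windows examined to collect num_neg samples at a stage."""
    return num_neg / (false_alarm_rate ** stage)

# -numNeg 900 as in the question; stage numbers chosen for illustration.
for stage in (5, 12, 14):
    print(stage, f"{windows_to_scan(900, 0.5, stage):.3g}")
```

The estimate grows by a factor of 1/f per stage, so a small pool of 75x75 negative images can run out of windows that still fool the cascade, at which point the search for new negatives never finishes.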
Count consumed is the number of positive and negative images that are used in each stage. You also need to use more positive and negative images, around 1000 positives and 2000 negatives, to get a good result.
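On the minHitRate part of the question: a rule of thumb that is often quoted for opencv_traincascade (it does not come from this thread, so treat it as an assumption) says the .vec file should hold at least numPos + (numStages - 1) * (1 - minHitRate) * numPos samples, plus any samples rejected as background right away. Each stage may drop up to a fraction (1 - minHitRate) of its positives, and replacements are drawn from the .vec file. A sketch with a hypothetical helper `min_vec_samples`:

```python
import math

def min_vec_samples(num_pos, num_stages, min_hit_rate, rejected_as_background=0):
    """Rule-of-thumb lower bound on the number of samples the .vec file needs."""
    lost_per_stage = (1.0 - min_hit_rate) * num_pos
    total = num_pos + (num_stages - 1) * lost_per_stage + rejected_as_background
    # Round first so floating-point noise does not push ceil up by one.
    return math.ceil(round(total, 6))

# Values from the command above; 0.995 is the default -minHitRate.
print(min_vec_samples(250, 24, 0.995))
print(min_vec_samples(250, 24, 0.7))
```

Under this estimate, lowering minHitRate to 0.7 raises the number of positives drawn over the whole run far beyond the 307 available, which would explain why the POS consumed figure climbs.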

TrainCascade stuck on getting new negatives

I'm working with OpenCV 2.4.7 on Windows. I'm using TrainCascade to train a new Haar cascade for eyeglasses using the following command:
opencv_traincascade -data trainCascade20 -vec vector3.vec -bg infofile3.txt -numStages 40 -minHitRate 0.999 maxFalseAlarmRate 0.5 -numPos 170 -numNeg 1000 -w 20 -h 20 -mode ALL -precalcValBufSize 1024 -precalcIdxBufSize 1024
It's stuck (or progressing very slowly) on stage 24, in the phase of getting new negatives. The negative images file "infofile3.txt" contains about 12K negative images.
Can someone please explain why it's progressing so slowly and what I can do to make it progress (a lot) faster?
Thanks in advance,
Gil.
Around 24 hours sounds normal to me. Haar training can take days, depending on the size and number of samples and, of course, on the computer. The longest training I have run took approximately a week, for hand detection.
If you are really worried and want to check whether the Haar training is still ongoing, you can try to generate an intermediate Haar cascade xml file from the data available so far. If you are able to generate the xml file, that shows the training is still running (albeit slowly) and not stuck.
As for how to improve the Haar training speed, the only solution I know of or have used before is parallelization.
A quick search on Google leads to a few links; here's one of them: http://www.computer-vision-software.com/blog/2009/06/parallel-world-of-opencv/
I have used such methods, and they are pretty efficient at cutting the time taken to train the Haar cascade, so I hope this suits you well. Do try my suggestion of generating an intermediate xml file from the currently available data first, though. If you need anything else, leave a comment and I will try to get back to you soon. Cheers.

Unable to create positive samples from a limited set of samples for Haar training

I am building a Haar classifier. I have a set of 109 positive samples and 3000 negative samples. To increase my number of positive samples (to say 600), I try using the following command:
opencv_createsamples -vec out.vec -w 24 -h 24 -bg bg.txt -num 600 -info positives.dat
But I get the following error message:
positives.dat(109) : parse error
Done. Created 108 samples
How can I "force" opencv to produce the 600 samples from those 109 I have?
OpenCV's default function for creating samples can only create as many as you have in your info file. Also, you should know that if you use -info as opposed to -img, it only resizes the images to the h and w you specified and converts them to grayscale. It doesn't actually apply any transforms to them or superimpose them on background images. I'm not entirely sure what the point of the function is, really...
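Separately from the sample count, the `parse error` message names a position in positives.dat, so it may be worth ruling out a malformed entry around that line. A small sketch for listing lines that do not fit the usual `path count x y w h ...` layout, assuming whitespace-separated fields and non-negative integer coordinates; `find_bad_info_lines` and the sample lines are hypothetical:

```python
def find_bad_info_lines(lines):
    """Return (line_number, reason) for every line that does not parse."""
    problems = []
    for number, line in enumerate(lines, start=1):
        parts = line.split()
        if not parts:
            problems.append((number, "empty line"))
            continue
        if len(parts) < 2 or not parts[1].isdigit():
            problems.append((number, "missing object count"))
            continue
        count = int(parts[1])
        coords = parts[2:]
        if len(coords) != 4 * count:
            problems.append(
                (number, f"expected {4 * count} numbers, found {len(coords)}"))
        elif not all(c.isdigit() for c in coords):
            problems.append((number, "non-integer coordinate"))
    return problems

# Made-up entries: line 2 declares two boxes but lists one, line 3 is empty.
sample = [
    "pos/img001.jpg 1 0 0 24 24",
    "pos/img002.jpg 2 0 0 24 24",
    "",
]
for number, reason in find_bad_info_lines(sample):
    print(f"line {number}: {reason}")
```

To check a real file, the lines can be read with `open("positives.dat").read().splitlines()` and passed to the same function.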

Resources