opencv_traincascade - samplOpenCV Error: Bad argument - opencv

Background: I am trying to train my own OpenCV Haar classifier for face detection. I am working on a VM with Ubuntu 16.04. My working directory has 2 sub-directories: face, containing 2429 positive images, and non-face, containing 4548 negative images. All images are grayscale PNGs, 19 pixels wide and 19 pixels high. I have generated a file positives.info that contains the absolute path to every positive image followed by " 1 0 0 18 18", like so:
/home/user/ML-Trainer/face/face1.png 1 0 0 18 18
/home/user/ML-Trainer/face/face2.png 1 0 0 18 18
/home/user/ML-Trainer/face/face3.png 1 0 0 18 18
and another file, negatives.txt, that contains the absolute path to every negative image:
/home/user/ML-Trainer/non-face/other1.png
/home/user/ML-Trainer/non-face/other2.png
/home/user/ML-Trainer/non-face/other3.png
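For reference, an annotation file like positives.info can be produced with a few lines of Python. This is only a sketch: the helper name build_info_lines is made up for illustration, and the default box values are copied from the lines above, not checked against the images.

```python
def build_info_lines(image_paths, box=(0, 0, 18, 18)):
    """Return one annotation line per image: '<path> 1 <x> <y> <w> <h>'.

    The leading 1 is the number of objects in the image. The default box
    is the one used in the question (an assumption, not verified).
    """
    x, y, w, h = box
    return [f"{path} 1 {x} {y} {w} {h}" for path in image_paths]


lines = build_info_lines(["/home/user/ML-Trainer/face/face1.png"])
print(lines[0])  # /home/user/ML-Trainer/face/face1.png 1 0 0 18 18
```

The negatives file needs no such helper, since it holds one bare path per line.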
First I ran the following command:
opencv_createsamples -info positives.info -vec positives.vec -num 2429 -w 19 -h 19
and I get positives.vec as expected. I then created an empty directory data and ran the following:
opencv_traincascade -data data -vec positives.vec -bg negatives.txt -numPos 2429 -numNeg 4548 -numStages 10 -w 19 -h 19 &
It seems to run smoothly:
PARAMETERS:
cascadeDirName: data
vecFileName: positives.vec
bgFileName: negatives.txt
numPos: 2429
numNeg: 4548
numStages: 10
precalcValBufSize[Mb] : 1024
precalcIdxBufSize[Mb] : 1024
acceptanceRatioBreakValue : -1
stageType: BOOST
featureType: HAAR
sampleWidth: 19
sampleHeight: 19
boostType: GAB
minHitRate: 0.995
maxFalseAlarmRate: 0.5
weightTrimRate: 0.95
maxDepth: 1
maxWeakCount: 100
mode: BASIC
Number of unique features given windowSize [19,19] : 63960
===== TRAINING 0-stage =====
<BEGIN
POS count : consumed 2429 : 2429
NEG count : acceptanceRatio 4548 : 1
Precalculation time: 13
+----+---------+---------+
| N | HR | FA |
+----+---------+---------+
| 1| 1| 1|
+----+---------+---------+
| 2| 1| 1|
+----+---------+---------+
| 3| 0.998765| 0.396218|
+----+---------+---------+
END>
Training until now has taken 0 days 0 hours 1 minutes 7 seconds.
But then I get the following error:
===== TRAINING 1-stage =====
<BEGIN
POS current samplOpenCV Error: Bad argument (Can not get new positive sample. The most possible reason is insufficient count of samples in given vec-file.
) in get, file /home/user/opencv-3.4.0/apps/traincascade/imagestorage.cpp, line 158
terminate called after throwing an instance of 'cv::Exception'
what(): /home/user/opencv-3.4.0/apps/traincascade/imagestorage.cpp:158: error: (-5) Can not get new positive sample. The most possible reason is insufficient count of samples in given vec-file.
in function get
How do I solve this:
samplOpenCV Error: Bad argument
Any help would be greatly appreciated.
EDIT:
I have modified -numPos to a smaller number: 2186 (0.9 * 2429), I did this after reading this answer and it got me to
===== TRAINING 3-stage =====
and it gives me the same error. How should I tune the parameters for the opencv_createsamples command?
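Since the error message points at the sample count in the vec-file, it may help to check how many samples the file actually declares. The sketch below assumes the .vec header is 12 bytes: two little-endian 32-bit integers (sample count, then width * height) followed by two unused 16-bit values. That layout is my reading of the createsamples source, not something stated in this thread, and read_vec_header is a hypothetical helper name.

```python
import struct


def read_vec_header(path):
    """Return (sample_count, sample_size) from the header of a .vec file.

    Assumed layout: int32 count, int32 width*height, then two unused
    int16 values, all little-endian (12 bytes in total).
    """
    with open(path, "rb") as f:
        header = f.read(12)
    if len(header) < 12:
        raise ValueError("file too short to contain a vec header")
    count, size, _, _ = struct.unpack("<iihh", header)
    return count, size
```

For the file in this question, read_vec_header("positives.vec") would be expected to report 2429 samples of size 361 (19 * 19) if the layout assumption holds.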

I eventually managed to make it work by respecting this formula:
vec-file >= (numPos + (numStages-1) * (1-minHitRate) * numPos + S)
vec-file - the number of samples in the vec-file
numPos - the number of positive samples used to train each stage
numStages - the number of stages the cascade classifier will have after training
S - the count of all samples skipped from the vec-file (over all stages)
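The formula above can be rearranged to give the largest numPos that a given vec-file supports. The following is a minimal sketch; max_num_pos is a made-up helper name, and S (skipped) has to be guessed because it is not known before training.

```python
import math


def max_num_pos(vec_count, num_stages, min_hit_rate, skipped=0):
    """Largest numPos satisfying
    vec_count >= numPos + (num_stages - 1) * (1 - min_hit_rate) * numPos + skipped.
    """
    per_pos = 1 + (num_stages - 1) * (1 - min_hit_rate)
    return math.floor((vec_count - skipped) / per_pos)


# Values from the question: 2429 samples, 10 stages, minHitRate 0.995, S = 0.
print(max_num_pos(2429, 10, 0.995))  # 2324
```

With S = 0 this is only an upper bound. The question reports that 2186 still failed at stage 3, which suggests that S was not negligible in that run.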

Related

Haar cascade resulting file is too small

I am trying to train a cascade to detect an area with specifically structured text (MRZ).
I've gathered 200 positive samples and 572 negative samples.
Training went as follows:
opencv_traincascade.exe -data cascades -vec vector/vector.vec -bg bg.txt -numPos 200 -numNeg 572 -numStages 3 -precalcValBufSize 2048 -precalcIdxBufSize 2048 -featureType LBP -mode ALL -w 400 -h 45 -maxFalseAlarmRate 0.8 -minHitRate 0.9988
PARAMETERS:
cascadeDirName: cascades
vecFileName: vector/vector.vec
bgFileName: bg.txt
numPos: 199
numNeg: 572
numStages: 3
precalcValBufSize[Mb] : 2048
precalcIdxBufSize[Mb] : 2048
acceptanceRatioBreakValue : -1
stageType: BOOST
featureType: LBP
sampleWidth: 400
sampleHeight: 45
boostType: GAB
minHitRate: 0.9988
maxFalseAlarmRate: 0.8
weightTrimRate: 0.95
maxDepth: 1
maxWeakCount: 100
Number of unique features given windowSize [400,45] : 8778000
===== TRAINING 0-stage =====
<BEGIN
POS count : consumed 199 : 199
NEG count : acceptanceRatio 572 : 1
Precalculation time: 26.994
+----+---------+---------+
| N | HR | FA |
+----+---------+---------+
| 1| 1| 1|
+----+---------+---------+
| 2| 1|0.0244755|
+----+---------+---------+
END>
Training until now has taken 0 days 0 hours 36 minutes 35 seconds.
===== TRAINING 1-stage =====
<BEGIN
POS count : consumed 199 : 199
NEG count : acceptanceRatio 0 : 0
Required leaf false alarm rate achieved.
Branch training terminated.
The process ran for ~35 minutes and produced a 2 kB file with only 45 lines, which seems too small for a good cascade.
Needless to say, it doesn't detect the needed area.
I tried to tune the arguments but to no avail.
I know that it is better to use a larger set of samples, but I think that this number of samples should still produce a somewhat reasonable, if less accurate, result.
Is a Haar cascade a good approach for detecting areas with specific text (MRZ)?
If so, how can better accuracy be achieved?
Thanks in advance.
You want to produce 3 stages with a maximum false alarm rate of 0.8 per stage. This means that after 3 stages the classifier will have a false alarm rate of at most 0.8^3 = 0.512. But after your first stage the classifier already reaches a false alarm rate of 0.0244755, which is much better than your final aim (0.512), so the classifier is already good enough and does not need any more stages.
If that's not fine for you, increase numStages or decrease maxFalseAlarmRate so that you don't reach the "final quality" within your first stage.
You will probably have to collect more samples, and samples that represent the environment better. Reaching such low false alarm rates this quickly is typically a sign of bad training data (too simple or too similar?).
I can't tell you whether Haar cascades are appropriate for solving your task.
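The arithmetic in this answer can be checked with a short sketch. The function names are hypothetical, and the stopping rule is simplified to "observed rate at or below maxFalseAlarmRate^numStages", which is how the answer describes it, not a copy of the trainer's code.

```python
def overall_false_alarm_target(max_false_alarm_rate, num_stages):
    """Upper bound on the cascade's false alarm rate after num_stages stages."""
    return max_false_alarm_rate ** num_stages


def target_already_met(observed_rate, max_false_alarm_rate, num_stages):
    """True if the observed rate is at or below the overall target."""
    return observed_rate <= overall_false_alarm_target(max_false_alarm_rate, num_stages)


print(round(overall_false_alarm_target(0.8, 3), 3))  # 0.512
print(target_already_met(0.0244755, 0.8, 3))         # True: training stops early
print(target_already_met(0.0244755, 0.8, 20))        # False: more stages are trained
```

The last line uses numStages 20 purely as an illustration of "increase numStages"; that value is not taken from the question.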

OpenCV train cascade giving error "Train dataset for temp stage can not be filled."

I've searched for this online and it is a pretty common error, but I've tried the suggested solutions to no avail. My cmd log is:
C:\Users\kosyn_000\Dropbox\OpenCVtrainingdata>opencv_traincascade -data my_trained -vec positives.vec -bg negativedata.txt -numPos 30 -numNeg 76 -numStages 15 -minHitRate 0.995 -w 197 -h 197 -featureType LBP -precalcValBufSize 1024 -precalcIdxBufSize 1024
PARAMETERS:
cascadeDirName: my_trained
vecFileName: positives.vec
bgFileName: negativedata.txt
numPos: 30
numNeg: 76
numStages: 15
precalcValBufSize[Mb] : 1024
precalcIdxBufSize[Mb] : 1024
acceptanceRatioBreakValue : -1
stageType: BOOST
featureType: LBP
sampleWidth: 197
sampleHeight: 197
boostType: GAB
minHitRate: 0.995
maxFalseAlarmRate: 0.5
weightTrimRate: 0.95
maxDepth: 1
maxWeakCount: 100
Number of unique features given windowSize [197,197] : 41409225
===== TRAINING 0-stage =====
<BEGIN
POS count : consumed 30 : 30
Train dataset for temp stage can not be filled. Branch training terminated.
Cascade classifier can't be trained. Check the used training parameters.
C:\Users\kosyn_000\Dropbox\OpenCVtrainingdata>
and my negativedata.txt file has 76 lines of info in the form:
negatives/1411814567410.jpg 1 2 2 199 199
negatives/20131225_192702.jpg 1 2 2 199 199
negatives/20131225_193214.jpg 1 2 2 199 199
negatives/20131225_193325.jpg 1 2 2 199 199
negatives/20131225_193327.jpg 1 2 2 199 199
negatives/20131225_193328.jpg 1 2 2 199 199
Please can someone help me pinpoint the issue because I'm still not sure why I'm getting this error. I'm doing this on a windows system. Thank you.
Found the issue: apparently the bg file shouldn't contain annotations (the object count and bounding box after each path), only the image paths. My file is now in the form:
C:\Users\kosyn_000\Dropbox\OpenCVtrainingdata\negatives/ff.JPG
C:\Users\kosyn_000\Dropbox\OpenCVtrainingdata\negatives/fifa.JPG
C:\Users\kosyn_000\Dropbox\OpenCVtrainingdata\negatives/fred.JPG
C:\Users\kosyn_000\Dropbox\OpenCVtrainingdata\negatives/IMG-20140718-WA0008-1.jpg
C:\Users\kosyn_000\Dropbox\OpenCVtrainingdata\negatives/IMG-20150102-WA0013.jpg
C:\Users\kosyn_000\Dropbox\OpenCVtrainingdata\negatives/IMG-20150120-WA0005.jpg
C:\Users\kosyn_000\Dropbox\OpenCVtrainingdata\negatives/IMG_20140109_012313.jpg
C:\Users\kosyn_000\Dropbox\OpenCVtrainingdata\negatives/IMG_20140405_205621.jpg
C:\Users\kosyn_000\Dropbox\OpenCVtrainingdata\negatives/IMG_20140405_214225.jpg
C:\Users\kosyn_000\Dropbox\OpenCVtrainingdata\negatives/IMG_20140405_214225_transparent.png
C:\Users\kosyn_000\Dropbox\OpenCVtrainingdata\negatives/IMG_20140405_214225_transparent_small.png
and it outputted my xml file fine; albeit taking a bit of time. Lol I can't believe it was something so simple holding me back.
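If the negatives list was generated with annotations by mistake, the trailing numbers can be stripped with a small script. This is a sketch under the assumption that every annotation is a run of whitespace-separated integers at the end of the line; strip_annotations is a made-up helper name.

```python
import re

# Trailing run of whitespace-separated integers, e.g. " 1 2 2 199 199".
_TRAILING_NUMBERS = re.compile(r"(?:\s+\d+)+\s*$")


def strip_annotations(line):
    """Return only the image path from a line of an annotated list."""
    return _TRAILING_NUMBERS.sub("", line.strip())


print(strip_annotations("negatives/1411814567410.jpg 1 2 2 199 199"))
# negatives/1411814567410.jpg
```

Lines that already contain only a path are returned unchanged, so the function can be applied to every line of the file.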

OpenCV error "Train dataset for temp stage can not be filled. Branch training terminated." after starting training stage-3

While searching for this error I only found cases where it happened right at the beginning. In my case, it occurred at the beginning of the 3-stage training.
I'm using OpenCV 2.4.10, with OpenMP enabled. Below is the command line I used and the output. Does anyone know how to solve this problem?
root#6b0f88eaadb9:/opt/ocr-samples3/train-detector# opencv_traincascade -data ./out// -vec ./positive/vecfile.vec -bg ./negative/negative.txt -w 247 -h 80 -numPos 78 -numNeg 1325 -featureType LBP -numStages 8
libdc1394 error: Failed to initialize libdc1394
Training parameters are loaded from the parameter file in data folder!
Please empty the data folder if you want to use your own set of parameters.
PARAMETERS:
cascadeDirName: ./out//
vecFileName: ./positive/vecfile.vec
bgFileName: ./negative/negative.txt
numPos: 78
numNeg: 1325
numStages: 8
precalcValBufSize[Mb] : 256
precalcIdxBufSize[Mb] : 256
stageType: BOOST
featureType: LBP
sampleWidth: 247
sampleHeight: 80
boostType: GAB
minHitRate: 0.995
maxFalseAlarmRate: 0.5
weightTrimRate: 0.95
maxDepth: 1
maxWeakCount: 100
===== TRAINING 0-stage =====
<BEGIN
POS count : consumed 78 : 78
NEG count : acceptanceRatio 1325 : 1
Precalculation time: 6
+----+---------+---------+
| N | HR | FA |
+----+---------+---------+
| 1| 1| 1|
+----+---------+---------+
| 2| 1|0.0211321|
+----+---------+---------+
END>
Training until now has taken 0 days 1 hours 22 minutes 40 seconds.
===== TRAINING 1-stage =====
<BEGIN
POS count : consumed 78 : 78
NEG count : acceptanceRatio 1325 : 0.0928456
Precalculation time: 10
+----+---------+---------+
| N | HR | FA |
+----+---------+---------+
| 1| 1| 1|
+----+---------+---------+
| 2| 1| 1|
+----+---------+---------+
| 3| 1|0.0324528|
+----+---------+---------+
END>
Training until now has taken 0 days 3 hours 19 minutes 57 seconds.
===== TRAINING 2-stage =====
<BEGIN
POS count : consumed 78 : 78
NEG count : acceptanceRatio 1325 : 0.00679104
Precalculation time: 7
+----+---------+---------+
| N | HR | FA |
+----+---------+---------+
| 1| 1| 1|
+----+---------+---------+
| 2| 1| 0.08|
+----+---------+---------+
END>
Training until now has taken 0 days 4 hours 38 minutes 25 seconds.
===== TRAINING 3-stage =====
<BEGIN
POS count : consumed 78 : 78
Train dataset for temp stage can not be filled. Branch training terminated.
How big are your negative images? I had the same problem. The error says that there are not enough negative images. Your negative images do not have to be the same size as your positives. What I did was take the original negative images, put a black rectangle over the object, and start again.
The training function searches the negative images for new negative samples. Once it has used a negative sample, it will not use it again. Each sample is chosen randomly and has the same size as your positive image.
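To get a feel for how quickly the negatives run out, the acceptanceRatio values in the log can be turned into a rough count of candidate windows. The sketch assumes that acceptanceRatio is the number of accepted negatives divided by the number of windows examined, which is my reading of the log and is not stated in the thread; windows_examined is a hypothetical helper.

```python
def windows_examined(num_neg, acceptance_ratio):
    """Estimate how many candidate windows were examined to collect num_neg negatives."""
    if acceptance_ratio <= 0:
        raise ValueError("acceptance ratio must be positive")
    return round(num_neg / acceptance_ratio)


# numNeg and the per-stage acceptanceRatio values from the log in the question.
for stage, ratio in enumerate([1, 0.0928456, 0.00679104]):
    print(stage, windows_examined(1325, ratio))
```

Under that assumption, stage 2 already needed roughly 195,000 windows to find 1325 negatives that the earlier stages still accepted, so stage 3 would need even more than the negative images could supply.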

Train own object detector with opencv

I am on Linux with OpenCV 3.0 Alpha.
I have searched a lot on the web about training my own object detector,
but when I follow the instructions, it doesn't work for me. Here's my situation:
First I downloaded 550 positive samples, 100px wide and 40px high,
and I also got 550 negative samples, 100px wide and 40px high.
Then I created positives.info and negatives.txt.
I am sure the info file and txt file have the correct content and the images are good enough.
Then I created the sample vec file:
opencv_createsamples -info positives.info -num 550 -w 48 -h 24 -vec cars.vec
and it comes out:
Info file name: positives.info
Img file name: (NULL)
Vec file name: cars.vec
BG file name: (NULL)
Num: 550
BG color: 0
BG threshold: 80
Invert: FALSE
Max intensity deviation: 40
Max x angle: 1.1
Max y angle: 1.1
Max z angle: 0.5
Show samples: FALSE
Width: 48
Height: 24
Create training samples from images collection...
Done. Created 550 samples
It seems OK with the 550 samples.
Then I trained the cascade:
opencv_traincascade -data data -vec cars.vec -bg negatives.txt -numPos 500 -numNeg 500 -numStages 2 -w 48 -h 24 -featureType LBP
and the output is:
PARAMETERS:
cascadeDirName: data
vecFileName: cars.vec
bgFileName: negatives.txt
numPos: 500
numNeg: 500
numStages: 2
precalcValBufSize[Mb] : 256
precalcIdxBufSize[Mb] : 256
stageType: BOOST
featureType: LBP
sampleWidth: 48
sampleHeight: 24
boostType: GAB
minHitRate: 0.995
maxFalseAlarmRate: 0.5
weightTrimRate: 0.95
maxDepth: 1
maxWeakCount: 100
===== TRAINING 0-stage =====
<BEGIN
POS count : consumed 500 : 500
NEG count : acceptanceRatio 500 : 1
Precalculation time: 0
+----+---------+---------+
| N | HR | FA |
+----+---------+---------+
| 1| 1| 1|
+----+---------+---------+
| 2| 1| 1|
+----+---------+---------+
| 3| 1| 0.414|
+----+---------+---------+
END>
Training until now has taken 0 days 0 hours 0 minutes 3 seconds.
===== TRAINING 1-stage =====
<BEGIN
POS count : consumed 500 : 500
NEG count : acceptanceRatio 500 : 0.578035
Precalculation time: 0
+----+---------+---------+
| N | HR | FA |
+----+---------+---------+
| 1| 1| 1|
+----+---------+---------+
| 2| 1| 1|
+----+---------+---------+
| 3| 1| 0.46|
+----+---------+---------+
END>
Training until now has taken 0 days 0 hours 0 minutes 6 seconds.
And I found the cascade.xml,
so I tested it. I just used the face detection sample provided with OpenCV, but with my own cascade file.
But the blue rectangle that should mark the car is just
drawn right in the middle of the image. I tested positive and negative images, and it just
draws a rectangle in the middle.
Which step did I get wrong, and how do I fix it?
You should change the training parameters. Increase numStages; you can also use HAAR as the featureType, and you can set minHitRate and maxFalseAlarmRate.
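As a rough guide to how far numStages has to go up, the worst-case false alarm rate of the whole cascade is maxFalseAlarmRate raised to the number of stages. The sketch below inverts that relation; stages_needed is a made-up helper, and the target of 1e-5 is an arbitrary illustration, not a recommended value.

```python
import math


def stages_needed(max_false_alarm_rate, target_false_alarm):
    """Smallest number of stages n with max_false_alarm_rate ** n <= target_false_alarm."""
    return math.ceil(math.log(target_false_alarm) / math.log(max_false_alarm_rate))


print(0.5 ** 2)                  # 0.25: worst-case false alarm rate with numStages 2
print(stages_needed(0.5, 1e-5))  # 17
```

With only 2 stages at the default maxFalseAlarmRate of 0.5, up to a quarter of all background windows may be accepted, which fits the poor detections described in the question.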

opencv_traincascade always gets stuck

I am trying to use OpenCV's opencv_traincascade to generate a Haar Cascade. So far I have 87 distinctive positive samples and 39 negative samples for testing purposes. I generated the .vec file with opencv_createsamples, which worked fine. When I'm running opencv_traincascade it always gets stuck after a few stages, no matter how I change the parameters. My call looks like this:
opencv_traincascade -data /opencvimgs/haarcascades/data/ -vec /opencvimgs/haarcascades/out.vec -bg /opencvimgs/haarcascades/neg.txt -numPos 87 -numNeg 39
I tried increasing and decreasing minHitRate and maxFalseAlarmRate as well as numPos and numNeg without any success. It might run for a few more stages, but then it seems to hang in an infinite loop again. How can I resolve this?
The output below is what the program writes to the console:
opencv_traincascade -data /opencvimgs/haarcascades/data/ -vec /opencvimgs/haarcascades/out.vec -bg /opencvimgs/haarcascades/neg.txt -numPos 87 -numNeg 39
PARAMETERS:
cascadeDirName: /opencvimgs/haarcascades/data/
vecFileName: /opencvimgs/haarcascades/out.vec
bgFileName: /opencvimgs/haarcascades/neg.txt
numPos: 87
numNeg: 39
numStages: 20
precalcValBufSize[Mb] : 256
precalcIdxBufSize[Mb] : 256
stageType: BOOST
featureType: HAAR
sampleWidth: 24
sampleHeight: 24
boostType: GAB
minHitRate: 0.995
maxFalseAlarmRate: 0.5
weightTrimRate: 0.95
maxDepth: 1
maxWeakCount: 100
mode: BASIC
===== TRAINING 0-stage =====
<BEGIN
POS count : consumed 87 : 87
NEG count : acceptanceRatio 39 : 1
Precalculation time: 1
+----+---------+---------+
| N | HR | FA |
+----+---------+---------+
| 1| 1| 0|
+----+---------+---------+
END>
===== TRAINING 1-stage =====
<BEGIN
POS count : consumed 87 : 87
NEG count : acceptanceRatio 39 : 0.0697674
Precalculation time: 1
+----+---------+---------+
| N | HR | FA |
+----+---------+---------+
| 1| 1| 0|
+----+---------+---------+
END>
===== TRAINING 2-stage =====
<BEGIN
POS count : consumed 87 : 87
NEG count : acceptanceRatio 39 : 0.00945455
Precalculation time: 1
+----+---------+---------+
| N | HR | FA |
+----+---------+---------+
| 1| 1| 0|
+----+---------+---------+
END>
===== TRAINING 3-stage =====
<BEGIN
POS count : consumed 87 : 87
NEG count : acceptanceRatio 39 : 0.000326907
Precalculation time: 1
+----+---------+---------+
| N | HR | FA |
+----+---------+---------+
| 1| 1| 0|
+----+---------+---------+
END>
===== TRAINING 4-stage =====
<BEGIN
POS count : consumed 87 : 87
A possible answer is that you're using too few negative samples.
Read the OpenCV documentation and the reference paper by Viola and Jones.
They use a cascaded classifier to achieve high accuracy and few false alarms by eliminating part of the negative samples at each stage. If you use too few negative samples, it defeats the purpose of the cascaded classifier in the first place.
Note that, in practical use, the system sees far more images without faces than with faces.
