I'm trying to train a Haar cascade. As a test run, I'm using 5 positive images (images that contain the object). I use a program called objectmarker.exe to mark the object in each image and store the coordinates, along with the height and width of the rectangle, in a text file (positives.txt).
Now, when I try to create a .vec file from the text file on the command line, the program runs, but I get the following:
positives.txt(1) : parse errorDone. Created 0 samples
The .vec file does get generated, but if I try to view it, a window opens and then crashes.
I use the following command:
C:\Sahil\Major Project\Haartraining Stuff\Haartraining Stuff\STEPS\step 02>opencv_createsamples.exe -info positives.txt -num5 -vec vec5.vec -w 20 -h 20
Info file name: positives.txt
Img file name: (NULL)
Vec file name: vec5.vec
BG file name: (NULL)
Num: 1000
BG color: 0
BG threshold: 80
Invert: FALSE
Max intensity deviation: 40
Max x angle: 1.1
Max y angle: 1.1
Max z angle: 0.5
Show samples: FALSE
Width: 20
Height: 20
Create training samples from images collection...
positives.txt(1) : parse errorDone. Created 0 samples
My positives.txt is in the following format:
C:/Sahil/Major Project/Haartraining Stuff/Haartraining Stuff/STEPS/step 02/rawdata/00007 001 (3).bmp_0000_0065_0107_0107_0199.bmp 1 1 2 106 193
C:/Sahil/Major Project/Haartraining Stuff/Haartraining Stuff/STEPS/step 02/rawdata/00007 001 (4).bmp_0000_0065_0107_0107_0199.bmp 1 1 2 108 195
C:/Sahil/Major Project/Haartraining Stuff/Haartraining Stuff/STEPS/step 02/rawdata/00007 001.bmp_0000_0065_0107_0107_0199.bmp 1 2 5 110 195
C:/Sahil/Major Project/Haartraining Stuff/Haartraining Stuff/STEPS/step 02/rawdata/img1.bmp 1 4 4 103 190
C:/Sahil/Major Project/Haartraining Stuff/Haartraining Stuff/STEPS/step 02/rawdata/img2.bmp 1 3 5 118 217
Kindly suggest what I can do to correct this error, as I cannot proceed any further.
How does opencv_createsamples.exe distinguish image file names? It may have been written without handling whitespace characters in paths and file names. Try again without spaces in either the paths or the file names.
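To illustrate why spaces could matter, here is a minimal Python sketch. It assumes the info file is tokenized on whitespace; that is a guess about the tool's behavior, not its actual source. The function name `tokens_look_valid` and the short example paths are hypothetical.

```python
def tokens_look_valid(line):
    """Check one info-file line the way a naive whitespace tokenizer would.

    Assumed format: <filename> <count> then <count> groups of x y width height.
    This is an illustration only, not the real opencv_createsamples parser.
    """
    tokens = line.split()
    # The second token must be the object count.
    if len(tokens) < 2 or not tokens[1].isdigit():
        return False
    count = int(tokens[1])
    rest = tokens[2:]
    # Each object needs exactly four integer fields.
    return len(rest) == 4 * count and all(t.lstrip("-").isdigit() for t in rest)


# A path without spaces tokenizes as expected.
print(tokens_look_valid("rawdata/img1.bmp 1 4 4 103 190"))

# A path containing a space pushes part of the path into the count field.
print(tokens_look_valid("C:/Major Project/rawdata/img1.bmp 1 4 4 103 190"))
```

If the tool does behave this way, the second token of each line in the question would be read as part of the path instead of the object count, which would be consistent with a parse error on line 1.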
I have a problem with GridSearchCV freezing (the CPU is active but the program is not advancing) with a linear SVM (with an RBF SVM it works fine).
Depending on the random_state I use for splitting my data, the freeze occurs at different CV splits and for different numbers of PCA components.
The features of one sample look like the following (there are about 39 features):
[1 117 137 2 80 16 2 39 228 88 5 6 0 10 13 6 22 23 1 227 246 7 1.656934307 0 5 0.434195726 0.010123735 0.55568054 5 275 119.48398 0.9359527 0.80484825 3.1272728 98 334 526 0.13454546 0.10181818]
Another sample's features:
[23149 4 31839 9 219 117 23 5 31897 12389 108 2 0 33 23 0 0 18 0 0 0 23149 0 0 74 0.996405221 0.003549844 4.49347E-05 74 5144 6.4480677 0.286384 0.9947901 3.833787 20 5135 14586 0.0060264384 0.011664075]
If I delete the last 10 features, I don't have this problem (before I added these 10 new features, my code worked fine). I have not tried other combinations of the last 10 features to see whether a specific feature is causing the problem.
I also use StandardScaler to scale the features but still face this issue. The problem occurs less often if I use MinMaxScaler (but I read somewhere that it is not good for SVMs).
I also set n_jobs to different values, but the search only advances a little further and then freezes again.
What do you suggest?
I followed part of this code to write my code:
TypeError grid seach
I've created an info text file using the opencv_annotation tool, with around 300 images, some of which contain multiple ROIs (regions of interest). The following is an example of the file's contents, with dots indicating many lines in the same format:
positives\1\105.png 1 9 10 17 14
...
positives\2\003.png 2 14 2 5 7 11 18 8 9
...
positives\3\045.png 3 21 9 7 9 13 10 9 11 7 15 6 7
However, opencv_createsamples then crashes with the error assertion failed (ssize.area() > 0) ..., and only a fraction (~200 out of ~600) of the ROIs in the info text file were placed into the vec file. I verified this because opencv_traincascade reports insufficient samples when I pass -numPos 500 with that vec file.
Why does this occur, and how can I fix it?
The cause is an ROI that was improperly defined while creating the samples.
In my specific case, this specific line was found:
positives\2\005.png 2 9 6 6 7 0 0 0 0
Recall the format of these files:
[filename] [# of objects] [[x y width height] [... 2nd object] ...]
In my case, the specified region had a width and height of 0 pixels, which causes opencv_createsamples to crash with this very error. Simply removing the object data and decrementing the [# of objects] by one solved the issue, like so:
positives\2\005.png 1 9 6 6 7
Additionally, one should also look out for values that reach out of bounds, such as the following:
positives\2\005.png 2 9 6 6 7 -5 0 3 2
positives\2\005.png 2 9 6 6 7 30 30 200 300
positives\2\005.png 2 9 6 6 7 35 35 3 5
In the first example, the x coordinate is negative; no coordinate should be negative. In the second example, our .png file is only 32 x 32 pixels, so an ROI cannot have a width and height of 200 and 300, respectively. Finally, in the last example, the x and y positions lie outside the bounds of the image.
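The checks described above can be sketched as a small Python helper. This is only an illustration: the function name `find_bad_rois` is hypothetical, the image size is passed in by the caller (32 x 32 here, as in the example), file names are assumed to contain no spaces, and the exact bounds rule the tool applies is an assumption.

```python
def find_bad_rois(line, image_width, image_height):
    """Return (roi, reason) pairs for suspicious ROIs on one info-file line.

    Assumed format: <filename> <count> then <count> groups of x y width height.
    """
    tokens = line.split()
    count = int(tokens[1])
    values = [int(t) for t in tokens[2:]]
    if len(values) != 4 * count:
        raise ValueError("object count does not match the number of fields")

    problems = []
    for i in range(count):
        x, y, w, h = values[4 * i:4 * i + 4]
        if w <= 0 or h <= 0:
            reason = "empty region"
        elif x < 0 or y < 0:
            reason = "negative coordinate"
        elif x + w > image_width or y + h > image_height:
            reason = "out of bounds"
        else:
            continue  # this ROI passes all three checks
        problems.append(((x, y, w, h), reason))
    return problems


lines = [
    r"positives\2\005.png 2 9 6 6 7 0 0 0 0",
    r"positives\2\005.png 2 9 6 6 7 -5 0 3 2",
    r"positives\2\005.png 2 9 6 6 7 30 30 200 300",
    r"positives\2\005.png 2 9 6 6 7 35 35 3 5",
]
for line in lines:
    print(find_bad_rois(line, 32, 32))
```

Running a check like this over the whole info file before calling opencv_createsamples makes it easier to find the offending line than reading the assertion message.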
On the website:
http://leon.bottou.org/projects/infimnist
It says:
Generating files containing the MNIST8M training set:
$ infimnist lab 10000 8109999 > mnist8m-labels-idx1-ubyte
$ infimnist pat 10000 8109999 > mnist8m-patterns-idx3-ubyte
However, I fail to see why the range is from 10 000 to 8 109 999.
Even if I compute 8 109 999 - 10 000, it still doesn't make sense to me.
To me, 8M would be 8 000 000 + 9 999: the first block would end at 9 999, so I would go from 10 000 to 8 009 999, which would be 8 million images.
Does anyone understand why the end index is 8 109 999?
According to a fellow Kaggle user, this is why:
The 8M dataset is the original images + 134 distortions/original. So there are
135*60,000 = 8,100,000
training images.
Adding the 10,000 test images you get 8,110,000 images.
The test images are from index 0 to 10,000-1=9,999 and the training images are from index 10,000 to 8,110,000-1 = 8,109,999.
I hope this helps.
The original dataset is also here:
https://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/multiclass.html
You can see that "# of data: 8,100,000"
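The index arithmetic from the explanation above can be checked with a few lines of Python. The constant names are only for illustration; the numbers are the ones quoted in the answer.

```python
ORIGINAL_TRAIN = 60_000          # original MNIST training images
DISTORTIONS_PER_ORIGINAL = 134   # distorted copies per original image
TEST_IMAGES = 10_000             # test images occupy indices 0..9,999

# Each original contributes itself plus its distortions.
train_total = (1 + DISTORTIONS_PER_ORIGINAL) * ORIGINAL_TRAIN

# Training images start right after the test images.
first_train_index = TEST_IMAGES
last_train_index = TEST_IMAGES + train_total - 1

print(train_total)        # 8100000
print(first_train_index)  # 10000
print(last_train_index)   # 8109999
```

So the "8M" name is a rounding of 8,100,000, and the range 10000 to 8109999 covers exactly those training images.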
I am trying to train a classifier. I created a .vec file with opencv_createsamples, and that worked:
Info file name: C:\OpenCV\positive.txt
Img file name: (NULL)
Vec file name: C:\OpenCV\sample.vec
BG file name: (NULL)
Num: 20
BG color: 0
BG threshold: 80
Invert: FALSE
Max intensity deviation: 40
Max x angle: 1.1
Max y angle: 1.1
Max z angle: 0.5
Show samples: FALSE
Width: 50
Height: 50
Create training samples from images collection...
Done. Created 20 samples
Now I use training.bat, which contains:
C:\OpenCV\opencv-2_4\build\x86\vc10\bin\opencv_traincascade.exe -data classifier -vec "C:\OpenCV\samples.vec" -bg "C:\OpenCV\negative.txt" -npos 20 -nneg 16 -numStages 4 -minHitRate 0.999 -maxFalseAllarmRate 0.5 -w 74 -h 100 -mode ALL -precalcvalBuffSize 256 -precalcdxBufSize 256
But when I run training.bat from the command prompt, I get this error:
Image reader can not be created from -vec C:\OpenCV\samples.vec and -bg C:\OpenCV\negative.txt.
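Can someone help?
This error generally appears when the files do not exist at the paths you pass in. Make sure you wrote the file names and paths correctly, and make sure the vector file you reference has the ".vec" extension. Here, for example, the createsamples output shows C:\OpenCV\sample.vec, while training.bat refers to C:\OpenCV\samples.vec.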
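A quick pre-flight check along these lines can catch such mistakes before training starts. This is a minimal Python sketch; the helper name `check_training_inputs` is hypothetical, and the demo uses a temporary directory so that it runs anywhere.

```python
import os
import tempfile


def check_training_inputs(vec_path, bg_path):
    """Return a list of human-readable problems with the -vec and -bg paths."""
    problems = []
    if not vec_path.lower().endswith(".vec"):
        problems.append(f"{vec_path}: expected a .vec extension")
    for path in (vec_path, bg_path):
        if not os.path.isfile(path):
            problems.append(f"{path}: file not found")
    return problems


# Demo: create sample.vec and negative.txt, then ask for samples.vec by mistake.
with tempfile.TemporaryDirectory() as folder:
    vec = os.path.join(folder, "sample.vec")
    bg = os.path.join(folder, "negative.txt")
    for path in (vec, bg):
        open(path, "w").close()

    wrong_vec = os.path.join(folder, "samples.vec")
    demo_ok = check_training_inputs(vec, bg)
    demo_bad = check_training_inputs(wrong_vec, bg)
    print(demo_ok)        # no problems for the real file names
    print(len(demo_bad))  # one problem: samples.vec does not exist
```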
I need to detect a special image (something like a + symbol) in scanned documents. I'm going to train a cascade using the opencv_traincascade program (OpenCV 3.0).
This is my file structure:
C:\imgs\learn1
    Bad
        1.bmp
        ....
    Good
        1.bmp
        ....
    Bad.dat
    Good.dat
This is my Bad.dat:
Bad\1.bmp
...
Bad\53.bmp
Bad\img001.jpg
...
Bad\img146.jpg
This is my Good.dat (every good file fully contains the special image and nothing more)
Good\1.bmp 1 0 0 60 59
...
Good\100.bmp 1 0 0 27 28
I've successfully created the vec file:
C:\opencv\build\x64\vc12\bin>opencv_createsamples.exe
-info C:\imgs\learn1\Good.dat
-vec samples.vec
-w 10 -h 10
Info file name: C:\imgs\learn1\Good.dat
Img file name: (NULL)
Vec file name: samples.vec
BG file name: (NULL)
Num: 1000
BG color: 0
BG threshold: 80
Invert: FALSE
Max intensity deviation: 40
Max x angle: 1.1
Max y angle: 1.1
Max z angle: 0.5
Show samples: FALSE
Width: 10
Height: 10
Create training samples from images collection...
C:\imgs\learn1\Good.dat(101) : parse errorDone. Created 100 samples
This is the call to opencv_traincascade and its result:
C:\opencv\build\x64\vc12\bin>opencv_traincascade.exe
-data haarcascade
-vec C:\opencv\build\x64\vc12\bin\samples.vec
-bg C:\imgs\learn1\Bad.dat
-numStages 16
-minhiteate 0.99
-maxFalseAlarmRate 0.5
-numPos 80
-numNeg 199
-w 10
-h 10
-mode ALL
-precalcValBufSize 1024
-precalcIdxBufSize 1024
PARAMETERS:
cascadeDirName: haarcascade
vecFileName: C:\opencv\build\x64\vc12\bin\samples.vec
bgFileName: C:\imgs\learn1\Bad.dat
numPos: 80
numNeg: 199
numStages: 16
precalcValBufSize[Mb] : 1024
precalcIdxBufSize[Mb] : 1024
acceptanceRatioBreakValue : -1
stageType: BOOST
featureType: HAAR
sampleWidth: 10
sampleHeight: 10
boostType: GAB
minHitRate: 0.995
maxFalseAlarmRate: 0.5
weightTrimRate: 0.95
maxDepth: 1
maxWeakCount: 100
mode: ALL
===== TRAINING 0-stage =====
<BEGIN
POS count : consumed 80 : 80
Train dataset for temp stage can not be filled. Branch training terminated.
Cascade classifier can't be trained. Check the used training parameters.
As you can see, there is an error. Can you help me figure out what exactly is wrong? "Check the used training parameters" is a very general phrase.
(The folder C:\opencv\build\x64\vc12\bin\haarcascade exists)
I don't know what was wrong, but I got it working.
1) I increased the number of positive examples to 400.
2) I increased the number of negative examples to 398.
3) I found that if an image is 61 x 60, I should write in Good.dat:
Good\1.bmp 1 0 0 60 59
(Image coordinates begin at 0 and end at width-1 and height-1.)
4) I found a typo: minhiteate -> minHitRate
None of this helped...
5) I tried training with OpenCV 2.4 and got my cascade.xml file.
But now I can't use it because of another error, which is off-topic here (I'm googling it now).