Haartraining opencv - opencv

I am tryng to train cascades using haar training.I have used the following parameters.
C:\opencv\opencv_bin\bin>opencv_haartraining -data haar -vec train.vec -bg neg.
txt -numPos 1000 -numNeg 2000 -nstages 10 -mem 2000 -mode all -w 30 -h 32
but i am getting the following error
Data dir name: haar
Vec file name: train.vec
BG file name: neg.txt, is a vecfile: no
Num pos: 2000
Num neg: 2000
Num stages: 10
Num splits: 1 (stump as weak classifier)
Mem: 2000 MB
Symmetric: TRUE
Min hit rate: 0.995000
Max false alarm rate: 0.500000
Weight trimming: 0.950000
Equal weights: FALSE
Mode: BASIC
Width: 30
Height: 32
Applied boosting algorithm: GAB
Error (valid only for Discrete and Real AdaBoost): misclass
Max number of splits in tree cascade: 0
Min number of positive samples per cluster: 500
Required leaf false alarm rate: 0.000976563
Tree Classifier
Stage
+---+
| 0|
+---+
Number of features used : 234720
Parent node: NULL
*** 1 cluster ***
OpenCV Error: Unspecified error (Vec file sample size mismatch) in icvGetHaarTra
iningDataFromVec, file C:\Downloads\Software\OpenCV-2.2.0-win\OpenCV-2.2.0\modul
es\haartraining\cvhaartraining.cpp, line 1929
terminate called after throwing an instance of 'cv::Exception'
what(): C:\Downloads\Software\OpenCV-2.2.0-win\OpenCV-2.2.0\modules\haartrain
ing\cvhaartraining.cpp:1929: error: (-2) Vec file sample size mismatch in functi
on icvGetHaarTrainingDataFromVec
This application has requested the Runtime to terminate it in an unusual way.
Please contact the application's support team for more information.
C:\opencv\opencv_bin\bin>cmd |as.txt
'as.txt' is not recognized as an internal or external command,
operable program or batch file.
i am using a vec file having 1000 samples which i downloaded from the internet,and have 2000 negative samples.

"Vec file sample size mismatch" - Try checking the site for the size of the samples. The vec file may not be the one for 30x32 images(which you are trying to pass as -w 30 -h 32).
This is just a guess. Try it. And try using traincascade object. It is there in $OpencvDir$/apps/traincascade/. Compile it like any other object. It can be used for LBP and HOG as well.
Hope this helps.
Regards,
Prasanna S

The ratio of w and h is different from the setting in info.txt. You should modify w's and h's of all images in info.txt int 30:32.

Related

Weka RF doesn't give any confusion matrix or expected results

I am using WEKA to classify a small dataset with only 27 instances into a binary classification. I have tried with bigger datasets and weka show the confusion matrix and the other metrics, but with my main and small 27 instances dataset only shows this:
Scheme: weka.classifiers.trees.RandomForest -P 100 -I 100 -num-slots 1 -K 0 -M 1.0 -V 0.001 -S 1
Relation: t_PROMIS_mtbi-weka.filters.unsupervised.attribute.Remove-R1
Instances: 27
Attributes: 7
Var2
Var3
Var4
Var5
Var6
Var7
ERS
Test mode: 10-fold cross-validation
=== Classifier model (full training set) ===
RandomForest
Bagging with 100 iterations and base learner
weka.classifiers.trees.RandomTree -K 0 -M 1.0 -V 0.001 -S 1 -do-not-check-capabilities
Time taken to build model: 0.01 seconds
=== Cross-validation ===
=== Summary ===
Correlation coefficient 0.0348
Mean absolute error 0.4544
Root mean squared error 0.529
Relative absolute error 91.7269 %
Root relative squared error 102.952 %
Total Number of Instances 27
i don't undersantd why this is happening. Is it a size thing?
I have already solved it, The problem was that i was using numbers 1/0 on my class viariable, I changed it for a "Yes"/"No" variable and works.

Is there a reason why a feature only present in a given class is not being predicted strongly into that class?

Summary & Questions
I'm using liblinear 2.30 - I noticed a similar issue in prod, so I tried to isolate it through a simple reduced training with 2 classes, 1 train doc per class, 5 features with same weight in my vocabulary and 1 simple test doc containing only one feature which is present only in class 2.
a) what's the feature value being used for?
b) I wanted to understand why this test document containing a single feature which is only present in one class is not being strongly predicted into that class?
c) I'm not expecting to have different values per features. Is there any other implications by increasing each feature value from 1 to something-else? How can I determine that number?
d) Could my changes affect other more complex trainings in a bad way?
What I tried
Below you will find data related to a simple training (please focus on feature 5):
> cat train.txt
1 1:1 2:1 3:1
2 2:1 4:1 5:1
> train -s 0 -c 1 -p 0.1 -e 0.01 -B 0 train.txt model.bin
iter 1 act 3.353e-01 pre 3.333e-01 delta 6.715e-01 f 1.386e+00 |g| 1.000e+00 CG 1
iter 2 act 4.825e-05 pre 4.824e-05 delta 6.715e-01 f 1.051e+00 |g| 1.182e-02 CG 1
> cat model.bin
solver_type L2R_LR
nr_class 2
label 1 2
nr_feature 5
bias 0
w
0.3374141436539016
0
0.3374141436539016
-0.3374141436539016
-0.3374141436539016
0
And this is the output of the model:
solver_type L2R_LR
nr_class 2
label 1 2
nr_feature 5
bias 0
w
0.3374141436539016
0
0.3374141436539016
-0.3374141436539016
-0.3374141436539016
0
1 5:10
Below you will find my model's prediction:
> cat test.txt
1 5:1
> predict -b 1 test.txt model.bin test.out
Accuracy = 0% (0/1)
> cat test.out
labels 1 2
2 0.416438 0.583562
And here is where I'm a bit surprised because of the predictions being just [0.42, 0.58] as the feature 5 is only present in class 2. Why?
So I just tried with increasing the feature value for the test doc from 1 to 10:
> cat newtest.txt
1 5:10
> predict -b 1 newtest.txt model.bin newtest.out
Accuracy = 0% (0/1)
> cat newtest.out
labels 1 2
2 0.0331135 0.966887
And now I get a better prediction [0.03, 0.97]. Thus, I tried re-compiling my training again with all features set to 10:
> cat newtrain.txt
1 1:10 2:10 3:10
2 2:10 4:10 5:10
> train -s 0 -c 1 -p 0.1 -e 0.01 -B 0 newtrain.txt newmodel.bin
iter 1 act 1.104e+00 pre 9.804e-01 delta 2.508e-01 f 1.386e+00 |g| 1.000e+01 CG 1
iter 2 act 1.381e-01 pre 1.140e-01 delta 2.508e-01 f 2.826e-01 |g| 2.272e+00 CG 1
iter 3 act 2.627e-02 pre 2.269e-02 delta 2.508e-01 f 1.445e-01 |g| 6.847e-01 CG 1
iter 4 act 2.121e-03 pre 1.994e-03 delta 2.508e-01 f 1.183e-01 |g| 1.553e-01 CG 1
> cat newmodel.bin
solver_type L2R_LR
nr_class 2
label 1 2
nr_feature 5
bias 0
w
0.19420510395364846
0
0.19420510395364846
-0.19420510395364846
-0.19420510395364846
0
> predict -b 1 newtest.txt newmodel.bin newtest.out
Accuracy = 0% (0/1)
> cat newtest.out
labels 1 2
2 0.125423 0.874577
And again predictions were still ok for class 2: 0.87
a) what's the feature value being used for?
Each instance of n features is considered as a point in an n-dimensional space, attached with a given label, say +1 or -1 (in your case 1 or 2). A linear SVM tries to find the best hyperplane to separate those instance into two sets, say SetA and SetB. A hyperplane is considered better than other roughly when SetA contains more instances labeled with +1 and SetB contains more those with -1. i.e., more accurate. The best hyperplane is saved as the model. In your case, the hyperplane has formulation:
f(x)=w^T x
where w is the model, e.g (0.33741,0,0.33741,-0.33741,-0.33741) in your first case.
Probability (for LR) formulation:
prob(x)=1/(1+exp(-y*f(x))
where y=+1 or -1. See Appendix L of LIBLINEAR paper.
b) I wanted to understand why this test document containing a single feature which is only present in one class is not being strongly predicted into that class?
Not only 1 5:1 gives weak probability such as [0.42,0.58], if you predict 2 2:1 4:1 5:1 you will get [0.337417,0.662583] which seems that the solver is also not very confident about the result, even the input is exactly the same as the training data set.
The fundamental reason is the value of f(x), or can be simply seen as the distance between x and the hyperplane. It can be 100% confident x belongs to a certain class only if the distance is infinite large (see prob(x)).
c) I'm not expecting to have different values per features. Is there any other implications by increasing each feature value from 1 to something-else? How can I determine that number?
TL;DR
Enlarging both training and test set is like having a larger penalty parameter C (the -c option). Because larger C means a more strict penalty on error, intuitively speaking, the solver has more confidence with the prediction.
Enlarging every feature of the training set is just like having a smaller C.
Specifically, logistic regression solves the following equation for w.
min 0.5 w^T w + C ∑i log(1+exp(−yi w^T xi))
(eq(3) of LIBLINEAR paper)
For most instance, yi w^T xi is positive and larger xi implies smaller ∑i log(1+exp(−yi w^T xi)).
So the effect is somewhat similar to having a smaller C, and a smaller C implies smaller |w|.
On the other hand, enlarging the test set is the same as having a large |w|. Therefore, the effect of enlarging both training and test set is basically
(1). Having smaller |w| when training
(2). Then, having larger |w| when testing
Because the effect is more dramatic in (2) than (1), overall, enlarging both training and test set is like having a larger |w|, or, having a larger C.
We can run on the data set and multiply every features by 10^12. With C=1, we have the model and probability
> cat model.bin.m1e12.c1
solver_type L2R_LR
nr_class 2
label 1 2
nr_feature 5
bias 0
w
3.0998430106024949e-12
0
3.0998430106024949e-12
-3.0998430106024949e-12
-3.0998430106024949e-12
0
> cat test.out.m1e12.c1
labels 1 2
2 0.0431137 0.956886
Next we run on the original data set. With C=10^12, we have the probability
> cat model.bin.m1.c1e12
solver_type L2R_LR
nr_class 2
label 1 2
nr_feature 5
bias 0
w
3.0998430101989314
0
3.0998430101989314
-3.0998430101989314
-3.0998430101989314
0
> cat test.out.m1.c1e12
labels 1 2
2 0.0431137 0.956886
Therefore, because larger C means more strict penalty on error, so intuitively the solver has more confident with prediction.
d) Could my changes affect other more complex trainings in a bad way?
From (c) we know your changes is like having a larger C, and that will result in a better training accuracy. But it almost can be sure that the model is over fitting the training set when C goes too large. As a result, the model cannot endure the noise in training set and will perform badly in test accuracy.
As for finding a good C, a popular way is by cross validation (-v option).
Finally,
it may be off-topic but you may want to see how to pre-process the text data. It is common (e.g., suggested by the author of liblinear here) to instance-wise normalize the data.
For document classification, our experience indicates that if you normalize each document to unit length, then not only the training time is shorter, but also the performance is better.

How to interpret output of EM on weka

I tried to run the EM algorithm on data with the default parameters in WEKA and but I am not able to understand how to interpret it?
=== Run information ===
Scheme: weka.clusterers.EM -I 100 -N -1 -X 10 -max -1 -ll-cv 1.0E-6 -ll-iter 1.0E-6 -M 1.0E-6 -K 10 -num-slots 1 -S 100
Relation: Chronic_Kidney_Disease-weka.filters.unsupervised.attribute.Remove-R12-weka.filters.unsupervised.attribute.Remove-R3-weka.filters.unsupervised.attribute.Remove-R3-4-weka.filters.unsupervised.attribute.Remove-R5-10,12-20
Instances: 800
Attributes: 6
age
bp
rbc
pc
hemo
class
Test mode: evaluate on training data
=== Clustering model (full training set) ===
EM
==
Number of clusters selected by cross validation: 6
Number of iterations performed: 100
Cluster
Attribute 0 1 2 3 4 5
(0.29) (0.22) (0.38) (0.02) (0.04) (0.05)
===================================================================
age
mean 53.5869 65.0962 46.44 51.3652 56.1297 10.939
std. dev. 12.4505 7.9718 15.546 3.7759 10.2604 6.7004
bp
mean 77.3114 79.7 71.4394 115.138 92.1235 66.5196
std. dev. 11.7858 12.1008 8.4722 31.4278 5.8351 10.0583
rbc
normal 185.8341 165.6585 306.8285 14.0588 7.3129 32.3071
abnormal 45.4643 13.3988 1.0652 3.3197 29.7885 6.9635
[total] 231.2984 179.0574 307.8937 17.3785 37.1015 39.2706
pc
normal 152.713 147.8797 306.8886 13.0467 1.9999 31.4721
abnormal 78.5854 31.1776 1.005 4.3319 35.1016 7.7985
[total] 231.2984 179.0574 307.8937 17.3785 37.1015 39.2706
hemo
mean 10.6591 11.7665 15.0745 9.5796 8.1499 12.0494
std. dev. 2.1313 1.1677 1.3496 2.5159 2.1512 1.5108
class
ckd 230.1835 177.972 7.2109 16.3651 36.1014 38.167
notckd 1.1149 1.0853 300.6828 1.0134 1 1.1036
[total] 231.2984 179.0574 307.8937 17.3785 37.1015 39.2706
Time taken to build model (full training data) : 13.21 seconds
=== Model and evaluation on training set ===
Clustered Instances
0 218 ( 27%)
1 196 ( 25%)
2 302 ( 38%)
3 12 ( 2%)
4 34 ( 4%)
5 38 ( 5%)
Log likelihood: -11.18988
Please help in understanding the output.
Thanks in advance
It's given you six clusters, with 27%, 25%, 38%, 2%, 4% and 5% of the data in respectively. (Which is >100%, so is rounded).
It's arrived on 6 after cross-validation (training on some, testing on the others for several runs).
The mean and standard deviation of each attribute for the items in each cluster are given.
The log likelihood is a measure of how good the clusters are - the training tried to minimise this. It is uses to compare which of the possible clusters is better and doesn't mean much by itself.

OpenCV training to detect static image

I want to detect the flag of my country. But I have trouble with training. I have one positive sample and 4 negative samples. This is my folder structrue:
/negative
/img
img1.jpg
img2.jpg
img3.jpg
img4.jpg
/positive
flag.jpg
This is how I call create_samples:
opencv_createsamples -img positive/flag.jpg -vec flag.vec
But command does not finish and popup windows appears saying that error appeared. This is output of create_samples command:
Info file name: (NULL)
Img file name: positive/flag.jpg
Vec file name: flag.vec
BG file name: (NULL)
Num: 1000
BG color: 0
BG threshold: 80
Invert: FALSE
Max intensity deviation: 40
Max x angle: 1.1
Max y angle: 1.1
Max z angle: 0.5
Show samples: FALSE
Width: 24
Height: 24
Create training samples from single image applying distortions...
Can anyone guide me through the process of haar training of static image (1 image) in OpenCV? I am running Windows 7 Ultimate x64
EDIT
This works:
opencv_createsamples -img positive/flag.jpg -vec flag.vec -num 0
I guess the problem is with -num 0 parameter
Haar is not your best shot at detecting a country's flag , better use color detection
COLOR_MIN = np.array([20, 80, 80],np.uint8)
COLOR_MAX = np.array([40, 255, 255],np.uint8)
but if you still insist on haar
creating samples :
$ <opencv_createsamples> -vec <binary_description> -image <positive_image> -bg <negative_description>
Training the cascade :
$ <opencv_traincascade> -data <cascade> -vec <binary_description> -bg <negative_description>

How to do proper testing in Weka and how to get desired results ?

I am currently working over a application of ANN, SVM and Linear Regression methods for prediction of fruit yield of a region based on meteorological factors (13 factors )
Total data set is: 36
While Implementing those methods on WEKA I am getting BAD results:
Like in the case of MultilayerPreceptron my results are :
(i divided the dataset with 28 for training and 8 for test )
=== Run information ===
Scheme: weka.classifiers.functions.MultilayerPerceptron -L 0.3 -M 0.2 -N 500 -V 0 -S 0 -E 20 -H a -G -R
Relation: apr6_data
Instances: 28
Attributes: 15
Time taken to build model: 3.69 seconds
=== Predictions on test set ===
inst# actual predicted error
1 2.551 2.36 -0.191
2 2.126 3.079 0.953
3 2.6 1.319 -1.281
4 1.901 3.539 1.638
5 2.146 3.635 1.489
6 2.533 2.917 0.384
7 2.54 2.744 0.204
8 2.82 3.473 0.653
=== Evaluation on test set ===
=== Summary ===
Correlation coefficient -0.4415
Mean absolute error 0.8493
Root mean squared error 1.0065
Relative absolute error 144.2248 %
Root relative squared error 153.5097 %
Total Number of Instances 8
In case of SVM for regression :
inst# actual predicted error
1 2.551 2.538 -0.013
2 2.126 2.568 0.442
3 2.6 2.335 -0.265
4 1.901 2.556 0.655
5 2.146 2.632 0.486
6 2.533 2.24 -0.293
7 2.54 2.766 0.226
8 2.82 3.175 0.355
=== Evaluation on test set ===
=== Summary ===
Correlation coefficient 0.2888
Mean absolute error 0.3417
Root mean squared error 0.3862
Relative absolute error 58.0331 %
Root relative squared error 58.9028 %
Total Number of Instances 8
What can be the possible error in my application ? Please let me know !
Thanks
Do I need to normalize the data ? I guess it is being done by WEKA classifiers.
If you want to normalize data, you have to do it. Preprocess tab - > Filters (choose) -> then find normalize and then click apply.
If you want to discretize your data, you have to follow the same process.
You might have better luck with discretising the prediction e.g. into low/medium/high yield.
You need to normalize or discretize- this cannot be said based on your data or on your single run. For instance, discretization brings in better result for naive baye's classifiers. For SVM- not sure.
I did not see your Precision, Recall or F-score from your data. But as you are saying you have bad results on test set, then it is very possible that your classifier is experiencing overfitting. Try to increase training instances (36 is too less I guess). Keep us posting what is happening when you increase training instances.

Resources