So, I've written my own HOG feature extractor and a simple sliding-window algorithm, whose pseudocode looks like this:
for (int i = 0; i < img.rows; i++) {
    for (int j = 0; j < img.cols; j++) {
        // extract the image ROI at the current position
        // calculate HOG features for the ROI
        // feed the features into svm.predict() to decide whether it's a human or not
    }
}
However, since this is very slow (especially once you include different scales), I've decided to train a cascade classifier with the opencv_traincascade command on my positive and negative samples.
opencv_traincascade provides me with cascade.xml, params.xml, and a number of stage XML files.
My question is: how do I utilize this trained cascade classifier in my detection loop?
Edit: No, not detectMultiScale. The reason I'm using a cascade classifier is just to speed up rejection of non-object windows; I still need to use my own algorithm to calculate the score of a probable ROI. Sorry for the confusion.
Following the OpenCV documentation on Object Detection, you have to create the cascade detector object and ::load the cascade you want to apply (the XML file you have generated). ::detectMultiScale is then used to fill a std::vector of objects detected in the current frame, by sliding windows of different scales and sizes over it and merging nearby high-confidence detections.
Code here!
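A minimal sketch of that flow (the cascade file name, the grayscale conversion, and the window parameters are illustrative, not mandated by the API):

#include <opencv2/objdetect/objdetect.hpp>
#include <opencv2/imgproc/imgproc.hpp>
#include <vector>

std::vector<cv::Rect> detectObjects(const cv::Mat& frame)
{
    // "cascade.xml" stands for the file produced by opencv_traincascade
    static cv::CascadeClassifier cascade("cascade.xml");  // ::load happens here

    cv::Mat gray;
    cv::cvtColor(frame, gray, CV_BGR2GRAY);

    std::vector<cv::Rect> found;
    // scale step 1.1, keep detections with >= 3 neighbours, minimum window 64x128
    cascade.detectMultiScale(gray, found, 1.1, 3, 0, cv::Size(64, 128));
    return found;
}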
Same question answered here?
In order to use the cascade classifier you have to call the CascadeClassifier::load function (you pass the path to the generated XML file as the argument). Then, whenever you want to check whether the object of interest is present, call the CascadeClassifier::detectMultiScale function (the C++ API of that era does not expose a public single-window detect, but you can constrain detectMultiScale to a single scale, as sketched below).
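Since the question's edit says the cascade should only pre-filter windows before the custom SVM scoring, one workaround is to pin detectMultiScale to (roughly) a single scale by setting minSize == maxSize == the ROI size. A sketch under that assumption:

#include <opencv2/objdetect/objdetect.hpp>
#include <vector>

// Use the cascade as a cheap rejection test for a single ROI; scoring of the
// surviving windows stays with your own HOG + SVM code.
bool roiPassesCascade(cv::CascadeClassifier& cascade, const cv::Mat& roi)
{
    std::vector<cv::Rect> hits;
    // minNeighbors = 0: return raw candidates, no grouping needed for one window
    cascade.detectMultiScale(roi, hits, 1.1, 0, 0, roi.size(), roi.size());
    return !hits.empty();
}

In the sliding-window loop from the question, this check would run first, and the expensive feature extraction and svm.predict() call would only run when it returns true.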
I use SIFT to detect and describe feature points in two images, as follows.
void FeaturePointMatching::SIFTFeatureMatchers(cv::Mat imgs[2], std::vector<cv::Point2f> fp[2])
{
    // detect SIFT keypoints in both images (OpenCV 2.x nonfree API)
    cv::SiftFeatureDetector dec;
    std::vector<cv::KeyPoint> kp1, kp2;
    dec.detect(imgs[0], kp1);
    dec.detect(imgs[1], kp2);

    // compute SIFT descriptors for the keypoints
    cv::SiftDescriptorExtractor ext;
    cv::Mat desp1, desp2;
    ext.compute(imgs[0], kp1, desp1);
    ext.compute(imgs[1], kp2, desp2);

    // brute-force matching with L2 distance
    cv::BruteForceMatcher<cv::L2<float> > matcher;
    std::vector<cv::DMatch> matches;
    matcher.match(desp1, desp2, matches);

    // collect the coordinates of the matched point pairs
    std::vector<cv::DMatch>::iterator iter;
    fp[0].clear();
    fp[1].clear();
    for (iter = matches.begin(); iter != matches.end(); ++iter)
    {
        //if (iter->distance > 1000)
        //    continue;
        fp[0].push_back(kp1.at(iter->queryIdx).pt);
        fp[1].push_back(kp2.at(iter->trainIdx).pt);
    }

    // remove outliers via RANSAC (3-pixel epipolar threshold, confidence param 1)
    std::vector<uchar> mask;
    cv::findFundamentalMat(fp[0], fp[1], cv::FM_RANSAC, 3, 1, mask);
    std::vector<cv::Point2f> fp_refined[2];
    for (size_t i = 0; i < mask.size(); ++i)
    {
        if (mask[i] != 0)
        {
            fp_refined[0].push_back(fp[0][i]);
            fp_refined[1].push_back(fp[1][i]);
        }
    }
    std::swap(fp_refined[0], fp[0]);
    std::swap(fp_refined[1], fp[1]);
}
In the above code, I use findFundamentalMat() to remove outliers, but in the resulting images img1 and img2 there are still some bad matches. In the images, each green line connects a matched feature-point pair (please ignore the red marks). I cannot find anything wrong; could anyone give me some hints? Thanks in advance.
RANSAC is just one of many robust estimators. In principle one can use a variety of them, but RANSAC has been shown to work quite well as long as your input data is not dominated by outliers. You can check other variants of RANSAC like MSAC, MLESAC, MAPSAC, etc., which have some other interesting properties as well. You may find this CVPR presentation interesting: http://www.imgfsr.com/CVPR2011/Tutorial6/RANSAC_CVPR2011.pdf
Depending on the quality of the input data, you can estimate the optimal number of RANSAC iterations as described here: https://en.wikipedia.org/wiki/RANSAC#Parameters
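For reference, the iteration-count estimate from that page: with inlier ratio w, minimal sample size s (8 points for the 8-point fundamental-matrix algorithm), and desired success probability p, the required number of iterations is

N = \log(1 - p) / \log(1 - w^s)

For example, w = 0.5, s = 8, p = 0.99 already gives N ≈ 1177 iterations.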
Again, RANSAC is only one of the robust estimation methods. You may also take other statistical approaches, like modelling your data with heavy-tailed distributions, trimmed least squares, etc.
In your code you are missing the RANSAC step. RANSAC basically iterates two steps (see the sketch after this list):
* Hypothesis generation: fit your model to a minimal random selection of data points (training data).
* Model evaluation: evaluate your model on the rest of the points (testing data).
Iterate, and choose the model that gives the lowest testing error.
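To make those steps concrete, here is a minimal, generic sketch of the loop; 2D line fitting stands in for the fundamental-matrix model, and the iteration count and threshold are illustrative:

#include <opencv2/core/core.hpp>
#include <cstdlib>
#include <vector>
#include <cmath>

// Fit ax + by + c = 0 (normalized so a^2 + b^2 = 1) through two points.
static cv::Vec3f lineThrough(const cv::Point2f& p, const cv::Point2f& q)
{
    float a = q.y - p.y, b = p.x - q.x;
    float n = std::sqrt(a * a + b * b);
    return cv::Vec3f(a / n, b / n, -(a * p.x + b * p.y) / n);
}

cv::Vec3f ransacLine(const std::vector<cv::Point2f>& pts, int iters, float thresh)
{
    cv::Vec3f best;
    int bestInliers = 0;
    for (int it = 0; it < iters; ++it) {
        // 1) hypothesis: minimal random sample (2 points for a line)
        int i = std::rand() % (int)pts.size(), j = std::rand() % (int)pts.size();
        if (i == j) continue;
        cv::Vec3f model = lineThrough(pts[i], pts[j]);
        // 2) evaluation: count points within the inlier threshold
        int inliers = 0;
        for (size_t k = 0; k < pts.size(); ++k)
            if (std::fabs(model[0] * pts[k].x + model[1] * pts[k].y + model[2]) < thresh)
                ++inliers;
        // 3) keep the model with the largest consensus set
        if (inliers > bestInliers) { bestInliers = inliers; best = model; }
    }
    return best;
}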
RANSAC stands for RANdom SAmple Consensus. It does not remove outliers by itself; it selects a group of points and calculates the fundamental matrix for that group of points. You then need to do a reprojection, using the fundamental matrix just calculated with RANSAC, to remove the outliers.
Like any algorithm, RANSAC is not perfect. You can try running the other robust algorithms available in the OpenCV implementation, like LMedS. You can also reiterate, using the points last marked as inliers as input to a new estimation, and you can vary the threshold and confidence level. I suggest running RANSAC 1 to 3 times and then running LMedS, which does not need a threshold but only works well with more than 50% inliers (sketched below).
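A sketch of that reiteration idea, reusing fp[0] and fp[1] from the question's code (the threshold and confidence values are just examples):

#include <opencv2/calib3d/calib3d.hpp>
#include <vector>

// First pass: RANSAC with a 3-pixel threshold and 0.99 confidence.
std::vector<uchar> mask;
cv::Mat F = cv::findFundamentalMat(fp[0], fp[1], cv::FM_RANSAC, 3, 0.99, mask);

// Keep only the points RANSAC marked as inliers.
std::vector<cv::Point2f> in0, in1;
for (size_t i = 0; i < mask.size(); ++i)
    if (mask[i]) { in0.push_back(fp[0][i]); in1.push_back(fp[1][i]); }

// Second pass: LMedS on the surviving matches (no threshold needed, but it
// assumes more than 50% of them are inliers).
if (in0.size() >= 8)
    F = cv::findFundamentalMat(in0, in1, cv::FM_LMEDS, 3, 0.99, mask);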
You can also have geometrical problems:
* If the baseline between the two stereo views is too small, the fundamental-matrix estimation can be unreliable, and it may be better to use findHomography() instead for your purpose.
* If your images have some barrel/pincushion distortion, they do not conform to epipolar geometry, and the fundamental matrix is not the correct mathematical model to link the matches. In this case, you may have to calibrate your camera, run undistort(), and then process the output images.
I have about 130,000 SIFT descriptors. I am building a hierarchical k-means index using OpenCV's flann module. After this I want to quantize these 130,000 descriptors (I will quantize more later). I am using flann's knnSearch method for doing this, but the result is somewhat weird: for every descriptor, the nearest index it reports is the index of the descriptor itself. However, it should be reporting the cluster ID of the nearest cluster, which will be one of the leaves of the HIK tree.
Should I try k=2?
Here is a code snippet:

int k = 1;
// branching factor 8, 4 k-means iterations, k-means++ center seeding
cv::flann::KMeansIndexParams indexParams(8, 4, cvflann::FLANN_CENTERS_KMEANSPP);
cv::flann::Index hik_tree(cluster_data, indexParams);
cv::Mat indices, dist;
hik_tree.knnSearch(cluster_data, indices, dist, k, cv::flann::SearchParams(64));
knnSearch looks for the k nearest neighbours in the index (it does not return the cluster ID!). You built your index using cluster_data, and then you try to match cluster_data against itself. In this situation, it is not surprising that the closest neighbour to each descriptor is itself...
EDIT: If you want to get the centers, have a look at this (from the source of the FLANN library):
/**
* Chooses the initial centers using the algorithm proposed in the KMeans++ paper:
* Arthur, David; Vassilvitskii, Sergei - k-means++: The Advantages of Careful Seeding
*/
template <typename Distance>
class KMeansppCenterChooser : public CenterChooser<Distance>
{
...
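Following up on the original goal: if the aim is to quantize descriptors against the k-means leaves, one option (a sketch; the exact namespaces of the params and distance functor vary a little between OpenCV 2.x versions) is to extract the final centers with cv::flann::hierarchicalClustering and then search against those centers, whose row indices then serve as cluster IDs:

#include <opencv2/flann/flann.hpp>

// cluster_data: the 130,000 x 128 CV_32F descriptor matrix from the question
cv::Mat centers(1024, cluster_data.cols, CV_32F);   // 1024 = desired cluster count
cvflann::KMeansIndexParams kparams(8, 4, cvflann::FLANN_CENTERS_KMEANSPP);
int found = cv::flann::hierarchicalClustering< cvflann::L2<float> >(cluster_data,
                                                                    centers, kparams);
centers = centers.rowRange(0, found);               // actual count may be smaller

// Quantize: search the descriptors against the centers instead of against
// themselves; the returned index of the nearest center is the cluster ID.
cv::flann::Index centerIndex(centers, cv::flann::LinearIndexParams());
cv::Mat ids, dists;
centerIndex.knnSearch(cluster_data, ids, dists, 1, cv::flann::SearchParams(64));

ids.at<int>(i, 0) is then the cluster ID of descriptor i.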
k-NN is a supervised classification algorithm; that's why you are supposed to construct the Index object from your training samples. So use
cv::flann::Index hik_tree(samples, indexParams);
instead of
cv::flann::Index hik_tree(cluster_data, indexParams);
I have generated a cascade using Local Binary Patterns for object recognition. It seems that the tool to evaluate the detection rate is opencv_performance.exe, but I found out that it only works with Haar cascades. Is there something wrong with my cascade? Do I need to change the format?
You can code your own evaluator and use the same metric that opencv_performance uses. You need to change the old function that loads the cascade and also the function that performs the detection (change it to detectMultiScale).
A suggestion for another metric is Intersection over Union (IoU), which calculates the percentage of overlap between two rectangles (the ground-truth rect and the detected rect).
The pseudo-code would be:
IOU = area(intersection(rect1, rect2)) / area(union(rect1, rect2))
and then compare it against a threshold, for example counting the detection as correct if IOU is greater than 0.5.
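With cv::Rect this is nearly a one-liner, since operator& is overloaded as rectangle intersection (the 0.5 threshold below is the conventional choice, not part of the API):

#include <opencv2/core/core.hpp>

// Intersection over Union of a ground-truth and a detected rectangle.
float iou(const cv::Rect& gt, const cv::Rect& det)
{
    float inter = (float)(gt & det).area();  // cv::Rect overloads & as intersection
    return inter / ((float)gt.area() + (float)det.area() - inter);
}

// usage: count a detection as a hit if iou(gt, det) > 0.5f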
Take a look here: http://answers.opencv.org/question/117/performance-evaluation-for-detection/
How do I do Gaussian smoothing in the 3rd dimension?
I have this detection pyramid, with votes accumulated at four scales. Objects are found at each peak.
I have already smoothed each level in 2D, and I read in my papers that I need to filter the third dimension with \sigma = 1, which I haven't tried before; I am not even sure what it means.
I figured out how to do it in Matlab, and need something similar in OpenCV/C++.
Matlab raw values:
Matlab smoothed with M0 = smooth3(M0,'gaussian'):
Gaussian filters are separable, so you can apply a 1D filter along each dimension in turn:

for (int dim = 0; dim < D; dim++)
    tensor = gaussian_filter(tensor, dim);
I would recommend OpenCV for an implementation of a gaussian filter (and image processing in general) in C++.
Note that this assumes that your pyramid levels are all of the same size.
You can have your own functions that sample your scale-space pyramid on the fly while convolving the third dimension, but if you have enough memory I believe it would be faster to scale up your coarser levels to the same size as the finest level.
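Under that same-size assumption, here is a sketch of the scale-axis pass for a stack of vote maps like the question's four levels (the function name is illustrative, and renormalising the truncated kernel at the borders is one choice among several):

#include <opencv2/opencv.hpp>
#include <vector>

// Smooth a stack of same-size vote maps along the scale axis with a 1D Gaussian.
std::vector<cv::Mat> smoothAcrossScales(const std::vector<cv::Mat>& levels,
                                        double sigma = 1.0)
{
    int n = (int)levels.size();
    int ksize = 2 * cvRound(3 * sigma) + 1;                      // e.g. 7 for sigma = 1
    cv::Mat k = cv::getGaussianKernel(ksize, sigma, CV_32F);     // 1D kernel, ksize x 1
    int r = ksize / 2;
    std::vector<cv::Mat> out(n);
    for (int s = 0; s < n; s++) {
        cv::Mat acc = cv::Mat::zeros(levels[s].size(), CV_32F);
        float wsum = 0.f;
        for (int t = -r; t <= r; t++) {
            int idx = s + t;
            if (idx < 0 || idx >= n) continue;   // truncate the kernel at the borders
            float w = k.at<float>(t + r);
            cv::Mat lvl;
            levels[idx].convertTo(lvl, CV_32F);
            acc += w * lvl;
            wsum += w;
        }
        acc /= wsum;                             // renormalise the truncated kernel
        out[s] = acc;
    }
    return out;
}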
Long ago (in 2008-2009) I developed a small C++ template library to apply some simple transformations and convolution filters. The library's source can be found in the Linderdaum Engine; it has nothing to do with the rest of the engine and does not use any of the engine's features. The license is MIT, so do whatever you want with it.
Take a look at Linderdaum's source code (http://www.linderdaum.com) at Src/Linderdaum/Images/VolumeLib.*
The function to prepare the kernel is PrepareGaussianFilter(), and MakeScalarVolumeConvolution() applies the filter. It is easy to adapt the library to different data sources because the I/O is implemented using callback functions.
I am trying to implement a people-detection system based on SVM and HOG using OpenCV 2.3, but I have got stuck.
I have come this far:
I can compute HOG values from an image database, and then I calculate the SVM support vectors with LIBSVM, so I get e.g. 1419 support vectors with 3780 values each.
OpenCV just wants one feature vector in the method hog.setSVMDetector(). Therefore I have to calculate one detecting vector from the 1419 support vectors that LIBSVM has calculated.
I found one hint on how to calculate this single vector: link
“The detecting feature vector at component i (where i is in the range e.g. 0-3779) is built out of the sum of the support vectors at i times the alpha value of that support vector:

det[i] = sum_j (sv_j[i] * alpha[j]),

where j is the index of the support vector and i is the index of the component of the support vector.”
According to this, my routine works this way:
I take the first element of my first support vector, multiply it by its alpha value, and add to it the first element of the second support vector multiplied by its alpha value, …
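For reference, a sketch of that accumulation (the container shapes and names are illustrative, not LIBSVM's API; note that in a LIBSVM model file, sv_coef already stores alpha_j * y_j):

#include <vector>

// sv:   1419 support vectors, 3780 components each (illustrative shapes)
// coef: per-support-vector weights; in a LIBSVM model, sv_coef = alpha_j * y_j
// rho:  the bias term from the LIBSVM model
std::vector<float> buildDetector(const std::vector<std::vector<float> >& sv,
                                 const std::vector<float>& coef, float rho)
{
    std::vector<float> det(sv[0].size(), 0.0f);
    for (size_t j = 0; j < sv.size(); ++j)        // over support vectors
        for (size_t i = 0; i < det.size(); ++i)   // over components
            det[i] += coef[j] * sv[j][i];
    det.push_back(-rho);  // hog.setSVMDetector() accepts descriptor_size + 1 values,
                          // the last one being the hyperplane bias
    return det;
}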
But after summing over all 1419 support vectors I get quite high values:
16.0657, -0.351117, 2.73681, 17.5677, -8.10134,
11.0206, -13.4837, -2.84614, 16.796, 15.0564,
8.19778, -0.7101, 5.25691, -9.53694, 23.9357,
If you compare them to the default vector in the OpenCV sample peopledetect.cpp (and hog.cpp in the OpenCV source),
0.05359386f, -0.14721455f, -0.05532170f, 0.05077307f,
0.11547081f, -0.04268804f, 0.04635834f, -0.05468199f, 0.08232084f,
0.10424068f, -0.02294518f, 0.01108519f, 0.01378693f, 0.11193510f,
0.01268418f, 0.08528346f, -0.06309239f, 0.13054633f, 0.08100729f,
-0.05209739f, -0.04315529f, 0.09341384f, 0.11035026f, -0.07596218f,
-0.05517511f, -0.04465296f, 0.02947334f, 0.04555536f,
you see that the default vector's values lie between -1 and +1, but my values far exceed those bounds.
I think my single-vector routine needs some adjustment. Any ideas?
Regards,
Christoph
The aggregated vector's values do look high.
I used the loadSVMfromModelFile() function located in http://lnx.mangaitalia.net/trainer/main.cpp
I had to remove svinstr.sync(); from the code, since it caused parts of the lines to be lost and produced wrong results.
I don't know much about the rest of the file; I only used this function.