Accuracy tuning for Haar-Cascade Classifier - opencv

I'm using Haar-Cascade Classifier in order to detect faces.
I'm currently facing some problems with the following function:
void ImageManager::detectAndDisplay(Mat frame, CascadeClassifier face_cascade){
    std::vector<Rect> faces;
    Mat frame_gray;
    Mat crop;

    cvtColor(frame, frame_gray, COLOR_BGR2GRAY);
    equalizeHist(frame_gray, frame_gray);

    // Detect faces
    face_cascade.detectMultiScale(frame_gray, faces, 1.1, 2, 0 | CASCADE_SCALE_IMAGE, Size(30, 30));

    // Set Region of Interest
    cv::Rect roi_c;
    for (size_t ic = 0; ic < faces.size(); ic++) // iterate over all detected faces
    {
        roi_c.x = faces[ic].x;
        roi_c.y = faces[ic].y;
        roi_c.width = faces[ic].width;
        roi_c.height = faces[ic].height;

        crop = frame_gray(roi_c);      // crop the face region
        faces_img.push_back(crop);     // faces_img is a class member collecting the crops
        rectangle(frame, Point(roi_c.x, roi_c.y),
                  Point(roi_c.x + roi_c.width, roi_c.y + roi_c.height),
                  Scalar(0, 0, 255), 2);
    }

    imshow("test", frame);
    waitKey(0);
    cout << faces_img.size();
}
The frame is the photo I'm trying to scan.
The face_cascade is the classifier.

Internally, the CascadeClassifier performs many overlapping detections and then groups them.
minNeighbors (in the detectMultiScale call) is the number of detections in roughly the same place required for a valid detection, so increase it from your current 2 to maybe 5 or so, until you start to miss true positives.
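For example, in the call from the question (5 is just a starting point; tune it against your own test images):

// More required neighbours = fewer false positives, but too high a value
// starts rejecting real faces as well.
face_cascade.detectMultiScale(frame_gray, faces,
                              1.1,                      // scale factor
                              5,                        // minNeighbors, raised from 2
                              0 | CASCADE_SCALE_IMAGE,
                              Size(30, 30));            // minimum face size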

As an addition to berak's statement: if you are processing more than a single still image, tuning the detectMultiScale parameters is not the whole story; you will also run into performance problems that make the application unusable.
The only way to know what a given setting costs is to measure it.
Unless you need the best possible results under varying lighting conditions (this is visually dependent information), downscale the input array before passing it to detectMultiScale; once detection has completed, rescale the results back to the original size (this can be done by scaling the rectangles returned by detectMultiScale).
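A minimal sketch of that idea (the 0.5 factor is an arbitrary example; pick whatever your accuracy requirements allow):

// Detect on a downscaled copy, then map the rectangles back to the
// coordinates of the original frame.
const double scale = 0.5;
Mat small_gray;
resize(frame_gray, small_gray, Size(), scale, scale, INTER_LINEAR);

std::vector<Rect> faces_small;
face_cascade.detectMultiScale(small_gray, faces_small, 1.1, 5,
                              0 | CASCADE_SCALE_IMAGE, Size(15, 15)); // minSize scaled too

faces.clear();
for (size_t i = 0; i < faces_small.size(); i++)
{
    const Rect& r = faces_small[i];
    faces.push_back(Rect(cvRound(r.x / scale), cvRound(r.y / scale),
                         cvRound(r.width / scale), cvRound(r.height / scale)));
}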

Related

Non connecting morphological filter

After some simple preprocessing I obtain a boolean mask of segmented images.
I want to "enhance" the borders of the mask and make them smoother. For that I am using an OPEN morphology filter with a rather big circular kernel. It works very well as long as the distance between the segmented objects is large enough, but in a lot of samples the objects stick together. Is there a more or less simple method to smooth such images without changing their morphology?
Without applying a morphological filter first, you can try to detect the external contours of the image, draw those external contours as filled contours, and then apply your morphological filter. This works because there are no longer any holes to fill, and it is fairly simple.
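A minimal sketch of that first approach (here `mask` stands for your 8-bit segmentation mask, the 21x21 elliptical kernel is an example size, and the clone is needed because findContours modifies its input):

// Draw the external contours filled so internal holes disappear,
// then apply the morphological OPEN with the big circular kernel.
vector<vector<Point> > contours;
findContours(mask.clone(), contours, CV_RETR_EXTERNAL, CV_CHAIN_APPROX_SIMPLE);

Mat filled = Mat::zeros(mask.size(), CV_8UC1);
drawContours(filled, contours, -1, Scalar(255), CV_FILLED);

Mat kernel = getStructuringElement(MORPH_ELLIPSE, Size(21, 21)); // example size
morphologyEx(filled, filled, MORPH_OPEN, kernel);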
Another approach:
find external contours
take the x and y coordinates of the contour points. You can treat these as two 1-D signals and apply a smoothing filter to each
In the code below, I've applied the second approach to a sample image.
Input image
External contours without any smoothing
After applying a Gaussian filter to x and y 1-D signals
C++ code
Mat im = imread("4.png", 0);
Mat cont = im.clone();
Mat original = Mat::zeros(im.rows, im.cols, CV_8UC3);
Mat smoothed = Mat::zeros(im.rows, im.cols, CV_8UC3);

// contour smoothing parameters for gaussian filter
int filterRadius = 5;
int filterSize = 2 * filterRadius + 1;
double sigma = 10;

vector<vector<Point> > contours;
vector<Vec4i> hierarchy;
// find external contours and store all contour points
findContours(cont, contours, hierarchy, CV_RETR_EXTERNAL, CV_CHAIN_APPROX_NONE, Point(0, 0));
for (size_t j = 0; j < contours.size(); j++)
{
    // draw the initial contour shape
    drawContours(original, contours, (int)j, Scalar(0, 255, 0), 1);

    // extract x and y coordinates of points. we'll consider these as 1-D signals
    // add circular padding to 1-D signals
    size_t len = contours[j].size() + 2 * filterRadius;
    size_t idx = (contours[j].size() - filterRadius);
    vector<float> x, y;
    for (size_t i = 0; i < len; i++)
    {
        x.push_back(contours[j][(idx + i) % contours[j].size()].x);
        y.push_back(contours[j][(idx + i) % contours[j].size()].y);
    }

    // filter 1-D signals
    vector<float> xFilt, yFilt;
    GaussianBlur(x, xFilt, Size(filterSize, filterSize), sigma, sigma);
    GaussianBlur(y, yFilt, Size(filterSize, filterSize), sigma, sigma);

    // build smoothed contour
    vector<vector<Point> > smoothContours;
    vector<Point> smooth;
    for (size_t i = filterRadius; i < contours[j].size() + filterRadius; i++)
    {
        smooth.push_back(Point(xFilt[i], yFilt[i]));
    }
    smoothContours.push_back(smooth);
    drawContours(smoothed, smoothContours, 0, Scalar(255, 0, 0), 1);

    cout << "debug contour " << j << " : " << contours[j].size() << ", " << smooth.size() << endl;
}
}
Not 100% sure what you are trying to achieve, but this may be an avenue to explore... the tool potrace takes images and converts them to vectorised images, which involves smoothing. It prefers PGM-format input files, so I use ImageMagick to prepare them. Anyway, here is an example of the command and the result, to see what you think:
convert disks.png pgm:- | potrace - -s -o out.svg
I have converted the resulting SVG file to a PNG so I can upload it to SO.

Detecting mouth with openCV

I am trying to detect the mouth in an image with openCV, so I am using the following code:
#include "face_detection.h"
using namespace cv;
// Function detectAndDisplay
void detectAndDisplay(const std::string& file_name, cv::CascadeClassifier& face_cascade, cv::CascadeClassifier& mouth_cascade)
{
Mat frame = imread(file_name);
std::vector<Rect> faces;
Mat frame_gray;
Mat crop;
Mat res;
Mat gray;
cvtColor(frame, frame_gray, COLOR_BGR2GRAY);
equalizeHist(frame_gray, frame_gray);
// Detect faces
face_cascade.detectMultiScale(frame_gray, faces, 1.1, 3, 0 | CASCADE_SCALE_IMAGE, Size(30, 30));
for(unsigned int i=0;i<faces.size();i++)
{
rectangle(frame,faces[i],Scalar(255,0,0),1,8,0);
Mat face = frame(faces[i]);
cvtColor(face,face,CV_BGR2GRAY);
std::vector <Rect> mouthi;
mouth_cascade.detectMultiScale(face, mouthi);
for(unsigned int k=0;k<mouthi.size();k++)
{
Point pt1(mouthi[k].x+faces[i].x , mouthi[k].y+faces[i].y);
Point pt2(pt1.x+mouthi[k].width, pt1.y+mouthi[k].height);
rectangle(frame, pt1,pt2,Scalar(0,255,0),1,8,0);
}
}
imshow("Frame", frame);
waitKey(33);
}
The classifiers are haarcascade_frontalface_alt.xml and haarcascade_mcs_mouth.xml.
The face is detected correctly but the mouth is not: I also obtain the eyes and some other parts, like the forehead.
Is there a way to detect only the mouth?
I think I managed to solve the problem: focusing on the lower half of the face and increasing the scale factor did the trick, and now I am able to detect the mouth with good precision. Anyway, this task seems much more complicated than face detection, even though I am using "simple" images, meaning straight, full-frontal faces.
Here are two examples: a success and a failure.
I was facing the same problem, so I focused only on the lower half of the face
and created an ROI from the detected face. It looks something like this:
Mat ROI = image(Rect(face.x, face.y + face.height*0.6, face.width, face.height*0.3));
where face is the detected face from the image.
This creates an ROI covering only the lower half of the face; otherwise the mouth detector was also reporting the eyes as mouths.
Then use the MouthCascade.xml from this link: http://alereimondo.no-ip.org/OpenCV/34
which is far more efficient than the inbuilt OpenCV one.
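Putting the ROI idea and the detector together, a rough sketch (assuming frame, frame_gray, faces and mouth_cascade as in the question's code; the 0.6/0.3 fractions follow the snippet above, and 1.3/5 are example detector parameters):

// Restrict mouth detection to the lower part of each detected face,
// then shift the hits back into full-image coordinates.
for (size_t i = 0; i < faces.size(); i++)
{
    Rect lower(faces[i].x,
               faces[i].y + cvRound(faces[i].height * 0.6),
               faces[i].width,
               cvRound(faces[i].height * 0.3));
    Mat roi = frame_gray(lower);

    std::vector<Rect> mouths;
    mouth_cascade.detectMultiScale(roi, mouths, 1.3, 5); // larger scale factor
    for (size_t k = 0; k < mouths.size(); k++)
    {
        Rect m = mouths[k];
        m.x += lower.x;
        m.y += lower.y;
        rectangle(frame, m, Scalar(0, 255, 0), 1, 8, 0);
    }
}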

Using opencv matchtemplate for blister pack inspection

I am doing a project in which I have to inspect pharmaceutical blister pack for missing tablets.
I am trying to use opencv's matchTemplate function. Let me show the code and then some results.
int match(string filename, string templatename)
{
    Mat ref = cv::imread(filename + ".jpg");
    Mat tpl = cv::imread(templatename + ".jpg");
    if (ref.empty() || tpl.empty())
    {
        cout << "Error reading file(s)!" << endl;
        return -1;
    }
    imshow("file", ref);
    imshow("template", tpl);

    Mat res_32f(ref.rows - tpl.rows + 1, ref.cols - tpl.cols + 1, CV_32FC1);
    matchTemplate(ref, tpl, res_32f, CV_TM_CCOEFF_NORMED);

    Mat res;
    res_32f.convertTo(res, CV_8U, 255.0);
    imshow("result", res);

    int size = ((tpl.cols + tpl.rows) / 4) * 2 + 1; // force size to be odd
    adaptiveThreshold(res, res, 255, ADAPTIVE_THRESH_MEAN_C, THRESH_BINARY, size, -128);
    imshow("result_thresh", res);

    while (true)
    {
        double minval, maxval, threshold = 0.8;
        Point minloc, maxloc;
        minMaxLoc(res, &minval, &maxval, &minloc, &maxloc);
        if (maxval >= threshold)
        {
            rectangle(ref, maxloc, Point(maxloc.x + tpl.cols, maxloc.y + tpl.rows), CV_RGB(0, 255, 0), 2);
            floodFill(res, maxloc, 0); // mark drawn blob
        }
        else
            break;
    }
    imshow("final", ref);
    waitKey(0);
    return 0;
}
And here are some pictures.
The "sample" image of a good blister pack:
The template cropped from "sample" image:
Result with "sample" image:
Missing tablet from this pack is detected:
But here are the problems:
I currently have no idea why this happens. Any suggestions and/or help are appreciated.
The original code that I followed and modified is here: http://opencv-code.com/quick-tips/how-to-handle-template-matching-with-multiple-occurences/
I found a solution to my own question: I just need to apply the Canny edge detector to both the image and the template before passing them to the matchTemplate function. The full working code:
int match(string filename, string templatename)
{
    Mat ref = cv::imread(filename + ".jpg");
    Mat tpl = cv::imread(templatename + ".jpg");
    if (ref.empty() || tpl.empty())
    {
        cout << "Error reading file(s)!" << endl;
        return -1;
    }

    Mat gref, gtpl;
    cvtColor(ref, gref, CV_BGR2GRAY);
    cvtColor(tpl, gtpl, CV_BGR2GRAY);

    const int low_canny = 110;
    Canny(gref, gref, low_canny, low_canny * 3);
    Canny(gtpl, gtpl, low_canny, low_canny * 3);

    imshow("file", gref);
    imshow("template", gtpl);

    Mat res_32f(ref.rows - tpl.rows + 1, ref.cols - tpl.cols + 1, CV_32FC1);
    matchTemplate(gref, gtpl, res_32f, CV_TM_CCOEFF_NORMED);

    Mat res;
    res_32f.convertTo(res, CV_8U, 255.0);
    imshow("result", res);

    int size = ((tpl.cols + tpl.rows) / 4) * 2 + 1; // force size to be odd
    adaptiveThreshold(res, res, 255, ADAPTIVE_THRESH_MEAN_C, THRESH_BINARY, size, -64);
    imshow("result_thresh", res);

    while (1)
    {
        double minval, maxval;
        Point minloc, maxloc;
        minMaxLoc(res, &minval, &maxval, &minloc, &maxloc);
        if (maxval > 0)
        {
            rectangle(ref, maxloc, Point(maxloc.x + tpl.cols, maxloc.y + tpl.rows), Scalar(0, 255, 0), 2);
            floodFill(res, maxloc, 0); // mark drawn blob
        }
        else
            break;
    }
    imshow("final", ref);
    waitKey(0);
    return 0;
}
Any suggestions for improvement are appreciated. I am strongly concerned about the performance and robustness of my code, so I am looking for all ideas.
Two things still bother me: the low Canny threshold and the negative constant passed to the adaptiveThreshold function.
Edit: Here is the result, as you asked :)
Template:
Test image, missing 2 tablets:
Canny results of template and test image:
matchTemplate result (converted to CV_8U):
After adaptiveThreshold:
Final result:
I don't think the adaptive threshold is a good choice.
What you need here is called non-maximum suppression: you have an image with multiple local maxima, and you want to remove all pixels that are not local maxima.
Mat res_dilated, mask_non_maxima;
cv::dilate(res_32f, res_dilated, cv::Mat(), cv::Point(-1, -1), 5);
cv::compare(res_32f, res_dilated, mask_non_maxima, cv::CMP_LT);
res_32f.setTo(0, mask_non_maxima); // zero every pixel below its neighbourhood maximum
Now all pixels in the res_32f image that are not local maxima are set to zero. The maximum pixels keep their original value, so you can adjust the threshold later at the line
double minval, maxval, threshold = 0.8;
All local maxima should also now be surrounded by enough zeroes that the floodfill will not extend too far.
Now I think you should be able to adjust the threshold to exclude all false positives.
If this is not enough, here is another suggestion:
Instead of just one template, I would run the search with multiple templates: your current template, plus one with a tablet from the right side and one from the left side of the pack. Due to perspective, these tablets look quite a bit different. Keep track of the tablets already found so you do not detect the same tablet multiple times.
With these multiple templates you can raise the threshold even higher.
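A sketch of the multi-template idea (the template file names here are made up, and gref is assumed to be your grayscale search image as in the code above):

// Run the same search once per template; remember accepted rectangles
// so the same tablet is not counted twice across templates.
const char* names[] = { "tpl_left.jpg", "tpl_center.jpg", "tpl_right.jpg" };
std::vector<Rect> found;
for (int t = 0; t < 3; t++)
{
    Mat tpl = imread(names[t], 0);
    if (tpl.empty()) continue;
    Mat score;
    matchTemplate(gref, tpl, score, CV_TM_CCOEFF_NORMED);
    while (true)
    {
        double minval, maxval;
        Point minloc, maxloc;
        minMaxLoc(score, &minval, &maxval, &minloc, &maxloc);
        if (maxval < 0.8) // a higher threshold becomes feasible
            break;
        Rect r(maxloc, tpl.size());
        bool duplicate = false;
        for (size_t i = 0; i < found.size(); i++)
            if ((r & found[i]).area() > 0)
                duplicate = true;
        if (!duplicate)
            found.push_back(r);
        floodFill(score, maxloc, 0); // suppress this peak and continue
    }
}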
One further refinement: if the detection is still too erratic, try blurring your template and search image with a Gaussian blur. This removes fine details and noise that may throw off the matchTemplate function, while leaving the larger structures intact.
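That could be as simple as the following (kernel size and sigma are example values, assuming you match on the grayscale images gref and gtpl rather than their Canny edges):

// Smooth both inputs so matchTemplate keys on coarse structure
// rather than noise and fine texture.
GaussianBlur(gref, gref, Size(9, 9), 2.0);
GaussianBlur(gtpl, gtpl, Size(9, 9), 2.0);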
Using a Canny filter instead seems unreliable to me: it appears to rely on a removed tablet region having more edges at its center, and I am not sure that will always be the case. You also discard a lot of color and brightness information with the Canny filter, so I would expect worse results.
(that said, if it works for you, it works)
Have you tried the SURF algorithm in order to get more detailed descriptors? You could collect descriptors for both the full and the empty sample image and perform a different action for each kind of object detected.

Having some difficulty in image stitching using OpenCV

I'm currently working on Image stitching using OpenCV 2.3.1 on Visual Studio 2010, but I'm having some trouble.
Problem Description
I'm trying to write code for stitching multiple images captured by a few cameras (about 3-4); i.e., the code should keep stitching images until I ask it to stop.
The following is what I've done so far:
(For simplification, I'll replace some part of the code with just a few words)
1.Reading frames(images) from 2 cameras (Currently I'm just working on 2 cameras.)
2.Feature detection, descriptor calculation (SURF)
3.Feature matching using FlannBasedMatcher
4.Removing outliers and calculate the Homography with inliers using RANSAC.
5.Warp one of both images.
For step 5., I followed the answer in the following thread and just changed some parameters:
Stitching 2 images in opencv
However, the result is terrible.
I just uploaded the result to YouTube (of course, only those who have the link will be able to see it).
http://youtu.be/Oy5z_7LeaMk
My code is shown below:
(Only crucial parts are shown)
VideoCapture cam1, cam2;
cam1.open(0);
cam2.open(1);

while (1)
{
    Mat frm1, frm2;
    cam1 >> frm1;
    cam2 >> frm2;

    // (SURF detection, descriptor calculation
    // and matching using FlannBasedMatcher)

    double max_dist = 0; double min_dist = 100;

    //-- Quick calculation of max and min distances between keypoints
    for (int i = 0; i < descriptors_1.rows; i++)
    {
        double dist = matches[i].distance;
        if (dist < min_dist) min_dist = dist;
        if (dist > max_dist) max_dist = dist;
    }

    // (Keep only "good" matches, i.e. those with distance < 3*min_dist)

    vector<Point2f> frame1;
    vector<Point2f> frame2;
    for (size_t i = 0; i < good_matches.size(); i++)
    {
        //-- Get the keypoints from the good matches
        frame1.push_back(keypoints_1[good_matches[i].queryIdx].pt);
        frame2.push_back(keypoints_2[good_matches[i].trainIdx].pt);
    }

    Mat H = findHomography(Mat(frame1), Mat(frame2), CV_RANSAC);
    cout << "Homography: " << H << endl;

    /* warp the image */
    Mat warpImage2;
    warpPerspective(frm2, warpImage2, H, Size(frm2.cols, frm2.rows), INTER_CUBIC);

    Mat final(Size(frm2.cols * 3 + frm1.cols, frm2.rows), CV_8UC3);
    Mat roi1(final, Rect(frm1.cols, 0, frm1.cols, frm1.rows));
    Mat roi2(final, Rect(2 * frm1.cols, 0, frm2.cols, frm2.rows));
    warpImage2.copyTo(roi2);
    frm1.copyTo(roi1);
    imshow("final", final);
What else should I do to make the stitching better?
Besides, is it reasonable to fix the homography matrix instead of recomputing it for every frame?
What I mean is: specify the angle and displacement between the two cameras myself, and derive from that a homography matrix that does what I want.
Thanks. :)
It sounds like you are going about this sensibly, but if you have access to both of the cameras, and they will remain stationary with respect to each other, then calibrating offline, and simply applying the transformation online will make your application more efficient.
One point to note is, you say you are using the findHomography function from OpenCV. From the documentation, this function:
Finds a perspective transformation between two planes.
However, your points are not restricted to a specific plane as they are imaging a 3D scene. If you wanted to calibrate offline, you could image a chessboard with both cameras, and the detected corners could be used in this function.
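A sketch of that offline step (assuming frm1 and frm2 are simultaneous frames from the two cameras showing the same chessboard; the 9x6 pattern size is an example):

// One-off calibration: detect the same chessboard in a frame from each
// camera and estimate the homography mapping camera-1 points to camera-2.
Size board(9, 6); // number of inner corners; example value
vector<Point2f> corners1, corners2;
bool found1 = findChessboardCorners(frm1, board, corners1);
bool found2 = findChessboardCorners(frm2, board, corners2);
if (found1 && found2)
{
    Mat H = findHomography(Mat(corners1), Mat(corners2), CV_RANSAC);
    // Save H and reuse it every frame instead of re-estimating it online.
}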
Alternatively, you may like to investigate the Fundamental matrix, which can be calculated with a similar function. This matrix describes the relative position of the cameras, but some work (and a good textbook) is required to extract it.
If you can find it, I would strongly recommend having a look at Part II: "Two-View Geometry" in the book "Multiple View Geometry in computer vision", by Richard Hartley and Andrew Zisserman, which goes through the process in detail.
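For the fundamental-matrix route, the corresponding call is similar to findHomography (using the same matched point vectors from the question's code):

// F encodes the epipolar geometry between the two views; unlike a
// homography it does not assume the scene points lie on a plane.
Mat F = findFundamentalMat(Mat(frame1), Mat(frame2), CV_FM_RANSAC, 3.0, 0.99);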
I have been working lately on image registration. My algorithm takes two images, calculates SURF features, finds correspondences, finds the homography matrix, and then stitches both images together. I did it with the following code:
void stich(Mat base, Mat target, Mat homography, Mat& panorama)
{
    Mat corners1(1, 4, CV_32F);
    Mat corners2(1, 4, CV_32F);
    Mat corners(1, 4, CV_32F);
    vector<Mat> planes;

    /* compute corners of warped image */
    corners1.at<float>(0, 0) = 0;
    corners2.at<float>(0, 0) = 0;
    corners1.at<float>(0, 1) = 0;
    corners2.at<float>(0, 1) = target.rows;
    corners1.at<float>(0, 2) = target.cols;
    corners2.at<float>(0, 2) = 0;
    corners1.at<float>(0, 3) = target.cols;
    corners2.at<float>(0, 3) = target.rows;
    planes.push_back(corners1);
    planes.push_back(corners2);
    merge(planes, corners);

    perspectiveTransform(corners, corners, homography);

    /* compute size of resulting image and allocate memory */
    double x_start = min(min((double)corners.at<Vec2f>(0, 0)[0], (double)corners.at<Vec2f>(0, 1)[0]), 0.0);
    double x_end   = max(max((double)corners.at<Vec2f>(0, 2)[0], (double)corners.at<Vec2f>(0, 3)[0]), (double)base.cols);
    double y_start = min(min((double)corners.at<Vec2f>(0, 0)[1], (double)corners.at<Vec2f>(0, 2)[1]), 0.0);
    double y_end   = max(max((double)corners.at<Vec2f>(0, 1)[1], (double)corners.at<Vec2f>(0, 3)[1]), (double)base.rows);

    /* create image with the same channels and depth as target, and the proper size */
    panorama.create(Size(x_end - x_start + 1, y_end - y_start + 1), target.depth());
    planes.clear();

    /* planes should have the same number of channels as target */
    for (int i = 0; i < target.channels(); i++)
    {
        planes.push_back(panorama);
    }
    merge(planes, panorama);

    // create translation matrix in order to copy both images to correct places
    Mat T;
    T = Mat::zeros(3, 3, CV_64F);
    T.at<double>(0, 0) = 1;
    T.at<double>(1, 1) = 1;
    T.at<double>(2, 2) = 1;
    T.at<double>(0, 2) = -x_start;
    T.at<double>(1, 2) = -y_start;

    // copy base image to correct position within output image
    warpPerspective(base, panorama, T, panorama.size(), INTER_LINEAR | CV_WARP_FILL_OUTLIERS);

    // change homography to take necessary translation into account
    gemm(T, homography, 1, T, 0, T);

    // warp second image and copy it to output image
    warpPerspective(target, panorama, T, panorama.size(), INTER_LINEAR);

    // tidy
    corners.release();
    T.release();
}
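For completeness, a hedged usage sketch (assuming frm1/frm2 and the matched point vectors frame1/frame2 from the question's code; note that the homography passed to stich must map target coordinates into base coordinates):

// Estimate H mapping frm2 (target) into frm1 (base), then stitch.
Mat H = findHomography(Mat(frame2), Mat(frame1), CV_RANSAC);
Mat panorama;
stich(frm1, frm2, H, panorama);
imshow("panorama", panorama);
waitKey(0);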
Any questions, I will try to answer.

OpenCV - Image Stitching

I am using the following code to stitch two input images. For an unknown
reason the output result is crap!
It seems that the homography matrix is wrong (or is applied wrongly),
because the transformed image looks like an "exploded star"!
I have commented the part that I guess is the source of the problem,
but I cannot figure it out.
Any help or pointer is appreciated!
Have a nice day,
Ali
void Stitch2Image(IplImage *mImage1, IplImage *mImage2)
{
    // Convert input images to gray
    IplImage* gray1 = cvCreateImage(cvSize(mImage1->width, mImage1->height), 8, 1);
    cvCvtColor(mImage1, gray1, CV_BGR2GRAY);
    IplImage* gray2 = cvCreateImage(cvSize(mImage2->width, mImage2->height), 8, 1);
    cvCvtColor(mImage2, gray2, CV_BGR2GRAY);

    // Convert gray images to Mat
    Mat img1(gray1);
    Mat img2(gray2);

    // Detect FAST keypoints and BRIEF features in the first image
    FastFeatureDetector detector(50);
    BriefDescriptorExtractor descriptorExtractor;
    BruteForceMatcher<L1<uchar> > descriptorMatcher;
    vector<KeyPoint> keypoints1;
    detector.detect(img1, keypoints1);
    Mat descriptors1;
    descriptorExtractor.compute(img1, keypoints1, descriptors1);

    /* Detect FAST keypoints and BRIEF features in the second image */
    vector<KeyPoint> keypoints2;
    detector.detect(img2, keypoints2);
    Mat descriptors2;
    descriptorExtractor.compute(img2, keypoints2, descriptors2);

    vector<DMatch> matches;
    descriptorMatcher.match(descriptors1, descriptors2, matches);
    if (matches.size() == 0)
        return;

    vector<Point2f> points1, points2;
    for (size_t q = 0; q < matches.size(); q++)
    {
        points1.push_back(keypoints1[matches[q].queryIdx].pt);
        points2.push_back(keypoints2[matches[q].trainIdx].pt);
    }

    // Create the result image (result is an IplImage* declared outside this function)
    result = cvCreateImage(cvSize(mImage2->width * 2, mImage2->height), 8, 3);
    cvZero(result);

    // Copy the second image into the result image
    cvSetImageROI(result, cvRect(mImage2->width, 0, mImage2->width, mImage2->height));
    cvCopy(mImage2, result);
    cvResetImageROI(result);

    // Create warp image
    IplImage* warpImage = cvCloneImage(result);
    cvZero(warpImage);

    /************************** Is there anything wrong here!? *******************/
    // Find homography matrix
    Mat H = findHomography(Mat(points1), Mat(points2), 8, 3.0);
    CvMat HH = H; // Is this line converted correctly?

    // Transform warp image
    cvWarpPerspective(mImage1, warpImage, &HH);

    // Blend
    blend(result, warpImage);
    /******************************************************************************/

    cvReleaseImage(&gray1);
    cvReleaseImage(&gray2);
    cvReleaseImage(&warpImage);
}
This is what I would suggest you to try, in this order:
1) Use CV_RANSAC option for homography. Refer http://opencv.willowgarage.com/documentation/cpp/calib3d_camera_calibration_and_3d_reconstruction.html
2) Try other descriptors, particularly SIFT or SURF which ship with OpenCV. For some images FAST or BRIEF descriptors are not discriminating enough. EDIT (Aug '12): The ORB descriptors, which are based on BRIEF, are quite good and fast!
3) Try to look at the Homography matrix (step through in debug mode or print it) and see if it is consistent.
4) If above does not give you a clue, try to look at the matches that are formed. Is it matching one point in one image with a number of points in the other image? If so the problem again should be with the descriptors or the detector.
My hunch is that it is the descriptors (so 1) or 2) should fix it).
Also switch to Hamming distance instead of L1 distance in BruteForceMatcher. BRIEF descriptors are supposed to be compared using Hamming distance.
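With the OpenCV 2.x API used in the question, that change would look roughly like this:

// BRIEF produces binary descriptors; Hamming distance is the intended metric.
BruteForceMatcher<Hamming> descriptorMatcher;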
Your homography might be calculated from wrong matches and thus represent a bad alignment.
I suggest passing the matrix through an additional check of the interdependency between its rows.
You can use the following code:
bool cvExtCheckTransformValid(const Mat& T)
{
    // Check the shape of the matrix
    if (T.empty())
        return false;
    if (T.rows != 3)
        return false;
    if (T.cols != 3)
        return false;

    // Check for linear dependency: if two rows are (nearly) proportional,
    // their element-wise ratio has a small relative standard deviation.
    Mat tmp;
    T.row(0).copyTo(tmp);
    tmp /= T.row(1);
    Scalar mean;
    Scalar stddev;
    meanStdDev(tmp, mean, stddev);
    double X = abs(stddev[0] / mean[0]);
    printf("std of H: %g\n", X);
    if (X < 0.8)
        return false;
    return true;
}
