Can I use approxPolyDP to improve the people detection? - opencv

I am trying to detect people using BackgroundSubtractorMOG2. So after I receive the foreground of the image I obtain all the contours of the image using this function:
Imgproc.findContours(contourImg, contours, hierarchy, Imgproc.RETR_EXTERNAL, Imgproc.CHAIN_APPROX_NONE);
I iterate over each element of the contours variable, and if a contour's area reaches a specified threshold, I treat it as a person.
My question is whether I can detect people's shapes better using approxPolyDP.
Is it possible? If so, can you tell me how?
I already applied a CLOSE morphological operation before finding the contours.

"My question is whether I can detect people's shapes better using approxPolyDP."
Although "better" is somewhat ambiguous, you could improve your classification by using that method. From the docs we can see:
The functions approxPolyDP approximate a curve or a polygon with another curve/polygon with less vertices so that the distance between them is less or equal to the specified precision.
That "precision" refers to the epsilon parameter, which stands for the "maximum distance between the original curve and its approximation" (also from the docs). It is basically an accuracy parameter, commonly chosen as a fraction of the contour's arc length: the lower the value, the more closely the approximation follows the original contour.
We can see from this tutorial that a way to achieve this is:
epsilon = 0.1*cv2.arcLength(contour,True)
approx = cv2.approxPolyDP(contour,epsilon,True)
resulting in a simplified approximation of the contour. Note that the snippet above uses 10% of the arc length; in the example in that tutorial a much closer fit is obtained with 1% of the arc length, so you should carefully select this percentage for your specific situation.
By using these procedures you can smooth out jagged, noisy contour boundaries, which may help you separate people from other objects that have similar areas. You will also have to adjust your classification criterion (the >= some_area) accordingly, since the area of an approximated contour differs from the area of the raw one.
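To make the role of epsilon more concrete, here is a minimal pure-Python sketch of the Ramer-Douglas-Peucker idea that approxPolyDP is documented to use. It is not OpenCV's implementation: the helper names (simplify_open, simplify_closed, arc_length, polygon_area), the toy contour and the 2% factor are all assumptions made for illustration, and the way closed contours are split here may differ from what OpenCV does internally.

```python
import math

def point_segment_distance(p, a, b):
    """Distance from point p to the segment a-b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:
        return math.hypot(px - ax, py - ay)
    t = ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))

def simplify_open(points, epsilon):
    """Ramer-Douglas-Peucker on an open polyline (hypothetical helper)."""
    if len(points) < 3:
        return list(points)
    dists = [point_segment_distance(p, points[0], points[-1])
             for p in points[1:-1]]
    idx = max(range(len(dists)), key=dists.__getitem__) + 1
    if dists[idx - 1] <= epsilon:
        # Every intermediate point is within epsilon of the chord: drop them.
        return [points[0], points[-1]]
    left = simplify_open(points[:idx + 1], epsilon)
    right = simplify_open(points[idx:], epsilon)
    return left[:-1] + right

def simplify_closed(points, epsilon):
    """Split a closed contour at the point farthest from its first point,
    then simplify both halves (an assumption of this sketch)."""
    far = max(range(len(points)),
              key=lambda i: math.hypot(points[i][0] - points[0][0],
                                       points[i][1] - points[0][1]))
    first = simplify_open(points[:far + 1], epsilon)
    second = simplify_open(points[far:] + [points[0]], epsilon)
    return first[:-1] + second[:-1]

def arc_length(points):
    """Perimeter of a closed polygon."""
    n = len(points)
    return sum(math.hypot(points[(i + 1) % n][0] - points[i][0],
                          points[(i + 1) % n][1] - points[i][1])
               for i in range(n))

def polygon_area(points):
    """Area of a closed polygon (shoelace formula)."""
    n = len(points)
    s = sum(points[i][0] * points[(i + 1) % n][1]
            - points[(i + 1) % n][0] * points[i][1] for i in range(n))
    return abs(s) / 2.0

# Toy contour: a 10x20 rectangle traced pixel by pixel, with a one-pixel dent.
contour = ([(x, 0) for x in range(0, 11)] +
           [(10, y) for y in range(1, 21)] +
           [(x, 20) for x in range(9, -1, -1)] +
           [(0, y) for y in range(19, 0, -1)])
contour[5] = (5, 1)

coarse = simplify_closed(contour, 0.02 * arc_length(contour))  # about 1.2 px
fine = simplify_closed(contour, 0.5)                           # keeps the dent
print(len(contour), len(fine), len(coarse))
print(polygon_area(contour), polygon_area(coarse))
```

With the larger epsilon the dent is smoothed away and only the four corners remain, so the area changes slightly compared with the raw contour. That is why the area threshold has to be re-tuned after approximating.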

Related

Finding Homography v/s Contour detection

The problem is to detect a known rectangular object in an image.
Which of the following is computationally less expensive:
Finding homography - For finding homography, we use the template of the known object to do feature matching.
Contour detection - We try to detect the biggest contour in the image. In this particular case we assume that the biggest contour will correspond to the known rectangular object we are trying to find.
In both the cases we do perspective transform after detecting the object to set the perspective.
NOTE: We are using Open-CV functions to find the homography and detecting contour.
You should try finding the biggest contour. It's the simplest approach and will be far faster. You need to detect Canny edges, then find contours, and then pick the one with the biggest area. However, it can fail if the contours are unclear or if there is a bigger object in the image, as it doesn't consider shape. You can also combine both of your ideas to get better results.
EDIT:
To reply to your comment: you are comparing Canny edge + find contours + find biggest against find features + match features.
I think that the first combination is less computationally expensive. Moreover, there is a good implementation of squares/rectangle detection here.
However, if the contours of the rectangle are not clear, and if moreover the rectangle is highly textured, you should get better results with feature matching.

Confusion regarding Object recognition and features using SURF

I have some conceptual issues in understanding the SURF and SIFT algorithms (All about SURF). As far as I understand, SURF finds the Laplacian of Gaussians and SIFT operates on the difference of Gaussians. It then constructs a 64-variable vector around it to extract the features. I have applied this CODE.
(Q1 ) So, what forms the features?
(Q2) We initialize the algorithm using SurfFeatureDetector detector(500). So, does this mean that the size of the feature space is 500?
(Q3) The output of SURF Good_Matches gives matches between Keypoint1 and Keypoint2, and by tuning the number of matches we can conclude whether the object has been found/detected or not. What is meant by KeyPoints? Do these store the features?
(Q4) I need to build an object recognition application. In the code, it appears that the algorithm can recognize the book. So, it can be applied for object recognition. I was under the impression that SURF can be used to differentiate objects based on color and shape. But SURF and SIFT rely on corner/edge detection, so there is no point in using color images as training samples since they will be converted to gray scale. There is no option of using colors or HSV in these algorithms, unless I compute the keypoints for each channel separately, which is a different area of research (Evaluating Color Descriptors for Object and Scene Recognition).
So, how can I detect and recognize objects based on their color and shape? I think I can use SURF for differentiating objects based on their shape. Say, for instance, I have 2 books and a bottle. I need to recognize only a single book out of all the objects. But as soon as there are other similarly shaped objects in the scene, SURF gives lots of false positives. I would appreciate suggestions on what methods to apply for my application.
1. The local extrema (a response of the DoG that is greater (or smaller) than the responses of the neighbouring pixels around the point and in the upper and lower images of the pyramid, i.e. a 3x3x3 neighbourhood) form the coordinates of the feature (circle) center. The radius of the circle is given by the level of the pyramid.
2. It is the Hessian threshold. It means that you keep only the maxima (see 1) with values bigger than the threshold. A bigger threshold leads to fewer features, but the stability of the features is better, and vice versa.
3. Keypoint == feature. In OpenCV, KeyPoint is the structure used to store features.
4. No, SURF is good for comparing textured objects, but not for shape and color. For shape I recommend using MSER (but not the OpenCV one) or the Canny edge detector, not local features. This presentation might be useful
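To illustrate the first two points, here is a toy sketch of how candidate keypoints can be picked as local maxima in a 3x3x3 scale-space neighbourhood and then filtered by a threshold. It is only an assumption-laden illustration, not the SURF or SIFT implementation: the function name local_maxima_3d and the tiny hand-made response stack are made up, and real detectors also interpolate positions and work on resampled pyramid levels.

```python
def local_maxima_3d(stack, threshold):
    """Return (level, row, col, response) for every interior sample that is
    above `threshold` and strictly greater than its 26 neighbours.

    `stack` is a list of equally sized 2D response maps (lists of lists),
    one per scale level. Hypothetical helper for illustration only.
    """
    keypoints = []
    for s in range(1, len(stack) - 1):
        rows, cols = len(stack[s]), len(stack[s][0])
        for r in range(1, rows - 1):
            for c in range(1, cols - 1):
                value = stack[s][r][c]
                if value <= threshold:
                    continue  # too weak: rejected by the threshold
                neighbours = [stack[s + ds][r + dr][c + dc]
                              for ds in (-1, 0, 1)
                              for dr in (-1, 0, 1)
                              for dc in (-1, 0, 1)
                              if (ds, dr, dc) != (0, 0, 0)]
                if value > max(neighbours):
                    keypoints.append((s, r, c, value))
    return keypoints

# Three 5x5 response maps (all zeros) with one strong and one weak blob
# placed in the middle level.
stack = [[[0.0] * 5 for _ in range(5)] for _ in range(3)]
stack[1][1][1] = 800.0
stack[1][3][3] = 300.0

low = local_maxima_3d(stack, threshold=100.0)
high = local_maxima_3d(stack, threshold=500.0)
print(len(low), len(high))
```

Raising the threshold from 100 to 500 drops the weaker blob, which mirrors the trade-off described above: fewer features, but more stable ones.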

How to improve the homography accuracy?

I used OpenCV's cv::findHomography API to calculate the homography matrix of two planar images.
The matched key points are extracted by SIFT and matched by BFMatcher. As far as I know, cv::findHomography uses RANSAC iterations to find the best four corresponding points to get the homography matrix.
So I drew the selected four pairs of points, together with the contour of the object's edge computed using the homography matrix.
The result is shown at this link:
https://postimg.cc/image/5igwvfrx9/
As we can see, the matched points selected by RANSAC are correct, but the contour shows that the homography is not accurate.
But this test shows that both the selected matched points and the homography are correct:
https://postimg.cc/image/dvjnvtm53/
My guess is that if the selected matched points are too close together, a small error in pixel position leads to a significant error in the homography matrix. If the four points are near the corners of the image, then shifting the matched points by 4-6 pixels still gives a good homography matrix.
(In terms of homogeneous coordinates, I think this is reasonable, as a small error in the near plane is amplified far away.)
My questions are:
1. Is my guess right?
2. Since the four matched points are generated by the RANSAC iterations, the overall error over all the keypoints should be minimal. But how do I get a stable homography, at least one that maps the contour correctly? In theory, four corresponding points in a plane are enough to compute the homography matrix, but is there any trick in engineering practice?
I think you're right, and the proximity of the 4 points does not help the accuracy of the result. What you observe is maybe induced by numerical issues: the result may be locally correct for these 4 points but becomes worse when going further.
However, RANSAC will not help you here. The reason is simple: RANSAC is a robust estimation procedure that was designed to find the best point pairs among many correspondences (including some wrong ones). Then, in the inner loop of the RANSAC, a standard homography estimation is performed.
You can see RANSAC as a way to reject wrong point correspondences that would provoke a bad result.
Back to your problem:
What you really need is more points. In your examples, you use only 4 point correspondences, which is just enough to estimate a homography.
You will improve your result by providing more matches all over the target image. The problem then becomes over-determined, but a least squares solution can still be found by OpenCV. Furthermore, if there is some error either in the point correspondence process or in some point localization, RANSAC will be able to select the best ones and still give you a reliable result.
If RANSAC results in overfitting on some 4 points (as it seems to be the case in your example), try to relax the constraint by increasing the ransacReprojThreshold parameter.
Alternatively, you can either:
use a different estimator (the robust median CV_LMEDS is a good choice if there are few matching errors)
or use RANSAC in a first step with a large reprojection error (to get a rough estimate) in order to detect the spurious matchings then use LMEDS on the correct ones.
Just to extend @sansuiso's answer, with which I agree:
If you provide around 100 correspondences to RANSAC, probably you are getting more than 4 inliers from cvFindHomography. Check the status output parameter.
To obtain a good homography, you should have many more than 4 correspondences (note that 4 correspondences always give you a homography), which are well distributed around the image and which are not collinear. You can actually use a minimum number of inliers to decide whether the homography obtained is good enough.
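As a rough illustration of the minimum-inlier check, the sketch below recomputes which correspondences agree with a given homography and accepts the result only if enough of them do. It is written in plain Python rather than with OpenCV, and the helper names, the 3-pixel threshold and the minimum of 15 inliers are assumptions chosen for the example, not recommended values. In OpenCV you would normally read the inliers from the status/mask output instead of recomputing them.

```python
import math

def project(H, point):
    """Apply a 3x3 homography (nested lists) to a 2D point."""
    x, y = point
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    return ((H[0][0] * x + H[0][1] * y + H[0][2]) / w,
            (H[1][0] * x + H[1][1] * y + H[1][2]) / w)

def inlier_mask(H, src_pts, dst_pts, reproj_threshold=3.0):
    """True where the reprojection error is within the threshold (pixels)."""
    mask = []
    for src, dst in zip(src_pts, dst_pts):
        px, py = project(H, src)
        mask.append(math.hypot(px - dst[0], py - dst[1]) <= reproj_threshold)
    return mask

def enough_inliers(mask, min_inliers=15):
    """Hypothetical acceptance rule: require a minimum number of inliers."""
    return sum(mask) >= min_inliers

# Toy data: a pure translation by (5, -2) and 20 correspondences,
# three of which are deliberately wrong matches.
H = [[1.0, 0.0, 5.0],
     [0.0, 1.0, -2.0],
     [0.0, 0.0, 1.0]]
src = [(x * 10.0, y * 10.0) for x in range(5) for y in range(4)]
dst = [(x + 5.0, y - 2.0) for x, y in src]
for i in (0, 7, 13):
    dst[i] = (100.0, 100.0)

mask = inlier_mask(H, src, dst)
print(sum(mask), enough_inliers(mask))
```

A count alone says nothing about how the inliers are spread over the image, so in practice you may also want to check that they are not clustered in one corner.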
Note that RANSAC finds a set of points that are consistent, but the way it has to say that that set is the best one (the reprojection error) is a bit limited. There is a RANSAC-like method, called MSAC, that uses a slightly different error measurement, check it out.
The bad news, in my experience, is that you are unlikely to obtain a 100% precise homography most of the time. If you have several similar frames, you may see the homography change a little between them.
There are tricks to improve this. For example, after obtaining a homography with RANSAC, you can use it to project your model into the image, and look for new correspondences, so you can find another homography that should be more accurate.
Your target has a lot of symmetric and similar elements. As other people mentioned (and you clarified later), the point spacing and the number of points can be a problem. Another problem is that SIFT is not designed to deal with the significant perspective distortions that are present in your case. Try to track your object through smaller rotations and, as was mentioned, reproject it using the latest homography to make it look as close as possible to the original. This will also allow you to skip the processing-heavy SIFT and to use something as lightweight as FAST with cross-correlation of image patches for matching.
You may also eventually come to understand that using points is not enough. You have to use all that you have, and this means lines or conics. If a homography transforms a point as Pb = H * Pa, it is easy to verify that, in homogeneous coordinates, a line transforms as Lb = Hinv.transposed * La. This follows directly from the equation La' * Pa = 0 = La' * Hinv * H * Pa = La' * Hinv * Pb = Lb' * Pb, where ' denotes the transpose.
The possible minimal configurations are one line and three points, or three lines and one point. Two lines and two points do not work. You can use four lines or four points as well. Of course, this means that you cannot use the OpenCV function anymore and have to write your own DLT followed by non-linear optimization.
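The line-transfer rule can be checked numerically with a few lines of plain Python. This is only a sanity check under made-up values: the matrix H and the two points are arbitrary, and the helper functions are written here for the example rather than taken from any library.

```python
def mat_vec(M, v):
    """Multiply a 3x3 matrix (nested lists) by a 3-vector."""
    return [sum(M[r][c] * v[c] for c in range(3)) for r in range(3)]

def transpose(M):
    return [[M[c][r] for c in range(3)] for r in range(3)]

def inverse_3x3(M):
    """Inverse via the adjugate; raises if the matrix is (nearly) singular."""
    (a, b, c), (d, e, f), (g, h, i) = M
    det = a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)
    if abs(det) < 1e-12:
        raise ValueError("matrix is singular")
    return [[(e * i - f * h) / det, (c * h - b * i) / det, (b * f - c * e) / det],
            [(f * g - d * i) / det, (a * i - c * g) / det, (c * d - a * f) / det],
            [(d * h - e * g) / det, (b * g - a * h) / det, (a * e - b * d) / det]]

def cross(u, v):
    """Cross product; for homogeneous points it gives the line through them."""
    return [u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0]]

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

# Arbitrary example homography and two points in image a (homogeneous).
H = [[1.2, 0.1, 5.0],
     [-0.2, 0.9, 3.0],
     [0.001, 0.002, 1.0]]
Pa = [2.0, 3.0, 1.0]
Qa = [4.0, 7.0, 1.0]
La = cross(Pa, Qa)                 # line through Pa and Qa

Pb = mat_vec(H, Pa)                # points map with H
Qb = mat_vec(H, Qa)
Lb = mat_vec(transpose(inverse_3x3(H)), La)   # lines map with Hinv.transposed

# Both transformed points should still lie on the transformed line.
print(dot(Lb, Pb), dot(Lb, Qb))
```

Both printed values should be zero up to floating-point rounding, which confirms that incidence between points and lines is preserved.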

obtaining 2d-3d point correspondences for pnp or posit

I am trying to estimate the pose and position of a satellite given an image of it. I have a 3D model of the satellite. Using either PnP solvers or POSIT works great when I pick out the point correspondences myself; however, I need to find a method to match the points up automatically. Using a corner detector (the best one I have found so far is based on the contour) I can find all the relevant points in the image, in addition to a few spurious points. However, I need to match a given point in the image to the correct point in the 3D model. The articles I have read on the subject always seem to assume that we have found the point pairs, without going into details about how to do so.
Is there any approach usually taken that can determine these correspondences based on some invariant features? Or should I resort to a different method not based on corner points?
You can have a look at the SoftPOSIT algorithm, which determines 3D-2D correspondences and then executes POSIT algorithm. As far as I know Matlab code is available for SoftPOSIT.
You have to do PnP with RANSAC; see the OpenCV function solvePnPRansac(). This method can tolerate a high percentage of mismatches, so you don't need to be precise with all your matches but just need a certain percentage of correct ones (even as low as 30%). Of course, the minimum number of correct correspondences is 4.
Speaking of invariant features: if the amount of rotation between neighbouring frames is small, you don't need to use invariant features. Even a small patch of grey intensities would suffice to find a match. The only problem is that you have to update your descriptor or even choose a different feature point on your model depending on the model rotation. The latter may be hard to do since you have to know the 3D coordinates of every feature.

Metric for ellipse fitting in OpenCV

OpenCV has a nice in-built ellipse-fitting algorithm called fitEllipse(const Mat& points)
However, it has some major shortcomings, limiting its usefulness. For example, it already requires selected points, so I already have to do a feature extraction myself. HoughCircles detects circles on a given image, pity there is no HoughEllipses.
The other major shortcoming, which is at the center of my question, is that it does not provide any metric for how accurate the fit was. It returns an ellipse which best fits the given points, even if the shape does not even remotely look like an ellipse. Is there a way to get the estimated error from the algorithm? I would like to use it as a threshold to filter out shapes which are not even close to being ellipses.
I asked this, because maybe there is a simple solution before I try to reinvent the wheel and write my very own fitEllipse function.
If you don't mind getting your hands dirty, you could actually modify the source code for fitEllipse(). The fitEllipse() function uses least squares to determine the likely ellipse, and the residual of the least-squares solution is a tangible distance metric, which is what you want.
If that is something you're willing to do, it would be a very simple code change. Simply add a float whose value is passed back after the function call, where the float stores the current best least-squares value.
fitEllipse gives you the ellipse as a cv::RotatedRect and so you know the angle of rotation of the ellipse, its center and its two axes.
You can compute the sum of the square of the distances between your points and the ellipse, that sum is the metric you are looking for.
The distance between a point and an ellipse is described here http://www.geometrictools.com/Documentation/DistancePointEllipseEllipsoid.pdf and the code is here http://www.geometrictools.com/GTEngine/Include/Mathematics/GteDistPointHyperellipsoid.h
You need to go from OpenCV cv::RotatedRect to Ellipse2 of Geometric Tools Engine and then you can compute the distance.
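If you only need a rough number, a cheaper stand-in for the exact point-to-ellipse distance is to sample the ellipse densely and take the nearest sample for each point. The sketch below does that in plain Python. It is an approximation, not the Geometric Tools method linked above, and it assumes the cv::RotatedRect convention as I understand it: size holds the full axis lengths (width, height) and angle is in degrees, rotating the width axis. The function names and the sample count are made up for the example.

```python
import math

def ellipse_samples(center, size, angle_deg, samples=720):
    """Points on the ellipse described by (center, size, angle_deg).

    Assumes `size` holds the full axis lengths, so the semi-axes are half
    of them, and that `angle_deg` rotates the width axis.
    """
    cx, cy = center
    a, b = size[0] / 2.0, size[1] / 2.0
    cos_r = math.cos(math.radians(angle_deg))
    sin_r = math.sin(math.radians(angle_deg))
    pts = []
    for i in range(samples):
        t = 2.0 * math.pi * i / samples
        x, y = a * math.cos(t), b * math.sin(t)
        pts.append((cx + x * cos_r - y * sin_r, cy + x * sin_r + y * cos_r))
    return pts

def ellipse_fit_error(points, center, size, angle_deg, samples=720):
    """Approximate sum of squared distances from the points to the ellipse."""
    on_ellipse = ellipse_samples(center, size, angle_deg, samples)
    total = 0.0
    for px, py in points:
        total += min((px - ex) ** 2 + (py - ey) ** 2 for ex, ey in on_ellipse)
    return total

# A made-up fitted ellipse and two point sets: one lying on it, one a square.
center, size, angle = (50.0, 50.0), (60.0, 30.0), 30.0
good_points = ellipse_samples(center, size, angle, samples=8)
square = [(30.0, 30.0), (50.0, 30.0), (70.0, 30.0), (70.0, 50.0),
          (70.0, 70.0), (50.0, 70.0), (30.0, 70.0), (30.0, 50.0)]

good_error = ellipse_fit_error(good_points, center, size, angle)
bad_error = ellipse_fit_error(square, center, size, angle)
print(good_error, bad_error)
```

Dividing the sum by the number of points (and perhaps by the ellipse size) gives a value that can be compared across contours of different lengths, which makes it easier to pick a single rejection threshold.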
Why don't you do a findContours() to reduce the memory space required? There's your selected-points structure right there. If you want to simplify further, you can run convexHull() or approxPolyDP() on that. Fit the ellipse to those points, and then I suppose you can check the similarity between the two structures to get some kind of estimate. A difference operator between the two Mats would be a (very) rough estimate.
Depending on the application, you might be able to use CAMShift (or mean shift), which fits an ellipse to a region with similar colors.

Resources