Detecting garbage homographies from findHomography in OpenCV? - opencv

I'm using findHomography on a list of points and sending the result to warpPerspective.
The problem is that sometimes the result is complete garbage and the resulting image is represented by weird gray rectangles.
How can I detect when findHomography sends me bad results?

There are several sanity tests you can perform on the output. On top of my head:
Compute the determinant of the homography, and see if it's too close to zero for comfort.
Even better, compute its SVD, and verify that the ratio of the first-to-last singular
value is sane (not too high). Either result will tell you whether the matrix is close to
singular.
Compute the images of the image corners and of its center (i.e. the points you get when
you apply the homography to those corners and center), and verify that they make sense,
i.e. are they inside the image canvas (if you expect them to be)? Are they well separated
from each other?
Plot in matlab/octave the output (data) points you fitted the homography to, along
with their computed values from the input ones, using the homography, and verify that they
are close (i.e. the error is low).
A common mistake that leads to garbage results is incorrect ordering of the lists of input and output points, that leads the fitting routine to work using wrong correspondences. Check that your indices are correct.

Understanding the degenerate homography cases is the key. You cannot get a good homography if your points are collinear or close to collinear, for example. Also, huge gray squares may indicate extreme scaling. Both cases may arise from the fact that there are very few inliers in your final homography calculation or the mapping is wrong.
To ensure that this never happens:
1. Make sure that points are well spread in both images.
2. Make sure that there are at least 10-30 correspondences (4 is enough if noise is small).
3. Make sure that points are correctly matched and the transformation is a homography.
To find bad homographies apply found H to your original points and see the separation from your expected points that is |x2-H*x1| < Tdist, where Tdist is your threshold for distance error. If there are only few points that satisfy this threshold your homography may be bad and you probably violated one of the above mentioned requirements.

But this depends on the point-correspondences you use to compute the homography...
Just think that you are trying to find a transformation that maps lines to lines (from one plane to another), so not any possible configuration of point-correspondences will give you an homography that creates nice images.
It is even possible that the homography maps some of the points to the infinity.

Related

How to 3d reconstruct robustly from multiple images with known poses in OpenCV

The traditional solution for high resolution images examples :
extract features (dense) for all images
match features to find tracks through images
triangulate features to 3d points.
I can give two problem here for my case (many 640*480 images with small movements between each others) , first: matching is very slow , especially if the number of images is big, so a better solution can be optical flow tracking.., but it's getting sparse with big moves, ( a mix could solve the problem !!)
second: triangulate tracks , though it is over-determined problem, I find it hard to code a solution, .. (here am asking for simplifying what I read in references )
I searched quite a bit for libraries in that direction, with no useful result.
again, I have ground truth camera matrices and need only 3d positions as first estimate (without BA),
A coded software solution can be of great help as I don't need to reinvent the wheel, though a detailed instructions maybe helpful
this basically shows the underlying geometry for estimating the depth.
As you said, we have camera pose Q, and we are picking a point X from world, X_L is it's projection on left image, now, with Q_L, Q_R and X_L, we are able to make up this green colored epipolar plane, the rest job is easy, we search through points on line (Q_L, X), this line exactly describe the depth of X_L, with different assumptions: X1, X2,..., we can get different projections on the right image
Now we compare the pixel intensity difference from X_L and the reprojected point on right image, just pick the smallest one and that corresponding depth is exactly what we want.
Pretty easy hey? Truth is it's way harder, image is never strictly convex:
This makes our matching extremely hard, since the non-convex function will result any distance function have multiple critical points (candidate matches), how do you decide which one is the correct one?
However, people proposed path based match to handle this problem, methods like: SAD, SSD, NCC, they are introduced to create the distance function as convex as possible, still, they are unable to handle large scale repeated texture problem and low texture problem.
To solve this, people start to search over a long range in the epipolar line, and suddenly found that we can describe this whole distribution of matching metrics into a distance along the depth.
The horizontal axis is depth, and the vertical axis is matching metric score, and this illustration lead us found the depth filter, and we usually describe this distribution with gaussian, aka, gaussian depth filter, and use this filter to discribe the uncertainty of depth, combined with the patch matching method, we can roughly get a proposal.
Now what, let's use some optimization tools, like GN or gradient descent to finally refine the depth estimaiton.
To sum up, the total process of the depth estimation is like the following steps:
assume all depth in all pixel following a initial gaussian distribution
start search through epipolar line and reproject points into target frame
triangulate depth and calculate the uncertainty of the depth from depth filter
run 2 and 3 again to get a new depth distribution and merge with previous one, if they converged then break, ortherwise start again from 2.

How to improve the homography accuracy?

I used OpenCV's cv::findHomography API to calculate the homography matrix of two planar images.
The matched key points are extracted by SIFT and matched by BFMatcher. As I know, cv:findHomography use RANSAC iteration to find out the best four corresponding points to get the homography matrix.
So I draw the selected four pairs of points with the calculated contour using homograhy matrix of the edge of the object.
The result are as the links:
https://postimg.cc/image/5igwvfrx9/
As we can see, the selected matched points by RANSAC are correct, but the contour shows that the homography is not accurate.
But these test shows that, both the selected matched points and the homography are correct:
https://postimg.cc/image/dvjnvtm53/
My guess is that if the selected matched points are too close, the small error of the pixel position will lead to the significant error of the homography matrix. If the four points are in the corner of the image, then the shift of the matched points by 4-6 pixels still got good homography matrix.
(According the homogenous coordinate, I think it is reasonable, as the small error in the near plane will be amplified in the far away)
My question is:
1.Is my guess right?
2.Since the four matched points are generated by the RANSAC iteration, the overall error of all the keypoints are minimal. But How to get the stable homography, at least making the contour's mapping is correct? The theory proved that if the four corresponding points in a plane are found, the homography matrix should be calculated, but is there any trick in the engineer work?
I think you're right, and the proximity of the 4 points does not help the accuracy of the result. What you observe is maybe induced by numerical issues: the result may be locally correct for these 4 points but becomes worse when going further.
However, RANSAC will not help you here. The reason is simple: RANSAC is a robust estimation procedure that was designed to find the best point pairs among many correspondences (including some wrong ones). Then, in the inner loop of the RANSAC, a standard homography estimation is performed.
You can see RANSAC as a way to reject wrong point correspondences that would provoke a bad result.
Back to your problem:
What you really need is to have more points. In your examples, you use only 4 point correspondences, which is just enough to estimate an homography.
You will improve your result by providing more matches all over the target image. The problem then becomes over-determined, but a least squares solution can still be found by OpenCV. Furthermore, of there is some error either in the point correspondence process or in some point localization, RANSAC will be able to select the best ones and still give you a reliable result.
If RANSAC results in overfitting on some 4 points (as it seems to be the case in your example), try to relax the constraint by increasing the ransacReprojThreshold parameter.
Alternatively, you can either:
use a different estimator (the robust median CV_LMEDS is a good choice if there are few matching errors)
or use RANSAC in a first step with a large reprojection error (to get a rough estimate) in order to detect the spurious matchings then use LMEDS on the correct ones.
Just to extend #sansuiso's answer, with which I agree:
If you provide around 100 correspondences to RANSAC, probably you are getting more than 4 inliers from cvFindHomography. Check the status output parameter.
To obtain a good homography, you should have many more than 4 correspondences (note that 4 correspondences gives you an homography always), which are well distributed around the image and which are not linear. You can actually use a minimum number of inliers to decide whether the homography obtained is good enough.
Note that RANSAC finds a set of points that are consistent, but the way it has to say that that set is the best one (the reprojection error) is a bit limited. There is a RANSAC-like method, called MSAC, that uses a slightly different error measurement, check it out.
The bad news, in my experience, is that it is little likely to obtain a 100% precision homography most of the times. If you have several similar frames, it is possible that you see that homography changes a little between them.
There are tricks to improve this. For example, after obtaining a homography with RANSAC, you can use it to project your model into the image, and look for new correspondences, so you can find another homography that should be more accurate.
Your target has a lot of symmetric and similar elements. As other people mentioned (and you clarified later) the point spacing and point number can be a problem. Another problem is that SIFT is not designed to deal with significant perspective distortions that are present in your case. Try to track your object through smaller rotations and as was mentioned reproject it using the latest homography to make it look as close as possible to the original. This will also allow you to skip processing heavy SIFT and to use something as lightweight as FAST with cross correlation of image patches for matching.
You also may eventually come to understanding that using points is not enough. You have to use all that you got and this means lines or conics. If a homography transforms a point Pb = H* Pa it is easy to verify that in homogeneous coordinates line Lb = Henv.transposed * La. this directly follows from the equation La’.Pa = 0 = La’ * Hinv * H * Pa = La’ * Hinv * Pb = Lb’.Pb
The possible min. configurations is 1 line and three points or three lines and one point. Two lines and two points doesn’t work. You can use four lines or four points as well. Of course this means that you cannot use the openCV function anymore and has to write your own DLT and then non-linear optimization.

obtaining 2d-3d point correspondences for pnp or posit

I am trying to estimate the pose and position of a satellite given an image of it. I have a 3D model of the satellite. Using either PnP solvers or POSIT works great when I pick out the point correspondences myself, however I need to to find a method to match the points up automatically. Using a corner detector (best one I found so far is based on the contour) I can find all the relevant points in the image in addition a few spurious points. However I need to match a given point in the image to the correct point in the 3D model. The articles I have read on the subject always seem to assume that we have found the point pairs without going into details about how to do so.
Is there any approach usually taken that can determine these correspondences based on some invariant features? Or should i resort to a different method not based on corner points?
You can have a look at the SoftPOSIT algorithm, which determines 3D-2D correspondences and then executes POSIT algorithm. As far as I know Matlab code is available for SoftPOSIT.
ou have to do PnP with RANSAC, see openCV code solvePnPRansac(). This method can tolerate a high percent of mismatches so you don't need to be precise with all your matches but just have a certain percent of correct ones (even as low as 30%). Of course the min number of right correspondences is 4.
Speaking of invariant features - if the amount of rotation between neighbouring frame is small you don't need to use invariant features. Even a small patch of with grey intensities would suffice to find a match. The only problem is that you have to update your descriptor or even choose a different feature point on your model depending on the model rotation. The latter may be hard to do since you have to know 3D coordinate of every feature.

Opencv FindCornerSubPix precision

I need to detect corners on grayscale images with the highest possible accuracy. Currently I am using the OpenCV function: cvFindCornerSubPix().
I prepared a simple test: got an image with a corner of black/white edges:
and then a series of the same object, moved 1/16pixel each. I did check pixel values manually, test images are just fine.
Detection results were disappointing:
Even though in TermCrit the condition is set to 100 iterations or 0.005 threshold, detection error gets as big as 0.08 pixel.
The graph shows error as a function of position within a pixel. Does not look random at all. Another thing worth a note: for other anular positions of the corner (when edges are not horizontal/vertical) results are better, but still not perfect.
Any idea, how to make this function work properly, why it does not, or what to use instead?
I would greatly appreciate any advice
Less than 10% of a pixel really isn't bad performance at all. For reference, a correlation peak detector suitable for the production of 3D model from satellite images will have the same order of magnitude of error.
As pointed out in the comments, the exact error pattern will depend on the interpolation method that you use to generate the subpixel pattern. In order to avoid the non-monotonicity introduced by higher-order interpolation methods (beyond order 2), I would suggest the following protocol:
Generate you input image in a high-res, 16 times bigger;
Move your target by 1 pixel at a time in this HR image;
Produce your test images by downsampling (careful: apply an appropriate blurring function like a PSF first if you go for brutal downsampling in order to avoid aliasing) to the correct size.
Finally, it is often not desirable to go to a smaller error magnitude. The subpixel corner detector was designed to be used in images where many (typically between 20 and 100) points are detected. These points are then used in a robust estimation process that should remove outliers and average the error on the valid point sets.

How to compare two contours of a binary pattern image?

I'm creating a part scanner in C that pulls all possibilities for scanned parts as images in a directory. My code currently fetches all images from that directory and dumps them into a vector. I then produce groups of contours for all the images. The program then falls into a while loop where it constantly grabs images from a webcam, and generates contours for those as well. I have set up a jig for the part to rest on, so orientation and size are not a concern, however I don't want to have to calibrate the machine, so there may be movement between the template images and the part images taken.
What is the best way to compare the contours? I have tried several methods including matchTemplate without contours, but if you take a look at the two parts below, you can see that these two are very close to each other, so matchShapes and matchTemplate can't distinguish between them the way I was using them. I'm also not sure how to use cvMatchShapes. It works with just loading the images directly into match shapes, but the results are inconclusive. I think that contours is the way to go, I'm just not sure of how to go about implementing the comparison phase. Any help would be great.
You can view the templates here: http://www.cryogendesign.com/partDetection.html"
If you are ready for do-it-yourself, one approach could be to compute a "distance image" (assign every pixel the smallest Euclidean distance to the contour taken as the reference). See http://en.wikipedia.org/wiki/Distance_transform.
Using this distance image, you can quickly compute the average distance of a new contour to the reference one (for every contour pixel, get the distance from the distance image). The average distance gives you an indication of the goodness-of-fit and will let you find the best match to a set of reference templates.
If the parts have some moving freedom, the situation is a bit harder: before computing the average distance, you must fit the new contour to the reference one. You will need to apply a suitable transform (translation, rotation, possibly scaling), and find the parameters that will minimize... the average distance.
You can calculate the chamfer distance between the two contours:
T and E are the set of edges of the template and the image and x is the point of reference where you start to compare the two set of edges. So for each x you get a different value.
DT is the distance transform of an image. Matlab provides the algorithm here.
If you want a more detailed version of how to calculate the chamfer distance, take a look here.

Resources