OpenCV recoverPose returns few inliers (sometimes 0) for about 30% of cases - opencv

I try to triangulate key points around the camera but only a few points reach OpenCV method triangulatePoints() as inliers through pipeline. I use OpenCV 4.5.0.
My pipeline is:
i'm creating the feature detector using SURF, which detects the keypoints and computes the descriptors
detector_ = cv::xfeatures2d::SURF::create(400, 4, 2, false, false);
detector_->detectAndCompute(img, cv::noArray(), keyPoints, descriptors);
i'm matching the keypoints from two adjacent images using 2 candidates and filtering out those of them where two candidates are too close.
i'm using fundamental matrix to find essential because using of findEssentialMat leads to strange results at the end of the pipeline and i read somewhere about 5-points algorithm unreliability.
for (int i = 0; i < (int) goodMatches.size(); i++) {
prevSurvivors.push_back(keyPoints1[goodMatches[i].queryIdx].pt);
curSurvivors.push_back(keyPoints2[goodMatches[i].trainIdx].pt);
}
fundamental_matrix_ = cv::findFundamentalMat(prevSurvivors, curSurvivors, outputMask, cv::FM_RANSAC);
i'm removing those points from prevSurvivors and curSurvivors for which outputMask has 0. After that i'm calculating essential matrix
essential_matrix_ = INTRINSICS.t() * fundamental_matrix_ * INTRINSICS;
Finally i'm checking the rank of the essential matrix and calling the recoverPose method.
bool hasEssentialMatGoodRank = hasEssentialMatrixAppropriateRank();
if (hasEssentialMatGoodRank) {
outputMask.release();
cv::recoverPose(essential_matrix_, prevSurvivors, curSurvivors, INTRINSICS, R, t, outputMask);
}
What i see is the outputMask which may have 50 inliers for frames N and N+1, but 0 for frames N+1 and N+2. It breaks my pipeline and i can't understand why.
Frames N, N+1 and N+2
N and N+1
N+1 and N+2
The questions are:
what i'm doing wrong with pipeline or
which options, algorithms or methods i should change to get better results

Related

OpenCV: why solvepnp reprojection error(rMSE)'s definition is different from usual

According to the definition, RMSE should be.
However, I found the below code from the OpenCV solvepnp reprojection part. (github)
for (size_t i = 0; i < vec_rvecs.size(); i++)
{
std::vector<Point2d> projectedPoints;
projectPoints(objectPoints, vec_rvecs[i], vec_tvecs[i], cameraMatrix, distCoeffs, projectedPoints);
double rmse = norm(Mat(projectedPoints, false), imagePoints, NORM_L2) / sqrt(2*projectedPoints.size());
}
I suppose RMSE here is defined as follows.
I'm confused about the "2 n" part. It seems to me that opencv treats err_x and err_y as individual errors, thus there will be 2xn elements in total. why doesn't treat it as one element since .

Simple registration algorithm for small sets of 2D points

I am trying to find a simple algorithm to find the correspondence between two sets of 2D points (registration). One set contains the template of an object I'd like to find and the second set mostly contains points that belong to the object of interest, but it can be noisy (missing points as well as additional points that do not belong to the object). Both sets contain roughly 40 points in 2D. The second set is a homography of the first set (translation, rotation and perspective transform).
I am interested in finding an algorithm for registration in order to get the point-correspondence. I will be using this information to find the transform between the two sets (all of this in OpenCV).
Can anyone suggest an algorithm, library or small bit of code that could do the job? As I'm dealing with small sets, it does not have to be super optimized. Currently, my approach is a RANSAC-like algorithm:
Choose 4 random points from set 1 and from set 2.
Compute transform matrix H (using openCV getPerspective())
Warp 1st set of points using H and test how they aligned to the 2nd set of points
Repeat 1-3 N times and choose best transform according to some metric (e.g. sum of squares).
Any ideas? Thanks for your input.
With python you can use Open3D librarry, wich is very easy to install in Anaconda. To your purpose ICP should work fine, so we'll use the classical ICP, wich minimizes point-to-point distances between closest points in every iteration. Here is the code to register 2 clouds:
import numpy as np
import open3d as o3d
# Parameters:
initial_T = np.identity(4) # Initial transformation for ICP
distance = 0.1 # The threshold distance used for searching correspondences
(closest points between clouds). I'm setting it to 10 cm.
# Read your point clouds:
source = o3d.io.read_point_cloud("point_cloud_1.xyz")
target = o3d.io.read_point_cloud("point_cloud_0.xyz")
# Define the type of registration:
type = o3d.pipelines.registration.TransformationEstimationPointToPoint(False)
# "False" means rigid transformation, scale = 1
# Define the number of iterations (I'll use 100):
iterations = o3d.pipelines.registration.ICPConvergenceCriteria(max_iteration = 100)
# Do the registration:
result = o3d.pipelines.registration.registration_icp(source, target, distance, initial_T, type, iterations)
result is a class with 4 things: the transformation T(4x4), 2 metrict (rmse and fitness) and the set of correspondences.
To acess the transformation:
I used it a lot with 3D clouds obteined from Terrestrial Laser Scanners (TLS) and from robots (Velodiny LIDAR).
With MATLAB:
We'll use the point-to-point ICP again, because your data is 2D. Here is a minimum example with two point clouds random generated inside a triangle shape:
% Triangle vértices:
V1 = [-20, 0; -10, 10; 0, 0];
V2 = [-10, 0; 0, 10; 10, 0];
% Create clouds and show pair:
points = 5000
N1 = criar_nuvem_triangulo(V1,points);
N2 = criar_nuvem_triangulo(V2,points);
pcshowpair(N1,N2)
% Registrate pair N1->N2 and show:
[T,N1_tranformed,RMSE]=pcregistericp(N1,N2,'Metric','pointToPoint','MaxIterations',100);
pcshowpair(N1_tranformed,N2)
"criar_nuvem_triangulo" is a function to generate random point clouds inside a triangle:
function [cloud] = criar_nuvem_triangulo(V,N)
% Function wich creates 2D point clouds in triangle format using random
% points
% Parameters: V = Triangle vertices (3x2 Matrix)| N = Number of points
t = sqrt(rand(N, 1));
s = rand(N, 1);
P = (1 - t) * V(1, :) + bsxfun(#times, ((1 - s) * V(2, :) + s * V(3, :)), t);
points = [P,zeros(N,1)];
cloud = pointCloud(points)
end
results:
You may just use cv::findHomography. It is a RANSAC-based approach around cv::getPerspectiveTransform.
auto H = cv::findHomography(srcPoints, dstPoints, CV_RANSAC,3);
Where 3 is the reprojection threshold.
One traditional approach to solve your problem is by using point-set registration method when you don't have matching pair information. Point set registration is similar to method you are talking about.You can find matlab implementation here.
Thanks

OpenCV: Essential Matrix Decomposition

I am trying to extract Rotation matrix and Translation vector from the essential matrix.
<pre><code>
SVD svd(E,SVD::MODIFY_A);
Mat svd_u = svd.u;
Mat svd_vt = svd.vt;
Mat svd_w = svd.w;
Matx33d W(0,-1,0,
1,0,0,
0,0,1);
Mat_<double> R = svd_u * Mat(W).t() * svd_vt; //or svd_u * Mat(W) * svd_vt;
Mat_<double> t = svd_u.col(2); //or -svd_u.col(2)
</code></pre>
However, when I am using R and T (e.g. to obtain rectified images), the result does not seem to be right(black images or some obviously wrong outputs), even so I used different combination of possible R and T.
I suspected to E. According to the text books, my calculation is right if we have:
E = U*diag(1, 1, 0)*Vt
In my case svd.w which is supposed to be diag(1, 1, 0) [at least in term of a scale], is not so. Here is an example of my output:
svd.w = [21.47903827647813; 20.28555196246256; 5.167099204708699e-010]
Also, two of the eigenvalues of E should be equal and the third one should be zero. In the same case the result is:
eigenvalues of E = 0.0000 + 0.0000i, 0.3143 +20.8610i, 0.3143 -20.8610i
As you see, two of them are complex conjugates.
Now, the questions are:
Is the decomposition of E and calculation of R and T done in a right way?
If the calculation is right, why the internal rules of essential matrix are not satisfied by the results?
If everything about E, R, and T is fine, why the rectified images obtained by them are not correct?
I get E from fundamental matrix, which I suppose to be right. I draw epipolar lines on both the left and right images and they all pass through the related points (for all the 16 points used to calculate the fundamental matrix).
Any help would be appreciated.
Thanks!
I see two issues.
First, discounting the negligible value of the third diagonal term, your E is about 6% off the ideal one: err_percent = (21.48 - 20.29) / 20.29 * 100 . Sounds small, but translated in terms of pixel error it may be an altogether larger amount.
So I'd start by replacing E with the ideal one after SVD decomposition: Er = U * diag(1,1,0) * Vt.
Second, the textbook decomposition admits 4 solutions, only one of which is physically plausible (i.e. with 3D points in front of the camera). You may be hitting one of non-physical ones. See http://en.wikipedia.org/wiki/Essential_matrix#Determining_R_and_t_from_E .

How to access the nearest neighbours found by a cv::flann knnsearch?

Or is it even possible with flann ? Im not the most experienced coder, I also might just be overlooking something really basic (C++,OpenCV 2.4.3.)
The problem :
I have two pointclouds and want to calculate a displacement map. I am trying to use the flann .lib to get the nearest neighbour to a point in the first cloud from the points of the second cloud, and use them and the distance to calculate the displacement vector(s).
What I got so far is this:
int nn = 1;
cv::Mat MyIndex(data1.size(),3,CV_64FC1);
cv::Mat MyQuery(data2.size(),3,CV_64FC1);
cv::Mat indices(data2.size(),1,CV_32SC1);
cv::Mat distances(data2.size(),3,CV_64FC1);
cv::flann::Index_<double> NN_Index(MyIndex, cvflann::KDTreeIndexParams(4));
NN_Index.knnsearch(MyQuery,indices,distances,nn,cvflann::SearchParams(32));
It works as far as I can tell, I got the distances, I got the query points, I got the indices. But how do I get the actual points that got matched to my query points, from the indices ?
I looked through the flann.hpp but couldn't really find any hints. I messed arround a bit with MyIndex, NN_Index and the indices, but didn't get any useful results.
Try
for (int queryIdx = 0; queryIdx < MyQuery.rows; ++queryIdx) {
int dbIdx = indices.at<int>(queryIdx, 0);
std::cout<<"Query Idx:"<<queryIdx<<" matched to "<<"Database Idx:"<<dbIdx<<std::endl;
}

How to get most similar Eigenfaces or Fisherfaces in OpenCV?

I'm trying to find a measurement for the similarity of 2 faces. I use OpenCV. For that I train Eigenfaces / Fisherfaces with 1000 Photos of 1000 different people (so 1 Photo each person). So I also have 1000 labels in the training set.
Now I can use the predict method to get the most similar face.
I want to input 2 unknown face images to find if they are both similar to the same vector of faces in the training set.
Here is the code of openCV that returns the most similar label (with the lowest distance).
for(size_t sampleIdx = 0; sampleIdx < _projections.size(); sampleIdx++) {
double dist = norm(_projections[sampleIdx], q, NORM_L2);
if((dist < minDist) && (dist < _threshold)) {
minDist = dist;
minClass = _labels.at<int>((int)sampleIdx);
}
Questions:
Can anyone tell me how to rewrite this to output the top 10 faces and not just the top 1 ? I'm thinking about pushing them into a priority queue, but maybe there is something easier?!
In the training: should I put all the faces on the same label or on different labels? So should I have 1 label or 1000 ?
Cheers
Here's what I did. Note I'm really good at perl, really newb at C++ (in fact, this is my first c++ project!) so I output a lot to the command line and parsed it with perl.
I went to facerec.cpp as you did, and I changed the contents of the for loop to this:
for(size_t sampleIdx = 0; sampleIdx < _projections.size(); sampleIdx++) {
double dist = norm(_projections[sampleIdx], q, NORM_L2);
int labelClass = _labels.at<int>((int)sampleIdx);
cout << dist << " " << labelClass << endl;
if((dist < minDist) && (dist < _threshold)) {
minDist = dist;
minClass = _labels.at<int>((int)sampleIdx);
}
}
This now outputs the distance and label of every face. Since all the predict function appears to do is take the picture with the shortest distance (lowest number) and return that as the answer, you can now take the resulting list, sort it, and take the first 10 results. Or you can take the first ten labels or whatever. This just gives you access to all of the data rather than the first X results.
I also added
#include <iostream>
using namespace std;
to the top of the file so I could use cout.
Q1:: Since OpenCV doesn't provide a default function, you have to create your own by creating a vector which has distance and label. You can write your own function as below and store the distance and label in the vector. Here you need to rebuild the opencv.
virtual void predict(InputArray src, int &label, double &confidence, Vector <variable>) const = 0;

Resources