OpenCV: why is solvePnP's reprojection error (RMSE) defined differently from the usual definition?

According to the usual definition, RMSE should be

RMSE = sqrt( (1/n) * sum_{i=1..n} ||err_i||^2 ),

where err_i = (err_x_i, err_y_i) is the 2D reprojection error of the i-th point.
However, I found the code below in the OpenCV solvePnP reprojection-error computation (on GitHub):
for (size_t i = 0; i < vec_rvecs.size(); i++)
{
    std::vector<Point2d> projectedPoints;
    projectPoints(objectPoints, vec_rvecs[i], vec_tvecs[i], cameraMatrix, distCoeffs, projectedPoints);
    // norm(..., NORM_L2) returns the square root of the sum of squared
    // differences over every x and y coordinate of the two point lists
    double rmse = norm(Mat(projectedPoints, false), imagePoints, NORM_L2) / sqrt(2*projectedPoints.size());
}
I suppose RMSE here is defined as follows:

RMSE = sqrt( (1/(2n)) * sum_{i=1..n} (err_x_i^2 + err_y_i^2) ).
I'm confused about the "2n" part. It seems to me that OpenCV treats err_x and err_y as individual errors, so there are 2n elements in total. Why doesn't it treat each point's error as a single element, given that ||err_i||^2 = err_x_i^2 + err_y_i^2?
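To make the difference concrete, here is a minimal, self-contained sketch with made-up points that computes both conventions; they differ exactly by a factor of sqrt(2):

#include <opencv2/core.hpp>
#include <cmath>
#include <iostream>
#include <vector>

int main()
{
    // made-up observed vs. reprojected points, purely for illustration
    std::vector<cv::Point2d> observed  = { {10, 20}, {30, 40}, {50, 60} };
    std::vector<cv::Point2d> projected = { {11, 19}, {29, 42}, {52, 61} };

    double sumSq = 0.0; // sum of squared errors over all x and y coordinates
    for (size_t i = 0; i < observed.size(); i++)
    {
        cv::Point2d e = projected[i] - observed[i];
        sumSq += e.x * e.x + e.y * e.y; // err_x^2 + err_y^2
    }
    const double n = static_cast<double>(observed.size());

    double rmsePerPoint = std::sqrt(sumSq / n);       // usual definition: n elements
    double rmsePerCoord = std::sqrt(sumSq / (2 * n)); // OpenCV: 2n elements
    std::cout << rmsePerPoint << " == " << rmsePerCoord * std::sqrt(2.0) << std::endl;
    return 0;
}

So the two definitions agree up to the constant factor sqrt(2); OpenCV simply averages over coordinates rather than over points.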

Related

OpenCV recoverPose returns few inliers (sometimes 0) for about 30% of cases

I am trying to triangulate keypoints around the camera, but only a few points make it through the pipeline to the OpenCV method triangulatePoints() as inliers. I use OpenCV 4.5.0.
My pipeline is:
I'm creating the feature detector using SURF, which detects the keypoints and computes the descriptors:
detector_ = cv::xfeatures2d::SURF::create(400, 4, 2, false, false);
detector_->detectAndCompute(img, cv::noArray(), keyPoints, descriptors);
I'm matching the keypoints from two adjacent images using 2 candidates per keypoint and filtering out those matches where the two candidates are too close, roughly as in the sketch below.
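A minimal sketch of such a matching step (descriptors1, descriptors2 and the 0.8 ratio threshold are assumed names and values, not taken from the question):

// Sketch of the ratio-test matching step described above.
cv::BFMatcher matcher(cv::NORM_L2);   // SURF descriptors are float, so use the L2 norm
std::vector<std::vector<cv::DMatch> > knnMatches;
matcher.knnMatch(descriptors1, descriptors2, knnMatches, 2);  // 2 candidates per keypoint
const float ratioThreshold = 0.8f;    // assumed value; tune for your data
std::vector<cv::DMatch> goodMatches;
for (size_t i = 0; i < knnMatches.size(); i++)
{
    // keep a match only if its best candidate clearly beats the second best
    if (knnMatches[i].size() == 2 &&
        knnMatches[i][0].distance < ratioThreshold * knnMatches[i][1].distance)
        goodMatches.push_back(knnMatches[i][0]);
}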
I'm using the fundamental matrix to find the essential matrix, because using findEssentialMat leads to strange results at the end of the pipeline, and I read somewhere that the 5-point algorithm can be unreliable:
for (int i = 0; i < (int) goodMatches.size(); i++) {
    prevSurvivors.push_back(keyPoints1[goodMatches[i].queryIdx].pt);
    curSurvivors.push_back(keyPoints2[goodMatches[i].trainIdx].pt);
}
fundamental_matrix_ = cv::findFundamentalMat(prevSurvivors, curSurvivors, outputMask, cv::FM_RANSAC);
I'm removing the points from prevSurvivors and curSurvivors for which outputMask contains 0. After that I'm calculating the essential matrix:
essential_matrix_ = INTRINSICS.t() * fundamental_matrix_ * INTRINSICS;
Finally, I'm checking the rank of the essential matrix and calling the recoverPose method:
bool hasEssentialMatGoodRank = hasEssentialMatrixAppropriateRank();
if (hasEssentialMatGoodRank) {
    outputMask.release();
    cv::recoverPose(essential_matrix_, prevSurvivors, curSurvivors, INTRINSICS, R, t, outputMask);
}
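The helper hasEssentialMatrixAppropriateRank() is not defined in the question. For context, a minimal sketch of what such a check could look like, written here as a free function taking the matrix (the tolerance is an assumed value): a valid essential matrix has rank 2, i.e. two non-zero singular values and a third near zero.

#include <opencv2/core.hpp>

bool hasEssentialMatrixAppropriateRank(const cv::Mat &E)
{
    // decompose E and inspect its singular values
    cv::Mat w, u, vt;
    cv::SVD::compute(E, w, u, vt);
    const double eps = 1e-6; // assumed tolerance, tune as needed
    return w.at<double>(0) > eps &&
           w.at<double>(1) > eps &&
           w.at<double>(2) < eps * w.at<double>(0); // third singular value ~ 0
}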
What I see is that outputMask may have 50 inliers for frames N and N+1, but 0 for frames N+1 and N+2. That breaks my pipeline, and I can't understand why.
[Images omitted: frames N, N+1 and N+2, showing the matches for N/N+1 and for N+1/N+2]
The questions are:
what am I doing wrong in the pipeline, or
which options, algorithms or methods should I change to get better results?

Homography matrix in Opencv?

In LATCH_match.cpp in opencv_3.1.0, the homography matrix is defined and used as:
Mat homography;
FileStorage fs("../data/H1to3p.xml", FileStorage::READ);
...
fs.getFirstTopLevelNode() >> homography;
...
Mat col = Mat::ones(3, 1, CV_64F);
col.at<double>(0) = matched1[i].pt.x;
col.at<double>(1) = matched1[i].pt.y;
col = homography * col;
...
Why is H1to3p.xml the following?
<opencv_storage><H13 type_id="opencv-matrix"><rows>3</rows><cols>3</cols><dt>d</dt><data>
7.6285898e-01 -2.9922929e-01 2.2567123e+02
3.3443473e-01 1.0143901e+00 -7.6999973e+01
3.4663091e-04 -1.4364524e-05 1.0000000e+00 </data></H13></opencv_storage>
By which criteria were these numbers chosen? Can they be used for any other homography test for filtering keypoints (as in LATCH_match.cpp)?
I assume that your "LATCH_match.cpp in opencv_3.1.0" is
https://github.com/Itseez/opencv/blob/3.1.0/samples/cpp/tutorial_code/xfeatures2D/LATCH_match.cpp
In that file, you find:
// If you find this code useful, please add a reference to the following paper in your work:
// Gil Levi and Tal Hassner, "LATCH: Learned Arrangements of Three Patch Codes", arXiv preprint arXiv:1501.03719, 15 Jan. 2015
And so, looking at http://arxiv.org/pdf/1501.03719v1.pdf you will find:

For each set, we compare the first image against each of the remaining five and check for correspondences. Performance is measured using the code from [16, 17], which computes recall and 1-precision using known ground truth homographies between the images.
I think that the image ../data/graf1.png is https://github.com/Itseez/opencv/blob/3.1.0/samples/data/graf1.png.
According to the comment by Catree on "Homography matrix in Opencv?", the original dataset is at http://www.robots.ox.ac.uk/~vgg/research/affine/det_eval_files/graf.tar.gz, where it is said that
Homographies between image pairs included.
So I think that the homography stored in file ../data/H1to3p.xml is the homography between image 1 and image 3.
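As for whether these numbers can be used for other tests: any ground-truth homography works the same way. LATCH_match.cpp projects each matched keypoint of image 1 through the homography and keeps the match only if it lands near its counterpart in the other image. A sketch of that filtering step, assuming matched1/matched2 are the vectors of matched keypoints (the threshold value is an assumption; check the sample for the exact number):

// Sketch of homography-based match filtering in the style of LATCH_match.cpp.
const double inlier_threshold = 2.5;             // assumed pixel threshold
std::vector<cv::KeyPoint> inliers1, inliers2;
for (size_t i = 0; i < matched1.size(); i++)
{
    cv::Mat col = cv::Mat::ones(3, 1, CV_64F);   // homogeneous point from image 1
    col.at<double>(0) = matched1[i].pt.x;
    col.at<double>(1) = matched1[i].pt.y;
    col = homography * col;                      // map into image 3
    col /= col.at<double>(2);                    // normalize the homogeneous coordinates
    double dist = std::sqrt(std::pow(col.at<double>(0) - matched2[i].pt.x, 2) +
                            std::pow(col.at<double>(1) - matched2[i].pt.y, 2));
    if (dist < inlier_threshold)                 // keep geometrically consistent matches only
    {
        inliers1.push_back(matched1[i]);
        inliers2.push_back(matched2[i]);
    }
}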

how to predict second class which is more close to test data

I am using opencv-2.4 (CvSVM) for classification. For each test sample it predicts one class as the output. But I also need to find the next class that is closest to the test sample.
Is there any way to do that with the OpenCV SVM classifier?
Unfortunately, you cannot do it directly with the current interface.
One solution would be to use the library libsvm instead.
You may do it in OpenCV, but it will require a little bit of work.
First, you must know that OpenCV uses a "1-against-1" strategy for multi-class classification.
For an N-class problem, it trains N*(N-1)/2 binary classifiers (one for each pair of classes) and then uses a majority vote to choose the most probable class.
You will have to apply each classifier and do the majority vote yourself to get what you want.
The code below shows you how to do that with OpenCV 3 (warning: it is untested and probably contains errors, but it gives you a good starting point).
Ptr<SVM> svm;
int N;                        // number of classes
Mat data;                     // input sample to classify (one row, CV_32F)
Mat sv = svm->getSupportVectors();
Ptr<Kernel> kernel = svm->getKernel();  // note: not part of the public API, see the edit below
Mat buffer(1, sv.rows, CV_32F);
// apply the kernel to the input sample against every support vector
kernel->calc(sv.rows, sv.cols, sv.ptr<float>(), data.ptr<float>(), buffer.ptr<float>());
Mat alpha, svidx;
vector<int> votes(N, 0);      // results of the majority vote will be stored here
int i, j, k, dfi;
for( i = dfi = 0; i < N; i++ )
{
    for( j = i+1; j < N; j++, dfi++ )
    {
        // compute the decision value of the binary SVM for the class pair (i, j)
        double rho = svm->getDecisionFunction(dfi, alpha, svidx);
        double sum = -rho;
        for( k = 0; k < sv.rows; k++ )
            sum += alpha.at<float>(k) * buffer.at<float>(svidx.at<int>(k));
        // majority vote
        votes[sum > 0 ? i : j]++;
    }
}
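To get the "next closest" class the question asks for, you can then rank the classes by their vote counts instead of taking only the maximum. A sketch (requires <algorithm> and <numeric>; tie-breaking is left out):

// Rank the classes by vote count to get both the winner and the runner-up.
std::vector<int> order(N);
std::iota(order.begin(), order.end(), 0);   // class indices 0..N-1
std::sort(order.begin(), order.end(),
          [&](int a, int b) { return votes[a] > votes[b]; });
int bestClass = order[0];       // what predict() would return
int runnerUpClass = order[1];   // the "next closest" class you are after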
Edit: This code is adapted from the internal SVM code of OpenCV.
It is incorrect as written, as pointed out by David Doria in the comments, since there is no getKernel function defined in the SVM class. I still leave it here, since it shouldn't be too difficult to modify the internal OpenCV code to add such a getter, and there is apparently no other way to do it.

How to get most similar Eigenfaces or Fisherfaces in OpenCV?

I'm trying to find a measure of the similarity of 2 faces, using OpenCV. For that I train Eigenfaces / Fisherfaces with 1000 photos of 1000 different people (one photo per person), so I also have 1000 labels in the training set.
Now I can use the predict method to get the most similar face.
I want to input 2 unknown face images and find out whether they are both similar to the same faces in the training set.
Here is the OpenCV code that returns the most similar label (the one with the lowest distance):

for(size_t sampleIdx = 0; sampleIdx < _projections.size(); sampleIdx++) {
    double dist = norm(_projections[sampleIdx], q, NORM_L2);
    if((dist < minDist) && (dist < _threshold)) {
        minDist = dist;
        minClass = _labels.at<int>((int)sampleIdx);
    }
}
Questions:
Can anyone tell me how to rewrite this to output the top 10 faces and not just the top 1? I'm thinking about pushing them into a priority queue, but maybe there is something easier?
For the training: should I put all the faces under the same label or under different labels? That is, should I have 1 label or 1000?
Cheers
Here's what I did. Note that I'm really good at Perl and a real newbie at C++ (in fact, this is my first C++ project!), so I output a lot to the command line and parsed it with Perl.
I went to facerec.cpp as you did, and I changed the contents of the for loop to this:
for(size_t sampleIdx = 0; sampleIdx < _projections.size(); sampleIdx++) {
    double dist = norm(_projections[sampleIdx], q, NORM_L2);
    int labelClass = _labels.at<int>((int)sampleIdx);
    cout << dist << " " << labelClass << endl;
    if((dist < minDist) && (dist < _threshold)) {
        minDist = dist;
        minClass = _labels.at<int>((int)sampleIdx);
    }
}
This now outputs the distance and label of every face. Since all the predict function appears to do is take the picture with the shortest distance (lowest number) and return that as the answer, you can take the resulting list, sort it, and keep the first 10 results. Or take the first ten labels, or whatever. This gives you access to all of the data rather than only the first result.
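If you would rather do the sorting in C++ than parse the output with Perl, here is a minimal sketch of the same idea (it reuses the _projections, q and _labels names from facerec.cpp and needs <algorithm>, <utility> and <vector>):

// Collect (distance, label) pairs instead of printing them, then keep the
// 10 smallest distances.
std::vector<std::pair<double, int> > results;
for(size_t sampleIdx = 0; sampleIdx < _projections.size(); sampleIdx++) {
    double dist = norm(_projections[sampleIdx], q, NORM_L2);
    results.push_back(std::make_pair(dist, _labels.at<int>((int)sampleIdx)));
}
size_t topK = std::min<size_t>(10, results.size());
// pairs compare by distance first, so a partial sort brings the closest faces forward
std::partial_sort(results.begin(), results.begin() + topK, results.end());
// results[0..topK-1] now hold the closest faces as (distance, label), ascending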
I also added
#include <iostream>
using namespace std;
to the top of the file so I could use cout.
Q1: Since OpenCV doesn't provide such a function by default, you have to create your own, for example by passing in a vector that collects each distance together with its label. The interface could be declared along these lines (note that this requires rebuilding OpenCV):
virtual void predict(InputArray src, int &label, double &confidence, std::vector<std::pair<double, int>> &distances) const = 0;

cvPerspectiveTransform: What am I supposed to provide?

I'm trying to use cvPerspectiveTransform to transform four 2D points. I already got the transformation matrix (3x3) through cvFindHomography. I can't figure out what kind of structure to provide without running into an error.
Would anybody be so kind to show me how to do it with these points?
(x, y):
(0, 0)
(640, 0)
(0, 480)
(640, 480)
I'm using OpenCV 2.4.0 on Win.
This is one way to initialize your matrices correctly. It's probably not the most elegant, but it works:
CvMat* input = cvCreateMat(1, 4, CV_32FC2);
CvMat* output = cvCreateMat(1, 4, CV_32FC2);
// the four points as interleaved x,y pairs: (0,0), (640,0), (0,480), (640,480)
float data[8] = {0,0, 640,0, 0,480, 640,480};
for (int i = 0; i < 8; i++)
{
    input->data.fl[i] = data[i];
}
cvPerspectiveTransform(input, output, matrix_from_cvFindHomography);
The C++ API offers a more intuitive implementation. Many OpenCV functions, like perspectiveTransform, accept vectors of points as inputs, which can be initialized in this manner:
std::vector<cv::Point2f> inputs;
std::vector<cv::Point2f> outputs;
inputs.push_back(cv::Point2f(0,0));
inputs.push_back(cv::Point2f(640,0));
inputs.push_back(cv::Point2f(0,480));
inputs.push_back(cv::Point2f(640,480));
cv::perspectiveTransform(inputs, outputs, matrix_from_findHomography);
Assuming you have a 3x3 cv::Mat of floats, you can convert it as follows (if you want double precision, change all the f's to d's):
cv::Matx33f transform(your_cv_Mat);
cv::Matx31f pt1(0,0,1);
cv::Matx31f pt2(640,0,1);
...
pt1 = transform*pt1;
pt2 = transform*pt2;
...
Make sure you normalize by the third coordinate; read up on homogeneous coordinates if that does not make sense:
pt1 *= 1/pt1(2);
pt2 *= 1/pt2(2);
...
cv::Point2f final_pt1(pt1(0),pt1(1));
cv::Point2f final_pt2(pt2(0),pt2(1));
You do not need to do this with Matx; it will work with cv::Mat just as well. Personally, I like Matx for working with transforms because its size and type are easier to keep track of, and its contents can be more easily viewed in the debugger.
