I use openCV function projectPoints() to rotate, translate and project a set of 3D points and solvePnp() to find this rotation and translation. This works well when the lens distortion coefficients are all zero but fails otherwise. It takes as little distortion as this to fail completely:
distCoeffs << 0.0, 0.01, 0.0, 0.0, 0.0;
The code is below:
#include <iostream>
#include "opencv.hpp"
using namespace std;
using namespace cv;
#define DEG2RAD (3.1415293/180.0)
#define RAD2DEG (1.0/DEG2RAD)
int main() {
const int npoints = 10; // number of points
// extrinsic
const Point3f tvec(10, 20, 30);
Point3f rvec(3, 5, 7);
cout << "Finding extrinsic parameters (PnP)" << endl;
cout<<"Test transformations: ";
cout<<"Rotation: "<<rvec<<"; translation: "<<tvec<<endl;
rvec*=DEG2RAD;
// intrinsic
Mat_ <double>cameraMatrix(3, 3);
cameraMatrix << 300., 0., 200., 0, 300., 100., 0., 0., 1.;
Mat_ <double>distCoeffs(1, 5); // (k_1, k_2, p_1, p_2[, k_3[, k_4, k_5, k_6]]) of 4, 5, or 8 elements.
//distCoeffs << 1.2, 0.2, 0., 0., 0.; // non-zero distortion
distCoeffs << 0.0, 0.0, 0.0, 0.0, 0.0; // zero distortion
cout<<"distrotion coeff: "<<distCoeffs<<endl;
cout<<"============= Running PnP..."<<endl;
vector<Point3f> objPts(npoints);
vector<Point2f> imagePoints(npoints);
Mat rvec_est, tvec_est;
randu(Mat(objPts), 0.0f, 100.0f);
// project
projectPoints(Mat(objPts), Mat(rvec), Mat(tvec), cameraMatrix, distCoeffs, Mat(imagePoints));
// extrinsic
solvePnP(objPts, imagePoints, cameraMatrix, distCoeffs, rvec_est, tvec_est);
cout<<"Rotation: "<<rvec_est*RAD2DEG<<endl;
cout<<"Translation "<<tvec_est<<endl;
return 0;
}
When all distortion coefficients are 0 the result is OK:
Finding extrinsic parameters (PnP)
Test transformations: Rotation: [3, 5, 7]; translation: [10, 20, 30]
distrotion coeff: [0, 0, 0, 0, 0]
============= Running PnP...
Rotation: [2.999999581709123; 4.999997813985293; 6.999999826089725]
Translation [9.999999792663072; 19.99999648222693; 29.99999699621362]
However when they aren't zero the result is totally wrong:
Finding extrinsic parameters (PnP)
Test transformations: Rotation: [3, 5, 7]; translation: [10, 20, 30]
distrotion coeff: [1.2, 0.2, 0, 0, 0]
============= Running PnP...
Rotation: [-91.56479629305277; -124.3631985067845; -74.46486950666471]
Translation [-69.72473511009439; -117.7463271636532; -87.27777166027946]
Since people asked, I am adding intermediate input - some 3D points and their projections for non-zero distortion coefficients. My camera matrix was
cameraMatrix << 300., 0., 200., 0, 300., 100., 0., 0., 1.;
3d points [53.0283, 19.9259, 40.1059]; 2D projection [1060.34, 700.59]
3d points [81.4385, 43.7133, 24.879]; 2D projection [6553.88, 5344.22]
3d points [77.3105, 76.2094, 30.7794]; 2D projection [5143.32, 6497.12]
3d points [70.2432, 47.8447, 79.219]; 2D projection [771.497, 611.726]
Another interesting observation: applying undistort when distCoeff are non zero doesn’t really works (but it does produce identical 2D points when distortion coefficients are all 0):
cout<<"applying undistort..."<<endl;
vector<Point2f> imagePointsUndistort(npoints);
undistortPoints(Mat(imagePoints), Mat(imagePointsUndistort), cameraMatrix, distCoeffs);
for (int i=0; i<4; i++)
cout<<"2d original "<<imagePoints[i]<<"; 2d undistort "<<imagePointsUndistort[i]<<endl;
applying undistort...
2d original [1060.34, 700.59]; 2d undistort [0, 0]
2d original [6553.88, 5344.22]; 2d undistort [0, 0]
2d original [5143.32, 6497.12]; 2d undistort [0, 0]
2d original [771.497, 611.726]; 2d undistort [0, 0]
The reason why I tried undistort() is because if one undoes the effect of known intrinsic parameters PnP becomes just a minimum direction problem of the form Ax=0. It needs min. 6 points for an approximate linear solution which is probably further improved with LMA (flags=CV_ITERATIVE). Technically there are only 6DOF and thus 3 points required so other methods (flags=CV_P3P, CV_EPNP) take less points. Anyways, regardless of a method or number of points the result is still invalid with non-zero distortion coefficients. The last thing I will try is to put all points on a 3D plane. It still fails:
for (int i=0; i<npoints; i++)
objPts[i].z=0.0f;
Finding extrinsic parameters (PnP)
Test transformations: Rotation: [3, 5, 7]; translation: [10, 20, 30]
distrotion coeff: [1.2, 0.2, 0, 0, 0]
============= Running PnP...
Rotation: [-1830.321574903016; 2542.206083947917; 2532.255948350521]
Translation [1407.918216894239; 1391.373407846455; 556.7108606094299]
How to make your code work?
I am able to reproduce the described behavior using the code you provided, however, either one of the two following options solve the problem:
Replace const Point3f tvec(10, 20, 30); by const Point3f tvec(10, 20, N); where N is much lower than 0 (e.g. -300) or much larger than 100 (e.g. 300).
Replace your call to solvePnP by a call to solvePnPRansac.
Why does each of these changes fix the undesired behavior?
First, consider what your original code requests from the solvePnP function. You are using a rotation of rather small magnitude, hence for simplicity of the explanation, I will assume that the rotation is identity. Then, the camera is positionned at world coordinates X=10, Y=20 and Z=30 and you generate object points randomly with world coordinates (X,Y,Z) uniformly drawn in [0,100]3. Hence, the camera is in the middle of the possible range for the object points, as illustrated on the following picture:
This means that object points may be generated very close to the focal plane (i.e. the plane going through the optical center and perpendicularly with respect to the optical axis). The projection in the camera image for such object points is undefined. However, in practice the non-linear optimization algorithm for undistortPoints is unstable even for object points close to the focal plane. This unstability causes the iterative algorithm for undistortPoints to diverge, except when the coefficients are all zero since in that case the initial values remain strictly constant during the estimation.
Hence, the two possible solutions to avoid this behavior are the following:
Avoid generating object points near the focal plane of the camera, i.e. change the translation vector or the range of the coordinates of the object points.
Eliminate the object points too close to the focal plane of the camera, whose undistorted estimation diverged (outliers), before the PnP estimation for example using solvePnPRansac.
Details about why undistortPoints fails:
NB: As we know the 3D world points, I used the following call to obtain the true undistorted coordinates, independently from the result of undistortPoints:
cv::projectPoints(obj_pts, rvec, tvec, cv::Mat_<double>::eye(3,3), cv::Mat_<double>::zeros(5,1), true_norm_pts);
The following function is a simplified version of what undistortPoints is doing:
void simple_undistort_point(const cv::Mat &img_pt,
const cv::Mat_<double> &K,
const cv::Mat_<double> &D,
cv::Mat &norm_pt)
{
// Define temporary variables
double k[8]={D.at<double>(0),
D.at<double>(1),
D.at<double>(2),
D.at<double>(3),
D.at<double>(4)},
fx, fy, ifx, ify, cx, cy;
fx = K.at<double>(0,0);
fy = K.at<double>(1,1);
ifx = 1./fx;
ify = 1./fy;
cx = K.at<double>(0,2);
cy = K.at<double>(1,2);
// Cancel distortion iteratively
const int iters = 5;
double x, y, x0, y0;
x0=x=(img_pt.at<double>(0)-cx)*ifx;
y0=y=(img_pt.at<double>(1)-cy)*ify;
for(int j = 0; j < iters; ++j)
{
double r2 = x*x + y*y;
double icdist = 1/(1 + ((k[4]*r2 + k[1])*r2 + k[0])*r2);
double deltaX = 2*k[2]*x*y + k[3]*(r2 + 2*x*x);
double deltaY = k[2]*(r2 + 2*y*y) + 2*k[3]*x*y;
x = (x0 - deltaX)*icdist;
y = (y0 - deltaY)*icdist;
}
// Store result
norm_pt.create(1,2,CV_64F);
norm_pt.at<double>(0) = x;
norm_pt.at<double>(1) = y;
}
If you add code to check how x and y change with each iteration, you'll see that the iterative optimization diverges due to r2 being very large at the beginning. Here is a log example:
#0: [2.6383300, 1.7651500] r2=10.0766000, icdist=0.0299408, deltaX=0, deltaY=0
#1: [0.0789937, 0.0528501] r2=0.00903313, icdist=0.9892610, deltaX=0, deltaY=0
#2: [2.6100000, 1.7462000] r2=9.86128000, icdist=0.0309765, deltaX=0, deltaY=0
#3: [0.0817263, 0.0546783] r2=0.00966890, icdist=0.9885120, deltaX=0, deltaY=0
#4: [2.6080200, 1.7448800] r2=9.84637000, icdist=0.0310503, deltaX=0, deltaY=0
end: [0.0819209, 0.0548085]
true: [0.9327440, 0.6240440]
When r2 is large, r2*r2*r2 is huge hence icdist is very small, hence the next iteration starts with a very small r2. When r2 is very small, icdist is close to 1, hence x and y are respectively set to x0 and y0 and we are back with a large r2, etc.
So why is r2 so large in the first place? Because the points may be generated close to the focal plane, in which case they are far from the optical axis (hence a very large r2). See the following log example:
img_pt#0=[991.4992804037340, 629.5460091483255], r2=10.07660, norm(cv_undist-true)=1.0236800
img_pt#1=[5802.666489402056, 4402.387472311543], r2=554.4490, norm(cv_undist-true)=2.1568300
img_pt#2=[5040.551339386630, 5943.173381042060], r2=639.7070, norm(cv_undist-true)=2.1998700
img_pt#3=[741.9742544382640, 572.9513930063181], r2=5.749100, norm(cv_undist-true)=0.8158670
img_pt#4=[406.9101658356062, 403.0152736214052], r2=1.495890, norm(cv_undist-true)=0.1792810
img_pt#5=[516.2079583447821, 1038.026553216831], r2=10.88760, norm(cv_undist-true)=1.0494500
img_pt#6=[1876.220394606081, 8129.280202695572], r2=747.5450, norm(cv_undist-true)=2.2472900
img_pt#7=[236.9935231831764, 329.3418854620716], r2=0.599625, norm(cv_undist-true)=0.0147487
img_pt#8=[1037.586015858139, 1346.494838992490], r2=25.05890, norm(cv_undist-true)=1.2998400
img_pt#9=[499.9808133105154, 715.6213031242644], r2=5.210870, norm(cv_undist-true)=0.7747020
You can see that for most points, r2 is very large, except for a few (#3, #4 & #7) which are also those associated with the best undistortion accuracy.
This problem is due to the particular undistortion algorithm implemented in OpenCV, which has been chosen for its efficiency. Other non-linear optimization algorithm (e.g. Levenberg-Marquardt) would be more accurate but also much slower, and would definitely be an overkill in most applications.
Let me go through opencv sources. But first I present "pure" opencv function that works as in the sources (please read below how I got this point) merged with your code to show it works as the library one:
#include <iostream>
#include <opencv2\opencv.hpp>
using namespace std;
using namespace cv;
#define DEG2RAD (3.1415293/180.0)
#define RAD2DEG (1.0/DEG2RAD)
Point2f Project(Point3f p, double R[], double t[], double k[], double fx, double fy, double cx, double cy) {
double X = p.x, Y = p.y, Z = p.z;
double x = R[0]*X + R[1]*Y + R[2]*Z + t[0];
double y = R[3]*X + R[4]*Y + R[5]*Z + t[1];
double z = R[6]*X + R[7]*Y + R[8]*Z + t[2];
double r2, r4, r6, a1, a2, a3, cdist, icdist2;
double xd, yd;
z = z ? 1./z : 1;
x *= z; y *= z;
r2 = x*x + y*y;
r4 = r2*r2;
r6 = r4*r2;
a1 = 2*x*y;
a2 = r2 + 2*x*x;
a3 = r2 + 2*y*y;
cdist = 1 + k[0]*r2 + k[1]*r4 + k[4]*r6;
icdist2 = 1./(1 + k[5]*r2 + k[6]*r4 + k[7]*r6);
xd = x*cdist*icdist2 + k[2]*a1 + k[3]*a2;
yd = y*cdist*icdist2 + k[2]*a3 + k[3]*a1;
double xRet = xd*fx + cx;
double yRet = yd*fy + cy;
return Point2f(xRet, yRet);
}
int main() {
const int npoints = 10; // number of points
// extrinsic
const Point3f tvec(10, 20, 30);
Point3f rvec(3, 5, 7);
cout << "Finding extrinsic parameters (PnP)" << endl;
cout<<"Test transformations: ";
cout<<"Rotation: "<<rvec<<"; translation: "<<tvec<<endl;
rvec*=DEG2RAD;
// intrinsic
Mat_ <double>cameraMatrix(3, 3);
cameraMatrix << 300., 0., 200., 0, 300., 100., 0., 0., 1.;
Mat_ <double>distCoeffs(1, 5); // (k_1, k_2, p_1, p_2[, k_3[, k_4, k_5, k_6]]) of 4, 5, or 8 elements.
distCoeffs << 1.2, 0.2, 0., 0., 0.; // non-zero distortion
//distCoeffs << 0.0, 0.0, 0.0, 0.0, 0.0; // zero distortion
//distCoeffs << 1.8130418031666484e+000, -1.3285019729932657e+001, -1.6921715019797313e-002, -1.3327183367510961e-001, -5.2725832482783389e+001;
cout<<"distrotion coeff: "<<distCoeffs<<endl;
cout<<"============= Running PnP..."<<endl;
vector<Point3f> objPts(npoints);
vector<Point2f> imagePoints(npoints);
Mat rvec_est, tvec_est;
randu(Mat(objPts), 0.0f, 100.0f);
// project
projectPoints(Mat(objPts), Mat(rvec), Mat(tvec), cameraMatrix, distCoeffs, Mat(imagePoints));
std::cout << objPts << std::endl;
std::cout << imagePoints << std::endl;
double R[9];
Mat matR( 3, 3, CV_64F, R);
Mat_<double> m(1,3);
m << (double)rvec.x, (double)rvec.y, (double)rvec.z;
Rodrigues(m, matR);
std::cout << matR << std::endl;
double t[3] = {tvec.x, tvec.y, tvec.z};
double k[8] = {1.2, 0.2, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0};
double fx = 300, fy = 300, cx = 200, cy = 100;
for(int i=0;i<objPts.size();i++)
std::cout << Project(objPts[i], R, t, k, fx, fy, cx, cy) << "; ";
std::cout << std::endl;
// extrinsic
solvePnP(objPts, imagePoints, cameraMatrix, distCoeffs, rvec_est, tvec_est);
cout<<"Rotation: "<<rvec_est*RAD2DEG<<endl;
cout<<"Translation "<<tvec_est<<endl;
return 0;
}
R is rotation, t translation, k distortion. Look at the 'r2' computation - it is x*x + y*y, but x,y is the position (scaled by z though) just after applying translation and rotation. And this r stands for (as wikpedia says) for "square distance in image projected by ideal pinhole model". We can say projectPoints implementation is OK.
How I got this result:
I'm digging up version 2.4.8. If you go to the calibration.cpp in the calib3d module, start with
void cv::projectPoints( InputArray _opoints,
InputArray _rvec,
InputArray _tvec,
InputArray _cameraMatrix,
InputArray _distCoeffs,
OutputArray _ipoints,
OutputArray _jacobian,
double aspectRatio )
{
Mat opoints = _opoints.getMat();
int npoints = opoints.checkVector(3), depth = opoints.depth();
CV_Assert(npoints >= 0 && (depth == CV_32F || depth == CV_64F));
CvMat dpdrot, dpdt, dpdf, dpdc, dpddist;
CvMat *pdpdrot=0, *pdpdt=0, *pdpdf=0, *pdpdc=0, *pdpddist=0;
_ipoints.create(npoints, 1, CV_MAKETYPE(depth, 2), -1, true);
CvMat c_imagePoints = _ipoints.getMat();
CvMat c_objectPoints = opoints;
Mat cameraMatrix = _cameraMatrix.getMat();
Mat rvec = _rvec.getMat(), tvec = _tvec.getMat();
CvMat c_cameraMatrix = cameraMatrix;
CvMat c_rvec = rvec, c_tvec = tvec;
double dc0buf[5]={0};
Mat dc0(5,1,CV_64F,dc0buf);
Mat distCoeffs = _distCoeffs.getMat();
if( distCoeffs.empty() )
distCoeffs = dc0;
CvMat c_distCoeffs = distCoeffs;
int ndistCoeffs = distCoeffs.rows + distCoeffs.cols - 1;
if( _jacobian.needed() )
{
// cut out, we dont use this part
}
cvProjectPoints2( &c_objectPoints, &c_rvec, &c_tvec, &c_cameraMatrix, &c_distCoeffs,
&c_imagePoints, pdpdrot, pdpdt, pdpdf, pdpdc, pdpddist, aspectRatio );
}
Nothing special, right? No content manipulation at all. Let's go deeper:
CV_IMPL void cvProjectPoints2( const CvMat* objectPoints,
const CvMat* r_vec,
const CvMat* t_vec,
const CvMat* A,
const CvMat* distCoeffs,
CvMat* imagePoints, CvMat* dpdr,
CvMat* dpdt, CvMat* dpdf,
CvMat* dpdc, CvMat* dpdk,
double aspectRatio )
{
Ptr<CvMat> matM, _m;
Ptr<CvMat> _dpdr, _dpdt, _dpdc, _dpdf, _dpdk;
int i, j, count;
int calc_derivatives;
const CvPoint3D64f* M;
CvPoint2D64f* m;
double r[3], R[9], dRdr[27], t[3], a[9], k[8] = {0,0,0,0,0,0,0,0}, fx, fy, cx, cy;
CvMat _r, _t, _a = cvMat( 3, 3, CV_64F, a ), _k;
CvMat matR = cvMat( 3, 3, CV_64F, R ), _dRdr = cvMat( 3, 9, CV_64F, dRdr );
// some code not important ...
if( r_vec->rows == 3 && r_vec->cols == 3 )
{
_r = cvMat( 3, 1, CV_64FC1, r );
cvRodrigues2( r_vec, &_r );
cvRodrigues2( &_r, &matR, &_dRdr );
cvCopy( r_vec, &matR );
}
else
{
_r = cvMat( r_vec->rows, r_vec->cols, CV_MAKETYPE(CV_64F,CV_MAT_CN(r_vec->type)), r );
cvConvert( r_vec, &_r );
cvRodrigues2( &_r, &matR, &_dRdr );
}
Last part is important, because we use cv::Rodriguez to create an rotation matrix from rotation vector. And later in the function we also create translation matrix, but still no data manipulation. Going further in the ProjectPoints2:
fx = a[0]; fy = a[4];
cx = a[2]; cy = a[5];
if( fixedAspectRatio )
fx = fy*aspectRatio;
if( distCoeffs )
{
if( !CV_IS_MAT(distCoeffs) ||
(CV_MAT_DEPTH(distCoeffs->type) != CV_64F &&
CV_MAT_DEPTH(distCoeffs->type) != CV_32F) ||
(distCoeffs->rows != 1 && distCoeffs->cols != 1) ||
(distCoeffs->rows*distCoeffs->cols*CV_MAT_CN(distCoeffs->type) != 4 &&
distCoeffs->rows*distCoeffs->cols*CV_MAT_CN(distCoeffs->type) != 5 &&
distCoeffs->rows*distCoeffs->cols*CV_MAT_CN(distCoeffs->type) != 8) )
CV_Error( CV_StsBadArg, cvDistCoeffErr );
_k = cvMat( distCoeffs->rows, distCoeffs->cols,
CV_MAKETYPE(CV_64F,CV_MAT_CN(distCoeffs->type)), k );
cvConvert( distCoeffs, &_k );
}
Here we set focal lengths from camera matrix and principal point coords. Also we set array k which contains distortion coefs. Now we finished setting up variables. Let's go to the computations:
double X = M[i].x, Y = M[i].y, Z = M[i].z;
double x = R[0]*X + R[1]*Y + R[2]*Z + t[0];
double y = R[3]*X + R[4]*Y + R[5]*Z + t[1];
double z = R[6]*X + R[7]*Y + R[8]*Z + t[2];
double r2, r4, r6, a1, a2, a3, cdist, icdist2;
double xd, yd;
z = z ? 1./z : 1;
x *= z; y *= z;
r2 = x*x + y*y;
r4 = r2*r2;
r6 = r4*r2;
a1 = 2*x*y;
a2 = r2 + 2*x*x;
a3 = r2 + 2*y*y;
cdist = 1 + k[0]*r2 + k[1]*r4 + k[4]*r6;
icdist2 = 1./(1 + k[5]*r2 + k[6]*r4 + k[7]*r6);
xd = x*cdist*icdist2 + k[2]*a1 + k[3]*a2;
yd = y*cdist*icdist2 + k[2]*a3 + k[3]*a1;
m[i].x = xd*fx + cx; // here projection
m[i].y = yd*fy + cy;
And we have the function exactly as the one I presented on the top/
Related
I want to project a single point (-1450,-1660) on an image
I am using opencv 4.0.1 c++
I have the camera matrix and distortion coefficient
and my code is
vector <Point3f> inputpoints;
Point3f myPoint;
myPoint.x = -1450;
myPoint.y = -1660;
myPoint.z = 0;
inputpoints.push_back(myPoint);
vector<Point2f> outputpoints;
vector<Point3f> tvec;
tvec.push_back(Point3f(0, 0, 0));
vector<Point3f> rvec;
rvec.push_back(Point3f(0, 0, 0));
double mydata[9] = { 3.3202343554882879e+02, 1., 6.4337059696010670e+02, 0, 3.3196938477610536e+02, 5.3844814394773562e+02, 0., 0., 1. };
Mat mycameraMatrix = Mat(3, 3, CV_64F, mydata);
double mydata2[4] = { -1.1129472191078109e-03, 4.9443845791693870e-02,
-7.2244333582166609e-03, -1.7309984187889034e-03 };
Mat mydiscoff = Mat{ 4,1, CV_64F ,mydata2 };
Mat newCamMat1= Mat(3, 3, CV_64F);
cv::fisheye::projectPoints(inputpoints, rvec, tvec, mycameraMatrix, mydiscoff, outputpoints);
when I run the program I get this exception
OpenCV(4.0.1) Error: Assertion failed (mtype == type0 || (CV_MAT_CN(mtype) == CV_MAT_CN(type0) && ((1 << type0) & fixedDepthMask) != 0)) in cv::debug_build_guard::_OutputArray::create, file c:\build\master_winpack-build-win64-vc15\opencv\modules\core\src\matrix_wrap.cpp, line 1395
I changed the type of camera matrix and distortion coefficient to CV_32f but I still got the same error , I am a very beginner in openCV ..so can any one tell me what caused this exception?
I know the rvec should be 3*3 but I just followed someone else code who wrote that can be written in this way
okay the problem was that projectpoints and fisheye::projectpoints differ in the order of parameters ..so I was putting the order which belongs to projectpoints
I have a disparity image created with a calibrated stereo camera pair and opencv. It looks good, and my calibration data is good.
I need to calculate the real world distance at a pixel.
From other questions on stackoverflow, i see that the approach is:
depth = baseline * focal / disparity
Using the function:
setMouseCallback("disparity", onMouse, &disp);
static void onMouse(int event, int x, int y, int flags, void* param)
{
cv::Mat &xyz = *((cv::Mat*)param); //cast and deref the param
if (event == cv::EVENT_LBUTTONDOWN)
{
unsigned int val = xyz.at<uchar>(y, x);
double depth = (camera_matrixL.at<float>(0, 0)*T.at<float>(0, 0)) / val;
cout << "x= " << x << " y= " << y << " val= " << val << " distance: " << depth<< endl;
}
}
I click on a point that i have measured to be 3 meters away from the stereo camera.
What i get is:
val= 31 distance: 0.590693
The depth mat values are between 0 and 255, the depth mat is of type 0, or CV_8UC1.
The stereo baseline is 0.0643654 (in meters).
The focal length is 284.493
I have also tried:
(from OpenCV - compute real distance from disparity map)
float fMaxDistance = static_cast<float>((1. / T.at<float>(0, 0) * camera_matrixL.at<float>(0, 0)));
//outputDisparityValue is single 16-bit value from disparityMap
float fDisparity = val / (float)cv::StereoMatcher::DISP_SCALE;
float fDistance = fMaxDistance / fDisparity;
which gives me a (closer to truth, if we assume mm units) distance of val= 31 distance: 2281.27
But is still incorrect.
Which of these approaches is correct? And where am i going wrong?
Left, Right, Depth map. (EDIT: this depth map is from a different pair of images)
EDIT: Based on an answer, i am trying this:
`std::vector pointcloud;
float fx = 284.492615;
float fy = 285.683197;
float cx = 424;// 425.807709;
float cy = 400;// 395.494293;
cv::Mat Q = cv::Mat(4,4, CV_32F);
Q.at<float>(0, 0) = 1.0;
Q.at<float>(0, 1) = 0.0;
Q.at<float>(0, 2) = 0.0;
Q.at<float>(0, 3) = -cx; //cx
Q.at<float>(1, 0) = 0.0;
Q.at<float>(1, 1) = 1.0;
Q.at<float>(1, 2) = 0.0;
Q.at<float>(1, 3) = -cy; //cy
Q.at<float>(2, 0) = 0.0;
Q.at<float>(2, 1) = 0.0;
Q.at<float>(2, 2) = 0.0;
Q.at<float>(2, 3) = -fx; //Focal
Q.at<float>(3, 0) = 0.0;
Q.at<float>(3, 1) = 0.0;
Q.at<float>(3, 2) = -1.0 / 6; //1.0/BaseLine
Q.at<float>(3, 3) = 0.0; //cx - cx'
//
cv::Mat XYZcv(depth_image.size(), CV_32FC3);
reprojectImageTo3D(depth_image, XYZcv, Q, false, CV_32F);
for (int y = 0; y < XYZcv.rows; y++)
{
for (int x = 0; x < XYZcv.cols; x++)
{
cv::Point3f pointOcv = XYZcv.at<cv::Point3f>(y, x);
Eigen::Vector4d pointEigen(0, 0, 0, left.at<uchar>(y, x) / 255.0);
pointEigen[0] = pointOcv.x;
pointEigen[1] = pointOcv.y;
pointEigen[2] = pointOcv.z;
pointcloud.push_back(pointEigen);
}
}`
And that gives me a cloud.
I would recommend to use reprojectImageTo3D of OpenCV to reconstruct the distance from the disparity. Note that when using this function you indeed have to divide by 16 the output of StereoSGBM. You should already have all the parameters f, cx, cy, Tx. Take care to give f and Tx in the same units. cx, cy are in pixels.
Since the problem is that you need the Q matrix, I think that this link or this one should help you to build it. If you don't want to use reprojectImageTo3D I strongly recommend the first link!
I hope this helps!
To find the point-based depth of an object from the camera, use the following formula:
Depth = (Baseline x Focallength)/disparity
I hope you are using it correctly as per your question.
Try the below nerian calculator for the therotical error.
https://nerian.com/support/resources/calculator/
Also, use sub-pixel interpolation in your code.
Make sure object you are identifying for depth should have good texture.
The most common problems with depth maps are:
Untextured surfaces (plain object)
Calibration results are bad.
What is the RMS value for your calibration, camera resolution, and lens type(focal
length)? This is important to provide much better data for your program.
It is my first post on Stack so I'm sorry in advance for my clumsiness. Please let me know if I can improve my question anyway.
► What I want to achieve (in a long term):
I try to manipulate my Unity3d presentation with a laser pointer using OpenCV fo Unity.
I believe one picture is worth more than a thousand words, so this should tell the most:
► What is the problem:
I try to make a simple 4-point calibration (projection) from camera view (some kind of trapezium) into plane space.
I thought it will be something very basic and easy, but I have no experience with OpenCV and I can't make it work.
► Sample:
I made a much less complicated example, without any laser detection and all other stuff. Only 4-points trapezium that I try to reproject into the plane space.
Link to the whole sample project: https://1drv.ms/u/s!AiDsGecSyzmuujXGQUapcYrIvP7b
The core script from my example:
using OpenCVForUnity;
using System.Collections;
using System.Collections.Generic;
using UnityEngine;
using UnityEngine.UI;
using System;
public class TestCalib : MonoBehaviour
{
public RawImage displayDummy;
public RectTransform[] handlers;
public RectTransform dummyCross;
public RectTransform dummyResult;
public Vector2 webcamSize = new Vector2(640, 480);
public Vector2 objectSize = new Vector2(1024, 768);
private Texture2D texture;
Mat cameraMatrix;
MatOfDouble distCoeffs;
MatOfPoint3f objectPoints;
MatOfPoint2f imagePoints;
Mat rvec;
Mat tvec;
Mat rotationMatrix;
Mat imgMat;
void Start()
{
texture = new Texture2D((int)webcamSize.x, (int)webcamSize.y, TextureFormat.RGB24, false);
if (displayDummy) displayDummy.texture = texture;
imgMat = new Mat(texture.height, texture.width, CvType.CV_8UC3);
}
void Update()
{
imgMat = new Mat(texture.height, texture.width, CvType.CV_8UC3);
Test();
DrawImagePoints();
Utils.matToTexture2D(imgMat, texture);
}
void DrawImagePoints()
{
Point[] pointsArray = imagePoints.toArray();
for (int i = 0; i < pointsArray.Length; i++)
{
Point p0 = pointsArray[i];
int j = (i < pointsArray.Length - 1) ? i + 1 : 0;
Point p1 = pointsArray[j];
Imgproc.circle(imgMat, p0, 5, new Scalar(0, 255, 0, 150), 1);
Imgproc.line(imgMat, p0, p1, new Scalar(255, 255, 0, 150), 1);
}
}
private void DrawResults(MatOfPoint2f resultPoints)
{
Point[] pointsArray = resultPoints.toArray();
for (int i = 0; i < pointsArray.Length; i++)
{
Point p = pointsArray[i];
Imgproc.circle(imgMat, p, 5, new Scalar(255, 155, 0, 150), 1);
}
}
public void Test()
{
float w2 = objectSize.x / 2F;
float h2 = objectSize.y / 2F;
/*
objectPoints = new MatOfPoint3f(
new Point3(-w2, -h2, 0),
new Point3(w2, -h2, 0),
new Point3(-w2, h2, 0),
new Point3(w2, h2, 0)
);
*/
objectPoints = new MatOfPoint3f(
new Point3(0, 0, 0),
new Point3(objectSize.x, 0, 0),
new Point3(objectSize.x, objectSize.y, 0),
new Point3(0, objectSize.y, 0)
);
imagePoints = GetImagePointsFromHandlers();
rvec = new Mat(1, 3, CvType.CV_64FC1);
tvec = new Mat(1, 3, CvType.CV_64FC1);
rotationMatrix = new Mat(3, 3, CvType.CV_64FC1);
double fx = webcamSize.x / objectSize.x;
double fy = webcamSize.y / objectSize.y;
double cx = 0; // webcamSize.x / 2.0f;
double cy = 0; // webcamSize.y / 2.0f;
cameraMatrix = new Mat(3, 3, CvType.CV_64FC1);
cameraMatrix.put(0, 0, fx);
cameraMatrix.put(0, 1, 0);
cameraMatrix.put(0, 2, cx);
cameraMatrix.put(1, 0, 0);
cameraMatrix.put(1, 1, fy);
cameraMatrix.put(1, 2, cy);
cameraMatrix.put(2, 0, 0);
cameraMatrix.put(2, 1, 0);
cameraMatrix.put(2, 2, 1.0f);
distCoeffs = new MatOfDouble(0, 0, 0, 0);
Calib3d.solvePnP(objectPoints, imagePoints, cameraMatrix, distCoeffs, rvec, tvec);
Mat uv = new Mat(3, 1, CvType.CV_64FC1);
uv.put(0, 0, dummyCross.anchoredPosition.x);
uv.put(1, 0, dummyCross.anchoredPosition.y);
uv.put(2, 0, 0);
Calib3d.Rodrigues(rvec, rotationMatrix);
Mat P = rotationMatrix.inv() * (cameraMatrix.inv() * uv - tvec);
Vector2 v = new Vector2((float)P.get(0, 0)[0], (float)P.get(1, 0)[0]);
dummyResult.anchoredPosition = v;
}
private MatOfPoint2f GetImagePointsFromHandlers()
{
MatOfPoint2f m = new MatOfPoint2f();
List<Point> points = new List<Point>();
foreach (RectTransform handler in handlers)
{
Point p = new Point(handler.anchoredPosition.x, handler.anchoredPosition.y);
points.Add(p);
}
m.fromList(points);
return m;
}
}
Thanks in advance for any help.
This question is not opencv specific but heavily math-based and more often seen in the realm of computer graphics. What you are looking for is called a Projective Transformation.
A Projective Transformation takes a set of coordinates and projects them onto something. In your case you want to project a 2D point in the camera view to a 2D point on a flat plane.
So we want a projection transform for 2D-Space. To perform a projection transform we need to find the projection matrix for the transformation we want to apply. In this case we need a matrix that expresses the projective deformation of the camera in relation to a flat plane.
To work with projections we first need to convert our points into homogeneous coordinates. To do so we simply add a new component to our vectors with value 1. So (x,y) becomes (x,y,1). And we will do that with all our five available points.
Now we start with the actual math. First some definitions: The camera's point of view and respective coordinates shall be the camera space, coordinates in relation to a flat plane are in flat space. Let c₁ to c₄ be the corner points of the plane in relation to camera space as homogeneous vectors. Let p be the point that we have found in camera space and p' the point we want to find in flat space, both as homogeneous vectors again.
Mathematically speaking, we are looking for a Matrix C that will allow us to calculate p' by giving it p.
p' = C * p
Now we obviously need to find C. To find a projection matrix for two dimensional space, we need four points (how convenient..) I will assume that c₁ will go to (0,0), c₂ will go to (0,1), c₃ to (1,0) and c₄ to (1,1). You need to solve two matrix equations using e.g. the gaussian row elimination or an LR Decomposition algorithm. OpenCV should contain functions to do those tasks for you, but be aware of matrix conditioning and their impact on a usable solution.
Now back to the matrices. You need to calculate two basis change matrices as they are called. They are used to change the frame of reference of your coordinates (exactly what we want to do). The first matrix will transform our coordinates to three dimensional basis vectors and the second one will transform our 2D plane into three dimensional basis vectors.
For the coordinate one you'll need to calculate λ, μ and r in the following equation:
⌈ c₁.x c₂.x c₃.x ⌉ ⌈ λ ⌉ ⌈ c₄.x ⌉
c₁.y c₂.y c₃.y * μ = c₄.y
⌊ 1 1 1 ⌋ ⌊ r ⌋ ⌊ 1 ⌋
this will lead you to your first Matrix, A
⌈ λ*c₁.x μ*c₂.x r*c₃.x ⌉
A = λ*c₁.y μ*c₂.y r*c₃.y
⌊ λ μ r ⌋
A will now map the points c₁ to c₄ to the basis coordinates (1,0,0), (0,1,0), (0,0,1) and (1,1,1). We do the same thing for our plane now. First solve
⌈ 0 0 1 ⌉ ⌈ λ ⌉ ⌈ 1 ⌉
0 1 0 * μ = 1
⌊ 1 1 1 ⌋ ⌊ r ⌋ ⌊ 1 ⌋
and get B
⌈ 0 0 r ⌉
B = 0 μ 0
⌊ λ μ r ⌋
A and B will now map from those three dimensional basis vectors into your respective spaces. But that is not quite what we want. We want camera space -> basis -> flat space, so only matrix B manipulates in the right direction. But that is easily fixable by inverting A. That will give us matrix C = B * A⁻¹ (watch the order of B and A⁻¹ it is not interchangeable). This leaves us with a formula to calculate p' out of p.
p' = C * p
p' = B * A⁻¹ * p
Read it from left to right like: take p, transform p from camera space into basis vectors and transform those into flat space.
If you remember correctly, p' still has three components, so we need to dehomogenize p' first before we can use it. This will yield
x' = p'.x / p'.z
y' = p'.y / p'.z
and viola we have successfully transformed a laser point from a camera view onto a flat piece of paper. Totally not overly complicated or so...
I Develop Code. MouseUp Call this Function. And Resolution Edit;
void Cal()
{
// Webcam Resolution 1280*720
MatOfPoint2f pts_src = new MatOfPoint2f(
new Point(Double.Parse(imagePoints.get(0,0).GetValue(0).ToString()), Double.Parse(imagePoints.get(0, 0).GetValue(1).ToString())),
new Point(Double.Parse(imagePoints.get(1,0).GetValue(0).ToString()), Double.Parse(imagePoints.get(1, 0).GetValue(1).ToString())),
new Point(Double.Parse(imagePoints.get(2,0).GetValue(0).ToString()), Double.Parse(imagePoints.get(2, 0).GetValue(1).ToString())),
new Point(Double.Parse(imagePoints.get(3,0).GetValue(0).ToString()), Double.Parse(imagePoints.get(3, 0).GetValue(1).ToString()))
);
//Resolution 1920*1080
MatOfPoint2f pts_dst = new MatOfPoint2f(
new Point(0, 0),
new Point(1920, 0),
new Point(1920, 1080),
new Point(0, 1080)
);
// 1. Calculate Homography
Mat h = Calib3d.findHomography((pts_src), (pts_dst));
// Pick Point (WebcamDummy Cavas : 1280*0.5f / 720*0.5f)
MatOfPoint2f srcPointMat = new MatOfPoint2f(
new Point(dummyCross.anchoredPosition.x*2.0f, dummyCross.anchoredPosition.y*2.0f)
);
MatOfPoint2f dstPointMat = new MatOfPoint2f();
{
//2. h Mat Mul srcPoint to dstPoint
Core.perspectiveTransform(srcPointMat, dstPointMat, h);
Vector2 v = new Vector2((float)dstPointMat.get(0, 0)[0], (float)dstPointMat.get(0, 0)[1]);
//(ResultDummy Cavas: 1920 * 0.5f / 1080 * 0.5f)
dummyResult.anchoredPosition = v*0.5f;
Debug.Log(dummyCross.anchoredPosition.ToString() + "\n" + dummyResult.anchoredPosition.ToString());
}
}
I'm working on an augmented reality application for android using opencv 2.4.4 and have some problem with homography decomposition.
As we know, homography matrix is define as H=A.[R t] , where A is the intrinsic camera matrix, R is rotation matrix and t is translation vector.
I want to estimate the view side of camera using pictures, also the orientation of camera in 3d room.
Homography matrix can I estimate with opencv function: findHomography, and I think it works!!!
Here how I do it:
static Mat mFindHomography(MatOfKeyPoint keypoints1, MatOfKeyPoint keypoints2, MatOfDMatch matches){
List<Point> lp1 = new ArrayList<Point>(500);
List<Point> lp2 = new ArrayList<Point>(500);
KeyPoint[] k1 = keypoints1.toArray();
KeyPoint[] k2 = keypoints2.toArray();
List<DMatch> matchesList = matches.toList();
if (matchesList.size() < 4){
MatOfDMatch mat = new MatOfDMatch();
return mat;
}
// Add matches keypoints to new list to apply homography
for(DMatch match : matchesList){
Point kk1 = k1[match.queryIdx].pt;
Point kk2 = k2[match.trainIdx].pt;
lp1.add(kk1);
lp2.add(kk2);
}
MatOfPoint2f srcPoints = new MatOfPoint2f(lp1.toArray(new Point[0]));
MatOfPoint2f dstPoints = new MatOfPoint2f(lp2.toArray(new Point[0]));
Mat mask = new Mat();
Mat homography = Calib3d.findHomography(srcPoints, dstPoints, Calib3d.RANSAC, 10, mask); // Finds a perspective transformation between two planes. ---Calib3d.LMEDS Least-Median robust method
List<DMatch> matches_homo = new ArrayList<DMatch>();
int size = (int) mask.size().height;
for(int i = 0; i < size; i++){
if ( mask.get(i, 0)[0] == 1){
DMatch d = matchesList.get(i);
matches_homo.add(d);
}
}
MatOfDMatch mat = new MatOfDMatch();
mat.fromList(matches_homo);
matchesFilterdByRansac = (int) mat.size().height;
return homography;
}
After that, I want to decompose this homography matrix and compute euler angles. As we know H=A.[R t], I multiply homography matrix with inverse of camera intrinsic matrix: H.A^{-1} = [R t]. So, I want to decompose [R t] in rotation and translation and compute euler angles from rotation matrix. But it didn't work. What is wrong there?!!
if(!homography.empty()){ // esstimate pose frome homography
Mat intrinsics = Mat.zeros(3, 3, CvType.CV_32FC1); // camera intrinsic matrix
intrinsics.put(0, 0, 890);
intrinsics.put(0, 2, 400);
intrinsics.put(1, 1, 890);
intrinsics.put(1, 2, 240);
intrinsics.put(2, 2, 1);
// Inverse Matrix from Wolframalpha
double[] inverseIntrinsics = { 0.001020408, 0 , -0.408163265,
0, 0.0011235955, -0.26966292,
0, 0 , 1 };
// cross multiplication
double[] rotationTranslation = matrixMultiply3X3(homography, inverseIntrinsics);
Mat pose = Mat.eye(3, 4, CvType.CV_32FC1); // 3x4 matrix, the camera pose
float norm1 = (float) Core.norm(rotationTranslation.col(0));
float norm2 = (float) Core.norm(rotationTranslation.col(1));
float tnorm = (norm1 + norm2) / 2.0f; // Normalization value ---test: float tnorm = (float) h.get(2, 2)[0];// not worked
Mat normalizedTemp = new Mat();
Core.normalize(rotationTranslation.col(0), normalizedTemp);
normalizedTemp.convertTo(normalizedTemp, CvType.CV_32FC1);
normalizedTemp.copyTo(pose.col(0)); // Normalize the rotation, and copies the column to pose
Core.normalize(rotationTranslation.col(1), normalizedTemp);
normalizedTemp.convertTo(normalizedTemp, CvType.CV_32FC1);
normalizedTemp.copyTo(pose.col(1));// Normalize the rotation and copies the column to pose
Mat p3 = pose.col(0).cross(pose.col(1)); // Computes the cross-product of p1 and p2
p3.copyTo(pose.col(2));// Third column is the crossproduct of columns one and two
double[] buffer = new double[3];
rotationTranslation.col(2).get(0, 0, buffer);
pose.put(0, 3, buffer[0] / tnorm); //vector t [R|t] is the last column of pose
pose.put(1, 3, buffer[1] / tnorm);
pose.put(2, 3, buffer[2] / tnorm);
float[] rotationMatrix = new float[9];
rotationMatrix = getArrayFromMat(pose);
float[] eulerOrientation = new float[3];
SensorManager.getOrientation(rotationMatrix, eulerOrientation);
// Convert radian to degree
double yaw = (double) (eulerOrientation[0]) * (180 / Math.PI));// * -57;
double pitch = (double) (eulerOrientation[1]) * (180 / Math.PI));
double roll = (double) (eulerOrientation[2]) * (180 / Math.PI));}
Note that opencv 3.0 has a homogeraphy decomposition function (here), but I'm using opencv 2.4.4 for android!!! Is there a wrapper for it in java?
Second problem is with decomposing of rotation matrix in euler angels. Is there any problem with:
float[] eulerOrientation = new float[3];
SensorManager.getOrientation(rotationMatrix, eulerOrientation);
I used this link too, but not better result!
double pitch = Math.atan2(pose.get(2, 1)[0], pose.get(2, 2)[0]);
double roll = Math.atan2(-1*pose.get(2, 0)[0], Math.sqrt( Math.pow(pose.get(2, 1)[0], 2) + Math.pow(pose.get(2, 2)[0], 2)) );
double yaw = Math.atan2(pose.get(1, 0)[0], pose.get(0, 0)[0]);
Thanks a lot for any response
I hope this answer will help those looking for a solution today.
My answer uses c++ and opencv 2.4.9. I copied the decomposehomographymat function from opencv 3.0. After computing homography I use the copied function to decompose homography. To filter homography matrix and select the correct answer from the 4 decompositions, check my answer here.
To obtain euler angles from the rotation matrix, you can refer to this. I am able to get good results with this method.
I use OpenCV to undestort set of points after camera calibration.
The code follows.
const int npoints = 2; // number of point specified
// Points initialization.
// Only 2 ponts in this example, in real code they are read from file.
float input_points[npoints][2] = {{0,0}, {2560, 1920}};
CvMat * src = cvCreateMat(1, npoints, CV_32FC2);
CvMat * dst = cvCreateMat(1, npoints, CV_32FC2);
// fill src matrix
float * src_ptr = (float*)src->data.ptr;
for (int pi = 0; pi < npoints; ++pi) {
for (int ci = 0; ci < 2; ++ci) {
*(src_ptr + pi * 2 + ci) = input_points[pi][ci];
}
}
cvUndistortPoints(src, dst, &camera1, &distCoeffs1);
After the code above dst contains following numbers:
-8.82689655e-001 -7.05507338e-001 4.16228324e-001 3.04863811e-001
which are too small in comparison with numbers in src.
At the same time if I undistort image via the call:
cvUndistort2( srcImage, dstImage, &camera1, &dist_coeffs1 );
I receive good undistorted image which means that pixel coordinates are not modified so drastically in comparison with separate points.
How to obtain the same undistortion for specific points as for images?
Thanks.
The points should be "unnormalized" using camera matrix.
More specifically, after call of cvUndistortPoints following transformation should be also added:
double fx = CV_MAT_ELEM(camera1, double, 0, 0);
double fy = CV_MAT_ELEM(camera1, double, 1, 1);
double cx = CV_MAT_ELEM(camera1, double, 0, 2);
double cy = CV_MAT_ELEM(camera1, double, 1, 2);
float * dst_ptr = (float*)dst->data.ptr;
for (int pi = 0; pi < npoints; ++pi) {
float& px = *(dst_ptr + pi * 2);
float& py = *(dst_ptr + pi * 2 + 1);
// perform transformation.
// In fact this is equivalent to multiplication to camera matrix
px = px * fx + cx;
py = py * fy + cy;
}
More info on camera matrix at OpenCV 'Camera Calibration and 3D Reconstruction'
UPDATE:
Following C++ function call should work as well:
std::vector<cv::Point2f> inputDistortedPoints = ...
std::vector<cv::Point2f> outputUndistortedPoints;
cv::Mat cameraMatrix = ...
cv::Mat distCoeffs = ...
cv::undistortPoints(inputDistortedPoints, outputUndistortedPoints, cameraMatrix, distCoeffs, cv::noArray(), cameraMatrix);
It may be your matrix size :)
OpenCV expects a vector of points - a column or a row matrix with two channels. But because your input matrix is only 2 pts, and the number of channels is also 1, it cannot figure out what's the input, row or colum.
So, fill a longer input mat with bogus values, and keep only the first:
const int npoints = 4; // number of point specified
// Points initialization.
// Only 2 ponts in this example, in real code they are read from file.
float input_points[npoints][4] = {{0,0}, {2560, 1920}}; // the rest will be set to 0
CvMat * src = cvCreateMat(1, npoints, CV_32FC2);
CvMat * dst = cvCreateMat(1, npoints, CV_32FC2);
// fill src matrix
float * src_ptr = (float*)src->data.ptr;
for (int pi = 0; pi < npoints; ++pi) {
for (int ci = 0; ci < 2; ++ci) {
*(src_ptr + pi * 2 + ci) = input_points[pi][ci];
}
}
cvUndistortPoints(src, dst, &camera1, &distCoeffs1);
EDIT
While OpenCV specifies undistortPoints accept only 2-channel input, actually, it accepts
1-column, 2-channel, multi-row mat or (and this case is not documented)
2 column, multi-row, 1-channel mat or
multi-column, 1 row, 2-channel mat
(as seen in undistort.cpp, line 390)
But a bug inside (or lack of available info), makes it wrongly mix the second one with the third one, when the number of columns is 2. So, your data is considered a 2-column, 2-row, 1-channel.
I also reach this problems, and I take some time to research an finally understand.
Formula
You see the formula above, in the open system, distort operation is before camera matrix, so the process order is:
image_distorted ->camera_matrix -> un-distort function->camera_matrix->back to image_undistorted.
So you need a small fix to and camera1 again.
Mat eye3 = Mat::eye(3, 3, CV_64F);
cvUndistortPoints(src, dst, &camera1, &distCoeffs1, &eye3,&camera1);
Otherwise, if the last two parameters is empty, It would be project to a Normalized image coordinate.
See codes: opencv-3.4.0-src\modules\imgproc\src\undistort.cpp :297
cvUndistortPointsInternal()