I am trying to map the depth from the Kinectv2 to RGB space from a DSLR camera and I am stuck with weird pixel mapping.
I am working on Processing, using OpenCV and Nicolas Burrus' method where :
P3D.x = (x_d - cx_d) * depth(x_d,y_d) / fx_d
P3D.y = (y_d - cy_d) * depth(x_d,y_d) / fy_d
P3D.z = depth(x_d,y_d)
P3D' = R.P3D + T
P2D_rgb.x = (P3D'.x * fx_rgb / P3D'.z) + cx_rgb
P2D_rgb.y = (P3D'.y * fy_rgb / P3D'.z) + cy_rgb
Unfortunatly i have a problem when I reproject 3D point to RGB World Space. In order to check if the problem came from my OpenCV calibration I used MRPT Kinect & Setero Calibration in order to get the intrinsics and distorsion coefficients of the cameras and the rototranslation relative transformation between the two cameras.
eDatas from stereo calibration MRPT
Here my datas :
depth c_x = 262.573912;
depth c_y = 216.804166;
depth f_y = 462.676558;
depth f_x = 384.377033;
depthDistCoeff = {
1.975280e-001, -6.939150e-002, 0.000000e+000, -5.830770e-002, 0.000000e+000
};
DSLR c_x_R = 538.134412;
DSLR c_y_R = 359.760525;
DSLR f_y_R = 968.431461;
DSLR f_x_R = 648.480385;
rgbDistCoeff = {
2.785566e-001, -1.540991e+000, 0.000000e+000, -9.482198e-002, 0.000000e+000
};
R = {
8.4263457190597e-001, -8.9789363922252e-002, 5.3094712387890e-001,
4.4166517232817e-002, 9.9420220953803e-001, 9.8037162878270e-002,
-5.3667149820385e-001, -5.9159417476295e-002, 8.4171483671105e-001
};
T = {-4.740111e-001, 3.618596e-002, -4.443195e-002};
Then I use the data in processing in order to compute the mapping using :
PVector pixelDepthCoord = new PVector(i * offset_, j * offset_);
int index = (int) pixelDepthCoord .x + (int) pixelDepthCoord .y * depthWidth;
int depth = 0;
if (rawData[index] != 255)
{
//2D Depth Coord
depth = rawDataDepth[index];
} else
{
}
//3D Depth Coord - Back projecting pixel depth coord to 3D depth coord
float bppx = (pixelDepthCoord.x - c_x) * depth / f_x;
float bppy = (pixelDepthCoord.y - c_y) * depth / f_y;
float bppz = -depth;
//transpose 3D depth coord to 3D color coord
float x_ =(bppx * R[0] + bppy * R[1] + bppz * R[2]) + T[0];
float y_ = (bppx * R[3] + bppy * R[4] + bppz * R[5]) + T[1];
float z_ = (bppx * R[6] + bppy * R[7] + bppz * R[8]) + T[2];
//Project 3D color coord to 2D color Cood
float pcx = (x_ * f_x_R / z_) + c_x_R;
float pcy = (y_ * f_y_R / z_) + c_y_R;
Then i get the following transformations :
Weird mapping behavior
I think i have a probleme in my method but i don't get it. Does anyone has any ideas or a clues. I am racking my brain since many days on this problem ;)
Thanks
Related
I'm trying to better understand the calibrateCamera and SolvePnP functions in OpenCV, specifically the rotation vectors returned by these functions which I believe is an axis-angle rotation vector (NOT as I had thought initially the yaw,pitch,roll angles). I would like to know the rotation around the x,y and z axis of my checkerboard image. The OpenCV functions return a rotation vector in the form rot = [a,b,c]
Using this answer
as a guide I calculate the angle theta with theta = sqrt(a^2,b^2,c^2) and the rotation axis v = [a/theta, b/theta, c/theta];
Then I take these values and use the Axis-Angle To Euler conversion on euclideanspace.com. shown here:
heading = atan2(y * sin(angle)- x * z * (1 - cos(angle)) , 1 - (y^2 + z^2 ) * (1 - cos(angle)))
attitude = asin(x * y * (1 - cos(angle)) + z * sin(angle))
bank = atan2(x * sin(angle)-y * z * (1 - cos(angle)) , 1 - (x^2 + z^2) * (1 - cos(angle)))
I'm using one of the example OpenCV checkerboard images (Left01.jpg), shown below (note the frame axes in the upper left corner with red = x, green = y, blue = z
Using this image I get a rotation vector from calibrateCamera of [0.166,0.294,0.014]
Running these values through the calculations discussed and converting to degrees I get:
heading = 16.7 deg
attitude = 1.7 deg
bank = 9.3 deg
I believe these correspond to yaw,pitch,roll? The 16.7 degree heading seems high looking at the image, but it's hard to tell. Does this make sense? What would be the correct way to figure out the euler angles (angles around each axis) given the OpenCV rotation vector? Snippets of my code are shown below.
double RMSError = calibrateCamera(
objectPointsArray,
imagePointsArray,
img.size(),
intrinsics,
distortion,
rotation,
translation,
CALIB_ZERO_TANGENT_DIST |
CALIB_FIX_K3 | CALIB_FIX_K4 | CALIB_FIX_K5 |
CALIB_FIX_ASPECT_RATIO);
Mat rvec = rotation.at(0);
//try and get the rotation angles here
//https://stackoverflow.com/questions/12933284/rodrigues-into-eulerangles-and-vice-versa
float theta = sqrt(pow(rvec.at<double>(0),2) + pow(rvec.at<double>(1),2) + pow(rvec.at<double>(2),2));
Mat axis = (Mat_<double>(1, 3) << rvec.at<double>(0) / theta, rvec.at<double>(1) / theta, rvec.at<double>(2) / theta);
float x_ = axis.at<double>(0);
float y_ = axis.at<double>(1);
float z_ = axis.at<double>(2);
//this is yaw,pitch,roll respectively...maybe
float heading = atan2(y_ * sin(theta) - x_ * z_ * (1 - cos(theta)), 1 - (pow(y_,2) + pow(z_,2)) * (1 - static_cast<double>(cos(theta))));
float attitude = asin(x_ * y_ * (1 - cos(theta) + z_ * sin(theta)));
float bank = atan2(x_ * sin(theta) - y_ * z_ * (1 - cos(theta)), 1 - (pow(x_, 2) + pow(z_, 2)) * (1 - static_cast<double>(cos(theta))));
float headingDeg = heading * (180 / 3.14);
float attitudeDeg = attitude * (180 / 3.14);
float bankDeg = bank * (180 / 3.14);
Our AR device is based on a camera with pretty strong optical zoom. We measure the distortion of this camera using classical camera-calibration tools (checkerboards), both through OpenCV and the GML Camera Calibration tools.
At higher zoom levels (I'll use 249 out of 255 as an example) we measure the following camera parameters at full HD resolution (1920x1080):
fx = 24545.4316
fy = 24628.5469
cx = 924.3162
cy = 440.2694
For the radial and tangential distortion we measured 4 values:
k1 = 5.423406
k2 = -2964.24243
p1 = 0.004201721
p2 = 0.0162647516
We are not sure how to interpret (read: implement) those extremely large values for k1 and k2. Using OpenCV's classic "undistort" operation to rectify the image using these values seems to work well. Unfortunately this is (much) too slow for realtime usage.
The thumbnails below look similar, clicking them will display the full size images where you can spot the difference:
Camera footage
Undistorted using OpenCV
That's why we want to take the opposite aproach: leave the camera footage be distorted and apply a similar distortion to our 3D scene using shaders. Following the OpenCV documentation and this accepted answer in particular, the distorted position for a corner point (0, 0) would be
// To relative coordinates
double x = (point.X - cx) / fx; // -960 / 24545 = -0.03911
double y = (point.Y - cy) / fy; // -540 / 24628 = -0.02193
double r2 = x*x + y*y; // 0.002010
// Radial distortion
// -0.03911 * (1 + 5.423406 * 0.002010 + -2964.24243 * 0.002010 * 0.002010) = -0.039067
double xDistort = x * (1 + k1 * r2 + k2 * r2 * r2);
// -0.02193 * (1 + 5.423406 * 0.002010 + -2964.24243 * 0.002010 * 0.002010) = -0.021906
double yDistort = y * (1 + k1 * r2 + k2 * r2 * r2);
// Tangential distortion
... left out for brevity
// Back to absolute coordinates.
xDistort = xDistort * fx + cx; // -0.039067 * 24545.4316 + 924.3162 = -34.6002 !!!
yDistort = yDistort * fy + cy; // -0.021906 * 24628.5469 + 440.2694 = = -99.2435 !!!
These large pixel displacements (34 and 100 pixels at the upper left corner) seem overly warped and do not correspond with the undistorted image OpenCV generates.
So the specific question is: what is wrong with the way we interpreted the values we measured, and what should the correct code for distortion be?
I have a disparity image created with a calibrated stereo camera pair and opencv. It looks good, and my calibration data is good.
I need to calculate the real world distance at a pixel.
From other questions on stackoverflow, i see that the approach is:
depth = baseline * focal / disparity
Using the function:
setMouseCallback("disparity", onMouse, &disp);
static void onMouse(int event, int x, int y, int flags, void* param)
{
cv::Mat &xyz = *((cv::Mat*)param); //cast and deref the param
if (event == cv::EVENT_LBUTTONDOWN)
{
unsigned int val = xyz.at<uchar>(y, x);
double depth = (camera_matrixL.at<float>(0, 0)*T.at<float>(0, 0)) / val;
cout << "x= " << x << " y= " << y << " val= " << val << " distance: " << depth<< endl;
}
}
I click on a point that i have measured to be 3 meters away from the stereo camera.
What i get is:
val= 31 distance: 0.590693
The depth mat values are between 0 and 255, the depth mat is of type 0, or CV_8UC1.
The stereo baseline is 0.0643654 (in meters).
The focal length is 284.493
I have also tried:
(from OpenCV - compute real distance from disparity map)
float fMaxDistance = static_cast<float>((1. / T.at<float>(0, 0) * camera_matrixL.at<float>(0, 0)));
//outputDisparityValue is single 16-bit value from disparityMap
float fDisparity = val / (float)cv::StereoMatcher::DISP_SCALE;
float fDistance = fMaxDistance / fDisparity;
which gives me a (closer to truth, if we assume mm units) distance of val= 31 distance: 2281.27
But is still incorrect.
Which of these approaches is correct? And where am i going wrong?
Left, Right, Depth map. (EDIT: this depth map is from a different pair of images)
EDIT: Based on an answer, i am trying this:
`std::vector pointcloud;
float fx = 284.492615;
float fy = 285.683197;
float cx = 424;// 425.807709;
float cy = 400;// 395.494293;
cv::Mat Q = cv::Mat(4,4, CV_32F);
Q.at<float>(0, 0) = 1.0;
Q.at<float>(0, 1) = 0.0;
Q.at<float>(0, 2) = 0.0;
Q.at<float>(0, 3) = -cx; //cx
Q.at<float>(1, 0) = 0.0;
Q.at<float>(1, 1) = 1.0;
Q.at<float>(1, 2) = 0.0;
Q.at<float>(1, 3) = -cy; //cy
Q.at<float>(2, 0) = 0.0;
Q.at<float>(2, 1) = 0.0;
Q.at<float>(2, 2) = 0.0;
Q.at<float>(2, 3) = -fx; //Focal
Q.at<float>(3, 0) = 0.0;
Q.at<float>(3, 1) = 0.0;
Q.at<float>(3, 2) = -1.0 / 6; //1.0/BaseLine
Q.at<float>(3, 3) = 0.0; //cx - cx'
//
cv::Mat XYZcv(depth_image.size(), CV_32FC3);
reprojectImageTo3D(depth_image, XYZcv, Q, false, CV_32F);
for (int y = 0; y < XYZcv.rows; y++)
{
for (int x = 0; x < XYZcv.cols; x++)
{
cv::Point3f pointOcv = XYZcv.at<cv::Point3f>(y, x);
Eigen::Vector4d pointEigen(0, 0, 0, left.at<uchar>(y, x) / 255.0);
pointEigen[0] = pointOcv.x;
pointEigen[1] = pointOcv.y;
pointEigen[2] = pointOcv.z;
pointcloud.push_back(pointEigen);
}
}`
And that gives me a cloud.
I would recommend to use reprojectImageTo3D of OpenCV to reconstruct the distance from the disparity. Note that when using this function you indeed have to divide by 16 the output of StereoSGBM. You should already have all the parameters f, cx, cy, Tx. Take care to give f and Tx in the same units. cx, cy are in pixels.
Since the problem is that you need the Q matrix, I think that this link or this one should help you to build it. If you don't want to use reprojectImageTo3D I strongly recommend the first link!
I hope this helps!
To find the point-based depth of an object from the camera, use the following formula:
Depth = (Baseline x Focallength)/disparity
I hope you are using it correctly as per your question.
Try the below nerian calculator for the therotical error.
https://nerian.com/support/resources/calculator/
Also, use sub-pixel interpolation in your code.
Make sure object you are identifying for depth should have good texture.
The most common problems with depth maps are:
Untextured surfaces (plain object)
Calibration results are bad.
What is the RMS value for your calibration, camera resolution, and lens type(focal
length)? This is important to provide much better data for your program.
I have an rendered Image. I want to apply radial and tangential distortion coefficients to my image that I got from opencv. Even though there is undistort function, there is no distort function. How can I distort my images with distortion coefficients?
I was also looking for the same type of functionality. I couldn't find it, so I implemented it myself. Here is the C++ code.
First, you need to normalize the image point using the focal length and centers
rpt(0) = (pt_x - cx) / fx
rpt(1) = (pt_y - cy) / fy
then distort the normalized image point
double x = rpt(0), y = rpt(1);
//determining the radial distortion
double r2 = x*x + y*y;
double icdist = 1 / (1 - ((D.at<double>(4) * r2 + D.at<double>(1))*r2 + D.at<double>(0))*r2);
//determining the tangential distortion
double deltaX = 2 * D.at<double>(2) * x*y + D.at<double>(3) * (r2 + 2 * x*x);
double deltaY = D.at<double>(2) * (r2 + 2 * y*y) + 2 * D.at<double>(3) * x*y;
x = (x + deltaX)*icdist;
y = (y + deltaY)*icdist;
then you can translate and scale the point using the center of projection and focal length:
x = x * fx + cx
y = y * fy + cy
Im developing an iOS Augmented Reality application using OpenCV. I'm having issues creating the camera projection matrix to allow the OpenGL overlay to map directly on top of the marker. I feel this is due to my iPhone 6 camera not being correctly calibrated to the application. I know there is OpenCV code to calibrate webcams etc using the chess board, but I can't find a way to calibrate my embedded iPhone camera.
Is there a way? Or are there known estimate values for iPhone 6? Which include: focal length in x and y, primary point in x and y, along with the distortion coefficient matrix.
Any help will be appreciated.
EDIT:
Deduced values are as follows (using iPhone 6, camera feed resolution 1280x720):
fx=1229
cx=360
fy=1153
cy=640
This code provides an accurate estimate for the focal length and primary points for devices currently running iOS 9.1.
AVCaptureDeviceFormat *format = deviceInput.device.activeFormat;
CMFormatDescriptionRef fDesc = format.formatDescription;
CGSize dim = CMVideoFormatDescriptionGetPresentationDimensions(fDesc, true, true);
float cx = float(dim.width) / 2.0;
float cy = float(dim.height) / 2.0;
float HFOV = format.videoFieldOfView;
float VFOV = ((HFOV)/cx)*cy;
float fx = abs(float(dim.width) / (2 * tan(HFOV / 180 * float(M_PI) / 2)));
float fy = abs(float(dim.height) / (2 * tan(VFOV / 180 * float(M_PI) / 2)));
NOTE:
I had an initialization issue with this code. I recommend once the values are initialised and correctly set, to save them to a data file and read this file in for the values.
In my non-OpenCV AR application I am using field of view (FOV) of the iPhone's camera to construct the camera projection matrix. It works alright for displaying the Sun path overlaid on top of the camera view.
I don't know how much accuracy you need. It could be that knowing only FOV would not be enough you.
iOS API provides a way to get field of view of the camera. I get it as so:
AVCaptureDevice * camera = ...
AVCaptureDeviceFormat * format = camera.activeFormat;
float fieldOfView = format.videoFieldOfView;
After getting the FOV I compute the projection matrix:
typedef double mat4f_t[16]; // 4x4 matrix in column major order
mat4f_t projection;
createProjectionMatrix(projection,
GRAD_TO_RAD(fieldOfView),
viewSize.width/viewSize.height,
5.0f,
1000.0f);
where
void createProjectionMatrix(
mat4f_t mout,
float fovy,
float aspect,
float zNear,
float zFar)
{
float f = 1.0f / tanf(fovy/2.0f);
mout[0] = f / aspect;
mout[1] = 0.0f;
mout[2] = 0.0f;
mout[3] = 0.0f;
mout[4] = 0.0f;
mout[5] = f;
mout[6] = 0.0f;
mout[7] = 0.0f;
mout[8] = 0.0f;
mout[9] = 0.0f;
mout[10] = (zFar+zNear) / (zNear-zFar);
mout[11] = -1.0f;
mout[12] = 0.0f;
mout[13] = 0.0f;
mout[14] = 2 * zFar * zNear / (zNear-zFar);
mout[15] = 0.0f;
}