I've been trying to generate a point cloud from a pair of rectified stereo images. I first obtained the disparity map using OpenCV's SGBM implementation. I then converted it to a point cloud using the following code:
for (int u=0; u < left.rows; ++u)
{
    for (int v=0; v < left.cols; ++v)
    {
        if(disp.at<int>(u,v)==0)continue;
        pcl::PointXYZRGB p;
        p.x = v;
        p.y = u;
        p.z = (left_focalLength * baseLine * 0.01/ disp.at<int>(u,v));
        std::cout << p.z << std::endl;
        cv::Vec3b bgr(left.at<cv::Vec3b>(u,v));
        p.b = bgr[0];
        p.g = bgr[1];
        p.r = bgr[2];
        pc.push_back(p);
    }
}
Here, left is the left image and disp is the output disparity image, in CV_16S format.
Is my disparity to pcl conversion correct or is it a problem with the disparity values?
I've included a screenshot of the disparity map, point cloud and original left image.
Thank you!
I'm not familiar with this language, but I noticed one thing.
Assuming that this line converts disparity to depth (Z):
p.z = (left_focalLength * baseLine * 0.01/ disp.at<int>(u,v));
What is the 0.01 factor for? If the calculation without it gives depths (Z) in the range 1 to 10, this factor shrinks the range to 0.01 to 0.1. Every depth is then close to zero, so the cloud looks flat (flat image = constant depth).
PS: I don't see any conversion in your code from the pixel coordinates u, v to X, Y using the Z value. It should be something like:
X = u*Z/f
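To make the last point concrete, here is a minimal Python sketch of a full disparity-to-XYZ conversion. It assumes a pinhole model with hypothetical parameters fx, fy, cx, cy (in pixels) and baseline (in metres), none of which appear in the question, and it assumes the raw CV_16S value from StereoSGBM is a fixed-point disparity scaled by 16. The function name is made up for illustration.

```python
def disparity_to_point(col, row, raw_disp, fx, fy, cx, cy, baseline, disp_scale=16.0):
    """Back-project one pixel to camera coordinates (sketch, pinhole model assumed).

    raw_disp is the value stored in the CV_16S disparity image; dividing by
    disp_scale turns the fixed-point value into a disparity in pixels.
    """
    disparity = raw_disp / disp_scale
    if disparity <= 0:
        return None  # invalid or unmatched pixel
    z = fx * baseline / disparity      # depth from disparity
    x = (col - cx) * z / fx            # X depends on Z, not only on the column
    y = (row - cy) * z / fy            # Y depends on Z, not only on the row
    return x, y, z


# Hypothetical numbers: raw value 160 is a 10 px disparity
print(disparity_to_point(420, 240, 160, 500.0, 500.0, 320.0, 240.0, 0.1))
```

In the C++ code, a CV_16S image would be read with disp.at<short>(row, col) rather than at<int>, since at<int> reads four bytes per element.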
Given a set of 3D points in camera's perspective corresponding to a planar surface (ground), is there any fast efficient method to find the orientation of the plane regarding the camera's plane? Or is it only possible by running heavier "surface matching" algorithms on the point cloud?
I've tried to use estimateAffine3D and findHomography, but my main limitation is that I don't have the point coordinates on the surface plane - I can only select a set of points from the depth images and thus must work from a set of 3D points in the camera frame.
I've written a simple geometric approach that takes a couple of points and computes vertical and horizontal angles based on depth measurements, but I fear this is neither very robust nor very precise.
EDIT: Following the suggestion by @Micka, I've attempted to fit the points to a 2D plane in the camera's frame, with the following function:
#include <opencv2/opencv.hpp>
//------------------------------------------------------------------------------
/// @brief Fits a set of 3D points to a 2D plane, by solving a system of linear equations of type aX + bY + cZ + d = 0
///
/// @param[in] points The points
///
/// @return 3x1 Mat with plane equation coefficients [a, b, d] (c is fixed to -1)
///
cv::Mat fitPlane(const std::vector< cv::Point3d >& points) {
    // plane equation: aX + bY + cZ + d = 0
    // assuming c=-1 -> aX + bY + d = z
    cv::Mat xys = cv::Mat::ones(points.size(), 3, CV_64FC1);
    cv::Mat zs = cv::Mat::ones(points.size(), 1, CV_64FC1);
    // populate left and right hand matrices
    for (int idx = 0; idx < points.size(); idx++) {
        xys.at< double >(idx, 0) = points[idx].x;
        xys.at< double >(idx, 1) = points[idx].y;
        zs.at< double >(idx, 0) = points[idx].z;
    }
    // coeff mat
    cv::Mat coeff(3, 1, CV_64FC1);
    // problem is now xys * coeff = zs
    // solving using SVD should output coeff
    cv::SVD svd(xys);
    svd.backSubst(zs, coeff);
    // alternative approach -> requires mat with 3D coordinates & additional col
    // solves xyzs * coeff = 0
    // cv::SVD::solveZ(xyzs, coeff); // @note: data type must be double (CV_64FC1)
    // check result w/ input coordinates (plane equation should output null or very small values)
    double a = coeff.at< double >(0);
    double b = coeff.at< double >(1);
    double d = coeff.at< double >(2);
    for (auto& point : points) {
        std::cout << a * point.x + b * point.y + d - point.z << std::endl;
    }
    return coeff;
}
For simplicity, it is assumed that the camera is properly calibrated and that the 3D reconstruction is correct. I validated this previously, so it is out of scope for this question. I use the mouse to select points on a depth/color frame pair, reconstruct the 3D coordinates and pass them into the function above.
I've also tried other approaches beyond cv::SVD::solveZ(), such as inverting xyz with cv::invert(), and with cv::solve(), but it always ended in either ridiculously small values or runtime errors regarding matrix size and/or type.
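For reference, here is a small Python sketch of the same fit (z = a*x + b*y + d, i.e. c fixed to -1) done through the normal equations, plus one way to turn the coefficients into an orientation. It assumes the plane is not parallel to the optical axis, which the c = -1 parameterisation already requires. The function names are hypothetical, and the tilt is defined here as the angle between the plane normal (a, b, -1) and the camera's Z axis.

```python
import math


def det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def fit_plane(points):
    """Least-squares fit of z = a*x + b*y + d; returns (a, b, d)."""
    n = float(len(points))
    sx = sum(x for x, y, z in points)
    sy = sum(y for x, y, z in points)
    sz = sum(z for x, y, z in points)
    sxx = sum(x * x for x, y, z in points)
    sxy = sum(x * y for x, y, z in points)
    syy = sum(y * y for x, y, z in points)
    sxz = sum(x * z for x, y, z in points)
    syz = sum(y * z for x, y, z in points)
    m = [[sxx, sxy, sx], [sxy, syy, sy], [sx, sy, n]]   # A^T A
    rhs = [sxz, syz, sz]                                # A^T z
    d0 = det3(m)
    if abs(d0) < 1e-12:
        raise ValueError("points are collinear or too few")
    coeffs = []
    for col in range(3):                                # Cramer's rule
        mc = [row[:] for row in m]
        for r in range(3):
            mc[r][col] = rhs[r]
        coeffs.append(det3(mc) / d0)
    return tuple(coeffs)


def tilt_from_optical_axis(a, b):
    """Angle (radians) between the plane normal (a, b, -1) and the Z axis."""
    return math.acos(1.0 / math.sqrt(a * a + b * b + 1.0))


a, b, d = fit_plane([(0, 0, 2.0), (1, 0, 2.5), (0, 1, 2.0), (1, 1, 2.5)])
print(a, b, d, math.degrees(tilt_from_optical_axis(a, b)))
```

With noisy depth data, selecting many points spread over the surface should make the fit more stable than a handful of clicks.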
I was trying to generate a 3D point cloud (PC) from an image with predicted depths. The camera intrinsics and the ground truth depth images are given. Firstly, I am generating a PC with the GT depth using the camera intrinsic and it looks like this:
But, when I try to generate the PC for the same image with the predicted depths, the PC looks weird. Here is the PC with the predicted depths:
I am using the same camera intrinsics for doing this. I am using the same code and procedure for both the PC generations. I was expecting two PCs to be close but what I am getting is really weird. What am I doing wrong?
My code for generating the point cloud is as follows:
int rows = RGB.size[0];
int cols = RGB.size[1];
for (int v = 0; v < rows; v++) {
    for (int u = 0; u < cols; u++) {
        // divide by 5000.0 (not 5000) so the depth is not truncated to an integer
        auto z = depth.at<ushort>(v, u) / 5000.0;
        auto x = (u - intrinsics.cx) * z / intrinsics.fx;
        auto y = (v - intrinsics.cy) * z / intrinsics.fy;
        // std::cout<<"x = "<< x << " y = " << y <<std::endl;
        point3d << x, y, z;
        pc.vertices.push_back(point3d);
        pc.colors.push_back(RGB.at<cv::Vec3b>(v, u));
    }
}
The GT depth image:
The predicted depth image:
Edit: I found the mistake. The depth values were scaled by 5000. So, I missed that part and didn't divide the value of z while constructing the point cloud. After dividing by 5000, the problem was resolved.
The depth value should have been divided by 5000 while constructing the 3D scene as the depth values were scaled by 5000 originally.
For details, see the camera intrinsics and the guide on how to construct the 3D point cloud.
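As a sanity check on the scaling, here is a minimal Python sketch of the back-projection for a single pixel. It assumes a pinhole camera and that the raw 16-bit depth value is the depth in metres multiplied by 5000, as stated above. The function name and the intrinsics in the example call are hypothetical.

```python
def depth_to_point(u, v, raw_depth, fx, fy, cx, cy, depth_scale=5000.0):
    """Back-project pixel (u, v) with a raw 16-bit depth value (sketch)."""
    z = raw_depth / depth_scale        # float division keeps the fractional depth
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    return x, y, z


# Hypothetical intrinsics; raw value 10000 corresponds to a depth of 2.0
print(depth_to_point(820, 240, 10000, 500.0, 500.0, 320.0, 240.0))
```

If the same raw value were divided with integer arithmetic, every depth would collapse to a whole number of units, which gives a point cloud made of a few flat layers.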
I am working on a project which involves ArUco markers and OpenCV.
I am quite far along in the project. I can read the rotation vectors and convert them to a rotation matrix (the "Rodrigues matrix" below) using Rodrigues() from OpenCV.
This is an example of a Rodrigues matrix I get:
[0,1,0;
1,0,0;
0,0,-1]
I use the following code.
Mat m33(3, 3, CV_64F);
Mat measured_eulers(3, 1, CV_64F);
Rodrigues(rotationVectors, m33);
measured_eulers = rot2euler(m33);
Degree_euler = measured_eulers * 180 / CV_PI;
I use the predefined rot2euler to convert from rodrigues matrix to euler angles.
And I convert the received radians to degrees.
rot2euler looks like the following.
Mat rot2euler(const Mat & rotationMatrix)
{
    Mat euler(3, 1, CV_64F);
    double m00 = rotationMatrix.at<double>(0, 0);
    double m02 = rotationMatrix.at<double>(0, 2);
    double m10 = rotationMatrix.at<double>(1, 0);
    double m11 = rotationMatrix.at<double>(1, 1);
    double m12 = rotationMatrix.at<double>(1, 2);
    double m20 = rotationMatrix.at<double>(2, 0);
    double m22 = rotationMatrix.at<double>(2, 2);
    double x, y, z;
    // Assuming the angles are in radians.
    if (m10 > 0.998) { // singularity at north pole
        x = 0;
        y = CV_PI / 2;
        z = atan2(m02, m22);
    }
    else if (m10 < -0.998) { // singularity at south pole
        x = 0;
        y = -CV_PI / 2;
        z = atan2(m02, m22);
    }
    else
    {
        x = atan2(-m12, m11);
        y = asin(m10);
        z = atan2(-m20, m00);
    }
    euler.at<double>(0) = x;
    euler.at<double>(1) = y;
    euler.at<double>(2) = z;
    return euler;
}
If I use the Rodrigues matrix I gave as an example, I get the following Euler angles:
[0; 90; -180]
But I am supposed to get the following:
[-180; 0; 90]
When I use this tool: http://danceswithcode.net/engineeringnotes/rotations_in_3d/demo3D/rotations_in_3d_tool.html
you can see that [0; 90; -180] doesn't match the Rodrigues matrix, but [-180; 0; 90] does. (I am aware that the tool works with ZYX coordinates.)
So the problem is that I get the correct values but in the wrong order.
Another problem is that this isn't always the case.
For example rodrigues matrix:
[1,0,0;
0,-1,0;
0,0,-1]
Provides me the correct euler angles.
If someone knows a solution to the problem, or can explain exactly how the rot2euler function works, it would be highly appreciated.
Kind Regards
Brent Convens
I guess I am quite late, but I'll answer nonetheless.
Don't quote me on this, as I'm not 100% certain, but this is one of the files ({OPENCV_INSTALLATION_DIR}/apps/interactive-calibration/rotationConverters.cpp) from the source code of OpenCV 3.3.
It seems to me that OpenCV is giving you Y-Z-X (similar to what is shown in the code above).
The reason I'm not sure is that I looked at the source code of cv::Rodrigues and it doesn't seem to call the code I mentioned above. The Rodrigues function has the math hardcoded into it. I think the convention can be checked by multiplying the elementary rotation matrices as R = Ry * Rz * Rx and then looking for the place in the code where there is an acos(R(2,0)), an asin(R(0,2)), or something similar: one of the elements of R is usually a single cosine or sine, which tells you which angle is being recovered.
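That check can be sketched numerically in Python. The snippet below builds R = Ry * Rz * Rx from three known angles and feeds it to the non-singular branch of the question's rot2euler formulas. The helper name compose_y_z_x is made up for illustration, and this only tests the Y-Z-X guess against those formulas; it says nothing about what cv::Rodrigues does internally.

```python
import math


def compose_y_z_x(about_y, about_z, about_x):
    """Return R = Ry(about_y) * Rz(about_z) * Rx(about_x) as nested lists."""
    ca, sa = math.cos(about_y), math.sin(about_y)
    cb, sb = math.cos(about_z), math.sin(about_z)
    cc, sc = math.cos(about_x), math.sin(about_x)
    return [
        [ca * cb, -ca * sb * cc + sa * sc, ca * sb * sc + sa * cc],
        [sb, cb * cc, -cb * sc],
        [-sa * cb, sa * sb * cc + ca * sc, -sa * sb * sc + ca * cc],
    ]


def rot2euler(m):
    """Non-singular branch of the rot2euler function from the question."""
    x = math.atan2(-m[1][2], m[1][1])
    y = math.asin(m[1][0])
    z = math.atan2(-m[2][0], m[0][0])
    return x, y, z


# x comes back as the angle about X, y as the angle about Z, z as the angle about Y
print(rot2euler(compose_y_z_x(0.3, 0.2, -0.5)))
```

So the three outputs are not in X, Y, Z order: the second entry is the rotation about Z and the third is the rotation about Y. Note also that in the question's first example matrix m10 equals 1, so the singular branch is taken there and x is forced to 0.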
Not specific to OpenCV, but you could write something like this:
cosine_for_pitch = math.sqrt(pose_mat[0][0] ** 2 + pose_mat[1][0] ** 2)
is_singular = cosine_for_pitch < 10**-6
if not is_singular:
    yaw = math.atan2(pose_mat[1][0], pose_mat[0][0])
    pitch = math.atan2(-pose_mat[2][0], cosine_for_pitch)
    roll = math.atan2(pose_mat[2][1], pose_mat[2][2])
else:
    yaw = math.atan2(-pose_mat[1][2], pose_mat[1][1])
    pitch = math.atan2(-pose_mat[2][0], cosine_for_pitch)
    roll = 0
Here, you could explore more:
https://www.learnopencv.com/rotation-matrix-to-euler-angles/
http://www.staff.city.ac.uk/~sbbh653/publications/euler.pdf
I propose using the PCL library for this:
pcl::getEulerAngles(transformation, roll, pitch, yaw);
You just need to declare roll, pitch and yaw and pass in a pre-calculated transformation matrix.
The Project Tango C API documentation says that the TANGO_CALIBRATION_POLYNOMIAL_3_PARAMETERS lens distortion is modeled as:
x_corr_px = x_px (1 + k1 * r2 + k2 * r4 + k3 * r6)
y_corr_px = y_px (1 + k1 * r2 + k2 * r4 + k3 * r6)
That is, the undistorted coordinates are a power series function of the distorted coordinates. There is another definition in the Java API, but that description isn't detailed enough to tell which direction the function maps.
I've had a lot of trouble getting things to register properly, and I suspect that the mapping may actually go in the opposite direction, i.e. the distorted coordinates are a power series of the undistorted coordinates. If the camera calibration was produced using OpenCV, then the cause of the problem may be that the OpenCV documentation contradicts itself. The easiest description to find and understand is the OpenCV camera calibration tutorial, which does agree with the Project Tango docs:
But on the other hand, the OpenCV API documentation specifies that the mapping goes the other way:
My experiments with OpenCV show that its API documentation appears correct and the tutorial is wrong. A positive k1 (with all other distortion parameters set to zero) means pincushion distortion, and a negative k1 means barrel distortion. This matches what Wikipedia says about the Brown-Conrady model and will be opposite from the Tsai model. Note that distortion can be modeled either way depending on what makes the math more convenient. I opened a bug against OpenCV for this mismatch.
So my question: Is the Project Tango lens distortion model the same as the one implemented in OpenCV (documentation notwithstanding)?
Here's an image I captured from the color camera (slight pincushioning is visible):
And here's the camera calibration reported by the Tango service:
distortion = {double[5]@3402}
    [0] = 0.23019999265670776
    [1] = -0.6723999977111816
    [2] = 0.6520439982414246
    [3] = 0.0
    [4] = 0.0
calibrationType = 3
cx = 638.603
cy = 354.906
fx = 1043.08
fy = 1043.1
cameraId = 0
height = 720
width = 1280
Here's how to undistort with OpenCV in python:
>>> import cv2
>>> import numpy
>>> src = cv2.imread('tango00042.png')
>>> d = numpy.array([0.2302, -0.6724, 0, 0, 0.652044])
>>> m = numpy.array([[1043.08, 0, 638.603], [0, 1043.1, 354.906], [0, 0, 1]])
>>> h,w = src.shape[:2]
>>> mDst, roi = cv2.getOptimalNewCameraMatrix(m, d, (w,h), 1, (w,h))
>>> dst = cv2.undistort(src, m, d, None, mDst)
>>> cv2.imwrite('foo.png', dst)
And that produces this, which is maybe a bit overcorrected at the top edge but much better than my attempts with the reverse model:
The Tango C API docs state that (x_corr_px, y_corr_px) is the "corrected output position". This corrected output position then needs to be scaled by the focal length and offset by the center of projection to correspond to distorted pixel coordinates.
So, to project a point onto an image, you would have to:
1. Transform the 3D point so that it is in the frame of the camera.
2. Convert the point into normalized image coordinates (x, y).
3. Calculate r2, r4, r6 for the normalized image coordinates (r2 = x*x + y*y).
4. Compute (x_corr_px, y_corr_px) based on the mentioned equations:
   x_corr_px = x (1 + k1 * r2 + k2 * r4 + k3 * r6)
   y_corr_px = y (1 + k1 * r2 + k2 * r4 + k3 * r6)
5. Compute the distorted coordinates:
   x_dist_px = x_corr_px * fx + cx
   y_dist_px = y_corr_px * fy + cy
6. Draw (x_dist_px, y_dist_px) on the original, distorted image buffer.
This also means that the corrected coordinates are the normalized coordinates scaled by a power series of the normalized image coordinates' magnitude. (this is the opposite of what the question suggests)
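The projection steps above can be sketched in Python as follows. This is only an illustration of the radial-only model described in this answer, with the intrinsics and coefficients taken from the calibration posted in the question. The function name is hypothetical, and the point is assumed to already be in the camera frame with Z > 0.

```python
def project_point(point_cam, fx, fy, cx, cy, k1, k2, k3):
    """Project a 3D point in the camera frame to distorted pixel coordinates."""
    X, Y, Z = point_cam
    x, y = X / Z, Y / Z                      # normalized image coordinates
    r2 = x * x + y * y
    r4 = r2 * r2
    r6 = r4 * r2
    radial = 1 + k1 * r2 + k2 * r4 + k3 * r6
    x_corr, y_corr = x * radial, y * radial  # "corrected output position"
    return x_corr * fx + cx, y_corr * fy + cy


# Calibration values reported by the Tango service in the question
print(project_point((0.2, 0.1, 1.0), 1043.08, 1043.1, 638.603, 354.906,
                    0.2302, -0.6724, 0.652044))
```

A point on the optical axis lands exactly on (cx, cy), which is a quick way to check that the intrinsics are wired up correctly.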
Looking at the implementation of cvProjectPoints2 in OpenCV (see [opencv]/modules/calib3d/src/calibration.cpp), the "Poly3" distortion in OpenCV is being applied the same direction as in Tango. All 3 versions (Tango Docs, OpenCV Tutorials, OpenCV API) are consistent and correct.
Good luck, and hopefully this helps!
(Update: Taking a closer look at the code, it looks like the corrected coordinates and distorted coordinates are not the same. I've removed the incorrect parts of my response, and the remaining parts of this answer are still correct.)
Maybe it's not the right place to post, but I really want to share the readable version of code used in OpenCV to actually correct the distortion.
I'm sure that I'm not the only one who needs x_corrected and y_corrected and fails to find an easy and understandable formula.
I've rewritten the essential part of cv2.undistortPoints in Python, and you may notice that the correction is performed iteratively. This is important because there is no closed-form solution for a 9th-degree polynomial; all we can do is apply the reversed version several times to get a numerical solution.
def myUndistortPoint(point, CM, DC):
    x0, y0 = point  # Python 3 does not allow tuple parameters in the signature
    [[k1, k2, p1, p2, k3, k4, k5, k6]] = DC
    fx, _, cx = CM[0]
    _, fy, cy = CM[1]
    x = x_src = (x0 - cx) / fx
    y = y_src = (y0 - cy) / fy
    for _ in range(5):
        r2 = x**2 + y**2
        r4 = r2**2
        r6 = r2 * r4
        rad_dist = (1 + k4*r2 + k5*r4 + k6*r6) / (1 + k1*r2 + k2*r4 + k3*r6)
        tang_dist_x = 2*p1 * x*y + p2*(r2 + 2*x**2)
        tang_dist_y = 2*p2 * x*y + p1*(r2 + 2*y**2)
        x = (x_src - tang_dist_x) * rad_dist
        y = (y_src - tang_dist_y) * rad_dist
    x = x * fx + cx
    y = y * fy + cy
    return x, y
To speed this up, you can use only three iterations; on most cameras this gives enough precision to fit the pixels.
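To see how quickly the iteration converges, here is a stripped-down Python sketch that keeps only the radial terms k1, k2, k3 and works in normalized coordinates. It distorts a point with the forward model and then recovers it with the same fixed-point scheme. The function names are made up, and the coefficients are the ones from the Tango calibration in the question.

```python
def distort(x, y, k1, k2, k3):
    """Forward radial model in normalized coordinates."""
    r2 = x * x + y * y
    factor = 1 + k1 * r2 + k2 * r2**2 + k3 * r2**3
    return x * factor, y * factor


def undistort(xd, yd, k1, k2, k3, iterations=5):
    """Invert distort() by fixed-point iteration, starting from the distorted point."""
    x, y = xd, yd
    for _ in range(iterations):
        r2 = x * x + y * y
        factor = 1 + k1 * r2 + k2 * r2**2 + k3 * r2**3
        x, y = xd / factor, yd / factor
    return x, y


k = (0.2302, -0.6724, 0.652044)
xd, yd = distort(0.3, 0.2, *k)
print(undistort(xd, yd, *k, iterations=3))
print(undistort(xd, yd, *k, iterations=5))
```

For mild distortion like this, each pass shrinks the error considerably, which is why a few iterations are usually enough. Strong distortion near the image corners may need more.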
I'm trying to reimplement OpenCV's resize algorithm with bilinear interpolation in C. I want the resulting image to be exactly the same, pixel value for pixel value, as the one produced by OpenCV. I am interested in shrinking rather than magnification, and in single-channel grayscale images. I have read online that the bilinear interpolation algorithm differs between shrinking and enlarging, but I did not find formulas for shrinking implementations, so it is likely that the code I wrote is totally wrong. What I wrote comes from my knowledge of interpolation acquired in a university course in Computer Graphics and OpenGL. My algorithm produces images that are visually identical to those produced by OpenCV, but the pixel values are not perfectly identical (in particular near edges). Can you show me the shrinking algorithm with bilinear interpolation and a possible implementation?
Note: The attached code is a one-dimensional filter that must be applied first horizontally and then vertically (i.e. with the matrix transposed).
Mat rescale(Mat src, float ratio){
    float width = src.cols * ratio; //resized width
    int i_width = cvRound(width);
    float step = (float)src.cols / (float)i_width; //size of new pixels mapped over old image
    float center = step / 2; //V1 - center position of new pixel
    //float center = step / src.cols; //V2 - other possible center position of new pixel
    //float center = 0.099f; //V3 - Lena 512x512 lower difference possible to OpenCV
    Mat dst(src.rows, i_width, CV_8UC1);
    //cycle through all rows
    for(int j = 0; j < src.rows; j++){
        //in each row compute new pixels
        for(int i = 0; i < i_width; i++){
            float pos = (i*step) + center; //position of (the center of) new pixel in old map coordinates
            int pred = floor(pos); //predecessor pixel in the original image
            int succ = ceil(pos); //successor pixel in the original image
            float d_pred = pos - pred; //pred and succ distances from the center of new pixel
            float d_succ = succ - pos;
            int val_pred = src.at<uchar>(j, pred); //pred and succ values
            int val_succ = src.at<uchar>(j, succ);
            float val = (val_pred * d_succ) + (val_succ * d_pred); //inverting d_succ and d_pred, supposing "d_succ = 1 - d_pred"...
            int i_val = cvRound(val);
            if(i_val == 0) //if pos is a perfect int "x.0000", pred and succ are the same pixel
                i_val = val_pred;
            dst.at<uchar>(j, i) = i_val;
        }
    }
    return dst;
}
Bilinear interpolation is not separable in the sense that you can resize horizontally and then resize again vertically. See example here.
You can see OpenCV's resize code here.
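As a starting point for comparing against that code, here is a small Python sketch of one row of a linear resize using pixel-center alignment, i.e. src_x = (dst_x + 0.5) * scale - 0.5, which to my understanding is the coordinate mapping OpenCV uses for INTER_LINEAR. The function name is hypothetical. The sketch works in floating point, whereas OpenCV (as far as I can tell) uses fixed-point weights for 8-bit images, so values may still differ by one grey level after rounding.

```python
import math


def resize_row_linear(row, dst_width):
    """Linearly resample one row with pixel-center alignment (float sketch)."""
    src_width = len(row)
    scale = src_width / dst_width
    out = []
    for dx in range(dst_width):
        sx = (dx + 0.5) * scale - 0.5          # center of dst pixel in src coordinates
        x0 = math.floor(sx)
        frac = sx - x0                         # weight of the right-hand neighbour
        left = row[min(max(x0, 0), src_width - 1)]        # clamp at the borders
        right = row[min(max(x0 + 1, 0), src_width - 1)]
        out.append(left * (1 - frac) + right * frac)
    return out


print(resize_row_linear([0, 10, 20, 30], 2))
```

Compared with the code in the question, the main difference is the offset: step / 2 alone places the sample half a source pixel further to the right than (i + 0.5) * step - 0.5 does, which would be most visible near edges.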