OpenCV: stereoRectify results in empty image

I have 2 cameras and want to calculate the disparity between them.
The translation between the cameras is mainly in the z-direction (i.e. along the optical axis) and a little bit in the x and y directions, for example (0.2, 0.2, 0.8).
When I calculate the rectification parameters with the stereoRectify() method, the output images are just black. Using other translation vectors works just fine, but then the results are of course wrong.
Why is that, and how can I solve the problem?
Edit: With these translation values both rectified images come out black. Other (wrong) translation vectors work just fine. Changing the value of alpha doesn't change much.
import numpy as np
import cv2
from pyquaternion import Quaternion  # assumed: a quaternion class exposing a .rotation_matrix property

# Relative rotation (nearly the identity) and the problematic translation
rotation_quat = Quaternion(0.999999913938509, 0.00029714546497339216, -0.00011465939948083866, 0.0002658585515330323)
rotation = rotation_quat.rotation_matrix.astype(np.float64)
translation = np.array([0.2, 0.2, 0.8]).astype(np.float64)
R1, R2, P1, P2, Q, roi1, roi2 = cv2.stereoRectify(intrinsicMatrix1, distCoeffs1, intrinsicMatrix2, distCoeffs2, img1.shape[::-1], rotation, translation, alpha=1)
map11, map12 = cv2.initUndistortRectifyMap(intrinsicMatrix1, distCoeffs1, R1, P1, img1.shape[::-1], cv2.CV_32FC1)
map21, map22 = cv2.initUndistortRectifyMap(intrinsicMatrix2, distCoeffs2, R2, P2, img2.shape[::-1], cv2.CV_32FC1)  # was distCoeffs1; camera 2 should use its own distortion coefficients
# rectify
img1_rect = cv2.remap(img1, map11, map12, cv2.INTER_LANCZOS4)
img2_rect = cv2.remap(img2, map21, map22, cv2.INTER_LANCZOS4)
cv2.imshow("img1_rect", img1_rect)
cv2.waitKey(0)
cv2.imshow("img2_rect", img2_rect)
cv2.waitKey(0)
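A quick diagnostic sketch (reusing the variables above, so it only runs in the same script): if the maps returned by initUndistortRectifyMap point entirely outside the source image, remap() has nothing to sample and the rectified output is black. Checking what fraction of rectified pixels sample inside the source image makes this visible:
h, w = img1.shape[:2]
inside = (map11 >= 0) & (map11 < w) & (map12 >= 0) & (map12 < h)  # map11 holds x coordinates, map12 holds y coordinates (CV_32FC1)
print("fraction of rectified pixels sampling inside img1:", inside.mean())
If this prints something close to 0, the rectifying rotation/projection computed by stereoRectify places the image outside the output view, which can happen when the baseline is dominated by the z (forward) component, since stereoRectify assumes a mostly horizontal or vertical baseline.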

Related

Stereo rectification with OpenCV 4.2.0

I already do rectification with my own code. Now I am trying to make cv2.stereoRectify work.
Suppose I have the following code:
import numpy as np
import cv2
img1 = cv2.imread(IMG_LEFT) # Load image left
img2 = cv2.imread(IMG_RIGHT) # Load image right
A1 = np.array(A1) # Left camera matrix intrinsic
A2 = np.array(A2) # Right camera matrix intrinsic
RT1 = np.array(RT1) # Left camera extrinsic (3x4)
RT2 = np.array(RT2) # Right camera extrinsic (3x4)
# Original projection matrices
Po1 = A1.dot( RT1 )
Po2 = A2.dot( RT2 )
# Camera centers (world coord.)
C1 = -np.linalg.inv(Po1[:,:3]).dot(Po1[:,3])
C2 = -np.linalg.inv(Po2[:,:3]).dot(Po2[:,3])
# Transformations
T1to2 = C2 - C1 # Translation from first to second camera
R1to2 = RT2[:,:3].dot(np.linalg.inv(RT1[:,:3])) # Rotation from first to second camera (3x3)
Then, I would like to find the rectification transformations (3x3). Following the OpenCV documentation I am trying:
Rectify1, Rectify2, Pn1, Pn2, _, _, _ = cv2.stereoRectify(A1, np.zeros((1,5)), A2, np.zeros((1,5)), (img1.shape[1], img1.shape[0]), R1to2, T1to2, alpha=-1 )
mapL1, mapL2 = cv2.initUndistortRectifyMap(A1, np.zeros((1,5)), Rectify1, Pn1, (img1.shape[1], img1.shape[0]), cv2.CV_32FC1)
mapR1, mapR2 = cv2.initUndistortRectifyMap(A2, np.zeros((1,5)), Rectify2, Pn2, (img2.shape[1], img2.shape[0]), cv2.CV_32FC1)
img1_rect = cv2.remap(img1, mapL1, mapL2, cv2.INTER_LINEAR)
img2_rect = cv2.remap(img2, mapR1, mapR2, cv2.INTER_LINEAR)
Anyway, I am getting completely distorted images, certainly not rectified ones. What am I doing wrong?
I guess it is something about the rotations/translations, but I cannot figure it out.
Also, isn't OpenCV a bit overcomplicated about this? It should be a simple operation anyway.
Many thanks.
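One detail worth double-checking in the snippet above, sketched here with the same RT1/RT2 names (a general note on the convention, not necessarily the fix for this particular case): cv2.stereoRectify expects R and T as the transform that maps points from the first camera frame into the second camera frame (as returned by stereoCalibrate), not a difference of camera centers in world coordinates. Under the convention Pc = R_i * Pw + t_i that would be:
R1w, t1 = RT1[:, :3], RT1[:, 3]
R2w, t2 = RT2[:, :3], RT2[:, 3]
R_1to2 = R2w @ R1w.T            # rotation from camera 1 to camera 2 (same as R1to2 above)
T_1to2 = t2 - R_1to2 @ t1       # translation expressed in camera-2 coordinates, not world coordinates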
EDIT:
You may notice that I set distortion parameters to zero. That is because I am using computer generated stereo images that have no lens distortion.
Digging into the OpenCV documentation I found the reason why stereoRectify() did not seem to work.
Most research papers refer to a homography transformation that is applied to the image.
OpenCV, instead, computes the rotation in object space, as is (somewhat tersely) explained in the cv2.initUndistortRectifyMap() documentation (see here).
So after calling:
R1, R2, Pn1, Pn2, _, _, _ = cv2.stereoRectify(A1, np.zeros((1,5)), A2, np.zeros((1,5)), (img1.shape[1], img1.shape[0]), R1to2, T1to2, alpha=-1 )
To get the rectification transformations I use:
Rectify1 = R1.dot(np.linalg.inv(A1))
Rectify2 = R2.dot(np.linalg.inv(A2))
where R1 and R2 are the rotations output by OpenCV and A1 and A2 are the 3x3 intrinsic camera matrices.
It seems to work just fine. Please comment if you have a better idea.
I hope this is useful to someone.
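For comparison, a minimal sketch under the same zero-distortion assumption, reusing the A1, A2, R1, R2, Pn1, Pn2, img1, img2 names from above: with no lens distortion, the per-pixel map built by initUndistortRectifyMap reduces to a plain homography H = P_new[:, :3] * R * inv(K_old), so the rectified images can also be produced with warpPerspective:
H1 = Pn1[:, :3].dot(R1).dot(np.linalg.inv(A1))  # pixel-to-pixel rectifying homography, left image
H2 = Pn2[:, :3].dot(R2).dot(np.linalg.inv(A2))  # pixel-to-pixel rectifying homography, right image
img1_rect_h = cv2.warpPerspective(img1, H1, (img1.shape[1], img1.shape[0]))
img2_rect_h = cv2.warpPerspective(img2, H2, (img2.shape[1], img2.shape[0]))
These should match the remap-based results up to interpolation differences.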

Stereo Image rectification using opencv is not working

I am trying to rectify two stereo images using cv2.stereoRectify after finding the camera matrices of the two cameras (K1, D1, K2, D2, R, T). After that, I got R1, R2, P1, P2, Q, roi1, roi2 and used these parameters in cv2.initUndistortRectifyMap to get the left and right rectified images.
But after using this method the rectified images are not good. I am using alpha = -1.
Here is my code:
R1, R2, P1, P2, Q, roi1, roi2 = cv2.stereoRectify(cal_data.camera_model.get('M1'), cal_data.camera_model.get('dist1'), cal_data.camera_model.get('M2'), cal_data.camera_model.get('dist2'),
(960, 544), cal_data.camera_model.get('R'), cal_data.camera_model.get('T'), alpha=-1)
print(R1, R2, P1, P2)
leftFrame = cv2.imread('/home/nikhil_m/Pictures/Webcam/2019-07-29-171837.jpg')
rightFrame = cv2.imread('/home/nikhil_m/Pictures/Webcam/2019-07-29-171809.jpg')
leftFrame = cv2.resize(leftFrame,(960,544))
rightFrame = cv2.resize(rightFrame, (960, 544))
leftMapX, leftMapY = cv2.initUndistortRectifyMap(cal_data.camera_model.get('M1'), cal_data.camera_model.get('dist1'), R1, P1, (960,544), cv2.CV_32FC1)
left_rectified = cv2.remap(leftFrame, leftMapX, leftMapY, cv2.INTER_LINEAR, borderMode=cv2.BORDER_CONSTANT)  # borderMode must be passed by keyword; the fifth positional argument is dst
rightMapX, rightMapY = cv2.initUndistortRectifyMap(cal_data.camera_model.get('M2'), cal_data.camera_model.get('dist2'), R2, P2, (960,544), cv2.CV_32FC1)
right_rectified = cv2.remap(rightFrame, rightMapX, rightMapY, cv2.INTER_LINEAR, borderMode=cv2.BORDER_CONSTANT)
Is there any better way to rectify the images, or am I doing something wrong? Please help.
Edit: calibrated image and original image (screenshots not reproduced here).
Calibration output:
Intrinsic_mtx_1 [[1.22248627e+03 0.00000000e+00 5.24929333e+02]
[0.00000000e+00 1.32603348e+03 4.99669610e+01]
[0.00000000e+00 0.00000000e+00 1.00000000e+00]]
dist_1 [[ 0.09850468 1.08533383 -0.10682535 0.01777223 -3.39061053]]
Intrinsic_mtx_2 [[1.07148978e+03 0.00000000e+00 4.21476300e+02]
[0.00000000e+00 1.09912897e+03 2.61293969e+02]
[0.00000000e+00 0.00000000e+00 1.00000000e+00]]
dist_2 [[-0.15751877 -0.12428592 -0.01325468 0.02449842 3.72130512]]
R [[ 0.89624385 -0.12740274 -0.42487116]
[ 0.14523621 0.98934946 0.00969995]
[ 0.41911026 -0.0704002 0.90520186]]
T [[16.81657383]
[-5.69906211]
[ 2.42601652]]
E [[ -2.74088083 -1.99896304 -5.18233385]
[ -4.87369619 0.87480896 -16.25313836]
[ 7.55012482 15.91139212 -2.25824725]]
F [[ 1.77370674e-06 1.19257563e-06 3.10912454e-03]
[ 3.07460616e-06 -5.08784373e-07 1.09461230e-02]
[-6.78615804e-03 -1.05410214e-02 1.00000000e+00]]
Look at your rotation matrix R. If your cameras are parallel, it should be close to the identity matrix. I used this code to examine the angles between your cameras, and it turned out that the angles are [-4.44710871 -24.77842715 9.20475808] in degrees. Therefore I assume that your cameras are not parallel, which is why your approach does not work. See my answer in this thread to find out how to do a proper stereo rectification in your case.
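The code linked above is not reproduced here; a comparable sketch that recovers those angles from the rotation matrix, assuming the common R = Rz(yaw) * Ry(pitch) * Rx(roll) Euler convention, looks like this:
import numpy as np

def rotation_to_euler_deg(R):
    # Decompose R = Rz(yaw) * Ry(pitch) * Rx(roll) into angles in degrees
    roll = np.degrees(np.arctan2(R[2, 1], R[2, 2]))
    pitch = np.degrees(np.arcsin(np.clip(-R[2, 0], -1.0, 1.0)))
    yaw = np.degrees(np.arctan2(R[1, 0], R[0, 0]))
    return np.array([roll, pitch, yaw])

R = np.array([[0.89624385, -0.12740274, -0.42487116],
              [0.14523621, 0.98934946, 0.00969995],
              [0.41911026, -0.0704002, 0.90520186]])
print(rotation_to_euler_deg(R))  # roughly [-4.45, -24.78, 9.20]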

what is u,v image coordinates?

I saw u, v image coordinates at the bottom of here. I downloaded the data, and one sample is:
[214.65 222.52 145.72 165.42 96.492 114.22 64.985 71.877 43.323 33.477 128.98 173.29 120.12 160.49 121.11 134.89 128. 98.462 175.26 177.23 177.23 151.63 178.22 130.95 177.23 98.462 212.68 175.26 214.65 118.15 215.63 80.738 208.74 68.923 249.11 173.29 242.22 122.09 237.29 86.646 234.34 48.246]
I searched but did not find an explanation of u, v image coordinates or how to convert them to x-y coordinates. It does not seem to be UV texture mapping, because the data are not in [0, 1], but I may be wrong.
Any comments are welcome. Thanks.
To be more confident, we can plot these points with Matlab/Octave or OpenCV on the corresponding color image and see whether their positions match the labeled joints. For the joint structure we can look at the same README file: W, T0, T1, T2, T3, I0, I1, I2, I3, M0, M1, M2, M3, R0, R1, R2, R3, L0, L1, L2, L3. Every joint has 2 coordinates, so the sequence of 42 numbers corresponds to the u, v (X, Y) coordinates of the joints in that order.
I tried to directly plot an image and 2D points in Matlab/Octave using this code:
clc; clear;
im = imread('0001_color_composed.png');
data = csvread('0001_joint2D.txt');
x = zeros(length(data)/2,1);
y = x;
for i = 1: length(data)/2
x(i) = data(2*i-1);
y(i) = data(2*i);
end
imshow(im);
hold on;
plot(x, y, 'go');
and this image and annotation. As you can see in the resulting plot, all u, v coordinates correspond to pixel coordinates in X and Y counted from the top-left corner of the image, i.e. u = X and v = Y (with the image shown using imshow(), the origin of the coordinate frame for subsequent plots is the image coordinate frame origin, which is the top-left corner).
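An equivalent sketch in Python/OpenCV, assuming the same file names and a comma-separated layout of the annotation file:
import numpy as np
import cv2

im = cv2.imread('0001_color_composed.png')
data = np.loadtxt('0001_joint2D.txt', delimiter=',')
uv = data.reshape(-1, 2)  # 42 numbers -> 21 (u, v) pairs
for u, v in uv:
    cv2.circle(im, (int(round(u)), int(round(v))), 3, (0, 255, 0), -1)  # u is the column (X), v is the row (Y)
cv2.imshow('joints', im)
cv2.waitKey(0)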

Estimating distance from camera to ground plane point

How can I calculate the distance from the camera to a point on the ground plane from an image?
I have the intrinsic parameters of the camera and its position (height, pitch).
Is there any OpenCV function that can estimate that distance?
You can use undistortPoints to compute the rays backprojecting the pixels, but that API is rather hard to use for your purpose. It may be easier to do the calculation "by hand" in your code. Doing it at least once will also help you understand what exactly that API is doing.
Express your "position (height, pitch)" of the camera as a rotation matrix R and a translation vector t, representing the coordinate transform from the origin of the ground plane to the camera. That is, given a point in ground plane coordinates Pg = [Xg, Yg, Zg], its coordinates in camera frame are given by
Pc = R * Pg + t
The camera center is Cc = [0, 0, 0] in camera coordinates. In ground coordinates it is then:
Cg = inv(R) * (-t) = -R' * t
where inv(R) is the inverse of R, R' is its transpose, and the last equality is due to R being an orthogonal matrix.
Let's assume, for simplicity, that the ground plane is Zg = 0.
Let K be the matrix of intrinsic parameters. Given a pixel q = [u, v], write it in homogeneous image coordinates Q = [u, v, 1]. Its location in camera coordinates is
Qc = Ki * Q
where Ki = inv(K) is the inverse of the intrinsic parameters matrix. The same point in world coordinates is then
Qg = R' * Qc + Cg
All the points Pg = [Xg, Yg, Zg] that belong to the ray from the camera center through that pixel, expressed in ground coordinates, are then on the line
Pg = Cg + lambda * (Qg - Cg)
for lambda going from 0 to positive infinity. This last formula represents three equations in the ground XYZ coordinates, and you want to find the values of Xg, Yg, Zg and lambda where the ray intersects the ground plane. Setting Zg = 0 leaves only 3 unknowns (Xg, Yg and lambda). Solve for them (recover lambda from the 3rd equation, then substitute it into the first two), and you get the Xg and Yg of the solution to your problem.
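A minimal numerical sketch of the procedure above (the function and its inputs follow the notation of the derivation; K, R, t are whatever calibration you have):
import numpy as np

def ground_point_from_pixel(u, v, K, R, t):
    # Intersect the back-projected ray of pixel (u, v) with the ground plane Zg = 0.
    # R, t define the transform Pc = R @ Pg + t from ground to camera coordinates.
    Cg = -R.T @ t                          # camera center in ground coordinates
    Qc = np.linalg.inv(K) @ np.array([u, v, 1.0])
    Qg = R.T @ Qc + Cg                     # a second point on the ray, in ground coordinates
    d = Qg - Cg                            # ray direction
    if abs(d[2]) < 1e-9:
        return None                        # ray (almost) parallel to the ground plane
    lam = -Cg[2] / d[2]                    # solve Cg_z + lambda * d_z = 0
    if lam < 0:
        return None                        # intersection behind the camera
    return Cg + lam * d
The distance from the camera to that point is then the norm of the difference between the returned point and the camera center Cg = -R.T @ t.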

Pose estimation: solvePnP and epipolar geometry do not agree

I have a relative camera pose estimation problem where I am looking at a scene with differently oriented cameras spaced a certain distance apart. Initially, I am computing the essential matrix using the 5 point algorithm and decomposing it to get the R and t of camera 2 w.r.t camera 1.
I thought it would be a good idea to do a check by triangulating the two sets of image points into 3D and then running solvePnP on the 3D-2D correspondences, but the result I get from solvePnP is way off. I am trying to do this to "refine" my pose, as the scale can change from one frame to another. Anyway, in one case I had a 45 degree rotation between camera 1 and camera 2 about the Z axis, and the epipolar geometry part gave me this answer:
Relative camera rotation is [1.46774, 4.28483, 40.4676]
Translation vector is [-0.778165583410928; -0.6242059242696293; -0.06946429947410336]
solvePnP, on the other hand..
Camera1: rvecs [0.3830144497209735; -0.5153903947692436; -0.001401186630803216]
tvecs [-1777.451836911453; -1097.111339375749; 3807.545406775675]
Euler1 [24.0615, -28.7139, -6.32776]
Camera2: rvecs [1407374883553280; 1337006420426752; 774194163884064.1] (!!)
tvecs[1.249151852575814; -4.060149502748567; -0.06899980661249146]
Euler2 [-122.805, -69.3934, 45.7056]
Something is troublingly off with the rvec of camera 2 and the tvec of camera 1. My code involving the point triangulation and solvePnP looks like this:
points1.convertTo(points1, CV_32F);
points2.convertTo(points2, CV_32F);
// Normalize image points (convert to normalized camera coordinates)
points1.col(0) = (points1.col(0) - pp.x) / focal;
points2.col(0) = (points2.col(0) - pp.x) / focal;
points1.col(1) = (points1.col(1) - pp.y) / focal;
points2.col(1) = (points2.col(1) - pp.y) / focal;
points1 = points1.t(); points2 = points2.t();
cv::triangulatePoints(P1, P2, points1, points2, points3DH);
cv::Mat points3D;
convertPointsFromHomogeneous(Mat(points3DH.t()).reshape(4, 1), points3D);
cv::solvePnP(points3D, points1.t(), K, noArray(), rvec1, tvec1, 1, CV_ITERATIVE );
cv::solvePnP(points3D, points2.t(), K, noArray(), rvec2, tvec2, 1, CV_ITERATIVE );
I am then converting the rvecs through Rodrigues to get the Euler angles, but since the rvecs and tvecs themselves seem to be wrong, I feel something is wrong with my process. Any pointers would be helpful. Thanks!
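As a sanity check, the two absolute poses from solvePnP should imply the same relative pose as the essential-matrix decomposition (only the direction of the translation is comparable, since the essential matrix fixes it only up to scale). A sketch in Python, with hypothetical rvec1/tvec1/rvec2/tvec2 arrays holding the solvePnP outputs:
import numpy as np
import cv2

R1_pnp, _ = cv2.Rodrigues(rvec1)
R2_pnp, _ = cv2.Rodrigues(rvec2)
R_rel = R2_pnp @ R1_pnp.T                 # relative rotation, camera 1 -> camera 2
t_rel = tvec2 - R_rel @ tvec1             # relative translation in camera-2 coordinates
print("relative rotation vector (axis-angle, degrees):", np.degrees(cv2.Rodrigues(R_rel)[0]).ravel())
print("relative translation direction:", (t_rel / np.linalg.norm(t_rel)).ravel())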
