Transformation Matrix From EstimatePoseSingleMarkers - opencv

I am trying to get the transformation matrix from my camera to the midpoint of an ArUco marker, and I use the cv2.aruco.estimatePoseSingleMarkers function.
The documentation says:
The returned transformation is the one that transforms points from each marker coordinate system to the camera coordinate system.
I pass the corners from the marker detection result to the pose estimation function and convert the rotation vector to a rotation matrix with cv2.Rodrigues. Already the resulting translation vector differs from the translation I measured from the camera coordinates to the marker coordinates, which, as far as I know, should be the same.
Regardless, I initialize a 4x4 transformation matrix and insert the rotation matrix and the translation vector from the pose estimation into it, which I assume gives the transformation from the marker coordinate frame to the camera coordinate frame. But when I verify the transformation with another point from the environment, it turns out the transformation matrix is not correct.
I am sure about the MARKER_SIZE.
import cv2
import numpy as np

marker2camera_r_vec, marker2camera_t_vec, _ = cv2.aruco.estimatePoseSingleMarkers(corners, MARKER_SIZE, camera.mtx, camera.dist)
marker2camera_r_mat = np.array(cv2.Rodrigues(marker2camera_r_vec)[0])
transformation_matrix = np.zeros([4, 4])
transformation_matrix[0:3, 0:3] = marker2camera_r_mat
transformation_matrix[0:3, 3] = marker2camera_t_vec.flatten()  # tvec is returned with shape (N, 1, 3)
transformation_matrix[3, 3] = 1
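For the opposite direction (camera coordinates to marker coordinates), I assume the homogeneous transform can simply be inverted; a minimal sketch, assuming the transformation_matrix above holds the marker-to-camera pose:

# A rigid transform [R t; 0 1] inverts to [R.T, -R.T @ t; 0 1]
camera2marker = np.eye(4)
camera2marker[0:3, 0:3] = transformation_matrix[0:3, 0:3].T
camera2marker[0:3, 3] = -transformation_matrix[0:3, 0:3].T @ transformation_matrix[0:3, 3]
# Equivalently: camera2marker = np.linalg.inv(transformation_matrix)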
Example results:
marker2camera_r_mat =
[[-0.96533802 -0.03093691 -0.25916292],
[ 0.07337548 0.92073721 -0.38322189],
[ 0.25047664 -0.38895487 -0.88655263]]
marker2camera_t_vec =
[ 1.45883855 6.98282269 77.73744481]
transformation_matrix =
[[-9.65338018e-01 -3.09369077e-02 -2.59162919e-01 1.45883855e+00],
[ 7.33754824e-02 9.20737214e-01 -3.83221895e-01 6.98282269e+00],
[ 2.50476645e-01 -3.88954869e-01 -8.86552627e-01 7.77374448e+01],
[ 0.00000000e+00 0.00000000e+00 0.00000000e+00 1.00000000e+00]]
The actual (measured) coordinates of the marker in the camera coordinate frame are =
[-0.086 0.12 0.83]
Can anyone tell me what I am doing wrong and explain the steps to obtain the transformation matrix from the camera coordinates to the marker coordinates via cv2.aruco.estimatePoseSingleMarkers?

Related

How to calculate the ray of a camera with the help of the camera matrix?

Let's assume I have a standard pinhole camera model.
I know fx, fy, cx and cy from my camera matrix, and I know the pixel coordinates u and v on my image plane. I also have the extrinsics of my camera, R and t.
How can I calculate the ray of one pixel?
My initial idea would be to calculate the vector from the camera center to the (x, y) coordinate on the image plane, but with this model I don't know the distance between the image plane and the camera center.
You can do this:
Image points: imgPts (2xN); append a row of ones to make them homogeneous (3xN).
Camera matrix: M (3x3) = [fx 0 cx; 0 fy cy; 0 0 1];
Camera center: camCenter = inv(Rt)*([0 0 0 1]');
Points at the sensor: ptsCam = inv(M)*imgPts;
Points in world coordinates: pts = inv(Rt)*ptsCam; // append a 1 as the fourth coordinate of each point first, so it matches the 4x4 Rt.
Direction vectors: v = pts - camCenter; v = v/norm(v); // normalize each column.
Output: pts and v.
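A minimal NumPy sketch of the same steps for a single pixel, assuming K is the 3x3 camera matrix and Rt is the 4x4 world-to-camera extrinsic matrix (both names are placeholders here):

import numpy as np

def pixel_ray(u, v, K, Rt):
    # Camera center in world coordinates
    cam_center = (np.linalg.inv(Rt) @ np.array([0.0, 0.0, 0.0, 1.0]))[:3]
    # Back-project the pixel onto the normalized image plane (z = 1) in camera coordinates
    pt_cam = np.linalg.inv(K) @ np.array([u, v, 1.0])
    # Express that point in world coordinates
    pt_world = (np.linalg.inv(Rt) @ np.append(pt_cam, 1.0))[:3]
    # Ray direction from the camera center through the pixel
    direction = pt_world - cam_center
    return cam_center, direction / np.linalg.norm(direction)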

2D image coordinate to 3D world coordinate using Depth Map

Having the intrinsic and extrinsic matrices, how does one transform a 2D image coordinate into a 3D world coordinate using the depth map image? A similar problem was discussed here: 2D Coordinate to 3D world coordinate, but it assumes the images are rectified, which is not the case here. I have trouble formulating the equation for this.
p_camera.x = (pixCoord.x - cx) / fx;
p_camera.y = (pixCoord.y - cy) / fy;
p_camera.z = depth;
p_camera.x = -p_camera.x*p_camera.z;
p_camera.y = -p_camera.y*p_camera.z;
P_world = R * (p_camera + T);
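The snippet above negates x and y, which depends on the camera axis convention. A minimal sketch under the usual OpenCV convention (camera looking down +Z, extrinsics mapping world to camera as Pc = R*Pw + T) might look like this; the names here are placeholders:

import numpy as np

def pixel_to_world(u, v, depth, K, R, T):
    fx, fy = K[0, 0], K[1, 1]
    cx, cy = K[0, 2], K[1, 2]
    # Back-project the pixel at the given depth into the camera frame
    p_camera = np.array([(u - cx) * depth / fx,
                         (v - cy) * depth / fy,
                         depth])
    # Camera frame -> world frame (inverse of Pc = R * Pw + T)
    return R.T @ (p_camera - T)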

Estimating distance from camera to ground plane point

How can I calculate the distance from the camera to a point on a ground plane from an image?
I have the intrinsic parameters of the camera and the position (height, pitch).
Is there any OpenCV function that can estimate that distance?
You can use undistortPoints to compute the rays backprojecting the pixels, but that API is rather hard to use for your purpose. It may be easier to do the calculation "by hand" in your code. Doing it at least once will also help you understand what exactly that API is doing.
Express your "position (height, pitch)" of the camera as a rotation matrix R and a translation vector t, representing the coordinate transform from the origin of the ground plane to the camera. That is, given a point in ground plane coordinates Pg = [Xg, Yg, Zg], its coordinates in camera frame are given by
Pc = R * Pg + t
The camera center is Cc = [0, 0, 0] in camera coordinates. In ground coordinates it is then:
Cg = inv(R) * (-t) = -R' * t
where inv(R) is the inverse of R, R' is its transpose, and the last equality is due to R being an orthogonal matrix.
Let's assume, for simplicity, that the ground plane is Zg = 0.
Let K be the matrix of intrinsic parameters. Given a pixel q = [u, v], write it in homogeneous image coordinates Q = [u, v, 1]. Its location in camera coordinates is
Qc = Ki * Q
where Ki = inv(K) is the inverse of the intrinsic parameters matrix. The same point in world coordinates is then
Qg = R' * Qc + Cg
All the points Pg = [Xg, Yg, Zg] that belong to the ray from the camera center through that pixel, expressed in ground coordinates, are then on the line
Pg = Cg + lambda * (Qg - Cg)
for lambda going from 0 to positive infinity. This last formula represents three equations in ground XYZ coordinates, and you want to find the values of Xg, Yg, Zg and lambda where the ray intersects the ground plane. But Zg = 0 on the plane, so you are left with only three unknowns. Solve for them (recover lambda from the third equation, then substitute into the first two), and you get the Xg and Yg of your solution.
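A minimal NumPy sketch of this computation, assuming R and t map ground coordinates to camera coordinates as above and K is the intrinsic matrix:

import numpy as np

def distance_to_ground_point(u, v, K, R, t):
    # Camera center in ground coordinates: Cg = -R' * t
    Cg = -R.T @ t
    # Pixel in homogeneous image coordinates, back-projected into camera coordinates
    Qc = np.linalg.inv(K) @ np.array([u, v, 1.0])
    # The same point expressed in ground coordinates
    Qg = R.T @ Qc + Cg
    # Intersect the ray Pg = Cg + lambda * (Qg - Cg) with the plane Zg = 0
    lam = -Cg[2] / (Qg[2] - Cg[2])
    Pg = Cg + lam * (Qg - Cg)
    return np.linalg.norm(Pg - Cg), Pg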

Pose estimation: solvePnP and epipolar geometry do not agree

I have a relative camera pose estimation problem where I am looking at a scene with differently oriented cameras spaced a certain distance apart. Initially, I am computing the essential matrix using the 5 point algorithm and decomposing it to get the R and t of camera 2 w.r.t camera 1.
I thought it would be a good idea to do a check by triangulating the two sets of image points into 3D and then running solvePnP on the 3D-2D correspondences, but the result I get from solvePnP is way off. I am trying to do this to "refine" my pose, as the scale can change from one frame to another. Anyway, in one case, I had a 45 degree rotation between camera 1 and camera 2 along the Z axis, and the epipolar geometry part gave me this answer:
Relative camera rotation is [1.46774, 4.28483, 40.4676]
Translation vector is [-0.778165583410928; -0.6242059242696293; -0.06946429947410336]
solvePnP, on the other hand..
Camera1: rvecs [0.3830144497209735; -0.5153903947692436; -0.001401186630803216]
tvecs [-1777.451836911453; -1097.111339375749; 3807.545406775675]
Euler1 [24.0615, -28.7139, -6.32776]
Camera2: rvecs [1407374883553280; 1337006420426752; 774194163884064.1] (!!)
tvecs[1.249151852575814; -4.060149502748567; -0.06899980661249146]
Euler2 [-122.805, -69.3934, 45.7056]
Something is troublingly off with the rvecs of camera2 and tvec of camera 1. My code involving the point triangulation and solvePnP looks like this:
points1.convertTo(points1, CV_32F);
points2.convertTo(points2, CV_32F);
// Homogenize image points
points1.col(0) = (points1.col(0) - pp.x) / focal;
points2.col(0) = (points2.col(0) - pp.x) / focal;
points1.col(1) = (points1.col(1) - pp.y) / focal;
points2.col(1) = (points2.col(1) - pp.y) / focal;
points1 = points1.t(); points2 = points2.t();
cv::triangulatePoints(P1, P2, points1, points2, points3DH);
cv::Mat points3D;
convertPointsFromHomogeneous(Mat(points3DH.t()).reshape(4, 1), points3D);
cv::solvePnP(points3D, points1.t(), K, noArray(), rvec1, tvec1, 1, CV_ITERATIVE );
cv::solvePnP(points3D, points2.t(), K, noArray(), rvec2, tvec2, 1, CV_ITERATIVE );
I am then converting the rvecs through Rodrigues to get the Euler angles, but since the rvecs and tvecs themselves seem to be wrong, I feel something is wrong with my process. Any pointers would be helpful. Thanks!
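For reference, this is a minimal Python sketch of the check I am describing (essential matrix, recoverPose, triangulation, then solvePnP); points1/points2 are assumed to be Nx2 float arrays of matched pixel coordinates and K the intrinsic matrix, and the names are placeholders:

import cv2
import numpy as np

def check_pose(points1, points2, K):
    # Relative pose of camera 2 w.r.t. camera 1 from the essential matrix
    E, _ = cv2.findEssentialMat(points1, points2, K, method=cv2.RANSAC)
    _, R, t, _ = cv2.recoverPose(E, points1, points2, K)
    # Projection matrices in pixel units: camera 1 at the origin, camera 2 at [R | t]
    P1 = K @ np.hstack([np.eye(3), np.zeros((3, 1))])
    P2 = K @ np.hstack([R, t])
    pts4d = cv2.triangulatePoints(P1, P2, points1.T, points2.T)
    pts3d = (pts4d[:3] / pts4d[3]).T
    # Re-estimate each camera pose from the 3D-2D correspondences
    _, rvec1, tvec1 = cv2.solvePnP(pts3d, points1, K, None, flags=cv2.SOLVEPNP_ITERATIVE)
    _, rvec2, tvec2 = cv2.solvePnP(pts3d, points2, K, None, flags=cv2.SOLVEPNP_ITERATIVE)
    return rvec1, tvec1, rvec2, tvec2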

Angles of rotation matrix using OpenCv function cvPosit

I'm working on a 3D pose estimation system. I used OpenCV's function cvPOSIT to calculate the rotation matrix and the translation vector.
I also need the angles of the rotation matrix, but no algorithm seems to work.
The function cv::RQDecomp3x3(), which was the answer to the topic "in opencv : how to get yaw, roll, pitch from POSIT rotation matrix", does not work here, because that function needs the 3x3 part of the projection matrix.
Furthermore, I tried to use the algorithms from the links below, but nothing worked.
visionopen.com/cv/vosm/doc/html/recognitionalgs_8cpp_source.html
stackoverflow.com/questions/16266740/in-opencv-how-to-get-yaw-roll-pitch-from-posit-rotation-matrix
quad08pyro.groups.et.byu.net/vision.htm
stackoverflow.com/questions/13565625/opencv-c-posit-why-are-my-values-always-nan-with-small-focal-lenght
www.c-plusplus.de/forum/308773-full
I used the most common POSIT tutorial and my own example with Blender, so I could render an image to retrieve the image points and know the exact angles. The object's Z axis in Blender was rotated by 10 degrees, and I checked the degrees of all 3 axes to account for the change of axes between Blender and OpenCV.
double focalLength = 700.0;
CvPOSITObject* positObject;
std::vector<CvPoint3D32f> modelPoints;
modelPoints.push_back(cvPoint3D32f(0.0f, 0.0f, 0.0f));
modelPoints.push_back(cvPoint3D32f(CUBE_SIZE, 0.0f, 0.0f));
modelPoints.push_back(cvPoint3D32f(0.0f, CUBE_SIZE, 0.0f));
modelPoints.push_back(cvPoint3D32f(0.0f, 0.0f, CUBE_SIZE));
std::vector<CvPoint2D32f> imagePoints;
imagePoints.push_back( cvPoint2D32f( 157,372) );
imagePoints.push_back( cvPoint2D32f(423,386 ));
imagePoints.push_back( cvPoint2D32f( 157,108 ));
imagePoints.push_back( cvPoint2D32f(250,337));
// Moving the points to the image center as described in the tutorial
for (int i = 0; i < imagePoints.size();i++) {
imagePoints[i] = cvPoint2D32f(imagePoints[i].x -320, 240 - imagePoints[i].y);
}
CvVect32f translation_vector = new float[3];
CvTermCriteria criteria = cvTermCriteria(CV_TERMCRIT_EPS | CV_TERMCRIT_ITER,iterations, 0.1f);
positObject = cvCreatePOSITObject( &modelPoints[0], static_cast<int>(modelPoints.size()));
CvMatr32f rotation_matrix = new float[9];
cvPOSIT( positObject, &imagePoints[0], focalLength, criteria, rotation_matrix, translation_vector );
algorithms to get angles...
I already tried converting the results from radians to degrees and checking the clockwise direction, but I still get bad results using the rotation matrix from cvPOSIT. I also changed the matrix layout to rule out wrong formatting.
I tested with simple rotation matrices (a rotation about only the X, Y or Z axis) and some algorithms worked there, but the rotation matrix from cvPOSIT did not work with those algorithms.
I appreciate any support.
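For the angle extraction itself, a minimal sketch of recovering Euler angles from a 3x3 rotation matrix (here in ZYX order, i.e. R = Rz(yaw)*Ry(pitch)*Rx(roll)); whether the result matches Blender depends on the axis conventions mentioned above:

import numpy as np

def rotation_matrix_to_euler_zyx(R):
    # Returns (roll, pitch, yaw) in degrees for R = Rz(yaw) @ Ry(pitch) @ Rx(roll)
    sy = np.sqrt(R[0, 0] ** 2 + R[1, 0] ** 2)
    if sy > 1e-6:
        roll = np.arctan2(R[2, 1], R[2, 2])
        pitch = np.arctan2(-R[2, 0], sy)
        yaw = np.arctan2(R[1, 0], R[0, 0])
    else:
        # Gimbal lock: pitch is +/-90 degrees and yaw is not uniquely defined
        roll = np.arctan2(-R[1, 2], R[1, 1])
        pitch = np.arctan2(-R[2, 0], sy)
        yaw = 0.0
    return np.degrees([roll, pitch, yaw])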

Resources