findHomography() / 3x3 matrix - how to get rotation part out of it? - opencv

As a result to a call to findHomography() I get back a 3x3 matrix mtx[3][3]. This matrix contains the translation part in mtx[0][2] and mtx[1][2]. But how can I get the rotation part out of this 3x3 matrix?
Unfortunaltely my target system uses completely different calculation so I can't reuse the 3x3 matrix directly and have to extract the rotation out of this, that's why I'm asking this question.

Generally speaking, you can't decompose the final transformation matrix into its constituent parts. There are some certain cases where it is possible. For example if the only operation preceding the operation was a translation, then you can do arccos(m[0][0]) to get the theta value of the rotation.

Found it for my own meanwhile: There is an OpenCV function RQDecomp3x3() that can be used to extract parts of the transformation out of a matrix.

RQDecomp3x3 has a problem to return rotation in other axes except Z so in this way you just find spin in z axes correctly,if you find projection matrix and pass it to "decomposeProjectionMatrix" you will find better resaults,projection matrix is different to homography matrix you should attention to this point.

Related

solvePnP with Unity3D

I have a real/physical stick with an IR camera attached to it and some IR LED that forms a pattern that I'm using in order to make a virtual stick move in the same way as the physical one.
For that, I'm using OpenCV in Python and send a rotation and translation vector calculated by solvePnP to Unity.
I'm struggling to understand how I can use the results given by the solvePnP function into my 3D world.
So far what I did is: using the solvePnP function to get the rotation and translation vectors. And then use this rotation vector to move my stick in the 3d World
transform.rotation = Quaternion.Euler(new Vector3(x, z, y));
It seems to work okay when my stick is positioned at a certain angle and if I move slowly...but most of the time it moves everywhere.
By looking for answers online, most of people are doing several more steps after solvePnP - from what I understand:
Using Rodrigues to convert the rotation vector to a rotation matrix
Copy the rotation matrix and translation vector into a extrinsic matrix
Inverse the extrinsic matrix
I understand that these steps are necessary if I was working with matrix like in OpenGL - but what about Unity3D? Are these extra steps necessary? Or can I directly use the vectors given by the solvePnP function (which I doubt as the results I'm having so far aren't good).
This is old, but the answer to the question "what about Unity3D? Are these extra steps necessary? Or can I directly use the vectors given by the solvePnP function"
is:
-No, you can't directly use them. I tried to convert rvec using Quaternion.Euler and as you've posted, the results were bad.
-Yes, you have to use Rodrigues, which converts rvec correctly into a rotation matrix.
-About inversing the extrinsic matrix: it depends.
If your object is at (0,0,0) in world space and you want to place the camera, you have to invert the transform resulting from tvec and rvec, in order to get the desired result.
If on the other hand your camera has a fixed position and you want to position the object relatively to it, you have to apply the camera's localToWorld matrix to your transform resulting from rvec and tvec, in order to get the desired result

Calculate a Homography with only Translation, Rotation and Scale in Opencv

I do have two sets of points and I want to find the best transformation between them.
In OpenCV, you have the following function:
Mat H = Calib3d.findHomography(src_points, dest_points);
that returns you a 3x3 Homography matrix, using RANSAC. My problem is now, that I only need translation and rotation (& maybe scale), I don't need affine and perspective.
The thing is, my points are only in 2D.
(1) Is there a function to compute something like a homography but with less degrees of freedom?
(2) If there is none, is it possible to extract a 3x3 matrix that does only translation and rotation from the 3x3 homography matrix?
Thanks in advance for any help!
Isa
OpenCV estimateRigidTransform function is exactly what you need: it returns Translation, Rotation and Scale (use false value for fullAffine flag). And it DOES use RANSAC (see source code to be sure of it).
Homography is for 2D points, the third dimension is just for casting points in 3 dim homogeneous coordinates and performing perspective effects. You can always cast points back:
homogeneous [x, y, w]
cartesian [x/w, y/w]
However since you calculate 6DOF instead of 4DOF (similarity) you result is pretty different from what you expect with 4DOF. More flexible transformation will fit more points in RANSAC at the expense of distortions in transformations you care about. Bottom line - don’t try to decompose H, instead fit similarity or isometry (also called rigid or euclidean). The reason why they are absent in the library - they are expressed in closed form even with correct least squared metric in point coordinates and thus don't require non-linear optimization. In other words, they are very simple.
If you only have rotation and translation, I wrote a quick functions to find them (no RANSAC though). It is probably similar to a rigidTransform but more understandable (hopefully)
https://stackoverflow.com/a/18091472/457687
With scale there is still a closed form solution, but slightly different formulas for translation and scaling. See Learning similarity parameters, p. 25

Given n points in two views, how can I recover the angle of rotation of the camera to achieve the second view?

let's say that all we have are corresponding image points in two views. From these points, I can compute a homography/essential matrix, however extracting the angle of rotation of the camera is not understood.
I'm not sure if I have misunderstood your question but if you are looking are looking for the rotation matrix between two camera coordinate systems and have computed the essential matrix then the rotation matrix can be calculated by taking an SVD of the essential matrix and doing some simple matrix multiplication.
The algorithm is listed here.

How can I use the output 3x3 matrix from getPerspectiveTransform in OpenCV?

I'm now trying to analyze the perspective transform/homography matrix between two images capturing the same object (e.g., a rectangle) but at different perspectives/shooting angles. The perspective transform can be derived by using the function getPerspectiveTransform in OpenCV 2.3.1. I want to find the corresponding rotation and translation matrices.
The output of getPerspectiveTransform is a 3x3 matrix which I can directly use it to warp the source image into the target image. But my question is that how I can find the rotation and translation matrices based on the obtained 3x3 matrix?
I was looking into the funciton decomposeProjectionMatrix for the corresponding rotation and translation matrices. But the input is required to be a 3x4 projection matrix. How can I relate the perspective transformation (i.e., a 3x3 matrix) to the 3x4 projection matrix? Am I on the right track?
Thank you very much.
The information contained in the homography matrix (returned from getPerspectiveTransform) is not enough to extract rotation/translation. The missing column is key to correctly find the angles.
The good news is that in some scenarios, you can use the solvePnP() function to extract the desired parameters from two sets of points.
Also, this question is about the same thing you are asking for. It should help
Analyze camera movement with OpenCV

Finding Homography And Warping Perspective

With FeatureDetector I get features on two images with the same element and match this features with BruteForceMatcher.
Then I'm using OpenCv function findHomography to get homography matrix
H = findHomography( src2Dfeatures, dst2Dfeatures, outlierMask, RANSAC, 3);
and getting H matrix, then align image with
warpPerspective(img1,alignedSrcImage,H,img2.size(),INTER_LINEAR,BORDER_CONSTANT);
I need to know rotation angle, scale, displacement of detected element. Is there any simple way to get this than some big equations? Some evaluated formulas just to put data in?
Homography would match projections of your elements lying on a plane or lying arbitrary in 3D if the camera goes through a pure rotation or zoom and no translation. So here are the cases we are talking about with indication of what is the input to our calculations:
- planar target, pure rotation, intra-frame homography
- planar target, rotation and translation, target to frame homography
- 3D target, pure rotation, frame to frame mapping (constrained by a fundamental matrix)
In case of the planar target, a pure rotation is easy to calculate through your frame-to-frame Homography (H12):
given intrinsic camera matrix A, plane to image homographies for frame H1, and H2 that can be expressed as H1=A, H2=A*R, H12 = H2*H1-1=ARA-1 and thus R=A-1H12*A
In case of elements lying on a plane, rotation with translation of the camera (up to unknown scale) can be calculated through decomposition of target-to-frame homography. Note that the target can be just one of the views. Assuming you have your original planar target as an image (taken at some reference orientation) your task is to decompose the homography between images H12 which can be done through SVD. The first two columns of H represent the first two columns of the rotation martrix and be be recovered through H=ULVT, [r1 r2] = UDVT where D is 3x2 Identity matrix with the last row being all 0. The third column of a rotation matrix is just a vector product of the first two columns. The last column of the Homography is a translation vector times some constant.
Finally for arbitrary configuration of points in 3D and pure camera rotation, the rotation is calculated using the essential matrix decomposition rather than homography, see this
cv::decomposeProjectionMatrix();
and
cv::RQDecomp3x3();
are both similar to what you want to achive.
None of them is perfect. The theory behind them and why you cannot extract all params from a 3x3 matrix is a bit cumbersome. But the short answer is that a 3x3 proj matrix is a simplification from the complete 4x4 one, based on the fact that all points stay in the same plane.
You can try to use levenberg marquardt optimalization, where parameters will be translation and rotation, equations will represent by computed distances between features from two images(use only inliers from ransac homography).
Here is C++ implementation of LM http://www.ics.forth.gr/~lourakis/levmar/

Resources