solvePnP with Unity3D - opencv

I have a real/physical stick with an IR camera attached to it and some IR LEDs that form a pattern, which I'm using to make a virtual stick move in the same way as the physical one.
For that, I'm using OpenCV in Python and sending the rotation and translation vectors calculated by solvePnP to Unity.
I'm struggling to understand how I can use the results given by the solvePnP function in my 3D world.
So far, I have used solvePnP to get the rotation and translation vectors, and then used the rotation vector to move my stick in the 3D world:
transform.rotation = Quaternion.Euler(new Vector3(x, z, y));
It seems to work okay when my stick is positioned at a certain angle and I move slowly, but most of the time it moves all over the place.
Looking for answers online, most people seem to do several more steps after solvePnP. From what I understand:
Using Rodrigues to convert the rotation vector to a rotation matrix
Copy the rotation matrix and translation vector into an extrinsic matrix
Invert the extrinsic matrix
I understand that these steps would be necessary if I were working with matrices as in OpenGL, but what about Unity3D? Are these extra steps necessary, or can I directly use the vectors given by the solvePnP function (which I doubt, as the results I'm getting so far aren't good)?

This is old, but the answer to the question "what about Unity3D? Are these extra steps necessary? Or can I directly use the vectors given by the solvePnP function?" is:
-No, you can't use them directly. I tried converting rvec using Quaternion.Euler and, as you've posted, the results were bad.
-Yes, you have to use Rodrigues, which converts rvec correctly into a rotation matrix.
-About inverting the extrinsic matrix: it depends.
If your object is at (0, 0, 0) in world space and you want to place the camera, invert the transform built from rvec and tvec to get the desired result.
If, on the other hand, your camera has a fixed position and you want to position the object relative to it, apply the camera's localToWorld matrix to the transform built from rvec and tvec.
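For reference, a minimal Python/NumPy sketch of those steps; object_points, image_points, camera_matrix and dist_coeffs are placeholder names for the data you already have from your setup, not values from the question:

import cv2
import numpy as np

ok, rvec, tvec = cv2.solvePnP(object_points, image_points,
                              camera_matrix, dist_coeffs)

# 1. Rodrigues: rotation vector -> 3x3 rotation matrix
R, _ = cv2.Rodrigues(rvec)

# 2. Extrinsic matrix: transforms points from world space to camera space
extrinsic = np.eye(4)
extrinsic[:3, :3] = R
extrinsic[:3, 3] = tvec.ravel()

# 3. Invert it to get the camera pose in world space (the case where the
#    object sits at the world origin and the camera moves)
camera_pose = np.linalg.inv(extrinsic)

# Before feeding this into Unity, remember that OpenCV's camera frame is
# right-handed (X right, Y down, Z forward) while Unity is left-handed
# (Y up), so a Y-axis flip is still needed when building the Unity transform.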

Related

Understanding the output of solvepnp?

I have been using solvePnP() to calculate the rotation and translation. But the Euler angles computed from the resulting rotation matrix gave very erratic values. Trying to find the problem, I took a set of 2D projection points for my marker and kept the other parameters of solvePnP() constant.
Example values:
2D points
[219.67473, 242.78395; 363.4151, 238.61298; 503.04855, 234.56117; 501.70917, 628.16742; 500.58069, 959.78564; 383.1756, 972.02679; 262.8746, 984.56982; 243.17044, 646.22925]
The Euler angle theta(x) calculated from the output rotation matrix of solvePnP() was -26.4877.
Next, I incremented only the x value of the first point (i.e. 219.67473) by 0.1 at a time to check the variation of the theta(x) Euler angle (keeping the remaining points and the other parameters constant) and ran solvePnP() again. For these very small changes, I got values that decreased from -19 degrees to -18 degrees (at x = 223.074), then suddenly jumped to 27 degrees for a while (for x = 223.174 to 226.974), then came down to 1.3 degrees (at x = 227.074).
I cannot understand this behaviour at all. Could somebody please explain?
My Euler angle calculation from the rotation matrix uses this procedure.
Try Rodrigues() for the conversion between rotation matrix and rotation vector to make sure everything is clean and right. The non-RANSAC version can be very sensitive to outliers, which create a huge error in the parameters and thus bias the solution. Using the RANSAC version of solvePnP may make it more stable to outliers. For example, adding too much to one of the point coordinates will eventually make it an outlier, and it won't influence the solution after that.
If everything fails, write a series of unit tests: create an artificial set of points in 3D (possibly non-planar), apply a simple translation first, in a second variant apply rotation only, and in a third test apply both. Project using your camera matrix, then plug your 2D points, 3D points, and camera matrix into your code to find the pose. If the result deviates from the inverse of the translations and rotations you applied to the points, look for a bug in how the parameters are fed to PnP.
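A rough sketch of this unit-test idea, with an invented camera matrix and ground-truth pose; the pose recovered by solvePnP should match the one used for the projection:

import cv2
import numpy as np

camera_matrix = np.array([[800., 0., 320.],
                          [0., 800., 240.],
                          [0., 0., 1.]])
dist_coeffs = np.zeros(5)

# Artificial, non-planar 3D points
object_points = np.array([[0., 0., 0.], [1., 0., 0.], [0., 1., 0.],
                          [0., 0., 1.], [1., 1., 0.], [1., 0., 1.]])

# Known ground-truth pose: a small rotation about Y plus a translation
rvec_true = np.array([[0.], [0.2], [0.]])
tvec_true = np.array([[0.1], [-0.2], [5.0]])

# Project the 3D points with the known pose...
image_points, _ = cv2.projectPoints(object_points, rvec_true, tvec_true,
                                    camera_matrix, dist_coeffs)

# ...and check that solvePnP recovers that pose from the 2D/3D correspondences
ok, rvec_est, tvec_est = cv2.solvePnP(object_points, image_points,
                                      camera_matrix, dist_coeffs)
print(rvec_true.ravel(), rvec_est.ravel())  # should agree up to numerical noise
print(tvec_true.ravel(), tvec_est.ravel())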
It seems the coordinate systems are different. OpenCV uses a right-handed coordinate system with Y pointing downwards. At nghiaho.com it says the calculations are based on this, and if you look at the axes they don't seem to match. I guess you are using Rodrigues for the matrix computation? Try comparing the rotation vectors as well.

Consistency of projecting points onto an undistorted image

I want to project a point in 3D space into 2D image coordinates. I have the calibrated intrinsics and extrinsics of the camera I'm using. I have the camera matrix K and distortion coefficients D. However, I want the projected image coordinates to be of the undistorted image.
From my research, I found two ways to do this.
Use OpenCV's getOptimalNewCameraMatrix function to compute the camera matrix K' of the undistorted image. Then use this K' in OpenCV's projectPoints function, with the distortion parameters set to 0, to get the projected point.
Use the projectPoints function with the raw camera matrix K, along with the distortion coefficients D, and get the projected point.
Should the output of both methods match?
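To make the comparison concrete, a minimal sketch of the two methods; the calibration values, image size and pose below are placeholders, not values from the question:

import cv2
import numpy as np

# Placeholder calibration and pose (assumptions for illustration only)
K = np.array([[700., 0., 640.], [0., 700., 360.], [0., 0., 1.]])
D = np.array([-0.3, 0.1, 0., 0., 0.])          # k1, k2, p1, p2, k3
width, height = 1280, 720
rvec = np.zeros((3, 1))
tvec = np.array([[0.], [0.], [2.0]])

point_3d = np.array([[0.1, 0.2, 1.5]])         # an example 3D point

# Method 1: camera matrix of the undistorted image, zero distortion
K_new, _ = cv2.getOptimalNewCameraMatrix(K, D, (width, height), 0)
pts1, _ = cv2.projectPoints(point_3d, rvec, tvec, K_new, None)

# Method 2: raw camera matrix K together with the distortion coefficients D
pts2, _ = cv2.projectPoints(point_3d, rvec, tvec, K, D)

print(pts1.ravel(), pts2.ravel())
# pts1 lives in the undistorted image built with K_new, while pts2 lives in the
# original (distorted) image, so in general the two coordinates are not the same.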
I think there is something missing in your reasoning.
The camera matrix K and distortion coefficients D are the parameters used to undistort the image (if your lens distorts the image, as a fisheye does). They are what are called the intrinsic camera parameters.
If we switch terminology from computer vision to computer graphics, those parameters are the ones you use to define the view frustum; for example, they are used to obtain the focal length of the camera.
But they are not enough to perform the projection.
For the projection, if you think in computer graphics terms (OpenGL, for instance), you need the model-view-projection matrices. The model matrix specifies the position of the object in the world, the view matrix specifies the position of the camera, and the projection matrix specifies the frustum (focal angle, perspective distortion, etc.).
If you want to transform the points of the model from 3D to 2D (or vice versa), you need the projection and view matrices (you already have the model matrix, because you have the 3D points you want to start from). In computer vision, the view matrix is called the extrinsic parameters.
So you need the extrinsic parameters too, i.e. the position of the camera in the world. Those parameters are, for instance, the rvec and tvec that cv::projectPoints needs.
If you want to compute them, they are exactly the output of cv::solvePnP, which does the opposite of what you want to do: from some known 3D points coupled with their known 2D projections on the camera image, this function gives you the extrinsic parameters (from which you can get the view matrix for an OpenGL/OpenCV augmented-reality application via cv::Rodrigues).
Last note: while the intrinsic parameters are fixed across all the pictures you shoot with a camera (as long as you don't change the focal length, of course), the extrinsic parameters change every time you move the camera to take a new picture from a different viewpoint (that is, this changes the perspective, and therefore the 3D-to-2D projection you want to find).
Hope this helps!

findHomography() / 3x3 matrix - how to get rotation part out of it?

As a result of a call to findHomography() I get back a 3x3 matrix mtx[3][3]. This matrix contains the translation part in mtx[0][2] and mtx[1][2]. But how can I get the rotation part out of this 3x3 matrix?
Unfortunately, my target system uses a completely different calculation, so I can't reuse the 3x3 matrix directly and have to extract the rotation from it; that's why I'm asking this question.
Generally speaking, you can't decompose the final transformation matrix into its constituent parts. There are certain cases where it is possible. For example, if the only other operation applied besides the rotation was a translation, then you can compute arccos(m[0][0]) to get the rotation angle theta.
Found it on my own in the meantime: there is an OpenCV function RQDecomp3x3() that can be used to extract parts of the transformation from a matrix.
RQDecomp3x3 has a problem returning rotation about axes other than Z, so this way you only find the spin about the Z axis correctly. If you find the projection matrix and pass it to decomposeProjectionMatrix you will get better results. Note that the projection matrix is different from the homography matrix; you should pay attention to this point.
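A minimal sketch of the decomposeProjectionMatrix route mentioned above, assuming you already have intrinsics K and a pose R, t from which to build a full 3x4 projection matrix; the values below are placeholders, not something the homography alone provides:

import cv2
import numpy as np

# Placeholder inputs for illustration
K = np.array([[800., 0., 320.], [0., 800., 240.], [0., 0., 1.]])
R, _ = cv2.Rodrigues(np.array([[0.1], [0.2], [0.3]]))
t = np.array([[0.5], [-0.1], [2.0]])

P = K @ np.hstack([R, t])                      # 3x4 projection matrix

(camera_matrix, rot_matrix, trans_vect,
 rot_x, rot_y, rot_z, euler_angles) = cv2.decomposeProjectionMatrix(P)

# rot_matrix is the full 3x3 rotation, trans_vect is a homogeneous 4x1 vector,
# and euler_angles reports the rotation about the x, y and z axes in degrees.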

How can I use the output 3x3 matrix from getPerspectiveTransform in OpenCV?

I'm now trying to analyze the perspective transform/homography matrix between two images capturing the same object (e.g., a rectangle) but at different perspectives/shooting angles. The perspective transform can be derived by using the function getPerspectiveTransform in OpenCV 2.3.1. I want to find the corresponding rotation and translation matrices.
The output of getPerspectiveTransform is a 3x3 matrix which I can use directly to warp the source image into the target image. But my question is how I can find the rotation and translation matrices based on the obtained 3x3 matrix?
I was looking into the function decomposeProjectionMatrix for the corresponding rotation and translation matrices, but the input is required to be a 3x4 projection matrix. How can I relate the perspective transformation (i.e., a 3x3 matrix) to the 3x4 projection matrix? Am I on the right track?
Thank you very much.
The information contained in the homography matrix (returned from getPerspectiveTransform) is not enough to extract rotation/translation. The missing column is key to correctly finding the angles.
The good news is that in some scenarios, you can use the solvePnP() function to extract the desired parameters from two sets of points.
Also, this question is about the same thing you are asking about; it should help:
Analyze camera movement with OpenCV
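A rough sketch of the solvePnP route for the rectangle case; the physical rectangle size, the camera matrix K, and the detected image corners below are all assumptions, not values from the question:

import cv2
import numpy as np

# Rectangle corners in its own plane (Z = 0), e.g. a 20 cm x 10 cm card
object_points = np.array([[0.00, 0.00, 0.],
                          [0.20, 0.00, 0.],
                          [0.20, 0.10, 0.],
                          [0.00, 0.10, 0.]])

# Placeholder intrinsics and detected corner positions (same order as above)
K = np.array([[800., 0., 320.], [0., 800., 240.], [0., 0., 1.]])
image_points = np.array([[300., 200.], [420., 205.], [415., 270.], [298., 262.]])

ok, rvec, tvec = cv2.solvePnP(object_points, image_points, K, None)
R, _ = cv2.Rodrigues(rvec)   # 3x3 rotation matrix; tvec is the translation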

Compute transformation matrix from a set of coordinates (with OpenCV)

I have a small cube with n (you can assume that n = 4) distinguished points on its surface. These points are numbered (1-n) and form a coordinate space, where point #1 is the origin.
Now I'm using a tracking camera to get the coordinates of those points, relative to the camera's coordinate space. That means that I now have n vectors p_i pointing from the origin of the camera to the cube's surface.
With that information, I'm trying to compute the affine transformation matrix (rotation + translation) that represents the transformation between those two coordinate spaces. The translation part is fairly trivial, but I'm struggling with the computation of the rotation matrix.
Is there any built-in functionality in OpenCV that might help me solve this problem?
Sounds like cvGetPerspectiveTransform is what you're looking for; cvFindHomography might also be helpful.
solvePnP should give you the rotation matrix and the translation vector. Try it with CV_EPNP or CV_ITERATIVE.
Edit: Or perhaps you're looking for RQ decomposition.
Look at the Stereo Camera tutorial for OpenCV. OpenCV uses a planar chessboard for all the computation and sets its Z-dimension to 0 to build its list of 3D points. You already have 3D points so change the code in the tutorial to reflect your list of 3D points. Then you can compute the transformation.
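As a hedged alternative to the suggestions above, since the correspondences here are 3D-to-3D, OpenCV's estimateAffine3D fits a 3x4 transform directly between two 3D point sets (the cube-frame points and the measured camera-frame points p_i); the data below is invented just to make the sketch self-contained:

import cv2
import numpy as np

# Four distinguished points in the cube's own coordinate space (n = 4)
cube_points = np.array([[0., 0., 0.],
                        [1., 0., 0.],
                        [0., 1., 0.],
                        [0., 0., 1.]], dtype=np.float32)

# Stand-in for the tracked camera-frame points p_i, generated here from a
# known pose purely so the example runs on its own
R_true, _ = cv2.Rodrigues(np.array([[0.], [0.3], [0.]]))
t_true = np.array([0.5, -0.2, 2.0])
camera_points = (cube_points @ R_true.T + t_true).astype(np.float32)

retval, affine, inliers = cv2.estimateAffine3D(cube_points, camera_points)
R = affine[:, :3]   # rotation part (close to orthonormal for a rigid motion)
t = affine[:, 3]    # translation part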
