how to get camera position from opencv solvepnp for unity - opencv

I am having trouble converting the output of solvePnP to a camera position in Unity. I have spent the last several days going through documentation, reading every question I could find about it, and trying all different approaches but still I am stuck.
Here's what I can do. I have a 3D object in the real world with known 2D-3D coresponsandance points. And I can use these and solvePNP to get the rvec,tvec. Then I can plot the 2D points and then on top of those I can plot the points found with projectPoints. These points line up pretty closely.
(success, rotation_vector, translation_vector) = cv2.solvePnP(model_points, image_points, camera_matrix,
dist_coeffs, flags=cv2.cv2.SOLVEPNP_ITERATIVE)
print("Rotation Vector:\n" + str(rotation_vector))
print("Translation Vector:\n" + str(translation_vector))
(repro_points, jacobian) = cv2.projectPoints(model_points, rotation_vector,
translation_vector, camera_matrix, dist_coeffs)
original_img = cv2.imread(r"{}".format("og.jpg"))
for p in repro_points:
cv2.circle(original_img, (int(p[0][0]), int(p[0][1])), 3, (255, 255, 255), -1)
print(str(p[0][0]) + "-" + str(p[0][1]))
for p in image_points:
cv2.circle(original_img, (int(p[0]), int(p[1])), 3, (0, 0, 255), -1)
cv2.imshow("og", original_img)
cv2.waitKey()
The code above will display my original image with the image_points and repro_points more or less lined up. Now I understand from reading that tvec and rvec cannot be used in Unity directly instead they represent the transformation matrix between the camera space and the object space such that you can translate points between the two spaces with the following formula.
Now I want to take what I solvepnp has given me, the rvec and tvec and determine how to rotate and translate my unity camera to line up in the same position as the original picture was taken. In Unity I start with the camera facing my object and with both of them at 0,0,0. So to try to keep things clear in my head I made another python script to try and convert Rvec and Tvec into unity position and rotations. This is based on advice I saw on opencv forum. First I get the rotation matrix from Rodrigues, transpose it and then swap the 2nd and 3rd row to swap y and z although I don't know if that is right. Then I rotate the matrix and finally negate it and multiply it by tvec to get the position. But this position does not line up with the real world.
def main(argv):
print("Start")
rvec = np.array([0.11160132, -2.67532422, -0.55994949])
rvec = rvec.reshape(3,1)
print("RVEC")
print(rvec)
tvec = np.array([0.0896325, -0.14819345, -0.36882839])
tvec = tvec.reshape(3,1)
print("TVEC")
print(tvec)
rotmat,_ = cv2.Rodrigues(rvec)
print("Rotation Matrix:")
print(rotmat)
#trans_mat = cv2.hconcat((rotmat, tvec))
#print("Transformation Matrix:")
#print(trans_mat)
#transpose the transformation matrix
transposed_mat = np.transpose(rotmat)
print("Transposed Mat: ")
print(transposed_mat)
#swap rows 1 & 2
swap = np.empty([3, 3])
swap[0] = rotmat[0]
swap[1] = rotmat[2]
swap[2] = rotmat[1]
print("SWAP")
print(swap)
R = np.rot90(swap)
#this is supposed to be the rotation matrix for the camera
print("R:")
print(R)
#translation matrix
#they say negative matrix is 1's on diagonals do they mean idenity matrix
#negativematrix = np.identity(3)
position = np.matmul(-R, tvec);
print("Position: ")
print(position)
The output of this code is:
Start
RVEC
[[ 0.11160132]
[-2.67532422]
[-0.55994949]]
TVEC
[[ 0.0896325 ]
[-0.14819345]
[-0.36882839]]
Rotation Matrix:
[[-0.91550667 0.00429232 -0.4022799 ]
[-0.15739624 0.91641547 0.36797976]
[ 0.37023502 0.40020526 -0.83830888]]
Transposed Mat:
[[-0.91550667 -0.15739624 0.37023502]
[ 0.00429232 0.91641547 0.40020526]
[-0.4022799 0.36797976 -0.83830888]]
SWAP
[[-0.91550667 0.00429232 -0.4022799 ]
[ 0.37023502 0.40020526 -0.83830888]
[-0.15739624 0.91641547 0.36797976]]
R:
[[-0.4022799 -0.83830888 0.36797976]
[ 0.00429232 0.40020526 0.91641547]
[-0.91550667 0.37023502 -0.15739624]]
Position:
[[0.04754685]
[0.39692311]
[0.07887335]]
If I swap y and z here I could sort of see it being close but it is still not right.
To get the rotation I have been doing the following. And I also tried subtracting 180 from the y axis since in Unity my camera and object are facing one another. But this is not coming out right for me either.
rotation_mat, jacobian = cv2.Rodrigues(rotation_vector)
pose_mat = cv2.hconcat((rotation_mat, translation_vector))
tr = -np.matrix(rotation_mat).T * np.matrix(translation_vector)
print("TR_TR")
print(tr)
_, _, _, _, _, _, euler_angles = cv2.decomposeProjectionMatrix(pose_mat)
print("Euler:")
print(euler_angles)
I was feeling good this morning when I got the re-projected points to line up, but now I feel like I'm stuck in the mud. Any help is appreciated. Thank you.

I had a similar problem when I was writing an AR application for Unity. I remember spending several days too, until I figured it out. I had a DLL written in C++ OpenCV which took an image from a webcam, detected an object in the image and found its pose. A front end written in C# Unity would call the DLL's functions and update the position and orientation of a 3D model accordingly.
Simplified versions of the C++ and Unity code are:
void getCurrentPose(float* outR, float* outT)
{
cv::Mat Rvec;
cv::Mat Tvec;
cv::solvePnP(g_modelPoints, g_imagePoints, g_cameraMatrix, g_distortionParams, Rvec, Tvec, false, cv::SOLVEPNP_ITERATIVE);
cv::Matx33d R;
cv::Rodrigues(Rvec, R);
cv::Point3d T;
T.x = Tvec.at<double>(0, 0);
T.y = Tvec.at<double>(1, 0);
T.z = Tvec.at<double>(2, 0);
// Uncomment to return the camera transformation instead of model transformation
/* const cv::Matx33d Rinv = R.t();
const cv::Point3d Tinv = -R.t() * T;
R = Rinv;
T = Tinv;
*/
for(int i = 0; i < 9; i++)
{
outR[i] = (float)R.val[i];
}
outT[0] = (float)T.x;
outT[1] = (float)T.y;
outT[2] = (float)T.z;
}
and
public class Vision : MonoBehaviour {
GameObject model = null;
void Start() {
model = GameObject.Find("Model");
}
void Update() {
float[] r = new float[9];
float[] t = new float[3];
dll_getCurrentPose(r, t); // Get object pose from DLL
Matrix4x4 R = new Matrix4x4();
R.SetRow(0, new Vector4(r[0], r[1], r[2], 0));
R.SetRow(1, new Vector4(r[3], r[4], r[5], 0));
R.SetRow(2, new Vector4(r[6], r[7], r[8], 0));
R.SetRow(3, new Vector4(0, 0, 0, 1));
Quaternion Q = R.rotation;
model.transform.SetPositionAndRotation(
new Vector3(t[0], -t[1], t[2]),
new Quaternion(-Q.x, Q.y, -Q.z, Q.w));
}
}
It should be easy to port in Python and try it. Since you want the camera transformation, you should probably uncomment the commented lines in the C++ code. I don't know if this code will work for your case but maybe it is worth trying. Unfortunately I don't have Unity installed anymore to try things so I can't make any other suggestions.

Related

Pose Estimation - cv::SolvePnP with Scenekit - Coordinate System Question

I have been working on Pose Estimation (rectifying key points on a 3D model with 2D points on an image to match pose) via OpenCV's cv::solvePNP, using features / key points from Apples Vision framework.
TL-DR:
My scene kit model is being translated and the units look correct when introspecting the translation and rotation vectors from solvePnP (ie, they are the right order of magnitude), but the coordinate system of the translation appears off:
I am trying to understand the coordinate system requirements with solvePnP wrt to Metal / OpenGL coordinate system and my camera projection matrix.
What 'projectionMatrix' does my SCNCamera require to match image based coordinate system passed into solvePnP?
Some things ive read / believe I am taking into account.
OpenCV vs OpenGL (thus Metal) have row major vs column major differences.
OpenCV's coordinate system for 3D is different than OpenGL (thus Metal).
Longer with code:
My workflow is as such:
Step 1 - use a 3D model tool to introspect points on my 3D model and get the objects vertex positions for the major key points in the 2D detected features. I am using left pupil, right pupil, tip of nose, tip of chin, left outer lip corner, right outer lip corner.
Step 2 - Run a vision request and extract a list of points in image space (converting for OpenCV's top left coordinate system) and extract the same ordered list of 2D points.
Step 3 - Construct a camera matrix by using the size of the input image.
Step 4 - run cv::solvePnP, and then use cv::Rodrigues to convert the rotation vector to a matrix
Step 5 - Convert the coordinate system of the resulting transforms into something appropriate for the GPU - invert the y and z axis and combine the translation and rotation to a single 4x4 Matrix, and then transpose it for the appropriate major ness of OpenGL / Metal
Step 6 - apply the resulting transform to Scenekit via:
let faceNodeTransform = openCVWrapper.transform(for: landmarks, imageSize: size)
self.destinationView.pointOfView?.transform = SCNMatrix4Invert(faceNodeTransform)
Below is my Obj-C++ OpenCV Wrapper which takes in a subset of Vision Landmarks and the true pixel size of the image being looked at:
/ https://answers.opencv.org/question/23089/opencv-opengl-proper-camera-pose-using-solvepnp/
- (SCNMatrix4) transformFor:(VNFaceLandmarks2D*)landmarks imageSize:(CGSize)imageSize
{
// 1 convert landmarks to image points in image space (pixels) to vector of cv::Point2f's :
// Note that this translates the point coordinate system to be top left oriented for OpenCV's image coordinates:
std::vector<cv::Point2f > imagePoints = [self imagePointsForLandmarks:landmarks imageSize:imageSize];
// 2 Load Model Points
std::vector<cv::Point3f > modelPoints = [self modelPoints];
// 3 create our camera extrinsic matrix
// TODO - see if this is sane?
double max_d = fmax(imageSize.width, imageSize.height);
cv::Mat cameraMatrix = (cv::Mat_<double>(3,3) << max_d, 0, imageSize.width/2.0,
0, max_d, imageSize.height/2.0,
0, 0, 1.0);
// 4 Run solvePnP
double distanceCoef[] = {0,0,0,0};
cv::Mat distanceCoefMat = cv::Mat(1 ,4 ,CV_64FC1,distanceCoef);
// Output Matrixes
std::vector<double> rv(3);
cv::Mat rotationOut = cv::Mat(rv);
std::vector<double> tv(3);
cv::Mat translationOut = cv::Mat(tv);
cv::solvePnP(modelPoints, imagePoints, cameraMatrix, distanceCoefMat, rotationOut, translationOut, false, cv::SOLVEPNP_EPNP);
// 5 Convert rotation matrix (actually a vector)
// To a real 4x4 rotation matrix:
cv::Mat viewMatrix = cv::Mat::zeros(4, 4, CV_64FC1);
cv::Mat rotation;
cv::Rodrigues(rotationOut, rotation);
// Append our transforms to our matrix and set final to identity:
for(unsigned int row=0; row<3; ++row)
{
for(unsigned int col=0; col<3; ++col)
{
viewMatrix.at<double>(row, col) = rotation.at<double>(row, col);
}
viewMatrix.at<double>(row, 3) = translationOut.at<double>(row, 0);
}
viewMatrix.at<double>(3, 3) = 1.0f;
// Transpose OpenCV to OpenGL coords
cv::Mat cvToGl = cv::Mat::zeros(4, 4, CV_64FC1);
cvToGl.at<double>(0, 0) = 1.0f;
cvToGl.at<double>(1, 1) = -1.0f; // Invert the y axis
cvToGl.at<double>(2, 2) = -1.0f; // invert the z axis
cvToGl.at<double>(3, 3) = 1.0f;
viewMatrix = cvToGl * viewMatrix;
// Finally transpose to get correct SCN / OpenGL Matrix :
cv::Mat glViewMatrix = cv::Mat::zeros(4, 4, CV_64FC1);
cv::transpose(viewMatrix , glViewMatrix);
return [self convertCVMatToMatrix4:glViewMatrix];
}
- (SCNMatrix4) convertCVMatToMatrix4:(cv::Mat)matrix
{
SCNMatrix4 scnMatrix = SCNMatrix4Identity;
scnMatrix.m11 = matrix.at<double>(0, 0);
scnMatrix.m12 = matrix.at<double>(0, 1);
scnMatrix.m13 = matrix.at<double>(0, 2);
scnMatrix.m14 = matrix.at<double>(0, 3);
scnMatrix.m21 = matrix.at<double>(1, 0);
scnMatrix.m22 = matrix.at<double>(1, 1);
scnMatrix.m23 = matrix.at<double>(1, 2);
scnMatrix.m24 = matrix.at<double>(1, 3);
scnMatrix.m31 = matrix.at<double>(2, 0);
scnMatrix.m32 = matrix.at<double>(2, 1);
scnMatrix.m33 = matrix.at<double>(2, 2);
scnMatrix.m34 = matrix.at<double>(2, 3);
scnMatrix.m41 = matrix.at<double>(3, 0);
scnMatrix.m42 = matrix.at<double>(3, 1);
scnMatrix.m43 = matrix.at<double>(3, 2);
scnMatrix.m44 = matrix.at<double>(3, 3);
return (scnMatrix);
}
Some questions:
An SCNNode has no modelViewMatrix (just as I understand it, a transform, which is the modelMatrix) to just throw a matrix at - so I've read the inverse of the transform from SolvePNP process can be used to pose the camera instead, which appears to get me the closes result. I want to ensure this approach is correct.
If I have the modelViewMatrix, and the projectionMatrix, I should be able to calculate the appropriate modelMatrix? Is this the approach I should be taking?
Its unclear to me what projectionMatrix I should be using for my SceneKit Scene and If that has any bearing on my results. Do I need a pixel for pixel exact match of my viewport to the image size, and how do I properly configure my SCNCamera to ensure coordinate system agreeance for SolvePnP?
Thank you very much!

opencv Vec3d to Eigen::Quaternion, euler flipping on results

I am using opencv::solvePnP to return a camera pose. I run PnP, and it returns the rvec and tvec values.(rotation vector and position).
I then run this function to convert the values to the camera pose:
void GetCameraPoseEigen(cv::Vec3d tvecV, cv::Vec3d rvecV, Eigen::Vector3d &Translate, Eigen::Quaterniond &quats)
{
Mat R;
Mat tvec, rvec;
tvec = DoubleMatFromVec3b(tvecV);
rvec = DoubleMatFromVec3b(rvecV);
cv::Rodrigues(rvec, R); // R is 3x3
R = R.t(); // rotation of inverse
tvec = -R*tvec; // translation of inverse
Eigen::Matrix3d mat;
cv2eigen(R, mat);
Eigen::Quaterniond EigenQuat(mat);
quats = EigenQuat;
double x_t = tvec.at<double>(0, 0);
double y_t = tvec.at<double>(1, 0);
double z_t = tvec.at<double>(2, 0);
Translate.x() = x_t * 10;
Translate.y() = y_t * 10;
Translate.z() = z_t * 10;
}
This works, yet at some rotation angles, the converted rotation values flip randomly between positive and negative values. Yet, the source rvecV value does not. I assume this means I am going wrong with my conversion. How can i get a stable Quaternion from the PnP returned cv::Vec3d?
EDIT: This seems to be Quaternion flipping, as mentioned here:
Quaternion is flipping sign for very similar rotations?
Based on that, i have tried adding:
if(quat.w() < 0)
{
quat = quat.Inverse();
}
But I see the same flipping.
Both quat and -quat represent the same rotation. You can check that by taking a unit quaternion, converting it to a rotation matrix, then doing
quat.coeffs() = -quat.coeffs();
and converting that to a rotation matrix as well.
If for some reason you always want a positive w value, negate all coefficients if w is negative.
The sign should not matter...
... rotation-wise, as long as all four fields of the 4D quaternion are getting flipped. There's more to it explained here:
Quaternion to EulerXYZ, how to differentiate the negative and positive quaternion
Think of it this way:
Angle/axis both flipped mean the same thing
and mind the clockwise to counterclockwise transition much like in a mirror image.
There may be convention to keep the quat.w() or quat[0] component positive and change other components to opposite accordingly. Assume w = cos(angle/2) then setting w > 0 just means: I want angle to be within the (-pi, pi) range. So that the -270 degrees rotation becomes +90 degrees rotation.
Doing the quat.Inverse() is probably not what you want, because this creates a rotation in the opposite direction. That is -quat != quat.Inverse().
Also: check that both systems have the same handedness (chirality)! Test if your rotation matrix determinant is +1 or -1.
(sry for the image link, I don't have enough reputation to embed them).

Rotation matrix to euler angles with opencv

I am working on a project wich involves Aruco markers and opencv.
I am quite far in the project progress. I can read the rotation vectors and convert them to a rodrigues matrix using rodrigues() from opencv.
This is a example of a rodrigues matrix I get:
[0,1,0;
1,0,0;
0,0,-1]
I use the following code.
Mat m33(3, 3, CV_64F);
Mat measured_eulers(3, 1, CV_64F);
Rodrigues(rotationVectors, m33);
measured_eulers = rot2euler(m33);
Degree_euler = measured_eulers * 180 / CV_PI;
I use the predefined rot2euler to convert from rodrigues matrix to euler angles.
And I convert the received radians to degrees.
rot2euler looks like the following.
Mat rot2euler(const Mat & rotationMatrix)
{
Mat euler(3, 1, CV_64F);
double m00 = rotationMatrix.at<double>(0, 0);
double m02 = rotationMatrix.at<double>(0, 2);
double m10 = rotationMatrix.at<double>(1, 0);
double m11 = rotationMatrix.at<double>(1, 1);
double m12 = rotationMatrix.at<double>(1, 2);
double m20 = rotationMatrix.at<double>(2, 0);
double m22 = rotationMatrix.at<double>(2, 2);
double x, y, z;
// Assuming the angles are in radians.
if (m10 > 0.998) { // singularity at north pole
x = 0;
y = CV_PI / 2;
z = atan2(m02, m22);
}
else if (m10 < -0.998) { // singularity at south pole
x = 0;
y = -CV_PI / 2;
z = atan2(m02, m22);
}
else
{
x = atan2(-m12, m11);
y = asin(m10);
z = atan2(-m20, m00);
}
euler.at<double>(0) = x;
euler.at<double>(1) = y;
euler.at<double>(2) = z;
return euler;
}
If I use the rodrigues matrix I give as an example I get the following euler angles.
[0; 90; -180]
But I am suppose to get the following.
[-180; 0; 90]
When is use this tool http://danceswithcode.net/engineeringnotes/rotations_in_3d/demo3D/rotations_in_3d_tool.html
You can see that [0; 90; -180] doesn't match the rodrigues matrix but [-180; 0; 90] does. (I am aware of the fact that the tool works with ZYX coordinates)
So the problem is I get the correct values but in a wrong order.
Another problem is that this isn't always the case.
For example rodrigues matrix:
[1,0,0;
0,-1,0;
0,0,-1]
Provides me the correct euler angles.
If someone knows a solution to the problem or can provide me with a explanation how the rot2euler function works exactly. It will be higly appreciated.
Kind Regards
Brent Convens
I guess I am quite late but I'll answer it nonetheless.
Dont quote me on this, ie I'm not 100 % certain but this is one
of the files ( {OPENCV_INSTALLATION_DIR}/apps/interactive-calibration/rotationConverters.cpp ) from the source code of openCV 3.3
It seems to me that openCV is giving you Y-Z-X ( similar to what is being shown in the code above )
Why I said I wasn't sure because I just looked at the source code of cv::Rodrigues and it doesnt seem to call this piece of code that I have shown above. The Rodrigues function has the math harcoded into it ( and I think it can be checked by Taking the 2 rotation matrices and multiplying them as - R = Ry * Rz * Rx and then looking at the place in the code where there is a acos(R(2,0)) or asin(R(0,2) or something similar,since one of the elements of "R" will usually be a cos() or sine which will give you a solution as to which angle is being found.
Not specific to OpenCV, but you could write something like this:
cosine_for_pitch = math.sqrt(pose_mat[0][0] ** 2 + pose_mat[1][0] ** 2)
is_singular = cosine_for_pitch < 10**-6
if not is_singular:
yaw = math.atan2(pose_mat[1][0], pose_mat[0][0])
pitch = math.atan2(-pose_mat[2][0], cosine_for_pitch)
roll = math.atan2(pose_mat[2][1], pose_mat[2][2])
else:
yaw = math.atan2(-pose_mat[1][2], pose_mat[1][1])
pitch = math.atan2(-pose_mat[2][0], cosine_for_pitch)
roll = 0
Here, you could explore more:
https://www.learnopencv.com/rotation-matrix-to-euler-angles/
http://www.staff.city.ac.uk/~sbbh653/publications/euler.pdf
I propose to use the PCL library to do that with this formulation
pcl::getEulerAngles(transformatoin,roll,pitch,yaw);
you need just to initialize the roll, pitch, yaw and a pre-calculated transformation matrix you can do it

OpenCV + OpenGL: proper camera pose using solvePnP

I've got problem with obtaining proper camera pose from iPad camera using OpenCV.
I'm using custom made 2D marker (based on AruCo library ) - I want to render 3D cube over that marker using OpenGL.
In order to recieve camera pose I'm using solvePnP function from OpenCV.
According to THIS LINK I'm doing it like this:
cv::solvePnP(markerObjectPoints, imagePoints, [self currentCameraMatrix], _userDefaultsManager.distCoeffs, rvec, tvec);
tvec.at<double>(0, 0) *= -1; // I don't know why I have to do it, but translation in X axis is inverted
cv::Mat R;
cv::Rodrigues(rvec, R); // R is 3x3
R = R.t(); // rotation of inverse
tvec = -R * tvec; // translation of inverse
cv::Mat T(4, 4, R.type()); // T is 4x4
T(cv::Range(0, 3), cv::Range(0, 3)) = R * 1; // copies R into T
T(cv::Range(0, 3), cv::Range(3, 4)) = tvec * 1; // copies tvec into T
double *p = T.ptr<double>(3);
p[0] = p[1] = p[2] = 0;
p[3] = 1;
camera matrix & dist coefficients are coming from findChessboardCorners function, imagePoints are manually detected corners of marker (you can see them as green square in the video posted below), and markerObjectPoints are manually hardcoded points that represents marker corners:
markerObjectPoints.push_back(cv::Point3d(-6, -6, 0));
markerObjectPoints.push_back(cv::Point3d(6, -6, 0));
markerObjectPoints.push_back(cv::Point3d(6, 6, 0));
markerObjectPoints.push_back(cv::Point3d(-6, 6, 0));
Because marker is 12 cm long in real world, I've chosed the same size in the for easier debugging.
As a result I'm recieving 4x4 matrix T, that I'll use as ModelView matrix in OpenCV.
Using GLKit drawing function looks more or less like this:
- (void)glkView:(GLKView *)view drawInRect:(CGRect)rect {
// preparations
glClearColor(0.0, 0.0, 0.0, 0.0);
glClear(GL_COLOR_BUFFER_BIT | GL_DEPTH_BUFFER_BIT);
float aspect = fabsf(self.bounds.size.width / self.bounds.size.height);
effect.transform.projectionMatrix = GLKMatrix4MakePerspective(GLKMathDegreesToRadians(39), aspect, 0.1f, 1000.0f);
// set modelViewMatrix
float mat[16] = generateOpenGLMatFromFromOpenCVMat(T);
currentModelMatrix = GLKMatrix4MakeWithArrayAndTranspose(mat);
effect.transform.modelviewMatrix = currentModelMatrix;
[effect prepareToDraw];
glDrawArrays(GL_TRIANGLES, 0, 36); // draw previously prepared cube
}
I'm not rotating everything for 180 degrees around X axis (as it was mentioned in previously linked article), because I doesn't look as necessary.
The problem is that it doesn't work! Translation vector looks OK, but X and Y rotations are messed up :(
I've recorded a video presenting that issue:
http://www.youtube.com/watch?v=EMNBT5H7-os
I've tried almost everything (including inverting all axises one by one), but nothing actually works.
What should I do? How should I properly display that 3D cube? Translation / rotation vectors that come from solvePnP are looking reasonable, so I guess that I can't correctly map these vectors to OpenGL matrices.
Thanks to Djo1509 from http://answers.opencv.org/ I've found out that the problem was unnecessary transposed rotation matrix R matrix used as part of matrix T, and unnecessary tvec = -R * tvec operation.
For more info look there

Testing a fundamental matrix

My questions are:
How do I figure out if my fundamental matrix is correct?
Is the code I posted below a good effort toward that?
My end goal is to do some sort of 3D reconstruction. Right now I'm trying to calculate the fundamental matrix so that I can estimate the difference between the two cameras. I'm doing this within openFrameworks, using the ofxCv addon, but for the most part it's just pure OpenCV. It's difficult to post code which isolates the problem since ofxCv is also in development.
My code basically reads in two 640x480 frames taken by my webcam from slightly different positions (basically just sliding the laptop a little bit horizontally). I already have a calibration matrix for it, obtained from ofxCv's calibration code, which uses findChessboardCorners. The undistortion example code seems to indicate that the calibration matrix is accurate. It calculates the optical flow between the pictures (either calcOpticalFlowPyrLK or calcOpticalFlowFarneback), and feeds those point pairs to findFundamentalMatrix.
To test if the fundamental matrix is valid, I decomposed it to a rotation and translation matrix. I then multiplied the rotation matrix by the points of the second image, to see what the rotation difference between the cameras was. I figured that any difference should be small, but I'm getting big differences.
Here's the fundamental and rotation matrix of my last code, if it helps:
fund: [-8.413948689969405e-07, -0.0001918870646474247, 0.06783422344973795;
0.0001877654679452431, 8.522397812179886e-06, 0.311671691674232;
-0.06780237856576941, -0.3177275967586101, 1]
R: [0.8081771697692786, -0.1096128431920695, -0.5786490187247098;
-0.1062963539438068, -0.9935398408215166, 0.03974506055610323;
-0.5792674230456705, 0.02938723035105822, -0.8146076621848839]
t: [0, 0.3019063882496216, -0.05799044915951077;
-0.3019063882496216, 0, -0.9515721940769112;
0.05799044915951077, 0.9515721940769112, 0]
Here's my portion of the code, which occurs after the second picture is taken:
const ofImage& image1 = images[images.size() - 2];
const ofImage& image2 = images[images.size() - 1];
std::vector<cv::Point2f> points1 = flow->getPointsPrev();
std::vector<cv::Point2f> points2 = flow->getPointsNext();
std::vector<cv::KeyPoint> keyPoints1 = convertFrom(points1);
std::vector<cv::KeyPoint> keyPoints2 = convertFrom(points2);
std::cout << "points1: " << points1.size() << std::endl;
std::cout << "points2: " << points2.size() << std::endl;
fundamentalMatrix = (cv::Mat)cv::findFundamentalMat(points1, points2);
cv::Mat cameraMatrix = (cv::Mat)calibration.getDistortedIntrinsics().getCameraMatrix();
cv::Mat cameraMatrixInv = cameraMatrix.inv();
std::cout << "fund: " << fundamentalMatrix << std::endl;
essentialMatrix = cameraMatrix.t() * fundamentalMatrix * cameraMatrix;
cv::SVD svd(essentialMatrix);
Matx33d W(0,-1,0, //HZ 9.13
1,0,0,
0,0,1);
cv::Mat_<double> R = svd.u * Mat(W).inv() * svd.vt; //HZ 9.19
std::cout << "R: " << (cv::Mat)R << std::endl;
Matx33d Z(0, -1, 0,
1, 0, 0,
0, 0, 0);
cv::Mat_<double> t = svd.vt.t() * Mat(Z) * svd.vt;
std::cout << "t: " << (cv::Mat)t << std::endl;
Vec3d tVec = Vec3d(t(1,2), t(2,0), t(0,1));
Matx34d P1 = Matx34d(R(0,0), R(0,1), R(0,2), tVec(0),
R(1,0), R(1,1), R(1,2), tVec(1),
R(2,0), R(2,1), R(2,2), tVec(2));
ofMatrix4x4 ofR(R(0,0), R(0,1), R(0,2), 0,
R(1,0), R(1,1), R(1,2), 0,
R(2,0), R(2,1), R(2,2), 0,
0, 0, 0, 1);
ofRs.push_back(ofR);
cv::Matx34d P(1,0,0,0,
0,1,0,0,
0,0,1,0);
for (int y = 0; y < image1.height; y += 10) {
for (int x = 0; x < image1.width; x += 10) {
Vec3d vec(x, y, 0);
Point3d point1(vec.val[0], vec.val[1], vec.val[2]);
Vec3d result = (cv::Mat)((cv::Mat)R * (cv::Mat)vec);
Point3d point2 = result;
mesh.addColor(image1.getColor(x, y));
mesh.addVertex(ofVec3f(point1.x, point1.y, point1.z));
mesh.addColor(image2.getColor(x, y));
mesh.addVertex(ofVec3f(point2.x, point2.y, point2.z));
}
}
Any ideas? Does my fundamental matrix look correct, or do I have the wrong idea in testing it?
If you want to find out if your Fundamental Matrix is correct, you should compute error.
Using the epipolar constraint equation, you can check how close the detected features in one image lie on the epipolar lines of the other image. Ideally, these dot products should sum to 0, and thus, the calibration error is computed as the sum of absolute distances (SAD). The mean of the SAD is reported as stereo calibration error. Basically, you are computing SAD of the computed features in image_left (could be chessboard corners) from the corresponding epipolar lines. This error is measured in pixel^2, anything below 1 is acceptable.
OpenCV has code examples, look at the Stereo Calibrate cpp file, it shows you how to compute this error.
https://code.ros.org/trac/opencv/browser/trunk/opencv/samples/c/stereo_calib.cpp?rev=2614
Look at "avgErr" Lines 260-269
Ankur
i think that you did not remove matches which are incorrect before you use then to calculate F.
Also i have an idea on how to validate F ,from x'Fx=0,you can replace several x' and x in the formula.
KyleFan
I wrote a python function to do this:
def Ferror(F,pts1,pts2): # pts are Nx3 array of homogenous coordinates.
# how well F satisfies the equation pt1 * F * pt2 == 0
vals = pts1.dot(F).dot(pts2.T)
err = np.abs(vals)
print("avg Ferror:",np.mean(err))
return np.mean(err)

Resources