I have two images captured by two cameras of the same make, placed some distance apart and viewing the same scene. I want to calculate the real-world rotation and translation between the two cameras. To achieve this, I first extracted SIFT features from both images and matched them.
I now have the fundamental matrix as well as the homography matrix. However, I am unable to proceed further and am quite confused. Can anybody help me estimate the rotation and translation between the two cameras?
I'm using OpenCV for feature extraction, matching, and the homography calculation.
If you have the homography, then you also have the rotation. Once you have the homography, it is easy to extract the rotation and translation.
For example, if you are using OpenCV in C++:
#include <opencv2/opencv.hpp>
using namespace cv;

/// @param[in]  H     homography matrix
/// @param[out] pose  3x4 camera pose [R|t]
void cameraPoseFromHomography(const Mat& H, Mat& pose)
{
    pose = Mat::eye(3, 4, CV_32FC1);      // 3x4 matrix, the camera pose
    float norm1 = (float)norm(H.col(0));
    float norm2 = (float)norm(H.col(1));
    float tnorm = (norm1 + norm2) / 2.0f; // Normalization value

    Mat p1 = H.col(0);       // Pointer to first column of H
    Mat p2 = pose.col(0);    // Pointer to first column of pose (empty)
    cv::normalize(p1, p2);   // Normalize the rotation and copy the column to pose

    p1 = H.col(1);           // Pointer to second column of H
    p2 = pose.col(1);        // Pointer to second column of pose (empty)
    cv::normalize(p1, p2);   // Normalize the rotation and copy the column to pose

    p1 = pose.col(0);
    p2 = pose.col(1);
    Mat p3 = p1.cross(p2);   // Cross-product of p1 and p2
    Mat c2 = pose.col(2);    // Pointer to third column of pose
    p3.copyTo(c2);           // Third column is the cross-product of columns one and two

    pose.col(3) = H.col(2) / tnorm;  // vector t [R|t] is the last column of pose
}
This function calculates the camera pose from the homography, in which the rotation is contained. For further theoretical background, follow this thread.
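For instance, a minimal usage sketch (here pts1 and pts2 are hypothetical names for the matched SIFT keypoint locations in the two images):

// pts1, pts2: matched keypoint locations (std::vector<cv::Point2f>) from image 1 and image 2
cv::Mat H = cv::findHomography(pts1, pts2, cv::RANSAC, 3.0);

cv::Mat pose;
cameraPoseFromHomography(H, pose);        // pose is the 3x4 [R|t] matrix
cv::Mat R = pose(cv::Rect(0, 0, 3, 3));   // 3x3 rotation part
cv::Mat t = pose.col(3);                  // translation part

Keep in mind that a single homography relates the two views only for a (dominantly) planar scene or a pure rotation, and the recovered translation is defined only up to scale.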
Related
Given a set of 3D points in the camera's frame corresponding to a planar surface (the ground), is there any fast, efficient method to find the orientation of the plane relative to the camera's image plane? Or is it only possible by running heavier "surface matching" algorithms on the point cloud?
I've tried to use estimateAffine3D and findHomography, but my main limitation is that I don't have the point coordinates on the surface plane - I can only select a set of points from the depth images and thus must work from a set of 3D points in the camera frame.
I've written a simple geometric approach that takes a couple of points and computes vertical and horizontal angles based on the depth measurements, but I fear this is neither very robust nor very precise.
EDIT: Following the suggestion by @Micka, I've attempted to fit the points to a plane in the camera's frame, with the following function:
#include <iostream>
#include <opencv2/opencv.hpp>
//------------------------------------------------------------------------------
/// @brief Fits a plane to a set of 3D points, by solving a linear system of the form aX + bY + cZ + d = 0
///
/// @param[in] points The points
///
/// @return 3x1 Mat with the plane equation coefficients [a, b, d] (c is fixed at -1)
///
cv::Mat fitPlane(const std::vector< cv::Point3d >& points) {
    // plane equation: aX + bY + cZ + d = 0
    // assuming c = -1  ->  aX + bY + d = Z
    cv::Mat xys = cv::Mat::ones(static_cast<int>(points.size()), 3, CV_64FC1);
    cv::Mat zs = cv::Mat::ones(static_cast<int>(points.size()), 1, CV_64FC1);

    // populate left- and right-hand matrices
    for (int idx = 0; idx < static_cast<int>(points.size()); idx++) {
        xys.at< double >(idx, 0) = points[idx].x;
        xys.at< double >(idx, 1) = points[idx].y;
        zs.at< double >(idx, 0) = points[idx].z;
    }

    // coefficient mat
    cv::Mat coeff(3, 1, CV_64FC1);

    // problem is now xys * coeff = zs
    // solving using SVD should output coeff
    cv::SVD svd(xys);
    svd.backSubst(zs, coeff);

    // alternative approach -> requires a mat with the 3D coordinates & an additional column of ones
    // solves xyzs * coeff = 0
    // cv::SVD::solveZ(xyzs, coeff); // note: data type must be double (CV_64FC1)

    // check result w/ input coordinates (plane equation should output null or very small values)
    double a = coeff.at< double >(0);
    double b = coeff.at< double >(1);
    double d = coeff.at< double >(2);
    for (auto& point : points) {
        std::cout << a * point.x + b * point.y + d - point.z << std::endl;
    }

    return coeff;
}
For simplicity, it is assumed that the camera is properly calibrated and that the 3D reconstruction is correct - something I have already validated, and therefore out of the scope of this question. I use the mouse to select points on a depth/color frame pair, reconstruct the 3D coordinates, and pass them into the function above.
I've also tried other approaches beyond cv::SVD::solveZ(), such as inverting xyz with cv::invert(), and with cv::solve(), but it always ended in either ridiculously small values or runtime errors regarding matrix size and/or type.
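For reference, a minimal sketch of how the fitted coefficients could be turned into a tilt angle relative to the camera's image plane (assuming the aX + bY + d = Z model used above, so the plane normal in camera coordinates is (a, b, -1), and that points is the std::vector<cv::Point3d> of selected 3D points):

#include <cmath>

// Tilt of the fitted plane with respect to the camera's image plane
cv::Mat coeff = fitPlane(points);
double a = coeff.at<double>(0);
double b = coeff.at<double>(1);
double cosTilt = 1.0 / std::sqrt(a * a + b * b + 1.0); // |dot((a, b, -1), (0, 0, 1))| / |(a, b, -1)|
double tiltRad = std::acos(cosTilt);                   // 0 when the plane is parallel to the image plane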
In OpenCV, how do you calculate the average gradient strength in a Mat, and the average gradient direction?
I have sourced the methods below by googling, but I want to confirm I am actually doing this correctly before moving on to the next step.
Is this correct?
Mat src = imread("foo.png", IMREAD_GRAYSCALE); // read image as grayscale single channel
// Calculate the mean intensity and the std deviation
// Any errors here or am I doing this correctly?
Scalar sMean, sStdDev;
meanStdDev(src, sMean, sStdDev);
double mean = sMean[0];
double stddev = sStdDev[0];
// Calculate the average gradient magnitude/strength across the image
// Any errors here or am I doing this correctly?
Mat dX, dY, mag;
Sobel(src, dX, CV_32F, 1, 0, 1);
Sobel(src, dY, CV_32F, 0, 1, 1);
magnitude(dX, dY, mag); // per-pixel gradient magnitude: sqrt(dX^2 + dY^2)
Scalar sMMean, sMStdDev;
meanStdDev(mag, sMMean, sMStdDev);
double magnitudeMean = sMMean[0];
double magnitudeStdDev = sMStdDev[0];
// Calculate the average gradient direction across the image
// Any errors here or am I doing this correctly?
Scalar avgHorizDir = mean(dX);
Scalar avgVertDir = mean(dY);
double avgDir = atan2(-avgVertDir[0], avgHorizDir[0]);
float blurriness = cv::videostab::calcBlurriness(src); // low values = sharper. High values = blurry
Technically those are the correct ways of obtaining the two averages.
The way you compute mean direction uses weighted directional statistics, meaning that pixels without a strong gradient have less influence on the average.
However, for most images this average direction is not very meaningful, as edges exist in all directions and cancel each other out.
If your image is of a single edge, then this will work great.
If your image has lines in it, containing edges in opposite directions, this will not work. In this case, you want to average the double angle (average orientations). The obvious way of doing this is to compute the direction per pixel as an angle, double it, then use directional statistics to average (i.e. convert back to vectors and average those). Doubling the angle maps opposite directions to the same value, so averaging no longer cancels them out.
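A sketch of that double-angle averaging, weighting each pixel by its gradient magnitude (dX and dY are the CV_32F Sobel outputs from the question):

// Average orientation via the double-angle trick, weighted by gradient magnitude
Mat gmag, gangle;
cartToPolar(dX, dY, gmag, gangle);              // per-pixel magnitude and angle (radians)
Mat doubled = 2.0f * gangle;                    // doubling maps opposite directions onto each other
Mat cos2, sin2;
polarToCart(gmag, doubled, cos2, sin2);         // vectors at the doubled angles, scaled by magnitude
double meanCos = mean(cos2)[0];
double meanSin = mean(sin2)[0];
double avgOrientation = 0.5 * atan2(meanSin, meanCos); // average orientation, in (-pi/2, pi/2]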
Another simple way to average orientations is to take the average of the tensor field obtained by the outer product of the gradient field with itself, and determine the direction of the eigenvector corresponding to the largest eigenvalue. The tensor field is obtained as follows:
Mat Sxx = dX.mul(dX); // element-wise products: the per-pixel outer product of the gradient with itself
Mat Syy = dY.mul(dY);
Mat Sxy = dX.mul(dY);
This should then be averaged:
Scalar mSxx = mean(Sxx);
Scalar mSyy = mean(Syy);
Scalar mSxy = mean(Sxy);
These values form a 2x2 real-valued symmetric matrix:
| mSxx mSxy |
| mSxy mSyy |
It is relatively straightforward to determine its eigendecomposition, and this can be done analytically. I don't have the equations on hand right now, so I'll leave it as an exercise to the reader. :)
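For reference, the closed form for a 2x2 symmetric matrix is standard; a sketch using the averaged values from above:

// Closed-form eigen-analysis of the averaged structure tensor
// | mSxx  mSxy |
// | mSxy  mSyy |
double a = mSxx[0], b = mSxy[0], c = mSyy[0];
double halfTrace = (a + c) / 2.0;
double delta = sqrt((a - c) * (a - c) / 4.0 + b * b);
double lambda1 = halfTrace + delta;               // largest eigenvalue
double lambda2 = halfTrace - delta;               // smallest eigenvalue
double orientation = 0.5 * atan2(2.0 * b, a - c); // direction (radians) of the eigenvector for lambda1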
I am using cv::solvePnP to return a camera pose. I run PnP, and it returns the rvec and tvec values (rotation vector and position).
I then run this function to convert the values to the camera pose:
void GetCameraPoseEigen(cv::Vec3d tvecV, cv::Vec3d rvecV, Eigen::Vector3d &Translate, Eigen::Quaterniond &quats)
{
    Mat R;
    Mat tvec, rvec;
    tvec = DoubleMatFromVec3b(tvecV);
    rvec = DoubleMatFromVec3b(rvecV);

    cv::Rodrigues(rvec, R);   // R is 3x3
    R = R.t();                // rotation of the inverse
    tvec = -R * tvec;         // translation of the inverse

    Eigen::Matrix3d mat;
    cv2eigen(R, mat);
    Eigen::Quaterniond EigenQuat(mat);
    quats = EigenQuat;

    double x_t = tvec.at<double>(0, 0);
    double y_t = tvec.at<double>(1, 0);
    double z_t = tvec.at<double>(2, 0);
    Translate.x() = x_t * 10;
    Translate.y() = y_t * 10;
    Translate.z() = z_t * 10;
}
This works, but at some rotation angles the converted rotation values flip randomly between positive and negative, while the source rvecV value does not. I assume this means I am going wrong somewhere in my conversion. How can I get a stable quaternion from the cv::Vec3d returned by PnP?
EDIT: This seems to be Quaternion flipping, as mentioned here:
Quaternion is flipping sign for very similar rotations?
Based on that, I have tried adding:
if (quat.w() < 0)
{
    quat = quat.inverse();
}
But I see the same flipping.
Both quat and -quat represent the same rotation. You can check that by taking a unit quaternion, converting it to a rotation matrix, then doing
quat.coeffs() = -quat.coeffs();
and converting that to a rotation matrix as well.
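A quick sketch of that check with Eigen:

#include <Eigen/Geometry>
#include <iostream>

int main()
{
    // Some unit quaternion built from an angle-axis rotation
    Eigen::Quaterniond q(Eigen::AngleAxisd(0.7, Eigen::Vector3d(1, 2, 3).normalized()));
    Eigen::Matrix3d R1 = q.toRotationMatrix();
    q.coeffs() = -q.coeffs();                    // flip the sign of all four coefficients
    Eigen::Matrix3d R2 = q.toRotationMatrix();
    std::cout << (R1 - R2).norm() << std::endl;  // ~0: both signs encode the same rotation
    return 0;
}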
If for some reason you always want a positive w value, negate all coefficients if w is negative.
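In Eigen that could look like this (a sketch, assuming quats is the Eigen::Quaterniond filled in by GetCameraPoseEigen above):

// q and -q encode the same rotation; forcing w >= 0 picks one consistent representative
// and avoids the apparent sign flips between nearly identical poses.
if (quats.w() < 0.0)
{
    quats.coeffs() = -quats.coeffs();   // negates x, y, z and w together
}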
The sign should not matter...
... rotation-wise, as long as all four fields of the 4D quaternion are flipped. There's more to it explained here:
Quaternion to EulerXYZ, how to differentiate the negative and positive quaternion
Think of it this way:
An angle and axis that are both flipped mean the same rotation,
and mind the clockwise-to-counterclockwise transition, much like in a mirror image.
There may be a convention to keep the quat.w() or quat[0] component positive and flip the signs of the other components accordingly. Assuming w = cos(angle/2), requiring w > 0 just means: I want the angle to be within the (-pi, pi) range, so that a -270 degree rotation becomes a +90 degree rotation.
Doing quat.inverse() is probably not what you want, because that creates a rotation in the opposite direction. That is, -quat != quat.inverse().
Also: check that both systems have the same handedness (chirality)! Test if your rotation matrix determinant is +1 or -1.
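For example (assuming R is the cv::Mat rotation matrix from the conversion above):

double det = cv::determinant(R);   // +1 for a proper rotation; -1 indicates a reflection / handedness mismatch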
I was able to compute a homography for a WebcamTexture on a texture and a target object.
Now I want to change the transform of Unity's camera based on that homography. I've found a way to get the camera pose from the homography, like this:
void cameraPoseFromHomography(const Mat& H, Mat& pose)
{
    pose = Mat::eye(3, 4, CV_32FC1);      // 3x4 matrix, the camera pose
    float norm1 = (float)norm(H.col(0));
    float norm2 = (float)norm(H.col(1));
    float tnorm = (norm1 + norm2) / 2.0f; // Normalization value

    Mat p1 = H.col(0);       // Pointer to first column of H
    Mat p2 = pose.col(0);    // Pointer to first column of pose (empty)
    cv::normalize(p1, p2);   // Normalize the rotation and copy the column to pose

    p1 = H.col(1);           // Pointer to second column of H
    p2 = pose.col(1);        // Pointer to second column of pose (empty)
    cv::normalize(p1, p2);   // Normalize the rotation and copy the column to pose

    p1 = pose.col(0);
    p2 = pose.col(1);
    Mat p3 = p1.cross(p2);   // Cross-product of p1 and p2
    Mat c2 = pose.col(2);    // Pointer to third column of pose
    p3.copyTo(c2);           // Third column is the cross-product of columns one and two

    pose.col(3) = H.col(2) / tnorm;  // vector t [R|t] is the last column of pose
}
Source: https://stackoverflow.com/a/10781165/4382683
Now, this pose is relative to the OpenCV image, which has dimensions on the order of hundreds of pixels. But in Unity the Quad has localScale (1, 1, 1), and I don't know how to translate the pose given this difference in scale.
Also, how can a 3x4 Mat be used as a position in Unity, where position is just a Vector3 like (1, 5, 4) and rotation is also a Vector3 like (90, 180, 0)?
Any hints or a direction to follow are good enough. Thanks.
How can I calculate distance from camera to a point on a ground plane from an image?
I have the intrinsic parameters of the camera and the position (height, pitch).
Is there any OpenCV function that can estimate that distance?
You can use undistortPoints to compute the rays backprojecting the pixels, but that API is rather hard to use for your purpose. It may be easier to do the calculation "by hand" in your code. Doing it at least once will also help you understand what exactly that API is doing.
Express your "position (height, pitch)" of the camera as a rotation matrix R and a translation vector t, representing the coordinate transform from the origin of the ground plane to the camera. That is, given a point in ground plane coordinates Pg = [Xg, Yg, Zg], its coordinates in camera frame are given by
Pc = R * Pg + t
The camera center is Cc = [0, 0, 0] in camera coordinates. In ground coordinates it is then:
Cg = inv(R) * (-t) = -R' * t
where inv(R) is the inverse of R, R' is its transpose, and the last equality is due to R being an orthogonal matrix.
Let's assume, for simplicity, that the ground plane is Zg = 0.
Let K be the matrix of intrinsic parameters. Given a pixel q = [u, v], write it in homogeneous image coordinates Q = [u, v, 1]. Its location in camera coordinates is
Qc = Ki * Q
where Ki = inv(K) is the inverse of the intrinsic parameters matrix. The same point in world coordinates is then
Qg = R' * Qc + Cg
All the points Pg = [Xg, Yg, Zg] that belong to the ray from the camera center through that pixel, expressed in ground coordinates, are then on the line
Pg = Cg + lambda * (Qg - Cg)
for lambda going from 0 to positive infinity. This last formula represents three equations in ground XYZ coordinates, and you want to find the values of X, Y, Z and lambda where the ray intersects the ground plane. But that means Zg=0, so you have only 3 unknowns. Solve them (you recover lambda from the 3rd equation, then substitute in the first two), and you get Xg and Yg of the solution to your problem.
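Putting this together, a compact sketch with OpenCV types (a hypothetical helper; R, t describe the ground-to-camera transform, K is the 3x3 intrinsic matrix, and all matrices are CV_64F):

#include <opencv2/opencv.hpp>

// Distance from the camera center to the ground-plane point imaged at pixel (u, v).
double distanceToGroundPoint(const cv::Mat& R, const cv::Mat& t,
                             const cv::Mat& K, double u, double v)
{
    cv::Mat Rt = R.t();                                // transpose = inverse rotation (camera -> ground)
    cv::Mat Cg = -Rt * t;                              // camera center in ground coordinates
    cv::Mat Q  = (cv::Mat_<double>(3, 1) << u, v, 1);  // pixel in homogeneous image coordinates
    cv::Mat Qc = K.inv() * Q;                          // back-projected point in camera coordinates
    cv::Mat Qg = Rt * Qc + Cg;                         // the same point in ground coordinates
    cv::Mat d  = Qg - Cg;                              // ray direction in ground coordinates

    // Intersect the ray Pg = Cg + lambda * d with the ground plane Zg = 0
    double lambda = -Cg.at<double>(2) / d.at<double>(2);  // should be > 0 for points in front of the camera
    cv::Mat Pg = Cg + lambda * d;                         // Xg and Yg of the ground point (Zg = 0)
    cv::Mat camToPoint = Pg - Cg;
    return cv::norm(camToPoint);                          // Euclidean distance from the camera center
}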