Questions about the parameters T and t in the stereoCalibrate function and recoverPose function in Opnecv

Questions about the parameters T and t in the stereoCalibrate function and recoverPose function in Opnecv - opencv

I want to verify the rotation matrix and translation vector of the intrinsic matrix factorization calculated by SFM with the rotation matrix and translation vector calculated in the binocular calibration with two cameras.
`
calRealPoint(objRealPoint, boardWidth, boardHeight, frameNumber, squareSize);
cout << "cal real successful" << endl;
double rms = stereoCalibrate(objRealPoint, imagePointL, imagePointR,
cameraMatrixL, distCoeffL,
cameraMatrixR, distCoeffR,
Size(imageWidth, imageHeight), R, T, E, F/*, rvecs, rvecs*/, CALIB_USE_INTRINSIC_GUESS,
TermCriteria(TermCriteria::COUNT + TermCriteria::EPS, 100, 1e-5));
cout << "Stereo Calibration done with RMS error = " << rms << endl;
`
`
// corrected images
int pass_count = recoverPose(p1, p2, K_LEFT, noArray(), K_RIGHT, noArray(), E_, R_, t);
`
The result is that the rotation matrix is somewhat different, and the translation vector is much different.
enter image description here
enter image description here
Then I checked the API documentation, what I want to ask is what is the difference between T and t in the documentation?
Finally I used the decomposeEssentialMat function to calculate t.
`
Mat t_, R1, R2;
decomposeEssentialMat(E, R1, R2, t_);
t : [-0.9992756029282842;
0.004148058533358282;
0.03782939336967541]
T=[-30.43588350113243;
0.126341347579394;
1.152205663926915]
`
So what exactly is the difference between T and t? thank you

Check the decomposeEssentialMat specification in the OpenCV reference manual.
You'll find that it says
..., you can only get the direction of the translation, so the function returns unit t.

Related

Image perspective correction and measurement with a homography

I want to calibrate a camera in order to make real world measurements of distance between features in an image. I'm using the OpenCV demo images for this example. I'd like to know if what I'm doing is valid and/or if there is another/better way to go about it. I could use a checkerboard of known pitch to calibrate my camera. But in this example the pose of the camera is as shown in the image below. So I can't just calculate px/mm to make my real world distance measurements, I need to correct for the pose first.
I find the chessboard corners and calibrate the camera with a call to OpenCV's calibrateCamera which gives the intrinsics, extrinsics and distortion parameters.
Now, I want to be able to make measurements with the camera in this same pose. As an example I'll try and measure between two corners on the image above. I want to do a perspective correction on the image so I can effectively get a birds eye view. I do this using the method described here. The idea is that the rotation for camera pose that gets a birds eye view of this image is a rotation on the z-axis. Below is the rotation matrix.
Then following this OpenCV homography example I calculate the homography between the original view and my desired bird's eye view. If I do this on the image above I get an image that looks good, see below. Now that I have my perspective rectified image I find the corners again with findChessboardCorners and I calculate an average pixel distance between corners of ~36 pixels. If the distance between corners in rw units is 25mm I can say my px/mm scaling is 36/25 = 1.44 px/mm. Now for any image taken with the camera in this pose I can rectify the image and use this pixel scaling to measure distance between objects in the image. Is there a better way to allow for real world distance measurements here? Is it possible to do this perspective correction for pixels only? For example if I find the pixel locations of two corners in the original image can I apply the image rectification on the pixel coordinates only? Rather than on the entire image which can be computationally expensive? Just trying to deepen my understanding. Thanks
Some of my code
void MyCalibrateCamera(float squareSize, const std::string& imgPath, const Size& patternSize)
{
Mat img = imread(samples::findFile(imgPath));
Mat img_corners = img.clone(), img_pose = img.clone();
//! [find-chessboard-corners]
vector<Point2f> corners;
bool found = findChessboardCorners(img, patternSize, corners);
//! [find-chessboard-corners]
if (!found)
{
cout << "Cannot find chessboard corners." << endl;
return;
}
drawChessboardCorners(img_corners, patternSize, corners, found);
imshow("Chessboard corners detection", img_corners);
waitKey();
//! [compute-object-points]
vector<Point3f> objectPoints;
calcChessboardCorners(patternSize, squareSize, objectPoints);
vector<Point2f> objectPointsPlanar;
vector <vector <Point3f>> objectPointsArray;
vector <vector <Point2f>> imagePointsArray;
imagePointsArray.push_back(corners);
objectPointsArray.push_back(objectPoints);
Mat intrinsics;
Mat distortion;
vector <Mat> rotation;
vector <Mat> translation;
double RMSError = calibrateCamera(
objectPointsArray,
imagePointsArray,
img.size(),
intrinsics,
distortion,
rotation,
translation,
CALIB_ZERO_TANGENT_DIST |
CALIB_FIX_K3 | CALIB_FIX_K4 | CALIB_FIX_K5 |
CALIB_FIX_ASPECT_RATIO);
cout << "intrinsics: " << intrinsics << endl;
cout << "rotation: " << rotation.at(0) << endl;
cout << "translation: " << translation.at(0) << endl;
cout << "distortion: " << distortion << endl;
drawFrameAxes(img_pose, intrinsics, distortion, rotation.at(0), translation.at(0), 2 * squareSize);
imshow("FrameAxes", img_pose);
waitKey();
//todo: is this a valid px/mm measure?
float px_to_mm = intrinsics.at<double>(0, 0) / (translation.at(0).at<double>(2,0) * 1000);
//undistort the image
Mat imgUndistorted, map1, map2;
initUndistortRectifyMap(
intrinsics,
distortion,
Mat(),
getOptimalNewCameraMatrix(
intrinsics,
distortion,
img.size(),
1,
img.size(),
0),
img.size(),
CV_16SC2,
map1,
map2);
remap(
img,
imgUndistorted,
map1,
map2,
INTER_LINEAR);
imshow("OrgImg", img);
waitKey();
imshow("UndistortedImg", imgUndistorted);
waitKey();
Mat img_bird_eye_view = img.clone();
//rectify
// https://docs.opencv.org/3.4.0/d9/dab/tutorial_homography.html#tutorial_homography_Demo3
// https://stackoverflow.com/questions/48576087/birds-eye-view-perspective-transformation-from-camera-calibration-opencv-python
//Get to a birds eye view, or -90 degrees z rotation
Mat rvec = rotation.at(0);
Mat tvec = translation.at(0);
//-90 degrees z. Required depends on the frame axes.
Mat R_desired = (Mat_<double>(3, 3) <<
0, 1, 0,
-1, 0, 0,
0, 0, 1);
//Get 3x3 rotation matrix from rotation vector
Mat R;
Rodrigues(rvec, R);
//compute the normal to the camera frame
Mat normal = (Mat_<double>(3, 1) << 0, 0, 1);
Mat normal1 = R * normal;
//compute d, distance . dot product between the normal and a point on the plane.
Mat origin(3, 1, CV_64F, Scalar(0));
Mat origin1 = R * origin + tvec;
double d_inv1 = 1.0 / normal1.dot(origin1);
//compute the homography to go from the camera view to the desired view
Mat R_1to2, tvec_1to2;
Mat tvec_desired = tvec.clone();
//get the displacement between our views
computeC2MC1(R, tvec, R_desired, tvec_desired, R_1to2, tvec_1to2);
//now calculate the euclidean homography
Mat H = R_1to2 + d_inv1 * tvec_1to2 * normal1.t();
//now the projective homography
H = intrinsics * H * intrinsics.inv();
H = H / H.at<double>(2, 2);
std::cout << "H:\n" << H << std::endl;
Mat imgToWarp = imgUndistorted.clone();
warpPerspective(imgToWarp, img_bird_eye_view, H, img.size());
Mat compare;
hconcat(imgToWarp, img_bird_eye_view, compare);
imshow("Bird eye view", compare);
waitKey();
...

opencv calibrateCamera function yielding bad results

I'm trying to get opencv camera calibration working but having trouble getting it to output valid data. I have an uncalibrated camera that I would like to calibrate, but to test my code I am using an Azure Kinect camera (the color camera), since the SDK supplies the correct intrinsics for it and I can verify them. I've collected 30 images of a chessboard from slightly different angles, which I understand should be sufficient, and run the calibration function, but no matter what flags I pass in I get values for fx and fy that are pretty different from the correct fx and fy, and distortion coefficients that are WILDLY different. Am I doing something wrong? Do I need more or better data?
A sample of the images I'm using can be found here: https://www.dropbox.com/sh/9pa94uedoe5mlxz/AABisSvgWwBT-bY65lfzp2N3a?dl=0
Save them in c:\calibration_test to run the code below.
#include <filesystem>
#include <iostream>
#include <opencv2/core.hpp>
#include <opencv2/calib3d/calib3d.hpp>
#include <opencv2/features2d/features2d.hpp>
#include <opencv2/imgproc/imgproc.hpp>
#include <opencv2/imgcodecs.hpp>
using namespace std;
namespace fs = experimental::filesystem;
static bool extractCorners(cv::Mat colorImage, vector<cv::Point3f>& corners3d, vector<cv::Point2f>& corners)
{
// Each square is 20x20mm
const float kSquareSize = 0.020f;
const cv::Size boardSize(7, 9);
const cv::Point3f kCenterOffset((float)(boardSize.width - 1) * kSquareSize, (float)(boardSize.height - 1) * kSquareSize, 0.f);
cv::Mat image;
cv::cvtColor(colorImage, image, cv::COLOR_BGRA2GRAY);
int chessBoardFlags = cv::CALIB_CB_ADAPTIVE_THRESH | cv::CALIB_CB_NORMALIZE_IMAGE;
if (!cv::findChessboardCorners(image, boardSize, corners, chessBoardFlags))
{
return false;
}
cv::cornerSubPix(image, corners, cv::Size(11, 11), cv::Size(-1, -1),
cv::TermCriteria(cv::TermCriteria::EPS + cv::TermCriteria::COUNT, 30, 0.1));
// Construct the corners
for (int i = 0; i < boardSize.height; ++i)
for (int j = 0; j < boardSize.width; ++j)
corners3d.push_back(cv::Point3f(j * kSquareSize, i * kSquareSize, 0) - kCenterOffset);
return true;
}
int main()
{
vector<cv::Mat> frames;
for (const auto& p : fs::directory_iterator("c:\\calibration_test\\"))
{
frames.push_back(cv::imread(p.path().string()));
}
int numFrames = (int)frames.size();
vector<vector<cv::Point2f>> corners(numFrames);
vector<vector<cv::Point3f>> corners3d(numFrames);
int framesWithCorners = 0;
for (int i = 0; i < numFrames; ++i)
{
if (extractCorners(frames[i], corners3d[framesWithCorners], corners[framesWithCorners]))
{
++framesWithCorners;
}
}
numFrames = framesWithCorners;
corners.resize(numFrames);
corners3d.resize(numFrames);
// Camera intrinsics come from the Azure Kinect API
cv::Matx33d cameraMatrix(
914.111755f, 0.f, 960.887390f,
0.f, 913.880615f, 551.566528f,
0.f, 0.f, 1.f);
vector<float> distCoeffs = { 0.576340079f, -2.71203661f, 0.000563957903f, -0.000239689150f, 1.54344523f, 0.454746544f, -2.53860712f, 1.47272563f };
cv::Size imageSize = frames[0].size();
vector<cv::Point3d> rotations;
vector<cv::Point3d> translations;
int flags = cv::CALIB_USE_INTRINSIC_GUESS | cv::CALIB_FIX_PRINCIPAL_POINT | cv::CALIB_RATIONAL_MODEL;
double result = cv::calibrateCamera(corners3d, corners, imageSize, cameraMatrix, distCoeffs, rotations, translations,
flags);
// After this call, cameraMatrix has different values for fx and fy, and WILDLY different distortion coefficients.
cout << "fx: " << cameraMatrix(0, 0) << endl;
cout << "fy: " << cameraMatrix(1, 1) << endl;
cout << "cx: " << cameraMatrix(0, 2) << endl;
cout << "cy: " << cameraMatrix(1, 2) << endl;
for (size_t i = 0; i < distCoeffs.size(); ++i)
{
cout << "d" << i << ": " << distCoeffs[i] << endl;
}
return 0;
}
Some sample output is:
fx: 913.143
fy: 917.965
cx: 960.887
cy: 551.567
d0: 0.327596
d1: -73.1837
d2: -0.00125972
d3: 0.002805
d4: -7.93086
d5: 0.295437
d6: -73.481
d7: -3.25043
d8: 0
d9: 0
d10: 0
d11: 0
d12: 0
d13: 0
Any idea what I'm doing wrong?
Bonus question: Why do I get 14 distortion coefficients back instead of 8? If I leave off CALIB_RATIONAL_MODEL then I only get 5 (three radial and two tangential).

You need to take images from the whole field of view of the camera to correctly capture the lens distortion characteristics. The images you provide only show the chessboad in one position, slightly angled.
Ideally you should have images of the chessboard evenly distributed over the x and y axis of the image plane, right up to the edges of the image. Make sure sufficient white boarder around the board is always visible though for detection robustness.
You should also try to capture images where the chessboard is nearer to the camera and farther away, not just a uniform distance. The different angles you provide look good on the other hand.
You can find an extensive guide how to ensure good calibration results in this answer: How to verify the correctness of calibration of a webcam?
Comparing your camera matrix to the one coming from Azure Kinect API it doesn't look so bad. The principle point is pretty spot on and the focal length is in a reasonable range. If you improve the quality of the input with my tips and the SO answer I have provided the results should be even closer. Comparing sets of distortion coefficients by their distance doesn't really work that well, the error function is not convex so you can have lots of local minima that produce relatively good results but they are far from the global minimum that would yield the best results. If that explanation makes sense to you.
Regarding your bonus question: I only see 8 values filled in in the output you return, the rest is 0 so doesn't have any influence. I'm not sure if the output is expected to be different from that function.

Multiple-View Geometry

I've two images captured from two cameras of same make placed some distance apart, capturing the same scene. I want to calculate the real world rotation and translation among the two cameras. In order to achieve this, I first extracted the SIFT features of both of the images and matched them.
I now have fundamental matrix as well as homography matrix. However unable to proceed further, lots of confusion. Can anybody help me to estimate the rotation and translation in between two cameras?
I'm using OpenCV for feature extraction and matching, homography calculations.

If you have the Homography then you also have the rotation. Once you have homography it is easy to get rotation and translation matrix.
For example, if you are using OpenCV c++:
param[in] H
param[out] pose
void cameraPoseFromHomography(const Mat& H, Mat& pose)
{
pose = Mat::eye(3, 4, CV_32FC1); // 3x4 matrix, the camera pose
float norm1 = (float)norm(H.col(0));
float norm2 = (float)norm(H.col(1));
float tnorm = (norm1 + norm2) / 2.0f; // Normalization value
Mat p1 = H.col(0); // Pointer to first column of H
Mat p2 = pose.col(0); // Pointer to first column of pose (empty)
cv::normalize(p1, p2); // Normalize the rotation, and copies the column to pose
p1 = H.col(1); // Pointer to second column of H
p2 = pose.col(1); // Pointer to second column of pose (empty)
cv::normalize(p1, p2); // Normalize the rotation and copies the column to pose
p1 = pose.col(0);
p2 = pose.col(1);
Mat p3 = p1.cross(p2); // Computes the cross-product of p1 and p2
Mat c2 = pose.col(2); // Pointer to third column of pose
p3.copyTo(c2); // Third column is the crossproduct of columns one and two
pose.col(3) = H.col(2) / tnorm; //vector t [R|t] is the last column of pose
}
This function calculates de camera pose from homography, in which rotation is contained. For further theoretical info follow this thread.

Testing a fundamental matrix

My questions are:
How do I figure out if my fundamental matrix is correct?
Is the code I posted below a good effort toward that?
My end goal is to do some sort of 3D reconstruction. Right now I'm trying to calculate the fundamental matrix so that I can estimate the difference between the two cameras. I'm doing this within openFrameworks, using the ofxCv addon, but for the most part it's just pure OpenCV. It's difficult to post code which isolates the problem since ofxCv is also in development.
My code basically reads in two 640x480 frames taken by my webcam from slightly different positions (basically just sliding the laptop a little bit horizontally). I already have a calibration matrix for it, obtained from ofxCv's calibration code, which uses findChessboardCorners. The undistortion example code seems to indicate that the calibration matrix is accurate. It calculates the optical flow between the pictures (either calcOpticalFlowPyrLK or calcOpticalFlowFarneback), and feeds those point pairs to findFundamentalMatrix.
To test if the fundamental matrix is valid, I decomposed it to a rotation and translation matrix. I then multiplied the rotation matrix by the points of the second image, to see what the rotation difference between the cameras was. I figured that any difference should be small, but I'm getting big differences.
Here's the fundamental and rotation matrix of my last code, if it helps:
fund: [-8.413948689969405e-07, -0.0001918870646474247, 0.06783422344973795;
0.0001877654679452431, 8.522397812179886e-06, 0.311671691674232;
-0.06780237856576941, -0.3177275967586101, 1]
R: [0.8081771697692786, -0.1096128431920695, -0.5786490187247098;
-0.1062963539438068, -0.9935398408215166, 0.03974506055610323;
-0.5792674230456705, 0.02938723035105822, -0.8146076621848839]
t: [0, 0.3019063882496216, -0.05799044915951077;
-0.3019063882496216, 0, -0.9515721940769112;
0.05799044915951077, 0.9515721940769112, 0]
Here's my portion of the code, which occurs after the second picture is taken:
const ofImage& image1 = images[images.size() - 2];
const ofImage& image2 = images[images.size() - 1];
std::vector<cv::Point2f> points1 = flow->getPointsPrev();
std::vector<cv::Point2f> points2 = flow->getPointsNext();
std::vector<cv::KeyPoint> keyPoints1 = convertFrom(points1);
std::vector<cv::KeyPoint> keyPoints2 = convertFrom(points2);
std::cout << "points1: " << points1.size() << std::endl;
std::cout << "points2: " << points2.size() << std::endl;
fundamentalMatrix = (cv::Mat)cv::findFundamentalMat(points1, points2);
cv::Mat cameraMatrix = (cv::Mat)calibration.getDistortedIntrinsics().getCameraMatrix();
cv::Mat cameraMatrixInv = cameraMatrix.inv();
std::cout << "fund: " << fundamentalMatrix << std::endl;
essentialMatrix = cameraMatrix.t() * fundamentalMatrix * cameraMatrix;
cv::SVD svd(essentialMatrix);
Matx33d W(0,-1,0, //HZ 9.13
1,0,0,
0,0,1);
cv::Mat_<double> R = svd.u * Mat(W).inv() * svd.vt; //HZ 9.19
std::cout << "R: " << (cv::Mat)R << std::endl;
Matx33d Z(0, -1, 0,
1, 0, 0,
0, 0, 0);
cv::Mat_<double> t = svd.vt.t() * Mat(Z) * svd.vt;
std::cout << "t: " << (cv::Mat)t << std::endl;
Vec3d tVec = Vec3d(t(1,2), t(2,0), t(0,1));
Matx34d P1 = Matx34d(R(0,0), R(0,1), R(0,2), tVec(0),
R(1,0), R(1,1), R(1,2), tVec(1),
R(2,0), R(2,1), R(2,2), tVec(2));
ofMatrix4x4 ofR(R(0,0), R(0,1), R(0,2), 0,
R(1,0), R(1,1), R(1,2), 0,
R(2,0), R(2,1), R(2,2), 0,
0, 0, 0, 1);
ofRs.push_back(ofR);
cv::Matx34d P(1,0,0,0,
0,1,0,0,
0,0,1,0);
for (int y = 0; y < image1.height; y += 10) {
for (int x = 0; x < image1.width; x += 10) {
Vec3d vec(x, y, 0);
Point3d point1(vec.val[0], vec.val[1], vec.val[2]);
Vec3d result = (cv::Mat)((cv::Mat)R * (cv::Mat)vec);
Point3d point2 = result;
mesh.addColor(image1.getColor(x, y));
mesh.addVertex(ofVec3f(point1.x, point1.y, point1.z));
mesh.addColor(image2.getColor(x, y));
mesh.addVertex(ofVec3f(point2.x, point2.y, point2.z));
}
}
Any ideas? Does my fundamental matrix look correct, or do I have the wrong idea in testing it?

If you want to find out if your Fundamental Matrix is correct, you should compute error.
Using the epipolar constraint equation, you can check how close the detected features in one image lie on the epipolar lines of the other image. Ideally, these dot products should sum to 0, and thus, the calibration error is computed as the sum of absolute distances (SAD). The mean of the SAD is reported as stereo calibration error. Basically, you are computing SAD of the computed features in image_left (could be chessboard corners) from the corresponding epipolar lines. This error is measured in pixel^2, anything below 1 is acceptable.
OpenCV has code examples, look at the Stereo Calibrate cpp file, it shows you how to compute this error.
https://code.ros.org/trac/opencv/browser/trunk/opencv/samples/c/stereo_calib.cpp?rev=2614
Look at "avgErr" Lines 260-269
Ankur

i think that you did not remove matches which are incorrect before you use then to calculate F.
Also i have an idea on how to validate F ,from x'Fx=0,you can replace several x' and x in the formula.
KyleFan

I wrote a python function to do this:
def Ferror(F,pts1,pts2): # pts are Nx3 array of homogenous coordinates.
# how well F satisfies the equation pt1 * F * pt2 == 0
vals = pts1.dot(F).dot(pts2.T)
err = np.abs(vals)
print("avg Ferror:",np.mean(err))
return np.mean(err)

OpenCv pointPolygonTest. How does one supply the contours as input?

I've been trying to use the OpenCV function:
double pointPolygonTest(InputArray contour, Point2f pt, bool measureDist)
I have a contour specified by 4 Points in 2D (x1,y1), ..., (x4,y4). I want to test if a Point (x,y) is inside or outside of the contour. But I can't seem to find any reference how to specify the contours as input for the function correctly.
I've tried the following implementation without getting a correct result:
vector< Point2f > contour;
contour.push_back(Point2f(x1, y1));
contour.push_back(Point2f(x2, y2));
contour.push_back(Point2f(x3, y3));
contour.push_back(Point2f(x4, y4));
int inCont;
inCont = pointPolygonTest(contour, Point2f(x, y), false);
Am I missing something?

Function works for me without any problem (OpenCV 2.3.1):
vector<Point2f> points;
points.push_back(Point2f(0,0));
points.push_back(Point2f(0,4));
points.push_back(Point2f(4,4));
points.push_back(Point2f(4,0));
cout << pointPolygonTest(points, Point2f(5,1), false) << endl;
cout << pointPolygonTest(points, Point2f(1,1), false) << endl;
cout << pointPolygonTest(points, Point2f(0,0), false) << endl;
Output:
-1
1
0

Develop Reference

ios ruby-on-rails asp.net-mvc docker delphi jenkins grails google-sheets machine-learning dart