I want to calibrate a camera in order to make real world measurements of distance between features in an image. I'm using the OpenCV demo images for this example. I'd like to know if what I'm doing is valid and/or if there is another/better way to go about it. I could use a checkerboard of known pitch to calibrate my camera. But in this example the pose of the camera is as shown in the image below. So I can't just calculate px/mm to make my real world distance measurements, I need to correct for the pose first.
I find the chessboard corners and calibrate the camera with a call to OpenCV's calibrateCamera which gives the intrinsics, extrinsics and distortion parameters.
Now, I want to be able to make measurements with the camera in this same pose. As an example I'll try and measure between two corners on the image above. I want to do a perspective correction on the image so I can effectively get a birds eye view. I do this using the method described here. The idea is that the rotation for camera pose that gets a birds eye view of this image is a rotation on the z-axis. Below is the rotation matrix.
Then following this OpenCV homography example I calculate the homography between the original view and my desired bird's eye view. If I do this on the image above I get an image that looks good, see below. Now that I have my perspective rectified image I find the corners again with findChessboardCorners and I calculate an average pixel distance between corners of ~36 pixels. If the distance between corners in rw units is 25mm I can say my px/mm scaling is 36/25 = 1.44 px/mm. Now for any image taken with the camera in this pose I can rectify the image and use this pixel scaling to measure distance between objects in the image. Is there a better way to allow for real world distance measurements here? Is it possible to do this perspective correction for pixels only? For example if I find the pixel locations of two corners in the original image can I apply the image rectification on the pixel coordinates only? Rather than on the entire image which can be computationally expensive? Just trying to deepen my understanding. Thanks
Some of my code
void MyCalibrateCamera(float squareSize, const std::string& imgPath, const Size& patternSize)
{
Mat img = imread(samples::findFile(imgPath));
Mat img_corners = img.clone(), img_pose = img.clone();
//! [find-chessboard-corners]
vector<Point2f> corners;
bool found = findChessboardCorners(img, patternSize, corners);
//! [find-chessboard-corners]
if (!found)
{
cout << "Cannot find chessboard corners." << endl;
return;
}
drawChessboardCorners(img_corners, patternSize, corners, found);
imshow("Chessboard corners detection", img_corners);
waitKey();
//! [compute-object-points]
vector<Point3f> objectPoints;
calcChessboardCorners(patternSize, squareSize, objectPoints);
vector<Point2f> objectPointsPlanar;
vector <vector <Point3f>> objectPointsArray;
vector <vector <Point2f>> imagePointsArray;
imagePointsArray.push_back(corners);
objectPointsArray.push_back(objectPoints);
Mat intrinsics;
Mat distortion;
vector <Mat> rotation;
vector <Mat> translation;
double RMSError = calibrateCamera(
objectPointsArray,
imagePointsArray,
img.size(),
intrinsics,
distortion,
rotation,
translation,
CALIB_ZERO_TANGENT_DIST |
CALIB_FIX_K3 | CALIB_FIX_K4 | CALIB_FIX_K5 |
CALIB_FIX_ASPECT_RATIO);
cout << "intrinsics: " << intrinsics << endl;
cout << "rotation: " << rotation.at(0) << endl;
cout << "translation: " << translation.at(0) << endl;
cout << "distortion: " << distortion << endl;
drawFrameAxes(img_pose, intrinsics, distortion, rotation.at(0), translation.at(0), 2 * squareSize);
imshow("FrameAxes", img_pose);
waitKey();
//todo: is this a valid px/mm measure?
float px_to_mm = intrinsics.at<double>(0, 0) / (translation.at(0).at<double>(2,0) * 1000);
//undistort the image
Mat imgUndistorted, map1, map2;
initUndistortRectifyMap(
intrinsics,
distortion,
Mat(),
getOptimalNewCameraMatrix(
intrinsics,
distortion,
img.size(),
1,
img.size(),
0),
img.size(),
CV_16SC2,
map1,
map2);
remap(
img,
imgUndistorted,
map1,
map2,
INTER_LINEAR);
imshow("OrgImg", img);
waitKey();
imshow("UndistortedImg", imgUndistorted);
waitKey();
Mat img_bird_eye_view = img.clone();
//rectify
// https://docs.opencv.org/3.4.0/d9/dab/tutorial_homography.html#tutorial_homography_Demo3
// https://stackoverflow.com/questions/48576087/birds-eye-view-perspective-transformation-from-camera-calibration-opencv-python
//Get to a birds eye view, or -90 degrees z rotation
Mat rvec = rotation.at(0);
Mat tvec = translation.at(0);
//-90 degrees z. Required depends on the frame axes.
Mat R_desired = (Mat_<double>(3, 3) <<
0, 1, 0,
-1, 0, 0,
0, 0, 1);
//Get 3x3 rotation matrix from rotation vector
Mat R;
Rodrigues(rvec, R);
//compute the normal to the camera frame
Mat normal = (Mat_<double>(3, 1) << 0, 0, 1);
Mat normal1 = R * normal;
//compute d, distance . dot product between the normal and a point on the plane.
Mat origin(3, 1, CV_64F, Scalar(0));
Mat origin1 = R * origin + tvec;
double d_inv1 = 1.0 / normal1.dot(origin1);
//compute the homography to go from the camera view to the desired view
Mat R_1to2, tvec_1to2;
Mat tvec_desired = tvec.clone();
//get the displacement between our views
computeC2MC1(R, tvec, R_desired, tvec_desired, R_1to2, tvec_1to2);
//now calculate the euclidean homography
Mat H = R_1to2 + d_inv1 * tvec_1to2 * normal1.t();
//now the projective homography
H = intrinsics * H * intrinsics.inv();
H = H / H.at<double>(2, 2);
std::cout << "H:\n" << H << std::endl;
Mat imgToWarp = imgUndistorted.clone();
warpPerspective(imgToWarp, img_bird_eye_view, H, img.size());
Mat compare;
hconcat(imgToWarp, img_bird_eye_view, compare);
imshow("Bird eye view", compare);
waitKey();
...
Related
I'm calibrating a camera using a grid of circles. The camera is in a fixed location above a table so I'm using a single image for calibration. (All the objects I’ll be working with will be flat and on the same table as my calibration image.) I'm putting the real-world locations of the circle centers into objectPoints and passing that to calibrateCamera.
Here is my calibration code (basically distilled down from the OpenCV calibration.cpp sample program to work for a single image):
int circlesPerRow = 56;
int circlesPerColumn = 32;
// The distance between circle centers is 4 cm
double centerToCenterDistance = 0.04;
Mat calibrationImage = imread(calibrationImageFileName, IMREAD_GRAYSCALE);
vector<Point2f> detectedCenters;
Size boardSize(circlesPerRow, circlesPerColumn);
bool found = findCirclesGrid(calibrationImage, boardSize, detectedCenters);
if (!found)
{
return ERR_INVALID_BOARD;
}
// Put the detected centers in the imagePoints vector
vector<vector<Point2f> > imagePoints;
imagePoints.push_back(detectedCenters);
// Set the aspect ratio to 1
Mat cameraMatrix = Mat::eye(3, 3, CV_64F);
double aspectRatio = 1.0;
cameraMatrix.at<double>(0, 0) = 1.0;
Size imageSize(calibrationImage.size());
vector<Mat> rvecs, tvecs;
Mat distCoeffs = Mat::zeros(8, 1, CV_64F);
// Create a vector of the centers in user units
vector<vector<Point3f> > objectPoints(1);
for (int i = 0; i < circlesPerColumn; i++)
for (int j = 0; j < circlesPerRow; j++)
objectPoints[0].push_back(Point3f(float(j*centerToCenterDistance), float(i*centerToCenterDistance), 0));
int flags = CALIB_FIX_ASPECT_RATIO | CALIB_FIX_K4 | CALIB_FIX_K5;
calibrateCamera(objectPoints, imagePoints, imageSize, cameraMatrix, distCoeffs, rvecs, tvecs, flags);
After calling calibrateCamera how do I calculate the number of pixels per meter on the same plane as the calibration circles in an undistorted image?
First things first, you are doing a calibration with only 1 image... it is recommended to use several images in different positions to get more accurate results, because you are calculating the intrisic parameters, if it was only the camera pose, PnP would be enough.
calibrateCamera will give you the intrinsics (camera matrix) parameters needed to project 3D points to the image plane of the camera. It will also give the Extrinsic parameters needed the origin to the camera origin (one per image given).
Once you do this calibration you can create a set of points, for example:
cv::Vec3f a(0., 0., 0.), b(1., 0., 0.);
Assuming that you are using meters in your world coordinate units, if not multiply accordingly :)
Now you have 2 options, the manual way which is apply the pin hole camera model formula to this two points, using as extrinsics the ones generated from your image that has the desired camera pose (in your case you only have one). Or you can use project points like:
// your last line
cv::calibrateCamera(objectPoints, imagePoints, imageSize, cameraMatrix, distCoeffs, rvecs, tvecs, flags);
// prepare the points
std::vector<cv::Point3f> pointsToProject{cv::Vec3f{0., 0., 0.},cv::Vec3f{0., 1., 0.}};
std::vector<cv::Point2f> projectedPoints;
// invert the extrinsic matrix
cv::Mat rotMat;
cv::rodrigues(rvecs[0], rotMat);
cv::Mat transformation = cv::Mat::eye(4,4,CV_32F);
rotMat.setTo(transformation(cv::Rect(0,0,3,3)));
transformation.at<float>(0,3) = tvecs[0][0];
transformation.at<float>(1,3) = tvecs[0][1];
transformation.at<float>(2,3) = tvecs[0][2];
transformation = transformation.inv();
// back rot and translation vectors
cv::Mat rvec, tvec(3,1,CV_32F);
cv::rodrigues(transformation(cv::Rect(0,0,3,3)), rvec);
tvec.at<float>(0) = transformation.at<float>(0,3);
tvec.at<float>(1) =transformation.at<float>(1,3);
tvec.at<float>(2) =transformation.at<float>(2,3);
cv::projectPoints(pointsToProject, rvec, tvec, cameraMatrix, distCoeffs, projectedPoints );
double amountOfPixelsPerMeter = cv::norm(projectedPoints[0]-projectedPoints[1]);
However this will give a meter distance before the extrinsics is applied, so even if it is in x axis, it may be different depending on the rotations.
I hope this helps, if not leave a comment. I wrote most of it out of my head, so it may have a typo or something.
I'm trying to build static augmented reality scene over a photo with 4 defined correspondences between coplanar points on a plane and image.
Here is a step by step flow:
User adds an image using device's camera. Let's assume it contains a rectangle captured with some perspective.
User defines physical size of the rectangle, which lies in horizontal plane (YOZ in terms of SceneKit). Let's assume it's center is world's origin (0, 0, 0), so we can easily find (x,y,z) for each corner.
User defines uv coordinates in image coordinate system for each corner of the rectangle.
SceneKit scene is created with a rectangle of the same size, and visible at the same perspective.
Other nodes can be added and moved in the scene.
I've also measured position of the iphone camera relatively to the center of the A4 paper. So for this shot the position was (0, 14, 42.5) measured in cm. Also my iPhone was slightly pitched to the table (5-10 degrees)
Using this data I've set up SCNCamera to get the desired perspective of the blue plane on the third image:
let camera = SCNCamera()
camera.xFov = 66
camera.zFar = 1000
camera.zNear = 0.01
cameraNode.camera = camera
cameraAngle = -7 * CGFloat.pi / 180
cameraNode.rotation = SCNVector4(x: 1, y: 0, z: 0, w: Float(cameraAngle))
cameraNode.position = SCNVector3(x: 0, y: 14, z: 42.5)
This will give me a reference to compare my result with.
In order to build AR with SceneKit I need to:
Adjust SCNCamera's fov, so that it matches real camera's fov.
Calculate position and rotation for camera node using 4 correnspondensies between world points (x,0,z) and image points (u, v)
H - homography; K - Intrinsic matrix; [R | t] - Extrinsic matrix
I tried two approaches in order to find transform matrix for camera: using solvePnP from OpenCV and manual calculation from homography based on 4 coplanar points.
Manual approach:
1. Find out homography
This step is done successfully, since UV coordinates of world's origin seems to be correct.
2. Intrinsic matrix
In order to get intrinsic matrix of iPhone 6, I have used this app, which gave me the following result out of 100 images of 640*480 resolution:
Assuming that input image has 4:3 aspect ratio, I can scale the above matrix depending on resolution
I am not sure but it feels like a potential problem here. I've used cv::calibrationMatrixValues to check fovx for the calculated intrinsic matrix and the result was ~50°, while it should be close to 60°.
3. Camera pose matrix
func findCameraPose(homography h: matrix_float3x3, size: CGSize) -> matrix_float4x3? {
guard let intrinsic = intrinsicMatrix(imageSize: size),
let intrinsicInverse = intrinsic.inverse else { return nil }
let l1 = 1.0 / (intrinsicInverse * h.columns.0).norm
let l2 = 1.0 / (intrinsicInverse * h.columns.1).norm
let l3 = (l1+l2)/2
let r1 = l1 * (intrinsicInverse * h.columns.0)
let r2 = l2 * (intrinsicInverse * h.columns.1)
let r3 = cross(r1, r2)
let t = l3 * (intrinsicInverse * h.columns.2)
return matrix_float4x3(columns: (r1, r2, r3, t))
}
Result:
Since I measured the approximate position and orientation for this particular image, I know the transform matrix, which would give the expected result and it is quite different:
I am also a bit conserned about 2-3 element of reference rotation matrix, which is -9.1, while it should be close to zero instead, since there is very slight rotation.
OpenCV approach:
There is a solvePnP function in OpenCV for this kind of problems, so I tried to use it instead of reinventing the wheel.
OpenCV in Objective-C++:
typedef struct CameraPose {
SCNVector4 rotationVector;
SCNVector3 translationVector;
} CameraPose;
+ (CameraPose)findCameraPose: (NSArray<NSValue *> *) objectPoints imagePoints: (NSArray<NSValue *> *) imagePoints size: (CGSize) size {
vector<Point3f> cvObjectPoints = [self convertObjectPoints:objectPoints];
vector<Point2f> cvImagePoints = [self convertImagePoints:imagePoints withSize: size];
cv::Mat distCoeffs(4,1,cv::DataType<double>::type, 0.0);
cv::Mat rvec(3,1,cv::DataType<double>::type);
cv::Mat tvec(3,1,cv::DataType<double>::type);
cv::Mat cameraMatrix = [self intrinsicMatrixWithImageSize: size];
cv::solvePnP(cvObjectPoints, cvImagePoints, cameraMatrix, distCoeffs, rvec, tvec);
SCNVector4 rotationVector = SCNVector4Make(rvec.at<double>(0), rvec.at<double>(1), rvec.at<double>(2), norm(rvec));
SCNVector3 translationVector = SCNVector3Make(tvec.at<double>(0), tvec.at<double>(1), tvec.at<double>(2));
CameraPose result = CameraPose{rotationVector, translationVector};
return result;
}
+ (vector<Point2f>) convertImagePoints: (NSArray<NSValue *> *) array withSize: (CGSize) size {
vector<Point2f> points;
for (NSValue * value in array) {
CGPoint point = [value CGPointValue];
points.push_back(Point2f(point.x - size.width/2, point.y - size.height/2));
}
return points;
}
+ (vector<Point3f>) convertObjectPoints: (NSArray<NSValue *> *) array {
vector<Point3f> points;
for (NSValue * value in array) {
CGPoint point = [value CGPointValue];
points.push_back(Point3f(point.x, 0.0, -point.y));
}
return points;
}
+ (cv::Mat) intrinsicMatrixWithImageSize: (CGSize) imageSize {
double f = 0.84 * max(imageSize.width, imageSize.height);
Mat result(3,3,cv::DataType<double>::type);
cv::setIdentity(result);
result.at<double>(0) = f;
result.at<double>(4) = f;
return result;
}
Usage in Swift:
func testSolvePnP() {
let source = modelPoints().map { NSValue(cgPoint: $0) }
let destination = perspectivePicker.currentPerspective.map { NSValue(cgPoint: $0)}
let cameraPose = CameraPoseDetector.findCameraPose(source, imagePoints: destination, size: backgroundImageView.size);
cameraNode.rotation = cameraPose.rotationVector
cameraNode.position = cameraPose.translationVector
}
Output:
The result is better but far from my expectations.
Some other things I've also tried:
This question is very similar, though I don't understand how the accepted answer is working without intrinsics.
decomposeHomographyMat also didn't give me the result I expected
I am really stuck with this issue so any help would be much appreciated.
Actually I was one step away from the working solution with OpenCV.
My problem with second approach was that I forgot to convert the output from solvePnP back to SpriteKit's coordinate system.
Note that the input (image and world points) was actually converted correctly to OpenCV coordinate system (convertObjectPoints: and convertImagePoints:withSize: methods)
So here is a fixed findCameraPose method with some comments and intermediate results printed:
+ (CameraPose)findCameraPose: (NSArray<NSValue *> *) objectPoints imagePoints: (NSArray<NSValue *> *) imagePoints size: (CGSize) size {
vector<Point3f> cvObjectPoints = [self convertObjectPoints:objectPoints];
vector<Point2f> cvImagePoints = [self convertImagePoints:imagePoints withSize: size];
std::cout << "object points: " << cvObjectPoints << std::endl;
std::cout << "image points: " << cvImagePoints << std::endl;
cv::Mat distCoeffs(4,1,cv::DataType<double>::type, 0.0);
cv::Mat rvec(3,1,cv::DataType<double>::type);
cv::Mat tvec(3,1,cv::DataType<double>::type);
cv::Mat cameraMatrix = [self intrinsicMatrixWithImageSize: size];
cv::solvePnP(cvObjectPoints, cvImagePoints, cameraMatrix, distCoeffs, rvec, tvec);
std::cout << "rvec: " << rvec << std::endl;
std::cout << "tvec: " << tvec << std::endl;
std::vector<cv::Point2f> projectedPoints;
cvObjectPoints.push_back(Point3f(0.0, 0.0, 0.0));
cv::projectPoints(cvObjectPoints, rvec, tvec, cameraMatrix, distCoeffs, projectedPoints);
for(unsigned int i = 0; i < projectedPoints.size(); ++i) {
std::cout << "Image point: " << cvImagePoints[i] << " Projected to " << projectedPoints[i] << std::endl;
}
cv::Mat RotX(3, 3, cv::DataType<double>::type);
cv::setIdentity(RotX);
RotX.at<double>(4) = -1; //cos(180) = -1
RotX.at<double>(8) = -1;
cv::Mat R;
cv::Rodrigues(rvec, R);
R = R.t(); // rotation of inverse
Mat rvecConverted;
Rodrigues(R, rvecConverted); //
std::cout << "rvec in world coords:\n" << rvecConverted << std::endl;
rvecConverted = RotX * rvecConverted;
std::cout << "rvec scenekit :\n" << rvecConverted << std::endl;
Mat tvecConverted = -R * tvec;
std::cout << "tvec in world coords:\n" << tvecConverted << std::endl;
tvecConverted = RotX * tvecConverted;
std::cout << "tvec scenekit :\n" << tvecConverted << std::endl;
SCNVector4 rotationVector = SCNVector4Make(rvecConverted.at<double>(0), rvecConverted.at<double>(1), rvecConverted.at<double>(2), norm(rvecConverted));
SCNVector3 translationVector = SCNVector3Make(tvecConverted.at<double>(0), tvecConverted.at<double>(1), tvecConverted.at<double>(2));
return CameraPose{rotationVector, translationVector};
}
Notes:
RotX matrix means rotation by 180 degrees around x axis, which will transform any vector from OpenCV coordinate system to SpriteKit's
Rodrigues method transforms rotation vector to rotation matrix (3x3) and vice versa
(In iOS) I need to determine the distance to a fiducial marker which may be rotated.
I'm using the Aruco library and I have so far implemented marker detection and pose estimation in iOS, but what I really need is the distance to that marker from the camera.
I've seen some examples where you use the real size of the marker, focal length of the camera and the size on screen of the marker to compute distance, but this doesn't take into account any rotation applied to the marker.
As I have the pose estimation working I'm "guessing" that it should be possible to un-rotate the corner points of the marker then use the bounding box of those points, along with the real size and camera focal length. Though I'm not entirely sure that is correct, or how to implement it.
This is my first time using OpenCV so I'm really just guessing at the moment.
Any and all help is very much appreciated.
Many thanks
I use the code below in my project to detect Z distance between camera and marker. I'm not sure whether it's completely correct or not, but it gives acceptable results.
Ptr<aruco::DetectorParameters> detectorParams = aruco::DetectorParameters::create();
detectorParams->doCornerRefinement = true;
Ptr<aruco::Dictionary> dictionary = aruco::getPredefinedDictionary(aruco::PREDEFINED_DICTIONARY_NAME(kMarkerDictionaryID));
Mat camMatrix, distCoeffs;
if (!readCameraParameters(kCameraParametersFile, camMatrix, distCoeffs)) {
// ...
return;
}
VideoCapture inputVideo;
inputVideo.open(kCameraID);
while (inputVideo.grab()) {
Mat image;
inputVideo.retrieve(image);
vector< int > ids;
vector< vector< Point2f > > corners;
vector< Vec3d > rvecs, tvecs;
aruco::detectMarkers(image, dictionary, corners, ids, detectorParams);
if (ids.size() > 0) {
aruco::estimatePoseSingleMarkers(corners, kMarkerLengthInMeters, camMatrix, distCoeffs, rvecs, tvecs);
for (unsigned int i = 0; i < ids.size(); i++) {
// if (ids[i] != kMarkerID)
// continue;
// Calc camera pose
Mat R;
Rodrigues(rvecs[i], R);
Mat cameraPose = -R.t() * (Mat)tvecs[i];
double x = cameraPose.at<double>(0,0);
double y = cameraPose.at<double>(0,1);
double z = cameraPose.at<double>(0,2);
// So z is the distance between camera and marker
// Or if you need rotation invariant offsets
// x = tvecs[i][0];
// y = tvecs[i][0];
// z = tvecs[i][0];
cout << "X: " << x << " Y: " << y << " Z: " << z << endl;
aruco::drawAxis(image, camMatrix, distCoeffs, rvecs[i], tvecs[i], kMarkerLengthInMeters * 0.5f);
}
aruco::drawDetectedMarkers(image, corners, resultIds);
}
resize(image, image, Size(image.cols / 2, image.rows / 2));
imshow("out", image);
char key = (char)waitKey(kWaitTimeInmS);
if (key == 27) break;
}
In part of my project I need to compute orientation of a patch in an affine transformed image. Now my problem is that I don't know how can I find this computed orientation with respect to original un-warped image.
For example a point in warped image is found(100,200). I can extract orientation of this point using 8x8 neighboring pixels. suppose it is 30 degree. the warped image in which the point has found is result of applying transformation on original image with 60 degree in each axis.(pitch,yaw and roll).(This orientation extraction is usually known as descriptor extraction in computer vision)
Now the transformation matrix is known. orientation of the point in the transformed image is known as well. The position of the point wrt reference is known(using inverse transformation). Now I want to know what is the new orientation if this point(100,200) goes to reference frame(e.g. 150,250). in another word, what is the new orientation with respect to reference image.
I know this is straight forward to solve if we just have roll angle rotation(60 degree). in this case the orientation wrt reference frame would be 30+60 = 90 .
I tried implement this using OpenCV:
cv::Mat rvec1(3,1,CV_32F); // rot vector related to B
rvec1.at<float>(0,0)=0;
rvec1.at<float>(1,0)=30*to_RAD;
rvec1.at<float>(2,0)=0;
cv::Mat rvec2(3,1,CV_32F); // rot vector related to A
rvec2.at<float>(0,0)=0;
rvec2.at<float>(1,0)=60*to_RAD;
rvec2.at<float>(2,0)=0;
cv::Mat R_A;
cv::Mat R_B;
cv::Rodrigues(rvec1, R_B);
cv::Rodrigues(rvec2, R_A);
cv::Mat R_combined= R_B*R_A;
cv::Mat rvec_result;
cv::Rodrigues(R_combined,rvec_result);
I want to create rotation Mat A and B using 2 rotation vector. and after multiplying these two I want to convert it into rotation vector.But only thing I get is a runtime error on the last line (cv::Rodrigues(R_combined,rvec_result);)
Thank you for your help in advance.
Ok it sounds like you have two rotation matrices (you can use rodrigues to get them) call them A and B. You want to know how to combine them. Lets say that A represents the orientation of your arrow wrt to the patch and B is the patch wrt to the origin. Lets start with our origin at the center of the patch. A describes the orientation of the arrow. Now we want to rotate the origin by B so that our original reference axis are now at B wrt to the new origin. To do that just do
Combined = [B]*[A];
Update
I don't know why your code would give you an error, it works perfectly for me. Here is the code I ran.
cv::Mat rvec1(3,1,CV_32F); // rot vector related to B
rvec1.at<float>(0,0)=0;
rvec1.at<float>(1,0)=30*M_PI/180;;
rvec1.at<float>(2,0)=0;
cv::Mat rvec2(3,1,CV_32F); // rot vector related to A
rvec2.at<float>(0,0)=0;
rvec2.at<float>(1,0)=60*M_PI/180;;
rvec2.at<float>(2,0)=0;
cv::Mat R_A;
cv::Mat R_B;
cv::Rodrigues(rvec1, R_B);
cv::Rodrigues(rvec2, R_A);
cv::Mat R_combined= R_B*R_A;
cv::Mat rvec_result;
cv::Rodrigues(R_combined,rvec_result);
std::cout << rvec1 << std::endl<<std::endl;
std::cout << rvec2 << std::endl<<std::endl;
std::cout << R_A << std::endl<<std::endl;
std::cout << R_B << std::endl<<std::endl;
std::cout << R_combined << std::endl<<std::endl;
std::cout << rvec_result << std::endl<<std::endl;
And here is my output
[0; 0.52359879; 0]
[0; 1.0471976; 0]
[0.49999997, 0, 0.86602545;
0, 1, 0;
-0.86602545, 0, 0.49999997]
[0.86602539, 0, 0.5;
0, 1, 0;
-0.5, 0, 0.86602539]
[-5.9604645e-08, 0, 1;
0, 1, 0;
-1, 0, -5.9604645e-08]
[0; 1.5707964; 0]
My questions are:
How do I figure out if my fundamental matrix is correct?
Is the code I posted below a good effort toward that?
My end goal is to do some sort of 3D reconstruction. Right now I'm trying to calculate the fundamental matrix so that I can estimate the difference between the two cameras. I'm doing this within openFrameworks, using the ofxCv addon, but for the most part it's just pure OpenCV. It's difficult to post code which isolates the problem since ofxCv is also in development.
My code basically reads in two 640x480 frames taken by my webcam from slightly different positions (basically just sliding the laptop a little bit horizontally). I already have a calibration matrix for it, obtained from ofxCv's calibration code, which uses findChessboardCorners. The undistortion example code seems to indicate that the calibration matrix is accurate. It calculates the optical flow between the pictures (either calcOpticalFlowPyrLK or calcOpticalFlowFarneback), and feeds those point pairs to findFundamentalMatrix.
To test if the fundamental matrix is valid, I decomposed it to a rotation and translation matrix. I then multiplied the rotation matrix by the points of the second image, to see what the rotation difference between the cameras was. I figured that any difference should be small, but I'm getting big differences.
Here's the fundamental and rotation matrix of my last code, if it helps:
fund: [-8.413948689969405e-07, -0.0001918870646474247, 0.06783422344973795;
0.0001877654679452431, 8.522397812179886e-06, 0.311671691674232;
-0.06780237856576941, -0.3177275967586101, 1]
R: [0.8081771697692786, -0.1096128431920695, -0.5786490187247098;
-0.1062963539438068, -0.9935398408215166, 0.03974506055610323;
-0.5792674230456705, 0.02938723035105822, -0.8146076621848839]
t: [0, 0.3019063882496216, -0.05799044915951077;
-0.3019063882496216, 0, -0.9515721940769112;
0.05799044915951077, 0.9515721940769112, 0]
Here's my portion of the code, which occurs after the second picture is taken:
const ofImage& image1 = images[images.size() - 2];
const ofImage& image2 = images[images.size() - 1];
std::vector<cv::Point2f> points1 = flow->getPointsPrev();
std::vector<cv::Point2f> points2 = flow->getPointsNext();
std::vector<cv::KeyPoint> keyPoints1 = convertFrom(points1);
std::vector<cv::KeyPoint> keyPoints2 = convertFrom(points2);
std::cout << "points1: " << points1.size() << std::endl;
std::cout << "points2: " << points2.size() << std::endl;
fundamentalMatrix = (cv::Mat)cv::findFundamentalMat(points1, points2);
cv::Mat cameraMatrix = (cv::Mat)calibration.getDistortedIntrinsics().getCameraMatrix();
cv::Mat cameraMatrixInv = cameraMatrix.inv();
std::cout << "fund: " << fundamentalMatrix << std::endl;
essentialMatrix = cameraMatrix.t() * fundamentalMatrix * cameraMatrix;
cv::SVD svd(essentialMatrix);
Matx33d W(0,-1,0, //HZ 9.13
1,0,0,
0,0,1);
cv::Mat_<double> R = svd.u * Mat(W).inv() * svd.vt; //HZ 9.19
std::cout << "R: " << (cv::Mat)R << std::endl;
Matx33d Z(0, -1, 0,
1, 0, 0,
0, 0, 0);
cv::Mat_<double> t = svd.vt.t() * Mat(Z) * svd.vt;
std::cout << "t: " << (cv::Mat)t << std::endl;
Vec3d tVec = Vec3d(t(1,2), t(2,0), t(0,1));
Matx34d P1 = Matx34d(R(0,0), R(0,1), R(0,2), tVec(0),
R(1,0), R(1,1), R(1,2), tVec(1),
R(2,0), R(2,1), R(2,2), tVec(2));
ofMatrix4x4 ofR(R(0,0), R(0,1), R(0,2), 0,
R(1,0), R(1,1), R(1,2), 0,
R(2,0), R(2,1), R(2,2), 0,
0, 0, 0, 1);
ofRs.push_back(ofR);
cv::Matx34d P(1,0,0,0,
0,1,0,0,
0,0,1,0);
for (int y = 0; y < image1.height; y += 10) {
for (int x = 0; x < image1.width; x += 10) {
Vec3d vec(x, y, 0);
Point3d point1(vec.val[0], vec.val[1], vec.val[2]);
Vec3d result = (cv::Mat)((cv::Mat)R * (cv::Mat)vec);
Point3d point2 = result;
mesh.addColor(image1.getColor(x, y));
mesh.addVertex(ofVec3f(point1.x, point1.y, point1.z));
mesh.addColor(image2.getColor(x, y));
mesh.addVertex(ofVec3f(point2.x, point2.y, point2.z));
}
}
Any ideas? Does my fundamental matrix look correct, or do I have the wrong idea in testing it?
If you want to find out if your Fundamental Matrix is correct, you should compute error.
Using the epipolar constraint equation, you can check how close the detected features in one image lie on the epipolar lines of the other image. Ideally, these dot products should sum to 0, and thus, the calibration error is computed as the sum of absolute distances (SAD). The mean of the SAD is reported as stereo calibration error. Basically, you are computing SAD of the computed features in image_left (could be chessboard corners) from the corresponding epipolar lines. This error is measured in pixel^2, anything below 1 is acceptable.
OpenCV has code examples, look at the Stereo Calibrate cpp file, it shows you how to compute this error.
https://code.ros.org/trac/opencv/browser/trunk/opencv/samples/c/stereo_calib.cpp?rev=2614
Look at "avgErr" Lines 260-269
Ankur
i think that you did not remove matches which are incorrect before you use then to calculate F.
Also i have an idea on how to validate F ,from x'Fx=0,you can replace several x' and x in the formula.
KyleFan
I wrote a python function to do this:
def Ferror(F,pts1,pts2): # pts are Nx3 array of homogenous coordinates.
# how well F satisfies the equation pt1 * F * pt2 == 0
vals = pts1.dot(F).dot(pts2.T)
err = np.abs(vals)
print("avg Ferror:",np.mean(err))
return np.mean(err)