How do I get pixels per unit length after calling calibrateCamera? - opencv

I'm calibrating a camera using a grid of circles. The camera is in a fixed location above a table so I'm using a single image for calibration. (All the objects I’ll be working with will be flat and on the same table as my calibration image.) I'm putting the real-world locations of the circle centers into objectPoints and passing that to calibrateCamera.
Here is my calibration code (basically distilled down from the OpenCV calibration.cpp sample program to work for a single image):
int circlesPerRow = 56;
int circlesPerColumn = 32;
// The distance between circle centers is 4 cm
double centerToCenterDistance = 0.04;
Mat calibrationImage = imread(calibrationImageFileName, IMREAD_GRAYSCALE);
vector<Point2f> detectedCenters;
Size boardSize(circlesPerRow, circlesPerColumn);
bool found = findCirclesGrid(calibrationImage, boardSize, detectedCenters);
if (!found)
{
return ERR_INVALID_BOARD;
}
// Put the detected centers in the imagePoints vector
vector<vector<Point2f> > imagePoints;
imagePoints.push_back(detectedCenters);
// Set the aspect ratio to 1
Mat cameraMatrix = Mat::eye(3, 3, CV_64F);
double aspectRatio = 1.0;
cameraMatrix.at<double>(0, 0) = 1.0;
Size imageSize(calibrationImage.size());
vector<Mat> rvecs, tvecs;
Mat distCoeffs = Mat::zeros(8, 1, CV_64F);
// Create a vector of the centers in user units
vector<vector<Point3f> > objectPoints(1);
for (int i = 0; i < circlesPerColumn; i++)
for (int j = 0; j < circlesPerRow; j++)
objectPoints[0].push_back(Point3f(float(j*centerToCenterDistance), float(i*centerToCenterDistance), 0));
int flags = CALIB_FIX_ASPECT_RATIO | CALIB_FIX_K4 | CALIB_FIX_K5;
calibrateCamera(objectPoints, imagePoints, imageSize, cameraMatrix, distCoeffs, rvecs, tvecs, flags);
After calling calibrateCamera how do I calculate the number of pixels per meter on the same plane as the calibration circles in an undistorted image?

First things first: you are doing the calibration with only one image... it is recommended to use several images of the pattern in different positions to get more accurate results, because you are estimating the intrinsic parameters; if it were only the camera pose, solvePnP would be enough.
calibrateCamera will give you the intrinsic parameters (camera matrix) needed to project 3D points onto the image plane of the camera. It will also give you the extrinsic parameters needed to go from the world origin to the camera origin (one set per image given).
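As a quick reference (just a sketch of the standard pinhole model, ignoring lens distortion), projecting a world point by hand with those two sets of parameters looks like this:
// Sketch only: manual pinhole projection of a world point Pw, ignoring distortion.
// R, t are the extrinsics of one image; cameraMatrix holds fx, fy, cx, cy.
cv::Mat R;
cv::Rodrigues(rvecs[0], R);                             // rotation vector -> 3x3 matrix
cv::Mat Pw = (cv::Mat_<double>(3, 1) << 0.0, 0.0, 0.0); // a point on the calibration plane
cv::Mat Pc = R * Pw + tvecs[0];                         // world frame -> camera frame
double u = cameraMatrix.at<double>(0, 0) * Pc.at<double>(0) / Pc.at<double>(2) + cameraMatrix.at<double>(0, 2);
double v = cameraMatrix.at<double>(1, 1) * Pc.at<double>(1) / Pc.at<double>(2) + cameraMatrix.at<double>(1, 2);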
Once you do this calibration you can create a set of points, for example:
cv::Vec3f a(0., 0., 0.), b(1., 0., 0.);
Assuming that you are using meters as your world coordinate units; if not, scale accordingly :)
Now you have two options. The manual way is to apply the pinhole camera model formula to these two points (as sketched above), using as extrinsics the ones generated from the image with the desired camera pose (in your case you only have one). Or you can use projectPoints like:
// your last line
cv::calibrateCamera(objectPoints, imagePoints, imageSize, cameraMatrix, distCoeffs, rvecs, tvecs, flags);
// prepare the points
std::vector<cv::Point3f> pointsToProject{cv::Vec3f{0., 0., 0.}, cv::Vec3f{0., 1., 0.}};
std::vector<cv::Point2f> projectedPoints;
// invert the extrinsic matrix (rvecs/tvecs from calibrateCamera are CV_64F)
cv::Mat rotMat;
cv::Rodrigues(rvecs[0], rotMat);
cv::Mat transformation = cv::Mat::eye(4, 4, CV_64F);
rotMat.copyTo(transformation(cv::Rect(0, 0, 3, 3)));
transformation.at<double>(0, 3) = tvecs[0].at<double>(0);
transformation.at<double>(1, 3) = tvecs[0].at<double>(1);
transformation.at<double>(2, 3) = tvecs[0].at<double>(2);
transformation = transformation.inv();
// recover the rotation and translation vectors of the inverted transform
cv::Mat rvec, tvec(3, 1, CV_64F);
cv::Rodrigues(transformation(cv::Rect(0, 0, 3, 3)), rvec);
tvec.at<double>(0) = transformation.at<double>(0, 3);
tvec.at<double>(1) = transformation.at<double>(1, 3);
tvec.at<double>(2) = transformation.at<double>(2, 3);
cv::projectPoints(pointsToProject, rvec, tvec, cameraMatrix, distCoeffs, projectedPoints);
double amountOfPixelsPerMeter = cv::norm(projectedPoints[0] - projectedPoints[1]);
However, this gives the pixel length of a one-meter segment defined before the extrinsics are applied, so even if the segment lies along the x axis, the result can differ depending on the rotations.
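If all you need is the scale on the calibration plane itself (the z = 0 plane of your objectPoints), a simpler sketch, assuming you measure in the undistorted image, is to project two world points one meter apart with the extrinsics of that image and an empty distortion vector, then take the pixel distance:
// Sketch: pixels per meter on the z = 0 calibration plane, measured in the undistorted image.
// rvecs[0]/tvecs[0] are the extrinsics of the single calibration image; an empty
// distCoeffs means zero distortion, i.e. the ideal (undistorted) image.
std::vector<cv::Point3f> oneMeterApart{cv::Point3f(0.f, 0.f, 0.f), cv::Point3f(1.f, 0.f, 0.f)};
std::vector<cv::Point2f> px;
cv::projectPoints(oneMeterApart, rvecs[0], tvecs[0], cameraMatrix, cv::Mat(), px);
double pixelsPerMeter = cv::norm(px[0] - px[1]);
The same caveat applies: unless the camera looks at the table exactly head-on, this scale is only valid locally and along that direction.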
I hope this helps, if not leave a comment. I wrote most of it out of my head, so it may have a typo or something.

Related

Image perspective correction and measurement with a homography

I want to calibrate a camera in order to make real world measurements of distance between features in an image. I'm using the OpenCV demo images for this example. I'd like to know if what I'm doing is valid and/or if there is another/better way to go about it. I could use a checkerboard of known pitch to calibrate my camera. But in this example the pose of the camera is as shown in the image below. So I can't just calculate px/mm to make my real world distance measurements, I need to correct for the pose first.
I find the chessboard corners and calibrate the camera with a call to OpenCV's calibrateCamera which gives the intrinsics, extrinsics and distortion parameters.
Now, I want to be able to make measurements with the camera in this same pose. As an example I'll try to measure between two corners on the image above. I want to do a perspective correction on the image so I can effectively get a bird's eye view. I do this using the method described here. The idea is that the camera-pose rotation that gives a bird's eye view of this image is a rotation about the z-axis. The rotation matrix used (R_desired in the code below) is [0 1 0; -1 0 0; 0 0 1].
Then, following this OpenCV homography example, I calculate the homography between the original view and my desired bird's eye view. If I do this on the image above I get an image that looks good, see below. Now that I have my perspective-rectified image I find the corners again with findChessboardCorners and I calculate an average pixel distance between corners of ~36 pixels. If the distance between corners in real-world units is 25 mm, I can say my px/mm scaling is 36/25 = 1.44 px/mm. Now, for any image taken with the camera in this pose, I can rectify the image and use this pixel scaling to measure the distance between objects in the image.
Is there a better way to allow for real-world distance measurements here? Is it possible to do this perspective correction for pixels only? For example, if I find the pixel locations of two corners in the original image, can I apply the rectification to those pixel coordinates only, rather than to the entire image, which can be computationally expensive? Just trying to deepen my understanding. Thanks
Some of my code
void MyCalibrateCamera(float squareSize, const std::string& imgPath, const Size& patternSize)
{
Mat img = imread(samples::findFile(imgPath));
Mat img_corners = img.clone(), img_pose = img.clone();
//! [find-chessboard-corners]
vector<Point2f> corners;
bool found = findChessboardCorners(img, patternSize, corners);
//! [find-chessboard-corners]
if (!found)
{
cout << "Cannot find chessboard corners." << endl;
return;
}
drawChessboardCorners(img_corners, patternSize, corners, found);
imshow("Chessboard corners detection", img_corners);
waitKey();
//! [compute-object-points]
vector<Point3f> objectPoints;
calcChessboardCorners(patternSize, squareSize, objectPoints);
vector<Point2f> objectPointsPlanar;
vector <vector <Point3f>> objectPointsArray;
vector <vector <Point2f>> imagePointsArray;
imagePointsArray.push_back(corners);
objectPointsArray.push_back(objectPoints);
Mat intrinsics;
Mat distortion;
vector <Mat> rotation;
vector <Mat> translation;
double RMSError = calibrateCamera(
objectPointsArray,
imagePointsArray,
img.size(),
intrinsics,
distortion,
rotation,
translation,
CALIB_ZERO_TANGENT_DIST |
CALIB_FIX_K3 | CALIB_FIX_K4 | CALIB_FIX_K5 |
CALIB_FIX_ASPECT_RATIO);
cout << "intrinsics: " << intrinsics << endl;
cout << "rotation: " << rotation.at(0) << endl;
cout << "translation: " << translation.at(0) << endl;
cout << "distortion: " << distortion << endl;
drawFrameAxes(img_pose, intrinsics, distortion, rotation.at(0), translation.at(0), 2 * squareSize);
imshow("FrameAxes", img_pose);
waitKey();
//todo: is this a valid px/mm measure?
float px_to_mm = intrinsics.at<double>(0, 0) / (translation.at(0).at<double>(2,0) * 1000);
//undistort the image
Mat imgUndistorted, map1, map2;
initUndistortRectifyMap(
intrinsics,
distortion,
Mat(),
getOptimalNewCameraMatrix(
intrinsics,
distortion,
img.size(),
1,
img.size(),
0),
img.size(),
CV_16SC2,
map1,
map2);
remap(
img,
imgUndistorted,
map1,
map2,
INTER_LINEAR);
imshow("OrgImg", img);
waitKey();
imshow("UndistortedImg", imgUndistorted);
waitKey();
Mat img_bird_eye_view = img.clone();
//rectify
// https://docs.opencv.org/3.4.0/d9/dab/tutorial_homography.html#tutorial_homography_Demo3
// https://stackoverflow.com/questions/48576087/birds-eye-view-perspective-transformation-from-camera-calibration-opencv-python
//Get to a birds eye view, or -90 degrees z rotation
Mat rvec = rotation.at(0);
Mat tvec = translation.at(0);
//-90 degrees z. Required depends on the frame axes.
Mat R_desired = (Mat_<double>(3, 3) <<
0, 1, 0,
-1, 0, 0,
0, 0, 1);
//Get 3x3 rotation matrix from rotation vector
Mat R;
Rodrigues(rvec, R);
//compute the normal to the camera frame
Mat normal = (Mat_<double>(3, 1) << 0, 0, 1);
Mat normal1 = R * normal;
//compute d, the plane distance: dot product between the normal and a point on the plane.
Mat origin(3, 1, CV_64F, Scalar(0));
Mat origin1 = R * origin + tvec;
double d_inv1 = 1.0 / normal1.dot(origin1);
//compute the homography to go from the camera view to the desired view
Mat R_1to2, tvec_1to2;
Mat tvec_desired = tvec.clone();
//get the displacement between our views
computeC2MC1(R, tvec, R_desired, tvec_desired, R_1to2, tvec_1to2);
//now calculate the euclidean homography
Mat H = R_1to2 + d_inv1 * tvec_1to2 * normal1.t();
//now the projective homography
H = intrinsics * H * intrinsics.inv();
H = H / H.at<double>(2, 2);
std::cout << "H:\n" << H << std::endl;
Mat imgToWarp = imgUndistorted.clone();
warpPerspective(imgToWarp, img_bird_eye_view, H, img.size());
Mat compare;
hconcat(imgToWarp, img_bird_eye_view, compare);
imshow("Bird eye view", compare);
waitKey();
...
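Regarding the "pixels only" part of the question: a short sketch (assuming H is the homography computed above and corners are the detected chessboard corners) is to map only the point coordinates with cv::perspectiveTransform, so no full-image warp is needed just to take measurements:
// Sketch: rectify two pixel coordinates with the same homography H,
// instead of warping the whole image with warpPerspective.
vector<Point2f> originalPts = { corners[0], corners[1] };
vector<Point2f> rectifiedPts;
perspectiveTransform(originalPts, rectifiedPts, H);
double distPx = norm(rectifiedPts[0] - rectifiedPts[1]);
double distMm = distPx / 1.44;   // px/mm scale estimated above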

Pose Estimation - cv::SolvePnP with Scenekit - Coordinate System Question

I have been working on Pose Estimation (rectifying key points on a 3D model with 2D points on an image to match pose) via OpenCV's cv::solvePNP, using features / key points from Apples Vision framework.
TL;DR:
My SceneKit model is being translated and the units look correct when introspecting the translation and rotation vectors from solvePnP (i.e., they are the right order of magnitude), but the coordinate system of the translation appears off:
I am trying to understand the coordinate system requirements of solvePnP with respect to the Metal / OpenGL coordinate system and my camera projection matrix.
What 'projectionMatrix' does my SCNCamera require to match the image-based coordinate system passed into solvePnP?
Some things I've read / believe I am taking into account:
OpenCV vs OpenGL (thus Metal) have row major vs column major differences.
OpenCV's coordinate system for 3D is different than OpenGL (thus Metal).
Longer with code:
My workflow is as such:
Step 1 - use a 3D model tool to introspect points on my 3D model and get the object's vertex positions for the major key points that correspond to the 2D detected features. I am using left pupil, right pupil, tip of nose, tip of chin, left outer lip corner, right outer lip corner.
Step 2 - Run a vision request and extract a list of points in image space (converting for OpenCV's top left coordinate system) and extract the same ordered list of 2D points.
Step 3 - Construct a camera matrix by using the size of the input image.
Step 4 - run cv::solvePnP, and then use cv::Rodrigues to convert the rotation vector to a matrix
Step 5 - Convert the coordinate system of the resulting transforms into something appropriate for the GPU - invert the y and z axis and combine the translation and rotation to a single 4x4 Matrix, and then transpose it for the appropriate major ness of OpenGL / Metal
Step 6 - apply the resulting transform to Scenekit via:
let faceNodeTransform = openCVWrapper.transform(for: landmarks, imageSize: size)
self.destinationView.pointOfView?.transform = SCNMatrix4Invert(faceNodeTransform)
Below is my Obj-C++ OpenCV Wrapper which takes in a subset of Vision Landmarks and the true pixel size of the image being looked at:
// https://answers.opencv.org/question/23089/opencv-opengl-proper-camera-pose-using-solvepnp/
- (SCNMatrix4) transformFor:(VNFaceLandmarks2D*)landmarks imageSize:(CGSize)imageSize
{
// 1 convert landmarks to image points in image space (pixels) to vector of cv::Point2f's :
// Note that this translates the point coordinate system to be top left oriented for OpenCV's image coordinates:
std::vector<cv::Point2f > imagePoints = [self imagePointsForLandmarks:landmarks imageSize:imageSize];
// 2 Load Model Points
std::vector<cv::Point3f > modelPoints = [self modelPoints];
// 3 create our camera extrinsic matrix
// TODO - see if this is sane?
double max_d = fmax(imageSize.width, imageSize.height);
cv::Mat cameraMatrix = (cv::Mat_<double>(3,3) << max_d, 0, imageSize.width/2.0,
0, max_d, imageSize.height/2.0,
0, 0, 1.0);
// 4 Run solvePnP
double distanceCoef[] = {0,0,0,0};
cv::Mat distanceCoefMat = cv::Mat(1 ,4 ,CV_64FC1,distanceCoef);
// Output Matrixes
std::vector<double> rv(3);
cv::Mat rotationOut = cv::Mat(rv);
std::vector<double> tv(3);
cv::Mat translationOut = cv::Mat(tv);
cv::solvePnP(modelPoints, imagePoints, cameraMatrix, distanceCoefMat, rotationOut, translationOut, false, cv::SOLVEPNP_EPNP);
// 5 Convert rotation matrix (actually a vector)
// To a real 4x4 rotation matrix:
cv::Mat viewMatrix = cv::Mat::zeros(4, 4, CV_64FC1);
cv::Mat rotation;
cv::Rodrigues(rotationOut, rotation);
// Append our transforms to our matrix and set final to identity:
for(unsigned int row=0; row<3; ++row)
{
for(unsigned int col=0; col<3; ++col)
{
viewMatrix.at<double>(row, col) = rotation.at<double>(row, col);
}
viewMatrix.at<double>(row, 3) = translationOut.at<double>(row, 0);
}
viewMatrix.at<double>(3, 3) = 1.0f;
// Transpose OpenCV to OpenGL coords
cv::Mat cvToGl = cv::Mat::zeros(4, 4, CV_64FC1);
cvToGl.at<double>(0, 0) = 1.0f;
cvToGl.at<double>(1, 1) = -1.0f; // Invert the y axis
cvToGl.at<double>(2, 2) = -1.0f; // invert the z axis
cvToGl.at<double>(3, 3) = 1.0f;
viewMatrix = cvToGl * viewMatrix;
// Finally transpose to get correct SCN / OpenGL Matrix :
cv::Mat glViewMatrix = cv::Mat::zeros(4, 4, CV_64FC1);
cv::transpose(viewMatrix , glViewMatrix);
return [self convertCVMatToMatrix4:glViewMatrix];
}
- (SCNMatrix4) convertCVMatToMatrix4:(cv::Mat)matrix
{
SCNMatrix4 scnMatrix = SCNMatrix4Identity;
scnMatrix.m11 = matrix.at<double>(0, 0);
scnMatrix.m12 = matrix.at<double>(0, 1);
scnMatrix.m13 = matrix.at<double>(0, 2);
scnMatrix.m14 = matrix.at<double>(0, 3);
scnMatrix.m21 = matrix.at<double>(1, 0);
scnMatrix.m22 = matrix.at<double>(1, 1);
scnMatrix.m23 = matrix.at<double>(1, 2);
scnMatrix.m24 = matrix.at<double>(1, 3);
scnMatrix.m31 = matrix.at<double>(2, 0);
scnMatrix.m32 = matrix.at<double>(2, 1);
scnMatrix.m33 = matrix.at<double>(2, 2);
scnMatrix.m34 = matrix.at<double>(2, 3);
scnMatrix.m41 = matrix.at<double>(3, 0);
scnMatrix.m42 = matrix.at<double>(3, 1);
scnMatrix.m43 = matrix.at<double>(3, 2);
scnMatrix.m44 = matrix.at<double>(3, 3);
return (scnMatrix);
}
Some questions:
An SCNNode has no modelViewMatrix (just, as I understand it, a transform, which is the modelMatrix) to just throw a matrix at - so I've read that the inverse of the transform from the solvePnP process can be used to pose the camera instead, which appears to get me the closest result. I want to ensure this approach is correct.
If I have the modelViewMatrix, and the projectionMatrix, I should be able to calculate the appropriate modelMatrix? Is this the approach I should be taking?
It's unclear to me what projectionMatrix I should be using for my SceneKit scene and if that has any bearing on my results. Do I need a pixel-for-pixel exact match of my viewport to the image size, and how do I properly configure my SCNCamera to ensure coordinate system agreement for solvePnP?
Thank you very much!
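One sanity check that may help narrow this down (a sketch using the variables from the wrapper above): reproject the 3D model points with the rvec/tvec returned by solvePnP and compare against the 2D landmarks; if the reprojection error is small, the OpenCV side is consistent and the issue lies in the OpenGL/Metal conversion or in the SCNCamera projection.
// Sketch: verify the solvePnP pose by reprojecting the model points.
std::vector<cv::Point2f> reprojected;
cv::projectPoints(modelPoints, rotationOut, translationOut, cameraMatrix, distanceCoefMat, reprojected);
double meanErr = 0.0;
for (size_t i = 0; i < reprojected.size(); ++i)
    meanErr += cv::norm(reprojected[i] - imagePoints[i]);   // per-landmark pixel error
meanErr /= reprojected.size();
NSLog(@"mean reprojection error: %f px", meanErr);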

Is it possible to recognize so minimal changes between noisy images in OpenCV?

I want to detect the very minimal movement of a conveyor belt using image evaluation (Resolution: 31x512, image rate: 1000 per second.). The moment of belt-start is important for me.
If I do cv::absdiff between two subsequent images, I obtain a very noisy result:
According to the mechanical rotation sensor of the motor, the movement starts here:
I tried to threshold the abs-diff image with a cascade of erosion and dilation, but I could only detect the earliest change more than a second too late, in this image:
Is it possible to find the change earlier?
Here is the sequence of the Images without changes (according to motor sensor):
In this sequence the movement begins in the middle image:
Looks like I've found a solution which works in MY case.
Instead of comparing the image changes in the space domain, cross-correlation should be applied:
I convert both images to the DFT domain, multiply the DFT mats and convert back. The position of the maximum pixel value is the correlation peak. As long as the images are the same, the maximum stays in the same position; otherwise it moves.
The actual working code uses 3 images and 2 DFT multiplication results, between images 1 & 2 and between images 2 & 3:
Mat img1_( 512, 32, CV_16UC1 );
Mat img2_( 512, 32, CV_16UC1 );
Mat img3_( 512, 32, CV_16UC1 );
//read the data into the images however you want. I read from an MHD file
//Set ROI (if required)
Mat img1 = img1_(cv::Rect(0,200,32,100));
Mat img2 = img2_(cv::Rect(0,200,32,100));
Mat img3 = img3_(cv::Rect(0,200,32,100));
//Float mats for DFT
Mat img1f;
Mat img2f;
Mat img3f;
//DFT and product mats
Mat dft1,dft2,dft3,dftproduct,dftproduct2;
//Calculate DFT of both images
img1.convertTo(img1f, CV_32FC1);
cv::dft(img1f, dft1);
img2.convertTo(img2f, CV_32FC1);
cv::dft(img2f, dft2);
img3.convertTo(img3f, CV_32FC1);
cv::dft(img3f, dft3);
//Multiply DFT Mats
cv::mulSpectrums(dft1,dft2,dftproduct,true);
cv::mulSpectrums(dft2,dft3,dftproduct2,true);
//Convert back to space domain
cv::Mat result,result2;
cv::idft(dftproduct,result);
cv::idft(dftproduct2,result2);
//Not sure if required, I needed it for visualizing
cv::normalize( result, result, 0, 255, NORM_MINMAX, CV_8UC1);
cv::normalize( result2, result2, 0, 255, NORM_MINMAX, CV_8UC1);
//Find maxima positions
double dummy;
Point locdummy; Point maxLoc1; Point maxLoc2;
cv::minMaxLoc(result, &dummy, &dummy, &locdummy, &maxLoc1);
cv::minMaxLoc(result2, &dummy, &dummy, &locdummy, &maxLoc2);
//Calculate products simply for having one value to compare
int maxlocProd1 = maxLoc1.x*maxLoc1.y;
int maxlocProd2 = maxLoc2.x*maxLoc2.y;
//Calculate absolute difference of the products. Not 0 means movement
int absPosDiff = std::abs(maxlocProd2-maxlocProd1);
if ( absPosDiff>0 )
{
std::cout << id<< std::endl;
break;
}
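For what it's worth, OpenCV also has cv::phaseCorrelate, which wraps essentially the same DFT-based idea and returns a sub-pixel shift directly; a sketch using the float ROIs from above:
// Sketch: the same shift detection with OpenCV's built-in phase correlation.
// img1f and img2f are the CV_32FC1 ROIs converted above; a non-zero shift means movement.
cv::Point2d shift = cv::phaseCorrelate(img1f, img2f);
if (std::abs(shift.x) > 0.05 || std::abs(shift.y) > 0.05)   // small tolerance against noise
{
    std::cout << "movement detected, shift = " << shift << std::endl;
}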

OpenCV + OpenGL: proper camera pose using solvePnP

I've got a problem with obtaining a proper camera pose from the iPad camera using OpenCV.
I'm using a custom-made 2D marker (based on the ArUco library) - I want to render a 3D cube over that marker using OpenGL.
In order to receive the camera pose I'm using the solvePnP function from OpenCV.
According to THIS LINK I'm doing it like this:
cv::solvePnP(markerObjectPoints, imagePoints, [self currentCameraMatrix], _userDefaultsManager.distCoeffs, rvec, tvec);
tvec.at<double>(0, 0) *= -1; // I don't know why I have to do it, but translation in X axis is inverted
cv::Mat R;
cv::Rodrigues(rvec, R); // R is 3x3
R = R.t(); // rotation of inverse
tvec = -R * tvec; // translation of inverse
cv::Mat T(4, 4, R.type()); // T is 4x4
T(cv::Range(0, 3), cv::Range(0, 3)) = R * 1; // copies R into T
T(cv::Range(0, 3), cv::Range(3, 4)) = tvec * 1; // copies tvec into T
double *p = T.ptr<double>(3);
p[0] = p[1] = p[2] = 0;
p[3] = 1;
camera matrix & dist coefficients are coming from findChessboardCorners function, imagePoints are manually detected corners of marker (you can see them as green square in the video posted below), and markerObjectPoints are manually hardcoded points that represents marker corners:
markerObjectPoints.push_back(cv::Point3d(-6, -6, 0));
markerObjectPoints.push_back(cv::Point3d(6, -6, 0));
markerObjectPoints.push_back(cv::Point3d(6, 6, 0));
markerObjectPoints.push_back(cv::Point3d(-6, 6, 0));
Because the marker is 12 cm long in the real world, I've chosen the same size in the model for easier debugging.
As a result I'm receiving a 4x4 matrix T that I'll use as the ModelView matrix in OpenGL.
Using GLKit, the drawing function looks more or less like this:
- (void)glkView:(GLKView *)view drawInRect:(CGRect)rect {
// preparations
glClearColor(0.0, 0.0, 0.0, 0.0);
glClear(GL_COLOR_BUFFER_BIT | GL_DEPTH_BUFFER_BIT);
float aspect = fabsf(self.bounds.size.width / self.bounds.size.height);
effect.transform.projectionMatrix = GLKMatrix4MakePerspective(GLKMathDegreesToRadians(39), aspect, 0.1f, 1000.0f);
// set modelViewMatrix
float mat[16] = generateOpenGLMatFromFromOpenCVMat(T);
currentModelMatrix = GLKMatrix4MakeWithArrayAndTranspose(mat);
effect.transform.modelviewMatrix = currentModelMatrix;
[effect prepareToDraw];
glDrawArrays(GL_TRIANGLES, 0, 36); // draw previously prepared cube
}
I'm not rotating everything by 180 degrees around the X axis (as was mentioned in the previously linked article), because it doesn't look necessary.
The problem is that it doesn't work! Translation vector looks OK, but X and Y rotations are messed up :(
I've recorded a video presenting that issue:
http://www.youtube.com/watch?v=EMNBT5H7-os
I've tried almost everything (including inverting all axes one by one), but nothing actually works.
What should I do? How should I properly display that 3D cube? The translation / rotation vectors that come from solvePnP look reasonable, so I guess that I can't correctly map these vectors to OpenGL matrices.
Thanks to Djo1509 from http://answers.opencv.org/ I found out that the problem was the unnecessarily transposed rotation matrix R used as part of matrix T, and the unnecessary tvec = -R * tvec operation.
For more info look there
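In code, the fix amounts to building T directly from the solvePnP output, without the transpose and without negating the translation (a sketch based on the snippet above; the rest of the pipeline stays the same):
// Sketch: build the model-view matrix from the solvePnP output without inverting it.
cv::Mat R;
cv::Rodrigues(rvec, R);                               // use R as returned, no R.t()
cv::Mat T = cv::Mat::eye(4, 4, R.type());
R.copyTo(T(cv::Range(0, 3), cv::Range(0, 3)));        // copy R into the top-left 3x3
tvec.copyTo(T(cv::Range(0, 3), cv::Range(3, 4)));     // copy tvec into the last column
// T now maps marker coordinates into the camera frame and can be converted
// to the OpenGL ModelView matrix as before (y/z axis flip still applies).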

Having some difficulty in image stitching using OpenCV

I'm currently working on Image stitching using OpenCV 2.3.1 on Visual Studio 2010, but I'm having some trouble.
Problem Description
I'm trying to write code for stitching multiple images derived from a few cameras (about 3~4), i.e., the code should keep executing image stitching until I ask it to stop.
The following is what I've done so far:
(For simplification, I'll replace some part of the code with just a few words)
1.Reading frames(images) from 2 cameras (Currently I'm just working on 2 cameras.)
2.Feature detection, descriptor calculation (SURF)
3.Feature matching using FlannBasedMatcher
4.Removing outliers and calculate the Homography with inliers using RANSAC.
5.Warp one of both images.
For step 5., I followed the answer in the following thread and just changed some parameters:
Stitching 2 images in opencv
However, the result is terrible.
I just uploaded the result onto youtube and of course only those who have the link will be able to see it.
http://youtu.be/Oy5z_7LeaMk
My code is shown below:
(Only crucial parts are shown)
VideoCapture cam1, cam2;
cam1.open(0);
cam2.open(1);
while(1)
{
Mat frm1, frm2;
cam1 >> frm1;
cam2 >> frm2;
//(SURF detection, descriptor calculation
//and matching using FlannBasedMatcher)
double max_dist = 0; double min_dist = 100;
//-- Quick calculation of max and min distances between keypoints
for( int i = 0; i < descriptors_1.rows; i++ )
{
double dist = matches[i].distance;
if( dist < min_dist ) min_dist = dist;
if( dist > max_dist ) max_dist = dist;
}
(Draw only "good" matches
(i.e. whose distance is less than 3*min_dist ))
vector<Point2f> frame1;
vector<Point2f> frame2;
for( int i = 0; i < good_matches.size(); i++ )
{
//-- Get the keypoints from the good matches
frame1.push_back( keypoints_1[ good_matches[i].queryIdx ].pt );
frame2.push_back( keypoints_2[ good_matches[i].trainIdx ].pt );
}
Mat H = findHomography( Mat(frame1), Mat(frame2), CV_RANSAC );
cout << "Homography: " << H << endl;
/* warp the image */
Mat warpImage2;
warpPerspective(frm2, warpImage2,
H, Size(frm2.cols, frm2.rows), INTER_CUBIC);
Mat final(Size(frm2.cols*3 + frm1.cols, frm2.rows),CV_8UC3);
Mat roi1(final, Rect(frm1.cols, 0, frm1.cols, frm1.rows));
Mat roi2(final, Rect(2*frm1.cols, 0, frm2.cols, frm2.rows));
warpImage2.copyTo(roi2);
frm1.copyTo(roi1);
imshow("final", final);
What else should I do to make the stitching better?
Besides, is it reasonable to keep the homography matrix fixed instead of computing it over and over?
What I mean is to specify the angle and the displacement between the 2 cameras by myself so as to derive a Homography matrix that satisfies what I want.
Thanks. :)
It sounds like you are going about this sensibly, but if you have access to both of the cameras, and they will remain stationary with respect to each other, then calibrating offline and simply applying the transformation online will make your application more efficient.
One point to note is, you say you are using the findHomography function from OpenCV. From the documentation, this function:
Finds a perspective transformation between two planes.
However, your points are not restricted to a specific plane as they are imaging a 3D scene. If you wanted to calibrate offline, you could image a chessboard with both cameras, and the detected corners could be used in this function.
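A minimal sketch of that offline step (assumptions: both cameras see the same flat chessboard at the same time, patternSize is the number of inner corners, and frm1/frm2 are frames from the two cameras as in the question):
// Sketch: estimate one fixed homography offline from a chessboard seen by both cameras.
Size patternSize(9, 6);                        // inner corners of the board (assumption)
vector<Point2f> corners1, corners2;
bool ok1 = findChessboardCorners(frm1, patternSize, corners1);
bool ok2 = findChessboardCorners(frm2, patternSize, corners2);
if (ok1 && ok2)
{
    // maps points of frm2 into frm1's frame; store it and reuse it for every frame
    Mat H_fixed = findHomography(Mat(corners2), Mat(corners1), CV_RANSAC);
}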
Alternatively, you may like to investigate the Fundamental matrix, which can be calculated with a similar function. This matrix describes the relative position of the cameras, but some work (and a good textbook) will be required to extract the relative pose from it.
If you can find it, I would strongly recommend having a look at Part II: "Two-View Geometry" in the book "Multiple View Geometry in computer vision", by Richard Hartley and Andrew Zisserman, which goes through the process in detail.
I have been working lately on image registration. My algorithm takes two images, calculates the SURF features, finds correspondences, finds the homography matrix and then stitches both images together. I did it with the following code:
void stich(Mat base, Mat target,Mat homography, Mat& panorama){
Mat corners1(1, 4,CV_32F);
Mat corners2(1,4,CV_32F);
Mat corners(1,4,CV_32F);
vector<Mat> planes;
/* compute corners
of warped image
*/
corners1.at<float>(0,0)=0;
corners2.at<float>(0,0)=0;
corners1.at<float>(0,1)=0;
corners2.at<float>(0,1)=target.rows;
corners1.at<float>(0,2)=target.cols;
corners2.at<float>(0,2)=0;
corners1.at<float>(0,3)=target.cols;
corners2.at<float>(0,3)=target.rows;
planes.push_back(corners1);
planes.push_back(corners2);
merge(planes,corners);
perspectiveTransform(corners, corners, homography);
/* compute size of resulting
image and allocate memory
*/
double x_start = min( min( (double)corners.at<Vec2f>(0,0)[0], (double)corners.at<Vec2f> (0,1)[0]),0.0);
double x_end = max( max( (double)corners.at<Vec2f>(0,2)[0], (double)corners.at<Vec2f>(0,3)[0]), (double)base.cols);
double y_start = min( min( (double)corners.at<Vec2f>(0,0)[1], (double)corners.at<Vec2f>(0,2)[1]), 0.0);
double y_end = max( max( (double)corners.at<Vec2f>(0,1)[1], (double)corners.at<Vec2f>(0,3)[1]), (double)base.rows);
/*Creating image
with same channels, depth
as target
and proper size
*/
panorama.create(Size(x_end - x_start + 1, y_end - y_start + 1), target.depth());
planes.clear();
/*Planes should
have same n.channels
as target
*/
for (int i=0;i<target.channels();i++){
planes.push_back(panorama);
}
merge(planes,panorama);
// create translation matrix in order to copy both images to correct places
Mat T;
T=Mat::zeros(3,3,CV_64F);
T.at<double>(0,0)=1;
T.at<double>(1,1)=1;
T.at<double>(2,2)=1;
T.at<double>(0,2)=-x_start;
T.at<double>(1,2)=-y_start;
// copy base image to correct position within output image
warpPerspective(base, panorama, T,panorama.size(),INTER_LINEAR| CV_WARP_FILL_OUTLIERS);
// change homography to take necessary translation into account
gemm(T, homography,1,T,0,T);
// warp second image and copy it to output image
warpPerspective(target,panorama, T, panorama.size(),INTER_LINEAR);
//tidy
corners.release();
T.release();
}
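A usage sketch, with the caveat that stich expects a homography mapping points of the target image into the base image's frame (so, with the H computed in the question, which maps frm1 points to frm2 points, you would pass its inverse):
// Sketch: stitch the two camera frames with the helper above.
Mat panorama;
stich(frm1, frm2, H.inv(), panorama);    // base = frm1, target = frm2
imshow("panorama", panorama);
waitKey(1);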
If you have any questions, I will try to answer them.
