Related
I am trying to project a giving 3D point to image plane, I have posted many question regarding this and many people help me, also I read many related links but still the projection doesn't work for me correctly.
I have a 3d point (-455,-150,0) where x is the depth axis and z is the upwards axis and y is the horizontal one I have roll: Rotation around the front-to-back axis (x) , pitch: Rotation around the side-to-side axis (y) and yaw:Rotation around the vertical axis (z) also I have the position on the camera (x,y,z)=(-50,0,100) so I am doing the following
first I am doing from world coordinates to camera coordinates using the extrinsic parameters:
double pi = 3.14159265358979323846;
double yp = 0.033716827630996704* pi / 180; //roll
double thet = 67.362312316894531* pi / 180; //pitch
double k = 89.7135009765625* pi / 180; //yaw
double rotxm[9] = { 1,0,0,0,cos(yp),-sin(yp),0,sin(yp),cos(yp) };
double rotym[9] = { cos(thet),0,sin(thet),0,1,0,-sin(thet),0,cos(thet) };
double rotzm[9] = { cos(k),-sin(k),0,sin(k),cos(k),0,0,0,1};
cv::Mat rotx = Mat{ 3,3,CV_64F,rotxm };
cv::Mat roty = Mat{ 3,3,CV_64F,rotym };
cv::Mat rotz = Mat{ 3,3,CV_64F,rotzm };
cv::Mat rotationm = rotz * roty * rotx; //rotation matrix
cv::Mat mpoint3(1, 3, CV_64F, { -455,-150,0 }); //the 3D point location
mpoint3 = mpoint3 * rotationm; //rotation
cv::Mat position(1, 3, CV_64F, {-50,0,100}); //the camera position
mpoint3=mpoint3 - position; //translation
and now I want to move from camera coordinates to image coordinates
the first solution was: as I read from some sources
Mat myimagepoint3 = mpoint3 * mycameraMatrix;
this didn't work
the second solution was:
double fx = cameraMatrix.at<double>(0, 0);
double fy = cameraMatrix.at<double>(1, 1);
double cx1 = cameraMatrix.at<double>(0, 2);
double cy1= cameraMatrix.at<double>(1, 2);
xt = mpoint3 .at<double>(0) / mpoint3.at<double>(2);
yt = mpoint3 .at<double>(1) / mpoint3.at<double>(2);
double u = xt * fx + cx1;
double v = yt * fy + cy1;
but also didn't work
I also tried to use opencv method fisheye::projectpoints(from world to image coordinates)
Mat recv2;
cv::Rodrigues(rotationm, recv2);
//inputpoints a vector contains one point which is the 3d world coordinate of the point
//outputpoints a vector to store the output point
cv::fisheye::projectPoints(inputpoints,outputpoints,recv2,position,mycameraMatrix,mydiscoff );
but this also didn't work
by didn't work I mean: I know (in the image) where should the point appear but when I draw it, it is always in another place (not even close) sometimes I even got a negative values
note: there is no syntax errors or exceptions but may I made typos while I am writing code here
so can any one suggest if I am doing something wrong?
Last couple of weeks I've been working on developing a simple proof-of-concept application in which a 3D model is projected over a specific Augmented Reality marker (in my case I am using Aruco markers) in IOS (with Swift and Objective-C)
I calibrated an Ipad Camera with a specific fixed lens position and used that to estimate the pose of the AR marker (which from my debug analysis seem pretty accurate). The problem seems (surprise, surprise) when I try to use SceneKit scene to project a model over the marker.
I am aware that the axis in opencv and SceneKit are different (Y and Z) and already done this correction as well as the row order/column order difference between the two libraries.
After constructing the projection matrix, I apply that same transform to the 3D model and from my debug analysis the object seems to be translated to the desired position and with the desired rotation. The problem is that it never does overlap the specific image pixel position of the marker. I am using a AVCapturePreviewVideoLayer as to put the video in background that has the same bounds as my SceneKit View.
Has anyone has a clue why this happens? I tried to play with cameras FOV's but with no real impact in the results.
Thank you all for your time.
EDIT1: I Will post some of the code here to reveal what I am currently doing.
I have two subviews inside the main view, one which is a background AVCaptureVideoPreviewLayer and another which is a SceneKitView. Both have the same bounds as the main view.
At each frame I use an opencv wrapper which outputs the pose of each marker:
std::vector<int> ids;
std::vector<std::vector<cv::Point2f>> corners, rejected;
cv::aruco::detectMarkers(frame, _dictionary, corners, ids, _detectorParams, rejected);
if (ids.size() > 0 ){
cv::aruco::drawDetectedMarkers(frame, corners, ids);
cv::Mat rvecs, tvecs;
cv::aruco::estimatePoseSingleMarkers(corners, 2.6, _intrinsicMatrix, _distCoeffs, rvecs, tvecs);
// Let's protect ourselves agains multiple markers
if (rvecs.total() > 1)
return;
_markerFound = true;
cv::Rodrigues(rvecs, _currentR);
_currentT = tvecs;
for (int row = 0; row < _currentR.rows; row++){
for (int col = 0; col < _currentR.cols; col++){
_currentExtrinsics.at<double>(row, col) = _currentR.at<double>(row, col);
}
_currentExtrinsics.at<double>(row, 3) = _currentT.at<double>(row);
}
_currentExtrinsics.at<double>(3,3) = 1;
std::cout << tvecs << std::endl;
// Convert coordinate systems of opencv to openGL (SceneKit)
// Note that in openCV z goes away the camera (in openGL goes into the camera)
// and y points down and on openGL point up
// Another note: openCV has a column order matrix representation, while SceneKit
// has a row order matrix, but we'll take care of it later.
cv::Mat cvToGl = cv::Mat::zeros(4, 4, CV_64F);
cvToGl.at<double>(0,0) = 1.0f;
cvToGl.at<double>(1,1) = -1.0f; // invert the y axis
cvToGl.at<double>(2,2) = -1.0f; // invert the z axis
cvToGl.at<double>(3,3) = 1.0f;
_currentExtrinsics = cvToGl * _currentExtrinsics;
cv::aruco::drawAxis(frame, _intrinsicMatrix, _distCoeffs, rvecs, tvecs, 5);
Then in each frame I convert the opencv matrix for a SCN4Matrix:
- (SCNMatrix4) transformToSceneKit:(cv::Mat&) openCVTransformation{
SCNMatrix4 mat = SCNMatrix4Identity;
// Transpose
openCVTransformation = openCVTransformation.t();
// copy the rotationRows
mat.m11 = (float) openCVTransformation.at<double>(0, 0);
mat.m12 = (float) openCVTransformation.at<double>(0, 1);
mat.m13 = (float) openCVTransformation.at<double>(0, 2);
mat.m14 = (float) openCVTransformation.at<double>(0, 3);
mat.m21 = (float)openCVTransformation.at<double>(1, 0);
mat.m22 = (float)openCVTransformation.at<double>(1, 1);
mat.m23 = (float)openCVTransformation.at<double>(1, 2);
mat.m24 = (float)openCVTransformation.at<double>(1, 3);
mat.m31 = (float)openCVTransformation.at<double>(2, 0);
mat.m32 = (float)openCVTransformation.at<double>(2, 1);
mat.m33 = (float)openCVTransformation.at<double>(2, 2);
mat.m34 = (float)openCVTransformation.at<double>(2, 3);
//copy the translation row
mat.m41 = (float)openCVTransformation.at<double>(3, 0);
mat.m42 = (float)openCVTransformation.at<double>(3, 1)+2.5;
mat.m43 = (float)openCVTransformation.at<double>(3, 2);
mat.m44 = (float)openCVTransformation.at<double>(3, 3);
return mat;
}
At each frame in which the AR marker is found I add a box to the scene and apply the transformation to the object node:
SCNBox *box = [SCNBox boxWithWidth:5.0 height:5.0 length:5.0 chamferRadius:0.0];
_boxNode = [SCNNode nodeWithGeometry:box];
if (found){
[self.delegate returnExtrinsicsMat:extrinsicMatrixOfTheMarker];
Mat R, T;
[self.delegate returnRotationMat:R];
[self.delegate returnTranslationMat:T];
SCNMatrix4 Transformation;
Transformation = [self transformToSceneKit:extrinsicMatrixOfTheMarker];
//_cameraNode.transform = SCNMatrix4Invert(Transformation);
[_sceneKitScene.rootNode addChildNode:_cameraNode];
//_cameraNode.camera.projectionTransform = SCNMatrix4Identity;
//_cameraNode.camera.zNear = 0.0;
_sceneKitView.pointOfView = _cameraNode;
_boxNode.transform = Transformation;
[_sceneKitScene.rootNode addChildNode:_boxNode];
//_boxNode.position = SCNVector3Make(Transformation.m41, Transformation.m42, Transformation.m43);
std::cout << (_boxNode.position.x) << " " << (_boxNode.position.y) << " " << (_boxNode.position.z) << std::endl << std::endl;
}
For example if the translation vector is (-1, 5, 20) the object appears in the scene in position (-1, -5, -20) in the scene, and the rotation is correct also. The problem is that it never appears in the correct position in the background image. I will add some images to show the result.
Does anyone know why this is happening?
Found out the solution. Instead of applying the transform to the node of the object I applied the inverted transformation matrix to the camera node. Then for the camera perspective transform matrix I applied the following matrix:
projection = SCNMatrix4Identity
projection.m11 = (2 * (float)(cameraMatrix[0])) / -(ImageWidth*0.5)
projection.m12 = (-2 * (float)(cameraMatrix[1])) / (ImageWidth*0.5)
projection.m13 = (width - (2 * Float(cameraMatrix[2]))) / (ImageWidth*0.5)
projection.m22 = (2 * (float)(cameraMatrix[4])) / (ImageHeight*0.5)
projection.m23 = (-height + (2 * (float)(cameraMatrix[5]))) / (ImageHeight*0.5)
projection.m33 = (-far - near) / (far - near)
projection.m34 = (-2 * far * near) / (far - near)
projection.m43 = -1
projection.m44 = 0
being far and near the z clipping planes.
I also had to correct the box initial position to center it on the marker.
I am able to detect markers, identify markers and initialise OpenGL objects on screen. The issue I'm having is overlaying them on top of the markers position in the camera world. My camera is calibrated best I can using this method Iphone 6 camera calibration for OpenCV. I feel there is an issue with my cameras projection matrix, I create it as follows:
-(void)buildProjectionMatrix:
(Matrix33)cameraMatrix:
(int)screen_width:
(int)screen_height:
(Matrix44&) projectionMatrix
{
float near = 0.01; // Near clipping distance
float far = 100; // Far clipping distance
// Camera parameters
float f_x = cameraMatrix.data[0]; // Focal length in x axis
float f_y = cameraMatrix.data[4]; // Focal length in y axis
float c_x = cameraMatrix.data[2]; // Camera primary point x
float c_y = cameraMatrix.data[5]; // Camera primary point y
std::cout<<"fx "<<f_x<<" fy "<<f_y<<" cx "<<c_x<<" cy "<<c_y<<std::endl;
std::cout<<"width "<<screen_width<<" height "<<screen_height<<std::endl;
projectionMatrix.data[0] = - 2.0 * f_x / screen_width;
projectionMatrix.data[1] = 0.0;
projectionMatrix.data[2] = 0.0;
projectionMatrix.data[3] = 0.0;
projectionMatrix.data[4] = 0.0;
projectionMatrix.data[5] = 2.0 * f_y / screen_height;
projectionMatrix.data[6] = 0.0;
projectionMatrix.data[7] = 0.0;
projectionMatrix.data[8] = 2.0 * c_x / screen_width - 1.0;
projectionMatrix.data[9] = 2.0 * c_y / screen_height - 1.0;
projectionMatrix.data[10] = -( far+near ) / ( far - near );
projectionMatrix.data[11] = -1.0;
projectionMatrix.data[12] = 0.0;
projectionMatrix.data[13] = 0.0;
projectionMatrix.data[14] = -2.0 * far * near / ( far - near );
projectionMatrix.data[15] = 0.0;
}
This is the method to estimate the position of the marker:
void MarkerDetector::estimatePosition(std::vector<Marker>& detectedMarkers)
{
for (size_t i=0; i<detectedMarkers.size(); i++)
{
Marker& m = detectedMarkers[i];
cv::Mat Rvec;
cv::Mat_<float> Tvec;
cv::Mat raux,taux;
cv::solvePnP(m_markerCorners3d, m.points, camMatrix, distCoeff,raux,taux);
raux.convertTo(Rvec,CV_32F);
taux.convertTo(Tvec ,CV_32F);
cv::Mat_<float> rotMat(3,3);
cv::Rodrigues(Rvec, rotMat);
// Copy to transformation matrix
for (int col=0; col<3; col++)
{
for (int row=0; row<3; row++)
{
m.transformation.r().mat[row][col] = rotMat(row,col); // Copy rotation component
}
m.transformation.t().data[col] = Tvec(col); // Copy translation component
}
// Since solvePnP finds camera location, w.r.t to marker pose, to get marker pose w.r.t to the camera we invert it.
m.transformation = m.transformation.getInverted();
}
}
The OpenGL shape is able to track and account for size and roation, but something is going wrong with the translation. If the camera is turned 90 degrees, the opengl shape swings around 90 degrees about the centre of the marker. Its almost as if I am translating before rotating, but I am not.
See video for issue:
https://vid.me/fLvX
I guess you can have some problem with projecting the 3-D modelpoints. Essentially, solvePnP gives a transformation that brings points from the model coordinate system to the camera coordinate system and this is composed of a rotation and translation vector (output of solvePnP):
cv::Mat rvec, tvec;
cv::solvePnP(objectPoints, imagePoints, cameraMatrix, distCoeffs, rvec, tvec)
At this point you are able to project model points onto the image plane
std::vector<cv::Vec2d> imagePointsRP; // Reprojected image points
cv::projectPoints(objectPoints, rvec, tvec, cameraMatrix, distCoeffs, imagePointsRP);
Now, you should only draw the points of imagePointsRP over the incoming image and if the pose estimation was correct then you'll see the reprojected corners over the corners of the marker
Anyway, the matrices of model TO camera and camera TO model direction can be composed as below:
cv::Mat rmat
cv::Rodrigues(rvec, rmat); // mRmat is 3x3
cv::Mat modelToCam = cv::Mat::eye(4, 4, CV_64FC1);
modelToCam(cv::Range(0, 3), cv::Range(0, 3)) = rmat * 1.0;
modelToCam(cv::Range(0, 3), cv::Range(3, 4)) = tvec * 1.0;
cv::Mat camToModel = cv::Mat::eye(4, 4, CV_64FC1);
cv::Mat rmatinv = rmat.t(); // rotation of inverse
cv::Mat tvecinv = -rmatinv * tvec; // translation of inverse
camToModel(cv::Range(0, 3), cv::Range(0, 3)) = rmatinv * 1.0;
camToModel(cv::Range(0, 3), cv::Range(3, 4)) = tvecinv * 1.0;
In any case, it's also useful to estimate reprojection error and discard the poses with high error (remember, the PnP problem has only unique solution if n=4 and these points are coplanar):
double totalErr = 0.0;
for (size_t i = 0; i < imagePoints.size(); i++)
{
double err = cv::norm(cv::Mat(imagePoints[i]), cv::Mat(imagePointsRP[i]), cv::NORM_L2);
totalErr += err*err;
}
totalErr = std::sqrt(totalErr / imagePoints.size());
I'm using Emgu.CV to perform some basic image manipulation and composition. My images are loaded as Image<Bgra,Byte>.
Question #1: When I use the Image<,>.Add() method, the images are always blended together, regardless of the alpha value. Instead I'd like them to be composited one atop the other, and use the included alpha channel to determine how the images should be blended. So if I call image1.Add(image2) any fully opaque pixels in image2 would completely cover the pixels from image1, while semi-transparent pixels would be blended based on the alpha value.
Here's what I'm trying to do in visual form. There's a city image with some "transparent holes" cut out, and a frog behind. This is what it should look like:
And this is what openCV produces.
How can I get this effect with OpenCV? And will it be as fast as calling Add()?
Question #2: is there a way to perform this composition in-place instead of creating a new image with each call to Add()? (e.g. image1.AddImageInPlace(image2) modifies the bytes of image1?)
NOTE: Looking for answers within Emgu.CV, which I'm using because of how well it handles perspective warping.
Before OpenCV 2.4 there was no support of PNGs with alpha channel.
To verify if your current version supports it, print the number of channels after loading an image that you are certain to be RGBA. If it supports, the application will output the number 4, else it will output number 3 (RGB). Using the C API you would do:
IplImage* t_img = cvLoadImage(argv[1], CV_LOAD_IMAGE_UNCHANGED);
if (!t_img)
{
printf("!!! Unable to load transparent image.\n");
return -1;
}
printf("Channels: %d\n", t_img->nChannels);
If you can't update OpenCV:
There are some posts around that try to bypass this limitation but I haven't tested them myself;
The easiest solution would be to use another API to load the image and blend it, check blImageBlending;
Another alternative, not as lightweight, is to use Qt.
If your version already supports PNGs with RGBA:
Take a look at Emulating photoshop’s blending modes in OpenCV. It implements several Photoshop blending modes and I imagine you are capable of converting that code to .Net.
EDIT:
I had to deal with this problem recently and I've demonstrated how to deal with it on this answer.
You'll have to iterate through each pixel. I'm assuming image 1 is the frog image, and image 2 is the city image, with image1 always being bigger than image2.
//to simulate image1.AddInPlace(image2)
int image2w = image2.Width;
int image2h = image2.Height;
int i,j;
var alpha;
for (i = 0; i < w; i++)
{
for (j = 0; j < h; j++)
{
//alpha=255 is opaque > image2 should be used
alpha = image2[3][j,i].Intensity;
image1[j, i]
= new Bgra(
image2[j, i].Blue * alpha + (image1[j, i].Blue * (255-alpha)),
image2[j, i].Green * alpha + (image1[j, i].Green * (255-alpha)),
image2[j, i].Red * alpha + (image1[j, i].Red * (255-alpha)));
}
}
Using Osiris's suggestion as a starting point, and having checked out alpha compositing on Wikipedia, i ended up with the following which worked really nicely for my purposes.
This was used this with Emgucv. I was hoping that the opencv gpu::AlphaComposite methods were available in Emgucv which I believe would have done the following for me, but alas the version I am using didn't appear to have them implemented.
static public Image<Bgra, Byte> Overlay( Image<Bgra, Byte> image1, Image<Bgra, Byte> image2 )
{
Image<Bgra, Byte> result = image1.Copy();
Image<Bgra, Byte> src = image2;
Image<Bgra, Byte> dst = image1;
int rows = result.Rows;
int cols = result.Cols;
for (int y = 0; y < rows; ++y)
{
for (int x = 0; x < cols; ++x)
{
// http://en.wikipedia.org/wiki/Alpha_compositing
double srcA = 1.0/255 * src.Data[y, x, 3];
double dstA = 1.0/255 * dst.Data[y, x, 3];
double outA = (srcA + (dstA - dstA * srcA));
result.Data[y, x, 0] = (Byte)(((src.Data[y, x, 0] * srcA) + (dst.Data[y, x, 0] * (1 - srcA))) / outA); // Blue
result.Data[y, x, 1] = (Byte)(((src.Data[y, x, 1] * srcA) + (dst.Data[y, x, 1] * (1 - srcA))) / outA); // Green
result.Data[y, x, 2] = (Byte)(((src.Data[y, x, 2] * srcA) + (dst.Data[y, x, 2] * (1 - srcA))) / outA); // Red
result.Data[y, x, 3] = (Byte)(outA*255);
}
}
return result;
}
A newer version, using emgucv methods. rather than a loop. Not sure it improves on performance.
double unit = 1.0 / 255.0;
Image[] dstS = dst.Split();
Image[] srcS = src.Split();
Image[] rs = result.Split();
Image<Gray, double> srcA = srcS[3] * unit;
Image<Gray, double> dstA = dstS[3] * unit;
Image<Gray, double> outA = srcA.Add(dstA.Sub(dstA.Mul(srcA)));// (srcA + (dstA - dstA * srcA));
// Red.
rs[0] = srcS[0].Mul(srcA).Add(dstS[0].Mul(1 - srcA)).Mul(outA.Pow(-1.0)); // Mul.Pow is divide.
rs[1] = srcS[1].Mul(srcA).Add(dstS[1].Mul(1 - srcA)).Mul(outA.Pow(-1.0));
rs[2] = srcS[2].Mul(srcA).Add(dstS[2].Mul(1 - srcA)).Mul(outA.Pow(-1.0));
rs[3] = outA.Mul(255);
// Merge image back together.
CvInvoke.cvMerge(rs[0], rs[1], rs[2], rs[3], result);
return result.Convert<Bgra, Byte>();
I found an interesting blog post on internet, which I think is related to what you are trying to do.
Please have a look at the Creating Overlays Method (archive.org link). You can use this idea to implement your own function to add two images in the way you mentioned above, making some particular areas in the image transparent while leaving the rest as it is.
I use OpenCV to undestort set of points after camera calibration.
The code follows.
const int npoints = 2; // number of point specified
// Points initialization.
// Only 2 ponts in this example, in real code they are read from file.
float input_points[npoints][2] = {{0,0}, {2560, 1920}};
CvMat * src = cvCreateMat(1, npoints, CV_32FC2);
CvMat * dst = cvCreateMat(1, npoints, CV_32FC2);
// fill src matrix
float * src_ptr = (float*)src->data.ptr;
for (int pi = 0; pi < npoints; ++pi) {
for (int ci = 0; ci < 2; ++ci) {
*(src_ptr + pi * 2 + ci) = input_points[pi][ci];
}
}
cvUndistortPoints(src, dst, &camera1, &distCoeffs1);
After the code above dst contains following numbers:
-8.82689655e-001 -7.05507338e-001 4.16228324e-001 3.04863811e-001
which are too small in comparison with numbers in src.
At the same time if I undistort image via the call:
cvUndistort2( srcImage, dstImage, &camera1, &dist_coeffs1 );
I receive good undistorted image which means that pixel coordinates are not modified so drastically in comparison with separate points.
How to obtain the same undistortion for specific points as for images?
Thanks.
The points should be "unnormalized" using camera matrix.
More specifically, after call of cvUndistortPoints following transformation should be also added:
double fx = CV_MAT_ELEM(camera1, double, 0, 0);
double fy = CV_MAT_ELEM(camera1, double, 1, 1);
double cx = CV_MAT_ELEM(camera1, double, 0, 2);
double cy = CV_MAT_ELEM(camera1, double, 1, 2);
float * dst_ptr = (float*)dst->data.ptr;
for (int pi = 0; pi < npoints; ++pi) {
float& px = *(dst_ptr + pi * 2);
float& py = *(dst_ptr + pi * 2 + 1);
// perform transformation.
// In fact this is equivalent to multiplication to camera matrix
px = px * fx + cx;
py = py * fy + cy;
}
More info on camera matrix at OpenCV 'Camera Calibration and 3D Reconstruction'
UPDATE:
Following C++ function call should work as well:
std::vector<cv::Point2f> inputDistortedPoints = ...
std::vector<cv::Point2f> outputUndistortedPoints;
cv::Mat cameraMatrix = ...
cv::Mat distCoeffs = ...
cv::undistortPoints(inputDistortedPoints, outputUndistortedPoints, cameraMatrix, distCoeffs, cv::noArray(), cameraMatrix);
It may be your matrix size :)
OpenCV expects a vector of points - a column or a row matrix with two channels. But because your input matrix is only 2 pts, and the number of channels is also 1, it cannot figure out what's the input, row or colum.
So, fill a longer input mat with bogus values, and keep only the first:
const int npoints = 4; // number of point specified
// Points initialization.
// Only 2 ponts in this example, in real code they are read from file.
float input_points[npoints][4] = {{0,0}, {2560, 1920}}; // the rest will be set to 0
CvMat * src = cvCreateMat(1, npoints, CV_32FC2);
CvMat * dst = cvCreateMat(1, npoints, CV_32FC2);
// fill src matrix
float * src_ptr = (float*)src->data.ptr;
for (int pi = 0; pi < npoints; ++pi) {
for (int ci = 0; ci < 2; ++ci) {
*(src_ptr + pi * 2 + ci) = input_points[pi][ci];
}
}
cvUndistortPoints(src, dst, &camera1, &distCoeffs1);
EDIT
While OpenCV specifies undistortPoints accept only 2-channel input, actually, it accepts
1-column, 2-channel, multi-row mat or (and this case is not documented)
2 column, multi-row, 1-channel mat or
multi-column, 1 row, 2-channel mat
(as seen in undistort.cpp, line 390)
But a bug inside (or lack of available info), makes it wrongly mix the second one with the third one, when the number of columns is 2. So, your data is considered a 2-column, 2-row, 1-channel.
I also reach this problems, and I take some time to research an finally understand.
Formula
You see the formula above, in the open system, distort operation is before camera matrix, so the process order is:
image_distorted ->camera_matrix -> un-distort function->camera_matrix->back to image_undistorted.
So you need a small fix to and camera1 again.
Mat eye3 = Mat::eye(3, 3, CV_64F);
cvUndistortPoints(src, dst, &camera1, &distCoeffs1, &eye3,&camera1);
Otherwise, if the last two parameters is empty, It would be project to a Normalized image coordinate.
See codes: opencv-3.4.0-src\modules\imgproc\src\undistort.cpp :297
cvUndistortPointsInternal()