iPhone 6 camera calibration for OpenCV - ios

I'm developing an iOS augmented reality application using OpenCV. I'm having trouble creating the camera projection matrix so that the OpenGL overlay maps directly on top of the marker. I suspect this is because my iPhone 6 camera is not correctly calibrated for the application. I know there is OpenCV code to calibrate webcams and the like using a chessboard, but I can't find a way to calibrate the embedded iPhone camera.
Is there a way? Or are there known estimated values for the iPhone 6? I need the focal length in x and y, the principal point in x and y, and the distortion coefficient matrix.
Any help will be appreciated.
EDIT:
Deduced values are as follows (using iPhone 6, camera feed resolution 1280x720):
fx=1229
cx=360
fy=1153
cy=640
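This code provides an accurate estimate of the focal length and principal point for devices currently running iOS 9.1.
// Query the active capture format for its dimensions and field of view.
AVCaptureDeviceFormat *format = deviceInput.device.activeFormat;
CMFormatDescriptionRef fDesc = format.formatDescription;
CGSize dim = CMVideoFormatDescriptionGetPresentationDimensions(fDesc, true, true);
// Principal point: assumed to lie at the image centre.
float cx = float(dim.width) / 2.0;
float cy = float(dim.height) / 2.0;
// videoFieldOfView is the horizontal field of view, in degrees.
float HFOV = format.videoFieldOfView;
// Vertical FOV, approximated by scaling HFOV linearly with the aspect ratio.
float VFOV = ((HFOV)/cx)*cy;
// Focal lengths in pixels from the pinhole relation f = size / (2 * tan(FOV / 2)).
float fx = abs(float(dim.width) / (2 * tan(HFOV / 180 * float(M_PI) / 2)));
float fy = abs(float(dim.height) / (2 * tan(VFOV / 180 * float(M_PI) / 2)));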
NOTE:
I had an initialization issue with this code. Once the values have been initialised and set correctly, I recommend saving them to a data file and reading them back from that file.
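As a cross-check on the listing above, here is a minimal C++ sketch of the same pinhole relation, f = (size / 2) / tan(FOV / 2), written without AVFoundation. It assumes square pixels and a principal point at the image centre, so the vertical FOV follows from the tangent relation instead of linear scaling, and fx equals fy. The names Intrinsics, intrinsicsFromHorizontalFov and verticalFovDegrees are hypothetical helpers, not part of any iOS or OpenCV API.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical container for pinhole intrinsics, all in pixels.
struct Intrinsics {
    double fx, fy, cx, cy;
};

static const double kPi = 3.14159265358979323846;

// Focal length and principal point from the horizontal FOV (degrees).
// Assumption: square pixels and a centred principal point.
Intrinsics intrinsicsFromHorizontalFov(double width, double height, double hfovDegrees) {
    assert(width > 0.0 && height > 0.0);
    assert(hfovDegrees > 0.0 && hfovDegrees < 180.0);
    const double halfFov = hfovDegrees * kPi / 180.0 / 2.0;
    Intrinsics k;
    k.fx = (width / 2.0) / std::tan(halfFov);
    k.fy = k.fx;  // square pixels: same focal length on both axes
    k.cx = width / 2.0;
    k.cy = height / 2.0;
    return k;
}

// Vertical FOV (degrees) implied by the horizontal FOV under the same assumption.
double verticalFovDegrees(double width, double height, double hfovDegrees) {
    assert(width > 0.0 && height > 0.0);
    const double halfFov = hfovDegrees * kPi / 180.0 / 2.0;
    return 2.0 * std::atan(std::tan(halfFov) * height / width) * 180.0 / kPi;
}
```

For example, intrinsicsFromHorizontalFov(1280, 720, 90) gives fx = fy = 640, cx = 640 and cy = 360. A chessboard calibration would still be needed to obtain distortion coefficients, which this relation cannot provide.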

In my non-OpenCV AR application I use the field of view (FOV) of the iPhone's camera to construct the camera projection matrix. It works well enough for displaying the Sun's path overlaid on top of the camera view.
I don't know how much accuracy you need. Knowing only the FOV may not be enough for you.
The iOS API provides a way to get the camera's field of view. I get it like so:
AVCaptureDevice * camera = ...
AVCaptureDeviceFormat * format = camera.activeFormat;
float fieldOfView = format.videoFieldOfView;
After getting the FOV I compute the projection matrix:
typedef double mat4f_t[16]; // 4x4 matrix in column major order
mat4f_t projection;
createProjectionMatrix(projection,
                       GRAD_TO_RAD(fieldOfView),
                       viewSize.width/viewSize.height,
                       5.0f,
                       1000.0f);
where
void createProjectionMatrix(
    mat4f_t mout,
    float fovy,
    float aspect,
    float zNear,
    float zFar)
{
    float f = 1.0f / tanf(fovy/2.0f);
    mout[0] = f / aspect;
    mout[1] = 0.0f;
    mout[2] = 0.0f;
    mout[3] = 0.0f;
    mout[4] = 0.0f;
    mout[5] = f;
    mout[6] = 0.0f;
    mout[7] = 0.0f;
    mout[8] = 0.0f;
    mout[9] = 0.0f;
    mout[10] = (zFar+zNear) / (zNear-zFar);
    mout[11] = -1.0f;
    mout[12] = 0.0f;
    mout[13] = 0.0f;
    mout[14] = 2 * zFar * zNear / (zNear-zFar);
    mout[15] = 0.0f;
}

Related

world coordinates to camera coordinates to pixel coordinates

I am trying to project a given 3D point onto the image plane. I have posted several questions about this and many people have helped me, and I have read many related links, but the projection still doesn't work correctly for me.
I have a 3D point (-455,-150,0), where x is the depth axis, z is the upward axis and y is the horizontal one. I have roll (rotation around the front-to-back axis, x), pitch (rotation around the side-to-side axis, y) and yaw (rotation around the vertical axis, z). I also have the position of the camera, (x,y,z)=(-50,0,100). I am doing the following.
First, I go from world coordinates to camera coordinates using the extrinsic parameters:
double pi = 3.14159265358979323846;
double yp = 0.033716827630996704* pi / 180; //roll
double thet = 67.362312316894531* pi / 180; //pitch
double k = 89.7135009765625* pi / 180; //yaw
double rotxm[9] = { 1,0,0,0,cos(yp),-sin(yp),0,sin(yp),cos(yp) };
double rotym[9] = { cos(thet),0,sin(thet),0,1,0,-sin(thet),0,cos(thet) };
double rotzm[9] = { cos(k),-sin(k),0,sin(k),cos(k),0,0,0,1};
cv::Mat rotx = Mat{ 3,3,CV_64F,rotxm };
cv::Mat roty = Mat{ 3,3,CV_64F,rotym };
cv::Mat rotz = Mat{ 3,3,CV_64F,rotzm };
cv::Mat rotationm = rotz * roty * rotx; //rotation matrix
cv::Mat mpoint3(1, 3, CV_64F, { -455,-150,0 }); //the 3D point location
mpoint3 = mpoint3 * rotationm; //rotation
cv::Mat position(1, 3, CV_64F, {-50,0,100}); //the camera position
mpoint3=mpoint3 - position; //translation
Now I want to move from camera coordinates to image coordinates.
The first solution, which I read about in some sources, was:
Mat myimagepoint3 = mpoint3 * mycameraMatrix;
This didn't work.
The second solution was:
double fx = cameraMatrix.at<double>(0, 0);
double fy = cameraMatrix.at<double>(1, 1);
double cx1 = cameraMatrix.at<double>(0, 2);
double cy1= cameraMatrix.at<double>(1, 2);
double xt = mpoint3.at<double>(0) / mpoint3.at<double>(2);
double yt = mpoint3.at<double>(1) / mpoint3.at<double>(2);
double u = xt * fx + cx1;
double v = yt * fy + cy1;
but this also didn't work.
I also tried the OpenCV method cv::fisheye::projectPoints (from world to image coordinates):
Mat recv2;
cv::Rodrigues(rotationm, recv2);
//inputpoints a vector contains one point which is the 3d world coordinate of the point
//outputpoints a vector to store the output point
cv::fisheye::projectPoints(inputpoints,outputpoints,recv2,position,mycameraMatrix,mydiscoff );
but this also didn't work.
By "didn't work" I mean: I know where the point should appear in the image, but when I draw it, it is always somewhere else (not even close), and sometimes I even get negative values.
Note: there are no syntax errors or exceptions, but I may have made typos while writing the code here.
Can anyone suggest what I am doing wrong?

Using android data to stitch multiple images

I am using gyroscope data from an Android phone to stitch two images. The images are placed as if they were flipped.
Rotation matrix of the camera from the gyroscope:
double thetaOverTwo = gyroscopeRotationVelocity * dT / 2.0f;
double sinThetaOverTwo = Math.sin(thetaOverTwo);
double cosThetaOverTwo = Math.cos(thetaOverTwo);
deltaQuaternion.setY((float) (sinThetaOverTwo * axisX));
deltaQuaternion.setX(-(float) (sinThetaOverTwo * axisY));
deltaQuaternion.setZ((float) (sinThetaOverTwo * axisZ));
deltaQuaternion.setW(-(float) cosThetaOverTwo);
Matrix.setRotateM(mRotationMatrix, 0, (-(float) (2.0f * Math.acos(q.getW()) * 180.0f / Math.PI)), q.getX(), q.getY(), q.getZ());
Note: the above set of equations is used because the Android phone is always used in landscape mode.
Each time I capture an image, the rotation matrix calculated by the above code is pushed to the rot_mat vector.
Now I am using the same matrix as the camera rotation matrix, but the output looks as if the images are flipped.
Mat ref_matg = rot_mat[mid]; // mid is index of reference image
for (size_t i = 0; i < num_images; ++i){
    Mat R = rot_mat[i]*ref_matg.inv();
    R.convertTo(R, CV_32F);
    cameras[i].R = R;
    cameras[i].focal = focal;
    cameras[i].ppx = ppx;
    cameras[i].ppy = ppy;
    cameras[i].aspect = aspect;
}
However, if I use the same code but flip all the images first, the output is correct.
Can anyone explain the reason behind this?

Augmented Reality iOS application tracking issue

I am able to detect markers, identify markers and initialise OpenGL objects on screen. The issue I'm having is overlaying them on top of the marker's position in the camera world. My camera is calibrated as best I can using the method from "iPhone 6 camera calibration for OpenCV". I feel there is an issue with my camera's projection matrix, which I create as follows:
-(void)buildProjectionMatrix:(Matrix33)cameraMatrix
                            :(int)screen_width
                            :(int)screen_height
                            :(Matrix44&)projectionMatrix
{
    float near = 0.01; // Near clipping distance
    float far = 100; // Far clipping distance
    // Camera parameters
    float f_x = cameraMatrix.data[0]; // Focal length in x axis
    float f_y = cameraMatrix.data[4]; // Focal length in y axis
    float c_x = cameraMatrix.data[2]; // Camera principal point x
    float c_y = cameraMatrix.data[5]; // Camera principal point y
    std::cout<<"fx "<<f_x<<" fy "<<f_y<<" cx "<<c_x<<" cy "<<c_y<<std::endl;
    std::cout<<"width "<<screen_width<<" height "<<screen_height<<std::endl;
    projectionMatrix.data[0] = - 2.0 * f_x / screen_width;
    projectionMatrix.data[1] = 0.0;
    projectionMatrix.data[2] = 0.0;
    projectionMatrix.data[3] = 0.0;
    projectionMatrix.data[4] = 0.0;
    projectionMatrix.data[5] = 2.0 * f_y / screen_height;
    projectionMatrix.data[6] = 0.0;
    projectionMatrix.data[7] = 0.0;
    projectionMatrix.data[8] = 2.0 * c_x / screen_width - 1.0;
    projectionMatrix.data[9] = 2.0 * c_y / screen_height - 1.0;
    projectionMatrix.data[10] = -( far+near ) / ( far - near );
    projectionMatrix.data[11] = -1.0;
    projectionMatrix.data[12] = 0.0;
    projectionMatrix.data[13] = 0.0;
    projectionMatrix.data[14] = -2.0 * far * near / ( far - near );
    projectionMatrix.data[15] = 0.0;
}
This is the method to estimate the position of the marker:
void MarkerDetector::estimatePosition(std::vector<Marker>& detectedMarkers)
{
    for (size_t i=0; i<detectedMarkers.size(); i++)
    {
        Marker& m = detectedMarkers[i];
        cv::Mat Rvec;
        cv::Mat_<float> Tvec;
        cv::Mat raux,taux;
        cv::solvePnP(m_markerCorners3d, m.points, camMatrix, distCoeff,raux,taux);
        raux.convertTo(Rvec,CV_32F);
        taux.convertTo(Tvec ,CV_32F);
        cv::Mat_<float> rotMat(3,3);
        cv::Rodrigues(Rvec, rotMat);
        // Copy to transformation matrix
        for (int col=0; col<3; col++)
        {
            for (int row=0; row<3; row++)
            {
                m.transformation.r().mat[row][col] = rotMat(row,col); // Copy rotation component
            }
            m.transformation.t().data[col] = Tvec(col); // Copy translation component
        }
        // Since solvePnP finds camera location, w.r.t to marker pose, to get marker pose w.r.t to the camera we invert it.
        m.transformation = m.transformation.getInverted();
    }
}
The OpenGL shape is able to track and account for size and rotation, but something is going wrong with the translation. If the camera is turned 90 degrees, the OpenGL shape swings around 90 degrees about the centre of the marker. It's almost as if I am translating before rotating, but I am not.
See video for issue:
https://vid.me/fLvX
I suspect the problem is in how you project the 3D model points. Essentially, solvePnP gives a transformation that brings points from the model coordinate system to the camera coordinate system, composed of a rotation vector and a translation vector (the outputs of solvePnP):
cv::Mat rvec, tvec;
cv::solvePnP(objectPoints, imagePoints, cameraMatrix, distCoeffs, rvec, tvec);
At this point you are able to project model points onto the image plane:
std::vector<cv::Vec2d> imagePointsRP; // Reprojected image points
cv::projectPoints(objectPoints, rvec, tvec, cameraMatrix, distCoeffs, imagePointsRP);
Now you only need to draw the points of imagePointsRP over the incoming image; if the pose estimation was correct, you'll see the reprojected corners on top of the marker's corners.
In any case, the model-to-camera and camera-to-model matrices can be composed as below:
cv::Mat rmat;
cv::Rodrigues(rvec, rmat); // rmat is 3x3
cv::Mat modelToCam = cv::Mat::eye(4, 4, CV_64FC1);
// Multiplying by 1.0 yields a matrix expression, so the assignment writes into the submatrix.
modelToCam(cv::Range(0, 3), cv::Range(0, 3)) = rmat * 1.0;
modelToCam(cv::Range(0, 3), cv::Range(3, 4)) = tvec * 1.0;
cv::Mat camToModel = cv::Mat::eye(4, 4, CV_64FC1);
cv::Mat rmatinv = rmat.t(); // rotation of inverse
cv::Mat tvecinv = -rmatinv * tvec; // translation of inverse
camToModel(cv::Range(0, 3), cv::Range(0, 3)) = rmatinv * 1.0;
camToModel(cv::Range(0, 3), cv::Range(3, 4)) = tvecinv * 1.0;
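The inversion rule used above (rotation R^T, translation -R^T t) can be checked without OpenCV. The following is a minimal sketch, assuming R is a proper rotation matrix stored row-major; RigidTransform, apply and invert are hypothetical names introduced only for this illustration.

```cpp
#include <array>
#include <cassert>
#include <cmath>

using Vec3 = std::array<double, 3>;
using Mat3 = std::array<Vec3, 3>;  // row-major 3x3 matrix

// Hypothetical rigid transform: q = R * p + t.
struct RigidTransform {
    Mat3 R;
    Vec3 t;
};

// Applies the transform to a point.
Vec3 apply(const RigidTransform& T, const Vec3& p) {
    Vec3 q = T.t;
    for (int i = 0; i < 3; ++i) {
        for (int j = 0; j < 3; ++j) {
            q[i] += T.R[i][j] * p[j];
        }
    }
    return q;
}

// Inverse of a rigid transform: R_inv = R^T, t_inv = -R^T * t.
// Assumption: R is orthonormal; each row is checked for unit length.
RigidTransform invert(const RigidTransform& T) {
    for (int i = 0; i < 3; ++i) {
        const double n = T.R[i][0] * T.R[i][0] + T.R[i][1] * T.R[i][1] + T.R[i][2] * T.R[i][2];
        assert(std::fabs(n - 1.0) < 1e-6);
    }
    RigidTransform inv{};
    for (int i = 0; i < 3; ++i) {
        for (int j = 0; j < 3; ++j) {
            inv.R[i][j] = T.R[j][i];  // transpose
        }
    }
    for (int i = 0; i < 3; ++i) {
        inv.t[i] = 0.0;
        for (int j = 0; j < 3; ++j) {
            inv.t[i] -= inv.R[i][j] * T.t[j];
        }
    }
    return inv;
}
```

Applying a transform and then its inverse should return the original point, which is a cheap sanity test for whichever direction (model to camera, or camera to model) is being handed to the renderer.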
It is also useful to estimate the reprojection error and discard poses with a high error (remember that with n=4 points the PnP problem has a unique solution only if these points are coplanar):
double totalErr = 0.0;
for (size_t i = 0; i < imagePoints.size(); i++)
{
    double err = cv::norm(cv::Mat(imagePoints[i]), cv::Mat(imagePointsRP[i]), cv::NORM_L2);
    totalErr += err*err;
}
totalErr = std::sqrt(totalErr / imagePoints.size());

Map KinectV2 depth to rgb DSLR

I am trying to map the depth from the Kinect v2 to the RGB space of a DSLR camera, and I am stuck with a weird pixel mapping.
I am working in Processing, using OpenCV and Nicolas Burrus' method, where:
P3D.x = (x_d - cx_d) * depth(x_d,y_d) / fx_d
P3D.y = (y_d - cy_d) * depth(x_d,y_d) / fy_d
P3D.z = depth(x_d,y_d)
P3D' = R.P3D + T
P2D_rgb.x = (P3D'.x * fx_rgb / P3D'.z) + cx_rgb
P2D_rgb.y = (P3D'.y * fy_rgb / P3D'.z) + cy_rgb
Unfortunately, I have a problem when I reproject the 3D point to RGB world space. To check whether the problem came from my OpenCV calibration, I used MRPT's Kinect & stereo calibration to get the intrinsics and distortion coefficients of the cameras, and the relative roto-translation between the two cameras.
(Image: data from the MRPT stereo calibration)
Here is my data:
depth c_x = 262.573912;
depth c_y = 216.804166;
depth f_y = 462.676558;
depth f_x = 384.377033;
depthDistCoeff = {
1.975280e-001, -6.939150e-002, 0.000000e+000, -5.830770e-002, 0.000000e+000
};
DSLR c_x_R = 538.134412;
DSLR c_y_R = 359.760525;
DSLR f_y_R = 968.431461;
DSLR f_x_R = 648.480385;
rgbDistCoeff = {
2.785566e-001, -1.540991e+000, 0.000000e+000, -9.482198e-002, 0.000000e+000
};
R = {
8.4263457190597e-001, -8.9789363922252e-002, 5.3094712387890e-001,
4.4166517232817e-002, 9.9420220953803e-001, 9.8037162878270e-002,
-5.3667149820385e-001, -5.9159417476295e-002, 8.4171483671105e-001
};
T = {-4.740111e-001, 3.618596e-002, -4.443195e-002};
Then I use the data in Processing to compute the mapping using:
PVector pixelDepthCoord = new PVector(i * offset_, j * offset_);
int index = (int) pixelDepthCoord.x + (int) pixelDepthCoord.y * depthWidth;
int depth = 0;
if (rawData[index] != 255)
{
    //2D Depth Coord
    depth = rawDataDepth[index];
} else
{
}
//3D Depth Coord - Back projecting pixel depth coord to 3D depth coord
float bppx = (pixelDepthCoord.x - c_x) * depth / f_x;
float bppy = (pixelDepthCoord.y - c_y) * depth / f_y;
float bppz = -depth;
//Transform 3D depth coord to 3D color coord
float x_ = (bppx * R[0] + bppy * R[1] + bppz * R[2]) + T[0];
float y_ = (bppx * R[3] + bppy * R[4] + bppz * R[5]) + T[1];
float z_ = (bppx * R[6] + bppy * R[7] + bppz * R[8]) + T[2];
//Project 3D color coord to 2D color coord
float pcx = (x_ * f_x_R / z_) + c_x_R;
float pcy = (y_ * f_y_R / z_) + c_y_R;
Then I get the following transformations:
(Image: weird mapping behavior)
I think there is a problem in my method but I can't find it. Does anyone have any ideas or clues? I have been racking my brain over this problem for many days ;)
Thanks

What is the use of Projection matrix?

I've been trying to analyse Apple's pARk (augmented reality sample application), where I came across the function below.
The method is called with these parameters:
createProjectionMatrix(projectionTransform, 60.0f*DEGREES_TO_RADIANS, self.bounds.size.width*1.0f / self.bounds.size.height, 0.25f, 1000.0f);
void createProjectionMatrix(mat4f_t mout, float fovy, float aspect, float zNear, float zFar)
{
    float f = 1.0f / tanf(fovy/2.0f);
    mout[0] = f / aspect;
    mout[1] = 0.0f;
    mout[2] = 0.0f;
    mout[3] = 0.0f;
    mout[4] = 0.0f;
    mout[5] = f;
    mout[6] = 0.0f;
    mout[7] = 0.0f;
    mout[8] = 0.0f;
    mout[9] = 0.0f;
    mout[10] = (zFar+zNear) / (zNear-zFar);
    mout[11] = -1.0f;
    mout[12] = 0.0f;
    mout[13] = 0.0f;
    mout[14] = 2 * zFar * zNear / (zNear-zFar);
    mout[15] = 0.0f;
}
I see that this projection matrix is multiplied with the rotation matrix (obtained from the motionManager.deviceMotion API).
What is the use of the projection matrix? Why should it be multiplied with the rotation matrix?
multiplyMatrixAndMatrix(projectionCameraTransform, projectionTransform, cameraTransform);
Why does the resulting matrix have to be multiplied with the PointOfInterest vector coordinates again?
multiplyMatrixAndVector(v, projectionCameraTransform, placesOfInterestCoordinates[i]);
Appreciate any help here.
Sample code link here
In computer vision and in robotics, a typical task is to identify specific objects in an image and to determine each object's POSITION and ORIENTATION (or translation and rotation) relative to some coordinate system.
In augmented reality we normally calculate the pose of the detected object and then augment a virtual model on top of it. We can project the virtual model more realistically if we know the pose of the detected object.
The joint rotation-translation matrix [R|t] is called a matrix of extrinsic parameters. It is used to describe the camera motion around a static scene, or vice versa, the rigid motion of an object in front of a still camera. That is, [R|t] translates the coordinates of a point (X, Y, Z) to a coordinate system fixed with respect to the camera. This gives you the 6DOF pose (3 rotation and 3 translation) required for mobile AR.
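To make the role of the projection matrix concrete, here is a minimal C++ sketch of the chain the question describes: the camera transform moves a point into the camera frame, the projection matrix maps it to clip space, and dividing by w gives normalized device coordinates in the range [-1, 1]. It reuses the createProjectionMatrix layout from the question (column-major, camera looking down the negative z axis). The bodies of multiplyMatrixAndMatrix and multiplyMatrixAndVector, and the helper projectToNdc, are written for this illustration and are not copied from Apple's sample.

```cpp
#include <cassert>
#include <cmath>

typedef float mat4f_t[16];  // 4x4 matrix in column major order
typedef float vec4f_t[4];   // homogeneous point (x, y, z, w)

// Perspective matrix with the same layout as in the question (fovy in radians).
void createProjectionMatrix(mat4f_t mout, float fovy, float aspect, float zNear, float zFar) {
    assert(aspect > 0.0f && zNear > 0.0f && zFar > zNear);
    const float f = 1.0f / std::tan(fovy / 2.0f);
    for (int i = 0; i < 16; ++i) mout[i] = 0.0f;
    mout[0] = f / aspect;
    mout[5] = f;
    mout[10] = (zFar + zNear) / (zNear - zFar);
    mout[11] = -1.0f;
    mout[14] = 2 * zFar * zNear / (zNear - zFar);
}

// c = a * b for column-major matrices (c must not alias a or b).
void multiplyMatrixAndMatrix(mat4f_t c, const mat4f_t a, const mat4f_t b) {
    for (int col = 0; col < 4; ++col) {
        for (int row = 0; row < 4; ++row) {
            float sum = 0.0f;
            for (int k = 0; k < 4; ++k) sum += a[k * 4 + row] * b[col * 4 + k];
            c[col * 4 + row] = sum;
        }
    }
}

// vout = m * v for a column-major matrix.
void multiplyMatrixAndVector(vec4f_t vout, const mat4f_t m, const vec4f_t v) {
    for (int row = 0; row < 4; ++row) {
        vout[row] = m[row] * v[0] + m[4 + row] * v[1] + m[8 + row] * v[2] + m[12 + row] * v[3];
    }
}

// Hypothetical helper: applies the combined matrix and performs the perspective divide.
// Returns false when the point is behind the camera (clip-space w <= 0).
bool projectToNdc(const mat4f_t projectionCamera, const vec4f_t point, float& xNdc, float& yNdc) {
    vec4f_t clip;
    multiplyMatrixAndVector(clip, projectionCamera, point);
    if (clip[3] <= 0.0f) return false;
    xNdc = clip[0] / clip[3];
    yNdc = clip[1] / clip[3];
    return true;
}
```

Multiplying the projection by the camera rotation once gives a single matrix that takes world coordinates straight to clip space, so each point of interest then needs only one matrix-vector product. With a 90-degree vertical FOV and an aspect ratio of 1, a point at (10, 0, -10) lands on the right edge of the view (x = 1 in normalized device coordinates).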
A good resource if you want to read more: http://games.ianterrell.com/learn-the-basics-of-opengl-with-glkit-in-ios-5/
Sorry I am only working with Android AR. Hope this helps :)

Resources