Homography matrix decomposition into rotation matrix and translation vector - opencv

I'm working on an augmented reality application for Android using OpenCV 2.4.4 and have a problem with homography decomposition.
As we know, the homography matrix is defined as H=A.[R t], where A is the intrinsic camera matrix, R is the rotation matrix and t is the translation vector.
I want to estimate the viewing direction of the camera from pictures, as well as the orientation of the camera in 3D space.
I can estimate the homography matrix with the OpenCV function findHomography, and I think it works.
Here is how I do it:
static Mat mFindHomography(MatOfKeyPoint keypoints1, MatOfKeyPoint keypoints2, MatOfDMatch matches){
    List<Point> lp1 = new ArrayList<Point>(500);
    List<Point> lp2 = new ArrayList<Point>(500);
    KeyPoint[] k1 = keypoints1.toArray();
    KeyPoint[] k2 = keypoints2.toArray();
    List<DMatch> matchesList = matches.toList();
    if (matchesList.size() < 4){
        MatOfDMatch mat = new MatOfDMatch();
        return mat;
    }
    // Add matched keypoints to new lists to apply homography
    for(DMatch match : matchesList){
        Point kk1 = k1[match.queryIdx].pt;
        Point kk2 = k2[match.trainIdx].pt;
        lp1.add(kk1);
        lp2.add(kk2);
    }
    MatOfPoint2f srcPoints = new MatOfPoint2f(lp1.toArray(new Point[0]));
    MatOfPoint2f dstPoints = new MatOfPoint2f(lp2.toArray(new Point[0]));
    Mat mask = new Mat();
    Mat homography = Calib3d.findHomography(srcPoints, dstPoints, Calib3d.RANSAC, 10, mask); // Finds a perspective transformation between two planes. ---Calib3d.LMEDS Least-Median robust method
    // Keep only the matches that RANSAC marked as inliers
    List<DMatch> matches_homo = new ArrayList<DMatch>();
    int size = (int) mask.size().height;
    for(int i = 0; i < size; i++){
        if ( mask.get(i, 0)[0] == 1){
            DMatch d = matchesList.get(i);
            matches_homo.add(d);
        }
    }
    MatOfDMatch mat = new MatOfDMatch();
    mat.fromList(matches_homo);
    matchesFilterdByRansac = (int) mat.size().height;
    return homography;
}
After that, I want to decompose this homography matrix and compute Euler angles. Since H=A.[R t], I multiply the homography matrix by the inverse of the camera intrinsic matrix: H.A^{-1} = [R t]. Then I want to decompose [R t] into rotation and translation and compute Euler angles from the rotation matrix. But it doesn't work. What is wrong here?
if(!homography.empty()){ // estimate pose from homography
Mat intrinsics = Mat.zeros(3, 3, CvType.CV_32FC1); // camera intrinsic matrix
intrinsics.put(0, 0, 890);
intrinsics.put(0, 2, 400);
intrinsics.put(1, 1, 890);
intrinsics.put(1, 2, 240);
intrinsics.put(2, 2, 1);
// Inverse Matrix from Wolframalpha
double[] inverseIntrinsics = { 0.001020408, 0 , -0.408163265,
0, 0.0011235955, -0.26966292,
0, 0 , 1 };
// cross multiplication
double[] rotationTranslation = matrixMultiply3X3(homography, inverseIntrinsics);
Mat pose = Mat.eye(3, 4, CvType.CV_32FC1); // 3x4 matrix, the camera pose
float norm1 = (float) Core.norm(rotationTranslation.col(0));
float norm2 = (float) Core.norm(rotationTranslation.col(1));
float tnorm = (norm1 + norm2) / 2.0f; // Normalization value ---test: float tnorm = (float) h.get(2, 2)[0];// not worked
Mat normalizedTemp = new Mat();
Core.normalize(rotationTranslation.col(0), normalizedTemp);
normalizedTemp.convertTo(normalizedTemp, CvType.CV_32FC1);
normalizedTemp.copyTo(pose.col(0)); // Normalize the rotation, and copies the column to pose
Core.normalize(rotationTranslation.col(1), normalizedTemp);
normalizedTemp.convertTo(normalizedTemp, CvType.CV_32FC1);
normalizedTemp.copyTo(pose.col(1));// Normalize the rotation and copies the column to pose
Mat p3 = pose.col(0).cross(pose.col(1)); // Computes the cross-product of p1 and p2
p3.copyTo(pose.col(2));// Third column is the crossproduct of columns one and two
double[] buffer = new double[3];
rotationTranslation.col(2).get(0, 0, buffer);
pose.put(0, 3, buffer[0] / tnorm); //vector t [R|t] is the last column of pose
pose.put(1, 3, buffer[1] / tnorm);
pose.put(2, 3, buffer[2] / tnorm);
float[] rotationMatrix = new float[9];
rotationMatrix = getArrayFromMat(pose);
float[] eulerOrientation = new float[3];
SensorManager.getOrientation(rotationMatrix, eulerOrientation);
// Convert radian to degree
double yaw = (double) (eulerOrientation[0]) * (180 / Math.PI); // * -57;
double pitch = (double) (eulerOrientation[1]) * (180 / Math.PI);
double roll = (double) (eulerOrientation[2]) * (180 / Math.PI);}
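For comparison, here is a minimal sketch in Python (plain lists, no OpenCV) of the usual planar decomposition. It assumes the homography maps metric coordinates on a plane at Z = 0 to pixels, so that H is proportional to A.[r1 r2 t]; under that assumption the inverse intrinsic matrix multiplies H from the left (A^{-1}.H), not from the right. A homography between two camera images has a different form, which is the case decomposeHomographyMat addresses. The function name and the sign convention (plane in front of the camera) are illustrative assumptions, and with noisy data the recovered R is only approximately orthonormal.

```python
import math

def pose_from_planar_homography(H, K):
    """Recover R (3x3) and t (3) from H ~ K [r1 r2 t] for a plane at Z = 0."""
    fx, fy, cx, cy = K[0][0], K[1][1], K[0][2], K[1][2]
    # Closed-form inverse of a zero-skew intrinsic matrix.
    K_inv = [[1.0 / fx, 0.0, -cx / fx],
             [0.0, 1.0 / fy, -cy / fy],
             [0.0, 0.0, 1.0]]
    # M = K^-1 * H: the inverse goes on the LEFT of H.
    M = [[sum(K_inv[i][k] * H[k][j] for k in range(3)) for j in range(3)]
         for i in range(3)]
    m1, m2, m3 = ([M[i][j] for i in range(3)] for j in range(3))
    norm = lambda v: math.sqrt(sum(x * x for x in v))
    # The first two columns are rotation columns scaled by a common factor.
    scale = (norm(m1) + norm(m2)) / 2.0
    if m3[2] < 0:
        scale = -scale  # assumed convention: plane in front of the camera (t_z > 0)
    r1 = [x / scale for x in m1]
    r2 = [x / scale for x in m2]
    t = [x / scale for x in m3]
    # Third rotation column is the cross product of the first two.
    r3 = [r1[1] * r2[2] - r1[2] * r2[1],
          r1[2] * r2[0] - r1[0] * r2[2],
          r1[0] * r2[1] - r1[1] * r2[0]]
    R = [[r1[i], r2[i], r3[i]] for i in range(3)]
    return R, t
```

If the two normalized columns are far from perpendicular, that is usually a sign that the intrinsics or the multiplication order do not match the data.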
Note that OpenCV 3.0 has a homography decomposition function (here), but I'm using OpenCV 2.4.4 for Android. Is there a Java wrapper for it?
The second problem is decomposing the rotation matrix into Euler angles. Is there any problem with:
float[] eulerOrientation = new float[3];
SensorManager.getOrientation(rotationMatrix, eulerOrientation);
I used this link too, but the result was no better:
double pitch = Math.atan2(pose.get(2, 1)[0], pose.get(2, 2)[0]);
double roll = Math.atan2(-1*pose.get(2, 0)[0], Math.sqrt( Math.pow(pose.get(2, 1)[0], 2) + Math.pow(pose.get(2, 2)[0], 2)) );
double yaw = Math.atan2(pose.get(1, 0)[0], pose.get(0, 0)[0]);
Thanks a lot for any response

I hope this answer will help those looking for a solution today.
My answer uses C++ and OpenCV 2.4.9. I copied the decomposeHomographyMat function from OpenCV 3.0 and, after computing the homography, use the copied function to decompose it. To filter the results and select the correct one of the 4 decompositions, check my answer here.
To obtain Euler angles from the rotation matrix, you can refer to this. I am able to get good results with this method.
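As an illustration only, the conversion can be sketched in Python. This assumes the rotation is composed as R = Rz(az).Ry(ay).Rx(ax), which is the convention behind the atan2 formulas quoted in the question; the linked method may use a different convention, and the function names here are hypothetical.

```python
import math

def rotation_zyx(ax, ay, az):
    """Build R = Rz(az) * Ry(ay) * Rx(ax) as a nested list (angles in radians)."""
    cx, sx = math.cos(ax), math.sin(ax)
    cy, sy = math.cos(ay), math.sin(ay)
    cz, sz = math.cos(az), math.sin(az)
    return [[cz * cy, cz * sy * sx - sz * cx, cz * sy * cx + sz * sx],
            [sz * cy, sz * sy * sx + cz * cx, sz * sy * cx - cz * sx],
            [-sy, cy * sx, cy * cx]]

def euler_zyx_from_rotation(R):
    """Return (ax, ay, az) in radians for R = Rz(az) * Ry(ay) * Rx(ax)."""
    ax = math.atan2(R[2][1], R[2][2])
    ay = math.atan2(-R[2][0], math.hypot(R[2][1], R[2][2]))
    az = math.atan2(R[1][0], R[0][0])
    return ax, ay, az
```

Near ay = ±90 degrees the x and z angles are no longer separately determined (gimbal lock), so results close to that pose should be treated with care.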


Pose Estimation - cv::SolvePnP with Scenekit - Coordinate System Question

I have been working on pose estimation (rectifying key points on a 3D model with 2D points on an image to match pose) via OpenCV's cv::solvePnP, using features / key points from Apple's Vision framework.
My SceneKit model is being translated, and the units look correct when introspecting the translation and rotation vectors from solvePnP (i.e., they are the right order of magnitude), but the coordinate system of the translation appears off.
I am trying to understand the coordinate system requirements of solvePnP with respect to the Metal / OpenGL coordinate system and my camera projection matrix.
What 'projectionMatrix' does my SCNCamera require to match the image-based coordinate system passed into solvePnP?
Some things I've read / believe I am taking into account:
OpenCV vs OpenGL (thus Metal) have row major vs column major differences.
OpenCV's coordinate system for 3D is different than OpenGL (thus Metal).
Longer with code:
My workflow is as such:
Step 1 - use a 3D model tool to introspect points on my 3D model and get the objects vertex positions for the major key points in the 2D detected features. I am using left pupil, right pupil, tip of nose, tip of chin, left outer lip corner, right outer lip corner.
Step 2 - Run a vision request and extract a list of points in image space (converting for OpenCV's top left coordinate system) and extract the same ordered list of 2D points.
Step 3 - Construct a camera matrix by using the size of the input image.
Step 4 - run cv::solvePnP, and then use cv::Rodrigues to convert the rotation vector to a matrix
Step 5 - Convert the coordinate system of the resulting transforms into something appropriate for the GPU - invert the y and z axes, combine the translation and rotation into a single 4x4 matrix, and then transpose it to match the row/column-major order of OpenGL / Metal
Step 6 - apply the resulting transform to Scenekit via:
let faceNodeTransform = openCVWrapper.transform(for: landmarks, imageSize: size)
self.destinationView.pointOfView?.transform = SCNMatrix4Invert(faceNodeTransform)
Below is my Obj-C++ OpenCV Wrapper which takes in a subset of Vision Landmarks and the true pixel size of the image being looked at:
// https://answers.opencv.org/question/23089/opencv-opengl-proper-camera-pose-using-solvepnp/
- (SCNMatrix4) transformFor:(VNFaceLandmarks2D*)landmarks imageSize:(CGSize)imageSize
// 1 convert landmarks to image points in image space (pixels) to vector of cv::Point2f's :
// Note that this translates the point coordinate system to be top left oriented for OpenCV's image coordinates:
std::vector<cv::Point2f > imagePoints = [self imagePointsForLandmarks:landmarks imageSize:imageSize];
// 2 Load Model Points
std::vector<cv::Point3f > modelPoints = [self modelPoints];
// 3 create our camera intrinsic matrix
// TODO - see if this is sane?
double max_d = fmax(imageSize.width, imageSize.height);
cv::Mat cameraMatrix = (cv::Mat_<double>(3,3) << max_d, 0, imageSize.width/2.0,
0, max_d, imageSize.height/2.0,
0, 0, 1.0);
// 4 Run solvePnP
double distanceCoef[] = {0,0,0,0};
cv::Mat distanceCoefMat = cv::Mat(1 ,4 ,CV_64FC1,distanceCoef);
// Output Matrixes
std::vector<double> rv(3);
cv::Mat rotationOut = cv::Mat(rv);
std::vector<double> tv(3);
cv::Mat translationOut = cv::Mat(tv);
cv::solvePnP(modelPoints, imagePoints, cameraMatrix, distanceCoefMat, rotationOut, translationOut, false, cv::SOLVEPNP_EPNP);
// 5 Convert rotation matrix (actually a vector)
// To a real 4x4 rotation matrix:
cv::Mat viewMatrix = cv::Mat::zeros(4, 4, CV_64FC1);
cv::Mat rotation;
cv::Rodrigues(rotationOut, rotation);
// Append our transforms to our matrix and set final to identity:
for(unsigned int row=0; row<3; ++row)
{
    for(unsigned int col=0; col<3; ++col)
    {
        viewMatrix.at<double>(row, col) = rotation.at<double>(row, col);
    }
    viewMatrix.at<double>(row, 3) = translationOut.at<double>(row, 0);
}
viewMatrix.at<double>(3, 3) = 1.0f;
// Transpose OpenCV to OpenGL coords
cv::Mat cvToGl = cv::Mat::zeros(4, 4, CV_64FC1);
cvToGl.at<double>(0, 0) = 1.0f;
cvToGl.at<double>(1, 1) = -1.0f; // Invert the y axis
cvToGl.at<double>(2, 2) = -1.0f; // invert the z axis
cvToGl.at<double>(3, 3) = 1.0f;
viewMatrix = cvToGl * viewMatrix;
// Finally transpose to get correct SCN / OpenGL Matrix :
cv::Mat glViewMatrix = cv::Mat::zeros(4, 4, CV_64FC1);
cv::transpose(viewMatrix , glViewMatrix);
return [self convertCVMatToMatrix4:glViewMatrix];
- (SCNMatrix4) convertCVMatToMatrix4:(cv::Mat)matrix
SCNMatrix4 scnMatrix = SCNMatrix4Identity;
scnMatrix.m11 = matrix.at<double>(0, 0);
scnMatrix.m12 = matrix.at<double>(0, 1);
scnMatrix.m13 = matrix.at<double>(0, 2);
scnMatrix.m14 = matrix.at<double>(0, 3);
scnMatrix.m21 = matrix.at<double>(1, 0);
scnMatrix.m22 = matrix.at<double>(1, 1);
scnMatrix.m23 = matrix.at<double>(1, 2);
scnMatrix.m24 = matrix.at<double>(1, 3);
scnMatrix.m31 = matrix.at<double>(2, 0);
scnMatrix.m32 = matrix.at<double>(2, 1);
scnMatrix.m33 = matrix.at<double>(2, 2);
scnMatrix.m34 = matrix.at<double>(2, 3);
scnMatrix.m41 = matrix.at<double>(3, 0);
scnMatrix.m42 = matrix.at<double>(3, 1);
scnMatrix.m43 = matrix.at<double>(3, 2);
scnMatrix.m44 = matrix.at<double>(3, 3);
return (scnMatrix);
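Step 5 of the workflow above can be sketched without OpenCV. The Python below is a hedged illustration, not the wrapper's actual behaviour: it assumes R and t are the solvePnP rotation matrix and translation (model to camera), flips the y and z axes, and shows the rigid inverse that would place the camera instead of the model. All function names are hypothetical, and the final transpose assumes the target matrix type keeps the translation in its fourth row.

```python
def view_matrix_from_opencv(R, t):
    """4x4 row-major matrix: diag(1, -1, -1, 1) applied to [R | t]."""
    flip = [1.0, -1.0, -1.0]  # keep x, invert y and z
    V = [[flip[i] * R[i][j] for j in range(3)] + [flip[i] * t[i]]
         for i in range(3)]
    V.append([0.0, 0.0, 0.0, 1.0])
    return V

def invert_rigid(M):
    """Invert a 4x4 rigid transform [R | t] as [R^T | -R^T t]."""
    Rt = [[M[j][i] for j in range(3)] for i in range(3)]
    t = [M[i][3] for i in range(3)]
    ti = [-sum(Rt[i][k] * t[k] for k in range(3)) for i in range(3)]
    return [Rt[i] + [ti[i]] for i in range(3)] + [[0.0, 0.0, 0.0, 1.0]]

def transpose4(M):
    """Transpose, e.g. to move the translation from the last column to the last row."""
    return [[M[j][i] for j in range(4)] for i in range(4)]
```

Because diag(1, -1, -1) is itself a rotation (180 degrees about x), the flipped matrix is still rigid, which is why the cheap transpose-based inverse remains valid.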
Some questions:
An SCNNode has no modelViewMatrix to just throw a matrix at (as I understand it, it only has a transform, which is the modelMatrix), so I've read that the inverse of the transform from the solvePnP process can be used to pose the camera instead, which appears to get me the closest result. I want to ensure this approach is correct.
If I have the modelViewMatrix, and the projectionMatrix, I should be able to calculate the appropriate modelMatrix? Is this the approach I should be taking?
It's unclear to me what projectionMatrix I should be using for my SceneKit scene and if that has any bearing on my results. Do I need a pixel-for-pixel exact match of my viewport to the image size, and how do I properly configure my SCNCamera to ensure coordinate system agreement for solvePnP?
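One relationship that bears on the projection question is the link between the focal length in the camera matrix and the field of view of the rendering camera. The sketch below, in Python, assumes an ideal pinhole camera with the principal point at the image centre, as in the camera matrix built above; the function name is hypothetical, and how the value is handed to SCNCamera (for example through its field-of-view setting) is left open.

```python
import math

def vertical_fov_degrees(image_height_px, fy_px):
    """Vertical field of view of a pinhole camera with a centred principal point."""
    return math.degrees(2.0 * math.atan(image_height_px / (2.0 * fy_px)))
```

With the approximation used in the wrapper (focal length set to the larger image dimension), the rendered field of view would have to be derived from that same value for the overlay to line up.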
Thank you very much!

How to use OpenCV's Homography for a target and position camera in Unity?

I was able to get a Homography for a WebcamTexture on a texture and a target object.
Now, I want to change the transform of Unity's camera based on the Homography. I've found a way to get the Camera Position from Homography like this:
void cameraPoseFromHomography(const Mat& H, Mat& pose)
{
    pose = Mat::eye(3, 4, CV_32FC1); // 3x4 matrix, the camera pose
    float norm1 = (float)norm(H.col(0));
    float norm2 = (float)norm(H.col(1));
    float tnorm = (norm1 + norm2) / 2.0f; // Normalization value
    Mat p1 = H.col(0); // Pointer to first column of H
    Mat p2 = pose.col(0); // Pointer to first column of pose (empty)
    cv::normalize(p1, p2); // Normalize the rotation, and copies the column to pose
    p1 = H.col(1); // Pointer to second column of H
    p2 = pose.col(1); // Pointer to second column of pose (empty)
    cv::normalize(p1, p2); // Normalize the rotation and copies the column to pose
    p1 = pose.col(0);
    p2 = pose.col(1);
    Mat p3 = p1.cross(p2); // Computes the cross-product of p1 and p2
    Mat c2 = pose.col(2); // Pointer to third column of pose
    p3.copyTo(c2); // Third column is the crossproduct of columns one and two
    pose.col(3) = H.col(2) / tnorm; //vector t [R|t] is the last column of pose
}
Source: https://stackoverflow.com/a/10781165/4382683
Now, this is relative to the OpenCV image, which has dimensions on the order of hundreds of pixels. But in Unity, the Quad has localScale (1,1,1), and I don't know how to translate the pose accordingly.
Also, how can a 3x4 Mat be used as a position in Unity, where position is just a Vector3 like (1,5,4) and rotation is also a Vector3 like (90, 180, 0)?
Any hints or a direction to follow are good enough. Thanks.

OpenCV for Unity : 4-point calibration/reprojection

It is my first post on Stack, so I'm sorry in advance for my clumsiness. Please let me know if I can improve my question in any way.
► What I want to achieve (in the long term):
I am trying to control my Unity3D presentation with a laser pointer using OpenCV for Unity.
I believe one picture is worth more than a thousand words, so this should tell the most:
► What is the problem:
I am trying to make a simple 4-point calibration (projection) from the camera view (some kind of trapezium) into plane space.
I thought it would be something very basic and easy, but I have no experience with OpenCV and I can't make it work.
► Sample:
I made a much less complicated example, without any laser detection and all other stuff. Only 4-points trapezium that I try to reproject into the plane space.
Link to the whole sample project: https://1drv.ms/u/s!AiDsGecSyzmuujXGQUapcYrIvP7b
The core script from my example:
using OpenCVForUnity;
using System.Collections;
using System.Collections.Generic;
using UnityEngine;
using UnityEngine.UI;
using System;
public class TestCalib : MonoBehaviour
public RawImage displayDummy;
public RectTransform[] handlers;
public RectTransform dummyCross;
public RectTransform dummyResult;
public Vector2 webcamSize = new Vector2(640, 480);
public Vector2 objectSize = new Vector2(1024, 768);
private Texture2D texture;
Mat cameraMatrix;
MatOfDouble distCoeffs;
MatOfPoint3f objectPoints;
MatOfPoint2f imagePoints;
Mat rvec;
Mat tvec;
Mat rotationMatrix;
Mat imgMat;
void Start()
texture = new Texture2D((int)webcamSize.x, (int)webcamSize.y, TextureFormat.RGB24, false);
if (displayDummy) displayDummy.texture = texture;
imgMat = new Mat(texture.height, texture.width, CvType.CV_8UC3);
void Update()
imgMat = new Mat(texture.height, texture.width, CvType.CV_8UC3);
Utils.matToTexture2D(imgMat, texture);
void DrawImagePoints()
Point[] pointsArray = imagePoints.toArray();
for (int i = 0; i < pointsArray.Length; i++)
Point p0 = pointsArray[i];
int j = (i < pointsArray.Length - 1) ? i + 1 : 0;
Point p1 = pointsArray[j];
Imgproc.circle(imgMat, p0, 5, new Scalar(0, 255, 0, 150), 1);
Imgproc.line(imgMat, p0, p1, new Scalar(255, 255, 0, 150), 1);
private void DrawResults(MatOfPoint2f resultPoints)
Point[] pointsArray = resultPoints.toArray();
for (int i = 0; i < pointsArray.Length; i++)
Point p = pointsArray[i];
Imgproc.circle(imgMat, p, 5, new Scalar(255, 155, 0, 150), 1);
public void Test()
float w2 = objectSize.x / 2F;
float h2 = objectSize.y / 2F;
objectPoints = new MatOfPoint3f(
new Point3(-w2, -h2, 0),
new Point3(w2, -h2, 0),
new Point3(-w2, h2, 0),
new Point3(w2, h2, 0)
objectPoints = new MatOfPoint3f(
new Point3(0, 0, 0),
new Point3(objectSize.x, 0, 0),
new Point3(objectSize.x, objectSize.y, 0),
new Point3(0, objectSize.y, 0)
imagePoints = GetImagePointsFromHandlers();
rvec = new Mat(1, 3, CvType.CV_64FC1);
tvec = new Mat(1, 3, CvType.CV_64FC1);
rotationMatrix = new Mat(3, 3, CvType.CV_64FC1);
double fx = webcamSize.x / objectSize.x;
double fy = webcamSize.y / objectSize.y;
double cx = 0; // webcamSize.x / 2.0f;
double cy = 0; // webcamSize.y / 2.0f;
cameraMatrix = new Mat(3, 3, CvType.CV_64FC1);
cameraMatrix.put(0, 0, fx);
cameraMatrix.put(0, 1, 0);
cameraMatrix.put(0, 2, cx);
cameraMatrix.put(1, 0, 0);
cameraMatrix.put(1, 1, fy);
cameraMatrix.put(1, 2, cy);
cameraMatrix.put(2, 0, 0);
cameraMatrix.put(2, 1, 0);
cameraMatrix.put(2, 2, 1.0f);
distCoeffs = new MatOfDouble(0, 0, 0, 0);
Calib3d.solvePnP(objectPoints, imagePoints, cameraMatrix, distCoeffs, rvec, tvec);
Mat uv = new Mat(3, 1, CvType.CV_64FC1);
uv.put(0, 0, dummyCross.anchoredPosition.x);
uv.put(1, 0, dummyCross.anchoredPosition.y);
uv.put(2, 0, 0);
Calib3d.Rodrigues(rvec, rotationMatrix);
Mat P = rotationMatrix.inv() * (cameraMatrix.inv() * uv - tvec);
Vector2 v = new Vector2((float)P.get(0, 0)[0], (float)P.get(1, 0)[0]);
dummyResult.anchoredPosition = v;
private MatOfPoint2f GetImagePointsFromHandlers()
MatOfPoint2f m = new MatOfPoint2f();
List<Point> points = new List<Point>();
foreach (RectTransform handler in handlers)
Point p = new Point(handler.anchoredPosition.x, handler.anchoredPosition.y);
return m;
Thanks in advance for any help.
This question is not opencv specific but heavily math-based and more often seen in the realm of computer graphics. What you are looking for is called a Projective Transformation.
A Projective Transformation takes a set of coordinates and projects them onto something. In your case you want to project a 2D point in the camera view to a 2D point on a flat plane.
So we want a projection transform for 2D-Space. To perform a projection transform we need to find the projection matrix for the transformation we want to apply. In this case we need a matrix that expresses the projective deformation of the camera in relation to a flat plane.
To work with projections we first need to convert our points into homogeneous coordinates. To do so we simply add a new component to our vectors with value 1. So (x,y) becomes (x,y,1). And we will do that with all our five available points.
Now we start with the actual math. First some definitions: The camera's point of view and respective coordinates shall be the camera space, coordinates in relation to a flat plane are in flat space. Let c₁ to c₄ be the corner points of the plane in relation to camera space as homogeneous vectors. Let p be the point that we have found in camera space and p' the point we want to find in flat space, both as homogeneous vectors again.
Mathematically speaking, we are looking for a Matrix C that will allow us to calculate p' by giving it p.
p' = C * p
Now we obviously need to find C. To find a projection matrix for two dimensional space, we need four points (how convenient..) I will assume that c₁ will go to (0,0), c₂ will go to (0,1), c₃ to (1,0) and c₄ to (1,1). You need to solve two matrix equations using e.g. the gaussian row elimination or an LR Decomposition algorithm. OpenCV should contain functions to do those tasks for you, but be aware of matrix conditioning and their impact on a usable solution.
Now back to the matrices. You need to calculate two basis change matrices as they are called. They are used to change the frame of reference of your coordinates (exactly what we want to do). The first matrix will transform our coordinates to three dimensional basis vectors and the second one will transform our 2D plane into three dimensional basis vectors.
For the coordinate one you'll need to calculate λ, μ and r in the following equation:
⌈ c₁.x  c₂.x  c₃.x ⌉   ⌈ λ ⌉   ⌈ c₄.x ⌉
| c₁.y  c₂.y  c₃.y | * | μ | = | c₄.y |
⌊  1     1     1   ⌋   ⌊ r ⌋   ⌊  1   ⌋
This will lead you to your first matrix, A:
    ⌈ λ*c₁.x  μ*c₂.x  r*c₃.x ⌉
A = | λ*c₁.y  μ*c₂.y  r*c₃.y |
    ⌊   λ       μ       r    ⌋
A now maps the basis coordinates (1,0,0), (0,1,0), (0,0,1) and (1,1,1) to the points c₁ to c₄ (as homogeneous vectors, up to scale). We do the same thing for our plane now. First solve
⌈ 0  0  1 ⌉   ⌈ λ ⌉   ⌈ 1 ⌉
| 0  1  0 | * | μ | = | 1 |
⌊ 1  1  1 ⌋   ⌊ r ⌋   ⌊ 1 ⌋
and get B
    ⌈ 0  0  r ⌉
B = | 0  μ  0 |
    ⌊ λ  μ  r ⌋
A and B will now map from those three dimensional basis vectors into your respective spaces. But that is not quite what we want. We want camera space -> basis -> flat space, so only matrix B manipulates in the right direction. But that is easily fixable by inverting A. That will give us matrix C = B * A⁻¹ (watch the order of B and A⁻¹ it is not interchangeable). This leaves us with a formula to calculate p' out of p.
p' = C * p
p' = B * A⁻¹ * p
Read it from left to right like: take p, transform p from camera space into basis vectors and transform those into flat space.
If you remember correctly, p' still has three components, so we need to dehomogenize p' first before we can use it. This will yield
x' = p'.x / p'.z
y' = p'.y / p'.z
and voilà, we have successfully transformed a laser point from a camera view onto a flat piece of paper. Totally not overly complicated or so...
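The steps above can be sketched in a few lines of Python with no external libraries. This is only an illustration of the basis-change construction under the stated corner assignment (c₁ to (0,0), c₂ to (0,1), c₃ to (1,0), c₄ to (1,1)); the helper names are made up, and Cramer's rule is used purely for brevity, so badly conditioned corner layouts are not handled.

```python
def solve3(M, b):
    """Solve the 3x3 system M x = b by Cramer's rule."""
    def det(m):
        return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
                - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
                + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))
    d = det(M)
    return [det([[b[i] if k == j else M[i][k] for k in range(3)]
                 for i in range(3)]) / d for j in range(3)]

def basis_matrix(p1, p2, p3, p4):
    """Matrix sending (1,0,0), (0,1,0), (0,0,1), (1,1,1) to p1..p4 (up to scale)."""
    M = [[p1[0], p2[0], p3[0]], [p1[1], p2[1], p3[1]], [1.0, 1.0, 1.0]]
    lam, mu, r = solve3(M, [p4[0], p4[1], 1.0])
    return [[lam * M[i][0], mu * M[i][1], r * M[i][2]] for i in range(3)]

def inverse3(m):
    """Inverse of a 3x3 matrix, one column at a time."""
    cols = [solve3(m, [1.0 if i == j else 0.0 for i in range(3)])
            for j in range(3)]
    return [[cols[j][i] for j in range(3)] for i in range(3)]

def projective_map(src, dst):
    """C = B * A^-1, mapping the four src corners onto the four dst corners."""
    A, B = basis_matrix(*src), basis_matrix(*dst)
    Ai = inverse3(A)
    return [[sum(B[i][k] * Ai[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def apply_map(C, x, y):
    """Apply C to (x, y, 1) and dehomogenize."""
    px, py, pw = (C[i][0] * x + C[i][1] * y + C[i][2] for i in range(3))
    return px / pw, py / pw
```

A typical use would be to compute C once from the four calibration corners and then call apply_map for every detected laser position.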
I developed the following code. Call this function on MouseUp, and edit the resolutions to match your setup:
void Cal()
{
    // Webcam resolution: 1280*720
    MatOfPoint2f pts_src = new MatOfPoint2f(
        new Point(Double.Parse(imagePoints.get(0,0).GetValue(0).ToString()), Double.Parse(imagePoints.get(0, 0).GetValue(1).ToString())),
        new Point(Double.Parse(imagePoints.get(1,0).GetValue(0).ToString()), Double.Parse(imagePoints.get(1, 0).GetValue(1).ToString())),
        new Point(Double.Parse(imagePoints.get(2,0).GetValue(0).ToString()), Double.Parse(imagePoints.get(2, 0).GetValue(1).ToString())),
        new Point(Double.Parse(imagePoints.get(3,0).GetValue(0).ToString()), Double.Parse(imagePoints.get(3, 0).GetValue(1).ToString()))
    );
    // Target resolution: 1920*1080
    MatOfPoint2f pts_dst = new MatOfPoint2f(
        new Point(0, 0),
        new Point(1920, 0),
        new Point(1920, 1080),
        new Point(0, 1080)
    );
    // 1. Calculate the homography
    Mat h = Calib3d.findHomography(pts_src, pts_dst);
    // Pick point (WebcamDummy canvas: 1280*0.5f / 720*0.5f)
    MatOfPoint2f srcPointMat = new MatOfPoint2f(
        new Point(dummyCross.anchoredPosition.x*2.0f, dummyCross.anchoredPosition.y*2.0f)
    );
    MatOfPoint2f dstPointMat = new MatOfPoint2f();
    // 2. Apply h to srcPoint to get dstPoint
    Core.perspectiveTransform(srcPointMat, dstPointMat, h);
    Vector2 v = new Vector2((float)dstPointMat.get(0, 0)[0], (float)dstPointMat.get(0, 0)[1]);
    // (ResultDummy canvas: 1920 * 0.5f / 1080 * 0.5f)
    dummyResult.anchoredPosition = v*0.5f;
    Debug.Log(dummyCross.anchoredPosition.ToString() + "\n" + dummyResult.anchoredPosition.ToString());
}

Augmented Reality iOS application tracking issue

I am able to detect markers, identify markers and initialise OpenGL objects on screen. The issue I'm having is overlaying them on top of the marker's position in the camera world. My camera is calibrated as best I can using this method: iPhone 6 camera calibration for OpenCV. I feel there is an issue with my camera's projection matrix, which I create as follows:
(Matrix44&) projectionMatrix
float near = 0.01; // Near clipping distance
float far = 100; // Far clipping distance
// Camera parameters
float f_x = cameraMatrix.data[0]; // Focal length in x axis
float f_y = cameraMatrix.data[4]; // Focal length in y axis
float c_x = cameraMatrix.data[2]; // Camera principal point x
float c_y = cameraMatrix.data[5]; // Camera principal point y
std::cout<<"fx "<<f_x<<" fy "<<f_y<<" cx "<<c_x<<" cy "<<c_y<<std::endl;
std::cout<<"width "<<screen_width<<" height "<<screen_height<<std::endl;
projectionMatrix.data[0] = - 2.0 * f_x / screen_width;
projectionMatrix.data[1] = 0.0;
projectionMatrix.data[2] = 0.0;
projectionMatrix.data[3] = 0.0;
projectionMatrix.data[4] = 0.0;
projectionMatrix.data[5] = 2.0 * f_y / screen_height;
projectionMatrix.data[6] = 0.0;
projectionMatrix.data[7] = 0.0;
projectionMatrix.data[8] = 2.0 * c_x / screen_width - 1.0;
projectionMatrix.data[9] = 2.0 * c_y / screen_height - 1.0;
projectionMatrix.data[10] = -( far+near ) / ( far - near );
projectionMatrix.data[11] = -1.0;
projectionMatrix.data[12] = 0.0;
projectionMatrix.data[13] = 0.0;
projectionMatrix.data[14] = -2.0 * far * near / ( far - near );
projectionMatrix.data[15] = 0.0;
This is the method to estimate the position of the marker:
void MarkerDetector::estimatePosition(std::vector<Marker>& detectedMarkers)
for (size_t i=0; i<detectedMarkers.size(); i++)
Marker& m = detectedMarkers[i];
cv::Mat Rvec;
cv::Mat_<float> Tvec;
cv::Mat raux,taux;
cv::solvePnP(m_markerCorners3d, m.points, camMatrix, distCoeff,raux,taux);
taux.convertTo(Tvec ,CV_32F);
cv::Mat_<float> rotMat(3,3);
cv::Rodrigues(Rvec, rotMat);
// Copy to transformation matrix
for (int col=0; col<3; col++)
for (int row=0; row<3; row++)
m.transformation.r().mat[row][col] = rotMat(row,col); // Copy rotation component
m.transformation.t().data[col] = Tvec(col); // Copy translation component
// Since solvePnP finds camera location, w.r.t to marker pose, to get marker pose w.r.t to the camera we invert it.
m.transformation = m.transformation.getInverted();
The OpenGL shape is able to track and account for size and rotation, but something is going wrong with the translation. If the camera is turned 90 degrees, the OpenGL shape swings around 90 degrees about the centre of the marker. It's almost as if I am translating before rotating, but I am not.
See video for issue:
I guess you may have a problem with projecting the 3D model points. Essentially, solvePnP gives a transformation that brings points from the model coordinate system to the camera coordinate system, composed of a rotation vector and a translation vector (the outputs of solvePnP):
cv::Mat rvec, tvec;
cv::solvePnP(objectPoints, imagePoints, cameraMatrix, distCoeffs, rvec, tvec);
At this point you are able to project model points onto the image plane
std::vector<cv::Vec2d> imagePointsRP; // Reprojected image points
cv::projectPoints(objectPoints, rvec, tvec, cameraMatrix, distCoeffs, imagePointsRP);
Now you only need to draw the points of imagePointsRP over the incoming image; if the pose estimation was correct, you'll see the reprojected corners over the corners of the marker.
Anyway, the matrices of model TO camera and camera TO model direction can be composed as below:
cv::Mat rmat;
cv::Rodrigues(rvec, rmat); // rmat is 3x3
cv::Mat modelToCam = cv::Mat::eye(4, 4, CV_64FC1);
modelToCam(cv::Range(0, 3), cv::Range(0, 3)) = rmat * 1.0;
modelToCam(cv::Range(0, 3), cv::Range(3, 4)) = tvec * 1.0;
cv::Mat camToModel = cv::Mat::eye(4, 4, CV_64FC1);
cv::Mat rmatinv = rmat.t(); // rotation of inverse
cv::Mat tvecinv = -rmatinv * tvec; // translation of inverse
camToModel(cv::Range(0, 3), cv::Range(0, 3)) = rmatinv * 1.0;
camToModel(cv::Range(0, 3), cv::Range(3, 4)) = tvecinv * 1.0;
In any case, it's also useful to estimate the reprojection error and discard poses with a high error (remember, the PnP problem has a unique solution only if n=4 and these points are coplanar):
double totalErr = 0.0;
for (size_t i = 0; i < imagePoints.size(); i++)
{
    double err = cv::norm(cv::Mat(imagePoints[i]), cv::Mat(imagePointsRP[i]), cv::NORM_L2);
    totalErr += err*err;
}
totalErr = std::sqrt(totalErr / imagePoints.size());
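The same check can be written out by hand to see what is being measured. The Python sketch below assumes an ideal pinhole camera with no lens distortion, a rotation matrix R and translation t that map model points into the camera frame, and hypothetical function names; it is not a replacement for cv::projectPoints.

```python
import math

def project(point, R, t, K):
    """Pinhole projection of one 3D model point (lens distortion ignored)."""
    xc, yc, zc = (sum(R[i][k] * point[k] for k in range(3)) + t[i]
                  for i in range(3))
    return K[0][0] * xc / zc + K[0][2], K[1][1] * yc / zc + K[1][2]

def rms_reprojection_error(object_points, image_points, R, t, K):
    """Root-mean-square pixel distance between measured and reprojected points."""
    total = 0.0
    for obj, img in zip(object_points, image_points):
        u, v = project(obj, R, t, K)
        total += (u - img[0]) ** 2 + (v - img[1]) ** 2
    return math.sqrt(total / len(object_points))
```

A pose whose error is much larger than the expected corner detection noise (a pixel or so) is a reasonable candidate for rejection; the exact threshold is application specific.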

How to undistort points in camera shot coordinates and obtain corresponding undistorted image coordinates?

I use OpenCV to undistort a set of points after camera calibration.
The code follows.
const int npoints = 2; // number of point specified
// Points initialization.
// Only 2 points in this example, in real code they are read from file.
float input_points[npoints][2] = {{0,0}, {2560, 1920}};
CvMat * src = cvCreateMat(1, npoints, CV_32FC2);
CvMat * dst = cvCreateMat(1, npoints, CV_32FC2);
// fill src matrix
float * src_ptr = (float*)src->data.ptr;
for (int pi = 0; pi < npoints; ++pi) {
    for (int ci = 0; ci < 2; ++ci) {
        *(src_ptr + pi * 2 + ci) = input_points[pi][ci];
    }
}
cvUndistortPoints(src, dst, &camera1, &distCoeffs1);
After the code above, dst contains the following numbers:
-8.82689655e-001 -7.05507338e-001 4.16228324e-001 3.04863811e-001
which are too small in comparison with numbers in src.
At the same time if I undistort image via the call:
cvUndistort2( srcImage, dstImage, &camera1, &dist_coeffs1 );
I receive a good undistorted image, which means that pixel coordinates are not modified as drastically as the separate points are.
How can I obtain the same undistortion for specific points as for images?
The points should be "unnormalized" using the camera matrix.
More specifically, after the call to cvUndistortPoints, the following transformation should also be applied:
double fx = CV_MAT_ELEM(camera1, double, 0, 0);
double fy = CV_MAT_ELEM(camera1, double, 1, 1);
double cx = CV_MAT_ELEM(camera1, double, 0, 2);
double cy = CV_MAT_ELEM(camera1, double, 1, 2);
float * dst_ptr = (float*)dst->data.ptr;
for (int pi = 0; pi < npoints; ++pi) {
    float& px = *(dst_ptr + pi * 2);
    float& py = *(dst_ptr + pi * 2 + 1);
    // perform transformation.
    // In fact this is equivalent to multiplication by the camera matrix
    px = px * fx + cx;
    py = py * fy + cy;
}
More info on camera matrix at OpenCV 'Camera Calibration and 3D Reconstruction'
The following C++ function call should work as well:
std::vector<cv::Point2f> inputDistortedPoints = ...
std::vector<cv::Point2f> outputUndistortedPoints;
cv::Mat cameraMatrix = ...
cv::Mat distCoeffs = ...
cv::undistortPoints(inputDistortedPoints, outputUndistortedPoints, cameraMatrix, distCoeffs, cv::noArray(), cameraMatrix);
It may be your matrix size :)
OpenCV expects a vector of points: a column or a row matrix with two channels. But because your input matrix has only 2 points, and the number of channels is also 1, it cannot figure out whether the input is a row or a column.
So, fill a longer input mat with bogus values, and keep only the first:
const int npoints = 4; // number of point specified
// Points initialization.
// Only 2 points in this example, in real code they are read from file.
float input_points[npoints][4] = {{0,0}, {2560, 1920}}; // the rest will be set to 0
CvMat * src = cvCreateMat(1, npoints, CV_32FC2);
CvMat * dst = cvCreateMat(1, npoints, CV_32FC2);
// fill src matrix
float * src_ptr = (float*)src->data.ptr;
for (int pi = 0; pi < npoints; ++pi) {
    for (int ci = 0; ci < 2; ++ci) {
        *(src_ptr + pi * 2 + ci) = input_points[pi][ci];
    }
}
cvUndistortPoints(src, dst, &camera1, &distCoeffs1);
While the OpenCV documentation says undistortPoints accepts only 2-channel input, it actually accepts:
1-column, 2-channel, multi-row mat or (and this case is not documented)
2 column, multi-row, 1-channel mat or
multi-column, 1 row, 2-channel mat
(as seen in undistort.cpp, line 390)
But a bug inside (or a lack of available info) makes it wrongly confuse the second one with the third one when the number of columns is 2. So, your data is treated as a 2-column, 2-row, 1-channel matrix.
I also ran into this problem, and it took me some time to research and finally understand it.
As the formula above shows, in the OpenCV model the distortion is applied before the camera matrix, so the processing order is:
image_distorted -> camera_matrix -> un-distort function -> camera_matrix -> back to image_undistorted.
So you need a small fix: pass camera1 again as the last argument.
Mat eye3 = Mat::eye(3, 3, CV_64F);
cvUndistortPoints(src, dst, &camera1, &distCoeffs1, &eye3,&camera1);
Otherwise, if the last two parameters are empty, the points are projected to normalized image coordinates.
See codes: opencv-3.4.0-src\modules\imgproc\src\undistort.cpp :297
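To make the processing order concrete, here is a hedged Python sketch of the same chain for a single point. It is a simplification, not OpenCV's implementation: only two radial coefficients (k1, k2) are modelled, tangential terms are ignored, the inverse is found by a fixed number of fixed-point iterations, and all names are hypothetical.

```python
def distort_normalized(x, y, k1, k2=0.0):
    """Forward radial distortion of a normalized point (used here for checking)."""
    r2 = x * x + y * y
    factor = 1.0 + k1 * r2 + k2 * r2 * r2
    return x * factor, y * factor

def undistort_normalized(xd, yd, k1, k2=0.0, iterations=20):
    """Invert the radial model by fixed-point iteration."""
    x, y = xd, yd
    for _ in range(iterations):
        r2 = x * x + y * y
        factor = 1.0 + k1 * r2 + k2 * r2 * r2
        x, y = xd / factor, yd / factor
    return x, y

def undistort_pixel(u, v, fx, fy, cx, cy, k1, k2=0.0):
    """distorted pixel -> normalized -> undistorted -> pixel again."""
    # 1. Remove the camera matrix: pixel to normalized coordinates.
    xd, yd = (u - cx) / fx, (v - cy) / fy
    # 2. Undo the lens distortion in normalized coordinates.
    x, y = undistort_normalized(xd, yd, k1, k2)
    # 3. Re-apply the camera matrix; skipping this step leaves normalized values.
    return x * fx + cx, y * fy + cy
```

Stopping after step 2 gives small values around zero, like those reported in the question, while step 3 brings the result back to pixel units.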
