I'd like to take the perspective transform matrix returned from OpenCV's findHomography function and convert it (either in C++ or Objective-C) to iOS' CATransform3D. I'd like them to be as close as possible in terms of accurately reproducing the "warp" effect on the Core Graphics side. Example code would really be appreciated!
From iOS' CATransform3D.h:
/* Homogeneous three-dimensional transforms. */
struct CATransform3D
{
  CGFloat m11, m12, m13, m14;
  CGFloat m21, m22, m23, m24;
  CGFloat m31, m32, m33, m34;
  CGFloat m41, m42, m43, m44;
};
Similar questions:
Apply homography matrix using Core Graphics
convert an opencv affine matrix to CGAffineTransform
I have never tried this so take it with a grain of salt.
CATransform3D is a 4x4 matrix which operates on a 3-dimensional homogeneous vector (4x1) to produce another vector of the same type. I am assuming that when rendered, objects described by a 4x1 vector have each element divided by the 4th element, and that the 3rd element is used only to determine which objects appear on top of which. Assuming this is correct...
The 3x3 matrix returned by findHomography operates on a 2-dimensional homogeneous vector. That process can be thought of in 4 steps:
1. The first column of the homography is multiplied by x
2. The second column of the homography is multiplied by y
3. The third column of the homography is multiplied by 1
4. The resulting 1st and 2nd vector elements are divided by the 3rd
You need this process to be replicated in a 4x4 matrix, in which I am assuming the 3rd element of the resulting vector is meaningless for your purposes.
Construct your matrix like this (H is your homography matrix):
[H(0,0), H(0,1), 0, H(0,2),
 H(1,0), H(1,1), 0, H(1,2),
 0,      0,      1, 0,
 H(2,0), H(2,1), 0, H(2,2)]
This clearly satisfies 1,2 and 3. 4 is satisfied because the homogeneous element is always the last one. That is why the "homogeneous row" if you will had to get bumped down one line. The 1 on the 3rd row is to let the z component of the vector pass through unmolested.
All of the above is done in row-major notation (like OpenCV) to try to keep things from being confusing. You can look at Tommy's answer to see how the conversion to column major looks (you basically just transpose it). Note, however, that at the moment Tommy and I disagree about how to construct the matrix.
From my reading of the documentation, m11 in CATransform3D is equivalent to a in CGAffineTransform, m12 is equivalent to b and so on.
As per your comment below, I understand the matrix OpenCV returns to be 3x3 (which, in retrospect, is the size you'd expect). So you'd fill in the other elements with those equivalent to the identity matrix. As per Hammer's answer, you want to preserve the portion that deals with the (usually implicit) homogeneous coordinate in its place while padding everything else with the identity.
[aside: my original answer was wrong. I've edited it to be correct since I've posted code and Hammer hasn't. This post is marked as community wiki to reflect that it's in no sense solely my answer]
So I think you'd want:
CATransform3D MatToTransform(Mat cvTransform)
{
    // at<float> assumes a CV_32F matrix; see the note on double below.
    CATransform3D transform;
    transform.m11 = cvTransform.at<float>(0, 0);
    transform.m12 = cvTransform.at<float>(1, 0);
    transform.m13 = 0.0f;
    transform.m14 = cvTransform.at<float>(2, 0);
    transform.m21 = cvTransform.at<float>(0, 1);
    transform.m22 = cvTransform.at<float>(1, 1);
    transform.m23 = 0.0f;
    transform.m24 = cvTransform.at<float>(2, 1);
    transform.m31 = 0.0f;
    transform.m32 = 0.0f;
    transform.m33 = 1.0f;
    transform.m34 = 0.0f;
    transform.m41 = cvTransform.at<float>(0, 2);
    transform.m42 = cvTransform.at<float>(1, 2);
    transform.m43 = 0.0f;
    transform.m44 = cvTransform.at<float>(2, 2);
    return transform;
}
Or use cvGetReal1D if you're keeping C++ out of it.
Tommy's answer worked for me, but I needed to use double instead of float. This is also a shortened version of the code:
CATransform3D MatToCATransform3D(cv::Mat H) {
    return {
        H.at<double>(0, 0), H.at<double>(1, 0), 0.0, H.at<double>(2, 0),
        H.at<double>(0, 1), H.at<double>(1, 1), 0.0, H.at<double>(2, 1),
        0.0, 0.0, 1.0, 0.0,
        H.at<double>(0, 2), H.at<double>(1, 2), 0.0, H.at<double>(2, 2)
    };
}
I'm working on porting from OpenGL (OGL) to MetalKit (MTK) on iOS. I'm failing to get identical display in the MetalKit version of the app. I modified the projection matrix to account for differences in Normalized Device Coordinates between the two frameworks, but don't know what else to change to get identical display. Any ideas what else needs to be changed to port from OpenGL to MetalKit?
Projection Matrix Changes so far...
I understand that the Normalized Device Coordinates (NDC) are different in OGL vs MTK:
OGL NDC: -1 < z < 1
MTK NDC: 0 < z < 1
I modified the projection matrix to address the NDC difference, as indicated here. Unfortunately, this modification to the projection matrix doesn't result in identical display to the old OGL code.
I'm struggling to even know what else to try.
For reference, here's some misc background information:
The view matrix is very simple (identity matrix); i.e. camera is at (0, 0, 0) and looking toward (0, 0, -1)
In the legacy OpenGL code, I used GLKMatrix4MakeFrustum to produce the projection matrix, using the screen bounds for left, right, top, bottom, and near=1, far=1000
I stripped the scene down to bare bones while debugging and below are 2 images, the first from legacy OGL code and the second from MTK, both just showing the "ground" plane with a debug texture and a black background.
Any ideas about what else might need to change to get to identical display in MetalKit would be greatly appreciated.
OpenGL (legacy)
Edit 1
I tried to extract code relevant to calculation and use of the projection matrix:
float aspectRatio = 1.777; // iPhone 8 device
float top = 1;
float bottom = -1;
float left = -aspectRatio;
float right = aspectRatio;
float RmL = right - left;
float TmB = top - bottom;
float nearZ = 1;
float farZ = 1000;
GLKMatrix4 projMatrix = { 2 * nearZ / RmL, 0, 0, 0,
0, 2 * nearZ / TmB, 0, 0,
0, 0, -farZ / (farZ - nearZ), -1,
0, 0, -farZ * nearZ / (farZ - nearZ), 0 };
GLKMatrix4 viewMatrix = ...; // Identity matrix: camera at origin, looking at (0, 0, -1), yUp=(0, 1, 0);
GLKMatrix4 modelMatrix = ...; // Different for various models, but even when this is the identity matrix in old/new code the visual output is different
GLKMatrix4 mvpMatrix = GLKMatrix4Multiply(projMatrix, GLKMatrix4Multiply(viewMatrix, modelMatrix));
GLKMatrix4 x = mvpMatrix; // rename for brevity below
float mvpMatrixArray[16] = {x.m00, x.m01, x.m02, x.m03, x.m10, x.m11, x.m12, x.m13, x.m20, x.m21, x.m22, x.m23, x.m30, x.m31, x.m32, x.m33};
// making MVP matrix available to vertex shader
[renderCommandEncoder setVertexBytes:&mvpMatrixArray
length:16 * sizeof(float)
atIndex:1]; // vertex data is at "0"
[renderCommandEncoder setVertexBuffer:vertexBuffer
[renderCommandEncoder drawPrimitives:MTLPrimitiveTypeTriangleStrip
Sadly this issue ended up being due to a bug in the vertex shader that was pushing all geometry +1 on the Z axis, leading to the visual differences.
For any future OpenGL-to-Metal porters: the projection matrix changes above, accounting for the differences in normalized device coordinates, are enough.
Without seeing the code it's hard to say what the problem is. One of the most common issues could be a wrongly configured viewport:
// Set the region of the drawable to draw into.
[renderEncoder setViewport:(MTLViewport){0.0, 0.0, _viewportSize.x, _viewportSize.y, 0.0, 1.0 }];
The default values for the viewport are:
originX = 0.0
originY = 0.0
width = w
height = h
znear = 0.0
zfar = 1.0
*Metal: znear = minZ, zfar = maxZ.
MinZ and MaxZ indicate the depth-ranges into which the scene will be
rendered and are not used for clipping. Most applications will set
these members to 0.0 and 1.0 to enable the system to render to the
entire range of depth values in the depth buffer. In some cases, you
can achieve special effects by using other depth ranges. For instance,
to render a heads-up display in a game, you can set both values to 0.0
to force the system to render objects in a scene in the foreground, or
you might set them both to 1.0 to render an object that should always
be in the background.
Applications typically set MinZ and MaxZ to 0.0 and 1.0 respectively
to cause the system to render to the entire depth range. However, you
can use other values to achieve certain effects. For example, you
might set both values to 0.0 to force all objects into the foreground,
or set both to 1.0 to render all objects into the background.
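As a rough illustration of the quoted description, the sketch below assumes the usual linear viewport depth mapping, in which a normalized depth in [0, 1] is scaled into the [znear, zfar] range of the viewport. The Viewport struct and depthBufferValue function are hypothetical stand-ins written in plain C++, not the Metal API.

```cpp
// Hypothetical mirror of the six viewport fields listed above.
struct Viewport {
    double originX, originY, width, height, znear, zfar;
};

// Assumed linear mapping from NDC depth (0..1) to the stored depth value.
double depthBufferValue(const Viewport& vp, double ndcZ) {
    return vp.znear + ndcZ * (vp.zfar - vp.znear);
}
```

Under this assumption, setting znear and zfar both to 0.0 collapses every fragment to depth 0 (foreground), and setting both to 1.0 pushes everything to the background, matching the special effects described in the quote.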
I have been working on Pose Estimation (rectifying key points on a 3D model with 2D points on an image to match pose) via OpenCV's cv::solvePnP, using features / key points from Apple's Vision framework.
My SceneKit model is being translated and the units look correct when introspecting the translation and rotation vectors from solvePnP (i.e., they are the right order of magnitude), but the coordinate system of the translation appears off:
I am trying to understand the coordinate system requirements of solvePnP with respect to the Metal / OpenGL coordinate system and my camera projection matrix.
What 'projectionMatrix' does my SCNCamera require to match image based coordinate system passed into solvePnP?
Some things I've read / believe I am taking into account:
OpenCV vs OpenGL (thus Metal) have row major vs column major differences.
OpenCV's coordinate system for 3D is different than OpenGL (thus Metal).
Longer with code:
My workflow is as such:
Step 1 - use a 3D model tool to introspect points on my 3D model and get the object's vertex positions for the major key points in the 2D detected features. I am using left pupil, right pupil, tip of nose, tip of chin, left outer lip corner, right outer lip corner.
Step 2 - Run a vision request and extract a list of points in image space (converting for OpenCV's top left coordinate system) and extract the same ordered list of 2D points.
Step 3 - Construct a camera matrix by using the size of the input image.
Step 4 - run cv::solvePnP, and then use cv::Rodrigues to convert the rotation vector to a matrix
Step 5 - Convert the coordinate system of the resulting transforms into something appropriate for the GPU - invert the y and z axes and combine the translation and rotation into a single 4x4 matrix, and then transpose it for the appropriate majorness of OpenGL / Metal
Step 6 - apply the resulting transform to SceneKit via:
let faceNodeTransform = openCVWrapper.transform(for: landmarks, imageSize: size)
self.destinationView.pointOfView?.transform = SCNMatrix4Invert(faceNodeTransform)
Below is my Obj-C++ OpenCV Wrapper which takes in a subset of Vision Landmarks and the true pixel size of the image being looked at:
// https://answers.opencv.org/question/23089/opencv-opengl-proper-camera-pose-using-solvepnp/
- (SCNMatrix4) transformFor:(VNFaceLandmarks2D*)landmarks imageSize:(CGSize)imageSize
{
    // 1 Convert landmarks to image points in image space (pixels) as a vector of cv::Point2f.
    // Note that this translates the point coordinate system to be top-left oriented for OpenCV's image coordinates:
    std::vector<cv::Point2f> imagePoints = [self imagePointsForLandmarks:landmarks imageSize:imageSize];

    // 2 Load model points
    std::vector<cv::Point3f> modelPoints = [self modelPoints];

    // 3 Create our camera intrinsic matrix (focal length approximated by the larger image dimension)
    // TODO - see if this is sane?
    double max_d = fmax(imageSize.width, imageSize.height);
    cv::Mat cameraMatrix = (cv::Mat_<double>(3,3) << max_d, 0, imageSize.width/2.0,
                                                     0, max_d, imageSize.height/2.0,
                                                     0, 0, 1.0);

    // 4 Run solvePnP (no lens distortion assumed)
    double distanceCoef[] = {0,0,0,0};
    cv::Mat distanceCoefMat = cv::Mat(1, 4, CV_64FC1, distanceCoef);

    // Output matrices
    std::vector<double> rv(3);
    cv::Mat rotationOut = cv::Mat(rv);
    std::vector<double> tv(3);
    cv::Mat translationOut = cv::Mat(tv);
    cv::solvePnP(modelPoints, imagePoints, cameraMatrix, distanceCoefMat, rotationOut, translationOut, false, cv::SOLVEPNP_EPNP);

    // 5 Convert the rotation output (actually a rotation vector)
    // to a real rotation matrix and place it in a 4x4 matrix:
    cv::Mat viewMatrix = cv::Mat::zeros(4, 4, CV_64FC1);
    cv::Mat rotation;
    cv::Rodrigues(rotationOut, rotation);

    // Copy rotation and translation into our matrix and set the final element to 1:
    for (unsigned int row = 0; row < 3; ++row)
    {
        for (unsigned int col = 0; col < 3; ++col)
        {
            viewMatrix.at<double>(row, col) = rotation.at<double>(row, col);
        }
        viewMatrix.at<double>(row, 3) = translationOut.at<double>(row, 0);
    }
    viewMatrix.at<double>(3, 3) = 1.0;

    // Convert OpenCV to OpenGL coords
    cv::Mat cvToGl = cv::Mat::zeros(4, 4, CV_64FC1);
    cvToGl.at<double>(0, 0) = 1.0;
    cvToGl.at<double>(1, 1) = -1.0; // Invert the y axis
    cvToGl.at<double>(2, 2) = -1.0; // Invert the z axis
    cvToGl.at<double>(3, 3) = 1.0;
    viewMatrix = cvToGl * viewMatrix;

    // Finally transpose to get the correct SCN / OpenGL matrix:
    cv::Mat glViewMatrix = cv::Mat::zeros(4, 4, CV_64FC1);
    cv::transpose(viewMatrix, glViewMatrix);
    return [self convertCVMatToMatrix4:glViewMatrix];
}
- (SCNMatrix4) convertCVMatToMatrix4:(cv::Mat)matrix
{
    SCNMatrix4 scnMatrix = SCNMatrix4Identity;
    scnMatrix.m11 = matrix.at<double>(0, 0);
    scnMatrix.m12 = matrix.at<double>(0, 1);
    scnMatrix.m13 = matrix.at<double>(0, 2);
    scnMatrix.m14 = matrix.at<double>(0, 3);
    scnMatrix.m21 = matrix.at<double>(1, 0);
    scnMatrix.m22 = matrix.at<double>(1, 1);
    scnMatrix.m23 = matrix.at<double>(1, 2);
    scnMatrix.m24 = matrix.at<double>(1, 3);
    scnMatrix.m31 = matrix.at<double>(2, 0);
    scnMatrix.m32 = matrix.at<double>(2, 1);
    scnMatrix.m33 = matrix.at<double>(2, 2);
    scnMatrix.m34 = matrix.at<double>(2, 3);
    scnMatrix.m41 = matrix.at<double>(3, 0);
    scnMatrix.m42 = matrix.at<double>(3, 1);
    scnMatrix.m43 = matrix.at<double>(3, 2);
    scnMatrix.m44 = matrix.at<double>(3, 3);
    return (scnMatrix);
}
Some questions:
An SCNNode has no modelViewMatrix to just throw a matrix at (as I understand it, it only has a transform, which is the modelMatrix), so I've read that the inverse of the transform from the solvePnP process can be used to pose the camera instead, which appears to get me the closest result. I want to ensure this approach is correct.
If I have the modelViewMatrix and the projectionMatrix, I should be able to calculate the appropriate modelMatrix? Is this the approach I should be taking?
It's unclear to me what projectionMatrix I should be using for my SceneKit scene and whether that has any bearing on my results. Do I need a pixel-for-pixel exact match of my viewport to the image size, and how do I properly configure my SCNCamera to ensure coordinate system agreement for solvePnP?
Thank you very much!
I'm trying to use OpenCV to do some basic augmented reality. The way I'm going about it is using findChessboardCorners to get a set of points from a camera image. Then, I create a 3D quad along the z = 0 plane and use solvePnP to get a homography between the imaged points and the planar points. From that, I figure I should be able to set up a modelview matrix which will allow me to render a cube with the right pose on top of the image.
The documentation for solvePnP says that it outputs a rotation vector "that (together with [the translation vector] ) brings points from the model coordinate system to the camera coordinate system." I think that's the opposite of what I want; since my quad is on the plane z = 0, I want a modelview matrix which will transform that quad to the appropriate 3D plane.
I thought that by performing the opposite rotations and translations in the opposite order I could calculate the correct modelview matrix, but that seems not to work. While the rendered object (a cube) does move with the camera image and seems to be roughly correct translationally, the rotation just doesn't work at all; it rotates on multiple axes when it should only be rotating on one, and sometimes in the wrong direction. Here's what I'm doing so far:
std::vector<Point2f> corners;
bool found = findChessboardCorners(*_imageBuffer, cv::Size(5,4), corners,
drawChessboardCorners(*_imageBuffer, cv::Size(5, 4), corners, found); // same pattern size as in findChessboardCorners
std::vector<double> distortionCoefficients(5); // camera distortion
distortionCoefficients[0] = 0.070969;
distortionCoefficients[1] = 0.777647;
distortionCoefficients[2] = -0.009131;
distortionCoefficients[3] = -0.013867;
distortionCoefficients[4] = -5.141519;
// Since the image was resized, we need to scale the found corner points
float sw = _width / SMALL_WIDTH;
float sh = _height / SMALL_HEIGHT;
std::vector<Point2f> board_verts;
board_verts.push_back(Point2f(corners[0].x * sw, corners[0].y * sh));
board_verts.push_back(Point2f(corners[15].x * sw, corners[15].y * sh));
board_verts.push_back(Point2f(corners[19].x * sw, corners[19].y * sh));
board_verts.push_back(Point2f(corners[4].x * sw, corners[4].y * sh));
Mat boardMat(board_verts);
std::vector<Point3f> square_verts;
square_verts.push_back(Point3f(-1, 1, 0));
square_verts.push_back(Point3f(-1, -1, 0));
square_verts.push_back(Point3f(1, -1, 0));
square_verts.push_back(Point3f(1, 1, 0));
Mat squareMat(square_verts);
// Transform the camera's intrinsic parameters into an OpenGL camera matrix
// Camera parameters
double f_x = 786.42938232; // Focal length in x axis
double f_y = 786.42938232; // Focal length in y axis (usually the same?)
double c_x = 217.01358032; // Camera principal point x
double c_y = 311.25384521; // Camera principal point y
cv::Mat cameraMatrix(3,3,CV_32FC1);
cameraMatrix.at<float>(0,0) = f_x;
cameraMatrix.at<float>(0,1) = 0.0;
cameraMatrix.at<float>(0,2) = c_x;
cameraMatrix.at<float>(1,0) = 0.0;
cameraMatrix.at<float>(1,1) = f_y;
cameraMatrix.at<float>(1,2) = c_y;
cameraMatrix.at<float>(2,0) = 0.0;
cameraMatrix.at<float>(2,1) = 0.0;
cameraMatrix.at<float>(2,2) = 1.0;
Mat rvec(3, 1, CV_64F), tvec(3, 1, CV_64F); // double precision, matching the at<double> reads below
solvePnP(squareMat, boardMat, cameraMatrix, distortionCoefficients,
rvec, tvec);
_rv[0] = rvec.at<double>(0, 0);
_rv[1] = rvec.at<double>(1, 0);
_rv[2] = rvec.at<double>(2, 0);
_tv[0] = tvec.at<double>(0, 0);
_tv[1] = tvec.at<double>(1, 0);
_tv[2] = tvec.at<double>(2, 0);
Then in the drawing code...
GLKMatrix4 modelViewMatrix = GLKMatrix4MakeTranslation(0.0f, 0.0f, 0.0f);
modelViewMatrix = GLKMatrix4Translate(modelViewMatrix, -tv[1], -tv[0], -tv[2]);
modelViewMatrix = GLKMatrix4Rotate(modelViewMatrix, -rv[0], 1.0f, 0.0f, 0.0f);
modelViewMatrix = GLKMatrix4Rotate(modelViewMatrix, -rv[1], 0.0f, 1.0f, 0.0f);
modelViewMatrix = GLKMatrix4Rotate(modelViewMatrix, -rv[2], 0.0f, 0.0f, 1.0f);
The vertices I'm rendering create a cube of unit length around the origin (i.e. from -0.5 to 0.5 along each edge). I know that OpenGL's transformation functions apply transformations in "reverse order," so the above should rotate the cube about the z, y, and then x axes, and then translate it. However, it seems like it's being translated first and then rotated, so perhaps Apple's GLKMatrix4 works differently?
This question seems very similar to mine, and in particular coder9's answer seems like it might be more or less what I'm looking for. However, I tried it and compared the results to my method, and the matrices I arrived at in both cases were the same. I feel like that answer is right, but that I'm missing some crucial detail.
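One detail worth checking against the drawing code above: per the OpenCV documentation, the rotation vector returned by solvePnP is in axis-angle (Rodrigues) form, where the vector's direction is the rotation axis and its length is the angle in radians, rather than three per-axis Euler angles. The sketch below is a plain C++ version of the Rodrigues formula that cv::Rodrigues implements; rotationFromRvec is a hypothetical name used only for illustration.

```cpp
#include <array>
#include <cmath>

using Mat3 = std::array<std::array<double, 3>, 3>;

// Rodrigues formula: with rvec = theta * k (k a unit axis),
// R = cos(theta) * I + sin(theta) * [k]x + (1 - cos(theta)) * k * k^T.
Mat3 rotationFromRvec(double rx, double ry, double rz) {
    Mat3 R{};
    R[0][0] = R[1][1] = R[2][2] = 1.0;  // identity for a zero rotation
    const double theta = std::sqrt(rx * rx + ry * ry + rz * rz);
    if (theta < 1e-12) {
        return R;
    }
    const double kx = rx / theta, ky = ry / theta, kz = rz / theta;
    const double c = std::cos(theta), s = std::sin(theta), v = 1.0 - c;
    R[0] = {c + v * kx * kx,      v * kx * ky - s * kz, v * kx * kz + s * ky};
    R[1] = {v * ky * kx + s * kz, c + v * ky * ky,      v * ky * kz - s * kx};
    R[2] = {v * kz * kx - s * ky, v * kz * ky + s * kx, c + v * kz * kz};
    return R;
}
```

Only when the vector lies along a single coordinate axis does this coincide with a single-axis rotation by that component; in general, rotating by rv[0], rv[1] and rv[2] about x, y and z in sequence produces a different matrix.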
You have to make sure the axes are facing the correct direction. In particular, the y and z axes face different directions in OpenGL and OpenCV, so that the x-y-z basis stays direct (right-handed). You can find some information and code (with an iPad camera) in this blog post.
-- Edit --
Ah ok. Unfortunately, I used these resources to do it the other way round (opengl ---> opencv) to test some algorithms. My main issue was that the row order of the images was inverted between OpenGL and OpenCV (maybe this helps).
When simulating cameras, I came across the same projection matrices that can be found here and in the generalized projection matrix paper. This paper quoted in the comments of the blog post also shows some link between computer vision and OpenGL projections.
I'm not an iOS programmer, so this answer might be misleading!
If the problem is not in the order of applying the rotations and the translation, then I suggest using a simpler and more commonly used coordinate system.
The points in the corners vector have the origin (0,0) at the top left corner of the image, and the y axis points towards the bottom of the image. From math we are often used to thinking of a coordinate system with the origin at the center and the y axis pointing towards the top of the image. From the coordinates you're pushing into board_verts, I'm guessing you're making the same mistake. If that's the case, it's easy to transform the positions of the corners with something like this:
for (size_t i = 0; i < corners.size(); i++) {
    corners[i].x -= width/2;                  // move the origin to the image center
    corners[i].y = -corners[i].y + height/2;  // flip y so that it points up
}
Then call solvePnP(). Debugging this is not that difficult: just print the positions of the four corners and the estimated R and T, and see if they make sense. Then you can proceed to the OpenGL step. Please let me know how it goes.
How can I make a Core Graphics affine transform for rotation around a point x,y of angle a, using only a single call to CGAffineTransformMake() plus math.h trig functions such as sin(), cos(), etc., and no other CG calls.
Other answers here seem to be about using multiple stacked transforms or multi-step transforms to move, rotate and move, using multiple Core Graphics calls. Those answers do not meet my specific requirements.
A rotation of angle a around the point (x,y) corresponds to the affine transformation:
CGAffineTransform transform = CGAffineTransformMake(cos(a),sin(a),-sin(a),cos(a),x-x*cos(a)+y*sin(a),y-x*sin(a)-y*cos(a));
You may need to plug in -a instead of a depending on whether you want the rotation to be clockwise or counterclockwise. Also, you may need to plug in -y instead of y depending on whether or not your coordinate system is upside down.
Also, you can accomplish precisely the same thing in three lines of code using:
CGAffineTransform transform = CGAffineTransformMakeTranslation(x, y);
transform = CGAffineTransformRotate(transform, a);
transform = CGAffineTransformTranslate(transform,-x,-y);
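The equivalence between the single-call formula and the translate-rotate-translate version can be checked with a small sketch in plain C++. The Affine struct below is a hypothetical stand-in that only mirrors the six CGAffineTransform fields (a, b, c, d, tx, ty) and the way Core Graphics applies them to a point; it is not the Core Graphics API itself.

```cpp
#include <cmath>

// Hypothetical mirror of CGAffineTransform's fields.
struct Affine {
    double a, b, c, d, tx, ty;
};

struct Point {
    double x, y;
};

// Closed form from the answer: rotation by `angle` about (x, y).
Affine rotationAbout(double angle, double x, double y) {
    const double cs = std::cos(angle);
    const double sn = std::sin(angle);
    return {cs, sn, -sn, cs, x - x * cs + y * sn, y - x * sn - y * cs};
}

// Same convention as Core Graphics: x' = a*x + c*y + tx, y' = b*x + d*y + ty.
Point apply(const Affine& t, Point p) {
    return {t.a * p.x + t.c * p.y + t.tx, t.b * p.x + t.d * p.y + t.ty};
}

// Step-by-step version: move the center to the origin, rotate, move back.
Point rotateStepwise(double angle, double x, double y, Point p) {
    const double cs = std::cos(angle);
    const double sn = std::sin(angle);
    const double dx = p.x - x;
    const double dy = p.y - y;
    return {x + dx * cs - dy * sn, y + dx * sn + dy * cs};
}
```

The center (x, y) should map to itself under rotationAbout, and any other point should land where rotateStepwise puts it, which is what makes the two formulations interchangeable.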
If you were applying this to a view, you could also simply use a rotation transform via CGAffineTransformMakeRotation(a), provided you set the view's layer's anchorPoint property to reflect the point you want to rotate around. However, it sounds like you aren't interested in applying this to a view.
Finally, if you are applying this to a non-Euclidean 2D space, you may not want an affine transformation at all. Rotations and translations are isometries of Euclidean space, meaning that they preserve the standard Euclidean distance, as well as angles (general affine transformations, such as non-uniform scales and shears, do not). If your space is not Euclidean, then the transformation you want may not actually be affine, or if it is affine, the matrix for the rotation might not be as simple as what I wrote above with sin and cos. For instance, if you were in a hyperbolic space, you might need to use the hyperbolic trig functions sinh and cosh, along with different + and - signs in the formula.
P.S. I also wanted to remind anyone reading this far that "affine" is pronounced with a short "a" as in "ask", not a long "a" as in "able". I have even heard Apple employees mispronouncing it in their WWDC talks.
For Swift 4:
print(x, y) // where x,y is the point to rotate around
let degrees: CGFloat = 45.0
let transform = CGAffineTransform(translationX: x, y: y)
.rotated(by: degrees * .pi / 180)
.translatedBy(x: -x, y: -y)
For those like me who are struggling to find a complete solution for rotating an image and scaling it properly so that it fills the containing frame: after a couple of hours, this is the most complete and flawless solution that I have obtained.
The trick here is to translate the reference point before any transformation is applied (both scale and rotation). After that, you have to concatenate the two transforms in order to obtain a complete affine transform.
I have packed the whole solution into a CIFilter subclass that you can find in this gist.
The relevant part of the code follows:
CGFloat a = _inputDegree.floatValue;
CGFloat x = _inputImage.extent.size.width/2.0;
CGFloat y = _inputImage.extent.size.height/2.0;
CGFloat scale = [self calculateScaleForAngle:GLKMathRadiansToDegrees(a)];
CGAffineTransform transform = CGAffineTransformMakeTranslation(x, y);
transform = CGAffineTransformRotate(transform, a);
transform = CGAffineTransformTranslate(transform,-x,-y);
CGAffineTransform transform2 = CGAffineTransformMakeTranslation(x, y);
transform2 = CGAffineTransformScale(transform2, scale, scale);
transform2 = CGAffineTransformTranslate(transform2,-x,-y);
CGAffineTransform concate = CGAffineTransformConcat(transform2, transform);
Here are some convenience methods for rotating about an anchor point:
import CoreGraphics

extension CGAffineTransform {
    // Rotation by rotationAngle (radians) about the given anchor point.
    init(rotationAngle: CGFloat, anchor: CGPoint) {
        self.init(
            a: cos(rotationAngle),
            b: sin(rotationAngle),
            c: -sin(rotationAngle),
            d: cos(rotationAngle),
            tx: anchor.x - anchor.x * cos(rotationAngle) + anchor.y * sin(rotationAngle),
            ty: anchor.y - anchor.x * sin(rotationAngle) - anchor.y * cos(rotationAngle)
        )
    }

    func rotated(by angle: CGFloat, anchor: CGPoint) -> Self {
        let transform = Self(rotationAngle: angle, anchor: anchor)
        return self.concatenating(transform)
    }
}
Use the view's layer and anchor point. e.g.
view.layer.anchorPoint = CGPoint(x:0,y:1.0)
I'm currently working on this [opencv sample]
The interesting part is at line 89 warpPerspectiveRand method. I want to set the rotation angle, translation, scaling and other transformation values manually instead of using random generated values. But I don't know how to calculate the matrix elements.
A simple calculation example would be helpful.
double ang = 0.1;
double xscale = 1.2;
double yscale = 1.5;
double xTranslation = 100;
double yTranslation = 200;
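cv::Mat t = cv::Mat::eye(3, 3, CV_64F); // start from identity so the bottom row is (0, 0, 1)
// Upper-left 2x2 block is rotation * scale
t.at<double>(0,0) = xscale*cos(ang);
t.at<double>(1,1) = yscale*cos(ang);
t.at<double>(0,1) = -yscale*sin(ang);
t.at<double>(1,0) = xscale*sin(ang);
t.at<double>(0,2) = xTranslation;
t.at<double>(1,2) = yTranslation;
t.at<double>(2,2) = 1;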
Rotation is always around (0,0). If you would like to rotate around a different point, you need to translate (move) that point to the origin, rotate, and move back. This can be done by creating two matrices, one for rotation (A) and one for the translation that moves the point to the origin (T), and building a new matrix M as:
M = inv(T) * A * T
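As a minimal sketch of this composition in plain C++ (no OpenCV), assuming 3x3 row-major matrices acting on column vectors (x, y, 1) and T defined as the translation that moves the rotation center to the origin; the helper names are hypothetical.

```cpp
#include <array>
#include <cmath>

using Mat3 = std::array<std::array<double, 3>, 3>;

Mat3 multiply(const Mat3& a, const Mat3& b) {
    Mat3 out{};
    for (int r = 0; r < 3; ++r) {
        for (int c = 0; c < 3; ++c) {
            for (int k = 0; k < 3; ++k) {
                out[r][c] += a[r][k] * b[k][c];
            }
        }
    }
    return out;
}

Mat3 translation(double dx, double dy) {
    Mat3 m{};
    m[0] = {1.0, 0.0, dx};
    m[1] = {0.0, 1.0, dy};
    m[2] = {0.0, 0.0, 1.0};
    return m;
}

Mat3 rotation(double ang) {
    Mat3 m{};
    m[0] = {std::cos(ang), -std::sin(ang), 0.0};
    m[1] = {std::sin(ang), std::cos(ang), 0.0};
    m[2] = {0.0, 0.0, 1.0};
    return m;
}

// M = inv(T) * A * T, where T moves (cx, cy) to the origin
// and inv(T) is simply the opposite translation.
Mat3 rotationAbout(double ang, double cx, double cy) {
    const Mat3 T = translation(-cx, -cy);
    const Mat3 Tinv = translation(cx, cy);
    return multiply(Tinv, multiply(rotation(ang), T));
}
```

The resulting M leaves (cx, cy) fixed, and its last column holds the combined translation, so the same nine values could be written directly into a 3x3 matrix such as the one in the example above.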
What you're looking for is a projection matrix.
There are different matrix styles: some are 4x4 (the complete theoretical projection matrix) and some are 3x3 (as in OpenCV). The 3x3 form treats the projection as a transform from one planar surface to another, and this constraint allows the transform to be expressed by a 3x3 matrix.