Arcball Rotation with Quaternions (using iOS GLKit) - ios

I'm looking for a simple implementation for arcball rotation on 3D models with quaternions, specifically using GLKit on iOS. So far, I have examined the following sources:
Arcball rotation with GLKit
How to rotate a 3D object with touches using OpenGL
I've also been trying to understand source code and maths from here and here. I can rotate my object but it keeps jumping around at certain angles, so I fear gimbal lock is at play. I'm using gesture recognizers to control the rotations (pan gestures affect roll and yaw, rotate gestures affect pitch). I'm attaching my code for the quaternion handling as well as the modelview matrix transformation.
Variables:
GLKQuaternion rotationE;
Quaternion Handling:
- (void)rotateWithXY:(float)x and:(float)y
{
const float rate = M_PI/360.0f;
GLKVector3 up = GLKVector3Make(0.0f, 1.0f, 0.0f);
GLKVector3 right = GLKVector3Make(1.0f, 0.0f, 0.0f);
up = GLKQuaternionRotateVector3(GLKQuaternionInvert(self.rotationE), up);
self.rotationE = GLKQuaternionMultiply(self.rotationE, GLKQuaternionMakeWithAngleAndVector3Axis(x*rate, up));
right = GLKQuaternionRotateVector3(GLKQuaternionInvert(self.rotationE), right);
self.rotationE = GLKQuaternionMultiply(self.rotationE, GLKQuaternionMakeWithAngleAndVector3Axis(y*rate, right));
}
- (void)rotateWithZ:(float)z
{
GLKVector3 front = GLKVector3Make(0.0f, 0.0f, -1.0f);
front = GLKQuaternionRotateVector3(GLKQuaternionInvert(self.rotationE), front);
self.rotationE = GLKQuaternionMultiply(self.rotationE, GLKQuaternionMakeWithAngleAndVector3Axis(z, front));
}
Modelview Matrix Transformation (Inside Draw Loop):
// Get Quaternion Rotation
GLKVector3 rAxis = GLKQuaternionAxis(self.transformations.rotationE);
float rAngle = GLKQuaternionAngle(self.transformations.rotationE);
// Set Modelview Matrix
GLKMatrix4 modelviewMatrix = GLKMatrix4Identity;
modelviewMatrix = GLKMatrix4MakeTranslation(0.0f, 0.0f, -0.55f);
modelviewMatrix = GLKMatrix4Rotate(modelviewMatrix, rAngle, rAxis.x, rAxis.y, rAxis.z);
modelviewMatrix = GLKMatrix4Scale(modelviewMatrix, 0.5f, 0.5f, 0.5f);
glUniformMatrix4fv(self.sunShader.uModelviewMatrix, 1, 0, modelviewMatrix.m);
Any help is greatly appreciated, but I do want to keep it as simple as possible and stick to GLKit.

There seem to be a few issues going on here.
You say that you're using [x,y] to pan, but it looks more like you're using them to pitch and yaw. To me, at least, panning is translation, not rotation.
Unless I'm missing something, it also looks like you're replacing the entire rotation every time you try to update it. You rotate a vector by the inverse of the current rotation and then create a quaternion from that vector and some angle. I believe that this is equivalent to creating the quaternion from the original vector and then rotating it by the inverse of the current rotation. So you have q_e'*q_up. Then you multiply that with the current rotation, which gives q_e*q_e'*q_up = q_up. The current rotation is canceled out. This doesn't seem like what you want.
All you really need to do is create a new quaternion from axis-and-angle and then multiply it with the current quaternion. If the new quaternion is on the left, the orientation change will use the eye-local frame. If the new quaternion is on the right, the orientation change will be in the global frame. I think you want:
self.rotationE =
GLKQuaternionMultiply(
GLKQuaternionMakeWithAngleAndVector3Axis(x*rate, up),self.rotationE);
Do this, without the pre-rotation by inverse for all three cases.
I've never used the GLKit, but it's uncommon to extract axis-angle when converting from Quaternion to Matrix. If the angle is zero, the axis is undefined. When it's near zero, you'll have numeric instability. It looks like you should be using GLKMatrix4MakeWithQuaternion and then multiplying the resulting matrix with your translation matrix and scale matrix:
GLKMatrix4 modelviewMatrix =
GLKMatrix4Multiply( GLKMatrix4MakeTranslation(0.0f, 0.0f, -0.55f),
GLKMatrix4MakeWithQuaternion( self.rotationE ) );
modelviewMatrix = GLKMatrix4Scale( modelviewMatrix, 0.5f, 0.5f, 0.5f );
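For reference, this is roughly what a direct quaternion-to-matrix conversion does, and why it avoids the undefined axis at zero angle. It is a sketch in plain C under the assumption of a unit quaternion and OpenGL's column-major layout; Quat and quat_to_matrix are illustrative names, not the GLKit implementation.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for GLKQuaternion: vector part (x, y, z), scalar part w. */
typedef struct { float x, y, z, w; } Quat;

/* Fill m (column-major, element (row r, column c) at m[c*4 + r]) with the
   rotation of the unit quaternion q. No axis or angle is ever extracted,
   so the identity quaternion simply produces the identity matrix. */
static void quat_to_matrix(Quat q, float m[16]) {
    assert(m != NULL);
    float xx = q.x * q.x, yy = q.y * q.y, zz = q.z * q.z;
    float xy = q.x * q.y, xz = q.x * q.z, yz = q.y * q.z;
    float wx = q.w * q.x, wy = q.w * q.y, wz = q.w * q.z;

    m[0] = 1.0f - 2.0f * (yy + zz); m[1] = 2.0f * (xy + wz);        m[2]  = 2.0f * (xz - wy);        m[3]  = 0.0f;
    m[4] = 2.0f * (xy - wz);        m[5] = 1.0f - 2.0f * (xx + zz); m[6]  = 2.0f * (yz + wx);        m[7]  = 0.0f;
    m[8] = 2.0f * (xz + wy);        m[9] = 2.0f * (yz - wx);        m[10] = 1.0f - 2.0f * (xx + yy); m[11] = 0.0f;
    m[12] = 0.0f;                   m[13] = 0.0f;                   m[14] = 0.0f;                    m[15] = 1.0f;
}
```

Every entry is a polynomial in the quaternion components, so there is nothing that can blow up near zero rotation, unlike the axis returned by an axis-angle extraction.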

I was recently asked a bit more about my resulting implementation of this problem, so here it is!
- (void)rotate:(GLKVector3)r
{
// Convert degrees to radians for maths calculations
r.x = GLKMathDegreesToRadians(r.x);
r.y = GLKMathDegreesToRadians(r.y);
r.z = GLKMathDegreesToRadians(r.z);
// Axis Vectors w/ Direction (x=right, y=up, z=front)
// In OpenGL, negative z values go "into" the screen. In GLKit, positive z values go "into" the screen.
GLKVector3 right = GLKVector3Make(1.0f, 0.0f, 0.0f);
GLKVector3 up = GLKVector3Make(0.0f, 1.0f, 0.0f);
GLKVector3 front = GLKVector3Make(0.0f, 0.0f, 1.0f);
// Quaternion w/ Angle and Vector
// Positive angles are counter-clockwise, so convert to negative for a clockwise rotation
GLKQuaternion q = GLKQuaternionIdentity;
q = GLKQuaternionMultiply(GLKQuaternionMakeWithAngleAndVector3Axis(-r.x, right), q);
q = GLKQuaternionMultiply(GLKQuaternionMakeWithAngleAndVector3Axis(-r.y, up), q);
q = GLKQuaternionMultiply(GLKQuaternionMakeWithAngleAndVector3Axis(-r.z, front), q);
// ModelView Matrix
GLKMatrix4 modelViewMatrix = GLKMatrix4Identity;
modelViewMatrix = GLKMatrix4Multiply(modelViewMatrix, GLKMatrix4MakeWithQuaternion(q));
}
Hope you put it to good use :)

Related

Object projection in openGL ES 2

I want to draw an object within AR but got an unexpected result: the GL machine seems to think that I'm viewing the object from the other side (or from inside).
Here is an image of what I want to draw (taken from a separate project)
And here is what I get when I try to draw this object in my AR scene (inside of the sphere)
So I guess the problem is that I put the object inside the sphere and adjust the position of the object using the base matrix from the sphere object.
The camera is positioned in the center of the sphere, so for this object I use the same matrix and just scale/rotate/translate it.
This is how I calculate the projection matrix
CGRect viewFrame = self.frame;
if (!CGSizeEqualToSize(self.newSize, CGSizeZero)) {
size = self.newSize;
}
CGFloat aspect = viewFrame.size.width / viewFrame.size.height;
CGFloat scale = self.interractor.scale;
CGFloat FOVY = DEGREES_TO_RADIANS(self.viewScale) / scale;
CGFloat cameraDistanse = -(1.0 / [Utilities FarZ]);
GLKMatrix4 cameraTranslation = GLKMatrix4MakeTranslation(0, 0, cameraDistanse);
GLKMatrix4 projectionMatrix = GLKMatrix4MakePerspective(FOVY, aspect, NearZ, [Utilities FarZ]);
projectionMatrix = GLKMatrix4Multiply(projectionMatrix, cameraTranslation);
//and also here added some code for modifying, but I skip it here
For this object I just calculate a new scale and position. That part looks correct, because I'm able to see the object and change its position etc., so I skip it here.
In the second project, where I get the correct result when displaying the object, I calculate the projection matrix in a similar way, but with a little less calculation:
float aspect = self.glView.frame.size.width / self.glView.frame.size.height;
GLKMatrix4 projectionMatrix = GLKMatrix4MakePerspective(GLKMathDegreesToRadians(65.0f), aspect, 0.01f, 100);
//scale
//rotate
//translate
GLKMatrix4 modelViewMatrix = GLKMatrix4MakeTranslation(0.0f, 0.0f, -1.5f);
modelViewMatrix = GLKMatrix4Multiply(modelViewMatrix, projectionMatrix);
GLfloat scale = 0.5 *_scale;
GLKMatrix4 scaleMatrix = GLKMatrix4MakeScale(scale, scale, scale);
modelViewMatrix = GLKMatrix4Translate(modelViewMatrix, _positionX, _positionY, -5);
modelViewMatrix = GLKMatrix4Rotate(modelViewMatrix, _rotationX, 0.0f, 1.0f, 0.0f);
modelViewMatrix = GLKMatrix4Rotate(modelViewMatrix, _rotationY, 1.0f, 0.0f, 0.0f);
modelViewMatrix = GLKMatrix4Multiply(scaleMatrix, modelViewMatrix);
In the first project (the correct one) I also use
glEnable(GL_DEPTH_TEST);
glDepthMask(GL_TRUE);
glDisable(GL_CULL_FACE);
In the second, with a few objects, the state depends on the object that I want to draw:
glClearColor(0.0f, 1.0f, 0.0f, 1.0f);
glClear(GL_COLOR_BUFFER_BIT | GL_DEPTH_BUFFER_BIT);
glDisable(GL_SCISSOR_TEST);
glDisable(GL_DEPTH_TEST);
glDisable(GL_CULL_FACE);
//call sphere draw
glEnable(GL_DEPTH_TEST);
glEnable(GL_CULL_FACE);
//call obj draw
glEnable(GL_SCISSOR_TEST);
So as I said before, I guess the problem is that OpenGL "thinks" we are looking at the object from the other side, but I'm not sure. If I'm right, how can I fix this? Or what is done incorrectly?
Update
@codetiger I checked your suggestions:
1) Wrong face winding order: I rechecked it and tried inverting the order, and also tried building the same model in another project (where everything works perfectly). As a result, I guess the order is OK;
2) Wrong culling: I checked all combinations of
glDisable / glEnable with argument GL_CULL_FACE
glCullFace with argument GL_FRONT, GL_BACK or GL_FRONT_AND_BACK
glFrontFace with argument GL_CW or GL_CCW
What I see changes a little, but the object is still incorrect (wrong side, partial object, etc.)
3) Vertices are flipped: I tried flipping them, and the result is even worse than before
4) I tried combining these 3 suggestions with one another; the result is not acceptable

OpenGL draw multiple isometric cubes

I'm trying to draw multiple cubes at an isometric camera angle. Here's the code that draws one. (OpenGL ES 2.0 with GLKit on iOS).
float startZ = -4.0f;
// position
GLKMatrix4 modelViewMatrix = GLKMatrix4Identity;
modelViewMatrix = GLKMatrix4Translate(modelViewMatrix, location.x, location.y, location.z + startZ);
// isometric camera angle
modelViewMatrix = GLKMatrix4Rotate(modelViewMatrix, GLKMathDegreesToRadians(45), 1.0, 0, 0);
modelViewMatrix = GLKMatrix4Rotate(modelViewMatrix, GLKMathDegreesToRadians(45), 0.0, 1.0, 0);
self.effect.transform.modelviewMatrix = modelViewMatrix;
[self.effect prepareToDraw];
glDrawArrays(GL_TRIANGLES, 0, 36);
The problem is that it is translating first, then rotating, which means with more than one box, they do not line up (they look like a chain of diamonds. Each one is in position and rotated so the corners overlap).
I've tried switching the order so the rotation is before the translation, but they don't show up at all. My vertex array is bound to a unit cube centered around the origin.
I really don't understand how to control the camera separate from the object. I screwed around with the projection matrix for a while without getting it. As far as I understand, the camera is supposed to be controlled with the modelViewMatrix, right? (The "View" part).
Your 'camera' transform (modelview) seems correct, however it looks like you're using a perspective projection - if you want isometric you will need to change your projection matrix.
It looks like you are applying the camera rotation to each object as you draw it. Instead, simulate a 2-deep matrix stack if your use-case is this simple, so you just have your camera matrix and each cube's matrix.
Set your projection and camera matrices - keep a reference to your camera matrix.
For each cube generate the individual cube transformation matrix (which should probably consist of translation only - no rotation so the cubes remain axis aligned - I think this is what you're going for).
Backwards-multiply your cube matrix by the camera matrix and use that as the modelview matrix.
Note that the camera matrix will remain unchanged for each cube you render, but the modelview matrix will incorporate both the cube's individual transformation matrix and the camera matrix into a single modelview matrix. This is equivalent to the old matrix stack methods glPushMatrix and glPopMatrix (not available in GLES 2.0). If you need more complex object hierarchies (where the cubes have child-objects in their 'local' coordinate space) then you should probably implement your own full matrix stack, instead of the 2-deep equivalent discussed above.
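The "2-deep stack" described above boils down to one matrix product per cube. Here is a minimal sketch in plain C with column-major 4x4 arrays; mat4_mul, mat4_translation and cube_modelview are hypothetical helper names for illustration (with GLKit you would use GLKMatrix4Multiply and GLKMatrix4MakeTranslation instead).

```c
#include <assert.h>
#include <string.h>

/* out = a * b for column-major 4x4 matrices (element (row r, column c) at [c*4 + r]).
   A temporary is used so out may alias a or b. */
static void mat4_mul(const float a[16], const float b[16], float out[16]) {
    float tmp[16];
    for (int c = 0; c < 4; c++) {
        for (int r = 0; r < 4; r++) {
            float sum = 0.0f;
            for (int k = 0; k < 4; k++) {
                sum += a[k * 4 + r] * b[c * 4 + k];
            }
            tmp[c * 4 + r] = sum;
        }
    }
    memcpy(out, tmp, sizeof tmp);
}

/* Identity with the translation stored in the last column (indices 12..14). */
static void mat4_translation(float x, float y, float z, float out[16]) {
    memset(out, 0, 16 * sizeof(float));
    out[0] = out[5] = out[10] = out[15] = 1.0f;
    out[12] = x; out[13] = y; out[14] = z;
}

/* Per-cube modelview: the shared camera matrix on the left, the cube's own
   translation-only matrix on the right. The camera matrix is not modified. */
static void cube_modelview(const float camera[16], float x, float y, float z, float out[16]) {
    assert(camera != NULL && out != NULL);
    float object[16];
    mat4_translation(x, y, z, object);
    mat4_mul(camera, object, out);
}
```

The camera matrix is built once per frame; only the cheap translation and one multiply are repeated for each cube.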
For reference, here's an article that helped me understand it. It's a little mathy, but does a good job explaining things intuitively.
http://db-in.com/blog/2011/04/cameras-on-opengl-es-2-x/
I ended up keeping the perspective projection, because I don't want true isometric. The key was to do them in the right order, because moving the camera is the inverse of moving the object. See the comments, and the article. Working code:
// the one you want to happen first is multiplied LAST
// camRotate * camScale * camTranslate * objTranslate * objScale * objRotate;
// TODO cache the camera matrix
// the camera angle remains the same for all objects
GLKMatrix4 camRotate = GLKMatrix4MakeRotation(GLKMathDegreesToRadians(45), 1, 0, 0);
camRotate = GLKMatrix4Rotate(camRotate, GLKMathDegreesToRadians(45), 0, 1, 0);
GLKMatrix4 camTranslate = GLKMatrix4MakeTranslation(4, -5, -4.0);
GLKMatrix4 objTranslate = GLKMatrix4MakeTranslation(location.x, location.y, location.z);
GLKMatrix4 modelViewMatrix = GLKMatrix4Multiply(camRotate, camTranslate);
modelViewMatrix = GLKMatrix4Multiply(modelViewMatrix, objTranslate);
self.effect.transform.modelviewMatrix = modelViewMatrix;
[self.effect prepareToDraw];
glDrawArrays(GL_TRIANGLES, 0, 36);

find the point in space ahead of current view in opengl on iphone

I've got some code working to create a 3D view in OpenGL and then use the device motion to look around within it. I know this is working because I can place 3D cubes in space around me and see that they are in the right places. (I'm just creating them with x/y/z coordinates.)
The code uses the rotation matrix of the device and then applies it to the various blocks.
CMRotationMatrix r = dm.attitude.rotationMatrix;
GLKMatrix4 baseModelViewMatrix = GLKMatrix4Make(r.m11, r.m21, r.m31, 0.0f,
r.m12, r.m22, r.m32, 0.0f,
r.m13, r.m23, r.m33, 0.0f,
0.0f, 0.0f, 0.0f, 1.0f);
float aspect = fabsf(self.view.bounds.size.width / self.view.bounds.size.height);
GLKMatrix4 projectionMatrix = GLKMatrix4MakePerspective(GLKMathDegreesToRadians(65.0f), aspect, kNearZ, kFarZ);
block.effect.transform.projectionMatrix = projectionMatrix;
What I want to be able to do is figure out where I'm looking and then create a block out in front of me.
I've had some limited success by creating a vector in one direction, applying the rotation matrix to it and then reading off its new values. But it only works in some of the directions; when I rotate too far it messes up.
GLKVector4 vect = GLKVector4Make(0.0f,0.0f,10.0f,1.0f);
GLKVector4 newVec = GLKMatrix4MultiplyVector4(baseModelViewMatrix,vect);
I then read off newVec.x,newVec.y,newVec.z and use them to place the cube.
Can someone tell me if I'm on the right track here? Is there an easier way to achieve this?
The maths of it all is quite daunting.
UPDATE:
I've had some partial success using
GLKVector3 newVec1 = GLKVector3Make(-r.m22, -r.m33, r.m21);
This only works in one landscape orientation, and also only works in a cylinder around my current location. The up/down axis isn't quite right.
Are these parts of the rotation matrix sufficient to get a point anywhere around me?
UPDATE 2:
Thought it might help to post some more code to make it really clear.
This is how everything is getting displayed.
//1- get device position
CMDeviceMotion *dm = motionManager.deviceMotion;
CMRotationMatrix r = dm.attitude.rotationMatrix;
GLKMatrix4 baseModelViewMatrix = GLKMatrix4Make(r.m11, r.m21, r.m31, 0.0f,
r.m12, r.m22, r.m32, 0.0f,
r.m13, r.m23, r.m33, 0.0f,
0.0f, 0.0f, 0.0f, 1.0f);
GLKMatrix4 projectionMatrix = GLKMatrix4MakePerspective(GLKMathDegreesToRadians(45.0f), aspect, kNearZ, kFarZ);
//2- work out position ahead of current view
GLKVector3 newVector = GLKVector3Make(-r.m22, -r.m33, r.m21);
//place cube in 3d space- this works fine when just positioning with x,y,z co-ordinates
cube.effect.transform.projectionMatrix = projectionMatrix;
GLKMatrix4 modelViewMatrix = GLKMatrix4Identity;
modelViewMatrix = GLKMatrix4Translate(modelViewMatrix, newVector.x*100, newVector.y*100, newVector.z*100);
modelViewMatrix = GLKMatrix4Multiply(baseModelViewMatrix, modelViewMatrix);
cube.effect.transform.modelviewMatrix = modelViewMatrix;
The vector you are looking along should probably be multiplied by the inverted matrix. You didn't write what results you are getting, but here are a few possible solutions:
newVect1 = (r.m13, r.m23, r.m33)
newVect2 = (-r.m13, -r.m23, r.m33)
multiply (0,0,1) with inverted matrix
EDIT: (To add some explanations)
If your baseModelViewMatrix is responsible for rotating all the objects around you, then the vector that faces forward (display-wise) should be rotated by the inverse of baseModelViewMatrix. If, when baseModelViewMatrix is the identity, the correct forward-facing vector is (0,0,1), then just multiply (0,0,1) by the inverted matrix (you can and should simply try that: set the matrix to identity and place the object at (0,0,100) or whatever). If it is not, you should find the vector that is facing forward and do the same with it.
Concerning the 2nd method I posted, newVect2 = (-r.m13, -r.m23, r.m33): it is specifically meant for the case where (0,0,1) is the forward vector. The idea is that a rotation matrix consists of 3 base vectors x, y, z, where each row represents one of them respectively (this always works for creating a rotation matrix, but might not work in reverse). If the 3rd row does represent the z base vector, then the vector (0,0,1) will be transformed to (r.m13, r.m23, r.m33), and the one you seem to be facing is that vector mirrored through the normal (0,0,1). The operation for that would be R' = normal*(2*dot(normal, R)) - R, and for the case where normal is (0,0,1) and R is (r.m13, r.m23, r.m33), the result R' is (-r.m13, -r.m23, r.m33). Just as a note, it might be the other way around, with the z base vector being (r.m31, r.m32, r.m33).
In any case, if the 3rd option does not work, neither will the 2nd, and the reason is probably that (0,0,1) is not the forward vector for identity.
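As a sketch of the "multiply the forward vector by the inverted matrix" option: for a pure rotation matrix the inverse is just the transpose, so no general matrix inversion is needed. The Rot3 struct below is a hypothetical stand-in for CMRotationMatrix (same m11..m33 naming), point_ahead is an illustrative name, and the forward vector is left as a parameter because which direction counts as "forward" under the identity is exactly the open assumption discussed above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for CMRotationMatrix; mRC is row R, column C. */
typedef struct {
    double m11, m12, m13;
    double m21, m22, m23;
    double m31, m32, m33;
} Rot3;

/* out = transpose(R) * forward * distance.
   Since R is a rotation, transpose(R) is its inverse, so R * out == forward * distance:
   a cube placed at `out` ends up `distance` units along `forward` after R is applied. */
static void point_ahead(const Rot3 *r, const double forward[3], double distance, double out[3]) {
    assert(r != NULL && forward != NULL && out != NULL);
    out[0] = (r->m11 * forward[0] + r->m21 * forward[1] + r->m31 * forward[2]) * distance;
    out[1] = (r->m12 * forward[0] + r->m22 * forward[1] + r->m32 * forward[2]) * distance;
    out[2] = (r->m13 * forward[0] + r->m23 * forward[1] + r->m33 * forward[2]) * distance;
}
```

If the cube lands behind you instead of in front, try the opposite sign for forward (for example (0,0,-1) instead of (0,0,1)) before changing anything else.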

OpenCV: rotation/translation vector to OpenGL modelview matrix

I'm trying to use OpenCV to do some basic augmented reality. The way I'm going about it is using findChessboardCorners to get a set of points from a camera image. Then, I create a 3D quad along the z = 0 plane and use solvePnP to get a homography between the imaged points and the planar points. From that, I figure I should be able to set up a modelview matrix which will allow me to render a cube with the right pose on top of the image.
The documentation for solvePnP says that it outputs a rotation vector "that (together with [the translation vector] ) brings points from the model coordinate system to the camera coordinate system." I think that's the opposite of what I want; since my quad is on the plane z = 0, I want a modelview matrix which will transform that quad to the appropriate 3D plane.
I thought that by performing the opposite rotations and translations in the opposite order I could calculate the correct modelview matrix, but that seems not to work. While the rendered object (a cube) does move with the camera image and seems to be roughly correct translationally, the rotation just doesn't work at all; it rotates on multiple axes when it should only be rotating on one, and sometimes in the wrong direction. Here's what I'm doing so far:
std::vector<Point2f> corners;
bool found = findChessboardCorners(*_imageBuffer, cv::Size(5,4), corners,
CV_CALIB_CB_FILTER_QUADS |
CV_CALIB_CB_FAST_CHECK);
if(found)
{
drawChessboardCorners(*_imageBuffer, cv::Size(5, 4), corners, found);
std::vector<double> distortionCoefficients(5); // camera distortion
distortionCoefficients[0] = 0.070969;
distortionCoefficients[1] = 0.777647;
distortionCoefficients[2] = -0.009131;
distortionCoefficients[3] = -0.013867;
distortionCoefficients[4] = -5.141519;
// Since the image was resized, we need to scale the found corner points
float sw = _width / SMALL_WIDTH;
float sh = _height / SMALL_HEIGHT;
std::vector<Point2f> board_verts;
board_verts.push_back(Point2f(corners[0].x * sw, corners[0].y * sh));
board_verts.push_back(Point2f(corners[15].x * sw, corners[15].y * sh));
board_verts.push_back(Point2f(corners[19].x * sw, corners[19].y * sh));
board_verts.push_back(Point2f(corners[4].x * sw, corners[4].y * sh));
Mat boardMat(board_verts);
std::vector<Point3f> square_verts;
square_verts.push_back(Point3f(-1, 1, 0));
square_verts.push_back(Point3f(-1, -1, 0));
square_verts.push_back(Point3f(1, -1, 0));
square_verts.push_back(Point3f(1, 1, 0));
Mat squareMat(square_verts);
// Transform the camera's intrinsic parameters into an OpenGL camera matrix
glMatrixMode(GL_PROJECTION);
glLoadIdentity();
// Camera parameters
double f_x = 786.42938232; // Focal length in x axis
double f_y = 786.42938232; // Focal length in y axis (usually the same?)
double c_x = 217.01358032; // Camera primary point x
double c_y = 311.25384521; // Camera primary point y
cv::Mat cameraMatrix(3,3,CV_32FC1);
cameraMatrix.at<float>(0,0) = f_x;
cameraMatrix.at<float>(0,1) = 0.0;
cameraMatrix.at<float>(0,2) = c_x;
cameraMatrix.at<float>(1,0) = 0.0;
cameraMatrix.at<float>(1,1) = f_y;
cameraMatrix.at<float>(1,2) = c_y;
cameraMatrix.at<float>(2,0) = 0.0;
cameraMatrix.at<float>(2,1) = 0.0;
cameraMatrix.at<float>(2,2) = 1.0;
Mat rvec(3, 1, CV_32F), tvec(3, 1, CV_32F);
solvePnP(squareMat, boardMat, cameraMatrix, distortionCoefficients,
rvec, tvec);
_rv[0] = rvec.at<double>(0, 0);
_rv[1] = rvec.at<double>(1, 0);
_rv[2] = rvec.at<double>(2, 0);
_tv[0] = tvec.at<double>(0, 0);
_tv[1] = tvec.at<double>(1, 0);
_tv[2] = tvec.at<double>(2, 0);
}
Then in the drawing code...
GLKMatrix4 modelViewMatrix = GLKMatrix4MakeTranslation(0.0f, 0.0f, 0.0f);
modelViewMatrix = GLKMatrix4Translate(modelViewMatrix, -tv[1], -tv[0], -tv[2]);
modelViewMatrix = GLKMatrix4Rotate(modelViewMatrix, -rv[0], 1.0f, 0.0f, 0.0f);
modelViewMatrix = GLKMatrix4Rotate(modelViewMatrix, -rv[1], 0.0f, 1.0f, 0.0f);
modelViewMatrix = GLKMatrix4Rotate(modelViewMatrix, -rv[2], 0.0f, 0.0f, 1.0f);
The vertices I'm rendering create a cube of unit length around the origin (i.e. from -0.5 to 0.5 along each edge). I know that OpenGL's transformation functions are applied in "reverse order," so the above should rotate the cube along the z, y, and then x axes, and then translate it. However, it seems like it's being translated first and then rotated, so perhaps Apple's GLKMatrix4 works differently?
This question seems very similar to mine, and in particular coder9's answer seems like it might be more or less what I'm looking for. However, I tried it and compared the results to my method, and the matrices I arrived at in both cases were the same. I feel like that answer is right, but that I'm missing some crucial detail.
You have to make sure the axes are facing the correct direction. In particular, the y and z axes face opposite directions in OpenGL and OpenCV, so that the x-y-z basis stays direct (right-handed). You can find some information and code (with an iPad camera) in this blog post.
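A rough sketch of that axis flip in plain C, under two assumptions that are mine and not from the blog post: the rotation vector from solvePnP (which is an axis-angle vector, not three per-axis angles) has already been expanded into a 3x3 rotation matrix, for example with cv::Rodrigues, and the OpenGL side wants a column-major 4x4 modelview. The function name cv_pose_to_gl_modelview is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* R: 3x3 rotation, row-major, t: translation, both in OpenCV camera coordinates
   (x right, y down, z forward). out: column-major 4x4 for OpenGL eye coordinates
   (x right, y up, z toward the viewer). The conversion negates the y and z rows. */
static void cv_pose_to_gl_modelview(const double R[9], const double t[3], float out[16]) {
    assert(R != NULL && t != NULL && out != NULL);
    const double flip[3] = { 1.0, -1.0, -1.0 };
    for (int row = 0; row < 3; row++) {
        for (int col = 0; col < 3; col++) {
            out[col * 4 + row] = (float)(flip[row] * R[row * 3 + col]);
        }
        out[12 + row] = (float)(flip[row] * t[row]);  /* translation column */
    }
    out[3] = 0.0f; out[7] = 0.0f; out[11] = 0.0f; out[15] = 1.0f;
}
```

With an identity rotation and t = (0, 0, 5), the model ends up at z = -5 in OpenGL eye space, i.e. in front of the camera, which is a quick sanity check for the sign convention.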
-- Edit --
Ah ok. Unfortunately, I used these resources to do it the other way round (opengl ---> opencv) to test some algorithms. My main issue was that the row order of the images was inverted between OpenGL and OpenCV (maybe this helps).
When simulating cameras, I came across the same projection matrices that can be found here and in the generalized projection matrix paper. This paper quoted in the comments of the blog post also shows some link between computer vision and OpenGL projections.
I'm not an iOS programmer, so this answer might be misleading!
If the problem is not in the order of applying the rotations and the translation, then I suggest using a simpler and more commonly used coordinate system.
The points in the corners vector have the origin (0,0) at the top left corner of the image and the y axis is towards the bottom of the image. Often from math we are used to thinking of the coordinate system with the origin at the center and the y axis towards the top of the image. From the coordinates you're pushing into board_verts I'm guessing you're making the same mistake. If that's the case, it's easy to transform the positions of the corners by something like this:
for (size_t i = 0; i < corners.size(); i++) {
corners[i].x -= width/2;
corners[i].y = -corners[i].y + height/2;
}
then you call solvePnP(). Debugging this is not that difficult, just print the positions of the four corners and the estimated R and T, and see if they make sense. Then you can proceed to the OpenGL step. Please let me know how it goes.

OpenGL ES 2.0: Why does this perspective projection matrix not give the right result?

About 2 days ago I decided to write code to explicitly calculate the Model-View-Projection ("MVP") matrix to understand how it worked. Since then I've had nothing but trouble, seemingly because of the projection matrix I'm using.
Working with an iPhone display, I create a screen centered square described by these 4 corner vertices:
const CGFloat cx = screenWidth/2.0f;
const CGFloat cy = screenHeight/2.0f;
const CGFloat z = -1.0f;
const CGFloat dim = 50.0f;
vxData[0] = cx-dim;
vxData[1] = cy-dim;
vxData[2] = z;
vxData[3] = cx-dim;
vxData[4] = cy+dim;
vxData[5] = z;
vxData[6] = cx+dim;
vxData[7] = cy+dim;
vxData[8] = z;
vxData[9] = cx+dim;
vxData[10] = cy-dim;
vxData[11] = z;
Since I am using OGLES 2.0 I pass the MVP as a uniform to my vertex shader, then simply apply the transformation to the current vertex position:
uniform mat4 mvp;
attribute vec3 vpos;
void main()
{
gl_Position = mvp * vec4(vpos, 1.0);
}
For now I have simplified my MVP to just be the P matrix. There are two projection matrices listed in the code shown below. The first is the standard perspective projection matrix, and the second is an explicit-value projection matrix I found online.
CGRect screenBounds = [[UIScreen mainScreen] bounds];
const CGFloat screenWidth = screenBounds.size.width;
const CGFloat screenHeight = screenBounds.size.height;
const GLfloat n = 0.01f;
const GLfloat f = 100.0f;
const GLfloat fov = 60.0f * 2.0f * M_PI / 360.0f;
const GLfloat a = screenWidth/screenHeight;
const GLfloat d = 1.0f / tanf(fov/2.0f);
// Standard perspective projection.
GLKMatrix4 projectionMx = GLKMatrix4Make(d/a, 0.0f, 0.0f, 0.0f,
0.0f, d, 0.0f, 0.0f,
0.0f, 0.0f, (n+f)/(n-f), -1.0f,
0.0f, 0.0f, (2*n*f)/(n-f), 0.0f);
// The one I found online.
GLKMatrix4 projectionMx = GLKMatrix4Make(2.0f/screenWidth,0.0f,0.0f,0.0f,
0.0f,2.0f/-screenHeight,0.0f,0.0f,
0.0f,0.0f,1.0f,0.0f,
-1.0f,1.0f,0.0f,1.0f);
When using the explicit value matrix, the square renders exactly as desired in the centre of the screen with correct dimension. When using the perspective projection matrix, nothing is displayed on-screen. I've done printouts of the position values generated for screen centre (screenWidth/2, screenHeight/2, 0) by the perspective projection matrix and they're enormous. The explicit value matrix correctly produces zero.
I think the explicit value matrix is an orthographic projection matrix - is that right? My frustration is that I can't work out why my perspective projection matrix fails to work.
I'd be tremendously grateful if someone could help me with this problem. Many thanks.
UPDATE For Christian Rau:
#define Zn 0.0f
#define Zf 100.0f
#define PRIMITIVE_Z 1.0f
//...
CGRect screenBounds = [[UIScreen mainScreen] bounds];
const CGFloat screenWidth = screenBounds.size.width;
const CGFloat screenHeight = screenBounds.size.height;
//...
glUseProgram(program);
//...
glViewport(0.0f, 0.0f, screenBounds.size.width, screenBounds.size.height);
//...
const CGFloat cx = screenWidth/2.0f;
const CGFloat cy = screenHeight/2.0f;
const CGFloat z = PRIMITIVE_Z;
const CGFloat dim = 50.0f;
vxData[0] = cx-dim;
vxData[1] = cy-dim;
vxData[2] = z;
vxData[3] = cx-dim;
vxData[4] = cy+dim;
vxData[5] = z;
vxData[6] = cx+dim;
vxData[7] = cy+dim;
vxData[8] = z;
vxData[9] = cx+dim;
vxData[10] = cy-dim;
vxData[11] = z;
//...
const GLfloat n = Zn;
const GLfloat f = Zf;
const GLfloat fov = 60.0f * 2.0f * M_PI / 360.0f;
const GLfloat a = screenWidth/screenHeight;
const GLfloat d = 1.0f / tanf(fov/2.0f);
GLKMatrix4 projectionMx = GLKMatrix4Make(d/a, 0.0f, 0.0f, 0.0f,
0.0f, d, 0.0f, 0.0f,
0.0f, 0.0f, (n+f)/(n-f), -1.0f,
0.0f, 0.0f, (2*n*f)/(n-f), 0.0f);
//...
// ** Here is the matrix you recommended, Christian:
GLKMatrix4 ts = GLKMatrix4Make(2.0f/screenWidth, 0.0f, 0.0f, -1.0f,
0.0f, 2.0f/screenHeight, 0.0f, -1.0f,
0.0f, 0.0f, 1.0f, 0.0f,
0.0f, 0.0f, 0.0f, 1.0f);
GLKMatrix4 mvp = GLKMatrix4Multiply(projectionMx, ts);
UPDATE 2
The new MVP code:
GLKMatrix4 ts = GLKMatrix4Make(2.0f/screenWidth, 0.0f, 0.0f, -1.0f,
0.0f, 2.0f/-screenHeight, 0.0f, 1.0f,
0.0f, 0.0f, 1.0f, 0.0f,
0.0f, 0.0f, 0.0f, 1.0f);
// Using Apple perspective, view matrix generators
// (I can solve bugs in my own implementation later..!)
GLKMatrix4 _p = GLKMatrix4MakePerspective(60.0f * 2.0f * M_PI / 360.0f,
screenWidth / screenHeight,
Zn, Zf);
GLKMatrix4 _mv = GLKMatrix4MakeLookAt(0.0f, 0.0f, 1.0f,
0.0f, 0.0f, -1.0f,
0.0f, 1.0f, 0.0f);
GLKMatrix4 _mvp = GLKMatrix4Multiply(_p, _mv);
GLKMatrix4 mvp = GLKMatrix4Multiply(_mvp, ts);
Still nothing visible at the screen centre, and the transformed x,y coordinates of the screen centre are not zero.
UPDATE 3
Using the transpose of ts instead in the above code works! But the square no longer appears square; it appears to now have aspect ratio screenHeight/screenWidth i.e. it has a longer dimension parallel to the (short) screen width, and a shorter dimension parallel to the (long) screen height.
I'd very much like to know (a) why the transpose is required and whether it is a valid fix, (b) how to correctly rectify the non-square dimension, and (c) how this additional matrix transpose(ts) that we use fits into the transformation chain of Viewport * Projection * View * Model * Point .
For (c): I understand what the matrix does, i.e. the explanation by Christian Rau as to how we transform to range [-1, 1]. But is it correct to include this additional work as a separate transformation matrix, or should some part of our MVP chain be doing this work instead?
Sincere thanks go to Christian Rau for his valuable contribution thus far.
UPDATE 4
My question about "how ts fits in" is silly isn't it - the whole point is the matrix is only needed because I'm choosing to use screen coordinates for my vertices; if I were to use coordinates in world space from the start then this work wouldn't be needed!
Thanks Christian for all your help, it's been invaluable :) Problem solved.
The reason for this is that your first projection matrix doesn't account for the scaling and translation part of the transformation, whereas the second matrix does.
So, since your modelview matrix is identity, the first projection matrix assumes the model's coordinates to lie somewhere in [-1,1], whereas the second matrix already contains the scaling and translation part (look at the screenWidth/Height values in there) and therefore assumes the coordinates to lie in [0,screenWidth] x [0,screenHeight].
So you have to right-multiply your projection matrix by a matrix that first scales [0,screenWidth] down to [0,2] and [0,screenHeight] down to [0,2] and then translates [0,2] into [-1,1] (using w for screenWidth and h for screenHeight):
[ 2/w 0 0 -1 ]
[ 0 2/h 0 -1 ]
[ 0 0 1 0 ]
[ 0 0 0 1 ]
which will result in the matrix
[ 2*d/(a*w)  0      0            -d/a        ]
[ 0          2*d/h  0            -d          ]
[ 0          0      (n+f)/(n-f)  2*n*f/(n-f) ]
[ 0          0      -1           0           ]
So you see that your second matrix corresponds to a fov of 90 degrees, an aspect ratio of 1:1 and a near-far range of [-1,1]. Additionally it also inverts the y-axis, so that the origin is in the upper-left, which results in the second row being negated:
[ 0 -2*d/h 0 d ]
But as a final comment, I suggest you not configure the projection matrix to account for all this. Instead your projection matrix should look like the first one and you should let the modelview matrix manage any translation or scaling of your world. It is not by accident that the transformation pipeline was separated into modelview and projection matrix, and you should keep this separation when using shaders too. You can of course still multiply both matrices together on the CPU and upload a single MVP matrix to the shader.
And in general you don't really use a screen-based coordinate system when working with a 3-dimensional world. You would only want to do this if you are drawing 2D graphics (like GUI elements or HUDs), and in this case you would use a simpler orthographic projection matrix anyway, which is nothing more than the above-mentioned scale-translate matrix without all the perspective complexity.
EDIT: To your 3rd update:
(a) The transpose is required because I guess your GLKMatrix4Make function accepts its parameters in column-major format and you put the matrix in row-wise.
(b) I made a little mistake. You should change the screenWidth in the ts matrix into screenHeight (or maybe the other way around, not sure). We actually need a uniform scale, because the aspect ratio is already taken care of by the projection matrix.
(c) It is not easy to classify this matrix into the usual MVP pipeline. This is because it is not really common. Let's look at the two common cases of rendering:
3D: When you have a 3-dimensional world it is not really common to define its coordinates in screen-based units, because there is not yet a mapping from 3D scene to 2D screen, and using a coordinate system where units equal pixels just doesn't make sense. In this setup you most likely would classify it as part of the modelview matrix for transforming the world into another unit system. But in this case you would need real 3D transformations and not just such a half-baked 2D solution.
2D: When rendering a 2D scene (like a GUI or a HUD or just some text), you sometimes really want a screen-based coordinate system. But in this case you most likely would use an orthographic projection (without any perspective). Such an orthographic matrix is actually nothing more than this ts matrix (with some additional scale-translate for z, based on the near-far range). So in this case the matrix belongs to, or actually is, the projection matrix. Just look at how the good old glOrtho function constructs its matrix and you'll see it's nothing more than ts.
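To illustrate points (a) and the 2D case together, here is a small sketch in plain C of the pixel-to-[-1,1] scale-translate matrix stored in column-major order, which is the order GLKMatrix4Make takes its arguments in. The names screen_to_ndc and mat4_transform_point are hypothetical helpers; the y flip (origin in the upper-left) matches the explicit-value matrix from the question.

```c
#include <assert.h>
#include <string.h>

/* Column-major matrix mapping x in [0,w] to [-1,1] and y in [0,h] to [1,-1]
   (pixel origin in the upper-left corner). In column-major storage the
   translation lives in the LAST four elements (indices 12..15), not in the
   last element of each group of four. */
static void screen_to_ndc(float w, float h, float out[16]) {
    assert(w > 0.0f && h > 0.0f);
    memset(out, 0, 16 * sizeof(float));
    out[0]  =  2.0f / w;
    out[5]  = -2.0f / h;
    out[10] =  1.0f;
    out[12] = -1.0f;
    out[13] =  1.0f;
    out[15] =  1.0f;
}

/* out = m * (p.x, p.y, p.z, 1) for a column-major 4x4 matrix m. */
static void mat4_transform_point(const float m[16], const float p[3], float out[4]) {
    for (int r = 0; r < 4; r++) {
        out[r] = m[r] * p[0] + m[4 + r] * p[1] + m[8 + r] * p[2] + m[12 + r];
    }
}
```

Feeding the screen centre (w/2, h/2, 0) through this matrix gives (0, 0, 0, 1), which is the zero result the question reports for the matrix that works; writing the same sixteen numbers row by row instead would put the -1 and 1 into the w row and break the mapping.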
