I'm trying to use OpenCV to do some basic augmented reality. The way I'm going about it is using findChessboardCorners to get a set of points from a camera image. Then, I create a 3D quad along the z = 0 plane and use solvePnP to get the pose relating the imaged points to the planar points. From that, I figure I should be able to set up a modelview matrix which will allow me to render a cube with the right pose on top of the image.
The documentation for solvePnP says that it outputs a rotation vector "that (together with [the translation vector]) brings points from the model coordinate system to the camera coordinate system." I think that's the opposite of what I want; since my quad is on the plane z = 0, I want a modelview matrix which will transform that quad to the appropriate 3D plane.
I thought that by performing the opposite rotations and translations in the opposite order I could calculate the correct modelview matrix, but that seems not to work. While the rendered object (a cube) does move with the camera image and seems to be roughly correct translationally, the rotation just doesn't work at all: it rotates on multiple axes when it should only be rotating on one, and sometimes in the wrong direction. Here's what I'm doing so far:
std::vector<Point2f> corners;
bool found = findChessboardCorners(*_imageBuffer, cv::Size(5, 4), corners,
                                   CV_CALIB_CB_FILTER_QUADS |
                                   CV_CALIB_CB_FAST_CHECK);
if (found)
{
    // Use the same pattern size here as in findChessboardCorners
    drawChessboardCorners(*_imageBuffer, cv::Size(5, 4), corners, found);

    std::vector<double> distortionCoefficients(5); // camera distortion
    distortionCoefficients[0] = 0.070969;
    distortionCoefficients[1] = 0.777647;
    distortionCoefficients[2] = -0.009131;
    distortionCoefficients[3] = -0.013867;
    distortionCoefficients[4] = -5.141519;

    // Since the image was resized, we need to scale the found corner points
    float sw = _width / SMALL_WIDTH;
    float sh = _height / SMALL_HEIGHT;

    // The four outer corners of the 5x4 grid
    std::vector<Point2f> board_verts;
    board_verts.push_back(Point2f(corners[0].x * sw, corners[0].y * sh));
    board_verts.push_back(Point2f(corners[15].x * sw, corners[15].y * sh));
    board_verts.push_back(Point2f(corners[19].x * sw, corners[19].y * sh));
    board_verts.push_back(Point2f(corners[4].x * sw, corners[4].y * sh));
    Mat boardMat(board_verts);

    // A quad on the z = 0 plane
    std::vector<Point3f> square_verts;
    square_verts.push_back(Point3f(-1, 1, 0));
    square_verts.push_back(Point3f(-1, -1, 0));
    square_verts.push_back(Point3f(1, -1, 0));
    square_verts.push_back(Point3f(1, 1, 0));
    Mat squareMat(square_verts);

    // Transform the camera's intrinsic parameters into an OpenGL camera matrix
    glMatrixMode(GL_PROJECTION);
    glLoadIdentity();

    // Camera parameters
    double f_x = 786.42938232; // Focal length in x axis
    double f_y = 786.42938232; // Focal length in y axis (usually the same?)
    double c_x = 217.01358032; // Camera principal point x
    double c_y = 311.25384521; // Camera principal point y

    cv::Mat cameraMatrix(3, 3, CV_32FC1);
    cameraMatrix.at<float>(0, 0) = f_x;
    cameraMatrix.at<float>(0, 1) = 0.0;
    cameraMatrix.at<float>(0, 2) = c_x;
    cameraMatrix.at<float>(1, 0) = 0.0;
    cameraMatrix.at<float>(1, 1) = f_y;
    cameraMatrix.at<float>(1, 2) = c_y;
    cameraMatrix.at<float>(2, 0) = 0.0;
    cameraMatrix.at<float>(2, 1) = 0.0;
    cameraMatrix.at<float>(2, 2) = 1.0;

    // Use CV_64F so the at<double>() reads below are valid
    Mat rvec(3, 1, CV_64F), tvec(3, 1, CV_64F);
    solvePnP(squareMat, boardMat, cameraMatrix, distortionCoefficients,
             rvec, tvec);

    _rv[0] = rvec.at<double>(0, 0);
    _rv[1] = rvec.at<double>(1, 0);
    _rv[2] = rvec.at<double>(2, 0);
    _tv[0] = tvec.at<double>(0, 0);
    _tv[1] = tvec.at<double>(1, 0);
    _tv[2] = tvec.at<double>(2, 0);
}
Then in the drawing code...
GLKMatrix4 modelViewMatrix = GLKMatrix4MakeTranslation(0.0f, 0.0f, 0.0f);
modelViewMatrix = GLKMatrix4Translate(modelViewMatrix, -tv[1], -tv[0], -tv[2]);
modelViewMatrix = GLKMatrix4Rotate(modelViewMatrix, -rv[0], 1.0f, 0.0f, 0.0f);
modelViewMatrix = GLKMatrix4Rotate(modelViewMatrix, -rv[1], 0.0f, 1.0f, 0.0f);
modelViewMatrix = GLKMatrix4Rotate(modelViewMatrix, -rv[2], 0.0f, 0.0f, 1.0f);
The vertices I'm rendering create a cube of unit length around the origin (i.e. from -0.5 to 0.5 along each edge). I know that OpenGL's transformation functions apply transformations in "reverse order," so the above should rotate the cube along the z, y, and then x axes, and then translate it. However, it seems like it's being translated first and then rotated, so perhaps Apple's GLKMatrix4 works differently?
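For reference, a minimal sketch of the GLKit convention; this matches GLKit's documented post-multiplication behavior, so it works the same way as classic OpenGL:
#include <GLKit/GLKMath.h>

// GLKMatrix4Translate(M, ...) returns M * T and GLKMatrix4Rotate(M, ...)
// returns M * R, so the chain below builds M = T * Rx * Ry * Rz.
// A column vector v is then transformed as T * (Rx * (Ry * (Rz * v))):
// rotations first, translation last -- the same "reverse order" rule as
// the classic glTranslatef/glRotatef calls.
GLKMatrix4 M = GLKMatrix4MakeTranslation(0.0f, 0.0f, -5.0f); // M = T
M = GLKMatrix4Rotate(M, 0.5f, 1.0f, 0.0f, 0.0f);             // M = T * Rx
M = GLKMatrix4Rotate(M, 0.5f, 0.0f, 1.0f, 0.0f);             // M = T * Rx * Ry
M = GLKMatrix4Rotate(M, 0.5f, 0.0f, 0.0f, 1.0f);             // M = T * Rx * Ry * Rz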
This question seems very similar to mine, and in particular coder9's answer seems like it might be more or less what I'm looking for. However, I tried it and compared the results to my method, and the matrices I arrived at in both cases were the same. I feel like that answer is right, but that I'm missing some crucial detail.
You have to make sure the axes are facing the correct direction. In particular, the y and z axes face different directions in OpenGL and OpenCV, to ensure the x-y-z basis is right-handed. You can find some information and code (with an iPad camera) in this blog post.
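To make that concrete, here is a minimal sketch of the usual conversion, assuming the common convention of flipping the y and z axes of the [R|t] matrix that solvePnP produces (the helper name and CV_64F types are illustrative):
#include <opencv2/core.hpp>
#include <opencv2/calib3d.hpp>

// Sketch: turn a solvePnP pose into an OpenGL-style view matrix.
cv::Mat poseToGLView(const cv::Mat& rvec, const cv::Mat& tvec)
{
    cv::Mat R;
    cv::Rodrigues(rvec, R); // 3x3 rotation from the Rodrigues vector

    cv::Mat view = cv::Mat::eye(4, 4, CV_64F);
    R.copyTo(view(cv::Rect(0, 0, 3, 3)));    // top-left 3x3 = R
    tvec.copyTo(view(cv::Rect(3, 0, 1, 3))); // last column = t

    // OpenCV camera: x right, y down, z forward. OpenGL camera: x right,
    // y up, z backward. Negating the y and z rows converts between them.
    cv::Mat cvToGl = cv::Mat::eye(4, 4, CV_64F);
    cvToGl.at<double>(1, 1) = -1.0;
    cvToGl.at<double>(2, 2) = -1.0;
    return cvToGl * view; // still row-major; transpose before loading into GL
}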
-- Edit --
Ah, OK. Unfortunately, I used these resources to do it the other way round (OpenGL ---> OpenCV) to test some algorithms. My main issue was that the row order of the images is inverted between OpenGL and OpenCV (maybe this helps).
When simulating cameras, I came across the same projection matrices that can be found here and in the generalized projection matrix paper. The paper quoted in the comments of the blog post also shows the link between computer vision and OpenGL projections.
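For reference, here is a minimal sketch of that computer-vision-to-OpenGL projection mapping, a standard pinhole construction; the signs of the principal-point terms vary with the chosen image-origin convention, so treat them as an assumption:
#include <opencv2/core.hpp>

// Pinhole intrinsics -> OpenGL projection matrix (column-vector convention).
// fx, fy, cx, cy are in pixels; w, h are the image size; n, f are the
// near/far planes. Expects eye coordinates already flipped into the
// OpenGL convention (y up, z backward).
cv::Mat intrinsicsToGLProjection(double fx, double fy, double cx, double cy,
                                 double w, double h, double n, double f)
{
    cv::Mat P = cv::Mat::zeros(4, 4, CV_64F);
    P.at<double>(0, 0) = 2.0 * fx / w;
    P.at<double>(1, 1) = 2.0 * fy / h;
    P.at<double>(0, 2) = 1.0 - 2.0 * cx / w; // principal-point offset
    P.at<double>(1, 2) = 2.0 * cy / h - 1.0; // y flipped vs. image coords
    P.at<double>(2, 2) = -(f + n) / (f - n);
    P.at<double>(2, 3) = -2.0 * f * n / (f - n);
    P.at<double>(3, 2) = -1.0;
    return P; // row-major; transpose before handing to OpenGL
}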
I'm not an iOS programmer, so this answer might be misleading!
If the problem is not in the order of applying the rotations and the translation, then I suggest using a simpler and more commonly used coordinate system.
The points in the corners vector have their origin (0,0) at the top-left corner of the image, with the y axis pointing towards the bottom of the image. From math we are often used to thinking of a coordinate system with the origin at the center and the y axis pointing towards the top of the image. From the coordinates you're pushing into board_verts, I'm guessing you're making the same mistake. If that's the case, it's easy to transform the positions of the corners with something like this:
for (size_t i = 0; i < corners.size(); i++) {
    corners[i].x -= width / 2;
    corners[i].y = -corners[i].y + height / 2;
}
then you call solvePnP(). Debugging this is not that difficult, just print the positions of the four corners and the estimated R and T, and see if they make sense. Then you can proceed to the OpenGL step. Please let me know how it goes.
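A minimal sketch of that debugging step, using the variable names from the question:
#include <iostream>

// Print the recentered corners and the estimated pose. The corners should
// be roughly symmetric around (0, 0), and tvec should be a plausible
// distance from the camera to the board.
for (size_t i = 0; i < board_verts.size(); i++)
    std::cout << "corner " << i << ": " << board_verts[i] << std::endl;
std::cout << "rvec: " << rvec << std::endl << "tvec: " << tvec << std::endl;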
Related
I have been working on pose estimation (rectifying key points on a 3D model with 2D points on an image to match the pose) via OpenCV's cv::solvePnP, using features / key points from Apple's Vision framework.
TL;DR:
My SceneKit model is being translated, and the units look correct when introspecting the translation and rotation vectors from solvePnP (i.e., they are the right order of magnitude), but the coordinate system of the translation appears off:
I am trying to understand the coordinate system requirements of solvePnP with respect to the Metal / OpenGL coordinate system and my camera projection matrix.
What 'projectionMatrix' does my SCNCamera require to match the image-based coordinate system passed into solvePnP?
Some things I've read and believe I am taking into account:
OpenCV and OpenGL (thus Metal) have row-major vs. column-major differences.
OpenCV's coordinate system for 3D is different from OpenGL's (thus Metal's).
Longer with code:
My workflow is as such:
Step 1 - use a 3D model tool to introspect points on my 3D model and get the object's vertex positions for the major key points matching the 2D detected features. I am using the left pupil, right pupil, tip of the nose, tip of the chin, and the left and right outer lip corners.
Step 2 - Run a Vision request, extract a list of points in image space (converting to OpenCV's top-left coordinate system), and build the same ordered list of 2D points.
Step 3 - Construct a camera matrix by using the size of the input image.
Step 4 - Run cv::solvePnP, and then use cv::Rodrigues to convert the rotation vector to a matrix.
Step 5 - Convert the coordinate system of the resulting transforms into something appropriate for the GPU - invert the y and z axes, combine the translation and rotation into a single 4x4 matrix, and then transpose it for the appropriate row/column-major ordering of OpenGL / Metal.
Step 6 - apply the resulting transform to Scenekit via:
let faceNodeTransform = openCVWrapper.transform(for: landmarks, imageSize: size)
self.destinationView.pointOfView?.transform = SCNMatrix4Invert(faceNodeTransform)
Below is my Obj-C++ OpenCV Wrapper which takes in a subset of Vision Landmarks and the true pixel size of the image being looked at:
// https://answers.opencv.org/question/23089/opencv-opengl-proper-camera-pose-using-solvepnp/
- (SCNMatrix4) transformFor:(VNFaceLandmarks2D*)landmarks imageSize:(CGSize)imageSize
{
    // 1 - Convert landmarks to image points in image space (pixels) as a vector of cv::Point2f.
    // Note that this translates the point coordinate system to be top-left oriented for OpenCV's image coordinates:
    std::vector<cv::Point2f> imagePoints = [self imagePointsForLandmarks:landmarks imageSize:imageSize];

    // 2 - Load model points
    std::vector<cv::Point3f> modelPoints = [self modelPoints];

    // 3 - Create our camera intrinsic matrix
    // TODO - see if this is sane?
    double max_d = fmax(imageSize.width, imageSize.height);
    cv::Mat cameraMatrix = (cv::Mat_<double>(3,3) << max_d, 0, imageSize.width/2.0,
                                                     0, max_d, imageSize.height/2.0,
                                                     0, 0, 1.0);

    // 4 - Run solvePnP
    double distanceCoef[] = {0, 0, 0, 0};
    cv::Mat distanceCoefMat = cv::Mat(1, 4, CV_64FC1, distanceCoef);

    // Output matrices
    std::vector<double> rv(3);
    cv::Mat rotationOut = cv::Mat(rv);
    std::vector<double> tv(3);
    cv::Mat translationOut = cv::Mat(tv);
    cv::solvePnP(modelPoints, imagePoints, cameraMatrix, distanceCoefMat, rotationOut, translationOut, false, cv::SOLVEPNP_EPNP);

    // 5 - Convert the rotation vector to a real 4x4 rotation matrix:
    cv::Mat viewMatrix = cv::Mat::zeros(4, 4, CV_64FC1);
    cv::Mat rotation;
    cv::Rodrigues(rotationOut, rotation);

    // Append our transforms to our matrix; the bottom row becomes (0, 0, 0, 1):
    for(unsigned int row = 0; row < 3; ++row)
    {
        for(unsigned int col = 0; col < 3; ++col)
        {
            viewMatrix.at<double>(row, col) = rotation.at<double>(row, col);
        }
        viewMatrix.at<double>(row, 3) = translationOut.at<double>(row, 0);
    }
    viewMatrix.at<double>(3, 3) = 1.0;

    // Convert from OpenCV to OpenGL coords
    cv::Mat cvToGl = cv::Mat::zeros(4, 4, CV_64FC1);
    cvToGl.at<double>(0, 0) = 1.0;
    cvToGl.at<double>(1, 1) = -1.0; // Invert the y axis
    cvToGl.at<double>(2, 2) = -1.0; // Invert the z axis
    cvToGl.at<double>(3, 3) = 1.0;
    viewMatrix = cvToGl * viewMatrix;

    // Finally transpose to get a correct SCN / OpenGL matrix:
    cv::Mat glViewMatrix = cv::Mat::zeros(4, 4, CV_64FC1);
    cv::transpose(viewMatrix, glViewMatrix);

    return [self convertCVMatToMatrix4:glViewMatrix];
}
- (SCNMatrix4) convertCVMatToMatrix4:(cv::Mat)matrix
{
    SCNMatrix4 scnMatrix = SCNMatrix4Identity;
    scnMatrix.m11 = matrix.at<double>(0, 0);
    scnMatrix.m12 = matrix.at<double>(0, 1);
    scnMatrix.m13 = matrix.at<double>(0, 2);
    scnMatrix.m14 = matrix.at<double>(0, 3);
    scnMatrix.m21 = matrix.at<double>(1, 0);
    scnMatrix.m22 = matrix.at<double>(1, 1);
    scnMatrix.m23 = matrix.at<double>(1, 2);
    scnMatrix.m24 = matrix.at<double>(1, 3);
    scnMatrix.m31 = matrix.at<double>(2, 0);
    scnMatrix.m32 = matrix.at<double>(2, 1);
    scnMatrix.m33 = matrix.at<double>(2, 2);
    scnMatrix.m34 = matrix.at<double>(2, 3);
    scnMatrix.m41 = matrix.at<double>(3, 0);
    scnMatrix.m42 = matrix.at<double>(3, 1);
    scnMatrix.m43 = matrix.at<double>(3, 2);
    scnMatrix.m44 = matrix.at<double>(3, 3);
    return scnMatrix;
}
Some questions:
An SCNNode has no modelViewMatrix (just, as I understand it, a transform, which is the model matrix) to just throw a matrix at - so I've read that the inverse of the transform from the solvePnP process can be used to pose the camera instead, which appears to get me the closest result. I want to ensure this approach is correct (see the sketch after these questions).
If I have the modelViewMatrix and the projectionMatrix, I should be able to calculate the appropriate modelMatrix? Is this the approach I should be taking?
It's unclear to me what projectionMatrix I should be using for my SceneKit scene and whether that has any bearing on my results. Do I need a pixel-for-pixel exact match of my viewport to the image size, and how do I properly configure my SCNCamera to ensure coordinate-system agreement for solvePnP?
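For the first question, a minimal sketch of the two equivalent ways to apply the result for a single tracked node (faceNode is a hypothetical node for the non-inverted variant):
// Option A (Step 6 above): pose the camera with the inverse of the solvePnP view matrix.
self.destinationView.pointOfView.transform = SCNMatrix4Invert(faceNodeTransform);
// Option B (equivalent for one tracked node): leave the camera at identity and
// put the non-inverted view matrix on the face node itself (faceNode is hypothetical).
faceNode.transform = faceNodeTransform;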
Thank you very much!
I've got a problem obtaining a proper camera pose from the iPad camera using OpenCV.
I'm using a custom-made 2D marker (based on the ArUco library) - I want to render a 3D cube over that marker using OpenGL.
In order to obtain the camera pose, I'm using the solvePnP function from OpenCV.
According to THIS LINK, I'm doing it like this:
cv::solvePnP(markerObjectPoints, imagePoints, [self currentCameraMatrix], _userDefaultsManager.distCoeffs, rvec, tvec);
tvec.at<double>(0, 0) *= -1; // I don't know why I have to do it, but translation in X axis is inverted
cv::Mat R;
cv::Rodrigues(rvec, R); // R is 3x3
R = R.t(); // rotation of inverse
tvec = -R * tvec; // translation of inverse
cv::Mat T(4, 4, R.type()); // T is 4x4
T(cv::Range(0, 3), cv::Range(0, 3)) = R * 1; // copies R into T
T(cv::Range(0, 3), cv::Range(3, 4)) = tvec * 1; // copies tvec into T
double *p = T.ptr<double>(3);
p[0] = p[1] = p[2] = 0;
p[3] = 1;
The camera matrix & dist coefficients come from the findChessboardCorners-based calibration, imagePoints are the manually detected corners of the marker (you can see them as the green square in the video posted below), and markerObjectPoints are manually hardcoded points that represent the marker's corners:
markerObjectPoints.push_back(cv::Point3d(-6, -6, 0));
markerObjectPoints.push_back(cv::Point3d(6, -6, 0));
markerObjectPoints.push_back(cv::Point3d(6, 6, 0));
markerObjectPoints.push_back(cv::Point3d(-6, 6, 0));
Because the marker is 12 cm long in the real world, I chose the same size in the model for easier debugging.
As a result I'm receiving a 4x4 matrix T, which I'll use as the ModelView matrix in OpenGL.
Using GLKit, the drawing function looks more or less like this:
- (void)glkView:(GLKView *)view drawInRect:(CGRect)rect {
    // preparations
    glClearColor(0.0, 0.0, 0.0, 0.0);
    glClear(GL_COLOR_BUFFER_BIT | GL_DEPTH_BUFFER_BIT);
    float aspect = fabsf(self.bounds.size.width / self.bounds.size.height);
    effect.transform.projectionMatrix = GLKMatrix4MakePerspective(GLKMathDegreesToRadians(39), aspect, 0.1f, 1000.0f);

    // set modelViewMatrix
    float mat[16] = generateOpenGLMatFromFromOpenCVMat(T);
    currentModelMatrix = GLKMatrix4MakeWithArrayAndTranspose(mat);
    effect.transform.modelviewMatrix = currentModelMatrix;
    [effect prepareToDraw];

    glDrawArrays(GL_TRIANGLES, 0, 36); // draw previously prepared cube
}
I'm not rotating everything by 180 degrees around the X axis (as was mentioned in the previously linked article), because it doesn't look necessary.
The problem is that it doesn't work! The translation vector looks OK, but the X and Y rotations are messed up :(
I've recorded a video presenting the issue:
http://www.youtube.com/watch?v=EMNBT5H7-os
I've tried almost everything (including inverting all the axes one by one), but nothing actually works.
What should I do? How should I properly display that 3D cube? The translation / rotation vectors that come from solvePnP look reasonable, so I guess I'm not mapping these vectors to OpenGL matrices correctly.
Thanks to Djo1509 from http://answers.opencv.org/ I've found out that the problem was the unnecessary transposition of the rotation matrix R used as part of matrix T, and the unnecessary tvec = -R * tvec operation.
For more info look there
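For reference, a minimal sketch of the construction implied by that fix, using the variable names from the question (the transpose and inversion lines simply removed):
// Fixed pose construction: use [R|t] from solvePnP directly, without
// inverting it, as the source of the ModelView matrix.
cv::Mat R;
cv::Rodrigues(rvec, R); // R is 3x3; no R.t() and no tvec = -R * tvec here

cv::Mat T = cv::Mat::eye(4, 4, R.type());         // bottom row is already (0, 0, 0, 1)
R.copyTo(T(cv::Range(0, 3), cv::Range(0, 3)));    // copies R into T
tvec.copyTo(T(cv::Range(0, 3), cv::Range(3, 4))); // copies tvec into T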
I'm trying to draw multiple cubes at an isometric camera angle. Here's the code that draws one. (OpenGL ES 2.0 with GLKit on iOS).
float startZ = -4.0f;
// position
GLKMatrix4 modelViewMatrix = GLKMatrix4Identity;
modelViewMatrix = GLKMatrix4Translate(modelViewMatrix, location.x, location.y, location.z + startZ);
// isometric camera angle
modelViewMatrix = GLKMatrix4Rotate(modelViewMatrix, GLKMathDegreesToRadians(45), 1.0, 0, 0);
modelViewMatrix = GLKMatrix4Rotate(modelViewMatrix, GLKMathDegreesToRadians(45), 0.0, 1.0, 0);
self.effect.transform.modelviewMatrix = modelViewMatrix;
[self.effect prepareToDraw];
glDrawArrays(GL_TRIANGLES, 0, 36);
The problem is that it translates first, then rotates, which means that with more than one box they do not line up (they look like a chain of diamonds: each one is in position but rotated so the corners overlap).
I've tried switching the order so the rotation comes before the translation, but then they don't show up at all. My vertex array is bound to a unit cube centered around the origin.
I really don't understand how to control the camera separately from the object. I screwed around with the projection matrix for a while without getting it. As far as I understand, the camera is supposed to be controlled with the modelViewMatrix, right? (The "View" part.)
Your 'camera' transform (modelview) seems correct, however it looks like you're using a perspective projection - if you want isometric you will need to change your projection matrix.
It looks like you are applying the camera rotation to each object as you draw it. Instead, simulate a 2-deep matrix stack if your use-case is this simple, so you just have your camera matrix and each cube's matrix.
Set your projection and camera matrices - keep a reference to your camera matrix.
For each cube generate the individual cube transformation matrix (which should probably consist of translation only - no rotation so the cubes remain axis aligned - I think this is what you're going for).
Backwards-multiply your cube matrix by the camera matrix and use that as the modelview matrix.
Note that the camera matrix will remain unchanged for each cube you render, but the modelview matrix will incorporate both the cube's individual transformation matrix and the camera matrix into a single modelview matrix. This is equivalent to the old matrix stack methods glPushMatrix and glPopMatrix (not available in GLES 2.0). If you need more complex object hierarchies (where the cubes have child-objects in their 'local' coordinate space) then you should probably implement your own full matrix stack, instead of the 2-deep equivalent discussed above.
For reference, here's an article that helped me understand it. It's a little mathy, but does a good job explaining things intuitively.
http://db-in.com/blog/2011/04/cameras-on-opengl-es-2-x/
I ended up keeping the perspective projection, because I don't want true isometric. The key was to do them in the right order, because moving the camera is the inverse of moving the object. See the comments, and the article. Working code:
// the one you want to happen first is multiplied LAST
// camRotate * camScale * camTranslate * objTranslate * objScale * objRotate;
// TODO cache the camera matrix
// the camera angle remains the same for all objects
GLKMatrix4 camRotate = GLKMatrix4MakeRotation(GLKMathDegreesToRadians(45), 1, 0, 0);
camRotate = GLKMatrix4Rotate(camRotate, GLKMathDegreesToRadians(45), 0, 1, 0);
GLKMatrix4 camTranslate = GLKMatrix4MakeTranslation(4, -5, -4.0);
GLKMatrix4 objTranslate = GLKMatrix4MakeTranslation(location.x, location.y, location.z);
GLKMatrix4 modelViewMatrix = GLKMatrix4Multiply(camRotate, camTranslate);
modelViewMatrix = GLKMatrix4Multiply(modelViewMatrix, objTranslate);
self.effect.transform.modelviewMatrix = modelViewMatrix;
[self.effect prepareToDraw];
glDrawArrays(GL_TRIANGLES, 0, 36);
I've got some code working to create a 3D view in OpenGL and then use the device motion to look around within it. I know this is working because I can place 3D cubes in space around me and see that they are in the right places (I'm just creating them with x/y/z coordinates).
The code uses the rotation matrix of the device and then applies it to the various blocks.
CMRotationMatrix r = dm.attitude.rotationMatrix;
GLKMatrix4 baseModelViewMatrix = GLKMatrix4Make(r.m11, r.m21, r.m31, 0.0f,
r.m12, r.m22, r.m32, 0.0f,
r.m13, r.m23, r.m33, 0.0f,
0.0f, 0.0f, 0.0f, 1.0f);
float aspect = fabsf(self.view.bounds.size.width / self.view.bounds.size.height);
GLKMatrix4 projectionMatrix = GLKMatrix4MakePerspective(GLKMathDegreesToRadians(65.0f), aspect, kNearZ, kFarZ);
block.effect.transform.projectionMatrix = projectionMatrix;
What I want to be able to do is figure out where I'm looking and then create a block out in front of me.
I've had some limited success by creating a vector in one direction, applying the rotation matrix to it, and then reading off its new values. But it only works in some of the directions - when I rotate too far it messes up.
GLKVector4 vect = GLKVector4Make(0.0f,0.0f,10.0f,1.0f);
GLKVector4 newVec = GLKMatrix4MultiplyVector4(baseModelViewMatrix,vect);
I then read off newVec.x, newVec.y, and newVec.z and use them to place the cube.
Can someone tell me if I'm on the right track here? Is there an easier way to achieve this?
The maths of it all is quite daunting.
UPDATE:
I've had some partial success using
GLKVector3 newVec1 = GLKVector3Make(-r.m22, -r.m33, r.m21);
This only works in one landscape orientation, and it also only works in a cylinder around my current location; the up/down axis isn't quite right.
Are these parts of the rotation matrix sufficient to get a point anywhere around me?
UPDATE 2:
Thought it might help to post some more code to make it really clear.
This is how everything is getting displayed.
//1- get device position
CMDeviceMotion *dm = motionManager.deviceMotion;
CMRotationMatrix r = dm.attitude.rotationMatrix;
GLKMatrix4 baseModelViewMatrix = GLKMatrix4Make(r.m11, r.m21, r.m31, 0.0f,
r.m12, r.m22, r.m32, 0.0f,
r.m13, r.m23, r.m33, 0.0f,
0.0f, 0.0f, 0.0f, 1.0f);
GLKMatrix4 projectionMatrix = GLKMatrix4MakePerspective(GLKMathDegreesToRadians(45.0f), aspect, kNearZ, kFarZ);
//2- work out position ahead of current view
GLKVector3 newVector = GLKVector3Make(-r.m22, -r.m33, r.m21);
//place cube in 3d space- this works fine when just positioning with x,y,z co-ordinates
cube.effect.transform.projectionMatrix = projectionMatrix;
GLKMatrix4 modelViewMatrix = GLKMatrix4Identity;
modelViewMatrix = GLKMatrix4Translate(modelViewMatrix, newVector.x*100, newVector.y*100, newVector.z*100);
modelViewMatrix = GLKMatrix4Multiply(baseModelViewMatrix, modelViewMatrix);
cube.effect.transform.modelviewMatrix = modelViewMatrix;
The vector you are looking along should probably be multiplied by the inverted matrix. You didn't write what results you are getting, but here are a few possible solutions:
newVect1 = (r.m13, r.m23, r.m33)
newVect2 = (-r.m13, -r.m23, r.m33)
multiply (0,0,1) with the inverted matrix
EDIT: (To add some explanations)
If your baseModelViewMatrix is responsible for rotating all the objects around you, then the vector that faces forward (display-wise) should be rotated by the inverted baseModelViewMatrix. If it is true that when baseModelViewMatrix is the identity the correct forward-facing vector is (0,0,1), then just multiply (0,0,1) with the inverted matrix (you can simply try that, and you even should: just set the matrix to identity and place the object at (0,0,100) or wherever). If it is not, you should find the vector that faces forward and do the same with it.
Concerning the 2nd method I posted, newVect2 = (-r.m13, -r.m23, r.m33), it is specifically meant for the case where (0,0,1) is the forward vector. The idea is that a rotation matrix consists of the 3 base vectors x, y, z, where each row represents one of them respectively (this always works for creating a rotation matrix, but the reverse might not hold). If the 3rd row does represent the z base vector, then the vector (0,0,1) will be transformed to (r.m13, r.m23, r.m33), and the one you seem to be facing is that vector mirrored through its normal (0,0,1). The operation for that is R' = normal*(2*dot(normal, R)) - R, and for the case where the normal is (0,0,1) and R is (r.m13, r.m23, r.m33), the result R' is (-r.m13, -r.m23, r.m33). Just as a note, it might be turned around so that the z base vector is (r.m31, r.m32, r.m33).
In any case, if the 3rd option does not work, then neither will the 2nd, and the problem is probably that (0,0,1) is not the forward vector for the identity matrix.
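A minimal sketch of the third option, using baseModelViewMatrix from the question (GLKMatrix4Invert reports through its second argument whether the inversion succeeded; for a pure rotation the inverse equals the transpose):
#include <GLKit/GLKMath.h>

// Rotate the assumed forward vector (0, 0, 1) by the inverse of the matrix
// that rotates the world around the viewer, then push the block out 100
// units along that direction.
bool invertible = false;
GLKMatrix4 inv = GLKMatrix4Invert(baseModelViewMatrix, &invertible);
GLKVector3 forward = GLKVector3Make(0.0f, 0.0f, 1.0f);
if (invertible) {
    forward = GLKMatrix4MultiplyVector3(inv, forward);
}
GLKVector3 blockPos = GLKVector3MultiplyScalar(forward, 100.0f);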
I'm looking for a simple implementation for arcball rotation on 3D models with quaternions, specifically using GLKit on iOS. So far, I have examined the following sources:
Arcball rotation with GLKit
How to rotate a 3D object with touches using OpenGL
I've also been trying to understand source code and maths from here and here. I can rotate my object but it keeps jumping around at certain angles, so I fear gimbal lock is at play. I'm using gesture recognizers to control the rotations (pan gestures affect roll and yaw, rotate gestures affect pitch). I'm attaching my code for the quaternion handling as well as the modelview matrix transformation.
Variables:
GLKQuaternion rotationE;
Quaternion Handling:
- (void)rotateWithXY:(float)x and:(float)y
{
    const float rate = M_PI/360.0f;
    GLKVector3 up = GLKVector3Make(0.0f, 1.0f, 0.0f);
    GLKVector3 right = GLKVector3Make(1.0f, 0.0f, 0.0f);
    up = GLKQuaternionRotateVector3(GLKQuaternionInvert(self.rotationE), up);
    self.rotationE = GLKQuaternionMultiply(self.rotationE, GLKQuaternionMakeWithAngleAndVector3Axis(x*rate, up));
    right = GLKQuaternionRotateVector3(GLKQuaternionInvert(self.rotationE), right);
    self.rotationE = GLKQuaternionMultiply(self.rotationE, GLKQuaternionMakeWithAngleAndVector3Axis(y*rate, right));
}

- (void)rotateWithZ:(float)z
{
    GLKVector3 front = GLKVector3Make(0.0f, 0.0f, -1.0f);
    front = GLKQuaternionRotateVector3(GLKQuaternionInvert(self.rotationE), front);
    self.rotationE = GLKQuaternionMultiply(self.rotationE, GLKQuaternionMakeWithAngleAndVector3Axis(z, front));
}
Modelview Matrix Transformation (Inside Draw Loop):
// Get Quaternion Rotation
GLKVector3 rAxis = GLKQuaternionAxis(self.transformations.rotationE);
float rAngle = GLKQuaternionAngle(self.transformations.rotationE);
// Set Modelview Matrix
GLKMatrix4 modelviewMatrix = GLKMatrix4Identity;
modelviewMatrix = GLKMatrix4MakeTranslation(0.0f, 0.0f, -0.55f);
modelviewMatrix = GLKMatrix4Rotate(modelviewMatrix, rAngle, rAxis.x, rAxis.y, rAxis.z);
modelviewMatrix = GLKMatrix4Scale(modelviewMatrix, 0.5f, 0.5f, 0.5f);
glUniformMatrix4fv(self.sunShader.uModelviewMatrix, 1, 0, modelviewMatrix.m);
Any help is greatly appreciated, but I do want to keep it as simple as possible and stick to GLKit.
There seem to be a few issues going on here.
You say that you're using [x,y] to pan, but it looks more like you're using them to pitch and yaw. To me, at least, panning is translation, not rotation.
Unless I'm missing something, it also looks like you're replacing the entire rotation every time you try to update it. You rotate a vector by the inverse of the current rotation and then create a quaternion from that vector and some angle. I believe that this is equivalent to creating the quaternion from the original vector and then rotating it by the current rotation inverse. So you have q_e'*q_up. Then you multiply that with the current rotation, which gives q_e*q_e'*q_up = q_up. The current rotation is canceled out. This doesn't seem like it's what you want.
All you really need to do is create a new quaternion from axis-and-angle and then multiply it with the current quaternion. If the new quaternion is on the left, the orientation change will use the eye-local frame. If the new quaternion is on the right, the orientation change will be in the global frame. I think you want:
self.rotationE = GLKQuaternionMultiply(
    GLKQuaternionMakeWithAngleAndVector3Axis(x*rate, up), self.rotationE);
Do this, without the pre-rotation by inverse, for all three cases.
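For instance, here is a sketch of the question's two handlers rewritten that way (pre-rotation removed; whether the new quaternion goes on the left or the right picks the eye-local or global frame, as described above):
- (void)rotateWithXY:(float)x and:(float)y
{
    const float rate = M_PI/360.0f;
    GLKVector3 up = GLKVector3Make(0.0f, 1.0f, 0.0f);
    GLKVector3 right = GLKVector3Make(1.0f, 0.0f, 0.0f);
    // New rotation on the left: orientation change in the eye-local frame.
    self.rotationE = GLKQuaternionMultiply(
        GLKQuaternionMakeWithAngleAndVector3Axis(x*rate, up), self.rotationE);
    self.rotationE = GLKQuaternionMultiply(
        GLKQuaternionMakeWithAngleAndVector3Axis(y*rate, right), self.rotationE);
}

- (void)rotateWithZ:(float)z
{
    GLKVector3 front = GLKVector3Make(0.0f, 0.0f, -1.0f);
    self.rotationE = GLKQuaternionMultiply(
        GLKQuaternionMakeWithAngleAndVector3Axis(z, front), self.rotationE);
}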
I've never used GLKit, but it's uncommon to extract an axis and angle when converting from a quaternion to a matrix. If the angle is zero, the axis is undefined. When it's near zero, you'll have numeric instability. It looks like you should be using GLKMatrix4MakeWithQuaternion and then multiplying the resulting matrix with your translation matrix and scale matrix:
GLKMatrix4 modelviewMatrix =
    GLKMatrix4Multiply(GLKMatrix4MakeTranslation(0.0f, 0.0f, -0.55f),
                       GLKMatrix4MakeWithQuaternion(self.rotationE));
modelviewMatrix = GLKMatrix4Scale(modelviewMatrix, 0.5f, 0.5f, 0.5f);
I was recently asked a bit more about my resulting implementation of this problem, so here it is!
- (void)rotate:(GLKVector3)r
{
    // Convert degrees to radians for maths calculations
    r.x = GLKMathDegreesToRadians(r.x);
    r.y = GLKMathDegreesToRadians(r.y);
    r.z = GLKMathDegreesToRadians(r.z);

    // Axis vectors w/ direction (x=right, y=up, z=front)
    // In OpenGL, negative z values go "into" the screen. In GLKit, positive z values go "into" the screen.
    GLKVector3 right = GLKVector3Make(1.0f, 0.0f, 0.0f);
    GLKVector3 up = GLKVector3Make(0.0f, 1.0f, 0.0f);
    GLKVector3 front = GLKVector3Make(0.0f, 0.0f, 1.0f);

    // Quaternion w/ angle and vector
    // Positive angles are counter-clockwise, so convert to negative for a clockwise rotation
    GLKQuaternion q = GLKQuaternionIdentity;
    q = GLKQuaternionMultiply(GLKQuaternionMakeWithAngleAndVector3Axis(-r.x, right), q);
    q = GLKQuaternionMultiply(GLKQuaternionMakeWithAngleAndVector3Axis(-r.y, up), q);
    q = GLKQuaternionMultiply(GLKQuaternionMakeWithAngleAndVector3Axis(-r.z, front), q);

    // ModelView matrix
    GLKMatrix4 modelViewMatrix = GLKMatrix4Identity;
    modelViewMatrix = GLKMatrix4Multiply(modelViewMatrix, GLKMatrix4MakeWithQuaternion(q));
}
Hope you put it to good use :)