GLM: Getting the forward vector from the view matrix returns a weird, non-working output

I've been trying to figure this out for at least two months now. I need to implement a way to get the forward vector, and I've tried multiple approaches, but each one only works in certain cases. I store the rotation as Euler angles; I tried converting them to quaternions, and I tried getting the view matrix by inverting the model matrix and taking the third row of that matrix, but every single time I get a weird output. If the rotation is all zero it works, and if it is a multiple of 90 on any axis the output works, but if the rotation is, say, 45 degrees it doesn't work: the output is X: -0.707107, Y: -0, Z: 0.707107, which doesn't make sense to me. How I get the forward vector:
glm::mat4 inverted = glm::inverse(CreateMatrix());
glm::vec3 ret = glm::normalize(glm::vec3(inverted[2]));
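For comparison, here is a minimal sketch of deriving the forward vector straight from the stored Euler angles instead of going through the inverted model matrix. It assumes an OpenGL-style camera that looks down -Z and Euler angles stored in degrees as (pitch, yaw, roll); if your conventions differ, the forward axis and rotation order would need to change.

#include <glm/glm.hpp>
#include <glm/gtc/quaternion.hpp>

glm::vec3 ForwardFromEuler(const glm::vec3& eulerDegrees)
{
    // Build the orientation quaternion from (pitch, yaw, roll) in radians,
    // then rotate the local forward axis (-Z in OpenGL-style conventions).
    glm::quat q = glm::quat(glm::radians(eulerDegrees));
    return glm::normalize(q * glm::vec3(0.0f, 0.0f, -1.0f));
}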

Related

How to check whether Essential Matrix is correct or not without decomposing it?

On a very high level, my pose estimation pipeline looks somewhat like this:
Find features in image_1 and image_2 (let's say cv::ORB)
Match the features (let's say using the BruteForce-Hamming descriptor matcher)
Calculate Essential Matrix (using cv::findEssentialMat)
Decompose it to get the proper rotation matrix and translation unit vector (using cv::recoverPose)
Repeat
I noticed that at some point, the yaw angle (calculated using the output rotation matrix R of cv::recoverPose) suddenly jumps by more than 150 degrees. For that particular frame, the number of inliers is 0 (the return value of cv::recoverPose). So, to understand what exactly that means and what's going on, I asked this question on SO.
As per the answer to my question:
So, if the number of inliers is 0, then something went very wrong. Either your E is wrong, or the point matches are wrong, or both. In this case you simply cannot estimate the camera motion from those two images.
For that particular image pair, based on the visualization and from my understanding, matches look good:
The next step in the pipeline is finding the Essential Matrix. So, now, how can I check whether the calculated Essential Matrix is correct or not without decomposing it, i.e. without calculating the roll, pitch and yaw angles (which can be done by finding the rotation matrix via cv::recoverPose)?
Basically, I want to double-check whether my Essential Matrix is correct or not before I move on to the next component (which is cv::recoverPose) in the pipeline!
The essential matrix maps each point p in image 1 to its epipolar line in image 2. The point q in image 2 that corresponds to p must be very close to the line. You can plot the epipolar lines and the matching points to see if they make sense. Remember that E is defined in normalized image coordinates, not in pixels. To get the epipolar lines in pixel coordinates, you would need to convert E to F (the fundamental matrix).
The epipolar lines must all intersect in one point, called the epipole. The epipole is the projection of the other camera's center in your image. You can find the epipole by computing the nullspace of F.
If you know something about the camera motion, then you can determine where the epipole should be. For example, if the camera is mounted on a vehicle that is moving directly forward, and you know the pitch and roll angles of the camera relative to the ground, then you can calculate exactly where the epipole will be. If the vehicle can turn in-plane, then you can find the horizontal line on which the epipole must lie.
You can get the epipole from F, and see if it is where it is supposed to be.
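For illustration, a minimal OpenCV sketch of those two checks (point-to-epipolar-line distances and the epipole as the null space of F), under the assumption that both images share the same intrinsic matrix K and that pts1/pts2 are the matched pixel coordinates:

#include <cmath>
#include <iostream>
#include <vector>
#include <opencv2/opencv.hpp>

void CheckEssentialMatrix(const cv::Mat& E, const cv::Mat& K,
                          const std::vector<cv::Point2f>& pts1,
                          const std::vector<cv::Point2f>& pts2)
{
    // E is defined in normalized coordinates; convert to F for pixel space: F = K^-T * E * K^-1
    cv::Mat Kinv = K.inv();
    cv::Mat F = Kinv.t() * E * Kinv;

    // Epipolar line in image 2 for each point of image 1 (each line is ax + by + c = 0)
    std::vector<cv::Vec3f> lines2;
    cv::computeCorrespondEpilines(pts1, 1, F, lines2);

    // The matched point in image 2 should lie (nearly) on its epipolar line
    for (size_t i = 0; i < pts2.size(); ++i)
    {
        const cv::Vec3f& l = lines2[i];
        double dist = std::abs(l[0] * pts2[i].x + l[1] * pts2[i].y + l[2])
                    / std::sqrt(l[0] * l[0] + l[1] * l[1]);
        std::cout << "match " << i << ": distance to epipolar line = " << dist << " px\n";
    }

    // The epipole in image 2 is the null vector of F^T (all epipolar lines pass through it)
    cv::Mat epipole2;
    cv::SVD::solveZ(F.t(), epipole2);
    epipole2 /= epipole2.at<double>(2);   // homogeneous -> pixel coordinates (unless it is at infinity)
    std::cout << "epipole in image 2 (homogeneous, scaled): " << epipole2 << std::endl;
}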

Understanding the output of solvepnp?

I have been using solvePnP() to calculate the rotation and translation matrices. But the Euler angles calculated from the obtained rotation matrix gave very erratic values. Trying to find the problem, I took a set of 2D projection points for my marker and kept the other parameters of solvePnP() constant.
Eg values:
2D points
[219.67473, 242.78395; 363.4151, 238.61298; 503.04855, 234.56117; 501.70917, 628.16742; 500.58069, 959.78564; 383.1756, 972.02679; 262.8746, 984.56982; 243.17044, 646.22925]
The Euler angle theta(x) calculated from the output rotation matrix of solvePnP() was -26.4877.
Next, I incremented only the x value of the first point (i.e. 219.67473) by 0.1 to check the variation of the theta(x) Euler angle (keeping the remaining points and the other parameters constant) and ran solvePnP() again. For that very small change, I got values which decreased from -19 degrees to -18 degrees (for x coord = 223.074), then suddenly jumped to 27 degrees for a while (for x coord = 223.174 to 226.974), then came down to 1.3 degrees (for x coord = 227.074).
I cannot understand this behaviour at all. Could somebody please explain?
My euler angle calculation from the rotation matrix uses this procedure.
Try Rodrigues() for conversion between the rotation matrix and the rotation vector to make sure everything is clean and right. The non-RANSAC version can be very sensitive to outliers that create a huge error in the parameters and thus bias the solution. Using the RANSAC version of solvePnP may make it more stable to outliers. For example, adding too much to one of the point coordinates will eventually make it an outlier, and it won't influence the solution after that.
If everything fails, write a series of unit tests: create an artificial set of points in 3D (possibly non-planar), apply a simple translation first, in a second variant apply rotation only, and in a third test apply both. Project them using your camera matrix, and then plug your 2D points, 3D points and projection matrix into your code to find the pose. If the result deviates from the inverse of the translations and rotations you applied to the points, look for a bug in how you feed the parameters to PnP.
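A sketch of that kind of synthetic test (the 3D points, camera matrix and ground-truth pose below are arbitrary example values, not taken from the question):

#include <iostream>
#include <vector>
#include <opencv2/opencv.hpp>

// Synthetic sanity test for solvePnP: project known 3D points with a known pose,
// then check that solvePnP recovers that pose.
void TestSolvePnP()
{
    std::vector<cv::Point3f> objectPoints = {
        {0, 0, 0}, {1, 0, 0}, {0, 1, 0}, {1, 1, 0}, {0.5f, 0.5f, 1.0f}, {1, 0, 1}
    };

    cv::Mat K = (cv::Mat_<double>(3, 3) << 800, 0, 320,
                                             0, 800, 240,
                                             0,   0,   1);
    cv::Mat distCoeffs = cv::Mat::zeros(5, 1, CV_64F);

    // Ground-truth pose: a small rotation plus a translation in front of the camera
    cv::Mat rvecTrue = (cv::Mat_<double>(3, 1) << 0.1, 0.2, 0.0);
    cv::Mat tvecTrue = (cv::Mat_<double>(3, 1) << 0.0, 0.0, 5.0);

    std::vector<cv::Point2f> imagePoints;
    cv::projectPoints(objectPoints, rvecTrue, tvecTrue, K, distCoeffs, imagePoints);

    cv::Mat rvec, tvec;
    cv::solvePnP(objectPoints, imagePoints, K, distCoeffs, rvec, tvec);

    // Both errors should be near zero; if not, the bug is in how the parameters are fed to solvePnP.
    std::cout << "rotation error:    " << cv::norm(rvec, rvecTrue) << std::endl;
    std::cout << "translation error: " << cv::norm(tvec, tvecTrue) << std::endl;
}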
It seems the coordinate systems are different. OpenCV uses a right-handed coordinate system with Y pointing downwards. At nghiaho.com it says the calculations are based on this, and if you look at the axes they don't seem to match. I guess you are using Rodrigues for the matrix computation? Try comparing rotation vectors as well.
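For example, a small helper for that rotation-vector comparison (my own sketch, not from the linked page; it assumes the matrix is stored as CV_64F):

#include <opencv2/opencv.hpp>

cv::Vec3d RotationVectorFromMatrix(const cv::Mat& R)
{
    cv::Mat rvec;
    cv::Rodrigues(R, rvec);   // 3x3 rotation matrix -> 3x1 axis-angle vector
    return cv::Vec3d(rvec.at<double>(0), rvec.at<double>(1), rvec.at<double>(2));
}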

iPhone augmented reality Euler angles rotation – roll issue

I’m working on an iOS augmented reality application.
It is location-based, not marker-based.
I use the GPS, compass and accelerometers to get latitude, longitude, altitude and the 3 Euler angles: yaw, pitch and roll. I know, using NSLog(), that those 6 variables contain valid data.
My application shows some 3d objects over the camera view.
It works fine as long as I use everything but the roll angle.
If I add that third angle, the rotation applied to my OpenGL world is not right. I do it this way in the main OpenGL draw method:
glRotatef(pitch, 1, 0, 0);
glRotatef(yaw, 0, 1, 0);
//glRotatef(roll, 0, 0, 1);
I think there is something wrong with this approach but am certainly not a specialist. Maybe I should create some sort of unique rotation matrix rather than 3 different ones?
Maybe that's not easily possible? After all, most desktop video games, FPS and the like, just let the user change the yaw and the pitch using the mouse, so only 2 angles, not 3. But unlike the mouse, which is a 2D device, a phone used for augmented reality can be rotated through any angle.
But then again, none of the AR tutorials I have seen online handle 'roll' properly; 'rolling' your phone would either completely mess the AR stuff up or do nothing at all because of some roll-compensation strategy.
So my question is, assuming I have my 3 Euler angles using the phone sensors, how should I apply them to my 3d opengl view?
I think you're likely talking about gimbal lock.
The essence of the problem is that if you rotate with Eulers then there's always a sequence to it. For example, you rotate around x, then around y, then z. But one axis can always become ambiguous, because a preceding rotation can move it onto a different axis.
Suppose the rotation were 0 degrees around x, 90 degrees around y, then 20 degrees around z. So you do the x rotation and nothing has changed. You do the y rotation and everything moves 90 degrees. But now you've moved the z axis onto where the x axis was previously. So the z rotation will appear to be around x.
No matter what most people's instincts tell them, there's no way to avoid the problem. The kneejerk reaction is to always rotate around the global axes rather than the local ones. That doesn't resolve the problem, it just reverses the order: the z rotation could then turn the y rotation, which has already occurred, into an x rotation.
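To make the 0/90/20 example concrete, here is a tiny numeric check, written with glm purely for illustration rather than with raw glRotatef calls:

#include <glm/glm.hpp>
#include <glm/gtc/matrix_transform.hpp>

glm::mat4 Rx = glm::rotate(glm::mat4(1.0f), glm::radians(0.0f),  glm::vec3(1, 0, 0));
glm::mat4 Ry = glm::rotate(glm::mat4(1.0f), glm::radians(90.0f), glm::vec3(0, 1, 0));

// Where has the z axis ended up after the x and y rotations?
glm::vec3 zAfterXY = glm::vec3(Ry * Rx * glm::vec4(0, 0, 1, 0));   // = (1, 0, 0)

// The z axis now coincides with the original x axis, so the final 20-degree
// "z" rotation behaves exactly like a rotation about x: a degree of freedom is lost.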
You're right that you should aim to create a unique description of rotation separated from measuring angles.
For augmented reality it's actually not all that difficult.
The accelerometer tells you which way down is. The compass tells you which way north is. The two may not be orthogonal though: the compass reading varies from being roughly at a right angle to the gravity vector near the equator to being nearly parallel to it near the poles.
So:
just accept the accelerometer vector as down;
get the cross product of down and the compass vector to get your side vector; it should point east, along a line of latitude;
then get the cross product of your side vector and your down vector to get a north vector that is suitably perpendicular.
You could equally use the dot product to remove the portion of the compass vector that lies in the direction of gravity, and take the cross product from there.
You'll want to normalise everything.
That gives you three basis vectors, so just put them directly into a matrix. No further work required.
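A compact sketch of that recipe; the glm types and the final column ordering are illustrative choices rather than requirements:

#include <glm/glm.hpp>

// `gravity` is the raw accelerometer vector and `magnetic` the raw compass vector,
// both expressed in the device frame.
glm::mat3 BasisFromSensors(const glm::vec3& gravity, const glm::vec3& magnetic)
{
    glm::vec3 down  = glm::normalize(gravity);
    glm::vec3 east  = glm::normalize(glm::cross(down, magnetic));  // the side vector
    glm::vec3 north = glm::normalize(glm::cross(east, down));      // perpendicular to both

    // Three orthonormal basis vectors go straight into a rotation matrix.
    // Which vector maps to which camera axis (and the sign of "up") is a
    // convention choice that depends on how your OpenGL scene is set up.
    return glm::mat3(east, north, -down);
}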

Relative Camera Pose Estimation using OpenCV

I'm trying to estimate the relative camera pose using OpenCV. The cameras in my case are calibrated (I know the intrinsic parameters of the camera).
Given the images captured at two positions, I need to find the relative rotation and translation between the two cameras. The typical translation is about 5 to 15 meters, and the yaw rotation between cameras ranges between 0 and 20 degrees.
To achieve this, the following steps are adopted:
a. Finding point correspondences using SIFT/SURF
b. Fundamental Matrix identification
c. Estimation of the Essential Matrix by E = K^T * F * K and modifying E for the singularity constraint
d. Decomposing the Essential Matrix to get the rotation, R = U*W*V^T or R = U*W^T*V^T (U and V^T are obtained from the SVD of E); a sketch of steps c and d follows this list
e. Obtaining the real rotation angles from the rotation matrix
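A minimal sketch of steps c and d, assuming both images share the same intrinsic matrix K and that F comes from cv::findFundamentalMat on the matched points:

#include <opencv2/opencv.hpp>

void DecomposeEssential(const cv::Mat& F, const cv::Mat& K,
                        cv::Mat& R1, cv::Mat& R2, cv::Mat& t)
{
    // c. E = K^T * F * K (the two non-zero singular values of E should be equal)
    cv::Mat E = K.t() * F * K;
    cv::SVD svd(E, cv::SVD::FULL_UV);

    // d. R = U*W*V^T or R = U*W^T*V^T, t = +/- u3 (third column of U)
    cv::Mat W = (cv::Mat_<double>(3, 3) << 0, -1, 0,
                                           1,  0, 0,
                                           0,  0, 1);
    R1 = svd.u * W * svd.vt;
    R2 = svd.u * W.t() * svd.vt;
    t  = svd.u.col(2).clone();

    // A proper rotation must have det(R) = +1; flip the sign if the SVD gave a reflection
    if (cv::determinant(R1) < 0) R1 = -R1;
    if (cv::determinant(R2) < 0) R2 = -R2;
}

Recent OpenCV versions (3.x and later) also provide cv::decomposeEssentialMat() and cv::recoverPose(), which carry out step d and the selection among the candidate solutions for you.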
Experiment 1: Real Data
For the real-data experiment, I captured images by mounting a camera on a tripod. I captured images at Position 1, then moved to another aligned position, changed the yaw angle in steps of 5 degrees, and captured images for Position 2.
Problems/Issues:
The sign of the estimated yaw angles does not match the ground-truth yaw angles. Sometimes 5 deg is estimated as 5 deg, but 10 deg as -10 deg, and again 15 deg as 15 deg.
In the experiment only the yaw angle is changed; however, the estimated roll and pitch angles have nonzero values close to 180/-180 degrees.
Precision is very poor: in some cases the error between the estimated and ground-truth angles is around 2-5 degrees.
How do I find the scale factor to get the translation in real-world measurement units?
The behavior is the same on simulated data as well.
Has anybody experienced similar problems? Any clue on how to resolve them?
Any help from anybody would be highly appreciated.
(I know there are already many posts on similar problems; going through all of them has not saved me. Hence posting one more time.)
In chapter 9.6 of Hartley and Zisserman, they point out that, for a particular essential matrix, if one camera is held in the canonical position/orientation, there are four possible solutions for the second camera matrix: [UWV' | u3], [UWV' | -u3], [UW'V' | u3], and [UW'V' | -u3].
The difference between the first and third (and second and fourth) solutions is that the orientation is rotated by 180 degrees about the line joining the two cameras, called a "twisted pair", which sounds like what you are describing.
The book says that in order to choose the correct combination of translation and orientation from the four options, you need to test a point in the scene and make sure that the point is in front of both cameras.
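In OpenCV that test is available directly; a minimal sketch (assuming pixel-coordinate matches and a single shared intrinsic matrix K):

#include <vector>
#include <opencv2/opencv.hpp>

// cv::recoverPose (OpenCV 3+) tries the four [R | t] combinations and keeps the
// one that places the triangulated points in front of both cameras.
void ChoosePoseWithCheirality(const std::vector<cv::Point2f>& pts1,
                              const std::vector<cv::Point2f>& pts2,
                              const cv::Mat& K)
{
    cv::Mat E = cv::findEssentialMat(pts1, pts2, K, cv::RANSAC);

    cv::Mat R, t;
    int inFront = cv::recoverPose(E, pts1, pts2, K, R, t);
    // `inFront` is the number of matches passing the in-front-of-both-cameras test;
    // a value near zero means E or the matches cannot be trusted.
}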
For problems 1 and 2,
Look for "Euler angles" on Wikipedia or any good math site like Wolfram MathWorld. You will find the different possible Euler-angle conventions, and I am sure you can figure out why you are getting sign changes in your results based on that reading.
For problem 3,
It mostly has to do with the accuracy of your individual camera calibration.
For problem 4,
Not sure. How about measuring the distance to a point from the camera with a tape measure and comparing it with the translation norm to get the scale factor?
Possible reasons for bad accuracy:
1) There is a difference between getting reasonable and precise accuracy in camera calibration. See this thread.
2) The accuracy with which you are moving the tripod. How are you ensuring that there is no rotation of the tripod around an axis perpendicular to the surface during the change in position?
I did not get your simulation concept, but I would suggest the test below.
Take images without moving the camera or object. If you now calculate the relative camera pose, the rotation should be the identity matrix and the translation should be a null vector. Due to numerical inaccuracies and noise, you might see rotation deviations of a few arc minutes.
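A tiny sketch of that check on the recovered rotation (tolerance chosen arbitrarily here):

#include <opencv2/opencv.hpp>

bool IsNearIdentityRotation(const cv::Mat& R, double tolerance = 1e-2)
{
    double deviation = cv::norm(R, cv::Mat::eye(3, 3, R.type()));
    return deviation < tolerance;   // only arc-minute-level deviation is expected
}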

XNA 4.0 Camera Question

I'm having trouble understanding how the camera works in my test application. I've been able to piece together a working camera - now I am trying to make sure I understand how it all works. My camera is encapsulated in its own class. Here is the update method that gets called from my Game.Update() method:
public void Update(float dt)
{
    Yaw += (200 - Game.MouseState.X) * dt * .12f;
    Pitch += (200 - Game.MouseState.Y) * dt * .12f;
    Mouse.SetPosition(200, 200);

    _worldMatrix = Matrix.CreateFromAxisAngle(Vector3.Right, Pitch) * Matrix.CreateFromAxisAngle(Vector3.Up, Yaw);

    float distance = _speed * dt;
    if (_game.KeyboardState.IsKeyDown(Keys.E))
        MoveForward(distance);
    if (_game.KeyboardState.IsKeyDown(Keys.D))
        MoveForward(-distance);
    if (_game.KeyboardState.IsKeyDown(Keys.S))
        MoveRight(-distance);
    if (_game.KeyboardState.IsKeyDown(Keys.F))
        MoveRight(distance);
    if (_game.KeyboardState.IsKeyDown(Keys.A))
        MoveUp(distance);
    if (_game.KeyboardState.IsKeyDown(Keys.Z))
        MoveUp(-distance);

    _worldMatrix *= Matrix.CreateTranslation(_position);
    _viewMatrix = Matrix.Invert(_worldMatrix); // What's going on here???
}
First of all, I understand everything in this method other than the very last part where the matrices are being manipulated. I think the terminology is getting in my way as well. For example, my _worldMatrix is really a Rotation Matrix. What really baffles me is the part where the _viewMatrix is calculated by inverting the _worldMatrix. I just don't understand what this is all about.
In prior testing, I always used Matrix.CreateLookAt() to create a view matrix, so I'm a bit confused. I'm hoping someone can explain in simple terms what is going on.
Thanks,
-Scott
One operation the view matrix performs for the graphics pipeline is converting a 3D point from world space (the x, y, z we all know & love) into view (or camera) space, a space where the camera is considered to be the center of the world (0,0,0) and all points/objects are relative to it. So while a point may be at 1,1,1 relative to the world, what are its coordinates relative to the camera location? Well, as it turns out, to find out you can transform that point by the inverse of a matrix representing the camera's world-space position/rotation.
It kinda makes sense if you think about it... let's say the camera position is 2,2,2. An arbitrary point is at 3,3,3. We know that the point is 1,1,1 away from the camera, right? So what transformation would you apply to the point 3,3,3 in order for it to become 1,1,1 (its location relative to the camera)? You would transform 3,3,3 by -2,-2,-2 to get 1,1,1, and -2,-2,-2 is also the camera's inverted position. That example was for translation because it is relatively easy to grok, but basically the same happens for rotation. Don't expect to be able to simply negate all basis vectors to invert a matrix, though... there is a little more going on than that for rotation.
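The translation part of that example, written out with glm rather than XNA purely as an illustration (the XNA Matrix.Invert call in the question plays the same role):

#include <glm/glm.hpp>
#include <glm/gtc/matrix_transform.hpp>

// The camera's world matrix places the camera in the world; its inverse maps
// world-space points into camera (view) space.
glm::mat4 cameraWorld = glm::translate(glm::mat4(1.0f), glm::vec3(2.0f, 2.0f, 2.0f));
glm::mat4 view        = glm::inverse(cameraWorld);

// A point at (3,3,3) in the world ends up at (1,1,1) relative to the camera,
// exactly as in the example above.
glm::vec4 viewSpacePoint = view * glm::vec4(3.0f, 3.0f, 3.0f, 1.0f);   // = (1, 1, 1, 1)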
The Matrix.CreateLookAt() method automatically returns the inverted matrix so you don't really notice it happening unless you reflect its code.
Taking that one step further, the projection matrix then takes that point in view space and projects it onto a flat surface, so a point that started out in 3D space is now in 2D space.
