AVMetadataFaceObject Precision - ios

I'm trying to use the AVMetadataFaceObject to get the yaw and roll of a face in a video. From what I can tell, the precision of the yaw is in increments of 45 degrees and the roll is in increments of 30 degrees.
Is there a way to increase this precision?
(Code as seen in Proper usage of CIDetectorTracking).

You can get the rectangles of the eyes and calculate the angle yourself. You should investigate the changes made here in iOS 7, as there are many improvements in this area.
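As a rough illustration of computing the angle yourself, the roll can be estimated from the line joining the two eye centers. This is only a sketch: the function name and the (x, y) tuple inputs are hypothetical, and the sign of the result depends on whether the y axis of your coordinate system points up or down.

```python
import math

def roll_from_eyes(left_eye, right_eye):
    """Return the in-plane tilt of the eye line, in degrees.

    left_eye and right_eye are (x, y) eye centers, e.g. the midpoints
    of the detected eye rectangles (hypothetical inputs).
    """
    dx = right_eye[0] - left_eye[0]
    dy = right_eye[1] - left_eye[1]
    # atan2 keeps the correct quadrant and handles dx == 0
    return math.degrees(math.atan2(dy, dx))
```

Because this is a continuous function of the eye positions, the result is not quantized to fixed steps the way the reported roll is.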

Related

Calculate angle from the floor

I need to find a specific angle from iPhone CoreMotion data (pitch, yaw, roll, or quaternions). Let's imagine two lines. The first one goes perpendicularly from the phone to the floor. The second one points to the place the camera is looking at (if the camera were working, this line would start at the camera and end at the place displayed in the center of the camera preview). I need to find the angle between these two lines. I have no idea where to start, can someone help?
I have all the data from CoreMotion, so pitch, yaw, roll / gravity x,y,z, Attitude.quaternion/rotationMatrix.
If you have all the parameters, then you should be able to find the required angle with the inverse sine:
θ = arcsin(opposite / hypotenuse)
I needed to change my way of looking at this. First I realized that yaw does not affect this angle at all. The solution was the Pythagorean formula with pitch and roll as parameters:
sqrt(pow(pitch, 2) + pow(roll, 2))
Actually, this still does not work well when the phone is slightly rotated about the yaw axis. I don't know why; I'll try to figure it out. But the error is small, and it is only visible when pitch is close to 90 degrees.
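An alternative that is not from the answer above, shown only as a sketch: since the question says the gravity vector is available, the angle can be taken directly between gravity (the line to the floor) and the camera direction. The assumption here is the Core Motion device frame, where +z points out of the screen, so the back camera looks along -z. The function name is hypothetical.

```python
import math

def camera_angle_from_down(gx, gy, gz):
    """Angle in degrees between the 'down' direction and the back camera.

    (gx, gy, gz) is the gravity vector in the device frame. The back
    camera is assumed to look along -z, so the cosine of the angle is
    the dot product of normalized gravity with (0, 0, -1).
    """
    norm = math.sqrt(gx * gx + gy * gy + gz * gz)
    cos_angle = -gz / norm
    # clamp against rounding so acos never sees a value outside [-1, 1]
    cos_angle = max(-1.0, min(1.0, cos_angle))
    return math.degrees(math.acos(cos_angle))
```

With the phone lying face up on a table, gravity is roughly (0, 0, -1) and the result is 0 degrees (camera pointing at the floor); held upright in portrait, gravity is roughly (0, -1, 0) and the result is 90 degrees. Yaw does not enter the calculation at all.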

Transforming Pitch and Roll values in iOS

I have been trying to compare the pitch and roll pattern (2 second recording) readings.
What I have done is record pitch and roll values for 10 seconds. During those 10 seconds, a similar motion is applied to the device each time.
When I plot these on a graph, the values are also the same.
This only happens when the motion is applied to the device from the same orientation/position. For example, the device is lying on the table and a motion is applied to it. All the readings are the same.
But if the device is rotated 180 degrees counterclockwise, the readings are inverted.
Is there a way I can achieve same reading for every position? Via applying some transformation formula? I have done it for acceleration values using pitch roll and yaw. But, don't know how to achieve this for pitch and roll itself.
Basically, what I want to achieve is that the Pitch and Roll values should be independent of yaw.
Here is the plot for the pitch values. All the readings are the same except the two starting from 1.5 in the graph. These were the two times when the device was rotated 180 degrees counterclockwise.
UPDATE:
I tried to store the CMAttitude in NSUserDefaults and then applied multiplyByInverseOfAttitude. But the graph plot is still inverted.
CMAttitude *currentAtt = motion.attitude;
if (firstAttitude)//firstAttitude is the stored CMAttitude
{
[currentAtt multiplyByInverseOfAttitude:firstAttitude];
}
CMQuaternion quat = currentAtt.quaternion;

iPhone augmented reality Euler angles rotation – roll issue

I’m working on an iOS augmented reality application.
It is location-based, not marker-based.
I use the GPS, compass and accelerometers to get latitude, longitude, altitude and the 3 euler angles: yaw, pitch and roll. I know using NSLog() that those 6 variables contain valid data.
My application shows some 3d objects over the camera view.
It works fine as long as I use everything but the roll angle.
If I add that third angle, the rotation applied to my opengl world is not good. I do it that way in the main OpenGL draw method
glRotatef(pitch, 1, 0, 0);
glRotatef(yaw, 0, 1, 0);
//glRotatef(roll, 0, 0, 1);
I think there is something wrong with this approach but am certainly not a specialist. Maybe I should create some sort of unique rotation matrix rather than 3 different ones?
Maybe that’s not possible easily? After all most desktop video games, FPS and the like, just let the user change the yaw and the pitch using the mouse, so only 2 angles, not 3. But unlike the mouse, which is a 2d device, a phone used for augmented reality can move in any angles.
But then again, all AR tutorials I have seen online couldn’t handle ‘roll’ properly. ‘Rolling’ your phone would either completely mess AR stuff up or do nothing at all, using some roll-compensation strategies.
So my question is, assuming I have my 3 Euler angles using the phone sensors, how should I apply them to my 3d opengl view?
I think you're likely talking about gimbal lock.
The essence of the problem is that if you rotate with Euler angles then there's always a sequence to it. For example, you rotate around x, then around y, then around z. But then one axis can always become ambiguous, because a preceding rotation can move it onto a different axis.
Suppose the rotation were 0 degrees around x, 90 degrees around y, then 20 degrees around z. So you do the x rotation and nothing has changed. You do the y rotation and everything moves 90 degrees. But now you've moved the z axis onto where the x axis was previously. So the z rotation will appear to be around x.
No matter what most people's instincts tell them, there's no way to avoid the problem. The kneejerk reaction is to always rotate around the global axes rather than the local ones. That doesn't resolve the problem, it just reverses the order. The z rotation could then turn the y rotation, which has already occurred, into an x rotation.
You're right that you should aim to create a unique description of rotation separated from measuring angles.
For augmented reality it's actually not all that difficult.
The accelerometer tells you which way down is. The compass tells you which way north is. The two may not be orthogonal though: the compass vector should vary from being exactly at a right angle to the accelerometer vector at the equator to being exactly parallel to it at the poles.
So:
just accept the accelerometer vector as down;
get the cross product of down and the compass vector to get your side vector, which should point east-west, along a line of latitude;
then get the cross product of your side vector and your down vector to get a north vector that is suitably perpendicular.
You could equally use the dot product to remove that portion of the compass vector that is in the direction of gravity and cross product from there.
You'll want to normalise everything.
That gives you three basis vectors, so just put them directly into a matrix. No further work required.
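The three steps above can be sketched as follows. The function and variable names are hypothetical, the inputs are plain 3-element sequences in the device frame, and whether the returned vectors go into the rows or the columns of your matrix depends on your OpenGL conventions.

```python
import math

def cross(a, b):
    """Cross product of two 3-vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def normalize(v):
    """Scale a 3-vector to unit length."""
    length = math.sqrt(sum(c * c for c in v))
    return tuple(c / length for c in v)

def basis_from_sensors(accel_down, compass):
    """Build an orthonormal (side, north, down) basis.

    accel_down: accelerometer vector, accepted as 'down'.
    compass:    magnetic field vector, not necessarily perpendicular to down.
    """
    down = normalize(accel_down)
    # side is perpendicular to both down and the compass vector
    side = normalize(cross(down, compass))
    # north is the compass direction with its vertical part removed
    north = normalize(cross(side, down))
    return side, north, down
```

For example, with down = (0, 0, -1) and a compass vector (0, 1, -1) that dips toward the ground, this yields side = (1, 0, 0) and north = (0, 1, 0). The cross product is undefined when the compass vector is parallel to down, which matches the degenerate case at the poles.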

Relative Camera Pose Estimation using OpenCV

I'm trying to estimate the relative camera pose using OpenCV. The cameras in my case are calibrated (I know the intrinsic parameters of the camera).
Given the images captured at two positions, I need to find the relative rotation and translation between the two cameras. The typical translation is about 5 to 15 meters, and the yaw rotation between the cameras ranges from 0 to 20 degrees.
For achieving this, following steps are adopted.
a. Finding point correspondences using SIFT/SURF
b. Estimating the fundamental matrix F
c. Estimating the essential matrix by E = K'FK and modifying E to enforce the singularity constraint
d. Decomposing the essential matrix to get the rotation, R = UWVt or R = UW'Vt (U and Vt are obtained from the SVD of E)
e. Obtaining the real rotation angles from the rotation matrix
Experiment 1: Real Data
For the real data experiment, I captured images by mounting a camera on a tripod. I captured images at Position 1, then moved the tripod to another aligned position, changed the yaw angle in steps of 5 degrees, and captured images for Position 2.
Problems/Issues:
1. The signs of the estimated yaw angles do not match the ground truth yaw angles. Sometimes 5 deg is estimated as 5 deg, but 10 deg as -10 deg and then 15 deg as 15 deg again.
2. In the experiment only the yaw angle is changed, yet the estimated roll and pitch angles have nonzero values close to 180/-180 degrees.
3. Precision is very poor: in some cases the error between the estimated and ground truth angles is around 2-5 degrees.
4. How do I find the scale factor to get the translation in real-world measurement units?
The behavior is the same on simulated data.
Has anybody experienced similar problems? Do you have any clue on how to resolve them?
Any help from anybody would be highly appreciated.
(I know there are already many posts on similar problems, but going through all of them has not helped me. Hence posting one more time.)
In chapter 9.6 of Hartley and Zisserman, they point out that, for a particular essential matrix, if one camera is held in the canonical position/orientation, there are four possible solutions for the second camera matrix: [UWV' | u3], [UWV' | -u3], [UW'V' | u3], and [UW'V' | -u3].
The difference between the first and third (and second and fourth) solutions is that the orientation is rotated by 180 degrees about the line joining the two cameras, called a "twisted pair", which sounds like what you are describing.
The book says that in order to choose the correct combination of translation and orientation from the four options, you need to test a point in the scene and make sure that the point is in front of both cameras.
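The "in front of both cameras" test can be sketched as below. This is a minimal illustration, not OpenCV code: it assumes normalized image coordinates (pixel coordinates already multiplied by the inverse of K), the convention X2 = R*X1 + t, and hypothetical function names. For each candidate it solves for the depth of one matched point along both viewing rays and keeps the candidate where both depths are positive.

```python
def mat_vec(m, v):
    """Multiply a 3x3 matrix (list of rows) by a 3-vector."""
    return [sum(m[i][j] * v[j] for j in range(3)) for i in range(3)]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def ray_depths(R, t, x1, x2):
    """Least-squares depths (d1, d2) with d2*x2 = d1*R*x1 + t.

    x1, x2 are homogeneous normalized image points (u, v, 1).
    """
    a = mat_vec(R, x1)          # ray of camera 1 expressed in camera 2
    b = [-c for c in x2]        # negated ray of camera 2
    rhs = [-c for c in t]       # d1*a + d2*b = -t
    aa, ab, bb = dot(a, a), dot(a, b), dot(b, b)
    ar, br = dot(a, rhs), dot(b, rhs)
    det = aa * bb - ab * ab     # zero only if the two rays are parallel
    d1 = (ar * bb - ab * br) / det
    d2 = (aa * br - ab * ar) / det
    return d1, d2

def pick_pose(candidates, x1, x2):
    """Return the index of the (R, t) pair that puts the point in front
    of both cameras, or None if no candidate qualifies."""
    for index, (R, t) in enumerate(candidates):
        d1, d2 = ray_depths(R, t, x1, x2)
        if d1 > 0 and d2 > 0:
            return index
    return None
```

In practice a single noisy match can give the wrong answer, so it is safer to run the test over many correspondences and keep the candidate that wins the majority.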
For problems 1 and 2,
Look up "Euler angles" on Wikipedia or any good math site such as Wolfram MathWorld. You will find the different possible conventions for Euler angles, and the literature should make clear why you are getting sign changes in your results.
For problem 3,
It most likely has to do with the accuracy of your individual camera calibration.
For problem 4,
Not sure. How about measuring the distance from the camera to a point with a tape and comparing it with the translation norm to get the scale factor?
Possible reasons for bad accuracy:
1) There is a difference between getting reasonable and precise accuracy in camera calibration. See this thread.
2) The accuracy with which you are moving the tripod. How are you ensuring that the tripod does not rotate around an axis perpendicular to the surface when you change its position?
I did not get your simulation concept. But, I would suggest the below test.
Take images without moving the camera or object. Now if you calculate relative camera pose, rotation should be identity matrix and translation should be null vector. Due to numerical inaccuracies and noise, you might see rotation deviation in arc minutes.

iOS: Can I get the pitch/yaw/roll from accelerometer data?

I want to find out the pitch, yaw, and roll on an iPad 1. Since there is no deviceMotion facility, can I get this data from the accelerometer? I assume that I can use the vector that it returns to compare against a reference vector i.e. gravity.
Does iOS detect when the device is still and then take that as the gravity vector? Or do I have to do that?
Thanks.
It's definitely possible to calculate the Pitch and Roll from accelerometer data, but Yaw requires more information (gyroscope for sure but possibly compass could be made to work).
For an example, look at Hungry Shark for iOS. Based on how their tilt calibration UI works, I'm pretty sure they're using the accelerometer instead of the gyroscope.
Also, here are some formulas I found in a blog post from Taylor-Robotic for calculating pitch and roll:
Now that we have 3 outputs expressed in g we should be able to
calculate the pitch and the roll. This requires two further equations.
pitch = atan (x / sqrt(y^2 + z^2))
roll = atan (y / sqrt(x^2 + z^2))
This will produce the pitch and roll in radians, to convert them into
friendly degrees we multiply by 180, then divide by PI.
pitch = (pitch * 180) / PI
roll = (roll * 180) / PI
The thing I'm still looking for is how to calibrate the pitch and roll values based on how the user is holding the device. If I can't figure it out soon, I may open up a separate question. Good Luck!
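The pitch and roll formulas quoted above can be sketched as follows. The function name is hypothetical, and atan2 is used in place of atan with a division so that a zero denominator (for example x = 1, y = z = 0) does not raise an error; for a positive denominator the two give the same result.

```python
import math

def pitch_roll_degrees(x, y, z):
    """Pitch and roll in degrees from accelerometer readings in g."""
    pitch = math.atan2(x, math.sqrt(y * y + z * z))
    roll = math.atan2(y, math.sqrt(x * x + z * z))
    # convert from radians to degrees (multiply by 180, divide by PI)
    return math.degrees(pitch), math.degrees(roll)
```

These values are only meaningful while the device is close to stationary, since any other acceleration is added to gravity in the readings.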
