I'm trying to get the four vectors that make up the boundaries of the frustum in ARKit, and the solution I came up with is as follows:
Find the field of view angles of the camera
Then find the direction and up vectors of the camera
Using this information, find the four vectors using cross products and rotations
This may be a sloppy way of doing it, but it's the best I've got so far.
I am able to get the FOV angles and the direction vector from the ARCamera.intrinsics and ARCamera.transform properties. However, I don't know how to get the up vector of the camera at this point.
Below is the piece of code I use to find the FOV angles and the direction vector:
func session(_ session: ARSession, didUpdate frame: ARFrame) {
    if xFovDegrees == nil || yFovDegrees == nil {
        let imageResolution = frame.camera.imageResolution
        let intrinsics = frame.camera.intrinsics
        xFovDegrees = 2 * atan(Float(imageResolution.width) / (2 * intrinsics[0,0])) * 180 / Float.pi
        yFovDegrees = 2 * atan(Float(imageResolution.height) / (2 * intrinsics[1,1])) * 180 / Float.pi
    }

    let cameraTransform = SCNMatrix4(frame.camera.transform)
    let cameraDirection = SCNVector3(-1 * cameraTransform.m31,
                                     -1 * cameraTransform.m32,
                                     -1 * cameraTransform.m33)
}
I am also open to suggestions for ways to find the four vectors I'm trying to get.
I had not understood how this line worked:
let cameraDirection = SCNVector3(-1 * cameraTransform.m31,
                                 -1 * cameraTransform.m32,
                                 -1 * cameraTransform.m33)
This gives the direction vector of the camera because the 3rd row of the transformation matrix gives where the new z-direction of the transformed camera points. We multiply it by -1 because the default direction of the camera is the negative z-axis.
Considering this information and the fact that the default up vector for a camera is the positive y-axis, the 2nd row of the transformation matrix gives us the up vector of the camera. The following code gives me what I want:
let cameraUp = SCNVector3(cameraTransform.m21,
                          cameraTransform.m22,
                          cameraTransform.m23)
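From here, one possible way to finish the original plan and get the four frustum corner rays from the direction and up vectors plus the FOV half-angles is sketched below (my own rough, untested take, so treat the cross-product orientation and the corner labels as assumptions):

import Foundation
import simd

/// Sketch: corner-ray directions of the frustum, built from the camera's
/// direction and up vectors and the horizontal/vertical FOV.
func frustumCornerDirections(direction: SIMD3<Float>,
                             up: SIMD3<Float>,
                             xFovDegrees: Float,
                             yFovDegrees: Float) -> [SIMD3<Float>] {
    let d = simd_normalize(direction)
    let u = simd_normalize(up)
    let r = simd_normalize(simd_cross(d, u))      // camera "right" vector
    let tx = tan(xFovDegrees * .pi / 180 / 2)     // horizontal half-angle extent
    let ty = tan(yFovDegrees * .pi / 180 / 2)     // vertical half-angle extent
    return [
        simd_normalize(d + tx * r + ty * u),      // top-right corner ray
        simd_normalize(d - tx * r + ty * u),      // top-left
        simd_normalize(d - tx * r - ty * u),      // bottom-left
        simd_normalize(d + tx * r - ty * u),      // bottom-right
    ]
}

To feed it the SCNVector3 values above, convert them first, e.g. SIMD3<Float>(cameraDirection.x, cameraDirection.y, cameraDirection.z).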
It could be that I'm misunderstanding what you're trying to do, but I'd like to offer an alternative solution (the method and result are different from your answer).
For my purposes, I define the up vector as (0, 1, 0) when the phone is pointing straight up - basically I want the unit vector that is pointing straight out of the top of the phone. ARKit defines the up vector as (0, 1, 0) when the phone is horizontal to the left - so the y-axis is pointing out of the right side of the phone - supposedly because they expect AR apps to prefer horizontal orientation.
camera.transform returns the camera's orientation relative to its initial orientation when the AR session started. It is a 4x4 matrix - the first 3x3 of which is the rotation matrix - so when you write cameraTransform.m21 etc. you are referencing part of the rotation matrix, which is NOT the same as the up vector (however you define it).
So if I define the up vector as the unit y-vector where the y axis is pointing out of the top of the phone, I have to write this as (-1, 0, 0) in ARKit space. Then simply multiplying this vector (slightly modified... see below) by the camera's transform will give me the "up vector" that I'm looking for. Below is an example of using this calculation in a ARSessionDelegate callback.
func session(_ session: ARSession, didUpdate frame: ARFrame) {
    // the unit y vector is appended with an extra element
    // for multiplying with the 4x4 transform matrix
    let unitYVector = float4(-1, 0, 0, 1)
    let upVectorH = frame.camera.transform * unitYVector

    // drop the 4th element
    let upVector = SCNVector3(upVectorH.x, upVectorH.y, upVectorH.z)
}
You can use let unitYVector = float4(0, 1, 0, 1) if you are working with ARKit's horizontal orientation.
You can also do the same sort of calculation to get the "direction vector" (pointing out of the front of the phone) by multiplying unit vector (0, 0, 1, 1) by the camera transform.
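In the same vein, here is a sketch of the direction calculation (my own variant, not the answer's code above): using w = 0 makes the multiplication ignore the translation column of the transform, so the result is a pure direction rather than a direction plus the camera's position.

import ARKit
import SceneKit
import simd

// Sketch: (0, 0, 1) points out of the screen side of the device in camera space;
// negate it (or use (0, 0, -1, 0)) for the direction the rear camera is looking.
func cameraFrontVector(of camera: ARCamera) -> SCNVector3 {
    let front = camera.transform * simd_float4(0, 0, 1, 0)   // w = 0: direction, not a point
    return SCNVector3(front.x, front.y, front.z)
}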
I detected the ArUco marker and estimated the pose. See the image below. However, the Xt (X translation) I get is a positive value. According to the drawAxis function, the positive direction is away from the image center, so I thought it was supposed to be a negative value. Why am I getting a positive value instead?
My camera is about 120 mm away from the imaging surface, but I am getting Zt (Z translation) in the range of 650 mm. Is pose estimation giving the pose of the marker with respect to the physical camera or the image plane center? I don't understand why Zt is so high.
I kept measuring the pose while changing Z, and obtained roll, pitch, and yaw. I noticed the roll (rotation w.r.t. the camera X-axis) keeps changing sign back and forth while its magnitude varies between 166 and 178, but the sign of Xt did not change with the sign change in roll. Any thoughts on why it behaves like that?
Any suggestions for getting more consistent data?
# imports assumed from the rest of the script (not shown in the original post)
import math
import cv2 as cv
import numpy as np
from scipy.spatial.transform import Rotation as R

# fname, mtx, dst, aruco_marker_side_length and euler_from_quaternion()
# are defined elsewhere in the script
image = cv.imread(fname)
arucoDict = cv.aruco.Dictionary_get(cv.aruco.DICT_4X4_1000)
arucoParams = cv.aruco.DetectorParameters_create()
(corners, ids, rejected) = cv.aruco.detectMarkers(image, arucoDict,
                                                  parameters=arucoParams)
print(corners, ids, rejected)

if len(corners) > 0:
    # flatten the ArUco IDs list
    ids = ids.flatten()
    # loop over the detected ArUco corners
    # for (markerCorner, markerID) in zip(corners, ids):
    # (markerCorner, markerID) = (corners, ids)

    # extract the marker corners (which are always returned in
    # top-left, top-right, bottom-right, and bottom-left order)
    # corners = corners.reshape((4, 2))
    (topLeft, topRight, bottomRight, bottomLeft) = (corners[0][0][0], corners[0][0][1],
                                                    corners[0][0][2], corners[0][0][3])

    # convert each of the (x, y)-coordinate pairs to integers
    topRight = (int(topRight[0]), int(topRight[1]))
    bottomRight = (int(bottomRight[0]), int(bottomRight[1]))
    bottomLeft = (int(bottomLeft[0]), int(bottomLeft[1]))
    topLeft = (int(topLeft[0]), int(topLeft[1]))

    # draw the bounding box of the ArUco detection
    cv.line(image, topLeft, topRight, (0, 255, 0), 2)
    cv.line(image, topRight, bottomRight, (0, 255, 0), 2)
    cv.line(image, bottomRight, bottomLeft, (0, 255, 0), 2)
    cv.line(image, bottomLeft, topLeft, (0, 255, 0), 2)

    # compute and draw the center (x, y)-coordinates of the ArUco marker
    cX = int((topLeft[0] + bottomRight[0]) / 2.0)
    cY = int((topLeft[1] + bottomRight[1]) / 2.0)
    cv.circle(image, (cX, cY), 4, (0, 0, 255), -1)

    # estimate the in-plane rotation from the corner pixel coordinates
    if topLeft[1] != topRight[1] or topLeft[0] != bottomLeft[0]:
        rot1 = np.degrees(np.arctan((topLeft[0] - bottomLeft[0]) / (bottomLeft[1] - topLeft[1])))
        rot2 = np.degrees(np.arctan((topRight[1] - topLeft[1]) / (topRight[0] - topLeft[0])))
        rot = (np.round(rot1, 3) + np.round(rot2, 3)) / 2
        print(rot1, rot2, rot)
    else:
        rot = 0

    # draw the ArUco marker ID on the image
    rotS = ",rotation:" + str(np.round(rot, 3))
    cv.putText(image, ("position: " + str(cX) + "," + str(cY)),
               (100, topLeft[1] - 15), cv.FONT_HERSHEY_SIMPLEX, 0.5, (255, 0, 80), 2)
    cv.putText(image, rotS,
               (400, topLeft[1] - 15), cv.FONT_HERSHEY_SIMPLEX, 0.5, (255, 0, 80), 2)
    print("[INFO] ArUco marker ID: {}".format(ids))
    d = np.round((math.dist(topLeft, bottomRight) + math.dist(topRight, bottomLeft)) / 2, 3)

    # Get the rotation and translation vectors
    rvecs, tvecs, obj_points = cv.aruco.estimatePoseSingleMarkers(corners,
                                                                  aruco_marker_side_length,
                                                                  mtx, dst)

    # Print the pose for the ArUco marker
    # The pose of the marker is with respect to the camera lens frame.
    # Imagine you are looking through the camera viewfinder,
    # the camera lens frame's:
    # x-axis points to the right
    # y-axis points straight down towards your toes
    # z-axis points straight ahead away from your eye, out of the camera
    # for i, marker_id in enumerate(marker_ids):

    # Store the translation (i.e. position) information
    transform_translation_x = tvecs[0][0][0]
    transform_translation_y = tvecs[0][0][1]
    transform_translation_z = tvecs[0][0][2]

    # Store the rotation information
    rotation_matrix = np.eye(4)
    rotation_matrix[0:3, 0:3] = cv.Rodrigues(np.array(rvecs[0]))[0]
    r = R.from_matrix(rotation_matrix[0:3, 0:3])
    quat = r.as_quat()

    # Quaternion format
    transform_rotation_x = quat[0]
    transform_rotation_y = quat[1]
    transform_rotation_z = quat[2]
    transform_rotation_w = quat[3]

    # Euler angle format in radians
    roll_x, pitch_y, yaw_z = euler_from_quaternion(transform_rotation_x,
                                                   transform_rotation_y,
                                                   transform_rotation_z,
                                                   transform_rotation_w)
    roll_x = math.degrees(roll_x)
    pitch_y = math.degrees(pitch_y)
    yaw_z = math.degrees(yaw_z)
Disclaimer: this goes for OpenCV v4.5.5 and corresponding aruco module (contrib repo). They redid a lot of aruco stuff for v4.6.0 and v4.7.0, so best check everything I say here.
Without checking all the code (looks roughly okay), a few basics about OpenCV and aruco:
Both use right-handed coordinate systems. Thumb X, index Y, middle Z.
OpenCV uses X right, Y down, Z far, for screen/camera frames. Origin for screens and pictures is the top left corner. For cameras, the origin is the center of the pinhole model, which would be the center of the aperture. I can't comment on lenses or lens systems. Assume the lens center is the origin. That's probably close enough.
Aruco uses X right, Y far, Z up, if the marker is lying flat on a table. Origin is in the center of the marker. The top left corner of the marker is considered the "first" corner.
The marker can be considered to have its own coordinate system/frame.
The pose given by rvec and tvec is the pose of the marker in the camera frame. That means np.linalg.norm(tvec) gives you the direct distance from the camera to the marker's center. tvec's Z is just the component parallel to optical axis.
If the marker is in the right half of the picture ("half" defined by camera matrix's cx,cy), you'd expect tvec's X to grow. Lower half, Y positive/growing.
Conversely, that transformation transforms marker-local coordinates to camera-local. Try transforming some marker-local points, such as the origin or points on the axes. I believe that cv::transform can help with that. Using OpenCV's projectPoints to map 3D space points to 2D image points, you can then draw the marker's axes, or a cube on top of it, or anything you like (a small sketch of the marker-to-camera transform follows after these notes).
Say the marker sits upright and faces the camera dead-on. When you consider the frame triads of the marker and the camera in space ("world" space), both would be X "right", but one's Y and Z are opposite the other's Y and Z, so you'd expect to see a rotation around the X axis by half a turn (rotating Z and Y).
You could imagine the transformation to happen like this:
initially the camera looks through the marker, from the marker's back out into the world. The camera would be "upside down". The camera sees marker-space.
the pose's rotation component rotates the whole marker-local world around the camera's origin. Seen from the world frame (point of reference), the camera rotates, into an attitude you'd find natural.
the pose's translation moves the marker's world out in front of the camera (Z being positive), or equivalently, the camera backs away from the marker.
If you get implausible values, check aruco_marker_side_length and camera matrix. f would be around 500-3000 for typical resolutions (VGA-4k) and fields of view (60-80 degrees).
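To make the "pose of the marker in the camera frame" idea concrete, here is a minimal sketch (written in Swift with simd purely for illustration, since the question's code is Python; R is assumed to be rvec already converted to a 3x3 rotation matrix, e.g. via Rodrigues, and t is tvec):

import simd

// Sketch: the pose maps marker-local coordinates into camera-local coordinates.
func cameraFramePoint(ofMarkerLocal p: SIMD3<Float>,
                      R: simd_float3x3,
                      t: SIMD3<Float>) -> SIMD3<Float> {
    return R * p + t
}

// The marker's center (its local origin) maps to t itself, so the straight-line
// distance from the camera to the marker center is just the length of t,
// while t.z alone is only the component along the optical axis.
func distanceToMarkerCenter(t: SIMD3<Float>) -> Float {
    return simd_length(t)
}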
I just started learning how to use SceneKit yesterday, so I may get some things wrong. I am trying to make my cameraNode look at a SCNVector3 point in the scene.
I am trying to make my app available to people below iOS 11.0. However, the look(at:) function is only for iOS 11.0+.
Here is my function where I initialise the camera:
func initCamera() {
    cameraNode = SCNNode()
    cameraNode.camera = SCNCamera()
    cameraNode.position = SCNVector3(5, 12, 10)

    if #available(iOS 11.0, *) {
        cameraNode.look(at: SCNVector3(0, 5, 0)) // Calculate the look angle
    } else {
        // How can I calculate the orientation? <-----------
    }

    print(cameraNode.rotation) // Prints: SCNVector4(x: -0.7600127, y: 0.62465125, z: 0.17941462, w: 0.7226559)

    gameScene.rootNode.addChildNode(cameraNode)
}
The orientation of SCNVector4(x: -0.7600127, y: 0.62465125, z: 0.17941462, w: 0.7226559) in degrees is x: -43.5, y: 35.8, z: 10.3, and I don't understand w. (Also, why isn't z = 0? I thought z was the roll...?)
Here are my workings for recreating what I thought the Y-angle should be:
So I worked it out to be 63.4 degrees, but the returned rotation shows that it should be 35.8 degrees. Is there something wrong with my calculations, do I not fully understand SCNVector4, or is there another method to do this?
I looked at Explaining in Detail the ScnVector4 method for what SCNVector4 is, but I still don't really understand what w is for. It says that w is the 'angle of rotation', which is what I thought X, Y & Z were for.
If you have any questions, please ask!
Although rickster has explained the node's rotation properties, I have figured out a method to rotate the node to look at a point using maths (trigonometry).
Here is my code:
// Extension for Float
extension Float {
    /// Convert degrees to radians
    func asRadians() -> Float {
        return self * Float.pi / 180
    }
}
and also:
// Extension for SCNNode
extension SCNNode {
    /// Look at a SCNVector3 point
    func lookAt(_ point: SCNVector3) {
        // Find change in positions
        let changeX = self.position.x - point.x // Change in X position
        let changeY = self.position.y - point.y // Change in Y position
        let changeZ = self.position.z - point.z // Change in Z position

        // Calculate the X and Y angles
        let angleX = atan2(changeZ, changeY) * (changeZ > 0 ? -1 : 1)
        let angleY = atan2(changeZ, changeX)

        // Calculate the X and Y rotations
        let xRot = Float(-90).asRadians() - angleX // X rotation
        let yRot = Float(90).asRadians() - angleY // Y rotation

        self.eulerAngles = SCNVector3(CGFloat(xRot), CGFloat(yRot), 0) // Rotate
    }
}
And you call the function using:
cameraNode.lookAt(SCNVector3(0, 5, 0))
Hope this helps people in the future!
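For what it's worth, another pre-iOS 11 route (not the trigonometry above, just an alternative sketch): build a look-at matrix with GLKit and invert it, since GLKMatrix4MakeLookAt returns a view (world-to-camera) matrix rather than a node transform.

import SceneKit
import GLKit

// Sketch: the camera node's transform is the inverse of the GLKit view matrix.
func lookAtTransform(eye: SCNVector3, target: SCNVector3,
                     up: SCNVector3 = SCNVector3(0, 1, 0)) -> SCNMatrix4 {
    let viewMatrix = GLKMatrix4MakeLookAt(Float(eye.x), Float(eye.y), Float(eye.z),
                                          Float(target.x), Float(target.y), Float(target.z),
                                          Float(up.x), Float(up.y), Float(up.z))
    return SCNMatrix4Invert(SCNMatrix4FromGLKMatrix4(viewMatrix))
}

// Usage:
// cameraNode.transform = lookAtTransform(eye: cameraNode.position, target: SCNVector3(0, 5, 0))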
There are three ways to express a 3D rotation in SceneKit:
What you're doing on paper is calculating separate angles around the x, y, and z axes. These are called Euler angles, or pitch, yaw, and roll. You might get results that more closely resemble your hand calculations if you use eulerAngles or simdEulerAngles instead of rotation. (Or you might not, because one of the difficulties of an Euler-angle system is that you have to apply each of those three rotations in the correct order.)
simdRotation or rotation uses a four-component vector (float4 or SCNVector4) to express an axis-angle representation of the rotation. This relies on a bit of math that isn't obvious for many newcomers to 3D graphics: the result of any sequence of rotations around different axes can be minimally expressed as a single rotation around a new axis.
For example, a rotation of π/2 radians (90°) around the z-axis (0,0,1) followed by a rotation of π/2 around the y-axis (0,1,0) has the same result as a rotation of 2π/3 around the axis (-1/√3, 1/√3, 1/√3). (There's a quick numerical check of this composition further below.)
This is where you're getting confused about the x, y, z, and w components of a SceneKit rotation vector — the first three components are lengths, expressing a 3D vector, and the fourth is a rotation in radians around that vector.
Quaternions are another way to express 3D rotation (and one that's even further off the beaten path for those of us with the formal math education common to undergraduate computer science curricula, but not crazy advanced, either). These have lots of great features for 3D graphics, like being easy to compose and interpolate between. In SceneKit, the simdOrientation or orientation property lets you work with a node's rotation as a quaternion.
Explaining how quaternions work is too much for one SO answer, but the practical upshot is this: if you're working with a good vector math library (like the SIMD library built into iOS 9 and later), you can basically treat them as opaque — just convert from whichever other rotation representation is easiest for you, and reap the benefits.
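To make the axis-angle example and the "treat quaternions as opaque" advice concrete, here is a small check I put together using the simd quaternion type (simd_quatf and simdOrientation need iOS 11+; the multiplication order below is the one that reproduces the axis quoted in the example):

import SceneKit
import simd

let qz = simd_quatf(angle: .pi / 2, axis: SIMD3<Float>(0, 0, 1))   // 90° about z
let qy = simd_quatf(angle: .pi / 2, axis: SIMD3<Float>(0, 1, 0))   // 90° about y

// In simd, (qz * qy) acts as "apply qy, then qz" on a vector. Composing the two
// 90° turns this way gives the single equivalent rotation quoted in the example;
// swapping the order flips the sign of the axis' x component.
let combined = qz * qy
// combined.angle ≈ 2.094 (2π/3 radians)
// combined.axis  ≈ (-0.577, 0.577, 0.577)

// Once you have a quaternion, you can hand it straight to a node, or interpolate:
let node = SCNNode()
node.simdOrientation = combined
let halfway = simd_slerp(simd_quatf(angle: 0, axis: SIMD3<Float>(0, 1, 0)), combined, 0.5)
node.simdOrientation = halfway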
How can I calculate distance from camera to a point on a ground plane from an image?
I have the intrinsic parameters of the camera and the position (height, pitch).
Is there any OpenCV function that can estimate that distance?
You can use undistortPoints to compute the rays backprojecting the pixels, but that API is rather hard to use for your purpose. It may be easier to do the calculation "by hand" in your code. Doing it at least once will also help you understand what exactly that API is doing.
Express your "position (height, pitch)" of the camera as a rotation matrix R and a translation vector t, representing the coordinate transform from the origin of the ground plane to the camera. That is, given a point in ground plane coordinates Pg = [Xg, Yg, Zg], its coordinates in camera frame are given by
Pc = R * Pg + t
The camera center is Cc = [0, 0, 0] in camera coordinates. In ground coordinates it is then:
Cg = inv(R) * (-t) = -R' * t
where inv(R) is the inverse of R, R' is its transpose, and the last equality is due to R being an orthogonal matrix.
Let's assume, for simplicity, that the ground plane is Zg = 0.
Let K be the matrix of intrinsic parameters. Given a pixel q = [u, v], write it in homogeneous image coordinates Q = [u, v, 1]. Its location in camera coordinates is
Qc = Ki * Q
where Ki = inv(K) is the inverse of the intrinsic parameters matrix. The same point in world coordinates is then
Qg = R' * Qc + Cg
All the points Pg = [Xg, Yg, Zg] that belong to the ray from the camera center through that pixel, expressed in ground coordinates, are then on the line
Pg = Cg + lambda * (Qg - Cg)
for lambda going from 0 to positive infinity. This last formula represents three equations in ground XYZ coordinates, and you want to find the values of X, Y, Z and lambda where the ray intersects the ground plane. But that means Zg=0, so you have only 3 unknowns. Solve them (you recover lambda from the 3rd equation, then substitute in the first two), and you get Xg and Yg of the solution to your problem.
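If it helps to see the whole recipe in one place, here is a minimal sketch of the steps above (in Swift with simd rather than OpenCV, and assuming R, t and K are already known; u, v is the pixel):

import simd

/// Sketch: distance from the camera to the ground-plane point seen at pixel (u, v).
/// R, t: ground-to-camera rotation and translation; K: intrinsic parameters matrix.
func groundPointAndDistance(u: Float, v: Float,
                            R: simd_float3x3, t: SIMD3<Float>,
                            K: simd_float3x3) -> (point: SIMD3<Float>, distance: Float)? {
    let Cg = -(R.transpose * t)                 // camera center in ground coordinates
    let Qc = K.inverse * SIMD3<Float>(u, v, 1)  // pixel backprojected into camera coordinates
    let Qg = R.transpose * Qc + Cg              // the same ray point in ground coordinates
    let dir = Qg - Cg                           // ray direction in ground coordinates
    guard abs(dir.z) > 1e-9 else { return nil } // ray parallel to the ground plane
    let lambda = -Cg.z / dir.z                  // solve Zg = 0 along Pg = Cg + lambda * dir
    guard lambda > 0 else { return nil }        // intersection is behind the camera
    let Pg = Cg + lambda * dir
    return (Pg, simd_length(Pg - Cg))
}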
I'm trying to estimate my device position related to a QR code in space. I'm using ARKit and the Vision framework, both introduced in iOS11, but the answer to this question probably doesn't depend on them.
With the Vision framework, I'm able to get the rectangle that bounds a QR code in the camera frame. I'd like to match this rectangle to the device translation and rotation necessary to transform the QR code from a standard position.
For instance if I observe the frame:
* *
B
C
A
D
* *
while if I was 1m away from the QR code, centered on it, and assuming the QR code has a side of 10cm I'd see:
* *
A0 B0
D0 C0
* *
what has been my device transformation between those two frames? I understand that an exact result might not be possible, because maybe the observed QR code is slightly non-planar and we're trying to estimate an affine transform on something that isn't perfectly one.
I guess the sceneView.pointOfView?.camera?.projectionTransform is more helpful than the sceneView.pointOfView?.camera?.projectionTransform?.camera.projectionMatrix, since the latter already takes into account the transform inferred by ARKit, which I'm not interested in for this problem.
How would I fill
func getTransform(
    qrCodeRectangle: VNBarcodeObservation,
    cameraTransform: SCNMatrix4) {
    // qrCodeRectangle.topLeft etc. is the position in [0, 1] * [0, 1] of A0

    // expected real world position of the QR code in a reference coordinate system
    let a0 = SCNVector3(x: -0.05, y: 0.05, z: 1)
    let b0 = SCNVector3(x: 0.05, y: 0.05, z: 1)
    let c0 = SCNVector3(x: 0.05, y: -0.05, z: 1)
    let d0 = SCNVector3(x: -0.05, y: -0.05, z: 1)

    let A0, B0, C0, D0 = ?? // CGPoints representing position in
                            // camera frame for camera in 0, 0, 0 facing Z+

    // then get transform from 0, 0, 0 to current position/rotation that sees
    // a0, b0, c0, d0 through the camera as qrCodeRectangle
}
====Edit====
After trying a number of things, I ended up going for camera pose estimation using OpenCV's projection and perspective solver, solvePnP. This gives me a rotation and translation that should represent the camera pose in the QR code's reference frame. However, when I use those values and place objects corresponding to the inverse transformation, where the QR code should be in camera space, I get inaccurate, shifted values, and I'm not able to get the rotation to work:
// some flavor of pseudo code below
func renderer(_ sender: SCNSceneRenderer, updateAtTime time: TimeInterval) {
    guard let currentFrame = sceneView.session.currentFrame, let pov = sceneView.pointOfView else { return }
    let intrisics = currentFrame.camera.intrinsics
    let QRCornerCoordinatesInQRRef = [(-0.05, -0.05, 0), (0.05, -0.05, 0), (-0.05, 0.05, 0), (0.05, 0.05, 0)]

    // uses VNDetectBarcodesRequest to find a QR code and returns a bounding rectangle
    guard let qr = findQRCode(in: currentFrame) else { return }

    let imageSize = CGSize(
        width: CVPixelBufferGetWidth(currentFrame.capturedImage),
        height: CVPixelBufferGetHeight(currentFrame.capturedImage)
    )

    let observations = [
        qr.bottomLeft,
        qr.bottomRight,
        qr.topLeft,
        qr.topRight,
    ].map({ (imageSize.height * (1 - $0.y), imageSize.width * $0.x) })
    // image and SceneKit coordinates are not the same
    // replacing this by:
    // (imageSize.height * (1.35 - $0.y), imageSize.width * ($0.x - 0.2))
    // weirdly fixes an issue, see below

    let rotation, translation = openCV.solvePnP(QRCornerCoordinatesInQRRef, observations, intrisics)
    // calls openCV solvePnP and gets the results

    let positionInCameraRef = -rotation.inverted * translation
    let node = SCNNode(geometry: someGeometry)
    pov.addChildNode(node)
    node.position = translation
    node.orientation = rotation.asQuaternion
}
Here is the output:
where A, B, C, D are the QR code corners in the order they are passed to the program.
The predicted origin stays in place when the phone rotates, but it's shifted from where it should be. Surprisingly, if I shift the observation values, I'm able to correct this:
// (imageSize.height * (1 - $0.y), imageSize.width * $0.x)
// replaced by:
(imageSize.height * (1.35 - $0.y), imageSize.width * ($0.x - 0.2))
and now the predicted origin stays robustly in place. However I don't understand where the shift values come from.
Finally, I've tried to get an orientation fixed relatively to the QR code referential:
var n = SCNNode(geometry: redGeometry)
node.addChildNode(n)
n.position = SCNVector3(0.1, 0, 0)
n = SCNNode(geometry: blueGeometry)
node.addChildNode(n)
n.position = SCNVector3(0, 0.1, 0)
n = SCNNode(geometry: greenGeometry)
node.addChildNode(n)
n.position = SCNVector3(0, 0, 0.1)
The orientation is fine when I look at the QR code straight, but then it shifts by something that seems to be related to the phone rotation:
Outstanding questions I have are:
How do I solve the rotation?
where do the position shift values come from?
What simple relationship do rotation, translation, QRCornerCoordinatesInQRRef, observations, and intrisics verify? Is it O ~ K^-1 * (R_3x2 | T) Q? Because if so, that's off by a few orders of magnitude.
If that's helpful, here are a few numerical values:
Intrinsics matrix
Mat 3x3
1090.318, 0.000, 618.661
0.000, 1090.318, 359.616
0.000, 0.000, 1.000
imageSize
1280.0, 720.0
screenSize
414.0, 736.0
==== Edit2 ====
I've noticed that the rotation works fine when the phone stays horizontally parallel to the QR code (i.e. the rotation matrix is [[a, 0, b], [0, 1, 0], [c, 0, d]]), no matter what the actual QR code orientation is:
Other rotations don't work.
Coordinate systems' correspondence
Take into consideration that Vision/CoreML coordinate system doesn't correspond to ARKit/SceneKit coordinate system. For details look at this post.
Rotation's direction
I suppose the problem is not in the matrix; it's in the vertex placement. For tracking 2D images you need to place the ABCD vertices counter-clockwise (the starting point being the A vertex, located at the imaginary origin x: 0, y: 0). I think the Apple documentation on the VNRectangleObservation class (info about projected rectangular regions detected by an image analysis request) is vague about this. You placed your vertices in the same order as in the official documentation:
var bottomLeft: CGPoint
var bottomRight: CGPoint
var topLeft: CGPoint
var topRight: CGPoint
But they need to be placed in the same order in which a positive rotation direction (about the Z axis) occurs in a Cartesian coordinate system:
World Coordinate Space in ARKit (as well as in SceneKit and Vision) always follows a right-handed convention (the positive Y axis points upward, the positive Z axis points toward the viewer and the positive X axis points toward the viewer's right), but is oriented based on your session's configuration. Camera works in Local Coordinate Space.
Rotation direction about any axis is positive (Counter-Clockwise) and negative (Clockwise). For tracking in ARKit and Vision it's critically important.
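To make the ordering concrete, here is a small sketch (my own helper, not from the question or the documentation) that returns the four Vision corners counter-clockwise, starting from the bottom-left corner treated as A, and converts Vision's normalized, bottom-left-origin coordinates to top-left-origin pixel coordinates:

import Vision
import CoreGraphics

func orderedPixelCorners(of observation: VNRectangleObservation,
                         imageSize: CGSize) -> [CGPoint] {
    // bottomLeft -> bottomRight -> topRight -> topLeft runs counter-clockwise in a Y-up frame
    return [observation.bottomLeft, observation.bottomRight,
            observation.topRight, observation.topLeft]
        .map { CGPoint(x: $0.x * imageSize.width,
                       y: (1 - $0.y) * imageSize.height) }  // flip Y for image pixel coordinates
}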
The order of rotation also matters. ARKit, as well as SceneKit, applies rotation relative to the node's pivot property in the reverse order of the components: first roll (about the Z axis), then yaw (about the Y axis), then pitch (about the X axis). So the rotation order is ZYX.
Math (trigonometry). Notes on the diagram: the bottom side is l (the QR code length), the left angle is k, and the top angle is i (the camera).
I've studied the pARk example project (http://developer.apple.com/library/IOS/#samplecode/pARk/Introduction/Intro.html#//apple_ref/doc/uid/DTS40011083) so I can apply some of its fundamentals in an app I'm working on. I understand nearly everything, except:
The way it calculates whether a point of interest should appear or not. It gets the attitude, multiplies it with the projection matrix (to get the rotation in GL coordinates?), then multiplies that matrix with the coordinates of the point of interest and, at last, looks at the last coordinate of that vector to find out whether the point of interest should be shown. What are the mathematical fundamentals of this?
Thanks a lot!!
I assume you are referring to the following method:
- (void)drawRect:(CGRect)rect
{
    if (placesOfInterestCoordinates == nil) {
        return;
    }

    mat4f_t projectionCameraTransform;
    multiplyMatrixAndMatrix(projectionCameraTransform, projectionTransform, cameraTransform);

    int i = 0;
    for (PlaceOfInterest *poi in [placesOfInterest objectEnumerator]) {
        vec4f_t v;
        multiplyMatrixAndVector(v, projectionCameraTransform, placesOfInterestCoordinates[i]);

        float x = (v[0] / v[3] + 1.0f) * 0.5f;
        float y = (v[1] / v[3] + 1.0f) * 0.5f;

        if (v[2] < 0.0f) {
            poi.view.center = CGPointMake(x*self.bounds.size.width, self.bounds.size.height-y*self.bounds.size.height);
            poi.view.hidden = NO;
        } else {
            poi.view.hidden = YES;
        }

        i++;
    }
}
This is performing an OpenGL like vertex transformation on the places of interest to check if they are in a viewable frustum. The frustum is created in the following line:
createProjectionMatrix(projectionTransform, 60.0f*DEGREES_TO_RADIANS, self.bounds.size.width*1.0f / self.bounds.size.height, 0.25f, 1000.0f);
This sets up a frustum with a 60 degree field of view, a near clipping plane of 0.25 and a far clipping plane of 1000. Any point of interest that is further away than 1000 units will then not be visible.
So, to step through the code, first the projection matrix that sets up the frustum, and the camera view matrix, which simply rotates the object so it is the right way up relative to the camera, are multiplied together. Then, for each place of interest, its location is multiplied by the viewProjection matrix. This will project the location of the place of interest into the view frustum, applying rotation and perspective.
The next two lines then convert the transformed location of the place into what's known as normalized device coordinates. The 4-component vector needs to be collapsed to 3-dimensional space; this is achieved by projecting it onto the plane w == 1, by dividing the vector by its w component, v[3]. It is then possible to determine whether the point lies within the projection frustum by checking if its coordinates lie in the cube with side length 2 centered at the origin [0, 0, 0]. In this case, the x and y coordinates are being biased from the range [-1, 1] to [0, 1] to match up with the UIKit coordinate system, by adding 1 and dividing by 2.
Next, the v[2] component, z, is checked to see if it is greater than 0. This is actually incorrect, as it has not been biased; it should be checked to see if it is greater than -1. This will detect whether the place of interest is in the first half of the projection frustum; if it is, the object is deemed visible and displayed.
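Putting those last two steps together, here is a small standalone sketch of the perspective divide and the frustum containment test (in Swift with simd, not the sample's own Objective-C):

import simd

// Sketch: convert a clip-space position to normalized device coordinates and test
// whether it lies inside the frustum. Assumes clip.w > 0 (point in front of the camera).
func frustumTest(_ clip: SIMD4<Float>) -> (visible: Bool, ndc: SIMD3<Float>) {
    let ndc = SIMD3<Float>(clip.x, clip.y, clip.z) / clip.w   // perspective divide onto w == 1
    let visible = (-1...1).contains(ndc.x)
        && (-1...1).contains(ndc.y)
        && (-1...1).contains(ndc.z)
    return (visible, ndc)
}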
If you are unfamiliar with vertex projection and coordinate systems, this is a huge topic with a fairly steep learning curve. There is however a lot of material online covering it, here are a couple of links to get you started:
http://www.falloutsoftware.com/tutorials/gl/gl0.htm
http://www.opengl.org/wiki/Vertex_Transformation
Good luck!