I am trying to estimate the 3D position of a world point from two frames. The frames are captured with the same camera from different positions. The problem is that the estimate is wrong.
I have
Camera Intrinsic parameters
K = [4708.29296875, 0, 1218.51806640625;
0, 4708.8935546875, 1050.080322265625;
0, 0, 1]
Translation and Rotation data:
Frame X-Coord Y-Coord Z-Coord(altitude) Pitch Roll Yaw
1 353141.23 482097.85 38.678 0.042652439 1.172694124 16.72142499
2 353141.82 482099.69 38.684 0.097542931 1.143224387 16.79931141
Note: the GPS data uses a Cartesian coordinate system; the X, Y, Z coordinates are in metres, based on the British National Grid.
To get the rotation matrix I used
https://stackoverflow.com/a/56666686/16432598 which is based on http://www.tobias-weis.de/triangulate-3d-points-from-3d-imagepoints-from-a-moving-camera/.
Using the above data I calculate the extrinsic parameters and the projection matrices as follows.
Rt0 = [-0.5284449976982357, 0.308213375891041, -0.7910438668806931, 353141.21875;
-0.8478960766271159, -0.2384055118949635, 0.4735346398506075, 482097.84375;
-0.04263950806535898, 0.9209600028339713, 0.3873171123665929, 38.67800140380859]
Rt1 = [-0.4590975294881605, 0.3270290779984009, -0.8260032933114635, 353141.8125;
-0.8830316937622665, -0.2699087096524321, 0.3839326975722462, 482099.6875;
-0.097388326965866, 0.905649640091175, 0.4126914624432091, 38.68399810791016]
P = K * Rt;
P1 = [-2540.030877954028, 2573.365272473235, -3252.513377560185, 1662739447.059914;
-4037.427278644764, -155.5442017945203, 2636.538291686695, 2270188044.171295;
-0.04263950806535898, 0.9209600028339713, 0.3873171123665929, 38.67800140380859]
P2 = [-2280.235105924588, 2643.299156802081, -3386.193495224041, 1662742249.915956;
-4260.36781710715, -319.9665173096691, 2241.257388910372, 2270196732.490808;
-0.097388326965866, 0.905649640091175, 0.4126914624432091, 38.68399810791016]
triangulatePoints(P1, P2, points2d_1, points2d_2, points4d); // OpenCV's cv::triangulatePoints; points4d is returned in homogeneous coordinates
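For reference, the relation I am assuming here is the standard pinhole model, i.e. triangulation should recover the world point $\mathbf{X}$ that projects to both picked image points:

$$\mathbf{x}_i \simeq P_i\,\mathbf{X} = K\,[R_i \mid t_i]\,\mathbf{X}, \qquad i = 1, 2, \qquad \mathbf{X} = (X, Y, Z, 1)^\top$$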
Now, I pick the same point in both images for triangulation:
p2d_1 = (205, 806) and p2d_2 = (116, 813)
For the 3D position of this particular point I expect something like;
[353143.7, 482130.3, 40.80]
whereas I calculate
[549845.5109014747, -417294.6070425579, -201805.410744677]
I know that my intrinsic parameters and GPS data are very accurate.
Can anybody tell me what is missing or what I am doing wrong here?
Thanks
I am running a face tracking configuration in ARKit with SceneKit. In each frame I can access the camera feed via the snapshot property or the capturedImage buffer. I have also been able to map each face vertex to the image coordinate space and add some UIView helpers (1-point squares) to display all the face vertices on the screen in real time, like this:
func renderer(_ renderer: SCNSceneRenderer, didUpdate node: SCNNode, for anchor: ARAnchor) {
    guard let faceGeometry = node.geometry as? ARSCNFaceGeometry,
          let anchorFace = anchor as? ARFaceAnchor,
          anchorFace.isTracked
    else { return }
    let vertices = anchorFace.geometry.vertices
    for (index, vertex) in vertices.enumerated() {
        // Convert the vertex to world space, then project it into screen coordinates.
        let projected = sceneView.projectPoint(node.convertPosition(SCNVector3(vertex), to: nil))
        let newPosition = CGPoint(x: CGFloat(projected.x), y: CGFloat(projected.y))
        // Here I update the position of the UIView at `index` with the newly projected position.
        // I have an array of views that matches the vertex count, which is consistent across sessions.
    }
}
Since the UV coordinates are also constant across sessions, I am trying to find, for each pixel that lies over the face mesh, its corresponding position in the UV texture, so that after some iterations I can write a person's face texture to a file.
I have come up with some theoretical solutions, like creating a CGPath for each triangle and asking, for each pixel, whether it is contained in that triangle. If it is, I would create a triangular image by cropping a rectangle from the frame and applying a triangle mask built from the triangle's vertices projected into image coordinates. That triangular image then has to be mapped onto the corresponding triangle's transform in the texture (skewed into place), and finally each piece would be added as a UIImageView subview of a 1024x1024 UIView, which is encoded to PNG. This sounds like a lot of work, particularly the part of matching the cropped triangle to the corresponding triangle in the UV texture.
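A rough sketch of the cropping/masking part I have in mind (the function name is mine, and the three points are assumed to already be in the snapshot's coordinate space):

import UIKit

// Crops `snapshot` to the bounding box of one projected face-mesh triangle and
// masks everything outside the triangle. Purely illustrative.
func triangleImage(from points: [CGPoint], in snapshot: UIImage) -> UIImage? {
    guard points.count == 3 else { return nil }
    // Build the triangle path and its bounding box.
    let path = CGMutablePath()
    path.move(to: points[0])
    path.addLine(to: points[1])
    path.addLine(to: points[2])
    path.closeSubpath()
    let box = path.boundingBox
    let renderer = UIGraphicsImageRenderer(size: box.size)
    return renderer.image { context in
        // Shift so the bounding box starts at the origin, then clip to the triangle.
        context.cgContext.translateBy(x: -box.origin.x, y: -box.origin.y)
        context.cgContext.addPath(path)
        context.cgContext.clip()
        // Draw the full snapshot; only the triangular region survives the clip.
        snapshot.draw(at: .zero)
    }
}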
In the Apple demo project there is an image that shows what that UV texture looks like; if you edit this image and add some colors, they then show up on the face. I need the opposite: from what the camera feed is seeing, create a texture of your face. The same demo project has an example that does exactly what I need, but with a shader, and with no clues on how to extract the texture to a file. The shader code looks like this:
/*
<samplecode>
<abstract>
SceneKit shader (geometry) modifier for texture mapping ARKit camera video onto the face.
</abstract>
</samplecode>
*/
#pragma arguments
float4x4 displayTransform // from ARFrame.displayTransform(for:viewportSize:)
#pragma body
// Transform the vertex to the camera coordinate system.
float4 vertexCamera = scn_node.modelViewTransform * _geometry.position;
// Camera projection and perspective divide to get normalized viewport coordinates (clip space).
float4 vertexClipSpace = scn_frame.projectionTransform * vertexCamera;
vertexClipSpace /= vertexClipSpace.w;
// XY in clip space is [-1,1]x[-1,1], so adjust to UV texture coordinates: [0,1]x[0,1].
// Image coordinates are Y-flipped (upper-left origin).
float4 vertexImageSpace = float4(vertexClipSpace.xy * 0.5 + 0.5, 0.0, 1.0);
vertexImageSpace.y = 1.0 - vertexImageSpace.y;
// Apply ARKit's display transform (device orientation * front-facing camera flip).
float4 transformedVertex = displayTransform * vertexImageSpace;
// Output as texture coordinates for use in later rendering stages.
_geometry.texcoords[0] = transformedVertex.xy;
/**
* MARK: Post-process special effects
*/
Honestly I do not have much experience with shaders, so any help translating the shader into more Cocoa Touch Swift code would be appreciated. For now I am not worried about performance, so it is fine if it has to be done on the CPU, in a background thread or offline. In any case I will have to choose the right frames to avoid skewed samples, prefer triangles with good information over ones with barely a few stretched pixels (for example by checking whether the triangle's normal points toward the camera before sampling it), and perhaps add some UI helpers to make the user turn their face so all of it gets sampled correctly.
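For reference, my current reading of what the shader does per vertex, written as a rough CPU-side Swift sketch (everything below is my own guess at an equivalent: it uses ARCamera's matrices instead of scn_frame's, hard-codes portrait orientation, and the function name is made up):

import ARKit
import SceneKit

// Per vertex: project into clip space, map to normalized image coordinates
// (Y-flipped), then apply ARKit's display transform, mirroring the shader above.
func textureCoordinate(for vertex: simd_float3,
                       faceNode: SCNNode,
                       frame: ARFrame,
                       viewportSize: CGSize) -> CGPoint {
    // World position of the vertex (the shader works in camera space via modelViewTransform).
    let world = faceNode.simdConvertPosition(vertex, to: nil)
    let camera = frame.camera
    let view = camera.viewMatrix(for: .portrait)
    let projection = camera.projectionMatrix(for: .portrait,
                                             viewportSize: viewportSize,
                                             zNear: 0.001, zFar: 1000)
    // Clip space and perspective divide.
    var clip = projection * view * simd_float4(world.x, world.y, world.z, 1)
    clip /= clip.w
    // [-1,1] -> [0,1], with the Y flip for image coordinates.
    var uv = CGPoint(x: CGFloat(clip.x) * 0.5 + 0.5,
                     y: 1.0 - (CGFloat(clip.y) * 0.5 + 0.5))
    // ARKit's display transform (device orientation + front camera mirroring).
    uv = uv.applying(frame.displayTransform(for: .portrait, viewportSize: viewportSize))
    return uv
}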
I have already checked this post and this post but cannot get it to work.
This app does exactly what I need, but it does not seem to use ARKit.
Thanks.
I’m using ARKit with SceneKit. When user presses a button I create an anchor and to the SCNNode corresponding to it I add a 3D object (loaded from a .scn file in the project).
The 3D object is placed facing the camera, with the same orientation the camera has. I would like to make it look like the object is lying flat on a surface rather than inclined. So, if I got it right, I'd need to apply a rotation so that its rotation around the X and Z axes becomes 0.
My attempt at this is: take the node's X and Z eulerAngles, invert them, and rotate that amount around each axis:
let rotationTransformZ = rotationMatrixAroundZ(radians: -node.eulerAngles.z)
let rotationTransformX = rotationMatrixAroundX(radians: -node.eulerAngles.x)
let rotationTransform = simd_mul(rotationTransformX, rotationTransformZ)
node.transform = SCNMatrix4(simd_mul(simd_float4x4(node.transform), rotationTransform))
This works all right in most cases, but in some the object ends up rotated in completely strange ways. Should I be setting the rotation angle to anything other than just the inverse of the current Euler angle? Setting the angles to 0 directly did not work at all.
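(For reference, the rotationMatrixAroundX / rotationMatrixAroundZ helpers above are just the standard axis rotations, roughly:)

import simd

// Sketch of the helpers used above: plain axis-angle rotations.
func rotationMatrixAroundX(radians: Float) -> simd_float4x4 {
    return simd_float4x4(simd_quatf(angle: radians, axis: simd_float3(1, 0, 0)))
}
func rotationMatrixAroundZ(radians: Float) -> simd_float4x4 {
    return simd_float4x4(simd_quatf(angle: radians, axis: simd_float3(0, 0, 1)))
}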
I've come across this and figured out I was running into gimbal lock. The solution was to rotate the node around one axis, parent it to another SCNNode(), then rotate the parent around the other axis. Hope that helps.
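A minimal sketch of that idea (all names below are mine, not from the question):

import SceneKit

// Give each correction its own node, so the two rotations never share one Euler-angle chain.
func levelOut(_ objectNode: SCNNode, in scene: SCNScene) {
    let pitch = objectNode.eulerAngles.x
    let roll = objectNode.eulerAngles.z

    let pitchNode = SCNNode()   // will carry the X correction
    let rollNode = SCNNode()    // will carry the Z correction
    pitchNode.position = objectNode.position
    (objectNode.parent ?? scene.rootNode).addChildNode(pitchNode)
    objectNode.removeFromParentNode()
    objectNode.position = SCNVector3Zero
    pitchNode.addChildNode(rollNode)
    rollNode.addChildNode(objectNode)

    pitchNode.eulerAngles.x = -pitch    // counter-rotate around X on one node
    rollNode.eulerAngles.z = -roll      // counter-rotate around Z on the other
}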
You don't have to do the node transform with a matrix; you can simply rotate around a specific axis, and that might be a bit simpler in terms of the logic of doing the rotation.
You could do something like:
node.runAction(SCNAction.rotateBy(x: x, y: y, z: z, duration: 0.0))
Not sure if this is the kind of thing you're looking for, but it is simpler than doing the rotation with SCNMatrix4.
Well, I managed a workaround, but I'm not truly happy with it, so I'll leave the question unanswered. Basically I define a threshold of 2 degrees and keep applying those rotations until both Euler Angles around X and Z are below the aforementioned threshold.
func layDownNode(_ node: SCNNode) {
    let maxErrDegrees: Float = 2.0
    let maxErrRadians = GLKMathDegreesToRadians(maxErrDegrees)
    while (abs(node.eulerAngles.x) > maxErrRadians || abs(node.eulerAngles.z) > maxErrRadians) {
        let rotationZ = -node.eulerAngles.z
        let rotationX = -node.eulerAngles.x
        let rotationTransformZ = rotationMatrixAroundZ(radians: rotationZ)
        let rotationTransformX = rotationMatrixAroundX(radians: rotationX)
        let rotationTransform = simd_mul(rotationTransformX, rotationTransformZ)
        node.transform = SCNMatrix4(simd_mul(simd_float4x4(node.transform), rotationTransform))
    }
}
I'm trying to achieve Augmented Reality with SceneKit.
I got an intrinsic camera matrix and an extrinsic matrix by estimating the pose of a marker, using ArUco (an OpenCV-based augmented reality library).
And I set up the SCNCamera's projectionTransform with parameters derived from the intrinsic matrix (fovy, aspect, zNear, zFar).
Normally in OpenGL, world coordinates relative to the camera are expressed with a ModelView matrix, but in SceneKit there is no such thing as a modelView matrix.
So I calculated the inverse of the extrinsic matrix to get the camera pose relative to the world (marker) coordinate system.
I think I have the correct camera position from that inverse matrix, which contains the rotation and translation.
However, I cannot get the camera's rotation from it.
Do you have any ideas?
SceneKit has the same view matrices that you've come across in OpenGL; they're just a little hidden until you start toying with shaders. A little too hidden, IMO.
You seem to have most of this figured out. The projection matrix comes from your camera's projectionTransform, and the view matrix comes from the inverse of your camera node's transform, SCNMatrix4Invert(cameraNode.transform). In my case everything was in world coordinates, making my model matrix a simple identity matrix.
The code I ended up using to get the classic model-view-projection matrix was something like...
let projection = camera.projectionTransform
let view = SCNMatrix4Invert(cameraNode.transform)
let model = SCNMatrix4Identity
let viewProjection = SCNMatrix4Mult(view, projection)
let modelViewProjection = SCNMatrix4Mult(model, viewProjection)
For some reason I found SCNMatrix4Mult(...) took arguments in a different order than I was expecting (e.g. opposite to GLKMatrix4Multiply(...)).
I'm still not 100% on this, so would welcome edits/tips. Using this method I was unable to get the SceneKit MVP matrix (as passed to shader) to match up with that calculated by the code above... but it was close enough for what I needed.
@lock's answer looks good, with a couple of additions:
(1) access the SCNNode's worldTransform instead of transform, in case the cameraNode is animated or parented:
let view = SCNMatrix4Invert(cameraNode.presentationNode.worldTransform)
(2) the code doesn't account for the view's aspect ratio. E.g., assuming a perspective projection, you'll want to do:
perspMatrix.m11 /= viewportAR  // if the matrix was built from a vertical (y) fov, correct the X scale
/* or, */
perspMatrix.m22 *= viewportAR  // if it was built from a horizontal (x) fov, correct the Y scale
where viewportAR = viewport.width / viewport.height.
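Putting (1) and (2) together, a rough sketch using the same camera/cameraNode/viewport names as above (iOS types assumed, and assuming the projection was built from a vertical fov):

// View from the presentation node, projection corrected for the viewport aspect ratio.
let view = SCNMatrix4Invert(cameraNode.presentationNode.worldTransform)
var perspMatrix = camera.projectionTransform
perspMatrix.m11 /= Float(viewport.width / viewport.height)   // vertical-fov case
let viewProjection = SCNMatrix4Mult(view, perspMatrix)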
Another way to do it is to have one node with a renderer delegate in the scene, and retrieve SceneKit's matrices from that delegate (they are passed in the arguments dictionary):
FOUNDATION_EXTERN NSString * const SCNModelTransform;
FOUNDATION_EXTERN NSString * const SCNViewTransform;
FOUNDATION_EXTERN NSString * const SCNProjectionTransform;
FOUNDATION_EXTERN NSString * const SCNNormalTransform;
FOUNDATION_EXTERN NSString * const SCNModelViewTransform;
FOUNDATION_EXTERN NSString * const SCNModelViewProjectionTransform;
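A rough Swift sketch of that approach (the delegate class is mine; I am assuming the keys are imported into Swift under the same names and that the values arrive as NSValue-wrapped SCNMatrix4s):

import SceneKit

// A throwaway node whose renderer delegate only exists to capture the
// matrices SceneKit passes in the arguments dictionary.
class MatrixCapturingDelegate: NSObject, SCNNodeRendererDelegate {
    var modelViewProjection = SCNMatrix4Identity

    func renderNode(_ node: SCNNode, renderer: SCNRenderer, arguments: [String: Any]) {
        // Assumed: values are NSValue-wrapped SCNMatrix4s keyed by the constants above.
        if let value = arguments[SCNModelViewProjectionTransform] as? NSValue {
            modelViewProjection = value.scnMatrix4Value
        }
    }
}

// Usage sketch: attach to any node in the scene (rendererDelegate is weak,
// so keep a strong reference to the delegate elsewhere).
let capturingDelegate = MatrixCapturingDelegate()
let probeNode = SCNNode()
probeNode.rendererDelegate = capturingDelegate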
I am trying to applyTorque to a node in my scene. The documentation states:
Each component of the torque vector relates to rotation about the corresponding axis in the local coordinate system of the SCNNode object containing the physics body. For example, applying a torque of {0.0, 0.0, 1.0} causes a node to spin counterclockwise around its z-axis.
However, in my tests it seems that physics animations do not affect the actual position of the object. The axes therefore remain static (even though the node itself obviously moves), so the torque is always applied from the same direction (wherever the z-axis was when the scene was initiated).
I would like to apply torque so that it is always constant in relation to the object (e.g. to make the node spin counterclockwise around the z-axis of its presentationNode, not around the orientation the node had when the scene was initiated).
SceneKit uses two versions of each node: the model node defines static behavior, and the presentation node is what's actually involved in dynamic behavior and shown on screen. This division mirrors the one used in Core Animation, and enables features like implicit animation (where you can do things like set node.position and have it animate to the new value, without other parts of your code that query node.position having to worry about intermediate values during the animation).
Physics operates on the presentation node, but in some cases, like this one, it takes input in scene space.
However, the only difference between the presentation node and the scene is in terms of coordinate spaces, so all you need to do is convert your vector from presentation space to scene space. (The root node of the scene shouldn't be getting transformed by physics, actions, or inflight animations, so there's no practical difference between model-scene space and presentation-scene space.) To do that, use one of the coordinate conversion methods SceneKit provides, such as convertPosition:fromNode:.
Here's a Swift playground that illustrates your dilemma:
import Cocoa
import SceneKit
import XCPlayground
// Set up a scene for our tests
let scene = SCNScene()
let view = SCNView(frame: NSRect(x: 0, y: 0, width: 500, height: 500))
view.autoenablesDefaultLighting = true
view.scene = scene
let cameraNode = SCNNode()
cameraNode.camera = SCNCamera()
cameraNode.position = SCNVector3(x: 0, y: 0, z: 5)
scene.rootNode.addChildNode(cameraNode)
XCPShowView("view", view)
// Make a pyramid to test on
let node = SCNNode(geometry: SCNPyramid(width: 1, height: 1, length: 1))
scene.rootNode.addChildNode(node)
node.physicsBody = SCNPhysicsBody.dynamicBody()
scene.physicsWorld.gravity = SCNVector3Zero // Don't fall off screen
// Rotate around the axis that looks into the screen
node.physicsBody?.applyTorque(SCNVector4(x: 0, y: 0, z: 1, w: 0.1), impulse: true)
// Wait a bit, then try to rotate around the y-axis
node.runAction(SCNAction.waitForDuration(10), completionHandler: {
var axis = SCNVector3(x: 0, y: 1, z: 0)
node.physicsBody?.applyTorque(SCNVector4(x: axis.x, y: axis.y, z: axis.z, w: 1), impulse: true)
})
The second rotation effectively spins the pyramid around the screen's y-axis, not the pyramid's y-axis -- the one that goes through the apex of the pyramid. As you noted, it's spinning around what was the pyramid's y-axis as of before the first rotation; i.e. the y-axis of the scene (which is unaffected by physics), not that of the presentation node (that was rotated through physics).
To fix it, insert the following line (after the one that starts with var axis):
axis = scene.rootNode.convertPosition(axis, fromNode: node.presentationNode())
The call to convertPosition:fromNode: says "give me a vector in scene coordinate space that's equivalent to this one in presentation-node space". When you apply a torque around the converted axis, it effectively converts back to the presentation node's space to simulate physics, so you see it spin around the axis you want.
Update: Had some coordinate spaces wrong, but the end result is pretty much the same.
Unfortunately the solution provided by rickster does not work for me :(
Trying to solve this conundrum I have created (what I believe to be) a very sub-standard solution (more a proof of concept). It involves creating (null) objects on the axes I am trying to find, then using their positions to find the vector aligned with those axes.
As I have a fairly complex scene, I am loading it from a COLLADA file. Within that file I have modelled a simple coordinate tripod: three orthogonal cylinders with cones on top (this makes it easier to visualise what is going on).
I then constrain this tripod object to the object I am trying to apply torque to. This way I have objects that allow me to retrieve two points on the axes of the presentationNode of the object I am trying to apply torque to. I can then use these two points to determine the vector to apply the torque from.
// calculate orientation vector in the most unimaginative way possible
// retrieve axis tripod objects. We will be using these as guide objects.
// The tripod is constructed as a cylinder called "Xaxis" with a cone at the top.
// All loaded from an external COLLADA file.
SCNNode *XaxisRoot = [scene.rootNode childNodeWithName:@"XAxis" recursively:YES];
SCNNode *XaxisTip = [XaxisRoot childNodeWithName:@"Cone" recursively:NO];
// To devise the vector we will need two points. One is the root of our tripod,
// the other is at the tip. First, we get their positions. As they are constrained
// to the _rotatingNode, presentationNode.position is always the same as .position,
// because presentationNode returns the position in relation to the parent node.
SCNVector3 XaxisRootPos = XaxisRoot.position;
SCNVector3 XaxisTipPos = XaxisTip.position;
// We then convert these two points into _rotatingNode coordinate space. This is
// the coordinate space applyTorque seems to be using.
XaxisRootPos = [_rotatingNode convertPosition:XaxisRootPos fromNode:_rotatingNode.presentationNode];
XaxisTipPos = [_rotatingNode convertPosition:XaxisTipPos fromNode:_rotatingNode.presentationNode];
// Now, we have two *points* in _rotatingNode coordinate space. One is at the center
// of our _rotatingNode, the other is somewhere along its X axis. Subtracting them
// will give us the *vector* aligned to the x axis of our _rotatingNode
GLKVector3 rawXRotationAxes = GLKVector3Subtract(SCNVector3ToGLKVector3(XaxisRootPos), SCNVector3ToGLKVector3(XaxisTipPos));
// we now normalise this vector
GLKVector3 normalisedXRotationAxis = GLKVector3Normalize(rawXRotationAxes);
// finally we are able to apply torque reliably
[_rotatingNode.physicsBody applyTorque:SCNVector4Make(normalisedXRotationAxis.x, normalisedXRotationAxis.y, normalisedXRotationAxis.z, 500) impulse:YES];
As you can probably see, I am quite inexperienced in SceneKit, but even I can see that a much easier/optimised solution must exist; I just have not been able to find it :(
I recently had this same problem: how to convert a torque from the local space of the object to the world space required by the applyTorque method. The problem with using the node's convertPosition:toNode: and convertPosition:fromNode: methods is that they also apply the node's translation to the torque, so this will only work when the node is at (0, 0, 0). What these methods do is treat the SCNVector3 as if it were a vec4 with a w component of 1.0. We just want to apply the rotation; in other words, we want the w component of the vec4 to be 0. Unlike SceneKit, GLKit gives us two options for how we want our vec3s to be multiplied:
GLKMatrix4MultiplyVector3, where
The input vector is treated as if it were a 4-component vector with a w-component of 0.0.
and GLKMatrix4MultiplyVector3WithTranslation, where
The input vector is treated as if it were a 4-component vector with a w-component of 1.0.
What we want here is the former, just the rotation, not the translation.
So, we could round-trip through GLKit. Converting, for instance, the local x-axis (1,0,0), e.g. a pitch rotation, to the global axis needed for applyTorque would look like this:
let local = GLKMatrix4MultiplyVector3(SCNMatrix4ToGLKMatrix4(node.presentationNode.worldTransform), GLKVector3(v: (1,0,0)))
node.physicsBody?.applyTorque(SCNVector4(local.x, local.y, local.z, 10), impulse: false)
However, a more Swiftian approach would be to add a * operator for mat4 * vec3 which treats the vec3 like a vec4 with a 0.0 w component. Like this:
func * (left: SCNMatrix4, right: SCNVector3) -> SCNVector3 { //multiply mat4 by vec3 as if w is 0.0
return SCNVector3(
left.m11 * right.x + left.m21 * right.y + left.m31 * right.z,
left.m12 * right.x + left.m22 * right.y + left.m32 * right.z,
left.m13 * right.x + left.m23 * right.y + left.m33 * right.z
)
}
Although this operator makes an assumption about how we want our vec3s to be multiplied, my reasoning here is that as the convertPosition methods already treat w as 1, it would be redundant to have a * operator that also did this.
You could also add a mat4 * SCNVector4 operator that would let the user explicitly choose whether they want w to be 0 or 1.
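Such an operator might look like this, following the same row-vector layout as the vec3 version above:

func * (left: SCNMatrix4, right: SCNVector4) -> SCNVector4 { // caller chooses w: 0 for directions, 1 for positions
    return SCNVector4(
        left.m11 * right.x + left.m21 * right.y + left.m31 * right.z + left.m41 * right.w,
        left.m12 * right.x + left.m22 * right.y + left.m32 * right.z + left.m42 * right.w,
        left.m13 * right.x + left.m23 * right.y + left.m33 * right.z + left.m43 * right.w,
        left.m14 * right.x + left.m24 * right.y + left.m34 * right.z + left.m44 * right.w
    )
}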
So, instead of having to roundtrip from SceneKit to GLKit, we can just write:
let local = node.presentationNode.worldTransform * SCNVector3(1,0,0)
node.physicsBody?.applyTorque(SCNVector4(local.x, local.y, local.z, 10), impulse: false)
You can use this method to apply rotation on multiple axes with one applyTorque call. Say you have stick input where you want x on the stick to be yaw (the local y-up axis) and y on the stick to be pitch (the local x-axis), but with flight-sim style controls where pulling back pitches up; then you could set the local axis to SCNVector3(input.y, -input.x, 0).
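For instance (the stick values below are made up, and node and the * operator are the same as above):

// Combine stick input into one local axis, convert to world space, apply as a single torque.
let input = (x: Float(0.25), y: Float(-0.8))
let localAxis = SCNVector3(input.y, -input.x, 0)
let worldAxis = node.presentationNode.worldTransform * localAxis
node.physicsBody?.applyTorque(SCNVector4(worldAxis.x, worldAxis.y, worldAxis.z, 10), impulse: false)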