Understanding 3D object transform property in ARKit

Let's say I add a 3D model such as a dog as a child node to my scene's root node in viewDidLoad. I printed out the dog node's transform and worldTransform properties, both of which are just 4x4 identity matrices.
After rotating, scaling, and positioning, I re-printed the transform and worldTransform properties. I could not understand how to read them. Which column refers to position, size, or orientation?
Under any transform, how do I figure out 1) which direction the front of the dog is facing, assuming that in viewDidLoad the front was facing (0,0,-1) direction, and 2) the height and width of the dog?

A full introduction to transform matrices is a) beyond the scope of a simple SO answer and b) such a basic topic in 3D graphics programming that you can find a zillion or two books, tutorials, and resources on the topic. Here are two decent writeups:
Linear Algebra for Graphics Programming at metalbyexample.com
Transformations at learnopengl.com
Since you're working with ARKit in SceneKit, though, there are a number of convenience utilities for working with transforms, so you often don't need to dig into the math.
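As a rough guide to reading a printed matrix: the fourth column holds the position, and the first three columns hold the node's local x, y, and z axes, each stretched by the scale along that axis. The sketch below uses only the simd module. The helper name `describe` is made up for illustration, and it assumes the transform contains only translation, rotation, and nonzero scale (no shear).

```swift
import simd

/// Hypothetical helper: splits a translate/rotate/scale transform into its parts.
/// Assumes no shear and nonzero scale on every axis.
func describe(_ m: simd_float4x4) -> (position: SIMD3<Float>, scale: SIMD3<Float>, axes: simd_float3x3) {
    // The fourth column is the translation (where the node is).
    let position = SIMD3<Float>(m.columns.3.x, m.columns.3.y, m.columns.3.z)
    // The first three columns are the local x, y, z axes, stretched by the scale.
    let x = SIMD3<Float>(m.columns.0.x, m.columns.0.y, m.columns.0.z)
    let y = SIMD3<Float>(m.columns.1.x, m.columns.1.y, m.columns.1.z)
    let z = SIMD3<Float>(m.columns.2.x, m.columns.2.y, m.columns.2.z)
    let scale = SIMD3<Float>(simd_length(x), simd_length(y), simd_length(z))
    // Dividing out the scale leaves unit-length axes, i.e. the orientation.
    let axes = simd_float3x3(columns: (x / scale.x, y / scale.y, z / scale.z))
    return (position, scale, axes)
}

// Example: uniform scale of 2, positioned at (1, 2, 3).
var example = simd_float4x4(diagonal: SIMD4<Float>(2, 2, 2, 1))
example.columns.3 = SIMD4<Float>(1, 2, 3, 1)
let parts = describe(example)
```

With SceneKit's convention that a node faces its local -Z direction, the negated third axis (`-parts.axes.columns.2`) is the direction the node is facing.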
which direction the front of the dog is facing?
A node is always "facing" toward (0,0,-1) in its local coordinate space. (Note this is SceneKit's intrinsic notion of "facing", which may or may not map to how any custom assets are designed. If you've imported an OBJ, DAE, or whatever file built in a 3D authoring tool, and in that tool the dog's nose is pointed to (0,0,-1), your dog is facing the "right" way.)
You can get this direction from SCNNode.simdLocalFront — notice it's a static/class property, because in local space the front is always the same direction for all nodes.
What you're probably more interested in is how the node's own idea of its "front" converts to world space — that is, which way is the dog facing relative to the rest of the scene. Here are two ways to get that:
Convert simdLocalFront to world space, the way you can any other vector: node.simdConvertVector(SCNNode.simdLocalFront, to: nil). (Notice that if you leave the to parameter nil, you convert to world space.)
Use the simdWorldFront property, which does that conversion for you.
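Here is a minimal sketch of the two approaches side by side. The node name `dog` and the quarter-turn rotation are just example values, and the node is left without a parent so its local space sits directly in world space.

```swift
import SceneKit

let dog = SCNNode()
// Turn the node a quarter turn about the y axis (example value).
dog.simdEulerAngles = SIMD3<Float>(0, .pi / 2, 0)

// Way 1: convert the class-level local front vector to world space (nil means world).
let viaConversion = dog.simdConvertVector(SCNNode.simdLocalFront, to: nil)

// Way 2: let SceneKit do the same conversion.
let viaProperty = dog.simdWorldFront
```

Both values should describe the same world-space direction; the property is simply the shorter spelling.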
the height and width of the dog?
Height and width have nothing to do with transform. A transform tells you (primarily) where something is and which way it's facing.
Assuming you haven't scaled your node, though, its bounding box in local space describes its dimensions in world space:
let (min, max) = node.boundingBox
let height = abs(max.y - min.y)
let width = abs(max.x - min.x)
let depthiness = abs(max.z - min.z)
(Note this is intrinsic height, though: if you have, say, a person model that's standing up, and then you rotate it so they're lying down, this height is still head-to-toe.)
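If the node has been scaled, one way to estimate its scaled dimensions is to multiply the bounding box extents by the node's scale. This is only a sketch: the helper name `scaledSize` is made up, it assumes positive scale factors, and it ignores rotation and any scale applied by parent nodes.

```swift
import SceneKit

/// Hypothetical helper: bounding box extents multiplied by the node's own scale.
/// Ignores rotation and any scale on ancestor nodes; assumes positive scale.
func scaledSize(of node: SCNNode) -> SIMD3<Float> {
    let (minCorner, maxCorner) = node.boundingBox
    let extent = SIMD3<Float>(Float(maxCorner.x - minCorner.x),
                              Float(maxCorner.y - minCorner.y),
                              Float(maxCorner.z - minCorner.z))
    return extent * node.simdScale
}

// Example: a 1 x 2 x 3 box scaled uniformly by 2.
let box = SCNNode(geometry: SCNBox(width: 1, height: 2, length: 3, chamferRadius: 0))
box.simdScale = SIMD3<Float>(2, 2, 2)
let size = scaledSize(of: box)   // width in x, height in y, depth in z
```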

Related

How do I get the picture to stick to the wall?

I'm using ARKit and RealityKit. I first load a room, then put a picture on the wall, but how do I make it stick to the wall?
How do I get the right rotation for the picture?
There are four possibilities for the wall's front: z+, z-, x+, x-.

SceneKit
To answer your first question: to make the node stick to a particular place, you just need to set the desired location and add the node; it will not disappear unless you remove it. It may move up and down a little when you change the phone's orientation and position, but not much.
node.position = SCNVector3(x, y, z)
If you are using touches to define the location, I would recommend considering the sceneView.hitTest() and touchesBegan() functions.
As for rotation, you can simply set Euler angles on the node of interest, around any axis and in any direction you need:
node.eulerAngles.x = -.pi / 2
Also, I would highly recommend you “App Development with Swift” by Apple Education, 2019. In this book there is a whole chapter on ARKit, besides answering your questions, it has numerous useful techniques and ideas
Here you can find the implementation of the end-of-the-chapter guided project from the App Development with Swift book, but doing it yourself would be much more useful
RealityKit
As given here, you can use one of two options for rotation
let boxAnchor = try! Experience.loadBox()
boxAnchor.steelBox?.orientation = simd_quatf(angle: .pi/4, axis: [0, 0, 1])
For angle property you insert how much you want to rotate the object in radians, and for axis property you select the axis that you want to rotate around.
Another option is to use transform
boxAnchor.steelBox?.transform = Transform(pitch: 0, yaw: 0, roll: .pi/4)
Roll, pitch, and yaw represent rotation along a particular axis, more is written here.
To change the position you can again use transform's translation
boxAnchor.steelBox?.transform.translation = [0, 0, -0.5]
This will translate your object according to the given parameters. This function heavily relies on affine transforms.
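For the four wall directions mentioned in the question (z+, z-, x+, x-), the rotation can be derived from the direction the wall's front faces. The sketch below uses only simd. The helper name `pictureOrientation` is made up, and it assumes the picture's own front faces +Z before any rotation and that the walls are vertical; adjust if your asset is authored differently.

```swift
import Foundation
import simd

/// Hypothetical helper: orientation for a picture on a wall whose front faces `normal`.
/// Assumes the unrotated picture faces +Z and the wall is vertical.
func pictureOrientation(wallNormal normal: SIMD3<Float>) -> simd_quatf {
    // Yaw (rotation about the y axis) that turns +Z toward the wall normal.
    let yaw = atan2(normal.x, normal.z)
    return simd_quatf(angle: yaw, axis: SIMD3<Float>(0, 1, 0))
}

// Example: a wall whose front faces x+.
let facingXPlus = pictureOrientation(wallNormal: SIMD3<Float>(1, 0, 0))
// Rotating the picture's default front (+Z) should now point along x+.
let rotatedFront = facingXPlus.act(SIMD3<Float>(0, 0, 1))
```

The resulting quaternion can be assigned to an entity's orientation in the same way as the simd_quatf shown earlier.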
Regarding transformations w.r.t. other objects
Nodes in SceneKit as well as Entities in RealityKit are transformed and rotated with respect to the parent Node or Entity.
So, in your case, you have a big model where you plan to put smaller objects.
You have two options. One is to use touch detection on the big model (the house) and then manually calculate where the object should be placed, which may be pretty cumbersome.
The other is to add small transparent planes at predefined positions and add touch detection to them. This way you can skip calculating where the smaller object should be placed.
Good luck!

finding the depth in arkit with SCNVector3Make

The goal of the project is to create a drawing app. I want it so that when I touch the screen and move my finger, the paint follows the finger and leaves a cyan trail. I did create it, BUT there is one problem: the paint DEPTH is always placed randomly.
Here is the code; you just need to connect the sceneView with the storyboard.
https://github.com/javaplanet17/test/blob/master/drawingar
My question is: how do I make the program keep the depth consistent? By consistent I mean there is always the same distance between the paint and the camera.
If you run the code above you will see that I have printed out all of the SCNMatrix4 values, but none of them is the DEPTH.
I have tried changing hitTransform.m43, but it only messes up the x and y.

If you want to get a point some consistent distance in front of the camera, you don’t want a hit test. A hit test finds the real world surface in front of the camera — unless your camera is pointed at a wall that’s perfectly parallel to the device screen, you’re always going to get a range of different distances.
If you want a point some distance in front of the camera, you need to get the camera’s position/orientation and apply a translation (your preferred distance) to that. Then to place SceneKit content there, use the resulting matrix to set the transform of a SceneKit node.
The easiest way to do this is to stick to SIMD vector/matrix types throughout rather than converting between those and SCN types. SceneKit adds a bunch of new accessors in iOS 11 so you can use SIMD types directly.
There’s at least a couple of ways to go about this, depending on what result you want.
Option 1
// set up z translation for 20 cm in front of whatever
// last column of a 4x4 transform matrix is translation vector
var translation = matrix_identity_float4x4
translation.columns.3.z = -0.2
// get camera transform the ARKit way (currentFrame is optional, so unwrap it)
guard let cameraTransform = view.session.currentFrame?.camera.transform else { return }
// if we wanted, we could go the SceneKit way instead; result is the same
// guard let cameraTransform = view.pointOfView?.simdTransform else { return }
// set node transform by multiplying matrices
node.simdTransform = cameraTransform * translation
This option, using a whole transform matrix, not only puts the node a consistent distance in front of your camera, it also orients it to point the same direction as your camera.
Option 2
// distance vector for 20 cm in front of whatever
let translation = SIMD3<Float>(x: 0, y: 0, z: -0.2)
// treat distance vector as in camera space, convert to world space
// (pointOfView is optional, so unwrap it)
guard let worldTranslation = view.pointOfView?.simdConvertPosition(translation, to: nil) else { return }
// set node position (not whole transform)
node.simdPosition = worldTranslation
This option sets only the position of the node, leaving its orientation unchanged. For example, if you place a bunch of cubes this way while moving the camera, they’ll all be lined up facing the same direction, whereas with option 1 they’d all be in different directions.
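The arithmetic underneath both options is the same: a point at (0, 0, -distance) in camera space is multiplied by the camera's world transform. A minimal sketch using only simd follows; the helper name `pointInFront` and the example camera position are made up for illustration.

```swift
import simd

/// Hypothetical helper: world-space point `distance` metres in front of a camera,
/// given the camera's world transform (the camera looks down its local -Z axis).
func pointInFront(of cameraTransform: simd_float4x4, distance: Float) -> SIMD3<Float> {
    // w = 1 marks this as a position, so the translation column is applied.
    let local = SIMD4<Float>(0, 0, -distance, 1)
    let world = cameraTransform * local
    return SIMD3<Float>(world.x, world.y, world.z)
}

// Example: an unrotated camera at (1, 1.5, 2).
var camera = matrix_identity_float4x4
camera.columns.3 = SIMD4<Float>(1, 1.5, 2, 1)
let p = pointInFront(of: camera, distance: 0.2)
```

For an unrotated camera the result is simply the camera position moved 0.2 along -Z; with a rotated camera the same multiplication follows wherever the camera is pointing.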
Going beyond
Both of the options above are based only on the 3D transform of the camera — they don’t take the position of a 2D touch on the screen into account.
If you want to do that, too, you’ve got more work cut out for you. Essentially, what you’re doing is hit testing touches not against the world, but against a virtual plane that’s always parallel to the camera and a certain distance away. That plane is a cross section of the camera projection frustum, so its size depends on what fixed distance from the camera you place it at. A point on the screen projects to a point on that virtual plane, with its position on the plane scaling in proportion to the distance from the camera.
So, to map touches onto that virtual plane, there are a couple of approaches to consider. (Not giving code for these because it’s not code I can write without testing, and I’m in an Xcode-free environment right now.)
Make an invisible SCNPlane that’s a child of the view’s pointOfView node, parallel to the local xy-plane and some fixed z distance in front. Use SceneKit hitTest (not ARKit hit test!) to map touches to that plane, and use the worldCoordinates of the hit test result to position the SceneKit nodes you drop into your scene.
Use Option 1 or Option 2 above to find a point some fixed distance in front of the camera (or a whole translation matrix oriented to match the camera, translated some distance in front). Use SceneKit’s projectPoint method to find the normalized depth value Z for that point, then call unprojectPoint with your 2D touch location and that same Z value to get the 3D position of the touch location with your camera distance. (For extra code/pointers, see my similar technique in this answer.)
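A rough, untested sketch of the second approach might look like the following. It assumes iOS with an ARSCNView, and the function name `worldPosition(for:in:distance:)` is made up for illustration; treat it as a starting point rather than a verified implementation.

```swift
import ARKit
import SceneKit

/// Hypothetical helper: maps a 2D touch location to a 3D point that lies
/// `distance` metres in front of the camera. Assumes iOS (Float-based SCNVector3).
func worldPosition(for touch: CGPoint, in view: ARSCNView, distance: Float = 0.2) -> SIMD3<Float>? {
    guard let camera = view.pointOfView else { return nil }
    // A reference point straight ahead of the camera, converted to world space.
    let inFront = camera.simdConvertPosition(SIMD3<Float>(0, 0, -distance), to: nil)
    // Project it to find the normalized depth (z) for that camera distance.
    let projected = view.projectPoint(SCNVector3(inFront))
    // Unproject the touch location using that same normalized depth.
    let unprojected = view.unprojectPoint(
        SCNVector3(Float(touch.x), Float(touch.y), projected.z))
    return SIMD3<Float>(unprojected)
}
```

The returned value could then be assigned to a node's simdPosition, for example from touchesMoved with the touch's location in the view.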

SceneKit unproject Z documentation explanation?

I am going through some SceneKit concepts and one I am trying to solidify in my head is unprojectPoint.
I understand that the function will take a point in 2D and return a point in 3D (so one with the proper Z value).
When I read the documentation I read this:
/*!
@method unprojectPoint
@abstract Unprojects a screenspace 2D point with depth info using the receiver's current point of view and viewport.
@param point The screenspace position to be unprojected.
@discussion A point whose z component is 0 (resp. 1) is unprojected on the near (resp. far) clip plane.
*/
public func unprojectPoint(_ point: SCNVector3) -> SCNVector3
What I am not too clear on is the values 0 and 1 used when it talks about Z....
A point whose z component is 0 (resp. 1) is unprojected on the near (resp. far) clip plane.
As I was reading around online I then found this question:
How to use iOS (Swift) SceneKit SCNSceneRenderer unprojectPoint properly
When I deal with a SceneKit view, is Z = 0 always the near plane, and Z = 1 the far plane? If so, why? And, is Z = 0 and Z = 1 just normalized values?
So, can somebody help me understand why the value 0 and 1 are used for Z in this context? And ultimately help me understand the:
A point whose z component is 0 (resp. 1) is unprojected on the near (resp. far) clip plane.
statement?

Perspective projection is the task of converting a point from the 3D space used for modeling your scene into the 2D pixel space of the view your scene is rendered in. It's something the GPU does thousands of times per frame during rendering.
But it's not entirely a 3D-to-2D conversion. It's important during rendering to sort out which objects are nearer to or farther from the camera (so they obscure each other properly), so perspective projection also outputs a normalized depth component, where lower values indicate a point nearer to the camera (and vice versa). (Values are normalized because at this point all that's needed is relative depth. And/or for reasons of traditional 3D graphics math history and GPU design.) This information gets used during rendering but effectively thrown away afterward — all you see is the 2D view.
"Unprojecting" a point is doing the same thing in reverse: given a point in 2D screen space, you want a point in 3D scene space. But a 2D point in screen space corresponds to a line in 3D space: each pixel in your view is looking along a ray into the 3D scene, and what you see in that pixel comes from the first 3D object that ray intersects.
Thus, to unproject a 2D point into 3D scene space, you need the 2D point itself to define a ray into the 3D scene, then a normalized depth value to decide how far along the ray you want the resulting 3D point to be. (Beware, normalized depth doesn't linearly correspond to distance because of perspective division.)
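To get a feel for that non-linearity, here is a small sketch of one common depth mapping for a perspective projection with a 0-to-1 depth range. The formula is an assumption chosen for illustration; the exact mapping SceneKit uses internally is not stated here and may differ.

```swift
/// Hypothetical illustration: normalized depth for a point at distance `d` from the
/// camera, with near plane `n` and far plane `f`, under a common 0-to-1 depth mapping.
/// This is an assumed convention, not necessarily the one SceneKit uses.
func normalizedDepth(distance d: Double, near n: Double, far f: Double) -> Double {
    return f * (d - n) / (d * (f - n))
}

let atNear = normalizedDepth(distance: 0.1, near: 0.1, far: 100)   // on the near plane
let atFar = normalizedDepth(distance: 100, near: 0.1, far: 100)    // on the far plane
let oneUnitAway = normalizedDepth(distance: 1, near: 0.1, far: 100)
```

Under this mapping the near plane lands on 0 and the far plane on 1, but a point only 1 unit away (out of 100) already has a depth above 0.9, which is why a normalized depth can't be read as a fraction of the distance.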
If you don't know the depth of the point you're looking for, there are two things to consider...
Are you actually looking for the scene content (geometry) "behind" a specific pixel? If so, a hit test is more likely what you need.
Is the 3D point you want to get from unprojecting related to another point, enabling you to derive the Z value? There are a few other Q&As around here for that: How to use iOS (Swift) SceneKit SCNSceneRenderer unprojectPoint properly, How to convert 2D point to 3D using SceneKit's unprojectPoint without having a depth value?

Is there a reverse function of lookat for glMatrix?

I am using glMatrix to write WebGL code and want to get the eye position, focal point, and up direction from an existing projection and view matrix (kind of like the reverse of the lookAt function). Is there any way to do this?

I didn't implement one, no. I'm not even sure that you could decompose it into the original vectors, for that matter. The lookAt point could be anywhere along a ray from the origin, and how would you determine what the appropriate up vector was? I'm thinking this is a one-way algorithm (just too lazy to prove it!)
Beyond that, however, I question whether you would want to do this even if there was a method for it. I'll be willing to bet that it's almost always more beneficial to track the values you're using and manipulate them rather than to try and pull them back and forth from matrix to vectors and back.

Yes and No: Yes you can invert the model view transformation and no you will not get exactly all three vectors the same.
The model view transformation of lookAt is very similar to the connectTo operation as used in CSG models. It is mounting your scene in front of your camera. This is done by translation and three axis rotations. The eye point is translated to (0,0,0) and all further rotation is done around it. You can easily derive the eye point by transforming (0,0,0) with the inverse matrix.
But the center point is only used to align the axis of view with the -Z axis. In OpenGL the eye faces -Z. The distance between center and eye is lost. So you can easily get a center point along your axis of view if you define the distance yourself. Let's say we want a distance of d. Then we just need to transform (0,0,-d) with the inverse matrix and we get a valid center point, though not exactly the original one. The center point defines only two rotation angles, the camera pan and tilt.
Reconstructing the up vector is even worse. It is only used for the roll angle of the camera and thus contributes only one scalar value. So for the inverse transformation you are not limited to a positive value along the Y axis; you could choose any point in the YZ plane with a positive Y value. To get an up vector that is perpendicular to the viewing axis and has length 1, we just transform (0,1,0) with the inverse matrix. Remember to transform it as a vector this time (not as a point).
Now we have eye, center and up reconstructed in a way to get exactly the same result of lookAt next time. But since this matrix contains only 6 values of information (translation,pan,tilt,roll) we had to choose 3 values that were lost (distance center to eye, size and angle of up vector in YZ plane of camera).
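The three steps above can be sketched as follows. This is written in Swift with simd rather than glMatrix, purely for illustration, and the helper name `lookAtParameters` is made up. The same steps (invert the view matrix, then transform two points and one direction) carry over to any matrix library.

```swift
import simd

/// Hypothetical helper: recovers one valid (eye, center, up) triple from a view matrix
/// that contains only rotation and translation. `distance` is a free choice.
func lookAtParameters(from viewMatrix: simd_float4x4, distance: Float = 1)
    -> (eye: SIMD3<Float>, center: SIMD3<Float>, up: SIMD3<Float>) {
    let inverse = viewMatrix.inverse
    func xyz(_ v: SIMD4<Float>) -> SIMD3<Float> { return SIMD3<Float>(v.x, v.y, v.z) }
    // Points use w = 1 so the translation applies; directions use w = 0 so it does not.
    let eye = xyz(inverse * SIMD4<Float>(0, 0, 0, 1))
    let center = xyz(inverse * SIMD4<Float>(0, 0, -distance, 1))
    let up = xyz(inverse * SIMD4<Float>(0, 1, 0, 0))
    return (eye, center, up)
}

// Example: an unrotated camera at (0, 0, 5) has a view matrix that shifts the scene by (0, 0, -5).
var viewMatrix = matrix_identity_float4x4
viewMatrix.columns.3 = SIMD4<Float>(0, 0, -5, 1)
let recovered = lookAtParameters(from: viewMatrix, distance: 2)
```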
The model view matrix can of course do other transformation (any affine) but the lookAt function is using this matrix only for translation and rotation. It is adjusting the scene in front of the camera without distorting it.

XNA rotation over given vector

I'm a newbie in XNA, so sorry about the simple question, but I can't find any solution.
I've got a simple model (similar to a flat cuboid), which I cannot change. I would like to create a rotation animation. In this particular problem, my model is just the cover of a piano. However, the axis it currently rotates around runs through the cover's median line. As a result, my model rotates like a turbine instead of opening and closing.
I would like to rotate my object around a given "line". I found the Matrix.CreateLookAt(currentPosition, dstPosition, Vector3.Up); method, but still don't know how to combine rotation with such a matrix.

Matrix.CreateLookAt is meant for use in a camera, not for manipulating models (although I'm sure some clever individuals who understand what sort of matrix it creates have done so).
What you are wanting to do is rotate your model around an arbitrary axis in space. It's not an animation (those are created in 3D modeling software, not the game), it's a transformation. Transformations are methods by which you can move, rotate and scale a model, and are obviously the crux of 3D game graphics.
For your problem, you want to rotate this flat piece around its edge, yes? To do this, you will combine translation and axis rotation.
First, you want to move the model so the edge you want to rotate around intersects with the origin. So, if the edge was a straight line in the Z direction, it would be perfectly aligned with the Z axis and intersecting 0,0,0. To do this you will need to know the dimensions of your model. Once you have those, create a Matrix:
Matrix originTranslation = Matrix.CreateTranslation(new Vector3(-modelWidth / 2f, 0, 0));
(This assumes a square model. Manipulate the Vector3 until the edge you want is intersecting the origin)
Now, we want to do the rotating. This depends on the angle of your edge. If your model is a square and thus the edge is straight forward in the Z direction, we can just rotate around Vector3.Forward. However, if your edge is angled (as I imagine a piano cover to be), you will have to determine the angle yourself and create a Unit Vector with that same angle. Now you will create another Matrix:
Matrix axisRotation = Matrix.CreateFromAxisAngle(myAxis, rotation);
where myAxis is the unit vector which represents the angle of the edge, and rotation is a float for the number of radians to rotate.
That last bit is the key to your 'animation'. What you are going to want to do is vary that float amount depending on how much time has passed to create an 'animation' of the piano cover opening over time. Of course you will want to clamp it at an upper value, or your cover will just keep rotating.
Now, in order to actually transform your cover model, you must multiply its world matrix by the two above matrices, in order.
pianoCover.World *= originTranslation * axisRotation;
Then, if you wish, you can translate the cover back so that its center is at the origin (by multiplying by a translation matrix whose Vector3 values are the negative of the ones you first used), and subsequently translate your cover to wherever it needs to be in space using another translation matrix to that point.
So, note how matrices are used in 3D games. A matrix is created using the appropriate Matrix method in order to create qualities which you desire (translation, rotation around an axis, scale, etc). You make a matrix for each of these properties. Then you multiply them in a specific order (order matters in matrix multiplication) to transform your model as you wish. Often, as seen here, these transformations are intermediate in order to get the desired effect (we could not simply move the cover to where we wanted it then rotate it around its edge; we had to move the edge to the origin, rotate, move it back, etc).
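The same "move the edge to the origin, rotate, move it back" idea can be sketched outside XNA. The example below is in Swift with simd, for illustration only, and the helper name `rotation(about:axis:angle:)` is made up. Note that simd multiplies column vectors (matrix on the left), so the product is written in the opposite order from the XNA expression above, which uses row vectors.

```swift
import simd

/// Hypothetical helper: rotation by `angle` radians about a line through `pivot`
/// with direction `axis` (a unit vector), for column vectors (p' = M * p).
func rotation(about pivot: SIMD3<Float>, axis: SIMD3<Float>, angle: Float) -> simd_float4x4 {
    var toOrigin = matrix_identity_float4x4
    toOrigin.columns.3 = SIMD4<Float>(-pivot.x, -pivot.y, -pivot.z, 1)
    var back = matrix_identity_float4x4
    back.columns.3 = SIMD4<Float>(pivot.x, pivot.y, pivot.z, 1)
    let rotate = simd_float4x4(simd_quatf(angle: angle, axis: axis))
    // Applied right to left: move the hinge to the origin, rotate, move back.
    return back * rotate * toOrigin
}

// Example: a cover one unit wide, hinged along the edge at x = -0.5 (parallel to z).
let hinge = SIMD3<Float>(-0.5, 0, 0)
let open = rotation(about: hinge, axis: SIMD3<Float>(0, 0, 1), angle: .pi / 2)
let onHinge = open * SIMD4<Float>(-0.5, 0, 3, 1)   // a point on the hinge line
let farEdge = open * SIMD4<Float>(0.5, 0, 0, 1)    // a point on the opposite edge
```

Points on the hinge line stay where they are, while the opposite edge swings around the hinge instead of spinning about the cover's center.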
Working with matrices in 3D is pretty tough. In fact, I may not have gotten all that right (I hope by now I know that well enough, but...). The more practice you get, the better you can judge how to perform tasks like this. I would recommend reading tutorials on the subject. Any tutorial that covers 3D in XNA will have this topic.
In closing, though, if you know 3D Modeling software well enough, I would probably suggest you just make an actual animation of a piano and cover opening and closing and use that animated model in your game, instead of using models for both the piano and cover and trying to keep them together.
