Using Ray/Plane detection for collision in XNA (3D)

Basically I want to use Rays and Planes in XNA (C#/.NET) to detect collisions between my models. But before I can do that I desperately need to know how they work.
Whenever I go somewhere looking for Ray/Plane examples I get nothing but picking tutorials - I'm not looking for picking tutorials...
What I've been trying to do is take a Plane, feed it 3 Vector3s so it represents a 3D primitive triangle, and fire a Ray at it. The Ray is just a point in space and a direction.
My problem is that when I fire the Ray at the Plane, it gives me some results I can't make sense of. For example:
Say I have a Plane that represents a primitive with the coordinates {0,0,0}{1,0,0}{0,0,1}
Now I put a Ray at {0.5,1,0.5} (roughly above the center of the triangular plane) and give it the direction {0,-1,0}.
This gives me 1, which is expected because the Plane is 1 unit below the Ray, and the Ray is pointing down.
However when I make the Ray point at, say, {2,0,0}, it still gives me a number, which makes no sense because {2,0,0} is a point that is not on the triangle.
This is the code I've been using:
Plane plane = new Plane(Vector3.Zero, Vector3.Right, Vector3.Backward);
Vector3 rayPos = new Vector3(0.5f, 1f, 0.5f);
Vector3 direction = new Vector3(1f, 0f, 1f) - rayPos;
direction.Normalize();
Ray ray = new Ray(rayPos, direction);
Console.WriteLine(ray.Intersects(plane));
I feel I've left out something REALLY important, and that I'm thinking about it all wrong. Any help would be really appreciated.

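Although {2,0,0} isn't inside the triangle, it does lie on the plane. A Plane is infinite; the three points you pass in only define its position and orientation, not its edges. So the direction ((2,0,0) - rayPos) is a direction that will intersect the plane (if starting from the current rayPos), and Intersects returns a result of 1.87... To test against the triangle itself, you need an additional check that the hit point falls inside the triangle's three edges.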
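To make the plane-versus-triangle distinction concrete, here is a minimal sketch in plain Python (not XNA code) that compares an infinite-plane test with a ray/triangle test. The triangle test uses the Möller-Trumbore method, which is my choice for illustration and not something XNA provides; all function names are hypothetical. The ray origin is moved to {0.25,1,0.25} for the "straight down" case, because {0.5,0,0.5} sits exactly on the triangle's long edge.

```python
import math

def sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def normalize(v):
    n = math.sqrt(dot(v, v))
    return (v[0] / n, v[1] / n, v[2] / n)

def ray_plane(origin, direction, p0, p1, p2):
    """Distance along the ray to the INFINITE plane through p0, p1, p2, or None."""
    normal = cross(sub(p1, p0), sub(p2, p0))
    denom = dot(normal, direction)
    if abs(denom) < 1e-9:
        return None  # ray is parallel to the plane
    t = dot(normal, sub(p0, origin)) / denom
    return t if t >= 0 else None  # hits behind the origin do not count

def ray_triangle(origin, direction, p0, p1, p2):
    """Distance along the ray to the triangle p0, p1, p2 (Moller-Trumbore), or None."""
    e1, e2 = sub(p1, p0), sub(p2, p0)
    h = cross(direction, e2)
    det = dot(e1, h)
    if abs(det) < 1e-9:
        return None  # ray is parallel to the triangle's plane
    s = sub(origin, p0)
    u = dot(s, h) / det          # barycentric coordinate along e1
    q = cross(s, e1)
    v = dot(direction, q) / det  # barycentric coordinate along e2
    if u < 0 or v < 0 or u + v > 1:
        return None  # the plane is hit, but outside the triangle
    t = dot(e2, q) / det
    return t if t >= 0 else None

tri = ((0, 0, 0), (1, 0, 0), (0, 0, 1))

# Straight down from above the triangle's interior: both tests report a hit.
down = (0.0, -1.0, 0.0)
print(ray_plane((0.25, 1, 0.25), down, *tri), ray_triangle((0.25, 1, 0.25), down, *tri))

# The ray from the question, aimed at {2,0,0}: the plane is hit, the triangle is not.
origin = (0.5, 1.0, 0.5)
aimed = normalize(sub((2, 0, 0), origin))
print(ray_plane(origin, aimed, *tri), ray_triangle(origin, aimed, *tri))
```

In XNA terms, the first function corresponds to what Ray.Intersects(Plane) reports; the second is the extra step you would have to write yourself.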


Difficulty getting depth of face landmark points from 2D regions on iPhone X (SceneKit/ARKit app)

I'm running face landmark detection using the front-facing camera on iPhone X, and am trying very hard to get 3D points of face landmarks (VNFaceLandmarkRegion2D gives image coordinates X, Y only).
I've been trying to use either the ARSCNView.hitTest or ARFrame.hitTest, but am so far unsuccessful. I think my error may be in converting the initial landmark points to the correct coordinate system. I've tried quite a few permutations, but currently based on my research this is what I've come up with:
let point = CGPoint(x: landmarkPt.x * faceBounds.width + faceBounds.origin.x, y: (1.0 - landmarkPt.y) * faceBounds.height + faceBounds.origin.y)
let screenPoint = CGPoint(x: point.x * view.bounds.width, y: point.y * view.bounds.height)
frame.hitTest(screenPoint, types: ARHitTestResult.ResultType.featurePoint)
I've also tried to do
let newPoint = CGPoint(x: point.x, y: 1.0 - point.y)
after the conversion, but nothing seems to work. My frame.hitTest result is always empty. Am I missing anything in the conversion?
Does the front-facing camera add another level to this? (I also tried inverting the initial X value at one point, in case the coordinate system was being mirrored). It also seems to me that the initial landmark normalizedPoints are sometimes negative and also sometimes greater than 1.0, which doesn't make any sense to me. I'm using ARSession.currentFrame?.capturedImage to capture the frame of the front-facing camera, if that's important.
Any help would be very, very appreciated, thanks so much!
-- SOLVED --
For anyone with similar issues:
I am finally getting hit test results!
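// pointsInImage(imageSize:) is reached through optionals, so fall back to an empty array
for point in observation.landmarks?.allPoints?.pointsInImage(imageSize: sceneView.bounds.size) ?? [] {
    // SceneKit hit test on the ARSCNView instance, restricted to the face node's subtree
    let result = sceneView.hitTest(point, options: [SCNHitTestOption.rootNode: faceNode])
}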
I use the face geometry as an occlusion node attached to the face node.
Thanks Rickster!
You're using ARFaceTrackingConfiguration, correct? In that case, the featurePoint hit test type won't help you, because feature points are part of world tracking, not face tracking... in fact, just about all the ARKit hit testing machinery is specific to world tracking, and not useful to face tracking.
What you can do instead is make use of the face mesh (ARFaceGeometry) and face pose tracking (ARFaceAnchor) to work your way from a 2D image point to a 3D world-space (or camera-space) point. There's at least a couple paths you could go down for that:
If you're already using SceneKit, you can use SceneKit's hit testing instead of ARKit's. (That is, you're hit testing against "virtual" geometry modeled in SceneKit, not against a sparse estimate of the real-world environment modeled by ARKit. In this case, the "virtual" geometry of the face mesh comes into SceneKit via ARKit.) That is, you want ARSCNView.hitTest(_:options:) (inherited from SCNSceneRenderer), not hitTest(_:types:). Of course, this means you'll need to be using ARSCNFaceGeometry to visualize the face mesh in your scene, and ARSCNView's node/anchor mapping to make it track the face pose (though if you want the video image to show through, you can make the mesh transparent) — otherwise the SceneKit hit test won't have any SceneKit geometry to find.
If you're not using SceneKit, or for some reason can't put the face mesh into your scene, you have all the information you need to reconstruct a hit test against the face mesh. ARCamera has view and projection matrices that tell you the relationship of your 2D screen projection to 3D world space, ARFaceAnchor tells you where the face is in world space, and ARFaceGeometry tells you where each point is on the face — so it's just a bunch of math to get from a screen point to a face-mesh point and vice versa.

SceneKit unproject Z documentation explanation?

I am going through some SceneKit concepts and one I am trying to solidify in my head is unprojectPoint.
I understand that the function will take a point in 2D and return a point in 3D (so one with the proper Z value).
When I read the documentation I read this:
/*!
 @method unprojectPoint
 @abstract Unprojects a screenspace 2D point with depth info using the receiver's current point of view and viewport.
 @param point The screenspace position to be unprojected.
 @discussion A point whose z component is 0 (resp. 1) is unprojected on the near (resp. far) clip plane.
 */
public func unprojectPoint(_ point: SCNVector3) -> SCNVector3
What I am not too clear on is the values 0 and 1 used when it talks about Z....
A point whose z component is 0 (resp. 1) is unprojected on the near (resp. far) clip plane.
As I was reading around online I then found this question:
How to use iOS (Swift) SceneKit SCNSceneRenderer unprojectPoint properly
When I deal with a SceneKit view, is Z = 0 always the near plane, and Z = 1 the far plane? If so, why? And, is Z = 0 and Z = 1 just normalized values?
So, can somebody help me understand why the value 0 and 1 are used for Z in this context? And ultimately help me understand the:
A point whose z component is 0 (resp. 1) is unprojected on the near (resp. far) clip plane.
statement?
Perspective projection is the task of converting a point from the 3D space used for modeling your scene into the 2D pixel space of the view your scene is rendered in. It's something the GPU does thousands of times per frame during rendering.
But it's not entirely a 3D-to-2D conversion. It's important during rendering to sort out which objects are nearer to or farther from the camera (so they obscure each other properly), so perspective projection also outputs a normalized depth component, where lower values indicate a point nearer to the camera (and vice versa). (Values are normalized because at this point all that's needed is relative depth. And/or for reasons of traditional 3D graphics math history and GPU design.) This information gets used during rendering but effectively thrown away afterward — all you see is the 2D view.
"Unprojecting" a point is doing the same thing in reverse: given a point in 2D screen space, you want a point in 3D scene space. But a 2D point in screen space corresponds to a line in 3D space: each pixel in your view is looking along a ray into the 3D scene, and what you see in that pixel comes from the first 3D object that ray intersects.
Thus, to unproject a 2D point into 3D scene space, you need the 2D point itself to define a ray into the 3D scene, then a normalized depth value to decide how far along the ray you want the resulting 3D point to be. (Beware, normalized depth doesn't linearly correspond to distance because of perspective division.)
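The remark that normalized depth is not linear in distance can be illustrated with a short sketch. It assumes the common perspective convention in which depth runs from 0 at the near plane to 1 at the far plane; the exact formula SceneKit uses internally is not stated here, so treat the functions below (hypothetical names, plain Python) as an illustration of the idea only.

```python
def distance_to_depth(d, near, far):
    """Normalized depth (0 at near, 1 at far) for a point d units in front of the camera."""
    return far * (d - near) / (d * (far - near))

def depth_to_distance(z, near, far):
    """Inverse mapping: distance in front of the camera for a normalized depth z."""
    return near * far / (far - z * (far - near))

near, far = 1.0, 100.0

# A point halfway between the clip planes already has a depth close to 1 ...
print(distance_to_depth(50.5, near, far))

# ... while a depth of 0.5 is still very close to the near plane.
print(depth_to_distance(0.5, near, far))
```

Under this convention most of the 0 to 1 range is spent close to the camera, which is why picking a "reasonable looking" depth such as 0.5 rarely puts the unprojected point where you expect.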
If you don't know the depth of the point you're looking for, there are two things to consider...
Are you actually looking for the scene content (geometry) "behind" a specific pixel? If so, a hit test is more likely what you need.
Is the 3D point you want to get from unprojecting related to another point, enabling you to derive the Z value? There are a few other Q&As around here for that: How to use iOS (Swift) SceneKit SCNSceneRenderer unprojectPoint properly, How to convert 2D point to 3D using SceneKit's unprojectPoint without having a depth value?

How to convert 2D point to 3D using SceneKit's unprojectPoint without having a depth value?

Is it possible to use SceneKit's unprojectPoint to convert a 2D point to 3D without having a depth value?
I only need to find the 3D location in the XZ plane. Y can be always 0 or any value since I'm not using it.
I'm trying to do this for iOS 8 Beta.
I had something similar with JavaScript and Three.js (WebGL) like this:
function getMouse3D(x, y) {
    var pos = new THREE.Vector3(0, 0, 0);
    var pMouse = new THREE.Vector3(
        (x / renderer.domElement.width) * 2 - 1,
        -(y / renderer.domElement.height) * 2 + 1,
        1
    );
    //
    projector.unprojectVector(pMouse, camera);
    var cam = camera.position;
    var m = pMouse.y / ( pMouse.y - cam.y );
    pos.x = pMouse.x + ( cam.x - pMouse.x ) * m;
    pos.z = pMouse.z + ( cam.z - pMouse.z ) * m;
    return pos;
};
But I don't know how to translate the part with unprojectVector to SceneKit.
What I want to do is to be able to drag an object around in the XZ plane only. The vertical axis Y will be ignored.
Since the object would need to move along a plane, one solution would be to use the hitTest method, but I don't think it is very good in terms of performance to do it for every touch/drag event. Also, it wouldn't allow the object to move outside the plane either.
I've tried a solution based on the accepted answer here, but it didn't work. Using one depth value for unprojectPoint, if I drag the object around in the +/-Z direction, the object doesn't stay under the finger for long; it moves away from it instead.
I need to have the dragged object stay under the finger no matter where it is moved in the XZ plane.
First, are you actually looking for a position in the xz-plane or the xy-plane? By default, the camera looks in the -z direction, so the x- and y-axes of the 3D Scene Kit coordinate system go in the same directions as they do in the 2D view coordinate system. (Well, y is flipped by default in UIKit, but it's still the vertical axis.) The xz-plane is then orthogonal to the plane of the screen.
Second, a depth value is a necessary part of converting from 2D to 3D. I'm not an expert on three.js, but from looking at their library documentation (which apparently can't be linked into), their unprojectVector still takes a Vector3. And that's what you're constructing for pMouse in your code above: a vector whose x- and y-coordinates come from the 2D mouse position, and whose z-coordinate is 1.
SceneKit's unprojectPoint works the same way — it takes a point whose z-coordinate refers to a depth in clip space, and maps that to a point in your scene's world space.
If your world space is oriented such that the only variation you care about is in the x- and y-axes, you may pass any z-value you want to unprojectPoint, and ignore the z-value in the vector you get back. Otherwise, going by the SCNSceneRenderer documentation, pass 0 to map to the near clipping plane, 1 for the far clipping plane, or a value in between for a depth between the two (keeping in mind that normalized depth is not linear in distance). If you're using the unprojected point to position a node in the scene, the best advice is to just try different z-values (between 0 and 1) until you get the behavior you want.
However, it's a good idea to be thinking about what you're using an unprojected vector for — if the next thing you'd be doing with it is testing for intersections with scene geometry, look at hitTest: instead.
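For the "drag in the XZ plane" case from the question, one way to avoid guessing a depth is to unproject the same screen point twice, once on the near plane and once on the far plane, and intersect the line between the two results with the plane y = 0. The sketch below shows only that last step in plain Python; near_pt and far_pt are assumed to be the two vectors returned by the unproject calls, and intersect_ground is a hypothetical helper name. It is the same interpolation the three.js snippet performs with the camera position and pMouse.

```python
def intersect_ground(near_pt, far_pt, ground_y=0.0):
    """Point where the segment near_pt -> far_pt crosses the plane y = ground_y, or None."""
    dy = far_pt[1] - near_pt[1]
    if abs(dy) < 1e-9:
        return None  # the view ray is parallel to the ground plane
    t = (ground_y - near_pt[1]) / dy
    if t < 0.0 or t > 1.0:
        return None  # the crossing lies outside the near/far range
    return (near_pt[0] + t * (far_pt[0] - near_pt[0]),
            ground_y,
            near_pt[2] + t * (far_pt[2] - near_pt[2]))

# Example values standing in for the two unprojected points (assumed, not from SceneKit).
near_pt = (0.0, 10.0, 0.0)
far_pt = (10.0, -10.0, -20.0)
print(intersect_ground(near_pt, far_pt))
```

Because the result is recomputed from the touch location on every event, the dragged object stays under the finger regardless of how far it is from the camera.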

XNA WorldMatrix and ViewMatrix

I have created a Camera class that allows me to move around a scene in first person. The camera has worked just fine until I decided to use it as a location to add something to the 3D world. What I am trying to do is add a cube to the world when I press a mouse button. I want the cube to eventually travel away from the camera, but for now I just want to create it right in front of it. Sometimes it works and sometimes it creates it to one side or the other. It all depends on how much I've rotated and translated the camera.
I am trying to find the vector in front of my camera by using the View Matrix like so:
Vector3 inFront = Camera.ViewMatrix.Forward;
I plan to use the vector to add some physics behind the cube and have it travel away from the camera. For now I am just wanting to get a correct Vector.
I know you normally draw things in the world using the WorldMatrix, but I can't figure out how to convert my ViewMatrix into a WorldMatrix. Still learning :-)
What am I doing wrong?
-Scott
First of all, there is no real difference between a "World Matrix" and a "View Matrix", they are both transformation matrices and the distinction is somewhat arbitrary. Some systems even combine the two (OpenGL simply has a "ModelView" matrix).
Traditionally the "world matrix" is used to move individual models from "model space" to "world space". Then the "view matrix" is used to move all the models from world space into their relative positions in front of the camera (which, in effect, "moves the camera"). And finally the "Projection Matrix" converts the 3D positions into their 2D positions on the screen (generally with a perspective projection). Because they are matrices, they can be multiplied together into a single matrix that can transform points in a single step.
First of all, take a look at the properties of the Matrix struct.
What you also need to realise is that Matrix.Forward returns a Vector3. A Vector3 can represent either a position or a scalar and a direction. You need two of them to represent a position and a direction.
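Now, my 3D matrix maths is a bit rusty, but I'm pretty sure that what you want is the Matrix.Translation as the position of the camera in world space, and Matrix.Forward as the forward direction of the camera in world space. Note that both need to be read from the camera's world matrix, that is, Matrix.Invert(Camera.ViewMatrix), rather than from the view matrix itself. The view matrix stores the inverse transform, which is why its Forward only looks right for some camera orientations.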
Unless your camera/view matrix is performing a scaling operation on the world (and really it shouldn't), then the Vector3 you get back from Matrix.Forward will have unit length - in other words just a direction (no scalar). Use this to give a direction to move your object in.
I assume you have the location of the camera. Have you tried something like this (I haven't done Matrix/Vector math in a few years so this might be off):
float scalar = 10; // how far away from the camera you want to move the object
Vector3 camPos = ???; // supplied from somewhere else
// The camera's forward direction in world space comes from its world matrix,
// which is the inverse of the view matrix.
Vector3 inFront = Matrix.Invert(Camera.ViewMatrix).Forward;
Vector3 newPos = camPos + inFront * scalar;
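As a rough check of why reading Forward straight from the view matrix only sometimes works, the following plain-Python sketch builds a camera world matrix from a yaw angle and a position, inverts it to get a view matrix, and compares the two. It follows XNA's row-vector layout (rows 0 to 2 are right, up, backward; row 3 is the translation), but it is not XNA code: every function name here is hypothetical, and the inverse shortcut assumes the matrix contains only rotation and translation.

```python
import math

def camera_world(yaw, pos):
    """4x4 camera world matrix: rotation about +Y by yaw, then translation to pos."""
    c, s = math.cos(yaw), math.sin(yaw)
    return [[c, 0.0, -s, 0.0],
            [0.0, 1.0, 0.0, 0.0],
            [s, 0.0, c, 0.0],
            [pos[0], pos[1], pos[2], 1.0]]

def rigid_inverse(m):
    """Inverse of a rotation+translation matrix: transpose the rotation, rotate and negate the translation."""
    r_t = [[m[j][i] for j in range(3)] for i in range(3)]
    t = m[3][:3]
    new_t = [-sum(t[k] * r_t[k][i] for k in range(3)) for i in range(3)]
    return [r_t[0] + [0.0], r_t[1] + [0.0], r_t[2] + [0.0], new_t + [1.0]]

def forward(m):
    """Equivalent of Matrix.Forward: the negated third row."""
    return [-m[2][0], -m[2][1], -m[2][2]]

def translation(m):
    """Equivalent of Matrix.Translation: the fourth row."""
    return m[3][:3]

cam_pos = (2.0, 0.0, 5.0)
world = camera_world(math.pi / 2, cam_pos)  # camera turned 90 degrees, now facing -X
view = rigid_inverse(world)

print("forward read from the view matrix:", forward(view))
print("forward read from its inverse:    ", forward(rigid_inverse(view)))
print("camera position from the inverse: ", translation(rigid_inverse(view)))

# Place an object 10 units in front of the camera.
fwd = forward(rigid_inverse(view))
new_pos = [cam_pos[i] + fwd[i] * 10.0 for i in range(3)]
print("new object position:", new_pos)
```

With no rotation the two forward vectors agree, which would explain why the cube sometimes appears in the right place and drifts to one side as the camera turns.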

XNA 4.0 Camera Question

I'm having trouble understanding how the camera works in my test application. I've been able to piece together a working camera - now I am trying to make sure I understand how it all works. My camera is encapsulated in its own class. Here is the update method that gets called from my Game.Update() method:
public void Update(float dt)
{
    Yaw += (200 - Game.MouseState.X) * dt * .12f;
    Pitch += (200 - Game.MouseState.Y) * dt * .12f;
    Mouse.SetPosition(200, 200);
    _worldMatrix = Matrix.CreateFromAxisAngle(Vector3.Right, Pitch) * Matrix.CreateFromAxisAngle(Vector3.Up, Yaw);
    float distance = _speed * dt;
    if (_game.KeyboardState.IsKeyDown(Keys.E))
        MoveForward(distance);
    if (_game.KeyboardState.IsKeyDown(Keys.D))
        MoveForward(-distance);
    if (_game.KeyboardState.IsKeyDown(Keys.S))
        MoveRight(-distance);
    if (_game.KeyboardState.IsKeyDown(Keys.F))
        MoveRight(distance);
    if (_game.KeyboardState.IsKeyDown(Keys.A))
        MoveUp(distance);
    if (_game.KeyboardState.IsKeyDown(Keys.Z))
        MoveUp(-distance);
    _worldMatrix *= Matrix.CreateTranslation(_position);
    _viewMatrix = Matrix.Invert(_worldMatrix); // What's going on here???
}
First of all, I understand everything in this method other than the very last part where the matrices are being manipulated. I think the terminology is getting in my way as well. For example, my _worldMatrix is really a Rotation Matrix. What really baffles me is the part where the _viewMatrix is calculated by inverting the _worldMatrix. I just don't understand what this is all about.
In prior testing, I always used Matrix.CreateLookAt() to create a view matrix, so I'm a bit confused. I'm hoping someone can explain in simple terms what is going on.
Thanks,
-Scott
One operation the view matrix does for the graphics pipeline is that it converts a 3D point from world space (the x, y, z we all know & love) into view (or camera) space, a space where the camera is considered to be the center of the world (0,0,0) and all points/objects are relative to it. So while a point may be at 1,1,1 relative to the world, what are its coordinates relative to the camera location? Well, as it turns out, to find out, you can transform that point by the inverse of a matrix representing the camera's world space position/rotation.
It kinda makes sense if you think about it... let's say the camera position is 2,2,2. An arbitrary point is at 3,3,3. We know that the point is 1,1,1 away from the camera, right? So what transformation would you apply to the point 3,3,3 in order for it to become 1,1,1 (its location relative to the camera)? You would translate 3,3,3 by -2,-2,-2 to result in 1,1,1. -2,-2,-2 is also the camera's inverted position. That example was for translation because it is relatively easy to grok, but basically the same happens for rotation. But don't expect to be able to simply negate all basis vectors to invert a matrix... there is a little more going on with that for rotation.
The Matrix.CreateLookAt() method automatically returns the inverted matrix so you don't really notice it happening unless you reflect its code.
Taking that one step further, the Projection matrix then takes that point in view space and projects it onto a flat surface and that point that started out in 3d space is now in 2d space.
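The 2,2,2 / 3,3,3 example can be written out as a small sketch, with a rotation added to show the "little more going on" part. This is plain Python with a hypothetical world_to_view function; it assumes a camera that is only yawed about the Y axis and uses the same conventions as XNA (the camera looks down -Z in its own space). It is meant to show what the inverted matrix does to a point, not how XNA implements Matrix.Invert.

```python
import math

def world_to_view(point, cam_pos, yaw):
    """Express a world-space point relative to a camera at cam_pos, yawed about +Y."""
    # Step 1: undo the camera's translation.
    dx = point[0] - cam_pos[0]
    dy = point[1] - cam_pos[1]
    dz = point[2] - cam_pos[2]
    # Step 2: undo the camera's rotation by applying the transposed rotation matrix.
    c, s = math.cos(yaw), math.sin(yaw)
    return (dx * c - dz * s, dy, dx * s + dz * c)

# Translation only: the point 3,3,3 seen from a camera at 2,2,2.
print(world_to_view((3.0, 3.0, 3.0), (2.0, 2.0, 2.0), 0.0))

# Camera turned 90 degrees so that it faces -X; a point 4 units along -X from the
# camera should end up straight ahead, i.e. on the negative Z axis in view space.
print(world_to_view((-2.0, 2.0, 2.0), (2.0, 2.0, 2.0), math.pi / 2))
```

The order matters: the camera's world matrix rotates first and translates second, so its inverse has to undo the translation first and the rotation second.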
