How to calculate camera ray position for use with XMVector3Unproject(), DirectX11? - directx

I'm trying to create a ray-casting camera in DirectX11 using XMVector3Unproject(). From my understanding, I will pass in the (Vector3) position of the pixel on the near plane and, in a separate call, the corresponding position on the far plane. I would then subtract these vectors to get the direction of the ray. The origin would be the unprojected coordinate on the near plane. My problem is calculating the origin of the ray to be passed in.
Example
// assuming screenHeight and screenWidth are the number of pixels.
const uint32_t screenHeight = 768;
const uint32_t screenWidth = 1024;
struct Ray
{
XMFLOAT3 origin;
XMFLOAT3 direction;
};
Ray rays[screenWidth * screenHeight];
for (uint32_t i = 0; i < screenHeight; ++i)
{
for (uint32_t j = 0; j < screenWidth; ++j)
{
// 1. ***calculate and store the current pixel position on the near plane***
// 2. ***calculate the corresponding point on the far plane***
// 3. ***pass both positions separately into XMVector3Unproject() (2 total calls to the function)***
// 4. ***store the returned vectors' difference into rays[i * screenWidth + j].direction***
// 5. ***store the near plane pixel position's returned vector into rays[i * screenWidth + j].origin***
}
}
Hopefully I'm understanding this correctly. Any help in determining the ray origins, or corrections would be greatly appreciated.

According to the documentation, XMVector3Unproject projects a point from screen space (pixel x and y, plus a depth z between the viewport's minimum and maximum Z) into object space, using the projection, view, and world matrices you pass in.
To generate your camera rays, treat the camera as a pinhole (all the light passes through one point, the camera position), so each ray is fully determined by two points on it. Say you want to generate W*H camera rays; unprojecting the same pixel at the near and far depth gives you those two points, and your loop might look like this:
for (int py = 0; py < H; ++py) {
for (int px = 0; px < W; ++px) {
// same pixel at viewport depth 0 (near plane) and 1 (far plane)
Vector3 near_point = Unproject(Vector3(px + 0.5f, py + 0.5f, 0.f), 0.f, 0.f,
W, H, 0.f, 1.f,
proj, view, model);
Vector3 far_point = Unproject(Vector3(px + 0.5f, py + 0.5f, 1.f), 0.f, 0.f,
W, H, 0.f, 1.f,
proj, view, model);
Vector3 ray_origin = near_point;
Vector3 ray_direction = Normalize(far_point - near_point);
}
}
You might also want to have a look at this link which sounds interesting
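As a library-independent illustration of the near/far-plane idea from the question, the sketch below computes the two points directly for the special case of identity view and world matrices and a symmetric, left-handed perspective projection. `Vec3`, `Ray`, and `pixelRay` are hypothetical names, and sampling at pixel centres is an assumption. With DirectXMath you would instead call XMVector3Unproject twice, once with z equal to the viewport's minimum Z and once with its maximum Z.

```cpp
#include <cassert>
#include <cmath>

struct Vec3 { float x, y, z; };
struct Ray  { Vec3 origin; Vec3 direction; };

// Builds the camera-space ray through the centre of pixel (px, py).
// Assumes a left-handed camera looking down +z (the DirectX convention),
// a vertical field of view fovY in radians, and pixel (0, 0) at the top left.
Ray pixelRay(int px, int py, int width, int height,
             float fovY, float zNear, float zFar)
{
    assert(width > 0 && height > 0 && zNear > 0.0f && zFar > zNear);

    // Pixel centre -> normalized device coordinates in [-1, 1], y pointing up.
    const float ndcX = 2.0f * (px + 0.5f) / width - 1.0f;
    const float ndcY = 1.0f - 2.0f * (py + 0.5f) / height;

    // Half-extent of the view frustum at unit distance from the camera.
    const float tanHalfY = std::tan(0.5f * fovY);
    const float tanHalfX = tanHalfY * width / height;

    // The points this pixel covers on the near and far planes.
    const Vec3 nearPt{ ndcX * tanHalfX * zNear, ndcY * tanHalfY * zNear, zNear };
    const Vec3 farPt { ndcX * tanHalfX * zFar,  ndcY * tanHalfY * zFar,  zFar  };

    // Direction = normalized difference; origin = the near-plane point.
    Vec3 d{ farPt.x - nearPt.x, farPt.y - nearPt.y, farPt.z - nearPt.z };
    const float len = std::sqrt(d.x * d.x + d.y * d.y + d.z * d.z);
    d = { d.x / len, d.y / len, d.z / len };
    return { nearPt, d };
}
```

Calling `pixelRay(j, i, screenWidth, screenHeight, fovY, zNear, zFar)` inside the question's double loop would fill `rays[i * screenWidth + j]`; multiplying by the inverse view matrix afterwards would move the ray into world space.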

Related

SceneKit 3D Marker Augmented Reality iOS

For the last couple of weeks I've been developing a simple proof-of-concept application in which a 3D model is projected over a specific Augmented Reality marker (in my case, Aruco markers) on iOS (with Swift and Objective-C).
I calibrated an iPad camera with a specific fixed lens position and used that to estimate the pose of the AR marker (which, from my debug analysis, seems pretty accurate). The problem appears (surprise, surprise) when I try to use a SceneKit scene to project a model over the marker.
I am aware that the axes in OpenCV and SceneKit are different (Y and Z) and have already made this correction, as well as accounting for the row order/column order difference between the two libraries.
After constructing the projection matrix, I apply that same transform to the 3D model, and from my debug analysis the object seems to be translated to the desired position with the desired rotation. The problem is that it never overlaps the image pixel position of the marker. I am using an AVCaptureVideoPreviewLayer to show the video in the background, with the same bounds as my SceneKit view.
Does anyone have a clue why this happens? I tried playing with the camera's FOV, but with no real impact on the results.
Thank you all for your time.
EDIT1: I will post some of the code here to show what I am currently doing.
I have two subviews inside the main view, one which is a background AVCaptureVideoPreviewLayer and another which is a SceneKitView. Both have the same bounds as the main view.
At each frame I use an opencv wrapper which outputs the pose of each marker:
std::vector<int> ids;
std::vector<std::vector<cv::Point2f>> corners, rejected;
cv::aruco::detectMarkers(frame, _dictionary, corners, ids, _detectorParams, rejected);
if (ids.size() > 0 ){
cv::aruco::drawDetectedMarkers(frame, corners, ids);
cv::Mat rvecs, tvecs;
cv::aruco::estimatePoseSingleMarkers(corners, 2.6, _intrinsicMatrix, _distCoeffs, rvecs, tvecs);
// Let's protect ourselves against multiple markers
if (rvecs.total() > 1)
return;
_markerFound = true;
cv::Rodrigues(rvecs, _currentR);
_currentT = tvecs;
for (int row = 0; row < _currentR.rows; row++){
for (int col = 0; col < _currentR.cols; col++){
_currentExtrinsics.at<double>(row, col) = _currentR.at<double>(row, col);
}
_currentExtrinsics.at<double>(row, 3) = _currentT.at<double>(row);
}
_currentExtrinsics.at<double>(3,3) = 1;
std::cout << tvecs << std::endl;
// Convert coordinate systems of opencv to openGL (SceneKit)
// Note that in OpenCV z points away from the camera (in OpenGL it points toward the camera)
// and y points down, while in OpenGL it points up
// Another note: openCV has a column order matrix representation, while SceneKit
// has a row order matrix, but we'll take care of it later.
cv::Mat cvToGl = cv::Mat::zeros(4, 4, CV_64F);
cvToGl.at<double>(0,0) = 1.0f;
cvToGl.at<double>(1,1) = -1.0f; // invert the y axis
cvToGl.at<double>(2,2) = -1.0f; // invert the z axis
cvToGl.at<double>(3,3) = 1.0f;
_currentExtrinsics = cvToGl * _currentExtrinsics;
cv::aruco::drawAxis(frame, _intrinsicMatrix, _distCoeffs, rvecs, tvecs, 5);
Then in each frame I convert the OpenCV matrix to an SCNMatrix4:
- (SCNMatrix4) transformToSceneKit:(cv::Mat&) openCVTransformation{
SCNMatrix4 mat = SCNMatrix4Identity;
// Transpose
openCVTransformation = openCVTransformation.t();
// copy the rotationRows
mat.m11 = (float) openCVTransformation.at<double>(0, 0);
mat.m12 = (float) openCVTransformation.at<double>(0, 1);
mat.m13 = (float) openCVTransformation.at<double>(0, 2);
mat.m14 = (float) openCVTransformation.at<double>(0, 3);
mat.m21 = (float)openCVTransformation.at<double>(1, 0);
mat.m22 = (float)openCVTransformation.at<double>(1, 1);
mat.m23 = (float)openCVTransformation.at<double>(1, 2);
mat.m24 = (float)openCVTransformation.at<double>(1, 3);
mat.m31 = (float)openCVTransformation.at<double>(2, 0);
mat.m32 = (float)openCVTransformation.at<double>(2, 1);
mat.m33 = (float)openCVTransformation.at<double>(2, 2);
mat.m34 = (float)openCVTransformation.at<double>(2, 3);
//copy the translation row
mat.m41 = (float)openCVTransformation.at<double>(3, 0);
mat.m42 = (float)openCVTransformation.at<double>(3, 1)+2.5;
mat.m43 = (float)openCVTransformation.at<double>(3, 2);
mat.m44 = (float)openCVTransformation.at<double>(3, 3);
return mat;
}
At each frame in which the AR marker is found I add a box to the scene and apply the transformation to the object node:
SCNBox *box = [SCNBox boxWithWidth:5.0 height:5.0 length:5.0 chamferRadius:0.0];
_boxNode = [SCNNode nodeWithGeometry:box];
if (found){
[self.delegate returnExtrinsicsMat:extrinsicMatrixOfTheMarker];
Mat R, T;
[self.delegate returnRotationMat:R];
[self.delegate returnTranslationMat:T];
SCNMatrix4 Transformation;
Transformation = [self transformToSceneKit:extrinsicMatrixOfTheMarker];
//_cameraNode.transform = SCNMatrix4Invert(Transformation);
[_sceneKitScene.rootNode addChildNode:_cameraNode];
//_cameraNode.camera.projectionTransform = SCNMatrix4Identity;
//_cameraNode.camera.zNear = 0.0;
_sceneKitView.pointOfView = _cameraNode;
_boxNode.transform = Transformation;
[_sceneKitScene.rootNode addChildNode:_boxNode];
//_boxNode.position = SCNVector3Make(Transformation.m41, Transformation.m42, Transformation.m43);
std::cout << (_boxNode.position.x) << " " << (_boxNode.position.y) << " " << (_boxNode.position.z) << std::endl << std::endl;
}
For example, if the translation vector is (-1, 5, 20), the object appears in the scene at position (-1, -5, -20), and the rotation is also correct. The problem is that it never appears in the correct position over the background image. I will add some images to show the result.
Does anyone know why this is happening?
Found out the solution. Instead of applying the transform to the node of the object I applied the inverted transformation matrix to the camera node. Then for the camera perspective transform matrix I applied the following matrix:
projection = SCNMatrix4Identity
projection.m11 = (2 * (float)(cameraMatrix[0])) / -(ImageWidth*0.5)
projection.m12 = (-2 * (float)(cameraMatrix[1])) / (ImageWidth*0.5)
projection.m13 = (width - (2 * Float(cameraMatrix[2]))) / (ImageWidth*0.5)
projection.m22 = (2 * (float)(cameraMatrix[4])) / (ImageHeight*0.5)
projection.m23 = (-height + (2 * (float)(cameraMatrix[5]))) / (ImageHeight*0.5)
projection.m33 = (-far - near) / (far - near)
projection.m34 = (-2 * far * near) / (far - near)
projection.m43 = -1
projection.m44 = 0
where far and near are the z clipping planes.
I also had to correct the box initial position to center it on the marker.
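For comparison, one common way to derive an OpenGL-style projection matrix from OpenCV intrinsics is sketched below. It is not the exact matrix from the answer above: it assumes a skew-free pinhole model, a camera looking down -z with y up on the OpenGL side, and a matrix stored as m[row][col] that multiplies column vectors. `Mat4`, `projectionFromIntrinsics`, and `projectToPixel` are hypothetical names. Whether the result must be transposed for a given API depends on that API's storage convention.

```cpp
#include <cassert>
#include <cmath>

struct Mat4 { double m[4][4]; };  // m[row][col], applied to column vectors

// fx, fy: focal lengths in pixels; cx, cy: principal point in pixels.
Mat4 projectionFromIntrinsics(double fx, double fy, double cx, double cy,
                              double width, double height,
                              double zNear, double zFar)
{
    assert(width > 0 && height > 0 && zNear > 0 && zFar > zNear);
    Mat4 p{};  // zero-initialised
    p.m[0][0] = 2.0 * fx / width;
    p.m[0][2] = 1.0 - 2.0 * cx / width;    // principal point offset in x
    p.m[1][1] = 2.0 * fy / height;
    p.m[1][2] = 2.0 * cy / height - 1.0;   // principal point offset in y
    p.m[2][2] = -(zFar + zNear) / (zFar - zNear);
    p.m[2][3] = -2.0 * zFar * zNear / (zFar - zNear);
    p.m[3][2] = -1.0;                      // clip w = distance along the view axis
    return p;
}

// Projects an OpenCV camera-space point (y down, z forward) to pixel
// coordinates through the matrix, so it can be checked against the
// pinhole formula u = fx * X / Z + cx, v = fy * Y / Z + cy.
void projectToPixel(const Mat4& p, double X, double Y, double Z,
                    double width, double height, double& u, double& v)
{
    // OpenCV (y down, z forward) -> OpenGL (y up, z backward).
    const double g[4] = { X, -Y, -Z, 1.0 };
    double clip[4] = { 0.0, 0.0, 0.0, 0.0 };
    for (int r = 0; r < 4; ++r)
        for (int c = 0; c < 4; ++c)
            clip[r] += p.m[r][c] * g[c];

    const double ndcX = clip[0] / clip[3];
    const double ndcY = clip[1] / clip[3];
    u = (ndcX + 1.0) * 0.5 * width;   // NDC -> pixels, x to the right
    v = (1.0 - ndcY) * 0.5 * height;  // NDC -> pixels, y downwards
}
```

A quick sanity check for any such matrix is to push a known camera-space point through it and confirm that it lands on the pixel predicted by the pinhole formula; a mismatch usually points to a sign or axis convention.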

Why is my shape distorted on rotation about the z axis?

I just started learning Metal and can best show you my frustration with the following series of screenshots. From top to bottom we have
(1) My model where the model matrix is the identity matrix
(2) My model rotated 60 deg about the x axis with orthogonal projection
(3) My model rotated 60 deg about the y axis with orthogonal projection
(4) My model rotated 60 deg about the z axis
So I use the following function for conversion into normalized device coordinates:
- (CGPoint)normalizedDevicePointForViewPoint:(CGPoint)point
{
CGPoint p = [self convertPoint:point toCoordinateSpace:self.window.screen.fixedCoordinateSpace];
CGFloat halfWidth = CGRectGetMidX(self.window.screen.bounds);
CGFloat halfHeight = CGRectGetMidY(self.window.screen.bounds);
CGFloat px = ( p.x - halfWidth ) / halfWidth;
CGFloat py = ( p.y - halfHeight ) / halfHeight;
return CGPointMake(px, -py);
}
The following rotates and orthogonally projects the model:
- (matrix_float4x4)zRotation
{
self.rotationZ = M_PI / 3;
const vector_float3 zAxis = { 0, 0, 1 };
const matrix_float4x4 zRot = matrix_float4x4_rotation(zAxis, self.rotationZ);
const matrix_float4x4 modelMatrix = zRot;
return matrix_multiply( matrix_float4x4_orthogonal_projection_on_z_plane(), modelMatrix );
}
As you can see, when I use the exact same method for rotating about the other two axes, it looks fine, not distorted. What am I doing wrong? Is there some sort of scaling/aspect ratio thing I should be setting somewhere? What else could it be? I've been staring at this for an embarrassingly long period of time, so any help/ideas that can lead me in the right direction are much appreciated. Thank you in advance.
There's nothing wrong with your rotation or projection matrices. The visual oddity arises from the fact that you move your vertices into NDC space prior to rotation. A rectangle doesn't preserve its aspect ratio when rotating in NDC space, because the mapping from NDC back to screen coordinates is not 1:1.
I would recommend not working in NDC until the very end of the vertex pipeline (i.e., pass vertices into your vertex function in "world" space, and out to the rasterizer as NDC). You can do this with a classic construction of the orthographic projection matrix that scales and biases the vertices, correctly accounting for the non-square aspect ratio of window coordinates.
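The effect can be reproduced with a few lines of 2D math. The sketch below is a simplified stand-in for the matrix pipeline (all names are hypothetical): it rotates in world units first and only then applies an orthographic scale with separate x and y factors, which is the part that accounts for the aspect ratio.

```cpp
#include <cassert>
#include <cmath>

struct Vec2 { double x, y; };

// Rotates a point about the z axis (i.e. in the xy plane) by angle radians.
Vec2 rotateZ(Vec2 p, double angle)
{
    const double c = std::cos(angle), s = std::sin(angle);
    return { c * p.x - s * p.y, s * p.x + c * p.y };
}

// Orthographic projection to NDC for a view whose visible region is
// [-halfWidth, halfWidth] x [-halfHeight, halfHeight] in world units.
// Scaling x and y separately is what compensates for the aspect ratio.
Vec2 orthoToNdc(Vec2 world, double halfWidth, double halfHeight)
{
    assert(halfWidth > 0 && halfHeight > 0);
    return { world.x / halfWidth, world.y / halfHeight };
}

// Maps NDC back to view coordinates, as the viewport transform does.
Vec2 ndcToView(Vec2 ndc, double halfWidth, double halfHeight)
{
    return { ndc.x * halfWidth, ndc.y * halfHeight };
}

// Euclidean distance between two points.
double dist(Vec2 a, Vec2 b) { return std::hypot(a.x - b.x, a.y - b.y); }
```

With a non-square view, an edge that is rotated before `orthoToNdc` keeps its on-screen length, whereas the same edge converted to NDC first and rotated afterwards is stretched by the viewport mapping, which matches the distortion in the screenshots.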

Augmented Reality iOS application tracking issue

I am able to detect markers, identify markers, and initialise OpenGL objects on screen. The issue I'm having is overlaying them on top of the marker's position in the camera world. My camera is calibrated as well as I can manage using this method: Iphone 6 camera calibration for OpenCV. I suspect there is an issue with my camera's projection matrix, which I create as follows:
-(void)buildProjectionMatrix:
(Matrix33)cameraMatrix:
(int)screen_width:
(int)screen_height:
(Matrix44&) projectionMatrix
{
float near = 0.01; // Near clipping distance
float far = 100; // Far clipping distance
// Camera parameters
float f_x = cameraMatrix.data[0]; // Focal length in x axis
float f_y = cameraMatrix.data[4]; // Focal length in y axis
float c_x = cameraMatrix.data[2]; // Camera principal point x
float c_y = cameraMatrix.data[5]; // Camera principal point y
std::cout<<"fx "<<f_x<<" fy "<<f_y<<" cx "<<c_x<<" cy "<<c_y<<std::endl;
std::cout<<"width "<<screen_width<<" height "<<screen_height<<std::endl;
projectionMatrix.data[0] = - 2.0 * f_x / screen_width;
projectionMatrix.data[1] = 0.0;
projectionMatrix.data[2] = 0.0;
projectionMatrix.data[3] = 0.0;
projectionMatrix.data[4] = 0.0;
projectionMatrix.data[5] = 2.0 * f_y / screen_height;
projectionMatrix.data[6] = 0.0;
projectionMatrix.data[7] = 0.0;
projectionMatrix.data[8] = 2.0 * c_x / screen_width - 1.0;
projectionMatrix.data[9] = 2.0 * c_y / screen_height - 1.0;
projectionMatrix.data[10] = -( far+near ) / ( far - near );
projectionMatrix.data[11] = -1.0;
projectionMatrix.data[12] = 0.0;
projectionMatrix.data[13] = 0.0;
projectionMatrix.data[14] = -2.0 * far * near / ( far - near );
projectionMatrix.data[15] = 0.0;
}
This is the method to estimate the position of the marker:
void MarkerDetector::estimatePosition(std::vector<Marker>& detectedMarkers)
{
for (size_t i=0; i<detectedMarkers.size(); i++)
{
Marker& m = detectedMarkers[i];
cv::Mat Rvec;
cv::Mat_<float> Tvec;
cv::Mat raux,taux;
cv::solvePnP(m_markerCorners3d, m.points, camMatrix, distCoeff,raux,taux);
raux.convertTo(Rvec,CV_32F);
taux.convertTo(Tvec ,CV_32F);
cv::Mat_<float> rotMat(3,3);
cv::Rodrigues(Rvec, rotMat);
// Copy to transformation matrix
for (int col=0; col<3; col++)
{
for (int row=0; row<3; row++)
{
m.transformation.r().mat[row][col] = rotMat(row,col); // Copy rotation component
}
m.transformation.t().data[col] = Tvec(col); // Copy translation component
}
// Since solvePnP finds camera location, w.r.t to marker pose, to get marker pose w.r.t to the camera we invert it.
m.transformation = m.transformation.getInverted();
}
}
The OpenGL shape is able to track and account for size and rotation, but something is going wrong with the translation. If the camera is turned 90 degrees, the OpenGL shape swings around 90 degrees about the centre of the marker. It's almost as if I am translating before rotating, but I am not.
See video for issue:
https://vid.me/fLvX
My guess is that the problem lies in how you project the 3D model points. Essentially, solvePnP gives a transformation that brings points from the model coordinate system to the camera coordinate system; it is composed of a rotation vector and a translation vector (the outputs of solvePnP):
cv::Mat rvec, tvec;
cv::solvePnP(objectPoints, imagePoints, cameraMatrix, distCoeffs, rvec, tvec);
At this point you are able to project model points onto the image plane
std::vector<cv::Vec2d> imagePointsRP; // Reprojected image points
cv::projectPoints(objectPoints, rvec, tvec, cameraMatrix, distCoeffs, imagePointsRP);
Now you only need to draw the points of imagePointsRP over the incoming image; if the pose estimation was correct, you'll see the reprojected corners over the corners of the marker.
Anyway, the matrices of model TO camera and camera TO model direction can be composed as below:
cv::Mat rmat;
cv::Rodrigues(rvec, rmat); // rmat is 3x3
cv::Mat modelToCam = cv::Mat::eye(4, 4, CV_64FC1);
modelToCam(cv::Range(0, 3), cv::Range(0, 3)) = rmat * 1.0;
modelToCam(cv::Range(0, 3), cv::Range(3, 4)) = tvec * 1.0;
cv::Mat camToModel = cv::Mat::eye(4, 4, CV_64FC1);
cv::Mat rmatinv = rmat.t(); // rotation of inverse
cv::Mat tvecinv = -rmatinv * tvec; // translation of inverse
camToModel(cv::Range(0, 3), cv::Range(0, 3)) = rmatinv * 1.0;
camToModel(cv::Range(0, 3), cv::Range(3, 4)) = tvecinv * 1.0;
In any case, it's also useful to estimate the reprojection error and discard poses with a high error (remember that with only four points, the PnP problem has a unique solution only if those points are coplanar):
double totalErr = 0.0;
for (size_t i = 0; i < imagePoints.size(); i++)
{
double err = cv::norm(cv::Mat(imagePoints[i]), cv::Mat(imagePointsRP[i]), cv::NORM_L2);
totalErr += err*err;
}
totalErr = std::sqrt(totalErr / imagePoints.size());
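The model-to-camera and camera-to-model relationship used above can be checked without OpenCV. The following sketch, with hypothetical `Rigid`, `apply`, and `inverse` names and plain arrays in place of cv::Mat, inverts a rigid transform as R^T and -R^T * t; this shortcut holds only when R is a proper rotation.

```cpp
#include <array>
#include <cassert>
#include <cmath>

using Vec3 = std::array<double, 3>;
using Mat3 = std::array<std::array<double, 3>, 3>;

struct Rigid { Mat3 R; Vec3 t; };  // maps p to R * p + t

// Applies the transform to a point.
Vec3 apply(const Rigid& T, const Vec3& p)
{
    Vec3 out{};
    for (int r = 0; r < 3; ++r)
        out[r] = T.R[r][0] * p[0] + T.R[r][1] * p[1] + T.R[r][2] * p[2] + T.t[r];
    return out;
}

// Inverse of a rigid transform: rotation R^T, translation -R^T * t.
// Valid only when R is orthonormal (a pure rotation).
Rigid inverse(const Rigid& T)
{
    Rigid inv{};
    for (int r = 0; r < 3; ++r)
        for (int c = 0; c < 3; ++c)
            inv.R[r][c] = T.R[c][r];  // transpose
    for (int r = 0; r < 3; ++r)
        inv.t[r] = -(inv.R[r][0] * T.t[0] + inv.R[r][1] * T.t[1] + inv.R[r][2] * T.t[2]);
    return inv;
}
```

Applying `inverse(T)` after `T` should return the original point up to rounding, which is an easy check to run on the pose before handing it to the renderer.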

Using OpenGL ES 2.0 with iOS, how do I draw a cylinder between two points?

I am given two GLKVector3's representing the start and end points of the cylinder. Using these points and the radius, I need to build and render a cylinder. I can build a cylinder with the correct distance between the points but in a fixed direction (currently always in the y (0, 1, 0) up direction). I am not sure what kind of calculations I need to make to get the cylinder on the correct plane between the two points so that a line would run through the two end points. I am thinking there is some sort of calculations I can apply as I create my vertex data with the direction vector, or angle, that will create the cylinder pointing the correct direction. Does anyone have an algorithm, or know of one, that will help?
Are you drawing more than one of these cylinders? Or ever drawing it in a different position? If so, using the algorithm from the awesome article is a not-so-awesome idea. Every time you upload geometry data to the GPU, you incur a performance cost.
A better approach is to calculate the geometry for a single basic cylinder once — say, one with unit radius and height — and stuff that vertex data into a VBO. Then, when you draw, use a model-to-world transformation matrix to scale (independently in radius and length if needed) and rotate the cylinder into place. This way, the only new data that gets sent to the GPU with each draw call is a 4x4 matrix instead of all the vertex data for whatever polycount of cylinder you're drawing.
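A minimal sketch of such a model-to-world matrix follows. It assumes the shared cylinder mesh has radius 1 and runs along +y from (0, 0, 0) to (0, 1, 0), and it uses a column-major 4x4 array as OpenGL expects; `cylinderModelMatrix` and the helper functions are hypothetical names rather than GLKit API.

```cpp
#include <array>
#include <cassert>
#include <cmath>

struct Vec3 { double x, y, z; };

Vec3 sub(Vec3 a, Vec3 b) { return { a.x - b.x, a.y - b.y, a.z - b.z }; }
Vec3 cross(Vec3 a, Vec3 b)
{
    return { a.y * b.z - a.z * b.y, a.z * b.x - a.x * b.z, a.x * b.y - a.y * b.x };
}
Vec3 normalize(Vec3 v)
{
    const double len = std::sqrt(v.x * v.x + v.y * v.y + v.z * v.z);
    return { v.x / len, v.y / len, v.z / len };
}

// Column-major 4x4: element (row, col) lives at m[col * 4 + row].
using Mat4 = std::array<double, 16>;

// Places the unit cylinder between start and end. Requires start != end.
Mat4 cylinderModelMatrix(Vec3 start, Vec3 end, double radius)
{
    const Vec3 axis = sub(end, start);
    assert(axis.x != 0.0 || axis.y != 0.0 || axis.z != 0.0);
    const Vec3 dir = normalize(axis);
    // Pick a helper vector that is not parallel to the axis.
    const Vec3 helper = (std::fabs(dir.y) < 0.99) ? Vec3{ 0, 1, 0 } : Vec3{ 1, 0, 0 };
    const Vec3 u = normalize(cross(helper, dir));  // perpendicular to the axis
    const Vec3 w = cross(u, dir);                  // (u, dir, w) is right-handed

    return { radius * u.x, radius * u.y, radius * u.z, 0.0,   // local x, scaled by radius
             axis.x,       axis.y,       axis.z,       0.0,   // local y, scaled to the length
             radius * w.x, radius * w.y, radius * w.z, 0.0,   // local z, scaled by radius
             start.x,      start.y,      start.z,      1.0 }; // translation to the start point
}

// Transforms a point (w = 1) by a column-major matrix.
Vec3 transformPoint(const Mat4& m, Vec3 p)
{
    return { m[0] * p.x + m[4] * p.y + m[8]  * p.z + m[12],
             m[1] * p.x + m[5] * p.y + m[9]  * p.z + m[13],
             m[2] * p.x + m[6] * p.y + m[10] * p.z + m[14] };
}
```

The resulting 16 values would be converted to floats and uploaded as the model matrix uniform for each cylinder, while the vertex buffer stays untouched.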
Check this awesome article; it's dated but after adapting the algorithm, it works like a charm. One tip, OpenGL ES 2.0 only supports triangles so instead of using GL_QUAD_STRIP as the method does, use GL_TRIANGLE_STRIP instead and the result is identical. The site also contains a bunch of other useful information regarding OpenGL geometries.
See code below for solution. Self represents the mesh and contains the vertices, indices, and such.
- (instancetype)initWithOriginRadius:(CGFloat)originRadius
atOriginPoint:(GLKVector3)originPoint
andEndRadius:(CGFloat)endRadius
atEndPoint:(GLKVector3)endPoint
withPrecision:(NSInteger)precision
andColor:(GLKVector4)color
{
self = [super init];
if (self) {
// axis vector pointing from the end point to the origin point
GLKVector3 normal = GLKVector3Make(originPoint.x - endPoint.x,
originPoint.y - endPoint.y,
originPoint.z - endPoint.z);
// create two perpendicular vectors - perp and q
GLKVector3 perp = normal;
if (normal.x == 0 && normal.z == 0) {
perp.x += 1;
} else {
perp.y += 1;
}
// cross product
GLKVector3 q = GLKVector3CrossProduct(perp, normal);
perp = GLKVector3CrossProduct(normal, q);
// normalize vectors
perp = GLKVector3Normalize(perp);
q = GLKVector3Normalize(q);
// calculate vertices
CGFloat twoPi = 2 * PI;
NSInteger index = 0;
for (NSInteger i = 0; i < precision + 1; i++) {
CGFloat theta = ((CGFloat) i) / precision * twoPi; // go around circle and get points
// normals
normal.x = cosf(theta) * perp.x + sinf(theta) * q.x;
normal.y = cosf(theta) * perp.y + sinf(theta) * q.y;
normal.z = cosf(theta) * perp.z + sinf(theta) * q.z;
AGLKMeshVertex meshVertex;
AGLKMeshVertexDynamic colorVertex;
// top vertex
meshVertex.position.x = endPoint.x + endRadius * normal.x;
meshVertex.position.y = endPoint.y + endRadius * normal.y;
meshVertex.position.z = endPoint.z + endRadius * normal.z;
meshVertex.normal = normal;
meshVertex.originalColor = color;
// append vertex
[self appendVertex:meshVertex];
// append color vertex
colorVertex.colors = color;
[self appendColorVertex:colorVertex];
// append index
[self appendIndex:index++];
// bottom vertex
meshVertex.position.x = originPoint.x + originRadius * normal.x;
meshVertex.position.y = originPoint.y + originRadius * normal.y;
meshVertex.position.z = originPoint.z + originRadius * normal.z;
meshVertex.normal = normal;
meshVertex.originalColor = color;
// append vertex
[self appendVertex:meshVertex];
// append color vertex
[self appendColorVertex:colorVertex];
// append index
[self appendIndex:index++];
}
// draw command
[self appendCommand:GL_TRIANGLE_STRIP firstIndex:0 numberOfIndices:self.numberOfIndices materialName:@""];
}
return self;
}

Tile to CGPoint conversion with Retina display

I have a project that uses a tilemap. I have a separate tilemap for low-res (29x29 Tilesize) and high-res (58x58). I have these methods to calculate tileCoord to position and back again.
- (CGPoint)tileCoordForPosition:(CGPoint)position {
int x = position.x / _tileMap.tileSize.width;
int y = ((_tileMap.mapSize.height * _tileMap.tileSize.height) - position.y) / _tileMap.tileSize.height;
return ccp(x, y);
}
- (CGPoint)positionForTileCoord:(CGPoint)tileCoord {
int x = (tileCoord.x * _tileMap.tileSize.width) + _tileMap.tileSize.width/2;
int y = (_tileMap.mapSize.height * _tileMap.tileSize.height) - (tileCoord.y * _tileMap.tileSize.height) - _tileMap.tileSize.height/2;
return ccp(x, y);
}
I got this from Ray Wenderlich and I honestly do not understand how it works, or why it has to be so complicated. But this doesn't work when I use Retina tilemaps, only at 480x320. Can someone clever come up with a way to make this work for HD? It does not have to work on low-res as well; I do not plan on supporting anything below iOS 7.
I want the output to be in the low-res coordinate scale though; as you might know, cocos2d does the resizing to HD for you (by multiplying by two).
I think this will work:
- (CGPoint)tileCoordForPosition:(CGPoint)position {
    int x = position.x/29;
    int y = ((11*29)-position.y) / 29;
    
    return ccp(x, y);
}
- (CGPoint)positionForTileCoord:(CGPoint)tileCoord {
    double x = tileCoord.x * 29 + 14.5;
    double y = (11*29) - (tileCoord.y * 29) - 14.5;
    return ccp(x, y);
}
Here you're trying to compute your map X coordinate:
int x = position.x / _tileMap.tileSize.width;
The problem here is that (as of v0.99.5-rc0) cocos2d generally uses points for positions, but CCTMXTiledMap always uses pixels for tileSize. On a low-res device, 1 point = 1 pixel, but on a Retina device, 1 point = 2 pixels. Thus on a Retina device, you need to multiply by 2.
You can use the CC_CONTENT_SCALE_FACTOR() macro to fix this:
int x = CC_CONTENT_SCALE_FACTOR() * position.x / _tileMap.tileSize.width;
Here you're trying to compute your map Y coordinate:
int y = ((_tileMap.mapSize.height * _tileMap.tileSize.height) - position.y) / _tileMap.tileSize.height;
The extra math here is trying to account for the difference between Cocos2D's normal coordinate system and your map's flipped coordinate system. In standard Cartesian coordinates, the origin is at the lower left and Y coordinates increase as you move up. In a flipped coordinate system, the origin is at the upper left and Y coordinates increase as you move down. Thus you must subtract your position's Y coordinate from the height of the map (in scene units, which are points) to flip it to map coordinates.
The problem again is that _tileMap.tileSize is in pixels, not points. You can again fix that by using CC_CONTENT_SCALE_FACTOR():
CGFloat tileHeight = _tileMap.tileSize.height / CC_CONTENT_SCALE_FACTOR();
int y = ((_tileMap.mapSize.height * tileHeight) - position.y) / tileHeight;
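Putting both corrections together, the two conversions can be sketched in plain C++ as below. `TileMapInfo` is a hypothetical stand-in for the values read from the tile map and from CC_CONTENT_SCALE_FACTOR(), and `std::floor` replaces the original integer truncation, which behaves the same for non-negative positions.

```cpp
#include <cassert>
#include <cmath>

struct Point { double x, y; };

// Hypothetical stand-in for the values read from the tile map in the question.
struct TileMapInfo {
    double tileWidthPx;     // tile size in pixels (58 for the Retina map)
    double tileHeightPx;
    int    mapHeightTiles;  // map height in tiles
    double contentScale;    // pixels per point: 1 on low-res, 2 on Retina
};

// Position in points (origin bottom left) -> tile coordinate (origin top left).
Point tileCoordForPosition(const TileMapInfo& map, Point position)
{
    assert(map.contentScale > 0);
    const double tileW = map.tileWidthPx / map.contentScale;   // tile size in points
    const double tileH = map.tileHeightPx / map.contentScale;
    const double x = std::floor(position.x / tileW);
    // Flip y: subtract from the map height in points before dividing.
    const double y = std::floor((map.mapHeightTiles * tileH - position.y) / tileH);
    return { x, y };
}

// Tile coordinate -> centre of that tile, in points.
Point positionForTileCoord(const TileMapInfo& map, Point tile)
{
    assert(map.contentScale > 0);
    const double tileW = map.tileWidthPx / map.contentScale;
    const double tileH = map.tileHeightPx / map.contentScale;
    return { tile.x * tileW + tileW / 2.0,
             map.mapHeightTiles * tileH - tile.y * tileH - tileH / 2.0 };
}
```

Because the tile size is converted to points once, a 58-pixel Retina map with a scale factor of 2 and a 29-pixel low-res map with a scale factor of 1 produce the same tile coordinates for the same position.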
