What is the coordinate system used in Metal?

In Metal, what coordinate system is used inside a shader (for inputs and outputs)? When we render to a texture, is it the same? What about the Z buffer? Are there any inconsistencies? Finally, what are the differences between Metal, OpenGL and DirectX?

Metal Coordinate Systems
Metal defines several standard coordinate systems to represent transformed graphics data at
different stages along the rendering pipeline.
1) NDC (Normalized Device Coordinates): this is the space developers target when they transform their geometry in the vertex shader via the model, view and projection matrices (the vertex shader outputs clip-space positions, which become NDC after the perspective divide). The point (-1, -1) in NDC is located at the bottom left corner (+Y up).
2) Framebuffer Coordinate (viewport coordinate): when we write into an attachment, read from an attachment, or copy/blit between attachments, we use framebuffer coordinates to specify the location. The origin (0, 0) is located at the top-left corner (+Y down).
3) Texture Coordinate: when we upload a texture into memory or sample from a texture, we use texture coordinates. The origin (0, 0) is located at the top-left corner (+Y down).
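To make these spaces concrete, here is a minimal Metal Shading Language sketch (the function names and the full-screen-triangle setup are my own illustration, not code from the question): the vertex shader writes clip-space positions that become NDC after the perspective divide, the fragment shader receives framebuffer coordinates through its [[position]] input, and sampling uses texture coordinates with a top-left origin.

#include <metal_stdlib>
using namespace metal;

struct VSOut {
    float4 position [[position]]; // clip space; becomes NDC after the perspective divide
    float2 uv;                    // texture coordinate, (0, 0) = top-left of the image
};

// Full-screen triangle: NDC x and y are in [-1, 1] and (-1, -1) is the bottom-left corner (+Y up).
vertex VSOut fullscreen_vs(uint vid [[vertex_id]]) {
    float2 ndc[3] = { float2(-1.0, -1.0), float2(3.0, -1.0), float2(-1.0, 3.0) };
    VSOut out;
    out.position = float4(ndc[vid], 0.0, 1.0);
    out.uv = float2(ndc[vid].x, -ndc[vid].y) * 0.5 + 0.5; // flip Y: NDC is +Y up, texture space is +Y down
    return out;
}

// In the fragment shader, [[position]] holds the framebuffer coordinate:
// the origin is the top-left corner, +Y points down, and pixel centers are at half-integer positions.
fragment float4 copy_fs(VSOut in [[stage_in]],
                        texture2d<float> src [[texture(0)]]) {
    constexpr sampler s(filter::linear);
    return src.sample(s, in.uv);
}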
D3D12 and Metal
NDC: +Y is up. Point(-1, -1) is at the bottom left corner.
Framebuffer coordinate: +Y is down. Origin(0, 0) is at the top left corner.
Texture coordinate: +Y is down. Origin(0, 0) is at the top left corner.
OpenGL, OpenGL ES and WebGL
NDC: +Y is up. Point(-1, -1) is at the bottom left corner.
Framebuffer coordinate: +Y is up. Origin(0, 0) is at the bottom left corner.
Texture coordinate: +Y is up. Origin(0, 0) is at the bottom left corner.
Vulkan
NDC: +Y is down. Point(-1, -1) is at the top left corner.
Framebuffer coordinate: +Y is down. Origin(0, 0) is at the top left corner.
Texture coordinate: +Y is down. Origin(0, 0) is at the top left corner.
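Two practical notes that follow from this comparison. First, image rows or V coordinates authored with OpenGL's bottom-left convention need a vertical flip before they are used with Metal's (or D3D12's) top-left texture origin. Second, regarding the Z buffer part of the question: the NDC depth range also differs: Metal, D3D12 and Vulkan map depth to [0, 1], while OpenGL's default NDC depth is [-1, 1], so the projection matrix has to match the API. Below is a small host-side sketch (a hypothetical helper in plain C++, not part of any API) of the row flip; flipping the sampling coordinate as v = 1 - v achieves the same effect on the shader side.

#include <algorithm>
#include <cstddef>
#include <vector>

// Flip an image buffer vertically, e.g. pixel data laid out with OpenGL's
// bottom-left origin that is about to be uploaded as a Metal texture
// (top-left origin). Hypothetical helper, not part of any API.
void flipRows(std::vector<unsigned char>& pixels,
              std::size_t width, std::size_t height, std::size_t bytesPerPixel)
{
    const std::size_t rowBytes = width * bytesPerPixel;
    for (std::size_t y = 0; y < height / 2; ++y) {
        std::swap_ranges(pixels.begin() + y * rowBytes,
                         pixels.begin() + (y + 1) * rowBytes,
                         pixels.begin() + (height - 1 - y) * rowBytes);
    }
}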

Related

Mapping textures to 2 triangles in Roblox

I am currently trying to map textures using image labels onto 2 different triangles (because I'm using right-angle wedges, I need 2 of them to make scalene triangles). The problem is that I can only set position, size, and rotation data, so I need to figure out how to use this information to correctly map the texture onto the triangle.
The position is based on the top-left corner and the size of the triangle (the <1,1> corner is at the bottom right and the <0,0> corner is at the top left), the size is also based on the triangle's size (<1,1> is the same size as the triangle and <0,0> is infinitely tiny), and the rotation is about the center.
I have the UV coordinates (given 0-1) and face vertices, all from an obj file. The triangles in 3D are made up of 2 wedges which are split at a right angle from the longest surface and from the opposite angle.
I don't quite understand this, but it may help to change the canvas properties on the Surface GUI.

Camera projection matrix principal point

I'm a little confused about the purpose of adding the offsets of the principal point, in the camera matrix. These equations are from OpenCV Docs.
I understand all of this except for adding c_x and c_y. I've read that we do this in order to shift the origin of the projected point so that it's relative to (0, 0), the top left of the image. However, I don't know how adding the coordinates of the center of the image (the principal point) accomplishes this. I think it's simple geometry, but I'm having a hard time understanding.
Just take a look at the diagram in your question. The x/y coordinate system has its origin somewhere around the center of the image. I.e., there can be negative coordinates. The u/v coordinate system has its origin at the top left corner, i.e., there can be no negative coordinates. For the purpose of this question, I will consider the x/y coordinate system to already be scaled with fx, fy, i.e., (x, y) = (fx * x', fy * y').
What you want to do is transform the coordinates from the x/y coordinate system to the u/v coordinate system. Let's look at a few examples:
The origin in x/y (0, 0) will map to (cx, cy) in u/v.
The top left corner (i.e., (0, 0) in u/v) has the coordinates (-cx, -cy) in x/y.
You could establish many more examples. They all have in common that (u, v) = (x, y) + (cx, cy). And this is the transform stated in the equations.
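As a small numerical sketch (plain C++ with made-up intrinsics and a made-up camera-space point, not OpenCV code), adding (cx, cy) moves the origin of the projected point from the image center to the top-left corner:

#include <cstdio>

int main() {
    // Hypothetical intrinsics: focal lengths and principal point (roughly the image center).
    const double fx = 800.0, fy = 800.0;
    const double cx = 320.0, cy = 240.0;   // for a 640x480 image

    // A camera-space point slightly left of and above the optical axis.
    const double X = -0.1, Y = -0.05, Z = 2.0;

    // Perspective projection into the x/y system centered on the principal point.
    const double x = fx * (X / Z);         // -40: negative, i.e. left of the center
    const double y = fy * (Y / Z);         // -20: negative, i.e. above the center

    // Shifting by (cx, cy) re-expresses the same point with a top-left origin.
    const double u = x + cx;               // 280
    const double v = y + cy;               // 220

    std::printf("(x, y) = (%g, %g) -> (u, v) = (%g, %g)\n", x, y, u, v);
    return 0;
}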

Mapping a fish eye to a sphere - 360 degree view

I am attempting to map a fisheye image to a 360 degree view using a sky sphere in Unity. The scene is inside the sphere. I am very close but I am seeing some slight distortion. I am calculating UV coordinates as follows:
Vector3 v = currentVertice; // unit vector from edge of sphere, -1, -1, -1 to 1, 1, 1
float r = Mathf.Atan2(Mathf.Sqrt(v.x * v.x + v.y * v.y), v.z) / (Mathf.PI * 2.0f);
float phi = Mathf.Atan2(v.y, v.x);
textureCoordinates.x = (r * Mathf.Cos(phi)) + 0.5f;
textureCoordinates.y = (r * Mathf.Sin(phi)) + 0.5f;
Here is the distortion and triangles:
The rest of the entire sphere looks great, it's just at this one spot that I get the distortion.
Here is the source fish eye image:
And the same sphere with a UV test texture over the top showing the same distortion area. Full UV test texture is on the right, and is a square although stretched into a rectangle on the right for purposes of my screenshot.
The distortion above is using sphere mapping rather than fish eye mapping. Here is the UV texture using fish eye mapping:
Math isn't my strong point; am I doing anything wrong here, or is this kind of mapping simply not 100% possible?
The spot you are seeing is the case where r gets very close to 1. As you can see in the source image, this is the border area between the very distorted image data and the black.
This area is very distorted, however that's not the main problem. Looking at the result you can see that there are problems with UV orientation.
I've added a few lines to your source image to demonstrate what I mean. Where r is small (yellow lines) you can see that the UV coordinates can be interpolated between the corners of your quad (assuming quads instead of tris). However, where r is big (red corners), interpolating UV coordinates will make them travel through areas of your source image whose r is much smaller than 1 (red lines), causing distortions in UV space. Actually, those red lines should not be straight, but they should travel along the border of your source image data.
You can improve this by having a higher polycount in the area of your skysphere where r gets close to 1, but it will never be perfect as long as your UVs are interpolated in a linear way.
I also found another problem. If you look closely at the spot, you'll find that the complete source image is present there in miniature. This is because your UV coordinates wrap around at that point. As rendering passes around the viewer, the UV coordinates travel from 0 towards 1. At the spot they are at 1, while the neighboring vertex is at 0.001 or so, causing the whole source image to be rendered in between. To fix that, you'll need two separate vertices at the seam of your skysphere: one where the surface of the sphere starts, and one where it ends. In object space they are identical, but in UV space one is at 0 and the other at 1.
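A sketch of that seam fix in general form (plain C++ rather than Unity C#; the mesh layout, the helper name, and the assumption of a U coordinate that wraps at 1 together with a repeating sampler are all mine, not from the question): wherever a triangle's U coordinates straddle the seam, the wrapped vertices get duplicated with an unwrapped U so interpolation no longer sweeps across the whole texture.

#include <algorithm>
#include <cstddef>
#include <vector>

struct Vertex { float px, py, pz; float u, v; };

// Give the seam its own vertices: same object-space position, different UV.
void splitUvSeam(std::vector<Vertex>& vertices, std::vector<int>& indices)
{
    for (std::size_t t = 0; t + 2 < indices.size(); t += 3) {
        const float maxU = std::max({ vertices[indices[t]].u,
                                      vertices[indices[t + 1]].u,
                                      vertices[indices[t + 2]].u });
        for (int k = 0; k < 3; ++k) {
            const int idx = indices[t + k];
            if (maxU - vertices[idx].u > 0.5f) {   // this corner sits on the far side of the seam
                Vertex dup = vertices[idx];        // identical object-space position...
                dup.u += 1.0f;                     // ...but an unwrapped texture coordinate
                vertices.push_back(dup);
                indices[t + k] = static_cast<int>(vertices.size()) - 1;
            }
        }
    }
}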

Calculating distance between 2 homography planes that have shared ground plane

I think the easiest is to explain problem with image:
I have two cubes (of the same size) that are lying on the table. One of their sides is marked with green color (for easy tracking). I want to calculate the relative position (x,y) of the left cube to the right cube (the red line on the picture) in cube-size units.
Is it even possible? I know the problem would be simple if those two green sides shared a common plane - like the top side of the cube - but I can't use that for tracking. I would just calculate the homography for one square and multiply with the other cube's corner.
Should I 'rotate' the homography matrix by multiplying it with a 90-degree rotation matrix to get a 'ground' homography? I plan to do the processing in a smartphone scenario, so maybe the gyroscope and camera intrinsic params might be of some value.
This is possible.
Let's assume (or state) that the table is the z=0 plane and that your first box is at the origin of this plane. This means that the green corners of the left box have the (table-)coordinates (0,0,0), (1,0,0), (0,0,1) and (1,0,1). (Your box has size 1.)
You also have the pixel coordinates of these points. If you give these 2D and 3D values (as well as the intrinsics and distortion of the camera) to cv::solvePnP, you get the relative pose of the camera to your box (and the plane).
In the next step, you have to intersect the table plane with the ray that goes from your camera's center through the lower-right corner pixel of the second green box. This intersection will look like (x, y, 0), and [x-1, y] will be the translation between the right corners of your boxes.
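A sketch of this approach with OpenCV (the corner pixel positions and the intrinsics below are placeholders that would come from your detection and calibration; the image is assumed to be already undistorted):

#include <opencv2/opencv.hpp>
#include <iostream>
#include <vector>

int main()
{
    // Green-face corners of the left box in table coordinates (box size = 1, table = z=0 plane).
    std::vector<cv::Point3f> objectPoints = { {0,0,0}, {1,0,0}, {0,0,1}, {1,0,1} };
    // Their pixel positions (placeholder values) and the camera intrinsics (placeholder values).
    std::vector<cv::Point2f> imagePoints = { {150,300}, {330,300}, {150,120}, {330,120} };
    const double fx = 800, fy = 800, cx = 320, cy = 240;
    cv::Mat K = (cv::Mat_<double>(3, 3) << fx, 0, cx, 0, fy, cy, 0, 0, 1);
    cv::Mat dist = cv::Mat::zeros(5, 1, CV_64F);   // assume an already undistorted image

    // Relative pose of the camera with respect to the box/table frame.
    cv::Mat rvec, tvec, R;
    cv::solvePnP(objectPoints, imagePoints, K, dist, rvec, tvec);
    cv::Rodrigues(rvec, R);
    cv::Mat camCenter = -R.t() * tvec;             // camera center in table coordinates

    // Ray through the lower-right corner pixel of the second box (placeholder pixel).
    cv::Point2d pixel(741, 200);
    cv::Mat dirCam = (cv::Mat_<double>(3, 1) << (pixel.x - cx) / fx, (pixel.y - cy) / fy, 1.0);
    cv::Mat dirWorld = R.t() * dirCam;

    // Intersect the ray camCenter + s * dirWorld with the table plane z = 0.
    const double s = -camCenter.at<double>(2) / dirWorld.at<double>(2);
    const double x = camCenter.at<double>(0) + s * dirWorld.at<double>(0);
    const double y = camCenter.at<double>(1) + s * dirWorld.at<double>(1);
    std::cout << "offset between the right corners: (" << x - 1 << ", " << y << ")" << std::endl;
    return 0;
}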
If you have all the information (camera intrinsics) you can do it the way FooBar answered.
But you can use the information that the points lie on a plane even more directly with a homography (no need to calculate rays etc):
Compute the homography between the image plane and the ground plane.
Unfortunately you need 4 point correspondences, but there are only 3 cube-points visible in the image, touching the ground plane.
Instead you can use the top-plane of the cubes, where the same distance can be measured.
First, the code:
#include <opencv2/opencv.hpp>
#include <iostream>
#include <vector>

int main()
{
    // calibrate plane distance for boxes
    cv::Mat input = cv::imread("../inputData/BoxPlane.jpg");

    // if we had 4 known points on the ground plane, we could use the ground plane, but here we instead use the top plane
    // points on the real-world plane: height = 1 // so it's not measured on the ground plane but on the "top plane" of the cube
    std::vector<cv::Point2f> objectPoints;
    objectPoints.push_back(cv::Point2f(0,0)); // top front
    objectPoints.push_back(cv::Point2f(1,0)); // top right
    objectPoints.push_back(cv::Point2f(0,1)); // top left
    objectPoints.push_back(cv::Point2f(1,1)); // top back

    // image points:
    std::vector<cv::Point2f> imagePoints;
    imagePoints.push_back(cv::Point2f(141,302)); // top front
    imagePoints.push_back(cv::Point2f(334,232)); // top right
    imagePoints.push_back(cv::Point2f(42,231));  // top left
    imagePoints.push_back(cv::Point2f(223,177)); // top back

    cv::Point2f pointToMeasureInImage(741,200); // bottom right of second box

    // for the transform we need the point(s) to be in a vector
    std::vector<cv::Point2f> sourcePoints;
    sourcePoints.push_back(pointToMeasureInImage);
    //sourcePoints.push_back(pointToMeasureInImage);
    sourcePoints.push_back(cv::Point2f(718,141));
    sourcePoints.push_back(imagePoints[0]);

    // list with points that correspond to sourcePoints. This is not needed but used to create some output
    std::vector<int> distMeasureIndices;
    distMeasureIndices.push_back(1);
    //distMeasureIndices.push_back(0);
    distMeasureIndices.push_back(3);
    distMeasureIndices.push_back(2);

    // draw points for visualization
    for(unsigned int i=0; i<imagePoints.size(); ++i)
    {
        cv::circle(input, imagePoints[i], 5, cv::Scalar(0,255,255));
    }
    //cv::circle(input, pointToMeasureInImage, 5, cv::Scalar(0,255,255));
    //cv::line(input, imagePoints[1], pointToMeasureInImage, cv::Scalar(0,255,255), 2);

    // compute the relation between the image plane and the real-world top plane of the cubes
    cv::Mat homography = cv::findHomography(imagePoints, objectPoints);
    std::vector<cv::Point2f> destinationPoints;
    cv::perspectiveTransform(sourcePoints, destinationPoints, homography);

    // compute the distance between some defined points (here I use the input points but could be something else)
    for(unsigned int i=0; i<sourcePoints.size(); ++i)
    {
        std::cout << "distance: " << cv::norm(destinationPoints[i] - objectPoints[distMeasureIndices[i]]) << std::endl;
        cv::circle(input, sourcePoints[i], 5, cv::Scalar(0,255,255));
        // draw the line which was measured
        cv::line(input, imagePoints[distMeasureIndices[i]], sourcePoints[i], cv::Scalar(0,255,255), 2);
    }

    // just for fun, measure distances on the 2nd box:
    float distOn2ndBox = cv::norm(destinationPoints[0]-destinationPoints[1]);
    std::cout << "distance on 2nd box: " << distOn2ndBox << " which should be near 1.0" << std::endl;
    cv::line(input, sourcePoints[0], sourcePoints[1], cv::Scalar(255,0,255), 2);

    cv::imshow("input", input);
    cv::waitKey(0);
    return 0;
}
Here's the output which I want to explain:
distance: 2.04674
distance: 2.82184
distance: 1
distance on 2nd box: 0.882265 which should be near 1.0
Those distances are:
1. the yellow bottom one, from one box to the other
2. the yellow top one
3. the yellow one on the first box
4. the pink one
So the red line (the one you asked for) should have a length of almost exactly 2 x the cube side length, but as you can see there is some error.
The more accurate your pixel positions are before the homography computation, the more exact your results will be.
You need a pinhole camera model, so undistort your camera image (in a real-world application).
Keep in mind, too, that you could compute the distances on the ground plane directly if you had 4 points visible there (that don't all lie on the same line)!

How can I get the camera space position on the near plane from the object space position

As titled, I have an item with a specific position in object space defined by a single vector.
I would like to retrieve the coordinates in camera space of the projection of this vector on the near clipping plane.
In other words, I need the intersection in camera space between this vector and the plane defined by the z coordinate equal to -1 (my near plane).
I needed it for moving an object linearly with the mouse in a perspective projection.
Edit: Right now I go from object space down to window space, and then from there back up to camera space by setting the window depth window.z equal to 0, that is, the near plane.
Note that to get the camera space from the unProject I just pass in as modelview matrix an identity matrix new Mat4(1f):
public Vec3 getCameraSpacePositionOnNearPlane(Vec2i mousePoint) {
    int[] viewport = new int[]{0, 0, glViewer.getGlWindow().getWidth(), glViewer.getGlWindow().getHeight()};
    Vec3 window = new Vec3();
    window.x = mousePoint.x;
    window.y = viewport[3] - mousePoint.y - 1;
    window.z = 0;
    return Jglm.unProject(window, new Mat4(1f), glViewer.getVehicleCameraToClipMatrix(), new Vec4(viewport));
}
Is there a better way (more efficient) to get it without going down to the window space and come back to the camera one?
The most direct approach I could think of would be to simply transform your object-space position (let this be called vector x in the following) into eye space, construct a ray from the origin through that eye-space position, and calculate the intersection between that ray and the near plane z_eye = -near.
Another approach would be to fully transform into clip space. Since the near plane is z_clip = -w_clip there, you can just set the z coordinate of the result to -w and project that back to eye space by using the inverse projection matrix.
In both cases, the result will be meaningless if the point lies behind the camera, or at the camera plane z_eye = 0.
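A sketch of the first approach (plain C++ with GLM rather than the jglm/Java code above; the function name and parameters are placeholders):

#include <glm/glm.hpp>

// Bring the object-space position into eye space, then scale the ray from the
// eye (the origin of eye space) so that it ends on the near plane z_eye = -near.
glm::vec3 nearPlanePositionInEyeSpace(const glm::mat4& modelView,
                                      const glm::vec3& objectPos,
                                      float nearDist)
{
    glm::vec4 eye = modelView * glm::vec4(objectPos, 1.0f);
    // eye.z is negative for points in front of the camera; the result is
    // meaningless if the point is at or behind the camera plane z_eye = 0.
    float t = -nearDist / eye.z;
    return glm::vec3(eye) * t;
}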
