This is driving me crazy! Any help would be appreciated.
I have a 3d model that is drawn to the screen. This definitely works, and the model lies within the screen bounds. I want to calculate the screen coordinates on the CPU of a couple of the vertices.
To do this, I multiply these vertices' positions by the model/view/projection matrix in the same way my vertex shader does:
XMVECTOR pos = XMVectorSet(input.x, input.y, input.z, 1);
pos = XMVector3Transform(pos, XMLoadFloat4x4(&m_constantBufferData.model));
pos = XMVector3Transform(pos, XMLoadFloat4x4(&m_constantBufferData.view));
pos = XMVector3Transform(pos, XMLoadFloat4x4(&m_constantBufferData.projection));
I then divide pos.X and pos.Y by pos.Z.
How do I interpret the result of pos? I was expecting it to have X and Y coordinates lying between 0 and 1, or possibly -1 and 1, but I am getting numbers such as -3. Am I doing something incorrectly?
For the record, this is part of vertex shader:
float4 pos = float4(input.pos, 1.0f);
pos = mul(pos, model);
pos = mul(pos, view);
pos = mul(pos, projection);
output.pos = pos;
Thank you in advance for any help! :)
You have to divide by the w component, not the z:
float3 clipPosition = affineClipPosition.xyz / affineClipPosition.w;
If your transformation matrices are valid and applied in the correct order, your xy components should be in the range -1 to 1 and the z component between 0 and 1.
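To make the divide-by-w step concrete, here is a minimal Python sketch of the whole CPU-side path, not the asker's code. The helper names (`transform_row`, `perspective_fov_lh`, `clip_to_screen`) and the 800x600 viewport are assumptions for illustration. It uses the row-vector convention (v * M) and, as far as I know, the same matrix layout as a left-handed Direct3D perspective matrix with depth in [0, 1].

```python
import math

def transform_row(v, m):
    """Multiply a 4-component row vector by a 4x4 matrix (v * M)."""
    return [sum(v[k] * m[k][j] for k in range(4)) for j in range(4)]

def perspective_fov_lh(fov_y, aspect, near, far):
    """Left-handed perspective matrix, row-vector form, depth mapped to [0, 1]."""
    h = 1.0 / math.tan(fov_y / 2.0)
    w = h / aspect
    q = far / (far - near)
    return [
        [w, 0.0, 0.0, 0.0],
        [0.0, h, 0.0, 0.0],
        [0.0, 0.0, q, 1.0],
        [0.0, 0.0, -near * q, 0.0],
    ]

def clip_to_screen(clip, width, height):
    """Perspective divide by w, then map NDC [-1, 1] to pixels (y points down)."""
    ndc_x = clip[0] / clip[3]
    ndc_y = clip[1] / clip[3]
    ndc_z = clip[2] / clip[3]
    screen_x = (ndc_x + 1.0) * 0.5 * width
    screen_y = (1.0 - ndc_y) * 0.5 * height
    return screen_x, screen_y, ndc_z

# A view-space point straight ahead of the camera lands in the viewport centre.
proj = perspective_fov_lh(math.pi / 2, 800 / 600, 0.1, 100.0)
clip = transform_row([0.0, 0.0, 5.0, 1.0], proj)
print(clip_to_screen(clip, 800, 600))  # x = 400, y = 300, depth between 0 and 1
```

Note that w after the projection equals the view-space z here, which is why dividing by the post-projection z gives values outside the expected range.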
Affine space - Wikipedia
Just in case anyone else hits this problem: I had forgotten that I transposed the matrices before passing them to HLSL, so I needed to transpose them back before using them on the CPU:
XMVECTOR pos = XMVectorSet(input.x, input.y, input.z, 1);
pos = XMVector3Transform(pos, XMMatrixTranspose(XMLoadFloat4x4(&m_constantBufferData.model)));
pos = XMVector3Transform(pos, XMMatrixTranspose(XMLoadFloat4x4(&m_constantBufferData.view)));
pos = XMVector3Transform(pos, XMMatrixTranspose(XMLoadFloat4x4(&m_constantBufferData.projection)));
I then needed to divide the pos.X/pos.Y/pos.Z coordinates of pos by pos.W, as Lucius suggested above.
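A small Python sketch of why the stored (transposed) matrices gave wrong results. The helper names and the translation matrix are made up for illustration; the point is only that a row vector times M equals the transpose of M times a column vector, so using a transposed matrix with the row-vector convention breaks the transform.

```python
def transpose(m):
    """Return the transpose of a matrix given as a list of rows."""
    return [list(row) for row in zip(*m)]

def mul_row(v, m):
    """Row vector times matrix (v * M)."""
    return [sum(v[k] * m[k][j] for k in range(len(v))) for j in range(len(m[0]))]

def mul_col(m, v):
    """Matrix times column vector (M * v)."""
    return [sum(m[i][k] * v[k] for k in range(len(v))) for i in range(len(m))]

# Translation by (10, 20, 30) in row-vector form (translation in the last row).
translate = [
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
    [10.0, 20.0, 30.0, 1.0],
]
stored = transpose(translate)  # what ended up in the constant buffer
p = [1.0, 2.0, 3.0, 1.0]

print(mul_row(p, translate))          # intended result
print(mul_row(p, stored))             # wrong: the translation leaks into w
print(mul_row(p, transpose(stored)))  # transposing back restores the result
print(mul_col(stored, p))             # same as v * M, using M^T with a column vector
```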
Hope this helps someone!
I am currently studying shadow mapping, and my biggest issue right now is the transformations between spaces. This is my current working theory/steps.
Pass 1:
Get depth of pixel from camera, store in depth buffer
Get depth of pixel from light, store in another buffer
Pass 2:
Use texture coordinate to sample camera's depth buffer at current pixel
Convert that depth to a view space position by multiplying the projection coordinate with invProj matrix. (also do a perspective divide).
Take that view position and multiply by invV (camera's inverse view) to get a world space position
Multiply world space position by light's viewProjection matrix.
Perspective divide that projection-space coordinate, and manipulate into [0..1] to sample from light depth buffer.
Get current depth from light and closest (sampled) depth, if current depth > closest depth, it's in shadow.
Shader Code
Pass1:
PS_INPUT vs(VS_INPUT input) {
output.pos = mul(input.vPos, mvp);
output.cameraDepth = output.pos.zw;
..
float4 vPosInLight = mul(input.vPos, m);
vPosInLight = mul(vPosInLight, light.viewProj);
output.lightDepth = vPosInLight.zw;
}
PS_OUTPUT ps(PS_INPUT input){
float cameraDepth = input.cameraDepth.x / input.cameraDepth.y;
//Bundle cameraDepth in alpha channel of a normal map.
output.normal = float4(input.normal, cameraDepth);
//4 Lights in total -- although only 1 is active right now. Going to use r/g/b/a for each light depth.
output.lightDepths.r = input.lightDepth.x / input.lightDepth.y;
}
Pass 2 (Screen Quad):
float4 ps(PS_INPUT input) : SV_TARGET{
float4 pixelPosView = depthToViewSpace(input.texCoord);
..
float4 pixelPosWorld = mul(pixelPosView, invV);
float4 pixelPosLight = mul(pixelPosWorld, light.viewProj);
float shadow = shadowCalc(pixelPosLight);
//For testing / visualisation
return float4(shadow,shadow,shadow,1);
}
float4 depthToViewSpace(float2 xy) {
//Get pixel depth from camera by sampling current texcoord.
//Extract the alpha channel as this holds the depth value.
//Then, transform from [0..1] to [-1..1]
float z = (_normal.Sample(_sampler, xy).a) * 2 - 1;
float x = xy.x * 2 - 1;
float y = (1 - xy.y) * 2 - 1;
float4 vProjPos = float4(x, y, z, 1.0f);
float4 vPositionVS = mul(vProjPos, invP);
vPositionVS = float4(vPositionVS.xyz / vPositionVS.w,1);
return vPositionVS;
}
float shadowCalc(float4 pixelPosL) {
//Transform pixelPosLight from [-1..1] to [0..1]
float3 projCoords = (pixelPosL.xyz / pixelPosL.w) * 0.5 + 0.5;
float closestDepth = _lightDepths.Sample(_sampler, projCoords.xy).r;
float currentDepth = projCoords.z;
return currentDepth > closestDepth; //Supposed to have bias, but for now I just want shadows working haha
}
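For comparison, a hedged Python sketch of the light-space lookup under Direct3D conventions. This is an assumption about the setup, not the fix for the problem described here: with a D3D-style projection, NDC z already lies in [0, 1] (so it is not remapped), and texture v runs downwards (so y is flipped). The function names and the bias value are hypothetical.

```python
def light_clip_to_shadow_lookup(clip):
    """Map a light clip-space position (x, y, z, w) to (u, v, depth).

    Assumes Direct3D conventions: NDC x and y in [-1, 1], NDC z in [0, 1],
    and texture coordinate v increasing downwards.
    """
    x, y, z, w = clip
    ndc_x, ndc_y, ndc_z = x / w, y / w, z / w
    u = ndc_x * 0.5 + 0.5
    v = 0.5 - ndc_y * 0.5
    return u, v, ndc_z

def in_shadow(current_depth, closest_depth, bias=0.005):
    """Depth comparison with a small bias to reduce shadow acne."""
    return current_depth - bias > closest_depth

print(light_clip_to_shadow_lookup((0.0, 0.0, 2.0, 4.0)))
print(in_shadow(0.6, 0.5))
```

The depth written in pass 1 is z / w without remapping, so the value compared against it should be left in the same range.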
CPP Matrices
// (Position, LookAtPos, UpDir)
auto lightView = XMMatrixLookAtLH(XMLoadFloat4(&pos4), XMVectorSet(0,0,0,1), XMVectorSet(0,1,0,0));
// (FOV, AspectRatio (1000/680), NEAR, FAR)
auto lightProj = XMMatrixPerspectiveFovLH(1.57f , 1.47f, 0.01f, 10.0f);
XMStoreFloat4x4(&_cLightBuffer.light.viewProj, XMMatrixTranspose(XMMatrixMultiply(lightView, lightProj)));
Current Outputs
White signifies that a shadow should be projected there. Black indicates no shadow.
CameraPos (0, 2.5, -2)
CameraLookAt (0, 0, 0)
CameraFOV (1.57)
CameraNear (0.01)
CameraFar (10.0)
LightPos (0, 2.5, -2)
LightLookAt (0, 0, 0)
LightFOV (1.57)
LightNear (0.01)
LightFar (10.0)
If I change the CameraPosition to be (0, 2.5, 2), basically just flipped on the Z axis, this is the result.
Obviously a shadow shouldn't change its projection depending on where the observer is, so I think I'm making a mistake with the invV. But I really don't know for sure. I've debugged the light's projView matrix, and the values seem correct - going from CPU to GPU. It's also entirely possible I've misunderstood some theory along the way because this is quite a tricky technique for me.
Aha! Found my problem. It was a silly mistake, I was calculating the depth of pixels from each light, but storing them in a texture that was based on the view of the camera. The following image should explain my mistake better than I can with words.
For future reference, the solution I decided was to scrap my idea for storing light depths in texture channels. Instead, I basically make a new pass for each light, and bind a unique depth-stencil texture to render the geometry to. When I want to do light calculations, I bind each of the depth textures to a shader resource slot and go from there. Obviously this doesn't scale well with many lights, but for my student project where I'm only required to have 2 shadow casters, it suffices.
_context->DrawIndexed(indexCount, 0, 0); //Draw to regular render target
_sunlight->use(1, _context); //Use sunlight shader (basically just runs a Vertex Shader & Null Pixel shader so depth can be written to depth map)
_sunlight->bindDSVSetNullRenderTarget(_context);
_context->DrawIndexed(indexCount, 0, 0); //Draw to sunlight depth target
bindDSVSetNullRenderTarget(ctx){
ID3D11RenderTargetView* nullrv = { nullptr };
ctx->OMSetRenderTargets(1, &nullrv, _sunlightDepthStencilView);
}
//The purpose of setting a null render target before doing the draw call is
//that a draw call with only a depth target bound is much faster.
//(At least I believe so, from my reading online)
Demo almost (?) working example: https://ellie-app.com/4h9F8FNcRPya1/1
For demo: Click to draw ray, and rotate camera with left and right to see ray. (As the origin is from the camera, you can't see it from the position it is created)
Context
I am working on an elm & elm-webgl project where I would like to know if the mouse is over an object when clicked. To do this I tried to implement a simple ray cast. What I need is two things:
1) The coordinate of the camera (This one is easy)
2) The coordinate/direction in 3D space of where was clicked
Problem
The steps to get from 2D view space to 3D world space as I understand are:
a) Make coordinates to be in a range of -1 to 1 relative to view port
b) Invert projection matrix and perspective matrix
c) Multiply projection and perspective matrix
d) Create Vector4 from normalised mouse coordinates
e) Multiply combined matrices with Vector4
f) Normalise result
Try so far
I have made a function to transform a Mouse.Position to a coordinate to draw a line to:
getClickPosition : Model -> Mouse.Position -> Vec3
getClickPosition model pos =
    let
        x =
            toFloat pos.x
        y =
            toFloat pos.y
        normalizedPosition =
            ( (x * 2) / 1000 - 1, (1 - y / 1000 * 2) )
        homogeneousClipCoordinates =
            Vec4.vec4
                (Tuple.first normalizedPosition)
                (Tuple.second normalizedPosition)
                -1
                1
        inversedProjectionMatrix =
            Maybe.withDefault Mat4.identity (Mat4.inverse (camera model))
        inversedPerspectiveMatrix =
            Maybe.withDefault Mat4.identity (Mat4.inverse perspective)
        inversedMatrix2 =
            Mat4.mul inversedProjectionMatrix inversedPerspectiveMatrix
        to =
            Vec4.vec4
                (Tuple.first normalizedPosition)
                (Tuple.second normalizedPosition)
                1
                1
        toInversed =
            mulVector inversedMatrix2 to
        toNorm =
            Vec4.normalize toInversed
        toVec3 =
            vec3 (Vec4.getX toNorm) (Vec4.getY toNorm) (Vec4.getZ toNorm)
    in
    toVec3
Result
The result of this function is that the rays end up too close to the center compared to where I click. I added a screenshot where I clicked on all four corners of the top face of the cube. If I click on the center of the viewport, the ray is positioned correctly.
It feels close, but not quite there yet and I can't figure out what I am doing wrong!
After trying other approaches I found a solution:
getClickPosition : Model -> Mouse.Position -> Vec3
getClickPosition model pos =
    let
        x =
            toFloat pos.x
        y =
            toFloat pos.y
        normalizedPosition =
            ( (x * 2) / 1000 - 1, (1 - y / 1000 * 2) )
        homogeneousClipCoordinates =
            Vec4.vec4
                (Tuple.first normalizedPosition)
                (Tuple.second normalizedPosition)
                -1
                1
        inversedViewMatrix =
            Maybe.withDefault Mat4.identity (Mat4.inverse (camera model))
        inversedProjectionMatrix =
            Maybe.withDefault Mat4.identity (Mat4.inverse perspective)
        vec4CameraCoordinates = mulVector inversedProjectionMatrix homogeneousClipCoordinates
        direction = Vec4.vec4 (Vec4.getX vec4CameraCoordinates) (Vec4.getY vec4CameraCoordinates) -1 0
        vec4WorldCoordinates = mulVector inversedViewMatrix direction
        vec3WorldCoordinates = vec3 (Vec4.getX vec4WorldCoordinates) (Vec4.getY vec4WorldCoordinates) (Vec4.getZ vec4WorldCoordinates)
        normalizedVec3WorldCoordinates = Vec3.normalize vec3WorldCoordinates
        origin = model.cameraPos
        scaledDirection = Vec3.scale 20 normalizedVec3WorldCoordinates
        destination = Vec3.add origin scaledDirection
    in
    destination
I left it as verbose as possible, if someone finds I use incorrect terminology please make a comment and I will update the answer.
I am sure there are lots of optimisations possible (Multiplying matrices before inverting or combining some of the steps.)
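One possible shortcut, sketched in Python rather than Elm: for a symmetric perspective projection and a look-at camera, the two matrix inversions can be replaced by a closed form built from the field of view and the camera basis. This is an equivalent formulation under those assumptions, not a drop-in for the code above; all function and parameter names here are hypothetical.

```python
import math

def sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def add(a, b):
    return (a[0] + b[0], a[1] + b[1], a[2] + b[2])

def scale(a, s):
    return (a[0] * s, a[1] * s, a[2] * s)

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def normalize(a):
    length = math.sqrt(a[0] ** 2 + a[1] ** 2 + a[2] ** 2)
    return (a[0] / length, a[1] / length, a[2] / length)

def click_ray_direction(mouse_x, mouse_y, width, height, fov_y_deg, eye, target, up):
    """World-space unit direction of the ray through a clicked pixel."""
    # Pixel -> normalised device coordinates in [-1, 1], y pointing up.
    ndc_x = 2.0 * mouse_x / width - 1.0
    ndc_y = 1.0 - 2.0 * mouse_y / height
    # Undo the projection: scale by the half-extent of the view frustum at depth 1.
    tan_half = math.tan(math.radians(fov_y_deg) / 2.0)
    dx = ndc_x * (width / height) * tan_half
    dy = ndc_y * tan_half
    # Undo the view transform: express the direction in the camera basis.
    forward = normalize(sub(target, eye))
    right = normalize(cross(forward, up))
    true_up = cross(right, forward)
    direction = add(add(scale(right, dx), scale(true_up, dy)), forward)
    return normalize(direction)

# Clicking the centre of a 1000x1000 viewport gives the camera's view direction.
print(click_ray_direction(500, 500, 1000, 1000, 90, (0, 0, 5), (0, 0, 0), (0, 1, 0)))
```

The destination point is then the camera position plus this direction scaled by the desired ray length, as in the Elm code.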
Updated the ellie app here: https://ellie-app.com/4hZ9s8S92PSa1/0
I am working on a project which involves ArUco markers and OpenCV.
I am quite far along in the project. I can read the rotation vectors and convert them to a Rodrigues matrix using Rodrigues() from OpenCV.
This is an example of a Rodrigues matrix I get:
[0,1,0;
1,0,0;
0,0,-1]
I use the following code.
Mat m33(3, 3, CV_64F);
Mat measured_eulers(3, 1, CV_64F);
Rodrigues(rotationVectors, m33);
measured_eulers = rot2euler(m33);
Degree_euler = measured_eulers * 180 / CV_PI;
I use the predefined rot2euler to convert from rodrigues matrix to euler angles.
And I convert the received radians to degrees.
rot2euler looks like the following.
Mat rot2euler(const Mat & rotationMatrix)
{
Mat euler(3, 1, CV_64F);
double m00 = rotationMatrix.at<double>(0, 0);
double m02 = rotationMatrix.at<double>(0, 2);
double m10 = rotationMatrix.at<double>(1, 0);
double m11 = rotationMatrix.at<double>(1, 1);
double m12 = rotationMatrix.at<double>(1, 2);
double m20 = rotationMatrix.at<double>(2, 0);
double m22 = rotationMatrix.at<double>(2, 2);
double x, y, z;
// Assuming the angles are in radians.
if (m10 > 0.998) { // singularity at north pole
x = 0;
y = CV_PI / 2;
z = atan2(m02, m22);
}
else if (m10 < -0.998) { // singularity at south pole
x = 0;
y = -CV_PI / 2;
z = atan2(m02, m22);
}
else
{
x = atan2(-m12, m11);
y = asin(m10);
z = atan2(-m20, m00);
}
euler.at<double>(0) = x;
euler.at<double>(1) = y;
euler.at<double>(2) = z;
return euler;
}
If I use the rodrigues matrix I give as an example I get the following euler angles.
[0; 90; -180]
But I am supposed to get the following.
[-180; 0; 90]
When I use this tool http://danceswithcode.net/engineeringnotes/rotations_in_3d/demo3D/rotations_in_3d_tool.html
You can see that [0; 90; -180] doesn't match the rodrigues matrix but [-180; 0; 90] does. (I am aware of the fact that the tool works with ZYX coordinates)
So the problem is I get the correct values but in a wrong order.
Another problem is that this isn't always the case.
For example rodrigues matrix:
[1,0,0;
0,-1,0;
0,0,-1]
Provides me the correct euler angles.
If someone knows a solution to this problem, or can explain how exactly the rot2euler function works, it would be highly appreciated.
Kind Regards
Brent Convens
I guess I am quite late but I'll answer it nonetheless.
Don't quote me on this (I'm not 100% certain), but this is one of the files ( {OPENCV_INSTALLATION_DIR}/apps/interactive-calibration/rotationConverters.cpp ) from the source code of OpenCV 3.3.
It seems to me that OpenCV is giving you Y-Z-X (similar to what is shown in the code above).
The reason I'm not sure is that I just looked at the source code of cv::Rodrigues, and it doesn't seem to call the piece of code shown above. The Rodrigues function has the math hardcoded into it. I think the order can be checked by multiplying the rotation matrices as R = Ry * Rz * Rx and then looking for the place in the code where there is an acos(R(2,0)) or asin(R(0,2)) or something similar, since one of the elements of R will usually be a single cos() or sin(), which tells you which angle is being solved for.
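To see which branch rot2euler takes for the matrices in the question, here is a direct Python port of that function (same element accesses and thresholds, nothing OpenCV-specific). For the first example m10 is 1, so the "north pole" branch runs: x is pinned to 0 and the remaining rotation goes into z. The second example has m10 = 0 and goes through the regular branch, which may be why that one looks right.

```python
import math

def rot2euler(m):
    """Python port of the rot2euler function from the question (angles in radians)."""
    m00, m02 = m[0][0], m[0][2]
    m10, m11, m12 = m[1][0], m[1][1], m[1][2]
    m20, m22 = m[2][0], m[2][2]
    if m10 > 0.998:       # singularity at north pole
        x = 0.0
        y = math.pi / 2
        z = math.atan2(m02, m22)
    elif m10 < -0.998:    # singularity at south pole
        x = 0.0
        y = -math.pi / 2
        z = math.atan2(m02, m22)
    else:
        x = math.atan2(-m12, m11)
        y = math.asin(m10)
        z = math.atan2(-m20, m00)
    return x, y, z

first = [[0, 1, 0], [1, 0, 0], [0, 0, -1]]
second = [[1, 0, 0], [0, -1, 0], [0, 0, -1]]
print([math.degrees(a) for a in rot2euler(first)])   # singular branch: x = 0, y = 90
print([math.degrees(a) for a in rot2euler(second)])  # regular branch
```

Whether z comes out as +180 or -180 depends on the sign of a zero entry in the matrix, so the two are the same rotation.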
Not specific to OpenCV, but you could write something like this:
import math

# pose_mat is a 3x3 rotation matrix; the angles follow the Z-Y-X (yaw, pitch, roll) convention.
cosine_for_pitch = math.sqrt(pose_mat[0][0] ** 2 + pose_mat[1][0] ** 2)
is_singular = cosine_for_pitch < 10**-6
if not is_singular:
    yaw = math.atan2(pose_mat[1][0], pose_mat[0][0])
    pitch = math.atan2(-pose_mat[2][0], cosine_for_pitch)
    roll = math.atan2(pose_mat[2][1], pose_mat[2][2])
else:
    # Gimbal lock: yaw and roll are coupled, so roll is fixed to 0.
    yaw = math.atan2(-pose_mat[1][2], pose_mat[1][1])
    pitch = math.atan2(-pose_mat[2][0], cosine_for_pitch)
    roll = 0
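Wrapped as a function and applied to the example matrix from the question, this is a quick way to check the result. The function name `rotation_matrix_to_zyx_euler` is made up for this sketch; the formulas are the ones above, which assume the Z-Y-X order.

```python
import math

def rotation_matrix_to_zyx_euler(pose_mat):
    """Return (roll, pitch, yaw) in radians for a 3x3 rotation matrix, Z-Y-X order."""
    cosine_for_pitch = math.sqrt(pose_mat[0][0] ** 2 + pose_mat[1][0] ** 2)
    if cosine_for_pitch >= 1e-6:
        yaw = math.atan2(pose_mat[1][0], pose_mat[0][0])
        pitch = math.atan2(-pose_mat[2][0], cosine_for_pitch)
        roll = math.atan2(pose_mat[2][1], pose_mat[2][2])
    else:
        # Gimbal lock: only one of yaw/roll can be recovered, so roll is set to 0.
        yaw = math.atan2(-pose_mat[1][2], pose_mat[1][1])
        pitch = math.atan2(-pose_mat[2][0], cosine_for_pitch)
        roll = 0.0
    return roll, pitch, yaw

example = [[0, 1, 0], [1, 0, 0], [0, 0, -1]]
roll, pitch, yaw = rotation_matrix_to_zyx_euler(example)
print([math.degrees(a) for a in (roll, pitch, yaw)])
```

For the example matrix this gives a roll of 180 degrees in magnitude, a pitch of 0 and a yaw of 90, which is the same rotation as the expected [-180; 0; 90].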
Here, you could explore more:
https://www.learnopencv.com/rotation-matrix-to-euler-angles/
http://www.staff.city.ac.uk/~sbbh653/publications/euler.pdf
I suggest using the PCL library for this:
pcl::getEulerAngles(transformation, roll, pitch, yaw);
You just need to declare roll, pitch and yaw and pass in a pre-computed transformation matrix.
How can I calculate eye space intersection coordinates in an OptiX program?
My research showed that only object and world coordinates are provided, but I cannot believe that there is no way to get the eye space coordinates.
It is possible to rotate the intersection point by the camera orientation like this:
__device__ void worldToEye(float3& pointInOut)
{
const float3 Un = normalize(U);
const float3 Vn = normalize(V);
const float3 Wn = normalize(W);
const float viewMat[3][3] = {{Un.x, Un.y, Un.z},
{Vn.x, Vn.y, Vn.z},
{Wn.x, Wn.y, Wn.z}};
float point[3] = {pointInOut.x, pointInOut.y, pointInOut.z};
float result[3] = {0.0f, 0.0f, 0.0f};
for (int i=0; i<3; ++i)
{
for (int j=0; j<3; ++j)
{
result[i] += viewMat[i][j] * point[j];
}
}
pointInOut.x = result[0];
pointInOut.z = result[1];
pointInOut.y = result[2];
}
With the input point calculated:
float3 hit_point = t_hit * ray.direction;
worldToEye(hit_point);
prd.result = hit_point;
OptiX has no eye coordinates because it is based on ray tracing, not rasterization. First ask yourself what eye coordinates are used for in rasterization-based shaders: basically depth testing, clipping and so on. None of these exist in ray-tracing programs. When a ray is cast from a point in world coordinates in a certain direction, everything that follows is done in world coordinates. There is no clipping, because every ray already stands for a specific pixel, and there is no depth test, because all hits are found in the intersection program and only the nearest hit point is delivered to the closest hit program. In conclusion, you should let go of some of the mechanisms and pipeline stages used in rasterization-based shading and pick up the new techniques used in ray-tracing-based shading.
Apologies for my poor English :)
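If camera-relative coordinates are still wanted (for example to write out a depth value), they can be derived from the world-space hit point. A minimal Python sketch, assuming U, V and W are the camera's right, up and forward vectors as in the question, that they are mutually orthogonal, and that the camera position `eye` is known; the function name is hypothetical.

```python
import math

def dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def normalize(a):
    length = math.sqrt(dot(a, a))
    return (a[0] / length, a[1] / length, a[2] / length)

def world_to_camera(point, eye, U, V, W):
    """Express a world-space point in the camera basis (right, up, forward)."""
    rel = sub(point, eye)  # translate so the camera sits at the origin
    return (dot(rel, normalize(U)),
            dot(rel, normalize(V)),
            dot(rel, normalize(W)))

# Camera at z = 5 looking down the negative z axis.
print(world_to_camera((1.0, 1.0, 0.0), (0.0, 0.0, 5.0),
                      (2.0, 0.0, 0.0), (0.0, 2.0, 0.0), (0.0, 0.0, -3.0)))
```

The third component is the distance along the view direction; for an OpenGL-style eye space, where the camera looks down negative z, it would be negated.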
I have to draw selection feedback like Photoshop's in my DirectX application. I came across an algorithm on Wikipedia to do this, but I am not sure it's the right way to do it, especially since my selection area could be any arbitrary geometry. Has anyone implemented it using DirectX? Any hints are much appreciated.
Based on my comment here is a simple pixel shader to achieve the wanted result:
float4 PS( float4 pos : SV_POSITION) : SV_Target
{
float w = ((int)(pos.x + pos.y + t) % 8);
return (w < 4 ? float4(0,0,0,1) : float4(1,1,1,1));
}
x and y are added to produce the diagonal stripe pattern. You can imagine it as follows: If y is constant and x increases by 1, w also increases by 1. The same applies for y. So for w to stay constant, you have to go (x+1, y-1) or (x-1, y+1) (or other step sizes). We use the % operator to produce a periodicity of 8 pixels. The first half period is filled black and the second half white.
This is an equivalent, but more performant shader. It uses bit operations instead of modulo and comparisons.
float4 PS( float4 pos : SV_POSITION) : SV_Target
{
int w = ((int)(pos.x + pos.y + t) & 4);
return float4(w,w,w,1);
}
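A quick way to convince yourself that the two shaders agree is to evaluate both formulas on the CPU. This Python sketch mirrors them for non-negative pixel coordinates; the function names are made up, and the second shader's value of 4 is treated as "white" on the assumption that the render target clamps colours to [0, 1].

```python
def stripe_modulo(x, y, t):
    """First shader: 0 (black) for the first half of each 8-pixel period, else 1 (white)."""
    return 0 if int(x + y + t) % 8 < 4 else 1

def stripe_bitmask(x, y, t):
    """Second shader: bit 2 of the sum selects white; 4 clamps to 1 on output."""
    return 1 if int(x + y + t) & 4 else 0

def render_row(y, t, width=16):
    """Draw one row of the pattern as text, '#' for white and '.' for black."""
    return "".join("#" if stripe_modulo(x, y, t) else "." for x in range(width))

for y in range(4):
    print(render_row(y, 0))
```

Each row is shifted one pixel to the left relative to the row above, which gives the diagonal stripes; increasing t shifts every row in the same way, which gives the animation.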