Vulkan Ray Tracing - Any Hit Shader doesn't write to buffer

I set up a minimal ray tracing pipeline in Vulkan, with any hit and closest hit shaders that write into buffers and ray payloads.
The problem is that buffer writes from the any hit shader seem not to take effect.
Here is the source code for the closest hit shader:
layout(set = 0, binding = 0, std430) writeonly buffer RayStatusBuffer {
uint items[];
} gRayStatus;
layout(location = 0) rayPayloadInEXT uint gRayPayload;
void main(void)
{
gRayStatus.items[0] = 1;
gRayPayload = 2;
}
The any hit shader code is identical, except for writing 3 and 4 for ray status buffer item and ray payload, respectively.
The buffer associated with gRayStatus is initialized to 0 and fed to the pipeline with:
VkDescriptorSetLayoutBinding statusLB{};
statusLB.binding = 0;
statusLB.descriptorType = VK_DESCRIPTOR_TYPE_STORAGE_BUFFER;
statusLB.descriptorCount = 1;
statusLB.stageFlags = VK_SHADER_STAGE_ANY_HIT_BIT_KHR | VK_SHADER_STAGE_CLOSEST_HIT_BIT_KHR;
By calling traceRayEXT(..., flags = 0, ...) from the raygen shader, I can read back the values 1 and 2 for ray status buffer item and ray payload, respectively, as expected.
But when calling traceRayEXT(..., flags = gl_RayFlagsSkipClosestHitShaderEXT, ...) I would expect the output of the any hit shader (3 and 4) to be present. Instead I get 0 and 4, as if the buffer write had been ignored.
Any ideas on this?

Sorry for the late response.
From what I know, there could be two causes:
1° The any hit shaders are not called because of the VkGeometryFlagBitsKHR flags (for example VK_GEOMETRY_OPAQUE_BIT_KHR) set in the VkAccelerationStructureGeometryKHR struct used during the creation of the BLAS.
2° The conditions under which any hit shaders are called. Looking at this image helped me a lot: https://www.google.com/search?q=DxT+rtx+pipeline&tbm=isch&ved=2ahUKEwiTko-Gv-f8AhXCsEwKHUILD4IQ2-cCegQIABAA&oq=DxT+rtx+pipeline&gs_lcp=CgNpbWcQAzoECCMQJzoFCAAQgAQ6BggAEAUQHjoGCAAQCBAeOgQIABAeOgcIABCABBAYUKEPWIg5YPc5aAlwAHgAgAFmiAGyD5IBBDE4LjSYAQCgAQGqAQtnd3Mtd2l6LWltZ8ABAQ&sclient=img&ei=06DTY9PcIMLhsgLClryQCA&bih=1067&biw=1920&client=firefox-b-d#imgrc=38W16ovqUoCyRM
As you can see from the picture, an any hit shader is called only if the hit is the closest one found so far and the geometry is not opaque.
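To make the opacity rule concrete, here is a minimal sketch in plain C++ of how the effective opacity of a hit is resolved. The enum and function names are hypothetical stand-ins (not Vulkan API) so that it compiles without the SDK; the comments name the real flags they model. It assumes the usual precedence, where ray flags override instance flags, which override the geometry flag.

```cpp
#include <cassert>

// Hypothetical stand-in for the "force opaque / force no-opaque" overrides.
enum class Force { None, Opaque, NoOpaque };

// Returns true when an any hit shader may be invoked for an intersection.
// geometryOpaque : geometry was built with VK_GEOMETRY_OPAQUE_BIT_KHR
// instanceForce  : VK_GEOMETRY_INSTANCE_FORCE_OPAQUE_BIT_KHR or
//                  VK_GEOMETRY_INSTANCE_FORCE_NO_OPAQUE_BIT_KHR on the instance
// rayForce       : gl_RayFlagsOpaqueEXT or gl_RayFlagsNoOpaqueEXT in traceRayEXT
constexpr bool anyHitMayRun(bool geometryOpaque, Force instanceForce, Force rayForce)
{
    bool opaque = geometryOpaque;
    if (instanceForce == Force::Opaque)   opaque = true;
    if (instanceForce == Force::NoOpaque) opaque = false;
    if (rayForce == Force::Opaque)        opaque = true;   // ray flags win last
    if (rayForce == Force::NoOpaque)      opaque = false;
    return !opaque;
}
```

For instance, anyHitMayRun(true, Force::None, Force::None) is false: geometry marked opaque never reaches the any hit shader unless an instance or ray flag forces it to be non-opaque.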


(DX12 Shadow Mapping) Depth buffer is always filled with 1

I'm really new to graphics programming in general, so please bear with me. I am trying to add shadow mapping from a distant light (orthogonal projection) into my scene, but when I follow the (very incomplete) steps from Frank Luna's DX12 book I find that my SRV for the shadow map is just filled with depths of 1.
If it helps, here is my SRV definition:
D3D12_TEX2D_SRV texDesc = {
0,
-1,
0,
0.0f
};
D3D12_SHADER_RESOURCE_VIEW_DESC srvDesc = {
DXGI_FORMAT_R32_TYPELESS,
D3D12_SRV_DIMENSION_TEXTURE2D,
D3D12_DEFAULT_SHADER_4_COMPONENT_MAPPING,
};
srvDesc.Texture2D = texDesc;
m_device->CreateShaderResourceView(m_lightDepthTexture.Get(),&srvDesc, m_cbvHeap->GetCPUDescriptorHandleForHeapStart());
and here are my DSV heap and descriptor definitions:
D3D12_DESCRIPTOR_HEAP_DESC dsvHeapDesc = {};
dsvHeapDesc.NumDescriptors = 2;
dsvHeapDesc.Type = D3D12_DESCRIPTOR_HEAP_TYPE_DSV;
dsvHeapDesc.Flags = D3D12_DESCRIPTOR_HEAP_FLAG_NONE;
ThrowIfFailed(m_device->CreateDescriptorHeap(&dsvHeapDesc, IID_PPV_ARGS(&m_dsvHeap)));
D3D12_DEPTH_STENCIL_VIEW_DESC depthStencilDesc = {};
depthStencilDesc.Format = DXGI_FORMAT_D32_FLOAT;
depthStencilDesc.ViewDimension = D3D12_DSV_DIMENSION_TEXTURE2D;
depthStencilDesc.Flags = D3D12_DSV_FLAG_NONE;
CD3DX12_HEAP_PROPERTIES heapProps = CD3DX12_HEAP_PROPERTIES(D3D12_HEAP_TYPE_DEFAULT);
CD3DX12_RESOURCE_DESC resourceDesc = CD3DX12_RESOURCE_DESC::Tex2D(DXGI_FORMAT_R32_TYPELESS, m_width, m_height, 1, 0, 1, 0, D3D12_RESOURCE_FLAG_ALLOW_DEPTH_STENCIL);
D3D12_CLEAR_VALUE depthOptimizedClearValue = {};
depthOptimizedClearValue.Format = DXGI_FORMAT_D32_FLOAT;
depthOptimizedClearValue.DepthStencil.Depth = 1.0f;
depthOptimizedClearValue.DepthStencil.Stencil = 0;
ThrowIfFailed(m_device->CreateCommittedResource(
&heapProps,
D3D12_HEAP_FLAG_NONE,
&resourceDesc,
D3D12_RESOURCE_STATE_DEPTH_WRITE,
&depthOptimizedClearValue,
IID_PPV_ARGS(&m_dsvBuffer)
));
D3D12_RESOURCE_DESC texDesc;
ZeroMemory(&texDesc, sizeof(D3D12_RESOURCE_DESC));
texDesc.Dimension = D3D12_RESOURCE_DIMENSION_TEXTURE2D;
texDesc.Alignment = 0;
texDesc.Width = m_width;
texDesc.Height = m_height;
texDesc.DepthOrArraySize = 1;
texDesc.MipLevels = 1;
texDesc.Format = DXGI_FORMAT_R32_TYPELESS;
texDesc.SampleDesc.Count = 1;
texDesc.SampleDesc.Quality = 0;
texDesc.Layout = D3D12_TEXTURE_LAYOUT_UNKNOWN;
texDesc.Flags = D3D12_RESOURCE_FLAG_ALLOW_DEPTH_STENCIL;
ThrowIfFailed(m_device->CreateCommittedResource(
&heapProps,
D3D12_HEAP_FLAG_NONE,
&texDesc,
D3D12_RESOURCE_STATE_GENERIC_READ,
&depthOptimizedClearValue,
IID_PPV_ARGS(&m_lightDepthTexture)
));
CD3DX12_CPU_DESCRIPTOR_HANDLE dsv(m_dsvHeap->GetCPUDescriptorHandleForHeapStart());
m_device->CreateDepthStencilView(m_dsvBuffer.Get(), &depthStencilDesc, dsv);
dsv.Offset(1, m_device->GetDescriptorHandleIncrementSize(D3D12_DESCRIPTOR_HEAP_TYPE_DSV));
m_device->CreateDepthStencilView(m_lightDepthTexture.Get(), &depthStencilDesc, dsv);
I then created a basic vertex shader that just transforms the vertices with my map (from Frank Luna's book, page 648,650). Since I bound the m_lightDepthTexture to D3D12GraphicsCommandList::OMSetRenderTargets, I assumed that the depth values would be written onto m_lightDepthTexture. But simply sampling this texture in my main pass proves that the values are actually 1.0f. So nothing actually happened on my shadow pass!
I really have no idea what to ask, but if anyone has a sample DX12 shadow map I could see (Google comes up with DX11 or less, or much too complicated samples), or if there's a good source to learn about this, please let me know!
EDIT: I should say that I changed the format from DXGI_FORMAT_D24_UNORM_S8_UINT, as I think the extra 8 bits for stencil are irrelevant to my case. I changed back to the book format and nothing changed, so I think this format should be fine.
If you remove the unnecessary return ret; from your shadow vertex shader, the problem then seems to be in the winding order of the vertices of your sphere. You can easily verify this by setting the cull mode to D3D12_CULL_MODE_NONE for your shadow PSO.
You can easily correct your sphere's winding order by switching the order of any two vertices of every triangle, so wherever you have p1,p2,p3 you just write it, for example, as p1,p3,p2.
You will also need to check the matrix multiplication order in your vertex shaders. I didn't check it in detail, but it's inconsistent, and I believe it is the reason why the sphere will appear black once you fix the above issue. You also seem to be missing the division by w for your light coords in the lighting vertex shader.
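As a small illustration of the winding fix, the following sketch reverses the winding of an indexed triangle list by swapping the second and third index of every triangle. It assumes the sphere is drawn from a 32-bit index buffer; the function name flipWinding is made up for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Reverses the winding of every triangle in an indexed triangle list:
// (p1, p2, p3) becomes (p1, p3, p2).
void flipWinding(std::vector<std::uint32_t>& indices)
{
    assert(indices.size() % 3 == 0 && "expected a triangle list");
    for (std::size_t i = 0; i + 2 < indices.size(); i += 3)
        std::swap(indices[i + 1], indices[i + 2]);
}
```

Applying it twice restores the original order, which makes it easy to toggle while checking which winding the PSO expects.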

How can I update a dynamic vertex buffer quickly?

I'm trying to make a simple 3D modeling tool.
Transforming the model involves moving a vertex (or several vertices).
I used a dynamic vertex buffer because I thought it would need frequent updates.
But performance is too low on a high-polygon model, even when I change just one vertex.
Are there other methods, or am I doing this the wrong way?
Here is my D3D11_BUFFER_DESC:
Usage = D3D11_USAGE_DYNAMIC;
CPUAccessFlags = D3D11_CPU_ACCESS_WRITE;
BindFlags = D3D11_BIND_VERTEX_BUFFER;
ByteWidth = sizeof(ST_Vertex) * _nVertexCount
D3D11_SUBRESOURCE_DATA d3dBufferData;
d3dBufferData.pSysMem = pVerticesInfo;
hr = pd3dDevice->CreateBuffer(&descBuffer, &d3dBufferData, &_pVertexBuffer);
And my update function:
D3D11_MAPPED_SUBRESOURCE d3dMappedResource;
pImmediateContext->Map(_pVertexBuffer, 0, D3D11_MAP_WRITE_DISCARD, 0, &d3dMappedResource);
ST_Vertex* pBuffer = (ST_Vertex*)d3dMappedResource.pData;
for (int i = 0; i < vIndice.size(); ++i)
{
pBuffer[vIndice[i]].xfPosition.x = pVerticesInfo[vIndice[i]].xfPosition.x;
pBuffer[vIndice[i]].xfPosition.y = pVerticesInfo[vIndice[i]].xfPosition.y;
pBuffer[vIndice[i]].xfPosition.z = pVerticesInfo[vIndice[i]].xfPosition.z;
}
pImmediateContext->Unmap(_pVertexBuffer, 0);
As mentioned in the previous answer, you are updating your whole buffer every time, which will be slow depending on model size.
The solution is indeed to implement partial updates. There are two cases: either you want to update a single vertex, or you want to update arbitrary indices (for example, you want to move N vertices in one go, in different locations, like vertices 1, 20 and 23).
The first case is rather simple. First create your buffer with the following description:
D3D11_BUFFER_DESC descBuffer = {};
descBuffer.Usage = D3D11_USAGE_DEFAULT;
descBuffer.CPUAccessFlags = 0;
descBuffer.BindFlags = D3D11_BIND_VERTEX_BUFFER;
descBuffer.ByteWidth = sizeof(ST_Vertex) * _nVertexCount;
D3D11_SUBRESOURCE_DATA d3dBufferData = {};
d3dBufferData.pSysMem = pVerticesInfo;
hr = pd3dDevice->CreateBuffer(&descBuffer, &d3dBufferData, &_pVertexBuffer);
This makes sure your vertex buffer is visible to the GPU only.
Next, create a second dynamic buffer which has the size of a single vertex (you do not need any bind flags in that case, as it will be used only for copies):
// _pCopyVertexBuffer
D3D11_BUFFER_DESC copyDesc = {};
copyDesc.Usage = D3D11_USAGE_DYNAMIC; // staging works as well
copyDesc.CPUAccessFlags = D3D11_CPU_ACCESS_WRITE;
copyDesc.BindFlags = 0;
copyDesc.ByteWidth = sizeof(ST_Vertex);
// No initial data: pass nullptr rather than a D3D11_SUBRESOURCE_DATA whose pSysMem is NULL
hr = pd3dDevice->CreateBuffer(&copyDesc, nullptr, &_pCopyVertexBuffer);
When you move a vertex, copy the changed vertex into the copy buffer:
ST_Vertex changedVertex;
D3D11_MAPPED_SUBRESOURCE d3dMappedResource;
// Map the small copy buffer, not the GPU-only vertex buffer
pImmediateContext->Map(_pCopyVertexBuffer, 0, D3D11_MAP_WRITE_DISCARD, 0, &d3dMappedResource);
ST_Vertex* pBuffer = (ST_Vertex*)d3dMappedResource.pData;
pBuffer->xfPosition.x = changedVertex.xfPosition.x;
pBuffer->xfPosition.y = changedVertex.xfPosition.y;
pBuffer->xfPosition.z = changedVertex.xfPosition.z;
pImmediateContext->Unmap(_pCopyVertexBuffer, 0);
Since you use D3D11_MAP_WRITE_DISCARD, make sure to write all attributes there (not only position).
Once you are done, you can use ID3D11DeviceContext::CopySubresourceRegion to copy only the modified vertex to its location in the vertex buffer.
I assume that vertexID is the index of the modified vertex:
pImmediateContext->CopySubresourceRegion(_pVertexBuffer,
0, //must be 0
vertexID * sizeof(ST_Vertex), //byte offset of the vertex in your GPU vertex buffer
0, //must be 0
0, //must be 0
_pCopyVertexBuffer,
0, //must be 0
NULL //in this case we copy the full content of _pCopyVertexBuffer, so the source box can be NULL
);
Now if you want to update a list of vertices, things get more complicated and you have several options:
-First, you can apply this single vertex technique in a loop; this works quite well if your changeset is small.
-If your changeset is very big (close to the full vertex count), you can probably rewrite the whole buffer instead.
-An intermediate technique is to use a compute shader to perform the updates (that's the one I normally use, as it's the most flexible version).
Posting all the C++ binding code would be way too long, but here is the concept:
-Your vertex buffer must have BindFlags = D3D11_BIND_VERTEX_BUFFER | D3D11_BIND_UNORDERED_ACCESS; //this allows writing with compute
-You need to create an ID3D11UnorderedAccessView for this buffer (so the shader can write to it).
-You need the following misc flag: D3D11_RESOURCE_MISC_BUFFER_ALLOW_RAW_VIEWS //this allows writing as RWByteAddressBuffer
-You then create two dynamic structured buffers (I prefer those over byte address buffers, but combining vertex buffer and structured buffer is not allowed in DX11, so for the buffer you write to you need raw instead).
-The first structured buffer has a stride of sizeof(ST_Vertex) (this is your changeset).
-The second structured buffer has a stride of 4 (uint, these are the indices).
-Both structured buffers get an arbitrary element count (normally I use 1024 or 2048); that is the maximum number of vertices you can update in a single pass.
-Both structured buffers need an ID3D11ShaderResourceView (shader visible, read only).
The update process is then the following:
-Write the modified vertices and their locations into the structured buffers (using map discard; copying fewer elements than the buffer holds is OK).
-Attach both structured buffers for read.
-Attach the ID3D11UnorderedAccessView for write.
-Set your compute shader.
-Call Dispatch.
-Detach the ID3D11UnorderedAccessView (this is VERY important).
This is sample compute shader code (I assume your vertex is position only, for simplicity):
cbuffer cbUpdateCount : register(b0)
{
uint updateCount;
};
RWByteAddressBuffer RWVertexPositionBuffer : register(u0);
StructuredBuffer<float3> ModifiedVertexBuffer : register(t0);
StructuredBuffer<uint> ModifiedVertexIndicesBuffer : register(t1);
//this is the stride of your vertex buffer, since here we use float3 it is 12 bytes
#define WRITE_STRIDE 12
[numthreads(64, 1, 1)]
void CS( uint3 tid : SV_DispatchThreadID )
{
//make sure you do not go past the element count, as here we run 64 threads at a time
if (tid.x >= updateCount) { return; }
uint readIndex = tid.x;
uint writeIndex = ModifiedVertexIndicesBuffer[readIndex];
float3 vertex = ModifiedVertexBuffer[readIndex];
//byte address buffers do not understand float, asuint is a binary cast.
RWVertexPositionBuffer.Store3(writeIndex * WRITE_STRIDE, asuint(vertex));
}
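To check the shader's addressing on the CPU, here is a rough C++ model of the same scatter: for every entry of the change set, the modified position is stored at byte offset index * stride in a raw byte buffer. Float3, kWriteStride and scatterVertices are names made up for this sketch, and it assumes the same position-only, 12-byte vertex as the shader above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

struct Float3 { float x, y, z; };   // hypothetical position-only vertex
static_assert(sizeof(Float3) == 12, "sketch assumes a tightly packed float3");

constexpr std::size_t kWriteStride = sizeof(Float3);   // mirrors WRITE_STRIDE

// CPU model of the compute shader: one loop iteration per GPU thread.
void scatterVertices(std::vector<std::uint8_t>& rawVertexBuffer,
                     const std::vector<Float3>& modified,
                     const std::vector<std::uint32_t>& indices)
{
    assert(modified.size() == indices.size());
    for (std::size_t i = 0; i < indices.size(); ++i) {
        const std::size_t offset = indices[i] * kWriteStride;
        assert(offset + kWriteStride <= rawVertexBuffer.size());
        // Equivalent of Store3(writeIndex * WRITE_STRIDE, asuint(vertex)).
        std::memcpy(rawVertexBuffer.data() + offset, &modified[i], kWriteStride);
    }
}
```

Running the same change set through this function and through the compute shader, then comparing the read-back buffers, is one way to validate the GPU path.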
For the purposes of this question I'm going to assume you already have a mechanism for selecting a vertex from a list of vertices based upon ray casting or some other picking method and a mechanism for creating a displacement vector detailing how the vertex was moved in model space.
The method you have for updating the buffer is sufficient for anything less than a few hundred vertices, but on large scale models it becomes extremely slow. This is because you're updating everything, rather than the individual vertices you modified.
To fix this, you should only update the vertices you have changed, and to do that you need to create a change set.
In concept, a change set is nothing more than a set of changes made to the data - a list of the vertices that need to be updated. Since we already know which vertices were modified (otherwise we couldn't have manipulated them), we can map in the GPU buffer, go to that vertex specifically, and copy just those vertices into the GPU buffer.
In your vertex modification method, record the index of the vertex that was modified by the user:
//Modify the vertex coordinates based on mouse displacement
pVerticesInfo[SelectedVertexIndex].xfPosition.x += DisplacementVector.x;
pVerticesInfo[SelectedVertexIndex].xfPosition.y += DisplacementVector.y;
pVerticesInfo[SelectedVertexIndex].xfPosition.z += DisplacementVector.z;
//Add the changed vertex to the list of changes.
changedVertices.push_back(SelectedVertexIndex);
//And update the GPU buffer
UpdateD3DBuffer();
In UpdateD3DBuffer(), do the following:
D3D11_MAPPED_SUBRESOURCE d3dMappedResource;
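//A D3D11_USAGE_DYNAMIC buffer cannot be mapped with D3D11_MAP_WRITE; D3D11_MAP_WRITE_NO_OVERWRITE keeps the existing contents,
//but you must not touch vertices the GPU may still be reading
pImmediateContext->Map(_pVertexBuffer, 0, D3D11_MAP_WRITE_NO_OVERWRITE, 0, &d3dMappedResource);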
ST_Vertex* pBuffer = (ST_Vertex*)d3dMappedResource.pData;
for (int i = 0; i < changedVertices.size(); ++i)
{
pBuffer[changedVertices[i]].xfPosition.x = pVerticesInfo[changedVertices[i]].xfPosition.x;
pBuffer[changedVertices[i]].xfPosition.y = pVerticesInfo[changedVertices[i]].xfPosition.y;
pBuffer[changedVertices[i]].xfPosition.z = pVerticesInfo[changedVertices[i]].xfPosition.z;
}
pImmediateContext->Unmap(_pVertexBuffer, 0);
changedVertices.clear();
This has the effect of only updating the vertices that have changed, rather than all vertices in the model.
This also allows for some more complex manipulations. You can select multiple vertices and move them all as a group, select a whole face and move all the connected vertices, or move entire regions of the model relatively easily, assuming your picking method is capable of handling this.
In addition, if you record the change sets with enough information (the affected vertices and the displacement vector), you can fairly easily implement an undo function by simply reversing the displacement vector and reapplying the selected change set.
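The undo idea can be sketched as follows. This is only an illustration under assumptions: Vec3, ChangeSet, applyChangeSet and undoOf are hypothetical names, and every vertex in a change set is assumed to share one displacement vector.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Vec3 { float x, y, z; };

// One recorded edit: which vertices moved and by how much.
struct ChangeSet {
    std::vector<std::size_t> indices;
    Vec3 displacement;
};

// Moves every vertex named in the change set by its displacement.
void applyChangeSet(std::vector<Vec3>& positions, const ChangeSet& cs)
{
    for (std::size_t i : cs.indices) {
        assert(i < positions.size());
        positions[i].x += cs.displacement.x;
        positions[i].y += cs.displacement.y;
        positions[i].z += cs.displacement.z;
    }
}

// Undo is the same change set with the displacement reversed.
ChangeSet undoOf(const ChangeSet& cs)
{
    return ChangeSet{ cs.indices,
                      Vec3{ -cs.displacement.x, -cs.displacement.y, -cs.displacement.z } };
}
```

Because the undo is itself a change set, its indices can be pushed through the same partial GPU update path as a normal edit. With floating point, applying and undoing is only exact when the intermediate sums are representable, so keeping the original positions is safer for long undo chains.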

DirectX 11, exception thrown when updating constant buffer with UpdateSubresource

So I am very new to DirectX and am trying to learn the basics, but I'm running into a problem with my constant buffer. I'm trying to send a struct with three matrices to the vertex shader, but when I try to update the buffer with UpdateSubresource I get "Exception is thrown at 0x710B5DF3 (d3d11.dll) in Demo.exe: 0xC0000005: Access violation reading location 0x0000003C".
My struct:
struct Matracies
{
DirectX::XMMATRIX projection;
DirectX::XMMATRIX world;
DirectX::XMMATRIX view;
};
Matracies matracies;
Buffer creation:
ID3D11Buffer* ConstantBuffer = nullptr;
D3D11_BUFFER_DESC Buffer;
memset(&Buffer, 0, sizeof(Buffer));
Buffer.BindFlags = D3D11_BIND_CONSTANT_BUFFER;
Buffer.Usage = D3D11_USAGE_DEFAULT;
Buffer.ByteWidth = sizeof(Matracies);
Buffer.CPUAccessFlags = D3D11_CPU_ACCESS_WRITE;
D3D11_SUBRESOURCE_DATA data;
data.pSysMem = &matracies;
data.SysMemPitch = 0;
data.SysMemSlicePitch = 0;
Device->CreateBuffer(&Buffer, &data, &ConstantBuffer);
DeviceContext->VSSetConstantBuffers(0, 1, &ConstantBuffer);
Updating buffer:
DeviceContext->UpdateSubresource(ConstantBuffer, 0, 0, &matracies, 0, 0);
I am not sure what information is relevant to solve this so let me know if anything is missing.
Welcome to the wooly world of DirectX!
The first two steps in debugging any DirectX program are:
(1) Enable the Debug device. See this blog post. This will generate additional debug output at runtime which gives hints about problems like the one you have above.
(2) If a function returns an HRESULT, you must check that for success or failure at runtime. If it was safe to ignore the return value, it would return void. See this page.
If you had done either or both of the above, you would have caught the error returned from CreateBuffer above which resulted in ConstantBuffer still being a nullptr when you called UpdateSubresource.
The reason it failed is that you can't in general create a constant buffer that is both D3D11_USAGE_DEFAULT and D3D11_CPU_ACCESS_WRITE. DEFAULT usage memory is often in video memory that is not accessible to the CPU. Since you are using UpdateSubresource as opposed to Map, you should just use:
Buffer.CPUAccessFlags = 0;
You should take a look at DirectX Tool Kit and its associated tutorials.
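As a sketch of step (2), a small helper can turn a failed HRESULT into an exception so that a null buffer is never used later. To keep the example buildable outside Windows, HResult and failed() below are stand-ins I am assuming for the real HRESULT type and FAILED() macro; throwIfFailed is likewise just an illustrative name.

```cpp
#include <cassert>
#include <cstdint>
#include <stdexcept>
#include <string>

// Stand-in for the Windows HRESULT type (a 32-bit signed value).
using HResult = std::int32_t;

// Same test as the FAILED() macro: negative values signal failure.
constexpr bool failed(HResult hr) { return hr < 0; }

// Throws when a call reported failure instead of silently continuing.
inline void throwIfFailed(HResult hr, const char* what)
{
    if (failed(hr))
        throw std::runtime_error(std::string(what) + " failed (HRESULT "
                                 + std::to_string(hr) + ")");
}
```

In the code from the question, wrapping the creation call as throwIfFailed(Device->CreateBuffer(&Buffer, &data, &ConstantBuffer), "CreateBuffer") would have stopped execution at the real point of failure rather than at UpdateSubresource.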

View GPU Memory / View Texture2D memory space for debugging

I've got a question about a PixelShader I am trying to implement, and what I currently do (this is just for debugging, and trying to figure stuff out):
int3 loc;
loc.x = (int)(In.TextureUV.x * resolution_XY.x);
loc.y = (int)(In.TextureUV.x * resolution_XY.x);
loc.z = 0;
float4 r = g_txDiffuse.Load(loc);
return float4(r.x, r.y, r.z, 1);
The point is, this is always 0,0,0,1
The texture buffer is created:
D3D11_TEXTURE2D_DESC tDesc;
tDesc.Height = 480;
tDesc.Width = 640;
tDesc.Usage = D3D11_USAGE_DYNAMIC;
tDesc.MipLevels = 1;
tDesc.ArraySize = 1;
tDesc.SampleDesc.Count = 1;
tDesc.SampleDesc.Quality = 0;
tDesc.Format = DXGI_FORMAT_R8_UINT;
tDesc.CPUAccessFlags = D3D11_CPU_ACCESS_WRITE;
tDesc.BindFlags = D3D11_BIND_SHADER_RESOURCE;
tDesc.MiscFlags = 0;
V_RETURN(pd3dDevice->CreateTexture2D(&tDesc, NULL, &g_pCurrentImage));
I upload the texture (which should be a live display at the end) via:
D3D11_MAPPED_SUBRESOURCE resource;
pd3dImmediateContext->Map(g_pCurrentImage, 0, D3D11_MAP_WRITE_DISCARD, 0, &resource);
memcpy( resource.pData, g_Images.GetData(), g_Images.GetDataSize() );
pd3dImmediateContext->Unmap( g_pCurrentImage, 0 );
I've checked the resource.pData, the data in there is a valid 8bit monochrome image. I made sure the data coming from the camera is 8bit monochrome 640x480.
There's a few things I don't fully understand:
-If I run the Map / memcpy / Unmap routine in every frame, the driver will ultimately crash and the system will be unresponsive. Is there a different way to update a complete texture every frame?
-The texture I uploaded is 8bit, so why does Texture2D.load() return a float4? Do I have to use a different method to access the texture data? I tried to .sample it, but that didn't work either. Would I have to use an int buffer or something instead?
-Is there a way to debug the GPU memory, to check if the memcpy worked in the first place?
The Map, memcpy, Unmap really ought not to crash unless you are trying to copy too much data into the texture. It would be interesting to know what "GetDataSize()" returns. Does it equal 307,200? If it's more than that, then there lies your problem.
Texture2D returns a float4 because that's what you've asked for. If you write float r = g_txDiffuse.Load( ... ), the 8 bits get extended to a normalised float as part of the load process (this holds for a UNORM format such as DXGI_FORMAT_R8_UNORM; a DXGI_FORMAT_R8_UINT texture should be declared as Texture2D<uint> and read as an integer). Are you sure, by the way, that your calculation of "loc" is correct? As you have it now, loc.x and loc.y will always be the same.
You can debug what's going on with DirectX using PIX. It's a great tool and I highly recommend you familiarise yourself with it.
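The two checks mentioned above can be written down as a short sketch: the expected byte size of a tightly packed 8-bit image, and a UV-to-texel conversion in which y really uses the v coordinate and the height. Texel, uvToTexel and imageBytes are hypothetical helper names, and the clamp is my addition so that u = 1 or v = 1 stays inside the texture.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

struct Texel { int x, y; };

// Converts normalised UV coordinates to integer texel coordinates.
// Unlike the snippet in the question, y uses v and the height.
Texel uvToTexel(float u, float v, int width, int height)
{
    const int x = std::min(static_cast<int>(u * width), width - 1);
    const int y = std::min(static_cast<int>(v * height), height - 1);
    return Texel{ std::max(x, 0), std::max(y, 0) };
}

// Bytes occupied by a tightly packed single-channel 8-bit image.
constexpr std::size_t imageBytes(std::size_t width, std::size_t height)
{
    return width * height;   // 1 byte per texel
}
```

For the 640x480 camera image, imageBytes(640, 480) gives 307,200, the value GetDataSize() is expected to return.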

Problem with HLSL looping/sampling

I have a piece of HLSL code which looks like this:
float4 GetIndirection(float2 TexCoord)
{
float4 indirection = tex2D(IndirectionSampler, TexCoord);
for (half mip = indirection.b * 255; mip > 1 && indirection.a < 128; mip--)
{
indirection = tex2Dlod(IndirectionSampler, float4(TexCoord, 0, mip));
}
return indirection;
}
The results I am getting are consistent with that loop only executing once. I checked the shader in PIX and things got even more weird, the yellow arrow indicating position in the code gets to the loop, goes through it once, and jumps back to the start, at that point the yellow arrow never moves again but the cursor moves through the code and returns a result (a bug in PIX, or am I just using it wrong?)
I have a suspicion this may be a problem to do with texture reads getting moved outside the loop by the compiler, however I thought that didn't happen with tex2Dlod since I'm setting the LOD manually :/
So:
1) What's the problem?
2) Any suggested solutions?
The problem was solved: it was a simple coding mistake. I needed to increase the mip level on each iteration, not decrease it.
float4 GetIndirection(float2 TexCoord)
{
float4 indirection = tex2D(IndirectionSampler, TexCoord);
for (half mip = indirection.b * 255; mip > 1 && indirection.a < 128; mip++)
{
indirection = tex2Dlod(IndirectionSampler, float4(TexCoord, 0, mip));
}
return indirection;
}
