GPU Texture Splatting - DirectX

Just as a quick example, I'm trying to do the following:
[texture 1] + [texture 2] + [alpha map] = [blended result]
With the third image as an alpha map, how could this be implemented in a DX9-compatible pixel shader to "blend" between the first two images, creating an effect similar to the fourth image?
Furthermore, how could this newly created texture be given back to the CPU, where it could be placed back inside the original array of textures?

The rough way is to blend the colors of the textures with the alpha map and return the result from the pixel shader:
sampler AlphaSampler;
sampler Texture1Sampler;
sampler Texture2Sampler;

float4 main(float2 TexCoord : TEXCOORD0) : COLOR
{
    float alpha = tex2D(AlphaSampler, TexCoord).r;
    float3 texture1 = tex2D(Texture1Sampler, TexCoord).rgb;
    float3 texture2 = tex2D(Texture2Sampler, TexCoord).rgb;
    float3 color = lerp(texture1, texture2, alpha);
    return float4(color, 1);
}
To run this you need a texture as render target (doc) with the size of the input textures, and a fullscreen quad as geometry; an XYZRHW (pre-transformed) quad is the easiest. You can then use this texture for further rendering. If you want to read the texels back, or do anything else where you must lock the result, copy the data into a normal texture with StretchRect (doc) or UpdateSurface (doc).
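A minimal host-side sketch of that setup (D3D9; error handling omitted, and device, width, height and the shader binding are assumed to exist; for CPU readback specifically, GetRenderTargetData copies into a lockable system-memory surface):
// Create a render target the size of the input textures.
IDirect3DTexture9* target = nullptr;
device->CreateTexture(width, height, 1, D3DUSAGE_RENDERTARGET,
                      D3DFMT_A8R8G8B8, D3DPOOL_DEFAULT, &target, nullptr);
IDirect3DSurface9* targetSurface = nullptr;
target->GetSurfaceLevel(0, &targetSurface);
device->SetRenderTarget(0, targetSurface);

// Pre-transformed (XYZRHW) fullscreen quad; the -0.5 offset aligns
// texel centers to pixel centers in D3D9.
struct QuadVertex { float x, y, z, rhw, u, v; };
const float w = (float)width, h = (float)height;
const QuadVertex quad[4] = {
    { -0.5f,    -0.5f,    0.0f, 1.0f, 0.0f, 0.0f },
    { w - 0.5f, -0.5f,    0.0f, 1.0f, 1.0f, 0.0f },
    { -0.5f,    h - 0.5f, 0.0f, 1.0f, 0.0f, 1.0f },
    { w - 0.5f, h - 0.5f, 0.0f, 1.0f, 1.0f, 1.0f },
};
device->SetFVF(D3DFVF_XYZRHW | D3DFVF_TEX1);
device->DrawPrimitiveUP(D3DPT_TRIANGLESTRIP, 2, quad, sizeof(QuadVertex));

// For CPU access, copy the result into a lockable system-memory surface.
IDirect3DSurface9* sysmem = nullptr;
device->CreateOffscreenPlainSurface(width, height, D3DFMT_A8R8G8B8,
                                    D3DPOOL_SYSTEMMEM, &sysmem, nullptr);
device->GetRenderTargetData(targetSurface, sysmem);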
If performance isn't important (e.g. you preprocess the textures), it's easier to compute this on the CPU, although slower: lock the four textures, iterate over the pixels, and merge them directly.
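A sketch of that CPU path (D3D9; assumes all four textures are lockable, the same size, and in D3DFMT_A8R8G8B8; texture1, texture2, alphaMap and result are illustrative names):
D3DLOCKED_RECT r1, r2, ra, ro;
texture1->LockRect(0, &r1, nullptr, D3DLOCK_READONLY);
texture2->LockRect(0, &r2, nullptr, D3DLOCK_READONLY);
alphaMap->LockRect(0, &ra, nullptr, D3DLOCK_READONLY);
result->LockRect(0, &ro, nullptr, 0);
for (UINT y = 0; y < height; ++y) {
    const BYTE* p1 = (const BYTE*)r1.pBits + y * r1.Pitch;
    const BYTE* p2 = (const BYTE*)r2.pBits + y * r2.Pitch;
    const BYTE* pa = (const BYTE*)ra.pBits + y * ra.Pitch;
    BYTE* po = (BYTE*)ro.pBits + y * ro.Pitch;
    for (UINT x = 0; x < width; ++x) {
        // A8R8G8B8 is laid out B,G,R,A in memory, so byte 2 is red.
        const float alpha = pa[x * 4 + 2] / 255.0f;
        for (int c = 0; c < 4; ++c) // same lerp as the shader
            po[x * 4 + c] = (BYTE)(p1[x * 4 + c]
                          + alpha * (p2[x * 4 + c] - p1[x * 4 + c]) + 0.5f);
    }
}
result->UnlockRect(0);
alphaMap->UnlockRect(0);
texture2->UnlockRect(0);
texture1->UnlockRect(0);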

Related

How to render a 2D UI on top of a 3D scene before rendering the 3D scene?

I have a 2D texture that contains a 2D overlay. The texture itself is mostly blank (transparent), with a few parts containing some data.
What I currently do is render the whole 3D scene, disable the depth buffer and render the 2D quad on top of it:
// render 3D scene
context->OMSetDepthStencilState(_noDepthTestState.Get(), 1); // disable depth test
// render 2D quad on top of the whole viewport
context->OMSetDepthStencilState(nullptr, 1); // restore default
The _noDepthTestState variable is an ID3D11DepthStencilState created with the following descriptor:
D3D11_DEPTH_STENCIL_DESC desc{};
desc.DepthEnable = false;
desc.DepthWriteMask = D3D11_DEPTH_WRITE_MASK_ALL;
desc.DepthFunc = D3D11_COMPARISON_LESS;
desc.StencilEnable = true;
desc.StencilReadMask = 0xFF;
desc.StencilWriteMask = 0xFF;
desc.FrontFace.StencilFailOp = D3D11_STENCIL_OP_KEEP;
desc.FrontFace.StencilDepthFailOp = D3D11_STENCIL_OP_INCR;
desc.FrontFace.StencilPassOp = D3D11_STENCIL_OP_KEEP;
desc.FrontFace.StencilFunc = D3D11_COMPARISON_ALWAYS;
desc.BackFace.StencilFailOp = D3D11_STENCIL_OP_KEEP;
desc.BackFace.StencilDepthFailOp = D3D11_STENCIL_OP_DECR;
desc.BackFace.StencilPassOp = D3D11_STENCIL_OP_KEEP;
desc.BackFace.StencilFunc = D3D11_COMPARISON_ALWAYS;
I would like to render the 2D overlay quad before rendering the 3D scene, so that 3D objects are drawn only in places where the 2D overlay is blank.
Is there an efficient way to achieve this behavior? What is the correct configuration of the depth/stencil states?
It depends on the way your main 3D scene is rendered.
Assuming you don't use the stencil buffer, and you're using the D3D11_COMPARISON_LESS depth comparison, draw the quad in the following way.
Your vertex shader should output quad vertices [ ±1, ±1, 0, 1 ]. This makes the GUI the closest thing to the camera, so early Z rejection should clip occluded pixels of the 3D scene before the PS stage, saving some GPU resources (I'm assuming that's why you want to render the 2D first).
Your pixel shader should read from your GUI texture, compare alpha to some threshold, and if it’s small enough, call discard. Discarded pixels don’t change any buffers, neither color nor depth/stencil.
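A minimal sketch of both shaders (HLSL for D3D11; the 0.5 threshold, the semantics and the register assignments are illustrative assumptions):
// Vertex shader: pass the quad corners through at z = 0, the nearest
// depth under D3D11_COMPARISON_LESS.
void OverlayVS(float2 corner : POSITION,
               out float4 pos : SV_Position,
               out float2 uv : TEXCOORD0)
{
    pos = float4(corner, 0.0, 1.0);        // corner is (±1, ±1)
    uv = corner * float2(0.5, -0.5) + 0.5; // clip space -> texture space
}

Texture2D guiTexture : register(t0);
SamplerState guiSampler : register(s0);

// Pixel shader: keep only non-blank overlay pixels.
float4 OverlayPS(float4 pos : SV_Position, float2 uv : TEXCOORD0) : SV_Target
{
    float4 c = guiTexture.Sample(guiSampler, uv);
    if (c.a < 0.5) // "small enough" alpha threshold
        discard;   // changes neither color nor depth/stencil
    return c;
}
Draw the quad with an ordinary depth state (test and writes enabled); the surviving pixels write z = 0, so the 3D scene rendered afterwards fails the LESS test wherever the overlay is opaque.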
This will work OK if your GUI is made of axis-aligned rectangles, or if you don't use MSAA. However, if you have curved/diagonal edges in the GUI and are using MSAA, you won't be happy with the result. Fixing that is possible but way more complicated: you're going to need to somehow produce per-pixel SV_Coverage values when rendering your GUI texture.

What can a vertex function do except for mapping to clip space?

The Metal Shading Language includes a lot of mathematical functions, but it seems most of the code in Metal's official documentation just uses it to map vertices from pixel space to clip space, like:
RasterizerData out;
out.clipSpacePosition = vector_float4(0.0, 0.0, 0.0, 1.0);
float2 pixelSpacePosition = vertices[vertexID].position.xy;
vector_float2 viewportSize = vector_float2(*viewportSizePointer);
out.clipSpacePosition.xy = pixelSpacePosition / (viewportSize / 2.0);
out.color = vertices[vertexID].color;
return out;
Besides GPGPU, which uses kernel functions to do parallel computation, what things can a vertex function do, with some examples? In a game, if all vertex positions are calculated by the CPU, why does the GPU still matter? What does a vertex function usually do?
Vertex shaders compute properties for vertices. That's their point. In addition to vertex positions, they also calculate lighting normals at each vertex. And potentially texture coordinates. And various material properties used by lighting and shading routines. Then, in the fragment processing stage, those values are interpolated and sent to the fragment shader for each fragment.
In general, you don't modify vertices on the CPU. In a game, you'd usually load them from a file into main memory, put them into a buffer and send them to the GPU. Once they're on the GPU you pass them to the vertex shader on each frame along with model, view, and projection matrices. A single buffer containing the vertices of, say, a tree or a car's wheel might be used multiple times. Each time, all the CPU sends is the model, view, and projection matrices. The model matrix is used in the vertex shader to reposition and scale the vertices' positions in world space. The view matrix then moves and rotates the world around so that the virtual camera is at the origin and facing the appropriate way. Then the projection matrix modifies the vertices to put them into clip space.
There are other things a vertex shader can do, too. You can pass in vertices that are in a grid in the x-y plane, for example. Then in your vertex shader you can sample a texture and use that to generate the z-value. This gives you a way to change the geometry using a height map.
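For example, a sketch of that height-map idea in Metal (the buffer/texture indices and names are illustrative):
#include <metal_stdlib>
using namespace metal;

struct GridVertex { float2 xy; float2 uv; };

struct VSOut {
    float4 position [[position]];
    float2 uv;
};

vertex VSOut heightmap_vertex(uint vid                       [[vertex_id]],
                              const device GridVertex* verts [[buffer(0)]],
                              constant float4x4& mvp         [[buffer(1)]],
                              texture2d<float> heightMap     [[texture(0)]])
{
    constexpr sampler s(filter::linear);
    GridVertex v = verts[vid];
    // Sample the height map to displace the flat x-y grid along z.
    // Vertex functions must give an explicit LOD, hence level(0).
    float z = heightMap.sample(s, v.uv, level(0)).r;
    VSOut out;
    out.position = mvp * float4(v.xy, z, 1.0f); // model-view-projection
    out.uv = v.uv;
    return out;
}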
On older hardware (and some lower-end mobile hardware) it was expensive to do calculations on a texture coordinate before using it to sample from a texture because you lose some cache coherency. For example, if you wanted to sample several pixels in a column, you might loop over them adding an offset to the current texture coordinate and then sampling with the result. One trick was to do the calculation on the texture coordinates in the vertex shader and have them automatically interpolated before being sent to the fragment shader, then doing a normal look-up in the fragment shader. (I don't think this is an optimization on modern hardware, but it was a big win on some older models.)
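A sketch of that trick in Metal, precomputing three vertical blur taps per vertex so the fragment shader only does plain, non-dependent look-ups (names and indices are illustrative):
#include <metal_stdlib>
using namespace metal;

struct QuadVertex { float2 xy; float2 uv; };

struct BlurVSOut {
    float4 position [[position]];
    float2 uv0; // one texel above
    float2 uv1; // center
    float2 uv2; // one texel below
};

vertex BlurVSOut blur_vertex(uint vid                       [[vertex_id]],
                             const device QuadVertex* verts [[buffer(0)]],
                             constant float& texelHeight    [[buffer(1)]])
{
    QuadVertex v = verts[vid];
    BlurVSOut out;
    out.position = float4(v.xy, 0.0f, 1.0f);
    // Computed once per vertex and interpolated by the rasterizer.
    out.uv0 = v.uv + float2(0.0f, -texelHeight);
    out.uv1 = v.uv;
    out.uv2 = v.uv + float2(0.0f, texelHeight);
    return out;
}

fragment float4 blur_fragment(BlurVSOut in [[stage_in]],
                              texture2d<float> tex [[texture(0)]])
{
    constexpr sampler s(filter::linear);
    return (tex.sample(s, in.uv0) + tex.sample(s, in.uv1)
          + tex.sample(s, in.uv2)) / 3.0f;
}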
First, I'll address this statement
In a game, if all vertices positions are calculated by the CPU, why GPU still matters? What does vertex function do usually?
I don't believe I've seen anyone calculating positions on the CPU for meshes that will later be rendered on a GPU. It's slow, you would need to get all this data from the CPU to the GPU (which means copying it through a bus if you have a dedicated GPU), and it's just not that flexible. There is much more than vertex positions required to produce any meaningful image, and calculating all this stuff on the CPU is just wasteful, since the CPU doesn't care about this data for the most part.
The sole purpose of a vertex shader is to provide the rasterizer with primitives that are in clip space. But there are some other uses, mostly tricks based on different GPU features.
For example, vertex shaders can write out some data to buffers, so you can stream out transformed geometry and avoid transforming it again in a later vertex stage, e.g. when multi-pass rendering uses the same geometry in more than one pass.
You can also use a vertex shader to output just one triangle that covers the whole screen, so that the fragment shader gets called once per pixel for the whole screen (though, honestly, you are better off using compute (kernel) shaders for this).
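The classic bufferless version of that trick, sketched in Metal:
#include <metal_stdlib>
using namespace metal;

struct FullscreenOut { float4 position [[position]]; };

// Three vertices and no vertex buffer: (-1,-1), (3,-1) and (-1,3) form a
// single triangle whose clipped region covers the whole screen exactly.
vertex FullscreenOut fullscreen_vertex(uint vid [[vertex_id]])
{
    float2 xy = float2((vid << 1) & 2, vid & 2) * 2.0f - 1.0f;
    FullscreenOut out;
    out.position = float4(xy, 0.0f, 1.0f);
    return out;
}
Issue it with drawPrimitives(type: .triangle, vertexStart: 0, vertexCount: 3); no buffers need to be bound.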
You can also write out data from a vertex shader and not generate any primitives, by generating degenerate triangles. You can use this to generate bounding boxes: using atomic operations you can update min/max positions and read them at a later stage. This is useful for light culling, frustum culling, tile-based processing, and many other things.
But, and it's a BIG BUT, you can do most of this stuff in a compute shader without making the GPU run the whole vertex-assembly pipeline. That means you can do full-screen effects using just a compute shader (instead of a vertex and fragment shader and the many pipeline stages in between, such as the rasterizer, primitive culling, depth testing, and output merging). You can calculate bounding boxes and do light culling or frustum culling in a compute shader.
There are reasons to fire up the whole rendering pipeline instead of just running a compute shader; for example, if you will actually use the triangles the vertex shader outputs, or if you aren't sure how primitives are laid out in memory, so you need the vertex assembler to do the heavy lifting of assembling primitives. But, getting back to your point, almost all of the reasonable uses for a vertex shader involve outputting primitives in clip space. If you aren't using the resulting primitives, it's probably best to stick to compute shaders.

Metal. Why does setting MTLCullMode to none turn off depth comparison?

I am rendering a simple box:
MDLMesh(boxWithExtent: ...)
In my draw loop when I turn off back-face culling:
renderCommandEncoder.setCullMode(.none)
All depth comparison is disabled and the sides of the box are drawn completely wrong, with back-facing quads in front of front-facing ones.
Huh?
My intent is to include back-facing surfaces in the depth comparison, not ignore them. This is important when I have, for example, a shape with semi-transparent textures that reveal the shape's internals, which have a different shading style. How do I force depth comparison?
UPDATE
So Warren's suggestion is an improvement, but it is still not correct.
My depthStencilDescriptor:
let depthStencilDescriptor = MTLDepthStencilDescriptor()
depthStencilDescriptor.depthCompareFunction = .less
depthStencilDescriptor.isDepthWriteEnabled = true
depthStencilState = device.makeDepthStencilState(descriptor: depthStencilDescriptor)
Within my draw loop I set depth stencil state:
renderCommandEncoder.setDepthStencilState(depthStencilState)
The resultant rendering:
Description: this is a box mesh. Each box face uses a shader that paints a disk texture. The texture is transparent outside the body of the disk. The shader paints a red/white spiral texture on front-facing quads and a blue/black spiral texture on back-facing quads. The box sits in front of a camera-aligned quad textured with a mobil image.
Notice how one of the textures paints over the rear back-facing quad with the background texture color. Notice also that the rear-most back-facing quad is not drawn at all.
Actually it is not possible to achieve the effect I am after. I basically want to do a simple composite - Porter/Duff - here but that is order dependent. Order cannot be guaranteed here so I am basically hosed.

Drawing Multiple 2D shapes in DirectX

I completed a tutorial on rendering 2D triangles in DirectX. Now I want to use my knowledge of rendering a single triangle to render multiple triangles, or for that matter multiple objects on screen.
Should I create a list/stack/vector of vertexbuffers and input layouts and then draw each object? Or is there a better approach to this?
My process would be:
Set up DirectX, including vertex and pixel shaders
Create vertex buffers for each shape that has to be drawn on the screen and store them in an array.
Draw them to the render target (each frame)
Present the render target (each frame)
Please assume very rudimentary knowledge of DirectX and graphics programming in general when answering.
You don't need to create a vertex buffer for each shape; you can create just one that stores all the vertices of all triangles, then create an index buffer that stores all the indices of all shapes, and finally draw them with the index buffer (see the sketch after the links below).
I am not familiar with DX11, so I just list the links for D3D9 for your reference. I think the concept is the same, just with some API changes.
Index Buffers(Direct3D 9)
Rendering from Vertex and Index buffers
If the triangles have the same shape, just with different positions or colors, you can consider using geometry instancing; it's a powerful way to render multiple copies of the same geometry.
Geometry Instancing
Efficiently Drawing Multiple Instances of Geometry(D3D9)
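Putting the first suggestion together, a minimal D3D9 sketch (buffer creation and filling omitted; the FVF and the counts are illustrative):
// One shared vertex buffer and one shared index buffer for all shapes.
device->SetStreamSource(0, vertexBuffer, 0, sizeof(Vertex));
device->SetIndices(indexBuffer);
device->SetFVF(D3DFVF_XYZ | D3DFVF_DIFFUSE);

// One call assembles numTriangles triangles from the index buffer.
device->DrawIndexedPrimitive(D3DPT_TRIANGLELIST,
                             0,             // base vertex index
                             0,             // lowest vertex index used
                             numVertices,   // number of vertices referenced
                             0,             // first index to read
                             numTriangles); // primitive count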
I don't know much about DirectX, but the general rule in GPU rendering is to use separate vertex and index buffers for every mesh.
That said, there is nothing stopping you from using a single vertex buffer with many index buffers; in fact, you may get some performance gains that way, especially for small meshes...
You'll need just one vertex buffer to do this, and can then batch the draws.
Here is what you can do: make an array/vector holding the triangle information, say (pseudo-code):
struct TriangleInfo {
    ID3D11ShaderResourceView* texture; // texture for this triangle
    XMFLOAT2 pos;                      // translation
    XMFLOAT2 dimension;                // scale
    float rot;                         // rotation angle (radians)
};
Then in your draw method:
for (size_t i = 0; i < triangles.size(); i++) {
    const TriangleInfo& tInfo = triangles[i];
    // scale * rotation * translation, transposed for the HLSL side
    XMMATRIX worldMatrix = XMMatrixTranspose(
        XMMatrixScaling(tInfo.dimension.x, tInfo.dimension.y, 1.0f) *
        XMMatrixRotationZ(tInfo.rot) *
        XMMatrixTranslation(tInfo.pos.x, tInfo.pos.y, 0.0f));
    shaderParameters.worldMatrix = worldMatrix; // info for the constant buffer
    // ... update the constant buffer (e.g. via UpdateSubresource or Map) ...
    dctx->PSSetShaderResources(0, 1, &tInfo.texture);
    dctx->Draw(4, 0); // 4 vertices starting at vertex 0 (a two-triangle strip)
}
Then in your vertex shader:
cbuffer cbParameters : register(b0)
{
    float4x4 worldMatrix;
};

struct VOut
{
    float4 position : SV_POSITION;
    float4 texCoord : TEXCOORD;
};

VOut main(float4 position : POSITION, float4 texCoord : TEXCOORD)
{
    VOut output;
    output.position = mul(position, worldMatrix);
    output.texCoord = texCoord;
    return output;
}
Remember, all of this is pseudo-code, but it should give you the idea. There is a problem if you are planning to draw a lot of triangles, say 1000: this is probably not the best option. You should use DrawIndexed and modify the vertex positions of each triangle, or use DrawInstanced, which is simpler, to send all the information in just one draw call, because calling Draw once per triangle is very heavy for large counts.
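A rough sketch of the DrawInstanced variant (D3D11; assumes the input layout marks the second stream's elements as D3D11_INPUT_PER_INSTANCE_DATA and that instanceBuffer holds one per-triangle record):
// Slot 0: shared quad vertices; slot 1: per-instance transform data.
ID3D11Buffer* buffers[2] = { quadVertexBuffer, instanceBuffer };
UINT strides[2] = { sizeof(Vertex), sizeof(InstanceData) };
UINT offsets[2] = { 0, 0 };
dctx->IASetVertexBuffers(0, 2, buffers, strides, offsets);
dctx->IASetPrimitiveTopology(D3D11_PRIMITIVE_TOPOLOGY_TRIANGLESTRIP);

// One call draws everything: 4 vertices per instance, instanceCount instances.
dctx->DrawInstanced(4, instanceCount, 0, 0);
Note that per-instance textures don't fit this scheme directly; you'd use a texture atlas or a texture array, since resource bindings can't change within a single draw call.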

XNA - Render to a texture's alpha channel

I have a texture whose alpha channel I want to modify at runtime.
Is there a way to draw on a texture's alpha channel?
Or maybe replace the channel with that of another texture?
Thanks,
SW.
Ok, based on your comment, what you should do is use a pixel shader. Your source image doesn't even need an alpha channel - let the pixel shader apply an alpha.
In fact, you should probably calculate the values for the alpha channel (i.e. run your fluid solver) on the GPU as well.
Your shader might look something like this:
sampler textureSampler : register(s0);

float4 main(float2 uv : TEXCOORD) : COLOR
{
    float4 c = tex2D(textureSampler, uv);
    c.a = /* calculate alpha value here */;
    return c;
}
A good place to start would be the XNA Sprite Effects sample.
There's even an effect similar to what you are doing:
The effect in the sample reads from a second texture to get values for the calculation of the alpha channel of the first texture when it is drawn.
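In the same spirit, a minimal sketch of taking the alpha from a second texture (the sampler registers and the choice of the red channel are assumptions):
sampler textureSampler : register(s0); // color source
sampler alphaSampler   : register(s1); // e.g. your fluid-solver output

float4 main(float2 uv : TEXCOORD) : COLOR
{
    float4 c = tex2D(textureSampler, uv);
    c.a = tex2D(alphaSampler, uv).r; // alpha comes from the second texture
    return c;
}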
