How can a shader read vertex information? - directx

I have recently been learning about shaders.
As I understand it, the basic steps are:
First, create a buffer that holds the vertex data.
Then create a shader file and compile it.
Finally, set the shader and draw.
But studying the code, I can't see any direct connection between the shader and the buffer that holds the vertices. So I wonder: how does a shader read the vertex data? Does the shader just read from an existing buffer?
I'm not sure my question comes across clearly, because I don't speak English well. I hope you understand me.

You did not mention the InputLayout. To render, it is necessary to set the following on the context:
Vertex buffer,
Index buffer (optional),
Input layout (how the data is distributed to the vertex shader parameters: sizes, types, and the offset of each element within the stride),
VS and PS
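As a rough sketch (the struct, semantics, and names such as VSInput and worldViewProj are illustrative assumptions, not anything from your code), the vertex shader never names the buffer directly; it declares its inputs by semantic, and the input layout you create tells the input assembler how to pull those values out of whatever vertex buffer is currently bound:

struct VSInput
{
    float3 position : POSITION;   // matches an input-layout element at offset 0
    float3 normal   : NORMAL;     // matches an element at offset 12
    float2 uv       : TEXCOORD0;  // matches an element at offset 24
};

struct VSOutput
{
    float4 position : SV_Position;
    float3 normal   : NORMAL;
    float2 uv       : TEXCOORD0;
};

cbuffer PerObject : register(b0)
{
    float4x4 worldViewProj;
};

// The input assembler reads the bound vertex buffer according to the
// input layout and feeds one VSInput per vertex to this function.
VSOutput VSMain(VSInput input)
{
    VSOutput output;
    output.position = mul(float4(input.position, 1.0f), worldViewProj);
    output.normal   = input.normal;
    output.uv       = input.uv;
    return output;
}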

Related

Most efficient way to store normals in gltf files?

I am looking for the most efficient way to store normals in glTF files; we would like to use them for WebGL 1 flat and smooth shading.
Is there support for different normal precision?
We would prefer not to duplicate all vertices just to have normals, because our meshes are huge.
I just found this
KhronosGroup/glTF/blob/master/specification/2.0/README.md#meshes
NORMAL "VEC3" 5126 (FLOAT) Normalized XYZ vertex normals
so I assume there is only a single option?
Thanks for your help
Flat normals: The most efficient way to store flat normals is to omit them entirely. From the glTF specification, in the Meshes section,
"When normals are not specified, client implementations should calculate flat normals."
Smooth normals: You are correct, the core glTF specification requires that normals use 5126 / float32 precision. If you need other options, enable the KHR_mesh_quantization extension (extension spec) on your model, which allows for additional types (int16 normalized or int8 normalized). The gltfpack library can apply these optimizations automatically, or you can modify the model directly.

What's the equivalent of a (DX11) StructuredBuffer in WebGL 2?

I'm porting a DirectX HLSL shader to WebGL 2, but I cannot find an equivalent of StructuredBuffer.
I can only see constant (uniform) buffers, which are limited to 64 KB and require alignment. Should I split the StructuredBuffers into constant buffers?
The more-or-less equivalent in OpenGL land of D3D's StructuredBuffers is the Shader Storage Buffer Object (SSBO). However, WebGL 2.0 is based on OpenGL ES 3.0, which does not include SSBOs.
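For reference, this is roughly the kind of HLSL declaration that has no direct WebGL 2 counterpart (the Particle struct and buffer name are invented for illustration); since SSBOs are unavailable, it typically has to be replaced with a uniform block, subject to the size and alignment limits mentioned in the question, or with data packed into a texture:

// HLSL (D3D11): a read-only structured buffer of arbitrary element count,
// bound through a shader resource view.
struct Particle
{
    float3 position;
    float3 velocity;
    float  life;
};

StructuredBuffer<Particle> particles : register(t0);

float4 PSMain(float4 pos : SV_Position) : SV_Target
{
    // Free indexing into the buffer; the 64 KB constant-buffer limit
    // does not apply here.
    Particle p = particles[(uint)pos.x];
    return float4(p.position, p.life);
}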

Is it better to use one large buffer with all related data or several smaller buffers in HLSL

I am interested in both the code-design and the performance aspects of whether separate buffers are better when sending data to the GPU in HLSL or another high-level shading language.
This is for the case where a particular shader needs a lot of variable data that changes at runtime, so that information has to be passed in through buffers.
Here is a very basic example:
cbuffer SomeLargeBuffer : register(b0)
{
float3 data;
float someData;
float4 largeArray[2500];
float moreData;
...
...
...
...
...
}
or to have
cbuffer SmallerBuffer : register(b0)
{
float3 data;
float someRelatedData;
}
cbuffer SecondSmallerBuffer : register(b1)
{
float4 largeArray[2500];
float moreData;
}
cbuffer ThirdBuffer : register(b2)
{
...
...
...
}
In terms of efficiency, the documentation on shader constants in HLSL gives the following advice:
The best way to efficiently use constant buffers is to organize shader
variables into constant buffers based on their frequency of update.
This allows an application to minimize the bandwidth required for
updating shader constants. For example, a shader might declare two
constant buffers and organize the data in each based on their
frequency of update: data that needs to be updated on a per-object
basis (like a world matrix) is grouped into a constant buffer which
could be updated for each object. This is separate from data that
characterizes a scene and is therefore likely to be updated much less
often (when the scene changes).
So, if your data updates at different rates, it is best to group data that is updated at the same frequency into the same constant buffer. Generally, data is updated either a) every frame, b) sporadically, or c) never (once at startup). Reducing the total number of constant buffers also helps performance, because it reduces the number of binding calls and the amount of resource tracking required.
In terms of code design, it's difficult to say, although the code usually fits the frequency-of-update pattern quite naturally; a sketch of that split follows.
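As a rough sketch of that grouping (the buffer and field names here are invented for illustration), the per-frame, per-object, and set-once data each get their own cbuffer so that each can be updated at its own rate:

// Updated once per frame (camera, time, environment, ...).
cbuffer PerFrame : register(b0)
{
    float4x4 viewProj;
    float3   cameraPos;
    float    time;
};

// Updated once per object drawn with this shader.
cbuffer PerObject : register(b1)
{
    float4x4 world;
    float4   tintColor;
};

// Filled once at startup and never touched again.
cbuffer Constants : register(b2)
{
    float4 largeLookupTable[2500];
};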

Metal Compute Kernel vs Fragment Shader

Metal supports kernel functions in addition to the standard vertex and fragment functions. I found a Metal kernel example that converts an image to grayscale.
What exactly is the difference between doing this in a kernel vs fragment? What can a compute kernel do (better) that a fragment shader can't and vice versa?
Metal has four different types of command encoders:
MTLRenderCommandEncoder
MTLComputeCommandEncoder
MTLBlitCommandEncoder
MTLParallelRenderCommandEncoder
If you're just doing graphics programming, you're most familiar with the MTLRenderCommandEncoder. That is where you would set up your vertex and fragment shaders. This is optimized to deal with a lot of draw calls and object primitives.
The kernel shaders are primarily used for the MTLComputeCommandEncoder. I think the reason a kernel shader and a compute encoder were used for the image processing example is because you're not drawing any primitives as you would be with the render command encoder. Even though both methods are utilizing graphics, in this instance it's simply modifying color data on a texture rather than calculating depth of multiple objects on a screen.
The compute command encoder is also more easily set up to do parallel computing using threads:
https://developer.apple.com/reference/metal/mtlcomputecommandencoder
So if your application wanted to utilize multithreading on data modification, it's easier to do that in this command encoder than the render command encoder.

Why does the pixel shader return float4 when the back buffer format is DXGI_FORMAT_B8G8R8A8_UNORM?

Alright, so this has been bugging me for a while now, and I could not find anything on MSDN that goes into the specifics I need.
This is more of a three-part question, so here it goes:
1) When creating the swap chain, applications specify the back buffer pixel format, most often either B8G8R8A8 or R8G8B8A8. That gives 8 bits per color channel, so a total of 4 bytes is used per pixel... so why does the pixel shader have to return a color as a float4, when a float4 is actually 16 bytes?
2) When binding textures to the pixel shader, my textures are in DXGI_FORMAT_B8G8R8A8_UNORM format, so why does the sampler need a float4 per pixel to work?
3) Am I missing something here? Am I overthinking this?
Please provide links to support your claims, preferably from MSDN.
GPUs are designed to perform calculations on 32bit floating point data, at least if they want to support D3D11. As of D3D10 you can also perform 32bit signed and unsigned integer operations. There's no requirement or language support for types smaller than 4 bytes in HLSL, so there's no "byte/char" or "short" for 1 and 2 byte integers or lower precision floating point.
Any DXGI formats that use the "FLOAT", "UNORM" or "SNORM" suffix are non-integer formats, while "UINT" and "SINT" are unsigned and signed integer. Any reads performed by the shader on the first three types will be provided to the shader as 32 bit floating point irrespective of whether the original format was 8 bit UNORM/SNORM or 10/11/16/32 bit floating point. Data in vertices is usually stored at a lower precision than full-fat 32bit floating point to save memory, but by the time it reaches the shader it has already been converted to 32bit float.
On output (to UAVs or Render Targets) the GPU compresses the "float" or "uint" data to whatever format the target was created at. If you try outputting float4(4.4, 5.5, 6.6, 10.1) to a target that is 8-bit normalised then it'll simply be truncated to (1.0,1.0,1.0,1.0) and only consume 4 bytes per pixel.
So to answer your questions:
1) Because shaders only operate on 32 bit types, but the GPU will compress/truncate your output as necessary to be stored in the resource you currently have bound according to its type. It would be madness to have special keywords and types for every format that the GPU supported.
2) The "sampler" doesn't "need a float4 per pixel to work". I think you're mixing your terminology. The declaration that the texture is a Texture2D<float4> is really just stating that this texture has four components and is of a format that is not an integer format. "float" doesn't necessarily mean the source data is 32 bit float (or actually even floating point) but merely that the data has a fractional component to it (eg 0.54, 1.32). Equally, declaring a texture as Texture2D<uint4> doesn't mean that the source data is 32 bit unsigned necessarily, but more that it contains four components of unsigned integer data. However, the data will be returned to you and converted to 32 bit float or 32 bit integer for use inside the shader.
3) You're missing the fact that the GPU decompresses textures / vertex data on reads and compresses it again on writes. The amount of storage used for your vertices/texture data is only as much as the format that you create the resource in, and has nothing to do with the fact that the shader is operating on 32 bit floats / integers.
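To make the conversion concrete, here is a minimal sketch (the texture and variable names are invented): the shader works entirely in 32-bit floats, while the bound B8G8R8A8_UNORM texture and render target only ever store 8 bits per channel; the hardware converts on read and again on write.

Texture2D<float4> srcTexture    : register(t0); // stored as B8G8R8A8_UNORM
SamplerState      linearSampler : register(s0);

float4 PSMain(float4 pos : SV_Position, float2 uv : TEXCOORD0) : SV_Target
{
    // Sample() returns 32-bit floats in [0, 1], converted by the hardware
    // from the 8-bit UNORM data in the texture.
    float4 color = srcTexture.Sample(linearSampler, uv);

    // The return value is 32-bit floats as well; when it is written to a
    // B8G8R8A8_UNORM render target it is clamped to [0, 1] and packed back
    // down to 8 bits per channel (4 bytes per pixel).
    return color * 0.5f;
}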
