For some scientific data visualization, I am drawing a large float array using WebGL. The dataset is two-dimensional, typically a few hundred to a few thousand values in height and several tens of thousands of values in width.
To fit this dataset into video memory, I cut it up into several non-square textures (depending on MAX_TEXTURE_SIZE) and display them next to one another. I use the same shader with a single sampler2D to draw all the textures. This means that I have to iterate over all the textures for drawing:
for (var i = 0; i < dataTextures.length; i++) {
    gl.activeTexture(gl.TEXTURE0 + i);
    gl.bindTexture(gl.TEXTURE_2D, dataTextures[i]);
    gl.uniform1i(samplerUniform, i);
    gl.bindBuffer(gl.ARRAY_BUFFER, vertexPositionBuffers[i]);
    gl.vertexAttribPointer(vertexPositionAttribute, 2, gl.FLOAT, false, 0, 0);
    gl.drawArrays(gl.TRIANGLE_STRIP, 0, 4);
}
However, if the number of textures gets larger than half a dozen, performance becomes quite bad. Now, I know that games use quite a few more textures than that, so this can't be expected behavior. I also read that you can bind arrays of samplers, but as far as I can tell, the total number of textures has to be known ahead of time. For me, the number of textures depends on the dataset, so I can't know it before loading the data.
Also, I suspect that I am doing unnecessary things in this render loop. Any hints would be welcome.
How would you normally draw a variable number of textures in WebGL?
Here are a few previous answers that will help:
How to bind an array of textures to a WebGL shader uniform?
How to send multiple textures to a fragment shader in WebGL?
How many textures can I use in a webgl fragment shader?
Some ways off the top of my head
Create a shader that loops over N textures. Set the textures you're not using to some 1x1 pixel texture with 0,0,0,0 in it, or something else that doesn't affect your calculations.
Create a shader that loops over N textures. Create a uniform boolean array; in the loop, skip any texture whose corresponding boolean value is false.
Generate a shader on the fly that has exactly the number of textures you need. It shouldn't be that hard to concatenate a few strings.
Related
The Metal Shading Language includes a lot of mathematical functions, but it seems most of the code in the official Metal documentation just uses it to map vertices from pixel space to clip space, like:
RasterizerData out;
out.clipSpacePosition = vector_float4(0.0, 0.0, 0.0, 1.0);
float2 pixelSpacePosition = vertices[vertexID].position.xy;
vector_float2 viewportSize = vector_float2(*viewportSizePointer);
out.clipSpacePosition.xy = pixelSpacePosition / (viewportSize / 2.0);
out.color = vertices[vertexID].color;
return out;
Apart from GPGPU, which uses kernel functions for parallel computation, what can a vertex function do? Some examples would help. In a game, if all vertex positions are calculated by the CPU, why does the GPU still matter? What does a vertex function usually do?
Vertex shaders compute properties for vertices. That's their point. In addition to vertex positions, they also calculate lighting normals at each vertex. And potentially texture coordinates. And various material properties used by lighting and shading routines. Then, in the fragment processing stage, those values are interpolated and sent to the fragment shader for each fragment.
In general, you don't modify vertices on the CPU. In a game, you'd usually load them from a file into main memory, put them into a buffer and send them to the GPU. Once they're on the GPU you pass them to the vertex shader on each frame along with the model, view, and projection matrices. A single buffer containing the vertices of, say, a tree or a car's wheel might be used multiple times. Each time, all the CPU sends is the model, view, and projection matrices. The model matrix is used in the vertex shader to reposition and scale the vertices' positions in world space. The view matrix then moves and rotates the world around so that the virtual camera is at the origin and facing the appropriate way. Then the projection matrix modifies the vertices to put them into clip space.
There are other things a vertex shader can do, too. You can pass in vertices that are in a grid in the x-y plane, for example. Then in your vertex shader you can sample a texture and use that to generate the z-value. This gives you a way to change the geometry using a height map.
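For example, a minimal, untested Metal sketch of that height-map idea (the buffer/texture slot numbers, the assumption that the grid spans [0,1], and all of the names are just for illustration):

    #include <metal_stdlib>
    using namespace metal;

    struct GridVertex {
        float2 xy;                       // grid point in the x-y plane
    };

    struct VertexOut {
        float4 position [[position]];
    };

    vertex VertexOut heightmap_vertex(uint vid                      [[vertex_id]],
                                      const device GridVertex *grid [[buffer(0)]],
                                      constant float4x4 &mvp        [[buffer(1)]],
                                      texture2d<float> heightMap    [[texture(0)]],
                                      sampler heightSampler         [[sampler(0)]])
    {
        float2 xy = grid[vid].xy;
        // Reuse the x-y position as a texture coordinate (grid assumed to span [0,1]),
        // sampling at an explicit LOD since there are no derivatives in a vertex function.
        float z = heightMap.sample(heightSampler, xy, level(0.0)).r;

        VertexOut out;
        out.position = mvp * float4(xy, z, 1.0);
        return out;
    }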
On older hardware (and some lower-end mobile hardware) it was expensive to do calculations on a texture coordinate before using it to sample from a texture because you lose some cache coherency. For example, if you wanted to sample several pixels in a column, you might loop over them adding an offset to the current texture coordinate and then sampling with the result. One trick was to do the calculation on the texture coordinates in the vertex shader and have them automatically interpolated before being sent to the fragment shader, then doing a normal look-up in the fragment shader. (I don't think this is an optimization on modern hardware, but it was a big win on some older models.)
First, I'll address this statement
In a game, if all vertices positions are calculated by the CPU, why GPU still matters? What does vertex function do usually?
I don't believe I've seen anyone calculating mesh vertex positions on the CPU when those meshes will later be rendered on the GPU. It's slow, you would need to get all this data from the CPU to the GPU (which means copying it over a bus if you have a dedicated GPU), and it's just not that flexible. Much more than vertex positions is required to produce any meaningful image, and calculating all of this on the CPU is just wasteful, since the CPU doesn't care about this data for the most part.
The sole purpose of a vertex shader is to provide the rasterizer with primitives that are in clip space. But there are some other uses that are mostly tricks based on different GPU features.
For example, vertex shaders can write out some data to buffers, so you can stream out transformed geometry if you don't want to transform it again in a later vertex stage when multi-pass rendering uses the same geometry in more than one pass.
You can also use a vertex shader to output just one triangle that covers the whole screen, so that the fragment shader gets called once per pixel for the whole screen (but, honestly, you are better off using compute (kernel) shaders for this).
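A minimal Metal sketch of that full-screen trick (untested; names are illustrative), drawn with a vertex count of 3 and no vertex buffer bound:

    #include <metal_stdlib>
    using namespace metal;

    struct FullscreenOut {
        float4 position [[position]];
    };

    vertex FullscreenOut fullscreen_triangle(uint vid [[vertex_id]])
    {
        // Vertices (-1,-1), (3,-1), (-1,3): the part of this single triangle that
        // survives clipping covers the whole viewport exactly once.
        FullscreenOut out;
        out.position = float4((vid == 1) ? 3.0 : -1.0,
                              (vid == 2) ? 3.0 : -1.0,
                              0.0, 1.0);
        return out;
    }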
You can also write out data from the vertex shader without generating any primitives, by emitting degenerate triangles. You can use this to compute bounding boxes: using atomic operations you can update min/max positions and read them back at a later stage. This is useful for light culling, frustum culling, tile-based processing and many other things.
But, and it's a BIG BUT, you can do most of this stuff in a compute shader without making the GPU run the whole vertex-assembly pipeline. That means you can do full-screen effects using just a compute shader (instead of a vertex and fragment shader and the many pipeline stages in between, such as the rasterizer, primitive culling, depth testing and output merging). You can calculate bounding boxes and do light culling or frustum culling in a compute shader.
There are reasons to fire up the whole rendering pipeline instead of just running a compute shader, for example if you will still use the triangles that are output from the vertex shader, or if you aren't sure how primitives are laid out in memory and need the vertex assembler to do the heavy lifting of assembling them. But, getting back to your point, almost all of the reasonable uses for a vertex shader involve outputting primitives in clip space. If you aren't using the resulting primitives, it's probably best to stick to compute shaders.
I have multiple texture reads in my fragment shader, and I am supposedly doing bad things, like using the discard command and conditionals inside the shader.
The thing is, I am rendering to a texture and I want to reuse it in the following passes with other shaders, which should not operate on pixels that were previously "discarded". This is for performance. I also need to discard calculations if uniforms are out of certain ranges (which I read from another texture): imagine a loop with these shaders always running on the same textures, which are never cleared.
So what I have now is terrible performance. One idea I had is using gl_FragDepth together with the depth buffer and letting depth testing discard some pixels. But this does not work with the fact that I want to have ranges.
Is there any alternative?
You could enable blending, and set the alpha values of pixels you don't want to render to zero. Setup:
glBlendFunc(GL_SRC_ALPHA, GL_ONE_MINUS_SRC_ALPHA);
glEnable(GL_BLEND);
Then in the fragment shader, where you previously called discard:
...
if (condition) {
discard;
}
...
Set the alpha to zero instead:
float alpha = float(condition);
...
gl_FragColor = vec4(r, g, b, alpha);
Whether this will perform better than discarding pixels could be very system dependent. But if you're looking for alternatives, it's worth trying.
I have a C++ DirectX 11 renderer that I have been writing.
I have written a COLLADA 1.4.1 loader to import COLLADA data for use in supporting skeletal animations.
I'm validating the loader at this point (and I've supported COLLADA before in another renderer I've written previously using different technology) and I'm running into a problem matching up COLLADA with DX10/11.
I have 3 separate vertex buffers of data:
A vertex buffer of Unique vertex positions.
A vertex buffer of Unique normals.
A vertex buffer of Unique texture coordinates.
These vertex buffers have different lengths (positions has 2910 elements, normals has more than 9000, and texture coordinates has roughly 3200).
COLLADA provides a triangle list which gives me the indices into each of these arrays for a given triangle (verbose and oddly done at first, but ultimately it becomes simple once you've worked with it.)
Knowing that DX10/11 support multiple vertex buffers, I figured I would fill the DX10/11 index buffer with indices into each of these buffers *and* (this is the important part) these indices could be different for a given point of a triangle.
In other words, I could set the three vertex buffers, set the correct input layout, and then in the index buffer I would put the equivalent of:
l_aIndexBuffer[ NumberOfTriangles * 3 ]
for( i = 0; i < NumberOfTriangles; i++ )
{
l_aIndexBufferData.add( triangle[i].Point1.PositionIndex )
l_aIndexBufferData.add( triangle[i].Point1.NormalIndex )
l_aIndexBufferData.add( triangle[i].Point1.TextureCoordinateIndex )
}
The documentation regarding using multiple vertex buffers in DirectX doesn't seem to give any information about how this affects the index buffer (more on this later.)
Running the code that way yielded strange rendering results where I could see the mesh I had being drawn intermittently correctly (strange polygons, but about a third of the points were in the correct place - hint - hint).
I figured I'd screwed up my data or my indices at this point (yesterday), so I painstakingly validated it all, and then figured I was screwing up my input or something else. I eliminated this by using the values from the normal and texture buffers to alternately set the color value used by the pixel shader; the colors were correct, so I wasn't suffering a padding issue.
Ultimately I came to the conclusion that DX10/11 must expect the data ordered in a different fashion, so I tried storing the indices in this fashion:
indices.add( Point1Position index )
indices.add( Point2Position index )
indices.add( Point3Position index )
indices.add( Point1Normal index )
indices.add( Point2Normal index )
indices.add( Point3Normal index )
indices.add( Point1TexCoord index )
indices.add( Point2TexCoord index )
indices.add( Point3TexCoord index )
Oddly enough, this yielded a rendered mesh that looked 1/3 correct - hint - hint.
I then surmised that maybe DX10/DX11 wanted the indices stored 'by vertex buffer' meaning that I would add all the position indices for all the triangles first, then all the normal indices for all the triangles, then all the texture coordinate indices for all the triangles.
This yielded another 1/3 correct (looking) mesh.
This made me think - well, surely DX10/11 wouldn't provide you with the ability to stream from multiple vertex buffers and then actually expect only one index per triangle point?
Only including indices into the vertex buffer of positions yields a properly rendered mesh that unfortunately uses the wrong normals and texture coordinates.
It appears that putting the normal and texture coordinate indices into the index buffer caused erroneous drawing over the properly rendered mesh.
Is this the expected behavior?
Multiple vertex buffers, one index buffer, and the index buffer can only have a single index per point of a triangle?
That really just doesn't make sense to me.
Help!
The very first thing that comes to mind:
All hardware that supports compute shaders (which is almost all DirectX 10 class hardware and higher) also supports ByteAddressBuffers, and most of it supports StructuredBuffers. So you can bind your arrays as SRVs and have random access to any of their elements in shaders.
Something like this (not tested, just pseudocode):
// Indices passed as a vertex buffer to the shader.
// Think of them as "references" to the real data.
struct VS_INPUT
{
    uint posidx : TEXCOORD0;
    uint noridx : TEXCOORD1;
    uint texidx : TEXCOORD2;
};
// The real vertex data.
// You pass it as structured buffers (similar to textures).
StructuredBuffer<float3> posBuffer : register(t0);
StructuredBuffer<float3> norBuffer : register(t1);
StructuredBuffer<float2> texBuffer : register(t2);
VS_OUTPUT main(VS_INPUT indices)
{
    // In the shader you read the data for the current vertex
    float3 pos = posBuffer[indices.posidx];
    float3 nor = norBuffer[indices.noridx];
    float2 tex = texBuffer[indices.texidx];
    // here you do something
}
Let's call that the "compute shader approach". You must use the DirectX 11 API.
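For the host side, a hedged D3D11 sketch of uploading one of the arrays (positions here) as a structured buffer and binding it to the vertex shader at register(t0); the device/context variables and the position array are assumed to already exist, error handling is omitted, and structured buffers in shaders need feature level 11_0:

    #include <d3d11.h>
    #include <wrl/client.h>
    #include <DirectXMath.h>

    Microsoft::WRL::ComPtr<ID3D11ShaderResourceView> BindPositionsAsSRV(
        ID3D11Device* device, ID3D11DeviceContext* context,
        const DirectX::XMFLOAT3* positions, UINT positionCount)
    {
        // One element per unique position, flagged as a structured buffer.
        D3D11_BUFFER_DESC desc = {};
        desc.ByteWidth           = static_cast<UINT>(positionCount * sizeof(DirectX::XMFLOAT3));
        desc.Usage               = D3D11_USAGE_IMMUTABLE;
        desc.BindFlags           = D3D11_BIND_SHADER_RESOURCE;
        desc.MiscFlags           = D3D11_RESOURCE_MISC_BUFFER_STRUCTURED;
        desc.StructureByteStride = sizeof(DirectX::XMFLOAT3);

        D3D11_SUBRESOURCE_DATA init = {};
        init.pSysMem = positions;

        Microsoft::WRL::ComPtr<ID3D11Buffer> buffer;
        device->CreateBuffer(&desc, &init, &buffer);

        // SRV over the whole buffer; format must be UNKNOWN for structured buffers.
        D3D11_SHADER_RESOURCE_VIEW_DESC srvDesc = {};
        srvDesc.Format              = DXGI_FORMAT_UNKNOWN;
        srvDesc.ViewDimension       = D3D11_SRV_DIMENSION_BUFFER;
        srvDesc.Buffer.FirstElement = 0;
        srvDesc.Buffer.NumElements  = positionCount;

        Microsoft::WRL::ComPtr<ID3D11ShaderResourceView> srv;
        device->CreateShaderResourceView(buffer.Get(), &srvDesc, &srv);

        context->VSSetShaderResources(0, 1, srv.GetAddressOf());  // matches register(t0) in HLSL
        return srv;
    }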
You can also bind your indices in the same fashion and do some magic in shaders. In this case you need to find out the current index id; you can probably take it from SV_VertexID.
And you can probably work around these buffers and bind the data some other way (DirectX 9 compatible texture sampling! O_o).
Hope it helps!
How do you implement per instance textures, vertex shaders, and pixel shaders, in the same Vertex Buffer and/or DeviceContext?
I am just trying to find the most efficient way to have different pixel shaders used by the same type of mesh, but colored differently. For example, I would like square and triangle models in the vertex buffer, and for the vertex/pixel/etc. shaders to act differently based on instance data (if the instance data somehow includes "dead", the shaders used to draw opaque shapes with solid colors rather than gradients are used).
Given:
1. Different model templates in the Vertex Buffer: Square & Triangle (more eventually).
2. An Instance Buffer with [n] instances of type Square and/or Triangle, etc.
Guesses - things I am trying to research to do this:
A. Can I add a Texture, VertexShader or PixelShader ID to the buffer data so that HLSL or the InputAssembly can determine which Shader to use at draw time?
B. Can I "Set" multiple Pixel and Vertex Shaders into the DeviceContext, and how do I tell DirectX to "switch" the Vertex Shader that is loaded at render time?
C. How many Shaders of each type, (Vertex, Pixel, Hull, etc), can I associate with model definitions/meshes in the default Vertex Buffer?
D. Can I use some sort of Shader Selector in HLSL?
Related C++ Code
When I create an input layout, can I do this without specifying an actual Vertex Shader, or somehow specify more than one?
NS::ThrowIfFailed(
result = NS::DeviceManager::Device->CreateInputLayout(
NS::ModelRenderer::InitialElementDescription,
2,
vertexShaderFile->Data,
vertexShaderFile->Length,
& NS::ModelRenderer::StaticInputLayout
)
);
When I set the VertexShader and PixelShader, how do I associate them with a particular model in my VertexBuffer? Is it possible to set more than one of each?
DeviceManager::DeviceContext->IASetInputLayout(ModelRenderer::StaticInputLayout.Get());
DeviceManager::DeviceContext->VSSetShader(ModelRenderer::StaticVertexShader.Get(), nullptr, 0);
DeviceManager::DeviceContext->PSSetShader(ModelRenderer::StaticPixelShader.Get(), nullptr, 0);
How do I add a Texture, VertexShader or PixelShader ID to the buffer data so that HLSL or the InputAssembly can determine which Shader to use at draw time?
You can't assign a Pixel Shader ID to a buffer, that's not how the pipeline works.
A. You can bind only one Vertex/Pixel Shader in a device context at a time, which defines your pipeline. Draw your geometry using this shader, then switch to another Vertex/Pixel shader as needed and draw the next geometry...
B. You can use different shaders with the same model, but that's done on the CPU using VSSetShader, PSSetShader... (see the sketch after this list).
C. No, for the same reason as in B (shaders are set on the CPU).
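A minimal host-side sketch of B, drawing the same vertex buffer twice with different pixel shaders by switching state between the draw calls (all the shader/buffer variables and vertex counts are placeholders):

    #include <d3d11.h>

    void DrawAliveAndDead(ID3D11DeviceContext* context,
                          ID3D11InputLayout* layout,
                          ID3D11Buffer* vertexBuffer, UINT stride,
                          ID3D11VertexShader* vs,
                          ID3D11PixelShader* psGradient,
                          ID3D11PixelShader* psSolid,
                          UINT aliveCount, UINT deadStart, UINT deadCount)
    {
        UINT offset = 0;
        context->IASetInputLayout(layout);
        context->IASetVertexBuffers(0, 1, &vertexBuffer, &stride, &offset);
        context->IASetPrimitiveTopology(D3D11_PRIMITIVE_TOPOLOGY_TRIANGLELIST);
        context->VSSetShader(vs, nullptr, 0);

        // "Alive" shapes: gradient pixel shader.
        context->PSSetShader(psGradient, nullptr, 0);
        context->Draw(aliveCount, 0);

        // "Dead" shapes: solid-color pixel shader.
        context->PSSetShader(psSolid, nullptr, 0);
        context->Draw(deadCount, deadStart);
    }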
When I create an input layout, can I do this without specifying an actual Vertex Shader, or somehow specify more than one?
If you don't specify a vertex shader, the pipeline will consider that you draw "null" geometry, which is actually possible (and very fun), but a bit out of context here. If you provide geometry, you need to send the vertex shader bytecode so the runtime can match your geometry layout to the vertex input layout. You can of course create several input layouts by calling the function several times (once per vertex shader/geometry in the worst case, but if two models/vertex shaders have the same layout you can share one).
When I set the VertexShader and PixelShader, how do I associate them with a particular model in my VertexBuffer? Is it possible to set more than one of each?
You bind everything you need (vertex/pixel shaders, vertex/index buffers, input layout) and call Draw (or DrawInstanced).
I completed a tutorial on rendering 2D triangles in DirectX. Now I want to use my knowledge of rendering a single triangle to render multiple triangles, or for that matter multiple objects on screen.
Should I create a list/stack/vector of vertexbuffers and input layouts and then draw each object? Or is there a better approach to this?
My process would be:
Setup directx, including vertex and pixel shaders
Create vertex buffers for each shape that has to be drawn on the screen and store them in an array.
Draw them to the render target (each frame)
Present the render target (each frame)
Please assume very rudimentary knowledge of DirectX and graphics programming in general when answering.
You don't need to create a vertex buffer for each shape; you can create one vertex buffer to store all the vertices of all the triangles and one index buffer to store all the indices of all the shapes, then draw them with the index buffer (a rough D3D11 sketch follows the links below).
I am not familiar with DX11, so I'll just list the links for D3D9 for your reference; I think the concept is the same, just with some API changes.
Index Buffers(Direct3D 9)
Rendering from Vertex and Index buffers
If the triangles are the same shape, just with different positions or colors, you can consider using geometry instancing; it's a powerful way to render multiple copies of the same geometry.
Geometry Instancing
Efficiently Drawing Multiple Instances of Geometry(D3D9)
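A rough, untested D3D11 sketch of the single vertex buffer / single index buffer idea (the Vertex layout, the device/context variables and the buffer contents are placeholders; error handling and resource release are omitted):

    #include <d3d11.h>
    #include <DirectXMath.h>
    #include <vector>
    #include <cstdint>

    struct Vertex { DirectX::XMFLOAT3 pos; DirectX::XMFLOAT4 color; };

    void CreateAndDrawAll(ID3D11Device* device, ID3D11DeviceContext* context,
                          const std::vector<Vertex>& vertices,
                          const std::vector<uint32_t>& indices)
    {
        // One buffer with every shape's vertices, one with every shape's indices.
        ID3D11Buffer* vb = nullptr;
        ID3D11Buffer* ib = nullptr;
        CD3D11_BUFFER_DESC vbDesc(UINT(vertices.size() * sizeof(Vertex)), D3D11_BIND_VERTEX_BUFFER);
        CD3D11_BUFFER_DESC ibDesc(UINT(indices.size() * sizeof(uint32_t)), D3D11_BIND_INDEX_BUFFER);
        D3D11_SUBRESOURCE_DATA vbData = { vertices.data() };
        D3D11_SUBRESOURCE_DATA ibData = { indices.data() };
        device->CreateBuffer(&vbDesc, &vbData, &vb);
        device->CreateBuffer(&ibDesc, &ibData, &ib);

        // Per frame: bind once, draw everything with a single call.
        UINT stride = sizeof(Vertex), offset = 0;
        context->IASetVertexBuffers(0, 1, &vb, &stride, &offset);
        context->IASetIndexBuffer(ib, DXGI_FORMAT_R32_UINT, 0);
        context->IASetPrimitiveTopology(D3D11_PRIMITIVE_TOPOLOGY_TRIANGLELIST);
        context->DrawIndexed(UINT(indices.size()), 0, 0);
    }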
I don't know much about DirectX, but the general rule in GPU rendering is to use separate vertex and index buffers for every mesh.
Although there is nothing stopping you from using a single vertex buffer with many index buffers; in fact, you may get some performance gains that way, especially for small meshes...
You'll need just one vertex buffer to do this, and then batch the draws.
Here is what you can do: make an array/vector holding the triangle information, let's say (pseudo-code):
struct TriangleInfo{
..... texture;
vect2 pos;
vect2 dimension;
float rot;
}
then in your draw method:
for (int i = 0; i < vector.size(); i++) {
    TriangleInfo tInfo = vector[i];
    matrix worldMatrix = Transpose(matrix(tInfo.dimension) * matrix(tInfo.rot) * matrix(tInfo.pos));
    shaderParameters.worldMatrix = worldMatrix; // info for the constant buffer
    ..
    ..
    dctx->PSSetShaderResources(0, 1, &tInfo.texture);
    dctx->Draw(4, 0); // Draw takes (vertex count, start vertex): 4 vertices starting at 0
}
then in your vertex shader:
cbuffer cbParameters : register( b0 ) {
float4x4 worldMatrix;
};
VOut main(float4 position : POSITION, float4 texCoord : TEXCOORD)
{
....
output.position = mul(position,worldMatrix);
...
}
Remember, this is all pseudo-code, but it should give you the idea. There is a problem, though, if you are planning to draw a lot of triangles, say 1000: this is probably not the best option. You should use DrawIndexed and modify the vertex positions of each triangle, or use DrawInstanced, which is simpler, to send all the information in just one draw call, because calling Draw once per triangle is very heavy for large amounts. A rough instancing sketch follows.
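A hedged D3D11 sketch of that DrawInstanced path (untested; the semantic names, the quad/instance buffers and the counts are placeholders, and the input layout creation, shaders and instance buffer updates are omitted). Slot 0 holds the shared quad, slot 1 holds a per-instance world matrix passed as four float4 rows:

    #include <d3d11.h>
    #include <DirectXMath.h>

    // Input layout: per-vertex data in slot 0, per-instance world matrix in slot 1.
    // Pass this to CreateInputLayout together with the vertex shader bytecode.
    static const D3D11_INPUT_ELEMENT_DESC kLayout[] = {
        { "POSITION", 0, DXGI_FORMAT_R32G32B32_FLOAT,    0,  0, D3D11_INPUT_PER_VERTEX_DATA,   0 },
        { "TEXCOORD", 0, DXGI_FORMAT_R32G32_FLOAT,       0, 12, D3D11_INPUT_PER_VERTEX_DATA,   0 },
        { "WORLD",    0, DXGI_FORMAT_R32G32B32A32_FLOAT, 1,  0, D3D11_INPUT_PER_INSTANCE_DATA, 1 },
        { "WORLD",    1, DXGI_FORMAT_R32G32B32A32_FLOAT, 1, 16, D3D11_INPUT_PER_INSTANCE_DATA, 1 },
        { "WORLD",    2, DXGI_FORMAT_R32G32B32A32_FLOAT, 1, 32, D3D11_INPUT_PER_INSTANCE_DATA, 1 },
        { "WORLD",    3, DXGI_FORMAT_R32G32B32A32_FLOAT, 1, 48, D3D11_INPUT_PER_INSTANCE_DATA, 1 },
    };

    void DrawAllInstances(ID3D11DeviceContext* context,
                          ID3D11Buffer* quadVB, UINT vertexStride,
                          ID3D11Buffer* instanceVB, UINT instanceCount)
    {
        ID3D11Buffer* buffers[2] = { quadVB, instanceVB };
        UINT strides[2] = { vertexStride, sizeof(DirectX::XMFLOAT4X4) };
        UINT offsets[2] = { 0, 0 };
        context->IASetVertexBuffers(0, 2, buffers, strides, offsets);
        context->IASetPrimitiveTopology(D3D11_PRIMITIVE_TOPOLOGY_TRIANGLESTRIP);
        // 4 vertices per quad, one instance per TriangleInfo entry, all in one call.
        context->DrawInstanced(4, instanceCount, 0, 0);
    }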