I've got a Metal pipeline working.
I'm rendering the live face geometry,
captured from the TrueDepth camera on an iPhone X.
I grab the ARFaceGeometryfrom the ARSessionDelegate every frame.
I pass the data to my framework into the metal pipeline.
Tho the render is at 2.5fps...
Here's the render pipeline: PixelsRender.swift
Data: xyz's uv's & an index array.
The ARFaceGeometry consists of 2304 triangles.
I timed the render pipeline:
[1.086ms] Command Buffer
[0.006ms] Input Texture
[0.054ms] Drawable
[0.110ms] Command Encoder
[0.123ms] Uniforms
[0.006ms] Uniform Arrays
[0.009ms] Fragment Texture
[68.015ms] Vertices
[0.002ms] Vertex Uniforms
[0.000ms] Custom Vertex Texture
[0.027ms] Draw
[0.036ms] Encode
[80.207ms] All CPU
[346.936ms] GPU
[431.035ms] All CPU + GPU
[434.100ms] Total
It's all the vertices that take such a long time to render.
Is there a way to cache the the memory space on the GPU or something?
I've got a lot to optimise, I'm sure, tho anything obvious I'm missing?
Here's my friends face:
Update (Solved)
I was mistaking instances for triangles!
It was in the main draw func. (Thanks Ken Thomases for catching this)
commandEncoder.drawPrimitives(type: vertices.type,
vertexStart: 0,
vertexCount: vertices.vertexCount,
instanceCount: 1 /* previously triangle count of 2304 */)
The new GPU time:
[2.769ms] GPU
You were unintentionally using instanced drawing, by passing a value greater than 1 for the instanceCount: parameter. That basically multiplies the amount of rendering work the GPU has to do. So, if you don't actually need/want instanced drawing, pass 1 there.
Related
The Metal Shading Language includes a lot of mathematic functions, but it seems most of the codes inside Metal official documentation just use it to map vertexes from pixel space to clip space like
RasterizerData out;
out.clipSpacePosition = vector_float4(0.0, 0.0, 0.0, 1.0);
float2 pixelSpacePosition = vertices[vertexID].position.xy;
vector_float2 viewportSize = vector_float2(*viewportSizePointer);
out.clipSpacePosition.xy = pixelSpacePosition / (viewportSize / 2.0);
out.color = vertices[vertexID].color;
return out;
Except for GPGPU using kernel functions to do parallel computation, what things that vertex function can do, with some examples? In a game, if all vertices positions are calculated by the CPU, why GPU still matters? What does vertex function do usually?
Vertex shaders compute properties for vertices. That's their point. In addition to vertex positions, they also calculate lighting normals at each vertex. And potentially texture coordinates. And various material properties used by lighting and shading routines. Then, in the fragment processing stage, those values are interpolated and sent to the fragment shader for each fragment.
In general, you don't modify vertices on the CPU. In a game, you'd usually load them from a file into main memory, put them into a buffer and send them to the GPU. Once they're on the GPU you pass them to the vertex shader on each frame along with model, view, and projection matrices. A single buffer containing the vertices of, say, a tree or a car's wheel might be used multiple times. Each time all the CPU sends is the model, view, and projection matrices. The model matrix is used in the vertex shader to reposition and scale the vertice's positions in world space. The view matrix then moves and rotates the world around so that the virtual camera is at the origin and facing the appropriate way. Then the projection matrix modifies the vertices to put them into clip space.
There are other things a vertex shader can do, too. You can pass in vertices that are in a grid in the x-y plane, for example. Then in your vertex shader you can sample a texture and use that to generate the z-value. This gives you a way to change the geometry using a height map.
On older hardware (and some lower-end mobile hardware) it was expensive to do calculations on a texture coordinate before using it to sample from a texture because you lose some cache coherency. For example, if you wanted to sample several pixels in a column, you might loop over them adding an offset to the current texture coordinate and then sampling with the result. One trick was to do the calculation on the texture coordinates in the vertex shader and have them automatically interpolated before being sent to the fragment shader, then doing a normal look-up in the fragment shader. (I don't think this is an optimization on modern hardware, but it was a big win on some older models.)
First, I'll address this statement
In a game, if all vertices positions are calculated by the CPU, why GPU still matters? What does vertex function do usually?
I don't believe I've seen anyone calculating positions for meshes that will be later used to render them on a GPU. It's slow, you would need to get all this data from CPU to a GPU (which means copying it through a bus if you have a dedicated GPU). And it's just not that flexible. There are much more things other than vertex positions that are required to produce any meaningful image and calculating all this stuff on CPU is just wasteful, since CPU doesn't care for this data for the most part.
The sole purpose of vertex shader is to provide rasterizer with primitives that are in clip space. But there are some other uses that are mostly tricks based on different GPU features.
For example, vertex shaders can write out some data to buffers, so, for example, you can stream out transformed geometry if you don't want to transform it again at a later vertex stage if you have multi-pass rendering that uses the same geometry in more than one pass.
You can also use vertex shaders to output just one triangle that covers the whole screen, so that fragment shaders gets called one time per pixel for the whole screen (but, honestly, you are better of using compute (kernel) shaders for this).
You can also write out data from vertex shader and not generate any primitives. You can do that by generating degenerate triangles. You can use this to generate bounding boxes. Using atomic operations you can update min/max positions and read them at a later stage. This is useful for light culling, frustum culling, tile-based processing and many other things.
But, and it's a BIG BUT, you can do most of this stuff in a compute shader without incurring GPU to run all the vertex assembly pipeline. That means, you can do full-screen effects using just a compute shader (instead of vertex and fragment shader and many pipeline stages in between, such as rasterizer, primitive culling, depth testing and output merging). You can calculate bounding boxes and do light culling or frustum culling in compute shader.
There are reasons to fire up the whole rendering pipeline instead of just running a compute shader, for example, if you will still use triangles that are output from vertex shader, or if you aren't sure how primitives are laid out in memory so you need vertex assembler to do the heavy lifting of assembling primitives. But, getting back to your point, almost all of the reasonable uses for vertex shader include outputting primitives in clip space. If you aren't using resulting primitives, it's probably best to stick to compute shaders.
I want to implement a collision detector between a moving and a static object. The way I am thinking of doing so is by checking in vertex shader every time if any vertex of the moving object intersects with the position of the static object.
By doing the above, I would get the point of collision in the vertex shader, but I want to use the variable for rendering purposes in the js file.
Is there a way to do it.
In WebGL 1 you can not directly read any data from a vertex shader. The best you can do is use the vertex shader to affect the pixels rendered in the fragment shader. So you could for example set gl_Position so nothing is rendered if it fails your test and a single pixel is rendered if the test passes. Or you can set some varying that sets certain colors based on your test results. Then you can either read the pixel with gl.readPixels or you can just pass the texture you wrote to to another shader in a different draw calls.
In WebGL2 you can use transform feedback to allow a vertex shader to write its varyings to a buffer. You can then use that buffer in other draw calls or read it's contents with gl.getSubBuffer
In WebGL2 you can also do occlusion queries which means you can try to draw something and test if it was actually drawn or if the depth buffer prevented it from being drawn.
For some scientific data visualization, I am drawing a large float array using WebGL. The dataset is two-dimensional and typically hundreds or few thousands of values in height and several tens of thousands values in width.
To fit this dataset into video memory, I cut it up into several non-square textures (depending on MAX_TEXTURE_SIZE) and display them next to one another. I use the same shader with a single sampler2d to draw all the textures. This means that I have to iterate over all the textures for drawing:
for (var i=0; i<dataTextures.length; i++) {
gl.activeTexture(gl.TEXTURE0+i);
gl.bindTexture(gl.TEXTURE_2D, dataTextures[i]);
gl.uniform1i(samplerUniform, i);
gl.bindBuffer(gl.ARRAY_BUFFER, vertexPositionBuffers[i]);
gl.vertexAttribPointer(vertexPositionAttribute, 2, gl.FLOAT, false, 0, 0);
gl.drawArrays(gl.TRIANGLE_STRIP, 0, 4);
}
However, if the number of textures gets larger than half a dozen, performance becomes quite bad. Now I know that games use quite a few more textures than that, so this can't be expected behavior. I also read that you can bind arrays of samplers, but as far as I can tell, the total number of texture has to be known ahead of time. For me, the number of textures depends on the dataset, so I can't know it before loading the data.
Also, I suspect that I am doing unnecessary things in this render loop. Any hints would be welcome.
How would you normally draw a variable number of textures in WebGL?
Here's a few previous answers that will help
How to bind an array of textures to a WebGL shader uniform?
How to send multiple textures to a fragment shader in WebGL?
How many textures can I use in a webgl fragment shader?
Some ways off the top if my head
Create a shader that loops over N textures. Set the textures you're not using to some 1x1 pixel texture with 0,0,0,0 in it or something else that doesn't effect your calculations
Create a shader that loops over N textures. Create a uniform boolean array, in the loop skip any texture who's corresponding boolean value is false.
Generate a shader on the fly that has exactly the number of textures you need. It shouldn't be that hard to concatinate a few strings etc..
I have a vertex buffer with an unordered access view, which I'm using to fill the vertices using a compute shader, which treats the UAV as a RWStructuredBuffer, using an equivalent struct to the vertex definition. There are 216000 vertices (i.e. 60 x 60 x 60). But my compute shader seems to fill only about 8000 of them, leaving the rest with their initial values. Is there a limit on the number of elements in a structured buffer that can be written in this way?
As it turns out, if you turn on DirectX error-checking, assigning the UAV of a vertex buffer as a RWStructuredBuffer in the shader is considered to be an error. So although this actually works (for a limited number of vertices), it's not supported.
How do you implement per instance textures, vertex shaders, and pixel shaders, in the same Vertex Buffer and/or DeviceContext?
I am just trying to find the most efficient way to have different pixel shaders used by the same type of mesh, but colored differently. For example, I would like square and triangle models in the vertex buffer, and for the vertex/pixel/etc shaders to act differently based on instance data.... (If the instance data includes "dead" somehow, the shaders used to draw opaque shapes with solid colors rather than gradients are used.
Given:
1. Different model templates in Vertex Buffer, Square & Triangl, (more eventually).
Instance Buffer with [n] instances of type Square and/or Triangle, etc.
Guesses:
Things I am trying to Research to do this:
A: Can I add a Texture, VertexShader or PixelShader ID to the buffer data so that HLSL or the InputAssembly can determine which Shader to use at draw time?
B. Can I "Set" multiple Pixel and Vertex Shaders into the DeviceContext, and how do I tell DirectX to "switch" the Vertex Shader that is loaded at render time?
C. How many Shaders of each type, (Vertex, Pixel, Hull, etc), can I associate with model definitions/meshes in the default Vertex Buffer?
D. Can I use some sort of Shader Selector in HLSL?
Related C++ Code
When I create an input layout, can I do this without specifying an actual Vertex Shader, or somehow specify more than one?
NS::ThrowIfFailed(
result = NS::DeviceManager::Device->CreateInputLayout(
NS::ModelRenderer::InitialElementDescription,
2,
vertexShaderFile->Data,
vertexShaderFile->Length,
& NS::ModelRenderer::StaticInputLayout
)
);
When I set the VertexShader and PixelShader, how do I associate them with a particular model in my VertexBuffer? Is it possible to set more than one of each?
DeviceManager::DeviceContext->IASetInputLayout(ModelRenderer::StaticInputLayout.Get());
DeviceManager::DeviceContext->VSSetShader(ModelRenderer::StaticVertexShader.Get(), nullptr, 0);
DeviceManager::DeviceContext->PSSetShader(ModelRenderer::StaticPixelShader.Get(), nullptr, 0);
How do I add a Texture, VertexShader or PixelShader ID to the buffer
data so that HLSL or the InputAssembly can determine which Shader to
use at draw time?
You can't assign a Pixel Shader ID to a buffer, that's not how the pipeline works.
A / You can bind only one Vertex/Pixel Shader in a Device context at a time, which defines your pipeline, draw your geometry using this shader, then switch to another Vertex/Pixel shader as needed, draw next geometry...
B/ you can use different shaders using the same model, but that's done on cpu using VSSetShader, PSSetShader....
C/No, for same reason as in B (shaders are set on the CPU)
When I create an input layout, can I do this without specifying an actual Vertex Shader, or somehow specify more than one?
if you don't specify a vertex shader, the pipeline will consider that you draw "null" geometry, which is actually possible (and very fun), but bit out of context, if you provide geometry you need to send the vertex shader data so the runtime can match your geometry layout to the vertex input layout. You can of course create several input layouts by calling the function several times (once per vertex shader/geometry in worst case, but if two models/vertex shaders have the same layout you can omit it).
When I set the VertexShader and PixelShader, how do I associate them with a particular model in my VertexBuffer? Is it possible to set more than one of each?
You bind everything you need (Vertex/Pixel shaders, Vertex/IndexBuffer,Input layout) and call draw (or drawinstanced).