My renderer supports 2 vertex formats:
typedef struct
{
packed_float3 position;
packed_float2 texcoord;
packed_float3 normal;
packed_float4 tangent;
packed_float4 color;
} vertex_ptntc;
typedef struct
{
packed_float3 position;
packed_float2 texcoord;
packed_float4 color;
} vertex_ptc;
One of the vertex shaders in my shader library has the following signature:
vertex ColorInOut unlit_vertex(device vertex_ptc* vertex_array [[ buffer(0) ]],
constant uniforms_t& uniforms [[ buffer(1) ]],
unsigned int vid [[ vertex_id ]])
Some of the meshes rendered by this shader will use one format and some will use the other. How do I support both formats? This shader only uses the attributes in vertex_ptc. Do I have to write another vertex shader?
When defining the shader function argument as an array of structures (as you're doing), the structure definition in the shader vertex function must match the exact shape and size of the actual structures in the buffer (including padding).
Have you considered defining the input in terms of the [[stage_in]] qualifier, and vertex descriptors? This will allow you to massage the vertex input on a shader-by-shader basis, by using an [[attribute(n)]] qualifier on each element of the structure declared for each shader function. You would define a vertex descriptor for each structure.
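As a rough sketch of that approach, the vertex function below declares only the three attributes it reads and receives them through [[stage_in]]. UnlitVertexIn and modelViewProjection are hypothetical names, and the contents of ColorInOut and uniforms_t are assumed here because the question does not show them.

```metal
#include <metal_stdlib>
using namespace metal;

// Assumed definitions; the question does not show these two types.
struct uniforms_t {
    float4x4 modelViewProjection;
};

struct ColorInOut {
    float4 position [[position]];
    float2 texcoord;
    float4 color;
};

// Hypothetical per-shader input: only the attributes this shader reads.
// The vertex descriptor, not this struct, decides where each attribute
// lives inside the vertex buffer.
struct UnlitVertexIn {
    float3 position [[attribute(0)]];
    float2 texcoord [[attribute(1)]];
    float4 color    [[attribute(2)]];
};

vertex ColorInOut unlit_vertex(UnlitVertexIn in [[stage_in]],
                               constant uniforms_t& uniforms [[buffer(1)]])
{
    ColorInOut out;
    out.position = uniforms.modelViewProjection * float4(in.position, 1.0);
    out.texcoord = in.texcoord;
    out.color = in.color;
    return out;
}
```

Since every member of both structures is a packed type, a descriptor for vertex_ptc would place attributes 0, 1 and 2 at offsets 0, 12 and 20 with a stride of 36 bytes, while a descriptor for vertex_ptntc would use offsets 0, 12 and 48 with a stride of 64 bytes. The descriptor is part of the render pipeline descriptor, so this would mean two pipeline states built from the same function, with the vertex data kept at buffer index 0 so it does not collide with the uniforms at index 1.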
Related
I have an MTLTexture in RGBA8Unorm format, and a screen texture (in MTKView) in BGRA8Unorm format (reversed). In the Metal shader, when I sample from that texture using sample(), I get float4. When I write to texture in metal shader, I also write float4. It seems that when I am inside the shader code, float4 always represents the same order of components RGBA regardless of the original format the texture is in ([0] for red, [1] for green, [2] for blue, and [3] for alpha). Is my conclusion correct that the meaning of the components of the sampled/written float4 is always the same inside the shader, regardless of what the storage format of the texture is?
UPDATE: I use the following code to write to a texture with RGBA8Unorm format:
kernel void
computeColourMap(constant Uniforms &uniforms [[buffer(0)]],
constant array<float, 120> &amps [[buffer(1)]],
constant array<float, 120> &red [[buffer(2)]],
constant array<float, 120> &green [[buffer(3)]],
constant array<float, 120> &blue [[buffer(4)]],
texture2d<float, access::write> output [[texture(0)]],
uint2 id [[thread_position_in_grid]])
{
if (id.x >= output.get_width() || id.y >= output.get_height()) {
return;
}
uint i = id.x % 120;
float4 col (0, 0, 0, 1);
col.x += amps[i] * red[i];
col.y += amps[i] * green[i];
col.z += amps[i] * blue[i];
output.write(col, id);
}
I then use the following shaders for the rendering stage:
vertex VertexOut
vertexShader(const device VertexIn *vertexArray [[buffer(0)]],
unsigned int vid [[vertex_id]])
{
VertexIn vertex_in = vertexArray[vid];
VertexOut vertex_out;
vertex_out.position = vertex_in.position;
vertex_out.textureCoord = vertex_in.textureCoord;
return vertex_out;
}
fragment float4
fragmentShader(VertexOut interpolated [[stage_in]],
texture2d<float> colorTexture [[ texture(0) ]])
{
const float4 colorSample = colorTexture.sample(nearestSampler,
interpolated.textureCoord);
return colorSample;
}
where colorTexture, passed into the fragment shader, is the texture I generated in RGBA8Unorm format. In Swift I have:
let renderPipelineDescriptor = MTLRenderPipelineDescriptor()
renderPipelineDescriptor.vertexFunction = library.makeFunction(name: "vertexShader")!
renderPipelineDescriptor.fragmentFunction = library.makeFunction(name: "fragmentShader")!
renderPipelineDescriptor.colorAttachments[0].pixelFormat = colorPixelFormat
The colorPixelFormat of the MTKView is BGRA8Unorm (reversed relative to my texture), so the two formats differ, yet the colours on the screen come out correct.
UPDATE 2: A further hint that a float4 colour inside a shader is always ordered RGBA: the float4 type has accessors named v.r, v.g, v.b, v.rgb, and so on.
The vector always has 4 components, but the type of the components is not necessarily float. When you declare a texture, you specify the component type as a template argument (texture2d<float ...> in your code).
For example, from Metal Shading Language Specification v2.1, section 5.10.1:
The following member functions can be used to sample from a 1D
texture.
Tv sample(sampler s, float coord) const
Tv is a 4-component vector type based on the templated type used
to declare the texture type. If T is float, Tv is float4. If T is half,
Tv is half4. If T is int, Tv is int4. If T is uint, Tv is uint4. If T
is short, Tv is short4 and if T is ushort, Tv is ushort4.
The same Tv type is used in the declaration of write(). The functions for other texture types are documented in a similar manner.
And, yes, component .r always contains the red component (if present), etc. And [0] always corresponds to .r (or .x).
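A minimal sketch of what that means in practice, assuming a hypothetical kernel named fillRed that is not part of the question:

```metal
#include <metal_stdlib>
using namespace metal;

// Hypothetical kernel: fills the bound texture with opaque red.
kernel void fillRed(texture2d<float, access::write> output [[texture(0)]],
                    uint2 id [[thread_position_in_grid]])
{
    // Skip threads that fall outside the texture.
    if (id.x >= output.get_width() || id.y >= output.get_height()) {
        return;
    }
    float4 col = float4(0.0, 0.0, 0.0, 1.0);
    col[0] = 1.0;   // index 0 is the same component as col.r and col.x
    output.write(col, id);
}
```

The shader source is identical whether the texture bound at index 0 is RGBA8Unorm or BGRA8Unorm. The reordering into the storage layout happens when the value is written, so in both cases the texture should end up red.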
I am trying to add a smudge effect to my paint brush project. To achieve that, I think I need to sample the current results (which are in paintedTexture) from the start of the brush stroke coordinates and pass them to the fragment shader.
I have a vertex shader such as:
vertex VertexOut vertex_particle(
device Particle *particles [[buffer(0)]],
constant RenderParticleParameters *params [[buffer(1)]],
texture2d<half> imageTexture [[ texture(0) ]],
texture2d<half> paintedTexture [[ texture(1) ]],
uint instance [[instance_id]])
{
VertexOut out;
And a fragment shader such as:
fragment half4 fragment_particle(VertexOut in [[ stage_in ]],
half4 existingColor [[color(0)]],
texture2d<half> brushTexture [[ texture(0) ]],
float2 point [[ point_coord ]]) {
Is it possible to create a clipped texture from the paintedTexture and send it to the fragment shader?
paintedTexture is the current results that have been painted to the canvas. I would like to create a new texture from paintedTexture using the same area as the brush texture and pass it to the fragment shader.
The existingColor [[color(0)]] in the fragment shader is of no use since it is the current color, not the color at the beginning of a stroke. If I use existingColor, it's like using transparency (or a transfer mode based on what math is used to combine it with a new color).
If I am barking up the wrong tree, any suggestions on how to achieve a smudging effect with Metal would potentially be acceptable answers.
Update: I tried using a texture2d in the VertexOut struct:
struct VertexOut {
float4 position [[ position ]];
float point_size [[ point_size ]];
texture2d<half> paintedTexture;
}
But it fails to compile with the error:
vertex function has invalid return type 'VertexOut'
It doesn't seem possible to have an array in the VertexOut struct either (which isn't nearly as ideal as a texture, but it could be a path forward):
struct VertexOut {
float4 position [[ position ]];
float point_size [[ point_size ]];
half4 paintedPixels[65536];
}
Gives me the error:
type 'VertexOut' is not valid for attribute 'stage_in'
It's not possible for shaders to create textures. They could fill an existing one, but I don't think that's what you want or need, here.
I would expect you could pass paintedTexture to the fragment shader and use the vertex shader to note where, from that texture, to sample. So, just coordinates.
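A rough sketch of that idea follows. The layout of Particle and the fields strokeStartUV and brushSizeUV are assumptions made up for illustration, since the question does not show them. It also assumes that paintedTexture is a snapshot taken at the start of the stroke and not the texture currently being rendered into.

```metal
#include <metal_stdlib>
using namespace metal;

// Hypothetical particle layout; the real struct is not shown in the question.
struct Particle {
    float2 position;       // clip-space centre of the brush stamp
    float  size;           // point size in pixels
    float2 strokeStartUV;  // where the stroke began, in paintedTexture coordinates
    float2 brushSizeUV;    // brush footprint, in paintedTexture coordinates
};

struct VertexOut {
    float4 position [[position]];
    float  pointSize [[point_size]];
    float2 strokeStartUV;  // only coordinates cross the stage boundary
    float2 brushSizeUV;
};

vertex VertexOut vertex_particle(const device Particle *particles [[buffer(0)]],
                                 uint instance [[instance_id]])
{
    Particle p = particles[instance];
    VertexOut out;
    out.position = float4(p.position, 0.0, 1.0);
    out.pointSize = p.size;
    out.strokeStartUV = p.strokeStartUV;
    out.brushSizeUV = p.brushSizeUV;
    return out;
}

fragment half4 fragment_particle(VertexOut in [[stage_in]],
                                 texture2d<half> brushTexture [[texture(0)]],
                                 texture2d<half> paintedTexture [[texture(1)]],
                                 float2 point [[point_coord]])
{
    constexpr sampler s(address::clamp_to_edge, filter::linear);
    // Brush shape at this fragment of the point sprite.
    half4 brush = brushTexture.sample(s, point);
    // Matching location in the brush-sized window around the stroke start.
    float2 uv = in.strokeStartUV + (point - 0.5) * in.brushSizeUV;
    half4 picked = paintedTexture.sample(s, uv);
    // Carry the picked-up colour, masked by the brush alpha.
    return half4(picked.rgb, picked.a * brush.a);
}
```

Here paintedTexture is bound to the fragment stage with the render command encoder, and no clipped copy is created. The window of the texture that gets read is defined entirely by the two coordinate fields.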
I add a stock smoke SCNParticleSystem to a node in a scene in an ARSCNView. This works as expected.
I use ARSCNView.snapshot() to capture an image and process it before drawing it in the MTKView draw() method.
I then call removeAllParticleSystems() on the main thread on the node with the particle system and remove the node from the scene with removeFromParent().
I then add other nodes to the scene and eventually, the app crashes with the error validateFunctionArguments:3469: failed assertion 'Vertex Function(uberparticle_vert): missing buffer binding at index 19 for vertexBuffer.1[0].'
The All Exceptions breakpoint often stops on the ARSCNView.snapshot() call.
Why is this crashing?
What does the error mean?
How should I be adding and removing particle systems from scenes in ARSCNViews?
UPDATE:
I hooked up the MTKView subclass I use from here to a working ARKit demo with particle system and the same Vertex Function crash occurs.
Does that mean the issue is with the passthrough vertex shader function?
Why are the particle systems treated differently?
Below are the shader functions. Thanks.
#include <metal_stdlib>
using namespace metal;
// Vertex input/output structure for passing results from vertex shader to fragment shader
struct VertexIO
{
float4 position [[position]];
float2 textureCoord [[user(texturecoord)]];
};
// Vertex shader for a textured quad
vertex VertexIO vertexPassThrough(device packed_float4 *pPosition [[ buffer(0) ]],
device packed_float2 *pTexCoords [[ buffer(1) ]],
uint vid [[ vertex_id ]])
{
VertexIO outVertex;
outVertex.position = pPosition[vid];
outVertex.textureCoord = pTexCoords[vid];
return outVertex;
}
// Fragment shader for a textured quad
fragment half4 fragmentPassThrough(VertexIO inputFragment [[ stage_in ]],
texture2d<half> inputTexture [[ texture(0) ]],
sampler samplr [[ sampler(0) ]])
{
return inputTexture.sample(samplr, inputFragment.textureCoord);
}
In GLSL, I would simply use out vec3 array[10];, to pass an array from the vertex shader to the fragment shader. In Metal, however, I thought to do it this way:
struct FragmentIn {
float4 position [[position]];
float3 array[10];
};
This produces the following error:
Type 'FragmentIn' is not valid for attribute 'stage_in' because field 'array' has invalid type 'float3 [10]'
How can I solve this issue? I need to do certain calculations with per-vertex data that the fragment shader will use.
You need to "unroll" the array:
struct FragmentIn {
float4 position [[position]];
float3 thing0;
float3 thing1;
float3 thing2;
float3 thing3;
float3 thing4;
float3 thing5;
float3 thing6;
float3 thing7;
float3 thing8;
float3 thing9;
};
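A shortened sketch of how the unrolled fields might be filled and read back, using three values instead of ten to keep it brief. The function names, the positions buffer and the placeholder arithmetic are all assumptions for illustration.

```metal
#include <metal_stdlib>
using namespace metal;

struct FragmentIn {
    float4 position [[position]];
    float3 thing0;
    float3 thing1;
    float3 thing2;
};

vertex FragmentIn unrolled_vertex(const device packed_float3 *positions [[buffer(0)]],
                                  uint vid [[vertex_id]])
{
    float3 p = float3(positions[vid]);

    // A local array is fine inside the function body.
    float3 values[3];
    for (int i = 0; i < 3; ++i) {
        values[i] = p * float(i + 1);   // placeholder per-vertex calculation
    }

    // Copy each element into its own field of the output struct.
    FragmentIn out;
    out.position = float4(p, 1.0);
    out.thing0 = values[0];
    out.thing1 = values[1];
    out.thing2 = values[2];
    return out;
}

fragment float4 unrolled_fragment(FragmentIn in [[stage_in]])
{
    // Rebuild a local array from the interpolated fields if indexing is needed.
    float3 values[3] = { in.thing0, in.thing1, in.thing2 };
    float3 sum = float3(0.0);
    for (int i = 0; i < 3; ++i) {
        sum += values[i];
    }
    return float4(sum, 1.0);
}
```

The restriction applies only to the struct that crosses the stage boundary. Arrays declared inside either function are unaffected.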
I'm making a sprite batcher that can deal with more than one texture per batch. A sprite's data is stored in a large uniforms buffer, which is sent to the GPU as soon as the sprite batch is full. I first tried assuming there would be 16 textures, some of them unused, with the fragment shader picking the right texture based on the textureID sent through the instance uniforms. This yielded roughly 60 fps with 800-1000 sprites on an iPhone 5s. I then tested with a single texture and got a satisfying 2000 sprites at 60 fps. Knowing I would still need to be able to swap textures, I decided to use a texture array and bind one texture with 16 slices. If I render using texture index 0, the fps is the same as it was with the single-slice texture. Once I use further slices, however, performance drops massively.
Here is the shader:
struct VertexIn {
packed_float2 position [[ attribute(0) ]];
packed_float2 texCoord [[ attribute(1) ]];
};
struct VertexOut {
float4 position [[position]];
float2 texCoord;
uint iid;
};
struct InstanceUniforms {
float3x2 transformMatrix;
float2 uv;
float2 uvLengths;
float textureID;
};
vertex VertexOut spriteVertexShader(const device VertexIn *vertex_array [[ buffer(0) ]],
const device InstanceUniforms *instancedUniforms [[ buffer(1) ]],
uint vid [[ vertex_id ]],
uint iid [[ instance_id ]]) {
VertexIn vertexIn = vertex_array[vid];
InstanceUniforms instanceUniforms = instancedUniforms[iid];
VertexOut vertexOut;
vertexOut.position = float4(instanceUniforms.transformMatrix * float3(vertexIn.position, 1.0), 0.0, 1.0);
vertexOut.texCoord = instanceUniforms.uv + vertexIn.texCoord * instanceUniforms.uvLengths;
vertexOut.iid = iid;
return vertexOut;
}
fragment float4 spriteFragmentShader(VertexOut interpolated [[ stage_in ]],
const device InstanceUniforms *instancedUniforms [[ buffer(0) ]],
texture2d_array<float> tex [[ texture(0) ]],
sampler sampler2D [[ sampler(0) ]],
float4 dst_color [[ color(0) ]]) {
InstanceUniforms instanceUniforms = instancedUniforms[interpolated.iid];
float2 texCoord = interpolated.texCoord;
return tex.sample(sampler2D, texCoord, uint(instanceUniforms.textureID));
}
Everything is working exactly as expected until I use a texture slice greater than 0. I am using instanced rendering. All sprites share the same vertex and index buffer.