I'm building an app rendering 2D geometry in Metal.
Right now, the positions of the vertices are solved from within the vertex function. What I'd like is to write the solved positions back to a buffer from inside that same vertex function.
I'm under the impression that this is possible, although in my first attempt to do it, i.e.:
vertex VertexOut basic_vertex(device VertexIn *vertices [[ buffer(0) ]],
                              device VertexOut *solvedVertices [[ buffer(1) ]],
                              uint vid [[ vertex_id ]])
{
    VertexIn in = vertices[vid];

    VertexOut out;
    out.position = ...; // Solve the position of the vertex

    solvedVertices[vid] = out; // Write to the buffer, later to be read by the CPU

    return out;
}
I was graced with the presence of a compile-time error.
Okay, so a few solutions come to mind. I could solve for the vertex positions in a first, non-rasterizing pass through a vertex function declared like:
vertex void solve_vertex(device VertexIn *unsolved [[ buffer(0) ]],
                         device VertexOut *solved [[ buffer(1) ]],
                         uint vid [[ vertex_id ]])
{
    solved[vid] = ...;
}
And then pipe those solved vertices into a now much simpler, rasterizing vertex function.
Another solution that could work, but seems less appealing, would be to solve them in a compute function.
So, what is the best way forward in a situation like this? From my little bit of research I could track down that this same sort of procedure is what Transform Feedback does, but I've had no luck (other than the link at the beginning of the question) finding examples in Apple's documentation/sample code, or elsewhere on the web, of best practices when facing this sort of problem.
Alright, it turns out using a non-rasterizing vertex function is the way to go. There are some things to note, however, for others' future reference:
A non-rasterizing vertex function is simply a vertex function returning void i.e.:
vertex void non_rasterizing_vertex(...) { }
When executing a non-rasterizing "render" pass, the MTLRenderPassDescriptor still needs to have a texture set - for instance in its colorAttachments[0].texture property - for reasons I don't know (I assume it's just due to the fixed-function nature of the GPU's render pipeline).
The MTLRenderPipelineDescriptor used to build the MTLRenderPipelineState needs its rasterizationEnabled property set to false; you can then assign the non-rasterizing vertex function to its vertexFunction property. The fragmentFunction property can remain nil, as expected.
When actually executing the pass, one of the drawPrimitives: methods (the naming of which may be misleading) still needs to be invoked on the configured MTLRenderCommandEncoder. I ended up with a call rendering MTLPrimitiveType.point since that seems the most sensible. A host-side sketch of this setup follows.
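Roughly, the host-side setup described above might look like the following. This is only a sketch: device, library, and dummyTexture are assumed placeholders, and property spellings follow current Swift.

let pipelineDescriptor = MTLRenderPipelineDescriptor()
pipelineDescriptor.vertexFunction = library.makeFunction(name: "non_rasterizing_vertex")
pipelineDescriptor.fragmentFunction = nil // no fragment stage needed
pipelineDescriptor.isRasterizationEnabled = false // spelled rasterizationEnabled in Objective-C
pipelineDescriptor.colorAttachments[0].pixelFormat = dummyTexture.pixelFormat // must match the dummy texture
let pipelineState = try device.makeRenderPipelineState(descriptor: pipelineDescriptor)

let passDescriptor = MTLRenderPassDescriptor()
passDescriptor.colorAttachments[0].texture = dummyTexture // required even though nothing gets drawn
passDescriptor.colorAttachments[0].loadAction = .dontCare
passDescriptor.colorAttachments[0].storeAction = .dontCare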
Doing all of this sets up "rendering" logic ready to write back to vertex buffers from the vertex function - so long as they're in device address space:
vertex void non_rasterizing_vertex(device float *writeableBuffer [[ buffer(0) ]],
                                   uint vid [[ vertex_id ]])
{
    writeableBuffer[vid] = 42; // Write away!
}
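For completeness, encoding the pass that drives this vertex function could look something like the following sketch, where commandQueue, passDescriptor, pipelineState, writeableBuffer, and vertexCount are all assumed to exist already:

guard let commandBuffer = commandQueue.makeCommandBuffer(),
      let encoder = commandBuffer.makeRenderCommandEncoder(descriptor: passDescriptor) else { return }
encoder.setRenderPipelineState(pipelineState)
encoder.setVertexBuffer(writeableBuffer, offset: 0, index: 0)
encoder.drawPrimitives(type: .point, vertexStart: 0, vertexCount: vertexCount) // one invocation per vertex
encoder.endEncoding()
commandBuffer.commit()
commandBuffer.waitUntilCompleted() // or use a completion handler before reading the buffer on the CPU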
This "answer" ended up more like a blog post but I hope it remains useful for future reference.
TODO
I'd still like to investigate performance tradeoffs between doing compute-y work like this in a compute pipeline versus in the rendering pipeline like above. Once I have some more time to do that, I'll update this answer.
The correct solution is to move any code writing to buffers to a compute kernel.
You will lose a great deal of performance writing to buffers in a vertex function. It is optimized for rasterizing, not for computation.
You just need to use a compute command encoder.
guard let computeBuffer = commandQueue.makeCommandBuffer() else { return }
guard let computeEncoder = computeBuffer.makeComputeCommandEncoder() else { return }
computeEncoder.setComputePipelineState(solveVertexPipelineState)
kernel void solve_vertex(device VertexIn *unsolved [[ buffer(0) ]],
                         device VertexOut *solved [[ buffer(1) ]],
                         uint vid [[ thread_position_in_grid ]])
{
    solved[vid] = ...;
}
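For reference, binding the buffers and dispatching the kernel could look roughly like this, continuing the encoder snippet above; unsolvedBuffer, solvedBuffer, and vertexCount are assumed placeholders:

computeEncoder.setBuffer(unsolvedBuffer, offset: 0, index: 0)
computeEncoder.setBuffer(solvedBuffer, offset: 0, index: 1)
let width = solveVertexPipelineState.threadExecutionWidth
let threadsPerThreadgroup = MTLSize(width: width, height: 1, depth: 1)
let threadgroupCount = MTLSize(width: (vertexCount + width - 1) / width, height: 1, depth: 1)
computeEncoder.dispatchThreadgroups(threadgroupCount, threadsPerThreadgroup: threadsPerThreadgroup)
// If vertexCount isn't a multiple of width, add a bounds check on vid in the kernel.
computeEncoder.endEncoding()
computeBuffer.commit()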
I am building a minimalistic 3D engine in Metal, and I want my vertex and fragment shader code to be as reusable as possible, so that my vertex shader can, for instance, be used unchanged regardless of the input mesh's vertex data layout.
An issue I have is that I can't guarantee all meshes will have the same attributes; for instance, a mesh may contain just position and normal data, while another may additionally have UV coordinates attached.
Now my first issue is that if I define my vertex shader input structure like this:
struct VertexIn {
    float3 position [[ attribute(0) ]];
    float3 normal [[ attribute(1) ]];
    float2 textureCoordinate [[ attribute(2) ]];
};
I wonder what the consequence is of doing so if no attribute 2 is specified in my Metal vertex descriptor. My tests seem to indicate there is no crash (at least from merely declaring such an attribute in the input structure), but I wonder whether this is just undefined behavior or whether it is actually safe to do.
Another issue is that I might want to pass the UV coordinates on to the fragment shader (i.e. return them from my vertex shader), but what happens if they are missing? It feels like, unless it is specifically designed to be safe, accessing textureCoordinate to copy its value into a property of some VertexOut structure returned from my vertex shader would be undefined behavior.
Additionally, I notice that Apple's RealityKit framework must have found some way around this issue: it lets users provide "shader modifier" functions that are passed the data of both vertex and fragment shaders so that they can act on it. What surprises me is that the structures those user functions receive define a lot of properties which I am not sure are always defined for all meshes (for instance, a second UV set). This seems pretty similar to the problem I am trying to solve.
Am I missing some obvious way to fix this issue?
I think the intended way to deal with this is function constants. Here is an example of how I handle it in my vertex shaders.
constant bool HasColor0 [[ function_constant(FunctionConstantHasColor0) ]];
constant bool HasNormal [[ function_constant(FunctionConstantHasNormal) ]];
constant bool HasTangent [[ function_constant(FunctionConstantHasTangent) ]];
constant bool HasTexCoord0 [[ function_constant(FunctionConstantHasTexCoord0) ]];
constant bool AlphaMask [[ function_constant(FunctionConstantAlphaMask) ]];
// ...
struct VertexIn
{
    float3 position [[ attribute(AttributeBindingPosition) ]];
    float3 normal [[ attribute(AttributeBindingNormal), function_constant(HasNormal) ]];
    float4 tangent [[ attribute(AttributeBindingTangent), function_constant(HasTangent) ]];
    float4 color [[ attribute(AttributeBindingColor0), function_constant(HasColor0) ]];
    float2 texCoord [[ attribute(AttributeBindingTexcoord0), function_constant(HasTexCoord0) ]];
};
struct VertexOut
{
    float4 positionCS [[ position ]];
    float4 tangentVS = float4();
    float3 positionVS = float3();
    float3 normalVS = float3();
    float2 texCoord = float2();
    half4 color = half4();
};
static VertexOut ForwardVertexImpl(Vertex in, constant CameraUniform& camera, constant MeshUniform& meshUniform)
{
    VertexOut out;

    float4x4 viewModel = camera.view * meshUniform.model;
    float4 positionVS = viewModel * float4(in.position.xyz, 1.0);
    out.positionCS = camera.projection * positionVS;
    out.positionVS = positionVS.xyz;

    float4x4 normalMatrix;
    if(HasNormal || HasTangent)
    {
        normalMatrix = transpose(meshUniform.inverseModel * camera.inverseView);
    }

    if(HasNormal)
    {
        out.normalVS = (normalMatrix * float4(in.normal, 0.0)).xyz;
    }

    if(HasTexCoord0)
    {
        out.texCoord = in.texCoord;
    }

    if(HasColor0)
    {
        out.color = half4(in.color);
    }
    else
    {
        out.color = half4(1.0);
    }

    if(HasTangent)
    {
        // Normal matrix or viewmodel matrix?
        out.tangentVS.xyz = (normalMatrix * float4(in.tangent.xyz, 0.0)).xyz;
        out.tangentVS.w = in.tangent.w;
    }

    return out;
}
vertex VertexOut ForwardVertex(
    VertexIn in [[ stage_in ]],
    constant CameraUniform& camera [[ buffer(BufferBindingCamera) ]],
    constant MeshUniform& meshUniform [[ buffer(BufferBindingMesh) ]])
{
    Vertex v
    {
        .color = in.color,
        .tangent = in.tangent,
        .position = in.position,
        .normal = in.normal,
        .texCoord = in.texCoord,
    };

    return ForwardVertexImpl(v, camera, meshUniform);
}
And in the host application I fill out the MTLFunctionConstantValues object based on the semantics the geometry actually has:
func addVertexDescriptorFunctionConstants(toConstantValues values: MTLFunctionConstantValues) {
    var unusedSemantics = Set<AttributeSemantic>(AttributeSemantic.allCases)
    for attribute in attributes.compactMap({ $0 }) {
        unusedSemantics.remove(attribute.semantic)
        if let constant = attribute.semantic.functionConstant {
            var enabled = true
            values.setConstantValue(&enabled, type: .bool, index: constant)
        }
    }
    for unusedSemantic in unusedSemantics {
        if let constant = unusedSemantic.functionConstant {
            var disabled = false
            values.setConstantValue(&disabled, type: .bool, index: constant)
        }
    }
}
A good thing about this approach is that the compiler should turn those function-constant ifs into branch-free code, so it shouldn't really be a problem at runtime, AND it lets you compile your shaders offline without having to resort to runtime compilation and preprocessor defines.
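To tie this together on the host side, the specialized vertex function is created from those constant values when building the pipeline. A minimal sketch, assuming a library variable and calling the helper above on whatever object defines it:

let constantValues = MTLFunctionConstantValues()
addVertexDescriptorFunctionConstants(toConstantValues: constantValues)
let vertexFunction = try library.makeFunction(name: "ForwardVertex", constantValues: constantValues)
let pipelineDescriptor = MTLRenderPipelineDescriptor()
pipelineDescriptor.vertexFunction = vertexFunction
// ...fragment function, vertex descriptor, color attachments, etc.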
When I try to write to a float texture from a kernel, I get the error:
/SourceCache/AcceleratorKit/AcceleratorKit-17.7/ToolsLayers/Debug/MTLDebugComputeCommandEncoder.mm:596: failed assertion `Non-writable texture format MTLPixelFormatR32Float is being bound at index 2 to a shader argument with write access enabled.'
However, when I go check in the documentation, that format is color-renderable and function-writeable (see table at the bottom):
https://developer.apple.com/library/prerelease/ios/documentation/Metal/Reference/MetalConstants_Ref/index.html#//apple_ref/c/tdef/MTLPixelFormat
Partial code:
// texture creation
MTLTextureDescriptor *floatTextureDescriptor = [MTLTextureDescriptor texture2DDescriptorWithPixelFormat:MTLPixelFormatR32Float width:_width height:_height mipmapped:NO];
_myFloatTexture = [self.device newTextureWithDescriptor:floatTextureDescriptor];
// texture binding
[computeCommandEncoder setTexture:_myFloatTexture atIndex:2];
// texture used in the shader
void kernel myKernel(//...
texture2d<float, access::write> myFloats [[ texture(2) ]],
uint2 gid [[ thread_position_in_grid ]])
Am I doing something wrong or might this be a bug?
Writing to an MTLPixelFormatR32Float texture from a shader is only supported from iOS 9; the table you linked is from the prerelease iOS 9 documentation.
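Also note that once you target iOS 9 or later, the texture has to opt into shader writes via its usage property, which defaults to shader reads only. A minimal Swift sketch, assuming device, width, and height already exist:

let floatTextureDescriptor = MTLTextureDescriptor.texture2DDescriptor(pixelFormat: .r32Float,
                                                                      width: width,
                                                                      height: height,
                                                                      mipmapped: false)
floatTextureDescriptor.usage = [.shaderRead, .shaderWrite] // the default is .shaderRead only
let myFloatTexture = device.makeTexture(descriptor: floatTextureDescriptor)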
I am currently working on implementing dynamic shader linkage in my shader reflection code. It works quite nicely, but to make my code as dynamic as possible I would like to automate the process of getting the offset into the dynamicLinkageArray. Microsoft suggests something like this in their sample:
g_iNumPSInterfaces = pReflector->GetNumInterfaceSlots();
g_dynamicLinkageArray = (ID3D11ClassInstance**) malloc( sizeof(ID3D11ClassInstance*) * g_iNumPSInterfaces );
if ( !g_dynamicLinkageArray )
return E_FAIL;
ID3D11ShaderReflectionVariable* pAmbientLightingVar = pReflector->GetVariableByName("g_abstractAmbientLighting");
g_iAmbientLightingOffset = pAmbientLightingVar->GetInterfaceSlot(0);
I would like to do this without giving the exact name, so that when the shader changes I do not have to manually change this code. To accomplish this I would need to get the name I marked below through shader reflection. Is this possible? I searched through the shader reflection reference but did not find anything useful, besides the number of interface slots (GetNumInterfaceSlots()).
#include "BasicShader_PSBuffers.hlsli"
iBaseLight g_abstractAmbientLighting;
^^^^^^^^^^^^^^^^^^^^^^^^^^
struct PixelInput
{
float4 position : SV_POSITION;
float3 normals : NORMAL;
float2 tex: TEXCOORD0;
};
float4 main(PixelInput input) : SV_TARGET
{
float3 Ambient = (float3)0.0f;
Ambient = g_txDiffuse.Sample(g_samplerLin, input.tex) * g_abstractAmbientLighting.IlluminateAmbient(input.normals);
return float4(saturate(Ambient), 1.0f);
}
If this is not possible, how would one go about this? Should I just add everything I can think of there, so that I have to change as little as possible manually?
Thanks in advance
I have a problem passing a float array to a vertex shader (HLSL) through a constant buffer. I know that each "float" in the array below gets a 16-byte slot all by itself (space equivalent to a float4) due to the HLSL packing rules:
// C++ struct
struct ForegroundConstants
{
DirectX::XMMATRIX transform;
float bounceCpp[64];
};
// Vertex shader constant buffer
cbuffer ForegroundConstantBuffer : register(b0)
{
matrix transform;
float bounceHlsl[64];
};
(Unfortunately, the simple solution here does not work; nothing is drawn after I make that change.)
While the C++ data gets passed, due to the packing rules it gets spaced out such that each "float" in the C++ bounceCpp array ends up in a 16-byte slot all by itself in the bounceHlsl array. This results in a warning similar to the following:
ID3D11DeviceContext::DrawIndexed: The size of the Constant Buffer at slot 0 of the Vertex Shader unit is too small (320 bytes provided, 1088 bytes, at least, expected). This is OK, as out-of-bounds reads are defined to return 0. It is also possible the developer knows the missing data will not be used anyway. This is only a problem if the developer actually intended to bind a sufficiently large Constant Buffer for what the shader expects.
The recommendation, as pointed out here and here, is to rewrite the HLSL constant buffer this way:
cbuffer ForegroundConstantBuffer : register(b0)
{
    matrix transform;
    float4 bounceHlsl[16]; // equivalent to 64 floats.
};

static float temp[64] = (float[64]) bounceHlsl;

float4 main(float4 pos : POSITION) : SV_POSITION
{
    int index = someValueRangeFrom0to63;
    float y = temp[index];
    // Bla bla bla...
}
But that didn't work (i.e. ID3D11Device1::CreateVertexShader never returns). I'm compiling against Shader Model 4 Level 9_1; can you spot anything that I have done wrong here?
Thanks in advance! :)
Regards,
Ben
One solution, albeit non-optimal, is to just declare your float array as
float4 bounceHlsl[16];
then process the index like
float x = ((float[4])(bounceHlsl[i/4]))[i%4];
where i is the index you require.
I've been asked to split the question below into multiple questions:
HLSL and Pix number of questions
This is asking the first question: in HLSL shader model 3, can I run a pixel shader without a vertex shader? In shader model 2 I notice you can, but I can't seem to find a way in 3.
The shader compiles fine, but I then get this error from Visual Studio when calling SpriteBatch.Draw():
"Cannot mix shader model 3.0 with earlier shader models. If either the vertex shader or pixel shader is compiled as 3.0, they must both be."
I don't believe I've defined anything in the shader to use anything earlier than 3, so I'm left a bit confused. Any help would be appreciated.
The problem is that the built-in SpriteBatch shader is 2.0. If you specify a pixel shader only, SpriteBatch still uses its built-in vertex shader. Hence the version mismatch.
The solution, then, is to also specify a vertex shader yourself. Fortunately Microsoft provides the source to XNA's built-in shaders. All it involves is a matrix transformation. Here's the code, modified so you can use it directly:
float4x4 MatrixTransform;
void SpriteVertexShader(inout float4 color : COLOR0,
inout float2 texCoord : TEXCOORD0,
inout float4 position : SV_Position)
{
position = mul(position, MatrixTransform);
}
And then, because SpriteBatch won't set it for you, you set your effect's MatrixTransform yourself. It's a simple projection of "client" space (source: this blog post). Here's the code:
Matrix projection = Matrix.CreateOrthographicOffCenter(0,
GraphicsDevice.Viewport.Width, GraphicsDevice.Viewport.Height, 0, 0, 1);
Matrix halfPixelOffset = Matrix.CreateTranslation(-0.5f, -0.5f, 0);
effect.Parameters["MatrixTransform"].SetValue(halfPixelOffset * projection);
You can try the simple examples here. The greyscale shader is a very good example to understand how a minimal pixel shader works.
Basically, you create an Effect under your content project like this one:
sampler s0;

float4 PixelShaderFunction(float2 coords: TEXCOORD0) : COLOR0
{
    // Black and white (B/N)
    //float4 color = tex2D(s0, coords);
    //color.gb = color.r;

    // Transparent
    float4 color = tex2D(s0, coords);
    return color;
}

technique Technique1
{
    pass Pass1
    {
        PixelShader = compile ps_2_0 PixelShaderFunction();
    }
}
You also need to:
Create an Effect object and load its content.
ambienceEffect = Content.Load<Effect>("Effects/Ambient");
Call your SpriteBatch.Begin() method passing the Effect object you want to use
spriteBatch.Begin( SpriteSortMode.FrontToBack,
BlendState.AlphaBlend,
null,
null,
null,
ambienceEffect,
camera2d.GetTransformation());
Inside the SpriteBatch.Begin() - SpriteBatch.End() block, you must apply the Technique inside the Effect:
ambienceEffect.CurrentTechnique.Passes[0].Apply();