I am trying to add this code to my Metal language file:
constant float3x3 rgb2xyz(
float3(0.412453f, 0.212671f, 0.019334f),
float3(0.357580f, 0.715160f, 0.119193f),
float3(0.180423f, 0.072169f, 0.950227f)
);
or this
constant float3x3 rgb2xyz = float3x3(
float3(0.412453f, 0.212671f, 0.019334f),
float3(0.357580f, 0.715160f, 0.119193f),
float3(0.180423f, 0.072169f, 0.950227f)
);
The Metal compiler gives me the following error:
No matching constructor for initialization of 'const constant float3x3' (aka 'const constant matrix<float, 3, 3>')
However, if I do
typedef struct {
float3x3 matrix;
float3 offset;
float zoom;
} Conversion;
constant Conversion colorConversion = {
.matrix = float3x3(
float3 ( 1.164f, 1.164f, 1.164f ),
float3 ( 0.000f, -0.392f, 2.017f ),
float3 ( 1.596f, -0.813f, 0.000f )
),
.offset = float3 ( -(16.0f/255.0f), -0.5f, -0.5f )
};
I don't get any compile error.
Any ideas what is going wrong? It also works without problems with vector types:
constant float3 bgr2xyzCol1(0.357580f, 0.715160f, 0.119193f);
What would be a good way to define a constant matrix directly in the code?
You should pass it in as a constant reference; see WWDC session 604.
For example, see how the matrices are used in the vertex function below; TransformMatrices is a custom data structure in this case.
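It is not a built-in Metal type; here is a minimal sketch of such a struct, where the member names (modelview_projection_matrix, modelview_matrix, normal_matrix) are assumptions chosen to match the shader body that follows.
// Hypothetical layout; member names are assumptions, not from the original answer.
struct TransformMatrices {
    float4x4 modelview_projection_matrix;
    float4x4 modelview_matrix;
    float3x3 normal_matrix;
};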
vertex VertexOutput my_vertex(const device float3* position_data [[ buffer(0) ]],
                              const device float3* normal_data [[ buffer(1) ]],
                              constant TransformMatrices& matrices [[ buffer(2) ]],
                              uint vid [[ vertex_id ]])
{
    VertexOutput out;
    float3 n_d = normal_data[vid];
    float3 transformed_normal = matrices.normal_matrix * n_d;
    float4 p_d = float4(position_data[vid], 1.0f);
    // member names below are assumptions matching the sketch above
    out.position = matrices.modelview_projection_matrix * p_d;
    float4 eye_vector = matrices.modelview_matrix * p_d;
    ...
    return out;
}
Related
Vertex shaders in Metal can use a [[vertex_id]] attribute on an integer argument, and that argument receives values from 0 up to the number of vertices. Is there a similar thing for fragment shaders?
I want to write some debug info to a buffer, within the fragment shader, as shown below. Then in the CPU code, I would print the contents of the buffer, as a way to debug what is going on in the fragment shader.
struct VertexIn {
float4 position [[attribute(0)]];
};
struct VertexOut {
float4 position [[position]];
};
vertex VertexOut vertex_main(const VertexIn vertex_in [[stage_in]]) {
VertexOut out;
out.position = vertex_in.position;
return out;
}
fragment float4
fragment_main(VertexOut vo [[stage_in]],
uint fragment_id [[fragment_id]], // <-- Hypothetical, doesn't actually work.
device float4 *debug_out [[buffer(0)]],
device uint *debug_out2 [[buffer(1)]])
{
debug_out[fragment_id] = vo.position;
debug_out2[fragment_id] = ...
return float4(0, 1, clamp(vo.position.x/1000, 0.0, 1.0), 1);
}
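There is no built-in [[fragment_id]], but one possible workaround (a sketch, not from the original question) is to derive a per-pixel index from the window-space [[position]] value; the debug_width parameter below is a hypothetical constant holding the render-target width.
fragment float4
fragment_main_debug(VertexOut vo [[stage_in]],
                    constant uint &debug_width [[buffer(2)]], // hypothetical: render-target width in pixels
                    device float4 *debug_out [[buffer(0)]])
{
    // [[position]] carries window coordinates (pixel centers at x.5, y.5),
    // so truncation yields integer pixel coordinates.
    uint2 pixel = uint2(vo.position.xy);
    uint index = pixel.y * debug_width + pixel.x; // one buffer slot per pixel
    debug_out[index] = vo.position;
    return float4(0.0f, 1.0f, clamp(vo.position.x / 1000.0f, 0.0f, 1.0f), 1.0f);
}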
I'm trying to create deferred screen-space decal rendering in Metal by following this article, though I can't seem to figure it out...
These are bounds of the decal...
Actual result...
Potential issue
So apparently it doesn't think that the decal is intersecting the mesh. I'm sampling the depth value correctly, but when calculating the actual position of the pixel in 3D space, something doesn't add up.
Code
vertex VertexOut vertex_decal(
const VertexIn in [[ stage_in ]],
constant DecalVertexUniforms &uniforms [[ buffer(2) ]]
) {
VertexOut out;
out.position = uniforms.projectionMatrix * uniforms.viewMatrix * uniforms.modelMatrix * in.position;
out.viewPosition = (uniforms.viewMatrix * uniforms.modelMatrix * in.position).xyz;
out.normal = uniforms.normalMatrix * in.normal;
out.uv = in.uv;
return out;
}
fragment float4 fragment_decal(
const VertexOut in [[ stage_in ]],
constant DecalFragmentUniforms &uniforms [[ buffer(3) ]],
depth2d<float, access::sample> depthTexture [[ texture(0) ]]
) {
constexpr sampler textureSampler (mag_filter::nearest, min_filter::nearest);
float2 resolution = float2(
depthTexture.get_width(),
depthTexture.get_height()
);
float2 textureCoordinate = in.position.xy / resolution;
float depth = depthTexture.sample(textureSampler, textureCoordinate);
float3 viewRay = in.viewPosition * (uniforms.farClipPlane / in.viewPosition.z);
float3 viewPosition = viewRay * depth;
float3 worldPosition = (uniforms.inverseViewMatrix * float4(viewPosition, 1)).xyz;
float3 objectPosition = (uniforms.inverseModelMatrix * float4(worldPosition, 1)).xyz;
float distX = 0.5 - abs(objectPosition.x);
float distY = 0.5 - abs(objectPosition.y);
float distZ = 0.5 - abs(objectPosition.z);
if(distX > 0 && distY > 0 && distZ > 0) {
return float4(1, 0, 0, 0.5);
} else {
discard_fragment();
}
}
EDIT:
Made a bit of progress: now it at least renders something. It clips the decal box correctly once it's outside of a mesh, but the parts on the mesh are still not completely correct. To be exact, it also renders the sides of the box that overlap the mesh under the decal (you can see it in the image below, where the red is a bit darker).
And to add more detail, the depthTexture is passed from the previous pass, so it only contains the icosphere, and the decal cube shader doesn't write to the depthTexture, it just reads from it.
and depth stencil is defined as...
let stencilDescriptor = MTLDepthStencilDescriptor()
stencilDescriptor.depthCompareFunction = .less
stencilDescriptor.isDepthWriteEnabled = false
and render pipeline is defined as...
let renderPipelineDescriptor = MTLRenderPipelineDescriptor()
renderPipelineDescriptor.vertexDescriptor = vertexDescriptor
renderPipelineDescriptor.vertexFunction = vertexLibrary.makeFunction(name: "vertex_decal")
renderPipelineDescriptor.fragmentFunction = fragmentLibrary.makeFunction(name: "fragment_decal")
if let colorAttachment = renderPipelineDescriptor.colorAttachments[0] {
colorAttachment.pixelFormat = .bgra8Unorm
colorAttachment.isBlendingEnabled = true
colorAttachment.rgbBlendOperation = .add
colorAttachment.sourceRGBBlendFactor = .sourceAlpha
colorAttachment.destinationRGBBlendFactor = .oneMinusSourceAlpha
}
renderPipelineDescriptor.colorAttachments[1].pixelFormat = .bgra8Unorm
renderPipelineDescriptor.depthAttachmentPixelFormat = .depth32Float
So the current issue is that it discards only the pixels that fall outside the mesh it's being projected onto, instead of all pixels that are "above" the surface of the icosphere.
New Shader Code
fragment float4 fragment_decal(
const VertexOut in [[ stage_in ]],
constant DecalFragmentUniforms &uniforms [[ buffer(3) ]],
depth2d<float, access::sample> depthTexture [[ texture(0) ]]
) {
constexpr sampler textureSampler (mag_filter::nearest, min_filter::nearest);
float2 resolution = float2(
depthTexture.get_width(),
depthTexture.get_height()
);
float2 textureCoordinate = in.position.xy / resolution;
float depth = depthTexture.sample(textureSampler, textureCoordinate);
float3 screenPosition = float3(textureCoordinate * 2 - 1, depth);
float4 viewPosition = uniforms.inverseProjectionMatrix * float4(screenPosition, 1);
float4 worldPosition = uniforms.inverseViewMatrix * viewPosition;
float3 objectPosition = (uniforms.inverseModelMatrix * worldPosition).xyz;
if(abs(worldPosition.x) > 0.5 || abs(worldPosition.y) > 0.5 || abs(worldPosition.z) > 0.5) {
discard_fragment();
} else {
return float4(1, 0, 0, 0.5);
}
}
I finally managed to get it to work properly; the final shader code is below.
The issues the latest shader had were:
Flipped Y axis on screenPosition
Not converting the objectPosition to NDC space (localPosition)
fragment float4 fragment_decal(
const VertexOut in [[ stage_in ]],
constant DecalFragmentUniforms &uniforms [[ buffer(3) ]],
depth2d<float, access::sample> depthTexture [[ texture(0) ]],
texture2d<float, access::sample> colorTexture [[ texture(1) ]]
) {
constexpr sampler depthSampler (mag_filter::linear, min_filter::linear);
float2 resolution = float2(
depthTexture.get_width(),
depthTexture.get_height()
);
float2 depthCoordinate = in.position.xy / resolution;
float depth = depthTexture.sample(depthSampler, depthCoordinate);
float3 screenPosition = float3((depthCoordinate.x * 2 - 1), -(depthCoordinate.y * 2 - 1), depth);
float4 viewPosition = uniforms.inverseProjectionMatrix * float4(screenPosition, 1);
float4 worldPosition = uniforms.inverseViewMatrix * viewPosition;
float4 objectPosition = uniforms.inverseModelMatrix * worldPosition;
float3 localPosition = objectPosition.xyz / objectPosition.w;
if(abs(localPosition.x) > 0.5 || abs(localPosition.y) > 0.5 || abs(localPosition.z) > 0.5) {
discard_fragment();
} else {
float2 textureCoordinate = localPosition.xy + 0.5;
float4 color = colorTexture.sample(depthSampler, textureCoordinate);
return float4(color.rgb, 1);
}
}
The final results look like this (red pixels are kept, blue pixels are discarded)...
I want to modify a geometry grid with a texture in the vertex shader.
I've got a working Metal pipeline.
I pass the MTLTexture in like this:
commandEncoder.setVertexTexture(texture, index: 0)
commandEncoder.setVertexSamplerState(sampler, index: 0)
My vertex shader func:
vertex VertexOut distort3DVTX(const device VertexIn* vertices [[ buffer(0) ]],
unsigned int vid [[ vertex_id ]],
texture2d<float> inTex [[ texture(0) ]],
sampler s [[ sampler(0) ]]) {
VertexIn vtxIn = vertices[vid];
float x = vtxIn.position[0];
float y = vtxIn.position[1];
float u = x / 2 + 0.5;
float v = y / 2 + 0.5;
float2 uv = float2(u, v);
float4 c = inTex.sample(s, uv);
VertexOut vtxOut;
vtxOut.position = float4(x + (c.r - 0.5), y + (c.g - 0.5), 0, 1);
vtxOut.texCoord = vtxIn.texCoord;
return vtxOut;
}
This is the error I see:
Execution of the command buffer was aborted due to an error during execution. Discarded (victim of GPU error/recovery) (IOAF code 5)
If I replace float4 c = inTex.sample(s, uv); with float4 c = 0.5; I don't see the error. So it's definitely something with sampling the texture...
Any idea how to solve IOAF code 5?
Update 1:
The error code does not seem to be related to the texture; the same thing happens when I try to pass a uniform buffer...
const device Uniforms& in [[ buffer(1) ]]
Update 2:
Edit Scheme -> Run -> Options -> GPU Frame Capture -> Metal
Previously I had it set to Automatically Enabled.
Now I get relevant error logs:
Thread 1: signal SIGABRT
validateFunctionArguments:3469: failed assertion `Vertex Function(particle3DVTX): missing buffer binding at index 1 for in[0].'
Though I'm crashing before I even call drawPrimitives or endEncoding...
Update 3:
Here's how I pass the uniform values:
var vertexUniforms: [Float] = ...
let size = MemoryLayout<Float>.size * vertexUniforms.count
guard let uniformsBuffer = metalDevice.makeBuffer(length: size, options: []) else {
commandEncoder.endEncoding()
throw RenderError.uniformsBuffer
}
let bufferPointer = uniformsBuffer.contents()
memcpy(bufferPointer, &vertexUniforms, size)
commandEncoder.setVertexBuffer(uniformsBuffer, offset: 0, index: 1)
Update 4:
A clean build helped. I can now see where it's crashing: drawPrimitives. My vertexUniforms array was empty; I fixed that bug, and now I've got uniforms!
I had the same problem. I discovered that I needed to set Vertex Buffer Bytes with:
commandEncoder.setVertexBytes(&vertexUniforms, length: MemoryLayout<VertexUniforms>.size, index: 1)
...the same thing can also be done for the Fragment Buffer Bytes:
commandEncoder.setFragmentBytes(&fragmentUniforms, length: MemoryLayout<FragmentUniforms>.size, index: 1)
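On the shader side, bytes bound this way are received as a constant reference at the matching buffer index. Below is a minimal sketch; the VertexUniforms struct, its transform field, and the function name are placeholders (the actual layout must match the Swift struct byte for byte).
// Hypothetical uniforms struct; its layout must match the Swift-side struct.
struct VertexUniforms {
    float4x4 transform;
};

struct UniformsDemoOut {
    float4 position [[ position ]];
};

// Receives the data bound with setVertexBytes(..., index: 1).
vertex UniformsDemoOut uniforms_demo_vertex(const device float4 *positions [[ buffer(0) ]],
                                            constant VertexUniforms &uniforms [[ buffer(1) ]],
                                            uint vid [[ vertex_id ]])
{
    UniformsDemoOut out;
    out.position = uniforms.transform * positions[vid];
    return out;
}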
Below is my kernel. It works wonderfully if both the input and output buffers contain RGBA-32 bit pixel data. I've made this kernel slightly inefficient to show Metal's seeming ineptitude in dealing with 24-bit data.
(I previously had this working with the input and output buffers being declared as containing uint32_t data)
kernel void stripe_Kernel(device const uchar *inBuffer [[ buffer(0) ]],
device uchar4 *outBuffer [[ buffer(1) ]],
device const ushort *imgWidth [[ buffer(2) ]],
device const ushort *imgHeight [[ buffer(3) ]],
device const ushort *packWidth [[ buffer(4) ]],
uint2 gid [[ thread_position_in_grid ]])
{
const ushort imgW = imgWidth[0];
const ushort imgH = imgHeight[0];
const ushort packW = packWidth[0]; // eg. 2048
uint32_t posX = gid.x; // eg. 0...2047
uint32_t posY = gid.y; // eg. 0...895
uint32_t sourceX = ((int)(posY/imgH)*packW + posX) % imgW;
uint32_t sourceY = (int)(posY%imgH);
uint32_t ptr = (sourceY*imgW + sourceX)*4; // this is for 32-bit data
uchar4 pixel = uchar4(inBuffer[ptr],inBuffer[ptr+1],inBuffer[ptr+2],255);
outBuffer[posY*packW + posX] = pixel;
}
I should mention that the inBuffer has been allocated as follows:
unsigned char *diskFrame;
posix_memalign((void **)&diskFrame, 0x4000, imgHeight*imgWidth*4);
Now... if I actually have 24-bit data in there, and use multipliers of 3 (wherever I have 4), I get an entirely black image.
What's with that?
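For reference, here is what the "multipliers of 3" variant described above would look like; this is only a sketch of the change being described, not a confirmed fix, and the inBuffer allocation would then also need imgHeight*imgWidth*3 bytes.
kernel void stripe_Kernel24(device const uchar *inBuffer [[ buffer(0) ]],
                            device uchar4 *outBuffer [[ buffer(1) ]],
                            device const ushort *imgWidth [[ buffer(2) ]],
                            device const ushort *imgHeight [[ buffer(3) ]],
                            device const ushort *packWidth [[ buffer(4) ]],
                            uint2 gid [[ thread_position_in_grid ]])
{
    const ushort imgW = imgWidth[0];
    const ushort imgH = imgHeight[0];
    const ushort packW = packWidth[0];
    uint32_t posX = gid.x;
    uint32_t posY = gid.y;
    uint32_t sourceX = ((posY/imgH)*packW + posX) % imgW;
    uint32_t sourceY = posY % imgH;
    uint32_t ptr = (sourceY*imgW + sourceX)*3; // 3 bytes per source pixel for 24-bit data
    uchar4 pixel = uchar4(inBuffer[ptr], inBuffer[ptr+1], inBuffer[ptr+2], 255);
    outBuffer[posY*packW + posX] = pixel;
}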
As mentioned in Apple's documentation, the shading language's texture2d can have an int component type. I have tried to use a texture2d of int type as a shader function parameter, but the write method of texture2d failed to work.
kernel void dummy(texture2d<int, access::write> outTexture [[ texture(0) ]],
uint2 gid [[ thread_position_in_grid ]])
{
outTexture.write( int4( 2, 4, 6, 8 ), gid );
}
However, if I replace the int with float, it works.
kernel void dummy(texture2d<float, access::write> outTexture [[ texture(0) ]],
uint2 gid [[ thread_position_in_grid ]])
{
outTexture.write( float4( 1.0, 0, 0, 1.0 ), gid );
}
Can other types of texture2d, such as texture2d of int, texture2d of short, and so on, be used as shader function parameters, and how should they be used? Thanks for reviewing my question.
The related host code:
MTLTextureDescriptor *desc = [MTLTextureDescriptor texture2DDescriptorWithPixelFormat:MTLPixelFormatRGBA8Unorm
                                                                                width:w
                                                                               height:h
                                                                            mipmapped:NO]; // width/height arguments assumed from the w and h used below
desc.usage = MTLTextureUsageShaderWrite;
id<MTLTexture> texture = [device newTextureWithDescriptor:desc];
[commandEncoder setTexture:texture atIndex:0];
The code to show the output computed by the GPU; w and h represent the width and height of the texture, respectively.
uint8_t* imageBytes = malloc(w*h*4);
memset( imageBytes, 0, w*h*4 );
MTLRegion region = MTLRegionMake2D(0, 0, [texture width], [texture height]);
[texture getBytes:imageBytes bytesPerRow:[texture width]*4 fromRegion:region mipmapLevel:0];
for( int j = 0; j < h; j++ )
{
printf("%3d: ", j);
for( int i = 0; i < w*pixel_size; i++ )
{
printf(" %3d",imageBytes[j*w*pixel_size+i] );
}
printf("\n")
}
The problem is that the pixel format you used to create this texture (MTLPixelFormatRGBA8Unorm) is normalized, meaning that the expected pixel value range is 0.0-1.0. For normalized pixel types, the required data type for reading or writing to this texture within a Metal kernel is float or half-float.
In order to write to a texture with integers, you must select an integer pixel format. Here are all of the available formats:
https://developer.apple.com/documentation/metal/mtlpixelformat
The Metal Shading Language Guide states that:
Note: If T is int or short, the data associated with the texture must use a signed integer format. If T is uint or ushort, the data associated with the texture must use an unsigned integer format.
All you have to do is make sure the texture you write to in the API (host code) matches what you have in the kernel function. Alternatively, you can also cast the int values into float before writing to the outTexture.
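As a minimal sketch of the first option: create the texture with an integer pixel format such as MTLPixelFormatRGBA8Sint (for texture2d<int>) or MTLPixelFormatRGBA8Uint (for texture2d<uint>) on the host side, and keep the kernel's component type matching; the kernel name below is arbitrary.
// Matches a texture created with MTLPixelFormatRGBA8Sint on the host side.
kernel void dummy_int(texture2d<int, access::write> outTexture [[ texture(0) ]],
                      uint2 gid [[ thread_position_in_grid ]])
{
    // Written values are stored in the signed 8-bit integer channels of the texture.
    outTexture.write( int4( 2, 4, 6, 8 ), gid );
}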