iOS MetalKit: Loop through array in MSL

This seems like a silly question, but I can't find a good way to loop through an array; currently, I have to pass a buffer that contains the element count to my kernel function.
kernel void test_func(constant const int2* array [[ buffer(0) ]],
                      constant const int& arrayCount [[ buffer(1) ]],
                      device half4* result [[ buffer(2) ]],
                      uint2 pos [[ thread_position_in_grid ]]) {
    // some code to end early if pos is outside of my data
    for (ulong i = 0; i < sizeof(array) / sizeof(int2) /*(ulong) arrayCount*/; i += 1) {
        // do something
    }
}
The calculation using sizeof always yields incorrect results; using the count buffer, on the other hand, returns correct results. It also seems that MSL doesn't support C++11 range-based for loops.
There should be a better way to do this, right?
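For what it's worth, passing the count explicitly is the usual pattern: inside the kernel, array is just a pointer in the constant address space, so sizeof(array) measures the pointer itself rather than the data behind it, which is why the sizeof-based bound comes out wrong. If allocating a whole MTLBuffer for one integer feels heavy, the count can be supplied inline with setBytes, which works for small amounts of data. A minimal host-side sketch in Swift, where computeEncoder, pointsBuffer, and points are hypothetical names:
// Bind the data buffer as before (index 0 matches buffer(0) in the kernel).
computeEncoder.setBuffer(pointsBuffer, offset: 0, index: 0)

// Supply the element count inline; no dedicated MTLBuffer is needed for a small constant.
var count = Int32(points.count)
computeEncoder.setBytes(&count, length: MemoryLayout<Int32>.size, index: 1)
The kernel then loops with for (int i = 0; i < arrayCount; ++i), exactly as the commented-out bound in the question suggests.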

Related

Is there an iOS Metal value for bt601?

I have sample Metal code that I'm trying to convert to iOS. Is there an iOS-compatible value that I can use for bt601?
#include <metal_stdlib>
#include "utilities.h" // error not found
using namespace metal;

kernel void laplace(texture2d<half, access::read> inTexture [[ texture(0) ]],
                    texture2d<half, access::read_write> outTexture [[ texture(1) ]],
                    uint2 gid [[ thread_position_in_grid ]]) {
    constexpr int kernel_size = 3;
    constexpr int radius = kernel_size / 2;
    half3x3 laplace_kernel = half3x3(0, 1, 0,
                                     1, -4, 1,
                                     0, 1, 0);
    half4 acc_color(0, 0, 0, 0);
    for (int j = 0; j <= kernel_size - 1; j++) {
        for (int i = 0; i <= kernel_size - 1; i++) {
            uint2 textureIndex(gid.x + (i - radius), gid.y + (j - radius));
            acc_color += laplace_kernel[i][j] * inTexture.read(textureIndex).rgba;
        }
    }
    half value = dot(acc_color.rgb, bt601); // bt601 not defined
    half4 gray_color(value, value, value, 1.0);
    outTexture.write(gray_color, gid);
}
It seems that the intention here is simply to derive a single "luminance" value from the RGB output of the kernel. In that case, bt601 would be a three-element vector whose components are the desired weights of the respective channels, summing to 1.0.
Borrowing values from Rec. 601, we might define it like this:
float3 bt601(0.299f, 0.587f, 0.114f);
This is certainly a common choice. Another popular choice uses coefficients found in the Rec. 709 standard. That would look like this:
float3 bt709(0.212671f, 0.715160f, 0.072169f);
Both of these vectors will give you a single gray value that approximates the brightness of a linear sRGB color. Whether either of them is "correct" depends on the provenance of your data and how you process it further down the pipeline.
For whatever it's worth, the MetalPerformanceShaders MPSImageThresholdBinary kernel seems to favor the BT.601 values.
I'd recommend taking a look at this answer for more detail on these issues and the conditions under which the use of these values is appropriate.
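For reference, the weighting is just a dot product: value = 0.299*R + 0.587*G + 0.114*B. The same computation on the CPU using the simd module, as a quick sketch (the rgb value here is made up):
import simd

let bt601 = SIMD3<Float>(0.299, 0.587, 0.114)  // Rec. 601 luma weights
let rgb   = SIMD3<Float>(0.5, 0.25, 0.75)      // an arbitrary linear RGB value
let gray  = dot(rgb, bt601)                    // ≈ 0.382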

How to pass a texture to a vertex shader? (iOS & Metal) (IOAF code 5)

I want to modify a geometry grid with a texture in the vertex shader.
I've got a working Metal pipeline.
I pass the MTLTexture in like this:
commandEncoder.setVertexTexture(texture, index: 0)
commandEncoder.setVertexSamplerState(sampler, index: 0)
My vertex shader func:
vertex VertexOut distort3DVTX(const device VertexIn* vertecies [[ buffer(0) ]],
                              unsigned int vid [[ vertex_id ]],
                              texture2d<float> inTex [[ texture(0) ]],
                              sampler s [[ sampler(0) ]]) {
    VertexIn vtxIn = vertecies[vid];
    float x = vtxIn.position[0];
    float y = vtxIn.position[1];
    float u = x / 2 + 0.5;
    float v = y / 2 + 0.5;
    float2 uv = float2(u, v);
    float4 c = inTex.sample(s, uv);
    VertexOut vtxOut;
    vtxOut.position = float4(x + (c.r - 0.5), y + (c.g - 0.5), 0, 1);
    vtxOut.texCoord = vtxIn.texCoord;
    return vtxOut;
}
This is the error I see:
Execution of the command buffer was aborted due to an error during execution. Discarded (victim of GPU error/recovery) (IOAF code 5)
If I replace float4 c = inTex.sample(s, uv); with float4 c = 0.5; I don't see the error. So it's definitely something with sampling the texture...
Any idea how to solve IOAF code 5?
Update 1:
The error code does not seem to be related to the texture; the same thing happens when I try to pass a uniform buffer...
const device Uniforms& in [[ buffer(1) ]]
Update 2:
Edit Scheme -> Run -> Options -> GPU Frame Capture -> Metal
Previously I had it set to Automatically Enabled.
Now I get relevant error logs:
Thread 1: signal SIGABRT
validateFunctionArguments:3469: failed assertion `Vertex Function(particle3DVTX): missing buffer binding at index 1 for in[0].'
Though I'm crashing before I call drawPrimitives or endEncoding...
Update 3:
Here's how I pass the uniform values:
var vertexUniforms: [Float] = ...
let size = MemoryLayout<Float>.size * vertexUniforms.count
guard let uniformsBuffer = metalDevice.makeBuffer(length: size, options: []) else {
    commandEncoder.endEncoding()
    throw RenderError.uniformsBuffer
}
let bufferPointer = uniformsBuffer.contents()
memcpy(bufferPointer, &vertexUniforms, size)
commandEncoder.setVertexBuffer(uniformsBuffer, offset: 0, index: 1)
Update 4:
A clean build helped. I can now see where it's crashing: drawPrimitives. My vertexUniforms array was empty; I fixed that bug, and now I've got uniforms!
I had the same problem. I discovered that I needed to set Vertex Buffer Bytes with:
commandEncoder.setVertexBytes(&vertexUniforms, length: MemoryLayout<VertexUniforms>.size, index: 1)
...the same thing can also be done for the Fragment Buffer Bytes:
commandEncoder.setFragmentBytes(&fragmentUniforms, length: MemoryLayout<FragmentUniforms>.size, index: 1)
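One caveat worth adding: with setVertexBytes/setFragmentBytes, the Swift value you pass must match the memory layout of the struct the shader reads at that index. A sketch of the pattern with hypothetical fields (the real VertexUniforms layout depends on your MSL Uniforms struct):
struct VertexUniforms {            // must mirror the MSL struct field-for-field
    var scale: Float
    var offset: SIMD2<Float>
}

var vertexUniforms = VertexUniforms(scale: 1.0, offset: SIMD2<Float>(0, 0))
commandEncoder.setVertexBytes(&vertexUniforms,
                              length: MemoryLayout<VertexUniforms>.stride,
                              index: 1)
Using MemoryLayout<T>.stride rather than .size avoids surprises when the struct has trailing padding.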

Metal kernel -- 24-bit chicanery

Below is my kernel. It works wonderfully if both the input and output buffers contain 32-bit RGBA pixel data. I've made this kernel slightly inefficient to show Metal's seeming ineptitude at dealing with 24-bit data.
(I previously had this working with the input and output buffers being declared as containing uint32_t data)
kernel void stripe_Kernel(device const uchar *inBuffer [[ buffer(0) ]],
                          device uchar4 *outBuffer [[ buffer(1) ]],
                          device const ushort *imgWidth [[ buffer(2) ]],
                          device const ushort *imgHeight [[ buffer(3) ]],
                          device const ushort *packWidth [[ buffer(4) ]],
                          uint2 gid [[ thread_position_in_grid ]])
{
    const ushort imgW = imgWidth[0];
    const ushort imgH = imgHeight[0];
    const ushort packW = packWidth[0];               // e.g. 2048

    uint32_t posX = gid.x;                           // e.g. 0...2047
    uint32_t posY = gid.y;                           // e.g. 0...895

    uint32_t sourceX = ((int)(posY / imgH) * packW + posX) % imgW;
    uint32_t sourceY = (int)(posY % imgH);

    uint32_t ptr = (sourceY * imgW + sourceX) * 4;   // this is for 32-bit data
    uchar4 pixel = uchar4(inBuffer[ptr], inBuffer[ptr+1], inBuffer[ptr+2], 255);
    outBuffer[posY * packW + posX] = pixel;
}
I should mention that the inBuffer has been allocated as follows:
unsigned char *diskFrame;
posix_memalign((void **)&diskFrame, 0x4000, imgHeight*imgWidth*4);
Now... if I actually have 24-bit data in there, and use multipliers of 3 (wherever I have 4), I get an entirely black image.
What's with that?

write method of texture2d<int, access::write> does not work in a Metal shader function

As mentioned in Apple's documentation, a texture2d in the shading language can have an int component type. I have tried to use a texture2d of int type as a shader function parameter, but the write method of the texture2d does not work.
kernel void dummy(texture2d<int, access::write> outTexture [[ texture(0) ]],
                  uint2 gid [[ thread_position_in_grid ]])
{
    outTexture.write( int4( 2, 4, 6, 8 ), gid );
}
However, if I replace int with float, it works.
kernel void dummy(texture2d<float, access::write> outTexture [[ texture(0) ]],
                  uint2 gid [[ thread_position_in_grid ]])
{
    outTexture.write( float4( 1.0, 0, 0, 1.0 ), gid );
}
Can other component types of texture2d, such as texture2d of int, texture2d of short, and so on, be used as shader function parameters, and if so, how? Thanks for reviewing my question.
The related host code:
MTLTextureDescriptor *desc = [MTLTextureDescriptor texture2DDescriptorWithPixelFormat:MTLPixelFormatRGBA8Unorm
                                                                                width:w
                                                                               height:h
                                                                            mipmapped:NO];
desc.usage = MTLTextureUsageShaderWrite;
id<MTLTexture> texture = [device newTextureWithDescriptor:desc];
[commandEncoder setTexture:texture atIndex:0];
The code below shows the output computed by the GPU; w and h represent the width and height of the texture, respectively.
const int pixel_size = 4; // bytes per pixel for RGBA8 data
uint8_t *imageBytes = malloc(w*h*pixel_size);
memset(imageBytes, 0, w*h*pixel_size);
MTLRegion region = MTLRegionMake2D(0, 0, [texture width], [texture height]);
[texture getBytes:imageBytes bytesPerRow:[texture width]*pixel_size fromRegion:region mipmapLevel:0];
for( int j = 0; j < h; j++ )
{
    printf("%3d: ", j);
    for( int i = 0; i < w*pixel_size; i++ )
    {
        printf(" %3d", imageBytes[j*w*pixel_size+i]);
    }
    printf("\n");
}
The problem is that the pixel format you used to create this texture (MTLPixelFormatRGBA8Unorm) is normalized, meaning that the expected pixel value range is 0.0-1.0. For normalized pixel types, the required data type for reading or writing to this texture within a Metal kernel is float or half-float.
In order to write to a texture with integers, you must select an integer pixel format. Here are all of the available formats:
https://developer.apple.com/documentation/metal/mtlpixelformat
The Metal Shading Language Guide states that:
Note: If T is int or short, the data associated with the texture must use a signed integer format. If T is uint or ushort, the data associated with the texture must use an unsigned integer format.
All you have to do is make sure the texture you write to in the API (host code) matches what you have in the kernel function. Alternatively, you can also cast the int values into float before writing to the outTexture.
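For example, a signed integer format such as MTLPixelFormatRGBA8Sint pairs with texture2d<int, access::write>. A minimal sketch of the descriptor setup in Swift (the Objective-C version is analogous; device, w, and h are taken from the question's host code):
let desc = MTLTextureDescriptor.texture2DDescriptor(pixelFormat: .rgba8Sint, // signed integer format
                                                    width: w,
                                                    height: h,
                                                    mipmapped: false)
desc.usage = .shaderWrite
let texture = device.makeTexture(descriptor: desc)
With .rgba8Sint the readback loop above (4 bytes per pixel) still lines up; a wider format such as .rgba32Sint would need a correspondingly larger bytesPerRow.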

Compiler error when trying to add constant float3x3 to shader file

I am trying to add this code to my Metal language file:
constant float3x3 rgb2xyz(
    float3(0.412453f, 0.212671f, 0.019334f),
    float3(0.357580f, 0.715160f, 0.119193f),
    float3(0.180423f, 0.072169f, 0.950227f)
);
or this
constant float3x3 rgb2xyz = float3x3(
    float3(0.412453f, 0.212671f, 0.019334f),
    float3(0.357580f, 0.715160f, 0.119193f),
    float3(0.180423f, 0.072169f, 0.950227f)
);
The metal compiler gives me the following error:
No matching constructor for initialization of 'const constant float3x3' (aka 'const constant matrix<float, 3, 3>')
However if I do
typedef struct {
    float3x3 matrix;
    float3 offset;
    float zoom;
} Conversion;

constant Conversion colorConversion = {
    .matrix = float3x3(
        float3( 1.164f,  1.164f, 1.164f ),
        float3( 0.000f, -0.392f, 2.017f ),
        float3( 1.596f, -0.813f, 0.000f )
    ),
    .offset = float3( -(16.0f/255.0f), -0.5f, -0.5f )
};
I don't get any compile error.
Any ideas what is going wrong? It also works without problems with vector types:
constant float3 bgr2xyzCol1(0.357580f, 0.715160f, 0.119193f);
What would be a good way to define a constant matrix directly in the code?
You should pass it in as a constant reference instead; see WWDC session 604.
For example, see matrices here; TransformMatrices is a custom data structure in this case:
vertex VertexOutput my_vertex(const global float3* position_data [[ buffer(0) ]],
                              const global float3* normal_data [[ buffer(1) ]],
                              constant TransformMatrices& matrices [[ buffer(2) ]],
                              uint vid [[ vertex_id ]])
{
    VertexOutput out;
    float3 n_d = normal_data[vid];
    float3 transformed_normal = matrices.normal_matrix * n_d;
    float4 p_d = float4(position_data[vid], 1.0f);
    out.position = matrices.modelview_projection_matrix * p_d; // member name assumed from TransformMatrices
    float4 eye_vector = matrices.modelview_matrix * p_d;       // member name assumed from TransformMatrices
    ...
    return out;
}
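If the matrix really is a fixed constant, a common alternative is to build it on the host and hand it to the shader with setVertexBytes, which the shader then receives as constant float3x3& rgb2xyz [[ buffer(2) ]]. A minimal Swift sketch (the encoder name and buffer index are assumptions):
import simd

// Column-major float3x3 with the same coefficients as rgb2xyz above.
var rgb2xyz = simd_float3x3([SIMD3<Float>(0.412453, 0.212671, 0.019334),
                             SIMD3<Float>(0.357580, 0.715160, 0.119193),
                             SIMD3<Float>(0.180423, 0.072169, 0.950227)])

// Hypothetical render encoder; index 2 corresponds to [[ buffer(2) ]] on the shader side.
renderEncoder.setVertexBytes(&rgb2xyz,
                             length: MemoryLayout<simd_float3x3>.stride,
                             index: 2)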
