When I try to write to a float texture from a kernel, I get the error:
/SourceCache/AcceleratorKit/AcceleratorKit-17.7/ToolsLayers/Debug/MTLDebugComputeCommandEncoder.mm:596: failed assertion `Non-writable texture format MTLPixelFormatR32Float is being bound at index 2 to a shader argument with write access enabled.'
However, when I check the documentation, that format is listed as both color-renderable and function-writable (see the table at the bottom):
https://developer.apple.com/library/prerelease/ios/documentation/Metal/Reference/MetalConstants_Ref/index.html#//apple_ref/c/tdef/MTLPixelFormat
Partial code:
// texture creation
MTLTextureDescriptor *floatTextureDescriptor = [MTLTextureDescriptor texture2DDescriptorWithPixelFormat:MTLPixelFormatR32Float width:_width height:_height mipmapped:NO];
_myFloatTexture = [self.device newTextureWithDescriptor:floatTextureDescriptor];
// texture binding
[computeCommandEncoder setTexture:_myFloatTexture atIndex:2];
// texture used in the shader
void kernel myKernel(//...
texture2d<float, access::write> myFloats [[ texture(2) ]],
uint2 gid [[ thread_position_in_grid ]])
Am I doing something wrong or might this be a bug?
Function writes to that format are supported only from iOS 9.
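If you need to keep supporting iOS 8 as well, a minimal Swift sketch of guarding the write path at runtime could look like this (names borrowed from the question; an illustration, not the only option):
// Guard the compute path: writing to MTLPixelFormatR32Float from a shader requires iOS 9+.
if #available(iOS 9.0, *) {
    computeCommandEncoder.setTexture(myFloatTexture, index: 2)
} else {
    // Fall back to a format that is function-writable on iOS 8, or skip this feature.
}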
Related
My Metal default library does not contain the vertex and fragment functions from the .metal file in the same directory.
As a result, library.makeFunction(name: ..) returns nil for both the vertex and fragment functions that should be assigned to the pipeline descriptor's properties.
The metal file & headers are copied from the Apple Sample App "BasicTexturing" (Creating and Sampling Textures).
The files AAPLShaders.metal and AAPLShaderTypes.h contain the vertexShader and samplingShader functions that are loaded by AAPLRenderer.m.
In the sample it's really straightforward:
id<MTLLibrary> defaultLibrary = [_device newDefaultLibrary];
id<MTLFunction> vertexFunction = [defaultLibrary newFunctionWithName:@"vertexShader"];
id<MTLFunction> fragmentFunction = [defaultLibrary newFunctionWithName:@"samplingShader"];
I have copied these files into a RayWenderlich Swift tutorial project and used the Swift version.
There is an init that sets the library:
Renderer.library = device.makeDefaultLibrary()
then
let library = Renderer.library
let importVertexFunction = library?.makeFunction(name: "vertexShader")
let importShaderFunction = library?.makeFunction(name: "samplingShader")
This works just fine!
I do the same thing in my app with the same files copied over, and it does not load the functions.
I have checked Compile Sources in Build Settings - it lists the .metal file.
Comparing everything in the settings, I don't see a difference between the working apps and my app.
I don't see any error messages or log messages to indicate a syntax or path problem.
Any ideas?
The Apple sample code AAPLShaders.metal
/*
See LICENSE folder for this sample’s licensing information.
Abstract:
Metal shaders used for this sample
*/
#include <metal_stdlib>
#include <simd/simd.h>
using namespace metal;
// Include header shared between this Metal shader code and C code executing Metal API commands
#import "AAPLShaderTypes.h"
// Vertex shader outputs and per-fragment inputs. Includes clip-space position and vertex outputs
// interpolated by rasterizer and fed to each fragment generated by clip-space primitives.
typedef struct
{
// The [[position]] attribute qualifier of this member indicates this value is the clip space
// position of the vertex when this structure is returned from the vertex shader
float4 clipSpacePosition [[position]];
// Since this member does not have a special attribute qualifier, the rasterizer will
// interpolate its value with values of other vertices making up the triangle and
// pass that interpolated value to the fragment shader for each fragment in that triangle;
float2 textureCoordinate;
} RasterizerData;
// Vertex Function
vertex RasterizerData
vertexShader(uint vertexID [[ vertex_id ]],
constant AAPLVertex *vertexArray [[ buffer(AAPLVertexInputIndexVertices) ]],
constant vector_uint2 *viewportSizePointer [[ buffer(AAPLVertexInputIndexViewportSize) ]])
{
RasterizerData out;
// Index into our array of positions to get the current vertex
// Our positions are specified in pixel dimensions (i.e. a value of 100 is 100 pixels from
// the origin)
float2 pixelSpacePosition = vertexArray[vertexID].position.xy;
// Get the size of the drawable so that we can convert to normalized device coordinates,
float2 viewportSize = float2(*viewportSizePointer);
// The output position of every vertex shader is in clip space (also known as normalized device
// coordinate space, or NDC). A value of (-1.0, -1.0) in clip-space represents the
// lower-left corner of the viewport whereas (1.0, 1.0) represents the upper-right corner of
// the viewport.
// In order to convert from positions in pixel space to positions in clip space we divide the
// pixel coordinates by half the size of the viewport.
out.clipSpacePosition.xy = pixelSpacePosition / (viewportSize / 2.0);
// Set the z component of our clip space position 0 (since we're only rendering in
// 2-Dimensions for this sample)
out.clipSpacePosition.z = 0.0;
// Set the w component to 1.0 since we don't need a perspective divide, which is also not
// necessary when rendering in 2-Dimensions
out.clipSpacePosition.w = 1.0;
// Pass our input textureCoordinate straight to our output RasterizerData. This value will be
// interpolated with the other textureCoordinate values in the vertices that make up the
// triangle.
out.textureCoordinate = vertexArray[vertexID].textureCoordinate;
return out;
}
// Fragment function
fragment float4
samplingShader(RasterizerData in [[stage_in]],
texture2d<half> colorTexture [[ texture(AAPLTextureIndexBaseColor) ]])
{
constexpr sampler textureSampler (mag_filter::linear,
min_filter::linear);
// Sample the texture to obtain a color
const half4 colorSample = colorTexture.sample(textureSampler, in.textureCoordinate);
// We return the color of the texture
return float4(colorSample);
}
The Apple Sample code header AAPLShaderTypes.h
/*
See LICENSE folder for this sample’s licensing information.
Abstract:
Header containing types and enum constants shared between Metal shaders and C/ObjC source
*/
#ifndef AAPLShaderTypes_h
#define AAPLShaderTypes_h
#include <simd/simd.h>
// Buffer index values shared between shader and C code to ensure Metal shader buffer inputs match
// Metal API buffer set calls
typedef enum AAPLVertexInputIndex
{
AAPLVertexInputIndexVertices = 0,
AAPLVertexInputIndexViewportSize = 1,
} AAPLVertexInputIndex;
// Texture index values shared between shader and C code to ensure Metal shader buffer inputs match
// Metal API texture set calls
typedef enum AAPLTextureIndex
{
AAPLTextureIndexBaseColor = 0,
} AAPLTextureIndex;
// This structure defines the layout of each vertex in the array of vertices set as an input to our
// Metal vertex shader. Since this header is shared between our .metal shader and C code,
// we can be sure that the layout of the vertex array in the code matches the layout that
// our vertex shader expects
typedef struct
{
// Positions in pixel space (i.e. a value of 100 indicates 100 pixels from the origin/center)
vector_float2 position;
// 2D texture coordinate
vector_float2 textureCoordinate;
} AAPLVertex;
#endif /* AAPLShaderTypes_h */
Debug print of my library
Printing description of self.library:
(MTLLibrary?) library = (object = 0x00006000004af7b0) {
object = 0x00006000004af7b0 {
baseNSObject#0 = {
isa = CaptureMTLLibrary
}
Debug print of working library from RayWenderlich sample app
The newly added samplingShader and vertexShader are shown in the library along with the existing fragment and vertex functions.
▿ Optional<MTLLibrary>
- some : <CaptureMTLLibrary: 0x600000f54210> -> <MTLDebugLibrary: 0x600002204050> -> <_MTLLibrary: 0x600001460280>
label = <none>
device = <MTLSimDevice: 0x15a5069d0>
name = Apple iOS simulator GPU
functionNames: fragment_main vertex_main samplingShader vertexShader
Did you check the target membership of the file? There is nothing weird about your code, so please check the target.
Answer: the issue of functions not loading into the Metal library was resolved by removing a leftover -fcikernel flag in the Other Metal Compiler Flags option of the project target's Build Settings.
The flag had been set when testing a CoreImageKernel.metal as documented in https://developer.apple.com/documentation/coreimage/cikernel/2880194-init
I removed the kernel definition file from the app but missed the compiler flag, and also missed it when visually comparing build settings.
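For future readers, a quick way to see what actually ended up in the default library is to print its functionNames; a small Swift sketch (assuming you have a device reference):
// List every function the compiled default library contains. If the expected
// names are missing, the .metal file was not compiled into default.metallib
// (target membership, compiler flags, etc.).
if let library = device.makeDefaultLibrary() {
    print("Functions in default library:", library.functionNames)
} else {
    print("No default library found in the app bundle")
}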
In Metal Shading Language Specification, Behavior of Uniform Type:
If a variable is of the uniform type, and the variable does not have the same value for all threads executing the kernel or graphics function, then the behavior is undefined.
uniform<float> bar[10]; // elements stored in bar array are uniform
My question is: what is the use of uniform<T> in this context? I know some uniforms, such as transformation information, can be passed to a shader by setVertexBytes() as a uniform. In what kinds of situations do you find uniform<T> useful?
struct VertexOut {
float4 position [[position]];
};
struct Uniforms {
float4x4 transform;
};
vertex VertexOut my_vertex(
const device VertexIn * vertices [[buffer(0)]],
constant Uniforms & uniforms [[buffer(1)]],
uint vid [[vertex_id]]
) {
…
}
Thanks
I want to use an array of textures in a Metal shader, but it crashes when running my app (iPhone 6, A8). Here is the error log:
Failed to created pipeline state, error Error Domain=AGXMetalA8 Code=3 "Could not resolve texture/sampler references" UserInfo={NSLocalizedDescription=Could not resolve texture/sampler references}
I have tried to Google this but did not find useful information. Can anyone give me some suggestions? Thanks.
Here is my code:
fragment float4 fragment_texture (ShaderInput vert [[stage_in]],
array<texture2d<float>, 2> u_texture [[texture(0)]],
sampler _mtlsmp_u_texture [[sampler(0)]])
{
float4 srcColor = u_texture[0].sample(_mtlsmp_u_texture, vert.v_texCoord);
float4 materialColor = u_texture[1].sample(_mtlsmp_u_texture, vert.v_texCoord);
float4 mixColor = mix(srcColor, materialColor, 0.5);
return mixColor;
}
In my app code:
[renderEncoder setFragmentTexture:_textureDemo
atIndex:0];
[renderEncoder setFragmentTexture:_textureBlend
atIndex:1];
[renderEncoder setFragmentSamplerState:_sampler atIndex:0];
Update
I tried to use argument buffers to deal with the array of textures, but it still crashes on my iPhone 6 (iOS 11.4) and I get the same error: Failed to create pipeline state, error Could not resolve texture/sampler references
Here are some critical steps:
In my app code, argumentEncoder is an MTLArgumentEncoder object, and I encode the texture resources into the argument buffer by calling setTexture:atIndex:
[argumentEncoder setArgumentBuffer:_fragmentShaderArgumentBuffer offset:0];
[argumentEncoder setTexture:_texture[0] atIndex:0];
[argumentEncoder setTexture:_texture[1] atIndex:1];
I call the useResource:usage: method to make the texture resources accessible to the GPU:
[renderEncoder useResource:_texture[0] usage:MTLResourceUsageSample];
[renderEncoder useResource:_texture[1] usage:MTLResourceUsageSample];
And set argument buffer _fragmentShaderArgumentBuffer as an argument to the fragment function:
[renderEncoder setFragmentBuffer:_fragmentShaderArgumentBuffer
offset:0
atIndex:0];
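(For reference, the argument encoder and buffer themselves are created from the fragment function roughly like this; a Swift sketch with placeholder names, since my actual setup code is Objective-C and not shown above:)
// Build an argument encoder for the fragment function's buffer(0) argument
// and back it with a buffer large enough to hold the encoded arguments.
let fragmentFunction = library.makeFunction(name: "fragmentShader")!
let argumentEncoder = fragmentFunction.makeArgumentEncoder(bufferIndex: 0)
let fragmentShaderArgumentBuffer = device.makeBuffer(length: argumentEncoder.encodedLength,
                                                     options: [])!
argumentEncoder.setArgumentBuffer(fragmentShaderArgumentBuffer, offset: 0)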
In my fragment shader:
typedef struct FragmentShaderArguments {
array<texture2d<float>, 2> exampleTextures [[ id(0) ]];
} FragmentShaderArguments;
fragment float4
fragmentShader( RasterizerData in [[ stage_in ]],
device FragmentShaderArguments & fragmentShaderArgs [[ buffer(0) ]])
{
constexpr sampler textureSampler (mag_filter::linear,
min_filter::linear);
float4 color = float4(0, 0, 0, 1);
float4 color1 = fragmentShaderArgs.exampleTextures[0].sample(textureSampler, in.texCoord);
float4 color2 = fragmentShaderArgs.exampleTextures[1].sample(textureSampler, in.texCoord);
color = mix(color1, color2, 0.5);
return color;
}
I sincerely hope that someone can provide some ideas. Thanks!
I upgraded to iOS 12.0.1, and it works well.
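If you still need to run on older devices or OS versions, one cheap runtime guard is to check the device's argument buffer support tier before taking the argument-buffer path; a hedged Swift sketch (the helper names are hypothetical, and this alone does not explain the A8 + iOS 11.4 crash above):
// Fall back to plain per-index texture bindings when argument buffer support is limited.
switch device.argumentBuffersSupport {
case .tier2:
    encodeTexturesViaArgumentBuffer()   // hypothetical helper
default:
    bindTexturesIndividually()          // hypothetical fallback
}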
I'm building an app rendering 2D geometry in Metal.
Right now, the positions of the vertices are solved from within the vertex function. What I'd like is to write the solved positions back to a buffer from inside that same vertex function.
I'm under the impression that this is possible, although on my first attempt to do it, i.e.:
vertex VertexOut basic_vertex(device VertexIn *vertices [[ buffer(0) ]],
                              device VertexOut *solvedVertices [[ buffer(1) ]],
                              uint vid [[ vertex_id ]])
{
    VertexIn in = vertices[vid];
    VertexOut out;
    out.position = ... // Solve the position of the vertex
    solvedVertices[vid] = out; // Write to the buffer, later to be read by the CPU
    return out;
}
I was graced with a compile-time error.
Okay, so a few solutions come to mind. I could solve for the vertex positions in a first, non-rasterizing pass through a vertex function declared like this:
vertex void solve_vertex(device VertexIn *unsolved [[ buffer(0) ]],
device VertexOut *solved [[ buffer(1) ]],
uint vid [[ vertex_id ]])
{
solved[vid] = ...
}
And then pipe those solved vertices into a now much simpler - rasterizing - vertex function.
Another solution that could work but seems less appealing could be to solve them in a compute function.
So, what is the best way forward in a situation like this? From my little bits of research, I could track down that this same sort of procedure is done in Transform Feedback, but I've had no luck (other than the link at the beginning of the question) finding examples in Apple's documentation/sample code or elsewhere on the web for best practices when facing this sort of problem.
Alright, it turns out using a non-rasterizing vertex function is the way to go. There are some things to note however for others future reference:
A non-rasterizing vertex function is simply a vertex function returning void i.e.:
vertex void non_rasterizing_vertex(...) { }
When executing a non-rasterizing "render" pass, the MTLRenderPassDescriptor still needs to have a texture set - for instance in MTLRenderPassDescriptor's colorAttachments[0].texture - for reasons I don't know (I assume it's just due to the fixed nature of GPU programming).
The MTLRenderPipelineDescriptor needs to have its rasterizationEnabled property set to false; then you can assign the non-rasterizing vertex function to its vertexFunction property. The fragmentFunction property can remain nil, as expected.
When actually executing the pass, one of the drawPrimitives: methods (the naming of which may be misleading) still needs to be invoked on the configured MTLRenderCommandEncoder. I ended up with a call to render MTLPrimitiveType.Points since that seems the most sensible.
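Putting those notes together, a minimal Swift sketch of the setup (buffer, texture, and count names are placeholders; the render pass descriptor is assumed to already have its dummy color texture attached as noted above):
// Build a pipeline whose vertex function writes results but never rasterizes.
let descriptor = MTLRenderPipelineDescriptor()
descriptor.vertexFunction = library.makeFunction(name: "solve_vertex")
descriptor.fragmentFunction = nil                  // no fragment stage needed
descriptor.isRasterizationEnabled = false          // the key switch
descriptor.colorAttachments[0].pixelFormat = dummyTexture.pixelFormat
let pipelineState = try! device.makeRenderPipelineState(descriptor: descriptor)

// Encode the non-rasterizing "render" pass.
let encoder = commandBuffer.makeRenderCommandEncoder(descriptor: renderPassDescriptor)!
encoder.setRenderPipelineState(pipelineState)
encoder.setVertexBuffer(unsolvedBuffer, offset: 0, index: 0)
encoder.setVertexBuffer(solvedBuffer, offset: 0, index: 1)
encoder.drawPrimitives(type: .point, vertexStart: 0, vertexCount: vertexCount)
encoder.endEncoding()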
Doing all of this sets up "rendering" logic ready to write back to vertex buffers from the vertex function - so long as they're in device address space:
vertex void non_rasterizing_vertex(device float *writeableBuffer [[ buffer(0) ]],
uint vid [[ vertex_id ]])
{
writeableBuffer[vid] = 42; // Write away!
}
This "answer" ended up more like a blog post but I hope it remains useful for future reference.
TODO
I'd still like to investigate performance tradeoffs between doing compute-y work like this in a compute pipeline versus in the rendering pipeline like above. Once I have some more time to do that, I'll update this answer.
The correct solution is to move any code writing to buffers to a compute kernel.
You will lose a great deal of performance writing to buffers in a vertex function. It is optimized for rasterization, not for computation.
You just need to use a compute command encoder.
guard let computeBuffer = commandQueue.makeCommandBuffer() else { return }
guard let computeEncoder = computeBuffer.makeComputeCommandEncoder() else { return }
computeEncoder.setComputePipelineState(solveVertexPipelineState)
kernel void solve_vertex(device VertexIn *unsolved [[ buffer(0) ]],
device VertexOut *solved [[ buffer(1) ]],
uint vid [[ thread_position_in_grid ]])
{
solved[vid] = ...
}
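For completeness, a sketch of how the rest of that dispatch could look (the buffer names and vertexCount are placeholders; the kernel should guard vid against the vertex count when it isn't a multiple of the threadgroup width):
// Bind the input/output buffers and launch one thread per vertex.
computeEncoder.setBuffer(unsolvedBuffer, offset: 0, index: 0)
computeEncoder.setBuffer(solvedBuffer, offset: 0, index: 1)
let width = min(solveVertexPipelineState.maxTotalThreadsPerThreadgroup, vertexCount)
let threadsPerGroup = MTLSize(width: width, height: 1, depth: 1)
let groupCount = MTLSize(width: (vertexCount + width - 1) / width, height: 1, depth: 1)
computeEncoder.dispatchThreadgroups(groupCount, threadsPerThreadgroup: threadsPerGroup)
computeEncoder.endEncoding()
computeBuffer.commit()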
I have a kernel function in Metal that I pass in a texture to so that I can perform some operations on the image. I'm passing in uint2 gid [[thread_position_in_grid]] which gives me the pixel coordinates as integers.
To get the normalized device coordinates I can do some simple math on gid.x and gid.y along with my texture width and height. Is this the best way to do it? Is there a better way?
Your approach is a good one. If you don't want to query the texture dimensions inside the kernel function or create a buffer just to pass them in, you can use the -[MTLComputeCommandEncoder setBytes:length:atIndex:] method to bind the texture dimensions in a "temporary" buffer of sorts that is handled by Metal:
[computeEncoder setBytes:&dimensions length:sizeof(dimensions) atIndex:0];
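In Swift, a sketch of the same idea (the texture and encoder names are assumed):
// Pass the texture size in a small Metal-managed "temporary" buffer,
// then divide float2(gid) by it in the kernel to get normalized coordinates.
var dimensions = SIMD2<Float>(Float(texture.width), Float(texture.height))
computeEncoder.setBytes(&dimensions,
                        length: MemoryLayout<SIMD2<Float>>.stride,
                        index: 0)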
I think you're right; it is fine to use the same approach that is usually applied in GLSL:
compute the texel size
float2 texSize = float2(1.0f / outTexture.get_width(), 1.0f / outTexture.get_height());
then use it to get the normalized pixel position
constexpr sampler s(address::clamp_to_edge, filter::linear, coord::normalized);
//
// something to do...
//
float4 color = inTexture.sample(s,float2(gid)*texSize);
//
// something to do with the pixel
//
outTexture.write(color,gid);
The method specified in the question works well. But for completeness, an alternate way to read from textures using non-normalized (and/or normalized) coordinates would be to use samplers.
Create a sampler:
id<MTLSamplerState> GetSamplerState()
{
MTLSamplerDescriptor *desc = [[[MTLSamplerDescriptor alloc] init] autorelease];
desc.minFilter = MTLSamplerMinMagFilterNearest;
desc.magFilter = MTLSamplerMinMagFilterNearest;
desc.mipFilter = MTLSamplerMipFilterNotMipmapped;
desc.maxAnisotropy = 1;
desc.sAddressMode = MTLSamplerAddressModeClampToEdge;
desc.tAddressMode = MTLSamplerAddressModeClampToEdge;
desc.rAddressMode = MTLSamplerAddressModeClampToEdge;
// The key point: specifies that the sampler reads non-normalized coordinates
desc.normalizedCoordinates = NO;
desc.lodMinClamp = 0.0f;
desc.lodMaxClamp = FLT_MAX;
id <MTLSamplerState> sampler_state = nil;
sampler_state = [[device_ newSamplerStateWithDescriptor:desc] autorelease];
// Release the descriptor
desc = nil;
return sampler_state;
}
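(For anyone working in Swift, a compact sketch of the equivalent descriptor, assuming a device reference; the key line is still normalizedCoordinates = false:)
// Non-normalized sampler: sample() will take pixel coordinates instead of [0, 1].
let desc = MTLSamplerDescriptor()
desc.minFilter = .nearest
desc.magFilter = .nearest
desc.sAddressMode = .clampToEdge
desc.tAddressMode = .clampToEdge
desc.normalizedCoordinates = false
let samplerState = device.makeSamplerState(descriptor: desc)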
And then attach it to your compute command encoder:
id <MTLComputeCommandEncoder> compute_encoder = [command_buffer computeCommandEncoder];
id<MTLSamplerState> ss = GetSamplerState();
// Attach the sampler state to the encoder, say at sampler bind point 0
[compute_encoder setSamplerState:ss atIndex:0];
// And set your texture, say at texture bind point 0
[compute_encoder setTexture:my_texture atIndex:0];
Finally use it in the kernel:
// An example kernel that samples from a texture,
// writes one component of the sample into an output buffer
kernel void compute_main(
texture2d<uint, access::sample> tex_to_sample [[ texture(0) ]],
sampler smp [[ sampler(0) ]],
device uint *out [[buffer(0)]],
uint2 tid [[thread_position_in_grid]])
{
uint index = tid.y * tex_to_sample.get_width() + tid.x;
out[index] = tex_to_sample.sample(smp, float2(tid)).x;
}
Using a sampler allows you to specify parameters for sampling (like filtering). You can also access the texture in different ways by using different samplers attached to the same kernel. A sampler also avoids having to pass in and bounds-check the texture dimensions.
Note that the sampler can also be set up from within the compute kernel; refer to Section 2.6, Samplers, in the Metal Shading Language Specification.
Finally, one main difference between the read function (using gid, as specified in the question) and sampling with a sampler is that read() takes integer coordinates, whereas sample() takes floating-point coordinates. So integer coordinates passed into sample() will be converted to the equivalent floating-point values.