I am rendering point fragments from a buffer with this call:
renderEncoder.drawPrimitives(type: .point,
vertexStart: 0,
vertexCount: 1,
instanceCount: emitter.currentParticles)
emitter.currentParticles is the total number of particles in the buffer. Is it possible to somehow draw only a portion of the buffer?
I have tried this, but it draws the first half of the buffer:
renderEncoder.drawPrimitives(type: .point,
vertexStart: emitter.currentParticles / 2,
vertexCount: 1,
instanceCount: emitter.currentParticles / 2)
In fact, it seems that vertexStart has no effect. I can seemingly set it to any value, and it still starts at 0.
Edit:
Pipeline configuration:
private func buildParticlePipelineStates() {
do {
guard let library = Renderer.device.makeDefaultLibrary(),
let function = library.makeFunction(name: "compute") else { return }
// particle update pipeline state
particlesPipelineState = try Renderer.device.makeComputePipelineState(function: function)
// render pipeline state
let vertexFunction = library.makeFunction(name: "vertex_particle")
let fragmentFunction = library.makeFunction(name: "fragment_particle")
let descriptor = MTLRenderPipelineDescriptor()
descriptor.vertexFunction = vertexFunction
descriptor.fragmentFunction = fragmentFunction
descriptor.colorAttachments[0].pixelFormat = renderPixelFormat
descriptor.colorAttachments[0].isBlendingEnabled = true
descriptor.colorAttachments[0].rgbBlendOperation = .add
descriptor.colorAttachments[0].alphaBlendOperation = .add
descriptor.colorAttachments[0].sourceRGBBlendFactor = .sourceAlpha
descriptor.colorAttachments[0].sourceAlphaBlendFactor = .sourceAlpha
descriptor.colorAttachments[0].destinationRGBBlendFactor = .oneMinusSourceAlpha
descriptor.colorAttachments[0].destinationAlphaBlendFactor = .oneMinusSourceAlpha
renderPipelineState = try Renderer.device.makeRenderPipelineState(descriptor: descriptor)
} catch let error {
print(error.localizedDescription)
}
}
Vertex shader:
struct VertexOut {
float4 position [[ position ]];
float point_size [[ point_size ]];
float4 color;
};
vertex VertexOut vertex_particle(constant float2 &size [[buffer(0)]],
device Particle *particles [[buffer(1)]],
constant float2 &emitterPosition [[ buffer(2) ]],
uint instance [[instance_id]])
{
VertexOut out;
float2 position = particles[instance].position + emitterPosition;
out.position.xy = position.xy / size * 2.0 - 1.0;
out.position.z = 0;
out.position.w = 1;
out.point_size = particles[instance].size * particles[instance].scale;
out.color = particles[instance].color;
return out;
}
fragment float4 fragment_particle(VertexOut in [[ stage_in ]],
texture2d<float> particleTexture [[ texture(0) ]],
float2 point [[ point_coord ]]) {
constexpr sampler default_sampler;
float4 color = particleTexture.sample(default_sampler, point);
if ((color.a < 0.01) || (in.color.a < 0.01)) {
discard_fragment();
}
color = float4(in.color.xyz, 0.2 * color.a * in.color.a);
return color;
}
You're not using a vertex descriptor or a [[stage_in]] parameter for your vertex shader, so Metal is not fetching/gathering vertex data for you. You're just indexing into a buffer that's laid out with your vertex data already in the format you want. That's fine. See my answer here for more info about vertex descriptors.
Given that, though, the vertexStart parameter of the draw call only affects the value of a vertex function parameter with the [[vertex_id]] attribute. Your vertex function doesn't have such a parameter, let alone use it. Instead, it uses an [[instance_id]] parameter to index into the vertex data buffer. For example, with vertexStart: 10 and vertexCount: 1, the [[vertex_id]] value would be 10, but the [[instance_id]] values would still run from 0 through instanceCount - 1. You can read another of my answers here for a quick primer on draw calls and how they result in calls to your vertex shader function.
There are a couple of ways you could change things to draw only half of the points. You could change the draw call you use to:
renderEncoder.drawPrimitives(type: .point,
vertexStart: 0,
vertexCount: 1,
instanceCount: emitter.currentParticles / 2,
baseInstance: emitter.currentParticles / 2)
This would not require any changes to the vertex shader. It just changes the range of values fed to the instance parameter. However, since it doesn't seem like this is really a case of instancing, I recommend that you change the shader and your draw call. For the shader, rename the instance parameter to vertex or vid and change its attribute from [[instance_id]] to [[vertex_id]]. Then, change the draw call to:
renderEncoder.drawPrimitives(type: .point,
vertexStart: emitter.currentParticles / 2,
vertexCount: emitter.currentParticles / 2)
In truth, they basically behave the same way in this case, but the latter better represents what you're doing (and the draw call is simpler, which is nice).
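For reference, here is a sketch of the renamed vertex function (the body is the shader from the question unchanged, except that the last parameter becomes a [[vertex_id]]):
vertex VertexOut vertex_particle(constant float2 &size [[buffer(0)]],
                                 device Particle *particles [[buffer(1)]],
                                 constant float2 &emitterPosition [[ buffer(2) ]],
                                 uint vid [[vertex_id]]) // was: uint instance [[instance_id]]
{
    VertexOut out;
    float2 position = particles[vid].position + emitterPosition;
    out.position.xy = position.xy / size * 2.0 - 1.0;
    out.position.z = 0;
    out.position.w = 1;
    out.point_size = particles[vid].size * particles[vid].scale;
    out.color = particles[vid].color;
    return out;
}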
Related
I'm using ARKit with scene reconstruction and need to render the captured scene geometry in Metal. I can access this geometry through ARMeshAnchor.geometry, which is an ARMeshGeometry. However, when I try rendering it using my custom Metal rendering pipeline, nothing renders and I get a bunch of errors like this:
Invalid device load executing vertex function "myVertex" encoder: "0", draw: 3, at offset 4688
Here's a highly simplified version of my code that I've been using for debugging:
struct InOut {
float4 position [[position]];
};
vertex InOut myVertex(
uint vid [[vertex_id]],
const constant float3* vertexArray [[buffer(0)]])
{
InOut out;
const float3 in = vertexArray[vid];
out.position = float4(in, 1);
return out;
}
fragment float4 myFragment(InOut in [[stage_in]]) {
return float4(1, 0, 0, 1);
}
// Setup MTLRenderPipelineDescriptor
let pipelineDescriptor = MTLRenderPipelineDescriptor()
pipelineDescriptor.colorAttachments[0].pixelFormat = .rgba8Unorm
pipelineDescriptor.sampleCount = 1
pipelineDescriptor.vertexFunction = defaultLibrary.makeFunction(name: "myVertex")
pipelineDescriptor.fragmentFunction = defaultLibrary.makeFunction(name: "myFragment")
let vertexDescriptor = MTLVertexDescriptor()
vertexDescriptor.attributes[0].format = .float3
vertexDescriptor.attributes[0].offset = 0
vertexDescriptor.attributes[0].bufferIndex = 0
vertexDescriptor.layouts[0].stride = MemoryLayout<SIMD3<Float>>.stride
pipelineDescriptor.vertexDescriptor = vertexDescriptor
func render(arMesh: ARMeshAnchor) {
// snip... — Setting up command buffers
let renderEncoder = commandBuffer.makeRenderCommandEncoder(descriptor: renderPassDescriptor)!
renderEncoder.setViewport(MTLViewport(originX: 0, originY: 0, width: 512, height: 512, znear: 0, zfar: 1))
renderEncoder.setRenderPipelineState(renderPipelineState)
let vertices = arMesh.geometry.vertices
let faces = arMesh.geometry.faces
renderEncoder.setVertexBuffer(vertices.buffer, offset: 0, index: 0)
renderEncoder.drawIndexedPrimitives(type: .triangle, indexCount: faces.count * 3, indexType: .uint32, indexBuffer: faces.buffer, indexBufferOffset: 0)
renderEncoder.endEncoding()
// snip... — Clean up
}
I can't figure out why this code causes the Metal exception. It stops throwing if I cap vid in the shader to around 100, but it still doesn't draw anything properly.
What's going on here? Why does my code produce an error and how can I fix it?
The problem here is the alignment/packing of the vertex data.
Each vertex in ARMeshGeometry.vertices consists of 3 float components, for a total size of 12 bytes. The code above assumes that this means the data is a float3 / SIMD3<Float>, however the vertices from ARMeshGeometry are actually tightly packed. So while SIMD3<Float> has a stride of 16, the actual vertex data has a stride of 12.
The larger stride of float3 (16 bytes) vs. the actual size of the elements in the vertices buffer (12 bytes) results in Metal trying to read past the end of the vertices buffer, producing the error.
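To see the mismatch concretely, a standalone Swift check (nothing here is specific to the question's code):
// SIMD3<Float> is padded to a 16-byte boundary, while three tightly
// packed floats occupy only 12 bytes; the 4-byte difference per vertex
// pushes reads past the end of the buffer.
print(MemoryLayout<SIMD3<Float>>.stride) // 16
print(MemoryLayout<Float>.stride * 3)    // 12 — the tightly packed stride ARMeshGeometry uses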
There are two important fixes here:
1. Make sure the MTLVertexDescriptor has the correct stride:
let exampleMeshGeometry: ARMeshGeometry = ...
vertexDescriptor.layouts[0].stride = exampleMeshGeometry.vertices.stride
2. In the shader, use packed_float3 instead of float3:
vertex InOut myVertex(
uint vid [[vertex_id]],
const constant packed_float3* vertexArray [[buffer(0)]])
{
...
}
After fixing these issues, you should be able to properly transfer the ARMeshGeometry buffers to your Metal shader.
I would like to use a fragment shader to render to a texture offscreen. Once this is done I want to use the result of that as the input for another fragment shader.
I create a texture and clear it with red (so I know it is set). I use the render pass that is connected to my target texture and draw. I then use a blit command encoder to transfer the contents of that target texture to a buffer. The buffer contains red, so I know it is reading the texture correctly, but the drawing should make the texture green, so something is wrong.
let textureDescriptor = MTLTextureDescriptor()
textureDescriptor.textureType = MTLTextureType.type2D
textureDescriptor.width = 2048
textureDescriptor.height = 1024
textureDescriptor.pixelFormat = .rgba8Unorm
textureDescriptor.storageMode = .shared
textureDescriptor.usage = [.renderTarget, .shaderRead, .shaderWrite]
bakeTexture = device.makeTexture(descriptor: textureDescriptor)
bakeRenderPass = MTLRenderPassDescriptor()
bakeRenderPass.colorAttachments[0].texture = bakeTexture
bakeRenderPass.colorAttachments[0].loadAction = .clear
bakeRenderPass.colorAttachments[0].clearColor = MTLClearColor(red:1.0,green:0.0,blue:0.0,alpha:1.0)
bakeRenderPass.colorAttachments[0].storeAction = .store
for drawing I do this:
let bakeCommandEncoder = commandBuffer.makeRenderCommandEncoder(descriptor: bakeRenderPass)!
let vp = MTLViewport(originX: 0, originY: 0, width: 2048, height: 1024, znear: 0.0, zfar: 1.0)
bakeCommandEncoder.setViewport(vp)
bakeCommandEncoder.setCullMode(.none) // disable culling
// draw here. Fragment shader sets final color to float4(0.0, 1.0, 0.0, 1.0);
bakeCommandEncoder.endEncoding()
let blitEncoder = commandBuffer.makeBlitCommandEncoder()!
blitEncoder.copy(...) // this works as my buffer is all red
blitEncoder.endEncoding()
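For reference, a texture-to-buffer copy of this kind might look like the following sketch (readbackBuffer is a hypothetical MTLBuffer; the byte counts are sized for the 2048×1024 .rgba8Unorm texture above):
let bytesPerRow = 2048 * 4 // 4 bytes per .rgba8Unorm pixel
blitEncoder.copy(from: bakeTexture,
                 sourceSlice: 0,
                 sourceLevel: 0,
                 sourceOrigin: MTLOrigin(x: 0, y: 0, z: 0),
                 sourceSize: MTLSize(width: 2048, height: 1024, depth: 1),
                 to: readbackBuffer, // hypothetical buffer of at least bytesPerRow * 1024 bytes
                 destinationOffset: 0,
                 destinationBytesPerRow: bytesPerRow,
                 destinationBytesPerImage: bytesPerRow * 1024)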
Here is the vertex shader - it is based on an OpenGL vertex shader to dump out the uv-layout of the texture:
struct VertexOutBake {
float4 position [[position]];
float3 normal;
float3 tangent;
float3 binormal;
float3 worldPosition;
float2 texCoords;
};
vertex VertexOutBake vertex_main_bake(VertexInBake vertexIn [[stage_in]],
constant VertexUniforms &uniforms [[buffer(1)]])
{
VertexOutBake vertexOut;
float4 worldPosition = uniforms.modelMatrix * float4(vertexIn.position, 1);
vertexOut.worldPosition = worldPosition.xyz;
vertexOut.normal = normalize(vertexIn.normal);
vertexOut.tangent = normalize(vertexIn.tangent);
vertexOut.binormal = normalize(vertexIn.binormal);
vertexOut.texCoords = vertexIn.texCoords;
vertexOut.texCoords.y = 1.0 - vertexOut.texCoords.y; // flip image
// now use uv coordinates instead of 3D positions
vertexOut.position.x = vertexOut.texCoords.x * 2.0 - 1.0;
vertexOut.position.y = 1.0 - vertexOut.texCoords.y * 2.0;
vertexOut.position.z = 1.0;
vertexOut.position.w = 1.0;
return vertexOut;
}
The buffer that gets filled as a result of the blit copy should be green but it is red. This seems to mean that either the bakeTexture is not being written to in the fragment shader or that it is, but there is some synchronization missing to make the content available at the time I am doing the copy.
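On the synchronization theory: since the texture uses .shared storage, one blunt way to rule out a CPU/GPU race (a sketch; it assumes the CPU readback happens right after commit) is to block until the command buffer finishes before reading the buffer:
commandBuffer.commit()
commandBuffer.waitUntilCompleted() // ensure the render pass and the blit have finished on the GPU
// Only read the destination buffer's contents on the CPU after this point.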
I have been trying to use texture2d_array for my application of live filters in Metal, but I'm not getting the proper result.
I'm creating the texture array like this:
Code: class MetalTextureArray.
class MetalTextureArray {
private(set) var arrayTexture: MTLTexture
private var width: Int
private var height: Int
init(_ width: Int, _ height: Int, _ arrayLength: Int, _ device: MTLDevice) {
self.width = width
self.height = height
let textureDescriptor = MTLTextureDescriptor()
textureDescriptor.textureType = .type2DArray
textureDescriptor.pixelFormat = .bgra8Unorm
textureDescriptor.width = width
textureDescriptor.height = height
textureDescriptor.arrayLength = arrayLength
arrayTexture = device.makeTexture(descriptor: textureDescriptor)!
}
func append(_ texture: MTLTexture) -> Bool {
if let bytes = texture.buffer?.contents() {
let region = MTLRegion(origin: MTLOrigin(x: 0, y: 0, z: 0), size: MTLSize(width: width, height: height, depth: 1))
arrayTexture.replace(region: region, mipmapLevel: 0, withBytes: bytes, bytesPerRow: texture.bufferBytesPerRow)
return true
}
return false
}
}
I'm setting this texture array on the renderEncoder like this:
Code:
let textureArray = MetalTextureArray.init(firstTexture!.width, firstTexture!.height, colorTextures.count, device)
_ = textureArray.append(colorTextures[0].texture)
_ = textureArray.append(colorTextures[1].texture)
_ = textureArray.append(colorTextures[2].texture)
_ = textureArray.append(colorTextures[3].texture)
_ = textureArray.append(colorTextures[4].texture)
renderEncoder.setFragmentTexture(textureArray.arrayTexture, at: 1)
Finally, I'm accessing the texture2d_array in the fragment shader like this:
Code:
struct RasterizerData {
float4 clipSpacePosition [[position]];
float2 textureCoordinate;
};
fragment float4 multipleShader(RasterizerData in [[stage_in]],
texture2d<half> colorTexture [[ texture(0) ]],
texture2d_array<half> texture2D [[ texture(1) ]])
{
constexpr sampler textureSampler (mag_filter::linear,
min_filter::linear,
s_address::repeat,
t_address::repeat,
r_address::repeat);
// Sample the texture and return the color to colorSample
half4 colorSample = colorTexture.sample (textureSampler, in.textureCoordinate);
float4 outputColor;
half red = texture2D.sample(textureSampler, in.textureCoordinate, 2).r;
half green = texture2D.sample(textureSampler, in.textureCoordinate, 2).g;
half blue = texture2D.sample(textureSampler, in.textureCoordinate, 2).b;
outputColor = float4(colorSample.r * red, colorSample.g * green, colorSample.b * blue, colorSample.a);
// We return the color of the texture
return outputColor;
}
The textures I'm appending to the texture array are extracted from an .acv curve file and are of size 256 × 1.
In the line half red = texture2D.sample(textureSampler, in.textureCoordinate, 2).r; I passed 2 as the last argument because I thought it was the index of the texture to be accessed, but I don't know what it actually means.
After doing all this I'm getting a black screen. My other fragment shaders all work fine; only this one produces a black screen. I think in half blue = texture2D.sample(textureSampler, in.textureCoordinate, 2).b I'm getting 0 for all the red, green, and blue values.
Edit 1:
As suggested I used blitcommandEncoder to copy the texture and still no result.
My code goes here,
My MetalTextureArray class has some modifications.
The append method now goes like this:
func append(_ texture: MTLTexture) -> Bool {
self.blitCommandEncoder.copy(from: texture, sourceSlice: 0, sourceLevel: 0, sourceOrigin: MTLOrigin(x: 0, y: 0, z: 0), sourceSize: MTLSize(width: texture.width, height: texture.height, depth: 1), to: self.arrayTexture, destinationSlice: count, destinationLevel: 0, destinationOrigin: MTLOrigin(x: 0, y: 0, z: 0))
count += 1
return true
}
And I'm appending the textures like this:
let textureArray = MetalTextureArray.init(256, 1, colorTextures.count, device, blitCommandEncoder: blitcommandEncoder)
for (index, _) in colorTextures!.enumerated() {
_ = textureArray.append(colorTextures[index].texture)
}
renderEncoder.setFragmentTexture(textureArray.arrayTexture, at: 1)
My shader code goes like this:
fragment float4 multipleShader(RasterizerData in [[stage_in]],
texture2d<half> colorTexture [[ texture(0) ]],
texture2d_array<float> textureArray [[texture(1)]],
const device struct SliceDataSource &sliceData [[ buffer(2) ]])
{
constexpr sampler textureSampler (mag_filter::linear,
min_filter::linear);
// Sample the texture and return the color to colorSample
half4 colorSample = colorTexture.sample (textureSampler, in.textureCoordinate);
float4 outputColor = float4(0,0,0,0);
int slice = 1;
float red = textureArray.sample(textureSampler, in.textureCoordinate, slice).r;
float blue = textureArray.sample(textureSampler, in.textureCoordinate, slice).b;
float green = textureArray.sample(textureSampler, in.textureCoordinate, slice).g;
outputColor = float4(colorSample.r * red, colorSample.g * green, colorSample.b * blue, colorSample.a);
// We return the color of the texture
return outputColor;
}
Still I get the black screen.
In the call textureArray.sample(textureSampler, in.textureCoordinate, slice), what is the third parameter? I thought it was an index and gave it an arbitrary value to fetch that texture. Is that correct?
Edit 2:
I was finally able to implement the suggestion: calling endEncoding on the blit encoder before the next encoder is created gives me a result. I got the following screen with the ACV negative filter:
[screenshot: result with the ACV negative filter]
Can someone advise?
Thanks.
There's a difference between an array of textures and an array texture. It sounds to me like you just want an array of textures. In that case, you should not use texture2d_array; you should use array<texture2d<half>, 5> texture_array [[texture(1)]].
In the app code, you can either use multiple calls to setFragmentTexture() to assign textures to sequential indexes or you can use setFragmentTextures() to assign a bunch of textures to a range of indexes all at once.
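For example, a binding sketch using the question's colorTextures array (this uses the newer range-based API naming, setFragmentTextures(_:range:), rather than the older at: spelling in the question's code):
let textures: [MTLTexture?] = colorTextures.map { $0.texture }
// Bind all the filter textures to sequential indexes starting at 1.
renderEncoder.setFragmentTextures(textures, range: 1..<(1 + textures.count))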
In the shader code, you'd use array subscripting syntax to refer to the individual textures in the array (e.g. texture_array[2]).
If you really do want to use an array texture, then you probably need to change your append() method. First, if the texture argument was not created with the makeTexture(descriptor:offset:bytesPerRow:) method of MTLBuffer, then texture.buffer will always be nil. That is, textures only have associated buffers if they were originally created from a buffer. To copy from texture to texture, you should use a blit command encoder and its copy(from:sourceSlice:sourceLevel:sourceOrigin:sourceSize:to:destinationSlice:destinationLevel:destinationOrigin:) method.
Second, if you want to replace the texture data for a specific slice (array element) of the array texture, you need to pass that slice index in as an argument to the replace() method. For that, you'd need to use the replace(region:mipmapLevel:slice:withBytes:bytesPerRow:bytesPerImage:) method, not the replace(region:mipmapLevel:withBytes:bytesPerRow:) as you're currently doing. Your current code is just replacing the first slice over and over (assuming the source textures really are associated with a buffer).
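A sketch of what a slice-aware append(_:) might look like, assuming (as the question's code does) that the source textures really are buffer-backed and that a count property tracks the next free slice:
func append(_ texture: MTLTexture) -> Bool {
    // texture.buffer is nil unless the texture was created from an MTLBuffer.
    guard let bytes = texture.buffer?.contents() else { return false }
    let region = MTLRegion(origin: MTLOrigin(x: 0, y: 0, z: 0),
                           size: MTLSize(width: width, height: height, depth: 1))
    arrayTexture.replace(region: region,
                         mipmapLevel: 0,
                         slice: count, // write each source texture into its own array element
                         withBytes: bytes,
                         bytesPerRow: texture.bufferBytesPerRow,
                         bytesPerImage: texture.bufferBytesPerRow * height)
    count += 1
    return true
}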
I'm drawing 2 different vertex buffers in metal, one with a texture (ignoring vertex color data) and the other without a texture (drawing purely the vertex color data):
let commandBuffer = self.commandQueue.makeCommandBuffer()
let commandEncoder = commandBuffer.makeRenderCommandEncoder(descriptor: rpd)
//render first buffer with texture
commandEncoder.setRenderPipelineState(self.rps)
commandEncoder.setVertexBuffer(self.vertexBuffer1, offset: 0, at: 0)
commandEncoder.setVertexBuffer(self.uniformBuffer, offset: 0, at: 1)
commandEncoder.setFragmentTexture(self.texture, at: 0)
commandEncoder.setFragmentSamplerState(self.samplerState, at: 0)
commandEncoder.drawPrimitives(type: .triangle, vertexStart: 0, vertexCount: count1, instanceCount: 1)
//render second buffer without texture
commandEncoder.setRenderPipelineState(self.rps)
commandEncoder.setVertexBuffer(self.vertexBuffer2, offset: 0, at: 0)
commandEncoder.setVertexBuffer(self.uniformBuffer, offset: 0, at: 1)
commandEncoder.setFragmentTexture(nil, at: 0)
commandEncoder.drawPrimitives(type: .triangle, vertexStart: 0, vertexCount: count2, instanceCount: 1)
commandEncoder.endEncoding()
commandBuffer.present(drawable)
commandBuffer.commit()
The shader looks like this:
#include <metal_stdlib>
using namespace metal;
struct Vertex {
float4 position [[position]];
float4 color;
float4 texCoord;
};
struct Uniforms {
float4x4 modelMatrix;
};
vertex Vertex vertex_func(constant Vertex *vertices [[buffer(0)]],
constant Uniforms &uniforms [[buffer(1)]],
uint vid [[vertex_id]])
{
float4x4 matrix = uniforms.modelMatrix;
Vertex in = vertices[vid];
Vertex out;
out.position = matrix * float4(in.position);
out.color = in.color;
out.texCoord = in.texCoord;
return out;
}
fragment float4 fragment_func(Vertex vert [[stage_in]],
texture2d<float> tex2D [[ texture(0) ]],
sampler sampler2D [[ sampler(0) ]]) {
if (vert.color[0] == 0 && vert.color[1] == 0 && vert.color[2] == 0) {
//texture color
return tex2D.sample(sampler2D, float2(vert.texCoord[0],vert.texCoord[1]));
}
else {
//color color
return vert.color;
}
}
Is there a better way of doing this? Any vertex that I want to use the texture for, I set its color to black; the shader checks whether the color is black and, if so, uses the texture, otherwise it uses the vertex color.
Also, is there a way to blend the colored polys and textured polys together using a multiply function if they overlap on the screen? It seems like MTLBlendOperation only has options for add/subtract/min/max, no multiply?
Another way to do this would be to have two different fragment functions, one that renders textured fragments and another one that deals with coloured vertices.
First you would need to create two different MTLRenderPipelineState objects at load time:
let desc = MTLRenderPipelineDescriptor()
/* ...load all other settings in the descriptor... */
// Load the common vertex function.
desc.vertexFunction = library.makeFunction(name: "vertex_func")
// First create the one associated to the textured fragment function.
desc.fragmentFunction = library.makeFunction(name: "fragment_func_textured")
let texturedRPS = try! device.makeRenderPipelineState(descriptor: desc)
// Then modify the descriptor to create the state associated with the untextured fragment function.
desc.fragmentFunction = library.makeFunction(name: "fragment_func_untextured")
let untexturedRPS = try! device.makeRenderPipelineState(descriptor: desc)
Then at render time, before encoding the draw commands of a textured object you set the textured state, and before encoding the draw commands of an untextured object you switch to the untextured one. Like this:
//render first buffer with texture
commandEncoder.setRenderPipelineState(texturedRPS) // Set the textured state
commandEncoder.setVertexBuffer(self.vertexBuffer1, offset: 0, at: 0)
commandEncoder.setVertexBuffer(self.uniformBuffer, offset: 0, at: 1)
commandEncoder.setFragmentTexture(self.texture, at: 0)
commandEncoder.setFragmentSamplerState(self.samplerState, at: 0)
commandEncoder.drawPrimitives(type: .triangle, vertexStart: 0, vertexCount: count1, instanceCount: 1)
//render second buffer without texture
commandEncoder.setRenderPipelineState(untexturedRPS) // Set the untextured state
commandEncoder.setVertexBuffer(self.vertexBuffer2, offset: 0, at: 0)
commandEncoder.setVertexBuffer(self.uniformBuffer, offset: 0, at: 1)
// No need to set the fragment texture as we don't need it in the fragment function.
// commandEncoder.setFragmentTexture(nil, at: 0)
commandEncoder.drawPrimitives(type: .triangle, vertexStart: 0, vertexCount: count2, instanceCount: 1)
No change is required for the vertex function, but you do need to split the fragment function in two:
fragment float4 fragment_func_textured(Vertex vert [[stage_in]],
texture2d<float> tex2D [[ texture(0) ]],
sampler sampler2D [[ sampler(0) ]]) {
//texture color
return tex2D.sample(sampler2D, float2(vert.texCoord[0],vert.texCoord[1]));
}
fragment float4 fragment_func_untextured(Vertex vert [[stage_in]]) {
//color color
return vert.color;
}
You could even go ahead and have two different vertex functions that output two different vertex structures in order to save a few bytes. In fact the textured fragment function only needs the texCoord field and not the color, while the untextured function is the other way around.
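Sketched out, the two slimmed-down output structs might look like this (the names are illustrative, not from the original code):
struct TexturedVertex {
    float4 position [[position]];
    float4 texCoord; // all the textured fragment function reads
};
struct ColoredVertex {
    float4 position [[position]];
    float4 color; // all the untextured fragment function reads
};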
EDIT: You can use this fragment function to use both the texture color and the vertex color:
fragment float4 fragment_func_blended(Vertex vert [[stage_in]],
texture2d<float> tex2D [[ texture(0) ]],
sampler sampler2D [[ sampler(0) ]]) {
// texture color
float4 texture_sample = tex2D.sample(sampler2D, float2(vert.texCoord[0],vert.texCoord[1]));
// vertex color
float4 vertex_sample = vert.color;
// Blend the two together
float4 blended = texture_sample * vertex_sample;
// Or use another blending operation.
// float4 blended = mix(texture_sample, vertex_sample, mix_factor);
// Where mix_factor is in the range 0.0 to 1.0.
return blended;
}
I have a supposedly simple task, but apparently I still don't understand how projections work in shaders. I need to do a 2D perspective transformation on a textured quad (2 triangles), but visually it doesn't look correct (e.g. the trapezoid is slightly taller or more stretched than in the CPU version).
I have this struct:
struct VertexInOut
{
float4 position [[position]];
float3 warp0;
float3 warp1;
float3 warp2;
float3 warp3;
};
And in the vertex shader I do something like (texCoords are pixel coords of the quad corners and homography is calculated in pixel coords):
v.warp0 = texCoords[vid] * homographies[0];
Then in the fragment shader like this:
return intensity.sample(s, inFrag.warp0.xy / inFrag.warp0.z);
The result is not what I expect. I have spent hours on this, but I cannot figure it out.
UPDATE:
These are code and result for CPU (aka expected result):
// _image contains the original image
cv::Matx33d h(1.03140473, 0.0778113901, 0.000169219566,
0.0342947133, 1.06025684, 0.000459250761,
-0.0364957005, -38.3375587, 0.818259298);
cv::Mat dest(_image.size(), CV_8UC4);
// h is transposed because OpenCV is col major and using backwarping because it is what is used on the GPU, so better for comparison
cv::warpPerspective(_image, dest, h.t(), _image.size(), cv::WARP_INVERSE_MAP | cv::INTER_LINEAR);
These are code and result for GPU (aka wrong result):
// constants passed in buffers, image size 320x240
const simd::float4 quadVertices[4] =
{
{ -1.0f, -1.0f, 0.0f, 1.0f },
{ +1.0f, -1.0f, 0.0f, 1.0f },
{ -1.0f, +1.0f, 0.0f, 1.0f },
{ +1.0f, +1.0f, 0.0f, 1.0f },
};
const simd::float3 textureCoords[4] =
{
{ 0, IMAGE_HEIGHT, 1.0f },
{ IMAGE_WIDTH, IMAGE_HEIGHT, 1.0f },
{ 0, 0, 1.0f },
{ IMAGE_WIDTH, 0, 1.0f },
};
// vertex shader
vertex VertexInOut homographyVertex(uint vid [[ vertex_id ]],
constant float4 *positions [[ buffer(0) ]],
constant float3 *texCoords [[ buffer(1) ]],
constant simd::float3x3 *homographies [[ buffer(2) ]])
{
VertexInOut v;
v.position = positions[vid];
// example homography
simd::float3x3 h = {
{1.03140473, 0.0778113901, 0.000169219566},
{0.0342947133, 1.06025684, 0.000459250761},
{-0.0364957005, -38.3375587, 0.818259298}
};
v.warp = h * texCoords[vid];
return v;
}
// fragment shader
fragment float4 homographyFragment(VertexInOut inFrag [[stage_in]],
texture2d<float, access::sample> intensity [[ texture(1) ]])
{
constexpr sampler s(coord::pixel, filter::linear, address::clamp_to_zero);
float4 targetIntensity = intensity.sample(s, inFrag.warp.xy / inFrag.warp.z);
return targetIntensity;
}
Original image: [image omitted]
UPDATE 2:
Contrary to the common belief that the perspective divide should be done in the fragment shader, I get a much more similar result if I divide in the vertex shader (and no distortion or seam between triangles), but why?
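For clarity, the vertex-shader variant described here amounts to something like the following sketch (dividing once per vertex, before interpolation; whether that is mathematically appropriate for this warp is exactly what's in question):
v.warp = h * texCoords[vid];
v.warp.xy /= v.warp.z; // perspective divide per vertex, before rasterization
v.warp.z = 1.0;        // the fragment shader's divide then becomes a no-op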
UPDATE 3:
I get the same (wrong) result if:
I move the perspective divide to the fragment shader
I simply remove the divide from the code
Very strange, it looks like the divide is not happening.
OK, the solution was of course a very small detail: division on a simd::float3 behaves absolutely nuts here. In fact, if I do the perspective divide in the fragment shader like this:
float4 targetIntensity = intensity.sample(s, inFrag.warp.xy * (1.0 / inFrag.warp.z));
it works!
Which led me to find out that multiplying by the precomputed reciprocal is different from dividing by the float directly. The reason for this is still unknown to me; if anyone knows why, we can unravel this mystery.