How to describe packed_float3 in a Metal vertex shader's MTLVertexAttributeDescriptor?

I am passing an array of structs to my Metal shader vertex function. The struct looks like this:
struct Vertex {
    var x, y, z: Float      // position data
    var r, g, b, a: Float   // color data
    var s, t: Float         // texture coordinates
    var nX, nY, nZ: Float   // normal

    func floatBuffer() -> [Float] {
        return [x, y, z, r, g, b, a, s, t, nX, nY, nZ]
    }
}
The floatBuffer function is used to assemble the vertices into one big array of Floats. I am able to pass this into my shader function by using a struct definition which uses "packed" data types, like this:
struct VertexIn {
    packed_float3 position;
    packed_float4 color;
    packed_float2 texCoord;
    packed_float3 normal;
};
vertex VertexOut basic_vertex(
    const device VertexIn* vertex_array [[ buffer(0) ]],
    ...
This works. However, I would like to know how to do the same thing using MTLVertexAttributeDescriptors and the associated syntax. Right now I am getting mangled polygons, presumably because of the byte-alignment differences between float3 and packed_float3.
This is how I'm trying to define it now, and I'm getting the garbage polygons. I got an error that packed_float3 is not valid for attributes, so I was trying to figure out how to make regular float3, float4, etc. work.
struct VertexIn {
    float3 position [[attribute(RayVertexAttributePosition)]];
    float4 color    [[attribute(RayVertexAttributeColor)]];
    float2 texCoord [[attribute(RayVertexAttributeTexCoord)]];
    float3 normal   [[attribute(RayVertexAttributeNormal)]];
};
class func buildMetalVertexDescriptor() -> MTLVertexDescriptor {
    let mtlVertexDescriptor = MTLVertexDescriptor()
    var offset = 0

    mtlVertexDescriptor.attributes[RayVertexAttribute.position.rawValue].format = MTLVertexFormat.float3
    mtlVertexDescriptor.attributes[RayVertexAttribute.position.rawValue].offset = offset
    mtlVertexDescriptor.attributes[RayVertexAttribute.position.rawValue].bufferIndex = RayBufferIndex.positions.rawValue
    offset += 3 * MemoryLayout<Float>.stride

    mtlVertexDescriptor.attributes[RayVertexAttribute.color.rawValue].format = MTLVertexFormat.float4
    mtlVertexDescriptor.attributes[RayVertexAttribute.color.rawValue].offset = offset
    mtlVertexDescriptor.attributes[RayVertexAttribute.color.rawValue].bufferIndex = RayBufferIndex.positions.rawValue
    offset += MemoryLayout<float4>.stride

    mtlVertexDescriptor.attributes[RayVertexAttribute.texCoord.rawValue].format = MTLVertexFormat.float2
    mtlVertexDescriptor.attributes[RayVertexAttribute.texCoord.rawValue].offset = offset
    mtlVertexDescriptor.attributes[RayVertexAttribute.texCoord.rawValue].bufferIndex = RayBufferIndex.positions.rawValue
    offset += MemoryLayout<float2>.stride

    mtlVertexDescriptor.attributes[RayVertexAttribute.normal.rawValue].format = MTLVertexFormat.float3
    mtlVertexDescriptor.attributes[RayVertexAttribute.normal.rawValue].offset = offset
    mtlVertexDescriptor.attributes[RayVertexAttribute.normal.rawValue].bufferIndex = RayBufferIndex.positions.rawValue
    offset += 3 * MemoryLayout<Float>.stride

    print("stride \(offset)")

    mtlVertexDescriptor.layouts[RayBufferIndex.positions.rawValue].stride = offset
    mtlVertexDescriptor.layouts[RayBufferIndex.positions.rawValue].stepRate = 1
    mtlVertexDescriptor.layouts[RayBufferIndex.positions.rawValue].stepFunction = MTLVertexStepFunction.perVertex

    return mtlVertexDescriptor
}
Notice that I specify the first attribute as a float3, but I specify an offset of 3 floats instead of the 4 that a float3 would normally use. But that isn't enough, apparently. How do I set up an MTLVertexDescriptor and the shader struct with attributes so that it handles the 'packed' data from my structs?
Thanks very much.

The key is in this part of your question: "Notice that I specify the first attribute as a float3, but I specify an offset of 3 floats instead of the 4 that a float3 would normally use".
The SIMD float3 type takes up 16 bytes; it has the same memory layout as the non-packed Metal float3 type. So when you advance the offset by only 3 * MemoryLayout<Float>.stride (12 bytes), you are missing the last 4 bytes, which are still present in the buffer, causing the next field to pull from those extra bytes and the rest of the data to be offset.
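To make that concrete: laid out with SIMD types, the four attributes sit at byte offsets 0, 16, 32, and 48 (stride 64), while the descriptor above computes the packed offsets 0, 12, 28, and 36 (stride 48). Everything after the position attribute therefore reads from the wrong bytes.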
To really use packed types to transfer data to Metal (or any graphics API), you either have to stick with what you were doing before and pass x, y, and z as three separate Floats in an array, or you have to define your own struct, like this:
struct Vector3 {
    var x: Float
    var y: Float
    var z: Float
}
Swift doesn't guarantee that this struct will be three Floats packed tightly together, but for now and the foreseeable future it works, and it will be 12 bytes in size on most platforms.
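For example, a fully packed vertex matching the offsets in the question's descriptor could look like this (a sketch; Vector2 and Vector4 are hand-rolled structs analogous to Vector3 above, not the SIMD types):
struct Vector2 { var x, y: Float }       // 8 bytes
struct Vector4 { var x, y, z, w: Float } // 16 bytes

struct PackedVertex {
    var position: Vector3 // offset 0
    var color: Vector4    // offset 12
    var texCoord: Vector2 // offset 28
    var normal: Vector3   // offset 36
}
// MemoryLayout<PackedVertex>.stride == 48, matching the stride and
// offsets computed in buildMetalVertexDescriptor() above.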
If you want to be able to do vector operations on a struct like this, I would suggest looking for a library that defines types like these, to save yourself some time; you will run into the same kinds of problems with 3x3 matrices as well.
I ran into the same problems, so I ended up rolling my own:
https://github.com/jkolb/Swiftish

Related

Corresponding Swift Data types for Metal data types?

I am passing my Metal kernel and shader functions a parameter structure. I can't find anywhere that specifies which Swift data types to use to match the data types in Metal.
I have done my best to guess which data types to use on the Swift side, but it seems to be very picky about the order in which I define the variables in my structs, which leads me to believe that they are not aligned.
For instance, here are the data types I am using in Metal:
struct ComputeParameters {
    bool yesNo;
    int count;
    float scale;
    float2 point;
    float4 color;
};
And here is my corresponding struct in Swift:
struct ComputeParameters {
var yesNo: Bool = false
var count: Int32 = 0
var scale: Float32 = 1.0
var point: float2 = float2(0.0, 0.0)
var color: float4 = float4(0.0, 0.0, 0.0, 1.0)
}
Here is a table of the data types used above:
Metal    Swift
-----    -----
bool     Bool
int      Int32
float    Float32
float2   float2
float4   float4
Are those correct? Is there somewhere the parameter datatypes are documented?
The size of the Int type in Swift depends on the target platform: it can be equivalent to Int32 or Int64, though these days it will almost always be Int64. So you should use the explicit Int32 type to match Metal's 32-bit int type.
As of Swift 5, float2 and float4 are deprecated in favor of SIMD2<Float> and SIMD4<Float>, respectively. These correspond exactly with Metal's float2 and float4.
I believe the rest of your correspondences are correct.
However, it's probably not wise to define these structs in Swift in the first place. Swift gives you no guarantees regarding struct layout (padding, alignment, and member order), so you could wind up with a layout mismatch between Swift and MSL (I haven't seen this happen, but the point is that it can).
The current guidance, I believe, is to define such structs in C/Objective-C instead and import them via a bridging header. That makes it more likely that memcpy-style copies of structs into Metal buffers will do the right thing. Always pay careful attention to size and alignment, especially since manually reordering struct members can change the size and/or stride of the struct.
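If you do keep the struct in Swift, you can at least verify at runtime that Swift chose the layout the shader expects. A minimal sketch, assuming the SIMD types recommended above (the expected values hold on current Apple platforms, but the language does not guarantee them):
import simd

struct ComputeParameters {
    var yesNo: Bool = false
    var count: Int32 = 0
    var scale: Float = 1.0
    var point: SIMD2<Float> = .zero
    var color: SIMD4<Float> = SIMD4<Float>(0, 0, 0, 1)
}

// Compare Swift's actual layout against the offsets Metal will use.
print(MemoryLayout<ComputeParameters>.offset(of: \.point)!) // 16 expected
print(MemoryLayout<ComputeParameters>.offset(of: \.color)!) // 32 expected
print(MemoryLayout<ComputeParameters>.stride)               // 48 expected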

Can I get the size of a buffer from my Metal shader?

In my iOS app, written in Swift, I generate a Metal buffer with:
vertexBuffer = device.newBufferWithBytes(vertices, length: vertices.count * sizeofValue(vertices[0]), options: nil)
And bind it to my shader program with:
renderCommandEncoder.setVertexBuffer(vertexBuffer, offset: 0, atIndex: 1)
In my shader program, written in Metal shading language, can I access the size of the buffer? I would like to access the next vertex in my buffer to do some differential calculation. Something like:
vertex float4 my_vertex(const device packed_float3* vertices [[buffer(1)]],
                        unsigned int vid [[vertex_id]]) {
    float3 v = vertices[vid];
    // Need to clamp this to not go beyond the buffer,
    // but how do I know the max value of vid?
    float3 nextVertex = vertices[vid + 1];
    float3 tangent = nextVertex - v;
    // ...
}
Is my only option to pass the number of vertices as a uniform?
As far as I know, no, you can't, because vertices is just a pointer to an address. As in C++, you need two things to know the count or size of an array:
1) the data type of the elements (a float or some struct), and
2) either the element count for that data type or the total byte size of the array.
So yes, you would need to pass the array count as a uniform.
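On the Swift side, passing the count could look like this (a sketch using the current API names; the buffer index 2 is arbitrary, and the vertices array is the one from the question):
var vertexCount = UInt32(vertices.count)
renderCommandEncoder.setVertexBytes(&vertexCount,
                                    length: MemoryLayout<UInt32>.size,
                                    index: 2)
The shader would then declare a matching parameter, e.g. constant uint& vertexCount [[buffer(2)]], and clamp vid against it.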
For textures, you can. You can get the size of a texture from within shader code: texture types have get_width() and get_height() member functions, which return a uint.
uint get_width() const;
uint get_height() const;
But that probably does not answer the OP's question about vertex buffers.
Actually, you can. You can use the resulting value in loops and conditionals; you just can't use it to initialize objects (so dynamic arrays fail):
uint tempUint = 0;                             // some random type
uint uintSize = sizeof(tempUint);              // get the size of the type
uint aVectorSize = sizeof(aVector) / uintSize; // divide the buffer size by the type size

float dynamicArray[aVectorSize]; // this fails

for (uint counter = 0; counter < aVectorSize; ++counter) {
    // do stuff
}

if (aVectorSize > 10) {
    // do more stuff
}

HLSL float array packing in constant buffer?

I have a problem passing a float array to a vertex shader (HLSL) through a constant buffer. I know that each "float" in the array below gets a 16-byte slot all to itself (space equivalent to a float4) due to the HLSL packing rules:
// C++ struct
struct ForegroundConstants
{
    DirectX::XMMATRIX transform;
    float bounceCpp[64];
};

// Vertex shader constant buffer
cbuffer ForegroundConstantBuffer : register(b0)
{
    matrix transform;
    float bounceHlsl[64];
};
(Unfortunately, the simple solution here does not work; nothing is drawn after I made that change.)
While the C++ data gets passed, due to the packing rule the values get spaced out so that each "float" in the bounceCpp C++ array lands in a 16-byte slot all by itself in the bounceHlsl array. This results in a warning similar to the following:
ID3D11DeviceContext::DrawIndexed: The size of the Constant Buffer at slot 0 of the Vertex Shader unit is too small (320 bytes provided, 1088 bytes, at least, expected). This is OK, as out-of-bounds reads are defined to return 0. It is also possible the developer knows the missing data will not be used anyway. This is only a problem if the developer actually intended to bind a sufficiently large Constant Buffer for what the shader expects.
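(The byte counts line up with the packing rule: tightly packed, the 4x4 matrix plus 64 floats is 64 + 64 × 4 = 320 bytes, while the cbuffer layout, with each array element in its own 16-byte register, expects 64 + 64 × 16 = 1088 bytes.)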
The recommendation, as pointed out here and here, is to rewrite the HLSL constant buffer this way:
cbuffer ForegroundConstantBuffer : register(b0)
{
    matrix transform;
    float4 bounceHlsl[16]; // equivalent to 64 floats
};

static float temp[64] = (float[64]) bounceHlsl;

float4 main(float4 pos : POSITION) : SV_POSITION
{
    int index = someValueRangeFrom0to63;
    float y = temp[index];
    // Bla bla bla...
}
But that didn't work (i.e., ID3D11Device1::CreateVertexShader never returns). I'm compiling against Shader Model 4, feature level 9_1. Can you spot anything that I have done wrong here?
Thanks in advance! :)
One solution, albeit a non-optimal one, is to just declare your float array as
float4 bounceHlsl[16];
and then index into it like
float x = ((float[4])(bounceHlsl[i / 4]))[i % 4];
where i is the index you require.
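For example, i = 13 reads bounceHlsl[3] (since 13 / 4 = 3) and selects component 1 (since 13 % 4 = 1), i.e. the y component of that register.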

Shader including another shader?

Is it possible, using XNA 4, to include a shader within another shader? I know you could do this in 3.1, but I seem to be having trouble getting it to work. If you can help, any pointers would be great.
EDIT
//---------------------------------------------------------------------------//
// Name : Rain.fx
// Desc : Rain particle effect using cylindrical billboards
// Author : Justin Stoecker. Copyright (C) 2008-2009.
//---------------------------------------------------------------------------//

#include "common.inc" // It's this line that causes me a problem

float4x4 matWorld;
float3 vVelocity;
float3 vOrigin;   // min point of the cube area
float fWidth;     // width of the weather region (x-axis)
float fHeight;    // height of the weather region (y-axis)
float fLength;    // length of the weather region (z-axis)

... Rest of file ...
The "common.inc" file has variables in there, but I was wondering if you could put methods in there as well?
Yes, it's possible; from memory, I think the basic effect shader example from the MS App Hub does it.
In any case, see the code below!
In FractalBase.fxh:
float4x4 MatrixTransform : register(vs, c0);
float2 Pan;
float Zoom;
float Aspect;

float ZPower = 2;
float3 Colour = 0;
float3 ColourScale = 0;

float ComAbs(float2 Arg)
{
}

float2 ComSquare(float2 Arg)
{
}

int GreaterThan(float x, float y)
{
}

float4 GetColour(int DoneIterations, float MaxIterations, float BailoutTest, float OldBailoutTest, float BailoutFigure)
{
}

void SpriteVertexShader(inout float4 Colour   : COLOR0,
                        inout float2 texCoord : TEXCOORD0,
                        inout float4 position : SV_Position)
{
    position = mul(position, MatrixTransform);

    // Convert the position from screen space into complex coordinates
    texCoord = (position) * Zoom * float2(1, Aspect) - float2(Pan.x, -Pan.y);
}
In FractalMandelbrot.fx:
#include "FractalBase.fxh"

float4 FractalPixelShader(float2 texCoord : TEXCOORD0, uniform float Iterations) : COLOR0
{
}

technique Technique1
{
    pass
    {
        VertexShader = compile vs_3_0 SpriteVertexShader();
        PixelShader  = compile ps_3_0 FractalPixelShader(128);
    }
}
#includes work like this: the preprocessor loads your main .fx file and parses it, looking for anything that starts with #. An #include directive causes the preprocessor to load the referenced file and insert its contents into the source buffer; effectively, your #include directive is replaced by the entire contents of the included file.
So, yes, you can define anything in your #includes that you can define in a regular .fx file. I use this for keeping lighting functions, vertex type declarations, etc. in common files that are used by several shaders.

HLSL: Using arrays inside a struct

I came across some weird behavior in HLSL. I am trying to use an array contained within a struct, like this (pixel shader code):
struct VSOUT {
    float4 projected : SV_POSITION;
    float3 pos       : POSITION;
    float3 normal    : NORMAL;
};

struct Something {
    float a[17];
};

float4 shMain(VSOUT input) : SV_Target {
    Something s;
    for (int i = 0; i < (int)(input.pos.x * 800); ++i)
        s.a[(int)input.pos.x] = input.pos.x;

    return col * s.a[(int)input.pos.x];
}
The code makes no sense logically; it's just a sample. The problem is that when I try to compile it, I get the following error (line 25 is the for-loop line):
(25,7): error X3511: Forced to unroll loop, but unrolling failed.
However, when I put the array outside the struct (just declaring float a[17] in shMain), everything works as expected.
My question is: why is DirectX trying to unroll the (unrollable) for-loop when the array lives inside the struct? Is this documented behavior? Is there any workaround available other than putting the array outside the struct?
I am using shader model 4.0 and the DirectX 10 SDK from June 2010.
EDIT:
For clarification, here is the working code; it only replaces the use of the struct Something with a plain array:
struct VSOUT {
    float4 projected : SV_POSITION;
    float3 pos       : POSITION;
    float3 normal    : NORMAL;
};

float4 shMain(VSOUT input) : SV_Target {
    float a[17]; // direct declaration of the array
    for (int i = 0; i < (int)(input.pos.x * 800); ++i)
        a[(int)input.pos.x] = input.pos.x;

    return col * a[(int)input.pos.x];
}
This code compiles and works as expected. It works even if I add the [loop] attribute in front of the for-loop, which means the loop is not unrolled (which is correct behavior).
I'm not sure, but what I know is that the hardware schedules and processes fragments in blocks of 2x2 (for computing derivatives). This could be a reason that fxc tries to unroll the for-loop, so that the shader program can execute in lockstep.
Also, did you try the [loop] attribute to generate code that uses flow control?
