I understand it is possible to pass a 1D array buffer to a metal shader, but is it possible to have it output to a 1D array buffer? I don't want it to write to a texture - I just need an array of processed values.
I can get values out with the shader with the following code, but they are one value at a time. Ideally I could get a whole array out (in the same order as the input 1D array buffer).
Any examples or pointers would be greatly appreciated!
var resultdata = [Float](repeating: 0, count: 3)
let outVectorBuffer = device.makeBuffer(bytes: &resultdata, length: MemoryLayout<float3>.size, options: [])
commandEncoder!.setBuffer(outVectorBuffer, offset: 0, index: 6)
commandBuffer!.addCompletedHandler {commandBuffer in
let data = NSData(bytes: outVectorBuffer!.contents(), length: MemoryLayout<float3>.size)
var out: float3 = float3(0,0,0)
data.getBytes(&out, length: MemoryLayout<float3>.size)
print("data: \(out)")
}
// In the shader:
kernel void compute1d(
    ...
    device float3 &outBuffer [[buffer(6)]])
{
    outBuffer = float3(1.0, 2.0, 3.0);
}
Two things:
You need to create the buffer large enough to hold as many float3 elements as you want. Use .stride rather than .size when calculating the buffer size, though. float3 has 16-byte alignment, so each element in an array occupies 16 bytes even though it carries only 12 bytes of data. So, you would use something like MemoryLayout<float3>.stride * desiredNumberOfElements.
Then, in the shader, you need to change the declaration of outBuffer from a reference to a pointer. So, device float3 *outBuffer [[buffer(6)]]. Then you can index into it to access the elements (e.g. outBuffer[2] = ...;).
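To make the size arithmetic concrete, here is a minimal C++ sketch. It is not Metal or Swift API code: `Float3` is a hypothetical stand-in for a 16-byte-aligned `float3`, and `bufferBytes` and `elementOffset` are illustrative helper names.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for a 16-byte-aligned float3: three floats of
// payload (12 bytes), padded up to the 16-byte alignment.
struct alignas(16) Float3 {
    float x, y, z;
};

static_assert(sizeof(Float3) == 16, "one array element occupies 16 bytes");

// Bytes needed for `count` elements. In C++, sizeof already includes the
// trailing padding, so it plays the role of Swift's MemoryLayout.stride.
constexpr std::size_t bufferBytes(std::size_t count) {
    return sizeof(Float3) * count;
}

// Byte offset of element `index`: it advances by 16 per element, not 12.
constexpr std::size_t elementOffset(std::size_t index) {
    return sizeof(Float3) * index;
}
```

Under these assumptions, a buffer for three elements needs 48 bytes rather than 36, and the element the shader writes as outBuffer[2] starts at byte 32.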
I'm trying to make a simple 3D modeling tool.
Transforming the model involves moving a vertex (or several vertices).
I used a dynamic vertex buffer because I expected it to need frequent updates.
But performance is too low on a high-polygon model, even when I change just one vertex.
Are there other methods, or am I doing this the wrong way?
Here is my D3D11_BUFFER_DESC:
Usage = D3D11_USAGE_DYNAMIC;
CPUAccessFlags = D3D11_CPU_ACCESS_WRITE;
BindFlags = D3D11_BIND_VERTEX_BUFFER;
ByteWidth = sizeof(ST_Vertex) * _nVertexCount
D3D11_SUBRESOURCE_DATA d3dBufferData;
d3dBufferData.pSysMem = pVerticesInfo;
hr = pd3dDevice->CreateBuffer(&descBuffer, &d3dBufferData, &_pVertexBuffer);
And my update function:
D3D11_MAPPED_SUBRESOURCE d3dMappedResource;
pImmediateContext->Map(_pVertexBuffer, 0, D3D11_MAP_WRITE_DISCARD, 0, &d3dMappedResource);
ST_Vertex* pBuffer = (ST_Vertex*)d3dMappedResource.pData;
for (int i = 0; i < vIndice.size(); ++i)
{
pBuffer[vIndice[i]].xfPosition.x = pVerticesInfo[vIndice[i]].xfPosition.x;
pBuffer[vIndice[i]].xfPosition.y = pVerticesInfo[vIndice[i]].xfPosition.y;
pBuffer[vIndice[i]].xfPosition.z = pVerticesInfo[vIndice[i]].xfPosition.z;
}
pImmediateContext->Unmap(_pVertexBuffer, 0);
As mentioned in the previous answer, you are updating your whole buffer every time, which will be slow depending on model size.
The solution is indeed to implement partial updates. There are two cases: you want to update a single vertex, or you want to update arbitrary indices (for example, move N vertices in one go at different locations, such as vertices 1, 20 and 23).
The first case is rather simple. First create your buffer with the following description:
Usage = D3D11_USAGE_DEFAULT;
CPUAccessFlags = 0;
BindFlags = D3D11_BIND_VERTEX_BUFFER;
ByteWidth = sizeof(ST_Vertex) * _nVertexCount
D3D11_SUBRESOURCE_DATA d3dBufferData;
d3dBufferData.pSysMem = pVerticesInfo;
hr = pd3dDevice->CreateBuffer(&descBuffer, &d3dBufferData, &_pVertexBuffer);
This makes sure your vertex buffer is visible to the GPU only.
Next, create a second, dynamic buffer the size of a single vertex. It is only used as a copy source. Note that a D3D11_USAGE_DYNAMIC buffer must still have at least one bind flag set, whereas a D3D11_USAGE_STAGING buffer must have none.
_pCopyVertexBuffer
Usage = D3D11_USAGE_DYNAMIC; //Staging works as well (with BindFlags = 0, mapped with D3D11_MAP_WRITE)
CPUAccessFlags = D3D11_CPU_ACCESS_WRITE;
BindFlags = D3D11_BIND_VERTEX_BUFFER; //never actually bound, but dynamic resources need a bind flag
ByteWidth = sizeof(ST_Vertex);
//no initial data is needed, so pass NULL instead of a D3D11_SUBRESOURCE_DATA
hr = pd3dDevice->CreateBuffer(&descBuffer, NULL, &_pCopyVertexBuffer);
When you move a vertex, copy the changed vertex into the copy buffer:
ST_Vertex changedVertex;
D3D11_MAPPED_SUBRESOURCE d3dMappedResource;
//map the small copy buffer, not the GPU-only vertex buffer
pImmediateContext->Map(_pCopyVertexBuffer, 0, D3D11_MAP_WRITE_DISCARD, 0, &d3dMappedResource);
ST_Vertex* pBuffer = (ST_Vertex*)d3dMappedResource.pData;
pBuffer->xfPosition.x = changedVertex.xfPosition.x;
pBuffer->xfPosition.y = changedVertex.xfPosition.y;
pBuffer->xfPosition.z = changedVertex.xfPosition.z;
pImmediateContext->Unmap(_pCopyVertexBuffer, 0);
Since you use D3D11_MAP_WRITE_DISCARD, make sure to write all attributes there (not only the position).
Once that is done, you can use ID3D11DeviceContext::CopySubresourceRegion to copy only the modified vertex to its location in the vertex buffer.
I assume that vertexID is the index of the modified vertex:
pImmediateContext->CopySubresourceRegion(_pVertexBuffer,
    0, //must be 0
    vertexID * sizeof(ST_Vertex), //location of the vertex in your GPU vertex buffer
    0, //must be 0
    0, //must be 0
    _pCopyVertexBuffer,
    0, //must be 0
    NULL //in this case we copy the full content of _pCopyVertexBuffer, so we can set this to NULL
);
Now, if you want to update a list of vertices, things get more complicated and you have several options:
-First, you can apply this single-vertex technique in a loop. This works quite well if your change set is small.
-If your change set is very big (close to the full vertex count), you can probably rewrite the whole buffer instead.
-An intermediate technique is to use a compute shader to perform the updates (that's the one I normally use, as it's the most flexible).
Posting all the C++ binding code would be way too long, but here is the concept:
Your vertex buffer must have BindFlags = D3D11_BIND_VERTEX_BUFFER | D3D11_BIND_UNORDERED_ACCESS; //this allows writing from a compute shader
You need to create an ID3D11UnorderedAccessView for this buffer (so the shader can write to it).
You need the following misc flag: D3D11_RESOURCE_MISC_BUFFER_ALLOW_RAW_VIEWS //this allows writing to it as a RWByteAddressBuffer
You then create two dynamic structured buffers (I prefer those over byte address buffers, but a buffer cannot be both a vertex buffer and a structured buffer in DX11, so the one you write to has to be raw instead).
The first structured buffer has a stride of sizeof(ST_Vertex) (this is your change set).
The second structured buffer has a stride of 4 (uint, these are the indices).
Both structured buffers get an arbitrary element count (I normally use 1024 or 2048), which is the maximum number of vertices you can update in a single pass.
Both structured buffers need an ID3D11ShaderResourceView (shader visible, read only).
The update process is then the following:
Write the modified vertices and their locations into the structured buffers (using map discard; copying fewer elements than the buffers hold is fine).
Attach both structured buffers for reading.
Attach the ID3D11UnorderedAccessView for writing.
Set your compute shader.
Call Dispatch.
Detach the ID3D11UnorderedAccessView (this is VERY important).
Here is sample compute shader code (I assume your vertex is position only, for simplicity):
cbuffer cbUpdateCount : register(b0)
{
uint updateCount;
};
RWByteAddressBuffer RWVertexPositionBuffer : register(u0);
StructuredBuffer<float3> ModifiedVertexBuffer : register(t0);
StructuredBuffer<uint> ModifiedVertexIndicesBuffer : register(t1); //must not share slot t0 with ModifiedVertexBuffer
//this is the stride of your vertex buffer, since here we use float3 it is 12 bytes
#define WRITE_STRIDE 12
[numthreads(64, 1, 1)]
void CS( uint3 tid : SV_DispatchThreadID )
{
//make sure you do not go past the element count, as we run 64 threads at a time here
if (tid.x >= updateCount) { return; }
uint readIndex = tid.x;
uint writeIndex = ModifiedVertexIndicesBuffer[readIndex];
float3 vertex = ModifiedVertexBuffer[readIndex];
//byte address buffers do not understand float, asuint is a binary cast.
RWVertexPositionBuffer.Store3(writeIndex * WRITE_STRIDE, asuint(vertex));
}
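As a way to check the indexing logic off the GPU, the scatter performed by the compute shader can be mirrored on the CPU. The following is only a sketch in plain C++, not D3D11 code: `scatterUpdate` and `readVertex` are hypothetical helper names, the raw vertex buffer is modelled as a byte vector, and the vertex is assumed to be position only (12 bytes), as in the shader.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

struct Float3 { float x, y, z; };  // tightly packed position, 12 bytes
static_assert(sizeof(Float3) == 12, "matches WRITE_STRIDE in the shader");
constexpr std::size_t kWriteStride = sizeof(Float3);

// CPU reference of the scatter: for every i < updateCount, store
// modified[i] at byte offset indices[i] * kWriteStride in the raw buffer.
void scatterUpdate(std::vector<std::uint8_t>& vertexBytes,
                   const std::vector<Float3>& modified,
                   const std::vector<std::uint32_t>& indices,
                   std::size_t updateCount) {
    assert(updateCount <= modified.size() && updateCount <= indices.size());
    for (std::size_t i = 0; i < updateCount; ++i) {
        const std::size_t offset = indices[i] * kWriteStride;
        assert(offset + kWriteStride <= vertexBytes.size());
        // Byte-wise copy, the counterpart of Store3(offset, asuint(vertex)).
        std::memcpy(vertexBytes.data() + offset, &modified[i], kWriteStride);
    }
}

// Read one vertex back from the raw bytes (for inspection only).
Float3 readVertex(const std::vector<std::uint8_t>& vertexBytes, std::size_t index) {
    Float3 v;
    std::memcpy(&v, vertexBytes.data() + index * kWriteStride, kWriteStride);
    return v;
}
```

Each loop iteration corresponds to one shader thread, and the early return on `tid.x >= updateCount` in the shader corresponds to the loop bound here.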
For the purposes of this question I'm going to assume you already have a mechanism for selecting a vertex from a list of vertices based upon ray casting or some other picking method and a mechanism for creating a displacement vector detailing how the vertex was moved in model space.
The method you have for updating the buffer is sufficient for anything less than a few hundred vertices, but on large scale models it becomes extremely slow. This is because you're updating everything, rather than the individual vertices you modified.
To fix this, you should only update the vertices you have changed, and to do that you need to create a change set.
In concept, a change set is nothing more than a set of changes made to the data - a list of the vertices that need to be updated. Since we already know which vertices were modified (otherwise we couldn't have manipulated them), we can map in the GPU buffer, go to that vertex specifically, and copy just those vertices into the GPU buffer.
In your vertex modification method, record the index of the vertex that was modified by the user:
//Modify the vertex coordinates based on mouse displacement
pVerticesInfo[SelectedVertexIndex].xfPosition.x += DisplacementVector.x;
pVerticesInfo[SelectedVertexIndex].xfPosition.y += DisplacementVector.y;
pVerticesInfo[SelectedVertexIndex].xfPosition.z += DisplacementVector.z;
//Add the changed vertex to the list of changes.
changedVertices.push_back(SelectedVertexIndex); //assuming changedVertices is a std::vector of indices
//And update the GPU buffer
UpdateD3DBuffer();
In UpdateD3DBuffer(), do the following:
D3D11_MAPPED_SUBRESOURCE d3dMappedResource;
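//A dynamic buffer can only be mapped with D3D11_MAP_WRITE_DISCARD or D3D11_MAP_WRITE_NO_OVERWRITE.
//NO_OVERWRITE keeps the existing contents, so only the changed vertices need to be written;
//make sure the GPU is not still reading the vertices you overwrite.
pImmediateContext->Map(_pVertexBuffer, 0, D3D11_MAP_WRITE_NO_OVERWRITE, 0, &d3dMappedResource);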
ST_Vertex* pBuffer = (ST_Vertex*)d3dMappedResource.pData;
for (int i = 0; i < changedVertices.size(); ++i)
{
pBuffer[changedVertices[i]].xfPosition.x = pVerticesInfo[changedVertices[i]].xfPosition.x;
pBuffer[changedVertices[i]].xfPosition.y = pVerticesInfo[changedVertices[i]].xfPosition.y;
pBuffer[changedVertices[i]].xfPosition.z = pVerticesInfo[changedVertices[i]].xfPosition.z;
}
pImmediateContext->Unmap(_pVertexBuffer, 0);
changedVertices.clear();
This has the effect of only updating the vertices that have changed, rather than all vertices in the model.
This also allows for some more complex manipulations. You can select multiple vertices and move them all as a group, select a whole face and move all the connected vertices, or move entire regions of the model relatively easily, assuming your picking method is capable of handling this.
In addition, if you record the change sets with enough information (the affected vertices and the displacement vector), you can fairly easily implement an undo function by simply reversing the displacement vector and reapplying the selected change set.
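The change set and undo idea can be sketched on the CPU side as follows. This is an illustration under assumptions, not part of the original code: `Vec3`, `ChangeSet`, `applyChangeSet` and `undoChangeSet` are hypothetical names, and the vertex is reduced to its position.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Vec3 { float x, y, z; };

// One recorded edit: which vertices moved, and by how much.
struct ChangeSet {
    std::vector<std::size_t> indices;
    Vec3 displacement;
};

// Move every vertex listed in the change set by its displacement.
void applyChangeSet(std::vector<Vec3>& vertices, const ChangeSet& change) {
    for (std::size_t index : change.indices) {
        assert(index < vertices.size());
        vertices[index].x += change.displacement.x;
        vertices[index].y += change.displacement.y;
        vertices[index].z += change.displacement.z;
    }
}

// Undo: reapply the same change set with the displacement reversed.
void undoChangeSet(std::vector<Vec3>& vertices, const ChangeSet& change) {
    ChangeSet reversed = change;
    reversed.displacement = Vec3{-change.displacement.x,
                                 -change.displacement.y,
                                 -change.displacement.z};
    applyChangeSet(vertices, reversed);
}
```

After either call, `change.indices` is exactly the list of vertices that has to be re-uploaded to the GPU buffer. Note that repeated apply and undo cycles can accumulate floating-point rounding, so storing the original positions in the change set may be the safer variant.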
I'm trying to port some CIFilters from this source using the Metal Shading Language for Core Image.
I have a color palette, an array of RGB structs, that I want to pass as an argument to a custom CI color kernel.
The RGB structs are converted into an array of SIMD3<Float>.
static func SIMD3Palette(_ palette: [RGB]) -> [SIMD3<Float>] {
return palette.map{$0.toFloat3()}
}
The kernel should take an array of simd_float3 values. The problem is that when I launch the filter, it tells me that the argument at index 1 is expected to be an NSData.
override var outputImage: CIImage? {
guard let inputImage = inputImage else
{
return nil
}
let palette = EightBitColorFilter.palettes[Int(inputPaletteIndex)]
let extent = inputImage.extent
let arguments = [inputImage, palette, Float(palette.count)] as [Any]
let final = colorKernel.apply(extent: extent, arguments: arguments)
return final
}
This is the kernel:
float4 eight_bit(sample_t image, simd_float3 palette[], float paletteSize, destination dest) {
float dist = distance(image.rgb, palette[0]);
float3 returnColor = palette[0];
for (int i = 1; i < floor(paletteSize); ++i) {
float tempDist = distance(image.rgb, palette[i]);
if (tempDist < dist) {
dist = tempDist;
returnColor = palette[i];
}
}
return float4(returnColor, 1);
}
I'm wondering how I can pass a data buffer to the kernel, since converting it into an NSData does not seem to be enough. I saw some examples, but they use the "full" shading language, which is not available for Core Image; Core Image only offers a subset that deals with fragments.
Update
We have now figured out how to pass data buffers directly into Core Image kernels. Using a CIImage as described below is not needed, but still possible.
Assuming that you have your raw data as an NSData, you can just pass it to the kernel on invocation:
kernel.apply(..., arguments: [data, ...])
Note: Data might also work, but I know that NSData is an argument type that allows Core Image to cache filter results based on input arguments. So when in doubt, better cast to NSData.
Then in the kernel function, you only need to declare the parameter with an appropriate constant type:
extern "C" float4 myKernel(constant float3 data[], ...) {
float3 data0 = data[0];
// ...
}
Previous Answer
Core Image kernels don't seem to support pointer or array parameter types, though something seems to be coming with iOS 13. From the Release Notes:
Metal CIKernel instances support arguments with arbitrarily structured data.
But, as so often with Core Image, there seems to be no further documentation for that…
However, you can still use the "old way" of passing buffer data by wrapping it in a CIImage and sampling it in the kernel. For example:
let array: [Float] = [1.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0]
let data = array.withUnsafeBufferPointer { Data(buffer: $0) }
let dataImage = CIImage(bitmapData: data, bytesPerRow: data.count, size: CGSize(width: array.count/4, height: 1), format: .RGBAf, colorSpace: nil)
Note that there is no CIFormat for 3-channel images since the GPU doesn't support those. So you either have to use single-channel .Rf and re-pack the values inside your kernel to float3 again, or add some strides to your data and use .RGBAf and float4 respectively (which I'd recommend since it reduces texture fetches).
When you pass that image into your kernel, you probably want to set the sampling mode to nearest, otherwise you might get interpolated values when sampling between two pixels:
kernel.apply(..., arguments: [dataImage.samplingNearest(), ...])
In your (Metal) kernel, you can access the data as you would with a normal input image via a sampler:
extern "C" float4 myKernel(coreimage::sampler data, ...) {
float4 data0 = data.sample(data.transform(float2(0.5, 0.5))); // data[0]
float4 data1 = data.sample(data.transform(float2(1.5, 0.5))); // data[1]
// ...
}
Note that I added 0.5 to the coordinates so that they point in the middle of a pixel in the data image to avoid ambiguity and interpolation.
Also note that pixel values you get from a sampler always have 4 channels. So even when you create your data image with format .Rf, you'll get a float4 when sampling it (the other values are filled with 0.0 for G and B and 1.0 for alpha). In this case, you can just do
float data0 = data.sample(data.transform(float2(0.5, 0.5))).x;
Edit
I previously forgot to transform the sample coordinate from absolute pixel space (where (0.5, 0.5) would be the middle of the first pixel) to relative sampler space (where (0.5, 0.5) would be the middle of the whole buffer). It's fixed now.
I made it work. Even though the answer was good, and also deploys to lower targets, the result wasn't exactly what I was expecting. The difference between the original kernel written as a string and the above method of creating an image as the data source was fairly big.
I didn't find the exact reason, but the image I was passing as the palette source differed from the created one in size and color (probably due to color spaces).
Since there was no documentation about this statement:
Metal CIKernel instances support arguments with arbitrarily structured data.
I experimented a lot in my spare time and came up with this.
First the shader:
float4 eight_bit_buffer(sampler image, constant simd_float3 palette[], float paletteSize, destination dest) {
float4 color = image.sample(image.transform(dest.coord()));
float dist = distance(color.rgb, palette[0]);
float3 returnColor = palette[0];
for (int i = 1; i < floor(paletteSize); ++i) {
float tempDist = distance(color.rgb, palette[i]);
if (tempDist < dist) {
dist = tempDist;
returnColor = palette[i];
}
}
return float4(returnColor, 1);
}
Second the palette transformation into SIMD3<Float>:
static func toSIMD3Buffer(from palette: [RGB]) -> Data {
    var simd3Palette = SIMD3Palette(palette)
    // Size of the whole palette in bytes; stride includes the padding of each SIMD3<Float>
    let byteCount = simd3Palette.count * MemoryLayout<SIMD3<Float>>.stride
    let palettePointer = UnsafeMutableRawPointer.allocate(
        byteCount: byteCount,
        alignment: MemoryLayout<SIMD3<Float>>.alignment)
    let simd3Pointer = simd3Palette.withUnsafeMutableBufferPointer { (buffer) -> UnsafeMutablePointer<SIMD3<Float>> in
        // Copy the palette into the newly allocated memory
        let p = palettePointer.initializeMemory(as: SIMD3<Float>.self,
                                                from: buffer.baseAddress!,
                                                count: buffer.count)
        return p
    }
    // Data takes ownership; memory obtained from allocate() is released with deallocate()
    let data = Data(bytesNoCopy: simd3Pointer, count: byteCount, deallocator: .custom({ pointer, _ in pointer.deallocate() }))
    return data
}
My first attempt, appending each SIMD3 to the Data object, didn't work, probably due to memory alignment.
The Data object releases the allocated memory through its deallocator once it is no longer used, so no manual deallocation is needed.
I hope this helps someone else.
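For checking the kernel's output, the nearest-palette search it performs can be reproduced on the CPU. Below is a minimal C++ reference sketch, not Core Image code: `Color`, `squaredDistance` and `nearestPaletteIndex` are hypothetical names, and the squared distance is used because it orders candidates the same way as `distance()` does.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Color { float r, g, b; };

// Squared Euclidean distance; skipping the square root does not change
// which palette entry is closest.
float squaredDistance(const Color& a, const Color& b) {
    const float dr = a.r - b.r;
    const float dg = a.g - b.g;
    const float db = a.b - b.b;
    return dr * dr + dg * dg + db * db;
}

// Index of the palette entry closest to `pixel`; ties keep the earliest
// entry, like the strict `<` comparison in the kernel.
std::size_t nearestPaletteIndex(const Color& pixel, const std::vector<Color>& palette) {
    assert(!palette.empty());
    std::size_t best = 0;
    float bestDist = squaredDistance(pixel, palette[0]);
    for (std::size_t i = 1; i < palette.size(); ++i) {
        const float d = squaredDistance(pixel, palette[i]);
        if (d < bestDist) {
            bestDist = d;
            best = i;
        }
    }
    return best;
}
```

With a palette of black, red and white, a pixel of (0.9, 0.1, 0.1) maps to red and a pixel of (0.2, 0.2, 0.2) maps to black.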
I am passing an array of structs to my Metal shader vertex function. The struct looks like this:
struct Vertex {
var x,y,z: Float // position data
var r,g,b,a: Float // color data
var s,t: Float // texture coordinates
var nX,nY,nZ: Float // normal
func floatBuffer() -> [Float] {
return [x,y,z,r,g,b,a,s,t,nX,nY,nZ]
}
};
The floatBuffer function is used to assemble the vertices into one big array of Floats. I am able to pass this into my shader function by using a struct definition which uses "packed" data types, like this:
struct VertexIn {
packed_float3 position;
packed_float4 color;
packed_float2 texCoord;
packed_float3 normal;
};
vertex VertexOut basic_vertex(
const device VertexIn* vertex_array [[ buffer(0) ]],
.
.
.
This works. However, I would like to know how to do the same thing using MTLVertexAttributeDescriptors and the associated syntax. Right now I am getting mangled polygons, presumably because of the byte alignment differences with float3 and packed_float3?
This is how I'm trying to define it now and getting the garbage polygons. I got an error that "packed_float3" is not valid for attributes, so I was trying to figure out how to make regular float3, float4, etc work.
struct VertexIn {
float3 position [[attribute(RayVertexAttributePosition)]];
float4 color [[attribute(RayVertexAttributeColor)]];
float2 texCoord [[attribute(RayVertexAttributeTexCoord)]];
float3 normal [[attribute(RayVertexAttributeNormal)]];
};
class func buildMetalVertexDescriptor() -> MTLVertexDescriptor {
let mtlVertexDescriptor = MTLVertexDescriptor()
var offset = 0
mtlVertexDescriptor.attributes[RayVertexAttribute.position.rawValue].format = MTLVertexFormat.float3
mtlVertexDescriptor.attributes[RayVertexAttribute.position.rawValue].offset = offset
mtlVertexDescriptor.attributes[RayVertexAttribute.position.rawValue].bufferIndex = RayBufferIndex.positions.rawValue
offset += 3*MemoryLayout<Float>.stride
mtlVertexDescriptor.attributes[RayVertexAttribute.color.rawValue].format = MTLVertexFormat.float4
mtlVertexDescriptor.attributes[RayVertexAttribute.color.rawValue].offset = offset
mtlVertexDescriptor.attributes[RayVertexAttribute.color.rawValue].bufferIndex = RayBufferIndex.positions.rawValue
offset += MemoryLayout<float4>.stride
mtlVertexDescriptor.attributes[RayVertexAttribute.texCoord.rawValue].format = MTLVertexFormat.float2
mtlVertexDescriptor.attributes[RayVertexAttribute.texCoord.rawValue].offset = offset
mtlVertexDescriptor.attributes[RayVertexAttribute.texCoord.rawValue].bufferIndex = RayBufferIndex.positions.rawValue
offset += MemoryLayout<float2>.stride
mtlVertexDescriptor.attributes[RayVertexAttribute.normal.rawValue].format = MTLVertexFormat.float3
mtlVertexDescriptor.attributes[RayVertexAttribute.normal.rawValue].offset = offset
mtlVertexDescriptor.attributes[RayVertexAttribute.normal.rawValue].bufferIndex = RayBufferIndex.positions.rawValue
offset += 3*MemoryLayout<Float>.stride
print("stride \(offset)")
mtlVertexDescriptor.layouts[RayBufferIndex.positions.rawValue].stride = offset
mtlVertexDescriptor.layouts[RayBufferIndex.positions.rawValue].stepRate = 1
mtlVertexDescriptor.layouts[RayBufferIndex.positions.rawValue].stepFunction = MTLVertexStepFunction.perVertex
return mtlVertexDescriptor
}
Notice that I specify the first attribute as a float3, but I specify an offset of 3 floats instead of the 4 that a float3 would normally use. But it isn't enough, apparently. I'm wondering how to set up a MTLVertexDescriptor and the shader struct with attributes so that it handles the 'packed' data from my structs?
Thanks very much.
The key is in this part of your question: "Notice that I specify the first attribute as a float3, but I specify an offset of 3 floats instead of the 4 that a float3 would normally use".
The SIMD float3 type takes up 16 bytes; it has the same memory layout as the non-packed Metal float3 type. So when you set the offset to only 3*MemoryLayout<Float>.stride, you are missing the last 4 bytes, which are still present. The next field then pulls from those extra bytes, and the rest of the data is offset.
To really use packed types to transfer data to Metal (or any graphics API) you either have to stick with what you were doing before and specify x, y, z in three separate Floats in an array, or you have to define your own struct like this:
struct Vector3 {
var x: Float
var y: Float
var z: Float
}
Swift doesn't have any guarantees that this struct will be three Floats packed closely together, but for now and the foreseeable future it works and will be 12 bytes in size on most platforms.
If you want to be able to do vector operations on a struct like this then I would suggest looking for a library that defines types like these to save yourself some time as you will run into the same types of problems with 3x3 matrices also.
I ran into the same problems so I ended up rolling my own:
https://github.com/jkolb/Swiftish
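The difference between the two layouts discussed in this thread can be checked with plain C++ `offsetof` and `sizeof`. This is only a sketch: `PackedVertex` mirrors the 12-float array from the question, while `AlignedFloat2`, `AlignedFloat3`, `AlignedFloat4` and `AlignedVertex` are hypothetical stand-ins for the alignment of the non-packed vector types.

```cpp
#include <cassert>
#include <cstddef>

// Tightly packed vertex: 12 consecutive floats, as produced by floatBuffer().
struct PackedVertex {
    float position[3];
    float color[4];
    float texCoord[2];
    float normal[3];
};

static_assert(offsetof(PackedVertex, color) == 12, "position takes 12 bytes");
static_assert(offsetof(PackedVertex, texCoord) == 28, "color takes 16 bytes");
static_assert(offsetof(PackedVertex, normal) == 36, "texCoord takes 8 bytes");
static_assert(sizeof(PackedVertex) == 48, "packed stride");

// Stand-ins for non-packed vectors: 3- and 4-component vectors are aligned
// to 16 bytes, 2-component vectors to 8 bytes.
struct alignas(8)  AlignedFloat2 { float x, y; };
struct alignas(16) AlignedFloat3 { float x, y, z; };
struct alignas(16) AlignedFloat4 { float x, y, z, w; };

struct AlignedVertex {
    AlignedFloat3 position;  // offset 0, occupies 16 bytes
    AlignedFloat4 color;     // offset 16
    AlignedFloat2 texCoord;  // offset 32
    AlignedFloat3 normal;    // offset 48 (padded up from 40)
};

static_assert(sizeof(AlignedVertex) == 64, "aligned stride");
```

Whichever layout the buffer actually uses, the offsets and the stride given to the vertex descriptor have to match it field by field: 0, 12, 28, 36 with stride 48 for the packed data, or 0, 16, 32, 48 with stride 64 for the aligned one.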
I'm trying to fill a 1D texture with values manually and pass that texture to a compute shader (these are 2 pixels that I want to set via code, they don't represent any image).
Due to the current small amount of Metal examples, all examples I could find deal with 2D textures that load the texture by converting a loaded UIImage to raw bytes data, but creating a dummy UIImage felt like a hack for me.
This is the "naive" way I started with -
...
var manualTextureData: [Float] = [ 1.0, 0.0, 0.0, 1.0,
0.0, 0.0, 1.0, 1.0 ];
let region: MTLRegion = MTLRegionMake1D(0, textureDescriptor.width);
myTexture.replaceRegion(region, mipmapLevel: 0, withBytes: &manualTextureData, bytesPerRow: 0);
but Metal doesn't recognize those values in the shader (it gets an empty texture, except for the first value).
I quickly realized that the Float array probably has to be converted into a bytes array (e.g UInt8), but couldn't find a way to convert from [Float] to [UInt8] either.
Another possible option I consider is using a CVPixelBuffer object, but that also felt like a workaround to the problem.
So what's the right way to tackle this?
Thanks in advance.
Please note I'm not familiar with Objective-C, hence I'm not sure whether using CVPixelBuffer / UIImage is exaggerated for something which should be straight-forward.
Please forgive the terse reply, but you may find it useful to take a look at my experiments with Swift and Metal. I've created a particle system in Swift which is passed to a Metal compute shader as a one dimensional array of Particle structs. By using posix_memalign, I'm able to eliminate the bottleneck caused by passing the array between Metal and Swift.
I've blogged extensively about this: http://flexmonkey.blogspot.co.uk/search/label/Metal
I hope this helps.
Simon
I don't see any reason for you to pass data using 1D texture. Instead I would go with just passing a buffer. Like this:
var dataBuffer:MTLBuffer? = device.newBufferWithBytes(&manualTextureData, length: manualTextureData.count * sizeof(Float), options: MTLResourceOptions.OptionCPUCacheModeDefault)
Then you hook it to your renderCommandEncoder like this:
renderCommandEncoder.setFragmentBuffer(dataBuffer, offset: 0, atIndex: 1) // Note: to pass this buffer to your vertex shader, use setVertexBuffer instead
Then, in your shader, add a parameter like this: const device float* bufferPassed [[ buffer(1) ]]
And then use it like this, inside your shader implementation:
float firstFloat = bufferPassed[0];
This will get the job done.
Not really answering your question, but you could just define an array in your metal shader instead of passing the values as a texture.
Something like:
constant float manualData[8] = { 1.0, 0.0, 0.0, 1.0,
0.0, 0.0, 1.0, 1.0 };
vertex float4 world_vertex(unsigned int vid[[vertex_id]], ...) {
int manualIndex = vid % 8;
float manualValue = manualData[manualIndex];
// something deep and meaningful here...
return float4(manualValue);
}
If you want a float texture, bytesPerRow should be 4 times the number of floats in a row, because a float has a size of 4 bytes. Metal copies the memory and doesn't care about the values. That is your task ;-)
Something like:
myTexture.replaceRegion(region, mipmapLevel: 0, withBytes: &manualTextureData, bytesPerRow: manualTextureData.count * sizeof(Float));
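The row size calculation can be written out as a small C++ helper. This is a sketch with a hypothetical function name (`bytesPerRow`) and it assumes a texture format that stores 32-bit floats per channel.

```cpp
#include <cassert>
#include <cstddef>

static_assert(sizeof(float) == 4, "32-bit float channels assumed");

// Bytes in one texture row: pixels per row, times float channels per
// pixel, times the size of one float.
constexpr std::size_t bytesPerRow(std::size_t widthInPixels,
                                  std::size_t channelsPerPixel) {
    return widthInPixels * channelsPerPixel * sizeof(float);
}
```

For the two RGBA pixels in the question this gives 2 * 4 * 4 = 32 bytes, the same value as manualTextureData.count * sizeof(Float) for the 8-element array.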
In my iOS app, written in Swift, I generate a Metal buffer with:
vertexBuffer = device.newBufferWithBytes(vertices, length: vertices.count * sizeofValue(vertices[0]), options: nil)
And bind it to my shader program with:
renderCommandEncoder.setVertexBuffer(vertexBuffer, offset: 0, atIndex: 1)
In my shader program, written in Metal shading language, can I access the size of the buffer? I would like to access the next vertex in my buffer to do some differential calculation. Something like:
vertex float4 my_vertex(const device packed_float3* vertices [[buffer(1)]],
unsigned int vid[[vertex_id]]) {
float4 vertex = vertices[vid];
// Need to clamp this to not go beyond buffer,
// but how do I know the max value of vid?
float4 nextVertex = vertices[vid + 1];
float4 tangent = nextVertex - vertex;
// ...
}
Is my only option to pass the number of vertices as a uniform?
As far as I know, no, you can't, because vertices is just a pointer to an address. Just as in C++, you need two things to know the count or size of an array:
1) know what data type of the array (float or some struct)
AND
2a) the array count for the data type OR
2b) the total bytes of the array.
So yes, you would need to pass the array count as a uniform.
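The same situation exists in plain C++: a pointer carries no length, so the count has to travel with it. Here is a minimal sketch of the clamped neighbour lookup from the question, with the count passed explicitly. `Vec3`, `nextIndexClamped` and `tangentAt` are hypothetical names, and `vertexCount` plays the role of the uniform.

```cpp
#include <cassert>
#include <cstddef>

struct Vec3 { float x, y, z; };

// Index of the following vertex, clamped so it never runs past the end.
std::size_t nextIndexClamped(std::size_t vid, std::size_t vertexCount) {
    assert(vertexCount > 0);
    return (vid + 1 < vertexCount) ? vid + 1 : vertexCount - 1;
}

// Forward difference to the next vertex. The count is passed alongside the
// pointer, just as a uniform would accompany the buffer in a shader.
Vec3 tangentAt(const Vec3* vertices, std::size_t vertexCount, std::size_t vid) {
    assert(vid < vertexCount);
    const Vec3& current = vertices[vid];
    const Vec3& next = vertices[nextIndexClamped(vid, vertexCount)];
    return Vec3{next.x - current.x, next.y - current.y, next.z - current.z};
}
```

With this clamping, the last vertex gets a zero tangent; using the backward difference for the last vertex would be an equally reasonable choice.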
For texture buffers you can.
You can get the size of a texture buffer from within the shader code.
Texture buffers have a get_width() and get_height() function, which return a uint.
uint get_width() const;
uint get_height() const;
But that probably does not answer OP's question about vertex buffers.
Actually you can. You can use the resulting value for loops or conditionals. You can't use it to initialise objects. (so dynamic arrays fail)
uint tempUint = 0; // some random type
uint uintSize = sizeof(tempUint); // get the size for the type
uint aVectorSize = sizeof(aVector) / uintSize; // divide the buffer by the type.
float dynamicArray[aVectorSize]; // this fails
for (uint counter = 0; counter < aVectorSize; ++ counter) {
// do stuff
};
if (aVectorSize > 10) {
// do more stuff
}