How do you put a simd_float4x4 matrix into a Metal buffer in Swift? - ios

Hi, I am trying to program an app that will display simple 3D models in iOS with Xcode, and I have run into a small problem that I cannot find a solution to in Apple's documentation or in any of the forums I have looked in. I have a big array of triangle vertices in 3 dimensions that I want to transform into world space during rendering in Metal. I read in an article online that, to have the graphics processor transform the vertices during rendering, you need to put the matrix in a Metal buffer and then tell the render encoder to use that buffer with this line of code:
renderEncoder.setVertexBuffer(ROTMATRIX, offset: 0, index: 1)
where ROTMATRIX is the name of the Metal buffer that contains the model's rotation matrix. The problem is that I do not know how to put the matrix inside this buffer. I constructed a matrix for the model called MODMAT like this:
var A = simd_float4(1, 0, 0, 0)
var B = simd_float4(0, 0, 0, 0)
var C = simd_float4(0, 0, 1, 0)
var D = simd_float4(0, 0, 0, 1)
var MODMAT = float4x4([A, B, C, D])
I tried to put the matrix MODMAT in ROTMATRIX in this line of code:
ROTMATRIX.contents().copyMemory(from: MODMAT, byteCount: 64)
But the compiler in Xcode says "Cannot convert value of type 'float4x4' (aka 'simd_float4x4') to expected argument type 'UnsafeRawPointer'". So I need to provide it with an unsafe raw pointer to the matrix MODMAT. Is it possible to create this kind of pointer to a matrix in Swift, and if not, how should I modify ROTMATRIX correctly?
Best Regards Simon

contents() returns an UnsafeMutableRawPointer. You can use either storeBytes(of:toByteOffset:as:) or storeBytes(of:as:) to store a simd_float4x4 through this pointer. In fact, you can use this to store a value of any trivial type (basically, a type whose values can be copied bit for bit without any reference counting and so on).
Refer to the documentation pages for UnsafeMutableRawPointer and contents().
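As a language-neutral illustration of what such a store does to the buffer's bytes, here is a minimal Python sketch (it is not Metal or Swift code). It assumes the usual simd_float4x4 layout of four columns of four 32-bit floats, 64 bytes in total, which matches the byteCount in the question. The function store_matrix and the bytearray standing in for the buffer contents are hypothetical names used only for this example.

```python
import struct

FLOAT_SIZE = 4                      # bytes per 32-bit float
MATRIX_SIZE = 16 * FLOAT_SIZE       # 4 columns x 4 floats = 64 bytes

def store_matrix(buffer, columns, byte_offset=0):
    """Copy a 4x4 float matrix, given as four columns, into a raw byte buffer."""
    flat = [value for column in columns for value in column]
    if len(flat) != 16:
        raise ValueError("expected four columns of four floats")
    # "<16f" packs 16 little-endian 32-bit floats back to back.
    struct.pack_into("<16f", buffer, byte_offset, *flat)

# Hypothetical stand-in for the bytes behind the buffer; room for two matrices.
buffer = bytearray(2 * MATRIX_SIZE)
identity = [(1, 0, 0, 0), (0, 1, 0, 0), (0, 0, 1, 0), (0, 0, 0, 1)]
store_matrix(buffer, identity, byte_offset=MATRIX_SIZE)
```

The byte offset plays the same role as the toByteOffset argument mentioned above: it lets several matrices share one buffer.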


How to have different input and output sizes for a Drake system

I am trying to write a system that takes as input the state of a free body and gives as output the desired state of a revolute joint.
Hence the input port should take a vector of size 13 and the output port should give a vector of size 2.
For now, I just want to extract one value from the input state, so I tried something like this:
ball_state = Variable("ball_state")
desired_theta_system = builder.AddSystem(SymbolicVectorSystem(input=[ball_state], state=[], dynamics=[], output=[ball_state[6], 0]))
However, this did not work, as the ball_state variable is not subscriptable.
How can I do this? Do I need to derive LeafSystem?
You could certainly write a small LeafSystem, but you could accomplish this one with a MatrixGain system, e.g. with
D = [0, ..., 0, 1, 0, ...;
     0, ..., 0].
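To make the shape of D concrete, here is a minimal sketch in plain Python (no Drake import) that builds a 2x13 selection matrix and applies it as y = D u. The helper names selection_matrix and apply_gain are hypothetical, and index 6 is simply the element the question tried to extract.

```python
def selection_matrix(num_inputs, num_outputs, picks):
    """Build a num_outputs x num_inputs matrix with a 1 at each (row, col) in picks."""
    D = [[0.0] * num_inputs for _ in range(num_outputs)]
    for row, col in picks:
        D[row][col] = 1.0
    return D

def apply_gain(D, u):
    """Compute y = D u, the output map of a pure gain system."""
    return [sum(d * x for d, x in zip(row, u)) for row in D]

# Output 0 copies input element 6; output 1 has an all-zero row, so it stays 0.
D = selection_matrix(num_inputs=13, num_outputs=2, picks=[(0, 6)])
u = [float(i) for i in range(13)]   # made-up 13-element free-body state
y = apply_gain(D, u)                # [6.0, 0.0]
```

In Drake itself, the same D (as a NumPy array) would presumably be passed to MatrixGain and added to the builder with AddSystem, which gives a 13-element input port and a 2-element output port.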

How to create 3d mesh vertices in Gideros

I'm using Lua for the first time, and of course need to check around to learn how to implement certain code.
To create a vertex in Gideros, there's this code:
mesh:setVertex(index, x, y)
However, I would also like to use the z coordinate.
I've been checking around, but haven't found any help. Does anyone know if Gideros has a method for this, or are there any tips and tricks on setting the z coordinates?
First of all, these functions are not provided by Lua, but by the Gideros Lua API.
There are no meshes or things like that in native Lua.
Referring to the Gideros Lua API reference manual would give you some valuable hints:
Mesh can be 2D or 3D, the latter expects an additionnal Z coordinate
in its vertices.[is3d])
is3d: (boolean) Specifies that this mesh
expect Z coordinate in its vertex array and is thus a 3D mesh
So in order to create a 3d mesh you have to do something like:
local myMesh =
Although the manual does not say that you can use a z coordinate in setVertex, it is very likely that you can.
So let's have a look at Gideros source code:
int MeshBinder::setVertex(lua_State *L)
{
    Binder binder(L);
    GMesh *mesh = static_cast<GMesh*>(binder.getInstance("Mesh", 1));
    int i = luaL_checkinteger(L, 2) - 1;
    float x = luaL_checknumber(L, 3);
    float y = luaL_checknumber(L, 4);
    float z = luaL_optnumber(L, 5, 0.0);
    mesh->setVertex(i, x, y, z);
    return 0;
}
Here you can see that you can indeed provide a z coordinate and that it will be used.
local myMesh =
myMesh:setVertex(1, 100, 20, 40)
should work just fine.
You could have simply tried that btw. It's for free, it doesn't hurt and it's the best way to learn!

Transforming MPSNNImageNode using Metal Performance Shader

I am currently working on replicating YOLOv2 (not tiny) on iOS (Swift 4) using MPS.
A problem is that it is hard for me to implement the space_to_depth function and the concatenation of two results from convolutions (13x13x256 + 13x13x1024 -> 13x13x1280). Could you give me some advice on implementing these parts? My code is below.
let conv19 = MPSCNNConvolutionNode(source: conv18.resultImage,
weights: DataSource("conv19", 3, 3, 1024, 1024))
let conv20 = MPSCNNConvolutionNode(source: conv19.resultImage,
weights: DataSource("conv20", 3, 3, 1024, 1024))
let conv21 = MPSCNNConvolutionNode(source: conv13.resultImage,
weights: DataSource("conv21", 1, 1, 512, 64))
1. space_to_depth with conv21
2. concatenate the result of conv20(13x13x1024) to the result of 1 (13x13x256)
I need your help to implement this part!
I believe space_to_depth can be expressed in form of a convolution:
For instance, for an input with dimension [1,2,2,1], use 4 convolution kernels that each output one number to one channel, i.e. [[1,0],[0,0]] [[0,1],[0,0]] [[0,0],[1,0]] [[0,0],[0,1]]. This should move all input numbers from the spatial dimensions to the depth dimension.
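The idea can be checked with a minimal, framework-free Python sketch. It assumes a single input channel, a stride equal to the block size, and the sliding-window product without kernel flipping that CNN layers normally use. The names space_to_depth_kernels and strided_conv are hypothetical and this is not MPS code.

```python
def space_to_depth_kernels(block):
    """One kernel per position in a block x block window; each is 1 at that position."""
    kernels = []
    for dy in range(block):
        for dx in range(block):
            kernel = [[0] * block for _ in range(block)]
            kernel[dy][dx] = 1
            kernels.append(kernel)
    return kernels

def strided_conv(image, kernels, stride):
    """Single-channel 2D convolution without padding; returns out[y][x][channel]."""
    size = len(kernels[0])
    out_h = (len(image) - size) // stride + 1
    out_w = (len(image[0]) - size) // stride + 1
    out = []
    for oy in range(out_h):
        row = []
        for ox in range(out_w):
            pixel = []
            for kernel in kernels:
                acc = 0
                for ky in range(size):
                    for kx in range(size):
                        acc += kernel[ky][kx] * image[oy * stride + ky][ox * stride + kx]
                pixel.append(acc)
            row.append(pixel)
        out.append(row)
    return out

image = [[1, 2],
         [3, 4]]
result = strided_conv(image, space_to_depth_kernels(2), stride=2)   # [[[1, 2, 3, 4]]]
```

For a multi-channel input such as the 64-channel output of conv21, each kernel would also need to be one-hot across the input channels, giving 64 x 4 = 256 output channels.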
MPS actually has a concat node, MPSNNConcatenationNode.
You can use it like this:
concatNode = [[MPSNNConcatenationNode alloc] initWithSources:@[layerA.resultImage, layerB.resultImage]];
If you are working with the high level interface and the MPSNNGraph, you should just use a MPSNNConcatenationNode, as described by Tianyu Liu above.
If you are working with the low level interface, manhandling the MPSKernels around yourself, then this is done as follows:
1. Create a 1280 channel destination image to hold the result.
2. Run the first filter as normal to produce the first 256 channels of the result.
3. Run the second filter to produce the remaining channels, with the destinationFeatureChannelOffset set to 256.
That should be enough in all cases, except when the data is not the product of a MPSKernel. In that case, you'll need to copy it in yourself or use something like a linear neuron (a=1,b=0) to do it.
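The channel-offset approach for the low level interface can be sketched in plain Python as follows. This is only an illustration of the data movement, not MPS code: images are nested lists indexed as image[y][x][channel], the sizes are shrunk from 13x13x256 and 13x13x1024 to keep the example small, and write_channels and make_image are hypothetical helpers.

```python
def make_image(height, width, channels, fill=0.0):
    """Create a height x width x channels image filled with one value."""
    return [[[fill] * channels for _ in range(width)] for _ in range(height)]

def write_channels(destination, source, channel_offset):
    """Copy every channel of source into destination, starting at channel_offset."""
    for y, row in enumerate(source):
        for x, pixel in enumerate(row):
            destination[y][x][channel_offset:channel_offset + len(pixel)] = pixel

# Small stand-ins for the two filter outputs.
first = make_image(2, 2, 3, fill=1.0)
second = make_image(2, 2, 5, fill=2.0)

# One destination image wide enough for both results.
destination = make_image(2, 2, 3 + 5)
write_channels(destination, first, channel_offset=0)
write_channels(destination, second, channel_offset=3)   # offset = channel count of `first`
```

The second write starts at an offset equal to the channel count of the first result, which is the role destinationFeatureChannelOffset plays in the steps above.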

OpenCV Threshold Type

I have a question about OpenCV's example on Basic Thresholding as provided in the link below:
I am slowly beginning to understand the code and have tried out an example too. However I am confused about a part of the code regarding thresholding operations. How does the thresholding function know which threshold operation to use?
This is where it is called:
threshold( src_gray, dst, threshold_value, max_BINARY_value,threshold_type);
I get that the last parameter, threshold_type, is how it knows which threshold operation to use (e.g. binary, binary inverted, truncated, etc.). However, in the code, this is all that is assigned to threshold_type:
int threshold_type = 3;
It is only assigned an int value of 3. How does the threshold function know which operation to perform? Could someone explain it to me?
You should avoid passing numeric literals to OpenCV functions; use the named constants defined in the cv namespace instead. It makes no difference to the output, but it makes the code more readable. The deciphered set of inputs to cv::threshold() is:
0: THRESH_BINARY
1: THRESH_BINARY_INV
2: THRESH_TRUNC
3: THRESH_TOZERO
4: THRESH_TOZERO_INV
According to this table you are using thresholdType == THRESH_TOZERO.
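To show how a plain integer can select the operation, here is a small Python sketch that mirrors the integer values of the OpenCV constants and applies the corresponding per-pixel rule. It is an illustration of the rules, not OpenCV's implementation; the ThresholdType class and threshold_pixel function are hypothetical, and extra flags such as Otsu are left out.

```python
from enum import IntEnum

class ThresholdType(IntEnum):
    """Mirrors the integer values of OpenCV's basic threshold type constants."""
    THRESH_BINARY = 0
    THRESH_BINARY_INV = 1
    THRESH_TRUNC = 2
    THRESH_TOZERO = 3
    THRESH_TOZERO_INV = 4

def threshold_pixel(value, thresh, max_value, threshold_type):
    """Apply one thresholding rule to a single pixel value."""
    kind = ThresholdType(threshold_type)   # an int such as 3 selects the rule
    above = value > thresh
    if kind is ThresholdType.THRESH_BINARY:
        return max_value if above else 0
    if kind is ThresholdType.THRESH_BINARY_INV:
        return 0 if above else max_value
    if kind is ThresholdType.THRESH_TRUNC:
        return thresh if above else value
    if kind is ThresholdType.THRESH_TOZERO:
        return value if above else 0
    return 0 if above else value           # THRESH_TOZERO_INV

row = [10, 100, 200]
kept = [threshold_pixel(v, 100, 255, 3) for v in row]   # [0, 0, 200]
```

With type 3, pixels above the threshold keep their value and everything else becomes 0, so max_value is not used for that type.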

Applying a hamming window before FFT

I am currently detecting frequencies using an FFT. I am aware that I need to apply a window before doing the FFT, but I am unsure how to do this.
What exactly should be done to apply a window?
I create the window using
float * hammingWindow = (float *) malloc(sizeof(float) * numberOfFrames);
vDSP_hamm_window(hammingWindow, n, 0);
but I am not sure how to proceed from here.
When I call vDSP_vmul with my arguments
vDSP_vmul((COMPLEX*)outputBuffer, 1, hammingWindow, 1, (COMPLEX*)outputBuffer, 1, n);
I get an error that vDSP_vmul does not exist even though I am calling other vDSP methods.
I call this after my FFT function
vDSP_zvmags((COMPLEX *)outputBuffer, 1, (COMPLEX *)outputBuffer, 1, bufferCapacity);
and I get the same issue: No matching function for call to vDSP_zvmags.
What am I doing wrong? Are my arguments incorrect? It looks like (COMPLEX *)outputBuffer should not be passed in for two of the arguments.
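For reference, windowing is an element-wise multiplication of the real time-domain samples by the window, done before the FFT rather than on its complex output. A minimal Python sketch of that step follows (it is not vDSP code). The formula uses the window length N as the denominator, which I believe matches what vDSP_hamm_window produces for a full window, but treat that as an assumption; hamming_window and apply_window are hypothetical names.

```python
import math

def hamming_window(length):
    """w[n] = 0.54 - 0.46 * cos(2*pi*n / length) for n = 0 .. length-1."""
    return [0.54 - 0.46 * math.cos(2.0 * math.pi * n / length) for n in range(length)]

def apply_window(samples, window):
    """Element-wise product of real samples and window values."""
    if len(samples) != len(window):
        raise ValueError("samples and window must have the same length")
    return [s * w for s, w in zip(samples, window)]

samples = [1.0] * 8                       # made-up real input frame
windowed = apply_window(samples, hamming_window(8))
# `windowed` is what would be handed to the FFT in place of `samples`.
```

In vDSP terms, this corresponds to calling vDSP_vmul on two real float arrays, so the buffers passed to it would be plain float pointers rather than COMPLEX pointers.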
