How to pass scalar parameter to a metal kernel function? - metal

I am new to Metal. I want to use Metal compute to do some math, so I created a kernel function (shader?), let's say
kernel void foo(device float *data1,
device float *data2,
device float *result,
int flag,
uint index [[thread_position_in_grid]])
{
if(flag==SOMETHING)
{
}...
}
Any idea how to encode a scalar value into the flag parameter with MTLComputeCommandEncoder?

You are already doing it. There isn't much difference between a void* buffer with "arbitrary" data and an int.
Just make the binding a device or constant address space reference (since it's a flag, I would assume constant is more suitable) and decorate it with the [[ buffer(n) ]] attribute for better readability (do the same for the other buffer bindings), so your new function signature is going to look like
kernel void foo(device float *data1 [[buffer(0)]],
device float *data2 [[buffer(1)]],
device float *result [[buffer(2)]],
device int& flag [[buffer(3)]],
uint index [[thread_position_in_grid]])
As for the encoder, you can use setBuffer or setBytes on your MTLComputeCommandEncoder; the easiest way to do this would be
id<MTLComputeCommandEncoder> encoder = ...
// ...
int flag = SomeFlag | SomeOtherFlag;
[encoder setBytes:&flag length:sizeof(flag) atIndex:3];

Related

Incomplete OpenCV documentation

I am trying to perform 2D convolutions with OpenCV using the HAL functions.
I understand that I can perform this by instantiating a Filter2D object by means of the function
cv::hal::Filter2D::create(uchar *kernel_data, size_t kernel_step, int kernel_type, int kernel_width, int kernel_height, int max_width, int max_height, int stype, int dtype, int borderType, double delta, int anchor_x, int anchor_y, bool isSubmatrix, bool isInplace);
then use the function
cv::hal::Filter2D::apply(...);
The create function takes 15 arguments. So far, I haven't found any documentation about them, other than the argument names and types. This is far from being sufficient.
Where can I get better information?
The only doc about hal::Filter2D I was able to find is here. It's not the exact Filter2D, but I think the brief parameter explanations might help you a bit.
/**
@brief hal_filterInit
@param context double pointer to user-defined context
@param kernel_data pointer to kernel data
@param kernel_step kernel step
@param kernel_type kernel type (CV_8U, ...)
@param kernel_width kernel width
@param kernel_height kernel height
@param max_width max possible image width, can be used to allocate working buffers
@param max_height max possible image height
@param src_type source image type
@param dst_type destination image type
@param borderType border processing mode (CV_HAL_BORDER_REFLECT, ...)
@param delta added to pixel values
@param anchor_x relative X position of center point within the kernel
@param anchor_y relative Y position of center point within the kernel
@param allowSubmatrix indicates whether the submatrices will be allowed as source image
@param allowInplace indicates whether the inplace operation will be possible
@sa cv::filter2D, cv::hal::Filter2D
*/
inline int hal_ni_filterInit(cvhalFilter2D **context, uchar *kernel_data, size_t kernel_step, int kernel_type, int kernel_width, int kernel_height, int max_width, int max_height, int src_type, int dst_type, int borderType, double delta, int anchor_x, int anchor_y, bool allowSubmatrix, bool allowInplace) { return CV_HAL_ERROR_NOT_IMPLEMENTED; }

Builtin MemCpy Chk will always overflow destination buffer

Updating my app from 32-bit to 64-bit.
According to the Apple documentation, float is only 4 bytes and I need to use CGFloat (8 bytes).
I am using the memcpy to read in bytes. I have updated all my sizeof(float)s to sizeof(CGFloat).
But when I do I get the Semantic issue
__builtin___memcpy_chk will always overflow destination buffer. Expanded from macro memcpy
I updated my NSData readDataOfLength calls to take sizeof(CGFloat) and it seems to work OK, but sometimes not all of the data that is read in is correct.
I am afraid I am over my head in this and could use some help.
-(void) readByteData:(NSFileHandle *)fHandle Size:(NSInteger)byteSize
{
[super readByteData:fHandle Size:byteSize];
NSData *data = [fHandle readDataOfLength:sizeof(CGFloat)];
float r;
memcpy(&r, [data bytes], sizeof(CGFloat));
self.radius = r;
int nCGPointSize = sizeof(CGFloat) * 2;
data = [fHandle readDataOfLength:nCGPointSize];
float xy[2];
memcpy(xy, [data bytes], nCGPointSize);
self.centerPos = ccp(xy[0], xy[1]);
data = [fHandle readDataOfLength:sizeof(CGFloat)];
float start_angle;
memcpy(&start_angle, [data bytes], sizeof(CGFloat));
self.startAngle = start_angle;
data = [fHandle readDataOfLength:sizeof(CGFloat)];
float end_angle;
memcpy(&end_angle, [data bytes], sizeof(CGFloat));
self.endAngle = end_angle;
data = [fHandle readDataOfLength:sizeof(int)];
int d;
memcpy(&d, [data bytes], sizeof(int));
self.dir = d;
flagClosed = YES;
}
This instruction:
float r;
memcpy(&r, [data bytes], sizeof(CGFloat));
Tells your compiler:
Read sizeof(CGFloat) (== 8 bytes!) from the location [data bytes]
and write them to r
But r is only 4 bytes in size! So the first 4 bytes are written to r and the next 4 bytes are written to whatever follows r in memory, and that is not allowed. memcpy is a simple byte-copy instruction: it moves any number of bytes from memory location A to memory location B; it cannot convert data types for you. If you need to convert CGFloat values to float values, you have to do that conversion yourself.
CGFloat bigR;
memcpy(&bigR, [data bytes], sizeof(bigR));
self.radius = (float)bigR;
Same when reading multiple values:
CGFloat bigXY[2];
data = [fHandle readDataOfLength:sizeof(bigXY)];
memcpy(bigXY, [data bytes], sizeof(bigXY));
self.centerPos = ccp((float)bigXY[0], (float)bigXY[1]);
The casts are only to make it more clear where the conversion takes place, most compilers will also compile the code without all the (float) casts and without complaining.
As a general rule:
memcpy(dst, src, size)
size must never be bigger than the memory src points to or the memory dst points to. In your case, size was always bigger than the memory dst pointed to.
So far the explanation of why your code didn't work. However, you actually don't need to use memcpy at all: if you have a memory block consisting of multiple values of a known data type, you can of course access that memory directly without having to copy it anywhere:
NSData * data = [fHandle readDataOfLength:sizeof(CGFloat)];
if (!data) {
// ... handle error ...
}
const CGFloat * cgfloatsInData = (const CGFloat *)[data bytes];
self.radius = (float)cgfloatsInData[0];
data = [fHandle readDataOfLength:sizeof(CGFloat) * 2];
if (!data) {
// ... handle error ...
}
cgfloatsInData = (const CGFloat *)[data bytes];
self.centerPos = ccp((float)cgfloatsInData[0], (float)cgfloatsInData[1]);
And so on. But this is highly inefficient, as you seem to always expect some fixed-size structure with no optional values, so why not read it as a structure? That way you need only one I/O access to read all of it, and only one NSData object must be created by the system.
const struct {
CGFloat radius;
CGFloat xCoord;
CGFloat yCoord;
CGFloat startAngle;
CGFloat endAngle;
int dir;
} __attribute__((packed)) * entry;
// `const` as the memory `entry` will point to will be read-only.
// `* entry` means entry is a pointer to memory of a struct
// that looks as described above. __attribute__((packed)) means
// the memory must be laid out exactly as shown above and have no
// padding for better alignment of fields.
NSData * data = [fHandle readDataOfLength:sizeof(*entry)];
// `sizeof(*entry)` means the size of the memory entry points to,
// contrary to `sizeof(entry)` which would be the size of entry itself
// and that would simply be the size of a pointer on your system, 8 bytes,
// whereas `sizeof(*entry)` will be 44 bytes.
entry = (const void *)[data bytes];
// Any pointer type can be cast to `void *`, and assigning
// a `void *` pointer to some other pointer type is always allowed by the compiler.
self.radius = (float)entry->radius;
self.centerPos = ccp((float)entry->xCoord, (float)entry->yCoord);
self.startAngle = (float)entry->startAngle;
self.endAngle = (float)entry->endAngle;
self.dir = entry->dir;

How to get memory offset?

I need to get memory offset from struct, the file is: https://github.com/BlastHackNet/mod_s0beit_sa/blob/master/src/samp.h I need to get
struct stObject : public stSAMPEntity < object_info >
{
uint8_t byteUnk0[2];
uint32_t ulUnk1;
int iModel;
uint8_t byteUnk2;
float fDrawDistance;
float fUnk;
float fPos[3];
// ...
};
the memory offset of fPos (e.g. 0x1111). I don't know how to do it. Please help me.
Take a look at the offsetof operator: http://www.cplusplus.com/reference/cstddef/offsetof/

How to turn 4 bytes into a float in objective-c from NSData

Here is an example of turning 4 bytes into a 32-bit integer in Objective-C. The function readInt grabs 4 bytes from the read function and then converts them into a single 32-bit int. Does anyone know how I would convert 4 bytes to a float? I believe it is big endian. Basically I need a readFloat function. I can never grasp these bitwise operations.
EDIT:
I forgot to mention that the original data comes from Java's DataOutputStream class. The writeFloat function, according to the Java docs:
Converts the float argument to an int using the floatToIntBits method
in class Float, and then writes that int value to the underlying
output stream as a 4-byte quantity, high byte first.
This is Objective-C trying to extract the data written by Java.
- (int32_t)read{
int8_t v;
[data getBytes:&v range:NSMakeRange(length,1)];
length++;
return ((int32_t)v & 0x0ff);
}
- (int32_t)readInt {
int32_t ch1 = [self read];
int32_t ch2 = [self read];
int32_t ch3 = [self read];
int32_t ch4 = [self read];
if ((ch1 | ch2 | ch3 | ch4) < 0){
@throw [NSException exceptionWithName:@"Exception" reason:@"EOFException" userInfo:nil];
}
return ((ch1 << 24) + (ch2 << 16) + (ch3 << 8) + (ch4 << 0));
}
OSByteOrder.h contains functions for reading, writing, and converting integer data.
You can use OSSwapBigToHostInt32() to convert a big-endian integer to the native representation, then copy the bits into a float:
NSData* data = [NSData dataWithContentsOfFile:@"/tmp/java/test.dat"];
int32_t bytes;
[data getBytes:&bytes length:sizeof(bytes)];
bytes = OSSwapBigToHostInt32(bytes);
float number;
memcpy(&number, &bytes, sizeof(bytes));
NSLog(#"Float %f", number);
[data getBytes:&myFloat range:NSMakeRange(locationWhereFloatStarts, sizeof(float))] ought to do the trick.
Given that the data comes from DataOutputStream's writeFloat() method, then that is documented to use Float.floatToIntBits() to create the integer representation. intBitsToFloat() further documents how to interpret that representation.
I'm not sure if it's the same thing, but the xdr API seems like it might handle that representation. The credits on the man page refer to Sun Microsystems standards/specifications, so it seems likely it's related to Java.
So, it may work to do something like:
// At top of file:
#include <rpc/types.h>
#include <rpc/xdr.h>
// In some function or method:
XDR xdr;
xdrmem_create(&xdr, (char*)data.bytes + offset, data.length - offset, XDR_DECODE);
float f;
if (!xdr_float(&xdr, &f))
/* handle error */;
xdr_destroy(&xdr);
If the data consists of a whole stream in eXternal Data Representation, then you would create one XDR stream for the whole task of extracting items from it, and use many xdr_...() calls between creating and destroying it to extract all of the items.

Sizeof const char* wrong value

NSString *lang = #"en";
const char* ar = [lang UTF8String];
int size_of_array = (sizeof ar) / (sizeof ar[0]);
size_of_array is equal to 4 and (sizeof ar) = 4 and sizeof ar[0] = 1.
Why? I think it (size_of_array) has to be 2.
sizeof ar will give the size of the type char *, which is a pointer and so takes 4 bytes in memory on a 32-bit system. You want the length of the string, so use the function strlen instead of sizeof ar.
It isn't clear what you are trying to do.
Your third line of code references an array "ar" that isn't declared anywhere in your post, and doesn't seem to relate to the code before it.
Also, the bit sizeof ar[0] doesn't make much sense. That will give you the size of a single element of your ar array, whatever that is. So you are taking the size of the pointer variable ar, and dividing it by the size of one element in the ar array.
Are you trying to determine the memory size of the ASCII string lang_ch?
If so, then you want
int size_of_array = strlen(lang_ch) + 1;
That will give you the length of the string you get back, including the null terminator.