Passing a mutable MTLBuffer between two Metal compute encoders (Swift / Metal)

In some Objective-C classes I have a buffer that is used by two sequential Metal pipelines. One kernel processes values based on the RGB luminance of the picture and writes to the mutable buffer, and the next draws a waveform graph based on the data from the first. It's based on the macOS shader described here: USING METAL PERFORMANCE SHADERS WITH CORE IMAGE.
I managed to convert this from macOS to iOS, which is working great, but I am now trying to convert it to Swift to make it more flexible for me in the future. I have other filters I'd like to write, and the translation process is also teaching me a lot about how Metal pipelines work.
This is where it breaks down. The graph it draws has one horizontal coloured line fluttering up and down. Close, but no cigar. I wondered if it had something to do with how the buffers were created in Swift vs Objective-C. Here's an example of the original:
size_t columnBufSize = sizeof(UInt64)*inputTexture.width*inputTexture.height;
id<MTLBuffer> columnDataRed = [kDevice newBufferWithLength:columnBufSize options:0];
but when I translate this to Swift it's not behaving as expected.
let columnBufSize: size_t = MemoryLayout<UInt64>.size * inputTexture!.width * inputTexture!.height
let columnDataRed: MTLBuffer = kDevice.makeBuffer( length: columnBufSize, options: .storageModeShared)!
It looks like only the last value is passed into the second computeEncoder.
What am I missing?
For completeness, here is the full code in Objective-C:
+ (BOOL)processWithInputs:(NSArray<id<CIImageProcessorInput>> *)inputs arguments:(NSDictionary<NSString *,id> *)arguments output:(id<CIImageProcessorOutput>)output error:(NSError * _Nullable *)error
{
id<MTLComputePipelineState> renderComputerState = kParadeComputePipelineState;
id<CIImageProcessorInput> input = inputs.firstObject;
id<MTLCommandBuffer> commandBuffer = output.metalCommandBuffer;
commandBuffer.label = @"com.martinhering.WaveformKernel";
id<MTLTexture> inputTexture = input.metalTexture;
id<MTLTexture> outputTexture = output.metalTexture;
MTLSize threadgroupCount = MTLSizeMake(inputTexture.width, inputTexture.height, 1);
MTLSize _threadgroupSize = MTLSizeMake(16, 16, 1);
threadgroupCount.width = (inputTexture.width + _threadgroupSize.width - 1) / _threadgroupSize.width;
threadgroupCount.height = (inputTexture.height + _threadgroupSize.height - 1) / _threadgroupSize.height;
size_t columnBufSize = sizeof(UInt64)*inputTexture.width*inputTexture.height;
id<MTLBuffer> columnDataRed = [kDevice newBufferWithLength:columnBufSize options:0];
id<MTLBuffer> columnDataGreen = [kDevice newBufferWithLength:columnBufSize options:0];
id<MTLBuffer> columnDataBlue = [kDevice newBufferWithLength:columnBufSize options:0];
id<MTLComputeCommandEncoder> computeEncoder;
computeEncoder = [commandBuffer computeCommandEncoder];
[computeEncoder setComputePipelineState:kWaveformComputePipelineState];
[computeEncoder setTexture:inputTexture atIndex:0];
[computeEncoder setBuffer:columnDataRed offset:0 atIndex:0];
[computeEncoder setBuffer:columnDataGreen offset:0 atIndex:1];
[computeEncoder setBuffer:columnDataBlue offset:0 atIndex:2];
[computeEncoder setSamplerState:kSamplerState atIndex:0];
[computeEncoder dispatchThreadgroups:threadgroupCount
threadsPerThreadgroup:_threadgroupSize];
[computeEncoder endEncoding];
computeEncoder = [commandBuffer computeCommandEncoder];
[computeEncoder setComputePipelineState:renderComputerState];
[computeEncoder setTexture:inputTexture atIndex:0];
[computeEncoder setTexture:outputTexture atIndex:1];
[computeEncoder setBuffer:columnDataRed offset:0 atIndex:0];
[computeEncoder setBuffer:columnDataGreen offset:0 atIndex:1];
[computeEncoder setBuffer:columnDataBlue offset:0 atIndex:2];
[computeEncoder setSamplerState:kSamplerState atIndex:0];
[computeEncoder dispatchThreadgroups:threadgroupCount
threadsPerThreadgroup:_threadgroupSize];
[computeEncoder endEncoding];
return YES;
}
Here is my translation
override class func process(with inputs: [CIImageProcessorInput]?, arguments: [String : Any]?, output: CIImageProcessorOutput) throws {
guard
let kDevice = device,
let commandBuffer = output.metalCommandBuffer,
let input = inputs?.first,
let defaultLibrary: MTLLibrary = kDevice.makeDefaultLibrary()
else {
return
}
let samplerDescriptor = MTLSamplerDescriptor()
let kSamplerState = kDevice.makeSamplerState(descriptor: samplerDescriptor)
samplerDescriptor.sAddressMode = .clampToEdge
samplerDescriptor.tAddressMode = .clampToEdge
samplerDescriptor.minFilter = .nearest
samplerDescriptor.magFilter = .nearest
samplerDescriptor.normalizedCoordinates = false
var kWaveformComputePipelineState: MTLComputePipelineState?
var kParadeComputePipelineState: MTLComputePipelineState?
if let aFunction = defaultLibrary.makeFunction(name: "scope_waveform_compute") {
kWaveformComputePipelineState = try? kDevice.makeComputePipelineState(function: aFunction)
}
if let aFunction = defaultLibrary.makeFunction(name: "scope_waveform_parade") {
kParadeComputePipelineState = try? kDevice.makeComputePipelineState(function: aFunction)
}
commandBuffer.label = "com.martinhering.WaveformKernel"
weak var inputTexture: MTLTexture? = input.metalTexture
weak var outputTexture: MTLTexture? = output.metalTexture
var threadgroupCount: MTLSize = MTLSizeMake(inputTexture!.width, inputTexture!.height, 1)
let threadgroupSize: MTLSize = MTLSizeMake(16, 16, 1)
threadgroupCount.width = (inputTexture!.width + threadgroupSize.width - 1) / threadgroupSize.width
threadgroupCount.height = (inputTexture!.height + threadgroupSize.height - 1) / threadgroupSize.height
let columnBufSize: size_t = MemoryLayout<UInt64>.size * inputTexture!.width * inputTexture!.height
let columnDataRed: MTLBuffer = kDevice.makeBuffer( length: columnBufSize, options: .storageModeShared)!
let columnDataGreen: MTLBuffer = kDevice.makeBuffer( length: columnBufSize, options: .storageModeShared)!
let columnDataBlue: MTLBuffer = kDevice.makeBuffer( length: columnBufSize, options: .storageModeShared)!
weak var computeEncoder: MTLComputeCommandEncoder?
computeEncoder = commandBuffer.makeComputeCommandEncoder()
computeEncoder?.setComputePipelineState(kWaveformComputePipelineState!)
computeEncoder?.setTexture(inputTexture, index: 0)
computeEncoder?.setBuffer(columnDataRed, offset: 0, index: 0)
computeEncoder?.setBuffer(columnDataGreen, offset: 0, index: 1)
computeEncoder?.setBuffer(columnDataBlue, offset: 0, index: 2)
computeEncoder?.setSamplerState(kSamplerState, index: 0)
computeEncoder?.dispatchThreadgroups(threadgroupCount, threadsPerThreadgroup: threadgroupSize)
computeEncoder?.endEncoding()
computeEncoder = commandBuffer.makeComputeCommandEncoder()
computeEncoder?.setComputePipelineState(kParadeComputePipelineState!)
computeEncoder?.setTexture(inputTexture, index: 0)
computeEncoder?.setTexture(outputTexture, index: 1)
computeEncoder?.setBuffer(columnDataRed, offset: 0, index: 0)
computeEncoder?.setBuffer(columnDataGreen, offset: 0, index: 1)
computeEncoder?.setBuffer(columnDataBlue, offset: 0, index: 2)
computeEncoder?.setSamplerState(kSamplerState, index: 0)
computeEncoder?.dispatchThreadgroups(threadgroupCount, threadsPerThreadgroup: threadgroupSize)
computeEncoder?.endEncoding()
// return true
}

So thanks to Ken Thomases for this; it was simple in the end.
You have to set the properties of the sampler descriptor before you create the sampler state.
let samplerDescriptor = MTLSamplerDescriptor()
samplerDescriptor.normalizedCoordinates = false
let kSamplerState = kDevice.makeSamplerState(descriptor: samplerDescriptor)
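For completeness, here is the same ordering applied to the rest of the descriptor setup from the translation above; makeSamplerState captures the descriptor's values at creation time, so properties set afterwards have no effect. A minimal sketch:
let samplerDescriptor = MTLSamplerDescriptor()
samplerDescriptor.sAddressMode = .clampToEdge
samplerDescriptor.tAddressMode = .clampToEdge
samplerDescriptor.minFilter = .nearest
samplerDescriptor.magFilter = .nearest
samplerDescriptor.normalizedCoordinates = false
// Only now create the state; it snapshots the descriptor's configuration.
let kSamplerState = kDevice.makeSamplerState(descriptor: samplerDescriptor)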

Related

Objective-C to Swift conversion for unions

Hi, I am trying to convert the Objective-C code below into Swift, but I am struggling to convert unions, which are supported in C but not directly in Swift.
I am not sure how I can convert the union type below and pass it to MTLTexture getBytes.
union {
float f[2];
unsigned char bytes[8];
} u;
There is also the last part, where I want to print these float values with a log statement.
It would be great if I could get a working Swift conversion for the code snippet below.
id<MTLDevice> device = MTLCreateSystemDefaultDevice();
id<MTLCommandQueue> queue = [device newCommandQueue];
id<MTLCommandBuffer> commandBuffer = [queue commandBuffer];
MTKTextureLoader *textureLoader = [[MTKTextureLoader alloc] initWithDevice:device];
id<MTLTexture> sourceTexture = [textureLoader newTextureWithCGImage:image.CGImage options:nil error:nil];
CGColorSpaceRef srcColorSpace = CGColorSpaceCreateDeviceRGB();
CGColorSpaceRef dstColorSpace = CGColorSpaceCreateDeviceGray();
CGColorConversionInfoRef conversionInfo = CGColorConversionInfoCreate(srcColorSpace, dstColorSpace);
MPSImageConversion *conversion = [[MPSImageConversion alloc] initWithDevice:device
srcAlpha:MPSAlphaTypeAlphaIsOne
destAlpha:MPSAlphaTypeAlphaIsOne
backgroundColor:nil
conversionInfo:conversionInfo];
MTLTextureDescriptor *grayTextureDescriptor = [MTLTextureDescriptor texture2DDescriptorWithPixelFormat:MTLPixelFormatR16Unorm
width:sourceTexture.width
height:sourceTexture.height
mipmapped:false];
grayTextureDescriptor.usage = MTLTextureUsageShaderWrite | MTLTextureUsageShaderRead;
id<MTLTexture> grayTexture = [device newTextureWithDescriptor:grayTextureDescriptor];
[conversion encodeToCommandBuffer:commandBuffer sourceTexture:sourceTexture destinationTexture:grayTexture];
MTLTextureDescriptor *textureDescriptor = [MTLTextureDescriptor texture2DDescriptorWithPixelFormat:grayTexture.pixelFormat
width:sourceTexture.width
height:sourceTexture.height
mipmapped:false];
textureDescriptor.usage = MTLTextureUsageShaderWrite | MTLTextureUsageShaderRead;
id<MTLTexture> texture = [device newTextureWithDescriptor:textureDescriptor];
MPSImageLaplacian *imageKernel = [[MPSImageLaplacian alloc] initWithDevice:device];
[imageKernel encodeToCommandBuffer:commandBuffer sourceTexture:grayTexture destinationTexture:texture];
MPSImageStatisticsMeanAndVariance *meanAndVariance = [[MPSImageStatisticsMeanAndVariance alloc] initWithDevice:device];
MTLTextureDescriptor *varianceTextureDescriptor = [MTLTextureDescriptor
texture2DDescriptorWithPixelFormat:MTLPixelFormatR32Float
width:2
height:1
mipmapped:NO];
varianceTextureDescriptor.usage = MTLTextureUsageShaderWrite;
id<MTLTexture> varianceTexture = [device newTextureWithDescriptor:varianceTextureDescriptor];
[meanAndVariance encodeToCommandBuffer:commandBuffer sourceTexture:texture destinationTexture:varianceTexture];
[commandBuffer commit];
[commandBuffer waitUntilCompleted];
union {
float f[2];
unsigned char bytes[8];
} u;
MTLRegion region = MTLRegionMake2D(0, 0, 2, 1);
[varianceTexture getBytes:u.bytes bytesPerRow:2 * 4 fromRegion:region mipmapLevel: 0];
NSLog(#"mean: %f", u.f[0] * 255);
NSLog(#"variance: %f", u.f[1] * 255 * 255);
It would be great if I could get a Swift representation of this.
You can use a struct instead, like this, and add an extension to provide a description for logging.
struct u {
var bytes: [UInt8] = [0,0,0,0, 0,0,0,0]
var f: [Float32] {
set {
var f = newValue
memcpy(&bytes, &f, 8)
}
get {
var f: [Float32] = [0,0]
var b = bytes
memcpy(&f, &b, 8)
return Array(f)
}
}
}
extension u: CustomStringConvertible {
var description: String {
let bytesString = (bytes.map{ "\($0)"}).joined(separator: " ")
return "floats : \(f[0]) \(f[1]) - bytes : \(bytesString)"
}
}
var test = u()
print(test)
test.f = [3.14, 1.618]
print(test)
test.bytes = [195, 245, 72, 64, 160, 26, 207, 63]
print(test)
Log:
floats : 0.0 0.0 - bytes : 0 0 0 0 0 0 0 0
floats : 3.14 1.618 - bytes : 195 245 72 64 160 26 207 63
floats : 3.14 1.618 - bytes : 195 245 72 64 160 26 207 63
You don't need the whole union for getBytes to work; only u.bytes is used there, which can be converted as
var bytes = [UInt8](repeating: 0, count: 8)
that's an array of length 8 (with an arbitrary initial value of 0 in each element), and you pass it to getBytes as an UnsafeMutableRawPointer:
varianceTexture.getBytes(&bytes, ...)
As for the union, there are many ways to represent it. For example:
var u = ([Float](repeating: 0.0, count: 2), [UInt8](repeating: 0, count: 8))
And in that case you pass it as
varianceTexture.getBytes(&u.1, ...)
Or you could make it a class or struct in a similar way.
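To tie the pieces together, here is a minimal sketch of what the Swift read-back could look like, assuming a varianceTexture equivalent to the 2x1 R32Float texture from the Objective-C code above; the raw bytes are reinterpreted as two Float32 values rather than going through a union:
var bytes = [UInt8](repeating: 0, count: 8)
let region = MTLRegionMake2D(0, 0, 2, 1)
varianceTexture.getBytes(&bytes, bytesPerRow: 2 * 4, from: region, mipmapLevel: 0)
// Reinterpret the 8 raw bytes as two Float32 values (mean, variance).
let floats = bytes.withUnsafeBytes { Array($0.bindMemory(to: Float32.self)) }
print("mean: \(floats[0] * 255)")
print("variance: \(floats[1] * 255 * 255)")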

distorted cv::Mat converted from CMSampleBuffer of video frame

I use AVAssetReader/AVAssetReaderTrackOutput to get CMSampleBuffers from a video. But when I convert a CMSampleBuffer to a cv::Mat, the Mat is a distorted image.
Video decode code:
@objc open func startReading() -> Void {
if let reader = try? AVAssetReader.init(asset: _asset){
let videoTrack = _asset.tracks(withMediaType: .video).compactMap{ $0 }.first;
let options = [kCVPixelBufferPixelFormatTypeKey : Int(kCVPixelFormatType_32BGRA)]
let readerOutput = AVAssetReaderTrackOutput.init(track: videoTrack!, outputSettings: options as [String : Any])
reader.add(readerOutput)
reader.startReading()
var count = 0
//reading
while (reader.status == .reading && videoTrack?.nominalFrameRate != 0){
let sampleBuffer = readerOutput.copyNextSampleBuffer()
_delegate?.reader(self, newFrameReady: sampleBuffer, count)
count = count+1;
}
_delegate?.readerDidFinished(self,totalFrameCount: count)
}
}
Image conversion code:
//convert sampleBuffer in callback of video reader
CVImageBufferRef imageBuffer = CMSampleBufferGetImageBuffer(sampleBuffer);
size_t width = CVPixelBufferGetWidth(imageBuffer);
size_t height = CVPixelBufferGetHeight(imageBuffer);
CVPixelBufferLockBaseAddress(imageBuffer, kCVPixelBufferLock_ReadOnly);
char *baseBuffer = (char*)CVPixelBufferGetBaseAddress(imageBuffer);
cv::Mat cvImage = cv::Mat((int)height,(int)width,CV_8UC3);
cv::MatIterator_<cv::Vec3b> it_start = cvImage.begin<cv::Vec3b>();
cv::MatIterator_<cv::Vec3b> it_end = cvImage.end<cv::Vec3b>();
long cur = 0;
while (it_start != it_end) {
//opt pixel
long p_idx = cur*4;
char b = baseBuffer[p_idx];
char g = baseBuffer[p_idx + 1];
char r = baseBuffer[p_idx + 2];
cv::Vec3b newpixel(b,g,r);
*it_start = newpixel;
cur++;
it_start++;
}
UIImage *tmpImg = MatToUIImage(cvImage);
preview of tmpImg:
I find that some videos work fine but some do not. Any help is appreciated!
Finally I figured out that this bug is caused by the padding bytes of the sampleBuffer.
Many APIs pad extra bytes after each image row to optimize the memory layout for SIMD, which processes pixels in parallel.
The code below works.
cv::Mat cvImage = cv::Mat((int)height,(int)width,CV_8UC3);
cv::MatIterator_<cv::Vec3b> it_start = cvImage.begin<cv::Vec3b>();
cv::MatIterator_<cv::Vec3b> it_end = cvImage.end<cv::Vec3b>();
long cur = 0;
//Padding bytes added behind image row bytes
size_t padding = CVPixelBufferGetBytesPerRow(imageBuffer) - width*4;
size_t offset = 0;
while (it_start != it_end) {
//opt pixel
long p_idx = cur*4 + offset;
char b = baseBuffer[p_idx];
char g = baseBuffer[p_idx + 1];
char r = baseBuffer[p_idx + 2];
cv::Vec3b newpixel(b,g,r);
*it_start = newpixel;
cur++;
it_start++;
if (cur%width == 0) {
offset = offset + padding;
}
}
UIImage *tmpImg = MatToUIImage(cvImage);
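The same check can be done on the Swift side before converting. A small hypothetical helper (not part of the original post) that reports how many padding bytes each BGRA row carries:
func rowPadding(of imageBuffer: CVPixelBuffer) -> Int {
    let width = CVPixelBufferGetWidth(imageBuffer)
    let bytesPerRow = CVPixelBufferGetBytesPerRow(imageBuffer)
    // For kCVPixelFormatType_32BGRA each pixel is 4 bytes; anything beyond that is padding.
    return bytesPerRow - width * 4
}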

Swift Metal Shader Execution of the command buffer was aborted due to an error during execution when adding a number to an array value

I'm trying a very simple algorithm using Metal GPU acceleration to calculate some values in an array. The shader throws an error under some conditions, which I will explain.
Error: Execution of the command buffer was aborted due to an error during execution. Ignored (for causing prior/excessive GPU errors) (IOAF code 4)
The shader only throws this error when adding a value to the existing value at an index of the array. Example:
This will not cause an error:
kernel void shader (device int *wPointsIntensity [[buffer(0)]],
const device uint *wPointsXCoord [[buffer(1)]],
const device uint *wPointsYCoord [[buffer(2)]],
device float *pixelSignalIntensity [[buffer(3)]],
device float *pixelDistance [[buffer(4)]],
const device uint& noOfPoints [[ buffer(5) ]],
const device uint& width [[ buffer(6) ]],
const device uint& height [[ buffer(7) ]],
uint id [[ thread_position_in_grid ]]) {
//this does not throw error
for (uint wpIndex = 0; wpIndex < noOfPoints; wpIndex++) {
for (uint heightIndex = 0; heightIndex < height; heightIndex++) {
for (uint widthIndex = 0; widthIndex < width; widthIndex++) {
uint pixelIndex = heightIndex * width + widthIndex;
pixelDistance[pixelIndex] = float(pixelIndex);
pixelSignalIntensity[pixelIndex] = float(pixelIndex);
}}}}
But if you change
pixelDistance[pixelIndex] = float(pixelIndex);
to
pixelDistance[pixelIndex] += float(pixelIndex);
it will throw an error.
Here is the Swift code:
var wPointsValues = [Int32](repeating:0, count: wPoints.count)
var wPointsXLocations = [Int32](repeating:0, count: wPoints.count)
var wPointsYLocations = [Int32](repeating:0, count: wPoints.count)
for i in 0..<wPoints.count {
wPointsValues[i] = Int32(wPoints[i].signalIntensity)
wPointsXLocations[i] = Int32(wPoints[i].location.x)
wPointsYLocations[i] = Int32(wPoints[i].location.y)
}
var numberOfWPoints:Int32 = Int32(wPoints.count)
var int32Width = Int32(width)
var int32Height = Int32(height)
//output arrays
let numberOfResults = wPoints.count * Int(width) * Int(height)
var wPointsSignalIntensity = [Float32](repeating:0.0, count: numberOfResults)
var wPointsDistance = [Float32](repeating:0.0, count: numberOfResults)
//local variables
var signalDensity:[Float32] = [Float32](repeating:0.0, count: numberOfResults)
var signalDistance:[Float32] = [Float32](repeating:0.0, count: numberOfResults)
//create input buffers
let inWPointSignalValues = device.makeBuffer(bytes: wPointsValues, length: (MemoryLayout<Int32>.stride * wPoints.count), options: [])
let inWPointXCoordBuffer = device.makeBuffer(bytes: wPointsXLocations, length: (MemoryLayout<Int32>.stride * wPoints.count), options: [])
let inWPointYCoordBuffer = device.makeBuffer(bytes: wPointsYLocations, length: (MemoryLayout<Int32>.stride * wPoints.count), options: [])
//create output buffers
let outPixelSignalIntensityBuffer = device.makeBuffer(bytes: wPointsSignalIntensity, length: (MemoryLayout<Float32>.stride * numberOfResults), options: [])
let outPixelDistanceBuffer = device.makeBuffer(bytes: wPointsDistance, length: (MemoryLayout<Float32>.stride * numberOfResults), options: [])
let commandBuffer = (mtlCommmandQueue?.makeCommandBuffer())!
let computeCommandEncoder = (commandBuffer.makeComputeCommandEncoder())!
computeCommandEncoder.setComputePipelineState(mtlComputePipelineFilter!)
//set input buffers
computeCommandEncoder.setBuffer(inWPointSignalValues, offset: 0, index: 0)
computeCommandEncoder.setBuffer(inWPointXCoordBuffer, offset: 0, index: 1)
computeCommandEncoder.setBuffer(inWPointYCoordBuffer, offset: 0, index: 2)
//set output buffers
computeCommandEncoder.setBuffer(outPixelSignalIntensityBuffer, offset: 0, index: 3)
computeCommandEncoder.setBuffer(outPixelDistanceBuffer, offset: 0, index: 4)
//set constants
computeCommandEncoder.setBytes(&numberOfWPoints, length: MemoryLayout<Int32>.stride, index: 5)
computeCommandEncoder.setBytes(&int32Width, length: MemoryLayout<Int32>.stride, index: 6)
computeCommandEncoder.setBytes(&int32Height, length: MemoryLayout<Int32>.stride, index: 7)
let threadsPerGroup = MTLSize(width:2,height:2,depth:2)
let numThreadgroups = MTLSize(width:2, height:2, depth:2)
computeCommandEncoder.dispatchThreadgroups(numThreadgroups, threadsPerThreadgroup: threadsPerGroup)
let endBufferAllocation = mach_absolute_time()
print("time for creating and setting buffert: time: \(Double(endBufferAllocation - start) / Double(NSEC_PER_SEC))")
computeCommandEncoder.endEncoding()
commandBuffer.commit()
commandBuffer.waitUntilCompleted()
let allComplete = mach_absolute_time()
self.signalDistance = (outPixelDistanceBuffer?.contents())!
self.signalDensity = (outPixelSignalIntensityBuffer?.contents())!
I had this issue for ages and the program crashed intermittently. It turned out that I was accessing memory in the kernel that had not been allocated by the buffer. In the kernel I was doing a for loop over 0..<5 (i.e. outputting 5 values for each thread) but had not divided the number of threads by 5.
When it didn't crash it gave the correct answer, and no errors were ever thrown except "Execution of the command buffer was aborted due to an error during execution. Caused GPU Hang Error (IOAF code 3)".
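A rough sketch of sizing the dispatch from the output length instead of the fixed 2x2x2 sizes above, so no thread indexes past the end of the buffers (valuesPerThread is hypothetical, standing in for the 0..<5 loop mentioned in the answer):
let valuesPerThread = 5   // hypothetical: each thread writes 5 consecutive values
let threadCount = (numberOfResults + valuesPerThread - 1) / valuesPerThread
let threadsPerGroup = MTLSize(width: 64, height: 1, depth: 1)
let numThreadgroups = MTLSize(width: (threadCount + 63) / 64, height: 1, depth: 1)
computeCommandEncoder.dispatchThreadgroups(numThreadgroups, threadsPerThreadgroup: threadsPerGroup)
// The kernel should still bounds-check its index, e.g. if (id >= uint(threadCount)) return;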

Swift 3 CGContext Memory Leak

I'm using a CGBitmapContext to convert colour spaces to ARGB and get the pixel data values. I malloc space for the bitmap context and free it after I'm done, but I am still seeing a memory leak in Instruments. I'm thinking I'm likely doing something wrong, so any help would be appreciated.
Here is the ARGBBitmapContext function
func createARGBBitmapContext(width: Int, height: Int) -> CGContext {
var bitmapByteCount = 0
var bitmapBytesPerRow = 0
//Get image width, height
let pixelsWide = width
let pixelsHigh = height
bitmapBytesPerRow = Int(pixelsWide) * 4
bitmapByteCount = bitmapBytesPerRow * Int(pixelsHigh)
let colorSpace = CGColorSpaceCreateDeviceRGB()
// Here is the malloc call that Instruments complains of
let bitmapData = malloc(bitmapByteCount)
let context = CGContext(data: bitmapData, width: pixelsWide, height: pixelsHigh, bitsPerComponent: 8, bytesPerRow: bitmapBytesPerRow, space: colorSpace, bitmapInfo: CGImageAlphaInfo.premultipliedFirst.rawValue)
// Do I need to free something here first?
return context!
}
Here is where I use the context to retrieve all the pixel values as a list of UInt8s (and where the memory leaks)
extension UIImage {
func ARGBPixelValues() -> [UInt8] {
let width = Int(self.size.width)
let height = Int(self.size.height)
var pixels = [UInt8](repeatElement(0, count: width * height * 3))
let rect = CGRect(x: 0, y: 0, width: width, height: height)
let context = createARGBBitmapContext(width: width, height: height)
context.clear(rect)
context.draw(self.cgImage!, in: rect)
var location = 0
if let data = context.data {
while location < (width * height) {
let arrOffset = 3 * location
let offset = 4 * (location)
let R = data.load(fromByteOffset: offset + 1, as: UInt8.self)
let G = data.load(fromByteOffset: offset + 2, as: UInt8.self)
let B = data.load(fromByteOffset: offset + 3, as: UInt8.self)
pixels[arrOffset] = R
pixels[arrOffset+1] = G
pixels[arrOffset+2] = B
location += 1
}
free(context.data) // Free the data consumed, perhaps this isn't right?
}
return pixels
}
}
Instruments reports a leaked malloc of 1.48 MiB, which is right for my image size (540 x 720). I free the data, but apparently that is not right.
I should mention that I know you can pass nil to the CGContext init (and it will manage the memory), but I'm more curious why using malloc creates an issue. Is there something more I should know? (I'm more familiar with Obj-C.)
Because CoreGraphics is not handled by ARC (like all other C libraries), you need to wrap your code with an autoreleasepool, even in Swift. Particularly if you are not on the main thread (which you should not be, if CoreGraphics is involved... .userInitiated or lower is appropriate).
func myFunc() {
for _ in 0 ..< makeMoneyFast {
autoreleasepool {
// Create CGImageRef etc...
// Do Stuff... whir... whiz... PROFIT!
}
}
}
For those that care, your Objective-C should also be wrapped like:
BOOL result = NO;
NSMutableData* data = [[NSMutableData alloc] init];
@autoreleasepool {
CGImageRef image = [self CGImageWithResolution:dpi
hasAlpha:hasAlpha
relativeScale:scale];
NSAssert(image != nil, @"could not create image for TIFF export");
if (image == nil)
return nil;
CGImageDestinationRef destRef = CGImageDestinationCreateWithData((CFMutableDataRef)data, kUTTypeTIFF, 1, NULL);
CGImageDestinationAddImage(destRef, image, (CFDictionaryRef)options);
result = CGImageDestinationFinalize(destRef);
CFRelease(destRef);
}
if (result) {
return [data copy];
} else {
return nil;
}
See this answer for details.
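Applied to the question's code, the call site could be wrapped the same way (a sketch, assuming the ARGBPixelValues() extension above and a UIImage named image):
let pixels: [UInt8] = autoreleasepool {
    // Temporary CoreGraphics objects created while drawing are drained when the pool exits.
    image.ARGBPixelValues()
}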

metal causes segment fault: 11

A segmentation fault occurs occasionally. I have no idea if it is caused by the [MTLComputeCommandEncoder setComputePipelineState:] method. The related code is listed below.
void metal_func( void *metal_context, int16_t *dst, uint8_t *_src, int _srcstride, int height, int mx, int my, int width)
{
int x, y;
pixel *src = (pixel*)_src;
int srcstride = _srcstride / sizeof(pixel);
int16_t dst_buf[4900];
int16_t out_buf[4900];
int16_t *pdst = dst_buf;
int16_t *pout = out_buf;
uint8_t local_src[4900];
MetalContext *mc = metal_context;
memset( out_buf, 0, sizeof(int16_t)*4900 );
int dst_size = sizeof(int16_t)*4900;
int src_size = sizeof(uint8_t)*4900;
id<MTLDevice> device = mc->metal_device;
id<MTLCommandQueue> commandQueue = mc->metal_commandqueue;
id<MTLComputePipelineState> cpipeline = mc->metal_cps_v;
// Buffer for storing encoded commands that are sent to GPU
id<MTLCommandBuffer> commandBuffer = [commandQueue commandBuffer];
id<MTLBuffer> dst_buffer;
id<MTLBuffer> src_buffer;
id<MTLBuffer> stride_buffer;
id<MTLBuffer> mx_buffer;
id<MTLBuffer> my_buffer;
id<MTLBuffer> depth_buffer;
id <MTLComputeCommandEncoder> computeCommandEncoder;
MTLSize ts= {1, 1, 1};
MTLSize numThreadgroups = {70*height, 1, 1};
int m_x = mx;
int m_y = my;
int s = _srcstride / sizeof(pixel);
int i_size = sizeof(int);
int dpt = BIT_DEPTH;
memset( pdst, 0, 4900*sizeof(int16_t));
//copy data to the local_src_buffer
uint8_t *pcsrc = _src - 3*s;
uint8_t *pcdst = local_src;
memset( local_src, 0, sizeof(uint8_t)*4900 );
for( int i = 0; i < height+7; i++ )
{
memcpy( pcdst, pcsrc, sizeof(uint8_t)*width);
pcsrc += s;
pcdst += 70;
}
int local_src_stride = 70;
computeCommandEncoder = [commandBuffer computeCommandEncoder];
//set kernel function parameters
dst_buffer = [device newBufferWithBytes: pdst length: dst_size options: MTLResourceOptionCPUCacheModeDefault ];
[computeCommandEncoder setBuffer: dst_buffer offset: 0 atIndex: 0 ];
src_buffer = [device newBufferWithBytes: local_src length: src_size options: MTLResourceOptionCPUCacheModeDefault ];
[computeCommandEncoder setBuffer: src_buffer offset: 0 atIndex: 1 ];
stride_buffer = [device newBufferWithBytes: &local_src_stride length: i_size options: MTLResourceOptionCPUCacheModeDefault ];
[computeCommandEncoder setBuffer: stride_buffer offset: 0 atIndex: 2 ];
mx_buffer = [device newBufferWithBytes: &m_x length: i_size options: MTLResourceOptionCPUCacheModeDefault ];
[computeCommandEncoder setBuffer: mx_buffer offset: 0 atIndex: 3 ];
my_buffer = [device newBufferWithBytes: &m_y length: i_size options: MTLResourceOptionCPUCacheModeDefault ];
[computeCommandEncoder setBuffer: my_buffer offset: 0 atIndex: 4 ];
depth_buffer = [device newBufferWithBytes: &dpt length: i_size options: MTLResourceOptionCPUCacheModeDefault ];
[computeCommandEncoder setBuffer: depth_buffer offset: 0 atIndex: 5 ];
[computeCommandEncoder setComputePipelineState:cpipeline ];
//occasionally, a segmentation fault was reported just after the above commands
[computeCommandEncoder dispatchThreadgroups:numThreadgroups threadsPerThreadgroup:ts];
[computeCommandEncoder endEncoding ];
[ commandBuffer commit];
[ commandBuffer waitUntilCompleted];
//get the data computed by GPU
NSData* outdata = [NSData dataWithBytesNoCopy:[dst_buffer contents] length: dst_size freeWhenDone:false ];
[outdata getBytes:pout length:dst_size];
[dst_buffer release];
[src_buffer release];
[stride_buffer release];
[mx_buffer release];
[my_buffer release];
[depth_buffer release];
pout = out_buf;
pdst = dst;
for( int j = 0; j < height; j++ )
{
memcpy( pdst, pout, sizeof(int16_t)*width );
pdst += MAX_PB_SIZE;
pout += 70;
}
}
MetalContext is defined as follows. The 3 members of MetalContext are initialized externally when the program begins, and they are initialized only once.
typedef struct {
void * metal_device;
void * metal_commandqueue;
void * metal_cps_v;
}MetalContext;
The code does not run successfully all the time. Sometimes, "segment fault: 11" is reported after the command [computeCommandEncoder setComputePipelineState:cpipeline]. Is there anything wrong?
