GPU Filters on UnsafeMutableRawPointer in iOS

I am currently working on image processing on iOS 10 with an iPad Pro.
I have written a small Swift 3 image-processing app to test the processing speed.
My decoder delivers a new frame every ~33 ms (about 30 FPS), which I need to process with some iOS Core Image filters without additional buffering. Every ~33 ms the following function is called:
func newFrame(_ player: MediaPlayer!, buffer: UnsafeMutableRawPointer!,
              size: Int32, format_fourcc: UnsafeMutablePointer<Int8>!,
              width: Int32, height: Int32, bytes_per_row: Int32,
              pts: Int, will_show: Int32) -> Int32 {
    if String(cString: format_fourcc) == "BGRA" && will_show == 1 {
        // START
        var pixelBuffer: CVPixelBuffer? = nil
        let ret = CVPixelBufferCreateWithBytes(kCFAllocatorSystemDefault,
                                               Int(width),
                                               Int(height),
                                               kCVPixelFormatType_32BGRA,
                                               buffer,
                                               Int(bytes_per_row),
                                               { (releaseContext: UnsafeMutableRawPointer?,
                                                  baseAddr: UnsafeRawPointer?) -> () in
                                                   // Nothing to release here: the created
                                                   // CVPixelBuffer is destroyed automatically
                                                   // when it goes out of scope of this function
                                               },
                                               buffer,
                                               nil,
                                               &pixelBuffer)
        // END_1
        if ret != kCVReturnSuccess {
            NSLog("New Frame: Can't create the buffer")
            return -1
        }
        if let pBuff = pixelBuffer {
            let img = CIImage(cvPixelBuffer: pBuff)
                .applyingFilter("CIColorInvert", withInputParameters: [:])
        }
        // END_2
    }
    return 0
}
I need to solve one of the following problems:
Copying the CIImage img raw memory data back to the UnsafeMutableRawPointer buffer memory.
Somehow applying a GPU image filter to the CVPixelBuffer pixelBuffer or to the UnsafeMutableRawPointer buffer directly.
The code block between // START and // END_2 needs to run in less than 5 ms.
What I know:
The code between // START and // END_1 runs in less than 1.3 ms.
Please help with your ideas.
Best regards,
Alex

I found a temporary solution:
1) Create a CIContext in your view:
imgContext = CIContext(eaglContext: eaglContext!)
2) Use the context to draw the filtered CIImage into the pointer's memory:
imgContext.render(img,
                  toBitmap: buffer,
                  rowBytes: Int(bytes_per_row),
                  bounds: CGRect(x: 0, y: 0,
                                 width: Int(width),
                                 height: Int(height)),
                  format: kCIFormatBGRA8,
                  colorSpace: CGColorSpaceCreateDeviceRGB())
This solution works reasonably well because it uses the SIMD instructions of the iPad's CPU. But the CPU utilization for the copy operation alone is too high, around 30%, and that 30% is added to your program's overall CPU usage.
Does anybody have a better idea of how to let the GPU write directly to the UnsafeMutableRawPointer after a CIFilter?
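One direction worth trying (a sketch, not verified on this device): instead of render(_:toBitmap:...), render back into the same CVPixelBuffer that already wraps buffer, so Core Image writes into that memory for you. Note that buffers created with CVPixelBufferCreateWithBytes are not IOSurface-backed, so Core Image may still perform an internal copy rather than a pure GPU write:

```swift
// Sketch: reuse the CVPixelBuffer that wraps `buffer` as the render target.
// Whether this avoids the CPU copy depends on how Core Image backs the buffer.
if let pBuff = pixelBuffer {
    let img = CIImage(cvPixelBuffer: pBuff)
        .applyingFilter("CIColorInvert", withInputParameters: [:])
    imgContext.render(img, to: pBuff) // filtered pixels land in the wrapped memory
}
```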

How to stream camera preview using AVCaptureVideoOutput with MultipeerConnectivity framework?

I'm trying to create a camera remote control app with an iPhone as the camera and an iPad as the remote control. What I'm trying to do is capture the iPhone's camera preview using AVCaptureVideoDataOutput and stream it over an OutputStream using the MultipeerConnectivity framework. The iPad then receives the data and shows it in a UIView by setting the layer contents. So far what I've done is this:
(iPhone/Camera preview stream) didOutput function implementation from the AVCaptureVideoDataOutputSampleBufferDelegate:
func captureOutput(_ output: AVCaptureOutput,
                   didOutput sampleBuffer: CMSampleBuffer,
                   from connection: AVCaptureConnection) {
    DispatchQueue.global(qos: .utility).async { [unowned self] in
        let imageBuffer = CMSampleBufferGetImageBuffer(sampleBuffer)
        if let imageBuffer {
            CVPixelBufferLockBaseAddress(imageBuffer, [])
            let baseAddress = CVPixelBufferGetBaseAddress(imageBuffer)
            let bytesPerRow: size_t? = CVPixelBufferGetBytesPerRow(imageBuffer)
            let width: size_t? = CVPixelBufferGetWidth(imageBuffer)
            let height: size_t? = CVPixelBufferGetHeight(imageBuffer)
            let colorSpace = CGColorSpaceCreateDeviceRGB()
            let newContext = CGContext(data: baseAddress,
                                       width: width ?? 0,
                                       height: height ?? 0,
                                       bitsPerComponent: 8,
                                       bytesPerRow: bytesPerRow ?? 0,
                                       space: colorSpace,
                                       bitmapInfo: CGBitmapInfo.byteOrder32Little.rawValue | CGImageAlphaInfo.premultipliedFirst.rawValue)
            if let newImage = newContext?.makeImage() {
                let image = UIImage(cgImage: newImage,
                                    scale: 0.2,
                                    orientation: .up)
                CVPixelBufferUnlockBaseAddress(imageBuffer, [])
                if let data = image.jpegData(compressionQuality: 0.2) {
                    let bytesWritten = data.withUnsafeBytes({
                        viewFinderStream?
                            .write($0.bindMemory(to: UInt8.self).baseAddress!, maxLength: data.count)
                    })
                }
            }
        }
    }
}
(iPad/Camera remote controller) Receiving the stream and showing it on the view. This is a function from StreamDelegate protocol:
func stream(_ aStream: Stream, handle eventCode: Stream.Event) {
    let inputStream = aStream as! InputStream
    switch eventCode {
    case .hasBytesAvailable:
        DispatchQueue.global(qos: .userInteractive).async { [unowned self] in
            var buffer = [UInt8](repeating: 0, count: 1024)
            let numberBytes = inputStream.read(&buffer, maxLength: 1024)
            let data = Data(referencing: NSData(bytes: &buffer, length: numberBytes))
            if let imageData = UIImage(data: data) {
                DispatchQueue.main.async {
                    previewCameraView.layer.contents = imageData.cgImage
                }
            }
        }
    case .hasSpaceAvailable:
        break
    default:
        break
    }
}
Unfortunately, the iPad does receive the stream, but it shows only a tiny bit of the video data, like this (notice the view on the right: there are a few pixels showing the camera preview data in the top left of the view; the rest is just a gray color):
EDIT: And I get this warning too in the console
2023-02-02 20:24:44.487399+0700 MultipeerVideo-Assignment[31170:1065836] Warning! [0x15c023800] Decoding incomplete with error code -1. This is expected if the image has not been fully downloaded.
And I'm not sure if this is normal or not, but the iPhone uses almost 100% of its CPU power.
My question is: what did I do wrong that keeps the video stream from showing completely on the iPad? And is there any way to make the stream more efficient so that the iPhone's CPU doesn't work so hard? I'm still new to iOS programming, so I'm not sure how to solve this. If you need more code for clarity, please reach out in the comments.
I think the root of the issue is that the iPad reads the data from the stream into a 1024-byte buffer, which is just 256 pixels. That's likely what you see in the preview.
Instead, you need to somehow "know" the length of every frame so you can read it in full.
If you sent uncompressed data, you could first send the iPad the expected dimensions so it could always read full frames. However, you send compressed images (JPEGs), so you need to somehow tell the iPad the binary size of every "image".
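A minimal sketch of such framing (the 4-byte big-endian length prefix and the function name are illustrative, not from the original code): prepend each JPEG with its byte count, and have the receiver accumulate bytes until a full frame has arrived before decoding:

```swift
// Sender side: write a 4-byte big-endian length, then the JPEG payload.
func sendFrame(_ jpegData: Data, to stream: OutputStream) {
    var length = UInt32(jpegData.count).bigEndian
    withUnsafeBytes(of: &length) { raw in
        _ = stream.write(raw.bindMemory(to: UInt8.self).baseAddress!, maxLength: 4)
    }
    jpegData.withUnsafeBytes { raw in
        _ = stream.write(raw.bindMemory(to: UInt8.self).baseAddress!,
                         maxLength: jpegData.count)
    }
}
// Receiver side: append incoming bytes to a running Data buffer; once
// 4 + length bytes are available, slice out one complete JPEG and decode
// it with UIImage(data:), then repeat for the next frame.
```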
Sending full frames is rather inefficient. I am not an expert in this area, but I would consider encoding the camera input into a video and streaming that to the iPad. It should be possible to use hardware encoding, and the streaming nature of mp4 videos should also help. But that might not be a good suggestion, since I have very little idea of what I'm talking about.
You might want to look into:
VideoToolbox Framework;
Explore low-latency video encoding with VideoToolbox WWDC Session;
How to use VideoToolbox to decompress H.264 video stream

getting uncompressed CIImage data

I'm trying to get the uncompressed data of a CIImage.
For now, the only way I have found gives me compressed data, using CIContext as follows:
let ciContext = CIContext()
let ciImage = CIImage(color: .red).cropped(to: .init(x: 0, y: 0, width: 192, height: 192))
guard let ciImageData = ciContext.jpegRepresentation(of: ciImage,
                                                     colorSpace: CGColorSpace(name: CGColorSpace.sRGB)!,
                                                     options: [:]) else {
    fatalError()
}
print(ciImageData.count) // Prints 1331
Is it possible to get (as efficiently as possible) the uncompressed CIImage data?
As you can see, ciContext.jpegRepresentation is compressing the image data as JPEG and gives you a Data object that can be written as-is as a JPEG file to disk (including image metadata).
You need to use a different CIContext API for rendering directly into (uncompressed) bitmap data:
let rowBytes = 4 * Int(ciImage.extent.width) // 4 channels (RGBA) of 8-bit data
let dataSize = rowBytes * Int(ciImage.extent.height)
var data = Data(count: dataSize)
data.withUnsafeMutableBytes { buffer in
    ciContext.render(ciImage,
                     toBitmap: buffer.baseAddress!,
                     rowBytes: rowBytes,
                     bounds: ciImage.extent,
                     format: .RGBA8,
                     colorSpace: CGColorSpace(name: CGColorSpace.sRGB)!)
}
Alternatively, you can create a CVPixelBuffer with the correct size and format and render into that with CIContext.render(_ image: CIImage, to buffer: CVPixelBuffer). I think Core ML has direct support for CVPixelBuffer inputs, so this might be the better option.
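For completeness, a sketch of that CVPixelBuffer route (the BGRA format and the IOSurface attribute are assumptions; adjust them to your pipeline):

```swift
var pixelBuffer: CVPixelBuffer?
// IOSurface backing helps when the buffer is handed to other GPU consumers.
let attrs = [kCVPixelBufferIOSurfacePropertiesKey as String: [:] as [String: Any]] as CFDictionary
let status = CVPixelBufferCreate(kCFAllocatorDefault,
                                 Int(ciImage.extent.width),
                                 Int(ciImage.extent.height),
                                 kCVPixelFormatType_32BGRA,
                                 attrs,
                                 &pixelBuffer)
if status == kCVReturnSuccess, let pb = pixelBuffer {
    ciContext.render(ciImage, to: pb)
    // `pb` now holds the uncompressed pixels and can be passed on directly.
}
```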

Generate Laplacian image by Apple-Metal MPSImageLaplacian

I am trying to generate a Laplacian image out of an RGB CGImage by using the Metal MPSImageLaplacian.
The current code used:
if let croppedImage = self.cropImage2(image: UIImage(ciImage: image), rect: rect)?.cgImage {
    let commandBuffer = self.commandQueue.makeCommandBuffer()!
    let laplacian = MPSImageLaplacian(device: self.device)
    let textureLoader = MTKTextureLoader(device: self.device)
    let options: [MTKTextureLoader.Option : Any]? = nil
    let srcTex = try! textureLoader.newTexture(cgImage: croppedImage, options: options)
    let desc = MTLTextureDescriptor.texture2DDescriptor(pixelFormat: srcTex.pixelFormat,
                                                        width: srcTex.width,
                                                        height: srcTex.height,
                                                        mipmapped: false)
    let lapTex = self.device.makeTexture(descriptor: desc)
    laplacian.encode(commandBuffer: commandBuffer, sourceTexture: srcTex, destinationTexture: lapTex!)
    let output = CIImage(mtlTexture: lapTex!, options: [:])?.cgImage
    print("output: \(output?.width)")
}
I suspect the problem is in makeTexture:
let lapTex = self.device.makeTexture(descriptor: desc)
The width and height of lapTex in the debugger are invalid, although desc and srcTex contain valid data, including width and height.
It looks like the order of initialization is wrong, but I couldn't find what.
Does anyone have an idea what is wrong?
Thanks
There are a few things wrong here.
First, as mentioned in my comment, the command buffer isn't being committed, so the kernel work is never being performed.
Second, you need to wait for the work to complete before attempting to read back the results. (On macOS you'd additionally need to use a blit command encoder to ensure that the contents of the texture are copied back to CPU-accessible memory.)
Third, it's important to create the destination texture with the appropriate usage flags. The default of .shaderRead is insufficient in this case, since the MPS kernel writes to the texture. Therefore, you should explicitly set the usage property on the texture descriptor (to either [.shaderRead, .shaderWrite] or .shaderWrite, depending on how you go on to use the texture).
Fourth, it may be the case that the pixel format of your source texture isn't a writable format, so unless you're absolutely certain it is, consider setting the destination pixel format to a known-writable format (like .rgba8unorm) instead of assuming the destination should match the source. This also helps later when creating CGImages.
Finally, there is no guarantee that the cgImage property of a CIImage is non-nil when it wasn't created from a CGImage. Calling the property doesn't (necessarily) create a new backing CGImage. So, you need to explicitly create a CGImage somehow.
One way of doing this would be to create a Metal device-backed CIContext and use its createCGImage(_:from:) method. Although this might work, it seems redundant if the intent is simply to create a CGImage from a MTLTexture (for display purposes, let's say).
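That route might look roughly like this sketch (device and lapTex as in the question's code; the orientation note is a common gotcha, not something verified here):

```swift
// Sketch: create a CGImage from the texture via a Metal-backed CIContext.
let ciContext = CIContext(mtlDevice: device)
if let ciImage = CIImage(mtlTexture: lapTex, options: nil),
   let cgImage = ciContext.createCGImage(ciImage, from: ciImage.extent) {
    // use cgImage here; Metal textures are top-to-bottom, so the result may
    // appear vertically flipped and need ciImage.oriented(.downMirrored) first
}
```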
Instead, consider using the getBytes(_:bytesPerRow:from:mipmapLevel:) method to get the bytes from the texture and load them into a CG bitmap context. It's then trivial to create a CGImage from the context.
Here's a function that computes the Laplacian of an image and returns the resulting image:
func laplacian(_ image: CGImage) -> CGImage? {
    let commandBuffer = self.commandQueue.makeCommandBuffer()!
    let laplacian = MPSImageLaplacian(device: self.device)
    let textureLoader = MTKTextureLoader(device: self.device)
    let options: [MTKTextureLoader.Option : Any]? = nil
    let srcTex = try! textureLoader.newTexture(cgImage: image, options: options)
    let desc = MTLTextureDescriptor.texture2DDescriptor(pixelFormat: srcTex.pixelFormat,
                                                        width: srcTex.width,
                                                        height: srcTex.height,
                                                        mipmapped: false)
    desc.pixelFormat = .rgba8Unorm
    desc.usage = [.shaderRead, .shaderWrite]
    let lapTex = self.device.makeTexture(descriptor: desc)!
    laplacian.encode(commandBuffer: commandBuffer, sourceTexture: srcTex, destinationTexture: lapTex)

    #if os(macOS)
    let blitCommandEncoder = commandBuffer.makeBlitCommandEncoder()!
    blitCommandEncoder.synchronize(resource: lapTex)
    blitCommandEncoder.endEncoding()
    #endif

    commandBuffer.commit()
    commandBuffer.waitUntilCompleted()

    // Note: You may want to use a different color space depending
    // on what you're doing with the image
    let colorSpace = CGColorSpaceCreateDeviceRGB()
    // Note: We skip the last component (A) since the Laplacian of the alpha
    // channel of an opaque image is 0 everywhere, and that interacts oddly
    // when we treat the result as an RGBA image.
    let bitmapInfo = CGImageAlphaInfo.noneSkipLast.rawValue
    let bytesPerRow = lapTex.width * 4
    let bitmapContext = CGContext(data: nil,
                                  width: lapTex.width,
                                  height: lapTex.height,
                                  bitsPerComponent: 8,
                                  bytesPerRow: bytesPerRow,
                                  space: colorSpace,
                                  bitmapInfo: bitmapInfo)!
    lapTex.getBytes(bitmapContext.data!,
                    bytesPerRow: bytesPerRow,
                    from: MTLRegionMake2D(0, 0, lapTex.width, lapTex.height),
                    mipmapLevel: 0)
    return bitmapContext.makeImage()
}

Can't load large jpeg into a MTLTexture with MTKTextureLoader

I'm trying to load a large image into an MTLTexture. It works with 4000x6000 images, but when I try with 6000x8000 it fails.
func setTexture(device: MTLDevice, imageName: String) -> MTLTexture? {
    let textureLoader = MTKTextureLoader(device: device)
    var texture: MTLTexture? = nil

    // In iOS 10 the origin was changed.
    let textureLoaderOptions: [MTKTextureLoader.Option: Any]
    if #available(iOS 10.0, *) {
        let origin = MTKTextureLoader.Origin.bottomLeft.rawValue
        textureLoaderOptions = [MTKTextureLoader.Option.origin : origin]
    } else {
        textureLoaderOptions = [:]
    }

    if let textureURL = Bundle.main.url(forResource: imageName, withExtension: nil, subdirectory: "Images") {
        do {
            texture = try textureLoader.newTexture(URL: textureURL, options: textureLoaderOptions)
        } catch {
            print("Texture not created.")
        }
    }
    return texture
}
Pretty basic code. I'm running it on an iPad Pro with the A9 chip, GPU family 3, which should handle textures this large. Should I manually tile it somehow if it doesn't accept this size? In that case, what's the best approach: using MTLRegionMake to copy bytes, slicing in Core Image, or a Core Graphics context?
I appreciate any help
Following your helpful comments, I decided to load the image manually by drawing into a CGContext and copying to an MTLTexture. The solution code is below. The context shouldn't be created each time a texture is created; it's better to move it outside the function and keep reusing it.
// Grab the CGImage; w = width, h = height...
let context = CGContext(data: nil,
                        width: w,
                        height: h,
                        bitsPerComponent: bpc,
                        bytesPerRow: (bpp / 8) * w,
                        space: colorSpace!,
                        bitmapInfo: bitmapInfo.rawValue)
let flip = CGAffineTransform(a: 1, b: 0, c: 0, d: -1, tx: 0, ty: CGFloat(h))
context?.concatenate(flip)
context?.draw(cgImage, in: CGRect(x: 0, y: 0, width: CGFloat(w), height: CGFloat(h)))

let textureDescriptor = MTLTextureDescriptor()
textureDescriptor.pixelFormat = .rgba8Unorm
textureDescriptor.width = w
textureDescriptor.height = h

guard let data = context?.data else { print("No data in context."); return nil }
let texture = device.makeTexture(descriptor: textureDescriptor)
texture?.replace(region: MTLRegionMake2D(0, 0, w, h),
                 mipmapLevel: 0,
                 withBytes: data,
                 bytesPerRow: 4 * w)
return texture
I had this issue before: a texture would load on one device and not on another. I think it is a bug in the texture loader.
You can load a texture manually using a CGImage and a CGContext: draw the image into the context, create an MTLTexture, then copy the bytes from the CGContext into the texture using an MTLRegion.
It's not foolproof; you have to make sure to use the correct pixel format for the Metal texture or you'll get strange results, so either you code for one specific format of imported image, or you do a lot of checking. Apple's Basic Texturing example shows how you can change the color order before writing the bytes to the texture using an MTLRegion.
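As a sketch of the "one specific format" approach: drawing the source CGImage into a context that is forced to a known RGBA8 layout guarantees the bytes match an .rgba8Unorm texture, whatever the original file contained (the exact bitmapInfo choice below is an assumption; it must be one of the CGContext-supported combinations):

```swift
// Force a known layout: 8 bits per component, premultiplied alpha last (RGBA).
let bitmapInfo = CGImageAlphaInfo.premultipliedLast.rawValue
let context = CGContext(data: nil,
                        width: cgImage.width,
                        height: cgImage.height,
                        bitsPerComponent: 8,
                        bytesPerRow: 4 * cgImage.width,
                        space: CGColorSpaceCreateDeviceRGB(),
                        bitmapInfo: bitmapInfo)
context?.draw(cgImage, in: CGRect(x: 0, y: 0,
                                  width: CGFloat(cgImage.width),
                                  height: CGFloat(cgImage.height)))
// context?.data now holds tightly packed RGBA8 rows, ready for
// texture.replace(region:mipmapLevel:withBytes:bytesPerRow:).
```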

64-bit RGBA UIImage? CGBitmapInfo for 64-bit

I'm trying to save a 16-bit depth PNG image with a P3 color space from a Metal texture on iOS. The texture has pixelFormat = .rgba16Unorm, and I extract the data with this code:
func dataProviderRef() -> CGDataProvider? {
    let pixelCount = width * height
    var imageBytes = [UInt8](repeating: 0, count: pixelCount * bytesPerPixel)
    let region = MTLRegionMake2D(0, 0, width, height)
    getBytes(&imageBytes, bytesPerRow: bytesPerRow, from: region, mipmapLevel: 0)
    return CGDataProvider(data: NSData(bytes: &imageBytes,
                                       length: pixelCount * bytesPerPixel * MemoryLayout<UInt8>.size))
}
I figured out that the way to save a PNG image on iOS is to create a UIImage first, and to initialize it, I need to create a CGImage. The problem is I don't know what to pass for CGBitmapInfo. In the documentation I can see you can specify the byte order for 32-bit formats, but not for 64-bit.
The function I use to convert the texture to a UIImage is this:
extension UIImage {
    public convenience init?(texture: MTLTexture) {
        guard let rgbColorSpace = texture.defaultColorSpace else {
            return nil
        }
        let bitmapInfo: CGBitmapInfo = [CGBitmapInfo(rawValue: CGImageAlphaInfo.last.rawValue)]
        guard let provider = texture.dataProviderRef() else {
            return nil
        }
        guard let cgim = CGImage(
            width: texture.width,
            height: texture.height,
            bitsPerComponent: texture.bitsPerComponent,
            bitsPerPixel: texture.bitsPerPixel,
            bytesPerRow: texture.bytesPerRow,
            space: rgbColorSpace,
            bitmapInfo: bitmapInfo,
            provider: provider,
            decode: nil,
            shouldInterpolate: false,
            intent: .defaultIntent
        )
        else {
            return nil
        }
        self.init(cgImage: cgim)
    }
}
Note that "texture" uses a series of attributes that do not exist on MTLTexture; I created a simple extension for convenience. The only interesting bit, I guess, is the color space, which at the moment is simply:
public extension MTLTexture {
    var defaultColorSpace: CGColorSpace? {
        get {
            switch pixelFormat {
            case .rgba16Unorm:
                return CGColorSpace(name: CGColorSpace.displayP3)
            default:
                return CGColorSpaceCreateDeviceRGB()
            }
        }
    }
}
It looks like the image I'm creating with the code above is sampling 4 bytes per pixel instead of 8, so I obviously end up with a funny-looking image...
How do I create the appropriate CGBitmapInfo? Is it even possible?
P.S. If you want to see the full code with an example, it's all in github: https://github.com/endavid/VidEngine/tree/master/SampleColorPalette
The answer was to use byteOrder16Little. For instance, I've replaced bitmapInfo in the code above with this:
let isFloat = texture.bitsPerComponent == 16
let bitmapInfo: CGBitmapInfo = [isFloat ? .byteOrder16Little : .byteOrder32Big,
                                CGBitmapInfo(rawValue: CGImageAlphaInfo.last.rawValue)]
(The alpha can be premultiplied as well).
The SDK documentation does not provide many hints about why this is, but the book Programming with Quartz has a nice explanation of the meaning of these 16 bits:
The value byteOrder16Little specifies to Quartz that each 16-bit chunk of data supplied by your data provider should be treated in little endian order [...] For example, when using a value of byteOrder16Little for an image that specifies RGB format with 16 bits per component and 48 bits per pixel, your data provider supplies the data for each pixel where the components are ordered R, G, B, but each color component value is in little-endian order [...] For best performance when using byteOrder16Little, either the pixel size or the component size of the image must be 16 bits.
So for a 64-bit image in rgba16, the pixel size is 64 bits, but the component size is 16 bits. It works nicely :)
(Thanks #warrenm !)
