Unexpected output converting CVPixelBuffer to MTLTexture - ios

I am extracting sample buffers from an AVAsset using AVAssetReader and converting each CMSampleBuffer to an MTLTexture on every iteration using the code snippet below. The CVPixelBuffer I get looks correct, but the MTLTexture produced by the conversion does not match the expected output.
I have already checked that the width and height are accurate, tried a different video (same issue occurred), and tried creating a different textureCache. Same problem every time.
func convertToMTLTexture(sampleBuffer: CMSampleBuffer?) -> MTLTexture? {
    if let textureCache = textureCache,
       let sampleBuffer = sampleBuffer,
       let imageBuffer: CVPixelBuffer = CMSampleBufferGetImageBuffer(sampleBuffer) {
        let width = CVPixelBufferGetWidth(imageBuffer)
        let height = CVPixelBufferGetHeight(imageBuffer)
        var texture: CVMetalTexture?
        CVMetalTextureCacheCreateTextureFromImage(kCFAllocatorDefault, textureCache,
                                                  imageBuffer, nil, .bgra8Unorm, width, height, 0, &texture)
        if let texture = texture {
            return CVMetalTextureGetTexture(texture)
        }
    }
    return nil
}
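One common cause of garbled output with this kind of conversion is a pixel-format mismatch: unless told otherwise, an AVAssetReader track output may deliver YUV (biplanar 4:2:0) pixel buffers, while the texture above is created as .bgra8Unorm. Below is a minimal sketch, not the asker's code, of forcing 32BGRA output from the reader so the memory layout matches; the helper name and the tuple return are illustrative assumptions.

import AVFoundation

// Hypothetical helper: request 32BGRA pixel buffers from the asset reader so a
// .bgra8Unorm Metal texture created from them has a matching memory layout.
func makeBGRATrackOutput(for asset: AVAsset) -> (AVAssetReader, AVAssetReaderTrackOutput)? {
    guard let videoTrack = asset.tracks(withMediaType: .video).first,
          let reader = try? AVAssetReader(asset: asset) else { return nil }
    let outputSettings: [String: Any] = [
        kCVPixelBufferPixelFormatTypeKey as String: kCVPixelFormatType_32BGRA
    ]
    let trackOutput = AVAssetReaderTrackOutput(track: videoTrack, outputSettings: outputSettings)
    guard reader.canAdd(trackOutput) else { return nil }
    reader.add(trackOutput)
    return (reader, trackOutput)
}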

Related

How to combine MTLTextures into the currentDrawable

I am new to using Metal, but I have been following the tutorial here that takes the camera output and renders it onto the screen using Metal.
Now I want to take an image, turn it into an MTLTexture, and position and render that texture on top of the camera output.
My current rendering code is as follows:
private func render(texture: MTLTexture, withCommandBuffer commandBuffer: MTLCommandBuffer, device: MTLDevice) {
    guard
        let currentRenderPassDescriptor = metalView.currentRenderPassDescriptor,
        let currentDrawable = metalView.currentDrawable,
        let renderPipelineState = renderPipelineState,
        let encoder = commandBuffer.makeRenderCommandEncoder(descriptor: currentRenderPassDescriptor)
    else {
        semaphore.signal()
        return
    }
    encoder.pushDebugGroup("RenderFrame")
    encoder.setRenderPipelineState(renderPipelineState)
    encoder.setFragmentTexture(texture, index: 0)
    encoder.drawPrimitives(type: .triangleStrip, vertexStart: 0, vertexCount: 4, instanceCount: 1)
    encoder.popDebugGroup()
    encoder.endEncoding()
    commandBuffer.addScheduledHandler { [weak self] (buffer) in
        guard let unwrappedSelf = self else { return }
        unwrappedSelf.didRenderTexture(texture, withCommandBuffer: buffer, device: device)
        unwrappedSelf.semaphore.signal()
    }
    commandBuffer.present(currentDrawable)
    commandBuffer.commit()
}
I know that I can convert a UIImage to an MTLTexture using the following code:
let textureLoader = MTKTextureLoader(device: device)
let cgImage = UIImage(named: "myImage")!.cgImage!
let imageTexture = try! textureLoader.newTexture(cgImage: cgImage, options: nil)
So now I have two MTLTextures. Is there a simple function that allows me to combine them? I've been trying to search online and someone mentioned a function called over, but I haven't actually been able to find that one. Any help would be greatly appreciated.
You can simply do this inside the fragment shader by sampling both textures and combining their color values (adding, multiplying, or blending them). That's what shaders are for.
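On the Swift side that only requires binding both textures to the same render pass; the actual "over" composite or blend lives in the fragment shader, which is not shown here. A minimal sketch, assuming the pipeline's fragment function reads textures at indices 0 and 1 (the function and parameter names below are illustrative, not from the tutorial):

import Metal

// Sketch: encode a full-screen quad whose fragment shader composites two textures
// bound at indices 0 and 1 (camera frame underneath, image on top).
func encodeComposite(cameraTexture: MTLTexture,
                     overlayTexture: MTLTexture,
                     pipelineState: MTLRenderPipelineState,
                     encoder: MTLRenderCommandEncoder) {
    encoder.setRenderPipelineState(pipelineState)
    encoder.setFragmentTexture(cameraTexture, index: 0)   // camera output
    encoder.setFragmentTexture(overlayTexture, index: 1)  // image loaded with MTKTextureLoader
    encoder.drawPrimitives(type: .triangleStrip, vertexStart: 0, vertexCount: 4)
}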

How to create depth data and add it to an image?

Sorry, I duplicated this question, How to build AVDepthData manually, because it doesn't have the answers I want and I don't have enough rep to comment there. If you don't mind, I could remove my question later and ask somebody to move future answers to that topic.
So, my goal is to create depth data and attach it to an arbitrary image. There is an article on how to do it, https://developer.apple.com/documentation/avfoundation/avdepthdata/creating_auxiliary_depth_data_manually, but I don't know how to implement any of its steps. I won't post all my questions at once and will start with the first one.
As a first step, the depth image must be converted per pixel from grayscale to depth or disparity values. I took this snippet from the aforementioned topic:
func buildDepth(image: UIImage) -> AVDepthData? {
    let width = Int(image.size.width)
    let height = Int(image.size.height)
    var maybeDepthMapPixelBuffer: CVPixelBuffer?
    let status = CVPixelBufferCreate(kCFAllocatorDefault, width, height, kCVPixelFormatType_DisparityFloat32, nil, &maybeDepthMapPixelBuffer)
    guard status == kCVReturnSuccess, let depthMapPixelBuffer = maybeDepthMapPixelBuffer else {
        return nil
    }
    CVPixelBufferLockBaseAddress(depthMapPixelBuffer, .init(rawValue: 0))
    guard let baseAddress = CVPixelBufferGetBaseAddress(depthMapPixelBuffer) else {
        return nil
    }
    let buffer = unsafeBitCast(baseAddress, to: UnsafeMutablePointer<Float32>.self)
    for i in 0..<width * height {
        buffer[i] = 0 // disparity must be calculated somehow, but set to 0 for testing purposes
    }
    CVPixelBufferUnlockBaseAddress(depthMapPixelBuffer, .init(rawValue: 0))
    let info: [AnyHashable: Any] = [kCGImagePropertyPixelFormat: kCVPixelFormatType_DisparityFloat32,
                                    kCGImagePropertyWidth: image.size.width,
                                    kCGImagePropertyHeight: image.size.height,
                                    kCGImagePropertyBytesPerRow: CVPixelBufferGetBytesPerRow(depthMapPixelBuffer)]
    let metadata = generateMetadata(image: image)
    let dic: [AnyHashable: Any] = [kCGImageAuxiliaryDataInfoDataDescription: info,
                                   // I get an error when converting baseAddress to CFData
                                   kCGImageAuxiliaryDataInfoData: baseAddress as! CFData,
                                   kCGImageAuxiliaryDataInfoMetadata: metadata]
    guard let depthData = try? AVDepthData(fromDictionaryRepresentation: dic) else {
        return nil
    }
    return depthData
}
Then the article says to load the base address of the pixel buffer (which holds the disparity map) as CFData and pass it as the kCGImageAuxiliaryDataInfoData value in a CFDictionary. But I get an error when converting baseAddress to CFData. I tried converting the pixel buffer itself too, but without luck. What do I have to pass as kCGImageAuxiliaryDataInfoData? Did I create the disparity buffer correctly in the first place?
Aside from this problem it would be cool if somebody could direct me to some sample code on how to do the whole thing.
Your question really helped me get from CVPixelBuffer to AVDepthData, so thank you. It got me about 95% of the way there.
To fix your (and my) issue I added the following:
let bytesPerRow = CVPixelBufferGetBytesPerRow(depthMapPixelBuffer)
let size = bytesPerRow * height
... code code code ...
CVPixelBufferLockBaseAddress(depthMapPixelBuffer!, .init(rawValue: 0))
let baseAddress = CVPixelBufferGetBaseAddressOfPlane(depthMapPixelBuffer!, 0)
let data = NSData(bytes: baseAddress, length: size)
... code code code ...
let dic: [AnyHashable: Any] = [kCGImageAuxiliaryDataInfoDataDescription: info,
                               kCGImageAuxiliaryDataInfoData: data,
                               kCGImageAuxiliaryDataInfoMetadata: metadata]
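For the "whole thing" part of the question: once an AVDepthData object exists, ImageIO can attach it to an image file as auxiliary data. Below is a minimal sketch (not from the thread) using AVDepthData's dictionaryRepresentation(forAuxiliaryDataType:) together with CGImageDestinationAddAuxiliaryDataInfo; the cgImage and URL names are placeholders.

import AVFoundation
import ImageIO
import MobileCoreServices

// Sketch: write `cgImage` to `url` as a JPEG with `depthData` attached as an
// auxiliary depth/disparity image. Names are placeholders, not from the question.
func writeImage(_ cgImage: CGImage, depthData: AVDepthData, to url: URL) -> Bool {
    guard let destination = CGImageDestinationCreateWithURL(url as CFURL, kUTTypeJPEG, 1, nil) else {
        return false
    }
    CGImageDestinationAddImage(destination, cgImage, nil)

    // Ask AVDepthData for the dictionary ImageIO expects, plus the matching aux data type.
    var auxDataType: NSString?
    guard let auxData = depthData.dictionaryRepresentation(forAuxiliaryDataType: &auxDataType),
          let dataType = auxDataType else {
        return false
    }
    CGImageDestinationAddAuxiliaryDataInfo(destination, dataType as CFString, auxData as CFDictionary)
    return CGImageDestinationFinalize(destination)
}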

MTKView - Draw on to Two Views at Once

What I got
I am following the Apple sample code AVCamPhotoFilter to display the camera feed on an MTKView.
What I am trying to do
In addition to the MTKView above, I need to display a second MTKView. The second one will display exactly the same content as the first, so I do not want to duplicate the code and do the work twice.
Current drawing method
override func draw(_ rect: CGRect) {
    var pixelBuffer: CVPixelBuffer?
    var mirroring = false
    var rotation: Rotation = .rotate0Degrees
    syncQueue.sync {
        pixelBuffer = internalPixelBuffer
        mirroring = internalMirroring
        rotation = internalRotation
    }
    guard let drawable = currentDrawable,
          let currentRenderPassDescriptor = currentRenderPassDescriptor,
          let previewPixelBuffer = pixelBuffer else {
        return
    }
    // Create a Metal texture from the image buffer
    let width = CVPixelBufferGetWidth(previewPixelBuffer)
    let height = CVPixelBufferGetHeight(previewPixelBuffer)
    if textureCache == nil {
        createTextureCache()
    }
    var cvTextureOut: CVMetalTexture?
    CVMetalTextureCacheCreateTextureFromImage(kCFAllocatorDefault,
                                              textureCache!,
                                              previewPixelBuffer,
                                              nil,
                                              .bgra8Unorm,
                                              width,
                                              height,
                                              0,
                                              &cvTextureOut)
    guard let cvTexture = cvTextureOut, let texture = CVMetalTextureGetTexture(cvTexture) else {
        print("Failed to create preview texture")
        CVMetalTextureCacheFlush(textureCache!, 0)
        return
    }
    if texture.width != textureWidth ||
        texture.height != textureHeight ||
        self.bounds != internalBounds ||
        mirroring != textureMirroring ||
        rotation != textureRotation {
        setupTransform(width: texture.width, height: texture.height, mirroring: mirroring, rotation: rotation)
    }
    // Set up command buffer and encoder
    guard let commandQueue = commandQueue else {
        print("Failed to create Metal command queue")
        CVMetalTextureCacheFlush(textureCache!, 0)
        return
    }
    guard let commandBuffer = commandQueue.makeCommandBuffer() else {
        print("Failed to create Metal command buffer")
        CVMetalTextureCacheFlush(textureCache!, 0)
        return
    }
    guard let commandEncoder = commandBuffer.makeRenderCommandEncoder(descriptor: currentRenderPassDescriptor) else {
        print("Failed to create Metal command encoder")
        CVMetalTextureCacheFlush(textureCache!, 0)
        return
    }
    commandEncoder.label = "Preview display"
    commandEncoder.setRenderPipelineState(renderPipelineState!)
    commandEncoder.setVertexBuffer(vertexCoordBuffer, offset: 0, index: 0)
    commandEncoder.setVertexBuffer(textCoordBuffer, offset: 0, index: 1)
    commandEncoder.setFragmentTexture(texture, index: 0)
    commandEncoder.setFragmentSamplerState(sampler, index: 0)
    commandEncoder.drawPrimitives(type: .triangleStrip, vertexStart: 0, vertexCount: 4)
    commandEncoder.endEncoding()
    commandBuffer.present(drawable) // Draw to the screen
    commandBuffer.commit()
}
Question
Is there a way I can simply pass the texture on to the second MTKView and draw it without doing the work twice?
If you set the framebufferOnly property of the first MTKView to false, you can submit commands which read from its drawable texture. Then, you can use a blit command encoder to copy from the first drawable's texture to the second's, if they are compatible. Otherwise, you can draw a quad to the second drawable's texture with the first drawable's texture as the source for texturing the quad.
Personally, I think I would prefer all of the rendering to go to a texture of your own creation (not any drawable's texture). Then, copy/draw that to both of the drawable textures.
In any case, if you need the two views to update in perfect sync, you should set presentsWithTransaction to true for both views, synchronously wait (using -waitUntilScheduled) for the command buffer that does (at least) the copy/draw to the drawable textures, and then call -present directly on both drawables. (That is, don't use -presentDrawable: on the command buffer.)
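A minimal sketch of the blit approach described above; it assumes framebufferOnly has been set to false on the first view and that the two drawables have compatible pixel formats and sizes. The view and command-queue names are placeholders, not from the sample code.

import MetalKit

// Sketch: after rendering into firstView's drawable, copy that texture into
// secondView's drawable with a blit encoder and present the second drawable.
func mirrorDrawable(from firstView: MTKView, to secondView: MTKView, commandQueue: MTLCommandQueue) {
    guard let sourceDrawable = firstView.currentDrawable,
          let destinationDrawable = secondView.currentDrawable,
          let commandBuffer = commandQueue.makeCommandBuffer(),
          let blit = commandBuffer.makeBlitCommandEncoder() else { return }

    let source = sourceDrawable.texture
    let destination = destinationDrawable.texture
    blit.copy(from: source, sourceSlice: 0, sourceLevel: 0,
              sourceOrigin: MTLOrigin(x: 0, y: 0, z: 0),
              sourceSize: MTLSize(width: source.width, height: source.height, depth: 1),
              to: destination, destinationSlice: 0, destinationLevel: 0,
              destinationOrigin: MTLOrigin(x: 0, y: 0, z: 0))
    blit.endEncoding()
    commandBuffer.present(destinationDrawable)
    commandBuffer.commit()
}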

Fastest way to record video from SCNView

I have an SCNView with an object in the middle of the screen; the user can rotate it, scale it, etc.
I want to record all of these movements to a video and add some sound in real time. I also want to record only the middle part of the SCNView (e.g. the SCNView frame is 375x812, but I only want the middle 375x375 without the top and bottom borders), and I want to show it on screen at the same time as it is being captured.
My current approaches are:
func renderer(_ renderer: SCNSceneRenderer, didRenderScene scene: SCNScene, atTime time: TimeInterval) {
    DispatchQueue.main.async {
        if let metalLayer = self.sceneView.layer as? CAMetalLayer, let texture = metalLayer.currentSceneDrawable?.texture, let pixelBufferPool = self.pixelBufferPool {
            //1
            var maybePixelBuffer: CVPixelBuffer? = nil
            let status = CVPixelBufferPoolCreatePixelBuffer(nil, pixelBufferPool, &maybePixelBuffer)
            guard let pixelBuffer = maybePixelBuffer else { return }
            CVPixelBufferLockBaseAddress(pixelBuffer, [])
            let bytesPerRow = CVPixelBufferGetBytesPerRow(pixelBuffer)
            let region = MTLRegionMake2D(Int(self.fieldOfView.origin.x * UIScreen.main.scale),
                                         Int(self.fieldOfView.origin.y * UIScreen.main.scale),
                                         Int(self.fieldOfView.width * UIScreen.main.scale),
                                         Int(self.fieldOfView.height * UIScreen.main.scale))
            let pixelBufferBytes = CVPixelBufferGetBaseAddress(pixelBuffer)!
            texture.getBytes(pixelBufferBytes, bytesPerRow: bytesPerRow, from: region, mipmapLevel: 0)
            let uiImage = self.image(from: pixelBuffer)
            CVPixelBufferUnlockBaseAddress(pixelBuffer, [])
            //2
            if #available(iOS 11.0, *) {
                var pixelBuffer: Unmanaged<CVPixelBuffer>? = nil
                CVPixelBufferCreateWithIOSurface(kCFAllocatorDefault, texture.iosurface!, nil, UnsafeMutablePointer<Unmanaged<CVPixelBuffer>?>(&pixelBuffer))
                let imageBuffer = pixelBuffer!.takeUnretainedValue()
            } else {
                // Fallback on earlier versions
            }
            //3
            var pb: CVPixelBuffer? = nil
            let result = CVPixelBufferCreate(kCFAllocatorDefault, texture.width, texture.height, kCVPixelFormatType_32BGRA, nil, &pb)
            print(result)
            let ciImage = CIImage(mtlTexture: texture, options: nil)
            let context = CIContext()
            context.render(ciImage!, to: pb!)
        }
    }
}
The obtained CVPixelBuffer will then be appended to an AVAssetWriter (a sketch of that writer setup appears after this question).
But all of these methods have flaws:
1) The MTLTexture has colorPixelFormat == 555 (bgra10_XR_sRGB, if I recall correctly) and I don't know how to convert it to BGRA (to append it to the assetWriter), how to change that colorPixelFormat, or how to add bgra10_XR_sRGB to the assetWriter.
2) How do I implement a version for iOS 10?
2, 3) What is the fastest way to crop the image? With these methods I can only grab the full image instead of a cropped one, and I don't want to convert it to a UIImage because that is too slow.
P.S. My previous viewer was OpenGL ES based (GLKView) and I did this successfully using this technique (1 ms of overhead instead of 30 ms with the .screenshot method).
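For the AVAssetWriter side mentioned above, here is a minimal sketch of appending the obtained CVPixelBuffers via a pixel-buffer adaptor. The class name, output URL, dimensions, and codec choice are placeholder assumptions (AVVideoCodecType.h264 requires iOS 11; the string constant AVVideoCodecH264 is the pre-iOS 11 equivalent).

import AVFoundation

// Sketch: an AVAssetWriter fed by a pixel-buffer adaptor; times come from the caller.
final class VideoRecorder {
    private let writer: AVAssetWriter
    private let input: AVAssetWriterInput
    private let adaptor: AVAssetWriterInputPixelBufferAdaptor

    init?(outputURL: URL, width: Int, height: Int) {
        guard let writer = try? AVAssetWriter(outputURL: outputURL, fileType: .mp4) else { return nil }
        let settings: [String: Any] = [
            AVVideoCodecKey: AVVideoCodecType.h264,
            AVVideoWidthKey: width,
            AVVideoHeightKey: height
        ]
        let input = AVAssetWriterInput(mediaType: .video, outputSettings: settings)
        input.expectsMediaDataInRealTime = true
        let adaptor = AVAssetWriterInputPixelBufferAdaptor(assetWriterInput: input,
                                                           sourcePixelBufferAttributes: nil)
        guard writer.canAdd(input) else { return nil }
        writer.add(input)
        self.writer = writer
        self.input = input
        self.adaptor = adaptor
    }

    func start(at time: CMTime) {
        guard writer.startWriting() else { return }
        writer.startSession(atSourceTime: time)
    }

    func append(_ pixelBuffer: CVPixelBuffer, at time: CMTime) {
        guard input.isReadyForMoreMediaData else { return }
        if !adaptor.append(pixelBuffer, withPresentationTime: time) {
            print("Failed to append pixel buffer at \(time)")
        }
    }
}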

Changing size of image obtained from camera explanation

I am new to iOS and have experience with image processing in other languages, which I was hoping to translate into an app, but I am getting some unusual behavior that I don't understand. When I convert an image to a Data array and look at the number of elements in the array, the count changes with every new image. When I look at the specific data in the array, the values are 0-255, which matches what I would expect for a grayscale image, but I am confused why the size (number of elements) of the data array changes. I would expect it to stay constant, since I set the captureSession to 640x480. Why is this not the case? Even if it weren't a grayscale image, I would expect the size to remain the same from picture to picture.
UPDATE:
I am getting the UIImage from AVFoundation, and the relevant code is shown below. The rest of the code, which is not shown, just starts the session. I basically want to turn the image into raw pixel data; I have seen a lot of different ways to do this, but this seemed like a good method.
Relevant Code:
@objc func timerHandle() {
    imageView.image = uiimages
}

func captureOutput(_ output: AVCaptureOutput, didOutput sampleBuffer: CMSampleBuffer, from connection: AVCaptureConnection) {
    uiimages = sampleBuffer.image(orientation: .down, scale: 1.0)!
    print(uiimages) //output1
    let data = sampleBuffer.data()
    let newData = Array(data!)
    print(data!.count) //output2
}

extension CMSampleBuffer {
    func image(orientation: UIImageOrientation = .left, scale: CGFloat = 1.0) -> UIImage? {
        if let buffer = CMSampleBufferGetImageBuffer(self) {
            let ciImage = CIImage(cvPixelBuffer: buffer).applyingFilter("CIColorControls", parameters: [kCIInputSaturationKey: 0.0])
            return UIImage(ciImage: ciImage, scale: scale, orientation: orientation)
        }
        return nil
    }

    func data(orientation: UIImageOrientation = .left, scale: CGFloat = 1.0) -> Data? {
        if let buffer = CMSampleBufferGetImageBuffer(self) {
            let size = self.image()?.size
            let scale = self.image()?.scale
            let ciImage = CIImage(cvPixelBuffer: buffer).applyingFilter("CIColorControls", parameters: [kCIInputSaturationKey: 0.0])
            UIGraphicsBeginImageContextWithOptions(size!, false, scale!)
            defer { UIGraphicsEndImageContext() }
            UIImage(ciImage: ciImage).draw(in: CGRect(origin: .zero, size: size!))
            guard let redraw = UIGraphicsGetImageFromCurrentImageContext() else { return nil }
            return UIImagePNGRepresentation(redraw)
        }
        return nil
    }
}
At output1, when I print the uiimage variable directly, I get:
<UIImage: 0x1c40b0ec0>, {640, 480}
which shows the correct dimensions.
At output2, when I print the count, I get a different value every time captureOutput is called:
225726
224474
225961
640x480 should give me 307,200, so why am I not getting a constant number, even if the value isn't the one I expect?
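One likely explanation, based on the code shown: data() returns the output of UIImagePNGRepresentation, and PNG is compressed, so its byte count depends on the image content and varies frame to frame even at a fixed 640x480. A sketch of reading the raw pixel bytes straight from the sample buffer instead, assuming a single-plane pixel buffer such as 32BGRA (the helper name is illustrative):

import CoreMedia
import CoreVideo

// Sketch: copy the raw pixel bytes out of the CMSampleBuffer's CVPixelBuffer.
// For a 640x480 32BGRA buffer this yields bytesPerRow * 480 bytes every frame
// (bytesPerRow may include padding beyond width * 4).
func rawPixelData(from sampleBuffer: CMSampleBuffer) -> Data? {
    guard let pixelBuffer = CMSampleBufferGetImageBuffer(sampleBuffer) else { return nil }
    CVPixelBufferLockBaseAddress(pixelBuffer, .readOnly)
    defer { CVPixelBufferUnlockBaseAddress(pixelBuffer, .readOnly) }

    guard let baseAddress = CVPixelBufferGetBaseAddress(pixelBuffer) else { return nil }
    let bytesPerRow = CVPixelBufferGetBytesPerRow(pixelBuffer)
    let height = CVPixelBufferGetHeight(pixelBuffer)
    return Data(bytes: baseAddress, count: bytesPerRow * height)
}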