Raw Image data format from iOS Camera

I need to save raw data from the iOS camera to the cloud. I get a CVPixelBuffer from the iOS camera. I do not specify what format (kCVPixelBufferPixelFormatTypeKey) I want the CVPixelBuffer in when I set up the iOS camera.
I turn the CVPixelBuffer into a Data object like this.
let buffer: CVPixelBuffer = //My buffer from the camera
//Get the height for the size calculation
let height = CVPixelBufferGetHeight(buffer)
//Get the bytes per row for the size calculation
let bytesPerRow = CVPixelBufferGetBytesPerRow(buffer)
//Lock the buffer so we can turn it into data
CVPixelBufferLockBaseAddress(buffer, CVPixelBufferLockFlags.readOnly)
//Get the base address. This is the address in memory of where the start of the buffer is currently stored.
guard let pointer = CVPixelBufferGetBaseAddress(buffer) else {
//If we failed to get the base address unlock the buffer and clean up.
CVPixelBufferUnlockBaseAddress(buffer, CVPixelBufferLockFlags.readOnly)
return
}
let data = Data(bytes: pointer, count: height * bytesPerRow)
//Unlock the buffer again once the bytes have been copied out
CVPixelBufferUnlockBaseAddress(buffer, CVPixelBufferLockFlags.readOnly)
Am I doing this right?
Also, when I get this data back how would I turn it into a CIImage?
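In case it helps, here is a minimal sketch of a hypothetical helper for going back the other way, assuming you also saved the buffer's width, height, bytesPerRow, and pixel format type alongside the raw bytes (the Data alone does not preserve them), and assuming a non-planar format such as kCVPixelFormatType_32BGRA:
import CoreImage
import CoreVideo

func makeCIImage(from data: Data, width: Int, height: Int, bytesPerRow: Int, pixelFormat: OSType) -> CIImage? {
    //Recreate an empty CVPixelBuffer with the saved dimensions and format
    var pixelBuffer: CVPixelBuffer?
    let status = CVPixelBufferCreate(kCFAllocatorDefault, width, height, pixelFormat, nil, &pixelBuffer)
    guard status == kCVReturnSuccess, let buffer = pixelBuffer else { return nil }
    //Lock without the read-only flag because we are writing into the buffer
    CVPixelBufferLockBaseAddress(buffer, [])
    defer { CVPixelBufferUnlockBaseAddress(buffer, []) }
    guard let destination = CVPixelBufferGetBaseAddress(buffer) else { return nil }
    let destinationBytesPerRow = CVPixelBufferGetBytesPerRow(buffer)
    //Copy row by row because the new buffer's bytesPerRow may differ from the saved one
    data.withUnsafeBytes { (source: UnsafeRawBufferPointer) in
        for row in 0..<height {
            memcpy(destination + row * destinationBytesPerRow,
                   source.baseAddress! + row * bytesPerRow,
                   min(bytesPerRow, destinationBytesPerRow))
        }
    }
    return CIImage(cvPixelBuffer: buffer)
}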

Related

Deep copy CVPixelBuffer for depth data in Swift

I'm getting a stream of depth data from AVCaptureSynchronizedDataCollection and trying to do some processing on the depthDataMap asynchronously. I tried to deep copy the CVPixelBuffer since I don't want to block the camera while processing, but it doesn't seem the copied buffer is correct because I keep getting bad access errors. Here is the code I'm using to deep copy the CVPixelBuffer:
func duplicatePixelBuffer(input: CVPixelBuffer) -> CVPixelBuffer {
var copyOut: CVPixelBuffer?
let bufferWidth = CVPixelBufferGetWidth(input)
let bufferHeight = CVPixelBufferGetHeight(input)
let bytesPerRow = CVPixelBufferGetBytesPerRow(input)
let bufferFormat = CVPixelBufferGetPixelFormatType(input)
_ = CVPixelBufferCreate(kCFAllocatorDefault, bufferWidth, bufferHeight, bufferFormat, CVBufferGetAttachments(input, CVAttachmentMode.shouldPropagate), &copyOut)
let output = copyOut!
// Lock the depth map base address before accessing it
CVPixelBufferLockBaseAddress(input, CVPixelBufferLockFlags.readOnly)
CVPixelBufferLockBaseAddress(output, CVPixelBufferLockFlags.readOnly)
let baseAddress = CVPixelBufferGetBaseAddress(input)
let baseAddressCopy = CVPixelBufferGetBaseAddress(output)
memcpy(baseAddressCopy, baseAddress, bufferHeight * bytesPerRow)
// Unlock the base address when finished accessing the buffer
CVPixelBufferUnlockBaseAddress(input, CVPixelBufferLockFlags.readOnly)
CVPixelBufferUnlockBaseAddress(output, CVPixelBufferLockFlags.readOnly)
NSLog("Pixel buffer original: \(input)")
NSLog("Pixel buffer copy: \(output)")
return output
}
I checked the two CVPixelBuffer objects before the return and it seems like there is no iosurface for the copied buffer. Also, there is a MetadataDictionary object in propagatedAttachments in the original, but in the copy the MetadataDictionary object is directly in attributes.
I've tried some of the other solutions on Stack Overflow with no luck, since my buffer is non-planar. I would appreciate any insights on this, or advice on whether I should try a different approach entirely. Thanks!
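For reference, here is a minimal sketch of a deep copy for a non-planar buffer (an assumption based on your description): it checks the result of CVPixelBufferCreate instead of force-unwrapping, locks the destination without the read-only flag so it can actually be written to, and copies row by row because the two buffers may end up with different bytesPerRow values:
func duplicateNonPlanarPixelBuffer(_ input: CVPixelBuffer) -> CVPixelBuffer? {
    var copyOut: CVPixelBuffer?
    let status = CVPixelBufferCreate(kCFAllocatorDefault,
                                     CVPixelBufferGetWidth(input),
                                     CVPixelBufferGetHeight(input),
                                     CVPixelBufferGetPixelFormatType(input),
                                     nil,
                                     &copyOut)
    guard status == kCVReturnSuccess, let output = copyOut else { return nil }
    CVPixelBufferLockBaseAddress(input, .readOnly)
    CVPixelBufferLockBaseAddress(output, []) //not read-only: we are writing to the copy
    defer {
        CVPixelBufferUnlockBaseAddress(output, [])
        CVPixelBufferUnlockBaseAddress(input, .readOnly)
    }
    guard let source = CVPixelBufferGetBaseAddress(input),
          let destination = CVPixelBufferGetBaseAddress(output) else { return nil }
    let sourceBytesPerRow = CVPixelBufferGetBytesPerRow(input)
    let destinationBytesPerRow = CVPixelBufferGetBytesPerRow(output)
    //Rows may be padded differently in the two buffers, so copy row by row
    for row in 0..<CVPixelBufferGetHeight(input) {
        memcpy(destination + row * destinationBytesPerRow,
               source + row * sourceBytesPerRow,
               min(sourceBytesPerRow, destinationBytesPerRow))
    }
    return output
}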

MTKView frequently displaying scrambled MTLTextures

I am working on an MTKView-backed paint program which can replay painting history via an array of MTLTextures that store keyframes. I am having an issue in which sometimes the content of these MTLTextures is scrambled.
As an example, say I want to store a section of the drawing below as a keyframe:
During playback, sometimes the drawing will display exactly as intended, but sometimes, it will display like this:
Note the distorted portion of the picture. (The undistorted portion constitutes a static background image that's not part of the keyframe in question)
I describe the way I create individual MTLTextures from the MTKView's currentDrawable below. Because of color-depth issues I won't go into, the process may seem a little roundabout.
I first get a CGImage of the subsection of the screen that constitutes a keyframe.
I use that CGImage to create an MTLTexture tied to the MTKView's device.
I store that MTLTexture in an MTLTextureStructure that holds the MTLTexture and the keyframe's bounding box (which I'll need later).
Lastly, I store it in an array of MTLTextureStructures (keyframeMetalArray). During playback, when I hit a keyframe, I get it from this keyframeMetalArray.
The associated code is outlined below.
let keyframeCGImage = weakSelf!.canvasMetalViewPainting.mtlTextureToCGImage(bbox: keyframeBbox, copyMode: copyTextureMode.textureKeyframe) // convert from MetalTexture to CGImage
let keyframeMTLTexture = weakSelf!.canvasMetalViewPainting.CGImageToMTLTexture(cgImage: keyframeCGImage)
let keyframeMTLTextureStruc = mtlTextureStructure(texture: keyframeMTLTexture, bbox: keyframeBbox, strokeType: brushTypeMode.brush)
weakSelf!.keyframeMetalArray.append(keyframeMTLTextureStruc)
Without providing specifics about how each conversion is happening, I wonder if, from an architecture design point, I'm overlooking something that is corrupting my data stored in the keyframeMetalArray. It may be unwise to try to store these MTLTextures in volatile arrays, but I don't know that for a fact. I just figured using MTLTextures would be the quickest way to update content.
By the way, when I swap the arrays of keyframe MTLTextures out for arrays of UIImage.pngData, I have no display issues, but it's a lot slower. On the plus side, it tells me that the initial capture from currentDrawable to keyframeCGImage is working just fine.
Any thoughts would be appreciated.
p.s. adding a bit of detail based on the feedback:
mtlTextureToCGImage:
func mtlTextureToCGImage(bbox: CGRect, copyMode: copyTextureMode) -> CGImage {
let kciOptions = [convertFromCIContextOption(CIContextOption.outputPremultiplied): true,
convertFromCIContextOption(CIContextOption.useSoftwareRenderer): false] as [String : Any]
let bboxStrokeScaledFlippedY = CGRect(x: (bbox.origin.x * self.viewContentScaleFactor), y: ((self.viewBounds.height - bbox.origin.y - bbox.height) * self.viewContentScaleFactor), width: (bbox.width * self.viewContentScaleFactor), height: (bbox.height * self.viewContentScaleFactor))
let strokeCIImage = CIImage(mtlTexture: metalDrawableTextureKeyframe,
options: convertToOptionalCIImageOptionDictionary(kciOptions))!.oriented(CGImagePropertyOrientation.downMirrored)
let imageCropCG = cicontext.createCGImage(strokeCIImage, from: bboxStrokeScaledFlippedY, format: CIFormat.RGBA8, colorSpace: colorSpaceGenericRGBLinear)
cicontext.clearCaches()
return imageCropCG!
} // end of func mtlTextureToCGImage(bbox: CGRect)
CGImageToMTLTexture:
func CGImageToMTLTexture (cgImage: CGImage) -> MTLTexture {
// Note that we forego the more direct method of creating stampTexture:
//let stampTexture = try! MTKTextureLoader(device: self.device!).newTexture(cgImage: strokeUIImage.cgImage!, options: nil)
// because MTKTextureLoader seems to be doing additional processing which messes with the resulting texture/colorspace
let width = Int(cgImage.width)
let height = Int(cgImage.height)
let bytesPerPixel = 4
let rowBytes = width * bytesPerPixel
//
let texDescriptor = MTLTextureDescriptor.texture2DDescriptor(pixelFormat: .rgba8Unorm,
width: width,
height: height,
mipmapped: false)
texDescriptor.usage = MTLTextureUsage(rawValue: MTLTextureUsage.shaderRead.rawValue)
texDescriptor.storageMode = .shared
guard let stampTexture = device!.makeTexture(descriptor: texDescriptor) else {
return brushTextureSquare // return SOMETHING
}
let dstData: CFData = (cgImage.dataProvider!.data)!
let pixelData = CFDataGetBytePtr(dstData)
let region = MTLRegionMake2D(0, 0, width, height)
print ("[MetalViewPainting]: w= \(width) | h= \(height) region = \(region.size)")
stampTexture.replace(region: region, mipmapLevel: 0, withBytes: pixelData!, bytesPerRow: Int(rowBytes))
return stampTexture
} // end of func CGImageToMTLTexture (cgImage: CGImage)
The type of distortion looks like a bytes-per-row alignment issue between CGImage and MTLTexture. You're probably only seeing this issue when your image is a certain size that falls outside of the bytes-per-row alignment requirement of your MTLDevice. If you really need to store the texture as a CGImage, ensure that you are using the bytesPerRow value of the CGImage when copying back to the texture.
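A minimal sketch of that fix inside the question's CGImageToMTLTexture (assuming, as the rest of that function does, that the CGImage's backing data is already 8-bit RGBA):
let dstData: CFData = (cgImage.dataProvider!.data)!
let pixelData = CFDataGetBytePtr(dstData)!
let region = MTLRegionMake2D(0, 0, cgImage.width, cgImage.height)
//Use the CGImage's own row stride; rows can be padded beyond width * bytesPerPixel
stampTexture.replace(region: region,
                     mipmapLevel: 0,
                     withBytes: pixelData,
                     bytesPerRow: cgImage.bytesPerRow)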

How to read depth data at a CGPoint from AVDepthData buffer

I am attempting to find the depth data at a certain point in the captured image and return the distance in meters.
I have enabled depth data and am capturing the data alongside the image. I get the point from the X,Y coordinates of the center of the image (and when pressed) and convert it to the buffers index using
Int((width - touchPoint.x) * (height - touchPoint.y))
with WIDTH and HEIGHT being the dimensions of the captured image. I am not sure if this is the correct method to achieve this though.
I handle the depth data as such:
func handlePhotoDepthCalculation(point : Int) {
guard let depth = self.photo else {
return
}
//
// Convert Disparity to Depth
//
let depthData = (depth.depthData as AVDepthData!).converting(toDepthDataType: kCVPixelFormatType_DepthFloat32)
let depthDataMap = depthData.depthDataMap //AVDepthData -> CVPixelBuffer
//
// Set Accuracy feedback
//
let accuracy = depthData.depthDataAccuracy
switch (accuracy) {
case .absolute:
/*
NOTE - Values within the depth map are absolutely
accurate within the physical world.
*/
self.accuracyLbl.text = "Absolute"
break
case .relative:
/*
NOTE - Values within the depth data map are usable for
foreground/background separation, but are not absolutely
accurate in the physical world. iPhone always produces this.
*/
self.accuracyLbl.text = "Relative"
}
//
// We convert the data
//
CVPixelBufferLockBaseAddress(depthDataMap, CVPixelBufferLockFlags(rawValue: 0))
let depthPointer = unsafeBitCast(CVPixelBufferGetBaseAddress(depthDataMap), to: UnsafeMutablePointer<Float32>.self)
//
// Get depth value for image center
//
let distanceAtXYPoint = depthPointer[point]
//Unlock the buffer now that the value has been read
CVPixelBufferUnlockBaseAddress(depthDataMap, CVPixelBufferLockFlags(rawValue: 0))
//
// Set UI
//
self.distanceLbl.text = "\(distanceAtXYPoint) m" //Returns distance in meters?
self.filteredLbl.text = "\(depthData.isDepthDataFiltered)"
}
I am not convinced I am getting the correct position. Also, from my research it looks like accuracy is only reported as .relative or .absolute, not as a float or integer?
To access the depth data at a CGPoint do:
let point = CGPoint(x: 35, y: 26)
let width = CVPixelBufferGetWidth(depthDataMap)
let distanceAtXYPoint = depthPointer[Int(point.y * CGFloat(width) + point.x)]
I hope it works.
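Note that indexing the base address as one flat Float32 array like this assumes the rows are tightly packed (bytesPerRow == width * 4); a CVPixelBuffer's rows may be padded, in which case the row-based addressing shown in the next snippet is the safer approach.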
Access depth data at pixel position:
let depthDataMap: CVPixelBuffer = ...
let pixelX: Int = ...
let pixelY: Int = ...
CVPixelBufferLockBaseAddress(depthDataMap, .readOnly)
let bytesPerRow = CVPixelBufferGetBytesPerRow(depthDataMap)
let baseAddress = CVPixelBufferGetBaseAddress(depthDataMap)!
assert(kCVPixelFormatType_DepthFloat32 == CVPixelBufferGetPixelFormatType(depthDataMap))
let rowData = baseAddress + pixelY * bytesPerRow
let distance = rowData.assumingMemoryBound(to: Float32.self)[pixelX]
CVPixelBufferUnlockBaseAddress(depthDataMap, .readOnly)
For me the values were incorrect and inconsistent when accessing the depth with
let depthPointer = unsafeBitCast(CVPixelBufferGetBaseAddress(depthDataMap), to: UnsafeMutablePointer<Float32>.self)
Values indicating the general accuracy of a depth data map.
The accuracy of a depth data map is highly dependent on the camera calibration data used to generate it. If the camera's focal length cannot be precisely determined at the time of capture, scaling error in the z (depth) plane will be introduced. If the camera's optical center can't be precisely determined at capture time, principal point error will be introduced, leading to an offset error in the disparity estimate.
These values report the accuracy of a map's values with respect to its reported units.
case relative
Values within the depth data map are usable for foreground/background separation, but are not absolutely accurate in the physical world.
case absolute
Values within the depth map are absolutely accurate within the physical world.
You can get the width and height of the AVDepthData buffer as follows, and then use them to map a CGPoint to an index into the buffer.
// Useful data
let width = CVPixelBufferGetWidth(depthDataMap)
let height = CVPixelBufferGetHeight(depthDataMap)
In Apple's sample project they use the code below. texturePoint is the touch point projected onto the Metal view used in the sample project.
// scale
let scale = CGFloat(CVPixelBufferGetWidth(depthFrame)) / CGFloat(CVPixelBufferGetWidth(videoFrame))
let depthPoint = CGPoint(x: CGFloat(CVPixelBufferGetWidth(depthFrame)) - 1.0 - texturePoint.x * scale, y: texturePoint.y * scale)
assert(kCVPixelFormatType_DepthFloat16 == CVPixelBufferGetPixelFormatType(depthFrame))
CVPixelBufferLockBaseAddress(depthFrame, .readOnly)
let rowData = CVPixelBufferGetBaseAddress(depthFrame)! + Int(depthPoint.y) * CVPixelBufferGetBytesPerRow(depthFrame)
// Swift does not have a Float16 data type. Use UInt16 instead, and then translate
var f16Pixel = rowData.assumingMemoryBound(to: UInt16.self)[Int(depthPoint.x)]
CVPixelBufferUnlockBaseAddress(depthFrame, .readOnly)
var f32Pixel = Float(0.0)
var src = vImage_Buffer(data: &f16Pixel, height: 1, width: 1, rowBytes: 2)
var dst = vImage_Buffer(data: &f32Pixel, height: 1, width: 1, rowBytes: 4)
vImageConvert_Planar16FtoPlanarF(&src, &dst, 0)
// Convert the depth frame format to cm
let depthString = String(format: "%.2f cm", f32Pixel * 100)
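As a side note, and assuming a recent enough toolchain and deployment target, newer Swift versions do provide a native Float16 type on Apple platforms, so the vImage round trip can be replaced with a direct conversion along the lines of let depth = Float(rowData.assumingMemoryBound(to: Float16.self)[Int(depthPoint.x)]).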

MTLTexture from CMSampleBuffer has 0 bytesPerRow

I am converting the CMSampleBuffer argument in the captureOutput function of my AVCaptureVideoDataOutput delegate into an MTLTexture like so (side note: I have set the pixel format of the video output to kCVPixelFormatType_32BGRA):
func captureOutput(_ captureOutput: AVCaptureOutput!, didOutputSampleBuffer sampleBuffer: CMSampleBuffer!, from connection: AVCaptureConnection!) {
let imageBuffer = CMSampleBufferGetImageBuffer(sampleBuffer)!
let width = CVPixelBufferGetWidth(imageBuffer)
let height = CVPixelBufferGetHeight(imageBuffer)
var outTexture: CVMetalTexture? = nil
var textCache : CVMetalTextureCache?
CVMetalTextureCacheCreate(kCFAllocatorDefault, nil, metalDevice, nil, &textCache)
var textureRef : CVMetalTexture?
CVMetalTextureCacheCreateTextureFromImage(kCFAllocatorDefault, textCache!, imageBuffer, nil, MTLPixelFormat.bgra8Unorm, width, height, 0, &textureRef)
let texture = CVMetalTextureGetTexture(textureRef!)!
print(texture.bufferBytesPerRow)
}
The issue is that when I print the bytes per row of the texture, it always prints 0, which is problematic because I later try to convert the texture back into a UIImage using the methodology in this article: https://www.invasivecode.com/weblog/metal-image-processing. Why is the texture I receive seemingly empty? I know the CMSampleBuffer itself is fine because I can convert it into a UIImage and draw it like so:
let myPixelBuffer = CMSampleBufferGetImageBuffer(sampleBuffer)
let myCIimage = CIImage(cvPixelBuffer: myPixelBuffer!)
let image = UIImage(ciImage: myCIimage)
self.imageView.image = image
The bufferBytesPerRow property is only meaningful for a texture that was created using the makeTexture(descriptor:offset:bytesPerRow:) method of an MTLBuffer. As you can see, the bytes-per-row is an input to that method to tell Metal how to interpret the data in the buffer. (The texture descriptor provides additional information, too, of course.) This property is only a means to get that value back out.
Note that textures created from buffers can also report which buffer they were created from and the offset supplied to the above method.
Textures created in other ways don't have that information. These textures have no intrinsic bytes-per-row. Their data is not necessarily organized internally in a simple raster buffer.
If/when you want to get the data from a texture to either a Metal buffer or a plain old byte array, you have the freedom to choose a bytes-per-row value that's useful for your purposes, so long as it's at least the bytes-per-pixel of the texture pixel format times the texture's width. (It's more complicated for compressed formats.) The docs for getBytes(_:bytesPerRow:from:mipmapLevel:) and copy(from:sourceSlice:sourceLevel:sourceOrigin:sourceSize:to:destinationOffset:destinationBytesPerRow:destinationBytesPerImage:) explain further.
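For example, here is a minimal sketch of reading a texture back into a plain byte array with a bytes-per-row value you choose yourself (assuming a .bgra8Unorm texture named texture, i.e. four bytes per pixel):
let bytesPerPixel = 4 //bgra8Unorm
let bytesPerRow = texture.width * bytesPerPixel
var bytes = [UInt8](repeating: 0, count: bytesPerRow * texture.height)
let region = MTLRegionMake2D(0, 0, texture.width, texture.height)
//Copy the texture's contents into the byte array using the chosen row stride
texture.getBytes(&bytes, bytesPerRow: bytesPerRow, from: region, mipmapLevel: 0)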

How to scale an image to half size through an array of bytes?

I found many examples of how to scale an image in Windows Forms, but in this case I'm using an array of bytes in a Windows Store application. This is the code snippet I'm using.
// Now that you have the raw bytes, create a Image Decoder
BitmapDecoder decoder = await BitmapDecoder.CreateAsync(fileStream);
// Get the first frame from the decoder because we are picking an image
BitmapFrame frame = await decoder.GetFrameAsync(0);
// Convert the frame into pixels
PixelDataProvider pixelProvider = await frame.GetPixelDataAsync();
// Convert pixels into byte array
srcPixels = pixelProvider.DetachPixelData();
wid = (int)frame.PixelWidth;
hgt =(int)frame.PixelHeight;
// Create an in memory WriteableBitmap of the same size
bitmap = new WriteableBitmap(wid, hgt);
Stream pixelStream = bitmap.PixelBuffer.AsStream();
pixelStream.Seek(0, SeekOrigin.Begin);
// Push the pixels from the original file into the in-memory bitmap
pixelStream.Write(srcPixels, 0, (int)srcPixels.Length);
bitmap.Invalidate();
In this case, it just creates a copy of the stream. I don't know how to manipulate the byte array to reduce the image to half its width and height.
If you look at the MSDN documentation for GetPixelDataAsync, you can see that it has an overload that allows you to specify a BitmapTransform to be applied during the operation.
So you can do something like this in your example code:
// decode a frame (as you do now)
BitmapDecoder decoder = await BitmapDecoder.CreateAsync(fileStream);
BitmapFrame frame = await decoder.GetFrameAsync(0);
// calculate required scaled size
uint newWidth = frame.PixelWidth / 2;
uint newHeight = frame.PixelHeight / 2;
// convert (and resize) the frame into pixels
PixelDataProvider pixelProvider =
await frame.GetPixelDataAsync(
BitmapPixelFormat.Rgba8,
BitmapAlphaMode.Straight,
new BitmapTransform() { ScaledWidth = newWidth, ScaledHeight = newHeight},
ExifOrientationMode.RespectExifOrientation,
ColorManagementMode.DoNotColorManage);
Now, you can call DetachPixelData as in your original code, but this will give you the resized image instead of the full sized image.
srcPixels = pixelProvider.DetachPixelData();
// create an in memory WriteableBitmap of the scaled size
bitmap = new WriteableBitmap((int)newWidth, (int)newHeight);
Stream pixelStream = bitmap.PixelBuffer.AsStream();
pixelStream.Seek(0, SeekOrigin.Begin);
// push the pixels from the original file into the in-memory bitmap
pixelStream.Write(srcPixels, 0, (int)srcPixels.Length);
bitmap.Invalidate();
