How to convert CGImage to OTVideoFrame (iOS)

What is the best way to convert CGImage to OTVideoFrame?
I tried to get the CGImage's underlying pixel data and feed it into an OTVideoFrame, but I got a distorted image.
Here is what I have done:
- Created a new OTVideoFormat object with the ARGB pixel format.
- Set the bytesPerRow of the OTVideoFormat to width * 4. Using the value of CGImageGetBytesPerRow(...) instead did not work: I got no error messages, but also no frames on the other end of the line.
- Copied the rows one by one, truncating each from CGImageGetBytesPerRow(...) bytes down to width * 4 bytes.
- Got a distorted image with the rows slightly shifted.
Here is the code:
func toOTVideoFrame() throws -> OTVideoFrame {
    let width : UInt32 = UInt32(CGImageGetWidth(self)) // self is a CGImage
    let height : UInt32 = UInt32(CGImageGetHeight(self))
    assert(CGImageGetBitsPerPixel(self) == 32)
    assert(CGImageGetBitsPerComponent(self) == 8)

    let bitmapInfo = CGImageGetBitmapInfo(self)
    assert(bitmapInfo.contains(CGBitmapInfo.FloatComponents) == false)
    assert(bitmapInfo.contains(CGBitmapInfo.ByteOrderDefault))
    assert(CGImageGetAlphaInfo(self) == .NoneSkipFirst)

    let bytesPerPixel : UInt32 = 4
    let cgImageBytesPerRow : UInt32 = UInt32(CGImageGetBytesPerRow(self))
    let otFrameBytesPerRow : UInt32 = bytesPerPixel * width

    let videoFormat = OTVideoFormat()
    videoFormat.pixelFormat = .ARGB
    videoFormat.bytesPerRow.addObject(NSNumber(unsignedInt: otFrameBytesPerRow))
    videoFormat.imageWidth = width
    videoFormat.imageHeight = height
    videoFormat.estimatedFramesPerSecond = 15
    videoFormat.estimatedCaptureDelay = 100

    let videoFrame = OTVideoFrame(format: videoFormat)
    videoFrame.timestamp = CMTimeMake(0, 1)         // This is temporary
    videoFrame.orientation = OTVideoOrientation.Up  // This is temporary

    let dataProvider = CGImageGetDataProvider(self)
    let imageData : NSData = CGDataProviderCopyData(dataProvider)!

    let buffer = UnsafeMutablePointer<UInt8>.alloc(Int(otFrameBytesPerRow * height))
    for currentRow in 0..<height {
        let currentRowStartOffsetCGImage = currentRow * cgImageBytesPerRow
        let currentRowStartOffsetOTVideoFrame = currentRow * otFrameBytesPerRow
        let cgImageRange = NSRange(location: Int(currentRowStartOffsetCGImage), length: Int(otFrameBytesPerRow))
        imageData.getBytes(buffer.advancedBy(Int(currentRowStartOffsetOTVideoFrame)),
                           range: cgImageRange)
    }

    do {
        let planes = UnsafeMutablePointer<UnsafeMutablePointer<UInt8>>.alloc(1)
        planes.initialize(buffer)
        videoFrame.setPlanesWithPointers(planes, numPlanes: 1)
        planes.dealloc(1)
    }

    return videoFrame
}
The result image:

I solved this issue on my own.
It appears to be a bug in the OpenTok SDK: the SDK does not seem to be able to handle images whose dimensions are not a multiple of 16. When I changed all image sizes to be a multiple of 16, everything started to work fine.
TokBox does not state this limitation in the API documentation, nor does the SDK throw an exception when the input image size is not a multiple of 16.
This is the second critical bug I have found in the OpenTok SDK. I strongly suggest you do not use this product; it is of very low quality.
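For anyone hitting the same wall, here is a minimal sketch of the workaround, assuming you pad by redrawing into a slightly larger bitmap context (the helper below is hypothetical, not part of the OpenTok API), written in current Swift syntax:

// Hypothetical helper: rounds the dimensions up to the next multiple of 16 and
// redraws the CGImage into a context of that size, working around the
// frame-size limitation described above.
func paddedToMultipleOf16(_ image: CGImage) -> CGImage? {
    let paddedWidth  = (image.width  + 15) / 16 * 16
    let paddedHeight = (image.height + 15) / 16 * 16
    guard let context = CGContext(data: nil,
                                  width: paddedWidth,
                                  height: paddedHeight,
                                  bitsPerComponent: 8,
                                  bytesPerRow: paddedWidth * 4,                        // tightly packed rows, 4 bytes per pixel
                                  space: CGColorSpaceCreateDeviceRGB(),
                                  bitmapInfo: CGImageAlphaInfo.noneSkipFirst.rawValue) // matches the ARGB / NoneSkipFirst assumptions above
    else { return nil }
    // Draw the original image at the origin; the padding area stays blank.
    context.draw(image, in: CGRect(x: 0, y: 0, width: image.width, height: image.height))
    return context.makeImage()
}

The padded image can then go through toOTVideoFrame() unchanged, since its rows are already tightly packed at paddedWidth * 4 bytes.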

Related

OPEN3D: How to convert CVPixelBuffer to o3d.geometry.Image

I'd like to use the o3d.PointCloud.create_from_depth_image function to convert a depth image into a point cloud.
The Open3D docs say the following: "An Open3D Image can be directly converted to/from a numpy array."
I have a CVPixelBuffer coming from the camera.
How do I create an o3d.geometry.Image from a pixel array without saving it to disk first?
Here's my code:
guard let cameraCalibrationData = frame.cameraCalibrationData else { return }
let frameIntrinsics = cameraCalibrationData.intrinsicMatrix
let referenceDimensions = cameraCalibrationData.intrinsicMatrixReferenceDimensions
let width = Float(referenceDimensions.width)
let height = Float(referenceDimensions.height)
let fx = frameIntrinsics.columns.0[0]
let fy = frameIntrinsics.columns.1[1] // fy lives in the second column of the intrinsic matrix
let cx = frameIntrinsics.columns.2[0]
let cy = frameIntrinsics.columns.2[1]
let intrinsics = self.o3d.camera.PinholeCameraIntrinsic()
intrinsics.set_intrinsics(width, height, fx, fy, cx, cy)
//QUESTION HERE:
//how to convert CVPixelBuffer depth to o3d geometry IMAGE ?
let depth : CVPixelBuffer = frame.depthDataMap
let depthImage = self.o3d.geometry.Image()
let cloud = self.o3d.geometry.PointCloud.create_from_depth_image(depthImage, intrinsics)
print(cloud)
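One possible route, assuming self.o3d is bridged through PythonKit and frame.depthDataMap holds 32-bit float depth values (kCVPixelFormatType_DepthFloat32): copy the buffer into a Swift array while respecting the row stride, hand it to numpy, and build the o3d.geometry.Image from that array. This is only a sketch; the np handle and the depthImage(from:) helper are assumptions, not part of the question's code.

import CoreVideo
import PythonKit

let np = Python.import("numpy")
let o3d = Python.import("open3d")

// Sketch: read Float32 depth values out of the CVPixelBuffer row by row
// (respecting bytesPerRow padding), then build an o3d.geometry.Image via numpy.
func depthImage(from pixelBuffer: CVPixelBuffer) -> PythonObject {
    CVPixelBufferLockBaseAddress(pixelBuffer, .readOnly)
    defer { CVPixelBufferUnlockBaseAddress(pixelBuffer, .readOnly) }

    let width = CVPixelBufferGetWidth(pixelBuffer)
    let height = CVPixelBufferGetHeight(pixelBuffer)
    let bytesPerRow = CVPixelBufferGetBytesPerRow(pixelBuffer)
    let base = CVPixelBufferGetBaseAddress(pixelBuffer)!

    var values = [Float](repeating: 0, count: width * height)
    for row in 0..<height {
        let rowPointer = (base + row * bytesPerRow).assumingMemoryBound(to: Float.self)
        for col in 0..<width {
            values[row * width + col] = rowPointer[col]
        }
    }

    // numpy array shaped (height, width), float32, which Open3D accepts for a depth image
    let array = np.array(values, dtype: np.float32).reshape(height, width)
    return o3d.geometry.Image(array)
}

The resulting PythonObject could then be passed to create_from_depth_image in place of the empty Image above.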

MTKView frequently displaying scrambled MTLTextures

I am working on an MTKView-backed paint program which can replay painting history via an array of MTLTextures that store keyframes. I am having an issue in which sometimes the content of these MTLTextures is scrambled.
As an example, say I want to store a section of the drawing below as a keyframe:
During playback, sometimes the drawing will display exactly as intended, but sometimes, it will display like this:
Note the distorted portion of the picture. (The undistorted portion constitutes a static background image that's not part of the keyframe in question)
I describe the way I create individual MTLTextures from the MTKView's currentDrawable below. Because of color-depth issues I won't go into, the process may seem a little roundabout.
- I first get a CGImage of the subsection of the screen that constitutes a keyframe.
- I use that CGImage to create an MTLTexture tied to the MTKView's device.
- I store that MTLTexture in an MTLTextureStructure that holds the MTLTexture and the keyframe's bounding box (which I'll need later).
- Lastly, I store it in an array of MTLTextureStructures (keyframeMetalArray). During playback, when I hit a keyframe, I fetch it from this keyframeMetalArray.
The associated code is outlined below.
let keyframeCGImage = weakSelf!.canvasMetalViewPainting.mtlTextureToCGImage(bbox: keyframeBbox, copyMode: copyTextureMode.textureKeyframe) // convert from MetalTexture to CGImage
let keyframeMTLTexture = weakSelf!.canvasMetalViewPainting.CGImageToMTLTexture(cgImage: keyframeCGImage)
let keyframeMTLTextureStruc = mtlTextureStructure(texture: keyframeMTLTexture, bbox: keyframeBbox, strokeType: brushTypeMode.brush)
weakSelf!.keyframeMetalArray.append(keyframeMTLTextureStruc)
Without providing specifics about how each conversion happens, I wonder if, from an architectural point of view, I'm overlooking something that is corrupting the data stored in keyframeMetalArray. It may be unwise to try to store these MTLTextures in volatile arrays, but I don't know that for a fact; I just figured using MTLTextures would be the quickest way to update content.
By the way, when I swap the arrays of keyframes out for arrays of UIImage.pngData, I have no display issues, but it's a lot slower. On the plus side, it tells me that the initial capture from currentDrawable to keyframeCGImage is working just fine.
Any thoughts would be appreciated.
P.S. Adding a bit of detail based on the feedback:
mtlTextureToCGImage:
func mtlTextureToCGImage(bbox: CGRect, copyMode: copyTextureMode) -> CGImage {
    let kciOptions = [convertFromCIContextOption(CIContextOption.outputPremultiplied): true,
                      convertFromCIContextOption(CIContextOption.useSoftwareRenderer): false] as [String : Any]
    let bboxStrokeScaledFlippedY = CGRect(x: (bbox.origin.x * self.viewContentScaleFactor),
                                          y: ((self.viewBounds.height - bbox.origin.y - bbox.height) * self.viewContentScaleFactor),
                                          width: (bbox.width * self.viewContentScaleFactor),
                                          height: (bbox.height * self.viewContentScaleFactor))
    let strokeCIImage = CIImage(mtlTexture: metalDrawableTextureKeyframe,
                                options: convertToOptionalCIImageOptionDictionary(kciOptions))!.oriented(CGImagePropertyOrientation.downMirrored)
    let imageCropCG = cicontext.createCGImage(strokeCIImage,
                                              from: bboxStrokeScaledFlippedY,
                                              format: CIFormat.RGBA8,
                                              colorSpace: colorSpaceGenericRGBLinear)
    cicontext.clearCaches()
    return imageCropCG!
} // end of func mtlTextureToCGImage(bbox: CGRect)
CGImageToMTLTexture:
func CGImageToMTLTexture (cgImage: CGImage) -> MTLTexture {
    // Note that we forego the more direct method of creating stampTexture:
    //   let stampTexture = try! MTKTextureLoader(device: self.device!).newTexture(cgImage: strokeUIImage.cgImage!, options: nil)
    // because MTKTextureLoader seems to be doing additional processing which messes with the resulting texture/colorspace
    let width = Int(cgImage.width)
    let height = Int(cgImage.height)
    let bytesPerPixel = 4
    let rowBytes = width * bytesPerPixel

    let texDescriptor = MTLTextureDescriptor.texture2DDescriptor(pixelFormat: .rgba8Unorm,
                                                                 width: width,
                                                                 height: height,
                                                                 mipmapped: false)
    texDescriptor.usage = MTLTextureUsage(rawValue: MTLTextureUsage.shaderRead.rawValue)
    texDescriptor.storageMode = .shared

    guard let stampTexture = device!.makeTexture(descriptor: texDescriptor) else {
        return brushTextureSquare // return SOMETHING
    }

    let dstData: CFData = (cgImage.dataProvider!.data)!
    let pixelData = CFDataGetBytePtr(dstData)
    let region = MTLRegionMake2D(0, 0, width, height)
    print ("[MetalViewPainting]: w= \(width) | h= \(height) region = \(region.size)")
    stampTexture.replace(region: region, mipmapLevel: 0, withBytes: pixelData!, bytesPerRow: Int(rowBytes))
    return stampTexture
} // end of func CGImageToMTLTexture (cgImage: CGImage)
The type of distortion looks like a bytes-per-row alignment issue between CGImage and MTLTexture. You're probably only seeing this issue when your image is a certain size that falls outside of the bytes-per-row alignment requirement of your MTLDevice. If you really need to store the texture as a CGImage, ensure that you are using the bytesPerRow value of the CGImage when copying back to the texture.
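A minimal sketch of that last point, keeping the rest of CGImageToMTLTexture as written above: take the row stride from the CGImage itself rather than computing width * bytesPerPixel, since Core Graphics may pad each row.

// Use the CGImage's own row stride when uploading; CGImage rows can be padded
// beyond width * 4, which produces exactly this kind of shifted/scrambled output.
let dstData: CFData = (cgImage.dataProvider!.data)!
let pixelData = CFDataGetBytePtr(dstData)!
let cgBytesPerRow = cgImage.bytesPerRow   // may be larger than width * bytesPerPixel
let region = MTLRegionMake2D(0, 0, cgImage.width, cgImage.height)
stampTexture.replace(region: region,
                     mipmapLevel: 0,
                     withBytes: pixelData,
                     bytesPerRow: cgBytesPerRow)

Alternatively, if the texture really must be fed a tightly packed width * 4 layout, copy the CGImage into an intermediate buffer row by row first, dropping the padding.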

Ways to do inter-frame video compression in AVFoundation

I've created a process to generate video "slideshows" from collections of photographs and images in an application that I'm building. The process works correctly, but it creates unnecessarily large files, given that any photograph included in the video repeats unchanged for 100 to 150 frames. I've applied whatever compression settings I could find in AVFoundation, which mostly apply intra-frame techniques, and tried to find more information on inter-frame compression in AVFoundation. Unfortunately, I've only been able to find a few references, and nothing that has let me get it working.
I'm hoping that someone can steer me in the right direction. The code for the video generator is included below. I've not included the code for fetching and preparing the individual frames (called below via self.getFrame()) since that seems to be working fine and gets quite complex, as it handles photos, videos, adding title frames, and doing fade transitions. For repeated frames, it returns a structure with the frame image and a counter for the number of output frames to include.
// Create a new AVAssetWriter Instance that will build the video
assetWriter = createAssetWriter(path: filePathNew, size: videoSize!)
guard assetWriter != nil else
{
    print("Error converting images to video: AVAssetWriter not created.")
    inProcess = false
    return
}

let writerInput = assetWriter!.inputs.filter{ $0.mediaType == AVMediaTypeVideo }.first!
let sourceBufferAttributes : [String : AnyObject] = [
    kCVPixelBufferPixelFormatTypeKey as String : Int(kCVPixelFormatType_32ARGB) as AnyObject,
    kCVPixelBufferWidthKey as String : videoSize!.width as AnyObject,
    kCVPixelBufferHeightKey as String : videoSize!.height as AnyObject,
    AVVideoMaxKeyFrameIntervalKey as String : 50 as AnyObject,
    AVVideoCompressionPropertiesKey as String : [
        AVVideoAverageBitRateKey: 725000,
        AVVideoProfileLevelKey: AVVideoProfileLevelH264Baseline30,
    ] as AnyObject
]
let pixelBufferAdaptor = AVAssetWriterInputPixelBufferAdaptor(assetWriterInput: writerInput, sourcePixelBufferAttributes: sourceBufferAttributes)

// Start the writing session
assetWriter!.startWriting()
assetWriter!.startSession(atSourceTime: kCMTimeZero)
if (pixelBufferAdaptor.pixelBufferPool == nil) {
    print("Error converting images to video: pixelBufferPool nil after starting session")
    inProcess = false
    return
}

// -- Create queue for <requestMediaDataWhenReadyOnQueue>
let mediaQueue = DispatchQueue(label: "mediaInputQueue")

// Initialize run time values
var presentationTime = kCMTimeZero
var done = false
var nextFrame: FramePack? // The FramePack struct has the frame to output, noDisplays - the number of times that it will be output
                          // and an isLast flag that is true when it's the final frame

writerInput.requestMediaDataWhenReady(on: mediaQueue, using: { () -> Void in // Keeps invoking the block to get input until call markAsFinished
    nextFrame = self.getFrame()          // Get the next frame to be added to the output with its associated values
    let imageCGOut = nextFrame!.frame    // The frame to output
    if nextFrame!.isLast { done = true } // Identifies the last frame so can drop through to markAsFinished() below

    var frames = 0    // Counts how often we've output this frame
    var waitCount = 0 // Used to avoid an infinite loop if there's trouble with writer.Input

    while (frames < nextFrame!.noDisplays) && (waitCount < 1000000) // Need to wait for writerInput to be ready - count deals with potential hung writer
    {
        waitCount += 1
        if waitCount == 1000000 // Have seen it go into 100s of thousands and succeed
        {
            print("Exceeded waitCount limit while attempting to output slideshow frame.")
            self.inProcess = false
            return
        }
        if (writerInput.isReadyForMoreMediaData)
        {
            waitCount = 0
            frames += 1
            autoreleasepool
            {
                if let pixelBufferPool = pixelBufferAdaptor.pixelBufferPool
                {
                    let pixelBufferPointer = UnsafeMutablePointer<CVPixelBuffer?>.allocate(capacity: 1)
                    let status: CVReturn = CVPixelBufferPoolCreatePixelBuffer(
                        kCFAllocatorDefault,
                        pixelBufferPool,
                        pixelBufferPointer
                    )
                    if let pixelBuffer = pixelBufferPointer.pointee, status == 0
                    {
                        CVPixelBufferLockBaseAddress(pixelBuffer, CVPixelBufferLockFlags(rawValue: CVOptionFlags(0)))
                        let pixelData = CVPixelBufferGetBaseAddress(pixelBuffer)
                        let rgbColorSpace = CGColorSpaceCreateDeviceRGB()

                        // Set up a context for rendering using the PixelBuffer allocated above as the target
                        let context = CGContext(
                            data: pixelData,
                            width: Int(self.videoWidth),
                            height: Int(self.videoHeight),
                            bitsPerComponent: 8,
                            bytesPerRow: CVPixelBufferGetBytesPerRow(pixelBuffer),
                            space: rgbColorSpace,
                            bitmapInfo: CGImageAlphaInfo.premultipliedFirst.rawValue
                        )

                        // Draw the image into the PixelBuffer used for the context
                        context?.draw(imageCGOut, in: CGRect(x: 0.0, y: 0.0, width: 1280, height: 720))

                        // Append the image (frame) from the context pixelBuffer onto the video file
                        _ = pixelBufferAdaptor.append(pixelBuffer, withPresentationTime: presentationTime)
                        presentationTime = presentationTime + CMTimeMake(1, videoFPS)

                        // We're done with the PixelBuffer, so unlock it
                        CVPixelBufferUnlockBaseAddress(pixelBuffer, CVPixelBufferLockFlags(rawValue: CVOptionFlags(0)))
                    }
                    pixelBufferPointer.deinitialize()
                    pixelBufferPointer.deallocate(capacity: 1)
                } else {
                    NSLog("Error: Failed to allocate pixel buffer from pool")
                }
            }
        }
    }
Thanks in advance for any suggestions.
It looks like you're appending a bunch of redundant frames to your video, labouring under the misapprehension that video files must have a constant, high frame rate, e.g. 30 fps.
If, for example, you're showing a slideshow of 3 images over a duration of 15 seconds, then you only need to output 3 images, with presentation timestamps of 0 s, 5 s and 10 s and an assetWriter.endSession(atSourceTime:) of 15 s, not 15 s * 30 fps = 450 frames.
In other words, your frame rate is way too high: for the best inter-frame compression money can buy, lower your frame rate to the bare minimum number of frames you need and all will be well*.
*I've seen some video services/players choke on unusually low frame rates, so you may need a minimum frame rate and some redundant frames, e.g. 1 frame every 5 seconds; YMMV.
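As an illustration of that approach, here is a sketch only: the 5-second spacing, the slideImages array, and the makePixelBuffer(from:) helper are assumptions rather than the asker's actual code, and the polling loop is a crude stand-in for the requestMediaDataWhenReady machinery above.

// Sketch: write one pixel buffer per slide, spaced 5 seconds apart,
// instead of repeating each frame 100-150 times.
let slideDuration = CMTime(value: 5, timescale: 1)
var presentationTime = CMTime.zero

for image in slideImages {                          // slideImages: [CGImage], assumed
    while !writerInput.isReadyForMoreMediaData {
        Thread.sleep(forTimeInterval: 0.01)         // crude back-off, sketch only
    }
    let pixelBuffer = makePixelBuffer(from: image)  // hypothetical helper that renders
                                                    // the CGImage into a CVPixelBuffer
    _ = pixelBufferAdaptor.append(pixelBuffer, withPresentationTime: presentationTime)
    presentationTime = presentationTime + slideDuration
}

writerInput.markAsFinished()
// End the session at the time the last slide should stop displaying.
assetWriter.endSession(atSourceTime: presentationTime)
assetWriter.finishWriting { /* handle completion */ }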

Camera viewfinder frame in iOS

I am using the Swift language and trying to take snapshot images from the camera viewfinder buffer. So far everything works well except for the image color, which seems incorrect or swapped. Below are the code snippets where I set the video settings and capture the image frames.
func addVideoOutput() {
    videoDeviceOutput = AVCaptureVideoDataOutput()
    videoDeviceOutput.videoSettings = NSDictionary(objectsAndKeys: Int(kCVPixelFormatType_32BGRA), kCVPixelBufferPixelFormatTypeKey) as [NSObject: AnyObject]
    // kCVPixelFormatType_32ARGB tested and found not supported
    videoDeviceOutput.alwaysDiscardsLateVideoFrames = true
    videoDeviceOutput.setSampleBufferDelegate(self, queue: sessionQueue)
    if captureSession!.canAddOutput(videoDeviceOutput) {
        captureSession!.addOutput(videoDeviceOutput)
    }
}

/* AVCaptureVideoDataOutput Delegate
------------------------------------------*/
func captureOutput(captureOutput: AVCaptureOutput!, didOutputSampleBuffer sampleBuffer: CMSampleBuffer!, fromConnection connection: AVCaptureConnection!) {
    sessionDelegate?.cameraSessionDidOutputSampleBuffer?(sampleBuffer)

    // Extract a UIImage
    //var pixel_buffer : CVPixelBufferRef?
    let pixel_buffer = CMSampleBufferGetImageBuffer(sampleBuffer)
    CVPixelBufferLockBaseAddress(pixel_buffer, 0);

    // Get the base address and the number of bytes per row for the pixel buffer
    var baseAddress = CVPixelBufferGetBaseAddress(pixel_buffer);
    var bytesPerRow = CVPixelBufferGetBytesPerRow(pixel_buffer);

    // Get the pixel buffer width and height
    let width : Int = CVPixelBufferGetWidth(pixel_buffer);
    let height : Int = CVPixelBufferGetHeight(pixel_buffer);

    /* Create a CGImageRef from the CVImageBufferRef */
    let colorSpace: CGColorSpace = CGColorSpaceCreateDeviceRGB()
    let bitmapInfo = CGBitmapInfo(CGImageAlphaInfo.PremultipliedLast.rawValue)
    var newContext = CGBitmapContextCreate(baseAddress, width, height, 8, bytesPerRow, colorSpace, bitmapInfo)
    CVPixelBufferUnlockBaseAddress(pixel_buffer, 0);

    // Get the image frame and save it to local storage
    var refImage: CGImageRef = CGBitmapContextCreateImage(newContext)
    var pixelData = CGDataProviderCopyData(CGImageGetDataProvider(refImage))
    var image: UIImage = UIImage(CGImage: refImage)!;
    self.SaveImageToDocumentStorage(image)
}
As you can see from one of the comments in the addVideoOutput function, I tried the kCVPixelFormatType_32ARGB format, but it is reported as not supported on iOS.
I suspect the problem is that the video format is 32BGRA while the color space for the image frame is set with CGColorSpaceCreateDeviceRGB(), but I could not find any other suitable RGB format for the video settings.
Any solutions or hints are much appreciated.
Thanks
I found the cause and a solution.
Just in case anyone experiences the same problem: change the bitmapInfo as follows:
// let bitmapInfo = CGBitmapInfo(CGImageAlphaInfo.PremultipliedLast.rawValue)
var bitmapInfo = CGBitmapInfo(rawValue: CGImageAlphaInfo.PremultipliedFirst.rawValue) | CGBitmapInfo.ByteOrder32Little
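For reference, the reason this works: kCVPixelFormatType_32BGRA lays each pixel out as B, G, R, A in memory, which Core Graphics describes as a 32-bit little-endian value with alpha first, hence premultipliedFirst combined with byteOrder32Little. In current Swift syntax the same context creation would look roughly like this (a sketch reusing the buffer variables from the delegate above):

// BGRA pixel data = "ARGB, read as a 32-bit little-endian value",
// so combine premultipliedFirst with byteOrder32Little.
let bitmapInfo = CGImageAlphaInfo.premultipliedFirst.rawValue
    | CGBitmapInfo.byteOrder32Little.rawValue

let context = CGContext(data: baseAddress,
                        width: width,
                        height: height,
                        bitsPerComponent: 8,
                        bytesPerRow: bytesPerRow,
                        space: CGColorSpaceCreateDeviceRGB(),
                        bitmapInfo: bitmapInfo)
let cgImage = context?.makeImage()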

Calculating size (in bytes) of an image in memory

I am writing a small app in Swift to resize an image. I would like to calculate the size of the resized image (in bytes/KB). How do I do that?
Here is the piece of code I am working on:
var assetRepresentation : ALAssetRepresentation = asset.defaultRepresentation()
self.originalImageSize = assetRepresentation.size()
selectedImageSize = self.originalImageSize
// now scale the image
let image = selectedImage
let hasAlpha = false
let scale: CGFloat = 0.0 // Automatically use scale factor of main screen
UIGraphicsBeginImageContextWithOptions(sizeChange, !hasAlpha, scale)
image.drawInRect(CGRect(origin: CGPointZero, size: sizeChange))
let scaledImage = UIGraphicsGetImageFromCurrentImageContext()
self.backgroundImage.image = scaledImage
Since scaledImage is not yet saved, how do I go about calculating its size?
Since you're looking to display the size of the file to your user, NSByteCountFormatter is a good solution. It takes a byte count (which you can get from an NSData's length) and can output a String representing the size in a human-readable format (like 1 KB, 2 MB, etc.).
Since you're dealing with a UIImage, though, you'll have to convert the UIImage to NSData to use this, which can be done, for example, with UIImagePNGRepresentation() or UIImageJPEGRepresentation(); these return NSData representing the image in the specified format. A usage example could look something like this:
let data = UIImagePNGRepresentation(scaledImage)
let formatted = NSByteCountFormatter.stringFromByteCount(
Int64(data.length),
countStyle: NSByteCountFormatterCountStyle.File
)
println(formatted)
Edit: If, as suggested by your title, you're looking to show this information with a specific unit of measurement (bytes), this can also be achieved with NSByteCountFormatter. You just have to create an instance of the class and set its allowedUnits property.
let data = UIImagePNGRepresentation(scaledImage)
let formatter = NSByteCountFormatter()
formatter.allowedUnits = NSByteCountFormatterUnits.UseBytes
formatter.countStyle = NSByteCountFormatterCountStyle.File
let formatted = formatter.stringFromByteCount(Int64(data.length))
println(formatted)
I used this to create my image:
var imageBuffer: UnsafeMutablePointer<UInt8> = nil
let ctx = CGBitmapContextCreate(imageBuffer, UInt(width), UInt(height), UInt(8), bitmapBytesPerRow, colorSpace, bitmapInfo)
The imageBuffer is allocated automatically (see the corresponding documentation).
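If what you are after is the in-memory footprint of the decoded bitmap rather than the size of an encoded PNG/JPEG, it is simply the row stride multiplied by the height. A sketch in current Swift syntax, assuming scaledImage is backed by a CGImage:

// In-memory size of the decoded bitmap (not the encoded file size).
if let cgImage = scaledImage?.cgImage {
    let bytesInMemory = cgImage.bytesPerRow * cgImage.height
    print("Decoded bitmap occupies \(bytesInMemory) bytes")
}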
