AVFoundation captureOutput didOutputSampleBuffer Delay - ios

I am using AVFoundation's captureOutput didOutputSampleBuffer to extract an image, which is then used for a filter.
self.bufferFrameQueue = DispatchQueue(label: "bufferFrame queue", qos: DispatchQoS.background, attributes: [], autoreleaseFrequency: .inherit)

self.videoDataOutput = AVCaptureVideoDataOutput()
if self.session.canAddOutput(self.videoDataOutput) {
    self.session.addOutput(videoDataOutput)
    self.videoDataOutput!.alwaysDiscardsLateVideoFrames = true
    self.videoDataOutput!.videoSettings = [kCVPixelBufferPixelFormatTypeKey as String: Int(kCVPixelFormatType_32BGRA)]
    self.videoDataOutput!.setSampleBufferDelegate(self, queue: self.bufferFrameQueue)
}
func captureOutput(_ captureOutput: AVCaptureOutput!, didOutputSampleBuffer sampleBuffer: CMSampleBuffer!, from connection: AVCaptureConnection!) {
    connection.videoOrientation = .portrait

    let pixelBuffer = CMSampleBufferGetImageBuffer(sampleBuffer)!
    let ciImage = CIImage(cvPixelBuffer: pixelBuffer)

    DispatchQueue.main.async {
        self.cameraBufferImage = ciImage
    }
}
The above just updates self.cameraBufferImage whenever there is a new output sample buffer.
Then, when a filter button is pressed, I use self.cameraBufferImage as this:
func filterButtonPressed() {
    if let inputImage = self.cameraBufferImage {
        if let currentFilter = CIFilter(name: "CISepiaTone") {
            currentFilter.setValue(inputImage, forKey: "inputImage")
            currentFilter.setValue(1, forKey: "inputIntensity")
            if let output = currentFilter.outputImage {
                if let cgimg = self.context.createCGImage(output, from: inputImage.extent) {
                    self.filterImageLayer = CALayer()
                    self.filterImageLayer!.frame = self.imagePreviewView.bounds
                    self.filterImageLayer!.contents = cgimg
                    self.filterImageLayer!.contentsGravity = kCAGravityResizeAspectFill
                    self.imagePreviewView.layer.addSublayer(self.filterImageLayer!)
                }
            }
        }
    }
}
When the above method is invoked, it grabs the 'current' self.cameraBufferImage and uses it to apply the filter. This works fine at normal exposure durations (below about 1/15 second).
Issue
When the exposure duration is slow, e.g. 1/3 second, it takes a while (about 1/3 second) to apply the filter. This delay is only present the first time after launch. If done again, there is no delay at all.
Thoughts
I understand that if the exposure duration is 1/3 second, didOutputSampleBuffer only updates every 1/3 second. However, why the initial delay? Shouldn't it just grab whatever self.cameraBufferImage is available at that exact moment, instead of waiting?
Queue issue? (One way to rule this out is sketched below.)
CMSampleBuffer retain issue? (Although in Swift 3 there is no explicit CFRetain; ARC manages the buffer.)
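One way to rule out the queue hypothesis (a minimal sketch, not the original code; it assumes a latestImage property guarded by the existing bufferFrameQueue): keep the newest frame on the delegate queue and read it synchronously when the button is tapped, so no main-queue async hop is involved.

var latestImage: CIImage?   // only touched on bufferFrameQueue

func captureOutput(_ captureOutput: AVCaptureOutput!, didOutputSampleBuffer sampleBuffer: CMSampleBuffer!, from connection: AVCaptureConnection!) {
    guard let pixelBuffer = CMSampleBufferGetImageBuffer(sampleBuffer) else { return }
    latestImage = CIImage(cvPixelBuffer: pixelBuffer)   // already running on bufferFrameQueue
}

func currentFrame() -> CIImage? {
    // Called from the main thread (e.g. the filter button): hop over synchronously and
    // return whatever frame has already arrived, with no main-queue round trip.
    return bufferFrameQueue.sync { latestImage }
}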
Update
Apple's Documentation
Delegates receive this message whenever the output captures and outputs a new video frame, decoding or re-encoding it as specified by its videoSettings property. Delegates can use the provided video frame in conjunction with other APIs for further processing.

This method is called on the dispatch queue specified by the output’s sampleBufferCallbackQueue property. It is called periodically, so it must be efficient to prevent capture performance problems, including dropped frames.

If you need to reference the CMSampleBuffer object outside of the scope of this method, you must CFRetain it and then CFRelease it when you are finished with it.

To maintain optimal performance, some sample buffers directly reference pools of memory that may need to be reused by the device system and other capture inputs. This is frequently the case for uncompressed device native capture where memory blocks are copied as little as possible. If multiple sample buffers reference such pools of memory for too long, inputs will no longer be able to copy new samples into memory and those samples will be dropped.

If your application is causing samples to be dropped by retaining the provided CMSampleBuffer objects for too long, but it needs access to the sample data for a long period of time, consider copying the data into a new buffer and then releasing the sample buffer (if it was previously retained) so that the memory it references can be reused.
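A minimal sketch of that last paragraph's advice (the helper name and the shared CIContext are assumptions): render the camera frame into a pixel buffer you own, so the CIImage you keep around no longer references memory from the capture pool.

func copiedCameraImage(from sampleBuffer: CMSampleBuffer, using context: CIContext) -> CIImage? {
    guard let source = CMSampleBufferGetImageBuffer(sampleBuffer) else { return nil }

    // Make a pixel buffer we own, matching the source dimensions and format.
    var copy: CVPixelBuffer?
    CVPixelBufferCreate(kCFAllocatorDefault,
                        CVPixelBufferGetWidth(source),
                        CVPixelBufferGetHeight(source),
                        kCVPixelFormatType_32BGRA,
                        nil,
                        &copy)
    guard let owned = copy else { return nil }

    // Render the pool-backed frame into our buffer; the capture buffer can then be recycled.
    context.render(CIImage(cvPixelBuffer: source), to: owned)
    return CIImage(cvPixelBuffer: owned)
}

A CIImage created with CIImage(cvPixelBuffer:) retains the buffer it was given, so keeping one per frame can pin memory from the capture pool; copying like this breaks that link.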

Related

Feed frames one at a time into WebRTC iOS

I am trying to make an iOS app that does some pre-processing on video from the camera, then sends it out over WebRTC. I am doing the pre-processing on each individual frame using the AVCaptureVideoDataOutputSampleBufferDelegate protocol and then capturing the frame with the captureOutput method.
Now I need to figure out how to send it out on WebRTC. I am using the Google WebRTC library: https://webrtc.googlesource.com/src/.
There is a class called RTCCameraVideoCapturer (https://webrtc.googlesource.com/src/+/refs/heads/master/sdk/objc/components/capturer/RTCCameraVideoCapturer.m) that most iOS example apps using this library seem to use. This class accesses the camera itself, so I won't be able to use it. It uses AVCaptureVideoDataOutputSampleBufferDelegate, and in captureOutput, it does this:
RTC_OBJC_TYPE(RTCCVPixelBuffer) *rtcPixelBuffer =
    [[RTC_OBJC_TYPE(RTCCVPixelBuffer) alloc] initWithPixelBuffer:pixelBuffer];
int64_t timeStampNs = CMTimeGetSeconds(CMSampleBufferGetPresentationTimeStamp(sampleBuffer)) *
    kNanosecondsPerSecond;
RTC_OBJC_TYPE(RTCVideoFrame) *videoFrame =
    [[RTC_OBJC_TYPE(RTCVideoFrame) alloc] initWithBuffer:rtcPixelBuffer
                                                rotation:_rotation
                                             timeStampNs:timeStampNs];
[self.delegate capturer:self didCaptureVideoFrame:videoFrame];
`[self.delegate capturer:self didCaptureVideoFrame:videoFrame]` seems to be the call that feeds a single frame into WebRTC.
How can I write swift code that will allow me to feed frames into webRTC one at a time, similar to how it is done in the `RTCCameraVideoCapturer` class?
You just need to create an instance of RTCVideoCapturer (which is just a holder of the delegate, localVideoTrack.source) and call its delegate method capturer(_:didCapture:) with a frame whenever you have a pixel buffer you want to push. Here is some sample code.
var capturer: RTCVideoCapturer?
let rtcQueue = DispatchQueue(label: "WebRTC")

func appClient(_ client: ARDAppClient!, didReceiveLocalVideoTrack localVideoTrack: RTCVideoTrack!) {
    capturer = RTCVideoCapturer(delegate: localVideoTrack.source)
}

func render(pixelBuffer: CVPixelBuffer, timesample: CMTime) {
    let buffer = RTCCVPixelBuffer(pixelBuffer: pixelBuffer)
    self.rtcQueue.async {
        let frame = RTCVideoFrame(buffer: buffer, rotation: ._0, timeStampNs: Int64(CMTimeGetSeconds(timesample) * Double(NSEC_PER_SEC)))
        self.capturer?.delegate?.capturer(self.capturer!, didCapture: frame)
    }
}
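If the pre-processing happens in captureOutput, the bridge into the render(pixelBuffer:timesample:) helper above might look like this (a sketch using the current delegate signature; the pre-processing step itself is assumed):

func captureOutput(_ output: AVCaptureOutput,
                   didOutput sampleBuffer: CMSampleBuffer,
                   from connection: AVCaptureConnection) {
    guard let pixelBuffer = CMSampleBufferGetImageBuffer(sampleBuffer) else { return }

    // ... run your pre-processing on pixelBuffer here ...

    let timestamp = CMSampleBufferGetPresentationTimeStamp(sampleBuffer)
    render(pixelBuffer: pixelBuffer, timesample: timestamp)
}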

Rendering a video in a CALayer hierarchy using CIFilters

In the UI of my iOS app, I display a complex hierarchy of CALayers. One of these layers is an AVPlayerLayer that displays a video with CIFilters applied in real time (using AVVideoComposition(asset:applyingCIFiltersWithHandler:)).
Now I want to export this layer composition to a video file. There are two tools in AVFoundation that seem helpful:
A: AVVideoCompositionCoreAnimationTool, which allows rendering a video inside a (possibly animated) CALayer hierarchy
B: AVVideoComposition(asset:applyingCIFiltersWithHandler:), which I also use in the UI, to apply CIFilters to a video asset.
However, these two tools cannot be used simultaneously: If I start an AVAssetExportSession that combines these tools, AVFoundation throws an NSInvalidArgumentException:
Expecting video composition to contain only AVCoreImageFilterVideoCompositionInstruction
I tried to work around this limitation as follows:
Workaround 1
1) Setup an export using AVAssetReader and AVAssetWriter
2) Obtain the sample buffers from the asset reader, apply the CIFilter, and save the result in a CGImage.
3) Set the CGImage as the content of the video layer in the layer hierarchy. Now the layer hierarchy "looks like" one frame of the final video.
4) Obtain the data of the CVPixelBuffer for each frame from the asset writer using CVPixelBufferGetBaseAddress and create a CGContext with that data.
5) Render my layer to that context using CALayer.render(in ctx: CGContext).
This setup works, but is extremely slow - exporting a 5 second video sometimes takes a minute. It looks like the CoreGraphics calls are the bottleneck here (I guess that's because with this approach the composition happens on the CPU?)
Workaround 2
One other approach could be to do this in two steps: First, save the source video just with the filters applied to a file as in B, and then use that video file to embed the video in the layer composition as in A. However, as it uses two passes, I guess this isn't as efficient as it could be.
Summary
What is a good approach to export this video to a file, ideally in a single pass? How can I use CIFilters and AVVideoCompositionCoreAnimationTool simultaneously? Is there a native way to set up a "pipeline" in AVFoundation which combines these tools?
The way to achieve this is by using a custom AVVideoCompositing. This object allows you to compose (in this case, apply the CIFilter to) each video frame.
Here's an example implementation that applies a CIPhotoEffectNoir effect to the whole video:
class VideoFilterCompositor: NSObject, AVVideoCompositing {

    var sourcePixelBufferAttributes: [String : Any]? = [kCVPixelBufferPixelFormatTypeKey as String: kCVPixelFormatType_32BGRA]
    var requiredPixelBufferAttributesForRenderContext: [String : Any] = [kCVPixelBufferPixelFormatTypeKey as String: kCVPixelFormatType_32BGRA]

    private var renderContext: AVVideoCompositionRenderContext?

    func renderContextChanged(_ newRenderContext: AVVideoCompositionRenderContext) {
        renderContext = newRenderContext
    }

    func cancelAllPendingVideoCompositionRequests() {
    }

    private let filter = CIFilter(name: "CIPhotoEffectNoir")!
    private let context = CIContext()

    func startRequest(_ asyncVideoCompositionRequest: AVAsynchronousVideoCompositionRequest) {
        guard let track = asyncVideoCompositionRequest.sourceTrackIDs.first?.int32Value,
              let frame = asyncVideoCompositionRequest.sourceFrame(byTrackID: track) else {
            asyncVideoCompositionRequest.finish(with: NSError(domain: "VideoFilterCompositor", code: 0, userInfo: nil))
            return
        }
        filter.setValue(CIImage(cvPixelBuffer: frame), forKey: kCIInputImageKey)
        if let outputImage = filter.outputImage, let outBuffer = renderContext?.newPixelBuffer() {
            context.render(outputImage, to: outBuffer)
            asyncVideoCompositionRequest.finish(withComposedVideoFrame: outBuffer)
        } else {
            asyncVideoCompositionRequest.finish(with: NSError(domain: "VideoFilterCompositor", code: 0, userInfo: nil))
        }
    }
}
If you need different filters at different times, you can use a custom AVVideoCompositionInstructionProtocol, which you can get from the AVAsynchronousVideoCompositionRequest.
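A minimal sketch of that idea (FilterInstruction is a hypothetical name): let each instruction carry its own filter and time range, then read it back inside startRequest(_:) via the request's videoCompositionInstruction property.

// Hypothetical custom instruction that carries the filter to apply during its time range.
final class FilterInstruction: NSObject, AVVideoCompositionInstructionProtocol {
    var timeRange: CMTimeRange
    var enablePostProcessing: Bool = true      // keep true so the Core Animation tool still runs
    var containsTweening: Bool = false
    var requiredSourceTrackIDs: [NSValue]? = nil
    var passthroughTrackID: CMPersistentTrackID = kCMPersistentTrackID_Invalid
    let filter: CIFilter

    init(timeRange: CMTimeRange, filter: CIFilter) {
        self.timeRange = timeRange
        self.filter = filter
        super.init()
    }
}

// Inside startRequest(_:), pick the filter from the current instruction instead of a fixed one:
// if let instruction = asyncVideoCompositionRequest.videoCompositionInstruction as? FilterInstruction {
//     instruction.filter.setValue(CIImage(cvPixelBuffer: frame), forKey: kCIInputImageKey)
//     ...
// }

The instructions would then be assigned to videoComposition.instructions, each covering its own time range.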
Next, you need to use this with your AVMutableVideoComposition, so:
let videoComposition = AVMutableVideoComposition()
videoComposition.customVideoCompositorClass = VideoFilterCompositor.self
//Add your animator tool as usual
let animator = AVVideoCompositionCoreAnimationTool(postProcessingAsVideoLayer: v, in: p)
videoComposition.animationTool = animator
//Finish setting up the composition
With this, you should be able to export the video using a regular AVAssetExportSession, setting its videoComposition.
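For completeness, a sketch of that export step (asset and outputURL are assumed to be your source asset and destination file):

let export = AVAssetExportSession(asset: asset, presetName: AVAssetExportPresetHighestQuality)
export?.videoComposition = videoComposition
export?.outputURL = outputURL
export?.outputFileType = .mov
export?.exportAsynchronously {
    if export?.status == .completed {
        print("Export finished")
    } else {
        print("Export failed: \(String(describing: export?.error))")
    }
}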

Force Redraw of AVPlayerLayer when it is paused on iOS 13

I apply real-time effects using Core Image to video that is played using AVPlayer. The problem is that when the player is paused, filters are not applied if you tweak filter parameters using a slider.
let videoComposition = AVMutableVideoComposition(asset: asset, applyingCIFiltersWithHandler: { [weak self] request in
    // Clamp to avoid blurring transparent pixels at the image edges
    let source = request.sourceImage.clampedToExtent()

    let output: CIImage
    if let filteredOutput = self?.runFilters(source, filters: array)?.cropped(to: request.sourceImage.extent) {
        output = filteredOutput
    } else {
        output = source
    }

    // Provide the filter output to the composition
    request.finish(with: output, context: nil)
})
As a workaround, I used this answer, which worked until iOS 12.4 but no longer works in iOS 13 beta 6. I am looking for solutions that work on iOS 13.
After reporting this as a bug to Apple and getting some helpful feedback I have a fix:
player.currentItem?.videoComposition = player.currentItem?.videoComposition?.mutableCopy() as? AVVideoComposition
The explanation I got was:
AVPlayer redraws a frame when AVPlayerItem’s videoComposition property gets a new instance or, even if it is the same instance, a property of the instance has been modified.
As a result, forcing a redraw can be achieved by making a 'new' instance simply by copying the existing instance.
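A usage sketch (the slider handler and the filterIntensity property are assumptions; player comes from the code above): only force the copy while the player is paused, since playback already re-renders every frame.

@objc func sliderValueChanged(_ sender: UISlider) {
    filterIntensity = CGFloat(sender.value)   // assumed property read by the filter handler

    // While paused, copying the composition forces AVPlayer to re-render the current frame.
    if player.timeControlStatus == .paused {
        player.currentItem?.videoComposition =
            player.currentItem?.videoComposition?.mutableCopy() as? AVVideoComposition
    }
}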

How can I convert a CMSampleBuffer with image data to a format suitable for sending over a network connection?

I want to send frames of a video stream over a network connection, so I have implemented the AVCaptureVideoDataOutputSampleBufferDelegate function:
func captureOutput(_ output: AVCaptureOutput, didOutput sampleBuffer: CMSampleBuffer, from connection: AVCaptureConnection)
How should I convert the CMSampleBuffer to Data, given that the NWConnection function:
func send(content: Data?, contentContext: NWConnection.ContentContext = default, isComplete: Bool = default, completion: NWConnection.SendCompletion)
that I'm using for networking expects Data for its content parameter?
You presumably want to compress the video frames before sending them over the network, because uncompressed video frames might require more bandwidth than you have available. And you'll want to use the hardware compressor for speed.
You can access the hardware compressor and decompressor using the VideoToolbox framework.
You should watch WWDC 2014 session 513, “Direct Access to Video Encoding and Decoding”. Here's a quote from the introduction:
And accompanying that, there's the case where you have a stream of images coming in from the camera or someplace else and you'd like to compress those but get direct access to those compressed sample buffers so that you can send them out over the network or do whatever you like with them.
You can find a transcript of the session at ASCIIwwdc.
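A rough sketch of that VideoToolbox route (illustrative only; error handling and packetizing the encoder output are omitted): create one VTCompressionSession up front, then hand it each pixel buffer from captureOutput.

import VideoToolbox

var compressionSession: VTCompressionSession?

func makeCompressionSession(width: Int32, height: Int32) {
    let status = VTCompressionSessionCreate(allocator: kCFAllocatorDefault,
                                            width: width,
                                            height: height,
                                            codecType: kCMVideoCodecType_H264,
                                            encoderSpecification: nil,
                                            imageBufferAttributes: nil,
                                            compressedDataAllocator: nil,
                                            outputCallback: nil,
                                            refcon: nil,
                                            compressionSessionOut: &compressionSession)
    guard status == noErr, let session = compressionSession else { return }

    VTSessionSetProperty(session, key: kVTCompressionPropertyKey_RealTime, value: kCFBooleanTrue)
    VTCompressionSessionPrepareToEncodeFrames(session)
}

func encode(_ sampleBuffer: CMSampleBuffer) {
    guard let session = compressionSession,
          let pixelBuffer = CMSampleBufferGetImageBuffer(sampleBuffer) else { return }

    let pts = CMSampleBufferGetPresentationTimeStamp(sampleBuffer)
    VTCompressionSessionEncodeFrame(session,
                                    imageBuffer: pixelBuffer,
                                    presentationTimeStamp: pts,
                                    duration: .invalid,
                                    frameProperties: nil,
                                    infoFlagsOut: nil) { _, _, compressedSampleBuffer in
        // compressedSampleBuffer now holds H.264 data (plus a format description containing
        // the parameter sets); extract its data buffer and packetize it before sending.
    }
}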
Please try this. It is working for me.
guard let cvBuffer = CMSampleBufferGetImageBuffer(sampleBuffer) else {
    return
}
// Get a CIImage out of the CVImageBuffer
let ciImage = CIImage(cvImageBuffer: cvBuffer)
// Get a UIImage out of the CIImage
let uiImage = UIImage(ciImage: ciImage)
// Get the frame as JPEG image data
let frames = uiImage.jpegData(compressionQuality: 0.5)!
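With that Data in hand, the NWConnection call from the question might look like this (a sketch; connection is the assumed NWConnection instance, and you will probably also want some framing, such as a length prefix, so the receiver knows where each JPEG ends):

// Send one JPEG frame over the connection from the question.
connection.send(content: frames, completion: .contentProcessed({ error in
    if let error = error {
        print("Send failed: \(error)")
    }
}))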

Problems accurately timing the loading of an image from file into UIImageView

I am trying to measure the time taken to load a large photo (JPEG) from file into a UIImageView on iOS 8.0.
My current code:
import UIKit

class ViewController: UIViewController {

    @IBOutlet weak var imageView: UIImageView!

    @IBAction func loadImage(sender: UIButton) {
        if let imageFile = NSBundle.mainBundle().pathForResource("large_photo", ofType: "jpg") {
            // start our timer
            let tick = Tick()

            // loads a very large image file into imageView
            // the test photo used is a 4608 × 3456 pixel JPEG
            // using contentsOfFile: to prevent caching while testing timer
            imageView.image = UIImage(contentsOfFile: imageFile)

            // stop our timer and print execution time
            tick.tock()
        }
    }
}

class Tick {
    let tickTime: NSDate

    init() {
        tickTime = NSDate()
    }

    func tock() {
        let tockTime = NSDate()
        let executionTime = tockTime.timeIntervalSinceDate(tickTime)
        println("[execution time]: \(executionTime)")
    }
}
When I load a very large image (4608 x 3456 JPEG) on my test device (5th gen iPod touch), I can see that the execution time is ~2-3 seconds and blocks the main thread. This is observable by the fact that the UIButton remains in a highlighted state for this period of time and no other UI elements allow interaction.
I would therefore expect my timing function to report a time of ~2-3 seconds. However, it reports a time in milliseconds, e.g.:
[execution time]: 0.0116159915924072
This tick.tock() prints the message to the Console before the image is loaded. This confuses me, as the main thread appears blocked until after the image is loaded.
This leads me to ask the following questions:
1) If the image is being loaded asynchronously in the background, then why is user interaction/the main thread blocked?
2) If the image is being loaded on the main thread, why does the tick.tock() function print to the console before the image is displayed?
There are 2 parts to what you are measuring here:
1) Loading the image from disk:
UIImage(contentsOfFile: imageFile)
2) Decompressing the image from a JPEG into a bitmap to be displayed:
imageView.image = ...
The first part involves actually retrieving the compressed JPEG data from the disk (disk I/O) and creating a UIImage object. The UIImage object holds a reference to the compressed data, until it needs to be displayed. Only at the moment that it's ready to be rendered to the screen does it decompress the image into a bitmap to display (on the main thread).
My guess is that your timer is only catching the disk-load part, and the decompression is happening on the next run loop. Decompressing an image of that size is likely to take a while, probably the lion's share of the time.
If you want to explicitly measure how long the decompression takes, you'll need to do it manually, by drawing the image to an off-screen context, like so:
let tick = Tick()

// Load the image from disk
let image = UIImage(contentsOfFile: imageFile)

// Decompress the image into a bitmap
var newImage: UIImage
UIGraphicsBeginImageContextWithOptions(image.size, true, 0)
image.drawInRect(CGRect(x: 0, y: 0, width: image.size.width, height: image.size.height))
newImage = UIGraphicsGetImageFromCurrentImageContext()
UIGraphicsEndImageContext()

tick.tock()
Here we are replicating the decompression that would happen when you assign the image to imageView.image.
A handy trick to keep the UI responsive when dealing with images this size is to kick the whole process onto a background thread. This works well because once you have manually decompressed the image, UIKit detects this and doesn't repeat the process.
// Switch to a background thread
dispatch_async(dispatch_get_global_queue(Int(DISPATCH_QUEUE_PRIORITY_DEFAULT.value), 0)) {
    // Load the image from disk
    let image = UIImage(contentsOfFile: imageFile)

    // Ref to the decompressed image
    var newImage: UIImage

    // Decompress the image into a bitmap
    UIGraphicsBeginImageContextWithOptions(image.size, true, 0)
    image.drawInRect(CGRect(x: 0, y: 0, width: image.size.width, height: image.size.height))
    newImage = UIGraphicsGetImageFromCurrentImageContext()
    UIGraphicsEndImageContext()

    // Switch back to the main thread
    dispatch_async(dispatch_get_main_queue()) {
        // Display the decompressed image
        imageView.image = newImage
    }
}
A disclaimer: The code here has not been fully tested in Xcode, but it's 99% correct if you decide to use it.
I would try to time this using a unit test, since the XCTest framework provides some good performance measurement tools. I think this approach would get around the lazy loading issues... although I'm not 100% on it.
func testImagePerformance() {
    let date = NSDate()
    measureBlock() {
        if let imageFile = NSBundle.mainBundle().pathForResource("large_photo", ofType: "jpg") {
            imageView.image = UIImage(contentsOfFile: imageFile)
        }
    }
}
(Just an aside, you mentioned that the loading blocks the main app thread... you should look into using an NSOperationQueue to make sure that doesn't happen... you probably already know that: http://nshipster.com/nsoperation/)
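For what it's worth, an NSOperationQueue version of that suggestion might look like this (same Swift 1.x era APIs as the answer above; an untested sketch):

let queue = NSOperationQueue()
queue.addOperationWithBlock {
    // Load (and optionally pre-decompress) the image off the main thread
    let image = UIImage(contentsOfFile: imageFile)
    NSOperationQueue.mainQueue().addOperationWithBlock {
        imageView.image = image
    }
}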
