I am trying to make an iOS app that does some pre-processing on video from the camera, then sends it out over WebRTC. I do the pre-processing on each individual frame by implementing the AVCaptureVideoDataOutputSampleBufferDelegate protocol and receiving each frame in the captureOutput method.
Now I need to figure out how to send it out on WebRTC. I am using the Google WebRTC library: https://webrtc.googlesource.com/src/.
There is a class called RTCCameraVideoCapturer [(link)][1] that most iOS example apps using this library seem to use. This class accesses the camera itself, so I won't be able to use it. It uses AVCaptureVideoDataOutputSampleBufferDelegate, and in captureOutput, it does this:
RTC_OBJC_TYPE(RTCCVPixelBuffer) *rtcPixelBuffer =
[[RTC_OBJC_TYPE(RTCCVPixelBuffer) alloc] initWithPixelBuffer:pixelBuffer];
int64_t timeStampNs = CMTimeGetSeconds(CMSampleBufferGetPresentationTimeStamp(sampleBuffer)) *
kNanosecondsPerSecond;
RTC_OBJC_TYPE(RTCVideoFrame) *videoFrame =
[[RTC_OBJC_TYPE(RTCVideoFrame) alloc] initWithBuffer:rtcPixelBuffer
rotation:_rotation
timeStampNs:timeStampNs];
[self.delegate capturer:self didCaptureVideoFrame:videoFrame];
[self.delegate capturer:self didCaptureVideoFrame:videoFrame] seems to be the call that feeds a single frame into WebRTC.
How can I write Swift code that will allow me to feed frames into WebRTC one at a time, similar to how it is done in the `RTCCameraVideoCapturer` class?
[1]: https://webrtc.googlesource.com/src/+/refs/heads/master/sdk/objc/components/capturer/RTCCameraVideoCapturer.m
You just need to create an instance of RTCVideoCapturer (which is just a holder for the delegate, localVideoTrack.source) and call its delegate's capturer(_:didCapture:) method with a frame whenever you have a pixel buffer you want to push.
Here is some sample code:
var capturer: RTCVideoCapturer?
let rtcQueue = DispatchQueue(label: "WebRTC")
func appClient(_ client: ARDAppClient!, didReceiveLocalVideoTrack localVideoTrack: RTCVideoTrack!) {
    // The track's source acts as the capturer delegate that receives frames
    capturer = RTCVideoCapturer(delegate: localVideoTrack.source)
}
func render(pixelBuffer: CVPixelBuffer, timesample: CMTime) {
    let buffer = RTCCVPixelBuffer(pixelBuffer: pixelBuffer)
    self.rtcQueue.async {
        // Skip the frame if the local video track has not been received yet
        guard let capturer = self.capturer else { return }
        let frame = RTCVideoFrame(buffer: buffer, rotation: ._0, timeStampNs: Int64(CMTimeGetSeconds(timesample) * Double(NSEC_PER_SEC)))
        capturer.delegate?.capturer(capturer, didCapture: frame)
    }
}
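The sample above converts the presentation timestamp to nanoseconds by going through a Double. As a minimal sketch of an alternative, the conversion can be done with integer arithmetic only. The helper name `nanoseconds(value:timescale:)` is hypothetical (it is not part of WebRTC or CoreMedia), and it assumes the timestamp is a valid numeric time with a positive timescale:

```swift
/// Converts a rational timestamp (value / timescale seconds) to whole
/// nanoseconds using integer arithmetic only.
/// Hypothetical helper for illustration; not part of WebRTC or CoreMedia.
func nanoseconds(value: Int64, timescale: Int32) -> Int64 {
    precondition(timescale > 0, "timescale must be positive")
    let scale = Int64(timescale)
    let nanosPerSecond: Int64 = 1_000_000_000
    // Split into whole seconds and a sub-second remainder so that the
    // intermediate product (remainder * 1e9) stays well inside Int64.
    let wholeSeconds = value / scale
    let remainder = value % scale
    // The fractional part is truncated toward zero.
    return wholeSeconds * nanosPerSecond + remainder * nanosPerSecond / scale
}
```

On a device this could be called as `nanoseconds(value: timesample.value, timescale: timesample.timescale)`, since CMTime stores its value as an Int64 and its timescale as an Int32.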
In the UI of my iOS app, I display a complex hierarchy of CALayers. One of these layers is an AVPlayerLayer that displays a video with CIFilters applied in real time (using AVVideoComposition(asset:, applyingCIFiltersWithHandler:)).
Now I want to export this layer composition to a video file. There are two tools in AVFoundation that seem helpful:
A: AVVideoCompositionCoreAnimationTool which allows rendering a video inside a (possibly animated) CALayer hierarchy
B: AVVideoComposition(asset:, applyingCIFiltersWithHandler:), which I also use in the UI, to apply CIFilters to a video asset.
However, these two tools cannot be used simultaneously: If I start an AVAssetExportSession that combines these tools, AVFoundation throws an NSInvalidArgumentException:
Expecting video composition to contain only AVCoreImageFilterVideoCompositionInstruction
I tried to work around this limitation as follows:
Workaround 1
1) Set up an export using AVAssetReader and AVAssetWriter
2) Obtain the sample buffers from the asset reader and apply the CIFilter, save the result in a CGImage.
3) Set the CGImage as the content of the video layer in the layer hierarchy. Now the layer hierarchy "looks like" one frame of the final video.
4) Obtain the data of the CVPixelBuffer for each frame from the asset writer using CVPixelBufferGetBaseAddress and create a CGContext with that data.
5) Render my layer to that context using CALayer.render(in ctx: CGContext).
This setup works, but is extremely slow - exporting a 5 second video sometimes takes a minute. It looks like the CoreGraphics calls are the bottleneck here (I guess that's because with this approach the composition happens on the CPU?)
Workaround 2
One other approach could be to do this in two steps: First, save the source video just with the filters applied to a file as in B, and then use that video file to embed the video in the layer composition as in A. However, as it uses two passes, I guess this isn't as efficient as it could be.
Summary
What is a good approach to export this video to a file, ideally in a single pass? How can I use CIFilters and AVVideoCompositionCoreAnimationTool simultaneously? Is there a native way to set up a "pipeline" in AVFoundation which combines these tools?
The way to achieve this is to use a custom AVVideoCompositing implementation. This object lets you compose each video frame yourself (in this case, by applying the CIFilter).
Here's an example implementation that applies a CIPhotoEffectNoir effect to the whole video:
class VideoFilterCompositor: NSObject, AVVideoCompositing {
var sourcePixelBufferAttributes: [String : Any]? = [kCVPixelBufferPixelFormatTypeKey as String: kCVPixelFormatType_32BGRA]
var requiredPixelBufferAttributesForRenderContext: [String : Any] = [kCVPixelBufferPixelFormatTypeKey as String: kCVPixelFormatType_32BGRA]
private var renderContext: AVVideoCompositionRenderContext?
func renderContextChanged(_ newRenderContext: AVVideoCompositionRenderContext) {
renderContext = newRenderContext
}
func cancelAllPendingVideoCompositionRequests() {
}
private let filter = CIFilter(name: "CIPhotoEffectNoir")!
private let context = CIContext()
func startRequest(_ asyncVideoCompositionRequest: AVAsynchronousVideoCompositionRequest) {
guard let track = asyncVideoCompositionRequest.sourceTrackIDs.first?.int32Value, let frame = asyncVideoCompositionRequest.sourceFrame(byTrackID: track) else {
asyncVideoCompositionRequest.finish(with: NSError(domain: "VideoFilterCompositor", code: 0, userInfo: nil))
return
}
filter.setValue(CIImage(cvPixelBuffer: frame), forKey: kCIInputImageKey)
if let outputImage = filter.outputImage, let outBuffer = renderContext?.newPixelBuffer() {
context.render(outputImage, to: outBuffer)
asyncVideoCompositionRequest.finish(withComposedVideoFrame: outBuffer)
} else {
asyncVideoCompositionRequest.finish(with: NSError(domain: "VideoFilterCompositor", code: 0, userInfo: nil))
}
}
}
If you need different filters at different times, you can use a custom AVVideoCompositionInstructionProtocol implementation, which you can read back from the AVAsynchronousVideoCompositionRequest (its videoCompositionInstruction property).
Next, you need to use this compositor with your AVMutableVideoComposition:
let videoComposition = AVMutableVideoComposition()
videoComposition.customVideoCompositorClass = VideoFilterCompositor.self
// Add your animation tool as usual; v is the layer that receives the video frames and p is the animation layer that contains it
let animator = AVVideoCompositionCoreAnimationTool(postProcessingAsVideoLayer: v, in: p)
videoComposition.animationTool = animator
// Finish setting up the composition (instructions, renderSize, frameDuration)
With this, you should be able to export the video using a regular AVAssetExportSession by setting its videoComposition property.
I apply real-time effects using Core Image to video that is played using AVPlayer. The problem is that while the player is paused, the filters are not re-applied when you tweak filter parameters with a slider.
let videoComposition = AVMutableVideoComposition(asset: asset, applyingCIFiltersWithHandler: {[weak self] request in
// Clamp to avoid blurring transparent pixels at the image edges
let source = request.sourceImage.clampedToExtent()
let output:CIImage
if let filteredOutput = self?.runFilters(source, filters: array)?.cropped(to: request.sourceImage.extent) {
output = filteredOutput
} else {
output = source
}
// Provide the filter output to the composition
request.finish(with: output, context: nil)
})
As a workaround, I used this answer, which worked until iOS 12.4 but no longer works in iOS 13 beta 6. I am looking for solutions that work on iOS 13.
After reporting this as a bug to Apple and getting some helpful feedback I have a fix:
player.currentItem?.videoComposition = player.currentItem?.videoComposition?.mutableCopy() as? AVVideoComposition
The explanation I got was:
AVPlayer redraws a frame when AVPlayerItem’s videoComposition property gets a new instance or, even if it is the same instance, a property of the instance has been modified.
As a result, you can force a redraw by making a "new" instance, simply by copying the existing one.
I am using iOS 11 and Swift 4, and capturing a picture with the AVFoundation library. I have a custom preview as shown, and my settings are as suggested. The problem is that when I capture and save the CMSampleBuffer, the image is oriented landscape-left. I tried to change the orientation of the AVCapturePhotoOutput, but it does not change (setting photoOutputConnection.videoOrientation has no effect).
if let photoOutputConnection = capturePhotoOutput.connection(with: AVMediaType.video) {
if(photoOutputConnection.isVideoOrientationSupported) {
print("video orientation = \(photoOutputConnection.videoOrientation)")
} else {
print("video orientation is not supported?!")
}
}
Here is the preview (phone screen):
and here is the capture output taken from the Xcode debugger's Quick Look:
Here is the session configuration:
self.capturePhotoOutput = AVCapturePhotoOutput()
capturePhotoOutput.isHighResolutionCaptureEnabled = true
// A Live Photo captures both a still image and a short movie centered on the moment of capture,
// which are presented together in user interfaces such as the Photos app.
capturePhotoOutput.isLivePhotoCaptureEnabled = capturePhotoOutput.isLivePhotoCaptureSupported
guard self.captureSession.canAddOutput(capturePhotoOutput) else { return }
// The sessionPreset property of the capture session defines the resolution and quality level of the video output.
// For most photo capture purposes, it is best set to AVCaptureSessionPresetPhoto to deliver high resolution photo quality output.
self.captureSession.sessionPreset = AVCaptureSession.Preset.photo
self.captureSession.addOutput(capturePhotoOutput)
self.captureSession.commitConfiguration()
I'm accessing the camera in iOS and using session presets like so:
captureSession.sessionPreset = AVCaptureSessionPresetMedium;
Pretty standard stuff. However, I'd like to know ahead of time the resolution of the video I'll be getting from this preset (especially because it differs depending on the device). I know there are tables online where you can look this up (such as here: http://cmgresearch.blogspot.com/2010/10/augmented-reality-on-iphone-with-ios40.html ). But I'd like to be able to get this programmatically so that I'm not just relying on magic numbers.
So, something like this (theoretically):
[captureSession resolutionForPreset:AVCaptureSessionPresetMedium];
which might return a CGSize of { width: 360, height: 480}. I have not been able to find any such API. So far I've had to resort to waiting for my first captured image and querying its size then (which is not good for other reasons in my program flow).
I am no AVFoundation pro, but I think the way to go is:
captureSession.sessionPreset = AVCaptureSessionPresetMedium;
AVCaptureInput *input = [captureSession.inputs objectAtIndex:0]; // maybe search the input in array
AVCaptureInputPort *port = [input.ports objectAtIndex:0];
CMFormatDescriptionRef formatDescription = port.formatDescription;
CMVideoDimensions dimensions = CMVideoFormatDescriptionGetDimensions(formatDescription);
I'm not sure about the last step and I didn't try it myself. Just found that in the documentation and think it should work.
Searching for CMVideoDimensions in Xcode you'll find the RosyWriter example project. Have a look at that code (I don't have time to do that now).
You can programmatically get the resolution from activeFormat before capture begins, though not before adding inputs and outputs: https://developer.apple.com/library/ios/documentation/AVFoundation/Reference/AVCaptureDevice_Class/index.html#//apple_ref/occ/instp/AVCaptureDevice/activeFormat
private func getCaptureResolution() -> CGSize {
// Define default resolution
var resolution = CGSize(width: 0, height: 0)
// Get cur video device
let curVideoDevice = useBackCamera ? backCameraDevice : frontCameraDevice
// Set if video portrait orientation
let portraitOrientation = orientation == .Portrait || orientation == .PortraitUpsideDown
// Get video dimensions
if let formatDescription = curVideoDevice?.activeFormat.formatDescription {
let dimensions = CMVideoFormatDescriptionGetDimensions(formatDescription)
resolution = CGSize(width: CGFloat(dimensions.width), height: CGFloat(dimensions.height))
if (portraitOrientation) {
resolution = CGSize(width: resolution.height, height: resolution.width)
}
}
// Return resolution
return resolution
}
FYI, I attach here an official reply from Apple.
This is a follow-up to Bug ID# 13201137.
Engineering has determined that this issue behaves as intended based on the following information:
There are several problems with the included code:
1) The AVCaptureSession has no inputs.
2) The AVCaptureSession has no outputs.
Without at least one input (added to the session using [AVCaptureSession addInput:]) and a compatible output (added using [AVCaptureSession addOutput:]), there will be no active connections, therefore, the session won't actually run in the input device. It doesn't need to -- there are no outputs to which to deliver any camera data.
3) The JAViewController class assumes that the video port's -formatDescription property will be non nil as soon as [AVCaptureSession startRunning] returns.
There is no guarantee that the format description will be updated with the new camera format as soon as startRunning returns. -startRunning starts up the camera and returns when it is completely up and running, but doesn't wait for video frames to be actively flowing through the capture pipeline, which is when the format description would be updated.
You're just querying too fast. If you waited a few milliseconds more, it would be there. But the right way to do this is to listen for the AVCaptureInputPortFormatDescriptionDidChangeNotification.
4) Your JAViewController class creates a PVCameraInfo object in retrieveCameraInfo: and asks it a question, then lets it fall out of scope, where it is released and dealloc'ed.
Therefore, the session doesn't have long enough to run to satisfy your dimensions request. You stop the camera too quickly.
We consider this issue closed. If you have any questions or concern regarding this issue, please update your report directly (http://bugreport.apple.com).
Thank you for taking the time to notify us of this issue.
Best Regards,
Developer Bug Reporting Team
Apple Worldwide Developer Relations
According to Apple, there's no API for that. It stinks, I've had the same problem.
Maybe you can provide a list of all possible preset resolutions for every iPhone model and check which device model the app is running on, using something like this:
[[UIDevice currentDevice] platformType] // ex: UIDevice4GiPhone
[[UIDevice currentDevice] platformString] // ex: @"iPhone 4G"
However, you have to update the list for each newer device model. Hope this helps :)
If the preset is .photo, the returned size is the still photo size, not the preview video size.
If the preset is not .photo, the returned size is the video size, not the captured photo size.
if self.session.sessionPreset != .photo {
// return video size, not captured photo size
let format = videoDevice.activeFormat
let formatDescription = format.formatDescription
let dimensions = CMVideoFormatDescriptionGetDimensions(formatDescription)
} else {
// other way to get video size
}
@Christian Beer's answer is a good approach for a specified preset.
My approach works for the active preset.
The best way to do what you want (get a known video or image format) is to set the format of the capture device.
First find the capture device you want to use:
if #available(iOS 10.0, *) {
captureDevice = defaultCamera()
} else {
let devices = AVCaptureDevice.devices()
// Loop through all the capture devices on this phone
for device in devices {
// Make sure this particular device supports video
if ((device as AnyObject).hasMediaType(AVMediaType.video)) {
// Finally check the position and confirm we've got the back camera
if((device as AnyObject).position == AVCaptureDevice.Position.back) {
captureDevice = device as AVCaptureDevice
}
}
}
}
self.autoLevelWindowCenter = ALCWindow.frame
if captureDevice != nil && currentUser != nil {
beginSession()
}
}
func defaultCamera() -> AVCaptureDevice? {
if #available(iOS 10.0, *) { // only use the wide angle camera never dual camera
if let device = AVCaptureDevice.default(AVCaptureDevice.DeviceType.builtInWideAngleCamera,
for: AVMediaType.video,
position: .back) {
return device
} else {
return nil
}
} else {
return nil
}
}
Then find the formats that that device can use:
let options = captureDevice!.formats
// Fall back to the first format if no match is found
var supportable = options[0]
for format in options {
    // The description string lists the resolution and supported frame rates
    let description = format.description
    if description.contains("60 fps") && description.contains("1280x 720") {
        supportable = format
    }
}
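Matching on the description string depends on its exact formatting. As a hedged sketch of an alternative, the selection can compare numeric values instead. `FormatInfo` and `indexOfFormat(in:width:height:minFrameRate:)` are hypothetical names introduced here for illustration; on a device, the values would presumably be filled in from `CMVideoFormatDescriptionGetDimensions(format.formatDescription)` and the largest `maxFrameRate` in `format.videoSupportedFrameRateRanges`. Unlike the loop above, this returns the first match rather than the last.

```swift
/// Plain-value summary of a capture format.
/// Hypothetical type for illustration; not part of AVFoundation.
struct FormatInfo {
    let width: Int32
    let height: Int32
    let maxFrameRate: Double
}

/// Returns the index of the first format with exactly the requested
/// dimensions that supports at least `minFrameRate`, or nil if none does.
func indexOfFormat(in formats: [FormatInfo],
                   width: Int32,
                   height: Int32,
                   minFrameRate: Double) -> Int? {
    return formats.firstIndex { info in
        info.width == width &&
        info.height == height &&
        info.maxFrameRate >= minFrameRate
    }
}

// Example data, made up for this sketch.
let exampleFormats = [
    FormatInfo(width: 1280, height: 720, maxFrameRate: 30),
    FormatInfo(width: 1280, height: 720, maxFrameRate: 60),
    FormatInfo(width: 1920, height: 1080, maxFrameRate: 60),
]
let chosenIndex = indexOfFormat(in: exampleFormats, width: 1280, height: 720, minFrameRate: 60)
```

Because the array of summaries keeps the same order as the device's formats array, the returned index can be used to look up the matching AVCaptureDevice.Format.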
You can do more complex parsing of the formats, but you might not care.
Then just set the device to that format:
do {
    try captureDevice?.lockForConfiguration()
    captureDevice!.activeFormat = supportable
    // set up other capture device options such as autofocus, frame rate, ISO, shutter speed, etc.
    captureDevice!.unlockForConfiguration()
    // add the device to an active capture session
    try captureSession.addInput(AVCaptureDeviceInput(device: captureDevice!))
} catch {
    print("Could not configure the capture device: \(error)")
}
You may want to look at the AVFoundation docs and tutorial on AVCaptureSession as there are lots of things you can do with the output as well. For example, you can convert the result to .mp4 using AVAssetExportSession so that you can post it on YouTube, etc.
Hope this helps
The iPhone camera uses a 4:3 aspect ratio for still photos (the .photo preset); some video presets, such as .hd1280x720, are 16:9 instead.
You can use this ratio to get the frame size of the captured video by fixing either the width or the height constraint of the view hosting the AVCaptureVideoPreviewLayer and setting its aspect ratio constraint to 4:3.
In the left image, the width was fixed to 300px and the height was retrieved by setting the 4:3 ratio, and it was 400px.
In the right image, the height was fixed to 300px and width was retrieved by setting the 3:4 ratio, and it was 225px.
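The arithmetic behind those two cases can be sketched as follows. `AspectRatio` and the two helper functions are hypothetical names used only for illustration, and the ratio is written as width : height for a portrait preview (3 : 4), which is an assumption about how the constraint was set up:

```swift
/// An aspect ratio expressed as width : height.
/// Hypothetical type for illustration only.
struct AspectRatio {
    let width: Double
    let height: Double
}

/// Height that keeps `ratio` when the width is fixed.
func height(forWidth width: Double, ratio: AspectRatio) -> Double {
    return width * ratio.height / ratio.width
}

/// Width that keeps `ratio` when the height is fixed.
func width(forHeight height: Double, ratio: AspectRatio) -> Double {
    return height * ratio.width / ratio.height
}

// Portrait preview of a 4:3 sensor: 3 units wide for every 4 units tall.
let portrait = AspectRatio(width: 3, height: 4)
let derivedHeight = height(forWidth: 300, ratio: portrait)  // left image
let derivedWidth = width(forHeight: 300, ratio: portrait)   // right image
```

With a fixed width of 300 this gives a height of 400, and with a fixed height of 300 it gives a width of 225, matching the two images described above.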