I am trying to make an iOS app that does some pre-processing on video from the camera, then sends it out over webrtc. I am doing the pre-processing on each individual frame using the AVCaptureVideoDataOutputSampleBufferDelegate protocol and then capturing the frame with the captureOutput method.
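For context, here is roughly how I am grabbing the frames today (a minimal sketch only; the actual pre-processing is omitted, and names like videoQueue are placeholders):

import AVFoundation

let videoOutput = AVCaptureVideoDataOutput()
let videoQueue = DispatchQueue(label: "videoQueue")

func configureCapture(session: AVCaptureSession, delegate: AVCaptureVideoDataOutputSampleBufferDelegate) {
    videoOutput.videoSettings = [kCVPixelBufferPixelFormatTypeKey as String: kCVPixelFormatType_32BGRA]
    videoOutput.setSampleBufferDelegate(delegate, queue: videoQueue)
    if session.canAddOutput(videoOutput) {
        session.addOutput(videoOutput)
    }
    // Each frame then arrives in captureOutput(_:didOutput:from:) as a CMSampleBuffer,
    // where I do the per-frame pre-processing.
}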
Now I need to figure out how to send it out on WebRTC. I am using the Google WebRTC library: https://webrtc.googlesource.com/src/.
There is a class called RTCCameraVideoCapturer [(link)][1] that most iOS example apps using this library seem to use. This class accesses the camera itself, so I won't be able to use it. It uses AVCaptureVideoDataOutputSampleBufferDelegate, and in captureOutput, it does this:
RTC_OBJC_TYPE(RTCCVPixelBuffer) *rtcPixelBuffer =
    [[RTC_OBJC_TYPE(RTCCVPixelBuffer) alloc] initWithPixelBuffer:pixelBuffer];
int64_t timeStampNs =
    CMTimeGetSeconds(CMSampleBufferGetPresentationTimeStamp(sampleBuffer)) * kNanosecondsPerSecond;
RTC_OBJC_TYPE(RTCVideoFrame) *videoFrame =
    [[RTC_OBJC_TYPE(RTCVideoFrame) alloc] initWithBuffer:rtcPixelBuffer
                                                rotation:_rotation
                                             timeStampNs:timeStampNs];
[self.delegate capturer:self didCaptureVideoFrame:videoFrame];
[self.delegate capturer:self didCaptureVideoFrame:videoFrame] seems to be the call that feeds a single frame into WebRTC.
How can I write Swift code that feeds frames into WebRTC one at a time, similar to how it is done in the `RTCCameraVideoCapturer` class?
[1]: https://webrtc.googlesource.com/src/+/refs/heads/master/sdk/objc/components/capturer/RTCCameraVideoCapturer.m
You just need to create an instance of RTCVideoCapturer (which is just a holder of the delegate, localVideoTrack.source) and call its delegate method capturer(_:didCapture:) with a frame whenever you have a pixel buffer you want to push.
Here is some sample code:
var capturer: RTCVideoCapturer?
let rtcQueue = DispatchQueue(label: "WebRTC")

func appClient(_ client: ARDAppClient!, didReceiveLocalVideoTrack localVideoTrack: RTCVideoTrack!) {
    capturer = RTCVideoCapturer(delegate: localVideoTrack.source)
}

func render(pixelBuffer: CVPixelBuffer, timesample: CMTime) {
    let buffer = RTCCVPixelBuffer(pixelBuffer: pixelBuffer)
    self.rtcQueue.async {
        let frame = RTCVideoFrame(buffer: buffer,
                                  rotation: ._0,
                                  timeStampNs: Int64(CMTimeGetSeconds(timesample) * Double(NSEC_PER_SEC)))
        self.capturer?.delegate?.capturer(self.capturer!, didCapture: frame)
    }
}
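For example, wiring this into the captureOutput delegate method from the question could look like this (a sketch, assuming render(pixelBuffer:timesample:) is defined as above):

func captureOutput(_ output: AVCaptureOutput,
                   didOutput sampleBuffer: CMSampleBuffer,
                   from connection: AVCaptureConnection) {
    guard let pixelBuffer = CMSampleBufferGetImageBuffer(sampleBuffer) else { return }
    // ... run your per-frame pre-processing on pixelBuffer here ...
    render(pixelBuffer: pixelBuffer,
           timesample: CMSampleBufferGetPresentationTimeStamp(sampleBuffer))
}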
Related
I'm trying to stream a CMSampleBuffer video / audio combo using WebRTC on iOS, but I'm running into trouble trying to capture audio. Video works just fine:
guard let pixelBuffer = CMSampleBufferGetImageBuffer(sampleBuffer) else {
    print("couldn't get image from buffer :~(")
    return
}

let rtcPixelBuffer = RTCCVPixelBuffer(pixelBuffer: pixelBuffer)
let rtcVideoFrame = RTCVideoFrame(buffer: rtcPixelBuffer, rotation: ._0, timeStampNs: timeStampNs)
videoSource.capturer(videoCapturer, didCapture: rtcVideoFrame)
When it comes to audio, I can't see any method on the RTCAudioSource class for capturing audio. Any help would be appreciated!
I found a fork of the WebRTC codebase which solves this issue by adding a way for audio samples to be captured by an RTCAudioDeviceModule:
https://github.com/pixiv/webrtc/blob/87.0.4280.142-pixiv0/README.pixiv.en.md
In the UI of my iOS app, I display a complex hierarchy of CALayers. One of these layers is an AVPlayerLayer that displays a video with CIFilters applied in real time (using AVVideoComposition(asset:, applyingCIFiltersWithHandler:)).
Now I want to export this layer composition to a video file. There are two tools in AVFoundation that seem helpful:
A: AVVideoCompositionCoreAnimationTool which allows rendering a video inside a (possibly animated) CALayer hierarchy
B: AVVideoComposition(asset:, applyingCIFiltersWithHandler:), which I also use in the UI, to apply CIFilters to a video asset.
However, these two tools cannot be used simultaneously: If I start an AVAssetExportSession that combines these tools, AVFoundation throws an NSInvalidArgumentException:
Expecting video composition to contain only AVCoreImageFilterVideoCompositionInstruction
I tried to work around this limitation as follows:
Workaround 1
1) Setup an export using AVAssetReader and AVAssetWriter
2) Obtain the sample buffers from the asset reader and apply the CIFilter, save the result in a CGImage.
3) Set the CGImage as the content of the video layer in the layer hierarchy. Now the layer hierarchy "looks like" one frame of the final video.
4) Obtain the data of the CVPixelBuffer for each frame from the asset writer using CVPixelBufferGetBaseAddress and create a CGContext with that data.
5) Render my layer to that context using CALayer.render(in ctx: CGContext).
This setup works, but it is extremely slow: exporting a 5-second video sometimes takes a minute. The Core Graphics calls appear to be the bottleneck (I suspect because, with this approach, the composition happens on the CPU).
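For reference, steps 4 and 5 look roughly like this (a simplified sketch, not my exact code; the pixel buffers are assumed to be BGRA):

func render(layer: CALayer, into pixelBuffer: CVPixelBuffer) {
    CVPixelBufferLockBaseAddress(pixelBuffer, [])
    defer { CVPixelBufferUnlockBaseAddress(pixelBuffer, []) }

    guard let context = CGContext(data: CVPixelBufferGetBaseAddress(pixelBuffer),
                                  width: CVPixelBufferGetWidth(pixelBuffer),
                                  height: CVPixelBufferGetHeight(pixelBuffer),
                                  bitsPerComponent: 8,
                                  bytesPerRow: CVPixelBufferGetBytesPerRow(pixelBuffer),
                                  space: CGColorSpaceCreateDeviceRGB(),
                                  bitmapInfo: CGImageAlphaInfo.premultipliedFirst.rawValue | CGBitmapInfo.byteOrder32Little.rawValue) else { return }

    // CALayer.render(in:) rasterizes the whole layer tree on the CPU,
    // which seems to be where most of the export time goes.
    layer.render(in: context)
}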
Workaround 2
One other approach could be to do this in two steps: first, save the source video to a file with just the filters applied, as in B, and then use that video file to embed the video in the layer composition, as in A. However, since this uses two passes, I suspect it isn't as efficient as it could be. A rough sketch of what I mean follows.
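Here is the first pass sketched out (asset, filter, and temporaryURL are placeholders, not code I have actually shipped):

let filteredComposition = AVVideoComposition(asset: asset) { request in
    filter.setValue(request.sourceImage, forKey: kCIInputImageKey)
    request.finish(with: filter.outputImage ?? request.sourceImage, context: nil)
}

let pass1 = AVAssetExportSession(asset: asset, presetName: AVAssetExportPresetHighestQuality)!
pass1.videoComposition = filteredComposition
pass1.outputURL = temporaryURL
pass1.outputFileType = .mov
pass1.exportAsynchronously {
    // Pass 2: load temporaryURL as a new AVAsset and export it again,
    // this time with AVVideoCompositionCoreAnimationTool (approach A).
}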
Summary
What is a good approach to export this video to a file, ideally in a single pass? How can I use CIFilters and AVVideoCompositionCoreAnimationTool simultaneously? Is there a native way to set up a "pipeline" in AVFoundation which combines these tools?
The way to achieve this is to use a custom AVVideoCompositing. This object allows you to compose (in this case, apply the CIFilter to) each video frame.
Here's an example implementation that applies a CIPhotoEffectNoir effect to the whole video:
class VideoFilterCompositor: NSObject, AVVideoCompositing {

    var sourcePixelBufferAttributes: [String: Any]? = [kCVPixelBufferPixelFormatTypeKey as String: kCVPixelFormatType_32BGRA]
    var requiredPixelBufferAttributesForRenderContext: [String: Any] = [kCVPixelBufferPixelFormatTypeKey as String: kCVPixelFormatType_32BGRA]

    private var renderContext: AVVideoCompositionRenderContext?

    func renderContextChanged(_ newRenderContext: AVVideoCompositionRenderContext) {
        renderContext = newRenderContext
    }

    func cancelAllPendingVideoCompositionRequests() {
    }

    private let filter = CIFilter(name: "CIPhotoEffectNoir")!
    private let context = CIContext()

    func startRequest(_ asyncVideoCompositionRequest: AVAsynchronousVideoCompositionRequest) {
        guard let track = asyncVideoCompositionRequest.sourceTrackIDs.first?.int32Value,
              let frame = asyncVideoCompositionRequest.sourceFrame(byTrackID: track) else {
            asyncVideoCompositionRequest.finish(with: NSError(domain: "VideoFilterCompositor", code: 0, userInfo: nil))
            return
        }
        filter.setValue(CIImage(cvPixelBuffer: frame), forKey: kCIInputImageKey)
        if let outputImage = filter.outputImage, let outBuffer = renderContext?.newPixelBuffer() {
            context.render(outputImage, to: outBuffer)
            asyncVideoCompositionRequest.finish(withComposedVideoFrame: outBuffer)
        } else {
            asyncVideoCompositionRequest.finish(with: NSError(domain: "VideoFilterCompositor", code: 0, userInfo: nil))
        }
    }
}
If you need different filters at different times, you can use a custom AVVideoCompositionInstructionProtocol implementation, which you can read back from the AVAsynchronousVideoCompositionRequest.
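As a sketch of that idea (an assumption about one way to model it, not code from the project above), a custom instruction could carry a filter name for its time range, and startRequest(_:) could read it off the request:

import AVFoundation
import CoreMedia

// Hypothetical instruction type carrying a per-time-range filter name.
final class FilterInstruction: NSObject, AVVideoCompositionInstructionProtocol {
    let timeRange: CMTimeRange
    let enablePostProcessing = true
    let containsTweening = true
    let requiredSourceTrackIDs: [NSValue]? = nil
    let passthroughTrackID = kCMPersistentTrackID_Invalid
    let filterName: String

    init(timeRange: CMTimeRange, filterName: String) {
        self.timeRange = timeRange
        self.filterName = filterName
        super.init()
    }
}

// Inside startRequest(_:), before applying the filter:
// if let instruction = asyncVideoCompositionRequest.videoCompositionInstruction as? FilterInstruction {
//     // use CIFilter(name: instruction.filterName) instead of the fixed filter
// }

These instructions would then go into videoComposition.instructions alongside the custom compositor class.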
Next, you need to use this with your AVMutableVideoComposition, so:
let videoComposition = AVMutableVideoComposition()
videoComposition.customVideoCompositorClass = VideoFilterCompositor.self
// Add your animator tool as usual
let animator = AVVideoCompositionCoreAnimationTool(postProcessingAsVideoLayer: v, in: p)
videoComposition.animationTool = animator
// Finish setting up the composition
With this, you should be able to export the video using a regular AVAssetExportSession, setting its videoComposition property.
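For completeness, a minimal export sketch might look like this (asset and outputURL are assumed to exist, and the composition's renderSize, frameDuration, and instructions still need to be configured as usual):

let export = AVAssetExportSession(asset: asset, presetName: AVAssetExportPresetHighestQuality)!
export.videoComposition = videoComposition
export.outputURL = outputURL
export.outputFileType = .mov
export.exportAsynchronously {
    print("Export finished with status \(export.status.rawValue)")
}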
I apologise in advance for the "dumb" question, but I feel I have exhausted all resources. I have little to no experience with Swift and coding in general, but I understand a lot based on past experience with object-based programming such as MAX MSP.
I am attempting to develop a camera/microphone capture iOS app for the macOS QuickTime Player recording function (answering my own need for RAW camera access as I literally could not find the right thing out there!).
Having implemented AVCaptureSession video output successfully, I have tried many methods of sending audio to QuickTime (including AVAudioSessionPortUSBAudio) to no avail. This was before I realised that QuickTime automatically captures the iOS system audio output.
So my presumption was that I should be able to preview audio under AVCaptureSession easily; not so! It seems AVCaptureAudioPreviewOutput is "not available" in Swift 4, or I am simply missing some basics. I have seen articles on Stack Overflow mentioning the need to stop audio processing, so I'm hopeful it is easy to preview/monitor it.
Could any of you point me to a method of previewing audio in an AVCaptureSession? I still have an instantiated AVAudioSession (my original attempt), and have also just managed (I hope) to successfully connect the mic to the AVCaptureSession. However, I am not sure what else to use! My aim is just to hear the mic input on the system's audio output; the QuickTime connection should (hopefully) handle capturing from the USB port (music played on the phone goes over USB when the iOS device is selected as the microphone).
let audioDevice = AVCaptureDevice.default(for: AVMediaType.audio)

do {
    let audioInput = try AVCaptureDeviceInput(device: audioDevice!)
    self.captureSession.addInput(audioInput)
} catch {
    print("Unable to add Audio Device")
}
I have also attempted other things, which I am getting lost with:
captureSession.automaticallyConfiguresApplicationAudioSession = true
func showAudioPreview() -> Bool { return true }
Perhaps it is possible to use AVAudioSession alongside the capture? However, my basic knowledge points to the fact that there are problems running Capture and Audio Sessions together.
Any help would be sincerely appreciated, I am sure many of you will roll your eyes and be able to easily point out my mistakes!
Thanks,
Iwan
AVCaptureAudioPreviewOutput is only available on macOS, but you could instead use AVSampleBufferAudioRenderer. You have to manually enqueue audio CMSampleBuffers to it, which an AVCaptureAudioDataOutput can provide:
import UIKit
import AVFoundation

class ViewController: UIViewController, AVCaptureAudioDataOutputSampleBufferDelegate {
    let session = AVCaptureSession()
    let bufferRenderSyncer = AVSampleBufferRenderSynchronizer()
    let bufferRenderer = AVSampleBufferAudioRenderer()

    override func viewDidLoad() {
        super.viewDidLoad()

        bufferRenderSyncer.addRenderer(bufferRenderer)

        let audioDevice = AVCaptureDevice.default(for: .audio)!
        let captureInput = try! AVCaptureDeviceInput(device: audioDevice)
        let audioOutput = AVCaptureAudioDataOutput()
        audioOutput.setSampleBufferDelegate(self, queue: DispatchQueue.main) // or some other dispatch queue

        session.addInput(captureInput)
        session.addOutput(audioOutput)
        session.startRunning()
    }

    func captureOutput(_ output: AVCaptureOutput, didOutput sampleBuffer: CMSampleBuffer, from connection: AVCaptureConnection) {
        bufferRenderer.enqueue(sampleBuffer)

        if bufferRenderSyncer.rate == 0 {
            bufferRenderSyncer.setRate(1, time: sampleBuffer.presentationTimeStamp)
        }
    }
}
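Depending on your setup, you may also need to configure the shared AVAudioSession so that input and output run at the same time (this is an assumption on my part rather than something the code above requires on every device):

// Allow recording from the mic and playing through the renderer simultaneously.
try? AVAudioSession.sharedInstance().setCategory(.playAndRecord, options: [.defaultToSpeaker])
try? AVAudioSession.sharedInstance().setActive(true)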
I'm trying to use AKSampler on a simple iOS project to load a file and play it when tapping the device's screen.
I did the same steps with AKSamplePlayer and it worked fine, but I'd rather use AKSampler, and I also have a strong feeling I'm missing something.
I've tried the play() method, and also the one with the MIDI note.
Which one is right? Do they both work? Besides, AudioKit looks so promising.
Here is my code:
import UIKit
import AudioKit

class ViewController: UIViewController
{
    var sampler = AKSampler()
    var tapRecognizer = UITapGestureRecognizer()

    override func viewDidLoad()
    {
        super.viewDidLoad()

        do
        {
            let file = try AKAudioFile(readFileName: "AH_G2.wav")
            try sampler.loadAudioFile(file)
        }
        catch
        {
            print("No Such File...")
        }

        view.addGestureRecognizer(tapRecognizer)
        view.isUserInteractionEnabled = true
        tapRecognizer.addTarget(self, action: #selector(viewTapped))

        AudioKit.output = sampler
        AudioKit.start()
    }

    @objc private func viewTapped()
    {
        sampler.play(noteNumber: 60, velocity: 80, channel: 0)
        print("tapped...")
    }
}
Edit:
My problem is actually with the loadAudioFile method; the AKAudioFile itself is good, and the AKSampler plays a default sine sound.
I also tried the AKAudioFile methods for creating a player and a sampler; the player worked, but the sampler didn't.
let file = try AKAudioFile(readFileName: "AH_G2.wav")
player = file.player
sampler = file.sampler
I also tried adding the wav file using the menu; no change.
If you look at the implementation, there is just the one play() method, but it has default values for noteNumber, velocity, and channel:
@objc open func play(noteNumber: MIDINoteNumber = 60,
                     velocity: MIDIVelocity = 127,
                     channel: MIDIChannel = 0) {
    samplerUnit.startNote(noteNumber, withVelocity: velocity, onChannel: channel)
}
Changing the MIDI note will change the pitch/speed of the sample playback (60 is standard, 72 is double speed, 48 would be half speed, etc.), and changing the velocity will change the volume.
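For example, with the defaults shown above:

sampler.play()                              // MIDI note 60: original pitch and speed
sampler.play(noteNumber: 72)                // one octave up: roughly double speed
sampler.play(noteNumber: 48, velocity: 64)  // one octave down: roughly half speed, quieter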
NB: the title of your post is 'AKSampler doesn't play', but I ran your code (changing the sample, of course) and it played just fine on my iPad.
I've tried a different audio file and it worked fine.
The first file was a mono file, so my conclusion here is that the AKSampler does not support mono files. Would love to hear more on that.
I am using AVFoundation's captureOutput(_:didOutputSampleBuffer:from:) to extract an image that is then used for a filter.
self.bufferFrameQueue = DispatchQueue(label: "bufferFrame queue", qos: DispatchQoS.background, attributes: [], autoreleaseFrequency: .inherit)

self.videoDataOutput = AVCaptureVideoDataOutput()
if self.session.canAddOutput(self.videoDataOutput) {
    self.session.addOutput(videoDataOutput)
    self.videoDataOutput!.alwaysDiscardsLateVideoFrames = true
    self.videoDataOutput!.videoSettings = [kCVPixelBufferPixelFormatTypeKey as String: Int(kCVPixelFormatType_32BGRA)]
    self.videoDataOutput!.setSampleBufferDelegate(self, queue: self.bufferFrameQueue)
}
func captureOutput(_ captureOutput: AVCaptureOutput!, didOutputSampleBuffer sampleBuffer: CMSampleBuffer!, from connection: AVCaptureConnection!) {
    connection.videoOrientation = .portrait

    let pixelBuffer = CMSampleBufferGetImageBuffer(sampleBuffer)!
    let ciImage = CIImage(cvPixelBuffer: pixelBuffer)

    DispatchQueue.main.async {
        self.cameraBufferImage = ciImage
    }
}
The above just updates self.cameraBufferImage any time there's a new output sample buffer.
Then, when a filter button is pressed, I use self.cameraBufferImage like this:
func filterButtonPressed() {
    if var inputImage = self.cameraBufferImage {
        if let currentFilter = CIFilter(name: "CISepiaTone") {
            currentFilter.setValue(inputImage, forKey: "inputImage")
            currentFilter.setValue(1, forKey: "inputIntensity")
            if let output = currentFilter.outputImage {
                if let cgimg = self.context.createCGImage(output, from: inputImage.extent) {
                    self.filterImageLayer = CALayer()
                    self.filterImageLayer!.frame = self.imagePreviewView.bounds
                    self.filterImageLayer!.contents = cgimg
                    self.filterImageLayer!.contentsGravity = kCAGravityResizeAspectFill
                    self.imagePreviewView.layer.addSublayer(self.filterImageLayer!)
                }
            }
        }
    }
}
When the above method is invoked, it grabs the 'current' self.cameraBufferImage and uses it to apply the filter. This works fine at normal exposure durations (below 1/15 of a second or so).
Issue
When the exposure duration is long, e.g. 1/3 of a second, it takes a while (about 1/3 of a second) to apply the filter. This delay is only present the first time after launch. If done again, there is no delay at all.
Thoughts
I understand that if the exposure duration is 1/3 of a second, didOutputSampleBuffer only updates every 1/3 of a second. However, why the initial delay? Shouldn't it just grab whatever self.cameraBufferImage is available at that exact moment, instead of waiting?
Queue issue?
CMSampleBuffer retain issue? (Although on Swift 3, there is no CFRetain)
Update
Apple's Documentation
Delegates receive this message whenever the output captures and outputs a new video frame, decoding or re-encoding it as specified by its videoSettings property. Delegates can use the provided video frame in conjunction with other APIs for further processing.

This method is called on the dispatch queue specified by the output's sampleBufferCallbackQueue property. It is called periodically, so it must be efficient to prevent capture performance problems, including dropped frames.

If you need to reference the CMSampleBuffer object outside of the scope of this method, you must CFRetain it and then CFRelease it when you are finished with it.

To maintain optimal performance, some sample buffers directly reference pools of memory that may need to be reused by the device system and other capture inputs. This is frequently the case for uncompressed device native capture where memory blocks are copied as little as possible. If multiple sample buffers reference such pools of memory for too long, inputs will no longer be able to copy new samples into memory and those samples will be dropped.

If your application is causing samples to be dropped by retaining the provided CMSampleBuffer objects for too long, but it needs access to the sample data for a long period of time, consider copying the data into a new buffer and then releasing the sample buffer (if it was previously retained) so that the memory it references can be reused.