AVAssetWriter async video and audio after calling broadcastPaused() - ios

I am trying to record a video with sound from device screen using ReplayKit and RPBroadcastSampleHandler.
When i record it just using "start broadcast" and "stop" the result i get is great.
But if i try to pause recording (by using red status bar) i got problems. The result i got is video and audio with different length (audio is shorter but have all i need). On the recording i got video and audio that start being async after moment of tapping status bar(ios14). Audio goes good, but video freezing when status bar tapped and continue when modal window closed. As result i got video without audio in the end.
Here is my code:
1.All class fields i have:
class SampleHandler: RPBroadcastSampleHandler {
private let videoService = VideoService()
private let audioService = AudioService()
private var isRecording = false
private let lock = NSLock()
private var finishCalled = false
private var videoWriter: AVAssetWriter!
private var videoWriterInput: AVAssetWriterInput!
private var microphoneWriterInput: AVAssetWriterInput!
private var sessionBeginAtSourceTime: CMTime!
2.Some configure on start capturing:
override func broadcastStarted(withSetupInfo setupInfo: [String : NSObject]?) {
guard !isRecording else { return }
isRecording = true
BroadcastData.clear()
BroadcastData.startVideoDate = Date()
BroadcastData.status = .writing
sessionBeginAtSourceTime = nil
configurateVideoWriter()
}
private func configurateVideoWriter() {
let outputFileLocation = videoService.getVideoFileLocation()
videoWriter = try? AVAssetWriter.init(outputURL: outputFileLocation,
fileType: AVFileType.mov)
configurateVideoWriterInput()
configurateMicrophoneWriterInput()
if videoWriter.canAdd(videoWriterInput) { videoWriter.add(videoWriterInput) }
if videoWriter.canAdd(microphoneWriterInput) { videoWriter.add(microphoneWriterInput) }
videoWriter.startWriting()
}
private func configurateVideoWriterInput() {
let RESOLUTION_COEF: CGFloat = 16
let naturalWidth = UIScreen.main.bounds.width
let naturalHeight = UIScreen.main.bounds.height
let width = naturalWidth - naturalWidth.truncatingRemainder(dividingBy: RESOLUTION_COEF)
let height = naturalHeight - naturalHeight.truncatingRemainder(dividingBy: RESOLUTION_COEF)
let videoSettings: [String: Any] = [
AVVideoCodecKey: AVVideoCodecType.h264,
AVVideoWidthKey: width,
AVVideoHeightKey: height
]
videoWriterInput = AVAssetWriterInput(mediaType: .video,
outputSettings: videoSettings)
videoWriterInput.expectsMediaDataInRealTime = true
}
private func configurateMicrophoneWriterInput() {
let audioOutputSettings: [String : Any] = [
AVFormatIDKey: kAudioFormatMPEG4AAC,
AVNumberOfChannelsKey : 1,
AVSampleRateKey : 44100.0,
AVEncoderBitRateKey: 96000
]
microphoneWriterInput = AVAssetWriterInput(mediaType: .audio,
outputSettings: audioOutputSettings)
microphoneWriterInput.expectsMediaDataInRealTime = true
}
3.Write process:
override func processSampleBuffer(_ sampleBuffer: CMSampleBuffer, with sampleBufferType:
RPSampleBufferType) {
guard isRecording && videoWriter?.status == .writing else { return }
if BroadcastData.status != .writing {
isRecording = false
finishBroadCast()
return
}
if sessionBeginAtSourceTime == nil {
sessionBeginAtSourceTime = CMSampleBufferGetPresentationTimeStamp(sampleBuffer)
videoWriter.startSession(atSourceTime: sessionBeginAtSourceTime!)
}
switch sampleBufferType {
case .video:
if videoWriterInput.isReadyForMoreMediaData {
videoWriterInput.append(sampleBuffer)
}
case .audioMic:
if microphoneWriterInput.isReadyForMoreMediaData {
microphoneWriterInput.append(sampleBuffer)
}
case .audioApp:
break
#unknown default:
print("unknown")
}
}
4.Pause and resume
override func broadcastPaused() {
super.broadcastPaused()
}
override func broadcastResumed() {
super.broadcastResumed()
}

Pausing and resuming the recording creates a gap in the video presentation timestamps and a discontinuity in the audio, which I believe explains your symptoms.
What you need to do is measure how long the recording was paused for, possibly using the sample buffer timestamps, and then subtract that offset from the presentation timestamps of all the subsequent CMSampleBuffer's that you process. CMSampleBufferCreateCopyWithNewTiming() can help you with this.

Related

Using AVKit to detect luminosity

I am working on an app using SwiftUi that leverages the device camera to detect luminosity as described in top answer of this post. The captureOutput(_:didOutput:from:) function in the top answer was used to calculate luminosity. According to Apple Docs this function is intended to notify a delegate that a new video frame was written, and so I have placed this function in a VideoDelegate class. This delegate is then set in a VideoStream class that handles the logic of asking permissions and setting up an AVCaptureSession. My question is how to access the luminosity value calculated within the delegate inside my SwiftUI view?
struct ContentView: View {
#StateObject var videoStream = VideoStream()
var body: some View {
Text("\(videoStream.luminosityReading) ?? Detecting...")
.padding()
}
}
class VideoStream: ObservableObject {
#Published var luminosityReading : Double = 0.0 // TODO get luminosity from VideoDelegate
var session : AVCaptureSession!
init() {
authorizeCapture()
}
func authorizeCapture() {
// permission logic and call to beginCapture()
}
func beginCapture() {
session = AVCaptureSession()
session.beginConfiguration()
let videoDevice = bestDevice() // func definition omitted for readability
guard
let videoDeviceInput = try? AVCaptureDeviceInput(device: videoDevice),
session.canAddInput(videoDeviceInput)
else {
print("Camera selection failed")
return
}
let videoOutput = AVCaptureVideoDataOutput()
guard
session.canAddOutput(videoOutput)
else {
print("Error creating video output")
return
}
session.sessionPreset = .high
session.addOutput(videoOutput)
let queue = DispatchQueue(label: "VideoFrameQueue")
let delegate = VideoDelegate()
videoOutput.setSampleBufferDelegate(delegate, queue: queue)
session.commitConfiguration()
session.startRunning()
}
}
class VideoDelegate: NSObject, AVCaptureVideoDataOutputSampleBufferDelegate {
func captureOutput(_ output: AVCaptureOutput, didOutput sampleBuffer: CMSampleBuffer, from connection: AVCaptureConnection) {
//Retrieving EXIF data of camara frame buffer
let rawMetadata = CMCopyDictionaryOfAttachments(allocator: nil, target: sampleBuffer, attachmentMode: CMAttachmentMode(kCMAttachmentMode_ShouldPropagate))
let metadata = CFDictionaryCreateMutableCopy(nil, 0, rawMetadata) as NSMutableDictionary
let exifData = metadata.value(forKey: "{Exif}") as? NSMutableDictionary
let FNumber : Double = exifData?["FNumber"] as! Double
let ExposureTime : Double = exifData?["ExposureTime"] as! Double
let ISOSpeedRatingsArray = exifData!["ISOSpeedRatings"] as? NSArray
let ISOSpeedRatings : Double = ISOSpeedRatingsArray![0] as! Double
let CalibrationConstant : Double = 50
//Calculating the luminosity
let luminosity : Double = (CalibrationConstant * FNumber * FNumber ) / ( ExposureTime * ISOSpeedRatings )
// how to pass value of luminosity to `VideoStream`?
}
}
As discussed in the comments, the lowest friction option would be to have VideoStream conform to AVCaptureVideoDataOutputSampleBufferDelegate and implement the delegate method there.

How to send CMSampleBuffer to WebRTC?

So I am using Replaykit to try stream my phone screen on a web browser.
override func processSampleBuffer(_ sampleBuffer: CMSampleBuffer, with sampleBufferType: RPSampleBufferType) {
//if source!.isSocketConnected {
switch sampleBufferType {
case RPSampleBufferType.video:
// Handle video sample buffer
break
case RPSampleBufferType.audioApp:
// Handle audio sample buffer for app audio
break
case RPSampleBufferType.audioMic:
// Handle audio sample buffer for mic audio
break
#unknown default:
break
}
}
So how do we send that data to WebRTC?
In order to use WebRTC, I learned that you need a signaling server.
Is it possible to start a signaling server on your mobile, just like http server?
Hi Sam WebRTC have one function which can process CMSampleBuffer frames to get Video Frames. But it is working with CVPixelBuffer. So you have to firstly convert your CMSampleBuffer to CVPixelBuffer. And than add this frames into your localVideoSource with RTCVideoCapturer. i have solved similar problem on AVCaptureVideoDataOutputSampleBufferDelegate. This delegate produces CMSampleBuffer as ReplayKit. i hope that below code lines could be help to you. You can try at the below code lines to solve your problem.
private var videoCapturer: RTCVideoCapturer?
private var localVideoSource = RTCClient.factory.videoSource()
private var localVideoTrack: RTCVideoTrack?
private var remoteVideoTrack: RTCVideoTrack?
private var peerConnection: RTCPeerConnection? = nil
public static let factory: RTCPeerConnectionFactory = {
RTCInitializeSSL()
let videoEncoderFactory = RTCDefaultVideoEncoderFactory()
let videoDecoderFactory = RTCDefaultVideoDecoderFactory()
return RTCPeerConnectionFactory(encoderFactory: videoEncoderFactory, decoderFactory: videoDecoderFactory)
}()
extension RTCClient : AVCaptureVideoDataOutputSampleBufferDelegate {
func captureOutput(_ output: AVCaptureOutput, didOutput sampleBuffer: CMSampleBuffer, from connection: AVCaptureConnection) {
print("didOutPut: \(sampleBuffer)")
guard let imageBuffer: CVImageBuffer = CMSampleBufferGetImageBuffer(sampleBuffer) else { return }
let timeStampNs: Int64 = Int64(CMTimeGetSeconds(CMSampleBufferGetPresentationTimeStamp(sampleBuffer)) * 1000000000)
let rtcPixlBuffer = RTCCVPixelBuffer(pixelBuffer: imageBuffer)
let rtcVideoFrame = RTCVideoFrame(buffer: rtcPixlBuffer, rotation: ._90, timeStampNs: timeStampNs)
self.localVideoSource.capturer(videoCapturer!, didCapture: rtcVideoFrame)
}
}
Also you need configuration like that for mediaSender,
func createMediaSenders() {
let streamId = "stream"
let videoTrack = self.createVideoTrack()
self.localVideoTrack = videoTrack
self.peerConnection!.add(videoTrack, streamIds: [streamId])
self.remoteVideoTrack = self.peerConnection!.transceivers.first { $0.mediaType == .video }?.receiver.track as? RTCVideoTrack
}
private func createVideoTrack() -> RTCVideoTrack {
let videoTrack = RTCClient.factory.videoTrack(with: self.videoSource, trackId: "video0")
return videoTrack
}

Unwanted "smoothing" in AVDepthData on iPhone 13 (not evident in iPhone 12)

We are writing an app which analyzes a real world 3D data by using the TrueDepth camera on the front of an iPhone, and an AVCaptureSession configured to produce AVDepthData along with image data. This worked great on iPhone 12, but the same code on iPhone 13 produces an unwanted "smoothing" effect which makes the scene impossible to process and breaks our app. We are unable to find any information on this effect, from Apple or otherwise, much less how to avoid it, so we are asking you experts.
At the bottom of this post (Figure 3) is our code which configures the capture session, using an AVCaptureDataOutputSynchronizer, to produce frames of 640x480 image and depth data. I boiled it down as much as possible, sorry it's so long. The main two parts are the configure function, which sets up our capture session, and the dataOutputSynchronizer function, near the bottom, which fires when a sycned set of data is available. In the latter function I've included my code which extracts the information from the AVDepthData object, including looping through all 640x480 depth data points (in meters). I've excluded further processing for brevity (believe it or not :)).
On an iPhone 12 device, the PNG data and the depth data merge nicely. The front view and side view of the merged pointcloud are below (Figure 1) . The angles visible in the side view are due to the application of the focal length which "de-perspectives" the data and places them in their proper position in xyz space.
The same code on an iPhone 13 produces depth maps that result in point cloud further below (Figure 2 -- straight on view, angled view, and side view). There is no longer any clear distinction between objects and the background becasue the depth data appears to be "smoothed" between the mannequin and the background -- i.e., there are seven or eight points between the subject and background that are not realistic and make it impossible to do any meaningful processing such as segmenting the scene.
Has anyone else encountered this issue, or have any insight into how we might change our code to avoid it? Any help or ideas are MUCH appreciated, since this is a definite showstopper (we can't tell people to only run our App on older phones :)). Thank you!
Figure 1 -- Merged depth data and image into point cloud, from iPhone 12
Figure 2 -- Merged depth data and image into point cloud, from iPhone 13; unwanted smoothing effect visible
Figure 3 -- Our configuration code and capture handler; edited to remove downstream processing of captured data (which was basically formatting it into an XML file and uploading to the cloud)
import Foundation
import Combine
import AVFoundation
import Photos
import UIKit
import FirebaseStorage
public struct AlertError {
public var title: String = ""
public var message: String = ""
public var primaryButtonTitle = "Accept"
public var secondaryButtonTitle: String?
public var primaryAction: (() -> ())?
public var secondaryAction: (() -> ())?
public init(title: String = "", message: String = "", primaryButtonTitle: String = "Accept", secondaryButtonTitle: String? = nil, primaryAction: (() -> ())? = nil, secondaryAction: (() -> ())? = nil) {
self.title = title
self.message = message
self.primaryAction = primaryAction
self.primaryButtonTitle = primaryButtonTitle
self.secondaryAction = secondaryAction
}
}
///////////////////////////////////////////////////////////////////////////////////
///////////////////////////////////////////////////////////////////////////////////
//
//
// this is the CameraService class, which configures and runs a capture session
// which acquires syncronized image and depth data
// using an AVCaptureDataOutputSynchronizer
//
//
///////////////////////////////////////////////////////////////////////////////////
///////////////////////////////////////////////////////////////////////////////////
public class CameraService: NSObject,
AVCaptureVideoDataOutputSampleBufferDelegate,
AVCaptureDepthDataOutputDelegate,
AVCaptureDataOutputSynchronizerDelegate,
MyFirebaseProtocol,
ObservableObject{
#Published public var shouldShowAlertView = false
#Published public var shouldShowSpinner = false
public var labelStatus: String = "Ready"
var images: [UIImage?] = []
public var alertError: AlertError = AlertError()
public let session = AVCaptureSession()
var isSessionRunning = false
var isConfigured = false
var setupResult: SessionSetupResult = .success
private let sessionQueue = DispatchQueue(label: "session queue") // Communicate with the session and other session objects on this queue.
#objc dynamic var videoDeviceInput: AVCaptureDeviceInput!
private let videoDeviceDiscoverySession = AVCaptureDevice.DiscoverySession(deviceTypes: [.builtInTrueDepthCamera], mediaType: .video, position: .front)
var videoCaptureDevice : AVCaptureDevice? = nil
let videoDataOutput: AVCaptureVideoDataOutput = AVCaptureVideoDataOutput() // Define frame output.
let depthDataOutput = AVCaptureDepthDataOutput()
var outputSynchronizer: AVCaptureDataOutputSynchronizer? = nil
let dataOutputQueue = DispatchQueue(label: "video data queue", qos: .userInitiated, attributes: [], autoreleaseFrequency: .workItem)
var scanStateCounter: Int = 0
var m_DepthDatasetsToUpload = [AVCaptureSynchronizedDepthData]()
var m_FrameBufferToUpload = [AVCaptureSynchronizedSampleBufferData]()
var firebaseDepthDatasetsArray: [String] = []
#Published var firebaseImageUploadCount = 0
#Published var firebaseTextFileUploadCount = 0
public func configure() {
/*
Setup the capture session.
In general, it's not safe to mutate an AVCaptureSession or any of its
inputs, outputs, or connections from multiple threads at the same time.
Don't perform these tasks on the main queue because
AVCaptureSession.startRunning() is a blocking call, which can
take a long time. Dispatch session setup to the sessionQueue, so
that the main queue isn't blocked, which keeps the UI responsive.
*/
sessionQueue.async {
self.configureSession()
}
}
// MARK: Checks for user's permisions
public func checkForPermissions() {
switch AVCaptureDevice.authorizationStatus(for: .video) {
case .authorized:
// The user has previously granted access to the camera.
break
case .notDetermined:
/*
The user has not yet been presented with the option to grant
video access. Suspend the session queue to delay session
setup until the access request has completed.
*/
sessionQueue.suspend()
AVCaptureDevice.requestAccess(for: .video, completionHandler: { granted in
if !granted {
self.setupResult = .notAuthorized
}
self.sessionQueue.resume()
})
default:
// The user has previously denied access.
setupResult = .notAuthorized
DispatchQueue.main.async {
self.alertError = AlertError(title: "Camera Access", message: "SwiftCamera doesn't have access to use your camera, please update your privacy settings.", primaryButtonTitle: "Settings", secondaryButtonTitle: nil, primaryAction: {
UIApplication.shared.open(URL(string: UIApplication.openSettingsURLString)!,
options: [:], completionHandler: nil)
}, secondaryAction: nil)
self.shouldShowAlertView = true
}
}
}
// MARK: Session Management
// Call this on the session queue.
/// - Tag: ConfigureSession
private func configureSession() {
if setupResult != .success {
return
}
session.beginConfiguration()
session.sessionPreset = AVCaptureSession.Preset.vga640x480
// Add video input.
do {
var defaultVideoDevice: AVCaptureDevice?
let frontCameraDevice = AVCaptureDevice.default(.builtInTrueDepthCamera, for: .video, position: .front)
// If the rear wide angle camera isn't available, default to the front wide angle camera.
defaultVideoDevice = frontCameraDevice
videoCaptureDevice = defaultVideoDevice
guard let videoDevice = defaultVideoDevice else {
print("Default video device is unavailable.")
setupResult = .configurationFailed
session.commitConfiguration()
return
}
let videoDeviceInput = try AVCaptureDeviceInput(device: videoDevice)
if session.canAddInput(videoDeviceInput) {
session.addInput(videoDeviceInput)
self.videoDeviceInput = videoDeviceInput
} else if session.inputs.isEmpty == false {
self.videoDeviceInput = videoDeviceInput
} else {
print("Couldn't add video device input to the session.")
setupResult = .configurationFailed
session.commitConfiguration()
return
}
} catch {
print("Couldn't create video device input: \(error)")
setupResult = .configurationFailed
session.commitConfiguration()
return
}
//////////////////////////////////////////////////////////////////////////////////////////////////////////////
// MARK: add video output to session
//////////////////////////////////////////////////////////////////////////////////////////////////////////////
videoDataOutput.videoSettings = [(kCVPixelBufferPixelFormatTypeKey as NSString) : NSNumber(value: kCVPixelFormatType_32BGRA)] as [String : Any]
videoDataOutput.alwaysDiscardsLateVideoFrames = true
videoDataOutput.setSampleBufferDelegate(self, queue: DispatchQueue(label: "camera_frame_processing_queue"))
if session.canAddOutput(self.videoDataOutput) {
session.addOutput(self.videoDataOutput)
} else if session.outputs.contains(videoDataOutput) {
} else {
print("Couldn't create video device output")
setupResult = .configurationFailed
session.commitConfiguration()
return
}
guard let connection = self.videoDataOutput.connection(with: AVMediaType.video),
connection.isVideoOrientationSupported else { return }
connection.videoOrientation = .portrait
//////////////////////////////////////////////////////////////////////////////////////////////////////////////
// MARK: add depth output to session
//////////////////////////////////////////////////////////////////////////////////////////////////////////////
// Add a depth data output
if session.canAddOutput(depthDataOutput) {
session.addOutput(depthDataOutput)
depthDataOutput.isFilteringEnabled = false
//depthDataOutput.setDelegate(T##delegate: AVCaptureDepthDataOutputDelegate?##AVCaptureDepthDataOutputDelegate?, callbackQueue: <#T##DispatchQueue?#>)
depthDataOutput.setDelegate(self, callbackQueue: DispatchQueue(label: "depth_frame_processing_queue"))
if let connection = depthDataOutput.connection(with: .depthData) {
connection.isEnabled = true
} else {
print("No AVCaptureConnection")
}
} else if session.outputs.contains(depthDataOutput){
} else {
print("Could not add depth data output to the session")
session.commitConfiguration()
return
}
// Search for highest resolution with half-point depth values
let depthFormats = videoCaptureDevice!.activeFormat.supportedDepthDataFormats
let filtered = depthFormats.filter({
CMFormatDescriptionGetMediaSubType($0.formatDescription) == kCVPixelFormatType_DepthFloat16
})
let selectedFormat = filtered.max(by: {
first, second in CMVideoFormatDescriptionGetDimensions(first.formatDescription).width < CMVideoFormatDescriptionGetDimensions(second.formatDescription).width
})
do {
try videoCaptureDevice!.lockForConfiguration()
videoCaptureDevice!.activeDepthDataFormat = selectedFormat
videoCaptureDevice!.unlockForConfiguration()
} catch {
print("Could not lock device for configuration: \(error)")
session.commitConfiguration()
return
}
//////////////////////////////////////////////////////////////////////////////////////////////////////////////
// Use an AVCaptureDataOutputSynchronizer to synchronize the video data and depth data outputs.
// The first output in the dataOutputs array, in this case the AVCaptureVideoDataOutput, is the "master" output.
//////////////////////////////////////////////////////////////////////////////////////////////////////////////
outputSynchronizer = AVCaptureDataOutputSynchronizer(dataOutputs: [videoDataOutput, depthDataOutput])
outputSynchronizer!.setDelegate(self, queue: dataOutputQueue)
session.commitConfiguration()
self.isConfigured = true
//self.start()
}
// MARK: Device Configuration
/// - Tag: Stop capture session
public func stop(completion: (() -> ())? = nil) {
sessionQueue.async {
//print("entered stop")
if self.isSessionRunning {
//print(self.setupResult)
if self.setupResult == .success {
//print("entered success")
DispatchQueue.main.async{
self.session.stopRunning()
self.isSessionRunning = self.session.isRunning
if !self.session.isRunning {
DispatchQueue.main.async {
completion?()
}
}
}
}
}
}
}
/// - Tag: Start capture session
public func start() {
// We use our capture session queue to ensure our UI runs smoothly on the main thread.
sessionQueue.async {
if !self.isSessionRunning && self.isConfigured {
switch self.setupResult {
case .success:
self.session.startRunning()
self.isSessionRunning = self.session.isRunning
if self.session.isRunning {
}
case .configurationFailed, .notAuthorized:
print("Application not authorized to use camera")
DispatchQueue.main.async {
self.alertError = AlertError(title: "Camera Error", message: "Camera configuration failed. Either your device camera is not available or its missing permissions", primaryButtonTitle: "Accept", secondaryButtonTitle: nil, primaryAction: nil, secondaryAction: nil)
self.shouldShowAlertView = true
}
}
}
}
}
// ------------------------------------------------------------------------
// MARK: CAPTURE HANDLERS
// ------------------------------------------------------------------------
public func dataOutputSynchronizer(_ synchronizer: AVCaptureDataOutputSynchronizer, didOutput synchronizedDataCollection: AVCaptureSynchronizedDataCollection) {
//printWithTime("Capture")
guard let syncedDepthData: AVCaptureSynchronizedDepthData =
synchronizedDataCollection.synchronizedData(for: depthDataOutput) as? AVCaptureSynchronizedDepthData else {
return
}
guard let syncedVideoData: AVCaptureSynchronizedSampleBufferData =
synchronizedDataCollection.synchronizedData(for: videoDataOutput) as? AVCaptureSynchronizedSampleBufferData else {
return
}
///////////////////////////////////////////////////////////////////////////////////
///////////////////////////////////////////////////////////////////////////////////
//
//
// Below is the code that extracts the information from depth data
// The depth data is 640x480, which matches the size of the synchronized image
// I save this info to a file, upload it to the cloud, and merge it with the image
// on a PC to create a pointcloud
//
//
///////////////////////////////////////////////////////////////////////////////////
///////////////////////////////////////////////////////////////////////////////////
let depth_data : AVDepthData = syncedDepthData.depthData
let cvpixelbuffer : CVPixelBuffer = depth_data.depthDataMap
let height : Int = CVPixelBufferGetHeight(cvpixelbuffer)
let width : Int = CVPixelBufferGetWidth(cvpixelbuffer)
let quality : AVDepthData.Quality = depth_data.depthDataQuality
let accuracy : AVDepthData.Accuracy = depth_data.depthDataAccuracy
let pixelsize : Float = depth_data.cameraCalibrationData!.pixelSize
let camcaldata : AVCameraCalibrationData = depth_data.cameraCalibrationData!
let intmat : matrix_float3x3 = camcaldata.intrinsicMatrix
let cal_lensdistort_x : CGFloat = camcaldata.lensDistortionCenter.x
let cal_lensdistort_y : CGFloat = camcaldata.lensDistortionCenter.y
let cal_matrix_width : CGFloat = camcaldata.intrinsicMatrixReferenceDimensions.width
let cal_matrix_height : CGFloat = camcaldata.intrinsicMatrixReferenceDimensions.height
let intrinsics_fx : Float = camcaldata.intrinsicMatrix.columns.0.x
let intrinsics_fy : Float = camcaldata.intrinsicMatrix.columns.1.y
let intrinsics_ox : Float = camcaldata.intrinsicMatrix.columns.2.x
let intrinsics_oy : Float = camcaldata.intrinsicMatrix.columns.2.y
let pixelformattype : OSType = CVPixelBufferGetPixelFormatType(cvpixelbuffer)
CVPixelBufferLockBaseAddress(cvpixelbuffer, CVPixelBufferLockFlags(rawValue: 0))
let int16Buffer = unsafeBitCast(CVPixelBufferGetBaseAddress(cvpixelbuffer), to: UnsafeMutablePointer<Float16>.self)
let int16PerRow = CVPixelBufferGetBytesPerRow(cvpixelbuffer) / 2
for x in 0...height-1
{
for y in 0...width-1
{
let luma = int16Buffer[x * int16PerRow + y]
/////////////////////////
// SAVE DEPTH VALUE 'luma' to FILE FOR PROCESSING
}
}
CVPixelBufferUnlockBaseAddress(cvpixelbuffer, CVPixelBufferLockFlags(rawValue: 0))
}

Real-time AVAssetWriter synchronise audio and video when pausing/resuming

I am trying to record a video with sound using iPhone's front camera. As I need to also support pause/resume functionality, I need to use AVAssetWriter. I've found an example online, written in Objective-C, which almost achieves the desired functionality (http://www.gdcl.co.uk/2013/02/20/iPhone-Pause.html)
Unfortunately, after converting this example to Swift, I notice that if I pause/resume, at the end of each "section" there is a small but noticeable period during which the video is just a still frame and the audio is playing. So, it seems that when isPaused is triggered, the recorded audio track is longer than the recorded video track.
Sorry if it may seem like a noob question, but I am not a great expert in AVFoundation and some help would be appreciated!
Below I post my implementation of didOutput sampleBuffer.
func captureOutput(_ output: AVCaptureOutput, didOutput sampleBuffer: CMSampleBuffer, from connection: AVCaptureConnection) {
var isVideo = true
if videoConntection != connection {
isVideo = false
}
if (!isCapturing || isPaused) {
return
}
if (encoder == nil) {
if isVideo {
return
}
if let fmt = CMSampleBufferGetFormatDescription(sampleBuffer) {
let desc = CMAudioFormatDescriptionGetStreamBasicDescription(fmt as CMAudioFormatDescription)
if let chan = desc?.pointee.mChannelsPerFrame, let rate = desc?.pointee.mSampleRate {
let path = tempPath()!
encoder = VideoEncoder(path: path, height: Int(cameraSize.height), width: Int(cameraSize.width), channels: chan, rate: rate)
}
}
}
if discont {
if isVideo {
return
}
discont = false
var pts = CMSampleBufferGetPresentationTimeStamp(sampleBuffer)
let last = lastAudio
if last.flags.contains(CMTimeFlags.valid) {
if cmOffset.flags.contains(CMTimeFlags.valid) {
pts = CMTimeSubtract(pts, cmOffset)
}
let off = CMTimeSubtract(pts, last)
print("setting offset from \(isVideo ? "video":"audio")")
print("adding \(CMTimeGetSeconds(off)) to \(CMTimeGetSeconds(cmOffset)) (pts \(CMTimeGetSeconds(cmOffset)))")
if cmOffset.value == 0 {
cmOffset = off
}
else {
cmOffset = CMTimeAdd(cmOffset, off)
}
}
lastVideo.flags = []
lastAudio.flags = []
return
}
var out:CMSampleBuffer?
if cmOffset.value > 0 {
var count:CMItemCount = CMSampleBufferGetNumSamples(sampleBuffer)
let pInfo = UnsafeMutablePointer<CMSampleTimingInfo>.allocate(capacity: count)
CMSampleBufferGetSampleTimingInfoArray(sampleBuffer, entryCount: count, arrayToFill: pInfo, entriesNeededOut: &count)
var i = 0
while i<count {
pInfo[i].decodeTimeStamp = CMTimeSubtract(pInfo[i].decodeTimeStamp, cmOffset)
pInfo[i].presentationTimeStamp = CMTimeSubtract(pInfo[i].presentationTimeStamp, cmOffset)
i+=1
}
CMSampleBufferCreateCopyWithNewTiming(allocator: nil, sampleBuffer: sampleBuffer, sampleTimingEntryCount: count, sampleTimingArray: pInfo, sampleBufferOut: &out)
}
else {
out = sampleBuffer
}
var pts = CMSampleBufferGetPresentationTimeStamp(out!)
let dur = CMSampleBufferGetDuration(out!)
if (dur.value > 0)
{
pts = CMTimeAdd(pts, dur);
}
if (isVideo) {
lastVideo = pts;
}
else {
lastAudio = pts;
}
encoder?.encodeFrame(sampleBuffer: out!, isVideo: isVideo)
}
And this is my VideoEncoder class:
final class VideoEncoder {
var writer:AVAssetWriter
var videoInput:AVAssetWriterInput
var audioInput:AVAssetWriterInput
var path:String
init(path:String, height:Int, width:Int, channels:UInt32, rate:Float64) {
self.path = path
if FileManager.default.fileExists(atPath:path) {
try? FileManager.default.removeItem(atPath: path)
}
let url = URL(fileURLWithPath: path)
writer = try! AVAssetWriter(outputURL: url, fileType: .mp4)
videoInput = AVAssetWriterInput(mediaType: .video, outputSettings: [
AVVideoCodecKey: AVVideoCodecType.h264,
AVVideoWidthKey:height,
AVVideoHeightKey:width
])
videoInput.expectsMediaDataInRealTime = true
writer.add(videoInput)
audioInput = AVAssetWriterInput(mediaType: .audio, outputSettings: [
AVFormatIDKey:kAudioFormatMPEG4AAC,
AVNumberOfChannelsKey:channels,
AVSampleRateKey:rate
])
audioInput.expectsMediaDataInRealTime = true
writer.add(audioInput)
}
func finish(with completionHandler:#escaping ()->Void) {
writer.finishWriting(completionHandler: completionHandler)
}
func encodeFrame(sampleBuffer:CMSampleBuffer, isVideo:Bool) -> Bool {
if CMSampleBufferDataIsReady(sampleBuffer) {
if writer.status == .unknown {
writer.startWriting()
writer.startSession(atSourceTime: CMSampleBufferGetPresentationTimeStamp(sampleBuffer))
}
if writer.status == .failed {
QFLogger.shared.addLog(format: "[ERROR initiating AVAssetWriter]", args: [], error: writer.error)
return false
}
if isVideo {
if videoInput.isReadyForMoreMediaData {
videoInput.append(sampleBuffer)
return true
}
}
else {
if audioInput.isReadyForMoreMediaData {
audioInput.append(sampleBuffer)
return true
}
}
}
return false
}
}
The rest of the code should be pretty obvious, but just to make it complete, here is what I have for pausing:
isPaused = true
discont = true
And here is resume:
isPaused = false
If anyone could help me to understand how to align video and audio tracks during such live recording that would be great!
Ok, turns out there was no mistake in the code which I provided. The issue which I experienced was caused by a video smoothing which was turned ON :) I guess it needs extra frames to smooth the video, which is why the video output freezes at the end for a short period of time.

How to make the AKSequencer switch soundfonts?

I'm creating a function using the Audiokit API which the user presses music notes onto a screen and a sound comes out based on the SoundFont they chose. I then allow them to collect a host of notes and let them play it back in the order they chose.
The problem is that I am using an AKSequencer to play the notes back and when the AKSequencer plays the notes back it never sounds like the SoundFont. It makes a beep sound.
Is there code that lets me change what sound is coming out of the AKSequencer?
I'm using audio kit to do this.
Sample is an NSObject that contains midisampler, player, etc. Here's the code
class Sampler1: NSObject {
var engine = AVAudioEngine()
var sampler: AVAudioUnitSampler!
var midisampler = AKMIDISampler()
var octave = 4
let midiChannel = 0
var midiVelocity = UInt8(127)
var audioGraph: AUGraph?
var musicPlayer: MusicPlayer?
var patch = UInt32(0)
var synthUnit: AudioUnit?
var synthNode = AUNode()
var outputNode = AUNode()
override init() {
super.init()
// engine = AVAudioEngine()
sampler = AVAudioUnitSampler()
engine.attach(sampler)
engine.connect(sampler, to: engine.mainMixerNode, format: nil)
loadSF2PresetIntoSampler(5)
/* sampler2 = AVAudioUnitSampler()
engine.attachNode(sampler2)
engine.connect(sampler2, to: engine.mainMixerNode, format: nil)
*/
addObservers()
startEngine()
setSessionPlayback()
/* CheckError(NewAUGraph(&audioGraph))
createOutputNode(audioGraph: audioGraph!, outputNode: &outputNode)
createSynthNode()
CheckError(AUGraphNodeInfo(audioGraph!, synthNode, nil, &synthUnit))
let synthOutputElement: AudioUnitElement = 0
let ioUnitInputElement: AudioUnitElement = 0
CheckError(AUGraphConnectNodeInput(audioGraph!, synthNode, synthOutputElement,
outputNode, ioUnitInputElement))
CheckError(AUGraphInitialize(audioGraph!))
CheckError(AUGraphStart(audioGraph!))
loadnewSoundFont()
loadPatch(patchNo: 0)*/
setUpSequencer()
}
func createOutputNode(audioGraph: AUGraph, outputNode: UnsafeMutablePointer<AUNode>) {
var cd = AudioComponentDescription(
componentType: OSType(kAudioUnitType_Output),
componentSubType: OSType(kAudioUnitSubType_RemoteIO),
componentManufacturer: OSType(kAudioUnitManufacturer_Apple),
componentFlags: 0,componentFlagsMask: 0)
CheckError(AUGraphAddNode(audioGraph, &cd, outputNode))
}
func loadSF2PresetIntoSampler(_ preset: UInt8) {
guard let bankURL = Bundle.main.url(forResource: "Arachno SoundFont - Version 1.0", withExtension: "sf2") else {
print("could not load sound font")
return
}
let folder = bankURL.path
do {
try self.sampler.loadSoundBankInstrument(at: bankURL,
program: preset,
bankMSB: UInt8(kAUSampler_DefaultMelodicBankMSB),
bankLSB: UInt8(kAUSampler_DefaultBankLSB))
try midisampler.loadSoundFont(folder, preset: 0, bank: kAUSampler_DefaultBankLSB)
// try midisampler.loadPath(bankURL.absoluteString)
} catch {
print("error loading sound bank instrument")
}
}
func createSynthNode() {
var cd = AudioComponentDescription(
componentType: OSType(kAudioUnitType_MusicDevice),
componentSubType: OSType(kAudioUnitSubType_MIDISynth),
componentManufacturer: OSType(kAudioUnitManufacturer_Apple),
componentFlags: 0,componentFlagsMask: 0)
CheckError(AUGraphAddNode(audioGraph!, &cd, &synthNode))
}
func setSessionPlayback() {
let audioSession = AVAudioSession.sharedInstance()
do {
try
audioSession.setCategory(AVAudioSession.Category.playback, options:
AVAudioSession.CategoryOptions.mixWithOthers)
} catch {
print("couldn't set category \(error)")
return
}
do {
try audioSession.setActive(true)
} catch {
print("couldn't set category active \(error)")
return
}
}
func startEngine() {
if engine.isRunning {
print("audio engine already started")
return
}
do {
try engine.start()
print("audio engine started")
} catch {
print("oops \(error)")
print("could not start audio engine")
}
}
func addObservers() {
NotificationCenter.default.addObserver(self,
selector:"engineConfigurationChange:",
name:NSNotification.Name.AVAudioEngineConfigurationChange,
object:engine)
NotificationCenter.default.addObserver(self,
selector:"sessionInterrupted:",
name:AVAudioSession.interruptionNotification,
object:engine)
NotificationCenter.default.addObserver(self,
selector:"sessionRouteChange:",
name:AVAudioSession.routeChangeNotification,
object:engine)
}
func removeObservers() {
NotificationCenter.default.removeObserver(self,
name: NSNotification.Name.AVAudioEngineConfigurationChange,
object: nil)
NotificationCenter.default.removeObserver(self,
name: AVAudioSession.interruptionNotification,
object: nil)
NotificationCenter.default.removeObserver(self,
name: AVAudioSession.routeChangeNotification,
object: nil)
}
private func setUpSequencer() {
// set the sequencer voice to storedPatch so we can play along with it using patch
var status = NewMusicSequence(&musicSequence)
if status != noErr {
print("\(#line) bad status \(status) creating sequence")
}
status = MusicSequenceNewTrack(musicSequence!, &track)
if status != noErr {
print("error creating track \(status)")
}
// 0xB0 = bank select, first we do the most significant byte
var chanmess = MIDIChannelMessage(status: 0xB0 | sequencerMidiChannel, data1: 0, data2: 0, reserved: 0)
status = MusicTrackNewMIDIChannelEvent(track!, 0, &chanmess)
if status != noErr {
print("creating bank select event \(status)")
}
// then the least significant byte
chanmess = MIDIChannelMessage(status: 0xB0 | sequencerMidiChannel, data1: 32, data2: 0, reserved: 0)
status = MusicTrackNewMIDIChannelEvent(track!, 0, &chanmess)
if status != noErr {
print("creating bank select event \(status)")
}
// set the voice
chanmess = MIDIChannelMessage(status: 0xC0 | sequencerMidiChannel, data1: UInt8(0), data2: 0, reserved: 0)
status = MusicTrackNewMIDIChannelEvent(track!, 0, &chanmess)
if status != noErr {
print("creating program change event \(status)")
}
CheckError(MusicSequenceSetAUGraph(musicSequence!, audioGraph))
CheckError(NewMusicPlayer(&musicPlayer))
CheckError(MusicPlayerSetSequence(musicPlayer!, musicSequence))
CheckError(MusicPlayerPreroll(musicPlayer!))
}
func loadnewSoundFont() {
var bankURL = Bundle.main.url(forResource: "Arachno SoundFont - Version 1.0", withExtension: "sf2")
CheckError(AudioUnitSetProperty(synthUnit!, AudioUnitPropertyID(kMusicDeviceProperty_SoundBankURL), AudioUnitScope(kAudioUnitScope_Global), 0, &bankURL, UInt32(MemoryLayout<URL>.size)))
}
func loadPatch(patchNo: Int) {
let channel = UInt32(0)
var enabled = UInt32(1)
var disabled = UInt32(0)
patch = UInt32(patchNo)
CheckError(AudioUnitSetProperty(
synthUnit!,
AudioUnitPropertyID(kAUMIDISynthProperty_EnablePreload),
AudioUnitScope(kAudioUnitScope_Global),
0,
&enabled,
UInt32(MemoryLayout<UInt32>.size)))
let programChangeCommand = UInt32(0xC0 | channel)
CheckError(MusicDeviceMIDIEvent(self.synthUnit!, programChangeCommand, patch, 0, 0))
CheckError(AudioUnitSetProperty(
synthUnit!,
AudioUnitPropertyID(kAUMIDISynthProperty_EnablePreload),
AudioUnitScope(kAudioUnitScope_Global),
0,
&disabled,
UInt32(MemoryLayout<UInt32>.size)))
// the previous programChangeCommand just triggered a preload
// this one actually changes to the new voice
CheckError(MusicDeviceMIDIEvent(synthUnit!, programChangeCommand, patch, 0, 0))
}
func play(number: UInt8) {
sampler.startNote(number, withVelocity: 127, onChannel: 0)
}
func stop(number: UInt8) {
sampler.stopNote(number, onChannel: 0)
}
func musicPlayerPlay() {
var status = noErr
var playing:DarwinBoolean = false
CheckError(MusicPlayerIsPlaying(musicPlayer!, &playing))
if playing != false {
status = MusicPlayerStop(musicPlayer!)
if status != noErr {
print("Error stopping \(status)")
CheckError(status)
return
}
}
CheckError(MusicPlayerSetTime(musicPlayer!, 0))
CheckError(MusicPlayerStart(musicPlayer!))
}
var avsequencer: AVAudioSequencer!
var sequencerMode = 1
var sequenceStartTime: Date?
var noteOnTimes = [Date] (repeating: Date(), count:128)
var musicSequence: MusicSequence?
var midisequencer = AKSequencer()
// var musicPlayer: MusicPlayer?
let sequencerMidiChannel = UInt8(1)
var midisynthUnit: AudioUnit?
//track is the variable the notes are written on
var track: MusicTrack?
var newtrack: AKMusicTrack?
func setupSequencer(name: String) {
self.avsequencer = AVAudioSequencer(audioEngine: self.engine)
let options = AVMusicSequenceLoadOptions.smfChannelsToTracks
if let fileURL = Bundle.main.url(forResource: name, withExtension: "mid") {
do {
try avsequencer.load(from: fileURL, options: options)
print("loaded \(fileURL)")
} catch {
print("something screwed up \(error)")
return
}
}
avsequencer.prepareToPlay()
}
func playsequence() {
if avsequencer.isPlaying {
stopsequence()
}
avsequencer.currentPositionInBeats = TimeInterval(0)
do {
try avsequencer.start()
} catch {
print("cannot start \(error)")
}
}
func creatnewtrck(){
let sequencelegnth = AKDuration(beats: 8.0)
newtrack = midisequencer.newTrack()
}
func addnotestotrack(){
// AKMIDISampler
}
func stopsequence() {
avsequencer.stop()
}
func setSequencerMode(mode: Int) {
sequencerMode = mode
switch(sequencerMode) {
case SequencerMode.off.rawValue:
print(mode)
// CheckError(osstatus: MusicPlayerStop(musicPlayer!))
case SequencerMode.recording.rawValue:
print(mode)
case SequencerMode.playing.rawValue:
print(mode)
default:
break
}
}
/* func noteOn(note: UInt8) {
let noteCommand = UInt32(0x90 | midiChannel)
let base = note - 48
let octaveAdjust = (UInt8(octave) * 12) + base
let pitch = UInt32(octaveAdjust)
CheckError(MusicDeviceMIDIEvent(self.midisynthUnit!,
noteCommand, pitch, UInt32(self.midiVelocity), 0))
}
func noteOff(note: UInt8) {
let channel = UInt32(0)
let noteCommand = UInt32(0x80 | channel)
let base = note - 48
let octaveAdjust = (UInt8(octave) * 12) + base
let pitch = UInt32(octaveAdjust)
CheckError(MusicDeviceMIDIEvent(self.midisynthUnit!,
noteCommand, pitch, 0, 0))
}*/
func noteOn(note: UInt8) {
if sequencerMode == SequencerMode.recording.rawValue {
print("recording sequence note")
noteOnTimes[Int(note)] = Date()
} else {
print("no notes")
}
}
func noteOff(note: UInt8, timestamp: Float64, sequencetime: Date) {
if sequencerMode == SequencerMode.recording.rawValue {
let duration: Double = Date().timeIntervalSince(noteOnTimes[Int(note)])
let onset: Double = noteOnTimes[Int(note)].timeIntervalSince(sequencetime)
//the order of the notes in the array
var beat: MusicTimeStamp = 0
CheckError(MusicSequenceGetBeatsForSeconds(musicSequence!, onset, &beat))
var mess = MIDINoteMessage(channel: sequencerMidiChannel,
note: note,
velocity: midiVelocity,
releaseVelocity: 0,
duration: Float(duration) )
CheckError(MusicTrackNewMIDINoteEvent(track!, timestamp, &mess))
}
}
}
The code that plays the collection of notes
_ = sample.midisequencer.newTrack()
let sequencelegnth = AKDuration(beats: 8.0)
sample.midisequencer.setLength(sequencelegnth)
sample.sequenceStartTime = format.date(from: format.string(from: NSDate() as Date))
sample.midisequencer.setTempo(160.0)
sample.midisequencer.enableLooping()
sample.midisequencer.play()
This is the code that changes the soundfont
func loadSF2PresetIntoSampler(_ preset: UInt8) {
guard let bankURL = Bundle.main.url(forResource: "Arachno SoundFont - Version 1.0", withExtension: "sf2") else {
print("could not load sound font")
return
}
let folder = bankURL.path
do {
try self.sampler.loadSoundBankInstrument(at: bankURL,
program: preset,
bankMSB: UInt8(kAUSampler_DefaultMelodicBankMSB),
bankLSB: UInt8(kAUSampler_DefaultBankLSB))
try midisampler.loadSoundFont(folder, preset: 0, bank: kAUSampler_DefaultBankLSB)
// try midisampler.loadPath(bankURL.absoluteString)
} catch {
print("error loading sound bank instrument")
}
}
The midisampler is an AKMidisampler.
At minimum, you need to connect an AKSequencer to some kind of output to get it to make sounds. With the older version (now called AKAppleSequencer), if you don't explicitly set the output, you will hear the default (beepy) sampler.
For example, on AKAppleSequencer (in AudioKit 4.8, or AKSequencer for earlier version)
let track = seq.newTrack()
track!.setMIDIOutput(sampler.midiIn)
on the new AKSequencer
let track = seq.newTrack() // for the new AKSequencer, in AudioKit 4.8
track!.setTarget(node: sampler)
Also, make sure that you have allowed audio background mode in your project's Capabilities, as missing this step this will also get you the default sampler.
You've included a massive amount of code (and I haven't tried to absorb all of what is going on here) but the fact that you are using instances of both MusicSequence and AKSequencer (which I suspect is the older version, now called AKAppleSequencer, which is merely a wrapper around MusicSequence) is something of a red flag.

Resources