Goal: To stream audio/video from one device to another.
Problem: I managed to get both audio and video but the audio won't play on the other side.
Details:
I have created an app that will transmit A/V data from one device to another over the network. To not go into too much detail I will show you where I am stuck. I managed to listen to the output delegate, where I extract the audio information, convert it into Data and pass it on to a delegate that I've created.
func captureOutput(_ captureOutput: AVCaptureOutput!, didOutputSampleBuffer sampleBuffer: CMSampleBuffer!, from connection: AVCaptureConnection!) {
// VIDEO | code excluded for simplicity of this question as this part works
// AUDIO | only deliver the frames if you are allowed to
if self.produceAudioFrames == true {
// process the audio buffer
let _audioFrame = self.audioFromSampleBuffer(sampleBuffer)
// process in async
DispatchQueue.main.async {
// pass the audio frame to the delegate
self.delegate?.audioFrame(data: _audioFrame)
}
}
}
The helper func that converts the SampleBuffer (not my code, and I can't find the source, but I know I found it here on SO):
func audioFromSampleBuffer(_ sampleBuffer: CMSampleBuffer) -> Data {
var audioBufferList = AudioBufferList()
var data = Data()
var blockBuffer : CMBlockBuffer?
CMSampleBufferGetAudioBufferListWithRetainedBlockBuffer(sampleBuffer,
nil,
&audioBufferList,
MemoryLayout<AudioBufferList>.size,
nil,
nil,
0,
&blockBuffer)
let buffers = UnsafeBufferPointer<AudioBuffer>(start: &audioBufferList.mBuffers,
count: Int(audioBufferList.mNumberBuffers))
for audioBuffer in buffers {
let frame = audioBuffer.mData?.assumingMemoryBound(to: UInt8.self)
data.append(frame!, count: Int(audioBuffer.mDataByteSize))
}
// dev
//print("audio buffer count: \(buffers.count)") | this returns 2048
// give the raw data back to the caller
return data
}
Note: Before sending over the network, I convert the data returned from the helper func like so: let payload = Array(data)
That is the host's side.
On the client side I am receiving the payload as [UInt8], and this is where I am stuck. I tried multiple things but none worked.
func processIncomingAudioPayloadFromFrame(_ ID: String, _ _Data: [UInt8]) {
let readableData = Data(bytes: _Data) // back from array to the data before we sent it over the network.
print(readableData.count) // still 2048 even after receiving from network, so I am guessing data is still intact
let x = self.bytesToAudioBuffer(_Data) // option two convert into a AVAudioPCMBuffer
print(x) // prints | <AVAudioPCMBuffer#0x600000201e80: 2048/2048 bytes> | I am guessing it works
// option one | play using AVAudioPlayer
do {
let player = try AVAudioPlayer(data: readableData)
try AVAudioSession.sharedInstance().setCategory(AVAudioSessionCategoryPlayback)
try AVAudioSession.sharedInstance().setActive(true)
player.prepareToPlay()
player.play()
print(player.volume) // doing this to see if this is reached
}catch{
print(error) // gets error | Error Domain=NSOSStatusErrorDomain Code=1954115647 "(null)"
}
}
Here is the helper func that converts [UInt8] into AVAudioPCMBuffer:
func bytesToAudioBuffer(_ buf: [UInt8]) -> AVAudioPCMBuffer {
// format assumption! make this part of your protocol?
let fmt = AVAudioFormat(commonFormat: .pcmFormatFloat32, sampleRate: 44100,
channels: 1, interleaved: true)
let frameLength = UInt32(buf.count) / fmt.streamDescription.pointee.mBytesPerFrame
let audioBuffer = AVAudioPCMBuffer(pcmFormat: fmt, frameCapacity: frameLength)
audioBuffer.frameLength = frameLength
let dstLeft = audioBuffer.floatChannelData![0]
// for stereo
// let dstRight = audioBuffer.floatChannelData![1]
buf.withUnsafeBufferPointer {
let src = UnsafeRawPointer($0.baseAddress!)
.bindMemory(to: Float.self, capacity: Int(frameLength))
dstLeft.initialize(from: src, count: Int(frameLength))
}
return audioBuffer
}
Questions:
Is it possible to even play directly from [UInt8]?
How can I play the AVAudioPCMBuffer payload using the AudioEngine?
Is it possible?
How can I play the audio on the client side.
Footnote: The comments in the code should give you some hint for the output I hope. Also I don't want to save to a file or anything file related as I just want to amplify the mic for real-time listening, I have no interest in saving the data.
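A side note on the error printed in the catch block: OSStatus values from the audio frameworks are often four-character codes packed into an integer. The sketch below (the helper name `fourCharCode` is made up for illustration) unpacks the value. For 1954115647 it yields "typ?", which appears to correspond to kAudioFileUnsupportedFileTypeError. That would be consistent with AVAudioPlayer expecting data in an audio file format (WAV, CAF, and so on) rather than raw PCM samples, though this is an inference and not something confirmed above.

```swift
import Foundation

/// Renders an OSStatus-style integer as a four-character code when all four
/// bytes are printable ASCII; otherwise falls back to the decimal value.
/// (Hypothetical helper, not part of any Apple framework.)
func fourCharCode(_ status: Int32) -> String {
    let value = UInt32(bitPattern: status)
    let bytes: [UInt8] = [
        UInt8((value >> 24) & 0xFF),
        UInt8((value >> 16) & 0xFF),
        UInt8((value >> 8) & 0xFF),
        UInt8(value & 0xFF)
    ]
    let printable = bytes.allSatisfy { $0 >= 0x20 && $0 < 0x7F }
    guard printable else { return String(status) }
    return String(decoding: bytes, as: UTF8.self)
}

print(fourCharCode(1954115647)) // prints "typ?"
```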
I have used the same code to play an audio file during a carrier call.
Please try it and let me know the results:
Objective-C code:
NSString *soundFilePath = [[NSBundle mainBundle]
pathForResource:self.bgAudioFileName ofType:@"mp3"];
NSURL *fileURL = [[NSURL alloc] initFileURLWithPath:soundFilePath ];
myAudioPlayer = [[AVAudioPlayer alloc] initWithContentsOfURL:fileURL
error:nil];
myAudioPlayer.numberOfLoops = -1;
NSError *sessionError = nil;
// Change the default output audio route
AVAudioSession *audioSession = [AVAudioSession sharedInstance];
// get your audio session somehow
[audioSession setCategory:AVAudioSessionCategoryMultiRoute
error:&sessionError];
BOOL success= [audioSession
overrideOutputAudioPort:AVAudioSessionPortOverrideNone
error:&sessionError];
[audioSession setActive:YES error:&sessionError];
if(!success)
{
NSLog(@"error doing outputaudioportoverride - %@", [sessionError
localizedDescription]);
}
[myAudioPlayer setVolume:1.0f];
[myAudioPlayer play];
Swift version :
let soundFilePath = Bundle.main.path(forResource: bgAudioFileName, ofType: "mp3")
let fileURL = URL(fileURLWithPath: soundFilePath ?? "")
myAudioPlayer = try? AVAudioPlayer(contentsOf: fileURL)
myAudioPlayer?.numberOfLoops = -1
// Change the default output audio route
let audioSession = AVAudioSession.sharedInstance()
do {
    try audioSession.setCategory(AVAudioSessionCategoryMultiRoute)
    try audioSession.overrideOutputAudioPort(.none)
    try audioSession.setActive(true)
} catch {
    // any of the three session calls above lands here on failure
    print("error doing outputaudioportoverride - \(error.localizedDescription)")
}
myAudioPlayer?.volume = 1.0
myAudioPlayer?.play()
Related
I'm looking into making my Swift iOS app record a video and play it back on the same screen with 30 seconds delay.
I've been using an official example to record a video. Then I added a button that would trigger playing self.movieFileOutput?.outputFileURL using AVPlayer in a separate view on the screen. It's close to what I want but obviously it stops playing once it comes to the end of the file written to the disk and does not proceed when the next buffered chunk is written.
I could stop the video recording every 30 seconds and save the URL for each file so I can play it back but that means that there would be interruptions in video capture and playback.
How can I make video recording never stop and playback always be on the screen with any delay I want?
I've seen a similar question, and all the answers pointed at the AVFoundation docs. I couldn't find how to make AVFoundation write predictable chunks of video from memory to disk while recording.
You can achieve what you want by recording 30s chunks of video, then enqueueing them to an AVQueuePlayer for seamless playback. Recording the video chunks would be very easy with AVCaptureFileOutput on macOS, but sadly, on iOS you cannot create new chunks without dropping frames, so you have to use the wordier, lower level AVAssetWriter API:
import UIKit
import AVFoundation
// TODO: delete old videos
// TODO: audio
class ViewController: UIViewController {
// capture
let captureSession = AVCaptureSession()
// playback
let player = AVQueuePlayer()
var playerLayer: AVPlayerLayer! = nil
// output. sadly not AVCaptureMovieFileOutput
var assetWriter: AVAssetWriter! = nil
var assetWriterInput: AVAssetWriterInput! = nil
var chunkNumber = 0
var chunkStartTime: CMTime! = nil
var chunkOutputURL: URL! = nil
override func viewDidLoad() {
super.viewDidLoad()
playerLayer = AVPlayerLayer(player: player)
view.layer.addSublayer(playerLayer)
// inputs
let videoCaptureDevice = AVCaptureDevice.defaultDevice(withMediaType: AVMediaTypeVideo)
let videoInput = try! AVCaptureDeviceInput(device: videoCaptureDevice)
captureSession.addInput(videoInput)
// outputs
// iOS AVCaptureFileOutput/AVCaptureMovieFileOutput still don't support dynamically
// switching files (?) so we have to re-implement with AVAssetWriter
let videoOutput = AVCaptureVideoDataOutput()
// TODO: probably something else
videoOutput.setSampleBufferDelegate(self, queue: DispatchQueue.main)
captureSession.addOutput(videoOutput)
captureSession.startRunning()
}
override func viewDidLayoutSubviews() {
super.viewDidLayoutSubviews()
playerLayer.frame = view.layer.bounds
}
func createWriterInput(for presentationTimeStamp: CMTime) {
let fileManager = FileManager.default
chunkOutputURL = fileManager.urls(for: .documentDirectory, in: .userDomainMask)[0].appendingPathComponent("chunk\(chunkNumber).mov")
try? fileManager.removeItem(at: chunkOutputURL)
assetWriter = try! AVAssetWriter(outputURL: chunkOutputURL, fileType: AVFileTypeQuickTimeMovie)
// TODO: get dimensions from image CMSampleBufferGetImageBuffer(sampleBuffer)
let outputSettings: [String: Any] = [AVVideoCodecKey:AVVideoCodecH264, AVVideoWidthKey: 1920, AVVideoHeightKey: 1080]
assetWriterInput = AVAssetWriterInput(mediaType: AVMediaTypeVideo, outputSettings: outputSettings)
assetWriterInput.expectsMediaDataInRealTime = true
assetWriter.add(assetWriterInput)
chunkNumber += 1
chunkStartTime = presentationTimeStamp
assetWriter.startWriting()
assetWriter.startSession(atSourceTime: chunkStartTime)
}
}
extension ViewController: AVCaptureVideoDataOutputSampleBufferDelegate {
func captureOutput(_ captureOutput: AVCaptureOutput!, didOutputSampleBuffer sampleBuffer: CMSampleBuffer!, from connection: AVCaptureConnection!) {
let presentationTimeStamp = CMSampleBufferGetPresentationTimeStamp(sampleBuffer)
if assetWriter == nil {
createWriterInput(for: presentationTimeStamp)
} else {
let chunkDuration = CMTimeGetSeconds(CMTimeSubtract(presentationTimeStamp, chunkStartTime))
if chunkDuration > 30 {
assetWriter.endSession(atSourceTime: presentationTimeStamp)
// make a copy, as finishWriting is asynchronous
let newChunkURL = chunkOutputURL!
let chunkAssetWriter = assetWriter!
chunkAssetWriter.finishWriting {
print("finishWriting says: \(chunkAssetWriter.status.rawValue, chunkAssetWriter.error)")
print("queuing \(newChunkURL)")
self.player.insert(AVPlayerItem(url: newChunkURL), after: nil)
self.player.play()
}
createWriterInput(for: presentationTimeStamp)
}
}
if !assetWriterInput.append(sampleBuffer) {
print("append says NO: \(assetWriter.status.rawValue, assetWriter.error)")
}
}
}
p.s. it's very curious to see what you were doing 30 seconds ago. What exactly are you making?
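The chunk-switching rule in captureOutput above can be isolated from AVFoundation and checked on its own. This is only a sketch under simplifying assumptions: timestamps are plain seconds instead of CMTime, and the type `ChunkClock` is a made-up name, not something from the code above.

```swift
import Foundation

/// Tracks when a new chunk should begin, using plain seconds instead of CMTime.
/// (Hypothetical type for illustration.)
struct ChunkClock {
    let chunkLength: Double
    private(set) var chunkStart: Double? = nil
    private(set) var chunkNumber = 0

    init(chunkLength: Double) {
        self.chunkLength = chunkLength
    }

    /// Returns true when the frame at `timestamp` should open a new chunk:
    /// either no chunk exists yet, or the current one is longer than chunkLength.
    mutating func shouldStartNewChunk(at timestamp: Double) -> Bool {
        if let start = chunkStart, timestamp - start <= chunkLength {
            return false
        }
        chunkStart = timestamp
        chunkNumber += 1
        return true
    }
}

// Feed one frame every 5 seconds for 65 seconds and record where chunks begin.
var clock = ChunkClock(chunkLength: 30)
var starts: [Double] = []
for timestamp in stride(from: 0.0, through: 65.0, by: 5.0) {
    if clock.shouldStartNewChunk(at: timestamp) {
        starts.append(timestamp)
    }
}
print(starts) // [0.0, 35.0]
```

As in the original, the frame that triggers the switch becomes the first frame of the next chunk, so no frame is dropped at the boundary.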
Currently I am trying to process the frames of an existing video with OpenCV. Are there any AV reader libraries that contain delegate methods that process frames while playing back videos? I know how to process frames during a live AVCaptureSession through the use of the AVCaptureVideoDataOutput and the captureOutput delegate method. Is there something similar for playing back videos?
Any help would be appreciated.
Here's the solution. Thanks to Tim Bull's answer, I accomplished this using AVAssetReader / AVAssetReaderOutput.
I call the function below from a button tap to start the video and begin processing each frame with OpenCV:
func processVids() {
guard let pathOfOrigVid = Bundle.main.path(forResource: "output_10_34_34", ofType: "mp4") else{
print("output_10_34_34.mp4 not found\n")
exit(0)
}
var path: URL? = nil
do{
path = try FileManager.default.url(for: .documentDirectory, in:.userDomainMask, appropriateFor: nil, create: false)
path = path?.appendingPathComponent("grayVideo.mp4")
}catch{
print("Unable to make URL to Movies path\n")
exit(0)
}
let movie: AVURLAsset = AVURLAsset(url: NSURL(fileURLWithPath: pathOfOrigVid) as URL, options: nil)
let tracks: [AVAssetTrack] = movie.tracks(withMediaType: AVMediaTypeVideo)
let track: AVAssetTrack = tracks[0]
var reader: AVAssetReader? = nil
do{
reader = try AVAssetReader(asset: movie)
}
catch{
print("Problem initializing AVReader\n")
}
let settings : [String: Any] = [
String(kCVPixelBufferPixelFormatTypeKey): NSNumber(value: kCVPixelFormatType_32ARGB),
String(kCVPixelBufferIOSurfacePropertiesKey): [:]
]
let rout: AVAssetReaderTrackOutput = AVAssetReaderTrackOutput(track: track, outputSettings: settings)
reader?.add(rout)
reader?.startReading()
DispatchQueue.global().async(execute: {
while reader?.status == AVAssetReaderStatus.reading {
// copyNextSampleBuffer() advances the reader, so call it only once per frame
if let sbuff = rout.copyNextSampleBuffer() {
// sbuff is the buffer of the frame to perform OpenCV processing on
_ = sbuff
}
usleep(10000)
}
})
}
AVAssetReader / AVAssetReaderOutput are what you're looking for. Check out the copyNextSampleBuffer method.
https://developer.apple.com/documentation/avfoundation/avassetreaderoutput
You can use AVVideoComposition
If you want to process frames with Core Image, you can create an instance by calling the init(asset:applyingCIFiltersWithHandler:) method.
Or you can create a custom compositor:
You can implement your own custom video compositor by implementing the
AVVideoCompositing protocol; a custom video compositor is provided
with pixel buffers for each of its video sources during playback and
other operations and can perform arbitrary graphical operations on
them in order to produce visual output.
See docs for more info.
Here you can find an example (but example is in Objective-C).
For anyone who needs to process video frames with OpenCV.
Decode video:
@objc public protocol ARVideoReaderDelegate : NSObjectProtocol {
func reader(_ reader:ARVideoReader!, newFrameReady sampleBuffer:CMSampleBuffer?, _ frameCount:Int)
func readerDidFinished(_ reader:ARVideoReader!, totalFrameCount:Int)
}
@objc open class ARVideoReader: NSObject {
var _asset: AVURLAsset!
@objc var _delegate: ARVideoReaderDelegate?
@objc public init!(urlAsset asset:AVURLAsset){
_asset = asset
super.init()
}
@objc open func startReading() -> Void {
if let reader = try? AVAssetReader.init(asset: _asset){
let videoTrack = _asset.tracks(withMediaType: .video).compactMap{ $0 }.first;
let options = [kCVPixelBufferPixelFormatTypeKey : Int(kCVPixelFormatType_32BGRA)]
let readerOutput = AVAssetReaderTrackOutput.init(track: videoTrack!, outputSettings: options as [String : Any])
reader.add(readerOutput)
reader.startReading()
var count = 0
//reading
while (reader.status == .reading && videoTrack?.nominalFrameRate != 0){
let sampleBuffer = readerOutput.copyNextSampleBuffer()
_delegate?.reader(self, newFrameReady: sampleBuffer, count)
count = count+1;
}
_delegate?.readerDidFinished(self,totalFrameCount: count)
}
}
}
In the callback of delegate:
//convert sampleBuffer to cv::Mat
CVImageBufferRef imageBuffer = CMSampleBufferGetImageBuffer(sampleBuffer);
size_t width = CVPixelBufferGetWidth(imageBuffer);
size_t height = CVPixelBufferGetHeight(imageBuffer);
CVPixelBufferLockBaseAddress(imageBuffer, kCVPixelBufferLock_ReadOnly);
char *baseBuffer = (char*)CVPixelBufferGetBaseAddress(imageBuffer);
cv::Mat cvImage = cv::Mat((int)height,(int)width,CV_8UC3);
cv::MatIterator_<cv::Vec3b> it_start = cvImage.begin<cv::Vec3b>();
cv::MatIterator_<cv::Vec3b> it_end = cvImage.end<cv::Vec3b>();
long cur = 0;
size_t padding = CVPixelBufferGetBytesPerRow(imageBuffer) - width*4;
size_t offset = 0; // the first row starts at the base address; padding is added after each completed row
while (it_start != it_end) {
//opt pixel
long p_idx = cur*4 + offset;
char b = baseBuffer[p_idx];
char g = baseBuffer[p_idx + 1];
char r = baseBuffer[p_idx + 2];
cv::Vec3b newpixel(b,g,r);
*it_start = newpixel;
cur++;
it_start++;
if (cur%width == 0) {
offset = offset + padding;
}
}
CVPixelBufferUnlockBaseAddress(imageBuffer, kCVPixelBufferLock_ReadOnly);
//process cvImage now
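The row-padding arithmetic in the loop above can also be written per row, which makes the stride handling easier to verify. The following is a sketch in Swift over a plain byte array, not over a real CVPixelBuffer; the function name `packedBGR` and its parameters are assumptions for illustration. It copies BGRA pixels whose rows may be padded into a tightly packed BGR array, the layout a CV_8UC3 matrix expects.

```swift
import Foundation

/// Copies a BGRA image whose rows may be padded into a tightly packed BGR array.
/// `bytesPerRow` plays the role of CVPixelBufferGetBytesPerRow in the code above.
/// (Hypothetical helper operating on a plain array.)
func packedBGR(fromBGRA base: [UInt8], width: Int, height: Int, bytesPerRow: Int) -> [UInt8] {
    precondition(width > 0 && height > 0, "image must not be empty")
    precondition(bytesPerRow >= width * 4, "row stride is smaller than one row of pixels")
    precondition(base.count >= bytesPerRow * (height - 1) + width * 4, "buffer is too small")
    var out = [UInt8]()
    out.reserveCapacity(width * height * 3)
    for row in 0..<height {
        // Each row starts at a multiple of the stride, so padding never leaks in.
        let rowStart = row * bytesPerRow
        for col in 0..<width {
            let p = rowStart + col * 4
            out.append(base[p])       // B
            out.append(base[p + 1])   // G
            out.append(base[p + 2])   // R (the alpha byte at p + 3 is dropped)
        }
    }
    return out
}

// A 2x2 image with 4 bytes of padding at the end of each row.
let padded: [UInt8] = [
    1, 2, 3, 255,   4, 5, 6, 255,     0, 0, 0, 0,
    7, 8, 9, 255,   10, 11, 12, 255,  0, 0, 0, 0
]
let packed = packedBGR(fromBGRA: padded, width: 2, height: 2, bytesPerRow: 12)
print(packed)
```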
I am trying to record audio using AVAudioEngine. The file gets recorded and plays correctly. However, I also need to send the AVAudioPCMBuffer that I receive in the tap handler to my server via a socket. I am converting the AVAudioPCMBuffer to NSData and sending it. The server receives it, but the file doesn't play correctly on the server. Am I missing something while converting AVAudioPCMBuffer to NSData, or is my recording missing some configuration?
Any help would be appreciated guys. Thanks!
let audioEngine = AVAudioEngine()
let inputNode = audioEngine.inputNode
let bus = 0
try file = AVAudioFile(forWriting: URLFor("recording.wav")!, settings: audioEngine.inputNode!.inputFormatForBus(0).settings)
inputNode!.installTapOnBus(bus, bufferSize: 4096, format: inputNode!.inputFormatForBus(bus)) {
(buffer: AVAudioPCMBuffer!, time: AVAudioTime!) -> Void in
self.file?.writeFromBuffer(buffer)
self.socketio.send(self.toNSData(buffer))
}
do{
audioEngine.prepare()
try audioEngine.start()
}
catch{
print("catch")
}
func toNSData(PCMBuffer: AVAudioPCMBuffer) -> NSData {
let channelCount = 1 // given PCMBuffer channel count is 1
let channels = UnsafeBufferPointer(start: PCMBuffer.floatChannelData, count: channelCount)
let ch0Data = NSData(bytes: channels[0], length:Int(PCMBuffer.frameCapacity * PCMBuffer.format.streamDescription.memory.mBytesPerFrame))
return ch0Data
}
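One thing that may be worth checking in toNSData is which frame count drives the byte length: on AVAudioPCMBuffer, frameLength is the number of valid frames, while frameCapacity is the size of the allocation. The sketch below works on a plain [Float] instead of a real AVAudioPCMBuffer, and the names `encode` and `decode` are made up for illustration. It serialises only the valid frames and reads them back. It also assumes the receiver knows the data is headerless mono Float32, which a server expecting a .wav file would not.

```swift
import Foundation

/// Serialises the first `frameLength` samples of a mono Float32 channel.
/// (Hypothetical helper; `samples` stands in for floatChannelData[0].)
func encode(samples: [Float], frameLength: Int) -> Data {
    let validCount = max(0, min(frameLength, samples.count))
    let valid = Array(samples.prefix(validCount))
    return valid.withUnsafeBufferPointer { Data(buffer: $0) }
}

/// Rebuilds Float32 samples from bytes produced by `encode`.
/// Trailing bytes that do not fill a whole sample are ignored.
func decode(_ data: Data) -> [Float] {
    let count = data.count / MemoryLayout<Float>.size
    var samples = [Float](repeating: 0, count: count)
    _ = samples.withUnsafeMutableBufferPointer { data.copyBytes(to: $0) }
    return samples
}

// Capacity of 5 samples, but only the first 3 were filled by the tap.
let channel: [Float] = [0.25, -1.0, 3.5, 0, 0]
let wire = encode(samples: channel, frameLength: 3)
print(wire.count, decode(wire))
```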
I have two classes, MicrophoneHandler and AudioPlayer. I have managed to use AVCaptureSession to tap microphone data using the approved answer here, and converted the CMSampleBuffer to NSData using this function:
func sendDataToDelegate(buffer: CMSampleBuffer!)
{
let block = CMSampleBufferGetDataBuffer(buffer)
var length = 0
var data: UnsafeMutablePointer<Int8> = nil
var status = CMBlockBufferGetDataPointer(block!, 0, nil, &length, &data) // TODO: check for errors
let result = NSData(bytesNoCopy: data, length: length, freeWhenDone: false)
self.delegate.handleBuffer(result)
}
I would now like to play the audio over the speaker by converting the NSData produced above to an AVAudioPCMBuffer and playing it using AVAudioEngine. My AudioPlayer class is as follows:
var engine: AVAudioEngine!
var playerNode: AVAudioPlayerNode!
var mixer: AVAudioMixerNode!
override init()
{
super.init()
self.setup()
self.start()
}
func handleBuffer(data: NSData)
{
let newBuffer = self.toPCMBuffer(data)
print(newBuffer)
self.playerNode.scheduleBuffer(newBuffer, completionHandler: nil)
}
func setup()
{
self.engine = AVAudioEngine()
self.playerNode = AVAudioPlayerNode()
self.engine.attachNode(self.playerNode)
self.mixer = engine.mainMixerNode
engine.connect(self.playerNode, to: self.mixer, format: self.mixer.outputFormatForBus(0))
}
func start()
{
do {
try self.engine.start()
}
catch {
print("error couldn't start engine")
}
self.playerNode.play()
}
func toPCMBuffer(data: NSData) -> AVAudioPCMBuffer
{
let audioFormat = AVAudioFormat(commonFormat: AVAudioCommonFormat.PCMFormatFloat32, sampleRate: 8000, channels: 2, interleaved: false) // given NSData audio format
let PCMBuffer = AVAudioPCMBuffer(PCMFormat: audioFormat, frameCapacity: UInt32(data.length) / audioFormat.streamDescription.memory.mBytesPerFrame)
PCMBuffer.frameLength = PCMBuffer.frameCapacity
let channels = UnsafeBufferPointer(start: PCMBuffer.floatChannelData, count: Int(PCMBuffer.format.channelCount))
data.getBytes(UnsafeMutablePointer<Void>(channels[0]) , length: data.length)
return PCMBuffer
}
The buffer reaches the handleBuffer:buffer function when self.delegate.handleBuffer(result) is called in the first snippet above.
I am able to print(newBuffer), and see the memory locations of the converted buffers, but nothing comes out of the speakers. I can only imagine something is not consistent between the conversions to and from NSData. Any ideas? Thanks in advance.
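To illustrate how a format mismatch between the two conversions could produce silence or noise, here is a sketch under a loudly stated assumption: that the captured bytes are interleaved, little-endian, 16-bit integer PCM. Nothing above confirms this, and the actual layout should be read from the sample buffer's format description. The function name `deinterleave` is hypothetical. It splits such bytes into one Float array per channel, which is the shape a non-interleaved Float32 buffer expects.

```swift
import Foundation

/// Splits interleaved 16-bit PCM bytes into one Float array per channel,
/// scaling samples to the range -1.0...1.0.
/// ASSUMPTION: input is little-endian Int16, interleaved by channel.
func deinterleave(int16Bytes data: Data, channelCount: Int) -> [[Float]] {
    precondition(channelCount > 0, "need at least one channel")
    let bytesPerFrame = MemoryLayout<Int16>.size * channelCount
    let frameCount = data.count / bytesPerFrame
    let bytes = [UInt8](data)
    var channels = [[Float]](repeating: [], count: channelCount)
    for frame in 0..<frameCount {
        for channel in 0..<channelCount {
            let offset = frame * bytesPerFrame + channel * 2
            // Assemble the little-endian sample by hand to avoid alignment issues.
            let raw = UInt16(bytes[offset]) | (UInt16(bytes[offset + 1]) << 8)
            let sample = Int16(bitPattern: raw)
            channels[channel].append(Float(sample) / 32768.0)
        }
    }
    return channels
}

// Two stereo frames: (16384, -16384) and (32767, -32768).
let pcm = Data([0x00, 0x40, 0x00, 0xC0, 0xFF, 0x7F, 0x00, 0x80])
let split = deinterleave(int16Bytes: pcm, channelCount: 2)
print(split)
```

If the bytes really were in this layout, copying them straight into channels[0] of a Float32 buffer, as toPCMBuffer does, would reinterpret integer samples as floats and leave the second channel untouched.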
Skip the raw NSData format
Why not use AVAudioPlayer all the way? If you positively need NSData, you can always load such data from the soundURL below. In this example, the disk buffer is something like:
let soundURL = documentDirectory.URLByAppendingPathComponent("sound.m4a")
It makes sense to record directly to a file anyway for optimal memory and resource management. You get NSData from your recording this way:
let data = NSFileManager.defaultManager().contentsAtPath(soundURL.path())
The code below is all you need:
Record
if !audioRecorder.recording {
let audioSession = AVAudioSession.sharedInstance()
do {
try audioSession.setActive(true)
audioRecorder.record()
} catch {}
}
Play
if (!audioRecorder.recording){
do {
try audioPlayer = AVAudioPlayer(contentsOfURL: audioRecorder.url)
audioPlayer.play()
} catch {}
}
Setup
let audioSession = AVAudioSession.sharedInstance()
do {
try audioSession.setCategory(AVAudioSessionCategoryPlayAndRecord)
try audioRecorder = AVAudioRecorder(URL: self.directoryURL()!,
settings: recordSettings)
audioRecorder.prepareToRecord()
} catch {}
Settings
let recordSettings = [AVSampleRateKey : NSNumber(float: Float(44100.0)),
AVFormatIDKey : NSNumber(int: Int32(kAudioFormatMPEG4AAC)),
AVNumberOfChannelsKey : NSNumber(int: 1),
AVEncoderAudioQualityKey : NSNumber(int: Int32(AVAudioQuality.Medium.rawValue))]
Download Xcode Project:
You can find this very example here. Download the full project, which records and plays on both simulator and device, from Swift Recipes.
I'm trying to record segments of audio and recombine them without producing a gap in audio.
The eventual goal is to also have video, but I've found that audio itself creates gaps when combined with ffmpeg -f concat -i list.txt -c copy out.mp4
If I put the audio in an HLS playlist, there are also gaps, so I don't think this is unique to ffmpeg.
The idea is that samples come in continuously, and my controller routes samples to the proper AVAssetWriter. How do I eliminate gaps in audio?
import Foundation
import UIKit
import AVFoundation
class StreamController: UIViewController, AVCaptureAudioDataOutputSampleBufferDelegate, AVCaptureVideoDataOutputSampleBufferDelegate {
var closingAudioInput: AVAssetWriterInput?
var closingAssetWriter: AVAssetWriter?
var currentAudioInput: AVAssetWriterInput?
var currentAssetWriter: AVAssetWriter?
var nextAudioInput: AVAssetWriterInput?
var nextAssetWriter: AVAssetWriter?
var videoHelper: VideoHelper?
var startTime: NSTimeInterval = 0
let closeAssetQueue: dispatch_queue_t = dispatch_queue_create("closeAssetQueue", nil);
override func viewDidLoad() {
super.viewDidLoad()
startTime = NSDate().timeIntervalSince1970
createSegmentWriter()
videoHelper = VideoHelper()
videoHelper!.delegate = self
videoHelper!.startSession()
NSTimer.scheduledTimerWithTimeInterval(1, target: self, selector: "createSegmentWriter", userInfo: nil, repeats: true)
}
func createSegmentWriter() {
print("Creating segment writer at t=\(NSDate().timeIntervalSince1970 - self.startTime)")
let outputPath = OutputFileNameHelper.instance.pathForOutput()
OutputFileNameHelper.instance.incrementSegmentIndex()
try? NSFileManager.defaultManager().removeItemAtPath(outputPath)
nextAssetWriter = try! AVAssetWriter(URL: NSURL(fileURLWithPath: outputPath), fileType: AVFileTypeMPEG4)
nextAssetWriter!.shouldOptimizeForNetworkUse = true
let audioSettings: [String:AnyObject] = EncodingSettings.AUDIO
nextAudioInput = AVAssetWriterInput(mediaType: AVMediaTypeAudio, outputSettings: audioSettings)
nextAudioInput!.expectsMediaDataInRealTime = true
nextAssetWriter?.addInput(nextAudioInput!)
nextAssetWriter!.startWriting()
}
func closeWriterIfNecessary() {
if closing && audioFinished {
closing = false
audioFinished = false
let outputFile = closingAssetWriter?.outputURL.pathComponents?.last
closingAssetWriter?.finishWritingWithCompletionHandler() {
let delta = NSDate().timeIntervalSince1970 - self.startTime
print("segment \(outputFile!) finished at t=\(delta)")
}
self.closingAudioInput = nil
self.closingAssetWriter = nil
}
}
var audioFinished = false
var closing = false
func captureOutput(captureOutput: AVCaptureOutput!, didOutputSampleBuffer sampleBuffer: CMSampleBufferRef, fromConnection connection: AVCaptureConnection!) {
if let nextWriter = nextAssetWriter {
if nextWriter.status.rawValue != 0 {
if (currentAssetWriter != nil) {
closing = true
}
var sampleTiming: CMSampleTimingInfo = kCMTimingInfoInvalid
CMSampleBufferGetSampleTimingInfo(sampleBuffer, 0, &sampleTiming)
print("Switching asset writers at t=\(NSDate().timeIntervalSince1970 - self.startTime)")
closingAssetWriter = currentAssetWriter
closingAudioInput = currentAudioInput
currentAssetWriter = nextAssetWriter
currentAudioInput = nextAudioInput
nextAssetWriter = nil
nextAudioInput = nil
currentAssetWriter?.startSessionAtSourceTime(sampleTiming.presentationTimeStamp)
}
}
if let _ = captureOutput as? AVCaptureVideoDataOutput {
} else if let _ = captureOutput as? AVCaptureAudioDataOutput {
captureAudioSample(sampleBuffer)
}
dispatch_async(closeAssetQueue) {
self.closeWriterIfNecessary()
}
}
func printTimingInfo(sampleBuffer: CMSampleBufferRef, prefix: String) {
var sampleTiming: CMSampleTimingInfo = kCMTimingInfoInvalid
CMSampleBufferGetSampleTimingInfo(sampleBuffer, 0, &sampleTiming)
let presentationTime = Double(sampleTiming.presentationTimeStamp.value) / Double(sampleTiming.presentationTimeStamp.timescale)
print("\(prefix):\(presentationTime)")
}
func captureAudioSample(sampleBuffer: CMSampleBufferRef) {
printTimingInfo(sampleBuffer, prefix: "A")
if (closing && !audioFinished) {
if closingAudioInput?.readyForMoreMediaData == true {
closingAudioInput?.appendSampleBuffer(sampleBuffer)
}
closingAudioInput?.markAsFinished()
audioFinished = true
} else {
if currentAudioInput?.readyForMoreMediaData == true {
currentAudioInput?.appendSampleBuffer(sampleBuffer)
}
}
}
}
With packet formats like AAC you have silent priming frames (a.k.a encoder delay) at the beginning and remainder frames at the end (when your audio length is not a multiple of the packet size). In your case it's 2112 of them at the beginning of every file. Priming and remainder frames break the possibility of concatenating the files without transcoding them, so you can't really blame ffmpeg -c copy for not producing seamless output.
I'm not sure where this leaves you with video - obviously audio is synced to the video, even in the presence of priming frames.
It all depends on how you intend to concatenate the final audio (and eventually video). If you're doing it yourself using AVFoundation, then you can detect and account for priming/remainder frames using
CMGetAttachment(buffer, kCMSampleBufferAttachmentKey_TrimDurationAtStart, NULL)
CMGetAttachment(audioBuffer, kCMSampleBufferAttachmentKey_TrimDurationAtEnd, NULL)
As a short-term solution, you can switch to a non-"packetised" format to get gapless, concatenatable (with ffmpeg) files.
e.g.
AVFormatIDKey: kAudioFormatAppleIMA4, fileType: AVFileTypeAIFC, suffix ".aifc" or
AVFormatIDKey: kAudioFormatLinearPCM, fileType: AVFileTypeWAVE, suffix ".wav"
p.s. you can see priming & remainder frames and packet sizes using the ubiquitous afinfo tool.
afinfo chunk.mp4
Data format: 2 ch, 44100 Hz, 'aac ' (0x00000000) 0 bits/channel, 0 bytes/packet, 1024 frames/packet, 0 bytes/frame
...
audio 39596 valid frames + 2112 priming + 276 remainder = 41984
...
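The numbers in that afinfo output can be reproduced with a little arithmetic. The sketch below assumes the total is simply valid plus priming frames rounded up to a whole number of packets, which matches the figures shown; the names `PacketLayout` and `packetLayout` are made up for illustration. At 44100 Hz, 2112 priming frames amount to roughly 48 ms at the start of every file.

```swift
import Foundation

/// Frame accounting for a packetised audio file. (Hypothetical type.)
struct PacketLayout {
    let packets: Int
    let totalFrames: Int
    let remainderFrames: Int
}

/// Rounds valid + priming frames up to whole packets; whatever is left over
/// in the last packet is the remainder.
func packetLayout(validFrames: Int, primingFrames: Int, framesPerPacket: Int) -> PacketLayout {
    precondition(framesPerPacket > 0, "packet size must be positive")
    let used = validFrames + primingFrames
    let packets = (used + framesPerPacket - 1) / framesPerPacket
    let total = packets * framesPerPacket
    return PacketLayout(packets: packets, totalFrames: total, remainderFrames: total - used)
}

let layout = packetLayout(validFrames: 39596, primingFrames: 2112, framesPerPacket: 1024)
print(layout.packets, layout.totalFrames, layout.remainderFrames) // 41 41984 276
```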
Not sure if this helps you but if you have a bunch of MP4s you can use this code to combine them:
func mergeAudioFiles(audioFileUrls: NSArray, callback: (url: NSURL?, error: NSError?)->()) {
// Create the audio composition
let composition = AVMutableComposition()
// Merge
for (var i = 0; i < audioFileUrls.count; i++) {
let compositionAudioTrack :AVMutableCompositionTrack = composition.addMutableTrackWithMediaType(AVMediaTypeAudio, preferredTrackID: CMPersistentTrackID())
let asset = AVURLAsset(URL: audioFileUrls[i] as! NSURL)
let track = asset.tracksWithMediaType(AVMediaTypeAudio)[0]
let timeRange = CMTimeRange(start: CMTimeMake(0, 600), duration: track.timeRange.duration)
try! compositionAudioTrack.insertTimeRange(timeRange, ofTrack: track, atTime: composition.duration)
}
// Create output url
let format = NSDateFormatter()
format.dateFormat="yyyy-MM-dd-HH-mm-ss"
let currentFileName = "recording-\(format.stringFromDate(NSDate()))-merge.m4a"
print(currentFileName)
let documentsDirectory = NSFileManager.defaultManager().URLsForDirectory(.DocumentDirectory, inDomains: .UserDomainMask)[0]
let outputUrl = documentsDirectory.URLByAppendingPathComponent(currentFileName)
print(outputUrl.absoluteString)
// Export it
let assetExport = AVAssetExportSession(asset: composition, presetName: AVAssetExportPresetAppleM4A)
assetExport?.outputFileType = AVFileTypeAppleM4A
assetExport?.outputURL = outputUrl
assetExport?.exportAsynchronouslyWithCompletionHandler({ () -> Void in
switch assetExport!.status {
case AVAssetExportSessionStatus.Failed:
callback(url: nil, error: assetExport?.error)
default:
callback(url: assetExport?.outputURL, error: nil)
}
})
}