Writing encoded audio CMSampleBuffer not working - iOS

I'm using AudioConverter to convert uncompressed CMSampleBuffer being captured via AVCaptureSession to AudioBufferList:
let packetDescriptionsPtr = UnsafeMutablePointer<AudioStreamPacketDescription>.allocate(capacity: 1)
AudioConverterFillComplexBuffer(
    converter,
    inputDataProc,
    Unmanaged.passUnretained(self).toOpaque(),
    &ioOutputDataPacketSize,
    outOutputData.unsafeMutablePointer,
    packetDescriptionsPtr
)
I am then constructing a CMSampleBuffer containing compressed data using packet descriptions like so:
CMAudioSampleBufferCreateWithPacketDescriptions(
    allocator: kCFAllocatorDefault,
    dataBuffer: nil,
    dataReady: false,
    makeDataReadyCallback: nil,
    refcon: nil,
    formatDescription: formatDescription!,
    sampleCount: Int(data.unsafePointer.pointee.mNumberBuffers),
    presentationTimeStamp: presentationTimeStamp,
    packetDescriptions: &packetDescriptions,
    sampleBufferOut: &sampleBuffer)
When I tried saving the buffer using AVAssetWriter I got the following error:
-[AVAssetWriterInput appendSampleBuffer:] Cannot append sample buffer: First input buffer must have an appropriate kCMSampleBufferAttachmentKey_TrimDurationAtStart since the codec has encoder delay'
I decided to prime the first three buffers knowing that each is of consistent length:
if self.receivedAudioBuffers < 2 {
    let primingDuration = CMTimeMake(value: 1024, timescale: 44100)
    CMSetAttachment(sampleBuffer,
                    key: kCMSampleBufferAttachmentKey_TrimDurationAtStart,
                    value: CMTimeCopyAsDictionary(primingDuration, allocator: kCFAllocatorDefault),
                    attachmentMode: kCMAttachmentMode_ShouldNotPropagate)
    self.receivedAudioBuffers += 1
} else if self.receivedAudioBuffers == 2 {
    let primingDuration = CMTimeMake(value: 64, timescale: 44100)
    CMSetAttachment(sampleBuffer,
                    key: kCMSampleBufferAttachmentKey_TrimDurationAtStart,
                    value: CMTimeCopyAsDictionary(primingDuration, allocator: kCFAllocatorDefault),
                    attachmentMode: kCMAttachmentMode_ShouldNotPropagate)
    self.receivedAudioBuffers += 1
}
Now I no longer get the error, and appending the samples succeeds without any errors, but the audio doesn't play back in the recording, and the whole video file gets messed up (it seems like the timing info is corrupted).
Is there anything here that I'm missing? How should I correctly append an audio CMSampleBuffer?
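For reference, here is a hedged variant of the priming logic that I have seen suggested: attach the entire encoder delay to the first compressed buffer only, instead of spreading it across three buffers. The 2112-frame figure is an assumption about the AAC encoder's total priming, not something I have verified:
// Sketch only: attach the full (assumed) 2112-frame AAC priming to the first buffer.
if self.receivedAudioBuffers == 0 {
    let primingDuration = CMTimeMake(value: 2112, timescale: 44100)
    CMSetAttachment(sampleBuffer,
                    key: kCMSampleBufferAttachmentKey_TrimDurationAtStart,
                    value: CMTimeCopyAsDictionary(primingDuration, allocator: kCFAllocatorDefault),
                    attachmentMode: kCMAttachmentMode_ShouldNotPropagate)
}
self.receivedAudioBuffers += 1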

Related

How to convert to AVAudioPCMBuffer from AudioBufferList?

I am struggling to convert it.
I made the AudioBufferList data using an AudioUnit, referring to this. It is filled with audio data by AudioUnitRender().
var bufferList = AudioBufferList(
    mNumberBuffers: 1,
    mBuffers: AudioBuffer(
        mNumberChannels: UInt32(2),
        mDataByteSize: 16,
        mData: nil))
if let au = audioObject.audioUnit {
    err = AudioUnitRender(au,
                          ioActionFlags,
                          inTimeStamp,
                          inBusNumber,
                          frameCount,
                          &bufferList)
}
I then tried to convert it to an AVAudioPCMBuffer, but it doesn't work.
The following is what I did:
let audioFormat = AVAudioFormat(
    commonFormat: AVAudioCommonFormat.pcmFormatFloat64,
    sampleRate: recoder.sampleRate,
    interleaved: false,
    channelLayout: AVAudioChannelLayout(
        layoutTag: kAudioChannelLayoutTag_Stereo
    )!
)
guard let pcmBuffer = AVAudioPCMBuffer(
    pcmFormat: audioFormat,
    bufferListNoCopy: &bufferList
) else {
    return
}
This is the console message I received:
AVAudioBuffer.mm:248 the number of buffers (1) does not match the format's number of channel streams (2)
Please help me, someone.
Thank you.
The AudioBufferList you create has mNumberBuffers = 1, and the single contained AudioBuffer has mNumberChannels = 2, so the buffer list holds two interleaved channels. The AVAudioFormat you're creating is non-interleaved stereo, which is why you're seeing the format mismatch error. You can either create an interleaved AVAudioFormat to match the data, or use AVAudioConverter to deinterleave it.
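As a hedged sketch of the first option, assuming the audio unit actually renders interleaved Float32 samples (the common format you pass must match the unit's output stream format; recoder.sampleRate and bufferList are taken from your code):
// Sketch: an interleaved stereo format that matches a single two-channel AudioBuffer.
let interleavedFormat = AVAudioFormat(
    commonFormat: .pcmFormatFloat32,   // assumption: the unit renders Float32
    sampleRate: recoder.sampleRate,
    channels: 2,
    interleaved: true)!                // one buffer, two interleaved channels

guard let pcmBuffer = AVAudioPCMBuffer(
    pcmFormat: interleavedFormat,
    bufferListNoCopy: &bufferList
) else {
    return
}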

Playing a stereo audio buffer from memory with AVAudioEngine

I am trying to play a stereo audio buffer from memory (not from a file) in my iOS app but my application crashes when I attempt to attach the AVAudioPlayerNode 'playerNode' to the AVAudioEngine 'audioEngine'. The error code that I get is as follows:
Thread 1: Exception: "required condition is false: _outputFormat.channelCount == buffer.format.channelCount"
I don't know if this is due to the way I have declared the AVAudioEngine or the AVAudioPlayerNode, whether there is something wrong with the buffer I am generating, or whether I am attaching the nodes incorrectly (or something else!). I have a feeling that it is something to do with how I am creating the new buffer. I am trying to make a stereo buffer from two separate 'mono' arrays, and perhaps its format is not correct.
I have declared audioEngine: AVAudioEngine! and playerNode: AVAudioPlayerNode! globally:
var audioEngine: AVAudioEngine!
var playerNode: AVAudioPlayerNode!
I then load a mono source audio file that my app is going to process (the data out of this file will not be played, it will be loaded into an array, processed and then loaded into a new buffer):
// Read audio file
let audioFileFormat = audioFile.processingFormat
let frameCount = UInt32(audioFile.length)
let audioBuffer = AVAudioPCMBuffer(pcmFormat: audioFileFormat, frameCapacity: frameCount)!
// Read audio data into buffer
do {
    try audioFile.read(into: audioBuffer)
} catch let error {
    print(error.localizedDescription)
}
// Convert buffer to array of floats
let input: [Float] = Array(UnsafeBufferPointer(start: audioBuffer.floatChannelData![0], count: Int(audioBuffer.frameLength)))
The array is then sent to a convolution function twice that returns a new array each time. This is because the mono source file needs to become a stereo audio buffer:
maxSignalLength = input.count + 256
let leftAudioArray: [Float] = convolve(inputAudio: input, impulse: normalisedLeftImpulse)
let rightAudioArray: [Float] = convolve(inputAudio: input, impulse: normalisedRightImpulse)
The maxSignalLength variable is currently the length of the input signal plus the length of the impulse response (normalisedImpulseResponse) being convolved with it, which at the moment is 256. This will become a proper variable at some point.
I then declare and load the new buffer and its format. I have a feeling that the mistake is somewhere around here, as this is the buffer that will be played:
let bufferFormat = AVAudioFormat(commonFormat: .pcmFormatFloat32, sampleRate: hrtfSampleRate, channels: 2, interleaved: false)!
let outputBuffer = AVAudioPCMBuffer(pcmFormat: bufferFormat, frameCapacity: AVAudioFrameCount(maxSignalLength))!
Notice that I am not creating an interleaved buffer. I load the stereo audio data into the buffer as follows (which I think may also be wrong):
for ch in 0 ..< 2 {
    for i in 0 ..< maxSignalLength {
        var val: Float!
        if ch == 0 { // Left
            val = leftAudioArray[i]
            // Limit
            if val > 1 {
                val = 1
            }
            if val < -1 {
                val = -1
            }
        } else if ch == 1 { // Right
            val = rightAudioArray[i]
            // Limit
            if val > 1 {
                val = 1
            }
            if val < -1 {
                val = -1
            }
        }
        outputBuffer.floatChannelData![ch][i] = val
    }
}
The audio is also limited to values between -1 and 1.
Then I finally come to (attempting to) load the buffer to the audio node, attach the audio node to the audio engine, start the audio engine and then play the node.
let frameCapacity = AVAudioFramePosition(outputBuffer.frameCapacity)
let frameLength = outputBuffer.frameLength
playerNode.scheduleBuffer(outputBuffer, at: nil, options: AVAudioPlayerNodeBufferOptions.interrupts, completionHandler: nil)
playerNode.prepare(withFrameCount: frameLength)
let time = AVAudioTime(sampleTime: frameCapacity, atRate: hrtfSampleRate)
audioEngine.attach(playerNode)
audioEngine.connect(playerNode, to: audioEngine.mainMixerNode, format: outputBuffer.format)
audioEngine.prepare()
do {
    try audioEngine.start()
} catch let error {
    print(error.localizedDescription)
}
playerNode.play(at: time)
The error that I get in runtime is:
AVAEInternal.h:76 required condition is false: [AVAudioPlayerNode.mm:712:ScheduleBuffer: (_outputFormat.channelCount == buffer.format.channelCount)]
It doesn't show the line on which this error occurs. I have been stuck on this for a while and have tried lots of different things, but from what I could find there doesn't seem to be much clear information about playing audio from memory, rather than from files, with AVAudioEngine. Any help would be greatly appreciated.
Thanks!
Edit #1:
Better title
Edit #2:
UPDATE - I have found out why I was getting the error. It seemed to be caused by setting up the playerNode before attaching it to the audioEngine. Swapping the order stopped the program from crashing and throwing the error:
let frameCapacity = AVAudioFramePosition(outputBuffer.frameCapacity)
let frameLength = outputBuffer.frameLength
audioEngine.attach(playerNode)
audioEngine.connect(playerNode, to: audioEngine.mainMixerNode, format: outputBuffer.format)
audioEngine.prepare()
playerNode.scheduleBuffer(outputBuffer, at: nil, options: AVAudioPlayerNodeBufferOptions.interrupts, completionHandler: nil)
playerNode.prepare(withFrameCount: frameLength)
let time = AVAudioTime(sampleTime: frameCapacity, atRate: hrtfSampleRate)
do {
    try audioEngine.start()
} catch let error {
    print(error.localizedDescription)
}
playerNode.play(at: time)
However, I don't have any sound. After creating an array of floats from the outputBuffer (using the same method as for the input signal) and inspecting its contents at a breakpoint, it appears to be empty, so I must also be storing the data in the outputBuffer incorrectly.
You might be creating and filling your buffer incorrectly. Try doing it thus:
let fileURL = Bundle.main.url(forResource: "my_file", withExtension: "aiff")!
let file = try! AVAudioFile(forReading: fileURL)
let buffer = AVAudioPCMBuffer(pcmFormat: file.processingFormat, frameCapacity: UInt32(file.length))!
try! file.read(into: buffer)
I have fixed the issue!
I tried a lot of solutions and ended up completely rewriting the audio engine section of my app. I now have the AVAudioEngine and AVAudioPlayerNode declared within the ViewController class as follows:
class ViewController: UIViewController {
    var audioEngine: AVAudioEngine = AVAudioEngine()
    var playerNode: AVAudioPlayerNode = AVAudioPlayerNode()
    ...
I am still unclear whether it is better to declare these globally or as class variables in iOS, but I can confirm that my application plays audio with them declared within the ViewController class. I do know that they shouldn't be declared inside a function, as they will be deallocated and stop playing when the function goes out of scope.
However, I still was not getting any audio output until I set AVAudioPCMBuffer.frameLength to frameCapacity.
I could find very little information online about creating a new AVAudioPCMBuffer from an array of floats, but this seems to be the missing step needed to make my outputBuffer playable. Before I set it, frameLength was 0 by default.
The frameLength property isn't set by the AVAudioPCMBuffer initializer, but it is important: my buffer wasn't playable until I set it manually, after creating the buffer:
let bufferFormat = AVAudioFormat(commonFormat: .pcmFormatFloat32, sampleRate: hrtfSampleRate, channels: 2, interleaved: false)!
let frameCapacity = UInt32(audioFile.length)
guard let outputBuffer = AVAudioPCMBuffer(pcmFormat: bufferFormat, frameCapacity: frameCapacity) else {
    fatalError("Could not create output buffer.")
}
outputBuffer.frameLength = frameCapacity // Important!
This took a long time to find out, hopefully this will help someone else in the future.
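For completeness, a minimal sketch of copying the two per-channel Float arrays from the question into that buffer (this is an illustration rather than my exact code; clamping the frame count to the capacity is just a safety measure):
// Sketch: fill the non-interleaved stereo buffer from two Float arrays.
let frames = min(leftAudioArray.count, Int(outputBuffer.frameCapacity))
for i in 0..<frames {
    outputBuffer.floatChannelData![0][i] = leftAudioArray[i]
    outputBuffer.floatChannelData![1][i] = rightAudioArray[i]
}
outputBuffer.frameLength = AVAudioFrameCount(frames) // without this the buffer plays as silence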

Interpreting AudioBuffer.mData to display audio visualization

I am trying to process audio data in real time so that I can display an on-screen spectrum analyzer/visualization based on sound input from the microphone. I am using AVFoundation's AVCaptureAudioDataOutputSampleBufferDelegate to capture the audio data, which triggers the delegate function captureOutput. The function is below:
func captureOutput(_ output: AVCaptureOutput, didOutput sampleBuffer: CMSampleBuffer, from connection: AVCaptureConnection) {
    autoreleasepool {
        guard captureOutput != nil,
              sampleBuffer != nil,
              connection != nil,
              CMSampleBufferDataIsReady(sampleBuffer) else { return }

        // Check this is AUDIO (and not VIDEO) being received
        if (connection.audioChannels.count > 0)
        {
            // Determine number of frames in buffer
            var numFrames = CMSampleBufferGetNumSamples(sampleBuffer)

            // Get AudioBufferList
            var audioBufferList = AudioBufferList(mNumberBuffers: 1, mBuffers: AudioBuffer(mNumberChannels: 0, mDataByteSize: 0, mData: nil))
            var blockBuffer: CMBlockBuffer?
            CMSampleBufferGetAudioBufferListWithRetainedBlockBuffer(sampleBuffer, nil, &audioBufferList, MemoryLayout<AudioBufferList>.size, nil, nil, UInt32(kCMSampleBufferFlag_AudioBufferList_Assure16ByteAlignment), &blockBuffer)

            let audioBuffers = UnsafeBufferPointer<AudioBuffer>(start: &audioBufferList.mBuffers, count: Int(audioBufferList.mNumberBuffers))

            for audioBuffer in audioBuffers {
                let data = Data(bytes: audioBuffer.mData!, count: Int(audioBuffer.mDataByteSize))
                let i16array = data.withUnsafeBytes {
                    UnsafeBufferPointer<Int16>(start: $0, count: data.count/2).map(Int16.init(bigEndian:))
                }
                for dataItem in i16array
                {
                    print(dataItem)
                }
            }
        }
    }
}
The code above prints positive and negative Int16 numbers as expected, but I need help converting these raw numbers into meaningful data such as power and decibels for my visualizer.
I was on the right track... Thanks to RobertHarvey's comment on my question: using the Accelerate framework's FFT functions is required to build a spectrum analyzer. But before you can use these functions, you need to convert your raw data into an array of type Float, as many of them require a Float array.
Firstly, we load the raw data into a Data object:
//Read data from AudioBuffer into a variable
let data = Data(bytes: audioBuffer.mData!, count: Int(audioBuffer.mDataByteSize))
I like to think of a Data object as a "list" of 1-byte sized chunks of info (8 bits each), but if I check the number of frames I have in my sample and the total size of my Data object in bytes, they don't match:
//Get number of frames in sample and total size of Data
var numFrames = CMSampleBufferGetNumSamples(sampleBuffer) //= 1024 frames in my case
var dataSize = audioBuffer.mDataByteSize //= 2048 bytes in my case
The total size (in bytes) of my data is twice the number of frames I have in my CMSampleBuffer. This means that each frame of audio is 2 bytes in length. In order to read the data meaningfully, I need to convert my Data object, which is a "list" of 1-byte chunks, into an array of 2-byte chunks. Int16 contains 16 bits (or 2 bytes - exactly what we need), so let's create an Array of Int16:
//Convert to Int16 array (copy the bytes out rather than letting the pointer escape the closure)
let samples = data.withUnsafeBytes { rawBuffer in
    Array(rawBuffer.bindMemory(to: Int16.self))
}
Now that we have an Array of Int16, we can convert it to an Array of Float:
//Convert to Float Array
let factor = Float(Int16.max)
var floats: [Float] = Array(repeating: 0.0, count: samples.count)
for i in 0..<samples.count {
    floats[i] = Float(samples[i]) / factor
}
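(An aside, not part of the original approach: the same conversion can be done in a vectorized way with the Accelerate framework's vDSP routines, which we are about to use for the FFT anyway. samples is the Int16 array from above.)
import Accelerate

// Optional alternative: vectorized Int16 -> Float conversion and scaling with vDSP.
var floatSamples = [Float](repeating: 0, count: samples.count)
vDSP_vflt16(samples, 1, &floatSamples, 1, vDSP_Length(samples.count))       // Int16 -> Float
var scale = Float(1) / Float(Int16.max)
var scaled = [Float](repeating: 0, count: samples.count)
vDSP_vsmul(floatSamples, 1, &scale, &scaled, 1, vDSP_Length(samples.count)) // scale to [-1, 1]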
Now that we have our Float array, we can use the Accelerate framework's math functions to convert the raw Float values into meaningful ones like magnitude, decibels, etc. Links to documentation:
Apple's Accelerate Framework
Fast Fourier Transform (FFT)
I found Apple's documentation rather overwhelming. Luckily, I found a really good example online which I was able to re-purpose for my needs, called TempiFFT. Implementation as follows:
//Initiate FFT
let fft = TempiFFT(withSize: numFrames, sampleRate: 44100.0)
fft.windowType = TempiFFTWindowType.hanning

//Pass array of Floats
fft.fftForward(floats)

//I only want to display 20 bands on my analyzer
fft.calculateLinearBands(minFrequency: 0, maxFrequency: fft.nyquistFrequency, numberOfBands: 20)

//Then use a loop to iterate through the bands in your spectrum analyzer
var magnitudeArr = [Float](repeating: Float(0), count: 20)
var magnitudeDBArr = [Float](repeating: Float(0), count: 20)
for i in 0..<20 {
    magnitudeArr[i] = fft.magnitudeAtBand(i)
    magnitudeDBArr[i] = TempiFFT.toDB(fft.magnitudeAtBand(i))
    //..I didn't, but you could perform drawing functions here...
}
Other useful references:
Converting Data into Array of Int16
Converting Array of Int16 to Array of Float

Generating video or audio using raw PCM

What is the process for generating a .mov or .m4a file using arrays of Int16 as stereo audio channels?
I can easily generate raw PCM data as [Int16] from a .mov file, store it in two files, leftChannel.pcm and rightChannel.pcm, and perform some operations on it for later use. But I am not able to regenerate the video from these files.
Either approach will work: generating the video directly from raw PCM, or using an intermediate step of generating an .m4a from the PCM.
Update:
I figured out how to convert the PCM arrays to an audio file, but it won't play.
private func convertToM4a(leftChannel leftPath: URL, rightChannel rigthPath: URL, converterCallback: ConverterCallback) {
    let m4aUrl = FileManagerUtil.getTempFileName(parentFolder: FrameExtractor.PCM_ENCODE_FOLDER, fileNameWithExtension: "encodedAudio.m4a")
    if FileManager.default.fileExists(atPath: m4aUrl.path) {
        try! FileManager.default.removeItem(atPath: m4aUrl.path)
    }
    do {
        let leftBuffer = try NSArray(contentsOf: leftPath, error: ()) as! [Int16]
        let rightBuffer = try NSArray(contentsOf: rigthPath, error: ()) as! [Int16]
        let sampleRate = 44100
        let channels = 2
        let frameCapacity = (leftBuffer.count + rightBuffer.count) / 2
        let outputSettings = [
            AVFormatIDKey : NSInteger(kAudioFormatMPEG4AAC),
            AVSampleRateKey : NSInteger(sampleRate),
            AVNumberOfChannelsKey : NSInteger(channels),
            AVAudioFileTypeKey : NSInteger(kAudioFileAAC_ADTSType),
            AVLinearPCMIsBigEndianKey : true,
        ] as [String : Any]
        let audioFile = try AVAudioFile(forWriting: m4aUrl, settings: outputSettings, commonFormat: .pcmFormatInt16, interleaved: false)
        let format = AVAudioFormat(commonFormat: .pcmFormatInt16, sampleRate: Double(sampleRate), channels: AVAudioChannelCount(channels), interleaved: false)!
        let pcmBuffer = AVAudioPCMBuffer(pcmFormat: format, frameCapacity: AVAudioFrameCount(frameCapacity))!
        pcmBuffer.frameLength = pcmBuffer.frameCapacity
        for i in 0..<leftBuffer.count {
            pcmBuffer.int16ChannelData![0][i] = leftBuffer[i]
        }
        for i in 0..<rightBuffer.count {
            pcmBuffer.int16ChannelData![1][i] = rightBuffer[i]
        }
        try! audioFile.write(from: pcmBuffer)
        converterCallback.m4aEncoded(to: m4aUrl)
    } catch {
        print(error.localizedDescription)
    }
}
Saving it as .m4a with AVAudioFileTypeKey set to the m4a type gave a malformed-file error.
Saving it as .aac with the above settings plays the file, but with broken sound: just a buzzing noise with a slow-motion effect over the original audio. Initially I thought it had something to do with the input and output sample rates, but that was not the case.
I assume that something is wrong in Output Dictionary. Any help would be appreciated.
At least the creation of the AAC file with the code you are showing works.
I wrote out two NSArrays with valid Int16 audio data, and with your code I get a valid result that, when played in QuickTime Player (using the suffix .aac), sounds the same as the input.
How are you creating the input?
A buzzing sound (with lots of noise) happens, for example, if you read in audio data using an AVAudioFormat with .pcmFormatInt16, but the data actually read is in .pcmFormatFloat32 (the most common default format). Unfortunately, there is no runtime warning if you do this.
If that's the case, try using .pcmFormatFloat32. If you need the data as Int16, you can convert it yourself by mapping [-1, 1] to [-32768, 32767] for both channels:
let fac = Float(1 << 15)
for i in 0..<count {
    // Clamp to the Int16 range, then convert
    let val = min(max(inBuffer!.floatChannelData![ch][i] * fac, -fac), fac - 1)
    xxx[i] = Int16(val)
}
...
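A slightly fuller sketch of the same mapping for both channels; inBuffer stands for a hypothetical deinterleaved .pcmFormatFloat32 buffer read from your source file, and the other names are placeholders:
// Sketch: convert a Float32 buffer to per-channel [Int16] arrays.
let fac = Float(1 << 15)
let frames = Int(inBuffer!.frameLength)
var channelArrays: [[Int16]] = []
for ch in 0..<Int(inBuffer!.format.channelCount) {
    var samples = [Int16](repeating: 0, count: frames)
    for i in 0..<frames {
        let val = min(max(inBuffer!.floatChannelData![ch][i] * fac, -fac), fac - 1)
        samples[i] = Int16(val)
    }
    channelArrays.append(samples)
}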

How to generate audio file from Hex/Binary(raw data) value in iOS?

I am working on a BLE project where audio recorder hardware continuously streams data to the iOS application. On the iOS application end, I need to read the transferred data.
The hardware sends hex data to the iOS application, and we need to create a .mp3/.wav file from it.
Does anyone have an idea how to create an audio file from binary/hex input data?
Note: I have to use the raw (hex) data to create the audio file.
Thanks
It's unclear from your question how the data is coming in, but I'm going to assume at this point that you periodically have a Data of linear PCM samples as signed integers that you want to append. If it's some other format, then you'll have to change the settings. This is all just general-purpose stuff; you will almost certainly have to modify it for your specific problem.
(Much of this code is based on Create a silent audio CMSampleBufferRef)
First you need a writer:
let writer = try AVAssetWriter(outputURL: outputURL, fileType: .wav)
Then you need to know how your data is formatted (this quietly assumes that the data is a multiple of the frame size; if that isn't true, you'll need to keep track of the partial frames):
let numChannels = 1
let sampleRate = 44100
let bytesPerFrame = MemoryLayout<Int16>.size * numChannels
let frames = data.count / bytesPerFrame
let duration = Double(frames) / Double(sampleRate)
let blockSize = frames * bytesPerFrame
Then you need to know what the current frame is. This will update over time.
var currentFrame: Int64 = 0
Now you need a description of your data:
var asbd = AudioStreamBasicDescription(
    mSampleRate: Float64(sampleRate),
    mFormatID: kAudioFormatLinearPCM,
    mFormatFlags: kLinearPCMFormatFlagIsSignedInteger,
    mBytesPerPacket: UInt32(bytesPerFrame),
    mFramesPerPacket: 1,
    mBytesPerFrame: UInt32(bytesPerFrame),
    mChannelsPerFrame: UInt32(numChannels),
    mBitsPerChannel: UInt32(MemoryLayout<Int16>.size * 8),
    mReserved: 0
)

var formatDesc: CMAudioFormatDescription?
var status = CMAudioFormatDescriptionCreate(kCFAllocatorDefault, &asbd, 0, nil, 0, nil, nil, &formatDesc)
assert(status == noErr)
And create your input adapter and add it to the writer
let settings: [String : Any] = [AVFormatIDKey : kAudioFormatLinearPCM,
                                AVNumberOfChannelsKey : numChannels,
                                AVSampleRateKey : sampleRate]
let input = AVAssetWriterInput(mediaType: .audio, outputSettings: settings, sourceFormatHint: formatDesc)
writer.add(input)
That's all the one-time setup; it's time to start the writer:
writer.startWriting()
writer.startSession(atSourceTime: kCMTimeZero)
If all your data is the same size, you can create a reusable buffer (or you can create a new one each time):
var block: CMBlockBuffer?
status = CMBlockBufferCreateWithMemoryBlock(
    kCFAllocatorDefault,
    nil,        // memoryBlock
    blockSize,  // blockLength
    nil,        // blockAllocator
    nil,        // customBlockSource
    0,          // offsetToData
    blockSize,  // dataLength
    0,          // flags
    &block
)
assert(status == kCMBlockBufferNoErr)
When data comes in, copy it into the buffer:
status = CMBlockBufferReplaceDataBytes(&inputData, block!, 0, blockSize)
assert(status == kCMBlockBufferNoErr)
Now create a sample buffer from the buffer and append it to the writer input:
var sampleBuffer: CMSampleBuffer?
status = CMAudioSampleBufferCreateReadyWithPacketDescriptions(
kCFAllocatorDefault,
block, // dataBuffer
formatDesc!,
frames, // numSamples
CMTimeMake(currentFrame, Int32(sampleRate)), // sbufPTS
nil, // packetDescriptions
&sampleBuffer
)
assert(status == noErr)
input.append(sampleBuffer!)
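After appending, advance the running frame counter introduced earlier so the next chunk gets the correct timestamp (this line is implied by the "this will update over time" note above rather than spelled out):
currentFrame += Int64(frames)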
When everything is done, finalize the writer and you're done:
input.markAsFinished()
writer.finishWriting{}
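One usage note: finishWriting runs asynchronously, so anything that reads the finished file should happen inside its completion handler, for example (the print is just a stand-in):
writer.finishWriting {
    // The file at outputURL is complete only once this handler runs.
    print("Finished writing \(outputURL)")
}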
