Playing a stereo audio buffer from memory with AVAudioEngine - ios

I am trying to play a stereo audio buffer from memory (not from a file) in my iOS app but my application crashes when I attempt to attach the AVAudioPlayerNode 'playerNode' to the AVAudioEngine 'audioEngine'. The error code that I get is as follows:
Thread 1: Exception: "required condition is false: _outputFormat.channelCount == buffer.format.channelCount"
I don't know if this is due to the way I have declared the AVAudioEngine and the AVAudioPlayerNode, whether there is something wrong with the buffer I am generating, or whether I am attaching the nodes incorrectly (or something else!). I have a feeling that it has something to do with how I am creating the new buffer. I am trying to make a stereo buffer from two separate 'mono' arrays, and perhaps its format is not correct.
I have declared audioEngine: AVAudioEngine! and playerNode: AVAudioPlayerNode! globally:
var audioEngine: AVAudioEngine!
var playerNode: AVAudioPlayerNode!
I then load a mono source audio file that my app is going to process (the data out of this file will not be played, it will be loaded into an array, processed and then loaded into a new buffer):
// Read audio file
let audioFileFormat = audioFile.processingFormat
let frameCount = UInt32(audioFile.length)
let audioBuffer = AVAudioPCMBuffer(pcmFormat: audioFileFormat, frameCapacity: frameCount)!
// Read audio data into buffer
do {
try audioFile.read(into: audioBuffer)
} catch let error {
print(error.localizedDescription)
}
// Convert buffer to array of floats
let input: [Float] = Array(UnsafeBufferPointer(start: audioBuffer.floatChannelData![0], count: Int(audioBuffer.frameLength)))
The array is then sent to a convolution function twice that returns a new array each time. This is because the mono source file needs to become a stereo audio buffer:
maxSignalLength = input.count + 256
let leftAudioArray: [Float] = convolve(inputAudio: input, impulse: normalisedLeftImpulse)
let rightAudioArray: [Float] = convolve(inputAudio: input, impulse: normalisedRightImpulse)
The maxSignalLength variable is currently the length of the input signal + the length of the impulse response (normalisedImpulseResponse) that is being convolved with, which at the moment is 256. This will become an appropriate variable at some point.
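For reference, a full linear convolution of an N-sample input with an M-sample impulse produces N + M - 1 samples. The sketch below is a naive direct-form version and is not the `convolve` function from the post, whose implementation is not shown; `naiveConvolve` is a hypothetical name used only for illustration.

```swift
// Hypothetical stand-in for the post's `convolve`; direct form, O(N * M).
func naiveConvolve(_ input: [Float], _ impulse: [Float]) -> [Float] {
    guard !input.isEmpty, !impulse.isEmpty else { return [] }
    // A full linear convolution yields N + M - 1 output samples.
    var output = [Float](repeating: 0, count: input.count + impulse.count - 1)
    for (n, x) in input.enumerated() {
        for (k, h) in impulse.enumerated() {
            output[n + k] += x * h
        }
    }
    return output
}

let convolved = naiveConvolve([1, 2, 3], [1, 1])
print(convolved)        // [1.0, 3.0, 5.0, 3.0]
print(convolved.count)  // 4, i.e. 3 + 2 - 1
```

If the real `convolve` also returns N + M - 1 samples, a loop that indexes the result up to N + M would read one element past the end, so it may be worth checking the returned array's `count` before copying.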
I then declare the new buffer and its format. I have a feeling that the mistake is somewhere around here, as this is the buffer that will be played:
let bufferFormat = AVAudioFormat(commonFormat: .pcmFormatFloat32, sampleRate: hrtfSampleRate, channels: 2, interleaved: false)!
let outputBuffer = AVAudioPCMBuffer(pcmFormat: bufferFormat, frameCapacity: AVAudioFrameCount(maxSignalLength))!
Notice that I am not creating an interleaved buffer. I load the stereo audio data into the buffer as follows (which I think may also be wrong):
for ch in 0 ..< 2 {
for i in 0 ..< maxSignalLength {
var val: Float!
if ch == 0 { // Left
val = leftAudioArray[i]
// Limit
if val > 1 {
val = 1
}
if val < -1 {
val = -1
}
} else if ch == 1 { // Right
val = rightAudioArray[i]
// Limit
if val > 1 {
val = 1
}
if val < -1 {
val = -1
}
}
outputBuffer.floatChannelData![ch][i] = val
}
}
The audio is also limited to values between -1 and 1.
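One compact way to express the limiting step is a small helper that applies the same bounds to every sample, so both channels go through identical code. This is only a sketch; `clampedToUnitRange` is a hypothetical name, not something from the original project.

```swift
// Hypothetical helper: limit every sample to the range -1...1.
func clampedToUnitRange(_ samples: [Float]) -> [Float] {
    return samples.map { min(max($0, -1), 1) }
}

let limited = clampedToUnitRange([-2, -0.5, 0.25, 3])
print(limited)  // [-1.0, -0.5, 0.25, 1.0]
```

Both `leftAudioArray` and `rightAudioArray` could be passed through such a helper before their values are copied into the buffer.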
Then I finally come to (attempting to) load the buffer to the audio node, attach the audio node to the audio engine, start the audio engine and then play the node.
let frameCapacity = AVAudioFramePosition(outputBuffer.frameCapacity)
let frameLength = outputBuffer.frameLength
playerNode.scheduleBuffer(outputBuffer, at: nil, options: AVAudioPlayerNodeBufferOptions.interrupts, completionHandler: nil)
playerNode.prepare(withFrameCount: frameLength)
let time = AVAudioTime(sampleTime: frameCapacity, atRate: hrtfSampleRate)
audioEngine.attach(playerNode)
audioEngine.connect(playerNode, to: audioEngine.mainMixerNode, format: outputBuffer.format)
audioEngine.prepare()
do {
try audioEngine.start()
} catch let error {
print(error.localizedDescription)
}
playerNode.play(at: time)
The error that I get in runtime is:
AVAEInternal.h:76 required condition is false: [AVAudioPlayerNode.mm:712:ScheduleBuffer: (_outputFormat.channelCount == buffer.format.channelCount)]
It doesn't show the line that this error occurs on. I have been stuck on this for a while now, and have tried lots of different things, but there doesn't seem to be very much clear information about playing audio from memory and not from files with AVAudioEngine from what I could find. Any help would be greatly appreciated.
Thanks!
Edit #1:
Better title
Edit# 2:
UPDATE - I have found out why I was getting the error. It seemed to be caused by setting up the playerNode before attaching it to the audioEngine. Swapping the order stopped the program from crashing and throwing the error:
let frameCapacity = AVAudioFramePosition(outputBuffer.frameCapacity)
let frameLength = outputBuffer.frameLength
audioEngine.attach(playerNode)
audioEngine.connect(playerNode, to: audioEngine.mainMixerNode, format: outputBuffer.format)
audioEngine.prepare()
playerNode.scheduleBuffer(outputBuffer, at: nil, options: AVAudioPlayerNodeBufferOptions.interrupts, completionHandler: nil)
playerNode.prepare(withFrameCount: frameLength)
let time = AVAudioTime(sampleTime: frameCapacity, atRate: hrtfSampleRate)
do {
try audioEngine.start()
} catch let error {
print(error.localizedDescription)
}
playerNode.play(at: time)
However, I don't have any sound. After creating an array of floats of the outputBuffer with the same method as used for the input signal, and taking a look at its contents with a break point it seems to be empty, so I must also be incorrectly storing the data to the outputBuffer.

You might be creating and filling your buffer incorrectly. Try doing it thus:
let fileURL = Bundle.main.url(forResource: "my_file", withExtension: "aiff")!
let file = try! AVAudioFile(forReading: fileURL)
let buffer = AVAudioPCMBuffer(pcmFormat: file.processingFormat, frameCapacity: UInt32(file.length))!
try! file.read(into: buffer)

I have fixed the issue!
I tried a lot of solutions and have ended up completely re-writing the audio engine section of my app and I now have the AVAudioEngine and AVAudioPlayerNode declared within the ViewController class as the following:
class ViewController: UIViewController {
var audioEngine: AVAudioEngine = AVAudioEngine()
var playerNode: AVAudioPlayerNode = AVAudioPlayerNode()
...
I am still unclear if it is better to declare these globally or as class variables in iOS, however I can confirm that my application is playing audio with these declared within the ViewController class. I do know that they shouldn't be declared in a function as they will disappear and stop playing when the function goes out of scope.
However, I still was not getting any audio output until I set the AVAudioPCMBuffer.frameLength to frameCapacity.
I could find very little information online regarding creating a new AVAudioPCMBuffer from an array of floats, but this seems to be the missing step that I needed to do to make my outputBuffer playable. Before I set this, it was at 0 by default.
frameLength isn't a parameter of the AVAudioPCMBuffer initializer, but it is important: my buffer wasn't playable until I set it manually after creating the instance:
let bufferFormat = AVAudioFormat(commonFormat: .pcmFormatFloat32, sampleRate: hrtfSampleRate, channels: 2, interleaved: false)!
let frameCapacity = UInt32(audioFile.length)
guard let outputBuffer = AVAudioPCMBuffer(pcmFormat: bufferFormat, frameCapacity: frameCapacity) else {
fatalError("Could not create output buffer.")
}
outputBuffer.frameLength = frameCapacity // Important!
This took a long time to find out, hopefully this will help someone else in the future.
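Putting the pieces together, the following is a minimal sketch of building a playable non-interleaved stereo buffer from two mono Float arrays. The helper name `makeStereoBuffer` and its parameters are assumptions for illustration; the important step, as described above, is setting `frameLength`.

```swift
import AVFoundation

// Hypothetical helper: build a non-interleaved stereo buffer from two mono arrays.
func makeStereoBuffer(left: [Float], right: [Float], sampleRate: Double) -> AVAudioPCMBuffer? {
    // Use the shorter array so both channels are filled for every frame.
    let frames = min(left.count, right.count)
    guard frames > 0,
          let format = AVAudioFormat(commonFormat: .pcmFormatFloat32,
                                     sampleRate: sampleRate,
                                     channels: 2,
                                     interleaved: false),
          let buffer = AVAudioPCMBuffer(pcmFormat: format,
                                        frameCapacity: AVAudioFrameCount(frames)),
          let channels = buffer.floatChannelData
    else { return nil }

    // frameLength starts at 0; set it so the buffer reports valid frames.
    buffer.frameLength = AVAudioFrameCount(frames)
    for i in 0..<frames {
        channels[0][i] = left[i]   // channel 0: left
        channels[1][i] = right[i]  // channel 1: right
    }
    return buffer
}
```

The returned buffer could then be passed to `playerNode.scheduleBuffer` once the player node has been attached and connected with the same format.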

Related

AudioKit 5.2 Migration: AKSampler() to Sampler() latency when loading audioFiles

I am migrating my AudioKit-Code to AudioKit 5.2 and I am having problems with the AK5 Sampler() which I can not solve by myself. In AK previously it was AKSampler() and it had a function called:
loadAKAudioFile(from: AKSampleDescriptor, file: AKAudioFile)
Since Sampler() does not have that function, I took a piece of code from the source Sampler.swift which does the work, but it is extremely slow, it takes ages to load and map the 12 different small samples that I map on the note numbers from 60 to 71:
internal func loadAudioFile(from sampleDescriptor: SampleDescriptor, file: AVAudioFile) {
guard let floatChannelData = file.toFloatChannelData() else { return }
let sampleRate = Float(file.fileFormat.sampleRate)
let sampleCount = Int32(file.length)
let channelCount = Int32(file.fileFormat.channelCount)
var flattened = Array(floatChannelData.joined())
flattened.withUnsafeMutableBufferPointer { data in
var descriptor = SampleDataDescriptor(sampleDescriptor: sampleDescriptor,
sampleRate: sampleRate,
isInterleaved: false,
channelCount: channelCount,
sampleCount: sampleCount,
data: data.baseAddress)
akSamplerLoadData(au.dsp, &descriptor)
}
}
The AKSampler() did not have any noticeable latency when doing the work, but with Sampler() it takes more than a second to load a sample. Presumably AKSampler() worked asynchronously. I am new to Swift and audio, so I have no idea how to make Sampler() work asynchronously.
It would be great to get some bits of code that could help. Thanks!

Using AudioKit's AVAudioPCMBuffer normalize function to normalize multiple audio files

I've got an array of audio files that I want to normalize so they all have similar perceived loudness. For testing purposes, I decided to adapt the AVAudioPCMBuffer.normalize method from AudioKit to suit my purposes. See here for implementation: https://github.com/AudioKit/AudioKit/blob/main/Sources/AudioKit/Audio%20Files/AVAudioPCMBuffer%2BProcessing.swift
I am converting each file into an AVAudioPCMBuffer, and then performing a reduce on that array of buffers to get the highest peak across all of the buffers. Then I created a new version of normalize called normalize(with peakAmplitude: Float) -> AVAudioPCMBuffer that takes that peak amplitude, calculates a gainFactor, and then iterates through the floatData for each channel and multiplies the floatData by the gainFactor. I then call my new flavor of normalize with the peak.amplitude that I get from the reduce operation on all the audio buffers.
This produces useful results, sometimes.
Here's the actual code in question:
extension AVAudioPCMBuffer {
public func normalize(with peakAmplitude: Float) -> AVAudioPCMBuffer {
guard let floatData = floatChannelData else { return self }
let gainFactor: Float = 1 / peakAmplitude
let length: AVAudioFrameCount = frameLength
let channelCount = Int(format.channelCount)
// i is the index in the buffer
for i in 0 ..< Int(length) {
// n is the channel
for n in 0 ..< channelCount {
let sample = floatData[n][i] * gainFactor
self.floatChannelData?[n][i] = sample
}
}
self.frameLength = length
return self
}
}
extension Array where Element == AVAudioPCMBuffer {
public func normalized() -> [AVAudioPCMBuffer] {
var minPeak = AVAudioPCMBuffer.Peak()
minPeak.amplitude = AVAudioPCMBuffer.Peak.min
let maxPeakForAllBuffers: AVAudioPCMBuffer.Peak = reduce(minPeak) { result, buffer in
guard
let currentBufferPeak = buffer.peak(),
currentBufferPeak.amplitude > result.amplitude
else {
return result
}
return currentBufferPeak
}
return map { $0.normalize(with: maxPeakForAllBuffers.amplitude) }
}
}
Three questions:
Is my approach reasonable for multiple files?
This appears to be using "peak normalization" rather than RMS or EBU R128 normalization. Is that why, when I give it a batch of 3 audio files, 2 of them are correctly made louder but the third is also made louder, even though ffmpeg-normalize on the same batch makes that file significantly quieter?
Any other suggestions on ways to alter the floatData across multiple AVAudioPCMBuffers in order to make them have similar perceived loudness?
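To make the behaviour of the shared-peak approach concrete, here is a small sketch on plain Float arrays rather than AVAudioPCMBuffer. The function names are hypothetical and this is not AudioKit's implementation. Because one gain factor is applied to every signal, all files move in the same direction; the RMS helper is included only as a rough loudness proxy for comparison and is not EBU R128.

```swift
// Peak of one signal: the largest absolute sample value.
func peak(of samples: [Float]) -> Float {
    return samples.reduce(Float(0)) { max($0, abs($1)) }
}

// Root-mean-square level, a rough proxy for perceived loudness.
func rms(of samples: [Float]) -> Float {
    guard !samples.isEmpty else { return 0 }
    let sumOfSquares: Float = samples.reduce(Float(0)) { $0 + $1 * $1 }
    return (sumOfSquares / Float(samples.count)).squareRoot()
}

// Scale every signal by one shared gain so the loudest peak reaches 1.0.
func peakNormalized(_ signals: [[Float]]) -> [[Float]] {
    let globalPeak = signals.map { peak(of: $0) }.max() ?? 0
    guard globalPeak > 0 else { return signals }
    let gain = 1 / globalPeak
    return signals.map { signal in signal.map { $0 * gain } }
}

let normalized = peakNormalized([[0.5, -0.25], [0.125, 0]])
print(normalized)  // [[1.0, -0.5], [0.25, 0.0]]
```

In this example the second signal stays four times quieter at its peak than the first, which illustrates why a shared peak gain does not equalize levels between files.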

How to determine volume from AVAudioBuffer audio sample

I am attempting to write a small demonstration application that will do some audio measurements (volume and pitch) in realtime.
I've got, I think, to the point where I have the audio samples, but I am new to working with audio and not sure where to go next. Is there a way to determine pitch and volume of a particular sample as a function of the float/integer/byte value of the samples?
Also, I had to add this line "buffer.frameLength = 1" to get the code to run. When I print the variable "inputFormat", I get the value "".
All the material+tutorials that I can find about audio processing (in general and on ios) seems to require a lot of contextual info they leave out.
The code written in swift works to get the samples, and outputs Sample: (~ -8 to +8 float value).
func test() {
let inputNode = audioEngine.inputNode
let inputFormat = inputNode.outputFormat(forBus: 0)
let bufferSize = 10
inputNode.installTap(onBus: 0, bufferSize: AVAudioFrameCount(bufferSize), format: inputFormat) { (buffer, time) in
buffer.frameLength = 1
var i = 0;
while i < Int(buffer.frameLength) && buffer.floatChannelData != nil
{
let sample : Double = Double(buffer.floatChannelData![i].pointee)
print("\nSample: "+String(sample))
i += 1
}
}
audioEngine.prepare()
do {
try audioEngine.start()
}catch {
print(error.localizedDescription)
}
}

How to store data from an UnsafeMutablePointer in the iOS file system

I am reading data from an MFi external device into a buffer using a 3rd party SDK "sessionController". See below:
let handle: UInt64 = self.sessionController.openFile(file.path, mode: openMode)
if handle == 0 {
//Error
return
}
let c: UInt64 = file.size
var bytesArray: [UInt8] = [UInt8](fileData)
let bufferPointer: UnsafeMutablePointer<UInt8> = UnsafeMutablePointer<UInt8>.allocate(capacity: Int(c))
bufferPointer.initialize(repeating: 0, count: Int(c))
defer {
bufferPointer.deinitialize(count: Int(c))
bufferPointer.deallocate()
}
var sum: UInt32 = 0
let singleSize: UInt32 = 8 << 20
while sum < c {
let read = self.sessionController.readFile(handle, data: bufferPointer, len: singleSize)
if read == 0 {
//There was an error
return
}
sum += read
}
let newPointer : UnsafeRawPointer = UnsafeRawPointer(bufferPointer)
fileURL = try! FileManager.default.url(for: .documentDirectory, in: .userDomainMask, appropriateFor: nil, create: false).appendingPathComponent("test.MOV")
fileData = Data(bytes: newPointer, count: Int(c))
try! fileData.write(to: fileURL)
//Now use this fileURL to watch video in an AVPlayer...
//AVPlayer(init: fileURL)
For some reason the data stored at the fileURL becomes corrupted (I think) and I am unable to play the video file. I think I am not doing something correctly with Unsafe Swift but I am not sure what. How can I make sure that I have properly read the data from the device into memory, and then taken that data from memory and stored it on the hard drive at the fileURL? What am I doing wrong here? The video will not play in AVPlayer given the fileURL.
The main error is here:
let read = self.sessionController.readFile(handle, data: bufferPointer, len: singleSize)
If you read in multiple chunks then the second and all subsequent reads will overwrite the data read previously. So that should probably be
let read = self.sessionController.readFile(handle, data: bufferPointer + sum, len: singleSize)
Note also that the file size is defined as UInt64, but the variable sum (which holds the total number of bytes read so far) is a UInt32. This will lead to problems if there is more than 4 GB of data.
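The effect of advancing the destination pointer can be sketched without the SDK. Here `readChunk` is a hypothetical stand-in for `sessionController.readFile` that copies from an in-memory array, and the running total is kept in an `Int` rather than a `UInt32`.

```swift
// Hypothetical stand-in for the SDK's readFile: copies up to `len` bytes
// from `source`, starting at `position`, into `destination`.
func readChunk(from source: [UInt8], at position: Int,
               into destination: UnsafeMutablePointer<UInt8>, len: Int) -> Int {
    let count = min(len, source.count - position)
    guard count > 0 else { return 0 }
    for i in 0..<count {
        destination[i] = source[position + i]
    }
    return count
}

func readAll(_ source: [UInt8], chunkSize: Int) -> [UInt8] {
    let total = source.count
    guard total > 0, chunkSize > 0 else { return [] }
    let buffer = UnsafeMutablePointer<UInt8>.allocate(capacity: total)
    buffer.initialize(repeating: 0, count: total)
    defer {
        buffer.deinitialize(count: total)
        buffer.deallocate()
    }
    var sum = 0  // Int is 64 bits wide on current Apple platforms
    while sum < total {
        // Write each chunk after the bytes already read, not at the start.
        let read = readChunk(from: source, at: sum, into: buffer + sum,
                             len: min(chunkSize, total - sum))
        if read == 0 { break }  // Read error or end of data
        sum += read
    }
    // The array copy is made before the deferred deallocation runs.
    return Array(UnsafeBufferPointer(start: buffer, count: sum))
}
```

Calling `readAll([1, 2, 3, 4, 5, 6, 7], chunkSize: 3)` returns the original seven bytes in order; without the `+ sum` offset, each chunk would land at the start of the buffer and overwrite the previous one.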
But generally I would avoid reading the complete data into a memory buffer. You already read in chunks, so you can write the data immediately to the destination file. Here is what that could look like:
// Output file:
let fileURL = ...
let fileHandle = try FileHandle(forWritingTo: fileURL)
defer { fileHandle.closeFile() }
// Buffer:
let bufferSize = 1024 * 1024 // Choose some buffer size
var buffer = Data(count: bufferSize)
// Read/write loop:
let fileSize: UInt64 = file.size
var remainingToRead = fileSize
while remainingToRead > 0 {
let read = buffer.withUnsafeMutableBytes { bufferPointer in
self.sessionController.readFile(handle, data: bufferPointer, len: UInt32(min(remainingToRead, UInt64(bufferSize))))
}
if read == 0 {
return // Read error
}
remainingToRead -= UInt64(read)
fileHandle.write(buffer.prefix(Int(read))) // Write only the bytes actually read
}
Note also that the data is read directly into a Data value, instead of reading it into allocated memory and then copying it to another Data.

Generating video or audio using raw PCM

What is the process of generating a .mov or .m4a file using arrays of Int16 as stereo channels for audio?
I can easily generate raw PCM data as [Int16] from a .mov file and store it in two files leftChannel.pcm and rightChannel.pcm and perform some operations for later use. But I am not able to regenerate the video from these files.
Any process, i.e. direct video generation using raw PCM or using intermediate step of generating m4a from PCM will work.
Update:
I figured out how to convert the PCM array to audio file. But it won't play.
private func convertToM4a(leftChannel leftPath : URL, rightChannel rigthPath : URL, converterCallback : ConverterCallback){
let m4aUrl = FileManagerUtil.getTempFileName(parentFolder: FrameExtractor.PCM_ENCODE_FOLDER, fileNameWithExtension: "encodedAudio.m4a")
if FileManager.default.fileExists(atPath: m4aUrl.path) {
try! FileManager.default.removeItem(atPath: m4aUrl.path)
}
do{
let leftBuffer = try NSArray(contentsOf: leftPath, error: ()) as! [Int16]
let rightBuffer = try NSArray(contentsOf: rigthPath, error: ()) as! [Int16]
let sampleRate = 44100
let channels = 2
let frameCapacity = (leftBuffer.count + rightBuffer.count)/2
let outputSettings = [
AVFormatIDKey : NSInteger(kAudioFormatMPEG4AAC),
AVSampleRateKey : NSInteger(sampleRate),
AVNumberOfChannelsKey : NSInteger(channels),
AVAudioFileTypeKey : NSInteger(kAudioFileAAC_ADTSType),
AVLinearPCMIsBigEndianKey : true,
] as [String : Any]
let audioFile = try AVAudioFile(forWriting: m4aUrl, settings: outputSettings, commonFormat: .pcmFormatInt16, interleaved: false)
let format = AVAudioFormat(commonFormat: .pcmFormatInt16, sampleRate: Double(sampleRate), channels: AVAudioChannelCount(channels), interleaved: false)!
let pcmBuffer = AVAudioPCMBuffer(pcmFormat: format, frameCapacity: AVAudioFrameCount(frameCapacity))!
pcmBuffer.frameLength = pcmBuffer.frameCapacity
for i in 0..<leftBuffer.count {
pcmBuffer.int16ChannelData![0][i] = leftBuffer[i]
}
for i in 0..<rightBuffer.count {
pcmBuffer.int16ChannelData![1][i] = rightBuffer[i]
}
try! audioFile.write(from: pcmBuffer)
converterCallback.m4aEncoded(to: m4aUrl)
} catch {
print(error.localizedDescription)
}
}
Saving it as .m4a with AVAudioFileTypeKey as m4a type was giving malformed file error.
Saving it as .aac with the above settings plays the file, but with broken sound: just a buzzing sound with a slow-motion version of the original audio. Initially I thought it had something to do with the input and output sampling rates, but that was not the case.
I assume that something is wrong in Output Dictionary. Any help would be appreciated.
At least the creation of the AAC file with the code you are showing works.
I wrote out two NSArrays with valid Int16 audio data and, with your code, got a valid result: played in QuickTime Player (using the .aac suffix), it sounds the same as the input.
How are you creating the input?
A buzzing sound (with lots of noise) happens, for example, if you read audio data using an AVAudioFormat with .pcmFormatInt16 while the data actually read is in .pcmFormatFloat32 format (the most common default). Unfortunately, there is no runtime warning if you do so.
If that's the case, try using .pcmFormatFloat32. If you need Int16, you can convert it yourself by mapping [-1, 1] to [-32768, 32767] for both channels:
let fac = Float(1 << 15)
for i in 0..<count {
let val = min(max(inBuffer!.floatChannelData![ch][i] * fac, -fac), fac - 1)
xxx[i] = Int16(val)
}
...
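The same mapping can be written as a self-contained function on plain arrays. This is a sketch with a hypothetical name, `int16Samples(from:)`, and it assumes the input contains no NaN values, since converting NaN to Int16 would trap.

```swift
// Hypothetical helper: map Float samples in [-1, 1] to Int16,
// clamping out-of-range values. Assumes the input contains no NaN.
func int16Samples(from floats: [Float]) -> [Int16] {
    let scale = Float(1 << 15)  // 32768
    return floats.map { sample in
        // Clamp to [-32768, 32767] before converting so Int16(_:) cannot overflow.
        let scaled = min(max(sample * scale, -scale), scale - 1)
        return Int16(scaled)
    }
}

let converted = int16Samples(from: [-1, 0, 0.5, 1, 2])
print(converted)  // [-32768, 0, 16384, 32767, 32767]
```

Each channel's samples would be converted separately and then written to the corresponding `int16ChannelData` pointer.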

Resources