How to determine volume from AVAudioBuffer audio sample - ios

I am attempting to write a small demonstration application that will do some audio measurements (volume and pitch) in realtime.
I've got, I think, to the point where I have the audio samples, but I am new to working with audio and not sure where to go next. Is there a way to determine pitch and volume of a particular sample as a function of the float/integer/byte value of the samples?
Also, I had to add this line "buffer.frameLength = 1" to get the code to run. When I print the variable "inputFormat", I get the value "".
All the material and tutorials that I can find about audio processing (in general and on iOS) seem to require a lot of contextual information that they leave out.
The code below, written in Swift, does get the samples, and prints lines of the form "Sample: <value>", where the value is a float between roughly -8 and +8.
func test() {
    let inputNode = audioEngine.inputNode
    let inputFormat = inputNode.outputFormat(forBus: 0)
    let bufferSize = 10
    inputNode.installTap(onBus: 0, bufferSize: AVAudioFrameCount(bufferSize), format: inputFormat) { (buffer, time) in
        buffer.frameLength = 1
        var i = 0;
        while i < Int(buffer.frameLength) && buffer.floatChannelData != nil
        {
            let sample : Double = Double(buffer.floatChannelData![i].pointee)
            print("\nSample: "+String(sample))
            i += 1
        }
    }
    audioEngine.prepare()
    do {
        try audioEngine.start()
    }catch {
        print(error.localizedDescription)
    }
}
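A common way to turn a block of samples into a single "volume" number is the root-mean-square (RMS) level, often expressed in decibels relative to full scale. The following is a minimal sketch under that assumption, not a definitive implementation: the names rms, decibels and minimum are hypothetical, and the input is a plain [Float] that would have to be copied out of one channel of the tap buffer over frameLength frames. Note that in the loop above, floatChannelData![i] selects channel i rather than sample i, and forcing frameLength to 1 leaves only one sample to measure.

```swift
import Foundation

/// Root-mean-square level of a block of samples (0 for an empty block).
func rms(_ samples: [Float]) -> Float {
    guard !samples.isEmpty else { return 0 }
    let sumOfSquares = samples.reduce(Float(0)) { $0 + $1 * $1 }
    return (sumOfSquares / Float(samples.count)).squareRoot()
}

/// Converts a linear level to decibels relative to full scale (1.0).
/// `minimum` is an arbitrary floor (an assumption) that avoids log10(0).
func decibels(_ level: Float, minimum: Float = -160) -> Float {
    guard level > 0 else { return minimum }
    return max(minimum, 20 * log10(level))
}

// A square wave at half of full scale has an RMS level of 0.5.
let block: [Float] = [0.5, -0.5, 0.5, -0.5]
let level = rms(block)
print("rms:", level, "dBFS:", decibels(level))
```

Pitch is not covered by this sketch; it needs frequency analysis over many samples rather than a per-sample formula.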

Related

Using AudioKit's AVAudioPCMBuffer normalize function to normalize multiple audio files

I've got an array of audio files that I want to normalize so they all have similar perceived loudness. For testing purposes, I decided to adapt the AVAudioPCMBuffer.normalize method from AudioKit to suit my purposes. See here for implementation: https://github.com/AudioKit/AudioKit/blob/main/Sources/AudioKit/Audio%20Files/AVAudioPCMBuffer%2BProcessing.swift
I am converting each file into an AVAudioPCMBuffer, and then performing a reduce on that array of buffers to get the highest peak across all of the buffers. Then I created a new version of normalize, normalize(with peakAmplitude: Float) -> AVAudioPCMBuffer, that takes that peak amplitude, calculates a gainFactor, and then iterates through the floatData for each channel and multiplies each sample by the gainFactor. I then call my new flavor of normalize with the peak.amplitude that I get from the reduce operation on all the audio buffers.
This produces useful results, sometimes.
Here's the actual code in question:
extension AVAudioPCMBuffer {
public func normalize(with peakAmplitude: Float) -> AVAudioPCMBuffer {
guard let floatData = floatChannelData else { return self }
let gainFactor: Float = 1 / peakAmplitude
let length: AVAudioFrameCount = frameLength
let channelCount = Int(format.channelCount)
// i is the index in the buffer
for i in 0 ..< Int(length) {
// n is the channel
for n in 0 ..< channelCount {
let sample = floatData[n][i] * gainFactor
self.floatChannelData?[n][i] = sample
}
}
self.frameLength = length
return self
}
}
extension Array where Element == AVAudioPCMBuffer {
public func normalized() -> [AVAudioPCMBuffer] {
var minPeak = AVAudioPCMBuffer.Peak()
minPeak.amplitude = AVAudioPCMBuffer.Peak.min
let maxPeakForAllBuffers: AVAudioPCMBuffer.Peak = reduce(minPeak) { result, buffer in
guard
let currentBufferPeak = buffer.peak(),
currentBufferPeak.amplitude > result.amplitude
else {
return result
}
return currentBufferPeak
}
return map { $0.normalize(with: maxPeakForAllBuffers.amplitude) }
}
}
Three questions:
Is my approach reasonable for multiple files?
This appears to use "peak normalization" rather than RMS or EBU R128 normalization. Is that why, when I give it a batch of three audio files, two are correctly made louder but the third is also made louder, even though ffmpeg-normalize on the same batch makes that third file significantly quieter?
Any other suggestions on ways to alter the floatData across multiple AVAudioPCMBuffers in order to make them have similar perceived loudness?
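To illustrate the difference between the two approaches, here is a minimal sketch on plain [Float] arrays. The code above computes one gainFactor (1 / highest peak) and applies it to every buffer, so every file is scaled by the same amount and their relative loudness does not change. A per-file level target, which is roughly what RMS or loudness-based tools aim for, can make a loud file quieter. The helper names (peak, rms, rmsGain) and the target of 0.25 are assumptions for illustration, and plain RMS is only a rough stand-in for perceived loudness as measured by EBU R128.

```swift
import Foundation

/// Largest absolute sample value in the block.
func peak(_ samples: [Float]) -> Float {
    return samples.reduce(Float(0)) { max($0, abs($1)) }
}

/// Root-mean-square level of the block (0 for an empty block).
func rms(_ samples: [Float]) -> Float {
    guard !samples.isEmpty else { return 0 }
    let sumOfSquares = samples.reduce(Float(0)) { $0 + $1 * $1 }
    return (sumOfSquares / Float(samples.count)).squareRoot()
}

/// Gain that moves this block's RMS level to `targetRMS`,
/// capped so that the scaled peak never exceeds 1.0.
func rmsGain(for samples: [Float], targetRMS: Float) -> Float {
    let level = rms(samples)
    let top = peak(samples)
    guard level > 0, top > 0 else { return 1 }
    return min(targetRMS / level, 1 / top)
}

let quiet: [Float] = [0.1, -0.1, 0.1, -0.1]
let loud: [Float] = [0.5, -0.5, 0.5, -0.5]

// Shared-peak approach: one gain for the whole batch.
let sharedGain = 1 / max(peak(quiet), peak(loud))

// Per-file approach: each file gets its own gain.
let quietGain = rmsGain(for: quiet, targetRMS: 0.25)
let loudGain = rmsGain(for: loud, targetRMS: 0.25)

print(sharedGain, quietGain, loudGain)
```

With these inputs the shared gain is above 1 for both files, while the per-file gain for the loud file is below 1, which matches the kind of disagreement described in the question.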

Playing a stereo audio buffer from memory with AVAudioEngine

I am trying to play a stereo audio buffer from memory (not from a file) in my iOS app but my application crashes when I attempt to attach the AVAudioPlayerNode 'playerNode' to the AVAudioEngine 'audioEngine'. The error code that I get is as follows:
Thread 1: Exception: "required condition is false: _outputFormat.channelCount == buffer.format.channelCount"
I don't know if this is due to the way I have declared the AVAudioEngine and the AVAudioPlayerNode, if there is something wrong with the buffer I am generating, or if I am attaching the nodes incorrectly (or something else!). I have a feeling that it has something to do with how I am creating the new buffer. I am trying to make a stereo buffer from two separate 'mono' arrays, and perhaps its format is not correct.
I have declared audioEngine: AVAudioEngine! and playerNode: AVAudioPlayerNode! globally:
var audioEngine: AVAudioEngine!
var playerNode: AVAudioPlayerNode!
I then load a mono source audio file that my app is going to process (the data out of this file will not be played, it will be loaded into an array, processed and then loaded into a new buffer):
// Read audio file
let audioFileFormat = audioFile.processingFormat
let frameCount = UInt32(audioFile.length)
let audioBuffer = AVAudioPCMBuffer(pcmFormat: audioFileFormat, frameCapacity: frameCount)!
// Read audio data into buffer
do {
try audioFile.read(into: audioBuffer)
} catch let error {
print(error.localizedDescription)
}
// Convert buffer to array of floats
let input: [Float] = Array(UnsafeBufferPointer(start: audioBuffer.floatChannelData![0], count: Int(audioBuffer.frameLength)))
The array is then sent to a convolution function twice that returns a new array each time. This is because the mono source file needs to become a stereo audio buffer:
maxSignalLength = input.count + 256
let leftAudioArray: [Float] = convolve(inputAudio: input, impulse: normalisedLeftImpulse)
let rightAudioArray: [Float] = convolve(inputAudio: input, impulse: normalisedRightImpulse)
The maxSignalLength variable is currently the length of the input signal + the length of the impulse response (normalisedImpulseResponse) that is being convolved with, which at the moment is 256. This will become an appropriate variable at some point.
I then declare the new buffer and its format. I have a feeling that the mistake is somewhere around here, as this is the buffer that will be played:
let bufferFormat = AVAudioFormat(commonFormat: .pcmFormatFloat32, sampleRate: hrtfSampleRate, channels: 2, interleaved: false)!
let outputBuffer = AVAudioPCMBuffer(pcmFormat: bufferFormat, frameCapacity: AVAudioFrameCount(maxSignalLength))!
Notice that I am not creating an interleaved buffer. I load the stereo audio data into the buffer as follows (which I think may also be wrong):
for ch in 0 ..< 2 {
    for i in 0 ..< maxSignalLength {
        var val: Float!
        if ch == 0 { // Left
            val = leftAudioArray[i]
            // Limit
            if val > 1 {
                val = 1
            }
            if val < -1 {
                val = -1
            }
        } else if ch == 1 { // Right
            val = rightAudioArray[i]
            // Limit
            if val > 1 {
                val = 1
            }
            if val < -1 {
                val = -1
            }
        }
        outputBuffer.floatChannelData![ch][i] = val
    }
}
The audio is also limited to values between -1 and 1.
Then I finally come to (attempting to) load the buffer to the audio node, attach the audio node to the audio engine, start the audio engine and then play the node.
let frameCapacity = AVAudioFramePosition(outputBuffer.frameCapacity)
let frameLength = outputBuffer.frameLength
playerNode.scheduleBuffer(outputBuffer, at: nil, options: AVAudioPlayerNodeBufferOptions.interrupts, completionHandler: nil)
playerNode.prepare(withFrameCount: frameLength)
let time = AVAudioTime(sampleTime: frameCapacity, atRate: hrtfSampleRate)
audioEngine.attach(playerNode)
audioEngine.connect(playerNode, to: audioEngine.mainMixerNode, format: outputBuffer.format)
audioEngine.prepare()
do {
try audioEngine.start()
} catch let error {
print(error.localizedDescription)
}
playerNode.play(at: time)
The error that I get in runtime is:
AVAEInternal.h:76 required condition is false: [AVAudioPlayerNode.mm:712:ScheduleBuffer: (_outputFormat.channelCount == buffer.format.channelCount)]
It doesn't show the line that this error occurs on. I have been stuck on this for a while now, and have tried lots of different things, but there doesn't seem to be very much clear information about playing audio from memory and not from files with AVAudioEngine from what I could find. Any help would be greatly appreciated.
Thanks!
Edit #1:
Better title
Edit# 2:
UPDATE - I have found out why I was getting the error. It seemed to be caused by setting up the playerNode before attaching it to the audioEngine. Swapping the order stopped the program from crashing and throwing the error:
let frameCapacity = AVAudioFramePosition(outputBuffer.frameCapacity)
let frameLength = outputBuffer.frameLength
audioEngine.attach(playerNode)
audioEngine.connect(playerNode, to: audioEngine.mainMixerNode, format: outputBuffer.format)
audioEngine.prepare()
playerNode.scheduleBuffer(outputBuffer, at: nil, options: AVAudioPlayerNodeBufferOptions.interrupts, completionHandler: nil)
playerNode.prepare(withFrameCount: frameLength)
let time = AVAudioTime(sampleTime: frameCapacity, atRate: hrtfSampleRate)
do {
try audioEngine.start()
} catch let error {
print(error.localizedDescription)
}
playerNode.play(at: time)
However, I don't have any sound. After creating an array of floats of the outputBuffer with the same method as used for the input signal, and taking a look at its contents with a break point it seems to be empty, so I must also be incorrectly storing the data to the outputBuffer.
You might be creating and filling your buffer incorrectly. Try doing it thus:
let fileURL = Bundle.main.url(forResource: "my_file", withExtension: "aiff")!
let file = try! AVAudioFile(forReading: fileURL)
let buffer = AVAudioPCMBuffer(pcmFormat: file.processingFormat, frameCapacity: UInt32(file.length))!
try! file.read(into: buffer)
I have fixed the issue!
I tried a lot of solutions and have ended up completely re-writing the audio engine section of my app and I now have the AVAudioEngine and AVAudioPlayerNode declared within the ViewController class as the following:
class ViewController: UIViewController {
var audioEngine: AVAudioEngine = AVAudioEngine()
var playerNode: AVAudioPlayerNode = AVAudioPlayerNode()
...
I am still unclear if it is better to declare these globally or as class variables in iOS, however I can confirm that my application is playing audio with these declared within the ViewController class. I do know that they shouldn't be declared in a function as they will disappear and stop playing when the function goes out of scope.
However, I still was not getting any audio output until I set the AVAudioPCMBuffer.frameLength to frameCapacity.
I could find very little information online regarding creating a new AVAudioPCMBuffer from an array of floats, but this seems to be the missing step that I needed to do to make my outputBuffer playable. Before I set this, it was at 0 by default.
frameLength is not a parameter of the AVAudioPCMBuffer initializer, which takes only a format and a frameCapacity. It is important, though: my buffer wasn't playable until I set it manually, after creating the instance:
let bufferFormat = AVAudioFormat(commonFormat: .pcmFormatFloat32, sampleRate: hrtfSampleRate, channels: 2, interleaved: false)!
let frameCapacity = UInt32(audioFile.length)
guard let outputBuffer = AVAudioPCMBuffer(pcmFormat: bufferFormat, frameCapacity: frameCapacity) else {
fatalError("Could not create output buffer.")
}
outputBuffer.frameLength = frameCapacity // Important!
This took a long time to find out, hopefully this will help someone else in the future.
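Putting the pieces of this answer together, the following is a hedged sketch of building a playable, non-interleaved stereo buffer from two mono Float arrays. makeStereoBuffer is a hypothetical helper, not an AVFoundation API; it assumes both arrays share the same sample rate, clamps samples to the range -1...1 as in the question, and sets frameLength after writing the samples.

```swift
import AVFoundation

/// Builds a non-interleaved stereo buffer from two mono arrays.
/// Hypothetical helper for illustration; returns nil if the format
/// or the buffer cannot be created, or if either array is empty.
func makeStereoBuffer(left: [Float], right: [Float], sampleRate: Double) -> AVAudioPCMBuffer? {
    let frames = min(left.count, right.count)
    guard frames > 0,
          let format = AVAudioFormat(commonFormat: .pcmFormatFloat32,
                                     sampleRate: sampleRate,
                                     channels: 2,
                                     interleaved: false),
          let buffer = AVAudioPCMBuffer(pcmFormat: format,
                                        frameCapacity: AVAudioFrameCount(frames)),
          let channels = buffer.floatChannelData
    else { return nil }

    for i in 0..<frames {
        // Clamp each sample to the range -1...1 before storing it.
        channels[0][i] = min(1, max(-1, left[i]))
        channels[1][i] = min(1, max(-1, right[i]))
    }

    // frameLength is 0 on a new buffer; without this the buffer is treated as empty.
    buffer.frameLength = AVAudioFrameCount(frames)
    return buffer
}

let demo = makeStereoBuffer(left: [0.5, 2.0, -3.0],
                            right: [0.1, 0.2, 0.3],
                            sampleRate: 44_100)
```

The returned buffer can then be passed to playerNode.scheduleBuffer once the player node has been attached and connected, in the order shown in Edit #2 of the question.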

iOS AudioKit AKAmplitudeTracker

I'm trying to get something like this playground working on iOS:
http://audiokit.io/playgrounds/Analysis/Tracking%20Amplitude/
This is my view controller, where I use the mandolin physical model to create notes and then run an fft and an amplitudeTracker. But I get no values from them. You can see the output below:
var fft: AKFFTTap!
var amplitudeTracker: AKAmplitudeTracker!
override func viewDidLoad() {
super.viewDidLoad()
let mandolin = AKMandolin()
mandolin.detune = 1
mandolin.bodySize = 1
let pluckPosition = 0.2
let scale: [MIDINoteNumber] = [72, 74, 76, 77, 79, 81, 83, 84]
let delay = AKDelay(mandolin)
let mix = AKMixer()
mix.connect(delay)
let reverb = AKReverb(mix)
amplitudeTracker = AKAmplitudeTracker(mix)
fft = AKFFTTap(mix)
AudioKit.output = reverb
AudioKit.start()
for note in scale {
let note1: MIDINoteNumber = note
let octave1: MIDINoteNumber = 4
let course1 = 2
let count = 25
mandolin.fret(noteNumber: note1 + octave1, course: course1 - 1)
mandolin.pluck(course: course1 - 1, position: pluckPosition, velocity: 127)
print("playing note")
let fftData = self.fft.fftData
let lowMax = fftData[0 ... (count / 2) - 1].max() ?? 0
let hiMax = fftData[count / 2 ... count - 1].max() ?? 0
let hiMin = fftData[count / 2 ... count - 1].min() ?? 0
let amplitude = Float(self.amplitudeTracker.amplitude * 65)
print("amplitude \(amplitude)")
print("lowMax \(lowMax)")
print("hiMax \(hiMax)")
print("hiMin \(hiMin)")
sleep(1)
}
}
This is the output I get when I run it :
2017-09-26 12:43:27.724706-0700 AK[9467:1161171] 957: AUParameterSet 2 (1/8): err -10867
2017-09-26 12:43:28.177699-0700 AK[9467:1161171] 957: AUParameterSet 2 (1/8): err -10867
playing note
amplitude 0.0
lowMax 0.0
hiMax 0.0
hiMin 0.0
playing note
amplitude 0.0
lowMax 0.0
hiMax 0.0
hiMin 0.0
...
The main problem here is that the amplitude tracker node is not part of the signal chain. AudioKit (and Apple's underlying AVAudioEngine) works on a pull model: audio is not pulled through a node unless a downstream node requests it. In practice, only the nodes upstream of the AudioKit.output node have audio pulled through them.
However, here, the reverb is made to be the output, so the tracker itself doesn't get any data coming through it. Changing it to AudioKit.output = amplitudeTracker will get the data going through the node.
The amplitudeTracker acts as a passthrough, so the audio comes through as well. If you do not want to hear the audio, route the output of the tracker through a booster that lowers the volume to zero.
I was getting this -10867 error while trying to reinitialize an AKSequencer variable that had a bunch of tracks, samplers, etc.
I stored them in arrays, called the following before reinitializing, and the -10867 errors went away:
private var samplers = [AKMIDISampler]()
private var tracks = [AKMusicTrack]()
private var mixer = AKMixer()
...
public func cleanSequencer() {
for track in tracks {
track.clear()
}
for sample in samplers {
sample.disconnectOutput()
sample.destroyEndpoint()
}
mixer.detach()
}
Hope this helps!
------- UPDATE: 01 -------
This produced some unexpected effects, mainly that no sound was played after using this method.
I am now curious whether anyone knows why the -10867 errors would go away along with the sound.

what samples will buffer contain when I use installTap(onBus for multichannel audio?

If the channel count is greater than 1, will the buffer contain only the left microphone, or samples from both the left and right microphones?
When I use the iPhone simulator, the format is pcmFormatFloat32, channelCount = 2, sampleRate = 44100.0, not interleaved.
I use this code:
let bus = 0
inputNode.installTap(onBus: bus, bufferSize: myTapOnBusBufferSize, format: theAudioFormat) {
(buffer: AVAudioPCMBuffer!, time: AVAudioTime!) -> Void in
self.onNewBuffer(buffer)
}
func onNewBuffer(_ inputBuffer:AVAudioPCMBuffer!)
{
var samplesAsDoubles:[Double] = []
for i in 0 ..< Int(inputBuffer.frameLength)
{
let theSample = Double((inputBuffer.floatChannelData?.pointee[i])!)
samplesAsDoubles.append( theSample )
}
}
print("number of input busses = \(inputNode.numberOfInputs)")
It prints:
number of input busses = 1
For each sample in my samplesAsDoubles array, taken from the buffer inside the block, which channel does it come from? At what time was this sample recorded?
From the header comments for floatChannelData:
The returned pointer is to format.channelCount pointers to float. Each of these pointers
is to "frameLength" valid samples, which are spaced by "stride" samples.
If format.interleaved is false (as with the standard deinterleaved float format), then
the pointers will be to separate chunks of memory. "stride" is 1.
floatChannelData gives you a 2D float array. For a non-interleaved 2 channel buffer, you would access the individual samples like this:
let channelCount = Int(buffer.format.channelCount)
let frameCount = Int(buffer.frameLength)
if let channels = buffer.floatChannelData { //channels is 2D float array
    for channelIndex in 0..<channelCount {
        let channel = channels[channelIndex] //1D float array
        print(channelIndex == 0 ? "left" : "right")
        for frameIndex in 0..<frameCount {
            let sample = channel[frameIndex]
            print(" \(sample)")
        }
    }
}
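The header comment also mentions "stride", which matters when the format is interleaved: the samples of one channel are then spaced by the number of channels inside a single block of memory. As a rough illustration on a plain array rather than on an AVAudioPCMBuffer, the sketch below splits interleaved stereo data (L R L R ...) into one array per channel. deinterleave is a hypothetical helper name, not an AVFoundation API.

```swift
/// Splits an interleaved sample array (L R L R ...) into one array per channel.
/// Trailing samples that do not form a complete frame are ignored.
func deinterleave(_ interleaved: [Float], channelCount: Int) -> [[Float]] {
    precondition(channelCount > 0, "channelCount must be positive")
    let frameCount = interleaved.count / channelCount
    return (0..<channelCount).map { channel in
        // Samples of one channel are spaced by `channelCount` (the stride).
        stride(from: channel, to: frameCount * channelCount, by: channelCount)
            .map { interleaved[$0] }
    }
}

let channels = deinterleave([0.1, 0.9, 0.2, 0.8, 0.3, 0.7], channelCount: 2)
print("left:", channels[0])
print("right:", channels[1])
```

For the non-interleaved format reported in the question, no such step is needed, because each channel pointer already refers to its own contiguous samples.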

InstallTapOnBus - for output signal

As the title says, I want to analyze the buffer of the output signal. I've used this function (InstallTapOnBus) for the microphone signal, but it does not work for the output. Does anyone know how to do that?
let bus = 0
let node = engine.outputNode
node.installTap(onBus: bus, bufferSize: AVAudioFrameCount(BUFFER_SIZE), format: node.outputFormat(forBus: bus), block: { (buffer : AVAudioPCMBuffer ,time : AVAudioTime) in
...
})
try! engine.start()
}
It provides me an error : "required condition is false: _isInput"
Have you tried tapping the mixer instead of the microphone directly?
Try mainMixerNode instead of outputNode.
This worked for me (iOS 12):
let outputNode = self.audioEngine.mainMixerNode
let format = self.audioEngine.mainMixerNode.outputFormat(forBus: 0)
Then installTap on mainMixerNode like you did.
