AudioKit 5.2 Migration: AKSampler() to Sampler() latency when loading audio files

I am migrating my AudioKit code to AudioKit 5.2 and I am having a problem with the AK5 Sampler() that I cannot solve by myself. In previous AudioKit versions the class was AKSampler() and it had a function called:
loadAKAudioFile(from: AKSampleDescriptor, file: AKAudioFile)
Since Sampler() does not have that function, I took a piece of code from the Sampler.swift source that does the work, but it is extremely slow: it takes ages to load and map the 12 small samples that I assign to note numbers 60 to 71:
internal func loadAudioFile(from sampleDescriptor: SampleDescriptor, file: AVAudioFile) {
    guard let floatChannelData = file.toFloatChannelData() else { return }

    let sampleRate = Float(file.fileFormat.sampleRate)
    let sampleCount = Int32(file.length)
    let channelCount = Int32(file.fileFormat.channelCount)
    var flattened = Array(floatChannelData.joined())

    flattened.withUnsafeMutableBufferPointer { data in
        var descriptor = SampleDataDescriptor(sampleDescriptor: sampleDescriptor,
                                              sampleRate: sampleRate,
                                              isInterleaved: false,
                                              channelCount: channelCount,
                                              sampleCount: sampleCount,
                                              data: data.baseAddress)
        akSamplerLoadData(au.dsp, &descriptor)
    }
}
The AKSampler() did not have any noticeable latency when doing this work, but with Sampler() it takes more than a second to load a sample. Presumably AKSampler() did its loading asynchronously. I am new to Swift and audio, so I have no idea how to make Sampler() load asynchronously.
It would be great to get some bits of code that could help with this, thanks.
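One way to get the old non-blocking behaviour back is simply to run the load-and-map loop off the main thread; this is a sketch, not an official AudioKit API. It keeps the loadAudioFile(from:file:) helper above; sampleURLs, makeSampleDescriptor(noteNumber:) and buildKeyMap() are hypothetical names standing in for however you build your descriptors and key map:
// Sketch: load and map the 12 samples on a background queue so the UI is not blocked.
// Assumes this lives in the same class that owns `sampler` and the helper above;
// avoid triggering notes on the sampler until `completion` has been called.
func loadSamplesInBackground(from sampleURLs: [URL], completion: @escaping () -> Void) {
    DispatchQueue.global(qos: .userInitiated).async {
        for (index, url) in sampleURLs.enumerated() {
            do {
                let file = try AVAudioFile(forReading: url)
                let descriptor = self.makeSampleDescriptor(noteNumber: 60 + index) // hypothetical helper
                self.loadAudioFile(from: descriptor, file: file)
            } catch {
                print("Could not load \(url.lastPathComponent): \(error)")
            }
        }
        self.sampler.buildKeyMap() // assumption: the same key-map rebuild AKSampler needed after loading
        DispatchQueue.main.async {
            completion()
        }
    }
}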

Related

Using AudioKit's AVAudioPCMBuffer normalize function to normalize multiple audio files

I've got an array of audio files that I want to normalize so they all have similar perceived loudness. For testing purposes, I decided to adapt the AVAudioPCMBuffer.normalize method from AudioKit to suit my purposes. See here for implementation: https://github.com/AudioKit/AudioKit/blob/main/Sources/AudioKit/Audio%20Files/AVAudioPCMBuffer%2BProcessing.swift
I am converting each file into an AVAudioPCMBuffer, and then performing a reduce on that array of buffers to get the highest peak across all of the buffers. Then I created a new version of normalize, normalize(with peakAmplitude: Float) -> AVAudioPCMBuffer, that takes that peak amplitude, calculates a gainFactor, and then iterates through the floatData for each channel, multiplying each sample by the gainFactor. I then call my new flavor of normalize with the peak.amplitude that I get from the reduce operation over all the audio buffers.
This produces useful results, sometimes.
Here's the actual code in question:
extension AVAudioPCMBuffer {
    public func normalize(with peakAmplitude: Float) -> AVAudioPCMBuffer {
        guard let floatData = floatChannelData else { return self }

        let gainFactor: Float = 1 / peakAmplitude
        let length: AVAudioFrameCount = frameLength
        let channelCount = Int(format.channelCount)

        // i is the index in the buffer
        for i in 0 ..< Int(length) {
            // n is the channel
            for n in 0 ..< channelCount {
                let sample = floatData[n][i] * gainFactor
                self.floatChannelData?[n][i] = sample
            }
        }
        self.frameLength = length
        return self
    }
}
extension Array where Element == AVAudioPCMBuffer {
    public func normalized() -> [AVAudioPCMBuffer] {
        var minPeak = AVAudioPCMBuffer.Peak()
        minPeak.amplitude = AVAudioPCMBuffer.Peak.min

        let maxPeakForAllBuffers: AVAudioPCMBuffer.Peak = reduce(minPeak) { result, buffer in
            guard
                let currentBufferPeak = buffer.peak(),
                currentBufferPeak.amplitude > result.amplitude
            else {
                return result
            }
            return currentBufferPeak
        }
        return map { $0.normalize(with: maxPeakForAllBuffers.amplitude) }
    }
}
Three questions:
Is my approach reasonable for multiple files?
This appears to be using "peak normalization" rather than RMS or EBU R128 normalization. Is that why, when I give it a batch of 3 audio files, 2 of them are correctly made louder but the third is also made louder, even though ffmpeg-normalize on the same batch makes that file significantly quieter?
Any other suggestions on ways to alter the floatData across multiple AVAudioPCMBuffers in order to make them have similar perceived loudness?
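On question 2: peak normalization only equalizes the highest sample value, so a file with a high peak but a low average level can still come out louder than ffmpeg-normalize (which measures EBU R128 loudness) would make it. A crude step toward perceived loudness is to match RMS instead of peak. The extension below is only an illustrative sketch, not proper R128 loudness normalization, and scaling up can push samples past ±1, so you may need to limit afterwards:
import AVFoundation

extension AVAudioPCMBuffer {
    // Root-mean-square level across all channels, a rough proxy for perceived loudness.
    public var rmsLevel: Float {
        guard let floatData = floatChannelData, frameLength > 0 else { return 0 }
        let channelCount = Int(format.channelCount)
        var sumOfSquares: Float = 0
        var sampleCount = 0
        for channel in 0 ..< channelCount {
            for frame in 0 ..< Int(frameLength) {
                let sample = floatData[channel][frame]
                sumOfSquares += sample * sample
                sampleCount += 1
            }
        }
        return (sumOfSquares / Float(sampleCount)).squareRoot()
    }

    // Scales the buffer in place so its RMS matches targetRMS (hypothetical helper,
    // in the same style as the normalize(with:) method above).
    public func normalize(toRMS targetRMS: Float) -> AVAudioPCMBuffer {
        let currentRMS = rmsLevel
        guard let floatData = floatChannelData, currentRMS > 0 else { return self }
        let gainFactor = targetRMS / currentRMS
        for channel in 0 ..< Int(format.channelCount) {
            for frame in 0 ..< Int(frameLength) {
                floatData[channel][frame] *= gainFactor
            }
        }
        return self
    }
}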

Playing a stereo audio buffer from memory with AVAudioEngine

I am trying to play a stereo audio buffer from memory (not from a file) in my iOS app but my application crashes when I attempt to attach the AVAudioPlayerNode 'playerNode' to the AVAudioEngine 'audioEngine'. The error code that I get is as follows:
Thread 1: Exception: "required condition is false: _outputFormat.channelCount == buffer.format.channelCount"
I don't know if this is due to the way I have declared the AVAudioEngine or the AVAudioPlayerNode, whether there is something wrong with the buffer I am generating, or whether I am attaching the nodes incorrectly (or something else!). I have a feeling that it is something to do with how I am creating the new buffer. I am trying to make a stereo buffer from two separate 'mono' arrays, and perhaps its format is not correct.
I have declared audioEngine: AVAudioEngine! and playerNode: AVAudioPlayerNode! globally:
var audioEngine: AVAudioEngine!
var playerNode: AVAudioPlayerNode!
I then load a mono source audio file that my app is going to process (the data out of this file will not be played, it will be loaded into an array, processed and then loaded into a new buffer):
// Read audio file
let audioFileFormat = audioFile.processingFormat
let frameCount = UInt32(audioFile.length)
let audioBuffer = AVAudioPCMBuffer(pcmFormat: audioFileFormat, frameCapacity: frameCount)!

// Read audio data into buffer
do {
    try audioFile.read(into: audioBuffer)
} catch let error {
    print(error.localizedDescription)
}

// Convert buffer to array of floats
let input: [Float] = Array(UnsafeBufferPointer(start: audioBuffer.floatChannelData![0], count: Int(audioBuffer.frameLength)))
The array is then sent twice to a convolution function, which returns a new array each time. This is because the mono source file needs to become a stereo audio buffer:
maxSignalLength = input.count + 256
let leftAudioArray: [Float] = convolve(inputAudio: input, impulse: normalisedLeftImpulse)
let rightAudioArray: [Float] = convolve(inputAudio: input, impulse: normalisedRightImpulse)
The maxSignalLength variable is currently the length of the input signal + the length of the impulse response (normalisedImpulseResponse) that is being convolved with, which at the moment is 256. This will become an appropriate variable at some point.
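The convolve(inputAudio:impulse:) function itself isn't shown in the question. For reference only, a naive direct-form convolution producing the full inputAudio.count + impulse.count - 1 output (which is what the input.count + 256 sizing above accounts for) could look like the sketch below; a real implementation would normally use Accelerate's vDSP for speed:
// Reference sketch of direct convolution; not the asker's implementation.
func convolve(inputAudio: [Float], impulse: [Float]) -> [Float] {
    var output = [Float](repeating: 0, count: inputAudio.count + impulse.count - 1)
    for i in 0 ..< inputAudio.count {
        for j in 0 ..< impulse.count {
            output[i + j] += inputAudio[i] * impulse[j]
        }
    }
    return output
}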
I then declare the new buffer and its format and load it. I have a feeling that the mistake is somewhere around here, as this will be the buffer that is played:
let bufferFormat = AVAudioFormat(commonFormat: .pcmFormatFloat32, sampleRate: hrtfSampleRate, channels: 2, interleaved: false)!
let outputBuffer = AVAudioPCMBuffer(pcmFormat: bufferFormat, frameCapacity: AVAudioFrameCount(maxSignalLength))!
Notice that I am not creating an interleaved buffer. I load the stereo audio data into the buffer as follows (which I think may also be wrong):
for ch in 0 ..< 2 {
    for i in 0 ..< maxSignalLength {
        var val: Float!
        if ch == 0 { // Left
            val = leftAudioArray[i]
        } else if ch == 1 { // Right
            val = rightAudioArray[i]
        }
        // Limit to the range -1...1
        if val > 1 {
            val = 1
        }
        if val < -1 {
            val = -1
        }
        outputBuffer.floatChannelData![ch][i] = val
    }
}
The audio is also limited to values between -1 and 1.
Then I finally come to (attempting to) schedule the buffer on the player node, attach the node to the audio engine, start the engine and then play the node.
let frameCapacity = AVAudioFramePosition(outputBuffer.frameCapacity)
let frameLength = outputBuffer.frameLength

playerNode.scheduleBuffer(outputBuffer, at: nil, options: AVAudioPlayerNodeBufferOptions.interrupts, completionHandler: nil)
playerNode.prepare(withFrameCount: frameLength)
let time = AVAudioTime(sampleTime: frameCapacity, atRate: hrtfSampleRate)

audioEngine.attach(playerNode)
audioEngine.connect(playerNode, to: audioEngine.mainMixerNode, format: outputBuffer.format)
audioEngine.prepare()

do {
    try audioEngine.start()
} catch let error {
    print(error.localizedDescription)
}

playerNode.play(at: time)
The error that I get at runtime is:
AVAEInternal.h:76 required condition is false: [AVAudioPlayerNode.mm:712:ScheduleBuffer: (_outputFormat.channelCount == buffer.format.channelCount)]
It doesn't show the line that this error occurs on. I have been stuck on this for a while and have tried lots of different things, but from what I could find there doesn't seem to be much clear information about playing audio from memory rather than from files with AVAudioEngine. Any help would be greatly appreciated.
Thanks!
Edit #1:
Better title
Edit #2:
UPDATE - I have found out why I was getting the error. It seemed to be caused by setting up the playerNode before attaching it to the audioEngine. Swapping the order stopped the program from crashing and throwing the error:
let frameCapacity = AVAudioFramePosition(outputBuffer.frameCapacity)
let frameLength = outputBuffer.frameLength

audioEngine.attach(playerNode)
audioEngine.connect(playerNode, to: audioEngine.mainMixerNode, format: outputBuffer.format)
audioEngine.prepare()

playerNode.scheduleBuffer(outputBuffer, at: nil, options: AVAudioPlayerNodeBufferOptions.interrupts, completionHandler: nil)
playerNode.prepare(withFrameCount: frameLength)
let time = AVAudioTime(sampleTime: frameCapacity, atRate: hrtfSampleRate)

do {
    try audioEngine.start()
} catch let error {
    print(error.localizedDescription)
}

playerNode.play(at: time)
However, I don't have any sound. After creating an array of floats from the outputBuffer with the same method used for the input signal, and taking a look at its contents with a breakpoint, it seems to be empty, so I must also be storing the data into the outputBuffer incorrectly.
You might be creating and filling your buffer incorrectly. Try doing it thus:
let fileURL = Bundle.main.url(forResource: "my_file", withExtension: "aiff")!
let file = try! AVAudioFile(forReading: fileURL)
let buffer = AVAudioPCMBuffer(pcmFormat: file.processingFormat, frameCapacity: UInt32(file.length))!
try! file.read(into: buffer)
I have fixed the issue!
I tried a lot of solutions and ended up completely rewriting the audio engine section of my app. I now have the AVAudioEngine and AVAudioPlayerNode declared within the ViewController class as follows:
class ViewController: UIViewController {
    var audioEngine: AVAudioEngine = AVAudioEngine()
    var playerNode: AVAudioPlayerNode = AVAudioPlayerNode()
    ...
I am still unclear whether it is better to declare these globally or as class variables in iOS; however, I can confirm that my application is playing audio with them declared within the ViewController class. I do know that they shouldn't be declared inside a function, as they will disappear and stop playing when the function goes out of scope.
However, I still was not getting any audio output until I set the AVAudioPCMBuffer's frameLength to its frameCapacity.
I could find very little information online about creating a new AVAudioPCMBuffer from an array of floats, but this seems to be the missing step needed to make my outputBuffer playable. Before I set it, frameLength was 0 by default.
Setting frameLength isn't required when the AVAudioPCMBuffer is created, but it is important: my buffer wasn't playable until I set it manually, after creating the buffer instance:
let bufferFormat = AVAudioFormat(commonFormat: .pcmFormatFloat32, sampleRate: hrtfSampleRate, channels: 2, interleaved: false)!
let frameCapacity = UInt32(audioFile.length)

guard let outputBuffer = AVAudioPCMBuffer(pcmFormat: bufferFormat, frameCapacity: frameCapacity) else {
    fatalError("Could not create output buffer.")
}

outputBuffer.frameLength = frameCapacity // Important!
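For completeness, here is a sketch (not part of the original answer) that puts the pieces together: create the stereo buffer, set frameLength, and copy the two float arrays from the question into it, clamping to ±1 as the original loop intended. leftAudioArray, rightAudioArray and hrtfSampleRate are the names used earlier in the question:
let format = AVAudioFormat(commonFormat: .pcmFormatFloat32, sampleRate: hrtfSampleRate, channels: 2, interleaved: false)!
let frameCount = AVAudioFrameCount(min(leftAudioArray.count, rightAudioArray.count))
let stereoBuffer = AVAudioPCMBuffer(pcmFormat: format, frameCapacity: frameCount)!
stereoBuffer.frameLength = frameCount // without this the buffer reads as empty

let left = stereoBuffer.floatChannelData![0]
let right = stereoBuffer.floatChannelData![1]
for i in 0 ..< Int(frameCount) {
    left[i] = max(-1, min(1, leftAudioArray[i]))
    right[i] = max(-1, min(1, rightAudioArray[i]))
}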
This took a long time to find out, hopefully this will help someone else in the future.

how to port this bit of code from HOWL vocoder synth from AudioKit 2 to AudioKit 4.x?

I'm trying to port the HOWL vocoder synth from AudioKit 2 to the latest version.
https://github.com/dclelland/HOWL
I'm starting with the Vocoder:
https://github.com/dclelland/HOWL/blob/master/HOWL/Models/Audio/Vocoder.swift
I'm not sure how this next bit of code works.
Is the reduce() applying the resonant filter to the audio input in a consecutive manner? Is it doing the equivalent of AKResonantFilter(AKResonantFilter(AKResonantFilter(mutedInput)))?
Or is something else going on?
let mutedAudioInput = AKAudioInput() * AKPortamento(input: inputAmplitude, halfTime: 0.001.ak)
let mutedInput = (input + mutedAudioInput) * AKPortamento(input: amplitude, halfTime: 0.001.ak)

let filter = zip(frequencies, bandwidths).reduce(mutedInput) { input, parameters in
    let (frequency, bandwidth) = parameters
    return AKResonantFilter(
        input: input,
        centerFrequency: AKPortamento(input: frequency, halfTime: 0.001.ak),
        bandwidth: AKPortamento(input: bandwidth, halfTime: 0.001.ak)
    )
}
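For reference: reduce is a left fold, so each iteration wraps the value accumulated so far in a new AKResonantFilter, which is exactly the nested, in-series chain AKResonantFilter(AKResonantFilter(AKResonantFilter(mutedInput))). A tiny stand-alone illustration of that folding behaviour in plain Swift (no AudioKit involved):
// reduce folds left-to-right: each step wraps the accumulator, like chaining one filter into the next.
let nested = ["f0", "f1", "f2"].reduce("input") { accumulated, stage in
    "Filter(\(stage), input: \(accumulated))"
}
print(nested)
// Filter(f2, input: Filter(f1, input: Filter(f0, input: input)))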
Here is my attempt at porting the vocoder filters:
import AudioKitPlaygrounds
import AudioKit
import AudioKitUI

let mixer = AKMixer()
let sawtooth = AKTable(.sawtooth)
let sawtoothLFO = AKOscillator(waveform: sawtooth, frequency: 130.81, amplitude: 1, detuningOffset: 0.0, detuningMultiplier: 0.0)

let frequencyScale = 1.0
let topFrequencies = zip(voice.æ.formants, voice.α.formants).map { topLeftFrequency, topRightFrequency in
    return 0.5 * (topRightFrequency - topLeftFrequency) + topLeftFrequency
}
let bottomFrequencies = zip(voice.i.formants, voice.u.formants).map { bottomLeftFrequency, bottomRightFrequency in
    return 0.5 * (bottomRightFrequency - bottomLeftFrequency) + bottomLeftFrequency
}
let frequencies = zip(topFrequencies, bottomFrequencies).map { topFrequency, bottomFrequency in
    return (0.5 * (bottomFrequency - topFrequency) + topFrequency) * frequencyScale
}

let bandwidthScale = 1.0
let bandwidths = frequencies.map { frequency in
    return (frequency * 0.02 + 50.0) * bandwidthScale
}

let filteredLFO = AKResonantFilter(sawtoothLFO)
let filter = zip(frequencies, bandwidths).reduce(filteredLFO) { input, parameters in
    let (frequency, bandwidth) = parameters
    return AKResonantFilter(
        input,
        frequency: frequency,
        bandwidth: bandwidth
    )
}

[filter, sawtoothLFO] >>> mixer

filter.start()
sawtoothLFO.play()
I am getting some sound, but it isn't quite right. I am not sure if I am taking the right approach.
In particular, my question is: is this the right approach to rewriting the bit of code highlighted above?
let filteredLFO = AKResonantFilter(sawtoothLFO)
let filter = zip(frequencies, bandwidths).reduce(filteredLFO) { input, parameters in
    let (frequency, bandwidth) = parameters
    return AKResonantFilter(
        input,
        frequency: frequency,
        bandwidth: bandwidth
    )
}
Is there a more preferred way to do this whole thing, using AKOperation generators? Should I be using AKFormantFilter? I've experimented with AKFormantFilter and AKVocalTract, but have not been able to get the audio results I wanted. The HOWL app sounds pretty much exactly like what I'm trying to do, which is why I started porting the code. (It's for a "talking" robot game.)

iOS AudioKit AKAmplitudeTracker

I'm trying to get something like this playground working on iOS:
http://audiokit.io/playgrounds/Analysis/Tracking%20Amplitude/
This is my view controller, where I use the mandolin physical model to create notes and then run an FFT tap and an amplitude tracker. But I get no values from them. You can see the output below:
var fft: AKFFTTap!
var amplitudeTracker: AKAmplitudeTracker!

override func viewDidLoad() {
    super.viewDidLoad()

    let mandolin = AKMandolin()
    mandolin.detune = 1
    mandolin.bodySize = 1
    let pluckPosition = 0.2
    let scale: [MIDINoteNumber] = [72, 74, 76, 77, 79, 81, 83, 84]

    let delay = AKDelay(mandolin)
    let mix = AKMixer()
    mix.connect(delay)
    let reverb = AKReverb(mix)

    amplitudeTracker = AKAmplitudeTracker(mix)
    fft = AKFFTTap(mix)

    AudioKit.output = reverb
    AudioKit.start()

    for note in scale {
        let note1: MIDINoteNumber = note
        let octave1: MIDINoteNumber = 4
        let course1 = 2
        let count = 25

        mandolin.fret(noteNumber: note1 + octave1, course: course1 - 1)
        mandolin.pluck(course: course1 - 1, position: pluckPosition, velocity: 127)
        print("playing note")

        let fftData = self.fft.fftData
        let lowMax = fftData[0 ... (count / 2) - 1].max() ?? 0
        let hiMax = fftData[count / 2 ... count - 1].max() ?? 0
        let hiMin = fftData[count / 2 ... count - 1].min() ?? 0

        let amplitude = Float(self.amplitudeTracker.amplitude * 65)
        print("amplitude \(amplitude)")
        print("lowMax \(lowMax)")
        print("hiMax \(hiMax)")
        print("hiMin \(hiMin)")

        sleep(1)
    }
}
This is the output I get when I run it:
2017-09-26 12:43:27.724706-0700 AK[9467:1161171] 957: AUParameterSet 2 (1/8): err -10867
2017-09-26 12:43:28.177699-0700 AK[9467:1161171] 957: AUParameterSet 2 (1/8): err -10867
playing note
amplitude 0.0
lowMax 0.0
hiMax 0.0
hiMin 0.0
playing note
amplitude 0.0
lowMax 0.0
hiMax 0.0
hiMin 0.0
...
The main problem here is that the tracker node is not part of the signal chain. AudioKit (and Apple's underlying AVAudioEngine) works on a pull model: audio will not be pulled through a node unless it is requested by a downstream node. This basically means that everything upstream of the AudioKit.output node will have bytes pulled through it.
Here, however, the reverb is made the output, so the tracker itself doesn't get any data coming through it. Changing it to AudioKit.output = amplitudeTracker will get data flowing through the node.
The amplitudeTracker acts as a passthrough, so the audio comes through as well. If you did not want the audio, you could send the output of the tracker through a booster that lowers the volume to zero.
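A minimal sketch of that change, using the names from the question (whether the tracker wraps mix or reverb is a design choice; wrapping the reverb keeps the reverb audible):
// Put the tracker at the end of the chain so audio is pulled through it.
amplitudeTracker = AKAmplitudeTracker(reverb)
fft = AKFFTTap(mix)
AudioKit.output = amplitudeTracker // previously: AudioKit.output = reverb
AudioKit.start()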
I was getting this -10867 error while trying to reinitialize an AKSequencer variable that had a bunch of tracks/samplers/etc.
I stored them in arrays, called the following before reinitializing, and the -10867 errors went away:
private var samplers = [AKMIDISampler]()
private var tracks = [AKMusicTrack]()
private var mixer = AKMixer()
...
public func cleanSequencer() {
    for track in tracks {
        track.clear()
    }

    for sampler in samplers {
        sampler.disconnectOutput()
        sampler.destroyEndpoint()
    }

    mixer.detach()
}
Hope this helps!
------- UPDATE: 01 -------
This produced some unexpected effects, mainly that no sound was played after using this method.
But now I am curious whether anyone knows why the -10867 errors went away, and why the sound did as well?

Upload large file via URLSession

So I have some code I've been using to upload files in my app, along the lines of this:
var mutableURLRequest = URLRequest(url: url)
var uploadData = try! Data(contentsOf: dataUrl)
session.uploadTask(with: mutableURLRequest, from: uploadData).resume()
There's a little more to it than that, but those are the relevant parts. However, I've noticed that for some large video files Data(contentsOf: dataUrl) fails since the file is too big to load into memory. I want to restructure this so that I'm able to stream the file piece by piece to the server without ever having to load the whole thing into memory.
I already have this figured out on my server; the only piece I haven't figured out is how to get a chunkSize piece of the data at a URL without putting it all into a Data object. I essentially want this construct:
let chunkSize = 1024 * 1024
let offset = 0
let chunk = //get data from dataUrl of size chunkSize offset by offset
//Upload each chunk and increment offset
NSInputStream (InputStream in Swift 3+) seemed promising for this, but I wasn't able to figure out the minimum setup needed to pull bytes from a file on disk in this fashion. What code can I use above to fill in the let chunk = line to do such a task?
I have a working solution. It might need a little tweaking, but it seems to work for the big files I've tried:
public func getNextChunk() -> Data? {
    if _inputStream == nil {
        _inputStream = InputStream(url: _url)
        _inputStream?.open()
    }

    var buffer = [UInt8](repeating: 0, count: CHUNK_SIZE)
    let bytesRead = _inputStream?.read(&buffer, maxLength: CHUNK_SIZE) ?? 0
    guard bytesRead > 0 else {
        return nil // 0 = end of stream, negative = read error
    }

    // Only return the bytes actually read; the last chunk is usually smaller than CHUNK_SIZE.
    return Data(bytes: buffer, count: bytesRead)
}
I also call _inputStream?.close() on deinit of my class that manages the chunking of a file on disk.
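A rough usage sketch (not part of the original answer) that drives getNextChunk() from an upload loop; it assumes it lives in the same class, and makeChunkRequest(offset:) is a hypothetical helper that builds whatever URLRequest your server expects for each chunk:
// Hypothetical driver: upload one chunk at a time until getNextChunk() returns nil.
func uploadNextChunk(offset: Int = 0) {
    guard let chunk = getNextChunk() else {
        print("Upload finished")
        return
    }
    let request = makeChunkRequest(offset: offset) // hypothetical helper
    URLSession.shared.uploadTask(with: request, from: chunk) { _, _, error in
        if let error = error {
            print("Chunk upload failed: \(error)")
            return
        }
        self.uploadNextChunk(offset: offset + chunk.count)
    }.resume()
}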
