Using AudioKit's AVAudioPCMBuffer normalize function to normalize multiple audio files - audiokit

I've got an array of audio files that I want to normalize so they all have similar perceived loudness. For testing purposes, I decided to adapt the AVAudioPCMBuffer.normalize method from AudioKit to suit my purposes. See here for implementation: https://github.com/AudioKit/AudioKit/blob/main/Sources/AudioKit/Audio%20Files/AVAudioPCMBuffer%2BProcessing.swift
I convert each file into an AVAudioPCMBuffer, then perform a reduce on that array of buffers to find the highest peak across all of them. I then created a new version of normalize, normalize(with peakAmplitude: Float) -> AVAudioPCMBuffer, that takes that peak amplitude, calculates a gainFactor, and iterates through the floatData for each channel, multiplying each sample by the gainFactor. Finally, I call my new flavor of normalize with the peak.amplitude that I get from the reduce operation over all the audio buffers.
This produces useful results, sometimes.
Here's the actual code in question:
extension AVAudioPCMBuffer {
    public func normalize(with peakAmplitude: Float) -> AVAudioPCMBuffer {
        guard let floatData = floatChannelData else { return self }
        let gainFactor: Float = 1 / peakAmplitude
        let length: AVAudioFrameCount = frameLength
        let channelCount = Int(format.channelCount)
        // i is the index in the buffer
        for i in 0 ..< Int(length) {
            // n is the channel
            for n in 0 ..< channelCount {
                floatData[n][i] *= gainFactor
            }
        }
        self.frameLength = length
        return self
    }
}
extension Array where Element == AVAudioPCMBuffer {
    public func normalized() -> [AVAudioPCMBuffer] {
        var minPeak = AVAudioPCMBuffer.Peak()
        minPeak.amplitude = AVAudioPCMBuffer.Peak.min
        let maxPeakForAllBuffers: AVAudioPCMBuffer.Peak = reduce(minPeak) { result, buffer in
            guard
                let currentBufferPeak = buffer.peak(),
                currentBufferPeak.amplitude > result.amplitude
            else {
                return result
            }
            return currentBufferPeak
        }
        return map { $0.normalize(with: maxPeakForAllBuffers.amplitude) }
    }
}
Three questions:
Is my approach reasonable for multiple files?
This appears to use "peak normalization" rather than RMS or EBU R128 normalization. Is that why, when I give it a batch of 3 audio files, 2 of them are correctly made louder but the third is also made louder, even though ffmpeg-normalize on the same batch makes that file significantly quieter?
Any other suggestions on ways to alter the floatData across multiple AVAudioPCMBuffers in order to give them similar perceived loudness?
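Peak normalization only matches the maximum sample values; to match perceived loudness you need to match an energy measure instead. Below is a minimal RMS-based sketch (the helpers rmsLevel and normalizeRMS are hypothetical, not AudioKit API; a faithful loudness match would use EBU R128 with K-weighting and gating, which is what ffmpeg-normalize does by default):

```swift
import AVFoundation

extension AVAudioPCMBuffer {
    /// Root-mean-square level across all channels (hypothetical helper).
    var rmsLevel: Float {
        guard let data = floatChannelData else { return 0 }
        let channelCount = Int(format.channelCount)
        let n = Int(frameLength)
        guard n > 0, channelCount > 0 else { return 0 }
        var sumOfSquares: Float = 0
        for ch in 0 ..< channelCount {
            for i in 0 ..< n {
                let s = data[ch][i]
                sumOfSquares += s * s
            }
        }
        return sqrt(sumOfSquares / Float(n * channelCount))
    }

    /// Scales samples in place so this buffer's RMS matches targetRMS
    /// (hypothetical helper; does not guard against clipping).
    func normalizeRMS(to targetRMS: Float) {
        let rms = rmsLevel
        guard rms > 0, let data = floatChannelData else { return }
        let gain = targetRMS / rms
        for ch in 0 ..< Int(format.channelCount) {
            for i in 0 ..< Int(frameLength) {
                data[ch][i] *= gain
            }
        }
    }
}
```

With this, normalizing a batch means picking one target RMS (say, the quietest buffer's) and calling normalizeRMS(to:) on each buffer; a file that is already louder than the target then gets turned down, which matches the ffmpeg-normalize behavior you observed.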

Related

How do I remove noise from an audio signal? And what threshold/threshold range should I use?

I have loaded an audio file and created an input and an output buffer.
But when I follow Apple's post, I get an output signal with distorted sound.
private func extractSignal(input: AVAudioPCMBuffer, output: AVAudioPCMBuffer) {
    let count = 256
    let forward = vDSP.DCT(previous: nil, count: count, transformType: .II)!
    let inverse = vDSP.DCT(previous: nil, count: count, transformType: .III)!
    // Iterates over the signal.
    input.iterate(signalCount: 32000) { step, signal in
        var series = forward.transform(signal)
        series = vDSP.threshold(series, to: 0.0003, with: .zeroFill) // What should this threshold be?
        var inversed = inverse.transform(series)
        let divisor: Float = Float(count / 2)
        inversed = vDSP.divide(inversed, divisor)
        // Code: write inversed to output buffer.
        output.frameLength = AVAudioFrameCount(step * signal.count + signal.count)
    }
}

AudioKit 5.2 Migration: AKSampler() to Sampler() latency when loading audioFiles

I am migrating my AudioKit code to AudioKit 5.2 and I am having problems with the AK5 Sampler() that I cannot solve by myself. Previously, in AK, it was AKSampler(), which had a function called:
loadAKAudioFile(from: AKSampleDescriptor, file: AKAudioFile)
Since Sampler() does not have that function, I took a piece of code from the source Sampler.swift that does the work, but it is extremely slow; it takes ages to load and map the 12 different small samples that I map onto note numbers 60 to 71:
internal func loadAudioFile(from sampleDescriptor: SampleDescriptor, file: AVAudioFile) {
    guard let floatChannelData = file.toFloatChannelData() else { return }
    let sampleRate = Float(file.fileFormat.sampleRate)
    let sampleCount = Int32(file.length)
    let channelCount = Int32(file.fileFormat.channelCount)
    var flattened = Array(floatChannelData.joined())
    flattened.withUnsafeMutableBufferPointer { data in
        var descriptor = SampleDataDescriptor(sampleDescriptor: sampleDescriptor,
                                              sampleRate: sampleRate,
                                              isInterleaved: false,
                                              channelCount: channelCount,
                                              sampleCount: sampleCount,
                                              data: data.baseAddress)
        akSamplerLoadData(au.dsp, &descriptor)
    }
}
AKSampler() did not have any noticeable latency when doing this work, but with Sampler() it takes more than a second to load a sample. Apparently AKSampler() worked asynchronously. I am new to Swift and audio, so I have no idea how to make Sampler() work asynchronously.
It would be great to get some bits of code that could help. Thanks!
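One common pattern (a generic sketch, not AudioKit API) is to run the loading work on a background queue and signal completion back on the main queue, so the UI is never blocked while the 12 samples load:

```swift
import Foundation

// Hypothetical helper: run expensive sample loading off the main thread,
// then report completion on the main queue.
func loadSamplesInBackground(_ work: @escaping () -> Void,
                             completion: @escaping () -> Void) {
    DispatchQueue.global(qos: .userInitiated).async {
        work()                       // e.g. call loadAudioFile(from:file:) for each sample
        DispatchQueue.main.async {
            completion()             // e.g. enable the keyboard once mapping is done
        }
    }
}
```

Note that the sampler's DSP should not be played until the completion fires, since the samples are not mapped yet while the background work runs.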

Thread 1: Fatal error: Index out of range when index is less than array count

I am getting the error Thread 1: Fatal error: Index out of range on
func ReaderConverterCallback(_ converter: AudioConverterRef,
                             _ packetCount: UnsafeMutablePointer<UInt32>,
                             _ ioData: UnsafeMutablePointer<AudioBufferList>,
                             _ outPacketDescriptions: UnsafeMutablePointer<UnsafeMutablePointer<AudioStreamPacketDescription>?>?,
                             _ context: UnsafeMutableRawPointer?) -> OSStatus {
    let reader = Unmanaged<Reader>.fromOpaque(context!).takeUnretainedValue()

    //
    // Make sure we have a valid source format so we know the data format of the parser's audio packets
    //
    guard let sourceFormat = reader.parser.dataFormat else {
        return ReaderMissingSourceFormatError
    }

    //
    // Check if we've reached the end of the packets. We have two scenarios:
    //   1. We've reached the end of the packet data and the file has been completely parsed
    //   2. We've reached the end of the data we currently have downloaded, but not the file
    //
    let packetIndex = Int(reader.currentPacket)
    let packets = reader.parser.packets
    let isEndOfData = packetIndex >= packets.count - 1
    if isEndOfData {
        if reader.parser.isParsingComplete {
            packetCount.pointee = 0
            return ReaderReachedEndOfDataError
        } else {
            return ReaderNotEnoughDataError
        }
    }

    //
    // Copy data over (note we're only processing a single packet of data at a time)
    //
    let packet = packets[packetIndex] // <--- Thread 1: Fatal error: Index out of range
    var data = packet.0
    let dataCount = data.count
    ioData.pointee.mNumberBuffers = 1
    ioData.pointee.mBuffers.mData = UnsafeMutableRawPointer.allocate(byteCount: dataCount, alignment: 0)
    _ = data.withUnsafeMutableBytes { (bytes: UnsafeMutablePointer<UInt8>) in
        memcpy((ioData.pointee.mBuffers.mData?.assumingMemoryBound(to: UInt8.self))!, bytes, dataCount)
    }
    ioData.pointee.mBuffers.mDataByteSize = UInt32(dataCount)

    //
    // Handle packet descriptions for compressed formats (MP3, AAC, etc)
    //
    let sourceFormatDescription = sourceFormat.streamDescription.pointee
    if sourceFormatDescription.mFormatID != kAudioFormatLinearPCM {
        if outPacketDescriptions?.pointee == nil {
            outPacketDescriptions?.pointee = UnsafeMutablePointer<AudioStreamPacketDescription>.allocate(capacity: 1)
        }
        outPacketDescriptions?.pointee?.pointee.mDataByteSize = UInt32(dataCount)
        outPacketDescriptions?.pointee?.pointee.mStartOffset = 0
        outPacketDescriptions?.pointee?.pointee.mVariableFramesInPacket = 0
    }
    packetCount.pointee = 1
    reader.currentPacket = reader.currentPacket + 1
    return noErr
}
even though packetIndex is less than packets.count.
Note: please compare both questions before marking this as a duplicate. The suggested duplicate does not cover the case where the index is less than the array count.
I am using the https://github.com/syedhali/AudioStreamer/ library for playing audio from a URL.
It looks like a multi-threading problem. According to the printed logs the index seems fine, but another thread may be mutating packets, which causes the crash. Please consider locking data-related operations that are shared between threads.
Additional analysis: according to the following lines, packets may be shared between threads.
let reader = Unmanaged<Reader>.fromOpaque(context!).takeUnretainedValue()
//......
let packets = reader.parser.packets
Suggestion: check if somewhere the Unmanaged<Reader> change the parser.packets, and make a lock strategy.
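A minimal locking sketch (assuming you can modify the library's Parser class; the tuple element type here mirrors the packets seen in the callback) would funnel every read and write of the array through one lock:

```swift
import Foundation
import AudioToolbox

final class Parser {
    private let lock = NSLock()
    private var _packets: [(Data, AudioStreamPacketDescription?)] = []

    // All reads and writes go through the lock, so the converter callback
    // and the download/parse thread never race on the array.
    var packets: [(Data, AudioStreamPacketDescription?)] {
        lock.lock(); defer { lock.unlock() }
        return _packets   // returns a copy; safe to index after unlocking
    }

    func append(_ packet: (Data, AudioStreamPacketDescription?)) {
        lock.lock(); defer { lock.unlock() }
        _packets.append(packet)
    }
}
```

Because the getter returns a value-type copy of the array, the count the callback checks and the element it indexes come from the same snapshot, which removes the race.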

Is there a cryptographically secure alternative to arc4random on iOS?

I'd like to use a PRNG like arc4random_uniform(); however, Wikipedia seems to think that RC4 is insecure. I don't have the wherewithal to confirm this myself, but security is a requirement for my use case.
arc4random_uniform is documented as a "cryptographic pseudo-random number generator," so it should be fine for this purpose. Do not confuse the security problems of RC4 with arc4random. See Zaph's answer for more details. (I've researched this before, and I remember arc4random being just as secure as other approaches, but I trust Zaph more than I trust my own memory.)
That said, if you are nervous, the tool you want to use is SecRandomCopyBytes (alternately you can read from /dev/random, which is exactly what SecRandomCopyBytes does by spec).
Getting a random value from SecRandomCopyBytes is harder than it should be, but not too hard. Here's how you do it in a highly generic way (Swift 3):
extension Integer {
    static func makeRandom() -> Self {
        var result: Self = 0
        withUnsafeMutablePointer(to: &result) { resultPtr in
            resultPtr.withMemoryRebound(to: UInt8.self, capacity: MemoryLayout<Self>.size) { bytePtr in
                SecRandomCopyBytes(nil, MemoryLayout<Self>.size, bytePtr)
            }
        }
        return result
    }
}
This works on any Integer. Basically we interpret a bunch of random bytes as an Integer. (BTW, this approach does not work nearly as well for floating point values. You can do it, but you'll discover that not all bit patterns are actually "numbers" in floating point. So it's a little more complicated.)
Now you want to get those values in a range without introducing bias. Just saying x % limit creates modulo bias. Don't do that. The correct approach is to do what arc4random_uniform does. It's open source, so you can just go look at it. Applying the same approach in Swift looks like:
extension Int {
    static func makeRandom(betweenZeroAnd limit: Int) -> Int {
        assert(limit > 0)
        // Convert our range from [0, Int.max) to [Int.max % limit, Int.max)
        // This way, when we later % limit, there will be no bias
        let minValue = Int.max % limit
        var value = 0
        // Keep guessing until we're in the range.
        // In theory this could loop forever. It won't. A couple of times at worst
        // (mostly because we'll pick some negatives that we'll throw away)
        repeat {
            value = makeRandom()
        } while value < minValue
        return value % limit
    }
}
We can't build this on Integer because there's no .max property on Integer.
In Swift 4 this is all cleaned up with FixedWidthInteger, and we can make this more generic:
extension FixedWidthInteger {
    static func makeRandom() -> Self {
        var result: Self = 0
        withUnsafeMutablePointer(to: &result) { resultPtr in
            resultPtr.withMemoryRebound(to: UInt8.self, capacity: MemoryLayout<Self>.size) { bytePtr in
                SecRandomCopyBytes(nil, MemoryLayout<Self>.size, bytePtr)
            }
        }
        return result
    }

    static func makeRandom(betweenZeroAnd limit: Self) -> Self {
        assert(limit > 0)
        // Convert our range from [0, Self.max) to [Self.max % limit, Self.max)
        // This way, when we later % limit, there will be no bias
        let minValue = Self.max % limit
        var value: Self = 0
        // Keep guessing until we're in the range.
        // In theory this could loop forever. It won't. A couple of times at worst
        // (mostly because we'll pick some negatives that we'll throw away)
        repeat {
            value = makeRandom()
        } while value < minValue
        return value % limit
    }
}
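Since Swift 4.2, after this answer was written, the standard library handles all of this: the default SystemRandomNumberGenerator is documented as cryptographically secure on Apple platforms, and the range-based random(in:) API does the bias-free reduction for you:

```swift
// Uniform over 0...5, no modulo bias, backed by the system CSPRNG.
let roll = Int.random(in: 0 ..< 6)

// Works for any FixedWidthInteger type and range.
let byte = UInt8.random(in: UInt8.min ... UInt8.max)
```

So on modern Swift, Int.random(in:) is usually the simplest secure choice, with SecRandomCopyBytes remaining useful when you need raw random bytes.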

what samples will buffer contain when I use installTap(onBus for multichannel audio?

If the channel count is > 1, will it contain samples from the left microphone only, or from both the left and the right microphone?
When I use the iPhone simulator, the format is pcmFormatFloat32, channelCount = 2, sampleRate = 44100.0, not interleaved.
I use this code
let bus = 0
inputNode.installTap(onBus: bus, bufferSize: myTapOnBusBufferSize, format: theAudioFormat) {
    (buffer: AVAudioPCMBuffer!, time: AVAudioTime!) -> Void in
    self.onNewBuffer(buffer)
}

func onNewBuffer(_ inputBuffer: AVAudioPCMBuffer!) {
    var samplesAsDoubles: [Double] = []
    for i in 0 ..< Int(inputBuffer.frameLength) {
        let theSample = Double((inputBuffer.floatChannelData?.pointee[i])!)
        samplesAsDoubles.append(theSample)
    }
}
print("number of input busses = \(inputNode.numberOfInputs)")
it prints
number of input buses = 1
For each sample in the samplesAsDoubles array that I build inside the block: which channel is it from, and at what time was it recorded?
From the header comments for floatChannelData:
The returned pointer is to format.channelCount pointers to float. Each of these pointers
is to "frameLength" valid samples, which are spaced by "stride" samples.
If format.interleaved is false (as with the standard deinterleaved float format), then
the pointers will be to separate chunks of memory. "stride" is 1.
floatChannelData gives you a 2D float array. For a non-interleaved 2-channel buffer, you would access the individual samples like this:
let channelCount = Int(buffer.format.channelCount)
let frameCount = Int(buffer.frameLength)
if let channels = buffer.floatChannelData { // channels is a 2D float array
    for channelIndex in 0 ..< channelCount {
        let channel = channels[channelIndex] // 1D float array
        print(channelIndex == 0 ? "left" : "right")
        for frameIndex in 0 ..< frameCount {
            let sample = channel[frameIndex]
            print(" \(sample)")
        }
    }
}
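For an interleaved buffer, per the header comment quoted above, the channel pointers share one chunk of memory and consecutive samples of a channel are spaced by stride. A sketch that handles both layouts by always stepping with the buffer's stride property (the helper name is made up for illustration):

```swift
import AVFoundation

/// Copies one channel's samples out of a PCM buffer, working for both
/// interleaved (stride == channelCount) and deinterleaved (stride == 1) layouts.
func samples(of buffer: AVAudioPCMBuffer, channel: Int) -> [Float] {
    guard let channels = buffer.floatChannelData,
          channel < Int(buffer.format.channelCount) else { return [] }
    let stride = buffer.stride
    let frameCount = Int(buffer.frameLength)
    let channelPointer = channels[channel]
    var result: [Float] = []
    result.reserveCapacity(frameCount)
    for frame in 0 ..< frameCount {
        result.append(channelPointer[frame * stride])
    }
    return result
}
```

As for timing: sample i of a buffer delivered with an AVAudioTime of sampleTime t was captured at sample time t + i, i.e. (t + i) / sampleRate seconds on the input node's timeline.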
