I am trying to compute a Fast Fourier Transform (FFT). I am basing my calculation on the one in Surge, but I am having trouble getting correct results. When I take the FFT of a 1000 Hz tone, the output looks wrong. When I process the same tone in Python, I get something that looks far more correct. The Python code looks like:
import numpy as np
import scipy.io.wavfile
import numpy.fft
import matplotlib.pyplot as plt
FILENAME = 'beep.wav'
fs, data = scipy.io.wavfile.read(FILENAME)
data = data[:801]
spacing = 1 / float(fs)
freq = numpy.fft.rfft(data)
freq_power = np.abs(freq)
a = 1 / (2 * spacing)
b = (len(data) + 1) // 2
freq_axis = np.linspace(0, a, b)
plt.plot(freq_axis, freq_power)
plt.show()
The Swift code looks like:
import Accelerate
public func sqrt(x: [Float]) -> [Float] {
    var results = [Float](count: x.count, repeatedValue: 0.0)
    vvsqrtf(&results, x, [Int32(x.count)])
    return results
}
public func fft(input: [Float]) -> [Float] {
    var real = [Float](input)
    var imaginary = [Float](count: input.count, repeatedValue: 0.0)
    var splitComplex = DSPSplitComplex(realp: &real, imagp: &imaginary)
    let length = vDSP_Length(floor(log2(Float(input.count))))
    let radix = FFTRadix(kFFTRadix2)
    let weights = vDSP_create_fftsetup(length, radix)
    println(weights)
    vDSP_fft_zip(weights, &splitComplex, 1, 8, FFTDirection(FFT_FORWARD))
    var magnitudes = [Float](count: input.count, repeatedValue: 0.0)
    vDSP_zvmags(&splitComplex, 1, &magnitudes, 1, vDSP_Length(input.count))
    var normalizedMagnitudes = [Float](count: input.count, repeatedValue: 0.0)
    vDSP_vsmul(sqrt(magnitudes), 1, [2.0 / Float(input.count)], &normalizedMagnitudes, 1, vDSP_Length(input.count))
    vDSP_destroy_fftsetup(weights)
    return normalizedMagnitudes
}
To reiterate: the Swift code is the one giving unexpected results. What am I doing wrong?
It looks like you are using Swift Float arrays with the Accelerate framework, but you might instead need to allocate your vectors as UnsafeMutablePointer<Float>, since Accelerate is a C framework. Here is an example of how to do this.
public func sqrt(x: [Float]) -> [Float] {
    // convert swift array to C vector
    var temp = UnsafeMutablePointer<Float>.alloc(x.count)
    for (var i=0;i<x.count;i++) {
        temp[i] = x[i];
    }
    var count = UnsafeMutablePointer<Int32>.alloc(1)
    count[0] = Int32(x.count)
    vvsqrtf(temp, temp, count)
    // convert C vector to swift array
    var results = [Float](count: x.count, repeatedValue: 0.0)
    for (var i=0;i<x.count;i++) {
        results[i] = temp[i];
    }
    // Free memory
    count.dealloc(1)
    temp.dealloc(x.count)
    return results
}
It will work out better for performance to use the UnsafeMutablePointer<Float> types throughout your code for your vectors of data rather than converting back and forth in function calls as I did for this example. Also you should save your FFT setup and reuse that as well for better performance.
Since you're using the vDSP FFT, you might also like the vDSP_zvabs API, which calculates magnitudes directly from the complex FFT results; vDSP_vdbcon can then convert them to decibels if needed.
Finally be sure to read this link on data packing and scaling for the Accelerate framework FFT APIs.
https://developer.apple.com/library/mac/documentation/Performance/Conceptual/vDSP_Programming_Guide/UsingFourierTransforms/UsingFourierTransforms.html
To improve performance, the vDSP APIs do not output the most obvious scale values (since you will undoubtedly be scaling the data anyway somewhere else) and they pack in some extra data into a few of the FFT points.
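To make the scaling point concrete, here is a minimal sketch that uses a naive pure-Swift DFT rather than vDSP. It only shows why a factor of 2/N (the one used in the question) recovers the amplitude of a real sine and how a bin index maps to a frequency. The function name scaledMagnitudes, the 8000 Hz sample rate and the 64-sample length are assumptions chosen for illustration; the vDSP routines apply their own routine-specific scale factors, as described in the guide linked above.

```swift
import Foundation

/// Naive O(N^2) DFT magnitudes of a real signal, scaled by 2/N so that a
/// unit-amplitude sine centred on a bin shows a peak near 1.0.
/// For illustration only; this is not how vDSP computes or scales its output.
func scaledMagnitudes(_ signal: [Double]) -> [Double] {
    let n = signal.count
    // For real input only bins 0..<n/2 carry unique information.
    return (0..<(n / 2)).map { k -> Double in
        var re = 0.0
        var im = 0.0
        for (t, x) in signal.enumerated() {
            let angle = -2.0 * Double.pi * Double(k) * Double(t) / Double(n)
            re += x * cos(angle)
            im += x * sin(angle)
        }
        return 2.0 / Double(n) * (re * re + im * im).squareRoot()
    }
}

let sampleRate = 8000.0   // assumed sample rate in Hz
let n = 64                // assumed length (a power of two)
let toneHz = 1000.0
let signal = (0..<n).map { sin(2.0 * Double.pi * toneHz * Double($0) / sampleRate) }

let mags = scaledMagnitudes(signal)
let peakBin = mags.indices.max(by: { mags[$0] < mags[$1] })!
// Bin k corresponds to the frequency k * sampleRate / n.
let peakHz = Double(peakBin) * sampleRate / Double(n)
print("peak bin:", peakBin, "frequency:", peakHz, "magnitude:", mags[peakBin])
```

One further thing that may be worth checking in the question's code: vDSP_fft_zip is passed a literal 8 as its log2 length, while the setup was created with length. If those two values differ, only 256 points are transformed.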
Related
I've got an array of audio files that I want to normalize so they all have similar perceived loudness. For testing purposes, I decided to adapt the AVAudioPCMBuffer.normalize method from AudioKit to suit my purposes. See here for implementation: https://github.com/AudioKit/AudioKit/blob/main/Sources/AudioKit/Audio%20Files/AVAudioPCMBuffer%2BProcessing.swift
I am converting each file into an AVAudioPCMBuffer, and then performing a reduce on that array of buffers to get the highest peak across all of the buffers. Then I created a new version of normalize, normalize(with peakAmplitude: Float) -> AVAudioPCMBuffer, that takes that peak amplitude, calculates a gainFactor, and then iterates through the floatData for each channel and multiplies each sample by the gainFactor. I then call my new flavor of normalize with the peak.amplitude that I get from the reduce operation on all the audio buffers.
This produces useful results, sometimes.
Here's the actual code in question:
extension AVAudioPCMBuffer {
    public func normalize(with peakAmplitude: Float) -> AVAudioPCMBuffer {
        guard let floatData = floatChannelData else { return self }
        let gainFactor: Float = 1 / peakAmplitude
        let length: AVAudioFrameCount = frameLength
        let channelCount = Int(format.channelCount)
        // i is the index in the buffer
        for i in 0 ..< Int(length) {
            // n is the channel
            for n in 0 ..< channelCount {
                let sample = floatData[n][i] * gainFactor
                self.floatChannelData?[n][i] = sample
            }
        }
        self.frameLength = length
        return self
    }
}
extension Array where Element == AVAudioPCMBuffer {
    public func normalized() -> [AVAudioPCMBuffer] {
        var minPeak = AVAudioPCMBuffer.Peak()
        minPeak.amplitude = AVAudioPCMBuffer.Peak.min
        let maxPeakForAllBuffers: AVAudioPCMBuffer.Peak = reduce(minPeak) { result, buffer in
            guard
                let currentBufferPeak = buffer.peak(),
                currentBufferPeak.amplitude > result.amplitude
            else {
                return result
            }
            return currentBufferPeak
        }
        return map { $0.normalize(with: maxPeakForAllBuffers.amplitude) }
    }
}
Three questions:
Is my approach reasonable for multiple files?
This appears to be "peak normalization" rather than RMS or EBU R128 normalization. Is that why, when I give it a batch of 3 audio files, 2 of them are correctly made louder but the third is also made louder, even though ffmpeg-normalize on the same batch makes that file significantly quieter?
Any other suggestions on ways to alter the floatData across multiple AVAudioPCMBuffers so that they have similar perceived loudness?
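To make the distinction drawn in the second question concrete, here is a minimal sketch on plain [Float] arrays rather than AVAudioPCMBuffer. The helpers peak and rms and both test signals are made up for illustration and are not AudioKit API. EBU R128 loudness additionally involves frequency weighting and gating, which this does not attempt.

```swift
import Foundation

// Hypothetical helpers that work on raw samples; not part of AudioKit.
func peak(_ samples: [Float]) -> Float {
    return samples.map { abs($0) }.max() ?? 0
}

func rms(_ samples: [Float]) -> Float {
    guard !samples.isEmpty else { return 0 }
    let sumOfSquares = samples.reduce(Float(0)) { $0 + $1 * $1 }
    return (sumOfSquares / Float(samples.count)).squareRoot()
}

// A sustained tone at half scale: modest peak, fairly high average level.
let sustained: [Float] = (0..<1000).map { 0.5 * sin(Float($0) * 0.1) }

// Near-silence with a single full-scale click: high peak, tiny average level.
var click = [Float](repeating: 0.01, count: 1000)
click[500] = 1.0

// Peak normalization against the largest peak found in the whole batch.
let batchPeak = max(peak(sustained), peak(click))
let peakGain = 1 / batchPeak

print("shared peak gain:", peakGain)
print("RMS sustained:", rms(sustained), "RMS click:", rms(click))
```

Because the click already reaches full scale, the shared peak gain is 1 and neither signal changes, even though their RMS levels differ by roughly an order of magnitude. That is one plausible reason a peak-based pass and a loudness-based tool can disagree about which files to turn up or down.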
I'm trying to port the HOWL vocoder synth from AudioKit 2 to the latest version.
https://github.com/dclelland/HOWL
I'm starting with the Vocoder:
https://github.com/dclelland/HOWL/blob/master/HOWL/Models/Audio/Vocoder.swift
I'm not sure how this next bit of code works.
Is the reduce() applying the resonant filter to the audio input consecutively? Is it doing the equivalent of AKResonantFilter(AKResonantFilter(AKResonantFilter(mutedInput)))?
Or is something else going on?
let mutedAudioInput = AKAudioInput() * AKPortamento(input: inputAmplitude, halfTime: 0.001.ak)
let mutedInput = (input + mutedAudioInput) * AKPortamento(input: amplitude, halfTime: 0.001.ak)
let filter = zip(frequencies, bandwidths).reduce(mutedInput) { input, parameters in
    let (frequency, bandwidth) = parameters
    return AKResonantFilter(
        input: input,
        centerFrequency: AKPortamento(input: frequency, halfTime: 0.001.ak),
        bandwidth: AKPortamento(input: bandwidth, halfTime: 0.001.ak)
    )
}
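As a way of seeing what reduce does with a closure of this shape, the following sketch replaces the filter objects with strings so the nesting becomes visible. The frequency and bandwidth values are made up, and "Filter" is only a label, not an AudioKit type.

```swift
let frequencies = [700, 1220, 2600]   // made-up values
let bandwidths = [64, 74, 102]        // made-up values

// reduce folds from the left: each step wraps the result of the previous
// step, starting from the initial value.
let chain = zip(frequencies, bandwidths).reduce("mutedInput") { input, parameters in
    let (frequency, bandwidth) = parameters
    return "Filter(\(input), f: \(frequency), bw: \(bandwidth))"
}
print(chain)
// Filter(Filter(Filter(mutedInput, f: 700, bw: 64), f: 1220, bw: 74), f: 2600, bw: 102)
```

So with three (frequency, bandwidth) pairs the result has the same shape as three filters applied in series, with the first pair innermost.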
Here is my attempt at porting the vocoder filters:
import AudioKitPlaygrounds
import AudioKit
import AudioKitUI
let mixer = AKMixer()
let sawtooth = AKTable(.sawtooth)
let sawtoothLFO = AKOscillator(waveform: sawtooth, frequency: 130.81, amplitude: 1, detuningOffset: 0.0, detuningMultiplier: 0.0)
let frequencyScale = 1.0
let topFrequencies = zip(voice.æ.formants,voice.α.formants).map {topLeftFrequency,topRightFrequency in
    return 0.5 * (topRightFrequency - topLeftFrequency) + topLeftFrequency
}
let bottomFrequencies = zip(voice.i.formants,voice.u.formants).map {bottomLeftFrequency,bottomRightFrequency in
    return 0.5 * (bottomRightFrequency - bottomLeftFrequency) + bottomLeftFrequency
}
let frequencies = zip(topFrequencies, bottomFrequencies).map { topFrequency, bottomFrequency in
    return (0.5 * (bottomFrequency - topFrequency) + topFrequency) * frequencyScale
}
let bandwidthScale = 1.0
let bandwidths = frequencies.map { frequency in
    return (frequency * 0.02 + 50.0) * bandwidthScale
}
let filteredLFO = AKResonantFilter(sawtoothLFO)
let filter = zip(frequencies,bandwidths).reduce(filteredLFO) { input,parameters in
    let (frequency, bandwidth) = parameters
    return AKResonantFilter(
        input,
        frequency: frequency,
        bandwidth: bandwidth
    )
}
[filter, sawtoothLFO] >>> mixer
filter.start()
sawtoothLFO.play()
I am getting some sound but it isn't quite right. I am not sure if I am taking the right approach.
In particular, my question is: is this the right approach to rewriting the bit of code highlighted above?
let filteredLFO = AKResonantFilter(sawtoothLFO)
let filter = zip(frequencies,bandwidths).reduce(filteredLFO) { input,parameters in
    let (frequency, bandwidth) = parameters
    return AKResonantFilter(
        input,
        frequency: frequency,
        bandwidth: bandwidth
    )
}
Is there a preferred way to do this whole thing, using AKOperation generators? Should I be using AKFormantFilter? I've experimented with AKFormantFilter and AKVocalTract, but have not been able to get the audio results I wanted. The HOWL app sounds pretty much exactly like what I'm trying to do, which is why I started porting the code. (It's for a "talking" robot game.)
I am currently using map with a closure in Swift to take linear factors from an array and calculate a list of musical frequencies spanning one octave.
let tonic: Double = 261.626 // middle C
let factors = [ 1.0, 1.125, 1.25, 1.333, 1.5, 1.625, 1.875]
let frequencies = factors.map { $0 * tonic }
print(frequencies)
// [261.62599999999998, 294.32925, 327.03249999999997, 348.74745799999994, 392.43899999999996, 425.14224999999999, 490.54874999999993]
I want to do this by making the closure extract two integers from a string and divide them to form each factor. The string comes from an SCL tuning file and might look something like this:
// C D E F G A B
let ratios = [ "1/1", "9/8", "5/4", "4/3", "3/2", "27/16", "15/8"]
Can this be done?
SOLUTION
Thankfully, yes it can. In three Swift statements, tuning ratios that have been represented as fractions since before Ptolemy can be converted into precise frequencies. A slight modification to the accepted answer makes it possible to derive the list of frequencies. Here is the code:
import UIKit
class ViewController: UIViewController {
    // Diatonic scale
    let ratios = [ "1/1", "9/8", "5/4", "4/3", "3/2", "27/16", "15/8"]
    // Mohajira scale
    // let ratios = [ "21/20", "9/8", "6/5", "49/40", "4/3", "7/5", "3/2", "8/5", "49/30", "9/5", "11/6", "2/1"]
    override func viewDidLoad() {
        super.viewDidLoad()
        _ = Tuning(ratios: ratios)
    }
}
Tuning Class
import UIKit
class Tuning {
    let tonic = 261.626 // frequency of middle C (in Hertz)
    var ratios = [String]()
    init(ratios: [String]) {
        self.ratios = ratios
        let frequencies = ratios.map { s -> Double in
            let integers = s.characters.split(separator: "/").map(String.init).map({ Double($0) })
            return (integers[0]!/integers[1]!) * tonic
        }
        print("// \(frequencies)")
    }
}
And here is the list of frequencies in Hertz corresponding to notes of the diatonic scale
C D E F G A B
[261.626007, 294.329254, 327.032501, 348.834686, 392.439026, 441.493896, 490.548767]
It works for other scales with pitches not usually found on a black-and-white-note music keyboard
Mohajira scale created by Jacques Dudon
// D F G C'
let ratios = [ "21/20", "9/8", "6/5", "49/40", "4/3", "7/5", "3/2", "8/5", "49/30", "9/5", "11/6", "2/1"]
And here is a list of frequencies produced
// D F G C'
// [274.70729999999998, 294.32925, 313.95119999999997, 320.49185, 348.83466666666664, 366.27639999999997, 392.43899999999996, 418.60159999999996, 427.32246666666663, 470.92679999999996, 479.64766666666662, 523.25199999999995]
Disclaimer
Currently the closure only handles rational scales. To fully comply with the Scala SCL format, it must also distinguish between strings containing fractions and strings containing a decimal point, and interpret the latter as cents, i.e. as logarithmic rather than linear factors.
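One way the missing case could be handled is sketched below, in current Swift syntax rather than the Swift 3 syntax used above. It assumes the usual SCL convention that a pitch containing a period is a value in cents, and that anything else is a ratio "n/d" or a bare integer "n" meaning n/1. The function name linearFactor(fromSCL:) is made up for this example, and comments or trailing text on an SCL line are not handled.

```swift
import Foundation

/// Converts one SCL pitch token to a linear frequency factor.
/// Assumption: a token containing "." is in cents; otherwise it is a
/// ratio "n/d" or a bare integer "n" (treated as n/1).
func linearFactor(fromSCL token: String) -> Double? {
    let text = token.trimmingCharacters(in: .whitespaces)
    if text.contains(".") {
        // 1200 cents make one octave, so the factor is 2^(cents / 1200).
        guard let cents = Double(text) else { return nil }
        return pow(2.0, cents / 1200.0)
    }
    let parts = text.split(separator: "/", omittingEmptySubsequences: false)
    guard let first = parts.first, let numerator = Double(String(first)) else {
        return nil
    }
    if parts.count == 1 { return numerator }
    guard parts.count == 2,
          let denominator = Double(String(parts[1])),
          denominator != 0
    else {
        return nil
    }
    return numerator / denominator
}

let tonic = 261.626 // middle C, as in the code above
let pitches = ["9/8", "5/4", "701.955", "2/1"]
// Tokens that cannot be parsed are skipped rather than crashing.
let frequencies = pitches.compactMap { linearFactor(fromSCL: $0) }.map { $0 * tonic }
print(frequencies)
```

Returning an optional instead of force-unwrapping means a malformed line in a tuning file yields nil rather than a crash; whether to skip such lines or report them is a design choice left open here.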
Thank you KangKang Adrian and Atem
let ratios = [ "1/1", "9/8", "5/4", "4/3", "3/2", "27/16", "15/8"]
let factors = ratios.map { s -> Float in
    let integers = s.characters.split(separator: "/").map(String.init).map({ Float($0) })
    return integers[0]!/integers[1]!
}
If I understand your question, you can do something like this:
func linearFactors(from string: String) -> Double? {
    let components = string.components(separatedBy: "/").flatMap { Double($0) }
    if let numerator = components.first, let denominator = components.last {
        return numerator / denominator
    }
    return nil
}
Convert the ratios to an array of Double:
let ratios = [ "1/1", "9/8", "5/4", "4/3", "3/2", "27/16", "15/8"]
let array = ratios.flatMap { element -> Double? in
    let parts = element.components(separatedBy: "/")
    guard parts.count == 2,
        let dividend = Double(parts[0]),
        let divisor = Double(parts[1]),
        divisor != 0
    else {
        return nil
    }
    // Divide the parsed numbers, not the original strings
    return dividend / divisor
}
I have a long string (sometimes over 1000 characters) that I want to convert to an array of boolean values. And it needs to do this many times, very quickly.
let input: String = "001"
let output: [Bool] = [false, false, true]
My naive attempt was this:
input.characters.map { $0 == "1" }
But this is a lot slower than I'd like. My profiling has shown me that the map is where the slowdown is, but I'm not sure how much simpler I can make that.
I feel like this would be wicked fast without Swift's/ObjC's overhead. In C, I think this is a simple for loop where a byte of memory is compared to a constant, but I'm not sure what the functions or syntax is that I should be looking at.
Is there a way to do this much faster?
UPDATE:
I also tried a
output = []
for char in input.characters {
    output.append(char == "1")
}
And it's about 15% faster. I'm hoping for a lot more than that.
This is faster:
// Algorithm 'A'
let input = "0101010110010101010"
var output = Array<Bool>(count: input.characters.count, repeatedValue: false)
for (index, char) in input.characters.enumerate() where char == "1" {
    output[index] = true
}
Update: with input = "010101011010101001000100000011010101010101010101"
the timings were 0.0741 for the original versus 0.0087 for this approach, which makes it roughly 8.5 times faster. The gap widens with larger inputs.
Using nulTerminatedUTF8 is slightly faster still, although not always faster than Algorithm A. Note that nulTerminatedUTF8 includes the trailing NUL byte, so output ends with one extra false element:
// Algorithm 'B'
let input = "10101010101011111110101000010100101001010101"
var output = Array<Bool>(count: input.nulTerminatedUTF8.count, repeatedValue: false)
for (index, code) in input.nulTerminatedUTF8.enumerate() where code == 49 {
    output[index] = true
}
Timings for an input of length 2196:
A: 0.311 s, B: 0.304 s
import Foundation
let input:String = "010101011001010101001010101100101010100101010110010101010101011001010101001010101100101010100101010101011001010101001010101100101010100101010"
var start = clock()
// nulTerminatedUTF8 includes the trailing NUL, so output has one extra (false) element
var output = Array<Bool>(count: input.nulTerminatedUTF8.count, repeatedValue: false)
var index = 0
for val in input.nulTerminatedUTF8 {
    // 49 is the ASCII code of "1"
    if val == 49 {
        output[index] = true
    }
    index+=1
}
var diff = clock() - start;
var msec = diff * 1000 / UInt(CLOCKS_PER_SEC);
print("Time taken \(Double(msec)/1000.0) seconds \(msec%1000) milliseconds");
This should be really fast. Try it out. For 010101011010101001000100000011010101010101010101 it takes 0.039 secs.
I would guess that this is as fast as possible:
let targ = Character("1")
let input: String = "001" // your real string goes here
let inputchars = Array(input.characters)
var output:[Bool] = Array.init(count: inputchars.count, repeatedValue: false)
inputchars.withUnsafeBufferPointer {
    inputbuf in
    output.withUnsafeMutableBufferPointer {
        outputbuf in
        var ptr1 = inputbuf.baseAddress
        var ptr2 = outputbuf.baseAddress
        for _ in 0..<inputbuf.count {
            ptr2.memory = ptr1.memory == targ
            ptr1 = ptr1.successor()
            ptr2 = ptr2.successor()
        }
    }
}
// output now contains the result
The reason is that, thanks to the use of buffer pointers, we are simply cycling through contiguous memory, just like the way you cycle through a C array by incrementing its pointer. Thus, once we get past the initial setup, this should be as fast as it would be in C.
EDIT In an actual test, the time difference between the OP's original method and this one is the difference between
13.3660290241241
and
0.219357967376709
which is a pretty dramatic speed-up. I hasten to add, however, that I have excluded the initial set-up from the timing test. This line:
let inputchars = Array(input.characters)
...is particularly expensive.
This should be a little faster than the enumerate() where char == "1" version (0.557s for 500_000 alternating ones and zeros vs. 1.159s algorithm 'A' from diampiax)
let input = inputStr.utf8
let n = input.count
var output = [Bool](count: n, repeatedValue: false)
let one = UInt8(49) // 1
for (idx, char) in input.enumerate() {
    if char == one { output[idx] = true }
}
but it's also a lot less readable ;-p
edit: both versions are slower than the map variant, maybe you forgot to compile with optimizations?
One more step should speed that up even more. Using reserveCapacity will resize the array once before the loop starts instead of resizing it repeatedly as the loop runs.
var output = [Bool]()
output.reserveCapacity(input.characters.count)
for char in input.characters {
    output.append(char == "1")
}
Use withCString(_:) to retrieve a raw UnsafePointer<Int8>. Iterate over that and compare each byte to 49 (the ASCII value of "1").
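A minimal sketch of that suggestion, written in current Swift syntax rather than the Swift 2 syntax used elsewhere in this thread. The function name boolsViaCString is made up, and no timing claims are made for it; measure with optimizations enabled before relying on it.

```swift
/// Walks the NUL-terminated C string of `input` and records, for every
/// byte, whether it equals 49 (ASCII "1"). Assumes the input is ASCII.
func boolsViaCString(_ input: String) -> [Bool] {
    var output = [Bool]()
    // One allocation up front instead of repeated growth while appending.
    output.reserveCapacity(input.utf8.count)
    input.withCString { start in
        var pointer = start
        // Stop at the terminating NUL so it is not included in the result.
        while pointer.pointee != 0 {
            output.append(pointer.pointee == 49)
            pointer += 1
        }
    }
    return output
}

print(boolsViaCString("001"))
// [false, false, true]
```

The pointer is only valid inside the closure, which is why the loop runs there and only the finished array is returned.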
What about a more functional style? It is certainly not the fastest (47 ms), though.
import Cocoa
let start = clock()
let bools = [Bool](([Character] ("010101011001010101001010101100101010100101010110010101010101011001010101001010101100101010100101010101011001010101001010101100101010100101010".characters)).map({$0 == "1"}))
let msec = (clock() - start) * 1000 / UInt(CLOCKS_PER_SEC);
print("Time taken \(Double(msec)/1000.0) seconds \(msec%1000) milliseconds");
I need to do some testing to be sure, but I think one issue with many of the approaches given, including the original map, is that they iterate over the string once to count the characters and then a second time to actually process them.
Have you tried:
let output = [Bool](input.characters.lazy.map { $0 == "1" })
This might only do a single iteration.
The other thing that could speed things up is avoiding strings altogether and instead using arrays of characters in an appropriate encoding, particularly one with more fixed-size units (e.g. UTF-16 or ASCII). Then the length lookup will be O(1) rather than O(n), and the iteration may be faster too.
BTW always test performance with the optimiser enabled and never in the Playground because the performance characteristics are completely different, sometimes by a factor of 100.
Following the instructions that I found here: https://github.com/haginile/SwiftAccelerate I verified that matrix inversion works for the example given.
But I get an EXC_BAD_ACCESS error for any other matrix bigger than 2x2. For example, the following 2D matrix (flattened to a 1D array) inverts successfully in MATLAB and Python, but fails here:
m = [0.55481645013013, -1.15522603580724, 0.962090414322894, -0.530226035807236, 0.168545207161447, -0.38627124296868, 0.93401699437494, -0.999999999999995, 0.684016994374945, -0.23176274578121, 0.123606797749979, -0.323606797749979, 0.432893622827287, -0.323606797749979, 0.123606797749979, 0.231762745781211, -0.684016994374948, 1.0, -0.934016994374947, 0.386271242968684, 0.168545207161448, -0.530226035807237, 0.962090414322895, -1.15522603580724, 0.554816450130132]
Its inverted matrix should be
inv(AA)
ans =
Columns 1 through 3
-262796763616197 -656991909040516 4.90007819375216
-162417332048282 -406043330120712 14.6405748712708
0.718958226823704 7.87760147961979 30.4010295628018
162417332048287 406043330120730 46.1614842543337
262796763616208 656991909040536 55.9019809318537
Columns 4 through 5
-656991909040528 262796763616211
-406043330120721 162417332048287
-4.28281034550088 -0.718958226823794
406043330120704 -162417332048283
656991909040497 -262796763616196
Could you please give me another way to invert a matrix in Swift, or explain to me how to fix this?
I really don't understand why it does not work.
It doesn't work because the instructions that you found are not so good. Specifically, both pivots and workspace need to be Arrays, not scalar values; it was only working for two-by-two matrices by random chance.
Here's a modified version of the invert function that allocates the workspaces correctly:
func invert(matrix : [Double]) -> [Double] {
    var inMatrix = matrix
    var N = __CLPK_integer(sqrt(Double(matrix.count)))
    var pivots = [__CLPK_integer](count: Int(N), repeatedValue: 0)
    var workspace = [Double](count: Int(N), repeatedValue: 0.0)
    var error : __CLPK_integer = 0
    dgetrf_(&N, &N, &inMatrix, &N, &pivots, &error)
    dgetri_(&N, &inMatrix, &N, &pivots, &workspace, &N, &error)
    return inMatrix
}
I should also note that your 5x5 matrix is extremely ill-conditioned, so even when you can compute the "inverse" the error of that computation will be very large, and the inverse really shouldn't be used.
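Since the question also asks for another way to invert a matrix in Swift, here is a rough sketch that avoids LAPACK entirely by using Gauss-Jordan elimination with partial pivoting on a row-major flat array. The function name gaussJordanInverse and the default tolerance are assumptions for illustration. It is not the method used in the answer above, it is much slower than LAPACK for large matrices, and it does nothing to make an ill-conditioned input trustworthy.

```swift
/// Inverts a square matrix stored row-major in a flat array.
/// Returns nil if the array is not square or a pivot is numerically zero.
func gaussJordanInverse(_ matrix: [Double], tolerance: Double = 1e-12) -> [Double]? {
    let n = Int(Double(matrix.count).squareRoot().rounded())
    guard n > 0, n * n == matrix.count else { return nil }

    var a = matrix
    // Start from the identity; the same row operations turn it into the inverse.
    var inverse = [Double](repeating: 0.0, count: n * n)
    for i in 0..<n { inverse[i * n + i] = 1.0 }

    for col in 0..<n {
        // Partial pivoting: use the row with the largest entry in this column.
        var pivotRow = col
        for row in (col + 1)..<n where abs(a[row * n + col]) > abs(a[pivotRow * n + col]) {
            pivotRow = row
        }
        let pivot = a[pivotRow * n + col]
        guard abs(pivot) > tolerance else { return nil }

        if pivotRow != col {
            for k in 0..<n {
                a.swapAt(pivotRow * n + k, col * n + k)
                inverse.swapAt(pivotRow * n + k, col * n + k)
            }
        }
        // Scale the pivot row so the pivot becomes 1.
        for k in 0..<n {
            a[col * n + k] /= pivot
            inverse[col * n + k] /= pivot
        }
        // Clear this column in every other row.
        for row in 0..<n where row != col {
            let factor = a[row * n + col]
            if factor == 0 { continue }
            for k in 0..<n {
                a[row * n + k] -= factor * a[col * n + k]
                inverse[row * n + k] -= factor * inverse[col * n + k]
            }
        }
    }
    return inverse
}

if let inverse = gaussJordanInverse([4, 7, 2, 6]) {
    print(inverse)
}
```

A useful sanity check with any inversion routine is to multiply the result by the original matrix and see how far the product is from the identity; for a matrix like the 5x5 one in the question, that residual is likely to be large.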
A Swift 4 version:
func invert(matrix : [Double]) -> [Double] {
    var inMatrix = matrix
    var N = __CLPK_integer(sqrt(Double(matrix.count)))
    var pivots = [__CLPK_integer](repeating: 0, count: Int(N))
    var workspace = [Double](repeating: 0.0, count: Int(N))
    var error : __CLPK_integer = 0
    withUnsafeMutablePointer(to: &N) {
        dgetrf_($0, $0, &inMatrix, $0, &pivots, &error)
        dgetri_($0, &inMatrix, $0, &pivots, &workspace, $0, &error)
    }
    return inMatrix
}
I have written a library for linear algebra in Swift. I call this library swix and it includes functions to invert matrices (this function is called inv).
Example use case:
var b = ones(10)
var A = rand((10, 10))
var AI = inv(A)
var x = AI.dot(b)
Source: https://github.com/stsievert/swix
Documentation: http://scottsievert.com/swix/