Generating random data using Metal Performance Shaders - ios

I am trying to generate some random integer data for my app with the GPU using MPSMatrixRandom, and I have two questions.
What is the difference between MPSMatrixRandomMTGP32 and MPSMatrixRandomPhilox?
I understand that these two shaders use different algorithms, but what are the differences between them? Does the performance or output of these two algorithms differ, and if so, how?
What code can you use to implement these shaders?
I tried to implement them myself, but my app consistently crashes with vague error messages. I'd like to see an example implementation of this being done properly.

Here's a sample demonstrating how to generate random matrices using these two kernels:
import Foundation
import Metal
import MetalPerformanceShaders
let device = MTLCreateSystemDefaultDevice()!
let commandQueue = device.makeCommandQueue()!
let rows = 8
let columns = 8
let matrixDescriptor = MPSMatrixDescriptor(rows: rows,
columns: columns,
rowBytes: MemoryLayout<Float>.stride * columns,
dataType: .float32)
let mtMatrix = MPSMatrix(device: device, descriptor: matrixDescriptor)
let phMatrix = MPSMatrix(device: device, descriptor: matrixDescriptor)
let distribution = MPSMatrixRandomDistributionDescriptor.uniformDistributionDescriptor(withMinimum: -1.0, maximum: 1.0)
let mtKernel = MPSMatrixRandomMTGP32(device: device,
destinationDataType: .float32,
seed: 0,
distributionDescriptor: distribution)
let phKernel = MPSMatrixRandomPhilox(device: device,
destinationDataType: .float32,
seed: 0,
distributionDescriptor: distribution)
let commandBuffer = commandQueue.makeCommandBuffer()!
mtKernel.encode(commandBuffer: commandBuffer, destinationMatrix: mtMatrix)
phKernel.encode(commandBuffer: commandBuffer, destinationMatrix: phMatrix)
#if os(macOS)
mtMatrix.synchronize(on: commandBuffer)
phMatrix.synchronize(on: commandBuffer)
#endif
commandBuffer.commit()
commandBuffer.waitUntilCompleted() // Only necessary to ensure GPU->CPU sync for display
print("Mersenne Twister values:")
let mtValues = mtMatrix.data.contents().assumingMemoryBound(to: Float.self)
for row in 0..<rows {
for col in 0..<columns {
print("\(mtValues[row * columns + col])", terminator: " ")
}
print("")
}
print("")
print("Philox values:")
let phValues = phMatrix.data.contents().assumingMemoryBound(to: Float.self)
for row in 0..<rows {
for col in 0..<columns {
print("\(phValues[row * columns + col])", terminator: " ")
}
print("")
}
I can't comment on the statistical properties of these generators; I'd refer you to the papers mentioned in the comments.

Related

Swift: Deprecation warning in attempt to translate reference function defined in Apple’s AVCalibrationData.h file

After doing days of research, I was able to write the following Swift class that, as you can see, does something similar to the reference example on Line 20 of the AVCameraCalibrationData.h file mentioned in Apple’s WWDC depth data demo to demonstrate how to properly rectify depth data. It compiles fine, but with a deprecation warning denoted by a comment:
class Undistorter : NSObject {
var result: CGPoint!
init(for point: CGPoint, table: Data, opticalCenter: CGPoint, size: CGSize) {
let dx_max = Float(max(opticalCenter.x, size.width - opticalCenter.x))
let dy_max = Float(max(opticalCenter.y, size.height - opticalCenter.y))
let max_rad = sqrt(pow(dx_max, 2) + pow(dy_max, 2))
let vx = Float(point.x - opticalCenter.x)
let vy = Float(point.y - opticalCenter.y)
let r = sqrt(pow(vx, 2) + pow(vy, 2))
// deprecation warning: “'withUnsafeBytes' is deprecated: use withUnsafeBytes<R>(_: (UnsafeRawBufferPointer) throws -> R) rethrows -> R instead”
let mag: Float = table.withUnsafeBytes({ (tableValues: UnsafePointer<Float>) in
let count = table.count / MemoryLayout<Float>.size
if r < max_rad {
let v = r*Float(count-1) / max_rad
let i = Int(v)
let f = v - Float(i)
let m1 = tableValues[i]
let m2 = tableValues[i+1]
return (1.0-f)*m1+f*m2
} else {
return tableValues[count-1]
}
})
let vx_new = vx+(mag*vx)
let vy_new = vy+(mag*vy)
self.result = CGPoint(
x: opticalCenter.x + CGFloat(vx_new),
y: opticalCenter.y + CGFloat(vy_new)
)
}
}
Although this is a pretty common warning with plenty of examples, I haven't found an answer that fits this use case: the existing examples all involve networking contexts, and adapting their fixes to this code ends up introducing errors. For example, when attempting this fix:
let mag: Float = table.withUnsafeBytes { $0.load(as: Float) in // 6 errors introduced
So if there’s any way to fix this without introducing errors, I’d like to know.
Update: it actually does work; see my answer to my own question.
Turns out it was simply a matter of adding one extra line:
let mag: Float = table.withUnsafeBytes {
let tableValues = $0.bindMemory(to: Float.self) // typed view over the table's bytes
Now it compiles without incident.
Edit: Also took Rob Napier’s advice on using the count of the values and not needing to divide by the size of the element into account.
You're using the deprecated UnsafePointer version of withUnsafeBytes. The new version passes an UnsafeRawBufferPointer, which you can bind to Float to get an UnsafeBufferPointer<Float>. So instead of this:
let mag: Float = table.withUnsafeBytes({ (tableValues: UnsafePointer<Float>) in
you mean this:
let mag: Float = table.withUnsafeBytes({ (rawBuffer: UnsafeRawBufferPointer) in
let tableValues = rawBuffer.bindMemory(to: Float.self)
Instead of:
let count = table.count / MemoryLayout<Float>.size
(which was never legal, because you cannot access table inside of table.withUnsafeBytes), you now want:
let count = tableValues.count
There's no need to divide by the size of the element.
And instead of tableValues, you'll use tableValues.baseAddress!. Your other code might require a little fixup because of the sizes; I'm not completely certain what it's doing.
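Putting the pieces of this thread together, a minimal self-contained sketch of the table lookup might look like the following. It assumes the table is a Data of tightly packed Floats; the function name lookupMagnification and the four-entry sample table are hypothetical, chosen only for illustration.

```swift
import Foundation

/// Linearly interpolates a value from a lookup table of Floats stored in `table`.
/// `lookupMagnification` is a hypothetical helper name, not part of any Apple API.
func lookupMagnification(in table: Data, radius r: Float, maxRadius: Float) -> Float {
    return table.withUnsafeBytes { (raw: UnsafeRawBufferPointer) -> Float in
        // Reinterpret the raw bytes as Floats; `count` is now in elements, not bytes.
        let values = raw.bindMemory(to: Float.self)
        let count = values.count
        guard count > 0 else { return 0 }
        // At or beyond the maximum radius, use the last table entry.
        guard r < maxRadius, count > 1 else { return values[count - 1] }
        let v = r * Float(count - 1) / maxRadius
        let i = min(Int(v), count - 2)   // clamp so that i + 1 stays in bounds
        let f = v - Float(i)
        return (1 - f) * values[i] + f * values[i + 1]
    }
}

// Build a tiny table [0, 1, 2, 3] as Data for the demonstration.
let samples: [Float] = [0, 1, 2, 3]
let tableData = samples.withUnsafeBufferPointer { Data(buffer: $0) }
let halfway = lookupMagnification(in: tableData, radius: 1.5, maxRadius: 3.0)
print(halfway)
```

Because the buffer is bound to Float before use, the element count comes straight from the typed buffer and no division by MemoryLayout<Float>.size is needed.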

Inconsistent results using Metal Performance Shaders between MacBook Pro and iMac

I'm trying to dip my feet into the waters of GPU programming for the first time. I thought I'd start out with something simple and use pre-made kernels (hence the MPS) and just try issuing the commands to the GPU.
My attempt was to simply sum up all values between 1 and 1000. I put each value in a 1x1 matrix and used the MPS Matrix Sum.
On my MacBook Pro, this works as I expect it to.
On my iMac, it gives [0.0] as the result. I figured this had to do with memory, since my MacBook Pro uses an iGPU and my iMac a dGPU. However, as far as I can tell, storageModeShared shouldn't cause this. I even tried adding .synchronize() to the result matrix before reading from it, even though I'm pretty sure it shouldn't be necessary with storageModeShared.
The code isn't elegant because it's just for quickly understanding how to issue commands with MPS, and I've been trying fixes for a while without keeping track of structure, but it should still be fairly easy to read. If not, let me know and I'll refactor it.
The print statements are just to try and debug, aside from print(output)
I hate to paste so much code, but I'm afraid I can't really isolate my issue more.
import Cocoa
import Quartz
import PlaygroundSupport
import MetalPerformanceShaders
let device = MTLCopyAllDevices()[0]
print(MTLCopyAllDevices())
let shaderKernel = MPSMatrixSum.init(device: device, count: 1000, rows: 1, columns: 1, transpose: false)
var matrixList: [MPSMatrix] = []
var GPUStorageBuffers: [MTLBuffer] = []
for i in 1...1000 {
var a = Float32(i)
var b: [Float32] = []
let descriptor = MPSMatrixDescriptor.init(rows: 1, columns: 1, rowBytes: 4, dataType: .float32)
b.append(a)
let buffer = device.makeBuffer(bytes: b, length: 4, options: .storageModeShared)
GPUStorageBuffers.append(buffer!)
let GPUStoredMatrices = MPSMatrix.init(buffer: buffer!, descriptor: descriptor)
matrixList.append(GPUStoredMatrices)
}
let matrices: [MPSMatrix] = matrixList
print(matrices.count)
print("\n")
print(matrices[4].debugDescription)
print("\n")
var printer: [Float32] = []
let pointer2 = matrices[4].data.contents()
let typedPointer2 = pointer2.bindMemory(to: Float32.self, capacity: 1)
let buffpoint2 = UnsafeBufferPointer(start: typedPointer2, count: 1)
buffpoint2.map({value in
printer += [value]
})
print(printer)
let CMDQue = device.makeCommandQueue()
let CMDBuffer = CMDQue!.makeCommandBuffer()
var resultMatrix = MPSMatrix.init(device: device, descriptor: MPSMatrixDescriptor.init(rows: 1, columns: 1, rowBytes: 4, dataType: .float32))
shaderKernel.encode(to: CMDBuffer!, sourceMatrices: matrices, resultMatrix: resultMatrix, scale: nil, offsetVector: nil, biasVector: nil, start: 0)
print(CMDBuffer.debugDescription)
CMDBuffer!.commit()
print(CMDBuffer.debugDescription)
print(CMDQue.debugDescription)
let GPUStartTime = CACurrentMediaTime()
CMDBuffer!.waitUntilCompleted()
var output = [Float32]()
resultMatrix.synchronize(on: CMDBuffer!)
let pointer = resultMatrix.data.contents()
let typedPointer = pointer.bindMemory(to: Float32.self, capacity: 1)
let buffpoint = UnsafeBufferPointer(start: typedPointer, count: 1)
buffpoint.map({value in
output += [value]
})
print(output)
let finish = CACurrentMediaTime() - GPUStartTime // elapsed time in seconds
print("\n")
print(finish)

How to port this bit of code from the HOWL vocoder synth from AudioKit 2 to AudioKit 4.x?

I'm trying to port the HOWL vocoder synth from AudioKit 2 to the latest version.
https://github.com/dclelland/HOWL
I'm starting with the Vocoder:
https://github.com/dclelland/HOWL/blob/master/HOWL/Models/Audio/Vocoder.swift
I'm not sure how this next bit of code works.
Is the reduce() applying the resonant filter to the audio input in a consecutive manner? Is it doing the equivalent of AKResonantFilter(AKResonantFilter(AKResonantFilter(mutedInput)))?
Or is something else going on?
let mutedAudioInput = AKAudioInput() * AKPortamento(input: inputAmplitude, halfTime: 0.001.ak)
let mutedInput = (input + mutedAudioInput) * AKPortamento(input: amplitude, halfTime: 0.001.ak)
let filter = zip(frequencies, bandwidths).reduce(mutedInput) { input, parameters in
let (frequency, bandwidth) = parameters
return AKResonantFilter(
input: input,
centerFrequency: AKPortamento(input: frequency, halfTime: 0.001.ak),
bandwidth: AKPortamento(input: bandwidth, halfTime: 0.001.ak)
)
}
Here is my attempt at porting the vocoder filters:
import AudioKitPlaygrounds
import AudioKit
import AudioKitUI
let mixer = AKMixer()
let sawtooth = AKTable(.sawtooth)
let sawtoothLFO = AKOscillator(waveform: sawtooth, frequency: 130.81, amplitude: 1, detuningOffset: 0.0, detuningMultiplier: 0.0)
let frequencyScale = 1.0
let topFrequencies = zip(voice.æ.formants,voice.α.formants).map {topLeftFrequency,topRightFrequency in
return 0.5 * (topRightFrequency - topLeftFrequency) + topLeftFrequency
}
let bottomFrequencies = zip(voice.i.formants,voice.u.formants).map {bottomLeftFrequency,bottomRightFrequency in
return 0.5 * (bottomRightFrequency - bottomLeftFrequency) + bottomLeftFrequency
}
let frequencies = zip(topFrequencies, bottomFrequencies).map { topFrequency, bottomFrequency in
return (0.5 * (bottomFrequency - topFrequency) + topFrequency) * frequencyScale
}
let bandwidthScale = 1.0
let bandwidths = frequencies.map { frequency in
return (frequency * 0.02 + 50.0) * bandwidthScale
}
let filteredLFO = AKResonantFilter(sawtoothLFO)
let filter = zip(frequencies,bandwidths).reduce(filteredLFO) { input,parameters in
let (frequency, bandwidth) = parameters
return AKResonantFilter(
input,
frequency: frequency,
bandwidth: bandwidth
)
}
[filter, sawtoothLFO] >>> mixer
filter.start()
sawtoothLFO.play()
I am getting some sound but it isn't quite right. I am not sure if I am taking the right approach.
In particular, my question is: is this the right approach to rewriting the bit of code highlighted above?
let filteredLFO = AKResonantFilter(sawtoothLFO)
let filter = zip(frequencies,bandwidths).reduce(filteredLFO) { input,parameters in
let (frequency, bandwidth) = parameters
return AKResonantFilter(
input,
frequency: frequency,
bandwidth: bandwidth
)
}
Is there a preferred way to do this whole thing, using AKOperation generators? Should I be using AKFormantFilter? I've experimented with AKFormantFilter and AKVocalTract, but have not been able to get the audio results I wanted. The HOWL app sounds pretty much exactly like what I'm trying to do, which is why I started porting the code. (It's for a "talking" robot game.)
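On the reduce() question: reduce feeds each result back in as the next input, so folding over the (frequency, bandwidth) pairs builds one nested chain of filters, with the first pair applied innermost. The sketch below shows this without AudioKit. Node and resonantFilter are hypothetical stand-ins that only record how they were constructed, and the frequency and bandwidth values are made up.

```swift
import Foundation

// Hypothetical stand-in for an audio node: it only records how it was built.
struct Node {
    let description: String
}

// Hypothetical stand-in for a resonant filter wrapping its input node.
func resonantFilter(_ input: Node, frequency: Double, bandwidth: Double) -> Node {
    return Node(description: "Filter(\(input.description), f: \(Int(frequency)), bw: \(Int(bandwidth)))")
}

// Made-up values, for illustration only.
let frequencies = [800.0, 1200.0, 2500.0]
let bandwidths = [66.0, 74.0, 100.0]

// Each step wraps the previous result, so the filters end up in series.
let chain = zip(frequencies, bandwidths).reduce(Node(description: "input")) { input, parameters in
    let (frequency, bandwidth) = parameters
    return resonantFilter(input, frequency: frequency, bandwidth: bandwidth)
}
print(chain.description)
```

The printed description nests three filters around the original input, which matches the AKResonantFilter(AKResonantFilter(AKResonantFilter(...))) reading of the original code. Note that seeding reduce with an already filtered node, as in the port above, adds one more filter stage than the original had.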

How to make a closure in Swift extract two integers from a string to perform a calculation

I am currently using the map method with a closure in Swift to extract linear factors from an array and calculate a list of musical frequencies spanning one octave.
let tonic: Double = 261.626 // middle C
let factors = [ 1.0, 1.125, 1.25, 1.333, 1.5, 1.625, 1.875]
let frequencies = factors.map { $0 * tonic }
print(frequencies)
// [261.62599999999998, 294.32925, 327.03249999999997, 348.74745799999994, 392.43899999999996, 425.14224999999999, 490.54874999999993]
I want to do this by making the closure extract two integers from a string and divide them to form each factor. The string comes from an SCL tuning file and might look something like this:
// C D E F G A B
let ratios = [ "1/1", "9/8", "5/4", "4/3", "3/2", "27/16", "15/8"]
Can this be done ?
SOLUTION
Thankfully, yes it can. In three Swift statements, tuning ratios represented as fractions since before Ptolemy can be converted into precise frequencies. A slight modification to the accepted answer makes it possible to derive the list of frequencies. Here is the code:
import UIKit
class ViewController: UIViewController {
// Diatonic scale
let ratios = [ "1/1", "9/8", "5/4", "4/3", "3/2", "27/16", "15/8"]
// Mohajira scale
// let ratios = [ "21/20", "9/8", "6/5", "49/40", "4/3", "7/5", "3/2", "8/5", "49/30", "9/5", "11/6", "2/1"]
override func viewDidLoad() {
super.viewDidLoad()
_ = Tuning(ratios: ratios)
}
}
Tuning Class
import UIKit
class Tuning {
let tonic = 261.626 // frequency of middle C (in Hertz)
var ratios = [String]()
init(ratios: [String]) {
self.ratios = ratios
let frequencies = ratios.map { s -> Double in
let integers = s.split(separator: "/").map { Double($0) }
return (integers[0]!/integers[1]!) * tonic
}
print("// \(frequencies)")
}
}
And here is the list of frequencies in Hertz corresponding to notes of the diatonic scale
C D E F G A B
[261.626007, 294.329254, 327.032501, 348.834686, 392.439026, 441.493896, 490.548767]
It works for other scales with pitches not usually found on a black-and-white-note music keyboard
Mohajira scale created by Jacques Dudon
// D F G C'
let ratios = [ "21/20", "9/8", "6/5", "49/40", "4/3", "7/5", "3/2", "8/5", "49/30", "9/5", "11/6", "2/1"]
And here is a list of frequencies produced
// D F G C'
// [274.70729999999998, 294.32925, 313.95119999999997, 320.49185, 348.83466666666664, 366.27639999999997, 392.43899999999996, 418.60159999999996, 427.32246666666663, 470.92679999999996, 479.64766666666662, 523.25199999999995]
Disclaimer
Currently the closure only handles rational scales. To fully comply with Scala SCL format it must also be able to distinguish between strings with fractions and strings with a decimal point and interpret the latter using cents, i.e. logarithmic rather than linear factors.
Thank you KangKang Adrian and Atem
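The limitation described in the disclaimer could be addressed along these lines. This is only a sketch under stated assumptions: an entry containing a period is read as cents, an entry with a slash as a ratio, and a bare integer as n/1. The function name linearFactor(fromSCLPitch:) is hypothetical, and real SCL files also carry comments and header lines that this does not handle.

```swift
import Foundation

/// Converts one SCL pitch entry to a linear frequency factor.
/// Hypothetical helper: handles "3/2", "2" and "701.955" style entries only.
func linearFactor(fromSCLPitch pitch: String) -> Double? {
    let text = pitch.trimmingCharacters(in: .whitespaces)
    if text.contains(".") {
        // Cents are logarithmic: 1200 cents equal one octave (a factor of 2).
        guard let cents = Double(text) else { return nil }
        return pow(2.0, cents / 1200.0)
    }
    let parts = text.split(separator: "/", omittingEmptySubsequences: false).map { Double($0) }
    if parts.count == 1, let value = parts[0] {
        return value                      // bare integer, treated as n/1
    }
    if parts.count == 2, let numerator = parts[0], let denominator = parts[1], denominator != 0 {
        return numerator / denominator    // rational entry
    }
    return nil                            // malformed entry
}

let tonic = 261.626 // frequency of middle C (in Hertz)
let pitches = ["9/8", "5/4", "701.955", "2"]
let frequencies = pitches.compactMap { linearFactor(fromSCLPitch: $0) }.map { $0 * tonic }
print(frequencies)
```

Returning an optional keeps malformed entries from crashing the closure, which the force-unwrapping version above would do.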
let ratios = [ "1/1", "9/8", "5/4", "4/3", "3/2", "27/16", "15/8"]
let factors = ratios.map { s -> Float in
let integers = s.split(separator: "/").map { Float($0) }
return integers[0]!/integers[1]!
}
If I understand your question, you can do something like that:
func linearFactors(from string: String) -> Double? {
let components = string.components(separatedBy: "/").compactMap { Double($0) }
if let numerator = components.first, let denominator = components.last {
return numerator / denominator
}
return nil
}
Convert ratios to array of double
let ratios = [ "1/1", "9/8", "5/4", "4/3", "3/2", "27/16", "15/8"]
let array = ratios.compactMap { element -> Double? in
let parts = element.components(separatedBy: "/")
guard parts.count == 2,
let dividend = Double(parts[0]),
let divisor = Double(parts[1]),
divisor != 0
else {
return nil
}
// Divide the parsed numbers, not the original strings
return dividend / divisor
}

FFT Calculating incorrectly - Swift

I am trying to take the Fast Fourier Transform of a signal. I am basing my calculation on the Surge library, but I am having trouble getting correct results. When I take the FFT of a 1000 Hz sound, the result does not look right. When I take the same tone and use Python, I get something that looks far more correct. The Python code looks like:
import numpy as np
import scipy.io.wavfile
import numpy.fft
import matplotlib.pyplot as plt
FILENAME = 'beep.wav'
fs, data = scipy.io.wavfile.read(FILENAME)
data = data[:801]
spacing = 1 / float(fs)
freq = numpy.fft.rfft(data)
freq_power = np.abs(freq)
a = 1 / (2 * spacing)
b = (len(data) + 1) // 2
freq_axis = np.linspace(0, a, b)
plt.plot(freq_axis, freq_power)
plt.show()
The Swift code looks like:
import Accelerate
public func sqrt(x: [Float]) -> [Float] {
var results = [Float](count: x.count, repeatedValue: 0.0)
vvsqrtf(&results, x, [Int32(x.count)])
return results
}
public func fft(input: [Float]) -> [Float] {
var real = [Float](input)
var imaginary = [Float](count: input.count, repeatedValue: 0.0)
var splitComplex = DSPSplitComplex(realp: &real, imagp: &imaginary)
let length = vDSP_Length(floor(log2(Float(input.count))))
let radix = FFTRadix(kFFTRadix2)
let weights = vDSP_create_fftsetup(length, radix)
println(weights)
vDSP_fft_zip(weights, &splitComplex, 1, 8, FFTDirection(FFT_FORWARD))
var magnitudes = [Float](count: input.count, repeatedValue: 0.0)
vDSP_zvmags(&splitComplex, 1, &magnitudes, 1, vDSP_Length(input.count))
var normalizedMagnitudes = [Float](count: input.count, repeatedValue: 0.0)
vDSP_vsmul(sqrt(magnitudes), 1, [2.0 / Float(input.count)], &normalizedMagnitudes, 1, vDSP_Length(input.count))
vDSP_destroy_fftsetup(weights)
return normalizedMagnitudes
}
To reiterate: the Swift code is the one giving unexpected results. What am I doing wrong?
It looks like you are using Swift Float arrays with the Accelerate framework, but you might instead need to allocate your vectors using UnsafeMutablePointer<Float> types, since the Accelerate framework is a C framework. Here is an example of how to do this.
public func sqrt(x: [Float]) -> [Float] {
    // Copy the Swift array into a C vector
    let temp = UnsafeMutablePointer<Float>.allocate(capacity: x.count)
    temp.initialize(from: x, count: x.count)
    // vvsqrtf takes the element count by pointer
    var count = Int32(x.count)
    vvsqrtf(temp, temp, &count)
    // Copy the C vector back into a Swift array
    let results = Array(UnsafeBufferPointer(start: temp, count: x.count))
    // Free memory
    temp.deallocate()
    return results
}
For better performance, use UnsafeMutablePointer<Float> types throughout your code for your vectors of data rather than converting back and forth in function calls as I did for this example. You should also save your FFT setup and reuse it.
Since you're using the vDSP FFT, you might also like the vDSP_zvabs API, which calculates the magnitude of each complex FFT result; vDSP_vdbcon can then convert those magnitudes to decibels.
Finally, be sure to read this link on data packing and scaling for the Accelerate framework FFT APIs.
https://developer.apple.com/library/mac/documentation/Performance/Conceptual/vDSP_Programming_Guide/UsingFourierTransforms/UsingFourierTransforms.html
To improve performance, the vDSP APIs do not output the most obvious scale values (since you will undoubtedly be scaling the data anyway somewhere else) and they pack in some extra data into a few of the FFT points.
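When the scaling or packing of an FFT is in doubt, it can help to check against a small reference whose scaling is explicit. The sketch below is a naive O(n²) DFT in plain Swift, not vDSP and not the code from the question. It scales by 2/n so that a unit-amplitude sine shows a peak of about 1, and maps bin k to the frequency k * sampleRate / n. The sample rate, length and function name are assumptions chosen so the 1000 Hz tone lands exactly on a bin.

```swift
import Foundation

/// Naive DFT returning magnitudes for bins 0...n/2, scaled by 2/n.
/// A reference for checking scaling only; far too slow for real use.
func dftMagnitudes(_ x: [Double]) -> [Double] {
    let n = x.count
    return (0...(n / 2)).map { k -> Double in
        var re = 0.0
        var im = 0.0
        for (t, sample) in x.enumerated() {
            let angle = -2.0 * Double.pi * Double(k) * Double(t) / Double(n)
            re += sample * cos(angle)
            im += sample * sin(angle)
        }
        return 2.0 * (re * re + im * im).squareRoot() / Double(n)
    }
}

// Assumed test signal: 64 samples of a 1000 Hz sine at an 8000 Hz sample rate.
let sampleRate = 8000.0
let n = 64
let tone = (0..<n).map { sin(2.0 * Double.pi * 1000.0 * Double($0) / sampleRate) }

let mags = dftMagnitudes(tone)
let peakBin = mags.indices.max(by: { mags[$0] < mags[$1] })!
let peakFrequency = Double(peakBin) * sampleRate / Double(n)
print(peakBin, peakFrequency, mags[peakBin])
```

Comparing an FFT's output for the same input against a reference like this shows quickly whether a discrepancy comes from scaling, packing, or the transform length.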
