I'm trying to dip my feet into the waters of GPU programming for the first time. I thought I'd start out with something simple and use pre-made kernels (hence the MPS) and just try issuing the commands to the GPU.
My attempt was to simply sum up all values between 1 and 1000. I put each value in a 1x1 matrix and used the MPS Matrix Sum.
On my MacBook Pro, this works as I expect it to.
On my iMac, it gives [0.0] as the result. I figured this had to do with memory, since my MacBook Pro uses an iGPU and my iMac a dGPU. However, as far as I can tell, storageModeShared should rule that out. I even tried adding .synchronize() to the result matrix before trying to read from it, even though I'm pretty sure it shouldn't be necessary with storageModeShared.
The code isn't elegant because it's just for quickly understanding how issuing commands with MPS works, and I've been fixing issues for a while without keeping track of structure, but it should still be fairly easy to read; if not, let me know and I'll refactor it.
The print statements are just for debugging, aside from print(output).
I hate to paste so much code, but I'm afraid I can't isolate my issue any further.
import Cocoa
import Quartz
import PlaygroundSupport
import MetalPerformanceShaders
let device = MTLCopyAllDevices()[0]
print(MTLCopyAllDevices())
let shaderKernel = MPSMatrixSum.init(device: device, count: 1000, rows: 1, columns: 1, transpose: false)
var matrixList: [MPSMatrix] = []
var GPUStorageBuffers: [MTLBuffer] = []
for i in 1...1000 {
    let a = Float32(i)
    var b: [Float32] = []
    let descriptor = MPSMatrixDescriptor(rows: 1, columns: 1, rowBytes: 4, dataType: .float32)
    b.append(a)
    let buffer = device.makeBuffer(bytes: b, length: 4, options: .storageModeShared)
    GPUStorageBuffers.append(buffer!)
    let GPUStoredMatrices = MPSMatrix(buffer: buffer!, descriptor: descriptor)
    matrixList.append(GPUStoredMatrices)
}
let matrices: [MPSMatrix] = matrixList
print(matrices.count)
print("\n")
print(matrices[4].debugDescription)
print("\n")
var printer: [Float32] = []
let pointer2 = matrices[4].data.contents()
let typedPointer2 = pointer2.bindMemory(to: Float32.self, capacity: 1)
let buffpoint2 = UnsafeBufferPointer(start: typedPointer2, count: 1)
buffpoint2.forEach { value in
    printer += [value]
}
print(printer)
let CMDQue = device.makeCommandQueue()
let CMDBuffer = CMDQue!.makeCommandBuffer()
var resultMatrix = MPSMatrix.init(device: device, descriptor: MPSMatrixDescriptor.init(rows: 1, columns: 1, rowBytes: 4, dataType: .float32))
shaderKernel.encode(to: CMDBuffer!, sourceMatrices: matrices, resultMatrix: resultMatrix, scale: nil, offsetVector: nil, biasVector: nil, start: 0)
print(CMDBuffer.debugDescription)
CMDBuffer!.commit()
print(CMDBuffer.debugDescription)
print(CMDQue.debugDescription)
let GPUStartTime = CACurrentMediaTime()
CMDBuffer!.waitUntilCompleted()
var output = [Float32]()
resultMatrix.synchronize(on: CMDBuffer!)
let pointer = resultMatrix.data.contents()
let typedPointer = pointer.bindMemory(to: Float32.self, capacity: 1)
let buffpoint = UnsafeBufferPointer(start: typedPointer, count: 1)
buffpoint.forEach { value in
    output += [value]
}
print(output)
let finish = CACurrentMediaTime() - GPUStartTime
print("\n")
print(finish)
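For completeness, here is roughly the ordering I now suspect is required. This is an untested sketch based on my guess that on the discrete GPU the result matrix ends up in private or managed storage, so synchronize(on:) has to be encoded before commit() rather than called after completion:

```swift
// Untested sketch (my assumption): on a discrete GPU, a matrix created with
// MPSMatrix(device:descriptor:) may not be CPU-visible, so the synchronize
// must be *encoded* while the command buffer is still open,
// not called after the buffer has already been committed.
shaderKernel.encode(to: CMDBuffer!, sourceMatrices: matrices, resultMatrix: resultMatrix, scale: nil, offsetVector: nil, biasVector: nil, start: 0)
resultMatrix.synchronize(on: CMDBuffer!) // encode the blit before commit()
CMDBuffer!.commit()
CMDBuffer!.waitUntilCompleted()
// Only now should resultMatrix.data.contents() be readable on the CPU.
```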
I am trying to generate some random integer data for my app with the GPU using MPSMatrixRandom, and I have two questions.
What is the difference between MPSMatrixRandomMTGP32 and MPSMatrixRandomPhilox?
I understand that these two shaders use different algorithms, but what are the differences between them? Does the performance or output of these two algorithms differ, and if so, how?
What code can you use to implement these shaders?
I tried to implement them myself, but my app consistently crashes with vague error messages. I'd like to see an example implementation of this being done properly.
Here's a sample demonstrating how to generate random matrices using these two kernels:
import Foundation
import Metal
import MetalPerformanceShaders
let device = MTLCreateSystemDefaultDevice()!
let commandQueue = device.makeCommandQueue()!
let rows = 8
let columns = 8
let matrixDescriptor = MPSMatrixDescriptor(rows: rows,
                                           columns: columns,
                                           rowBytes: MemoryLayout<Float>.stride * columns,
                                           dataType: .float32)
let mtMatrix = MPSMatrix(device: device, descriptor: matrixDescriptor)
let phMatrix = MPSMatrix(device: device, descriptor: matrixDescriptor)
let distribution = MPSMatrixRandomDistributionDescriptor.uniformDistributionDescriptor(withMinimum: -1.0, maximum: 1.0)
let mtKernel = MPSMatrixRandomMTGP32(device: device,
                                     destinationDataType: .float32,
                                     seed: 0,
                                     distributionDescriptor: distribution)
let phKernel = MPSMatrixRandomPhilox(device: device,
                                     destinationDataType: .float32,
                                     seed: 0,
                                     distributionDescriptor: distribution)
let commandBuffer = commandQueue.makeCommandBuffer()!
mtKernel.encode(commandBuffer: commandBuffer, destinationMatrix: mtMatrix)
phKernel.encode(commandBuffer: commandBuffer, destinationMatrix: phMatrix)
#if os(macOS)
mtMatrix.synchronize(on: commandBuffer)
phMatrix.synchronize(on: commandBuffer)
#endif
commandBuffer.commit()
commandBuffer.waitUntilCompleted() // Only necessary to ensure GPU->CPU sync for display
print("Mersenne Twister values:")
let mtValues = mtMatrix.data.contents().assumingMemoryBound(to: Float.self)
for row in 0..<rows {
    for col in 0..<columns {
        print("\(mtValues[row * columns + col])", terminator: " ")
    }
    print("")
}
print("")
print("Philox values:")
let phValues = phMatrix.data.contents().assumingMemoryBound(to: Float.self)
for row in 0..<rows {
    for col in 0..<columns {
        print("\(phValues[row * columns + col])", terminator: " ")
    }
    print("")
}
At a high level, MTGP32 is a GPU-optimized variant of the Mersenne Twister, while Philox is a counter-based generator. Beyond that, I can't comment on the statistical properties of these generators; I'd refer you to the papers mentioned in the comments.
I need to get the duration and the end event of a MIDI file. I am using the code below to play the MIDI file. I tried, but couldn't find anything. Thanks in advance.
var s: MusicSequence?
NewMusicSequence(&s)
let midiFilePath = Bundle.main.path(forResource: "CCL-20180308-A-04", ofType: "mid")
let midiFileURL = URL(fileURLWithPath: midiFilePath ?? "")
MusicSequenceFileLoad(s!, midiFileURL as CFURL, MusicSequenceFileTypeID(rawValue: 0)!, [])
var p: MusicPlayer?
NewMusicPlayer(&p)
MusicPlayerSetSequence(p!, s)
MusicPlayerPreroll(p!)
MusicPlayerStart(p!)
usleep(3 * 100 * 100)
var now: MusicTimeStamp = 0
MusicPlayerGetTime(p!, &now)
This will work:
var s: MusicSequence!
NewMusicSequence(&s)
let midiFileURL = Bundle.main.url(forResource: "CCL-20180308-A-04", withExtension: "mid")!
MusicSequenceFileLoad(s!, midiFileURL as CFURL, .midiType, [])
var p: MusicPlayer!
NewMusicPlayer(&p)
MusicPlayerSetSequence(p, s)
MusicPlayerPreroll(p)
MusicPlayerStart(p)
var numTracks: UInt32 = 0
MusicSequenceGetTrackCount(s, &numTracks)
let length = (0..<numTracks).map { (index: UInt32) -> MusicTimeStamp in
    var track: MusicTrack?
    MusicSequenceGetIndTrack(s, index, &track)
    var size = UInt32(MemoryLayout<MusicTimeStamp>.size)
    var scratchLength = MusicTimeStamp(0)
    MusicTrackGetProperty(track!, kSequenceTrackProperty_TrackLength, &scratchLength, &size)
    return scratchLength
}.max() ?? 0
var lengthInSeconds = Float64(0)
MusicSequenceGetSecondsForBeats(s, length, &lengthInSeconds)
self.timer = Timer.scheduledTimer(withTimeInterval: 0.1, repeats: true, block: { (t) in
    var now: MusicTimeStamp = 0
    MusicPlayerGetTime(p, &now)
    var nowInSeconds = Float64(0)
    MusicSequenceGetSecondsForBeats(s, now, &nowInSeconds)
    print("\(nowInSeconds) / \(lengthInSeconds)")
})
The important piece you were missing was to get the total sequence length by finding the length of the longest track. You can get the length of a track using MusicTrackGetProperty() for the kSequenceTrackProperty_TrackLength property.
For what it's worth, CoreMIDI is gnarly enough, especially in Swift, that I think it's worth using a higher level API. Check out AVMIDIPlayer, which is part of AVFoundation. If you need something more sophisticated, you might check out MIKMIDI, which is an open source MIDI library that builds on Core MIDI but adds a ton of additional functionality, and is significantly easier to use. (Disclaimer: I'm the original author and maintainer of MIKMIDI.) With MIKMIDI, you'd do this:
let midiFileURL = Bundle.main.url(forResource: "CCL-20180308-A-04", withExtension: "mid")!
let sequence = try! MIKMIDISequence(fileAt: midiFileURL)
let sequencer = MIKMIDISequencer(sequence: sequence)
sequencer.startPlayback()
self.timer = Timer.scheduledTimer(withTimeInterval: 0.1, repeats: true, block: { (t) in
    let now = sequencer.timeInSeconds(forMusicTimeStamp: sequencer.currentTimeStamp, options: [])
    let length = sequence.durationInSeconds
    print("\(now) / \(length)")
})
Just a little bit simpler! Things get even more interesting if you're trying to do recording, more complex synthesis, routing MIDI to/from external devices, etc.
I'm trying to port the HOWL vocoder synth from AudioKit 2 to the latest version.
https://github.com/dclelland/HOWL
I'm starting with the Vocoder:
https://github.com/dclelland/HOWL/blob/master/HOWL/Models/Audio/Vocoder.swift
I'm not sure how this next bit of code works.
Is the reduce() applying the resonant filter to the audio input consecutively? Is it doing the equivalent of AKResonantFilter(AKResonantFilter(AKResonantFilter(mutedInput)))?
Or is something else going on?
let mutedAudioInput = AKAudioInput() * AKPortamento(input: inputAmplitude, halfTime: 0.001.ak)
let mutedInput = (input + mutedAudioInput) * AKPortamento(input: amplitude, halfTime: 0.001.ak)
let filter = zip(frequencies, bandwidths).reduce(mutedInput) { input, parameters in
    let (frequency, bandwidth) = parameters
    return AKResonantFilter(
        input: input,
        centerFrequency: AKPortamento(input: frequency, halfTime: 0.001.ak),
        bandwidth: AKPortamento(input: bandwidth, halfTime: 0.001.ak)
    )
}
Here is my attempt at porting the vocoder filters:
import AudioKitPlaygrounds
import AudioKit
import AudioKitUI
let mixer = AKMixer()
let sawtooth = AKTable(.sawtooth)
let sawtoothLFO = AKOscillator(waveform: sawtooth, frequency: 130.81, amplitude: 1, detuningOffset: 0.0, detuningMultiplier: 0.0)
let frequencyScale = 1.0
let topFrequencies = zip(voice.æ.formants, voice.α.formants).map { topLeftFrequency, topRightFrequency in
    return 0.5 * (topRightFrequency - topLeftFrequency) + topLeftFrequency
}
let bottomFrequencies = zip(voice.i.formants, voice.u.formants).map { bottomLeftFrequency, bottomRightFrequency in
    return 0.5 * (bottomRightFrequency - bottomLeftFrequency) + bottomLeftFrequency
}
let frequencies = zip(topFrequencies, bottomFrequencies).map { topFrequency, bottomFrequency in
    return (0.5 * (bottomFrequency - topFrequency) + topFrequency) * frequencyScale
}
let bandwidthScale = 1.0
let bandwidths = frequencies.map { frequency in
    return (frequency * 0.02 + 50.0) * bandwidthScale
}
let filteredLFO = AKResonantFilter(sawtoothLFO)
let filter = zip(frequencies, bandwidths).reduce(filteredLFO) { input, parameters in
    let (frequency, bandwidth) = parameters
    return AKResonantFilter(
        input,
        frequency: frequency,
        bandwidth: bandwidth
    )
}
[filter, sawtoothLFO] >>> mixer
filter.start()
sawtoothLFO.play()
I am getting some sound but it isn't quite right. I am not sure if I am taking the right approach.
In particular my question is : is this the right approach to rewriting the bit of code highlighted above?
let filteredLFO = AKResonantFilter(sawtoothLFO)
let filter = zip(frequencies, bandwidths).reduce(filteredLFO) { input, parameters in
    let (frequency, bandwidth) = parameters
    return AKResonantFilter(
        input,
        frequency: frequency,
        bandwidth: bandwidth
    )
}
Is there a more preferred way to do this whole thing, using AKOperation generators? Should I be using AKFormantFilter ? I've experimented with AKFormantFilter and AKVocalTract, but have not been able to get the audio results I wanted. The HOWL app pretty much sounds exactly like what I'm trying to do, which is why I started porting the code. (it's for a "talking" robot game)
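As a sanity check on the reduce() semantics, here is a plain-Swift sketch (no AudioKit; strings stand in for filter nodes) showing that reduce with a seed value does build the nested wrapper chain described above, with the seed at the innermost position:

```swift
// Plain-Swift sketch: reduce wraps each step's result inside the next,
// mimicking how the vocoder chains AKResonantFilter instances.
let frequencies = [400.0, 800.0, 1200.0]
let bandwidths = [60.0, 70.0, 80.0]

let chain = zip(frequencies, bandwidths).reduce("input") { inner, parameters in
    let (frequency, bandwidth) = parameters
    return "Filter(\(inner), f: \(frequency), bw: \(bandwidth))"
}
print(chain)
// Filter(Filter(Filter(input, f: 400.0, bw: 60.0), f: 800.0, bw: 70.0), f: 1200.0, bw: 80.0)
```

So yes, each filter wraps the previous one, exactly as if the constructors had been nested by hand.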
I am trying to compute a Fast Fourier Transform, basing my calculation on the Surge library, but I am having trouble getting correct results. When I take the FFT of a 1000 Hz tone, the output looks wrong; when I run the same tone through Python, the spectrum looks far more reasonable. The Python code looks like:
import numpy as np
import scipy.io.wavfile
import numpy.fft
import matplotlib.pyplot as plt
FILENAME = 'beep.wav'
fs, data = scipy.io.wavfile.read(FILENAME)
data = data[:801]
spacing = 1 / float(fs)
freq = numpy.fft.rfft(data)
freq_power = np.abs(freq)
a = 1 / (2 * spacing)
b = (len(data) + 1) // 2
freq_axis = np.linspace(0, a, b)
plt.plot(freq_axis, freq_power)
plt.show()
The swift code looks like
import Accelerate
public func sqrt(x: [Float]) -> [Float] {
    var results = [Float](repeating: 0.0, count: x.count)
    vvsqrtf(&results, x, [Int32(x.count)])
    return results
}
public func fft(input: [Float]) -> [Float] {
    var real = [Float](input)
    var imaginary = [Float](repeating: 0.0, count: input.count)
    var splitComplex = DSPSplitComplex(realp: &real, imagp: &imaginary)
    let length = vDSP_Length(floor(log2(Float(input.count))))
    let radix = FFTRadix(kFFTRadix2)
    let weights = vDSP_create_fftsetup(length, radix)
    print(weights)
    vDSP_fft_zip(weights!, &splitComplex, 1, 8, FFTDirection(FFT_FORWARD))
    var magnitudes = [Float](repeating: 0.0, count: input.count)
    vDSP_zvmags(&splitComplex, 1, &magnitudes, 1, vDSP_Length(input.count))
    var normalizedMagnitudes = [Float](repeating: 0.0, count: input.count)
    vDSP_vsmul(sqrt(x: magnitudes), 1, [2.0 / Float(input.count)], &normalizedMagnitudes, 1, vDSP_Length(input.count))
    vDSP_destroy_fftsetup(weights)
    return normalizedMagnitudes
}
To reiterate. The swift code is the code giving unexpected results. What am I doing wrong?
It looks like you are using Swift Float arrays with the Accelerate framework, but you might instead need to allocate your vectors using UnsafeMutablePointer<Float> types, since Accelerate is an Objective-C framework. Here is an example of how to do this.
public func sqrt(x: [Float]) -> [Float] {
    // Convert the Swift array to a C vector
    let temp = UnsafeMutablePointer<Float>.allocate(capacity: x.count)
    for i in 0..<x.count {
        temp[i] = x[i]
    }
    let count = UnsafeMutablePointer<Int32>.allocate(capacity: 1)
    count[0] = Int32(x.count)
    vvsqrtf(temp, temp, count)
    // Convert the C vector back to a Swift array
    var results = [Float](repeating: 0.0, count: x.count)
    for i in 0..<x.count {
        results[i] = temp[i]
    }
    // Free memory
    count.deallocate()
    temp.deallocate()
    return results
}
It will work out better for performance to use the UnsafeMutablePointer<Float> types throughout your code for your vectors of data rather than converting back and forth in function calls as I did for this example. Also you should save your FFT setup and reuse that as well for better performance.
Since you're using the vDSP FFT, you might also like the vDSP_zvabs API, which calculates magnitudes from the FFT results (and vDSP_vdbcon if you want them converted to dB).
Finally be sure to read this link on data packing and scaling for the Accelerate framework FFT APIs.
https://developer.apple.com/library/mac/documentation/Performance/Conceptual/vDSP_Programming_Guide/UsingFourierTransforms/UsingFourierTransforms.html
To improve performance, the vDSP APIs do not output the most obvious scale values (since you will undoubtedly be scaling the data anyway somewhere else) and they pack in some extra data into a few of the FFT points.
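For what it's worth, here is a rough sketch of the real-input path in current Swift using vDSP_fft_zrip rather than vDSP_fft_zip. The function name and the normalization choice are mine, and the input length is assumed to be a power of two; treat it as a starting point, not a drop-in fix:

```swift
import Accelerate

// Sketch: real-input FFT magnitudes via vDSP_fft_zrip.
// Real input must first be packed into split-complex form with vDSP_ctoz,
// and zrip's forward output carries an extra factor of 2 (see the vDSP
// data-packing documentation), which the final scaling accounts for.
func fftMagnitudes(_ input: [Float]) -> [Float] {
    let log2n = vDSP_Length(log2(Float(input.count)))
    guard let setup = vDSP_create_fftsetup(log2n, FFTRadix(kFFTRadix2)) else { return [] }
    defer { vDSP_destroy_fftsetup(setup) }

    let halfN = input.count / 2
    var real = [Float](repeating: 0, count: halfN)
    var imag = [Float](repeating: 0, count: halfN)
    var magnitudes = [Float](repeating: 0, count: halfN)

    real.withUnsafeMutableBufferPointer { realPtr in
        imag.withUnsafeMutableBufferPointer { imagPtr in
            var split = DSPSplitComplex(realp: realPtr.baseAddress!, imagp: imagPtr.baseAddress!)
            // Pack even/odd real samples into the split-complex buffers.
            input.withUnsafeBytes {
                vDSP_ctoz($0.bindMemory(to: DSPComplex.self).baseAddress!, 2, &split, 1, vDSP_Length(halfN))
            }
            vDSP_fft_zrip(setup, &split, 1, log2n, FFTDirection(FFT_FORWARD))
            vDSP_zvmags(&split, 1, &magnitudes, 1, vDSP_Length(halfN))
        }
    }
    // zvmags yields squared magnitudes; take the square root and undo zrip's scaling.
    return magnitudes.map { sqrt($0) / (2 * Float(input.count)) }
}
```

Note that this avoids the hard-coded log2n of 8 in the question's code, which forces a 256-point transform regardless of the input length.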
Following the good instructions I found here: https://github.com/haginile/SwiftAccelerate, I verified that matrix inversion works; in fact, it did for the example given.
But I get an EXC_BAD_ACCESS error for any matrix bigger than 2x2. For example, the following 5x5 matrix (flattened to a 1D array) inverts successfully in MATLAB and Python, but does not work here:
m = [0.55481645013013, -1.15522603580724, 0.962090414322894, -0.530226035807236, 0.168545207161447, -0.38627124296868, 0.93401699437494, -0.999999999999995, 0.684016994374945, -0.23176274578121, 0.123606797749979, -0.323606797749979, 0.432893622827287, -0.323606797749979, 0.123606797749979, 0.231762745781211, -0.684016994374948, 1.0, -0.934016994374947, 0.386271242968684, 0.168545207161448, -0.530226035807237, 0.962090414322895, -1.15522603580724, 0.554816450130132]
Its inverted matrix should be
inv(AA)
ans =
Columns 1 through 3
-262796763616197 -656991909040516 4.90007819375216
-162417332048282 -406043330120712 14.6405748712708
0.718958226823704 7.87760147961979 30.4010295628018
162417332048287 406043330120730 46.1614842543337
262796763616208 656991909040536 55.9019809318537
Columns 4 through 5
-656991909040528 262796763616211
-406043330120721 162417332048287
-4.28281034550088 -0.718958226823794
406043330120704 -162417332048283
656991909040497 -262796763616196
Could you please suggest another way to invert a matrix in Swift, or explain how to fix this?
I really don't understand why it doesn't work.
It doesn't work because the instructions that you found are not so good. Specifically, both pivots and workspace need to be Arrays, not scalar values; it was only working for two-by-two matrices by random chance.
Here's a modified version of the invert function that allocates the workspaces correctly:
func invert(matrix: [Double]) -> [Double] {
    var inMatrix = matrix
    var N = __CLPK_integer(sqrt(Double(matrix.count)))
    var pivots = [__CLPK_integer](count: Int(N), repeatedValue: 0)
    var workspace = [Double](count: Int(N), repeatedValue: 0.0)
    var error: __CLPK_integer = 0
    dgetrf_(&N, &N, &inMatrix, &N, &pivots, &error)
    dgetri_(&N, &inMatrix, &N, &pivots, &workspace, &N, &error)
    return inMatrix
}
I should also note that your 5x5 matrix is extremely ill-conditioned, so even when you can compute the "inverse" the error of that computation will be very large, and the inverse really shouldn't be used.
A Swift 4 version:
func invert(matrix: [Double]) -> [Double] {
    var inMatrix = matrix
    var N = __CLPK_integer(sqrt(Double(matrix.count)))
    var pivots = [__CLPK_integer](repeating: 0, count: Int(N))
    var workspace = [Double](repeating: 0.0, count: Int(N))
    var error: __CLPK_integer = 0
    withUnsafeMutablePointer(to: &N) {
        dgetrf_($0, $0, &inMatrix, $0, &pivots, &error)
        dgetri_($0, &inMatrix, $0, &pivots, &workspace, $0, &error)
    }
    return inMatrix
}
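A small refinement worth considering (my addition, not part of the original answer): both LAPACK routines report failure through their final info argument, which the versions above ignore. A guarded sketch that surfaces singular matrices instead of returning garbage:

```swift
import Accelerate

// Variant that checks the LAPACK info codes. A positive error value from
// dgetrf_ means a zero pivot was hit, i.e. the matrix is singular
// to working precision, so no inverse exists.
func invertChecked(matrix: [Double]) -> [Double]? {
    var inMatrix = matrix
    var N = __CLPK_integer(sqrt(Double(matrix.count)))
    var pivots = [__CLPK_integer](repeating: 0, count: Int(N))
    var workspace = [Double](repeating: 0.0, count: Int(N))
    var error: __CLPK_integer = 0
    withUnsafeMutablePointer(to: &N) {
        dgetrf_($0, $0, &inMatrix, $0, &pivots, &error)
    }
    guard error == 0 else { return nil } // factorization failed
    withUnsafeMutablePointer(to: &N) {
        dgetri_($0, &inMatrix, $0, &pivots, &workspace, $0, &error)
    }
    return error == 0 ? inMatrix : nil
}
```

For an ill-conditioned matrix like the one in the question, the calls can still succeed numerically while the result is dominated by rounding error, so a nil check is no substitute for looking at the condition number.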
I have written a library for linear algebra in Swift. I call this library swix and it includes functions to invert matrices (this function is called inv).
Example use case:
var b = ones(10)
var A = rand((10, 10))
var AI = inv(A)
var x = AI.dot(b)
Source: https://github.com/stsievert/swix
Documentation: http://scottsievert.com/swix/