In Python, I trained an image classification model with Keras that takes a [224, 224, 3] array as input and outputs a prediction (1 or 0). When I save the model and load it into Xcode, it states that the input has to be in MLMultiArray format.
Is there a way for me to convert a UIImage into MLMultiArray format? Or is there a way to change my Keras model to accept CVPixelBuffer objects as input?
In your Core ML conversion script you can supply the parameter image_input_names='data' where data is the name of your input.
Now Core ML will treat this input as an image (CVPixelBuffer) instead of a multi-array.
When you convert the Caffe model to an MLModel, you need to add this line:
image_input_names = 'data'
Taking my own conversion script as an example, it should look like this:
import coremltools

coreml_model = coremltools.converters.caffe.convert(('gender_net.caffemodel',
                                                     'deploy_gender.prototxt'),
                                                    image_input_names='data',
                                                    class_labels='genderLabel.txt')
coreml_model.save('GenderMLModel.mlmodel')
And then your MLModel's input will be a CVPixelBufferRef instead of an MLMultiArray. Converting a UIImage to a CVPixelBufferRef is then straightforward.
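For reference, here is a minimal sketch of such a conversion (not from the original answer; the 224x224 size matches the asker's model, and the helper name is made up):
import UIKit
import CoreVideo

// Hypothetical helper: draws a UIImage into a newly created 224x224 CVPixelBuffer.
func pixelBuffer(from image: UIImage, width: Int = 224, height: Int = 224) -> CVPixelBuffer? {
    let attrs = [kCVPixelBufferCGImageCompatibilityKey: kCFBooleanTrue,
                 kCVPixelBufferCGBitmapContextCompatibilityKey: kCFBooleanTrue] as CFDictionary
    var buffer: CVPixelBuffer?
    guard CVPixelBufferCreate(kCFAllocatorDefault, width, height,
                              kCVPixelFormatType_32ARGB, attrs, &buffer) == kCVReturnSuccess,
          let pixelBuffer = buffer else { return nil }

    CVPixelBufferLockBaseAddress(pixelBuffer, [])
    defer { CVPixelBufferUnlockBaseAddress(pixelBuffer, []) }

    guard let cgImage = image.cgImage,
          let context = CGContext(data: CVPixelBufferGetBaseAddress(pixelBuffer),
                                  width: width, height: height,
                                  bitsPerComponent: 8,
                                  bytesPerRow: CVPixelBufferGetBytesPerRow(pixelBuffer),
                                  space: CGColorSpaceCreateDeviceRGB(),
                                  bitmapInfo: CGImageAlphaInfo.noneSkipFirst.rawValue)
    else { return nil }

    // Scale the image to fill the buffer; aspect-ratio handling is up to the caller.
    context.draw(cgImage, in: CGRect(x: 0, y: 0, width: width, height: height))
    return pixelBuffer
}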
I haven't tried this, but here is how it's done in the Food101 sample:
func preprocess(image: UIImage) -> MLMultiArray? {
    // resize(to:) and pixelData() are helper extensions from the linked repository.
    let size = CGSize(width: 299, height: 299)
    guard let pixels = image.resize(to: size).pixelData()?.map({ (Double($0) / 255.0 - 0.5) * 2 }) else {
        return nil
    }
    guard let array = try? MLMultiArray(shape: [3, 299, 299], dataType: .double) else {
        return nil
    }
    // Deinterleave the RGBA pixel data into channel-first (3 x 299 x 299) order.
    let r = pixels.enumerated().filter { $0.offset % 4 == 0 }.map { $0.element }
    let g = pixels.enumerated().filter { $0.offset % 4 == 1 }.map { $0.element }
    let b = pixels.enumerated().filter { $0.offset % 4 == 2 }.map { $0.element }
    let combination = r + g + b
    for (index, element) in combination.enumerated() {
        array[index] = NSNumber(value: element)
    }
    return array
}
https://github.com/ph1ps/Food101-CoreML
I've got an array of audio files that I want to normalize so they all have similar perceived loudness. For testing purposes, I decided to adapt the AVAudioPCMBuffer.normalize method from AudioKit to suit my purposes. See here for the implementation: https://github.com/AudioKit/AudioKit/blob/main/Sources/AudioKit/Audio%20Files/AVAudioPCMBuffer%2BProcessing.swift
I am converting each file into an AVAudioPCMBuffer, and then performing a reduce on that array of buffers to get the highest peak across all of the buffers. Then I created a new version of normalize, called normalize(with peakAmplitude: Float) -> AVAudioPCMBuffer, that takes that peak amplitude, calculates a gainFactor, and then iterates through the floatData for each channel, multiplying each sample by the gainFactor. Finally, I call this new flavor of normalize with the peak.amplitude that I get from the reduce operation over all the audio buffers.
This produces useful results, sometimes.
Here's the actual code in question:
extension AVAudioPCMBuffer {
    public func normalize(with peakAmplitude: Float) -> AVAudioPCMBuffer {
        guard let floatData = floatChannelData else { return self }
        let gainFactor: Float = 1 / peakAmplitude
        let length: AVAudioFrameCount = frameLength
        let channelCount = Int(format.channelCount)
        // i is the index in the buffer
        for i in 0 ..< Int(length) {
            // n is the channel
            for n in 0 ..< channelCount {
                let sample = floatData[n][i] * gainFactor
                self.floatChannelData?[n][i] = sample
            }
        }
        self.frameLength = length
        return self
    }
}
extension Array where Element == AVAudioPCMBuffer {
    public func normalized() -> [AVAudioPCMBuffer] {
        var minPeak = AVAudioPCMBuffer.Peak()
        minPeak.amplitude = AVAudioPCMBuffer.Peak.min
        let maxPeakForAllBuffers: AVAudioPCMBuffer.Peak = reduce(minPeak) { result, buffer in
            guard
                let currentBufferPeak = buffer.peak(),
                currentBufferPeak.amplitude > result.amplitude
            else {
                return result
            }
            return currentBufferPeak
        }
        return map { $0.normalize(with: maxPeakForAllBuffers.amplitude) }
    }
}
Three questions:
Is my approach reasonable for multiple files?
This appears to be using "peak normalization" rather than RMS or EBU R128 normalization. Is that why, when I give it a batch of 3 audio files, 2 of them are correctly made louder but the third is also made louder, even though ffmpeg-normalize on the same batch of files makes that file significantly quieter?
Any other suggestions on ways to alter the floatData across multiple AVAudioPCMBuffers in order to make them have similar perceived loudness?
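Not part of the question, but as a rough sketch of the peak-versus-loudness distinction raised in question 2, an RMS-based gain for a single buffer could be computed along these lines (EBU R128, which ffmpeg-normalize uses by default, additionally applies K-weighting and gating, so its results will still differ):
import AVFoundation

// Root-mean-square level across all channels of a buffer.
func rms(of buffer: AVAudioPCMBuffer) -> Float {
    guard let channels = buffer.floatChannelData else { return 0 }
    let frames = Int(buffer.frameLength)
    let channelCount = Int(buffer.format.channelCount)
    guard frames > 0, channelCount > 0 else { return 0 }

    var sumOfSquares: Float = 0
    for channel in 0 ..< channelCount {
        for frame in 0 ..< frames {
            let sample = channels[channel][frame]
            sumOfSquares += sample * sample
        }
    }
    return (sumOfSquares / Float(frames * channelCount)).squareRoot()
}

// Gain factor that would move a buffer toward a target RMS level, e.g. -20 dBFS (assumed target).
func gainFactor(for buffer: AVAudioPCMBuffer, targetRMSdB: Float = -20) -> Float {
    let current = rms(of: buffer)
    guard current > 0 else { return 1 }
    let target = Float(pow(10.0, Double(targetRMSdB) / 20.0))
    return target / current
}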
Thanks to this great article (http://machinethink.net/blog/coreml-custom-layers/), I understood how to write a conversion using coremltools and a Lambda with a Keras custom layer.
But I cannot understand the case where the function takes two parameters.
#python
def scaling(x, scale):
    return x * scale
The Keras layer looks like this:
#python
up = conv2d_bn(mixed,
               K.int_shape(x)[channel_axis],
               1,
               activation=None,
               use_bias=True,
               name=name_fmt('Conv2d_1x1'))
x = Lambda(scaling, # HERE !!
           output_shape=K.int_shape(up)[1:],
           arguments={'scale': scale})(up)
x = add([x, up])
In this situation, how can I write func evaluate(inputs: [MLMultiArray], outputs: [MLMultiArray]) in the custom MLCustomLayer class in Swift? I understand the one-parameter case, like this:
#swift
func evaluate(inputs: [MLMultiArray], outputs: [MLMultiArray]) throws {
    for i in 0..<inputs.count {
        let input = inputs[i]
        let output = outputs[i]
        for j in 0..<input.count {
            let x = input[j].floatValue
            let y = x / (1 + exp(-x))
            output[j] = NSNumber(value: y)
        }
    }
}
How about a two-parameter function, like x * scale?
Full code is here.
Converting to Core ML model with custom layer
https://github.com/osmszk/dla_team14/blob/master/facenet/coreml/CoremlTest.ipynb
Network model by Keras
https://github.com/osmszk/dla_team14/blob/master/facenet/code/facenet_keras_v2.py
Thank you.
It looks like scale is a hyperparameter, not a learnable parameter, is that correct?
In that case, you need to add scale to the parameters dictionary for the custom layer. Then in your Swift class, scale will also be inside the parameters dictionary that is passed into your init(parameters) function. Store it inside a property and then in evaluate(inputs, outputs) read from that property again.
My blog post actually shows how to do this. ;-)
I solved this problem in this way, thanks to hollance's blog. In the conversion function, in this case convert_lambda, I should have added a scale parameter for the custom layer.
Python code (converting to Core ML):
def convert_lambda(layer):
    if layer.function == scaling:
        params = NeuralNetwork_pb2.CustomLayerParams()
        params.className = "scaling"
        params.description = "scaling input"
        # HERE!! This is important.
        params.parameters["scale"].doubleValue = layer.arguments['scale']
        return params
    else:
        return None

coreml_model = coremltools.converters.keras.convert(
    model,
    input_names="image",
    image_input_names="image",
    output_names="output",
    add_custom_layers=True,
    custom_conversion_functions={ "Lambda": convert_lambda })
Swift code (custom layer):
// custom MLCustomLayer `scaling` class
let scale: Float

required init(parameters: [String : Any]) throws {
    if let scale = parameters["scale"] as? Float {
        self.scale = scale
    } else {
        self.scale = 1.0
    }
    print(#function, parameters, self.scale)
    super.init()
}

func evaluate(inputs: [MLMultiArray], outputs: [MLMultiArray]) throws {
    for i in 0..<inputs.count {
        let input = inputs[i]
        let output = outputs[i]
        for j in 0..<input.count {
            let x = input[j].floatValue
            let y = x * self.scale
            output[j] = NSNumber(value: y)
        }
        // faster:
        /*
        let count = input.count
        let inputPointer = UnsafeMutablePointer<Float>(OpaquePointer(input.dataPointer))
        let outputPointer = UnsafeMutablePointer<Float>(OpaquePointer(output.dataPointer))
        var scale = self.scale
        vDSP_vsmul(inputPointer, 1, &scale, outputPointer, 1, vDSP_Length(count))
        */
    }
}
Thank you.
I am currently using the map method with a closure in Swift to extract linear factors from an array and calculate a list of musical frequencies spanning one octave.
let tonic: Double = 261.626 // middle C
let factors = [ 1.0, 1.125, 1.25, 1.333, 1.5, 1.625, 1.875]
let frequencies = factors.map { $0 * tonic }
print(frequencies)
// [261.62599999999998, 294.32925, 327.03249999999997, 348.74745799999994, 392.43899999999996, 425.14224999999999, 490.54874999999993]
I want to do this by making the closure extract two integers from a string and divide them to form each factor. The string comes from an SCL tuning file and might look something like this:
// C D E F G A B
let ratios = [ "1/1", "9/8", "5/4", "4/3", "3/2", "27/16", "15/8"]
Can this be done?
SOLUTION
Thankfully, yes it can. In three Swift statements, tuning ratios that have been written as fractions since before Ptolemy can be converted into precise frequencies. A slight modification to the accepted answer makes it possible to derive the list of frequencies. Here is the code.
import UIKit

class ViewController: UIViewController {

    // Diatonic scale
    let ratios = [ "1/1", "9/8", "5/4", "4/3", "3/2", "27/16", "15/8"]
    // Mohajira scale
    // let ratios = [ "21/20", "9/8", "6/5", "49/40", "4/3", "7/5", "3/2", "8/5", "49/30", "9/5", "11/6", "2/1"]

    override func viewDidLoad() {
        super.viewDidLoad()
        _ = Tuning(ratios: ratios)
    }
}
Tuning Class
import UIKit

class Tuning {

    let tonic = 261.626 // frequency of middle C (in Hertz)
    var ratios = [String]()

    init(ratios: [String]) {
        self.ratios = ratios
        let frequencies = ratios.map { s -> Double in
            let integers = s.characters.split(separator: "/").map(String.init).map({ Double($0) })
            return (integers[0]!/integers[1]!) * tonic
        }
        print("// \(frequencies)")
    }
}
And here is the list of frequencies in Hertz corresponding to notes of the diatonic scale
C D E F G A B
[261.626007, 294.329254, 327.032501, 348.834686, 392.439026, 441.493896, 490.548767]
It works for other scales with pitches not usually found on a black-and-white-note music keyboard
Mohajira scale created by Jacques Dudon
// D F G C'
let ratios = [ "21/20", "9/8", "6/5", "49/40", "4/3", "7/5", "3/2", "8/5", "49/30", "9/5", "11/6", "2/1"]
And here is a list of frequencies produced
// D F G C'
// [274.70729999999998, 294.32925, 313.95119999999997, 320.49185, 348.83466666666664, 366.27639999999997, 392.43899999999996, 418.60159999999996, 427.32246666666663, 470.92679999999996, 479.64766666666662, 523.25199999999995]
Disclaimer
Currently the closure only handles rational scales. To fully comply with the Scala SCL format it must also be able to distinguish between strings containing fractions and strings containing a decimal point, interpreting the latter as cents, i.e. logarithmic rather than linear factors.
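As a rough sketch of that cents branch (not part of the original code): a value of c cents corresponds to a linear factor of 2^(c/1200), so a hypothetical helper might look like this:
import Foundation

// Interprets an SCL line containing a decimal point (cents) as a linear factor.
func factor(fromCents string: String) -> Double? {
    guard let cents = Double(string) else { return nil }
    return pow(2.0, cents / 1200.0)
}

// factor(fromCents: "1200.0") == 2.0, i.e. one octave above the tonic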
Thank you KangKang Adrian and Atem
let ratios = [ "1/1", "9/8", "5/4", "4/3", "3/2", "27/16", "15/8"]

let factors = ratios.map { s -> Float in
    let integers = s.characters.split(separator: "/").map(String.init).map({ Float($0) })
    return integers[0]!/integers[1]!
}
If I understand your question, you can do something like this:
func linearFactors(from string: String) -> Double? {
    let components = string.components(separatedBy: "/").flatMap { Double($0) }
    if let numerator = components.first, let denominator = components.last {
        return numerator / denominator
    }
    return nil
}
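Not part of the original answer, but applied to the ratios from the question it could be used like this (compactMap is the newer name for the optional-removing flavor of flatMap):
let ratios = [ "1/1", "9/8", "5/4", "4/3", "3/2", "27/16", "15/8"]
let factors = ratios.compactMap { linearFactors(from: $0) }
let frequencies = factors.map { $0 * 261.626 }
// frequencies for one octave starting at middle C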
Convert the ratios to an array of Double:
let ratios = [ "1/1", "9/8", "5/4", "4/3", "3/2", "27/16", "15/8"]

let array = ratios.flatMap { element -> Double? in
    let parts = element.components(separatedBy: "/")
    guard parts.count == 2,
        let dividend = Double(parts[0]),
        let divisor = Double(parts[1]),
        divisor != 0
    else {
        return nil
    }
    return dividend / divisor
}
I have an array of images to be submitted.
var images = [NSData]()
Before I submit these images I need to check their total size, because of a server limitation.
I've tried the following code, but it's not giving me the actual size.
if (images.description.lengthOfBytesUsingEncoding(NSUTF32StringEncoding) >= 3900000) {
    print("Max of images size reached")
} else {
    // Continue
}
Since you are looking for the total size of all NSData elements of the array, you need to compute the aggregate length. One way of doing it is with reduce:
let totalLength = images.reduce(0) { $0 + $1.length }
This is a short way of writing a loop:
var totalLength = 0
for image in images {
    totalLength += image.length
}
Try this:
let totalLength = images.reduce(0) { $0 + $1.length }
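Not in the original answers, but tying that total back to the asker's size check would look something like:
let totalLength = images.reduce(0) { $0 + $1.length }
if totalLength >= 3900000 {
    print("Max of images size reached")
} else {
    // Continue
}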
var mentions = ["#alex", "#jason", "#jessica", "#john"]
I want to limit my array to 3 items, so I want to splice it:
var slice = [String]()
if mentions.count > 3 {
    slice = mentions[0...3] //alex, jason, jessica
} else {
    slice = mentions
}
However, I'm getting:
Ambiguous subscript with base type '[String]' and index type 'Range'
Apple Swift version 2.2 (swiftlang-703.0.18.8 clang-703.0.31)
Target: x86_64-apple-macosx10.9
The problem is that mentions[0...3] returns an ArraySlice<String>, not an Array<String>. Therefore you could first use the Array(_:) initialiser in order to convert the slice into an array:
let first3Elements : [String] // An Array of up to the first 3 elements.
if mentions.count >= 3 {
    first3Elements = Array(mentions[0 ..< 3])
} else {
    first3Elements = mentions
}
Or if you want to use an ArraySlice (they are useful for intermediate computations, as they present a 'view' onto the original array, but are not designed for long term storage), you could subscript mentions with the full range of indices in your else:
let slice : ArraySlice<String> // An ArraySlice of up to the first 3 elements
if mentions.count >= 3 {
    slice = mentions[0 ..< 3]
} else {
    slice = mentions[mentions.indices] // in Swift 4: slice = mentions[...]
}
Although the simplest solution by far would be just to use the prefix(_:) method, which will return an ArraySlice of the first n elements, or a slice of the entire array if n exceeds the array count:
let slice = mentions.prefix(3) // ArraySlice of up to the first 3 elements
We can do it like this:
let arr = [10,20,30,40,50]
let slicedArray = arr[1...3]
If you want to convert the sliced array to a normal array:
let arrayOfInts = Array(slicedArray)
You can try .prefix().
Returns a subsequence, up to the specified maximum length, containing the initial elements of the collection.
If the maximum length exceeds the number of elements in the collection, the result contains all the elements in the collection.
let numbers = [1, 2, 3, 4, 5]
print(numbers.prefix(2)) // Prints "[1, 2]"
print(numbers.prefix(10)) // Prints "[1, 2, 3, 4, 5]"
General solution:
extension Array {
    func slice(size: Int) -> [[Element]] {
        guard size > 0 else { return [] }
        // Walk the array in steps of `size`, clamping the final chunk to the array's end.
        return stride(from: 0, to: count, by: size).map {
            Array(self[$0 ..< Swift.min($0 + size, count)])
        }
    }
}
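A quick usage example of the extension above:
let chunks = [1, 2, 3, 4, 5].slice(size: 2)
// [[1, 2], [3, 4], [5]]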
You can also look at the dropLast() function:
var mentions: [String] = ["#alex", "#jason", "#jessica", "#john"]

var slice: [String] = mentions
if mentions.count > 3 {
    slice = Array(mentions.dropLast(mentions.count - 3))
}
//print(slice) => ["#alex", "#jason", "#jessica"]
I came up with this:
public extension Array {
    func slice(count: Int) -> [some Collection] {
        let n = self.count / count // quotient
        let i = n * count          // index where the remainder starts
        let r = self.count % count // remainder
        let slices = (0..<n).map { $0 * count }.map { self[$0 ..< $0 + count] }
        return (r > 0) ? slices + [self[i ..< i + r]] : slices
    }
}
You can also slice like this:
//Generic Method
func slice<T>(arrayList: [T], limit: Int) -> [T] {
    // Note: this traps if limit exceeds arrayList.count.
    return Array(arrayList[..<limit])
}

//How to Use
let firstThreeElements = slice(arrayList: ["#alex", "#jason", "#jessica", "#john"], limit: 3)
Array slice func extension:
extension Array {
    func slice(with sliceSize: Int) -> [[Element]] {
        guard self.count > 0 else { return [] }
        var range = self.count / sliceSize
        if self.count.isMultiple(of: sliceSize) {
            range -= 1
        }
        return (0...range).map { Array(self[($0 * sliceSize)..<(Swift.min(($0 + 1) * sliceSize, self.count))]) }
    }
}
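A quick usage example:
let parts = [1, 2, 3, 4, 5].slice(with: 2)
// [[1, 2], [3, 4], [5]]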