Tensorflow lite image output different between python and iOS/Android - ios

I convert Keras model to TF lite the output dimension is (1, 256, 256, 1).
the result on python is correct, but when I try to construct the image on ios swift the result is wrong.
Here is the code, that I use to construct an UIImage from a list of output.
// helper function
---------------------------------------
// MARK: - Extensions
extension Data {
init<T>(copyingBufferOf array: [T]) {
self = array.withUnsafeBufferPointer(Data.init)
}
/// Convert a Data instance to Array representation.
func toArray<T>(type: T.Type) -> [T] where T: ExpressibleByIntegerLiteral {
var array = [T](repeating: 0, count: self.count/MemoryLayout<T>.stride)
_ = array.withUnsafeMutableBytes { copyBytes(to: $0) }
return array
}
}
func imageFromSRGBColorArray(pixels: [UInt32], width: Int, height: Int) -> UIImage?
{
guard width > 0 && height > 0 else { return nil }
guard pixels.count == width * height else { return nil }
// Make a mutable copy
var data = pixels
// Convert array of pixels to a CGImage instance.
let cgImage = data.withUnsafeMutableBytes { (ptr) -> CGImage in
let ctx = CGContext(
data: ptr.baseAddress,
width: width,
height: height,
bitsPerComponent: 8,
bytesPerRow: MemoryLayout<UInt32>.size * width,
space: CGColorSpace(name: CGColorSpace.sRGB)!,
bitmapInfo: CGBitmapInfo.byteOrder32Little.rawValue
+ CGImageAlphaInfo.premultipliedFirst.rawValue
)!
return ctx.makeImage()!
}
// Convert the CGImage instance to an UIImage instance.
return UIImage(cgImage: cgImage)
}
let results = outputTensor.data.toArray(type: UInt32.self)
let maskImage = imageFromSRGBColorArray(pixels: results, width: 256, height: 256)
the result I get is completely wrong compared to python.
I think the function imageFromSRGBColorArray is not correct.
can anyone help me to figure out the problem?

Related

My custom metal image filter is slow. How can I make it faster?

I've seen a lot of other's online tutorial that are able to achieve 0.0X seconds mark on filtering an image. Meanwhile my code here took 1.09 seconds to filter an image.(Just to reduce brightness by half).
edit after first comment
time measured with 2 methods
Date() timeinterval , when the button “apply filter” tapped and after the apply filter function is done running
build it on iphone and count manually with my timer on my watch
Since I'm new to metal & kernel stuff, I don't really know the difference between my code and those tutorials that achieve faster result. Which part of my code can be improved/ use different approach to make it a lot faster.
here's my kernel code
#include <metal_stdlib>
using namespace metal;
kernel void black(
texture2d<float, access::write> outTexture [[texture(0)]],
texture2d<float, access::read> inTexture [[texture(1)]],
uint2 id [[thread_position_in_grid]]) {
float3 val = inTexture.read(id).rgb;
float r = val.r / 4;
float g = val.g / 4;
float b = val.b / 2;
float4 out = float4(r, g, b, 1.0);
outTexture.write(out.rgba, id);
}
this is my swift code
import Metal
import MetalKit
// UIImage -> CGImage -> MTLTexture -> COMPUTE HAPPENS |
// UIImage <- CGImage <- MTLTexture <--
class Filter {
var device: MTLDevice
var defaultLib: MTLLibrary?
var grayscaleShader: MTLFunction?
var commandQueue: MTLCommandQueue?
var commandBuffer: MTLCommandBuffer?
var commandEncoder: MTLComputeCommandEncoder?
var pipelineState: MTLComputePipelineState?
var inputImage: UIImage
var height, width: Int
// most devices have a limit of 512 threads per group
let threadsPerBlock = MTLSize(width: 32, height: 32, depth: 1)
init(){
print("initialized")
self.device = MTLCreateSystemDefaultDevice()!
print(device)
//changes: I did do catch try, and use bundle parameter when making make default library
let frameworkBundle = Bundle(for: type(of: self))
print(frameworkBundle)
self.defaultLib = device.makeDefaultLibrary()
self.grayscaleShader = defaultLib?.makeFunction(name: "black")
self.commandQueue = self.device.makeCommandQueue()
self.commandBuffer = self.commandQueue?.makeCommandBuffer()
self.commandEncoder = self.commandBuffer?.makeComputeCommandEncoder()
//ERROR HERE
if let shader = grayscaleShader {
print("in")
self.pipelineState = try? self.device.makeComputePipelineState(function: shader)
} else { fatalError("unable to make compute pipeline") }
self.inputImage = UIImage(named: "stockImage")!
self.height = Int(self.inputImage.size.height)
self.width = Int(self.inputImage.size.width)
}
func getCGImage(from uiimg: UIImage) -> CGImage? {
UIGraphicsBeginImageContext(uiimg.size)
uiimg.draw(in: CGRect(origin: .zero, size: uiimg.size))
let contextImage = UIGraphicsGetImageFromCurrentImageContext()
UIGraphicsEndImageContext()
return contextImage?.cgImage
}
func getMTLTexture(from cgimg: CGImage) -> MTLTexture {
let textureLoader = MTKTextureLoader(device: self.device)
do{
let texture = try textureLoader.newTexture(cgImage: cgimg, options: nil)
let textureDescriptor = MTLTextureDescriptor.texture2DDescriptor(pixelFormat: texture.pixelFormat, width: width, height: height, mipmapped: false)
textureDescriptor.usage = [.shaderRead, .shaderWrite]
return texture
} catch {
fatalError("Couldn't convert CGImage to MTLtexture")
}
}
func getCGImage(from mtlTexture: MTLTexture) -> CGImage? {
var data = Array<UInt8>(repeatElement(0, count: 4*width*height))
mtlTexture.getBytes(&data,
bytesPerRow: 4*width,
from: MTLRegionMake2D(0, 0, width, height),
mipmapLevel: 0)
let bitmapInfo = CGBitmapInfo(rawValue: (CGBitmapInfo.byteOrder32Big.rawValue | CGImageAlphaInfo.premultipliedLast.rawValue))
let colorSpace = CGColorSpaceCreateDeviceRGB()
let context = CGContext(data: &data,
width: width,
height: height,
bitsPerComponent: 8,
bytesPerRow: 4*width,
space: colorSpace,
bitmapInfo: bitmapInfo.rawValue)
return context?.makeImage()
}
func getUIImage(from cgimg: CGImage) -> UIImage? {
return UIImage(cgImage: cgimg)
}
func getEmptyMTLTexture() -> MTLTexture? {
let textureDescriptor = MTLTextureDescriptor.texture2DDescriptor(
pixelFormat: MTLPixelFormat.rgba8Unorm,
width: width,
height: height,
mipmapped: false)
textureDescriptor.usage = [.shaderRead, .shaderWrite]
return self.device.makeTexture(descriptor: textureDescriptor)
}
func getInputMTLTexture() -> MTLTexture? {
if let inputImage = getCGImage(from: self.inputImage) {
return getMTLTexture(from: inputImage)
}
else { fatalError("Unable to convert Input image to MTLTexture") }
}
func getBlockDimensions() -> MTLSize {
let blockWidth = width / self.threadsPerBlock.width
let blockHeight = height / self.threadsPerBlock.height
return MTLSizeMake(blockWidth, blockHeight, 1)
}
func applyFilter() -> UIImage? {
print("start")
let date = Date()
print(date)
if let encoder = self.commandEncoder, let buffer = self.commandBuffer,
let outputTexture = getEmptyMTLTexture(), let inputTexture = getInputMTLTexture() {
encoder.setTextures([outputTexture, inputTexture], range: 0..<2)
encoder.setComputePipelineState(self.pipelineState!)
encoder.dispatchThreadgroups(self.getBlockDimensions(), threadsPerThreadgroup: threadsPerBlock)
encoder.endEncoding()
buffer.commit()
buffer.waitUntilCompleted()
guard let outputImage = getCGImage(from: outputTexture) else { fatalError("Couldn't obtain CGImage from MTLTexture") }
print("stop")
let date2 = Date()
print(date2.timeIntervalSince(date))
return getUIImage(from: outputImage)
} else { fatalError("optional unwrapping failed") }
}
}
In case someone still need the answer, I found a different approach which is make it as custom CIFilter. It works pretty fast and super easy to undestand!
You using UIImage, CGImage. These objects stored in CPU memory.
Need implement code with using just CIImage or MTLTexture.
These object are storing in GPU memory and have best performace.

iOS Core Graphics how to get middle row of CGImage as vImage_Buffer?

I'm trying to process only a slice of (not full) CGImage. I have the code below that starts at 0 and gets the entire image buffer. I need to know how to get the pointer to the middle row, so I can pass it in as a parameter.
How do I get only a center byte row of CGImage ?
extension CGImage {
func processImage() -> UInt {
guard let imgProvider: CGDataProvider = self.dataProvider,
let imgBitmapData: CFData = imgProvider.data else {
return 0
}
let topLeftPixelPointer = UnsafeMutableRawPointer(mutating: CFDataGetBytePtr(imgBitmapData))
var imgBuffer = vImage_Buffer(data: topLeftPixelPointer,
height: vImagePixelCount(height),
width: vImagePixelCount(width),
rowBytes: bytesPerRow)
let specificAddressWithinBuffer = /*??*/ // I'm not sure how to set this to point to a specific row
var sliceOfBuffer = vImage_Buffer(data: /**/,
height: vImagePixelCount(height),
width: vImagePixelCount(width),
rowBytes: bytesPerRow)
}
}

Tensorflow Interpreter throwing error for "data count" iOS

I am using TensorFlowLiteSwift and the model I'm working with is responsible for flattening an image when the image is cropped in a trapezoidal shape.
Now, Tensorflow does not provide much of a documentation. So, I have been trying to implement things from their example projects.
But here is the catch, it throws error saying "Provided data count must match the required count" and the required count is 4. I backtracked the byteCount in Interpreter.swift but could not find the actual setter.
So, is the .tflite model responsible for the "required count?" And if no, then how does this get set?
Here is a chunk of code I think would help understanding my problem:
/// Performs image preprocessing, invokes the `Interpreter`, and processes the inference results.
func runModel(on item: ImageProcessInfo) -> UIImage? {
let rgbData = item.resizedImage.scaledData(with: CGSize(width: 1000, height: 900),
byteCount: inputWidth * inputHeight
* batchSize,
isQuantized: false)
var corner = item.corners.map { $0.map { p -> (Float, Float) in
return (Float(p.x), Float(p.y))
} }
var item = item
guard let height = NSMutableData(capacity: 0) else { return nil }
height.append(&item.originalHeight, length: 4)
guard let width = NSMutableData(capacity: 0) else { return nil }
width.append(&item.originalWidth, length: 4)
guard let corners = NSMutableData(capacity: 0) else { return nil }
corners.append(&corner, length: 4)
do {
try interpreter.copy(rgbData!, toInputAt: 0)
try interpreter.copy(height as Data, toInputAt: 1)
try interpreter.copy(width as Data, toInputAt: 2)
try interpreter.copy(corners as Data, toInputAt: 3)
try interpreter.invoke()
let outputTensor1 = try self.interpreter.output(at: 0)
guard let cgImage = postprocessImageData(data: outputTensor1.data, size: CGSize(width: 1000, height: 900)) else {
return nil
}
let outputImage = UIImage(cgImage: cgImage)
return outputImage
} catch {
dump(error)
return nil
}
}
extension UIImage {
func scaledData(with size: CGSize, byteCount: Int, isQuantized: Bool) -> Data? {
guard let cgImage = self.cgImage, cgImage.width > 0, cgImage.height > 0 else { return nil }
guard let imageData = imageData(from: cgImage, with: size) else { return nil }
var scaledBytes = [UInt8](repeating: 0, count: byteCount)
var index = 0
for component in imageData.enumerated() {
let offset = component.offset
let isAlphaComponent = (offset % 4)
== 3
guard !isAlphaComponent else { continue }
scaledBytes[index] = component.element
index += 1
}
if isQuantized { return Data(scaledBytes) }
let scaledFloats = scaledBytes.map { (Float32($0) - 127.5) / 127.5 }
return Data(copyingBufferOf: scaledFloats)
}
private func imageData(from cgImage: CGImage, with size: CGSize) -> Data? {
let bitmapInfo = CGBitmapInfo(
rawValue: CGBitmapInfo.byteOrder32Big.rawValue | CGImageAlphaInfo.premultipliedLast.rawValue
)
let width = Int(size.width)
let scaledBytesPerRow = (cgImage.bytesPerRow / cgImage.width) * width
guard let context = CGContext(
data: nil,
width: width,
height: Int(size.height),
bitsPerComponent: cgImage.bitsPerComponent,
bytesPerRow: scaledBytesPerRow,
space: CGColorSpaceCreateDeviceRGB(),
bitmapInfo: bitmapInfo.rawValue)
else {
return nil
}
context.draw(cgImage, in: CGRect(origin: .zero, size: size))
return context.makeImage()?.dataProvider?.data as Data?
}
}
#discardableResult
public func copy(_ data: Data, toInputAt index: Int) throws -> Tensor {
let maxIndex = inputTensorCount - 1
guard case 0...maxIndex = index else {
throw InterpreterError.invalidTensorIndex(index: index, maxIndex: maxIndex)
}
guard let cTensor = TfLiteInterpreterGetInputTensor(cInterpreter, Int32(index)) else {
throw InterpreterError.allocateTensorsRequired
}
/* Error here */
let byteCount = TfLiteTensorByteSize(cTensor)
guard data.count == byteCount else {
throw InterpreterError.invalidTensorDataCount(provided: data.count, required: byteCount)
}
#if swift(>=5.0)
let status = data.withUnsafeBytes {
TfLiteTensorCopyFromBuffer(cTensor, $0.baseAddress, data.count)
}
#else
let status = data.withUnsafeBytes { TfLiteTensorCopyFromBuffer(cTensor, $0, data.count) }
#endif // swift(>=5.0)
guard status == kTfLiteOk else { throw InterpreterError.failedToCopyDataToInputTensor }
return try input(at: index)
}
What are the input shapes? Can you identify which one is complaining about the size?
At the first glance, corners.append(&corner, length: 4) seems weird - does corners contain only 1 Float (byte size 4)?
The byteCount for a tensor is filled by underlying C API, and simply returns tensor->bytes for underlying TfLiteTensor struct that is filled in the model loading stage.

Why is CoreML Prediction using over 10 times more RAM on older device?

I am using CoreML style transfer based on the torch2coreml implementation on git. For purposes herein, I have only substituted my mlmodel with an input/output size of 1200 pixels with the sample mlmodels.
This works perfectly on my iPhone 7 plus and uses a maximum of 65.11 MB of RAM. Running the identical code and identical mlmodel on an iPad Mini 2, it uses 758.87 MB of RAM before it crashes with an out of memory error.
Memory allocations on the iPhone 7 Plus:
Memory allocations on the iPad Mini 2:
Running on the iPad Mini, there are two 200 MB and one 197.77 MB Espresso library allocations that are not present on the iPhone 7+. The iPad Mini also uses a 49.39 MB allocation that the iPhone 7+ doesn't use, and three 16.48 MB allocations versus one 16.48 MB allocations on the iPhone 7+ (see screenshots above).
What on earth is going on, and how can I fix it?
Relevant code (download project linked above for full source):
private var inputImage = UIImage(named: "input")!
let imageSize = 1200
private let models = [
test().model
]
#IBAction func styleButtonTouched(_ sender: UIButton) {
guard let image = inputImage.scaled(to: CGSize(width: imageSize, height: imageSize), scalingMode: .aspectFit).cgImage else {
print("Could not get a CGImage")
return
}
let model = models[0] //Use my test model
toggleLoading(show: true)
DispatchQueue.global(qos: .userInteractive).async {
let stylized = self.stylizeImage(cgImage: image, model: model)
DispatchQueue.main.async {
self.toggleLoading(show: false)
self.imageView.image = UIImage(cgImage: stylized)
}
}
}
private func stylizeImage(cgImage: CGImage, model: MLModel) -> CGImage {
let input = StyleTransferInput(input: pixelBuffer(cgImage: cgImage, width: imageSize, height: imageSize))
let outFeatures = try! model.prediction(from: input)
let output = outFeatures.featureValue(for: "outputImage")!.imageBufferValue!
CVPixelBufferLockBaseAddress(output, .readOnly)
let width = CVPixelBufferGetWidth(output)
let height = CVPixelBufferGetHeight(output)
let data = CVPixelBufferGetBaseAddress(output)!
let outContext = CGContext(data: data,
width: width,
height: height,
bitsPerComponent: 8,
bytesPerRow: CVPixelBufferGetBytesPerRow(output),
space: CGColorSpaceCreateDeviceRGB(),
bitmapInfo: CGImageByteOrderInfo.order32Little.rawValue | CGImageAlphaInfo.noneSkipFirst.rawValue)!
let outImage = outContext.makeImage()!
CVPixelBufferUnlockBaseAddress(output, .readOnly)
return outImage
}
private func pixelBuffer(cgImage: CGImage, width: Int, height: Int) -> CVPixelBuffer {
var pixelBuffer: CVPixelBuffer? = nil
let status = CVPixelBufferCreate(kCFAllocatorDefault, width, height, kCVPixelFormatType_32BGRA , nil, &pixelBuffer)
if status != kCVReturnSuccess {
fatalError("Cannot create pixel buffer for image")
}
CVPixelBufferLockBaseAddress(pixelBuffer!, CVPixelBufferLockFlags.init(rawValue: 0))
let data = CVPixelBufferGetBaseAddress(pixelBuffer!)
let rgbColorSpace = CGColorSpaceCreateDeviceRGB()
let bitmapInfo = CGBitmapInfo(rawValue: CGBitmapInfo.byteOrder32Little.rawValue | CGImageAlphaInfo.noneSkipFirst.rawValue)
let context = CGContext(data: data, width: width, height: height, bitsPerComponent: 8, bytesPerRow: CVPixelBufferGetBytesPerRow(pixelBuffer!), space: rgbColorSpace, bitmapInfo: bitmapInfo.rawValue)
context?.draw(cgImage, in: CGRect(x: 0, y: 0, width: width, height: height))
CVPixelBufferUnlockBaseAddress(pixelBuffer!, CVPixelBufferLockFlags(rawValue: 0))
return pixelBuffer!
}
class StyleTransferInput : MLFeatureProvider {
/// input as color (kCVPixelFormatType_32BGRA) image buffer, 720 pixels wide by 720 pixels high
var input: CVPixelBuffer
var featureNames: Set<String> {
get {
return ["inputImage"]
}
}
func featureValue(for featureName: String) -> MLFeatureValue? {
if (featureName == "inputImage") {
return MLFeatureValue(pixelBuffer: input)
}
return nil
}
init(input: CVPixelBuffer) {
self.input = input
}
}

Pixel Array to UIImage in Swift

I've been trying to figure out how to convert an array of rgb pixel data to a UIImage in Swift.
I'm keeping the rgb data per pixel in a simple struct:
public struct PixelData {
var a: Int
var r: Int
var g: Int
var b: Int
}
I've made my way to the following function, but the resulting image is incorrect:
func imageFromARGB32Bitmap(pixels:[PixelData], width: Int, height: Int)-> UIImage {
let rgbColorSpace = CGColorSpaceCreateDeviceRGB()
let bitmapInfo:CGBitmapInfo = CGBitmapInfo(CGImageAlphaInfo.PremultipliedFirst.rawValue)
let bitsPerComponent:Int = 8
let bitsPerPixel:Int = 32
assert(pixels.count == Int(width * height))
var data = pixels // Copy to mutable []
let providerRef = CGDataProviderCreateWithCFData(
NSData(bytes: &data, length: data.count * sizeof(PixelData))
)
let cgim = CGImageCreate(
width,
height,
bitsPerComponent,
bitsPerPixel,
width * Int(sizeof(PixelData)),
rgbColorSpace,
bitmapInfo,
providerRef,
nil,
true,
kCGRenderingIntentDefault
)
return UIImage(CGImage: cgim)!
}
Any tips or pointers on how to properly convert an rgb array to an UIImage?
Note: This is a solution for iOS creating a UIImage. For a solution for macOS and NSImage, see this answer.
Your only problem is that the data types in your PixelData structure need to be UInt8. I created a test image in a Playground with the following:
public struct PixelData {
var a: UInt8
var r: UInt8
var g: UInt8
var b: UInt8
}
var pixels = [PixelData]()
let red = PixelData(a: 255, r: 255, g: 0, b: 0)
let green = PixelData(a: 255, r: 0, g: 255, b: 0)
let blue = PixelData(a: 255, r: 0, g: 0, b: 255)
for _ in 1...300 {
pixels.append(red)
}
for _ in 1...300 {
pixels.append(green)
}
for _ in 1...300 {
pixels.append(blue)
}
let image = imageFromARGB32Bitmap(pixels: pixels, width: 30, height: 30)
Update for Swift 4:
I updated imageFromARGB32Bitmap to work with Swift 4. The function now returns a UIImage? and guard is used to return nil if anything goes wrong.
func imageFromARGB32Bitmap(pixels: [PixelData], width: Int, height: Int) -> UIImage? {
guard width > 0 && height > 0 else { return nil }
guard pixels.count == width * height else { return nil }
let rgbColorSpace = CGColorSpaceCreateDeviceRGB()
let bitmapInfo = CGBitmapInfo(rawValue: CGImageAlphaInfo.premultipliedFirst.rawValue)
let bitsPerComponent = 8
let bitsPerPixel = 32
var data = pixels // Copy to mutable []
guard let providerRef = CGDataProvider(data: NSData(bytes: &data,
length: data.count * MemoryLayout<PixelData>.size)
)
else { return nil }
guard let cgim = CGImage(
width: width,
height: height,
bitsPerComponent: bitsPerComponent,
bitsPerPixel: bitsPerPixel,
bytesPerRow: width * MemoryLayout<PixelData>.size,
space: rgbColorSpace,
bitmapInfo: bitmapInfo,
provider: providerRef,
decode: nil,
shouldInterpolate: true,
intent: .defaultIntent
)
else { return nil }
return UIImage(cgImage: cgim)
}
Making it a convenience initializer for UIImage:
This function works well as a convenience initializer for UIImage. Here is the implementation:
extension UIImage {
convenience init?(pixels: [PixelData], width: Int, height: Int) {
guard width > 0 && height > 0, pixels.count == width * height else { return nil }
var data = pixels
guard let providerRef = CGDataProvider(data: Data(bytes: &data, count: data.count * MemoryLayout<PixelData>.size) as CFData)
else { return nil }
guard let cgim = CGImage(
width: width,
height: height,
bitsPerComponent: 8,
bitsPerPixel: 32,
bytesPerRow: width * MemoryLayout<PixelData>.size,
space: CGColorSpaceCreateDeviceRGB(),
bitmapInfo: CGBitmapInfo(rawValue: CGImageAlphaInfo.premultipliedFirst.rawValue),
provider: providerRef,
decode: nil,
shouldInterpolate: true,
intent: .defaultIntent)
else { return nil }
self.init(cgImage: cgim)
}
}
Here is an example of its usage:
// Generate a 500x500 image of randomly colored pixels
let height = 500
let width = 500
var pixels: [PixelData] = .init(repeating: .init(a: 0, r: 0, g: 0, b: 0), count: width * height)
for index in pixels.indices {
pixels[index].a = 255
pixels[index].r = .random(in: 0...255)
pixels[index].g = .random(in: 0...255)
pixels[index].b = .random(in: 0...255)
}
let image = UIImage(pixels: pixels, width: width, height: height)
Update for Swift 3
struct PixelData {
var a: UInt8 = 0
var r: UInt8 = 0
var g: UInt8 = 0
var b: UInt8 = 0
}
func imageFromBitmap(pixels: [PixelData], width: Int, height: Int) -> UIImage? {
assert(width > 0)
assert(height > 0)
let pixelDataSize = MemoryLayout<PixelData>.size
assert(pixelDataSize == 4)
assert(pixels.count == Int(width * height))
let data: Data = pixels.withUnsafeBufferPointer {
return Data(buffer: $0)
}
let cfdata = NSData(data: data) as CFData
let provider: CGDataProvider! = CGDataProvider(data: cfdata)
if provider == nil {
print("CGDataProvider is not supposed to be nil")
return nil
}
let cgimage: CGImage! = CGImage(
width: width,
height: height,
bitsPerComponent: 8,
bitsPerPixel: 32,
bytesPerRow: width * pixelDataSize,
space: CGColorSpaceCreateDeviceRGB(),
bitmapInfo: CGBitmapInfo(rawValue: CGImageAlphaInfo.premultipliedFirst.rawValue),
provider: provider,
decode: nil,
shouldInterpolate: true,
intent: .defaultIntent
)
if cgimage == nil {
print("CGImage is not supposed to be nil")
return nil
}
return UIImage(cgImage: cgimage)
}

Resources