I seem to be unable to wrap my head around the methodology behind manually accessing image pixel data in Swift. I am attempting to create an image mask from a CGImage that can later be used on a separate image. I want to identify all pixels of a specific value and convert everything else in the image to black/white or maybe alpha (not really important at the moment however). The code I'm playing with looks like this:
let colorSpace: CGColorSpace = CGColorSpaceCreateDeviceRGB()
let contextWidth: Int = Int(snapshot.size.width)
let contextHeight: Int = Int(snapshot.size.height)
let bytesPerPixel: Int = 24
let bitsPerComponent: Int = 8
let bytesPerRow: Int = bytesPerPixel * contextWidth
let bitmapInfo: CGBitmapInfo = CGBitmapInfo(rawValue: CGImageAlphaInfo.noneSkipLast.rawValue)
guard let context: CGContext = CGContext(data: nil, width: contextWidth, height: contextHeight, bitsPerComponent: bitsPerComponent, bytesPerRow: bytesPerRow, space: colorSpace, bitmapInfo: bitmapInfo.rawValue) else {
print("Could not create CGContext")
return
}
context.draw(maskCGImage, in: CGRect(x: 0, y: 0, width: contextWidth, height: contextHeight))
guard let contextDataRaw: UnsafeMutableRawPointer = context.data else {
print("Could not get UnsafeMutableRawPointer from CGContext")
return
}
let contextData: UnsafeMutablePointer<UInt8> = contextDataRaw.bindMemory(to: UInt8.self, capacity: contextWidth * contextHeight)
for row in 0..<contextHeight {
for col in 0..<contextWidth {
let offset = (col * contextHeight) + row
let pixelArray = [contextData[offset], contextData[offset + 1], contextData[offset + 2]]
if pixelArray == [120, 120, 120] {
contextData[offset] = 0
contextData[offset + 1] = 0
contextData[offset + 2] = 0
}
}
}
I have tried various arrangements of the rows and columns trying to identify the correct order, i.e. let offset = (row * contextWidth) + col, let offset = (col * contextHeight) + row, let offset = ((row * contextWidth) + col) * 3, let offset = ((row * contextWidth) + col) * 4.
The output I get looks something like this (Keep in mind that this image IS supposed to look like a blob of random colors):
As my fancy little arrow shows, the black swatch across the top is my edited pixels, and those pixels are indeed supposed to be turned black. However, so are all the other gray pixels (the ones under the arrow, for example). They are definitely the same RGB value of 120, 120, 120.
I know the issue is in the order that I'm moving across the array, I just can't seem to figure out what the pattern is. Also, as a note, using copy(maskingColorComponents:) won't do because I want to remove a few specific colors, not a range of them.
Any help is greatly appreciated as always. Thanks in advance!
You're obviously on the right track because you've correctly hit all the pixels in the top left corner. But you don't keep going the rest of the way down the image; clearly you are not surveying enough rows. So the problem might be merely that you are slightly off in your idea of what a row is.
You are saying
for row in 0..<contextHeight {
for col in 0..<contextWidth {
let offset = (col * contextHeight) + row
as if adding row would in fact get you to that row. But row is just the number of the desired row, not the byte that starts that row; it seems to me that the size of one row jump needs to be the size of all the bytes in one row.
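To make the row jump concrete, here is a minimal sketch that works on a plain byte array rather than a real CGContext. It assumes 4 bytes per pixel (8 bits per component with noneSkipLast gives R, G, B plus one unused byte); the function name blackOut and its parameters are made up for illustration. With a real bitmap context, the row size would probably be best taken from context.bytesPerRow, since rows may be padded.

```swift
/// Sets every pixel whose R, G, B bytes equal `target` to black.
/// Assumes 4 bytes per pixel (R, G, B, unused), as with `noneSkipLast`.
func blackOut(_ pixels: inout [UInt8], width: Int, height: Int,
              bytesPerRow: Int, target: (r: UInt8, g: UInt8, b: UInt8)) {
    let bytesPerPixel = 4
    for row in 0..<height {
        // One row jump is the size of all the bytes in one row.
        let rowStart = row * bytesPerRow
        for col in 0..<width {
            // One column jump is the size of one pixel.
            let offset = rowStart + col * bytesPerPixel
            if pixels[offset] == target.r,
               pixels[offset + 1] == target.g,
               pixels[offset + 2] == target.b {
                pixels[offset] = 0
                pixels[offset + 1] = 0
                pixels[offset + 2] = 0
            }
        }
    }
}

// A 2 x 2 test image: gray, red on the first row; blue, gray on the second.
var pixels: [UInt8] = [
    120, 120, 120, 255,   255, 0, 0, 255,
    0, 0, 255, 255,       120, 120, 120, 255,
]
blackOut(&pixels, width: 2, height: 2, bytesPerRow: 8,
         target: (r: 120, g: 120, b: 120))
```

Both gray pixels end up black, including the one on the second row, because the offset now advances by a whole row of bytes for each row.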
In combination with Core ML, I am trying to show an RGBA byte array in a UIImage using the following code:
CGColorSpaceRef colorSpace = CGColorSpaceCreateDeviceRGB();
CGContextRef context = CGBitmapContextCreate(bytes, width, height, 8, 4 * width, colorSpace, kCGImageAlphaPremultipliedLast);
CFRelease(colorSpace);
CGImageRef cgImage = CGBitmapContextCreateImage(context);
CGContextRelease(context);
UIImage *image = [UIImage imageWithCGImage:cgImage scale:0 orientation:UIImageOrientationUp];
CGImageRelease(cgImage);
dispatch_async(dispatch_get_main_queue(), ^{
[[self predictionView] setImage:image];
});
I create the image data like this:
uint32_t offset = h * width * 4 + w * 4;
struct Color rgba = colors[highestClass];
bytes[offset + 0] = (rgba.r);
bytes[offset + 1] = (rgba.g);
bytes[offset + 2] = (rgba.b);
bytes[offset + 3] = (255 / 2); // semi transparent
The image size is 500px by 500px. However, the full image is not shown; it looks as if the image is zoomed in by 50%.
I started searching for this issue, and found others having the same issue as well. That's why I decided to edit my StoryBoard and set different values for the Content Mode, currently I use Aspect Fit. However, the result remains the same.
I also tried to draw a horizontal line in the center of the image to show how much the image is zoomed in. It confirms that the image is zoomed in 50%.
I wrote the same code in Swift, which works fine. Here is the Swift code and its result:
let offset = h * width * 4 + w * 4
let rgba = colors[highestClass]
bytes[offset + 0] = (rgba.r)
bytes[offset + 1] = (rgba.g)
bytes[offset + 2] = (rgba.b)
bytes[offset + 3] = (255/2) // semi transparent
let image = UIImage.fromByteArray(bytes, width: width, height: height,
scale: 0, orientation: .up,
bytesPerRow: width * 4,
colorSpace: CGColorSpaceCreateDeviceRGB(),
alphaInfo: .premultipliedLast)
https://github.com/hollance/CoreMLHelpers/blob/master/CoreMLHelpers/UIImage%2BCVPixelBuffer.swift
And below is the wrong result in Objective-C. You can see that it's very pixelated compared to the Swift one. The phone is an iPhone 6s.
What am I missing or doing wrong?
I am trying to show a RGB byte array
Then kCGImageAlphaPremultipliedLast is incorrect. Try to switch to kCGImageAlphaNone.
I found my problem. It turned out to have nothing to do with the image code itself. The bug was that the width and height values of 500 do not fit in a uint8_t, which is why the image was shown smaller. Very stupid. Changing the variables to a wide enough type fixed it.
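To illustrate the truncation, here is a small Swift sketch of what happens when 500 is squeezed into 8 bits, which is what an assignment to a uint8_t does in C. The variable names are hypothetical and not taken from the original code.

```swift
let requestedSize = 500

// Keeps only the low 8 bits, like assigning 500 to a uint8_t in C.
let truncatedSize = UInt8(truncatingIfNeeded: requestedSize)

// The failable initializer reports that the value does not fit.
let checkedSize = UInt8(exactly: requestedSize)

print(truncatedSize)          // 500 mod 256
print(checkedSize == nil)
```

The truncated value is 244 (500 mod 256), so any buffer or context built from it covers roughly half of each dimension.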
I am trying to determine if a MTLTexture (in bgra8Unorm format) is blank by calculating the sum of all the R G B and A components of each of its pixels.
This function intends to do this by adding adjacent floats in memory after a texture has been copied to a pointer. However, I have determined that this function ends up returning false no matter which MTLTexture is given.
What is wrong with this function?
func anythingHere(_ texture: MTLTexture) -> Bool {
let width = texture.width
let height = texture.height
let bytesPerRow = width * 4
let data = UnsafeMutableRawPointer.allocate(bytes: bytesPerRow * height, alignedTo: 4)
defer {
data.deallocate(bytes: bytesPerRow * height, alignedTo: 4)
}
let region = MTLRegionMake2D(0, 0, width, height)
texture.getBytes(data, bytesPerRow: bytesPerRow, from: region, mipmapLevel: 0)
var bind = data.assumingMemoryBound(to: UInt8.self)
var sum:UInt8 = 0;
for i in 0..<width*height {
sum += bind.pointee
bind.advanced(by: 1)
}
return sum != 0
}
Matthijs' change is necessary, but there are also a couple of other issues with the correctness of this method.
You're actually only iterating over 1/4 of the pixels, since you're stepping byte-wise and the upper bound of your loop is width * height rather than bytesPerRow * height.
Additionally, computing the sum of the pixels doesn't really seem like what you want. You can save some work by returning true as soon as you encounter a non-zero value (if bind.pointee != 0).
(Incidentally, Swift's integer overflow checking will actually trap at runtime if you accumulate a value greater than 255 into a UInt8. I suppose you could use a bigger integer, or disable overflow checking with sum = sum &+ bind.pointee, but again, breaking the loop on the first non-clear pixel will save some time and prevent false negatives when the accumulator "rolls over" to exactly 0.)
Here's a version of your function that worked for me:
func anythingHere(_ texture: MTLTexture) -> Bool {
let width = texture.width
let height = texture.height
let bytesPerRow = width * 4
let data = UnsafeMutableRawPointer.allocate(byteCount: bytesPerRow * height, alignment: 4)
defer {
data.deallocate()
}
let region = MTLRegionMake2D(0, 0, width, height)
texture.getBytes(data, bytesPerRow: bytesPerRow, from: region, mipmapLevel: 0)
var bind = data.assumingMemoryBound(to: UInt8.self)
for _ in 0..<bytesPerRow * height {
if bind.pointee != 0 {
return true
}
bind = bind.advanced(by: 1)
}
return false
}
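The roll-over problem mentioned above can be shown without Metal at all. This is a small sketch on a hand-written byte array (the values are invented for the example) that compares a wrapping UInt8 sum against a scan for the first non-zero byte.

```swift
// Four bytes standing in for one BGRA pixel; two components are non-zero.
let bytes: [UInt8] = [128, 128, 0, 0]

// Wrapping addition: 128 + 128 overflows UInt8 and wraps around to 0.
var wrappingSum: UInt8 = 0
for byte in bytes {
    wrappingSum = wrappingSum &+ byte
}
let hasContentBySum = wrappingSum != 0

// Scanning stops at the first non-zero byte and cannot wrap.
let hasContentByScan = bytes.contains { $0 != 0 }

print(hasContentBySum, hasContentByScan)
```

Here the wrapping sum reports an empty texture even though the data is not blank, while the scan reports content.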
Keep in mind that on macOS, the default storageMode for textures is managed, which means their contents aren't automatically synchronized back to main memory when they're modified on the GPU. You must explicitly use a blit command encoder to sync the contents yourself:
let syncEncoder = buffer.makeBlitCommandEncoder()!
syncEncoder.synchronize(resource: texture)
syncEncoder.endEncoding()
Didn't look in detail at the rest of the code, but I think this,
bind.advanced(by: 1)
should be:
bind = bind.advanced(by: 1)
I need to perform some statistics and pixel-by-pixel analysis of a UIView containing sub views, sublayers and mask in a small iOS-swift3 project.
For the moment I came up with the following:
private func computeStatistics() {
// constants
let width: Int = Int(self.bounds.size.width)
let height: Int = Int(self.bounds.size.height)
// color extractor
let pixel = UnsafeMutablePointer<CUnsignedChar>.allocate(capacity: 4)
let colorSpace = CGColorSpaceCreateDeviceRGB()
let bitmapInfo = CGBitmapInfo(rawValue: CGImageAlphaInfo.premultipliedLast.rawValue)
for x in 0..<width {
for y in 0..<height {
let context = CGContext(data: pixel, width: 1, height: 1, bitsPerComponent: 8, bytesPerRow: 4, space: colorSpace, bitmapInfo: bitmapInfo.rawValue)
context!.translateBy(x: -CGFloat(x), y: -CGFloat(y))
layer.render(in: context!)
// analyse the pixel here
// eg: let totalRed += pixel[0]
}
}
pixel.deallocate(capacity: 4)
}
It works, but on a fullscreen view, even on an iPhone 4, this means about 150,000 context instantiations and as many expensive renders. Besides being very slow, it also seems to have a deallocation problem, because it saturates my memory (even in the simulator).
I tried analyzing only a fraction of the pixels:
let definition: Int = width / 10
for x in 0..<width where x%definition == 0 {
...
}
But besides still taking up to 10 seconds, even on a simulated iPhone 7, this is a very poor solution.
Is it possible to avoid re-rendering and translating the context everytime?
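One possible direction, offered as a sketch rather than a tested answer: render the layer once into a single bitmap context of width x height pixels, then walk the resulting bytes. The rendering step is left out here. The code below only shows the analysis pass and assumes the bytes are already in an RGBA buffer with 4 bytes per pixel; ChannelTotals and channelTotals are hypothetical names.

```swift
/// Running totals for each colour channel.
struct ChannelTotals {
    var red = 0
    var green = 0
    var blue = 0
    var alpha = 0
}

/// Sums each channel over an RGBA buffer that was rendered once.
/// Assumes 4 bytes per pixel and `bytesPerRow` bytes per row.
func channelTotals(rgba: [UInt8], width: Int, height: Int,
                   bytesPerRow: Int) -> ChannelTotals {
    var totals = ChannelTotals()
    for y in 0..<height {
        let rowStart = y * bytesPerRow
        for x in 0..<width {
            let offset = rowStart + x * 4
            totals.red += Int(rgba[offset])
            totals.green += Int(rgba[offset + 1])
            totals.blue += Int(rgba[offset + 2])
            totals.alpha += Int(rgba[offset + 3])
        }
    }
    return totals
}

// A 2 x 1 buffer: one opaque red pixel and one half-transparent gray pixel.
let sample: [UInt8] = [255, 0, 0, 255,   100, 100, 100, 128]
let totals = channelTotals(rgba: sample, width: 2, height: 1, bytesPerRow: 8)
print(totals.red, totals.green, totals.blue, totals.alpha)
```

With this shape there is one render and one allocation, and the per-pixel work is only array reads.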
I am calculating the RGB values of pixels in my captured photo. I have this code
func getPixelColorAtLocation(context: CGContext, point: CGPoint) -> Color {
self.context = createARGBBitmapContext(imgView.image!)
let data = CGBitmapContextGetData(context)
let dataType = UnsafePointer<UInt8>(data)
let offset = 4 * ((Int(imageHeight) * Int(point.x)) + Int(point.y))
var color = Color()
color.blue = dataType[offset]
color.green = dataType[offset + 1]
color.red = dataType[offset + 2]
color.alpha = dataType[offset + 3]
color.point.x = point.x
color.point.y = point.y
But I am not sure what this line means in the code.
let offset = 4 * ((Int(imageHeight) * Int(point.x)) + Int(point.y))
Any help??
Thanks in advance
An image is a set of pixels. To get the pixel at point (x, y), you need to calculate the offset of that pixel within the set.
dataType[0] has no offset because it reads the element the pointer itself points to. dataType[10] reads the element 10 positions past the start of the pointer.
Because each pixel takes four bytes (one per colour component), you multiply the pixel index by 4. The pixel index is the offset along x (which is just x) plus the offset along y (the width of the image multiplied by y, which skips all the rows above the one you want):
offset = 4 * (x + width * y)
// offset, offset + 1, offset + 2, offset + 3 <- the four component values of that pixel
Imagine one long array with all the values in it.
It becomes clear if you picture a two-dimensional array stored as a one-dimensional array. I hope that helps.
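The row-major indexing described above can be written as a small helper. This is a sketch assuming rows are tightly packed with 4 bytes per pixel; byteOffset is a made-up name. Note that the line in the question multiplies imageHeight by x instead, which only matches this layout if the data is stored column by column.

```swift
/// Byte offset of the pixel at (x, y) in a tightly packed, row-major bitmap.
func byteOffset(x: Int, y: Int, width: Int, bytesPerPixel: Int = 4) -> Int {
    // Skip y full rows, then x pixels within the row, then scale to bytes.
    return bytesPerPixel * (y * width + x)
}

// In a 10-pixel-wide image, pixel (3, 2) is the 24th pixel (index 23).
let offset = byteOffset(x: 3, y: 2, width: 10)
print(offset)
```

The four component bytes of that pixel are then at offset, offset + 1, offset + 2 and offset + 3.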
I'm trying to get the per-pixel RGBA values for a CIImage in floating point.
I expect the following to work, using CIContext and rendering as kCIFormatRGBAh, but the output is all zeroes. Otherwise my next step would be converting from half floats to full.
What am I doing wrong? I've also tried this in Objective-C and get the same result.
let image = UIImage(named: "test")!
let sourceImage = CIImage(CGImage: image.CGImage)
let context = CIContext(options: [kCIContextWorkingColorSpace: NSNull()])
let colorSpace = CGColorSpaceCreateDeviceRGB()
let bounds = sourceImage.extent()
let bytesPerPixel: UInt = 8
let format = kCIFormatRGBAh
let rowBytes = Int(bytesPerPixel * UInt(bounds.size.width))
let totalBytes = UInt(rowBytes * Int(bounds.size.height))
var bitmap = calloc(totalBytes, UInt(sizeof(UInt8)))
context.render(sourceImage, toBitmap: bitmap, rowBytes: rowBytes, bounds: bounds, format: format, colorSpace: colorSpace)
let bytes = UnsafeBufferPointer<UInt8>(start: UnsafePointer<UInt8>(bitmap), count: Int(totalBytes))
for (var i = 0; i < Int(totalBytes); i += 2) {
println("half float :: left: \(bytes[i]) / right: \(bytes[i + 1])")
// prints all zeroes!
}
free(bitmap)
Here's a related question about getting the output of CIAreaHistogram, which is why I want floating point values rather than integer, but I can't seem to make kCIFormatRGBAh work on any CIImage regardless of its origin, filter output or otherwise.
There are two constraints on using RGBAh with [CIContext render:toBitmap:rowBytes:bounds:format:colorSpace:] on iOS:
the rowBytes must be a multiple of 8 bytes
calling it in the Simulator is not supported
These constraints come from the behavior of OpenGLES with RGBAh on iOS.
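The question mentions converting the half floats to full floats as a next step. Once the render does produce data, each 16-bit value could be expanded by hand as in the sketch below. It assumes standard IEEE 754 half-precision bit patterns; floatFromHalf is a hypothetical helper, not part of Core Image.

```swift
/// Expands an IEEE 754 half-precision bit pattern to a Float.
func floatFromHalf(_ half: UInt16) -> Float {
    let sign = UInt32(half & 0x8000) << 16
    let exponent = UInt32(half & 0x7C00) >> 10
    let mantissa = UInt32(half & 0x03FF)

    if exponent == 0 {
        // Zero or subnormal: the value is mantissa * 2^-24.
        let magnitude = Float(mantissa) / 16_777_216
        return sign == 0 ? magnitude : -magnitude
    }
    if exponent == 0x1F {
        // Infinity (mantissa == 0) or NaN (mantissa != 0).
        return Float(bitPattern: sign | 0x7F80_0000 | (mantissa << 13))
    }
    // Normal number: rebias the exponent from 15 to 127 and widen the mantissa.
    return Float(bitPattern: sign | ((exponent + 112) << 23) | (mantissa << 13))
}

let one = floatFromHalf(0x3C00)
let minusTwo = floatFromHalf(0xC000)
let half = floatFromHalf(0x3800)
print(one, minusTwo, half)
```

When reading from a byte buffer like the one in the question, two adjacent bytes would first be combined into a UInt16, for example UInt16(bytes[i]) | (UInt16(bytes[i + 1]) << 8) on a little-endian device.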