Unexpected behaviour with CIKernel - iOS

I made this example to show the problem. It takes one pixel from the texture at a hardcoded coordinate and uses it as the result for every pixel in the shader. I expect the whole image to come out in the same color. With small images it works perfectly, but with big images I get a strange result. For example, here the image is 7680x8580 and you can see 4 squares:
Here is my code
kernel vec4 colorKernel(sampler source)
{
    vec4 key = sample(source, samplerTransform(source, vec2(100., 200.)));
    return key;
}
Here is how I init Kernel:
override var outputImage: CIImage? {
    return colorFillKernel.apply(
        extent: CGRect(origin: CGPoint.zero, size: inputImage!.extent.size),
        roiCallback: { (index, rect) in
            return rect
        },
        arguments: [inputImage])
}
Also, this code shows the image properly, without changes or squares:
vec2 dc = destCoord();
return sample(source, samplerTransform(source, dc));
The public documentation says "Core Image automatically splits large images into smaller tiles for rendering, so your callback may be called multiple times.", but I can't find any guidance on how to handle this situation. I have kaleidoscopic effects, and from within any tile I need to be able to read pixels from other tiles as well...

I think the problem occurs due to a wrongly defined region of interest in combination with tiling.
In the roiCallback, Core Image is asking you which area of the input image (at index, in case you have multiple inputs) your kernel needs to look at in order to produce the given region (rect) of the output image. The reason why this is a closure is tiling:
If the processed image is too large, Core Image breaks it down into multiple tiles, renders those tiles separately, and stitches them together again afterward. For each tile, Core Image asks you what part of the input image your kernel needs to read to produce that tile.
So for your input image, the roiCallback might be called something like four times (or even more) during rendering, for example with the following rectangles:
CGRect(x: 0, y: 0, width: 4096, height: 4096) // top left
CGRect(x: 4096, y: 0, width: 3584, height: 4096) // top right
CGRect(x: 0, y: 4096, width: 4096, height: 4484) // bottom left
CGRect(x: 4096, y: 4096, width: 3584, height: 4484) // bottom right
This is an optimization mechanism of Core Image: it wants to read and process only the pixels that are needed to produce a given region of the output. So it's best to tailor the ROI as closely as possible to your use case.
Now the ROI depends on the kernel. There are basically four scenarios:
Your kernel has a 1:1 mapping between input pixel and output pixel. So in order to produce an output color value, it needs to read the pixel at the same position from the input image. In this case, you simply return the given rect unchanged from your roiCallback. (Or even better, you use a CIColorKernel, which is made for this use case.)
Your kernel performs some kind of convolution and not only requires the input pixel at the same coordinate as the output but also some region around it (for instance for a blur operation). Your roiCallback could look like this then:
let inset = self.radius // like radius of CIGaussianBlur
let roiCallback: CIKernelROICallback = { _, rect in
    return rect.insetBy(dx: -inset, dy: -inset)
}
Your kernel always needs to read a specific region of the input, regardless of which part of the output is rendered. Then you can just return that specific region in the callback:
let roiCallback: CIKernelROICallback = { _, _ in CGRect(x: 100, y: 200, width: 1, height: 1) }
The kernel always needs access to the whole input image. This is for example the case when you use some kind of lookup table to derive colors. In this case, you can just return the extent of the input and ignore the parameters:
let roiCallback: CIKernelROICallback = { _, _ in inputImage.extent }
For your example, scenario 3 should be the right choice. For your kaleidoscopic effects, I assume that you need a certain region of source pixels around the destination coordinate in order to produce an output pixel. So it would be best if you calculated the size of that region and used a roiCallback like in scenario 2.
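Applied to the code from the question, a minimal sketch of scenario 3 (reusing colorFillKernel and inputImage from above) could look like this:
override var outputImage: CIImage? {
    guard let inputImage = inputImage else { return nil }
    return colorFillKernel.apply(
        extent: inputImage.extent,
        roiCallback: { _, _ in
            // The kernel always reads the same hardcoded pixel,
            // no matter which output tile is being rendered.
            return CGRect(x: 100, y: 200, width: 1, height: 1)
        },
        arguments: [inputImage])
}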
P.S.: Using the Core Image Kernel Language (CIKernel(source: "<code>")) is super duper deprecated now. You should consider writing your kernels in the Metal Shading Language instead. Check out this year's WWDC talk to learn more. 🙂
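For reference, once the kernel is written in Metal and compiled into a Metal library with the Core Image compiler flags, loading it on the Swift side might look roughly like this (a sketch; the function name and library name are assumptions):
let url = Bundle.main.url(forResource: "default", withExtension: "metallib")!
let data = try Data(contentsOf: url)
// Replaces the deprecated CIKernel(source:) initializer.
let kernel = try CIKernel(functionName: "colorKernel", fromMetalLibraryData: data)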

UIBezierPath corners curve according to the upcoming direction

I'm creating an audio waveform, which should look like this:
Notice how the corners of the lines are curved, according to the direction.
My waveform currently has only straight lines:
How can I achieve the desired results?
My current code for the waveform:
fileprivate func createPath(with points: [CGPoint], pointCount: Int, in rect: CGRect) -> CGPath {
    let path = UIBezierPath()
    guard pointCount > 0 else {
        return path.cgPath
    }
    guard rect.height > 0, rect.width > 0 else {
        return path.cgPath
    }
    path.move(to: CGPoint(x: 0, y: 0))
    let minValue = 1 / (rect.height / 2)
    for index in 0..<(pointCount / 2) {
        var point = points[index * 2]
        path.move(to: point)
        point.y = max(point.y, minValue)
        point.y = -point.y
        path.addLine(to: point)
    }
    let scaleX = (rect.width - 1) / CGFloat(pointCount - 1)
    let halfHeight = rect.height / 2
    let scaleY = halfHeight
    var transform = CGAffineTransform(scaleX: scaleX, y: scaleY)
    transform.ty = halfHeight
    path.apply(transform)
    return path.cgPath
}
There are two ways to accomplish this:
Rather than treating each bar as a wide line, fill it as a shape, each with its own left, top, right, and bottom. And the top would then be a bézier.
Rather than adjust the top of each bar, you can make the bars all go from minimum to maximum values and then add a mask over the whole graph to render the smoothed shape.
E.g., this shows a few data points and overlays the Catmull-Rom bézier; I then extend the bars (because the curve of the bézier sometimes goes above the existing bars) and use the bézier as a mask instead of an overlay.
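A minimal sketch of the mask approach (the helper and barTopPoints are assumptions, not code from this answer): build a Catmull-Rom-smoothed path over the bar tops and use it as a mask over full-height bars.
import UIKit

func smoothedMaskPath(over tops: [CGPoint], in rect: CGRect) -> UIBezierPath {
    let path = UIBezierPath()
    guard tops.count > 1 else { return path }

    // Start at the bottom-left corner so the mask encloses everything under the curve.
    path.move(to: CGPoint(x: rect.minX, y: rect.maxY))
    path.addLine(to: tops[0])

    // Catmull-Rom segments converted to cubic béziers.
    for i in 0..<(tops.count - 1) {
        let p0 = tops[max(i - 1, 0)]
        let p1 = tops[i]
        let p2 = tops[i + 1]
        let p3 = tops[min(i + 2, tops.count - 1)]
        let c1 = CGPoint(x: p1.x + (p2.x - p0.x) / 6, y: p1.y + (p2.y - p0.y) / 6)
        let c2 = CGPoint(x: p2.x - (p3.x - p1.x) / 6, y: p2.y - (p3.y - p1.y) / 6)
        path.addCurve(to: p2, controlPoint1: c1, controlPoint2: c2)
    }

    // Close back down to the bottom-right corner.
    path.addLine(to: CGPoint(x: rect.maxX, y: rect.maxY))
    path.close()
    return path
}

// Usage: assign the path to a CAShapeLayer and set it as the waveform layer's mask.
// let maskLayer = CAShapeLayer()
// maskLayer.path = smoothedMaskPath(over: barTopPoints, in: waveformView.bounds).cgPath
// waveformView.layer.mask = maskLayer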
Additional observations:
Please note, your first image with the curved tops of the bars has another feature that makes it look smooth: The data points, themselves, are smoothed. Your second image features far greater volatility than the first.
The source of this volatility (which is common in audio tracks, or pretty much any DSP dataset), or lack thereof, is not relevant here. What is relevant is that if the data samples are highly variable, an interpolation algorithm for curving the tops of the bars can actually exaggerate the volatility. Single point spikes will be unusually sharp. Double point spikes will actually appear higher than they really are.
E.g. consider this dataset:
With this sort of variability, it could be argued that the "rounding" of the bars makes the results harder to read and/or is misleading:
While I have attempted to answer the question, one must ask whether this whole exercise is prudent. While there might be an aesthetic appeal to curving the tops of the bars, it suggests a degree of continuity/precision greater than what the underlying data likely supports. The square bars more accurately represent the reality of the ranges of values over which the data was aggregated and, for this reason, are the common visual representation.

Use output of a reduction CIFilter as color input for another filter

I'm new to CoreImage / Metal, so my apologies in advance if my question is naive. I spent a week going over the CoreImage documentation and examples and I couldn't figure this one out.
Suppose I have a reduction filter such as CIAreaAverage which outputs a 1x1 image. Is it possible to convert that image into a color that I can pass as an argument of another CIFilter? I'm aware that I can do this by rendering the CIAreaAverage output into a CVPixelBuffer, but I'm trying to do this in one render pass.
Edit #1 (Clarification):
Let's say I want to correct the white balance by letting the user sample a gray pixel from an image:
let pixelImage = inputImage.applyingFilter("CICrop", arguments: [
    "inputRectangle": CGRect(origin: pixelCoordinates, size: CGSize(width: 1, height: 1))
])
// We know now that the extent of pixelImage is 1x1.
// Do something to convert the image into a pixel to be passed as an argument in the filter below.
let pixelColor = ???
let outputImage = inputImage.applyingFilter("CIWhitePointAdjust", arguments: [
    "inputColor": pixelColor
])
Is there a way to tell the CIContext to convert the 1x1 CIImage into CIColor?
If you want to use the result of CIAreaAverage in a custom CIFilter (i.e. you don't need it for a CIColor parameter), you can directly pass it as a CIImage to that filter and read the value via sampler in the kernel:
extern "C" float4 myKernel(sampler someOtherInput, sampler average) {
float4 avg = average.sample(float2(0.5, 0.5)); // average only contains one pixel, so always sample that
// ...
}
You can also call .clampedToExtent() on the average CIImage before you pass it to another filter/kernel. This will cause Core Image to treat the average image as if it were infinitely large, containing the same value everywhere. Then it doesn't matter at which coordinate you sample the value. This might be useful if you want to use the average value in a custom CIColorKernel.
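For example, a sketch of that combination (assuming inputImage is the image whose average you want):
let average = inputImage
    .applyingFilter("CIAreaAverage", parameters: [kCIInputExtentKey: CIVector(cgRect: inputImage.extent)])
    .clampedToExtent()
// average can now be sampled at any coordinate inside a custom kernel.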
Something you can do that doesn't involve Metal is to use Core Image itself. Let's say the output from CIAreaAverage is called ciPixel and you want a 640x640 image of it:
let crop = CIFilter(name: "CICrop")
crop?.setValue(ciPixel, forKey: "inputImage")
crop?.setValue(CIVector(x: 0, y: 0, z: 640, w: 640), forKey: "inputRectangle")
ciOutput = crop?.outputImage
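Note that if ciPixel really is only 1x1, the crop alone only enlarges the extent. Combining it with .clampedToExtent() (as mentioned above) fills the whole rectangle with the averaged color, e.g. (a sketch):
let filled = ciPixel.clampedToExtent().cropped(to: CGRect(x: 0, y: 0, width: 640, height: 640))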

Best way to change pixels in iOS, Swift

I'm currently implementing some sort of coloring book and I'm curious about the best way to change pixels in a UIImage. Here is my code:
self.context = CGContext(data: nil,
                         width: image.width,
                         height: image.height,
                         bitsPerComponent: 8,
                         bytesPerRow: image.width * 4,
                         space: colorSpace,
                         bitmapInfo: CGBitmapInfo.byteOrder32Little.rawValue | CGImageAlphaInfo.premultipliedFirst.rawValue)!
self.context?.draw(image.cgImage, in: CGRect(x: 0, y: 0, width: CGFloat(image.width), height: CGFloat(image.height)))
let ptr = context.data
self.pixelBuffer = ptr!.bindMemory(to: UInt32.self, capacity: image.width * image.height)
And change pixels using this function:
@inline(__always) func fill(matrixPosition: MatrixPosition, color: UInt32) {
    pixelBuffer?[self.linearIndex(for: matrixPosition)] = color
}
The problem is that every time I change pixels I have to invoke makeImage on the context to generate a new image, and that takes a lot of time:
func generateImage() -> UIImage {
    let cgImage = context.makeImage()!
    let uiimage = UIImage(cgImage: cgImage)
    return uiimage
}
Is my approach correct? What are better and faster ways to implement it? Thanks.
Manipulating individual pixels, copying the entire memory buffer to a CGContext, and then creating a UIImage with that context is going to end up being inefficient, as you are discovering.
You can continue to improve and optimize a CoreGraphics canvas approach by being more efficient about what part of your offscreen is copied onto screen. You can detect the pixels that have changed and only copy the minimum bounding rectangle of those pixels onto screen. This approach may be good enough for your use case where you are only filling in areas with colors.
Instead of copying the entire offscreen, copy just the changed area:
self.context?.draw(image.cgImage, in: CGRect(x: diffX, y: diffY, width: diffWidth, height: diffHeight))
It is up to you to determine the changed rectangle and when to update the screen.
Here is an example of a painting app that uses CoreGraphics, CoreImage and CADisplayLink. The code is a bit old, but the concepts are still valid and will serve as a good starting point. You can see how the changes are accumulated and drawn to the screen using a CADisplayLink.
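A minimal sketch of that idea (reusing pixelBuffer, linearIndex, and MatrixPosition from the question; dirtyRect, canvasView, and the display-link callback are assumptions): accumulate a dirty rect as pixels change and invalidate only that region when the display link fires.
var dirtyRect: CGRect = .null

func fill(matrixPosition: MatrixPosition, color: UInt32) {
    pixelBuffer?[linearIndex(for: matrixPosition)] = color
    // Assuming MatrixPosition exposes integer x/y, grow the dirty region to cover this pixel.
    dirtyRect = dirtyRect.union(CGRect(x: matrixPosition.x, y: matrixPosition.y, width: 1, height: 1))
}

@objc func displayLinkFired(_ link: CADisplayLink) {
    guard !dirtyRect.isNull else { return }
    // Only the changed region is invalidated; draw(_:) can then blit just that
    // part of the offscreen context instead of the whole canvas.
    canvasView.setNeedsDisplay(dirtyRect)
    dirtyRect = .null
}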
If you want to introduce various types of ink and paint effects, a CoreGraphics approach is going to be more challenging. You will want to look at Apple's Metal API. A good tutorial is here.

How to crop/resize texture array in Metal

Say I have an N-channel MPSImage or texture array that is based on MTLTexture.
How do I crop a region from it, copying all the N channels, but changing "pixel size"?
I'll just address the crop case, since the resize case involves resampling and is marginally more complicated. Let me know if you really need that.
Let's assume your source MPSImage is a 12 feature channel (3 slice) image that is 128x128 pixels, that your destination image is an 8 feature channel image (2 slices) that is 64x64 pixels, and that you want to copy the bottom-right 64x64 region of the last two slices of the source into the destination.
There is no API that I'm aware of that allows you to copy from/to multiple slices of an array texture at once, so you'll need to issue multiple blit commands to cover all the slices:
let sourceRegion = MTLRegionMake3D(64, 64, 0, 64, 64, 1)
let destOrigin = MTLOrigin(x: 0, y: 0, z: 0)
let firstSlice = 1
let lastSlice = 2 // inclusive

let commandBuffer = commandQueue.makeCommandBuffer()!
let blitEncoder = commandBuffer.makeBlitCommandEncoder()!
for slice in firstSlice...lastSlice {
    blitEncoder.copy(from: sourceImage.texture,
                     sourceSlice: slice,
                     sourceLevel: 0,
                     sourceOrigin: sourceRegion.origin,
                     sourceSize: sourceRegion.size,
                     to: destImage.texture,
                     destinationSlice: slice - firstSlice,
                     destinationLevel: 0,
                     destinationOrigin: destOrigin)
}
blitEncoder.endEncoding()
commandBuffer.commit()
I'm not sure why you want to crop, but keep in mind that the MPSCNN layers can work on a smaller portion of your MPSImage. Just set the offset and clipRect properties and the layer will only work on that region of the source image.
In fact, you could do your crops this way using an MPSCNNNeuronLinear. Not sure if that is any faster or slower than using a blit encoder but it's definitely simpler.
Edit: added a code example. This is typed from memory so it may have small errors, but this is the general idea:
// Declare this somewhere:
let linearNeuron = MPSCNNNeuronLinear(device: device, a: 1, b: 0) // device is your MTLDevice
Then when you run your neural network, add the following:
let yourImage: MPSImage = ...
let commandBuffer = ...
// This describes the size of the cropped image.
let imgDesc = MPSImageDescriptor(...)
// If you're going to use the cropped image in other layers
// then it's a good idea to make it a temporary image.
let tempImg = MPSTemporaryImage(commandBuffer: commandBuffer, imageDescriptor: imgDesc)
// Set the cropping offset:
linearNeuron.offset = MPSOffset(x: ..., y: ..., z: 0)
// The clip rect is the size of the output image.
linearNeuron.clipRect = MTLRegionMake2D(0, 0, imgDesc.width, imgDesc.height)
linearNeuron.encode(commandBuffer: commandBuffer, sourceImage: yourImage, destinationImage: tempImg)
// Here do your other layers, taking tempImg as input.
. . .
commandBuffer.commit()

What is a plane in a CVPixelBuffer?

A CVPixelBuffer object can have one or many planes. (reference)
There are methods to get the number of planes and the width, height, and base address of each plane.
So what exactly is a plane? And how is it laid out inside a CVPixelBuffer?
Sample:
<CVPixelBuffer 0x1465f8b30 width=1280 height=720 pixelFormat=420v iosurface=0x14a000008 planes=2>
<Plane 0 width=1280 height=720 bytesPerRow=1280>
<Plane 1 width=640 height=360 bytesPerRow=1280>
Video formats are an incredibly complex subject.
Some video streams have the pixels stored in bytes as RGBA, ARGB, ABGR, or several other variants (with or without an alpha channel).
(In RGBA format, you'd have the red, green, blue, and alpha values of a pixel one right after the other in memory, followed by another set of 4 bytes with the color values of the next pixel, etc.) This is interleaved color information.
Some video streams separate out the color channels so all the red channel, blue, green, and alpha are sent as separate "planes". You'd get a buffer with all the red information, then all the blue data, then all the green, and then alpha, if alpha is included. (Think of color negatives, where there are separate layers of emulsion to capture the different colors. The layers of emulsion are planes of color information. It's the same idea with digital.)
There are formats where the color data is in one or 2 planes, and then the luminance is in a separate plane. That's how old analog color TV works. It started out as black and white (luminance) and then broadcasters added side-band signals to convey the color information. (Chroma)
I don't muck around with CVPixelBuffers often enough to know the gory details of what you are asking, and have to invest large amounts of time and copious amounts of coffee before I can "spin up" my brain enough to grasp those gory details.
Edit:
Since your debug information shows 2 planes, it seems likely that this pixel buffer has a luminance channel and a chroma channel, as mentioned in #zeh's answer.
Although the existing and accepted answer is rich in important information when dealing with CVPixelBuffers, in this particular case the answer is wrong. The two planes that the question refers to are the luminance and chrominance planes.
Luminance refers to brightness and chrominance refers to color - From Quora
The following code snippet from Apple makes it more clear:
let lumaBaseAddress = CVPixelBufferGetBaseAddressOfPlane(pixelBuffer, 0)
let lumaWidth = CVPixelBufferGetWidthOfPlane(pixelBuffer, 0)
let lumaHeight = CVPixelBufferGetHeightOfPlane(pixelBuffer, 0)
let lumaRowBytes = CVPixelBufferGetBytesPerRowOfPlane(pixelBuffer, 0)

var sourceLumaBuffer = vImage_Buffer(data: lumaBaseAddress,
                                     height: vImagePixelCount(lumaHeight),
                                     width: vImagePixelCount(lumaWidth),
                                     rowBytes: lumaRowBytes)

let chromaBaseAddress = CVPixelBufferGetBaseAddressOfPlane(pixelBuffer, 1)
let chromaWidth = CVPixelBufferGetWidthOfPlane(pixelBuffer, 1)
let chromaHeight = CVPixelBufferGetHeightOfPlane(pixelBuffer, 1)
let chromaRowBytes = CVPixelBufferGetBytesPerRowOfPlane(pixelBuffer, 1)

var sourceChromaBuffer = vImage_Buffer(data: chromaBaseAddress,
                                       height: vImagePixelCount(chromaHeight),
                                       width: vImagePixelCount(chromaWidth),
                                       rowBytes: chromaRowBytes)
See full reference here.
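As an illustration of the layout shown in the debug output above, a minimal sketch (assuming an existing 420v pixelBuffer) that walks the planes:
CVPixelBufferLockBaseAddress(pixelBuffer, .readOnly)
for plane in 0..<CVPixelBufferGetPlaneCount(pixelBuffer) {
    let width = CVPixelBufferGetWidthOfPlane(pixelBuffer, plane)
    let height = CVPixelBufferGetHeightOfPlane(pixelBuffer, plane)
    let bytesPerRow = CVPixelBufferGetBytesPerRowOfPlane(pixelBuffer, plane)
    // Plane 0: full-resolution luma (Y). Plane 1: half-resolution interleaved chroma (CbCr),
    // which is why its bytesPerRow matches plane 0 even though its width is halved.
    print("Plane \(plane): \(width)x\(height), bytesPerRow=\(bytesPerRow)")
}
CVPixelBufferUnlockBaseAddress(pixelBuffer, .readOnly)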
