Core Image Kernel Language's OpenGL coordinate system - ios

I'm writing a simple (at least I thought it would be simple) custom kernel that takes the difference between a specified pixel and an entire image.
Below is the code that I have; it just creates the filter. It's convenient to play with in a playground.
import UIKit
import CoreImage
let Flower = CIImage( image: UIImage(named: "flower.png")!)!
class Test: CIFilter
{
var inputImage1 : CIImage?
var inputImage2 : CIImage?
var kernel = CIKernel(string:
"kernel vec4 colorRemap(sampler inputIm, sampler GaussIm) " +
"{ " +
"vec4 size = samplerExtent(inputIm); " +
"float row = 1.0; " +
"float column = 1.0; " +
"float pixelx = (column - 1.0)/(size.w - 1.0)+1.0/(2.0*size.z);" +
"float pixely = (size.z - row)/(size.z - 1.0)-1.0/(2.0*size.w);" +
"vec3 g0 =sample(GaussIm,vec2(pixelx,pixely)).rgb; " +
"vec3 current = sample(inputIm,samplerCoord(inputIm)).rgb; " +
"vec3 diff =(current - g0); " +
"return vec4(diff,1.0); " +
"} "
)
var extentFunction: (CGRect, CGRect) -> CGRect =
{ (a: CGRect, b: CGRect) in return CGRectZero }
override var outputImage: CIImage!
{
if let inputImage1 = inputImage1,
inputImage2 = inputImage2,
kernel = kernel
{
let extent = inputImage1.extent
let arguments = [inputImage1,inputImage2]
return kernel.applyWithExtent(extent,
roiCallback:
{ (index, rect) in
return rect
},
arguments: arguments)
}
return nil
}
}
To use the filter, you can do the following
let filter = Test()
filter.inputImage1 = Flower
filter.inputImage2 = Flower
let output = filter.outputImage
Now, in the above code, I've specified that we're taking the difference between the pixel located at (1,1) of GaussIm, as if we were treating the image as a matrix (in the usual sense), and the entire image of inputIm.
After playing around, I came to realize that the Core Image Kernel Language treats images a bit like OpenGL does. The bottom-left corner is mapped to (0,0) and the top-right corner to (1,1), so that pixel coordinates are numbers between 0 and 1. The issue with this is that I want to be able to specify whichever pixel I want to use for the difference.
The first 5 lines of the kernel code attempt to handle this by computing the center of each pixel location in the image. I'm not sure if this is right considering how OpenGL treats its images, or whether there's a better way.
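For reference, here is a minimal sketch of the pixel-center arithmetic in Python, assuming the convention described above (origin at the bottom left, both axes normalized to 0..1) and a 1-based (row, column) matrix index with row 1 at the top. The function name `pixel_center` is made up for illustration. Note that `samplerExtent` returns (x, y, width, height), so `size.z` is the width and `size.w` is the height.

```python
def pixel_center(row, column, width, height):
    """Normalized (x, y) center of the pixel at 1-based matrix index (row, column).

    Assumes a bottom-left origin and coordinates running from 0 to 1 on both axes.
    """
    x = (column - 0.5) / width        # column 1 -> half a pixel from the left edge
    y = (height - row + 0.5) / height  # row 1 is the top row, so flip vertically
    return x, y


# Top-left pixel of a 4x4 image.
print(pixel_center(1, 1, 4, 4))
```

With this form every center stays strictly between 0 and 1. A formula of the shape `(column - 1)/(width - 1) + 1/(2*width)` matches it for the first column but exceeds 1 for the last one.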
If I run this code above, with the below image:
I get the following with XCode:
Further, if I do the same thing in MATLAB, I get the following output:
Why am I getting a different output than in MATLAB? It almost seems a tad darker than what I'm getting from my custom filter, and yet they are close to the same output at the same time. My thought was that it must be the way the custom kernel is taking the difference amongst pixels, but I'm not really sure what's going on.

I ended up figuring this out. The clipping comes from how the image is rendered. This work was done in a playground, not on a CIContext of my own, so anything that was displayed was clipped to the range [0,1]. The fix is to make sure that the CIContext you do the calculations on supports floating-point precision, which is configured through the context's options.
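To see why clipping changes the picture, here is a small Python sketch (not Core Image code; the function names are made up for illustration). A per-channel difference can be negative, and a pipeline limited to [0,1] turns every negative value into 0, while a floating-point pipeline keeps it.

```python
def clamp01(value):
    """Clamp a channel value to the displayable range [0, 1]."""
    return min(1.0, max(0.0, value))


def difference(current, reference, clamp):
    """Per-channel difference current - reference, optionally clamped to [0, 1]."""
    diff = tuple(c - r for c, r in zip(current, reference))
    return tuple(clamp01(d) for d in diff) if clamp else diff


pixel = (0.25, 0.5, 0.75)
reference = (0.5, 0.5, 0.5)
print(difference(pixel, reference, clamp=False))  # negative red channel survives
print(difference(pixel, reference, clamp=True))   # negative red channel becomes 0
```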

Related

Metal Core Image Kernel use of DOD

I wrote the following Metal Core Image Kernel to produce constant red color.
extern "C" float4 redKernel(coreimage::sampler inputImage, coreimage::destination dest)
{
return float4(1.0, 0.0, 0.0, 1.0);
}
And then I have this in Swift code:
class CIMetalRedColorKernel: CIFilter {
var inputImage:CIImage?
static var kernel:CIKernel = { () -> CIKernel in
let bundle = Bundle.main
let url = bundle.url(forResource: "Kernels", withExtension: "ci.metallib")!
let data = try! Data(contentsOf: url)
return try! CIKernel(functionName: "redKernel", fromMetalLibraryData: data)
}()
override var outputImage: CIImage? {
guard let inputImage = inputImage else {
return nil
}
let dod = inputImage.extent
return CIMetalRedColorKernel.kernel.apply(extent: dod, roiCallback: { index, rect in
return rect
}, arguments: [inputImage])
}
}
As you can see, the dod is given to be the extent of the input image. But when I run the filter, I get a whole red image beyond the extent of the input image (DOD), why? I have multiple filters chained together and the overall size is 1920x1080. Isn't the red filter supposed to run only for DOD rectangle passed in it and produce clear pixels for anything outside the DOD?
With the extent parameter of the kernel call you signal the region for which the kernel produces meaningful results—or, as you correctly named it, the domain of definition.
However, this also means that whatever it produces outside this region is basically undefined and up to you as the kernel developer to decide.
A generator kernel like the one you wrote usually has an infinite domain of definition since it just produces a red color, regardless of the input. To restrict the output to a specific area, you can apply a crop to it:
let dod = inputImage.extent
let result = CIMetalRedColorKernel.kernel.apply(extent: .infinite, roiCallback: { index, rect in
return rect
}, arguments: [inputImage])
return result.cropped(to: dod)
After the cropping, everything outside of dod will be transparent.
Update:
It turns out you have to set the extent parameter of the kernel call to .infinite to make this work. I suspect that cropped(to:) checks if the image already has the given extent and will do nothing in this case. So to make CI really apply the cropping, you have to specify the domain of definition your kernel actually produces.
I think the counter-intuitive thing here is that CI does not apply your kernel to just the pixels of the extent you specify. It seems there is some automatic clamp-to-extent going on when the result is not cropped properly, but honestly, I'm also rather confused by this...
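The idea of "infinite generator, then crop" can be pictured with a toy model in Python. This is only an illustration of the concept, not of how Core Image is implemented: an image is modelled as a function from coordinates to RGBA, and all names here are hypothetical.

```python
RED = (1.0, 0.0, 0.0, 1.0)
CLEAR = (0.0, 0.0, 0.0, 0.0)


def red_generator(x, y):
    """A generator 'image': it produces red for every coordinate."""
    return RED


def cropped(image, rect):
    """Return a new image that is transparent outside rect = (x, y, width, height)."""
    x0, y0, width, height = rect

    def sample(x, y):
        inside = x0 <= x < x0 + width and y0 <= y < y0 + height
        return image(x, y) if inside else CLEAR

    return sample


result = cropped(red_generator, (0, 0, 4, 4))
print(result(1, 1))    # inside the crop rectangle
print(result(10, 10))  # outside the crop rectangle
```

In this model the generator itself never looks at a rectangle; only the explicit crop step limits where non-transparent pixels appear.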

Detect Image Problems

I really don't know what it is called (distortion or something else),
but I would like to detect camera lens problems in different types of images using EmguCV (or OpenCV).
Any ideas about which algorithms to use would be appreciated.
The second image seems to have high noise; is there any way to detect high noise with OpenCV?
This is very difficult to achieve generically without reference data or a homogeneity sample. However, I have developed a recommendation based on analyzing the average SNR (signal-to-noise ratio) of the image. The algorithm divides the input image into a number of "sub-images" based on a specified kernel size in order to evaluate each one independently for local SNR. The SNRs computed for the sub-images are then averaged to provide an indicator of the global SNR of the image.
You will need to test this approach exhaustively; however, it shows promise on the following three images, producing these AvgSNR values:
Image #1 - AvgSNR = 0.9
Image #2 - AvgSNR = 7.0
Image #3 - AvgSNR = 0.6
NOTE: See how the "clean" control image produces a much higher AvgSNR.
The only variable to consider is the kernel size. I would recommend keeping this at a size that will support even the smallest of your potential input images. 30 pixels square should be appropriate for many images.
I enclose my test code with annotation:
class Program
{
static void Main(string[] args)
{
// List of file names to load.
List<string> fileNames = new List<string>()
{
"IifXZ.png",
"o1z7p.jpg",
"NdQtj.jpg"
};
// For each image
foreach (string fileName in fileNames)
{
// Determine local file path
string path = Path.Combine(Environment.CurrentDirectory, @"TestImages\", fileName);
// Load the image
Image<Bgr, byte> inputImage = new Image<Bgr, byte>(path);
// Compute the AvgSNR with a kernel of 30x30
Console.WriteLine(ComputeAverageSNR(30, inputImage.Convert<Gray, byte>()));
// Display the image
CvInvoke.NamedWindow("Test");
CvInvoke.Imshow("Test", inputImage);
while (CvInvoke.WaitKey() != 27) { }
}
// Pause for evaluation
Console.ReadKey();
}
static double ComputeAverageSNR(int kernelSize, Image<Gray, byte> image)
{
// Calculate the number of sub-divisions given the kernel size
int widthSubDivisions, heightSubDivisions;
widthSubDivisions = (int)Math.Floor((double)image.Width / kernelSize);
heightSubDivisions = (int)Math.Floor((double)image.Height / kernelSize);
int totalNumberSubDivisions = widthSubDivisions * heightSubDivisions;
Rectangle ROI = new Rectangle(0, 0, kernelSize, kernelSize);
double avgSNR = 0;
// For each sub-division, calculate the SNR and add it to avgSNR
for (int v = 0; v < heightSubDivisions; v++)
{
for (int u = 0; u < widthSubDivisions; u++)
{
// Iterate the sub-division position
ROI.Location = new Point(u * kernelSize, v * kernelSize);
// Calculate the SNR of this sub-division
avgSNR += ComputeSNR(image.GetSubRect(ROI));
}
}
avgSNR /= totalNumberSubDivisions;
return avgSNR;
}
static double ComputeSNR(Image<Gray, byte> image)
{
// Local variables
double mean, sigma, snr;
// Calculate the mean pixel value for the sub-division
int population = image.Width * image.Height;
mean = CvInvoke.Sum(image).V0 / population;
// Calculate the Sigma of the sub-division population
double sumDeltaSqu = 0;
for (int v = 0; v < image.Height; v++)
{
for (int u = 0; u < image.Width; u++)
{
sumDeltaSqu += Math.Pow(image.Data[v, u, 0] - mean, 2);
}
}
sumDeltaSqu /= population;
sigma = Math.Pow(sumDeltaSqu, 0.5);
// Calculate and return the SNR value
snr = sigma == 0 ? mean : mean / sigma;
return snr;
}
}
NOTE: Without a reference, it is not possible to differentiate between natural variance/fidelity and "noise". For example, a highly textured background, or a scene with few homogeneous regions, will yield a low AvgSNR even when the image is clean. This approach will perform best when the evaluated scene consists mostly of plain, mono-color surfaces, such as the server room or shop front. Grass, for example, would contain a large amount of texture and therefore "noise".
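For readers who want to try the idea without Emgu, the block-wise computation can be sketched in plain Python as follows. It assumes a grayscale image given as a list of rows of numbers; the function names mirror the C# code above but are otherwise my own, and the up-front size check is an addition.

```python
import math


def compute_snr(block):
    """SNR of one block: mean / standard deviation (or the mean if the block is flat)."""
    pixels = [p for row in block for p in row]
    mean = sum(pixels) / len(pixels)
    variance = sum((p - mean) ** 2 for p in pixels) / len(pixels)
    sigma = math.sqrt(variance)
    return mean if sigma == 0 else mean / sigma


def compute_average_snr(image, kernel_size):
    """Average the SNR over all whole kernel_size x kernel_size blocks of the image."""
    height, width = len(image), len(image[0])
    rows, cols = height // kernel_size, width // kernel_size
    if rows == 0 or cols == 0:
        raise ValueError("image is smaller than the kernel size")
    total = 0.0
    for v in range(rows):
        for u in range(cols):
            # Cut out the block whose top-left corner is (u, v) in block units.
            block = [row[u * kernel_size:(u + 1) * kernel_size]
                     for row in image[v * kernel_size:(v + 1) * kernel_size]]
            total += compute_snr(block)
    return total / (rows * cols)


flat = [[10] * 4 for _ in range(4)]
noisy = [[0, 20], [20, 0]]
print(compute_average_snr(flat, 2), compute_average_snr(noisy, 2))
```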
An alternative method is to evaluate your images in the frequency domain following a Fourier transform. Principally, the noise examples you have provided are images containing unwanted high-frequency content. Compute the FFT and flag images whose high-frequency content exceeds a threshold. Here you will find an example of FFT with Emgu: FFT with Emgu
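As a rough illustration of the frequency-domain idea, the sketch below computes a naive 2D DFT in plain Python and reports which fraction of the non-DC spectral energy lies above a cutoff frequency. The cutoff of 0.25 cycles per pixel and the function names are assumptions chosen for the example; a real implementation would use an FFT routine (for instance `numpy.fft.fft2`) and a threshold tuned on your own images.

```python
import cmath
import math


def dft2(image):
    """Naive 2D discrete Fourier transform of a small grayscale image (list of rows)."""
    height, width = len(image), len(image[0])
    spectrum = [[0j] * width for _ in range(height)]
    for ky in range(height):
        for kx in range(width):
            total = 0j
            for y in range(height):
                for x in range(width):
                    angle = -2 * math.pi * (ky * y / height + kx * x / width)
                    total += image[y][x] * cmath.exp(1j * angle)
            spectrum[ky][kx] = total
    return spectrum


def high_frequency_ratio(image, cutoff=0.25):
    """Fraction of non-DC spectral energy above `cutoff` cycles per pixel."""
    height, width = len(image), len(image[0])
    spectrum = dft2(image)
    total = high = 0.0
    for ky in range(height):
        for kx in range(width):
            if ky == 0 and kx == 0:
                continue  # skip the DC term (average brightness)
            fy = min(ky, height - ky) / height  # fold negative frequencies
            fx = min(kx, width - kx) / width
            energy = abs(spectrum[ky][kx]) ** 2
            total += energy
            if max(fx, fy) > cutoff:
                high += energy
    return high / total if total else 0.0


checker = [[255 * ((x + y) % 2) for x in range(4)] for y in range(4)]
print(high_frequency_ratio(checker))  # pixel-level checkerboard: almost all high frequency
```

The ratio is not meaningful for a perfectly flat image, where the non-DC energy is only rounding error.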

I wanted to detect objects in an HSV image, but I keep getting an error: Expected Ptr<cv::UMat> for argument '%s'

I was trying to create a trackbar window and get the HSV values of the image by adjusting the trackbars. I created a mask and then adjusted the trackbars to detect an object in the HSV image.
def nothing(x):
pass
cv.namedWindow("Tracking")
cv.createTrackbar("LH","Tracking",0,255,nothing)
cv.createTrackbar("LS","Tracking",0,255,nothing)
cv.createTrackbar("LV","Tracking",0,255,nothing)
cv.createTrackbar("UH","Tracking",255,255,nothing)
cv.createTrackbar("US","Tracking",255,255,nothing)
cv.createTrackbar("UV","Tracking",255,255,nothing)
while True:
frame = cv.imread("C:/Users/acer/Desktop/insects/New folder/ins.jpg")
hsv = cv.cvtColor(frame,cv.COLOR_BGR2HSV)
l_h = cv.getTrackbarPos("LH","Tracking")
l_s = cv.getTrackbarPos("LS","Tracking")
l_v = cv.getTrackbarPos("LV","Tracking")
u_h = cv.getTrackbarPos("UH","Tracking")
u_s = cv.getTrackbarPos("US","Tracking")
u_v = cv.getTrackbarPos("UV","Tracking")
l_b = np.array([l_h,l_s,l_v])
u_b = np.array([u_h,u_s,u_v])
mask = (hsv,l_b,u_b)
res = cv.bitwise_and(frame,frame,mask=mask)
cv.imshow("frame",frame)
cv.imshow("mask",mask)
cv.imshow("res",res)
key = cv.waitKey(1)
if key == 27:
break
cv.destroyAllWindows()
There are a few issues with your code:
1) You have no import statements. You need at least:
import cv2 as cv
import numpy as np
2) Your indentation is incorrect. Your function nothing() should not be indented.
3) You omitted to call inRange(), you need:
mask = cv.inRange(hsv,l_b,u_b)
4) You have scaled the Hue into the range 0..255, when for uint8 images it actually has the range 0..179. OpenCV halves the hue so that 360 degrees comes out as 180, which fits below the 255 upper limit of uint8.
By the way, it is fairly poor practice to do "loop invariant" stuff inside a loop - I mean the part where you hit the disk every millisecond and re-read the image, re-decode the JPEG and convert it to HSV. All that can be done outside the loop, then inside it, just do a quick memory copy of the HSV image.
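Putting these points together, a restructured version might look like the sketch below. It is untested against your image and makes a few assumptions: the helper names `hsv_bounds` and `run_tracker` are made up, the hue trackbars are limited to 179, and `cv2`/`numpy` are imported inside the function so the helper can be used without OpenCV installed.

```python
TRACKBARS = (("LH", 0, 179), ("LS", 0, 255), ("LV", 0, 255),
             ("UH", 179, 179), ("US", 255, 255), ("UV", 255, 255))


def hsv_bounds(values):
    """Split six trackbar values into (lower, upper), clamped to 8-bit HSV ranges."""
    limits = (179, 255, 255)
    lower = tuple(min(v, m) for v, m in zip(values[:3], limits))
    upper = tuple(min(v, m) for v, m in zip(values[3:], limits))
    return lower, upper


def run_tracker(path):
    import cv2 as cv
    import numpy as np

    # Loop-invariant work: read, decode and convert the image once.
    frame = cv.imread(path)
    if frame is None:
        raise FileNotFoundError(path)
    hsv = cv.cvtColor(frame, cv.COLOR_BGR2HSV)

    cv.namedWindow("Tracking")
    for name, start, maximum in TRACKBARS:
        cv.createTrackbar(name, "Tracking", start, maximum, lambda x: None)

    while True:
        values = [cv.getTrackbarPos(name, "Tracking") for name, _, _ in TRACKBARS]
        lower, upper = hsv_bounds(values)
        mask = cv.inRange(hsv, np.array(lower), np.array(upper))
        res = cv.bitwise_and(frame, frame, mask=mask)
        cv.imshow("frame", frame)
        cv.imshow("mask", mask)
        cv.imshow("res", res)
        if cv.waitKey(1) == 27:  # Esc quits
            break
    cv.destroyAllWindows()
```

Call `run_tracker("path/to/your/image.jpg")` with your own file path to start the window.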

Metal Shading language for Core Image color kernel, how to pass an array of float3

I'm trying to port some CIFilter from this source by using metal shading language for Core Image.
I have a color palette made up of an array of RGB structs, and I want to pass it as an argument to a custom CI color kernel.
The RGB structs are converted into an array of SIMD3<Float>.
static func SIMD3Palette(_ palette: [RGB]) -> [SIMD3<Float>] {
return palette.map{$0.toFloat3()}
}
The kernel should take an array of simd_float3 values. The problem is that when I launch the filter, it tells me that the argument at index 1 is expected to be an NSData.
override var outputImage: CIImage? {
guard let inputImage = inputImage else
{
return nil
}
let palette = EightBitColorFilter.palettes[Int(inputPaletteIndex)]
let extent = inputImage.extent
let arguments = [inputImage, palette, Float(palette.count)] as [Any]
let final = colorKernel.apply(extent: extent, arguments: arguments)
return final
}
This is the kernel:
float4 eight_bit(sample_t image, simd_float3 palette[], float paletteSize, destination dest) {
float dist = distance(image.rgb, palette[0]);
float3 returnColor = palette[0];
for (int i = 1; i < floor(paletteSize); ++i) {
float tempDist = distance(image.rgb, palette[i]);
if (tempDist < dist) {
dist = tempDist;
returnColor = palette[i];
}
}
return float4(returnColor, 1);
}
I'm wondering how I can pass a data buffer to the kernel, since converting it into an NSData does not seem to be enough. I saw some examples, but they use the "full" Metal shading language, which is not available in Core Image; Core Image offers a sort of subset that deals only with fragments.
Update
We have now figured out how to pass data buffers directly into Core Image kernels. Using a CIImage as described below is not needed, but still possible.
Assuming that you have your raw data as an NSData, you can just pass it to the kernel on invocation:
kernel.apply(..., arguments: [data, ...])
Note: Data might also work, but I know that NSData is an argument type that allows Core Image to cache filter results based on input arguments. So when in doubt, better cast to NSData.
Then in the kernel function, you only need to declare the parameter with an appropriate constant type:
extern "C" float4 myKernel(constant float3 data[], ...) {
float3 data0 = data[0];
// ...
}
Previous Answer
Core Image kernels don't seem to support pointer or array parameter types, though there seems to be something coming with iOS 13. From the Release Notes:
Metal CIKernel instances support arguments with arbitrarily structured data.
But, as so often with Core Image, there seems to be no further documentation for that…
However, you can still use the "old way" of passing buffer data by wrapping it in a CIImage and sampling it in the kernel. For example:
let array: [Float] = [1.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0]
let data = array.withUnsafeBufferPointer { Data(buffer: $0) }
let dataImage = CIImage(bitmapData: data, bytesPerRow: data.count, size: CGSize(width: array.count/4, height: 1), format: .RGBAf, colorSpace: nil)
Note that there is no CIFormat for 3-channel images since the GPU doesn't support those. So you either have to use single-channel .Rf and re-pack the values inside your kernel to float3 again, or add some strides to your data and use .RGBAf and float4 respectively (which I'd recommend since it reduces texture fetches).
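The padding from three to four channels is easy to get wrong, so here is a language-neutral sketch of the byte layout in Python. It assumes little-endian 32-bit floats and fills the unused fourth channel with 1.0; the function name is made up for illustration.

```python
import struct


def pack_palette_rgba(palette):
    """Pack (r, g, b) triples as RGBA float32 pixels, one pixel per palette entry."""
    floats = []
    for r, g, b in palette:
        floats.extend((r, g, b, 1.0))  # pad each color to four channels
    return struct.pack(f"<{len(floats)}f", *floats)


palette = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
data = pack_palette_rgba(palette)
print(len(data))  # 2 pixels * 4 channels * 4 bytes
```

For a one-row data image built from such a buffer, the width is the number of palette entries and the bytes per row equal the total length of the data.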
When you pass that image into your kernel, you probably want to set the sampling mode to nearest, otherwise you might get interpolated values when sampling between two pixels:
kernel.apply(..., arguments: [dataImage.samplingNearest(), ...])
In your (Metal) kernel, you can access the data as you would with a normal input image via a sampler:
extern "C" float4 myKernel(coreimage::sampler data, ...) {
float4 data0 = data.sample(data.transform(float2(0.5, 0.5))); // data[0]
float4 data1 = data.sample(data.transform(float2(1.5, 0.5))); // data[1]
// ...
}
Note that I added 0.5 to the coordinates so that they point in the middle of a pixel in the data image to avoid ambiguity and interpolation.
Also note that pixel values you get from a sampler always have 4 channels. So even when you create your data image with format .Rf, you'll get a float4 when sampling it (the other values are filled with 0.0 for G and B and 1.0 for alpha). In this case, you can just do
float data0 = data.sample(data.transform(float2(0.5, 0.5))).x;
Edit
I previously forgot to transform the sample coordinate from absolute pixel space (where (0.5, 0.5) would be the middle of the first pixel) to relative sampler space (where (0.5, 0.5) would be the middle of the whole buffer). It's fixed now.
I made it work. Even though the answer was good and also deploys to lower targets, the result wasn't exactly what I was expecting. The differences between the original kernel written as a string and the above method of creating an image to be used as a source of data were fairly big.
I didn't work out the exact reason, but the image I was passing as the source of the palette differed from the created one in size and color (probably due to color spaces).
Since there was no documentation about this statement:
Metal CIKernel instances support arguments with arbitrarily structured data.
I tried a lot in my spare time and came up with this.
First the shader:
float4 eight_bit_buffer(sampler image, constant simd_float3 palette[], float paletteSize, destination dest) {
float4 color = image.sample(image.transform(dest.coord()));
float dist = distance(color.rgb, palette[0]);
float3 returnColor = palette[0];
for (int i = 1; i < floor(paletteSize); ++i) {
float tempDist = distance(color.rgb, palette[i]);
if (tempDist < dist) {
dist = tempDist;
returnColor = palette[i];
}
}
return float4(returnColor, 1);
}
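To check the shader's output on the CPU, the same nearest-color search can be written in a few lines of Python. This is only a reference sketch of the loop above, with colors as (r, g, b) tuples and a made-up function name.

```python
import math


def nearest_palette_color(color, palette):
    """Return the palette entry with the smallest Euclidean distance to color."""
    best = palette[0]
    best_dist = math.dist(color, best)
    for candidate in palette[1:]:
        dist = math.dist(color, candidate)
        if dist < best_dist:  # strict comparison: the first of equal matches wins
            best, best_dist = candidate, dist
    return best


palette = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (1.0, 1.0, 1.0)]
print(nearest_palette_color((0.9, 0.1, 0.1), palette))
```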
Second the palette transformation into SIMD3<Float>:
static func toSIMD3Buffer(from palette: [RGB]) -> Data {
    var simd3Palette = SIMD3Palette(palette)
    // Use the stride of SIMD3<Float> so the byte count matches the allocation below.
    let byteCount = simd3Palette.count * MemoryLayout<SIMD3<Float>>.stride
    let palettePointer = UnsafeMutableRawPointer.allocate(
        byteCount: byteCount,
        alignment: MemoryLayout<SIMD3<Float>>.alignment)
    let simd3Pointer = simd3Palette.withUnsafeMutableBufferPointer { (buffer) -> UnsafeMutablePointer<SIMD3<Float>> in
        // Copy the palette into the newly allocated block.
        return palettePointer.initializeMemory(as: SIMD3<Float>.self,
                                               from: buffer.baseAddress!,
                                               count: buffer.count)
    }
    // Wrap exactly the allocated bytes; the deallocator releases them with the Data object.
    return Data(bytesNoCopy: simd3Pointer, count: byteCount, deallocator: .free)
}
My first attempt, appending the SIMD3 values to the Data object one by one, did not work, probably because of memory alignment.
Remember that the allocated memory has to be released once you are done with it (here the Data deallocator takes care of that).
I hope this helps someone else.

How to add texture (image) to SceneKit model so that it covers the model (mesh) uniformly?

Just started playing around with SceneKit & ARKit. One issue I'm having is covering the 3DModel with an image texture. What is happening is that part of the 3DModel when rendered in ARKit uses the texture correctly. The other parts are not.
For example I'm using a sofa 3DModel at scale roughly (80 inches W x 36 inches H x 36 inches D). I'm going to cover the mesh with a swatch fabric image texture that measures 290 x 290 pixels.
The results that I'm seeing are the following:
In my code:
if let matts = child.geometry?.materials {
var matIndex = 0
for mat in matts {
let material = SCNMaterial()
material.diffuse.contents = UIImage(named: "fabric-coral")
material.isDoubleSided = true
child.geometry?.replaceMaterial(at: matIndex, with: material)
matIndex += 1
}
}
My question is how to take the UIImage and have it repeat and cover the mesh without the weird square pattern that it is doing now?
In this case your UV texture coordinates must extend outside the 0.0 - 1.0 range, which is fine, but it would explain what you are seeing. The default texture wrapping behaviour of SCNMaterialProperty is CLAMP, that means it takes the pixel values at the edge of your texture and uses this to 'fill in' the rest of the model (parts with UV coords outside 0.0 - 1.0 range).
Changing the wrapT and wrapS properties should fix this.
. . .
material.diffuse.contents = UIImage(named: "fabric-coral")
material.diffuse.wrapT = SCNWrapMode.repeat
material.diffuse.wrapS = SCNWrapMode.repeat
material.isDoubleSided = true
. . .
The SCNWrapMode documentation includes a figure showing the effect of each mode, including one that looks close to what you are currently observing.
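The difference between the two modes comes down to how a texture coordinate outside 0.0 - 1.0 is mapped back into the texture. A minimal Python sketch of that arithmetic (the function names are made up; this is not SceneKit code):

```python
import math


def wrap_clamp(u):
    """Clamp: coordinates outside 0..1 stick to the nearest texture edge."""
    return min(1.0, max(0.0, u))


def wrap_repeat(u):
    """Repeat: keep only the fractional part, so the texture tiles."""
    return u - math.floor(u)


for u in (0.25, 1.5, 2.25, -0.25):
    print(u, wrap_clamp(u), wrap_repeat(u))
```

With clamping, every coordinate above 1.0 samples the same edge pixel, which produces the stretched streaks; with repeat, the same coordinates cycle through the whole texture again.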

Resources