Filtering a video stream using GPUImage2 - iOS

I have access to the CVPixelBufferRef for each frame and want to apply the ChromaKey filter to it before rendering it.
So far, the only solution I can think of is to first convert the pixel buffer to an image. Here is my bare-bones solution, just as a proof of concept.
import VideoToolbox

var cgImage: CGImage?
VTCreateCGImageFromCVPixelBuffer(pixelBuffer, options: nil, imageOut: &cgImage)
let image = UIImage(cgImage: cgImage!).filterWithOperation(filter!)
Once I get the filtered image, I pass it to an MTKView to draw.
So my specific question is, can I avoid converting the pixel buffer to an image and still use GPUImage2 for the filter?
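One direction that might avoid the round trip, sketched here without having verified it against the current GPUImage2 API: copy the pixel buffer's raw bytes into a RawDataInput and chain it to the filter and a RenderView. The uploadBytes signature, pixel format, and row-padding handling below are assumptions and may need adjusting:
import GPUImage
import CoreVideo

// Assumed pipeline, built once: rawInput --> filter --> renderView
let rawInput = RawDataInput()

func process(_ pixelBuffer: CVPixelBuffer) {
    CVPixelBufferLockBaseAddress(pixelBuffer, .readOnly)
    defer { CVPixelBufferUnlockBaseAddress(pixelBuffer, .readOnly) }

    guard let base = CVPixelBufferGetBaseAddress(pixelBuffer) else { return }
    // Assumes a BGRA buffer with no row padding; otherwise copy row by row.
    let count = CVPixelBufferGetBytesPerRow(pixelBuffer) * CVPixelBufferGetHeight(pixelBuffer)
    let bytes = [UInt8](UnsafeBufferPointer(start: base.assumingMemoryBound(to: UInt8.self), count: count))

    rawInput.uploadBytes(bytes,
                         size: Size(width: Float(CVPixelBufferGetWidth(pixelBuffer)),
                                    height: Float(CVPixelBufferGetHeight(pixelBuffer))),
                         pixelFormat: .bgra)
}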

Related

Core Image workingColorSpace & outputColorSpace

I am rendering video frames using Metal Core Image shaders. One of the requirements I have is to be able to pick a particular color (and a user-selected nearby range) from the CIImage, keep that color in the output, and turn everything else black and white (a color splash effect). But I am confused about the right approach that would work for videos shot in all kinds of color spaces (including 10-bit HDR):
The first job is to extract the color value from the CIImage at any given pixel location. From my understanding, this can be done using the following API:
func render(_ image: CIImage,
            toBitmap data: UnsafeMutableRawPointer,
            rowBytes: Int,
            bounds: CGRect,
            format: CIFormat,
            colorSpace: CGColorSpace?)
The documentation says that passing NULL for colorSpace causes the output to be in the CIContext's outputColorSpace. It's not clear how to correctly use this API to extract the exact color at a given pixel location, given that the input image may be either 8-bit or 10-bit.
Having extracted the value, the next issue is how to pass it to the Metal Core Image shader. Shaders use normalized color ranges that depend on the workingColorSpace of the CIContext. Do I need to create a 1D texture with the color to pass to the shader, or is there a better way?
Based on your comment, here is another alternative:
You can read the pixel value as floats using the context's working color space. By using float values, you ensure that the bit depth of the input doesn't matter and that extended color values are correctly represented.
So for instance, a 100% red in BT.2020 would result in an extended sRGB value of (1.2483, -0.3880, -0.1434).
To read the value, you could use our small helper library CoreImageExtensions (or check out the implementation to see how to use render to get float values):
let pixelColor = context.readFloat32PixelValue(from: image, at: coordinate, colorSpace: context.workingColorSpace)
// you can convert that to CIVector, which can be passed to a kernel
let vectorValue = CIVector(x: pixelColor.r, y: pixelColor.g, ...)
In your Metal kernel, you can use a float4 input parameter for that color.
You can store and use the color value on later rendering calls as long as you are using the same workingColorSpace for the context.
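For example, a hypothetical kernel call could look like this (splashKernel is an assumed, already-compiled CIColorKernel whose Metal function takes the image and a float4 color; it is not part of the library above):
// Hypothetical usage; `splashKernel` and `inputImage` are assumptions.
let output = splashKernel.apply(extent: inputImage.extent,
                                arguments: [inputImage, vectorValue])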
I think you can achieve that without worrying about color spaces and even without the intermediate rendering step (which should improve performance a lot).
You can simply crop your image to a 1x1 px square that contains the specific color and then extend it virtually infinitely in all directions.
You can then pass that image into your next kernel and sample it anywhere to retrieve the color value (in the same color space as before).
let pixelCoordinate: CGPoint // the coordinate of the pixel that contains the color
// crop down to a single pixel
let colorPixel = inputImage.cropped(to: CGRect(origin: pixelCoordinate, size: CGSize(width: 1, height: 1)))
// make the pixel extent infinite
let colorImage = colorPixel.clampedToExtent()
// simply pass it to your kernel
myKernel.apply(..., arguments: [colorImage, ...])
In the Metal kernel code, you can simply access it via a sampler (or sample_t in a color kernel) and sample it like this:
// you can sample at any coord since the image contains the single color everywhere
float4 pickedColor = colorImage.sample(colorImage.coord());
To read the "original" color values from a CIImage, the CIContext used to render the pixels to the bitmap needs to be created with both workingColorSpace and outputColorSpace set to NSNull(). Then there will be no color space conversion and you don't have to worry about color spaces:
let context = CIContext(options: [.workingColorSpace: NSNull(), .outputColorSpace: NSNull()])
Then, when rendering the pixels to a bitmap, specify the highest-precision color format, CIFormat.RGBAf, to make sure you are not clipping any values, and use nil for the colorSpace parameter. You will get 4 Float32 values per pixel, which can be passed to the shader in a CIVector as suggested by the first answer.
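For example, a minimal sketch of that render call (context is the NSNull-color-space context from above; image and pixelCoordinate are assumed to exist):
var bitmap = [Float32](repeating: 0, count: 4)
bitmap.withUnsafeMutableBytes { ptr in
    context.render(image,
                   toBitmap: ptr.baseAddress!,
                   rowBytes: MemoryLayout<Float32>.size * 4,
                   bounds: CGRect(x: pixelCoordinate.x, y: pixelCoordinate.y, width: 1, height: 1),
                   format: .RGBAf,
                   colorSpace: nil) // nil: no color space conversion
}
// bitmap now holds the R, G, B, A components as Float32 values
let colorVector = CIVector(x: CGFloat(bitmap[0]), y: CGFloat(bitmap[1]),
                           z: CGFloat(bitmap[2]), w: CGFloat(bitmap[3]))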
But here is another thing you can do, borrowing the cropping and clamping idea from the second answer:
1. Create an infinite image that contains only the selected color, using the approach suggested in that answer.
2. Crop that image to the frame's extent.
3. Use CIColorAbsoluteDifference, where one input is the original frame and the other is this uniform color image.
The output of that filter will be exactly black for every pixel that matches the selected color, and non-black everywhere else, since the filter computes the absolute difference between the colors and only pixels of exactly the same color produce a (0,0,0) output.
Pass that image to the shader. If the color sampled from the image is exactly 0 in all of its color components (ignoring alpha), copy the input pixel to the output unchanged. Otherwise, set it to whatever you need (black, white, etc.).
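A hedged sketch of those three steps (frame and pickedColorPixel are assumed names; CIColorAbsoluteDifference requires iOS 14 or later):
// `pickedColorPixel` is the 1x1 image containing the selected color (see the second answer).
let uniformColor = pickedColorPixel
    .clampedToExtent()          // 1. infinite image of the selected color
    .cropped(to: frame.extent)  // 2. crop to the frame's extent

// 3. absolute difference: only pixels matching the selected color become (0,0,0)
let difference = CIFilter(name: "CIColorAbsoluteDifference",
                          parameters: [kCIInputImageKey: frame,
                                       "inputImage2": uniformColor])?.outputImage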

Retrieve the last frame of live camera preview in swift

I have an AR app where the view is constantly showing what the back camera is seeing and sending each frame for analysis to VisionRequest.
When an object is identified, I would like to capture that particular last frame, save it as a regular UIImage, and send it down the segue chain to the final view controller, where I display that last frame. I'm having trouble capturing that last frame and showing it.
Here is what I tried so far:
When the image is recognized with a high-enough confidence, I attempt to retrieve the current last frame from the CVPixelBuffer and save it in a local variable that is later passed in a segue to subsequent view controllers.
Is this the correct way of doing it, or do I have to add a second output to the session (a photo output in addition to the video data output)?
//attempting to get the current last frame of captured video
let attachments = CMCopyDictionaryOfAttachments(allocator: kCFAllocatorDefault, target: self.currentlyAnalyzedPixelBuffer!, attachmentMode: kCMAttachmentMode_ShouldPropagate)
let ciImage = CIImage(cvImageBuffer: self.currentlyAnalyzedPixelBuffer!, options: attachments as? [CIImageOption : Any])
self.image = UIImage(ciImage: ciImage)
In practice, chances are you won't get exactly the output you need, because you never know whether the last captured frame is really the one you want. For example, the camera may be in motion and the frame you get may be blurred or otherwise unsuitable.
I may be wrong, but my suggestion would be to keep an array of the last 10 frames or pixel buffers. When Vision identifies your object, go through that array and pick the highest-quality (highest-confidence) frame, or show the user a collection view so they can choose the correct image.
Hope this helps.
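A minimal sketch of that rolling buffer (the names are illustrative only):
// Note: retaining pixel buffers from a capture session can exhaust its buffer pool,
// so consider copying the frames you keep or limiting the count aggressively.
var recentFrames: [CVPixelBuffer] = []
let maxStoredFrames = 10

func remember(_ pixelBuffer: CVPixelBuffer) {
    recentFrames.append(pixelBuffer)
    if recentFrames.count > maxStoredFrames {
        recentFrames.removeFirst()
    }
}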
The current last frame may not be the one that triggered the successful image recognition, so you may want to hold on to the pixel buffer that triggered it.
Then you can get the UIImage from the pixelBuffer like so:
import VideoToolbox
var cgImage: CGImage?
VTCreateCGImageFromCVPixelBuffer(matchingPixelBuffer, options: nil, imageOut: &cgImage)
let uiImage = UIImage(cgImage: cgImage!)

CGImage from CAMetalLayer

I've set up a custom iOS UIView that displays a list of layers, one of which is a CAMetalLayer. I was trying to paint that same content to an image in a CGContext. I tried to extract the layer's contents in order to build a CGImage, but couldn't find a good path.
First, I tried to asynchronously extract the contents of the framebuffer with an addCompletedHandler() callback, but the async nature of the callback didn't play nicely with CGContext.
I've also tried extracting the contents of the CAMetalLayer via the layer.contents property, but apparently the type of the contents is CAImageQueue, which is undocumented and not exposed through the public API.
I've also tried rendering the layer directly to the CGContext like so:
layer.render(in: cgContext)
but that didn't yield any results either.
Ultimately, all I'd need is to retrieve the bytes making up the layer's texture; then I could build my own CGImage from scratch.
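Assuming you can keep a reference to the drawable's MTLTexture (with metalLayer.framebufferOnly set to false) and read it back after rendering completes, a sketch along these lines builds a CGImage from the texture bytes (BGRA8 assumed):
import Metal
import CoreGraphics

func makeCGImage(from texture: MTLTexture) -> CGImage? {
    let width = texture.width
    let height = texture.height
    let bytesPerRow = width * 4
    var bytes = [UInt8](repeating: 0, count: bytesPerRow * height)
    texture.getBytes(&bytes,
                     bytesPerRow: bytesPerRow,
                     from: MTLRegionMake2D(0, 0, width, height),
                     mipmapLevel: 0)

    // BGRA little-endian with premultiplied alpha
    let bitmapInfo = CGBitmapInfo.byteOrder32Little.rawValue
        | CGImageAlphaInfo.premultipliedFirst.rawValue
    return bytes.withUnsafeMutableBytes { ptr in
        CGContext(data: ptr.baseAddress,
                  width: width,
                  height: height,
                  bitsPerComponent: 8,
                  bytesPerRow: bytesPerRow,
                  space: CGColorSpaceCreateDeviceRGB(),
                  bitmapInfo: bitmapInfo)?.makeImage()
    }
}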

iOS Swift - How to draw simple 2d Image with metal?

In my project I need to draw a 2D image in real time, following UIGestureRecognizer updates.
The image is the same UIImage, drawn at various positions.
let arrayOfPositions = [pos1,pos2,pos3]
And I need to transfer the resulting image on the Metal layer into a single UIImage; the result will have the same size as the device's screen.
something similar to
let resultImage = UIGraphicsGetImageFromCurrentImageContext()
I'm new to Metal, and after watching Realm's video and reading Apple's documentation, I'm thoroughly confused. Most tutorials focus on 3D rendering, which is beyond my needs (and my knowledge).
Could anyone show me, with simple code, how to draw a UIImage into a Metal layer and then convert the whole thing into a single UIImage as the result? Thanks.
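If the end result really is just a flat, screen-sized 2D composite, one option that sidesteps Metal entirely is UIGraphicsImageRenderer; a sketch under that assumption (image and arrayOfPositions as in the question, with the positions as CGPoints):
import UIKit

let renderer = UIGraphicsImageRenderer(bounds: UIScreen.main.bounds)
let resultImage = renderer.image { _ in
    // draw the same UIImage at every position
    for position in arrayOfPositions {
        image.draw(at: position)
    }
}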

How do I convert a CVPixelBuffer / CVImageBuffer to Data?

My camera app captures a photo, enhances it in a certain way, and saves it.
To do so, I get the input image from the camera in the form of a CVPixelBuffer (wrapped in a CMSampleBuffer). I perform some modifications on the pixel buffer, and I then want to convert it to a Data object. How do I do this?
Note that I don't want to convert the pixel buffer / image buffer to a UIImage or CGImage since those don't have metadata (like EXIF). I need a Data object. How do I get one from a CVPixelBuffer / CVImageBuffer?
PS: I tried calling AVCapturePhotoOutput.jpegPhotoDataRepresentation() but that fails saying "Not a JPEG sample buffer". Which makes sense since the CMSampleBuffer contains a pixel buffer (a bitmap), not a JPEG.
Since you say you are able to get the CMSampleBuffer, you can get the data using:
NSData *myData = [AVCaptureStillImageOutput jpegStillImageNSDataRepresentation:<your_cmsample_buffer_obj>];
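If what's actually needed is the raw, unencoded bitmap bytes of the CVPixelBuffer rather than JPEG data, a minimal sketch looks like this (note that, like any bitmap copy, it carries no EXIF metadata):
import CoreVideo

func data(from pixelBuffer: CVPixelBuffer) -> Data? {
    CVPixelBufferLockBaseAddress(pixelBuffer, .readOnly)
    defer { CVPixelBufferUnlockBaseAddress(pixelBuffer, .readOnly) }

    guard let baseAddress = CVPixelBufferGetBaseAddress(pixelBuffer) else { return nil }
    let byteCount = CVPixelBufferGetBytesPerRow(pixelBuffer) * CVPixelBufferGetHeight(pixelBuffer)
    return Data(bytes: baseAddress, count: byteCount)
}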

Resources