How to superimpose views over each captured frame inside CVImageBuffer, realtime not post process - ios

I have managed to set up a basic AVCaptureSession which records a video and saves it on the device using AVCaptureFileOutputRecordingDelegate. I have been searching through the docs to understand how to add statistics overlays on top of the video that is being recorded.
As you can see in the image above, I have multiple overlays on top of the video preview layer. Now, when I save my video output, I would like to compose those views onto the video as well.
What have I tried so far?
Honestly, I have just been jumping around the internet trying to find a reputable blog explaining how one would do this, but have failed to find one.
I have read in a few places that one could render text overlays, as described in the following post, by creating a CALayer and adding it as a sublayer.
But what if I want to render a MapView on top of the video being recorded? Also, I am not looking for screen capture. Some of the content on the screen will not be part of the final recording, so I want to be able to cherry-pick the views that will be composed.
What am I looking for?
Direction.
Not a straight-up solution.
Documentation links and class names I should read more about to build this.
Progress So Far:
I have managed to understand that I need to get hold of the CVImageBuffer from the CMSampleBuffer and draw text over it. It is still unclear to me whether it is possible to somehow overlay a MapView over the video that is being recorded.

The best way to achieve your goal is to use the Metal framework. Using a Metal camera is good for minimising the impact on the device's limited computational resources. If you are trying to achieve the lowest-overhead access to the camera sensor, using an AVCaptureSession would be a really good start.
You need to grab each frame's data from the CMSampleBuffer (you're right) and then convert the frame to an MTLTexture. AVCaptureSession will continuously send us frames from the device's camera via a delegate callback.
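For reference, here is a minimal sketch (the class name is illustrative, not from the linked posts) of the delegate callback through which an AVCaptureVideoDataOutput delivers those frames:

import AVFoundation

// Hypothetical receiver; set an instance as the sampleBufferDelegate of your AVCaptureVideoDataOutput.
final class FrameReceiver: NSObject, AVCaptureVideoDataOutputSampleBufferDelegate {
    func captureOutput(_ output: AVCaptureOutput,
                       didOutput sampleBuffer: CMSampleBuffer,
                       from connection: AVCaptureConnection) {
        // Each captured frame arrives here; convert `sampleBuffer` to an MTLTexture
        // as shown in the excerpt below, then composite and encode it.
    }
}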
All available overlays must be converted to MTLTextures too. Then you can composite all the MTLTexture layers with an over operation (a sketch of that compositing step follows the code excerpt below).
So, here you'll find all necessary info in four-part Metal Camera series.
And here's a link to a blog: About Compositing in Metal.
Also, I'd like to include a code excerpt (working with AVCaptureSession in Metal):
import AVFoundation
import CoreVideo
import Metal

enum MetalCameraSessionError: Error {
    case missingSampleBuffer
    case failedToCreateTextureCache
    case failedToCreateTextureFromImage
}

// Texture cache for converting frame images to textures
var textureCache: CVMetalTextureCache?

// `MTLDevice` for initializing the texture cache
let metalDevice = MTLCreateSystemDefaultDevice()

// Converts one captured frame into an `MTLTexture`.
// Assumes a single-plane BGRA pixel buffer (`.bgra8Unorm`, plane index 0).
func makeTexture(from sampleBuffer: CMSampleBuffer,
                 pixelFormat: MTLPixelFormat = .bgra8Unorm,
                 planeIndex: Int = 0) throws -> MTLTexture {
    guard let imageBuffer = CMSampleBufferGetImageBuffer(sampleBuffer) else {
        throw MetalCameraSessionError.missingSampleBuffer
    }
    if textureCache == nil {
        guard let metalDevice = metalDevice,
              CVMetalTextureCacheCreate(kCFAllocatorDefault, nil, metalDevice, nil, &textureCache) == kCVReturnSuccess else {
            throw MetalCameraSessionError.failedToCreateTextureCache
        }
    }
    let width = CVPixelBufferGetWidth(imageBuffer)
    let height = CVPixelBufferGetHeight(imageBuffer)
    var imageTexture: CVMetalTexture?
    let result = CVMetalTextureCacheCreateTextureFromImage(
        kCFAllocatorDefault, textureCache!, imageBuffer, nil,
        pixelFormat, width, height, planeIndex, &imageTexture)
    // The `MTLTexture` ends up in the `texture` variable below.
    guard result == kCVReturnSuccess,
          let unwrappedImageTexture = imageTexture,
          let texture = CVMetalTextureGetTexture(unwrappedImageTexture) else {
        throw MetalCameraSessionError.failedToCreateTextureFromImage
    }
    return texture
}
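From there, one way to perform the over composite is with a Metal-backed Core Image context rather than a hand-written blend shader. This is only a rough sketch, not code from the linked posts; it assumes the overlay texture has premultiplied alpha and that you write the result back into a pixel buffer for the encoder:

import CoreImage
import CoreVideo
import Metal

// Composites `overlay` over `background` (source-over) and renders the result
// into `outputBuffer`. In a real renderer, create the CIContext once and reuse it.
func composite(overlay: MTLTexture,
               over background: MTLTexture,
               device: MTLDevice,
               into outputBuffer: CVPixelBuffer) {
    let context = CIContext(mtlDevice: device)
    guard
        let overlayImage = CIImage(mtlTexture: overlay, options: nil),
        let backgroundImage = CIImage(mtlTexture: background, options: nil)
    else { return }
    context.render(overlayImage.composited(over: backgroundImage), to: outputBuffer)
}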
And here you can find the final project on GitHub: MetalRenderCamera

Related

Smooth Shading in SceneKit

I'm currently working on an iPadOS app that uses SceneKit to render some 3D models, nothing too fancy, but I hit a bit of a snag when it comes to shading these models...
Basically what I do is just set up a SceneKit scene (using pretty much all the default settings, I don't even have a camera) and instantiate the 3D objects from some .obj files I have, like so.
let url = <URL of my .obj file>
let asset = MDLAsset(url: url)
let object = asset.object(at: 0)
let node = SCNNode(mdlObject: object)
let texture = UIImage(data: try Data(contentsOf: <URL of the texture>))
let material = SCNMaterial()
material.diffuse.contents = texture
node.geometry!.materials = [material]
self.sceneView.scene!.rootNode.addChildNode(node)
The textures are loaded manually because, unfortunately, that's how the client set up the files for me to use. The code works fine and I can see the mesh and its texture; however, it also looks like this:
As you can see, the shading is not smooth at all... and I have no idea how to fix it.
The client has been bothering me to implement Phong shading, and according to Apple's Documentation this is how you do it.
material.lightingModel = .phong
Unfortunately that's still what it looks like with Phong enabled. I'm an absolute beginner when it comes to 3D rendering so this might be laughably easy but I swear I cannot figure out how to get a smoother shading on this model.
I have tried looking left and right, and the only thing that has had any kind of noticeable result was using subdivisionLevel to increase the number of faces in the geometry, but this does not scale well: the actual app needs to load a ton of these meshes, and it runs out of memory fast even with subdivisionLevel set to just 1.
Surely there must be a way to smooth that shading without refining the actual geometry?
Thanks in advance!
Shading requires having correct normals for your geometry. Have you tried using Model IO to generate them?
https://developer.apple.com/documentation/modelio/mdlmesh/1391284-addnormals
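For example, here is a minimal sketch of that suggestion built on the loading code from the question (the crease threshold is a value between 0.0 and 1.0 to experiment with; it controls how sharp edges are preserved when the normals are averaged):

import ModelIO
import SceneKit
import SceneKit.ModelIO

// Regenerate smooth vertex normals with Model I/O before building the SCNNode.
let asset = MDLAsset(url: url) // url = <URL of the .obj file>
if let mesh = asset.object(at: 0) as? MDLMesh {
    mesh.addNormals(withAttributeNamed: MDLVertexAttributeNormal, creaseThreshold: 0.5)
    let node = SCNNode(mdlObject: mesh)
    node.geometry?.firstMaterial?.lightingModel = .phong
    sceneView.scene?.rootNode.addChildNode(node)
}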

Retrieve the last frame of live camera preview in swift

I have an AR app where the view is constantly showing what the back camera is seeing and sending each frame for analysis to VisionRequest.
When an object is identified, I would like to capture that particular last frame, save it as a regular UIImage, and send it down the segue chain to the final view controller, where I display that last frame. I am having issues capturing that last frame and showing it.
Here is what I tried so far:
When the image is recognized with a high-enough confidence, I attempt to retrieve the current last frame from the CVPixelBuffer and save it in a local variable that is later passed in a segue to subsequent view controllers.
Is this the correct way of doing it? or do I have to add a second output to the session (a photo output in addition to a video data output) ?
//attempting to get the current last frame of captured video
let attachments = CMCopyDictionaryOfAttachments(allocator: kCFAllocatorDefault, target: self.currentlyAnalyzedPixelBuffer!, attachmentMode: kCMAttachmentMode_ShouldPropagate)
let ciImage = CIImage(cvImageBuffer: self.currentlyAnalyzedPixelBuffer!, options: attachments as? [CIImageOption : Any])
self.image = UIImage(ciImage: ciImage)
Actually, there is a good chance you won't get exactly the output you need, because you never know whether the last captured frame is the one you wanted. You can get false results, for example when the camera is in motion and the frame you got is blurred or otherwise not what you need.
Maybe I am wrong about this, but my suggestion would be to keep an array of the last 10 frames or pixel buffers. When Vision identifies your object, check that array again and pick the highest-quality (highest-confidence) frame, or show the user a collection view so they can choose the correct image.
Hope it helps.
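A minimal sketch of the array idea above (names are illustrative). Note that pixel buffers from the capture pool are a scarce resource, so retaining too many of them can cause dropped frames; copying the buffers, or storing downscaled images instead, may be safer:

import CoreVideo

// Keeps the most recent N pixel buffers. Append from
// captureOutput(_:didOutput:from:) and pick a frame once Vision reports a match.
final class RecentFrameStore {
    private let capacity = 10
    private var frames: [CVPixelBuffer] = []

    func append(_ pixelBuffer: CVPixelBuffer) {
        frames.append(pixelBuffer)
        if frames.count > capacity {
            frames.removeFirst()
        }
    }

    var latest: CVPixelBuffer? { frames.last }
    var all: [CVPixelBuffer] { frames }
}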
The current last frame may not be the one that triggered the successful image recognition, so you may want to hold on to the pixel buffer that triggered it.
Then you can get the UIImage from the pixelBuffer like so:
import UIKit
import VideoToolbox

var cgImage: CGImage?
VTCreateCGImageFromCVPixelBuffer(matchingPixelBuffer, options: nil, imageOut: &cgImage)
// The call can fail, so `cgImage` may still be nil; unwrap before use.
let uiImage = cgImage.map { UIImage(cgImage: $0) }

How to add overlay on AVCaptureVideoPreviewLayer?

I am building an iOS app using Swift which requires QR code scanner functionality.
I have implemented a QR code scanner using AVFoundation; right now my capture screen looks the same as a video recording screen, i.e. the AVCaptureVideoPreviewLayer shows what is being captured by the camera.
But since it is a QR code scanner and not a regular image or video capture, I would like my VideoPreviewLayer to look like this:
I understand this can be achieved by adding another VideoPreviewLayer on top of one VideoPreviewLayer.
My questions are:
How do I add the borders only to the edges in the upper (or smaller) preview layer?
How do I change the brightness level for the VideoPreviewLayer in the background?
How do I ignore media captured by the background layer?
You shouldn't use another VideoPreviewLayer. Instead, add two sublayers: one for the masked background area and one for the corners.
Have a look at the source code in this repo for an example.
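A minimal sketch of the two-sublayer idea, assuming `view` is the view hosting the preview layer and `scanRect` is the scan area in that view's coordinates (the corner decoration is reduced to a plain border here):

import UIKit

func addScannerOverlay(to view: UIView, scanRect: CGRect) {
    // 1. Dimmed background with a transparent cutout over the scan area.
    let maskLayer = CAShapeLayer()
    let path = UIBezierPath(rect: view.bounds)
    path.append(UIBezierPath(rect: scanRect).reversing())
    maskLayer.path = path.cgPath
    maskLayer.fillColor = UIColor.black.withAlphaComponent(0.5).cgColor
    view.layer.addSublayer(maskLayer)

    // 2. Border around the scan area (draw a fancier path for corner brackets).
    let borderLayer = CAShapeLayer()
    borderLayer.path = UIBezierPath(rect: scanRect).cgPath
    borderLayer.strokeColor = UIColor.white.cgColor
    borderLayer.fillColor = UIColor.clear.cgColor
    borderLayer.lineWidth = 2
    view.layer.addSublayer(borderLayer)
}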
To limit the video capturing to the masked area, you have to set the rectOfInterest of your AVCaptureMetadataOutput:
// `rect` is the scan area in the preview layer's coordinate space.
let rectOfInterest = videoPreviewLayer.metadataOutputRectConverted(fromLayerRect: rect)
metadataOutput.rectOfInterest = rectOfInterest
Long story short: you can use AVCaptureVideoPreviewLayer for video capturing, create another CALayer(), and use layer.insertSublayer(..., above: ...) to insert your "custom" layer above the video layer. By custom I mean just yet another CALayer with, say,
layer.contents = spinner.cgImage
Here are some more detailed instructions.
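A minimal sketch of that suggestion (names are illustrative; `previewLayer` is the AVCaptureVideoPreviewLayer already attached to `view.layer`):

import UIKit

let overlayLayer = CALayer()
overlayLayer.frame = view.bounds
overlayLayer.contents = UIImage(named: "spinner")?.cgImage
view.layer.insertSublayer(overlayLayer, above: previewLayer)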

Capturing a preview image with AVCaptureStillImageOutput

Before stackoverflow members answer with "You shouldn't. It's a privacy violation" let me counter with why there is a legitimate need for this.
I have a scenario where a user can change the camera device by swiping left and right. In order to make this animation not look like absolute crap, I need to grab a freeze frame before starting the animation.
The only sane answer I have seen is capturing the buffer of AVCaptureVideoDataOutput, which is fine, but now I can't let the user take the video/photo with kCVPixelFormatType_420YpCbCr8BiPlanarFullRange, which is a nightmare to get a CGImage from with CGBitmapContextCreate. See How to convert a kCVPixelFormatType_420YpCbCr8BiPlanarFullRange buffer to UIImage in iOS.
When capturing a still photo, are there any serious quality considerations with using AVCaptureVideoDataOutput instead of AVCaptureStillImageOutput, given that the user will be taking both video and still photos (not just freeze-frame preview stills)? Also, can someone "explain it to me like I'm five" on the differences between kCVPixelFormatType_420YpCbCr8BiPlanarFullRange and kCVPixelFormatType_32BGRA, besides the fact that one doesn't work on old hardware?
I don't think there is a way to directly capture a preview image using AVFoundation. You could, however, capture the preview layer by doing the following:
UIGraphicsBeginImageContext(previewView.frame.size);
[previewLayer renderInContext:UIGraphicsGetCurrentContext()];
UIImage *image = UIGraphicsGetImageFromCurrentImageContext();
UIGraphicsEndImageContext();
where previewLayer is the AVCaptureVideoPreviewLayer added as a sublayer of previewView.layer. "image" is rendered from this layer and can be used for your animation.
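The same idea in Swift, as a rough sketch (it assumes `previewView` hosts `previewLayer`):

import UIKit

let renderer = UIGraphicsImageRenderer(bounds: previewView.bounds)
let freezeFrame = renderer.image { context in
    previewLayer.render(in: context.cgContext)
}
// `freezeFrame` can now back the camera-switch animation.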

How do you get an UIImage from an AVCaptureVideoPreviewLayer?

I have already tried this solution CGImage (or UIImage) from a CALayer
However I do not get anything.
As the question says, I am trying to get a UIImage from the preview layer of the camera. I know I can either capture a still image or use the output sample buffer, but my session's video quality is set to photo, so either of these two approaches is slow and will give me a big image.
So what I thought could work is to get the image directly from the preview layer, since it has exactly the size I need and the necessary operations have already been applied to it. I just don't know how to get this layer to draw into my context so that I can get it as a UIImage.
Perhaps another solution would be to use OpenGL to get this layer directly as a texture?
Any help will be appreciated, thanks.
Quoting Apple from this Technical Q&A:
A: Starting from iOS 7, the UIView class provides a method -drawViewHierarchyInRect:afterScreenUpdates:, which lets you render a snapshot of the complete view hierarchy as visible onscreen into a bitmap context. On iOS 6 and earlier, how to capture a view's drawing contents depends on the underlying drawing technique. This new method -drawViewHierarchyInRect:afterScreenUpdates: enables you to capture the contents of the receiver view and its subviews to an image regardless of the drawing techniques (for example UIKit, Quartz, OpenGL ES, SpriteKit, AV Foundation, etc) in which the views are rendered.
In my experience with AVFoundation it doesn't work like that: if you use that method on a view that hosts a preview layer, you will only obtain the content of the view without the image of the preview layer. Using -snapshotViewAfterScreenUpdates: will return a UIView that hosts a special layer, and if you try to make an image from that view you won't see anything.
The only solutions I know of are AVCaptureVideoDataOutput and AVCaptureStillImageOutput. Each one has its own limitations: the first can't work simultaneously with an AVCaptureMovieFileOutput acquisition, and the latter plays the shutter sound.
