I am trying to develop an image segmentation app that processes the live camera feed through my Core ML model. However, I see some slowness in the output: the camera view with the masked prediction lags. Below is my vision manager class that runs the prediction on the pixel buffer, and the function that calls this class to convert the output to colors before proceeding to the camera output. Has anyone faced this issue before? Do you see an error in my code that causes the slowness?
Vision Manager Class:
class VisionManager: NSObject {
    static let shared = VisionManager()
    static let MODEL = ba_224_segm().model

    private lazy var predictionRequest: VNCoreMLRequest = {
        do {
            let model = try VNCoreMLModel(for: VisionManager.MODEL)
            let request = VNCoreMLRequest(model: model)
            request.imageCropAndScaleOption = .centerCrop
            return request
        } catch {
            fatalError("can't load Vision ML model")
        }
    }()

    func predict(pixelBuffer: CVImageBuffer, sampleBuffer: CMSampleBuffer, onResult: ((_ observations: [VNCoreMLFeatureValueObservation]) -> Void)) {
        var requestOptions: [VNImageOption: Any] = [:]
        if let cameraIntrinsicData = CMGetAttachment(sampleBuffer, key: kCMSampleBufferAttachmentKey_CameraIntrinsicMatrix, attachmentModeOut: nil) {
            requestOptions = [.cameraIntrinsics: cameraIntrinsicData]
        }
        let handler = VNImageRequestHandler(cvPixelBuffer: pixelBuffer, options: requestOptions)
        do {
            try handler.perform([predictionRequest])
        } catch {
            print("error handler")
        }
        guard let observations = predictionRequest.results as? [VNCoreMLFeatureValueObservation] else {
            fatalError("unexpected result type from VNCoreMLRequest")
        }
        onResult(observations)
    }
}
Predicted Camera Output function:
func handleCameraOutput(pixelBuffer: CVImageBuffer, sampleBuffer: CMSampleBuffer, onFinish: @escaping ((_ image: UIImage?) -> Void)) {
    VisionManager.shared.predict(pixelBuffer: pixelBuffer, sampleBuffer: sampleBuffer) { [weak self] observations in
        if let multiArray: MLMultiArray = observations[0].featureValue.multiArrayValue {
            mask = maskEdit.maskToRGBA(maskArray: MultiArray<Float32>(multiArray), rgba: (Float(r), Float(g), Float(b), Float(a)))!
            maskInverted = maskEdit.maskToRGBAInvert(maskArray: MultiArray<Float32>(multiArray), rgba: (r: 1.0, g: 1.0, b: 1.0, a: 0.4))!
            let image = maskEdit.mergeMaskAndBackground(invertedMask: maskInverted, mask: mask, background: pixelBuffer, size: Int(size))
            DispatchQueue.main.async {
                onFinish(image)
            }
        }
    }
}
I call these methods in viewDidAppear as below:
CameraManager.shared.setDidOutputHandler { [weak self] (output, pixelBuffer, sampleBuffer, connection) in
    guard let self = self else { return }
    self.maskColor.getRed(&self.r, green: &self.g, blue: &self.b, alpha: &self.a)
    self.a = 0.5
    self.handleCameraOutput(pixelBuffer: pixelBuffer, sampleBuffer: sampleBuffer, onFinish: { image in
        self.predictionView.image = image
    })
}
It takes time for your model to perform the segmentation, and more time to convert the output into an image. There is not much you can do to shorten this delay, other than making the model smaller and making sure the output-to-image conversion code is as fast as possible.
I found out my issue: I was not using a different thread. Since I am a new developer I didn't know such details, and I'm still learning thanks to experts in the field and their shared knowledge. Please see my old and new captureOutput functions. Using a different thread solved my problem:
old status:
public func captureOutput(_ output: AVCaptureOutput, didOutput sampleBuffer: CMSampleBuffer, from connection: AVCaptureConnection) {
    guard let pixelBuffer = CMSampleBufferGetImageBuffer(sampleBuffer) else { return }
    self.handler?(output, pixelBuffer, sampleBuffer, connection)
    self.onCapture?(pixelBuffer, sampleBuffer)
    self.onCapture = nil
}
and new status:
public func captureOutput(_ output: AVCaptureOutput,
                          didOutput sampleBuffer: CMSampleBuffer,
                          from connection: AVCaptureConnection) {
    if currentBuffer == nil {
        let pixelBuffer = CMSampleBufferGetImageBuffer(sampleBuffer)
        currentBuffer = pixelBuffer
        DispatchQueue.global(qos: .userInitiated).async {
            self.handler?(output, self.currentBuffer!, sampleBuffer, connection)
            self.currentBuffer = nil
        }
    }
}
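One caveat on the new version: `currentBuffer` is written on the camera's delegate queue but cleared on a global queue, which is a data race. A minimal sketch of a race-free variant of the same frame-dropping idea, assuming `handler` is the same property used above, `cameraQueue` is the serial queue passed to `setSampleBufferDelegate`, and `isProcessing` is a new Bool property:

```swift
public func captureOutput(_ output: AVCaptureOutput,
                          didOutput sampleBuffer: CMSampleBuffer,
                          from connection: AVCaptureConnection) {
    // Runs on cameraQueue, so isProcessing is only ever touched on one queue.
    // Drop frames that arrive while the previous one is still being processed.
    guard !isProcessing,
          let pixelBuffer = CMSampleBufferGetImageBuffer(sampleBuffer) else { return }
    isProcessing = true
    DispatchQueue.global(qos: .userInitiated).async {
        self.handler?(output, pixelBuffer, sampleBuffer, connection)
        // Hop back to the delegate queue before clearing the flag.
        self.cameraQueue.async { self.isProcessing = false }
    }
}
```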
I'm trying to capture an image from the camera preview but can't get an image from the preview layer. What I want to do is similar to the iOS 15 OCR mode in the Photos app, which processes the image during camera preview; it does not require the user to take a shot or start recording video, it just processes the image in the preview. I looked into the docs and searched the net but could not find any useful info.
What I tried was saving the previewLayer and calling previewLayer.draw(in: context) periodically. But the image drawn into the context is blank. Now I wonder if it is possible at all.
There might be some security restriction on processing the image in the camera preview, such that only genuine apps are allowed access, I guess, so I probably need to find other ways.
Please enlighten me if there is any workaround.
Thanks!
Ok. With MadProgrammer's help I got things working properly. Anurag Ajwani's site is very helpful.
Here is my simple snippet to capture video frames. You need to ensure permissions before CameraView gets instantiated.
class VideoCapture: NSObject, AVCaptureVideoDataOutputSampleBufferDelegate {
    //private var previewLayer: AVCaptureVideoPreviewLayer? = nil
    private var session: AVCaptureSession? = nil
    private var videoOutput: AVCaptureVideoDataOutput? = nil
    private var videoHandler: ((UIImage) -> Void)?

    override init() {
        super.init()
        let deviceSession = AVCaptureDevice.DiscoverySession(deviceTypes: [.builtInDualWideCamera, .builtInWideAngleCamera], mediaType: .video, position: .back)
        guard let device = deviceSession.devices.first else { return }
        if let input = try? AVCaptureDeviceInput(device: device) {
            let session = AVCaptureSession()
            session.beginConfiguration() // pairs with commitConfiguration() below
            session.addInput(input)
            let videoOutput = AVCaptureVideoDataOutput()
            videoOutput.videoSettings = [(kCVPixelBufferPixelFormatTypeKey as NSString): NSNumber(value: kCVPixelFormatType_32BGRA)] as [String: Any]
            videoOutput.setSampleBufferDelegate(self, queue: DispatchQueue(label: "my.image.handling.queue"))
            videoOutput.alwaysDiscardsLateVideoFrames = true
            if session.canAddOutput(videoOutput) {
                session.sessionPreset = .high
                session.addOutput(videoOutput)
                self.videoOutput = videoOutput
            }
            for connection in videoOutput.connections {
                if connection.isVideoOrientationSupported {
                    connection.videoOrientation = .portrait
                }
            }
            session.commitConfiguration()
            self.session = session
            /*
            self.previewLayer = AVCaptureVideoPreviewLayer(session: session)
            if let previewLayer = self.previewLayer {
                previewLayer.videoGravity = .resizeAspectFill
                layer.insertSublayer(previewLayer, at: 0)
                CameraPreviewView.initialized = true
            }
            */
        }
    }
    func startCapturing(_ videoHandler: @escaping (UIImage) -> Void) {
        self.videoHandler = videoHandler
        if let session = session {
            session.startRunning()
        }
    }
    // AVCaptureVideoDataOutputSampleBufferDelegate
    func captureOutput(_ output: AVCaptureOutput, didOutput sampleBuffer: CMSampleBuffer, from connection: AVCaptureConnection) {
        guard let imageBuffer = CMSampleBufferGetImageBuffer(sampleBuffer) else {
            debugPrint("unable to get video frame")
            return
        }
        if let videoHandler = self.videoHandler {
            let rect = CGRect(x: 0, y: 0, width: CVPixelBufferGetWidth(imageBuffer), height: CVPixelBufferGetHeight(imageBuffer))
            let ciImage = CIImage(cvImageBuffer: imageBuffer)
            let ciContext = CIContext()
            guard let cgImage = ciContext.createCGImage(ciImage, from: rect) else { return }
            videoHandler(UIImage(cgImage: cgImage))
        }
    }
}
struct CameraView: View {
    @State var capturedVideo: UIImage? = nil
    let videoCapture = VideoCapture()

    var body: some View {
        VStack {
            ZStack(alignment: .center) {
                if let capturedVideo = self.capturedVideo {
                    Image(uiImage: capturedVideo)
                        .resizable()
                        .scaledToFill()
                }
            }
        }
        .background(Color.black)
        .onAppear {
            self.videoCapture.startCapturing { uiImage in
                self.capturedVideo = uiImage
            }
        }
    }
}
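One performance note on the snippet above: it creates a new `CIContext` for every frame, which is expensive. A sketch of reusing a single context instead (assuming the same `VideoCapture` class; behavior is otherwise identical):

```swift
// Inside VideoCapture: create the context once and reuse it for every frame.
private let ciContext = CIContext()

func captureOutput(_ output: AVCaptureOutput, didOutput sampleBuffer: CMSampleBuffer, from connection: AVCaptureConnection) {
    guard let imageBuffer = CMSampleBufferGetImageBuffer(sampleBuffer),
          let videoHandler = self.videoHandler else { return }
    let rect = CGRect(x: 0, y: 0,
                      width: CVPixelBufferGetWidth(imageBuffer),
                      height: CVPixelBufferGetHeight(imageBuffer))
    let ciImage = CIImage(cvImageBuffer: imageBuffer)
    // Reusing ciContext avoids re-allocating rendering resources per frame.
    guard let cgImage = ciContext.createCGImage(ciImage, from: rect) else { return }
    videoHandler(UIImage(cgImage: cgImage))
}
```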
I'm trying to detect rectangles from a live preview layer, but I'm not able to detect all of them.
What I'm doing to set up the Vision request:
func setupVision() {
    let rectanglesDetectionRequest = VNDetectRectanglesRequest(completionHandler: self.handleRectangles)
    rectanglesDetectionRequest.maximumObservations = 0
    rectanglesDetectionRequest.quadratureTolerance = 45.0
    rectanglesDetectionRequest.minimumAspectRatio = 0.64
    self.requests = [rectanglesDetectionRequest]
}
func captureOutput(_ output: AVCaptureOutput, didOutput sampleBuffer: CMSampleBuffer, from connection: AVCaptureConnection) {
    guard let pixelBuffer = CMSampleBufferGetImageBuffer(sampleBuffer) else {
        return
    }
    let exifOrientation = self.exifOrientationFromDeviceOrientation()
    DispatchQueue.main.asyncAfter(deadline: .now() + 2) {
        var requestOptions: [VNImageOption: Any] = [:]
        if let cameraIntrinsicData = CMGetAttachment(sampleBuffer, key: kCMSampleBufferAttachmentKey_CameraIntrinsicMatrix, attachmentModeOut: nil) {
            requestOptions = [.cameraIntrinsics: cameraIntrinsicData]
        }
        DispatchQueue.global(qos: .background).async {
            let imageRequestHandler = VNImageRequestHandler(cvPixelBuffer: pixelBuffer, orientation: exifOrientation, options: requestOptions)
            do {
                try imageRequestHandler.perform(self.requests)
            } catch {
                print(error)
            }
        }
    }
    var arr = [VNTrackRectangleRequest]()
    for obs in self.rectanglesss {
        let trackRequest = VNTrackRectangleRequest(rectangleObservation: obs, completionHandler: self.handleSequenceRequestUpdate)
        trackRequest.trackingLevel = .accurate
        arr.append(trackRequest)
    }
    do {
        try self.sequenceHandler.perform(arr, on: pixelBuffer, orientation: exifOrientation)
    } catch {
        print(error)
    }
}
Can someone help me figure out what I'm doing wrong?
When I shoot at a right angle it sometimes detects a few of them; at an acute angle it detects only the nearest 2-3 rectangles. Here I'm trying with SET cards; I added 2 images of what I'm getting.
Result
Try shooting with the iPhone directly above the cards (a bird's-eye view), and try a table that is not white.
I suggest you hold your phone flat when shooting, then tune minimumConfidence.
Use this...
https://developer.apple.com/documentation/vision/vndetectrectanglesrequest/2875373-maximumobservations
let request = VNDetectRectanglesRequest { (request, error) in
    // Your completion handler code
}
request.maximumObservations = 2
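Beyond maximumObservations, the request exposes several thresholds that matter when nearby rectangles are missed. A sketch with illustrative starting values (the numbers are assumptions, not tuned for SET cards specifically):

```swift
import Vision

let request = VNDetectRectanglesRequest { request, error in
    guard let results = request.results as? [VNRectangleObservation] else { return }
    // Each observation carries normalized corner points and a confidence score.
    for rect in results {
        print(rect.confidence, rect.topLeft, rect.bottomRight)
    }
}
request.maximumObservations = 0      // 0 = return all rectangles found
request.minimumConfidence = 0.5      // lowering this admits weaker candidates
request.minimumSize = 0.05           // fraction of image size; small cards need a small value
request.minimumAspectRatio = 0.5     // SET cards are roughly 0.64 when face-on,
request.maximumAspectRatio = 1.0     // but perspective skews the ratio at acute angles
request.quadratureTolerance = 45.0   // degrees of deviation from 90° corners allowed
```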
I need to detect a real face with the iPhone front camera, so I have used the Vision framework to achieve it. But it also detects the face in a static image (a photo of a person), which is not what I want. Here is my code snippet:
class ViewController {
    func sessionPrepare() {
        session = AVCaptureSession()
        guard let session = session, let captureDevice = frontCamera else { return }
        do {
            let deviceInput = try AVCaptureDeviceInput(device: captureDevice)
            session.beginConfiguration()
            if session.canAddInput(deviceInput) {
                session.addInput(deviceInput)
            }
            let output = AVCaptureVideoDataOutput()
            output.videoSettings = [
                String(kCVPixelBufferPixelFormatTypeKey): Int(kCVPixelFormatType_420YpCbCr8BiPlanarFullRange)
            ]
            output.alwaysDiscardsLateVideoFrames = true
            if session.canAddOutput(output) {
                session.addOutput(output)
            }
            session.commitConfiguration()
            let queue = DispatchQueue(label: "output.queue")
            output.setSampleBufferDelegate(self, queue: queue)
            print("setup delegate")
        } catch {
            print("can't setup session")
        }
    }
}
It also detects a face from a static image if I place one in front of the camera.
extension ViewController: AVCaptureVideoDataOutputSampleBufferDelegate {
    func captureOutput(_ output: AVCaptureOutput, didOutput sampleBuffer: CMSampleBuffer, from connection: AVCaptureConnection) {
        let pixelBuffer = CMSampleBufferGetImageBuffer(sampleBuffer)
        let attachments = CMCopyDictionaryOfAttachments(kCFAllocatorDefault, sampleBuffer, kCMAttachmentMode_ShouldPropagate)
        let ciImage = CIImage(cvImageBuffer: pixelBuffer!, options: attachments as! [String: Any]?)
        let ciImageWithOrientation = ciImage.applyingOrientation(Int32(UIImageOrientation.leftMirrored.rawValue))
        detectFace(on: ciImageWithOrientation)
    }
}
func detectFace(on image: CIImage) {
    try? faceDetectionRequest.perform([faceDetection], on: image)
    if let results = faceDetection.results as? [VNFaceObservation] {
        if !results.isEmpty {
            faceLandmarks.inputFaceObservations = results
            detectLandmarks(on: image)
            DispatchQueue.main.async {
                self.shapeLayer.sublayers?.removeAll()
            }
        }
    }
}

func detectLandmarks(on image: CIImage) {
    try? faceLandmarksDetectionRequest.perform([faceLandmarks], on: image)
    if let landmarksResults = faceLandmarks.results as? [VNFaceObservation] {
        for observation in landmarksResults {
            DispatchQueue.main.async {
                if let boundingBox = self.faceLandmarks.inputFaceObservations?.first?.boundingBox {
                    let faceBoundingBox = boundingBox.scaled(to: self.view.bounds.size)
                    // different types of landmarks
                    let faceContour = observation.landmarks?.faceContour
                    let leftEye = observation.landmarks?.leftEye
                    let rightEye = observation.landmarks?.rightEye
                    let nose = observation.landmarks?.nose
                    let lips = observation.landmarks?.innerLips
                    let leftEyebrow = observation.landmarks?.leftEyebrow
                    let rightEyebrow = observation.landmarks?.rightEyebrow
                    let noseCrest = observation.landmarks?.noseCrest
                    let outerLips = observation.landmarks?.outerLips
                }
            }
        }
    }
}
So is there any way to get this done using only real-time camera detection? I would be very grateful for your help and advice.
I needed to do the same, and after a lot of experiments I finally found this:
https://github.com/syaringan357/iOS-MobileFaceNet-MTCNN-FaceAntiSpoofing
It detects only live camera faces, but it does not use the Vision framework.
I am implementing a camera application. I initialize the camera as follows:
let input = try AVCaptureDeviceInput(device: captureDevice!)
captureSession = AVCaptureSession()
captureSession?.addInput(input)
videoPreviewLayer = AVCaptureVideoPreviewLayer(session: captureSession!)
videoPreviewLayer?.videoGravity = AVLayerVideoGravity.resizeAspectFill
videoPreviewLayer?.frame = view.layer.bounds
previewView.layer.insertSublayer(videoPreviewLayer!, at: 0)
Now I want to have a small rectangle on top of the preview layer. In that rectangle area, I want to show a zoomed-in portion of the preview layer. To do this I added a new UIView on top of the other views, but I don't know how to display a specific area from the preview (e.g. zoom factor = 2).
The following figure shows what I want to have:
How can I do it?
Finally, I found a solution.
The idea is to extract the real-time frames from the camera output, then use a UIImageView to show the enlarged frame. The following is the portion of code that adds a video output:
let videoOutput = AVCaptureVideoDataOutput()
videoOutput.setSampleBufferDelegate(self, queue: DispatchQueue(label: "sample buffer"))
guard captureSession.canAddOutput(videoOutput) else { return }
captureSession.addOutput(videoOutput)
and we need to implement a delegate function:
func captureOutput(_ output: AVCaptureOutput, didOutput sampleBuffer: CMSampleBuffer, from connection: AVCaptureConnection) {
    guard let uiImage = imageFromSampleBuffer(sampleBuffer: sampleBuffer) else { return }
    DispatchQueue.main.async { [unowned self] in
        self.delegate?.captured(image: uiImage)
    }
}

private func imageFromSampleBuffer(sampleBuffer: CMSampleBuffer) -> UIImage? {
    guard let imageBuffer = CMSampleBufferGetImageBuffer(sampleBuffer) else { return nil }
    let ciImage = CIImage(cvPixelBuffer: imageBuffer)
    guard let cgImage = context.createCGImage(ciImage, from: ciImage.extent) else { return nil }
    return UIImage(cgImage: cgImage)
}
The code was taken from this article.
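The snippet above delivers full frames but never shows the zoom step itself. One way to sketch it with Core Image (the centered crop region, the `zoomFactor` default, and the reusable `context` property are illustrative assumptions, not from the original code):

```swift
import CoreImage
import UIKit

// Created once and reused; building a CIContext per frame is expensive.
let context = CIContext()

// Crop the center region of a frame and scale it back up by `zoomFactor`,
// producing the image to show in the small overlay view.
func zoomedImage(from imageBuffer: CVPixelBuffer, zoomFactor: CGFloat = 2) -> UIImage? {
    let ciImage = CIImage(cvPixelBuffer: imageBuffer)
    let extent = ciImage.extent
    // A centered region whose size is extent / zoomFactor...
    let cropSize = CGSize(width: extent.width / zoomFactor, height: extent.height / zoomFactor)
    let cropRect = CGRect(x: extent.midX - cropSize.width / 2,
                          y: extent.midY - cropSize.height / 2,
                          width: cropSize.width,
                          height: cropSize.height)
    // ...scaled back up, which is the zoom.
    let zoomed = ciImage.cropped(to: cropRect)
        .transformed(by: CGAffineTransform(scaleX: zoomFactor, y: zoomFactor))
    guard let cgImage = context.createCGImage(zoomed, from: zoomed.extent) else { return nil }
    return UIImage(cgImage: cgImage)
}
```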
I am creating a custom camera with filters. When I add the following code it crashes without showing any exception.
// Setting video output
func setupBuffer() {
    videoBuffer = AVCaptureVideoDataOutput()
    videoBuffer?.alwaysDiscardsLateVideoFrames = true
    videoBuffer?.videoSettings = [(kCVPixelBufferPixelFormatTypeKey as NSString): NSNumber(value: kCVPixelFormatType_32RGBA)]
    videoBuffer?.setSampleBufferDelegate(self, queue: DispatchQueue.main)
    captureSession?.addOutput(videoBuffer)
}
public func captureOutput(_ captureOutput: AVCaptureOutput!, didOutputSampleBuffer sampleBuffer: CMSampleBuffer!, from connection: AVCaptureConnection!) {
    if connection.videoOrientation != .portrait {
        connection.videoOrientation = .portrait
    }
    guard let image = GMVUtility.sampleBufferTo32RGBA(sampleBuffer) else {
        print("No Image 😂")
        return
    }
    pixelBuffer = CMSampleBufferGetImageBuffer(sampleBuffer)
    ciImage = CIImage(cvImageBuffer: pixelBuffer!, options: CMCopyDictionaryOfAttachments(kCFAllocatorDefault, sampleBuffer, kCMAttachmentMode_ShouldPropagate) as! [String: Any]?)
    CameraView.filter = CIFilter(name: "CIPhotoEffectProcess")
    CameraView.filter?.setValue(ciImage, forKey: kCIInputImageKey)
    let cgimg = CameraView.context.createCGImage(CameraView.filter!.outputImage!, from: ciImage.extent)
    DispatchQueue.main.async {
        self.preview.image = UIImage(cgImage: cgimg!)
    }
}
But it's crashing on:
guard let image = GMVUtility.sampleBufferTo32RGBA(sampleBuffer) else {
    print("No Image 😂")
    return
}
When I pass an image created from the CIImage, it doesn't recognize the face in it.
The complete code file is https://www.dropbox.com/s/y1ewd1sh18h3ezj/CameraView.swift.zip?dl=0
1) Create a separate queue for the buffer:
fileprivate var videoDataOutputQueue = DispatchQueue(label: "VideoDataOutputQueue")
2) Set up the buffer with this (note kCVPixelFormatType_32BGRA: AVCaptureVideoDataOutput does not support kCVPixelFormatType_32RGBA, which is why your setup crashes):
let videoBuffer = AVCaptureVideoDataOutput()
videoBuffer.alwaysDiscardsLateVideoFrames = true
videoBuffer.videoSettings = [(kCVPixelBufferPixelFormatTypeKey as NSString): NSNumber(value: kCVPixelFormatType_32BGRA)]
videoBuffer.setSampleBufferDelegate(self, queue: videoDataOutputQueue)
captureSession?.addOutput(videoBuffer)
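A quick way to confirm which formats are allowed: AVCaptureVideoDataOutput only accepts a fixed set of pixel formats, which you can query at runtime. A small sketch (the FourCharCode decoding is just for readable output):

```swift
import AVFoundation

// Print the pixel formats this output actually supports. On iOS this is
// typically 420v, 420f, and BGRA; kCVPixelFormatType_32RGBA is not in the
// list, which is why requesting it crashes the session.
let output = AVCaptureVideoDataOutput()
for format in output.availableVideoPixelFormatTypes {
    // Each value is a FourCharCode; decode it to a readable 4-character tag.
    let tag = String((0..<4).map { i -> Character in
        Character(UnicodeScalar(UInt8((format >> (8 * (3 - i))) & 0xFF)))
    })
    print(tag, format)
}
```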