Issues with cropping an image - iOS

I am having difficulties with cropping an image; two things are happening currently that I wish to improve:
1) The quality of the photo degrades once it is cropped.
2) The orientation of the image is not correct after the photo is taken.
In summary:
After cropping, the photo quality is below what I expect, and when the image appears in the ImageView it is rotated 90 degrees. Why is this happening? I am trying to crop the image to match the visible area of the captured stream.
Here is the cropping of the image:
func crop(capture : UIImage) -> UIImage {
let crop = cameraView.bounds
//CGRect(x:0,y:0,width:50,height:50)
let cgImage = capture.cgImage!.cropping(to: crop)
let image : UIImage = UIImage(cgImage: cgImage!)
return image
}
Here is where I am calling the crop:
func capture(_ captureOutput: AVCapturePhotoOutput, didFinishProcessingPhotoSampleBuffer photoSampleBuffer: CMSampleBuffer?, previewPhotoSampleBuffer: CMSampleBuffer?, resolvedSettings: AVCaptureResolvedPhotoSettings, bracketSettings: AVCaptureBracketedStillImageSettings?, error: Error?) {
if let photoSampleBuffer = photoSampleBuffer {
let photoData = AVCapturePhotoOutput.jpegPhotoDataRepresentation(forJPEGSampleBuffer: photoSampleBuffer, previewPhotoSampleBuffer: previewPhotoSampleBuffer)
var imageTaken = UIImage(data: photoData!)
//post photo
let croppedImage = self.crop(capture: imageTaken!)
imageTaken = croppedImage
self.imageView.image = imageTaken
// UIImageWriteToSavedPhotosAlbum(imageTaken!, nil, nil, nil)
}
}
And here is the whole class:
import UIKit
import AVFoundation
class CameraVC: UIViewController, UIImagePickerControllerDelegate, UINavigationControllerDelegate, AVCapturePhotoCaptureDelegate {
var captureSession : AVCaptureSession?
var stillImageOutput : AVCapturePhotoOutput?
var previewLayer : AVCaptureVideoPreviewLayer?
@IBOutlet weak var imageView: UIImageView!
@IBOutlet var cameraView: UIView!
override func viewDidLoad() {
super.viewDidLoad()
// Do any additional setup after loading the view.
}
override func viewDidLayoutSubviews() {
super.viewDidLayoutSubviews()
previewLayer?.frame = cameraView.bounds
}
override func viewWillAppear(_ animated: Bool) {
super.viewWillAppear(animated)
captureSession = AVCaptureSession()
captureSession?.sessionPreset = AVCaptureSessionPreset1920x1080
stillImageOutput = AVCapturePhotoOutput()
let backCamera = AVCaptureDevice.defaultDevice(withMediaType: AVMediaTypeVideo)
do {
let input = try AVCaptureDeviceInput(device: backCamera)
if (captureSession?.canAddInput(input))!{
captureSession?.addInput(input)
if (captureSession?.canAddOutput(stillImageOutput) != nil){
captureSession?.addOutput(stillImageOutput)
previewLayer = AVCaptureVideoPreviewLayer(session: captureSession)
previewLayer?.videoGravity = AVLayerVideoGravityResizeAspect
previewLayer?.connection.videoOrientation = AVCaptureVideoOrientation.portrait
cameraView.layer.addSublayer(previewLayer!)
captureSession?.startRunning()
let captureVideoLayer: AVCaptureVideoPreviewLayer = AVCaptureVideoPreviewLayer.init(session: captureSession!)
captureVideoLayer.frame = self.cameraView.bounds
captureVideoLayer.videoGravity = AVLayerVideoGravityResizeAspectFill
self.cameraView.layer.addSublayer(captureVideoLayer)
}
}
} catch {
print("An error has occured")
}
}
@IBAction func takePhoto(_ sender: UIButton) {
didPressTakePhoto()
}
func didPressTakePhoto(){
if let videoConnection = stillImageOutput?.connection(withMediaType: AVMediaTypeVideo) {
videoConnection.videoOrientation = AVCaptureVideoOrientation.portrait
let settingsForCapture = AVCapturePhotoSettings()
settingsForCapture.flashMode = .auto
settingsForCapture.isAutoStillImageStabilizationEnabled = true
settingsForCapture.isHighResolutionPhotoEnabled = false
stillImageOutput?.capturePhoto(with: settingsForCapture, delegate: self)
}
}
func capture(_ captureOutput: AVCapturePhotoOutput, didFinishProcessingPhotoSampleBuffer photoSampleBuffer: CMSampleBuffer?, previewPhotoSampleBuffer: CMSampleBuffer?, resolvedSettings: AVCaptureResolvedPhotoSettings, bracketSettings: AVCaptureBracketedStillImageSettings?, error: Error?) {
if let photoSampleBuffer = photoSampleBuffer {
let photoData = AVCapturePhotoOutput.jpegPhotoDataRepresentation(forJPEGSampleBuffer: photoSampleBuffer, previewPhotoSampleBuffer: previewPhotoSampleBuffer)
var imageTaken = UIImage(data: photoData!)
//post photo
let croppedImage = self.crop(capture: imageTaken!)
imageTaken = croppedImage
self.imageView.image = imageTaken
// UIImageWriteToSavedPhotosAlbum(imageTaken!, nil, nil, nil)
}
}
func crop(capture : UIImage) -> UIImage {
let crop = cameraView.bounds
//CGRect(x:0,y:0,width:50,height:50)
let cgImage = capture.cgImage!.cropping(to: crop)
let image : UIImage = UIImage(cgImage: cgImage!)
return image
}
}

An alternative is to use a Core Image filter called CIPerspectiveCorrection. Since it uses a CIImage - which isn't a true image but a "recipe" for an image - it doesn't suffer from degradation.
Basically, turn your UIImage/CGImage into a CIImage, pick any 4 points in it, and crop. The region needn't be a rectangle (a CGRect); any 4 points will do. There are two differences of note when using CI filters:
Instead of using a CGRect, you use a CIVector. A vector can have 2, 3, 4, even more parameters depending on the filter. In this case you want 4 CIVectors with 2 parameters each, corresponding to top left (TL), top right (TR), bottom left (BL), and bottom right (BR).
CI images have their point of origin (X/Y == 0/0) at their bottom left, not top left. This basically means your Y coordinate is upside down from CG or UI images.
Here's some sample code. First, some sample declarations, including a CI context:
let uiTL = CGPoint(x: 50, y: 50)
let uiTR = CGPoint(x: 75, y: 75)
let uiBL = CGPoint(x: 100, y: 300)
let uiBR = CGPoint(x: 25, y: 200)
var ciImage:CIImage!
var ctx:CIContext!
@IBOutlet weak var imageView: UIImageView!
In viewDidLoad we set the context and get our CIImage from the UIImageView:
override func viewDidLoad() {
super.viewDidLoad()
ctx = CIContext(options: nil)
ciImage = CIImage(image: imageView.image!)
}
UIImageViews have a frame or CGRect, and UIImages have a size or CGSize. CIImages have an extent, which is basically your CGSize. But remember, the Y axis is flipped, and an extent can even be infinite! (That isn't the case for a UIImage source, though.) Here are some helper functions to convert things:
func createScaledPoint(_ pt:CGPoint) -> CGPoint {
let x = (pt.x / imageView.frame.width) * ciImage.extent.width
let y = (pt.y / imageView.frame.height) * ciImage.extent.height
return CGPoint(x: x, y: y)
}
func createVector(_ point:CGPoint) -> CIVector {
return CIVector(x: point.x, y: ciImage.extent.height - point.y)
}
func createPoint(_ vector:CGPoint) -> CGPoint {
return CGPoint(x: vector.x, y: ciImage.extent.height - vector.y)
}
Here's the actual call to CIPerspectiveCorrection. If I remember correctly, a change in Swift 3 is to use AnyObject here. While more strongly-typed variables worked in previous versions of Swift, they cause crashes now:
func doPerspectiveCorrection(
_ image:CIImage,
context:CIContext,
topLeft:AnyObject,
topRight:AnyObject,
bottomRight:AnyObject,
bottomLeft:AnyObject)
-> UIImage {
let filter = CIFilter(name: "CIPerspectiveCorrection")
filter?.setValue(topLeft, forKey: "inputTopLeft")
filter?.setValue(topRight, forKey: "inputTopRight")
filter?.setValue(bottomRight, forKey: "inputBottomRight")
filter?.setValue(bottomLeft, forKey: "inputBottomLeft")
filter!.setValue(image, forKey: kCIInputImageKey)
let cgImage = context.createCGImage((filter?.outputImage)!, from: (filter?.outputImage!.extent)!)
return UIImage(cgImage: cgImage!)
}
Now that we have our CIImage, we create the four CIVectors. In this sample project I hard-coded the 4 CGPoints and chose to create the CIVectors in viewWillLayoutSubviews, the earliest point at which I have the UI frames:
override func viewWillLayoutSubviews() {
let ciTL = createVector(createScaledPoint(uiTL))
let ciTR = createVector(createScaledPoint(uiTR))
let ciBR = createVector(createScaledPoint(uiBR))
let ciBL = createVector(createScaledPoint(uiBL))
imageView.image = doPerspectiveCorrection(CIImage(image: imageView.image!)!,
context: ctx,
topLeft: ciTL,
topRight: ciTR,
bottomRight: ciBR,
bottomLeft: ciBL)
}
If you put this code into a project, load your image into a UIImageView, figure out which 4 CGPoints you want, and run it, you should see your cropped image. Good luck!

Adding to @dfd's answer: if you don't want to use a fancy Core Image filter, follow the workaround here.
Why is the quality of the photo poor after I crop it?
Although you use the highest possible session preset, AVCaptureSessionPreset1920x1080, you are converting the sample buffer to JPEG format and then cropping it, so there will inevitably be some loss of quality.
To get a better-quality cropped image, try the session preset AVCaptureSessionPresetHigh to let AVFoundation pick a high-quality format, and use dngPhotoDataRepresentation(forRawSampleBuffer:previewPhotoSampleBuffer:) to get the image as DNG data.
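As a rough illustration of that RAW path (my own sketch, not code from the question's project; it assumes the photo was requested with AVCapturePhotoSettings(rawPixelFormatType:) using one of the output's availableRawPhotoPixelFormatTypes), the Swift 3-style delegate would look something like this:
func capture(_ captureOutput: AVCapturePhotoOutput, didFinishProcessingRawPhotoSampleBuffer rawSampleBuffer: CMSampleBuffer?, previewPhotoSampleBuffer: CMSampleBuffer?, resolvedSettings: AVCaptureResolvedPhotoSettings, bracketSettings: AVCaptureBracketedStillImageSettings?, error: Error?) {
    // Assumes capturePhoto(with:delegate:) was called with RAW-enabled settings.
    guard let rawSampleBuffer = rawSampleBuffer,
        let dngData = AVCapturePhotoOutput.dngPhotoDataRepresentation(forRawSampleBuffer: rawSampleBuffer, previewPhotoSampleBuffer: previewPhotoSampleBuffer) else { return }
    // dngData now holds the DNG file bytes; write them to disk or feed them to a
    // RAW pipeline instead of going through lossy JPEG before cropping.
}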
Why are the view and the orientation not correct after I take the photo?
The didFinishProcessingPhotoSampleBuffer delegate gives you a sample buffer that is always rotated 90 degrees to the left, so you need to rotate it 90 degrees to the right yourself.
Init your UIImage with the cgImage, an orientation of .right, and a scale of 1.0:
init(cgImage: CGImage, scale: CGFloat, orientation: UIImageOrientation)
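Applied to the question's crop(capture:) function, the orientation fix would look roughly like this (a sketch of the suggestion above, assuming the crop rect itself is what you want):
func crop(capture: UIImage) -> UIImage {
    let cropRect = cameraView.bounds
    let cgImage = capture.cgImage!.cropping(to: cropRect)
    // Wrap the cropped CGImage with an explicit orientation so it is no longer
    // displayed rotated 90 degrees.
    return UIImage(cgImage: cgImage!, scale: 1.0, orientation: .right)
}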

Related

How do you apply Core Image filters to an onscreen image using Swift/MacOS or iOS and Core Image

The Photos app's editing adjustments provide a realtime view of the adjustments as they are applied. I wasn't able to find any samples of how to do this. All the examples seem to show that you apply the filters through a pipeline of sorts and then take the resulting image and update the screen with the result. See the code below.
Photos seems to show the adjustment applied to the onscreen image. How do they achieve this?
func editImage(inputImage: CGImage) {
DispatchQueue.global().async {
let beginImage = CIImage(cgImage: inputImage)
guard let exposureOutput = self.exposureFilter(beginImage, ev: self.brightness) else {
return
}
guard let vibranceOutput = self.vibranceFilter(exposureOutput, amount: self.vibranceAmount) else {
return
}
guard let unsharpMaskOutput = self.unsharpMaskFilter(vibranceOutput, intensity: self.unsharpMaskIntensity, radius: self.unsharpMaskRadius) else {
return
}
guard let sharpnessOutput = self.sharpenFilter(unsharpMaskOutput, sharpness: self.unsharpMaskIntensity) else {
return
}
if let cgimg = self.context.createCGImage(sharpnessOutput, from: vibranceOutput.extent) {
DispatchQueue.main.async {
self.cgImage = cgimg
}
}
}
}
OK, I just found the answer - use MTKView, which is working fine except for getting the image to fill the view correctly!
For the benefit of others here are the basics... I have yet to figure out how to position the image correctly in the view - but I can see the filter applied in realtime!
class ViewController: NSViewController, MTKViewDelegate {
....
@objc dynamic var cgImage: CGImage? {
didSet {
if let cgimg = cgImage {
ciImage = CIImage(cgImage: cgimg)
}
}
}
var ciImage: CIImage?
// Metal resources
var device: MTLDevice!
var commandQueue: MTLCommandQueue!
var sourceTexture: MTLTexture! // 2
let colorSpace = CGColorSpaceCreateDeviceRGB()
var context: CIContext!
var textureLoader: MTKTextureLoader!
override func viewDidLoad() {
super.viewDidLoad()
// Do view setup here.
let metalView = MTKView()
metalView.translatesAutoresizingMaskIntoConstraints = false
self.imageView.addSubview(metalView)
NSLayoutConstraint.activate([
metalView.bottomAnchor.constraint(equalTo: view.bottomAnchor),
metalView.trailingAnchor.constraint(equalTo: view.trailingAnchor),
metalView.leadingAnchor.constraint(equalTo: view.leadingAnchor),
metalView.topAnchor.constraint(equalTo: view.topAnchor)
])
device = MTLCreateSystemDefaultDevice()
commandQueue = device.makeCommandQueue()
metalView.delegate = self
metalView.device = device
metalView.framebufferOnly = false
context = CIContext()
textureLoader = MTKTextureLoader(device: device)
}
public func draw(in view: MTKView) {
if let ciImage = self.ciImage {
if let currentDrawable = view.currentDrawable {
let commandBuffer = commandQueue.makeCommandBuffer()
let inputImage = ciImage // 2
exposureFilter.setValue(inputImage, forKey: kCIInputImageKey)
exposureFilter.setValue(ev, forKey: kCIInputEVKey)
context.render(exposureFilter.outputImage!,
to: currentDrawable.texture,
commandBuffer: commandBuffer,
bounds: CGRect(origin: .zero, size: view.drawableSize),
colorSpace: colorSpace)
commandBuffer?.present(currentDrawable)
commandBuffer?.commit()
}
}
}
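Not part of the original answer, but one plausible way to deal with the positioning issue mentioned above is to aspect-fit the CIImage into the drawable before rendering (a sketch, untested):
// Scales and centres `image` so it fits inside `drawableSize`.
func scaledToFit(_ image: CIImage, drawableSize: CGSize) -> CIImage {
    let scale = min(drawableSize.width / image.extent.width,
                    drawableSize.height / image.extent.height)
    let scaled = image.transformed(by: CGAffineTransform(scaleX: scale, y: scale))
    let dx = (drawableSize.width - scaled.extent.width) / 2
    let dy = (drawableSize.height - scaled.extent.height) / 2
    return scaled.transformed(by: CGAffineTransform(translationX: dx, y: dy))
}
In draw(in:) you would then render scaledToFit(exposureFilter.outputImage!, drawableSize: view.drawableSize) instead of the raw filter output.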

AVCaptureVideoDataOutputSampleBufferDelegate drop frames using CIFilters for video filtering

I have a very strange case where AVCaptureVideoDataOutputSampleBufferDelegate drops frames if I use 13 different filter chains. Let me explain:
I have a CameraController set up, nothing special; here is my delegate method:
func captureOutput(_ output: AVCaptureOutput, didOutput sampleBuffer: CMSampleBuffer, from connection: AVCaptureConnection) {
if !paused {
if connection.output?.connection(with: .audio) == nil {
//capture video
// my try to avoid "Out of buffers error", no luck ;(
lastCapturedBuffer = nil
let err = CMSampleBufferCreateCopy(allocator: kCFAllocatorDefault, sampleBuffer: sampleBuffer, sampleBufferOut: &lastCapturedBuffer)
if err == noErr {
}
connection.videoOrientation = .portrait
// getting image
let pixelBuffer = CMSampleBufferGetImageBuffer(lastCapturedBuffer!)
// remove if any
CVPixelBufferLockBaseAddress(pixelBuffer!, CVPixelBufferLockFlags(rawValue: 0))
// captured - is just ciimage property
captured = CIImage(cvPixelBuffer: pixelBuffer!)
//remove if any
CVPixelBufferUnlockBaseAddress(pixelBuffer!,CVPixelBufferLockFlags(rawValue: 0))
//CVPixelBufferUnlockBaseAddress(pixelBuffer!, .readOnly)
// transform image to targer resolution
let srcWidth = CGFloat(captured.extent.width)
let srcHeight = CGFloat(captured.extent.height)
let dstWidth: CGFloat = ConstantsManager.shared.k_video_width
let dstHeight: CGFloat = ConstantsManager.shared.k_video_height
let scaleX = dstWidth / srcWidth
let scaleY = dstHeight / srcHeight
var transform = CGAffineTransform.init(scaleX: scaleX, y: scaleY)
captured = captured.transformed(by: transform).cropped(to: CGRect(x: 0, y: 0, width: dstWidth, height: dstHeight))
// mirror for front camera
if front {
var t = CGAffineTransform.init(scaleX: -1, y: 1)
t = t.translatedBy(x: -ConstantsManager.shared.k_video_width, y: 0)
captured = captured.transformed(by: t)
}
// video capture logic
let writable = canWrite()
if writable,
sessionAtSourceTime == nil {
sessionAtSourceTime = CMSampleBufferGetPresentationTimeStamp(lastCapturedBuffer!)
videoWriter.startSession(atSourceTime: sessionAtSourceTime!)
}
if writable, (videoWriterInput.isReadyForMoreMediaData) {
videoWriterInput.append(lastCapturedBuffer!)
}
// apply effect in realtime <- here is problem. If I comment next line, it will be fixed but effect will n't be applied
captured = FilterManager.shared.applyFilterForCamera(inputImage: captured)
// current frame in case user wants to save image as photo
self.capturedPhoto = captured
// sent frame to Camcoder view controller
self.delegate?.didCapturedFrame(frame: captured)
} else {
// capture sound
let writable = canWrite()
if writable, (audioWriterInput.isReadyForMoreMediaData) {
//print("write audio buffer")
audioWriterInput?.append(lastCapturedBuffer!)
}
}
} else {
// paused
}
}
I also implemented the didDrop delegate method; here is how I figured out why it drops frames:
func captureOutput(_ output: AVCaptureOutput, didDrop sampleBuffer: CMSampleBuffer, from connection: AVCaptureConnection) {
print("did drop")
var mode: CMAttachmentMode = 0
let reason = CMGetAttachment(sampleBuffer, key: kCMSampleBufferAttachmentKey_DroppedFrameReason, attachmentModeOut: &mode)
print("reason \(String(describing: reason))") // Optional(OutOfBuffers)
}
So I did it like a pro and just commented out parts of the code to find where the problem is. It is here:
captured = FilterManager.shared.applyFilterForCamera(inputImage: captured)
FilterManager is a singleton; here is the func that gets called:
func applyFilterForCamera(inputImage: CIImage) -> CIImage {
return currentVsFilter!.apply(sourceImage: inputImage)
}
currentVsFilter is an object of VSFilter type; here is an example of one:
import Foundation
import AVKit
class TestFilter: CustomFilter {
let _name = "Тестовый Фильтр"
let _displayName = "Test Filter"
var tempImage: CIImage?
var final: CGImage?
override func name() -> String {
return _name
}
override func displayName() -> String {
return _displayName
}
override init() {
super.init()
print("Test Filter init")
// setup my custom kernel filter
self.noise.type = GlitchFilter.GlitchType.allCases[2]
}
// this returns composition for playback using AVPlayer
override func composition(asset: AVAsset) -> AVMutableVideoComposition {
let composition = AVMutableVideoComposition(asset: asset, applyingCIFiltersWithHandler: { request in
let inputImage = request.sourceImage.cropped(to: request.sourceImage.extent)
DispatchQueue.global(qos: .userInitiated).async {
let output = self.apply(sourceImage: inputImage, forComposition: true)
request.finish(with: output, context: nil)
}
})
let size = FilterManager.shared.cropRectForOrientation().size
composition.renderSize = size
return composition
}
// this returns actual filtered CIImage, used for both AVPlayer composition and realtime camera
override func apply(sourceImage: CIImage, forComposition: Bool = false) -> CIImage {
// rendered text
tempImage = FilterManager.shared.textRenderedImage()
// some filters chained one by one
self.screenBlend?.setValue(tempImage, forKey: kCIInputImageKey)
self.screenBlend?.setValue(sourceImage, forKey: kCIInputBackgroundImageKey)
self.noise.inputImage = self.screenBlend?.outputImage
self.noise.inputAmount = CGFloat.random(in: 1.0...3.0)
// result
tempImage = self.noise.outputImage
// correct crop
let rect = forComposition ? FilterManager.shared.cropRectForOrientation() : FilterManager.shared.cropRect
final = self.context.createCGImage(tempImage!, from: rect!)
return CIImage(cgImage: final!)
}
}
And now the strangest thing: I have 30 VSFilters, and when I get to the 13th (switching one by one with a UIButton) I get the "Out of Buffers" error, this one:
kCMSampleBufferDroppedFrameReason_OutOfBuffers
What I tested:
I changed the vsFilters order in the filters array inside the FilterManager singleton - same result.
I tried switching from the first filter up to the 12th one by one, then going back - that works, but after I switched to the 13th (of 30, counting from 0) - the bug appears.
It looks like it can handle only 12 VSFilter objects, as if it retains them somehow, or maybe it's related to threading; I don't know.
This app is made for iOS devices, tested on an iPhone X running iOS 13.3.1.
It is a video editor app that applies different effects to both the live camera stream and video files from the camera roll.
Maybe someone has experience with this?
Have a great day.
Best, Victor
Edit 1: If I re-init the cameraController (AVCaptureSession, input/output devices) it works, but this is an ugly option and it adds lag when switching filters.
OK, so I finally won this battle. In case someone else gets this "OutOfBuffers" problem, here is my solution.
As I figured out, CIFilter grabs the CVPixelBuffer and doesn't release it while filtering images. It kind of creates one huge buffer, I guess. The strange thing is that it doesn't create a memory leak, so I guess it doesn't grab a particular buffer but keeps a strong reference to it. As the rumors (i.e. me) say, it can handle only 12 such references.
So my approach was to copy the CVPixelBuffer and then work with the copy instead of the buffer I got from the AVCaptureVideoDataOutputSampleBufferDelegate didOutput func.
Here is my new code:
func captureOutput(_ output: AVCaptureOutput, didOutput sampleBuffer: CMSampleBuffer, from connection: AVCaptureConnection) {
if !paused {
//print("camera controller \(id) got frame")
if connection.output?.connection(with: .audio) == nil {
//capture video
connection.videoOrientation = .portrait
// getting image
guard let pixelBuffer = CMSampleBufferGetImageBuffer(sampleBuffer) else { return }
// this works!
let copyBuffer = pixelBuffer.copy()
// captured - is just ciimage property
captured = CIImage(cvPixelBuffer: copyBuffer)
//remove if any
// transform image to targer resolution
let srcWidth = CGFloat(captured.extent.width)
let srcHeight = CGFloat(captured.extent.height)
let dstWidth: CGFloat = ConstantsManager.shared.k_video_width
let dstHeight: CGFloat = ConstantsManager.shared.k_video_height
let scaleX = dstWidth / srcWidth
let scaleY = dstHeight / srcHeight
var transform = CGAffineTransform.init(scaleX: scaleX, y: scaleY)
captured = captured.transformed(by: transform).cropped(to: CGRect(x: 0, y: 0, width: dstWidth, height: dstHeight))
// mirror for front camera
if front {
var t = CGAffineTransform.init(scaleX: -1, y: 1)
t = t.translatedBy(x: -ConstantsManager.shared.k_video_width, y: 0)
captured = captured.transformed(by: t)
}
// video capture logic
let writable = canWrite()
if writable,
sessionAtSourceTime == nil {
sessionAtSourceTime = CMSampleBufferGetPresentationTimeStamp(sampleBuffer)
videoWriter.startSession(atSourceTime: sessionAtSourceTime!)
}
if writable, (videoWriterInput.isReadyForMoreMediaData) {
videoWriterInput.append(sampleBuffer)
}
self.captured = FilterManager.shared.applyFilterForCamera(inputImage: self.captured)
// current frame in case user wants to save image as photo
self.capturedPhoto = captured
// sent frame to Camcoder view controller
self.delegate?.didCapturedFrame(frame: captured)
} else {
// capture sound
let writable = canWrite()
if writable, (audioWriterInput.isReadyForMoreMediaData) {
//print("write audio buffer")
audioWriterInput?.append(sampleBuffer)
}
}
} else {
// paused
//print("paused camera controller \(id)")
}
}
And here is the func to copy the buffer:
func copy() -> CVPixelBuffer {
precondition(CFGetTypeID(self) == CVPixelBufferGetTypeID(), "copy() cannot be called on a non-CVPixelBuffer")
var _copy : CVPixelBuffer?
CVPixelBufferCreate(
kCFAllocatorDefault,
CVPixelBufferGetWidth(self),
CVPixelBufferGetHeight(self),
CVPixelBufferGetPixelFormatType(self),
nil,
&_copy)
guard let copy = _copy else { fatalError() }
CVPixelBufferLockBaseAddress(self, CVPixelBufferLockFlags.readOnly)
CVPixelBufferLockBaseAddress(copy, CVPixelBufferLockFlags(rawValue: 0))
let copyBaseAddress = CVPixelBufferGetBaseAddress(copy)
let currBaseAddress = CVPixelBufferGetBaseAddress(self)
print("copy data size: \(CVPixelBufferGetDataSize(copy))")
print("self data size: \(CVPixelBufferGetDataSize(self))")
memcpy(copyBaseAddress, currBaseAddress, CVPixelBufferGetDataSize(copy))
//memcpy(copyBaseAddress, currBaseAddress, CVPixelBufferGetDataSize(self) * 2)
CVPixelBufferUnlockBaseAddress(copy, CVPixelBufferLockFlags(rawValue: 0))
CVPixelBufferUnlockBaseAddress(self, CVPixelBufferLockFlags.readOnly)
return copy
}
I used it as an extension on CVPixelBuffer.
I hope this helps anyone with a similar problem.
Best, Victor

Filtering Depth Data on iOS 12 appears to be rotated

I am having an issue where the Depth Data for the .builtInDualCamera appears to be rotated 90 degrees when isFilteringEnabled = true
Here is my code:
fileprivate let session = AVCaptureSession()
fileprivate let meta = AVCaptureMetadataOutput()
fileprivate let video = AVCaptureVideoDataOutput()
fileprivate let depth = AVCaptureDepthDataOutput()
fileprivate let camera: AVCaptureDevice
fileprivate let input: AVCaptureDeviceInput
fileprivate let synchronizer: AVCaptureDataOutputSynchronizer
init(delegate: CaptureSessionDelegate?) throws {
self.delegate = delegate
session.sessionPreset = .vga640x480
// Setup Camera Input
let discovery = AVCaptureDevice.DiscoverySession(deviceTypes: [.builtInDualCamera], mediaType: .video, position: .unspecified)
if let device = discovery.devices.first {
camera = device
} else {
throw SessionError.CameraNotAvailable("Unable to load camera")
}
input = try AVCaptureDeviceInput(device: camera)
session.addInput(input)
// Setup Metadata Output (Face)
session.addOutput(meta)
if meta.availableMetadataObjectTypes.contains(AVMetadataObject.ObjectType.face) {
meta.metadataObjectTypes = [ AVMetadataObject.ObjectType.face ]
} else {
print("Can't Setup Metadata: \(meta.availableMetadataObjectTypes)")
}
// Setup Video Output
video.videoSettings = [kCVPixelBufferPixelFormatTypeKey as String: kCVPixelFormatType_32BGRA]
session.addOutput(video)
video.connection(with: .video)?.videoOrientation = .portrait
// ****** THE ISSUE IS WITH THIS BLOCK HERE ******
// Setup Depth Output
depth.isFilteringEnabled = true
session.addOutput(depth)
depth.connection(with: .depthData)?.videoOrientation = .portrait
// Setup Synchronizer
synchronizer = AVCaptureDataOutputSynchronizer(dataOutputs: [depth, video, meta])
let outputRect = CGRect(x: 0, y: 0, width: 1, height: 1)
let videoRect = video.outputRectConverted(fromMetadataOutputRect: outputRect)
let depthRect = depth.outputRectConverted(fromMetadataOutputRect: outputRect)
// Ratio of the Depth to Video
scale = max(videoRect.width, videoRect.height) / max(depthRect.width, depthRect.height)
// Set Camera to the framerate of the Depth Data Collection
try camera.lockForConfiguration()
if let fps = camera.activeDepthDataFormat?.videoSupportedFrameRateRanges.first?.minFrameDuration {
camera.activeVideoMinFrameDuration = fps
}
camera.unlockForConfiguration()
super.init()
synchronizer.setDelegate(self, queue: syncQueue)
}
func dataOutputSynchronizer(_ synchronizer: AVCaptureDataOutputSynchronizer, didOutput data: AVCaptureSynchronizedDataCollection) {
guard let delegate = self.delegate else {
return
}
// Check to see if all the data is actually here
guard
let videoSync = data.synchronizedData(for: video) as? AVCaptureSynchronizedSampleBufferData,
!videoSync.sampleBufferWasDropped,
let depthSync = data.synchronizedData(for: depth) as? AVCaptureSynchronizedDepthData,
!depthSync.depthDataWasDropped
else {
return
}
// It's OK if the face isn't found.
let face: AVMetadataFaceObject?
if let metaSync = data.synchronizedData(for: meta) as? AVCaptureSynchronizedMetadataObjectData {
face = (metaSync.metadataObjects.first { $0 is AVMetadataFaceObject }) as? AVMetadataFaceObject
} else {
face = nil
}
// Convert Buffers to CIImage
let videoImage = convertVideoImage(fromBuffer: videoSync.sampleBuffer)
let depthImage = convertDepthImage(fromData: depthSync.depthData, andFace: face)
// Call Delegate
delegate.captureImages(video: videoImage, depth: depthImage, face: face)
}
fileprivate func convertVideoImage(fromBuffer sampleBuffer: CMSampleBuffer) -> CIImage {
// Convert from "CoreMovie?" to CIImage - fairly straight-forward
let pixelBuffer = CMSampleBufferGetImageBuffer(sampleBuffer)
let image = CIImage(cvPixelBuffer: pixelBuffer!)
return image
}
fileprivate func convertDepthImage(fromData depthData: AVDepthData, andFace face: AVMetadataFaceObject?) -> CIImage {
var convertedDepth: AVDepthData
// Convert 16-bif floats up to 32
if depthData.depthDataType != kCVPixelFormatType_DisparityFloat32 {
convertedDepth = depthData.converting(toDepthDataType: kCVPixelFormatType_DisparityFloat32)
} else {
convertedDepth = depthData
}
// Pixel buffer comes straight from depthData
let pixelBuffer = convertedDepth.depthDataMap
let image = CIImage(cvPixelBuffer: pixelBuffer)
return image
}
The original Video Looks like this: (For reference)
When the values are:
// Setup Depth Output
depth.isFilteringEnabled = false
depth.connection(with: .depthData)?.videoOrientation = .portrait
The Image looks like this: (you can see the closer jacket is white, the farther jacket is grey, and the distance is dark grey - as expected)
When the values are:
// Setup Depth Output
depth.isFilteringEnabled = true
depth.connection(with: .depthData)?.videoOrientation = .portrait
The image looks like this: (You can see the color values appear to be in the right places, but the shapes in the smoothing filter appear to be rotated)
When the values are:
// Setup Depth Output
depth.isFilteringEnabled = true
depth.connection(with: .depthData)?.videoOrientation = .landscapeRight
The image looks like this: (Both the colors and the shapes appear to be horizontal)
Am I doing something wrong to get these incorrect values?
I have tried re-ordering the code
// Setup Depth Output
depth.connection(with: .depthData)?.videoOrientation = .portrait
depth.isFilteringEnabled = true
But that does nothing.
I think this is an issue related to iOS 12, because I remember this working just fine under iOS 11 (although I don't have any images saved to prove it)
Any Help is appreciated, thanks!
Unlike the suggestion in other answers to rotate the image after creation (which I found did not work), the AVDepthData documentation provides a method that does the orientation correction for you.
The method is depthDataByApplyingExifOrientation:, which returns an instance of AVDepthData with the orientation applied, i.e. you can create your image in the orientation you desire by passing in the parameter of your choice.
This is my helper method that returns a UIImage with the orientation fix.
- (UIImage *)createDepthMapImageFromCapturePhoto:(AVCapturePhoto *)photo {
// AVCapturePhoto which has depthData - in swift you should confirm this exists
AVDepthData *frontDepthData = [photo depthData];
// Overwrite the instance with the correct orientation applied.
frontDepthData = [frontDepthData depthDataByApplyingExifOrientation:kCGImagePropertyOrientationRight];
// Create the CIImage from the depth data using the available method.
CIImage *ciDepthImage = [CIImage imageWithDepthData:frontDepthData];
// Create CIContext which enables converting CIImage to CGImage
CIContext *context = [[CIContext alloc] init];
// Create the CGImage
CGImageRef img = [context createCGImage:ciDepthImage fromRect:[ciDepthImage extent]];
// Create the final image.
UIImage *depthImage = [UIImage imageWithCGImage:img];
// Return the depth image.
return depthImage;
}
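For reference, a rough Swift equivalent of the helper above (my own sketch; it assumes the AVCapturePhoto actually carries depth data):
func depthMapImage(from photo: AVCapturePhoto) -> UIImage? {
    guard let depthData = photo.depthData else { return nil }
    // Re-orient the depth data itself rather than rotating the image afterwards.
    let rotated = depthData.applyingExifOrientation(.right)
    guard let ciDepthImage = CIImage(depthData: rotated) else { return nil }
    let context = CIContext()
    guard let cgImage = context.createCGImage(ciDepthImage, from: ciDepthImage.extent) else { return nil }
    return UIImage(cgImage: cgImage)
}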

Render a MTIImage

Please don't judge me, I'm just learning Swift.
Recently I installed the MetalPetal framework and followed the instructions:
https://github.com/MetalPetal/MetalPetal#example-code
But I get an error because of MTIContext. Maybe I have to declare something more from MetalPetal?
My Code:
import UIKit
import MetalPetal
import CoreGraphics
class ViewController: UIViewController {
@IBOutlet weak var image1: UIImageView!
override func viewDidLoad() {
super.viewDidLoad()
weak var image: UIImage?
image = image1.image
var ciImage = CIImage(image: image!)
var cgImage1 = convertCIImageToCGImage(inputImage: ciImage!)
let imageFromCGImage = MTIImage(cgImage: cgImage1!)
let inputImage = imageFromCGImage
let filter = MTISaturationFilter()
filter.saturation = 1
filter.inputImage = inputImage
let outputImage = filter.outputImage
let context = MTIContext()
do {
try context.render(outputImage, to: pixelBuffer)
var image3: CIImage? = try context.makeCIImage(from: outputImage!)
//context.makeCIImage(from: image)
//context.makeCGImage(from: image)
} catch {
print(error)
}
// Do any additional setup after loading the view, typically from a nib.
}
override func didReceiveMemoryWarning() {
super.didReceiveMemoryWarning()
// Dispose of any resources that can be recreated.
}
func convertCIImageToCGImage(inputImage: CIImage) -> CGImage? {
let context = CIContext(options: nil)
if let cgImage = context.createCGImage(inputImage, from: inputImage.extent) {
return cgImage
}
return nil
}
}
@YuAo
Input Image
A UIImage is backed by either an underlying Quartz image (retrievable with cgImage) or an underlying Core Image (retrievable from the UIImage with ciImage).
MTIImage offers constructors for both types.
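For example (assuming uiImage is a non-nil UIImage; the cgImage initializer is the one already used in the question, and, to my knowledge, MetalPetal also provides a ciImage-based initializer):
let fromCGImage = uiImage.cgImage.map { MTIImage(cgImage: $0) }
let fromCIImage = uiImage.ciImage.map { MTIImage(ciImage: $0) }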
MTIContext
An MTIContext must be initialized with a device, which can be retrieved by calling MTLCreateSystemDefaultDevice().
Rendering
Rendering to a pixel buffer is not needed; we can get the result by calling makeCGImage.
Test
I've taken your source code above and slightly adapted it to the aforementioned points.
I also added a second UIImageView to see the result of the filtering, and I changed the saturation to 0 to verify that the filter works.
If the GPU or shaders are involved, it makes sense to test on a real device rather than on the simulator.
The result looks like this:
In the upper area you see the original JPG; in the lower area the filter has been applied.
Swift
The simplified Swift code that produces this result looks like this:
override func viewDidLoad() {
super.viewDidLoad()
guard let image = UIImage(named: "regensburg.jpg") else { return }
guard let cgImage = image.cgImage else { return }
imageView1.image = image
let filter = MTISaturationFilter()
filter.saturation = 0
filter.inputImage = MTIImage(cgImage: cgImage)
if let device = MTLCreateSystemDefaultDevice(),
let outputImage = filter.outputImage {
do {
let context = try MTIContext(device: device)
let filteredImage = try context.makeCGImage(from: outputImage)
imageView2.image = UIImage(cgImage: filteredImage)
} catch {
print(error)
}
}
}

Face Detection with Camera

How can I do face detection in realtime just as the "Camera" app does?
I noticed that AVCaptureStillImageOutput is deprecated after 10.0, so I use AVCapturePhotoOutput instead. However, I found that the image I save for facial detection is not very satisfactory. Any ideas?
UPDATE
After trying what @Shravya Boggarapu mentioned: currently I use AVCaptureMetadataOutput to detect the face, without CIFaceDetector. It works as expected. However, when I try to draw the bounds of the face, they seem mislocated. Any idea?
let metaDataOutput = AVCaptureMetadataOutput()
captureSession.sessionPreset = AVCaptureSessionPresetPhoto
let backCamera = AVCaptureDevice.defaultDevice(withDeviceType: .builtInWideAngleCamera, mediaType: AVMediaTypeVideo, position: .back)
do {
let input = try AVCaptureDeviceInput(device: backCamera)
if (captureSession.canAddInput(input)) {
captureSession.addInput(input)
// MetadataOutput instead
if(captureSession.canAddOutput(metaDataOutput)) {
captureSession.addOutput(metaDataOutput)
metaDataOutput.setMetadataObjectsDelegate(self, queue: DispatchQueue.main)
metaDataOutput.metadataObjectTypes = [AVMetadataObjectTypeFace]
previewLayer = AVCaptureVideoPreviewLayer(session: captureSession)
previewLayer?.frame = cameraView.bounds
previewLayer?.videoGravity = AVLayerVideoGravityResizeAspectFill
cameraView.layer.addSublayer(previewLayer!)
captureSession.startRunning()
}
}
} catch {
print(error.localizedDescription)
}
and
extension CameraViewController: AVCaptureMetadataOutputObjectsDelegate {
func captureOutput(_ captureOutput: AVCaptureOutput!, didOutputMetadataObjects metadataObjects: [Any]!, from connection: AVCaptureConnection!) {
if findFaceControl {
findFaceControl = false
for metadataObject in metadataObjects {
if (metadataObject as AnyObject).type == AVMetadataObjectTypeFace {
print("😇😍😎")
print(metadataObject)
let bounds = (metadataObject as! AVMetadataFaceObject).bounds
print("origin x: \(bounds.origin.x)")
print("origin y: \(bounds.origin.y)")
print("size width: \(bounds.size.width)")
print("size height: \(bounds.size.height)")
print("cameraView width: \(self.cameraView.frame.width)")
print("cameraView height: \(self.cameraView.frame.height)")
var face = CGRect()
face.origin.x = bounds.origin.x * self.cameraView.frame.width
face.origin.y = bounds.origin.y * self.cameraView.frame.height
face.size.width = bounds.size.width * self.cameraView.frame.width
face.size.height = bounds.size.height * self.cameraView.frame.height
print(face)
showBounds(at: face)
}
}
}
}
}
Original
see it on GitHub
var captureSession = AVCaptureSession()
var photoOutput = AVCapturePhotoOutput()
var previewLayer: AVCaptureVideoPreviewLayer?
override func viewWillAppear(_ animated: Bool) {
super.viewWillAppear(true)
captureSession.sessionPreset = AVCaptureSessionPresetHigh
let backCamera = AVCaptureDevice.defaultDevice(withMediaType: AVMediaTypeVideo)
do {
let input = try AVCaptureDeviceInput(device: backCamera)
if (captureSession.canAddInput(input)) {
captureSession.addInput(input)
if(captureSession.canAddOutput(photoOutput)){
captureSession.addOutput(photoOutput)
captureSession.startRunning()
previewLayer = AVCaptureVideoPreviewLayer(session: captureSession)
previewLayer?.videoGravity = AVLayerVideoGravityResizeAspectFill
previewLayer?.frame = cameraView.bounds
cameraView.layer.addSublayer(previewLayer!)
}
}
} catch {
print(error.localizedDescription)
}
}
func captureImage() {
let settings = AVCapturePhotoSettings()
let previewPixelType = settings.availablePreviewPhotoPixelFormatTypes.first!
let previewFormat = [kCVPixelBufferPixelFormatTypeKey as String: previewPixelType
]
settings.previewPhotoFormat = previewFormat
photoOutput.capturePhoto(with: settings, delegate: self)
}
func capture(_ captureOutput: AVCapturePhotoOutput, didFinishProcessingPhotoSampleBuffer photoSampleBuffer: CMSampleBuffer?, previewPhotoSampleBuffer: CMSampleBuffer?, resolvedSettings: AVCaptureResolvedPhotoSettings, bracketSettings: AVCaptureBracketedStillImageSettings?, error: Error?) {
if let error = error {
print(error.localizedDescription)
}
// Not include previewPhotoSampleBuffer
if let sampleBuffer = photoSampleBuffer,
let dataImage = AVCapturePhotoOutput.jpegPhotoDataRepresentation(forJPEGSampleBuffer: sampleBuffer, previewPhotoSampleBuffer: nil) {
self.imageView.image = UIImage(data: dataImage)
self.imageView.isHidden = false
self.previewLayer?.isHidden = true
self.findFace(img: self.imageView.image!)
}
}
findFace works with a normal image. However, with the image I capture via the camera it either does not work or sometimes recognizes only one face.
Normal Image
Capture Image
func findFace(img: UIImage) {
guard let faceImage = CIImage(image: img) else { return }
let accuracy = [CIDetectorAccuracy: CIDetectorAccuracyHigh]
let faceDetector = CIDetector(ofType: CIDetectorTypeFace, context: nil, options: accuracy)
// For converting the Core Image Coordinates to UIView Coordinates
let detectedImageSize = faceImage.extent.size
var transform = CGAffineTransform(scaleX: 1, y: -1)
transform = transform.translatedBy(x: 0, y: -detectedImageSize.height)
if let faces = faceDetector?.features(in: faceImage, options: [CIDetectorSmile: true, CIDetectorEyeBlink: true]) {
for face in faces as! [CIFaceFeature] {
// Apply the transform to convert the coordinates
var faceViewBounds = face.bounds.applying(transform)
// Calculate the actual position and size of the rectangle in the image view
let viewSize = imageView.bounds.size
let scale = min(viewSize.width / detectedImageSize.width,
viewSize.height / detectedImageSize.height)
let offsetX = (viewSize.width - detectedImageSize.width * scale) / 2
let offsetY = (viewSize.height - detectedImageSize.height * scale) / 2
faceViewBounds = faceViewBounds.applying(CGAffineTransform(scaleX: scale, y: scale))
print("faceBounds = \(faceViewBounds)")
faceViewBounds.origin.x += offsetX
faceViewBounds.origin.y += offsetY
showBounds(at: faceViewBounds)
}
if faces.count != 0 {
print("Number of faces: \(faces.count)")
} else {
print("No faces 😢")
}
}
}
func showBounds(at bounds: CGRect) {
let indicator = UIView(frame: bounds)
indicator.frame = bounds
indicator.layer.borderWidth = 3
indicator.layer.borderColor = UIColor.red.cgColor
indicator.backgroundColor = .clear
self.imageView.addSubview(indicator)
faceBoxes.append(indicator)
}
There are two ways to detect faces: CIFaceDetector and AVCaptureMetadataOutput. Depending on your requirements, choose what is relevant for you.
CIFaceDetector has more features: it gives you the location of the eyes and mouth, a smile detector, etc.
On the other hand, AVCaptureMetadataOutput face detection is computed on the frames, the detected faces are tracked, and there is no extra code for us to add. I find that, because of the tracking, faces are detected more reliably with this approach. The downside is that you will simply detect faces, not the position of the eyes or mouth.
Another advantage of this method is that orientation issues are smaller: you can set videoOrientation whenever the device orientation changes, and the orientation of the faces will be relative to that orientation.
In my case, my application uses YUV420 as the required format, so using CIDetector (which works with RGB) in real time was not viable. Using AVCaptureMetadataOutput saved a lot of effort and performed more reliably thanks to continuous tracking.
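For reference, requesting that format from the data output looks something like this (my assumption of the relevant setting, not code from the original project):
let videoOutput = AVCaptureVideoDataOutput()
videoOutput.videoSettings = [
    kCVPixelBufferPixelFormatTypeKey as String: kCVPixelFormatType_420YpCbCr8BiPlanarFullRange
]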
Once I had the bounding boxes for the faces, I coded extra features, such as skin detection, and applied them to the still image.
Note: When you capture a still image, the face box information is added along with the metadata so there are no sync issues.
You can also use a combination of the two to get better results.
Explore and evaluate the pros and cons as per your application.
The face rectangle is relative to the image origin, so for the screen it may be different.
Use:
for (AVMetadataFaceObject *faceFeatures in metadataObjects) {
CGRect face = faceFeatures.bounds;
CGRect facePreviewBounds = CGRectMake(face.origin.y * previewLayerRect.size.width,
face.origin.x * previewLayerRect.size.height,
face.size.width * previewLayerRect.size.height,
face.size.height * previewLayerRect.size.width);
/* Draw rectangle facePreviewBounds on screen */
}
To perform face detection on iOS, there is either the CIDetector (Apple) API or the Mobile Vision (Google) API.
IMO, Google Mobile Vision provides better performance.
If you are interested, here is the project you can play with. (iOS 10.2, Swift 3)
At WWDC 2017, Apple introduced Core ML in iOS 11.
The Vision framework makes the face detection more accurate :)
I've made a demo project comparing Vision vs. CIDetector. It also contains face landmark detection in real time.
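For anyone curious, the basic Vision call looks roughly like this (a minimal sketch of my own, not code from the linked demo):
import Vision

func detectFaces(in cgImage: CGImage, completion: @escaping ([VNFaceObservation]) -> Void) {
    let request = VNDetectFaceRectanglesRequest { request, _ in
        completion(request.results as? [VNFaceObservation] ?? [])
    }
    let handler = VNImageRequestHandler(cgImage: cgImage, options: [:])
    DispatchQueue.global(qos: .userInitiated).async {
        // Each VNFaceObservation.boundingBox is normalized (0...1) and still has
        // to be converted to view coordinates before drawing.
        try? handler.perform([request])
    }
}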
A bit late, but here is the solution for the coordinates problem. There is a method you can call on the preview layer to transform the metadata object to your coordinate system: transformedMetadataObject(for: metadataObject).
guard let transformedObject = previewLayer.transformedMetadataObject(for: metadataObject) else {
continue
}
let bounds = transformedObject.bounds
showBounds(at: bounds)
Source: https://developer.apple.com/documentation/avfoundation/avcapturevideopreviewlayer/1623501-transformedmetadataobjectformeta
By the way, in case you are using (or upgrade your project to) Swift 4, the delegate method of AVCaptureMetadataOutputObjectsDelegate has changed to:
func metadataOutput(_ output: AVCaptureMetadataOutput, didOutput metadataObjects: [AVMetadataObject], from connection: AVCaptureConnection)
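Put together, a minimal Swift 4 version might look like this (my own sketch; it assumes the previewLayer, cameraView and faceBoxes array from the question):
func metadataOutput(_ output: AVCaptureMetadataOutput, didOutput metadataObjects: [AVMetadataObject], from connection: AVCaptureConnection) {
    // Clear the boxes drawn for the previous frame.
    faceBoxes.forEach { $0.removeFromSuperview() }
    faceBoxes.removeAll()
    for object in metadataObjects where object is AVMetadataFaceObject {
        // Let the preview layer do the coordinate conversion.
        guard let transformed = previewLayer?.transformedMetadataObject(for: object) else { continue }
        let box = UIView(frame: transformed.bounds)
        box.layer.borderColor = UIColor.red.cgColor
        box.layer.borderWidth = 2
        box.backgroundColor = .clear
        cameraView.addSubview(box)
        faceBoxes.append(box)
    }
}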
Kind regards
extension CameraViewController: AVCaptureMetadataOutputObjectsDelegate {
func captureOutput(_ captureOutput: AVCaptureOutput!, didOutputMetadataObjects metadataObjects: [Any]!, from connection: AVCaptureConnection!) {
if findFaceControl {
findFaceControl = false
// Convert every detected face into preview-layer coordinates.
let faces = metadataObjects.flatMap { $0 as? AVMetadataFaceObject }.flatMap { (face) -> CGRect? in
guard let localizedFace =
previewLayer?.transformedMetadataObject(for: face) else { return nil }
return localizedFace.bounds }
for face in faces {
let temp = UIView(frame: face)
temp.layer.borderColor = UIColor.white.cgColor
temp.layer.borderWidth = 2.0
view.addSubview(temp)
}
}
}
}
Be sure to remove the views created by didOutputMetadataObjects.
Keeping track of the active facial ids is the best way to do this ^
Also, when you're trying to find the location of faces for your preview layer, it is much easier to use the facial metadata and transform it. Also, I think CIDetector is junk; metadata output uses hardware for face detection, making it really fast.
1. Create the capture session.
2. For AVCaptureVideoDataOutput, create the following settings:
output.videoSettings = [ kCVPixelBufferPixelFormatTypeKey as AnyHashable: Int(kCMPixelFormat_32BGRA) ]
3. When you receive the CMSampleBuffer, create an image:
DispatchQueue.main.async {
let sampleImg = self.imageFromSampleBuffer(sampleBuffer: sampleBuffer)
self.imageView.image = sampleImg
}
func imageFromSampleBuffer(sampleBuffer : CMSampleBuffer) -> UIImage
{
// Get a CMSampleBuffer's Core Video image buffer for the media data
let imageBuffer = CMSampleBufferGetImageBuffer(sampleBuffer);
// Lock the base address of the pixel buffer
CVPixelBufferLockBaseAddress(imageBuffer!, CVPixelBufferLockFlags.readOnly);
// Get the base address of the pixel buffer
let baseAddress = CVPixelBufferGetBaseAddress(imageBuffer!);
// Get the number of bytes per row for the pixel buffer
let bytesPerRow = CVPixelBufferGetBytesPerRow(imageBuffer!);
// Get the pixel buffer width and height
let width = CVPixelBufferGetWidth(imageBuffer!);
let height = CVPixelBufferGetHeight(imageBuffer!);
// Create a device-dependent RGB color space
let colorSpace = CGColorSpaceCreateDeviceRGB();
// Create a bitmap graphics context with the sample buffer data
var bitmapInfo: UInt32 = CGBitmapInfo.byteOrder32Little.rawValue
bitmapInfo |= CGImageAlphaInfo.premultipliedFirst.rawValue & CGBitmapInfo.alphaInfoMask.rawValue
//let bitmapInfo: UInt32 = CGBitmapInfo.alphaInfoMask.rawValue
let context = CGContext.init(data: baseAddress, width: width, height: height, bitsPerComponent: 8, bytesPerRow: bytesPerRow, space: colorSpace, bitmapInfo: bitmapInfo)
// Create a Quartz image from the pixel data in the bitmap graphics context
let quartzImage = context?.makeImage();
// Unlock the pixel buffer
CVPixelBufferUnlockBaseAddress(imageBuffer!, CVPixelBufferLockFlags.readOnly);
// Create an image object from the Quartz image
let image = UIImage.init(cgImage: quartzImage!);
return (image);
}
By looking at your code I detected 2 things that could lead to wrong/poor face detection.
One of them is the face detector's feature options, where you are filtering the results with [CIDetectorSmile: true, CIDetectorEyeBlink: true]. Try setting it to nil: faceDetector?.features(in: faceImage, options: nil)
My other guess is the orientation of the result image. I noticed you use the AVCapturePhotoOutput.jpegPhotoDataRepresentation method to generate the source image for detection, and by default the system generates that image with a specific orientation, of type Left/LandscapeLeft, I think. So you can basically tell the face detector to take that into account by using the CIDetectorImageOrientation key.
CIDetectorImageOrientation: the value for this key is an integer NSNumber from 1..8 such as that found in kCGImagePropertyOrientation. If present, the detection will be done based on that orientation but the coordinates in the returned features will still be based on those of the image.
Try to set it like faceDetector?.features(in: faceImage, options: [CIDetectorImageOrientation: 8 /*Left, bottom*/]).

Resources