Changing size of image obtained from camera explanation - ios

I am new to iOS and have experience with image processing coding in other languages, which I was hoping to translate into an app, however I am getting some unusual behavior which I don't understand. When I convert an image to a Data array and look at the number of elements in the array, with every new image this number changes. When I look at the specific data in the array, the values are 0-255 which matches what I would expect for a grayscale image, but I am confused why the size (or number of elements) in the data array changes. I would expect it to be held constant since I set the captureSession to 640x480. Why is this not the case? Even if it wasn't a grayscale image, I would expect the size to remain the same and not change picture to picture.
UPDATE:
I am getting the uiimage from AV, and the code is shown below. The other rest of the code not shown is just beginning the session. I basically want to turn the image into raw pixel data, which I have seen a lot of different ways to do this, but this seems like a good method.
Relevant Code:
#objc func timerHandle() {
imageView.image = uiimages
}
func captureOutput(_ output: AVCaptureOutput, didOutput sampleBuffer: CMSampleBuffer, from connection: AVCaptureConnection) {
uiimages = sampleBuffer.image(orientation: .down, scale: 1.0)!
print(uiimages) //output1
let data = sampleBuffer.data()
let newData = Array(data!)
print(data!.count) //output2
}
extension CMSampleBuffer {
func image(orientation: UIImageOrientation = .left, scale: CGFloat = 1.0) -> UIImage? {
if let buffer = CMSampleBufferGetImageBuffer(self) {
let ciImage = CIImage(cvPixelBuffer: buffer).applyingFilter("CIColorControls", parameters: [kCIInputSaturationKey:0.0])
return UIImage(ciImage: ciImage, scale: scale, orientation: orientation)
}
return nil
}
func data(orientation: UIImageOrientation = .left, scale: CGFloat = 1.0) -> Data? {
if let buffer = CMSampleBufferGetImageBuffer(self) {
let size = self.image()?.size
let scale = self.image()?.scale
let ciImage = CIImage(cvPixelBuffer: buffer).applyingFilter("CIColorControls", parameters: [kCIInputSaturationKey:0.0])
UIGraphicsBeginImageContextWithOptions(size!, false, scale!)
defer { UIGraphicsEndImageContext() }
UIImage(ciImage: ciImage).draw(in: CGRect(origin: .zero, size: size!))
guard let redraw = UIGraphicsGetImageFromCurrentImageContext() else { return nil }
return UIImagePNGRepresentation(redraw)
}
return nil
}
}
At output1, when I straight print the uiimage variable I get:
<UIImage: 0x1c40b0ec0>, {640, 480}
which shows correct dimensions
at output2, when I print the count, every time captureOutput is called I get a different value:
225726
224474
225961
640x480 should give me 307,200, so why am I not getting constant numbers at least, even if the value isn't correct.

Related

slow frame rate when rendering cifiltered ciimage and MTKView while using face detection (Vision and CIDetection)

I have an app which does real time filtering on camera feed, i'm getting each frame from camera and then do some filtering using CIFilter and then pass the final frame(CIImage) to MTKView to be shown on my swiftUI view, it works fine, but when i want to do face/body detection, real time, on camera feed, frame rate goes down to 8 frames per second and super laggy.
i tried anything i could find on the internet, using vision, CIDetector, CoreML, everything is the same result, well, i would do this on global thread, which makes the UI responsive but the feed which i'm showing into the main view is still laggy, but things like scrollview are working fine.
so i tried to change the view from MTKView to UIImageView, Xcode shows its rendering at 120FPS (which i dont understand why, its 30FPS when not using any face detection) but the feed is still laggy, cannot keep up somehow to the output frame rate, i'm new to this, i dont understand why is it like that.
i also tried just to pass the coming image to MTKView (without any filtering in between, with face detection) also the same laggy result, without face detection, it goes to 30FPS (why not 120?).
this is the code i'm using for converting sampleBuffer to ciImage
extension CICameraCapture: AVCaptureVideoDataOutputSampleBufferDelegate {
func captureOutput(_ output: AVCaptureOutput, didOutput sampleBuffer: CMSampleBuffer, from connection: AVCaptureConnection) {
guard let imageBuffer = CMSampleBufferGetImageBuffer(sampleBuffer) else { return }
var ciImage = CIImage(cvImageBuffer: imageBuffer)
if self.cameraPosition == AVCaptureDevice.Position.front {
ciImage = ciImage.oriented(.downMirrored)
}
ciImage = ciImage.transformed(by: CGAffineTransform(rotationAngle: 3 * .pi / 2))
ciImage = ciImage.transformToOrigin(withSize: ciImage.extent.size)
detectFace(image: ciImage) // this is for detecting face realtime, i have done it in vision
//and also cidetector - cidetector is a little bit faster when setted to low accuracy
//but still not desired result(frame rate)
DispatchQueue.main.async {
self.callback(ciImage)
}
}
}
and this is the MTKView code, which is very simple and basic implementation of it:
import MetalKit
import CoreImage
class MetalRenderView: MTKView {
//var textureCache: CVMetalTextureCache?
override init(frame frameRect: CGRect, device: MTLDevice?) {
super.init(frame: frameRect, device: device)
if super.device == nil {
fatalError("No support for Metal. Sorry")
}
framebufferOnly = false
preferredFramesPerSecond = 120
sampleCount = 2
}
required init(coder: NSCoder) {
fatalError("init(coder:) has not been implemented")
}
private lazy var commandQueue: MTLCommandQueue? = {
[unowned self] in
return self.device!.makeCommandQueue()
}()
private lazy var ciContext: CIContext = {
[unowned self] in
return CIContext(mtlDevice: self.device!)
}()
var image: CIImage? {
didSet {
renderImage()
}
}
private func renderImage() {
guard var image = image else { return }
image = image.transformToOrigin(withSize: drawableSize) // this is an extension to resize
//the image to the render size so i dont get the render error while rendering a frame
let commandBuffer = commandQueue?.makeCommandBuffer()
let destination = CIRenderDestination(width: Int(drawableSize.width),
height: Int(drawableSize.height),
pixelFormat: .bgra8Unorm,
commandBuffer: commandBuffer) { () -> MTLTexture in
return self.currentDrawable!.texture
}
try! ciContext.startTask(toRender: image, to: destination)
commandBuffer?.present(currentDrawable!)
commandBuffer?.commit()
draw()
}
}
and here is the code for face detection using CIDetector:
func detectFace (image: CIImage){
//DispatchQueue.global().async {
let options = [CIDetectorAccuracy: CIDetectorAccuracyHigh,
CIDetectorSmile: true, CIDetectorTypeFace: true] as [String : Any]
let faceDetector = CIDetector(ofType: CIDetectorTypeFace, context: nil,
options: options)!
let faces = faceDetector.features(in: image)
if let face = faces.first as? CIFaceFeature {
AppState.shared.mouth = face.mouthPosition
AppState.shared.leftEye = face.leftEyePosition
AppState.shared.rightEye = face.rightEyePosition
}
//}
}
what I have tried
1) different face detection methods, using Vision, CIDetector and also CoreML(this one not very deeply as i dont have experience in it)
I would get the detection info, but frame rate is 8 or at the best case its 15 (which would be a delayed detection)
2) I've read somewhere that it might be result of the image colorsapce so i have tried different video setting and different rendering colorspace, still no change in the frame rate.
3) I'm somehow sure that it might be regarding to pixelbuffer release time, so i deep copied the imageBuffer and pass it to the detection, beside some memory issues it went up to 15 FPS, but still not minimum 30FPS. in here i also tried to convert imageBuffer to ciimage and then render ciimage to cgimage and the back to ciimage to just release the buffer, but also could not get more than 15FPS (well on average, sometimes goes to 17 or 19, but still laggy)
i'm new in this and still trying to figure it out, i would appreciate any suggestions, samples or tips that could direct me to a better path of solving this.
update
this is the camera capture setup code:
class CICameraCapture: NSObject {
typealias Callback = (CIImage?) -> ()
private var cameraPosition = AVCaptureDevice.Position.front
var ciContext: CIContext?
let callback: Callback
private let session = AVCaptureSession()
private let sampleBufferQueue = DispatchQueue(label: "buffer", qos: .userInitiated)//, attributes: [], autoreleaseFrequency: .workItem)
// face detection
//private var sequenceHandler = VNSequenceRequestHandler()
//var request: VNCoreMLRequest!
//var visionModel: VNCoreMLModel!
//let detectionQ = DispatchQueue(label: "detectionQ", qos: .background)//, attributes: [], autoreleaseFrequency: .workItem)
init(callback: #escaping Callback) {
self.callback = callback
super.init()
prepareSession()
ciContext = CIContext(mtlDevice: MTLCreateSystemDefaultDevice()!)
}
func start() {
session.startRunning()
}
func stop() {
session.stopRunning()
}
private func prepareSession() {
session.sessionPreset = .high //.hd1920x1080
let cameraDiscovery = AVCaptureDevice.DiscoverySession(deviceTypes: [.builtInDualCamera, .builtInWideAngleCamera], mediaType: .video, position: cameraPosition)
guard let camera = cameraDiscovery.devices.first else { fatalError("Can't get hold of the camera") }
//try! camera.lockForConfiguration()
//camera.activeVideoMinFrameDuration = camera.formats[0].videoSupportedFrameRateRanges[0].minFrameDuration
//camera.activeVideoMaxFrameDuration = camera.formats[0].videoSupportedFrameRateRanges[0].maxFrameDuration
//camera.unlockForConfiguration()
guard let input = try? AVCaptureDeviceInput(device: camera) else { fatalError("Can't get hold of the camera") }
session.addInput(input)
let output = AVCaptureVideoDataOutput()
output.videoSettings = [:]
//print(output.videoSettings.description)
//[875704438, 875704422, 1111970369]
//output.videoSettings = [kCVPixelBufferPixelFormatTypeKey as String : Int(kCVPixelFormatType_32BGRA)]
output.setSampleBufferDelegate(self, queue: sampleBufferQueue)
session.addOutput(output)
session.commitConfiguration()
}
}

Filtering Depth Data on iOS 12 appears to be rotated

I am having an issue where the Depth Data for the .builtInDualCamera appears to be rotated 90 degrees when isFilteringEnabled = true
Here is my code:
fileprivate let session = AVCaptureSession()
fileprivate let meta = AVCaptureMetadataOutput()
fileprivate let video = AVCaptureVideoDataOutput()
fileprivate let depth = AVCaptureDepthDataOutput()
fileprivate let camera: AVCaptureDevice
fileprivate let input: AVCaptureDeviceInput
fileprivate let synchronizer: AVCaptureDataOutputSynchronizer
init(delegate: CaptureSessionDelegate?) throws {
self.delegate = delegate
session.sessionPreset = .vga640x480
// Setup Camera Input
let discovery = AVCaptureDevice.DiscoverySession(deviceTypes: [.builtInDualCamera], mediaType: .video, position: .unspecified)
if let device = discovery.devices.first {
camera = device
} else {
throw SessionError.CameraNotAvailable("Unable to load camera")
}
input = try AVCaptureDeviceInput(device: camera)
session.addInput(input)
// Setup Metadata Output (Face)
session.addOutput(meta)
if meta.availableMetadataObjectTypes.contains(AVMetadataObject.ObjectType.face) {
meta.metadataObjectTypes = [ AVMetadataObject.ObjectType.face ]
} else {
print("Can't Setup Metadata: \(meta.availableMetadataObjectTypes)")
}
// Setup Video Output
video.videoSettings = [kCVPixelBufferPixelFormatTypeKey as String: kCVPixelFormatType_32BGRA]
session.addOutput(video)
video.connection(with: .video)?.videoOrientation = .portrait
// ****** THE ISSUE IS WITH THIS BLOCK HERE ******
// Setup Depth Output
depth.isFilteringEnabled = true
session.addOutput(depth)
depth.connection(with: .depthData)?.videoOrientation = .portrait
// Setup Synchronizer
synchronizer = AVCaptureDataOutputSynchronizer(dataOutputs: [depth, video, meta])
let outputRect = CGRect(x: 0, y: 0, width: 1, height: 1)
let videoRect = video.outputRectConverted(fromMetadataOutputRect: outputRect)
let depthRect = depth.outputRectConverted(fromMetadataOutputRect: outputRect)
// Ratio of the Depth to Video
scale = max(videoRect.width, videoRect.height) / max(depthRect.width, depthRect.height)
// Set Camera to the framerate of the Depth Data Collection
try camera.lockForConfiguration()
if let fps = camera.activeDepthDataFormat?.videoSupportedFrameRateRanges.first?.minFrameDuration {
camera.activeVideoMinFrameDuration = fps
}
camera.unlockForConfiguration()
super.init()
synchronizer.setDelegate(self, queue: syncQueue)
}
func dataOutputSynchronizer(_ synchronizer: AVCaptureDataOutputSynchronizer, didOutput data: AVCaptureSynchronizedDataCollection) {
guard let delegate = self.delegate else {
return
}
// Check to see if all the data is actually here
guard
let videoSync = data.synchronizedData(for: video) as? AVCaptureSynchronizedSampleBufferData,
!videoSync.sampleBufferWasDropped,
let depthSync = data.synchronizedData(for: depth) as? AVCaptureSynchronizedDepthData,
!depthSync.depthDataWasDropped
else {
return
}
// It's OK if the face isn't found.
let face: AVMetadataFaceObject?
if let metaSync = data.synchronizedData(for: meta) as? AVCaptureSynchronizedMetadataObjectData {
face = (metaSync.metadataObjects.first { $0 is AVMetadataFaceObject }) as? AVMetadataFaceObject
} else {
face = nil
}
// Convert Buffers to CIImage
let videoImage = convertVideoImage(fromBuffer: videoSync.sampleBuffer)
let depthImage = convertDepthImage(fromData: depthSync.depthData, andFace: face)
// Call Delegate
delegate.captureImages(video: videoImage, depth: depthImage, face: face)
}
fileprivate func convertVideoImage(fromBuffer sampleBuffer: CMSampleBuffer) -> CIImage {
// Convert from "CoreMovie?" to CIImage - fairly straight-forward
let pixelBuffer = CMSampleBufferGetImageBuffer(sampleBuffer)
let image = CIImage(cvPixelBuffer: pixelBuffer!)
return image
}
fileprivate func convertDepthImage(fromData depthData: AVDepthData, andFace face: AVMetadataFaceObject?) -> CIImage {
var convertedDepth: AVDepthData
// Convert 16-bif floats up to 32
if depthData.depthDataType != kCVPixelFormatType_DisparityFloat32 {
convertedDepth = depthData.converting(toDepthDataType: kCVPixelFormatType_DisparityFloat32)
} else {
convertedDepth = depthData
}
// Pixel buffer comes straight from depthData
let pixelBuffer = convertedDepth.depthDataMap
let image = CIImage(cvPixelBuffer: pixelBuffer)
return image
}
The original Video Looks like this: (For reference)
When the values are:
// Setup Depth Output
depth.isFilteringEnabled = false
depth.connection(with: .depthData)?.videoOrientation = .portrait
The Image looks like this: (you can see the closer jacket is white, the farther jacket is grey, and the distance is dark grey - as expected)
When the values are:
// Setup Depth Output
depth.isFilteringEnabled = true
depth.connection(with: .depthData)?.videoOrientation = .portrait
The image looks like this: (You can see the color values appear to be in the right places, but the shapes in the smoothing filter appear to be rotated)
When the values are:
// Setup Depth Output
depth.isFilteringEnabled = true
depth.connection(with: .depthData)?.videoOrientation = .landscapeRight
The image looks like this: (Both the colors and the shapes appear to be horizontal)
Am I doing something wrong to get these incorrect values?
I have tried re-ordering the code
// Setup Depth Output
depth.connection(with: .depthData)?.videoOrientation = .portrait
depth.isFilteringEnabled = true
But that does nothing.
I think this is an issue related to iOS 12, because I remember this working just fine under iOS 11 (although I don't have any images saved to prove it)
Any Help is appreciated, thanks!
Unlike the suggestion to review other answers on rotating the image after creation, which I found did not work, in the AVDepthData documentation, there is a method available that does the orientation correction for you.
The method is called: depthDataByApplyingExifOrientation: which returns an instance of AVDepthData with the orientation applied, ie. you can create your image in the correct orientation you desire by passing in the parameter of your choice.
This is my helper method that returns a UIImage with the orientation fix.
- (UIImage *)createDepthMapImageFromCapturePhoto:(AVCapturePhoto *)photo {
// AVCapturePhoto which has depthData - in swift you should confirm this exists
AVDepthData *frontDepthData = [photo depthData];
// Overwrite the instance with the correct orientation applied.
frontDepthData = [frontDepthData depthDataByApplyingExifOrientation:kCGImagePropertyOrientationRight];
// Create the CIImage from the depth data using the available method.
CIImage *ciDepthImage = [CIImage imageWithDepthData:frontDepthData];
// Create CIContext which enables converting CIImage to CGImage
CIContext *context = [[CIContext alloc] init];
// Create the CGImage
CGImageRef img = [context createCGImage:ciDepthImage fromRect:[ciDepthImage extent]];
// Create the final image.
UIImage *depthImage = [UIImage imageWithCGImage:img];
// Return the depth image.
return depthImage;
}

Crop CMSampleBuffer

I'm using Firebase ML Kit to recognise text that's visible within a small window. But I want to make it more efficient, helping it to not analyse unnecessary data. So I would like to crop the image being sent to the model.
Firebase ML is taking a VisionImage which takes a CMSampleBuffer.
The only part of the image I would need is the same width, but is only about 100px high, like the blue part here:
I haven't found a good way to do this yet, or is it better to take the image from the preview that is shown to the user and then convert it back?
I'm thinking that this should be done inside the captureOutput function from the AVCaptureVideoDataOutputSampleBufferDelegate. My function looks like this today:
func captureOutput(
_ output: AVCaptureOutput,
didOutput sampleBuffer: CMSampleBuffer,
from connection: AVCaptureConnection
) {
DispatchQueue.main.async {
self.updatePreviewOverlayView()
}
guard let imageBuffer = CMSampleBufferGetImageBuffer(sampleBuffer) else { return }
self.lastFrame = sampleBuffer
let visionImage = VisionImage(buffer: sampleBuffer)
let metadata = VisionImageMetadata()
let visionOrientation = VisionDetectorImageOrientation.rightTop
metadata.orientation = visionOrientation
visionImage.metadata = metadata
let imageWidth = CGFloat(CVPixelBufferGetWidth(imageBuffer))
let imageHeight = CGFloat(CVPixelBufferGetHeight(imageBuffer))
self.recognizeTextOnDevice(in: visionImage, width: imageWidth, height: imageHeight)
}

Image output from AVFoundation occupied only 1/4 of screen

I played around with AVFoundation try to apply filter to live video. I tried to apply filter to AVCaptureVideoDataOutput, but the output occupied only 1/4 of the view.
Here are some of my related code
Capturing
let availableCameraDevices = AVCaptureDevice.devicesWithMediaType(AVMediaTypeVideo)
for device in availableCameraDevices as! [AVCaptureDevice] {
if device.position == .Back {
backCameraDevice = device
} else if device.position == .Front {
frontCameraDevice = device
}
}
Configure output
private func configureVideoOutput() {
videoOutput = AVCaptureVideoDataOutput()
videoOutput?.setSampleBufferDelegate(self, queue: dispatch_queue_create("sample buffer delegate", DISPATCH_QUEUE_SERIAL))
if session.canAddOutput(videoOutput) {
session.addOutput(videoOutput)
}
}
Get image
func captureOutput(captureOutput: AVCaptureOutput!, didOutputSampleBuffer sampleBuffer:
CMSampleBuffer!, fromConnection connection: AVCaptureConnection!) {
// Grab the pixelbuffer
let pixelBuffer = CMSampleBufferGetImageBuffer(sampleBuffer)!
// create a CIImage from it, rotate it, and zero the origin
var image = CIImage(CVPixelBuffer: pixelBuffer)
image = image.imageByApplyingTransform(CGAffineTransformMakeRotation(CGFloat(-M_PI_2)))
let origin = image.extent.origin
image = image.imageByApplyingTransform(CGAffineTransformMakeTranslation(-origin.x, -origin.y))
self.manualDelegate?.cameraController(self, didOutputImage: image)
}
Render
func cameraController(cameraController: CameraController, didOutputImage image: CIImage) {
if glContext != EAGLContext.currentContext() {
EAGLContext.setCurrentContext(glContext)
}
let filteredImage = image.imageByApplyingFilter("CIColorControls", withInputParameters: [kCIInputSaturationKey: 0.0])
var rect = view.bounds
glView.bindDrawable()
ciContext.drawImage(filteredImage, inRect: rect, fromRect: image.extent)
glView.display()
}
I expected retina display and scale factor causing this, but don't sure where should I deal with this. I already set content scale factor to GLKView, but no luck.
private var glView: GLKView {
// Set in storyboard
return view as! GLKView
}
glView.contentScaleFactor = glView.bounds.size.width / UIScreen.mainScreen().bounds.size.width * UIScreen.mainScreen().scale
Your problem is with the output rect used in the drawImage function:
ciContext.drawImage(filteredImage, inRect: rect, fromRect: image.extent)
The image's extent is in actual pixels, while the view's bounds are points, and are not adjusted by the contentScaleFactor to get pixels. Your device undoubtedly has a contentScaleFactor of 2.0, thus it's 1/2 the size in each dimension.
Instead, set the rect as:
var rect = CGRect(x: 0, y: 0, width: glView.drawableWidth,
height: glView.drawableHeight)
drawableWidth and drawableHeight return the dimension in pixels, accounting for the contentScaleFactor. See:
https://developer.apple.com/reference/glkit/glkview/1615591-drawablewidth
Also, there is no need to set glView's contentScaleFactor
Which format are you setting before starting the capture? Are you sure that the video preview layer is filling the whole screen?
You have 2 ways to set resolution during an avcapture:
Choose the AVCaptureDeviceFormat with the highest resolution, by looking through available capture format
Using the sessionPreset property of you capture session. Doc here.

Custom CIFilter subclass returns image with scale out of whack

I'm new to writing CIFilters, and I'm stuck on this problem. Here is my source image being displayed in a UIImageView with contentMode set to Aspect Fit:
Here is the image returned from my CIFilter object being displayed in the same UIImageView:
I've tried copying over the original scale and orientation from my source image to the UIImage being constructed from the CIImage returned from the filter with no luck.
What might be causing this?
I'm thinking I'm doing something wrong in my CIFilter class. I am starting to suspect something in my outputImage method:?
func outputImage() -> CIImage? {
if let inputImage = inputImage {
let dod = inputImage.extent()
if let kernel = kernel {
let args = [inputImage as AnyObject]
let dod = inputImage.extent().rectByInsetting(dx: -1, dy: -1)
return kernel.applyWithExtent(dod, roiCallback: {
(index, rect) in
return rect.rectByInsetting(dx: -1, dy: -1)
}, arguments: args)
}
}
return nil
}

Resources