How to handle video overexposure in Swift - ios

I'm working on a camera app, and my app handles overexposure very differently from the default iPhone camera app.
As the image below shows, the default camera app adjusts the exposure when it detects overexposure. (The whole screen seems to turn slightly yellowish to get rid of the overexposed bright area.) As a result, I can see the white keyboard even when dark objects cover most of the screen.
Here is my app. I set the exposure mode to continuous auto exposure, but it does not adjust the overexposed area.
I want to adjust the brightness, but I also don't want to display an image that includes the overexposed part. (I mean, I just want my app to look like the default camera does.)
This is the code that adjusts the focus and exposure.
func setFocus(with focusMode: AVCaptureDevice.FocusMode, with exposureMode: AVCaptureDevice.ExposureMode, at point: CGPoint, monitorSubjectAreaChange: Bool, completion: @escaping (Bool) -> Void) {
    guard let captureDevice = captureDevice else {
        // Report failure instead of returning silently.
        completion(false)
        return
    }
    do {
        try captureDevice.lockForConfiguration()
    } catch {
        completion(false)
        return
    }
    if captureDevice.isSmoothAutoFocusSupported, !captureDevice.isSmoothAutoFocusEnabled {
        captureDevice.isSmoothAutoFocusEnabled = true
    }
    if captureDevice.isFocusPointOfInterestSupported, captureDevice.isFocusModeSupported(focusMode) {
        captureDevice.focusPointOfInterest = point
        captureDevice.focusMode = focusMode
    }
    if captureDevice.isExposurePointOfInterestSupported, captureDevice.isExposureModeSupported(exposureMode) {
        captureDevice.exposurePointOfInterest = point
        captureDevice.exposureMode = exposureMode
    }
    captureDevice.isSubjectAreaChangeMonitoringEnabled = monitorSubjectAreaChange
    captureDevice.unlockForConfiguration()
    completion(true)
}
And this is how I call the function:
func setFocusToCenter() {
    let center: CGPoint = CGPoint(x: cameraView.bounds.width / 2, y: cameraView.bounds.height / 2)
    let pointInCamera = cameraView.layer.captureDevicePointConverted(fromLayerPoint: center)
    setFocus(with: .continuousAutoFocus, with: .continuousAutoExposure, at: pointInCamera, monitorSubjectAreaChange: false, completion: { [weak self] success in
        guard let self = self, success else { return }
        // do some animation
    })
}
If I need to work on the camera exposure, and I have already set the exposure mode to continuous auto exposure, do I still need to handle overexposure in code?
Also, if you have experience adjusting for overexposure, how did you achieve it?
Added this part later...
I took screenshots to compare my app's camera with the native iPhone camera app.
Here is my camera app with .continuousAutoExposure and the exposurePointOfInterest set to the center of the screen.
However, the native iPhone camera app won't overexpose if I shoot a dark image from a similar distance...
I think the native iPhone app is also in .continuousAutoExposure mode until I touch the screen and adjust the focus to a point.
I dropped the image quality in order to paste the images into this post, but I don't really see any blur in the original screenshots. I configured the frame rate to 30 fps (the native iPhone camera is also at 30).
So what could be the reason for this overexposure?

Related

Choosing suitable camera for barcode scanning when using AVCaptureDeviceTypeBuiltInTripleCamera

I've had some barcode scanning code in my iOS app for many years now. Recently, users have begun complaining that it doesn't work with an iPhone 13 Pro.
During investigation, it seemed that I should be using the built in triple camera if available. Doing that did fix it for iPhone 13 Pro but subsequently broke it for iPhone 12 Pro, which seemed to be working fine with the previous code.
How are you supposed to choose a suitable camera for all devices? It seems bizarre to me that Apple has suddenly made it so difficult to use this previously working code.
Here is my current code. The "fallback" section is what the code has used for years.
_session = [[AVCaptureSession alloc] init];
// Must use macro camera for barcode scanning on newer devices, otherwise the image is blurry
if (@available(iOS 13.0, *)) {
    AVCaptureDeviceDiscoverySession *discoverySession =
        [AVCaptureDeviceDiscoverySession discoverySessionWithDeviceTypes:@[AVCaptureDeviceTypeBuiltInTripleCamera]
                                                               mediaType:AVMediaTypeVideo
                                                                position:AVCaptureDevicePositionBack];
    if (discoverySession.devices.count == 0) {
        // no BuiltInTripleCamera
        _device = [AVCaptureDevice defaultDeviceWithMediaType:AVMediaTypeVideo];
    } else {
        _device = discoverySession.devices.firstObject;
    }
} else {
    // Fallback on earlier versions
    _device = [AVCaptureDevice defaultDeviceWithMediaType:AVMediaTypeVideo];
}
The accepted answer works, but not all the time. Because lenses have different minimum focus distances, it is harder for the device to focus on small barcodes: you have to hold your device too close (closer than the minimum focus distance), so it will never autofocus on them. This used to work with older lenses, where the minimum focus distance was 10-12 cm, but newer lenses, especially those on the iPhone 14 Pro models with a distance of 20 cm, will be problematic.
The solution is ideally to use AVCaptureDeviceTypeBuiltInWideAngleCamera and set videoZoomFactor on the AVCaptureDevice to zoom in a little bit so the barcode will be nicely focused. The value should be calculated based on the input video properties and the minimum size of the barcode.
For details, please refer to this WWDC 2021 video, where they address exactly this issue: https://developer.apple.com/videos/play/wwdc2021/10047/?time=133.
Here is an implementation of a class that sets the zoom factor on a device and works for me. You can instantiate this class with your device instance and call applyAutomaticZoomFactorIfNeeded() just before you commit your capture session configuration.
///
/// This class automatically zooms the device to compensate for its minimum focus distance. This distance appears to be problematic
/// when scanning barcodes too small or if a device's minimum focus distance is too large (like on iPhone 14 Pro and Max - 20cm, iPhone 13 Pro - 15 cm, older iPhones 12 or less.). By zooming
/// the input the device will be able to focus on a preview and complete the scan more easily.
///
/// - See https://developer.apple.com/videos/play/wwdc2021/10047/?time=133 for more detailed explanation and
/// - See https://developer.apple.com/documentation/avfoundation/capture_setup/avcambarcode_detecting_barcodes_and_faces
/// for implementation instructions.
///
@available(iOS 15.0, *)
final class DeviceAutomaticVideoZoomFactor {
    enum Errors : Error {
        case minimumFocusDistanceUnknown
        case deviceLockFailed
    }
    private let device: AVCaptureDevice
    private let minimumCodeSize: Float
    init(device: AVCaptureDevice, minimumCodeSize: Float) {
        self.device = device
        self.minimumCodeSize = minimumCodeSize
    }
    ///
    /// Optimize the user experience for scanning QR codes down to smaller sizes (determined by `minimumCodeSize`, for example 2x2 cm).
    /// When scanning a QR code of that size, the user may need to get closer than the camera's minimum focus distance to fill the rect of interest.
    /// To have the QR code both fill the rect and still be in focus, we may need to apply some zoom.
    ///
    func applyAutomaticZoomFactorIfNeeded() throws {
        let deviceMinimumFocusDistance = Float(self.device.minimumFocusDistance)
        guard deviceMinimumFocusDistance != -1 else {
            throw Errors.minimumFocusDistanceUnknown
        }
        // Logger.logIfStaging appears to be the author's own logging helper; substitute your own logging.
        Logger.logIfStaging("Video Zoom Factor", "using device: \(self.device)")
        Logger.logIfStaging("Video Zoom Factor", "device minimum focus distance: \(deviceMinimumFocusDistance)")
        /*
         Set an initial square rect of interest that is 100% of the view's shortest side.
         This means that the region of interest will appear in the same spot regardless
         of whether the app starts in portrait or landscape.
         */
        let formatDimensions = CMVideoFormatDescriptionGetDimensions(self.device.activeFormat.formatDescription)
        let rectOfInterestWidth = Double(formatDimensions.height) / Double(formatDimensions.width)
        let deviceFieldOfView = self.device.activeFormat.videoFieldOfView
        let minimumSubjectDistanceForCode = self.minimumSubjectDistanceForCode(fieldOfView: deviceFieldOfView,
                                                                               minimumCodeSize: self.minimumCodeSize,
                                                                               previewFillPercentage: Float(rectOfInterestWidth))
        Logger.logIfStaging("Video Zoom Factor", "minimum subject distance: \(minimumSubjectDistanceForCode)")
        guard minimumSubjectDistanceForCode < deviceMinimumFocusDistance else {
            return
        }
        let zoomFactor = deviceMinimumFocusDistance / minimumSubjectDistanceForCode
        Logger.logIfStaging("Video Zoom Factor", "computed zoom factor: \(zoomFactor)")
        try self.device.lockForConfiguration()
        self.device.videoZoomFactor = CGFloat(zoomFactor)
        self.device.unlockForConfiguration()
        Logger.logIfStaging("Video Zoom Factor", "applied zoom factor: \(self.device.videoZoomFactor)")
    }
    private func minimumSubjectDistanceForCode(fieldOfView: Float,
                                               minimumCodeSize: Float,
                                               previewFillPercentage: Float) -> Float {
        /*
         Given the camera horizontal field of view, we can compute the distance (mm) to make a code
         of minimumCodeSize (mm) fill the previewFillPercentage.
         */
        let radians = self.degreesToRadians(fieldOfView / 2)
        let filledCodeSize = minimumCodeSize / previewFillPercentage
        return filledCodeSize / tan(radians)
    }
    private func degreesToRadians(_ degrees: Float) -> Float {
        return degrees * Float.pi / 180
    }
}
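To make the arithmetic in applyAutomaticZoomFactorIfNeeded() concrete, here is a small standalone sketch of the same formula with made-up inputs. The numbers (a 60 degree field of view, a 20 mm code, a 1920x1080 format and a 200 mm minimum focus distance) are illustrative assumptions rather than measurements from a real device, and the free function names are hypothetical.

```swift
import Foundation

/// Distance (mm) at which a code of `minimumCodeSize` (mm) fills
/// `previewFillPercentage` of the frame, given a horizontal field of view in degrees.
/// Same formula as the class above, written as a free function.
func minimumSubjectDistance(fieldOfView: Float,
                            minimumCodeSize: Float,
                            previewFillPercentage: Float) -> Float {
    let radians = (fieldOfView / 2) * Float.pi / 180
    let filledCodeSize = minimumCodeSize / previewFillPercentage
    return filledCodeSize / tan(radians)
}

/// Zoom factor to apply; returns 1 when the subject distance is already
/// at or beyond the minimum focus distance (no zoom needed).
func zoomFactor(minimumFocusDistance: Float, subjectDistance: Float) -> Float {
    guard subjectDistance < minimumFocusDistance else { return 1 }
    return minimumFocusDistance / subjectDistance
}

// Assumed example values, not taken from a real device.
let fillPercentage = Float(1080) / Float(1920)   // format height / width
let subjectDistance = minimumSubjectDistance(fieldOfView: 60,
                                             minimumCodeSize: 20,
                                             previewFillPercentage: fillPercentage)
let zoom = zoomFactor(minimumFocusDistance: 200, subjectDistance: subjectDistance)
print("subject distance: \(subjectDistance) mm, zoom factor: \(zoom)")
```

With these assumed inputs the subject distance comes out at roughly 62 mm, well inside the 200 mm minimum focus distance, so a zoom factor of a little over 3 would be applied. In a real app the result would presumably also need to stay within the zoom range supported by the active format.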
Thankfully, with the help of Reddit, I was able to figure out that the solution is simply to replace
AVCaptureDeviceTypeBuiltInTripleCamera
with
AVCaptureDeviceTypeBuiltInWideAngleCamera

Vision CoreML Object Detection Full Screen Landscape

How can I get my VNCoreMLRequest to detect objects appearing anywhere within the fullscreen view?
I am currently using Apple's sample project for object recognition of breakfast foods: BreakfastFinder. The model and recognition work well, and generally give the correct bounding box (visual) of the objects being detected.
The issue arises here with changing the orientation of this detection.
In portrait mode, the default orientation for this project, the model identifies objects well in the full bounds of the view. Naturally, given the properties of the SDK objects, rotating the camera causes poor performance and visual identification.
In landscape mode, the model behaves strangely. The window / area in which the model detects objects is not the full view. Instead, it has (what seems like) the same aspect ratio as the phone itself, but centered and in portrait mode. I have a screenshot below showing approximately where the model stops detecting objects when in landscape:
The blue box with the red outline is approximately where the detection stops. It behaves strangely, but consistently does not find any objects outside this approximate area / near the left or right edge. However, objects near the top and bottom edges around the center are detected without any issue.
regionOfInterest
I have adjusted this to be the maximum: x: 0, y: 0, width: 1, height: 1. This made no difference
imageCropAndScaleOption
This is the only setting that allows detection in the full screen; however, the performance became noticeably worse, and that's not really an acceptable trade-off.
Is there a scale / size setting somewhere in this process that I have not set properly? Or perhaps a mode I am not using. Any help would be most appreciated. Below is my detection controller:
ViewController.swift
// All unchanged from the download in Apple's folder
" "
session.sessionPreset = .hd1920x1080 // Model image size is smaller.
...
previewLayer.connection?.videoOrientation = .landscapeRight
" "
VisionObjectRecognitionViewController
@discardableResult
func setupVision() -> NSError? {
    // Setup Vision parts
    let error: NSError! = nil
    guard let modelURL = Bundle.main.url(forResource: "ObjectDetector", withExtension: "mlmodelc") else {
        return NSError(domain: "VisionObjectRecognitionViewController", code: -1, userInfo: [NSLocalizedDescriptionKey: "Model file is missing"])
    }
    do {
        let visionModel = try VNCoreMLModel(for: MLModel(contentsOf: modelURL))
        let objectRecognition = VNCoreMLRequest(model: visionModel, completionHandler: { (request, error) in
            DispatchQueue.main.async(execute: {
                // perform all the UI updates on the main queue
                if let results = request.results {
                    self.drawVisionRequestResults(results)
                }
            })
        })
        // These are the only properties that impact the detection area
        objectRecognition.regionOfInterest = CGRect(x: 0, y: 0, width: 1, height: 1)
        objectRecognition.imageCropAndScaleOption = VNImageCropAndScaleOption.scaleFit
        self.requests = [objectRecognition]
    } catch let error as NSError {
        print("Model loading went wrong: \(error)")
    }
    return error
}
EDIT:
When running the project in portrait mode only (locked by selecting only Portrait in Targets -> General), then rotating the device to landscape, the detection occurs perfectly across the entire screen.
The issue seemed to reside in the rotation of the physical device.
When I tell Vision that the device is “not rotated” but pass the current orientation to all other elements, the detection bounds remain the full screen (as if in portrait) while the controller is in fact in landscape.
The bounding boxes are normalized rects that we get from the Core ML bounding-box observations; we have to convert them using the screen's dimensions to generate the boxes in the image for words.
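The conversion described above can be sketched as a small pure function. It assumes the usual Vision convention of a normalized rect with its origin in the lower-left corner, and a target view whose origin is in the top-left corner. The function name and the sample sizes are hypothetical, and the sketch does not account for any cropping or letterboxing between the camera image and the view.

```swift
import Foundation

/// Converts a normalized rect (0...1, lower-left origin) into a rect in
/// view coordinates (points, top-left origin) for a view of `viewSize`.
func viewRect(forNormalizedRect normalized: CGRect, in viewSize: CGSize) -> CGRect {
    let width = normalized.size.width * viewSize.width
    let height = normalized.size.height * viewSize.height
    let x = normalized.origin.x * viewSize.width
    // Flip the y axis: the top edge of the box in the lower-left system
    // is at origin.y + height, measured from the bottom.
    let y = (1 - (normalized.origin.y + normalized.size.height)) * viewSize.height
    return CGRect(x: x, y: y, width: width, height: height)
}

// Assumed example: a box covering the middle half of the width,
// sitting in the upper half of a 400x200 view.
let box = viewRect(forNormalizedRect: CGRect(x: 0.25, y: 0.5, width: 0.5, height: 0.25),
                   in: CGSize(width: 400, height: 200))
print(box)
```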

Swift - How to crop a QR code properly using an ARSession and Vision library?

This is a long question so I wanted to put a TL;DR on top:
I want to track QR codes via one of two methods: image tracking by cropping them upon detection, or placing anchors with raycasting. Both of these methods fail when the phone is in portrait mode. The camera source is an ARSession; SceneKit and RealityKit are not used. There's only ARKit. What to do?
I am currently working on an application with Swift in which I try to render some stuff on a server, transmit the video to an iPhone and display it on screen using an MTKView. I only needed a custom Metal shader to apply some complex calculations to received frames, so I did not use SceneKit or RealityKit. I only have an ARSession from ARKit and a Metal view here, and up to this point everything works fine.
I am able to do image tracking at this point. However, I want to apply this behaviour to QR codes. What I want is to detect a QR code (multiple if possible) and then track it just like images. Since I don't have the QR code as ARReferenceImages beforehand like normal image tracking, I was left with two options:
Option 1: Using raycast(_:) on ARSession
This is probably the right way to do it. However, for this I need to activate both plane tracking options on ARSession, which then creates many anchors and managing them with image tracking becomes harder. This is not the actual problem though. Actual problem is that when the phone is in landscape mode, raycasting works as intended. When phone goes into portrait mode, even if I pass the frame in correct orientation it misses everything and hit test results return empty. I am not using hitTest(_:) because it is deprecated.
I want to explain the "correct orientation" thing here before going into second option. ARSession is capturing frames and I am able to check each frame through didUpdate delegate function of the session. When I read the pixel buffer out of the frame using frame.capturedImage and turn it into a CIImage, the image is always in landscape mode (width > height). Doesn't matter if the phone is in portrait mode or not. So whenever I want to pass this image, I am using oriented(.right) for portrait and oriented(.up) for landscape. I got that idea from another question asked about QR bounding box, and so far it is the best option (but not good enough). Just want to note that when I tried raycasting, I tried it with the image size, not screen size (screen size = my Metal view size because it is fullscreen) since the image is larger than the screen in reality. I am able to see this if I put a breakpoint and quicklook my CIImage created from current camera frame.
Option 2: Cropping the QR and treating it as image tracking
This is another approach which I am currently working on. Algorithm is simple: check every frame with Vision. If there are detected QR codes, read their data first. If that data matches with an existing QR, then re-read it if the cropped QR size is larger than existing one. If not, do nothing. Then use this cropped QR image for tracking QR as an image. At this point we would have the data already so no problems here.
However, I tried many times to do the proper transformation explained here in the answer. Again, I think I am able to transform the normalized bounding box into a real rect which can correctly crop the image. Yet, as with raycasting, it works perfectly only if the phone is in landscape. In portrait it works well enough ONLY IF the phone is really close to the QR code and the code is centered on the screen.
For related code, I have this in my View controller:
private var ciContext: CIContext = CIContext.init(options: nil)
private var sequenceHandler: VNImageRequestHandler?
And then I have this code to extract QR codes from CIImage:
func extractQrCode(image: CIImage) -> [VNBarcodeObservation]? {
    self.sequenceHandler = VNImageRequestHandler(ciImage: image)
    let barcodeRequest = VNDetectBarcodesRequest()
    barcodeRequest.symbologies = [.QR]
    try? self.sequenceHandler?.perform([barcodeRequest])
    guard let results = barcodeRequest.results else {
        return nil
    }
    return results
}
And this is the delegate that checks and operates on every frame (code currently for Option 2):
func session(_ session: ARSession, didUpdate frame: ARFrame) {
    let rotImg = self.renderer?.getInterfaceOrientation() == .portrait ? CIImage(cvPixelBuffer: frame.capturedImage).oriented(.right) : CIImage(cvPixelBuffer: frame.capturedImage)
    if let barcodes = self.extractQrCode(image: rotImg) {
        for barcode in barcodes {
            guard let payload = barcode.payloadStringValue else { continue }
            var rect = CGRect()
            rect = VNImageRectForNormalizedRect(barcode.boundingBox.botToTop(), Int(rotImg.extent.width), Int(rotImg.extent.height))
            let existingQR = TrackedImagesManager.imagesToTrack.filter{ $0.isQR && $0.QRData == payload}.first
            if ((rect.size.width < 800 || rect.size.height < 800 || abs(rect.size.height - rect.size.width) > 32) && existingQR == nil) {
                DispatchQueue.main.async {
                    self.showToastMessage(message: "Please get closer to the QR code and try centering it on your screen.", font: UIFont.systemFont(ofSize: 18), duration: 3)
                }
                continue
            } else if (existingQR != nil) {
                if (rect.width > existingQR?.originalImage?.size.width ?? 999) {
                    let croppedImg = rotImg.cropped(to: rect)
                    let croppedCgImage = self.ciContext.createCGImage(croppedImg, from: croppedImg.extent)!
                    let trackImg = UIImage(cgImage: croppedCgImage)
                    existingQR?.originalImage = trackImg
                    existingQR?.image = ARReferenceImage(croppedCgImage, orientation: .up, physicalWidth: 0.1)
                } else {
                    continue
                }
            } else if rect.width != 0 {
                let croppedImg = rotImg.cropped(to: rect)
                let croppedCgImage = self.ciContext.createCGImage(croppedImg, from: croppedImg.extent)!
                let trackImg = UIImage(cgImage: croppedCgImage)
                TrackedImagesManager.imagesToTrack.append(TrackedImage(id: 9, type: 1, image: ARReferenceImage(croppedCgImage, orientation: .up, physicalWidth: 0.1), originalImage: trackImg, isQR: true, QRData: payload))
                print("qr norm rect: \(barcode.boundingBox) \n qr rect: \(rect) \nqr data: \(payload) \nqr hittestres: ")
            }
        }
    }
}
Finally, for the transformation, I have this extension (tried various ways, this is the best so far):
extension CGRect {
    func botToTop() -> CGRect {
        let transform = CGAffineTransform(scaleX: 1, y: -1).translatedBy(x: 0, y: -1)
        return self.applying(transform)
    }
}
So for both options I need some advice to make things right. Android side of the same thing is implemented as in Option 2, but Android returns a nicely cropped QR code upon detection. We don't have that. What do I do now?

AVCaptureDevice's exposurePointOfInterest does not work

I'm trying to change the exposure in my camera app according to certain point of the image.
I'm using the following code that is triggered when the user taps on screen. For now I simply try to expose to the center.
@IBAction func didTap()
{
    if captureDevice.isExposurePointOfInterestSupported
    {
        try! captureDevice.lockForConfiguration()
        captureDevice.exposurePointOfInterest = CGPoint(x: 0.5, y: 0.5)
        captureDevice.exposureMode = .continuousAutoExposure
        captureDevice.unlockForConfiguration()
    }
}
But nothing happens.
captureDevice.isExposurePointOfInterestSupported is true. The captureDevice currently is .builtInDualCamera.
This code is in a simple camera test app based on sample code. It shows the live camera image on screen.
Has anyone got exposurePointOfInterest working on iOS 14.4?
What could I be missing?
I actually ran into this issue yesterday. Turns out there's a problem with using exactly (0.5, 0.5). When I use (0.51, 0.51) it works every time 🤷
extension AVCaptureDevice {
    func change(_ block: (AVCaptureDevice) -> ()) {
        try! self.lockForConfiguration()
        block(self)
        self.unlockForConfiguration()
    }
}
@objc func handleTap() {
    device.change {
        $0.exposurePointOfInterest = CGPoint(x: 0.51, y: 0.51)
        $0.exposureMode = .autoExpose
    }
}
Update
It may also be worth noting that, although it's a point specified exposure, the region around that point still has to be large enough to trigger an exposure adjust. Let's call this the trigger region.
From what I understand from my tests, the point (0.5, 0.5) has a special effect on the trigger region's size. Whenever this point is used as the exposurePointOfInterest, the trigger region is rather large, regardless of whether exposureMode is .continuousAutoExposure or .autoExpose.
You can get an idea of the size of this region by using the following code, pointing your phone at a bright area (like a lamp), and seeing how close you have to get until a tap adjusts the exposure. You'll find that the exposure does adjust, but you have to get rather close.
@objc func handleTap() {
    device.change {
        $0.exposurePointOfInterest = CGPoint(x: 0.5, y: 0.5)
        $0.exposureMode = .autoExpose
    }
}
Or, you could not use a tap, and just keep the properties exposureMode and exposurePointOfInterest at their default values of .continuousAutoExposure and (0.5, 0.5). Or you could use the native camera app and see when it automatically adjusts the exposure. The results are the same.
Now, if you were to set the exposurePointOfInterest to a value close to but not equal to the midpoint, say (0.51, 0.51), you'll find that the trigger region becomes much, much smaller.
You could also use .continuousAutoExposure and call this only once, and you'll find that the automatic exposure adjustments are a lot more sensitive as the trigger region is a lot smaller:
override func viewDidLoad() {
    super.viewDidLoad()
    device.change {
        $0.exposurePointOfInterest = CGPoint(x: 0.51, y: 0.51)
        $0.exposureMode = .continuousAutoExposure
    }
}
To get an idea of the size of this smaller region, open the native camera app and tap somewhere to focus/expose at that point. You'll see a small bounding box. That's pretty much the size of the trigger region.
Say you have a tap like so:
@objc func handleTap() {
    device.change {
        $0.exposurePointOfInterest = CGPoint(x: 0.51, y: 0.51)
        $0.exposureMode = .autoExpose
    }
}
If nothing happens, the region is not large enough, and you should be able to reproduce the same no-effect in the native camera app when you try to tap to expose at that point.
Side Note
Your didTap() method is setting the default values, so it's essentially useless.
If you want to adjust exposure on a tap, use .autoExpose if the point is always the same. Don't use .continuousAutoExposure cuz that's gonna be adjusting exposure all the time, not just on a tap. It only makes sense to do this if the tap will change the point.
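The workaround from this answer can be wrapped in a tiny helper so that the exact midpoint is never passed through. This is only a sketch of the observation above, not documented AVFoundation behavior; the function name and the default 0.01 offset are assumptions taken from the (0.51, 0.51) value used in the answer.

```swift
import Foundation

/// Returns `point` unchanged unless it is exactly the midpoint (0.5, 0.5),
/// in which case it is nudged by `offset` on both axes.
/// Hypothetical helper illustrating the workaround described above.
func exposurePointAvoidingMidpoint(_ point: CGPoint, offset: CGFloat = 0.01) -> CGPoint {
    guard point.x == 0.5, point.y == 0.5 else { return point }
    return CGPoint(x: 0.5 + offset, y: 0.5 + offset)
}

let nudged = exposurePointAvoidingMidpoint(CGPoint(x: 0.5, y: 0.5))
let untouched = exposurePointAvoidingMidpoint(CGPoint(x: 0.2, y: 0.8))
print(nudged, untouched)
```

The returned point would then be assigned to exposurePointOfInterest inside the lockForConfiguration() / unlockForConfiguration() pair, as in the tap handlers above.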

iPhone back camera cannot focus correctly

I've been making an iOS camera app and have been trying to solve this problem for two days (without success).
What I'm working on now is changing the focus and exposure automatically depending on where the user tapped. Sometimes it works fine (maybe about 20% of the time), but mostly it fails, especially when I try to focus on a far object (5+ metres away) or when there are two objects and I try to switch the focus from one to the other. The image below is an example.
The yellow square shows where the user tapped. Even though I tapped the black cup in the first picture, the camera still focuses on the red cup.
override func touchesBegan(_ touches: Set<UITouch>, with event: UIEvent?) {
    guard let touchPoint = touches.first else { return }
    let focusPoint = touchPoint.location(in: lfView)
    print("focusPoint \(focusPoint)")
    showPointOfInterestViewAtPoint(point: focusPoint)
    setFocus(focusMode: .autoFocus, exposureMode: .autoExpose, atPoint: focusPoint, shouldMonitorSubjectAreaChange: true)
}
func setFocus(focusMode: AVCaptureDevice.FocusMode, exposureMode: AVCaptureDevice.ExposureMode, atPoint devicePoint: CGPoint, shouldMonitorSubjectAreaChange: Bool) {
    guard let captureDevice = captureDevice else { return }
    do {
        try captureDevice.lockForConfiguration()
    } catch {
        return
    }
    if captureDevice.isFocusPointOfInterestSupported, captureDevice.isFocusModeSupported(focusMode) {
        captureDevice.focusPointOfInterest = devicePoint
        captureDevice.focusMode = focusMode
        print("devicePoint: \(devicePoint)")
    }
    // other codes in here...
    captureDevice.isSubjectAreaChangeMonitoringEnabled = shouldMonitorSubjectAreaChange
    captureDevice.unlockForConfiguration()
}
I call setFocus from touchesBegan, and the print statements for both focusPoint and devicePoint show the same coordinates, like (297.5, 88.0).
When I tapped the black cup in the picture, I can see the iPhone camera is zooming in and out a little bit, like same as when I use the default iPhone camera app and try to focus on an object. So I guess my camera app is trying to focus on the black cup but it fails.
Since this is not an error, I'm not sure which code to change. Is there any clue what is going on here and what causes this problem?
ADDED THIS PART LATER
I also read this document and it says
This property’s CGPoint value uses a coordinate system where {0,0} is the top-left of the picture area and {1,1} is the bottom-right.
As I wrote before, the value of devicePoint gives me more than 1, like 297.5, 88.0. Does this cause the problem?
Thanks to @Artem I was able to solve the problem. All I needed to do was convert the absolute coordinate to the value used in focusPointOfInterest (min (0,0) to max (1,1)).
Thank you, Artem!!
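For reference, the normalization that fixed it can be sketched as a plain division by the view's size. This is a simplified illustration with hypothetical names and an assumed 375x667 view: it assumes the preview fills the view exactly and ignores the rotation between the view and the sensor's coordinate space, mirroring, and video gravity. When an AVCaptureVideoPreviewLayer is available, its captureDevicePointConverted(fromLayerPoint:) method takes those into account.

```swift
import Foundation

/// Maps a point in view coordinates to the 0...1 range expected by
/// focusPointOfInterest / exposurePointOfInterest.
/// Simplified: no rotation, mirroring or aspect-fill handling.
func normalizedPoint(for viewPoint: CGPoint, in viewSize: CGSize) -> CGPoint? {
    guard viewSize.width > 0, viewSize.height > 0 else { return nil }
    // Clamp so taps on the very edge never leave the valid range.
    let x = min(max(viewPoint.x / viewSize.width, 0), 1)
    let y = min(max(viewPoint.y / viewSize.height, 0), 1)
    return CGPoint(x: x, y: y)
}

// The tap from the question, in an assumed 375x667 view.
let normalized = normalizedPoint(for: CGPoint(x: 297.5, y: 88.0),
                                 in: CGSize(width: 375, height: 667))
if let normalized = normalized {
    print(normalized)
}
```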

Resources