iOS 11.3 ARKit Autofocus and higher resolution

As mentioned here, Apple is allowing us to use higher resolution and autofocus in ARKit-based apps. Has anybody already tried implementing those features in their own apps?
Where does Apple usually share more technical details about such updates?
Regards!

You haven't been able to set the ARCamera image resolution, which is why you may not find anything relating to adjusting the camera image resolution (see: Changing ARCamera Resolution).
If you want to check the ARCamera image resolution, you can access it via the currentFrame:
let currentFrame = sceneView.session.currentFrame
print(currentFrame?.camera.imageResolution)
To date it has been set to 1280.0 x 720.0.
If you want more information about the focal length, which I believe autofocus may now be able to adjust automatically, you can check the camera property of the currentFrame:
print(currentFrame?.camera)

OK, autofocus can be enabled with:
var isAutoFocusEnabled: Bool { get set }
let configuration = ARWorldTrackingConfiguration()
configuration.isAutoFocusEnabled = true // or false
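To actually apply it, run the session with that configuration; a minimal sketch, assuming sceneView is your ARSCNView:
// Apply the configuration (and restart tracking) on the view's session
sceneView.session.run(configuration, options: [.resetTracking])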
Any clues about the higher resolution?

I am able to select the desired resolution for the AR camera using the following code:
ARWorldTrackingConfiguration* configuration = [ARWorldTrackingConfiguration new];
NSArray<ARVideoFormat*>* supportedVideoFormats = [ARWorldTrackingConfiguration supportedVideoFormats];
int bestFormatIndex = 0;
for (int i = 0; i < [supportedVideoFormats count]; i++) {
    float width = supportedVideoFormats[i].imageResolution.width;
    float height = supportedVideoFormats[i].imageResolution.height;
    NSLog(@"AR Video Format %f x %f", width, height);
    if (width * 9 == height * 16) { // <-- Use your own condition for selecting video format
        bestFormatIndex = i;
        break;
    }
}
[configuration setVideoFormat:supportedVideoFormats[bestFormatIndex]];

// Run the view's session
[_mSceneView.session runWithConfiguration:configuration];
For my requirement I wanted the biggest 16:9 ratio. On an iPhone 8 Plus, the following image resolutions are listed:
1920 x 1440
1920 x 1080
1280 x 720
Notice that they are sorted in the supportedVideoFormats array with the biggest resolution at index 0, and that index 0 is the default video format.
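For reference, roughly the same selection in Swift (a sketch; sceneView is assumed to be your ARSCNView):
let configuration = ARWorldTrackingConfiguration()
// supportedVideoFormats is sorted with the biggest resolution first
let formats = ARWorldTrackingConfiguration.supportedVideoFormats
// Pick the first (largest) 16:9 format, if any
if let bestFormat = formats.first(where: { $0.imageResolution.width * 9 == $0.imageResolution.height * 16 }) {
    configuration.videoFormat = bestFormat
}
sceneView.session.run(configuration)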

Related

Choosing suitable camera for barcode scanning when using AVCaptureDeviceTypeBuiltInTripleCamera

I've had some barcode scanning code in my iOS app for many years now. Recently, users have begun complaining that it doesn't work with an iPhone 13 Pro.
During investigation, it seemed that I should be using the built-in triple camera if available. Doing that did fix it for the iPhone 13 Pro but subsequently broke it for the iPhone 12 Pro, which seemed to be working fine with the previous code.
How are you supposed to choose a suitable camera for all devices? It seems bizarre to me that Apple has suddenly made it so difficult to use this previously working code.
Here is my current code. The "fallback" section is what the code has used for years.
_session = [[AVCaptureSession alloc] init];

// Must use macro camera for barcode scanning on newer devices, otherwise the image is blurry
if (@available(iOS 13.0, *)) {
    AVCaptureDeviceDiscoverySession *discoverySession =
        [AVCaptureDeviceDiscoverySession discoverySessionWithDeviceTypes:@[AVCaptureDeviceTypeBuiltInTripleCamera]
                                                                mediaType:AVMediaTypeVideo
                                                                 position:AVCaptureDevicePositionBack];
    if (discoverySession.devices.count == 0) {
        // no BuiltInTripleCamera
        _device = [AVCaptureDevice defaultDeviceWithMediaType:AVMediaTypeVideo];
    } else {
        _device = discoverySession.devices.firstObject;
    }
} else {
    // Fallback on earlier versions
    _device = [AVCaptureDevice defaultDeviceWithMediaType:AVMediaTypeVideo];
}
The accepted answer works, but not all the time. Because lenses have different minimum focus distances, it is harder for the device to focus on small barcodes: you have to bring the device too close (closer than the minimum focus distance), so it will never autofocus on small barcodes. This used to work on older lenses whose minimum focus distance was 10-12 cm, but newer lenses, especially those on the iPhone 14 Pro with a 20 cm minimum focus distance, are problematic.
The solution is ideally to use AVCaptureDeviceTypeBuiltInWideAngleCamera and set videoZoomFactor on the AVCaptureDevice to zoom in a little so the barcode is nicely in focus. The value should be calculated from the input video properties and the minimum barcode size.
For details please refer to this WWDC 2021 video where they address exactly this issue: https://developer.apple.com/videos/play/wwdc2021/10047/?time=133.
Here is an implementation of a class that sets the zoom factor on a device; it works for me. Instantiate it with your device instance and call applyAutomaticZoomFactorIfNeeded() just before you commit your capture session configuration.
///
/// Calling this method will automatically zoom the device to compensate for its minimum focus distance. This distance is problematic
/// when scanning barcodes that are too small or when a device's minimum focus distance is too large (like on the iPhone 14 Pro and Pro Max - 20 cm, iPhone 13 Pro - 15 cm, older iPhones - 12 cm or less). By zooming
/// the input, the device will be able to focus on the preview and complete the scan more easily.
///
/// - See https://developer.apple.com/videos/play/wwdc2021/10047/?time=133 for more detailed explanation and
/// - See https://developer.apple.com/documentation/avfoundation/capture_setup/avcambarcode_detecting_barcodes_and_faces
/// for implementation instructions.
///
@available(iOS 15.0, *)
final class DeviceAutomaticVideoZoomFactor {

    enum Errors: Error {
        case minimumFocusDistanceUnknown
        case deviceLockFailed
    }

    private let device: AVCaptureDevice
    private let minimumCodeSize: Float

    init(device: AVCaptureDevice, minimumCodeSize: Float) {
        self.device = device
        self.minimumCodeSize = minimumCodeSize
    }

    ///
    /// Optimize the user experience for scanning QR codes down to smaller sizes (determined by `minimumCodeSize`, for example 2x2 cm).
    /// When scanning a QR code of that size, the user may need to get closer than the camera's minimum focus distance to fill the rect of interest.
    /// To have the QR code both fill the rect and still be in focus, we may need to apply some zoom.
    ///
    func applyAutomaticZoomFactorIfNeeded() throws {
        let deviceMinimumFocusDistance = Float(self.device.minimumFocusDistance)
        guard deviceMinimumFocusDistance != -1 else {
            throw Errors.minimumFocusDistanceUnknown
        }

        Logger.logIfStaging("Video Zoom Factor", "using device: \(self.device)")
        Logger.logIfStaging("Video Zoom Factor", "device minimum focus distance: \(deviceMinimumFocusDistance)")

        /*
         Set an initial square rect of interest that is 100% of the view's shortest side.
         This means that the region of interest will appear in the same spot regardless
         of whether the app starts in portrait or landscape.
         */
        let formatDimensions = CMVideoFormatDescriptionGetDimensions(self.device.activeFormat.formatDescription)
        let rectOfInterestWidth = Double(formatDimensions.height) / Double(formatDimensions.width)
        let deviceFieldOfView = self.device.activeFormat.videoFieldOfView
        let minimumSubjectDistanceForCode = self.minimumSubjectDistanceForCode(fieldOfView: deviceFieldOfView,
                                                                               minimumCodeSize: self.minimumCodeSize,
                                                                               previewFillPercentage: Float(rectOfInterestWidth))
        Logger.logIfStaging("Video Zoom Factor", "minimum subject distance: \(minimumSubjectDistanceForCode)")

        guard minimumSubjectDistanceForCode < deviceMinimumFocusDistance else {
            return
        }

        let zoomFactor = deviceMinimumFocusDistance / minimumSubjectDistanceForCode
        Logger.logIfStaging("Video Zoom Factor", "computed zoom factor: \(zoomFactor)")

        try self.device.lockForConfiguration()
        self.device.videoZoomFactor = CGFloat(zoomFactor)
        self.device.unlockForConfiguration()

        Logger.logIfStaging("Video Zoom Factor", "applied zoom factor: \(self.device.videoZoomFactor)")
    }

    private func minimumSubjectDistanceForCode(fieldOfView: Float,
                                               minimumCodeSize: Float,
                                               previewFillPercentage: Float) -> Float {
        /*
         Given the camera horizontal field of view, we can compute the distance (mm) to make a code
         of minimumCodeSize (mm) fill the previewFillPercentage.
         */
        let radians = self.degreesToRadians(fieldOfView / 2)
        let filledCodeSize = minimumCodeSize / previewFillPercentage
        return filledCodeSize / tan(radians)
    }

    private func degreesToRadians(_ degrees: Float) -> Float {
        return degrees * Float.pi / 180
    }
}
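For example (a sketch; session, device and the 20 mm minimum code size are placeholder assumptions for illustration):
session.beginConfiguration()
// ... add inputs and outputs here ...
if #available(iOS 15.0, *) {
    // Zoom just enough so a 20 mm code can fill the frame while staying in focus
    let zoom = DeviceAutomaticVideoZoomFactor(device: device, minimumCodeSize: 20)
    try? zoom.applyAutomaticZoomFactorIfNeeded()
}
session.commitConfiguration()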
Thankfully, with the help of Reddit, I was able to figure out that the solution is simply to replace
AVCaptureDeviceTypeBuiltInTripleCamera
with
AVCaptureDeviceTypeBuiltInWideAngleCamera
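In Swift, the equivalent device selection might look like this (a sketch, not the asker's exact code):
let discovery = AVCaptureDevice.DiscoverySession(deviceTypes: [.builtInWideAngleCamera],
                                                 mediaType: .video,
                                                 position: .back)
// Fall back to the default video device if discovery finds nothing
let device = discovery.devices.first ?? AVCaptureDevice.default(for: .video)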

iPhone X/Xs Max AVCaptureVideoPreviewLayer scale factor and coordinates in resizeAspectFill mode

I am working on an application with a standard AVCaptureDevice flow for getting and displaying frames from the iPhone camera; it then processes them through an OpenCV algorithm and puts markers on the display.
The flow is:
1. Setting up an AVCaptureVideoPreviewLayer.
2. Getting frames from AVCaptureVideoDataOutputSampleBufferDelegate's captureOutput function.
3. Converting the frame to an OpenCV Mat and processing it.
4. Getting results from the algorithm as rectangles.
5. Scaling them back to the iPhone's screen and showing them in the UI.
My current problem is that everything works accurately on rectangular-screened devices (iPhone 7, 8, 7 Plus, 8 Plus), but I have a lot of problems with notched devices like the iPhone X, iPhone Xs Max and later.
The fact is that, due to the use of previewLayer?.videoGravity = .resizeAspectFill, on iPhone X family devices the image on the screen (i.e. in the AVCaptureVideoPreviewLayer) gets scaled and cropped compared with the original frame from the camera, but I can't calculate the exact difference needed to perform the correct back-scaling.
If I render the results in OpenCV straight on the device and save them to memory, the output image is correct. If I do all the scaling and rendering on rectangular-screened devices, the result is also correct. The only problem is the notched devices, as filling their screens with camera frames makes them look different.
I tried methods such as metadataOutputRectConverted, but couldn't understand how to use the results I get.
let metaRect = self.camera.previewLayer?.metadataOutputRectConverted(fromLayerRect: self.camera.previewLayer?.bounds ?? CGRect.zero) ?? CGRect.zero
// on iPhone 8 I get: (-3.442597823690085e-17, 0.0, 1.0, 1.0)
// so the width and height coefficients are 1 and there is almost no offset on x or y,
// which gives me the right result on the screen
// on iPhone X I get (-3.136083667459394e-17, 0.08949096880131369, 1.0, 0.8210180623973728)
// I can see there is an offset on the Y axis and in the height, but I don't know how to use it correctly
The code that I use to initialise the layer:
session.sessionPreset = AVCaptureSession.Preset.hd1920x1080
previewLayer = AVCaptureVideoPreviewLayer(session: session)
previewLayer?.videoGravity = .resizeAspectFill
DispatchQueue.main.async {
    layer.connection?.videoOrientation = orientation
    layer.frame = UIScreen.main.bounds
    view.layer.insertSublayer(layer, at: 0)
}
The code that I use to put objects on screen; resultRect is what I get from my C++ OpenCV module:
let aspectRatioWidth = CGFloat(1080)/UIScreen.main.bounds.size.width
let aspectRatioHeight = CGFloat(1920)/UIScreen.main.bounds.size.height
let width = CGFloat(resultRect.width) / aspectRatioWidth
let height = CGFloat(resultRect.height) / aspectRatioHeight
let rectx = CGFloat(resultRect.x) / aspectRatioWidth - width / 2.0
let recty = CGFloat(resultRect.y) / aspectRatioHeight - height / 2.0
I would appreciate any help, thank you very much in advance.
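For reference, one common way to account for the aspect-fill scaling and cropping is to express the detected rect in normalized coordinates of the video frame and let the preview layer do the conversion with layerRectConverted(fromMetadataOutputRect:). A rough sketch, assuming resultRect is in the pixel coordinates of the delivered buffer and bufferWidth/bufferHeight are the buffer's dimensions as CGFloat (whether x maps to the buffer's width or height depends on your connection's orientation, so treat this only as a starting point):
let normalizedRect = CGRect(x: CGFloat(resultRect.x) / bufferWidth,
                            y: CGFloat(resultRect.y) / bufferHeight,
                            width: CGFloat(resultRect.width) / bufferWidth,
                            height: CGFloat(resultRect.height) / bufferHeight)
// The preview layer accounts for videoGravity (.resizeAspectFill cropping) and orientation
let layerRect = self.camera.previewLayer?.layerRectConverted(fromMetadataOutputRect: normalizedRect) ?? .zero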

Spritekit scaling for universal app

I am having trouble properly setting up my app so that it displays correctly on all devices. I want my game to look best on iPhone, and I understand that setting my scene size using GameScene(size: CGSize(width: 1334, height: 750)) and using .aspectFill means that on iPads there will be less space to display things, which I'm fine with. The problem is, how do I position my nodes so that they are relative to each device's frame height and width? I use self.frame.height, self.frame.width, self.frame.midX, etc. for positioning my nodes, and when I run my game it positions things properly on my iPhone 6, but on my iPad everything seems blown up and nodes are off the screen. I'm going crazy trying to figure this out.
I solved this in my game by using scale factors: numbers which tell my app how much to enlarge each length, width, height, etc. Then I just make my game look good on one phone, and use that phone's width and height to calculate the factor by which I need to enlarge it for other devices. In this example I use the iPhone 4 as a base, but you can use any device; just change the numbers according to that device.
Portrait mode:
var widthFactor = UIScreen.main.bounds.width/320.0 //I divide it by the default iPhone 4 width
var heightFactor = UIScreen.main.bounds.height/480.0
Landscape mode:
var widthFactor = UIScreen.main.bounds.width/480.0 //I divide it by the default iPhone 4 landscape width
var heightFactor = UIScreen.main.bounds.height/320.0
Then when you make a node, a coin image for example, multiply its coordinates or width/height by the scaleFactors:
let coin = SKSpriteNode(imageNamed: "coin")
coin.position = CGPoint(x: 25 * widthFactor, y: self.size.height - 70 * heightFactor)
I think what you're looking for might be in this answer: https://stackoverflow.com/a/34878528/6728196
Specifically I think this part is what you're looking for (edited to fit your example):
if UIDevice.current.userInterfaceIdiom == .pad {
// Set things only for iPad
// Example: Adjust y positions using += or -=
buttonNode.position.y += 100
labelNode.position.y -= 100
}
Basically, this just adds or subtracts a certain amount from the iPhone position if the user is using an iPad. It's not too complicated, and you can increase or decrease both the x and y values of the position by a fixed amount or by a percentage of the screen (self.size.width * decimalPercentage).
Another benefit of using this way is that you're just modifying the iPhone positions, so it starts by using the default values that you set. Then if on iPad, it will make changes.
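For instance, a small sketch of the percentage-based adjustment (the node name is just an example):
if UIDevice.current.userInterfaceIdiom == .pad {
    // Nudge the node by 10% of the scene width on iPad only
    buttonNode.position.y += self.size.width * 0.1
}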
If this is hard to understand, let me know so I can clear up the explanation.

Scaling Image Causes Crash In AS3 Flex AIR Mobile App

Problem:
Zooming in on an image by scaling and moving it using a matrix causes the app to run out of memory and crash.
Additional Libraries used:
Gestouch - https://github.com/fljot/Gestouch
Description:
In my Flex Mobile app I have an Image inside a Group with pan/zoom enabled using the Gestouch library. The zoom works to an extent but causes the app to die (not freeze, just exit) with no error message after a certain zoom level.
This would be manageable except that I can't figure out how to implement a threshold to stop the zoom at, as it crashes at a different zoom level almost every time. I also use dynamic images, so the source of the image could be any size or resolution.
They are usually JPEGs ranging from about 800x600 to 9000x6000 and are downloaded from a server, so they cannot be packaged with the app.
According to the AS3 docs there is no longer a limit on the size of a BitmapData object, so that shouldn't be the issue:
“Starting with AIR 3 and Flash player 11, the size limits for a BitmapData object have been removed. The maximum size of a bitmap is now dependent on the operating system.”
The group is used as a marker layer for overlaying pins on.
The crash mainly happens on iPad Mini and older Android devices.
Things I have already tried:
1. Using Adobe Scout to pinpoint when the memory leak occurs.
2. Debugging to find the exact height and width of the marker layer and image at the time of the crash.
3. Setting a max zoom variable based on the size of the image.
4. Cropping the image on zoom to only show the visible area (crashes in the copyPixels() and BitmapData.draw() functions).
5. Using ImageMagick to make lower-quality images (small images still crash).
6. Using ImageMagick to make a very low-res image and a grid of smaller images, displayed in the mobile app using a List with a Tile layout.
7. Using weak references when adding event listeners.
Any suggestions would be appreciated.
Thanks
private function layoutImageResized(e:Event):void
{
    markerLayer.scaleX = markerLayer.scaleY = 1;
    markerLayer.x = markerLayer.y = 0;

    var scale:Number = Math.min(width / image.sourceWidth, height / image.sourceHeight);
    image.scaleX = image.scaleY = scale;
    _imageIsWide = (image.sourceWidth / image.sourceHeight) > (width / height);

    // centre image
    if (_imageIsWide)
    {
        markerLayer.y = (height - image.sourceHeight * image.scaleY) / 2;
    }
    else
    {
        markerLayer.x = (width - image.sourceWidth * image.scaleX) / 2;
    }

    // set max scale
    _maxScale = scale * _maxZoom;
}

private function onGesture(event:org.gestouch.events.GestureEvent):void
{
    trace("Gesture start");

    // if the user starts moving around while the add Pin option is up
    // the state will be changed and the menu will disappear
    if (currentState == "addPin")
    {
        return;
    }

    const gesture:TransformGesture = event.target as TransformGesture;
    //trace("gesture state is ", gesture.state);

    if (gesture.state == GestureState.BEGAN)
    {
        currentState = "zooming";
        imgOldX = image.x;
        imgOldY = image.y;
        oldImgWidth = markerLayer.width;
        oldImgHeight = markerLayer.height;

        if (!_hidePins)
        {
            showHidePins(false);
        }
    }

    var matrix:Matrix = markerLayer.transform.matrix;

    // Pan
    matrix.translate(gesture.offsetX, gesture.offsetY);
    markerLayer.transform.matrix = matrix;

    if ((gesture.scale != 1 || gesture.rotation != 0) && ((markerLayer.scaleX < _maxScale && markerLayer.scaleY < _maxScale) || gesture.scale < 1) && gesture.scale < 1.4)
    {
        storedScale = gesture.scale;

        // Zoom
        var transformPoint:Point = matrix.transformPoint(markerLayer.globalToLocal(gesture.location));
        matrix.translate(-transformPoint.x, -transformPoint.y);
        matrix.scale(gesture.scale, gesture.scale);

        /** THIS IS WHERE THE CRASH HAPPENS **/
        matrix.translate(transformPoint.x, transformPoint.y);
        markerLayer.transform.matrix = matrix;
    }
}
I would say it's not a good idea to work with such a large image (9000x6000) on mobile devices.
I suppose you are trying to implement some sort of map navigation, so you need to zoom some areas hugely.
My solution would be to split that 9000x6000 image into 2048x2048 pieces, then compress them using the png2atf utility with mipmaps enabled.
Then you can use Starling to easily load these ATF images, add them to Stage3D and manage them easily.
With a 9000x6000 image you'll get about 15 2048x2048 pieces. Having them all on the stage at one time might sound heavy, but mipmaps mean that only tiny thumbnails of the images are kept in memory until they are zoomed, so you'll never run out of memory, as long as you remove invisible pieces from the stage from time to time while zooming in and return them on zoom out.

Mouse handler in OpenCV for large images, wrong x,y coordinates?

I am using images that are 2048 x 500, and when I use cvShowImage I only see half the image. This is not a big deal because the interesting part is in the top half of the image. Now, when I use the mouse handler to get the x,y coordinates of my clicks, I noticed that the coordinate for y (the dimension that doesn't fit on the screen) is wrong.
It seems OpenCV thinks this is the whole image and recalibrates the coordinate system, although we are only effectively showing half the image.
I would need to know how to do two things:
- display a resized image that would fit on the screen;
- get the proper coordinates.
Did anybody encounter similar problems?
Thanks!
Update: it seems the y coordinate is half of what it is supposed to be.
Code:
EXPORT void click_rect(uchar * the_img, int size_x, int size_y, int * points)
{
    CvSize size;
    size.height = size_y;
    size.width = size_x;

    IplImage * img;
    img = cvCreateImageHeader(size, IPL_DEPTH_8U, 1);
    img->imageData = (char *)the_img;
    img->imageDataOrigin = img->imageData;

    img1 = cvCreateImage(cvSize((int)((size.width)), (int)((size.height))), IPL_DEPTH_8U, 1);

    cvNamedWindow("mainWin", CV_WINDOW_AUTOSIZE);
    cvMoveWindow("mainWin", 100, 100);
    cvSetMouseCallback("mainWin", mouseHandler_rect, NULL);
    cvShowImage("mainWin", img1);

    // wait for a key
    cvWaitKey(0);

    points[0] = x_1;
    points[1] = x_2;
    points[2] = y_1;
    points[3] = y_2;

    // release the image
    cvDestroyWindow("mainWin");
    cvReleaseImage(&img1);
    cvReleaseImage(&img);
}
You should create a window with the CV_WINDOW_KEEPRATIO flag instead of the CV_WINDOW_AUTOSIZE flag. This temporarily fixes the problem with your y values being wrong.
I use OpenCV 2.1 and the Visual Studio C++ compiler. I fixed this problem with another flag, CV_WINDOW_NORMAL, which works properly and returns correct coordinates; this flag enables you to resize the image window.
cvNamedWindow("Box Example", CV_WINDOW_NORMAL);
I am having the same problem with OpenCV 2.1, using it on Windows with the MinGW compiler. It took me forever to find out what was wrong. As you describe it, cvSetMouseCallback gets y coordinates that are too large. This is apparently due to the image and the cvNamedWindow it is shown in being bigger than my screen resolution; thus I cannot see the bottom of the image.
As a solution I resize the images to a fixed size, so that they fit on the screen (in this case a resolution of 800x600, but it could be any other values):
// g_input_image, g_output_image and g_resized_image are global IplImage* pointers.
int img_w = cvGetSize(g_input_image).width;
int img_h = cvGetSize(g_input_image).height;

// If the height/width ratio is greater than 6/8, resize height to 600.
if (img_h > (img_w*6)/8) {
    g_resized_image = cvCreateImage(cvSize((img_w*600)/img_h, 600), 8, 3);
}
// else adjust width to 800.
else {
    g_resized_image = cvCreateImage(cvSize(800, (img_h*800)/img_w), 8, 3);
}
cvResize(g_output_image, g_resized_image);
Not a perfect solution, but works for me...
Cheers,
Linus
How are you building the window? You are not passing CV_WINDOW_AUTOSIZE to cvNamedWindow(), are you?
Share some source, @Denis.
