Transforming ARFrame#capturedImage to view size - ios

When using the ARSessionDelegate to process the raw camera image in ARKit...
func session(_ session: ARSession, didUpdate frame: ARFrame) {
guard let currentFrame = session.currentFrame else { return }
let capturedImage = currentFrame.capturedImage
debugPrint("Display size", UIScreen.main.bounds.size)
debugPrint("Camera frame resolution", CVPixelBufferGetWidth(capturedImage), CVPixelBufferGetHeight(capturedImage))
// ...
}
... as documented, the camera image data doesn't match the screen size, for example, on iPhone X I get:
Display size: 375x812pt
Camera resolution: 1920x1440px
Now there is the displayTransform(for:viewportSize:) API to transform camera coordinates to view coordinates. When using the API like this:
let ciimage = CIImage(cvImageBuffer: capturedImage)
let transform = currentFrame.displayTransform(for: .portrait, viewportSize: UIScreen.main.bounds.size)
var transformedImage = ciimage.transformed(by: transform)
debugPrint("Transformed size", transformedImage.extent.size)
I get a size of 2340x1920, which seems incorrect: the result should have an aspect ratio of 375:812 (~0.46). What am I missing here, and what is the correct way to use this API to transform the camera image to an image "as displayed by ARSCNView"?
(Example project: ARKitCameraImage)

This turned out to be quite complicated: displayTransform(for:viewportSize:) expects normalized image coordinates, the coordinates apparently have to be flipped only in portrait mode, and the image needs to be not only transformed but also cropped. The following code does the trick for me. Suggestions on how to improve it would be appreciated.
guard let frame = session.currentFrame else { return }
let imageBuffer = frame.capturedImage
let imageSize = CGSize(width: CVPixelBufferGetWidth(imageBuffer), height: CVPixelBufferGetHeight(imageBuffer))
let viewPort = sceneView.bounds
let viewPortSize = sceneView.bounds.size
let interfaceOrientation : UIInterfaceOrientation
if #available(iOS 13.0, *) {
interfaceOrientation = self.sceneView.window!.windowScene!.interfaceOrientation
} else {
interfaceOrientation = UIApplication.shared.statusBarOrientation
}
let image = CIImage(cvImageBuffer: imageBuffer)
// The camera image doesn't match the view rotation and aspect ratio
// Transform the image:
// 1) Convert to "normalized image coordinates"
let normalizeTransform = CGAffineTransform(scaleX: 1.0/imageSize.width, y: 1.0/imageSize.height)
// 2) Flip both axes, mapping (x, y) to (1 - x, 1 - y) (for some mysterious reason this is only necessary in portrait mode)
let flipTransform = (interfaceOrientation.isPortrait) ? CGAffineTransform(scaleX: -1, y: -1).translatedBy(x: -1, y: -1) : .identity
// 3) Apply the transformation provided by ARFrame
// This transformation converts:
// - From Normalized image coordinates (Normalized image coordinates range from (0,0) in the upper left corner of the image to (1,1) in the lower right corner)
// - To view coordinates ("a coordinate space appropriate for rendering the camera image onscreen")
// See also: https://developer.apple.com/documentation/arkit/arframe/2923543-displaytransform
let displayTransform = frame.displayTransform(for: interfaceOrientation, viewportSize: viewPortSize)
// 4) Convert to view size
let toViewPortTransform = CGAffineTransform(scaleX: viewPortSize.width, y: viewPortSize.height)
// Transform the image and crop it to the viewport
let transformedImage = image.transformed(by: normalizeTransform.concatenating(flipTransform).concatenating(displayTransform).concatenating(toViewPortTransform)).cropped(to: viewPort)
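As a sanity check on step 2, the flip transform used above sends a normalized point (x, y) to (1 - x, 1 - y), so it mirrors both axes rather than only Y. The following is a minimal sketch of that check; the helper name flipNormalized is hypothetical and not part of ARKit.

```swift
import Foundation  // on Apple platforms this also provides CGPoint and CGAffineTransform

/// Hypothetical helper: applies the same flip as step 2 to a point given in
/// normalized image coordinates (0...1 on both axes).
func flipNormalized(_ point: CGPoint) -> CGPoint {
    // translatedBy(x:y:) is applied to the point before the scale, so the point
    // is first shifted by (-1, -1) and then negated: (x, y) -> (1 - x, 1 - y).
    let flip = CGAffineTransform(scaleX: -1, y: -1).translatedBy(x: -1, y: -1)
    return point.applying(flip)
}

let flipped = flipNormalized(CGPoint(x: 0.25, y: 0.75))
print(flipped)
```

A point in the upper-left quarter ends up in the lower-right quarter, which is a 180° rotation about the image centre.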

Thank you so much for your answer! I was working on this for a week.
Here's an alternative way to do it without messing with the orientation. Instead of using the capturedImage property you can use a snapshot of the screen.
func session(_ session: ARSession, didUpdate frame: ARFrame) {
guard let image = CIImage(image: sceneView.snapshot()) else { return }
let imageSize = image.extent.size
// Convert to "normalized image coordinates"
let resize = CGAffineTransform(scaleX: 1.0 / imageSize.width, y: 1.0 / imageSize.height)
// Convert to view size
let viewSize = CGAffineTransform(scaleX: sceneView.bounds.size.width, y: sceneView.bounds.size.height)
// Transform image
let editedImage = image.transformed(by: resize.concatenating(viewSize)).cropped(to: sceneView.bounds)
sceneView.scene.background.contents = context.createCGImage(editedImage, from: editedImage.extent)
}

Related

MTKView is blurry - samplingNearest() does not appear to work

I'm using a MTKView to display some pixel art, but it shows up blurry.
Here is the really weird part: I took a screenshot to show you all what it looks like, but the screenshot is perfectly sharp! Yet, the contents of the MTKView is blurry. Here's the screenshot, and a simulation of what it looks like in the app:
Note the test pattern displayed in the app is 32 x 32 pixels.
When switching from one app to this one, the view is briefly sharp, before instantly becoming blurry.
I suspect this has something to do with anti-aliasing, but I can't seem to find a way to turn it off. Here is my code:
import UIKit
import MetalKit
class ViewController: UIViewController, MTKViewDelegate {
var metalView: MTKView!
var image: CIImage!
var commandQueue: MTLCommandQueue!
var context: CIContext!
override func viewDidLoad() {
super.viewDidLoad()
setup()
layout()
}
func setup() {
guard let image = loadTestPattern() else { return }
self.image = image
let metalView = MTKView(frame: CGRect(origin: CGPoint.zero, size: image.extent.size))
metalView.device = MTLCreateSystemDefaultDevice()
metalView.delegate = self
metalView.framebufferOnly = false
metalView.isPaused = true
metalView.enableSetNeedsDisplay = true
commandQueue = metalView.device?.makeCommandQueue()
context = CIContext(mtlDevice: metalView.device!)
self.metalView = metalView
view.addSubview(metalView)
}
func layout() {
let size = image.extent.size
metalView.translatesAutoresizingMaskIntoConstraints = false
NSLayoutConstraint.activate([
metalView.centerXAnchor.constraint(equalTo: view.centerXAnchor),
metalView.centerYAnchor.constraint(equalTo: view.centerYAnchor),
metalView.widthAnchor.constraint(equalToConstant: size.width),
metalView.heightAnchor.constraint(equalToConstant: size.height),
])
let viewBounds = view.bounds.size
let scale = min(viewBounds.width/size.width, viewBounds.height/size.height)
metalView.layer.magnificationFilter = CALayerContentsFilter.nearest
metalView.transform = metalView.transform.scaledBy(x: floor(scale * 0.8), y: floor(scale * 0.8))
}
func loadTestPattern() -> CIImage? {
guard let uiImage = UIImage(named: "TestPattern_32.png") else { return nil }
guard let image = CIImage(image: uiImage) else { return nil }
return image
}
func mtkView(_ view: MTKView, drawableSizeWillChange size: CGSize) {}
func draw(in view: MTKView) {
guard let image = self.image else { return }
if let currentDrawable = view.currentDrawable,
let commandBuffer = self.commandQueue.makeCommandBuffer() {
let drawableSize = view.drawableSize
let scaleX = drawableSize.width / image.extent.width
let scaleY = drawableSize.height / image.extent.height
let scale = min(scaleX, scaleY)
let scaledImage = image.samplingNearest().transformed(by: CGAffineTransform(scaleX: scale, y: scale))
let destination = CIRenderDestination(width: Int(drawableSize.width),
height: Int(drawableSize.height),
pixelFormat: view.colorPixelFormat,
commandBuffer: nil,
mtlTextureProvider: { () -> MTLTexture in return currentDrawable.texture })
try! self.context.startTask(toRender: scaledImage, to: destination)
commandBuffer.present(currentDrawable)
commandBuffer.commit()
}
}
}
Any ideas on what is going on?
Edit 01:
Some additional clues: I attached a pinch gesture recognizer to the MTKView, and printed how much it's being scaled by. Up to a scale factor of approximately 31-32, it appears to be using a linear filter, but beyond 31 or 32, nearest filtering takes over.
Clue #2: Problem disappears when MTKView is replaced with a standard UIImageView.
I'm not sure why that is.
You can find out how to turn multisampling anti-aliasing on or off in "How to use multisampling with an MTKView?"
Just set .sampleCount = 1. However, your problem doesn't look MSAA-related.
My only idea: I'd check the framebuffer sizes in the Metal Debugger in Xcode. Sometimes (depending on the content scale factor on your device) the framebuffer can be stretched. E.g. you have a device with a virtual resolution of 100x100 and a content scale factor of 2. The physical resolution would be 200x200 in this case, and a 100x100 framebuffer will be stretched by the system. This may happen with implicit linear filtering instead of the nearest filtering you set for the main render pass. For screenshots, the system can use a 1:1 resolution, so no stretching happens.
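To test this hypothesis without the debugger, one could compare the actual drawable size with the size implied by the view's bounds and content scale factor. This is only a sketch under that assumption; both function names are hypothetical, and in an app the inputs would come from metalView.drawableSize, metalView.bounds.size and metalView.contentScaleFactor.

```swift
import Foundation

/// Hypothetical helper: the drawable size (in pixels) one would expect for a
/// view of the given size (in points) at the given content scale factor.
func expectedDrawableSize(boundsSize: CGSize, contentScaleFactor: CGFloat) -> CGSize {
    return CGSize(width: boundsSize.width * contentScaleFactor,
                  height: boundsSize.height * contentScaleFactor)
}

/// Hypothetical helper: true when the actual drawable has fewer pixels than
/// the view covers on screen, so it has to be stretched when displayed.
func drawableNeedsStretching(actual: CGSize,
                             boundsSize: CGSize,
                             contentScaleFactor: CGFloat) -> Bool {
    let expected = expectedDrawableSize(boundsSize: boundsSize,
                                        contentScaleFactor: contentScaleFactor)
    return actual.width < expected.width || actual.height < expected.height
}

// The example from the answer: 100x100 points at scale 2 with a 100x100 framebuffer.
let stretched = drawableNeedsStretching(actual: CGSize(width: 100, height: 100),
                                        boundsSize: CGSize(width: 100, height: 100),
                                        contentScaleFactor: 2)
print(stretched)
```

Note that a transform applied to the view, as in the question, enlarges it on screen without changing bounds, so that factor would have to be accounted for separately.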

Metal View (MTKView) Drawing Size Issue

Here I have a MTKView and running a simple CIFilter live on camera feed. This works fine.
Issue
On the selfie cameras of older devices, such as iPhone 5 or iPad Air, the feed gets drawn in a smaller area. UPDATE: I found out that the CMSampleBuffer fed to the MTKView is smaller in size when this happens. I guess the texture in each update needs to be scaled up?
import UIKit
import MetalPerformanceShaders
import MetalKit
import AVFoundation
final class MetalObject: NSObject, MTKViewDelegate {
private var metalBufferView : MTKView?
private var metalDevice = MTLCreateSystemDefaultDevice()
private var metalCommandQueue : MTLCommandQueue!
private var metalSourceTexture : MTLTexture?
private var context : CIContext?
private var filter : CIFilter?
init(with frame: CGRect, filterType: Int, scaledUp: Bool) {
super.init()
self.metalCommandQueue = self.metalDevice!.makeCommandQueue()
self.metalBufferView = MTKView(frame: frame, device: self.metalDevice)
self.metalBufferView!.framebufferOnly = false
self.metalBufferView!.isPaused = true
self.metalBufferView!.contentScaleFactor = UIScreen.main.nativeScale
self.metalBufferView!.delegate = self
self.context = CIContext()
}
final func update (sampleBuffer: CMSampleBuffer) {
var textureCache : CVMetalTextureCache?
CVMetalTextureCacheCreate(kCFAllocatorDefault, nil, self.metalDevice!, nil, &textureCache)
var cameraTexture: CVMetalTexture?
guard
let cameraTextureCache = textureCache,
let pixelBuffer = CMSampleBufferGetImageBuffer(sampleBuffer) else {
return
}
let cameraTextureWidth = CVPixelBufferGetWidthOfPlane(pixelBuffer, 0)
let cameraTextureHeight = CVPixelBufferGetHeightOfPlane(pixelBuffer, 0)
CVMetalTextureCacheCreateTextureFromImage(kCFAllocatorDefault,
cameraTextureCache,
pixelBuffer,
nil,
MTLPixelFormat.bgra8Unorm,
cameraTextureWidth,
cameraTextureHeight,
0,
&cameraTexture)
if let cameraTexture = cameraTexture,
let metalTexture = CVMetalTextureGetTexture(cameraTexture) {
self.metalSourceTexture = metalTexture
self.metalBufferView!.draw()
}
}
//MARK: - Metal View Delegate
final func draw(in view: MTKView) {
guard let currentDrawable = self.metalBufferView!.currentDrawable,
let sourceTexture = self.metalSourceTexture
else { return }
let commandBuffer = self.metalCommandQueue!.makeCommandBuffer()
var inputImage = CIImage(mtlTexture: sourceTexture)!.applyingOrientation(self.orientationNumber)
if self.showFilter {
self.filter!.setValue(inputImage, forKey: kCIInputImageKey)
inputImage = filter!.outputImage!
}
self.context!.render(inputImage, to: currentDrawable.texture, commandBuffer: commandBuffer, bounds: inputImage.extent, colorSpace: self.colorSpace!)
commandBuffer.present(currentDrawable)
commandBuffer.commit()
}
final func mtkView(_ view: MTKView, drawableSizeWillChange size: CGSize) {
}
}
Observations
Only happens on selfie cameras of older devices
Selfie cameras on newer devices are fine
When the issue occurs, new content gets drawn in a smaller area (gravitating towards the top left), with old content from the back camera still remaining outside the new content.
Constraints and the sizing/placement of Metal View is fine.
self.metalBufferView!.contentScaleFactor = UIScreen.main.nativeScale
solves the weird scaling issue on Plus devices.
It looks like the resolution of the front (selfie) camera on older devices is lower, so you'll need to scale the video up if you want it to use the full width or height. Since you're already using CIContext and Metal, you can simply instruct the rendering call to draw the image to whatever rectangle you like.
In your draw method, you execute
self.context!.render(inputImage,
to: currentDrawable.texture,
commandBuffer: commandBuffer,
bounds: inputImage.extent,
colorSpace: self.colorSpace!)
The bounds argument is the destination rectangle in which the image will be rendered. Currently, you are using the image extent, which means the image will not be scaled.
To scale the video up, use the display rectangle instead. You can simply use your metalBufferView.bounds since this will be the size of your display view. You'll end up with
self.context!.render(inputImage,
to: currentDrawable.texture,
commandBuffer: commandBuffer,
bounds: self.metalBufferView!.bounds,
colorSpace: self.colorSpace!)
If the image and the view are different aspect ratios (width/height is the aspect ratio), then you'll have to compute the correct size such that the image's aspect ratio is preserved. To do this, you'll end up with code like this:
CGRect dest = self.metalBufferView.bounds;
CGSize imageSize = inputImage.extent.size;
CGSize viewSize = dest.size;
double imageAspect = imageSize.width / imageSize.height;
double viewAspect = viewSize.width / viewSize.height;
if (imageAspect > viewAspect) {
// the image is wider than the view, adjust height
dest.size.height = 1/imageAspect * dest.size.width;
} else {
// the image is taller than the view, adjust the width
dest.size.width = imageAspect * dest.size.height;
// center the tall image
dest.origin.x = (viewSize.width - dest.size.width) / 2;
}
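The same aspect-preserving rectangle can be written in Swift, the language of the question. This is a sketch rather than a drop-in replacement: the function name aspectFitRect is hypothetical, and unlike the snippet above it also centers a wide image vertically.

```swift
import Foundation

/// Hypothetical helper: the largest rectangle with the image's aspect ratio
/// that fits inside `bounds`, centered on both axes.
func aspectFitRect(imageSize: CGSize, in bounds: CGRect) -> CGRect {
    guard imageSize.width > 0, imageSize.height > 0 else { return bounds }
    let imageAspect = imageSize.width / imageSize.height
    let viewAspect = bounds.width / bounds.height
    var dest = bounds
    if imageAspect > viewAspect {
        // The image is wider than the view: keep the width, shrink the height.
        dest.size.height = bounds.width / imageAspect
        dest.origin.y = bounds.minY + (bounds.height - dest.height) / 2
    } else {
        // The image is taller than the view: keep the height, shrink the width.
        dest.size.width = bounds.height * imageAspect
        dest.origin.x = bounds.minX + (bounds.width - dest.width) / 2
    }
    return dest
}

let fitted = aspectFitRect(imageSize: CGSize(width: 1920, height: 1080),
                           in: CGRect(x: 0, y: 0, width: 375, height: 812))
print(fitted)
```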
Hope this is useful, please let me know if anything doesn't work or clarification would be helpful.

ARKit - Projection of ARAnchor to 2D space

I am trying to project an ARAnchor to 2D space, but I am facing an orientation issue...
Below is my function to project the top left, top right, bottom left and bottom right corner positions to 2D space:
/// Returns the projection of an `ARImageAnchor` from the 3D world space
/// detected by ARKit into the 2D space of a view rendering the scene.
///
/// - Parameter from: An Anchor instance for projecting.
/// - Returns: An optional `CGRect` corresponding to the `ARImageAnchor` projection.
internal func projection(from anchor: ARImageAnchor,
alignment: ARPlaneAnchor.Alignment,
debug: Bool = false) -> CGRect? {
guard let camera = session.currentFrame?.camera else {
return nil
}
let refImg = anchor.referenceImage
let anchor3DPoint = anchor.transform.columns.3
let size = view.bounds.size
let width = Float(refImg.physicalSize.width / 2)
let height = Float(refImg.physicalSize.height / 2)
/// Upper left corner point
let projection = ProjectionHelper.projection(from: anchor3DPoint,
width: width,
height: height,
focusAlignment: alignment)
let topLeft = projection.0
let topLeftProjected = camera.projectPoint(topLeft,
orientation: .portrait,
viewportSize: size)
let topRight:simd_float3 = projection.1
let topRightProjected = camera.projectPoint(topRight,
orientation: .portrait,
viewportSize: size)
let bottomLeft = projection.2
let bottomLeftProjected = camera.projectPoint(bottomLeft,
orientation: .portrait,
viewportSize: size)
let bottomRight = projection.3
let bottomRightProjected = camera.projectPoint(bottomRight,
orientation: .portrait,
viewportSize: size)
let result = CGRect(origin: topLeftProjected,
size: CGSize(width: topRightProjected.distance(point: topLeftProjected),
height: bottomRightProjected.distance(point: bottomLeftProjected)))
return result
}
This function works pretty well when I am in front of the world origin. However, if I move left or right the calculation of the corner points does not work.
I found a solution to get corner 3D points of an ARImageAnchor depending on the anchor.transform and project them to 2D space:
extension simd_float4 {
var vector_float3: vector_float3 { return simd_float3([x, y, z]) }
}
/// Returns the projection of an `ARImageAnchor` from the 3D world space
/// detected by ARKit into the 2D space of a view rendering the scene.
///
/// - Parameter from: An Anchor instance for projecting.
/// - Returns: An optional `CGRect` corresponding to the `ARImageAnchor` projection.
internal func projection(from anchor: ARImageAnchor) -> CGRect? {
guard let camera = session.currentFrame?.camera else {
return nil
}
let refImg = anchor.referenceImage
let transform = anchor.transform.transpose
let size = view.bounds.size
let width = Float(refImg.physicalSize.width / 2)
let height = Float(refImg.physicalSize.height / 2)
// Get corner 3D points (in the anchor's local space, -z is the top edge of the image)
let pointsWorldSpace = [
matrix_multiply(simd_float4([width, 0, -height, 1]), transform).vector_float3, // top right
matrix_multiply(simd_float4([width, 0, height, 1]), transform).vector_float3, // bottom right
matrix_multiply(simd_float4([-width, 0, height, 1]), transform).vector_float3, // bottom left
matrix_multiply(simd_float4([-width, 0, -height, 1]), transform).vector_float3 // top left
]
// Project 3D point to 2D space
let pointsViewportSpace = pointsWorldSpace.map { (point) -> CGPoint in
return camera.projectPoint(
point,
orientation: .portrait,
viewportSize: size
)
}
// Create a rectangle shape of the projection
// to calculate the Intersection Over Union of other `ARImageAnchor`
let result = CGRect(
origin: pointsViewportSpace[3],
size: CGSize(
width: pointsViewportSpace[0].distance(point: pointsViewportSpace[3]), // top right to top left
height: pointsViewportSpace[2].distance(point: pointsViewportSpace[3]) // bottom left to top left
)
)
return result
}
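When the anchor is viewed from the side, the four projected corners generally form a skewed quadrilateral, so a rectangle built from one corner and two edge lengths is only an approximation. If an axis-aligned bounding box is acceptable for the Intersection Over Union step (an assumption, not something the answer states), it can be computed from the projected points directly. The function name boundingBox(of:) is hypothetical.

```swift
import Foundation

/// Hypothetical helper: the smallest axis-aligned rectangle that contains
/// all of the given points, or nil for an empty array.
func boundingBox(of points: [CGPoint]) -> CGRect? {
    guard let first = points.first else { return nil }
    var minX = first.x, maxX = first.x
    var minY = first.y, maxY = first.y
    for point in points.dropFirst() {
        minX = min(minX, point.x)
        maxX = max(maxX, point.x)
        minY = min(minY, point.y)
        maxY = max(maxY, point.y)
    }
    return CGRect(x: minX, y: minY, width: maxX - minX, height: maxY - minY)
}

// Four corners of a skewed quadrilateral, in view coordinates.
let box = boundingBox(of: [CGPoint(x: 10, y: 20), CGPoint(x: 110, y: 30),
                           CGPoint(x: 100, y: 80), CGPoint(x: 5, y: 70)])
print(box as Any)
```

In the function above, this would be called with pointsViewportSpace.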

Crop picture from UIImagePickerController like credit card scanning in iOS [duplicate]

I'm trying to crop a sub-image of an image view using an overlay UIView that can be positioned anywhere in the UIImageView. I'm borrowing a solution from a similar post on how to solve this when the UIImageView content mode is 'Aspect Fit'. That proposed solution is:
func computeCropRect(for sourceFrame : CGRect) -> CGRect {
let widthScale = bounds.size.width / image!.size.width
let heightScale = bounds.size.height / image!.size.height
var x : CGFloat = 0
var y : CGFloat = 0
var width : CGFloat = 0
var height : CGFloat = 0
var offSet : CGFloat = 0
if widthScale < heightScale {
offSet = (bounds.size.height - (image!.size.height * widthScale))/2
x = sourceFrame.origin.x / widthScale
y = (sourceFrame.origin.y - offSet) / widthScale
width = sourceFrame.size.width / widthScale
height = sourceFrame.size.height / widthScale
} else {
offSet = (bounds.size.width - (image!.size.width * heightScale))/2
x = (sourceFrame.origin.x - offSet) / heightScale
y = sourceFrame.origin.y / heightScale
width = sourceFrame.size.width / heightScale
height = sourceFrame.size.height / heightScale
}
return CGRect(x: x, y: y, width: width, height: height)
}
The problem is that using this solution when the image view is aspect fill causes the cropped segment to not line up exactly with where the overlay UIView was positioned. I'm not quite sure how to adapt this code to accommodate for Aspect Fill or reposition my overlay UIView so that it lines up 1:1 with the segment I'm trying to crop.
UPDATE Solved using Matt's answer below
class ViewController: UIViewController {
@IBOutlet weak var catImageView: UIImageView!
private var cropView : CropView!
override func viewDidLoad() {
super.viewDidLoad()
cropView = CropView(frame: CGRect(x: 0, y: 0, width: 45, height: 45))
catImageView.image = UIImage(named: "cat")
catImageView.clipsToBounds = true
catImageView.layer.borderColor = UIColor.purple.cgColor
catImageView.layer.borderWidth = 2.0
catImageView.backgroundColor = UIColor.yellow
catImageView.addSubview(cropView)
let imageSize = catImageView.image!.size
let imageViewSize = catImageView.bounds.size
var scale : CGFloat = imageViewSize.width / imageSize.width
if imageSize.height * scale < imageViewSize.height {
scale = imageViewSize.height / imageSize.height
}
let croppedImageSize = CGSize(width: imageViewSize.width/scale, height: imageViewSize.height/scale)
let croppedImrect =
CGRect(origin: CGPoint(x: (imageSize.width-croppedImageSize.width)/2.0,
y: (imageSize.height-croppedImageSize.height)/2.0),
size: croppedImageSize)
let renderer = UIGraphicsImageRenderer(size:croppedImageSize)
let _ = renderer.image { _ in
catImageView.image!.draw(at: CGPoint(x:-croppedImrect.origin.x, y:-croppedImrect.origin.y))
}
}
@IBAction func performCrop(_ sender: Any) {
let cropFrame = catImageView.computeCropRect(for: cropView.frame)
if let imageRef = catImageView.image?.cgImage?.cropping(to: cropFrame) {
catImageView.image = UIImage(cgImage: imageRef)
}
}
@IBAction func resetCrop(_ sender: Any) {
catImageView.image = UIImage(named: "cat")
}
}
The Final Result
Let's divide the problem into two parts:
1. Given the size of a UIImageView and the size of its UIImage, if the UIImageView's content mode is Aspect Fill, what is the part of the UIImage that fits into the UIImageView? We need, in effect, to crop the original image to match what the UIImageView is actually displaying.
2. Given an arbitrary rect within the UIImageView, what part of the cropped image (derived in part 1) does it correspond to?
The first part is the interesting part, so let's try it. (The second part will then turn out to be trivial.)
Here's the original image I'll use:
https://static1.squarespace.com/static/54e8ba93e4b07c3f655b452e/t/56c2a04520c64707756f4267/1455596221531/
That image is 1000x611. Here's what it looks like scaled down (but keep in mind that I'm going to be using the original image throughout):
My image view, however, will be 139x182, and is set to Aspect Fill. When it displays the image, it looks like this:
The problem we want to solve is: what part of the original image is being displayed in my image view, if my image view is set to Aspect Fill?
Here we go. Assume that iv is the image view:
let imsize = iv.image!.size
let ivsize = iv.bounds.size
var scale : CGFloat = ivsize.width / imsize.width
if imsize.height * scale < ivsize.height {
scale = ivsize.height / imsize.height
}
let croppedImsize = CGSize(width:ivsize.width/scale, height:ivsize.height/scale)
let croppedImrect =
CGRect(origin: CGPoint(x: (imsize.width-croppedImsize.width)/2.0,
y: (imsize.height-croppedImsize.height)/2.0),
size: croppedImsize)
So now we have solved the problem: croppedImrect is the region of the original image that is showing in the image view. Let's proceed to use our knowledge, by actually cropping the image to a new image matching what is shown in the image view:
let r = UIGraphicsImageRenderer(size:croppedImsize)
let croppedIm = r.image { _ in
iv.image!.draw(at: CGPoint(x:-croppedImrect.origin.x, y:-croppedImrect.origin.y))
}
The result is this image (ignore the gray border):
But lo and behold, that is the correct answer! I have extracted from the original image exactly the region portrayed in the interior of the image view.
So now you have all the information you need. croppedIm is the UIImage actually displayed in the clipped area of the image view. scale is the scale between the image view and that image. Therefore, you can easily solve the problem you originally proposed! Given any rectangle imposed upon the image view, in the image view's bounds coordinates, you simply apply the scale (i.e. divide all four of its attributes by scale) — and now you have the same rectangle as a portion of croppedIm.
(Observe that we didn't really need to crop the original image to get croppedIm; it was sufficient, in reality, to know how to perform that crop. The important information is the scale along with the origin of croppedImrect; given that information, you can take the rectangle imposed upon the image view, scale it, and offset it to get the desired rectangle of the original image.)
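The scale-and-offset step described above can be sketched as a single function. The name imageRect(forViewRect:imageSize:viewSize:) is hypothetical; the scale and origin are computed exactly as in the earlier snippet, assuming an Aspect Fill image view whose image is centered.

```swift
import Foundation

/// Hypothetical helper: maps a rectangle in the bounds coordinates of an
/// Aspect Fill image view to the matching rectangle of the original image.
func imageRect(forViewRect viewRect: CGRect, imageSize: CGSize, viewSize: CGSize) -> CGRect {
    // Aspect Fill uses the larger of the two scale factors.
    var scale = viewSize.width / imageSize.width
    if imageSize.height * scale < viewSize.height {
        scale = viewSize.height / imageSize.height
    }
    // Region of the original image that is visible in the view.
    let visibleSize = CGSize(width: viewSize.width / scale, height: viewSize.height / scale)
    let visibleOrigin = CGPoint(x: (imageSize.width - visibleSize.width) / 2,
                                y: (imageSize.height - visibleSize.height) / 2)
    // Scale the view rectangle down to image units and offset it by that origin.
    return CGRect(x: visibleOrigin.x + viewRect.origin.x / scale,
                  y: visibleOrigin.y + viewRect.origin.y / scale,
                  width: viewRect.width / scale,
                  height: viewRect.height / scale)
}

// The sizes from the answer: a 1000x611 image in a 139x182 image view.
let whole = imageRect(forViewRect: CGRect(x: 0, y: 0, width: 139, height: 182),
                      imageSize: CGSize(width: 1000, height: 611),
                      viewSize: CGSize(width: 139, height: 182))
print(whole)
```

Passing the whole view rectangle returns the visible region itself, roughly 467x611 starting at x ≈ 267. The result is in the units of UIImage.size; if it is passed to CGImage's cropping(to:), which works in pixels, it would presumably also need multiplying by the image's scale.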
EDIT I added a little screencast just to show that my approach works as a proof of concept:
EDIT Also created a downloadable example project here:
https://github.com/mattneub/Programming-iOS-Book-Examples/blob/39cc800d18aa484d17c26ffcbab8bbe51c614573/bk2ch02p058cropImageView/Cropper/ViewController.swift
But note that I can't guarantee that URL will last forever, so please read the discussion above to understand the approach used.
Matt answered the question perfectly. I was creating a full-screen camera and had a need to make the final output match the full-screen preview. Offering here a compact extension of Matt's overall answer in Swift 5 for easy use by others. Recommend reading Matt's answer as it explains things very well.
extension UIImage {
func cropToRect(rect: CGRect) -> UIImage? {
var scale = rect.width / self.size.width
scale = self.size.height * scale < rect.height ? rect.height/self.size.height : scale
let croppedImsize = CGSize(width:rect.width/scale, height:rect.height/scale)
let croppedImrect = CGRect(origin: CGPoint(x: (self.size.width-croppedImsize.width)/2.0,
y: (self.size.height-croppedImsize.height)/2.0),
size: croppedImsize)
UIGraphicsBeginImageContextWithOptions(croppedImsize, true, 0)
self.draw(at: CGPoint(x:-croppedImrect.origin.x, y:-croppedImrect.origin.y))
let croppedImage = UIGraphicsGetImageFromCurrentImageContext()
UIGraphicsEndImageContext()
return croppedImage
}
}

Crop Image from Camera in Swift without move to another ViewController

I have an image overlay inside CameraViewController:
I want to get the image from inside this red square.
I don't want to move to another view controller to setup a CropViewController, the crop should be done inside this Controller.
The code below almost works. The problem is that the image generated by the camera is 1080x1920 while self.cropView.bounds is (0,0,185,120), which of course does not represent the same scale used to take the image.
extension UIImage {
func crop(rect: CGRect) -> UIImage {
var rect = rect
rect.origin.x*=self.scale
rect.origin.y*=self.scale
rect.size.width*=self.scale
rect.size.height*=self.scale
let imageRef = self.cgImage!.cropping(to: rect)
let image = UIImage(cgImage: imageRef!, scale: self.scale, orientation: self.imageOrientation)
return image
}
}
You can always visually crop any image to a quadrilateral (a four-sided shape, which doesn't have to be a rectangle) using a Core Image filter called CIPerspectiveCorrection.
Let's say you have an imageView frame that is 414 wide by 716 high, with an image that is 1600 wide by 900 high. (You are using a content mode of .aspectFit, right?) Let's say you want to crop a four-sided shape whose corners, in (X,Y) coordinates in the imageView, are (50,50), (75,75), (100,300), and (25,200). Note that I'm listing the points in top left (TL), top right (TR), bottom right (BR), bottom left (BL) order. Also note that this is not a straightforward rectangle.
What you need to do is this:
1. Convert the UIImage to a CIImage where the "extent" is the UIImage size,
2. convert those UIImageView coordinates to CIImage coordinates,
3. pass them and the CIImage into the CIPerspectiveCorrection filter for cropping, and
4. render the CIImage output into a UIImageView.
The below code is a little rough around the edges, but hopefully you get the concept:
class ViewController: UIViewController {
let uiTL = CGPoint(x: 50, y: 50)
let uiTR = CGPoint(x: 75, y: 75)
let uiBR = CGPoint(x: 100, y: 300)
let uiBL = CGPoint(x: 25, y: 200)
var ciImage:CIImage!
var ctx:CIContext!
@IBOutlet weak var imageView: UIImageView!
override func viewDidLoad() {
super.viewDidLoad()
ctx = CIContext(options: nil)
ciImage = CIImage(image: imageView.image!)
}
override func viewWillLayoutSubviews() {
let ciTL = createVector(createScaledPoint(uiTL))
let ciTR = createVector(createScaledPoint(uiTR))
let ciBR = createVector(createScaledPoint(uiBR))
let ciBL = createVector(createScaledPoint(uiBL))
imageView.image = doPerspectiveCorrection(CIImage(image: imageView.image!)!,
context: ctx,
topLeft: ciTL,
topRight: ciTR,
bottomRight: ciBR,
bottomLeft: ciBL)
}
func doPerspectiveCorrection(
_ image:CIImage,
context:CIContext,
topLeft:AnyObject,
topRight:AnyObject,
bottomRight:AnyObject,
bottomLeft:AnyObject)
-> UIImage {
let filter = CIFilter(name: "CIPerspectiveCorrection")
filter?.setValue(topLeft, forKey: "inputTopLeft")
filter?.setValue(topRight, forKey: "inputTopRight")
filter?.setValue(bottomRight, forKey: "inputBottomRight")
filter?.setValue(bottomLeft, forKey: "inputBottomLeft")
filter!.setValue(image, forKey: kCIInputImageKey)
let cgImage = context.createCGImage((filter?.outputImage)!, from: (filter?.outputImage!.extent)!)
return UIImage(cgImage: cgImage!)
}
func createScaledPoint(_ pt:CGPoint) -> CGPoint {
let x = (pt.x / imageView.frame.width) * ciImage.extent.width
let y = (pt.y / imageView.frame.height) * ciImage.extent.height
return CGPoint(x: x, y: y)
}
func createVector(_ point:CGPoint) -> CIVector {
return CIVector(x: point.x, y: ciImage.extent.height - point.y)
}
func createPoint(_ vector:CGPoint) -> CGPoint {
return CGPoint(x: vector.x, y: ciImage.extent.height - vector.y)
}
}
EDIT: I'm putting this here to explain things. The two of us swapped projects, and there was an issue with the questioner's code where a nil return was happening. First, here's the corrected code, which should be in the cropImage() function:
let ciTL = createVector(createScaledPoint(topLeft, overlay: cameraView, image: image), image: image)
let ciTR = createVector(createScaledPoint(topRight, overlay: cameraView, image: image), image: image)
let ciBR = createVector(createScaledPoint(bottomRight, overlay: cameraView, image: image), image: image)
let ciBL = createVector(createScaledPoint(bottomLeft, overlay: cameraView, image: image), image: image)
The issue is with the last two lines, which were transposed by passing bottomLeft where it should have been bottomRight, and vice-versa. (Easy mistake to make, I've done it too!)
Some explanation to help those who use CIPerspectiveCorrection (and other filters that use CIVectors).
A CIVector can have anywhere from, I think, 2 to an almost unlimited number of components. It depends on the filter. In this case there are two components (X, Y). Simple enough, but the twist is that the 4 CIVectors describe 4 points inside the CIImage extent, where the origin is the bottom left, not the top left.
Note I did not say a 4 sided shape. You can actually have a "figure 8" like shape where the "bottom right" point is left of the "bottom left" point! This would result in a shape where two sides cross each other.
All that matters is that all 4 points lie within the CIImage extent. If they don't, the filter will return nil for its output image.
One last note for those who haven't worked with CIImage filters before: the filters will not execute until you ask for the outputImage. You can instantiate one, fill in the parameters, chain them, whatever. You can even make a typo in the filter name (or any of their keys). Until your code asks for the filter.outputImage, nothing happens.
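The coordinate conversion and the "inside the extent" requirement can be combined into one small check before the filter is run. This is a sketch with a hypothetical function name; like createScaledPoint above, it assumes the image is stretched to exactly fill the view frame.

```swift
import Foundation

/// Hypothetical helper: converts a point in view coordinates (origin at the
/// top left) to Core Image coordinates (origin at the bottom left). Returns
/// nil if the result falls outside the image extent.
func coreImagePoint(fromViewPoint point: CGPoint,
                    viewSize: CGSize,
                    imageExtent: CGRect) -> CGPoint? {
    guard viewSize.width > 0, viewSize.height > 0 else { return nil }
    // Scale from view units to image units.
    let x = imageExtent.minX + point.x / viewSize.width * imageExtent.width
    let yFromTop = point.y / viewSize.height * imageExtent.height
    // Flip vertically: Core Image measures y upwards from the bottom edge.
    let y = imageExtent.maxY - yFromTop
    // Accept points on the edges as well as strictly inside the extent.
    guard x >= imageExtent.minX, x <= imageExtent.maxX,
          y >= imageExtent.minY, y <= imageExtent.maxY else { return nil }
    return CGPoint(x: x, y: y)
}

// The sizes from the answer: a 414x716 view showing a 1600x900 image.
let extent = CGRect(x: 0, y: 0, width: 1600, height: 900)
let inside = coreImagePoint(fromViewPoint: CGPoint(x: 50, y: 50),
                            viewSize: CGSize(width: 414, height: 716),
                            imageExtent: extent)
let outside = coreImagePoint(fromViewPoint: CGPoint(x: 500, y: 50),
                             viewSize: CGSize(width: 414, height: 716),
                             imageExtent: extent)
print(inside as Any, outside as Any)
```

A non-nil result can be wrapped in a CIVector with CIVector(x:y:) before being handed to the filter.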

Resources