Why is my photo becoming bigger? (Swift, iOS)

I am using AVFoundation to take a photo, but the image that comes back is more zoomed in than the photo I took.
Does anyone know why this would be?
I have the camera in landscape mode.
Below is my code for taking the image and then assigning it to a new variable.
var cropArea: CGRect {
    get {
        let factor = backImage.image!.size.width / view.frame.width
        let scale = CGFloat(1)
        let imageFrame = viewFinder.imageFrame()
        let x = (cropAreaView.frame.origin.x - imageFrame.origin.x) * scale * factor
        let y = (cropAreaView.frame.origin.y - imageFrame.origin.y) * scale * factor
        let width = cropAreaView.frame.size.width * scale * factor
        let height = cropAreaView.frame.size.height * scale * factor
        return CGRect(x: x, y: y, width: width, height: height)
    }
}
func photoOutput(_ output: AVCapturePhotoOutput, didFinishProcessingPhoto photo: AVCapturePhoto, error: Error?) {
    if let imageData = photo.fileDataRepresentation() {
        print(imageData)
        image = UIImage(data: imageData)
        backImage.image = image
        let croppedCGImage = self.backImage.image?.cgImage?.cropping(to: self.cropArea)
        let croppedImage = UIImage(cgImage: croppedCGImage!)
        self.pictureView.image = croppedImage
    }
}
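A guess, since the question doesn't say how backImage or the preview layer is configured: if the on-screen viewfinder uses aspect-fill, it shows only a central slice of the captured frame, so a crop rectangle derived from view coordinates with a single width-based factor can end up too tight, which reads as a "zoomed in" result. One way to map the visible preview region onto the captured photo is AVCaptureVideoPreviewLayer's metadataOutputRectConverted(fromLayerRect:); a minimal sketch, assuming a previewLayer reference is available:
// Sketch (assumes `previewLayer` is the AVCaptureVideoPreviewLayer behind the viewfinder).
// metadataOutputRectConverted(fromLayerRect:) maps the visible layer rect to a normalized
// (0...1) rect in capture coordinates, which is then scaled up to pixel coordinates.
func cropToPreview(_ image: UIImage, previewLayer: AVCaptureVideoPreviewLayer) -> UIImage? {
    guard let cgImage = image.cgImage else { return nil }
    let outputRect = previewLayer.metadataOutputRectConverted(fromLayerRect: previewLayer.bounds)
    let width = CGFloat(cgImage.width)
    let height = CGFloat(cgImage.height)
    let cropRect = CGRect(x: outputRect.origin.x * width,
                          y: outputRect.origin.y * height,
                          width: outputRect.size.width * width,
                          height: outputRect.size.height * height)
    guard let cropped = cgImage.cropping(to: cropRect) else { return nil }
    return UIImage(cgImage: cropped, scale: image.scale, orientation: image.imageOrientation)
}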

Related

How do I get the distance of a specific coordinate from a screen-sized depthMap of ARDepthData?

I am trying to get the distance of a specific coordinate from a depthMap resized to the screen size, but it is not working.
I have tried to implement the following steps.
1. Convert the depthMap to a CIImage, then resize the image to the orientation and size of the screen using an affine transform.
2. Convert the transformed image to a screen-sized CVPixelBuffer.
3. When reading the coordinate (x, y), get the distance in meters stored in the CVPixelBuffer from a one-dimensional array at index width * y + x.
I have implemented the above procedure, but I cannot get the appropriate index from the one-dimensional array. What should I do?
The code for the procedure is shown below.
1.
let depthMap = depthData.depthMap
// Convert the depthMap to a CIImage
let image = CIImage(cvPixelBuffer: depthMap)
let imageSize = CGSize(width: CVPixelBufferGetWidth(depthMap),
                       height: CVPixelBufferGetHeight(depthMap))
// 1) Normalize the captured image into 0.0-1.0 coordinates
let normalizeTransform = CGAffineTransform(scaleX: 1.0 / imageSize.width, y: 1.0 / imageSize.height)
// 2) "Flip the Y axis (for some mysterious reason this is only necessary in portrait mode)",
//    so apply the flip in portrait. Not only the Y axis but the X axis needs flipping as well.
let interfaceOrientation = self.arView.window!.windowScene!.interfaceOrientation
let flipTransform = (interfaceOrientation.isPortrait)
    ? CGAffineTransform(scaleX: -1, y: -1).translatedBy(x: -1, y: -1)
    : .identity
// 3) Move to the screen's orientation and position on the captured image
let displayTransform = frame.displayTransform(for: interfaceOrientation, viewportSize: arView.bounds.size)
// 4) Convert from the 0.0-1.0 coordinate system to the screen coordinate system
let toViewPortTransform = CGAffineTransform(scaleX: arView.bounds.size.width, y: arView.bounds.size.height)
// 5) Apply transforms 1-4 and crop the transformed image to the screen size
let transformedImage = image
    .transformed(by: normalizeTransform
        .concatenating(flipTransform)
        .concatenating(displayTransform)
        .concatenating(toViewPortTransform))
    .cropped(to: arView.bounds)
// Convert the transformed image to a screen-sized CVPixelBuffer
if let convertDepthMap = transformedImage.pixelBuffer(cgSize: arView.bounds.size) {
    previewImage.image = transformedImage.toUIImage()
    DispatchQueue.main.async {
        self.processDepthData(convertDepthMap)
    }
}
// The conversion to a CVPixelBuffer is implemented in an extension
extension CIImage {
    func toUIImage() -> UIImage {
        UIImage(ciImage: self)
    }

    func pixelBuffer(cgSize size: CGSize) -> CVPixelBuffer? {
        var pixelBuffer: CVPixelBuffer?
        let attrs = [kCVPixelBufferCGImageCompatibilityKey: kCFBooleanTrue,
                     kCVPixelBufferCGBitmapContextCompatibilityKey: kCFBooleanTrue] as CFDictionary
        let width = Int(size.width)
        let height = Int(size.height)
        CVPixelBufferCreate(kCFAllocatorDefault,
                            width,
                            height,
                            kCVPixelFormatType_DepthFloat32,
                            attrs,
                            &pixelBuffer)
        // Render the CIImage's bytes into the newly created pixel buffer
        let context = CIContext()
        context.render(self, to: pixelBuffer!)
        return pixelBuffer
    }
}
private func processDepthData(_ depthMap: CVPixelBuffer) {
    CVPixelBufferLockBaseAddress(depthMap, .readOnly)
    let width = CVPixelBufferGetWidth(depthMap)
    let height = CVPixelBufferGetHeight(depthMap)
    if let baseAddress = CVPixelBufferGetBaseAddress(depthMap) {
        let mutablePointer = baseAddress.bindMemory(to: Float32.self, capacity: width * height)
        let bufferPointer = UnsafeBufferPointer(start: mutablePointer, count: width * height)
        let depthArray = Array(bufferPointer)
        CVPixelBufferUnlockBaseAddress(depthMap, .readOnly)
        // index = width * y + x: trying to read the distance in meters at (300, 100),
        // but this returns the distance for a different coordinate
        print(depthArray[width * 100 + 300])
    }
}
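Not part of the original post, but one thing worth ruling out: a CVPixelBuffer row is often padded, so bytesPerRow can be wider than width * 4 bytes, and indexing a flat [Float32] copy with width * y + x then lands on the wrong pixel. A minimal sketch that samples one coordinate while respecting the row stride:
// Sketch: read a single depth value using the buffer's actual row stride.
private func depthAt(x: Int, y: Int, in depthMap: CVPixelBuffer) -> Float32? {
    CVPixelBufferLockBaseAddress(depthMap, .readOnly)
    defer { CVPixelBufferUnlockBaseAddress(depthMap, .readOnly) }
    guard x >= 0, y >= 0,
          x < CVPixelBufferGetWidth(depthMap),
          y < CVPixelBufferGetHeight(depthMap),
          let baseAddress = CVPixelBufferGetBaseAddress(depthMap) else { return nil }
    let bytesPerRow = CVPixelBufferGetBytesPerRow(depthMap)
    // Advance to the start of row y in bytes, then reinterpret that row as Float32 values.
    let row = baseAddress.advanced(by: y * bytesPerRow).assumingMemoryBound(to: Float32.self)
    return row[x]
}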

Core Image: merge two CIImage Swift

So I have two CIImages that I want to merge together, each with an alpha of 0.5. How can I do it?
I tried the following code, but the resulting image is not the correct size and the two images aren't aligned correctly... Please help!
if let image = CIImage(contentsOf: imageURL) {
    let randomFilter = CIFilter(name: "CIRandomGenerator")
    let noiseImage = randomFilter!.outputImage!.cropped(to: CGRect(x: CGFloat(Int.random(in: 1..<1000)),
                                                                   y: CGFloat(Int.random(in: 1..<1000)),
                                                                   width: image.extent.width,
                                                                   height: image.extent.height))
    let compoimg = noiseImage.composited(over: image) // Misaligned image
}
The problem lies in the random noise generator: by its nature, the noise is cropped from an infinite noise field, so the crop offset has to be compensated with a translation. The corrected code applies that translation:
if let image = CIImage(contentsOf: imageURL) {
    let randomFilter = CIFilter(name: "CIRandomGenerator")
    let randX = CGFloat(Int.random(in: 0..<1000))
    let randY = CGFloat(Int.random(in: 0..<1000))
    let noiseImage = randomFilter!.outputImage!.cropped(to: CGRect(x: randX, y: randY,
                                                                   width: image.extent.width,
                                                                   height: image.extent.height))
    let tt = noiseImage.transformed(by: CGAffineTransform(translationX: -randX, y: -randY))
    let compoimg = tt.composited(over: image) // Correctly aligned image
}
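The fix above handles the alignment; for the "each with an alpha of 0.5" part of the question, a 50/50 blend can be sketched with CIDissolveTransition at inputTime 0.5 (my suggestion, not from the original answer), assuming both images cover the same area:
// Sketch: even 50/50 blend of two CIImages via CIDissolveTransition.
func blendEvenly(_ a: CIImage, _ b: CIImage) -> CIImage? {
    // Shift b so its origin matches a's before blending.
    let alignedB = b.transformed(by: CGAffineTransform(translationX: a.extent.origin.x - b.extent.origin.x,
                                                       y: a.extent.origin.y - b.extent.origin.y))
    guard let dissolve = CIFilter(name: "CIDissolveTransition") else { return nil }
    dissolve.setValue(a, forKey: kCIInputImageKey)
    dissolve.setValue(alignedB, forKey: kCIInputTargetImageKey)
    dissolve.setValue(0.5, forKey: kCIInputTimeKey)
    return dissolve.outputImage?.cropped(to: a.extent)
}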

Swift 3 - How do I improve image quality for Tesseract?

I am using Swift 3 to build a mobile app that allows the user to take a picture and run Tesseract OCR over the resulting image.
However, I've been trying to increase the quality of the scan, and it doesn't seem to be helping much. I've segmented the photo into a more "zoomed in" region that I want to recognize, and even tried making it black and white. Are there any strategies for "enhancing" or optimizing the picture quality/size so that Tesseract can recognize it better? Thanks!
tesseract.image = // the camera photo here
tesseract.recognize()
print(tesseract.recognizedText)
I got these errors and have no idea what to do:
Error in pixCreateHeader: depth must be {1, 2, 4, 8, 16, 24, 32}
Error in pixCreateNoInit: pixd not made
Error in pixCreate: pixd not made
Error in pixGetData: pix not defined
Error in pixGetWpl: pix not defined
2017-03-11 22:22:30.019717 ProjectName[34247:8754102] Cannot convert image to Pix with bpp = 64
Error in pixSetYRes: pix not defined
Error in pixGetDimensions: pix not defined
Error in pixGetColormap: pix not defined
Error in pixClone: pixs not defined
Error in pixGetDepth: pix not defined
Error in pixGetWpl: pix not defined
Error in pixGetYRes: pix not defined
Please call SetImage before attempting recognition.Please call SetImage before attempting recognition.2017-03-11 22:22:30.026605 EOB-Reader[34247:8754102] No recognized text. Check that -[Tesseract setImage:] is passed an image bigger than 0x0.
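An aside that isn't in the original thread: the "Cannot convert image to Pix with bpp = 64" line usually means the UIImage handed to Tesseract is 16 bits per channel (64 bits per pixel), which the wrapper can't convert. Redrawing the photo into a plain bitmap context, much like the scaling code in the answer below, normally produces a standard 8-bit-per-channel image. A minimal sketch:
// Sketch: redraw into a standard bitmap context to drop a 64 bpp photo to 8 bits per channel.
func redrawnAs8Bit(_ image: UIImage) -> UIImage {
    UIGraphicsBeginImageContextWithOptions(image.size, true, image.scale)
    image.draw(in: CGRect(origin: .zero, size: image.size))
    let redrawn = UIGraphicsGetImageFromCurrentImageContext() ?? image
    UIGraphicsEndImageContext()
    return redrawn
}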
I've been using Tesseract fairly successfully in Swift 3 using the following:
func performImageRecognition(_ image: UIImage) {
    let tesseract = G8Tesseract(language: "eng")
    var textFromImage: String?
    tesseract?.engineMode = .tesseractCubeCombined
    tesseract?.pageSegmentationMode = .singleBlock
    tesseract?.image = image // use the image passed in rather than reading imageView.image again
    tesseract?.recognize()
    textFromImage = tesseract?.recognizedText
    print(textFromImage!)
}
I also found that pre-processing the image helped. I added the following extension to UIImage:
import UIKit
import CoreImage

extension UIImage {
    func toGrayScale() -> UIImage {
        let greyImage = UIImageView()
        greyImage.image = self
        let context = CIContext(options: nil)
        let currentFilter = CIFilter(name: "CIPhotoEffectNoir")
        currentFilter!.setValue(CIImage(image: greyImage.image!), forKey: kCIInputImageKey)
        let output = currentFilter!.outputImage
        let cgimg = context.createCGImage(output!, from: output!.extent)
        let processedImage = UIImage(cgImage: cgimg!)
        greyImage.image = processedImage
        return greyImage.image!
    }

    func binarise() -> UIImage {
        let glContext = EAGLContext(api: .openGLES2)!
        let ciContext = CIContext(eaglContext: glContext, options: [kCIContextOutputColorSpace: NSNull()])
        let filter = CIFilter(name: "CIPhotoEffectMono")
        filter!.setValue(CIImage(image: self), forKey: "inputImage")
        let outputImage = filter!.outputImage
        let cgimg = ciContext.createCGImage(outputImage!, from: (outputImage?.extent)!)
        return UIImage(cgImage: cgimg!)
    }

    func scaleImage() -> UIImage {
        let maxDimension: CGFloat = 640
        var scaledSize = CGSize(width: maxDimension, height: maxDimension)
        var scaleFactor: CGFloat
        if self.size.width > self.size.height {
            scaleFactor = self.size.height / self.size.width
            scaledSize.width = maxDimension
            scaledSize.height = scaledSize.width * scaleFactor
        } else {
            scaleFactor = self.size.width / self.size.height
            scaledSize.height = maxDimension
            scaledSize.width = scaledSize.height * scaleFactor
        }
        UIGraphicsBeginImageContext(scaledSize)
        self.draw(in: CGRect(x: 0, y: 0, width: scaledSize.width, height: scaledSize.height))
        let scaledImage = UIGraphicsGetImageFromCurrentImageContext()
        UIGraphicsEndImageContext()
        return scaledImage!
    }

    func orientate(img: UIImage) -> UIImage {
        if img.imageOrientation == UIImageOrientation.up {
            return img
        }
        UIGraphicsBeginImageContextWithOptions(img.size, false, img.scale)
        let rect = CGRect(x: 0, y: 0, width: img.size.width, height: img.size.height)
        img.draw(in: rect)
        let normalizedImage: UIImage = UIGraphicsGetImageFromCurrentImageContext()!
        UIGraphicsEndImageContext()
        return normalizedImage
    }
}
And then called this before passing the image to performImageRecognition:
func processImage() {
    self.imageView.image! = self.imageView.image!.toGrayScale()
    self.imageView.image! = self.imageView.image!.binarise()
    self.imageView.image! = self.imageView.image!.scaleImage()
}
Hope this helps
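One variation to consider (my assumption, not what the answer above runs): orientate(img:) is defined in the extension but never called in processImage, so a call order that normalizes orientation before the other passes could look like this:
func processImage() {
    guard var photo = imageView.image else { return }
    // Hypothetical ordering: fix orientation first, then grayscale, binarise and scale.
    photo = photo.orientate(img: photo)
    photo = photo.toGrayScale().binarise().scaleImage()
    imageView.image = photo
}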

CGImageCreateWithImageInRect() returning nil

I'm trying to crop an image into a square, but the line where I actually do the crop with CGImageCreateWithImageInRect() crashes. I set breakpoints and made sure that the arguments passed into this function are not nil.
I'm fairly new to programming and Swift, but have searched around and haven't found any solution to my problem.
The failure reason:
fatal error: unexpectedly found nil while unwrapping an Optional value
func cropImageToSquare(imageData: NSData) -> NSData {
    let image = UIImage(data: imageData)
    let contextImage: UIImage = UIImage(CGImage: image!.CGImage!)
    let contextSize: CGSize = contextImage.size
    let imageDimension: CGFloat = contextSize.height
    let posY: CGFloat = (contextSize.height + (contextSize.width - contextSize.height) / 2)
    let rect: CGRect = CGRectMake(0, posY, imageDimension, imageDimension)
    // error on the line below: fatal error: unexpectedly found nil while unwrapping an Optional value
    let imageRef: CGImageRef = CGImageCreateWithImageInRect(contextImage.CGImage, rect)!
    let croppedImage: UIImage = UIImage(CGImage: imageRef, scale: 1.0, orientation: image!.imageOrientation)
    let croppedImageData = UIImageJPEGRepresentation(croppedImage, 1.0)
    return croppedImageData!
}
Your code uses a lot of force-unwrapping with !s. I would recommend avoiding this — the compiler is trying to help you write code that won't crash. Use optional chaining with ?, and if let / guard let, instead.
The ! on that particular line is hiding an issue where CGImageCreateWithImageInRect might return nil. The documentation explains that this happens when the rect is not correctly inside the image bounds. Your code works for images in portrait orientation, but not landscape.
Furthermore, there's a convenient function provided by AVFoundation which can automatically find the right rectangle for you to use, called AVMakeRectWithAspectRatioInsideRect. No need to do the calculations manually :-)
Here's what I would recommend:
import AVFoundation

extension UIImage
{
    func croppedToSquare() -> UIImage
    {
        guard let cgImage = self.CGImage else { return self }

        // Note: self.size depends on self.imageOrientation, so we use CGImageGetWidth/Height here.
        let boundingRect = CGRect(
            x: 0, y: 0,
            width: CGImageGetWidth(cgImage),
            height: CGImageGetHeight(cgImage))

        // Crop to square (1:1 aspect ratio) and round the resulting rectangle to integer coordinates.
        var cropRect = AVMakeRectWithAspectRatioInsideRect(CGSize(width: 1, height: 1), boundingRect)
        cropRect.origin.x = ceil(cropRect.origin.x)
        cropRect.origin.y = ceil(cropRect.origin.y)
        cropRect.size.width = floor(cropRect.size.width)
        cropRect.size.height = floor(cropRect.size.height)

        guard let croppedImage = CGImageCreateWithImageInRect(cgImage, cropRect) else {
            assertionFailure("cropRect \(cropRect) was not inside \(boundingRect)")
            return self
        }

        return UIImage(CGImage: croppedImage, scale: self.scale, orientation: self.imageOrientation)
    }
}

// then:
let croppedImage = myUIImage.croppedToSquare()
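For readers on Swift 3 or later, the same idea ports roughly as follows (a sketch assuming only the API names changed): CGImageCreateWithImageInRect becomes cgImage.cropping(to:) and AVMakeRectWithAspectRatioInsideRect becomes AVMakeRect(aspectRatio:insideRect:).
import AVFoundation
import UIKit

extension UIImage {
    func croppedToSquareModern() -> UIImage {
        guard let cgImage = self.cgImage else { return self }
        let boundingRect = CGRect(x: 0, y: 0, width: cgImage.width, height: cgImage.height)
        // 1:1 rect centred inside the image bounds, rounded inwards to whole pixels.
        let raw = AVMakeRect(aspectRatio: CGSize(width: 1, height: 1), insideRect: boundingRect)
        let cropRect = CGRect(x: ceil(raw.origin.x), y: ceil(raw.origin.y),
                              width: floor(raw.size.width), height: floor(raw.size.height))
        guard let cropped = cgImage.cropping(to: cropRect) else { return self }
        return UIImage(cgImage: cropped, scale: scale, orientation: imageOrientation)
    }
}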

Tesseract OCR w/ iOS & Swift returns error or gibberish

I used this tutorial to get Tesseract OCR working with Swift: http://www.piterwilson.com/blog/2014/10/18/minimal-tesseact-ocr-setup-in-swift/
It works fine if I upload the demo image and call
tesseract.image = UIImage(named: "image_sample.jpg");
But if I use my camera code and take a picture of that same image and call
tesseract.image = self.image.blackAndWhite();
the result is either gibberish like
s I 5E251 :Ec
‘-. —7.//:E*髧
a g :_{:7 IC‘
J 7 iii—1553‘
: fizzle —‘;-—:
; ~:~./: -:-‘-
‘- :~£:': _-'~‘:
: 37%; §:‘—_
: ::::E 7,;.
1f:,:~ ——,
Or it returns an EXC_BAD_ACCESS error. I haven't been able to work out why it gives the error or the gibberish. This is the code for my camera capture (photoTaken()) and the processing step (nextStepTapped()):
@IBAction func photoTaken(sender: UIButton) {
    var videoConnection = stillImageOutput.connectionWithMediaType(AVMediaTypeVideo)
    if videoConnection != nil {
        // Show next step button
        self.view.bringSubviewToFront(self.nextStep)
        self.nextStep.hidden = false
        // Secure image
        stillImageOutput.captureStillImageAsynchronouslyFromConnection(videoConnection) {
            (imageDataSampleBuffer, error) -> Void in
            var imageData = AVCaptureStillImageOutput.jpegStillImageNSDataRepresentation(imageDataSampleBuffer)
            self.image = UIImage(data: imageData)
            //var dataProvider = CGDataProviderCreateWithCFData(imageData)
            //var cgImageRef = CGImageCreateWithJPEGDataProvider(dataProvider, nil, true, kCGRenderingIntentDefault)
            //self.image = UIImage(CGImage: cgImageRef, scale: 1.0, orientation: UIImageOrientation.Right)
        }
        // Freeze camera preview
        captureSession.stopRunning()
    }
}
@IBAction func nextStepTapped(sender: UIButton) {
    // Save to camera roll & proceed
    //UIImageWriteToSavedPhotosAlbum(self.image.blackAndWhite(), nil, nil, nil)
    //UIImageWriteToSavedPhotosAlbum(self.image, nil, nil, nil)

    // OCR
    var tesseract: Tesseract = Tesseract()
    tesseract.language = "eng"
    tesseract.delegate = self
    tesseract.image = self.image.blackAndWhite()
    tesseract.recognize()
    NSLog("%@", tesseract.recognizedText)
}
The image saves to the Camera Roll and is completely legible if I uncomment the commented lines. Not sure why it won't work. It has no problem reading the text on the image if it's uploaded directly into Xcode as a supporting file, but if I take a picture of the exact same image on my screen then it can't read it.
Stumbled upon this tutorial: http://www.raywenderlich.com/93276/implementing-tesseract-ocr-ios
It happened to mention scaling the image. They chose the max dimension as 640. I was taking my pictures as 640x480, so I figured I didn't need to scale them, but I think this code essentially redraws the image. For some reason now my photos OCR fairly well. I still need to work on image processing for smaller text, but it works perfectly for large text. Run my image through this scaling function and I'm good to go.
func scaleImage(image: UIImage, maxDimension: CGFloat) -> UIImage {
    var scaledSize = CGSize(width: maxDimension, height: maxDimension)
    var scaleFactor: CGFloat
    if image.size.width > image.size.height {
        scaleFactor = image.size.height / image.size.width
        scaledSize.width = maxDimension
        scaledSize.height = scaledSize.width * scaleFactor
    } else {
        scaleFactor = image.size.width / image.size.height
        scaledSize.height = maxDimension
        scaledSize.width = scaledSize.height * scaleFactor
    }
    UIGraphicsBeginImageContext(scaledSize)
    image.drawInRect(CGRectMake(0, 0, scaledSize.width, scaledSize.height))
    let scaledImage = UIGraphicsGetImageFromCurrentImageContext()
    UIGraphicsEndImageContext()
    return scaledImage
}
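For context, a hypothetical call site (blackAndWhite() is the questioner's existing helper) that runs the capture through the scaling function before recognition:
// Sketch: scale the capture down first, then OCR it.
let scaled = scaleImage(self.image, maxDimension: 640)
tesseract.image = scaled.blackAndWhite()
tesseract.recognize()
NSLog("%@", tesseract.recognizedText)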
