I've set up an AVCaptureSession with a video data output and am attempting to use iOS 11's Vision framework to read QR codes. The camera is set up the way any AVCaptureSession usually is, so I'll abbreviate and show only the output setup.
let output = AVCaptureVideoDataOutput()
output.setSampleBufferDelegate(self, queue: captureQueue)
captureSession.addOutput(output)
// I did this to get the CVPixelBuffer to be oriented in portrait.
// I don't know if it's needed and I'm not sure it matters anyway.
output.connection(with: .video)!.videoOrientation = .portrait
So the camera is up and running as usual. Here is the code I am using to run a VNDetectBarcodesRequest for QR codes through a VNImageRequestHandler:
func captureOutput(_ output: AVCaptureOutput, didOutput sampleBuffer: CMSampleBuffer, from connection: AVCaptureConnection) {
    guard let pixelBuffer = CMSampleBufferGetImageBuffer(sampleBuffer) else { return }
    let imageRequestHandler = VNImageRequestHandler(cvPixelBuffer: pixelBuffer, orientation: .up, options: [:])
    let qrRequest = VNDetectBarcodesRequest { request, error in
        let barcodeObservations = request.results as? [VNBarcodeObservation]
        guard let qrCode = barcodeObservations?.flatMap({ $0.barcodeDescriptor as? CIQRCodeDescriptor }).first else { return }
        if let code = String(data: qrCode.errorCorrectedPayload, encoding: .isoLatin1) {
            debugPrint(code)
        }
    }
    qrRequest.symbologies = [.QR]
    try! imageRequestHandler.perform([qrRequest])
}
I am using a QR code that encodes http://www.google.com as a test. The debugPrint line prints out:
AVGG\u{03}¢ò÷wwrævöövÆRæ6öÐì\u{11}ì
I have tested this same QR code with the AVCaptureMetadataOutput that has been around for a while and that method decodes the QR code correctly. So my question is, what have I missed to get the output that I am getting?
(Obviously I could just use the AVCaptureMetadataOutput as a solution, because I can see that it works. But that doesn't help me learn how to use the Vision framework.)
Most likely the problem is here:
if let code = String(data: qrCode.errorCorrectedPayload, encoding: .isoLatin1)
Try using .utf8 instead.
Also, I would suggest looking at the raw output of errorCorrectedPayload without applying any encoding. It may already be in the correct encoding.
The definition of errorCorrectedPayload says:
-- QR Codes are formally specified in ISO/IEC 18004:2006(E). Section 6.4.10 "Bitstream to codeword conversion" specifies the set of 8-bit codewords in the symbol immediately prior to splitting the message into blocks and applying error correction. --
Reading VNBarcodeObservation.payloadStringValue seems to work fine, instead of converting VNBarcodeObservation.barcodeDescriptor by hand.
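As an illustration of why the ISO Latin-1 conversion prints gibberish: the quoted definition says errorCorrectedPayload holds the raw codewords, and in a byte-mode QR code those begin with a 4-bit mode indicator followed by a character count, so the text ends up shifted by half a byte. The sketch below is a rough, hypothetical helper (decodeByteModeSegment is not part of Vision or Core Image) that undoes that shift. It assumes a single byte-mode segment, a version 1 to 9 symbol (8-bit count) and UTF-8 compatible text. The sample bytes are the Latin-1 values of the output printed in the question, with one non-printing byte (0x87) assumed where the printed string appears to have dropped it.

```swift
/// Decodes one byte-mode segment from raw QR data codewords.
/// Hypothetical helper: assumes a version 1-9 symbol and a single segment.
func decodeByteModeSegment(_ codewords: [UInt8]) -> String? {
    // We need at least the 4-bit mode indicator plus the 8-bit character count.
    guard codewords.count >= 2 else { return nil }
    // The high nibble of the first codeword is the mode; 0b0100 means byte mode.
    guard codewords[0] >> 4 == 0b0100 else { return nil }
    // The 8-bit count straddles the first two codewords.
    let count = Int((codewords[0] & 0x0F) << 4 | codewords[1] >> 4)
    guard codewords.count >= count + 2 else { return nil }
    var bytes = [UInt8]()
    bytes.reserveCapacity(count)
    for i in 0..<count {
        // Each data byte is the low nibble of one codeword plus the high nibble of the next.
        bytes.append((codewords[i + 1] & 0x0F) << 4 | codewords[i + 2] >> 4)
    }
    return String(decoding: bytes, as: UTF8.self)
}

// Latin-1 byte values of the string printed in the question (0x87 is assumed, see above).
let sample: [UInt8] = [
    0x41, 0x56, 0x87, 0x47, 0x47, 0x03, 0xA2, 0xF2, 0xF7, 0x77, 0x77, 0x72, 0xE6,
    0x76, 0xF6, 0xF6, 0x76, 0xC6, 0x52, 0xE6, 0x36, 0xF6, 0xD0, 0xEC, 0x11, 0xEC
]
print(decodeByteModeSegment(sample) ?? "not a byte-mode segment")
// prints "http://www.google.com"
```

In the delegate callback this could be called as decodeByteModeSegment([UInt8](qrCode.errorCorrectedPayload)). For real use, payloadStringValue remains the simpler option, since it also covers the other encoding modes.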
I am developing an application that can scan barcodes, but some characters cause problems for me. The same problem occurred on Android and I fixed it there, but I can't fix it in Swift in the same fashion.
I have tried multiple libraries and native ways to generate a Code 128 barcode image from a provided String. It works for everything except special characters like '¿'. I have tried everything I found after googling the problem, but I still could not fix it.
extension UIImage {
    convenience init?(barcode: String) {
        let data = barcode.data(using: .ascii)
        guard let filter = CIFilter(name: "CICode128BarcodeGenerator") else {
            return nil
        }
        filter.setValue(data, forKey: "inputMessage")
        guard let ciImage = filter.outputImage else {
            return nil
        }
        self.init(ciImage: ciImage)
    }
}
let barcode = UIImage(barcode: "some text")
Everything works fine when scanning this exact barcode image from a card and saving the value. The scanner even reports that ";038388¿" is of type code128, but when I try to generate a Code 128 barcode image from that value, it fails on the "¿" character.
Code128 is defined as only capable of encoding ASCII, but ASCII does not have the "¿" character.
For that string, the conversion let data = barcode.data(using: .ascii) fails and returns nil.
I would recommend catching this early, using code like:
guard let data = barcode.data(using: .ascii) else {
    return nil
}
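To see which characters are blocking the conversion before handing anything to the filter, a small check along these lines may help. This is only a sketch: nonASCIICharacters(in:) is a hypothetical helper name, not part of Core Image, and it uses nothing beyond the Swift standard library.

```swift
/// Returns the characters in `text` that cannot be represented in ASCII.
/// Hypothetical helper for illustration only.
func nonASCIICharacters(in text: String) -> [Character] {
    // Keep only the characters that are not ASCII.
    return Array(text.filter { !$0.isASCII })
}

let offending = nonASCIICharacters(in: ";038388¿")
if offending.isEmpty {
    print("Safe to encode as ASCII")
} else {
    print("Cannot encode as ASCII: \(offending)")
}
```

Logging the offending characters makes it easier to decide whether to reject the input or strip those characters before generating the barcode.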
I am working on a function in my app to write images from my sample buffer to an AVAssetWriter. Curiously, this works fine on a 10.5" iPad Pro, but causes a crash on a 7.9" iPad Mini 2. I can't fathom how the same code could be problematic on two different devices. Here's my code:
func captureOutput(_ output: AVCaptureOutput, didOutput sampleBuffer: CMSampleBuffer, from connection: AVCaptureConnection) {
    // Setup the pixel buffer image
    let pixelBuffer = CMSampleBufferGetImageBuffer(sampleBuffer)!
    // Setup the format description
    let formatDescription = CMSampleBufferGetFormatDescription(sampleBuffer)!
    // Setup the current video dimensions
    self.currentVideoDimensions = CMVideoFormatDescriptionGetDimensions(formatDescription)
    // Setup the current sample time
    self.currentSampleTime = CMSampleBufferGetOutputPresentationTimeStamp(sampleBuffer)
    // Handle record
    if self.isCapturing {
        // Setup auto release pool
        autoreleasepool {
            // Setup the output image
            let outputImage = CIImage(cvPixelBuffer: pixelBuffer)
            // Ensure the video writer is ready for more data
            if self.videoWriter?.assetWriterPixelBufferInput?.assetWriterInput.isReadyForMoreMediaData == true {
                // Setup the new pixel buffer (THIS IS WHERE THE ERROR OCCURS)
                var newPixelBuffer: CVPixelBuffer? = nil
                // Setup the pixel buffer pool
                CVPixelBufferPoolCreatePixelBuffer(nil, (self.videoWriter?.assetWriterPixelBufferInput!.pixelBufferPool!)!, &newPixelBuffer)
                // Render the image to context
                self.context.render(outputImage, to: newPixelBuffer!, bounds: outputImage.extent, colorSpace: nil)
                // Setup a success case
                let success = self.videoWriter?.assetWriterPixelBufferInput?.append(newPixelBuffer!, withPresentationTime: self.currentSampleTime!)
                // Ensure the success case exists
                guard let mySuccess = success else { return }
                // If unsuccessful, log
                if !mySuccess {
                    print("Error with the sample buffer. Check for dropped frames.")
                }
            }
        }
    }
}
I receive an error that newPixelBuffer is nil, but again, only on a 7.9" iPad. The iPad Pro functions without any errors. Any thoughts? Thanks!
I eventually resolved this issue by tracing the problem back to my chosen codec in my Asset Writer's video output settings. I had my codec set to:
let codec: AVVideoCodecType = AVVideoCodecType.hevc
In doing some research, I found this article, which indicates that only certain devices can capture media in HEVC. As my first device was a 10.5" iPad Pro, it captured media with no problem. My second device was an iPad Mini, which resulted in the original problem occurring each time I tried to capture.
I have since changed my codec choice to:
let codec: AVVideoCodecType = AVVideoCodecType.h264
The issue has now disappeared.
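Rather than hard-coding either codec, the choice could be made at run time. The sketch below assumes the session uses an AVCaptureVideoDataOutput and relies on its availableVideoCodecTypesForAssetWriter(writingTo:) method (iOS 11 and later) to report what the current device and configuration can encode. preferredCodec(from:) is a hypothetical helper, and whether HEVC actually appears in that list on a given device should be verified on hardware.

```swift
import AVFoundation

/// Picks HEVC when it is offered, otherwise falls back to H.264.
/// Hypothetical helper for illustration only.
func preferredCodec(from available: [AVVideoCodecType]) -> AVVideoCodecType {
    return available.contains(.hevc) ? .hevc : .h264
}

// Possible usage, assuming `videoDataOutput` is already attached to a configured session:
// let available = videoDataOutput.availableVideoCodecTypesForAssetWriter(writingTo: .mov)
// let codec = preferredCodec(from: available)
```

Keeping the decision in a small pure function makes the fallback easy to check without a device that lacks HEVC support.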
I'm trying to take two images using the camera, and align them using the iOS Vision framework:
func align(firstImage: CIImage, secondImage: CIImage) {
    let request = VNTranslationalImageRegistrationRequest(
        targetedCIImage: firstImage) {
        request, error in
        if error != nil {
            fatalError()
        }
        let observation = request.results!.first
            as! VNImageTranslationAlignmentObservation
        // Parameters are immutable, so keep the transformed image in a new constant.
        let alignedImage = secondImage.transformed(
            by: observation.alignmentTransform)
        let compositedImage = firstImage.applyingFilter(
            "CIAdditionCompositing",
            parameters: ["inputBackgroundImage": alignedImage])
        // Save the compositedImage to the photo library.
    }
    try! visionHandler.perform([request], on: secondImage)
}
let visionHandler = VNSequenceRequestHandler()
But this produces grossly mis-aligned images:
You can see that I've tried three different types of scenes — a close-up subject, an indoor scene, and an outdoor scene. I tried more outdoor scenes, and the result is the same in almost every one of them.
I was expecting a slight misalignment at worst, but not such a complete misalignment. What is going wrong?
I'm not passing the orientation of the images into the Vision framework, but that shouldn't be a problem for aligning images. It's a problem only for things like face detection, where a rotated face isn't detected as a face. In any case, the output images have the correct orientation, so orientation is not the problem.
My compositing code is working correctly. It's only the Vision framework that's a problem. If I remove the calls to the Vision framework and put the phone on a tripod, the composition works perfectly. There's no misalignment. So the problem is the Vision framework.
This is on iPhone X.
How do I get Vision framework to work correctly? Can I tell it to use gyroscope, accelerometer and compass data to improve the alignment?
You should pass secondImage as the targeted image when creating the request, and perform the handler on firstImage.
I kept your compositing approach.
Check out this example from MLBoy:
let request = VNTranslationalImageRegistrationRequest(targetedCIImage: image2, options: [:])
let handler = VNImageRequestHandler(ciImage: image1, options: [:])
do {
    try handler.perform([request])
} catch let error {
    print(error)
}
guard let observation = request.results?.first as? VNImageTranslationAlignmentObservation else { return }
let alignmentTransform = observation.alignmentTransform
image2 = image2.transformed(by: alignmentTransform)
let compositedImage = image1.applyingFilter("CIAdditionCompositing", parameters: ["inputBackgroundImage": image2])
I am trying to use the Google Mobile Vision API to detect when a user smiles in the camera feed. The problem is that the Google Mobile Vision API is not detecting any faces, while Apple's Vision API immediately recognizes and tracks any face that I test my app with. I am doing the detection in func captureOutput(_ captureOutput: AVCaptureOutput!, didOutputSampleBuffer sampleBuffer: CMSampleBuffer!, from connection: AVCaptureConnection!) { }. How would I fix my code to get Google's API to work as well? What am I doing wrong?
My Code...
var options = [GMVDetectorFaceTrackingEnabled: true, GMVDetectorFaceLandmarkType: GMVDetectorFaceLandmark.all.rawValue, GMVDetectorFaceMinSize: 0.15] as [String : Any]
var GfaceDetector = GMVDetector.init(ofType: GMVDetectorTypeFace, options: options)
extension ViewController: AVCaptureVideoDataOutputSampleBufferDelegate {
    func captureOutput(_ captureOutput: AVCaptureOutput!, didOutputSampleBuffer sampleBuffer: CMSampleBuffer!, from connection: AVCaptureConnection!) {
        let pixelBuffer = CMSampleBufferGetImageBuffer(sampleBuffer)
        let attachments = CMCopyDictionaryOfAttachments(kCFAllocatorDefault, sampleBuffer, kCMAttachmentMode_ShouldPropagate)
        let ciImage1 = CIImage(cvImageBuffer: pixelBuffer!, options: attachments as! [String : Any]?)
        let Gimage = UIImage(ciImage: ciImage1)
        var Gfaces = GfaceDetector?.features(in: Gimage, options: nil) as? [GMVFaceFeature]
        let options: [String : Any] = [CIDetectorImageOrientation: exifOrientation(orientation: UIDevice.current.orientation),
                                       CIDetectorSmile: true,
                                       CIDetectorEyeBlink: true]
        let allFeatures = faceDetector?.features(in: ciImage1, options: options)
        let formatDescription = CMSampleBufferGetFormatDescription(sampleBuffer)
        let cleanAperture = CMVideoFormatDescriptionGetCleanAperture(formatDescription!, false)
        var smilingProb = CGFloat()
        guard let features = allFeatures else { return }
        print("GFace \(Gfaces?.count)")
        //THE PRINT ABOVE RETURNS 0
        //MARK: ------ Google System Setup
        for face: GMVFaceFeature in Gfaces! {
            print("Google1")
            if face.hasSmilingProbability {
                print("Google \(face.smilingProbability)")
                smilingProb = face.smilingProbability
            }
        }
        for feature in features {
            if let faceFeature = feature as? CIFaceFeature {
                let faceRect = calculateFaceRect(facePosition: faceFeature.mouthPosition, faceBounds: faceFeature.bounds, clearAperture: cleanAperture)
                let featureDetails = ["has smile: \(faceFeature.hasSmile), \(smilingProb)",
                                      "has closed left eye: \(faceFeature.leftEyeClosed)",
                                      "has closed right eye: \(faceFeature.rightEyeClosed)"]
                update(with: faceRect, text: featureDetails.joined(separator: "\n"))
            }
        }
        if features.count == 0 {
            DispatchQueue.main.async {
                self.detailsView.alpha = 0.0
            }
        }
    }
}
UPDATE
I copied and pasted the Google Mobile Vision Detection Code into another app and it worked. The difference was that instead of constantly receiving frames, the app had only one image to analyze. Could this have something to do with how often I send a request or the format/quality of the CIImage?
ANOTHER UPDATE
I have identified an issue with how my app works. The image that the API receives is not upright, that is, not in line with the orientation of the phone. For example, if I hold my phone up in front of my face (in normal portrait mode), the image is rotated 90 degrees counterclockwise. I have absolutely no idea why this is happening, as the live camera preview looks normal. The Google docs say...
The face detector expects images and the faces in them to be in an upright orientation. If you need to rotate the image, pass in orientation information in the dictionary options with GMVDetectorImageOrientation key. The detector will rotate the images for you based on the orientation value.
New Question: (I believe the answer to either one of these questions will solve my problem)
A: How would I use the GMVDetectorImageOrientation key to set the orientation right?
B: How would I rotate the UIImage 90 degrees clockwise (NOT THE UIIMAGEVIEW)?
THIRD UPDATE
I have successfully rotated the image right side up, but Google Mobile Vision is still not detecting any faces. The image is a bit distorted, but I do not think the amount of distortion is affecting Google Mobile Vision's response. So...
How would I use the GMVDetectorImageOrientation key to set the orientation right?
ANY HELP/RESPONSE IS APPRECIATED.
I'm using AVCaptureSession to create a QR code scanner with AVCaptureMetadataOutput.
Everything is working as expected, but I want to put a graphical overlay on the scanner. As part of that, I'd like the scanner to fire only once the QR code is within a given section of the frame. Currently, it detects the QR code anywhere that's in the view, and I'd like it to trigger only when the code is in the middle of the screen.
Is this even possible? On the AVCaptureVideoPreviewLayer, I'm setting rectForMetadataOutputRectOfInterest, but it doesn't seem to be working. Maybe I'm doing this wrong?
Some insight would be great. Thanks in advance!
I have not done this, but I believe it can be achieved. When you receive the AVCaptureMetadataOutput callback, you can use the AVCaptureVideoPreviewLayer to get the QR code's frame within the view. That is useful when you want to draw a rectangle around the captured QR code.
func metadataOutput(_ output: AVCaptureMetadataOutput, didOutput metadataObjects: [AVMetadataObject], from connection: AVCaptureConnection) {
    guard let readableCode = metadataObjects.first as? AVMetadataMachineReadableCodeObject, let code = readableCode.stringValue else { return }
    if let barcodeObject = videoPreview?.transformedMetadataObject(for: readableCode) {
        qrCodeFrameView?.frame = barcodeObject.bounds
    }
    stopReading()
    didRead?(code)
}
You could then use that barcodeObject to check whether your rect contains the barcode, using CGRect's contains(_:) method:
if qrReader.frame.contains(barcodeObject.bounds) {
    stopReading()
    didRead?(code)
}
I think that could/should work.
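One thing to keep in mind with this approach: contains(_:) applied to another CGRect is true only when the whole barcode rectangle lies inside the scan area, so a code that straddles the edge is ignored. A minimal sketch with made-up coordinates (isFullyInside and the rectangles are hypothetical, for illustration only):

```swift
import Foundation

/// True only when the entire barcode rectangle lies inside the scan area.
/// Hypothetical helper for illustration only.
func isFullyInside(_ barcodeBounds: CGRect, scanArea: CGRect) -> Bool {
    return scanArea.contains(barcodeBounds)
}

// Made-up view coordinates: a 200x200 scan window and two candidate barcode frames.
let scanArea = CGRect(x: 60, y: 200, width: 200, height: 200)
let centered = CGRect(x: 100, y: 250, width: 100, height: 100)
let straddling = CGRect(x: 20, y: 250, width: 100, height: 100)

print(isFullyInside(centered, scanArea: scanArea))    // prints "true"
print(isFullyInside(straddling, scanArea: scanArea))  // prints "false"
```

If partially overlapping codes should also count, intersects(_:) is the looser check.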