How VNClassificationObservation identifier works in VNCoreMLRequest - ios

I'm new to Core ML model implementation. I'm working on an example of ARKit with Core ML. I created a Core ML model with a set of images. There are two folders: one folder (named 'James') in which I have 3 images of my friend from the front, left and right angles, and another folder (named 'Unknown') where I have some random faces. I kept the maximum iterations at 20, and for the augmented data options I selected only crop.
I have the following code where I'm integrating that model. My requirement is that when I scan my friend's face it should display his name, as labelled in the Core ML model, and when any other random face is scanned it should display "Unknown" or some text like "This is not me".
With this code, when I scan my friend's face it doesn't show his name as "James", and when I scan any random face other than my friend's it doesn't show "Unknown" either. Where exactly am I going wrong? What is the issue?
class ViewController: UIViewController {

    let sceneView = ARSCNView(frame: UIScreen.main.bounds)

    override func viewDidLoad() {
        super.viewDidLoad()
        sceneView.delegate = self
        sceneView.showsStatistics = true
        guard ARFaceTrackingConfiguration.isSupported else { return }
        let configuration = ARFaceTrackingConfiguration()
        configuration.isLightEstimationEnabled = true
        sceneView.session.run(configuration, options: [.resetTracking, .removeExistingAnchors])
        view.addSubview(sceneView)
    }
}
extension ViewController: ARSCNViewDelegate {

    func renderer(_ renderer: SCNSceneRenderer, nodeFor anchor: ARAnchor) -> SCNNode? {
        guard let device = sceneView.device else {
            return nil
        }
        let faceGeometry = ARSCNFaceGeometry(device: device)
        let node = SCNNode(geometry: faceGeometry)
        node.geometry?.firstMaterial?.fillMode = .lines
        return node
    }

    func renderer(_ renderer: SCNSceneRenderer, didUpdate node: SCNNode, for anchor: ARAnchor) {
        guard let faceAnchor = anchor as? ARFaceAnchor,
              let faceGeometry = node.geometry as? ARSCNFaceGeometry else {
            return
        }
        faceGeometry.update(from: faceAnchor.geometry)

        let text = SCNText(string: "", extrusionDepth: 2)
        let font = UIFont(name: "Avenir-Heavy", size: 20)
        text.font = font
        let material = SCNMaterial()
        material.diffuse.contents = UIColor.green
        text.materials = [material]
        text.firstMaterial?.isDoubleSided = true

        let textNode = SCNNode(geometry: faceGeometry)
        textNode.position = SCNVector3(-0.1, -0.1, -0.5)
        print(textNode.position)
        textNode.scale = SCNVector3(0.002, 0.002, 0.002)
        textNode.geometry = text

        guard let model = try? VNCoreMLModel(for: FaceRecognitionPerson_1().model) else {
            fatalError("Unable to load model")
        }
        let coreMlRequest = VNCoreMLRequest(model: model) { [weak self] request, error in
            guard let results = request.results as? [VNClassificationObservation],
                  let topResult = results.first
            else {
                fatalError("Unexpected results")
            }
            DispatchQueue.main.async { [weak self] in
                print("Identifier Received //: ===> \(topResult.identifier)")
                text.string = topResult.identifier
                if topResult.identifier != "James" {
                    print("**===Known User Detected**===")
                }
                self!.sceneView.scene.rootNode.addChildNode(textNode)
                self!.sceneView.autoenablesDefaultLighting = true
            }
        }
        guard let pixelBuffer = self.sceneView.session.currentFrame?.capturedImage else { return }
        let handler = VNImageRequestHandler(cvPixelBuffer: pixelBuffer, options: [:])
        DispatchQueue.global().async {
            do {
                try handler.perform([coreMlRequest])
            } catch {
                print(error)
            }
        }
    }
}
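Note that an image classifier always reports whichever class scored highest, so a face the model has never seen still comes back as "James" or "Unknown" with some probability attached. Below is a minimal sketch of how the top VNClassificationObservation is usually interpreted; it is not the original code above, and the "James" label, the "This is not me" string and the 0.8 threshold are assumptions taken from the question.

import Vision

// A minimal sketch (not the original code above): `model` is assumed to be the
// VNCoreMLModel built from FaceRecognitionPerson_1, and the label "James" and
// the 0.8 confidence threshold are assumptions based on the question.
func classifyFace(in pixelBuffer: CVPixelBuffer,
                  using model: VNCoreMLModel,
                  completion: @escaping (String) -> Void) {
    let request = VNCoreMLRequest(model: model) { request, _ in
        guard let results = request.results as? [VNClassificationObservation],
              let top = results.first else {
            completion("This is not me")
            return
        }
        // `identifier` is the class label from the training folders;
        // `confidence` is the model's probability (0.0 ... 1.0) for that label.
        if top.identifier == "James", top.confidence > 0.8 {
            completion(top.identifier)
        } else {
            completion("This is not me")
        }
    }
    request.imageCropAndScaleOption = .centerCrop
    let handler = VNImageRequestHandler(cvPixelBuffer: pixelBuffer, options: [:])
    DispatchQueue.global(qos: .userInitiated).async {
        do {
            try handler.perform([request])
        } catch {
            completion("This is not me")
        }
    }
}

Calling something like this from the didUpdate delegate method and assigning the returned string to text.string on the main queue would replace the direct use of topResult.identifier in the code above.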

Related

Saving and Loading ARWorldMap in RealityKit

I am using SwiftUI and RealityKit to save my world map, along with one entity (box) that I will place in the world.
Inside my Coordinator I have the following code to create a box and add it to the anchor entity.
@objc func onTap(_ recognizer: UITapGestureRecognizer) {
    guard let arView = arView else {
        return
    }
    let location = recognizer.location(in: arView)
    let results = arView.raycast(from: location, allowing: .estimatedPlane, alignment: .horizontal)
    if let result = results.first {
        let arAnchor = ARAnchor(name: "boxAnchor", transform: result.worldTransform)
        let anchorEntity = AnchorEntity(anchor: arAnchor)
        let box = ModelEntity(mesh: MeshResource.generateBox(size: 0.3), materials: [SimpleMaterial(color: .green, isMetallic: true)])
        arView.session.add(anchor: arAnchor)
        anchorEntity.addChild(box)
        arView.scene.addAnchor(anchorEntity)
    }
}
When I click the Save button, it saves the world map:
func saveWorldMap() {
    guard let arView = arView else {
        return
    }
    arView.session.getCurrentWorldMap { worldMap, error in
        if let error = error {
            print(error)
            return
        }
        if let worldMap = worldMap {
            guard let data = try? NSKeyedArchiver.archivedData(withRootObject: worldMap, requiringSecureCoding: true) else {
                return
            }
            // save the data into user defaults
            let userDefaults = UserDefaults.standard
            userDefaults.set(data, forKey: "worldMap")
            userDefaults.synchronize()
        }
    }
}
And finally, the loadWorldMap function is supposed to load the map with the anchors and the entities attached to them. Unfortunately, it does load the ARAnchors, but there are no entities attached to them. The main reason is that I attached the entity to the AnchorEntity and not to the ARAnchor. How can I save an entity, like the box, attached to the ARAnchor?
func loadWorldMap() {
    guard let arView = arView else {
        return
    }
    let userDefaults = UserDefaults.standard
    if let data = userDefaults.data(forKey: "worldMap") {
        print(data)
        print("loading world map")
        guard let worldMap = try? NSKeyedUnarchiver.unarchivedObject(ofClass: ARWorldMap.self, from: data) else {
            return
        }
        for anchor in worldMap.anchors {
            print(anchor.name)
        }
        let configuration = ARWorldTrackingConfiguration()
        configuration.initialWorldMap = worldMap
        configuration.planeDetection = .horizontal
        arView.session.run(configuration)
    }
}
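One common workaround, sketched below: ARWorldMap only serializes the ARAnchors, so the ModelEntity has to be rebuilt when the restored anchors are delivered to the session, keyed off the anchor name used in onTap. This is an assumption about how it could be wired up, and it presumes the Coordinator is also set as arView.session.delegate.

import ARKit
import RealityKit

// A sketch, not the original code: rebuild the entity for each restored anchor.
// Assumes `Coordinator` is assigned as `arView.session.delegate`.
extension Coordinator: ARSessionDelegate {
    func session(_ session: ARSession, didAdd anchors: [ARAnchor]) {
        guard let arView = arView else { return }
        for anchor in anchors where anchor.name == "boxAnchor" {
            // Re-create the same box that onTap originally attached.
            let anchorEntity = AnchorEntity(anchor: anchor)
            let box = ModelEntity(mesh: MeshResource.generateBox(size: 0.3),
                                  materials: [SimpleMaterial(color: .green, isMetallic: true)])
            anchorEntity.addChild(box)
            arView.scene.addAnchor(anchorEntity)
        }
    }
}

With this in place, loadWorldMap only has to restore the map; the delegate call re-attaches a box to every anchor named "boxAnchor", including those coming back from the saved world map.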

How to make a draggable UIView snap to the corners of the screen?

I have a draggable UIView and I am trying to make it snap to the four corners of the screen. I tried a few things, but none of them have worked. Here's the code that I have:
import UIKit
import AVKit
import Vision

class ViewController: UIViewController, AVCaptureVideoDataOutputSampleBufferDelegate {

    @IBOutlet weak var crystalName: UILabel!
    @IBOutlet weak var crystalInfoContainer: UIView!
    @IBOutlet weak var accuracy: UILabel!

    var model = IdenticrystClassification().model

    override func viewDidLoad() {
        super.viewDidLoad()

        // This method starts the camera.
        let captureSession = AVCaptureSession()
        guard let captureDevice = AVCaptureDevice.default(for: .video) else { return }
        guard let input = try? AVCaptureDeviceInput(device: captureDevice) else { return }
        captureSession.addInput(input)
        captureSession.startRunning()

        let previewLayer = AVCaptureVideoPreviewLayer(session: captureSession)
        previewLayer.videoGravity = .resizeAspectFill
        view.layer.addSublayer(previewLayer)
        previewLayer.frame = view.frame

        // This method defines the sub view and its properties.
        view.addSubview(crystalInfoContainer)
        crystalInfoContainer.clipsToBounds = true
        crystalInfoContainer.layer.cornerRadius = 10.0
        //crystalInfoContainer.layer.maskedCorners = [.layerMinXMinYCorner, .layerMaxXMinYCorner]

        // This method defines the torch functionality.
        func toggleTorch(on: Bool) {
            guard let device = AVCaptureDevice.default(for: .video) else { return }
            if device.hasTorch {
                do {
                    try device.lockForConfiguration()
                    if on == true {
                        device.torchMode = .on
                    } else {
                        device.torchMode = .off
                    }
                    device.unlockForConfiguration()
                } catch {
                    print("Torch could not be used")
                }
            } else {
                print("Torch is not available")
            }
        }

        // This is the code that I am trying to work out.
        func relativeVelocity(forVelocity velocity: CGFloat, from currentValue: CGFloat, to targetValue: CGFloat) -> CGFloat {
            guard currentValue - targetValue != 0 else { return 0 }
            return velocity / (targetValue - currentValue)
        }

        func nearestCorner(to point: CGPoint) -> CGPoint {
            var minDistance = CGFloat.greatestFiniteMagnitude
            var closestPosition = CGPoint.zero
            for position in crystalInfoContainer { // **Error1**
                let distance = point.distance(to: position)
                if distance < minDistance {
                    closestPosition = position
                    minDistance = distance
                }
            }
            return closestPosition

            let decelerationRate = UIScrollView.DecelerationRate.normal.rawValue
            let velocity = UIPanGestureRecognizer.velocity(in: view) // **Error2**
            let projectedPosition = CGPoint(
                x: crystalInfoContainer.center.x + project(initialVelocity: velocity.x, decelerationRate: decelerationRate),
                y: crystalInfoContainer.center.y + project(initialVelocity: velocity.y, decelerationRate: decelerationRate)
            )
            let nearestCornerPosition = nearestCorner(to: projectedPosition)
            let relativeInitialVelocity = CGVector(
                dx: relativeVelocity(forVelocity: velocity.x, from: crystalInfoContainer.center.x, to: nearestCornerPosition.x),
                dy: relativeVelocity(forVelocity: velocity.y, from: crystalInfoContainer.center.y, to: nearestCornerPosition.y)
            )
            let params = UISpringTimingParameters(damping: 1, response: 0.4, initialVelocity: relativeInitialVelocity)
            let animator = UIViewPropertyAnimator(duration: 0, timingParameters: params)
            animator.addAnimations {
                self.crystalInfoContainer.center = nearestCornerPosition
            }
            animator.startAnimation()
        }

        let dataOutput = AVCaptureVideoDataOutput()
        dataOutput.setSampleBufferDelegate(self, queue: DispatchQueue(label: "video"))
        captureSession.addOutput(dataOutput)

        toggleTorch(on: true)
    }

    // Handles Vision output.
    func captureOutput(_ output: AVCaptureOutput, didOutput sampleBuffer: CMSampleBuffer, from connection: AVCaptureConnection) {
        guard let pixelBuffer: CVPixelBuffer = CMSampleBufferGetImageBuffer(sampleBuffer) else { return }
        guard let model = try? VNCoreMLModel(for: model) else { return }
        let request = VNCoreMLRequest(model: model) { (finishedReq, err) in
            guard let results = finishedReq.results as? [VNClassificationObservation] else { return }
            guard let firstObservation = results.first else { return }
            let name: String = firstObservation.identifier
            let acc: Int = Int(firstObservation.confidence * 100)
            DispatchQueue.main.async {
                self.crystalName.text = name
                self.accuracy.text = "Confidence: \(acc)%"
            }
        }
        try? VNImageRequestHandler(cvPixelBuffer: pixelBuffer, options: [:]).perform([request])
    }

    override var prefersStatusBarHidden: Bool {
        return true
    }
}
Error1: For-in loop requires 'UIView?' to conform to 'Sequence'; did you mean to unwrap optional?
Error2: Instance member 'velocity' cannot be used on type 'UIPanGestureRecognizer'; did you mean to use a value of this type instead?
The problem is that your panView method is wrong. You need to switch on the gesture recognizer's state: began, changed, or ended. Pan only while the gesture changes. When the gesture ends, then and only then, animate the view to the nearest corner.
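A minimal sketch of that pattern follows. It is not the asker's code: it assumes crystalInfoContainer has a UIPanGestureRecognizer targeting this method on the view controller, and the corner inset, spring values and deceleration projection are illustrative choices.

private extension CGPoint {
    func distance(to other: CGPoint) -> CGFloat {
        return hypot(x - other.x, y - other.y)
    }
}

@objc func panView(_ recognizer: UIPanGestureRecognizer) {
    switch recognizer.state {
    case .began, .changed:
        // Follow the finger while the gesture is in progress.
        let translation = recognizer.translation(in: view)
        crystalInfoContainer.center.x += translation.x
        crystalInfoContainer.center.y += translation.y
        recognizer.setTranslation(.zero, in: view)

    case .ended, .cancelled:
        // Only when the gesture ends: project where the view would coast to,
        // then animate it to the nearest of the four corner positions.
        let velocity = recognizer.velocity(in: view)   // instance method, which is what Error2 is about
        let rate = UIScrollView.DecelerationRate.normal.rawValue
        let projected = CGPoint(
            x: crystalInfoContainer.center.x + velocity.x / 1000 * rate / (1 - rate),
            y: crystalInfoContainer.center.y + velocity.y / 1000 * rate / (1 - rate)
        )
        let inset: CGFloat = 80   // assumed distance of a corner position from the screen edge
        let corners = [
            CGPoint(x: inset, y: inset),
            CGPoint(x: view.bounds.maxX - inset, y: inset),
            CGPoint(x: inset, y: view.bounds.maxY - inset),
            CGPoint(x: view.bounds.maxX - inset, y: view.bounds.maxY - inset)
        ]
        // Iterating over explicit corner points avoids Error1 (a UIView is not a Sequence).
        let target = corners.min { $0.distance(to: projected) < $1.distance(to: projected) }!
        let params = UISpringTimingParameters(dampingRatio: 0.8, initialVelocity: .zero)
        let animator = UIViewPropertyAnimator(duration: 0.4, timingParameters: params)
        animator.addAnimations {
            self.crystalInfoContainer.center = target
        }
        animator.startAnimation()

    default:
        break
    }
}

One design note: the UISpringTimingParameters(damping:response:initialVelocity:) initializer in the question is not part of UIKit; it appears to come from Apple's "fluid interfaces" sample extension, so the sketch uses the built-in dampingRatio initializer instead.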

Detect text from an image and get the rect of an keyword

I'm trying to get all the text found in a UIImage using Vision and get the location of a keyword (if it exists) in the image. So far I've got this:
var detectedText = ""
var textRecognitionRequest = VNRecognizeTextRequest(completionHandler: nil)
let textRecognitionWorkQueue = DispatchQueue(label: "TextRecognitionQueue", qos: .userInitiated, attributes: [], autoreleaseFrequency: .workItem)

private func recognizeTextInImage(_ image: UIImage?) {
    guard let cgImage = image?.cgImage else { return }
    textRecognitionWorkQueue.async {
        let requestHandler = VNImageRequestHandler(cgImage: cgImage, options: [:])
        do {
            try requestHandler.perform([self.textRecognitionRequest])
        } catch {
            // You should handle errors appropriately in your app.
            print(error)
        }
    }
}
And in viewDidLoad:
override func viewDidLoad() {
    super.viewDidLoad()
    let imgData = object.scannedImage ?? Data()
    recognizeTextInImage(UIImage(data: imgData, scale: 1.0))
    textRecognitionRequest.recognitionLevel = .accurate
    textRecognitionRequest.usesLanguageCorrection = true
    textRecognitionRequest.recognitionLanguages = ["en-US"]
    textRecognitionRequest.customWords = ["KEYWORD"]
    textRecognitionRequest = VNRecognizeTextRequest { (request, error) in
        guard let observations = request.results as? [VNRecognizedTextObservation] else { return }
        for observation in observations {
            guard let topCandidate = observation.topCandidates(1).first else { return }
            self.detectedText += topCandidate.string
            self.detectedText += " "
            if topCandidate.string == "KEYWORD" {
                let boundingBox = observation.boundingBox
                guard let imageData = object.scannedImage else { return }
                let imgSize = UIImage(data: imageData)!.size
                let rect = CGRect(x: boundingBox.minX * imgSize.width,
                                  y: boundingBox.minY * imgSize.height,
                                  width: boundingBox.width * imgSize.width,
                                  height: boundingBox.height * imgSize.height)
                print(rect)
            }
        }
    }
}
But detecting the boundingBox really slows down the process of finding all the text in the image, it becomes quite inaccurate, and the print of the rect never gets called.
Is there a better way of doing this?
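One likely issue, judging from the order of the calls: recognizeTextInImage is performed before the request is configured, and textRecognitionRequest is then replaced by a brand-new request whose completion handler never runs; the comparison topCandidate.string == "KEYWORD" also only matches when an entire recognized line equals the keyword. Below is a sketch of an alternative arrangement (an assumption, not the asker's final code) where the request is fully built before it is performed and the keyword rect comes from the candidate's own boundingBox(for:).

import UIKit
import Vision

// A sketch: configure the request first, then perform it, and ask the
// recognized candidate for the keyword's own bounding box.
func findKeyword(_ keyword: String, in image: UIImage, completion: @escaping (CGRect?) -> Void) {
    guard let cgImage = image.cgImage else { completion(nil); return }

    let request = VNRecognizeTextRequest { request, _ in
        guard let observations = request.results as? [VNRecognizedTextObservation] else {
            completion(nil)
            return
        }
        for observation in observations {
            guard let candidate = observation.topCandidates(1).first else { continue }
            if let range = candidate.string.range(of: keyword),
               let box = try? candidate.boundingBox(for: range) {
                // Convert the normalized (0...1, bottom-left origin) rect
                // into pixel coordinates of the CGImage.
                let rect = VNImageRectForNormalizedRect(box.boundingBox, cgImage.width, cgImage.height)
                completion(rect)
                return
            }
        }
        completion(nil)
    }
    request.recognitionLevel = .accurate
    request.usesLanguageCorrection = true
    request.recognitionLanguages = ["en-US"]

    DispatchQueue.global(qos: .userInitiated).async {
        let handler = VNImageRequestHandler(cgImage: cgImage, options: [:])
        try? handler.perform([request])
    }
}

Note that the returned rect is still in Vision's bottom-left-origin coordinate space, so it may need flipping before being drawn over a UIKit view.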

Record depth map from iPhone as sequence

I want to create an application on iOS that can record and save RGB+Depth data. I have been able to capture both streams from the dual camera and preview them on the screen in real time. Now I want to save them as two sequences in the library (one RGB sequence and one depth-map sequence).
So my question is: how can I save this depth information to the iPhone gallery as a video or sequence, saving the RGB info at the same time, for future processing?
I am working with Xcode 10.2, Swift 5 and an iPhone XS.
import UIKit
import AVFoundation

class ViewController: UIViewController {

    @IBOutlet weak var previewView: UIImageView!
    @IBOutlet weak var previewModeControl: UISegmentedControl!

    var previewMode = PreviewMode.original // Original (RGB) or Depth
    let session = AVCaptureSession()
    let dataOutputQueue = DispatchQueue(label: "video data queue", qos: .userInitiated, attributes: [], autoreleaseFrequency: .workItem)
    var background: CIImage?
    var depthMap: CIImage?
    var scale: CGFloat = 0.0

    override func viewDidLoad() {
        super.viewDidLoad()
        previewMode = PreviewMode(rawValue: previewModeControl.selectedSegmentIndex) ?? .original
        configureCaptureSession()
        session.startRunning()
    }

    override var shouldAutorotate: Bool {
        return false
    }

    func configureCaptureSession() {
        session.beginConfiguration()

        // Add input to the session
        guard let camera = AVCaptureDevice.default(.builtInDualCamera, for: .video, position: .unspecified) else {
            fatalError("No depth video camera available")
        }
        session.sessionPreset = .photo
        do {
            let cameraInput = try AVCaptureDeviceInput(device: camera)
            if session.canAddInput(cameraInput) {
                session.addInput(cameraInput)
            } else {
                fatalError("Error adding input device to session")
            }
        } catch {
            fatalError(error.localizedDescription)
        }

        // Add video output to the session
        let videoOutput = AVCaptureVideoDataOutput()
        videoOutput.setSampleBufferDelegate(self, queue: dataOutputQueue)
        videoOutput.videoSettings = [kCVPixelBufferPixelFormatTypeKey as String: kCVPixelFormatType_32BGRA]
        if session.canAddOutput(videoOutput) {
            session.addOutput(videoOutput)
        } else {
            fatalError("Error adding output to session")
        }
        let videoConnection = videoOutput.connection(with: .video)
        videoConnection?.videoOrientation = .portrait

        // Add depth output to the session
        let depthOutput = AVCaptureDepthDataOutput()
        // Set the current view controller as the delegate for the new object
        depthOutput.setDelegate(self, callbackQueue: dataOutputQueue)
        depthOutput.isFilteringEnabled = true // take advantage of hole filling in the data
        if session.canAddOutput(depthOutput) {
            session.addOutput(depthOutput)
        } else {
            fatalError("Error adding output to session")
        }
        let depthConnection = depthOutput.connection(with: .depthData)
        depthConnection?.videoOrientation = .portrait

        let outputRect = CGRect(x: 0, y: 0, width: 1, height: 1)
        let videoRect = videoOutput.outputRectConverted(fromMetadataOutputRect: outputRect)
        let depthRect = depthOutput.outputRectConverted(fromMetadataOutputRect: outputRect)
        scale = max(videoRect.width, videoRect.height) / max(depthRect.width, depthRect.height)

        do {
            try camera.lockForConfiguration()
            if let frameDuration = camera.activeDepthDataFormat?.videoSupportedFrameRateRanges.first?.minFrameDuration {
                camera.activeVideoMinFrameDuration = frameDuration
            }
            camera.unlockForConfiguration()
        } catch {
            fatalError(error.localizedDescription)
        }

        session.commitConfiguration()
    }

    @IBAction func previewModeChanged(_ sender: UISegmentedControl) {
        previewMode = PreviewMode(rawValue: previewModeControl.selectedSegmentIndex) ?? .original
    }
}

extension ViewController: AVCaptureVideoDataOutputSampleBufferDelegate {

    func captureOutput(_ output: AVCaptureOutput, didOutput sampleBuffer: CMSampleBuffer, from connection: AVCaptureConnection) {
        let pixelBuffer = CMSampleBufferGetImageBuffer(sampleBuffer)
        let image = CIImage(cvPixelBuffer: pixelBuffer!)
        let previewImage: CIImage
        switch previewMode {
        case .original:
            previewImage = image
        case .depth:
            previewImage = depthMap ?? image
        //default:
        //    previewImage = image
        }
        let displayImage = UIImage(ciImage: previewImage)
        DispatchQueue.main.async { [weak self] in
            self?.previewView.image = displayImage
        }
    }
}

extension ViewController: AVCaptureDepthDataOutputDelegate {

    func depthDataOutput(_ output: AVCaptureDepthDataOutput, didOutput depthData: AVDepthData, timestamp: CMTime, connection: AVCaptureConnection) {
        if previewMode == .original {
            return
        }
        var convertedDepth: AVDepthData
        if depthData.depthDataType != kCVPixelFormatType_DisparityFloat32 {
            convertedDepth = depthData.converting(toDepthDataType: kCVPixelFormatType_DisparityFloat32)
        } else {
            convertedDepth = depthData
        }
        let pixelBuffer = convertedDepth.depthDataMap
        pixelBuffer.clamp()
        let depthMap = CIImage(cvPixelBuffer: pixelBuffer)
        DispatchQueue.main.async { [weak self] in
            self?.depthMap = depthMap
        }
    }
}
Actual result: the app previews on screen, in real time, whichever CIImage is selected in the UI (the RGB image or the depth map).
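For the saving part, one possible direction (an assumption, not something the code above already does) is to feed each stream into its own AVAssetWriter and export the finished movie files, for example with PHPhotoLibrary; the depth frames would first need converting to a pixel format a video codec can accept. A rough sketch of such a per-stream recorder:

import AVFoundation

// A rough sketch, not the asker's code: one AVAssetWriter per stream, fed
// through a pixel-buffer adaptor. A second instance of the same shape could
// consume the converted depth buffers.
final class StreamRecorder {
    private let writer: AVAssetWriter
    private let input: AVAssetWriterInput
    private let adaptor: AVAssetWriterInputPixelBufferAdaptor

    init(outputURL: URL, width: Int, height: Int) throws {
        writer = try AVAssetWriter(outputURL: outputURL, fileType: .mov)
        input = AVAssetWriterInput(mediaType: .video, outputSettings: [
            AVVideoCodecKey: AVVideoCodecType.h264,
            AVVideoWidthKey: width,
            AVVideoHeightKey: height
        ])
        input.expectsMediaDataInRealTime = true
        adaptor = AVAssetWriterInputPixelBufferAdaptor(assetWriterInput: input,
                                                       sourcePixelBufferAttributes: nil)
        writer.add(input)
    }

    func start(at time: CMTime) {
        writer.startWriting()
        writer.startSession(atSourceTime: time)
    }

    // Call with the RGB pixel buffer from captureOutput(_:didOutput:from:), or
    // with a depth map converted to a BGRA/grayscale pixel buffer (H.264
    // cannot encode Float32 disparity directly).
    func append(_ pixelBuffer: CVPixelBuffer, at time: CMTime) {
        guard input.isReadyForMoreMediaData else { return }
        if !adaptor.append(pixelBuffer, withPresentationTime: time) {
            print("Dropped frame: \(writer.error?.localizedDescription ?? "unknown error")")
        }
    }

    func finish(completion: @escaping () -> Void) {
        input.markAsFinished()
        writer.finishWriting(completionHandler: completion)
    }
}

One recorder instance would be fed from captureOutput(_:didOutput:from:) with the RGB buffers and a second from depthDataOutput(...), after rendering the disparity CIImage into a new pixel buffer in a displayable format.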

How can I rescan an image in ARKit?

In my project I'm using ARKit to detect a specific image, and when it is detected, the app shows me its information. If I've already scanned the image and I want to rescan it to see the information again, it doesn't work. This is the code that I use for the image recognition:
sceneView.delegate = self
sceneView.showsFPS = true
sceneView.showsNodeCount = true
if let scene = SKScene(fileNamed: "Scene") {
    sceneView.presentScene(scene)
}
guard let referenceImages = ARReferenceImage.referenceImages(inGroupNamed: "image", bundle: nil) else {
    fatalError("Missing expected asset catalog resources.")
}
let configuration = ARWorldTrackingConfiguration()
configuration.detectionImages = referenceImages
sceneView.session.run(configuration, options: [.resetTracking, .removeExistingAnchors])
}

// MARK: - ARSKViewDelegate

func view(_ view: ARSKView, nodeFor anchor: ARAnchor) -> SKNode? {
    if let imageAnchor = anchor as? ARImageAnchor,
       let referenceImageName = imageAnchor.referenceImage.name,
       let scannedImage = self.images[referenceImageName] {
        self.selectedImage = scannedImage
        self.performSegue(withIdentifier: "showImageInformation", sender: self)
    }
    return nil
}

override func prepare(for segue: UIStoryboardSegue, sender: Any?) {
    if segue.identifier == "showImageInformation" {
        if let imageInformationVC = segue.destination as? ImageInformationViewController,
           let actualSelectedImage = selectedImage {
            imageInformationVC.imageInformation = actualSelectedImage
        }
    }
}
The only way is to reset your current session.
Example:
func resetExperience(session: ARSession, configuration: ARWorldTrackingConfiguration) {
    guard let referenceImages = ARReferenceImage.referenceImages(inGroupNamed: "image", bundle: nil) else {
        fatalError("Missing expected asset catalog resources.")
    }
    configuration.detectionImages = referenceImages
    session.run(configuration, options: [.resetTracking, .removeExistingAnchors])
}
And some general info: ARWorldTrackingConfiguration.
Hope it helps!
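For example (an assumed wiring, not part of the original answer), the reset could be triggered from a button on the same view controller:

@IBAction func rescanTapped(_ sender: UIButton) {
    resetExperience(session: sceneView.session,
                    configuration: ARWorldTrackingConfiguration())
}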
