Classify faces from VNFaceObservation - iOS

I'm working with the Vision framework to detect faces and objects in multiple images, and it works great.
But I have a question I can't find answered in the documentation. The Photos app on iOS classifies faces, and you can tap a face to see all of the images containing that face.
How can I classify faces like the Photos app does? Is there a unique identifier or something similar I can use to do this?
Thanks!

In order to uniquely recognise faces, you first need to detect a face, then run it through a CoreML model (or another image classification model, such as a TensorFlow model) that classifies the image and tells you how likely it is that the captured face matches one of the faces the model was trained on.
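A rough sketch of that detect-then-classify flow, where "FaceClassifier" stands in for a hypothetical Core ML model you would have to train yourself:

import UIKit
import Vision

// Sketch only: "FaceClassifier" is a hypothetical Core ML model trained on the faces
// you want to recognise; Vision is just used to find and crop the face rectangles.
func classifyFaces(in image: UIImage) {
    guard let cgImage = image.cgImage,
          let model = try? VNCoreMLModel(for: FaceClassifier().model) else { return }

    let faceRequest = VNDetectFaceRectanglesRequest { request, _ in
        for face in request.results as? [VNFaceObservation] ?? [] {
            // boundingBox is normalised with a lower-left origin; convert to pixel
            // coordinates and flip the y-axis before cropping the CGImage.
            var rect = VNImageRectForNormalizedRect(face.boundingBox, cgImage.width, cgImage.height)
            rect.origin.y = CGFloat(cgImage.height) - rect.origin.y - rect.height
            guard let faceCrop = cgImage.cropping(to: rect) else { continue }

            // Classify the cropped face with the custom model.
            let classify = VNCoreMLRequest(model: model) { request, _ in
                if let best = (request.results as? [VNClassificationObservation])?.first {
                    print("Best match: \(best.identifier) (confidence \(best.confidence))")
                }
            }
            try? VNImageRequestHandler(cgImage: faceCrop).perform([classify])
        }
    }
    try? VNImageRequestHandler(cgImage: cgImage).perform([faceRequest])
}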
Apple Photos uses machine learning (as mentioned in their iPhone reveal keynote this year) to train the device to recognise faces in your photos. That training is performed locally on the device; however, Apple does not (yet) offer any public APIs that let us do the same.
You could send photo data (crops of faces, using the tool mentioned above by Paras) to your server and have it train a model (using a trainer that can export to CoreML, or something like Nvidia DIGITS, on AWS or your own server), convert it to CoreML, compile the model, then download it to your device and sideload it. This is as close as you're going to get to the "magic" face recognition used by Photos, for now, since the device can only run compiled models.
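For the "download and compile on device" part, Core ML does expose an API for compiling a raw .mlmodel at runtime. A minimal sketch, where downloadedURL is assumed to be wherever you saved the file fetched from your server:

import CoreML

// Sketch: compile a .mlmodel that was downloaded at runtime, then load it.
// compileModel(at:) writes the compiled .mlmodelc to a temporary location,
// so move it somewhere permanent if you want to reuse it across launches.
func loadDownloadedModel(from downloadedURL: URL) throws -> MLModel {
    let compiledURL = try MLModel.compileModel(at: downloadedURL)
    return try MLModel(contentsOf: compiledURL)
}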

I don't think there is a way to uniquely identify faces returned to you from the Vision framework. I checked the uuid property of a VNFaceObservation and it is a different identifier every time.
You might have to make your own CoreML model, or wait for (or find) a good 3rd-party one.
I hope someone proves me wrong, because I want to know as well.

You might want to check out this repo:
https://github.com/KimDarren/FaceCropper
I tested this and it works very well; you can even customise it as per your needs.

Related

Suspicious facial expression and foreign object recognition using machine learning and image processing

I am stuck on a project for detecting suspicious facial expressions along with detecting foreign objects (e.g. guns, metal rods or anything similar). I don't know much about ML or image processing, and I need to complete the project as soon as possible. It would be helpful if anyone could point me in the right direction.
How can I manage a dataset?
Which type of code should I follow?
How do I present the final system?
I know it is a lot to ask but any amount of help is appreciated.
I have tried to train a model using transfer learning, following this YouTube tutorial:
https://www.youtube.com/watch?v=avv9GQ3b6Qg
The tutorial uses MobileNet as the model and a known dataset with 7 classes (Angry, Disgust, Fear, Happy, Neutral, Sad, Surprised). I was able to successfully train the model and have detected faces classified into these 7 emotions.
How do I further develop it to achieve what I want?

OPENCV Best way to handle a game screenshot

I want to make an application for counting game statistics automatically. For that purpose, I need some sort of computer vision to handle screenshots of the game.
There are a bunch of regions with different skills, always in the same places, that the app needs to recognize. I assume it should have a database of pictures or maybe some trained samples.
I've started to learn the OpenCV library, but I'm not sure what would be best for this purpose.
Would you please give me some hints or algorithms that I could use?
Here is an example of a game screenshot.
You can convert it to grayscale and then use a Haar cascade classifier to read the words in that image, then save the results to a file format such as CSV. This way you can use your game screenshots to gather data for training your models.

How to recognize or match two images?

I have one image stored in my bundle or in the application.
Now I want to scan images with the camera and compare them with my locally stored image. When the image is matched, I want to play a video, and if the user moves the camera away from that particular image, I want to stop the video.
For that I have tried the Wikitude SDK for iOS, but it is not working properly, as it keeps crashing because of memory issues or other reasons.
Other things that came to mind are Core ML and ARKit, but Core ML classifies an image's properties (name, type, colours, etc.) whereas I want to match a specific image, and ARKit does not support all devices and iOS versions; I also have no idea whether image matching, as per my requirement, is even possible with it.
If anybody has any idea how to achieve this, please share. Every bit of help will be appreciated. Thanks :)
The easiest way is ARKit's image detection. You know the limitations of the devices it supports, but the result it gives is robust and it is really easy to implement. Here is an example.
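A minimal sketch of the ARKit approach, assuming the image you want to detect has been added to an AR Resource Group named "AR Resources" in the asset catalog:

import UIKit
import ARKit
import SceneKit

// Sketch of ARKit image detection (iOS 11.3+). Start your video when the image
// anchor appears; stop it when the camera moves away and the anchor is no longer tracked.
class ImageDetectionViewController: UIViewController, ARSCNViewDelegate {
    @IBOutlet var sceneView: ARSCNView!

    override func viewDidLoad() {
        super.viewDidLoad()
        sceneView.delegate = self
    }

    override func viewWillAppear(_ animated: Bool) {
        super.viewWillAppear(animated)
        guard let referenceImages = ARReferenceImage.referenceImages(inGroupNamed: "AR Resources",
                                                                     bundle: nil) else { return }
        let configuration = ARWorldTrackingConfiguration()
        configuration.detectionImages = referenceImages
        sceneView.session.run(configuration)
    }

    // Called when one of the reference images is detected in the camera feed.
    func renderer(_ renderer: SCNSceneRenderer, didAdd node: SCNNode, for anchor: ARAnchor) {
        guard let imageAnchor = anchor as? ARImageAnchor else { return }
        print("Detected image: \(imageAnchor.referenceImage.name ?? "unnamed")")
        // Start video playback here.
    }
}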
Next is CoreML, which is the hardest way. You need to understand machine learning, even if only briefly, and then comes the tough part: training with your dataset. The biggest drawback is that you have a single image, so I would discard this method.
Finally, a middle-ground solution is to use OpenCV. It might be hard, but it suits your need. You can find different feature-matching methods to locate your image in the camera feed (example here). You can use Objective-C++ to write the C++ code for iOS.
Your task is image similarity, and you can do it simply, with more reliable results, using machine learning. Since your task involves camera scanning, the better option is CoreML. You can refer to this link by Apple for image similarity, and you can improve your results by training with your own datasets. If you need any more clarification, leave a comment.
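If you can target iOS 13 or later, Vision's feature-print API (which I believe is what the Apple image-similarity link above is based on) gives you such a distance directly; a rough sketch:

import Vision
import CoreGraphics

// Sketch of image similarity using Vision feature prints (iOS 13+).
// A smaller distance means the two images are more alike.
func featurePrint(for image: CGImage) throws -> VNFeaturePrintObservation? {
    let request = VNGenerateImageFeaturePrintRequest()
    try VNImageRequestHandler(cgImage: image).perform([request])
    return request.results?.first as? VNFeaturePrintObservation
}

func similarityDistance(between a: CGImage, and b: CGImage) throws -> Float? {
    guard let printA = try featurePrint(for: a),
          let printB = try featurePrint(for: b) else { return nil }
    var distance: Float = 0
    try printA.computeDistance(&distance, to: printB)
    return distance
}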
Another approach is to use a so-called "siamese network", which really means that you run both images through a model such as Inception-v3 or MobileNet and compare their outputs.
However, these models usually give a classification output, i.e. "this is a cat". But if you remove that classification layer from the model, the output is just a bunch of numbers that describe, in a very abstract sense, what sort of things are in the image.
If these numbers for two images are very similar -- if the "distance" between them is very small -- then the two images are very similar too.
So you can take an existing Core ML model, remove the classification layer, run it twice (once on each image) to get two sets of numbers, and then compute the distance between them. If this distance is lower than some threshold, then the images are similar enough.
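A sketch of that comparison step, assuming the two MLMultiArray values are the feature-vector outputs of whatever headless (classification-layer-removed) model you end up using:

import CoreML

// Sketch: Euclidean distance between two feature vectors from a headless model.
// The model and its output name are assumptions; only the distance maths is generic.
func euclideanDistance(_ a: MLMultiArray, _ b: MLMultiArray) -> Double {
    precondition(a.count == b.count, "Feature vectors must have the same length")
    var sum = 0.0
    for i in 0..<a.count {
        let diff = a[i].doubleValue - b[i].doubleValue
        sum += diff * diff
    }
    return sum.squareRoot()
}

// Usage: let similar = euclideanDistance(featuresA, featuresB) < threshold
// where the threshold is something you tune empirically on example pairs.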

Recognize specific images, not the objects in the images

I need to recognize specific images using the iPhone camera. My goal is to have a set of 20 images such that, when a print or other display of one of them is in front of the camera, the app recognizes that image.
I thought about using classifiers (CoreML), but I don't think that would give the intended result. For example, if I had a model that recognizes fruits and I showed it two different pictures of a banana, it would recognize them both as bananas, which is not what I want. I want my app to recognize specific images, regardless of their content.
The behavior I want is exactly what ARToolKit does (https://www.artoolkit.org/documentation/doku.php?id=3_Marker_Training:marker_nft_training), but I do not wish to use this library.
So my question is: are there any other libraries, or other ways, for me to recognize specific images from the camera on iOS (preferably in Swift)?
Since you are using images specific to your use case, there isn't going to be an existing model that you can use. You'd have to create a model, train it, and then import it into CoreML. It's hard to provide specific advice since I know nothing about your images.
As far as libraries are concerned, check out this list and Swift-AI.
Swift-AI has a neural network that you might be able to train if you had enough images.
Most likely you will have to create the model in another language, such as Python and then import it into your Xcode project.
Take a look at this question.
This blog post goes into some detail about how to train your own model for CoreML.
Keras is probably your best bet to build your model. Take a look at this tutorial.
There are other problems too, though. You only have 20 images, which is certainly not enough to train an accurate model, and the user can present modified versions of these images. You'd have to generate realistic samples of each possible image and then use that entire set to train the model; I'd say you need a minimum of 20 samples of each image (400 total). See the sketch below for one way to generate such samples.
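As an illustration of that augmentation idea, here is one way to spin extra training variants out of a single image with Core Image; the filter choices and ranges are just examples, not a recommendation:

import UIKit
import CoreImage

// Sketch: generate simple augmented variants (random small rotation + exposure shift)
// of one source image. A real pipeline would add crops, blur, perspective, noise, etc.
func augmentedVariants(of image: UIImage, count: Int = 20) -> [CIImage] {
    guard let source = CIImage(image: image) else { return [] }
    return (0..<count).map { _ in
        let angle = CGFloat.random(in: -0.15...0.15)      // roughly ±8.5 degrees
        let exposure = Float.random(in: -0.5...0.5)
        return source
            .transformed(by: CGAffineTransform(rotationAngle: angle))
            .applyingFilter("CIExposureAdjust", parameters: [kCIInputEVKey: exposure])
    }
}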
You'll also want to pre-process the images and extract features that you can compare to the known features of your images. This is how facial recognition works. Here is a guide to facial recognition that might help you with feature extraction.
Simply put, without a model that is based on your images you can't do much.
Answering my own question.
I ended up following this awesome tutorial that uses OpenCV to recognize specific images, and it teaches how to make a wrapper so the code can be accessed from Swift.

iPhoto face recognition algorithm

I'm writing a project in which we need to be able to recognize faces using OpenCV. I train my database on photos, then give the program test photos of people who were included in the training. Recognition works well (80-90%). But! If I give the program a photo of a person who was not used when training our database, the program still finds a match in the database with a terribly low distance. At the same time, Apple iPhoto handles all photos well. Does anyone know what algorithm they use to recognize faces, or has anyone had my problem? Help please.
P.S. Tested algorithms: LBPHFaceRecognizer, FisherFaceRecognizer, EigenFaceRecognizer.
You mention iPhoto, so I'm going to assume you're using OS X or iOS. If so, you may want to try Apple's built-in face detection.
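For completeness, a sketch of the built-in detection via Core Image's CIDetector, which predates Vision and exists on both OS X and iOS (the sketch below is the iOS flavour). Note that this only detects faces; matching identities is still up to your own recogniser:

import UIKit
import CoreImage

// Sketch: detect face rectangles with CIDetector.
func detectFaces(in image: UIImage) -> [CIFaceFeature] {
    guard let ciImage = CIImage(image: image),
          let detector = CIDetector(ofType: CIDetectorTypeFace,
                                    context: nil,
                                    options: [CIDetectorAccuracy: CIDetectorAccuracyHigh])
    else { return [] }

    let faces = detector.features(in: ciImage).compactMap { $0 as? CIFaceFeature }
    for face in faces {
        print("Face at \(face.bounds)")
    }
    return faces
}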
