How can Vision be used to identify visible face landmarks? - ios

I've been using Vision to identify Facial Landmarks, using VNDetectFaceLandmarksRequest.
It seems that whenever a face is detected, the resulting VNFaceObservation always contains all possible landmarks, with positions for each of them. Positions returned for occluded landmarks appear to be 'guessed' by the framework.
I have tested this using a photo where the subject's face is turned to the left, so the left eye isn't visible. Vision still returns a left eye landmark, along with a position.
The same happens with the mouth and nose of a subject wearing an N95 face mask, or the eyes of someone wearing opaque sunglasses.
While this can be a useful feature for other use cases, is there a way, using Vision or CIDetector, to determine which face landmarks are actually visible in a photo?
I also tried using CIDetector, but it appears to detect mouths and smiles through N95 masks, so it doesn't seem to be a reliable alternative.

After confirmation from Apple, it appears it simply cannot be done.
If Vision detects a face, it will guess some occluded landmarks' positions, and there is no way to differentiate actually detected landmarks from guesses.
For those facing the same issue, a partial workaround is to compare the landmark points' positions to those of the median-line and nose-crest points.
While this can help determine if a facial landmark is occluded by the face itself, it won't help with facial landmarks occluded by opaque sunglasses or face masks.
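The median-line comparison mentioned above could be sketched as follows. This is a hypothetical heuristic, not anything from the Vision API: assuming you have already extracted the normalized `(x, y)` points for a lateral landmark (say, an eye) and for the median line from a `VNFaceObservation`, a landmark whose centroid sits almost on the median line is probably turned away from the camera. The `margin` value is a guess you would tune on real data.

```python
import numpy as np

def likely_self_occluded(landmark_pts, median_line_pts, margin=0.02):
    """Heuristic: a lateral landmark (e.g. an eye) whose centroid falls on,
    or very close to, the face's median line is probably self-occluded and
    its position guessed by the framework.

    All points are (x, y) pairs in normalized face coordinates.
    `margin` is a hypothetical tolerance, to be tuned on real data.
    """
    landmark_pts = np.asarray(landmark_pts, dtype=float)
    median_line_pts = np.asarray(median_line_pts, dtype=float)
    centroid_x = landmark_pts[:, 0].mean()
    median_x = median_line_pts[:, 0].mean()
    return abs(centroid_x - median_x) < margin
```

As the answer notes, this only catches occlusion by the face itself; it says nothing about masks or sunglasses.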

Related

Detect nose coordinates on a face profile view

For such tasks, I tend to use Mediapipe or Dlib to detect the facial landmarks and get the specific coordinates I'm interested in working with.
But for a human face taken from a profile view, Dlib can't detect anything, and Mediapipe shows a standard 3D face mesh superimposed on top of the 2D image, which provides false coordinates.
I was wondering if anyone with computer vision (image processing) knowledge can guide me on how to detect the coordinates of the A & B points from this image.
PS: The color of the background changes & also the face location is not standard.
Thanks in advance.
Your question seems a little unclear. If you just want (x, y) screen coordinates, you can use this answer to convert the (x, y, z) that Mediapipe gives you to just (x, y). If this doesn't work for you, I would recommend this repo or this one, which both only work with 68 facial landmarks, but this should be sufficient for your use case.
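The first suggestion, dropping z and scaling to the image size, is a one-liner; a minimal sketch, assuming Mediapipe's landmark x and y are normalized to [0, 1] (the function name is mine, not Mediapipe's):

```python
def to_pixel_coords(landmark_xyz, image_width, image_height):
    """Convert a Mediapipe-style normalized (x, y, z) landmark to (x, y)
    pixel coordinates; z (relative depth) is simply dropped."""
    x, y, _z = landmark_xyz
    return (int(round(x * image_width)), int(round(y * image_height)))

# e.g. a landmark at (0.5, 0.25, -0.1) on a 640x480 image maps to (320, 120)
```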
If all of this fails, I would recommend retraining HRNet on a dataset with profile views. I believe either the 300-W dataset or the 300-VW dataset provides some data with heads at extreme angles.
If you wish to get the 3D coordinates in camera coordinates (X, Y, Z), you're going to have to use solvePnP. This will require obtaining calibration info and model mesh points for whatever facial landmark detector you use. You can find some examples of this here.

why is shape-indexed-feature so effective on face alignment?

I am implementing some face alignment algorithm recently. I have read the following papers:
Supervised descent method and its applications to face alignment
Face alignment by explicit shape regression
Face alignment at 3000 fps via regressing local binary features
All of these papers mention an important keyword: the shape-indexed feature (or pose-indexed feature). This feature plays a key role in the face alignment process. I don't get the key point of this feature. Why is it so important?
A shape-indexed feature is a feature whose index gives some clue about the hierarchical structure of the shape it came from. In face alignment, facial landmarks are extremely important, since they are the things that will be useful in successfully aligning the faces. But just taking facial landmarks into account throws away some of the structure inherent to a face. You know that the pupil is inside the iris, which is inside the eye. So a shape-indexed feature would do more than tell you that you are looking at a facial landmark; it would tell you that you are looking at a facial landmark inside another landmark inside another landmark. Because there are only a few features that are 3-nested like that, you can be more confident about aligning them correctly.
Here is a much older paper that explains some of this with simpler language (especially in the introduction): http://www.cs.ubc.ca/~lowe/papers/cvpr97.pdf
If you want to get shape-indexed features, you should first apply a similarity transform to the face landmarks in each image. The aim is to transform the original landmarks to a specific reference location, which could be the mean landmark shape over all images, so that the landmarks of every image end up in the same position.
Then you can extract local features around the relocated landmarks; these are the shape-indexed features, because the landmarks of every image now form a fixed shape.
I searched for hours to get the answer above, found a graduation thesis and translated it, but I'm not sure whether it's the right answer. In my opinion, it makes sense.
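The similarity-transform step described in this answer can be sketched with a plain NumPy Procrustes alignment. This is a minimal version under simplifying assumptions; real pipelines typically re-estimate the mean shape and re-align all training shapes iteratively.

```python
import numpy as np

def similarity_align(shape, target):
    """Find the scale s, rotation R, and translation t minimizing
    ||s * R(shape) + t - target||^2 (Procrustes), and return the
    transformed shape. Both inputs are (N, 2) landmark arrays."""
    shape = np.asarray(shape, dtype=float)
    target = np.asarray(target, dtype=float)
    mu_s, mu_t = shape.mean(axis=0), target.mean(axis=0)
    s0, t0 = shape - mu_s, target - mu_t
    # Optimal rotation via SVD of the cross-covariance matrix.
    u, _, vt = np.linalg.svd(t0.T @ s0)
    r = u @ vt
    if np.linalg.det(r) < 0:  # avoid reflections
        u[:, -1] *= -1
        r = u @ vt
    scale = np.trace(t0.T @ (s0 @ r.T)) / (s0 ** 2).sum()
    return scale * s0 @ r.T + mu_t
```

Aligning every training shape to the mean shape this way is what makes the subsequently extracted local features "shape-indexed": feature k is always sampled at the same canonical landmark position.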

Best Facial Landmark that can be easily extracted from NIR Image?

I'm playing with eye gaze estimation using an IR camera. So far I have detected the two pupil center points as follows:
Detect the face using the Haar face cascade and set the ROI to the face.
Detect the eyes using the Haar eye cascade and label them as the left and right eye respectively.
Detect the pupil center by thresholding the eye region.
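Step 3 above (thresholding the eye region) can be sketched with plain NumPy: take the centroid of the dark pixels as the pupil center. The threshold value here is a guess you would tune for your NIR camera.

```python
import numpy as np

def pupil_center(eye_roi, thresh=40):
    """Threshold a grayscale eye ROI (uint8 array) and return the centroid
    of the dark pixels as the (x, y) pupil center within the ROI.
    `thresh` is a hypothetical value, to be tuned per camera."""
    ys, xs = np.nonzero(eye_roi < thresh)
    if len(xs) == 0:
        return None  # nothing dark enough: no pupil found
    return (xs.mean(), ys.mean())
```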
So far I've tried to find the gaze direction using the Haar eye boundary region, but this Haar eye rect doesn't always include the eye corner points, so the results were poor.
Then I tried to detect the eye corner points using GFTT, Harris corners & FAST, but since I'm using an NIR camera the eye corner points are not clearly visible, so I can't get the exact corner positions. I'm stuck here!
What is the best alternative feature that can be tracked easily on the face? I've heard about flandmark, but I think that also won't work on IR-captured images.
Is there any feature that can be extracted easily from these face images? I've attached my sample output image.
I would suggest flandmark, even if your intuition is the opposite; I've used it in my master's thesis (which was about head pose estimation, a related topic). As to whether it will work with the example image you've provided, I think it might detect features properly, even on a grayscale image. Flandmark probably converts the image to grayscale before applying the detector (as the Haar detector does). Moreover, it works surprisingly well with low-resolution images, which is an advantage too (especially since you say the eye corners are not clearly visible). Flandmark can detect both eye corners, the mouth corners, and the nose tip (though I wouldn't rely on the last one; in my experience, detecting the nose tip in a single image is quite noisy, but it works fine on an image sequence with some filtering, e.g. averaging or a Kalman filter). If you decide to use this technique and it works, please let us know!

Find corner points of eyes and Mouth

I am able to detect the eyes, nose, and mouth in a given face using MATLAB. Now I want four more points, i.e. the corners of the eyes and the nose. How do I get these points?
This is the image for the nose corner points.
The red point shows what I'm looking for (it's just for illustration; there is no point in the original image).
Active Appearance Model (AAM) could be useful in your case.
AAM is normally used for matching a statistical model of object shape and appearance to a new image, and is widely used for extracting face features and for head pose estimation.
I believe this could be helpful for you to start with.
You can try using the corner detectors included in the Computer Vision System Toolbox, such as detectHarrisFeatures, detectMinEigenFeatures, or detectFASTFeatures. However, they may give you more points than you want, so you will have to do some parameter tweaking.
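Those MATLAB detectors have OpenCV counterparts (cv2.cornerHarris, cv2.goodFeaturesToTrack). For illustration, here is a toy NumPy version of the Harris response that the first of them computes; real detectors add Gaussian weighting, non-maximum suppression, and proper border handling.

```python
import numpy as np

def harris_response(img, k=0.04):
    """Toy Harris corner response: gradients via central differences,
    a 3x3 box sum for the structure tensor M, then det(M) - k * trace(M)^2.
    Positive peaks indicate corners, negative values indicate edges."""
    img = np.asarray(img, dtype=float)
    iy, ix = np.gradient(img)
    ixx, iyy, ixy = ix * ix, iy * iy, ix * iy

    def boxsum(a):
        # 3x3 neighborhood sum (wraps at borders; fine for a toy version)
        out = np.zeros_like(a)
        for dy in (-1, 0, 1):
            for dx in (-1, 0, 1):
                out += np.roll(np.roll(a, dy, axis=0), dx, axis=1)
        return out

    sxx, syy, sxy = boxsum(ixx), boxsum(iyy), boxsum(ixy)
    det = sxx * syy - sxy * sxy
    tr = sxx + syy
    return det - k * tr * tr
```

Thresholding this response map is where the "parameter tweaking" mentioned above comes in: the threshold controls how many candidate points survive.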

Recognition of face details as a set of points, not just rectangles

I'm doing research in the field of emotion recognition. For this purpose I need to detect and classify particular face details like the eyes, nose, mouth, etc. The standard OpenCV function for this is detectMultiScale(), but its disadvantage is that it returns a list of rectangles (video), while I'm mostly interested in particular key points: the corners of the mouth, upper and lower points, edges, etc. (video).
So, how do they do it? OpenCV is ideal, but other solutions are ok too.
To analyse such precise points, you can use Active Appearance Models. Your second video seems to be done with an AAM. Check out the Wikipedia link above, where you can find a lot of AAM tools and APIs.
On the other hand, if you can detect the mouth using a Haar cascade, you can apply colour filtering: the lips and the surrounding region obviously differ in colour. You get a precise model of the lips and can find their edges.
Check out this paper: Lip Contour Extraction
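The colour-filtering idea can be sketched as a toy mask over a mouth ROI. The rule and the margin below are made up for illustration; a real pipeline would work in a chroma space such as YCbCr or HSV, as the cited paper does.

```python
import numpy as np

def lip_mask(rgb_roi, margin=20):
    """Toy colour filter for a mouth ROI (H x W x 3 uint8 RGB array): keep
    pixels whose red channel clearly dominates green, which tends to
    separate lips from surrounding skin. `margin` is a hypothetical
    threshold; tune it (or switch to a chroma space) on real data."""
    rgb = rgb_roi.astype(int)  # avoid uint8 wrap-around in the subtraction
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    return (r - g > margin) & (r >= b)
```

The resulting boolean mask can then be handed to a contour finder to extract the lip edges.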
