Is it possible to use iPhone X Face ID data to create a 3D model of the user's face? If yes, can you tell me where I should look? I wasn't really able to find anything related to this. I found a WWDC video about TrueDepth and ARKit, but I am not sure it would help.
Edit:
I just watched a WWDC video and it says that ARKit provides detailed 3D face geometry. Do you think it's precise enough to create a 3D representation of a person's face? Maybe combined with an image? Any idea?
Yes and no.
Yes, there are APIs for getting depth maps captured with the TrueDepth camera, for face tracking and modeling, and for using Face ID to authenticate in your own app:
You implement Face ID support using the LocalAuthentication framework. It's the same API you use for Touch ID support on other devices — you don't get any access to the internals of how the authentication works or the biometric data involved, just a simple yes-or-no answer about whether the user passed authentication.
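A minimal sketch of that yes-or-no flow with the LocalAuthentication framework (the localized reason string is illustrative):

```swift
import LocalAuthentication

// Ask the system to authenticate with Face ID (or Touch ID on other
// devices). You never see any biometric data -- only a pass/fail result.
let context = LAContext()
var authError: NSError?

if context.canEvaluatePolicy(.deviceOwnerAuthenticationWithBiometrics, error: &authError) {
    context.evaluatePolicy(.deviceOwnerAuthenticationWithBiometrics,
                           localizedReason: "Unlock your account") { success, error in
        if success {
            print("Authenticated")  // a simple yes is all you get
        } else {
            print("Failed: \(error?.localizedDescription ?? "unknown error")")
        }
    }
}
```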
For simple depth map capture with photos and video, see AVFoundation > Cameras and Media Capture, or the WWDC17 session on such — everything about capturing depth with the iPhone 7 Plus dual back camera also applies to the iPhone X and 8 Plus dual back camera, and to the front TrueDepth camera on iPhone X.
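As a rough sketch of streaming depth capture (device-bound, so treat it as an outline rather than a drop-in implementation):

```swift
import AVFoundation

// Stream depth maps from the front TrueDepth camera.
final class DepthReceiver: NSObject, AVCaptureDepthDataOutputDelegate {
    let session = AVCaptureSession()
    let depthOutput = AVCaptureDepthDataOutput()

    func configure() throws {
        guard let device = AVCaptureDevice.default(.builtInTrueDepthCamera,
                                                   for: .video, position: .front) else { return }
        session.addInput(try AVCaptureDeviceInput(device: device))
        if session.canAddOutput(depthOutput) {
            session.addOutput(depthOutput)
            depthOutput.setDelegate(self, callbackQueue: DispatchQueue(label: "depth"))
        }
        session.startRunning()
    }

    func depthDataOutput(_ output: AVCaptureDepthDataOutput,
                         didOutput depthData: AVDepthData,
                         timestamp: CMTime,
                         connection: AVCaptureConnection) {
        // A CVPixelBuffer of per-pixel depth (or disparity) values.
        let map = depthData.depthDataMap
        _ = map
    }
}
```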
For face tracking and modeling, see ARKit, specifically ARFaceTrackingConfiguration and related API. There's sample code showing the various basic things you can do here, as well as the Face Tracking with ARKit video you found.
Yes, indeed, you can create a 3D representation of a user's face with ARKit. The wireframe you see in that video is exactly that, and is provided by ARKit. With ARKit's SceneKit integration you can easily display that model, add textures to it, add other 3D content anchored to it, etc. ARKit also provides another form of face modeling called blend shapes — this is the more abstract representation of facial parameters, tracking 50 or so muscle movements, that gets used for driving avatar characters like Animoji.
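A minimal sketch of both forms, assuming an ARSCNView outlet and a TrueDepth device (class and outlet names are illustrative):

```swift
import ARKit
import SceneKit

class FaceViewController: UIViewController, ARSCNViewDelegate {
    @IBOutlet var sceneView: ARSCNView!

    override func viewWillAppear(_ animated: Bool) {
        super.viewWillAppear(animated)
        sceneView.delegate = self
        sceneView.session.run(ARFaceTrackingConfiguration())
    }

    func renderer(_ renderer: SCNSceneRenderer, nodeFor anchor: ARAnchor) -> SCNNode? {
        // Wireframe face mesh driven by ARKit's generalized face model.
        guard let device = sceneView.device,
              let geometry = ARSCNFaceGeometry(device: device) else { return nil }
        let node = SCNNode(geometry: geometry)
        node.geometry?.firstMaterial?.fillMode = .lines
        return node
    }

    func renderer(_ renderer: SCNSceneRenderer, didUpdate node: SCNNode, for anchor: ARAnchor) {
        guard let faceAnchor = anchor as? ARFaceAnchor else { return }
        // Keep the mesh in sync with the user's face.
        (node.geometry as? ARSCNFaceGeometry)?.update(from: faceAnchor.geometry)
        // Blend shapes: named coefficients in 0...1, e.g. jaw opening.
        let jawOpen = faceAnchor.blendShapes[.jawOpen]?.floatValue ?? 0
        _ = jawOpen
    }
}
```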
All of this works with a generalized face model, so there's not really anything in there about identifying a specific user's face (and you're forbidden from trying to use it that way in the App Store — see §3.3.52 "If your application accesses face data..." in the developer program license agreement).
No, Apple provides no access to the data or analysis used for enrolling or authenticating Face ID. Gaze tracking / attention detection and whatever parts of Apple's face modeling have to do with identifying a unique user's face aren't parts of the SDK Apple provides.
Related
Apple provides a sample project for putting 3D content or face filters on people's faces. The 3D content tracks the face anchor and moves with it. But this only works on devices that have a TrueDepth camera; for example, we cannot use ARSCNFaceGeometry without TrueDepth. How do Facebook or third-party SDKs like Banuba make this work on devices without a depth camera?
As far as I know, using MediaPipe to get a face mesh is the only option without a TrueDepth camera.
Is it possible with the Apple Vision framework to compare faces and recognise whether the person in a picture matches a reference image of that person?
Something like Facebook Face recognition.
Thomas
From the Vision Framework Documentation:
The Vision framework performs face and face landmark detection, text detection, barcode recognition, image registration, and general feature tracking. Vision also allows the use of custom Core ML models for tasks like classification or object detection.
So, no, the Vision Framework does not provide face recognition, only face detection.
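A short sketch of what detection gives you: bounding boxes, not identities.

```swift
import Vision

// Vision tells you *where* faces are, not *whose* they are.
func detectFaces(in cgImage: CGImage) {
    let request = VNDetectFaceRectanglesRequest { request, error in
        let faces = request.results as? [VNFaceObservation] ?? []
        for face in faces {
            // Each observation carries only a normalized bounding box.
            print("Face at \(face.boundingBox)")
        }
    }
    let handler = VNImageRequestHandler(cgImage: cgImage, options: [:])
    try? handler.perform([request])
}
```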
There are approaches out there to recognize faces. Here is an example of face recognition in an AR app:
https://github.com/NovatecConsulting/FaceRecognition-in-ARKit
They trained a model that can recognize around 100 people, but you have to retrain it for every new person you want to recognize. Unfortunately, you cannot simply pass in two images and have the faces compared.
According to Face Detection vs Face Recognition article:
Face detection just means that a system is able to identify that a human face is present in an image or video. For example, face detection can be used for a camera's autofocus functionality.
Face recognition describes a biometric technology that goes far beyond merely detecting that a human face is present. It actually attempts to establish whose face it is.
But...
In case you need an Augmented Reality app, like FaceApp, the answer is:
Yes, you can create an app similar to FaceApp using ARKit.
That's because you only need a simple form of face detection, which is accessible via the ARKit or RealityKit framework. You don't even need to create an .mlmodel as you would with the Vision and Core ML frameworks.
All you need is a device with a front camera; ARKit 3.0 or RealityKit 1.0 lets you detect up to three faces at a time, and you receive an ARFaceAnchor for each face that has been detected.
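A minimal sketch of receiving ARFaceAnchors through the session delegate (the class name is illustrative):

```swift
import ARKit

class FaceSessionDelegate: NSObject, ARSessionDelegate {
    func run(on session: ARSession) {
        let config = ARFaceTrackingConfiguration()
        // ARKit 3+ can track several faces at once on supported hardware.
        config.maximumNumberOfTrackedFaces =
            ARFaceTrackingConfiguration.supportedNumberOfTrackedFaces
        session.delegate = self
        session.run(config)
    }

    func session(_ session: ARSession, didAdd anchors: [ARAnchor]) {
        for anchor in anchors {
            guard let faceAnchor = anchor as? ARFaceAnchor else { continue }
            // One ARFaceAnchor per detected face, with pose and geometry.
            print("Face detected at \(faceAnchor.transform)")
        }
    }
}
```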
Additionally, if you want to use reference images for simple image detection, you need to put several reference images into an "AR Resources" group in your asset catalog; when one is detected you get an ARImageAnchor (at the center of the detected image).
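A sketch of the reference-image path ("AR Resources" is the conventional asset-catalog group name; adjust it to your project):

```swift
import ARKit

// Detect reference images; ARKit adds an ARImageAnchor at the center
// of each match.
func runImageDetection(on session: ARSession) {
    guard let referenceImages = ARReferenceImage.referenceImages(
        inGroupNamed: "AR Resources", bundle: nil) else { return }

    let config = ARWorldTrackingConfiguration()
    config.detectionImages = referenceImages
    session.run(config)
}

// In your ARSessionDelegate:
// func session(_ session: ARSession, didAdd anchors: [ARAnchor]) {
//     for case let imageAnchor as ARImageAnchor in anchors {
//         print("Detected \(imageAnchor.referenceImage.name ?? "image")")
//     }
// }
```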
How can we access front-facing camera images with ARCamera or ARSCNView, and is it possible to record an ARSCNView just like a normal camera recording?
Regarding the front-facing camera: in short, no.
ARKit offers two basic kinds of AR experience:
World Tracking (ARWorldTrackingConfiguration), using the back-facing camera, where a user looks "through" the device at an augmented view of the world around them. (There's also AROrientationTrackingConfiguration, which is a reduced quality version of world tracking, so it still uses only the back-facing camera.)
Face Tracking (ARFaceTrackingConfiguration), supported only with the front-facing TrueDepth camera on iPhone X, where the user sees an augmented view of themselves in the front-facing camera view. (As @TawaNicolas notes, Apple has sample code here... which, until the iPhone X actually becomes available, you can read but not run.)
In addition to the hardware requirement, face tracking and world tracking are mostly orthogonal feature sets. So even though there's a way to use the front facing camera (on iPhone X only), it doesn't give you an experience equivalent to what you get with the back facing camera in ARKit.
Regarding video recording in the AR experience: you can use ReplayKit in an ARKit app same as in any other app.
If you want to record just the camera feed, there isn't a high level API for that, but in theory you might have some success feeding the pixel buffers you get in each ARFrame to AVAssetWriter.
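A rough sketch of that AVAssetWriter approach (untested outline; the class name and timescale are illustrative):

```swift
import ARKit
import AVFoundation

final class FrameRecorder {
    private let writer: AVAssetWriter
    private let input: AVAssetWriterInput
    private let adaptor: AVAssetWriterInputPixelBufferAdaptor

    init(outputURL: URL, width: Int, height: Int) throws {
        writer = try AVAssetWriter(outputURL: outputURL, fileType: .mp4)
        input = AVAssetWriterInput(mediaType: .video, outputSettings: [
            AVVideoCodecKey: AVVideoCodecType.h264,
            AVVideoWidthKey: width,
            AVVideoHeightKey: height
        ])
        input.expectsMediaDataInRealTime = true
        adaptor = AVAssetWriterInputPixelBufferAdaptor(
            assetWriterInput: input, sourcePixelBufferAttributes: nil)
        writer.add(input)
    }

    // Call from session(_:didUpdate:) with each new ARFrame.
    func append(_ frame: ARFrame) {
        let time = CMTime(seconds: frame.timestamp, preferredTimescale: 600)
        if writer.status == .unknown {
            // Start the timeline at the first frame's timestamp.
            writer.startWriting()
            writer.startSession(atSourceTime: time)
        }
        guard input.isReadyForMoreMediaData else { return }
        adaptor.append(frame.capturedImage, withPresentationTime: time)
    }

    func finish() {
        input.markAsFinished()
        writer.finishWriting { }
    }
}
```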
As far as I know, ARKit with the front-facing camera is only supported on iPhone X.
Here's Apple's sample code regarding this topic.
If you want to access the UIKit or AVFoundation cameras, you still can, but separately from ARSCNView. E.g., I'm loading UIKit's UIImagePickerController from an IBAction; it is a little awkward to do so, but it works for my purposes (loading/creating image and video assets).
I plan to develop software that takes attendance (for work or school) by face recognition as my final-year project (FYP). (Just an idea.)
I have searched the net for image processing libraries and found that OpenCV is well known; there are a lot of YouTube videos on face recognition with OpenCV, which will definitely help me a lot. (I'm totally new to image processing.) I will also be using Visual Studio.
Here comes the first problem: is it possible to detect whether a photo or a real person is standing in front of the camera while taking attendance?
If yes, can you provide some links or tutorials on how image processing can distinguish a "photograph" from a "real person"?
As I said, I'm totally new to image processing, and this is just an idea for my FYP.
Or is there any open-source library that you would recommend?
Eulerian Video Magnification can detect whether a photo or a real person is standing in front of the camera, but it may not distinguish a video from a real person. Thus, a face recognition authentication system based on Eulerian Video Magnification can be defeated when a malicious user presents a face video rather than a real face.
Here are my ideas for developing a robust face recognition authentication system:
You can use multi-view face recognition to develop a robust face authentication system. Here is a demo video of this technique, and here are papers for the theoretical background. Also, you can benefit from this, this, this and this when you start coding.
You can use RANDOM challenges to detect whether a photo/video or a real person is present: for example, blink your eyes three times, move your eyebrows, look to the left or look to the right (multi-view face recognition will be used to recognize the user's face when they look right or left), etc.
You should combine these two ideas in your project to develop a robust face recognition authentication system.
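The random-challenge idea above can be sketched in a few lines; detecting whether the user actually performed each action is left to your face tracking pipeline (the type and case names here are hypothetical):

```swift
import Foundation

// Pick an unpredictable liveness prompt so that a static photo (or a
// pre-recorded video) is unlikely to satisfy it.
enum LivenessChallenge: CaseIterable {
    case blinkThreeTimes, raiseEyebrows, lookLeft, lookRight
}

func nextChallenge() -> LivenessChallenge {
    // A fresh random challenge per authentication attempt.
    LivenessChallenge.allCases.randomElement()!
}
```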
Here is the scenario:
I am totally new to AR. I searched the internet about marker-based and markerless AR, but I am still confused about the difference.
Let's assume an AR app triggers an AR action when it scans a specific image. Is this marker-based or markerless AR?
Isn't the image a marker?
Also, to position the AR content, does marker-based AR use the device's accelerometer and compass, as markerless AR does?
In a marker-based AR application, the images (or the corresponding image descriptors) to be recognized are provided beforehand. In this case you know exactly what the application will search for while acquiring camera data (camera frames). Most of today's AR apps dealing with image recognition are marker-based. Why? Because it's much simpler to detect things that are hard-coded in your app.
On the other hand, a markerless AR application recognizes things that were not directly provided to it beforehand. This scenario is much more difficult to implement, because the recognition algorithm running in your AR application has to identify patterns, colors or other features that may exist in camera frames. For example, if your algorithm is able to identify dogs, the AR application will be able to trigger AR actions whenever a dog is detected in a camera frame, without you having to provide images of every dog in the world (exaggerated, of course; in practice you train a model on a database of images) when developing the application.
Long story short: in a marker-based AR application involving image recognition, the marker can be an image or the corresponding descriptors (features + key points). Usually an AR marker is a black-and-white (square) image, a QR code for example. These markers are easily recognized and tracked, so not much processing power is needed on the end-user device to perform the recognition (and, optionally, tracking).
There is no need for an accelerometer or a compass in a marker-based app. The recognition library may be able to compute the pose matrix (rotation & translation) of the detected image relative to your device's camera. If you know that, you know how far away the recognized image is and how it is rotated relative to your device's camera. And from there on, AR begins... :)
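For instance, extracting the marker's distance from a 4x4 pose matrix is a one-liner (a sketch assuming ARKit-style conventions, where the fourth column holds the translation in meters):

```swift
import simd

// Given a marker's pose (rotation + translation relative to the camera),
// the translation column gives its position, so its distance from the
// camera is just the length of that vector.
func markerDistance(pose: simd_float4x4) -> Float {
    let translation = SIMD3<Float>(pose.columns.3.x,
                                   pose.columns.3.y,
                                   pose.columns.3.z)
    return simd_length(translation)
}
```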
Well, since I got downvoted without explanation, here is a little more detail on markerless tracking:
Actually, there are several possibilities for augmented reality without "visual" markers, but none of them is called markerless tracking.
Showing the virtual information can be triggered by GPS, speech, or simply turning on your phone.
Also, people tend to confuse NFT (natural feature tracking) with markerless tracking. With NFT you can use a real-life picture as a marker, but it is still a "marker".
This site has a nice overview and some examples of each marker type:
Marker-Types
It's mostly in German, so be aware.
What is called markerless tracking today is a technique best observed with the HoloLens or the AR framework Kudan. Markerless tracking doesn't find anything on its own; instead, you can place an object at runtime somewhere in your field of view.
Markerless tracking is then used to keep this object in place. It most likely uses a combination of sensor input and solving the SLAM (simultaneous localization and mapping) problem at runtime.
EDIT: A little update. It seems the HoloLens creates its own internal geometric representation of the room; 3D objects are then placed into that virtual room, and after that the room is kept in sync with the real world. The exact technique behind this seems to be unknown, but some speculate that it is based on Xbox Kinect technology.
Let's make it simple:
Marker-based augmented reality is when the tracked object is a black-and-white square marker. A great example that is really easy to follow is shown here: https://www.youtube.com/watch?v=PbEDkDGB-9w (you can try it out yourself).
Markerless augmented reality is when the tracked object can be anything else: a picture, a human body, a head, eyes, a hand or fingers, etc., on top of which you add virtual objects.
To sum it up, position and orientation information is the essential ingredient for augmented reality, and it can be provided by various sensors and methods. If that information is accurate, you can create some really good AR applications.
It looks like there may be some confusion between marker tracking and natural feature tracking (NFT). A lot of AR SDKs tout their tracking as markerless (NFT). This is still marker tracking, in that a pre-defined image or set of features is used. It's just not necessarily a black-and-white, ARToolKit-type marker. Vuforia, for example, uses NFT, which still requires a marker in the literal sense. Also, in the most literal sense, hand/face/body tracking is also marker tracking, in that the marker is a shape. Markerless tracking, as the name implies, requires no prior knowledge of the world and no particular shape or object to be present to track.
You can read more about how Markerless tracking is achieved here, and see multiple examples of both marker-based and Markerless tracking here.
Marker-based AR uses a camera and a visual marker to determine the center, orientation and range of its spherical coordinate system. ARToolKit was the first full-featured toolkit for marker-based tracking.
Markerless tracking is one of the best tracking methods currently available. It performs active tracking and recognition of the real environment on any type of surface without specially placed markers, allowing more complex applications of the augmented reality concept.