iOS 11 ARKit: Can ARKit also capture the texture of the user's face?

I have read the documentation for all the ARKit classes up and down, and I don't see any place that describes the ability to actually get the texture of the user's face.
ARFaceAnchor contains the ARFaceGeometry (topology and geometry comprised of vertices) and the BlendShapeLocation array (coordinates allowing manipulations of individual facial traits by manipulating geometric math on the user face's vertices).
But where can I get the actual texture of the user's face? For example: the actual skin tone / color / texture, facial hair, or other unique traits such as scars or birthmarks? Or is this not possible at all?

You want a texture-map-style image for the face? There’s no API that gets you exactly that, but all the information you need is there:
ARFrame.capturedImage gets you the camera image.
ARFaceGeometry gets you a 3D mesh of the face.
ARAnchor and ARCamera together tell you where the face is in relation to the camera, and how the camera relates to the image pixels.
So it’s entirely possible to texture the face model using the current video frame image. For each vertex in the mesh...
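Convert the vertex position from model space to camera space (use the anchor’s transform to reach world space, then the camera’s view matrix).
Multiply that vector by the camera’s projection matrix to get normalized image coordinates.
Scale by the image width/height to get pixel coordinates (texture coordinates are the same values normalized to the 0 to 1 range).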
This gets you texture coordinates for each vertex, which you can then use to texture the mesh using the camera image. You could do this math either all at once to replace the texture coordinate buffer ARFaceGeometry provides, or do it in shader code on the GPU during rendering. (If you’re rendering using SceneKit / ARSCNView you can probably do this in a shader modifier for the geometry entry point.)
If instead you want to know for each pixel in the camera image what part of the face geometry it corresponds to, it’s a bit harder. You can’t just reverse the above math because you’re missing a depth value for each pixel... but if you don’t need to map every pixel, SceneKit hit testing is an easy way to get geometry for individual pixels.
If what you’re actually asking for is landmark recognition — e.g. where in the camera image are the eyes, nose, beard, etc — there’s no API in ARKit for that. The Vision framework might help.

I've put together a demo iOS app that shows how to accomplish this. The demo captures a face texture map in real time and applies it back to an ARSCNFaceGeometry to create a textured 3D model of the user's face.
Below you can see the realtime textured 3D face model in the top left, overlaid on top of the AR front facing camera view:
The demo works by rendering an ARSCNFaceGeometry. Instead of rendering it normally, however, it renders the geometry in texture space while continuing to use the original vertex positions to determine where to sample from in the captured pixel data.
Here are links to the relevant parts of the implementation:
FaceTextureGenerator.swift — The main class for generating face textures. This sets up a Metal render pipeline to generate the texture.
faceTexture.metal — The vertex and fragment shaders used to generate the face texture. These operate in texture space.
Almost all the work is done in a Metal render pass, so it easily runs in real time.
I've also put together some notes covering the limitations of the demo.
If you instead want a 2D image of the user's face, you can try doing the following:
Render the transformed ARSCNFaceGeometry to a 1-bit buffer to create an image mask. Basically you just want places where the face model appears to be white, while everything else should be black.
Apply the mask to the captured frame image.
This should give you an image with just the face (although you will likely need to crop the result).

You can calculate the texture coordinates as follows:
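import ARKit

// Assumes faceAnchor is the current ARFaceAnchor and arFrame the current ARFrame.
let geometry = faceAnchor.geometry
let vertices = geometry.vertices
let camera = arFrame.camera
let size = camera.imageResolution
// Model-to-world transform of the face anchor.
let modelMatrix = faceAnchor.transform
let textureCoordinates = vertices.map { vertex -> vector_float2 in
    // Move the vertex from face (model) space into world space.
    let vertex4 = vector_float4(vertex.x, vertex.y, vertex.z, 1)
    let worldVertex4 = simd_mul(modelMatrix, vertex4)
    let worldVertex3 = simd_float3(worldVertex4.x, worldVertex4.y, worldVertex4.z)
    // Project into a portrait viewport; the captured image is landscape,
    // so its width and height are swapped here.
    let pt = camera.projectPoint(worldVertex3,
                                 orientation: .portrait,
                                 viewportSize: CGSize(width: CGFloat(size.height),
                                                      height: CGFloat(size.width)))
    // Convert viewport pixels to normalized (u, v) in the captured image.
    let v = 1.0 - Float(pt.x) / Float(size.height)
    let u = Float(pt.y) / Float(size.width)
    return vector_float2(u, v)
}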

Related

Turn an entire SceneKit scene into an image suitable for a texture

I've written a little app using CoreMotion, AV and SceneKit to make a simple panorama. When you take a picture, it maps that onto a SK rectangle and places it in front of whatever CM direction the camera is facing. This is working fine, but...
I would like the user to be able to click a "done" button and turn the entire scene into a single image. I could then map that onto a sphere for future viewing rather than re-creating the entire set of objects. I don't need to stitch or anything like that, I want the individual images to remain separate rectangles, like photos glued to the inside of a ball.
I know about snapshot and tried using that with a really wide FOV, but that results in a fisheye view that does not map back properly (unless I'm doing it wrong). I assume there is some sort of transform I need to apply? Or perhaps there is an easier way to do this?
The key is "photos glued to the inside of a ball". You have a bunch of rectangles, suspended in space. Turning that into one image suitable for projection onto a sphere is a bit of work. You'll have to project each rectangle onto the sphere, and warp the image accordingly.
If you just want to reconstruct the scene for future viewing in SceneKit, use SCNScene's built-in serialization, write(to:options:delegate:progressHandler:) and SCNScene(named:).
To compute the mapping of images onto a sphere, you'll need some coordinate conversion. For each image, convert the coordinates of the corners into spherical coordinates, with the origin at your point of view. Change the radius of each corner's coordinate to the radius of your sphere, and you now have the projected corners' locations on the sphere.
It's tempting to repeat this process for each pixel in the input rectangular image. But that will leave empty pixels in the spherical output image. So you'll work in reverse. For each pixel in the spherical output image (within the 4 corner points), compute the ray (trivially done, in spherical coordinates) from POV to that point. Convert that ray back to Cartesian coordinates, compute its intersection with the rectangular image's plane, and sample at that point in your input image. You'll want to do some pixel weighting, since your output image and input image will have different pixel dimensions.
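The reverse mapping described above can be sketched roughly as follows. This is only a minimal sketch under some assumptions that are not in the original answer: the output is an equirectangular image (longitude across, latitude down), the point of view sits at the origin looking down -Z, and each photo is described by a center plus unit-length right and up vectors. The Vec3 and PhotoRect types and both function names are hypothetical helpers made up for the illustration.

```swift
import Foundation

// Hypothetical minimal 3D vector type for this sketch.
struct Vec3 {
    var x, y, z: Double
    static func - (a: Vec3, b: Vec3) -> Vec3 { Vec3(x: a.x - b.x, y: a.y - b.y, z: a.z - b.z) }
    static func * (s: Double, v: Vec3) -> Vec3 { Vec3(x: s * v.x, y: s * v.y, z: s * v.z) }
    func dot(_ o: Vec3) -> Double { x * o.x + y * o.y + z * o.z }
    func cross(_ o: Vec3) -> Vec3 {
        Vec3(x: y * o.z - z * o.y, y: z * o.x - x * o.z, z: x * o.y - y * o.x)
    }
}

// Hypothetical description of one photo rectangle suspended in space.
struct PhotoRect {
    var center: Vec3
    var right: Vec3   // unit vector along the photo's width
    var up: Vec3      // unit vector along the photo's height
    var width: Double
    var height: Double
}

// Ray direction from the point of view through the center of an output pixel.
func direction(forPixelX px: Int, y py: Int, width: Int, height: Int) -> Vec3 {
    let longitude = (Double(px) + 0.5) / Double(width) * 2 * Double.pi - Double.pi
    let latitude = Double.pi / 2 - (Double(py) + 0.5) / Double(height) * Double.pi
    return Vec3(x: cos(latitude) * sin(longitude),
                y: sin(latitude),
                z: -cos(latitude) * cos(longitude))
}

// Intersects the ray with the photo's plane and returns where to sample the
// photo (u, v in 0...1, v pointing down), or nil if the ray misses it.
func sampleUV(along d: Vec3, in rect: PhotoRect) -> (u: Double, v: Double)? {
    let normal = rect.right.cross(rect.up)
    let denom = d.dot(normal)
    guard abs(denom) > 1e-9 else { return nil }   // ray parallel to the plane
    let t = rect.center.dot(normal) / denom
    guard t > 0 else { return nil }               // plane is behind the viewer
    let local = (t * d) - rect.center
    let u = local.dot(rect.right) / rect.width + 0.5
    let v = 0.5 - local.dot(rect.up) / rect.height
    guard (0...1).contains(u), (0...1).contains(v) else { return nil }
    return (u, v)
}
```

For every output pixel you would call direction(forPixelX:y:width:height:) and then sampleUV(along:in:) for each photo, sampling the photo at the returned (u, v) whenever the result is non-nil. The pixel weighting mentioned above is left out of the sketch.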

Metal. Why does setting MTLCullMode to none turn off depth comparison?

I am rendering a simple box:
MDLMesh(boxWithExtent: ...)
In my draw loop when I turn off back-face culling:
renderCommandEncoder.setCullMode(.none)
All depth comparison is disabled and sides of the box are drawn completely wrong with back-facing quads in front of front-facing.
Huh?
My intent is to include back-facing surfaces in the depth comparison, not ignore them. This is important when I have, for example, a shape with semi-transparent textures that reveal the shape's internals, which have a different shading style. How do I force depth comparison?
UPDATE
So Warren's suggestion is an improvement but it is still not correct.
My depthStencilDescriptor:
let depthStencilDescriptor = MTLDepthStencilDescriptor()
depthStencilDescriptor.depthCompareFunction = .less
depthStencilDescriptor.isDepthWriteEnabled = true
depthStencilState = device.makeDepthStencilState(descriptor: depthStencilDescriptor)
Within my draw loop I set depth stencil state:
renderCommandEncoder.setDepthStencilState(depthStencilState)
The resultant rendering
Description: this is a box mesh. Each box face uses a shader that paints a disk texture. The texture is transparent outside the body of the disk. The shader paints a red/white spiral texture on front-facing quads and a blue/black spiral texture on back-facing quads. The box sits in front of a camera-aligned quad textured with a mobil image.
Notice how one of the textures paints over the rear back-facing quad with the background texture color. Notice also that the rear-most back-facing quad is not drawn at all.
Actually it is not possible to achieve the effect I am after. I basically want to do a simple composite - Porter/Duff - here but that is order dependent. Order cannot be guaranteed here so I am basically hosed.
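To make the order dependence concrete, here is a small sketch of the Porter/Duff "over" operator on premultiplied-alpha colors. It is plain Swift, not Metal code, and the RGBA type and over function are hypothetical names made up for the illustration. Compositing the same two half-transparent colors in the two possible orders gives two different results.

```swift
// Premultiplied-alpha color: r, g and b are already multiplied by a.
struct RGBA: Equatable { var r, g, b, a: Double }

// Porter/Duff "over": src is drawn on top of dst.
func over(_ src: RGBA, _ dst: RGBA) -> RGBA {
    let k = 1 - src.a
    return RGBA(r: src.r + dst.r * k,
                g: src.g + dst.g * k,
                b: src.b + dst.b * k,
                a: src.a + dst.a * k)
}

let red = RGBA(r: 0.5, g: 0, b: 0, a: 0.5)    // 50% opaque red
let blue = RGBA(r: 0, g: 0, b: 0.5, a: 0.5)   // 50% opaque blue

let redOnTop = over(red, blue)    // r 0.5,  b 0.25, a 0.75
let blueOnTop = over(blue, red)   // r 0.25, b 0.5,  a 0.75
print(redOnTop == blueOnTop)      // prints "false"
```

This is why the result depends on which quad happens to be rasterized first when blending is enabled.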

Texture getting stretched across faces of a cuboid in Open Inventor

I am trying to write a little script to apply a texture to rectangular cuboids. To accomplish this, I run through the scene graph, and wherever I find SoIndexedFaceSet nodes, I insert a SoTexture2 node before them. I put my image file in the SoTexture2 node. The problem I am facing is that the texture is applied correctly to 2 of the faces (say face1 and face2), in the Y-Z plane, but for the other 4 faces, it just stretches the texture at the boundaries of those two faces.
It looks something like this.
The front is how it should look, but as you can see, on the other two faces, it just extrapolates the corner values of the front face. Any ideas why this is happening and any way to avoid this?
Yep, assuming that you did not specify texture coordinates for your SoIndexedFaceSet, that is exactly the expected behavior.
If Open Inventor sees that you have applied a texture image to a geometry and did not specify texture coordinates, it will automatically compute some texture coordinates. Of course it's not possible to guess how you wanted the texture to be applied. So it computes the bounding box then computes texture coordinates that stretch the texture across the largest extent of the geometry (XY, YZ or XZ). If the geometry is a cuboid you can see the effect clearly as in your image. This behavior can be useful, especially as a quick approximation.
What you need to make this work the way you want, is to explicitly assign texture coordinates to the geometry such that the texture is mapped separately to each face. In Open Inventor you can actually still share the vertices between faces because you are allowed to specify different vertex indices and texture coordinate indices (of course this is only more convenient for the application because OpenGL doesn't support this and Open Inventor has to re-shuffle the data internally). If you applied the same texture to an SoCube node you would see that the texture is mapped separately to each face as expected. That's because SoCube defines texture coordinates for each face.
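As a rough illustration of the separate vertex and texture coordinate indices, the sketch below builds the two index lists for a cuboid with 8 shared corners and only 4 shared texture coordinates. It is written in Swift purely to show the data layout and is not Open Inventor code; the names coordIndex and textureCoordIndex mirror the SoIndexedFaceSet fields, and everything else (Point3, corners, texCoords, faces) is a hypothetical name for the example. In a real scene graph these values would go into the coordinate, texture coordinate and face set nodes.

```swift
// Illustrative index layout only; this is not the Open Inventor C++ API.
struct Point3 { var x, y, z: Double }

// Eight shared corner positions of a cuboid spanning -1...1 on each axis.
let corners: [Point3] = [
    Point3(x: -1, y: -1, z: 1), Point3(x: 1, y: -1, z: 1),
    Point3(x: 1, y: 1, z: 1), Point3(x: -1, y: 1, z: 1),
    Point3(x: -1, y: -1, z: -1), Point3(x: 1, y: -1, z: -1),
    Point3(x: 1, y: 1, z: -1), Point3(x: -1, y: 1, z: -1),
]

// Four shared texture coordinates that cover the whole image once.
let texCoords: [(s: Double, t: Double)] = [(0, 0), (1, 0), (1, 1), (0, 1)]

// Corner indices per face, counter-clockwise when seen from outside.
let faces: [[Int]] = [
    [0, 1, 2, 3],   // front  (+z)
    [5, 4, 7, 6],   // back   (-z)
    [1, 5, 6, 2],   // right  (+x)
    [4, 0, 3, 7],   // left   (-x)
    [3, 2, 6, 7],   // top    (+y)
    [4, 5, 1, 0],   // bottom (-y)
]

// Flatten into two parallel index lists; -1 ends each face.
var coordIndex: [Int] = []
var textureCoordIndex: [Int] = []
for face in faces {
    coordIndex += face + [-1]
    // Every face reuses the same four texture coordinates, so the full
    // image is mapped onto each face separately.
    textureCoordIndex += [0, 1, 2, 3, -1]
}
```

The positions are still shared between faces, but each face corner now gets its own texture coordinate, which is what stops the image from being stretched across the side faces.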

Calculating position of object so it matches screen pixels

I would like to move a 3D plane in 3D space and have the movement match the screen's pixels, so I can snap the plane to the edges of the screen.
I have played around with the focal length, camera position and camera scale, and I have managed to get a plane to match the screen pixels in terms of size; however, when I move the plane, things are no longer correct.
So my current status is that I feed the plane size with values as if I were working with standard 2D graphics. If I set the plane size to 128x128, it is displayed more or less as a 2D square of exactly that size.
I am not using and will not use an orthographic view. I am using, and will keep using, a perspective projection because my application needs some perspective to it.
How can this be calculated?
Does anyone have any links to resources that I can read?
You need to take the transformation matrices you use in the vertex shader and apply them to the point (or points) that represent the plane.
That will give you a set of points in the range (-1, -1) to (1, 1) after dividing by w, which you then need to map to the viewport.
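A minimal sketch of that calculation in plain Swift, assuming an OpenGL-style perspective matrix, row-major 4x4 matrices stored as nested arrays, and a viewport whose pixel origin is the top-left corner. All of the names here (multiply, perspective, toPixels, pixelExactDistance) are hypothetical helpers, not part of any framework. The last helper is an addition to the answer: it gives the distance in front of the camera at which one world unit covers exactly one pixel for a given vertical field of view.

```swift
import Foundation

typealias Vec4 = [Double]     // x, y, z, w
typealias Mat4 = [[Double]]   // row-major 4x4

// Matrix times column vector.
func multiply(_ m: Mat4, _ v: Vec4) -> Vec4 {
    var out = [Double](repeating: 0, count: 4)
    for r in 0..<4 {
        for c in 0..<4 { out[r] += m[r][c] * v[c] }
    }
    return out
}

// OpenGL-style perspective projection (camera looks down -Z).
func perspective(fovY: Double, aspect: Double, near: Double, far: Double) -> Mat4 {
    let f = 1 / tan(fovY / 2)
    return [
        [f / aspect, 0, 0, 0],
        [0, f, 0, 0],
        [0, 0, (far + near) / (near - far), 2 * far * near / (near - far)],
        [0, 0, -1, 0],
    ]
}

// Applies the combined view-projection matrix, divides by w to reach the
// -1...1 range, then maps that range onto the viewport in pixels.
func toPixels(_ world: Vec4, viewProjection: Mat4,
              viewport: (width: Double, height: Double)) -> (x: Double, y: Double) {
    let clip = multiply(viewProjection, world)
    let ndcX = clip[0] / clip[3]
    let ndcY = clip[1] / clip[3]
    return ((ndcX + 1) / 2 * viewport.width,
            (1 - ndcY) / 2 * viewport.height)   // flip Y: pixel origin is top-left
}

// Distance from the camera at which 1 world unit spans exactly 1 pixel.
func pixelExactDistance(fovY: Double, viewportHeight: Double) -> Double {
    viewportHeight / (2 * tan(fovY / 2))
}
```

With a 90 degree vertical field of view and an 800x600 viewport, pixelExactDistance comes out at about 300 units. A plane placed at that distance in front of the camera and moved 64 units sideways lands 64 pixels from the screen center. At any other distance the movement scales with depth, which would explain why sizes can match while movement does not.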

GLKit / OpenGL-ES 2.0 - 2D Object World Space and 2D Camera

I am new to OpenGL ES 2.0 and GLKit,
and would like to ask a question.
I tried to find a good example on 2D camera but couldn't find any,
so I hope that you guys can help me :D
--
1)
Firstly, I have an object, and I store its position in GLKVector2.
I would like to know how to draw it in the world space.
2)
I have a "2D Camera" class, storing as a CGRect with its world position and size.
Its size may change depending on the "zoom" I want.
Is there any way to easily draw the objects from world space into this 2D Camera?
Is any optimization required too? such as not drawing objects outside this 2D Camera,
and clipping objects that have some parts outside of 2D Camera.
3)
If the objects are drawn into this 2D camera, how do I apply effects like clip/scale/etc so that it fits on the device screen, and draw it on the screen?
--
I have seen many things about model, view, and projection matrix, but I don't get them. I have only done XNA/Android bitmap drawing calls, which is drawing them onto a Bitmap, and resizing the Bitmap onto the screen.

Resources