I want to create an Animoji in my app. But when I contacted some designers, they didn't know how to design an Animoji-style 3D model. Where can I find a solution for reference?
The solution I can think of is to create many bones on the face of the 3D model, and when I get the blendShapes of ARFaceAnchor, which contain the details of the facial expression, use them to update the bone animations of parts of the face.
Thank you for reading. Any advice is appreciated.
First, to clear the air a bit: Animoji is a product built on top of ARKit, not in any way a feature of ARKit itself. There's no simple path to "build a model in this format and it 'just works' in (or like) Animoji".
That said, there are multiple ways to use the face expression data vended by ARKit to perform 3D animation, so how you do it depends more on what you and your artist are comfortable with. And remember, for any of these you can use as many or as few of the blend shapes as you like, depending on how realistic you want the animation to be.
Skeletal animation
As you suggested, create bones corresponding to each of the blend shapes you're interested in, along with a mapping of blend shape values to bone positions. For example, you'll want to define two positions for the bone for the browOuterUpLeft parameter such that one of them corresponds to a value of 0.0 and another to a value of 1.0 and you can modulate its transform anywhere between those states. (And set up the bone influences in the mesh such that moving it between those two positions creates an effect similar to the reference design when applied to your model.)
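As a minimal sketch of that idea, here's one bone driven by one blend shape value. It assumes a rigged SceneKit character whose skeleton contains a node named "jaw"; the bone name and the two authored angles are made-up placeholders, not anything ARKit provides.
import ARKit
import SceneKit

// Assumed, hand-authored bone poses: the angle for blend shape value 0.0 and for 1.0.
let jawClosedAngle: Float = 0
let jawOpenAngle: Float = -.pi / 6

func updateJaw(in characterRoot: SCNNode, from anchor: ARFaceAnchor) {
    guard let jawBone = characterRoot.childNode(withName: "jaw", recursively: true),
          let weight = anchor.blendShapes[.jawOpen]?.floatValue else { return }
    // Interpolate the bone between its two authored poses by the tracked weight (0...1).
    jawBone.eulerAngles.x = jawClosedAngle + (jawOpenAngle - jawClosedAngle) * weight
}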
Morph target animation
Define multiple, topologically equivalent meshes, one for each blend shape parameter you're interested in. Each one should represent the target state of your character for when that blend shape's weight is 1.0 and all other blend shapes are at 0.0.
Then, at render time, set each vertex position to the weighted average of the same vertex's position in all blend shape targets. Pseudocode:
for vertex in 0..<vertexCount {
    outPosition = float4(0)
    for shape in 0..<blendShapeCount {
        outPosition += targetMeshes[shape][vertex] * blendShapeWeights[shape]
    }
    // write outPosition back to the output mesh (or vertex buffer) at this vertex index
}
An actual implementation of the above algorithm is more likely to be done in a vertex shader on the GPU, so the for vertex part would be implicit there — you'd just need to feed all your blend shape targets in as vertex attributes. (Or use a compute shader?)
If you're using SceneKit, you can let Apple implement the algorithm for you by feeding your blend shape target meshes to SCNMorpher.
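A rough sketch of that route, assuming you've exported one target geometry per blend shape from your modeling tool (the asset names, node layout, and characterNode below are made-up placeholders):
import ARKit
import SceneKit

// One entry per blend shape you care about; the order defines the morpher target indices.
let blendShapeOrder: [ARFaceAnchor.BlendShapeLocation] = [.jawOpen, .browOuterUpLeft, .eyeBlinkRight]

// characterNode holds the base (neutral) mesh; "targets/<name>.scn" are assumed asset names.
let characterNode = SCNScene(named: "targets/neutral.scn")!.rootNode.childNodes[0]
let morpher = SCNMorpher()
morpher.calculationMode = .additive   // each weight adds its deformation on top of the base mesh
morpher.targets = blendShapeOrder.map { location in
    SCNScene(named: "targets/\(location.rawValue).scn")!.rootNode.childNodes[0].geometry!
}
characterNode.morpher = morpher

// Call this from ARSCNViewDelegate's renderer(_:didUpdate:for:) with the updated face anchor.
func apply(_ anchor: ARFaceAnchor) {
    for (index, location) in blendShapeOrder.enumerated() {
        let weight = CGFloat(anchor.blendShapes[location]?.floatValue ?? 0)
        characterNode.morpher?.setWeight(weight, forTargetAt: index)
    }
}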
This is where the name "blend shape" comes from, by the way. And rumor has it the built-in ARFaceGeometry is built this way, too.
Simpler and Hybrid approaches
As you can see in Apple's sample code, you can go even simpler — breaking a face into separate pieces (nodes in SceneKit) and setting their positions or transforms based on the blend shape parameters.
You can also combine some of these approaches. For example, a cartoon character could use morph targets for skin deformation around the mouth, but have floating 2D eyebrows that animate simply through setting node positions.
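A minimal sketch of that simpler style, assuming your character has a floating eyebrow node; the node name, rest height, and travel distance are invented for the example:
import ARKit
import SceneKit

let eyebrowRestY: Float = 0.06    // assumed rest height of the brow node above the face origin
let eyebrowTravel: Float = 0.02   // assumed distance the brow rises at full blend shape weight

func updateEyebrow(in characterRoot: SCNNode, from anchor: ARFaceAnchor) {
    guard let brow = characterRoot.childNode(withName: "leftEyebrow", recursively: true) else { return }
    let raise = anchor.blendShapes[.browInnerUp]?.floatValue ?? 0
    brow.position.y = eyebrowRestY + eyebrowTravel * raise
}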
Check out the 'weboji' JavaScript library on GitHub. The CG artists we hired to create the 3D models got used to the workflow in minutes. It could also be an interesting approach for avoiding proprietary formats and closed-ecosystem issues.
Screenshots of a 3D fox (Three.js-based demo) and a 2D Cartman (SVG-based demo).
Demo on YouTube featuring a 2D 'Cartman'.
Related
I'm struggling with a texture-baking process in 3ds Max. I have a white 3D mesh with two image textures. I'm trying to get a diffuse map (see target_diffuse_map.jpg). To do this, I execute the following steps:
1) Assign image-texture1 and image-texture2 to face1 and face2 of the object.
2) Clone the object to get the white colors when baking texture.
3) Unwrap UVW.
4) Rendering Texture to obtain the diffuse map.
5) Projection of the texture + white colors on the cloned object.
Please, find these steps on this small video I made: https://drive.google.com/file/d/1h4v2CrL8OCLwdeVtLmpQwD250cawgJpi/view
I obtain a badly sampled and weird diffuse map (please see obtained_diffuse_map.jpg). What I want is target_diffuse_map.jpg.
Am I forgetting some steps?
Thank you for your help.
You need to either:
Add a small amount of "Push" in the Projection Modifier
Uncheck "Use Cage" in the Projection Options dialog, while setting a very small value for the offset
Projection mapping works by casting rays from points on the cage toward the corresponding points on your mesh. You did not push the cage out at all, so the rays are not well defined: each ray is cast from a point toward that exact same point, which causes numerical errors and z-fighting. There needs to be some amount of push so that the "from" and "to" points of each ray are different, giving the ray a well-defined direction to travel.
The second option, instead of using the cage defined in the Projection modifier, is to use the offset method (you probably still need to apply the Projection modifier, though). This method defines each ray as starting from a point found by taking the corresponding point on the mesh and moving outward by a fixed offset along the normal. The advantage is that for curved objects with large polygons it produces less distortion, because the system uses the smoothed shading normal at each point. The disadvantage is that you can't have different cage distances at different points of the model for finer control. Use this method for round wooden barrels and other simple objects with large, smooth curves.
Also, your situation is made more difficult by having different parts of the model very close to each other (touching) or embedded within each other, namely the mouth of the bottle being inside the cap and the cap touching the base. For this case, it might make sense to break the object apart after you have the overall UV mapping, run projection mapping on each part separately, and then combine the maps back together in an image editor.
I'm interested in processing data from the TrueDepth camera. I need to obtain the data of a person's face, build a 3D model of the face, and save this model to an .obj file.
Since the 3D model needs to include the person's eyes and teeth, ARKit / SceneKit is not suitable, because ARKit / SceneKit does not fill these areas with data.
But with the help of the SceneKit.ModelIO library, I managed to export ARSCNView.scene (of type SCNScene) in the .obj format.
I tried to take this project as a basis:
https://developer.apple.com/documentation/avfoundation/cameras_and_media_capture/streaming_depth_data_from_the_truedepth_camera
In this project, working with the TrueDepth camera is done using Metal, but if I'm not mistaken, an MTKView rendered using Metal is not a 3D model and cannot be exported as .obj.
Please tell me, is there a way to export an MTKView to an SCNScene, or directly to .obj?
If there is no such method, how can I make a 3D model from AVDepthData?
Thanks.
It's possible to make a 3D model from AVDepthData, but that probably isn't what you want. One depth buffer is just that — a 2D array of pixel distance-from-camera values. So the only "model" you're getting from that isn't very 3D; it's just a height map. That means you can't look at it from the side and see contours that you couldn't have seen from the front. (The "Using Depth Data" sample code attached to the WWDC 2017 talk on depth photography shows an example of this.)
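For what it's worth, here's a minimal sketch (my own assumption-level example, not from the sample code) of reading one AVDepthData buffer as exactly that kind of 2D array of per-pixel distances:
import AVFoundation

// Returns the depth (or disparity) values as rows of Float32, one value per pixel.
func depthValues(from depthData: AVDepthData) -> [[Float32]] {
    let converted = depthData.converting(toDepthDataType: kCVPixelFormatType_DepthFloat32)
    let buffer = converted.depthDataMap
    CVPixelBufferLockBaseAddress(buffer, .readOnly)
    defer { CVPixelBufferUnlockBaseAddress(buffer, .readOnly) }

    let width = CVPixelBufferGetWidth(buffer)
    let height = CVPixelBufferGetHeight(buffer)
    let rowBytes = CVPixelBufferGetBytesPerRow(buffer)
    let base = CVPixelBufferGetBaseAddress(buffer)!

    return (0..<height).map { y in
        let row = (base + y * rowBytes).assumingMemoryBound(to: Float32.self)
        return (0..<width).map { x in row[x] }
    }
}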
If you want more of a truly-3D "model", akin to what ARKit offers, you need to be doing the work that ARKit does — using multiple color and depth frames over time, along with a machine learning system trained to understand human faces (and hardware optimized for running that system quickly). You might not find doing that yourself to be a viable option...
It is possible to get an exportable model out of ARKit using Model I/O. The outline of the code you'd need goes something like this (a rough sketch follows the list):
Get ARFaceGeometry from a face tracking session.
Create MDLMeshBuffers from the face geometry's vertices, textureCoordinates, and triangleIndices arrays. (Apple notes the texture coordinate and triangle index arrays never change, so you only need to create those once — vertices you have to update every time you get a new frame.)
Create a MDLSubmesh from the index buffer, and a MDLMesh from the submesh plus vertex and texture coordinate buffers. (Optionally, use MDLMesh functions to generate a vertex normals buffer after creating the mesh.)
Create an empty MDLAsset and add the mesh to it.
Export the MDLAsset to a URL (providing a URL with the .obj file extension so that it infers the format you want to export).
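Here's a rough sketch of those steps; treat it as a starting point under assumptions (faceGeometry is an ARFaceGeometry you already got from the session, exportURL is a writable file URL ending in .obj), not a tested drop-in:
import ARKit
import ModelIO

let allocator = MDLMeshBufferDataAllocator()

// Step 2: wrap the face geometry's arrays in MDLMeshBuffers.
let vertices = faceGeometry.vertices                       // [simd_float3]
let uvs = faceGeometry.textureCoordinates                  // [simd_float2]
let indices = faceGeometry.triangleIndices                 // [Int16]
let vertexBuffer = allocator.newBuffer(with: Data(bytes: vertices, count: vertices.count * MemoryLayout<simd_float3>.stride), type: .vertex)
let uvBuffer = allocator.newBuffer(with: Data(bytes: uvs, count: uvs.count * MemoryLayout<simd_float2>.stride), type: .vertex)
let indexBuffer = allocator.newBuffer(with: Data(bytes: indices, count: indices.count * MemoryLayout<Int16>.stride), type: .index)

// Describe the two vertex buffers (positions and texture coordinates).
let descriptor = MDLVertexDescriptor()
descriptor.attributes[0] = MDLVertexAttribute(name: MDLVertexAttributePosition, format: .float3, offset: 0, bufferIndex: 0)
descriptor.attributes[1] = MDLVertexAttribute(name: MDLVertexAttributeTextureCoordinate, format: .float2, offset: 0, bufferIndex: 1)
descriptor.layouts[0] = MDLVertexBufferLayout(stride: MemoryLayout<simd_float3>.stride)
descriptor.layouts[1] = MDLVertexBufferLayout(stride: MemoryLayout<simd_float2>.stride)

// Step 3: submesh from the indices, mesh from the submesh plus vertex/UV buffers.
let submesh = MDLSubmesh(indexBuffer: indexBuffer, indexCount: faceGeometry.triangleCount * 3, indexType: .uInt16, geometryType: .triangles, material: nil)
let mesh = MDLMesh(vertexBuffers: [vertexBuffer, uvBuffer], vertexCount: vertices.count, descriptor: descriptor, submeshes: [submesh])

// Steps 4 and 5: wrap it in an asset and export; the .obj extension picks the format.
let asset = MDLAsset()
asset.add(mesh)
try asset.export(to: exportURL)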
That sequence doesn't require SceneKit (or Metal, or any ability to display the mesh) at all, which might prove useful depending on your needs. If you do want to involve SceneKit and Metal, you can probably skip a few steps (sketched after this list):
Create ARSCNFaceGeometry on your Metal device and pass it an ARFaceGeometry from a face tracking session.
Use MDLMesh(scnGeometry:) to get a Model I/O representation of that geometry, then follow steps 4-5 above to export it to an .obj file.
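Sketched under the same caveats (faceAnchor and exportURL are assumed to exist already):
import ARKit
import Metal
import ModelIO
import SceneKit.ModelIO

let device = MTLCreateSystemDefaultDevice()!
let scnFaceGeometry = ARSCNFaceGeometry(device: device)!
scnFaceGeometry.update(from: faceAnchor.geometry)   // ARFaceGeometry from the face tracking session

let asset = MDLAsset()
asset.add(MDLMesh(scnGeometry: scnFaceGeometry))
try asset.export(to: exportURL)                     // again, a URL ending in ".obj"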
Any way you slice it, though... if it's a strong requirement to model eyes and teeth, none of the Apple-provided options will help you because none of them do that. So, some food for thought:
Consider whether that's a strong requirement?
Replicate all of Apple's work to do your own face-model inference from color + depth image sequences?
Cheat on eye modeling using spheres centered according to the leftEyeTransform/rightEyeTransform reported by ARKit?
Cheat on teeth modeling using a pre-made model of teeth, composed with the ARKit-provided face geometry for display? (Articulate your inner-jaw model with a single open-shut joint and use ARKit's blendShapes[.jawOpen] to animate it alongside the face.)
I've been trying without success to extract face features, for instance the mouth, from ARSCNFaceGeometry in order to change their color or add a different material.
I understand I need to create an SCNGeometry, for which I have the SCNGeometrySource, but I haven't been able to create the SCNGeometryElement.
I have tried creating it from ARFaceAnchor in update(from faceGeometry: ARFaceGeometry), but so far without success.
I would really appreciate someone's help.
ARSCNFaceGeometry is a single mesh. If you want different areas of it to be different colors, your best bet is to apply a texture map (which you do in SceneKit by providing images for material property contents).
There’s no semantic information associated with the vertices in the mesh — that is, there’s nothing that says “this point is the tip of the nose, these points are the edge of the upper lip, etc”. But the mesh is topologically stable, so if you create a texture image that adds a bit of color around the lips or a lightning bolt over the eye or whatever, it’ll stay there as the face moves around.
If you need help getting started on painting a texture, there are a couple of things you could try:
Create a dummy texture first
Make a square image and fill it with a double gradient, such that the red and blue component for each pixel is based on the x and y coordinate of that pixel. Or some other distinctive pattern. Apply that texture to the model, and see how it looks — the landmarks in the texture will guide you where to paint.
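For example, a quick-and-dirty way to generate such a texture in code (the size and channel mapping are arbitrary choices):
import UIKit

// Red follows the x coordinate, blue follows the y coordinate, so every texel is distinct.
func makeUVDebugTexture(size: Int = 256) -> UIImage {
    let renderer = UIGraphicsImageRenderer(size: CGSize(width: size, height: size))
    return renderer.image { context in
        for y in 0..<size {
            for x in 0..<size {
                UIColor(red: CGFloat(x) / CGFloat(size - 1), green: 0,
                        blue: CGFloat(y) / CGFloat(size - 1), alpha: 1).setFill()
                context.fill(CGRect(x: x, y: y, width: 1, height: 1))
            }
        }
    }
}

// Apply it to the face geometry's material to see where each texel lands on the face:
// faceNode.geometry?.firstMaterial?.diffuse.contents = makeUVDebugTexture()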
Export the model
Create a dummy ARSCNFaceGeometry using the init(blendShapes:) initializer and an empty blendShapes dictionary (you don’t need an active ARFaceTracking session for this, but you do need an iPhone X). Use SceneKit’s scene export APIs (or Model I/O) to write that model out to a 3D file of some sort (.scn, which you can process further on the Mac, or something like .obj).
Import that file into your favorite 3D modeling tool (Blender, Maya, etc) and use that tool to paint a texture. Then use that texture in your app with real faces.
Actually, the above is sort of an oversimplification, even though it’s the simple answer for common cases. ARSCNFaceGeometry can actually contain up to four submeshes if you create it with the init(device:fillMesh:) initializer. But even then, those parts aren’t semantically labeled areas of the face — they’re the holes in the regular face model, flat fill-ins for the places where eyes and mouth show through.
This is an image from Apple's documentation. It shows a transform from a cube to a sphere and also to some random geometry.
Only a few lines lower they state:
A morpher and its target geometries may be loaded from a scene file or created programmatically. The base geometry and all target geometries must be topologically identical—that is, they must contain the same number and structural arrangement of vertices.
Could someone explain this paragraph because apparently I don't understand it.
Since a sphere will never have the same structural arrangement of vertices as a cube (at least I think so), it should be impossible to make the transformation. But hey, we all see it in the picture. I also tried to do the transformation, and I don't get the expected results. So how do you go from a sphere to a cube, or vice versa?
"Topologically identical" means that the relationships between vertices in a mesh must be preserved, but their locations in space can change. Here's an example of that in 2D:
These two meshes have the same eight vertices, connected to each other in the same ways, but their positions (and thus the shape they form) differ.
To do the same in 3D with SceneKit, you need custom vertex data — the primitive shapes that SceneKit can generate for you (like SCNSphere, SCNBox, and whatnot) all have different topologies, so they can't be used as morpher targets.
If you want to morph a box into a sphere, you'll need to generate your own box and sphere with identical topology. The "some random shape" in Apple's illustration is a hint at how you might do that — it appears to be one of the variants of a superellipsoid. If you use the equations in that Wikipedia page you can generate a set of points that can be either on a sphere or on a cube depending on other parameters. Vary those parameters to generate a couple of meshes, create SCNGeometry from those meshes, and you've got valid SCNMorpher targets.
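Here's a rough sketch of one way to do that, using a plain latitude/longitude grid instead of a superellipsoid; the grid resolution and the "divide by the largest component" cubify mapping are my own choices, not from Apple's illustration.
import SceneKit
import simd

// Build a triangle grid over the sphere's latitude/longitude parameterization,
// then let a closure decide where each grid point actually ends up in space.
func gridGeometry(rows: Int, cols: Int, map: (simd_float3) -> simd_float3) -> SCNGeometry {
    var positions: [SCNVector3] = []
    for r in 0...rows {
        let phi = Float(r) / Float(rows) * .pi            // 0 at one pole, pi at the other
        for c in 0...cols {
            let theta = Float(c) / Float(cols) * 2 * .pi  // once around the equator
            let p = simd_float3(sin(phi) * cos(theta), cos(phi), sin(phi) * sin(theta))
            let q = map(p)
            positions.append(SCNVector3(q.x, q.y, q.z))
        }
    }
    var indices: [Int32] = []
    for r in 0..<rows {
        for c in 0..<cols {
            let a = Int32(r * (cols + 1) + c), b = a + 1
            let d = a + Int32(cols + 1), e = d + 1
            indices += [a, d, b,  b, d, e]                // two triangles per grid quad
        }
    }
    let source = SCNGeometrySource(vertices: positions)
    let element = SCNGeometryElement(indices: indices, primitiveType: .triangles)
    return SCNGeometry(sources: [source], elements: [element])
}

// Same vertex count and same triangles, different positions: valid morph targets.
let sphere = gridGeometry(rows: 24, cols: 48) { $0 }
let cube = gridGeometry(rows: 24, cols: 48) { p in
    p / max(abs(p.x), max(abs(p.y), abs(p.z)))            // push every point out onto the unit cube
}

let node = SCNNode(geometry: sphere)
let morpher = SCNMorpher()
morpher.targets = [cube]
node.morpher = morpher
node.morpher?.setWeight(1.0, forTargetAt: 0)              // 0 = sphere, 1 = cube; animate in between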
You can see a simpler example of morphing in Apple's SceneKit WWDC 2014 Slides sample app.
You can't presume the locations of each vertex in the given images; the cube doesn't necessarily have eight vertices, and the left-most shape isn't guaranteed to have six.
Admittedly, I've not played with SCNMorpher, but from that description I imagine it will interpolate on a per-vertex basis (so the vertices will have to match up).
If it helps, picture the sphere as having a lot of 'dots' spread equally along its surface, which are pushed or squeezed to make the other shapes.
I'd like to build an app using the new GLKit framework, and I'm in need of some design advice. I'd like to create an app that will present up to a couple thousand "bricks" (objects with very simple geometry). Most will have identical texture, but up to a couple hundred will have unique texture. I'd like the bricks to appear every few seconds, move into place and then stay put (in world coords). I'd like to simulate a camera whose position and orientation are controlled by user gestures.
The advice I need is about how to organize the code. I'd like my model to be a collection of bricks that have a lot more than graphical data associated with them:
Does it make sense to associate a view-like object with each brick to handle geometry, texture, etc.?
Should every brick have its own vertex buffer?
Should each have its own GLKBaseEffect?
I'm looking for help organizing what object should do what during setup, then rendering.
I hope I can stay close to the typical MVC pattern, with my GLKViewController observing model state changes, controlling eye coordinates based on gestures, and so on.
Would be much obliged if you could give some advice or steer me toward a good example. Thanks in advance!
With respect to the models, I think an approach analogous to the relationship between UIImage and UIImageView is appropriate. So every type of brick has a single vertex buffer, GLKBaseEffect, texture, and whatever else; each brick may then appear multiple times, just as multiple UIImageViews may use the same UIImage. In terms of keeping multiple reference frames, it's actually a really good idea to build a hierarchy essentially equivalent to UIView, each node containing a transform relative to its parent and some of them being able to display a model.
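A rough sketch of that split (the class names and properties are just illustrative, not a prescribed design):
import GLKit

// Shared once per *kind* of brick: geometry, texture, and effect (the UIImage analogue).
final class BrickType {
    let vertexBuffer: GLuint      // a VBO you've already filled with this brick's geometry
    let vertexCount: GLsizei
    let effect = GLKBaseEffect()

    init(vertexBuffer: GLuint, vertexCount: GLsizei, texture: GLKTextureInfo) {
        self.vertexBuffer = vertexBuffer
        self.vertexCount = vertexCount
        effect.texture2d0.name = texture.name
        effect.texture2d0.enabled = GLboolean(GL_TRUE)
    }
}

// One per brick instance: just a transform plus a reference to the shared type (the UIImageView analogue).
final class Brick {
    let type: BrickType
    var modelMatrix = GLKMatrix4Identity
    init(type: BrickType) { self.type = type }
}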
From the GLKit documentation, I think the best way to keep the sort of camera you want (and indeed the object locations) is to store it directly as a GLKMatrix4 or a GLKQuaternion: instead of deriving the matrix or quaternion (plus location) from some other description of the camera, the matrix or quaternion itself is the storage for the camera.
Both of those types have functions built in to apply rotations, and GLKMatrix4 can directly handle translations, so you can map the relevant gestures directly to those functions.
The only slightly non-obvious thing I can think of when dealing with the camera in that way is that you want to send the inverse to OpenGL rather than the thing itself. Supposing you use a matrix, the reasoning is that if you wanted to draw an object at that location you'd load the matrix directly then draw the object. When you draw an object at the same location as the camera you want it to end up being drawn at the origin. So the matrix you have to load for the camera is the inverse of the matrix you'd load to draw at that location because you want the two multiplied together to be the identity matrix.
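A minimal sketch of that, with the gesture deltas and brick transform stubbed out as example values:
import GLKit

var cameraTransform = GLKMatrix4Identity   // the camera's own placement in world coordinates

// Gesture handlers mutate the camera transform directly, e.g.:
cameraTransform = GLKMatrix4Rotate(cameraTransform, .pi / 8, 0, 1, 0)   // yaw from a pan gesture
cameraTransform = GLKMatrix4Translate(cameraTransform, 0, 0, 2)         // dolly from a pinch gesture

// The view matrix handed to GL is the *inverse*, so an object placed exactly at the
// camera's transform ends up at the origin.
var invertible = false
let viewMatrix = GLKMatrix4Invert(cameraTransform, &invertible)

// Per-brick draw setup (brickModelMatrix is just an example placement).
let brickModelMatrix = GLKMatrix4MakeTranslation(1, 0, -5)
let effect = GLKBaseEffect()
effect.transform.projectionMatrix = GLKMatrix4MakePerspective(GLKMathDegreesToRadians(60), 16.0 / 9.0, 0.1, 100)
effect.transform.modelviewMatrix = GLKMatrix4Multiply(viewMatrix, brickModelMatrix)
// ...then call effect.prepareToDraw() inside the draw loop, with a current EAGLContext.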
I'm not sure how complicated the models for your bricks are but you could hit a performance bottleneck if they're simple and all moving completely independently. The general rule when dealing with OpenGL is that the more geometry you can submit at once, the faster everything goes. So, for example, an entirely static world like that in most games is much easier to draw efficiently than one where everything can move independently. If you're drawing six-sided cubes and moving them all independently then you may see worse performance than you might expect.
If you have any bricks that move in concert then it is more efficient to draw them as a single piece of geometry. If you have any bricks that definitely aren't visible then don't even try to draw them. As of iOS 5, GL_EXT_occlusion_query_boolean is available, which is a way to pass some geometry to OpenGL and ask if any of it is visible. You can use that in realtime scenes by building a hierarchical structure describing your data (which you'll already have if you've directly followed the UIView analogy), calculating or storing some bounding geometry for each view and doing the draw only if the occlusion query suggests that at least some of the bounding geometry would be visible. By following that sort of logic you can often discard large swathes of your geometry long before submitting it.