Face texture from ARKit - ios

I am running a face tracking configuration in ARKit with SceneKit. In each frame I can access the camera feed via the snapshot property or the capturedImage as a buffer. I have also been able to map each face vertex to the image coordinate space and add some UIView helpers (1-point squares) to display all the face vertices on the screen in real time, like this:
func renderer(_ renderer: SCNSceneRenderer, didUpdate node: SCNNode, for anchor: ARAnchor) {
    guard node.geometry is ARSCNFaceGeometry,
          let anchorFace = anchor as? ARFaceAnchor,
          anchorFace.isTracked
    else { return }
    let vertices = anchorFace.geometry.vertices
    for (index, vertex) in vertices.enumerated() {
        // Convert the vertex from face-local space to world space, then project it to screen coordinates.
        let projected = sceneView.projectPoint(node.convertPosition(SCNVector3(vertex), to: nil))
        let newPosition = CGPoint(x: CGFloat(projected.x), y: CGFloat(projected.y))
        // Here I update the position of each UIView on the screen with the newly calculated vertex position. I have an array of views that matches the vertex count, which is consistent across sessions.
    }
}
Since the UV coordinates are also constant across sessions, I am trying to draw, for each pixel that lies over the face mesh, its corresponding position in the UV texture, so that after some iterations I can save a person's face texture to a file.
I have come up with some theoretical solutions. One is to create a CGPath for each triangle and test whether each pixel is contained in that triangle. If it is, I would create a triangular image by cropping a rectangle and applying a triangle mask obtained from the triangle's vertices projected into image coordinates. That triangular image then has to be transformed to match the corresponding triangle in the UV texture (like skewing it in place). Then, in a 1024x1024 UIView, I would add each triangle image as a UIImageView subview and finally encode that UIView as a PNG. This sounds like a lot of work, specifically the part of matching the cropped triangle with the corresponding triangle in the UV texture.
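The triangle-matching step could perhaps be sketched with barycentric coordinates instead of masks: for a texel inside a UV triangle, compute its barycentric weights and apply the same weights to the projected image-space vertices to find the camera pixel to sample. The types and function names below (Point2, barycentric, imagePoint) are hypothetical and not part of ARKit; this is only a CPU-side sketch of the idea.

```swift
// Hypothetical 2D point type; not an ARKit or CoreGraphics type.
struct Point2 { var x: Double; var y: Double }

/// Barycentric weights of `p` with respect to triangle (a, b, c).
/// Returns nil when the triangle is degenerate (zero area).
func barycentric(_ p: Point2, _ a: Point2, _ b: Point2, _ c: Point2) -> (w0: Double, w1: Double, w2: Double)? {
    let det = (b.y - c.y) * (a.x - c.x) + (c.x - b.x) * (a.y - c.y)
    guard abs(det) > 1e-12 else { return nil }
    let w0 = ((b.y - c.y) * (p.x - c.x) + (c.x - b.x) * (p.y - c.y)) / det
    let w1 = ((c.y - a.y) * (p.x - c.x) + (a.x - c.x) * (p.y - c.y)) / det
    return (w0, w1, 1 - w0 - w1)
}

/// Maps a point in UV space to image space using one triangle's UV and
/// projected image coordinates. Returns nil when `uv` is outside the UV triangle.
func imagePoint(forUV uv: Point2, uvTriangle: [Point2], imageTriangle: [Point2]) -> Point2? {
    guard uvTriangle.count == 3, imageTriangle.count == 3,
          let w = barycentric(uv, uvTriangle[0], uvTriangle[1], uvTriangle[2]),
          w.w0 >= 0, w.w1 >= 0, w.w2 >= 0
    else { return nil }
    // The same weights interpolate the projected vertices.
    let x = w.w0 * imageTriangle[0].x + w.w1 * imageTriangle[1].x + w.w2 * imageTriangle[2].x
    let y = w.w0 * imageTriangle[0].y + w.w1 * imageTriangle[1].y + w.w2 * imageTriangle[2].y
    return Point2(x: x, y: y)
}
```

Looping over the texels of the output texture this way would avoid creating one image per triangle, at the cost of doing the sampling by hand.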
In the Apple demo project there is an image that shows what that UV texture looks like. If you edit this image and add some colors, they then show up on the face. But I need the other way around: from what I am seeing in the camera feed, create a texture of your face. In the same demo project there is an example that does exactly what I need, but with a shader, and with no clues on how to extract the texture to a file. The shader code looks like this:
/*
<samplecode>
<abstract>
SceneKit shader (geometry) modifier for texture mapping ARKit camera video onto the face.
</abstract>
</samplecode>
*/
#pragma arguments
float4x4 displayTransform // from ARFrame.displayTransform(for:viewportSize:)
#pragma body
// Transform the vertex to the camera coordinate system.
float4 vertexCamera = scn_node.modelViewTransform * _geometry.position;
// Camera projection and perspective divide to get normalized viewport coordinates (clip space).
float4 vertexClipSpace = scn_frame.projectionTransform * vertexCamera;
vertexClipSpace /= vertexClipSpace.w;
// XY in clip space is [-1,1]x[-1,1], so adjust to UV texture coordinates: [0,1]x[0,1].
// Image coordinates are Y-flipped (upper-left origin).
float4 vertexImageSpace = float4(vertexClipSpace.xy * 0.5 + 0.5, 0.0, 1.0);
vertexImageSpace.y = 1.0 - vertexImageSpace.y;
// Apply ARKit's display transform (device orientation * front-facing camera flip).
float4 transformedVertex = displayTransform * vertexImageSpace;
// Output as texture coordinates for use in later rendering stages.
_geometry.texcoords[0] = transformedVertex.xy;
/**
* MARK: Post-process special effects
*/
Honestly, I do not have much experience with shaders, so any help translating the shader into more Cocoa Touch style Swift code would be appreciated. Right now I am not yet thinking about performance, so if it has to be done on the CPU, on a background thread, or offline, that is OK. In any case I will have to choose the right frames to avoid skewed samples, or triangles with very good information next to others with barely a few stretched pixels (for example, sample a triangle only if its normal is pointing toward the camera), or add other UI helpers to make the user turn their face so the whole face is sampled correctly.
I have already checked this post and this post but cannot get it to work.
This app does exactly what I need, but it does not seem to be using ARKit.
Thanks.

Related

Detect a object using camera and position a 3D object using ARKit in iOS

What am I looking for?
A simple explanation of my requirement is this
Using ARKit, detect an object using iPhone camera
Find the position of this object on this virtual space
Place a 3D object in this virtual space using SceneKit. The 3D object should be behind the marker.
An example would be to detect a small image/marker position in a 3D space using camera, place another 3D ball model behind this marker in virtual space (so the ball will be hidden from the user because the marker/image is in front)
What I am able to do so far?
I am able to detect a marker/image using ARKit
I am able to position a ball 3D model on the screen.
What is my problem?
I am unable to position the ball in such a way that ball is behind the marker that is detected.
When the ball is in front of the marker, the ball correctly hides the marker. You can see in the side view that the ball is in front of the marker.
But when the ball is behind the marker, the opposite doesn't happen. The ball is always seen in front, blocking the marker. I expected the marker to hide the ball. So the scene is not respecting the z depth of the ball's position.
Code
Please look into the comments as well
override func viewDidLoad() {
    super.viewDidLoad()
    sceneView.delegate = self
    sceneView.autoenablesDefaultLighting = true
    //This loads my 3d model.
    let ballScene = SCNScene(named: "art.scnassets/ball.scn")
    ballNode = ballScene?.rootNode
    //The model I have is too big. Scaling it here.
    ballNode?.scale = SCNVector3Make(0.1, 0.1, 0.1)
}
override func viewWillAppear(_ animated: Bool) {
    super.viewWillAppear(animated)
    //I am trying to detect a marker/image. So ImageTracking configuration is enough
    let configuration = ARImageTrackingConfiguration()
    //Load the image/marker and set it as tracking image
    //There is only one image in this set
    if let trackingImages = ARReferenceImage.referenceImages(inGroupNamed: "Markers",
                                                             bundle: Bundle.main) {
        configuration.trackingImages = trackingImages
        configuration.maximumNumberOfTrackedImages = 1
    }
    sceneView.session.run(configuration)
}
override func viewWillDisappear(_ animated: Bool) {
    super.viewWillDisappear(animated)
    sceneView.session.pause()
}
func renderer(_ renderer: SCNSceneRenderer, nodeFor anchor: ARAnchor) -> SCNNode? {
    let node = SCNNode()
    if anchor is ARImageAnchor {
        //my image is detected
        if let ballNode = self.ballNode {
            //for some reason changing the y position translates the ball in z direction
            //Positive y value moves it towards the screen (in front of the marker)
            ballNode.position = SCNVector3(0.0, 0.02, 0.0)
            //Negative y value moves it away from the screen (behind the marker)
            ballNode.position = SCNVector3(0.0, -0.02, 0.0)
            node.addChildNode(ballNode)
        }
    }
    return node
}
How to make the scene to respect the z position? Or in other words, how to show a 3D model behind an image/marker that has been detected using ARKit framework?
I am running against iOS 12, using Xcode 10.3. Let me know if any other information is needed.
To achieve that you need to create an occluder in the 3D scene. Since an ARReferenceImage has a physicalSize it should be straightforward to add a geometry in the scene when the ARImageAnchor is created.
The geometry would be a SCNPlane with a SCNMaterial appropriate for an occluder. I would opt for a SCNLightingModelConstant lighting model (it's the cheapest and we won't actually draw the plane) with a colorBufferWriteMask equal to SCNColorMaskNone. The object should be transparent but still write in the depth buffer (that's how it will act as an occluder).
Finally, make sure that the occluder is rendered before any augmented object by setting its renderingOrder to -1 (or an even lower value if the app already uses rendering orders).
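A minimal sketch of such an occluder, assuming it is built from the reference image's physicalSize and added as a child of the node returned for the ARImageAnchor. The function name makeOccluderNode is hypothetical, and the -90 degree rotation assumes the detected image lies in the anchor's x-z plane (which matches the question's observation that the y value moves the ball toward or away from the marker).

```swift
import SceneKit

/// Builds an invisible plane that writes only to the depth buffer.
/// `physicalSize` is assumed to come from `imageAnchor.referenceImage.physicalSize`.
func makeOccluderNode(physicalSize: CGSize) -> SCNNode {
    let plane = SCNPlane(width: physicalSize.width, height: physicalSize.height)

    let material = SCNMaterial()
    material.lightingModel = .constant   // cheapest model; the plane is never visibly shaded
    material.colorBufferWriteMask = []   // equivalent of SCNColorMaskNone: write no color
    material.writesToDepthBuffer = true  // still write depth so it hides objects behind it
    material.isDoubleSided = true
    plane.materials = [material]

    let node = SCNNode(geometry: plane)
    // SCNPlane is vertical by default; lay it flat onto the detected image (assumption, see above).
    node.eulerAngles.x = -.pi / 2
    // Render before any augmented object.
    node.renderingOrder = -1
    return node
}
```

In renderer(_:nodeFor:) this could be used as node.addChildNode(makeOccluderNode(physicalSize: imageAnchor.referenceImage.physicalSize)) before adding the ball.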
In ARKit 3.0 Apple engineers implemented a ZDepth compositing technique called People Occlusion. This feature is available only on devices with A12 and A13 chips because it's highly processor intensive. At the moment ARKit's ZDepth compositing feature is in its infancy, hence it only allows you to composite people (or people-like objects) over and under the background, not any other object seen via the rear camera. And, I think, you know about the front TrueDepth camera: it's for face tracking and it has an additional IR sensor for this task.
To turn the ZDepth compositing feature on, use these properties in ARKit 3.0:
var frameSemantics: ARConfiguration.FrameSemantics { get set }
static var personSegmentationWithDepth: ARConfiguration.FrameSemantics { get }
Real code should look like this:
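// Reuse the session's current configuration if it is a world-tracking one,
// and only enable the semantics on devices that support them.
if let config = mySession.configuration as? ARWorldTrackingConfiguration,
   ARWorldTrackingConfiguration.supportsFrameSemantics(.personSegmentationWithDepth) {
    config.frameSemantics.insert(.personSegmentationWithDepth)
    mySession.run(config)
}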
After alpha channel's segmentation a formula for every channel computation looks like this:
r = Az > Bz ? Ar : Br
g = Az > Bz ? Ag : Bg
b = Az > Bz ? Ab : Bb
a = Az > Bz ? Aa : Ba
where Az is the ZDepth channel of the Foreground image (3D model)
Bz is the ZDepth channel of the Background image (2D video)
Ar, Ag, Ab, Aa – Red, Green, Blue and Alpha channels of 3D model
Br, Bg, Bb, Ba – Red, Green, Blue and Alpha channels of 2D video
But in early versions of ARKit there's no ZDepth compositing feature, so you can composite a 3D model over 2D background video only using standard 4-channel compositing OVER operation:
(Argb * Aa) + (Brgb * (1 - Aa))
where Argb is RGB channels of Foreground A image (3D model)
Aa is an Alpha channel of Foreground A image (3D model)
Brgb is RGB channels of Background B image (2D video)
(1 - Aa) is an inversion of Foreground Alpha channel
As a result, without personSegmentationWithDepth property your 3D model will always be OVER a 2D video.
Thus, if an object in the video doesn't look like a human hand or a human body, you can't place that object from the 2D video over a 3D model using regular ARKit tools.
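The two formulas above can be sketched per pixel as follows. This is only an illustration of the arithmetic: the Pixel type and the function names are hypothetical, the depth comparison follows the formula exactly as written (the layer with the larger Z value wins), and ARKit performs this compositing internally rather than exposing such functions.

```swift
/// Hypothetical pixel with color, alpha and ZDepth channels in the 0...1 range.
struct Pixel: Equatable { var r, g, b, a, z: Double }

/// ZDepth compositing: every channel is taken from whichever layer wins the depth test.
func depthComposite(foreground fg: Pixel, background bg: Pixel) -> Pixel {
    return fg.z > bg.z ? fg : bg
}

/// Standard OVER operation: (Argb * Aa) + (Brgb * (1 - Aa)).
func over(foreground fg: Pixel, background bg: Pixel) -> (r: Double, g: Double, b: Double) {
    let k = 1 - fg.a  // inverted foreground alpha
    return (fg.r * fg.a + bg.r * k,
            fg.g * fg.a + bg.g * k,
            fg.b * fg.a + bg.b * k)
}
```

With OVER, the background only shows through where the foreground alpha is below 1, regardless of depth, which is why the 3D model always ends up on top of the video.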
.....
Nonetheless, you can do it using Metal and AVFoundation frameworks. Consider – it's not easy.
To extract ZDepth data from video stream you need the following instance property:
// Works from iOS 11
var capturedDepthData: AVDepthData? { get }
Or you may use these two instance methods (remember ZDepth channel must be 32-bit):
// Works from iOS 13
func generateDilatedDepth(from frame: ARFrame,
commandBuffer: MTLCommandBuffer) -> MTLTexture
func generateMatte(from frame: ARFrame,
commandBuffer: MTLCommandBuffer) -> MTLTexture
Please read this SO post if you wanna know how to do it using Metal.
For additional information, please read this SO post.

Rendering MTLTexture on MTKView is not keeping aspect ratio

I have a texture that's 1080x1920 pixels. And I'm trying to render it on a MTKView that isn't the same aspect ratio. (i.e iPad/iPhone X full screen).
This is how I'm rendering the texture for the MTKView:
private func render(_ texture: MTLTexture, withCommandBuffer commandBuffer: MTLCommandBuffer, device: MTLDevice) {
    guard let currentRenderPassDescriptor = metalView?.currentRenderPassDescriptor,
          let currentDrawable = metalView?.currentDrawable,
          let renderPipelineState = renderPipelineState,
          let encoder = commandBuffer.makeRenderCommandEncoder(descriptor: currentRenderPassDescriptor) else {
        semaphore.signal()
        return
    }
    encoder.pushDebugGroup("RenderFrame")
    encoder.setRenderPipelineState(renderPipelineState)
    encoder.setFragmentTexture(texture, index: 0)
    encoder.drawPrimitives(type: .triangleStrip, vertexStart: 0, vertexCount: 4, instanceCount: 1)
    encoder.popDebugGroup()
    encoder.endEncoding()
    // Called after the command buffer is scheduled
    commandBuffer.addScheduledHandler { [weak self] _ in
        guard let strongSelf = self else {
            return
        }
        strongSelf.didRender(texture: texture)
        strongSelf.semaphore.signal()
    }
    commandBuffer.present(currentDrawable)
    commandBuffer.commit()
}
I want the texture to be rendered like .scaleAspectFill on a UIView and I'm trying to learn Metal so I'm not sure where I should be looking for this (the .metal file, the pipeline, the view itself, the encoder, etc.)
Thanks!
Edit: Here is the shader code:
#include <metal_stdlib>
using namespace metal;
typedef struct {
    float4 renderedCoordinate [[position]];
    float2 textureCoordinate;
} TextureMappingVertex;
vertex TextureMappingVertex mapTexture(unsigned int vertex_id [[ vertex_id ]]) {
    float4x4 renderedCoordinates = float4x4(float4( -1.0, -1.0, 0.0, 1.0 ),
                                            float4(  1.0, -1.0, 0.0, 1.0 ),
                                            float4( -1.0,  1.0, 0.0, 1.0 ),
                                            float4(  1.0,  1.0, 0.0, 1.0 ));
    float4x2 textureCoordinates = float4x2(float2( 0.0, 1.0 ),
                                           float2( 1.0, 1.0 ),
                                           float2( 0.0, 0.0 ),
                                           float2( 1.0, 0.0 ));
    TextureMappingVertex outVertex;
    outVertex.renderedCoordinate = renderedCoordinates[vertex_id];
    outVertex.textureCoordinate = textureCoordinates[vertex_id];
    return outVertex;
}
fragment half4 displayTexture(TextureMappingVertex mappingVertex [[ stage_in ]],
                              texture2d<float, access::sample> texture [[ texture(0) ]]) {
    constexpr sampler s(address::clamp_to_edge, filter::linear);
    return half4(texture.sample(s, mappingVertex.textureCoordinate));
}
A few general things to start with when dealing with Metal textures or Metal in general:
You should take into account the difference between points and pixels, refer to the documentation here. The frame property of a UIView subclass (as MTKView is one) always gives you the width and the height of the view in points.
The mapping from points to actual pixels is controlled through the contentScaleFactor option. The MTKView automatically selects a texture with a fitting aspect ratio that matches the actual pixels of your device. For example, the underlying texture of a MTKView on the iPhone X would have a resolution of 2436 x 1125 (the actual display size in pixels). This is documented here: "The MTKView class automatically supports native screen scale. By default, the size of the view’s current drawable is always guaranteed to match the size of the view itself."
As documented here, the .scaleAspectFill option "scale[s] the content to fill the size of the view. Some portion of the content may be clipped to fill the view’s bounds". You want to simulate this behavior.
Rendering with Metal is nothing more than "drawing" to the resolve texture, which is automatically set by the MTKView. However, you still have full control and could do it on your own by manually creating textures and setting them in your renderPassDescriptor. But you don't need to care about this right now. The single thing you should care about is which part of the 1080x1920 pixel texture you want to render, and where in your resolve texture (which might have a different aspect ratio) you want to render it. We want to fully fill ("scaleAspectFill") the resolve texture, so we leave the renderedCoordinates in your vertex shader as they are. They define a rectangle over the whole resolve texture, which means the fragment shader is called for every single pixel in the resolve texture. In what follows, we will simply change the texture coordinates.
Let's define the aspect ratio as ratio = width / height, the resolve texture as r_tex and the texture you want to render as tex.
So assuming your resolve texture does not have the same aspect ratio, there are two possible scenarios:
The aspect ratio of the texture that you want to render is larger than the aspect ratio of your resolve texture (the texture Metal renders to), which means the texture you want to render is relatively wider than the resolve texture. In this case we leave the y values of the coordinates as they are. The x values of the texture coordinates will be changed. The formulas below assume r_tex has the same height in pixels as tex; if it does not, first scale r_tex.width by tex.height / r_tex.height:
x_left = 0 + ((tex.width - r_tex.width) / 2.0)
x_right = tex.width - ((tex.width - r_tex.width) / 2.0)
These values must be normalized because the texture sampler needs coordinates in the range from 0 to 1:
x_left = x_left / tex.width
x_right = x_right / tex.width
We have our new texture coordinates:
topLeft = float2(x_left,0)
topRight = float2(x_right,0)
bottomLeft = float2(x_left,1)
bottomRight = float2(x_right,1)
This will have the effect that nothing of the top or the bottom of your texture will be cut off, but some outer parts at the left and right side will be clipped, i.e. not visible.
The aspect ratio of your texture that you want to render is smaller than the aspect ratio of your resolve texture. The procedure is the same as with first scenario, but this time we will change the y coordinates
This should render your texture so that the resolve texture is completely filled and the aspect ratio of your texture is maintained on the x-axis. Maintaining the y-axis will work similarly. Additionally you have to check which side of the texture is larger/smaller and incorporate this in your calculation. This will clip parts of your texture as it would be when using scaleAspectFill. Be aware that the above solution is untested. But I hope it is helpful. Be sure to visit Metal Best Practices documentation from time to time, it's very helpful to get the basic concepts right. Have fun with Metal!
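As a rough sketch of both scenarios in app code, the function below works directly with aspect ratios, so it does not require the two textures to share a pixel height. The names TexCoordRect and aspectFillTexCoords are hypothetical; the returned values would replace the hard-coded 0.0 and 1.0 texture coordinates, for example by passing them to the vertex shader in a buffer.

```swift
/// Normalized texture-coordinate rectangle (all values in 0...1, origin at the top-left).
struct TexCoordRect { var left, right, top, bottom: Double }

/// Computes the part of the source texture that stays visible when it is
/// scaled to fill a render target with a different aspect ratio.
func aspectFillTexCoords(textureWidth: Double, textureHeight: Double,
                         targetWidth: Double, targetHeight: Double) -> TexCoordRect {
    let textureRatio = textureWidth / textureHeight
    let targetRatio = targetWidth / targetHeight
    if textureRatio > targetRatio {
        // Texture is relatively wider: keep the full height, clip left and right.
        let visible = targetRatio / textureRatio
        let inset = (1 - visible) / 2
        return TexCoordRect(left: inset, right: 1 - inset, top: 0, bottom: 1)
    } else {
        // Texture is relatively taller (or equal): keep the full width, clip top and bottom.
        let visible = textureRatio / targetRatio
        let inset = (1 - visible) / 2
        return TexCoordRect(left: 0, right: 1, top: inset, bottom: 1 - inset)
    }
}
```

For the textures in the question this would be called as aspectFillTexCoords(textureWidth: 1080, textureHeight: 1920, targetWidth: 1125, targetHeight: 2436), using the drawable's pixel size rather than the view's size in points.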
So your vertex shader pretty directly dictates that the source texture be stretched to the dimensions of the viewport. You are rendering a quad that fills the viewport, because its coordinates are at the extremes ([-1, 1]) of the Normalized Device Coordinate system in the horizontal and vertical directions.
And you are mapping the source texture corner-to-corner over that same range. That's because you specify the extremes of texture coordinate space ([0, 1]) for the texture coordinates.
There are various approaches to achieve what you want. You could pass the vertex coordinates in to the shader via a buffer, instead of hard-coding them. That way, you can compute the appropriate values in app code. You'd compute the desired destination coordinates in the render target, expressed in NDC. So, conceptually, something like left_ndc = (left_pixel / target_width) * 2 - 1, etc.
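The pixel-to-NDC conversion mentioned above might look like this in app code. The function names are hypothetical, and the vertical variant assumes pixel coordinates grow downward while NDC y grows upward.

```swift
/// Maps a horizontal pixel coordinate in the render target to NDC:
/// the left edge becomes -1 and the right edge becomes +1.
func ndcX(pixel: Double, targetWidth: Double) -> Double {
    return (pixel / targetWidth) * 2 - 1
}

/// Vertical variant. Assumes pixel y grows downward and NDC y grows upward,
/// hence the flipped sign.
func ndcY(pixel: Double, targetHeight: Double) -> Double {
    return 1 - (pixel / targetHeight) * 2
}
```

Coordinates outside the target (for example a negative left_pixel) map outside [-1, 1], which is what an aspect-fill quad needs: the part of the quad beyond the render target is simply clipped.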
Alternatively, and probably easier, you can leave the shader as-is and change the viewport for the draw operation to target only the appropriate portion of the render target.

GPUImage: How to determine an average color within the hexagon?

I’m doing video processing with GPUImage2. When the app starts, I create a hexagonal grid and add it to my cameraView. The grid is fullscreen and consists of about 100 hexagons.
In general, what I’m trying to achieve is
For each frame I want to find an average color (in RGB or even better HSV) within each cell of the grid.
When the color is determined, I want to draw something in the center of each hexagon depending on its average color.
I have an array with hexagons, each of them knows its vertexes’ coordinates and center.
I also have an array with UIBezierPaths which contains bounds of these hexagons (just in case).
So my code looks like this
class ViewController: UIViewController {
    var hexagons = [HKHexagon]()
    var hexagonsBounds = [UIBezierPath]()
    let averageColorExtractor = AverageColorExtractor()
    override func viewDidLoad() {
        super.viewDidLoad()
        do {
            camera = try Camera(sessionPreset:AVCaptureSessionPreset1920x1080)
            camera.delegate = self
            cameraView.orientation = .landscapeLeft
            camera --> cameraView
            camera.startCapture()
            drawGrid()
        } catch {
            fatalError("Could not initialize rendering pipeline: \(error)")
        }
    }
}
extension ViewController: CameraDelegate {
    func didCaptureBuffer(_ sampleBuffer: CMSampleBuffer) {
        for hexagon in hexagons {
        }
    }
}
I guess didCaptureBuffer() should be the place to apply averageColorExtractor to each hexagon, but I don’t have an idea what to do next.
I am new to iOS development and it’s the first time I use GPUImage2… Please, guide me in the right direction.
Not coding for your platform at all, but GPU architecture allows you to do it like this:
1. pass the image as a texture
2. render the center points only as points
3. in the fragment shader compute the avg color of the hex around the actual position
This is the hardest and most performance-demanding part. If you compute just the inscribed circle it is easy, but for a hexagon you need to compute which texels are inside and which are not. For axis-aligned hexagons you can divide the hex into regions (2x rectangle, 4x triangle); for rotated hexes you need to add a transformation matrix.
4. compute/render the output inside the center point.
I do not know what your framework can do for you from this. If your rendered stuff is bigger than just the center point, then you need to either use another pass in your render or use a bigger primitive than points in #2, but that means you will compute the avg color for each rendered pixel, which can slow things down a lot.
Take a look at GLSL shader that uses this technique (for entirely different task but the technique is the same):
How to implement 2D raycasting light effect in GLSL
If this is not adaptable to your platform then ignore this answer ...
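For comparison, the inside test and averaging from step 3 can be sketched on the CPU. This is not GPUImage2 code and not a shader: the Vec2 and RGB types and both functions are hypothetical, the frame is assumed to be already available as rows of RGB values, and the inside test uses cross products against each edge (valid for any convex polygon) instead of splitting the hexagon into rectangles and triangles.

```swift
struct Vec2 { var x, y: Double }
struct RGB { var r, g, b: Double }

/// True when `p` lies inside (or on the border of) a convex polygon whose
/// vertices are listed in a consistent winding order.
func isInside(_ p: Vec2, convexPolygon v: [Vec2]) -> Bool {
    var sign = 0.0
    for i in v.indices {
        let a = v[i], b = v[(i + 1) % v.count]
        // The cross product tells on which side of edge a->b the point lies.
        let cross = (b.x - a.x) * (p.y - a.y) - (b.y - a.y) * (p.x - a.x)
        if cross != 0 {
            if sign != 0 && (cross > 0) != (sign > 0) { return false }
            sign = cross
        }
    }
    return true
}

/// Averages all pixels whose centers fall inside the polygon; nil if none do.
func averageColor(of pixels: [[RGB]], inside polygon: [Vec2]) -> RGB? {
    var sum = RGB(r: 0, g: 0, b: 0)
    var count = 0.0
    for (row, line) in pixels.enumerated() {
        for (col, px) in line.enumerated() {
            let center = Vec2(x: Double(col) + 0.5, y: Double(row) + 0.5)
            guard isInside(center, convexPolygon: polygon) else { continue }
            sum.r += px.r; sum.g += px.g; sum.b += px.b
            count += 1
        }
    }
    return count > 0 ? RGB(r: sum.r / count, g: sum.g / count, b: sum.b / count) : nil
}
```

Doing this per frame for about 100 hexagons on a 1920x1080 buffer would likely be slow on the CPU, which is the reason the answer suggests moving the same logic into a fragment shader.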

How to convert camera extrinsic matrix to SCNCamera position and rotation

I'm trying to achieve Augmented Reality with SceneKit.
I got an intrinsic camera matrix and an extrinsic matrix by estimating the pose of a marker using ArUco (an OpenCV augmented reality library).
And I set up the SCNCamera's projectionTransform with parameters of the intrinsic matrix (fovy, aspect, zNear, zFar).
Normally in OpenGL, world coordinates relative to camera coordinates are calculated with the ModelView matrix, but in SceneKit there is no such thing as a modelView.
So I calculated the inverse of the extrinsic matrix to get the camera coordinates relative to the world coordinates (the marker coordinates).
And I think I've got the correct camera position from the inverse matrix, which contains the rotation and translation.
However, I cannot get the camera's rotation from that.
Do you have any ideas?
SceneKit has the same view matrixes that you've come across in OpenGL, they're just a little hidden until you start toying with shaders. A little too hidden IMO.
You seem to have most of this figured out. The projection matrix comes from your camera projectionTransform, and the view matrix comes from the inverse of your camera matrix SCNMatrix4Invert(cameraNode.transform). In my case everything was in world coordinates making my model matrix a simple identity matrix.
The code I ended up using to get the classic model-view-projection matrix was something like...
let projection = camera.projectionTransform
let view = SCNMatrix4Invert(cameraNode.transform)
let model = SCNMatrix4Identity
let viewProjection = SCNMatrix4Mult(view, projection)
let modelViewProjection = SCNMatrix4Mult(model, viewProjection)
For some reason I found SCNMatrix4Mult(...) took arguments in a different order than I was expecting (eg; opposite to GLKMatrix4Multiply(...)).
I'm still not 100% on this, so would welcome edits/tips. Using this method I was unable to get the SceneKit MVP matrix (as passed to shader) to match up with that calculated by the code above... but it was close enough for what I needed.
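To illustrate the inverse relationship for the extrinsic matrix from the question, here is a small sketch, assuming the extrinsic is a rigid world-to-camera transform [R | t] with R stored row-major. The inverse then has rotation R transposed and translation -(R transposed) * t, so both the camera's orientation and position in marker coordinates come out of the same inversion. The names Mat3 and cameraPose are hypothetical, and the sketch does not handle the axis-convention difference between OpenCV (y down, z forward) and SceneKit (y up, camera looking along -z).

```swift
typealias Mat3 = [[Double]]  // row-major 3x3 rotation matrix

func transpose(_ m: Mat3) -> Mat3 {
    return (0..<3).map { col in (0..<3).map { row in m[row][col] } }
}

func multiply(_ m: Mat3, _ v: [Double]) -> [Double] {
    return m.map { row in zip(row, v).reduce(0.0) { $0 + $1.0 * $1.1 } }
}

/// Inverts a rigid world-to-camera transform [R | t].
/// Returns the camera's orientation and position expressed in world (marker) coordinates.
func cameraPose(rotation r: Mat3, translation t: [Double]) -> (orientation: Mat3, position: [Double]) {
    let rT = transpose(r)                       // inverse of a rotation is its transpose
    let position = multiply(rT, t).map { -$0 }  // camera center = -R^T * t
    return (rT, position)
}
```

The resulting orientation matrix and position could then be packed into the camera node's transform, after converting between the two axis conventions.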
@lock's answer looks good, with a couple of additions:
(1) access SCNNode worldTransform instead of transform in case the cameraNode is animated or parented:
let view = SCNMatrix4Invert(cameraNode.presentation.worldTransform)
(2) the code doesn't account for the view's aspect ratio. e.g., assuming a perspective projection, you'll want to do:
perspMatrix.m11 /= viewportAR; //if using Yfov -> adjust Y
/* or, */
perspMatrix.m22 *= viewportAR; //if using Xfov -> adjust X
Where, viewportAR = viewport.width / viewport.height
Another way to do it is to have one node with a renderer delegate in the scene, and retrieve SceneKit’s matrices from that delegate (they are passed as options):
FOUNDATION_EXTERN NSString * const SCNModelTransform;
FOUNDATION_EXTERN NSString * const SCNViewTransform;
FOUNDATION_EXTERN NSString * const SCNProjectionTransform;
FOUNDATION_EXTERN NSString * const SCNNormalTransform;
FOUNDATION_EXTERN NSString * const SCNModelViewTransform;
FOUNDATION_EXTERN NSString * const SCNModelViewProjectionTransform;

Displacement Map UV Mapping?

Summary
I'm trying to apply a displacement map (Height map) to a rather simple object (Hexagonal plane) and I'm having some unexpected results. I am using grayscale and as such, I was under the impression my height map should only be affecting the Z values of my mesh. However, the displacement map I've created stretches the mesh across the X and Y planes. Furthermore, it doesn't seem to use the UV mapping I've created that all other textures are successfully applied to.
Model and UV Map
Here are reference images of my hexagonal mesh and its corresponding UV map in Blender.
Diffuse and Displacement Textures
These are the diffuse and displacement map textures I am applying to my mesh through Three.JS.
Renders
When I render the plane without a displacement map, you can see that the hexagonal plane stays within the lines. However, when I add the displacement map it clearly affects the X and Y positions of the vertices rather than affecting only the Z, expanding the plane well over the lines.
Code
Here's the relevant Three.js code:
// Textures
var diffuseTexture = THREE.ImageUtils.loadTexture('diffuse.png', null, loaded);
var displacementTexture = THREE.ImageUtils.loadTexture('displacement.png', null, loaded);
// Terrain Uniform
var terrainShader = THREE.ShaderTerrain["terrain"];
var uniformsTerrain = THREE.UniformsUtils.clone(terrainShader.uniforms);
//uniformsTerrain["tNormal"].value = null;
//uniformsTerrain["uNormalScale"].value = 1;
uniformsTerrain["tDisplacement"].value = displacementTexture;
uniformsTerrain["uDisplacementScale"].value = 1;
uniformsTerrain[ "tDiffuse1" ].value = diffuseTexture;
//uniformsTerrain[ "tDetail" ].value = null;
uniformsTerrain[ "enableDiffuse1" ].value = true;
//uniformsTerrain[ "enableDiffuse2" ].value = true;
//uniformsTerrain[ "enableSpecular" ].value = true;
//uniformsTerrain[ "uDiffuseColor" ].value.setHex(0xcccccc);
//uniformsTerrain[ "uSpecularColor" ].value.setHex(0xff0000);
//uniformsTerrain[ "uAmbientColor" ].value.setHex(0x0000cc);
//uniformsTerrain[ "uShininess" ].value = 3;
//uniformsTerrain[ "uRepeatOverlay" ].value.set(6, 6);
// Terrain Material
var material = new THREE.ShaderMaterial({
    uniforms:uniformsTerrain,
    vertexShader:terrainShader.vertexShader,
    fragmentShader:terrainShader.fragmentShader,
    lights:true,
    fog:true
});
// Load Tile
var loader = new THREE.JSONLoader();
loader.load('models/hextile.js', function(g) {
    //g.computeFaceNormals();
    //g.computeVertexNormals();
    g.computeTangents();
    g.materials[0] = material;
    tile = new THREE.Mesh(g, new THREE.MeshFaceMaterial());
    scene.add(tile);
});
Hypothesis
I'm currently juggling three possibilities as to why this could be going wrong:
The UV map is not applying to my displacement map.
I've made the displacement map incorrectly.
I've missed a crucial step in the process that would lock the displacement to Z-only.
And of course, secret option #4 which is none of the above and I just really have no idea what I'm doing. Or any mixture of the aforementioned.
Live Example
You can view a live example here.
If anybody with more knowledge on the subject could guide me I'd be very grateful!
Edit 1: As per suggestion, I've commented out computeFaceNormals() and computeVertexNormals(). While it did make a slight improvement, the mesh is still being warped.
In your terrain material, set wireframe = true, and you will be able to see what is happening.
Your code and textures are basically fine. The problem occurs when you compute vertex normals in the loader callback function.
The computed vertex normals for the outer ring of your geometry point somewhat outward. This is most likely because in computeVertexNormals() they are computed by averaging the face normals of each neighboring face, and the face normals of the "sides" of your model (the black part) are averaged into the vertex normal calculation for those vertices that make up the outer ring of the "cap".
As a result, the outer ring of the "cap" expands outward under the displacement map.
EDIT: Sure enough, straight from your model, the vertex normals of the outer ring point outward. The vertex normals for the inner rings are all parallel. Perhaps Blender is using the same logic to generate vertex normals as computeVertexNormals() does.
The problem is how your object is constructed, because the displacement happens along the normal vector.
The code is here:
https://github.com/mrdoob/three.js/blob/master/examples/js/ShaderTerrain.js#L348-350
"vec3 dv = texture2D( tDisplacement, uvBase ).xyz;",
This takes the RGB vector of the displacement texture.
"float df = uDisplacementScale * dv.x + uDisplacementBias;",
This takes only the red value of the vector, and since uDisplacementScale is normally 1.0 and uDisplacementBias is 0.0, df is just that red value.
"vec3 displacedPosition = normal * df + position;",
This displaces the position along the normal vector.
So to solve it, you either update the normals or the shader.
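The effect of those three shader lines can be reproduced with plain arithmetic. The sketch below is Swift rather than GLSL, and the Vec3 type and displaced function are hypothetical; it only shows why a normal that tilts outward moves a vertex in X and Y as well as Z.

```swift
struct Vec3: Equatable { var x, y, z: Double }

/// Mirrors the shader: df = scale * height + bias, then position + normal * df.
func displaced(position: Vec3, normal: Vec3, height: Double,
               scale: Double = 1, bias: Double = 0) -> Vec3 {
    let df = scale * height + bias
    return Vec3(x: normal.x * df + position.x,
                y: normal.y * df + position.y,
                z: normal.z * df + position.z)
}

// A normal pointing straight up changes only z.
let straight = displaced(position: Vec3(x: 1, y: 0, z: 0),
                         normal: Vec3(x: 0, y: 0, z: 1), height: 0.5)
// A normal tilted outward (as on the outer ring) also pushes the vertex along x.
let tilted = displaced(position: Vec3(x: 1, y: 0, z: 0),
                       normal: Vec3(x: 0.6, y: 0, z: 0.8), height: 0.5)
```

With the tilted normal, x grows from 1 to about 1.3, which matches the outward expansion seen in the renders.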

Resources