Integrating a Metal depth buffer with SceneKit rendering - iOS

I'm using Metal to render a scene with a z-buffer and now need to integrate this z-buffer into SceneKit's rendering. However, I can't figure out how to get SceneKit to use this depth buffer correctly, and I'm not even 100% sure what format SceneKit expects its z-buffers to be in.
Based on this question, my understanding was that SceneKit uses a reverse logarithmic z-buffer in the range of 1 (near) to 0 (far). However, I can't get this working, and objects I draw with SceneKit don't properly respect the depth buffer: they are either always showing or always hidden.
First, here's how I generate a z-buffer texture in a Metal render pass:
struct FragmentOut {
    float4 color [[color(0)]];
    float depth [[depth(any)]];
};

fragment FragmentOut metalRenderFragment(const InOut in [[ stage_in ]]) {
    FragmentOut out;
    out.depth = 0; // 0 is far with a reverse z-buffer
    float cameraSpaceZ = ...; // Computed in shader
    // These constants are taken from SceneKit's camera and inlined here
    const float zNear = 0.0010000000474974513;
    const float zFar = 1000.0;
    float logDepth = log(cameraSpaceZ / zNear) / log(zFar / zNear);
    out.depth = 1.0 - logDepth; // Reverse the depth for SceneKit
    return out;
}
Then, to integrate the depth buffer into SceneKit, I render a full-screen quad in SceneKit with an SCNProgram that uses the depth texture generated in the previous step:
fragment FragmentOut sceneKitFullScreenQuadFragment(const InOut in [[ stage_in ]],
                                                    depth2d<float, access::sample> depthTexture [[texture(1)]])
{
    constexpr sampler sampler(filter::linear);
    const float depth = depthTexture.sample(sampler, in.uv);
    return {
        .color = float4(0),
        .depth = depth,
    };
}
So two questions:
What format does SceneKit use for its z-buffer? Is it a reversed logarithmic z-buffer?
What am I doing wrong in generating the z-buffer values for SceneKit?

SceneKit uses a reverse logarithmic Z-Buffer. This post and this post show you how to get a normalized linear mapping space [0...1]. You need the opposite formula.
Also, you can toggle the value from reverseZ to directZ this way:
let sceneView = self.view as! SCNView
sceneView.usesReverseZ = true // default

Andy Jazz's answer helped but I still found the links confusing. Here's what ultimately worked for me (although there are possibly other ways to do this):
When generating the depth map (this would be inside the Metal shader in my original example), pass in SceneKit's projection transform matrix and use it to transform the depth value:
// In a metal shader generating the depth map
// The z distance from the camera, e.g. if the object
// at the current position is 5 units away, this would be 5.
const float z = ...;
// The camera points along the -z axis, so transform the -z position
// with SceneKit's projection matrix (you can get this from SCNCamera)
const float4 depthPos = (sceneKitState.projectionTransform * float4(0, 0, -z, 1));
// Then do perspective division to get the final depth value
out.depth = depthPos.z / depthPos.w;
Then inside of the SceneKit shader, simply write out the depth, taking into account usesReverseZ:
// In a SceneKit full-screen quad shader
const float depth = depthTexture.sample(sampler, in.uv);
return {
    .color = float4(0),
    .depth = 1.0 - depth,
};
❗️ The above assumes you are using sceneView.usesReverseZ = true (the default). If you are using usesReverseZ = false, simply do .depth = depth instead


Shadow Mapping - Space Transformations are going bad

I am currently studying shadow mapping, and my biggest issue right now is the transformations between spaces. This is my current working theory/steps.
Pass 1:
Get depth of pixel from camera, store in depth buffer
Get depth of pixel from light, store in another buffer
Pass 2:
Use texture coordinate to sample camera's depth buffer at current pixel
Convert that depth to a view space position by multiplying the projection coordinate with invProj matrix. (also do a perspective divide).
Take that view position and multiply by invV (camera's inverse view) to get a world space position
Multiply world space position by light's viewProjection matrix.
Perspective divide that projection-space coordinate, and manipulate into [0..1] to sample from light depth buffer.
Get current depth from light and closest (sampled) depth, if current depth > closest depth, it's in shadow.
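The last two steps can be sketched on the CPU as follows. This is a minimal C++ illustration, assuming Direct3D conventions: after the perspective divide, x and y lie in [-1, 1] with y pointing up, z already lies in [0, 1], and the texture v coordinate grows downward. The struct and function names are hypothetical.

```cpp
struct Float4 { float x, y, z, w; };
struct ShadowLookup { float u, v, depth; };

// Perspective divide, then remap x/y from [-1, 1] to texture space [0, 1].
ShadowLookup toShadowMapSpace(const Float4& clipPosLight) {
    const float invW = 1.0f / clipPosLight.w;
    const float ndcX = clipPosLight.x * invW;
    const float ndcY = clipPosLight.y * invW;
    const float ndcZ = clipPosLight.z * invW; // already [0, 1] under the stated assumption
    return ShadowLookup{ndcX * 0.5f + 0.5f,
                        1.0f - (ndcY * 0.5f + 0.5f), // flip y: texture v grows downward
                        ndcZ};
}

// closestDepth is the value sampled from the light's depth map at (u, v).
// The point is in shadow when it lies farther from the light than that value.
bool inShadow(float currentDepth, float closestDepth, float bias) {
    return currentDepth - bias > closestDepth;
}
```

The bias parameter is the usual small offset against self-shadowing; passing 0 reproduces the plain comparison described in the last step.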
Shader Code
PS_INPUT vs(VS_INPUT input) {
    PS_INPUT output;
    output.pos = mul(input.vPos, mvp);
    output.cameraDepth = output.pos.zw;
    float4 vPosInLight = mul(input.vPos, m);
    vPosInLight = mul(vPosInLight, light.viewProj);
    output.lightDepth = vPosInLight.zw;
    return output;
}
// Pass 1 pixel shader (body only)
float cameraDepth = input.cameraDepth.x / input.cameraDepth.y;
//Bundle cameraDepth in alpha channel of a normal map.
output.normal = float4(input.normal, cameraDepth);
//4 Lights in total -- although only 1 is active right now. Going to use r/g/b/a for each light depth.
output.lightDepths.r = input.lightDepth.x / input.lightDepth.y;
Pass 2 (Screen Quad):
float4 ps(PS_INPUT input) : SV_TARGET {
    float4 pixelPosView = depthToViewSpace(input.texCoord);
    float4 pixelPosWorld = mul(pixelPosView, invV);
    float4 pixelPosLight = mul(pixelPosWorld, light.viewProj);
    float shadow = shadowCalc(pixelPosLight);
    //For testing / visualisation
    return float4(shadow,shadow,shadow,1);
}
float4 depthToViewSpace(float2 xy) {
    //Get pixel depth from camera by sampling current texcoord.
    //Extract the alpha channel as this holds the depth value.
    //Then, transform from [0..1] to [-1..1]
    float z = (_normal.Sample(_sampler, xy).a) * 2 - 1;
    float x = xy.x * 2 - 1;
    float y = (1 - xy.y) * 2 - 1;
    float4 vProjPos = float4(x, y, z, 1.0f);
    float4 vPositionVS = mul(vProjPos, invP);
    vPositionVS = float4(vPositionVS.xyz / vPositionVS.w, 1);
    return vPositionVS;
}
float shadowCalc(float4 pixelPosL) {
    //Transform pixelPosLight from [-1..1] to [0..1]
    float3 projCoords = (pixelPosL.xyz / pixelPosL.w) * 0.5 + 0.5;
    float closestDepth = _lightDepths.Sample(_sampler, projCoords.xy).r;
    float currentDepth = projCoords.z;
    return currentDepth > closestDepth; //Supposed to have bias, but for now I just want shadows working haha
}
CPP Matrices
// (Position, LookAtPos, UpDir)
auto lightView = XMMatrixLookAtLH(XMLoadFloat4(&pos4), XMVectorSet(0,0,0,1), XMVectorSet(0,1,0,0));
// (FOV, AspectRatio (1000/680), NEAR, FAR)
auto lightProj = XMMatrixPerspectiveFovLH(1.57f , 1.47f, 0.01f, 10.0f);
XMStoreFloat4x4(&_cLightBuffer.light.viewProj, XMMatrixTranspose(XMMatrixMultiply(lightView, lightProj)));
Current Outputs
White signifies that a shadow should be projected there. Black indicates no shadow.
CameraPos (0, 2.5, -2)
CameraLookAt (0, 0, 0)
CameraFOV (1.57)
CameraNear (0.01)
CameraFar (10.0)
LightPos (0, 2.5, -2)
LightLookAt (0, 0, 0)
LightFOV (1.57)
LightNear (0.01)
LightFar (10.0)
If I change the CameraPosition to be (0, 2.5, 2), basically just flipped on the Z axis, this is the result.
Obviously a shadow shouldn't change its projection depending on where the observer is, so I think I'm making a mistake with the invV. But I really don't know for sure. I've debugged the light's projView matrix, and the values seem correct - going from CPU to GPU. It's also entirely possible I've misunderstood some theory along the way because this is quite a tricky technique for me.
Aha! Found my problem. It was a silly mistake: I was calculating the depth of pixels from each light, but storing them in a texture that was based on the view of the camera. The following image should explain my mistake better than I can with words.
For future reference, the solution I settled on was to scrap my idea of storing light depths in texture channels. Instead, I make a new pass for each light and bind a unique depth-stencil texture to render the geometry to. When I want to do light calculations, I bind each of the depth textures to a shader resource slot and go from there. Obviously this doesn't scale well with many lights, but for my student project, where I'm only required to have 2 shadow casters, it suffices.
_context->DrawIndexed(indexCount, 0, 0); //Draw to regular render target
_sunlight->use(1, _context); //Use sunlight shader (basically just runs a Vertex Shader & Null Pixel shader so depth can be written to depth map)
_context->DrawIndexed(indexCount, 0, 0); //Draw to sunlight depth target
ID3D11RenderTargetView* nullrv = { nullptr };
ctx->OMSetRenderTargets(1, &nullrv, _sunlightDepthStencilView);
//The purpose of setting a null render target before doing the draw call is
//that a draw call with only a depth target bound is much faster.
//(At least I believe so, from my reading online)

Please explain this HLSL vertex shader to me

I'm using vs2015 and studying dx11.
I'll show you code first.
cbuffer cbperobject {
    float4x4 gWorldViewProj;
};
struct VertexIn {
    float3 Pos : POSITION;
    float4 Color : COLOR;
};
struct VertexOut {
    float4 PosH : SV_POSITION;
    float4 Color : COLOR;
};
VertexOut main( VertexIn vin )
{
    VertexOut vOut;
    vOut.PosH = mul(float4(vin.Pos, 1.0f), gWorldViewProj);
    vOut.Color = vin.Color;
    return vOut;
}
This is my vertex shader code. I mostly copied it from the internet.
HRESULT result;
D3D11_MAPPED_SUBRESOURCE mappedResource;
XMMATRIX* dataPtr;
UINT bufferNumber;
// Transpose the matrices to prepare them for the shader.
// Lock the constant buffer so it can be written to.
result = mD3dDContext->Map(contantBuff, 0, D3D11_MAP_WRITE_DISCARD, 0, &mappedResource);
if (FAILED(result))
return false;
// Get a pointer to the data in the constant buffer.
dataPtr = (XMMATRIX*)mappedResource.pData;
// Copy the matrices into the constant buffer.
XMMATRIX world = XMLoadFloat4x4(&mWorld); // world transform of the vertex
XMMATRIX view = XMLoadFloat4x4(&mView); // camera
XMMATRIX proj = XMLoadFloat4x4(&mProj); // orthographic projection
XMMATRIX worldViewProj = world*view*proj;
worldViewProj = XMMatrixTranspose(worldViewProj);
*dataPtr = worldViewProj;
// Unlock the constant buffer.
mD3dDContext->Unmap(contantBuff, 0);
// Set the position of the constant buffer in the vertex shader.
bufferNumber = 0;
// Finally set the constant buffer in the vertex shader with the updated values.
mD3dDContext->VSSetConstantBuffers(bufferNumber, 1, &contantBuff);
return true;
This is my code for setting the constant buffer.
First, what is the difference between the POSITION and SV_POSITION semantics? Would you recommend a good HLSL tutorial or book? I'm Korean and I live in Korea. There are no good books here; I don't know why, but all the good books are out of print. What a bad country for studying programming.
Second, why should I transpose my camera matrix (the worldViewProj matrix) before the CPU gives the data to the GPU? It's vertex * matrix = processed vertex. Why should I transpose it?
Well, the POSITION semantic tells the GPU that the values are vertex positions, i.e. points in a coordinate space, as they come in from the vertex buffer. SV_POSITION is a system-value semantic: it marks the vertex shader output that the rasterizer uses to place pixels on screen, a clip-space position whose x and y fall mainly in the range -1 to 1 after the divide by w. Look at this
Well, it seems you need some linear algebra lessons, mate. Matrix math is the keystone of 3D graphics: all the transformations (translation, rotation, scaling) happen through matrices, and for an orthogonal matrix such as a pure rotation the transpose is also the inverse. First of all you need the linear algebra background; as for the rendering API, be it OpenGL or DirectX (never mind which, they are just APIs), you can grab any book or look at the online documentation. Happy graphics coding, pal ;).
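On the transpose question specifically, the usual explanation is about memory layout rather than the math itself. DirectXMath stores an XMMATRIX row by row, while HLSL by default reads matrices in a constant buffer as column-major, so the same 16 floats are interpreted as the transposed matrix unless the CPU transposes them first (declaring the matrix row_major in HLSL is the common alternative). The sketch below is plain C++ with hypothetical helper names and no Direct3D involved; it only shows how the two readings of one buffer differ.

```cpp
#include <array>
#include <cstddef>

using Floats16 = std::array<float, 16>;

// How the CPU side lays the matrix out: element (row, col) at row * 4 + col.
float readRowMajor(const Floats16& m, std::size_t row, std::size_t col) {
    return m[row * 4 + col];
}

// How a column-major reader interprets the same 16 floats.
float readColumnMajor(const Floats16& m, std::size_t row, std::size_t col) {
    return m[col * 4 + row];
}

Floats16 transpose(const Floats16& m) {
    Floats16 t{};
    for (std::size_t row = 0; row < 4; ++row)
        for (std::size_t col = 0; col < 4; ++col)
            t[col * 4 + row] = m[row * 4 + col];
    return t;
}

// Row-vector convention (v * M): the translation lives in the fourth row.
Floats16 makeTranslation(float x, float y, float z) {
    return Floats16{1.0f, 0.0f, 0.0f, 0.0f,
                    0.0f, 1.0f, 0.0f, 0.0f,
                    0.0f, 0.0f, 1.0f, 0.0f,
                    x,    y,    z,    1.0f};
}
```

For a translation built with `makeTranslation`, a column-major reader finds the offsets in the fourth column instead of the fourth row, so `mul(vector, matrix)` would no longer translate. After `transpose`, the column-major reader sees the offsets in the fourth row again, which is what the XMMatrixTranspose call in the question achieves.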

Linear Depth to World Position

I have the following fragment and vertex shaders.
HLSL code
// Vertex shader
void mainVP(
    float4 position : POSITION,
    out float4 outPos : POSITION,
    out float2 outDepth : TEXCOORD0,
    uniform float4x4 worldViewProj,
    uniform float4 texelOffsets,
    uniform float4 depthRange) //Passed as float4(minDepth, maxDepth, depthRange, 1 / depthRange)
{
    outPos = mul(worldViewProj, position);
    outPos.xy += texelOffsets.zw * outPos.w;
    outDepth.x = (outPos.z - depthRange.x) * depthRange.w; //value [0..1]
    outDepth.y = outPos.w;
}
// Fragment shader
void mainFP(float2 depth : TEXCOORD0, out float4 result : COLOR) {
    float finalDepth = depth.x;
    result = float4(finalDepth, finalDepth, finalDepth, 1);
}
This shader produces a depth map.
This depth map must then be used to reconstruct the world positions for the depth values. I have searched other posts but none of them seem to store the depth using the same formula I am using. The only similar post is the following
Reconstructing world position from linear depth
Therefore, I am having a hard time reconstructing the point using the x and y coordinates from the depth map and the corresponding depth.
I need some help in constructing the shader to get the world view position for a depth at particular texture coordinates.
It doesn't look like you're normalizing your depth. Try this instead. In your VS, do:
outDepth.xy = outPos.zw;
And in your PS to render the depth, you can do:
float finalDepth = depth.x / depth.y;
Here is a function that then extracts the view-space position of a particular pixel from your depth texture. I'm assuming you're rendering a screen-aligned quad and performing the position extraction in the pixel shader.
// Function for converting depth to view-space position
// in deferred pixel shader pass. vTexCoord is a texture
// coordinate for a full-screen quad, such that x=0 is the
// left of the screen, and y=0 is the top of the screen.
float3 VSPositionFromDepth(float2 vTexCoord)
{
    // Get the depth value for this pixel
    float z = tex2D(DepthSampler, vTexCoord);
    // Get x/w and y/w from the viewport position
    float x = vTexCoord.x * 2 - 1;
    float y = (1 - vTexCoord.y) * 2 - 1;
    float4 vProjectedPos = float4(x, y, z, 1.0f);
    // Transform by the inverse projection matrix
    float4 vPositionVS = mul(vProjectedPos, g_matInvProjection);
    // Divide by w to get the view-space position
    return vPositionVS.xyz / vPositionVS.w;
}
For a more advanced approach that reduces the number of calculations but relies on the view frustum and a special way of rendering the screen-aligned quad, see here.

Calculating view space normal instead of world space

I hope you can help with this. I am currently using a deferred renderer and am trying to implement SSAO, but that needs view-space normals and I currently have world-space normals. I am trying to work out how to convert them but keep getting stuck.
This is the main part of my vertex shaders; instanceTransform is either the World matrix or the transpose of the instance matrix, as the same code is used for instanced models as well.
VertexShaderOutput VertexShaderFunctionCommon(VertexShaderInput input, float4x4 instanceTransform)
{
    VertexShaderOutput output = (VertexShaderOutput)0;
    float4x4 worldViewProjection = mul(instanceTransform, ViewProjection);
    output.Position = mul(float4(input.Position.xyz, 1), worldViewProjection);
    output.Depth = output.Position.zw;
    // calculate tangent space to world space matrix using the world space tangent, binormal, and normal as basis vectors
    output.TangentToWorld[0] = normalize(mul(input.Tangent, instanceTransform));
    output.TangentToWorld[1] = normalize(mul(input.Binormal, instanceTransform));
    output.TangentToWorld[2] = normalize(mul(input.Normal, instanceTransform));
    return output;
}
The normal calculation in the pixel shader is:
// read the normal from the normal map
float3 normalFromMap = tex2D(normalSampler, input.TexCoord);
//tranform to [-1,1]
normalFromMap = 2.0f * normalFromMap - 1.0f;
//transform into world space
normalFromMap = mul(normalFromMap, input.TangentToWorld);
//normalize the result
normalFromMap = normalize(normalFromMap);
//output the normal, in [0,1] space
output.Normal.rgb = NormalEncode(normalFromMap);
Can you help please?
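One common way to get view-space normals is to rotate the world-space normal by the upper-left 3x3 part of the view matrix and renormalize. A minimal C++ sketch of that arithmetic follows, assuming the view matrix contains only rotation and translation (no non-uniform scale) and the row-vector convention of mul(vector, matrix). The type and function names are hypothetical.

```cpp
#include <array>
#include <cmath>

using Vec3 = std::array<float, 3>;
using Mat3 = std::array<std::array<float, 3>, 3>; // upper-left 3x3 of the view matrix

// Row vector times matrix: result[c] = sum over r of v[r] * m[r][c].
Vec3 mulRowVector(const Vec3& v, const Mat3& m) {
    Vec3 out{};
    for (int c = 0; c < 3; ++c)
        for (int r = 0; r < 3; ++r)
            out[c] += v[r] * m[r][c];
    return out;
}

Vec3 normalize(const Vec3& v) {
    const float len = std::sqrt(v[0] * v[0] + v[1] * v[1] + v[2] * v[2]);
    return Vec3{v[0] / len, v[1] / len, v[2] / len};
}

// Translation is ignored on purpose: normals are directions (w = 0).
Vec3 worldNormalToView(const Vec3& worldNormal, const Mat3& viewRotation) {
    return normalize(mulRowVector(worldNormal, viewRotation));
}
```

In the shader, the equivalent would be to multiply the world-space normal by (float3x3)View before encoding it, where View is assumed to be available as a shader constant.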

How to calculate distance for a fog effect on a model in XNA?

I have struggled for some time to add a fog effect to my XNA game.
I work with a custom shader effect in a file (.fx).
The "PixelShaderFunction" works without error, but the problem is that all my land is colored the same way.
I think the problem comes from the calculation of the distance between the camera and the model.
float distance = length(input.TextureCoordinate - cameraPos);
Here is my complete code with "PixelShaderFunction"
// Both techniques share this same pixel shader.
float4 PixelShaderFunction(VertexShaderOutput input) : COLOR0
{
    float distance = length(input.TextureCoordinate - cameraPos);
    float l = saturate((distance-fogNear)/(fogFar-fogNear));
    return tex2D(Sampler, input.TextureCoordinate) * lerp(input.Color, fogColor, l);
}
If your input.TextureCoordinate really represents texture coordinates for the sampler, then the way you are trying to calculate the distance is wrong.
You can change body of your PixelShaderFunction as follows:
float dist = distance(cameraPos, input.Position3D);
float l = saturate((dist-fogNear)/(fogFar-fogNear));
return lerp(tex2D(Sampler, input.TextureCoordinate), fogColor, l);
Add the following to your VertexShaderOutput declaration:
float4 Position3D : TEXCOORD1;
In your vertex shader, populate Position3D with the position of the vertex multiplied by the world matrix:
output.Position3D = mul(input.pos, matWorld);
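The fog math itself is easy to check outside the shader. Below is a minimal C++ sketch of the same linear fog factor, with saturate and lerp written out as plain functions since they are HLSL intrinsics. The function names and the fog range used afterwards are illustrative only.

```cpp
#include <algorithm>
#include <cmath>

struct Float3 { float x, y, z; };

// Clamp to [0, 1], like the HLSL saturate intrinsic.
float saturate(float v) { return std::min(std::max(v, 0.0f), 1.0f); }

// Linear blend, like the HLSL lerp intrinsic (per channel).
float lerpf(float a, float b, float t) { return a + (b - a) * t; }

float distanceBetween(const Float3& a, const Float3& b) {
    const float dx = a.x - b.x, dy = a.y - b.y, dz = a.z - b.z;
    return std::sqrt(dx * dx + dy * dy + dz * dz);
}

// 0 = no fog (at or before fogNear), 1 = fully fogged (at or beyond fogFar).
float fogFactor(const Float3& cameraPos, const Float3& worldPos,
                float fogNear, float fogFar) {
    const float d = distanceBetween(cameraPos, worldPos);
    return saturate((d - fogNear) / (fogFar - fogNear));
}
```

The key point is that both arguments to `distanceBetween` are world-space positions. With the texture coordinate in place of the world position, the distance barely changes across the model, which matches the uniformly colored land described in the question.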
