HLSL Modify Tessellation shader to make equilateral triangles? - directx

I'm in the process of procedural planet generation. So far I have done the dynamic LOD work, but my current software algorithm is very slow, so I decided to do it using DX11's new tessellation features instead.
Currently my sphere is a subdivided icosahedron (20 faces, all equilateral triangles).
Back when I was subdividing using my software algorithm, one triangle would be
split into four children across the midpoints of the parent forming the Hyrule symbol each time...like this: http://puu.sh/1xFIx
As you can see, each triangle subdivided created more and more equilateral triangles, i.e. each one was exactly the same shape.
But now that I am using the GPU to tessellate in HLSL, the result is definitely not
what I am looking for: http://puu.sh/1xFx7
Is there anything I can do in the Hull and Domain shaders to change the tessellation
so that it subdivides into sets of equilateral triangles like the first image?
Should I be using the geometry shader for something like this? If so, would it be
slower than the tessellator?

I tried using the tessellation shader stages, but I encountered a problem: the domain shader only receives the UV coordinate (SV_DomainLocation) and the input patch for positioning the vertices. When the domain location of a vertex is, for example, (0.33, 0.33, 0.33) (the center vertex), it is impossible to know the correct position, because you need information about the other vertices or an iteration index (x, y), which the domain shader stage does not provide.
Because of this problem I wrote the code in a geometry shader. This stage is very limited for tessellation because a single invocation cannot output more than 1024 32-bit scalar values (in shader model 5.0). I implemented the calculation of the vertex positions using a UV coordinate (like SV_DomainLocation), but this only tessellates the triangles; you must use part of your own code to calculate the added positions inside the triangles to create the precise final result.
This is the code for the equilateral triangle tessellation:
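// required for array sizes; the original value is not shown, so 6 is an assumption
// (6 * 6 triangles * 3 vertices = 108, which fits within maxvertexcount below)
#define MAX_ITERATIONS 6

// VS_OUT and the integer constant tess are assumed to be declared elsewhere in the shader file.
void DrawTriangle(float4 p0, float4 p1, float4 p2, inout TriangleStream<VS_OUT> stream)
{
    VS_OUT v0;
    v0.pos = p0;
    stream.Append(v0);
    VS_OUT v1;
    v1.pos = p1;
    stream.Append(v1);
    VS_OUT v2;
    v2.pos = p2;
    stream.Append(v2);
    // end the strip so every triangle is emitted on its own
    stream.RestartStrip();
}

[maxvertexcount(128)] // D3D11 rule: maxvertexcount * (scalars per VS_OUT) <= 1024
void gs(triangle VS_OUT input[3], inout TriangleStream<VS_OUT> stream)
{
    // at least one subdivision, otherwise x / fitc below divides by zero
    int itc = clamp(tess, 1, MAX_ITERATIONS);
    float fitc = itc;
    // y runs from 0 to itc inclusive, so each row needs MAX_ITERATIONS + 1 slots
    float4 past_pos[MAX_ITERATIONS + 1];
    float4 array_pass[MAX_ITERATIONS + 1];
    for (int pi = 0; pi <= MAX_ITERATIONS; pi++)
    {
        past_pos[pi] = float4(0, 0, 0, 0);
        array_pass[pi] = float4(0, 0, 0, 0);
    }
    // -------------------------------------
    // Tessellation kernel for the control points
    for (int x = 0; x <= itc; x++)
    {
        float4 last = float4(0, 0, 0, 0);
        for (int y = 0; y <= x; y++)
        {
            float2 seg = float2(x / fitc, y / fitc);
            float3 uv;
            uv.x = 1 - seg.x;
            uv.z = seg.y;
            uv.y = 1 - (uv.x + uv.z);
            // ---------------------------------------
            // Domain Stage
            // uv   Domain Location
            // x,y  IterationIndex
            float4 fpos = input[0].pos * uv.x;
            fpos += input[1].pos * uv.y;
            fpos += input[2].pos * uv.z;
            if (x > 0 && y > 0)
            {
                DrawTriangle(past_pos[y - 1], last, fpos, stream);
                if (y < x)
                {
                    // add adjacent triangle
                    DrawTriangle(past_pos[y - 1], fpos, past_pos[y], stream);
                }
            }
            array_pass[y] = fpos;
            last = fpos;
        }
        // the row just finished becomes the previous row of the next iteration
        for (int i = 0; i <= MAX_ITERATIONS; i++)
        {
            past_pos[i] = array_pass[i];
        }
    }
}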
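To check the indexing outside the GPU, the same barycentric grid can be sketched on the CPU. This is only a rough Python sketch of the idea in the shader above, not the author's code: the function name `subdivide` and its parameters are made up for illustration, and `n` plays the role of `itc`.

```python
import math

def subdivide(p0, p1, p2, n):
    """Split triangle (p0, p1, p2) into n * n congruent triangles (hypothetical helper)."""
    def point(row, col):
        # Same weights as the shader: uv.x = 1 - row/n, uv.z = col/n, uv.y = the rest.
        a = 1.0 - row / n
        c = col / n
        b = 1.0 - (a + c)
        return tuple(a * p0[k] + b * p1[k] + c * p2[k] for k in range(len(p0)))

    triangles = []
    for row in range(1, n + 1):
        for col in range(1, row + 1):
            # "Upright" triangle between the previous row and the current row.
            triangles.append((point(row - 1, col - 1), point(row, col - 1), point(row, col)))
            if col < row:
                # Adjacent "inverted" triangle that fills the gap next to it.
                triangles.append((point(row - 1, col - 1), point(row, col), point(row - 1, col)))
    return triangles

# Usage: an equilateral triangle with side 1 split 4 times per edge.
corner = (0.5, math.sqrt(3) / 2)
tris = subdivide((0.0, 0.0), (1.0, 0.0), corner, 4)
print(len(tris))  # 16 triangles, each with side length 0.25
```

Because every small triangle is an affine copy of the parent scaled by 1/n, an equilateral parent only produces equilateral children, which is the pattern from the first image in the question.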


Convolution of Image Processing in Processing language

Since the Corona situation has turned my studies into self-study, as a Processing newbie I am having a hard time getting into the subject of image processing, more specifically convolution. Therefore I hope that you can help me.
My lecturer, who unfortunately is almost never reachable, left me the following convolution code. The theory behind convolution is clear to me, but I have many gaps in understanding the code. Could someone add line-by-line comments so that I can get into the code a bit more fluently?
The code is as follows:
color convolution(int x, int y, float[][] matrix, int matrix_size, PImage img) {
  float rtotal = 0.0;
  float gtotal = 0.0;
  float btotal = 0.0;
  int offset = matrix_size / 2;
  for (int i = 0; i < matrix_size; i++) {
    for (int j = 0; j < matrix_size; j++) {
      int xloc = x + i - offset;
      int yloc = y + j - offset;
      int loc = xloc + img.width * yloc;
      rtotal += (red(img.pixels[loc]) * matrix[i][j]);
      gtotal += (green(img.pixels[loc]) * matrix[i][j]);
      btotal += (blue(img.pixels[loc]) * matrix[i][j]);
    }
  }
  rtotal = constrain(rtotal, 0, 255);
  gtotal = constrain(gtotal, 0, 255);
  btotal = constrain(btotal, 0, 255);
  return color(rtotal, gtotal, btotal);
}
I have to do a bit of guesswork since I'm not positive about all of the functions you're using and I'm not familiar with the Processing 3+ library, but here's my best shot at it.
color convolution (int x, int y, float[][] matrix, int matrix_size, PImage img){
// Note: the 'matrix' parameter here will also frequently be referred to as
// a 'window' or 'kernel' in research
// I'm not certain what your PImage class is from, but I'll assume
// you're using the Processing 3+ library and work off of that assumption
// how much of each color we see within the kernel (matrix) space
float rtotal = 0.0;
float gtotal = 0.0;
float btotal = 0.0;
// this offset is to zero-center our kernel
// the fact that we use matrix_size / 2 sort of implicitly
// alludes to the fact that our matrix_size should be an odd-number
// so that we can have a middle-pixel
int offset = matrix_size / 2;
// looping through the kernel. the fact that we use 'matrix_size'
// as our end-condition for both dimensions means that our 'matrix' kernel
// must always be a square
for (int i = 0; i < matrix_size; i++){
for (int j= 0; j < matrix_size; j++){
// calculating the index conversion from 2D to the 1D format that PImage uses
// refer to: https://processing.org/tutorials/pixels/
// for a better understanding of PImage indexing (about 1/3 of the way down the page)
// WARNING: by subtracting the offset it is possible to hit negative
// x,y values here if you pick an x or y position less than matrix_size / 2.
// the same index-out-of-bounds can occur on the high end.
// When you convolve using a kernel of N x N size (N here would be matrix_size)
// you can only convolve x over [N / 2, Width - 1 - (N / 2)] and y over [N / 2, Height - 1 - (N / 2)]
int xloc = x+i-offset;
int yloc = y+j-offset;
// this is the final 1D PImage index that corresponds to [xloc, yloc] in our 2D image
// really go back up and take a look at the link if this doesn't make sense, it's pretty good
int loc = xloc + img.width*yloc;
// I have to do some speculation again since I'm not certain what red(img.pixels[loc]) does
// I'll assume it returns the red channel of the pixel
// this section just adds up all of the pixel colors multiplied by the value in the kernel
rtotal += (red(img.pixels[loc]) * matrix[i][j]);
gtotal += (green(img.pixels[loc]) * matrix[i][j]);
btotal += (blue(img.pixels[loc]) * matrix[i][j]);
}
}
// the fact that no further division or averaging happens after the for-loops implies
// that the kernel you feed in should have balanced values for your kernel size
// for example, a kernel that's designed to average out the color over the 3 x 3 area
// it covers (this would be like blurring the image) would be filled with 1/9
// in general: the kernel you're using should have a sum of 1 for all of the numbers inside
// this is just 'in general' you can play around with not doing that, but you'll probably notice a
// darkening effect for when the sum is less than 1, and a brightening effect if it's greater than 1
// for more info on kernels, read this: https://en.wikipedia.org/wiki/Kernel_(image_processing)
// I don't have the code for this constrain function,
// but it's almost certainly just your typical clamp (constrains the values to [0, 255])
// Note: this means that your values saturate at 0 and 255
// if you see a lot of black or white then that means your kernel
// probably isn't balanced as mentioned above
rtotal = constrain(rtotal, 0, 255);
gtotal = constrain(gtotal, 0, 255);
btotal = constrain(btotal, 0, 255);
// Finished!
return color(rtotal, gtotal, btotal);
}
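The point about kernel sums can be tried on a tiny example. The following is a minimal Python sketch on a single-channel image stored as a list of rows; the function name `convolve_pixel` is hypothetical, and clamping out-of-range neighbours to the nearest edge pixel is an assumption added here to avoid the index-out-of-bounds case mentioned in the comments (the Processing code above does not do this).

```python
def convolve_pixel(img, x, y, kernel):
    """Weighted sum of the neighbourhood of (x, y); img is a list of rows of gray values."""
    size = len(kernel)
    offset = size // 2
    height, width = len(img), len(img[0])
    total = 0.0
    for i in range(size):
        for j in range(size):
            # Assumption: neighbours outside the image reuse the nearest edge pixel.
            xloc = min(max(x + i - offset, 0), width - 1)
            yloc = min(max(y + j - offset, 0), height - 1)
            total += img[yloc][xloc] * kernel[i][j]
    # Same saturation as constrain(total, 0, 255).
    return min(max(total, 0.0), 255.0)

image = [[0, 0, 0],
         [0, 90, 0],
         [0, 0, 0]]
box_blur = [[1 / 9] * 3 for _ in range(3)]   # entries sum to 1: brightness preserved
too_bright = [[2 / 9] * 3 for _ in range(3)]  # entries sum to 2: result is twice as bright

print(convolve_pixel(image, 1, 1, box_blur))    # about 10
print(convolve_pixel(image, 1, 1, too_bright))  # about 20
```

With a kernel that sums to 1 the single bright pixel is spread out without changing the overall brightness; doubling the kernel entries doubles the result until it saturates at 255.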

Converting YUV422 to RGB using GPU shader HLSL

I'm considering performing the color space conversion from YUV422 to RGB in HLSL. A four-byte YUYV group yields two three-byte RGB values; for example, Y1 U Y2 V gives R1G1B1 (left pixel) and R2G2B2 (right pixel). Given that the texture coordinates in the pixel shader increase continuously, how can I differentiate between the texture coordinates of the left pixels (all R1G1B1) and those of the right pixels (all R2G2B2)? That way I could render all R1G1B1 and all R2G2B2 values to a single texture instead of two.
Not sure which version of DirectX you use, but here is the version I use for DX11. Note that in my case I send the YUV data in a StructuredBuffer, which saves me from dealing with row stride. You can of course apply the same technique when sending your YUV data as a texture, with a few small changes to the code below.
Here is the pixel shader code (I assume your render target is same size as your input image, and that you render a full screen quad/triangle).
StructuredBuffer<uint> yuy;
int w;
int h;

struct psInput
{
    float4 p : SV_Position;
    float2 uv : TEXCOORD0;
};

float4 PS(psInput input) : SV_Target
{
    // Calculate pixel location within buffer (if you use a texture, change the lookup here)
    uint2 xy = input.p.xy;
    uint p = (xy.x) + (xy.y * w);
    uint pixloc = p / 2;
    uint pixdata = yuy[pixloc];
    // Since pixdata is packed, use bit masks and shifts to extract each byte
    uint v = (pixdata & 0xff000000) >> 24;
    uint y1 = (pixdata & 0xff0000) >> 16;
    uint u = (pixdata & 0xff00) >> 8;
    uint y0 = pixdata & 0x000000FF;
    // Check if you are the left or the right pixel
    uint y = p % 2 == 0 ? y0 : y1;
    // Convert yuv to rgb
    float cb = u;
    float cr = v;
    float r = (y + 1.402 * (cr - 128.0));
    float g = (y - 0.344 * (cb - 128.0) - 0.714 * (cr - 128));
    float b = (y + 1.772 * (cb - 128));
    // Scale the 8-bit range to [0, 1]; alpha stays at 1 instead of being divided as well
    return float4(saturate(float3(r, g, b) / 255.0f), 1.0f);
}
Hope that helps.
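The left/right selection and the unpacking can be checked on the CPU with a few lines. This is a rough Python sketch of the same arithmetic as the shader, assuming one little-endian 32-bit word per YUYV group and an even image width; the function names are made up for illustration.

```python
def unpack_yuyv(word):
    """Split one packed 32-bit word into (y0, u, y1, v), lowest byte first."""
    y0 = word & 0xFF
    u = (word >> 8) & 0xFF
    y1 = (word >> 16) & 0xFF
    v = (word >> 24) & 0xFF
    return y0, u, y1, v

def yuv_to_rgb(y, u, v):
    """Same coefficients as the shader, clamped and rounded to 8-bit values."""
    cb = u - 128.0
    cr = v - 128.0
    r = y + 1.402 * cr
    g = y - 0.344 * cb - 0.714 * cr
    b = y + 1.772 * cb
    return tuple(int(round(min(max(c, 0.0), 255.0))) for c in (r, g, b))

def pixel_rgb(buffer, width, x, y):
    """RGB of output pixel (x, y): even pixel indices use y0, odd ones use y1."""
    p = x + y * width
    y0, u, y1, v = unpack_yuyv(buffer[p // 2])
    luma = y0 if p % 2 == 0 else y1
    return yuv_to_rgb(luma, u, v)

# One word holding a dark left pixel and a bright right pixel with neutral chroma.
word = (128 << 24) | (235 << 16) | (128 << 8) | 16
print(pixel_rgb([word], 2, 0, 0))  # (16, 16, 16)
print(pixel_rgb([word], 2, 1, 0))  # (235, 235, 235)
```

Both output pixels read the same word and share U and V; only the luma byte differs, which is why a single render target is enough.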

How can I repeat my texture in DX

There is a handy feature in the three.js 3D library: you can set the sampler to repeat mode and set the repeat attribute to any values you like. For example, (3, 5) means the texture repeats 3 times horizontally and 5 times vertically. Now I'm using DirectX and I cannot find a good solution for this. Note that the UV coordinates of the vertices still range from 0 to 1, and I don't want to change my HLSL code because I want a programmable solution for this. Thanks very much!
Edit: assume I already have a cube model, and the texture coordinates of its vertices are between 0 and 1. If I use wrap mode or clamp mode for sampling textures, everything is fine. But I want to repeat a texture on one of its faces, so I first need to switch to wrap mode; that much I already know. Then I would have to edit my model so that the texture coordinates range from 0 to 3. What if I don't want to change my model? So far I have come up with one way: add a variable to the pixel shader that represents how many times the map repeats, and multiply the coordinate by this factor when sampling. Not an elegant solution, I think.
Since you've edited your Question, there is another Answer to your problem:
From what I understood, you have a face with uv's like so:
0,1 1,1
| |
| |
| |
0,0 1,0
But want the texture repeated 3 times (for example) instead of 1 time.
(Without changing the original model)
Multiple solutions here:
You could do it when updating your vertex buffer (if you update it at all):
D3D11_MAPPED_SUBRESOURCE resource;
HRESULT hResult = D3DDeviceContext->Map(vertexBuffer, 0, D3D11_MAP_WRITE_DISCARD, 0, &resource);
if (FAILED(hResult)) return false;
YourVertexFormat* ptr = (YourVertexFormat*)resource.pData;
for (int i = 0; i < vertexCount; i++)
{
    ptr[i] = vertices[i];
    ptr[i].uv.x *= multiplyX; // in your case 3
    ptr[i].uv.y *= multiplyY; // in your case 5
}
D3DDeviceContext->Unmap(vertexBuffer, 0);
But if you don't need to update the buffer anyway, I wouldn't recommend it, because it is terribly slow.
A faster way is to use the vertex shader:
cbuffer MatrixBuffer
{
    matrix worldMatrix;
    matrix viewMatrix;
    matrix projectionMatrix;
};

struct VertexInputType
{
    float4 position : POSITION0;
    float2 uv : TEXCOORD0;
    // ...
};

struct PixelInputType
{
    float4 position : SV_POSITION;
    float2 uv : TEXCOORD0;
    // ...
};

PixelInputType main(VertexInputType input)
{
    input.position.w = 1.0f;
    PixelInputType output;
    output.position = mul(input.position, worldMatrix);
    output.position = mul(output.position, viewMatrix);
    output.position = mul(output.position, projectionMatrix);
    // This is what you basically need:
    output.uv = input.uv * 3; // 3x3
    // Or more advanced (3 repeats horizontally, 5 vertically):
    // output.uv = float2(input.uv.x * 3, input.uv.y * 5);
    // ...
    return output;
}
I would recommend the vertex shader solution, because it's fast and, since you use vertex shaders in DirectX anyway, it's not as expensive as the buffer update solution.
Hope that helped solve your problem :)
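Why scaling the UVs is enough can be seen from how wrap addressing treats coordinates outside 0 to 1: only the fractional part is used for the lookup. Below is a small Python sketch of that behaviour with nearest-texel sampling; the function names are hypothetical, and real hardware filtering is more involved than this.

```python
import math

def wrap(coord):
    """Wrap addressing: keep only the fractional part of the coordinate."""
    return coord - math.floor(coord)

def sample_repeated(texture, u, v, repeat_u, repeat_v):
    """Nearest-texel lookup after scaling the 0..1 UVs by the repeat counts."""
    su = wrap(u * repeat_u)
    sv = wrap(v * repeat_v)
    height, width = len(texture), len(texture[0])
    x = min(int(su * width), width - 1)
    y = min(int(sv * height), height - 1)
    return texture[y][x]

texture = [[1, 2],
           [3, 4]]
# With repeat_u = 3 the whole texture already fits into u in [0, 1/3).
print(sample_repeated(texture, 0.1, 0.0, 3, 5))  # 1
print(sample_repeated(texture, 0.2, 0.0, 3, 5))  # 2
print(sample_repeated(texture, 0.4, 0.0, 3, 5))  # 1 again: the second repetition starts
```

The model keeps its 0 to 1 coordinates; the repeat counts can come from a constant buffer so they stay adjustable from the application side.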
You basically want to create a sampler state like so:
ID3D11SamplerState* m_sampleState;
D3D11_SAMPLER_DESC samplerDesc;
samplerDesc.Filter = D3D11_FILTER_MIN_MAG_MIP_LINEAR;
samplerDesc.AddressU = D3D11_TEXTURE_ADDRESS_WRAP;
samplerDesc.AddressV = D3D11_TEXTURE_ADDRESS_WRAP;
samplerDesc.AddressW = D3D11_TEXTURE_ADDRESS_WRAP;
samplerDesc.MipLODBias = 0.0f;
samplerDesc.MaxAnisotropy = 1;
samplerDesc.ComparisonFunc = D3D11_COMPARISON_ALWAYS;
samplerDesc.BorderColor[0] = 0;
samplerDesc.BorderColor[1] = 0;
samplerDesc.BorderColor[2] = 0;
samplerDesc.BorderColor[3] = 0;
samplerDesc.MinLOD = 0;
samplerDesc.MaxLOD = D3D11_FLOAT32_MAX;
// Create the texture sampler state.
result = ifDEVICE->ifDX11->getD3DDevice()->CreateSamplerState(&samplerDesc, &m_sampleState);
And when you are setting your shader constants, call this:
ifDEVICE->ifDX11->getD3DDeviceContext()->PSSetSamplers(0, 1, &m_sampleState);
Then you can write your pixel shaders like this:
Texture2D shaderTexture;
SamplerState SampleType;
float4 main(PixelInputType input) : SV_TARGET
{
    float4 textureColor = shaderTexture.Sample(SampleType, input.uv);
    return textureColor;
}
Hope that helps...

DX11 Tessellation LOD with diameter incorrect tessellation values

I implemented the LoD with diameter from the following white paper: the NVIDIA Terrain Tessellation White Paper. In the chapter "Hull Shader: Tessellation LOD" (page 7) there is a very good explanation of the diameter-based LoD. Here is a good quote:
For each patch edge, the shader computes the edge length and then conceptually fits a sphere around it. The sphere is projected into screen space and its screen space diameter is used to compute the tessellation factor for the edge.
Here is my hull shader:
// Globals
cbuffer TessellationBuffer // buffer size needs to be a multiple of 16 bytes
float4 cameraPosition;
float tessellatedTriSize;
float3 padding;
matrix worldMatrix;
matrix projectionMatrix;
// Typedefs
struct HullInputType
float4 position : SV_POSITION;
float2 tex : TEXCOORD0;
float3 normal : NORMAL;
struct ConstantOutputType
float edges[3] : SV_TessFactor;
float inside : SV_InsideTessFactor;
struct HullOutputType
float4 position : SV_POSITION;
float2 tex : TEXCOORD0;
float3 normal : NORMAL;
// Rounding function
float roundTo2Decimals(float value)
value *= 100;
value = round(value);
value *= 0.01;
return value;
float calculateLOD(float4 patch_zero_pos, float4 patch_one_pos)//1,3,1,1; 3,3,0,1
float diameter = 0.0f;
float4 radiusPos;
float4 patchDirection;
// Calculates the distance between the patches and fits a sphere around.
diameter = distance(patch_zero_pos, patch_one_pos); // 2.23607
float radius = diameter/2; // 1.118035
patchDirection = normalize(patch_one_pos - patch_zero_pos); // 0.894,0,-0.447,0 direction from base edge_zero
// Calculate the position of the radiusPos (center of sphere) in the world.
radiusPos = patch_zero_pos + (patchDirection * radius);//2,3,0.5,1
radiusPos = mul(radiusPos, worldMatrix);
// Get the rectangular points of the sphere to the camera.
float4 camDirection;
// Direction from camera to the sphere center.
camDirection = normalize(radiusPos - cameraPosition); // 0.128,0,0.99,0
// Calculates the orthonormal basis (sUp,sDown) of a vector camDirection.
// Find the smallest component of camDirection and set it to 0. swap the two remaining
// components and negate one of them to find sUp_ which can be used to find sDown.
float4 sUp_;
float4 sUp;
float4 sDown;
float4 sDownAbs;
sDownAbs = abs(camDirection);//0.128, 0 ,0.99, 0
if(sDownAbs.y < sDownAbs.x && sDownAbs.y < sDownAbs.z) { //0.99, 0, 0.128
sUp_.x = -camDirection.z;
sUp_.y = 0.0f;
sUp_.z = camDirection.x;
sUp_.w = camDirection.w;
} else if(sDownAbs.z < sDownAbs.x && sDownAbs.z < sDownAbs.y){
sUp_.x = -camDirection.y;
sUp_.y = camDirection.x;
sUp_.z = 0.0f;
sUp_.w = camDirection.w;
} else {
sUp_.x = 0.0f;
sUp_.y = -camDirection.z;
sUp_.z = camDirection.y;
sUp_.w = camDirection.w;
}
// simple version
// sUp_.x = -camDirection.y;
// sUp_.y = camDirection.x;
// sUp_.z = camDirection.z;
// sUp_.w = camDirection.w;
sUp = sUp_ / length(sUp_); // =(0.99, 0, 0.128,0)/0.99824 = 0.991748,0,0.128226,0
sDown = radiusPos - (sUp * radius); // 0.891191,3,0.356639,1 = (2,3,0.5,1) - (0.991748,0,0.128226,0)*1.118035
sUp = radiusPos + (sUp * radius); // = (3.10881,3,0.643361,1)
// Projects sphere in projection space (2d).
float4 projectionUp = mul(sUp, projectionMatrix);
float4 projectionDown = mul(sDown, projectionMatrix);
// Calculate tessellation factor for this edge according to the diameter on the screen.
float2 sUp_2;
sUp_2.x = projectionUp.x;
sUp_2.y = projectionUp.y;
float2 sDown_2;
sDown_2.x = projectionDown.x;
sDown_2.y = projectionDown.y;
// Distance between the 2 points in 2D
float projSphereDiam = distance(sUp_2, sDown_2);
//return tessellatedTriSize;
//if(projSphereDiam < 2.0f)
// return 1.0f;
//else if(projSphereDiam < 10.0f)
// return 2.0f;
// return 10.0f;
return projSphereDiam*tessellatedTriSize;
// Patch Constant Function
// set/calculate any data constant to entire patch.
// is invoked once per patch
// direction vector w = 0 ; position vector w = 1
// receives as input a patch with 3 control points and each control point is represented by the structure of HullInputType
// patch control point should be displaced vertically, this can significantly affect the distance of the camera
// patchId is an identifier number of the patch generated by the Input Assembler
ConstantOutputType ColorPatchConstantFunction(InputPatch<HullInputType, 3> inputPatch, uint patchId : SV_PrimitiveID)
ConstantOutputType output;
////ret distance(x, y) Returns a distance scalar between two vectors.
float ret, retinside;
retinside = 0.0f;
float4 patch_zero_pos;//1,3,1,1
patch_zero_pos = float4(inputPatch[0].position.xyz, 1.0f);
float4 patch_one_pos;//3,3,0,1
patch_one_pos = float4(inputPatch[1].position.xyz, 1.0f);
float4 patch_two_pos;
patch_two_pos = float4(inputPatch[2].position.xyz, 1.0f);
// calculate LOD by diametersize of the edges
ret = calculateLOD(patch_zero_pos, patch_one_pos);
ret = roundTo2Decimals(ret);// rounding
output.edges[0] = ret;
retinside += ret;
ret = calculateLOD(patch_one_pos, patch_two_pos);
ret = roundTo2Decimals(ret);// rounding
output.edges[1] = ret;
retinside += ret;
ret = calculateLOD(patch_two_pos, patch_zero_pos);
ret = roundTo2Decimals(ret);// rounding
output.edges[2] = ret;
retinside += ret;
// Set the tessellation factor for tessellating inside the triangle.
// see image tessellationOuterInner
retinside *= 0.333;
// rounding
retinside = roundTo2Decimals(retinside);
output.inside = retinside;
return output;
// Hull Shader
// The hull shader is called for each output control point.
// Trivial pass through
[partitioning("fractional_odd")] //fractional_odd
HullOutputType ColorHullShader(InputPatch<HullInputType, 3> patch, uint pointId : SV_OutputControlPointID, uint patchId : SV_PrimitiveID)
HullOutputType output;
// Set the position for this control point as the output position.
output.position = patch[pointId].position;
// Set the input color as the output color.
output.tex = patch[pointId].tex;
output.normal = patch[pointId].normal;
return output;
Some graphical explanation of the code:
First, find the center between the two vertices.
Find the orthogonal basis (perpendicular to the camera direction) from the camera on the "circle".
Project sUp and sDown into projection space and measure the distance between them to calculate the tessellation factor.
The Problem
The tessellation worked fine. But for testing I let the object rotate, so I could see whether the tessellation follows the rotation as well. Somehow I think it is not 100% correct. Look at the plane: it is rotated by (1.0f, 2.0f, 0.0f). The lighter red shows higher tessellation factors than the darker red, and green marks factors of 1.0. The plane should be more detailed at the top than at the bottom.
What am I missing?
Some test cases
If I remove the rotation it looks like this:
If I remove the rotation and include this simple version of the orthogonal basis calculation:
// simple version
sUp_.x = -camDirection.y;
sUp_.y = camDirection.x;
sUp_.z = camDirection.z;
sUp_.w = camDirection.w;
it looks like this:
Could it be a problem that I'm not using a lookUp vector?
How are you doing LoD? I'm open to trying something else...
What IDE are you using? If you're using Visual Studio, you should try Visual Studio Graphics Debugger or PIX, depending on the version of VS that you have.
The problem was that I used the world matrix instead of the view matrix in calculateLOD. Always use the matrix the camera is using for rotations or other transformations.
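The effect of working in view space can be illustrated with a simplified version of the diameter-based LoD. The Python sketch below is not the shader from the question: it assumes the edge endpoints are already transformed into view space (+z pointing away from the camera), it skips the orthonormal basis step by treating the sphere as camera-facing, and the parameter names `fov_y`, `screen_height` and `target_pixels` are invented for the example. The clamp to 64 reflects the maximum tessellation factor in Direct3D 11.

```python
import math

def edge_tess_factor(p0_view, p1_view, fov_y, screen_height, target_pixels):
    """Tessellation factor from the projected diameter of a sphere around an edge."""
    # Sphere diameter = edge length, sphere center = edge midpoint (view space).
    diameter = math.dist(p0_view, p1_view)
    depth = max((p0_view[2] + p1_view[2]) / 2.0, 1e-6)
    # A length d at view depth z covers d * cot(fov_y / 2) / z in NDC units.
    proj_scale = 1.0 / math.tan(fov_y / 2.0)
    ndc_diameter = diameter * proj_scale / depth
    # NDC spans 2 units over the screen height.
    pixel_diameter = ndc_diameter * screen_height / 2.0
    # One segment per target_pixels, clamped to the valid range.
    return max(1.0, min(64.0, pixel_diameter / target_pixels))

near = edge_tess_factor((-1.0, 0.0, 10.0), (1.0, 0.0, 10.0), math.pi / 2, 1000, 10)
far = edge_tess_factor((-1.0, 0.0, 100.0), (1.0, 0.0, 100.0), math.pi / 2, 1000, 10)
print(near, far)  # roughly 10 for the near edge, 1 for the far edge
```

Because the depth is measured along the camera's view direction, rotating or moving the camera changes the factors as expected, which is not the case when only the world matrix is applied.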

Dot Product and Luminance/ Findmyicone

I have a basic question that I am struggling with here. When you look at the findmyicone sample code from WWDC 2010, you will see this:
static const uint8_t orangeColor[] = {255, 127, 0};
uint8_t referenceColor[3];
// Remove luminance
static inline void normalize( const uint8_t colorIn[], uint8_t colorOut[] ) {
// Dot product
int sum = 0;
for (int i = 0; i < 3; i++)
sum += colorIn[i] / 3;
for (int j = 0; j < 3; j++)
colorOut[j] = (float) ((colorIn[j] / (float) sum) * 255);
}
And then it is called:
normalize(orangeColor, referenceColor);
Running the debugger, it is converting BGRA: (Red 255, Green 127, Blue 0) to (Red 0, Green 255, Blue 0). I have looked on the web and SO to find details on luminance and dot product and there is really no information.
1- Can someone guide me on what this function is doing?
2- Can you guide me to some helpful topics/primer online as well?
Thanks again
What they're trying to do is track a particular color across variations in brightness, so they're normalizing for the luminance of the color. I do something similar in the fragment shader I use in a color tracking example based on a GPU Gems paper from Apple, as well as the ColorObjectTracking sample application in my GPUImage framework:
vec3 normalizeColor(vec3 color)
{
    return color / max(dot(color, vec3(1.0/3.0)), 0.3);
}

vec4 maskPixel(vec3 pixelColor, vec3 maskColor)
{
    float d;
    vec4 calculatedColor;
    // Compute distance between current pixel color and reference color
    d = distance(normalizeColor(pixelColor), normalizeColor(maskColor));
    // If color difference is larger than threshold, return black.
    calculatedColor = (d > threshold) ? vec4(0.0) : vec4(1.0);
    //Multiply color by texture
    return calculatedColor;
}
The above calculation takes the average of the three color components by multiplying each channel by 1/3 and then summing them (that's what the dot product does here). It then divides each color channel by this average to arrive at a normalized color.
The distance between this normalized color and the target one is calculated, and if it is within a certain threshold the pixel is marked as being of that color.
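Both variants can be reproduced with a few lines of Python. This is only an illustrative sketch with made-up function names: `normalize_color` mirrors the shader above on colors in the 0 to 1 range, while `normalize_u8` emulates the 8-bit C function from the question, under the assumption that storing an out-of-range float into a `uint8_t` wraps modulo 256 (formally that conversion is undefined behavior in C, but it matches the values seen in the debugger).

```python
import math

def normalize_color(color, floor=0.3):
    """Divide each channel by the channel average, as in the shader (colors in 0..1)."""
    average = sum(c / 3.0 for c in color)  # dot(color, vec3(1.0 / 3.0))
    return tuple(c / max(average, floor) for c in color)

def matches(pixel, reference, threshold):
    """True when the normalized colors are within the threshold distance."""
    return math.dist(normalize_color(pixel), normalize_color(reference)) <= threshold

def normalize_u8(color_in):
    """Emulation of the C function: integer average, then scale and wrap to 8 bits."""
    total = sum(c // 3 for c in color_in)  # 85 + 42 + 0 = 127 for orange
    return tuple(int(c / total * 255) & 0xFF for c in color_in)

orange = (1.0, 0.5, 0.0)
dim_orange = (0.8, 0.4, 0.0)
print(matches(dim_orange, orange, 0.01))        # True: same hue, different brightness
print(matches((0.0, 0.0, 1.0), orange, 0.01))   # False: blue is far from orange
print(normalize_u8((255, 127, 0)))              # (0, 255, 0)
```

In the 8-bit version the red channel becomes 255 / 127 * 255, which is about 512 and no longer fits into a byte, so it wraps to 0; that is why the debugger shows (0, 255, 0) for the orange reference color.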
This is just one way of determining proximity of one color to another. Another way is to convert the RGB values into Y, Cr, and Cb (Y, U, and V) components and then take the distance between just the chrominance portions (Cr and Cb):
vec4 textureColor = texture2D(inputImageTexture, textureCoordinate);
vec4 textureColor2 = texture2D(inputImageTexture2, textureCoordinate2);
float maskY = 0.2989 * colorToReplace.r + 0.5866 * colorToReplace.g + 0.1145 * colorToReplace.b;
float maskCr = 0.7132 * (colorToReplace.r - maskY);
float maskCb = 0.5647 * (colorToReplace.b - maskY);
float Y = 0.2989 * textureColor.r + 0.5866 * textureColor.g + 0.1145 * textureColor.b;
float Cr = 0.7132 * (textureColor.r - Y);
float Cb = 0.5647 * (textureColor.b - Y);
float blendValue = 1.0 - smoothstep(thresholdSensitivity, thresholdSensitivity + smoothing, distance(vec2(Cr, Cb), vec2(maskCr, maskCb)));
This code is what I use in a chroma keying shader, and it's based on a similar calculation that Apple uses in one of their sample applications. Which one is best can depend on the particular situation you're facing.
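For comparison, the chrominance-distance variant can be sketched the same way. The Python below reuses the coefficients from the shader snippet; the function names and the sensitivity and smoothing values are assumptions for illustration only, and `smoothing` must be greater than zero.

```python
import math

def rgb_to_crcb(r, g, b):
    """Chrominance pair (Cr, Cb) using the same coefficients as the shader."""
    y = 0.2989 * r + 0.5866 * g + 0.1145 * b
    return 0.7132 * (r - y), 0.5647 * (b - y)

def smoothstep(edge0, edge1, x):
    """Hermite interpolation between 0 and 1, as defined for GLSL smoothstep."""
    t = min(max((x - edge0) / (edge1 - edge0), 0.0), 1.0)
    return t * t * (3.0 - 2.0 * t)

def blend_value(pixel, key, sensitivity, smoothing):
    """1.0 when the pixel chroma matches the key color, falling to 0.0 as it moves away."""
    d = math.dist(rgb_to_crcb(*pixel), rgb_to_crcb(*key))
    return 1.0 - smoothstep(sensitivity, sensitivity + smoothing, d)

green_key = (0.0, 1.0, 0.0)
print(blend_value(green_key, green_key, 0.4, 0.1))        # 1.0: exact match
print(blend_value((1.0, 0.0, 0.0), green_key, 0.4, 0.1))  # 0.0: red is far from the key
```

Since luma is subtracted before the distance is taken, two gray values of different brightness end up with nearly identical chroma, which is the property both approaches are after.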
