YUV420 to RGB conversion - opencv

I converted an RGB matrix to YUV matrix using this formula:
Y = (0.257 * R) + (0.504 * G) + (0.098 * B) + 16
Cr = V = (0.439 * R) - (0.368 * G) - (0.071 * B) + 128
Cb = U = -(0.148 * R) - (0.291 * G) + (0.439 * B) + 128
I then did a 4:2:0 chroma subsample on the matrix. I think I did this correctly: I took 2x2 submatrices from the YUV matrix, sorted the four values from least to greatest, and took the average of the two middle values.
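A minimal Python sketch of the 2x2 reduction described above, assuming a single chroma plane stored as a list of rows with even width and height (the name subsample_2x2 is made up for illustration):

```python
def subsample_2x2(plane):
    """Reduce a chroma plane by 2 in each direction.

    Each 2x2 block is replaced by the average of its two middle values
    (the median of four samples), as described in the question.
    Assumes even width and height.
    """
    out = []
    for r in range(0, len(plane), 2):
        row = []
        for c in range(0, len(plane[r]), 2):
            block = sorted([plane[r][c], plane[r][c + 1],
                            plane[r + 1][c], plane[r + 1][c + 1]])
            row.append((block[1] + block[2]) / 2)
        out.append(row)
    return out
```

Note that 4:2:0 subsamples only the U and V planes; the Y plane stays at full resolution.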
I then used this formula, from Wikipedia, to access the Y, U, and V planes:
size.total = size.width * size.height;
y = yuv[position.y * size.width + position.x];
u = yuv[(position.y / 2) * (size.width / 2) + (position.x / 2) + size.total];
v = yuv[(position.y / 2) * (size.width / 2) + (position.x / 2) + size.total + (size.total / 4)];
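For reference, the indexing above can be transcribed into Python as a rough sketch. It assumes a flat planar buffer (full-size Y plane, then the quarter-size U plane, then the quarter-size V plane) with even width and height; yuv420p_sample is a hypothetical helper name:

```python
def yuv420p_sample(yuv, width, height, x, y):
    """Return (Y, U, V) for pixel (x, y) from a planar 4:2:0 buffer.

    Layout assumed: Y plane (width * height samples), then U plane,
    then V plane, each chroma plane holding (width/2) * (height/2) samples.
    """
    total = width * height
    # Every 2x2 block of pixels shares one chroma sample.
    chroma_index = (y // 2) * (width // 2) + (x // 2)
    return (yuv[y * width + x],
            yuv[total + chroma_index],
            yuv[total + total // 4 + chroma_index])
```

The offsets into the U and V planes are additions of whole plane sizes, not modulo operations.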
I'm using OpenCV so I tried to interpret this as best I can:
y = src.data[(i*channels)+(j*step)];
u = src.data[(j%4)*step + ((i%2)*channels+1) + max];
v = src.data[(j%4)*step + ((i%2)*channels+2) + max + (max%4)];
src is the YUV subsampled matrix. Did I interpret that formula correctly?
Here is how I converted the colours back to RGB:
bgr.data[(i*channels)+(j*step)] = (1.164 * (y - 16)) + (2.018 * (u - 128)); // B
bgr.data[(i*channels+1)+(j*step)] = (1.164 * (y - 16)) - (0.813 * (v - 128)) - (0.391 * (u - 128)); // G
bgr.data[(i*channels+2)+(j*step)] = (1.164 * (y - 16)) + (1.596 * (v - 128)); // R
The problem is my image does not return to its original colours.
Here are the images for reference:
http://i.stack.imgur.com/vQkpT.jpg (Subsampled)
http://i.stack.imgur.com/Oucc5.jpg (Output)
I now see that I should be converting from YUV444 to RGB, but I don't quite understand what the clip function does in the sample I found on Wikipedia.
C = Y' − 16
D = U − 128
E = V − 128
R = clip(( 298 * C + 409 * E + 128) >> 8)
G = clip(( 298 * C - 100 * D - 208 * E + 128) >> 8)
B = clip(( 298 * C + 516 * D + 128) >> 8)
Does the >> mean I should shift bits?
I'd appreciate any help/comments! Thanks
Tried doing the YUV444 conversion but it just made my image appear in shades of green.
y = src.data[(i*channels)+(j*step)];
u = src.data[(j%4)*step + ((i%2)*channels+1) + max];
v = src.data[(j%4)*step + ((i%2)*channels+2) + max + (max%4)];
c = y - 16;
d = u - 128;
e = v - 128;
bgr.data[(i*channels+2)+(j*step)] = clip((298*c + 409*e + 128)/256);
bgr.data[(i*channels+1)+(j*step)] = clip((298*c - 100*d - 208*e + 128)/256);
bgr.data[(i*channels)+(j*step)] = clip((298*c + 516*d + 128)/256);
And my clip function:
int clip(double value)
{
    return (value > 255) ? 255 : (value < 0) ? 0 : value;
}

I had the same problem when decoding WebM frames to RGB. I finally found the solution after hours of searching.
Take the SCALEYUV function from here: http://www.telegraphics.com.au/svn/webpformat/trunk/webpformat.h
Then to decode the RGB data from YUV, see this file:
Search for "py = img->planes[0];", there are two algorithms to convert the data. I only tried the simple one (after "// then fall back to cheaper method.").
Comments in the code also refer to this page: http://www.poynton.com/notes/colour_and_gamma/ColorFAQ.html#RTFToC30
Works great for me.

You won't get back exactly the same image, since subsampling the U and V planes discards chroma information.
You don't say whether the result is completely wrong (i.e. an error) or just not perfect.
R = clip(( 298 * C + 409 * E + 128) >> 8)
G = clip(( 298 * C - 100 * D - 208 * E + 128) >> 8)
B = clip(( 298 * C + 516 * D + 128) >> 8)
The >> 8 is a right bit shift by 8, equivalent to dividing by 256. It just lets you do all the arithmetic in integers rather than floating point, for speed. clip() then clamps the result to the valid 0-255 range of an 8-bit channel.
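As a rough illustration, the integer formulas above can be written in Python like this. It is a sketch for a single sample, and the names clip and yuv_to_rgb are chosen here for the example:

```python
def clip(value):
    """Clamp an integer to the 0..255 range of an 8-bit channel."""
    return max(0, min(255, value))


def yuv_to_rgb(y, u, v):
    """Convert one Y'UV sample to 8-bit RGB using integer math only."""
    c = y - 16
    d = u - 128
    e = v - 128
    # The coefficients are the floating-point ones scaled by 256;
    # adding 128 before shifting rounds to nearest instead of truncating.
    r = clip((298 * c + 409 * e + 128) >> 8)
    g = clip((298 * c - 100 * d - 208 * e + 128) >> 8)
    b = clip((298 * c + 516 * d + 128) >> 8)
    return r, g, b
```

With neutral chroma (u = v = 128), y = 16 maps to black and y = 235 maps to white; values outside that range are what clip() catches.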

I was experimenting with the formulas on Wikipedia and found that this mixed formula:
byte c = (byte) (y - 16);
byte d = (byte) (u - 128);
byte e = (byte) (v - 128);
byte r = (byte) (c + (1.370705 * (e)));
byte g = (byte) (c - (0.698001 * (d)) - (0.337633 * (e)));
byte b = (byte) (c + (1.732446 * (d)));
produces "better" errors for my images: it simply turns some black points pure green (i.e. rgb = 0x00FF00), which is easier to detect and correct.
wiki source: https://en.wikipedia.org/wiki/YUV#Y.27UV420p_.28and_Y.27V12_or_YV12.29_to_RGB888_conversion


Is there a way to check if an XYZ triplet is a valid color?

The XYZ color space encompasses all possible colors, not just those which can be generated by a particular device like a monitor. Not all XYZ triplets represent a color that is physically possible. Is there a way, given an XYZ triplet, to determine if it represents a real color?
I wanted to generate a CIE 1931 chromaticity diagram (seen below) for myself, but wasn't sure how to go about it. It's easy to, for example, take all combinations of sRGB triplets, transform them into the xy coordinates of the chromaticity diagram, and plot them. You cannot use this same approach in the XYZ color space, though, since not all combinations are valid colors. So far the best I have come up with is a stochastic approach, where I generate a random spectral distribution by summing a random number of random Gaussians, then convert it to XYZ using the standard observer functions.
Having thought about it a little more, I felt the obvious solution is to generate a list of xy points around the edge of the spectral locus, corresponding to pure monochromatic colors. It seems to me that this can be done by directly inputting the visible wavelengths (~380-780 nm) into the CIE XYZ standard observer color matching functions. Treating these points as a convex polygon, you could determine whether a point is within the spectral locus using one point-in-polygon algorithm or another. In my case, since what I really wanted to do is simply generate the chromaticity diagram, I input these points into a graphics library's polygon drawing routine and then transform each pixel of the polygon into sRGB.
I believe this solution is similar to the one used by the library that Kel linked in a comment. I'm not entirely sure, as I am not familiar with Python.
function RGBfromXYZ(X, Y, Z) {
    const R = 3.2404542 * X - 1.5371385 * Y - 0.4985314 * Z
    const G = -0.969266 * X + 1.8760108 * Y + 0.0415560 * Z
    const B = 0.0556434 * X - 0.2040259 * Y + 1.0572252 * Z
    return [R, G, B]
}
function XYZfromYxy(Y, x, y) {
    const X = Y / y * x
    const Z = Y / y * (1 - x - y)
    return [X, Y, Z]
}
function srgb_from_linear(x) {
    if (x <= 0.0031308) {
        return x * 12.92
    } else {
        return 1.055 * Math.pow(x, 1/2.4) - 0.055
    }
}
// Analytic Approximations to the CIE XYZ Color Matching Functions
// from Sloan http://jcgt.org/published/0002/02/01/paper.pdf
function xFit_1931(x) {
    const t1 = (x - 442) * (x < 442 ? 0.0624 : 0.0374)
    const t2 = (x - 599.8) * (x < 599.8 ? 0.0264 : 0.0323)
    const t3 = (x - 501.1) * (x < 501.1 ? 0.0490 : 0.0382)
    return 0.362 * Math.exp(-0.5 * t1 * t1) + 1.056 * Math.exp(-0.5 * t2 * t2) - 0.065 * Math.exp(-0.5 * t3 * t3)
}
function yFit_1931(x) {
    const t1 = (x - 568.8) * (x < 568.8 ? 0.0213 : 0.0247)
    const t2 = (x - 530.9) * (x < 530.9 ? 0.0613 : 0.0322)
    return 0.821 * Math.exp(-0.5 * t1 * t1) + 0.286 * Math.exp(-0.5 * t2 * t2)
}
function zFit_1931(x) {
    const t1 = (x - 437) * (x < 437 ? 0.0845 : 0.0278)
    const t2 = (x - 459) * (x < 459 ? 0.0385 : 0.0725)
    return 1.217 * Math.exp(-0.5 * t1 * t1) + 0.681 * Math.exp(-0.5 * t2 * t2)
}
const canvas = document.createElement("canvas")
canvas.width = canvas.height = 512
const ctx = canvas.getContext("2d")
// Sample the spectral locus: one xy point per wavelength (in nm).
const locus_points = []
for (let i = 440; i < 650; ++i) {
    const [X, Y, Z] = [xFit_1931(i), yFit_1931(i), zFit_1931(i)]
    const x = (X / (X + Y + Z)) * canvas.width
    const y = (Y / (X + Y + Z)) * canvas.height
    locus_points.push([x, y])
}
// Fill the locus polygon so that its pixels get a non-zero alpha.
// (beginPath/moveTo/closePath/fill are reconstructed; they were missing from the listing.)
ctx.beginPath()
ctx.moveTo(...locus_points[0])
locus_points.slice(1).forEach(point => ctx.lineTo(...point))
ctx.closePath()
ctx.fill()
// Recolor every filled pixel with the sRGB color of its xy chromaticity.
const imageData = ctx.getImageData(0, 0, canvas.width, canvas.height)
for (let y = 0; y < canvas.height; ++y) {
    for (let x = 0; x < canvas.width; ++x) {
        const alpha = imageData.data[(y * canvas.width + x) * 4 + 3]
        if (alpha > 0) {
            const [X, Y, Z] = XYZfromYxy(1, x / canvas.width, y / canvas.height)
            const [R, G, B] = RGBfromXYZ(X, Y, Z)
            const r = Math.round(srgb_from_linear(R / Math.sqrt(R**2 + G**2 + B**2)) * 255)
            const g = Math.round(srgb_from_linear(G / Math.sqrt(R**2 + G**2 + B**2)) * 255)
            const b = Math.round(srgb_from_linear(B / Math.sqrt(R**2 + G**2 + B**2)) * 255)
            imageData.data[(y * canvas.width + x) * 4 + 0] = r
            imageData.data[(y * canvas.width + x) * 4 + 1] = g
            imageData.data[(y * canvas.width + x) * 4 + 2] = b
        }
    }
}
ctx.putImageData(imageData, 0, 0)
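For the original question (testing a single point rather than drawing the diagram), the point-in-convex-polygon idea can be sketched in Python as follows. The polygon used here is only a stand-in: it holds the xy chromaticities of the sRGB primaries, not the spectral locus, and inside_convex_polygon is a name made up for this example.

```python
def inside_convex_polygon(point, polygon):
    """Return True if point lies inside (or on the edge of) a convex polygon.

    polygon is a list of (x, y) vertices in a consistent winding order.
    The point is inside when it lies on the same side of every edge.
    """
    px, py = point
    sign = 0
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # The cross product tells which side of edge (x1,y1)-(x2,y2) the point is on.
        cross = (x2 - x1) * (py - y1) - (y2 - y1) * (px - x1)
        if cross != 0:
            if sign == 0:
                sign = 1 if cross > 0 else -1
            elif (cross > 0) != (sign > 0):
                return False
    return True


# Illustrative polygon only: the sRGB primaries in xy, standing in for
# the much longer list of spectral locus points.
srgb_gamut = [(0.64, 0.33), (0.30, 0.60), (0.15, 0.06)]
```

To test an XYZ triplet, first project it to xy with x = X / (X + Y + Z) and y = Y / (X + Y + Z), then pass (x, y) together with the list of locus points.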

Convert Color Spaces in Pillow / PIL

Right now I am converting an image from YCrCb to RGB using OpenCV:
cv2.cvtColor(arr, cv2.COLOR_YCR_CB2RGB)
Is there a function in Pillow / PIL to perform this same color conversion? At the very least, I would like to perform the color conversion without needing OpenCV.
I tried the following:
def _rgb( xxx ):
    y, cb, cr = xxx
    r = y + 1.402 * ( cr - 128 )
    g = y - .34414 * ( cb - 128 ) - .71414 * ( cr - 128 )
    b = y + 1.772 * ( cb - 128 )
    return r, g, b
np.apply_along_axis( _rgb, 2, arr.astype( np.float32 ) ).astype( np.uint8 )
and it is very slow and does not quite work.
Conversion per-se
YCrCb-Colorspace conversion to RGB-Colorspace states:
R = Y + 1.402 * ( Cr - 128 )
G = Y - 0.34414 * ( Cb - 128 ) - 0.71414 * ( Cr - 128 )
B = Y + 1.772 * ( Cb - 128 )
Nota Bene 1:
The OpenCV sources document its conversion as using different coefficients than http://en.wikipedia.org/wiki/HSL_and_HSV, which is based on ITU-R Recommendation BT.709 and BT.601 respectively:
R = Y + 1.403 * ( Cr - delta )
G = Y - 0.344 * ( Cb - delta ) - 0.714 * ( Cr - delta )
B = Y + 1.773 * ( Cb - delta )
delta = 128 # for 8-bit images CV_8U,
# 32768 # for 16-bit images CV_16U,
# 0.5 # for floating-point images CV_32F.
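A per-pixel Python sketch of these documented formulas for the integer depths, to make the role of delta concrete. The helper names saturate and ycrcb_to_rgb are invented here, and Python's round() is only an approximation of whatever rounding OpenCV applies internally:

```python
def saturate(value, max_value=255):
    """Round to the nearest integer and clamp to 0..max_value."""
    return max(0, min(max_value, int(round(value))))


def ycrcb_to_rgb(y, cr, cb, delta=128, max_value=255):
    """Convert one YCrCb sample (note the Cr-before-Cb order) to RGB."""
    r = saturate(y + 1.403 * (cr - delta), max_value)
    g = saturate(y - 0.344 * (cb - delta) - 0.714 * (cr - delta), max_value)
    b = saturate(y + 1.773 * (cb - delta), max_value)
    return r, g, b
```

For 16-bit data the same function would be called with delta=32768 and max_value=65535.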
Nota Bene 2: [ref. below]
Efficient implementation
In vectorised mode numpy can help, with a potential further speedup from JIT compilation with numba:
import numpy as np
import numba                                          # optional: not used below, only needed if JIT compilation is added
def translateYCrCb2RGB( a3DMatrixOfUINT8_YCrCb ):     # naive type-checking & no exception handling
    # compute in float32: uint8 arithmetic would wrap around on ( Cr - 128 ) and on results outside 0..255
    aYCrCb     = a3DMatrixOfUINT8_YCrCb.astype( np.float32 )
    Y, Cr, Cb  = aYCrCb[:,:,0], aYCrCb[:,:,1], aYCrCb[:,:,2]
    aRGB       = np.empty( aYCrCb.shape, dtype = np.float32 )
    aRGB[:,:,0] = Y + 1.402   * ( Cr - 128 )
    aRGB[:,:,1] = Y - 0.34414 * ( Cb - 128 ) - 0.71414 * ( Cr - 128 )
    aRGB[:,:,2] = Y + 1.772   * ( Cb - 128 )
    # round, saturate to 0..255, then convert back to uint8
    return np.clip( np.rint( aRGB ), 0, 255 ).astype( np.uint8 )
Further acceleration tricks may help at a cost of a larger memory footprint or destructive handling of the mutable original YCrCb-matrix
Pre-sliced approach
def translateYCrCb2RGB( Y__slice,   # YCrCb_ORIGINAL[:,:,0], # ... asView
                        Cr_slice,   # YCrCb_ORIGINAL[:,:,1], # ... asView
                        Cb_slice    # YCrCb_ORIGINAL[:,:,2]  # ... asView
                        ):          # naive type-checking & no exception handling
    # slices are expected as float arrays ( e.g. after .astype( np.float32 ) ); uint8 views would wrap around
    # the result is a float array that still needs clipping to 0..255 before conversion to uint8
    return( np.dstack( ( Y__slice + 1.402   * ( Cr_slice - 128 ),
                         Y__slice - 0.34414 * ( Cb_slice - 128 ) - 0.71414 * ( Cr_slice - 128 ),
                         Y__slice + 1.772   * ( Cb_slice - 128 )
                         )          # .dstack consumes aTUPLE
                       )
            )
Conventions need not match
def getCvFromPIL( PILpic ):
    return np.array( PILpic.getdata(),          # .getdata()
                     dtype = np.uint8           # .uint8 type-enforced
                     ).reshape( ( PILpic.size[1],   # .reshape x
                                  PILpic.size[0],   #          y
                                  3                 #          z-depth
                                  )                 # aTUPLE
                                )[:,:,::-1]         # RGB c-reverse -> to BGR as cv2 standard representation
From openCV sources one may read about implemented precision of coefs:
template<typename _Tp> struct YCrCb2RGB_f
{
    typedef _Tp channel_type;
    YCrCb2RGB_f(int _dstcn, int _blueIdx, const float* _coeffs)
        : dstcn(_dstcn), blueIdx(_blueIdx)
    {
        static const float coeffs0[] = {1.403f, -0.714f, -0.344f, 1.773f};
        memcpy(coeffs, _coeffs ? _coeffs : coeffs0, 4*sizeof(coeffs[0]));
    }
    void operator()(const _Tp* src, _Tp* dst, int n) const
    {
        int dcn = dstcn, bidx = blueIdx;
        const _Tp delta = ColorChannel<_Tp>::half(), alpha = ColorChannel<_Tp>::max();
        float C0 = coeffs[0], C1 = coeffs[1], C2 = coeffs[2], C3 = coeffs[3];
        n *= 3;
        for(int i = 0; i < n; i += 3, dst += dcn)
        {
            _Tp Y = src[i];
            _Tp Cr = src[i+1];
            _Tp Cb = src[i+2];
            _Tp b = saturate_cast<_Tp>(Y + (Cb - delta)*C3);
            _Tp g = saturate_cast<_Tp>(Y + (Cb - delta)*C2 + (Cr - delta)*C1);
            _Tp r = saturate_cast<_Tp>(Y + (Cr - delta)*C0);
            dst[bidx] = b; dst[1] = g; dst[bidx^2] = r;
            if( dcn == 4 )
                dst[3] = alpha;
        }
    }
    int dstcn, blueIdx;
    float coeffs[4];
};

How to apply a kernel to a raster image

I'm trying to apply a sharpen kernel to a raster picture. Here is my kernel:
{  0.0f, -1.0f,  0.0f,
  -1.0f,  5.0f, -1.0f,
   0.0f, -1.0f,  0.0f }
And here is my Code:
struct Pixel{
    GLubyte R, G, B;
    float x, y;
};
. . .
for (unsigned i = 1; i < iWidth - 1; i++){
    for (unsigned j = 1; j < iHeight - 1; j++){
        float r = 0, g = 0, b = 0;
        r += -(float)pixels[i + 1][j].R;
        g += -(float)pixels[i + 1][j].G;
        b += -(float)pixels[i + 1][j].B;
        r += -(float)pixels[i - 1][j].R;
        g += -(float)pixels[i - 1][j].G;
        b += -(float)pixels[i - 1][j].B;
        r += -(float)pixels[i][j + 1].R;
        g += -(float)pixels[i][j + 1].G;
        b += -(float)pixels[i][j + 1].B;
        r += -(float)pixels[i][j - 1].R;
        g += -(float)pixels[i][j - 1].G;
        b += -(float)pixels[i][j - 1].B;
        pixels[i][j].R = (GLubyte)((pixels[i][j].R * 5) + r);
        pixels[i][j].G = (GLubyte)((pixels[i][j].G * 5) + g);
        pixels[i][j].B = (GLubyte)((pixels[i][j].B * 5) + b);
    }
}
But the colors get mixed up when I apply this kernel. Here is an example:
What am I doing wrong?
NOTE : I know that OpenGL can do this fast and easy, but I just wanted to experiment on this kind of masks.
EDIT : The first code had a bug:
pixels[i][j].R = (GLubyte)((pixels[i][j].R * 5) + r);
pixels[i][j].G = (GLubyte)((pixels[i][j].R/*G*/ * 5) + g);
pixels[i][j].B = (GLubyte)((pixels[i][j].R/*B*/ * 5) + b);
I fixed it but I still got that problem.
I've changed the last three lines to this:
r = (float)((pixels[i][j].R * 5) + r);
g = (float)((pixels[i][j].G * 5) + g);
b = (float)((pixels[i][j].B * 5) + b);
if (r < 0) r = 0;
if (g < 0) g = 0;
if (b < 0) b = 0;
if (r > 255) r = 255;
if (g > 255) g = 255;
if (b > 255) b = 255;
pixels[i][j].R = r;
pixels[i][j].G = g;
pixels[i][j].B = b;
And now the output looks like this:
You have a copy-paste bug here:
pixels[i][j].R = (GLubyte)((pixels[i][j].R * 5) + r);
pixels[i][j].G = (GLubyte)((pixels[i][j].R * 5) + g);
pixels[i][j].B = (GLubyte)((pixels[i][j].R * 5) + b);
This should be:
pixels[i][j].R = (GLubyte)((pixels[i][j].R * 5) + r);
pixels[i][j].G = (GLubyte)((pixels[i][j].G * 5) + g);
pixels[i][j].B = (GLubyte)((pixels[i][j].B * 5) + b);
Also it looks like you may have iWidth/iHeight transposed, but it's hard to say without seeing the rest of the code. Typically though the outer loop iterates over rows, so the upper bound would be the number of rows, i.e. the image height.
Most importantly though you have a fundamental problem in that you're trying to perform a neighbourhood operation in-place. Each output pixel depends on its neighbours, but you're modifying these neighbours as you iterate through the image. You need to do this kind of operation out-of-place, i.e. have a separate output image:
out_pixels[i][j].R = r;
out_pixels[i][j].G = g;
out_pixels[i][j].B = b;
so that the input image does not get modified. (Note also that you'll want to copy the edge pixels over from the input image to the output image.)
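The out-of-place idea can be sketched in Python as follows, assuming the image is a list of rows of (R, G, B) tuples; sharpen is a name chosen for this example, and the clamping mirrors the asker's later edit:

```python
def sharpen(image):
    """Apply the 3x3 sharpen kernel out-of-place.

    image is a list of rows of (R, G, B) tuples with 0..255 integers.
    The input is never modified; edge pixels are copied unchanged.
    """
    height, width = len(image), len(image[0])
    out = [list(row) for row in image]      # separate output, edges already copied
    for i in range(1, height - 1):
        for j in range(1, width - 1):
            pixel = []
            for ch in range(3):
                # Always read from the untouched input image.
                value = (5 * image[i][j][ch]
                         - image[i - 1][j][ch] - image[i + 1][j][ch]
                         - image[i][j - 1][ch] - image[i][j + 1][ch])
                pixel.append(max(0, min(255, value)))   # clamp to 8 bits
            out[i][j] = tuple(pixel)
    return out
```

Because every read goes to image and every write goes to out, the order in which pixels are visited no longer affects the result.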

Python: Want to change image HSL like photoshop

I would like to implement this feature (changing HSL with the Colorize option ticked) in Python, preferably using PIL or maybe numpy.
Can someone explain how this works?
As far as I know, the approach is to use a built-in function such as color_to_hsl to get the HSL value, change it, convert it back to RGB, and finally write it to each individual pixel.
Any clue on how to get closer to this?
from PIL import Image
import colorsys
def colorize(im, h, s, l_adjust):
    h /= 360.0
    s /= 100.0
    l_adjust /= 100.0
    if im.mode != 'L':
        im = im.convert('L')
    result = Image.new('RGB', im.size)
    pixin = im.load()
    pixout = result.load()
    for y in range(im.size[1]):
        for x in range(im.size[0]):
            l = pixin[x, y] / 255.99
            l += l_adjust
            l = min(max(l, 0.0), 1.0)
            r, g, b = colorsys.hls_to_rgb(h, l, s)
            r, g, b = int(r * 255.99), int(g * 255.99), int(b * 255.99)
            pixout[x, y] = (r, g, b)
    return result
This is exactly what you do in Photoshop with Colorize checked:
from PIL import Image
import colorsys
def rgbLuminance(r, g, b):
    luminanceR = 0.22248840
    luminanceG = 0.71690369
    luminanceB = 0.06060791
    return (r * luminanceR) + (g * luminanceG) + (b * luminanceB)
def colorize(im, h, s, l_adjust):
    h /= 360.0
    s /= 100.0
    l_adjust /= 100.0
    result = Image.new('RGBA', im.size)
    pixin = im.load()
    pixout = result.load()
    for y in range(im.size[1]):
        for x in range(im.size[0]):
            currentR = pixin[x, y][0]/255
            currentG = pixin[x, y][1]/255
            currentB = pixin[x, y][2]/255
            lum = rgbLuminance(currentR, currentG, currentB)
            if l_adjust > 0:
                lum = lum * (1 - l_adjust)
                lum = lum + (1.0 - (1.0 - l_adjust))
            else:  # reconstructed branch (the line was missing from the listing): darken for l_adjust <= 0
                lum = lum * (l_adjust + 1)
            l = lum
            r, g, b = colorsys.hls_to_rgb(h, l, s)
            r, g, b = int(r * 255.99), int(g * 255.99), int(b * 255.99)
            pixout[x, y] = (r, g, b, 255)
    return result
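For contrast with the colorize code above, which uses one fixed hue and saturation for every pixel, the plain "convert, change, convert back" approach from the question can be sketched per pixel with the standard library only. shift_hue is a hypothetical helper, not a PIL function:

```python
import colorsys


def shift_hue(rgb, degrees):
    """Rotate the hue of an 8-bit (R, G, B) pixel, keeping lightness and saturation."""
    r, g, b = (channel / 255.0 for channel in rgb)
    h, l, s = colorsys.rgb_to_hls(r, g, b)      # note the H, L, S order
    h = (h + degrees / 360.0) % 1.0             # hue is a fraction of a full turn
    r, g, b = colorsys.hls_to_rgb(h, l, s)
    return tuple(int(round(channel * 255)) for channel in (r, g, b))
```

Rotating pure red by 120 degrees gives pure green, while greys are unaffected because their saturation is zero.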

Fast bilinear interpolation on old iOS devices

I've got the following code to do a bilinear interpolation from a matrix of 2D vectors. Each cell has the x and y values of the vector, and the function receives k and l indices giving the nearest bottom-left position in the matrix:
// p[1] returns the interpolated values
// fieldLinePointsVerts the raw data array of fieldNumHorizontalPoints x fieldNumVerticalPoints
// only fieldNumHorizontalPoints matters to determine the index to access the raw data
// k and l horizontal and vertical indices of the point just below p[0] in the raw data
void interpolate( vertex2d* p, vertex2d* fieldLinePointsVerts, int fieldNumHorizontalPoints, int k, int l ) {
int index = (l * fieldNumHorizontalPoints + k) * 2;
vertex2d p11;
p11.x = fieldLinePointsVerts[index].x;
p11.y = fieldLinePointsVerts[index].y;
vertex2d q11;
q11.x = fieldLinePointsVerts[index+1].x;
q11.y = fieldLinePointsVerts[index+1].y;
index = (l * fieldNumHorizontalPoints + k + 1) * 2;
vertex2d q21;
q21.x = fieldLinePointsVerts[index+1].x;
q21.y = fieldLinePointsVerts[index+1].y;
index = ( (l + 1) * fieldNumHorizontalPoints + k) * 2;
vertex2d q12;
q12.x = fieldLinePointsVerts[index+1].x;
q12.y = fieldLinePointsVerts[index+1].y;
index = ( (l + 1) * fieldNumHorizontalPoints + k + 1 ) * 2;
vertex2d p22;
p22.x = fieldLinePointsVerts[index].x;
p22.y = fieldLinePointsVerts[index].y;
vertex2d q22;
q22.x = fieldLinePointsVerts[index+1].x;
q22.y = fieldLinePointsVerts[index+1].y;
float fx = 1.0 / (p22.x - p11.x);
float fx1 = (p22.x - p[0].x) * fx;
float fx2 = (p[0].x - p11.x) * fx;
vertex2d r1;
r1.x = fx1 * q11.x + fx2 * q21.x;
r1.y = fx1 * q11.y + fx2 * q21.y;
vertex2d r2;
r2.x = fx1 * q12.x + fx2 * q22.x;
r2.y = fx1 * q12.y + fx2 * q22.y;
float fy = 1.0 / (p22.y - p11.y);
float fy1 = (p22.y - p[0].y) * fy;
float fy2 = (p[0].y - p11.y) * fy;
p[1].x = fy1 * r1.x + fy2 * r2.x;
p[1].y = fy1 * r1.y + fy2 * r2.y;
}
Currently this code needs to run every single frame on old iOS devices, say devices with ARMv6 processors.
I've taken the numeric sub-indices from Wikipedia's equations: http://en.wikipedia.org/wiki/Bilinear_interpolation
I'd appreciate any comments on optimizing for performance, even plain asm code.
This code should not be causing your slowdown if it's only run once per frame. However, if it's run multiple times per frame, it easily could be.
I'd run your app with a profiler to see where the true performance problem lies.
There is some room for optimization here: a) certain index calculations could be factored out and re-used in subsequent calculations, and b) you could dereference your fieldLinePointsVerts array to a pointer once and re-use that, instead of indexing it twice per index.
In general, though, those things won't help a great deal unless this function is being called many, many times per frame, in which case every little thing will help.
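To illustrate suggestion (a), here is a simplified Python sketch of bilinear interpolation in which the base index is computed once and the four neighbours are reached by fixed offsets. It assumes a row-major grid of scalars with unit spacing, which is simpler than the asker's data layout; the same restructuring carries over to the C function.

```python
def bilinear(grid, width, x, y):
    """Bilinearly interpolate a row-major grid of floats at (x, y).

    Assumes unit spacing between grid points and that the cell containing
    (x, y) has all four corners inside the grid.
    """
    k, l = int(x), int(y)              # bottom-left cell indices
    fx, fy = x - k, y - l              # fractional position inside the cell
    base = l * width + k               # computed once, reused for all corners
    q11, q21 = grid[base], grid[base + 1]
    q12, q22 = grid[base + width], grid[base + width + 1]
    r1 = q11 + fx * (q21 - q11)        # interpolate along x on the lower row
    r2 = q12 + fx * (q22 - q12)        # interpolate along x on the upper row
    return r1 + fy * (r2 - r1)         # then interpolate along y
```

Writing each step as a + f * (b - a) also needs one multiplication instead of two per interpolation, which is the kind of saving that only matters when the function runs many times per frame.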
