Calculate standard deviation for grayscale imagenet pixel values with rotation matrix and regular imagenet standard deviation - image-processing

I want to train some models to work with grayscale images, which is useful e.g. for microscope applications (Source). I therefore want to train my model on a grayscale ImageNet, using the PyTorch grayscale conversion (torchvision.transforms.Grayscale) to convert the RGB ImageNet to grayscale. Internally, PyTorch rotates the color space from RGB to YPbPr as follows:
Y' is then the grayscale channel, so Pb and Pr can be dropped after the transformation. In fact, PyTorch only computes
grayscale = (0.2989 * r + 0.587 * g + 0.114 * b)
To normalize the image data, I need to know grayscale-imagenet's mean pixel value, as well as the standard deviation. Is it possible to calculate those?
I had success in calculating the mean pixel intensity using
meanGrayscale = 0.2989 * r.mean() + 0.587 * g.mean() + 0.114 * b.mean()
Transforming an image and then calculating the grayscale mean, gives the same result as first calculating the RGB means and then transforming those to a grayscale mean.
However, I am clueless when it comes to calculating the variance or standard deviation. Does anybody have an idea, or know some good literature on the topic? Is this even possible?
I found a publication "Jianxin Gong - Clarifying the Standard Deviational Ellipse" ... There he does it in 2 dimensions (as far as I understand). I just could not figure out yet how to do it in 3D.

Okay, I wasn't able to calculate the standard deviation as planned, but did it using the code below. The grayscale imagenet's train dataset mean and standard deviation are (round it as much as you like):
Mean: 0.44531356896770125
Standard Deviation: 0.2692461874154524
import multiprocessing
import os

import numpy as np
from matplotlib.pyplot import imread  # assumption: the post does not show which imread was used

TRAIN_DIR = "/home/imagenet/ILSVRC/Data/CLS-LOC/train"

def calcSTD(d):
    """Return (sum of squared errors, number of pixels) for one class folder."""
    meanValue = 0.44531356896770125
    squaredError = 0.0
    numberOfPixels = 0
    folder = os.path.join(TRAIN_DIR, str(d))
    for f in os.listdir(folder):
        if f.endswith(".JPEG"):
            image = np.array(imread(os.path.join(folder, f)))
            # Transform to gray if not already gray
            if image.ndim == 3:
                # Indexing kept as in the original post, which treats channel 0 as blue.
                # Readers that return RGB put red in channel 0; in that case swap the
                # red and blue indices and recompute meanValue accordingly.
                blue = image[:, :, 0] / 255
                green = image[:, :, 1] / 255
                red = image[:, :, 2] / 255
                gray = 0.2989 * red + 0.587 * green + 0.114 * blue
            else:
                gray = image / 255
            # Vectorized equivalent of looping over every pixel
            squaredError += ((gray - meanValue) ** 2).sum()
            numberOfPixels += gray.size
    return (squaredError, numberOfPixels)

if __name__ == "__main__":
    folders = [f.name for f in os.scandir(TRAIN_DIR) if f.is_dir()]
    with multiprocessing.Pool() as a_pool:
        resultStD = a_pool.map(calcSTD, folders)
    StD = (sum(r[0] for r in resultStD) / sum(r[1] for r in resultStD)) ** 0.5
During the process, some warnings like this occurred:
UserWarning: Possibly corrupt EXIF data. Expecting to read 8 bytes
but only got 4. Skipping tag 41486 "Possibly corrupt EXIF data. "
The respective images from the 2019 version of ImageNet were skipped.
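For what it is worth, the variance of a weighted sum of channels can be derived from RGB statistics, but only if the full 3x3 covariance matrix of the channels is known. The three per-channel standard deviations alone are not enough, because the channels are correlated. Below is a minimal sketch of that identity, Var(w·x) = wᵀ Σ w, using randomly generated pixels as a hypothetical stand-in for ImageNet data; the function name gray_stats_from_rgb is made up for illustration.

```python
import numpy as np

# Grayscale weights quoted in the question.
WEIGHTS = np.array([0.2989, 0.587, 0.114])

def gray_stats_from_rgb(pixels):
    """pixels: (N, 3) array of RGB values. Returns (mean, std) of the weighted gray value."""
    mean_rgb = pixels.mean(axis=0)
    # Population covariance (bias=True) so the result matches np.std's default.
    cov_rgb = np.cov(pixels, rowvar=False, bias=True)
    gray_mean = WEIGHTS @ mean_rgb            # the mean is linear in the weights
    gray_var = WEIGHTS @ cov_rgb @ WEIGHTS    # the variance needs the covariances too
    return gray_mean, np.sqrt(gray_var)

# Hypothetical stand-in for image data: correlated random RGB pixels.
rng = np.random.default_rng(0)
base = rng.random((10000, 1))
pixels = np.clip(base + 0.1 * rng.standard_normal((10000, 3)), 0.0, 1.0)

mean_g, std_g = gray_stats_from_rgb(pixels)

# Direct computation for comparison: convert first, then take the statistics.
gray = pixels @ WEIGHTS
print(mean_g, gray.mean())
print(std_g, gray.std())
```

For a whole dataset this would mean accumulating the sums of r, g, b and of all pairwise products in one pass, instead of the per-channel statistics only.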


Is color.rgb2gray() image normalization?

From what I understand, the function color.rgb2gray() from the scikit-image package is a method of image normalization.
imgGray = color.rgb2gray(img)
According to the documentation:
"The value of each grayscale pixel is calculated as the weighted sum of the corresponding red, green and blue pixels as:
Y = 0.2125 R + 0.7154 G + 0.0721 B"
Can anyone confirm if my interpretation of this is correct?
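The quoted formula can be tried out directly. The following is a minimal sketch in plain NumPy, not scikit-image's actual implementation, and weighted_gray is a hypothetical helper name. It reduces the three channels to one; because the weights sum to 1, the output stays in the same range as the input and is not shifted or rescaled by a mean or standard deviation.

```python
import numpy as np

# Weights quoted from the scikit-image documentation above.
LUMA_WEIGHTS = np.array([0.2125, 0.7154, 0.0721])

def weighted_gray(rgb):
    """Collapse the last (channel) axis of an RGB array into a single gray value."""
    return rgb @ LUMA_WEIGHTS

# A 2x2 test image: white, pure red, pure green, pure blue.
img = np.array([[[1.0, 1.0, 1.0], [1.0, 0.0, 0.0]],
                [[0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]])

gray = weighted_gray(img)
print(gray.shape)
print(gray)
```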

Conversion of arbitrary range to luma

I have a 2D matrix of values in the range <-a, b>, and I would like to visualize it as a grayscale image. How should I process my data to visualize it correctly?
As far as I know, the human eye has a logarithmic response, so my transformation should be logarithmic too.
Convert your values to the lightness channel of a perceptually uniform color space, for example CIE Lab or Luv. Then convert from that to RGB for display.
These conversions are available in the colormath module, for example.
If your input value is in x:
from colormath.color_objects import LabColor, sRGBColor
from colormath.color_conversions import convert_color

L = 100 * (x - xmin) / (xmax - xmin)  # L is 0-100
a, b = 0, 0  # neutral values
lab = LabColor(L, a, b)
# colormath 2.x calls this class sRGBColor; the 1.x releases used RGBColor
rgb = convert_color(lab, sRGBColor)
# display rgb
Matplotlib has a lot of info about this in the section on colormaps.
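For neutral grays (a = b = 0) the same mapping can be sketched without an extra dependency. The example below is a hedged, NumPy-only approximation of the idea: it maps the data linearly to CIE lightness L*, inverts the CIE lightness function to get relative luminance, and applies the standard sRGB transfer function. The helper name values_to_gray and the sample data are assumptions for illustration, and the sketch assumes the input is not constant.

```python
import numpy as np

def values_to_gray(x):
    """Map an array of arbitrary range to sRGB gray levels in [0, 1],
    spaced uniformly in CIE lightness L*."""
    x = np.asarray(x, dtype=float)
    # Linear map of the data range onto L* in 0-100 (requires x.max() > x.min()).
    L = 100.0 * (x - x.min()) / (x.max() - x.min())
    # L* -> relative luminance Y (inverse of the CIE lightness function).
    Y = np.where(L > 8.0, ((L + 16.0) / 116.0) ** 3, L / 903.3)
    # Relative luminance -> sRGB-encoded gray value.
    return np.where(Y <= 0.0031308, 12.92 * Y, 1.055 * Y ** (1 / 2.4) - 0.055)

data = np.linspace(-3.0, 7.0, 5)   # hypothetical values in <-a, b>
gray = values_to_gray(data)
print(gray)
```

The resulting array can be passed to any viewer that expects sRGB gray values in [0, 1], for example by repeating it across the three color channels.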

How to efficiently find and remove 1 pixel bands of image intensity changes?

We're seeing visual artifacts on a normal map for a shader, caused by bands of single pixels that contrast strongly with their surroundings. Just to be clear, edges are not an issue, only these single-pixel bands.
A typical Sobel edge detector would not work in this case because, on top of such a band, it would detect 0. I can think of other modifications to the kernel that might work, such as
-1 -2 -1
2 4 2
-1 -2 -1
but I assumed that there was likely a "correct" mathematical way to do such an operation.
In the end, I want to smooth these lines out using the surrounding pixels (so a selective blur). These lines could appear in any orientation, so if I were to use the above kernel, I would need to apply it in both directions and add the results to get the line intensity, similar to applying the Sobel kernel.
I assume that you have lines of 1 pixel width in your image that are brighter or darker than their surroundings and you want to find them and remove them from the image and replace the removed pixels by an average of the local neighborhood.
I developed an algorithm for this and it works on my example data (since you did not give any data). It has two parts:
Identification of lines
I could not think of a simple, yet effective filter to detect lines (which are connected, so one would probably need to look at correlations). So I used a simple single pixel detection filter:
-1 -1 -1
-1 8 -1
-1 -1 -1
and then some suitable thresholding.
Extrapolation of data from outside of a mask to the mask
A very elegant solution (using only convolutions) is to take the data outside the mask and convolve it with a Gaussian, then take the negated mask and convolve it with the very same Gaussian, then divide the two pixelwise. The result within the mask is the desired blurring.
Mathematically, this is a weighted average of the data outside the mask.
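The two steps can be sketched in Python as well. This is a minimal NumPy-only illustration under stated assumptions, not a port of the Matlab code: the helper names, the 3-tap Gaussian, the 0.2 threshold factor and the ramp phantom are all chosen here for the example.

```python
import numpy as np

def single_pixel_response(data):
    """Response of the 3x3 kernel [-1 -1 -1; -1 8 -1; -1 -1 -1], edges replicated."""
    padded = np.pad(data, 1, mode="edge")
    h, w = data.shape
    window_sum = sum(padded[i:i + h, j:j + w] for i in range(3) for j in range(3))
    return 9.0 * data - window_sum   # 8 * center minus the 8 neighbours

def blur(a, kernel):
    """Separable blur with a 1-D kernel, zero-padded at the borders."""
    a = np.apply_along_axis(np.convolve, 0, a, kernel, mode="same")
    return np.apply_along_axis(np.convolve, 1, a, kernel, mode="same")

def fill_masked(data, mask, sigma=1.0):
    """Replace masked pixels by a Gaussian-weighted average of unmasked neighbours."""
    offsets = np.arange(-1, 2)
    kernel = np.exp(-offsets**2 / (2.0 * sigma**2))
    kernel /= kernel.sum()
    keep = (~mask).astype(float)
    numerator = blur(data * keep, kernel)      # smoothed data outside the mask
    denominator = blur(keep, kernel)           # smoothed negated mask
    smoothed = numerator / np.maximum(denominator, 1e-12)  # guard against 0/0
    return np.where(mask, smoothed, data)

# Phantom: a smooth ramp with one bright single-pixel band.
ramp = np.add.outer(np.arange(20.0), np.arange(20.0))
data = ramp.copy()
data[10, :] += 200.0

response = single_pixel_response(data)
mask = response > 0.2 * response.max()
restored = fill_masked(data, mask)
print(np.abs(restored - ramp).max())
```

Dividing by the blurred negated mask is what turns the blur into a proper weighted average: pixels inside the mask contribute zero weight, and the division renormalizes the remaining weights to one.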
Here is my phantom data:
And this is the identification of the lines
And the final result shows that the distortion has been suppressed tenfold:
And finally my code (in Matlab):
%% create phantom data with lines (1pixel wide bands)
[x, y] = ndgrid(1:100, 1:100);
original = 3 * x - 2 * y + 100 * sin(x / 2) + 120 * cos(y / 3); % funny shapes
bw = original > mean(original(:)); % black and white
distortion = bwmorph(bw,'remove'); % some lines
data = original + max(original(:)) * distortion; % phantom
% show
subplot(1,3,1); imagesc(original); axis image; colormap(hot); title('original');
subplot(1,3,2); imagesc(distortion); axis image; title('distortion');
subplot(1,3,3); imagesc(data); axis image; title('image');
%% line detection
% filter by single pixel filter
pixel_filtered = filter2([-1,-1,-1;-1,8,-1;-1,-1,-1], data);
% create mask by simple thresholding
mask = pixel_filtered > 0.2 * max(pixel_filtered(:));
% show
subplot(1,2,1); imagesc(pixel_filtered); axis image; colormap(hot); title('filtered');
subplot(1,2,2); imagesc(mask); axis image; title('mask');
%% line removal and interpolation
% smoothing kernel: gaussian
smooth_kernel = fspecial('gaussian', [3, 3], 1);
smooth_kernel = smooth_kernel ./ sum(smooth_kernel(:)); % normalize to one
% smooth image outside mask and divide by smoothed negative mask
smoothed = filter2(smooth_kernel, data .* ~mask) ./ filter2(smooth_kernel, ~mask);
% within mask set data to smoothed
reconstruction = data .* ~mask + smoothed .* mask;
% show
subplot(1,3,1); imagesc(reconstruction); axis image; colormap(hot); title('reconstruction');
subplot(1,3,2); imagesc(original); axis image; title('original');
subplot(1,3,3); imagesc(reconstruction - original); axis image; title('difference');

change number of gray levels in a grayscale image in matlab

I am rather new to MATLAB, but I was hoping someone could help with this question. I have a color image that I want to convert to grayscale and then reduce the number of gray levels. I read in the image and used rgb2gray() to convert it to grayscale. However, I am not sure how to convert the image to use only 32 gray levels instead of 256 gray levels.
I was trying to use colormap(gray(32)), but this seemed to have no effect on the plotted image itself or the colorbar under the image. So I was not sure where else to look. Any tips out there? Thanks.
While result = (img/8)*8 does convert a grayscale image in the range [0, 255] to a subset of that range but now using only 32 values, it might create undesirable artifacts. A method that possibly produces visually better images is called Improved Grayscale Quantization (abbreviated as IGS). The pseudo-code for performing it can be given as:
mult = 256 / (2^bits)
mask = 2^(8 - bits) - 1
prev_sum = 0
for x = 1 to width
    for y = 1 to height
        value = img[x, y]
        if value >> bits != mask:
            prev_sum = value + (prev_sum & mask)
        else:
            prev_sum = value
        res[x, y] = (prev_sum >> (8 - bits)) * mult
As an example, consider the following figure and the respective quantizations with bits = 5, bits = 4, and bits = 3 using the method above:
Now the same images but quantized by doing (img/(256/(2^bits)))*(256/(2^bits)):
This is not a pathological example.
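A rough Python sketch of the IGS idea, next to plain truncation, is given below. It is an illustration under assumptions rather than a line-by-line port of the pseudo-code: it scans in row-major order, and it guards against overflow by checking the carried sum directly instead of comparing the high bits against a mask. The function name igs_quantize and the gradient test image are made up for the example.

```python
import numpy as np

def igs_quantize(img, bits):
    """IGS-style quantization of a uint8 grayscale image to 2**bits gray levels."""
    shift = 8 - bits
    low_mask = (1 << shift) - 1        # selects the bits that truncation discards
    out = np.empty_like(img)
    prev_sum = 0
    for idx, value in np.ndenumerate(img):   # row-major scan
        value = int(value)
        carried = value + (prev_sum & low_mask)
        # Carry the previously discarded low bits forward unless that would overflow 8 bits.
        prev_sum = carried if carried <= 255 else value
        out[idx] = (prev_sum >> shift) << shift
    return out

# Hypothetical test image: a horizontal gradient covering all 256 levels.
img = np.tile(np.arange(256, dtype=np.uint8), (4, 1))

q = igs_quantize(img, 5)               # 32 levels, error diffused along the scan
truncated = (img >> 3) << 3            # plain truncation to 32 levels
print(np.unique(q).size, np.unique(truncated).size)
```

Both outputs use only multiples of 8, but the IGS variant spreads the truncation error over neighbouring pixels, which tends to break up visible banding in smooth gradients.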
You can reduce the number of different values in an image by simple rounding:
I = rgb2gray(imread('image.gif'));
J = 8*round(I/8)
See imhist(I) and imhist(J) for the effect.
However, if you want to reduce image size, you might be better off using an image processing program like Photoshop, Gimp or IrfanView and save as a 32 color gif. In that way you'll actually reduce the file's palette, and I think that's something Matlab can't do.
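Check whether your image data is of type uint8, which I would suspect. If so, divide the image by 8, multiply by 8 again, and you're set: I2 = (I/8)*8. Note that MATLAB rounds integer division to the nearest integer rather than flooring it, so this yields multiples of 8 with the top value saturating at 255; for true flooring into exactly 32 gray levels, use I2 = idivide(I, uint8(8))*8.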

Noise Estimation / Noise Measurement in Image

I want to estimate the noise in an image.
Let's assume the model of an Image + White Noise.
Now I want to estimate the Noise Variance.
My method is to calculate the Local Variance (3*3 up to 21*21 Blocks) of the image and then find areas where the Local Variance is fairly constant (By calculating the Local Variance of the Local Variance Matrix).
I assume those areas are "Flat" hence the Variance is almost "Pure" noise.
Yet I don't get constant results.
Is there a better way?
I can't assume anything about the Image but the independent noise (Which isn't true for real image yet let's assume it).
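You can use the following method to estimate the noise standard deviation (this implementation works for grayscale images only):
import math
import numpy as np
from scipy.signal import convolve2d

def estimate_noise(I):
    H, W = I.shape
    # 3x3 mask used by Immerkær's method
    M = [[1, -2, 1],
         [-2, 4, -2],
         [1, -2, 1]]
    # mode='valid' keeps only the (H-2) x (W-2) interior that the normalization below assumes
    sigma = np.sum(np.absolute(convolve2d(I, M, mode='valid')))
    sigma = sigma * math.sqrt(0.5 * math.pi) / (6 * (W-2) * (H-2))
    return sigma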
Reference: J. Immerkær, “Fast Noise Variance Estimation”, Computer Vision and Image Understanding, Vol. 64, No. 2, pp. 300-302, Sep. 1996
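The constants in that formula can be checked on synthetic data. The 3x3 mask has a sum of squared coefficients of 36, so for white Gaussian noise of standard deviation sigma the filtered values are distributed as N(0, 36 sigma^2), whose mean absolute value is 6 sigma sqrt(2/pi); multiplying the mean absolute response by sqrt(pi/2)/6 therefore recovers sigma. Below is a NumPy-only sketch of that check; the function name estimate_noise_sigma and the flat test image are assumptions made for the example.

```python
import numpy as np

def estimate_noise_sigma(image):
    """Immerkaer-style estimate of the noise standard deviation, NumPy only."""
    image = np.asarray(image, dtype=float)
    # The mask [[1,-2,1],[-2,4,-2],[1,-2,1]] is the outer product of [1,-2,1] with
    # itself, so apply a second difference along each axis (interior pixels only).
    d = image[:-2, :] - 2.0 * image[1:-1, :] + image[2:, :]
    d = d[:, :-2] - 2.0 * d[:, 1:-1] + d[:, 2:]
    # Mean absolute response over the (H-2) x (W-2) interior, rescaled to sigma.
    return np.sqrt(np.pi / 2.0) * np.abs(d).mean() / 6.0

rng = np.random.default_rng(0)
true_sigma = 5.0
noisy = 100.0 + rng.normal(0.0, true_sigma, size=(256, 256))  # flat image + white noise

estimate = estimate_noise_sigma(noisy)
print(true_sigma, estimate)
```

On real images the estimate is biased upwards wherever the image itself has fine texture, since the mask only cancels locally smooth structure.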
Separating signal from noise is not an easy problem. From your question, a first try would be to characterize second-order statistics: natural images are known to have pixel-to-pixel correlations that are, by definition, not present in white noise.
In Fourier space, the correlation corresponds to the energy spectrum. It is known that for natural images it decreases as 1/f^2. To quantify noise, I would therefore recommend computing the correlation coefficient of the spectrum of your image with both hypotheses (flat and 1/f^2), so that you extract the coefficient.
Some functions to get you started:
import numpy
import scipy.datasets
from scipy.fftpack import fft2

def get_grids(N_X, N_Y):
    from numpy import mgrid
    return mgrid[-1:1:1j*N_X, -1:1:1j*N_Y]

def frequency_radius(fx, fy):
    R2 = fx**2 + fy**2
    (N_X, N_Y) = fx.shape
    R2[N_X//2, N_Y//2] = numpy.inf  # integer index (N_X/2 is a float in Python 3)
    return numpy.sqrt(R2)

def enveloppe_color(fx, fy, alpha=1.0):
    # 0.0, 0.5, 1.0, 2.0 are resp. white, pink, red, brown noise
    # enveloppe
    return 1. / frequency_radius(fx, fy)**alpha

# scipy.lena() has been removed from SciPy; face() stands in for it here
# (SciPy >= 1.10, needs the pooch package to download the image)
image = scipy.datasets.face(gray=True).astype(float)
N_X, N_Y = image.shape
fx, fy = get_grids(N_X, N_Y)
pink_spectrum = enveloppe_color(fx, fy)
power_spectrum = numpy.abs(fft2(image))**2
I recommend this wonderful paper for more details.
scikit-image has an estimate_sigma function that works pretty well.
It also works with color images; you just need to set channel_axis=-1 (multichannel=True in older releases) and average_sigmas=True:
import cv2
from skimage.restoration import estimate_sigma

def estimate_noise(image_path):
    img = cv2.imread(image_path)
    # channel_axis=-1 marks the last axis as the color channels
    return estimate_sigma(img, channel_axis=-1, average_sigmas=True)
High numbers mean high noise: the function returns the estimated standard deviation of the noise.
