Determine if an image needs contrasting automatically in OpenCV - image-processing

OpenCV has a handy cvEqualizeHist() function that works great on faded/low-contrast images.
However, when an already high-contrast image is given, the result is a low-contrast one. I understand the reason: equalization spreads the histogram out evenly.
Question is - how do I get to know the difference between a low-contrast and a high-contrast image?
I'm operating on grayscale images and setting their contrast properly so that thresholding them won't delete the text I'm supposed to extract (that's a different story).
Suggestions welcome, especially on how to find out whether the majority of the pixels in the image are light gray (which means that histogram equalization should be performed).
Please help!
EDIT: thanks everyone for many informative answers. But the standard deviation calculation was sufficient for my requirements and hence I'm taking that to be the answer to my query.

You can probably just use a simple statistical measure of the image to determine whether an image has sufficient contrast. The variance of the image would probably be a good starting point. If the variance is below a certain threshold (to be empirically determined) then you can consider it to be "low contrast".
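A minimal sketch of that check, assuming an 8-bit grayscale image held in a NumPy array. The function name and the default threshold of 30 are placeholders chosen for illustration, not OpenCV values; the threshold has to be tuned on your own images.

```python
import numpy as np

def is_low_contrast_by_std(gray, std_threshold=30.0):
    """Return True when the standard deviation of the pixel intensities
    falls below std_threshold (an arbitrary placeholder value)."""
    return float(np.std(gray.astype(np.float64))) < std_threshold

# Synthetic test images: a faded one confined to a narrow band of light
# grays, and a page-like one that is half black and half white.
faded = np.tile(np.linspace(180, 200, 64), (64, 1)).astype(np.uint8)
crisp = np.zeros((64, 64), dtype=np.uint8)
crisp[:, 32:] = 255

print(is_low_contrast_by_std(faded))  # True
print(is_low_contrast_by_std(crisp))  # False
```

Standard deviation is used here instead of variance only because it is in the same units as the pixel values, which makes the threshold easier to reason about.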

If you're adjusting contrast just so you can threshold later on, you may be able to avoid the contrast adjustment step if you set your threshold adaptively using Otsu's method.
If you're still interested in finding out the image contrast, then read on.
There are a number of different ways to calculate "contrast". Often, those metrics are applied locally rather than to the entire image, to make the result more sensitive to image content:
Divide the image into adjacent non-overlapping neighborhoods.
Pick neighborhood sizes that approximate the size of the features in your image (e.g. if your main feature is horizontal text, make neighborhoods tall enough to capture 2 lines of text, and just as wide).
Apply the metric to each neighborhood individually.
Threshold the metric result to separate low and high variance blocks. This will prevent such things as large, blank areas of page skewing your contrast estimates.
From there, you can use a number of features to determine contrast:
The proportion of high metric blocks to low metric blocks
High metric block mean
Intensity distance between the high and low metric blocks (using means, modes, etc)
This may serve as a better indication of image contrast than global image variance alone. Here's why:
(stddev: 50.6)
(stddev: 7.9)
Both images have perfect contrast (the grey background is just there to make it obvious that each is an image), but their standard deviations (and thus variances) are completely different.
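The block-wise procedure described above could be sketched as follows. This assumes standard deviation as the per-block metric; the function name, the 16-pixel block size and the threshold of 10 are illustrative choices, not values from the answer. The first feature is computed as the fraction of high-metric blocks among all blocks.

```python
import numpy as np

def block_contrast_features(gray, block=16, std_threshold=10.0):
    """Split gray into non-overlapping block x block tiles, compute each
    tile's standard deviation, and summarise the high and low tiles."""
    h, w = gray.shape
    h, w = h - h % block, w - w % block  # drop ragged edges
    tiles = gray[:h, :w].astype(np.float64)
    tiles = tiles.reshape(h // block, block, w // block, block)
    tile_std = tiles.std(axis=(1, 3))
    tile_mean = tiles.mean(axis=(1, 3))
    high = tile_std >= std_threshold
    low = ~high
    features = {"high_fraction": float(high.mean())}
    # Mean of the metric over the high-metric blocks only.
    features["high_metric_mean"] = float(tile_std[high].mean()) if high.any() else 0.0
    # Distance between the mean intensities of high and low blocks.
    if high.any() and low.any():
        gap = abs(tile_mean[high].mean() - tile_mean[low].mean())
        features["intensity_gap"] = float(gap)
    else:
        features["intensity_gap"] = 0.0
    return features

# A mostly blank white page with a small patch of fine black stripes.
page = np.full((128, 128), 255, dtype=np.uint8)
page[:32, 0:32:2] = 0
features = block_contrast_features(page)
print(features)
```

On this synthetic page only 4 of the 64 blocks contain any detail, yet those blocks have the maximum possible spread, which is the situation a single global statistic tends to hide.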

Calculate cumulative histogram of image.
Make linear regression of cumulative histogram in the form y(x) = A*x + B.
Calculate RMSE of real_cumulative_frequency(x)-y(x).
If that RMSE is close to zero - image is already equalized. (That means that for equalized images cumulative histograms must be linear)
Idea is taken from here.
EDIT:
I've illustrated this approach in my blog (C example code included).
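The steps above can be sketched in NumPy as follows (this is not the C code from the blog post). It assumes an 8-bit grayscale image and a cumulative histogram normalised to the range 0..1, so what counts as "close to zero" is relative to that scale; the function name is made up for this example.

```python
import numpy as np

def equalization_rmse(gray):
    """RMSE between the normalised cumulative histogram of an 8-bit image
    and its least-squares line y(x) = A*x + B."""
    hist = np.bincount(gray.ravel(), minlength=256).astype(np.float64)
    cdf = np.cumsum(hist) / hist.sum()
    x = np.arange(256, dtype=np.float64)
    a, b = np.polyfit(x, cdf, 1)  # linear regression of the cumulative histogram
    return float(np.sqrt(np.mean((cdf - (a * x + b)) ** 2)))

rng = np.random.default_rng(0)
# Roughly flat histogram: behaves like an already equalized image.
uniform = rng.integers(0, 256, size=(128, 128), dtype=np.uint8)
# All pixels squeezed into a narrow gray band: a faded image.
narrow = rng.integers(100, 120, size=(128, 128), dtype=np.uint8)

print(equalization_rmse(uniform))
print(equalization_rmse(narrow))
```

The faded image yields a much larger RMSE than the flat-histogram one; the cut-off between the two still has to be chosen empirically.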

scikit-image provides support for this: skimage.exposure.is_low_contrast.
Example:
>>> import numpy as np
>>> from skimage.exposure import is_low_contrast
>>> image = np.linspace(0, 0.04, 100)
>>> is_low_contrast(image)
True
>>> image[-1] = 1
>>> is_low_contrast(image)
True
>>> is_low_contrast(image, upper_percentile=100)
False

Related

What's the theory behind computing variance of an image?

I am trying to compute the blurriness of an image by using LaplacianFilter.
According to this article: https://www.pyimagesearch.com/2015/09/07/blur-detection-with-opencv/ I have to compute the variance of the output image. The problem is that I don't understand conceptually how to compute the variance of an image.
Every pixel has 4 values, one per color channel, so I can compute the variance of every channel, but then I get 4 values, or even 16 if I compute the variance-covariance matrix. According to the OpenCV example, however, they have only 1 number.
After computing that number, they just play with the threshold in order to make a binary decision, whether the image is blurry or not.
PS: I am by no means an expert on this topic, so my statements may make no sense. If so, please be so kind as to edit the question.
One-sentence description:
A blurred image's edges are smoothed, so the variance of its Laplacian is small.
1. How variance is calculated.
The core function of the post is:
import cv2

def variance_of_laplacian(image):
    # compute the Laplacian of the image and then return the focus
    # measure, which is simply the variance of the Laplacian
    return cv2.Laplacian(image, cv2.CV_64F).var()
As OpenCV-Python uses numpy.ndarray to represent the image, let's have a look at numpy.var:
Help on function var in module numpy.core.fromnumeric:
var(a, axis=None, dtype=None, out=None, ddof=0, keepdims=<class 'numpy._globals$
Compute the variance along the specified axis.
Returns the variance of the array elements, a measure of the spread of a distribution.
The variance is computed for the flattened array by default, otherwise over the specified axis.
2. Applying it to an image
That is to say, the variance is calculated on the flattened Laplacian image, i.e. on a flat 1-D array.
To calculate variance of array x, it is:
var = mean(abs(x - x.mean())**2)
For example:
>>> x = np.array([[1, 2], [3, 4]])
>>> x.var()
1.25
>>> np.mean(np.abs(x - x.mean())**2)
1.25
The Laplacian image is an edge image. Make images using GaussianBlur with different radii r, apply the Laplacian filter to them, and calculate the variances:
A blurred image's edges are smoothed, so the variance is small.
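A minimal NumPy-only sketch of that experiment, run on a synthetic checkerboard instead of a photo. The helper functions are stand-ins written for this example (a separable Gaussian blur and a 4-neighbour Laplacian), not the OpenCV implementations, and the sigma values are arbitrary.

```python
import numpy as np

def gaussian_blur(img, sigma):
    """Separable Gaussian blur with edge padding (stand-in for cv2.GaussianBlur)."""
    radius = int(3 * sigma + 0.5)
    x = np.arange(-radius, radius + 1, dtype=np.float64)
    k = np.exp(-(x ** 2) / (2.0 * sigma ** 2))
    k /= k.sum()
    padded = np.pad(img, radius, mode="edge")
    rows = np.apply_along_axis(lambda r: np.convolve(r, k, mode="valid"), 1, padded)
    return np.apply_along_axis(lambda c: np.convolve(c, k, mode="valid"), 0, rows)

def variance_of_laplacian(img):
    """Variance of a 4-neighbour Laplacian over the flattened interior pixels."""
    img = img.astype(np.float64)
    lap = (img[:-2, 1:-1] + img[2:, 1:-1] + img[1:-1, :-2] + img[1:-1, 2:]
           - 4.0 * img[1:-1, 1:-1])
    return float(lap.var())

# Checkerboard with 8-pixel squares: plenty of sharp edges.
ii, jj = np.indices((64, 64))
board = (((ii // 8) + (jj // 8)) % 2) * 255.0

scores = [variance_of_laplacian(board)]
for sigma in (1.0, 2.0, 4.0):
    scores.append(variance_of_laplacian(gaussian_blur(board, sigma)))
print(scores)
```

The scores shrink as sigma grows, which is the behaviour the blur detector relies on when it compares the variance against a threshold.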
First things first: if you look at the tutorial you linked, it converts the image to greyscale, so it has only 1 channel and 1 variance. You could compute the variance for each channel and combine them with a more complicated formula, or just use the variance over all the numbers. However, I think the author converts to greyscale because it is a nice way of fusing the information, and one of the papers the author cites actually says that
A well focused image is expected to have a high variation in grey levels.
The author of the tutorial actually explains it in a simple way. First, think about what the Laplacian filter does: it highlights the well-defined edges. Here is an example using the grid of pictures he had.
As you can see, the blurry images barely have any edges, while the focused ones have a lot of responses. Now, what happens if you calculate the variance? Imagine the case where white is 255 and black is 0. If everything is black, the variance is low (the case of the blurry images), but if the image is roughly half black and half white, the variance is high.
However, as the author already said, this threshold depends on the domain. If you take a picture of the sky, even if it is in focus it may have low variance, since the sky is fairly uniform and has no well-defined edges.
I hope this answers your doubts :)

Effect of Downsample but not fully divide

Hey, I need a 320x240 8-bit grayscale image for a computer vision algorithm (ORB feature tracking). The Raspicam driver I'm using can provide different image sizes, but it achieves them by cropping rather than downsampling. As my environment is not ideally lit, the image is quite dark and noisy. Now I had the idea to take a 640x480 image and downsample it to 320x240 by always combining 2x2 pixels into one. Normally I would of course divide the sum by 4 to get the correct result. But what would be the effect of dividing it by two or even one (assuming 99% of the intensity values are not bigger than 64 (256/4))? Wouldn't that simulate the effect of larger CCD cells, which could gather more light in less time?
The first tests I did showed some pretty good results, meaning I detected more features and could follow them better between two frames.
Here, you are not taking the proper average of the 2x2 blocks (dividing by 4). Say you have two blocks that differ in summed intensity by Delta-I. If you divide the intensity of the two blocks by a larger number, the intensity difference shrinks, and vice versa for a smaller number.
When you divide the difference (Delta-I) by 2 (instead of 4), you are in effect increasing the contrast (the intensity difference between background and foreground). As you mentioned that your image is poorly illuminated, dividing by a smaller number increases the contrast, which improves tracking. This approach is a contrast enhancement technique, a variation of linear contrast enhancement.
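To make the arithmetic concrete, here is a small NumPy sketch of 2x2 binning with a configurable divisor. The function name and the tiny test image are made up for illustration; results are clipped to 0..255, so pixels brighter than the questioner's assumed limit would saturate when the divisor is small.

```python
import numpy as np

def bin_2x2(gray, divisor):
    """Sum each 2x2 block of an 8-bit image, divide by divisor, clip to 0..255."""
    h, w = gray.shape
    h, w = h - h % 2, w - w % 2  # ignore a trailing odd row or column
    blocks = gray[:h, :w].astype(np.uint16).reshape(h // 2, 2, w // 2, 2)
    summed = blocks.sum(axis=(1, 3))
    return np.clip(summed // divisor, 0, 255).astype(np.uint8)

# Dark image: background at intensity 10, foreground at intensity 30.
dark = np.full((4, 4), 10, dtype=np.uint8)
dark[:, 2:] = 30

for divisor in (4, 2, 1):
    out = bin_2x2(dark, divisor)
    print(divisor, int(out[0, 0]), int(out[0, 1]), int(out[0, 1]) - int(out[0, 0]))
# 4 10 30 20
# 2 20 60 40
# 1 40 120 80
```

Dividing by 4 reproduces the original intensities, while dividing by 2 or 1 scales both the values and their difference by 2 or 4, which is the linear contrast stretch described above.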

What is the difference between Binning and sub-sampling in Image Signal Processing?

As far as I know, there are several functions in a CMOS image sensor's ISP (Image Signal Processor).
Specifically, I'd like to know the difference between binning and sub-sampling. I think their purpose is the same: to reduce the image size.
However, I'm not sure why both functions exist.
What is their purpose?
Binning and sub-sampling both reduce the image size, as you suspected, but they focus on different things. Let's tackle each separately.
Binning
Binning in image processing deals primarily with quantization. The closest thing I can think of is related to what is known as data binning. Basically, consider breaking up your image into distinct (non-overlapping) M x N tiles, where M and N are the rows and columns of a tile and M and N should be much smaller than the rows and columns of the image.
If you consider any grid of M x N pixels, all of these pixels get replaced with a representative colour. This representative colour can be calculated in many ways; the average is a popular choice. Binning is performed primarily as a data pre-processing technique to reduce the effects of minor observation errors. It effectively reduces the amount of information that represents the image, and so it reduces the image size by reducing the number of unique colours in the image.
In addition, binning the data may also reduce the effect of CMOS sensor noise on the final processed image, but at the cost of a lower dynamic range of colours.
Sub-sampling
Sub-sampling in the case of image processing mostly deals with image resizing. It's also called image scaling. The goal is to take an image and reduce its dimensions so that you get a smaller image as a result. Binning keeps the image the same size (i.e. the same dimensions as the original) while reducing the number of colours, which ultimately reduces the amount of space the image takes up. Subsampling reduces the image size by removing information altogether. Usually when you subsample, you also interpolate or smooth the image so that you reduce aliasing.
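The distinction drawn above can be sketched in NumPy as follows. Binning is implemented the way this answer describes it (every M x N tile is replaced by its mean and the dimensions stay the same), and sub-sampling simply keeps every step-th pixel without the smoothing that would normally precede it. Both function names are made up for this example.

```python
import numpy as np

def bin_tiles(gray, m, n):
    """Replace every m x n tile by its mean; the output keeps the input's shape."""
    h, w = gray.shape
    if h % m or w % n:
        raise ValueError("image dimensions must be multiples of the tile size")
    tiles = gray.astype(np.float64).reshape(h // m, m, w // n, n)
    means = tiles.mean(axis=(1, 3), keepdims=True)
    return np.broadcast_to(means, tiles.shape).reshape(h, w)

def subsample(gray, step):
    """Keep every step-th pixel in both directions (no anti-alias filtering)."""
    return gray[::step, ::step]

img = np.arange(16).reshape(4, 4)
binned = bin_tiles(img, 2, 2)
smaller = subsample(img, 2)

print(binned.shape, len(np.unique(binned)))  # (4, 4) 4
print(smaller.shape)                         # (2, 2)
```

The binned image still has 4 x 4 pixels but only 4 distinct values, whereas the sub-sampled image has fewer pixels and has discarded three quarters of the samples outright.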
Sub-sampling has another application in video processing, especially in MPEG, where video is encoded in YCbCr. Y is the luminance while Cb and Cr are the chrominance pair. We tend to notice changes in luminance rather than chrominance, and so the chrominance is subsampled to reduce the amount of space taken up by the video. Specifically, the human visual system has poorer acuity for colour information than for luminance / intensity. Usually, the chrominance values are filtered and then subsampled to 1/2 or even 1/4 of the resolution of the intensity. Even with a rather high subsampling rate, we don't notice any differences in terms of perceived image quality.
This is obviously a rather rough introduction on the differences between them both, but I hope this gives you enough of what you're after for your purposes.
Good luck!

Effect of variance (sigma) at gaussian smoothing

I know about the Gaussian, variance and image blurring, and I think I understand the concept of variance in Gaussian blur, but I am still not 100% sure.
I just want to know the role of sigma (or variance) in Gaussian smoothing. I mean, what happens when the value of sigma is increased for the same window size, and why does it happen?
It would be really helpful if somebody could point to some good literature about it. (I already tried a few sources but couldn't find what I am looking for.)
Major confusion:
Higher frequency-> details (e.g. noise),
Lower Frequency-> kind of overview of the image.
By increasing sigma, we are allowing some higher frequencies, so we should get more detail as sigma increases. But the opposite is the case: when we increase sigma, the image becomes more blurry.
I think it should be done in the following steps, first from the signal processing point of view:
The Gaussian filter is a low-pass filter. Low-pass filters, as the name implies, pass low frequencies and attenuate high ones. When we look at an image in the frequency domain, the highest frequencies occur at the edges (places where the intensity changes sharply).
The role of sigma in the Gaussian filter is to control the spread around its mean value. The larger sigma is, the more spread is allowed around the mean; the smaller sigma is, the less spread is allowed.
Filtering in the spatial domain is done through convolution: we apply a kernel at every pixel in the image. There is a rule for smoothing kernels such as the Gaussian: their weights must sum to one, so that overall brightness is preserved (a zero sum applies to derivative kernels such as the Laplacian).
Now, putting it all together. When we apply a Gaussian filter to an image, we are doing low-pass filtering. But as you know, this happens in the discrete domain (image pixels), so we have to sample the Gaussian function in order to make a Gaussian kernel. When the Gaussian filter has a small sigma, it has the steepest peak, so more of the weight is concentrated at the center and less around it. With a larger sigma, the weight spreads out over the neighbouring pixels, which is why the image gets blurrier.
From the point of view of natural image statistics: researchers in this field have shown that our visual system responds to images somewhat like a Gaussian filter. For example, look at a broad scene without paying attention to any specific point. You see a broad scene with lots of things in it, but the details are not clear. Now look at a specific point in that scene: you see details that you didn't see before. This is where sigma comes in. When you increase sigma, you are looking at the broad scene without paying attention to the details that exist, and when you decrease it, you get more details.
I think Wikipedia can help more than I can: Low Pass Filters, Gaussian Blur
Put simply, increasing sigma casts a broader net over the neighboring pixels and decreases the relative weight of the pixels nearest the pixel of interest, i.e. it makes a blurrier image.
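A short sketch of what happens to the kernel weights when sigma grows while the window size stays fixed. The 7-tap window and the sigma values are arbitrary choices for illustration, and the helper is written for this example rather than taken from a library.

```python
import numpy as np

def gaussian_kernel(size, sigma):
    """Normalised 1-D Gaussian kernel with `size` taps (size should be odd)."""
    x = np.arange(size, dtype=np.float64) - (size - 1) / 2.0
    k = np.exp(-(x ** 2) / (2.0 * sigma ** 2))
    return k / k.sum()  # weights sum to 1 so brightness is preserved

for sigma in (0.5, 1.0, 2.0, 5.0):
    k = gaussian_kernel(7, sigma)
    # Centre weight versus the weight at the edge of the window.
    print(sigma, round(float(k[3]), 3), round(float(k[0]), 3))
```

As sigma increases, the centre weight falls and the outer weights rise, so each output pixel becomes an average over more of its neighbours. For very large sigma the kernel approaches a plain box average over the window.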

Image blending modes for HDR images

The blending modes Screen, Color Dodge, Soft Light, etc., like in Photoshop, each have their own math that works for the range 0-1. I wonder how these blend modes work for HDR images?
Thanks
I am not familiar with Photoshop and its filters, but here is a general explanation of the math behind HDR filters.
Suppose you have 3 images (under-exposed, medium and over-exposed). You want to average those images, but (I1+I2+I3)/3 is a naive way to do it. You want to give a higher weight to the image that captures more information in a given area.
So basically you average the images with a weight factor, and there are different types of algorithms to calculate the weights. Here are a few:
The simplest one uses the standard deviation (STD). For each pixel in each image, calculate the standard deviation of its 9 neighbours and use it as the weight:
HDR pixel(i,j) = I1(i,j)*stdI1(i,j) + I2(i,j)*stdI2(i,j) + I3(i,j)*stdI3(i,j).
Why is std used? When std is high, it means a high variation in pixel intensity, which means more information was captured by the image.
Instead of STD you can use an entropy filter, edge detection, or any other measure that represents how much information is encoded around the given pixel.
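A minimal NumPy sketch of the std-weighted average described above, assuming grayscale exposures of equal size. Two details are assumptions added for this example rather than part of the formula above: the weights are normalised so they sum to 1 at each pixel, and a small eps avoids division by zero in perfectly flat regions. The function names are made up.

```python
import numpy as np

def local_std(img, radius=1):
    """Standard deviation over each pixel's 3x3 neighbourhood (edge-padded)."""
    padded = np.pad(img.astype(np.float64), radius, mode="edge")
    size = 2 * radius + 1
    h, w = img.shape
    windows = np.stack([padded[dy:dy + h, dx:dx + w]
                        for dy in range(size) for dx in range(size)])
    return windows.std(axis=0)

def fuse_by_local_std(images, eps=1e-6):
    """Per-pixel weighted average of the exposures, weighted by local std."""
    stack = np.stack([im.astype(np.float64) for im in images])
    weights = np.stack([local_std(im) for im in images]) + eps
    weights /= weights.sum(axis=0, keepdims=True)  # assumption: normalise weights
    return (stack * weights).sum(axis=0)

rng = np.random.default_rng(0)
texture = rng.integers(50, 200, size=(16, 16)).astype(np.float64)
under = texture.copy()
under[:, 8:] = 0      # right half lost to darkness
over = texture.copy()
over[:, :8] = 255     # left half blown out

fused = fuse_by_local_std([under, over])
print(np.allclose(fused[:, :7], texture[:, :7], atol=1e-2))  # True
print(np.allclose(fused[:, 9:], texture[:, 9:], atol=1e-2))  # True
```

Away from the seam between the two halves, the fused image takes its values almost entirely from whichever exposure still contains detail, because the flat regions get a weight close to zero.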
There are also slower but better ways to do HDR. Usually it is done with some kind of frequency transform, for example a wavelet or Fourier transform. Each image is converted to Fourier space (the coefficients of the frequencies), and then for each frequency the maximal coefficient of the 3 images is taken.
You can even combine the std filter with such transforms. For example, break the image into different frequency bands, smooth the lower frequencies and take the plain average (I1+I2+I3)/3, but for the high frequencies use less smoothing and the std-weighted average. Smoothing the lower frequencies more heavily is called 'blending'. It is heavily used when stitching 2 images of different exposure into a panorama.
Look at this image: http://magazine.magix.com/en/wp-content/uploads/2012/05/Panorama-3.jpg
You can clearly see that the sky gets a different color in each image, but since the sky is very low frequency (almost no information and no small objects) it is heavily smoothed and averaged, thus allowing a gentle stitching.
Hope that answers your question.
