Suppose that a given 3-bit image(L=8) of size 64*64 pixels (M*N=4096) has the intensity distribution shown as below. How to obtain histogram equalization transformation function
and then compute the equalized histogram of the image?
Rk nk
0 800
1 520
2 970
3 660
4 330
5 450
6 260
7 106
"Histogram Equalization is the process of obtaining transformation function automatically. So you need not have to worry about shape and nature of transformation function"
So in Histogram equalization, transformation function is calculated using cumulative frequency approach and this process is automatic. From the histogram of the image, we determine the cumulative histogram, c, rescaling the values as we go so that they occupy an 8-bit range. In this way, c becomes a look-up table that can be subsequently applied to the image in order to carry out equalization.
rk nk c sk = c/MN (L-1)sk rounded value
0 800 800 0.195 1.365 1
1 520 1320 0.322 2.254 2
2 970 2290 0.559 3.913 4
3 660 2950 0.720 5.04 5
4 330 3280 0.801 5.601 6
5 450 3730 0.911 6.377 6
6 260 3990 0.974 6.818 7
7 106 4096 1.000 7.0 7
Now the equalized histogram is therefore
rk nk
0 0
1 800
2 520
3 0
4 970
5 660
6 330 + 450 = 780
7 260 + 106 = 366
The algorithm for equalization can be given as
Compute a scaling factor, α= 255 / number of pixels
Calculate histogram of the image
Create a look up table c with
c[0] = α * histogram[0]
for all remaining grey levels, i, do
c[i] = c[i-1] + α * histogram[i]
end for
for all pixel coordinates, x and y, do
g(x, y) = c[f(x, y)]
end for
But there is a problem with histogram equalization and that is mainly because it is a completely automatic technique, with no parameters to set. At times, it can improve our ability to interpret an image dramatically. However, it is difficult to predict how beneficial equalization will be for any given image; in fact, it may not be of any use at all. This is because the improvement in contrast is optimal statistically, rather than perceptually. In images with narrow histograms and relatively few grey levels, a massive increase in contrast due to histogram equalisation can have the adverse effect of reducing perceived image quality. In particular, sampling or quantisation artefacts and image noise may become more prominent.
The alternative to obtaining the transformation (mapping) function automatically is Histogram Specification. In histogram specification instead of requiring a flat histogram, we specify a particular shape explicitly. We might wish to do this in cases where it is desirable for a set of related images to have the same histogram- in order, perhaps, that a particular operation produces the same results for all images.
Histogram specification can be visualised as a two-stage process. First, we transform the input image by equalisation into a temporary image with a flat histogram. Then we transform this equalised, temporary image into an output image possessing the desired histogram. The mapping function for the second stage is easily obtained. Since a rescaled version of the cumulative histogram can be used to transform a histogram with any shape into a flat histogram, it follows that the inverse of the cumulative histogram will perform the inverse transformation from a fiat histogram to one with a specified shape.
For more details about histogram equalization and mapping functions with C and C++ code
https://programming-technique.blogspot.com/2013/01/histogram-equalization-using-c-image.html
Related
I want to implement Ncuts algorithm for an image of size 1248 x 378 x 1 but the adjacency matrix will be (1248 x 378 ) x (1248 x 378 ) which needs about 800 gb of RAM. Even if i most of it is zero, still it needs too much memory. I do need this matrix though to compute the normalized cut. Is there any way that i can find the eigenvalues without actually calculate the whole matrix?
If most of the matrix is zero,, then don't use a dense format.
Instead use a sparse matrix.
I am learning about HOG and I understand it from here. A well-explained page with an example. I am not understanding this concept that how it works
A 16×16 block has 4 histograms which can be concatenated to form a 36
x 1 element vector and it can be normalized just the way a 3×1 vector
is normalized.
How this 36*1 came and how we calculated it? and is it compulsory that we always need 9 bin vector? Is it a fixed size for HOG?
came?
Is it compulsory that we always need 9 bin vector?
Not necessarily. Dalal and Triggs stated in their original HOG paper that accuracy for their application (which was human pedestrian detection) increased when using up to 9 bins, after that the accuracy did not increase any further, that's why 9 are commonly used.
How this 36*1 came and how we calculated it?
As already pointed out in the comments:
You have 9 bins per histogram (which will each be a scalar value in your feature vector). In your example, a histogram was calculated using 8 x 8 blocks, meaning in a 16 x 16 block you will be able to calculate 4 histograms. Each of those histograms will yield a 9 x 1 feature vector so:
4 (histograms) * 9 (bins) = 36 x 1 feature vector.
You basically just concatenate your results into one vector.
I have an image, a 2D array of uint8_ts. I want to resize the image using a separable filter. Consider shrinking the width first. Because original & target sizes are unrelated, we'll use a different set of coefficients for every destination pixel. For a particular in & out size, for all y, we might have:
out(500, y) = in(673, y) * 12 + in(674, y) * 63 + in(675, y) * 25
out(501, y) = in(674, y) * 27 + in(675, y) * 58 + in(676, y) * 15
How can I use Eigen to speed this up, e.g. vectorize it for me?
This can be expressed as a matrix multiply with a sparse matrix, of dimension in_width * out_width, where in each row, only 3 out of the in_width values are non-zero. In the actual use case, 4 to 8 will typically be non-zero. But those non-zero values will be contiguous, and it would be nice to use SSE or whatever to make it fast.
Note that the source matrix has 8 bit scalars. The final result, after scaling both width & height will be 8 bits as well. It might be nice for the intermediate matrix (image) and filter to be higher precision, say 16 bits. But even if they're 8 bits, when multiplying, we'll need to take the most significant bits of the product, not the least significant.
Is this too far out of with Eigen can do? This sort of convolution, with a kernel that's different at every pixel (only because the output size isn't an integral multiple of the input size), seems common enough.
The dimension of the image is 64 x 128. That is 8192 magnitude and gradient values. After the binning stage, we are left with 1152 values as we converted 64 pixels into 9 bins based on their orientation. Can you please explain to me how after L2 normalization we get 3780 vectors?
Assumption: You have the gradients of the 64 x 128 patch.
Calculate Histogram of Gradients in 8x8 cells
This is where it starts to get interesting. The image is divided into 8x8 cells and a HOG is calculated for each 8x8 cells. One of the reasons why we use 8x8 cells is that it provides a compact representation. An 8x8 image patch contains 8x8x3 = 192 pixel values (color image). The gradient of this patch contains 2 values (magnitude and direction) per pixel which adds up to 8x8x2 = 128 values. These 128 numbers are represented using a 9-bin histogram which can be stored as an array of 9 numbers. This makes it more compact and calculating histograms over a patch makes this representation more robust to noise.
The histogram is essentially a vector of 9 bins corresponding to angles 0, 20, 40, 60 ... 180 corresponding to unsigned gradients.
16 x 16 Block Normalization
After creating the histogram based on the gradient of the image, we want our descriptor to be independent of lighting variations. Hence, we normalize the histogram. The vector norm for a RGB color [128, 64, 32] is sqrt(128*128 + 64*64 + 32*32) = 146.64, which is the infamous L2-norm. Dividing each element of this vector by 146.64 gives us a normalized vector [0.87, 0.43, 0.22]. If we were to multiply each element of this vector by 2, the normalized vector will remain the same as before.
Although simply normalizing the 9x1 histogram is intriguing, normalizing a bigger sized block of 16 x 16 is better. A 16 x 16 block has 4 histograms, which can be concatenated to form a 36 x 1 element vector and it can be normalized the same way as the 3 x 1 vector in the example. The window is then moved by 8 pixels and a normalized 36 x 1 vector is calculated over this window and the process is repeated (see the animation: Courtesy)
Calculate the HOG feature vector
This is where your question comes in.
To calculate the final feature vector for the entire image patch, the 36 x 1 vectors are concatenated into on giant vector. Let us calculate the size:
How many positions of the 16 x 16 blocks do we have? There are 7 horizontal and 15 vertical positions, which gives - 105 positions.
Each 16 x 16 block is represented by a 36 x 1 vector. So when we concatenate them all into one giant vector we obtain a 36 x 105 = 3780 dimensional vector.
For more details, look at the tutorial where I learned.
Hope it helps!
I have computed the Fourier transform of an 256-color-value grayscale image, but I'm not sure how to represent the output in a visible format.
This matrix represents the original image:
0 127 127 195
0 255 255 195
While this matrix represents the Fourier transform of the image:
1154 + 0j -382 + 8j -390 + 0j -382 - 8j
-256 + 0j 128 + 128j 0 + 0j 128 - 128j
From what I know, the magnitude can be computed as sqrt((r)^2+(i)^2) where r is the real component and i is the imaginary component. However, this yields values outside of the range that can be represented in 8 bits. How do I correct this?
Typically, one takes the log magnitude of each complex fft result value (ignoring ones with magnitude zero), and then scale the result so that the maximum expected result is 255 (the scale factor will depend on the dimensions and input gain of the 2D image).
Since the dynamic range of the spectrum is quite different from that of the original spatial signal, it is much difficult to use the original 8-bit format. You can use log(1+x) to shrink the range, and then scale into the 8-bit range.