Gamma normalization in HOG (Histogram of Oriented Gradients) - image-processing

What is the meaning of gamma normalization in HOG (Histogram of Oriented Gradients)? Is it the same as gamma correction? I came across many journal papers saying that gamma normalization in HOG is the square root of the image intensity, but that looks different from the formula for gamma correction.

Gamma normalization in HOG is a power-law transformation:
s = c r^γ
where s is the output pixel value, r is the input pixel value, c is a constant, and γ is the exponent. Devices used for image capture, display, and printing apply this power-law transformation to correct image intensity values, a process known as gamma correction. The square root you saw in the literature is the special case γ = 0.5 with c = 1. In short, gamma normalization in HOG is the same operation as gamma correction.
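To make the link between the two formulas concrete, here is a minimal NumPy sketch of the power-law transform. The function name `gamma_normalize` and the sample pixel values are made up for illustration, and it assumes intensities have been rescaled to [0, 1] first.

```python
import numpy as np

def gamma_normalize(image, gamma=0.5, c=1.0):
    """Apply s = c * r**gamma to intensities assumed to lie in [0, 1]."""
    r = np.asarray(image, dtype=np.float64)
    return c * np.power(r, gamma)

# Hypothetical 8-bit pixel values, rescaled to [0, 1] before the transform.
pixels = np.array([0, 64, 128, 255], dtype=np.uint8) / 255.0

# gamma = 0.5 is the square-root variant described for HOG.
sqrt_normalized = gamma_normalize(pixels, gamma=0.5)
```

With `gamma=0.5` the result is identical to `np.sqrt(pixels)`. Other exponents give the usual gamma-correction curves.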

Related

How to calculate correlation of colours in a dataset?

In footnote 8 of this Distill article (https://distill.pub/2017/feature-visualization/), the authors write:
The Fourier transforms decorrelates spatially, but a correlation will still exist
between colors. To address this, we explicitly measure the correlation between colors
in the training set and use a Cholesky decomposition to decorrelate them.
I have trouble understanding how to do that. I understand that for an arbitrary image I can calculate a correlation matrix by interpreting the image's shape as [channels, width*height] instead of [channels, height, width]. But how to take the whole dataset into account? It can be averaged over, but that doesn't have anything to do with Cholesky decomposition.
Inspecting the code confuses me even more (https://github.com/tensorflow/lucid/blob/master/lucid/optvis/param/color.py#L24). There's no code for calculating correlations, but there's a hard-coded version of the matrix (and the decorrelation happens by matrix multiplication with this matrix). The matrix is named color_correlation_svd_sqrt, which has svd inside of it, and SVD wasn't mentioned anywhere else. Also the matrix there is non-triangular, which means that it hasn't come from the Cholesky decomposition.
Clarifications on any points I've mentioned would be greatly appreciated.
I figured out the answer to your question here: How to calculate the 3x3 covariance matrix for RGB values across an image dataset?
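In short, you calculate the RGB covariance matrix for the image dataset and then do the following calculations:
import torch
U, S, V = torch.svd(dataset_rgb_cov_matrix)  # dataset_rgb_cov_matrix: 3x3 RGB covariance
epsilon = 1e-10  # keeps the square root stable for tiny singular values
svd_sqrt = U @ torch.diag(torch.sqrt(S + epsilon))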
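To show how the whole dataset can enter the calculation, here is a rough NumPy sketch. It pools the pixels of all images into one [3, N] array before taking the covariance. The helper names `rgb_covariance` and `svd_sqrt` and the random stand-in dataset are assumptions for illustration, not the code Lucid used to produce its hard-coded matrix.

```python
import numpy as np

def rgb_covariance(images):
    """Covariance of RGB values over all pixels of all images.

    `images` is an iterable of float arrays shaped [3, height, width].
    """
    # Flatten each image to [3, height*width] and pool the pixels of the dataset.
    pixels = np.concatenate([img.reshape(3, -1) for img in images], axis=1)
    return np.cov(pixels)  # rows are the variables (channels), so this is 3x3

def svd_sqrt(cov, epsilon=1e-10):
    """Return U @ diag(sqrt(S)), a matrix square root that is not triangular."""
    U, S, _ = np.linalg.svd(cov)
    return U @ np.diag(np.sqrt(S + epsilon))

rng = np.random.default_rng(0)
dataset = [rng.random((3, 8, 8)) for _ in range(10)]  # stand-in for real images

cov = rgb_covariance(dataset)
M = svd_sqrt(cov)
```

Both this matrix and a Cholesky factor such as `np.linalg.cholesky(cov)` satisfy `M @ M.T ≈ cov`. Only the Cholesky factor is triangular, which may explain why the hard-coded matrix is not.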

Approximating true heightmap gradient magnitude with opencv's Sobel filter

I have an image (cv::Mat, type CV_32F) representing a grid-sampled height function. The grid has a constant raster (dx,dy) per pixel.
I would like to estimate its gradient magnitude. Using OpenCV's Sobel filter, I approximate derivatives like this:
dfdx=cv2.Sobel(zz,cv2.CV_32F,1,0,ksize=3,scale=?)
dfdy=cv2.Sobel(zz,cv2.CV_32F,0,1,ksize=3,scale=?)
gradMag=np.sqrt(dfdx**2+dfdy**2)
The scale parameter is barely documented, but looking into the source, it is used to multiply the derivative kernels, i.e. the (-1,0,1) for finite differences. Using the 3x3 Sobel kernel, I assumed the scale should then be 1/(2*dx) or 1/(2*dy) (finite differences scheme) to obtain derivatives in true scale, but that does not seem to be the case: I tested this on a synthetic image of a hemisphere with different rasters and did not get consistent results.
How is scale supposed to be used to incorporate raster dimensions, thus getting real derivative estimates?
The scale must equal 0.25; see: OpenCV's Sobel filter - why does it look so bad, especially compared to Gimp?
The normalization divisor for a kernel can be calculated with the following formula:
f = max(abs(sumNegative), abs(sumPositive))
where sumNegative is the sum of negative values in the kernel and sumPositive the sum of positive values in the kernel. For the 3x3 Sobel kernel both sums have magnitude 4, hence the scale of 1/4 = 0.25.
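One way to check a candidate scale is to run the kernel over a linear ramp whose slope is known. The sketch below uses a plain NumPy correlation instead of cv2.Sobel so that it stays self-contained. The raster spacing `dx`, the ramp, and the helper `correlate_valid` are assumptions for illustration. On this ramp, the divisor f = 4 alone still leaves a factor of 2*dx, because the (-1,0,1) difference spans two pixels.

```python
import numpy as np

SOBEL_X = np.array([[-1, 0, 1],
                    [-2, 0, 2],
                    [-1, 0, 1]], dtype=np.float64)

def correlate_valid(img, kernel):
    """Plain 3x3 correlation over the interior of `img` (no border handling)."""
    h, w = img.shape
    out = np.zeros((h - 2, w - 2))
    for i in range(3):
        for j in range(3):
            out += kernel[i, j] * img[i:i + h - 2, j:j + w - 2]
    return out

dx = 0.5                           # hypothetical raster spacing in x
slope = 3.0                        # true dz/dx of the synthetic ramp
x = np.arange(16) * dx
zz = np.tile(slope * x, (16, 1))   # height depends only on x

sum_positive = SOBEL_X[SOBEL_X > 0].sum()        # 4
sum_negative = SOBEL_X[SOBEL_X < 0].sum()        # -4
f = max(abs(sum_negative), abs(sum_positive))    # divisor from the formula above

raw = correlate_valid(zz, SOBEL_X)
dfdx = raw / (2 * f * dx)          # extra 2*dx: the difference spans two pixels
```

Under these assumptions `dfdx` comes out equal to `slope` everywhere, which suggests that passing `scale=1/(8*dx)` to cv2.Sobel with ksize=3 (and `1/(8*dy)` for the y derivative) would give derivatives in physical units. That is worth verifying on your own data.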

Laplacian kernels of higher order in image processing

In the literature on digital image processing you find examples of Laplace kernels of relatively low order, typically 3 or 5. Is there a general way to build Laplace kernels of arbitrary order? Links and/or references would be appreciated.
The Laplace operator is defined as the sum of the second derivatives along each of the axes of the image. (That is, it is the trace of the Hessian matrix):
$$\Delta I = \left( \frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2} \right) I$$
There are two common ways to discretize this:
Use finite differences. The derivative operator is the convolution by [1,-1] or [0.5,0,-0.5]. The second derivative operator is obtained by applying the [1,-1] convolution twice, leading to a convolution with [1,-2,1].
Convolve with the derivative of a regularization kernel. The optimal regularization kernel is the Gaussian, leading to a Laplace of Gaussian operator. The result is the exact Laplace of the image smoothed by the Gaussian kernel.
An alternative is to replace the regularization kernel with an interpolating kernel. A former colleague of mine published a paper on this method:
A. Hast, "Simple filter design for first and second order derivatives by a double filtering approach", Pattern Recognition Letters 42(1):65-71, 2014.
He used a "double filter", but with linear filters that can always be simplified to a single convolution.
The idea is simply to take an interpolating kernel and compute its derivative at integer locations. The interpolating kernel is always 1 at the origin and 0 at other integer locations, but it waves through these "knot points", meaning that its derivative is not zero at these integer locations.
In the extreme case, take the ideal interpolator, the sinc function:
$$\operatorname{sinc}(x) = \frac{\sin(\pi x)}{\pi x}$$
Its second derivative is:
$$\frac{d^2}{dx^2}\operatorname{sinc}(x) = \frac{(2 - \pi^2 x^2)\sin(\pi x) - 2\pi x\cos(\pi x)}{\pi x^3}$$
Which sampled at 11 integer locations leads to:
[ 0.08 -0.125 0.222 -0.5 2 -3.29 2 -0.5 0.222 -0.125 0.08 ]
But note that the normalization is not correct here, as we're cutting off the infinitely long kernel. Thus, it's better to pick a shorter kernel, such as the cubic spline kernel.
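The sampled values can be reproduced with a few lines of NumPy. This is only a sketch of the sampling step; the function name `sinc_second_derivative` is made up. At nonzero integers sin(πn) = 0 and cos(πn) = (-1)^n, so the expression reduces to -2(-1)^n / n², and the value at x = 0 is the limit -π²/3 ≈ -3.29.

```python
import numpy as np

def sinc_second_derivative(n):
    """Second derivative of sinc(x) = sin(pi x)/(pi x), sampled at integers n."""
    n = np.asarray(n, dtype=np.float64)
    out = np.full(n.shape, -np.pi**2 / 3)   # limit of the expression at x = 0
    nz = n != 0
    # At nonzero integers sin(pi n) = 0 and cos(pi n) = (-1)^n.
    out[nz] = -2.0 * np.cos(np.pi * n[nz]) / n[nz] ** 2
    return out

kernel = sinc_second_derivative(np.arange(-5, 6))
residual = kernel.sum()   # nonzero because the infinite kernel was truncated
```

The nonzero `residual` is the normalization problem mentioned above: a Laplace kernel should sum to zero so that constant regions give no response.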
A second alternative is to compute the Laplace operator through the Fourier domain. This simply requires multiplying the image's Fourier transform by $-(2\pi u)^2 - (2\pi v)^2 = -4\pi^2 (u^2 + v^2)$, with u and v the frequencies.
This is some MATLAB code that applies this filter to a unit impulse image, leading to an image of the kernel of size 256x256:
[u,v] = meshgrid((-128:127)/256,(-128:127)/256);
Dxx = -4*(pi*u).^2;
Dyy = -4*(pi*v).^2;
L = Dxx + Dyy;
l = fftshift(ifft2(ifftshift(L)));
l = real(l); % discard insignificant imaginary component (probably not necessary in MATLAB, but Octave leaves these values there)
l(abs(l)<1e-6) = 0; % set near-zero values to zero
l here is the same as the result above for the ideal interpolator, adding the vertical and horizontal ones together, and normalizing for a length of 256.
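For readers without MATLAB, a NumPy port of the snippet might look like the sketch below. It assumes the same 256x256 grid and the same frequency convention (cycles per sample, from -0.5 up to just below 0.5). The variable names mirror the MATLAB ones, except that `l` is called `kernel`.

```python
import numpy as np

N = 256
freqs = np.arange(-N // 2, N // 2) / N          # same as (-128:127)/256
u, v = np.meshgrid(freqs, freqs)

Dxx = -4 * (np.pi * u) ** 2                     # transfer function of d2/dx2
Dyy = -4 * (np.pi * v) ** 2                     # transfer function of d2/dy2
L = Dxx + Dyy

# Move the zero frequency to index (0, 0), invert, then center the kernel.
kernel = np.fft.fftshift(np.fft.ifft2(np.fft.ifftshift(L)))
kernel = np.real(kernel)                        # imaginary part is round-off only
kernel[np.abs(kernel) < 1e-6] = 0               # set near-zero values to zero
```

The middle row `kernel[128, :]` and middle column `kernel[:, 128]` then hold the sampled sinc second derivative, with the center element being the sum of the two one-dimensional center values.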
Finally, I'd like to mention that the Laplace operator is very sensitive to noise (high frequencies are enhanced significantly). The methods discussed here are meaningful only for data without noise (presumably synthetic data?). For any real-world data, I highly recommend that you use the Laplace of Gaussian. This will give you the exact Laplace of the smoothed image. The smoothing is necessary to prevent influence from noise. With little noise, you can use a small Gaussian sigma (e.g. σ=0.8). This will give you much more useful results than any other approach.
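As a rough illustration of the Laplace of Gaussian, the sketch below samples the analytic Laplacian of a 2D Gaussian on an integer grid. The function name `log_kernel`, the default radius of 4σ, and the final mean subtraction are assumptions made for this example. With a sigma as small as 0.8 a sampled kernel is only an approximation, so a library implementation is probably preferable in practice.

```python
import numpy as np

def log_kernel(sigma=0.8, radius=None):
    """Sampled Laplace of Gaussian: ((r^2 - 2 sigma^2) / sigma^4) * G(r)."""
    if radius is None:
        radius = int(np.ceil(4 * sigma))        # assumed cut-off at 4 sigma
    ax = np.arange(-radius, radius + 1, dtype=np.float64)
    x, y = np.meshgrid(ax, ax)
    r2 = x**2 + y**2
    gauss = np.exp(-r2 / (2 * sigma**2)) / (2 * np.pi * sigma**2)
    kernel = (r2 - 2 * sigma**2) / sigma**4 * gauss
    # Subtract the mean so that constant image regions give zero response.
    return kernel - kernel.mean()

kernel = log_kernel(sigma=0.8)
```

Convolving an image with `kernel` approximates the Laplace of the Gaussian-smoothed image. A larger sigma gives more smoothing and needs a larger kernel.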

Difference between the two sigmas i.e noise sigma and filter sigma?

I know that the sigma value that we multiply with noise is used to increase or decrease the intensity level that is,
noise = randn(size(image)) * sigma;
Here sigma has something to do with intensity. But what is the purpose of sigma while creating a filter that is,
filter = fspecial('gaussian', size, sigma);
Why do we need to pass the sigma value here? What is the difference between this sigma and the one mentioned above?
Thanks in advance!!!
Sigma is the standard deviation in both cases. It is worth reading up on the normal (Gaussian) distribution, e.g. on Wikipedia.
Both play a role in the functions you mentioned.
For the noise, sigma scales samples drawn from a normal distribution (that is what the n in randn() stands for), so it sets how far the noise values spread around zero.
For the filter, sigma defines the weights of a weighted average in which the central pixel sits at the peak of the Gaussian bell. The further away a pixel is, the smaller its weight.
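The two roles can be seen side by side in a short NumPy sketch. The numeric values and the helper `gaussian_kernel` are made up for illustration. It is a plain sampled Gaussian and is not guaranteed to match fspecial exactly.

```python
import numpy as np

rng = np.random.default_rng(0)
image = np.zeros((256, 256))

# Noise sigma: a spread in intensity units (how strong the noise is).
noise_sigma = 0.1
noise = rng.standard_normal(image.shape) * noise_sigma

def gaussian_kernel(size, filter_sigma):
    """Normalized 2D Gaussian weights; filter_sigma is a width in pixels."""
    ax = np.arange(size) - (size - 1) / 2
    g = np.exp(-ax**2 / (2 * filter_sigma**2))
    kernel = np.outer(g, g)
    return kernel / kernel.sum()

# Filter sigma: a spread in pixel units (how wide the averaging window is).
kernel = gaussian_kernel(7, filter_sigma=1.5)
```

Here `noise.std()` comes out close to `noise_sigma`, while `kernel` peaks at its center and its weights fall off with distance at a rate set by `filter_sigma`.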

gaussian blur with FFT

I'm trying to implement a Gaussian blur using the FFT and found the following recipe here:
This means that you can take the Fourier transform of the image and the filter, multiply the (complex) results, and then take the inverse Fourier transform.
I've got a kernel K, a 7x7 matrix,
and an image I, a 512x512 matrix.
I do not understand how to multiply K by I.
Is the only way to do that by making K as big as I (512x512)?
Yes, you do need to make K as big as I by padding it with zeros. Also, after padding, but before you take the FFT of the kernel, you need to translate it with wraparound, such that the center of the kernel (the peak of the Gaussian) is at (0,0). Otherwise, your filtered image will be translated. Alternatively, you can translate the resulting filtered image once you are done.
Another point: for small kernels not using the FFT may actually be faster. A 2D Gaussian kernel is separable, meaning that you can separate it into two 1D kernels for x and y. Then instead of a 2D convolution, you can do two 1D convolutions in x and y directions in the spatial domain. For smaller kernels that may end up being faster than doing the convolution in the frequency domain using the FFT.
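The padding and wraparound shift described above can be sketched in NumPy as follows. The function names and the random test image are assumptions for illustration. `spatial_blur` is only there as a reference that performs the same circular convolution directly, so the two results can be compared.

```python
import numpy as np

def gaussian_kernel(size, sigma):
    """Normalized size x size Gaussian kernel."""
    ax = np.arange(size) - size // 2
    g = np.exp(-ax**2 / (2 * sigma**2))
    k = np.outer(g, g)
    return k / k.sum()

def fft_blur(image, kernel):
    """Circular convolution of `image` with a small `kernel` via the FFT."""
    kh, kw = kernel.shape
    padded = np.zeros_like(image, dtype=np.float64)
    padded[:kh, :kw] = kernel                    # zero-pad K to the size of I
    # Shift with wraparound so the kernel center sits at index (0, 0).
    padded = np.roll(padded, (-(kh // 2), -(kw // 2)), axis=(0, 1))
    return np.real(np.fft.ifft2(np.fft.fft2(image) * np.fft.fft2(padded)))

def spatial_blur(image, kernel):
    """Reference: the same circular convolution done directly with shifts."""
    kh, kw = kernel.shape
    out = np.zeros_like(image, dtype=np.float64)
    for i in range(kh):
        for j in range(kw):
            shift = (i - kh // 2, j - kw // 2)
            out += kernel[i, j] * np.roll(image, shift, axis=(0, 1))
    return out

rng = np.random.default_rng(0)
I = rng.random((512, 512))                       # stand-in for the real image
K = gaussian_kernel(7, sigma=1.5)

blurred_fft = fft_blur(I, K)
blurred_ref = spatial_blur(I, K)
```

Both results agree to floating-point precision. Note that multiplying FFTs gives a circular convolution, so pixels near one border are mixed with pixels from the opposite border. If that matters, the image can be padded before the transform and cropped afterwards.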
If you are comfortable with pixel shaders, and if the FFT is not your main goal here but convolution with a Gaussian blur kernel is, then I can recommend my tutorial on what convolution is.
Regards.
