Approximating true heightmap gradient magnitude with OpenCV's Sobel filter

I have an image (cv::Mat, type CV_32F) representing a grid-sampled height function. The grid has a constant raster (dx, dy) per pixel.
I would like to estimate its gradient magnitude. Using OpenCV's Sobel filter, I approximate derivatives like this:
dfdx = cv2.Sobel(zz, cv2.CV_32F, 1, 0, ksize=3, scale=?)
dfdy = cv2.Sobel(zz, cv2.CV_32F, 0, 1, ksize=3, scale=?)
gradMag=np.sqrt(dfdx**2+dfdy**2)
The scale parameter is barely documented, but looking into the source, it is used to multiply the derivative kernels, i.e. the (-1,0,1) for finite differences. Using the 3x3 Sobel kernel, I assumed the scale should then be 1/(2*dx) or 1/(2*dy) (the finite-differences scheme) to obtain derivatives in true scale, but that does not seem to be the case: I was testing this on a synthetic image of a hemisphere with different rasters, but did not get consistent results.
How is scale supposed to be used to incorporate raster dimensions, thus getting real derivative estimates?

Scale must equal 0.25, per this answer: OpenCV's Sobel filter - why does it look so bad, especially compared to Gimp?
The normalization divisor for kernels can be calculated by the following formula:
f = max(abs(sumNegative), abs(sumPositive))
where sumNegative is the sum of negative values in the kernel and sumPositive the sum of positive values in the kernel.
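
As a quick check, here is a minimal Python sketch that reproduces that divisor for the 3x3 Sobel kernel (cv2.getDerivKernels is used here only to construct the kernel; it is my addition, not part of the quoted answer):

import cv2
import numpy as np

# Fetch the separable 3x3 Sobel x-derivative kernel pair
kx, ky = cv2.getDerivKernels(1, 0, 3)   # kx = [-1, 0, 1], ky = [1, 2, 1]
kernel = ky @ kx.T                      # full 3x3 Sobel kernel (outer product)

sum_pos = kernel[kernel > 0].sum()      # 4.0
sum_neg = kernel[kernel < 0].sum()      # -4.0
f = max(abs(sum_neg), abs(sum_pos))     # 4.0
print(1.0 / f)                          # 0.25, the suggested scale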

Related

How does bilinear interpolation work when downscaling?

I can clearly understand how bilinear interpolation works when upscaling an image, like filling in values using the 4 nearest neighbours, but I can't understand how it works when downscaling the image. It would mean a lot to me if someone could clarify this for me.
Scaling an image requires mapping pixels from the input to pixels on the output. If those pixel coordinates don't map to an integer, interpolation is required to estimate what the pixel value would have been. The "Bi" part of bilinear means it's linear interpolation applied in two dimensions independently. If for example output pixel 2,3 needs to come from input coordinates 1.5,7.2 you would interpolate in the X direction by taking 0.5 of each of the pixels at 1.0 and 2.0, then interpolate in the Y direction by taking 0.8 of the pixel at 7.0 and 0.2 of the pixel at 8.0. Usually these operations are combined into a single set of equations, but they can be applied separately if needed.
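
For illustration, a minimal NumPy sketch of that X-then-Y interpolation, using the coordinates from the example above (the 10x10 test image is made up):

import numpy as np

def bilinear_sample(img, x, y):
    # linear interpolation in X on the two neighbouring rows, then in Y
    x0, y0 = int(np.floor(x)), int(np.floor(y))
    fx, fy = x - x0, y - y0
    top = (1 - fx) * img[y0, x0] + fx * img[y0, x0 + 1]
    bot = (1 - fx) * img[y0 + 1, x0] + fx * img[y0 + 1, x0 + 1]
    return (1 - fy) * top + fy * bot

img = np.arange(100, dtype=np.float64).reshape(10, 10)
print(bilinear_sample(img, 1.5, 7.2))   # 0.5/0.5 weights in X, 0.8/0.2 in Y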
Bilinear is a poor choice for downscaling because it leads to aliasing artifacts. This is when you attempt to create spatial frequencies that are beyond the Nyquist sampling limit, and high frequency detail turns into low frequency artifacts. You can minimize this by blurring the image before you downscale it. Or you can choose an interpolation algorithm that incorporates some low pass filtering.
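
Both suggestions in OpenCV terms, as a rough sketch (the file name, downscale factor, and sigma heuristic are placeholders, not prescribed values):

import cv2

img = cv2.imread("input.png")           # hypothetical input
h, w = img.shape[:2]
factor = 4

# Option 1: blur first to remove frequencies above the new Nyquist limit,
# then downscale with bilinear interpolation
blurred = cv2.GaussianBlur(img, (0, 0), sigmaX=factor / 2.0)
small1 = cv2.resize(blurred, (w // factor, h // factor),
                    interpolation=cv2.INTER_LINEAR)

# Option 2: an interpolation mode with built-in low-pass behaviour
small2 = cv2.resize(img, (w // factor, h // factor),
                    interpolation=cv2.INTER_AREA)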

Laplacian kernels of higher order in image processing

In the literature on digital image processing you find examples of Laplace kernels of relatively low orders, typically 3 or 5. I wonder, is there any general way to build Laplace kernels of arbitrary order? Links and/or references would be appreciated.
The Laplace operator is defined as the sum of the second derivatives along each of the axes of the image. (That is, it is the trace of the Hessian matrix):
ΔI = ( ∂²/∂x² + ∂²/∂y² ) I
There are two common ways to discretize this:
Use finite differences. The derivative operator is convolution with [1,-1] or [0.5,0,-0.5]; the second derivative operator is obtained by applying the [1,-1] convolution twice, leading to a convolution with [1,-2,1] (see the short sketch after the second option below).
Convolve with the derivative of a regularization kernel. The optimal regularization kernel is the Gaussian, leading to a Laplace of Gaussian operator. The result is the exact Laplace of the image smoothed by the Gaussian kernel.
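
To illustrate the first option, a small NumPy sketch that builds [1,-2,1] by applying the [1,-1] convolution twice, then sums the two axes into the familiar 5-point Laplacian stencil:

import numpy as np

d2 = np.convolve([1, -1], [1, -1])   # second-derivative stencil: [1, -2, 1]

lap = np.zeros((3, 3))
lap[1, :] += d2                      # d2/dx2 along the middle row
lap[:, 1] += d2                      # d2/dy2 along the middle column
print(lap)
# [[ 0.  1.  0.]
#  [ 1. -4.  1.]
#  [ 0.  1.  0.]]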
An alternative is to replace the regularization kernel with an interpolating kernel. A former colleague of mine published a paper on this method:
A. Hast, "Simple filter design for first and second order derivatives by a double filtering approach", Pattern Recognition Letters 42(1):65-71, 2014.
He used a "double filter", but with linear filters that can always be simplified to a single convolution.
The idea is simply this: take an interpolating kernel and compute its derivative at integer locations. The interpolating kernel is always 1 at the origin and 0 at other integer locations, but it waves through these "knot points", meaning that its derivative is not zero at these integer locations.
In the extreme case, take the ideal interpolator, the sinc function:
sinc(x) = sin(πx) / πx
Its second derivative is:
d²/dx² sinc(x) = [ (2 - π²x²) sin(πx) - 2πx cos(πx) ] / (πx³)
Which, sampled at 11 integer locations (the center value is the limit -π²/3 ≈ -3.29), leads to:
[ 0.08 -0.125 0.222 -0.5 2 -3.29 2 -0.5 0.222 -0.125 0.08 ]
But note that the normalization is not correct here, as we're cutting off the infinitely long kernel. Thus, it's better to pick a shorter kernel, such as the cubic spline kernel.
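
For reference, a short Python sketch (NumPy assumed) that samples the second-derivative formula above at the 11 integer locations, using the limit -π²/3 at the center:

import numpy as np

def sinc_d2(x):
    # second derivative of sinc(x) = sin(pi*x)/(pi*x); limit at x = 0 is -pi^2/3
    if x == 0:
        return -np.pi**2 / 3.0
    return ((2 - (np.pi * x)**2) * np.sin(np.pi * x)
            - 2 * np.pi * x * np.cos(np.pi * x)) / (np.pi * x**3)

kernel = np.array([sinc_d2(n) for n in range(-5, 6)])
print(np.round(kernel, 3))
# [ 0.08  -0.125  0.222 -0.5    2.    -3.29   2.    -0.5    0.222 -0.125  0.08 ]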
A second alternative is to compute the Laplace operator through the Fourier domain. This simply requires multiplying with -4π²u² - 4π²v², with u and v the frequencies.
This is some MATLAB code that applies this filter to a unit impulse image, leading to an image of the kernel of size 256x256:
[u,v] = meshgrid((-128:127)/256,(-128:127)/256);
Dxx = -4*(pi*u).^2;
Dyy = -4*(pi*v).^2;
L = Dxx + Dyy;
l = fftshift(ifft2(ifftshift(L)));
l = real(l); % discard insignificant imaginary component (probably not necessary in MATLAB, but Octave leaves these values there)
l(abs(l)<1e-6) = 0; % set near-zero values to zero
l here is the same as the result above for the ideal interpolator, with the vertical and horizontal kernels added together, and normalized for a length of 256.
Finally, I'd like to mention that the Laplace operator is very sensitive to noise (high frequencies are enhanced significantly). The methods discussed here are meaningful only for data without noise (presumably synthetic data?). For any real-world data, I highly recommend that you use the Laplace of Gaussian. This will give you the exact Laplace of the smoothed image. The smoothing is necessary to prevent influence from noise. With little noise, you can use a small Gaussian sigma (e.g. σ=0.8). This will give you much more useful results than any other approach.
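
A minimal sketch of that recommendation, assuming SciPy is available (scipy.ndimage.gaussian_laplace computes exactly this Laplace of the Gaussian-smoothed image; the test image is a stand-in):

import numpy as np
from scipy import ndimage

img = np.random.rand(256, 256)                    # stand-in for real data
log = ndimage.gaussian_laplace(img, sigma=0.8)    # sigma suggested above for low noise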

Are Wavelet Coefficients simply the pixel values of decomposed image in 2D Discrete Wavelet Transform

I've been working with the Discrete Wavelet Transform, and I'm new to this theory. I want to access and modify the wavelet coefficients of the decomposed image. Are those wavelet coefficients simply the pixel values of the decomposed image in the 2D DWT?
This is, for example, the result of a DWT decomposition:
So, when I want to access and modify the Wavelet Coefficients, can I just iterate through the pixel values of above image? Thank you for your help.
No. The image is merely illustrative.
The image you are looking at does not exactly correspond to the original coefficients. The original wavelet coefficients are real numbers; what you see are their absolute values, quantized into a range from 0 to 255.
It is also not true that the coefficients were calculated as pairwise sums and differences of the input samples; they were calculated using two complementary filters (see the description here). The essential point, however, is that these coefficients were adjusted for display, so it is no longer possible to synthesize the original image from them. If you need to synthesize the image, you cannot use the pixels of the referenced image.
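
In Python, for example, a minimal sketch with PyWavelets reaches the real-valued coefficients directly instead of a display image (the Haar wavelet and the random test image are arbitrary choices):

import numpy as np
import pywt

img = np.random.rand(256, 256)               # stand-in image

# one-level 2D DWT: real-valued coefficient arrays, not a quantized picture
cA, (cH, cV, cD) = pywt.dwt2(img, 'haar')

cD[:] = 0                                    # e.g. zero the diagonal detail

# synthesize from the (modified) coefficients
rec = pywt.idwt2((cA, (cH, cV, cD)), 'haar')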

GPU-based Laplacian Pyramid

I have implemented an image blending method for seamless blending in plain C++. Now I want to port this code to the GPU (using OpenGL ES 2 shaders for mobile devices). Basically, the method creates Gaussian and Laplacian pyramids for each image, which are then combined from low resolution to the top level (see also the paper "The Laplacian Pyramid as a Compact Image Code" by Burt et al., 1983).
My problem is that the Laplacian pyramid levels can have negative values, but my devices do not support float or integer type textures (via the OES_texture_float extension, for example).
I have already looked for papers dealing with GPU-based pyramids, but without finding anything really useful.
How can I implement such a pyramid efficiently for a GPU?
Is it possible to calculate a Gaussian/Laplacian pyramid level without iterating through the preceding levels?
EDIT
It seems as if there is no "good" way to calculate Laplacian pyramids completely on the GPU, except by using two passes (one for signs, one for values), on devices that support neither signed types (for instance via ARB_texture_float) nor types larger than a byte when the image's data range is [0..255]. My Laplacian pyramid runs perfectly on GPUs with the ARB_texture_float extension, but without the extension (and with some adjustments to compress the range) the pyramid becomes "wrong" due to the range compression.
The safest way for you to implement a Laplacian pyramid if your textures are unsigned integers is to store two pyramids: one that contains the magnitude of the Laplacian and another that stores the sign of the pixel at that location.
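
A CPU-side NumPy sketch of that sign/magnitude encoding (the [-1, 1] range assumption is only illustrative; real code would track the actual level range):

import numpy as np

lap = np.random.randn(64, 64).astype(np.float32)   # stand-in Laplacian level

# encode into two unsigned 8-bit "textures": magnitude and sign
mag_tex = (np.clip(np.abs(lap), 0, 1) * 255).astype(np.uint8)
sign_tex = (lap >= 0).astype(np.uint8) * 255       # 255 = positive

# decode back to a signed value
decoded = (mag_tex / 255.0) * np.where(sign_tex > 0, 1.0, -1.0)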
Yes. Any level in a Gaussian or Laplacian pyramid has a closed-form solution based on the sigma value that you want to compute. Consider the base case of a LoG pyramid computed at intervals of sigma = 2/3. The first level of the pyramid has sigma 2/3 and is produced simply by convolving with a 5x5 LoG filter with sigma 2/3. The second convolution with the same filter produces a LoG image with sigma 4/3, and finally the third has sigma 6/3, or 2, so we subsample the image to produce the next integer level of the pyramid. If you want to compute the LoG of an image at sigma 2, the levels at sigma 2/3 and 4/3 are not necessary; simply subsample the image one time and convolve with a LoG filter with sigma 1.
If you want to compute the LoG at sigma = 20, quad-subsample the image (16 pixel blocks become 1 pixel) to give you a sigma 16 image, then convolve once with a sigma 4/3 LoG filter.
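
A rough sketch of that shortcut, following the answer's sigma bookkeeping (plain decimation is used for brevity, and scipy.ndimage.gaussian_laplace is assumed):

import numpy as np
from scipy import ndimage

img = np.random.rand(512, 512)     # stand-in image

# direct route to the sigma-2 level: subsample once (2x2 blocks -> 1 pixel),
# then convolve with a sigma-1 LoG, skipping the sigma 2/3 and 4/3 levels
half = img[::2, ::2]
log_level = ndimage.gaussian_laplace(half, sigma=1.0)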

gaussian blur with FFT

I'm trying to implement a Gaussian blur using the FFT and found the following recipe here:
This means that you can take the Fourier transform of the image and the filter, multiply the (complex) results, and then take the inverse Fourier transform.
I've got a kernel K, a 7x7 matrix, and an image I, a 512x512 matrix.
I do not understand how to multiply K by I.
Is the only way to do that by making K as big as I (512x512) ?
Yes, you do need to make K as big as I by padding it with zeros. Also, after padding, but before you take the FFT of the kernel, you need to translate it with wraparound, such that the center of the kernel (the peak of the Gaussian) is at (0,0). Otherwise, your filtered image will be translated. Alternatively, you can translate the resulting filtered image once you are done.
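
A minimal NumPy sketch of that padding-plus-wraparound step (note this computes a circular convolution, matching the FFT's periodic boundary):

import numpy as np

def fft_blur(img, kernel):
    H, W = img.shape
    kh, kw = kernel.shape
    padded = np.zeros((H, W))
    padded[:kh, :kw] = kernel
    # translate with wraparound so the kernel's center (the Gaussian peak)
    # sits at (0, 0) before transforming
    padded = np.roll(padded, (-(kh // 2), -(kw // 2)), axis=(0, 1))
    return np.real(np.fft.ifft2(np.fft.fft2(img) * np.fft.fft2(padded)))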
Another point: for small kernels not using the FFT may actually be faster. A 2D Gaussian kernel is separable, meaning that you can separate it into two 1D kernels for x and y. Then instead of a 2D convolution, you can do two 1D convolutions in x and y directions in the spatial domain. For smaller kernels that may end up being faster than doing the convolution in the frequency domain using the FFT.
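
For example, with OpenCV's separable filtering (the kernel size and sigma are arbitrary here):

import cv2
import numpy as np

img = np.random.rand(512, 512).astype(np.float32)

k = cv2.getGaussianKernel(7, 1.5)          # 7x1 1D Gaussian kernel
blurred = cv2.sepFilter2D(img, -1, k, k)   # one 1D pass in x, one in y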
If you are comfortable with pixel shaders, and if the FFT is not your main goal here but convolution with a Gaussian blur kernel is, then I can recommend my tutorial on what convolution is.
