wavelet as a bandpass or a highpass? - signal-processing

I am confused about wavelets. I understand that a wavelet transform is essentially a bandpass filter centered at the wavelet's center frequency. However, in PyWavelets, https://pywavelets.readthedocs.io/en/latest/index.html, a wavelet transform is implemented as a filter bank which can break a signal into low and high components. So my question is: how do the wavelets listed on http://wavelets.pybytes.com/wavelet/bior6.8/ fit into this? If they are used as bandpass filters, how can the signal be broken into two parts instead of just being bandpassed?

A wavelet transform uses the scaling function, represented by a lowpass filter, to approximate the signal on the next level, and the wavelet function, represented by a highpass filter, to encode the difference between the current level and the next.
On the page you mention these are the left and right plots of each wavelet basis, respectively. The site also has a short explanation of the algorithm (via the API reference URL). The picture on Wikipedia makes it a bit clearer I think.
So the first-level detail coefficients come from a highpass filter, and the final-level approximation coefficients come from a lowpass filter. The levels in between are repeated lowpass filtering (and usually subsampling) followed by one highpass filter. Comparable to edge detection after repeated smoothing. In other words, the bandpass property does not happen in a single step, see also this earlier answer.
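As a quick illustration (a minimal sketch using PyWavelets; the test signal here is made up), one DWT step splits the signal into a lowpass approximation and a highpass detail, and repeating the step on the approximation yields the in-between levels:

import numpy as np
import pywt

# Made-up test signal: a slow oscillation plus a fast one
t = np.linspace(0, 1, 1024)
x = np.sin(2 * np.pi * 5 * t) + 0.2 * np.sin(2 * np.pi * 120 * t)

# One level: lowpass (approximation) and highpass (detail) halves
cA, cD = pywt.dwt(x, 'bior6.8')

# Repeating lowpass + subsample before the final highpass gives the deeper
# levels; each level's detail coefficients act as a bandpass of the original
coeffs = pywt.wavedec(x, 'bior6.8', level=3)  # [cA3, cD3, cD2, cD1]

# The underlying filter bank of the wavelet (decomposition low/high,
# reconstruction low/high)
dec_lo, dec_hi, rec_lo, rec_hi = pywt.Wavelet('bior6.8').filter_bank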

Related

SSAO Denoise Filter

I'm adding screen space ambient occlusion to my engine. I'm not sure what the best approach is to denoise the raw AO data.
My AO computation is based on GTAO. The paper itself suggests a 4x4 bilateral filter and temporal supersampling.
I'd also like to have an SSAO version that doesn't rely on temporally accumulated data (as a fallback), effectively working with a very limited number of samples.
This results in a noticeably noisy result, which means a more aggressive denoise/reconstruction is needed. However, a joint bilateral filter with view-space depth as input clearly shows artifacts when using a large kernel (e.g. 15x15).
Are there other options?
I was thinking of a simple Gaussian blur coupled with edge data; however, I didn't find any resources about edge detection (at least not real-time-graphics-oriented ones). Do you have any suggestions/resources on that?
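For what it's worth, here is a minimal CPU-side NumPy sketch of the joint bilateral (depth-guided) filter described above, just to make the setup concrete; the buffer names ao and depth and the sigma values are placeholder assumptions, and a real implementation would live in a shader:

import numpy as np

def joint_bilateral_ao(ao, depth, radius=2, sigma_s=2.0, sigma_d=0.05):
    # Weight each tap by spatial distance and by view-space depth difference,
    # so smoothing does not bleed across depth discontinuities
    h, w = ao.shape
    out = np.zeros_like(ao)
    ys, xs = np.mgrid[-radius:radius + 1, -radius:radius + 1]
    spatial = np.exp(-(xs ** 2 + ys ** 2) / (2 * sigma_s ** 2))
    for y in range(h):
        for x in range(w):
            y0, y1 = max(y - radius, 0), min(y + radius + 1, h)
            x0, x1 = max(x - radius, 0), min(x + radius + 1, w)
            w_s = spatial[y0 - y + radius:y1 - y + radius,
                          x0 - x + radius:x1 - x + radius]
            w_d = np.exp(-((depth[y0:y1, x0:x1] - depth[y, x]) ** 2)
                         / (2 * sigma_d ** 2))
            wgt = w_s * w_d
            out[y, x] = np.sum(wgt * ao[y0:y1, x0:x1]) / np.sum(wgt)
    return out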

Why using magnitude method to get processed image?

Hi guys, I've been thinking about this question:
I know that we use the Fourier transform to get into the frequency domain to process the image.
I read the textbook; it says that when we are done processing the image in the Fourier domain, we have to invert it back to get the processed image.
And the textbook says to take the real part of the inverse.
However, when I go through the OpenCV tutorial, no matter whether using the OpenCV or NumPy version, they eventually use magnitude (for OpenCV) or np.abs (for NumPy).
For OpenCV, the inverse returns two channels which contain the real and imaginary components. When I took the real part of the inverse, I got a totally weird image.
Can somebody who knows explain the meaning behind all of this:
Why use magnitude or abs to get the processed image?
What's wrong with the textbook instruction (take the real part of the inverse)?
The textbook is right, the tutorial is wrong.
A real-valued image has a complex conjugate symmetry in the Fourier domain. This means that the FFT of the image will have a specific symmetry. Any processing that you do must preserve this symmetry if you want the inverse transform to remain real-valued. If you do this processing wrong, then the inverse transform will be complex-valued, and probably nonsensical.
If you preserve the symmetry in the Fourier domain properly, then the imaginary component of the inverse transform will be nearly zero (likely different from zero because of numerical imprecision). Discarding this imaginary component is the correct thing to do. Computing the magnitude will yield the same result, except all negative values will become positive (note some filters are meant to produce negative values, such as derivative filters), and at an increased computational cost.
For example, a convolution is a multiplication in the Fourier domain. The filter in the Fourier domain must be real-valued and symmetric around the origin. Often people will confuse where the origin is in the Fourier domain, and multiply by a filter that seems symmetric but is actually shifted with respect to the origin, making it not symmetric. This shift introduces a phase change in the inverse transform (see the shift property of the Fourier transform). The magnitude of the inverse transform is not affected by the phase change, so taking the magnitude of this inverse transform yields an output that sort of looks OK, except if one expects to see negative values in the filter result. It would have been better to correctly understand the FFT algorithm, create a properly symmetric filter in the Fourier domain, and simply keep the real part of the inverse transform.
Nonetheless, some filters are specifically designed to break the symmetry and yield a complex-valued filter output. For example, the Gabor filter has an even (symmetric) component and an odd (anti-symmetric) component. The even component yields a real-valued output, the odd component yields an imaginary-valued output. In this case, it is the magnitude of the complex value that is of interest. Likewise, a quadrature filter is specifically meant to produce a complex-valued output. This output is the analytic signal (or its multi-dimensional extension, the monogenic signal); both its magnitude and phase are of interest, for example as used in the phase congruency method of edge detection.
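To make that last point concrete, here is a small sketch of a Gabor filter pair in OpenCV (the file name and the parameters are arbitrary placeholders): the even and odd responses are combined as a complex value and it is the magnitude that is kept.

import cv2
import numpy as np

img = cv2.imread('input.png', cv2.IMREAD_GRAYSCALE).astype(np.float64)

# Even (cosine, psi=0) and odd (sine, psi=pi/2) Gabor components
params = dict(ksize=(21, 21), sigma=4.0, theta=0.0, lambd=10.0, gamma=1.0)
k_even = cv2.getGaborKernel(psi=0, **params)
k_odd = cv2.getGaborKernel(psi=np.pi / 2, **params)

resp_even = cv2.filter2D(img, cv2.CV_64F, k_even)  # real part of the response
resp_odd = cv2.filter2D(img, cv2.CV_64F, k_odd)    # odd part (sign is irrelevant for the magnitude)

# For this complex-valued output it is the magnitude that is of interest
magnitude = np.sqrt(resp_even ** 2 + resp_odd ** 2)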
Looking at the linked tutorial, it is the line
fshift[crow-30:crow+30, ccol-30:ccol+30] = 0
which generates the Fourier-domain filter and applies it to the image (it is equivalent to multiplying by a filter with 1s and 0s). This tutorial correctly computes the origin of the Fourier domain (though for Python 3 you would use crow,ccol = rows//2 , cols//2 to get the integer division). But the filter above is not symmetric around that origin. In Python, crow-30:crow+30 indicates 30 pixels to the left of the origin, and only 29 pixels to the right (the right bound is not included!). The correct filter would be:
fshift[crow-30:crow+30+1, ccol-30:ccol+30+1] = 0
With this filter, the inverse transform is purely real (the imaginary component has values on the order of 1e-13, which is just numerical error). Thus, it is now possible (and correct) to replace img_back = np.abs(img_back) with img_back = np.real(img_back).
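Putting the whole pipeline together (a minimal NumPy sketch; the random array is just a stand-in for the tutorial's image):

import numpy as np

rng = np.random.default_rng(0)
img = rng.random((256, 256))        # stand-in for the input image
rows, cols = img.shape
crow, ccol = rows // 2, cols // 2   # integer division (Python 3)

fshift = np.fft.fftshift(np.fft.fft2(img))   # origin moved to (crow, ccol)

# Symmetric mask: zero 30 pixels on *both* sides of the origin
fshift[crow-30:crow+30+1, ccol-30:ccol+30+1] = 0

img_back = np.fft.ifft2(np.fft.ifftshift(fshift))
print(np.abs(img_back.imag).max())  # on the order of 1e-13: numerical error
img_back = np.real(img_back)        # correct; np.abs is no longer needed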

what if the filter window size is an even number in Gaussian filtering?

I know people usually prefer to choose an odd number as the size of the Gaussian filter, since the image is made of discrete pixels and we can then always locate the central pixel.
But what if the size is an even number? This leads to several questions:
What will the Gaussian filter look like; should it be symmetric or asymmetric?
What if the size equals 2?
Thank you.
There really is no such choice to be made.
A Gaussian filtering kernel that is shifted will result in a smoothing + a shift of the image. If you want a filter that doesn’t shift the image, the filter must have the origin of the Gaussian at the origin of the kernel, typically the central pixel of an odd-sized kernel.
Once we have established that, using an even-sized filter must lead to an asymmetrical kernel. It is not really desirable to have an asymmetrical smoothing filter (unless we’re talking about adaptive filtering) because the asymmetry introduces a bias.
So, we’re stuck with an odd-sized filter. An even-sized filter will introduce either a bias or a shift of half a pixel.
A 2-pixel kernel cannot be a Gaussian filter because it takes at least 5 samples to represent a Gaussian kernel with sufficient detail for it to present the positive aspects of the Gaussian filter. With fewer samples, the filter will not behave like a Gaussian filter.
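To see the even-size problem concretely, here is a small sketch (sizes and sigma are arbitrary) that samples a 1D Gaussian on a grid and looks at where its centre of mass lands:

import numpy as np

def gaussian_kernel(size, sigma=1.0):
    # Sample the Gaussian at integer positions, centred on the kernel middle
    x = np.arange(size) - (size - 1) / 2.0
    g = np.exp(-x ** 2 / (2 * sigma ** 2))
    return g / g.sum()

for size in (5, 4, 2):
    g = gaussian_kernel(size)
    print(size, np.sum(np.arange(size) * g))
# size 5 -> centroid 2.0: exactly on a pixel, no shift
# size 4 -> centroid 1.5: between pixels, i.e. a half-pixel shift
#           (re-centring on a pixel instead makes the kernel asymmetrical)
# size 2 -> two equal taps: nothing Gaussian-like remains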
For more information about Gaussian filtering, I recommend that you read this blog post that I wrote 10 years ago.

Smoothing motion by using Kalman Filter or Particle Filter in video stabilization

I have a problem. I have read many papers about video stabilization. Almost all papers mention smoothing motion by using a Kalman filter, since it is robust and runs in real-time applications.
But there is also another strong filter, the particle filter.
So why don't we use the particle filter for smoothing motion to create a stabilized video?
Some papers only use the particle filter for estimating global motion between frames (the motion estimation part).
It is hard for me to understand this.
Can anyone explain them for me, please?
Thank you so much.
A Kalman Filter is uni-modal. That means it has one belief along with an error covariance matrix to represent the confidence in that belief as a normal distribution. If you are going to smooth some process, you want to get out a single, smoothed result. This is consistent with a KF. It's like using least squares regression to fit a line to data. You are simplifying the input to one result.
A particle filter is multi-modal by its very nature. Where a Kalman Filter represents belief as a central value and a variance around that central value, a particle filter just has many particles whose values are clustered around regions that are more likely. A particle filter can represent essentially the same state as a KF (imagine a histogram of the particles that looks like the classic bell curve of the normal distribution). But a particle filter can also have multiple humps or really any shape at all. This ability to have multiple simultaneous modes is ideally suited to handle problems like estimating motion, because one mode (cluster of particles) can represent one move, and another mode represents a different move. When presented with this ambiguity, a KF would have to abandon one of the possibilities altogether, but a particle filter can keep on believing both things at the same time until the ambiguity is resolved by more data.
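As a rough illustration of the uni-modal case, here is a minimal 1D random-walk Kalman filter in NumPy smoothing a made-up camera trajectory; the noise parameters are arbitrary assumptions:

import numpy as np

rng = np.random.default_rng(1)
true_path = np.cumsum(rng.normal(0.0, 0.1, 200))   # made-up camera x-translation
measured = true_path + rng.normal(0.0, 1.0, 200)   # jittery per-frame estimates

q, r = 0.01, 1.0            # assumed process and measurement noise variances
x, p = measured[0], 1.0     # single belief: mean x with variance p
smoothed = []
for z in measured:
    p += q                  # predict: uncertainty grows
    k = p / (p + r)         # Kalman gain
    x += k * (z - x)        # update the one belief towards the measurement
    p *= 1 - k
    smoothed.append(x)
# 'smoothed' is one unambiguous trajectory; a particle filter would instead
# carry a whole cloud of candidate states, which may form several clusters.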

Gaussian blur and FFT

I'm trying to make an implementation of Gaussian blur for a school project.
I need to make both a CPU and a GPU implementation to compare performance.
I am not quite sure that I understand how Gaussian blur works. So one of my questions is
if I have understood it correctly?
Here's what I do now:
I use the equation from wikipedia http://en.wikipedia.org/wiki/Gaussian_blur to calculate
the filter.
For 2D I take the RGB of each pixel in the image and apply the filter to it by
multiplying the RGB of the pixel and the surrounding pixels with the associated filter position.
These are then summed to be the new pixel RGB values.
For 1D I apply the filter first horizontally and then vertically, which should give
the same result if I understand things correctly.
Is this result exactly the same as when the 2D filter is applied?
Another question I have is about how the algorithm can be optimized.
I have read that the Fast Fourier Transform is applicable to Gaussian blur.
But I can't figure out how to relate it.
Can someone give me a hint in the right direction?
Thanks.
Yes, the 2D Gaussian kernel is separable so you can just apply it as two 1D kernels. Note that you can't apply these operations "in place" however - you need at least one temporary buffer to store the result of the first 1D pass.
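For example (a small NumPy/SciPy sketch with arbitrary kernel size and sigma), the 2D kernel is the outer product of a 1D kernel with itself, and two 1D passes through a temporary buffer match the direct 2D convolution:

import numpy as np
from scipy.ndimage import convolve, convolve1d

def gaussian_1d(size=7, sigma=1.5):
    x = np.arange(size) - (size - 1) / 2.0
    g = np.exp(-x ** 2 / (2 * sigma ** 2))
    return g / g.sum()

g1 = gaussian_1d()
g2 = np.outer(g1, g1)                # equivalent 2D kernel

img = np.random.default_rng(0).random((64, 64))
tmp = convolve1d(img, g1, axis=1)    # horizontal pass into a temporary buffer
sep = convolve1d(tmp, g1, axis=0)    # vertical pass
full = convolve(img, g2)             # direct 2D convolution

print(np.allclose(sep, full))        # True (up to floating-point rounding)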
FFT-based convolution is a useful optimisation when you have large kernels - this applies to any kind of filter, not just Gaussian. Just how big "large" is depends on your architecture, but you probably don't want to worry about using an FFT-based approach for anything smaller than, say, a 49x49 kernel. The general approach is:
FFT the image
FFT the kernel, padded to the size of the image
multiply the two in the frequency domain (equivalent to convolution in the spatial domain)
IFFT (inverse FFT) the result
Note that if you're applying the same filter to more than one image then you only need to FFT the padded kernel once. You still have at least two FFTs to perform per image though (one forward and one inverse), which is why this technique only becomes a computational win for large-ish kernels.
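A minimal NumPy version of those steps (the image and kernel here are made-up stand-ins; in practice scipy.signal.fftconvolve does the same bookkeeping for you):

import numpy as np

rng = np.random.default_rng(0)
img = rng.random((256, 256))
kernel = rng.random((49, 49))            # stand-in for a large Gaussian kernel
kernel /= kernel.sum()

# Pad the kernel to the image size (note: this gives circular convolution;
# pad the image as well if wrap-around at the borders matters to you)
padded = np.zeros_like(img)
padded[:kernel.shape[0], :kernel.shape[1]] = kernel

F_img = np.fft.fft2(img)                 # FFT the image
F_ker = np.fft.fft2(padded)              # FFT the padded kernel (reusable)

result = np.real(np.fft.ifft2(F_img * F_ker))   # multiply, then inverse FFT
# Undo the shift caused by placing the kernel's centre away from (0, 0)
result = np.roll(result, (-(kernel.shape[0] // 2), -(kernel.shape[1] // 2)),
                 axis=(0, 1))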
