I find that Sobel kernels of size 5x5 and 7x7 are also common, but I don't know how to reason about how they are constructed.
Are they an extension obtained with a binomial expansion?
Thanks in advance.
I've been prototyping some image (8-bit) processing that involves median filtering with a rectangular kernel {1,25}, and I found out that OpenCV medianBlur only accepts square kernels.
I've discovered a constant-time median algorithm, but again it works only with a square kernel. Do you guys know how I could implement my 1D median using OpenCV or standard C++?
Thanks in advance
I've implemented a Gaussian blur fragment shader in GLSL. I understand the main concepts behind all of it: convolution, separating the x and y passes (separability), multiple passes to increase the radius...
I still have a few questions though:
What's the relationship between sigma and radius?
I've read that sigma is equivalent to the radius, but I don't see how sigma is expressed in pixels. Or is "radius" just another name for sigma, not related to pixels?
How do I choose sigma?
Considering I use multiple passes to increase sigma, how do I choose a good sigma to obtain the sigma I want at any given pass? If the resulting sigma is equal to the square root of the sum of the squares of the sigmas and sigma is equivalent to radius, what's an easy way to get any desired radius?
What's the good size for a kernel, and how does it relate to sigma?
I've seen most implementations use a 5x5 kernel. This is probably a good choice for a fast implementation with decent quality, but is there another reason to choose another kernel size? How does sigma relate to the kernel size? Should I find the best sigma so that coefficients outside my kernel are negligible and just normalize?
What's the relationship between sigma and radius?
I think your terms here are interchangeable depending on your implementation. Most GLSL implementations of Gaussian blur use the sigma value to define the amount of blur. In the Gaussian blur definition, the radius can be considered the 'blur radius'. These terms are in pixel space.
How do I choose sigma?
This will define how much blur you want, which corresponds to the size of the kernel to be used in the convolution. Bigger values will result in more blurring.
The NVidia implementation uses a kernel size of int(sigma*3).
You may experiment using a smaller kernel size with higher values of sigma for performance considerations. These are free parameters to experiment with, which define how many pixels to use for modulation and how much of the corresponding pixel to include in the result.
What's the good size for a kernel, and how does it relate to sigma?
Based on the sigma value you will want to choose a corresponding kernel size. The kernel size will determine how many pixels to sample during the convolution and the sigma will define how much to modulate them by.
You may want to post some code for a more detailed explanation. NVidia has a pretty good chapter on how to build a Gaussian kernel.
Look at Example 40-1.
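As a rough illustration of the sigma/kernel-size relationship discussed above, here is a minimal sketch (my own, not NVidia's code): it builds a normalized 1D Gaussian kernel from sigma, using a radius of int(sigma*3) as the heuristic mentioned above so that the coefficients left outside the kernel are negligible. The function name and cut-off are assumptions for illustration only.

```cpp
// Minimal sketch: build a normalized 1D Gaussian kernel from sigma.
// The int(sigma * 3) radius is the heuristic mentioned above, chosen so that
// the coefficients cut off outside the kernel are negligible.
#include <cmath>
#include <cstdio>
#include <vector>

std::vector<float> gaussianKernel1D(float sigma) {
    int radius = static_cast<int>(sigma * 3.0f);       // heuristic kernel radius
    std::vector<float> weights(2 * radius + 1);
    float sum = 0.0f;
    for (int i = -radius; i <= radius; ++i) {
        float w = std::exp(-(i * i) / (2.0f * sigma * sigma));
        weights[i + radius] = w;
        sum += w;
    }
    for (float &w : weights) w /= sum;                  // normalize so the weights sum to 1
    return weights;
}

int main() {
    for (float w : gaussianKernel1D(2.0f)) std::printf("%.4f ", w);
    std::printf("\n");
}
```

For a separable blur you would then use these same weights for both the horizontal and the vertical pass.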
I am trying to identify which parts of a picture are in focus and which are blurred, something like this:
But HOW do I do that? Any ideas on how to measure this? I've read something about finding the high frequencies, but how could that produce a picture like those?
Cheers,
Any image will be the sharpest at its optimum focus. Take advantage of that: run the Sobel operator or the Laplace operator, or any kind of difference (derivative) filter. Sum the results pixel by pixel; the image with the highest sum is the best-focused one.
Edit:
There will be additional constraints depending on how much additional information you have, e.g. multiple samples, similarity of objects in the image, etc.
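To make that suggestion concrete, here is a hedged sketch of a Laplace-based score: each image gets a sharpness score equal to the pixel-by-pixel sum of its absolute Laplacian response, and the image with the highest score is taken as the best focused. The use of OpenCV, the file names, and the function name are my assumptions, not part of the answer.

```cpp
// Sketch only: score each image by the pixel-by-pixel sum of its absolute
// Laplacian (a second-derivative / difference filter), as suggested above.
// OpenCV, the file names and the function name are assumptions for illustration.
#include <opencv2/opencv.hpp>
#include <cstdio>

double sharpnessScore(const cv::Mat &gray) {
    cv::Mat lap;
    cv::Laplacian(gray, lap, CV_64F);   // derivative filter response
    cv::Mat absLap = cv::abs(lap);
    return cv::sum(absLap)[0];          // sum the results pixel by pixel
}

int main() {
    cv::Mat a = cv::imread("shot_a.png", cv::IMREAD_GRAYSCALE);  // hypothetical inputs
    cv::Mat b = cv::imread("shot_b.png", cv::IMREAD_GRAYSCALE);
    std::printf("a: %.0f  b: %.0f\n", sharpnessScore(a), sharpnessScore(b));
}
```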
Check out this paper if you need more precision than the Laplace filter offers. In my problem with 4K images, the Laplace filter was insufficient for detecting blurred and out-of-focus regions.
https://github.com/facebookresearch/DeepFocus
Edit: there are a number of deep-learning approaches to blur detection. Choose the method that best suits your needs :)
I'm trying to make an implementation of Gaussian blur for a school project.
I need to make both a CPU and a GPU implementation to compare performance.
I am not quite sure that I understand how Gaussian blur works, so one of my questions is
whether I have understood it correctly.
Here's what I do now:
I use the equation from Wikipedia (http://en.wikipedia.org/wiki/Gaussian_blur) to calculate
the filter.
For 2D, I take the RGB values of each pixel in the image and apply the filter by
multiplying the RGB values of the pixel and its surrounding pixels with the corresponding filter coefficients.
These products are then summed to form the new RGB values of the pixel.
For 1D, I apply the filter first horizontally and then vertically, which should give
the same result if I understand things correctly.
Is this exactly the same result as when the 2D filter is applied?
Another question I have is about how the algorithm can be optimized.
I have read that the Fast Fourier Transform is applicable to Gaussian blur,
but I can't figure out how to relate the two.
Can someone give me a hint in the right direction?
Thanks.
Yes, the 2D Gaussian kernel is separable so you can just apply it as two 1D kernels. Note that you can't apply these operations "in place" however - you need at least one temporary buffer to store the result of the first 1D pass.
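For reference, here is a small sketch of that separable approach with an explicit temporary buffer for the first 1D pass. The use of OpenCV, the file names, and the 3*sigma kernel size are my assumptions, not the answer's.

```cpp
// Sketch of a separable Gaussian blur: one horizontal 1D pass into a temporary
// buffer, then one vertical 1D pass on that buffer.
#include <opencv2/opencv.hpp>

cv::Mat separableGaussian(const cv::Mat &src, double sigma) {
    int ksize = 2 * static_cast<int>(3.0 * sigma) + 1;         // odd size, ~3*sigma radius (assumption)
    cv::Mat col = cv::getGaussianKernel(ksize, sigma, CV_64F); // ksize x 1 column of weights
    cv::Mat row = col.t();                                     // 1 x ksize row of weights

    cv::Mat tmp, dst;
    cv::filter2D(src, tmp, -1, row);   // horizontal pass -> temporary buffer
    cv::filter2D(tmp, dst, -1, col);   // vertical pass on the temporary result
    return dst;                        // same result as a single 2D Gaussian convolution
}

int main() {
    cv::Mat img = cv::imread("input.png");                  // hypothetical input file
    cv::imwrite("blurred.png", separableGaussian(img, 2.5));
}
```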
FFT-based convolution is a useful optimisation when you have large kernels - this applies to any kind of filter, not just Gaussian. Just how big "large" is depends on your architecture, but you probably don't want to worry about using an FFT-based approach for anything smaller than, say, a 49x49 kernel. The general approach is:
FFT the image
FFT the kernel, padded to the size of the image
multiply the two in the frequency domain (equivalent to convolution in the spatial domain)
IFFT (inverse FFT) the result
Note that if you're applying the same filter to more than one image then you only need to FFT the padded kernel once. You still have at least two FFTs to perform per image though (one forward and one inverse), which is why this technique only becomes a computational win for large-ish kernels.
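Here is a rough sketch of those four steps using OpenCV's DFT functions; OpenCV itself, the file names, and the example box kernel are my assumptions. Note that padding the kernel only to the image size, as in the steps above, produces a circular convolution whose result is shifted by the kernel radius; a production version would pad both further and crop afterwards.

```cpp
// Sketch of FFT-based convolution following the four steps above.
#include <opencv2/opencv.hpp>

cv::Mat fftConvolve(const cv::Mat &img32f, const cv::Mat &kernel32f) {
    // 1. FFT the image
    cv::Mat imgSpec;
    cv::dft(img32f, imgSpec, cv::DFT_COMPLEX_OUTPUT);

    // 2. FFT the kernel, zero-padded to the size of the image
    cv::Mat padded = cv::Mat::zeros(img32f.size(), CV_32F);
    kernel32f.copyTo(padded(cv::Rect(0, 0, kernel32f.cols, kernel32f.rows)));
    cv::Mat kerSpec;
    cv::dft(padded, kerSpec, cv::DFT_COMPLEX_OUTPUT);

    // 3. Multiply the two in the frequency domain (== spatial convolution)
    cv::Mat prod;
    cv::mulSpectrums(imgSpec, kerSpec, prod, 0);

    // 4. Inverse FFT (note: circular convolution, shifted by the kernel radius)
    cv::Mat result;
    cv::dft(prod, result, cv::DFT_INVERSE | cv::DFT_SCALE | cv::DFT_REAL_OUTPUT);
    return result;
}

int main() {
    cv::Mat img = cv::imread("input.png", cv::IMREAD_GRAYSCALE);      // hypothetical input
    img.convertTo(img, CV_32F, 1.0 / 255.0);
    cv::Mat kernel = cv::Mat::ones(49, 49, CV_32F) / (49.0f * 49.0f); // large box kernel
    cv::Mat out = fftConvolve(img, kernel);
    out.convertTo(out, CV_8U, 255.0);
    cv::imwrite("convolved.png", out);
}
```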
Heya! I recently came across this statement: "larger Sobel operators are more stable in noise", which got me thinking: why?
It's probably because the gradient differences are less pronounced, so the noise is ignored. Am I correct? Thanks!
Of course, this depends on a lot of things, one being the frequency of the noise you are facing. If the noise changes with much higher frequency (e.g. with every pixel) than the patterns you are trying to find, then you are correct.
In general, however, Sobel operators are a sort of finite differencing, and as such they work best the smaller the differences are.
If you use a larger Sobel operator, what you are actually doing is using a Sobel operator plus a low-pass filter. The low-pass filter might not be the best way to deal with the noise in your image; sometimes it might be more favorable to use the smallest Sobel filter and some other (machine-learning) algorithm for noise rejection.
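To see the "Sobel plus low-pass" point numerically, here is a small sketch (mine, not the answer's): the 1D factors of a commonly used 5x5 Sobel-x kernel are exactly the 3x3 factors convolved once more with the binomial smoothing kernel [1 2 1], i.e. the larger operator is the small derivative with extra low-pass filtering built in.

```cpp
// Sketch: show that common 5x5 Sobel factors equal the 3x3 factors
// convolved with the [1 2 1] binomial smoothing kernel (extra low-pass).
#include <cstdio>
#include <vector>

// Full 1D convolution of two small kernels.
std::vector<int> conv(const std::vector<int> &a, const std::vector<int> &b) {
    std::vector<int> out(a.size() + b.size() - 1, 0);
    for (size_t i = 0; i < a.size(); ++i)
        for (size_t j = 0; j < b.size(); ++j)
            out[i + j] += a[i] * b[j];
    return out;
}

void print(const char *name, const std::vector<int> &v) {
    std::printf("%-24s", name);
    for (int x : v) std::printf("%3d ", x);
    std::printf("\n");
}

int main() {
    std::vector<int> deriv3  = {-1, 0, 1};   // 3x3 Sobel: derivative factor
    std::vector<int> smooth3 = { 1, 2, 1};   // 3x3 Sobel: smoothing factor

    print("5x5 derivative factor:", conv(deriv3, smooth3));   // -1 -2 0 2 1
    print("5x5 smoothing factor:",  conv(smooth3, smooth3));  //  1  4 6 4 1
}
```

This also hints at an answer to the first question above: the larger kernels commonly quoted follow from repeated convolution with [1 2 1], i.e. a binomial expansion.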