Heya! I recently came across the statement "larger Sobel operators are more stable in noise", which got me thinking: why?
It's probably because the gradient differences are less pronounced, so the noise is ignored. Am I correct? Thanks!
Of course, this depends on a lot of things, one being the frequency of the noise you are facing. If the noise changes at a much higher frequency (e.g. with every pixel) than the patterns you are trying to find, then you are correct.
In general, however, Sobel operators are a form of finite differencing, and as such they work best the smaller the differences are.
If you use a larger Sobel operator, what you are actually doing is applying a Sobel operator combined with a low-pass filter. The low-pass filter might not be the best way to deal with the noise in your image; sometimes it might be more favorable to use the smallest Sobel filter plus some other (e.g. machine-learning) algorithm for noise rejection.
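To make this concrete, here is a minimal sketch (my own synthetic setup, not from the thread): a vertical step edge plus per-pixel Gaussian noise, comparing the edge-to-noise ratio of the response for a small and a large Sobel kernel:

```python
import cv2
import numpy as np

rng = np.random.default_rng(0)
img = np.zeros((256, 256), dtype=np.float64)
img[:, 128:] = 100.0                 # vertical step edge
img += rng.normal(0, 10, img.shape)  # high-frequency (per-pixel) noise

for ksize in (3, 7):
    gx = cv2.Sobel(img, cv2.CV_64F, 1, 0, ksize=ksize)
    edge = np.abs(gx[:, 126:131]).mean()  # response around the edge
    noise = gx[:, 10:100].std()           # response in a flat region
    print(f"ksize={ksize}: edge/noise ratio = {edge / noise:.1f}")
```

For noise that varies with every pixel, the larger kernel averages over more samples and should give the higher ratio, matching the "more stable in noise" claim.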
I'm trying to develop a way to count the number of bright spots in an image. The spots should be Gaussian point sources, but there is a lot of noise. There are probably on the order of 10-20 actual point sources in this image. My first thought was to use a Gaussian convolution with sigma = 15, which seems to do a good job.
First, is there a better way to isolate these bright spots?
Second, how can I 'detect' the bright spots, i.e. count them? I haven't had any luck with circular Hough transforms from OpenCV.
Edit: Here is the original without gridlines, here is the convolved image without gridlines.
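For what it's worth, here is a minimal sketch of the smooth-then-find-local-maxima idea (the function, neighborhood size, and 3-sigma threshold are my own assumptions, not a known-good recipe for this image):

```python
import numpy as np
from scipy import ndimage

def count_spots(image, sigma=15, thresh=None):
    smoothed = ndimage.gaussian_filter(image.astype(float), sigma)
    if thresh is None:
        # crude brightness cutoff: 3 standard deviations above the mean
        thresh = smoothed.mean() + 3 * smoothed.std()
    # a peak is a pixel that equals the maximum of its neighborhood
    # and is bright enough
    size = int(2 * sigma + 1)
    peaks = (smoothed == ndimage.maximum_filter(smoothed, size=size)) \
            & (smoothed > thresh)
    _, n = ndimage.label(peaks)
    return n
```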
I am working with thermal infrared images, which are subject to a large amount of noise.
I found that low-rank approaches, such as those based on Singular Value Decomposition (SVD) or Weighted Nuclear Norm Minimization (WNNM), give very good results in terms of reducing the noise while preserving the structure of the information.
Their main drawback is that they are quite slow to compute (several minutes per image).
Here is some literature:
https://ieeexplore.ieee.org/stamp/stamp.jsp?arnumber=7067415
https://arxiv.org/abs/1705.09912
The second paper has some MATLAB code available; there are quite a lot of files, but the translation to Python should not be that complex.
OpenCV also implements (and it is available in Python) a very efficient Non-Local Means algorithm:
https://docs.opencv.org/master/d5/d69/tutorial_py_non_local_means.html
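For reference, a minimal usage sketch following that tutorial (filenames and parameter values are placeholders to adapt to your data):

```python
import cv2

img = cv2.imread("thermal_frame.png", cv2.IMREAD_GRAYSCALE)
# args: source, destination, filter strength h, template window, search window
denoised = cv2.fastNlMeansDenoising(img, None, 10, 7, 21)
cv2.imwrite("denoised.png", denoised)
```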
Let's pretend that in addition to an image, I also have a gradient running from left to right along the X axis of the image, and another gradient running from top to bottom along the Y axis. Those two gradients are the same size as the image, and both could range from -0.5 to 0.5.
Now, I'd like to make the convolution kernel (a.k.a. convolution filter, or convolution weights) depend on the (x, y) location in the gradient. So the kernel would be a function of the gradient, as if the kernel were the output of a nested mini neural net. This would make the weights of the filter different at every position, but slightly similar to those of their neighbors. How do I do that within PyTorch or TensorFlow?
Sure, I could compute a Toeplitz matrix (a.k.a. diagonal-constant matrix) by myself, but the matrix multiplication would take O(n^3) operations if we pretend that x == y == n, whereas convolutions can normally be implemented in O(n^2). Or I could maybe iterate over every element myself and do the multiplications in an unvectorized fashion.
Any better ideas? I'd like to see some creativity here, thinking about how this could be implemented neatly. I believe coding this would be an interesting way to build a network layer doing something similar to a simplified version of a Spatial Transformer Network, but whose spatial transformation would be independent of the image.
Here is a solution I thought of for a simplified version of this problem, where a linear combination of weights is used rather than a true nested mini neural network:
It may be possible to do 4 different convolution passes so as to have 4 feature maps, then to multiply those 4 maps by the gradients (2 vertical and 2 horizontal gradients) and add them together so that only 1 map remains. However, that would be a linear combination of the different maps, which is simpler than truly using a nested neural network that alters the kernel in the first place.
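Here is a rough PyTorch sketch of that idea. The blend weights (0.5 ± gx) and (0.5 ± gy) built from the two coordinate gradients are my own choice of weighting, just to illustrate the per-pixel linear combination:

```python
import torch
import torch.nn as nn

class GradientBlendedConv(nn.Module):
    """Four conv passes blended per-pixel by the coordinate gradients."""
    def __init__(self, in_ch, out_ch, k=3):
        super().__init__()
        self.convs = nn.ModuleList(
            nn.Conv2d(in_ch, out_ch, k, padding=k // 2) for _ in range(4))

    def forward(self, x):
        n, _, h, w = x.shape
        gy, gx = torch.meshgrid(
            torch.linspace(-0.5, 0.5, h, device=x.device),
            torch.linspace(-0.5, 0.5, w, device=x.device),
            indexing="ij")
        gx, gy = gx[None, None], gy[None, None]  # broadcast over batch/channels
        m = [c(x) for c in self.convs]
        # per-pixel linear combination of the four feature maps; equivalent
        # to a position-dependent kernel that is linear in (gx, gy)
        return ((0.5 + gx) * m[0] + (0.5 - gx) * m[1]
                + (0.5 + gy) * m[2] + (0.5 - gy) * m[3])

y = GradientBlendedConv(3, 8)(torch.randn(1, 3, 32, 32))  # -> (1, 8, 32, 32)
```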
Thinking more about it, here is a solution to an equivalent question. The thing with this solution is that it flips the problem around by placing the "mini neural net" after the convolution rather than before, and in a quite different way. So it solves the problem, but offers a much different optimization space and convergence behavior, which is less natural for me to think about than the way I originally formulated the problem.
In a sense, a solution to the problem could be to simply concatenate the two gradients to a regular feature map (from a regular convolution), so as to have a depth of d_2 = d_1 + 2 after the concatenation, and then to perform more convolutions on top of this. I won't prove why this is a valid solution to an equivalent problem, but I have thought it through and it seems provable.
The optimization space (for the weights) would be very different here, and I think it wouldn't converge with the same behavior. I'd like to know what you think about this solution in terms of optimization convergence.
The reason convolutions are more efficient than fully connected layers is that they are translation invariant. If you wish to have convolutions that depend on location, you would need to add two extra parameters to the convolution, i.e. have N+2 input channels where the x, y coordinates are the values of the two additional channels (as in e.g. CoordConv).
As for alternative solutions: is the gradient meaningful? If not, and it is uniform across all images, it might be better to just remove it manually in the pre-processing stage (similar to orientation correction, cropping, etc.). If it is meaningful (e.g. it reflects differences in lighting or shadows), then including additional layers, under the assumption that they will learn invariance to the different lightings, is a common hands-off approach.
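A minimal sketch of that CoordConv-style input, using the question's -0.5 to 0.5 coordinate range (my own generic implementation, not the CoordConv reference code):

```python
import torch
import torch.nn as nn

def with_coord_channels(x):
    """Append x/y coordinate gradients in [-0.5, 0.5] as two extra channels."""
    n, _, h, w = x.shape
    gy, gx = torch.meshgrid(
        torch.linspace(-0.5, 0.5, h, device=x.device),
        torch.linspace(-0.5, 0.5, w, device=x.device),
        indexing="ij")
    coords = torch.stack([gx, gy])[None].expand(n, -1, -1, -1)
    return torch.cat([x, coords], dim=1)  # depth d_2 = d_1 + 2

x = torch.randn(1, 3, 32, 32)
conv = nn.Conv2d(3 + 2, 8, 3, padding=1)
y = conv(with_coord_channels(x))  # a plain conv now "sees" position
```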
I know about Gaussians, variance, and image blurring, and I think I understand the concept of variance in Gaussian blur, but I am still not 100% sure.
I just want to know the role of sigma (or variance) in Gaussian smoothing. I mean, what happens when we increase the value of sigma for the same window size, and why does it happen?
It would be really helpful if somebody could point me to some good literature about it. (I already tried a few sources but couldn't find what I am looking for.)
Major confusion:
Higher frequency -> details (e.g. noise),
Lower frequency -> kind of an overview of the image.
By increasing sigma, I thought we were allowing some higher frequencies, so we should get more detail with increasing sigma, but the opposite is the case: when we increase sigma, the image becomes more blurry.
I think this is best explained in a few steps, first from the signal-processing point of view:
A Gaussian filter is a low-pass filter. Low-pass filters, as their name implies, pass (i.e. keep) low frequencies. When we look at the image in the frequency domain, the highest frequencies occur at the edges, the places where there is a large change in intensity.
The role of sigma in the Gaussian filter is to control the spread of the filter around its mean value: the larger the sigma, the more variance is allowed around the mean, and the smaller the sigma, the less variance is allowed around the mean.
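A short aside that may resolve the frequency confusion in the question (a standard Fourier-transform fact, not part of the original answer): the Fourier transform of a Gaussian is another Gaussian with reciprocal width,

$$ g(x) = e^{-x^2/(2\sigma^2)} \quad\longrightarrow\quad \hat{g}(\omega) \propto e^{-\sigma^2 \omega^2 / 2}, $$

so the frequency-domain standard deviation is $1/\sigma$. Increasing the spatial sigma therefore narrows the pass-band and removes more high frequencies, which is why the image gets blurrier, not sharper.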
Filtering in the spatial domain is done through convolution. It simply means that we apply a kernel to every pixel in the image. There is a rule for smoothing kernels: their coefficients have to sum to one, so that the overall brightness of the image is preserved (it is derivative kernels, such as Sobel, whose coefficients sum to zero).
Now, putting it all together: when we apply a Gaussian filter to an image, we are doing low-pass filtering. But as you know, this happens in the discrete domain (image pixels), so we have to quantize our Gaussian filter in order to make a Gaussian kernel. In this quantization step, the smaller the sigma of the Gaussian filter (GF), the steeper its peak, so more of the weight is concentrated in the center and less around it.
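To see the quantization effect numerically, here is a tiny sketch (my own, with an arbitrary 7-tap window) that prints the normalized weights for a small and a large sigma:

```python
import numpy as np

def gaussian_kernel(sigma, size=7):
    x = np.arange(size) - size // 2
    k = np.exp(-x**2 / (2 * sigma**2))
    return k / k.sum()  # normalize so the weights sum to one

for sigma in (0.8, 2.0):
    print(sigma, np.round(gaussian_kernel(sigma), 3))
# small sigma -> weight concentrated at the center (sharp peak, little blur)
# large sigma -> weight spread across the window (more blur)
```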
In the sense of natural image statistics: scientists in this field have shown that our visual system behaves somewhat like a Gaussian filter in its responses to images. For example, take in a broad scene without paying attention to any specific point: you see a scene with lots of things in it, but the details are not clear. Now look at one specific point in that scene: you see details that you didn't see before. This is where sigma comes in. When you increase sigma, you are looking at the broad scene without paying attention to the details that exist, and when you decrease it, you get more details.
I think Wikipedia can help more than me: Low Pass Filters, Gaussian Blur.
Put simply, increasing the sigma terms will cast a broader net over the neighboring pixels and decrease the impact of the pixels nearest the pixel of interest, i.e. it makes a blurrier image.
I am trying to identify which parts of a picture are in focus and which are blurred, something like this:
But HOW do I do that? Any ideas on how to measure this? I've read something about finding the high frequencies, but how could that produce a picture like those?
Cheers,
Any image will be sharpest at its optimum focus. Take advantage of that: run the Sobel operator or the Laplace operator, or any kind of difference (derivative) filter, and sum the results pixel by pixel; the image with the highest sum is the best focused one.
Edit:
There will be additional constraints depending on how much additional information you have, e.g. multiple samples, similarity of objects in the image, etc.
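A minimal OpenCV sketch of that idea (the variance-of-Laplacian variant in the comment is a common alternative, not something from this answer):

```python
import cv2
import numpy as np

def sharpness(path):
    img = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
    gx = cv2.Sobel(img, cv2.CV_64F, 1, 0)
    gy = cv2.Sobel(img, cv2.CV_64F, 0, 1)
    return np.abs(gx).sum() + np.abs(gy).sum()
    # common alternative: cv2.Laplacian(img, cv2.CV_64F).var()

# the frame with the highest score should be the best focused one:
# best = max(paths, key=sharpness)
```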
Check out this paper for more precision than the Laplace filter. In my problem with 4K images, the Laplace filter was insufficient for detecting blurs and out-of-focus regions.
https://github.com/facebookresearch/DeepFocus
Edit: Blur detection with deep learning has a number of approaches. Choose the method that best suits your needs :)
I'm working on a signal processing problem. I'm extracting some features to feed a classifier. Among these features is the sum of the first 5 FFT coefficients. As you know, the first FFT coefficients indicate how dominant the low-frequency components of a signal are. This is very close to what a low-pass filter gives.
Here I'm wondering whether computing an FFT just to take those first 5 coefficients is unnecessary work. I think applying a low-pass filter will just eliminate the high-frequency components, and it won't have a significant effect on the primary FFT coefficients. However, there may be some other way, in combination with a low-pass filter, to extract the same information (that is contained in the first five FFT coefficients) without using the FFT.
Do you have any ideas or suggestions regarding this issue?
Thanks in advance.
If you just need an indicator of the low-frequency part of a signal, I suggest doing something really simple. Just take an ordinary low-pass filter, for instance a 2nd-order Butterworth, with the cutoff frequency set appropriately (5 Hz in your case, if I understood right). Then compute the energy (sum of squared values) or the RMS value over your window (length 100). Or perhaps take the ratio of the low-frequency energy to the overall energy of the window, to get a relative measure. That should give you a pretty good indicator of the low-frequency contributions of your signal.
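A minimal SciPy sketch of this suggestion, assuming the 5 Hz cutoff from the thread and a known sampling rate fs:

```python
import numpy as np
from scipy import signal

def lowfreq_ratio(x, fs, cutoff=5.0):
    # 2nd-order Butterworth low-pass
    b, a = signal.butter(2, cutoff, btype="low", fs=fs)
    low = signal.lfilter(b, a, x)
    # ratio of low-frequency energy to total energy over the window
    return np.sum(low**2) / np.sum(np.asarray(x, dtype=float)**2)
```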
People tend to overuse the FFT for all kinds of really simple tasks. In 90% of the use cases an FFT can be replaced by a simpler algorithm.
It seems you should take a look at the Goertzel algorithm; for the seemingly limited number of frequencies you need, it should take less computation. After updating the feedback part on each sample, you can select how often to generate your "feature metric", and a little additional weighting of the results can yield a respectable low-pass filter.
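For reference, a minimal sketch of the Goertzel recurrence for a single DFT bin k over an N-sample window (my own generic implementation; run one instance per frequency of interest):

```python
import math

def goertzel_power(samples, k):
    """Squared magnitude of DFT bin k, via the Goertzel recurrence."""
    n = len(samples)
    coeff = 2.0 * math.cos(2.0 * math.pi * k / n)
    s1 = s2 = 0.0
    for x in samples:           # feedback part, updated on every sample
        s1, s2 = x + coeff * s1 - s2, s1
    # feedforward part, evaluated once per window
    return s1 * s1 + s2 * s2 - coeff * s1 * s2
```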