I'm trying to evaluate the complexity of some basic image filtering algorithms. I was wondering if you could verify this theory:
For a basic pixel-by-pixel filter like Inverse, the number of operations grows linearly with the size of the input (in pixels).
Let S = Length of the side of the image
Let M = # pixels input
Inverse is of order O(M) or O(S^2).
A convolution filter, on the other hand, has a parameter R which determines the size of the neighbourhood to convolve over when establishing the new value of each pixel.
Let R = Radius of convolution filter
Convolution is of order O(M*(2R+1)^2) = O(M*4R^2) = O(MR^2)
Or should I let N = the size of the convolution filter (Neighbourhood) in pixels?
O(M*(N)) = O(MN)
Ultimately a convolution filter is linearly dependent on the product of the number of pixels and the number of pixels in the neighbourhood.
If you have any links to a paper where this has been documented it would be greatly appreciated.
Kind regards,
Gavin
O(MN) seems right if I understand correctly that, for each pixel in the image, the convolution adjusts the pixel values in the neighbourhood N, regardless of N being square. N could be a best-fit triangle ... but provided the pixels in the neighbourhood are adjusted for each pixel in the image, O(MN) makes sense, because the dependency is on the number of pixels adjusted per pixel of the source image.
Interestingly, in a non-regular neighbourhood some pixels may be adjusted by the neighbourhood mask more often than others, but O(MN) will still stand.
If, however, the neighbourhood is centred on a pixel P and then moved to the next P that was not in the previous neighbourhood (meaning each pixel is transformed only once), then this doesn't hold.
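To make the counting concrete, here is a rough MATLAB sketch of a naive convolution (my own illustration, not taken from any paper): the two outer loops run over the M image pixels and the inner patch touches the N = (2R+1)^2 mask pixels, giving O(MN) = O(MR^2) operations.
img  = double(imread('cameraman.tif'));     % any grayscale test image
R    = 2;                                   % filter radius
mask = ones(2*R+1) / (2*R+1)^2;             % e.g. a simple box (mean) filter
[rows, cols] = size(img);
out  = zeros(rows, cols);
for y = 1+R : rows-R                        % ~M output pixels (borders ignored)
    for x = 1+R : cols-R
        patch    = img(y-R:y+R, x-R:x+R);   % N = (2R+1)^2 pixels visited per output pixel
        out(y,x) = sum(sum(patch .* mask));
    end
end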
I have an Image I
I am trying to do Automatic Object Extraction using Quantum Mechanics
Each pixel in an image is considered as a potential field, V(x,y) and hence each wave (eigen) function represents a meaningful region.
The 2D time-independent Schrödinger equation is
$$-\frac{\hbar^2}{2m}\nabla^2\psi(x,y) + V(x,y)\,\psi(x,y) = E\,\psi(x,y)$$
Multiplying both sides by $-\frac{2m}{\hbar^2}$, we get
$$\nabla^2\psi = \frac{2m}{\hbar^2}\bigl(V(x,y) - E\bigr)\,\psi$$
Rewriting the Laplacian using the finite-difference approach,
$$\nabla^2\psi_i \approx \sum_{j \in N_i}\psi_j - |N_i|\,\psi_i$$
where $N_i$ is the set of neighbours of pixel $i$, and $|N_i|$ is the cardinality of $N_i$, i.e. the number of elements in $N_i$.
Combining the above two equations, we get:
$$\sum_{j \in N_i}\psi_j - M\,\psi_i = \frac{2m}{\hbar^2}\bigl(V_i - E\bigr)\,\psi_i$$
where $M = |N_i|$ is the number of elements in $N_i$.
Now, the left-hand side of the equation is a measure of how similar the labels in a neighbourhood are, i.e. a measure of spatial coherence.
Now, for applying this to images, the potential V is given by the pixel intensities.
The right hand side is a measure of how close the pixel values in a segment are to a constant value E.
Now, the wave functions can be numerically calculated by solving for the eigenvectors of the Hamiltonian operator in matrix form, which is
$$H_{ij} = \begin{cases} |N_i| + \dfrac{2m}{\hbar^2}\,V_i & \text{for } i = j \\[4pt] -1 & \text{for } j \in N_i \\[4pt] 0 & \text{elsewhere} \end{cases}$$
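A minimal sketch of how this can be set up numerically (my own reading of the equations above, not code from the paper): the pixel grid is treated as 4-connected, the constant 2m/hbar^2 is set to 1, and border pixels are handled with a simple Dirichlet boundary so that interior pixels have |N_i| = 4. eigs then returns a few eigenvectors with eigenvalues near a chosen value E0.
V = double(imread('cameraman.tif'));        % potential = pixel intensities
[rows, cols] = size(V);
% 1D second-difference matrices; the 2D 5-point (negative) Laplacian via Kronecker products
ey = ones(rows, 1);  ex = ones(cols, 1);
Ty = spdiags([-ey 2*ey -ey], -1:1, rows, rows);
Tx = spdiags([-ex 2*ex -ex], -1:1, cols, cols);
L  = kron(speye(cols), Ty) + kron(Tx, speye(rows));   % diagonal ~ |N_i|, -1 for neighbours
H  = L + spdiags(V(:), 0, rows*cols, rows*cols);      % H = -Laplacian + diag(V)
E0 = 100;                                   % one of the regularly sampled values
[psi, ~] = eigs(H, 6, E0);                  % eigenvectors with eigenvalues closest to E0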
Now, in this paper it is said that first we have to find the maximum and minimum eigenvalues, and then calculate the eigenvectors with eigenvalues closest to a number of values regularly selected between the minimum and maximum eigenvalues; this number of values is 300.
I have calculated the 300 eigenvectors.
And then the absolute square of the eigenvectors are thresholded to obtain the segments.
Fine up to this part.
Now, how do I reconstruct the eigenvectors into a 2D image so as to get the potential segments in the image?
I am reading the paper Achanta - SLIC Superpixel segmentation, where it says that every superpixel cluster center is located at a distance of S = sqrt(N/K), that the expected spatial extent of a superpixel is a region of S x S, and that the search for similar pixels is done in a spatial region of 2S x 2S.
Can someone please explain this point to me, as I am stuck on it?
From the paper:
Our algorithm takes as input a desired number of approximately equally-sized
superpixels K.
So, let's assume that our SP are approximately squares. You will have K of them.
For an image with N pixels, the approximate size of each superpixel
is therefore N/K pixels
If you divide the image area N into K SPs, every SP has (almost) N/K pixels, i.e. the area of each SP is N/K.
For roughly equally sized superpixels there would be a superpixel center at every grid interval S = sqrt(N/K).
Each SP is assumed to be square, with area N/K. The side of the square will then be sqrt(area) = sqrt(N/K) = S. This means that an SP center is a distance S away from its neighbours' centers.
Since the spatial extent of any superpixel is approximately S^2 (the approximate area of a superpixel)
Well, the side of each square is S, then its area is S^2 (which is the same as N/K = sqrt(N/K)^2 = S^2).
we can safely assume that pixels that are associated with this cluster
center lie within a 2S × 2S area around the superpixel center
We mentioned that each side of the square will be S, so every pixel of the SP should lie within half the diagonal of the square from the center, which is S*sqrt(2)/2 (roughly 0.71*S), i.e. less than the side S. But SPs are not exactly squares, so we want to be a little more flexible and allow pixels up to roughly double that distance, which is why the search is done in a 2S x 2S window around the center.
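A small sketch of the geometry just described (my own illustration, not the authors' code): with N pixels and K superpixels the grid step is S = sqrt(N/K), the initial cluster centers sit on that grid, and each center only searches a 2S x 2S window around itself.
img = imread('peppers.png');                % any test image
[rows, cols, ~] = size(img);
N = rows * cols;
K = 200;                                    % desired number of superpixels
S = round(sqrt(N / K));                     % grid interval = expected superpixel side
% initial cluster centers on a regular grid with spacing S
[cy, cx] = ndgrid(round(S/2):S:rows, round(S/2):S:cols);
% the 2S x 2S search window of the first center, clamped to the image
y0 = max(cy(1) - S, 1);   y1 = min(cy(1) + S, rows);
x0 = max(cx(1) - S, 1);   x1 = min(cx(1) + S, cols);
window = img(y0:y1, x0:x1, :);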
Assume that I have a grayscale (8-bit) image and an integral image created from that same image.
Image resolution is 720x576. According to the SURF algorithm, each octave is composed of 4 box filters, which are defined by the number of pixels on their side. The first octave uses filters of 9x9, 15x15, 21x21 and 27x27 pixels. The second octave uses filters of 15x15, 27x27, 39x39 and 51x51 pixels. The third octave uses filters of 27x27, 51x51, 75x75 and 99x99 pixels. If the image is sufficiently large, and I guess 720x576 is big enough (right??!!), a fourth octave is added: 51x51, 99x99, 147x147 and 195x195. These octaves partially overlap one another to improve the quality of the interpolated results.
// so, we have:
//
// 9x9 15x15 21x21 27x27
// 15x15 27x27 39x39 51x51
// 27x27 51x51 75x75 99x99
// 51x51 99x99 147x147 195x195
The questions are: What are the values in each of these filters? Should I hardcode these values, or should I calculate them? How exactly (numerically) do I apply the filters to the integral image?
Also, for calculating the Hessian determinant I found two approximations:
det(HessianApprox) = Dxx*Dyy − (0.9*Dxy)^2 and det(HessianApprox) = Dxx*Dyy − 0.81*Dxy^2. Which one is correct?
(Dxx, Dyy, and Dxy are Gaussian second order derivatives).
I had to go back to the original paper to find the precise answers to your questions.
Some background first
SURF leverages a common Image Analysis approach for regions-of-interest detection that is called blob detection.
The typical approach for blob detection is a difference of Gaussians.
There are several reasons for this, the first one being to mimic what happens in the visual cortex of the human brain.
The drawback of the difference of Gaussians (DoG) is the computation time, which is too expensive to apply to large image areas.
In order to bypass this issue, SURF takes a simple approach. A DoG is simply the computation of two Gaussian averages (or equivalently, applying a Gaussian blur at two scales) followed by taking their difference.
A quick-and-dirty approximation (not so dirty for small regions) is to approximate the Gaussian blur by a box blur.
A box blur is the average value of all the image values in a given rectangle. It can be computed efficiently via integral images.
Using integral images
Inside an integral image, each pixel value is the sum of all the pixels above it and to its left in the original image (usually with an extra row and column of zeros added at the top and left).
The top-left value of the integral image is thus 0, and the bottom-rightmost value is the sum of all the original pixels.
Then, you just need to remark that the box blur is the sum of all the pixels inside a given rectangle (not necessarily originating at the top-leftmost pixel of the image), divided by the rectangle's area, and apply the following simple geometric reasoning.
If you have a rectangle with corners ABCD (top left, top right, bottom left, bottom right), then the value of the box filter is given by:
boxFilter(ABCD) = A + D - B - C,
where A, B, C, D is a shortcut for IntegralImagePixelAt(A) (B, C, D respectively).
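Here is a minimal MATLAB sketch of that reasoning (my own illustration; the padding with a row and column of zeros is just so the A + D - B - C rule also works on the image border):
img    = double(imread('cameraman.tif'));   % any grayscale test image
intImg = cumsum(cumsum(img, 1), 2);         % integral image: intImg(y,x) = sum of img(1:y,1:x)
padded = padarray(intImg, [1 1], 0, 'pre'); % zero row/column on top and left
% sum over rows r1..r2 and columns c1..c2 via four lookups
boxSum = @(r1, c1, r2, c2) padded(r2+1, c2+1) + padded(r1, c1) ...
                         - padded(r1, c2+1)   - padded(r2+1, c1);
% example: a 9x9 box blur centred on pixel (50, 60)
blurVal = boxSum(50-4, 60-4, 50+4, 60+4) / 81;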
Integral images in SURF
SURF is not using box blurs of sizes 9x9, etc. directly.
What it uses instead is several orders of Gaussian derivatives, or Haar-like features.
Let's take an example. Suppose you are to compute the 9x9 filters output. This corresponds to a given sigma, hence a fixed scale/octave.
The sigma being fixed, you center your 9x9 window on the pixel of interest. Then, you compute the output of the 2nd order Gaussian derivative in each direction (horizontal, vertical, diagonal). The Fig. 1 in the paper gives you an illustration of the vertical and diagonal filters.
The Hessian determinant
There is a factor to take into account for the scale differences. Let's believe the paper that the determinant is equal to:
Det = Dxx*Dyy - (0.9 * Dxy)^2.
Since 0.9^2 = 0.81, this is exactly the same as Det = Dxx*Dyy - 0.81*Dxy^2, so both formulas you found are correct; they are the same expression written in two ways.
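Just to illustrate how the determinant response is assembled once you have the three responses (this sketch uses plain finite-difference kernels as stand-ins, not the actual SURF box filters from the paper):
img = double(imread('cameraman.tif'));
Dxx = conv2(img, [1 -2 1], 'same');                     % second derivative in x
Dyy = conv2(img, [1; -2; 1], 'same');                   % second derivative in y
Dxy = conv2(img, [1 0 -1; 0 0 0; -1 0 1] / 4, 'same');  % mixed derivative
detH = Dxx .* Dyy - (0.9 * Dxy).^2;                     % identical to Dxx.*Dyy - 0.81*Dxy.^2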
Look at page 17 of this document
http://www.sci.utah.edu/~fletcher/CS7960/slides/Scott.pdf
If you have already written code for a normal 2D Gaussian convolution, you can simply use the box filter as the kernel and apply it to the original image (not the integral image); the results will be the same as with the method you asked about.
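For instance, a small sketch of that idea (my own, with a plain 9x9 box standing in for the SURF lobes):
img = double(imread('cameraman.tif'));
h   = ones(9) / 81;                         % 9x9 box kernel used directly as the convolution kernel
out = conv2(img, h, 'same');                % applied to the original image, not the integral image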
I am trying to blur a scanned text document to the point that the text lines are blurred to black. I mean the text blends into each other and all I see are black lines.
I'm new to MATLAB and even though I know the basics I cannot get the image to blur properly. I have read this: Gaussian Blurr, and according to that the blur is managed/decided by the sigma parameter. But that is not how it works in the code I wrote.
While trying to learn Gaussian blurring in MATLAB I came to find out that it's achieved by using this function: fspecial('gaussian',hsize,sigma);
So apparently there are two variables: hsize specifies the number of rows or columns in the kernel, while sigma is the standard deviation.
Can someone please explain the significance of hsize here, and why it has a much deeper effect on the result, even more than sigma?
Why is it that even if I increase sigma to a very high value the blur is not affected, but the image is distorted a lot by increasing hsize?
here is my code:
img = imread('c:\new.jpg');            % the scanned text document
hsize = 5;                             % example values; these are the two parameters I have been varying
sigma = 10;
h = fspecial('gaussian',hsize,sigma);
out = imfilter(img,h);
imshow(out);
and the results are attached:
Why is it not only controlled by sigma? What role does hsize play? Why can't I get it to blur the text only rather than distort the entire image?
Thank you
hsize refers to the size of the filter. Specifically, a filter that is Nx x Ny pixels uses a pixel region Nx x Ny in size, centred around each pixel, when computing the response of the filter. The response is just how the pixels in that region are combined together. In the case of a Gaussian filter, the intensity at each pixel around the central one is weighted according to a Gaussian function prior to averaging over the region.
sigma refers to the standard deviation of the Gaussian (see the documentation for fspecial), with units in pixels. As you increase sigma (keeping the size of the filter the same) you eventually approach a simple box average with uniform weighting over the filter area around the central pixel, so you stop seeing an effect from increasing sigma. The similarity between the results obtained with a Gaussian blur (with a large value of sigma) and a box average is shown in the left and middle images below. The right image shows the result of eroding the image, which is probably what you want.
The code:
% img is the input image from the question
% gaussian filter:
hsize = 5;
sigma = 10;
h = fspecial('gaussian',hsize,sigma);
out_gaussian = imfilter(img,h);        % left image
% box filter of the same size:
h = fspecial('average',hsize);
out_box = imfilter(img,h);             % middle image
% erode:
se = strel('ball',4,4);
out_eroded = imerode(img,se);          % right image
Fspecial's Manual
h = fspecial('gaussian', hsize, sigma) returns a rotationally
symmetric Gaussian lowpass filter of size hsize with standard
deviation sigma (positive). hsize can be a vector specifying the
number of rows and columns in h, or it can be a scalar, in which case
h is a square matrix. The default value for hsize is [3 3]; the
default value for sigma is 0.5. Not recommended. Use imgaussfilt or
imgaussfilt3 instead.
where they say that fspecial('gaussian') is not recommended.
With fspecial, besides choosing the standard deviation (sigma), you still have to choose hsize, which also affects the blurring.
With imgaussfilt, you choose the standard deviation and the function takes care of the rest.
I can get much better tolerance levels with imgaussfilt and imgaussfilt3 on my system in MATLAB 2016a; example output is shown below.
im = im2double( imgGray );
sigma = 5;
simulatedPsfImage = imgaussfilt(im, sigma);
simulatedPsfImage = im2double( simulatedPsfImage );
[ measuredResolution, standardError, bestFitData ] = ...
EstimateResolutionFromPsfImage( simulatedPsfImage, [1.00 1.00] );
Note that the tolerance levels of fspecial are high [0.70 1.30] by default.
Gaussian smoothing is a common image processing function, and for an introduction to Gaussian filtering, please refer to here. As we can see, one parameter, the standard deviation, determines the shape of the Gaussian function. However, when we perform convolution with a Gaussian filter, another parameter, the window size of the Gaussian filter, must also be determined at the same time. For example, when we use the fspecial function provided by MATLAB, not only the standard deviation but also the window size must be provided. Intuitively, the larger the Gaussian standard deviation is, the bigger the Gaussian kernel window should be. However, there is no general rule about how to set the right window size. Any ideas? Thanks!
The size of the mask drives the amount of filtering. A larger size, corresponding to a larger convolution mask, will generally result in a greater degree of filtering. As a trade-off for greater noise reduction, larger filters also smooth away more of the fine detail in the image.
That is the general principle. Now, coming to the Gaussian filter, the standard deviation is the main parameter. If you use a 2D filter, at the edge of the mask you will probably want the weights to approximate 0.
In this respect, as I already said, you can choose a mask with a size that is generally three times the standard deviation. This way, almost the whole Gaussian bell is taken into account and at the mask's edges your weights will asymptotically tend to zero.
I hope this helps.
Here is a good reference.
1. After discretizing, pixels at a distance greater than 3*sigma have negligible weights. See this.
2. As already pointed out, 6*sigma implies 3*sigma in both directions from the centre.
3. The size of the convolution matrix used for filtering should therefore be about 6*sigma by 6*sigma, because of points 1 and 2 above.
Here is how you can obtain the discrete Gaussian.
Finally, the size of the standard deviation (and therefore of the kernel used) depends on how much noise you suspect to be in the image. Clearly, a larger convolution kernel implies that farther pixels get to contribute to the new value of the centre pixel, as opposed to a smaller kernel.
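As a small sketch of that rule of thumb (an assumption on my part, not a universal rule): take a radius of about 3*sigma on each side of the centre, so the kernel side is roughly 6*sigma and always odd.
sigma  = 2;
radius = ceil(3 * sigma);
hsize  = 2 * radius + 1;                    % ~6*sigma, odd so the kernel has a centre pixel
h = fspecial('gaussian', hsize, sigma);     % weights at the kernel edge are close to zero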
Given sigma and the minimal weight epsilon in the filter, you can solve for the necessary radius x of the filter:
$$\frac{1}{\sigma\sqrt{2\pi}}\,e^{-x^2/(2\sigma^2)} \ge \epsilon \quad\Longrightarrow\quad x \le \sigma\sqrt{-2\ln\!\left(\epsilon\,\sigma\sqrt{2\pi}\right)}$$
For example if sigma = 1 then the gaussian is greater than epsilon = 0.01 when x <= 2.715 so a filter radius = 3 (width = 2*3 + 1 = 7) is sufficient.
sigma = 0.5, x <= 1.48, use radius 2
sigma = 1, x <= 2.715, use radius 3
sigma = 1.5, x <= 3.84, use radius 4
sigma = 2, x <= 4.89, use radius 5
sigma = 2.5, x <= 5.88, use radius 6
If you reduce/increase epsilon then you will need a larger/smaller radius.
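A quick numerical check of the values above (my own sketch): solve (1/(sigma*sqrt(2*pi))) * exp(-x^2/(2*sigma^2)) = epsilon for x and round up to get the radius.
epsilon = 0.01;
for sigma = [0.5 1 1.5 2 2.5]
    x = sigma * sqrt(-2 * log(epsilon * sigma * sqrt(2*pi)));
    fprintf('sigma = %.1f -> x <= %.2f, radius %d, width %d\n', ...
            sigma, x, ceil(x), 2*ceil(x) + 1);
end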