Two Dimensional FFTW Help - image-processing

I'm currently trying to compute the FFT of an image via fftw_plan_dft_2d.
To use this function, I'm linearizing the image data into the in array and calling the function mentioned above (detailed below):
fftw_plan fftw_plan_dft_2d(int n0, int n1,
                           fftw_complex *in, fftw_complex *out,
                           int sign, unsigned flags);
The function fills the complex array out, whose size equals the number of pixels in the original image.
Do you know if this is the proper way of computing the 2D FFT of an image? If so, what does the data within out represent, i.e. where are the high and low frequency values in the array?
Thanks,
djs22

A 2D FFT is equivalent to applying a 1D FFT to each row of the image in one pass, followed by 1D FFTs on all the columns of the output from the first pass.
The output of a 2D FFT is organised just like the output of a 1D FFT, except that the complex values are indexed in x and y rather than in a single dimension. The DC (zero-frequency) component is at index [0][0]; spatial frequency increases with the x and y indices up to the Nyquist frequency at the midpoint of each dimension, and indices beyond that correspond to the negative frequencies.
There's a section in the FFTW manual (here) which covers the organisation of the real-to-complex 2D FFT output data, assuming that's what you're using.

It is.
Try to compute 2 plans:
plan1 = fftw_plan_dft_2d(image->rows, image->cols, in, fft, FFTW_FORWARD, FFTW_ESTIMATE);
plan2 = fftw_plan_dft_2d(image->rows, image->cols, fft, ifft, FFTW_BACKWARD, FFTW_ESTIMATE);
You'll obtain the original data in ifft (scaled by rows*cols, since FFTW's transforms are unnormalized).
Hope it helps :)
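For reference, here is a minimal C-style sketch of that round trip. It assumes a grayscale image already linearized in row-major order; pixels, rows and cols are placeholder names, not anything from the question.

#include <fftw3.h>

void fft_roundtrip(const double *pixels, int rows, int cols)
{
    int n = rows * cols;
    fftw_complex *in   = (fftw_complex*) fftw_malloc(sizeof(fftw_complex) * n);
    fftw_complex *fft  = (fftw_complex*) fftw_malloc(sizeof(fftw_complex) * n);
    fftw_complex *ifft = (fftw_complex*) fftw_malloc(sizeof(fftw_complex) * n);

    // Create the plans first (FFTW_ESTIMATE does not touch the arrays).
    fftw_plan plan1 = fftw_plan_dft_2d(rows, cols, in,  fft,  FFTW_FORWARD,  FFTW_ESTIMATE);
    fftw_plan plan2 = fftw_plan_dft_2d(rows, cols, fft, ifft, FFTW_BACKWARD, FFTW_ESTIMATE);

    // Copy the real pixel values into the complex input array.
    for (int i = 0; i < n; i++) {
        in[i][0] = pixels[i];  // real part
        in[i][1] = 0.0;        // imaginary part
    }

    fftw_execute(plan1);   // fft now holds the 2D DFT; the DC term is at fft[0]
    fftw_execute(plan2);   // ifft holds the original data scaled by rows*cols

    fftw_destroy_plan(plan1);
    fftw_destroy_plan(plan2);
    fftw_free(in);
    fftw_free(fft);
    fftw_free(ifft);
}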

Related

Reconstruct image from eigenvectors obtained from solving the eigenfunction of Hamiltonian operator in matrix form

I have an Image I
I am trying to do Automatic Object Extraction using Quantum Mechanics
Each pixel in an image is considered as a potential field, V(x,y) and hence each wave (eigen) function represents a meaningful region.
2D time-independent Schrödinger's equation:
    -(ħ²/2m) ∇²ψ(x,y) + V(x,y) ψ(x,y) = E ψ(x,y)
Multiplying both sides by a constant and rearranging, we get:
Rewriting the Laplacian using a finite-difference approach:
    ∇²ψ_i ≈ Σ_{j ∈ Ni} (ψ_j − ψ_i) = Σ_{j ∈ Ni} ψ_j − |Ni| ψ_i
where Ni is the set of neighbours of the pixel with index i, and |Ni| is the cardinality of Ni, i.e. the number of elements in Ni.
Combining the above two equations, we get:
where M is the number of elements in
Now, the left-hand side of the equation is a measure of how similar the labels in a neighbourhood are, i.e. a measure of spatial coherence.
Now, for applying this to images, the potential V is given by the pixel intensities.
The right-hand side is a measure of how close the pixel values in a segment are to a constant value E.
Now, the wave functions can be numerically calculated by solving for the eigenvectors of the Hamiltonian operator in matrix form, which is
    H(i, j) = V_i + |Ni|   for i = j
    H(i, j) = −1           for j ∈ Ni
    H(i, j) = 0            elsewhere
Now, in this paper it is said that first we have to find the maximum and minimum eigenvalues, and then calculate the eigenvectors with eigenvalues closest to a number of values regularly selected between the minimum and maximum eigenvalues. The number of such values is 300.
I have calculated the 300 eigenvectors.
The absolute square of each eigenvector is then thresholded to obtain the segments.
Fine up to this part.
Now, how do I reconstruct the eigenvectors into a 2D image so as to get the potential segments in the image?
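For reference, a minimal sketch of reshaping one eigenvector back to image dimensions and thresholding |ψ|², assuming the image was flattened row-major into a length rows*cols vector (OpenCV is used for convenience; the function name and parameters are hypothetical):

#include <opencv2/opencv.hpp>
using namespace cv;

Mat eigenvectorToSegment(const Mat& eigvec, int rows, int cols, double thresh)
{
    // Undo the flattening: a (rows*cols) x 1 eigenvector reshapes back to a
    // rows x cols image because the data order is unchanged.
    Mat psi = eigvec.reshape(1, rows);

    // |psi|^2 for a real-valued eigenvector is just the elementwise square.
    Mat prob;
    multiply(psi, psi, prob);

    // Threshold the probability map to obtain the segment as a binary mask.
    Mat segment = prob > thresh;
    return segment;
}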

Are Wavelet Coefficients simply the pixel values of decomposed image in 2D Discrete Wavelet Transform

I've been working with the Discrete Wavelet Transform, and I'm new to this theory. I want to access and modify the wavelet coefficients of the decomposed image. Are those wavelet coefficients simply the pixel values of the decomposed image in the 2D DWT?
This is for example the result of DWT Decomposition:
So, when I want to access and modify the Wavelet Coefficients, can I just iterate through the pixel values of above image? Thank you for your help.
No. The image is merely illustrative.
The image you are looking at does not exactly correspond to the original coefficients. The original wavelet coefficients are real numbers; what you are looking at are their absolute values quantized into the range 0 to 255.
It is not true that the coefficients were calculated as pairwise sums and differences of the input samples. The coefficients were calculated using two complementary filters. See the description here. However, the essential point is that the displayed coefficients were adjusted for visualization, and it is no longer possible to synthesize the original image from them. If you need to synthesize the image, you cannot use the pixels of the referenced image.
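To illustrate that last point, here is a minimal sketch (OpenCV; the function name is hypothetical) of how a display image like the one in the question is typically produced from the real-valued coefficients. The sign and scale are discarded, which is why the display pixels cannot be used for reconstruction:

#include <opencv2/opencv.hpp>
using namespace cv;

Mat coefficientsToDisplay(const Mat& coeffs)   // coeffs: real-valued, e.g. CV_32F
{
    Mat absCoeffs = abs(coeffs);               // magnitude only, sign is lost
    Mat display;
    normalize(absCoeffs, display, 0, 255, NORM_MINMAX, CV_8U);  // quantize to 8 bits
    return display;                            // for viewing only, not for synthesis
}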

Rotating 1d FFT to get 2D FFT?

I have a blurry image with a sharp edge and I want to use the profile of that sharp edge to estimate the point spread function (PSF) of the imaging system (assuming that it is symmetric). The profile of the edge gives me the "edge spread function" (ESF) and the derivative of that gives me the "line spread function" (LSF). I am trying to follow these directions that I found in an old paper on how to convert from the LSF to the PSF:
"If we form the one-dimensional Fourier transform of the LSF and rotate the resulting curve about its vertical axis, the surface thus generated proves to be the two-dimensional fourier transform of the PSF. Hence it is merely necessary to take a two-dimensional inverse Fourier transform to obtain the PSF"
I can't seem to get this to work. The 2D FFT of a PSF-like function (for example a 2D Gaussian) has lots of alternating positive and negative values, but if I rotate a 1D FFT, I get concentric rings of positive or negative values and the inverse transform looks nothing like a point-spread function. Am I missing a step or misunderstanding something? Any help would be appreciated! Thanks!
Edit: Here is some code showing my attempt to follow the procedure described
;generate x array
x=findgen(1000)/999*50-25
;generate gaussian test function in 1D
;P[0] = peak value
;P[1] = centroid
;P[2] = sigma
;P[3] = base level
P=[1.0,0.0,4.0,0.0]
test1d=gaussian_1d(x,P)
;Take the FFT of the test function
fft1d=fft(test1d)
;create an array with the frequency values for the FFT array, following the conventions used by IDL
;This piece of code to find freq is straight from IDL documentation: http://www.exelisvis.com/docs/FFT.html
N=n_elements(fft1d)
T=x[1]-x[0] ;T = sampling interval
fftx=(findgen((N-1)/2)+1)
is_N_even=(N MOD 2) EQ 0
if (is_N_even) then $
freq=[0.0,fftx,N/2,-N/2+fftx]/(N*T) $
else $
freq=[0.0,fftx,-(N/2+1)+fftx]/(N*T)
;Create a 1000x1000 array where each element holds the distance from the center
dim=1000
center=[(dim-1)/2.0,(dim-1)/2.0]
xarray=cmreplicate(findgen(dim),dim)
yarray=transpose(cmreplicate(findgen(dim),dim))
rarray=sqrt((xarray-center[0])^2+(yarray-center[1])^2)
rarray=rarray/max(rarray)*max(freq) ;scale rarray so max value is equal to highest freq in 1D FFT
;rotate the 1d FFT about zero to get a 2d array by interpolating the 1D function to the frequency values in the 2d array
fft2d=rarray*0.0
fft2d(findgen(n_elements(rarray)))=interpol(fft1d,freq,rarray(findgen(n_elements(rarray))))
;Take the inverse fourier transform of the 2d array
psf=fft(fft2d,/inverse)
;shift the PSF to be centered in the image
psf=shift(psf,500,500)
window,0,xsize=1000,ysize=1000
tvscl,abs(psf) ;visualize the absolute value of the result from the inverse 2d FFT
I don't know IDL, but I think your problem here is that you're taking the FFT of signals that are centered, where by default the function expects 0-frequency components at the beginning of the array.
A quick search for the proper way to do this in IDL indicates the CENTER keyword is what you're looking for.
CENTER
Set this keyword to shift the zero-frequency component to the center of the spectrum. In the forward direction, the resulting Fourier transform has the zero-frequency component shifted to the center of the array. In the reverse direction, the input is assumed to be a centered Fourier transform, and the coefficients are shifted back before performing the inverse transform.
Without letting the FFT routine know where the center of your signal is, it will seem shifted by N/2. In the converse domain this is a strong phase shift that will appear as if values are alternating positive and negative.
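To make that concrete: by the DFT shift theorem, translating a signal by N/2 samples multiplies its k-th spectral coefficient by exp(-i·2π·(N/2)·k/N) = exp(-i·π·k) = (-1)^k, so every other coefficient flips sign, which is exactly the alternating pattern described above.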
Ok, looks like I have solved the problem. The main issue seems to be that I needed to use the absolute value of the FFT results, rather than the complex array that is returned by default. Using the /CENTER keyword also helped make the indexing of the FFT result much simpler than IDL's default. Here is the working version of the code:
;generate x array
x=findgen(1000)/999*50-25
;generate lorentzian test function in 1D
;P[0] = peak value
;P[1] = centroid
;P[2] = fwhm
;P[3] = base level
P=[1.0,0.0,2,0.0]
test1d=lorentzian_1d(x,P)
;Take the FFT of the test function
fft1d=abs(fft(test1d,/center))
;Create an array of frequencies corresponding to the FFT result
N=n_elements(fft1d)
T=x[1]-x[0] ;T = sampling interval
freq=findgen(N)/(N*T)-N/(2*N*T)
;Create an array where each element holds the distance from the center
dim=1000
center=[(dim-1)/2.0,(dim-1)/2.0]
xarray=cmreplicate(findgen(dim),dim)
yarray=transpose(cmreplicate(findgen(dim),dim))
rarray=sqrt((xarray-center[0])^2+(yarray-center[1])^2)
rarray=rarray/max(rarray)*max(freq) ;scale rarray so max value is equal to highest freq in 1D FFT
;rotate the 1d FFT about zero to get a 2d array by interpolating the 1D function to the frequency values in the 2d array
fft2d=rarray*0.0
fft2d(findgen(n_elements(rarray)))=interpol(fft1d,freq,rarray(findgen(n_elements(rarray))))
;Take the inverse fourier transform of the 2d array
psf=abs(fft(fft2d,/inverse,/center))
;shift the PSF to be centered in the image
psf=shift(psf,dim/2.0,dim/2.0)
psf=psf/max(psf)
window,0,xsize=1000,ysize=1000
tvscl,real_part(psf) ;visualize the resulting PSF
;Test the performance by integrating the PSF in one dimension to recover the LSF
psftotal=total(psf,1)
plot,x*sqrt(2),psftotal/max(psftotal),thick=2,linestyle=2
oplot,x,test1d

OpenCV: Efficient Difference-of-Gaussian

I am trying to implement difference of Gaussians (DoG) for a specific case of edge detection. As the name of the algorithm suggests, it is actually fairly straightforward:
Mat g1, g2, result;
Mat img = imread("test.png", CV_LOAD_IMAGE_COLOR);
GaussianBlur(img, g1, Size(1,1), 0);
GaussianBlur(img, g2, Size(3,3), 0);
result = g1 - g2;
However, I have the feeling that this can be done more efficiently. Can it perhaps be done in fewer passes over the data?
The question here has taught me about separable filters, but I'm too much of an image processing newbie to understand how to apply them in this case.
Can anyone give me some pointers on how one could optimise this?
Separable filters give the same result as normal Gaussian filters, but are faster when the image size is large. The filter kernel can be formed analytically, and the filter can be separated into two one-dimensional vectors, one horizontal and one vertical.
for example..
consider the filter to be
1 2 1
2 4 2
1 2 1
this filter can be separated into a horizontal vector (H) 1 2 1 and a vertical vector (V) 1 2 1. These two filters are then applied to the image in sequence: H along the rows and V along the columns. Convolving with one and then the other gives the same result as the full 2D Gaussian blur. I'm providing a function that does the separable Gaussian blur. (Please don't ask me about the comments, I'm too lazy :P)
Mat sepConv(Mat input, int radius)
{
    Mat dst, dst2;
    int ksize = 2 * radius + 1;
    double sigma = radius / 2.575;

    // getGaussianKernel returns a ksize x 1 column vector
    Mat gau = getGaussianKernel(ksize, sigma, CV_32FC1);

    // apply the 1D kernel down the columns, then transpose the image and
    // apply it again; the final transpose restores the original orientation
    filter2D(input, dst2, -1, gau);
    filter2D(dst2.t(), dst, -1, gau);

    return dst.t();
}
One more method to improve the calculation of the Gaussian blur is to use the FFT. FFT-based convolution is much faster than the separable-kernel method if the data size is very large.
A quick Google search provided me with the following function:
Mat Conv2ByFFT(Mat A, Mat B)
{
    Mat C;
    // reallocate the output array if needed
    C.create(abs(A.rows - B.rows) + 1, abs(A.cols - B.cols) + 1, A.type());
    Size dftSize;
    // compute the size of the DFT transform
    dftSize.width = getOptimalDFTSize(A.cols + B.cols - 1);
    dftSize.height = getOptimalDFTSize(A.rows + B.rows - 1);
    // allocate temporary buffers and initialize them with 0's
    Mat tempA(dftSize, A.type(), Scalar::all(0));
    Mat tempB(dftSize, B.type(), Scalar::all(0));
    // copy A and B to the top-left corners of tempA and tempB, respectively
    Mat roiA(tempA, Rect(0, 0, A.cols, A.rows));
    A.copyTo(roiA);
    Mat roiB(tempB, Rect(0, 0, B.cols, B.rows));
    B.copyTo(roiB);
    // now transform the padded A & B in-place;
    // use the "nonzeroRows" hint for faster processing
    dft(tempA, tempA, 0, A.rows);
    dft(tempB, tempB, 0, B.rows);
    // multiply the spectrums;
    // the function handles packed spectrum representations well
    mulSpectrums(tempA, tempB, tempA, 0);
    // transform the product back from the frequency domain.
    // Even though all the result rows will be non-zero,
    // we need only the first C.rows of them, and thus we
    // pass nonzeroRows == C.rows
    dft(tempA, tempA, DFT_INVERSE + DFT_SCALE, C.rows);
    // now copy the result back to C
    tempA(Rect(0, 0, C.cols, C.rows)).copyTo(C);
    // all the temporary buffers will be deallocated automatically
    return C;
}
Hope this helps. :)
I know this post is old, but the question is interesting and may interest future readers. As far as I know, a DoG filter is not separable, so there are two solutions left:
1) Compute both convolutions by calling the function GaussianBlur() twice, then subtract the two images.
2) Make a kernel by computing the difference of two Gaussian kernels, then convolve it with the image (see the sketch just below this list).
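A minimal sketch of solution 2, assuming OpenCV; the kernel size and sigmas are hypothetical parameters, not values taken from the question:

#include <opencv2/opencv.hpp>
using namespace cv;

Mat dogFilter(const Mat& img, int ksize, double sigma1, double sigma2)
{
    // 1D Gaussian kernels (ksize x 1 column vectors)
    Mat g1 = getGaussianKernel(ksize, sigma1, CV_32F);
    Mat g2 = getGaussianKernel(ksize, sigma2, CV_32F);

    // Outer products give the 2D kernels; their difference is the DoG kernel.
    Mat dog = g1 * g1.t() - g2 * g2.t();

    // Single convolution pass over the image with the combined kernel.
    Mat result;
    filter2D(img, result, CV_32F, dog);
    return result;
}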
About which solution is faster:
Solution 2 seems faster at first sight because it convolves the image only once.
But it does not involve a separable filter. On the contrary, the first solution involves two separable filters and may be faster in the end. (I do not know how the OpenCV function GaussianBlur() is optimised and whether it uses separable filters or not, but it is likely.)
However, if one uses the FFT technique to convolve, the second solution is surely faster.
If anyone has any advice to add or wishes to correct me, please do.

gaussian blur with FFT

I'm trying to implement a Gaussian blur with the use of the FFT, and found the following recipe here:
This means that you can take the Fourier transform of the image and the filter, multiply the (complex) results, and then take the inverse Fourier transform.
I've got a kernel K, a 7x7 matrix, and an image I, a 512x512 matrix.
I do not understand how to multiply K by I.
Is the only way to do that to make K as big as I (512x512)?
Yes, you do need to make K as big as I by padding it with zeros. Also, after padding, but before you take the FFT of the kernel, you need to translate it with wraparound, such that the center of the kernel (the peak of the Gaussian) is at (0,0). Otherwise, your filtered image will be translated. Alternatively, you can translate the resulting filtered image once you are done.
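For concreteness, here is a minimal sketch of that padding-plus-wraparound step (OpenCV types are used for convenience; the function name is hypothetical). The four quadrants of the kernel end up in the four corners of the padded array, so the centre tap lands at (0,0):

#include <opencv2/opencv.hpp>
using namespace cv;

Mat padAndCenterKernel(const Mat& K, Size imageSize)   // e.g. 7x7 kernel, 512x512 image
{
    CV_Assert(K.rows % 2 == 1 && K.cols % 2 == 1);     // odd-sized kernel
    int cy = K.rows / 2, cx = K.cols / 2;              // kernel centre (3,3 for 7x7)
    int W = imageSize.width, H = imageSize.height;
    Mat padded = Mat::zeros(imageSize, K.type());

    // bottom-right quadrant of K (including the centre) -> top-left corner
    K(Rect(cx, cy, K.cols - cx, K.rows - cy)).copyTo(padded(Rect(0, 0, K.cols - cx, K.rows - cy)));
    // bottom-left quadrant -> top-right corner
    K(Rect(0, cy, cx, K.rows - cy)).copyTo(padded(Rect(W - cx, 0, cx, K.rows - cy)));
    // top-right quadrant -> bottom-left corner
    K(Rect(cx, 0, K.cols - cx, cy)).copyTo(padded(Rect(0, H - cy, K.cols - cx, cy)));
    // top-left quadrant -> bottom-right corner
    K(Rect(0, 0, cx, cy)).copyTo(padded(Rect(W - cx, H - cy, cx, cy)));

    // padded is now ready for a DFT of the same size as the image.
    return padded;
}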
Another point: for small kernels not using the FFT may actually be faster. A 2D Gaussian kernel is separable, meaning that you can separate it into two 1D kernels for x and y. Then instead of a 2D convolution, you can do two 1D convolutions in x and y directions in the spatial domain. For smaller kernels that may end up being faster than doing the convolution in the frequency domain using the FFT.
If you are comfortable with pixel shaders, and if FFT is not your main goal here but convolution with a Gaussian blur kernel is, then I can recommend my tutorial on what convolution is.
Regards.
