If the maximum squared magnitude M = A^2 + B^2 of a DFT corresponds to the frequency F
(A is the real part and B the imaginary part of the DFT output at frequency F),
then is it correct to do the following:
for (j = 0; j < size; ++j) {
data[j] -= (A*cos(2*PI*F*j/dfts) -
B*sin(2*PI*F*j/dfts)) / dfts;
}
in order to cancel (subtract) the influence of that frequency from the original wave data?
The data is assumed to be a sum of several sines and cosines with different frequencies, each multiplied by a different coefficient.
EDIT1:
I could achieve the cancelation by subtracting, and the result is correct. There was a mistake in the code above, but it is possible to do it. If interested I can post the way of doing it.
EDIT2:
And if you run the DFT again afterwards, you will get very small, near-zero values for A and B at that frequency.
But you need to remember that even if the original data is the sum of only 4 frequencies (sines and cosines), the DFT will still give you as many frequency bins as the DFT size divided by 2.
No - that won't work. It could only work if the time domain component matched the FFT bin frequency exactly and the phase of the component is constant throughout the sample window, and even if this were the case you'd still need to take care of phase in your subtraction.
Ideally you need to remove (i.e. zero) the component in the frequency domain and then do an inverse FFT. Note that you probably don't want to just zero the bin of interest in the frequency domain, as this will produce artefacts in the time domain after you inverse FFT - you'll need to apply a window function to the bin of interest and adjacent bins.
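As a rough illustration of the exact-bin case described above, here is a minimal NumPy sketch. It assumes a made-up test signal whose components fall exactly on DFT bins; `N`, the test signal, and the variable names are hypothetical. For a real input and a bin k strictly between 0 and N/2, the time-domain component is (2/N)(A cos(2πkn/N) - B sin(2πkn/N)), so the scale factor is 2/N rather than 1/N.

```python
import numpy as np

N = 64                       # DFT size (hypothetical)
n = np.arange(N)
# Test signal: two tones sitting exactly on bins 5 and 12 (an assumption)
x = 3.0 * np.cos(2 * np.pi * 5 * n / N + 0.7) + 1.5 * np.sin(2 * np.pi * 12 * n / N)

X = np.fft.rfft(x)
# Strongest bin, skipping DC and Nyquist
k = int(np.argmax(np.abs(X[1:N // 2])) + 1)
A, B = X[k].real, X[k].imag

# Time-domain component of bin k for a real signal (note the factor 2/N)
component = (2.0 / N) * (A * np.cos(2 * np.pi * k * n / N)
                         - B * np.sin(2 * np.pi * k * n / N))
residual = x - component

# The subtracted bin is now close to zero; the other tone is untouched
X_after = np.fft.rfft(residual)
```

If the tone does not sit exactly on a bin, its energy leaks into neighbouring bins and this subtraction leaves a residue, which is the caveat raised in the answer above.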
While learning an image denoising technique based on the bilateral filter, I encountered this tutorial, which provides a full list of the arguments used to run OpenCV's bilateralFilter function. What I see is slightly confusing, because there is no explanation of a mathematical rule that links the diameter value to the two sigma arguments. So, when picking specific arguments to pass into that function, I can hardly tell which diameter corresponds to a particular pair of sigma values.
Is there a dependency between the two deviations and the diameter? If my inference is correct, what equation (perhaps one given in the OpenCV documentation) should be used when applying the bilateral filter in a program?
According to the documentation, the bilateralFilter function in OpenCV takes a parameter d, the neighborhood diameter, as well as a parameter sigmaSpace, the spatial sigma. They can be selected separately, but if d "is non-positive, it is computed from sigmaSpace." For more details we need to look at the source code:
if( d <= 0 )
    radius = cvRound(sigma_space*1.5);
else
    radius = d/2;
radius = MAX(radius, 1);
d = radius*2 + 1;
That is, if d is not positive, the radius is taken as 1.5 times sigmaSpace (rounded), so d ends up at roughly 3 times sigmaSpace. d is also always forced to be odd, so that there is a central pixel in the neighborhood.
Note that the other sigma, sigmaColor, is unrelated to the spatial size of the filter.
In general, if one chooses a sigmaSpace that is too large for the given d, then the Gaussian kernel will be cut off in a way that makes it not appear like a Gaussian, and it will lose its nice filtering properties (see for example here for an explanation). If it is taken too small for the given d, then many pixels in the neighborhood will always have a near-zero weight, meaning that computational work is wasted. The default radius of 1.5 times sigmaSpace is rather small (one typically uses a radius of 3 times sigma for Gaussian filtering), but is still quite reasonable given the computational cost of the bilateral filter (a smaller neighborhood is cheaper).
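To make the rule in the quoted source easier to experiment with, here is a small Python sketch that mirrors it. The function name `bilateral_diameter` is made up for illustration, and Python's `round` is used as a stand-in for `cvRound`; the two may differ in how exact .5 ties are handled, so treat tie cases as unverified.

```python
def bilateral_diameter(d, sigma_space):
    """Mirror of the quoted OpenCV snippet: effective neighborhood diameter."""
    if d <= 0:
        # Non-positive d: derive the radius from the spatial sigma
        radius = round(sigma_space * 1.5)
    else:
        radius = d // 2
    radius = max(radius, 1)      # at least a 3x3 neighborhood
    return radius * 2 + 1        # always odd, so there is a central pixel


# Example: d = 0 with sigmaSpace = 10 gives radius 15, i.e. diameter 31
print(bilateral_diameter(0, 10.0))
```

Note that an even d is bumped up to the next odd value, for example d = 8 gives an effective diameter of 9.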
These two values (d and sigma) are totally unrelated to each other. Sigma determines the values of the pixels of the kernel, but d determines the size of the kernel.
For example consider this Gaussian filter with sigma=1:
It's a filter kernel, and as you can see the pixel values of the kernel depend only on sigma (the 3*3 matrix in the middle is equal in both kernels), but reducing the size of the kernel (or reducing the diameter) will make the outer pixels ineffective without affecting the values of the middle pixels.
And now if you change sigma (with k=3), the kernel is still 3*3 but the pixel values are different.
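Since the kernel images are not shown here, the point can be checked numerically with a short NumPy sketch. It assumes unnormalized Gaussian weights exp(-(x² + y²) / (2σ²)); the helper `gaussian_weights` is a hypothetical name, not an OpenCV function. If the kernels were normalized to sum to 1, the centre values would be rescaled by the kernel size as well.

```python
import numpy as np

def gaussian_weights(size, sigma):
    """Unnormalized 2D Gaussian weights on a size x size grid (size odd)."""
    r = size // 2
    y, x = np.mgrid[-r:r + 1, -r:r + 1]
    return np.exp(-(x ** 2 + y ** 2) / (2.0 * sigma ** 2))

k5 = gaussian_weights(5, 1.0)        # larger kernel, sigma = 1
k3 = gaussian_weights(3, 1.0)        # smaller kernel, same sigma
k3_wide = gaussian_weights(3, 2.0)   # same size, different sigma

# The central 3x3 of the 5x5 kernel equals the 3x3 kernel for the same sigma
same_centre = np.allclose(k5[1:4, 1:4], k3)
# Changing sigma changes the values while the size stays 3x3
sigma_changes_values = not np.allclose(k3, k3_wide)
```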
The DCT does not seem to do the conversion properly in OpenCV.
import cv2
import numpy as np
imf = np.float32(block)
dct = cv2.dct(imf)
[[154,123,123,123,123,123,123,136],
[192,180,136,154,154,154,136,110],
[254,198,154,154,180,154,123,123],
[239,180,136,180,180,166,123,123],
[180,154,136,167,166,149,136,136],
[128,136,123,136,154,180,198,154],
[123,105,110,149,136,136,180,166],
[110,136,123,123,123,136,154,136]]
This is a block of an image. When converting it with the code shown above,
this should be the result (for the sample block in https://www.math.cuhk.edu.hk/~lmlui/dct.pdf):
[162.3 ,40.6, 20.0...
[30.5 ,108.4...
but I found this result:
[1186.3 , 40.6, 20.0...
[30.5, 108.4 ....
The DCT is working fine. The difference between what you got and what you expect arises because that particular example actually computes the DCT of M instead of the original image, I. As the paper shows, M = I - 128. The only difference in your example is that you don't subtract that offset, so your input values are all larger by 128. In a cosine or Fourier transform, the first coefficient (the "DC offset", as it is sometimes called) scales with the overall level of the image, so it comes out larger, while all the other coefficients stay the same. If you add or subtract a constant from the entire image, the coefficients of the transform remain the same, except the very first one.
From the standard definition of the DCT (here in its DCT-II form):
X_k = \sum_{n=0}^{N-1} x_n \cos\left[\frac{\pi}{N}\left(n + \frac{1}{2}\right) k\right], \quad k = 0, \ldots, N-1
You can see here that for the first coefficient, with k = 0, the argument of the cosine is 0, and cos(0) = 1. Thus, X_0 is just the sum of all the x_n values. Generally this value may be scaled by something relating to N so that it's something like an average. When doing so, it relates back to the X_0 term being a "DC offset", which you'll see described as the "mean value of the signal," or in other words, how far the signal is from 0. This is super useful to have as one of the cosine/Fourier transform coefficients, as it allows the transform to completely describe a signal; all the other coefficients describe the frequency content and so they say nothing about how far the values are from 0, but the first coefficient, the DC offset, does tell you the shift!
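The effect can be checked on the block from the question with a small NumPy sketch. It uses a hand-written orthonormal 2D DCT-II (the helper `dct2_ortho` is a made-up name) rather than `cv2.dct`, on the assumption that OpenCV uses the same orthonormal scaling, which is consistent with the 1186.3 and 162.3 values quoted above.

```python
import numpy as np

def dct2_ortho(block):
    """Orthonormal 2D DCT-II of a square block, computed as C @ block @ C.T."""
    n = block.shape[0]
    k = np.arange(n).reshape(-1, 1)   # frequency index
    i = np.arange(n).reshape(1, -1)   # sample index
    c = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    c[0, :] = np.sqrt(1.0 / n)        # DC row has its own scale factor
    return c @ block @ c.T

block = np.array([
    [154, 123, 123, 123, 123, 123, 123, 136],
    [192, 180, 136, 154, 154, 154, 136, 110],
    [254, 198, 154, 154, 180, 154, 123, 123],
    [239, 180, 136, 180, 180, 166, 123, 123],
    [180, 154, 136, 167, 166, 149, 136, 136],
    [128, 136, 123, 136, 154, 180, 198, 154],
    [123, 105, 110, 149, 136, 136, 180, 166],
    [110, 136, 123, 123, 123, 136, 154, 136],
], dtype=np.float64)

raw = dct2_ortho(block)              # no offset removed
shifted = dct2_ortho(block - 128.0)  # M = I - 128, as in the paper

# Only the DC term differs, by 128 * 8 = 1024
difference = raw - shifted
```

Here `raw[0, 0]` is 1186.25 and `shifted[0, 0]` is 162.25, while every other entry of `difference` is zero up to rounding.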
I want to do feature scaling on datasets using means and standard deviations, and my code is below. Apparently it is not universal code, since it only seems to work with one dataset. I am wondering what is wrong with my code; any help will be appreciated! Thanks!
X is the dataset I am currently using.
mu = mean(X);
sigma = std(X);
m = size(X, 1);
mu_matrix = ones(m, 1) * mu;
sigma_matrix = ones(m, 1) * sigma;
featureNormalize = (X-mu_matrix)/sigma;
Thank you for clarifying what you think the code should be doing in the comments.
My answer will effectively answer why what you think is happening is not what is happening.
First let's talk about the mean and std functions. When their input is a vector (whether vertically or horizontally aligned), they return a single number, which is the mean or standard deviation of that vector respectively, as you might expect.
However, when the input is a matrix, you need to know what they do differently. Unless you specify the dimension along which the mean / std should be calculated, they operate along the first dimension, i.e. down the rows of each column, returning a single number for each column. Therefore, the end result of this operation will be a horizontal vector.
Therefore, both mu and sigma will be horizontal vectors in your code.
Now let's move on to the 'matrix multiplication' operator (i.e. *).
When using the matrix multiplication operator, if you multiply a horizontal vector with a vertical vector (i.e. the usual matrix multiplication operation), your output is a single number (i.e. a scalar). However, if you reverse the orientations, so that you multiply a vertical vector by a horizontal one, you will in fact be calculating a 'Kronecker product' instead (for two vectors, this is the outer product). Since the output of the * operation is completely defined by the rows of the first input and the columns of the second input, whether you're getting a scalar or a Kronecker product is implicit and entirely dependent on the orientation of your inputs.
Therefore, in your case, the line mu_matrix = ones(m, 1) * mu; is not in fact appending a vector of ones, like you say. It is in fact performing the Kronecker product between a vertical vector of ones and the horizontal vector that is your mu, effectively creating an m-by-n matrix with mu repeated vertically for m rows.
Therefore, at the end of this operation, as the variable naming would suggest, mu_matrix is in fact a matrix (same with sigma_matrix), having the same size as X.
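For readers more familiar with NumPy, the same construction can be sketched as follows. This is only an analogue of the MATLAB/Octave line `ones(m, 1) * mu`; the small matrix `X` is made-up example data.

```python
import numpy as np

# Made-up data: 3 observations (rows) of 2 features (columns)
X = np.array([[1.0, 10.0],
              [2.0, 20.0],
              [3.0, 30.0]])

mu = X.mean(axis=0)          # one mean per column, like mean(X) on a matrix
m = X.shape[0]

# Column of ones times row vector: the outer product repeats mu on every row
mu_matrix = np.outer(np.ones(m), mu)
```

`mu_matrix` has the same shape as `X`, with each row equal to `mu`.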
Your final step is X - mu_matrix, which gives you, at each element, the difference between x and mu at that element. Then you "divide" with the sigma matrix.
Here is why I asked if you were sure you should be using ./ instead of /.
/ is the matrix division operator. With / you are effectively performing matrix multiplication by an inverse matrix, since D / S is mathematically equivalent to D * inv(S). It seems to me you should be using ./ instead, to simply divide each element by the standard deviation of its column (which is why you had to repeat the horizontal vector over m rows in sigma_matrix, so that you could use it for 'elementwise division'), since what you are trying to do is to normalise each row (i.e. observation) of a particular column by the standard deviation that is specific to that column (i.e. feature).
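As a point of comparison, here is a minimal NumPy sketch of column-wise standardization with elementwise division. It is an analogue of the intended MATLAB/Octave behaviour, not the asker's code: the function name `feature_normalize` is chosen for the example, and `ddof=1` is used on the assumption that it should match the default N-1 normalization of `std`.

```python
import numpy as np

def feature_normalize(X):
    """Standardize each column (feature) of X to zero mean and unit std."""
    mu = X.mean(axis=0)               # one mean per column
    sigma = X.std(axis=0, ddof=1)     # one sample std per column
    # Broadcasting repeats mu and sigma over the rows; / is elementwise here
    return (X - mu) / sigma, mu, sigma


rng = np.random.default_rng(0)
X = rng.normal(loc=[5.0, -2.0, 100.0], scale=[1.0, 3.0, 20.0], size=(50, 3))
X_norm, mu, sigma = feature_normalize(X)
```

In NumPy, `/` between arrays is already elementwise, so it plays the role of `./` rather than of the matrix division operator discussed above.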
I want to determine image sharpness by the amount of high frequencies within the image. As far as I understand, the dft() function from OpenCV returns two matrices, holding the real and imaginary parts of the result.
This is where I am stuck. How can I determine the amount of high frequencies from this data?
I am thankful for every hint/link which could provide me with a better understanding.
Greetings
Compute the Fourier transform.
Calculate the magnitude of the result.
Now you have a 2D matrix. Consider the upper-left quadrant (the others are mirror images for a real-valued source).
Here the Magn[0][0] entry corresponds to zero frequency, and the Magn[(n-1)/2][(n-1)/2] entry corresponds to the highest frequency.
The upper-left part of this submatrix contains the low-frequency samples, so you can calculate the sum of the values in this part and in the remaining part, and compare these sums. For example (pseudocode):
cvIntegral(Magn, Rect(0..n/4, 0..n/4)) compare with
cvIntegral(Magn, Rect(0..n/2, 0..n/2)) - cvIntegral(Magn, Rect(0..n/4, 0..n/4))
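The pseudocode above might be sketched in NumPy as follows. The function name `high_frequency_ratio`, the n/4 split point, and the two test images are assumptions made for the example; `np.fft.fft2` stands in for OpenCV's `dft()`.

```python
import numpy as np

def high_frequency_ratio(image):
    """Ratio of high- to low-frequency magnitude in the upper-left quadrant."""
    magn = np.abs(np.fft.fft2(image))
    h, w = image.shape
    quadrant = magn[: h // 2, : w // 2]          # other quadrants mirror this one
    low = quadrant[: h // 4, : w // 4].sum()     # low-frequency corner
    high = quadrant.sum() - low                  # everything else in the quadrant
    return high / low


n = np.arange(64)
# Smooth image: a single slow cosine pattern plus a constant offset
smooth = 1.0 + np.outer(np.cos(2 * np.pi * n / 64), np.cos(2 * np.pi * n / 64))
# Same image with added noise, which injects high-frequency content
rng = np.random.default_rng(0)
noisy = smooth + rng.standard_normal((64, 64))

ratio_smooth = high_frequency_ratio(smooth)
ratio_noisy = high_frequency_ratio(noisy)
```

A larger ratio suggests more high-frequency content. The absolute value depends on image size and content, so it is probably most useful for comparing images of the same scene.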
I've been experimenting with a few different techniques I could find for frequency shifting (specifically, I want to shift high-frequency signals to a lower frequency). At the moment I'm trying to use this technique -
take the original signal, x(t), multiply it by: cos(2 PI dF t), sin(2 PI dF t)
R(t) = x(t) cos(2 PI dF t)
I(t) = x(t) sin(2 PI dF t)
where dF is the delta frequency to be shifted.
Now you have two time series signals: R(t) and I(t).
Conduct a complex Fourier transform using R(t) as the real and I(t) as the imaginary parts. The result will be a frequency-shifted spectrum.
I have interpreted this into the following code -
for(j=0;j<(BUFFERSIZE/2);j++)
{
Partfunc = (((double)j)/2048);
PreFFTShift[j+x] = PingData[j]*(cos(2*M_PI*Shift*(Partfunc)));
PreFFTShift[j+1+x] = PingData[j]*(sin(2*M_PI*Shift*(Partfunc)));
x++;
}
//INITIALIZE FFT
status = arm_cfft_radix4_init_f32(&S, fftSize, ifftFlag, doBitReverse);
//FFT on FFTData
arm_cfft_radix4_f32(&S, PreFFTShift);
This builds an array with interleaved real and imaginary data and then runs the FFT on it. I then invert the FFT, but the output I'm getting is pretty garbled. The results seem huge in comparison to what I think they should be, and although there are a few traces of a frequency-shifted signal, it's hard to tell, as the result seems mostly pretty noisy.
I've also attempted simply revolving the array values of a standard FFT of my original signal to get a freq shift, but to no avail. Is there a better method for doing this?
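For reference, the mixing technique quoted above can be sketched in NumPy. This is only an illustration with made-up values (`fs`, `f0`, `dF`), not a fix for the ARM code. Multiplying by cos and sin as R(t) + j I(t) is the same as multiplying by exp(+j 2π dF t), which shifts the spectrum up; the sketch uses the opposite sign to shift down.

```python
import numpy as np

fs = 2048.0                      # sample rate in Hz (hypothetical)
n = np.arange(2048)
t = n / fs
f0, dF = 600.0, 200.0            # tone frequency and shift (hypothetical)

x = np.cos(2 * np.pi * f0 * t)   # real input: spectral lines at +600 and -600 Hz

# Complex mixing: R(t) = x cos(...), I(t) = -x sin(...), i.e. x * exp(-j 2 pi dF t)
mixed = x * np.exp(-2j * np.pi * dF * t)

spectrum = np.fft.fft(mixed)
freqs = np.fft.fftfreq(len(n), d=1.0 / fs)

# The two strongest lines, sorted by frequency
strongest = np.argsort(np.abs(spectrum))[-2:]
peaks = sorted(float(f) for f in freqs[strongest])
```

Both lines of the real input move by -dF, ending up at -800 Hz and +400 Hz. Taking only the real part of `mixed` would therefore contain both 400 Hz and 800 Hz; to get a single shifted tone, the negative-frequency half is usually removed first, for example by working with the analytic signal.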
Have you tried something like this:
Apply a Hann window to each frame of data
Once you have your windowed frame of audio data, do an FFT on it
Do some kind of transformation in the frequency domain (you can use Flanagan - phase vocoder)
Now go back to the time domain with an IFFT
Apply a Hann window to the IFFT data
Use overlap-add to merge each new frame of time-domain data into the output stream
My results:
I created two concatenated sinusoids (250Hz and 400Hz) and moved them one octave UP!
The blue waveform is the original and the red one is the modified signal; you can see a fade-in/fade-out caused by the overlap-add and the Hann window!
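The windowed overlap-add steps listed above might be sketched like this in NumPy. It is a deliberately naive version: the frequency-domain step just moves bins by a whole number of positions, it is not a phase vocoder, and the function name, frame size, and hop size are assumptions. With a hop of frame/4, shifts that are not a multiple of 4 bins would introduce phase jumps between frames, which is the kind of distortion a phase vocoder is meant to reduce.

```python
import numpy as np

def shift_bins_ola(x, frame=256, hop=64, bin_shift=0):
    """Hann-windowed FFT / bin shift / IFFT / Hann window / overlap-add."""
    window = 0.5 - 0.5 * np.cos(2 * np.pi * np.arange(frame) / frame)
    out = np.zeros(len(x))
    norm = np.zeros(len(x))
    for start in range(0, len(x) - frame + 1, hop):
        spectrum = np.fft.rfft(x[start:start + frame] * window)
        shifted = np.zeros_like(spectrum)
        if bin_shift >= 0:
            # Move every bin up by bin_shift positions
            shifted[bin_shift:] = spectrum[:len(spectrum) - bin_shift]
        else:
            # Move every bin down by -bin_shift positions
            shifted[:bin_shift] = spectrum[-bin_shift:]
        y = np.fft.irfft(shifted, n=frame) * window   # second window
        out[start:start + frame] += y                 # overlap-add
        norm[start:start + frame] += window ** 2      # window gain
    # Divide out the summed window gain; uncovered samples stay at zero
    return out / np.maximum(norm, 1e-12)


n = np.arange(2048)
x = np.sin(2 * np.pi * 16 * n / 256)     # tone on bin 16 of a 256-point frame
unchanged = shift_bins_ola(x, bin_shift=0)
moved_up = shift_bins_ola(x, bin_shift=8)
```

With `bin_shift=0` the interior of the signal is reconstructed as is, which is a quick sanity check on the window and overlap choice before trying a real transformation.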
If you want the frequency shift to sound more "natural", you will have to maintain the ratios between all the initial frequency bins, where the amount of shift will depend on the FFT bin, thus requiring lots of interpolation. The Phase Vocoder algorithm will use multiple FFTs to reduce phase distortion in the result.