Symmetric part after applying FFT is for which frequencies? - signal-processing

I've got a 4096-sample audio clip sampled at 44.1 kHz. After applying the FFT to it I get 4096 frequency bands.
Each band would then span 10.77 Hz (44100 / 4096).
I've been told the second half of the frequencies is conjugate symmetric to the first half.
Considering this, is my calculation above correct, or did I miss something important?

That's pretty much correct - for most common complex-to-complex FFTs with purely real inputs (i.e. all imaginary parts zero) the first N/2 output bins (0..2047 in your case) are typically the only bins that you will be interested in. The first bin is DC (0 Hz), and bin N/2 corresponds to Nyquist (Fs/2 = 22.05 kHz), which is not normally of interest. Bins above N/2 are just complex conjugate "mirror images" of the bottom N/2-1 bins.
See this answer for more details.
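The bin layout described above can be checked numerically. The following is a minimal sketch, assuming a naive DFT on a short 8-sample real signal instead of a real 4096-point FFT routine; the helper names `bin_frequency` and `naive_dft` are made up for illustration.

```python
import cmath

def bin_frequency(k, sample_rate, n):
    """Centre frequency in Hz of DFT bin k for an n-point transform."""
    return k * sample_rate / n

def naive_dft(x):
    """Direct O(n^2) DFT, used here only to keep the sketch dependency-free."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]

# Bin spacing and Nyquist bin for the 4096-sample, 44.1 kHz case.
spacing = bin_frequency(1, 44100, 4096)      # about 10.77 Hz
nyquist = bin_frequency(2048, 44100, 4096)   # 22050.0 Hz

# Conjugate symmetry for a purely real input: X[n-k] == conj(X[k]).
signal = [0.5, 1.0, -0.25, 0.0, 0.75, -1.0, 0.3, 0.1]
spectrum = naive_dft(signal)
n = len(signal)
symmetric = all(abs(spectrum[n - k] - spectrum[k].conjugate()) < 1e-9
                for k in range(1, n // 2))
```

Bins 1 to n/2-1 carry all the information that bins n/2+1 to n-1 repeat, which is why only the lower half is usually inspected.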

Related

Find High Frequencies with Discrete Fourier Transform [OpenCV]

I want to determine image sharpness by the amount of high frequencies within the image. As far as I understand, the dft() function from OpenCV returns two matrices holding the real and imaginary parts.
This is where I am stuck. How can I determine the amount of high frequencies from this data?
I am thankful for every hint/link which could provide me with a better understanding.
Greetings
Compute the FT.
Calculate the magnitude of the result.
Now you have a 2D matrix. Consider the upper-left quadrant (the others are mirrors for a real source).
Here the Magn[0][0] entry corresponds to zero frequency, and the Magn[(n-1)/2][(n-1)/2] entry corresponds to the highest frequency.
The upper-left part of this submatrix contains the low-frequency samples, so you can sum the values in this part and in the rest of the quadrant and compare the two sums. For example (pseudocode):
cvIntegral(Magn, Rect(0..n/4, 0..n/4)) compare with
cvIntegral(Magn, Rect(0..n/2, 0..n/2)) - cvIntegral(Magn, Rect(0..n/4, 0..n/4))
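The comparison in the pseudocode can be sketched in plain Python. This assumes the magnitude matrix is already available as a list of rows (for instance, computed from OpenCV's dft() output) with the zero-frequency term at index [0][0]; the function name `band_sums` is hypothetical.

```python
def band_sums(magn):
    """Return (low, high) magnitude sums over the upper-left quadrant.

    magn is a 2D list (rows of numbers) holding the DFT magnitudes with the
    zero-frequency term at magn[0][0], i.e. not shifted to the centre.
    """
    rows, cols = len(magn), len(magn[0])
    half_r, half_c = rows // 2, cols // 2
    quarter_r, quarter_c = rows // 4, cols // 4

    # Sum over the whole upper-left quadrant, then over its low-frequency corner.
    quadrant = sum(magn[r][c] for r in range(half_r) for c in range(half_c))
    low = sum(magn[r][c] for r in range(quarter_r) for c in range(quarter_c))
    return low, quadrant - low

# Toy input: a flat 8x8 spectrum, so the sums just count matrix entries.
flat = [[1.0] * 8 for _ in range(8)]
low, high = band_sums(flat)
```

A sharper image should push more of the total into the second value; how the two sums are turned into a single sharpness score is left open here.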

SIFT parabola fitting of histogram

I am implementing Lowe's method, "SIFT", for finding and describing features in an image.
I have found interest points, and now I have to describe them. Using Lowe's method, I have calculated the gradient magnitude and orientation in an area around the keypoint and created a Gauss-weighted histogram with 36 bins, each corresponding to an orientation range of 10 degrees. For each keypoint, there is a histogram. Each bin is the sum of the weighted magnitudes in that direction. An example taken from aishack.in: http://www.aishack.in/static/img/tut/sift-orientation-histogram.jpg
Any bin within 80% of the height of the maximum bin is made into a new keypoint. After describing this, the paper says: "Finally, a parabola is fit to the 3 histogram values closest to each peak to interpolate the peak position for better accuracy". I am not sure I get this.
In my understanding, it means a parabola is fit to the peak and the values to its left and right, like this (be warned: drawn freehand):
http://i.stack.imgur.com/7V8pb.jpg
and the orientation of the keypoint will be where the extremum of the parabola is. For instance: if the parabola fitted to 10-19, 20-29, and 30-39 (with 20-29 being the histogram peak) had its extremum at a point that fell in the 30-39 bin, then this would be the orientation of that keypoint. Am I understanding this correctly? In this way, the orientation of the keypoint can only be one of 36 orientations.
Another option: same idea as above, only the histogram is no longer discrete. The extremum of the parabola will thus be a continuous value, and this value is assigned to the keypoint.
The idea of the parabola fitting is to find the peak with better than bin resolution. As you see in your example, the peak is at 20-29 (average 24.5) but the 10-19 bin is higher than the 30-39 bin. It's therefore likely that the precise peak should be below 24.5.
You can't have a non-discrete histogram; that defeats the point of a histogram. What you can have is overlapping bins: create a bin for 20-29, but also a bin for 21-30 and 22-31 etc. So the value 24 would map to 10 bins, from 15-24 to 24-33.
And when you increment a bin, you don't necessarily need to increment it by 1. You can also increment a bin by a variable amount, e.g. the distance from the given value to the edge of the bin. So 24 would add 1 to bin 16-25 but 4 to bin 20-29.
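The sub-bin peak estimate described above can be sketched as follows. This assumes a circular 36-bin histogram and places bin i at the centre (i + 0.5) * 10 degrees, a convention chosen here for illustration (the answer uses 24.5 for the 20-29 bin); `interpolate_peak` is a hypothetical name.

```python
def interpolate_peak(hist, peak, bin_width=10.0):
    """Refine a histogram peak by fitting a parabola through 3 bins.

    hist is a circular orientation histogram and peak is the index of a
    local maximum. Returns the interpolated orientation in degrees.
    """
    n = len(hist)
    left = hist[(peak - 1) % n]
    centre = hist[peak]
    right = hist[(peak + 1) % n]

    denom = left - 2.0 * centre + right
    # Vertex of the parabola through (-1, left), (0, centre), (1, right).
    offset = 0.0 if denom == 0 else 0.5 * (left - right) / denom

    # Assumption: bin i covers [i*w, (i+1)*w), so its centre is (i + 0.5)*w.
    return ((peak + 0.5 + offset) * bin_width) % (n * bin_width)

# Peak in the 20-29 bin with a taller left neighbour than right neighbour.
hist = [0.0] * 36
hist[1], hist[2], hist[3] = 8.0, 10.0, 6.0
angle = interpolate_peak(hist, 2)
```

With the left neighbour taller than the right one, the estimate lands below the bin centre, which matches the reasoning in the answer.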

Calculate autocorrelation in time domain vs FFT

I'm doing a pitch detection using a combination of an ACF and AMDF.
First I was using ACF in the time domain like this:
Get a buffer of 2048 samples
Window it (Hamming window)
sum=Sum(Buffer[i]*Buffer[i+lag]) for all i < 2048 - lag
acf = sum / 2048
And repeat the last 2 steps for all lags to be considered. (actually doing interpolation for non-integer lags)
Now I found that you can use FFT to calculate the ACF:
Get a buffer of 2048 samples
Window it (Hamming window)
fftBuf=fft(buffer)
buffer[i]=real(fftBuf[i])^2+imag(fftBuf[i])^2
fftBuf=fft(buffer) //ifft=fft for real signals
acfBuf = real(fftBuf) / 2048
Then acfBuf[lag] is the ACF value at that lag.
I expected the results to be the same, or at least similar. But they are not.
E.g. for a 65.4 Hz sine wave (note C2) I get ~0.2 for the corresponding lag of 674.25 using the time-domain approach and ~536.795 using the FFT.
What did I miss? Or are the two not the same?
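For comparison, here is a minimal sketch in which the two approaches do agree. It is not an answer from the thread, and it changes the FFT recipe above in two ways that are assumptions of this sketch: the buffer is zero-padded to twice its length, so the circular correlation implied by the DFT equals the linear one, and the second transform is a properly normalized inverse. A naive DFT stands in for a real FFT, and windowing is omitted.

```python
import cmath
import math

def dft(x, inverse=False):
    """Naive O(n^2) DFT; the inverse includes the 1/n normalization."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        acc = sum(x[t] * cmath.exp(sign * 2j * cmath.pi * k * t / n)
                  for t in range(n))
        out.append(acc / n if inverse else acc)
    return out

def acf_time(buf):
    """Time-domain ACF: sum of buf[i] * buf[i + lag], divided by the length."""
    n = len(buf)
    return [sum(buf[i] * buf[i + lag] for i in range(n - lag)) / n
            for lag in range(n)]

def acf_fft(buf):
    """Same quantity via the power spectrum of the zero-padded buffer."""
    n = len(buf)
    padded = list(buf) + [0.0] * n            # zero-pad to 2n samples
    power = [abs(c) ** 2 for c in dft(padded)]
    corr = dft(power, inverse=True)           # normalized inverse transform
    return [corr[lag].real / n for lag in range(n)]

# Three full cycles of a sine in a 32-sample buffer.
tone = [math.sin(2 * math.pi * 3 * i / 32) for i in range(32)]
a = acf_time(tone)
b = acf_fft(tone)
max_diff = max(abs(x - y) for x, y in zip(a, b))
```

Using an unnormalized forward transform in place of the inverse scales the result by the transform length, which would be one plausible source of a large mismatch in magnitude.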

Cepstrum analysis: how to find quefrency step?

I'm a newbie in cepstrum analysis, so here is my question.
I have a signal with length 4096 and a sample rate of 8000 Hz. I compute the FFT and get an array of length 4096*2 (position 2*i holds the cosine coefficient, position 2*i+1 holds the sine coefficient). The frequency step is (sampleRate/signalLength == 8000/4096). So, I can calculate the frequency at position i this way: i*sampleRate/signalLength.
Then, I compute the cepstrum transformation. I can't understand how to find the quefrency step and how to find the frequency for a given quefrency.
The bin number of an FFT result is inversely proportional to the length of the period of a sinusoidal component in the time domain. The bin number of a quefrency result is in turn inversely proportional to the distance between partials in a series of overtones in the frequency domain (this distance is often the same as a root or fundamental pitch). Thus the quefrency bin number is proportional to the period or repeat lag (autocorrelation peak) of a harmonically rich periodic signal in the time domain.
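As a concrete sketch of that proportionality: assuming the cepstrum is obtained by an inverse transform of the log spectrum with the same length as the signal, bin k corresponds to a lag of k samples, so the quefrency step is one sample period. The function names below are hypothetical.

```python
def quefrency_seconds(index, sample_rate):
    """Quefrency (a time lag, in seconds) of cepstrum bin `index`."""
    return index / sample_rate

def quefrency_to_frequency(index, sample_rate):
    """Fundamental frequency in Hz whose period matches cepstrum bin `index`."""
    if index <= 0:
        raise ValueError("bin 0 has no corresponding frequency")
    return sample_rate / index

# With the 8000 Hz sample rate from the question:
step = quefrency_seconds(1, 8000)              # one sample period, 0.125 ms
pitch = quefrency_to_frequency(80, 8000)       # a peak at bin 80 is 100 Hz
```

Unlike the FFT frequency step, the quefrency step under this assumption does not depend on the signal length, only on the sample rate.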

What is the unit of a PSD computed with the FFT method?

I'm doing a power spectral density analysis of a time-domain signal. I'm following the FFT method described in:
http://www.mathworks.com/support/tech-notes/1700/1702.html
It gives the real physical unit for the PSD. However, the unit is "power"; does that mean "V^2/Hz"?
If I take 10*log10(power) or 10*log10(V^2/Hz), do I get a unit of "dB/Hz"?
Then how can I convert it to dBm/MHz?
It depends on the unit of your timeseries. Often we think of this as just "amplitude", but if your timeseries is a series of voltage amplitude vs. time, then your PSD estimate will be Volts^2/Hz. This is because the PSD is the Fourier Transform of the autocorrelation of your original signal: The autocorrelation has units of Volts^2, and running it through the Fourier Transform decomposes these units over frequency, instead of time, resulting in units of Volts^2/Hz. This is commonly referred to as Watts/Hz, but the conversion from Volts^2 to Watts is not very physically meaningful, as W = V^2/R.
10*log10(power) will result in a unit of dB/Hz, but remember that decibels are always a comparison between two power levels; you are quantifying a ratio of powers. A better definition of decibels is 10*log10(P1/P0), as explained here. If you simply plug a PSD bin estimate into this equation, you are setting your PSD bin to P1 and implicitly comparing it to a P0 value of 1. This may be what you want, and it may not be. For visualization purposes, this is fairly typical, but if you have a standard reference power you should be comparing to, you should use that for P0 instead.
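Assuming that you are attempting to plot a dB Power Spectral Density estimate, to convert from Hz to MHz you simply rescale the x-axis of your frequency graph. Remember that a MHz is just 1 million Hz, so the only difference is that 240000 Hz = 0.24 MHz. Note that rescaling the axis does not change the density unit on the y-axis: expressing the density itself per MHz instead of per Hz, assuming it is roughly flat over that 1 MHz, adds 10*log10(10^6) = 60 dB.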
EDIT
The point brought up by mtrw is a very valid one; if you are dealing with large amounts of data and are averaging FFT vectors, I highly suggest the Multitaper method; it's a much more statistically sound method of sacrificing frequency resolution for greater confidence on your PSD estimate.
If you have a PSD in W/Hz, e.g. 100 W/Hz, then you have 50 dBm/Hz. dB/Hz is often vaguely and generically used instead of dBm/Hz. Audacity uses dB as shorthand for dBFS (not dBFS/Hz, because it is computing a DFT, and discrete frequencies use a power spectrum and not a density). A digital signal that reaches 50% of the maximum level has an amplitude of -6 dBFS, which is 6 dB below full scale. This corresponds to the removal of the MSB, hence the 6 dB/bit figure (because 50% of maximum level is 25% of maximum power; 1/4 ≈ -6 dB).
dBm is the logarithmic ratio of the power with respect to 1 mW: you divide the power by 1 mW to get a unitless ratio, and then take the logarithm to get dB units, which in this case are more clearly labeled as dBm.
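These conversions can be sketched in a few lines. The helper names are made up for illustration, and the per-MHz conversion assumes the density is roughly flat across the 1 MHz in question.

```python
import math

def to_db(power, reference=1.0):
    """Level in dB of `power` relative to `reference` (same linear units)."""
    return 10.0 * math.log10(power / reference)

def watts_to_dbm(power_watts):
    """Level relative to 1 mW."""
    return to_db(power_watts, reference=1e-3)

def per_hz_to_per_mhz(level_db_per_hz):
    """Re-express a dB density per Hz as a density per MHz (adds 60 dB)."""
    return level_db_per_hz + 10.0 * math.log10(1e6)

density_dbm_hz = watts_to_dbm(100.0)           # 100 W/Hz expressed in dBm/Hz
density_dbm_mhz = per_hz_to_per_mhz(-100.0)    # -100 dBm/Hz in dBm/MHz
```

Calling `to_db` without a reference reproduces the implicit comparison against 1 that was mentioned earlier for a bare 10*log10(power).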
dBc/Hz is the ratio with respect to the carrier power, which is a ratio of two dBm/Hz values, meaning you subtract them and you get dBc/Hz; you get the same result if you divide the two linear power levels in W and then convert the ratio to dB (or more appropriately dBc).
dB-Hz is a logarithmic measure of bandwidth with respect to 1Hz and
dBJ is a measure of spectral density as a logarithmic ratio to 1 joule, seeing as W/Hz is indeed J.
Power spectral density is a density function, so you need to integrate it to get the actual quantity, like a line integral of a V/m electric field, or a probability density of probability per x. This does not make sense for discrete quantities; there the power spectrum is used instead, akin to a probability mass function. If you see dB (which should be used for the discrete frequency domain) instead of dBm/Hz then it's wrong, but if you see it instead of dBm then it's right, as long as it's made clear what the reference is.
