what is the PSD unit by using FFT method - signal-processing

I'm just doing a power spectral density analysis of a signal in time domain. I'm following the fft method described in :
http://www.mathworks.com/support/tech-notes/1700/1702.html
It gives the real physical unit for the PSD. However, the unit is "power", is that mean "V^2/Hz"?
If I take 10*log10(power) or 10*log10(V^2/Hz), do I get the unit of "dB/Hz"?
Then how can I convert it to dBm/MHz?

It depends on the unit of your timeseries. Often we think of this as just "amplitude", but if your timeseries is a series of voltage amplitude vs. time, then your PSD estimate will be Volts^2/Hz. This is because the PSD is the Fourier Transform of the autocorrelation of your original signal: The autocorrelation has units of Volts^2, and running it through the Fourier Transform decomposes these units over frequency, instead of time, resulting in units of Volts^2/Hz. This is commonly referred to as Watts/Hz, but the conversion from Volts^2 to Watts is not very physically meaningful, as W = V^2/R.
10*log10(power) will result in a unit of dB/Hz, but remember that decibels are always a comparison between two power levels; you are quantifying a ratio of powers. A better definition of decibels is 10*log10(P1/P0), as explained here. If you simply plug a PSD bin estimate into this equation, you are setting your PSD bin to P1 and implicitly comparing it to a P0 value of 1. This may be what you want, and it may not be. For visualization purposes, this is fairly typical, but if you have a standard reference power you should be comparing to, you should use that for P0 instead.
Assuming that you are attempting to plot a dB Power Spectral Density estimate, to convert from Hz to MHz, you simple rescale the x-axis of your frequency graph. Remember that a MHz is just 1 million Hz, so the only difference is that 240000Hz = 0.24MHz
EDIT
The point brought up by mtrw is a very valid one; if you are dealing with large amounts of data and are averaging FFT vectors, I highly suggest the Multitaper method; it's a much more statistically sound method of sacrificing frequency resolution for greater confidence on your PSD estimate.

If you have a PSD in W/Hz i.e. 100 W/Hz then you have 50 dBm/Hz. dB/Hz or is often vaguely and generically used instead of dBm/Hz. Audacity uses dB as shorthand for dBFS (not dBFS/Hz, because it is computing a DFT, and discrete frequencies use a power spectrum and not a density) . A digital signal that reaches 50% of the maximum level has an amplitude of −6 dBFS, which is 6 dB below full scale – the removal of the MSB, hence the 6dB/bit figure (because 50% of maximum level is 25% of maximum power; 1/4 = - 6dB)
dBm is the logarithmic ratio of the power with respect to 1mW, you divide the power by 1mW to get a unitless ratio, and then take the logarithm to get dB units, which in this case makes more sense to be clarified as dBm.
dBc/Hz is the ratio with respect to the carrier power, which is a ratio of two dBm/Hz values, meaning you subtract them and you get dBc/Hz; you get the same result if you divide the two linear power levels in W and then convert the ratio to dB (or more appropriately dBc).
dB-Hz is a logarithmic measure of bandwidth with respect to 1Hz and
dBJ is a measure of spectral density as a logarithmic ratio to 1 joule, seeing as W/Hz is indeed J.
Power spectral density is a density function, so you need to integrate it to get the actual quantity, like a line Integral of a V/m electric field, or a probability density of probability per x. This does not make sense for discrete quantities and instead the power spectrum is used akin to a probability mass function. If you see dB (which should be used for the discrete frequency domain) instead of dBm/Hz then it's wrong, but if you see it instead of dBm then it's right, as long as it's made clear what the reference is.

Related

How does image digitalization differ from sound digitalization (PCM)?

I am trying to understand digitalization of sound and images.
As far as I know, they both need to convert analog signal to digital signal. Both should be using sampling and quantization.
Sound: We have amplitudes on axis y and time on axis x. What is on axis x and y during image digitalization?
What is kind of standard of sample rate for image digitalization? It is used 44kHz for CDs (sound digitalization). How exactly is used sample rate for images?
Quantization: Sound - we use bit-depth - which means levels of amplitude - Image: using bit-depth also, but it means how many intesities are we able to recognize? (is it true?)
What are other differences between sound and image digitalization?
Acquisition of images can be summarized as a spatial sampling and conversion/quantization steps. The spatial sampling on (x,y) is due to the pixel size. The data (on the third axis, z) is the number of electrons generated by photoelectric effect on the chip. These electrons are converted to ADU (analog digital unit) and then to bits. What is quantized is the light intensity in level of greys, for example data on 8 bits would give 2^8 = 256 levels of gray.
An image loses information both due to the spatial sampling (resolution) and the intensity quantization (levels of gray).
Unless you are talking about videos, images won't have sampling in units of Hz (1/time) but in 1/distance. What is important is to verify the Shannon-Nyquist theorem to avoid aliasing. The spatial frequencies you are able to get depend directly on the optical design. The pixel size must be chosen respectively to this design to avoid aliasing.
EDIT: On the example below I plotted a sine function (white/black stripes). On the left part the signal is correctly sampled, on the right it is undersampled by a factor of 4. It is the same signal, but due to bigger pixels (smaller sampling) you get aliasing of your data. Here the stripes are horizontal, but you also have the same effect for vertical ones.
There is no common standard for the spatial axis for image sampling. A 20 megapixel sensor or camera will produce images at a completely different spatial resolution in pixels per mm, or pixels per degree angle of view than a 2 megapixel sensor or camera. These images will typically be rescaled to yet another non-common-standard resolution for viewing (72 ppi, 300 ppi, "Retina", SD/HDTV, CCIR-601, "4k", etc.)
For audio, 48k is starting to become more common than 44.1ksps. (on iPhones, etc.)
("a nice thing about standards is that there are so many of them")
Amplitude scaling in raw format also has no single standard. When converted or requantized to storage format, 8-bit, 10-bit, and 12-bit quantizations are the most common for RGB color separations. (JPEG, PNG, etc. formats)
Channel formats are different between audio and image.
X, Y, where X is time and Y is amplitude is only good for mono audio. Stereo usually needs T,L,R for time, left, and right channels. Images are often in X,Y,R,G,B, or 5 dimensional tensors, where X,Y are spatial location coordinates, and RGB are color intensities at that location. The image intensities can be somewhat related (depending on gamma corrections, etc.) to the number of incident photons per shutter duration in certain visible EM frequency ranges per incident solid angle to some lens.
A low-pass filter for audio, and a Bayer filter for images, are commonly used to make the signal closer to bandlimited so it can be sampled with less aliasing noise/artifacts.

Is pixel value normalization needed in medical image segmentation?

I have a dataset of CT-Scan representing hips scan. I'm currently not normalizing the pixel value because in CT-Scan pixel value represent different part of the scan (bone 1000+, water0, air-1000, etc). Also the range of pixels value change every scan (ex. -500:1500, -400:1200).
I'm wondering if normalizing pixel value between [0,1] would be a + for my training or I would lost information on the relation between int pixel value and segmentation truth.
Thanks for the answers
It depends a little on your data. What you are describing are so called Hounsfield Units (probably read up on that), you basically express every intensity relative to the one of water.
Bone density (and with that the corresponding intensity) can vary greatly, not to mention if there is metal present.
Your HU range is highly dependent on the body region and mainly the patient.
https://images.app.goo.gl/WNLCs8eENTdbXWwM7
CT Scans are usually uint16 grayscale, I would definitely normalize as long as you can ensure that your float range is sufficient to accommodate the 2^16 different grayscale values.

Strange FFT spectrum from a near perfect sinusoid

I have retrieved some signal in my Abaqus simulation for verification purpose. The true signal shall be a perfect sinusoid at 300kHz and I performed fft on the sampled signal using scipy.fftpack.fft.
But I got a strange spectrum as shown below (sorry that I am too lazy to scale the x-axis of the spectrum to the correct frequency). In the same figure, I sliced the signal into pieces and plotted in the time domain. I also repeated the same process for a pure sine wave.
This totally surprises me. As indicated below in the code, sampling frequency is 16.66x of the frequency of the signal. At the moment, I think it is due to the very little error in the sampling period. In theory, Abaqus shall sample it in a regular time interval. As you can see, there is some little error so that the dots in my signal appear to be thicker than the perfect signal. But does such a small error give a striking difference in the frequency spectrum? Otherwise, why is the frequency spectrum like that?
FYI1: This is the magnified fft spectrum of my signal:
FYI2: This is the python code that was used to produce the above figures
def myfft(x, k, label):
plt.plot(np.abs(fft(x))[0:k], label = label)
plt.legend()
plt.subplot(4,1,1)
for i in range(149800//200):
plt.plot(mysignal[200*i:200*(i+1)], 'bo')
plt.subplot(4,1,2)
myfft(mysignal,150000//2, 'fft of my signal')
plt.subplot(4,1,3)
[Fs,f, sample] = [5e6,300000, 150000]
x = np.arange(sample)
y = np.sin(2 * np.pi * f * x / Fs)
for i in range(149800//200):
plt.plot(y[200*i:200*(i+1)], 'bo')
plt.subplot(4,1,4)
myfft(y,150000//2, 'fft of a perfect signal')
plt.subplots_adjust(top = 2, right = 2)
FYI3: Here is my signal in .npy and .txt format. The signal is pretty long. It has 150001 points. The .txt one is the raw file from Abaqus. The .npy format is what I used to produce the above plot - (1) the time vector is removed and (2) the data is in half precision and normalized.
Any standard FFT algorithm you use operates on the assumption that the signal you provide is uniformly sampled. Uniform in this context means equally spaced in time. Your signal is clearly not uniformly sampled, therefore the FFT does not "see" a perfect sine but a distorted version. As a consequence you see all these additional spectral components the FFT computes to map your distorted signal to the frequency domain. You have two options now. Resample your signal i.e. it is uniformly sampled and use your off the shelf FFT or take a non-uniform FFT to get your spectrum. Here is one library you could use to calculate your non-uniform FFT.

Find High Frequencies with Discrete Fourier Transform [OpenCV]

I want to determine image sharpness by the amount of high frequencies within the image. As far as I understand the dft() function from OpenCV returns two matrices with real and complex numbers.
This is where I am stuck. How can I determine the amount of high frequencies from this data?
I am thankful for every hint/link which could provide me with a better understanding.
Greetings
Make FT
Calculate magnitude of result
Now you have 2D matrix. Consider upper left quadrant (other are mirrors for real source).
Here Magn[0][0] entry corresponds to zero frequency, and Magn[(n-1)/2][(n-1)/2] entry corresponds to the highest frequency.
Left upper part of this submatrix contains low-frequency samples, so you can calculate sum of values in this part and in the rest part and compare these sums. For example (pseudocode):
cvIntegral(Magn, Rect(0..n/4, 0..n/4)) compare with
cvIntegral(Magn, Rect(0..n/2, 0..n/2)) - cvIntegral(Magn, Rect(0..n/4, 0..n/4))

Cepstrum analysis: how to find quefrency step?

I'm newbie in cepstrum analysis. So that's the question.
I have signal with the length 4096 and sample rate 8000 Hz. I make FFT and get the array with the length 4096*2 (2*i position is for cosinus coeff, 2*i+1 position is for sinus coeff). Frequency step is (sampleRate/signalLength == 8000/4096). So, I can calculate frequency at i position this way: i*sampleRate/signalLength.
Then, I make the cepstrum transformation. I can't understand how to find quefrency step and how to find frequency for given quefrency.
The bin number of an FFT result is inversely proportional to the length of the period of a sinusoidal component in the time domain. The bin number of a quefrency result is also inversely proportional to the distance between partials in a series of overtones in the frequency domain (this distance often the same as a root or fundamental pitch). Thus quefrency bin number would be proportional to period or repeat lag (autocorrelation peak) of a harmonically rich periodic signal in the time domain.

Resources