N step fft in D language - signal-processing

I am using the fft function from std.numeric:
Complex!double[] resultfft = fft(timeDomainAmplitudeVal);
The parameter timeDomainAmplitudeVal holds audio amplitude data. The sample rate is 44100 Hz and there are 131072 (2^17) samples.
I see that resultfft has the same size as timeDomainAmplitudeVal (131072), which does not fit my project (and also makes no sense to me). I need to be able to divide the FFT into N equally spaced frequencies, and I need to define this N myself.
Is there any way to implement this with std.numeric.fft, or do you have any advice on an FFT library?
PS: I would also be glad to hear about any DSP libraries that exist.

That's just how Fourier transforms work in the practical number-crunching world. Give S samples of signal, get S amplitudes. (Ignoring issues with complex numbers and symmetries.)
If you want N amplitudes, you'll have to interpolate the S-point amplitudes you get from the FFT. Your biggest decision is which interpolation to use: linear, cubic, truncated sinc, etc.
Alternative: resample the original audio signal so that it has your desired N samples over the same overall time interval, then FFT it.
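To make the interpolation option concrete, here is a minimal sketch in Python (standard library only). It assumes linear interpolation of the magnitudes of bins 0..S/2 and uses a naive DFT so that it is self-contained. The names dft and magnitudes_at_n_points and the test signal are made up for illustration. In D, the same indexing would apply to the array returned by std.numeric.fft.

```python
import cmath
import math

def dft(samples):
    """Naive O(S^2) DFT; only meant for a small demonstration."""
    s = len(samples)
    return [sum(samples[n] * cmath.exp(-2j * math.pi * k * n / s) for n in range(s))
            for k in range(s)]

def magnitudes_at_n_points(samples, n_points):
    """Linearly interpolate the magnitudes of bins 0..S/2 onto n_points
    equally spaced frequencies between 0 Hz and the Nyquist frequency.
    Requires n_points >= 2."""
    spectrum = dft(samples)
    half = [abs(c) for c in spectrum[:len(samples) // 2 + 1]]
    out = []
    for j in range(n_points):
        pos = j * (len(half) - 1) / (n_points - 1)  # fractional bin index
        lo = int(math.floor(pos))
        hi = min(lo + 1, len(half) - 1)
        frac = pos - lo
        out.append((1 - frac) * half[lo] + frac * half[hi])
    return out

# Hypothetical test signal: 64 samples containing exactly 8 cycles of a sine.
signal = [math.sin(2 * math.pi * 8 * n / 64) for n in range(64)]
reduced = magnitudes_at_n_points(signal, 9)
```

With 64 input samples and N = 9, every fourth bin is picked, so the peak from bin 8 lands at index 2 of reduced. Interpolating magnitudes discards phase. If you need phase as well, resampling the time signal first is probably the cleaner route.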

Take a look at pfft, a fast FFT written in D:
http://jerro.github.io/pfft/doc/pfft.pfft.html
Or use NumPy through Pyd:
http://docs.scipy.org/doc/numpy/reference/routines.fft.html
http://pyd.dsource.org/
HTH

It is absolutely normal that the FFT returns the same number of values as its input.
Here is some C++ code that performs windowed FFT analysis with overlap and optional "zero-phase" ordering: http://pastebin.com/4YKgbed1
What do FFT coefficients mean?
Question: "OK, so I've done the FFT and I'm told I can recover the original signal. Now, what are these coefficients?"
Answer: "You can think of coefficient i as representing the phase and amplitude of frequencies from SR*i/(2*N) to SR*(i+1)/(2*N). This is a helpful metaphor. But a more accurate view is that coefficient i is the contribution of a sine of frequency SR*i/(2*N) in a reconstruction of the original input chunk."
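As a small sketch of the bin-to-frequency mapping: for an FFT of M input samples, bin i sits at i*SR/M. The quoted SR*i/(2*N) agrees with this if N there counts the bins from 0 Hz up to Nyquist, i.e. N = M/2, which is presumably what was meant. The helper names below are hypothetical.

```python
def bin_frequency(i, sample_rate, fft_size):
    """Centre frequency in Hz of FFT bin i for an FFT of fft_size samples."""
    return i * sample_rate / fft_size

def nearest_bin(freq_hz, sample_rate, fft_size):
    """Index of the bin whose centre frequency is closest to freq_hz."""
    return round(freq_hz * fft_size / sample_rate)

# Numbers from the question: 44100 Hz sample rate, 131072 samples.
spacing = bin_frequency(1, 44100, 131072)        # distance between bins, about 0.34 Hz
nyquist = bin_frequency(131072 // 2, 44100, 131072)
a440_bin = nearest_bin(440.0, 44100, 131072)
```

So with the question's data the bins are already equally spaced, roughly a third of a hertz apart. Getting "N frequencies" is then a matter of choosing the FFT length or grouping bins.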

Related

How would I break down a signal's sound pressure level by frequency

I've been given some digitized sound recordings and asked to plot the sound pressure level per Hz.
The signal is sampled at 40 kHz and the units for the y axis are simply volts.
I've been asked to produce a graph of the SPL as dB/Hz vs Hz.
EDIT: The input units are voltage vs time.
Does this make sense? I thought SPL was a time-domain measure.
If it does make sense, how would I go about producing this graph? Apply the dB formula (20 * log10(x) IIRC) and do an FFT on that or...?
What you're describing is a power spectral density (PSD). MATLAB, for example, has a pwelch function that does literally what you're asking for. To scale to dB SPL/Hz, simply apply 10*log10(psd), where psd is the output of pwelch. Let me know if you need help with the function inputs.
If you're working with a different framework, let me know which one. I'm 100% sure it will have a version of this function, possibly with a different output format, in which case the scaling might be different.
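For readers without pwelch, here is a rough Python sketch of the scaling involved. It is a single unwindowed periodogram, not Welch's method (which averages windowed, overlapping segments), so treat it only as an illustration of the units. The microphone sensitivity sensitivity_v_per_pa is an assumed calibration value that the question does not give. The 20 µPa reference is the usual one for SPL in air.

```python
import cmath
import math

P_REF = 20e-6  # reference pressure for dB SPL in air, in pascals

def one_sided_psd(samples, sample_rate):
    """One-sided periodogram in (input unit)^2 per Hz, bins 0..n/2."""
    n = len(samples)
    psd = []
    for k in range(n // 2 + 1):
        x_k = sum(s * cmath.exp(-2j * math.pi * k * m / n) for m, s in enumerate(samples))
        p = abs(x_k) ** 2 / (sample_rate * n)
        if 0 < k < n / 2:  # fold in the matching negative-frequency bin
            p *= 2
        psd.append(p)
    return psd

def psd_to_db_spl(psd_volts, sensitivity_v_per_pa):
    """Convert V^2/Hz to dB re (20 uPa)^2/Hz using an assumed sensitivity."""
    floor = 1e-30  # avoids log10(0) for empty bins
    return [10 * math.log10(max(p / sensitivity_v_per_pa ** 2, floor) / P_REF ** 2)
            for p in psd_volts]

# Hypothetical input: 1 V amplitude, 5 kHz sine sampled at 40 kHz, 64 samples.
fs = 40000
volts = [math.sin(2 * math.pi * 5000 * m / fs) for m in range(64)]
psd = one_sided_psd(volts, fs)
total_power = sum(psd) * fs / len(volts)  # integrates the PSD back to V^2
db = psd_to_db_spl(psd, sensitivity_v_per_pa=0.05)
```

Note the 10*log10 rather than 20*log10: the PSD is already a power quantity. Integrating it over frequency recovers the mean square of the time signal, which is a handy sanity check on the scaling.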

Is there a fundamental difference between DSP for Audio Signal Processing and Sensor Signal Processing?

Audio is made up of multiple frequencies occurring at any given time, and we can perform the FFT to get the Frequency bins, but what does the concept of Frequency mean when it comes to Sensor data?
For example, a triaxial accelerometer somehow converts a voltage signal and produces acceleration readings in m/s^2. Is the FFT performed on those X, Y, Z readings or on the voltages sampled at Fs?
Am I overcomplicating things or is there a difference in mindset when performing DSP for Audio vs Sensor data?
A Fourier transform is a tool to convert functions or signals into something that is easier to work with. It is a mathematical tool. The result can have an easy physical interpretation, but that is not always the case.
Assume you have an object with constant mass and several periodic sin-like forces F_1*sin(c*t), F_2*sin(d*t), ... that act on the object. The total force is just the sum of those forces:
F(t) = F_1*sin(c*t) + F_2*sin(d*t) + ...
You get the particle's acceleration using Newton's 2nd law:
m * a(t) = F(t)
=> a(t) = F(t) / m = F_1/m * sin(c*t) + F_2/m * sin(d*t) + ...
Let's assume you measured a(t) but don't know the equation above. If you perform a Fourier transform, you can calculate the values of F_1/m, F_2/m, ... . That means the Fourier transform of the acceleration gives the amplitude of the force at each frequency divided by the object's mass.
This interpretation works because the Fourier transform is linear and so is the addition of forces (see Newton's 2nd law). If you describe something non-linear, chances are that there is no easy interpretation of the result of the transformation.
So when do you perform the FFT? It depends:
If you do it to improve your signal (remove noise), do it on the measured data.
If you want to analyse the physical object (resonances), do it on the acceleration.
(If the conversion is linear, e.g. ADC output to m/s^2 is a simple multiplication, it does not matter.)
I hope this makes things a bit clearer (at least from the physical point of view).
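The force example can be checked numerically. The sketch below uses made-up values (a 2 kg mass, forces of 6 N at 5 Hz and 3 N at 12 Hz, one second sampled at 64 Hz) and a naive DFT. It shows that the amplitude spectrum of a(t) returns F_1/m and F_2/m at the two driving frequencies.

```python
import cmath
import math

def amplitude_spectrum(samples):
    """Amplitude of each sinusoidal component for bins 0..n/2."""
    n = len(samples)
    amps = []
    for k in range(n // 2 + 1):
        x_k = sum(s * cmath.exp(-2j * math.pi * k * i / n) for i, s in enumerate(samples))
        scale = 1 if k == 0 or 2 * k == n else 2  # DC and Nyquist are not doubled
        amps.append(scale * abs(x_k) / n)
    return amps

# Assumed values for illustration only.
MASS = 2.0             # kg
F1, F2 = 6.0, 3.0      # force amplitudes in N
FREQ1, FREQ2 = 5, 12   # Hz
SAMPLE_RATE = 64       # Hz; one second of data, so bin k corresponds to k Hz

acceleration = [
    (F1 / MASS) * math.sin(2 * math.pi * FREQ1 * i / SAMPLE_RATE)
    + (F2 / MASS) * math.sin(2 * math.pi * FREQ2 * i / SAMPLE_RATE)
    for i in range(SAMPLE_RATE)
]
amps = amplitude_spectrum(acceleration)
recovered_f1 = amps[FREQ1] * MASS  # multiply by the mass to get the force back
recovered_f2 = amps[FREQ2] * MASS
```

Because both frequencies fall exactly on bins there is no leakage, and the recovered values match the inputs. With real measurements you would normally apply a window and expect the peaks to be smeared over neighbouring bins.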

fundamental frequency of female voice

According to what I have read on the internet, the normal range of the fundamental frequency of a female voice is 165 to 255 Hz.
I am using Praat and a Python library called Parselmouth to get the fundamental frequency values of a female voice in an audio file (.wav). However, I got some values that are over 255 Hz (e.g. 400+ Hz, 500 Hz).
Is it normal to get big values like this?
It is possible, but unlikely, if you are trying to capture the fundamental frequency (F0) of a speaking voice. It sounds likely that you are capturing a more easily resonating overtone (e.g. F1 or F2) instead.
My experiments with Praat give me the impression that, with good parameters, it will reliably extract F0.
What you'll want to do is to verify that by comparing the pitch curve with a spectrogram. Here's an example of a fitting made by Praat (female speaker):
You can see from the image that
Most prominent frequency seems to be F2
Around 200 Hz seems likely to be F0, since there's only noise below that (compared to before/after the segment)
Praat has calculated a good estimate of F0 for the voiced speech segments
If, after a visual inspection, it seems that you are getting wrong results, you can try to tweak the parameters. Window length greatly affects the frequency resolution.
If you can't capture frequencies this low, you should try increasing the window length - the intuition is that it gives the algorithm a better chance at finding slowly changing periodic features in the data.
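To illustrate why window length limits the lowest pitch you can find, here is a toy autocorrelation pitch estimator in Python. It is not Praat's algorithm and not part of Parselmouth. The function name, the search range and the synthetic "voice" (a 200 Hz fundamental with a stronger 400 Hz overtone) are assumptions made for the sketch.

```python
import math

def estimate_f0(samples, sample_rate, f_min, f_max):
    """Return sample_rate / lag for the lag with the largest raw
    autocorrelation, searching only lags that correspond to f_min..f_max."""
    min_lag = int(sample_rate / f_max)
    max_lag = int(sample_rate / f_min)
    if max_lag >= len(samples):
        raise ValueError("window too short to test a period of 1/f_min")
    best_lag, best_score = min_lag, float("-inf")
    for lag in range(min_lag, max_lag + 1):
        score = sum(samples[i] * samples[i + lag] for i in range(len(samples) - lag))
        if score > best_score:
            best_lag, best_score = lag, score
    return sample_rate / best_lag

# Synthetic 40 ms window at 8 kHz: 200 Hz fundamental plus a louder 400 Hz overtone.
fs = 8000
window = [math.sin(2 * math.pi * 200 * i / fs) + 1.5 * math.sin(2 * math.pi * 400 * i / fs)
          for i in range(320)]
f0 = estimate_f0(window, fs, f_min=75, f_max=500)
```

Here the estimate comes out at the 200 Hz fundamental even though the overtone is louder, because the waveform only repeats every 1/200 s. Passing a 10 ms slice (window[:80]) with the same f_min raises the ValueError: the window no longer spans one full period of the lowest candidate pitch, which is the intuition behind increasing the window length.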

Frequency analysis of very short signal in GNU Octave

I have some very short signals from an oscilloscope (50k-200k samples), recorded over about 2 ms. They are acoustic signals capturing the spark of an ESD (electrostatic discharge).
I'd like to get some frequency data from these signals in the near-acoustic frequency range (up to about 30 kHz), with as high a time resolution as possible.
I have tried plotting a spectrogram (specgram in Octave) to view the signal, but the output is not really useful. Using specgram( x, N, fs );, where x is my signal sampled at rate fs, I get a plot starting at very high frequencies of about 500 MHz for low values of N. I get better frequency resolution for large N (like 2^12 to 2^13), but then the window is too wide and I get only 2 spectrum values over the whole signal length.
I understand that this may be a limitation of the Fourier transform, which is probably what the specgram function uses (actually, I don't know much about signal analysis).
Is there any other way to get frequency information (as a function of time) for this kind of signal? I've read something about wavelets, but when I tried using the dwt function of the signal package, I received this error:
error: 'wfilters' undefined near line 51 column 14
error: called from
dwt at line 51 column 12
Even if this would work, I am not so sure if I'd know how to actually use the output of those wavelet functions ...
To get audio-frequency information from data at such a high sample rate, you will need to obtain a sample vector long enough to contain at least a few whole cycles at audio frequencies, e.g. many tens of milliseconds of contiguous samples, which may or may not be more than your scope can gather. To process this amount of data reasonably, you might low-pass filter the samples so that they contain only audio frequencies, and then resample to a lower sample rate that is still above twice the filter cut-off frequency. You will then end up with a much shorter sample vector to feed to an FFT for your audio spectrum analysis.
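The numbers behind this advice can be worked out with a few lines of Python. The sketch assumes the upper end of the question's figures (200k samples over 2 ms, i.e. a 100 MHz sample rate) and a 30 kHz cut-off. The helper names are hypothetical.

```python
import math

def stft_resolution(sample_rate, window_length):
    """Return (bin spacing in Hz, window duration in seconds)."""
    return sample_rate / window_length, window_length / sample_rate

def max_decimation_factor(sample_rate, cutoff_hz):
    """Largest integer q such that sample_rate / q is still above 2 * cutoff_hz."""
    return math.ceil(sample_rate / (2 * cutoff_hz)) - 1

fs = 200_000 / 2e-3                                   # 100 MHz, derived from the question
bin_hz, window_s = stft_resolution(fs, 2 ** 13)       # one of the N values that was tried
q = max_decimation_factor(fs, 30e3)
best_bin_hz = 1 / 2e-3                                # whole 2 ms record used as one window
```

At 100 MHz a window of 2^13 samples lasts only about 82 microseconds and gives bins about 12 kHz apart, so the whole audio band falls into the first two or three bins. Even using the entire 2 ms record as one window, the bin spacing cannot get finer than about 500 Hz, which is why a longer recording is needed before audio-band detail becomes visible. After low-pass filtering, the data could be decimated by a factor of up to 1666 under these assumptions.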

Can FFT length affect filtering accuracy?

I am designing a fractional delay filter, and my order-5 Lagrange coefficients h(n) have 6 taps in the time domain. I tested convolving h(n) with x(n), a signal of 5000 samples, in MATLAB, and the result seems OK. When I tried the FFT and IFFT method instead, the output was totally wrong. My FFT is computed with 8192 points in the frequency domain, which is the nearest power of 2 for a 5000-sample signal. For the IFFT, I convert the 8192 frequency-domain values back to 5000 samples in the time domain. So why does this work with convolution but not with FFT multiplication? Does converting my 6-tap h(n) to 8192 points in the frequency domain cause the problem?
I have also tried the overlap-save method, which performs the FFT and multiplication on smaller chunks of x(n), processing 5 chunks separately. The result seems slightly better than before: at least I can see the waveform pattern, but it is still slightly distorted. Any idea what goes wrong, and what the solution is? Thank you.
The reason I am implementing circular convolution in the frequency domain instead of the time domain is that I am trying to merge the Lagrange filter with another low-pass filter in the frequency domain, so that the implementation is more efficient. I also believe that filtering in the frequency domain will be much faster than convolution in the time domain. The LP filter has 120 taps in the time domain. Due to memory constraints, the raw data including the padding is limited to 1024 samples, and so is the number of FFT bins.
My Lagrange filter has only 6 taps, far fewer than 1024, and I suspect that taking the FFT of 6 taps into 1024 frequency bins introduces an error. Here is my MATLAB code for the Lagrange filter only. It is test code, not implementation code, and it's a bit messy, sorry about that. I would really appreciate more advice on this problem. Thank you.
t=1:5000;
fs=2.5*(10^12);
A=70000;
x=A*sin(2*pi*10.*t.*(10^6).*t./fs);
delay=0.4;
N=5;
n = 0:N;
h = ones(1,N+1);
for k = 0:N
    index = find(n ~= k);
    h(index) = h(index) * (delay-k)./ (n(index)-k);
end
pad=zeros(1,length(h)-1);
out=[];
H=fft(hh,1024);
H=fft([h zeros(1,1024-length(h))]);
for i=0:1:ceil(length(x)/(1024-length(h)+1))-1
    if (i ~= ceil(length(x)/(1024-length(h)+1))-1)
        a=x(1,i*(1024-length(h)+1)+1:(i+1)*(1024-length(h)+1));
    else
        temp=x(1,i*(1024-length(h)+1)+1:length(x));
        a=[temp zeros(1,1024-length(h)+1-length(temp))];
    end
    xx=[pad a];
    X=fft(xx,1024);
    Y=H.*X;
    y=abs(ifft(Y,1024));
    out=[out y(1,length(h):length(y))];
    pad=y(1,length(a)+1:length(y));
end
Some comments:
The nearest power of two is actually 4096. Do you expect the remaining 904 samples to contribute much? I would guess that they are significant only if you are looking for relatively low-frequency features.
How did you pad your signal out to 8192 samples? Padding your sample out to 8192 implies that approximately 40% of your data is "fictional". If you used zeros to lengthen your dataset, you likely injected a step change at the pad point - which implies a lot of high-frequency content.
A short code snippet demonstrating your methods couldn't hurt.
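As a point of comparison, here is a small Python sketch (standard library only, naive DFT, deliberately tiny sizes) of frequency-domain filtering with the same order-5 Lagrange taps at delay 0.4. It assumes the usual rule that both sequences are zero-padded to at least len(x) + len(h) - 1 points, so that circular convolution equals linear convolution, and it takes the real part of the inverse transform. All function names are made up for the example.

```python
import cmath
import math

def dft(values, inverse=False):
    """Naive DFT or inverse DFT; fine for small demonstration sizes."""
    n = len(values)
    sign = 1 if inverse else -1
    out = [sum(v * cmath.exp(sign * 2j * math.pi * k * m / n) for m, v in enumerate(values))
           for k in range(n)]
    return [c / n for c in out] if inverse else out

def lagrange_taps(delay, order):
    """h[n] = product over k != n of (delay - k) / (n - k)."""
    taps = []
    for n in range(order + 1):
        tap = 1.0
        for k in range(order + 1):
            if k != n:
                tap *= (delay - k) / (n - k)
        taps.append(tap)
    return taps

def direct_convolution(x, h):
    y = [0.0] * (len(x) + len(h) - 1)
    for i, xi in enumerate(x):
        for j, hj in enumerate(h):
            y[i + j] += xi * hj
    return y

def fft_convolution(x, h, fft_size):
    if fft_size < len(x) + len(h) - 1:
        raise ValueError("fft_size too small: circular wrap-around would occur")
    xp = list(x) + [0.0] * (fft_size - len(x))
    hp = list(h) + [0.0] * (fft_size - len(h))
    product = [a * b for a, b in zip(dft(xp), dft(hp))]
    y = dft(product, inverse=True)
    return [c.real for c in y[:len(x) + len(h) - 1]]  # real part, not magnitude

h = lagrange_taps(0.4, 5)
x = [math.sin(2 * math.pi * 3 * n / 20) for n in range(20)]
y_direct = direct_convolution(x, h)
y_fft = fft_convolution(x, h, 32)
max_error = max(abs(a - b) for a, b in zip(y_direct, y_fft))
```

In this sketch the two results agree to rounding error, which suggests that expanding 6 taps to a long FFT is not a problem in itself. Two things in the posted MATLAB code do look suspicious, though. abs(ifft(Y,1024)) throws away the sign of every output sample, where real(...) would keep it. And the overlap pad is taken from the output y, whereas textbook overlap-save carries over the last length(h)-1 samples of the input block.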

Resources