Is there a fundamental difference between DSP for Audio Signal Processing and Sensor Signal Processing? - signal-processing

Audio is made up of multiple frequencies occurring at any given time, and we can perform the FFT to get the Frequency bins, but what does the concept of Frequency mean when it comes to Sensor data?
For example, a Triaxial Accelerometer somehow converts a voltage signal and produces acceleration readings in ms^-2. Is the FFT performed with those X,Y,Z readings or the voltages sampled at Fs.
Am I overcomplicating things or is there a difference in mindset when performing DSP for Audio vs Sensor data?

A Fourier transform is tool to convert functions or signals into something that is easier to work with. It is a mathematical tool. The result can have an easy physical interpretation but that is not always the case.
Assume you have an object with constant mass and several periodic sin-like forces F_1*sin(c*t), F_2*sin(d*t), ... that act on the object. The total force is just the sum of those forces:
F(t) = F_1*sin(c*t) + F_2*sin(d*t) + ...
You get the particle's acceleration using Newton's 2nd law:
m * a(t) = F(t)
=> a(t) = F(t) / m = F_1/m * sin(c*t) + F_2/m * sin(d*t) + ...
Let's assume you measured a(t) but don't know the equation above. It you perform a Fourier transformation you can calculate the values of F_1/m, F_2/m, ... . That means your Fourier transform of the the acceleration is the amplitude of the force at the given frequency over the object's mass.
This interpretation works because the Fourier transform is linear and so is the adding of forces (See Newtons 2nd law). If you describe something non-linear chances are that there is no easy interpretation of the result of the transformation.
So when do you perform the FFT? It depends:
If you do it to improve you signal (remove noise) do it on the measured data.
If you want to analyse the physical object (resonances) do it on the acceleration.
(If the conversion is linear (ADC output to m/s^2 is a simple multiplication) it does not matter.)
I hope this makes things a bit clearer (at least from the physical point of view).

Related

Does averaging/filtering an autocorrelation input signal wrongfully change the output?

I'm investigating sensor measurements of NO2 in the atmosphere over the course of several days. My first interest is to find periodicity of the data to which end I'm using autocorrelation.
My problem is that the praxis seems to be to use moving average as well as filtering of the measurements; moving average of about 10-50 data points and readings set above the sensors maximum reading of 200µg/m³ is set to 200µg/m³ (as far as my understanding goes on that).
Anyhow... When performing my autocorrelation I found that processing the raw signal or the averaged/filtered signal gives wildly different results, as can be seen in appended autocorrelation figure (bottom), which leads me to my question:
When performing autocorrelation, do I wrongfully change the result by using an averaged/filtered input signal to my autocorrelation function? And if so, which way is "correct"?
On top: RAW sensor measurement of NO2 concentration, NO moving average/filtering! Middle: measurement processed with a moving average of 30 data points and any reading >200 is set to 200. Bottom: autocorrelation of the two above measurements, with some slight smoothing. Right scale is inactive and possible end effects are not interesting.
Comments on the figure: I know it looks bad/weird that the moving average signal is flat most of the time, and that this flatness is not at a constant 200 (max). This is really not of interest, the behavior of autocorrelation is my concern.
Applying a moving average before autocorrelating is the same as applying the moving average twice (once forward and once backward) after autocorrelating.
Let * denote convolution and ^R denote time-reversal of a signal. x and m are your input signal and moving average filter
AutoCorrelate(x*m) = (x*m) * (x*m)^R
= x * m * x^R * m^R
= x * x^R * m * m^R
= AutoCorrelate(x) * (m * m^R)
Note that a moving average filter is the same shape as it's time-reversal, so by filtering the signal before autocorrelation, you have filtered the autocorrelation twice.
Since a moving average filter is a low-pass this explains why the curves in your autocorrelation are smoothed out.
Whether or not this is appropriate really depends on your application. If the moving average filter only removes noise, then it's a good idea. If the moving average removes important parts of the signal that indicate its timing, then it's not a good idea.

Discrete Wavelet Transform (Daubechies wavelet) of an array complex numbers

Say, I have a signal represented as an array of real numbers y = [1,2,0,4,5,6,7,90,5,6]. I can use Daubechies-4 coefficients D4 = [0.482962, 0.836516, 0.224143, -0.129409], and apply a wavelet transform to receive high- and low-frequencies of the signal. So, the high frequency component will be calculated like this:
high[v] = y[2*v]*D4[0] + y[2*v+1]*D4[1] + y[2*v+2]*D4[2] + y[2*v+3]*D4[3],
and the low frequency component can be calculated using other D4 coefs permutation.
The question is: what if y is complex array? Do I just multiply and add complex numbers to receive subbands, or is it correct to get amplitude and phase, treat each of them like a real number, do the wavelet transform for them, and then restore complex number array for each subband using formulas real_part = abs * cos(phase) and imaginary_part = abs * sin(phase)?
To handle the case of complex data, you're looking at the Complex Wavelet Transform. It's actually a simple extension to the DWT. The most common way to handle complex data is to treat the real and imaginary components as two separate signals and perform a DWT on each component separately. You will then receive the decomposition of the real and imaginary components.
This is commonly known as the Dual-Tree Complex Wavelet Transform. This can best be described by the figure below that I pulled from Wikipedia:
Source: Wikipedia
It's called "dual-tree" because you have two DWT decompositions happening in parallel - one for the real component and one for the imaginary. In the above diagram, g0/h0 represent the low-pass and high-pass components of the real part of the signal x and g1/h1 represent the low-pass and high-pass components of the imaginary part of the signal x.
Once you decompose the real and imaginary parts into their respective DWT decompositions, you can combine them to get the magnitude and/or phase and proceed to the next step or whatever you desire to do with them.
The mathematical proof regarding the correctness of this is outside the scope of what we're talking about, but if you would like to see how this got derived, I refer you to the canonical paper by Kingsbury in 1997 in the work Image Processing with Complex Wavelets - http://citeseerx.ist.psu.edu/viewdoc/download;jsessionid=835E60EAF8B1BE4DB34C77FEE9BBBD56?doi=10.1.1.55.3189&rep=rep1&type=pdf. Pay close attention to the noise filtering of images using the CWT - this is probably what you're looking for.

kalman filter with redundant state measurements

I am trying to implement a kalman filter for orientation detection. just like most other implementations I found online, I will be using a gyro and accelerometer to measure the pitch and roll, however I intend to also add horizon detection. This will give me a second reading for the pitch and roll. This means that I will have two means of measuring the current state, accelerometer and horizon detection whilst the gyro will be used for control.
So far I have implemented the filter on the sensor data and horizon detection separately based on this tutorial: http://blog.tkjelectronics.dk/2012/09/a-practical-approach-to-kalman-filter-and-how-to-implement-it/
Which part of the kalman filter do I have to modify for the algorithm to choose the best reading between the predicted state, accelerometer reading and horizon detected reading? Any help, links to papers or sites will be appreciated
thanks in advance for your help
The KF consists of two parallel components:
1. the estimated state, AND
2. the uncertainty in that estimate (specifically, the covariance matrix of the state components).
When combining 2 estimates of the state, the standard method takes a weighted average, with the weights being the inverses of the (co)variances. That is, the more certain of the 2 estimates (smaller covariance) is weighted more highly than the other.
So if you are not already tracking the covariances for your 2 estimates, you will need to do this.
For a scalar state X with 2 estimates X' and X", you would have a variance for each estimate: V' and V", with inverses C' = 1/V' and C" = 1/V". (The "certainties" C are easier to use than the variances V.)
Then the MMSE estimate (which is what the KF attempts to optimize) of the state is given by: Xmmse = (X'/V' + X"/V") / (1/V' + 1/V").
[There is also a corresponding update to V at this point, based on V' and V".]
For vectorial states, V will be replaces by a covariance matrix, and the divisions will become matrix inverses. In this case, it may be easier to track the inverses C directly: Xmmse = (C' + C") \ (C' * X' + C" * X") [and corresponding update to C],
where "\" denotes pre-multiplication by the inverse of the first factor, C'+C".
I hope this helps.
[I apologize for the poor formatting here. Stack Overflow has inferred that the algebraic expressions are code, and demanded formatting them as code before it would let me answer. They aren't, so I couldn't.]

N step fft in D language

I am using fft function from std.numeric
Complex!double[] resultfft = fft(timeDomainAmplitudeVal);
The parameter timeDomainAmplitudeVal is audio amplitude data. Sample rate 44100 hz and there is 131072(2^16) samples
I am seeing that resultfft has the same size as timeDomainAmplitudeVal(131072) which does not fits my project(also makes no sense) . I need to be able to divide FFT to N equally spaced frequencies. And I need this N to be defined by me .
Is there anyway to implement this with std.numeric.fft or can you have any advices for fft library?
Ps: I will be glad to hear if some DSP libraries exist also
That's just how Fourier transforms work in the practical number-crunching world. Give S samples of signal, get S amplitudes. (Ignoring issues with complex numbers and symmetries.)
If you want N amplitudes, you'll have to interpolate the S-points amplitudes you get from FFT. Your biggest decision is to choose between linear, cubic, truncated sinc, etc.
Altnernative: resample the original audio signal to have your desired N samples in the same overall time interval. Then FFT it.
take a look at pfft, a fast FFT written in D.
http://jerro.github.io/pfft/doc/pfft.pfft.html
or numpy & Pyd
http://docs.scipy.org/doc/numpy/reference/routines.fft.html
http://pyd.dsource.org/
HTH
This is absolutely normal that the FFT gives the same data length.
Here some C++ code to perform windows FFT analysis with overlap and optional "zero-phase" ordering. http://pastebin.com/4YKgbed1
What do FFT coefficients mean?
Question: "OK so I've done the FFT and I'm said I can recover the original signal. Now, what are these coefficients."
Answer: "You can think of coefficient i as representing the phase and amplitude of frequencies from SR*i/(2*N) to SR*(i+1)/(2*N). This is a helpful metaphor. But a more accurate view is that coefficient i is the contribution of a sine of frequency SR*i/(2*N) in a reconstruction of the original input chunk."

Why do the convolution results have different lengths when performed in time domain vs in frequency domain?

I'm not a DSP expert, but I understand that there are two ways that I can apply a discrete time-domain filter to a discrete time-domain waveform. The first is to convolve them in the time domain, and the second is to take the FFT of both, multiply both complex spectrums, and take IFFT of the result. One key difference in these methods is the second approach is subject to circular convolution.
As an example, if the filter and waveforms are both N points long, the first approach (i.e. convolution) produces a result that is N+N-1 points long, where the first half of this response is the filter filling up and the 2nd half is the filter emptying. To get a steady-state response, the filter needs to have fewer points than the waveform to be filtered.
Continuing this example with the second approach, and assuming the discrete time-domain waveform data is all real (not complex), the FFT of the filter and the waveform both produce FFTs of N points long. Multiplying both spectrums IFFT'ing the result produces a time-domain result also N points long. Here the response where the filter fills up and empties overlap each other in the time domain, and there's no steady state response. This is the effect of circular convolution. To avoid this, typically the filter size would be smaller than the waveform size and both would be zero-padded to allow space for the frequency convolution to expand in time after IFFT of the product of the two spectrums.
My question is, I often see work in the literature from well-established experts/companies where they have a discrete (real) time-domain waveform (N points), they FFT it, multiply it by some filter (also N points), and IFFT the result for subsequent processing. My naive thinking is this result should contain no steady-state response and thus should contain artifacts from the filter filling/emptying that would lead to errors in interpreting the resulting data, but I must be missing something. Under what circumstances can this be a valid approach?
Any insight would be greatly appreciated
The basic problem is not about zero padding vs the assumed periodicity, but that Fourier analysis decomposes the signal into sine waves which, at the most basic level, are assumed to be infinite in extent. Both approaches are correct in that the IFFT using the full FFT will return the exact input waveform, and both approaches are incorrect in that using less than the full spectrum can lead to effects at the edges (that usually extend a few wavelengths). The only difference is in the details of what you assume fills in the rest of infinity, not in whether you are making an assumption.
Back to your first paragraph: Usually, in DSP, the biggest problem I run into with FFTs is that they are non-causal, and for this reason I often prefer to stay in the time domain, using, for example, FIR and IIR filters.
Update:
In the question statement, the OP correctly points out some of the problems that can arise when using FFTs to filter signals, for example, edge effects, that can be particularly problematic when doing a convolution that is comparable in the length (in the time domain) to the sampled waveform. It's important to note though that not all filtering is done using FFTs, and in the paper cited by the OP, they are not using FFT filters, and the problems that would arise with an FFT filter implementation do not arise using their approach.
Consider, for example, a filter that implements a simple average over 128 sample points, using two different implementations.
FFT: In the FFT/convolution approach one would have a sample of, say, 256, points and convolve this with a wfm that is constant for the first half and goes to zero in the second half. The question here is (even after this system has run a few cycles), what determines the value of the first point of the result? The FFT assumes that the wfm is circular (i.e. infinitely periodic) so either: the first point of the result is determined by the last 127 (i.e. future) samples of the wfm (skipping over the middle of the wfm), or by 127 zeros if you zero-pad. Neither is correct.
FIR: Another approach is to implement the average with an FIR filter. For example, here one could use the average of the values in a 128 register FIFO queue. That is, as each sample point comes in, 1) put it in the queue, 2) dequeue the oldest item, 3) average all of the 128 items remaining in the queue; and this is your result for this sample point. This approach runs continuously, handling one point at a time, and returning the filtered result after each sample, and has none of the problems that occur from the FFT as it's applied to finite sample chunks. Each result is just the average of the current sample and the 127 samples that came before it.
The paper that OP cites takes an approach much more similar to the FIR filter than to the FFT filter (note though that the filter in the paper is more complicated, and the whole paper is basically an analysis of this filter.) See, for example, this free book which describes how to analyze and apply different filters, and note also that the Laplace approach to analysis of the FIR and IIR filters is quite similar what what's found in the cited paper.
Here's an example of convolution without zero padding for the DFT (circular convolution) vs linear convolution. This is the convolution of a length M=32 sequence with a length L=128 sequence (using Numpy/Matplotlib):
f = rand(32); g = rand(128)
h1 = convolve(f, g)
h2 = real(ifft(fft(f, 128)*fft(g)))
plot(h1); plot(h2,'r')
grid()
The first M-1 points are different, and it's short by M-1 points since it wasn't zero padded. These differences are a problem if you're doing block convolution, but techniques such as overlap and save or overlap and add are used to overcome this problem. Otherwise if you're just computing a one-off filtering operation, the valid result will start at index M-1 and end at index L-1, with a length of L-M+1.
As to the paper cited, I looked at their MATLAB code in appendix A. I think they made a mistake in applying the Hfinal transfer function to the negative frequencies without first conjugating it. Otherwise, you can see in their graphs that the clock jitter is a periodic signal, so using circular convolution is fine for a steady-state analysis.
Edit: Regarding conjugating the transfer function, the PLLs have a real-valued impulse response, and every real-valued signal has a conjugate symmetric spectrum. In the code you can see that they're just using Hfinal[N-i] to get the negative frequencies without taking the conjugate. I've plotted their transfer function from -50 MHz to 50 MHz:
N = 150000 # number of samples. Need >50k to get a good spectrum.
res = 100e6/N # resolution of single freq point
f = res * arange(-N/2, N/2) # set the frequency sweep [-50MHz,50MHz), N points
s = 2j*pi*f # set the xfer function to complex radians
f1 = 22e6 # define 3dB corner frequency for H1
zeta1 = 0.54 # define peaking for H1
f2 = 7e6 # define 3dB corner frequency for H2
zeta2 = 0.54 # define peaking for H2
f3 = 1.0e6 # define 3dB corner frequency for H3
# w1 = natural frequency
w1 = 2*pi*f1/((1 + 2*zeta1**2 + ((1 + 2*zeta1**2)**2 + 1)**0.5)**0.5)
# H1 transfer function
H1 = ((2*zeta1*w1*s + w1**2)/(s**2 + 2*zeta1*w1*s + w1**2))
# w2 = natural frequency
w2 = 2*pi*f2/((1 + 2*zeta2**2 + ((1 + 2*zeta2**2)**2 + 1)**0.5)**0.5)
# H2 transfer function
H2 = ((2*zeta2*w2*s + w2**2)/(s**2 + 2*zeta2*w2*s + w2**2))
w3 = 2*pi*f3 # w3 = 3dB point for a single pole high pass function.
H3 = s/(s+w3) # the H3 xfer function is a high pass
Ht = 2*(H1-H2)*H3 # Final transfer based on the difference functions
subplot(311); plot(f, abs(Ht)); ylabel("abs")
subplot(312); plot(f, real(Ht)); ylabel("real")
subplot(313); plot(f, imag(Ht)); ylabel("imag")
As you can see, the real component has even symmetry and the imaginary component has odd symmetry. In their code they only calculated the positive frequencies for a loglog plot (reasonable enough). However, for calculating the inverse transform they used the values for the positive frequencies for the negative frequencies by indexing Hfinal[N-i] but forgot to conjugate it.
I can shed some light to the reason why "windowing" is applied before FFT is applied.
As already pointed out the FFT assumes that we have a infinite signal. When we take a sample over a finite time T this is mathematically the equivalent of multiplying the signal with a rectangular function.
Multiplying in the time domain becomes convolution in the frequency domain. The frequency response of a rectangle is the sync function i.e. sin(x)/x. The x in the numerator is the kicker, because it dies down O(1/N).
If you have frequency components which are exactly multiples of 1/T this does not matter as the sync function is zero in all points except that frequency where it is 1.
However if you have a sine which fall between 2 points you will see the sync function sampled on the frequency point. It lloks like a magnified version of the sync function and the 'ghost' signals caused by the convolution die down with 1/N or 6dB/octave. If you have a signal 60db above the noise floor, you will not see the noise for 1000 frequencies left and right from your main signal, it will be swamped by the "skirts" of the sync function.
If you use a different time window you get a different frequency response, a cosine for example dies down with 1/x^2, there are specialized windows for different measurements. The Hanning window is often used as a general purpose window.
The point is that the rectangular window used when not applying any "windowing function" creates far worse artefacts than a well chosen window. i.e by "distorting" the time samples we get a much better picture in the frequency domain which closer resembles "reality", or rather the "reality" we expect and want to see.
Although there will be artifacts from assuming that a rectangular window of data is periodic at the FFT aperture width, which is one interpretation of what circular convolution does without sufficient zero padding, the differences may or may not be large enough to swamp the data analysis in question.

Resources