FFT and decibel scales - ios

If I take audio data on an iPhone (i.e. real) data, perform an FFT and then take the magnitudes (Re^2 + Im^2).
These vary from >0 to some large numbers, so I do 10log(n) to get it in dB.
This gives me outputs that are negative (for the inputs that were < 1) to positive.
But the examples I've seen for this (and also drawing the spectrum in Sonic Visualiser) always have positive spectrums when measured in dB.
So what have I missed?!
On a wider note, as I understand it decibels are a ratio, so in this context when turning the FFT magnitudes into dB, what are they a ratio to?

The simple answer is that, for the most part, you can add an arbitrary number to the dB value to make the values all positive, or all negative, or whatever you prefer. With an uncalibrated microphone, like on the iPhone, this is all that makes sense anyway, since all you know are relative values.
For a more advanced technical approach, using a calibrated microphone, you could reference everything using dB (SPL), as a reasonable standard, but this is a hassle, and not meaningful in your use case anyway.
Rationale:
The main reason that shifting by an arbitrary amount is that the log doesn't report the units of measurement. For example, even if you know the input amplitude is 0.1 Pascal, it's completely valid to say this is 100 milliPascal, where you'd be taking the log of 100 rather than 0.1 (so the log values are either 2 or -1). Both are completely valid and the choice is entirely arbitrary. When comparing to a standard reference, as in dB SPL, note that it's done as a ratio, log(P/Pref), removing the impact of changing the units.

Since an FFT is a linear operator, The scale of the output of an FFT is related to the scale of the data input to the FFT. The scale of the input to the FFT on your iPhone depends on the gain of the mic, audio filters, potentially the AGC, and the DAC reference. Since the latter are all undocumented and can vary (by position of the mic, model of device, input gain which may depend on the audio session configuration, and etc.) you won't know the ratio unless you perform some sort of calibration against a known reference.

Related

How do the cuSPARSE and cuBLAS libraries deal with memory allocated using cudaMallocPitch?

I am implementing a simple routine that performs sparse matrix - dense matrix multiplication using cusparseScsrmm from cuSPARSE. This is part of a bigger application that could allocate memory on GPU using cudaMalloc (more than 99% of the time) or cudaMallocPitch (very rarely used). I have a couple of questions regarding how cuSPARSE deals with pitched memory:
1) I passed in pitched memory into the cuSPARSE routine but the results were incorrect (as expected, since there is no way to pass in the pitch as an argument). Is there a way to get these libraries working with memory allocated using cudaMallocPitch?
2) What is the best way to deal with this? Should I just add a check in the calling function, to enforce that the memory not be allocated using pitched mode?
For sparse matrix operations, the concept of pitched data has no relevance anyway.
For dense matrix operations most operations don't directly support a "pitch" to the data per se, however various operations can operate on a sub-matrix. With a particular caveat, it should be possible for such operations to handle pitched or unpitched data. Any time you see a CUBLAS (or CUSPARSE) operation that accepts "leading dimension" arguments, those arguments could be used to encompass a pitch in the data.
Since the "leading dimension" parameter is specified in matrix elements, and the pitch is (usually) specified in bytes, the caveat here is that the pitch is evenly divisible by the size of the matrix element in question, so that the pitch (in bytes) can be converted to a "leading dimension" parameter specified in matrix elements. I would expect that this would be typically possible for char, int, float, double and similar types, as I believe the pitch quantity returned by cudaMallocPitch will usually be evenly divisible by 16. But there is no stated guarantee of this, so proper run-time checking is advised, if you intend to use this approach.
For example, it should be possible to perform a CUBLAS matrix-matrix multiply (gemm) on pitched data, with appropriate specification of the lda, ldb and ldc parameters.
The operation you indicate does offer such leading dimension parameters for the dense matrices involved.
If 99% of your use-cases don't use pitched data, I would either not support pitched data at all, or else, for operations where no leading dimension parameters are available, copy the pitched data to an unpitched buffer for use in the desired operation. A device-to-device pitched to unpitched copy can run at approximately the rate of memory bandwidth, so it might be fast enough to not be a serious issue for 1% of the use cases.

Objective-C normalise dB level from iPhone's built in mic

I am building a dB meter as part of an app I am creating, I have got it receiving the peak and average power's from the mic on my iPhone (values ranging from -60 to 0.4) and now I need to figure out how to convert these power levels into db levels like the ones in this chart http://www.gcaudio.com/resources/howtos/loudness.html
Does anyone have any idea how I could do this? I can't figure an algorithm out for the life of me and it is kind of crucial as the whole point of the app is to do with real word audio levels, if that makes sense.
Any help will be greatly appreciated.
Apple's done a pretty good job at making the frequency response of the microphone flat and consistent between devices, so the only thing you'll need to determine is a calibration point. For this you will require a reference audio source and calibrated Sound Pressure level meter.
It's worth noting that sound pressure measurements are often measured against the A-weighting scale. This is frequency weighted for the human aural system. In order to measure this, you will need to apply the relevant filter curve to results taken form the microphone.
Also be aware of the difference between peak and mean (in this case RMS) measurements.
As far as I can tell from looking at the documentation, the "power level" returned from an AVAudioRecorder (assuming that's what you're using – you didn't specify) is already in decibels. See here from the documentation for averagePowerForChannel:
Return Value
The current average power, in decibels, for the sound
being recorded. A return value of 0 dB indicates full scale, or
maximum power; a return value of -160 dB indicates minimum power (that
is, near silence).

Complex interpolation on an FPGA

I have a problem in that I need to implement an algorithm on an FPGA that requires a large array of data that is too large to fit into block or distributed memory. The array contains complex fixed-point values, and it turns out that I can do a good job by reducing the total number of stored values through decimation and then linearly interpolating the interim values on demand.
Though I have DSP blocks (and so fixed-point hardware multipliers) which could be used trivially for real and imaginary part interpolation, I actually want to do the interpolation on the amplitude and angle (of the polar form of the complex number) and then convert the result to real-imaginary form. The data can be stored in polar form if it improves things.
I think my question boils down to this: How should I quickly convert between polar complex numbers and real-imaginary complex numbers (and back again) on an FPGA (noting availability of DSP hardware)? The solution need not be exact, just close, but be speed optimised. Alternatively, better strategies are gladly received!
edit: I know about cordic techniques, so this would be how I would do it in the absence of a better idea. Are there refinements specific to this problem I could invoke?
Another edit: Following from #mbschenkel's question, and some more thinking on my part, I wanted to know if there were any known tricks specific to the problem of polar interpolation.
In my case, the dominant variation between samples is a phase rotation, with a slowly varying amplitude. Since the sampling grid is known ahead of time and is regular, one trick could be to precompute some complex interpolation factors. So, for two complex values a and b, if we wish to find (N-1) intermediate equally spaced values, we can precompute the factor
scale = (abs(b)/abs(a))**(1/N)*exp(1j*(angle(b)-angle(a)))/N)
and then find each intermediate value iteratively as val[n] = scale * val[n-1] where val[0] = a.
This works well for me as I need the samples in order and I compute them all. For small variations in amplitude (i.e. abs(b)/abs(a) ~= 1) and 0 < n < N, (abs(b)/abs(a))**(n/N) is approximately linear (though linear is not necessarily better).
The above is all very good, but still results in a complex multiplication. Are there other options for approximating this? I'm interested in resource and speed constraints, not accuracy. I know I can do the rotation with CORDIC, but still need a pair of multiplications for the scaling, so I'm adding lots of complexity and resource usage for potentially limited results. I don't really have a feel for the convergence of CORDIC, so perhaps I just truncate early, or use lots of resources to converge quickly.

Phase difference between two signals?

I'm working on this embedded project where I have to resonate the transducer by calculating the phase difference between its Voltage and Current waveform and making it zero by changing its frequency. Where I(current) & V(Voltage) are the same frequency signals at any instant but not the fixed frequency signals approx.(47Khz - 52kHz). All I have to do is to calculate phase difference between these two signals. Which method will be most effective.
FFT of Two signals and then phase difference between the specific components
Or cross-correlation of two signals?
Or another if any ? Which method will give me most accurate result ? and with what resolution? Does sampling rate affects phase difference's resolution (minimum phase difference which can be sensed) ?
I'm new to Digital signal processing, in case of any mistake, correct me.
ADDITIONAL DETAILS:-
Noise In my system can be white/Gaussian Noise(Not significant) & Harmonics of Fundamental (Which might be significant one in resonant mismatch case).
Yes 4046 can be a good alternative with switching regulators. I'm working with (NCO/DDS) where I can scale/ reshape sinusoidal on ongoing basis.
Implementation of Analog filter will be very complex as I will require higher order filter with high roll-off rate for harmonic removal , so I'm choosing DSP based filter and its easy to work with MATLAB DSP Processors.
What sampling rate would you suggest for a ~50 KHz (47Khz-52KHz) system for achieving result in FFT or Goertzel with phase resolution of preferably =<0.1 degrees or less and frequency steps will vary from as small as ~1 to 2Hz . to 50 Hz-200Hz.
My frequency is variable 45KHz - 55Khz ... But will be known to my system... Knowing phase error for the last fed frequency is more desirable. After FFT AND DIGITAL FILTERING , IFFT can be performed for more noise free samples which can be used for further processing. So i guess FFT do both the tasks ...
But I'm wondering about the Phase difference accuracy cause thats the crucial part.
The Goertzel algorithm http://www.embedded.com/design/configurable-systems/4024443/The-Goertzel-Algorithm is a fairly efficient tone detection method that resolves the signal into real and imaginary components. I'll assume you can do the numeric to get the phase difference or just polarity, as you require.
Resolution versus time constant is a design tradeoff which this article highlights issues. http://www.mstarlabs.com/dsp/goertzel/goertzel.html
Additional
"What accuracy can be obtained?"
It depends...upon what you are faced with (i.e., signal levels, external noise, etc.), what hardware you have (i.e., adc, processor, etc.), and how you implement your solution (sample rate, numerical precision, etc.). Without the complete picture, I'll be guessing what you could achieve as the Goertzel approach is far from easy.
But I imagine for a high school project with good signal levels and low noise, an easier method of using the phase comparator (2 as it locks at zero degrees) of a 4046 PLL www.nxp.com/documents/data_sheet/HEF4046B.pdf will likely get you down to a few degrees.
One other issue if you have a high Q transducer is generating a high-resolution frequency. There is a method but that's another avenue.
Yet more
"Harmonics of Fundamental (Which might be significant)"... hmm hence the digital filtering;
but if the sampling rate is too low then there might be a problem with aliasing. Also, mismatched anti-aliasing filters are likely to take your whole error budget. A rule of thumb of ten times sampling frequency seems a bit low, and it being higher it will make the filter design easier.
Spatial windowing addresses off-frequency issues along with higher roll-off and attenuation and is described in this article. Sliding Spectrum Analysis by Eric Jacobsen and Richard Lyons in Streamlining Digital Signal Processing http://www.amazon.com/Streamlining-Digital-Signal-Processing-Guidebook/dp/1118278380
In my previous project after detecting either carrier, I then was interested in the timing of the frequency changes in immense noise. With carrier phase generation inconstancies, the phase error was never quiescent to be quantified, so I can't guess better than you what you might get with your project conditions.
Not to detract from chip's answer (I upvoted it!) but some other options are:
Cross correlation. Off the top of my head, I am not sure what the performance difference between that and the Goertzel algorithm will be, but both should be doable on an embedded system.
Ad-hoc methods. For example, I would try something like this: bandpass the signals to eliminate noise, find the peaks and measure the time difference between the peaks. This will probably be more efficient, and, provided you do a reasonable job throwing out outliers and handling wrap-around, should be extremely robust. The bandpass filters will, themselves, alter the phase, so you'll have to make sure you apply exactly the same filter to both signals.
If the input signal-to-noise ratios are not too bad, a computually efficient solution can be built based on zero crossing detection. Also, have a look at http://www.metrology.pg.gda.pl/full/2005/M&MS_2005_427.pdf for a nice comparison of phase difference detection algorithms, including zero-crossing ones.
Computing 1-bin of a DFT (or using the similar complex Goertzel block filter) will work if the signal frequency is accurately known. (Set the DFT bin or the Goertzel to exactly that frequency).
If the frequency isn't exactly known, you could try using an FFT with an FFTshift to interpolate the frequency magnitude peak, and then interpolate the phase at that frequency for each of the two signals. An FFT will also allow you to window the data, which may improve phase estimation accuracy if the frequency isn't exactly bin centered (or exactly the Goertzel filter frequency). Different windows may improve the phase estimation accuracy for frequencies "between bins". A Blackman-Nutall window will be better than a rectangular window, but there may be better window choices.
The phase measurement accuracy will depend on the S/N ratio, the length of time one samples the two (assumed stationary) signals, and possibly the window used.
If you have a Phase Locked Loop (PLL) that tracks each input, then you can subtract the phase coefficients (of the generator components) to determine offset between the phases. This would also be robust against noise.

Extrapolation using fft in octave

Using GNU octave, I'm computing a fft over a piece of signal, then eliminating some frequencies, and finally reconstructing the signal. This give me a nice approximation of the signal ; but it doesn't give me a way to extrapolate the data.
Suppose basically that I have plotted three periods and a half of
f: x -> sin(x) + 0.5*sin(3*x) + 1.2*sin(5*x)
and then added a piece of low amplitude, zero-centered random noise. With fft/ifft, I can easily remove most of the noise ; but then how do I extrapolate 3 more periods of my signal data? (other of course that duplicating the signal).
The math way is easy : you have a decomposition of your function as an infinite sum of sines/cosines, and you just need to extract a partial sum and apply it anywhere. But I don't quite get the programmatic way...
Thanks!
The Discrete Fourier Transform relies on the assumption that your time domain data is periodic, so you can just repeat your time domain data ad nauseam - no explicit extrapolation is necessary. Of course this may not give you what you expect if your individual component periods are not exact sub-multiples of the DFT input window duration. This is one reason why we typically apply window functions such as the Hanning Window prior to the transform.

Resources