Estimate time to multiply two 16-bit numbers on DSPs? - signal-processing

I am trying to procure a DSP board which allows very fast implementation of solving differential equations up to 10th order or so. The clock runs at 300 MHz (~3ns). Considering standard multiplication implementation on DSPs, how do I estimate the time to multiply two 16-bit numbers?
Thanks,
Mayank

Related

Sinusoids with frequencies that are random variales - What does the FFT impulse look like?

I'm currently working on a program in C++ in which I am computing the time varying FFT of a wav file. I have a question regarding plotting the results of an FFT.
Say for example I have a 70 Hz signal that is produced by some instrument with certain harmonics. Even though I say this signal is 70 Hz, it's a real signal and I assume will have some randomness in which that 70Hz signal varies. Say I sample it for 1 second at a sample rate of 20kHz. I realize the sample period probably doesn't need to be 1 second, but bear with me.
Because I now have 20000 samples, when I compute the FFT. I will have 20000 or (19999) frequency bins. Let's also assume that my sample rate in conjunction some windowing techniques minimize spectral leakage.
My question then: Will the FFT still produce a relatively ideal impulse at 70Hz? Or will there 'appear to be' spectral leakage which is caused by the randomness the original signal? In otherwords, what does the FFT look like of a sinusoid whose frequency is a random variable?
Some of the more common modulation schemes will add sidebands that carry the information in the modulation. Depending on the amount and type of modulation with respect to the length of the FFT, the sidebands can either appear separate from the FFT peak, or just "fatten" a single peak.
Your spectrum will appear broadened and this happens in the real world. Look e.g for the Voight profile, which is a Lorentizan (the result of an ideal exponential decay) convolved with a Gaussian of a certain width, the width being determined by stochastic fluctuations, e.g. Doppler effect on molecules in a gas that is being probed by a narrow-band laser.
You will not get an 'ideal' frequency peak either way. The limit for the resolution of the FFT is one frequency bin, (frequency resolution being given by the inverse of the time vector length), but even that (as #xvan pointed out) is in general broadened by the window function. If your window is nonexistent, i.e. it is in fact a square window of the length of the time vector, then you'll get spectral peaks that are convolved with a sinc function, and thus broadened.
The best way to visualize this is to make a long vector and plot a spectrogram (often shown for audio signals) with enough resolution so you can see the individual variation. The FFT of the overall signal is then the projection of the moving peaks onto the vertical axis of the spectrogram. The FFT of a given time vector does not have any time resolution, but sums up all frequencies that happen during the time you FFT. So the spectrogram (often people simply use the STFT, short time fourier transform) has at any given time the 'full' resolution, i.e. narrow lineshape that you expect. The FFT of the full time vector shows the algebraic sum of all your lineshapes and therefore appears broadened.
To sum it up there are two separate effects:
a) broadening from the window function (as the commenters 1 and 2 pointed out)
b) broadening from the effect of frequency fluctuation that you are trying to simulate and that happens in real life (e.g. you sitting on a swing while receiving a radio signal).
Finally, note the significance of #xvan's comment : phi= phi(t). If the phase angle is time dependent then it has a derivative that is not zero. dphi/dt is a frequency shift, so your instantaneous frequency becomes f0 + dphi/dt.

Optimal value of sampling frequency for guitar notes detection

I am running FFT algorithm to detect the music note played on a guitar.
The frequencies that I am interested are in the range 65.41Hz (C2) to 1864.7Hz (A#6).
If I set the sampling frequency of the input to 16KHz, the output of FFT would yield N points from 0Hz to 16KHz linearly. All the input I am interested would be in the first N/8 points approximately. The other N*7/8 points are of no use to me. They actually are decreasing my resolution.
From Nyquist's theory (https://en.wikipedia.org/wiki/Nyquist_frequency), the sampling frequency that is needed is just twice the maximum frequency one desires. In my case, this would be about 4KHz.
Is 4KHz really the ideal sampling frequency for a guitar tuning app?
Intuitively, one would feel a better sampling frequency would give you more accurate results. However, in this case, it seems having a lesser sampling frequency is better for improving the resolution. Regards.
You are confusing the pitch of a guitar note with spectral frequency. A guitar generates lots of overtones and harmonics at a much higher frequency than the pitch of a played note. Those higher harmonics and overtones, more than the possibly weak fundamental frequency in some cases, is what the human ear hears and interprets as the lower perceived pitch.
Any of the overtones and harmonics around or above 2 kHz that are not completely low pass filtered out before sampling at 4 kHz will cause aliasing and thus corruption of your sampled data and its spectrum.
If you want to create an accurate tuner, use a pitch estimation algorithm, not an FFT peak frequency bin estimator. And depending on which pitch estimation method you choose, a higher density of samples per unit time might allow finer accuracy or greater reliability under background noise or more prompt responsiveness.
Is 4KHz really the ideal sampling frequency for a guitar tuning app?
You've been mis-reading Nyquist's theorem if you ask it like that.
States that every sampling frequency above twice your maximum signal frequency will allow you to perfectly reconstruct your original signal. So there's no "ideal" frequency. Just a set of frequencies that are sufficient. What is ideal hence depends on a lot of other things: mainly, what your digitizer really supports (hint: most sound cards can do 44.1kHz, but not 4kHz), what kind of margin you want to have for filters etc to work on, and what kind of processing power you can spend (hint: modern smart phones, PCs and even pocket calculators don't really have a hard time processing a couple hundred kHz in real time).
Also note that #hotpaw2 is right, the harmonics are important, and are multiples of the base tone frequency.
However, in this case, it seems having a lesser sampling frequency is better for improving the resolution.
no. No matter where that comes from, it's wrong. Information theory's first and foremost result is that based upon more information, you can't make worse estimates. An oversampled signal is simply more information on the same signal.
Yes, if all you are interested in is frequencies up to 2 kHz then you only need a sampling frequency of 4 kHz. This should include an anti-aliasing filter in front of the ADC or any downconverter to prevent any higher frequency components from aliasing into a lower frequency.
If all you are interested in is specific frequencies (one or two) then you may want to look at the Goertzel algorithm which is more efficient than an FFT for a single frequency. Also, the chirp-Z transform can be used to effectively get a zoomed FFT (resulting in a higher resolution over a smaller bandwidth without the computational complexity of an FFT with the same resolution). You may want to check out this CZT tutorial

FFT for n Points (non power of 2 )

I need to know a way to make FFT (DFT) work with just n points, where n is not a power of 2.
I want to analyze an modify the sound spectrum, in particular of Wave-Files, which have in common 44100 sampling points. But my FFT does not work, it only works with points which are in shape like 2^n.
So what can I do? Beside fill up the vector with zeros to the next power of 2 ?!
Any way to modify the FFT algorithm?
Thanks!
You can use the FFTW library or the code generators of the Spiral project. They implement FFT for numbers with small prime factors, break down large prime factors p by reducing it to a FFT of size (p-1) which is even, etc.
However, just for signal analysis it is questionable why you want to analyze exactly one second of sound and not smaller units. Also, you may want to use a windowing procedure to avoid the jumps at the ends of the segment.
Aside from padding the array as you suggest, or using some other library function, you can construct a Fourier transform with arbitrary length and spacing in the frequency domain (also for non-integer sample spacings).
This is a well know result and is based on the Chirp-z transform (or Bluestein's FFT). Another good reference is given by Rabiner and can be found at the above link.
In summary, with this approach you don't have to write the FFT yourself, you can simply use an existing high-performance FFT and then apply the convolution theorem to a suitably scaled and conditioned version of your signal.
The performance will still be, O(n*log n), multiplied by some implementation-dependent scaling factor.
The FFT is just a faster method of computing the DFT for certain length vectors; and a DFT can be computed for any length of input vector. You can also zero-pad your input vector to a length supported by your FFT library, which may be faster.
If you want to modify your sound file, you may need to use the overlap-add or overlap-save fast convolution filtering after determining the length of the impulse response of your frequency domain modification.

Fast Fourier Transform results: frequency axis scale?

I successfully implemented code that takes array data and runs a fast fourier transform on it, using Apple's Accelerate Framework (performed on iOS device).
My question now is what is the scale of the frequency axis? The results have peaks as expected in certain frequency ranges, but I'm not sure what the frequency should be. The Accelerate Framework's FFT functions take in an array and spit out an array with the same (or more) number of data points. Does it assume that all those points are equally spaced in time? It doesn't take the sampling frequency or time variable as input. Is the scale of the frequency axis (i.e. frequency increment on each point) just the sampling period divided by 2*Pi (or something similar to that?) I couldn't find a lot of information in the documentation on this. I've been looking for similar questions online and haven't found anything.
This is in some ways a math question, although it depends heavily on the Accelerate Framework implementation.
Thanks
EDIT
I asked a follow-up question here but no one has answered it yet. Please take a look!
The FFT gives you linearly spaced frequency bins up to the sampling frequency. This means that the spacing between the bins is (sample frequency) / (number of bins).
The frequency axis scale does not depend on the Accelerate framework implementation, only on the sample rate (FS) of the time domain data and the length (N) of the FFT. Any FFT.
For strictly real data input, the second half of the FFT results will just be complex conjugates of the first half. Only the first half, up to FS/2, are usually plotted for real data.

Sine Table Interpolation

I want to put together a SDR system that tunes initially AM, later FM etc.
The system I am planning to use to do this will have a sine lookup table for Direct Digital Synthesis (DDS).
In order to tune properly I expect to need to be able to precisely control the frequency of the sine wave fed to the Mixer (multiplier in this case). I expect that linear interpolation will be close, but think a non-linear method will provide better results.
What is a good and fast interpolation method to use for sine tables. Multiplication and addition are cheap on the target system; division is costly.
Edit:
I am planning on implementing constants with multiply/shift functions to normalize the constants to scaled integers. Intermediate values will use wide adds, and multiplies will use 18 or 17 bits. Floating point "pre-computation" can be used, but not on the target platform. When I say "division is costly" I mean that it has to implemented using the multipliers and a lot of code. It's not unthinkable, but should be avoided. However, true floating point IEEE methods would take a significant amount of resources on this platform, as well as a custom implementation.
Any SDR experiences would be helpful.
If you don't get very good results with linear interpolation you can try the trigonometric relations.
Sum and Difference Formulas
sin(A+B)=sinA*cosB + cosA*sinB
sin(A-B)=sinA*cosB - cosA*sinB
cos(A+B)=cosA*cosB - sinA*sinB
cos(A-B)=cosA*cosB + sinA*sinB
and you can have precalculated sin and cos values for A, B ranges, ie
A range: 0, 10, 20, ... 90
B range: 0.01 ... 0.99
table interpolation for smooth functions = ick hurl bleah. IMHO I would only use table interpolation on some really weird function, or where you absolutely needed to ensure you avoid discontinuities (note that the derivatives for interpolated tables are discontinuous though). By the time you finish doing table lookups and the required interpolation code, you could have already evaluated a polynomial or two, at least if multiplication doesn't cause you too much heartburn.
IMHO you're much better off using Chebyshev approximation for each segment (e.g. -90 to +90 degrees, or -45 to +45 degrees, and then other segments of the same width) of the sine waveform, and picking the minimum degree polynomial that reduces your error to a desired value. If the segment is small enough you could get away with a quadratic or maybe even a linear polynomial; there's tradeoffs between accuracy, and # of segments, and degree of polynomial.
See my post in this other question, it'll save you the trouble of calculating coefficients (at least if you believe my math).
(edit: in case this wasn't clear, you do the Chebyshev approximation at design-time on your favorite high-powered PC, so that at run-time you can use a dirtbag microcontroller or FPGA or whatever with a simple polynomial of degree 1-4. Don't go over degree 4 unless you know what you're doing, 3 or below would be better.)
Why a table? This very fast function has its worst noise peak at -90db when the signal is at -20db. That's crazy good.
For resampling of audio, I always use one of the interpolators from the Elephant paper. This was discussed in a previous SO question.
If you're on a processor that doesn't have fp, you can still do these things, but they are harder. I've been there. I feel your pain. Good luck! I used to do conversions for fp to integer for fun, but now you'd have to pay me to do it. :-)
Cool online references that apply to your problem:
http://www.audiomulch.com/~rossb/code/sinusoids/
http://www.dattalo.com/technical/theory/sinewave.html
Edit: additional thoughts based on your comments
Since you're working on a tricky processor, maybe you should look into how to make your sine table have more angles to look up, but still keep it small.
Suppose you break a quadrant into 90 pieces (in reality, you'd probably use 256 pieces, but let's keep it 90 for familiarity and clarity). Encode those as 16 bits. That's 180 bytes of table so far.
Now, for every one of those degrees, we're going to have 9 (in reality probably 8 or 16) in-between points.
Let's take the range between 3 degrees and 4 degrees as an example.
sin(3)=0.052335956 //this will be in your table as a 16-bit number
sin(4)=0.069756474 //this will be in your table as a 16-bit number
so we're going to look at sin(3.1)
sin(3.1)=0.054978813 //we're going to be tricky and store the result
// in 8 bits as a percentage of the distance between
// sin(3) and sin(4)
What you want to do is figure out how sin(3.1) fits in between sin(3) and sin(4). If it's half way between, code that as a byte of 128. If it's a quarter of the way between, code that as 64.
That's an additional 90 bytes and you've encoded down to a tenth of a degree in 16-bit res in only 180+90*9 bytes. You can extend as needed (maybe going up to 32-bit angles and 16-bit tween angles) and linearly interpolate in between very quickly. To minimize storage space, you're taking advantage of the fact that consecutive values are close to each other.
Edit 2: better way to encode the in-between angles in a table
I just remembered that when I did this, I ended up very compactly expressing the difference between the expected value according to linear interpolation and the actual value. This error is always in the same direction.
I first calculated the maximum error in the range and then based the scale on that.
Worked great. I feel like I should do the code in a blog entry to illustrate. :-)
Interpolation in a sine table is effectively resampling. Obviously you can get perfect results by a single call to sin, so whatever your solution is it needs to outperform that. For fixed-filter resampling, you're still going to only have a fixed set of available points (a 3:1 upsampler means you'll have 2 new points available between each point in your table). How expensive is memory on the target system? My primary recommendation is simply improve the table resolution and use linear interpolation. You'll get the same results as a smaller table and simple upsample but with less computational overhead.
Have you considered using the Taylor series for the trig functions (found here)? This involves multiplication and division but depending on how your numbers are represented you may be able to turn the division into multiplication (or bit shifts if you're very lucky). You can compute as many terms of the series as you need and get your precision that way.
Alternately if this sine wave is going to be an analog signal at some point then you could just use a lookup table approach and use an analog filter to remove the sampling frequency from the resulting waveform. If your sampling frequency is 100 times the sine frequency it will be easy to remove. You'll need a variable filter to do this. I've never done such a thing but I know there's digital potentiometers that take a binary number and change their resistance. That could be the basis of a variable RC filter - probably with some op-amps for gain, etc.
Good luck!
People have written some amazingly clever code for quickly calculating sin() on systems with tiny amounts of memory that don't even have a hardware multiply instruction, much less a division instruction.
In order of increasing complexity:
Use a square wave. Many AM radios use square waves in their ring demodulator, and I fail to see why your AM demodulator requires anything more complicated.
Approximate sin() by looking up the "closest value" in a raw table of 256 values per quarter-cycle. Yes, you see horrible-looking stair-steps, but (with a little bit of analog filtering) this often works well. (In fact, this is often overkill, and a much shorter table is adequate).
Approximate sin() by looking up the 2 closest values in a raw table, and linearly interpolating between them.
Approximate sin() with 16 short, equally-spaced-in-x cubic splines per quarter-cycle "gives better than 16-bit precision" for sin(x).
Wikibooks: Fixed-Point Numbers links to some clever implementations of the last 3.

Resources