I am building a dB meter as part of an app I am creating. I have it receiving the peak and average powers from the mic on my iPhone (values ranging from -60 to 0.4), and now I need to figure out how to convert these power levels into dB levels like the ones in this chart: http://www.gcaudio.com/resources/howtos/loudness.html
Does anyone have any idea how I could do this? I can't figure an algorithm out for the life of me, and it is kind of crucial, as the whole point of the app is to do with real-world audio levels, if that makes sense.
Any help will be greatly appreciated.
Apple's done a pretty good job at making the frequency response of the microphone flat and consistent between devices, so the only thing you'll need to determine is a calibration point. For this you will require a reference audio source and a calibrated sound pressure level (SPL) meter.
It's worth noting that sound pressure measurements are often measured against the A-weighting scale. This is frequency weighted for the human aural system. In order to measure this, you will need to apply the relevant filter curve to results taken from the microphone.
Also be aware of the difference between peak and mean (in this case RMS) measurements.
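For illustration, here is a minimal sketch of how such a calibration offset could be applied, assuming the raw reading is the dBFS value reported by the recorder's metering API (all names here are illustrative, not from any particular library):

```swift
import Foundation

// Minimal sketch (illustrative names): once a known reference tone has been
// measured with a calibrated SPL meter, the difference between the meter's
// dB SPL reading and the recorder's dBFS reading is your calibration offset.
struct SPLCalibration {
    // e.g. a reference tone measured at 94.0 dB SPL while the recorder
    // reported -20.0 dBFS would give an offset of 114.0 dB.
    let offset: Double

    init(referenceSPL: Double, measuredDBFS: Double) {
        offset = referenceSPL - measuredDBFS
    }

    // Convert a dBFS value (e.g. from averagePower(forChannel:)) to approximate dB SPL.
    func spl(fromDBFS dbfs: Double) -> Double {
        return dbfs + offset
    }
}
```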
As far as I can tell from looking at the documentation, the "power level" returned from an AVAudioRecorder (assuming that's what you're using – you didn't specify) is already in decibels. See here from the documentation for averagePowerForChannel:
Return Value
The current average power, in decibels, for the sound being recorded. A return value of 0 dB indicates full scale, or maximum power; a return value of -160 dB indicates minimum power (that is, near silence).
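For reference, a minimal sketch of reading those metered values, assuming you already have a configured AVAudioRecorder:

```swift
import AVFoundation

// Minimal sketch: the values returned here are already in decibels (dBFS),
// per the documentation quoted above.
func logMeterLevels(for recorder: AVAudioRecorder) {
    recorder.isMeteringEnabled = true   // must be enabled before updateMeters()
    recorder.updateMeters()             // refresh the cached peak/average values

    let average = recorder.averagePower(forChannel: 0)  // 0 dB = full scale, -160 dB = near silence
    let peak = recorder.peakPower(forChannel: 0)
    print("average: \(average) dBFS, peak: \(peak) dBFS")
}
```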
I have an electromagnetic sensor which reports the electromagnetic field strength it reads at its position in space.
I also have a device that emits an electromagnetic field; it covers about a 1 meter area.
So I want to predict the position of the sensor from its reading.
But the sensor is affected by metal, which makes the position prediction drift.
It's like if the reading is 1 and you put it near metal, you get 2.
Something like that. It's not just noise; it's a permanent drift. Unless you remove the metal, it will always read 2.
What are the techniques or topics I need to learn, in general, to recover the true reading of 1 from the 2?
Suppose that the metal is fixed somewhere in space and I can calibrate the sensor by putting it near the metal first.
You can suggest anything about removing the drift in general. Also, please consider that I could place another emitter somewhere, which should make it easier to recover the true reading.
Let me suggest that you view your sensor output as a combination of two factors:
sensor_output = emitter_effect + environment_effect
And you want to obtain emitter_effect without the addition of environment_effect. So, of course you need to subtract:
emitter_effect = sensor_output - environment_effect
Subtracting the environment's effect on your sensor is usually called compensation. In order to compensate, you need to be able to model or predict the effect your environment (extra metal floating around) is having on the sensor. The form of the model for your environment effect can be very simple or very complex.
Simple methods generally use a separate sensor to estimate environment_effect. I'm not sure exactly what your scenario is, but you may be able to select a sensor which would independently measure the quantity of interference (metal) in your setup.
More complex methods can perform compensation without referring to an independent sensor for measuring interference. For example, if you expect the distance to be 10.0 on average with only occasional deviations, you could use that fact to estimate how much interference is present. In my experience, this type of method is less reliable; systems with independent sensors for measuring interference are more predictable and reliable.
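As a minimal sketch of the simple calibration-based approach, assuming the environment's contribution can be measured on its own (e.g. with the emitter off) and is roughly additive (all names are illustrative):

```swift
import Foundation

// Minimal sketch of calibration-based compensation (illustrative names).
// Assumes you can take readings with the emitter off while the metal is in
// place, so the environment's contribution can be estimated on its own.
struct DriftCompensator {
    private(set) var environmentEffect: Double = 0.0

    // Calibrate with the sensor exposed to the environment (metal) only.
    mutating func calibrate(baselineReadings: [Double]) {
        guard !baselineReadings.isEmpty else { return }
        environmentEffect = baselineReadings.reduce(0, +) / Double(baselineReadings.count)
    }

    // emitter_effect = sensor_output - environment_effect
    func compensate(sensorOutput: Double) -> Double {
        return sensorOutput - environmentEffect
    }
}
```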
You can start reading about Kalman filtering if you're interested in model-based estimation:
https://en.wikipedia.org/wiki/Kalman_filter
It's a complex topic, so you should expect a steep learning curve. Kalman filtering (and related Bayesian estimation methods) is the formal way to convert a "bad sensor reading" into a "corrected sensor reading".
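To give a feel for the moving parts, here is a minimal scalar (1-D) sketch; the names and noise values are placeholders, not tuned for any real sensor:

```swift
import Foundation

// Minimal sketch of a 1-D (scalar) Kalman filter, just to show the structure.
struct ScalarKalmanFilter {
    var estimate: Double          // current state estimate
    var errorVariance: Double     // uncertainty of the estimate
    let processNoise: Double      // Q: how much the true state can change per step
    let measurementNoise: Double  // R: how noisy each sensor reading is

    mutating func update(measurement: Double) -> Double {
        // Predict: the state is assumed constant here, only the uncertainty grows.
        errorVariance += processNoise

        // Update: blend prediction and measurement according to their uncertainties.
        let gain = errorVariance / (errorVariance + measurementNoise)
        estimate += gain * (measurement - estimate)
        errorVariance *= (1.0 - gain)
        return estimate
    }
}

// Usage: var filter = ScalarKalmanFilter(estimate: 0, errorVariance: 1,
//                                        processNoise: 0.01, measurementNoise: 0.5)
// let smoothed = filter.update(measurement: rawReading)
```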
I developed an app a few months back for iOS devices that generates real-time harmonic rich drones. It works fine on newer devices, but it's running into buffer underruns on slower devices. I need to optimize this thing and need some mental help. Here's a super basic overview of what I'm currently doing:
Create an "Oscillator Bank" that consists of X number of harmonics (simply calculated from a given fundamental frequency. Nothing fancy here.)
Inside my DAC function that spits out samples to an iOS audio buffer, I call a "GetNextSample()" function that goes through the bank of sine oscillators, calculates the sample for each one and adds them up. Some simple additive synthesis.
Enjoy the beauty of the drone.
Again, it works great, until it doesn't. I'd like to optimize this thing so I'm not using brute additive synthesis of real-time calculated sine waves. If I limit the number of harmonics ("banks") to 2, it'll work on the older devices. Not cool. On the newer devices, it underruns around 50 harmonics. Not too bad. But if I want to play multiple drones at once to create some chords, that's too much processing power.... so...
Should I generate waveform tables to just loop through instead of constant calculation? (I assume yes...)
Should I convert my usage of double-precision floating point to integer based calculations? (I assume yes...)
And my big algorithmic question (being pretty non-mathematical):
If I use a waveform table, how do I accurately determine how long the wave/table should be?? In my experience developing this app, if I just go to the end of a period (2*PI) and start over again, resetting the phase back to 0, I get a sound artifact, since I'm force offsetting the phase. In other words, I can't guarantee that one period will give me the right results...
Maybe I'm over complicating things... What's the standard way of doing quick, processor friendly real-time synth of multiple added sines?
I'll keep poking around in the meantime.
Thanks!
Have you (or can you; I'm not an iOS person) increased the buffer size? That might give you enough overhead that you don't need this. Otherwise, yes, wavetable synthesis is a viable approach. You could calculate a new wavetable from the sum of all the harmonics only when a parameter changes.
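A minimal sketch of that idea, assuming a single-cycle table that is rebuilt only when the harmonic mix changes and is read back with a phase accumulator plus linear interpolation (names are illustrative):

```swift
import Foundation

// Minimal sketch of a wavetable oscillator (illustrative names).
struct WavetableOscillator {
    private var table: [Float]
    private var phase: Double = 0.0           // position in the table, in samples
    private var phaseIncrement: Double = 0.0

    init(tableSize: Int = 4096) {
        table = [Float](repeating: 0, count: tableSize)
    }

    // Rebuild the single-cycle table only when the harmonic content changes.
    mutating func setHarmonics(amplitudes: [Float]) {
        let n = table.count
        for i in 0..<n {
            let theta = 2.0 * Double.pi * Double(i) / Double(n)
            var sample: Float = 0
            for (h, amp) in amplitudes.enumerated() {
                sample += amp * Float(sin(Double(h + 1) * theta))
            }
            table[i] = sample
        }
    }

    mutating func setFrequency(_ hz: Double, sampleRate: Double) {
        phaseIncrement = hz * Double(table.count) / sampleRate
    }

    // One table lookup with linear interpolation per output sample.
    mutating func nextSample() -> Float {
        let i = Int(phase)
        let frac = Float(phase - Double(i))
        let a = table[i]
        let b = table[(i + 1) % table.count]
        phase += phaseIncrement
        if phase >= Double(table.count) { phase -= Double(table.count) }
        return a + (b - a) * frac
    }
}
```

Because the phase accumulator wraps modulo the table length rather than being reset to zero, there is no forced phase offset at the end of a period, which avoids the artifact described in the question.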
I have written such a beast in golang on the server side... For starters, yes, use single-precision floating point.
To address table population, I would ensure your implementation is solid by having it synthesize a square wave. Visualize the output for each run as you give it each additional frequency (with its corresponding parameters of amplitude and phase shift)... By definition a single cycle is enough, as long as you use enough samples to cover the period of that cycle.
It's important to leverage the knowledge that generating an output curve from an input set of sine waves (each with frequency, amplitude, and phase shift) lends itself to doing the reverse... namely, perform an FFT on that output curve to have the API give you its version of the underlying sine waves (again, each with a frequency, amplitude, and phase)... this will confirm your system is accurate.
The name of the process you are implementing is the inverse Fourier transform. There are libraries for this; however, I too prefer rolling my own.
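If you want a quick sanity check without pulling in an FFT library, a naive DFT over one table cycle is enough for verification (illustrative sketch only; far too slow for real-time use):

```swift
import Foundation

// Minimal sketch: a naive DFT (not a fast FFT) over one table cycle, just to
// sanity-check which harmonics (and at what amplitudes) your table contains.
func harmonicMagnitudes(of cycle: [Float], upTo maxHarmonic: Int) -> [Double] {
    let n = cycle.count
    var magnitudes: [Double] = []
    for k in 1...maxHarmonic {
        var re = 0.0, im = 0.0
        for i in 0..<n {
            let angle = 2.0 * Double.pi * Double(k) * Double(i) / Double(n)
            re += Double(cycle[i]) * cos(angle)
            im -= Double(cycle[i]) * sin(angle)
        }
        // Scaled so a pure sine of amplitude A at harmonic k reports ~A.
        magnitudes.append(2.0 * sqrt(re * re + im * im) / Double(n))
    }
    return magnitudes
}
```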
I have RSSI readings but no idea how to find the measurement and process noise. What is the way to find those values?
Not at all. RSSI stands for "Received Signal Strength Indicator" and says absolutely nothing about the signal-to-noise ratio related to your Kalman filter. RSSI is not a "well-defined" thing; it can mean a million things:
Defining the "strength" of a signal is a tricky thing. Imagine you're sitting in a car with an FM radio. What do the RSSI bars on that radio's display mean? Maybe:
The amount of Energy passing through the antenna port (including noise, because at this point no one knows what noise and signal are)?
The amount of Energy passing through the selected bandpass for the whole ultra shortwave band (78-108 MHz, depending on region) (incl. noise)?
Energy coming out of the preamplifier (incl. Noise and noise generated by the amplifier)?
Energy passing through the IF filter, which selects your individual station (is that already the signal strength as you want to define it?)?
RMS of the voltage observed by the ADC (the ADC probably samples much higher than your channel bandwidth) (is that the signal strength as you want to define it?)?
RMS of the digital values after a digital channel selection filter (i.t.t.s.s.a.y.w.t.d.i?)?
RMS of the digital values after FM demodulation (i.t.t.s.s.a.y.w.t.d.i?)?
RMS of the digital values after FM demodulation and audio frequency filtering for a mono mix (i.t.t.s.s.a.y.w.t.d.i?)?
RMS of digital values in a stereo audio signal (i.t.t.s.s.a.y.w.t.d.i?) ?
...
As you can imagine, for systems like FM radios, this is still relatively easy. For things like mobile phones, multichannel GPS receivers, WiFi cards, digital beamforming radars, etc., RSSI really can mean everything or nothing at all.
You will have to mathematically define a way to describe what your noise is. And then you will need to find the formula that describes your exact implementation of what "RSSI" is, and then you can deduce whether knowing RSSI says anything about process noise.
A Kalman Filter is a mathematical construct for computing the expected state of a system that is changing over time, given an initial state and noisy measurements of that system. The key to the "process noise" component of this is the fact that the system is changing. The way that the system changes is the process.
Your state might change due to manual control or due to the nature of the system. For example, if you have a car on a hill, it can roll down the hill naturally (described by the state transition matrix), or you might drive it down the hill manually (described by the control input matrix). Any noise that might affect these inputs - wind, bumps, twitches - can be described with the process noise.
You can measure the process noise the way you would measure variance in any system - take the expected dynamics and compare them with the true dynamics to generate a covariance matrix.
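In the scalar case, that comparison might look like the following sketch (illustrative names; a full filter would use a covariance matrix rather than a single variance):

```swift
import Foundation

// Minimal sketch (scalar case): estimate process-noise variance by comparing
// what your model predicted the state would do against what it actually did,
// then taking the variance of those residuals.
func estimateProcessNoiseVariance(predictedStates: [Double], trueStates: [Double]) -> Double {
    precondition(predictedStates.count == trueStates.count && !trueStates.isEmpty)
    let residuals = zip(trueStates, predictedStates).map { $0 - $1 }
    let mean = residuals.reduce(0, +) / Double(residuals.count)
    let variance = residuals.map { ($0 - mean) * ($0 - mean) }.reduce(0, +) / Double(residuals.count)
    return variance
}
```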
If I take audio data on an iPhone (i.e. real data), perform an FFT and then take the magnitudes (Re^2 + Im^2).
These vary from just above 0 to some large numbers, so I take 10*log10 of them to get dB.
This gives me outputs that are negative (for the inputs that were < 1) to positive.
But the examples I've seen for this (and also drawing the spectrum in Sonic Visualiser) always show positive spectra when measured in dB.
So what have I missed?!
On a wider note, as I understand it decibels are a ratio, so in this context when turning the FFT magnitudes into dB, what are they a ratio to?
The simple answer is that, for the most part, you can add an arbitrary number to the dB value to make the values all positive, or all negative, or whatever you prefer. With an uncalibrated microphone, like on the iPhone, this is all that makes sense anyway, since all you know are relative values.
For a more advanced technical approach, using a calibrated microphone, you could reference everything using dB (SPL), as a reasonable standard, but this is a hassle, and not meaningful in your use case anyway.
Rationale:
The main reason that shifting by an arbitrary amount works is that the log doesn't report the units of measurement. For example, even if you know the input amplitude is 0.1 Pascal, it's completely valid to say this is 100 milliPascal, where you'd be taking the log of 100 rather than of 0.1 (so the log values are either 2 or -1). Both are completely valid and the choice is entirely arbitrary. When comparing to a standard reference, as in dB SPL, note that it's done as a ratio, log(P/Pref), removing the impact of changing the units.
Since an FFT is a linear operator, the scale of the output of an FFT is related to the scale of the data input to the FFT. The scale of the input to the FFT on your iPhone depends on the gain of the mic, audio filters, potentially the AGC, and the ADC reference. Since these are all undocumented and can vary (by position of the mic, model of device, input gain, which may depend on the audio session configuration, etc.), you won't know the ratio unless you perform some sort of calibration against a known reference.
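Putting that together, here is a minimal sketch of converting squared FFT magnitudes to dB against an arbitrary reference and then shifting by a display offset (all names are illustrative):

```swift
import Foundation

// Minimal sketch (illustrative names): convert squared FFT magnitudes to dB
// relative to an arbitrary reference, then optionally shift by a constant
// offset so the displayed values fall in whatever range you prefer.
func magnitudesToDecibels(squaredMagnitudes: [Double],
                          reference: Double = 1.0,
                          displayOffset: Double = 0.0) -> [Double] {
    return squaredMagnitudes.map { m in
        // Guard against taking the log of zero for silent bins.
        let db = 10.0 * log10(max(m, .leastNonzeroMagnitude) / reference)
        return db + displayOffset
    }
}
```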
I have been looking into how to convert my digital data into analog.
So, I have a two-column ASCII data file (x: time, y: voltage amplitude) which I would like to convert into an analog signal (varying voltage with time). There are digital-to-analog converters, but the good ones are quite expensive. There should be a simpler way to achieve this.
Ultimately what I'd like to do is to reconstruct the original time variant voltage which was sampled every nano-second and recorded as an ASCII data file.
I thought I may feed the data into my laptop's sound card and re-generate the time variant voltage which I can then feed into the analyzer via the audio jack. Does this sound feasible?
I am not looking into recovering the "shape" but the signal (voltage) itself.
Puzzled on several counts.
You want to convert into an analog signal (varying voltage with time). But what you already have, the discrete signal, is indeed a "varying voltage with time"; it's just that both the values (voltages) and the times are discrete. That's the way computers (digital equipment, in general) work.
Only when the signal goes to some non-discrete medium (e.g. a classical audio cable+plug) do we have an analog signal. Precisely, the sound card of your computer is at its core a "digital-to-analog converter".
So, it appears you are not trying to do some digital processing of your signal (interpolation, or whatever); you are not dealing with computer programming, but with a hardware thing: getting the signal onto a cable. If so, SO is not the proper place. You might try https://electronics.stackexchange.com/ ...
But, on another note, you say that your data was "sampled every nanosecond". That means 1 billion samples per second, or a sample frequency of 1 GHz. That's a ridiculously high frequency, at least in the audio world. You can't output that to a sound card, which would be limited to the audio range (about 48 kHz = 48000 samples per second).
You want to just fit a curve to the data. Assuming the sampling rate is sufficient, a third-order polynomial would be plenty. At each point N, you fit a cubic polynomial to points N-1, N, N+1, and N+2, and then you have an analytic expression for the data values between those points. Shift over one, and repeat. You can average the values for multiple successive curves, if you want.
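A minimal sketch of that per-segment cubic fit, assuming equally spaced samples and evaluating between points N and N+1 (names are illustrative):

```swift
import Foundation

// Minimal sketch (illustrative names): fit a cubic through four equally spaced
// samples y[n-1], y[n], y[n+1], y[n+2] (Lagrange form) and evaluate it at a
// fractional position t in [0, 1] between y[n] and y[n+1].
func cubicInterpolate(samples y: [Double], n: Int, t: Double) -> Double {
    precondition(n >= 1 && n + 2 < y.count && (0.0...1.0).contains(t))
    let ym1 = y[n - 1], y0 = y[n], y1 = y[n + 1], y2 = y[n + 2]

    // Lagrange basis polynomials for nodes at x = -1, 0, 1, 2, evaluated at x = t.
    let lm1 = -t * (t - 1.0) * (t - 2.0) / 6.0
    let l0  = (t + 1.0) * (t - 1.0) * (t - 2.0) / 2.0
    let l1  = -(t + 1.0) * t * (t - 2.0) / 2.0
    let l2  = (t + 1.0) * t * (t - 1.0) / 6.0

    return ym1 * lm1 + y0 * l0 + y1 * l1 + y2 * l2
}
```

Sliding this window over one sample at a time gives you a piecewise-analytic reconstruction of the whole recording, and, as noted above, you can average overlapping segments if you want extra smoothing.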