Detecting a low-frequency tone in an audio file - iOS

I know this question has been asked a hundred times... but I am getting frustrated with my results, so I wanted to ask again. Before I dive deep into FFT, I need to figure this simple task out.
I need to detect a 20 Hz tone in an audio file. I insert the 20 Hz tone myself, as in the picture. (It can be any frequency as long as the listener can't hear it, so I thought I should choose a frequency around 20 Hz to 50 Hz.)
Info about the audio file:
afinfo 1.m4a
File: 1.m4a
File type ID: adts
Num Tracks: 1
----
Data format: 1 ch, 22050 Hz, 'aac ' (0x00000000) 0 bits/channel, 0 bytes/packet, 1024 frames/packet, 0 bytes/frame
Channel layout: Mono
estimated duration: 8.634043 sec
audio bytes: 42416
audio packets: 219
bit rate: 33364 bits per second
packet size upper bound: 768
maximum packet size: 319
audio data file offset: 0
optimized
format list:
[ 0] format: 1 ch, 22050 Hz, 'aac ' (0x00000000) 0 bits/channel, 0 bytes/packet, 1024 frames/packet, 0 bytes/frame
Channel layout: Mono
----
I followed these three tutorials and came up with working code that reads the audio buffer and gives me FFT doubles.
http://blog.bjornroche.com/2012/07/frequency-detection-using-fft-aka-pitch.html
https://github.com/alexbw/iPhoneFFT
How do I obtain the frequencies of each value in an FFT?
I read the data as follows:
// If there are more packets, read them
inCompleteAQBuffer->mAudioDataByteSize = numBytes;
CheckError(AudioQueueEnqueueBuffer(inAQ,
                                   inCompleteAQBuffer,
                                   (sound->packetDescs ? nPackets : 0),
                                   sound->packetDescs),
           "couldn't enqueue buffer");
sound->packetPosition += nPackets;

int numFrequencies = 2048;
int kNumFFTWindows = 10;
SInt16 *testBuffer = (SInt16 *)inCompleteAQBuffer->mAudioData; // Read data from the buffer
OouraFFT *myFFT = [[OouraFFT alloc] initForSignalsOfLength:numFrequencies * 2 andNumWindows:kNumFFTWindows];
for (long i = 0; i < myFFT.dataLength; i++) {
    myFFT.inputData[i] = (double)testBuffer[i];
}
[myFFT calculateWelchPeriodogramWithNewSignalSegment];
for (int i = 0; i < myFFT.dataLength / 2; i++) {
    NSLog(@"the spectrum data %d is %f", i, myFFT.spectrumData[i]);
}
and my output log looks something like
Everything checks out for 4096 samples of data
Set up all values, about to init window type 2
the spectrum data 0 is 42449.823771
the spectrum data 1 is 39561.024361
.
.
.
.
the spectrum data 2047 is -42859933071799162597786649755206634193030992632381393031503716729604050285238471034480950745056828418192654328314899253768124076782117157451993697900895932215179138987660717342012863875797337184571512678648234639360.000000
I know I am not calculating the magnitude yet, but how can I detect whether the sound has a 20 Hz tone in it? Do I need to learn the Goertzel algorithm?
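For what it's worth, the Goertzel algorithm mentioned above boils down to a short loop over the samples. Here is a hedged C sketch of measuring the power at a single target frequency (the function and variable names are my own, not from the code above; the 22050 Hz sample rate matches the afinfo output):

#include <math.h>

/* Goertzel power at one target frequency (illustrative helper). */
double goertzel_power(const short *samples, int numSamples,
                      double sampleRate, double targetHz)
{
    int    k     = (int)(0.5 + (numSamples * targetHz) / sampleRate); /* nearest DFT bin */
    double omega = 2.0 * M_PI * k / numSamples;
    double coeff = 2.0 * cos(omega);
    double s0 = 0.0, s1 = 0.0, s2 = 0.0;

    for (int i = 0; i < numSamples; i++) {
        s0 = coeff * s1 - s2 + (double)samples[i];
        s2 = s1;
        s1 = s0;
    }
    /* Squared magnitude of the target bin. */
    return s1 * s1 + s2 * s2 - coeff * s1 * s2;
}

Comparing this value against the power of a nearby unused frequency (or a calibrated threshold) gives a yes/no decision without computing a full FFT. Note that resolving 20 Hz at a 22050 Hz sample rate needs a long window: a single 20 Hz cycle is already over 1100 samples.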

There are many ways to convey information that gets inserted into, then retrieved from, some pre-existing wave pattern. The information going in can vary things like the amplitude (amplitude modulation) or the frequency (frequency modulation), etc. Do you have a strategy here? Note that the density of information you can convey is influenced by factors such as the modulating frequency (higher frequencies naturally convey more information, since they can resolve changes more times per second).
Another approach is possible if both the sender and receiver have the source audio (reference). In this case the receiver could do a diff between the reference and the actual received audio to resolve the transmitted extra information. A variation on this would be to have the sender send the same audio twice: first the untouched reference audio, then a modulated version of that same reference audio. That way the receiver just does a diff between these two audibly identical clips to resolve out the embedded information.
Going back to your original question: if the sender and receiver have an agreement, say that for some time period X the reference pure 20 Hz tone is sent, followed by another period X in which that 20 Hz tone is modulated by your input information to alter its amplitude or frequency, then you just repeat this pattern. On the receiving side, a diff between each such pair of time periods resolves your modulated information. For this to work the source audio cannot have any tones below some frequency, say 100 Hz (you remove that frequency band if needed), just to eliminate interference from the source audio. You have not mentioned what kind of data you wish to transmit. If it is voice, you would first need to stretch it out, in effect lowering its frequency range from the ~1 kHz range down to your low 20 Hz range; once the result of the diff is available on the receiving side you squeeze this curve back to the normal ~1 kHz voice range. Maybe more work than you have time for, but this might just work: real AM/FM radio uses modulation to send voice over megahertz carrier frequencies, so it can work.
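To make the diff step concrete, here is a minimal C sketch of the receiving side (the function and buffer names are hypothetical, and this assumes the reference and modulated periods are already time-aligned and equally long):

/* Subtract the reference period from the modulated period, sample by sample,
 * leaving only the embedded signal (illustrative helper). */
void recover_embedded(const float *reference, const float *modulated,
                      float *embedded, int numSamples)
{
    for (int i = 0; i < numSamples; i++) {
        embedded[i] = modulated[i] - reference[i];
    }
}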

Related

How to calculate the CAN bus baud rate from the Tq and clock frequency?

First, I think I know how to calculate the CAN bus baud rate from the parameters in the picture below; this is a CAN FD config.
clock frequency: 80000 kHz
pre-scaler: 1
so we get Tq = 1/(80000 kHz)
BTL cycles: 40
time for a bit = 40 * (1/(80000 kHz)) = 1/(2000 kHz)
So the baud rate = 1 / (1/(2000 kHz)) = 2000 kbit/s.
This calculated baud rate is equal to the value that CANoe generated.
But what puzzles me is: when I use this method to calculate the baud rate for classic CAN (not CAN FD), the result is different from the value that CANoe generates. Why? Is there something different between CAN and CAN FD?
Could you please help me? Thank you very much!
clock: 16000 kHz
pre-scaler: 1
Tq = 1/(16000 kHz)
BTL: 16
time for a bit = 16 * (1/(16000 kHz)) = 1/(1000 kHz)
baud rate = 1000 kbit/s
But the result generated by CANoe is 500 kbit/s; it seems somewhere I am missing a "divide by 2"?
The CAN controller chip used by CANoe is the SJA1000, according to the CANoe help documentation.
For this chip:
CAN clock period = system clock period * pre-scaler * 2
The key point of this question is the "2" here. For other chips, such as the STM32F103 (where the clock for the CAN bus is set to 36 MHz), there is no need to divide by 2.
So I guess the clock frequency given in the question is the system clock.
Following this rule, I set the parameters of another development board, and the measured communication was successful.
Meanwhile, a CANoe user should really just focus on the baud rate and the sample point; there is no need to pay too much attention to the other parameters.
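As a quick sanity check of the arithmetic above, a small C sketch (the numbers are the ones from the question; the divide_by_two flag models the SJA1000 behaviour described here and is my own naming):

#include <stdio.h>

/* Bit rate in kbit/s from the clock (kHz), the pre-scaler and the BTL cycles per bit.
 * divide_by_two models controllers like the SJA1000 where the input clock is
 * halved before the pre-scaler. */
static double can_bitrate_kbit(double clock_khz, int prescaler,
                               int btl_cycles, int divide_by_two)
{
    double tq_ms       = (divide_by_two ? 2.0 : 1.0) * prescaler / clock_khz; /* one time quantum */
    double bit_time_ms = btl_cycles * tq_ms;
    return 1.0 / bit_time_ms;
}

int main(void)
{
    /* CAN FD example from the question: 80 MHz clock, pre-scaler 1, 40 Tq per bit. */
    printf("CAN FD: %.0f kbit/s\n", can_bitrate_kbit(80000.0, 1, 40, 0)); /* 2000 */
    /* Classic CAN on the SJA1000: 16 MHz clock, pre-scaler 1, 16 Tq per bit. */
    printf("CAN:    %.0f kbit/s\n", can_bitrate_kbit(16000.0, 1, 16, 1)); /* 500 */
    return 0;
}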
Hope this helps you

Select an integer number of periods

Suppose we have a sinusoid with a frequency of 100 Hz and a sampling frequency of 1000 Hz. That means our signal has 100 periods per second and we are taking 1000 samples per second. Therefore, in order to select one complete period I'll have to take fs/f = 10 samples. Right?
What if the sampling frequency is not a multiple of the signal frequency (e.g. 550 Hz)? Do I have to find the least common multiple M of f and fs, and then take M samples?
My goal is to select an integer number of periods in order to be able to replicate them without changes.
You have f periods a second, and fs samples a second.
If you take M samples, they cover M/fs of a second, or P = f * (M/fs) periods. You want this number of periods to be an integer.
So you need to take M = fs / gcd(f, fs) samples.
For your example, M = 1000 / gcd(100, 1000) = 1000 / 100 = 10 samples, which is exactly one period.
If you have a 60 Hz frequency and an 80 Hz sampling frequency, that gives M = 80 / gcd(60, 80) = 80 / 20 = 4: those 4 samples cover 4 * 1/80 = 1/20 of a second, and that is 3 periods.
If you have a 113 Hz frequency and a 512 Hz sampling frequency, you are out of luck, since gcd(113, 512) = 1 and you'll need 512 samples, covering the whole second and 113 periods.
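A minimal C sketch of that calculation (the helper names are my own):

#include <stdio.h>

/* Greatest common divisor of two positive integers (Euclid's algorithm). */
static unsigned gcd(unsigned a, unsigned b)
{
    while (b != 0) {
        unsigned t = a % b;
        a = b;
        b = t;
    }
    return a;
}

int main(void)
{
    unsigned f = 60, fs = 80;        /* signal and sampling frequency in Hz */
    unsigned M = fs / gcd(f, fs);    /* samples covering a whole number of periods */
    unsigned P = f * M / fs;         /* how many periods those samples cover */
    printf("Take %u samples, covering %u periods\n", M, P); /* 4 samples, 3 periods */
    return 0;
}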
In general, an arbitrary frequency will not fit an integer number of periods into an integer number of samples; irrational frequency ratios never repeat at all. So some means other than concatenating buffers one period long will be needed to synthesize exactly periodic waveforms of arbitrary frequency. Approximation by interpolation at fractional phase offsets is one possibility.
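One common way to do that interpolation, sketched in C with my own naming and an assumed table size: keep a single-cycle wavetable and advance a fractional phase per output sample, linearly interpolating between neighbouring table entries.

#include <math.h>

#define TABLE_SIZE 1024

/* Write n samples of an (approximately) periodic tone of frequency freq at sample
 * rate fs into out[], reading a single-cycle wavetable with linear interpolation
 * at fractional phase offsets. Returns the phase to resume from next time. */
static double wavetable_tone(const float table[TABLE_SIZE], float *out, int n,
                             double freq, double fs, double phase)
{
    double step = freq * TABLE_SIZE / fs; /* table positions advanced per output sample */
    for (int i = 0; i < n; i++) {
        int    i0   = (int)phase;
        int    i1   = (i0 + 1) % TABLE_SIZE;
        double frac = phase - i0;
        out[i] = (float)((1.0 - frac) * table[i0] + frac * table[i1]);
        phase += step;
        if (phase >= TABLE_SIZE) phase -= TABLE_SIZE;
    }
    return phase;
}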

What is the supported format for compressed 4-channel audio file in iOS?

First of all I'm a noob in both iOS and audio programming, so bear with me if I don't use the correct technical terms, but I'll do my best!
What we want to do:
In an iOS app we are developing, we want to be able to play sounds through 4 different outputs to have a mini surround system. That is, we want the Left and Right channels to play through the headphones, while the Center and Center Surround channels play through audio hardware connected to the Lightning port. Since the audio files will be streamed/downloaded from a remote server, using raw (PCM) audio files is not an option.
The problem:
Apple has, since iOS 6, made it possible to play an audio file using a multiroute configuration... and that is great and exactly what we need... but whenever we try to play a 4-channel audio file, AAC-encoded and encapsulated in an m4a (or CAF) file format, we get the following error:
ERROR: [0x19deee000] AVAudioFile.mm:86: AVAudioFileImpl: error 1718449215
(Which is the status code for "kAudioFileUnsupportedDataFormatError" )
We get the same error when we use the same audio encoded as lossless (ALAC) instead, but we don't get this error when playing the same audio before encoding (PCM format).
We also don't get the error when we use a stereo audio file or a 5.1 audio file encoded the same way as the 4-channel one, in both AAC and ALAC.
What we tried:
The encoding
The file was encoded using afconvert, one of Apple's audio tools provided with Mac OS X, with this command:
afconvert -v -f 'm4af' -d "aac#44100" 4ch_master.caf 4ch_44100_AAC.m4a
and
afconvert -v -f 'caff' -d "alac#44100" 4ch_master.caf 4ch_44100_ALAC.caf
in the case of lossless encoding.
The audio format, as given by afinfo for the master (PCM) audio file:
File: 4ch_master.caf
File type ID: caff
Num Tracks: 1
----
Data format: 4 ch, 44100 Hz, 'lpcm' (0x0000000C) 16-bit little-endian signed integer
no channel layout.
estimated duration: 582.741338 sec
audio bytes: 205591144
audio packets: 25698893
bit rate: 2822400 bits per second
packet size upper bound: 8
maximum packet size: 8
audio data file offset: 4096
optimized
audio 25698893 valid frames + 0 priming + 0 remainder = 25698893
source bit depth: I16
The AAC-encoded format info:
File: 4ch_44100_AAC.m4a
File type ID: m4af
Num Tracks: 1
----
Data format: 4 ch, 44100 Hz, 'aac ' (0x00000000) 0 bits/channel, 0 bytes/packet, 1024 frames/packet, 0 bytes/frame
Channel layout: Quadraphonic
estimated duration: 582.741338 sec
audio bytes: 18338514
audio packets: 25099
bit rate: 251730 bits per second
packet size upper bound: 1039
maximum packet size: 1039
audio data file offset: 106496
optimized
audio 25698893 valid frames + 2112 priming + 371 remainder = 25701376
source bit depth: I16
format list:
[ 0] format: 4 ch, 44100 Hz, 'aac ' (0x00000000) 0 bits/channel, 0 bytes/packet, 1024 frames/packet, 0 bytes/frame
Channel layout: Quadraphonic
----
And for the lossless encoded audio file:
File: 4ch_44100_ALAC.caf
File type ID: caff
Num Tracks: 1
----
Data format: 4 ch, 44100 Hz, 'alac' (0x00000001) from 16-bit source, 4096 frames/packet
Channel layout: 4.0 (C L R Cs)
estimated duration: 582.741338 sec
audio bytes: 83333400
audio packets: 6275
bit rate: 1143862 bits per second
packet size upper bound: 16777
maximum packet size: 16777
audio data file offset: 20480
optimized
audio 25698893 valid frames + 0 priming + 3507 remainder = 25702400
source bit depth: I16
----
The code
In the code part, at the beginning, we followed the implementation presented in session 505 of WWDC12 using the AVAudioPlayer API. At that level, multirouting didn't seem to work reliably. We didn't suspect that this might be related to the audio format, so we moved on to experimenting with the AVAudioEngine API presented in session 502 of WWDC14 and the sample code associated with it. We made multirouting work for the master 4-channel audio file (after some adaptations), but then we hit the error mentioned above when calling scheduleFile, as shown in the code snippet below (Note: we are using Swift, and all the necessary audio graph setup is done but not shown here):
var playerNode: AVAudioPlayerNode!
...
...
let audioFileToPlay = AVAudioFile(forReading: URLOfTheAudioFle)
playerNode.scheduleFile(audioFileToPlay, atTime: nil, completionHandler: nil)
Does someone have a hint about what could be wrong with the audio data format?
After contacting Apple Support, the answer was that this is not possible for the currently shipping system configurations:
"Thank you for contacting Apple Developer Technical Support (DTS). Our engineers have reviewed your request and have concluded that there is no supported way to achieve the desired functionality given the currently shipping system configurations."

How can I get my iPhone to listen for sound frequencies above a certain threshold?

I'm interested in getting my iOS app to turn on the microphone and only listen for frequencies above 17000 Hz. If it hears something in that range, I'd like the app to call a method.
I was able to find a repository that detects frequency: https://github.com/krafter/DetectingAudioFrequency
And here is a post breaking down FFT:
Get Hz frequency from audio stream on iPhone
Using these examples, I've been able to get the phone to react to the strongest frequency it hears, but I'm more interested in just reacting to frequencies above 17000 Hz.
The fact that I wrote that code helps me answer this question, but the answer probably only applies to this code.
You can easily limit the frequencies you listen to just by trimming that output array to a piece that contains only the range you need.
In detail: to keep it simple, array[0..255] contains your audio in the frequency domain. Say your sample rate was 44100 when you did the FFT.
Then the maximum frequency you can encode is 22050 Hz (Nyquist theorem).
The bin spacing is 22050/256 = 86.13 Hz, so array[k] holds the value for k * 86.13 Hz: array[1] is 86.13 Hz, array[2] is 172.26 Hz, array[3] is 258.39 Hz, and so on (array[0] is the DC component). Your full range is distributed across those 256 values (and yes, precision suffers).
So if you only need to listen to some range, let's say above 17000 Hz, you just take a piece of that array and ignore the rest. In this case you take the subarray from index 17000/86.13 ≈ 197 up to 255, and you have it: only the 17000-22050 Hz range.
In my repo you modify the strongestFrequencyHZ function like this:
static Float32 strongestFrequencyHZ(Float32 *buffer, FFTHelperRef *fftHelper, UInt32 frameSize, Float32 *freqValue) {
    Float32 *fftData = computeFFT(fftHelper, buffer, frameSize);
    fftData[0] = 0.0;
    unsigned long length = frameSize/2.0;

    Float32 max = 0;
    unsigned long maxIndex = 0;

    Float32 freqLimit = 17000; //HZ
    Float32 freqsPerIndex = NyquistMaxFreq/length;
    unsigned long lowestLimitIndex = (unsigned long) freqLimit/freqsPerIndex;

    unsigned long newLen = length-lowestLimitIndex;
    Float32 *newData = fftData+lowestLimitIndex; //address arithmetic

    max = vectorMaxValueACC32_index(newData, newLen, 1, &maxIndex);
    if (freqValue!=NULL) { *freqValue = max; }
    Float32 HZ = frequencyHerzValue(lowestLimitIndex+maxIndex, length, NyquistMaxFreq);
    return HZ;
}
I did some address arithmetic in there so it looks kind of complicated. You can just take that fftData array and do the regular stuff.
Other things to keep in mind:
Finding the strongest frequency is easy: you just find the maximum in that array. That's it. But in your case you need to monitor the range and find when it goes from regular weak noise to some strong signal, in other words when the level peaks, and this is not so trivial, but it is possible. You can probably just set some limit above which the signal counts as detected, although this is not the best option.
I would rather be optimistic about this, because in real life there isn't much noise at 18000 Hz around you. The only thing I can remember is some old TVs that produce that high-pitched sound when they're on.
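As a rough illustration of the thresholding idea mentioned above, a hedged C sketch (the helper name and threshold value are made up; in a real app you would calibrate the threshold against measured background noise):

#include <stddef.h>

/* Return 1 if any bin in fftData[startBin .. numBins-1] exceeds a fixed magnitude
 * threshold, 0 otherwise (illustrative helper, not part of the repo above). */
static int band_exceeds_threshold(const float *fftData, size_t numBins,
                                  size_t startBin, float threshold)
{
    for (size_t i = startBin; i < numBins; i++) {
        if (fftData[i] > threshold) {
            return 1;
        }
    }
    return 0;
}

/* Usage idea: with 256 bins and about 86.13 Hz per bin, startBin = 197 covers
 * roughly 17 kHz up to Nyquist:
 *   if (band_exceeds_threshold(fftData, 256, 197, 50.0f)) { ... react ... }
 */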

Extract Treble and Bass from audio in iOS

I'm looking for a way to get the treble and bass data from a song for some increment of time (say 0.1 seconds), in the range 0.0 to 1.0. I've googled around but haven't been able to find anything remotely close to what I'm looking for. Ultimately I want to be able to represent the treble and bass levels while the song is playing.
Thanks!
It's reasonably easy. You need to perform an FFT and then sum up the bins that interest you. How you select them will depend largely on the sampling rate of your audio.
You then need to choose an appropriate FFT order to get good information in the frequency bins returned.
So if you do an order 8 FFT you will need 256 samples. This will return you 128 complex pairs.
Next you need to convert these to magnitudes. This is actually quite simple: if you are using std::complex you can simply call std::abs on the complex number and you will have its magnitude (sqrt( r^2 + i^2 )).
Interestingly, at this point there is something called Parseval's theorem. This theorem states that after performing a Fourier transform, the sum of the bins returned is equal to the sum of the mean squares of the input signal.
This means that to get the amplitude of a specific set of bins you can simply add them together, divide by the number of them, and then take the square root to get the RMS amplitude value of those bins.
So where does this leave you?
Well from here you need to figure out which bins you are adding together.
A treble tone is defined as above 2000Hz.
A bass tone is below 300Hz (if my memory serves me correctly).
Mids are between 300Hz and 2kHz.
Now suppose your sample rate is 8kHz. The Nyquist rate says that the highest frequency you can represent in 8kHz sampling is 4kHz. Each bin thus represents 4000/128 or 31.25Hz.
So the first 10 bins (up to 312.5 Hz) are used for the bass frequencies, bins 10 to 63 represent the mids, and finally bins 64 to 127 are the trebles.
You can then calculate the RMS value as described above and you have the RMS values.
RMS values can be converted to dBFS values by computing 20.0f * log10f( rmsVal );. This will give you a value from 0 dB (max amplitude) down to -infinity dB (min amplitude). Be aware that amplitudes do not range from -1 to 1.
To help you along, here is a bit of my C++ based FFT class for iPhone (which uses vDSP under the hood):
MacOSFFT::MacOSFFT( unsigned int fftOrder ) :
    BaseFFT( fftOrder )
{
    mFFTSetup = (void*)vDSP_create_fftsetup( mFFTOrder, 0 );

    mImagBuffer.resize( 1 << mFFTOrder );
    mRealBufferOut.resize( 1 << mFFTOrder );
    mImagBufferOut.resize( 1 << mFFTOrder );
}

MacOSFFT::~MacOSFFT()
{
    vDSP_destroy_fftsetup( (FFTSetup)mFFTSetup );
}

bool MacOSFFT::ForwardFFT( std::vector< std::complex< float > >& outVec, const std::vector< float >& inVec )
{
    return ForwardFFT( &outVec.front(), &inVec.front(), inVec.size() );
}

bool MacOSFFT::ForwardFFT( std::complex< float >* pOut, const float* pIn, unsigned int num )
{
    // Bring in a pre-allocated imaginary buffer that is initialised to 0.
    DSPSplitComplex dspscIn;
    dspscIn.realp  = (float*)pIn;
    dspscIn.imagp  = &mImagBuffer.front();

    DSPSplitComplex dspscOut;
    dspscOut.realp = &mRealBufferOut.front();
    dspscOut.imagp = &mImagBufferOut.front();

    vDSP_fft_zop( (FFTSetup)mFFTSetup, &dspscIn, 1, &dspscOut, 1, mFFTOrder, kFFTDirection_Forward );
    vDSP_ztoc( &dspscOut, 1, (DSPComplex*)pOut, 1, num );
    return true;
}
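Putting the band arithmetic from above into code, here is a hedged C sketch (the helper name is mine; the 8 kHz / 128-bin split matches the example figures above, and the magnitude array is assumed to have been produced already, e.g. via std::abs as described):

#include <math.h>
#include <stddef.h>

/* RMS over the magnitude bins [lo, hi): sum the squared magnitudes, divide by
 * the count, then take the square root (illustrative helper). */
static float band_rms(const float *magnitudes, size_t lo, size_t hi)
{
    float sum = 0.0f;
    for (size_t i = lo; i < hi; i++) {
        sum += magnitudes[i] * magnitudes[i];
    }
    return sqrtf(sum / (float)(hi - lo));
}

/* With 128 bins covering 0-4000 Hz (8 kHz sample rate, 31.25 Hz per bin):
 *   bass   = band_rms(magnitudes, 0, 10);    roughly 0 - 312.5 Hz
 *   mids   = band_rms(magnitudes, 10, 64);   roughly 312.5 Hz - 2 kHz
 *   treble = band_rms(magnitudes, 64, 128);  roughly 2 - 4 kHz
 *   bassDb = 20.0f * log10f(bass);           dBFS, per the conversion above
 */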
It seems that you're looking for Fast Fourier Transform sample code.
It is quite a large topic to cover in an answer.
The tools you will need are already built into iOS: the vDSP API
This should help you: vDSP Programming Guide
And there is also FFT Sample Code available.
You might also want to check out iPhoneFFT. Though that code is slightly outdated, it can help you understand the processes "under the hood".
Refer to the aurioTouch2 example from Apple - it has everything from frequency analysis to the UI representation of what you want.
