Record PCM audio data every 10 milliseconds without playback - iOS

I need to record PCM audio every 10 milliseconds, without playback, in Swift. I have tried this code, but I can't figure out how to stop playback while recording:
RecordAudio GitHub repo
And a second question: how can I properly get PCM data out of the circular buffer for an encode/decode process? When I convert the recorded audio data to signed or unsigned bytes (or anything else), the converted data is sometimes corrupted. What is the best practice for this kind of processing?

In the RecordAudio sample code, the audio format is specified as Float (32-bit floats). When doing a float-to-integer conversion, you have to make sure your scale and offset result in a value in the legal range for the destination type: e.g. check that -1.0 to 1.0 maps to 0 to 255 (unsigned byte), and that out-of-range values are clipped to legal values, as in the sketch below. Also pay attention to the number of samples you convert, as an Audio Unit callback can vary the frameCount sent (the number of samples returned). You most likely won't get exactly 10 ms of audio in any single RemoteIO callback; instead you may have to observe a circular buffer filled by multiple callbacks, or split a larger buffer.
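A minimal sketch of such a conversion (not from the original answer; it assumes mono Float32 input in the -1.0 to 1.0 range and an unsigned 8-bit destination):

#include <stdint.h>

static void floatToUInt8(const float *in, uint8_t *out, int count) {
    for (int i = 0; i < count; i++) {
        float v = in[i] * 128.0f + 128.0f;   // scale and offset -1.0...1.0 toward 0...255
        if (v < 0.0f)   v = 0.0f;            // clip out-of-range input to the legal range
        if (v > 255.0f) v = 255.0f;
        out[i] = (uint8_t)v;
    }
}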
When RemoteIO is running in play-and-record mode, you can usually silence playback by zeroing the bufferList buffers (after copying, analyzing, or otherwise using the data in them) before returning from the Audio Unit callback.
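A minimal sketch of that silencing step inside a standard AURenderCallback (error handling and the actual AudioUnitRender pull are omitted):

#include <AudioToolbox/AudioToolbox.h>
#include <string.h>

static OSStatus recordCallback(void *inRefCon,
                               AudioUnitRenderActionFlags *ioActionFlags,
                               const AudioTimeStamp *inTimeStamp,
                               UInt32 inBusNumber,
                               UInt32 inNumberFrames,
                               AudioBufferList *ioData) {
    // ... pull the microphone samples with AudioUnitRender and copy them out ...
    // Then zero the buffers so nothing audible is played back:
    for (UInt32 i = 0; i < ioData->mNumberBuffers; i++) {
        memset(ioData->mBuffers[i].mData, 0, ioData->mBuffers[i].mDataByteSize);
    }
    return noErr;
}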

Related

How to modify the size of AudioUnit Buffer?

I'm developing a recording app, and I need the size of the input buffer to be 882 bytes. I know that I can modify the mDataByteSize of the bufferList.
But it can only be set to a power of 2. When I tried to set it to 882, I got "AudioUnitRender error: -50".
I hope somebody can help me, because I'm stuck.
You can't demand a specific input size in an Audio Unit recording callback bufferList. In fact, the Audio Unit API is allowed to change the number of samples per audio buffer at run time, so your app has to handle a frame count that differs from the one requested, on every callback.
Instead, your app should save the samples into a temporary FIFO buffer, and later remove samples in your desired block size once that FIFO has filled enough. Typically a circular buffer is used to hold the samples until it contains at least the amount you need; then you can pull out exactly 882 bytes, or whatever number of samples you need, as in the sketch below.
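A minimal sketch of that accumulate-then-drain pattern (hypothetical names, not from the original answer; a real app would use a lock-free ring buffer such as TPCircularBuffer, since this plain version is not thread-safe):

#define FIFO_CAPACITY 8192
#define BLOCK_BYTES   882

typedef struct {
    unsigned char data[FIFO_CAPACITY];
    int head, tail, count;
} ByteFIFO;

// Append whatever each callback delivers (excess bytes are dropped here;
// a real app would size the buffer to avoid that).
static void fifoWrite(ByteFIFO *f, const unsigned char *src, int n) {
    for (int i = 0; i < n && f->count < FIFO_CAPACITY; i++) {
        f->data[f->head] = src[i];
        f->head = (f->head + 1) % FIFO_CAPACITY;
        f->count++;
    }
}

// Returns 1 and fills dst only when a full 882-byte block is available.
static int fifoReadBlock(ByteFIFO *f, unsigned char *dst) {
    if (f->count < BLOCK_BYTES) return 0;
    for (int i = 0; i < BLOCK_BYTES; i++) {
        dst[i] = f->data[f->tail];
        f->tail = (f->tail + 1) % FIFO_CAPACITY;
    }
    f->count -= BLOCK_BYTES;
    return 1;
}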

What's the reason of using Circular Buffer in iOS Audio Calling APP?

My question is pretty much self-explanatory. Sorry if it seems too dumb.
I am writing an iOS VoIP dialer and have checked some open-source code (iOS audio calling apps). Almost all of them use a circular buffer for storing recorded and received PCM audio data. So I am wondering why we need to use a circular buffer in this case. What's the exact reason for using such an audio buffer?
Thanks in advance.
Using a circular buffer lets you process the input and output data asynchronously from its source. The audio render process takes place on a high-priority thread. It asks your app for audio samples (playback), and offers audio to it (recording/processing), on a timer in the form of callbacks.
A typical scenario would be for the audio callback to fire every 0.023 seconds to ask for (and/or offer) 1024 samples of audio (1024 / 44100 ≈ 0.023 s at a 44.1 kHz sample rate). This thread is synchronized with the system hardware, so it is imperative that your callback returns before the 0.023 seconds are up. If it doesn't, the hardware won't wait for you; it will just skip that cycle, and you will get an audible pop or silence, or miss audio you were trying to record.
A circular buffer's job is to pass data between threads. In an audio application, that means moving samples to and from the audio thread asynchronously. One thread produces samples onto the "head" of the buffer, and the other thread consumes them from the "tail".
Here's an example, retrieving audio samples from the microphone and writing them to disk. Your app has subscribed to a callback that fires every 0.023 seconds, offering 1024 samples to be recorded. The naive approach would be to simply write the audio to disk from within that callback.
void myCallback(float *samples, int sampleCount, SampleSaver *saver) {
    SampleSaverSaveSamples(saver, samples, sampleCount);
}
This will work!! Most of the time...
The problem is that there is no guarantee that writing to disk will finish within 0.023 seconds, so every now and then your recording has a pop in it, because SampleSaver simply took too long and the hardware skipped the next callback.
The right way to do this is to use a circular buffer. I personally use TPCircularBuffer because it's awesome. The way it works (externally) is that you ask the buffer for a pointer to write to (the head) on one thread, and then on another thread you ask the buffer for a pointer to read from (the tail). Here's how it would be done using TPCircularBuffer (skipping setup and using a simplified callback).
// This runs on the high-priority audio thread, which can't wait for anything
// as slow as a disk write.
void myCallback(float *samples, int sampleCount, TPCircularBuffer *buffer) {
    int32_t availableBytes = 0;
    float *head = TPCircularBufferHead(buffer, &availableBytes);
    int32_t bytesToCopy = sampleCount * (int32_t)sizeof(float);
    if (head && availableBytes >= bytesToCopy) {      // don't overrun the buffer
        memcpy(head, samples, bytesToCopy);           // copies samples to the head
        TPCircularBufferProduce(buffer, bytesToCopy); // moves the head "forward in the circle"
    }
}
This operation is super quick and puts no extra pressure on that sensitive audio thread. You then create your own timer on a separate thread to write the samples to disk.
// This runs on some background thread that can take its sweet time.
void myLeisurelySavingCallback(TPCircularBuffer *buffer, SampleSaver *saver) {
    int32_t availableBytes = 0;
    float *tail = TPCircularBufferTail(buffer, &availableBytes);
    int samplesInBuffer = availableBytes / sizeof(float);             // mono
    SampleSaverSaveSamples(saver, tail, samplesInBuffer);
    TPCircularBufferConsume(buffer, samplesInBuffer * sizeof(float)); // moves the tail forward
}
And there you have it: not only do you avoid audio glitches, but if you initialize a big enough buffer, you can set your write-to-disk callback to fire only every second or two (after the circular buffer has built up a good amount of audio), which is much easier on your system than writing to disk every 0.023 seconds!
The main reason to use the buffer is so the samples can be handled asynchronously. Circular buffers are also a great way to pass messages between threads without locks. Here is a good article explaining a neat memory trick used in circular buffer implementations.
Good question. There is another good reason for using a circular buffer.
In iOS, if you use callbacks (Audio Units) for recording and playing audio (in fact, you need to if you want to build a real-time audio transfer app), then the recorder callback hands you a chunk of data covering a specific amount of time, say 20 milliseconds. And on iOS you will never reliably get a fixed length of data: if you set the callback interval to 20 ms, you might get 370 bytes or 372 bytes of data, and you never know in advance which (correct me if I am wrong). Then, to transfer the audio over UDP packets, you need a codec for encoding and decoding (G.729 is commonly used for VoIP apps). But G.729 takes data in multiples of 8. Suppose you encode 368 (8 × 46) bytes per 20 ms; what are you going to do with the rest of the data? You need to store it, in sequence, for the next chunk to process.
So that's the reason. There are some other details, but I kept it simple for better understanding. Just comment below if you have any questions.

Get peak volume of audio input on iOS

On iOS 7, how do I get the current microphone input volume in a range between 0 and 1?
I've seen several approaches like this one, but the results I get baffle me.
The return values of peakPowerForChannel: are documented to be in the range of -160 to 0 with 0 being the loudest and -160 near absolute silence.
Problem: Given a quiet room and a short but loud noise, the power goes all the way up in an instant, but takes a very long time to drop back to the quiet level (way longer than the actual noise...)
What I want: Essentially I want an exact copy of the Audio Input patch of Quartz Composer with its Volume Peak output. Any tips?
To get a similar volume peak measurement, you might have to capture raw audio via the iOS Audio Queue API (or the RemoteIO Audio Unit) and analyze the raw PCM waveform samples in each audio callback, looking for a magnitude maximum over your desired frame width or analysis time.
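A minimal sketch of that analysis (not from the original answer; it assumes Float32 PCM samples, which are already normalized so full scale is 1.0):

#include <math.h>

// Scan one callback's worth of samples for the peak magnitude.
// Returns a value in the 0...1 range the question asks for.
static float peakLevel(const float *samples, int count) {
    float peak = 0.0f;
    for (int i = 0; i < count; i++) {
        float m = fabsf(samples[i]);
        if (m > peak) peak = m;
    }
    return peak;  // 0.0 = silence, 1.0 = full scale
}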

Millisecond (and greater) precision for audio file elapsed time in iOS

I am looking for a low-latency way of finding out how many seconds have elapsed in an audio file to guaranteed millisecond precision in real-time. According to the AVAudioPlayer class reference, a call to -currentTime will return "the offset of the current playback position, measured in seconds from the start of the sound", however an NSTimeInterval is a double and this implies fractions of a second are possible.
As a testing scenario, I have an audio file playing and the user taps a button. Playback DOES NOT pause/stop, but at the moment the button was tapped I would like to obtain information about the elapsed time. In the real application, the "button may be pressed" many times in one second, hence the need for millisecond precision.
My files are stored as AIFF files and are around 1-10 minutes in length. Ideally I would like to find out exactly which sample frame is 'up-next' when playback resumes - however, this level of precision is a little excessive and millisecond precision is perfectly acceptable.
Is AVAudioPlayer's -currentTime method sufficient to achieve guaranteed millisecond precision for a currently-playing audio file? Or, would it be preferable to use a lower-level API such as iOS's Audio Units?
If you want sub-millisecond relative time resolution, convert to raw PCM and count buffers × buffer length + samples, using a low-latency RemoteIO Audio Unit configuration. Most iOS devices will support RemoteIO buffers as small as 256 samples (under 6 ms at 44.1 kHz), with a callback for each buffer.
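A minimal sketch of that counting approach (hypothetical names, assuming a 44.1 kHz hardware sample rate): keep a running frame total in the RemoteIO callback, and derive elapsed time from it on demand.

#include <AudioToolbox/AudioToolbox.h>
#include <stdatomic.h>

static _Atomic uint64_t gFramesElapsed = 0;  // total frames rendered so far
static double gSampleRate = 44100.0;         // assumed hardware sample rate

static OSStatus renderCallback(void *inRefCon,
                               AudioUnitRenderActionFlags *ioActionFlags,
                               const AudioTimeStamp *inTimeStamp,
                               UInt32 inBusNumber,
                               UInt32 inNumberFrames,
                               AudioBufferList *ioData) {
    // ... render or copy this buffer's audio here ...
    atomic_fetch_add(&gFramesElapsed, inNumberFrames);
    return noErr;
}

// Call from any thread, e.g. the button's tap handler:
static double elapsedSeconds(void) {
    return (double)atomic_load(&gFramesElapsed) / gSampleRate;
}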

kAudioDevicePropertyBufferFrameSize replacement for iOS

I was trying to set up an audio unit to render the music (instead of an Audio Queue, which was too opaque for my purposes). iOS doesn't have the kAudioDevicePropertyBufferFrameSize property, so any idea how I can derive this value to set up the buffer size of my IO unit?
I found this post interesting: it asks about the possibility of using a combination of the kAudioSessionProperty_CurrentHardwareIOBufferDuration and kAudioSessionProperty_CurrentHardwareOutputLatency audio session properties to determine that value, but there is no answer. Any ideas?
You can use the kAudioSessionProperty_CurrentHardwareIOBufferDuration property, which represents the buffer size in seconds. Multiply this by the sample rate you get from kAudioSessionProperty_CurrentHardwareSampleRate to get the number of samples you should buffer.
The resulting buffer size should be a multiple of 2. I believe either 512 or 4096 is what you're likely to get, but you should always base it on the values returned from AudioSessionGetProperty.
Example:
Float64 sampleRate;
UInt32 propSize = sizeof(Float64);
AudioSessionGetProperty(kAudioSessionProperty_CurrentHardwareSampleRate,
                        &propSize,
                        &sampleRate);

Float32 bufferDuration;
propSize = sizeof(Float32);
AudioSessionGetProperty(kAudioSessionProperty_CurrentHardwareIOBufferDuration,
                        &propSize,
                        &bufferDuration);

UInt32 bufferLengthInFrames = sampleRate * bufferDuration;
The next step is to find out the input stream format of the unit you're sending audio to. Based on your description, I'm assuming that you're programmatically generating audio to send to the speakers. This code assumes unit is an AudioUnit you're sending audio to, whether that's the RemoteIO or something like an effect Audio Unit.
AudioStreamBasicDescription inputASBD;
UInt32 propSize = sizeof(AudioStreamBasicDescription);
AudioUnitGetProperty(unit,
                     kAudioUnitProperty_StreamFormat,
                     kAudioUnitScope_Input,
                     0,
                     &inputASBD,
                     &propSize);
After this, inputASBD.mFormatFlags will be a bit field corresponding to the audio stream format that unit is expecting. The two most likely sets of flags are named kAudioFormatFlagsCanonical and kAudioFormatFlagsAudioUnitCanonical. These two have corresponding sample types AudioSampleType and AudioUnitSampleType that you can base your size calculation off of.
As an aside, AudioSampleType typically represents samples coming from the mic or destined for the speakers, whereas AudioUnitSampleType is usually for samples that are intended to be processed (by an audio unit, for example). At the moment on iOS, AudioSampleType is an SInt16, and AudioUnitSampleType is a fixed-point 8.24 number stored in an SInt32 container. Here's a post on the Core Audio mailing list explaining this design choice.
The reason I hold back from saying something like "just use Float32, it'll work" is because the actual bit representation of the stream is subject to change if Apple feels like it.
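For illustration, a minimal sketch of the relationship between those two sample types, assuming the current SInt16 and 8.24 layouts the answer describes (and subject to the same caveat that Apple may change them):

#include <stdint.h>

// SInt16 carries 15 fractional bits; the 8.24 format carries 24.
// Converting up is therefore a shift left by 9 bits (done here as a
// multiply by 512, which is well-defined for negative values).
static int32_t sint16To824(int16_t s) {
    return (int32_t)s * 512;  // Q0.15 -> Q8.24
}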
The audio unit itself decides on the actual buffer size, so the app's audio unit callback has to be able to handle any reasonable size given to it. You can suggest a preferred value and poll the kAudioSessionProperty_CurrentHardwareIOBufferDuration property, but note that this value can change while your app is running (especially during screen lock, call interruptions, etc.), outside of the app's control.
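A minimal sketch of that suggest-then-poll pattern, using the (now-deprecated) Audio Session C API this answer refers to:

#include <AudioToolbox/AudioToolbox.h>

// Suggest a preferred IO buffer duration, then read back what the
// hardware actually granted. The returned value may differ from the
// request, and may change again later while the app is running.
static Float32 requestBufferDuration(Float32 preferredSeconds) {
    AudioSessionSetProperty(kAudioSessionProperty_PreferredHardwareIOBufferDuration,
                            sizeof(preferredSeconds), &preferredSeconds);
    Float32 actual = 0;
    UInt32 size = sizeof(actual);
    AudioSessionGetProperty(kAudioSessionProperty_CurrentHardwareIOBufferDuration,
                            &size, &actual);
    return actual;
}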
