Corrupt recording with repeating audio in iOS

My application records streaming audio on iPhone. My problem is that a small percentage (~2%) of the recordings are corrupted: they appear to have some audio buffers duplicated.
For example, listen to this file.
Edit: Surprisingly, looking closely at the data in Audacity shows that the repeating parts are very similar but not identical. Since FLAC (the format I use for encoding the audio) is lossless compression, I suspect this is not a bug in the streaming/encoding; the problem originates in the data that comes from the microphone!
Below is the code I use to setup the audio recording streaming - is there anything wrong with it?
// see functions implementation below
- (void)startRecording
{
    dispatch_async(dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_DEFAULT, 0), ^{
        [self setUpRecordQueue];
        [self setUpRecordQueueBuffers];
        [self primeRecordQueueBuffers];
        AudioQueueStart(recordQueue, NULL);
    });
}

// this is called only once before any recording takes place
- (void)setUpAudioFormat
{
    AudioSessionInitialize(NULL, NULL, nil, (__bridge void *)(self));

    UInt32 sessionCategory = kAudioSessionCategory_PlayAndRecord;
    AudioSessionSetProperty(kAudioSessionProperty_AudioCategory,
                            sizeof(sessionCategory),
                            &sessionCategory);
    AudioSessionSetActive(true);

    audioFormat.mFormatID         = kAudioFormatLinearPCM;
    audioFormat.mSampleRate       = SAMPLE_RATE; // 16000.0;
    audioFormat.mChannelsPerFrame = CHANNELS;    // 1;
    audioFormat.mBitsPerChannel   = 16;
    audioFormat.mFramesPerPacket  = 1;
    audioFormat.mBytesPerFrame    = audioFormat.mChannelsPerFrame * sizeof(SInt16);
    audioFormat.mBytesPerPacket   = audioFormat.mBytesPerFrame * audioFormat.mFramesPerPacket;
    audioFormat.mFormatFlags      = kLinearPCMFormatFlagIsSignedInteger | kLinearPCMFormatFlagIsPacked;

    bufferNumPackets = 2048; // must be power of 2 for FFT!
    bufferByteSize = [self byteSizeForNumPackets:bufferNumPackets];
}

// I suspect the duplicate buffers arrive here:
static void recordCallback(
    void* inUserData,
    AudioQueueRef inAudioQueue,
    AudioQueueBufferRef inBuffer,
    const AudioTimeStamp* inStartTime,
    UInt32 inNumPackets,
    const AudioStreamPacketDescription* inPacketDesc)
{
    Recorder* recorder = (__bridge Recorder*) inUserData;
    if (inNumPackets > 0)
    {
        // append the buffer to FLAC encoder
        [recorder recordedBuffer:inBuffer->mAudioData byteSize:inBuffer->mAudioDataByteSize packetsNum:inNumPackets];
    }
    AudioQueueEnqueueBuffer(inAudioQueue, inBuffer, 0, NULL);
}

- (void)setUpRecordQueue
{
    OSStatus errorStatus = AudioQueueNewInput(
        &audioFormat,
        recordCallback,
        (__bridge void *)(self), // userData
        CFRunLoopGetMain(),      // run loop
        NULL,                    // run loop mode
        0,                       // flags
        &recordQueue);

    UInt32 trueValue = true;
    AudioQueueSetProperty(recordQueue, kAudioQueueProperty_EnableLevelMetering, &trueValue, sizeof(UInt32));
}

- (void)setUpRecordQueueBuffers
{
    for (int t = 0; t < NUMBER_AUDIO_DATA_BUFFERS; ++t)
    {
        OSStatus errorStatus = AudioQueueAllocateBuffer(
            recordQueue,
            bufferByteSize,
            &recordQueueBuffers[t]);
    }
}

- (void)primeRecordQueueBuffers
{
    for (int t = 0; t < NUMBER_AUDIO_DATA_BUFFERS; ++t)
    {
        OSStatus errorStatus = AudioQueueEnqueueBuffer(
            recordQueue,
            recordQueueBuffers[t],
            0,
            NULL);
    }
}

It turns out there was a rare bug that allowed multiple recordings to start at nearly the same time, so two recordings ran in parallel but sent their audio buffers to the same callback, producing the distorted, repeating buffers in the encoded recordings.
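The fix can be sketched as a guard that rejects a second concurrent start attempt. This is an illustrative C sketch, not the original Objective-C code; `try_start_recording` and the atomic flag are my own names:

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Only one recording may be active at a time. A compare-and-swap
   lets exactly one of several near-simultaneous callers through. */
static atomic_bool recording_active = false;

int try_start_recording(void)
{
    bool expected = false;
    /* First caller flips the flag; concurrent callers see it set and bail. */
    if (!atomic_compare_exchange_strong(&recording_active, &expected, true))
        return 0; /* a recording is already running */
    /* ... set up the queue, enqueue buffers, AudioQueueStart ... */
    return 1;
}

void stop_recording(void)
{
    atomic_store(&recording_active, false);
}
```

In the original code the same guard would go at the top of `startRecording`, before the `dispatch_async` block is submitted.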

Related

How to interleave a non-interleaved AudioBufferList inside a render callback?

I'm working on a project that involves streaming audio from an AVPlayer video player object into libpd using an MTAudioProcessingTap. For the tap's process loop, I used PdAudioUnit's render callback code as a guide, but I recently realized that the audio format expected by libpd is not the same as the audio coming from the tap: the tap provides two buffers of non-interleaved audio data in the incoming AudioBufferList, whereas libpd expects interleaved samples. I don't think I can change the tap itself to provide interleaved samples.
Does anyone know of a way I can work around this?
I think that I need to somehow create a new AudioBufferList or float buffer and interleave the samples in place; but I'm not quite sure how to do this and it seems like it would be expensive. If anyone could give me some pointers I would greatly appreciate it!
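Interleaving two mono planes is just an index shuffle and is cheap relative to the DSP work. A minimal C sketch (the function name is mine, assuming 32-bit float samples):

```c
#include <stddef.h>

/* Merge two non-interleaved (planar) channels into one
   interleaved L/R buffer of 2*frames samples. */
void interleave_stereo(const float *left, const float *right,
                       float *out, size_t frames)
{
    for (size_t i = 0; i < frames; ++i) {
        out[2 * i]     = left[i];  /* even slots: left channel  */
        out[2 * i + 1] = right[i]; /* odd slots:  right channel */
    }
}
```

The inverse (de-interleaving back into the tap's two buffers) is the same loop with the assignments reversed.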
Here is my code for installing my tap:
- (void)installTapWithItem:(AVPlayerItem *)playerItem {
    MTAudioProcessingTapCallbacks callbacks;
    callbacks.version = kMTAudioProcessingTapCallbacksVersion_0;
    callbacks.clientInfo = (__bridge void *)self;
    callbacks.init = tap_InitCallback;
    callbacks.finalize = tap_FinalizeCallback;
    callbacks.prepare = tap_PrepareCallback;
    callbacks.unprepare = tap_UnprepareCallback;
    callbacks.process = tap_ProcessCallback;

    MTAudioProcessingTapRef audioProcessingTap;
    if (noErr == MTAudioProcessingTapCreate(kCFAllocatorDefault, &callbacks, kMTAudioProcessingTapCreationFlag_PreEffects, &audioProcessingTap))
    {
        NSLog(@"Tap created!");
        AVAssetTrack *audioTrack = [playerItem.asset tracksWithMediaType:AVMediaTypeAudio].firstObject;
        AVMutableAudioMixInputParameters* inputParams = [AVMutableAudioMixInputParameters audioMixInputParametersWithTrack:audioTrack];
        inputParams.audioTapProcessor = audioProcessingTap;

        AVMutableAudioMix* audioMix = [AVMutableAudioMix audioMix];
        audioMix.inputParameters = @[inputParams];
        playerItem.audioMix = audioMix;
    }
}
And my tap_ProcessCallback:
static void tap_ProcessCallback(MTAudioProcessingTapRef tap, CMItemCount numberFrames, MTAudioProcessingTapFlags flags, AudioBufferList *bufferListInOut, CMItemCount *numberFramesOut, MTAudioProcessingTapFlags *flagsOut)
{
    OSStatus status = MTAudioProcessingTapGetSourceAudio(tap, numberFrames, bufferListInOut, flagsOut, nil, numberFramesOut);
    if (noErr != status) {
        NSLog(@"Error: MTAudioProcessingTapGetSourceAudio: %d", (int)status);
        return;
    }

    TapProcessorContext *context = (TapProcessorContext *)MTAudioProcessingTapGetStorage(tap);

    // first, create the input and output ring buffers if they haven't been created yet
    if (context->frameSize != numberFrames) {
        NSLog(@"creating ring buffers with size: %ld", (long)numberFrames);
        createRingBuffers((UInt32)numberFrames, context);
    }

    // adapted from PdAudioUnit.m
    float *buffer = (float *)bufferListInOut->mBuffers->mData;

    if (context->inputRingBuffer || context->outputRingBuffer) {
        // output buffer info from ioData
        UInt32 outputBufferSize = bufferListInOut->mBuffers[0].mDataByteSize;
        UInt32 outputFrames = (UInt32)numberFrames;
        // UInt32 outputChannels = bufferListInOut->mBuffers[0].mNumberChannels;

        // input buffer info from ioData *after* rendering input samples
        UInt32 inputBufferSize = outputBufferSize;
        UInt32 inputFrames = (UInt32)numberFrames;
        // UInt32 inputChannels = 0;

        UInt32 framesAvailable = (UInt32)rb_available_to_read(context->inputRingBuffer) / context->inputFrameSize;
        while (inputFrames + framesAvailable < outputFrames) {
            // pad input buffer to make sure we have enough blocks to fill auBuffer,
            // this should hopefully only happen when the audio unit is started
            rb_write_value_to_buffer(context->inputRingBuffer, 0, context->inputBlockSize);
            framesAvailable += context->blockFrames;
        }
        rb_write_to_buffer(context->inputRingBuffer, 1, buffer, inputBufferSize);

        // input ring buffer -> context -> output ring buffer
        char *copy = (char *)buffer;
        while (rb_available_to_read(context->outputRingBuffer) < outputBufferSize) {
            rb_read_from_buffer(context->inputRingBuffer, copy, context->inputBlockSize);
            [PdBase processFloatWithInputBuffer:(float *)copy outputBuffer:(float *)copy ticks:1];
            rb_write_to_buffer(context->outputRingBuffer, 1, copy, context->outputBlockSize);
        }

        // output ring buffer -> audio unit
        rb_read_from_buffer(context->outputRingBuffer, (char *)buffer, outputBufferSize);
    }
}
Answering my own question...
I'm not sure exactly why this works, but it does. Apparently I didn't need to use ring buffers either, which is strange. I also added a branch for the case where mNumberBuffers holds only one buffer.
if (context->frameSize && outputBufferSize > 0) {
    if (bufferListInOut->mNumberBuffers > 1) {
        float *left = (float *)bufferListInOut->mBuffers[0].mData;
        float *right = (float *)bufferListInOut->mBuffers[1].mData;

        // manually interleave channels
        for (int i = 0; i < outputBufferSize; i += 2) {
            context->interleaved[i] = left[i / 2];
            context->interleaved[i + 1] = right[i / 2];
        }
        [PdBase processFloatWithInputBuffer:context->interleaved outputBuffer:context->interleaved ticks:64];

        // de-interleave
        for (int i = 0; i < outputBufferSize; i += 2) {
            left[i / 2] = context->interleaved[i];
            right[i / 2] = context->interleaved[i + 1];
        }
    } else {
        context->interleaved = (float *)bufferListInOut->mBuffers[0].mData;
        [PdBase processFloatWithInputBuffer:context->interleaved outputBuffer:context->interleaved ticks:32];
    }
}

Audio Recording AudioQueueStart buffer never filled

I am using AudioQueueStart to start recording on an iOS device, and I want all the recording data streamed to me in buffers so that I can process it and send it to a server.
Basic functionality works great; however, in my BufferFilled function I usually get fewer than 10 bytes of data on every call. This feels very inefficient, especially since I have tried to set the buffer size to 16384 bytes (see the beginning of the startRecording method).
How can I make it fill up the buffer more before calling BufferFilled? Or do I need to add a second-phase buffer before sending to the server to achieve what I want?
OSStatus BufferFilled(void *aqData, SInt64 inPosition, UInt32 requestCount, const void *inBuffer, UInt32 *actualCount) {
    AQRecorderState *pAqData = (AQRecorderState *)aqData;
    NSData *audioData = [NSData dataWithBytes:inBuffer length:requestCount];
    *actualCount = inBuffer + requestCount;
    // audioData is usually < 10 bytes, sometimes 100 bytes but never close to 16384 bytes
    return 0;
}

void HandleInputBuffer(void *aqData, AudioQueueRef inAQ, AudioQueueBufferRef inBuffer, const AudioTimeStamp *inStartTime, UInt32 inNumPackets, const AudioStreamPacketDescription *inPacketDesc) {
    AQRecorderState *pAqData = (AQRecorderState *)aqData;
    if (inNumPackets == 0 && pAqData->mDataFormat.mBytesPerPacket != 0)
        inNumPackets = inBuffer->mAudioDataByteSize / pAqData->mDataFormat.mBytesPerPacket;
    if (AudioFileWritePackets(pAqData->mAudioFile, false, inBuffer->mAudioDataByteSize, inPacketDesc, pAqData->mCurrentPacket, &inNumPackets, inBuffer->mAudioData) == noErr) {
        pAqData->mCurrentPacket += inNumPackets;
    }
    if (pAqData->mIsRunning == 0)
        return;
    OSStatus error = AudioQueueEnqueueBuffer(pAqData->mQueue, inBuffer, 0, NULL);
}
void DeriveBufferSize(AudioQueueRef audioQueue, AudioStreamBasicDescription *ASBDescription, Float64 seconds, UInt32 *outBufferSize) {
    static const int maxBufferSize = 0x50000;
    int maxPacketSize = ASBDescription->mBytesPerPacket;
    if (maxPacketSize == 0) {
        UInt32 maxVBRPacketSize = sizeof(maxPacketSize);
        AudioQueueGetProperty(audioQueue, kAudioQueueProperty_MaximumOutputPacketSize, &maxPacketSize, &maxVBRPacketSize);
    }
    Float64 numBytesForTime = ASBDescription->mSampleRate * maxPacketSize * seconds;
    *outBufferSize = (UInt32)(numBytesForTime < maxBufferSize ? numBytesForTime : maxBufferSize);
}

OSStatus SetMagicCookieForFile(AudioQueueRef inQueue, AudioFileID inFile) {
    OSStatus result = noErr;
    UInt32 cookieSize;
    if (AudioQueueGetPropertySize(inQueue, kAudioQueueProperty_MagicCookie, &cookieSize) == noErr) {
        char *magicCookie = (char *)malloc(cookieSize);
        if (AudioQueueGetProperty(inQueue, kAudioQueueProperty_MagicCookie, magicCookie, &cookieSize) == noErr)
            result = AudioFileSetProperty(inFile, kAudioFilePropertyMagicCookieData, cookieSize, magicCookie);
        free(magicCookie);
    }
    return result;
}
- (void)startRecording {
    aqData.mDataFormat.mFormatID = kAudioFormatMPEG4AAC;
    aqData.mDataFormat.mSampleRate = 22050.0;
    aqData.mDataFormat.mChannelsPerFrame = 1;
    aqData.mDataFormat.mBitsPerChannel = 0;
    aqData.mDataFormat.mBytesPerPacket = 0;
    aqData.mDataFormat.mBytesPerFrame = 0;
    aqData.mDataFormat.mFramesPerPacket = 1024;
    aqData.mDataFormat.mFormatFlags = kMPEG4Object_AAC_Main;
    AudioFileTypeID fileType = kAudioFileAAC_ADTSType;
    aqData.bufferByteSize = 16384;

    UInt32 defaultToSpeaker = TRUE;
    AudioSessionSetProperty(kAudioSessionProperty_OverrideCategoryDefaultToSpeaker, sizeof(defaultToSpeaker), &defaultToSpeaker);

    OSStatus status = AudioQueueNewInput(&aqData.mDataFormat, HandleInputBuffer, &aqData, NULL, kCFRunLoopCommonModes, 0, &aqData.mQueue);

    UInt32 dataFormatSize = sizeof(aqData.mDataFormat);
    status = AudioQueueGetProperty(aqData.mQueue, kAudioQueueProperty_StreamDescription, &aqData.mDataFormat, &dataFormatSize);

    status = AudioFileInitializeWithCallbacks(&aqData, nil, BufferFilled, nil, nil, fileType, &aqData.mDataFormat, 0, &aqData.mAudioFile);

    for (int i = 0; i < kNumberBuffers; ++i) {
        status = AudioQueueAllocateBuffer(aqData.mQueue, aqData.bufferByteSize, &aqData.mBuffers[i]);
        status = AudioQueueEnqueueBuffer(aqData.mQueue, aqData.mBuffers[i], 0, NULL);
    }
    aqData.mCurrentPacket = 0;
    aqData.mIsRunning = true;
    status = AudioQueueStart(aqData.mQueue, NULL);
}
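As a sanity check on DeriveBufferSize, the CBR branch of its formula can be reproduced standalone (the constant and cap are copied from the code above; the function name here is mine):

```c
#include <stdint.h>

/* CBR-only version of DeriveBufferSize's math: bytes needed to hold
   `seconds` of audio, capped at 0x50000 (327680) bytes. */
static uint32_t derive_buffer_size(double sample_rate,
                                   uint32_t bytes_per_packet,
                                   double seconds)
{
    const double max_buffer_size = 0x50000;
    double num_bytes = sample_rate * bytes_per_packet * seconds;
    return (uint32_t)(num_bytes < max_buffer_size ? num_bytes : max_buffer_size);
}
```

For VBR formats such as AAC, mBytesPerPacket is 0, which is why the real function falls back to querying kAudioQueueProperty_MaximumOutputPacketSize.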
UPDATE: I have logged the data that I receive and it is quite interesting: it almost seems like half of the "packets" are some kind of header and half are sound data. Could I assume this is just how the AAC encoding on iOS works? It writes a header in one buffer, then data in the next one, and so on. And it never wants more than around 170-180 bytes for each data chunk; is that why it ignores my large buffer?
I solved this eventually. It turns out that, yes, the encoding on iOS produces alternating small and large chunks of data. I added a second-phase buffer myself using NSMutableData and it worked perfectly.
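The second-phase buffer can be sketched in plain C as an accumulator that only hands data onward once a threshold is reached (all names here are illustrative; the original used NSMutableData):

```c
#include <stdlib.h>
#include <string.h>

/* Accumulates small chunks (e.g. ADTS headers and AAC payloads) and
   signals when enough bytes are buffered to be worth sending. */
typedef struct {
    unsigned char *data;
    size_t len, cap, threshold;
} ChunkAccumulator;

int acc_init(ChunkAccumulator *a, size_t threshold)
{
    a->cap = threshold * 2;  /* headroom past the flush threshold */
    a->data = malloc(a->cap);
    if (!a->data) return 0;
    a->len = 0;
    a->threshold = threshold;
    return 1;
}

/* Appends a chunk; returns 1 when the accumulated data is ready to send. */
int acc_append(ChunkAccumulator *a, const void *chunk, size_t n)
{
    if (a->len + n > a->cap)
        n = a->cap - a->len;  /* clamp; real code would grow the buffer */
    memcpy(a->data + a->len, chunk, n);
    a->len += n;
    return a->len >= a->threshold;
}

void acc_reset(ChunkAccumulator *a) { a->len = 0; }
```

In BufferFilled you would call `acc_append` with each tiny write and only transmit (then `acc_reset`) when it returns 1.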

How to play live audio on iOS?

I have an IP camera that requires the use of a custom library for connecting and communication. I have the video all taken care of, but I also want to give the user the option to listen to the audio that is recorded by the camera.
I receive the audio in the form of a byte stream (the audio is PCM u-law).
Since I don't read the data from a file or have a URL I can connect to, I think I would have to use something like Audio Units or OpenAL to play my audio.
I tried to implement it with AudioUnits based on the examples I found online and this is what I have so far:
-(void) audioThread
{
    char buffer[1024];
    int size = 0;
    bool audioConfigured = false;
    AudioComponentInstance audioUnit;

    while (running) {
        getAudioData(buffer, size); // fill buffer with my audio
        int16_t* tempChar = (int16_t *)calloc(ret, sizeof(int16_t));
        for (int i = 0; i < ret; i++) {
            tempChar[i] = MuLaw_Decode(buf[i]);
        }
        uint8_t *data = NULL;
        data = malloc(size);
        data = memcpy(data, &tempChar, size);

        CMBlockBufferRef blockBuffer = NULL;
        OSStatus status = CMBlockBufferCreateWithMemoryBlock(NULL, data,
                                                             size,
                                                             kCFAllocatorNull, NULL,
                                                             0,
                                                             size,
                                                             0, &blockBuffer);
        CMSampleBufferRef sampleBuffer = NULL;
        // now I create my samplebuffer from the block buffer
        if (status == noErr)
        {
            const size_t sampleSize = size;
            status = CMSampleBufferCreate(kCFAllocatorDefault,
                                          blockBuffer, true, NULL, NULL,
                                          formatDesc, 1, 0, NULL, 1,
                                          &sampleSize, &sampleBuffer);
        }

        AudioStreamBasicDescription audioBasic;
        audioBasic.mBitsPerChannel = 16;
        audioBasic.mBytesPerPacket = 2;
        audioBasic.mBytesPerFrame = 2;
        audioBasic.mChannelsPerFrame = 1;
        audioBasic.mFormatFlags = kAudioFormatFlagIsSignedInteger | kAudioFormatFlagIsPacked;
        audioBasic.mFormatID = kAudioFormatLinearPCM;
        audioBasic.mFramesPerPacket = 1;
        audioBasic.mSampleRate = 48000;
        audioBasic.mReserved = 0;

        if (!audioConfigured)
        {
            // initialize the circular buffer
            if (instance.decodingBuffer == NULL)
                instance.decodingBuffer = malloc(sizeof(TPCircularBuffer));
            if (!TPCircularBufferInit(instance.decodingBuffer, 1024))
                continue;

            AudioComponentDescription componentDescription;
            componentDescription.componentType = kAudioUnitType_Output;
            componentDescription.componentSubType = kAudioUnitSubType_RemoteIO;
            componentDescription.componentManufacturer = kAudioUnitManufacturer_Apple;
            componentDescription.componentFlags = 0;
            componentDescription.componentFlagsMask = 0;
            AudioComponent component = AudioComponentFindNext(NULL, &componentDescription);
            if (AudioComponentInstanceNew(component, &audioUnit) != noErr) {
                NSLog(@"Failed to initialize the AudioComponent");
                continue;
            }

            // enable IO for playback
            UInt32 flag = 1;
            if (AudioUnitSetProperty(audioUnit, kAudioOutputUnitProperty_EnableIO, kAudioUnitScope_Output, 0, &flag, sizeof(flag)) != noErr) {
                NSLog(@"Failed to enable IO for playback");
                continue;
            }

            // set the format for the output stream
            if (AudioUnitSetProperty(audioUnit, kAudioUnitProperty_StreamFormat,
                                     kAudioUnitScope_Output, 1, &audioBasic, sizeof(audioBasic)) != noErr) {
                NSLog(@"Failed to set the format for the outputstream");
                continue;
            }

            // set output callback
            AURenderCallbackStruct callbackStruct;
            callbackStruct.inputProc = playbackCallback;
            callbackStruct.inputProcRefCon = (__bridge void*) self;
            if (AudioUnitSetProperty(audioUnit, kAudioUnitProperty_SetRenderCallback, kAudioUnitScope_Global, 0, &callbackStruct, sizeof(callbackStruct)) != noErr) {
                NSLog(@"Failed to set output callback");
                continue;
            }

            // Disable buffer allocation for the recorder (optional - do this if we want to pass in our own)
            flag = 0;
            status = AudioUnitSetProperty(audioUnit, kAudioUnitProperty_ShouldAllocateBuffer, kAudioUnitScope_Output, 1, &flag, sizeof(flag));

            if (AudioUnitInitialize(audioUnit) != noErr) {
                NSLog(@"Failed to initialize audioUnits");
            }
            if (AudioOutputUnitStart(audioUnit) != noErr) {
                NSLog(@"[thread_ReceiveAudio] Failed to start audio");
            }
            audioConfigured = true;
        }

        AudioBufferList bufferList;
        if (sampleBuffer != NULL) {
            CMSampleBufferGetAudioBufferListWithRetainedBlockBuffer(sampleBuffer, NULL, &bufferList, sizeof(bufferList), NULL, NULL, kCMSampleBufferFlag_AudioBufferList_Assure16ByteAlignment, &blockBuffer);
            UInt64 size = CMSampleBufferGetTotalSampleSize(sampleBuffer);
            // Put audio into circular buffer
            TPCircularBufferProduceBytes(self.decodingBuffer, bufferList.mBuffers[0].mData, size);
            //TPCircularBufferCopyAudioBufferList(self.decodingBuffer, &bufferList, NULL, kTPCircularBufferCopyAll, NULL);
            CFRelease(sampleBuffer);
            CFRelease(blockBuffer);
        }
    }

    // stop playing audio
    if (audioConfigured) {
        if (AudioOutputUnitStop(audioUnit) != noErr) {
            NSLog(@"[thread_ReceiveAudio] Failed to stop audio");
        }
        else {
            // clean up audio
            AudioComponentInstanceDispose(audioUnit);
        }
    }
}
int16_t MuLaw_Decode(int8_t number)
{
    const uint16_t MULAW_BIAS = 33;
    uint8_t sign = 0, position = 0;
    int16_t decoded = 0;
    number = ~number;
    if (number & 0x80)
    {
        number &= ~(1 << 7);
        sign = -1;
    }
    position = ((number & 0xF0) >> 4) + 5;
    decoded = ((1 << position) | ((number & 0x0F) << (position - 4))
               | (1 << (position - 5))) - MULAW_BIAS;
    return (sign == 0) ? (decoded) : (-(decoded));
}
static OSStatus playbackCallback(void *inRefCon,
                                 AudioUnitRenderActionFlags *ioActionFlags,
                                 const AudioTimeStamp *inTimeStamp,
                                 UInt32 inBusNumber,
                                 UInt32 inNumberFrames,
                                 AudioBufferList *ioData) {
    int bytesToCopy = ioData->mBuffers[0].mDataByteSize;
    SInt16 *targetBuffer = (SInt16 *)ioData->mBuffers[0].mData;
    int32_t availableBytes;
    SInt16 *buffer = TPCircularBufferTail(instance.decodingBuffer, &availableBytes);
    int sampleCount = MIN(bytesToCopy, availableBytes);
    memcpy(targetBuffer, buffer, MIN(bytesToCopy, availableBytes));
    TPCircularBufferConsume(self.decodingBuffer, sampleCount);
    return noErr;
}
The code above doesn't produce any errors, but won't play any sound. I thought I could set the audio through the bufferList in the render callback, but it is never called.
So my question is: how do I play audio from a byte stream on iOS?
I decided to look at the project with fresh eyes. I got rid of most of the code and got it to work now. It is not pretty, but at least it runs for now. For example, I had to set my sample rate to 4000, otherwise it would play too fast, and I still have performance issues. Anyway, this is what I came up with:
#define BUFFER_SIZE 1024
#define NUM_CHANNELS 2
#define kOutputBus 0
#define kInputBus 1

-(void) main
{
    char buf[BUFFER_SIZE];
    int size;

    runloop: while (self.running) {
        getAudioData(&buf, size);
        if (!self.configured) {
            if (![self activateAudioSession])
                continue;
            self.configured = true;
        }
        TPCircularBufferProduceBytes(self.decodingBuffer, buf, size);
    }

    // stop audio units
    AudioOutputUnitStop(self.audioUnit);
    AudioComponentInstanceDispose(self.audioUnit);
    if (self.decodingBuffer != NULL) {
        TPCircularBufferCleanup(self.decodingBuffer);
    }
}
static void audioSessionInterruptionCallback(void *inUserData, UInt32 interruptionState) {
    if (interruptionState == kAudioSessionEndInterruption) {
        AudioSessionSetActive(YES);
        AudioOutputUnitStart(self.audioUnit);
    }
    if (interruptionState == kAudioSessionBeginInterruption) {
        AudioOutputUnitStop(self.audioUnit);
    }
}
static OSStatus playbackCallback(void *inRefCon,
                                 AudioUnitRenderActionFlags *ioActionFlags,
                                 const AudioTimeStamp *inTimeStamp,
                                 UInt32 inBusNumber,
                                 UInt32 inNumberFrames,
                                 AudioBufferList *ioData) {
    // Notes: ioData contains buffers (may be more than one!)
    // Fill them up as much as you can. Remember to set the size value in each buffer to match how much data is in the buffer.
    if (!self.running) {
        return -1;
    }
    int bytesToCopy = ioData->mBuffers[0].mDataByteSize;
    SInt16 *targetBuffer = (SInt16 *)ioData->mBuffers[0].mData;

    // Pull audio from playthrough buffer
    int32_t availableBytes;
    if (self.decodingBuffer == NULL || self.decodingBuffer->length < 1) {
        NSLog(@"buffer is empty");
        return 0;
    }
    SInt16 *buffer = TPCircularBufferTail(self.decodingBuffer, &availableBytes);
    int sampleCount = MIN(bytesToCopy, availableBytes);
    memcpy(targetBuffer, buffer, sampleCount);
    TPCircularBufferConsume(self.decodingBuffer, sampleCount);
    return noErr;
}
- (BOOL)activateAudioSession {
    if (!self.activated_) {
        OSStatus result;
        result = AudioSessionInitialize(NULL,
                                        NULL,
                                        audioSessionInterruptionCallback,
                                        (__bridge void *)(self));
        if (kAudioSessionAlreadyInitialized != result)
            [self checkError:result message:@"Couldn't initialize audio session"];
        [self setupAudio];
        self.activated_ = YES;
    }
    return self.activated_;
}
- (void)setupAudio
{
    OSStatus status;

    // Describe audio component
    AudioComponentDescription desc;
    desc.componentType = kAudioUnitType_Output;
    desc.componentSubType = kAudioUnitSubType_RemoteIO;
    desc.componentFlags = 0;
    desc.componentFlagsMask = 0;
    desc.componentManufacturer = kAudioUnitManufacturer_Apple;

    // Get component
    AudioComponent inputComponent = AudioComponentFindNext(NULL, &desc);

    // Get audio units
    AudioComponentInstanceNew(inputComponent, &_audioUnit);

    // // Enable IO for recording
    // UInt32 flag = 1;
    // status = AudioUnitSetProperty(audioUnit,
    //                               kAudioOutputUnitProperty_EnableIO,
    //                               kAudioUnitScope_Input,
    //                               kInputBus,
    //                               &flag,
    //                               sizeof(flag));

    // Enable IO for playback
    UInt32 flag = 1;
    AudioUnitSetProperty(_audioUnit,
                         kAudioOutputUnitProperty_EnableIO,
                         kAudioUnitScope_Output,
                         kOutputBus,
                         &flag,
                         sizeof(flag));

    // Describe format
    AudioStreamBasicDescription format;
    format.mSampleRate = 4000;
    format.mFormatID = kAudioFormatULaw; //kAudioFormatULaw
    format.mFormatFlags = kAudioFormatFlagIsSignedInteger | kAudioFormatFlagIsPacked;
    format.mBitsPerChannel = 8 * sizeof(char);
    format.mChannelsPerFrame = NUM_CHANNELS;
    format.mBytesPerFrame = sizeof(char) * NUM_CHANNELS;
    format.mFramesPerPacket = 1;
    format.mBytesPerPacket = format.mBytesPerFrame * format.mFramesPerPacket;
    format.mReserved = 0;
    self.audioFormat = format;

    // Apply format
    AudioUnitSetProperty(_audioUnit,
                         kAudioUnitProperty_StreamFormat,
                         kAudioUnitScope_Output,
                         kInputBus,
                         &_audioFormat,
                         sizeof(_audioFormat));
    AudioUnitSetProperty(_audioUnit,
                         kAudioUnitProperty_StreamFormat,
                         kAudioUnitScope_Input,
                         kOutputBus,
                         &_audioFormat,
                         sizeof(_audioFormat));

    // // Set input callback
    // AURenderCallbackStruct callbackStruct;
    // callbackStruct.inputProc = recordingCallback;
    // callbackStruct.inputProcRefCon = self;
    // status = AudioUnitSetProperty(audioUnit,
    //                               kAudioOutputUnitProperty_SetInputCallback,
    //                               kAudioUnitScope_Global,
    //                               kInputBus,
    //                               &callbackStruct,
    //                               sizeof(callbackStruct));
    // checkStatus(status);

    // Set output callback
    AURenderCallbackStruct callbackStruct;
    callbackStruct.inputProc = playbackCallback;
    callbackStruct.inputProcRefCon = (__bridge void * _Nullable)(self);
    AudioUnitSetProperty(_audioUnit,
                         kAudioUnitProperty_SetRenderCallback,
                         kAudioUnitScope_Global,
                         kOutputBus,
                         &callbackStruct,
                         sizeof(callbackStruct));

    // Disable buffer allocation for the recorder (optional - do this if we want to pass in our own)
    flag = 0;
    status = AudioUnitSetProperty(_audioUnit,
                                  kAudioUnitProperty_ShouldAllocateBuffer,
                                  kAudioUnitScope_Output,
                                  kInputBus,
                                  &flag,
                                  sizeof(flag));

    // initialize the circular buffer
    if (self.decodingBuffer == NULL)
        self.decodingBuffer = malloc(sizeof(TPCircularBuffer));
    if (!TPCircularBufferInit(self.decodingBuffer, 512 * 1024))
        return;

    // Initialise
    status = AudioUnitInitialize(self.audioUnit);
    AudioOutputUnitStart(self.audioUnit);
}
I found most of this by looking through GitHub and A Tasty Pixel.
If the AVAudioSession is configured to use short buffers, you can use the RemoteIO Audio Unit to play received audio with low additional latency.
Check errors during audio configuration. Some iOS devices only support a 48 kHz sample rate, so you may need to resample your audio PCM data from 8 kHz to another rate.
RemoteIO only supports linear PCM, so you will need to first convert all your incoming 8-bit u-law PCM samples to 16-bit linear PCM format before storing them in a lock-free circular buffer.
You need to call AudioOutputUnitStart to start audio callbacks being called by the OS. Your code should not be calling these callbacks. They will be called by the OS.
AudioUnitRender is used for recording callbacks, not for playing audio. So you don't need to use it. Just fill the AudioBufferList buffers with the requested number of frames in the play callback.
Then you can use the play audio callback to check your circular buffer and pull the requested number of samples, if enough are available. You should not do any memory management (such as a free() call) inside this callback.
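The copy-and-pad behaviour described above can be sketched in plain C. This is an illustrative helper, not a real Core Audio or TPCircularBuffer API; the names are mine:

```c
#include <string.h>
#include <stdint.h>

/* Fill the render callback's output buffer: copy what the ring buffer
   has, and zero-fill the remainder so an underrun plays silence
   instead of stale or uninitialized data. Returns the number of bytes
   actually consumed (to pass to the ring buffer's consume call). */
size_t fill_output(int16_t *dst, size_t bytes_wanted,
                   const int16_t *src, size_t bytes_available)
{
    size_t n = bytes_available < bytes_wanted ? bytes_available : bytes_wanted;
    memcpy(dst, src, n);
    memset((char *)dst + n, 0, bytes_wanted - n); /* silence on underrun */
    return n;
}
```

In the playbackCallback above, `src`/`bytes_available` would come from TPCircularBufferTail and the return value would go to TPCircularBufferConsume; the original code's memcpy without the memset leaves garbage in the tail of the buffer when the ring runs dry.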

Real-time converting the PCM buffer to AAC data for iOS using Remote IO and Audio Convert Service

I'm using Remote IO to get PCM audio buffers, and I want to send the data in real time to a Darwin server over the cellular (3G) network. I chose the AAC format based on a Fraunhofer article called "AAC-ELD based Audio Communication on iOS: A Developer's Guide". Its sample code works great: audio is recorded in LPCM, encoded to AAC-ELD, decoded back to LPCM, and played back immediately. But that is the AAC-ELD (Enhanced Low Delay) format. When I change the format from kAudioFormatMPEG4AAC_ELD to kAudioFormatMPEG4AAC, I can hear the audio for one second, then it is stuck for the next second, and the pattern continues. The audio also plays at twice the real-world rate: a sound that lasts one second in the real world lasts only half a second on playback. When I then change the sample frame size from 512 to 1024, the rate is normal, but I hear audio for two seconds, it is stuck for the next two seconds, and so on. I figured out that the AudioConverterFillComplexBuffer function fails for two seconds and then works well for the next two seconds. I don't know why. Please help. Thanks in advance.
I really didn't change much of the code; I just changed the format ID and the sample frame size from 512 to 1024.
The article is here: http://www.full-hd-voice.com/content/dam/fullhdvoice/documents/iOS-ACE-AP-v2.pdf
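One thing worth checking when switching formats: AAC-LC consumes 1024 PCM frames per encoder packet, while AAC-ELD consumes 512, so the preferred hardware I/O buffer duration and the converter's input frame size have to agree. The arithmetic (a sketch, at the article's 44.1 kHz rate):

```c
/* Seconds of audio represented by one encoder input frame. If the
   preferred hardware buffer duration is set for one frame size but the
   converter is fed another, the converter periodically starves for
   input or is handed more than it asked for. */
double frame_duration_seconds(unsigned frames_per_packet, double sample_rate)
{
    return frames_per_packet / sample_rate;
}
```

At 44.1 kHz an AAC-LC frame spans about 23.2 ms versus about 11.6 ms for AAC-ELD, which is consistent with the playback running at twice the real rate when only the format ID is changed.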
1. Global variables
static AudioBuffer g_inputBuffer;
static AudioBuffer g_outputBuffer;
static AudioComponentInstance g_audioUnit;
static AudioUnitElement g_outputBus = 0;
static AudioUnitElement g_inputBus = 1;
static UInt32 g_outChannels = 2;
static UInt32 g_inChannels = 1;
static UInt32 g_frameSize = 1024;
static UInt32 g_inputByteSize = 0;
static UInt32 g_outputByteSize = 0;
static unsigned int g_initialized = 0;
static AACELDEncoder *g_encoder = NULL;
static AACELDDecoder *g_decoder = NULL;
static MagicCookie g_cookie;

/* Structure to keep the encoder configuration */
typedef struct EncoderProperties_
{
    Float64 samplingRate;
    UInt32 inChannels;
    UInt32 outChannels;
    UInt32 frameSize;
    UInt32 bitrate;
} EncoderProperties;

/* Structure to keep the magic cookie */
typedef struct MagicCookie_
{
    void *data;
    int byteSize;
} MagicCookie;

/* Structure to keep one encoded AU */
typedef struct EncodedAudioBuffer_
{
    UInt32 mChannels;
    UInt32 mDataBytesSize;
    void *data;
} EncodedAudioBuffer;

typedef struct DecoderProperties_
{
    Float64 samplingRate;
    UInt32 inChannels;
    UInt32 outChannels;
    UInt32 frameSize;
} DecoderProperties;
2. Initialize the audio session, audio unit, and encoder/decoder
void InitAudioUnit()
{
/* Calculate the required input and output buffer sizes */
g_inputByteSize = g_frameSize * g_inChannels * sizeof(AudioSampleType);
g_outputByteSize = g_frameSize * g_outChannels * sizeof(AudioSampleType);
/* Initialize the I/O buffers */
g_inputBuffer.mNumberChannels = g_inChannels;
g_inputBuffer.mDataByteSize = g_inputByteSize;
if (g_initialized)
free(g_inputBuffer.mData);
g_inputBuffer.mData = malloc(sizeof(unsigned char)*g_inputByteSize);
memset(g_inputBuffer.mData, 0, g_inputByteSize);
g_outputBuffer.mNumberChannels = g_outChannels;
g_outputBuffer.mDataByteSize = g_outputByteSize;
if (g_initialized)
free(g_outputBuffer.mData);
g_outputBuffer.mData = malloc(sizeof(unsigned char)*g_outputByteSize);
memset(g_outputBuffer.mData, 0, g_outputByteSize);
g_initialized = 1;
/* Initialize the audio session */
AudioSessionInitialize(NULL, NULL, interruptionListener, NULL);
/* Activate the audio session */
AudioSessionSetActive(TRUE);
/* Enable recording for full-duplex I/O */
UInt32 audioCategory = kAudioSessionCategory_PlayAndRecord;
AudioSessionSetProperty(kAudioSessionProperty_AudioCategory,
sizeof(audioCategory),
&audioCategory);
/* Set the route change listener */
AudioSessionAddPropertyListener(kAudioSessionProperty_AudioRouteChange,
routeChangeListener,
NULL);
/* Set the preferred buffer time */
Float32 preferredBufferTime = 1024.0 / 44100.0;
AudioSessionSetProperty(kAudioSessionProperty_PreferredHardwareIOBufferDuration,
sizeof(preferredBufferTime),
&preferredBufferTime);
/* Setup the audio component for I/O */
AudioComponentDescription componentDesc;
memset(&componentDesc, 0, sizeof(componentDesc));
componentDesc.componentType = kAudioUnitType_Output;
componentDesc.componentSubType = kAudioUnitSubType_RemoteIO;
componentDesc.componentManufacturer = kAudioUnitManufacturer_Apple;
/* Find and create the audio component */
AudioComponent auComponent = AudioComponentFindNext(NULL, &componentDesc);
AudioComponentInstanceNew(auComponent, &g_audioUnit);
/* Enable the audio input */
UInt32 enableAudioInput = 1;
AudioUnitSetProperty(g_audioUnit,
kAudioOutputUnitProperty_EnableIO,
kAudioUnitScope_Input,
g_inputBus,
&enableAudioInput,
sizeof(enableAudioInput));
/* Setup the render callback */
AURenderCallbackStruct renderCallbackInfo;
renderCallbackInfo.inputProc = audioUnitRenderCallback;
renderCallbackInfo.inputProcRefCon = NULL;
AudioUnitSetProperty(g_audioUnit,
kAudioUnitProperty_SetRenderCallback,
kAudioUnitScope_Input,
g_outputBus,
&renderCallbackInfo,
sizeof(renderCallbackInfo));
/* Set the input and output audio stream formats */
AudioStreamBasicDescription audioFormat;
audioFormat.mSampleRate = 44100;
audioFormat.mFormatID = kAudioFormatLinearPCM;
audioFormat.mFormatFlags = kAudioFormatFlagIsSignedInteger | kAudioFormatFlagIsPacked;
audioFormat.mFramesPerPacket = 1;
audioFormat.mBitsPerChannel = 8 * sizeof(AudioSampleType);
audioFormat.mChannelsPerFrame = g_inChannels;
audioFormat.mBytesPerFrame = audioFormat.mChannelsPerFrame * sizeof(AudioSampleType);
audioFormat.mBytesPerPacket = audioFormat.mBytesPerFrame;
AudioUnitSetProperty(g_audioUnit,
kAudioUnitProperty_StreamFormat,
kAudioUnitScope_Output,
g_inputBus,
&audioFormat,
sizeof(audioFormat));
audioFormat.mChannelsPerFrame = g_outChannels;
audioFormat.mBytesPerFrame = audioFormat.mChannelsPerFrame * sizeof(AudioSampleType);
audioFormat.mBytesPerPacket = audioFormat.mBytesPerFrame;
AudioUnitSetProperty(g_audioUnit,
kAudioUnitProperty_StreamFormat,
kAudioUnitScope_Input,
g_outputBus,
&audioFormat,
sizeof(audioFormat));
/* Initialize the ELD codec */
InitAACELD();
}
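The preferred hardware I/O buffer duration set above (1024.0 / 44100.0, about 23 ms) is what makes the hardware deliver roughly 1024 frames per render callback, matching the encoder's 1024-sample frame size. A minimal sketch of that arithmetic (the helper names are mine, not part of the code above):

```c
#include <assert.h>

/* Duration of a hardware I/O buffer holding `frames` samples at `sampleRate`,
   e.g. 1024 / 44100 ~= 0.0232 s, as requested above. */
static double io_buffer_duration(unsigned frames, double sampleRate)
{
    return (double)frames / sampleRate;
}

/* Inverse: how many frames the hardware will deliver per callback
   for a given preferred duration (rounded to the nearest frame). */
static unsigned io_buffer_frames(double duration, double sampleRate)
{
    return (unsigned)(duration * sampleRate + 0.5);
}
```

Note that the session treats the duration as a preference, not a guarantee; the actual slice size should always be read back at runtime.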
void InitAACELD()
{
EncoderProperties p;
p.samplingRate = 44100.0;
p.inChannels = 1;
p.outChannels = 1;
p.frameSize = 1024;
p.bitrate = 32000;
g_encoder = CreateAACELDEncoder();
InitAACELDEncoder(g_encoder, p, &g_cookie);
DecoderProperties dp;
dp.samplingRate = 44100.0;
dp.inChannels = 1;
dp.outChannels = 2;
dp.frameSize = p.frameSize;
g_decoder = CreateAACELDDecoder();
InitAACELDDecoder(g_decoder, dp, &g_cookie);
}
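A quick sanity check on the encoder properties chosen above: at 32 kbps, one 1024-frame AAC-ELD packet spans 1024/44100 s of audio, so its average payload is only about 93 bytes. This is why the code queries maxOutputPacketSize from the converter rather than assuming a size. A rough back-of-the-envelope sketch (the helper name is mine):

```c
#include <assert.h>

/* Average compressed payload per AAC packet for a constant bitrate:
   bitrate (bits/s) * packet duration (s) / 8 (bits/byte).
   Real packets vary in size; this is only the long-run average. */
static double aac_packet_bytes(double bitrate, unsigned frameSize, double sampleRate)
{
    return bitrate * ((double)frameSize / sampleRate) / 8.0;
}
```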
int InitAACELDEncoder(AACELDEncoder *encoder, EncoderProperties props, MagicCookie *outCookie)
{
/* Copy the provided encoder properties */
encoder->inChannels = props.inChannels;
encoder->outChannels = props.outChannels;
encoder->samplingRate = props.samplingRate;
encoder->frameSize = props.frameSize;
encoder->bitrate = props.bitrate;
/* Convenience macro to fill out the ASBD structure.
Available only when __cplusplus is defined! */
FillOutASBDForLPCM(encoder->sourceFormat,
encoder->samplingRate,
encoder->inChannels,
8*sizeof(AudioSampleType),
8*sizeof(AudioSampleType),
false,
false);
/* Set the format parameters for AAC-ELD encoding. */
encoder->destinationFormat.mFormatID = kAudioFormatMPEG4AAC;
encoder->destinationFormat.mChannelsPerFrame = encoder->outChannels;
encoder->destinationFormat.mSampleRate = encoder->samplingRate;
/* Get the size of the formatinfo structure */
UInt32 dataSize = sizeof(encoder->destinationFormat);
/* Request the property from Core Audio */
AudioFormatGetProperty(kAudioFormatProperty_FormatInfo,
0,
NULL,
&dataSize,
&(encoder->destinationFormat));
/* Create a new audio converter */
AudioConverterNew(&(encoder->sourceFormat),
&(encoder->destinationFormat),
&(encoder->audioConverter));
if (!encoder->audioConverter)
{
return -1;
}
/* Try to set the desired output bitrate */
UInt32 outputBitrate = encoder->bitrate;
dataSize = sizeof(outputBitrate);
AudioConverterSetProperty(encoder->audioConverter,
kAudioConverterEncodeBitRate,
dataSize,
&outputBitrate);
/* Query the maximum possible output packet size */
if (encoder->destinationFormat.mBytesPerPacket == 0)
{
UInt32 maxOutputSizePerPacket = 0;
dataSize = sizeof(maxOutputSizePerPacket);
AudioConverterGetProperty(encoder->audioConverter,
kAudioConverterPropertyMaximumOutputPacketSize,
&dataSize,
&maxOutputSizePerPacket);
encoder->maxOutputPacketSize = maxOutputSizePerPacket;
}
else
{
encoder->maxOutputPacketSize = encoder->destinationFormat.mBytesPerPacket;
}
/* Fetch the Magic Cookie from the ELD implementation */
UInt32 cookieSize = 0;
AudioConverterGetPropertyInfo(encoder->audioConverter,
kAudioConverterCompressionMagicCookie,
&cookieSize,
NULL);
char* cookie = (char*)malloc(cookieSize*sizeof(char));
AudioConverterGetProperty(encoder->audioConverter,
kAudioConverterCompressionMagicCookie,
&cookieSize,
cookie);
outCookie->data = cookie;
outCookie->byteSize = cookieSize;
/* Prepare the temporary AU buffer for encoding */
encoder->encoderBuffer = malloc(encoder->maxOutputPacketSize);
return 0;
}
int InitAACELDDecoder(AACELDDecoder* decoder, DecoderProperties props, const MagicCookie *cookie)
{
/* Copy the provided decoder properties */
decoder->inChannels = props.inChannels;
decoder->outChannels = props.outChannels;
decoder->samplingRate = props.samplingRate;
decoder->frameSize = props.frameSize;
/* We will decode to LPCM */
FillOutASBDForLPCM(decoder->destinationFormat,
decoder->samplingRate,
decoder->outChannels,
8*sizeof(AudioSampleType),
8*sizeof(AudioSampleType),
false,
false);
/* The source format is AAC-ELD, with the same sampling rate but possibly a different channel configuration */
decoder->sourceFormat.mFormatID = kAudioFormatMPEG4AAC;
decoder->sourceFormat.mChannelsPerFrame = decoder->inChannels;
decoder->sourceFormat.mSampleRate = decoder->samplingRate;
/* Get the rest of the format info */
UInt32 dataSize = sizeof(decoder->sourceFormat);
AudioFormatGetProperty(kAudioFormatProperty_FormatInfo,
0,
NULL,
&dataSize,
&(decoder->sourceFormat));
/* Create a new AudioConverter instance for the conversion AAC-ELD -> LPCM */
AudioConverterNew(&(decoder->sourceFormat),
&(decoder->destinationFormat),
&(decoder->audioConverter));
if (!decoder->audioConverter)
{
return -1;
}
/* Check for variable output packet size */
if (decoder->destinationFormat.mBytesPerPacket == 0)
{
UInt32 maxOutputSizePerPacket = 0;
dataSize = sizeof(maxOutputSizePerPacket);
AudioConverterGetProperty(decoder->audioConverter,
kAudioConverterPropertyMaximumOutputPacketSize,
&dataSize,
&maxOutputSizePerPacket);
decoder->maxOutputPacketSize = maxOutputSizePerPacket;
}
else
{
decoder->maxOutputPacketSize = decoder->destinationFormat.mBytesPerPacket;
}
/* Set the corresponding encoder cookie */
AudioConverterSetProperty(decoder->audioConverter,
kAudioConverterDecompressionMagicCookie,
cookie->byteSize,
cookie->data);
return 0;
}
3. Render callback, encoder, and decoder
static OSStatus audioUnitRenderCallback(void *inRefCon,
AudioUnitRenderActionFlags *ioActionFlags,
const AudioTimeStamp *inTimeStamp,
UInt32 inBusNumber,
UInt32 inNumberOfFrames,
AudioBufferList *ioData)
{
/* Get the input samples */
AudioUnitRender(g_audioUnit,
ioActionFlags,
inTimeStamp,
g_inputBus,
inNumberOfFrames,
ioData);
/* Copy to global input buffer */
memcpy(g_inputBuffer.mData, ioData->mBuffers[0].mData, g_inputBuffer.mDataByteSize);
/* Encode with AudioConverter */
EncodedAudioBuffer encodedAU;
EncodeAACELD(g_encoder, &g_inputBuffer, &encodedAU);
/* Decode with AudioConverter */
g_outputBuffer.mDataByteSize = g_outputByteSize;
DecodeAACELD(g_decoder, &encodedAU, &g_outputBuffer);
/* Copy output samples to Audio Units' IO buffer */
ioData->mBuffers[0].mNumberChannels = g_outputBuffer.mNumberChannels;
ioData->mBuffers[0].mDataByteSize = g_outputBuffer.mDataByteSize;
memcpy(ioData->mBuffers[0].mData, g_outputBuffer.mData, g_outputBuffer.mDataByteSize);
return noErr;
}
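One thing worth double-checking in the callback above: the memcpy copies a fixed g_inputBuffer.mDataByteSize regardless of inNumberOfFrames, so if the hardware ever delivers a different slice size the copy will read or write the wrong amount. The byte count can instead be derived from the actual frame count; a minimal sketch (helper name is mine):

```c
#include <assert.h>
#include <stdint.h>

/* Bytes occupied by `frames` frames of interleaved 16-bit PCM
   with `channels` channels per frame. */
static uint32_t pcm_bytes(uint32_t frames, uint32_t channels)
{
    return frames * channels * (uint32_t)sizeof(int16_t);
}
```

In the callback this would mean copying pcm_bytes(inNumberOfFrames, channels) bytes rather than a fixed size.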
static OSStatus encodeProc(AudioConverterRef inAudioConverter,
UInt32 *ioNumberDataPackets,
AudioBufferList *ioData,
AudioStreamPacketDescription **outDataPacketDescription,
void *inUserData)
{
/* Get the current encoder state from the inUserData parameter */
AACELDEncoder *encoder = (AACELDEncoder*) inUserData;
/* Compute the maximum number of output packets */
UInt32 maxPackets = encoder->bytesToEncode / encoder->sourceFormat.mBytesPerPacket;
if (*ioNumberDataPackets > maxPackets)
{
/* If requested number of packets is bigger, adjust */
*ioNumberDataPackets = maxPackets;
}
/* Check to make sure we have only one audio buffer */
if (ioData->mNumberBuffers != 1)
{
return 1;
}
/* Set the data to be encoded */
ioData->mBuffers[0].mDataByteSize = encoder->currentSampleBuffer->mDataByteSize;
ioData->mBuffers[0].mData = encoder->currentSampleBuffer->mData;
ioData->mBuffers[0].mNumberChannels = encoder->currentSampleBuffer->mNumberChannels;
if (outDataPacketDescription)
{
*outDataPacketDescription = NULL;
}
if (encoder->bytesToEncode == 0)
{
// We are currently out of data but want to keep on processing
// See Apple Technical Q&A QA1317
return 1;
}
encoder->bytesToEncode = 0;
return noErr;
}
int EncodeAACELD(AACELDEncoder *encoder, AudioBuffer *inSamples, EncodedAudioBuffer *outData)
{
/* Clear the encoder buffer */
memset(encoder->encoderBuffer, 0, encoder->maxOutputPacketSize);
/* Keep a reference to the samples that should be encoded */
encoder->currentSampleBuffer = inSamples;
encoder->bytesToEncode = inSamples->mDataByteSize;
UInt32 numOutputDataPackets = 1;
AudioStreamPacketDescription outPacketDesc[1];
/* Create the output buffer list */
AudioBufferList outBufferList;
outBufferList.mNumberBuffers = 1;
outBufferList.mBuffers[0].mNumberChannels = encoder->outChannels;
outBufferList.mBuffers[0].mDataByteSize = encoder->maxOutputPacketSize;
outBufferList.mBuffers[0].mData = encoder->encoderBuffer;
/* Start the encoding process */
OSStatus status = AudioConverterFillComplexBuffer(encoder->audioConverter,
encodeProc,
encoder,
&numOutputDataPackets,
&outBufferList,
outPacketDesc);
if (status != noErr)
{
return -1;
}
/* Set the output data */
outData->mChannels = encoder->outChannels;
outData->data = encoder->encoderBuffer;
outData->mDataBytesSize = outPacketDesc[0].mDataByteSize;
return 0;
}
static OSStatus decodeProc(AudioConverterRef inAudioConverter,
UInt32 *ioNumberDataPackets,
AudioBufferList *ioData,
AudioStreamPacketDescription **outDataPacketDescription,
void *inUserData)
{
/* Get the current decoder state from the inUserData parameter */
AACELDDecoder *decoder = (AACELDDecoder*)inUserData;
/* Compute the maximum number of output packets */
UInt32 maxPackets = decoder->bytesToDecode / decoder->maxOutputPacketSize;
if (*ioNumberDataPackets > maxPackets)
{
/* If requested number of packets is bigger, adjust */
*ioNumberDataPackets = maxPackets;
}
/* If there is data to be decoded, set it accordingly */
if (decoder->bytesToDecode)
{
ioData->mBuffers[0].mData = decoder->decodeBuffer;
ioData->mBuffers[0].mDataByteSize = decoder->bytesToDecode;
ioData->mBuffers[0].mNumberChannels = decoder->inChannels;
}
/* And set the packet description */
if (outDataPacketDescription)
{
decoder->packetDesc[0].mStartOffset = 0;
decoder->packetDesc[0].mVariableFramesInPacket = 0;
decoder->packetDesc[0].mDataByteSize = decoder->bytesToDecode;
(*outDataPacketDescription) = decoder->packetDesc;
}
if (decoder->bytesToDecode == 0)
{
// We are currently out of data but want to keep on processing
// See Apple Technical Q&A QA1317
return 1;
}
decoder->bytesToDecode = 0;
return noErr;
}
int DecodeAACELD(AACELDDecoder* decoder, EncodedAudioBuffer *inData, AudioBuffer *outSamples)
{
OSStatus status = noErr;
/* Keep a reference to the samples that should be decoded */
decoder->decodeBuffer = inData->data;
decoder->bytesToDecode = inData->mDataBytesSize;
UInt32 outBufferMaxSizeBytes = decoder->frameSize * decoder->outChannels * sizeof(AudioSampleType);
assert(outSamples->mDataByteSize <= outBufferMaxSizeBytes);
UInt32 numOutputDataPackets = outBufferMaxSizeBytes / decoder->maxOutputPacketSize;
/* Output packets are single LPCM frames */
AudioStreamPacketDescription outputPacketDesc[1024];
/* Create the output buffer list */
AudioBufferList outBufferList;
outBufferList.mNumberBuffers = 1;
outBufferList.mBuffers[0].mNumberChannels = decoder->outChannels;
outBufferList.mBuffers[0].mDataByteSize = outSamples->mDataByteSize;
outBufferList.mBuffers[0].mData = outSamples->mData;
/* Start the decoding process */
status = AudioConverterFillComplexBuffer(decoder->audioConverter,
decodeProc,
decoder,
&numOutputDataPackets,
&outBufferList,
outputPacketDesc);
if (noErr != status)
{
return -1;
}
return 0;
}
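The sizing in DecodeAACELD is worth spelling out: for 16-bit stereo output, outBufferMaxSizeBytes is 1024 × 2 × 2 = 4096 bytes, and since each LPCM packet is one 4-byte frame, numOutputDataPackets comes out to 1024, which is why outputPacketDesc is dimensioned to 1024 entries. A sketch of that arithmetic (helper name is mine):

```c
#include <assert.h>

/* Maximum decoded output size: frames per AAC packet * channels * 2 bytes
   per 16-bit sample, matching outBufferMaxSizeBytes above. */
static unsigned decode_out_bytes(unsigned frameSize, unsigned outChannels)
{
    return frameSize * outChannels * 2;
}
```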

Realtime audio processing without output

I'm looking at this example http://teragonaudio.com/article/How-to-do-realtime-recording-with-effect-processing-on-iOS.html
and I want to turn off the output. I tried changing kAudioSessionCategory_PlayAndRecord to kAudioSessionCategory_RecordAudio, but that did not work. I also tried removing:
if(AudioUnitSetProperty(*audioUnit, kAudioUnitProperty_StreamFormat,
kAudioUnitScope_Output, 1, &streamDescription, sizeof(streamDescription)) != noErr) {
return 1;
}
because I want to get sound from the microphone without playing it back. But no matter what I do, when the sound reaches the renderCallback method there is a -50 error. When the audio is automatically played on the output, everything works fine...
Update with code:
using namespace std;
AudioUnit *audioUnit = NULL;
float *convertedSampleBuffer = NULL;
int initAudioSession() {
audioUnit = (AudioUnit*)malloc(sizeof(AudioUnit));
if(AudioSessionInitialize(NULL, NULL, NULL, NULL) != noErr) {
return 1;
}
if(AudioSessionSetActive(true) != noErr) {
return 1;
}
UInt32 sessionCategory = kAudioSessionCategory_PlayAndRecord;
if(AudioSessionSetProperty(kAudioSessionProperty_AudioCategory,
sizeof(UInt32), &sessionCategory) != noErr) {
return 1;
}
Float32 bufferSizeInSec = 0.02f;
if(AudioSessionSetProperty(kAudioSessionProperty_PreferredHardwareIOBufferDuration,
sizeof(Float32), &bufferSizeInSec) != noErr) {
return 1;
}
UInt32 overrideCategory = 1;
if(AudioSessionSetProperty(kAudioSessionProperty_OverrideCategoryDefaultToSpeaker,
sizeof(UInt32), &overrideCategory) != noErr) {
return 1;
}
// There are many properties you might want to provide callback functions for:
// kAudioSessionProperty_AudioRouteChange
// kAudioSessionProperty_OverrideCategoryEnableBluetoothInput
// etc.
return 0;
}
OSStatus renderCallback(void *userData, AudioUnitRenderActionFlags *actionFlags,
const AudioTimeStamp *audioTimeStamp, UInt32 busNumber,
UInt32 numFrames, AudioBufferList *buffers) {
OSStatus status = AudioUnitRender(*audioUnit, actionFlags, audioTimeStamp,
1, numFrames, buffers);
int doOutput = 0;
if(status != noErr) {
return status;
}
if(convertedSampleBuffer == NULL) {
// Lazy initialization of this buffer is necessary because we don't
// know the frame count until the first callback
convertedSampleBuffer = (float*)malloc(sizeof(float) * numFrames);
baseTime = (float)QRealTimer::getUptimeInMilliseconds();
}
SInt16 *inputFrames = (SInt16*)(buffers->mBuffers->mData);
// If your DSP code can use integers, then don't bother converting to
// floats here, as it just wastes CPU. However, most DSP algorithms rely
// on floating point, and this is especially true if you are porting a
// VST/AU to iOS.
int i;
for( i = numFrames; i < fftlength; i++ ) // Shifting buffer
x_inbuf[i - numFrames] = x_inbuf[i];
for( i = 0; i < numFrames; i++) {
x_inbuf[i + x_phase] = (float)inputFrames[i] / (float)32768;
}
if( x_phase + numFrames == fftlength )
{
x_alignment.SigProc_frontend(x_inbuf); // Signal processing front-end (FFT!)
doOutput = x_alignment.Align();
/// Output as text! In the real-time version,
// this is where we update visualisation callbacks and launch other services
if (doOutput && x_netscore.isEvent(x_alignment.Position())
&& x_alignment.lastAction() < x_alignment.Position())
{
// here i want to do something with my input!
}
}
else
x_phase += numFrames;
return noErr;
}
int initAudioStreams(AudioUnit *audioUnit) {
UInt32 audioCategory = kAudioSessionCategory_PlayAndRecord;
if(AudioSessionSetProperty(kAudioSessionProperty_AudioCategory,
sizeof(UInt32), &audioCategory) != noErr) {
return 1;
}
UInt32 overrideCategory = 1;
if(AudioSessionSetProperty(kAudioSessionProperty_OverrideCategoryDefaultToSpeaker,
sizeof(UInt32), &overrideCategory) != noErr) {
// Less serious error, but you may want to handle it and bail here
}
AudioComponentDescription componentDescription;
componentDescription.componentType = kAudioUnitType_Output;
componentDescription.componentSubType = kAudioUnitSubType_RemoteIO;
componentDescription.componentManufacturer = kAudioUnitManufacturer_Apple;
componentDescription.componentFlags = 0;
componentDescription.componentFlagsMask = 0;
AudioComponent component = AudioComponentFindNext(NULL, &componentDescription);
if(AudioComponentInstanceNew(component, audioUnit) != noErr) {
return 1;
}
UInt32 enable = 1;
if(AudioUnitSetProperty(*audioUnit, kAudioOutputUnitProperty_EnableIO,
kAudioUnitScope_Input, 1, &enable, sizeof(UInt32)) != noErr) {
return 1;
}
AURenderCallbackStruct callbackStruct;
callbackStruct.inputProc = renderCallback; // Render function
callbackStruct.inputProcRefCon = NULL;
if(AudioUnitSetProperty(*audioUnit, kAudioUnitProperty_SetRenderCallback,
kAudioUnitScope_Input, 0, &callbackStruct,
sizeof(AURenderCallbackStruct)) != noErr) {
return 1;
}
AudioStreamBasicDescription streamDescription;
// You might want to replace this with a different value, but keep in mind that the
// iPhone does not support all sample rates. 8kHz, 22kHz, and 44.1kHz should all work.
streamDescription.mSampleRate = 44100;
// Yes, I know you probably want floating point samples, but the iPhone isn't going
// to give you floating point data. You'll need to make the conversion by hand from
// linear PCM <-> float.
streamDescription.mFormatID = kAudioFormatLinearPCM;
// This part is important!
streamDescription.mFormatFlags = kAudioFormatFlagIsSignedInteger |
kAudioFormatFlagsNativeEndian |
kAudioFormatFlagIsPacked;
streamDescription.mBitsPerChannel = 16;
// One 16-bit sample per frame, so this will always be 2 as long as
// 16-bit mono samples are being used
streamDescription.mBytesPerFrame = 2;
streamDescription.mChannelsPerFrame = 1;
streamDescription.mBytesPerPacket = streamDescription.mBytesPerFrame *
streamDescription.mFramesPerPacket;
// Always should be set to 1
streamDescription.mFramesPerPacket = 1;
// Always set to 0, just to be sure
streamDescription.mReserved = 0;
// Set up input stream with above properties
if(AudioUnitSetProperty(*audioUnit, kAudioUnitProperty_StreamFormat,
kAudioUnitScope_Input, 0, &streamDescription, sizeof(streamDescription)) != noErr) {
return 1;
}
// Ditto for the output stream, which we will be sending the processed audio to
if(AudioUnitSetProperty(*audioUnit, kAudioUnitProperty_StreamFormat,
kAudioUnitScope_Output, 1, &streamDescription, sizeof(streamDescription)) != noErr) {
return 1;
}
return 0;
}
int startAudioUnit(AudioUnit *audioUnit) {
if(AudioUnitInitialize(*audioUnit) != noErr) {
return 1;
}
if(AudioOutputUnitStart(*audioUnit) != noErr) {
return 1;
}
return 0;
}
And calling from my VC:
initAudioSession();
initAudioStreams( audioUnit);
startAudioUnit( audioUnit);
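A -50 (kAudio_ParamError) from AudioUnitRender very often means the AudioStreamBasicDescription fields are internally inconsistent. Two invariants must hold for packed LPCM: mBytesPerFrame == mChannelsPerFrame × (mBitsPerChannel / 8), and mBytesPerPacket == mBytesPerFrame × mFramesPerPacket. A minimal consistency check, detached from Core Audio so it stands alone (the helper is mine, not part of the question's code):

```c
#include <assert.h>
#include <stdint.h>

/* Returns nonzero when packed-LPCM ASBD fields are mutually consistent. */
static int asbd_consistent(uint32_t bitsPerChannel,
                           uint32_t channelsPerFrame,
                           uint32_t bytesPerFrame,
                           uint32_t framesPerPacket,
                           uint32_t bytesPerPacket)
{
    return bytesPerFrame == channelsPerFrame * (bitsPerChannel / 8)
        && bytesPerPacket == bytesPerFrame * framesPerPacket;
}
```

For the 16-bit mono format above (16, 1, 2, 1, 2) the check passes; a stereo format claiming only 2 bytes per frame would fail it.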
If you want recording only, with no playback, simply comment out the call that sets renderCallback:
AURenderCallbackStruct callbackStruct;
callbackStruct.inputProc = renderCallback; // Render function
callbackStruct.inputProcRefCon = NULL;
if(AudioUnitSetProperty(*audioUnit, kAudioUnitProperty_SetRenderCallback,
kAudioUnitScope_Input, 0, &callbackStruct,
sizeof(AURenderCallbackStruct)) != noErr) {
return 1;
}
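An alternative to removing the render callback entirely: keep it, call AudioUnitRender to pull the input as before, and overwrite the output buffers with silence before returning (you can also set kAudioUnitRenderAction_OutputIsSilence in *ioActionFlags as a hint). A minimal sketch of the zeroing step, written without Core Audio types so it stands alone (helper names are mine):

```c
#include <string.h>
#include <stdint.h>

/* Fill an interleaved 16-bit output buffer with silence, as you would do
   to ioData->mBuffers[0].mData at the end of the render callback. */
static void write_silence(int16_t *samples, uint32_t frames, uint32_t channels)
{
    memset(samples, 0, (size_t)frames * channels * sizeof(int16_t));
}

/* Self-check: fill a buffer with nonzero data, silence half of it... */
static int silence_and_check(void)
{
    int16_t buf[64];
    for (int i = 0; i < 64; i++) buf[i] = (int16_t)(i + 1);
    write_silence(buf, 32, 2);           /* 32 stereo frames = all 64 samples */
    for (int i = 0; i < 64; i++)
        if (buf[i] != 0) return 0;
    return 1;
}
```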
Update after seeing code:
As I suspected, you're missing the input callback. Add these lines:
// at top:
#define kInputBus 1
AURenderCallbackStruct callbackStruct;
/**/
callbackStruct.inputProc = &ALAudioUnit::recordingCallback;
callbackStruct.inputProcRefCon = this;
status = AudioUnitSetProperty(audioUnit,
kAudioOutputUnitProperty_SetInputCallback,
kAudioUnitScope_Global,
kInputBus,
&callbackStruct,
sizeof(callbackStruct));
Now in your recordingCallback:
OSStatus ALAudioUnit::recordingCallback(void *inRefCon,
AudioUnitRenderActionFlags *ioActionFlags,
const AudioTimeStamp *inTimeStamp,
UInt32 inBusNumber,
UInt32 inNumberFrames,
AudioBufferList *ioData)
{
// TODO: Use inRefCon to access our interface object to do stuff
// Then, use inNumberFrames to figure out how much data is available, and make
// that much space available in buffers in an AudioBufferList.
// Then:
// Obtain recorded samples
OSStatus status;
ALAudioUnit *pThis = reinterpret_cast<ALAudioUnit*>(inRefCon);
if (!pThis)
return noErr;
//assert (pThis->m_nMaxSliceFrames >= inNumberFrames);
pThis->recorderBufferList->GetBufferList().mBuffers[0].mDataByteSize = inNumberFrames * pThis->m_recorderSBD.mBytesPerFrame;
status = AudioUnitRender(pThis->audioUnit,
ioActionFlags,
inTimeStamp,
inBusNumber,
inNumberFrames,
&pThis->recorderBufferList->GetBufferList());
THROW_EXCEPTION_IF_ERROR(status, "error rendering audio unit");
// If we're not playing, I don't care about the data, simply discard it
if (!pThis->playbackState || pThis->isSeeking) return noErr;
// Now, we have the samples we just read sitting in buffers in bufferList
pThis->DoStuffWithTheRecordedAudio(inNumberFrames, pThis->recorderBufferList, inTimeStamp);
return noErr;
}
By the way, I'm allocating my own buffer instead of using the one provided by the AudioUnit. You might want to change those parts if you want to use the AudioUnit-allocated buffer.
Update:
How to allocate your own buffer:
recorderBufferList = new AUBufferList();
recorderBufferList->Allocate(m_recorderSBD, m_nMaxSliceFrames);
recorderBufferList->PrepareBuffer(m_recorderSBD, m_nMaxSliceFrames);
Also, if you're doing this, tell AudioUnit to not allocate buffers:
// Disable buffer allocation for the recorder (optional - do this if we want to pass in our own)
flag = 0;
status = AudioUnitSetProperty(audioUnit,
kAudioUnitProperty_ShouldAllocateBuffer,
kAudioUnitScope_Input,
kInputBus,
&flag,
sizeof(flag));
You'll need to include the Core Audio utility classes.
Thanks to @Mar0ux's answer. Anyone who got here looking for complete sample code can take a look here:
https://code.google.com/p/ios-coreaudio-example/
I am building a similar app with the same code, and I found that you can disable playback by changing the session category from kAudioSessionCategory_PlayAndRecord to kAudioSessionCategory_RecordAudio:
int initAudioStreams(AudioUnit *audioUnit) {
UInt32 audioCategory = kAudioSessionCategory_RecordAudio;
if(AudioSessionSetProperty(kAudioSessionProperty_AudioCategory,
sizeof(UInt32), &audioCategory) != noErr) {
return 1;
}
This stopped the feedback between mic and speaker on my hardware.

Resources