Playing a .caf audio file with ExtAudioFileRead is not in sync - iOS

My app does huge data processing on audio coming from the mic input.
In order to get a "demo mode", I want to do the same thing based on a local .caf audio file.
I managed to get the audio file.
Now I am trying to use ExtAudioFileRead to read the .caf file and then do the data processing.
void readFile()
{
    OSStatus err = noErr;

    // Audio file
    NSURL *path = [[NSBundle mainBundle] URLForResource:@"output" withExtension:@"caf"];
    ExtAudioFileOpenURL((__bridge CFURLRef)path, &audio->audiofile);
    assert(audio->audiofile);

    // File's format.
    AudioStreamBasicDescription fileFormat;
    UInt32 size = sizeof(fileFormat);
    err = ExtAudioFileGetProperty(audio->audiofile, kExtAudioFileProperty_FileDataFormat, &size, &fileFormat);

    // tell the ExtAudioFile API what format we want samples back in
    //bzero(&audio->clientFormat, sizeof(audio->clientFormat));
    audio->clientFormat.mSampleRate = SampleRate;
    audio->clientFormat.mFormatID = kAudioFormatLinearPCM;
    audio->clientFormat.mFormatFlags = kAudioFormatFlagIsSignedInteger | kAudioFormatFlagIsPacked;
    audio->clientFormat.mFramesPerPacket = 1;
    audio->clientFormat.mChannelsPerFrame = 1;
    audio->clientFormat.mBitsPerChannel = 16;//sizeof(AudioSampleType) * 8;
    audio->clientFormat.mBytesPerPacket = 2 * audio->clientFormat.mChannelsPerFrame;
    audio->clientFormat.mBytesPerFrame = 2 * audio->clientFormat.mChannelsPerFrame;
    err = ExtAudioFileSetProperty(audio->audiofile, kExtAudioFileProperty_ClientDataFormat, sizeof(audio->clientFormat), &audio->clientFormat);

    // find out how many frames we need to read
    SInt64 numFrames = 0;
    size = sizeof(numFrames);
    err = ExtAudioFileGetProperty(audio->audiofile, kExtAudioFileProperty_FileLengthFrames, &size, &numFrames);

    // create the buffers for reading in data
    AudioBufferList *bufferList = malloc(sizeof(AudioBufferList) + sizeof(AudioBuffer) * (audio->clientFormat.mChannelsPerFrame - 1));
    bufferList->mNumberBuffers = audio->clientFormat.mChannelsPerFrame;
    for (int ii = 0; ii < bufferList->mNumberBuffers; ++ii)
    {
        bufferList->mBuffers[ii].mDataByteSize = sizeof(float) * (int)numFrames;
        bufferList->mBuffers[ii].mNumberChannels = 1;
        bufferList->mBuffers[ii].mData = malloc(bufferList->mBuffers[ii].mDataByteSize);
    }

    UInt32 maxReadFrames = 1024;
    UInt32 rFrames = (UInt32)numFrames;
    while (rFrames > 0)
    {
        UInt32 framesToRead = (maxReadFrames > rFrames) ? rFrames : maxReadFrames;
        err = ExtAudioFileRead(audio->audiofile, &framesToRead, bufferList);
        [audio processAudio:bufferList];
        if (rFrames % SampleRate == 0)
            [audio realtimeUpdate:nil];
        rFrames = rFrames - maxReadFrames;
    }

    // Close the file
    ExtAudioFileDispose(audio->audiofile);

    // destroy the buffers
    for (int ii = 0; ii < bufferList->mNumberBuffers; ++ii)
    {
        free(bufferList->mBuffers[ii].mData);
    }
    free(bufferList);
    bufferList = NULL;
}
There is clearly something that I did not understand, or that I am doing wrong, with ExtAudioFileRead, because this code does not work at all. I have two main problems:
The file is read almost instantaneously; 44,100 samples clearly do not take 1 second here. My 3-minute audio file is processed in a few seconds...
During the processing, I need to update the UI, so I have a few dispatch_sync calls in processAudio and realtimeUpdate. ExtAudioFileRead does not seem to like this at all, and everything freezes.
Thanks for your help.

The code you wrote just reads samples from the file and then calls processAudio, and it does this as fast as possible. As soon as processAudio is finished, the next batch of samples is read and processAudio is called again. You shouldn't assume that reading from an audio file (which is a low-level, non-blocking OS call) takes the same time that the audio would take to play.
If you want to process the audio in the file according to the sample rate, you should probably use an AUFilePlayer audio unit. This can play back the file at the right speed, and you can use a callback to process the samples in real audio time instead of "as fast as possible".
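If actual playback is not required and you just want the demo-mode processing to run at roughly real-time pace, a cruder option is to throttle the existing read loop. A minimal sketch of that idea, assuming SampleRate is the client sample rate from the question (this only slows the loop down; it is not sample-accurate playback):

#include <unistd.h> // for usleep

UInt32 framesToRead = (maxReadFrames > rFrames) ? rFrames : maxReadFrames;
err = ExtAudioFileRead(audio->audiofile, &framesToRead, bufferList);
// framesToRead now holds the number of frames actually read
[audio processAudio:bufferList];

// Sleep for roughly the time these frames would take to play back,
// so processAudio is called at about real-time rate.
usleep((useconds_t)((double)framesToRead * 1000000.0 / SampleRate));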

Related

Using CMSampleTimingInfo, CMSampleBuffer and AudioBufferList from raw PCM stream

I'm receiving a raw PCM stream from Google's WebRTC C++ reference implementation (a hook inserted into VoEBaseImpl::GetPlayoutData). The audio appears to be linear PCM, signed int16, but when I record it using an AVAssetWriter, the saved audio file is highly distorted and higher pitched.
I am assuming this is an error somewhere with the input parameters, most probably with respect to the conversion of the stereo-int16 to an AudioBufferList and then on to a CMSampleBuffer. Is there any issue with the below code?
void RecorderImpl::RenderAudioFrame(void* audio_data, size_t number_of_frames, int sample_rate, int64_t elapsed_time_ms, int64_t ntp_time_ms) {
    OSStatus status;

    AudioChannelLayout acl;
    bzero(&acl, sizeof(acl));
    acl.mChannelLayoutTag = kAudioChannelLayoutTag_Stereo;

    AudioStreamBasicDescription audioFormat;
    audioFormat.mSampleRate = sample_rate;
    audioFormat.mFormatID = kAudioFormatLinearPCM;
    audioFormat.mFormatFlags = kLinearPCMFormatFlagIsSignedInteger | kLinearPCMFormatFlagIsPacked;
    audioFormat.mFramesPerPacket = 1;
    audioFormat.mChannelsPerFrame = 2;
    audioFormat.mBitsPerChannel = 16;
    audioFormat.mBytesPerPacket = audioFormat.mFramesPerPacket * audioFormat.mChannelsPerFrame * audioFormat.mBitsPerChannel / 8;
    audioFormat.mBytesPerFrame = audioFormat.mBytesPerPacket / audioFormat.mFramesPerPacket;

    CMSampleTimingInfo timing = { CMTimeMake(1, sample_rate), CMTimeMake(elapsed_time_ms, 1000), kCMTimeInvalid };

    CMFormatDescriptionRef format = NULL;
    status = CMAudioFormatDescriptionCreate(kCFAllocatorDefault, &audioFormat, sizeof(acl), &acl, 0, NULL, NULL, &format);
    if (status != 0) {
        NSLog(@"Failed to create audio format description");
        return;
    }

    CMSampleBufferRef buffer;
    status = CMSampleBufferCreate(kCFAllocatorDefault, NULL, false, NULL, NULL, format, (CMItemCount)number_of_frames, 1, &timing, 0, NULL, &buffer);
    if (status != 0) {
        NSLog(@"Failed to allocate sample buffer");
        return;
    }

    AudioBufferList bufferList;
    bufferList.mNumberBuffers = 1;
    bufferList.mBuffers[0].mNumberChannels = audioFormat.mChannelsPerFrame;
    bufferList.mBuffers[0].mDataByteSize = (UInt32)(number_of_frames * audioFormat.mBytesPerFrame);
    bufferList.mBuffers[0].mData = audio_data;

    status = CMSampleBufferSetDataBufferFromAudioBufferList(buffer, kCFAllocatorDefault, kCFAllocatorDefault, 0, &bufferList);
    if (status != 0) {
        NSLog(@"Failed to convert audio buffer list into sample buffer");
        return;
    }

    [recorder writeAudioFrames:buffer];
    CFRelease(buffer);
}
For reference, the sample rate I'm receiving from WebRTC on an iPhone 6S+ / iOS 9.2 is 48kHz with 480 samples per invocation of this hook and I'm receiving data every 10 ms.
First of all, congratulations on having the temerity to create an audio CMSampleBuffer from scratch. For most, they are neither created nor destroyed, but handed down immaculate and mysterious from CoreMedia and AVFoundation.
The presentationTimeStamps in your timing info are in integral milliseconds, which cannot represent your 48kHz samples' positions in time.
Instead of CMTimeMake(elapsed_time_ms, 1000), try CMTimeMake(elapsed_frames, sample_rate), where elapsed_frames are the number of frames that you have previously written.
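A minimal sketch of that change, assuming an elapsed_frames counter (not in the original code) that the hook accumulates across invocations:

// elapsed_frames is an assumed int64_t instance variable, starting at 0.
CMSampleTimingInfo timing = { CMTimeMake(1, sample_rate),              // duration of one frame
                              CMTimeMake(elapsed_frames, sample_rate), // presentation time, in frames
                              kCMTimeInvalid };
elapsed_frames += number_of_frames;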
That would explain the distortion, but not the pitch, so make sure that the AudioStreamBasicDescription matches your AVAssetWriterInput setup. It's hard to say without seeing your AVAssetWriter code.
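For comparison, here is a hedged sketch of AVAssetWriterInput output settings that would match the ASBD above for uncompressed PCM. The values are assumptions based on the question's 48kHz stereo int16 stream, not the asker's actual writer setup:

NSDictionary *settings = @{ AVFormatIDKey:               @(kAudioFormatLinearPCM),
                            AVSampleRateKey:             @(48000),
                            AVNumberOfChannelsKey:       @(2),
                            AVLinearPCMBitDepthKey:      @(16),
                            AVLinearPCMIsFloatKey:       @(NO),
                            AVLinearPCMIsBigEndianKey:   @(NO),
                            AVLinearPCMIsNonInterleaved: @(NO) };
AVAssetWriterInput *audioInput =
    [AVAssetWriterInput assetWriterInputWithMediaType:AVMediaTypeAudio
                                       outputSettings:settings];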
p.s. Look out for writeAudioFrames - if it's asynchronous, you'll have problems with ownership of the audio_data.
p.p.s. It looks like you're leaking the CMFormatDescriptionRef.
I ended up opening the generated audio file in Audacity and saw that half of every frame was being dropped, producing a rather bizarre looking waveform.
Changing acl.mChannelLayoutTag to kAudioChannelLayoutTag_Mono and changing audioFormat.mChannelsPerFrame to 1 solved the issue and now the audio quality is perfect. Hooray!
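In terms of the RenderAudioFrame code above, those two changes are simply:

// Treat the incoming WebRTC data as mono rather than stereo.
acl.mChannelLayoutTag = kAudioChannelLayoutTag_Mono;
audioFormat.mChannelsPerFrame = 1;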

Looping AAC file with ExtAudioFileRead - bug?

Reading audio files on iOS with ExtAudioFileRead, it seems that reaching EOF completely freezes the reader… Example below; it assumes the _abl AudioBufferList and _eaf ExtAudioFileRef are allocated and correctly configured:
- ( void )testRead
{
    UInt32 requestedFrames = 1024;
    UInt32 numFrames = requestedFrames;
    OSStatus error = 0;

    error = ExtAudioFileRead( _eaf, &numFrames, _abl );

    if( numFrames < requestedFrames ) //eof, want to read enough frames from the beginning of the file to reach requestedFrames and loop gaplessly
    {
        requestedFrames = requestedFrames - numFrames;
        numFrames = requestedFrames;

        // move some pointers in _abl's buffers to write at the correct offset

        error = ExtAudioFileSeek( _eaf, 0 );
        error = ExtAudioFileRead( _eaf, &numFrames, _abl );

        if( numFrames != requestedFrames ) //Now this call always sets numFrames to the same value as the previous read call...
        {
            NSLog( @"Oh no!" );
        }
    }
}
No errors, always the same behavior, exactly as if the reader was stuck at the end of the file. ExtAudioFileTell confirms the requested seek, btw. Also tried keeping track of the position in the file to request only the number of frames available at eof, same result: as soon as the last packet is read, seek seems to have no effect.
Happily seeking in other circumstances.
Bug? Feature? Imminent face palm? I'd very much appreciate any help in solving this!
I'm testing this on an iPad 3 ( iOS7.1 ).
Cheers,
Gregzo
Woozah!
Gotcha, evil AudioBufferList tinkerer.
So, in addition to informing the client of the number of frames actually read, ExtAudioFileRead also sets each AudioBuffer's mDataByteSize in the AudioBufferList to the number of bytes read. Since it clamps subsequent reads to that value, not resetting it at EOF results in perpetually getting fewer frames than asked for.
So, once eof is reached, simply reset the abl's buffers size.
- ( void )resetABLBuffersSize: ( AudioBufferList * )abl size: ( UInt32 )size
{
    AudioBuffer * buffer;
    UInt32 i;

    for( i = 0; i < abl->mNumberBuffers; i++ )
    {
        buffer = &( abl->mBuffers[ i ] );
        buffer->mDataByteSize = size;
    }
}
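In the testRead example above, that means restoring the buffers' capacity before the second read. A sketch, assuming kBufferCapacityBytes is whatever you originally allocated per buffer:

// Restore each buffer's capacity before seeking back and re-reading,
// otherwise the second ExtAudioFileRead is clamped to the bytes left at EOF.
[self resetABLBuffersSize:_abl size:kBufferCapacityBytes];
error = ExtAudioFileSeek( _eaf, 0 );
error = ExtAudioFileRead( _eaf, &numFrames, _abl );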
Shouldn't this be documented? The official docs only describe the AudioBufferList parameter as: "One or more buffers into which the audio data is read."
Cheers,
Gregzo

Audioqueue callback not being called

So, basically I want to play some audio files (mp3 and caf mostly). But the callback never gets called, except when I call it myself to prime the queue.
Here's my data struct:
struct AQPlayerState
{
CAStreamBasicDescription mDataFormat;
AudioQueueRef mQueue;
AudioQueueBufferRef mBuffers[kBufferNum];
AudioFileID mAudioFile;
UInt32 bufferByteSize;
SInt64 mCurrentPacket;
UInt32 mNumPacketsToRead;
AudioStreamPacketDescription *mPacketDescs;
bool mIsRunning;
};
Here's my callback function:
static void HandleOutputBuffer (void *aqData, AudioQueueRef inAQ, AudioQueueBufferRef inBuffer)
{
NSLog(#"HandleOutput");
AQPlayerState *pAqData = (AQPlayerState *) aqData;
if (pAqData->mIsRunning == false) return;
UInt32 numBytesReadFromFile;
UInt32 numPackets = pAqData->mNumPacketsToRead;
AudioFileReadPackets (pAqData->mAudioFile,
false,
&numBytesReadFromFile,
pAqData->mPacketDescs,
pAqData->mCurrentPacket,
&numPackets,
inBuffer->mAudioData);
if (numPackets > 0) {
inBuffer->mAudioDataByteSize = numBytesReadFromFile;
AudioQueueEnqueueBuffer (pAqData->mQueue,
inBuffer,
(pAqData->mPacketDescs ? numPackets : 0),
pAqData->mPacketDescs);
pAqData->mCurrentPacket += numPackets;
} else {
// AudioQueueStop(pAqData->mQueue, false);
// AudioQueueDispose(pAqData->mQueue, true);
// AudioFileClose (pAqData->mAudioFile);
// free(pAqData->mPacketDescs);
// free(pAqData->mFloatBuffer);
pAqData->mIsRunning = false;
}
}
And here's my method:
- (void)playFile
{
    AQPlayerState aqData;

    // get the source file
    NSString *p = [[NSBundle mainBundle] pathForResource:@"1_Female" ofType:@"mp3"];
    NSURL *url2 = [NSURL fileURLWithPath:p];
    CFURLRef srcFile = (__bridge CFURLRef)url2;

    OSStatus result = AudioFileOpenURL(srcFile, 0x1/*fsRdPerm*/, 0/*inFileTypeHint*/, &aqData.mAudioFile);
    CFRelease (srcFile);
    CheckError(result, "Error opening sound file");

    UInt32 size = sizeof(aqData.mDataFormat);
    CheckError(AudioFileGetProperty(aqData.mAudioFile, kAudioFilePropertyDataFormat, &size, &aqData.mDataFormat),
               "Error getting file's data format");

    CheckError(AudioQueueNewOutput(&aqData.mDataFormat, HandleOutputBuffer, &aqData, CFRunLoopGetCurrent(), kCFRunLoopCommonModes, 0, &aqData.mQueue),
               "Error AudioQueueNewOutput");

    // we need to calculate how many packets we read at a time and how big a buffer we need
    // we base this on the size of the packets in the file and an approximate duration for each buffer
    {
        bool isFormatVBR = (aqData.mDataFormat.mBytesPerPacket == 0 || aqData.mDataFormat.mFramesPerPacket == 0);

        // first check to see what the max size of a packet is - if it is bigger
        // than our allocation default size, that needs to become larger
        UInt32 maxPacketSize;
        size = sizeof(maxPacketSize);
        CheckError(AudioFileGetProperty(aqData.mAudioFile, kAudioFilePropertyPacketSizeUpperBound, &size, &maxPacketSize),
                   "Error getting max packet size");

        // adjust buffer size to represent about a second of audio based on this format
        CalculateBytesForTime(aqData.mDataFormat, maxPacketSize, 1.0/*seconds*/, &aqData.bufferByteSize, &aqData.mNumPacketsToRead);

        if (isFormatVBR) {
            aqData.mPacketDescs = new AudioStreamPacketDescription [aqData.mNumPacketsToRead];
        } else {
            aqData.mPacketDescs = NULL; // we don't provide packet descriptions for constant bit rate formats (like linear PCM)
        }

        printf ("Buffer Byte Size: %d, Num Packets to Read: %d\n", (int)aqData.bufferByteSize, (int)aqData.mNumPacketsToRead);
    }

    // if the file has a magic cookie, we should get it and set it on the AQ
    size = sizeof(UInt32);
    result = AudioFileGetPropertyInfo(aqData.mAudioFile, kAudioFilePropertyMagicCookieData, &size, NULL);
    if (!result && size) {
        char* cookie = new char [size];
        CheckError(AudioFileGetProperty(aqData.mAudioFile, kAudioFilePropertyMagicCookieData, &size, cookie),
                   "Error getting cookie from file");
        CheckError(AudioQueueSetProperty(aqData.mQueue, kAudioQueueProperty_MagicCookie, cookie, size),
                   "Error setting cookie to file");
        delete[] cookie;
    }

    aqData.mCurrentPacket = 0;
    for (int i = 0; i < kBufferNum; ++i) {
        CheckError(AudioQueueAllocateBuffer (aqData.mQueue,
                                             aqData.bufferByteSize,
                                             &aqData.mBuffers[i]),
                   "Error AudioQueueAllocateBuffer");
        HandleOutputBuffer (&aqData,
                            aqData.mQueue,
                            aqData.mBuffers[i]);
    }

    // set queue's gain
    Float32 gain = 1.0;
    CheckError(AudioQueueSetParameter (aqData.mQueue,
                                       kAudioQueueParam_Volume,
                                       gain),
               "Error AudioQueueSetParameter");

    aqData.mIsRunning = true;
    CheckError(AudioQueueStart(aqData.mQueue,
                               NULL),
               "Error AudioQueueStart");
}
And the output when I press play:
Buffer Byte Size: 40310, Num Packets to Read: 38
HandleOutput start
HandleOutput start
HandleOutput start
I tried replacing CFRunLoopGetCurrent() with CFRunLoopGetMain() and kCFRunLoopCommonModes with kCFRunLoopDefaultMode, but nothing changed.
Shouldn't the primed buffers start playing right away when I start the queue?
When I start the queue, no callbacks are fired.
What am I doing wrong? Thanks for any ideas.
What you are trying to do here is a basic example of audio playback using Audio Queues. Without looking at your code in detail to see what's missing (that could take a while), I'd rather recommend that you follow the steps in this basic sample code, which does exactly what you're doing, without the extras that aren't really relevant (for example, why are you trying to set the audio gain?).
Somewhere else you were trying to play audio using audio units. Audio units are more complex than basic audio queue playback, and I wouldn't attempt them before being very comfortable with audio queues. But you can look at this example project for a basic example of audio queues.
In general, when it comes to Core Audio programming on iOS, it's best to take your time with the basic examples and build your way up. The problem with a lot of tutorials online is that they add extra stuff and often mix it with Objective-C code, whereas Core Audio is purely C code (i.e. the extra stuff won't add anything to the learning process). I strongly recommend you go over the book Learning Core Audio if you haven't already. All the sample code is available online, but you can also clone it from this repo for convenience. That's how I learned Core Audio. It takes time :)

Play audio file using Audio Units?

I've successfully recorded audio from the microphone into an audio file using Audio Units with the help of openframeworks and this website http://atastypixel.com/blog/using-remoteio-audio-unit.
I want to be able to stream the file back to audio units and play the audio. According to Play an audio file using RemoteIO and Audio Unit I can use ExtAudioFileOpenURL and ExtAudioFileRead. However, how do I play audio data in my buffer?
This is what I currently have:
static OSStatus setupAudioFileRead() {
    //construct the file destination URL
    CFURLRef destinationURL = audioSystemFileURL();
    OSStatus status = ExtAudioFileOpenURL(destinationURL, &audioFileRef);
    CFRelease(destinationURL);
    if (checkStatus(status)) { ofLog(OF_LOG_ERROR, "ofxiPhoneSoundStream: Couldn't open file to read"); return status; }

    while( TRUE ) {
        // Try to fill the buffer to capacity.
        UInt32 framesRead = 8000;
        status = ExtAudioFileRead( audioFileRef, &framesRead, &inputBufferList );

        // error check
        if( checkStatus(status) ) { break; }

        // 0 frames read means EOF.
        if( framesRead == 0 ) { break; }

        //play audio???
    }
    return noErr;
}
From this author: http://atastypixel.com/blog/using-remoteio-audio-unit/, if you scroll down to the PLAYBACK section, try something like this:
static OSStatus playbackCallback(void *inRefCon,
                                 AudioUnitRenderActionFlags *ioActionFlags,
                                 const AudioTimeStamp *inTimeStamp,
                                 UInt32 inBusNumber,
                                 UInt32 inNumberFrames,
                                 AudioBufferList *ioData) {
    // Notes: ioData contains buffers (may be more than one!)
    // Fill them up as much as you can. Remember to set the size value in each buffer to match how
    // much data is in the buffer.
    for (int i = 0; i < ioData->mNumberBuffers; i++)
    {
        AudioBuffer buffer = ioData->mBuffers[i];

        // copy from your whatever buffer data to output buffer
        UInt32 size = min(buffer.mDataByteSize, your buffer.size);
        memcpy(buffer.mData, your buffer, size);
        buffer.mDataByteSize = size; // indicate how much data we wrote in the buffer

        // To test if your Audio Unit setup is working - comment out the three
        // lines above and uncomment the for loop below to hear random noise
        /*
        UInt16 *frameBuffer = buffer.mData;
        for (int j = 0; j < inNumberFrames; j++) {
            frameBuffer[j] = rand();
        }
        */
    }
    return noErr;
}
If you are only looking to record from the mic to a file and play it back, Apple's SpeakHere sample is probably much more ready to use.
Basically:
1. Create a RemoteIO unit (see references about how to create RemoteIO);
2. Create a FilePlayer audio unit, which is a dedicated audio unit that reads an audio file and provides its audio data to output units, for example the RemoteIO unit created in step 1. To actually use the FilePlayer, a number of settings (which file to play, which part of the file to play, etc.) need to be made on it (see the sketch after this outline);
3. Set the kAudioUnitProperty_SetRenderCallback and kAudioUnitProperty_StreamFormat properties of the RemoteIO unit. The first is essentially the callback function from which the RemoteIO unit pulls audio data and plays it. The second must be set in accordance with the stream format supported by the FilePlayer; it can be obtained with a get-property call on the FilePlayer.
4. Define the callback set in step 3. The most important thing it does is ask the FilePlayer to render into the buffer provided by the callback, for which you will need to invoke AudioUnitRender() on the FilePlayer.
5. Finally, start the RemoteIO unit to play the file.
Above is just a preliminary outline of basic things to do to play files using audio units on iOS. You can refer to Chris Adamson and Kevin Avila's Learning Core Audio for details.
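As a rough illustration of the FilePlayer settings mentioned in step 2 and the callback in step 4, here is a hedged sketch; filePlayerUnit is an assumed, already-created AUAudioFilePlayer instance, fileURL is an assumed CFURLRef, and error handling is omitted:

// Step 2: open the file and hand it to the FilePlayer.
AudioFileID audioFileID;
AudioFileOpenURL(fileURL, kAudioFileReadPermission, 0, &audioFileID);
AudioUnitSetProperty(filePlayerUnit, kAudioUnitProperty_ScheduledFileIDs,
                     kAudioUnitScope_Global, 0, &audioFileID, sizeof(audioFileID));

// Work out how many frames the whole file contains.
UInt64 nPackets = 0;
UInt32 propSize = sizeof(nPackets);
AudioFileGetProperty(audioFileID, kAudioFilePropertyAudioDataPacketCount, &propSize, &nPackets);
AudioStreamBasicDescription fileASBD;
propSize = sizeof(fileASBD);
AudioFileGetProperty(audioFileID, kAudioFilePropertyDataFormat, &propSize, &fileASBD);

// Schedule a region covering the whole file, starting at sample time 0.
ScheduledAudioFileRegion region = {0};
region.mTimeStamp.mFlags = kAudioTimeStampSampleTimeValid;
region.mTimeStamp.mSampleTime = 0;
region.mAudioFile = audioFileID;
region.mFramesToPlay = (UInt32)(nPackets * fileASBD.mFramesPerPacket);
AudioUnitSetProperty(filePlayerUnit, kAudioUnitProperty_ScheduledFileRegion,
                     kAudioUnitScope_Global, 0, &region, sizeof(region));

// Prime the player and tell it to start on the next render cycle.
UInt32 primeFrames = 0; // 0 = use the default priming
AudioUnitSetProperty(filePlayerUnit, kAudioUnitProperty_ScheduledFilePrime,
                     kAudioUnitScope_Global, 0, &primeFrames, sizeof(primeFrames));
AudioTimeStamp startTime = {0};
startTime.mFlags = kAudioTimeStampSampleTimeValid;
startTime.mSampleTime = -1; // -1 = as soon as possible
AudioUnitSetProperty(filePlayerUnit, kAudioUnitProperty_ScheduleStartTimeStamp,
                     kAudioUnitScope_Global, 0, &startTime, sizeof(startTime));

// Step 4: the RemoteIO render callback pulls audio from the FilePlayer
// (here the FilePlayer unit is assumed to be passed as the refCon).
static OSStatus renderCallback(void *inRefCon, AudioUnitRenderActionFlags *ioActionFlags,
                               const AudioTimeStamp *inTimeStamp, UInt32 inBusNumber,
                               UInt32 inNumberFrames, AudioBufferList *ioData)
{
    AudioUnit filePlayerUnit = (AudioUnit)inRefCon;
    return AudioUnitRender(filePlayerUnit, ioActionFlags, inTimeStamp, 0, inNumberFrames, ioData);
}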
Here is a relatively simple approach that uses the audio unit from the Tasty Pixel blog. In the recording callback, instead of filling the buffer with data from the microphone, you can fill it with data from the file using ExtAudioFileRead. I'll try to paste an example below. Mind you, this will only work for .caf files.
In the start method, call a readAudio or initAudioFile function, something that just gets all the info about the file.
- (void) start {
    readAudio();
    OSStatus status = AudioOutputUnitStart(audioUnit);
    checkStatus(status);
}
Now in the readAudio method you initialize the audio file reference as such.
ExtAudioFileRef fileRef;

void readAudio() {
    NSString * name = @"AudioFile";
    NSString * source = [[NSBundle mainBundle] pathForResource:name ofType:@"caf"];
    const char * cString = [source cStringUsingEncoding:NSASCIIStringEncoding];
    CFStringRef str = CFStringCreateWithCString(NULL, cString, kCFStringEncodingMacRoman);
    CFURLRef inputFileURL = CFURLCreateWithFileSystemPath(kCFAllocatorDefault, str, kCFURLPOSIXPathStyle, false);

    AudioFileID fileID;
    OSStatus err = AudioFileOpenURL(inputFileURL, kAudioFileReadPermission, 0, &fileID);
    CheckError(err, "AudioFileOpenURL");

    err = ExtAudioFileOpenURL(inputFileURL, &fileRef);
    CheckError(err, "ExtAudioFileOpenURL");

    err = ExtAudioFileSetProperty(fileRef, kExtAudioFileProperty_ClientDataFormat, sizeof(AudioStreamBasicDescription), &audioFormat);
    CheckError(err, "ExtAudioFileSetProperty");
}
Now that you have the audio data at hand, the next step is pretty easy. In the recordingCallback, read the data from the file instead of the mic.
static OSStatus recordingCallback(void *inRefCon,
                                  AudioUnitRenderActionFlags *ioActionFlags,
                                  const AudioTimeStamp *inTimeStamp,
                                  UInt32 inBusNumber,
                                  UInt32 inNumberFrames,
                                  AudioBufferList *ioData) {
    // Because of the way our audio format (setup below) is chosen:
    // we only need 1 buffer, since it is mono
    // Samples are 16 bits = 2 bytes.
    // 1 frame includes only 1 sample
    AudioBuffer buffer;
    buffer.mNumberChannels = 1;
    buffer.mDataByteSize = inNumberFrames * 2;
    buffer.mData = malloc( inNumberFrames * 2 );

    // Put buffer in a AudioBufferList
    AudioBufferList bufferList;
    bufferList.mNumberBuffers = 1;
    bufferList.mBuffers[0] = buffer;

    // Then:
    // Obtain recorded samples
    OSStatus err = ExtAudioFileRead(fileRef, &inNumberFrames, &bufferList);

    // Now, we have the samples we just read sitting in buffers in bufferList
    // Process the new data
    [iosAudio processAudio:&bufferList];

    // release the malloc'ed data in the buffer we created earlier
    free(bufferList.mBuffers[0].mData);

    return noErr;
}
This worked for me.

Can I use AVCaptureSession to encode an AAC stream to memory?

I'm writing an iOS app that streams video and audio over the network.
I am using AVCaptureSession to grab raw video frames using AVCaptureVideoDataOutput and encode them in software using x264. This works great.
I wanted to do the same for audio, only I don't need that much control on the audio side, so I wanted to use the built-in hardware encoder to produce an AAC stream. This meant using Audio Converter from the Audio Toolbox layer. In order to do so, I put in a handler for AVCaptureAudioDataOutput's audio frames:
- (void)captureOutput:(AVCaptureOutput *)captureOutput
didOutputSampleBuffer:(CMSampleBufferRef)sampleBuffer
       fromConnection:(AVCaptureConnection *)connection
{
    // get the audio samples into a common buffer _pcmBuffer
    CMBlockBufferRef blockBuffer = CMSampleBufferGetDataBuffer(sampleBuffer);
    CMBlockBufferGetDataPointer(blockBuffer, 0, NULL, &_pcmBufferSize, &_pcmBuffer);

    // use AudioConverter to encode the PCM samples into AAC
    UInt32 ouputPacketsCount = 1;
    AudioBufferList bufferList;
    bufferList.mNumberBuffers = 1;
    bufferList.mBuffers[0].mNumberChannels = 1;
    bufferList.mBuffers[0].mDataByteSize = sizeof(_aacBuffer);
    bufferList.mBuffers[0].mData = _aacBuffer;
    OSStatus st = AudioConverterFillComplexBuffer(_converter, converter_callback, (__bridge void *) self, &ouputPacketsCount, &bufferList, NULL);
    if (0 == st) {
        // ... send bufferList.mBuffers[0].mDataByteSize bytes from _aacBuffer...
    }
}
In this case, the callback function for the audio converter is pretty simple (assuming packet sizes and counts are set up properly):
- (void) putPcmSamplesInBufferList:(AudioBufferList *)bufferList withCount:(UInt32 *)count
{
    bufferList->mBuffers[0].mData = _pcmBuffer;
    bufferList->mBuffers[0].mDataByteSize = _pcmBufferSize;
}
And the setup for the audio converter looks like this:
{
    // ...
    AudioStreamBasicDescription pcmASBD = {0};
    pcmASBD.mSampleRate = ((AVAudioSession *) [AVAudioSession sharedInstance]).currentHardwareSampleRate;
    pcmASBD.mFormatID = kAudioFormatLinearPCM;
    pcmASBD.mFormatFlags = kAudioFormatFlagsCanonical;
    pcmASBD.mChannelsPerFrame = 1;
    pcmASBD.mBytesPerFrame = sizeof(AudioSampleType);
    pcmASBD.mFramesPerPacket = 1;
    pcmASBD.mBytesPerPacket = pcmASBD.mBytesPerFrame * pcmASBD.mFramesPerPacket;
    pcmASBD.mBitsPerChannel = 8 * pcmASBD.mBytesPerFrame;

    AudioStreamBasicDescription aacASBD = {0};
    aacASBD.mFormatID = kAudioFormatMPEG4AAC;
    aacASBD.mSampleRate = pcmASBD.mSampleRate;
    aacASBD.mChannelsPerFrame = pcmASBD.mChannelsPerFrame;
    size = sizeof(aacASBD);
    AudioFormatGetProperty(kAudioFormatProperty_FormatInfo, 0, NULL, &size, &aacASBD);

    AudioConverterNew(&pcmASBD, &aacASBD, &_converter);
    // ...
}
This seems pretty straightforward, only IT DOES NOT WORK. Once the AVCaptureSession is running, the audio converter (specifically AudioConverterFillComplexBuffer) returns an 'hwiu' (hardware in use) error. Conversion works fine if the session is stopped, but then I can't capture anything...
I was wondering if there was a way to get an AAC stream out of AVCaptureSession. The options I'm considering are:
Somehow using AVAssetWriterInput to encode audio samples into AAC and then get the encoded packets somehow (not through AVAssetWriter, which would only write to a file).
Reorganizing my app so that it uses AVCaptureSession only on the video side and uses Audio Queues on the audio side. This will make flow control (starting and stopping recording, responding to interruptions) more complicated and I'm afraid that it might cause synching problems between the audio and video. Also, it just doesn't seem like a good design.
Does anyone know if getting the AAC out of AVCaptureSession is possible? Do I have to use Audio Queues here? Could this get me into synching or control problems?
I ended up asking Apple for advice (it turns out you can do that if you have a paid developer account).
It seems that AVCaptureSession grabs hold of the AAC hardware encoder but only lets you use it to write directly to a file.
You can use the software encoder but you have to ask for it specifically instead of using AudioConverterNew:
AudioClassDescription *description = [self
        getAudioClassDescriptionWithType:kAudioFormatMPEG4AAC
                        fromManufacturer:kAppleSoftwareAudioCodecManufacturer];
if (!description) {
    return false;
}
// see the question for setting up pcmASBD and aacASBD
OSStatus st = AudioConverterNewSpecific(&pcmASBD, &aacASBD, 1, description, &_converter);
if (st) {
    NSLog(@"error creating audio converter: %s", OSSTATUS(st));
    return false;
}
with
- (AudioClassDescription *)getAudioClassDescriptionWithType:(UInt32)type
                                            fromManufacturer:(UInt32)manufacturer
{
    static AudioClassDescription desc;

    UInt32 encoderSpecifier = type;
    OSStatus st;
    UInt32 size;

    st = AudioFormatGetPropertyInfo(kAudioFormatProperty_Encoders,
                                    sizeof(encoderSpecifier),
                                    &encoderSpecifier,
                                    &size);
    if (st) {
        NSLog(@"error getting audio format property info: %s", OSSTATUS(st));
        return nil;
    }

    unsigned int count = size / sizeof(AudioClassDescription);
    AudioClassDescription descriptions[count];
    st = AudioFormatGetProperty(kAudioFormatProperty_Encoders,
                                sizeof(encoderSpecifier),
                                &encoderSpecifier,
                                &size,
                                descriptions);
    if (st) {
        NSLog(@"error getting audio format property: %s", OSSTATUS(st));
        return nil;
    }

    for (unsigned int i = 0; i < count; i++) {
        if ((type == descriptions[i].mSubType) &&
            (manufacturer == descriptions[i].mManufacturer)) {
            memcpy(&desc, &(descriptions[i]), sizeof(desc));
            return &desc;
        }
    }
    return nil;
}
The software encoder will take up CPU resources, of course, but will get the job done.
