We have a communications application that has been out for over eight years now on the iOS platform, and recently we have run into an issue on the new iPhone 14.
We are using the audio session category AVAudioSessionCategoryPlayAndRecord with AVAudioSessionModeVoiceChat. We are also using the kAudioUnitSubType_VoiceProcessingIO component.
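For reference, the session configuration amounts to something like the sketch below (a minimal sketch, not our exact code; error handling trimmed):

// Minimal sketch of the session setup described above; error handling trimmed.
AVAudioSession *session = [AVAudioSession sharedInstance];
NSError *error = nil;
[session setCategory:AVAudioSessionCategoryPlayAndRecord
                mode:AVAudioSessionModeVoiceChat
             options:0
               error:&error];
[session setActive:YES error:&error];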
Part of the CoreAudio setup sets the callback size as follows:
NSTimeInterval desiredBufferDuration = .02;
BOOL prefResult = [session setPreferredIOBufferDuration:desiredBufferDuration error:&nsError];
Then when asking for the buffer duration back with
NSTimeInterval actualBufferDuration = [session IOBufferDuration];
We get the expected .0213333, which is 1024 samples at 48kHz.
In the audio callback, we have ALWAYS received 1024 samples. The callback simply logs the number of frames supplied, as follows:
static OSStatus inputAudioCallback (void *inRefCon,
AudioUnitRenderActionFlags *ioActionFlags,
const AudioTimeStamp *inTimeStamp,
UInt32 inBusNumber,
UInt32 inNumberFrames,
AudioBufferList *ioData)
{
NSLog (#"A %d", inNumberFrames);
}
The code is simplified, but you get the idea.
On any hardware device other than iPhone 14, we get 1024 every time from this log statement.
On the iPhone 14, we get the following sequence:
A 960 (6 times)
A 1440
A 960 (7 times)
A 1440
And it repeats. For playback alone this isn't really causing any issues; HOWEVER, we also pull in the microphone data during the same callback. Here's the simplified code in the audio callback:
renderErr = AudioUnitRender(ioUnit, ioActionFlags, inTimeStamp, 1, inNumberFrames, myIOData);
A simple call, but quite often at the 960/1440 transition, AudioUnitRender returns kAudioUnitErr_CannotDoInCurrentContext for both the last 960-frame callback and the 1440-frame callback.
This results in lost microphone data, causing popping/glitching in the audio.
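For illustration, a defensive pattern around the failing call might look like the sketch below; it only masks the glitch by substituting silence, and does not recover the dropped microphone data.

// Sketch: zero the capture buffers when AudioUnitRender fails so that
// stale or uninitialized samples are not forwarded downstream.
renderErr = AudioUnitRender(ioUnit, ioActionFlags, inTimeStamp,
                            1 /* input bus */, inNumberFrames, myIOData);
if (renderErr != noErr) {
    for (UInt32 i = 0; i < myIOData->mNumberBuffers; i++) {
        memset(myIOData->mBuffers[i].mData, 0,
               myIOData->mBuffers[i].mDataByteSize);
    }
}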
If we switch to using the kAudioUnitSubType_RemoteIO subtype, we reliably get 1024 samples per callback and the AudioUnitRender function works correctly every time. The problem is we don't get echo cancellation, so using the device hand-held is worthless in this mode.
So the question is, has something severely changed on the iPhone 14 in how AudioUnitRender behaves when called from the audio callback with kAudioUnitSubType_VoiceProcessingIO? This is most definitely not an iOS 16 bug, since there are no issues on the iPhone 13 or earlier models that support iOS 16.
The fact that we're not getting 1024 samples at a time suggests something is really wrong, but this code has worked correctly for years and only misbehaves on the iPhone 14.
We were experiencing the same issue on our end. We believe it is an Apple issue since, as you said, nothing has changed in this API that we know of. It should be a valid way of interacting with the kAudioUnitSubType_VoiceProcessingIO Audio Unit.
As a workaround, we decided to try registering a second callback for capture, using the kAudioOutputUnitProperty_SetInputCallback property, alongside the one we set with kAudioUnitProperty_SetRenderCallback (which previously handled both render and capture).
AURenderCallbackStruct callbackStruct;
callbackStruct.inputProc = captureCallback;
callbackStruct.inputProcRefCon = this;
OSStatus result = AudioUnitSetProperty(audio_unit,
kAudioOutputUnitProperty_SetInputCallback,
kAudioUnitScope_Global,
1, // input bus
&callbackStruct,
sizeof(callbackStruct));
Then, the captureCallback calls AudioUnitRender and copies the data into our buffers:
OSStatus IosAudioUnit::captureCallback(AudioUnitRenderActionFlags* ioActionFlags, const AudioTimeStamp* inTimeStamp, UInt32 inBusNumber, UInt32 inNumberFrames, AudioBufferList* /*ioData*/)
{
    const int currBuffSize = inNumberFrames * sizeof(float);
    if (currBuffSize > MAX_BUF_SIZE) {
        printf("ERROR: currBuffSize %d > MAX_BUF_SIZE %d\n", currBuffSize, MAX_BUF_SIZE);
        return kAudio_ParamError;
    }
    bufferList.mBuffers[0].mDataByteSize = currBuffSize;
    AudioUnitRender(audio_unit, ioActionFlags, inTimeStamp, inBusNumber, inNumberFrames, &bufferList);
    // Copy the data in the bufferList into our buffers
    // ...
    return noErr;
}
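Since AURenderCallback is a plain C function pointer, the member function above is presumably reached through a static trampoline that recovers the instance from inRefCon (set to this above). The original answer doesn't show that glue, so the following is an assumed sketch:

// Assumed glue (not shown above): a static trampoline recovers `this`
// from inRefCon and forwards to the member function.
static OSStatus captureCallbackTrampoline(void *inRefCon,
                                          AudioUnitRenderActionFlags *ioActionFlags,
                                          const AudioTimeStamp *inTimeStamp,
                                          UInt32 inBusNumber,
                                          UInt32 inNumberFrames,
                                          AudioBufferList *ioData)
{
    IosAudioUnit *unit = static_cast<IosAudioUnit *>(inRefCon);
    return unit->captureCallback(ioActionFlags, inTimeStamp, inBusNumber,
                                 inNumberFrames, ioData);
}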
Here's how we set up the bufferList in our IosAudioUnit constructor:
#define MAX_BUF_SIZE 23040 // 60 ms of frames in bytes at 48000 Hz
IosAudioUnit::IosAudioUnit()
{
    // bufferList is a member variable, so the capture callback can reuse it.
    bufferList.mNumberBuffers = 1;
    bufferList.mBuffers[0].mData = malloc(MAX_BUF_SIZE);
    bufferList.mBuffers[0].mDataByteSize = MAX_BUF_SIZE;
    bufferList.mBuffers[0].mNumberChannels = 2;
}
Both the render callback and the new capture callback are still called with frame counts of 960 and 1440 on the iPhone 14, but the AudioUnit seems to handle it better, and there are no more pops/glitches in the audio!
Hope this helps!
Related
Playback through my AudioUnit works fine until I start getting gyroscope updates from a CMMotionManager. I assumed this was due to a performance hit, but when I measured my callback's runtime during those gyroscope updates, it isn't any higher than in CMMotionManager-less trials with smooth playback, yet the playback is choppy.
Some visual explanation: in the graph I made, red is the time between consecutive callbacks, and green (hard to see, but there are bits of it right underneath all the red) is the runtime of the callback, which is consistently just a few milliseconds less.
Sorry if the graph is a bit messy; hopefully I'm still getting my point across.
In sum, rather than the runtime of the callback, the quality of the playback seems more tied to the "steadiness" of the frequency at which the callback is, erm, called back. What could be going on here? Could my callback runtimes just be off? That seems unlikely. I'm timing my callback via calls to clock() at the beginning and end. Is my AudioUnit setup wrong? It is admittedly a bit hacked together, and I'm not using an AUGraph or anything.
AudioUnit initialization code:
AudioComponentDescription desc;
desc.componentType = kAudioUnitType_Output;
desc.componentSubType = kAudioUnitSubType_RemoteIO; // Remote I/O is for talking with the hardware
desc.componentFlags = 0;
desc.componentFlagsMask = 0;
desc.componentManufacturer = kAudioUnitManufacturer_Apple;
AudioComponent component = AudioComponentFindNext(NULL, &desc);
AudioComponentInstanceNew(component, &myAudioUnit);
UInt32 enableIO;
AudioUnitElement inputBus = 1;
AudioUnitElement outputBus = 0;
//Disabling IO for recording
enableIO = 0;
AudioUnitSetProperty(myAudioUnit, kAudioOutputUnitProperty_EnableIO, kAudioUnitScope_Input, inputBus, &enableIO, sizeof(enableIO));
//Enabling IO for playback
enableIO = 1;
AudioUnitSetProperty(myAudioUnit, kAudioOutputUnitProperty_EnableIO, kAudioUnitScope_Output, outputBus, &enableIO, sizeof(enableIO));
UInt32 bytesPerSample = BIT_DEPTH / 8;
AudioStreamBasicDescription stereoStreamFormat = {0};
stereoStreamFormat.mBitsPerChannel = 8 * bytesPerSample;
stereoStreamFormat.mBytesPerFrame = bytesPerSample;
stereoStreamFormat.mBytesPerPacket = bytesPerSample;
stereoStreamFormat.mChannelsPerFrame = 2; // 2 indicates stereo
stereoStreamFormat.mFormatFlags = kAudioFormatFlagIsSignedInteger |
kAudioFormatFlagsNativeEndian |
kAudioFormatFlagIsPacked |
kAudioFormatFlagIsNonInterleaved;
stereoStreamFormat.mFormatID = kAudioFormatLinearPCM;
stereoStreamFormat.mFramesPerPacket = 1;
stereoStreamFormat.mReserved = 0;
stereoStreamFormat.mSampleRate = SAMPLE_RATE;
AudioUnitSetProperty(myAudioUnit, kAudioUnitProperty_StreamFormat, kAudioUnitScope_Output, inputBus, &stereoStreamFormat, sizeof(AudioStreamBasicDescription));
AudioUnitSetProperty(myAudioUnit, kAudioUnitProperty_StreamFormat, kAudioUnitScope_Input, outputBus, &stereoStreamFormat, sizeof(AudioStreamBasicDescription));
//Setting input callback
AURenderCallbackStruct callbackStruct;
callbackStruct.inputProc = &recordingCallback; //TODO: Should there be an ampersand?
callbackStruct.inputProcRefCon = myAudioUnit;
AudioUnitSetProperty(myAudioUnit, kAudioOutputUnitProperty_SetInputCallback, kAudioUnitScope_Output, inputBus, &callbackStruct, sizeof(callbackStruct)); //TODO: Not sure of scope and bus/element
//Setting output callback
callbackStruct.inputProc = &playbackCallback;
callbackStruct.inputProcRefCon = myAudioUnit;
AudioUnitSetProperty(myAudioUnit, kAudioUnitProperty_SetRenderCallback, kAudioUnitScope_Input, outputBus, &callbackStruct, sizeof(callbackStruct));
AudioUnitInitialize(myAudioUnit);
RemoteIO Playback Callback:
static OSStatus playbackCallback (void *inRefCon, AudioUnitRenderActionFlags *ioActionFlags, const AudioTimeStamp *inTimeStamp, UInt32 inBusNumber, UInt32 inNumberFrames, AudioBufferList *ioData) {
timeOfCallback = clock();
if (timeOfPrevCallback != 0) {
callbackInterval = (double)timeOfCallback - timeOfPrevCallback;
}
clock_t t1, t2;
t1 = clock();
FooBarClass::myCallbackFunction((short *)ioData->mBuffers[0].mData, (short *)ioData->mBuffers[1].mData);
t2 = clock();
cout << "Callback duration: " << ((double)(t2-t1))/CLOCKS_PER_SEC << endl;
cout << "Callback interval: " << callbackInterval/CLOCKS_PER_SEC << endl;
timeOfPrevCallback = timeOfCallback;
//prevCallbackInterval = callbackInterval;
return noErr;
}
In myCallbackFunction I'm reading from a handful of .wav files, filtering each one and mixing them together, and copying the output to the buffers passed to it. In the graph where I mention "incrementally adding computation" I'm referring to the number of input files I'm mixing together. Also, if it matters, gyroscope updates occur via an NSTimer that goes off every 1/25 of a second:
[self.getMotionManager startDeviceMotionUpdates];
gyroUpdateTimer = [NSTimer scheduledTimerWithTimeInterval:GYRO_UPDATE_INTERVAL target:self selector:@selector(doGyroUpdate) userInfo:nil repeats:YES];
...
+(void)doGyroUpdate {
double currentYaw = motionManager.deviceMotion.attitude.yaw;
// a couple more lines of not very expensive code
}
I should also be more clear about what I mean by choppiness in this sense: The audio isn't skipping, it just sounds really bad, as if an additional, crackly track was getting mixed in while the other tracks play smoothly. I'm not talking about clipping either, which it isn't because the only difference between good and bad playback is the gyroscope updates.
Thanks in advance for any tips.
----- Update 1 -----
My runtimes were a bit off because I was using clock(), which apparently doesn't work right for multithreaded applications. Apparently clock_gettime() is the proper way to measure runtimes across multiple threads, but it's not implemented on Darwin. Though it's not an ideal solution, I'm using gettimeofday() now to measure the callback runtimes and intervals. Aside from the now-steady intervals between successive callbacks (which were previously pretty erratic), things are more or less the same.
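For reference, a minimal sketch of that gettimeofday()-based timing (the helper name is mine):

#include <sys/time.h>

// Wall-clock time in seconds; safe to compare across threads, unlike clock().
static double nowInSeconds(void)
{
    struct timeval tv;
    gettimeofday(&tv, NULL);
    return (double)tv.tv_sec + (double)tv.tv_usec / 1e6;
}

// In the callback:
//   double start = nowInSeconds();
//   ... do the work ...
//   double elapsed = nowInSeconds() - start;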
----- Update 2 -----
Interestingly enough, when I start and then stop CMMotionManager updates altogether via stopDeviceMotionUpdates, the crackliness persists...
----- Update 3 -----
'Crackliness' doesn't start until the first CMMotionManager update is received, i.e. when the deviceMotion property is first checked after the NSTimer first fires. After that, the crackliness persists regardless of the update frequency, and even after updates are stopped.
You are trying to call Objective-C methods, do synchronous file reads, and/or do significant computation (your C++ function) inside a real-time audio render callback. Logging to cout from inside a real-time thread is also unlikely to work reliably. These operations can take too long to meet the latency-critical real-time requirements of Audio Unit callbacks.
Instead, for any data that does not have a tightly bounded maximum latency to generate, copy that data inside your render callback from a lock-free circular FIFO or queue, and fill that audio FIFO slightly ahead of time in another thread (perhaps one driven by an NSTimer or CADisplayLink).
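A minimal sketch of such a single-producer/single-consumer lock-free FIFO, using C11 atomics (the names, capacity, and float sample type are illustrative assumptions):

#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define FIFO_CAPACITY 16384 // frames; must be a power of two

typedef struct {
    float buffer[FIFO_CAPACITY];
    _Atomic uint32_t writeIndex; // advanced only by the producer thread
    _Atomic uint32_t readIndex;  // advanced only by the render callback
} SampleFifo;

// Producer side: called from a normal thread (NSTimer, CADisplayLink, ...).
static bool fifoWrite(SampleFifo *f, const float *src, uint32_t count)
{
    uint32_t w = atomic_load_explicit(&f->writeIndex, memory_order_relaxed);
    uint32_t r = atomic_load_explicit(&f->readIndex, memory_order_acquire);
    if (FIFO_CAPACITY - (w - r) < count) return false; // not enough space
    for (uint32_t i = 0; i < count; i++)
        f->buffer[(w + i) & (FIFO_CAPACITY - 1)] = src[i];
    atomic_store_explicit(&f->writeIndex, w + count, memory_order_release);
    return true;
}

// Consumer side: called from the render callback; never blocks.
static bool fifoRead(SampleFifo *f, float *dst, uint32_t count)
{
    uint32_t r = atomic_load_explicit(&f->readIndex, memory_order_relaxed);
    uint32_t w = atomic_load_explicit(&f->writeIndex, memory_order_acquire);
    if (w - r < count) return false; // underrun: caller should output silence
    for (uint32_t i = 0; i < count; i++)
        dst[i] = f->buffer[(r + i) & (FIFO_CAPACITY - 1)];
    atomic_store_explicit(&f->readIndex, r + count, memory_order_release);
    return true;
}

On an underrun (fifoRead returning false), memset the output buffers to zero rather than blocking or allocating inside the callback.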
I had a similar issue when activating location services. The issue was only present on slower devices like the 16 GB iPod touch and not on other hardware. I saw that your graph title says BUF_SIZE: 1024.
Is this your callback buffer size?
I fixed my problem by increasing the callback interval (buffer size).
If you can handle more latency, try increasing the callback duration using:
NSTimeInterval _preferredDuration = 2048.0 / 44100.0; // Try bigger values here
NSError* err;
[[AVAudioSession sharedInstance]setPreferredIOBufferDuration:_preferredDuration error:&err];
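Note that the preferred duration is only a hint; it is worth reading back what the system actually granted (a small sketch using the same AVAudioSession API):

// Sketch: verify what the system actually granted.
NSTimeInterval granted = [[AVAudioSession sharedInstance] IOBufferDuration];
NSLog(@"Granted IO buffer duration: %f s (~%.0f frames at 44.1 kHz)",
      granted, granted * 44100.0);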
I'm trying to do active noise cancellation on iOS using Remote I/O. I have been able to get the audio buffer in 8.24 fixed-point format and play it out through the speaker as well. Right now I'm trying to capture a sinusoidal wave through the microphone (using onlinetonegenerator.com) and invert the sign of each sample I get through the callback. Here goes my code:
static OSStatus PerformThru(
void *inRefCon,
AudioUnitRenderActionFlags *ioActionFlags,
const AudioTimeStamp *inTimeStamp,
UInt32 inBusNumber,
UInt32 inNumberFrames,
AudioBufferList *ioData) {
AppDelegate *THIS = (AppDelegate *)inRefCon;
OSStatus err = AudioUnitRender(THIS->rioUnit, ioActionFlags, inTimeStamp, 1, inNumberFrames, ioData);
for(int i = 0; i < ioData->mNumberBuffers; ++i) {
Float32 *flptr = (Float32 *)ioData->mBuffers[i].mData;
for(int j = 0; j < inNumberFrames; ++j) {
*flptr *= -1.; // inverting the buffer value
flptr++;
}
}
return err;
}
But the output tone doesn't seem to create destructive interference. When I run the app I can hear a periodic change in amplitude, but it is not canceling the input sound.
I think there might be two more factors:
Latency in generating the output stream from the microphone stream
Difference in amplitude between the original sound and the generated sound
Any idea how I can take care of these issues in an AudioUnit?
Thanks a lot!
It seems you use kAudioUnitSubType_RemoteIO.
Did you try to use kAudioUnitSubType_VoiceProcessingIO?
Maybe it will help.
This audio unit does signal processing on the incoming audio (taking out any of the audio that is played from the device at a given time from the incoming audio).
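For reference, instantiating that unit is the same boilerplate as for RemoteIO, just with a different subtype (a minimal sketch, not taken from the question's code):

// Sketch: create a voice-processing I/O unit, which applies echo
// cancellation (removing the device's own playback from the mic signal).
AudioComponentDescription desc = {0};
desc.componentType = kAudioUnitType_Output;
desc.componentSubType = kAudioUnitSubType_VoiceProcessingIO;
desc.componentManufacturer = kAudioUnitManufacturer_Apple;

AudioComponent component = AudioComponentFindNext(NULL, &desc);
AudioUnit vpioUnit = NULL;
AudioComponentInstanceNew(component, &vpioUnit);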
I set up the audio unit render callback:
AURenderCallbackStruct input;
input.inputProc = RenderAudioBuffer;
input.inputProcRefCon = self;
err = AudioUnitSetProperty(audioPlaybackUnit,
kAudioUnitProperty_SetRenderCallback,
kAudioUnitScope_Input,
0,
&input,
sizeof(input));
Here is the callback method:
OSStatus RenderAudioBuffer( void *inRefCon,
AudioUnitRenderActionFlags *ioActionFlags,
const AudioTimeStamp *inTimeStamp,
UInt32 inBusNumber,
UInt32 inNumberFrames,
AudioBufferList *ioData)
{
}
In the callback method, inNumberFrames is always 1024. How do I change it? I have more than 1024 frames to render at a given instant (64K).
You can't specify an exact buffer size in iOS, but you can request one close to a given size. The code looks something like this:
Float32 bufferSizeInSec = 0.02f;
if(AudioSessionSetProperty(kAudioSessionProperty_PreferredHardwareIOBufferDuration,
sizeof(Float32), &bufferSizeInSec) != noErr) {
return 1;
}
So basically you need to calculate the preferred buffer size in seconds (not samples, weirdly enough), and then hope that the system gives you a buffer size more to your liking.
However, you are probably looking at this problem the wrong way. AudioUnits are meant for realtime processing, so small buffer sizes are preferred. A buffer size of 64K is absurdly large, and 1024 frames is actually quite large for a modern iPhone/iPad to process comfortably. Your algorithm needs to be "block-based", meaning that you should break up the logic so that it can process 64K of samples in 64 calls, each with 1024 frames. This will lead to the most robust code.
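A sketch of what "block-based" means in practice (all names here are illustrative assumptions): keep the 64K of source samples in a buffer with a read position, and let each render callback consume only its own inNumberFrames worth.

// Sketch: block-based rendering. The large source buffer lives outside the
// callback; each render consumes only inNumberFrames of it.
typedef struct {
    float  *samples;     // e.g. 64K source samples, mono Float32
    UInt32  totalFrames;
    UInt32  readPos;
} SourceBuffer;

static OSStatus renderBlock(void *inRefCon,
                            AudioUnitRenderActionFlags *ioActionFlags,
                            const AudioTimeStamp *inTimeStamp,
                            UInt32 inBusNumber,
                            UInt32 inNumberFrames,
                            AudioBufferList *ioData)
{
    SourceBuffer *src = (SourceBuffer *)inRefCon;
    float *out = (float *)ioData->mBuffers[0].mData;
    for (UInt32 i = 0; i < inNumberFrames; i++) {
        // Emit silence once the source is exhausted.
        out[i] = (src->readPos < src->totalFrames)
                     ? src->samples[src->readPos++]
                     : 0.0f;
    }
    return noErr;
}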
I am trying to do an AudioUnitRender with an AudioUnit effect in the callback of an AudioUnit mixer, but no luck.
My mixer feeds into a RemoteIO unit and processes audio data (from a file) in its callback. That works just fine.
I then added an AudioUnit effect node (a reverb) to my graph and obtained its AudioUnit. The effect node is not connected to anything in the graph.
AudioComponentDescription auEffectDescription;
auEffectDescription.componentType = kAudioUnitType_Effect;
auEffectDescription.componentSubType = kAudioUnitSubType_Reverb2;
auEffectDescription.componentManufacturer = kAudioUnitManufacturer_Apple;
auEffectDescription.componentFlags = 0;
auEffectDescription.componentFlagsMask = 0;
I am aware of the stream format issues as described in the AudioGraph example (zerokidz.com/audiograph) and have set the StreamFormat of my effect unit as explained there.
So I know there is an issue of stream format compatibility (see "Bit-shifting audio samples from Float32 to SInt16 results in severe clipping"). My ASBDs:
[AudioUnitEffect]------------------------------
[AudioUnitEffect] Sample Rate: 44100
[AudioUnitEffect] Format ID: lpcm
[AudioUnitEffect] Format Flags: 41
[AudioUnitEffect] Bytes per Packet: 4
[AudioUnitEffect] Frames per Packet: 1
[AudioUnitEffect] Bytes per Frame: 4
[AudioUnitEffect] Channels per Frame: 2
[AudioUnitEffect] Bits per Channel: 32
[AudioUnitEffect]------------------------------
[AudioUnitMixer]------------------------------
[AudioUnitMixer] Sample Rate: 44100
[AudioUnitMixer] Format ID: lpcm
[AudioUnitMixer] Format Flags: 3116
[AudioUnitMixer] Bytes per Packet: 4
[AudioUnitMixer] Frames per Packet: 1
[AudioUnitMixer] Bytes per Frame: 4
[AudioUnitMixer] Channels per Frame: 2
[AudioUnitMixer] Bits per Channel: 32
[AudioUnitMixer]------------------------------
I have understood (correct me if I am wrong) that my effect works in Float32 while my mixer uses the AudioUnitSampleType type (SInt32, 8.24 fixed point). I have tried various ways of converting the data buffers before and after calling AudioUnitRender on the effect. I have tried to match the ASBD of my mixer to the AudioUnit effect; it works, but I get some horrible distortion (which would be normal if the effect expects Float32 values in the [-1, +1] range, I guess?).
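For what it's worth, the conversion between the two representations is just a scale by 2^24 (a sketch; rounding and clipping handling omitted):

// Sketch: converting between Float32 in [-1, 1) and the 8.24 fixed-point
// AudioUnitSampleType (SInt32). Clipping/rounding omitted for brevity.
static inline SInt32 floatTo824(Float32 f)
{
    return (SInt32)(f * (Float32)(1 << 24));
}

static inline Float32 fixed824ToFloat(SInt32 x)
{
    return (Float32)x / (Float32)(1 << 24);
}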
The Apple doc says:
Effect Units: Stream format notes
On the input scope, manage stream formats as follows:
- If the input is fed by an audio unit connection, it acquires its stream format from that connection.
- If the input is fed by a render callback function, set your complete application stream format on the bus. Use the same stream format as used for the data provided by the callback.
On the output scope, set the same full stream format that you used for the input.
Which I find is not that helpful..
Anyway, there must be a way to do an AudioUnitRender with an effect unit. Has anyone succeeded? Thanks.
André
My mixer callback :
static OSStatus mainMixerCallback(void *inRefCon, AudioUnitRenderActionFlags *ioActionFlags, const AudioTimeStamp *inTimeStamp, UInt32 inBusNumber, UInt32 inNumberFrames, AudioBufferList *ioData)
{
SoundEngine* se = (SoundEngine*)inRefCon;
// init buffers with zeros
memset((Byte *)ioData->mBuffers[0].mData, 0, ioData->mBuffers[0].mDataByteSize);
memset((Byte *)ioData->mBuffers[1].mData, 0, ioData->mBuffers[1].mDataByteSize);
// Render the Audio Track (works fine)
AudioUnitRender((AudioUnit)se.auTrack,ioActionFlags, inTimeStamp, inBusNumber, inNumberFrames, ioData);
// Render the Effect (fails when present with error -50)
AudioUnitRender((AudioUnit)se.auEffect, ioActionFlags, inTimeStamp, inBusNumber, inNumberFrames, ioData);
return noErr;
}
You probably solved your problem by now, but just for the record: each buffer needs to hold one channel, so you need to set up two AudioBuffers of one channel each instead of the single two-channel buffer you have now.
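A sketch of that allocation (the frame count is illustrative): because AudioBufferList declares only one AudioBuffer, space for the second one is allocated manually.

// Sketch: an AudioBufferList holding two non-interleaved mono buffers.
UInt32 frames = 1024;
UInt32 bytesPerBuffer = frames * sizeof(Float32);
AudioBufferList *abl = (AudioBufferList *)malloc(
    sizeof(AudioBufferList) + sizeof(AudioBuffer)); // room for two buffers
abl->mNumberBuffers = 2;
for (UInt32 ch = 0; ch < 2; ch++) {
    abl->mBuffers[ch].mNumberChannels = 1;
    abl->mBuffers[ch].mDataByteSize = bytesPerBuffer;
    abl->mBuffers[ch].mData = malloc(bytesPerBuffer);
}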
Can I get a little help with this?
In a test project, I have an AUSampler -> MixerUnit -> ioUnit chain with a render callback set up. It all works. I am using the MusicDeviceMIDIEvent method as defined in MusicDevice.h to play a MIDI noteOn & noteOff. So in the hack test code below, a noteOn occurs for .5 sec every 2 seconds.
MusicDeviceMIDIEvent (below) takes a param, inOffsetSampleFrame, for scheduling an event at a future time. What I would like to do is play a noteOn and schedule the noteOff at the same time (without the hack time check I am doing below). I just don't understand what the inOffsetSampleFrame value should be (e.g., to play a .5 or .2 second note). In other words, I don't understand the basics of audio timing...
So, if someone could walk me through the arithmetic to get proper values from the incoming AudioTimeStamp, that would be great! Also perhaps correct me/clarify any of these:
AudioTimeStamp->mSampleTime - sampleTime is the time of the current sample "slice"? Is this in milliseconds?
AudioTimeStamp->mHostTime - ? Host is the computer the app is running on, and this is the time (in milliseconds?) since the computer started? This is a HUGE number. Doesn't it roll over and then cause problems?
inNumberFrames - seems like that is 512 on iOS 5 (set through kAudioUnitProperty_MaximumFramesPerSlice). So the slice is made up of 512 frames?
I've seen lots of admonitions not to overload the render callback function - in particular to avoid Objective-C calls. I understand the reason, but how does one then message the UI or do other processing?
I guess that's it. Thanks for bearing with me!
inOffsetSampleFrame: If you are scheduling the MIDI event from the audio unit's render thread, then you can supply a sample offset that the audio unit may apply when applying that event in its next audio unit render. This allows you to schedule, to the sample, the time when a MIDI command is applied, and is particularly important when starting new notes. If you are not scheduling in the audio unit's render thread, then you should set this value to 0.
// MusicDeviceMIDIEvent function def:
extern OSStatus
MusicDeviceMIDIEvent( MusicDeviceComponent inUnit,
UInt32 inStatus,
UInt32 inData1,
UInt32 inData2,
UInt32 inOffsetSampleFrame)
//my callback
OSStatus MyCallback( void * inRefCon,
AudioUnitRenderActionFlags * ioActionFlags,
const AudioTimeStamp * inTimeStamp,
UInt32 inBusNumber,
UInt32 inNumberFrames,
AudioBufferList * ioData)
{
Float64 sampleTime = inTimeStamp->mSampleTime;
UInt64 hostTime = inTimeStamp->mHostTime;
[(__bridge Audio*)inRefCon audioEvent:sampleTime andHostTime:hostTime];
return noErr;
}
// OBJ-C method
- (void)audioEvent:(Float64) sampleTime andHostTime:(UInt64)hostTime
{
OSStatus result = noErr;
Float64 nowTime = (sampleTime/self.graphSampleRate); // sample rate: 44100.0
if (nowTime - lastTime > 2) {
UInt32 noteCommand = kMIDIMessage_NoteOn << 4 | 0;
result = MusicDeviceMIDIEvent (mySynthUnit, noteCommand, 60, 120, 0);
lastTime = sampleTime/self.graphSampleRate;
}
if (nowTime - lastTime > .5) {
UInt32 noteCommand = kMIDIMessage_NoteOff << 4 | 0;
result = MusicDeviceMIDIEvent (mySynthUnit, noteCommand, 60, 0, 0);
}
}
The answer here is that I misunderstood the purpose of inOffsetSampleFrame, despite it being aptly named. I thought I could use it to schedule a noteOff event at some arbitrary time in the future so I wouldn't have to manage noteOffs, but its scope is simply the current render slice. Oh well.
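So the noteOff has to be managed by the host, e.g. by counting down frames across render callbacks and firing the noteOff (with inOffsetSampleFrame pointing inside the current slice) when the counter elapses. A sketch under those assumptions, reusing names from the question (the frame counter is mine; real code would keep it in the inRefCon state rather than a global):

// Sketch: fire a noteOff N frames after the noteOn by counting frames
// across render callbacks. mySynthUnit and kMIDIMessage_NoteOff are from
// the question's code; framesUntilNoteOff is an assumed global for brevity.
static SInt64 framesUntilNoteOff = -1; // -1 means no noteOff pending

OSStatus MyCallback(void *inRefCon,
                    AudioUnitRenderActionFlags *ioActionFlags,
                    const AudioTimeStamp *inTimeStamp,
                    UInt32 inBusNumber,
                    UInt32 inNumberFrames,
                    AudioBufferList *ioData)
{
    if (framesUntilNoteOff >= 0) {
        if (framesUntilNoteOff < (SInt64)inNumberFrames) {
            // The noteOff lands inside this slice: pass the remaining count
            // as the sample offset so it is applied sample-accurately.
            UInt32 noteOffCommand = kMIDIMessage_NoteOff << 4 | 0;
            MusicDeviceMIDIEvent(mySynthUnit, noteOffCommand, 60, 0,
                                 (UInt32)framesUntilNoteOff);
            framesUntilNoteOff = -1;
        } else {
            framesUntilNoteOff -= inNumberFrames;
        }
    }
    return noErr;
}

// When sending the noteOn, arm the counter for a .5 second note at 44.1 kHz:
//   framesUntilNoteOff = (SInt64)(0.5 * 44100.0);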