How to do an AudioUnitRender with an Audio Unit effect? - ios

I am trying to do an AudioUnitRender with an Audio Unit effect in the callback of an Audio Unit mixer, but no luck.
My mixer goes into a RemoteIO unit and processes audio data (from a file) in its callback. That works just fine.
I then added an Audio Unit effect node (a reverb) to my graph and linked it to an AudioUnit instance. My effect node is not connected to anything else in the graph.
AudioComponentDescription auEffectDescription;
auEffectDescription.componentType = kAudioUnitType_Effect;
auEffectDescription.componentSubType = kAudioUnitSubType_Reverb2;
auEffectDescription.componentManufacturer = kAudioUnitManufacturer_Apple;
auEffectDescription.componentFlags = 0;
auEffectDescription.componentFlagsMask = 0;
I am aware of the stream format issues described in the AudioGraph example (zerokidz.com/audiograph) and have set the stream format of my effect unit as explained there.
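For reference, that setup looks roughly like this in my code (a sketch; auEffect is the effect node's AudioUnit instance):
AudioStreamBasicDescription effectASBD;
UInt32 asbdSize = sizeof(effectASBD);
// Ask the effect for the format it wants on its input...
AudioUnitGetProperty(auEffect, kAudioUnitProperty_StreamFormat,
                     kAudioUnitScope_Input, 0, &effectASBD, &asbdSize);
effectASBD.mSampleRate = 44100.0;
// ...then set the same full format on both scopes, as AudioGraph does.
AudioUnitSetProperty(auEffect, kAudioUnitProperty_StreamFormat,
                     kAudioUnitScope_Input, 0, &effectASBD, sizeof(effectASBD));
AudioUnitSetProperty(auEffect, kAudioUnitProperty_StreamFormat,
                     kAudioUnitScope_Output, 0, &effectASBD, sizeof(effectASBD));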
So I know there is an issue of stream format compatibility (see "Bit-shifting audio samples from Float32 to SInt16 results in severe clipping"). My ASBDs:
[AudioUnitEffect]------------------------------
[AudioUnitEffect] Sample Rate: 44100
[AudioUnitEffect] Format ID: lpcm
[AudioUnitEffect] Format Flags: 41
[AudioUnitEffect] Bytes per Packet: 4
[AudioUnitEffect] Frames per Packet: 1
[AudioUnitEffect] Bytes per Frame: 4
[AudioUnitEffect] Channels per Frame: 2
[AudioUnitEffect] Bits per Channel: 32
[AudioUnitEffect]------------------------------
[AudioUnitMixer]------------------------------
[AudioUnitMixer] Sample Rate: 44100
[AudioUnitMixer] Format ID: lpcm
[AudioUnitMixer] Format Flags: 3116
[AudioUnitMixer] Bytes per Packet: 4
[AudioUnitMixer] Frames per Packet: 1
[AudioUnitMixer] Bytes per Frame: 4
[AudioUnitMixer] Channels per Frame: 2
[AudioUnitMixer] Bits per Channel: 32
[AudioUnitMixer]------------------------------
I understand (correct me if I am wrong) that my effect works in Float32 while my mixer uses the AudioUnitSampleType type (SInt32, 8.24 fixed point). I have tried various ways of converting the data buffers before and after calling AudioUnitRender on the effect. I have also tried matching the ASBD of my mixer to that of the effect; the render call then succeeds, but I get some horrible distortion (which would be normal if the effect uses Float32 samples, in the [-1, +1] range I guess?).
The Apple doc says:
"Effect Units: Stream format notes
On the input scope, manage stream formats as follows:
- If the input is fed by an audio unit connection, it acquires its stream format from that connection.
- If the input is fed by a render callback function, set your complete application stream format on the bus. Use the same stream format as used for the data provided by the callback.
On the output scope, set the same full stream format that you used for the input."
Which I find is not that helpful.
Anyway, there must be a way to do an AudioUnitRender with an effect unit. Has anyone succeeded? Thanks.
André
My mixer callback:
static OSStatus mainMixerCallback(void *inRefCon, AudioUnitRenderActionFlags *ioActionFlags, const AudioTimeStamp *inTimeStamp, UInt32 inBusNumber, UInt32 inNumberFrames, AudioBufferList *ioData)
{
    SoundEngine* se = (SoundEngine*)inRefCon;
    // Init buffers with zeros
    memset((Byte *)ioData->mBuffers[0].mData, 0, ioData->mBuffers[0].mDataByteSize);
    memset((Byte *)ioData->mBuffers[1].mData, 0, ioData->mBuffers[1].mDataByteSize);
    // Render the audio track (works fine)
    AudioUnitRender((AudioUnit)se.auTrack, ioActionFlags, inTimeStamp, inBusNumber, inNumberFrames, ioData);
    // Render the effect (fails when present with error -50, kAudio_ParamError)
    AudioUnitRender((AudioUnit)se.auEffect, ioActionFlags, inTimeStamp, inBusNumber, inNumberFrames, ioData);
    return noErr;
}

You probably solved your problem by now, but just for the record: the buffers need to be one channel each, so you need to set up two AudioBuffers of one channel each instead of the two-channel one you have now.
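A rough sketch of what that could look like inside the question's mixer callback (untested; channelScratch stands for two preallocated per-channel buffers, which are my invention, and the effect is pulled on bus 0):
// Non-interleaved list: two buffers, one channel each.
struct {
    AudioBufferList list;   // provides mBuffers[0]
    AudioBuffer     second; // storage so mBuffers[1] is valid
} abl;
abl.list.mNumberBuffers = 2;
for (UInt32 i = 0; i < 2; i++) {
    abl.list.mBuffers[i].mNumberChannels = 1;
    // 4 bytes per sample for both Float32 and the 8.24 canonical type
    abl.list.mBuffers[i].mDataByteSize   = inNumberFrames * sizeof(Float32);
    abl.list.mBuffers[i].mData           = channelScratch[i];
}
OSStatus err = AudioUnitRender((AudioUnit)se.auEffect, ioActionFlags, inTimeStamp, 0, inNumberFrames, &abl.list);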

Related

AudioUnitRender error kAudioUnitErr_CannotDoInCurrentContext on iPhone 14 only

We have a communications application that has been out for over 8 years now on the iOS platform, and recently we have run into an issue on the new iPhone 14.
We are using the audio session category AVAudioSessionCategoryPlayAndRecord with AVAudioSessionModeVoiceChat. We are also using the kAudioUnitSubType_VoiceProcessingIO component.
Part of the CoreAudio setup sets the callback size as follows:
NSTimeInterval desiredBufferDuration = .02;
BOOL prefResult = [session setPreferredIOBufferDuration:desiredBufferDuration error:&nsError];
Then when asking for the buffer duration back with
NSTimeInterval actualBufferDuration = [session IOBufferDuration];
We get the expected .0213333, which is 1024 samples at 48kHz.
In the audio callback, we have ALWAYS received 1024 samples. We simply log the number of samples supplied, as follows:
static OSStatus inputAudioCallback (void *inRefCon,
                                    AudioUnitRenderActionFlags *ioActionFlags,
                                    const AudioTimeStamp *inTimeStamp,
                                    UInt32 inBusNumber,
                                    UInt32 inNumberFrames,
                                    AudioBufferList *ioData)
{
    NSLog (@"A %u", (unsigned int)inNumberFrames);
    return noErr;
}
The code is simplified, but you get the idea.
On any hardware device other than iPhone 14, we get 1024 every time from this log statement.
On the iPhone 14, we get the following sequence:
A 960 (6 times)
A 1440
A 960 (7 times)
A 1440
And it repeats. This alone for playback isn't really causing any issues, HOWEVER we also pull in the microphone during the callback. Here's the simplified code in the audio callback:
renderErr = AudioUnitRender(ioUnit, ioActionFlags, inTimeStamp, 1, inNumberFrames, myIOData);
Simple call, but quite often at the 960/1440 transition the AudioUnitRender returns kAudioUnitErr_CannotDoInCurrentContext for both the last 960 and the 1440 callback.
This results in lost microphone data, causing popping/glitching in the audio.
If we switch to using the kAudioUnitSubType_RemoteIO subtype, we reliably get 1024 samples per callback and the AudioUnitRender function works correctly every time. Problem is we don't get echo cancellation so using the device hand-held is worthless in this mode.
So the question is, has something severely changed with iPhone 14 in how AudioUnitRender behaves during the audio callback when using kAudioUnitSubType_VoiceProcessingIO? This is most definitely not an iOS 16 bug, since there are no issues on iPhone 13 or previous models that support iOS 16.
The fact that we're not getting 1024 samples at a time kind of tells us something is really wrong, but this code has worked correctly for years and is acting really funky on iPhone 14.
We were experiencing the same issue on our end. We believe it is an Apple issue since, as you said, nothing has changed in this API that we know of. It should be a valid way of interacting with the kAudioUnitSubType_VoiceProcessingIO Audio Unit.
As a workaround, we decided to try registering another callback for capture using the kAudioOutputUnitProperty_SetInputCallback property alongside the one we set with kAudioUnitProperty_SetRenderCallback (which would be handling both render and capture before).
AURenderCallbackStruct callbackStruct;
callbackStruct.inputProc = captureCallback;
callbackStruct.inputProcRefCon = this;
OSStatus result = AudioUnitSetProperty(audio_unit,
                                       kAudioOutputUnitProperty_SetInputCallback,
                                       kAudioUnitScope_Global,
                                       1, // input bus
                                       &callbackStruct,
                                       sizeof(callbackStruct));
Then the captureCallback calls AudioUnitRender and copies the data into our buffers:
OSStatus IosAudioUnit::captureCallback(AudioUnitRenderActionFlags* ioActionFlags, const AudioTimeStamp* inTimeStamp, UInt32 inBusNumber, UInt32 inNumberFrames, AudioBufferList* /*ioData*/)
{
    const int currBuffSize = inNumberFrames * sizeof(float);
    if (currBuffSize > MAX_BUF_SIZE) {
        printf("ERROR: currBuffSize %d > MAX_BUF_SIZE %d\n", currBuffSize, MAX_BUF_SIZE);
        return kAudio_ParamError;
    }
    bufferList.mBuffers[0].mDataByteSize = currBuffSize;
    AudioUnitRender(audio_unit, ioActionFlags, inTimeStamp, inBusNumber, inNumberFrames, &bufferList);
    // Copy the data in the bufferList into our buffers
    // ...
    return noErr;
}
Here's how we set up the bufferList (a member of IosAudioUnit) in our constructor:
#define MAX_BUF_SIZE 23040 // 60 ms of frames in bytes at 48000 Hz
IosAudioUnit::IosAudioUnit()
{
    bufferList.mNumberBuffers = 1;
    bufferList.mBuffers[0].mData = malloc(MAX_BUF_SIZE);
    bufferList.mBuffers[0].mDataByteSize = MAX_BUF_SIZE;
    bufferList.mBuffers[0].mNumberChannels = 2;
}
Both the render callback and the new capture callback are still called with sample sizes of 960 and 1440 on iPhone 14, but the AudioUnit seems to handle it better, and there are no more popping/glitches in audio!
Hope this helps!

Active noise cancellation in iOS

I'm trying to do active noise cancellation in iOS using Remote I/O. I have been able to get the audio buffer in the 8.24 fixed-point format and send it to the speaker as well. Right now I'm trying to capture a sinusoidal wave through the microphone (using onlinetonegenerator.com) and negate each sample in the frames I get through the callback. Here is my code:
static OSStatus PerformThru(void *inRefCon,
                            AudioUnitRenderActionFlags *ioActionFlags,
                            const AudioTimeStamp *inTimeStamp,
                            UInt32 inBusNumber,
                            UInt32 inNumberFrames,
                            AudioBufferList *ioData)
{
    AppDelegate *THIS = (AppDelegate *)inRefCon;
    OSStatus err = AudioUnitRender(THIS->rioUnit, ioActionFlags, inTimeStamp, 1, inNumberFrames, ioData);
    for (int i = 0; i < ioData->mNumberBuffers; ++i) {
        Float32 *flptr = (Float32 *)ioData->mBuffers[i].mData;
        for (int j = 0; j < inNumberFrames; ++j) {
            *flptr *= -1.; // inverting the buffer value
            flptr++;
        }
    }
    return err;
}
But the output tone doesn't seem to create destructive interference. I can hear a periodic change in amplitude while the app is running, but it is not canceling the input sound.
I think there might be two more factors:
- Latency in generating the output stream from the microphone stream
- A difference in amplitude between the original sound and the generated sound
Any idea how I can take care of these issues in AudioUnit?
Thanks a lot!
It seems you use kAudioUnitSubType_RemoteIO.
Did you try kAudioUnitSubType_VoiceProcessingIO? Maybe it will help.
This audio unit does signal processing on the incoming audio, taking out any audio that is played from the device at a given time from the incoming signal.
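For reference, switching is mostly a matter of the component description (a sketch; the rest of the Remote I/O setup can stay the same):
AudioComponentDescription desc;
desc.componentType = kAudioUnitType_Output;
desc.componentSubType = kAudioUnitSubType_VoiceProcessingIO; // instead of kAudioUnitSubType_RemoteIO
desc.componentManufacturer = kAudioUnitManufacturer_Apple;
desc.componentFlags = 0;
desc.componentFlagsMask = 0;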

Encoding PCM (CMSampleBufferRef) to AAC on iOS - How to set frequency and bitrate?

I want to encode PCM (CMSampleBufferRef(s) going live from AVCaptureAudioDataOutputSampleBufferDelegate) into AAC.
When the first CMSampleBufferRef arrives, I set up both (in/out) AudioStreamBasicDescription(s), the "out" one according to the documentation:
AudioStreamBasicDescription inAudioStreamBasicDescription = *CMAudioFormatDescriptionGetStreamBasicDescription((CMAudioFormatDescriptionRef)CMSampleBufferGetFormatDescription(sampleBuffer));
AudioStreamBasicDescription outAudioStreamBasicDescription = {0}; // Always initialize the fields of a new audio stream basic description structure to zero, as shown here: ...
outAudioStreamBasicDescription.mSampleRate = 44100; // The number of frames per second of the data in the stream, when the stream is played at normal speed. For compressed formats, this field indicates the number of frames per second of equivalent decompressed data. The mSampleRate field must be nonzero, except when this structure is used in a listing of supported formats (see “kAudioStreamAnyRate”).
outAudioStreamBasicDescription.mFormatID = kAudioFormatMPEG4AAC; // kAudioFormatMPEG4AAC_HE does not work. Can't find `AudioClassDescription`. `mFormatFlags` is set to 0.
outAudioStreamBasicDescription.mFormatFlags = kMPEG4Object_AAC_SSR; // Format-specific flags to specify details of the format. Set to 0 to indicate no format flags. See “Audio Data Format Identifiers” for the flags that apply to each format.
outAudioStreamBasicDescription.mBytesPerPacket = 0; // The number of bytes in a packet of audio data. To indicate variable packet size, set this field to 0. For a format that uses variable packet size, specify the size of each packet using an AudioStreamPacketDescription structure.
outAudioStreamBasicDescription.mFramesPerPacket = 1024; // The number of frames in a packet of audio data. For uncompressed audio, the value is 1. For variable bit-rate formats, the value is a larger fixed number, such as 1024 for AAC. For formats with a variable number of frames per packet, such as Ogg Vorbis, set this field to 0.
outAudioStreamBasicDescription.mBytesPerFrame = 0; // The number of bytes from the start of one frame to the start of the next frame in an audio buffer. Set this field to 0 for compressed formats. ...
outAudioStreamBasicDescription.mChannelsPerFrame = 1; // The number of channels in each frame of audio data. This value must be nonzero.
outAudioStreamBasicDescription.mBitsPerChannel = 0; // ... Set this field to 0 for compressed formats.
outAudioStreamBasicDescription.mReserved = 0; // Pads the structure out to force an even 8-byte alignment. Must be set to 0.
and then I create the AudioConverterRef:
AudioClassDescription audioClassDescription;
memset(&audioClassDescription, 0, sizeof(audioClassDescription));
UInt32 size;
NSAssert(AudioFormatGetPropertyInfo(kAudioFormatProperty_Encoders, sizeof(outAudioStreamBasicDescription.mFormatID), &outAudioStreamBasicDescription.mFormatID, &size) == noErr, nil);
uint32_t count = size / sizeof(AudioClassDescription);
AudioClassDescription descriptions[count];
NSAssert(AudioFormatGetProperty(kAudioFormatProperty_Encoders, sizeof(outAudioStreamBasicDescription.mFormatID), &outAudioStreamBasicDescription.mFormatID, &size, descriptions) == noErr, nil);
for (uint32_t i = 0; i < count; i++) {
    if ((outAudioStreamBasicDescription.mFormatID == descriptions[i].mSubType) && (kAppleSoftwareAudioCodecManufacturer == descriptions[i].mManufacturer)) {
        memcpy(&audioClassDescription, &descriptions[i], sizeof(audioClassDescription));
    }
}
NSAssert(audioClassDescription.mSubType == outAudioStreamBasicDescription.mFormatID && audioClassDescription.mManufacturer == kAppleSoftwareAudioCodecManufacturer, nil);
AudioConverterRef audioConverter;
memset(&audioConverter, 0, sizeof(audioConverter));
NSAssert(AudioConverterNewSpecific(&inAudioStreamBasicDescription, &outAudioStreamBasicDescription, 1, &audioClassDescription, &audioConverter) == 0, nil);
And then, I convert every CMSampleBufferRef into raw AAC data.
AudioBufferList inAaudioBufferList;
CMBlockBufferRef blockBuffer;
CMSampleBufferGetAudioBufferListWithRetainedBlockBuffer(sampleBuffer, NULL, &inAaudioBufferList, sizeof(inAaudioBufferList), NULL, NULL, 0, &blockBuffer);
NSAssert(inAaudioBufferList.mNumberBuffers == 1, nil);
uint32_t bufferSize = inAaudioBufferList.mBuffers[0].mDataByteSize;
uint8_t *buffer = (uint8_t *)malloc(bufferSize);
memset(buffer, 0, bufferSize);
AudioBufferList outAudioBufferList;
outAudioBufferList.mNumberBuffers = 1;
outAudioBufferList.mBuffers[0].mNumberChannels = inAaudioBufferList.mBuffers[0].mNumberChannels;
outAudioBufferList.mBuffers[0].mDataByteSize = bufferSize;
outAudioBufferList.mBuffers[0].mData = buffer;
UInt32 ioOutputDataPacketSize = 1;
NSAssert(AudioConverterFillComplexBuffer(audioConverter, inInputDataProc, &inAaudioBufferList, &ioOutputDataPacketSize, &outAudioBufferList, NULL) == 0, nil);
NSData *data = [NSData dataWithBytes:outAudioBufferList.mBuffers[0].mData length:outAudioBufferList.mBuffers[0].mDataByteSize];
free(buffer);
CFRelease(blockBuffer);
inInputDataProc() implementation:
OSStatus inInputDataProc(AudioConverterRef inAudioConverter, UInt32 *ioNumberDataPackets, AudioBufferList *ioData, AudioStreamPacketDescription **outDataPacketDescription, void *inUserData)
{
    AudioBufferList audioBufferList = *(AudioBufferList *)inUserData;
    ioData->mBuffers[0].mData = audioBufferList.mBuffers[0].mData;
    ioData->mBuffers[0].mDataByteSize = audioBufferList.mBuffers[0].mDataByteSize;
    return noErr;
}
Now data holds my raw AAC, which I wrap in an ADTS frame with a proper ADTS header, and the sequence of these ADTS frames is a playable AAC document.
But I don't understand this code as much as I would like to. Generally, I don't understand audio... I wrote it by following blogs, forums and docs over quite some time, and now it works, but I don't know why, or how to change some parameters. So here are my questions:
I need to use this converter while the HW encoder is occupied (by AVAssetWriter). This is why I create a SW converter via AudioConverterNewSpecific() and not AudioConverterNew(). But now setting outAudioStreamBasicDescription.mFormatID = kAudioFormatMPEG4AAC_HE; does not work: it can't find an AudioClassDescription, even if mFormatFlags is set to 0. What am I losing by using kAudioFormatMPEG4AAC (kMPEG4Object_AAC_SSR) over kAudioFormatMPEG4AAC_HE? What should I use for a live stream, kMPEG4Object_AAC_SSR or kMPEG4Object_AAC_Main?
How do I change the sample rate properly? If I set outAudioStreamBasicDescription.mSampleRate to 22050 or 8000, for example, the audio playback sounds slowed down. I set the sampling frequency index in the ADTS header to the same frequency as outAudioStreamBasicDescription.mSampleRate.
How do I change the bitrate? ffmpeg -i shows this info for the produced AAC:
Stream #0:0: Audio: aac, 44100 Hz, mono, fltp, 64 kb/s
How do I change it to 16 kbps, for example? The bitrate decreases as I decrease the frequency, but I believe this is not the only way? And playback is damaged by decreasing the frequency anyway, as I mention in 2.
How do I calculate the size of the buffer? For now I set it to uint32_t bufferSize = inAaudioBufferList.mBuffers[0].mDataByteSize; since I believe the compressed format won't be larger than the uncompressed one... But isn't that unnecessarily large?
How do I set ioOutputDataPacketSize properly? If I am reading the documentation right, I should set it as UInt32 ioOutputDataPacketSize = bufferSize / outAudioStreamBasicDescription.mBytesPerPacket; but mBytesPerPacket is 0. If I set it to 0, AudioConverterFillComplexBuffer() returns an error. If I set it to 1, it works, but I don't know why...
In inInputDataProc() there are three "out" parameters, and I set only ioData. Should I also set ioNumberDataPackets and outDataPacketDescription? Why and how?
You may need to change the sample rate of the raw audio data by using a resampling audio unit before feeding the audio to the AAC converter. Otherwise there will be a mismatch between the AAC header and the audio data.
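As for the bitrate question, one thing worth trying (a sketch, not verified against this exact setup): AudioConverter has an encode bitrate property you can set right after creating the converter, and the rates a given codec actually supports can be queried via kAudioConverterApplicableEncodeBitRates.
UInt32 outputBitRate = 16000; // target 16 kbps, in bits per second
AudioConverterSetProperty(audioConverter,
                          kAudioConverterEncodeBitRate,
                          sizeof(outputBitRate),
                          &outputBitRate);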

Specifying number of frames to process in audiounit render callback on iOS

I set up the audio unit render callback:
AURenderCallbackStruct input;
input.inputProc = RenderAudioBuffer;
input.inputProcRefCon = self;
err = AudioUnitSetProperty(audioPlaybackUnit,
                           kAudioUnitProperty_SetRenderCallback,
                           kAudioUnitScope_Input,
                           0,
                           &input,
                           sizeof(input));
Here is the callback method:
OSStatus RenderAudioBuffer(void *inRefCon,
                           AudioUnitRenderActionFlags *ioActionFlags,
                           const AudioTimeStamp *inTimeStamp,
                           UInt32 inBusNumber,
                           UInt32 inNumberFrames,
                           AudioBufferList *ioData)
{
    // Fill ioData with inNumberFrames frames here
    return noErr;
}
In the callback method, inNumberFrames is always 1024. How do I change it? I have more than 1024 frames at a time instant to render (64K).
You can't specify an exact buffer size in iOS, but you can request one similar to a given size. The code looks something like this:
Float32 bufferSizeInSec = 0.02f;
if (AudioSessionSetProperty(kAudioSessionProperty_PreferredHardwareIOBufferDuration,
                            sizeof(Float32), &bufferSizeInSec) != noErr) {
    return 1;
}
So basically you need to calculate the preferred buffer size in seconds (not samples, oddly enough), and then hope that the system gives you a buffer size more to your liking.
However, you are probably looking at this problem the wrong way. Audio Units are meant for realtime processing, so small buffer sizes are preferred. A buffer size of 64K is absurdly large, and even 1024 frames is quite a lot for a modern iPhone/iPad to process comfortably. Your algorithm needs to be "block-based", meaning that you should break up the logic so that it can process 64K of samples in 64 calls, each with 1024 frames. This will lead to the most robust code.
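A minimal sketch of that block-based approach (my names, not the asker's: SourceState is a hypothetical struct passed as inputProcRefCon, holding the mono Float32 source samples):
typedef struct {
    Float32 *samples; // the full 64K (or larger) source buffer
    UInt32   length;  // total frames available
    UInt32   cursor;  // next frame to hand out
} SourceState;

OSStatus RenderAudioBuffer(void *inRefCon,
                           AudioUnitRenderActionFlags *ioActionFlags,
                           const AudioTimeStamp *inTimeStamp,
                           UInt32 inBusNumber,
                           UInt32 inNumberFrames,
                           AudioBufferList *ioData)
{
    SourceState *src = (SourceState *)inRefCon;
    Float32 *out = (Float32 *)ioData->mBuffers[0].mData;
    // Hand out one callback's worth of frames, padding with silence at the end.
    for (UInt32 i = 0; i < inNumberFrames; i++) {
        out[i] = (src->cursor < src->length) ? src->samples[src->cursor++] : 0.0f;
    }
    return noErr;
}
Each callback then consumes exactly inNumberFrames frames, so 64K samples drain over roughly 64 callbacks of 1024 frames each.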

AudioTimeStamp format + 'MusicDeviceMIDIEvent'

Can I get a little help with this?
In a test project, I have an AUSampler -> MixerUnit -> ioUnit and have a render callback set up. It all works. I am using the MusicDeviceMIDIEvent method, as defined in MusicDevice.h, to play a MIDI noteOn & noteOff. So in the hack test code below, a noteOn occurs for .5 sec. every 2 seconds.
MusicDeviceMIDIEvent (below) takes a parameter, inOffsetSampleFrame, for scheduling an event at a future time. What I would like to be able to do is play a noteOn and schedule the noteOff at the same time (without the hack time check I am doing below). I just don't understand what the inOffsetSampleFrame value should be (e.g., to play a .5 or .2 second note). In other words, I don't understand the basics of audio timing...
So, if someone could walk me through the arithmetic to get proper values from the incoming AudioTimeStamp, that would be great! Also perhaps correct me/clarify any of these:
- AudioTimeStamp->mSampleTime - sampleTime is the time of the current sample "slice"? Is this in milliseconds?
- AudioTimeStamp->mHostTime - ? Host is the computer the app is running on, and this is the time (in milliseconds?) since the computer started? This is a HUGE number. Doesn't it roll over and then cause problems?
- inNumberFrames - seems like that is 512 on iOS 5 (set through kAudioUnitProperty_MaximumFramesPerSlice). So the sample is made up of 512 frames?
- I've seen lots of admonitions not to overload the render callback function - in particular to avoid Objective-C calls. I understand the reason, but how does one then message the UI or do other processing?
I guess that's it. Thanks for bearing with me!
inOffsetSampleFrame (from the header docs): "If you are scheduling the MIDI Event from the audio unit's render thread, then you can supply a sample offset that the audio unit may apply when applying that event in its next audio unit render. This allows you to schedule to the sample the time when a MIDI command is applied, and is particularly important when starting new notes. If you are not scheduling in the audio unit's render thread, then you should set this value to 0."
// MusicDeviceMIDIEvent function definition:
extern OSStatus
MusicDeviceMIDIEvent(MusicDeviceComponent inUnit,
                     UInt32 inStatus,
                     UInt32 inData1,
                     UInt32 inData2,
                     UInt32 inOffsetSampleFrame)
// My callback
OSStatus MyCallback(void *inRefCon,
                    AudioUnitRenderActionFlags *ioActionFlags,
                    const AudioTimeStamp *inTimeStamp,
                    UInt32 inBusNumber,
                    UInt32 inNumberFrames,
                    AudioBufferList *ioData)
{
    Float64 sampleTime = inTimeStamp->mSampleTime;
    UInt64 hostTime = inTimeStamp->mHostTime;
    [(__bridge Audio*)inRefCon audioEvent:sampleTime andHostTime:hostTime];
    return noErr; // a render callback should return noErr, not 1
}
// Obj-C method
- (void)audioEvent:(Float64)sampleTime andHostTime:(UInt64)hostTime
{
    OSStatus result = noErr;
    Float64 nowTime = (sampleTime / self.graphSampleRate); // sample rate: 44100.0
    if (nowTime - lastTime > 2) {
        UInt32 noteCommand = kMIDIMessage_NoteOn << 4 | 0;
        result = MusicDeviceMIDIEvent(mySynthUnit, noteCommand, 60, 120, 0);
        lastTime = sampleTime / self.graphSampleRate;
    }
    if (nowTime - lastTime > .5) {
        UInt32 noteCommand = kMIDIMessage_NoteOff << 4 | 0;
        result = MusicDeviceMIDIEvent(mySynthUnit, noteCommand, 60, 0, 0);
    }
}
The answer here is that I misunderstood the purpose of inOffsetSampleFrame, despite it being aptly named. I thought I could use it to schedule a noteOff event at some arbitrary time in the future so I wouldn't have to manage noteOffs, but its scope is simply the current render slice. Oh well.
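One way to work within that constraint (a sketch of my own, untested; framesUntilNoteOff is a hypothetical counter, and mySynthUnit is assumed reachable from the callback, e.g. via inRefCon): count down elapsed frames in the render callback, and once the noteOff falls inside the current slice, pass the remaining frame count as inOffsetSampleFrame so it lands on the exact sample. For a .5 sec note at 44100 Hz, the countdown starts at 0.5 * 44100 = 22050 frames.
static UInt32 framesUntilNoteOff = 0; // set to 22050 when the noteOn is sent

OSStatus MyCallback(void *inRefCon,
                    AudioUnitRenderActionFlags *ioActionFlags,
                    const AudioTimeStamp *inTimeStamp,
                    UInt32 inBusNumber,
                    UInt32 inNumberFrames,
                    AudioBufferList *ioData)
{
    if (framesUntilNoteOff > 0) {
        if (framesUntilNoteOff <= inNumberFrames) {
            // The noteOff falls inside this slice: schedule it at the exact sample.
            UInt32 noteCommand = kMIDIMessage_NoteOff << 4 | 0;
            MusicDeviceMIDIEvent(mySynthUnit, noteCommand, 60, 0, framesUntilNoteOff);
            framesUntilNoteOff = 0;
        } else {
            framesUntilNoteOff -= inNumberFrames;
        }
    }
    return noErr;
}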
