Playing percussion instruments with MIDI on iOS

I have an app that handles playing multiple MIDI instruments. Everything works great except for playing percussion instruments. I understand that in order to play percussion in General MIDI you must send the events to channel 10. I've tried a bunch of different things and I can't figure out how to get it to work. Here's an example of how I'm doing it for melodic instruments vs. percussion:
// Melodic instrument
MusicDeviceMIDIEvent(self.samplerUnit, 0x90, (UInt8)pitch, 127, 0);
// Percussion Instruments
MusicDeviceMIDIEvent(self.samplerUnit, 0x99, (UInt8)pitch, 127, 0);
The sampler unit is an AudioUnit, and the pitch is given as an int through my UI.
Thanks in advance!

Assuming you have some sort of General MIDI sound font or similar loaded, you need to set the correct status byte before sending pitch/velocity information. So in the case of a standard MIDI drum kit (channel 9, counting from zero), you'd do something like this in Swift:
var status = OSStatus(noErr)
let drumCommand = UInt32( 0xC9 | 0 )
let noteOnCommand = UInt32(0x90 | channel)
status = MusicDeviceMIDIEvent(self._samplerUnit, drumCommand, 0, 0, 0) // program change on channel 9 to select the drum kit
status = MusicDeviceMIDIEvent(self._samplerUnit, noteOnCommand, noteNum, velocity, 0) // send the note-on message
You don't need to do anything special for MIDI note-off messages.
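For completeness, a plain note-off (or a note-on with velocity 0) is all it takes; a minimal sketch in the question's Objective-C style, assuming the same pitch variable:
// Sketch: stop the drum note started above. 0x89 = note off on channel 9;
// sending 0x99 with velocity 0 works as well.
MusicDeviceMIDIEvent(self.samplerUnit, 0x89, (UInt8)pitch, 0, 0);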

OK, so I got it working. I guess the way I load the sound font makes it so the channel doesn't matter. Instead, I had to set the bankMSB property on AUSamplerBankPresetData to kAUSampler_DefaultPercussionBankMSB instead of kAUSampler_DefaultMelodicBankMSB.
I added a different font loading method specifically for percussion:
- (OSStatus)loadPercussionWithSoundFont:(NSURL *)bankURL {
    OSStatus result = noErr;

    // Fill out a bank preset data structure.
    AUSamplerBankPresetData bpdata;
    bpdata.bankURL  = (__bridge CFURLRef)bankURL;
    bpdata.bankMSB  = kAUSampler_DefaultPercussionBankMSB;
    bpdata.bankLSB  = kAUSampler_DefaultBankLSB;
    bpdata.presetID = (UInt8)32;

    // Set the kAUSamplerProperty_LoadPresetFromBank property.
    result = AudioUnitSetProperty(self.samplerUnit,
                                  kAUSamplerProperty_LoadPresetFromBank,
                                  kAudioUnitScope_Global,
                                  0,
                                  &bpdata,
                                  sizeof(bpdata));

    // Check for errors.
    NSCAssert(result == noErr,
              @"Unable to set the preset property on the Sampler. Error code:%d '%.4s'",
              (int)result,
              (const char *)&result);

    return result;
}
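For reference, a rough usage sketch; the General MIDI percussion note numbers (36 = bass drum, 38 = acoustic snare, 42 = closed hi-hat) are just examples of what gets passed in as the pitch:
// Sketch: after loading the percussion bank, the sampler itself is the drum kit,
// so the channel nibble in the status byte no longer matters.
[self loadPercussionWithSoundFont:bankURL];
MusicDeviceMIDIEvent(self.samplerUnit, 0x90, 38, 127, 0); // snare on
MusicDeviceMIDIEvent(self.samplerUnit, 0x80, 38, 0, 0);   // snare off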

Related

Inter App Audio technology: make effect node and instrument node independent

I am writing a host application that uses Core Audio's iOS 7 Inter-App Audio technology. I have managed to host instrument apps and effect apps with the help of the Inter-App Audio examples.
The issue is that the effect node is dependent upon the instrument node. I want to make the effect node and the instrument node independent.
Here is my try:
if (desc.componentType == kAudioUnitType_RemoteEffect) {
// if ([self isRemoteInstrumentConnected]) {
if (!_engineStarted) // Check if session is active
[self checkStartOrStopEngine];
if ([self isGraphStarted]) // Check if graph is running and or is created, if so, stop it
[self checkStartStopGraph];
if ([self checkGraphInitialized ]) // Check if graph has been inititialized if so, uninitialize it.
Check(AUGraphUninitialize(hostGraph));
Check (AUGraphAddNode (hostGraph, &desc, &effectNode)); // Add remote instrument
//Disconnect previous chain
// Check(AUGraphDisconnectNodeInput(hostGraph, mixerNode, remoteBus));
//Connect the effect node to the mixer on the remoteBus
Check(AUGraphConnectNodeInput (hostGraph, effectNode, 0, mixerNode, remoteBus));
//Connect the remote instrument node to the effect node on bus 0
Check(AUGraphConnectNodeInput (hostGraph, instrumentNode, 0, effectNode, 0));
//Grab audio units from the graph
Check(AUGraphNodeInfo(hostGraph, effectNode, 0, &effect));
currentUnit = &effect;
}
if (currentUnit) {
Check (AudioUnitSetProperty (*currentUnit, // Set stereo format
kAudioUnitProperty_StreamFormat,
kAudioUnitScope_Output,
playerBus,
&stereoStreamFormat,
sizeof (stereoStreamFormat)));
UInt32 maxFrames = 4096;
Check(AudioUnitSetProperty(*currentUnit,
kAudioUnitProperty_MaximumFramesPerSlice,
kAudioUnitScope_Global, playerBus,
&maxFrames,
sizeof(maxFrames)));
[self addAudioUnitPropertyListeners:*currentUnit]; // Add property listeners to audio unit
Check(AUGraphInitialize (hostGraph)); // Initialize the graph
[self checkStartStopGraph]; //Start the graph
}
[_connectedNodes addObject:rau];
But my application crashes on this line:
Check(AUGraphInitialize (hostGraph));
And the error I get is:
ConnectAudioUnit failed with error -10860
Initialize failed with error -10860
error -10860 from AUGraphInitialize (hostGraph)
Note: I have also attached a screenshot of the code portion for better understanding.
Edit 1:
- (void)createGraph {
// 1
NewAUGraph(&hostGraph);
// 2
AudioComponentDescription iOUnitDescription;
iOUnitDescription.componentType =
kAudioUnitType_Output;
iOUnitDescription.componentSubType =
kAudioUnitSubType_RemoteIO;
iOUnitDescription.componentManufacturer =
kAudioUnitManufacturer_Apple;
iOUnitDescription.componentFlags = 0;
iOUnitDescription.componentFlagsMask = 0;
AUGraphAddNode(hostGraph, &iOUnitDescription, &outNode);
// 3
AUGraphOpen(hostGraph);
// 4
Check(AUGraphNodeInfo(hostGraph, outNode, 0, &outputUnit));
// 5
AudioStreamBasicDescription format;
format.mChannelsPerFrame = 2;
format.mSampleRate =
[[AVAudioSession sharedInstance] sampleRate];
format.mFormatID = kAudioFormatLinearPCM;
format.mFormatFlags =
kAudioFormatFlagsNativeFloatPacked |
kAudioFormatFlagIsNonInterleaved;
format.mBytesPerFrame = sizeof(Float32);
format.mBytesPerPacket = sizeof(Float32);
format.mBitsPerChannel = 32;
format.mFramesPerPacket = 1;
AudioUnitSetProperty(mixerUnit,
kAudioUnitProperty_StreamFormat,
kAudioUnitScope_Output,
1,
&format,
sizeof(format));
AudioUnitSetProperty(mixerUnit,
kAudioUnitProperty_StreamFormat,
kAudioUnitScope_Input,
0,
&format,
sizeof(format));
CAShow(hostGraph);
}
So the error you're seeing is, as per the Apple docs, due to "The specified node cannot be found."
It looks like you've taken the Apple example app you linked and just deleted a bit to attempt to remove one node, but I don't believe it's that simple. The documentation for the example clearly states that the two nodes are dependent. Just changing the method that adds remotes is not going to be enough, because the host is still attempting to create both, as shown by the error you're seeing.
From this file in the example project, you are only showing the changes you made to addRemoteAU, but you need to be making changes to createGraph as well, since that is where the hostGraph is initialized with its nodes. If you initialize the graph with only one node, then in addRemoteAU you should stop seeing an error due to a node not being found, since the graph at that point won't expect two nodes (which it does now from its creation).
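To illustrate (a sketch only, reusing the question's own variable names and assuming the effect simply hangs off the mixer with nothing feeding it yet): once nothing in addRemoteAU refers to instrumentNode, AUGraphInitialize has no missing node to resolve.
// Sketch: add and wire only the effect node; never reference instrumentNode.
Check(AUGraphAddNode(hostGraph, &desc, &effectNode));          // add remote effect
Check(AUGraphConnectNodeInput(hostGraph, effectNode, 0,
                              mixerNode, remoteBus));          // effect -> mixer
// Deliberately no AUGraphConnectNodeInput(hostGraph, instrumentNode, 0, effectNode, 0);
// the effect's input can be fed later (by a render callback or another connection).
Check(AUGraphNodeInfo(hostGraph, effectNode, 0, &effect));
Check(AUGraphInitialize(hostGraph));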

Handle Varying Number of Samples in Audio Unit Rendering Cycle

This is a problem that's come up in my app after the introduction of the iPhone 6s and 6s+, and I'm almost positive that it is because the new model's built-in mic is stuck recording at 48kHz (you can read more about this here). To clarify, this was never a problem with previous phone models that I've tested. I'll walk through my Audio Engine implementation and the varying results at different points depending on the phone model further below.
So here's what's happening - when my code runs on previous devices I get a consistent number of audio samples in each CMSampleBuffer returned by the AVCaptureDevice, usually 1024 samples. The render callback for my audio unit graph provides an appropriate buffer with space for 1024 frames. Everything works great and sounds great.
Then Apple had to go make this damn iPhone 6s (just kidding, it's great, this bug is just getting to my head) and now I get some very inconsistent and confusing results. The AVCaptureDevice now varies between capturing 940 or 941 samples and the render callback now starts making a buffer with space for 940 or 941 sample frames on the first call, but then immediately starts increasing the space it reserves on subsequent calls up to 1010, 1012, or 1024 sample frames, then stays there. The space it ends up reserving varies by session. To be honest, I have no idea how this render callback is determining how many frames it prepares for the render, but I'm guessing it has to do with the sample rate of the Audio Unit that the render callback is on.
The format of the CMSampleBuffer comes in at a 44.1 kHz sample rate no matter what the device is, so I'm guessing there's some sort of implicit sample rate conversion that happens before I'm even receiving the CMSampleBuffer from the AVCaptureDevice on the 6s. The only difference is that the preferred hardware sample rate of the 6s is 48 kHz, as opposed to earlier models at 44.1 kHz.
I've read that with the 6s you do have to be ready to make space for a varying number of samples being returned, but is the kind of behavior I described above normal? If it is, how can my render cycle be tailored to handle this?
Below is the code that is processing the audio buffers if you care to look further into this:
The audio sample buffers, which are CMSampleBufferRefs, come in through the mic AVCaptureDevice and are sent to my audio processing function, which does the following to the captured CMSampleBufferRef named audioBuffer:
CMBlockBufferRef buffer = CMSampleBufferGetDataBuffer(audioBuffer);
CMItemCount numSamplesInBuffer = CMSampleBufferGetNumSamples(audioBuffer);
AudioBufferList audioBufferList;
CMSampleBufferGetAudioBufferListWithRetainedBlockBuffer(audioBuffer,
NULL,
&audioBufferList,
sizeof(audioBufferList),
NULL,
NULL,
kCMSampleBufferFlag_AudioBufferList_Assure16ByteAlignment,
&buffer
);
self.audioProcessingCallback(&audioBufferList, numSamplesInBuffer, audioBuffer);
CFRelease(buffer);
This puts the audio samples into an AudioBufferList and sends it, along with the number of samples and the retained CMSampleBuffer, to the function below that I use for audio processing. TL;DR: the following code sets up some audio units that are in an audio graph, using the CMSampleBuffer's format to set the ASBD for input, runs the audio samples through a converter unit, a NewTimePitch unit, and then another converter unit. I then start a render call on the output converter unit with the number of samples that I received from the CMSampleBufferRef and put the rendered samples back into the AudioBufferList to subsequently be written out to the movie file; more on the audio unit render callback below.
movieWriter.audioProcessingCallback = {(audioBufferList, numSamplesInBuffer, CMSampleBuffer) -> () in
var ASBDSize = UInt32(sizeof(AudioStreamBasicDescription))
self.currentInputAudioBufferList = audioBufferList.memory
let formatDescription = CMSampleBufferGetFormatDescription(CMSampleBuffer)
let sampleBufferASBD = CMAudioFormatDescriptionGetStreamBasicDescription(formatDescription!)
if (sampleBufferASBD.memory.mFormatID != kAudioFormatLinearPCM) {
print("Bad ASBD")
}
if(sampleBufferASBD.memory.mChannelsPerFrame != self.currentInputASBD.mChannelsPerFrame || sampleBufferASBD.memory.mSampleRate != self.currentInputASBD.mSampleRate){
// Set currentInputASBD to format of data coming IN from camera
self.currentInputASBD = sampleBufferASBD.memory
print("New IN ASBD: \(self.currentInputASBD)")
// set the ASBD for converter in's input to currentInputASBD
var err = AudioUnitSetProperty(self.converterInAudioUnit,
kAudioUnitProperty_StreamFormat,
kAudioUnitScope_Input,
0,
&self.currentInputASBD,
UInt32(sizeof(AudioStreamBasicDescription)))
self.checkErr(err, "Set converter in's input stream format")
// Set currentOutputASBD to the in/out format for newTimePitch unit
err = AudioUnitGetProperty(self.newTimePitchAudioUnit,
kAudioUnitProperty_StreamFormat,
kAudioUnitScope_Input,
0,
&self.currentOutputASBD,
&ASBDSize)
self.checkErr(err, "Get NewTimePitch ASBD stream format")
print("New OUT ASBD: \(self.currentOutputASBD)")
//Set the ASBD for the convert out's input to currentOutputASBD
err = AudioUnitSetProperty(self.converterOutAudioUnit,
kAudioUnitProperty_StreamFormat,
kAudioUnitScope_Input,
0,
&self.currentOutputASBD,
ASBDSize)
self.checkErr(err, "Set converter out's input stream format")
//Set the ASBD for the converter out's output to currentInputASBD
err = AudioUnitSetProperty(self.converterOutAudioUnit,
kAudioUnitProperty_StreamFormat,
kAudioUnitScope_Output,
0,
&self.currentInputASBD,
ASBDSize)
self.checkErr(err, "Set converter out's output stream format")
//Initialize the graph
err = AUGraphInitialize(self.auGraph)
self.checkErr(err, "Initialize audio graph")
self.checkAllASBD()
}
self.currentSampleTime += Double(numSamplesInBuffer)
var timeStamp = AudioTimeStamp()
memset(&timeStamp, 0, sizeof(AudioTimeStamp))
timeStamp.mSampleTime = self.currentSampleTime
timeStamp.mFlags = AudioTimeStampFlags.SampleTimeValid
var flags = AudioUnitRenderActionFlags(rawValue: 0)
err = AudioUnitRender(self.converterOutAudioUnit,
&flags,
&timeStamp,
0,
UInt32(numSamplesInBuffer),
audioBufferList)
self.checkErr(err, "Render Call on converterOutAU")
}
The audio unit render callback that is called once the AudioUnitRender call reaches the input converter unit is below:
func pushCurrentInputBufferIntoAudioUnit(inRefCon : UnsafeMutablePointer<Void>, ioActionFlags : UnsafeMutablePointer<AudioUnitRenderActionFlags>, inTimeStamp : UnsafePointer<AudioTimeStamp>, inBusNumber : UInt32, inNumberFrames : UInt32, ioData : UnsafeMutablePointer<AudioBufferList>) -> OSStatus {
let bufferRef = UnsafeMutablePointer<AudioBufferList>(inRefCon)
ioData.memory = bufferRef.memory
print(inNumberFrames);
return noErr
}
Blah, this is a huge brain dump but I really appreciate ANY help. Please let me know if there's any additional information you need.
Generally, you handle slight variations in buffer size (but a constant sample rate in and out) by putting the incoming samples into a lock-free circular FIFO, and not removing any blocks of samples from that FIFO until you have a full-size block plus potentially some safety padding to cover future size jitter.
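A minimal sketch of that idea in plain C (single-producer/single-consumer, mono float samples; all names here are made up, and a real version would also deal with the multi-channel AudioBufferList and overflow):
#include <stdint.h>
#include <stdatomic.h>

// Sketch: lock-free single-producer/single-consumer circular FIFO.
// The capture side writes whatever count it gets (940, 941, 1024, ...);
// the render side pulls a block only when enough samples are queued.
#define FIFO_CAPACITY 16384u   // power of two, well above any block size

typedef struct {
    float            samples[FIFO_CAPACITY];
    _Atomic uint32_t head;     // advanced by the producer only
    _Atomic uint32_t tail;     // advanced by the consumer only
} SampleFIFO;

static uint32_t FIFOCount(SampleFIFO *f) {
    return atomic_load(&f->head) - atomic_load(&f->tail);
}

// Called from the capture callback.
static void FIFOWrite(SampleFIFO *f, const float *src, uint32_t n) {
    uint32_t head = atomic_load(&f->head);
    for (uint32_t i = 0; i < n; i++)
        f->samples[(head + i) & (FIFO_CAPACITY - 1)] = src[i];
    atomic_store(&f->head, head + n);
}

// Called from the render callback. Returns 1 and fills dst with exactly
// `block` samples once block + padding samples are available, else 0.
static int FIFOReadBlock(SampleFIFO *f, float *dst, uint32_t block, uint32_t padding) {
    if (FIFOCount(f) < block + padding) return 0;   // not enough yet; output silence
    uint32_t tail = atomic_load(&f->tail);
    for (uint32_t i = 0; i < block; i++)
        dst[i] = f->samples[(tail + i) & (FIFO_CAPACITY - 1)];
    atomic_store(&f->tail, tail + block);
    return 1;
}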
The variation in size probably has to do with the sample rate converter ratio not being a simple multiple, the resampling filter(s) needed, and any buffering needed for the resampling process.
1024 * (44100/48000) = 940.8
So that rate conversion might explain the jitter between 940 and 941 samples. If the hardware is always shipping out blocks of 1024 samples at a fixed rate of 48 kHz, and you need that block resampled to 44100 for your callback ASAP, there's a fraction of a converted sample that eventually needs to be output on only some output callbacks.

Why is my multi-channel mixer no longer playing in iOS 8?

I've written some code to play multi-instrument general MIDI files on iOS. It works fine in iOS 7, but stopped working on iOS 8.
I've stripped it down to its essence here. Instead of creating 16 channels for my multi-channel mixer, I just create one sampler node, and map all the tracks to that channel. It still exhibits the same problem as the multi-sampler version. None of the Audio Toolbox calls return an error code (they all return 0) in iOS 7 or iOS 8. The sequence plays through the speakers in iOS 7, on both the simulator and on iPhone/iPad devices. Run the exact same code on the iOS 8 simulator, or an iPhone/iPad device, and no sound is produced.
If you comment out the call to [self initGraphFromMIDISequence], it plays on iOS 8 with the default sine-wave sound.
@implementation MyMusicPlayer {
MusicPlayer _musicPlayer;
MusicSequence _musicSequence;
AUGraph _processingGraph;
}
- (void)playMidi:(NSURL*)midiFileURL {
NewMusicSequence(&_musicSequence);
MusicSequenceFileLoad(_musicSequence, CFBridgingRetain(midiFileURL), 0, 0);
NewMusicPlayer(&_musicPlayer);
MusicPlayerSetSequence(_musicPlayer, _musicSequence);
[self initGraphFromMIDISequence];
MusicPlayerPreroll(_musicPlayer);
MusicPlayerStart(_musicPlayer);
}
// Sets up an AUGraph with one channel whose instrument is loaded from a sound bank.
// Maps all the tracks of the MIDI sequence onto that channel. Basically this is a
// way to replace the default sine-wave sound with another (single) instrument.
- (void)initGraphFromMIDISequence {
NewAUGraph(&_processingGraph);
// Add one sampler unit to the graph.
AUNode samplerNode;
AudioComponentDescription cd = {};
cd.componentManufacturer = kAudioUnitManufacturer_Apple;
cd.componentType = kAudioUnitType_MusicDevice;
cd.componentSubType = kAudioUnitSubType_Sampler;
AUGraphAddNode(_processingGraph, &cd, &samplerNode);
// Add a Mixer unit node to the graph
cd.componentType = kAudioUnitType_Mixer;
cd.componentSubType = kAudioUnitSubType_MultiChannelMixer;
AUNode mixerNode;
AUGraphAddNode(_processingGraph, &cd, &mixerNode);
// Add the Output unit node to the graph
cd.componentType = kAudioUnitType_Output;
cd.componentSubType = kAudioUnitSubType_RemoteIO; // Output to speakers.
AUNode ioNode;
AUGraphAddNode(_processingGraph, &cd, &ioNode);
AUGraphOpen(_processingGraph);
// Obtain the mixer unit instance from its corresponding node, and set the bus count to 1.
AudioUnit mixerUnit;
AUGraphNodeInfo(_processingGraph, mixerNode, NULL, &mixerUnit);
UInt32 const numChannels = 1;
AudioUnitSetProperty(mixerUnit,
kAudioUnitProperty_ElementCount,
kAudioUnitScope_Input,
0,
&numChannels,
sizeof(numChannels));
// Connect the sampler node's output 0 to mixer node output 0.
AUGraphConnectNodeInput(_processingGraph, samplerNode, 0, mixerNode, 0);
// Connect the mixer unit to the output unit.
AUGraphConnectNodeInput(_processingGraph, mixerNode, 0, ioNode, 0);
// Obtain reference to the audio unit from its node.
AudioUnit samplerUnit;
AUGraphNodeInfo(_processingGraph, samplerNode, 0, &samplerUnit);
MusicSequenceSetAUGraph(_musicSequence, _processingGraph);
// Set the destination for each track to our single sampler node.
UInt32 trackCount;
MusicSequenceGetTrackCount(_musicSequence, &trackCount);
MusicTrack track;
for (int i = 0; i < trackCount; i++) {
MusicSequenceGetIndTrack(_musicSequence, i, &track);
MusicTrackSetDestNode(track, samplerNode);
}
// You can use either a DLS or an SF2 file bundled with your app; both work in iOS 7.
//NSString *soundBankPath = [[NSBundle mainBundle] pathForResource:@"GeneralUserv1.44" ofType:@"sf2"];
NSString *soundBankPath = [[NSBundle mainBundle] pathForResource:@"gs_instruments" ofType:@"dls"];
NSURL *bankURL = [NSURL fileURLWithPath:soundBankPath];
AUSamplerBankPresetData bpdata;
bpdata.bankURL = (__bridge CFURLRef) bankURL;
bpdata.bankMSB = kAUSampler_DefaultMelodicBankMSB;
bpdata.bankLSB = kAUSampler_DefaultBankLSB;
bpdata.presetID = 0;
UInt8 instrumentNumber = 46; // pick any GM instrument 0-127
bpdata.presetID = instrumentNumber;
AudioUnitSetProperty(samplerUnit,
kAUSamplerProperty_LoadPresetFromBank,
kAudioUnitScope_Global,
0,
&bpdata,
sizeof(bpdata));
}
I have some code, not included here, which polls to see if the sequence is still playing, by calling MusicPlayerGetTime on the MusicPlayer instance. In iOS 7, the result of that call each time is the number of seconds that have elapsed since it started playing. In iOS 8, the call always returns 0, which presumably means the MusicPlayer does not start playing the sequence on the call to MusicPlayerStart.
The code above is highly order-dependent -- you have to make certain calls before others; e.g., opening the graph before calling getInfo on a node, and not loading instruments until you've assigned the tracks to channels. I've followed all the advice in other StackOverflow threads, and have verified that getting the order correct makes error codes disappear.
Any iOS MIDI experts know what might have changed between iOS 7 and iOS 8 to make this code stop working?
In iOS 8, Apple introduced a slick Objective-C abstraction of the Core Audio API: AVAudioEngine.
You should probably check it out. https://developer.apple.com/videos/wwdc/2014/#502
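For what it's worth, the sampler-through-mixer-to-output chain above shrinks to a few lines with it; a sketch in Objective-C (assuming the same bundled gs_instruments.dls, error handling omitted):
#import <AVFoundation/AVFoundation.h>
#import <AudioToolbox/AudioToolbox.h>   // for the kAUSampler_* bank constants

// Sketch: AVAudioEngine equivalent of the sampler -> mixer -> RemoteIO graph.
AVAudioEngine *engine = [[AVAudioEngine alloc] init];
AVAudioUnitSampler *sampler = [[AVAudioUnitSampler alloc] init];
[engine attachNode:sampler];
[engine connect:sampler to:engine.mainMixerNode format:nil];

NSError *error = nil;
NSURL *bankURL = [[NSBundle mainBundle] URLForResource:@"gs_instruments" withExtension:@"dls"];
[sampler loadSoundBankInstrumentAtURL:bankURL
                              program:46   // same GM program as in the question
                              bankMSB:kAUSampler_DefaultMelodicBankMSB
                              bankLSB:kAUSampler_DefaultBankLSB
                                error:&error];
[engine startAndReturnError:&error];

// Note events can then be sent directly:
[sampler startNote:60 withVelocity:127 onChannel:0];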

iOS binaural audio unit

I'm new to Audio Units.
I'm confused about generating a binaural tone filter. I created two sounds, one left-only and one right-only, and added a kAudioUnitSubType_LowPassFilter filter to each sound. When playing, I'm using a UISlider to change kLowPassParam_CutoffFrequency for each player. Here is the code:
Float32 value = slider.value; //only 160-190 hz
AEAudioUnitFilter *toneLeft = [self.sound objectForKey:@"binaural_left"];
AEAudioUnitFilter *toneRight = [self.sound objectForKey:@"binaural_right"];
if(toneLeft && toneRight){
Float32 leftFreq = value - self.rangeSlider.value; // i have two slider, for frequency and range
Float32 rightFreq = value + self.rangeSlider.value;
AudioUnitSetParameter(toneLeft.audioUnit,
kLowPassParam_CutoffFrequency,
kAudioUnitScope_Global,
0,
leftFreq,
0);
AudioUnitSetParameter(toneRight.audioUnit,
kLowPassParam_CutoffFrequency,
kAudioUnitScope_Global,
0,
rightFreq,
0);
}
But when the sound played, I didn't hear a binaural effect, only the frequency changes.
I got the idea from: Idea
I'm using the framework from theamazingaudioengine.com.
Thanks for your help.

Encoding PCM (CMSampleBufferRef) to AAC on iOS - How to set frequency and bitrate?

I want to encode PCM (CMSampleBufferRef(s) going live from AVCaptureAudioDataOutputSampleBufferDelegate) into AAC.
When the first CMSampleBufferRef arrives, I set both (in/out) AudioStreamBasicDescription(s), the "out" one according to the documentation:
AudioStreamBasicDescription inAudioStreamBasicDescription = *CMAudioFormatDescriptionGetStreamBasicDescription((CMAudioFormatDescriptionRef)CMSampleBufferGetFormatDescription(sampleBuffer));
AudioStreamBasicDescription outAudioStreamBasicDescription = {0}; // Always initialize the fields of a new audio stream basic description structure to zero, as shown here: ...
outAudioStreamBasicDescription.mSampleRate = 44100; // The number of frames per second of the data in the stream, when the stream is played at normal speed. For compressed formats, this field indicates the number of frames per second of equivalent decompressed data. The mSampleRate field must be nonzero, except when this structure is used in a listing of supported formats (see “kAudioStreamAnyRate”).
outAudioStreamBasicDescription.mFormatID = kAudioFormatMPEG4AAC; // kAudioFormatMPEG4AAC_HE does not work. Can't find `AudioClassDescription`. `mFormatFlags` is set to 0.
outAudioStreamBasicDescription.mFormatFlags = kMPEG4Object_AAC_SSR; // Format-specific flags to specify details of the format. Set to 0 to indicate no format flags. See “Audio Data Format Identifiers” for the flags that apply to each format.
outAudioStreamBasicDescription.mBytesPerPacket = 0; // The number of bytes in a packet of audio data. To indicate variable packet size, set this field to 0. For a format that uses variable packet size, specify the size of each packet using an AudioStreamPacketDescription structure.
outAudioStreamBasicDescription.mFramesPerPacket = 1024; // The number of frames in a packet of audio data. For uncompressed audio, the value is 1. For variable bit-rate formats, the value is a larger fixed number, such as 1024 for AAC. For formats with a variable number of frames per packet, such as Ogg Vorbis, set this field to 0.
outAudioStreamBasicDescription.mBytesPerFrame = 0; // The number of bytes from the start of one frame to the start of the next frame in an audio buffer. Set this field to 0 for compressed formats. ...
outAudioStreamBasicDescription.mChannelsPerFrame = 1; // The number of channels in each frame of audio data. This value must be nonzero.
outAudioStreamBasicDescription.mBitsPerChannel = 0; // ... Set this field to 0 for compressed formats.
outAudioStreamBasicDescription.mReserved = 0; // Pads the structure out to force an even 8-byte alignment. Must be set to 0.
and then the AudioConverterRef:
AudioClassDescription audioClassDescription;
memset(&audioClassDescription, 0, sizeof(audioClassDescription));
UInt32 size;
NSAssert(AudioFormatGetPropertyInfo(kAudioFormatProperty_Encoders, sizeof(outAudioStreamBasicDescription.mFormatID), &outAudioStreamBasicDescription.mFormatID, &size) == noErr, nil);
uint32_t count = size / sizeof(AudioClassDescription);
AudioClassDescription descriptions[count];
NSAssert(AudioFormatGetProperty(kAudioFormatProperty_Encoders, sizeof(outAudioStreamBasicDescription.mFormatID), &outAudioStreamBasicDescription.mFormatID, &size, descriptions) == noErr, nil);
for (uint32_t i = 0; i < count; i++) {
if ((outAudioStreamBasicDescription.mFormatID == descriptions[i].mSubType) && (kAppleSoftwareAudioCodecManufacturer == descriptions[i].mManufacturer)) {
memcpy(&audioClassDescription, &descriptions[i], sizeof(audioClassDescription));
}
}
NSAssert(audioClassDescription.mSubType == outAudioStreamBasicDescription.mFormatID && audioClassDescription.mManufacturer == kAppleSoftwareAudioCodecManufacturer, nil);
AudioConverterRef audioConverter;
memset(&audioConverter, 0, sizeof(audioConverter));
NSAssert(AudioConverterNewSpecific(&inAudioStreamBasicDescription, &outAudioStreamBasicDescription, 1, &audioClassDescription, &audioConverter) == 0, nil);
And then, I convert every CMSampleBufferRef into raw AAC data.
AudioBufferList inAaudioBufferList;
CMBlockBufferRef blockBuffer;
CMSampleBufferGetAudioBufferListWithRetainedBlockBuffer(sampleBuffer, NULL, &inAaudioBufferList, sizeof(inAaudioBufferList), NULL, NULL, 0, &blockBuffer);
NSAssert(inAaudioBufferList.mNumberBuffers == 1, nil);
uint32_t bufferSize = inAaudioBufferList.mBuffers[0].mDataByteSize;
uint8_t *buffer = (uint8_t *)malloc(bufferSize);
memset(buffer, 0, bufferSize);
AudioBufferList outAudioBufferList;
outAudioBufferList.mNumberBuffers = 1;
outAudioBufferList.mBuffers[0].mNumberChannels = inAaudioBufferList.mBuffers[0].mNumberChannels;
outAudioBufferList.mBuffers[0].mDataByteSize = bufferSize;
outAudioBufferList.mBuffers[0].mData = buffer;
UInt32 ioOutputDataPacketSize = 1;
NSAssert(AudioConverterFillComplexBuffer(audioConverter, inInputDataProc, &inAaudioBufferList, &ioOutputDataPacketSize, &outAudioBufferList, NULL) == 0, nil);
NSData *data = [NSData dataWithBytes:outAudioBufferList.mBuffers[0].mData length:outAudioBufferList.mBuffers[0].mDataByteSize];
free(buffer);
CFRelease(blockBuffer);
inInputDataProc() implementation:
OSStatus inInputDataProc(AudioConverterRef inAudioConverter, UInt32 *ioNumberDataPackets, AudioBufferList *ioData, AudioStreamPacketDescription **outDataPacketDescription, void *inUserData)
{
AudioBufferList audioBufferList = *(AudioBufferList *)inUserData;
ioData->mBuffers[0].mData = audioBufferList.mBuffers[0].mData;
ioData->mBuffers[0].mDataByteSize = audioBufferList.mBuffers[0].mDataByteSize;
return noErr;
}
Now data holds my raw AAC, which I wrap into an ADTS frame with a proper ADTS header; a sequence of these ADTS frames is a playable AAC document.
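For reference, a common layout for that 7-byte ADTS header; a sketch assuming AAC-LC, 44.1 kHz, mono, and no CRC (the frequency index and channel configuration have to match the output ASBD):
#include <stdint.h>
#include <stddef.h>

// Sketch: 7-byte ADTS header for one raw AAC frame (no CRC).
// Assumes AAC-LC (object type 2), 44.1 kHz (frequency index 4), mono (channel config 1).
static void MakeADTSHeader(uint8_t header[7], size_t aacPayloadLength)
{
    const int profile = 2;                            // AAC-LC
    const int freqIdx = 4;                            // 44100 Hz
    const int chanCfg = 1;                            // mono
    const size_t frameLength = aacPayloadLength + 7;  // payload plus this header

    header[0] = 0xFF;                                                   // syncword
    header[1] = 0xF1;                                                   // syncword / MPEG-4 / no CRC
    header[2] = ((profile - 1) << 6) | (freqIdx << 2) | (chanCfg >> 2);
    header[3] = ((chanCfg & 3) << 6) | (uint8_t)(frameLength >> 11);
    header[4] = (uint8_t)((frameLength & 0x7FF) >> 3);
    header[5] = (uint8_t)(((frameLength & 0x7) << 5) | 0x1F);           // buffer fullness (high bits)
    header[6] = 0xFC;                                                   // buffer fullness (low bits), 1 frame
}
Each output packet then becomes header + payload in the ADTS stream.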
But I don't understand this code as much as I want to. Generally, I don't understand the audio side... I've just written it somehow by following blogs, forums, and docs over quite a lot of time, and now it works, but I don't know why, or how to change some parameters. So here are my questions:
1. I need to use this converter while the HW encoder is occupied (by AVAssetWriter). This is why I make a SW converter via AudioConverterNewSpecific() and not AudioConverterNew(). But now setting outAudioStreamBasicDescription.mFormatID = kAudioFormatMPEG4AAC_HE; does not work: no AudioClassDescription can be found, even if mFormatFlags is set to 0. What am I losing by using kAudioFormatMPEG4AAC (kMPEG4Object_AAC_SSR) over kAudioFormatMPEG4AAC_HE? What should I use for a live stream, kMPEG4Object_AAC_SSR or kMPEG4Object_AAC_Main?
2. How do I change the sample rate properly? If I set outAudioStreamBasicDescription.mSampleRate to 22050 or 8000, for example, the audio playback sounds slowed down. I set the sampling frequency index in the ADTS header to the same frequency as outAudioStreamBasicDescription.mSampleRate.
3. How do I change the bitrate? ffmpeg -i shows this info for the produced AAC:
Stream #0:0: Audio: aac, 44100 Hz, mono, fltp, 64 kb/s
How do I change it to 16 kbps, for example? The bitrate decreases as I decrease the frequency, but I believe this is not the only way? And playback is damaged by decreasing the frequency anyway, as I mention in 2.
4. How do I calculate the size of the buffer? For now I set it with uint32_t bufferSize = inAaudioBufferList.mBuffers[0].mDataByteSize;, as I believe the compressed data won't be larger than the uncompressed data... but isn't that unnecessarily large?
5. How do I set ioOutputDataPacketSize properly? If I am reading the documentation right, I should set it as UInt32 ioOutputDataPacketSize = bufferSize / outAudioStreamBasicDescription.mBytesPerPacket;, but mBytesPerPacket is 0. If I set it to 0, AudioConverterFillComplexBuffer() returns an error. If I set it to 1, it works, but I don't know why...
6. In inInputDataProc() there are three "out" parameters. I only set ioData. Should I also set ioNumberDataPackets and outDataPacketDescription? Why, and how?
You may need to change the sample rate of the raw audio data by using a resampling audio unit before feeding the audio to the AAC converter. Otherwise there will be a mismatch between the AAC header and the audio data.
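For the resampling step, instead of an extra audio unit you can also use a second AudioConverter to go PCM-to-PCM before the AAC conversion; a rough sketch (the variable names and the 22050 target are placeholders, not from the question):
// Sketch: resample LPCM 44.1 kHz -> LPCM 22.05 kHz so the rate fed to the AAC
// encoder matches what the ADTS header advertises.
AudioStreamBasicDescription resampledASBD = inAudioStreamBasicDescription;
resampledASBD.mSampleRate = 22050;   // must equal outAudioStreamBasicDescription.mSampleRate

AudioConverterRef resampler = NULL;
OSStatus status = AudioConverterNew(&inAudioStreamBasicDescription,
                                    &resampledASBD,
                                    &resampler);
NSAssert(status == noErr, nil);

// Drive it with AudioConverterFillComplexBuffer using the same callback pattern
// as the AAC converter above, then hand the resampled buffers to audioConverter.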
