AVAudioSession issue when using SFSpeechRecognizer after AVSpeechUtterance - iOS

I am trying to use SFSpeechRecognizer for speech-to-text after speaking a welcome message to the user via AVSpeechUtterance. But randomly, the speech recognition does not start (after the welcome message is spoken) and it throws the error message below.
[avas] ERROR: AVAudioSession.mm:1049: -[AVAudioSession setActive:withOptions:error:]: Deactivating an audio session that has running I/O. All I/O should be stopped or paused prior to deactivating the audio session.
It works a few times; I am not clear on why it does not work consistently.
I tried the solutions mentioned in other SO posts, which suggest checking whether any other audio players are running. I added that check in the speech-to-text part of the code. It returns false (i.e. no other audio player is running), but the speech-to-text still does not start listening for the user's speech. Can you please guide me on what is going wrong?
I am testing on an iPhone 6 running iOS 10.3.
Below are the code snippets used:
TextToSpeech:
- (void) speak:(NSString *) textToSpeak {
    [[AVAudioSession sharedInstance] setActive:NO withOptions:0 error:nil];
    [[AVAudioSession sharedInstance] setCategory:AVAudioSessionCategoryPlayback
                                     withOptions:AVAudioSessionCategoryOptionDuckOthers error:nil];

    [synthesizer stopSpeakingAtBoundary:AVSpeechBoundaryImmediate];

    AVSpeechUtterance *utterance = [[AVSpeechUtterance alloc] initWithString:textToSpeak];
    utterance.voice = [AVSpeechSynthesisVoice voiceWithLanguage:locale];
    utterance.rate = (AVSpeechUtteranceMinimumSpeechRate * 1.5 + AVSpeechUtteranceDefaultSpeechRate) / 2.5 * rate * rate;
    utterance.pitchMultiplier = 1.2;

    [synthesizer speakUtterance:utterance];
}

- (void)speechSynthesizer:(AVSpeechSynthesizer *)synthesizer didFinishSpeechUtterance:(AVSpeechUtterance *)utterance {
    // Return success message back to caller
    [[AVAudioSession sharedInstance] setActive:NO withOptions:0 error:nil];
    [[AVAudioSession sharedInstance] setCategory:AVAudioSessionCategoryAmbient
                                     withOptions:0 error:nil];
    [[AVAudioSession sharedInstance] setActive:YES withOptions:0 error:nil];
}
Speech To Text:
- (void) recordUserSpeech:(NSString *) lang {
    NSLocale *locale = [[NSLocale alloc] initWithLocaleIdentifier:lang];
    self.sfSpeechRecognizer = [[SFSpeechRecognizer alloc] initWithLocale:locale];
    [self.sfSpeechRecognizer setDelegate:self];
    NSLog(@"Step1: ");

    // Cancel the previous task if it's running.
    if (self.recognitionTask) {
        NSLog(@"Step2: ");
        [self.recognitionTask cancel];
        self.recognitionTask = nil;
    }
    NSLog(@"Step3: ");

    [self initAudioSession];

    self.recognitionRequest = [[SFSpeechAudioBufferRecognitionRequest alloc] init];
    NSLog(@"Step4: ");

    if (!self.audioEngine.inputNode) {
        NSLog(@"Audio engine has no input node");
    }
    if (!self.recognitionRequest) {
        NSLog(@"Unable to create a SFSpeechAudioBufferRecognitionRequest object");
    }

    self.recognitionTask = [self.sfSpeechRecognizer recognitionTaskWithRequest:self.recognitionRequest resultHandler:^(SFSpeechRecognitionResult *result, NSError *error) {
        BOOL isFinal = NO;

        if (error) {
            [self stopAndRelease];
            NSLog(@"In recognitionTaskWithRequest.. Error code ::: %ld, %@", (long)error.code, error.description);
            [self sendErrorWithMessage:error.localizedFailureReason andCode:error.code];
        }
        if (result) {
            [self sendResults:result.bestTranscription.formattedString];
            isFinal = result.isFinal;
        }
        if (isFinal) {
            NSLog(@"result.isFinal: ");
            [self stopAndRelease];
            // Return control to caller
        }
    }];
    NSLog(@"Step5: ");

    AVAudioFormat *recordingFormat = [self.audioEngine.inputNode outputFormatForBus:0];
    [self.audioEngine.inputNode installTapOnBus:0 bufferSize:1024 format:recordingFormat block:^(AVAudioPCMBuffer * _Nonnull buffer, AVAudioTime * _Nonnull when) {
        //NSLog(@"Installing Audio engine: ");
        [self.recognitionRequest appendAudioPCMBuffer:buffer];
    }];
    NSLog(@"Step6: ");

    [self.audioEngine prepare];
    NSLog(@"Step7: ");

    NSError *err;
    [self.audioEngine startAndReturnError:&err];
}

- (void) initAudioSession
{
    AVAudioSession *audioSession = [AVAudioSession sharedInstance];
    [audioSession setCategory:AVAudioSessionCategoryRecord error:nil];
    [audioSession setMode:AVAudioSessionModeMeasurement error:nil];
    [audioSession setActive:YES withOptions:AVAudioSessionSetActiveOptionNotifyOthersOnDeactivation error:nil];
}

-(void) stopAndRelease
{
    NSLog(@"Invoking SFSpeechRecognizer stopAndRelease: ");
    [self.audioEngine stop];
    [self.recognitionRequest endAudio];
    [self.audioEngine.inputNode removeTapOnBus:0];
    self.recognitionRequest = nil;
    [self.recognitionTask cancel];
    self.recognitionTask = nil;
}
Regarding the logs added, I am able to see all logs up to "Step7" printed.
When debugging on the device, it consistently triggers a break at the lines below (I have exception breakpoints set), though continuing resumes execution. It happens the same way during the few successful executions as well.
AVAudioFormat *recordingFormat = [self.audioEngine.inputNode outputFormatForBus:0];
[self.audioEngine prepare];

The reason is that the audio hadn't completely finished when -speechSynthesizer:didFinishSpeechUtterance: was called, therefore you get this kind of error when trying to call setActive:NO. You can't deactivate the AudioSession or change any settings while I/O is running. Workaround: wait several ms (how long, read below) and then perform the AudioSession deactivation and the rest.
A few words about audio playing completion.
That might seem weird at first glance, but I've spent tons of time researching this issue. When you put the last sound chunk to the device output, you only have an approximate timing of when it will actually be completed. Look at the AudioSession property ioBufferDuration:
The audio I/O buffer duration is the number of seconds for a single audio input/output cycle. For example, with an I/O buffer duration of 0.005 s, on each audio I/O cycle:
You receive 0.005 s of audio if obtaining input.
You must provide 0.005 s of audio if providing output.
The typical maximum I/O buffer duration is 0.093 s (corresponding to 4096 sample frames at a sample rate of 44.1 kHz). The minimum I/O buffer duration is at least 0.005 s (256 frames) but might be lower depending on the hardware in use.
So, we can interpret this value as the playback time of one chunk. But you still have a small, uncalculated gap between this timeline and the actual completion of audio playback (hardware delay). I would say you need to wait about ioBufferDuration * 1000 + delay ms to be sure audio playback is complete (ioBufferDuration * 1000 because it is a duration in seconds), where delay is some fairly small value.
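Applied to the question's delegate method, a minimal sketch could look like the following; the 50 ms padding is an assumption of mine, not an Apple-documented value:
- (void)speechSynthesizer:(AVSpeechSynthesizer *)synthesizer didFinishSpeechUtterance:(AVSpeechUtterance *)utterance {
    AVAudioSession *session = [AVAudioSession sharedInstance];

    // IOBufferDuration is in seconds; pad it with a small, hand-tuned margin
    // (50 ms here is an assumed value) to cover the hardware delay.
    NSTimeInterval wait = session.IOBufferDuration + 0.05;

    dispatch_after(dispatch_time(DISPATCH_TIME_NOW, (int64_t)(wait * NSEC_PER_SEC)),
                   dispatch_get_main_queue(), ^{
        NSError *error = nil;
        // Only deactivate once the last buffer has (very likely) been rendered.
        [session setActive:NO
               withOptions:AVAudioSessionSetActiveOptionNotifyOthersOnDeactivation
                     error:&error];
        if (error) {
            NSLog(@"Deactivation still failed: %@", error);
        }
    });
}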
Moreover, it seems even Apple developers are not entirely sure about the audio completion time. Take a quick look at the newer audio class AVAudioPlayerNode and func scheduleBuffer(_ buffer: AVAudioPCMBuffer, completionHandler: AVFoundation.AVAudioNodeCompletionHandler? = nil):
@param completionHandler
called after the buffer has been consumed by the player or the player is stopped. may be nil.
@discussion
Schedules the buffer to be played following any previously scheduled commands. It is possible for the completionHandler to be called before rendering begins or before the buffer is played completely.
You can read more about audio processing in Understanding the Audio Unit Render Callback Function (AudioUnit is a low-level API that provides fast access to I/O data).

Related

Audio recording using AVAudioEngine with setting output audio port

In order to make playback and recording available at the same time, we use these methods to set the AVAudioSession category:
AVAudioSession *audioSession = [AVAudioSession sharedInstance];
[audioSession setCategory:AVAudioSessionCategoryPlayAndRecord error:NULL];
By doing so, the output audio port switches from the line-out speaker to the built-in speaker. In the loop-recording window we need playback through the line-out speaker and audio recording from the microphone to work simultaneously. To play sound through the line-out speaker after setting the AVAudioSession category, we use a method to override the output audio port:
[[AVAudioSession sharedInstance]
overrideOutputAudioPort:AVAudioSessionPortOverrideSpeaker error:nil];
We try to arrange both recording and playback using AVAudioEngine.
Structure of AVAudioEngine connections:
// input -> inputMixer -> mainEqualizer -> (tap)
//                              |
// recordPlayerNode -> recordMixer -> meteringMixer -> mainMixer -> out
//                                                        ^
// volumePlayer ------------------------------------------+
After executing overrideOutputAudioPort, the recording feature stops working on iPhone 6S and higher. We perform recording in this manner:
if (self.isHeadsetPluggedIn)
{
    volumePlayer.volume = 1;
}
else
{
    volumePlayer.volume = 0.000001;
}

[volumePlayer play];
[mainEqualizer installTapOnBus:0 bufferSize:0 format:tempAudioFile.processingFormat block:^(AVAudioPCMBuffer *buf, AVAudioTime *when)
{
    if (self.isRecord)
    {
        [volumePlayer scheduleBuffer:buf completionHandler:nil];
        recordedFrameCount += buf.frameLength;
        if (self.isLimitedRecord && recordedFrameCount >= [AVAudioSession sharedInstance].sampleRate * 90)
        {
            self.isRecord = false;
            [self.delegate showAlert:RecTimeLimit];
        }

        NSError *error;
        [tempAudioFile writeFromBuffer:buf error:&error];
        if (error)
        {
            NSLog(@"Alert while writing to file: %@", error.localizedDescription);
        }

        [self updateMetersForMicro];
    }
    else
    {
        [mainEqualizer removeTapOnBus:0];
        [self.delegate recordDidFinish];
        callbackBlock(recordUrl);
        [mainEngine stop];
    }
}];
During the investigation we discovered an interesting fact: if
volumePlayer.volume = 1;
when headphones are not connected, then the buffer that comes from the microphone starts to fill and the sound keeps being recorded, but a very loud sound-repetition effect appears in the speaker. Otherwise, the PCMBuffer is filled with zeros.
The question is: how can we configure the AVAudioSession, or the recording process, so that we can record audio using the microphone and play audio through the line-out speaker?
P.S. Recording with AVAudioRecorder works correctly with these settings.

No audio from AVCaptureSession after changing AVAudioSession

I'm making an app that supports both video playback and recording. I always want to allow background audio mixing, except during video playback (during playback, background audio should be muted). Therefore, I use the two methods below when changing the playback state. When my AVPlayer starts loading, I call MuteBackgroundAudio, and when I dismiss the view controller containing it, I call ResumeBackgroundAudio. This works as expected, and the audio returns successfully after leaving playback.
The issue is that after doing this at least once, whenever I record anything using AVCaptureSession, no sound gets recorded. My session is configured like so:
AVCaptureDevice *audioDevice = [[AVCaptureDevice devicesWithMediaType:AVMediaTypeAudio] firstObject];
AVCaptureDeviceInput *audioDeviceInput = [AVCaptureDeviceInput deviceInputWithDevice:audioDevice error:&error];

if (error)
{
    NSLog(@"%@", error);
}

if ([self.session canAddInput:audioDeviceInput])
{
    [self.session addInput:audioDeviceInput];
}

[self.session setAutomaticallyConfiguresApplicationAudioSession:NO];

// ... videoDeviceInput
Note that I have not set usesApplicationAudioSession, so it defaults to YES.
void MuteBackgroundAudio(void)
{
    dispatch_async(dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_HIGH, 0), ^{
        if ([[AVAudioSession sharedInstance] isOtherAudioPlaying] && !isMuted)
        {
            isMuted = YES;

            NSError *error = nil;
            [[AVAudioSession sharedInstance] setCategory:AVAudioSessionCategoryPlayAndRecord
                                             withOptions:AVAudioSessionCategoryOptionDefaultToSpeaker
                                                   error:&error];
            if (error)
            {
                NSLog(@"DEBUG - Set category error %ld, %@", (long)error.code, error.localizedDescription);
            }

            NSError *error2 = nil;
            [[AVAudioSession sharedInstance] setActive:YES
                                           withOptions:0
                                                 error:&error2];
            if (error2)
            {
                NSLog(@"DEBUG - Set active error 2 %ld, %@", (long)error2.code, error2.localizedDescription);
            }
        }
    });
}
void ResumeBackgroundAudio(void)
{
    dispatch_async(dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_HIGH, 0), ^{
        if (isMuted)
        {
            AVAudioSession *audioSession = [AVAudioSession sharedInstance];

            NSError *deactivationError = nil;
            [audioSession setActive:NO withOptions:AVAudioSessionSetActiveOptionNotifyOthersOnDeactivation error:&deactivationError];
            if (deactivationError)
            {
                NSLog(@"DEBUG - Failed at deactivating audio session, retrying...");
                ResumeBackgroundAudio();
                return;
            }

            isMuted = NO;
            NSLog(@"DEBUG - Audio session deactivated");

            NSError *categoryError = nil;
            [audioSession setCategory:AVAudioSessionCategoryPlayAndRecord
                          withOptions:AVAudioSessionCategoryOptionDefaultToSpeaker | AVAudioSessionCategoryOptionMixWithOthers
                                error:&categoryError];
            if (categoryError)
            {
                NSLog(@"DEBUG - Failed at setting category");
                return;
            }
            NSLog(@"DEBUG - Audio session category set to mix with others");

            NSError *activationError = nil;
            [audioSession setActive:YES error:&activationError];
            if (activationError)
            {
                NSLog(@"DEBUG - Failed at activating audio session");
                return;
            }
            NSLog(@"DEBUG - Audio session activated");
        }
    });
}
Debugging
I have noticed that the audioSession always needs two tries to successfully deactivate after calling ResumeBackgroundAudio. It seems my AVPlayer does not get deallocated or stopped in time, which matches this comment in AVAudioSession.h:
Note that this method will throw an exception in apps linked on or after iOS 8 if the session is set inactive while it has running or paused I/O (e.g. audio queues, players, recorders, converters, remote I/Os, etc.).
The fact that no sound gets recorded brings me to believe the audioSession does not actually get activated, but my logging says it does (always in the second iteration of the recursion).
I got the idea of using recursion to solve this problem from this post.
To clarify, the flow that causes the problem is the following:
Open app with Spotify playing
Begin playback of any content in the app
Spotify gets muted, playback begins (MuteBackgroundAudio)
Playback ends, Spotify starts playing again (ResumeBackgroundAudio)
Start recording
Stop recording, get mad that there is no audio
I've had the exact same issue as you're describing, down to the very last detail ([audioSession setActive:NO withOptions:AVAudioSessionSetActiveOptionNotifyOthersOnDeactivation error:&deactivationError]; failed the first time no matter what)
The only acceptable way, in my case, was to stop and then start the AVCaptureSession. I'm not entirely sure, but I think this is exactly how WhatsApp handles it - their camera behaves exactly like mine with the solution I'm suggesting.
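As a rough sketch of that stop/start approach (self.session and self.sessionQueue are assumed names, not from the original post):
// Hypothetical helper illustrating the stop/start approach described above.
- (void)restartCaptureAfterAudioSessionChange
{
    dispatch_async(self.sessionQueue, ^{
        // Stop capture first so its audio I/O is no longer running.
        [self.session stopRunning];

        // Reconfigure / reactivate the shared audio session while capture is down.
        NSError *error = nil;
        [[AVAudioSession sharedInstance] setActive:YES error:&error];
        if (error) {
            NSLog(@"Audio session activation failed: %@", error);
        }

        // Start capture again; audio input should come through once more.
        [self.session startRunning];
    });
}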
Removing and adding the same audio input on the already running session also seemed to work but I got a terrible camera freeze - nothing compared to the session start / stop.
The recursion solution is nice, but I think it would be a great idea to 'throttle' that recursive call (add a short delay and a maximum retry count of some sort) in case the set fails every time for a legitimate reason. Otherwise your stack will overflow and your app will crash. A minimal sketch of that idea follows.
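In the sketch below, the 100 ms delay and the retry cap are arbitrary assumptions:
// Retry deactivation with a short delay and a bounded number of attempts,
// instead of recursing unconditionally.
static void ResumeBackgroundAudioWithRetry(NSUInteger attemptsLeft)
{
    if (attemptsLeft == 0) {
        NSLog(@"DEBUG - Giving up on deactivating the audio session");
        return;
    }

    NSError *deactivationError = nil;
    [[AVAudioSession sharedInstance] setActive:NO
                                   withOptions:AVAudioSessionSetActiveOptionNotifyOthersOnDeactivation
                                         error:&deactivationError];
    if (deactivationError) {
        // Wait 100 ms before retrying so the running I/O gets a chance to stop.
        dispatch_after(dispatch_time(DISPATCH_TIME_NOW, (int64_t)(0.1 * NSEC_PER_SEC)),
                       dispatch_get_main_queue(), ^{
            ResumeBackgroundAudioWithRetry(attemptsLeft - 1);
        });
        return;
    }

    // ...continue with setCategory / setActive:YES as in ResumeBackgroundAudio above.
}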
If you or anyone found a better solution to this I would love to know it.

AVAudioRecorder not recording in background after audio session interruption ended

I am recording audio in my app, both in the foreground and in the background. I also handle AVAudioSessionInterruptionNotification to stop recording when an interruption begins and start again when it ends. Although it works as expected in the foreground, when the app is recording in the background and I receive a call, it doesn't start recording again after the call ends. My code is the following:
- (void)p_handleAudioSessionInterruptionNotification:(NSNotification *)notification
{
    NSUInteger interruptionType = [[[notification userInfo] objectForKey:AVAudioSessionInterruptionTypeKey] unsignedIntegerValue];

    if (interruptionType == AVAudioSessionInterruptionTypeBegan) {
        if (self.isRecording && !self.interruptedWhileRecording) {
            [self.recorder stop];
            self.interruptedWhileRecording = YES;
            return;
        }
    }

    if (interruptionType == AVAudioSessionInterruptionTypeEnded) {
        if (self.interruptedWhileRecording) {
            NSError *error = nil;
            [[AVAudioSession sharedInstance] setActive:YES error:&error];

            NSDictionary *settings = @{
                AVEncoderAudioQualityKey: @(AVAudioQualityMax),
                AVSampleRateKey: @8000,
                AVFormatIDKey: @(kAudioFormatLinearPCM),
                AVNumberOfChannelsKey: @1,
                AVLinearPCMBitDepthKey: @16,
                AVLinearPCMIsBigEndianKey: @NO,
                AVLinearPCMIsFloatKey: @NO
            };
            _recorder = [[AVAudioRecorder alloc] initWithURL:fileURL settings:settings error:nil];
            [self.recorder record];
            self.interruptedWhileRecording = NO;
            return;
        }
    }
}
Note that fileURL points to a new .caf file in an NSDocumentDirectory subdirectory. The audio background mode is configured. I also tried voip and playing silence, both without success.
The NSError in the AVAudioSessionInterruptionTypeEnded block is an OSStatus error 560557684, which I haven't found out how to tackle.
Any help would be much appreciated.
Error 560557684 is for AVAudioSessionErrorCodeCannotInterruptOthers. This happens when your background app is trying to activate an audio session that doesn't mix with other audio sessions. Background apps cannot start audio sessions that don't mix with the foreground app's audio session because that would interrupt the audio of the app currently being used by the user.
To fix this make sure to set your session category to one that is mixable, such as AVAudioSessionCategoryPlayback. Also be sure to set the category option AVAudioSessionCategoryOptionMixWithOthers (required) and AVAudioSessionCategoryOptionDuckOthers (optional). For example:
// background audio *must* mix with other sessions (or setActive will fail)
NSError *sessionError = nil;
[[AVAudioSession sharedInstance] setCategory:AVAudioSessionCategoryPlayback
                                 withOptions:AVAudioSessionCategoryOptionMixWithOthers | AVAudioSessionCategoryOptionDuckOthers
                                       error:&sessionError];
if (sessionError) {
    NSLog(@"ERROR: setCategory %@", [sessionError localizedDescription]);
}
The error code 560557684 is actually the four ASCII characters '!int' packed into a 32-bit integer. The error codes are listed in the AVAudioSession.h file (see also AVAudioSession):
@enum AVAudioSession error codes
@abstract These are the error codes returned from the AVAudioSession API.
...
@constant AVAudioSessionErrorCodeCannotInterruptOthers
The app's audio session is non-mixable and trying to go active while in the background.
This is allowed only when the app is the NowPlaying app.

typedef NS_ENUM(NSInteger, AVAudioSessionErrorCode)
{
    ...
    AVAudioSessionErrorCodeCannotInterruptOthers = '!int', /* 0x21696E74, 560557684 */
    ...
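As an aside, a small helper like the one below (purely illustrative, not part of the original answer) can decode any OSStatus of this shape back into its four-character form:
#import <Foundation/Foundation.h>
#include <ctype.h>
#include <string.h>

// Turn an OSStatus such as 560557684 back into its four-character code ("!int").
static NSString *FourCCFromOSStatus(OSStatus status)
{
    uint32_t bigEndian = CFSwapInt32HostToBig((uint32_t)status);  // most significant byte first
    char chars[5] = {0};
    memcpy(chars, &bigEndian, 4);

    // Only treat it as a four-char code if every byte is printable ASCII.
    for (int i = 0; i < 4; i++) {
        if (!isprint((unsigned char)chars[i])) {
            return [NSString stringWithFormat:@"%d", (int)status];
        }
    }
    return [NSString stringWithFormat:@"'%s'", chars];
}

// Example: NSLog(@"%@", FourCCFromOSStatus(560557684)); // prints '!int'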
I added the following
[[UIApplication sharedApplication] beginReceivingRemoteControlEvents];
before configuring the AVAudioSession, and it worked. I still don't know what bugs may appear.
I had this same error when trying to use AVSpeechSynthesizer().speak() while my app was in the background. @progmr's answer solved the problem for me, though I also had to call AVAudioSession.sharedInstance().setActive(true).
For completeness, here's my code in Swift 5.
In application(_:didFinishLaunchingWithOptions:):
do {
    try AVAudioSession.sharedInstance().setCategory(AVAudioSession.Category.playback,
                                                    options: [.mixWithOthers,
                                                              .duckOthers])
} catch let error as NSError {
    print("Error setting up AVAudioSession : \(error.localizedDescription)")
}
Then in my view controller:
do {
    try AVAudioSession.sharedInstance().setActive(true)
} catch let error as NSError {
    print("Error : \(error.localizedDescription)")
}
let speechUtterance = AVSpeechUtterance(string: "Hello, World")
let speechSynth = AVSpeechSynthesizer()
speechSynth.speak(speechUtterance)
Note: When setActive(true) is called, it reduces the volume of anything else playing at the time. To turn the volume back up afterwards, you need to call setActive(false) - for me the best time to do that was once I'd been notified that the speech had finished in the corresponding AVSpeechSynthesizerDelegate method.

I want to call installTapOnBus:bufferSize:format:block: 20 times per second

I want to display a waveform in real time from the microphone input.
I have implemented this using installTapOnBus:bufferSize:format:block:, but the block is only called three times per second.
I want this block to be called 20 times per second.
Where can I set that?
AVAudioSession *audioSession = [AVAudioSession sharedInstance];

NSError* error = nil;
if (audioSession.isInputAvailable) [audioSession setCategory:AVAudioSessionCategoryPlayAndRecord error:&error];
if (error) {
    return;
}

[audioSession setActive:YES error:&error];
if (error) {
    return;
}

self.engine = [[[AVAudioEngine alloc] init] autorelease];

AVAudioMixerNode* mixer = [self.engine mainMixerNode];
AVAudioInputNode* input = [self.engine inputNode];
[self.engine connect:input to:mixer format:[input inputFormatForBus:0]];

// tap ... 1 call in 16537 frames
// It does not change even if you change the bufferSize
[input installTapOnBus:0 bufferSize:4096 format:[input inputFormatForBus:0] block:^(AVAudioPCMBuffer* buffer, AVAudioTime* when) {
    for (UInt32 i = 0; i < buffer.audioBufferList->mNumberBuffers; i++) {
        Float32 *data = buffer.audioBufferList->mBuffers[i].mData;
        UInt32 frames = buffer.audioBufferList->mBuffers[i].mDataByteSize / sizeof(Float32);
        // create waveform
        ...
    }
}];

[self.engine startAndReturnError:&error];
if (error) {
    return;
}
They say Apple Support replied no (in Sep 2014):
Yes, currently internally we have a fixed tap buffer size (0.375s), and the client specified buffer size for the tap is not taking effect.
But someone resized the buffer size and got 40 ms:
https://devforums.apple.com/thread/249510?tstart=0
I can't check it myself, as I need it in ObjC :(
UPD: it works! Just a single line:
[input installTapOnBus:0 bufferSize:1024 format:[mixer outputFormatForBus:0] block:^(AVAudioPCMBuffer *buffer, AVAudioTime *when) {
    buffer.frameLength = 1024; // here
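A slightly fuller sketch of that workaround, assuming the same engine/input/mixer setup as in the question (whether overriding frameLength like this is officially supported is unclear):
// Request a 1024-frame tap and force the reported frame length down to 1024
// before processing; the engine may still deliver larger buffers internally.
[input installTapOnBus:0
            bufferSize:1024
                format:[mixer outputFormatForBus:0]
                 block:^(AVAudioPCMBuffer *buffer, AVAudioTime *when) {
    buffer.frameLength = 1024;

    float *samples = buffer.floatChannelData[0];   // first channel, non-interleaved float
    for (AVAudioFrameCount i = 0; i < buffer.frameLength; i++) {
        // feed samples[i] into the waveform display
    }
}];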
The AVAudioNode class reference states that the implementation may choose a buffer size other than the one that you supply, so as far as I know, we are stuck with the very large buffer size. This is unfortunate, because AVAudioEngine is otherwise an excellent Core Audio wrapper. Since I too need to use the input tap for something other than recording, I'm looking into The Amazing Audio Engine, as well as the Core Audio C API (see the iBook Learning Core Audio for excellent tutorials on it), as alternatives.
Update: It turns out that you can access the AudioUnit of the AVAudioInputNode and install a render callback on it. Via AVAudioSession, you can set your audio session's desired buffer size (not guaranteed, but certainly better than node taps). Thus far, I've gotten buffer sizes as low as 64 samples using this approach. I'll post back here with code once I've had a chance to test this.
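For the AVAudioSession part of that approach, requesting a smaller I/O buffer looks roughly like this (the hardware is free to round or ignore the request):
AVAudioSession *session = [AVAudioSession sharedInstance];
NSError *error = nil;

[session setCategory:AVAudioSessionCategoryPlayAndRecord error:&error];

// Ask for ~5 ms buffers; this is only a preference, not a guarantee.
[session setPreferredIOBufferDuration:0.005 error:&error];
[session setActive:YES error:&error];

// Check what the system actually granted.
NSLog(@"IOBufferDuration is now %f s", session.IOBufferDuration);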
As of iOS 13 in 2019, there is AVAudioSinkNode, which may better accomplish what you are looking for. While you could have also created a regular AVAudioUnit / Node and attached it to the input/output, the difference with an AVAudioSinkNode is that there is no output required. That makes it more like a tap and circumvents issues with incomplete chains that might occur when using a regular Audio Unit / Node.
For more information:
https://developer.apple.com/videos/play/wwdc2019/510/
https://devstreaming-cdn.apple.com/videos/wwdc/2019/510v8txdlekug3npw2m/510/510_whats_new_in_avaudioengine.pdf?dl=1
https://developer.apple.com/documentation/avfoundation/avaudiosinknode?language=objc
The relevant Swift code is on page 10 (with a small error corrected below) of the session's PDF.
// Create Engine
let engine = AVAudioEngine()

// Create and Attach AVAudioSinkNode
let sinkNode = AVAudioSinkNode() { (timeStamp, frames, audioBufferList) -> OSStatus in
    …
}
engine.attach(sinkNode)
I imagine that you'll still have to follow the typical real-time audio rules when using this (e.g. no allocating/freeing memory, no ObjC calls, no locking or waiting on locks, etc.). A ring buffer may still be helpful here.
Don't know why or even if this works yet, just trying a few things out. But for sure the NSLogs indicate a 21 ms interval, 1024 samples coming in per buffer...
AVAudioEngine* sEngine = NULL;

- (void)applicationDidBecomeActive:(UIApplication *)application
{
    /*
     Restart any tasks that were paused (or not yet started) while the application was inactive. If the application was previously in the background, optionally refresh the user interface.
     */
    [glView startAnimation];

    AVAudioSession *audioSession = [AVAudioSession sharedInstance];

    NSError* error = nil;
    if (audioSession.isInputAvailable) [audioSession setCategory:AVAudioSessionCategoryPlayAndRecord error:&error];
    if (error) {
        return;
    }

    [audioSession setActive:YES error:&error];
    if (error) {
        return;
    }

    sEngine = [[AVAudioEngine alloc] init];

    AVAudioMixerNode* mixer = [sEngine mainMixerNode];
    AVAudioInputNode* input = [sEngine inputNode];
    [sEngine connect:input to:mixer format:[input inputFormatForBus:0]];

    __block NSTimeInterval start = 0.0;

    // tap ... 1 call in 16537 frames
    // It does not change even if you change the bufferSize
    [input installTapOnBus:0 bufferSize:1024 format:[input inputFormatForBus:0] block:^(AVAudioPCMBuffer* buffer, AVAudioTime* when) {
        if (start == 0.0)
            start = [AVAudioTime secondsForHostTime:[when hostTime]];

        // why does this work? because perhaps the smaller buffer is reused by the audioengine, with the code to dump new data into the block just using the block size as set here?
        // I am not sure that this is supported by apple?
        NSLog(@"buffer frame length %d", (int)buffer.frameLength);
        buffer.frameLength = 1024;

        UInt32 frames = 0;
        for (UInt32 i = 0; i < buffer.audioBufferList->mNumberBuffers; i++) {
            Float32 *data = buffer.audioBufferList->mBuffers[i].mData;
            frames = buffer.audioBufferList->mBuffers[i].mDataByteSize / sizeof(Float32);
            // create waveform
            ///
        }
        NSLog(@"%d frames are sent at %lf", (int) frames, [AVAudioTime secondsForHostTime:[when hostTime]] - start);
    }];

    [sEngine startAndReturnError:&error];
    if (error) {
        return;
    }
}
You might be able to use a CADisplayLink to achieve this. A CADisplayLink will give you a callback each time the screen refreshes, which typically will be much more than 20 times per second (so additional logic may be required to throttle or cap the number of times your method is executed in your case).
This is obviously a solution that is quite discrete from your audio work, and to the extent you require a solution that reflects your session, it might not work. But when we need frequent recurring callbacks on iOS, this is often the approach of choice, so it's an idea.
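For illustration, a minimal CADisplayLink setup capped at roughly 20 callbacks per second might look like this (the target, selector, and displayLink property names are assumptions):
#import <QuartzCore/QuartzCore.h>

// Drive waveform redraws from the display link rather than from the audio tap.
- (void)startWaveformUpdates
{
    CADisplayLink *link = [CADisplayLink displayLinkWithTarget:self
                                                      selector:@selector(updateWaveform:)];
    if (@available(iOS 10.0, *)) {
        link.preferredFramesPerSecond = 20;   // cap callbacks at roughly 20 per second
    } else {
        link.frameInterval = 3;               // ~60 Hz / 3 ≈ 20 Hz on older systems
    }
    [link addToRunLoop:[NSRunLoop mainRunLoop] forMode:NSRunLoopCommonModes];
    self.displayLink = link;                  // assumed property, kept for later invalidation
}

- (void)updateWaveform:(CADisplayLink *)link
{
    // Read the most recent samples captured by the audio tap and redraw the waveform here.
}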

Why can't I record from RemoteIOUnit after changing AudioSession category from SoloAmbient to PlayAndRecord?

My app has both audio play and record features, and I want to only set the audio session's category to PlayAndRecord when the user initiates recording, so the standard audio playback will be muted by the mute switch, etc.
I'm having a problem though, where my call to AudioUnitRender to record audio input is failing with errParam (-50) after I change the audio session category to PlayAndRecord. If I start my app using the PlayAndRecord category, then recording works correctly.
@implementation MyAudioSession

- (instancetype)init {
    if ((self = [super init])) {
        NSError *error = nil;
        AVAudioSession *session = [AVAudioSession sharedInstance];
        [session setCategory:AVAudioSessionCategorySoloAmbient error:&error];
        [session setActive:YES error:&error];
    }
    return self;
}

- (void)enableRecording {
    void (^setCategory)(void) = ^{
        NSError *error;
        AVAudioSession *session = [AVAudioSession sharedInstance];
        [session setCategory:AVAudioSessionCategoryPlayAndRecord error:&error];
    };
    // Do I need to set the category from the main thread?
    if ([NSThread isMainThread]) {
        setCategory();
    } else {
        dispatch_sync(dispatch_get_main_queue(), ^{
            setCategory();
        });
    }
}

@end
@interface MyRecorder : NSObject {
    AudioUnit ioUnit_;
    AudioBufferList *tmpRecordListPtr_;
}
@end

@implementation MyRecorder

- (instancetype)init {
    // Sets up AUGraph with just a RemoteIOUnit node, recording enabled, callback, etc.
    // Set up audio buffers
    tmpRecordListPtr_ = malloc(sizeof(AudioBufferList) + 64 * sizeof(AudioBuffer));
}

- (OSStatus)doRecordCallback:(AudioUnitRenderActionFlags *)ioActionFlags
                   timeStamp:(AudioTimeStamp *)inTimeStamp
                   busNumber:(UInt32)busNumber
                   numFrames:(UInt32)numFrames
                   bufferOut:(AudioBufferList *)ioData {
    // Set up buffers... All this works fine if I initialize the audio session to
    // PlayAndRecord in -[MyAudioSession init]
    OSStatus status = AudioUnitRender(ioUnit_, ioActionFlags, inTimeStamp, busNumber,
                                      numFrames, tmpRecordListPtr_);
    // This fails with errParam, but only if I start my app in SoloAmbient and then
    // later change it to PlayAndRecord
}

@end
OSStatus MyRecorderCallback(void *inRefCon, AudioUnitRenderActionFlags *ioActionFlags,
                            AudioTimeStamp *inTimestamp, UInt32 inBusNumber,
                            UInt32 inNumberFrames, AudioBufferList *ioData) {
    MyRecorder *recorder = (MyRecorder *)inRefCon;
    return [recorder doRecordCallback:ioActionFlags
                            timeStamp:inTimestamp
                            busNumber:inBusNumber
                            numFrames:inNumberFrames
                            bufferOut:ioData];
}
I'm testing on an iPod touch (5th gen) running iOS 7.1.2.
Has anybody else encountered this issue? Any suggestions for fixes or more info I can post?
EDIT: Object lifecycle is similar to:
- (void)startRecording {
    [mySession enableRecording];
    [myRecorder release];
    myRecorder = [[MyRecorder alloc] init];
    [myRecorder start]; // starts the AUGraph
}
Without looking at your code it is difficult to comment. But I am doing something similar in my app, and I found that it is important to pay careful attention to which audio session settings can be changed only while the audio session is inactive.
// Get the app's audioSession singleton object
AVAudioSession* session = [AVAudioSession sharedInstance];
//error handling
NSError* audioSessionError = nil;
SDR_DEBUGPRINT(("Setting session not active!\n"));
[session setActive:NO error:&audioSessionError]; // shut down the audio session if it is active
It is important to setActive to "NO" prior to changing the session category, for instance. Failure to do so might allow render callbacks to occur while the session is being configured.
Looking at the lifecycle flow, I'm trying to see where you stop the AUGraph prior to setting up the audio session for recording. The code I use for stopping the AUGraph follows. I call it prior to any attempts to reconfigure the audio session.
- (void)stopAUGraph {
    if (mAudioGraph != nil) {
        Boolean isRunning = FALSE;

        OSStatus result = AUGraphIsRunning(mAudioGraph, &isRunning);
        if (result) {
            DEBUGPRINT(("AUGraphIsRunning result %d %08X %4.4s\n", (int)result, (int)result, (char*)&result));
            return;
        }

        if (isRunning) {
            result = AUGraphStop(mAudioGraph);
            if (result) {
                DEBUGPRINT(("AUGraphStop result %d %08X %4.4s\n", (int)result, (int)result, (char*)&result));
                return;
            } else {
                DEBUGPRINT(("mAudioGraph has been stopped!\n"));
            }
        } else {
            DEBUGPRINT(("mAudioGraph is already stopped!\n"));
        }
    }
}
You need to make sure the RemoteIO Audio Unit (the audio graph) is stopped before deactivating and/or changing the audio session type. Then (re)initialize the RemoteIO Audio Unit after setting the new session type and before (re)starting the graph, as the new session type or options may change some of the allowable settings. Also, it helps to check all the prior audio unit and audio session call error codes before any graph (re)start.
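Putting that together, the overall order of operations might look roughly like this (mAudioGraph is the graph from the answer above; error handling is abbreviated):
// 1. Stop the running graph so no render callbacks fire during reconfiguration.
AUGraphStop(mAudioGraph);
AUGraphUninitialize(mAudioGraph);

// 2. Deactivate, change the category, and reactivate the audio session.
AVAudioSession *session = [AVAudioSession sharedInstance];
NSError *error = nil;
[session setActive:NO error:&error];
[session setCategory:AVAudioSessionCategoryPlayAndRecord error:&error];
[session setActive:YES error:&error];

// 3. Reinitialize and restart the graph under the new session configuration.
AUGraphInitialize(mAudioGraph);
AUGraphStart(mAudioGraph);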
