iOS - AVAudioPlayerNode.play() execution is very slow

I'm using AVAudioEngine for audio in an iOS game application. A problem I've encountered is that AVAudioPlayerNode.play() takes a long time to execute, which can be a problem in real-time applications such as games.
play() just activates the player node - you don't have to call it every time you play a sound. As such, it doesn't have to be called that often, but it does have to be called occasionally, such as to activate the player initially, or after it's been deactivated (which happens in some situations). Even if only called occasionally, the long execution times can be a problem, especially if you need to call play() on multiple players at once.
The execution time for play() seems to be proportional to the value of AVAudioSession.ioBufferDuration, which you can request to be changed using AVAudioSession.setPreferredIOBufferDuration(). Here's some code I'm using to test this:
import AVFoundation
import UIKit

class ViewController: UIViewController {
    private let engine = AVAudioEngine()
    private let player = AVAudioPlayerNode()
    private let ioBufferSize = 1024.0 // Or 256.0

    override func viewDidLoad() {
        super.viewDidLoad()

        let audioSession = AVAudioSession.sharedInstance()
        try! audioSession.setPreferredIOBufferDuration(ioBufferSize / 44100.0)
        try! audioSession.setActive(true)

        engine.attach(player)
        engine.connect(player, to: engine.mainMixerNode, format: nil)
        try! engine.start()

        print("IO buffer duration: \(audioSession.ioBufferDuration)")
    }

    override func touchesBegan(_ touches: Set<UITouch>, with event: UIEvent?) {
        if player.isPlaying {
            player.stop()
        } else {
            let startTime = CACurrentMediaTime()
            player.play()
            let endTime = CACurrentMediaTime()
            print("\(endTime - startTime)")
        }
    }
}
Here are some sample timings for play() that I got using a buffer size of 1024 (which I believe is the default):
0.0218
0.0147
0.0211
0.0160
0.0184
0.0194
0.0129
0.0160
Here are some sample timings using a buffer size of 256:
0.0014
0.0029
0.0033
0.0023
0.0030
0.0039
0.0031
0.0032
As you can see above, for a buffer size of 1024, execution times tend to be in the 15-20 ms range (around a full frame at 60 FPS). With a buffer size of 256, it's around 3 ms - not as bad, but still costly when you only have ~17 ms per frame to work with.
This is on an iPad Mini 2 running iOS 12.4.2. This is obviously an old device, but the results I see on the simulator seem similarly proportional, so it seems to have more to do with the buffer size and the behavior of the function itself than with the hardware being used. I don't know what's going on under the hood, but it seems possible that play() blocks until the beginning of the next audio cycle, or something like that.
Requesting a lower buffer size seems like a partial solution, but there are some potential drawbacks. According to the documentation here, lower buffer sizes can mean more disk access when streaming from a file, and irrespective of that, the request may not be honored at all. Also, here, someone reports playback problems related to low buffer sizes. Taking all this into account, I'm disinclined to pursue this as a solution.
That leaves me with execution times for play() in the 15-20 ms range, which typically means a missed frame at 60 FPS. If I arrange things so that only one call to play() is made at a time, and only infrequently, maybe it won't be noticeable, but it's not ideal.
I've searched for information and asked about this in other places, but it seems either not many people are encountering this behavior in practice, or it isn't an issue for them.
AVAudioEngine is intended for use in real-time applications, so if I'm right that AVAudioPlayerNode.play() blocks for a significant amount of time proportional to the buffer size, that seems like a design issue. I realize this probably isn't an issue many are dealing with, but I'm posting here to ask if anyone has encountered this specific issue with AVAudioEngine, and if so, if there's any insight, suggestions, or workarounds anyone can offer.

I've investigated this fairly thoroughly. Here are my findings.
Having now tested the behavior on a variety of devices and iOS versions (including the latest version at the time of this writing, 13.2), and having had others test it as well, my current conclusion is that the long execution times for AVAudioPlayerNode.play() are inherent and that there's no obvious workaround. As noted in my original post, the execution times can be reduced by requesting a lower buffer duration, but as discussed earlier, this doesn't seem like a viable solution.
I heard from a credible source that calling play() on a background thread (e.g. using Grand Central Dispatch) should be safe, and indeed this would be one way to solve the problem. However, although it may technically be safe to call play() (or other AVAudioEngine-related functions) on different threads, I'm skeptical as to whether this is a good idea (further explanation below).
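For anyone who does decide to try the background-thread route anyway, here's a minimal sketch of what I mean (my own illustration; the queue name is made up, and it assumes the engine is running and the player is already attached and connected):

import AVFoundation

// Hypothetical sketch: move play() off the main thread via a dedicated serial queue.
let audioControlQueue = DispatchQueue(label: "audio.control") // name is illustrative

func activate(_ player: AVAudioPlayerNode) {
    audioControlQueue.async {
        // play() may still block internally for up to an I/O cycle, but it no
        // longer stalls the main thread / game loop while it does so.
        player.play()
    }
}

The caveat still applies: every other AVAudioEngine call would then need to be coordinated with this queue, which is exactly the kind of fragility discussed below.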
The documentation doesn't state this as far as I can tell, but AVAudioEngine will throw NSExceptions under various circumstances, which, without special handling, will result in application termination in Swift.
One of the things that will cause an NSException to be thrown is if you call AVAudioPlayerNode.play() while the engine is not running. Obviously if you only have your own code to worry about, you can take steps to ensure this doesn't occur.
However, iOS itself will sometimes stop the engine of its own accord, for example when an audio interruption occurs. If you call play() subsequent to that and before restarting the engine, an NSException will be thrown. It's fairly easy to avoid this mistake if all your calls to play() are on the main thread, but multithreading complicates the issue and seems like it could introduce the risk of accidentally calling play() after the engine has been stopped. Although there may be ways to work around this, multithreading seems to introduce undesirable complexity and fragility, so I've opted not to pursue it.
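That said, one way to reduce the risk of that particular exception (a sketch of my own, not something from the documentation) is to funnel every activation through a helper that checks the engine state first, so a stray call after an interruption fails gracefully instead of throwing:

import AVFoundation

// Minimal guard, assuming `engine` and `player` are the ones set up earlier.
func playIfPossible(_ player: AVAudioPlayerNode, on engine: AVAudioEngine) {
    guard engine.isRunning else {
        // The engine was stopped (e.g. by an audio interruption); restart it or
        // bail out here instead of letting play() throw an NSException.
        return
    }
    player.play()
}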
My current strategy is as follows. For the reasons discussed earlier, I'm not using multithreading. Instead, I'm doing everything I can to reduce the number of calls to play(), both overall and per-frame. This includes, among other things, only supporting stereo audio (for various reasons, supporting both mono and stereo can lead to more calls to play(), which is undesirable).
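In practice this mostly means keeping each player node active and only scheduling buffers onto it, rather than stopping and restarting it for every sound. A rough sketch of that pattern (again my own, assuming the node stays attached, connected, and playing):

import AVFoundation

// Pay the cost of play() once up front, then trigger sounds by scheduling buffers.
func prepare(_ player: AVAudioPlayerNode) {
    player.play() // the one expensive call
}

func trigger(_ buffer: AVAudioPCMBuffer, on player: AVAudioPlayerNode) {
    // scheduleBuffer does not reactivate the node, so the per-call cost of
    // play() is avoided; the sound starts on an upcoming render cycle.
    player.scheduleBuffer(buffer, completionHandler: nil)
}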
Lastly, I also investigated alternatives to AVAudioEngine. OpenAL is still supported on iOS, but is deprecated. A custom implementation using low-level APIs such as Audio Queue Services or Audio Units would be a possibility, but would be non-trivial. I've also looked at some open-source solutions, but the options I looked at use AVAudioEngine under the hood themselves and therefore suffer from the same problems, and/or have other shortcomings or limitations of their own. Of course there are also commercial options available, which may provide a solution for some developers.

Related

ARSession and Recording Video

I’m manually writing a video recorder. Unfortunately it’s necessary if you want to record video and use ARKit at the same time. I’ve got most of it figured out, but now I need to optimize it a bit because my phone gets pretty hot running ARKit, Vision and this recorder all at once.
To make the recorder, you need to use an AVAssetWriter with an AVAssetWriterInput (and AVAssetWriterInputPixelBufferAdaptor). The input has an isReadyForMoreMediaData property you need to check before you can write another frame. I'm recording in real time (or as close to it as possible).
Right now, when ARKit.ARSession gives me a new frame, I immediately pass it to the AVAssetWriterInput. What I want to do is add it to a queue, and have a loop check whether there are samples available to write. For the life of me I can't figure out how to do that efficiently.
I want to just run a while loop like this, but it seems like it would be a bad idea:
func startSession() {
    // …
    while isRunning {
        guard !pixelBuffers.isEmpty && writerInput.isReadyForMoreMediaData else {
            continue
        }
        // process sample
    }
}
Can I run this on a separate thread from the ARSession.delegateQueue? I don't want to run into issues with CVPixelBuffers from the camera being retained for too long.
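One direction I'm considering (sketch only; the class and queue names are placeholders, and I'd still need to verify whether requestMediaDataWhenReady(on:using:) is appropriate for a real-time ARKit source) is to let the writer input pull frames itself instead of spinning:

import AVFoundation

// Sketch with assumed names: the writer input pulls frames when it is ready,
// instead of the app spinning in a while loop.
final class FrameWriter {
    private let writerQueue = DispatchQueue(label: "frame.writer") // serial
    private var pendingBuffers: [(CVPixelBuffer, CMTime)] = []     // only touched on writerQueue

    func start(writerInput: AVAssetWriterInput,
               adaptor: AVAssetWriterInputPixelBufferAdaptor) {
        writerInput.requestMediaDataWhenReady(on: writerQueue) { [weak self] in
            guard let self = self else { return }
            // Invoked whenever isReadyForMoreMediaData becomes true; drain what we have.
            while writerInput.isReadyForMoreMediaData, !self.pendingBuffers.isEmpty {
                let (buffer, time) = self.pendingBuffers.removeFirst()
                if !adaptor.append(buffer, withPresentationTime: time) {
                    // Handle the write failure (e.g. inspect the asset writer's error).
                    break
                }
            }
        }
    }

    // Called from the ARSession delegate: enqueue and return quickly so the camera's
    // CVPixelBuffer isn't held on the capture thread longer than necessary.
    func enqueue(_ buffer: CVPixelBuffer, at time: CMTime) {
        writerQueue.async { self.pendingBuffers.append((buffer, time)) }
    }
}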

Timing accuracy with Swift using GCD dispatch_after

I'm trying to create a metronome for iOS in Swift. I'm using a GCD dispatch queue to time an AVAudioPlayer. The variable machineDelay is being used to time the player, but it's running slower than the time I'm asking of it.
For example, if I ask for a delay of 1sec, it plays at 1.2sec. 0.749sec plays at about 0.92sec, and 0.5sec plays at about 0.652sec. I could try to compensate by adjusting for this discrepancy but I feel like there's something I'm missing here.
If there's a better way to do this altogether, please give suggestions. This is my first personal project so I welcome ideas.
Here are the various functions that should apply to this question:
func milliseconds(beats: Int) -> Double {
    // Note: despite the name, this returns seconds per beat (60 / BPM).
    let ms = (60 / Double(beats))
    return ms
}

func audioPlayerDidFinishPlaying(player: AVAudioPlayer, successfully flag: Bool) {
    if self.playState == false {
        return
    }
    playerPlay(playerTick, delay: NSTimeInterval(milliseconds(bpm)))
}

func playerPlay(player: AVAudioPlayer, delay: NSTimeInterval) {
    let machineDelay: Int64 = Int64((delay - player.duration) * Double(NSEC_PER_SEC))
    dispatch_after(dispatch_time(DISPATCH_TIME_NOW, machineDelay), dispatch_get_main_queue(), { () -> Void in
        player.play()
    })
}
I have never really done anything with sound on iOS but I can tell you why you are getting those inconsistent timings.
What happens when you use dispatch_after() is that some timer is set somewhere in the OS and at some point soon after it expires, it puts your block on the queue. "at some point after" is going to be short, but depending on what the OS is doing, it will almost certainly not be close to zero.
The main queue is serviced by the main thread using the run loop. This means your task to play the sound is competing for use of the CPU with all the UI functionality. This means that the chance of it playing the sound immediately is quite low.
Finally, the completion handler will fire at some short time after the sound finishes playing but not necessarily straight away.
All of these little delays add up to the latency you are seeing. Unfortunately, depending on what the device is doing, that latency can vary. This is never going to work for something that needs precise timings.
There are, I think, a couple of ways to achieve what you want. However, audio programming is beyond my area of expertise. You probably want to start by looking at Core Audio. My five minutes of research suggests either Audio Queue Services or OpenAL, but those five minutes are literally everything I know about sound on iOS.
dispatch_after is not intended for sample-accurate callbacks.
If you are writing audio applications, there is no way to escape it: you need to implement some Core Audio code in one way or another.
Core Audio will "pull" a specific count of samples from your render callback at a known sample rate, so exact timing falls out of counting samples. Do the math (figuratively ;)
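For what it's worth, since the question is Swift on iOS: one relatively accessible alternative to writing raw Core Audio is to schedule each click at an explicit sample time on an AVAudioPlayerNode, which avoids dispatch_after entirely. A rough, untested sketch (assumes a 44.1 kHz output and uses a crudely generated click buffer):

import AVFoundation

// Sketch: schedule clicks at exact sample positions instead of using timers.
let engine = AVAudioEngine()
let player = AVAudioPlayerNode()
let format = AVAudioFormat(standardFormatWithSampleRate: 44100, channels: 1)!

// A 10 ms buffer whose first millisecond is a full-ish amplitude burst (a crude click).
let clickFrames: AVAudioFrameCount = 441
let click = AVAudioPCMBuffer(pcmFormat: format, frameCapacity: clickFrames)!
click.frameLength = clickFrames
for i in 0..<Int(clickFrames) {
    click.floatChannelData![0][i] = i < 44 ? 0.8 : 0.0
}

engine.attach(player)
engine.connect(player, to: engine.mainMixerNode, format: format)
try! engine.start()

// Beat k begins at k * samplesPerBeat samples on the player's timeline, so the
// spacing is exact in samples regardless of what the main thread is doing.
let bpm = 120.0
let samplesPerBeat = AVAudioFramePosition(44100.0 * 60.0 / bpm)
for beat in 0..<16 {
    let when = AVAudioTime(sampleTime: AVAudioFramePosition(beat) * samplesPerBeat, atRate: 44100)
    player.scheduleBuffer(click, at: when, options: [], completionHandler: nil)
}

player.play()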

Memory leak: steady increase in memory usage with simple device motion logging

Consider this simple Swift code that logs device motion data to a CSV file on disk.
let motionManager = CMMotionManager()
var handle: NSFileHandle? = nil

override func viewDidLoad() {
    super.viewDidLoad()
    let documents = NSSearchPathForDirectoriesInDomains(.DocumentDirectory, .UserDomainMask, true)[0] as NSString
    let file = documents.stringByAppendingPathComponent("/data.csv")
    NSFileManager.defaultManager().createFileAtPath(file, contents: nil, attributes: nil)
    handle = NSFileHandle(forUpdatingAtPath: file)
    motionManager.startDeviceMotionUpdatesToQueue(NSOperationQueue.currentQueue(), withHandler: { (data, error) in
        let data_points = [data.timestamp, data.attitude.roll, data.attitude.pitch, data.attitude.yaw,
                           data.userAcceleration.x, data.userAcceleration.y, data.userAcceleration.z,
                           data.rotationRate.x, data.rotationRate.y, data.rotationRate.z]
        let line = ",".join(data_points.map { $0.description }) + "\n"
        let encoded = line.dataUsingEncoding(NSUTF8StringEncoding)!
        self.handle!.writeData(encoded)
    })
}
I've been stuck on this for days. There appears to be a memory leak, as memory consumption steadily increases until the OS suspends the app for exceeding resources.
It's critical that this app be able to run for long periods without interruption. Some notes:
I've tried using NSOutputStream and a CSV-writing library (CHCSVParser), but the issue is still present
Executing the logging code asynchronously (wrapping startDeviceMotionUpdatesToQueue in dispatch_async) does not remove the issue
Performing the sensor data processing in a background NSOperationQueue does fix the issue (only when maxConcurrentOperationCount >= 2). However, that causes concurrency issues in file writing: the output file is garbled with lines intertwined between each other.
The issue does not seem to appear when logging accelerometer data only, but does seem to appear when logging multiple sensors (e.g. accelerometer + gyroscope). Perhaps there's a threshold of file writing throughput that triggers this issue?
The memory spikes seem to be spaced out at roughly 10 second intervals (steps in the above graph). Perhaps that's indicative of something? (could be an artifact of the memory instrumentation infrastructure, or perhaps it's garbage collection)
Any pointers? I've tried to use Instruments, but I don't have the skills to use it effectively. It seems that the exploding memory usage is caused by __NSOperationInternal. Here's a sample Instruments trace.
Thank you.
First, see this answer of mine:
https://stackoverflow.com/a/28566113/341994
You should not be looking at the Memory graphs in the debugger; believe only what Instruments tells you. Debug builds and Release builds are memory-managed very differently in Swift.
Second, if there is still trouble, try wrapping the interior of your handler in an autoreleasepool closure. I do not expect that that would make a difference, however (as this is not a loop), and I do not expect that it will be necessary, as I suspect that using Instruments will reveal that there was never any problem in the first place. However, the autoreleasepool call will make sure that autoreleased objects are not given a chance to accumulate.
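Concretely, that wrapping would look roughly like this (shown in current Swift spelling rather than the question's older syntax; the file-handling names mirror the question):

import CoreMotion
import Foundation

// Sketch of the suggested change: drain autoreleased temporaries on every
// callback instead of letting them accumulate across updates.
let motionManager = CMMotionManager()
let url = FileManager.default.urls(for: .documentDirectory, in: .userDomainMask)[0]
    .appendingPathComponent("data.csv")
FileManager.default.createFile(atPath: url.path, contents: nil, attributes: nil)
let handle = try? FileHandle(forWritingTo: url)

motionManager.startDeviceMotionUpdates(to: .main) { data, _ in
    guard let data = data else { return }
    autoreleasepool {
        let values = [data.timestamp,
                      data.attitude.roll, data.attitude.pitch, data.attitude.yaw,
                      data.userAcceleration.x, data.userAcceleration.y, data.userAcceleration.z,
                      data.rotationRate.x, data.rotationRate.y, data.rotationRate.z]
        let line = values.map { String($0) }.joined(separator: ",") + "\n"
        handle?.write(line.data(using: .utf8)!)
    }
}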

Call to CFReadStreamRead stops execution in thread

NB: The entire code base for this project is so large that posting any meaningful amount would render this question too localised. I have tried to distil the code down to the bare essentials. I'm not expecting anyone to solve my problems directly, but I will upvote any answers I find helpful or intriguing.
This project uses a modified version of AudioStreamer to playback audio files that are saved to locally to the device (iPhone).
The stream is set up and scheduled on the current loop using this code (unaltered from the standard AudioStreamer project as far as I know):
CFStreamClientContext context = {0, self, NULL, NULL, NULL};
CFReadStreamSetClient(stream,
                      kCFStreamEventHasBytesAvailable | kCFStreamEventErrorOccurred | kCFStreamEventEndEncountered,
                      ASReadStreamCallBack,
                      &context);
CFReadStreamScheduleWithRunLoop(stream, CFRunLoopGetCurrent(), kCFRunLoopCommonModes);
The ASReadStreamCallBack calls:
- (void)handleReadFromStream:(CFReadStreamRef)aStream
                   eventType:(CFStreamEventType)eventType
On the AudioStreamer object, this all works fine until the stream is read using this code:
BOOL hasBytes = NO; //Added for debugging
hasBytes = CFReadStreamHasBytesAvailable(stream);
length = CFReadStreamRead(stream, bytes, kAQDefaultBufSize);
hasBytes is YES, but when CFReadStreamRead is called, execution stops. The app does not crash; it just stops executing. Any breakpoints below the CFReadStreamRead call are not hit, and ASReadStreamCallBack is not called again.
I am at a loss as to what might cause this; my best guess is that the thread is being terminated, but the hows and whys are why I'm asking SO.
Has anyone seen this behaviour before? How can I track it down and ideas on how I might solve it will be very much welcome!
Additional Info Requested via Comments
This is 100% repeatable
CFReadStreamHasBytesAvailable was added by me for debugging but removing it has no effect
First, I assume that CFReadStreamScheduleWithRunLoop() is running on the same thread as CFReadStreamRead()?
Is this thread processing its runloop? Failure to do this is my main suspicion. Do you have a call like CFRunLoopRun() or equivalent on this thread?
Typically there is no reason to spawn a separate thread for reading streams asynchronously, so I'm a little confused about your threading design. Is there really a background thread involved here? Also, typically CFReadStreamRead() would be in your client callback, invoked when you receive the kCFStreamEventHasBytesAvailable event (which it appears to be in the linked code), but you're suggesting ASReadStreamCallBack is never called. How have you modified AudioStreamer?
It is possible that the stream pointer is just corrupt in some way. CFReadStreamRead should certainly not block if bytes are available (it certainly would never block for more than a few milliseconds for local files). Can you provide the code you use to create the stream?
Alternatively, CFReadStreams send messages asynchronously but it is possible (but not likely) that it's blocking because the runloop isn't being processed.
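To illustrate what I mean by processing the run loop, here is a minimal Swift sketch (the file path and stream setup are placeholders, since I haven't seen your creation code): the thread that schedules the stream also has to run its run loop, or nothing scheduled on it is ever serviced.

import Foundation

// Sketch: schedule the stream on the worker thread's own run loop and then run
// that run loop; without CFRunLoopRun(), the stream's callbacks never fire.
let fileURL = URL(fileURLWithPath: "/tmp/audio.mp3") // illustrative path
let streamThread = Thread {
    guard let stream = CFReadStreamCreateWithFile(kCFAllocatorDefault, fileURL as CFURL) else { return }
    // (CFReadStreamSetClient with the ASReadStreamCallBack would also go here.)
    CFReadStreamScheduleWithRunLoop(stream, CFRunLoopGetCurrent(), CFRunLoopMode.commonModes)
    if CFReadStreamOpen(stream) == false { return }
    CFRunLoopRun() // services the scheduled stream until the run loop is stopped
}
streamThread.start()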
If you prefer, I've uploaded my AudioPlayer inspired by Matt's AudioStreamer hosted at https://code.google.com/p/audjustable/. It supports local files (as well as HTTP). I think it does what you wanted (stream files from more than just HTTP).

Most Efficient Thread Implementation on iOS

I've been trying to find a thread implementation on iOS that suits my project's needs. So far I've failed to find an acceptable solution.
My Problem :
I need to read audio from up to 16 mp3 files on disk simultaneously.
What I have tried:
First off, I tried using an NSTimer which repeats. The timer was not fast enough and the audio would drop out when I played any more than 4 files.
Second, I tried using an NSThread with a priority of 1. The audio just about played correctly, but the UI became wholly unresponsive.
Finally, I tried dispatching blocks using GCD in my callback whenever I needed more audio from a file. Again the audio would drop out, but the UI was responsive.
In all three of the examples above, I also tried dividing up the workload by creating 4 threads and having each thread handle 4 audio files, but this caused really bad synchronization problems with the audio.
Are there other threading options that I can try, or do the above sum up what iOS has to offer?
Do you think that reading from 16 files on disk simultaneously is too much of a strain for the iOS system?
Is there a limit to how many threads iOS can handle?
To avoid making my question sound like a discussion I will summarize as follows.
What iOS threading technology is best suited to work that is called very frequently, completes quickly, can be easily synchronized, and will not impact UI responsiveness?
Any anecdotal advice from solving a similar audio programming problem is also appreciated.
EDIT 1
This is some stripped-down code I modelled on a suggestion from an SO user. All I'm after is solid advice on what setup is going to work best for me. Since my last post I tried NSThread, and it does seem to leave me with audio dropouts. I also tried using NSCondition so that my thread isn't wasting processing power when it's not filling buffers, but using those locks seems like a really bad idea for audio callbacks.
OSStatus channelMixerCallback(void *inRefCon,
                              AudioUnitRenderActionFlags *ioActionFlags,
                              const AudioTimeStamp *inTimeStamp,
                              UInt32 inBusNumber,
                              UInt32 inNumberFrames,
                              AudioBufferList *ioData)
{
    // Per-bus AudioInfo lookup (variable names approximate the stripped-down original).
    AudioInfo *myaudio = audioInfoForBus[inBusNumber];
    if (myaudio.needsbufferfill == YES) {
        [refToSelf performSelector:@selector(GetAudioForItem:)
                          onThread:engineDescribtion.producerthread
                        withObject:myaudio
                     waitUntilDone:NO];
    }
    return noErr;
}

- (void)startthread
{
    engineDescribtion.producerthread = [[NSThread alloc] initWithTarget:self
                                                               selector:@selector(dosinglerunloop)
                                                                 object:nil];
    [engineDescribtion.producerthread start];
}

- (void)dosinglerunloop
{
    BOOL isstarted = YES;
    NSAutoreleasePool *pool = [[NSAutoreleasePool alloc] init];
    do {
        [[NSRunLoop currentRunLoop] addPort:[NSMachPort port] forMode:NSDefaultRunLoopMode];
        [[NSRunLoop currentRunLoop] runMode:NSDefaultRunLoopMode beforeDate:[NSDate distantFuture]];
    } while (isstarted);
    [pool release];
}

- (void)GetAudioForItem:(AudioInfo *)info
{
    // Use the data in AudioInfo to seek to the
    // correct place in the file
    // and extract audio into buffers.
}
Problem 0:
Your audio render callbacks should never lock. Example: Creating a single heap allocation will lock.
Your threads will all compete for the hardware. To keep the UI responsive, you should not have many highest priority threads (the audio playback should be the only one). Consider the number of cores, disks, etc you have available in your design.
If you still have issues once you have correctly fixed that: Loading short files into memory can offload some of the disk's demand to memory.
You should profile to determine what is actually the problem: It may be CPU or I/O. You may be simply missing your render deadlines and equating audio dropouts to "can't read fast enough". If you are using a lot of CPU, then Disk I/O may not be the problem. Decoding and performing sample rate conversion on 16 mp3 files can require relatively high CPU (as one example of the things you need to look for).
pthreads will be fastest, but will require some work to implement right. That really doesn't matter at this point, because there seem to be a few higher-level issues to sort out first, and there are multiple APIs which should handle the task just fine.
Your program should be smart enough to detect when read buffers cannot be filled fast enough.
You are pre filling the buffers, correct?
Presumably, you are using a run loop?
Well, there's only one disk… So any solution that requires 16 simultaneous reads might be an issue. (Depending on if you're I/O bound or CPU bound.)
NSTimer is not going to get you consistent results.
I don't see any reason why NSThread would kill UI responsiveness, perhaps you had a bug.
I'm going with this system being disk-bound because 16 channels of MP3 is no problem CPU-wise on modern machines - how much rattling is coming from your box? I would probably be tempted to use just one thread to fill the empty buffers, with each buffer sized to accommodate (averageDiskLatency*(bytes/msec)*16*bodgeFactor) bytes of audio stream (bodgeFactor meaning rounded up to an 8K boundary, plus a few extra 8Ks). Whenever threads/callbacks/whatever empty a buffer and so start on the other one, they should queue the empty buffer to the disk-read thread (a thread-safe producer-consumer queue) to get it filled up again. Probably, each buffer should include a 'fileControl' instance containing the fileSpec, file handle, state variable for EOF etc., error string space, and anything else needed for the read thread to work, as well as the buffer space itself.
This design allows the disk to read nice, large chunks without being annoyingly preempted half-way through reads and being avoidably forced to move lumps of metal too often.
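To make the handoff concrete, here's a rough Swift sketch of the kind of thread-safe producer-consumer queue I mean (names are illustrative; and, per the earlier answer, a real render callback would want a lock-free variant rather than this condition-based one):

import Foundation

// Illustrative producer-consumer queue: playback code hands back emptied buffers,
// and a single disk-read thread blocks here until there is work to do.
final class EmptyBufferQueue<Buffer> {
    private var buffers: [Buffer] = []
    private let condition = NSCondition()

    // Producer side: called when a buffer has been drained and needs refilling.
    func enqueue(_ buffer: Buffer) {
        condition.lock()
        buffers.append(buffer)
        condition.signal()      // wake the disk-read thread
        condition.unlock()
    }

    // Consumer side: the disk-read thread blocks until an empty buffer arrives.
    func dequeue() -> Buffer {
        condition.lock()
        while buffers.isEmpty {
            condition.wait()
        }
        let buffer = buffers.removeFirst()
        condition.unlock()
        return buffer
    }
}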
Rgds,
Martin
PS - If you haven't got one already, get an SSD - works wonders for multi-channel audio/video latency.
