Most Efficient Thread Implementation on iOS - ios

I've been trying to find a thread implementation on iOS that suits my project's needs. So far I've failed to find an acceptable solution.
My Problem:
I need to read audio from up to 16 mp3 files on disk simultaneously.
What I have tried:
First off, I tried using an NSTimer which repeats. The timer was not fast enough and the audio would drop out when I played any more than 4 files.
Second, I tried using an NSThread with a priority of 1. The audio just about played correctly, but the UI became wholly unresponsive.
Finally, I tried dispatching blocks using GCD in my callback whenever I needed more audio from a file. Again the audio would drop out, but the UI was responsive.
In all three of the examples above I also tried dividing up the workload by creating 4 threads and having each thread handle 4 audio files, but this caused really bad synchronization problems with the audio.
Are there other threading options that I can try, or do the above sum up what iOS has to offer?
Do you think that reading from 16 files on disk simultaneously is too much of a strain for the iOS system?
Is there a limit to how many threads iOS can handle?
To avoid making my question sound like a discussion I will summarize as follows.
Which iOS threading technology is best suited to work that is called very frequently, completes quickly, can be easily synchronized, and will not impact UI responsiveness?
Any anecdotal advice from solving a similar audio programming problem is also appreciated.
EDIT 1
This is some stripped-down code I modelled on a suggestion from an SO user. All I'm after is solid advice on what setup is going to work best for me. Since my last post I tried NSThread and it does seem to leave me with audio dropouts. Also, I tried using NSCondition so that my thread isn't wasting processing power when it's not filling a buffer, but using these locks seems like a really bad idea in audio callbacks.
OSStatus channelMixerCallback(void *inRefCon,
                              AudioUnitRenderActionFlags *ioActionFlags,
                              const AudioTimeStamp *inTimeStamp,
                              UInt32 inBusNumber,
                              UInt32 inNumberFrames,
                              AudioBufferList *ioData)
{
    AudioInfo *myaudio = audioInfos[inBusNumber]; // "audioInfos" stands in for however the per-bus AudioInfo objects are stored
    if (myaudio.needsbufferfill == YES)
    {
        [refToSelf performSelector:@selector(GetAudioForItem:)
                          onThread:engineDescribtion.producerthread
                        withObject:myaudio
                     waitUntilDone:NO];
    }
    return noErr;
}
- (void)startthread
{
    engineDescribtion.producerthread = [[NSThread alloc] initWithTarget:self
                                                               selector:@selector(dosinglerunloop)
                                                                 object:nil];
    [engineDescribtion.producerthread start];
}
- (void)dosinglerunloop
{
    BOOL isstarted = YES;
    NSAutoreleasePool *pool = [[NSAutoreleasePool alloc] init];
    // Add a dummy port once so the runloop has a source and can service performSelector:onThread:
    [[NSRunLoop currentRunLoop] addPort:[NSMachPort port] forMode:NSDefaultRunLoopMode];
    do {
        [[NSRunLoop currentRunLoop] runMode:NSDefaultRunLoopMode beforeDate:[NSDate distantFuture]];
    } while (isstarted);
    [pool release];
}
- (void)GetAudioForItem:(AudioInfo *)info
{
    // Use the data in AudioInfo to seek to the
    // correct place in the file
    // and extract audio into the buffers
}

Problem 0:
Your audio render callbacks should never lock. Example: Creating a single heap allocation will lock.
Your threads will all compete for the hardware. To keep the UI responsive, you should not have many highest-priority threads (the audio playback should be the only one). Consider the number of cores, disks, etc. you have available in your design.
If you still have issues once you have correctly fixed that: Loading short files into memory can offload some of the disk's demand to memory.
You should profile to determine what is actually the problem: It may be CPU or I/O. You may be simply missing your render deadlines and equating audio dropouts to "can't read fast enough". If you are using a lot of CPU, then Disk I/O may not be the problem. Decoding and performing sample rate conversion on 16 mp3 files can require relatively high CPU (as one example of the things you need to look for).
pthreads will be fastest, but will require some work to implement correctly. That really doesn't matter at this point, because there seem to be a few higher-level issues to resolve first, and there are multiple APIs that should handle the task just fine.
Your program should be smart enough to detect when read buffers cannot be filled fast enough.
You are pre-filling the buffers, correct?
Presumably, you are using a run loop?
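To make the "render callbacks should never lock" point concrete, here is a minimal sketch (not taken from the question's code) of a single-producer, single-consumer ring buffer: the producer thread pre-fills it from disk, and the render callback only copies out whatever is already there, writing silence on underrun instead of waiting. The SPSCRing name and the int16 sample format are illustrative assumptions.
#include <stdatomic.h>
#include <stdint.h>
#include <string.h>
typedef struct {
    int16_t        *samples;   // preallocated before the audio unit starts
    size_t          capacity;  // number of samples, must be a power of two
    _Atomic size_t  head;      // advanced only by the render callback (consumer)
    _Atomic size_t  tail;      // advanced only by the producer (disk) thread
} SPSCRing;
// Called from the render callback: no locks, no allocations, no blocking.
static size_t SPSCRingRead(SPSCRing *ring, int16_t *dst, size_t count)
{
    size_t head = atomic_load_explicit(&ring->head, memory_order_relaxed);
    size_t tail = atomic_load_explicit(&ring->tail, memory_order_acquire);
    size_t available = tail - head;
    size_t n = (available < count) ? available : count;
    for (size_t i = 0; i < n; i++) {
        dst[i] = ring->samples[(head + i) & (ring->capacity - 1)];
    }
    if (n < count) {
        memset(dst + n, 0, (count - n) * sizeof(int16_t));  // silence on underrun
    }
    atomic_store_explicit(&ring->head, head + n, memory_order_release);
    return n;   // number of real samples delivered
}
// Called from the producer thread that reads and decodes the files.
static size_t SPSCRingWrite(SPSCRing *ring, const int16_t *src, size_t count)
{
    size_t tail = atomic_load_explicit(&ring->tail, memory_order_relaxed);
    size_t head = atomic_load_explicit(&ring->head, memory_order_acquire);
    size_t space = ring->capacity - (tail - head);
    size_t n = (space < count) ? space : count;
    for (size_t i = 0; i < n; i++) {
        ring->samples[(tail + i) & (ring->capacity - 1)] = src[i];
    }
    atomic_store_explicit(&ring->tail, tail + n, memory_order_release);
    return n;
}
One ring per file keeps the render callback independent of disk timing; the producer simply tops each ring up whenever it falls below some threshold.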

Well, there's only one disk… So any solution that requires 16 simultaneous reads might be an issue. (Depending on whether you're I/O bound or CPU bound.)
NSTimer is not going to get you consistent results.
I don't see any reason why NSThread would kill UI responsiveness; perhaps you had a bug.

I'm going with this system being disk-bound, because 16 channels of MP3 is no problem CPU-wise on modern machines - how much rattling is coming from your box? I would probably be tempted to use just one thread to fill the empty buffers, with each buffer sized to accommodate (averageDiskLatency * (bytes/msec) * 16 * bodgeFactor) bytes of audio stream (bodgeFactor meaning rounded up to an 8K boundary, plus a few extra 8Ks). Whenever threads/callbacks/whatever empty a buffer and so start on the other one, they should queue the empty buffer to the disk-read thread (via a thread-safe producer-consumer queue) to get it filled up again. Each buffer should probably include a 'fileControl' instance containing the fileSpec, file handle, state variables for EOF etc., error string space and anything else needed for the read thread to work, as well as the buffer space itself.
This design allows the disk to read nice, large chunks without being annoyingly preempted half-way through reads and being avoidably forced to move lumps of metal too often.
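A rough sketch of that sizing sum, using placeholder figures rather than measurements (20 ms average disk latency, 320 kbps streams):
// Placeholder numbers: tune them from real measurements on the target device.
double averageDiskLatencyMs = 20.0;
double bytesPerMsPerFile    = 320000.0 / 8.0 / 1000.0;   // ~40 bytes/ms of compressed stream
int    numberOfFiles        = 16;
size_t rawBytes   = (size_t)(averageDiskLatencyMs * bytesPerMsPerFile * numberOfFiles);
size_t eightK     = 8 * 1024;
// "bodgeFactor": round up to an 8K boundary, then add a few spare 8K blocks.
size_t bufferSize = ((rawBytes + eightK - 1) / eightK) * eightK + 4 * eightK;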
Rgds,
Martin
PS - If you haven't got one already, get an SSD - works wonders for multi-channel audio/video latency.

Related

iOS - AVAudioPlayerNode.play() execution is very slow

I'm using AVAudioEngine for audio in an iOS game application. A problem I've encountered is that AVAudioPlayerNode.play() takes a long time to execute, which can be a problem in real-time applications such as games.
play() just activates the player node - you don't have to call it every time you play a sound. As such, it doesn't have to be called that often, but it does have to be called occasionally, such as to activate the player initially, or after it's been deactivated (which happens in some situations). Even if only called occasionally, the long execution times can be a problem, especially if you need to call play() on multiple players at once.
The execution time for play() seems to be proportional to the value of AVAudioSession.ioBufferDuration, which you can request to be changed using AVAudioSession.setPreferredIOBufferDuration(). Here's some code I'm using to test this:
import AVFoundation
import UIKit

class ViewController: UIViewController {
    private let engine = AVAudioEngine()
    private let player = AVAudioPlayerNode()
    private let ioBufferSize = 1024.0 // Or 256.0

    override func viewDidLoad() {
        super.viewDidLoad()
        let audioSession = AVAudioSession.sharedInstance()
        try! audioSession.setPreferredIOBufferDuration(ioBufferSize / 44100.0)
        try! audioSession.setActive(true)
        engine.attach(player)
        engine.connect(player, to: engine.mainMixerNode, format: nil)
        try! engine.start()
        print("IO buffer duration: \(audioSession.ioBufferDuration)")
    }

    override func touchesBegan(_ touches: Set<UITouch>, with event: UIEvent?) {
        if player.isPlaying {
            player.stop()
        } else {
            let startTime = CACurrentMediaTime()
            player.play()
            let endTime = CACurrentMediaTime()
            print("\(endTime - startTime)")
        }
    }
}
Here are some sample timings for play() that I got using a buffer size of 1024 (which I believe is the default):
0.0218
0.0147
0.0211
0.0160
0.0184
0.0194
0.0129
0.0160
Here are some sample timings using a buffer size of 256:
0.0014
0.0029
0.0033
0.0023
0.0030
0.0039
0.0031
0.0032
As you can see above, for a buffer size of 1024, execution times tend to be in the 15-20 ms range (around a full frame at 60 FPS). With a buffer size of 256, it's around 3 ms - not as bad, but still costly when you only have ~17 ms per frame to work with.
This is on an iPad Mini 2 running iOS 12.4.2. This is obviously an old device, but the results I see on the simulator seem similarly proportional, so it seems to have more to do with the buffer size and the behavior of the function itself than with the hardware being used. I don't know what's going on under the hood, but it seems possible that play() blocks until the beginning of the next audio cycle, or something like that.
Requesting a lower buffer size seems like a partial solution, but there are some potential drawbacks. According to the documentation here, lower buffer sizes can mean more disk access when streaming from a file, and irrespective of that, the request may not be honored at all. Also, here, someone reports playback problems related to low buffer sizes. Taking all this into account, I'm disinclined to pursue this as a solution.
That leaves me with execution times for play() in the 15-20 ms range, which typically means a missed frame at 60 FPS. If I arrange things so that only one call to play() is made at a time, and only infrequently, maybe it won't be noticeable, but it's not ideal.
I've searched for information and asked about this in other places, but it seems either not many people are encountering this behavior in practice, or it isn't an issue for them.
AVAudioEngine is intended for use in real-time applications, so if I'm right that AVAudioPlayerNode.play() blocks for a significant amount of time proportional to the buffer size, that seems like a design issue. I realize this probably isn't an issue many are dealing with, but I'm posting here to ask if anyone has encountered this specific issue with AVAudioEngine, and if so, if there's any insight, suggestions, or workarounds anyone can offer.
I've investigated this fairly thoroughly. Here are my findings.
Having now tested the behavior on a variety of devices and iOS versions (including the latest version at the time of this writing, 13.2), and having had others test it as well, my current conclusion is that the long execution times for AVAudioPlayerNode.play() are inherent and that there's no obvious workaround. As noted in my original post, the execution times can be reduced by requesting a lower buffer duration, but as discussed earlier, this doesn't seem like a viable solution.
I heard from a credible source that calling play() on a background thread (e.g. using Grand Central Dispatch) should be safe, and indeed this would be one way to solve the problem. However, although it may technically be safe to call play() (or other AVAudioEngine-related functions) on different threads, I'm skeptical as to whether this is a good idea (further explanation below).
The documentation doesn't state this as far as I can tell, but AVAudioEngine will throw NSExceptions under various circumstances, which, without special handling, will result in application termination in Swift.
One of the things that will cause an NSException to be thrown is if you call AVAudioPlayerNode.play() while the engine is not running. Obviously if you only have your own code to worry about, you can take steps to ensure this doesn't occur.
However, iOS itself will sometimes stop the engine of its own accord, for example when an audio interruption occurs. If you call play() subsequent to that and before restarting the engine, an NSException will be thrown. It's fairly easy to avoid this mistake if all your calls to play() are on the main thread, but multithreading complicates the issue and seems like it could introduce the risk of accidentally calling play() after the engine has been stopped. Although there may be ways to work around this, multithreading seems to introduce undesirable complexity and fragility, so I've opted not to pursue it.
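As a minimal sketch of that guard (written here in Objective-C, and assuming an engine/player pair like the one in the question), the check amounts to:
#import <AVFoundation/AVFoundation.h>
// Sketch: never send -play to a stopped engine, which would raise the NSException described above.
- (void)playIfPossible:(AVAudioPlayerNode *)player withEngine:(AVAudioEngine *)engine
{
    if (!engine.isRunning) {
        NSError *error = nil;
        if (![engine startAndReturnError:&error]) {
            NSLog(@"Could not restart audio engine: %@", error);
            return;
        }
    }
    [player play];
}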
My current strategy is as follows. For the reasons discussed earlier, I'm not using multithreading. Instead, I'm doing everything I can to reduce the number of calls to play(), both overall and per-frame. This includes, among other things, only supporting stereo audio (for various reasons, supporting both mono and stereo can lead to more calls to play(), which is undesirable).
Lastly, I also investigated alternatives to AVAudioEngine. OpenAL is still supported on iOS, but is deprecated. A custom implementation using low-level APIs such as Audio Queue Services or Audio Units would be a possibility, but would be non-trivial. I've also looked at some open-source solutions, but the options I looked at use AVAudioEngine under the hood themselves and therefore suffer from the same problems, and/or have other shortcomings or limitations of their own. Of course there are also commercial options available, which may provide a solution for some developers.

Asynchronous NSStream I/O with GCD

I am working with an external device that I receive data from. I want to handle its data read/write queue asynchronously, in a thread.
I've got it mostly working: There is a class that simply manages the two streams, using the NSStreamDelegate to respond to incoming data, as well as responding to NSStreamEventHasSpaceAvailable for sending out data that's waiting in a buffer after having failed to be sent earlier.
This class, let's call it SerialIOStream, does not know about threads or GCD queues. Instead, its user, let's call it DeviceCommunicator, uses a GCD queue in which it initializes the SerialIOStream class (which in turn creates and opens the streams, scheduling them in the current runloop):
ioQueue = dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_DEFAULT, 0);
dispatch_async(ioQueue, ^{
    ioStreams = [[SerialIOStream alloc] initWithPath:@"/dev/tty.mydevice"];
    [[NSRunLoop currentRunLoop] run];
});
That way, the SerialIOStream's stream:handleEvent: method runs in that GCD queue, apparently.
However, this causes some problems. I believe I run into concurrency issues, up to getting crashes, mainly at the point of feeding pending data to the output stream. There's a critical part in the code where I pass the buffered output data to the stream, then see how much data was actually accepted into the stream, and then removing that part from my buffer:
NSInteger n = self.dataToWrite.length;
if (n > 0 && stream.hasSpaceAvailable) {
    NSInteger bytesWritten = [stream write:self.dataToWrite.bytes maxLength:n];
    if (bytesWritten > 0) {
        [self.dataToWrite replaceBytesInRange:NSMakeRange(0, bytesWritten) withBytes:NULL length:0];
    }
}
The above code can get called from two places:
From the user (DeviceCommunicator)
From the local stream:handleEvent: method, after being told that there's space in the output stream.
Those may be (well, surely are) running on separate threads, and therefore I need to make sure they do not run this code concurrently.
I thought I'd solve this by using the following code in DeviceCommunicator when sending new data out:
dispatch_async(ioQueue, ^{
    [ioStreams writeData:data];
});
(writeData adds the data to dataToWrite, see above, and then runs the above code that sends it to the stream.)
However, that doesn't work, apparently because ioQueue is a concurrent queue, which may decide to use any available thread, and therefore lead to a race condition when writeData gets called by the DeviceCommunicator while there's also a call to it from stream:handleEvent:, on separate threads.
So, I guess I am mixing expectations of threads (which I'm a bit more familiar with) into my apparent misunderstandings with GCD queues.
How do I solve this properly?
I could add an NSLock, protecting the writeData method with it, and I believe that would solve the issue in that place. But I am not so sure that that's how GCD is supposed to be used - I get the impression that'd be a kludge.
Shall I rather make a separate class, using its own serial queue, for accessing and modifying the dataToWrite buffer, perhaps?
I am still trying to grasp the patterns that are involved with this. Somehow, it looks like a classic producer / consumer pattern, but on two levels, and I'm not doing this right.
Long story, short: Don't cross the streams! (haha)
NSStream is a RunLoop-based abstraction (which is to say that it intends to do its work cooperatively on an NSRunLoop, an approach which pre-dates GCD). If you're primarily using GCD to support concurrency in the rest of your code, then NSStream is not an ideal choice for doing I/O. GCD provides its own API for managing I/O. See the section entitled "Managing Dispatch I/O" on this page.
If you want to continue to use NSStream, you can either do so by scheduling your NSStreams on the main thread RunLoop or you can start a dedicated background thread, schedule it on a RunLoop over there, and then marshal your data back and forth between that thread and your GCD queues. (...but don't do that; just bite the bullet and use dispatch_io.)
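A rough sketch of the dispatch_io route, reusing the device path and dataToWrite from the question (the queue label is made up, and error handling is omitted):
#include <fcntl.h>
#include <unistd.h>
dispatch_queue_t ioQueue = dispatch_queue_create("com.example.serial-io", DISPATCH_QUEUE_SERIAL);
int fd = open("/dev/tty.mydevice", O_RDWR | O_NONBLOCK);
dispatch_io_t channel = dispatch_io_create(DISPATCH_IO_STREAM, fd, ioQueue, ^(int error) {
    close(fd);   // the cleanup handler runs once the channel has been closed
});
// Reads arrive on ioQueue as data becomes available; no NSStreamDelegate or runloop needed.
dispatch_io_read(channel, 0, SIZE_MAX, ioQueue, ^(bool done, dispatch_data_t data, int error) {
    if (data != NULL) {
        // consume the incoming bytes here
    }
});
// Writes: wrap the pending bytes (the question's dataToWrite) in dispatch_data and hand them
// to the channel; GCD serializes the I/O internally, so no manual locking is required.
dispatch_data_t pending = dispatch_data_create(self.dataToWrite.bytes, self.dataToWrite.length,
                                               ioQueue, DISPATCH_DATA_DESTRUCTOR_DEFAULT);
dispatch_io_write(channel, 0, pending, ioQueue, ^(bool done, dispatch_data_t remaining, int error) {
    // inspect error / remaining if partial writes matter to you
});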

Where is my AVPlayer's memory, and how do I get it back?

I'm playing heaps of videos at the same time with AVPlayer. To reduce loading times, I'm storing the corresponding views in a NSCache.
This works fine until reaching a certain number of videos, from which the videos simply stop playing, or even appearing.
There's no error, log or memory warning. In particular, I'm listening to UIApplicationDidReceiveMemoryWarningNotification to clear the cache but this is never received.
If I remove the cache, all the videos play at expense of worse performance.
This makes me suspect that AVPlayer is using memory from a different process (which one?). And when that memory reaches a certain limit, new players cease to work.
Is this correct?
If so, is there a way to be notified when this magic limit is reached to take the appropriate measures (e.g., clear the cache) to ensure playback of other media?
Good news and bad news - good is you can probably fix the problem, bad is it takes work and is somewhat complex.
Root Problem
The reason you don't get notified early is that iOS does not find out that your app has exceeded its memory budget until it's almost too late, and then it immediately kills it. The problem has to do with the way iOS (and OS X) manage the file system cache. Normally, when files get opened, as you read the data, the file data gets transferred into a buffer in the Unified Buffer Cache (a term you can google for more info) - I'll call it the UBC from now on.
So suppose you have 10 open files, and you have read every file to the end, but have not closed the files. Well, all that data is sitting in the UBC. Now, if you close the files, the buffers are all freed. And technically the OS can purge these buffers too - only it seems that by the time it realizes memory is tight, it chooses to blow the app away first (and there may be valid reasons for it to do this). So imagine that your app is showing videos, and the way the videos get loaded is through the file system: the number of free buffers starts dropping. At some point iOS notices this, tracks down which app most of them belong to (yours), and wham, kills your app ASAP.
I hit this problem myself in an open source project I support, PhotoScrollerNetwork. Users started complaining that their app was getting terminated by the system, like yours, without any notification. I tried in vain to monitor the UBC (there are APIs on OS X to do so, but not on iOS). In the end I found a solution using a heuristic: monitor all your memory usage including the UBC, and don't exceed 50% of the total available iOS memory pool.
So (you might ask) - what is the Apple-approved way to solve this problem? Well, there is none. How do I know that? Because I had a half-hour-long discussion at WWDC 2012 with the Director of Core iOS in one of the labs (after getting ping-ponged around by others who had no idea what I was talking about). In the end, after I explained the above heuristic, he told me directly that the solution was probably as good as any he could think of. Without an API to directly monitor the UBC, you can only approximate its usage and adjust accordingly.
But, you say, I'm using NSCache - why doesn't the system account for the AVPlayer memory there? The reason is undoubtedly the UBC - an AVPlayer instance probably only consumes a few thousand K of memory itself; it's the open file to the video that is not accounted for by iOS.
Possible Solutions
1) If you can load the videos directly into an NSData object and keep that in the NSCache, you can most likely avoid the UBC issues mentioned above entirely. [I don't know enough about the AV system to know if you can do this.] In this case the system should be capable of purging memory when it needs to.
2) Continue using your original code, but add memory management to it. That is, when you create an AVPlayer instance, account for the size of the video in bytes and keep a running tally of all this memory. When you approach 50% of total device free memory, start purging old AVPlayers.
Code
For completeness, I've provided the relevant code from PhotoScrollerNetwork below. If you want more details you can peruse the project - however, it's quite complex, so expect to spend some time (it's doing JPEG decoding on the fly for massive images and writing tiles to the file system as the decode proceeds).
// Data Structure
typedef struct {
    size_t freeMemory;
    size_t usedMemory;
    size_t totlMemory;
    size_t resident_size;
    size_t virtual_size;
} freeMemory;
Early on in your app:
// ubc_threshold_ratio defaults to 0.5f
// Take a big chunk of either free memory or all memory
freeMemory fm = [self freeMemory:@"Initialize"];
float freeThresh = (float)fm.freeMemory * ubc_threshold_ratio;
float totalThresh = (float)fm.totlMemory * ubc_threshold_ratio;
size_t ubc_threshold = lrintf(MAX(freeThresh, totalThresh));
size_t ubc_usage = 0;
// Method on some class to monitor the memory pool
- (freeMemory)freeMemory:(NSString *)msg
{
    // http://stackoverflow.com/questions/5012886
    mach_port_t host_port;
    mach_msg_type_number_t host_size;
    vm_size_t pagesize;

    freeMemory fm = { 0, 0, 0, 0, 0 };

    host_port = mach_host_self();
    host_size = sizeof(vm_statistics_data_t) / sizeof(integer_t);
    host_page_size(host_port, &pagesize);

    vm_statistics_data_t vm_stat;

    if (host_statistics(host_port, HOST_VM_INFO, (host_info_t)&vm_stat, &host_size) != KERN_SUCCESS) {
        LOG(@"Failed to fetch vm statistics");
    } else {
        /* Stats in bytes */
        natural_t mem_used = (vm_stat.active_count +
                              vm_stat.inactive_count +
                              vm_stat.wire_count) * pagesize;
        natural_t mem_free = vm_stat.free_count * pagesize;
        natural_t mem_total = mem_used + mem_free;
        fm.freeMemory = (size_t)mem_free;
        fm.usedMemory = (size_t)mem_used;
        fm.totlMemory = (size_t)mem_total;

        struct task_basic_info info;
        if (dump_memory_usage(&info)) {
            fm.resident_size = (size_t)info.resident_size;
            fm.virtual_size = (size_t)info.virtual_size;
        }

#if MEMORY_DEBUGGING == 1
        LOG(@"%@: "
            "total: %u "
            "used: %u "
            "FREE: %u "
            "  [resident=%u virtual=%u]",
            msg,
            (unsigned int)mem_total,
            (unsigned int)mem_used,
            (unsigned int)mem_free,
            (unsigned int)fm.resident_size,
            (unsigned int)fm.virtual_size
        );
#endif
    }
    return fm;
}
When you open a video, add its size to ubc_usage, and when you close one, decrement it. When you want to open a new video, test ubc_usage against ubc_threshold, and if it exceeds that value you have to close something first.
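A sketch of that bookkeeping, assuming ubc_usage and ubc_threshold from the snippet above are reachable here and that purgeOldestPlayer is whatever eviction you implement:
// Before opening a new video, account for its on-disk size against the threshold.
- (BOOL)canOpenVideoAtPath:(NSString *)path
{
    NSDictionary *attrs = [[NSFileManager defaultManager] attributesOfItemAtPath:path error:NULL];
    size_t videoBytes = (size_t)[attrs fileSize];
    // Evict old players until this file fits under the threshold.
    while (ubc_usage + videoBytes > ubc_threshold && [self purgeOldestPlayer]) {
        // purgeOldestPlayer (assumed) should subtract the evicted file's size from ubc_usage
    }
    if (ubc_usage + videoBytes > ubc_threshold) {
        return NO;                  // still over budget; refuse to open another video
    }
    ubc_usage += videoBytes;        // subtract this again when the video is closed
    return YES;
}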
PS: you can try calling that freeMemory method at other times, and see, but in my case it hardly changes at all when files get opened - the system seems to consider the whole UBC as "free", since it could purge it if it needed to (I guess).
If you're throwing all of these videos into an NSCache, you have to be prepared for the cache to throw away items when it decides they are consuming too much memory. From the NSCache documentation:
The NSCache class incorporates various auto-removal policies, which ensure that it does not use too much of the system's memory. The system automatically carries out these policies if memory is needed by other applications. When invoked, these policies remove some items from the cache, minimizing its memory footprint.
Check to see if you're getting nils back from the cache, and if you are, you'll have to reconstruct your objects.
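For example (the cache, key, and factory method names below are placeholders, not from the question):
// Treat every cache lookup as optional and rebuild on a miss.
MyPlayerView *view = [self.playerCache objectForKey:videoURL];
if (view == nil) {
    view = [self makePlayerViewForURL:videoURL];     // reconstruct the evicted player/view
    [self.playerCache setObject:view forKey:videoURL];
}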
Edit:
It is also worth mentioning that objc.io issue #7 advises against storing large objects in an NSCache:
The eviction method of NSCache is non-deterministic and not documented. It's not a good idea to put in super-large objects like images that might fill up your cache faster than it can evict itself.

iOS: Playing PCM buffers from a stream

I'm receiving a series of UDP packets from a socket containing encoded PCM buffers. After decoding them, I'm left with an int16 * audio buffer, which I'd like to immediately play back.
The intended logic goes something like this:
init() {
    initTrack(track, output, channels, sample_rate, ...);
}

onReceiveBufferFromSocket(NSData data) {
    // Decode the buffer
    int16 *buf = handle_data(data);
    // Play data
    write_to_track(track, buf, length_of_buf, etc);
}
I'm not sure about everything that has to do with playing back the buffers though. On Android, I'm able to achieve this by creating an AudioTrack object, setting it up by specifying a sample rate, a format, channels, etc... and then just calling the "write" method with the buffer (like I wish I could in my pseudo-code above) but on iOS I'm coming up short.
I tried using the Audio File Stream Services, but I'm guessing I'm doing something wrong since no sound ever comes out and I feel like those functions by themselves don't actually do any playback. I also attempted to understand the Audio Queue Services (which I think might be close to what I want), however I was unable to find any simple code samples for its usage.
Any help would be greatly appreciated, especially in the form of example code.
You need to use some type of buffer to hold your incoming UDP data. This is an easy and good circular buffer that I have used.
Then to play back data from the buffer, you can use Audio Unit framework. Here is a good example project.
Note: The first link also shows you how to playback using Audio Unit.
You could use Audio Queue Services as well; make sure you're doing some kind of packet re-ordering, and if you're using ffmpeg to decode the streams, there is an option for this.
Otherwise, audio queues are easy to set up.
https://github.com/mooncatventures-group/iFrameExtractor/blob/master/Classes/AudioController.m
You could also use AudioUnits, a bit more complicated though.
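If you go the Audio Queue route, a minimal playback sketch for 16-bit PCM looks roughly like this; the sample rate, channel count, buffer sizes, and the fillBufferFromNetwork helper are assumptions to replace with your own stream format and circular-buffer read:
#import <AudioToolbox/AudioToolbox.h>
#include <string.h>
static const int kNumBuffers = 3;
static AudioQueueRef queue;
static AudioQueueBufferRef buffers[kNumBuffers];
// Stand-in for draining your circular buffer of decoded UDP audio; returns bytes produced.
static UInt32 fillBufferFromNetwork(void *dst, UInt32 capacity)
{
    memset(dst, 0, capacity);   // silence placeholder: replace with your ring-buffer read
    return capacity;
}
// The queue calls this whenever it needs another buffer refilled and re-enqueued.
static void OutputCallback(void *inUserData, AudioQueueRef inAQ, AudioQueueBufferRef inBuffer)
{
    inBuffer->mAudioDataByteSize = fillBufferFromNetwork(inBuffer->mAudioData,
                                                         inBuffer->mAudioDataBytesCapacity);
    AudioQueueEnqueueBuffer(inAQ, inBuffer, 0, NULL);
}
static void StartPlayback(void)
{
    AudioStreamBasicDescription fmt = {0};
    fmt.mSampleRate       = 44100;                  // assumption: match your stream
    fmt.mFormatID         = kAudioFormatLinearPCM;
    fmt.mFormatFlags      = kLinearPCMFormatFlagIsSignedInteger | kLinearPCMFormatFlagIsPacked;
    fmt.mChannelsPerFrame = 1;                      // assumption: mono int16 buffers
    fmt.mBitsPerChannel   = 16;
    fmt.mBytesPerFrame    = 2;
    fmt.mFramesPerPacket  = 1;
    fmt.mBytesPerPacket   = 2;
    AudioQueueNewOutput(&fmt, OutputCallback, NULL, NULL, NULL, 0, &queue);
    for (int i = 0; i < kNumBuffers; i++) {
        AudioQueueAllocateBuffer(queue, 4096, &buffers[i]);
        OutputCallback(NULL, queue, buffers[i]);    // prime the queue before starting it
    }
    AudioQueueStart(queue, NULL);
}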

Call to CFReadStreamRead stops execution in thread

NB: The entire code base for this project is so large that posting any meaningful amount would render this question too localised. I have tried to distil the code down to the bare essentials. I'm not expecting anyone to solve my problems directly, but I will upvote those answers I find helpful or intriguing.
This project uses a modified version of AudioStreamer to play back audio files that are saved locally to the device (iPhone).
The stream is set up and scheduled on the current loop using this code (unaltered from the standard AudioStreamer project as far as I know):
CFStreamClientContext context = {0, self, NULL, NULL, NULL};
CFReadStreamSetClient(stream,
                      kCFStreamEventHasBytesAvailable | kCFStreamEventErrorOccurred | kCFStreamEventEndEncountered,
                      ASReadStreamCallBack,
                      &context);
CFReadStreamScheduleWithRunLoop(stream, CFRunLoopGetCurrent(), kCFRunLoopCommonModes);
The ASReadStreamCallBack calls:
- (void)handleReadFromStream:(CFReadStreamRef)aStream
                   eventType:(CFStreamEventType)eventType
On the AudioStreamer object, this all works fine until the stream is read using this code:
BOOL hasBytes = NO; //Added for debugging
hasBytes = CFReadStreamHasBytesAvailable(stream);
length = CFReadStreamRead(stream, bytes, kAQDefaultBufSize);
hasBytes is YES, but when CFReadStreamRead is called execution stops. The app does not crash, it just stops executing; any breakpoints below the CFReadStreamRead call are not hit and ASReadStreamCallBack is not called again.
I am at a loss as to what might cause this; my best guess is that the thread is being terminated? But the hows and whys are why I'm asking SO.
Has anyone seen this behaviour before? How can I track it down? Any ideas on how I might solve it will be very much welcome!
Additional Info Requested via Comments
This is 100% repeatable
CFReadStreamHasBytesAvailable was added by me for debugging but removing it has no effect
First, I assume that CFReadStreamScheduleWithRunLoop() is running on the same thread as CFReadStreamRead()?
Is this thread processing its runloop? Failure to do this is my main suspicion. Do you have a call like CFRunLoopRun() or equivalent on this thread?
Typically there is no reason to spawn a separate thread for reading streams asynchronously, so I'm a little confused about your threading design. Is there really a background thread involved here? Also, CFReadStreamRead() would typically be in your client callback, when you receive the kCFStreamEventHasBytesAvailable event (which it appears to be in the linked code), but you're suggesting ASReadStreamCallBack is never called. How have you modified AudioStreamer?
It is possible that the stream pointer is just corrupt in some way. CFReadStreamRead should certainly not block if bytes are available (it certainly would never block for more than a few milliseconds for local files). Can you provide the code you use to create the stream?
Alternatively, CFReadStreams send messages asynchronously but it is possible (but not likely) that it's blocking because the runloop isn't being processed.
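A sketch of what that dedicated reader thread could look like, reusing the stream and ASReadStreamCallBack names from the code above (untested against the modified AudioStreamer):
// The thread that schedules the stream must also drive its runloop, otherwise
// kCFStreamEventHasBytesAvailable is never delivered.
- (void)startReaderThread
{
    [NSThread detachNewThreadSelector:@selector(readerThreadMain) toTarget:self withObject:nil];
}
- (void)readerThreadMain
{
    @autoreleasepool {
        CFStreamClientContext context = {0, self, NULL, NULL, NULL};
        CFReadStreamSetClient(stream,
                              kCFStreamEventHasBytesAvailable | kCFStreamEventErrorOccurred | kCFStreamEventEndEncountered,
                              ASReadStreamCallBack,
                              &context);
        CFReadStreamScheduleWithRunLoop(stream, CFRunLoopGetCurrent(), kCFRunLoopCommonModes);
        CFReadStreamOpen(stream);
        CFRunLoopRun();   // blocks here servicing stream events until CFRunLoopStop() is called
    }
}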
If you prefer, I've uploaded my AudioPlayer inspired by Matt's AudioStreamer hosted at https://code.google.com/p/audjustable/. It supports local files (as well as HTTP). I think it does what you wanted (stream files from more than just HTTP).
