ALSA: snd_pcm_readi() and real-time threads - pthreads

I've got a dedicated thread that captures audio from ALSA through snd_pcm_readi(). Periodically I get a short read, meaning snd_pcm_readi() returns a positive integer lower than my buffer size, and there's an obvious 'pop' sound in my audio stream. I then set the thread priority to real-time, which gives a tangible benefit (far fewer short reads), but it doesn't solve the problem.
Now the question: before going down the bumpy road of a real-time patched Linux kernel, is there something else I can do to squeeze out some more performance? Is calling snd_pcm_readi() in a dedicated thread the best way to pull audio out of ALSA?

For playback, the buffer size determines the latency.
For capture, it does not; only the period size determines how long you must wait until recorded samples are reported to be available.
So to prevent overruns, make the buffer as large as possible (e.g., by calling snd_pcm_hw_params_set_buffer_size_max() after setting the other parameters).
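To make the difference between the two sizes concrete, here is a rough back-of-the-envelope sketch. The function name and the example numbers (48 kHz, 1024-frame period, 16384-frame buffer) are illustrative assumptions, not values from the question, and the "slack" figure is only an estimate of how late the reading thread can be before an overrun.

```python
def capture_timing(rate_hz, period_frames, buffer_frames):
    """Return (period_s, buffer_s, slack_s) for a capture stream.

    period_s: wait until a new chunk of samples is reported as available.
    buffer_s: total audio the ring buffer can hold.
    slack_s:  rough time the reader may lag after a period is ready
              before the buffer fills up and an overrun occurs.
    """
    period_s = period_frames / rate_hz
    buffer_s = buffer_frames / rate_hz
    slack_s = (buffer_frames - period_frames) / rate_hz
    return period_s, buffer_s, slack_s


# Hypothetical example values, chosen only for illustration.
period_s, buffer_s, slack_s = capture_timing(48000, 1024, 16384)
print(f"period {period_s * 1e3:.1f} ms, buffer {buffer_s * 1e3:.1f} ms, "
      f"slack {slack_s * 1e3:.1f} ms")
```

With these numbers the capture latency stays at one period while the tolerance for scheduling delays grows with the buffer, which is why enlarging the buffer costs nothing on the capture side.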

Related

MP3 radio Stream buffer underrun detection

Any pointers on how to detect, through a script on Linux, that an MP3 radio stream is breaking up? I am having issues with my radio station when the internet connection slows down and causes the stream on the client side to stop, buffer and then play.
There are a few ways to do this.
Method 1: Assume constant bitrate
If you know that you will have a constant bitrate, you can measure that bitrate over time on the server and determine when it slows below a threshold. Note that this isn't the most accurate method, and won't always work. Not all streams use a constant bitrate. But, this method is as easy as counting bytes received over the wire.
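A minimal sketch of the byte-counting idea, assuming you already collect (timestamp, total bytes received) pairs from somewhere. The function name, the sample format and the 80% tolerance are all hypothetical choices made for this illustration.

```python
def stalled_windows(samples, expected_bps, tolerance=0.8):
    """Find intervals where the measured bitrate fell below the threshold.

    samples: list of (timestamp_seconds, total_bytes_received), in time order.
    expected_bps: nominal stream bitrate in bits per second.
    tolerance: fraction of expected_bps below which a window counts as stalled.
    """
    stalls = []
    for (t0, b0), (t1, b1) in zip(samples, samples[1:]):
        elapsed = t1 - t0
        if elapsed <= 0:
            continue  # skip duplicate or out-of-order timestamps
        measured_bps = (b1 - b0) * 8 / elapsed
        if measured_bps < expected_bps * tolerance:
            stalls.append((t0, t1, measured_bps))
    return stalls


# A 128 kbit/s stream delivers 16000 bytes per second when healthy.
samples = [(0, 0), (1, 16000), (2, 32000), (3, 36000), (4, 52000)]
print(stalled_windows(samples, expected_bps=128000))
```

Only the window from second 2 to second 3 is reported here, since it carried 4000 bytes instead of 16000. Longer windows would smooth out short bursts at the cost of slower detection.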
Method 2: Playback on server
You can run a headless player on the server (via cvlc or similar) and track when it has buffer underruns. This will work at any bitrate and will give you a decent idea of what's happening on the clients. This sort of player setup also enables utility functions like silence detection. The downside is that it takes a little bit of CPU to decode, and a bit more effort to automate.
Method 3 (preferred): Log output buffer on source
Your source encoder will have a buffer on its output, data waiting to be sent to the server. When this buffer grows over a particular threshold, log it. This means that output over the network stalled for whatever reason. This method gets the appropriate data right from the source, and ensures you don't have to worry about clock synchronization issues that can occur over time in your monitoring of audio streams. (44.1 kHz to your encoder might be 44.101 kHz to a player.) This method might require modifying your source client.

How to ensure audio rendered within time limit on iOS?

I am rendering low-latency audio from my custom synth code via the iOS Audio Unit render callback. Obviously if my rendering code is too slow then it will return from the callback too late and there will be a buffer underrun. I know from experience this results in silence being output.
I want to know what time limit I have so that I can manage the level of processing to match the device limitations, etc.
Obviously the length of the buffer (in samples) determines the duration of audio being rendered and this sets an overall limit. However I suspect that the Apple audio engine will have a smaller time limit between issuing the render callback and requiring the response.
How can I find out this time limit and is that something I can do within the callback function itself?
If I happen to exceed the time limit and cause a buffer underrun, is there a notification I can receive or a status object I can interrogate?
NB: In my app I am creating a single 'output' audio unit, so I don't need to worry about chaining audio units together.
The amount of audio rendering that can be done in Audio Unit callbacks depends on the iOS device model and OS version, as well as potential CPU clock speed throttling due to temperature or background modes. Thus, it needs to be profiled on the oldest, slowest iOS device you plan on your app supporting, with some margin.
To support iOS 9, I very conservatively profile my apps on an iPhone 4S test device (ARM Cortex A9 CPU at 800 MHz), or an even older slower device by using an earlier iOS version. When doing this profiling, one can add some percentage of "make work" to test an audio callback and see if there is any margin (For a 50% margin, generate the sample buffer twice, etc.) Other developers appear to be less conservative.
This is why it is important for a mobile audio developer to have (or have access to) several iOS devices (the older the better). If the callback meets the time limit on an old, slow test device, it will very likely be more than fast enough on any newer iOS device.
Depending on the OS version, an underrun can either result in silence, or the Audio Unit stopping or crashing (which can be detected by no more or not enough callbacks within some predictable amount of time).
But the best way to avoid underrun is to do most of the heavy audio work in another thread outside the audio unit thread, and pass samples to/from the audio unit callback using a lock-free circular fifo/queue.
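The index scheme behind such a single-producer/single-consumer FIFO can be sketched as follows. This is only an illustration of the bookkeeping: the class and method names are made up here, and plain Python does not give the atomicity or memory-ordering guarantees that a real lock-free implementation in C or C++ has to provide.

```python
class RingBuffer:
    """Single-producer/single-consumer FIFO over a fixed-size list."""

    def __init__(self, capacity):
        # One slot stays empty so that head == tail always means "empty".
        self._buf = [0.0] * (capacity + 1)
        self._head = 0  # next slot to write; advanced only by the producer
        self._tail = 0  # next slot to read; advanced only by the consumer

    def available(self):
        """Number of samples currently stored."""
        return (self._head - self._tail) % len(self._buf)

    def write(self, samples):
        """Store as many samples as fit; return how many were stored."""
        free = len(self._buf) - 1 - self.available()
        n = min(free, len(samples))
        for i in range(n):
            self._buf[(self._head + i) % len(self._buf)] = samples[i]
        self._head = (self._head + n) % len(self._buf)
        return n

    def read(self, count):
        """Return exactly `count` samples, zero-padding on underrun."""
        n = min(count, self.available())
        out = [self._buf[(self._tail + i) % len(self._buf)] for i in range(n)]
        self._tail = (self._tail + n) % len(self._buf)
        return out + [0.0] * (count - n)


rb = RingBuffer(4)
rb.write([1, 2, 3, 4, 5, 6])  # worker thread: only 4 samples fit
print(rb.read(2))             # render callback: copies out [1, 2]
```

The render callback then only calls read(), which never blocks and pads with silence when the worker thread has fallen behind.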
Adding to what hotpaw2 said, the worst performing iOS device I have encountered is the iPod touch 16G without the rear facing camera. I have done projects where every device except the iPod touch 16G plays audio smoothly. I had to bump up the buffer duration to the next size to accommodate.
I typically do all audio prepping before the render callback, feeding a separate lockless ring buffer, and keep the render callback limited to copying data. I let the application "deal" with buffer underruns.
I personally never measured the render callback variance, but I would guess that the interval would be consistently equal to the buffer duration (e.g., 5 ms) with extremely minimal jitter. I doubt it would be 4.9 ms one time then 5.1 ms the next time.
To get some timing info, you can use mach_absolute_time(), declared in mach_time.h.
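The measurement itself boils down to comparing the time spent rendering with the duration of the buffer being rendered. The sketch below shows that arithmetic in Python, with time.perf_counter() standing in for mach_absolute_time(); the function names and the dummy render function are assumptions made for illustration.

```python
import time


def budget_fraction(render_seconds, frames, sample_rate):
    """Fraction of the buffer duration consumed by rendering (1.0 = no margin)."""
    return render_seconds / (frames / sample_rate)


def time_render(render, frames):
    """Time one call of a render function that produces `frames` samples."""
    start = time.perf_counter()
    render(frames)
    return time.perf_counter() - start


def dummy_render(frames):
    # Placeholder workload; a real synth would compute samples here.
    return [0.0] * frames


elapsed = time_render(dummy_render, 256)
print(f"used {budget_fraction(elapsed, 256, 44100):.1%} of the buffer duration")
```

A value that stays well below 1.0 on the slowest target device, even with the extra "make work" added, suggests there is margin left.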
You didn't really say what your timing requirements are. I assume you need low latency audio. Otherwise, you can just set the buffer duration to be pretty big. I assume that you want to increase latency for slow devices using this code. I usually find what works on an iPod 16G and use that as a worst case.
NSTimeInterval _preferredDuration = ...
NSError* err;
[[AVAudioSession sharedInstance]setPreferredIOBufferDuration:_preferredDuration error:&err];
And of course, you should get the actual duration used. The OS will pick some power of two based on the sample rate:
NSTimeInterval _actualBufferDuration;
_actualBufferDuration = [[AVAudioSession sharedInstance] IOBufferDuration];
As far as adjusting for device performance. You can set the buffer duration

How does a gnuradio source block know how many samples to output?

I'm trying to understand how gnuradio source blocks work. I know how to make a simple one that outputs a constant and I understand what sample rate means, but I'm not sure how (or where) to combine the two.
Is the source block in charge of regulating the amount of data to output? Or does the amount that it outputs depend upon other blocks in the flow graph and how much they consume? Some source blocks take sample_rate as an input, which makes me think it's the former. But other blocks don't, which makes me think it's the latter.
If a source block is in charge of its sample rate, how does it regulate it? Do they check the system clock and output samples based upon that?
"Do they check the system clock and output samples based upon that?"
Definitely not. All GNU Radio blocks operate at the maximum speed the processor can give.
However, GNU Radio relies on the fact that each flowgraph may have a source and/or sink device (e.g., a USRP, another SDR device, a sound card) that produces/consumes samples at a constant rate. Consequently, the flowgraph is throttled at the rate of the hardware.
In order to avoid CPU saturation when none of these hardware devices exist, GNU Radio provides the Throttle block, which tries (it is not very accurate) to limit the samples per second to the given rate by sleeping for a suitable amount of time between the samples that pass through the Throttle block.
As far as the sample_rate parameter is concerned, excluding the Throttle and device-specific blocks, it is used only for graphical representation or internal calculations.
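The throttling idea can be sketched as a small calculation: compare how long the samples passed so far should have taken at the nominal rate with how much wall-clock time has actually elapsed, and sleep for the difference. This is not GNU Radio's actual implementation; the function name and parameters are hypothetical.

```python
def throttle_delay(items_sent, sample_rate, elapsed_s):
    """Seconds to sleep so that average throughput stays near sample_rate.

    items_sent: total samples passed through since the start.
    elapsed_s:  wall-clock seconds since the start.
    """
    target_s = items_sent / sample_rate  # time these samples "should" take
    return max(0.0, target_s - elapsed_s)


# 32000 samples at 32 kS/s should take 1 s; only 0.25 s have passed.
print(throttle_delay(32000, 32000, 0.25))
```

A block using this would call time.sleep() with the returned value after each chunk. Because sleep durations depend on the operating system scheduler, the resulting rate is only approximate, which matches the caveat above.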

Time between callback calls?

I have a lab project that uses mainly PyAudio and to further understand its way of working I made some measurements, in this case time between callbacks (using callback mode).
I timed it, and got an interesting result
(#256 chunk size, 44.1k fs): 0.0099701;0.0000365;0.0000201;0.0201579
This pattern goes on and on.
Between two longer calls, we have two shorter calls and sometimes the longer call is shorter (mind you I don't do anything else in the program than time the callbacks).
If we average this out we get our desired callback time:
1/44100 * 256 (roughly 5.8ms)
Here is my measurement visualized:
So can someone explain what exactly happens here under the hood?
What happens under the hood in PortAudio is dependent on a number of factors, including:
Which native audio API PortAudio is talking to
What buffer size and latency parameters you passed to Pa_OpenStream()
The capabilities of the audio hardware and its drivers, including its supported buffer sizes, buffering model and timing characteristics.
Under some circumstances PortAudio will request larger buffers from the native audio API and then invoke the PortAudio user callback multiple times in quick succession. This can happen if you have selected a small callback buffer size and a long latency.
Another scenario is that the native audio API doesn't support the buffer size that you requested for your callback size (framesPerBuffer parameter to Pa_OpenStream()). In this case PortAudio will be forced to use a driver-supported buffer size and then "adapt" between that buffer size and your callback buffer size. This adaption process can cause irregular timing.
Yet another possibility is that the native audio API uses a large ring buffer. Each time PortAudio polls the native host API, it will work to fill the native ring buffer by calling your callback as many times as needed. In this case irregular timing is related to the polling rate.
The above are not the only possibilities.
One likely explanation of what is happening in your case is that PortAudio is calling your callback 3 times in fast succession (a guess would be that the native buffer size is 3x your callback buffer size), for one of the reasons above.
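An idealized model of that guess reproduces the "one long gap, two short gaps" pattern. The sketch below assumes a native buffer of 768 frames (3 x 256), callbacks that take no time, and a perfectly regular hardware wakeup; none of these figures are confirmed by the question.

```python
def callback_intervals(native_frames, callback_frames, rate_hz, native_periods):
    """Simulated gaps (seconds) between successive user callbacks.

    Assumes native_frames is a whole multiple of callback_frames and that
    each callback itself takes negligible time.
    """
    calls_per_wakeup = native_frames // callback_frames
    native_period_s = native_frames / rate_hz
    intervals = []
    for _ in range(native_periods):
        intervals.append(native_period_s)  # wait for the hardware wakeup
        intervals.extend([0.0] * (calls_per_wakeup - 1))  # back-to-back calls
    return intervals


gaps = callback_intervals(768, 256, 44100, native_periods=2)
print([round(g * 1e3, 2) for g in gaps])  # milliseconds
print(sum(gaps) / len(gaps), 256 / 44100)  # mean gap vs. nominal period
```

In this model the mean gap equals 256 / 44100 s (about 5.8 ms) even though no individual gap does, which is the same averaging argument made in the question.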
Another possibility is that the native audio subsystem is signalling PortAudio irregularly. This can happen if a system layer below PortAudio is doing similar kinds of buffering to what I described above. I have seen this happen with DirectSound on Windows 7 for example. ASIO4ALL drivers will exhibit +/- 1ms jitter (which is not what you're seeing).
You can try reducing the requested stream latency to 0 and see if that changes the result. This will force double-buffering, which may or may not produce stable output. Another thing to try is to use the paFramesPerBufferUnspecified parameter, which will cause the callback to be called with the native buffer size -- then you can observe whether there is greater periodicity, what that buffer size is, and also whether the buffer size varies from callback to callback.
You didn't say which operating system and host API you're targeting, so it's hard to give more specific details than the above.
The internal buffering models used by the various PortAudio host API backends are described in some detail on the PortAudio wiki.
To answer a related question: why is it like this? Aside from the cases where it is a function of the lower layers of the native audio subsystem, or the buffer adaption process, it is often a result of specifying a large suggested latency to Pa_OpenStream(). Some PortAudio host APIs will relax the buffer periodicity if the specified latency is very high, in order to reduce system load that would be caused by high-frequency timer callbacks.

Low jitter audio on iOS

I'd like to load a small audio clip like a beep into memory, and schedule playback after x seconds with very low jitter. My application ideally gets less than +-1ms, but +-5ms could still be useful. The time is synchronized to a remote application without a microphone. My question is what kind of jitter can I expect from the audio APIs, and are they all equal in this regard?
I'm not familiar with the audio APIs, but from the latency discussions I've seen the number 5.8ms using remoteIO audio units. Does this mean +-3ms would be the best precision possible?
You would need to make this process real-time to have a guarantee of low delay; otherwise you can get jitter on the order of seconds, because the operating system can decide to run some background job.
Once the process is real-time, you might achieve a lower delay.
Please check with Apple whether you can make a process real-time (with scheduling options). You might need extra permissions and kernel-level support in your app to do it properly, so that you can have a guaranteed 1 ms delay for an audio app.

Resources