PortAudio video/audio sync

I use FFmpeg to decode a video/audio stream and PortAudio to play the audio. I have run into a sync problem with PortAudio. I have a function like the one below:
double AudioPlayer::getPlaySec() const
{
    double const latency     = Pa_GetStreamInfo( mPaStream )->outputLatency;
    double const bytesPerSec = mSampleRate * Pa_GetSampleSize( mSampleFormat ) * mChannel;
    double const playtime    = mConsumedBytes / bytesPerSec;
    return playtime - latency;
}
mConsumedBytes is the byte count written to the audio device in the PortAudio callback function. I thought I could derive the playing time from that byte count. Actually, when I run another process (such as opening Firefox) that makes the CPU busy, the audio becomes intermittent, but the callback doesn't stop, so mConsumedBytes grows larger than expected and getPlaySec() returns a time that is ahead of the actual playing time.
I have no idea why this happens. Any suggestion is welcome. Thanks!

Latency in PortAudio is defined a bit vaguely: something like the average time between when you put data into the buffer and when you can expect it to play. That's not something you want to use for this purpose.
Instead, to find the current playback time of the device, you can poll the device using the Pa_GetStreamTime function.
You may want to see this document for more detailed info.
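As a rough illustration, here is a hedged sketch of getPlaySec() based on the stream clock rather than a byte counter. It assumes a hypothetical member mStreamStartTime recorded with Pa_GetStreamTime() right after Pa_StartStream(); that name is illustrative, not from the question.

double AudioPlayer::getPlaySec() const
{
    // Pa_GetStreamTime() returns the current time of the stream clock in seconds,
    // on the same timebase as the timestamps passed to the stream callback.
    PaTime const now = Pa_GetStreamTime( mPaStream );
    if( now <= 0 )   // 0 indicates the stream clock is unavailable
        return 0.0;
    return now - mStreamStartTime;   // elapsed stream time since playback started
}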

I know this is old, but still: PortAudio v19+ can report the stream's own sample rate via Pa_GetStreamInfo. You should use that for audio sync, since the actual playback sample rate can differ between hardware; PortAudio might try to compensate (depending on the implementation). If you have drift problems, try using it.
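For instance, a small sketch of getPlaySec() using the rate PortAudio reports for the stream instead of the requested one (assuming the same members as in the question):

double AudioPlayer::getPlaySec() const
{
    // PaStreamInfo::sampleRate is the rate PortAudio reports for the open stream,
    // which may differ from the rate requested at Pa_OpenStream() time.
    PaStreamInfo const *info = Pa_GetStreamInfo( mPaStream );
    double const bytesPerSec = info->sampleRate * Pa_GetSampleSize( mSampleFormat ) * mChannel;
    return mConsumedBytes / bytesPerSec - info->outputLatency;
}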

Related

Record and send audio data to c++ function

I need to send audio data in real time in PCM format, 8 kHz, 16-bit, mono.
The audio must be sent as an array of chars with a length:
(<#char *data#>, <#int len#>).
I'm a beginner in audio processing and can't really work out how to accomplish that. My best attempt was to convert to the iLBC format, but it didn't work. Is there any sample showing how to record audio and convert it to some format? I have already read Learning Core Audio by Chris Adamson and Kevin Avila, but I didn't find a solution that works.
Simply, what I need is:
(record) -> (convert?) -> send(char *data, int length);
Because I need to send the data as arrays of chars, I can't use a player.
EDIT:
I managed to make everything work with recording and with reading buffers. What I can't manage is:
if (ref[i]->mAudioDataByteSize != 0) {
    char *data = (char *)ref[i]->mAudioData;
    sendData(mHandle, data, ref[i]->mAudioDataByteSize);
}
This is not really a beginner task. The solutions are to use the RemoteIO Audio Unit, the Audio Queue API, or an AVAudioEngine installTapOnBus block. These will give you near-real-time (depending on the buffer size) buffers of audio samples (Int16s, Floats, etc.) that you can convert, compress, pack into other data types or arrays, and so on, usually via a callback function or block that you provide to do whatever you want with the incoming recorded audio sample buffers.
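As one possible shape of the Audio Queue route, here is a hedged sketch of an input callback that forwards each recorded buffer as (char *data, int len). sendData() and gHandle stand in for the question's own sending function and handle; they are illustrative placeholders, not a real API.

#include <AudioToolbox/AudioToolbox.h>

// Placeholders for the question's sending function and connection handle.
extern void sendData(void *handle, char *data, int length);
extern void *gHandle;

// Audio Queue input callback: fires whenever a buffer of recorded PCM is ready.
static void MyInputCallback(void *inUserData,
                            AudioQueueRef inQueue,
                            AudioQueueBufferRef inBuffer,
                            const AudioTimeStamp *inStartTime,
                            UInt32 inNumPackets,
                            const AudioStreamPacketDescription *inPacketDesc)
{
    if (inBuffer->mAudioDataByteSize > 0) {
        // If the queue was created with an 8 kHz / 16-bit / mono linear PCM format,
        // the buffer already holds the raw bytes the question wants to send.
        sendData(gHandle,
                 (char *)inBuffer->mAudioData,
                 (int)inBuffer->mAudioDataByteSize);
    }
    // Hand the buffer back so the queue can keep recording into it.
    AudioQueueEnqueueBuffer(inQueue, inBuffer, 0, NULL);
}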

SpeakHere sample does not work correctly when I redefine the time interval of the callback function

I downloaded the SpeakHere example and changed the parameters like below:
#define kBufferDurationSeconds 0.020

void AQRecorder::SetupAudioFormat(UInt32 inFormatID)
{
    memset(&mRecordFormat, 0, sizeof(mRecordFormat));
    mRecordFormat.mFormatID        = kAudioFormatLinearPCM;
    mRecordFormat.mSampleRate      = 8000.0;
    mRecordFormat.mFormatFlags     = kLinearPCMFormatFlagIsSignedInteger | kLinearPCMFormatFlagIsPacked;
    mRecordFormat.mBitsPerChannel  = 16;
    mRecordFormat.mFramesPerPacket = mRecordFormat.mChannelsPerFrame = 1;
    mRecordFormat.mBytesPerFrame   = (mRecordFormat.mBitsPerChannel / 8) * mRecordFormat.mChannelsPerFrame;
    mRecordFormat.mBytesPerPacket  = mRecordFormat.mBytesPerFrame;
}
But I found that the callback AQRecorder::MyInputBufferHandler() was not called every 20 ms. It was called four times at roughly 1 ms intervals, then once after about 500 ms, then four more times at 1 ms intervals, then another 500 ms gap, over and over again, even though I set kBufferDurationSeconds = 0.02.
What causes this problem? Please help me.
In iOS, the Audio Session setPreferredIOBufferDuration API (did you even make an OS buffer-duration call?) is only a request expressing the app's preference. The OS is free to choose a different buffer duration, compatible with what iOS thinks is best (for battery life, compatibility with other apps, etc.).
Audio Queues run on top of Audio Units. If the RemoteIO Audio Unit is using 500 ms buffers, it will cut them up into 4 smaller Audio Queue buffers and pass those smaller buffers to the Audio Queue API in a quick burst.
If you use the Audio Unit API instead of the Audio Queue API, together with the Audio Session API's setPreferredIOBufferDuration request, you may be able to request and get shorter, more evenly spaced audio buffers.
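For the SpeakHere-era C Audio Session API, the request might look like the sketch below; the AVAudioSession Objective-C equivalent is setPreferredIOBufferDuration:error:. Treat it as illustrative only: the OS may still grant a longer duration.

#include <AudioToolbox/AudioToolbox.h>

// Hedged sketch: ask iOS for ~20 ms I/O buffers. This is only a preference.
void RequestShortIOBuffers(void)
{
    AudioSessionInitialize(NULL, NULL, NULL, NULL);

    Float32 preferredDuration = 0.020f;   // seconds
    AudioSessionSetProperty(kAudioSessionProperty_PreferredHardwareIOBufferDuration,
                            sizeof(preferredDuration), &preferredDuration);

    AudioSessionSetActive(true);

    // Read back what the OS actually granted.
    Float32 actualDuration = 0.0f;
    UInt32 size = sizeof(actualDuration);
    AudioSessionGetProperty(kAudioSessionProperty_CurrentHardwareIOBufferDuration,
                            &size, &actualDuration);
}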

What's the reason for using a circular buffer in an iOS audio calling app?

My question is pretty much self-explanatory. Sorry if it seems too dumb.
I am writing an iOS VoIP dialer and have looked at some open-source code (iOS audio calling apps). Almost all of them use a circular buffer for storing recorded and received PCM audio data. So I am wondering why we need a circular buffer in this case. What's the exact reason for using such an audio buffer?
Thanks in advance.
Using a circular buffer lets you process the input and output data asynchronously from its source. The audio render process takes place on a high-priority thread. It asks your app for audio samples (playback), and offers audio (recording/processing), on a timer, in the form of callbacks.
A typical scenario would be for the audio callback to fire every 0.023 seconds to ask for (and/or offer) 1024 samples of audio. This thread is synchronized with the system hardware, so it is imperative that your callback returns before the 0.023 seconds are up. If it doesn't, the hardware won't wait for you; it will just skip that cycle, and you will get an audible pop or silence, or miss audio you are trying to record.
A circular buffer's role is to pass data between threads. In an audio application that means moving the samples to and from the audio thread asynchronously. One thread produces samples onto the "head" of the buffer, and the other thread consumes them from the "tail".
Here's an example, retrieving audio samples from the microphone and writing them to disk. Your app has subscribed to a callback that fires every 0.023 seconds, offering 1024 samples to be recorded. The naive approach would be to simply write the audio to disk from within that callback.
void myCallback(float *samples, int sampleCount, SampleSaver *saver) {
    SampleSaverSaveSamples(saver, samples, sampleCount);
}
This will work!! Most of the time...
The problem is that there is no guarantee that writing to disk will finish before 0.023 seconds, so every now and then, your recording has a pop in it because SampleSaver just plain took too long and the hardware just skips the next callback.
The right way to do this is to use a circular buffer. I personally use TPCircularBuffer because it's awesome. The way it works (externally) is that you ask the buffer for a pointer to write data to (the head) on one thread, then on another thread you ask the buffer for a pointer to read from (the tail). Here's how it would be done using TPCircularBuffer (skipping setup and using a simplified callback).
// This runs on the high-priority thread, which can't wait for anything like a slow write to disk.
void myCallback(float *samples, int sampleCount, TPCircularBuffer *buffer) {
    int32_t availableBytes = 0;
    float *head = TPCircularBufferHead(buffer, &availableBytes);
    memcpy(head, samples, sampleCount * sizeof(float));           // copy samples to the head
    TPCircularBufferProduce(buffer, sampleCount * sizeof(float)); // move the buffer head "forward in the circle"
}
This operation is super quick and puts no extra pressure on that sensitive audio thread. You then create your own timer on a separate thread to write the samples to disk.
// This runs on some background thread that can take its sweet time.
void myLeisurelySavingCallback(TPCircularBuffer *buffer, SampleSaver *saver) {
    int32_t available;
    float *tail = TPCircularBufferTail(buffer, &available);
    int samplesInBuffer = available / sizeof(float);                  // mono
    SampleSaverSaveSamples(saver, tail, samplesInBuffer);
    TPCircularBufferConsume(buffer, samplesInBuffer * sizeof(float)); // move the tail forward
}
And there you have it: not only do you avoid audio glitches, but if you initialize a big enough buffer, you can set your write-to-disk callback to fire only every second or two (after the circular buffer has built up a good bit of audio), which is much easier on your system than writing to disk every 0.023 seconds!
The main reason to use the buffer is so the samples can be handled asynchronously. Circular buffers are also a great way to pass messages between threads without locks. Here is a good article explaining a neat memory trick for implementing a circular buffer.
Good question. There is another good reason for using a circular buffer.
In iOS, if you use callbacks (Audio Units) for recording and playing audio (in fact you need to if you want to build a real-time audio-transfer app), the recorder callback gives you a chunk of data covering a specific amount of time (say 20 milliseconds). In iOS you will not always get a fixed-length chunk (if you set the callback interval to 20 ms you may get 370 or 372 bytes of data, and you never know in advance which; correct me if I am wrong). Then, to transfer the audio through UDP packets, you need a codec for encoding and decoding (G.729 is commonly used for VoIP apps). But G.729 takes data in multiples of 8. Suppose you encode 368 (8 * 46) bytes per 20 ms. What are you going to do with the rest of the data? You need to store it, in sequence, to be processed with the next chunk.
So that's the reason. There are some other details, but I kept it simple for easier understanding. Just comment below if you have any questions.
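A minimal sketch of that "store the rest for the next chunk" idea; the frame size, encoder, and function names below are placeholders, not a real G.729 API.

#include <cstddef>
#include <vector>

// Placeholder for whatever encodes one whole frame and sends it over UDP.
extern void encodeAndSendFrame(const char *frame, size_t len);

static const size_t kFrameBytes = 160;   // illustrative fixed codec frame size
static std::vector<char> pending;        // bytes waiting until a full frame is available

// Called with each variable-sized chunk delivered by the recorder callback.
void onRecordedChunk(const char *data, size_t len)
{
    pending.insert(pending.end(), data, data + len);

    size_t offset = 0;
    while (pending.size() - offset >= kFrameBytes) {
        encodeAndSendFrame(&pending[offset], kFrameBytes);
        offset += kFrameBytes;
    }
    // Keep the leftover bytes, in order, for the next chunk; this is the role a
    // circular buffer plays, without the copying a vector does here.
    pending.erase(pending.begin(), pending.begin() + offset);
}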

Millisecond (and greater) precision for audio file elapsed time in iOS

I am looking for a low-latency way of finding out how many seconds have elapsed in an audio file, with guaranteed millisecond precision, in real time. According to the AVAudioPlayer class reference, a call to -currentTime returns "the offset of the current playback position, measured in seconds from the start of the sound"; an NSTimeInterval is a double, which implies fractions of a second are possible.
As a testing scenario, I have an audio file playing and the user taps a button. Playback DOES NOT pause or stop, but at the moment the button was tapped I would like to obtain information about the elapsed time. In the real application, the button may be pressed many times in one second, hence the need for millisecond precision.
My files are stored as AIFF files and are around 1-10 minutes in length. Ideally I would like to find out exactly which sample frame is 'up-next' when playback resumes - however, this level of precision is a little excessive and millisecond precision is perfectly acceptable.
Is AVAudioPlayer's -currentTime method sufficient to achieve guaranteed millisecond precision for a currently-playing audio file? Or, would it be preferable to use a lower-level API such as iOS's Audio Units?
If you want sub-millisecond relative time resolution, convert to raw PCM and count buffers * length + samples using a low-latency RemoteIO Audio Unit configuration. Most iOS devices will support RemoteIO buffers as small as roughly 6 ms (256 samples), with a callback for each buffer.
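A rough sketch of that counting idea; the names are illustrative, and noteFramesRendered() would be called from the render callback you install on the RemoteIO unit.

#include <atomic>

static std::atomic<long long> gFramesRendered(0);
static const double kSampleRate = 44100.0;        // illustrative; use the file's actual rate

// Called from the render callback with the number of frames just rendered.
void noteFramesRendered(unsigned int frameCount)
{
    gFramesRendered += frameCount;                // effectively buffers * length + samples
}

// Called from the UI thread at the moment of the button tap.
double elapsedSeconds()
{
    return gFramesRendered.load() / kSampleRate;  // ~6 ms granularity with 256-frame buffers
}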

OpenAL buffer update in real-time

I'm working in iOS and have a simple OpenAL project running.
The difference from most OpenAL projects I've seen is that I'm not loading a sound file. Instead I load an array of raw data into alBufferData. Using a couple of equations I can generate data for white noise, sine, and pulse waves, and all of that works well.
My problem is that I need a way to modify this data while the sound is playing, in real time.
Is there a way to modify this data without having to create a new buffer? (I tried the approach of creating a new buffer with new data and then using it instead, but it's nowhere near quick enough.)
Any help or suggestions of other ways to accomplish this would be much appreciated.
Thanks
I haven't done it on iOS, but with OpenAL on the PC what you would do is chain a few buffers together. Each buffer holds a small time period's worth of data. Periodically, check whether the playing buffer is done and, if so, add it to a free list for reuse. When you want to change the sound, write the new waveform into a free buffer and add it to the chain. You select the buffer size to balance latency and required update rate: smaller buffers allow faster response to changes but need to be generated more often.
This page suggests that a half-second update rate is doable. Whether you can go faster depends on the complexity of your calculations as well as on the overhead of the OS.
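In OpenAL terms, that chaining is the usual streaming pattern built on alSourceQueueBuffers / alSourceUnqueueBuffers. A rough sketch, where fillWithCurrentWaveform() is an illustrative placeholder for the question's noise/sine/pulse generator:

#include <OpenAL/al.h>

// Placeholder for the generator that writes `frames` samples of the current waveform.
extern void fillWithCurrentWaveform(short *out, ALsizei frames);

// Reclaim buffers the source has finished, refill them with freshly generated
// samples, and queue them again at the end of the chain.
void refillFinishedBuffers(ALuint source, short *scratch, ALsizei frames, ALsizei sampleRate)
{
    ALint processed = 0;
    alGetSourcei(source, AL_BUFFERS_PROCESSED, &processed);

    while (processed-- > 0) {
        ALuint buffer = 0;
        alSourceUnqueueBuffers(source, 1, &buffer);     // take a finished buffer off the chain

        fillWithCurrentWaveform(scratch, frames);       // regenerate with the latest parameters
        alBufferData(buffer, AL_FORMAT_MONO16, scratch,
                     frames * (ALsizei)sizeof(short), sampleRate);

        alSourceQueueBuffers(source, 1, &buffer);       // put it back at the end of the chain
    }
}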
Changing the data during playback is not supported in OpenAL.
However, you can still try it and see if you get acceptable results (though you'll be racing against the OpenAL playback mechanism, and any lag-outs in your app could throw it off, so do this at your own risk).
There's an Apple extension version of alBufferData that tells OpenAL to use the data you give it directly, rather than making its own local copy. You set it up like so:
typedef ALvoid AL_APIENTRY (*alBufferDataStaticProcPtr)(const ALint bid,
                                                        ALenum format,
                                                        const ALvoid *data,
                                                        ALsizei size,
                                                        ALsizei freq);

static alBufferDataStaticProcPtr alBufferDataStatic = NULL;

alBufferDataStatic = (alBufferDataStaticProcPtr) alcGetProcAddress(NULL, (const ALCchar *) "alBufferDataStatic");
Call alBufferDataStatic() like you would call alBufferData():
alBufferDataStatic(bufferId, format, data, size, frequency);
Since it's now using your sound data buffer rather than its own copy, you could conceivably modify that data and OpenAL will be none the wiser (provided you're not modifying things too close to where it's currently playing from in the buffer).
However, this approach is risky, since it depends on timing you're not fully in control of. To be 100% safe you'll need to use Audio Units.
