How to encode and decode real-time audio using the Opus codec in iOS?

I am working on an app with the following requirements:
Record real-time audio from an iOS device (iPhone)
Encode this audio data to Opus and send it to a server over WebSocket
Decode the received data back to PCM
Play the audio received from the WebSocket server on the iOS device (iPhone)
I've used AVAudioEngine for this.
var engine = AVAudioEngine()
var input: AVAudioInputNode = engine.inputNode
var format: AVAudioFormat = input.outputFormat(forBus: AVAudioNodeBus(0))
input.installTap(onBus: AVAudioNodeBus(0), bufferSize: AVAudioFrameCount(8192), format: format, block: { buf, when in
// 'buf' contains audio captured from the input node at time 'when'
})
// prepare and start the engine
engine.prepare()
try! engine.start()
And I convert the AVAudioPCMBuffer to Data using this function:
func toData(PCMBuffer: AVAudioPCMBuffer) -> Data {
let channelCount = 1
let channels = UnsafeBufferPointer(start: PCMBuffer.floatChannelData, count: channelCount)
let ch0Data = NSData(bytes: channels[0], length: Int(PCMBuffer.frameLength * PCMBuffer.format.streamDescription.pointee.mBytesPerFrame))
return ch0Data as Data
}
I've found an Opus library on CocoaPods: libopus.
I have searched a lot for how to use the Opus codec in iOS but haven't found a solution.
How do I encode and decode this data using the Opus codec? And do I need a jitter buffer? If so, how do I use it in iOS?
This is my code for the Opus codec, but the voice isn't clear:
#import "OpusManager.h"
#import <opus/opus.h>
#define SAMPLE_RATE 16000
#define CHANNELS 1
#define BITRATE (SAMPLE_RATE * CHANNELS)
/**
* Audio frame size, in samples per channel.
* Each call must pass the audio data of exactly one frame
(a multiple of 2.5 ms: 2.5, 5, 10, 20, 40, 60 ms).
* Fs/ms 2.5 5 10 20 40 60
* 8kHz 20 40 80 160 320 480
* 16kHz 40 80 160 320 640 960
* 24KHz 60 120 240 480 960 1440
* 48kHz 120 240 480 960 1920 2880
*/
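// With SAMPLE_RATE 16000, FRAME_SIZE 320 below is one 20 ms frame (16000 * 0.020 = 320).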
#define FRAME_SIZE 320
#define APPLICATION OPUS_APPLICATION_VOIP
#define MAX_PACKET_BYTES (FRAME_SIZE * CHANNELS * sizeof(float))
#define MAX_FRAME_SIZE (FRAME_SIZE * CHANNELS * sizeof(float))
typedef opus_int16 OPUS_DATA_SIZE_T;
@implementation OpusManager {
OpusEncoder *_encoder;
OpusDecoder *_decoder;
}
int error;
- (instancetype)init {
self = [super init];
if (self) {
_encoder = opus_encoder_create(SAMPLE_RATE, CHANNELS, APPLICATION, &error);
_decoder = opus_decoder_create(SAMPLE_RATE, CHANNELS, &error);
opus_encoder_ctl(_encoder, OPUS_SET_BITRATE(BITRATE));
opus_encoder_ctl(_encoder, OPUS_SET_COMPLEXITY(10));
opus_encoder_ctl(_encoder, OPUS_SET_SIGNAL(OPUS_SIGNAL_VOICE));
opus_encoder_ctl(_encoder, OPUS_SET_VBR(0));
opus_encoder_ctl(_encoder, OPUS_SET_DTX(1));
opus_encoder_ctl(_encoder, OPUS_SET_INBAND_FEC(1));
opus_encoder_ctl(_encoder, OPUS_SET_BANDWIDTH(OPUS_BANDWIDTH_SUPERWIDEBAND)); // 12 kHz audio bandwidth
opus_encoder_ctl(_encoder, OPUS_SET_PACKET_LOSS_PERC(1));
opus_encoder_ctl(_encoder, OPUS_SET_FORCE_CHANNELS(CHANNELS));
}
return self;
}
- (NSData *)encode:(NSData *)PCM {
// The tap delivers 32-bit float samples (see toData above), so treat the bytes as float
const float *PCMPtr = (const float *)PCM.bytes;
int PCMSize = (int)PCM.length / sizeof(float);
const float *PCMEnd = PCMPtr + PCMSize;
NSMutableData *mutData = [NSMutableData data];
unsigned char encodedPacket[MAX_PACKET_BYTES];
// Encoded size of each opus block
OPUS_DATA_SIZE_T encodedBytes = 0;
while (PCMPtr + FRAME_SIZE * CHANNELS <= PCMEnd) {
encodedBytes = opus_encode_float(_encoder, PCMPtr, FRAME_SIZE, encodedPacket, MAX_PACKET_BYTES);
if (encodedBytes <= 0) {
NSLog(@"ERROR: encodedBytes <= 0");
return nil;
}
NSLog(@"encodedBytes: %d", encodedBytes);
// Prefix each opus block with its size
[mutData appendBytes:&encodedBytes length:sizeof(encodedBytes)];
// Append the opus data itself
[mutData appendBytes:encodedPacket length:encodedBytes];
PCMPtr += FRAME_SIZE * CHANNELS;
}
NSLog(@"mutData: %lu", (unsigned long)mutData.length);
return mutData.length > 0 ? mutData : nil;
}
- (NSData *)decode:(NSData *)opus {
const unsigned char *opusPtr = (const unsigned char *)opus.bytes;
int opusSize = (int)opus.length;
const unsigned char *opusEnd = opusPtr + opusSize;
NSMutableData *mutData = [NSMutableData data];
float decodedPacket[MAX_FRAME_SIZE];
int decodedSamples = 0;
// Size prefix of the next opus block
OPUS_DATA_SIZE_T nBytes = 0;
while (opusPtr < opusEnd) {
// Read the opus block size written by -encode:
nBytes = *(const OPUS_DATA_SIZE_T *)opusPtr;
opusPtr += sizeof(nBytes);
decodedSamples = opus_decode_float(_decoder, opusPtr, nBytes, decodedPacket, MAX_FRAME_SIZE, 0);
if (decodedSamples <= 0) {
NSLog(@"ERROR: decodedSamples <= 0");
return nil;
}
NSLog(@"decodedSamples: %d", decodedSamples);
// The decoder outputs float samples, so append float-sized bytes
[mutData appendBytes:decodedPacket length:decodedSamples * sizeof(float)];
opusPtr += nBytes;
}
NSLog(@"mutData: %lu", (unsigned long)mutData.length);
return mutData.length > 0 ? mutData : nil;
}
@end
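For completeness, here is a rough sketch of how this manager could be driven end to end. Everything here is hypothetical glue that the question does not show: audioTapDidProduce:, webSocketDidReceive:, sendToServer:, playPCM: and the _codec ivar are invented names, and the tap buffer is assumed to already be mono float at 16 kHz.
// Hypothetical glue, assuming 'self' owns an OpusManager in '_codec':
- (void)audioTapDidProduce:(NSData *)pcm {
NSData *encoded = [_codec encode:pcm];
if (encoded != nil) {
[self sendToServer:encoded]; // hypothetical WebSocket send
}
}
- (void)webSocketDidReceive:(NSData *)message {
// Each message is a stream of [2-byte size][opus block] pairs, exactly
// what -encode: produces, so -decode: can walk it directly.
NSData *pcmOut = [_codec decode:message];
if (pcmOut != nil) {
[self playPCM:pcmOut]; // hypothetical playback hook
}
}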

Try lowering the bandwidth or setting a higher bitrate. I think 16 kbit/s for 12 kHz-bandwidth mono audio is probably too low; it would be better to leave the bandwidth on auto with the VOIP application type set. There could be other problems around, but "doesn't sound good" is not enough to analyze.
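Translated into code, that suggestion would look something like this sketch (32000 bit/s is an illustrative value, not a number from the answer):
opus_encoder_ctl(_encoder, OPUS_SET_BITRATE(32000)); // raise from the 16 kbit/s above
opus_encoder_ctl(_encoder, OPUS_SET_BANDWIDTH(OPUS_AUTO)); // let the encoder choose
// APPLICATION is already OPUS_APPLICATION_VOIP in the question's code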

I would suggest playing around with the bitrate and bandwidth.
I have succeeded in making it work with the parameters described here:
https://ddanilov.me/how-to-enable-in-band-fec-for-opus-codec/.
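For reference, enabling in-band FEC along the lines of that article boils down to something like this sketch; the 10% expected-loss figure is illustrative, and nextPacket/nextPacketBytes stand for the packet that followed a lost one:
// Encoder side: allow FEC and tell the encoder how much loss to plan for.
opus_encoder_ctl(_encoder, OPUS_SET_INBAND_FEC(1));
opus_encoder_ctl(_encoder, OPUS_SET_PACKET_LOSS_PERC(10));
// Decoder side: when packet N was lost but N+1 arrived, first decode N+1
// with fec = 1 to recover an approximation of the lost frame.
opus_decode_float(_decoder, nextPacket, nextPacketBytes, decodedPacket, FRAME_SIZE, 1);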

Related

AudioFileWriteBytes performance for stereo file

I'm writing a stereo wave file with AudioFileWriteBytes (CoreAudio / iOS) and the only way I can get it to work is by calling it for each sample on each channel.
The following code works:
// Prepare the AudioStreamBasicDescription
AudioStreamBasicDescription asbd = {
.mSampleRate = session.samplerate,
.mFormatID = kAudioFormatLinearPCM,
.mFormatFlags = kAudioFormatFlagIsBigEndian | kAudioFormatFlagIsSignedInteger | kAudioFormatFlagIsPacked,
.mChannelsPerFrame = 2,
.mBitsPerChannel = 16,
.mFramesPerPacket = 1, // Always 1 for uncompressed formats
.mBytesPerPacket = 4, // 16 bits for 2 channels = 4 bytes
.mBytesPerFrame = 4 // 16 bits for 2 channels = 4 bytes
};
// Set up the file
AudioFileID audioFile;
OSStatus audioError = noErr;
audioError = AudioFileCreateWithURL((__bridge CFURLRef)fileURL, kAudioFileAIFFType, &asbd, kAudioFileFlags_EraseFile, &audioFile);
if (audioError != noErr) {
NSLog(@"Error creating file");
return;
}
// Write samples
UInt64 currentFrame = 0;
while (currentFrame < totalLengthInFrames) {
UInt64 numberOfFramesToWrite = totalLengthInFrames - currentFrame;
if (numberOfFramesToWrite > 2048) {
numberOfFramesToWrite = 2048;
}
UInt32 sampleByteCount = sizeof(int16_t);
UInt32 bytesToWrite = (UInt32)numberOfFramesToWrite * sampleByteCount;
int16_t *sampleBufferLeft = (int16_t *)malloc(bytesToWrite);
int16_t *sampleBufferRight = (int16_t *)malloc(bytesToWrite);
// Some magic to fill the buffers
for (int j = 0; j < numberOfFramesToWrite; j++) {
int16_t left = CFSwapInt16HostToBig(sampleBufferLeft[j]);
int16_t right = CFSwapInt16HostToBig(sampleBufferRight[j]);
audioError = AudioFileWriteBytes(audioFile, false, (currentFrame + j) * 4, &sampleByteCount, &left);
assert(audioError == noErr);
audioError = AudioFileWriteBytes(audioFile, false, (currentFrame + j) * 4 + 2, &sampleByteCount, &right);
assert(audioError == noErr);
}
free(sampleBufferLeft);
free(sampleBufferRight);
currentFrame += numberOfFramesToWrite;
}
However, it is (obviously) very slow and inefficient.
I can't find anything on how to use it with a big buffer so that I can write more than a single sample while also writing 2 channels.
I tried making a buffer going LRLRLRLR (left / right), and then write that with just one AudioFileWriteBytes call. I expected that to work, but it produced a file filled with noise.
This is the code:
UInt64 currentFrame = 0;
UInt64 bytePos = 0;
while (currentFrame < totalLengthInFrames) {
UInt64 numberOfFramesToWrite = totalLengthInFrames - currentFrame;
if (numberOfFramesToWrite > 2048) {
numberOfFramesToWrite = 2048;
}
UInt32 sampleByteCount = sizeof(int16_t);
UInt32 bytesInBuffer = (UInt32)numberOfFramesToWrite * sampleByteCount;
UInt32 bytesInOutputBuffer = (UInt32)numberOfFramesToWrite * sampleByteCount * 2;
int16_t *sampleBufferLeft = (int16_t *)malloc(bytesInBuffer);
int16_t *sampleBufferRight = (int16_t *)malloc(bytesInBuffer);
int16_t *outputBuffer = (int16_t *)malloc(bytesInOutputBuffer);
// Some magic to fill the buffers
for (int j = 0; j < numberOfFramesToWrite; j++) {
int16_t left = CFSwapInt16HostToBig(sampleBufferLeft[j]);
int16_t right = CFSwapInt16HostToBig(sampleBufferRight[j]);
outputBuffer[(j * 2)] = left;
outputBuffer[(j * 2) + 1] = right;
}
audioError = AudioFileWriteBytes(audioFile, false, bytePos, &bytesInOutputBuffer, &outputBuffer);
assert(audioError == noErr);
free(sampleBufferLeft);
free(sampleBufferRight);
free(outputBuffer);
bytePos += bytesInOutputBuffer;
currentFrame += numberOfFramesToWrite;
}
I also tried to just write the buffers at once (2048*L, 2048*R, etc.) which I did not expect to work, and it didn't.
How do I speed this up AND get a working wave file?
I tried making a buffer going LRLRLRLR (left / right), and then write that with just one AudioFileWriteBytes call.
This is the correct approach if using (the rather difficult) Audio File Services.
If possible, instead of the very low-level Audio File Services, use Extended Audio File Services. It is a wrapper around Audio File Services that has built-in format converters. Or better yet, use AVAudioFile; it is a wrapper around Extended Audio File Services that covers most common use cases.
If you are set on using Audio File Services, you'll have to interleave the audio manually like you had tried. Maybe show the code where you attempted this.
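To illustrate the AVAudioFile route (a sketch, not the asker's code; the settings mirror the ASBD above, and fileURL is assumed to exist):
#import <AVFoundation/AVFoundation.h>
NSError *error = nil;
AVAudioFile *file = [[AVAudioFile alloc] initForWriting:fileURL
settings:@{ AVFormatIDKey: @(kAudioFormatLinearPCM),
AVSampleRateKey: @44100.0,
AVNumberOfChannelsKey: @2,
AVLinearPCMBitDepthKey: @16 }
error:&error];
// Fill a deinterleaved float buffer; AVAudioFile converts to the file format,
// including the interleaving and the int16 conversion.
AVAudioPCMBuffer *buffer = [[AVAudioPCMBuffer alloc] initWithPCMFormat:file.processingFormat frameCapacity:2048];
// ... write the left samples into buffer.floatChannelData[0]
// and the right samples into buffer.floatChannelData[1] ...
buffer.frameLength = 2048;
[file writeFromBuffer:buffer error:&error];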

Opus for iOS, crashing with 16000 sample rate

I am developing a VoIP application with Opus for iOS (Objective-C and C++).
It works fine with 8000, 12000, 24000 and 48000 sampling rates, but with 16000 the application crashes in the opus_encode method.
Here is what i am doing:
m_oAudioSession = [AVAudioSession sharedInstance];
[m_oAudioSession setCategory:AVAudioSessionCategoryPlayAndRecord error:&m_oError];
[m_oAudioSession setMode:AVAudioSessionModeVoiceChat error:&m_oError];
[m_oAudioSession setPreferredSampleRate:VOIP_AUDIO_DRIVER_DEFAULT_SAMPLE_RATE error:&m_oError];
[m_oAudioSession setPreferredInputNumberOfChannels:VOIP_AUDIO_DRIVER_DEFAULT_INPUT_CHANNELS error:&m_oError];
[m_oAudioSession setPreferredOutputNumberOfChannels:VOIP_AUDIO_DRIVER_DEFAULT_OUTPUT_CHANNELS error:&m_oError];
[m_oAudioSession setPreferredIOBufferDuration:VOIP_AUDIO_DRIVER_DEFAULT_BUFFER_DURATION error:&m_oError];
[m_oAudioSession setActive:YES error:&m_oError];
Constants:
VOIP_AUDIO_DRIVER_DEFAULT_SAMPLE_RATE is 16000
VOIP_AUDIO_DRIVER_DEFAULT_INPUT_CHANNELS is 1
VOIP_AUDIO_DRIVER_DEFAULT_OUTPUT_CHANNELS is 1
VOIP_AUDIO_DRIVER_DEFAULT_BUFFER_DURATION is 0.02
VOIP_AUDIO_DRIVER_FRAMES_PER_PACKET is 1
After that I use the real sampling rate and buffer duration from m_oAudioSession.sampleRate and m_oAudioSession.IOBufferDuration; they are stored in the m_fSampleRate and m_fBufferDuration variables.
The configurations are:
//Describes audio component:
m_sAudioDescription.componentType = kAudioUnitType_Output;
m_sAudioDescription.componentSubType = kAudioUnitSubType_VoiceProcessingIO/*kAudioUnitSubType_RemoteIO*/;
m_sAudioDescription.componentFlags = 0;
m_sAudioDescription.componentFlagsMask = 0;
m_sAudioDescription.componentManufacturer = kAudioUnitManufacturer_Apple;
m_sAudioFormat.mSampleRate = m_fSampleRate;
m_sAudioFormat.mFormatID = kAudioFormatLinearPCM;
m_sAudioFormat.mFormatFlags = kAudioFormatFlagIsSignedInteger | kAudioFormatFlagIsPacked;
m_sAudioFormat.mFramesPerPacket = VOIP_AUDIO_DRIVER_FRAMES_PER_PACKET;
m_sAudioFormat.mChannelsPerFrame = VOIP_AUDIO_DRIVER_DEFAULT_INPUT_CHANNELS;
m_sAudioFormat.mBitsPerChannel = (UInt32)(8 * m_iBytesPerSample);
m_sAudioFormat.mBytesPerFrame = (UInt32)((m_sAudioFormat.mBitsPerChannel / 8) * m_sAudioFormat.mChannelsPerFrame);
m_sAudioFormat.mBytesPerPacket = m_sAudioFormat.mBytesPerFrame * m_sAudioFormat.mFramesPerPacket;
m_sAudioFormat.mReserved = 0;
The calculations I make are:
m_iBytesPerSample = sizeof(/*AudioSampleType*/SInt16);
//Calculating buffer size:
int samplesPerFrame = (int)(m_fBufferDuration * m_fSampleRate) + 1;
m_iBufferSizeBytes = samplesPerFrame * m_iBytesPerSample;
//Allocating input buffer:
UInt32 inputBufferListSize = offsetof(AudioBufferList, mBuffers[0]) + (sizeof(AudioBuffer) * m_sAudioFormat.mChannelsPerFrame);
m_sInputBuffer = (AudioBufferList *)VoipAlloc(inputBufferListSize);
m_sInputBuffer->mNumberBuffers = m_sAudioFormat.mChannelsPerFrame;
//Pre-mallocating buffers for AudioBufferLists
for(VoipUInt32 tmp_int1 = 0; tmp_int1 < m_sInputBuffer->mNumberBuffers; tmp_int1++)
{
m_sInputBuffer->mBuffers[tmp_int1].mNumberChannels = VOIP_AUDIO_DRIVER_DEFAULT_INPUT_CHANNELS;
m_sInputBuffer->mBuffers[tmp_int1].mDataByteSize = (UInt32)m_iBufferSizeBytes;
m_sInputBuffer->mBuffers[tmp_int1].mData = VoipAlloc(m_iBufferSizeBytes);
memset(m_sInputBuffer->mBuffers[tmp_int1].mData, 0, m_iBufferSizeBytes);
}
The reading and writing from audio unit are done using m_sInputBuffer.
Here is the Opus creation:
m_oEncoder = opus_encoder_create(m_iSampleRate, m_iNumberOfChannels, VOIP_AUDIO_CODECS_OPUS_APPLICATION_TYPE, &_error);
if (_error < 0)
{
fprintf(stderr, "VoipAudioCodecs error: failed to create an encoder: %s\n", opus_strerror(_error));
return;
}
_error = opus_encoder_ctl(m_oEncoder, OPUS_SET_BITRATE(VOIP_AUDIO_CODECS_OPUS_BITRATE));
if (_error < 0)
{
fprintf(stderr, "VoipAudioCodecs error: failed to set the bitrate: %s\n", opus_strerror(_error));
return;
}
m_oDecoder = opus_decoder_create(m_iSampleRate, m_iNumberOfChannels, &_error);
if (_error < 0)
{
fprintf(stderr, "VoipAudioCodecs error: failed to create a decoder: %s\n", opus_strerror(_error));
return;
}
Opus configurations are:
VOIP_AUDIO_CODECS_OPUS_BITRATE is OPUS_BITRATE_MAX
//64000 //70400 //84800 //112000
VOIP_AUDIO_CODECS_OPUS_APPLICATION_TYPE is OPUS_APPLICATION_VOIP
//OPUS_APPLICATION_AUDIO
VOIP_AUDIO_CODECS_OPUS_MAX_FRAME_SIZE is 5760
//Minimum: (120ms; 5760 for 48kHz)
VOIP_AUDIO_CODECS_OPUS_BYTES_SIZE is 960
//120, 240, 480, 960, 1920, 2880
When I encode and decode I use these methods:
sVoipAudioCodecOpusEncoded* Encode_Opus(VoipInt16* rawSamples, int rawSamplesSize)
{
unsigned char encodedData[m_iMaxPacketSize];
VoipInt32 bytesEncoded;
int frameSize = rawSamplesSize / m_iBytesPerSample;
bytesEncoded = opus_encode(m_oEncoder, rawSamples, frameSize, encodedData, m_iMaxPacketSize);
if (bytesEncoded < 0)
{
fprintf(stderr, "VoipAudioCodecs error: encode failed: %s\n", opus_strerror(bytesEncoded));
return nullptr;
}
sVoipAudioCodecOpusEncoded* resultStruct = (sVoipAudioCodecOpusEncoded* )VoipAlloc(sizeof(sVoipAudioCodecOpusEncoded));
resultStruct->m_data = (unsigned char*)VoipAlloc(bytesEncoded);
memcpy(resultStruct->m_data, encodedData, bytesEncoded);
resultStruct->m_dataSize = bytesEncoded;
return resultStruct;
}
sVoipAudioCodecOpusDecoded* Decode_Opus(void* encodedSamples, VoipInt32 encodedSamplesSize)
{
VoipInt16 decodedPacket[VOIP_AUDIO_CODECS_OPUS_MAX_FRAME_SIZE];
int _frameSize = opus_decode(m_oDecoder, (const unsigned char*)encodedSamples, encodedSamplesSize, decodedPacket, VOIP_AUDIO_CODECS_OPUS_MAX_FRAME_SIZE, 0);
if (_frameSize < 0)
{
fprintf(stderr, "VoipAudioCodecs error: decoder failed: %s\n", opus_strerror(_frameSize));
return nullptr;
}
size_t frameSize = (size_t)_frameSize;
sVoipAudioCodecOpusDecoded* resultStruct = (sVoipAudioCodecOpusDecoded* )VoipAlloc(sizeof(sVoipAudioCodecOpusDecoded));
resultStruct->m_data = (VoipInt16*)VoipAlloc(frameSize * m_iBytesPerSample);
memcpy(resultStruct->m_data, decodedPacket, (frameSize * m_iBytesPerSample));
resultStruct->m_dataSize = frameSize * m_iBytesPerSample;
return resultStruct;
}
When the app should send data:
VoipUInt32 itemsForProcess = inputAudioQueue->getItemCount();
for (int tmp_queueItems = 0; tmp_queueItems < itemsForProcess; tmp_queueItems++)
{
sVoipQueue* tmp_samples = inputAudioQueue->popItem();
m_oCircularTempInputBuffer->writeDataToBuffer(tmp_samples->m_pData, tmp_samples->m_iDataSize);
while (void* tmp_buffer = m_oCircularTempInputBuffer->readDataFromBuffer(VOIP_AUDIO_CODECS_OPUS_BYTES_SIZE))
{
sVoipAudioCodecOpusEncoded* encodedSamples = Encode_Opus((VoipInt16*)tmp_buffer, VOIP_AUDIO_CODECS_OPUS_BYTES_SIZE);
//Then packetizing and the actual sending over a TCP socket…
}
//Rest of the code…
}
Here is the reading:
sVoipAudioCodecOpusDecoded* decodedSamples = Decode_Opus(inputPacket->m_pPacketData, (VoipInt32)inputPacket->m_iPacketSize);
if (decodedSamples != nullptr)
{
m_oCircularTempOutputBuffer->writeDataToBuffer(decodedSamples->m_data, decodedSamples->m_dataSize);
VoipFree((void**)&decodedSamples->m_data);
VoipFree((void**)&decodedSamples);
}
while (void* tmp_buffer = m_oCircularTempOutputBuffer->readDataFromBuffer(m_iBufferSizeBytes))
{
outputAudioQueue->pushItem(tmp_buffer, m_iBufferSizeBytes);
}
inputAudioQueue is a queue with recorded data from my audio unit’s callback.
outputAudioQueue is a queue used from my audio unit’s callback to play the sound.
m_iMaxPacketSize is the same as m_iBufferSizeBytes.
My questions are:
Are my calculations correct? If not, how can I improve them?
Do you see any mistakes in the code?
Do you have a suggestion for fixing the crash in the opus_encode method when the sampling rate is set to 16000?
Thank you in advance.
PS. I made some tests with a sampling rate of 16000 and found this: using the formula frame_duration = frame_size / sample_rate, and setting frame_duration as the preferredIOBufferDuration:
120 / 16000 = 0.0075 // AVAudioSession sets 0.008000 —— crashes
240 / 16000 = 0.015 // AVAudioSession sets 0.016000 —— crashes
480 / 16000 = 0.03 // AVAudioSession sets 0.032000 —— crashes
960 / 16000 = 0.06 // AVAudioSession sets 0.064000 —— crashes
1920 / 16000 = 0.12 // AVAudioSession sets 0.128000 —— works
2880 / 16000 = 0.18 // AVAudioSession sets 0.128000 —— crashes
So there is no encoder crash with sampling rate 16000 only when preferredIOBufferDuration is 0.12 (1920 frames), where AVAudioSession sets 0.128000. It works only in that case.
Any ideas?
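Not a confirmed diagnosis, but one detail worth checking against the frame-size table in the first question: VOIP_AUDIO_CODECS_OPUS_BYTES_SIZE is 960 bytes, i.e. 480 SInt16 samples per encode call. 480 samples is a legal Opus frame at 8000, 12000, 24000 and 48000 Hz (60, 40, 20 and 10 ms respectively), but at 16000 Hz it is 30 ms, which is not a valid Opus frame duration. Re-chunking the captured audio into exact 20 ms frames would sidestep the IO buffer duration entirely; a sketch, where the ring-buffer helpers are hypothetical stand-ins for m_oCircularTempInputBuffer's API:
#define OPUS_FRAME_SAMPLES 320 // 20 ms at 16 kHz, a legal Opus frame size
#define MAX_OPUS_PACKET 1275 // largest possible Opus packet
opus_int16 frame[OPUS_FRAME_SAMPLES];
unsigned char packet[MAX_OPUS_PACKET];
while (ringAvailable(ring) >= OPUS_FRAME_SAMPLES * sizeof(opus_int16)) {
ringRead(ring, frame, OPUS_FRAME_SAMPLES * sizeof(opus_int16));
opus_int32 n = opus_encode(m_oEncoder, frame, OPUS_FRAME_SAMPLES, packet, MAX_OPUS_PACKET);
if (n < 0) {
fprintf(stderr, "opus_encode failed: %s\n", opus_strerror(n));
break;
}
// packetize and send the n encoded bytes as before
}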

iOS using AudioQueueNewOutput to output sound in both left and right channel

I am developing an app to show a sine wave.
Outputting mono sound with AudioQueueNewOutput works fine, but I have no idea how to do stereo output.
I know that mChannelsPerFrame = 2 generates sound in both the left and right channels.
I also want to know the order in which bytes are sent to the left and right channels: does the first sample go to the left channel and the second to the right?
Code:
_audioFormat = new AudioStreamBasicDescription();
_audioFormat->mSampleRate = SAMPLE_RATE; // 44100
_audioFormat->mFormatID = kAudioFormatLinearPCM;
_audioFormat->mFormatFlags = kAudioFormatFlagIsSignedInteger | kAudioFormatFlagIsPacked;
_audioFormat->mFramesPerPacket = 1;
_audioFormat->mChannelsPerFrame = NUM_CHANNELS; // 1
_audioFormat->mBitsPerChannel = BITS_PER_CHANNEL; // 16
_audioFormat->mBytesPerPacket = BYTES_PER_FRAME; // 2
_audioFormat->mBytesPerFrame = BYTES_PER_FRAME; // 2
and
_sineTableLength = _audioFormat.mSampleRate / SAMPLE_LIMIT_FACTOR; // 44100/100 = 441
_sineTable = new SInt16[_sineTableLength];
for(int i = 0; i < _sineTableLength; i++)
{
// Transfer values between -1.0 and 1.0 to integer values between -sample max and sample max
_sineTable[i] = (SInt16)(sin(i * 2 * M_PI / _sineTableLength) * 32767);
}
and
AudioQueueNewOutput (&_audioFormat,
playbackCallback,
(__bridge void *)(self),
nil,
nil,
0,
&_queueObject);
static void playbackCallback (void* inUserData,
AudioQueueRef inAudioQueue,
AudioQueueBufferRef bufferReference){
SInt16* sample = (SInt16*)bufferReference->mAudioData;
// bufferSize 1024
for(int i = 0; i < bufferSize; i += _audioFormat.mBytesPerFrame, sample++)
{
// set value for *sample
// 9ms sin wave and 4.5ms 0
...
}
...
AudioQueueEnqueueBuffer(...)
}
Several days later, I found the answer.
First: the AudioStreamBasicDescription is set up as above, but with mChannelsPerFrame = 2 and mBytesPerPacket / mBytesPerFrame doubled to 4;
Then: bufferSize changes from 1024 to 2048;
And: the SInt16 in SInt16* sample = (SInt16*)bufferReference->mAudioData; changes to SInt32, because with double the channels there are double the bits per frame;
Last: within each 32-bit frame, each 16 bits carries the data for the left or right channel; just fill them with whatever you want.
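Putting that answer together as a sketch of the callback body (the constant value 0 stands in for the sine-table lookup): with interleaved 16-bit stereo LPCM, each 4-byte frame is a [left, right] pair, so the first 16-bit sample of a frame goes to the left channel and the second to the right:
static void playbackCallback(void *inUserData,
AudioQueueRef inAudioQueue,
AudioQueueBufferRef bufferReference) {
SInt16 *sample = (SInt16 *)bufferReference->mAudioData;
UInt32 numFrames = bufferReference->mAudioDataBytesCapacity / 4; // 2 channels * 2 bytes
for (UInt32 i = 0; i < numFrames; i++) {
SInt16 value = 0; // placeholder: next sine-table value
*sample++ = value; // first sample of the frame -> left channel
*sample++ = value; // second sample of the frame -> right channel
}
bufferReference->mAudioDataByteSize = numFrames * 4;
AudioQueueEnqueueBuffer(inAudioQueue, bufferReference, 0, NULL);
}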

Audio Queue Interface can deal with 40ms Audio Frame?

I am trying to record audio in segments of 40 ms.
Can the Audio Queue interface deal with 40 ms audio frames?
If yes, how can we achieve it?
Thanks.
Yes, it's possible; you need to configure the AudioQueue accordingly.
Basically, the AudioQueue buffer size has to be set for 40 ms, so it would be computed roughly like this:
int AQRecorder::ComputeRecordBufferSize(const AudioStreamBasicDescription *format, float seconds)
{
int packets, frames, bytes = 0;
try {
frames = (int)ceil(seconds * format->mSampleRate);
if (format->mBytesPerFrame > 0)
bytes = frames * format->mBytesPerFrame;
else {
UInt32 maxPacketSize;
if (format->mBytesPerPacket > 0)
maxPacketSize = format->mBytesPerPacket; // constant packet size
else {
UInt32 propertySize = sizeof(maxPacketSize);
XThrowIfError(AudioQueueGetProperty(mQueue, kAudioQueueProperty_MaximumOutputPacketSize, &maxPacketSize,
&propertySize), "couldn't get queue's maximum output packet size");
}
if (format->mFramesPerPacket > 0)
packets = frames / format->mFramesPerPacket;
else
packets = frames; // worst-case scenario: 1 frame in a packet
if (packets == 0) // sanity check
packets = 1;
bytes = packets * maxPacketSize;
}
} catch (CAXException e) {
char buf[256];
fprintf(stderr, "Error: %s (%s)\n", e.mOperation, e.FormatError(buf));
return 0;
}
return bytes;
}
and to set the format,
void AQRecorder::SetupAudioFormat(UInt32 inFormatID)
{
AudioStreamBasicDescription sRecordFormat;
FillOutASBDForLPCM (sRecordFormat,
SAMPLING_RATE,
1,
8*BYTES_PER_PACKET,
8*BYTES_PER_PACKET,
false,
false
);
memset(&mRecordFormat, 0, sizeof(mRecordFormat));
mRecordFormat.SetFrom(sRecordFormat);
}
For my case, the values of these macros are:
#define SAMPLING_RATE 16000
#define kNumberRecordBuffers 3
#define BYTES_PER_PACKET 2
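Plugging those macros into ComputeRecordBufferSize for a 40 ms buffer gives, as a worked example (16-bit mono LPCM assumed):
// frames = ceil(0.040 * 16000) = 640
// bytes = frames * mBytesPerFrame = 640 * 2 = 1280 bytes per queue buffer
int bufferSize = ComputeRecordBufferSize(&mRecordFormat, 0.040f); // -> 1280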

AudioConverterFillComplexBuffer work with Internet streamed mp3

I am currently streaming mp3 audio over the Internet. I am using AudioFileStream to parse the mp3 stream that comes through a CFReadStreamRef, decode the mp3 using AudioConverterFillComplexBuffer, copy the converted PCM data into a ring buffer, and finally play the PCM using RemoteIO.
The problem I am currently facing is that AudioConverterFillComplexBuffer always returns 0 (no error) but the conversion result seems incorrect. In detail, I notice:
A. The UInt32 *ioOutputDataPacketSize keeps the same value I sent in.
B. convertedData.mBuffers[0].mDataByteSize is always set to the size of the output buffer (no matter how big the buffer is).
C. I can only hear clicking noise with the output data.
Below is my procedure for rendering the audio.
The same procedure works for my Audio Queue implementation, so I believe I didn't do anything wrong in either the place where I invoke AudioConverterFillComplexBuffer or its callback.
I have been stuck on this issue for a long time. Any help would be highly appreciated.
Open an AudioFileStream.
// create an audio file stream parser
AudioFileTypeID fileTypeHint = kAudioFileMP3Type;
AudioFileStreamOpen(self, MyPropertyListenerProc, MyPacketsProc, fileTypeHint, &audioFileStream);
Handle the parsed data in the callback function ("MyPacketsProc").
void MyPacketsProc(void * inClientData,
UInt32 inNumberBytes,
UInt32 inNumberPackets,
const void * inInputData,
AudioStreamPacketDescription *inPacketDescriptions)
{
@synchronized(self)
{
// Init the audio converter.
if (!audioConverter)
AudioConverterNew(&asbd, &asbd_out, &audioConverter);
struct mp3Data mSettings;
memset(&mSettings, 0, sizeof(mSettings));
UInt32 packetsPerBuffer = 0;
UInt32 outputBufferSize = 1024 * 32; // 32 KB is a good starting point.
UInt32 sizePerPacket = asbd.mBytesPerPacket;
// Calculate the size per buffer.
// Variable Bit Rate Data.
if (sizePerPacket == 0)
{
UInt32 size = sizeof(sizePerPacket);
AudioConverterGetProperty(audioConverter, kAudioConverterPropertyMaximumOutputPacketSize, &size, &sizePerPacket);
if (sizePerPacket > outputBufferSize)
outputBufferSize = sizePerPacket;
packetsPerBuffer = outputBufferSize / sizePerPacket;
}
//CBR
else
packetsPerBuffer = outputBufferSize / sizePerPacket;
// Prepare the input data for the callback.
mSettings.inputBuffer.mDataByteSize = inNumberBytes;
mSettings.inputBuffer.mData = (void *)inInputData;
mSettings.inputBuffer.mNumberChannels = 1;
mSettings.numberPackets = inNumberPackets;
mSettings.packetDescription = inPacketDescriptions;
// Set up our output buffers
UInt8 * outputBuffer = (UInt8*)malloc(sizeof(UInt8) * outputBufferSize);
memset(outputBuffer, 0, outputBufferSize);
// describe output data buffers into which we can receive data.
AudioBufferList convertedData;
convertedData.mNumberBuffers = 1;
convertedData.mBuffers[0].mNumberChannels = 1;
convertedData.mBuffers[0].mDataByteSize = outputBufferSize;
convertedData.mBuffers[0].mData = outputBuffer;
// Convert.
UInt32 ioOutputDataPackets = packetsPerBuffer;
OSStatus result = AudioConverterFillComplexBuffer(audioConverter,
converterComplexInputDataProc,
&mSettings,
&ioOutputDataPackets,
&convertedData,
NULL
);
// Enqueue the output pcm data.
TPCircularBufferProduceBytes(&m_pcmBuffer, convertedData.mBuffers[0].mData, convertedData.mBuffers[0].mDataByteSize);
free(outputBuffer);
}
}
Feed the audio converter from its callback function ("converterComplexInputDataProc").
OSStatus converterComplexInputDataProc(AudioConverterRef inAudioConverter,
UInt32* ioNumberDataPackets,
AudioBufferList* ioData,
AudioStreamPacketDescription** ioDataPacketDescription,
void* inUserData)
{
struct mp3Data *THIS = (struct mp3Data *)inUserData;
if (THIS->inputBuffer.mDataByteSize > 0)
{
*ioNumberDataPackets = THIS->numberPackets;
ioData->mNumberBuffers = 1;
ioData->mBuffers[0].mDataByteSize = THIS->inputBuffer.mDataByteSize;
ioData->mBuffers[0].mData = THIS->inputBuffer.mData;
ioData->mBuffers[0].mNumberChannels = 1;
if (ioDataPacketDescription)
*ioDataPacketDescription = THIS->packetDescription;
}
else
*ioDataPacketDescription = 0;
return 0;
}
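One hedged observation, not from the thread: nothing in this proc ever marks the input as consumed, so if the converter calls it a second time within one AudioConverterFillComplexBuffer invocation it is handed the same bytes again, and the else branch never sets *ioNumberDataPackets. A common pattern looks like this:
// Inside the if branch, after handing the buffer to ioData:
THIS->inputBuffer.mDataByteSize = 0; // mark the input consumed
// In the else branch, signal "no more data for now":
*ioNumberDataPackets = 0;
if (ioDataPacketDescription)
*ioDataPacketDescription = NULL;
return 1; // any non-zero status pauses conversion until more data arrives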
Playback using the RemoteIO component.
The input and output AudioStreamBasicDescription.
Input:
Sample Rate: 16000
Format ID: .mp3
Format Flags: 0
Bytes per Packet: 0
Frames per Packet: 576
Bytes per Frame: 0
Channels per Frame: 1
Bits per Channel: 0
output:
Sample Rate: 44100
Format ID: lpcm
Format Flags: 3116
Bytes per Packet: 4
Frames per Packet: 1
Bytes per Frame: 4
Channels per Frame: 1
Bits per Channel: 32
