H264 Video Streaming over RTMP on iOS

With a bit of digging, I have found a library that extracts NAL units from an .mp4 file while it is being written. I'm attempting to packetize this information into FLV over RTMP using libavformat and libavcodec. I set up the video stream using:
-(void)setupVideoStream {
int ret = 0;
videoCodec = avcodec_find_decoder(STREAM_VIDEO_CODEC);
if (videoCodec == nil) {
NSLog(#"Could not find encoder %i", STREAM_VIDEO_CODEC);
return;
}
videoStream = avformat_new_stream(oc, videoCodec);
videoCodecContext = videoStream->codec;
videoCodecContext->codec_type = AVMEDIA_TYPE_VIDEO;
videoCodecContext->codec_id = STREAM_VIDEO_CODEC;
videoCodecContext->pix_fmt = AV_PIX_FMT_YUV420P;
videoCodecContext->profile = FF_PROFILE_H264_BASELINE;
videoCodecContext->bit_rate = 512000;
videoCodecContext->bit_rate_tolerance = 0;
videoCodecContext->width = STREAM_WIDTH;
videoCodecContext->height = STREAM_HEIGHT;
videoCodecContext->time_base.den = STREAM_TIME_BASE;
videoCodecContext->time_base.num = 1;
videoCodecContext->gop_size = STREAM_GOP;
videoCodecContext->has_b_frames = 0;
videoCodecContext->ticks_per_frame = 2;
videoCodecContext->qcompress = 0.6;
videoCodecContext->qmax = 51;
videoCodecContext->qmin = 10;
videoCodecContext->max_qdiff = 4;
videoCodecContext->i_quant_factor = 0.71;
if (oc->oformat->flags & AVFMT_GLOBALHEADER)
videoCodecContext->flags |= CODEC_FLAG_GLOBAL_HEADER;
videoCodecContext->extradata = avcCHeader;
videoCodecContext->extradata_size = avcCHeaderSize;
ret = avcodec_open2(videoStream->codec, videoCodec, NULL);
if (ret < 0)
NSLog(#"Could not open codec!");
}
Then I connect, and each time the library extracts a NALU, it returns an array holding one or two NALUs to my RTMPClient. The method that handles the actual streaming looks like this:
-(void)writeNALUToStream:(NSArray*)data time:(double)pts {
int ret = 0;
uint8_t *buffer = NULL;
int bufferSize = 0;
// Number of NALUs within the data array
int numNALUs = [data count];
// First NALU
NSData *fNALU = [data objectAtIndex:0];
int fLen = [fNALU length];
// If there is more than one NALU...
if (numNALUs > 1) {
// Second NALU
NSData *sNALU = [data objectAtIndex:1];
int sLen = [sNALU length];
// Allocate a buffer the size of first data and second data
buffer = av_malloc(fLen + sLen);
// Copy the first data bytes of fLen into the buffer
memcpy(buffer, [fNALU bytes], fLen);
// Copy the second data bytes of sLen into the buffer + fLen + 1
memcpy(buffer + fLen + 1, [sNALU bytes], sLen);
// Update the size of the buffer
bufferSize = fLen + sLen;
}else {
// Allocate a buffer the size of first data
buffer = av_malloc(fLen);
// Copy the first data bytes of fLen into the buffer
memcpy(buffer, [fNALU bytes], fLen);
// Update the size of the buffer
bufferSize = fLen;
}
// Initialize the packet
av_init_packet(&pkt);
//av_packet_from_data(&pkt, buffer, bufferSize);
// Set the packet data to the buffer
pkt.data = buffer;
pkt.size = bufferSize;
pkt.pts = pts;
// Stream index 0 is the video stream
pkt.stream_index = 0;
// Add a key frame flag every 15 frames
if ((processedFrames % 15) == 0)
pkt.flags |= AV_PKT_FLAG_KEY;
// Write the frame to the stream
ret = av_interleaved_write_frame(oc, &pkt);
if (ret < 0)
NSLog(#"Error writing frame %i to stream", processedFrames);
else {
// Update the number of frames successfully streamed
frameCount++;
// Update the number of bytes successfully sent
bytesSent += pkt.size;
}
// Update the number of frames processed
processedFrames++;
// Update the number of bytes processed
processedBytes += pkt.size;
free((uint8_t*)buffer);
// Free the packet
av_free_packet(&pkt);
}
After about 100 or so frames, I get an error:
malloc: *** error for object 0xe5bfa0: incorrect checksum for freed object - object was probably modified after being freed.
*** set a breakpoint in malloc_error_break to debug
I cannot seem to stop this from happening. I've tried commenting out the av_free_packet() call and the free(), as well as using av_packet_from_data() rather than initializing the packet and setting its data and size values myself.
My question is: how can I stop this error from happening? Also, according to Wireshark these are proper RTMP H.264 packets, but they play nothing more than a black screen. Is there some glaring error that I am overlooking?

It looks to me like you are overflowing your buffer and corrupting your stream here:
memcpy(buffer + fLen + 1, [sNALU bytes], sLen);
You are allocating fLen + sLen bytes then writing fLen + sLen + 1 bytes. Just get rid of the + 1.
Because your AVPacket is allocated on the stack, av_free_packet() is not needed.
Finally, it is considered good practice to allocate extra bytes for libav: av_malloc(size + FF_INPUT_BUFFER_PADDING_SIZE).
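Putting those two points together, a minimal sketch of the corrected two-NALU branch (same variables as the question's code; only the padded allocation and the removal of the stray + 1 change):
bufferSize = fLen + sLen;
// Pad the allocation as libav recommends
buffer = av_malloc(bufferSize + FF_INPUT_BUFFER_PADDING_SIZE);
// First NALU at the start of the buffer
memcpy(buffer, [fNALU bytes], fLen);
// Second NALU immediately after it (no + 1 offset)
memcpy(buffer + fLen, [sNALU bytes], sLen);
Since the buffer comes from av_malloc(), av_free() rather than free() is the usual way to release it.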

Related

AudioFileWriteBytes performance for stereo file

I'm writing a stereo wave file with AudioFileWriteBytes (CoreAudio / iOS) and the only way I can get it to work is by calling it for each sample on each channel.
The following code works:
// Prepare the format AudioStreamBasicDescription;
AudioStreamBasicDescription asbd = {
.mSampleRate = session.samplerate,
.mFormatID = kAudioFormatLinearPCM,
.mFormatFlags = kAudioFormatFlagIsBigEndian| kAudioFormatFlagIsSignedInteger | kAudioFormatFlagIsPacked,
.mChannelsPerFrame = 2,
.mBitsPerChannel = 16,
.mFramesPerPacket = 1, // Always 1 for uncompressed formats
.mBytesPerPacket = 4, // 16 bits for 2 channels = 4 bytes
.mBytesPerFrame = 4 // 16 bits for 2 channels = 4 bytes
};
// Set up the file
AudioFileID audioFile;
OSStatus audioError = noErr;
audioError = AudioFileCreateWithURL((__bridge CFURLRef)fileURL, kAudioFileAIFFType, &asbd, kAudioFileFlags_EraseFile, &audioFile);
if (audioError != noErr) {
NSLog(#"Error creating file");
return;
}
// Write samples
UInt64 currentFrame = 0;
while (currentFrame < totalLengthInFrames) {
UInt64 numberOfFramesToWrite = totalLengthInFrames - currentFrame;
if (numberOfFramesToWrite > 2048) {
numberOfFramesToWrite = 2048;
}
UInt32 sampleByteCount = sizeof(int16_t);
UInt32 bytesToWrite = (UInt32)numberOfFramesToWrite * sampleByteCount;
int16_t *sampleBufferLeft = (int16_t *)malloc(bytesToWrite);
int16_t *sampleBufferRight = (int16_t *)malloc(bytesToWrite);
// Some magic to fill the buffers
for (int j = 0; j < numberOfFramesToWrite; j++) {
int16_t left = CFSwapInt16HostToBig(sampleBufferLeft[j]);
int16_t right = CFSwapInt16HostToBig(sampleBufferRight[j]);
audioError = AudioFileWriteBytes(audioFile, false, (currentFrame + j) * 4, &sampleByteCount, &left);
assert(audioError == noErr);
audioError = AudioFileWriteBytes(audioFile, false, (currentFrame + j) * 4 + 2, &sampleByteCount, &right);
assert(audioError == noErr);
}
free(sampleBufferLeft);
free(sampleBufferRight);
currentFrame += numberOfFramesToWrite;
}
However, it is (obviously) very slow and inefficient.
I can't find anything on how to use it with a big buffer so that I can write more than a single sample while also writing 2 channels.
I tried making a buffer going LRLRLRLR (left / right), and then write that with just one AudioFileWriteBytes call. I expected that to work, but it produced a file filled with noise.
This is the code:
UInt64 currentFrame = 0;
UInt64 bytePos = 0;
while (currentFrame < totalLengthInFrames) {
UInt64 numberOfFramesToWrite = totalLengthInFrames - currentFrame;
if (numberOfFramesToWrite > 2048) {
numberOfFramesToWrite = 2048;
}
UInt32 sampleByteCount = sizeof(int16_t);
UInt32 bytesInBuffer = (UInt32)numberOfFramesToWrite * sampleByteCount;
UInt32 bytesInOutputBuffer = (UInt32)numberOfFramesToWrite * sampleByteCount * 2;
int16_t *sampleBufferLeft = (int16_t *)malloc(bytesInBuffer);
int16_t *sampleBufferRight = (int16_t *)malloc(bytesInBuffer);
int16_t *outputBuffer = (int16_t *)malloc(bytesInOutputBuffer);
// Some magic to fill the buffers
for (int j = 0; j < numberOfFramesToWrite; j++) {
int16_t left = CFSwapInt16HostToBig(sampleBufferLeft[j]);
int16_t right = CFSwapInt16HostToBig(sampleBufferRight[j]);
outputBuffer[(j * 2)] = left;
outputBuffer[(j * 2) + 1] = right;
}
audioError = AudioFileWriteBytes(audioFile, false, bytePos, &bytesInOutputBuffer, &outputBuffer);
assert(audioError == noErr);
free(sampleBufferLeft);
free(sampleBufferRight);
free(outputBuffer);
bytePos += bytesInOutputBuffer;
currentFrame += numberOfFramesToWrite;
}
I also tried to just write the buffers at once (2048*L, 2048*R, etc.) which I did not expect to work, and it didn't.
How do I speed this up AND get a working wave file?
I tried making a buffer going LRLRLRLR (left / right), and then write that with just one AudioFileWriteBytes call.
This is the correct approach if using (the rather difficult) Audio File Services.
If possible, instead of the very low-level Audio File Services, use Extended Audio File Services. It is a wrapper around Audio File Services that has built-in format converters. Better yet, use AVAudioFile, which is a wrapper around Extended Audio File Services and covers the most common use cases.
If you are set on using Audio File Services, you'll have to interleave the audio manually, as you had tried. Maybe show the code where you attempted this.
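If you stay with Audio File Services, here is a minimal sketch of that interleaved single-call write, reusing the question's sampleBufferLeft / sampleBufferRight, numberOfFramesToWrite and currentFrame (note that AudioFileWriteBytes takes the buffer pointer itself, not its address):
UInt32 bytesToWrite = (UInt32)(numberOfFramesToWrite * 2 * sizeof(int16_t));
int16_t *interleaved = (int16_t *)malloc(bytesToWrite);
for (int j = 0; j < numberOfFramesToWrite; j++) {
    // AIFF wants big-endian samples; interleave as L, R, L, R, ...
    interleaved[j * 2]     = CFSwapInt16HostToBig(sampleBufferLeft[j]);
    interleaved[j * 2 + 1] = CFSwapInt16HostToBig(sampleBufferRight[j]);
}
audioError = AudioFileWriteBytes(audioFile, false, currentFrame * 4, &bytesToWrite, interleaved);
assert(audioError == noErr);
free(interleaved);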

iOS using AudioQueueNewOutput to output sound in both left and right channel

I am developing an app to show a sine wave.
Using AudioQueueNewOutput to output mono sound works fine, but when it comes to stereo output I have no idea how to do it.
I know that mChannelsPerFrame = 2 generates the wave on both the left and right channels.
I also want to know the order in which bytes are sent to the left and right channels. Does the first sample go to the left channel and the second to the right?
Code:
_audioFormat = new AudioStreamBasicDescription();
_audioFormat->mSampleRate = SAMPLE_RATE; // 44100
_audioFormat->mFormatID = kAudioFormatLinearPCM;
_audioFormat->mFormatFlags = kAudioFormatFlagIsSignedInteger | kAudioFormatFlagIsPacked;
_audioFormat->mFramesPerPacket = 1;
_audioFormat->mChannelsPerFrame = NUM_CHANNELS; // 1
_audioFormat->mBitsPerChannel = BITS_PER_CHANNEL; // 16
_audioFormat->mBytesPerPacket = BYTES_PER_FRAME; // 2
_audioFormat->mBytesPerFrame = BYTES_PER_FRAME; // 2
and
_sineTableLength = _audioFormat.mSampleRate / SAMPLE_LIMIT_FACTOR; // 44100/100 = 441
_sineTable = new SInt16[_sineTableLength];
for(int i = 0; i < _sineTableLength; i++)
{
// Transfer values between -1.0 and 1.0 to integer values between -sample max and sample max
_sineTable[i] = (SInt16)(sin(i * 2 * M_PI / _sineTableLength) * 32767);
}
and
AudioQueueNewOutput (&_audioFormat,
playbackCallback,
(__bridge void *)(self),
nil,
nil,
0,
&_queueObject);
static void playbackCallback (void* inUserData,
AudioQueueRef inAudioQueue,
AudioQueueBufferRef bufferReference){
SInt16* sample = (SInt16*)bufferReference->mAudioData;
// bufferSize 1024
for(int i = 0; i < bufferSize; i += _audioFormat.mBytesPerFrame, sample++)
{
// set value for *sample
// 9ms sin wave and 4.5ms 0
...
}
...
AudioQueueEnqueueBuffer(...)
}
Several days later, I found the answer.
First: the AudioStreamBasicDescription can be set up just like this;
Then: bufferSize changes from 1024 to 2048;
And: the SInt16 in SInt16* sample = (SInt16*)bufferReference->mAudioData; changes to SInt32, because with double the channels each frame carries double the bits;
Last: each 16-bit half of a sample holds the data the left or right channel needs; just feed it whatever you want.
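A hedged sketch of the fill loop under those assumptions (the question's _sineTable and the doubled 2048-byte buffer; writing two SInt16 values per frame is equivalent to the answer's one-SInt32-per-frame view):
SInt16 *sample = (SInt16 *)bufferReference->mAudioData;
int frames = bufferSize / 4; // 4 bytes per stereo frame, so 512 frames in a 2048-byte buffer
for (int frame = 0; frame < frames; frame++) {
    SInt16 value = _sineTable[frame % _sineTableLength];
    *sample++ = value; // left channel
    *sample++ = value; // right channel (can carry a different signal)
}
bufferReference->mAudioDataByteSize = bufferSize;
AudioQueueEnqueueBuffer(inAudioQueue, bufferReference, 0, NULL);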

AudioConverter number of packets is wrong

I've set up a class to convert audio from one format to another given an input and output AudioStreamBasicDescription. When I convert Linear PCM from the mic to iLBC, it works and gives me 6 packets when I give it 1024 packets from the AudioUnitRender function. I then send those 226 bytes via UDP to the same app running on a different device. The problem is that when I use the same class to convert back into Linear PCM for giving to an audio unit input, the AudioConverterFillComplexBuffer function doesn't give 1024 packets, it gives 960... This means that the audio unit input is expecting 4096 bytes (2048 x 2 for stereo) but I can only give it 3190 or so, and so it sounds really crackly and distorted...
If I give AudioConverter 1024 packets of LinearPCM, convert to iLBC, convert back to LinearPCM, surely I should get 1024 packets again?
Audio converter function:
-(void) doConvert {
// Start converting
if (converting) return;
converting = YES;
while (true) {
// Get next buffer
id bfr = [buffers getNextBuffer];
if (!bfr) {
converting = NO;
return;
}
// Get info
NSArray* bfrs = ([bfr isKindOfClass:[NSArray class]] ? bfr : @[bfr]);
int bfrSize = 0;
for (NSData* dat in bfrs) bfrSize += dat.length;
int inputPackets = bfrSize / self.inputFormat.mBytesPerPacket;
int outputPackets = (inputPackets * self.inputFormat.mFramesPerPacket) / self.outputFormat.mFramesPerPacket;
// Create output buffer
AudioBufferList* bufferList = (AudioBufferList*) malloc(sizeof(AudioBufferList) * self.outputFormat.mChannelsPerFrame);
bufferList -> mNumberBuffers = self.outputFormat.mChannelsPerFrame;
for (int i = 0 ; i < self.outputFormat.mChannelsPerFrame ; i++) {
bufferList -> mBuffers[i].mNumberChannels = 1;
bufferList -> mBuffers[i].mDataByteSize = 4*1024;
bufferList -> mBuffers[i].mData = malloc(bufferList -> mBuffers[i].mDataByteSize);
}
// Create input buffer
AudioBufferList* inputBufferList = (AudioBufferList*) malloc(sizeof(AudioBufferList) * bfrs.count);
inputBufferList -> mNumberBuffers = bfrs.count;
for (int i = 0 ; i < bfrs.count ; i++) {
inputBufferList -> mBuffers[i].mNumberChannels = 1;
inputBufferList -> mBuffers[i].mDataByteSize = [[bfrs objectAtIndex:i] length];
inputBufferList -> mBuffers[i].mData = (void*) [[bfrs objectAtIndex:i] bytes];
}
// Create sound data payload
struct SoundDataPayload payload;
payload.data = inputBufferList;
payload.numPackets = inputPackets;
payload.packetDescriptions = NULL;
payload.used = NO;
// Convert data
UInt32 numPackets = outputPackets;
OSStatus err = AudioConverterFillComplexBuffer(converter, acvConverterComplexInput, &payload, &numPackets, bufferList, NULL);
if (err)
continue;
// Check how to output
if (bufferList -> mNumberBuffers > 1) {
// Output as array
NSMutableArray* array = [NSMutableArray arrayWithCapacity:bufferList -> mNumberBuffers];
for (int i = 0 ; i < bufferList -> mNumberBuffers ; i++)
[array addObject:[NSData dataWithBytes:bufferList -> mBuffers[i].mData length:bufferList -> mBuffers[i].mDataByteSize]];
// Save
[convertedBuffers addBuffer:array];
} else {
// Output as data
NSData* newData = [NSData dataWithBytes:bufferList -> mBuffers[0].mData length:bufferList -> mBuffers[0].mDataByteSize];
// Save
[convertedBuffers addBuffer:newData];
}
// Free memory
for (int i = 0 ; i < bufferList -> mNumberBuffers ; i++)
free(bufferList -> mBuffers[i].mData);
free(inputBufferList);
free(bufferList);
// Tell delegate
if (self.convertHandler)
//dispatch_async(dispatch_get_main_queue(), self.convertHandler);
self.convertHandler();
}
}
Formats when converting to iLBC:
// Get input format from mic
UInt32 size = sizeof(AudioStreamBasicDescription);
AudioStreamBasicDescription inputDesc;
AudioUnitGetProperty(self.ioUnit, kAudioUnitProperty_StreamFormat, kAudioUnitScope_Output, 1, &inputDesc, &size);
// Set output stream description
size = sizeof(AudioStreamBasicDescription);
AudioStreamBasicDescription outputDescription;
memset(&outputDescription, 0, size);
outputDescription.mFormatID = kAudioFormatiLBC;
OSStatus err = AudioFormatGetProperty(kAudioFormatProperty_FormatInfo, 0, NULL, &size, &outputDescription);
Formats when converting from iLBC:
// Set input stream description
size = sizeof(AudioStreamBasicDescription);
AudioStreamBasicDescription inputDescription;
memset(&inputDescription, 0, size);
inputDescription.mFormatID = kAudioFormatiLBC;
AudioFormatGetProperty(kAudioFormatProperty_FormatInfo, 0, NULL, &size, &inputDescription);
// Set output stream description
UInt32 size = sizeof(AudioStreamBasicDescription);
AudioStreamBasicDescription outputDesc;
AudioUnitGetProperty(unit, kAudioUnitProperty_StreamFormat, kAudioUnitScope_Input, 0, &outputDesc, &size);
You have to use an intermediate buffer to save up enough bytes from enough incoming packets to exactly match the number requested by the audio unit input. Relying on any single compressed UDP packet to decode to exactly the right size won't work.
The AudioConverter may buffer samples and change the packet sizes depending on the compression format.
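A rough sketch of that intermediate buffering, with hypothetical names (pcmFifo, bytesNeeded); in real render-callback code a lock-free ring buffer such as TPCircularBuffer (used elsewhere on this page) is the safer choice, and NSMutableData is used here only to keep the idea visible:
NSMutableData *pcmFifo = [NSMutableData data]; // shared between the decode path and the render callback; needs locking
// Producer side: append whatever the converter produced, however many bytes that is
[pcmFifo appendBytes:bufferList->mBuffers[0].mData length:bufferList->mBuffers[0].mDataByteSize];
// Consumer side: the audio unit input asks for an exact byte count, e.g. 4096
if (pcmFifo.length >= bytesNeeded) {
    memcpy(ioData->mBuffers[0].mData, pcmFifo.bytes, bytesNeeded);
    [pcmFifo replaceBytesInRange:NSMakeRange(0, bytesNeeded) withBytes:NULL length:0];
} else {
    memset(ioData->mBuffers[0].mData, 0, bytesNeeded); // not enough buffered yet: output silence
}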

AudioConverterFillComplexBuffer work with Internet streamed mp3

I am currently streaming mp3 audio over the Internet. I am using AudioFileStream to parse the mp3 stream that comes through a CFReadStreamRef, decoding the mp3 with AudioConverterFillComplexBuffer, copying the converted PCM data into a ring buffer, and finally playing the PCM using RemoteIO.
The problem I am currently facing is that AudioConverterFillComplexBuffer always returns 0 (no error), but the conversion result seems incorrect. In detail, I notice:
A. The UInt32 *ioOutputDataPacketSize keeps the same value I sent in.
B. The convertedData.mBuffers[0].mDataByteSize is always set to the size of the output buffer (no matter how big the buffer is).
C. I can only hear clicking noise in the output data.
Below is my procedure for rendering the audio.
The same procedure works for my audio queue implementation, so I believe
I did nothing wrong in either the place where I invoke AudioConverterFillComplexBuffer or its callback.
I have been stuck on this issue for a long time. Any help would be highly appreciated.
Open an AudioFileStream.
// create an audio file stream parser
AudioFileTypeID fileTypeHint = kAudioFileMP3Type;
AudioFileStreamOpen(self, MyPropertyListenerProc, MyPacketsProc, fileTypeHint, &audioFileStream);
Handle the parsed data in the callback function ("MyPacketsProc").
void MyPacketsProc(void * inClientData,
UInt32 inNumberBytes,
UInt32 inNumberPackets,
const void * inInputData,
AudioStreamPacketDescription *inPacketDescriptions)
{
@synchronized(self)
{
// Init the audio converter.
if (!audioConverter)
AudioConverterNew(&asbd, &asbd_out, &audioConverter);
struct mp3Data mSettings;
memset(&mSettings, 0, sizeof(mSettings));
UInt32 packetsPerBuffer = 0;
UInt32 outputBufferSize = 1024 * 32; // 32 KB is a good starting point.
UInt32 sizePerPacket = asbd.mBytesPerPacket;
// Calculate the size per buffer.
// Variable Bit Rate Data.
if (sizePerPacket == 0)
{
UInt32 size = sizeof(sizePerPacket);
AudioConverterGetProperty(audioConverter, kAudioConverterPropertyMaximumOutputPacketSize, &size, &sizePerPacket);
if (sizePerPacket > outputBufferSize)
outputBufferSize = sizePerPacket;
packetsPerBuffer = outputBufferSize / sizePerPacket;
}
//CBR
else
packetsPerBuffer = outputBufferSize / sizePerPacket;
// Prepare the input data for the callback.
mSettings.inputBuffer.mDataByteSize = inNumberBytes;
mSettings.inputBuffer.mData = (void *)inInputData;
mSettings.inputBuffer.mNumberChannels = 1;
mSettings.numberPackets = inNumberPackets;
mSettings.packetDescription = inPacketDescriptions;
// Set up our output buffers
UInt8 * outputBuffer = (UInt8*)malloc(sizeof(UInt8) * outputBufferSize);
memset(outputBuffer, 0, outputBufferSize);
// describe output data buffers into which we can receive data.
AudioBufferList convertedData;
convertedData.mNumberBuffers = 1;
convertedData.mBuffers[0].mNumberChannels = 1;
convertedData.mBuffers[0].mDataByteSize = outputBufferSize;
convertedData.mBuffers[0].mData = outputBuffer;
// Convert.
UInt32 ioOutputDataPackets = packetsPerBuffer;
OSStatus result = AudioConverterFillComplexBuffer(audioConverter,
converterComplexInputDataProc,
&mSettings,
&ioOutputDataPackets,
&convertedData,
NULL
);
// Enqueue the output PCM data.
TPCircularBufferProduceBytes(&m_pcmBuffer, convertedData.mBuffers[0].mData, convertedData.mBuffers[0].mDataByteSize);
free(outputBuffer);
}
}
Feed the audio converter from its callback function ("converterComplexInputDataProc").
OSStatus converterComplexInputDataProc(AudioConverterRef inAudioConverter,
UInt32* ioNumberDataPackets,
AudioBufferList* ioData,
AudioStreamPacketDescription** ioDataPacketDescription,
void* inUserData)
{
struct mp3Data *THIS = (struct mp3Data *)inUserData;
if (THIS->inputBuffer.mDataByteSize > 0)
{
*ioNumberDataPackets = THIS->numberPackets;
ioData->mNumberBuffers = 1;
ioData->mBuffers[0].mDataByteSize = THIS->inputBuffer.mDataByteSize;
ioData->mBuffers[0].mData = THIS->inputBuffer.mData;
ioData->mBuffers[0].mNumberChannels = 1;
if (ioDataPacketDescription)
*ioDataPacketDescription = THIS->packetDescription;
}
else
*ioDataPacketDescription = 0;
return 0;
}
Playback using the RemoteIO component.
The input and output AudioStreamBasicDescription.
Input:
Sample Rate: 16000
Format ID: .mp3
Format Flags: 0
Bytes per Packet: 0
Frames per Packet: 576
Bytes per Frame: 0
Channels per Frame: 1
Bits per Channel: 0
output:
Sample Rate: 44100
Format ID: lpcm
Format Flags: 3116
Bytes per Packet: 4
Frames per Packet: 1
Bytes per Frame: 4
Channels per Frame: 1
Bits per Channel: 32

iOS: How to read an audio file into a float buffer

I have a really short audio file, say a 10th of a second, in (say) .PCM format.
I want to use RemoteIO to loop through the file repeatedly to produce a continuous musical tone. So how do I read this into an array of floats?
EDIT: while I could probably dig out the file format, extract the file into an NSData and process it manually, I'm guessing there is a more sensible generic approach... (one that e.g. copes with different formats)
You can use ExtAudioFile to read data from any supported data format in numerous client formats. Here is an example to read a file as 16-bit integers:
CFURLRef url = /* ... */;
ExtAudioFileRef eaf;
OSStatus err = ExtAudioFileOpenURL((CFURLRef)url, &eaf);
if(noErr != err)
/* handle error */
AudioStreamBasicDescription format;
format.mSampleRate = 44100;
format.mFormatID = kAudioFormatLinearPCM;
format.mFormatFlags = kAudioFormatFlagIsSignedInteger | kAudioFormatFlagIsPacked;
format.mBitsPerChannel = 16;
format.mChannelsPerFrame = 2;
format.mBytesPerFrame = format.mChannelsPerFrame * 2;
format.mFramesPerPacket = 1;
format.mBytesPerPacket = format.mFramesPerPacket * format.mBytesPerFrame;
err = ExtAudioFileSetProperty(eaf, kExtAudioFileProperty_ClientDataFormat, sizeof(format), &format);
/* Read the file contents using ExtAudioFileRead */
If you wanted Float32 data, you would set up format like this:
format.mFormatID = kAudioFormatLinearPCM;
format.mFormatFlags = kAudioFormatFlagsNativeFloatPacked;
format.mBitsPerChannel = 32;
This is the code I have used to convert my audio data (audio file ) into floating point representation and saved into an array.
-(void) PrintFloatDataFromAudioFile {
NSString * name = #"Filename"; //YOUR FILE NAME
NSString * source = [[NSBundle mainBundle] pathForResource:name ofType:#"m4a"]; // SPECIFY YOUR FILE FORMAT
const char *cString = [source cStringUsingEncoding:NSASCIIStringEncoding];
CFStringRef str = CFStringCreateWithCString(
NULL,
cString,
kCFStringEncodingMacRoman
);
CFURLRef inputFileURL = CFURLCreateWithFileSystemPath(
kCFAllocatorDefault,
str,
kCFURLPOSIXPathStyle,
false
);
ExtAudioFileRef fileRef;
ExtAudioFileOpenURL(inputFileURL, &fileRef);
AudioStreamBasicDescription audioFormat;
audioFormat.mSampleRate = 44100; // GIVE YOUR SAMPLING RATE
audioFormat.mFormatID = kAudioFormatLinearPCM;
audioFormat.mFormatFlags = kLinearPCMFormatFlagIsFloat;
audioFormat.mBitsPerChannel = sizeof(Float32) * 8;
audioFormat.mChannelsPerFrame = 1; // Mono
audioFormat.mBytesPerFrame = audioFormat.mChannelsPerFrame * sizeof(Float32); // == sizeof(Float32)
audioFormat.mFramesPerPacket = 1;
audioFormat.mBytesPerPacket = audioFormat.mFramesPerPacket * audioFormat.mBytesPerFrame; // = sizeof(Float32)
// 3) Apply audio format to the Extended Audio File
ExtAudioFileSetProperty(
fileRef,
kExtAudioFileProperty_ClientDataFormat,
sizeof (AudioStreamBasicDescription), //= audioFormat
&audioFormat);
int numSamples = 1024; //How many samples to read in at a time
UInt32 sizePerPacket = audioFormat.mBytesPerPacket; // = sizeof(Float32) = 4 bytes
UInt32 packetsPerBuffer = numSamples;
UInt32 outputBufferSize = packetsPerBuffer * sizePerPacket;
// So the lvalue of outputBuffer is the memory location where we have reserved space
UInt8 *outputBuffer = (UInt8 *)malloc(sizeof(UInt8) * outputBufferSize);
AudioBufferList convertedData ;//= malloc(sizeof(convertedData));
convertedData.mNumberBuffers = 1; // Set this to 1 for mono
convertedData.mBuffers[0].mNumberChannels = audioFormat.mChannelsPerFrame; //also = 1
convertedData.mBuffers[0].mDataByteSize = outputBufferSize;
convertedData.mBuffers[0].mData = outputBuffer; //
UInt32 frameCount = numSamples;
float *samplesAsCArray;
int j =0;
double floatDataArray[882000] ; // SPECIFY YOUR DATA LIMIT MINE WAS 882000 , SHOULD BE EQUAL TO OR MORE THAN DATA LIMIT
while (frameCount > 0) {
ExtAudioFileRead(
fileRef,
&frameCount,
&convertedData
);
if (frameCount > 0) {
AudioBuffer audioBuffer = convertedData.mBuffers[0];
samplesAsCArray = (float *)audioBuffer.mData; // CAST YOUR mData INTO FLOAT
for (int i =0; i<1024 /*numSamples */; i++) { //YOU CAN PUT numSamples INTEAD OF 1024
floatDataArray[j] = (double)samplesAsCArray[i] ; //PUT YOUR DATA INTO FLOAT ARRAY
printf("\n%f",floatDataArray[j]); //PRINT YOUR ARRAY'S DATA IN FLOAT FORM RANGING -1 TO +1
j++;
}
}
}}
I'm not familiar with RemoteIO, but I am familiar with WAV's and thought I'd post some format information on them. If you need, you should be able to easily parse out information such as duration, bit rate, etc...
First, here is an excellent website detailing the WAVE PCM soundfile format. This site also does an excellent job illustrating what the different byte addresses inside the "fmt" sub-chunk refer to.
WAVE File format
A WAVE is composed of a "RIFF" chunk and subsequent sub-chunks
Every chunk is at least 8 bytes
First 4 bytes is the Chunk ID
Next 4 bytes is the Chunk Size (The Chunk Size gives the size of the remainder of the chunk excluding the 8 bytes used for the Chunk ID and Chunk Size)
Every WAVE has the following chunks / sub chunks
"RIFF" (first and only chunk. All the rest are technically sub-chunks.)
"fmt " (usually the first sub-chunk after "RIFF" but can be anywhere between "RIFF" and "data". This chunk has information about the WAV such as number of channels, sample rate, and byte rate)
"data" (must be the last sub-chunk and contains all the sound data)
Common WAVE Audio Formats:
PCM
IEEE_Float
PCM_EXTENSIBLE (with a sub format of PCM or IEEE_FLOAT)
WAVE Duration and Size
A WAVE File's duration can be calculated as follows:
seconds = DataChunkSize / ByteRate
Where
ByteRate = SampleRate * NumChannels * BitsPerSample/8
and DataChunkSize does not include the 8 bytes reserved for the ID and Size of the "data" sub-chunk.
Knowing this, the DataChunkSize can be calculated if you know the duration of the WAV and the ByteRate.
DataChunkSize = seconds * ByteRate
This can be useful for calculating the size of the wav data when converting from formats like mp3 or wma. Note that a typical wav header is 44 bytes followed by DataChunkSize (this is always the case if the wav was converted using the Normalizer tool - at least as of this writing).
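For instance, a quick check of those formulas with hypothetical header values (16-bit stereo at 44100 Hz):
UInt32 sampleRate = 44100, numChannels = 2, bitsPerSample = 16;
// ByteRate = SampleRate * NumChannels * BitsPerSample / 8
UInt32 byteRate = sampleRate * numChannels * (bitsPerSample / 8); // 176400 bytes per second
// DataChunkSize = seconds * ByteRate, so 10 seconds of audio is:
UInt32 dataChunkSize = 10 * byteRate;                             // 1764000 bytes
// And back again: seconds = DataChunkSize / ByteRate
double seconds = (double)dataChunkSize / byteRate;                // 10.0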
Update for Swift 5
This is a simple function that helps get your audio file into an array of floats. It works for both mono and stereo audio. To get the second channel of stereo audio, just uncomment the samples2 line.
import AVFoundation
//..
do {
guard let url = Bundle.main.url(forResource: "audio_example", withExtension: "wav") else { return }
let file = try AVAudioFile(forReading: url)
if let format = AVAudioFormat(commonFormat: .pcmFormatFloat32, sampleRate: file.fileFormat.sampleRate, channels: file.fileFormat.channelCount, interleaved: false), let buf = AVAudioPCMBuffer(pcmFormat: format, frameCapacity: AVAudioFrameCount(file.length)) {
try file.read(into: buf)
guard let floatChannelData = buf.floatChannelData else { return }
let frameLength = Int(buf.frameLength)
let samples = Array(UnsafeBufferPointer(start:floatChannelData[0], count:frameLength))
// let samples2 = Array(UnsafeBufferPointer(start:floatChannelData[1], count:frameLength))
print("samples")
print(samples.count)
print(samples.prefix(10))
// print(samples2.prefix(10))
}
} catch {
print("Audio Error: \(error)")
}
