Recording wav file in compressed format in iOS

Recording wav file in compressed format in iOS - ios

Right now i am using AQRecorder for recording audio as .wav. My audio file description is as below;
mRecordFormat.mFormatID = kAudioFormatLinearPCM;
mRecordFormat.mFormatFlags = kLinearPCMFormatFlagIsSignedInteger | kLinearPCMFormatFlagIsPacked;
mRecordFormat.mSampleRate = 8000.0;
mRecordFormat.mBitsPerChannel = 16;
mRecordFormat.mChannelsPerFrame = 1;
mRecordFormat.mFramesPerPacket = 1;
mRecordFormat.mBytesPerPacket = 2;
mRecordFormat.mBytesPerFrame = 2;
I want to know that can i record data to .wav in compressed format if yes please let me know how can i do this.(I dont want to record file in .caf format)

Related

How to set AudioStreamBasicDescription properties?

I'm trying to play PCM stream data from server using AudioQueue.
PCM data format :
Sample rate = 48000, num of channel = 2, Bit per sample = 16
And, server is not streaming fixed bytes to client. (variable bytes.)
(ex : 30848, 128, 2764, ... bytes )
How to set ASBD ?
I don't know how to set mFramesPerPacket, mBytesPerFrame, mBytesPerPacket .
I have read Apple reference document, but there is no detailed descriptions.
Please give me any idea.
New added : Here, ASBD structure what I have setted. (language : Swift)
// Create ASBD structure & set properties.
var streamFormat = AudioStreamBasicDescription()
streamFormat.mSampleRate = 48000
streamFormat.mFormatID = kAudioFormatLinearPCM
streamFormat.mFormatFlags = kLinearPCMFormatFlagIsSignedInteger | kAudioFormatFlagIsPacked
streamFormat.mFramesPerPacket = 1
streamFormat.mChannelsPerFrame = 2
streamFormat.mBitsPerChannel = 16
streamFormat.mBytesPerFrame = (streamFormat.mBitsPerChannel / 8) * streamFormat.mChannelsPerFrame
streamFormat.mBytesPerPacket = streamFormat.mBytesPerFrame
streamFormat.mReserved = 0
// Create AudioQueue for playing PCM streaming data.
var err = AudioQueueNewOutput(&streamFormat, self.queueCallbackProc, nil, nil, nil, 0, &aq)
...
I have setted ASBD structure like the above.
AudioQueue play streamed PCM data very well for a few seconds,
but soon playing is stop. What can I do?
(still streaming, and queueing AudioQueue)
Please give me any idea.

ASBD is just a structure underneath defined like follows:
struct AudioStreamBasicDescription
{
Float64 mSampleRate;
AudioFormatID mFormatID;
AudioFormatFlags mFormatFlags;
UInt32 mBytesPerPacket;
UInt32 mFramesPerPacket;
UInt32 mBytesPerFrame;
UInt32 mChannelsPerFrame;
UInt32 mBitsPerChannel;
UInt32 mReserved;
};
typedef struct AudioStreamBasicDescription AudioStreamBasicDescription;
You may set the variables of a struct like this:
AudioStreamBasicDescription streamFormat;
streamFormat.mFormatID = kAudioFormatLinearPCM;
streamFormat.mFormatFlags = kAudioFormatFlagIsSignedInteger | kAudioFormatFlagsNativeEndian | kAudioFormatFlagIsPacked;
streamFormat.mSampleRate = sampleRate;
streamFormat.mBitsPerChannel = bitsPerChannel;
streamFormat.mChannelsPerFrame = channelsPerFrame;
streamFormat.mFramesPerPacket = 1;
int bytes = (bitsPerChannel / 8) * channelsPerFrame;
streamFormat.mBytesPerFrame = bytes;
streamFormat.mBytesPerPacket = bytes;

AudioStreamBasicDescription setting values for wav

I am trying to play a simple PCM file on iOS but couldn't wrap my head around AudioStreamBasicDescription and this link does not provide enough information.
I get this values from terminal
afinfo BlameItOnTheNight.wav
File: BlameItOnTheNight.wav
File type ID: WAVE
Num Tracks: 1
----
Data format: 2 ch, 44100 Hz, 'lpcm' (0x0000000C) 16-bit little-endian signed integer
no channel layout.
estimated duration: 9.938141 sec
audio bytes: 1753088
audio packets: 438272
bit rate: 1411200 bits per second
packet size upper bound: 4
maximum packet size: 4
audio data file offset: 44
optimized
source bit depth: I16
----
Then I choose values in code
- (void)setupAudioFormat:(AudioStreamBasicDescription*)format
{
format->mSampleRate = 44100.0;
format->mFormatID = kAudioFormatLinearPCM;
format->mFramesPerPacket = 1;
format->mChannelsPerFrame = 2;
format->mBytesPerFrame = format->mChannelsPerFrame * sizeof(Float32);
format->mBytesPerPacket = format->mFramesPerPacket * format->mBytesPerFrame;
format->mBitsPerChannel = sizeof(Float32) * 8;
format->mReserved = 0;
format->mFormatFlags = kAudioFormatFlagIsSignedInteger |
kAudioFormatFlagsNativeEndian |
kAudioFormatFlagIsPacked;
}
Audio plays really fast.
Whats the correct way the calculate this values based on actual audio file?

when I changed the values I was getting following error.
error for object 0x7fba72c50db8: incorrect checksum for freed object - object was probably modified after being freed.
*** set a breakpoint in malloc_error_break to debug
then finally I figured out that my AudioStreamBasicDescription bitsperchannel values was not correct also the buffer size was not enough.
So first I have changed the values to
- (void)setupAudioFormat:(AudioStreamBasicDescription*)format
{
format->mSampleRate = 44100.0;
format->mFormatID = kAudioFormatLinearPCM;
format->mFramesPerPacket = 1; //For uncompressed audio, the value is 1. For variable bit-rate formats, the value is a larger fixed number, such as 1024 for AAC
format->mChannelsPerFrame = 2;
format->mBytesPerFrame = format->mChannelsPerFrame * 2;
format->mBytesPerPacket = format->mFramesPerPacket * format->mBytesPerFrame;
format->mBitsPerChannel = 16;
format->mReserved = 0;
format->mFormatFlags = kAudioFormatFlagIsSignedInteger |
kAudioFormatFlagsNativeEndian |
kLinearPCMFormatFlagIsPacked;
}
then when I allocate buffer I increased the size
// Allocate and prime playback buffers
playState.playing = true;
for (int i = 0; i < NUM_BUFFERS && playState.playing; i++)
{
AudioQueueAllocateBuffer(playState.queue, 32000, &playState.buffers[i]);
AudioOutputCallback(&playState, playState.queue, playState.buffers[i]);
}
In my original code it was set to 8000, now changing it to 32000 solves the problem.

AudioConverterFillComplexBuffer '!stt' error

I have an Audio App that from time to time is need to encode audio data from PCM to AAC format. I'm using software decoder (Actually I don't care what encoder is used, but I've checked twice, and there's software decoder). I'm using (https://github.com/TheAmazingAudioEngine/TheAmazingAudioEngine library)
I setup Audio session category as kAudioSessionCategory_PlayAndRecord
Have these formats
// Output format
outputFormat.mFormatID = kAudioFormatMPEG4AAC;
outputFormat.mSampleRate = 44100;
outputFormat.mFormatFlags = kMPEG4Object_AAC_Scalable;
outputFormat.mChannelsPerFrame = 2;
outputFormat.mBitsPerChannel = 0;
outputFormat.mBytesPerFrame = 0;
outputFormat.mBytesPerPacket = 0;
outputFormat.mFramesPerPacket = 1024;
// Input format
audioDescription.mFormatID = kAudioFormatLinearPCM;
audioDescription.mFormatFlags = kAudioFormatFlagIsSignedInteger | kAudioFormatFlagIsPacked | kAudioFormatFlagsNativeEndian | kAudioFormatFlagIsNonInterleaved;
audioDescription.mChannelsPerFrame = 2;
audioDescription.mBytesPerPacket = sizeof(SInt16);
audioDescription.mFramesPerPacket = 1;
audioDescription.mBytesPerFrame = sizeof(SInt16);
audioDescription.mBitsPerChannel = 8 * sizeof(SInt16);
audioDescription.mSampleRate = 44100.0;
And All works perfectly with kAudioSessionProperty_OverrideCategoryMixWithOthers == YES
But when I setup kAudioSessionProperty_OverrideCategoryMixWithOthers with NO then:
iOS Simulator - OK
iPod (6.1.3) - OK
iPhone 4S (7.0.3) - Fail with ('!sst' error on `AudioConverterFillComplexBuffer`) call
iPad 3(7.0.3) - Fail
As I already said all works fine until I change audio session property kAudioSessionProperty_OverrideCategoryMixWithOthers to NO
So the question is:
What is this error about? (I didn't found any clues in Headers and documentation what does this error mean (There's also no '!sst' string in any public framework header))
How can I fix it?
If you have any other Ideas that you think I need to try - Feel free to comment in.

Audio Queue Converting sample rate iOS

ok so, noob to iOS. I am using the Audio Queue Buffer to record audio. The Linear PCM format defaults to 44100 Hz, 1 channel, 16bit, little endian. Is there a way I can force a format of 8000 hz, 1 channel, 32bit floating point, little endian?

You can specify the format you want at initialization:
AudioStreamBasicDescription asbd;
asbd.mSampleRate = 8000;
asbd.mFormatID = kAudioFormatLinearPCM;
asbd.mFormatFlags = kLinearPCMFormatFlagIsFloat;
asbd.mBytesPerPacket = sizeof(float);
asbd.mFramesPerPacket = 1;
asbd.mBytesPerFrame = sizeof(float);
asbd.mChannelsPerFrame = 1;
asbd.mBitsPerChannel = sizeof(float) * CHAR_BIT;
asbd.mReserved = 0;
OSStatus e = AudioQueueNewInput(&asbd, ...............

iOS: How to read an audio file into a float buffer

I have a really short audio file, say a 10th of a second in (say) .PCM format
I want to use RemoteIO to loop through the file repeatedly to produce a continuous musical tone. So how do I read this into an array of floats?
EDIT: while I could probably dig out the file format, extract the file into an NSData and process it manually, I'm guessing there is a more sensible generic approach... ( that eg copes with different formats )

You can use ExtAudioFile to read data from any supported data format in numerous client formats. Here is an example to read a file as 16-bit integers:
CFURLRef url = /* ... */;
ExtAudioFileRef eaf;
OSStatus err = ExtAudioFileOpenURL((CFURLRef)url, &eaf);
if(noErr != err)
/* handle error */
AudioStreamBasicDescription format;
format.mSampleRate = 44100;
format.mFormatID = kAudioFormatLinearPCM;
format.mFormatFlags = kAudioFormatFormatFlagIsPacked;
format.mBitsPerChannel = 16;
format.mChannelsPerFrame = 2;
format.mBytesPerFrame = format.mChannelsPerFrame * 2;
format.mFramesPerPacket = 1;
format.mBytesPerPacket = format.mFramesPerPacket * format.mBytesPerFrame;
err = ExtAudioFileSetProperty(eaf, kExtAudioFileProperty_ClientDataFormat, sizeof(format), &format);
/* Read the file contents using ExtAudioFileRead */
If you wanted Float32 data, you would set up format like this:
format.mFormatID = kAudioFormatLinearPCM;
format.mFormatFlags = kAudioFormatFlagsNativeFloatPacked;
format.mBitsPerChannel = 32;

This is the code I have used to convert my audio data (audio file ) into floating point representation and saved into an array.
-(void) PrintFloatDataFromAudioFile {
NSString * name = #"Filename"; //YOUR FILE NAME
NSString * source = [[NSBundle mainBundle] pathForResource:name ofType:#"m4a"]; // SPECIFY YOUR FILE FORMAT
const char *cString = [source cStringUsingEncoding:NSASCIIStringEncoding];
CFStringRef str = CFStringCreateWithCString(
NULL,
cString,
kCFStringEncodingMacRoman
);
CFURLRef inputFileURL = CFURLCreateWithFileSystemPath(
kCFAllocatorDefault,
str,
kCFURLPOSIXPathStyle,
false
);
ExtAudioFileRef fileRef;
ExtAudioFileOpenURL(inputFileURL, &fileRef);
AudioStreamBasicDescription audioFormat;
audioFormat.mSampleRate = 44100; // GIVE YOUR SAMPLING RATE
audioFormat.mFormatID = kAudioFormatLinearPCM;
audioFormat.mFormatFlags = kLinearPCMFormatFlagIsFloat;
audioFormat.mBitsPerChannel = sizeof(Float32) * 8;
audioFormat.mChannelsPerFrame = 1; // Mono
audioFormat.mBytesPerFrame = audioFormat.mChannelsPerFrame * sizeof(Float32); // == sizeof(Float32)
audioFormat.mFramesPerPacket = 1;
audioFormat.mBytesPerPacket = audioFormat.mFramesPerPacket * audioFormat.mBytesPerFrame; // = sizeof(Float32)
// 3) Apply audio format to the Extended Audio File
ExtAudioFileSetProperty(
fileRef,
kExtAudioFileProperty_ClientDataFormat,
sizeof (AudioStreamBasicDescription), //= audioFormat
&audioFormat);
int numSamples = 1024; //How many samples to read in at a time
UInt32 sizePerPacket = audioFormat.mBytesPerPacket; // = sizeof(Float32) = 32bytes
UInt32 packetsPerBuffer = numSamples;
UInt32 outputBufferSize = packetsPerBuffer * sizePerPacket;
// So the lvalue of outputBuffer is the memory location where we have reserved space
UInt8 *outputBuffer = (UInt8 *)malloc(sizeof(UInt8 *) * outputBufferSize);
AudioBufferList convertedData ;//= malloc(sizeof(convertedData));
convertedData.mNumberBuffers = 1; // Set this to 1 for mono
convertedData.mBuffers[0].mNumberChannels = audioFormat.mChannelsPerFrame; //also = 1
convertedData.mBuffers[0].mDataByteSize = outputBufferSize;
convertedData.mBuffers[0].mData = outputBuffer; //
UInt32 frameCount = numSamples;
float *samplesAsCArray;
int j =0;
double floatDataArray[882000] ; // SPECIFY YOUR DATA LIMIT MINE WAS 882000 , SHOULD BE EQUAL TO OR MORE THAN DATA LIMIT
while (frameCount > 0) {
ExtAudioFileRead(
fileRef,
&frameCount,
&convertedData
);
if (frameCount > 0) {
AudioBuffer audioBuffer = convertedData.mBuffers[0];
samplesAsCArray = (float *)audioBuffer.mData; // CAST YOUR mData INTO FLOAT
for (int i =0; i<1024 /*numSamples */; i++) { //YOU CAN PUT numSamples INTEAD OF 1024
floatDataArray[j] = (double)samplesAsCArray[i] ; //PUT YOUR DATA INTO FLOAT ARRAY
printf("\n%f",floatDataArray[j]); //PRINT YOUR ARRAY'S DATA IN FLOAT FORM RANGING -1 TO +1
j++;
}
}
}}

I'm not familiar with RemoteIO, but I am familiar with WAV's and thought I'd post some format information on them. If you need, you should be able to easily parse out information such as duration, bit rate, etc...
First, here is an excellent website detailing the WAVE PCM soundfile format. This site also does an excellent job illustrating what the different byte addresses inside the "fmt" sub-chunk refer to.
WAVE File format
A WAVE is composed of a "RIFF" chunk and subsequent sub-chunks
Every chunk is at least 8 bytes
First 4 bytes is the Chunk ID
Next 4 bytes is the Chunk Size (The Chunk Size gives the size of the remainder of the chunk excluding the 8 bytes used for the Chunk ID and Chunk Size)
Every WAVE has the following chunks / sub chunks
"RIFF" (first and only chunk. All the rest are technically sub-chunks.)
"fmt " (usually the first sub-chunk after "RIFF" but can be anywhere between "RIFF" and "data". This chunk has information about the WAV such as number of channels, sample rate, and byte rate)
"data" (must be the last sub-chunk and contains all the sound data)
Common WAVE Audio Formats:
PCM
IEEE_Float
PCM_EXTENSIBLE (with a sub format of PCM or IEEE_FLOAT)
WAVE Duration and Size
A WAVE File's duration can be calculated as follows:
seconds = DataChunkSize / ByteRate
Where
ByteRate = SampleRate * NumChannels * BitsPerSample/8
and DataChunkSize does not include the 8 bytes reserved for the ID and Size of the "data" sub-chunk.
Knowing this, the DataChunkSize can be calculated if you know the duration of the WAV and the ByteRate.
DataChunkSize = seconds * ByteRate
This can be useful for calculating the size of the wav data when converting from formats like mp3 or wma. Note that a typical wav header is 44 bytes followed by DataChunkSize (this is always the case if the wav was converted using the Normalizer tool - at least as of this writing).

Update for Swift 5
This is a simple function that helps get your audio file into an array of floats. This is for both mono and stereo audio, To get the second channel of stereo audio, just uncomment sample 2
import AVFoundation
//..
do {
guard let url = Bundle.main.url(forResource: "audio_example", withExtension: "wav") else { return }
let file = try AVAudioFile(forReading: url)
if let format = AVAudioFormat(commonFormat: .pcmFormatFloat32, sampleRate: file.fileFormat.sampleRate, channels: file.fileFormat.channelCount, interleaved: false), let buf = AVAudioPCMBuffer(pcmFormat: format, frameCapacity: AVAudioFrameCount(file.length)) {
try file.read(into: buf)
guard let floatChannelData = buf.floatChannelData else { return }
let frameLength = Int(buf.frameLength)
let samples = Array(UnsafeBufferPointer(start:floatChannelData[0], count:frameLength))
// let samples2 = Array(UnsafeBufferPointer(start:floatChannelData[1], count:frameLength))
print("samples")
print(samples.count)
print(samples.prefix(10))
// print(samples2.prefix(10))
}
} catch {
print("Audio Error: \(error)")
}

Develop Reference

ios ruby-on-rails asp.net-mvc docker delphi jenkins grails google-sheets machine-learning dart