I'm developing an application that should record a user's voice and stream it to a custom device via the MQTT protocol.
The audio specification for the custom device is: little-endian, unsigned, 16-bit LPCM at an 8 kHz sample rate. Packets should be 1000 bytes each.
I'm not familiar with AVAudioEngine, and I found this code sample which I believe fits my case:
func startRecord() {
audioEngine = AVAudioEngine()
let bus = 0
let inputNode = audioEngine.inputNode
let inputFormat = inputNode.outputFormat(forBus: bus)
var streamDescription = AudioStreamBasicDescription()
streamDescription.mFormatID = kAudioFormatLinearPCM.littleEndian
streamDescription.mSampleRate = 8000.0
streamDescription.mChannelsPerFrame = 1
streamDescription.mBitsPerChannel = 16
streamDescription.mBytesPerPacket = 1000
let outputFormat = AVAudioFormat(streamDescription: &streamDescription)!
guard let converter: AVAudioConverter = AVAudioConverter(from: inputFormat, to: outputFormat) else {
print("Can't convert in to this format")
return
}
inputNode.installTap(onBus: 0, bufferSize: 1024, format: inputFormat) { (buffer, time) in
print("Buffer format: \(buffer.format)")
var newBufferAvailable = true
let inputCallback: AVAudioConverterInputBlock = { inNumPackets, outStatus in
if newBufferAvailable {
outStatus.pointee = .haveData
newBufferAvailable = false
return buffer
} else {
outStatus.pointee = .noDataNow
return nil
}
}
let convertedBuffer = AVAudioPCMBuffer(pcmFormat: outputFormat, frameCapacity: AVAudioFrameCount(outputFormat.sampleRate) * buffer.frameLength / AVAudioFrameCount(buffer.format.sampleRate))!
var error: NSError?
let status = converter.convert(to: convertedBuffer, error: &error, withInputFrom: inputCallback)
assert(status != .error)
print("Converted buffer format:", convertedBuffer.format)
}
audioEngine.prepare()
do {
try audioEngine.start()
} catch {
print("Can't start the engine: \(error)")
}
}
But currently, the converter can't convert the input format to my output format and I don't understand why.
If I change my output format to something like this:
let outputFormat = AVAudioFormat(commonFormat: .pcmFormatInt16, sampleRate: 8000.0, channels: 1, interleaved: false)!
Then it works.
Your streamDescription is wrong: you haven't filled in all the fields, and mBytesPerPacket was wrong. An ASBD "packet" is not the same kind of packet your protocol calls for. For uncompressed audio (like LPCM), AudioStreamBasicDescription requires mFramesPerPacket to be 1, which makes mBytesPerPacket simply the size of one frame (2 bytes for 16-bit mono). If your protocol requires the samples to be sent in 1000-byte groups, you will have to do that chunking yourself.
Try this:
var streamDescription = AudioStreamBasicDescription()
streamDescription.mSampleRate = 8000.0
streamDescription.mFormatID = kAudioFormatLinearPCM
streamDescription.mFormatFlags = kAudioFormatFlagIsSignedInteger // no endian flag means little endian
streamDescription.mBytesPerPacket = 2
streamDescription.mFramesPerPacket = 1
streamDescription.mBytesPerFrame = 2
streamDescription.mChannelsPerFrame = 1
streamDescription.mBitsPerChannel = 16
streamDescription.mReserved = 0
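The 1000-byte grouping then happens outside Core Audio. Below is a minimal sketch of how that chunking could look. It assumes a hypothetical mqttClient.publish(_:) standing in for whatever MQTT client you use, plus a pendingData property that carries leftover bytes between tap callbacks; you would call enqueue(convertedBuffer) from the tap closure after converter.convert(...) succeeds.
var pendingData = Data()   // carries the remainder (< 1000 bytes) between tap callbacks

func enqueue(_ convertedBuffer: AVAudioPCMBuffer) {
    // Grab the raw little-endian Int16 bytes produced by the converter.
    let audioBuffer = convertedBuffer.audioBufferList.pointee.mBuffers
    guard let mData = audioBuffer.mData else { return }
    pendingData.append(Data(bytes: mData, count: Int(audioBuffer.mDataByteSize)))
    // Ship complete 1000-byte packets; keep whatever is left for the next callback.
    while pendingData.count >= 1000 {
        mqttClient.publish(pendingData.prefix(1000))   // hypothetical MQTT call
        pendingData.removeFirst(1000)
    }
}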
I'm trying to build a VoIP application on iOS with echo cancellation. As I understand it, I need to use Audio Units (the VoiceProcessingIO unit) for AEC. The main problem is how to use AVAudioConverter to encode the microphone data to Opus.
opusASBD = AudioStreamBasicDescription(mSampleRate: 48000.0,
mFormatID: kAudioFormatOpus,
mFormatFlags: 0,
mBytesPerPacket: 0,
mFramesPerPacket: 2880,
mBytesPerFrame: 0,
mChannelsPerFrame: 1,
mBitsPerChannel: 0,
mReserved: 0)
decoderOutputASBD = AudioStreamBasicDescription(mSampleRate: 48000.0,
mFormatID: kAudioFormatLinearPCM,
mFormatFlags: kLinearPCMFormatFlagIsSignedInteger | kAudioFormatFlagIsPacked | kAudioFormatFlagIsNonInterleaved,
mBytesPerPacket: 2,
mFramesPerPacket: 1,
mBytesPerFrame: 2,
mChannelsPerFrame: 1,
mBitsPerChannel: 16,
mReserved: 0)
self.converterSpeaker = AVAudioConverter(from: AVAudioFormat(streamDescription: &opusASBD)!,
to: AVAudioFormat(streamDescription: &decoderOutputASBD)!)
self.converterMic = AVAudioConverter(from: AVAudioFormat(streamDescription: &decoderOutputASBD)!,
to: AVAudioFormat(streamDescription: &opusASBD)!)
self.converterMic?.bitRate = 48000
var inDesc = AudioComponentDescription(componentType: kAudioUnitType_Output,
componentSubType: kAudioUnitSubType_VoiceProcessingIO,
componentManufacturer: kAudioUnitManufacturer_Apple,
componentFlags: 0,
componentFlagsMask: 0)
if let inputComponent = AudioComponentFindNext(nil, &inDesc) {
let status = AudioComponentInstanceNew(inputComponent, &self.audioUnit)
if status == noErr {
var flag = UInt32(1)
AudioUnitSetProperty(self.audioUnit,
kAudioOutputUnitProperty_EnableIO,
kAudioUnitScope_Input,
1,
&flag,
UInt32(MemoryLayout<UInt32>.size))
AudioUnitSetProperty(self.audioUnit,
kAudioOutputUnitProperty_EnableIO,
kAudioUnitScope_Output,
0,
&flag,
UInt32(MemoryLayout<UInt32>.size))
AudioUnitSetProperty(self.audioUnit,
kAUVoiceIOProperty_VoiceProcessingEnableAGC,
kAudioUnitScope_Global,
0,
&flag,
UInt32(MemoryLayout<UInt32>.size))
AudioUnitSetProperty(self.audioUnit,
kAudioUnitProperty_StreamFormat,
kAudioUnitScope_Input,
0,
&decoderOutputASBD,
UInt32(MemoryLayout<AudioStreamBasicDescription>.size))
AudioUnitSetProperty(self.audioUnit,
kAudioUnitProperty_StreamFormat,
kAudioUnitScope_Output,
1,
&decoderOutputASBD,
UInt32(MemoryLayout<AudioStreamBasicDescription>.size))
var iCallback = AURenderCallbackStruct(inputProc: inputCallback,
inputProcRefCon: UnsafeMutableRawPointer(Unmanaged.passUnretained(self).toOpaque()))
AudioUnitSetProperty(self.audioUnit,
kAudioOutputUnitProperty_SetInputCallback,
kAudioUnitScope_Global,
1,
&iCallback,
UInt32(MemoryLayout<AURenderCallbackStruct>.size))
var rCallback = AURenderCallbackStruct(inputProc: renderCallback,
inputProcRefCon: UnsafeMutableRawPointer(Unmanaged.passUnretained(self).toOpaque()))
AudioUnitSetProperty(self.audioUnit,
kAudioUnitProperty_SetRenderCallback,
kAudioUnitScope_Global,
0,
&rCallback,
UInt32(MemoryLayout<AURenderCallbackStruct>.size))
AudioUnitInitialize(self.audioUnit)
AudioOutputUnitStart(self.audioUnit)
}
I'm using a ring buffer for the audio data, from
https://github.com/michaeltyson/TPCircularBuffer
func inputCallback(_ inRefCon: UnsafeMutableRawPointer,
_ ioActionFlags: UnsafeMutablePointer<AudioUnitRenderActionFlags>,
_ inTimeStamp: UnsafePointer<AudioTimeStamp>,
_ inOutputBusNumber: UInt32,
_ inNumberFrames: UInt32,
_ ioData: UnsafeMutablePointer<AudioBufferList>?) -> OSStatus {
let wSelf: AudioUnits = Unmanaged.fromOpaque(inRefCon).takeUnretainedValue()
var buffers = AudioBufferList(mNumberBuffers: 1, mBuffers: AudioBuffer())
AudioUnitRender(wSelf.audioUnit,
ioActionFlags,
inTimeStamp,
inOutputBusNumber,
inNumberFrames,
&buffers)
TPCircularBufferCopyAudioBufferList(&wSelf.ringBufferMic,
&buffers,
inTimeStamp,
inNumberFrames,
&wSelf.decoderOutputASBD)
wSelf.handleMic(inNumberFrames, inTimeStamp: inTimeStamp.pointee)
return noErr
}
func renderCallback(_ inRefCon: UnsafeMutableRawPointer,
_ ioActionFlags: UnsafeMutablePointer<AudioUnitRenderActionFlags>,
_ inTimeStamp: UnsafePointer<AudioTimeStamp>,
_ inOutputBusNumber: UInt32,
_ inNumberFrames: UInt32,
_ ioData: UnsafeMutablePointer<AudioBufferList>?) -> OSStatus {
let wSelf: AudioUnits = Unmanaged.fromOpaque(inRefCon).takeUnretainedValue()
if let data = ioData {
let audioBufferList = UnsafeMutableAudioBufferListPointer(data)
if let buffer = audioBufferList.first {
buffer.mData?.assumingMemoryBound(to: Float32.self).assign(repeating: 0, count: Int(inNumberFrames))
}
var ioLengthInFrames = inNumberFrames
TPCircularBufferDequeueBufferListFrames(&wSelf.ringBufferSpeaker,
&ioLengthInFrames,
ioData!,
nil,
&wSelf.decoderOutputASBD)
}
return noErr
}
In the microphone handler I just encode to Opus, then decode and try to render the decoded audio data (for debugging). But my voice comes out corrupted.
func handleMic(_ frames: UInt32, inTimeStamp: AudioTimeStamp) {
var ioLengthInFrames = frames
var its = inTimeStamp
self.inputBufferMic = AVAudioPCMBuffer(pcmFormat: AVAudioFormat(streamDescription: &self.decoderOutputASBD)!,
frameCapacity: ioLengthInFrames)!
self.inputBufferMic.frameLength = self.inputBufferMic.frameCapacity
TPCircularBufferDequeueBufferListFrames(&self.ringBufferMic,
&ioLengthInFrames,
self.inputBufferMic.mutableAudioBufferList,
&its,
&self.decoderOutputASBD)
self.outputBufferMic = AVAudioCompressedBuffer(format: AVAudioFormat(streamDescription: &self.opusASBD)!,
packetCapacity: 1,
maximumPacketSize: 960)
var error: NSError?
self.converterMic?.convert(to: self.outputBufferMic,
error: &error,
withInputFrom: { [weak self] (packetCount, outputStatus) -> AVAudioBuffer? in
outputStatus.pointee = .haveData
return self?.inputBufferMic
})
if let e = error {
LoggerManager.sharedInstance.log("<AudioUnits>: OPUS encoding error:\n \(e)")
return
}
let mData = NSData(bytes: self.outputBufferMic.data,
length: Int(self.outputBufferMic.byteLength))
self.inputBufferSpeaker = AVAudioCompressedBuffer(format: AVAudioFormat(streamDescription: &self.opusASBD)!,
packetCapacity: 1,
maximumPacketSize: Int(AudioUnits.frameSize))
self.outputBufferSpeaker = AVAudioPCMBuffer(pcmFormat: AVAudioFormat(streamDescription: &self.decoderOutputASBD)!,
frameCapacity: AVAudioFrameCount(AudioUnits.frameSize))!
self.outputBufferSpeaker.frameLength = self.outputBufferSpeaker.frameCapacity
memcpy(self.inputBufferSpeaker.data, mData.bytes.bindMemory(to: UInt8.self, capacity: 1), mData.length)
self.inputBufferSpeaker.byteLength = UInt32(mData.length)
self.inputBufferSpeaker.packetCount = AVAudioPacketCount(1)
self.inputBufferSpeaker.packetDescriptions![0].mDataByteSize = self.inputBufferSpeaker.byteLength
self.converterSpeaker?.convert(to: self.outputBufferSpeaker,
error: &error,
withInputFrom: { [weak self] (packetCount, outputStatus) -> AVAudioBuffer? in
outputStatus.pointee = .haveData
return self?.inputBufferSpeaker
})
if let e = error {
LoggerManager.sharedInstance.log("<AudioUnits>: OPUS decoding error:\n \(e)")
return
}
TPCircularBufferCopyAudioBufferList(&self.ringBufferSpeaker,
&self.outputBufferSpeaker.mutableAudioBufferList.pointee,
nil,
AudioUnits.frameSize,
&self.decoderOutputASBD)
}
I am trying to read the frequency values from a CMSampleBuffer returned by captureOutput of AVCaptureAudioDataOutputSampleBufferDelegate.
The idea is to create an AVAudioPCMBuffer so that I can then read its floatChannelData. But I am not sure how to pass the buffer to it.
I guess I could create it with:
public func captureOutput(_ output: AVCaptureOutput,
didOutput sampleBuffer: CMSampleBuffer,
from connection: AVCaptureConnection) {
guard let blockBuffer = CMSampleBufferGetDataBuffer(sampleBuffer) else {
return
}
let length = CMBlockBufferGetDataLength(blockBuffer)
let audioFormat = AVAudioFormat(commonFormat: .pcmFormatFloat32, sampleRate: 44100, channels: 1, interleaved: false)
let pcmBuffer = AVAudioPCMBuffer(pcmFormat: audioFormat!, frameCapacity: AVAudioFrameCount(length))
pcmBuffer?.frameLength = pcmBuffer!.frameCapacity
But how could I fill its data?
Something along these lines should help:
var asbd = CMSampleBufferGetFormatDescription(sampleBuffer)!.audioStreamBasicDescription!
var audioBufferList = AudioBufferList()
var blockBuffer : CMBlockBuffer?
CMSampleBufferGetAudioBufferListWithRetainedBlockBuffer(
sampleBuffer,
bufferListSizeNeededOut: nil,
bufferListOut: &audioBufferList,
bufferListSize: MemoryLayout<AudioBufferList>.size,
blockBufferAllocator: nil,
blockBufferMemoryAllocator: nil,
flags: 0,
blockBufferOut: &blockBuffer
)
let mBuffers = audioBufferList.mBuffers
let frameLength = AVAudioFrameCount(Int(mBuffers.mDataByteSize) / MemoryLayout<Float>.size)
let pcmBuffer = AVAudioPCMBuffer(pcmFormat: AVAudioFormat(streamDescription: &asbd)!, frameCapacity: frameLength)!
pcmBuffer.frameLength = frameLength
pcmBuffer.mutableAudioBufferList.pointee.mBuffers = mBuffers
pcmBuffer.mutableAudioBufferList.pointee.mNumberBuffers = 1
This seems to create a valid AVAudioPCMBuffer inside the capture session. But it's at the wrong frame length for my use case right now, so I need to do some further buffering.
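Once the buffer is filled this way, reading floatChannelData (the original goal) is straightforward. A minimal sketch, assuming the capture format really is Float32 (floatChannelData returns nil for any other sample format, so check asbd first):
if let channelData = pcmBuffer.floatChannelData {
    // One pointer per channel; for mono capture only channelData[0] is meaningful.
    let samples = UnsafeBufferPointer(start: channelData[0], count: Int(pcmBuffer.frameLength))
    let peak = samples.map { abs($0) }.max() ?? 0
    print("frames: \(pcmBuffer.frameLength), peak amplitude: \(peak)")
}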
I have, for the past week, been trying to take audio from the microphone (on iOS), downsample it and write it to a '.aac' file.
I've finally gotten to the point where it's almost working:
let inputNode = audioEngine.inputNode
let inputFormat = inputNode.outputFormat(forBus: 0)
let bufferSize = UInt32(4096)
//let sampleRate = 44100.0
let sampleRate = 8000
let bitRate = sampleRate * 16
let fileUrl = url(appending: "NewRecording.aac")
print("Write to \(fileUrl)")
do {
outputFile = try AVAudioFile(forWriting: fileUrl,
settings: [
AVFormatIDKey: kAudioFormatMPEG4AAC,
AVSampleRateKey: sampleRate,
AVEncoderBitRateKey: bitRate,
AVEncoderAudioQualityKey: AVAudioQuality.high.rawValue,
AVNumberOfChannelsKey: 1],
commonFormat: .pcmFormatFloat32,
interleaved: false)
} catch let error {
print("Failed to create audio file for \(fileUrl): \(error)")
return
}
recordButton.setImage(RecordingStyleKit.imageOfMicrophone(fill: .red), for: [])
// Down sample the audio to 8kHz
let fmt = AVAudioFormat(commonFormat: .pcmFormatFloat32, sampleRate: Double(sampleRate), channels: 1, interleaved: false)!
let converter = AVAudioConverter(from: inputFormat, to: fmt)!
inputNode.installTap(onBus: 0, bufferSize: AVAudioFrameCount(bufferSize), format: inputFormat) { (buffer, time) in
let inputCallback: AVAudioConverterInputBlock = { inNumPackets, outStatus in
outStatus.pointee = AVAudioConverterInputStatus.haveData
return buffer
}
let convertedBuffer = AVAudioPCMBuffer(pcmFormat: fmt,
frameCapacity: AVAudioFrameCount(fmt.sampleRate) * buffer.frameLength / AVAudioFrameCount(buffer.format.sampleRate))!
var error: NSError? = nil
let status = converter.convert(to: convertedBuffer, error: &error, withInputFrom: inputCallback)
assert(status != .error)
if let outputFile = self.outputFile {
do {
try outputFile.write(from: convertedBuffer)
}
catch let error {
print("Write failed: \(error)")
}
}
}
audioEngine.prepare()
do {
try audioEngine.start()
}
catch {
print(error.localizedDescription)
}
The problem is that the resulting file MUST be in MPEG ADTS, AAC, v4 LC, 8 kHz, monaural format, but the code above only generates MPEG ADTS, AAC, v2 LC, 8 kHz, monaural.
That is, it MUST be v4, not v2 (I have no choice).
(This result is reported by running file {name} on the command line to dump its properties. I also use MediaInfo to provide additional information.)
I've been trying to figure out if there is some way to provide a hint or setting to AVAudioFile which will change the LC (Low Complexity) version from 2 to 4.
I've been scanning through the docs and examples but can't seem to find any suggestions.
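For context, the "v2" versus "v4" that file reports comes from the MPEG Version bit in each ADTS frame header (bit 3 of the second header byte: 1 = MPEG-2, 0 = MPEG-4), not from the LC profile itself. If AVAudioFile doesn't expose a setting for it, one possible workaround is to patch that bit after recording. This is only a hedged sketch, assuming the finished file is a plain sequence of ADTS frames:
// Flip every ADTS header from MPEG-2 to MPEG-4 signalling in place.
func patchADTSVersion(at url: URL) throws {
    var data = try Data(contentsOf: url)
    var offset = 0
    while offset + 7 <= data.count {
        // 12-bit sync word 0xFFF marks the start of each ADTS frame header.
        guard data[offset] == 0xFF, (data[offset + 1] & 0xF0) == 0xF0 else { break }
        data[offset + 1] &= ~UInt8(0x08)   // clear the ID bit -> MPEG-4
        // 13-bit frame length spans bytes 3..5 of the header.
        let frameLength = (Int(data[offset + 3] & 0x03) << 11) | (Int(data[offset + 4]) << 3) | (Int(data[offset + 5]) >> 5)
        guard frameLength > 0 else { break }
        offset += frameLength
    }
    try data.write(to: url)
}
Whether the receiving decoder accepts the result still depends on the device, so treat this as a last resort rather than the proper fix.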
I'm using AVAudioEngine and I'm trying to get it to output .pcmFormatInt16 at 16000 Hz, but I can't seem to get it to work. Here's what I'm doing:
let audioEngine = AVAudioEngine()
let mixer = AVAudioMixerNode()
let input = self.audioEngine.inputNode!
audioEngine.attach(mixer)
audioEngine.connect(input, to: mixer, format: input.outputFormat(forBus: 0))
let recordingFormat = AVAudioFormat(commonFormat: .pcmFormatInt16, sampleRate: 16000.0, channels: 1, interleaved: true)
mixer.installTap(onBus: 0, bufferSize: 2048, format: recordingFormat) { [weak self] (buffer, _) in
// buffer here is all 0's!
}
self.audioEngine.prepare()
try! self.audioEngine.start()
As noted in the comment above, when I access the buffer it is always all zeros: silence.
AVAudioEngine doesn't support changing the sample rate via the tap format.
You can use AVAudioConverter to change the sample rate, like this:
let inputFormat = input.outputFormat(forBus: 0)
let recordingFormat = AVAudioFormat(commonFormat: .pcmFormatInt16, sampleRate: 16000.0, channels: 1, interleaved: true)!
converter = AVAudioConverter(from: inputFormat, to: recordingFormat)
mixer.installTap(onBus: 0, bufferSize: 2048, format: inputFormat) { [weak self] (buffer, _) in
// Size the output buffer for the 16 kHz rate, then pull the tap buffer through the converter.
let convertedBuffer = AVAudioPCMBuffer(pcmFormat: recordingFormat, frameCapacity: AVAudioFrameCount(recordingFormat.sampleRate) * buffer.frameLength / AVAudioFrameCount(buffer.format.sampleRate))!
var error: NSError?
self?.converter?.convert(to: convertedBuffer, error: &error, withInputFrom: { _, outStatus in
outStatus.pointee = .haveData
return buffer
})
}
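Inside the tap closure, after the convert call returns, the 16 kHz samples live in convertedBuffer rather than in the tap's buffer, so read them from there. A small hedged example, assuming mono Int16 output as above:
if let samples = convertedBuffer.int16ChannelData?[0] {
    let frameCount = Int(convertedBuffer.frameLength)
    // Forward the 16 kHz Int16 samples to whatever consumes them (file, network, recognizer, ...).
    let data = Data(bytes: samples, count: frameCount * MemoryLayout<Int16>.size)
    print("got \(frameCount) frames, \(data.count) bytes")
}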