I'm developing an application that should record a user's voice and stream it to a custom device via the MQTT protocol.
The audio specification for the custom device: little-endian, unsigned, 16-bit LPCM at an 8 kHz sample rate. Packets should be 1000 bytes each.
I'm not familiar with AVAudioEngine, and I found this code sample, which I believe fits my case:
func startRecord() {
audioEngine = AVAudioEngine()
let bus = 0
let inputNode = audioEngine.inputNode
let inputFormat = inputNode.outputFormat(forBus: bus)
var streamDescription = AudioStreamBasicDescription()
streamDescription.mFormatID = kAudioFormatLinearPCM.littleEndian
streamDescription.mSampleRate = 8000.0
streamDescription.mChannelsPerFrame = 1
streamDescription.mBitsPerChannel = 16
streamDescription.mBytesPerPacket = 1000
let outputFormat = AVAudioFormat(streamDescription: &streamDescription)!
guard let converter: AVAudioConverter = AVAudioConverter(from: inputFormat, to: outputFormat) else {
print("Can't convert in to this format")
return
}
inputNode.installTap(onBus: 0, bufferSize: 1024, format: inputFormat) { (buffer, time) in
print("Buffer format: \(buffer.format)")
var newBufferAvailable = true
let inputCallback: AVAudioConverterInputBlock = { inNumPackets, outStatus in
if newBufferAvailable {
outStatus.pointee = .haveData
newBufferAvailable = false
return buffer
} else {
outStatus.pointee = .noDataNow
return nil
}
}
let convertedBuffer = AVAudioPCMBuffer(pcmFormat: outputFormat, frameCapacity: AVAudioFrameCount(outputFormat.sampleRate) * buffer.frameLength / AVAudioFrameCount(buffer.format.sampleRate))!
var error: NSError?
let status = converter.convert(to: convertedBuffer, error: &error, withInputFrom: inputCallback)
assert(status != .error)
print("Converted buffer format:", convertedBuffer.format)
}
audioEngine.prepare()
do {
try audioEngine.start()
} catch {
print("Can't start the engine: \(error)")
}
}
But currently, the converter can't convert the input format to my output format and I don't understand why.
If I change my output format to something like this:
let outputFormat = AVAudioFormat(commonFormat: .pcmFormatInt16, sampleRate: 8000.0, channels: 1, interleaved: false)!
Then it works.
Your streamDescription is wrong: you hadn't filled in all the fields, and mBytesPerPacket was wrong. It is not the same kind of packet your protocol calls for. For uncompressed audio (like LPCM), AudioStreamBasicDescription requires mFramesPerPacket to be 1, so mBytesPerPacket equals mBytesPerFrame (2 for 16-bit mono). If your protocol requires the data to be sent in groups of 1000 bytes, then you will have to do that grouping yourself.
Try this
var streamDescription = AudioStreamBasicDescription()
streamDescription.mSampleRate = 8000.0
streamDescription.mFormatID = kAudioFormatLinearPCM
streamDescription.mFormatFlags = kAudioFormatFlagIsSignedInteger // no endian flag means little endian
streamDescription.mBytesPerPacket = 2
streamDescription.mFramesPerPacket = 1
streamDescription.mBytesPerFrame = 2
streamDescription.mChannelsPerFrame = 1
streamDescription.mBitsPerChannel = 16
streamDescription.mReserved = 0
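The 1000-byte grouping happens outside Core Audio, on the bytes that come out of the converter. Here is a minimal sketch of that step, assuming the converted samples are available as signed Int16 values and that the device's "unsigned" format is offset binary (the signed value plus 32768). The PacketAssembler type and that interpretation of "unsigned" are assumptions for illustration, not something the question confirms:

```swift
import Foundation

/// Hypothetical helper: collects sample bytes and emits fixed-size packets.
struct PacketAssembler {
    let packetSize: Int
    private var pending: [UInt8] = []

    init(packetSize: Int = 1000) {
        self.packetSize = packetSize
    }

    /// Converts signed 16-bit samples to unsigned little-endian bytes
    /// (assumed offset binary) and returns every packet that is now full.
    /// Bytes that do not fill a packet stay queued for the next call.
    mutating func append(_ samples: [Int16]) -> [Data] {
        for sample in samples {
            // Flipping the sign bit maps -32768...32767 onto 0...65535.
            let unsigned = UInt16(bitPattern: sample) ^ 0x8000
            pending.append(UInt8(unsigned & 0x00FF)) // low byte first
            pending.append(UInt8(unsigned >> 8))     // then high byte
        }
        var packets: [Data] = []
        while pending.count >= packetSize {
            packets.append(Data(pending[0..<packetSize]))
            pending.removeFirst(packetSize)
        }
        return packets
    }
}
```

Each returned Data value would then be published as one MQTT message. If the device actually expects the signed bit pattern, drop the XOR.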
Related
I am trying PCM (microphone input)->G711 and am having trouble getting it to work.
With the current implementation, the conversion itself succeeds, but it is as if there is no data stored in the buffer.
I have been doing a lot of research, but I can't seem to find the cause of the problem, so I would appreciate your advice.
var audioEngine = AVAudioEngine()
var convertFormatBasicDescription = AudioStreamBasicDescription(
mSampleRate: 8000,
mFormatID: kAudioFormatULaw,
mFormatFlags: AudioFormatFlags(kAudioFormatULaw),
mBytesPerPacket: 1,
mFramesPerPacket: 1,
mBytesPerFrame: 1,
mChannelsPerFrame: 1,
mBitsPerChannel: 8,
mReserved: 0
)
func convert () {
let inputNode = audioEngine.inputNode
let micInputFormat = AVAudioFormat(commonFormat: .pcmFormatFloat32, sampleRate: 44100, channels: 1, interleaved: true)!
let convertFormat = AVAudioFormat(streamDescription: &convertFormatBasicDescription)!
let converter = AVAudioConverter(from: micInputFormat, to: convertFormat)!
inputNode.installTap(onBus: 0, bufferSize: 1024, format: micInputFormat) { [weak self] (buffer, time) in
var newBufferAvailable = true
let inputCallback: AVAudioConverterInputBlock = { inNumPackets, outStatus in
if newBufferAvailable {
outStatus.pointee = .haveData
newBufferAvailable = false
return buffer
} else {
outStatus.pointee = .noDataNow
return nil
}
}
let convertedBuffer = AVAudioPCMBuffer(pcmFormat: convertFormat, frameCapacity: AVAudioFrameCount(convertFormat.sampleRate) * buffer.frameLength / AVAudioFrameCount(buffer.format.sampleRate))!
convertedBuffer.frameLength = convertedBuffer.frameCapacity
var error: NSError?
let status = converter.convert(to: convertedBuffer, error: &error, withInputFrom: inputCallback)
print(convertedBuffer.floatChannelData) // prints nil
}
audioEngine.prepare()
try audioEngine.start()
}
Thank you.
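What I needed in order to read the data converted to G711 was audioBufferList.pointee.mBuffers.mData, not floatChannelData. floatChannelData is nil whenever the buffer's format is not 32-bit float, which is the case for the 8-bit µ-law output here.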
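For reference, reading those bytes could look roughly like this. It is a sketch rather than the poster's actual code: rawBytes(of:) is a hypothetical helper name, and it only looks at the first AudioBuffer, which is enough for mono or interleaved data:

```swift
import AVFoundation

/// Copies the bytes of the buffer's first AudioBuffer into a Data value.
/// Returns empty Data if the buffer has no backing memory.
func rawBytes(of buffer: AVAudioBuffer) -> Data {
    let audioBuffer = buffer.audioBufferList.pointee.mBuffers
    guard let bytes = audioBuffer.mData else { return Data() }
    // mDataByteSize is the number of valid bytes, not the allocated capacity.
    return Data(bytes: bytes, count: Int(audioBuffer.mDataByteSize))
}
```

In the tap above this would be called on convertedBuffer after converter.convert(...) returns.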
I'm trying to use AVAudioEngine instead of AVAudioPlayer because I need to do some per-packet processing as the audio is playing, but before I can get that far, I need to convert the 16-bit 8 kHz mono audio data to stereo so that AVAudioEngine will play it. This is my (incomplete) attempt. I'm currently stuck on how to make AVAudioConverter do the mono-to-stereo conversion. If I don't use AVAudioConverter, the iOS runtime complains that the input format doesn't match the output format. If I do use it (as below), the runtime doesn't complain, but the audio does not play back properly (likely because I'm not doing the mono-to-stereo conversion correctly). Any assistance is appreciated!
private func loadAudioData(audioData: Data?) {
// Load audio data into player
guard let audio = audioData else {return}
do {
let inputAudioFormat = AVAudioFormat(commonFormat: .pcmFormatInt16, sampleRate: Double(sampleRate), channels: 1, interleaved: false)
let outputAudioFormat = self.audioEngine.mainMixerNode.outputFormat(forBus: 0)
if inputAudioFormat != nil {
let inputStreamDescription = inputAudioFormat?.streamDescription.pointee
let outputStreamDescription = outputAudioFormat.streamDescription.pointee
let count = UInt32(audio.count)
if inputStreamDescription != nil && count > 0 {
if let ibpf = inputStreamDescription?.mBytesPerFrame {
let inputFrameCapacity = count / ibpf
let outputFrameCapacity = count / outputStreamDescription.mBytesPerFrame
self.pcmInputBuffer = AVAudioPCMBuffer(pcmFormat: inputAudioFormat!, frameCapacity: inputFrameCapacity)
self.pcmOutputBuffer = AVAudioPCMBuffer(pcmFormat: outputAudioFormat, frameCapacity: outputFrameCapacity)
if let input = self.pcmInputBuffer, let output = self.pcmOutputBuffer {
self.pcmConverter = AVAudioConverter(from: inputAudioFormat!, to: outputAudioFormat)
input.frameLength = input.frameCapacity
let b = UnsafeMutableBufferPointer(start: input.int16ChannelData?[0], count: input.stride * Int(inputFrameCapacity))
let bytesCopied = audio.copyBytes(to: b)
assert(bytesCopied == count)
audioEngine.attach(playerNode)
audioEngine.connect(playerNode, to: audioEngine.mainMixerNode, format: nil)
self.pcmConverter?.convert(to: output, error: nil) { packets, status in
status.pointee = .haveData
return self.pcmInputBuffer // I know this is wrong, but i'm not sure how to do it correctly
}
try audioEngine.start()
}
}
}
}
}
}
Speculative, incorrect answer
How about pcmConverter?.channelMap = [0, 0]?
Actual answer
You don't need to use the audio converter channel map, because mono-to-stereo AVAudioConverters seem to duplicate the mono channel by default. The main problems were that outputFrameCapacity was wrong and that you read the mainMixerNode's outputFormat before calling audioEngine.prepare() or starting the engine.
Assuming sampleRate = 8000, an amended solution looks like this:
private func loadAudioData(audioData: Data?) throws {
    // Load audio data into player
    guard let audio = audioData else {return}
    do {
        audioEngine.attach(playerNode)
        audioEngine.connect(playerNode, to: audioEngine.mainMixerNode, format: nil)
        audioEngine.prepare() // https://stackoverflow.com/a/70392017/22147
        let outputAudioFormat = self.audioEngine.mainMixerNode.outputFormat(forBus: 0)
        guard let inputAudioFormat = AVAudioFormat(commonFormat: .pcmFormatInt16, sampleRate: Double(sampleRate), channels: 1, interleaved: false) else { return }
        let inputStreamDescription = inputAudioFormat.streamDescription.pointee
        let outputStreamDescription = outputAudioFormat.streamDescription.pointee
        let count = UInt32(audio.count)
        if count > 0 {
            let ibpf = inputStreamDescription.mBytesPerFrame
            let inputFrameCapacity = count / ibpf
            let outputFrameCapacity = Float64(inputFrameCapacity) * outputStreamDescription.mSampleRate / inputStreamDescription.mSampleRate
            self.pcmInputBuffer = AVAudioPCMBuffer(pcmFormat: inputAudioFormat, frameCapacity: inputFrameCapacity)
            self.pcmOutputBuffer = AVAudioPCMBuffer(pcmFormat: outputAudioFormat, frameCapacity: AVAudioFrameCount(outputFrameCapacity))
            if let input = self.pcmInputBuffer, let output = self.pcmOutputBuffer {
                self.pcmConverter = AVAudioConverter(from: inputAudioFormat, to: outputAudioFormat)
                input.frameLength = input.frameCapacity
                let b = UnsafeMutableBufferPointer(start: input.int16ChannelData?[0], count: input.stride * Int(inputFrameCapacity))
                let bytesCopied = audio.copyBytes(to: b)
                assert(bytesCopied == count)
                self.pcmConverter?.convert(to: output, error: nil) { packets, status in
                    status.pointee = .haveData
                    return self.pcmInputBuffer // I know this is wrong, but i'm not sure how to do it correctly
                }
                try audioEngine.start()
                self.playerNode.scheduleBuffer(output, completionHandler: nil)
                self.playerNode.play()
            }
        }
    }
}
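The capacity fix comes down to scaling the input frame count by the ratio of the two sample rates, instead of dividing the input byte count by the output's bytes per frame. A small sketch of that arithmetic on its own, with a hypothetical function name and rounding up so the last partial frame still fits:

```swift
import Foundation

/// Number of frames needed to hold `inputFrames` after resampling
/// from `inputRate` to `outputRate`, rounded up.
func outputFrameCapacity(inputFrames: UInt32,
                         inputRate: Double,
                         outputRate: Double) -> UInt32 {
    precondition(inputRate > 0 && outputRate > 0, "sample rates must be positive")
    let scaled = Double(inputFrames) * outputRate / inputRate
    return UInt32(scaled.rounded(.up))
}
```

For example, one second of 8 kHz input (8000 frames) needs 48000 output frames if the mixer runs at 48 kHz. The mixer's actual rate depends on the device, so it should be read from outputAudioFormat rather than hard-coded.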
I have, for the past week, been trying to take audio from the microphone (on iOS), down sample it and write that to a '.aac' file.
I've finally gotten to the point where it's almost working:
let inputNode = audioEngine.inputNode
let inputFormat = inputNode.outputFormat(forBus: 0)
let bufferSize = UInt32(4096)
//let sampleRate = 44100.0
let sampleRate = 8000
let bitRate = sampleRate * 16
let fileUrl = url(appending: "NewRecording.aac")
print("Write to \(fileUrl)")
do {
outputFile = try AVAudioFile(forWriting: fileUrl,
settings: [
AVFormatIDKey: kAudioFormatMPEG4AAC,
AVSampleRateKey: sampleRate,
AVEncoderBitRateKey: bitRate,
AVEncoderAudioQualityKey: AVAudioQuality.high.rawValue,
AVNumberOfChannelsKey: 1],
commonFormat: .pcmFormatFloat32,
interleaved: false)
} catch let error {
print("Failed to create audio file for \(fileUrl): \(error)")
return
}
recordButton.setImage(RecordingStyleKit.imageOfMicrophone(fill: .red), for: [])
// Down sample the audio to 8kHz
let fmt = AVAudioFormat(commonFormat: .pcmFormatFloat32, sampleRate: Double(sampleRate), channels: 1, interleaved: false)!
let converter = AVAudioConverter(from: inputFormat, to: fmt)!
inputNode.installTap(onBus: 0, bufferSize: AVAudioFrameCount(bufferSize), format: inputFormat) { (buffer, time) in
let inputCallback: AVAudioConverterInputBlock = { inNumPackets, outStatus in
outStatus.pointee = AVAudioConverterInputStatus.haveData
return buffer
}
let convertedBuffer = AVAudioPCMBuffer(pcmFormat: fmt,
frameCapacity: AVAudioFrameCount(fmt.sampleRate) * buffer.frameLength / AVAudioFrameCount(buffer.format.sampleRate))!
var error: NSError? = nil
let status = converter.convert(to: convertedBuffer, error: &error, withInputFrom: inputCallback)
assert(status != .error)
if let outputFile = self.outputFile {
do {
try outputFile.write(from: convertedBuffer)
}
catch let error {
print("Write failed: \(error)")
}
}
}
audioEngine.prepare()
do {
try audioEngine.start()
}
catch {
print(error.localizedDescription)
}
The problem is, the resulting file MUST be in MPEG ADTS, AAC, v4 LC, 8 kHz, monaural format, but the code above only generates MPEG ADTS, AAC, v2 LC, 8 kHz, monaural.
That is, it MUST be v4, not v2 (I have no choice).
(This result is generated by using file {name} on the command line to dump its properties. I also use MediaInfo to provide additional information.)
I've been trying to figure out whether there is some way to provide a hint or setting to AVAudioFile which will change the LC (Low Complexity) version from 2 to 4.
I've been scanning through the docs and examples but can't seem to find any suggestions
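For checking what a given file actually contains, the fixed part of the ADTS frame header can be decoded by hand. As far as I know, the v2/v4 that the file command reports comes from the one-bit ID field right after the 12-bit syncword (0 for MPEG-4, 1 for MPEG-2). The sketch below only inspects the first four header bytes; it does not change how AVAudioFile writes them, and the type and function names are made up for illustration:

```swift
import Foundation

/// Fields read from the fixed part of an ADTS frame header.
struct ADTSHeaderInfo: Equatable {
    let mpegVersion: Int          // 4 or 2, from the one-bit ID field
    let audioObjectType: Int      // 2 means AAC LC
    let sampleRate: Int?          // nil for reserved frequency indices
    let channelConfiguration: Int // 1 means mono
}

func parseADTSHeader(_ bytes: [UInt8]) -> ADTSHeaderInfo? {
    // Syncword 0xFFF (12 bits), ID (1 bit), layer (2 bits, always 0),
    // protection_absent (1 bit). The mask ignores ID and protection_absent.
    guard bytes.count >= 4, bytes[0] == 0xFF, (bytes[1] & 0xF6) == 0xF0 else {
        return nil
    }
    let sampleRates = [96000, 88200, 64000, 48000, 44100, 32000, 24000,
                       22050, 16000, 12000, 11025, 8000, 7350]
    let idBit = (bytes[1] >> 3) & 0x01
    let profile = Int(bytes[2] >> 6)              // object type minus 1
    let rateIndex = Int((bytes[2] >> 2) & 0x0F)
    // Channel configuration spans the last bit of byte 2 and the top two of byte 3.
    let channels = (Int(bytes[2] & 0x01) << 2) | Int(bytes[3] >> 6)
    return ADTSHeaderInfo(
        mpegVersion: idBit == 0 ? 4 : 2,
        audioObjectType: profile + 1,
        sampleRate: rateIndex < sampleRates.count ? sampleRates[rateIndex] : nil,
        channelConfiguration: channels
    )
}
```

Feeding it the first bytes of NewRecording.aac shows which variant was written, independently of file and MediaInfo.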
I can't find anywhere how to limit the output buffer size of AVAudioEngine or of a mixer node. I found this in a raywenderlich tutorial, but they say the buffer size is not guaranteed:
"installTap(onBus: 0, bufferSize: 1024, format: format) gives you access to the audio data on the mainMixerNode’s output bus. You request a buffer size of 1024 bytes, but the requested size isn’t guaranteed, especially if you request a buffer that’s too small or large. Apple’s documentation doesn’t specify what those limits are."
https://www.raywenderlich.com/5154-avaudioengine-tutorial-for-ios-getting-started
I already tried installTap and a SetCurrentIOBufferFrameSize function (which returns an OSStatus), but neither of them limits the buffer size.
func SetCurrentIOBufferFrameSize(inAUHAL: AudioUnit,inIOBufferFrameSize: UInt32) -> OSStatus {
var inIOBufferFrameSize = inIOBufferFrameSize
var propSize = UInt32(MemoryLayout<UInt32>.size)
return AudioUnitSetProperty(inAUHAL, AudioUnitPropertyID(kAudioUnitProperty_ScheduledFileBufferSizeFrames), kAudioUnitScope_Global, 0, &inIOBufferFrameSize, propSize)
}
func initalizeEngine() {
sampleRateConversionRatio = Float(44100 / SampleRate)
engine = AVAudioEngine()
SetCurrentIOBufferFrameSize(inAUHAL: engine.outputNode.audioUnit!, inIOBufferFrameSize: 15)
do {
try AVAudioSession.sharedInstance().setCategory(.playAndRecord , mode: .default , options: .defaultToSpeaker)
try AVAudioSession.sharedInstance().setPreferredIOBufferDuration(ioBufferDuration)
try AVAudioSession.sharedInstance().setPreferredSampleRate(Double(SampleRate))
try AVAudioSession.sharedInstance().setPreferredInputNumberOfChannels(channelCount)
} catch {
assertionFailure("AVAudioSession setup error: \(error)")
}
}
func startRecording() {
downMixer.installTap(onBus: 0, bufferSize: bufferSize, format: format) { buffer, when in
self.serialQueue.async {
let pcmBuffer = AVAudioPCMBuffer(pcmFormat: self.format16KHzMono, frameCapacity: AVAudioFrameCount(Float(buffer.frameCapacity)/self.sampleRateConversionRatio))
var error: NSError? = nil
let inputBlock: AVAudioConverterInputBlock = {inNumPackets, outStatus in
outStatus.pointee = AVAudioConverterInputStatus.haveData
return buffer
}
self.formatConverter.convert(to: pcmBuffer!, error: &error, withInputFrom: inputBlock)
if error != nil {
print(error!.localizedDescription)
}
else if let channelData = pcmBuffer!.int16ChannelData {
let channelDataPointer = channelData.pointee
let channelData = stride(from: 0,
to: Int(pcmBuffer!.frameLength),
by: buffer.stride).map{ channelDataPointer[$0] }
//Return channelDataValueArray
let data = Data(fromArray: channelData)
var byteArray = data.toByteArray()
}
}
}
}
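One detail that is easy to get wrong in the setup above: setPreferredIOBufferDuration takes seconds, not frames, so a target frame count has to be divided by the sample rate first. A minimal sketch of that conversion, with a hypothetical function name; the session treats the value as a preference, so the buffers that actually arrive may still differ in size:

```swift
import Foundation

/// Duration in seconds that corresponds to `frames` at `sampleRate`.
func ioBufferDuration(frames: Int, sampleRate: Double) -> TimeInterval {
    precondition(sampleRate > 0, "sample rate must be positive")
    return Double(frames) / sampleRate
}
```

For 1024 frames at 44.1 kHz this comes out at roughly 23 ms.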
I am recording sound through the audio engine into a file named my_file.caf, and I am trying to create another file with its phase inverted so that I can cancel the voice in mono.
But when I run the operations and calculations below, it reverses the sine wave but also reverses the sound.
do {
let inFile: AVAudioFile = try AVAudioFile(forReading: URLFor(filename: "my_file.caf")!)
let format: AVAudioFormat = inFile.processingFormat
let frameCount: AVAudioFrameCount = UInt32(inFile.length)
let outSettings = [AVNumberOfChannelsKey: format.channelCount,
AVSampleRateKey: format.sampleRate,
AVLinearPCMBitDepthKey: 16,
AVFormatIDKey: kAudioFormatMPEG4AAC] as [String : Any]
let outFile: AVAudioFile = try AVAudioFile(forWriting: URLFor(filename: "my_file1.caf")!, settings: outSettings)
let forwardBuffer: AVAudioPCMBuffer = AVAudioPCMBuffer(pcmFormat: format, frameCapacity: frameCount)!
let reverseBuffer: AVAudioPCMBuffer = AVAudioPCMBuffer(pcmFormat: format, frameCapacity: frameCount)!
try inFile.read(into: forwardBuffer)
let frameLength = forwardBuffer.frameLength
reverseBuffer.frameLength = frameLength
let audioStride = forwardBuffer.stride
for channelIdx in 0..<forwardBuffer.format.channelCount {
let forwardChannelData = forwardBuffer.floatChannelData?.advanced(by: Int(channelIdx)).pointee
let reverseChannelData = reverseBuffer.floatChannelData?.advanced(by: Int(channelIdx)).pointee
var reverseIdx: Int = 0
for idx in stride(from: frameLength, to: 0, by: -1) {
memcpy(reverseChannelData?.advanced(by: reverseIdx * audioStride), forwardChannelData?.advanced(by: Int(idx) * audioStride), MemoryLayout<Float>.size)
reverseIdx += 1
}
}
try outFile.write(from: reverseBuffer)
} catch let error {
print(error.localizedDescription)
}
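For what it's worth, inverting the phase for cancellation normally means flipping the sign of every sample while keeping the order, whereas the loop above copies samples back to front, which reverses the audio in time. A minimal sketch of the difference on plain Float arrays (the function names are hypothetical, and this leaves out the AVAudioPCMBuffer plumbing):

```swift
import Foundation

/// Flips the sign of every sample; the order in time is unchanged.
func invertPolarity(_ samples: [Float]) -> [Float] {
    return samples.map { -$0 }
}

/// Plays the samples back to front; the sign of each sample is unchanged.
func reversedInTime(_ samples: [Float]) -> [Float] {
    return Array(samples.reversed())
}

/// Sums two equally long signals sample by sample.
func mix(_ a: [Float], _ b: [Float]) -> [Float] {
    return zip(a, b).map { $0 + $1 }
}
```

Mixing a signal with invertPolarity of itself gives silence, which is the cancellation effect; mixing it with reversedInTime generally does not. In the buffer code, that would correspond to writing the negated sample at the same index instead of copying from the mirrored index.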