AVAudioFile.write(from:) fails when buffer contains interleaved audio

I'm trying to write out an audio file after doing some processing, and am getting an error. I've reduced the error to this simple standalone case:
import Foundation
import AVFoundation
do {
    let inputFileURL = URL(fileURLWithPath: "/Users/andrewmadsen/Desktop/test.m4a")
    let file = try AVAudioFile(forReading: inputFileURL, commonFormat: .pcmFormatFloat32, interleaved: true)
    guard let buffer = AVAudioPCMBuffer(pcmFormat: file.processingFormat, frameCapacity: AVAudioFrameCount(file.length)) else {
        throw NSError()
    }
    buffer.frameLength = buffer.frameCapacity
    try file.read(into: buffer)
    let tempURL =
        URL(fileURLWithPath: NSTemporaryDirectory())
            .appendingPathComponent("com.openreelsoftware.AudioWriteTest")
            .appendingPathComponent(UUID().uuidString)
            .appendingPathExtension("caf")
    let fm = FileManager.default
    let dirURL = tempURL.deletingLastPathComponent()
    if !fm.fileExists(atPath: dirURL.path, isDirectory: nil) {
        try fm.createDirectory(at: dirURL, withIntermediateDirectories: true, attributes: nil)
    }
    var settings = buffer.format.settings
    settings[AVAudioFileTypeKey] = kAudioFileCAFType
    let tempFile = try AVAudioFile(forWriting: tempURL, settings: settings)
    try tempFile.write(from: buffer)
} catch {
    print(error)
}
When this code runs, the tempFile.write(from: buffer) call throws an error:
Error Domain=com.apple.coreaudio.avfaudio Code=-50 "(null)" UserInfo={failed call=ExtAudioFileWrite(_imp->_extAudioFile, buffer.frameLength, buffer.audioBufferList)}
test.m4a is a stereo, 44.1 kHz AAC file (from the iTunes store), though the failure occurs with other stereo files in other formats (AIFF and WAV) as well.
The code does not fail, and instead correctly saves the original audio out to a new file if I change the interleaved parameter to false when creating the original input AVAudioFile (file). However, in this case, the following message is logged to the console:
Audio files cannot be non-interleaved. Ignoring setting AVLinearPCMIsNonInterleaved YES.
It seems strange and confusing that writing a non-interleaved buffer works fine, despite a message saying that files must be interleaved, while writing an interleaved buffer fails. This is the opposite of what I expected.
I'm aware that reading a file using the plain AVAudioFile(forReading:) initializer without specifying a format defaults to non-interleaved (i.e., the "standard" AVAudioFormat at the file's actual sample rate and channel count). Does this mean that I really do have to convert interleaved audio to non-interleaved before trying to write it?
Notably, in the actual program where this problem came up, I'm doing something much more complex than simply reading a file in and writing it back out again, and I do need to handle interleaved audio. I have confirmed however that that original, more complex code is also failing only for interleaved stereo audio.
Is there something tricky I need to do to get AVAudioFile to write out a buffer containing interleaved PCM audio?

The mix-up here is that there are two formats in play: the format of the output file, and the format of the buffers you will write (the processing format). The initializer AVAudioFile(forWriting:settings:) does not let you choose the processing format and defaults to deinterleaved, hence your error.
It opens the file for writing using the standard format (deinterleaved floating point).
You need to use the other initializer, AVAudioFile(forWriting:settings:commonFormat:interleaved:), whose last two arguments specify the processing format (the argument names could have been clearer about that, to be honest).
var settings: [String : Any] = [:]
settings[AVFormatIDKey] = kAudioFormatMPEG4AAC
settings[AVAudioFileTypeKey] = kAudioFileCAFType
settings[AVSampleRateKey] = buffer.format.sampleRate
settings[AVNumberOfChannelsKey] = 2
settings[AVLinearPCMIsFloatKey] = (buffer.format.commonFormat == .pcmFormatFloat32)
let tempFile = try AVAudioFile(forWriting: tempURL, settings: settings, commonFormat: buffer.format.commonFormat, interleaved: buffer.format.isInterleaved)
try tempFile.write(from: buffer)
P.S. Passing the buffer format's settings directly to AVAudioFile gets you an LPCM .caf file, which you may not want, so I rebuild the file settings instead.
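To see the two formats side by side, here is a minimal sketch. The temporary URL and the concrete settings values are assumptions chosen for illustration, not taken from the question. It opens a file for writing with an interleaved processing format and prints both the on-disk format and the processing format:

```swift
import AVFoundation

// Hypothetical output location; any writable URL would do.
let sketchURL = URL(fileURLWithPath: NSTemporaryDirectory())
    .appendingPathComponent(UUID().uuidString)
    .appendingPathExtension("caf")

// File (on-disk) format: AAC, stereo, 44.1 kHz (assumed values).
let sketchSettings: [String: Any] = [
    AVFormatIDKey: kAudioFormatMPEG4AAC,
    AVSampleRateKey: 44_100.0,
    AVNumberOfChannelsKey: 2
]

do {
    // The last two arguments describe the buffers you intend to write,
    // not the file on disk.
    let file = try AVAudioFile(forWriting: sketchURL,
                               settings: sketchSettings,
                               commonFormat: .pcmFormatFloat32,
                               interleaved: true)
    print("file format:       \(file.fileFormat)")
    print("processing format: \(file.processingFormat)")
    print("interleaved:       \(file.processingFormat.isInterleaved)")
} catch {
    print(error)
}
```

A buffer passed to write(from:) should match the processing format printed here, which is why the question's interleaved buffer was rejected by a file opened with the two-argument initializer.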

Not positive here, but maybe since you're making the outputFile settings the same as the processing format, it's possible that the processing format has an inflexible policy on interleaving, whereas the file settings format will be fine with it - or vice versa.
Here's what I'd try first. Incomplete example, but should be enough to illustrate the areas to test.
let sourceFile: AVAudioFile
let format: AVAudioFormat
do {
    // for the moment, try this without any specific format and see what it gives you
    // (assign to the outer constant; redeclaring it with `let` here would shadow it)
    sourceFile = try AVAudioFile(forReading: inputFileURL)
    format = sourceFile.processingFormat
    print(format) // let's see what we're getting so far, maybe some clues
} catch {
    fatalError("Unable to load the source audio file: \(error.localizedDescription).")
}
let sourceSettings = sourceFile.fileFormat.settings
var outputSettings = sourceSettings // start with the settings of the original file rather than the buffer format settings
outputSettings[AVAudioFileTypeKey] = kAudioFileCAFType
// etc...

Related

Why is AVURLAsset Not Loading the File?

OK. I have a nasty feeling that this will be met with the gentle chirp of crickets...
I base that on this and this.
I'm actually wondering if this is a feature, not a bug, as maybe there's a security issue with loading a movie locally, then playing it. I would think that isn't the case, but maybe. It should be noted that the loaded asset comes from a REST interaction with a server, in which the movie data is actually just a part of a data query response. It is not something that is loaded directly from a video streaming page (it is SSL, though).
I'm pretty green at AV Foundation.
I have the following code:
do {
    // We create a path to a unique temporary file to grab the media.
    let url = URL(fileURLWithPath: NSTemporaryDirectory()).appendingPathComponent(UUID().uuidString)
    // Store the media in the temp file.
    try myData.write(to: url, options: .atomic)
    let options = [AVURLAssetPreferPreciseDurationAndTimingKey: true]
    let asset = AVURLAsset(url: url, options: options)
    if 0 < asset.tracks.count {
        print("YOU GET \(asset.tracks.count) TRACKS!")
    } else {
        print("NO TRACKS FOR YOU!")
    }
} catch let error {
    NSLog("Error Encoding AV Media: %@", error._domain)
}
Pretty basic, eh? The "myData" variable contains a MP4 movie (.m4v) that was downloaded. I write it to a temp file, then load that temp file with AVURLAsset, just like it says to do.
The problem is that I can never get the dangblammit movie to play. The file is where it's supposed to be. I can fish out the temp file, slap on a '.m4v' extension, and play it in the QT Viewer.
I am quite prepared to accept a slap upside the head, followed by "ya darn eedjut!", but I'd like to know which "M" I should "RTFM".
The problem seems to be with this line:
let url = URL(fileURLWithPath: NSTemporaryDirectory()).appendingPathComponent(UUID().uuidString)
The file name has no extension, and AVURLAsset appears to rely on the extension to identify the media type. Add one:
let videoName = UUID().uuidString + ".mp4"
let url = URL(fileURLWithPath: NSTemporaryDirectory()).appendingPathComponent(videoName)
Hope that fixes it.

Writing AVAudioPCMBuffer into an AVAudioFile compressed

We're working on an application which records and persists microphone input. The use of AVAudioRecorder was not an option, because real-time audio processing is needed.
AVAudioEngine is used because it provides low-level access to the input audio.
let audioEngine = AVAudioEngine()
let inputNode = audioEngine.inputNode
let inputFormat = inputNode.inputFormat(forBus: 0)
inputNode.installTap(onBus: 0, bufferSize: AVAudioFrameCount(inputFormat.sampleRate * sampleInterval), format: inputFormat) { (buffer: AVAudioPCMBuffer, time: AVAudioTime) -> Void in
    // sound preprocessing
    // writing to audio file
    do {
        try audioFile.write(from: buffer)
    } catch {
        print(error)
    }
}
Our issue is that the recording is quite large. For a 5 hour recording, the output audio file is 1.2GB with .caf format.
let audioFile = try AVAudioFile(forWriting: recordingPath, settings: [:], commonFormat: .pcmFormatFloat32, interleaved: isInterleaved)
Is there a nice way to compress the audio file writing to it?
The default sampling frequency is 44100 Hz. We will use AVAudioMixerNode to downsample the input to 20 kHz (lower quality is acceptable in our case), but the output will still be too large.
The recording contains large segments of background noise.
Any suggestions?
The .caf container format supports AAC compression. Enable it by setting the AVAudioFile settings dictionary to [AVFormatIDKey: kAudioFormatMPEG4AAC]:
let audioFile = try! AVAudioFile(forWriting: recordingPath, settings: [AVFormatIDKey: kAudioFormatMPEG4AAC], commonFormat: .pcmFormatFloat32, interleaved: isInterleaved)
There are other settings keys that influence the file size and quality: AVSampleRateKey, AVEncoderBitRateKey and AVEncoderAudioQualityKey.
p.s. you need to close your .caf file when you've finished with it. AVAudioFile doesn't have an explicit close() method, so you close it implicitly by nilling any references to it. Uncompressed .caf files seem to be playable without this, but AAC files are not.
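As a minimal sketch of the "close by dropping the reference" advice, the file can be held in an optional property and set to nil when recording ends. The Recorder class, its method names, and the mono channel count are hypothetical, introduced only for this illustration:

```swift
import AVFoundation

// Hypothetical wrapper that owns the only reference to the output file.
final class Recorder {
    private var audioFile: AVAudioFile?

    var isOpen: Bool { audioFile != nil }

    // Opens an AAC-compressed file; `url` would typically end in ".caf".
    func open(at url: URL, sampleRate: Double) throws {
        let settings: [String: Any] = [
            AVFormatIDKey: kAudioFormatMPEG4AAC,
            AVSampleRateKey: sampleRate,
            AVNumberOfChannelsKey: 1 // assumption: mono microphone input
        ]
        audioFile = try AVAudioFile(forWriting: url,
                                    settings: settings,
                                    commonFormat: .pcmFormatFloat32,
                                    interleaved: false)
    }

    // Call from the tap block; does nothing if no file is open.
    func write(_ buffer: AVAudioPCMBuffer) throws {
        try audioFile?.write(from: buffer)
    }

    // Dropping the last reference lets the file be finalized and closed.
    func finish() {
        audioFile = nil
    }
}
```

This only works if nothing else retains the AVAudioFile, so avoid capturing the file itself in the tap closure; capture the wrapper instead.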

AudioKit File Normalization

I'm trying to normalize an audio file after recording to make it louder or quieter, but I'm getting this error: WARNING AKAudioFile: cannot normalize a silent file
I checked the recorded audioFile.maxLevel and it was 1.17549e-38, the minimum float.
I'm using the official Recorder example, and to normalize after recording I added this code:
let norm = try player.audioFile.normalized(newMaxLevel: -4.0)
What am I doing wrong? Why is maxLevel invalid? The recording is loud enough.
Rather than use the internal audio file of the player, make a new instance like so:
if let file = try? AKAudioFile(forReading: url) {
    if let normalizedFile = try? file.normalized(newMaxLevel: -4) {
        Swift.print("Normalized file success: \(normalizedFile.maxLevel)")
    }
}
I can add a normalize func to the AKAudioPlayer so that it's available for playback. Essentially, the player just uses the AKAudioFile for initialization, and all subsequent operations happen in a buffer.

How can I specify the format of AVAudioEngine Mic-Input?

I'd like to record some audio using AVAudioEngine and the user's microphone. I already have a working sample, but I just can't figure out how to specify the format of the output that I want...
My requirement is that I receive the AVAudioPCMBuffer as I speak, which it currently does...
Would I need to add a separate node that does some transcoding? I can't find much documentation or many samples on that problem...
I'm also a noob when it comes to audio stuff. I know that I want NSData containing 16-bit PCM with a max sample rate of 16000 (8000 would be better).
Here's my working sample:
private var audioEngine = AVAudioEngine()
// `bus` was not defined in the original snippet; bus 0 is assumed here.
private let bus = 0
func startRecording() {
    let format = audioEngine.inputNode!.inputFormatForBus(bus)
    audioEngine.inputNode!.installTapOnBus(bus, bufferSize: 1024, format: format) { (buffer: AVAudioPCMBuffer, time: AVAudioTime) -> Void in
        // Inspect the format of the buffer delivered to the tap.
        let audioFormat = buffer.format
        print("\(audioFormat)")
    }
    audioEngine.prepare()
    do {
        try audioEngine.start()
    } catch { /* Imagine some super awesome error handling here */ }
}
If I changed the format to, let's say,
let format = AVAudioFormat(commonFormat: AVAudioCommonFormat.PCMFormatInt16, sampleRate: 8000.0, channels: 1, interleaved: false)
then it will produce an error saying that the sample rate needs to be the same as the hwInput...
Any help is very much appreciated!!!
EDIT: I just found AVAudioConverter but I need to be compatible with iOS8 as well...
You cannot change the audio format directly on the input or output nodes. In the case of the microphone, the format will always be 44 kHz, 1 channel, 32 bits. To change it, you need to insert a mixer in between. Then, when you connect inputNode > changeformatMixer > mainEngineMixer, you can specify the details of the format you want.
Something like:
var inputNode = audioEngine.inputNode
var downMixer = AVAudioMixerNode()
// I think the engine's I/O nodes are already attached to it by default, so we attach only the downMixer here:
audioEngine.attachNode(downMixer)
// You can tap the downMixer to intercept the audio and do something with it:
downMixer.installTapOnBus(0, bufferSize: 2048, format: downMixer.outputFormatForBus(0), block: // originally 1024
    { (buffer: AVAudioPCMBuffer!, time: AVAudioTime!) -> Void in
        print("downMixer Tap")
        print("Downmixer Tap Format: " + downMixer.outputFormatForBus(0).description)
    })
// Let's get the input audio format right as it is:
let format = inputNode.inputFormatForBus(0)
// I initialize the 16 kHz format I need (the original snippet passed 11050.0 here, which contradicts the name and comments):
let format16KHzMono = AVAudioFormat(commonFormat: AVAudioCommonFormat.PCMFormatInt16, sampleRate: 16000.0, channels: 1, interleaved: true)
// Connect the nodes inside the engine:
// INPUT NODE --format--> downMixer --16Kformat--> mainMixer
// As you can see, I'm downsampling the default 44 kHz we get from the input to the 16 kHz I want.
audioEngine.connect(inputNode, to: downMixer, format: format) // use default input format
audioEngine.connect(downMixer, to: audioEngine.outputNode, format: format16KHzMono) // use new audio format
// Run the engine:
audioEngine.prepare()
try! audioEngine.start()
I would recommend using an open framework such as EZAudio, instead, though.
The only thing I found that worked to change the sampling rate was
try AVAudioSession.sharedInstance().setPreferredSampleRate(...)
You can tap off engine.inputNode and use the input node's output format:
engine.inputNode.installTap(onBus: 0, bufferSize: 2048,
                            format: engine.inputNode.outputFormat(forBus: 0)) { buffer, time in
    // process the buffer here
}
Unfortunately, there is no guarantee that you will get the sample rate that you want, although it seems like 8000, 12000, 16000, 22050, 44100 all worked.
The following did NOT work:
- Setting my custom format in a tap off engine.inputNode. (Exception)
- Adding a mixer with my custom format and tapping that. (Exception)
- Adding a mixer, connecting it with the inputNode's format, connecting the mixer to the main mixer with my custom format, then removing the input of the outputNode so as not to send the audio to the speaker and get instant feedback. (Worked, but got all zeros)
- Not using my custom format at all in the AVAudioEngine, and using AVAudioConverter to convert from the hardware rate in my tap. (Length of the buffer was not set, no way to tell if results were correct)
This was with iOS 12.3.1.
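A minimal sketch of the session-level approach described above, assuming an iOS target (AVAudioSession is not available on macOS). The function name requestSampleRate and the .record category are assumptions for illustration. The preferred rate is only a request, so the sketch reads back the rate the hardware actually granted:

```swift
import AVFoundation

// Asks the shared audio session for `preferred` Hz and returns the rate
// the system actually settled on, which may differ from the request.
func requestSampleRate(_ preferred: Double) throws -> Double {
    let session = AVAudioSession.sharedInstance()
    try session.setCategory(.record, mode: .default)
    try session.setPreferredSampleRate(preferred)
    try session.setActive(true)
    return session.sampleRate
}
```

Calling this before starting the engine, and then building the tap from engine.inputNode.outputFormat(forBus: 0), keeps the tap format in step with whatever rate was granted.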
To change the sample rate of the input node, first connect the input node to a mixer node and specify the new format in the format parameter.
let input = avAudioEngine.inputNode
let mainMixer = avAudioEngine.mainMixerNode
let newAudioFormat = AVAudioFormat(commonFormat: .pcmFormatFloat32, sampleRate: 44100, channels: 1, interleaved: true)
avAudioEngine.connect(input, to: mainMixer, format: newAudioFormat)
Now you can call installTap on the input node with newAudioFormat.
One more thing I'd like to point out: since the launch of the iPhone 12, the default sample rate of the input node is no longer 44100. It is now 48000.
You cannot change the configuration of the input node. Instead, create a mixer node with the format that you want, attach it to the engine, connect it to the input node, and then connect the mainMixer to the node that you just created. Now you can install a tap on this node to get PCM data.
Note that, for some strange reason, you don't have much choice of sample rate, at least not on iOS 9.1. Use the standard 11025, 22050 or 44100; any other sample rate will fail!
If you just need to change the sample rate and channel count, I recommend using the low-level API. You do not need a mixer or converter. Apple's documentation on low-level recording is linked below; if you want, you can wrap it in an Objective-C class and add a protocol.
Audio Queue Services Programming Guide
If your goal is simply to end up with AVAudioPCMBuffers that contains audio in your desired format, you can convert the buffers returned in the tap block using AVAudioConverter. This way, you actually don't need to know or care what the format of the inputNode is.
class MyBufferRecorder {
    private let audioEngine:AVAudioEngine = AVAudioEngine()
    private var inputNode:AVAudioInputNode!
    private let audioQueue:DispatchQueue = DispatchQueue(label: "Audio Queue 5000")
    private var isRecording:Bool = false
    func startRecording() {
        if (isRecording) {
            return
        }
        isRecording = true
        // must convert (unknown until runtime) input format to our desired output format
        inputNode = audioEngine.inputNode
        let inputFormat:AVAudioFormat! = inputNode.outputFormat(forBus: 0)
        // 9600 is somewhat arbitrary... min seems to be 4800, max 19200... it doesn't matter what we set
        // because we don't re-use this value -- we query the buffer returned in the tap block for its true length.
        // Using [weak self] in the tap block is probably a better idea, but it results in weird warnings for now
        inputNode.installTap(onBus: 0, bufferSize: AVAudioFrameCount(9600), format: inputFormat) { (buffer, time) in
            // not sure if this is necessary
            if (!self.isRecording) {
                print("\nDEBUG - rejecting callback, not recording")
                return
            }
            // not really sure if/why this needs to be async
            self.audioQueue.async {
                // Convert recorded buffer to our preferred format
                let convertedPCMBuffer = AudioUtils.convertPCMBuffer(bufferToConvert: buffer, fromFormat: inputFormat, toFormat: AudioUtils.desiredFormat)
                // do something with converted buffer
            }
        }
        do {
            // important not to start engine before installing tap
            try audioEngine.start()
        } catch {
            print("\nDEBUG - couldn't start engine!")
            return
        }
    }
    func stopRecording() {
        print("\nDEBUG - recording stopped")
        isRecording = false
        inputNode.removeTap(onBus: 0)
        audioEngine.stop()
    }
}
Separate class:
import Foundation
import AVFoundation
// assumes we want 16bit, mono, 44100hz
// change to what you want
class AudioUtils {
    static let desiredFormat:AVAudioFormat! = AVAudioFormat(commonFormat: .pcmFormatInt16, sampleRate: Double(44100), channels: 1, interleaved: false)
    // PCM <--> PCM
    static func convertPCMBuffer(bufferToConvert: AVAudioPCMBuffer, fromFormat: AVAudioFormat, toFormat: AVAudioFormat) -> AVAudioPCMBuffer {
        // Scale the capacity by the sample-rate ratio so the output has room when the rates differ.
        let ratio = toFormat.sampleRate / fromFormat.sampleRate
        let capacity = AVAudioFrameCount((Double(bufferToConvert.frameLength) * ratio).rounded(.up))
        let convertedPCMBuffer = AVAudioPCMBuffer(pcmFormat: toFormat, frameCapacity: capacity)
        var error: NSError? = nil
        var inputWasProvided = false
        let inputBlock:AVAudioConverterInputBlock = { inNumPackets, outStatus in
            // Supply the source buffer once; returning it on every call could duplicate audio.
            if inputWasProvided {
                outStatus.pointee = .noDataNow
                return nil
            }
            inputWasProvided = true
            outStatus.pointee = .haveData
            return bufferToConvert
        }
        let formatConverter:AVAudioConverter = AVAudioConverter(from: fromFormat, to: toFormat)!
        formatConverter.convert(to: convertedPCMBuffer!, error: &error, withInputFrom: inputBlock)
        if error != nil {
            print("\nDEBUG - " + error!.localizedDescription)
        }
        return convertedPCMBuffer!
    }
}
This is by no means production-ready code. I'm also learning iOS audio, so please let me know about any errors, best practices, or dangerous things going on in that code and I'll keep this answer updated.
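Since the question asks for NSData containing 16-bit PCM, here is a small sketch of the last step: copying the samples of a converted mono Int16 buffer into a Data value. The function name pcm16Data is hypothetical, and the sketch deliberately handles only single-channel Int16 buffers:

```swift
import AVFoundation

// Copies the samples of a mono, 16-bit integer buffer into Data.
// Returns nil for any other buffer layout.
func pcm16Data(from buffer: AVAudioPCMBuffer) -> Data? {
    guard buffer.format.commonFormat == .pcmFormatInt16,
          buffer.format.channelCount == 1,
          let channels = buffer.int16ChannelData else {
        return nil
    }
    // frameLength is the number of valid frames; capacity may be larger.
    let byteCount = Int(buffer.frameLength) * MemoryLayout<Int16>.size
    return Data(bytes: channels[0], count: byteCount)
}
```

The samples are copied in the host's native byte order, so a consumer on another platform may need to be told the endianness.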

Audible glitches on buffer playback via AVAudioPlayerNode in iOS (Swift) *working in simulator, but not on device

When using an AVAudioPlayerNode to schedule a short buffer to play immediately on a touch event ("Touch Up Inside"), I've noticed audible glitches / artifacts on playback while testing. The audio does not glitch at all in iOS simulator, however there is audible distortion on playback when I run the app on an actual iOS device. The audible distortion occurs randomly (the triggered sound will sometimes sound great, while other times it sounds distorted)
I've tried using different audio files, file formats, and preparing the buffer for playback using the prepareWithFrameCount method, but unfortunately the result is always the same and I'm stuck wondering what could be going wrong..
I've stripped the code down to globals for clarity and simplicity. Any help or insight would be greatly appreciated. This is my first attempt at developing an iOS app and my first question posted on Stack Overflow.
let filePath = NSBundle.mainBundle().pathForResource("BD_withSilence", ofType: "caf")!
let fileURL: NSURL = NSURL(fileURLWithPath: filePath)!
var error: NSError?
let file = AVAudioFile(forReading: fileURL, error: &error)
let fileFormat = file.processingFormat
let frameCount = UInt32(file.length)
let buffer = AVAudioPCMBuffer(PCMFormat: fileFormat, frameCapacity: frameCount)
let audioEngine = AVAudioEngine()
let playerNode = AVAudioPlayerNode()
func startEngine() {
    var error: NSError?
    file.readIntoBuffer(buffer, error: &error)
    audioEngine.attachNode(playerNode)
    audioEngine.connect(playerNode, to: audioEngine.mainMixerNode, format: buffer.format)
    audioEngine.prepare()
    func start() {
        var error: NSError?
        audioEngine.startAndReturnError(&error)
    }
    start()
}
startEngine()
let frameCapacity = AVAudioFramePosition(buffer.frameCapacity)
let frameLength = buffer.frameLength
let sampleRate: Double = 44100.0
func play() {
    func scheduleBuffer() {
        playerNode.scheduleBuffer(buffer, atTime: nil, options: AVAudioPlayerNodeBufferOptions.Interrupts, completionHandler: nil)
        playerNode.prepareWithFrameCount(frameLength)
    }
    if playerNode.playing == false {
        scheduleBuffer()
        let time = AVAudioTime(sampleTime: frameCapacity, atRate: sampleRate)
        playerNode.playAtTime(time)
    }
    else {
        scheduleBuffer()
    }
}
// triggered by a "Touch Up Inside" event on a UIButton in my ViewController
@IBAction func triggerPlay(sender: AnyObject) {
    play()
}
Update:
Ok I think I've identified the source of the distortion: the volume of the node(s) is too great at output and causes clipping. By adding these two lines in my startEngine function, the distortion no longer occurred:
playerNode.volume = 0.8
audioEngine.mainMixerNode.volume = 0.8
However, I still don't know why I need to lower the output: my audio file itself does not clip. I'm guessing that it might be a result of the way that AVAudioPlayerNodeBufferOptions.Interrupts is implemented. When a buffer interrupts another buffer, could there be an increase in output volume as a result of the interruption, causing output clipping? I'm still looking for a solid understanding of why this occurs. If anyone is willing and able to clarify this, that would be fantastic!
Not sure if this is the problem you experienced in 2015, but it may be the same issue that @suthar experienced in 2018.
I experienced a very similar problem, and it was due to the fact that the sample rate on the device is different from the simulator. On macOS it is 44100 and on (late-model) iOS devices it is 48000.
So when you fill your buffer with 44100 samples on a 48000 device, you get 3900 samples of silence. When played back it doesn't sound like silence, it sounds like a glitch.
I used the mainMixer format when connecting my playerNode and also when creating my pcmBuffer. Don't refer to 48000 or 44100 anywhere in the code.
audioEngine.attach( playerNode)
audioEngine.connect( playerNode, to:mixerNode, format:mixerNode.outputFormat(forBus:0))
let pcmBuffer = AVAudioPCMBuffer( pcmFormat:SynthEngine.shared.audioEngine.mainMixerNode.outputFormat( forBus:0),
frameCapacity:AVAudioFrameCount(bufferSize))
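The arithmetic behind the 3900-sample gap can be written out as a short sketch. The helper name silentFrames is hypothetical; it just compares the frames the device needs for a given duration against the frames that were actually filled:

```swift
// Frames left unfilled when a buffer sized for `seconds` at the device
// rate only receives `framesProvided` frames of audio.
func silentFrames(deviceRate: Double, framesProvided: Int, seconds: Double = 1.0) -> Int {
    let framesNeeded = Int((deviceRate * seconds).rounded())
    return max(0, framesNeeded - framesProvided)
}

// One second on a 48 kHz device, filled with 44100 frames:
print(silentFrames(deviceRate: 48_000, framesProvided: 44_100)) // prints 3900
```

Taking the rate from the mixer's output format at runtime, as in the snippet above, makes this gap zero without hard-coding either rate.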
