Using AVAudioEngine to record to a compressed file - iOS

I'm trying to use AVAudioEngine to record sounds from the microphone together with various sound effect files to an AVAudioFile.
I create an AVAudioFile like this:
let settings = self.engine.mainMixerNode.outputFormatForBus(0).settings
try self.audioFile = AVAudioFile(forWriting: self.audioURL, settings: settings, commonFormat: .PCMFormatFloat32, interleaved: false)
I install a tap on the audio engine's mainMixerNode, where I write the buffer to the file:
self.engine.mainMixerNode.installTapOnBus(0, bufferSize: 4096, format: self.engine.mainMixerNode.outputFormatForBus(0)) { (buffer, time) -> Void in
    do {
        try self.audioFile?.writeFromBuffer(buffer)
    } catch let error as NSError {
        NSLog("Error writing %@", error.localizedDescription)
    }
}
I'm using self.engine.mainMixerNode.outputFormatForBus(0).settings when creating the audio file since Apple states that "The buffer format MUST match the file's processing format which is why outputFormatForBus: was used when creating the AVAudioFile object above". The documentation for installTapOnBus also says: "The tap and connection formats (if non-nil) on the specified bus should be identical".
However, this gives me a very large, uncompressed audio file. I want to save the file as .m4a but don't understand where to specify the settings I want to use:
[
    AVFormatIDKey: NSNumber(unsignedInt: kAudioFormatMPEG4AAC),
    AVSampleRateKey: NSNumber(double: 32000.0), // 44100.0
    AVNumberOfChannelsKey: NSNumber(int: 1),
    AVEncoderBitRatePerChannelKey: NSNumber(int: 16),
    AVEncoderAudioQualityKey: NSNumber(int: Int32(AVAudioQuality.High.rawValue))
]
If I pass in these settings instead when creating the audio file, the app crashes when I record.
Any suggestions or ideas on how to solve this?
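One hedged sketch of a direction worth trying, based on the "Writing AVAudioPCMBuffer into an AVAudioFile compressed" answer further down this page: keep the float, non-interleaved processing format for the tap buffers, and pass the AAC settings only as the file's own format. The bit rate below is an illustrative value, and the sample rate and channel count are copied from the mixer so they still match the buffers being written (the names mixerFormat and aacSettings are new; the rest comes from the question):
let mixerFormat = self.engine.mainMixerNode.outputFormatForBus(0)
let aacSettings: [String: AnyObject] = [
    AVFormatIDKey: NSNumber(unsignedInt: kAudioFormatMPEG4AAC),
    AVSampleRateKey: NSNumber(double: mixerFormat.sampleRate), // match the mixer so the tap buffers fit the processing format
    AVNumberOfChannelsKey: NSNumber(unsignedInt: mixerFormat.channelCount),
    AVEncoderBitRateKey: NSNumber(int: 64000) // illustrative bit rate
]
try self.audioFile = AVAudioFile(forWriting: self.audioURL, settings: aacSettings, commonFormat: .PCMFormatFloat32, interleaved: false)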

Related

Sound volume decreasing for no apparent reason

I have an iOS app using SwiftUI. It handles a few sound files and performs some audio recording. This is the function doing the recording work:
func recordAudio(to file: String, for duration: TimeInterval) {
    let audioSession: AVAudioSession = AVAudioSession.sharedInstance()
    do {
        try audioSession.setCategory(.playAndRecord, mode: .default)
        try audioSession.setActive(true)
        let audioFilename = getDocumentsDirectory().appendingPathComponent(file + ".m4a"),
            audioURL = URL(fileURLWithPath: audioFilename),
            settings = [
                AVFormatIDKey: Int(kAudioFormatMPEG4AAC),
                AVSampleRateKey: 44100,
                AVNumberOfChannelsKey: 2,
                AVEncoderAudioQualityKey: AVAudioQuality.high.rawValue
            ]
        audioRecorder = try AVAudioRecorder(url: audioURL, settings: settings)
        audioRecorder.delegate = self
        audioRecorder.record(forDuration: TimeInterval(2.0))
    } catch let error as NSError {
        print("Failed -- Recording !!! -> \(error)")
    }
}
At this point, it basically works, but there is a strange behaviour that I neither understand nor like.
Here is the problem:
When I start the app and play a sound file, the volume is right for my taste.
Then without ever adjusting the volume I perform some recording (using the function above).
Finally, after the recording is done, I go back to the file I played just before and play it again; the volume has mysteriously gone down, and I don't know why.
Is there something in my function that could explain that?
Or some other cause that someone could think of?
If I restart the app, the volume automatically goes back to normal.
For information, I am using iOS 14.4.2 and Xcode 12.4.
The audio session plays back at a decreased volume after recording while the category is still .playAndRecord. After recording, explicitly set the category back to something like .playback to get the volume you're expecting.
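A minimal sketch of that reset, assuming it runs once recording has finished (for example from audioRecorderDidFinishRecording or right after the fixed duration elapses; the helper name is made up):
func restorePlaybackVolume() {
    let session = AVAudioSession.sharedInstance()
    do {
        // Leaving .playAndRecord restores the normal playback route and volume
        try session.setCategory(.playback, mode: .default)
        try session.setActive(true)
    } catch {
        print("Failed to switch audio session back to playback: \(error)")
    }
}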

AVAssetWriter fails only in iOS when writing audio from specific videos

I have a sample project for resizing videos that works well for most videos. However, AVAssetWriter fails to write the audio from specific videos with the error:
Error Domain=AVFoundationErrorDomain Code=-11800 "The operation could not be completed"
UserInfo={
    NSLocalizedFailureReason=An unknown error occurred (-12780),
    NSLocalizedDescription=The operation could not be completed,
    NSUnderlyingError=0x282e956e0 {
        Error Domain=NSOSStatusErrorDomain Code=-12780 "(null)"
    }
}
What is even more problematic is that the same code works fine if I run it on macOS, but it breaks in iOS. I think it isn't a hardware problem because it also breaks in the iOS simulator.
These are the settings I use for (de)compressing the asset tracks:
func audioDecompressionSettings() -> [String: Any] {
    return [
        AVFormatIDKey: kAudioFormatLinearPCM
    ]
}
func audioCompressionSettings() -> [String: Any] {
    var audioChannelLayout = AudioChannelLayout()
    memset(&audioChannelLayout, 0, MemoryLayout<AudioChannelLayout>.size)
    audioChannelLayout.mChannelLayoutTag = kAudioChannelLayoutTag_Stereo
    return [
        AVFormatIDKey: kAudioFormatMPEG4AAC,
        AVSampleRateKey: 44100,
        AVEncoderBitRateKey: 128000,
        AVNumberOfChannelsKey: 2,
        AVChannelLayoutKey: NSData(bytes: &audioChannelLayout, length: MemoryLayout<AudioChannelLayout>.size)
    ]
}
func videoDecompressionSettings() -> [String: Any] {
    return [
        kCVPixelBufferPixelFormatTypeKey as String: kCVPixelFormatType_420YpCbCr8BiPlanarFullRange,
        kCVPixelBufferMetalCompatibilityKey as String: true
    ]
}
func videoCompressionSettings(size: CGSize) -> [String: Any] {
    return [
        AVVideoCodecKey: AVVideoCodecType.h264,
        AVVideoWidthKey: size.width,
        AVVideoHeightKey: size.height
    ]
}
The complete source code can be found here.
In that project there are two targets, one for Mac and the other for iOS, both using the same code for resizing the video. I also included two sample video files: fruit.mp4 and rain.mp4. The first one works well in both targets, but the second one breaks in iOS.
Am I missing something here, or is this likely to be an Apple bug?
The audio settings for the problematic video are:
Stream #0:1(und): Audio: aac (LC) (mp4a / 0x6134706D), 48000 Hz, 5.1, fltp, 386 kb/s (default)
and for the other one are:
Stream #0:1(und): Audio: aac (LC) (mp4a / 0x6134706D), 48000 Hz, stereo, fltp, 137 kb/s (default)
The important difference between the two is the number of audio channels: 5.1 (5 full bandwidth channels + one low-frequency effects channel) in the first one and stereo (2) in the second one.
When reading the video file, we specify the decompression settings:
[AVFormatIDKey: kAudioFormatLinearPCM]
This means that the decompressed audio will have the same number of channels as the source file. In our case, we have a 5.1 (actually 6-channel) asset and we want to write it to a 2-channel file. It seems that AVAssetWriterInput doesn't handle that case properly on iOS, and we get an error.
The solution to the problem is to specify the number of audio channels we want when decompressing the audio from the asset, like this:
[
    AVFormatIDKey: kAudioFormatLinearPCM,
    AVNumberOfChannelsKey: 2
]
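For context, a sketch of where those settings get used, assuming an AVAssetReader pipeline like the one in the linked project (audioTrack here is a stand-in for the source asset's audio track):
let decompressionSettings: [String: Any] = [
    AVFormatIDKey: kAudioFormatLinearPCM,
    AVNumberOfChannelsKey: 2 // force stereo so the AAC writer input gets the channel count it expects
]
let audioOutput = AVAssetReaderTrackOutput(track: audioTrack, outputSettings: decompressionSettings)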

AudioKit empty file using renderToFile with AKSequencer

I'm trying to use AudioKit.renderToFile() to export short MIDI passages to audio (m4a):
// renderSequencer is an instance of AKSequencer
self.renderSequencer.loadMIDIFile(fromURL: midiURL)
Conductor.sharedInstance.setInstrument(renderItem.soundID, forOfflineRender: true)
// we only have one track with note content
for track in self.renderSequencer.tracks {
    if track.isNotEmpty {
        track.setMIDIOutput(Conductor.sharedInstance.midiIn)
    }
}
let audioCacheDir = self.module.stateManager.audioCacheDirectory
// strip name off midi file
let midiFileName = String(midiURL.lastPathComponent.split(separator: ".")[0])
audioFileName = midiFileName
audioFileURL = audioCacheDir.appendingPathComponent("\(midiFileName).m4a")
if let audioFileURL = audioFileURL {
    let settings = [
        AVFormatIDKey: Int(kAudioFormatMPEG4AAC),
        AVSampleRateKey: 44100,
        AVNumberOfChannelsKey: 2,
        AVEncoderAudioQualityKey: AVAudioQuality.high.rawValue
    ]
    let audioFile: AVAudioFile = try! AVAudioFile(forWriting: audioFileURL, settings: settings)
    // get time in seconds of audio file (with 4-beat tail)
    var duration: Float64 = 0.0
    MusicSequenceGetSecondsForBeats(seq, (16.0 + 4), &duration)
    // render sequence
    do {
        try AudioKit.renderToFile(audioFile, duration: duration) {
            self.renderSequencer.setRate(60.0)
            self.renderSequencer.play()
        }
    } catch {
        print("Error performing offline file render!")
    }
}
This does produce an audio file of the expected duration, but it is silent. I've also tried logging from my MIDI output and can see that the events "played" from inside the preload closure are actually being sent/handled.
Mostly, I suppose, I'm curious to know whether this is actually expected to work. I've seen a couple of posts suggesting that renderToFile from MIDI is not supported (while others have suggested they have it working).
I did, by the way, also post an issue on the AudioKit GitHub.

Writing AVAudioPCMBuffer into an AVAudioFile compressed

We're working on an application which records and persists microphone input. The use of AVAudioRecorder was not an option, because real-time audio processing is needed.
AVAudioEngine is used because it provides low-level access to the input audio.
let audioEngine = AVAudioEngine()
let inputNode = audioEngine.inputNode
let inputFormat = inputNode.inputFormat(forBus: 0)
inputNode.installTap(onBus: 0, bufferSize: AVAudioFrameCount(inputFormat.sampleRate * sampleInterval), format: inputFormat) { (buffer: AVAudioPCMBuffer, time: AVAudioTime) -> Void in
    // sound preprocessing
    // writing to audio file
    try? audioFile.write(from: buffer)
}
Our issue is that the recording is quite large. For a 5-hour recording, the output audio file is 1.2 GB in .caf format.
let audioFile = try! AVAudioFile(forWriting: recordingPath, settings: [:], commonFormat: .pcmFormatFloat32, interleaved: isInterleaved)
Is there a nice way to compress the audio file while writing to it?
The default sampling frequency is 44100 Hz. We will use an AVAudioMixerNode to downsample the input to 20 kHz (lower quality is acceptable in our case), but even then the output won't be an acceptable size.
The recording contains large segments of background noise.
Any suggestions?
The .caf container format supports AAC compression. Enable it by setting the AVAudioFile settings dictionary to [AVFormatIDKey: kAudioFormatMPEG4AAC]:
let audioFile = try! AVAudioFile(forWriting: recordingPath, settings: [AVFormatIDKey: kAudioFormatMPEG4AAC], commonFormat: .pcmFormatFloat32, interleaved: isInterleaved)
There are other settings keys that influence the file size and quality: AVSampleRateKey, AVEncoderBitRateKey and AVEncoderAudioQualityKey.
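For example, a sketch of the same call with those extra keys filled in with illustrative (not recommended) values:
let compressedSettings: [String: Any] = [
    AVFormatIDKey: kAudioFormatMPEG4AAC,
    AVSampleRateKey: 44100, // keep matching the buffers you write
    AVEncoderBitRateKey: 32000, // illustrative; the main lever for AAC file size
    AVEncoderAudioQualityKey: AVAudioQuality.medium.rawValue
]
let audioFile = try! AVAudioFile(forWriting: recordingPath, settings: compressedSettings, commonFormat: .pcmFormatFloat32, interleaved: isInterleaved)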
p.s. you need to close your .caf file when you've finished with it. AVAudioFile doesn't have an explicit close() method, so you close it implicitly by nilling any references to it. Uncompressed .caf files seem to be playable without this, but AAC files are not.
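A sketch of that teardown, assuming the file is held in an optional property (audioFile) rather than the local constant above, and that the tap from the question is still installed:
func stopRecording() {
    audioEngine.inputNode.removeTap(onBus: 0) // stop delivering buffers first
    audioEngine.stop()
    audioFile = nil // dropping the last reference implicitly closes/finalizes the AAC file
}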

iOS Swift read PCM Buffer

I have an Android project that reads a short[] array of PCM data from the microphone buffer for live analysis. I need to convert this functionality to iOS Swift. In Android it is very simple and looks like this:
import android.media.AudioFormat;
import android.media.AudioRecord;
...
AudioRecord recorder = new AudioRecord(MediaRecorder.AudioSource.DEFAULT, someSampleRate, AudioFormat.CHANNEL_IN_MONO, AudioFormat.ENCODING_PCM_16BIT, AudioRecord.getMinBufferSize(...));
recorder.startRecording();
Later I read the buffer with:
recorder.read(data, offset, length); // data is short[]
(That's what I'm looking for.)
Documentation: https://developer.android.com/reference/android/media/AudioRecord.html
I'm very new to Swift and iOS. I've read a lot of documentation about AudioToolkit, ...Core and whatever. All I found were C++/Obj-C and bridging-header solutions. That's much too advanced and outdated for me.
For now I can record PCM data to a CAF file with AVFoundation:
settings = [
    AVLinearPCMBitDepthKey: 16 as NSNumber,
    AVFormatIDKey: Int(kAudioFormatLinearPCM),
    AVLinearPCMIsBigEndianKey: 0 as NSNumber,
    AVLinearPCMIsFloatKey: 0 as NSNumber,
    AVSampleRateKey: 12000.0,
    AVNumberOfChannelsKey: 1 as NSNumber,
]
...
recorder = try AVAudioRecorder(URL: someURL, settings: settings)
recorder.delegate = self
recorder.record()
But that's not what I'm looking for (or is it?). Is there an elegant way to achieve the Android read functionality described above? I need to get a sample array from the microphone buffer. Or do I need to do the reading on the recorded CAF file?
Thanks a lot! Please help me with easy explanations or code examples. iOS terminology is not mine yet ;-)
If you don't mind floating-point samples and 48 kHz, you can quickly get audio data from the microphone like so:
let engine = AVAudioEngine() // instance variable
func setup() {
    let input = engine.inputNode!
    let bus = 0
    input.installTapOnBus(bus, bufferSize: 512, format: input.inputFormatForBus(bus)) { (buffer, time) -> Void in
        let samples = buffer.floatChannelData[0]
        // audio callback, samples in samples[0]...samples[buffer.frameLength-1]
    }
    try! engine.start()
}
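If you want something closer to Android's short[] buffer, a sketch of copying the tap's float samples into a Swift array inside that callback (samples and buffer come from the snippet above; channelSamples is a made-up name):
let frameCount = Int(buffer.frameLength)
let channelSamples = Array(UnsafeBufferPointer(start: samples, count: frameCount))
// channelSamples is now a [Float] you can analyse like the Android short[] data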
