AVAudioConverter.convertToBuffer throwing error code -50 - ios

I need to convert a .wav file recorded with 2 audio channels to a .wav that has only 1 channel, as well as reduce the bit depth from 32 to 16. I've been trying to use AVAudioConverter.convertToBuffer. However, the conversion throws an error: Error Domain=NSOSStatusErrorDomain Code=-50 "(null)"
Basically, the only things that really need to change are the channel count (strip the audio down to a single channel) and the bit depth. I'm getting these files from a different tool, so I can't just change the way the files are recorded.
I'm not that awesome at working with audio, and I'm a bit stumped. The code I'm working on is below - is there anything I'm missing?
let inAudioFileURL:NSURL = <url_to_wav_file>
var inAudioFile:AVAudioFile?
do
{
    inAudioFile = try AVAudioFile(forReading: inAudioFileURL)
}
catch let error
{
    print ("error: \(error)")
}

let inAudioFormat:AVAudioFormat = inAudioFile!.processingFormat
let inFrameCount:UInt32 = UInt32(inAudioFile!.length)
let inAudioBuffer:AVAudioPCMBuffer = AVAudioPCMBuffer(PCMFormat: inAudioFormat, frameCapacity: inFrameCount)
do
{
    try inAudioFile!.readIntoBuffer(inAudioBuffer)
}
catch let error
{
    print ("readError: \(error)")
}

let startFormat:AVAudioFormat = AVAudioFormat.init(settings: inAudioFile!.processingFormat.settings)
print ("startFormat: \(startFormat.settings)")

var endFormatSettings = startFormat.settings
endFormatSettings[AVLinearPCMBitDepthKey] = 16
endFormatSettings[AVNumberOfChannelsKey] = 1
endFormatSettings[AVEncoderAudioQualityKey] = AVAudioQuality.Medium.rawValue
print ("endFormatSettings: \(endFormatSettings)")

let endFormat:AVAudioFormat = AVAudioFormat.init(settings: endFormatSettings)
let outBuffer = AVAudioPCMBuffer(PCMFormat: endFormat, frameCapacity: inFrameCount)

let avConverter:AVAudioConverter = AVAudioConverter.init(fromFormat: startFormat, toFormat: endFormat)
do
{
    try avConverter.convertToBuffer(outBuffer, fromBuffer: inAudioBuffer)
}
catch let error
{
    print ("avconverterError: \(error)")
}
As for the output:
startFormat:
["AVSampleRateKey": 16000,
"AVLinearPCMBitDepthKey": 32,
"AVLinearPCMIsFloatKey": 1,
"AVNumberOfChannelsKey": 2,
"AVFormatIDKey": 1819304813,
"AVLinearPCMIsNonInterleaved": 0,
"AVLinearPCMIsBigEndianKey": 0]
endFormatSettings:
["AVSampleRateKey": 16000,
"AVLinearPCMBitDepthKey": 16,
"AVLinearPCMIsFloatKey": 1,
"AVNumberOfChannelsKey": 1,
"AVFormatIDKey": 1819304813,
"AVLinearPCMIsNonInterleaved": 0,
"AVLinearPCMIsBigEndianKey": 0,
"AVEncoderQualityKey": 64]
avconverterError: Error Domain=NSOSStatusErrorDomain Code=-50 "(null)"

I'm not 100% sure why this is the case, but I found a solution that got this working for me, so here's how I understand the problem. I found this solution by trying to use the alternate convert(to:error:withInputFrom:) method. Using this was giving me a different error:
`ERROR: AVAudioConverter.mm:526: FillComplexProc: required condition is false: [impl->_inputBufferReceived.format isEqual: impl->_inputFormat]`
The problem was caused by the line where I set up the AVAudioConverter:
let avConverter:AVAudioConverter = AVAudioConverter.init(fromFormat: startFormat, toFormat: endFormat)
It appears that the audio converter wants to be given the same AVAudioFormat instance that the input buffer is using, rather than a copy built from the original's settings. Once I swapped startFormat out for inAudioFormat, the convert(to:error:withInputFrom:) error went away and things worked as expected. I was then able to go back to using the simpler convert(to:fromBuffer:) method, and the original error I was dealing with also went away.
To recap, the line setting up the converter now looks like:
let avConverter:AVAudioConverter = AVAudioConverter.init(fromFormat: inAudioFormat, toFormat: endFormat)
As for the lack of docs on how to use AVAudioConverter, I have no idea why the API reference has next to nothing. Instead, in Xcode, Cmd-click on AVAudioConverter in your code to go to its header file. There are plenty of comments and info there. Not full sample code or anything, but it's at least something.
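For reference, here is roughly what the fixed flow looks like in current Swift syntax. This is a minimal sketch, not the exact code from the question: error handling is collapsed, the function name and parameters are placeholders, and the output format is assumed to be mono 16-bit integer PCM at the source sample rate. The key point is that the converter's input format is the file's processingFormat itself.

import AVFoundation

// Sketch: convert a stereo Float32 .wav buffer to a mono Int16 buffer.
// The converter is built from the file's processingFormat (inFormat),
// which is what fixed the format-mismatch errors described above.
func convertToMono16(fileURL: URL) throws -> AVAudioPCMBuffer? {
    let inFile = try AVAudioFile(forReading: fileURL)
    let inFormat = inFile.processingFormat
    let frameCount = AVAudioFrameCount(inFile.length)

    guard let inBuffer = AVAudioPCMBuffer(pcmFormat: inFormat, frameCapacity: frameCount) else { return nil }
    try inFile.read(into: inBuffer)

    // Mono, 16-bit integer PCM, same sample rate as the source (no rate conversion).
    guard let outFormat = AVAudioFormat(commonFormat: .pcmFormatInt16,
                                        sampleRate: inFormat.sampleRate,
                                        channels: 1,
                                        interleaved: false),
          let outBuffer = AVAudioPCMBuffer(pcmFormat: outFormat, frameCapacity: frameCount),
          let converter = AVAudioConverter(from: inFormat, to: outFormat) else { return nil }

    try converter.convert(to: outBuffer, from: inBuffer)
    return outBuffer
}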

Related

Playing a stereo audio buffer from memory with AVAudioEngine

I am trying to play a stereo audio buffer from memory (not from a file) in my iOS app but my application crashes when I attempt to attach the AVAudioPlayerNode 'playerNode' to the AVAudioEngine 'audioEngine'. The error code that I get is as follows:
Thread 1: Exception: "required condition is false: _outputFormat.channelCount == buffer.format.channelCount"
I don't know if this is due to the way I have declared the AVAudioEngine or the AVAudioPlayerNode, whether there is something wrong with the buffer I am generating, or whether I am attaching the nodes incorrectly (or something else!). I have a feeling that it is something to do with how I am creating the new buffer. I am trying to make a stereo buffer from two separate 'mono' arrays, and perhaps its format is not correct.
I have declared audioEngine: AVAudioEngine! and playerNode: AVAudioPlayerNode! globally:
var audioEngine: AVAudioEngine!
var playerNode: AVAudioPlayerNode!
I then load a mono source audio file that my app is going to process (the data from this file will not be played; it will be loaded into an array, processed and then loaded into a new buffer):
// Read audio file
let audioFileFormat = audioFile.processingFormat
let frameCount = UInt32(audioFile.length)
let audioBuffer = AVAudioPCMBuffer(pcmFormat: audioFileFormat, frameCapacity: frameCount)!
// Read audio data into buffer
do {
    try audioFile.read(into: audioBuffer)
} catch let error {
    print(error.localizedDescription)
}
// Convert buffer to array of floats
let input: [Float] = Array(UnsafeBufferPointer(start: audioBuffer.floatChannelData![0], count: Int(audioBuffer.frameLength)))
The array is then sent twice to a convolution function, which returns a new array each time. This is because the mono source file needs to become a stereo audio buffer:
maxSignalLength = input.count + 256
let leftAudioArray: [Float] = convolve(inputAudio: input, impulse: normalisedLeftImpulse)
let rightAudioArray: [Float] = convolve(inputAudio: input, impulse: normalisedRightImpulse)
The maxSignalLength variable is currently the length of the input signal plus the length of the impulse response (normalisedImpulseResponse) it is being convolved with, which at the moment is 256. This will become an appropriate variable at some point.
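The convolve(inputAudio:impulse:) function itself is not shown in the question. For reference only, a direct-form convolution of this kind, producing inputAudio.count + impulse.count - 1 output samples, would look roughly like the naive sketch below; real code would typically use vDSP or FFT-based convolution instead.

// Hypothetical stand-in for the convolve function referenced above (O(N*M)).
func convolve(inputAudio: [Float], impulse: [Float]) -> [Float] {
    var output = [Float](repeating: 0, count: inputAudio.count + impulse.count - 1)
    for i in 0..<inputAudio.count {
        for j in 0..<impulse.count {
            output[i + j] += inputAudio[i] * impulse[j]
        }
    }
    return output
}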
I then declare and load the new buffer and its format, I have a feeling that the mistake is somewhere around here as this will be the buffer that is played:
let bufferFormat = AVAudioFormat(commonFormat: .pcmFormatFloat32, sampleRate: hrtfSampleRate, channels: 2, interleaved: false)!
let outputBuffer = AVAudioPCMBuffer(pcmFormat: bufferFormat, frameCapacity: AVAudioFrameCount(maxSignalLength))!
Notice that I am not creating an interleaved buffer. I load the stereo audio data into the buffer as follows (which I think may also be wrong):
for ch in 0 ..< 2 {
    for i in 0 ..< maxSignalLength {
        var val: Float!
        if ch == 0 { // Left
            val = leftAudioArray[i]
            // Limit
            if val > 1 {
                val = 1
            }
            if val < -1 {
                val = -1
            }
        } else if ch == 1 { // Right
            val = rightAudioArray[i]
            // Limit
            if val > 1 {
                val = 1
            }
            if val < -1 {
                val = -1
            }
        }
        outputBuffer.floatChannelData![ch][i] = val
    }
}
The audio is also limited to values between -1 and 1.
Then I finally come to (attempting to) load the buffer to the audio node, attach the audio node to the audio engine, start the audio engine and then play the node.
let frameCapacity = AVAudioFramePosition(outputBuffer.frameCapacity)
let frameLength = outputBuffer.frameLength
playerNode.scheduleBuffer(outputBuffer, at: nil, options: AVAudioPlayerNodeBufferOptions.interrupts, completionHandler: nil)
playerNode.prepare(withFrameCount: frameLength)
let time = AVAudioTime(sampleTime: frameCapacity, atRate: hrtfSampleRate)
audioEngine.attach(playerNode)
audioEngine.connect(playerNode, to: audioEngine.mainMixerNode, format: outputBuffer.format)
audioEngine.prepare()
do {
    try audioEngine.start()
} catch let error {
    print(error.localizedDescription)
}
playerNode.play(at: time)
The error that I get in runtime is:
AVAEInternal.h:76 required condition is false: [AVAudioPlayerNode.mm:712:ScheduleBuffer: (_outputFormat.channelCount == buffer.format.channelCount)]
It doesn't show the line that this error occurs on. I have been stuck on this for a while now and have tried lots of different things, but from what I could find there doesn't seem to be much clear information about playing audio from memory, rather than from files, with AVAudioEngine. Any help would be greatly appreciated.
Thanks!
Edit #1: Better title
Edit #2:
UPDATE - I have found out why I was getting the error. It seemed to be caused by setting up the playerNode before attaching it to the audioEngine. Swapping the order stopped the program from crashing and throwing the error:
let frameCapacity = AVAudioFramePosition(outputBuffer.frameCapacity)
let frameLength = outputBuffer.frameLength
audioEngine.attach(playerNode)
audioEngine.connect(playerNode, to: audioEngine.mainMixerNode, format: outputBuffer.format)
audioEngine.prepare()
playerNode.scheduleBuffer(outputBuffer, at: nil, options: AVAudioPlayerNodeBufferOptions.interrupts, completionHandler: nil)
playerNode.prepare(withFrameCount: frameLength)
let time = AVAudioTime(sampleTime: frameCapacity, atRate: hrtfSampleRate)
do {
    try audioEngine.start()
} catch let error {
    print(error.localizedDescription)
}
playerNode.play(at: time)
However, I don't get any sound. After creating an array of floats from the outputBuffer (using the same method as for the input signal) and looking at its contents with a breakpoint, it appears to be empty, so I must also be storing the data in the outputBuffer incorrectly.
You might be creating and filling your buffer incorrectly. Try doing it thus:
let fileURL = Bundle.main.url(forResource: "my_file", withExtension: "aiff")!
let file = try! AVAudioFile(forReading: fileURL)
let buffer = AVAudioPCMBuffer(pcmFormat: file.processingFormat, frameCapacity: UInt32(file.length))!
try! file.read(into: buffer)
I have fixed the issue!
I tried a lot of solutions and ended up completely rewriting the audio engine section of my app, and I now have the AVAudioEngine and AVAudioPlayerNode declared within the ViewController class as follows:
class ViewController: UIViewController {
    var audioEngine: AVAudioEngine = AVAudioEngine()
    var playerNode: AVAudioPlayerNode = AVAudioPlayerNode()
    ...
I am still unclear whether it is better to declare these globally or as class variables in iOS; however, I can confirm that my application plays audio with them declared within the ViewController class. I do know that they shouldn't be declared inside a function, as they will be deallocated and stop playing when the function goes out of scope.
However, I still was not getting any audio output until I set the AVAudioPCMBuffer's frameLength to frameCapacity.
I could find very little information online about creating a new AVAudioPCMBuffer from an array of floats, but this seems to be the missing step needed to make my outputBuffer playable. Before I set it, frameLength was 0 by default.
frameLength isn't something you pass when declaring the AVAudioFormat or initialising the buffer, but it is important: my buffer wasn't playable until I set it manually, after creating the buffer instance:
let bufferFormat = AVAudioFormat(commonFormat: .pcmFormatFloat32, sampleRate: hrtfSampleRate, channels: 2, interleaved: false)!
let frameCapacity = UInt32(audioFile.length)
guard let outputBuffer = AVAudioPCMBuffer(pcmFormat: bufferFormat, frameCapacity: frameCapacity) else {
    fatalError("Could not create output buffer.")
}
outputBuffer.frameLength = frameCapacity // Important!
This took a long time to figure out; hopefully it will help someone else in the future.
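Putting the working pieces together, the buffer-building step can be sketched as follows. This is a minimal illustration of the approach described above, not the poster's exact code: the function name and parameters are placeholders, and the sample rate is passed in where the original used hrtfSampleRate.

import AVFoundation

// Sketch: build a non-interleaved stereo Float32 buffer from two mono arrays.
// Setting frameLength is what makes the buffer actually contain playable frames.
func makeStereoBuffer(left: [Float], right: [Float], sampleRate: Double) -> AVAudioPCMBuffer? {
    guard let format = AVAudioFormat(commonFormat: .pcmFormatFloat32,
                                     sampleRate: sampleRate,
                                     channels: 2,
                                     interleaved: false) else { return nil }
    let frames = AVAudioFrameCount(min(left.count, right.count))
    guard let buffer = AVAudioPCMBuffer(pcmFormat: format, frameCapacity: frames) else { return nil }
    buffer.frameLength = frames // without this, the buffer reports zero valid frames

    for i in 0..<Int(frames) {
        // Clamp to [-1, 1] before writing into the channel data.
        buffer.floatChannelData![0][i] = max(-1, min(1, left[i]))
        buffer.floatChannelData![1][i] = max(-1, min(1, right[i]))
    }
    return buffer
}

Scheduling the resulting buffer on a player node that has already been attached and connected (as in the updated code above) then plays as expected.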

How to determine volume from AVAudioBuffer audio sample

I am attempting to write a small demonstration application that will do some audio measurements (volume and pitch) in real time.
I've got, I think, to the point where I have the audio samples, but I am new to working with audio and not sure where to go next. Is there a way to determine the pitch and volume of a particular sample as a function of the float/integer/byte value of the samples?
Also, I had to add the line "buffer.frameLength = 1" to get the code to run. When I print the variable "inputFormat", I get the value "".
All the material and tutorials that I can find about audio processing (in general and on iOS) seem to require a lot of contextual info that they leave out.
The Swift code below works to get the samples, and outputs Sample: (a float value of roughly -8 to +8).
func test() {
    let inputNode = audioEngine.inputNode
    let inputFormat = inputNode.outputFormat(forBus: 0)
    let bufferSize = 10

    inputNode.installTap(onBus: 0, bufferSize: AVAudioFrameCount(bufferSize), format: inputFormat) { (buffer, time) in
        buffer.frameLength = 1
        var i = 0
        while i < Int(buffer.frameLength) && buffer.floatChannelData != nil
        {
            let sample : Double = Double(buffer.floatChannelData![i].pointee)
            print("\nSample: "+String(sample))
            i += 1
        }
    }

    audioEngine.prepare()
    do {
        try audioEngine.start()
    } catch {
        print(error.localizedDescription)
    }
}
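For what it's worth, this is not from the original post, but a common way to estimate volume from such a tap buffer is to compute the RMS of the samples in each callback and express it in decibels; pitch estimation is more involved (it typically needs an FFT or autocorrelation over a window of samples). A minimal RMS sketch using Accelerate:

import AVFoundation
import Accelerate

// Sketch: estimate the level of one tap callback as RMS, in dB relative to full scale.
// Uses channel 0 only; average across channels if a combined level is needed.
func rmsLevel(of buffer: AVAudioPCMBuffer) -> Float {
    guard let channelData = buffer.floatChannelData, buffer.frameLength > 0 else {
        return -.infinity
    }
    var rms: Float = 0
    vDSP_rmsqv(channelData[0], 1, &rms, vDSP_Length(buffer.frameLength))
    return 20 * log10(max(rms, Float.leastNonzeroMagnitude)) // 0 dB = full scale
}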

Generating video or audio using raw PCM

What is the process for generating a .mov or .m4a file using arrays of Int16 as the stereo channels for audio?
I can easily generate raw PCM data as [Int16] from a .mov file, store it in two files (leftChannel.pcm and rightChannel.pcm) and perform some operations on it for later use. But I am not able to regenerate the video from these files.
Either process will work, i.e. direct video generation using the raw PCM, or an intermediate step of generating an m4a from the PCM.
Update:
I figured out how to convert the PCM arrays to an audio file, but it won't play.
private func convertToM4a(leftChannel leftPath : URL, rightChannel rigthPath : URL, converterCallback : ConverterCallback) {
    let m4aUrl = FileManagerUtil.getTempFileName(parentFolder: FrameExtractor.PCM_ENCODE_FOLDER, fileNameWithExtension: "encodedAudio.m4a")
    if FileManager.default.fileExists(atPath: m4aUrl.path) {
        try! FileManager.default.removeItem(atPath: m4aUrl.path)
    }
    do {
        let leftBuffer = try NSArray(contentsOf: leftPath, error: ()) as! [Int16]
        let rightBuffer = try NSArray(contentsOf: rigthPath, error: ()) as! [Int16]

        let sampleRate = 44100
        let channels = 2
        let frameCapacity = (leftBuffer.count + rightBuffer.count)/2

        let outputSettings = [
            AVFormatIDKey : NSInteger(kAudioFormatMPEG4AAC),
            AVSampleRateKey : NSInteger(sampleRate),
            AVNumberOfChannelsKey : NSInteger(channels),
            AVAudioFileTypeKey : NSInteger(kAudioFileAAC_ADTSType),
            AVLinearPCMIsBigEndianKey : true,
        ] as [String : Any]

        let audioFile = try AVAudioFile(forWriting: m4aUrl, settings: outputSettings, commonFormat: .pcmFormatInt16, interleaved: false)
        let format = AVAudioFormat(commonFormat: .pcmFormatInt16, sampleRate: Double(sampleRate), channels: AVAudioChannelCount(channels), interleaved: false)!

        let pcmBuffer = AVAudioPCMBuffer(pcmFormat: format, frameCapacity: AVAudioFrameCount(frameCapacity))!
        pcmBuffer.frameLength = pcmBuffer.frameCapacity

        for i in 0..<leftBuffer.count {
            pcmBuffer.int16ChannelData![0][i] = leftBuffer[i]
        }
        for i in 0..<rightBuffer.count {
            pcmBuffer.int16ChannelData![1][i] = rightBuffer[i]
        }

        try! audioFile.write(from: pcmBuffer)
        converterCallback.m4aEncoded(to: m4aUrl)
    } catch {
        print(error.localizedDescription)
    }
}
Saving it as .m4a, with AVAudioFileTypeKey set to the m4a type, was giving a malformed-file error.
Saving it as .aac with the above settings produces a file that plays, but with broken sound: just a buzzing noise with a slow-motion effect of the original audio. Initially I thought it was something to do with mismatched input and output sample rates, but that was not the case.
I assume that something is wrong in the output settings dictionary. Any help would be appreciated.
At least the creation of the AAC file with the code you are showing works.
I wrote out two NSArrays with valid Int16 audio data, and with your code I get a valid result that, when played in QuickTime Player (using the suffix .aac), sounds the same as the input.
How are you creating the input?
A buzzing sound (with lots of noise) happens, for example, if you read in audio data using an AVAudioFormat with the .pcmFormatInt16 format but the data actually read is in .pcmFormatFloat32 format (most commonly the default format). Unfortunately there is no runtime warning if you try to do so.
If that's the case, try using .pcmFormatFloat32. If you need it in Int16, you can convert it yourself by basically mapping [-1, 1] to [-32768, 32767] for both channels:
let fac = Float(1 << 15)
for i in 0..<count {
    let val = min(max(inBuffer!.floatChannelData![ch][i] * fac, -fac), fac - 1)
    xxx[i] = Int16(val)
}
...
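To make the mapping concrete, here is a self-contained version of that idea. It is a sketch, not code from the answer: inBuffer is assumed to be a non-interleaved .pcmFormatFloat32 buffer with at least two channels, and the returned arrays replace the xxx placeholder above.

import AVFoundation

// Sketch: scale Float32 samples in [-1, 1] to the Int16 range and clamp, per channel.
func int16Samples(from inBuffer: AVAudioPCMBuffer) -> (left: [Int16], right: [Int16]) {
    let frames = Int(inBuffer.frameLength)
    let fac = Float(1 << 15)
    var left = [Int16](repeating: 0, count: frames)
    var right = [Int16](repeating: 0, count: frames)
    guard let data = inBuffer.floatChannelData, inBuffer.format.channelCount >= 2 else {
        return (left, right)
    }
    for i in 0..<frames {
        left[i] = Int16(min(max(data[0][i] * fac, -fac), fac - 1))
        right[i] = Int16(min(max(data[1][i] * fac, -fac), fac - 1))
    }
    return (left, right)
}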

How to record audio in wav format in Swift?

I have searched everywhere for this and I couldn't find a proper way of doing it. I have succeeded in recording in .wav format, but the problem is that when I try reading the raw data from the recorded .wav file, some chunks are in the wrong place or aren't there at all.
My code for recording audio:
func startRecording() {
    let audioSession = AVAudioSession.sharedInstance()
    try! audioSession.setCategory(AVAudioSessionCategoryPlayAndRecord)
    try! audioSession.setActive(true)
    audioSession.requestRecordPermission({(allowed: Bool) -> Void in print("Accepted")} )

    let settings: [String : AnyObject] = [
        AVFormatIDKey:Int(kAudioFormatLinearPCM),
        AVSampleRateKey:44100.0,
        AVNumberOfChannelsKey:1,
        AVLinearPCMBitDepthKey:8,
        AVLinearPCMIsFloatKey:false,
        AVLinearPCMIsBigEndianKey:false,
        AVEncoderAudioQualityKey:AVAudioQuality.Max.rawValue
    ]

    let date = NSDate()
    let df = NSDateFormatter()
    df.dateFormat = "yyyy-MM-dd-HH:mm:ss"
    let dfString = df.stringFromDate(date)
    let fullPath = documentsPath.stringByAppendingString("/\(dfString).wav")

    recorder = try! AVAudioRecorder(URL: NSURL(string: fullPath)!, settings: settings)
    recorder.delegate = self
    recorder.prepareToRecord()
    recorder.record()
}
When I print out the data of the recorded audio file, I get a weird number where 'd' 'a' 't' 'a' should be written, followed by zeros. And then, in the middle of the data, it appears.
No 64617461 ('d' 'a' 't' 'a') chunk - it should be in place of 464c4c52
64617461 ('d' 'a' 't' 'a') at random spot after a lot of zeros
Is there a better way of recording a wav file? I am not sure why this is happening, so any help would be appreciated. Should I maybe record in another format and then convert it to raw?
Thanks and sorry for so many images.
I think only the fmt chunk is guaranteed to come first. It looks like it's fine to have other chunks before the data chunk, so just skip over non-data chunks.
From http://soundfile.sapp.org/doc/WaveFormat/
A RIFF file starts out with a file header followed by a sequence of data chunks.
You need to update your parser :)
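The skipping itself is straightforward. The sketch below is not from the original answer; it assumes the whole file fits in a Data, that chunk sizes are little-endian as in a standard RIFF/WAVE file, and it simply walks the chunk headers until it finds "data".

import Foundation

// Sketch: locate the payload of the "data" chunk in a RIFF/WAVE file, skipping
// any other chunks that precede it (such as the 'FLLR' chunk, 0x464c4c52,
// seen in the question).
func dataChunkRange(in wav: Data) -> Range<Int>? {
    var offset = 12 // skip "RIFF", the 4-byte file size and "WAVE"
    while offset + 8 <= wav.count {
        let id = String(bytes: wav[offset..<offset + 4], encoding: .ascii) ?? ""
        let size = Int(wav[offset + 4]) | (Int(wav[offset + 5]) << 8) |
                   (Int(wav[offset + 6]) << 16) | (Int(wav[offset + 7]) << 24)
        if id == "data" {
            return (offset + 8)..<min(offset + 8 + size, wav.count)
        }
        offset += 8 + size + (size & 1) // chunks are padded to an even length
    }
    return nil
}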

Swift iOS AVSampleBufferDisplayLayer setting up the video layer

I'm attempting to set up a video stream viewer in a Swift based project.
I've reviewed the following (Objective C) which has been very helpful: How AVSampleBufferDisplayLayer displays H.264
In a Swift context, I am having difficulty with the fact that CMTimebaseCreateWithMasterClock requires the CMTimebase-related argument to be of type UnsafeMutablePointer. Could someone explain how to convert to this and back, to address the issue in the following code section?
var controlTimebase : CMTimebase
var myAllocator : CFAllocator!
CMTimebaseCreateWithMasterClock( myAllocator, CMClockGetHostTimeClock(), CMTimebase)
// Problem is here...below is the expected format.
//CMTimebaseCreateWithMasterClock(allocator: CFAllocator!, masterClock: CMClock!, timebaseOut: UnsafeMutablePointer < Unmanaged < CMTimebase > ? >)
videoLayer.controlTimebase = controlTimebase
Spotted the UnsafeMutablePointer syntax needed in a different context here:
CVPixelBufferPool Error ( kCVReturnInvalidArgument/-6661)
Using that, the following seems to compile happily :-)
var _CMTimebasePointer = UnsafeMutablePointer<Unmanaged<CMTimebase>?>.alloc(1)
CMTimebaseCreateWithMasterClock( kCFAllocatorDefault, CMClockGetHostTimeClock(), _CMTimebasePointer )
videoLayer.controlTimebase = _CMTimebasePointer.memory?.takeUnretainedValue()
In more recent Swift, the equivalent looks like this:
let cmTimebasePointer = UnsafeMutablePointer<CMTimebase?>.allocate(capacity: 1)
let status = CMTimebaseCreateWithMasterClock(allocator: kCFAllocatorDefault, masterClock: CMClockGetHostTimeClock(), timebaseOut: cmTimebasePointer)
videoLayer.controlTimebase = cmTimebasePointer.pointee
if let controlTimeBase = videoLayer.controlTimebase, status == noErr {
    CMTimebaseSetTime(controlTimeBase, time: CMTime.zero)
    CMTimebaseSetRate(controlTimeBase, rate: 1)
}
