Xcode recording in AAC format not working - iOS

Here is my code for recording audio in my iOS8 Swift app:
var fileName = "/SFRecording-" + String(recordingSequence) + ".caf"
var str = storageLocation + fileName
var url = NSURL.fileURLWithPath(str as String)
audioSession.setCategory(AVAudioSessionCategoryRecord, error: nil)
audioSession.setActive(true, error: nil)
var recordSettings = [
AVFormatIDKey:kAudioFormatAppleIMA4,
AVSampleRateKey:44100.0,
AVNumberOfChannelsKey:2,
AVEncoderBitRateKey:12800,
AVLinearPCMBitDepthKey:16,
AVEncoderAudioQualityKey:AVAudioQuality.Max.rawValue
]
var error: NSError?
realRecorder = AVAudioRecorder(URL:url, settings: recordSettings as [NSObject : AnyObject], error: &error)
It works fine, but the resultant CAF file is useless on Windows systems. I wanted to record in a more familiar format like MP3, but it turns out you cannot on iOS due to licensing issues.
Now I want to record in AAC format, for which I have switched the file extension from .CAF to .AAC in the above code and also changed the AVFormatIDKey value from kAudioFormatAppleIMA4 to kAudioFormatMPEG4AAC, but those settings fail to record anything. Am I supposed to change some other setting too to make AAC recording work?
Remember, my objective is to record in a format that is readily playable on macOS, Windows, and in browsers.

I figured it out myself. I had to remove the lines below for the .AAC recording to work (a commonly used settings sketch follows the list):
AVSampleRateKey:44100.0,
AVNumberOfChannelsKey:2,
AVEncoderBitRateKey:12800,
AVLinearPCMBitDepthKey:16,
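For comparison, here is a sketch of my own (in current Swift) of AAC settings that are commonly used and produce a playable .m4a. My assumption, not something stated above, is that the keys most likely to break AAC encoding are the PCM-only AVLinearPCMBitDepthKey and the very low AVEncoderBitRateKey, while valid sample rate and channel values are normally accepted:
import AVFoundation

// Sketch: commonly used AAC recording settings (current Swift syntax).
// The file name is a placeholder; session setup is the same as in the question.
let aacSettings: [String: Any] = [
    AVFormatIDKey: Int(kAudioFormatMPEG4AAC),
    AVSampleRateKey: 44100.0,
    AVNumberOfChannelsKey: 2,
    AVEncoderAudioQualityKey: AVAudioQuality.high.rawValue
]

do {
    let url = FileManager.default.temporaryDirectory.appendingPathComponent("SFRecording-1.m4a")
    // Keep a strong reference to the recorder in real code so recording continues.
    let recorder = try AVAudioRecorder(url: url, settings: aacSettings)
    recorder.record()
} catch {
    print("Could not start AAC recording: \(error)")
}
The resulting .m4a (MPEG-4 container with AAC audio) should play on macOS, Windows, and in browsers.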

Related

How to get All Extensions for UTType Image, Audio, and Video

Is there a way to get All of the different UTType extension types as Strings? I need them specifically for images, audio, and video.
I followed this answer, but it doesn't give me all of the extensions
var types = [String]()
let utiTypes = [kUTTypeImage, kUTTypeMovie, kUTTypeVideo, kUTTypeMP3, kUTTypeAudio, kUTTypeQuickTimeMovie, kUTTypeMPEG, kUTTypeMPEG2Video, kUTTypeMPEG2TransportStream, kUTTypeMPEG4, kUTTypeMPEG4Audio, kUTTypeAppleProtectedMPEG4Audio, kUTTypeAppleProtectedMPEG4Video, kUTTypeAVIMovie, kUTTypeAudioInterchangeFileFormat, kUTTypeWaveformAudio, kUTTypeMIDIAudio, kUTTypeLivePhoto, kUTTypeTIFF, kUTTypeGIF, kUTTypeQuickTimeImage, kUTTypeAppleICNS]
for type in utiTypes {
    let str = String(type)
    guard let utiStr = fileExtension(for: str) else { continue }
    types.append(utiStr)
}
dump(types)
The results are
15 elements // there are really 21 types
- "jpeg"
- "png"
- "mov"
- "mpg"
- "m2v"
- "ts"
- "mp3"
- "mp4"
- "mp4"
- "avi"
- "aiff"
- "wav"
- "midi"
- "tiff"
- "gif"
The issue here is that it doesn't return values like qt or jpg. For example, I use UIDocumentPickerViewController, and when I select an image the returned URL's pathExtension is jpg, not jpeg. If I wanted to know whether the returned URL was an image and compared its pathExtension to the types array above, it would say that it doesn't appear in the list.
You can do:
import UniformTypeIdentifiers
let utiTypes = [UTType.image, .movie, .video, .mp3, .audio, .quickTimeMovie, .mpeg, .mpeg2Video, .mpeg2TransportStream, .mpeg4Movie, .mpeg4Audio, .appleProtectedMPEG4Audio, .appleProtectedMPEG4Video, .avi, .aiff, .wav, .midi, .livePhoto, .tiff, .gif, UTType("com.apple.quicktime-image"), .icns]
print(utiTypes.flatMap { $0?.tags[.filenameExtension] ?? [] })
When I run this code in a playground, there are 33 file extensions in total for the UTTypes you have listed. Note that some UTTypes you have listed have no file name extensions associated with them, probably because they are too "generic" (e.g. "image" and "video"). Some UTTypes have multiple file name extensions, and some of those extensions are shared with other UTTypes.
There is no "jpg" or "png" in the output. To see them appear, you will have to use this list:
// I've also removed the types that have no file name extensions
let utiTypes = [
UTType.jpeg,
.png,
.mp3,
.quickTimeMovie,
.mpeg,
.mpeg2Video,
.mpeg2TransportStream,
.mpeg4Movie,
.mpeg4Audio,
.appleProtectedMPEG4Audio,
.avi,
.aiff,
.wav,
.midi,
.tiff,
.gif,
UTType("com.apple.quicktime-image"),
.icns
]
Using the above list, the output for me is:
jpeg
jpg
jpe
png
mp3
mpga
mov
qt
mpg
mpeg
mpe
m75
m15
m2v
ts
mp4
mpg4
mp4
mpg4
m4p
avi
vfw
aiff
aif
wav
wave
bwf
midi
mid
smf
kar
tiff
tif
gif
qtif
qti
icns
Also note that if you want to get the UTType from a file name extension, you can just do:
let type = UTType(tag: "jpg", tagClass: .filenameExtension, conformingTo: nil)
and check whether the file name extension is e.g. that of an image by doing:
type?.isSubtype(of: .image)
Though bear in mind that the file does not necessarily represent an image just because its name says it is :)
For those who are looking for all possible types - here is the full list as of iOS 15:
func allUTITypes() -> [UTType] {
let types : [UTType] =
[.item,
.content,
.compositeContent,
.diskImage,
.data,
.directory,
.resolvable,
.symbolicLink,
.executable,
.mountPoint,
.aliasFile,
.urlBookmarkData,
.url,
.fileURL,
.text,
.plainText,
.utf8PlainText,
.utf16ExternalPlainText,
.utf16PlainText,
.delimitedText,
.commaSeparatedText,
.tabSeparatedText,
.utf8TabSeparatedText,
.rtf,
.html,
.xml,
.yaml,
.sourceCode,
.assemblyLanguageSource,
.cSource,
.objectiveCSource,
.swiftSource,
.cPlusPlusSource,
.objectiveCPlusPlusSource,
.cHeader,
.cPlusPlusHeader]
let types_1: [UTType] =
[.script,
.appleScript,
.osaScript,
.osaScriptBundle,
.javaScript,
.shellScript,
.perlScript,
.pythonScript,
.rubyScript,
.phpScript,
.makefile, //'makefile' is only available in iOS 15.0 or newer
.json,
.propertyList,
.xmlPropertyList,
.binaryPropertyList,
.pdf,
.rtfd,
.flatRTFD,
.webArchive,
.image,
.jpeg,
.tiff,
.gif,
.png,
.icns,
.bmp,
.ico,
.rawImage,
.svg,
.livePhoto,
.heif,
.heic,
.webP,
.threeDContent,
.usd,
.usdz,
.realityFile,
.sceneKitScene,
.arReferenceObject,
.audiovisualContent]
let types_2: [UTType] =
[.movie,
.video,
.audio,
.quickTimeMovie,
UTType("com.apple.quicktime-image"),
.mpeg,
.mpeg2Video,
.mpeg2TransportStream,
.mp3,
.mpeg4Movie,
.mpeg4Audio,
.appleProtectedMPEG4Audio,
.appleProtectedMPEG4Video,
.avi,
.aiff,
.wav,
.midi,
.playlist,
.m3uPlaylist,
.folder,
.volume,
.package,
.bundle,
.pluginBundle,
.spotlightImporter,
.quickLookGenerator,
.xpcService,
.framework,
.application,
.applicationBundle,
.applicationExtension,
.unixExecutable,
.exe,
.systemPreferencesPane,
.archive,
.gzip,
.bz2,
.zip,
.appleArchive,
.spreadsheet,
.presentation,
.database,
.message,
.contact,
.vCard,
.toDoItem,
.calendarEvent,
.emailMessage,
.internetLocation,
.internetShortcut,
.font,
.bookmark,
.pkcs12,
.x509Certificate,
.epub,
.log]
.compactMap({ $0 })
return types + types_1 + types_2
}
Note: I've intentionally split data into 3 arrays to speed up compilation time.
You should not try to compare the extension of the URL returned by UIDocumentPickerViewController to a known list of extensions. Instead, use url.resourceValues(forKeys: [.contentTypeKey]).contentType to get a UTType for the returned URL, and then check that it conforms to .image: type.conforms(to: .image).
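A short sketch of that check (the function name and parameter are my own placeholders):
import Foundation
import UniformTypeIdentifiers

// Sketch: resolve the picked URL's UTType instead of matching file extensions.
func isImage(_ pickedURL: URL) -> Bool {
    guard let type = try? pickedURL.resourceValues(forKeys: [.contentTypeKey]).contentType else {
        return false
    }
    return type.conforms(to: .image)   // true for jpg, jpeg, png, heic, ...
}
This sidesteps the extension list entirely, since the system resolves the content type for you.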

Can't encode an Audio file to Base64?

Objective: Dialog Flow voice bot API
I need to send a WAV file to the Dialog Flow API, and the format and settings are pre-defined.
So I recorded audio using AVAudioRecorder in .wav format with the following settings:
audioFilename = getDocumentsDirectory().appendingPathComponent("input.wav")
let settings: [String: Any] = [
    AVFormatIDKey: Int(kAudioFormatLinearPCM),
    AVSampleRateKey: 16000,
    AVNumberOfChannelsKey: 2,
    AVLinearPCMBitDepthKey: 16,
    AVLinearPCMIsBigEndianKey: false,
    AVEncoderAudioQualityKey: AVAudioQuality.max.rawValue
]
do {
    audioRecorder = try AVAudioRecorder(url: audioFilename!, settings: settings)
    audioRecorder.isMeteringEnabled = true
    audioRecorder.prepareToRecord()
    audioRecorder.delegate = self
    audioRecorder.record()
    recordButton.setTitle("Tap to Stop", for: .normal)
} catch {
    print(error.localizedDescription)
    finishRecording(success: false)
}
Then I tried to convert it into a Base64 string:
let outputFile = try Data.init(contentsOf: fileUrl)
let base64String = outputFile.base64EncodedString(options: NSData.Base64EncodingOptions.init(rawValue: 0))
print(base64String)
So whenever I try to decode that encoded string using an online converter, it displays corrupted bytes.
Thoughts?
So I've found the answer to the question.
The reason my byte array wasn't maintaining correct headers was a key that I had omitted from the settings dictionary:
AVAudioFileTypeKey: kAudioFileWAVEType
let settings: [String: Any] = [
    AVSampleRateKey: 16000,
    AVNumberOfChannelsKey: 1,
    AVAudioFileTypeKey: kAudioFileWAVEType, // MANDATORY
    AVFormatIDKey: kAudioFormatLinearPCM,
    AVLinearPCMIsBigEndianKey: false,
    AVLinearPCMIsNonInterleaved: true,
    AVEncoderAudioQualityKey: AVAudioQuality.high.rawValue
]
The docs say that if you don't provide the settings, i.e.
audioRecorder = try AVAudioRecorder(url: audioFilename!, settings: [:] /*empty settings*/)
then
❌ AVAudio recorder will automatically prepare the file from the Format defined in the file. ❌
But it turns out that didn't help either 😫
So while I was playing with the settings, I found this very important key, AVAudioFileTypeKey, which helped maintain the correct headers and thus produce a valid .wav file 😎
This is what a WAV file with valid headers looks like: the ASCII bytes "RIFF", the overall size, and "WAVE", followed by the fmt chunk and the data chunk.
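As an aside, here is a quick sanity check (my own sketch, assuming the fileUrl from the snippet above) that the recording really starts with a RIFF/WAVE header before Base64-encoding it:
import Foundation

// Sketch: verify the RIFF/WAVE magic bytes, then Base64-encode the file.
if let data = try? Data(contentsOf: fileUrl),
   data.count >= 12,
   String(data: data.prefix(4), encoding: .ascii) == "RIFF",
   String(data: data.subdata(in: 8..<12), encoding: .ascii) == "WAVE" {
    let base64String = data.base64EncodedString()
    print("valid WAV header, \(base64String.count) Base64 characters")
} else {
    print("unexpected header - not a plain WAV file")
}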

Generating video or audio using raw PCM

What is the process for generating a .mov or .m4a file from arrays of Int16 used as stereo audio channels?
I can easily generate raw PCM data as [Int16] from a .mov file, store it in two files, leftChannel.pcm and rightChannel.pcm, and perform some operations for later use. But I am not able to regenerate the video from these files.
Any process will work, i.e. direct video generation from raw PCM, or the intermediate step of generating an .m4a from the PCM.
Update:
I figured out how to convert the PCM arrays to an audio file, but it won't play.
private func convertToM4a(leftChannel leftPath: URL, rightChannel rigthPath: URL, converterCallback: ConverterCallback) {
    let m4aUrl = FileManagerUtil.getTempFileName(parentFolder: FrameExtractor.PCM_ENCODE_FOLDER, fileNameWithExtension: "encodedAudio.m4a")
    if FileManager.default.fileExists(atPath: m4aUrl.path) {
        try! FileManager.default.removeItem(atPath: m4aUrl.path)
    }
    do {
        let leftBuffer = try NSArray(contentsOf: leftPath, error: ()) as! [Int16]
        let rightBuffer = try NSArray(contentsOf: rigthPath, error: ()) as! [Int16]
        let sampleRate = 44100
        let channels = 2
        let frameCapacity = (leftBuffer.count + rightBuffer.count) / 2
        let outputSettings = [
            AVFormatIDKey: NSInteger(kAudioFormatMPEG4AAC),
            AVSampleRateKey: NSInteger(sampleRate),
            AVNumberOfChannelsKey: NSInteger(channels),
            AVAudioFileTypeKey: NSInteger(kAudioFileAAC_ADTSType),
            AVLinearPCMIsBigEndianKey: true,
        ] as [String: Any]
        let audioFile = try AVAudioFile(forWriting: m4aUrl, settings: outputSettings, commonFormat: .pcmFormatInt16, interleaved: false)
        let format = AVAudioFormat(commonFormat: .pcmFormatInt16, sampleRate: Double(sampleRate), channels: AVAudioChannelCount(channels), interleaved: false)!
        let pcmBuffer = AVAudioPCMBuffer(pcmFormat: format, frameCapacity: AVAudioFrameCount(frameCapacity))!
        pcmBuffer.frameLength = pcmBuffer.frameCapacity
        for i in 0..<leftBuffer.count {
            pcmBuffer.int16ChannelData![0][i] = leftBuffer[i]
        }
        for i in 0..<rightBuffer.count {
            pcmBuffer.int16ChannelData![1][i] = rightBuffer[i]
        }
        try! audioFile.write(from: pcmBuffer)
        converterCallback.m4aEncoded(to: m4aUrl)
    } catch {
        print(error.localizedDescription)
    }
}
Saving it as .m4a with AVAudioFileTypeKey set to the m4a file type gave a malformed-file error.
Saving it as .aac with the above settings plays the file, but with broken sound: just a buzzing noise with a slow-motion effect of the original audio. Initially I thought it had something to do with the input and output sampling rates, but that was not the case.
I assume that something is wrong in the output dictionary. Any help would be appreciated.
At least the creation of the AAC file with the code you are showing works.
I wrote out two NSArrays with valid Int16 audio data, and with your code I get a valid result that, when played in QuickTime Player (using the suffix .aac), sounds the same as the input.
How are you creating the input?
A buzzing sound (with lots of noise) happens, for example, if you read in audio data using an AVAudioFormat with the .pcmFormatInt16 format while the data actually read is in .pcmFormatFloat32 format (most commonly the default format). Unfortunately, there is no runtime warning if you do this.
If that's the case, try using .pcmFormatFloat32. If you need Int16, you can convert it yourself by mapping [-1, 1] to [-32768, 32767] for both channels:
let fac = Float(1 << 15)
for i in 0..<count {
    // clamp to the Int16 range before converting
    let val = min(max(inBuffer!.floatChannelData![ch][i] * fac, -fac), fac - 1)
    xxx[i] = Int16(val)
}
...
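For reference, here is a fuller, self-contained sketch of that Float32-to-Int16 conversion (the function name and buffer handling are my own placeholders, not from the answer):
import AVFoundation

// Sketch: convert a non-interleaved Float32 buffer to an Int16 buffer.
func makeInt16Buffer(from floatBuffer: AVAudioPCMBuffer) -> AVAudioPCMBuffer? {
    guard let srcData = floatBuffer.floatChannelData else { return nil }
    let outFormat = AVAudioFormat(commonFormat: .pcmFormatInt16,
                                  sampleRate: floatBuffer.format.sampleRate,
                                  channels: floatBuffer.format.channelCount,
                                  interleaved: false)!
    guard let outBuffer = AVAudioPCMBuffer(pcmFormat: outFormat,
                                           frameCapacity: floatBuffer.frameLength),
          let dstData = outBuffer.int16ChannelData else { return nil }
    outBuffer.frameLength = floatBuffer.frameLength

    let fac = Float(1 << 15)
    for ch in 0..<Int(floatBuffer.format.channelCount) {
        for i in 0..<Int(floatBuffer.frameLength) {
            // Map [-1, 1] to [-32768, 32767], clamping to avoid overflow.
            let val = min(max(srcData[ch][i] * fac, -fac), fac - 1)
            dstData[ch][i] = Int16(val)
        }
    }
    return outBuffer
}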

AVAudioConverter.convertToBuffer throwing error code -50

I need to convert a .wav file recorded with 2 audio channels to a .wav that has only 1 channel, as well as reduce the bit depth from 32 to 16. I've been trying to use AVAudioConverter.convertToBuffer; however, the conversion is throwing an error: Error Domain=NSOSStatusErrorDomain Code=-50 "(null)"
Basically, the only thing that really needs to change is to strip the audio down to a single channel, and the bit depth. I'm getting these files from a different tool, so I can't just change the way the files are recorded.
I'm not that awesome at working with audio, and I'm a bit stumped. The code I'm working on is below - is there anything I'm missing?
let inAudioFileURL: NSURL = <url_to_wav_file>
var inAudioFile: AVAudioFile?
do {
    inAudioFile = try AVAudioFile(forReading: inAudioFileURL)
} catch let error {
    print("error: \(error)")
}

let inAudioFormat: AVAudioFormat = inAudioFile!.processingFormat
let inFrameCount: UInt32 = UInt32(inAudioFile!.length)
let inAudioBuffer: AVAudioPCMBuffer = AVAudioPCMBuffer(PCMFormat: inAudioFormat, frameCapacity: inFrameCount)
do {
    try inAudioFile!.readIntoBuffer(inAudioBuffer)
} catch let error {
    print("readError: \(error)")
}

let startFormat: AVAudioFormat = AVAudioFormat.init(settings: inAudioFile!.processingFormat.settings)
print("startFormat: \(startFormat.settings)")

var endFormatSettings = startFormat.settings
endFormatSettings[AVLinearPCMBitDepthKey] = 16
endFormatSettings[AVNumberOfChannelsKey] = 1
endFormatSettings[AVEncoderAudioQualityKey] = AVAudioQuality.Medium.rawValue
print("endFormatSettings: \(endFormatSettings)")

let endFormat: AVAudioFormat = AVAudioFormat.init(settings: endFormatSettings)
let outBuffer = AVAudioPCMBuffer(PCMFormat: endFormat, frameCapacity: inFrameCount)

let avConverter: AVAudioConverter = AVAudioConverter.init(fromFormat: startFormat, toFormat: endFormat)
do {
    try avConverter.convertToBuffer(outBuffer, fromBuffer: inAudioBuffer)
} catch let error {
    print("avconverterError: \(error)")
}
As for the output:
startFormat:
["AVSampleRateKey": 16000,
"AVLinearPCMBitDepthKey": 32,
"AVLinearPCMIsFloatKey": 1,
"AVNumberOfChannelsKey": 2,
"AVFormatIDKey": 1819304813,
"AVLinearPCMIsNonInterleaved": 0,
"AVLinearPCMIsBigEndianKey": 0]
endFormatSettings:
["AVSampleRateKey": 16000,
"AVLinearPCMBitDepthKey": 16,
"AVLinearPCMIsFloatKey": 1,
"AVNumberOfChannelsKey": 1,
"AVFormatIDKey": 1819304813,
"AVLinearPCMIsNonInterleaved": 0,
"AVLinearPCMIsBigEndianKey": 0,
"AVEncoderQualityKey": 64]
avconverterError: Error Domain=NSOSStatusErrorDomain Code=-50 "(null)"
I'm not 100% sure why this is the case, but I found a solution that got it working for me, so here's how I understand the problem. I found the solution by trying the alternate convert(to:error:withInputFrom:) method, which gave me a different error:
`ERROR: AVAudioConverter.mm:526: FillComplexProc: required condition is false: [impl->_inputBufferReceived.format isEqual: impl->_inputFormat]`
The problem was caused in the line where I setup the AVAudioConverter:
let avConverter:AVAudioConverter = AVAudioConverter.init(fromFormat: startFormat, toFormat: endFormat)
It appears that the audio converter wants to use the same AVAudioFormat that the input buffer is using, instead of using a copy based on the original's settings. Once I swapped startFormat out for inAudioFormat, the convert(to:error:withInputFrom:) error was dismissed, and things worked as expected. I was then able to go back to using the simpler convert(to:fromBuffer:) method, and the original error I was dealing with also went away.
To recap, the line setting up the converter now looks like:
let avConverter:AVAudioConverter = AVAudioConverter.init(fromFormat: inAudioFormat, toFormat: endFormat)
As for the lack of docs on how to use AVAudioConverter, I have no idea why the API reference has next to nothing. Instead, in Xcode, Cmd-click on AVAudioConverter in your code to go to its header file. There are plenty of comments and info there. Not full sample code or anything, but it's at least something.
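For anyone on current Swift, here is a hedged sketch of the same downmix that uses the file's processingFormat as the converter's source format, as described above (the function name and parameters are mine; since the sample rate is unchanged, the simpler convert(to:from:) is enough):
import AVFoundation

// Sketch: convert a 32-bit float stereo WAV into a 16-bit mono buffer.
// `settings` is assumed to describe the desired 16-bit, 1-channel output.
func convertToMono16(input: URL, outputSettings settings: [String: Any]) throws -> AVAudioPCMBuffer {
    let file = try AVAudioFile(forReading: input)
    let inFormat = file.processingFormat                  // the format the buffer will actually be in
    let inBuffer = AVAudioPCMBuffer(pcmFormat: inFormat,
                                    frameCapacity: AVAudioFrameCount(file.length))!
    try file.read(into: inBuffer)

    let outFormat = AVAudioFormat(settings: settings)!
    let outBuffer = AVAudioPCMBuffer(pcmFormat: outFormat,
                                     frameCapacity: inBuffer.frameCapacity)!

    // The converter's source format must match the buffer passed in.
    let converter = AVAudioConverter(from: inFormat, to: outFormat)!
    try converter.convert(to: outBuffer, from: inBuffer)  // no sample rate change, so this is enough
    return outBuffer
}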

How to record audio in wav format in Swift?

I have searched everywhere for this and I couldn't find a proper way of doing it. I have succeeded in recording in .wav format, but the problem is that when I try reading the raw data from the recorded .wav file, some chunks are in the wrong place or aren't there at all.
My code for recording audio:
func startRecording() {
    let audioSession = AVAudioSession.sharedInstance()
    try! audioSession.setCategory(AVAudioSessionCategoryPlayAndRecord)
    try! audioSession.setActive(true)
    audioSession.requestRecordPermission({ (allowed: Bool) -> Void in print("Accepted") })

    let settings: [String: AnyObject] = [
        AVFormatIDKey: Int(kAudioFormatLinearPCM),
        AVSampleRateKey: 44100.0,
        AVNumberOfChannelsKey: 1,
        AVLinearPCMBitDepthKey: 8,
        AVLinearPCMIsFloatKey: false,
        AVLinearPCMIsBigEndianKey: false,
        AVEncoderAudioQualityKey: AVAudioQuality.Max.rawValue
    ]

    let date = NSDate()
    let df = NSDateFormatter()
    df.dateFormat = "yyyy-MM-dd-HH:mm:ss"
    let dfString = df.stringFromDate(date)
    let fullPath = documentsPath.stringByAppendingString("/\(dfString).wav")

    recorder = try! AVAudioRecorder(URL: NSURL(string: fullPath)!, settings: settings)
    recorder.delegate = self
    recorder.prepareToRecord()
    recorder.record()
}
When I print out the data of the recorded audio file, I get a weird number where 'd' 'a' 't' 'a' should be written, followed by zeros. And then, in the middle of the data, it appears.
No 64617461 ('d' 'a' 't' 'a') chunk where expected - 464c4c52 ('F' 'L' 'L' 'R') is in its place
64617461 ('d' 'a' 't' 'a') at a random spot after a lot of zeros
Is there a better way of recording a WAV file? I am not sure why this is happening, so any help would be appreciated. Should I maybe record in another format and then convert it to raw?
Thanks and sorry for so many images.
I think only the fmt chunk is guaranteed to come first. It looks like it's fine to have other chunks before the data chunk, so just skip over non-data chunks.
From http://soundfile.sapp.org/doc/WaveFormat/
A RIFF file starts out with a file header followed by a sequence of data chunks.
You need to update your parser :)
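For illustration, here is a minimal chunk-walking sketch (my own, not from the answer) that skips everything until the data chunk, including the FLLR filler chunk shown above:
import Foundation

// Sketch: walk RIFF chunks in a WAV file and return the payload of the
// "data" chunk, skipping any other chunk (e.g. "FLLR" padding).
func dataChunk(in wav: Data) -> Data? {
    var offset = 12                                        // skip "RIFF" + size + "WAVE"
    while offset + 8 <= wav.count {
        let id = String(data: wav.subdata(in: offset..<offset + 4), encoding: .ascii) ?? ""
        let sizeBytes = [UInt8](wav.subdata(in: offset + 4..<offset + 8))
        let size = Int(sizeBytes[0]) | Int(sizeBytes[1]) << 8 | Int(sizeBytes[2]) << 16 | Int(sizeBytes[3]) << 24
        let body = offset + 8
        if id == "data" {
            return wav.subdata(in: body..<min(body + size, wav.count))
        }
        offset = body + size + (size % 2)                  // chunk bodies are word-aligned
    }
    return nil
}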
