Generating video or audio using raw PCM - ios

What is the process for generating a .mov or .m4a file from arrays of Int16 used as the stereo channels of the audio?
I can easily generate raw PCM data as [Int16] from a .mov file and store it in two files leftChannel.pcm and rightChannel.pcm and perform some operations for later use. But I am not able to regenerate the video from these files.
Any approach will work, whether that is direct video generation from raw PCM or an intermediate step that generates an m4a file from the PCM data.
Update:
I figured out how to convert the PCM arrays to an audio file, but it won't play.
private func convertToM4a(leftChannel leftPath : URL, rightChannel rigthPath : URL, converterCallback : ConverterCallback){
let m4aUrl = FileManagerUtil.getTempFileName(parentFolder: FrameExtractor.PCM_ENCODE_FOLDER, fileNameWithExtension: "encodedAudio.m4a")
if FileManager.default.fileExists(atPath: m4aUrl.path) {
try! FileManager.default.removeItem(atPath: m4aUrl.path)
}
do{
let leftBuffer = try NSArray(contentsOf: leftPath, error: ()) as! [Int16]
let rightBuffer = try NSArray(contentsOf: rigthPath, error: ()) as! [Int16]
let sampleRate = 44100
let channels = 2
let frameCapacity = (leftBuffer.count + rightBuffer.count)/2
let outputSettings = [
AVFormatIDKey : NSInteger(kAudioFormatMPEG4AAC),
AVSampleRateKey : NSInteger(sampleRate),
AVNumberOfChannelsKey : NSInteger(channels),
AVAudioFileTypeKey : NSInteger(kAudioFileAAC_ADTSType),
AVLinearPCMIsBigEndianKey : true,
] as [String : Any]
let audioFile = try AVAudioFile(forWriting: m4aUrl, settings: outputSettings, commonFormat: .pcmFormatInt16, interleaved: false)
let format = AVAudioFormat(commonFormat: .pcmFormatInt16, sampleRate: Double(sampleRate), channels: AVAudioChannelCount(channels), interleaved: false)!
let pcmBuffer = AVAudioPCMBuffer(pcmFormat: format, frameCapacity: AVAudioFrameCount(frameCapacity))!
pcmBuffer.frameLength = pcmBuffer.frameCapacity
for i in 0..<leftBuffer.count {
pcmBuffer.int16ChannelData![0][i] = leftBuffer[i]
}
for i in 0..<rightBuffer.count {
pcmBuffer.int16ChannelData![1][i] = rightBuffer[i]
}
try! audioFile.write(from: pcmBuffer)
converterCallback.m4aEncoded(to: m4aUrl)
} catch {
print(error.localizedDescription)
}
}
Saving it as .m4a, with AVAudioFileTypeKey set to the m4a type, gave a malformed file error.
Saving it as .aac with the above settings produces a file that plays, but the sound is broken: just a buzzing sound with a slowed-down version of the original audio. Initially I thought it had something to do with a mismatch between the input and output sampling rates, but that was not the case.
I assume that something is wrong in the output settings dictionary. Any help would be appreciated.

At least the creation of the AAC file works with the code you are showing.
I wrote out two NSArrays containing valid Int16 audio data, and with your code I get a valid result: played in QuickTime Player (with the suffix .aac), it sounds the same as the input.
How are you creating the input?
A buzzing sound (with lots of noise) occurs, for example, if you read audio data using an AVAudioFormat with .pcmFormatInt16 while the data actually read is in .pcmFormatFloat32 (the most common default format). Unfortunately, there is no runtime warning if you do this.
If that's the case, try using .pcmFormatFloat32. If you need the data as Int16, you can convert it yourself by mapping [-1, 1] to [-32768, 32767] for both channels:
let fac = Float(1 << 15)
for i in 0..<count {
let val = min(max(inBuffer!.floatChannelData![ch][i] * fac, -fac), fac - 1)
xxx[i] = Int16(val)
}
...
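The mapping described above can be wrapped in a small helper. This is a minimal sketch rather than the answerer's exact code: the function name floatToInt16 is made up for illustration, it rounds to the nearest integer instead of truncating, and it assumes the samples are finite (a NaN would trap in the Int16 conversion).

```swift
import Foundation

/// Hypothetical helper: converts Float samples in [-1, 1] to Int16.
/// Out-of-range values are clamped to the Int16 range.
func floatToInt16(_ samples: [Float]) -> [Int16] {
    let scale = Float(1 << 15)  // 32768
    return samples.map { sample in
        // Scale, round to the nearest integer, then clamp to [-32768, 32767].
        let scaled = (sample * scale).rounded()
        let clamped = min(max(scaled, -scale), scale - 1)
        return Int16(clamped)
    }
}
```

Applied once per channel, this gives one [Int16] array for the left channel and one for the right.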

Related

How to get All Extensions for UTType Image, Audio, and Video

Is there a way to get All of the different UTType extension types as Strings? I need them specifically for images, audio, and video.
I followed this answer, but it doesn't give me all of the extensions
var types = [String]()
let utiTypes = [kUTTypeImage, kUTTypeMovie, kUTTypeVideo, kUTTypeMP3, kUTTypeAudio, kUTTypeQuickTimeMovie, kUTTypeMPEG, kUTTypeMPEG2Video, kUTTypeMPEG2TransportStream, kUTTypeMPEG4, kUTTypeMPEG4Audio, kUTTypeAppleProtectedMPEG4Audio, kUTTypeAppleProtectedMPEG4Video, kUTTypeAVIMovie, kUTTypeAudioInterchangeFileFormat, kUTTypeWaveformAudio, kUTTypeMIDIAudio, kUTTypeLivePhoto, kUTTypeTIFF, kUTTypeGIF, kUTTypeQuickTimeImage, kUTTypeAppleICNS]
for type in utiTypes {
let str = String(type)
guard let utiStr = fileExtension(for: str) else { continue }
types.append(utiStr)
}
dump(types)
The results are
15 elements // there are really 21 types
- "jpeg"
- "png"
- "mov"
- "mpg"
- "m2v"
- "ts"
- "mp3"
- "mp4"
- "mp4"
- "avi"
- "aiff"
- "wav"
- "midi"
- "tiff"
- "gif"
The issue here is it doesn't return values like qt or jpg. For example I use the UIDocumentPickerViewController and when I select an image the returned url pathExtension is jpg not jpeg. If I wanted to know if the returned url was an image, and I compared its pathExtension to the types array above, it would say that it doesn't appear in the list.
You can do:
import UniformTypeIdentifiers
let utiTypes = [UTType.image, .movie, .video, .mp3, .audio, .quickTimeMovie, .mpeg, .mpeg2Video, .mpeg2TransportStream, .mpeg4Movie, .mpeg4Audio, .appleProtectedMPEG4Audio, .appleProtectedMPEG4Video, .avi, .aiff, .wav, .midi, .livePhoto, .tiff, .gif, UTType("com.apple.quicktime-image"), .icns]
print(utiTypes.flatMap { $0?.tags[.filenameExtension] ?? [] })
There are 33 file extensions in total for the UTTypes that you have listed when I run this code in a playground. Note that some of the UTTypes you have listed have no file name extensions associated with them, probably because they are too "generic" (e.g. "image" and "video"). Some UTTypes have multiple file name extensions, and some share file name extensions with other UTTypes.
There is no "jpg" or "png" in the output. To see them appear, you will have to use this list:
// I've also removed the types that have no file name extensions
let utiTypes = [
UTType.jpeg,
.png,
.mp3,
.quickTimeMovie,
.mpeg,
.mpeg2Video,
.mpeg2TransportStream,
.mpeg4Movie,
.mpeg4Audio,
.appleProtectedMPEG4Audio,
.avi,
.aiff,
.wav,
.midi,
.tiff,
.gif,
UTType("com.apple.quicktime-image"),
.icns
]
Using the above list, the output for me is:
jpeg
jpg
jpe
png
mp3
mpga
mov
qt
mpg
mpeg
mpe
m75
m15
m2v
ts
mp4
mpg4
mp4
mpg4
m4p
avi
vfw
aiff
aif
wav
wave
bwf
midi
mid
smf
kar
tiff
tif
gif
qtif
qti
icns
Also note that if you want to get the UTType from a file name extension, you can just do:
let type = UTType(tag: "jpg", tagClass: .filenameExtension, conformingTo: nil)
and check whether the file name extension is e.g. that of an image by doing:
type?.isSubtype(of: .image)
Though bear in mind that the file does not necessarily represent an image just because its name says it is :)
For those who are looking for all possible types - here is the full list as of iOS 15:
func allUTITypes() -> [UTType] {
let types : [UTType] =
[.item,
.content,
.compositeContent,
.diskImage,
.data,
.directory,
.resolvable,
.symbolicLink,
.executable,
.mountPoint,
.aliasFile,
.urlBookmarkData,
.url,
.fileURL,
.text,
.plainText,
.utf8PlainText,
.utf16ExternalPlainText,
.utf16PlainText,
.delimitedText,
.commaSeparatedText,
.tabSeparatedText,
.utf8TabSeparatedText,
.rtf,
.html,
.xml,
.yaml,
.sourceCode,
.assemblyLanguageSource,
.cSource,
.objectiveCSource,
.swiftSource,
.cPlusPlusSource,
.objectiveCPlusPlusSource,
.cHeader,
.cPlusPlusHeader]
let types_1: [UTType] =
[.script,
.appleScript,
.osaScript,
.osaScriptBundle,
.javaScript,
.shellScript,
.perlScript,
.pythonScript,
.rubyScript,
.phpScript,
.makefile, //'makefile' is only available in iOS 15.0 or newer
.json,
.propertyList,
.xmlPropertyList,
.binaryPropertyList,
.pdf,
.rtfd,
.flatRTFD,
.webArchive,
.image,
.jpeg,
.tiff,
.gif,
.png,
.icns,
.bmp,
.ico,
.rawImage,
.svg,
.livePhoto,
.heif,
.heic,
.webP,
.threeDContent,
.usd,
.usdz,
.realityFile,
.sceneKitScene,
.arReferenceObject,
.audiovisualContent]
let types_2: [UTType] =
[.movie,
.video,
.audio,
.quickTimeMovie,
UTType("com.apple.quicktime-image"),
.mpeg,
.mpeg2Video,
.mpeg2TransportStream,
.mp3,
.mpeg4Movie,
.mpeg4Audio,
.appleProtectedMPEG4Audio,
.appleProtectedMPEG4Video,
.avi,
.aiff,
.wav,
.midi,
.playlist,
.m3uPlaylist,
.folder,
.volume,
.package,
.bundle,
.pluginBundle,
.spotlightImporter,
.quickLookGenerator,
.xpcService,
.framework,
.application,
.applicationBundle,
.applicationExtension,
.unixExecutable,
.exe,
.systemPreferencesPane,
.archive,
.gzip,
.bz2,
.zip,
.appleArchive,
.spreadsheet,
.presentation,
.database,
.message,
.contact,
.vCard,
.toDoItem,
.calendarEvent,
.emailMessage,
.internetLocation,
.internetShortcut,
.font,
.bookmark,
.pkcs12,
.x509Certificate,
.epub,
.log]
.compactMap({ $0 })
return types + types_1 + types_2
}
Note: I've intentionally split data into 3 arrays to speed up compilation time.
You should not try to compare the extension of the URL returned by UIDocumentPickerViewController to a known list of extensions. Instead, use url.resourceValues(forKeys: [.contentTypeKey]).contentType to get a UTType for the returned URL, and then check that it conforms to .image: type.conforms(to: .image).
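A minimal sketch of that check, assuming iOS 14 / macOS 11 or later (where URLResourceValues.contentType is available). The function name isImage(at:) is made up for illustration, and the URL is assumed to point at an existing file:

```swift
import Foundation
import UniformTypeIdentifiers

/// Hypothetical helper: returns true if the file at `url` has a content
/// type that conforms to public.image. Returns false if the type cannot
/// be determined (for example, if the file does not exist).
@available(iOS 14.0, macOS 11.0, *)
func isImage(at url: URL) -> Bool {
    guard let type = try? url.resourceValues(forKeys: [.contentTypeKey]).contentType else {
        return false
    }
    return type.conforms(to: .image)
}
```

With a URL returned by UIDocumentPickerViewController you may need to call startAccessingSecurityScopedResource() before reading resource values.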

Playing a stereo audio buffer from memory with AVAudioEngine

I am trying to play a stereo audio buffer from memory (not from a file) in my iOS app but my application crashes when I attempt to attach the AVAudioPlayerNode 'playerNode' to the AVAudioEngine 'audioEngine'. The error code that I get is as follows:
Thread 1: Exception: "required condition is false: _outputFormat.channelCount == buffer.format.channelCount"
I don't know if this is due to the way I have declared the AVAudioEngine and the AVAudioPlayerNode, to something wrong with the buffer I am generating, or to attaching the nodes incorrectly (or something else!). I have a feeling that it has something to do with how I am creating the new buffer. I am trying to make a stereo buffer from two separate 'mono' arrays, and perhaps its format is not correct.
I have declared audioEngine: AVAudioEngine! and playerNode: AVAudioPlayerNode! globally:
var audioEngine: AVAudioEngine!
var playerNode: AVAudioPlayerNode!
I then load a mono source audio file that my app is going to process (the data out of this file will not be played, it will be loaded into an array, processed and then loaded into a new buffer):
// Read audio file
let audioFileFormat = audioFile.processingFormat
let frameCount = UInt32(audioFile.length)
let audioBuffer = AVAudioPCMBuffer(pcmFormat: audioFileFormat, frameCapacity: frameCount)!
// Read audio data into buffer
do {
try audioFile.read(into: audioBuffer)
} catch let error {
print(error.localizedDescription)
}
// Convert buffer to array of floats
let input: [Float] = Array(UnsafeBufferPointer(start: audioBuffer.floatChannelData![0], count: Int(audioBuffer.frameLength)))
The array is then sent to a convolution function twice that returns a new array each time. This is because the mono source file needs to become a stereo audio buffer:
maxSignalLength = input.count + 256
let leftAudioArray: [Float] = convolve(inputAudio: input, impulse: normalisedLeftImpulse)
let rightAudioArray: [Float] = convolve(inputAudio: input, impulse: normalisedRightImpulse)
The maxSignalLength variable is currently the length of the input signal + the length of the impulse response (normalisedImpulseResponse) that is being convolved with, which at the moment is 256. This will become an appropriate variable at some point.
I then declare and load the new buffer and its format, I have a feeling that the mistake is somewhere around here as this will be the buffer that is played:
let bufferFormat = AVAudioFormat(commonFormat: .pcmFormatFloat32, sampleRate: hrtfSampleRate, channels: 2, interleaved: false)!
let outputBuffer = AVAudioPCMBuffer(pcmFormat: bufferFormat, frameCapacity: AVAudioFrameCount(maxSignalLength))!
Notice that I am not creating an interleaved buffer, I load the stereo audio data to the buffer as follows (which I think may also be wrong):
for ch in 0 ..< 2 {
for i in 0 ..< maxSignalLength {
var val: Float!
if ch == 0 { // Left
val = leftAudioArray[i]
// Limit
if val > 1 {
val = 1
}
if val < -1 {
val = -1
}
} else if ch == 1 { // Right
val = rightAudioArray[i]
// Limit
if val > 1 {
val = 1
}
if val < -1 {
val = -1
}
}
outputBuffer.floatChannelData![ch][i] = val
}
}
The audio is also limited to values between -1 and 1.
Then I finally come to (attempting to) load the buffer to the audio node, attach the audio node to the audio engine, start the audio engine and then play the node.
let frameCapacity = AVAudioFramePosition(outputBuffer.frameCapacity)
let frameLength = outputBuffer.frameLength
playerNode.scheduleBuffer(outputBuffer, at: nil, options: AVAudioPlayerNodeBufferOptions.interrupts, completionHandler: nil)
playerNode.prepare(withFrameCount: frameLength)
let time = AVAudioTime(sampleTime: frameCapacity, atRate: hrtfSampleRate)
audioEngine.attach(playerNode)
audioEngine.connect(playerNode, to: audioEngine.mainMixerNode, format: outputBuffer.format)
audioEngine.prepare()
do {
try audioEngine.start()
} catch let error {
print(error.localizedDescription)
}
playerNode.play(at: time)
The error that I get in runtime is:
AVAEInternal.h:76 required condition is false: [AVAudioPlayerNode.mm:712:ScheduleBuffer: (_outputFormat.channelCount == buffer.format.channelCount)]
It doesn't show the line that this error occurs on. I have been stuck on this for a while now, and have tried lots of different things, but there doesn't seem to be very much clear information about playing audio from memory and not from files with AVAudioEngine from what I could find. Any help would be greatly appreciated.
Thanks!
Edit #1:
Better title
Edit# 2:
UPDATE - I have found out why I was getting the error. It seemed to be caused by setting up the playerNode before attaching it to the audioEngine. Swapping the order stopped the program from crashing and throwing the error:
let frameCapacity = AVAudioFramePosition(outputBuffer.frameCapacity)
let frameLength = outputBuffer.frameLength
audioEngine.attach(playerNode)
audioEngine.connect(playerNode, to: audioEngine.mainMixerNode, format: outputBuffer.format)
audioEngine.prepare()
playerNode.scheduleBuffer(outputBuffer, at: nil, options: AVAudioPlayerNodeBufferOptions.interrupts, completionHandler: nil)
playerNode.prepare(withFrameCount: frameLength)
let time = AVAudioTime(sampleTime: frameCapacity, atRate: hrtfSampleRate)
do {
try audioEngine.start()
} catch let error {
print(error.localizedDescription)
}
playerNode.play(at: time)
However, I don't have any sound. After creating an array of floats of the outputBuffer with the same method as used for the input signal, and taking a look at its contents with a break point it seems to be empty, so I must also be incorrectly storing the data to the outputBuffer.
You might be creating and filling your buffer incorrectly. Try doing it thus:
let fileURL = Bundle.main.url(forResource: "my_file", withExtension: "aiff")!
let file = try! AVAudioFile(forReading: fileURL)
let buffer = AVAudioPCMBuffer(pcmFormat: file.processingFormat, frameCapacity: UInt32(file.length))!
try! file.read(into: buffer)
I have fixed the issue!
I tried a lot of solutions and have ended up completely re-writing the audio engine section of my app and I now have the AVAudioEngine and AVAudioPlayerNode declared within the ViewController class as the following:
class ViewController: UIViewController {
var audioEngine: AVAudioEngine = AVAudioEngine()
var playerNode: AVAudioPlayerNode = AVAudioPlayerNode()
...
I am still unclear whether it is better to declare these globally or as class variables in iOS; however, I can confirm that my application plays audio with these declared within the ViewController class. I do know that they shouldn't be declared as local variables in a function, as they will be deallocated and stop playing when the function returns.
However, I still was not getting any audio output until I set the AVAudioPCMBuffer.frameLength to frameCapacity.
I could find very little information online regarding creating a new AVAudioPCMBuffer from an array of floats, but this seems to be the missing step that I needed to do to make my outputBuffer playable. Before I set this, it was at 0 by default.
frameLength is not a parameter of the AVAudioPCMBuffer initializer, but it is important: my buffer wasn't playable until I set it manually after creating the buffer:
let bufferFormat = AVAudioFormat(commonFormat: .pcmFormatFloat32, sampleRate: hrtfSampleRate, channels: 2, interleaved: false)!
let frameCapacity = UInt32(audioFile.length)
guard let outputBuffer = AVAudioPCMBuffer(pcmFormat: bufferFormat, frameCapacity: frameCapacity) else {
fatalError("Could not create output buffer.")
}
outputBuffer.frameLength = frameCapacity // Important!
This took a long time to find out, hopefully this will help someone else in the future.
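Putting the pieces together, here is a minimal sketch of building a playable, non-interleaved stereo buffer from two Float arrays. The function name makeStereoBuffer and its parameters are made up for illustration; it uses the shorter of the two arrays as the frame count and clamps samples to [-1, 1].

```swift
import AVFoundation

/// Hypothetical helper: builds a non-interleaved stereo Float32 buffer
/// from separate left and right sample arrays.
func makeStereoBuffer(left: [Float], right: [Float], sampleRate: Double) -> AVAudioPCMBuffer? {
    let frames = min(left.count, right.count)
    guard frames > 0,
          let format = AVAudioFormat(commonFormat: .pcmFormatFloat32,
                                     sampleRate: sampleRate,
                                     channels: 2,
                                     interleaved: false),
          let buffer = AVAudioPCMBuffer(pcmFormat: format,
                                        frameCapacity: AVAudioFrameCount(frames)),
          let channels = buffer.floatChannelData else {
        return nil
    }
    for i in 0..<frames {
        channels[0][i] = min(max(left[i], -1), 1)   // left channel, limited
        channels[1][i] = min(max(right[i], -1), 1)  // right channel, limited
    }
    // frameLength is 0 after init; without this the buffer contains no audio.
    buffer.frameLength = AVAudioFrameCount(frames)
    return buffer
}
```

The returned buffer can then be scheduled on a player node that has already been attached and connected with buffer.format.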

AVAudioRecorder generates strange Wav file(wrong header)

How can I get only the PCM data from an AVAudioRecorder file?
These are the settings I use to record the file:
let settings : [String : Any] = [
AVFormatIDKey: Int(kAudioFormatLinearPCM),
AVSampleRateKey: Int(stethoscopeSampleRateDefault),
AVNumberOfChannelsKey: 1,
AVEncoderAudioQualityKey: AVAudioQuality.medium.rawValue,
]
The outcome of this is a strange wav file with a strange header.
How can I extract only the PCM data out of it?
The actual sound data in a wav file is in the "data" subchunk of that file - this format description might help you visualize the structure you'll have to navigate. But maybe what's tripping you up is that Apple includes an extra subchunk called "FLLR" which precedes the sound data, so you have to seek past that too. Fortunately every subchunk is given an id and size, so finding the data subchunk is still relatively straightforward.
1. Open the file using FileHandle.
2. Seek to byte 12, which gets you past the header and puts you at the beginning of the first subchunk (should be "fmt ").
3. Read 4 bytes and convert them to a string, then read 4 more bytes and convert them to a (little-endian) integer. The string is the subchunk name, and the integer is the size of that subchunk. If the string is not "data", seek forward "size" bytes and repeat step 3.
4. Read the next "size" bytes, which normally run to the end of the file - this is your PCM data.
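The steps above can be sketched as follows. This is an illustration under a few assumptions, not the answerer's code: the whole file is already loaded into a Data value (instead of seeking with FileHandle), chunk sizes are little-endian, and odd-sized chunks are followed by one pad byte as the RIFF format specifies. The function name pcmData(fromWAV:) is made up.

```swift
import Foundation

/// Hypothetical helper: returns the payload of the "data" subchunk of a
/// RIFF/WAVE file, or nil if the header is invalid or no such chunk exists.
func pcmData(fromWAV wav: Data) -> Data? {
    let bytes = [UInt8](wav)
    guard bytes.count >= 12,
          String(decoding: bytes[0..<4], as: UTF8.self) == "RIFF",
          String(decoding: bytes[8..<12], as: UTF8.self) == "WAVE" else {
        return nil
    }
    var offset = 12  // first subchunk starts right after the 12-byte header
    while offset + 8 <= bytes.count {
        let id = String(decoding: bytes[offset..<offset + 4], as: UTF8.self)
        // Subchunk size: 4 bytes, little-endian.
        var size = 0
        for i in 0..<4 {
            size |= Int(bytes[offset + 4 + i]) << (8 * i)
        }
        let start = offset + 8
        if id == "data" {
            // Tolerate a declared size that runs past the end of the file.
            let end = min(start + size, bytes.count)
            return Data(bytes[start..<end])
        }
        // Skip any other subchunk ("fmt ", "FLLR", "LIST", ...), plus the pad byte.
        offset = start + size + (size & 1)
    }
    return nil
}
```

For large recordings, the same loop can be driven by FileHandle seeks so that only the chunk headers and the data payload are read.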
With Jamie's guidance I managed to solve this. Here is my code:
func extractSubchunks(data:Data) -> RiffFile?{
var data = data
var chunks = [SubChunk]()
let position = data.subdata(in: 8..<12)
let filelength = Int(data.subdata(in: 4..<8).uint32)
let wave = String(bytes: position, encoding: .utf8) ?? "NoName"
guard wave == "WAVE" else {
print("File is \(wave) not WAVE")
return nil
}
data.removeSubrange(0..<12)
print("Found chunks")
while data.count != 0{
let position = data.subdata(in: 0..<4)
let length = Int(data.subdata(in: 4..<8).uint32)
guard let current = String(bytes: position, encoding: .utf8) else{
return nil
}
data.removeSubrange(0..<8)
let chunkData = data.subdata(in: 0..<length)
data.removeSubrange(0..<length)
let subchunk = SubChunk(name: current, size: length, data: chunkData)
chunks.append(subchunk)
print(subchunk.debugDescription)
}
let riff = RiffFile(size: filelength, subChunks: chunks)
return riff
}
Here's the definition for RiffFile and SubChunk structs:
struct RiffFile {
var size : Int
var subChunks : [SubChunk]
}
struct SubChunk {
var debugDescription: String {
return "name : \(name) size : \(size) dataAssignedsize : \(data.count)"
}
var name : String
var size : Int
var data : Data
}
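The code above reads a uint32 property on Data that is not part of Foundation and is not shown in the answer. One plausible definition, assuming the bytes are little-endian (as RIFF sizes are), might look like this:

```swift
import Foundation

extension Data {
    /// Assumed helper (not from the original answer): interprets the first
    /// four bytes as a little-endian UInt32. Returns 0 if fewer than four
    /// bytes are available.
    var uint32: UInt32 {
        guard count >= 4 else { return 0 }
        // Walk the bytes from most significant (last) to least significant (first).
        return prefix(4).reversed().reduce(UInt32(0)) { ($0 << 8) | UInt32($1) }
    }
}
```

Building the value byte by byte avoids unaligned loads and works for Data slices whose start index is not zero.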

How to record audio in wav format in Swift?

I have searched everywhere for this and I couldn't find a proper way of doing it. I have succeeded in recording in .wav format, but the problem is that when I try reading raw data from the recorded .wav file, some chunks are in the wrong place or aren't there at all.
My code for recording audio:
func startRecording(){
let audioSession = AVAudioSession.sharedInstance()
try! audioSession.setCategory(AVAudioSessionCategoryPlayAndRecord)
try! audioSession.setActive(true)
audioSession.requestRecordPermission({(allowed: Bool) -> Void in print("Accepted")} )
let settings: [String : AnyObject] = [
AVFormatIDKey:Int(kAudioFormatLinearPCM),
AVSampleRateKey:44100.0,
AVNumberOfChannelsKey:1,
AVLinearPCMBitDepthKey:8,
AVLinearPCMIsFloatKey:false,
AVLinearPCMIsBigEndianKey:false,
AVEncoderAudioQualityKey:AVAudioQuality.Max.rawValue
]
let date = NSDate()
let df = NSDateFormatter()
df.dateFormat = "yyyy-MM-dd-HH:mm:ss"
let dfString = df.stringFromDate(date)
let fullPath = documentsPath.stringByAppendingString("/\(dfString).wav")
recorder = try! AVAudioRecorder(URL: NSURL(string: fullPath)!, settings: settings)
recorder.delegate = self
recorder.prepareToRecord()
recorder.record()
}
When I print out the data of the recorded audio file, I get a weird number where 'd' 'a' 't' 'a' should be written, followed by zeros. Then, in the middle of the data, it appears.
No 64617461 ('d' 'a' 't' 'a') chunk - it should be in place of 464c4c52
64617461 ('d' 'a' 't' 'a') at random spot after a lot of zeros
Is there a better way of recording a wav file? I am not sure why this is happening, so any help would be appreciated. Should I maybe record in another format and then convert it to raw?
Thanks and sorry for so many images.
I think only the fmt chunk is guaranteed to come first. It looks like it's fine to have other chunks before the data chunk, so just skip over non-data chunks.
From http://soundfile.sapp.org/doc/WaveFormat/
A RIFF file starts out with a file header followed by a sequence of data chunks.
You need to update your parser :)

xCode recording in AAC format not working

Here is my code for recording audio in my iOS8 Swift app:
var fileName = "/SFRecording-" + String(recordingSequence) + ".caf"
var str = storageLocation + fileName
var url = NSURL.fileURLWithPath(str as String)
audioSession.setCategory(AVAudioSessionCategoryRecord, error: nil)
audioSession.setActive(true, error: nil)
var recordSettings = [
AVFormatIDKey:kAudioFormatAppleIMA4,
AVSampleRateKey:44100.0,
AVNumberOfChannelsKey:2,
AVEncoderBitRateKey:12800,
AVLinearPCMBitDepthKey:16,
AVEncoderAudioQualityKey:AVAudioQuality.Max.rawValue
]
var error: NSError?
realRecorder = AVAudioRecorder(URL:url, settings: recordSettings as [NSObject : AnyObject], error: &error)
It works fine, but the resultant CAF file is useless on Windows systems. I wanted to record in a more familiar format like MP3, but it turns out you cannot in iOS due to licensing issues.
Now I want to record in AAC format. I switched the file extension from .CAF to .AAC in the above code and changed the AVFormatIDKey value from kAudioFormatAppleIMA4 to kAudioFormatMPEG4AAC, but with those settings nothing is recorded. Am I supposed to change some other setting too to make AAC recording work?
Remember, my objective is to record in a format which is readily playable on Mac/Windows/browsers.
I figured it out myself. I had to remove the lines below for the .AAC format to work:
AVSampleRateKey:44100.0,
AVNumberOfChannelsKey:2,
AVEncoderBitRateKey:12800,
AVLinearPCMBitDepthKey:16,
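In current Swift syntax, the reduced settings dictionary might look like the sketch below. The function name aacRecordingSettings is made up for illustration, and the result only mirrors what the answer describes (the format ID and quality keys kept, the other four removed); it is not a tested recommendation for every device.

```swift
import AVFoundation

/// Hypothetical helper: the recording settings left after removing the
/// sample rate, channel count, bit rate and bit depth keys.
func aacRecordingSettings() -> [String: Any] {
    return [
        AVFormatIDKey: Int(kAudioFormatMPEG4AAC),
        AVEncoderAudioQualityKey: AVAudioQuality.max.rawValue
    ]
}

// Usage (needs microphone permission and an active recording session;
// fileURL is a placeholder for your own output URL):
// let recorder = try AVAudioRecorder(url: fileURL, settings: aacRecordingSettings())
```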

Resources