First audio CMSampleBuffer lost when reading mp4 file using AVAssetReader

I'm using AVAssetWriter to write audio CMSampleBuffer to an mp4 file, but when I later read that file using AVAssetReader, it seems to be missing the initial chunk of data.
Here's the debug description of the first CMSampleBuffer passed to the writer input's append method (notice the priming duration attachment of 1024/44_100):
CMSampleBuffer 0x102ea5b60 retainCount: 7 allocator: 0x1c061f840
invalid = NO
dataReady = YES
makeDataReadyCallback = 0x0
makeDataReadyRefcon = 0x0
buffer-level attachments:
TrimDurationAtStart = {
epoch = 0;
flags = 1;
timescale = 44100;
value = 1024;
}
formatDescription = <CMAudioFormatDescription 0x281fd9720 [0x1c061f840]> {
mediaType:'soun'
mediaSubType:'aac '
mediaSpecific: {
ASBD: {
mSampleRate: 44100.000000
mFormatID: 'aac '
mFormatFlags: 0x2
mBytesPerPacket: 0
mFramesPerPacket: 1024
mBytesPerFrame: 0
mChannelsPerFrame: 2
mBitsPerChannel: 0 }
cookie: {<CFData 0x2805f50a0 [0x1c061f840]>{length = 39, capacity = 39, bytes = 0x03808080220000000480808014401400 ... 1210068080800102}}
ACL: {(null)}
FormatList Array: {
Index: 0
ChannelLayoutTag: 0x650002
ASBD: {
mSampleRate: 44100.000000
mFormatID: 'aac '
mFormatFlags: 0x0
mBytesPerPacket: 0
mFramesPerPacket: 1024
mBytesPerFrame: 0
mChannelsPerFrame: 2
mBitsPerChannel: 0 }}
}
extensions: {(null)}
}
sbufToTrackReadiness = 0x0
numSamples = 1
outputPTS = {6683542167/44100 = 151554.244, rounded}(based on cachedOutputPresentationTimeStamp)
sampleTimingArray[1] = {
{PTS = {6683541143/44100 = 151554.221, rounded}, DTS = {6683541143/44100 = 151554.221, rounded}, duration = {1024/44100 = 0.023}},
}
sampleSizeArray[1] = {
sampleSize = 163,
}
dataBuffer = 0x281cc7a80
Here's the debug description of the second CMSampleBuffer (notice the priming duration attachment of 1088/44_100, which combined with the previous trim duration yields the standard value of 2112):
CMSampleBuffer 0x102e584f0 retainCount: 7 allocator: 0x1c061f840
invalid = NO
dataReady = YES
makeDataReadyCallback = 0x0
makeDataReadyRefcon = 0x0
buffer-level attachments:
TrimDurationAtStart = {
epoch = 0;
flags = 1;
timescale = 44100;
value = 1088;
}
formatDescription = <CMAudioFormatDescription 0x281fd9720 [0x1c061f840]> {
mediaType:'soun'
mediaSubType:'aac '
mediaSpecific: {
ASBD: {
mSampleRate: 44100.000000
mFormatID: 'aac '
mFormatFlags: 0x2
mBytesPerPacket: 0
mFramesPerPacket: 1024
mBytesPerFrame: 0
mChannelsPerFrame: 2
mBitsPerChannel: 0 }
cookie: {<CFData 0x2805f50a0 [0x1c061f840]>{length = 39, capacity = 39, bytes = 0x03808080220000000480808014401400 ... 1210068080800102}}
ACL: {(null)}
FormatList Array: {
Index: 0
ChannelLayoutTag: 0x650002
ASBD: {
mSampleRate: 44100.000000
mFormatID: 'aac '
mFormatFlags: 0x0
mBytesPerPacket: 0
mFramesPerPacket: 1024
mBytesPerFrame: 0
mChannelsPerFrame: 2
mBitsPerChannel: 0 }}
}
extensions: {(null)}
}
sbufToTrackReadiness = 0x0
numSamples = 1
outputPTS = {6683543255/44100 = 151554.269, rounded}(based on cachedOutputPresentationTimeStamp)
sampleTimingArray[1] = {
{PTS = {6683542167/44100 = 151554.244, rounded}, DTS = {6683542167/44100 = 151554.244, rounded}, duration = {1024/44100 = 0.023}},
}
sampleSizeArray[1] = {
sampleSize = 179,
}
dataBuffer = 0x281cc4750
Now, when I read the audio track using AVAssetReader, the first CMSampleBuffer I get is:
CMSampleBuffer 0x102ed7b20 retainCount: 7 allocator: 0x1c061f840
invalid = NO
dataReady = YES
makeDataReadyCallback = 0x0
makeDataReadyRefcon = 0x0
buffer-level attachments:
EmptyMedia(P) = true
formatDescription = (null)
sbufToTrackReadiness = 0x0
numSamples = 0
outputPTS = {0/1 = 0.000}(based on outputPresentationTimeStamp)
sampleTimingArray[1] = {
{PTS = {0/1 = 0.000}, DTS = {INVALID}, duration = {0/1 = 0.000}},
}
dataBuffer = 0x0
and the next one contains priming info of 1088/44_100:
CMSampleBuffer 0x10318bc00 retainCount: 7 allocator: 0x1c061f840
invalid = NO
dataReady = YES
makeDataReadyCallback = 0x0
makeDataReadyRefcon = 0x0
buffer-level attachments:
FillDiscontinuitiesWithSilence(P) = true
GradualDecoderRefresh(P) = 1
TrimDurationAtStart(P) = {
epoch = 0;
flags = 1;
timescale = 44100;
value = 1088;
}
IsGradualDecoderRefreshAuthoritative(P) = false
formatDescription = <CMAudioFormatDescription 0x281fdcaa0 [0x1c061f840]> {
mediaType:'soun'
mediaSubType:'aac '
mediaSpecific: {
ASBD: {
mSampleRate: 44100.000000
mFormatID: 'aac '
mFormatFlags: 0x0
mBytesPerPacket: 0
mFramesPerPacket: 1024
mBytesPerFrame: 0
mChannelsPerFrame: 2
mBitsPerChannel: 0 }
cookie: {<CFData 0x2805f3800 [0x1c061f840]>{length = 39, capacity = 39, bytes = 0x03808080220000000480808014401400 ... 1210068080800102}}
ACL: {Stereo (L R)}
FormatList Array: {
Index: 0
ChannelLayoutTag: 0x650002
ASBD: {
mSampleRate: 44100.000000
mFormatID: 'aac '
mFormatFlags: 0x0
mBytesPerPacket: 0
mFramesPerPacket: 1024
mBytesPerFrame: 0
mChannelsPerFrame: 2
mBitsPerChannel: 0 }}
}
extensions: {{
VerbatimISOSampleEntry = {length = 87, bytes = 0x00000057 6d703461 00000000 00000001 ... 12100680 80800102 };
}}
}
sbufToTrackReadiness = 0x0
numSamples = 43
outputPTS = {83/600 = 0.138}(based on outputPresentationTimeStamp)
sampleTimingArray[1] = {
{PTS = {1024/44100 = 0.023}, DTS = {1024/44100 = 0.023}, duration = {1024/44100 = 0.023}},
}
sampleSizeArray[43] = {
sampleSize = 179,
sampleSize = 173,
sampleSize = 178,
sampleSize = 172,
sampleSize = 172,
sampleSize = 159,
sampleSize = 180,
sampleSize = 200,
sampleSize = 187,
sampleSize = 189,
sampleSize = 206,
sampleSize = 192,
sampleSize = 195,
sampleSize = 186,
sampleSize = 183,
sampleSize = 189,
sampleSize = 211,
sampleSize = 198,
sampleSize = 204,
sampleSize = 211,
sampleSize = 204,
sampleSize = 202,
sampleSize = 218,
sampleSize = 210,
sampleSize = 206,
sampleSize = 207,
sampleSize = 221,
sampleSize = 219,
sampleSize = 236,
sampleSize = 219,
sampleSize = 227,
sampleSize = 225,
sampleSize = 225,
sampleSize = 229,
sampleSize = 225,
sampleSize = 236,
sampleSize = 233,
sampleSize = 231,
sampleSize = 249,
sampleSize = 234,
sampleSize = 250,
sampleSize = 249,
sampleSize = 259,
}
dataBuffer = 0x281cde370
The input's append method keeps returning true, which in principle means that all sample buffers were appended, but the reader for some reason skips the first chunk of data. Is there anything I'm doing wrong here?
I'm using the following code to read the file:
let asset = AVAsset(url: fileURL)
guard let assetReader = try? AVAssetReader(asset: asset) else {
    return
}
asset.loadValuesAsynchronously(forKeys: ["tracks"]) {
    guard let audioTrack = asset.tracks(withMediaType: .audio).first else { return }
    let audioOutput = AVAssetReaderTrackOutput(track: audioTrack, outputSettings: nil)
    assetReader.add(audioOutput)
    assetReader.startReading()
    while assetReader.status == .reading {
        if let sampleBuffer = audioOutput.copyNextSampleBuffer() {
            // do something
        }
    }
}

First some pedantry: you haven't lost your first sample buffer, but rather the first packet within your first sample buffer.
The behaviour of AVAssetReader with nil outputSettings when reading AAC packet data has changed on iOS 13 and macOS 10.15 (Catalina).
Previously you would get the first AAC packet, that packet's presentation timestamp (zero) and a trim attachment instructing you to discard the usual first 2112 frames of decoded audio.
Now [iOS 13, macOS 10.15] AVAssetReader seems to discard the first packet, leaving you the second packet, whose presentation timestamp is 1024, and you need only discard 2112 - 1024 = 1088 of the decoded frames.
Something that might not be immediately obvious in the above situations is that AVAssetReader is talking about TWO timelines, not one. The packet timestamps refer to one, the untrimmed timeline, and the trim instruction implies the existence of another: the trimmed timeline.
The transformation from untrimmed to trimmed timestamps is very simple: it's usually trimmed = untrimmed - 2112 frames.
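As a concrete illustration, here's a minimal Swift sketch of turning that trim attachment into a number of decoded frames to discard (the only assumption is the 44.1 kHz sample rate from the dumps above):

import CoreMedia

// Works on the buffers vended by AVAssetReaderTrackOutput above.
func primingFramesToDiscard(in sampleBuffer: CMSampleBuffer,
                            sampleRate: CMTimeScale = 44100) -> Int64 {
    // The trim attachment, when present, is a CFDictionary-encoded CMTime.
    guard let attachment = CMGetAttachment(sampleBuffer,
                                           key: kCMSampleBufferAttachmentKey_TrimDurationAtStart,
                                           attachmentModeOut: nil) else {
        return 0    // no trim attachment: nothing to discard
    }
    let trim = CMTimeMakeFromDictionary((attachment as! CFDictionary))
    // Convert the trim duration into a frame count at the track's sample rate,
    // e.g. 1088 for the first buffer vended on iOS 13 in the dumps above.
    return CMTimeConvertScale(trim, timescale: sampleRate, method: .default).value
}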
So is the new behaviour a bug? The fact that if you decode to LPCM and correctly follow the trim instructions, then you should still get the same audio, leads me to believe the change was intentional (NB: I haven't yet personally confirmed the LPCM samples are the same).
However, the documentation says:
A value of nil for outputSettings configures the output to vend samples in their original format as stored by the specified track.
I don't think you can both discard packets [even the first one, which is basically a constant] and claim to be vending samples in their "original format", so from this point of view I think the change has a bug-like quality.
I also think it's an unfortunate change as I used to consider nil outputSettings AVAssetReader to be a sort of "raw" mode, but now it assumes your only use case is decoding to LPCM.
There's only one thing that could downgrade "unfortunate" to "serious bug", and that's if this new "let's pretend the first AAC packet doesn't exist" approach extends to files created with AVAssetWriter because that would break interoperability with non-AVAssetReader code, where the number of frames to trim has congealed to a constant 2112 frames. I also haven't personally confirmed this. Do you have a file created with the above sample buffers that you can share?
p.s. I don't think your input sample buffers are relevant here; I think you'd lose the first packet reading from any AAC file. However, your input sample buffers seem slightly unusual in that they have hosttime [capture session?] style timestamps, yet are AAC, and only have one packet per sample buffer, which isn't very many and seems like a lot of overhead for 23ms of audio. Are you creating them yourself in an AVCaptureSession -> AVAudioConverter chain?

Related

How can I convert my UInt8 array to Data? (Swift)

I am trying to communicate with a Bluetooth laser tag gun that takes data in 20-byte chunks, which are broken down into 16-, 8- or 4-bit words. To do this, I made a UInt8 array and changed the values in there. The problem happens when I try to send the UInt8 array.
var bytes = [UInt8](repeating: 0, count: 20)
bytes[0] = commandID
if commandID == 240 {
commandID = 0
}
commandID += commandIDIncrement
print(commandID)
bytes[2] = 128
bytes[4] = UInt8(gunIDSlider.value)
print("Response: \(laserTagGun.writeValue(bytes, for: gunCControl, type: CBCharacteristicWriteType.withResponse))")
commandID is just a UInt8. This gives me the error, Cannot convert value of type '[UInt8]' to expected argument type 'Data', which I tried to solve by doing this:
var bytes = [UInt8](repeating: 0, count: 20)
bytes[0] = commandID
if commandID == 240 {
commandID = 0
}
commandID += commandIDIncrement
print(commandID)
bytes[2] = 128
bytes[4] = UInt8(gunIDSlider.value)
print("bytes: \(bytes)")
assert(bytes.count * MemoryLayout<UInt8>.stride >= MemoryLayout<Data>.size)
let data1 = UnsafeRawPointer(bytes).assumingMemoryBound(to: Data.self).pointee
print("data1: \(data1)")
print("Response: \(laserTagGun.writeValue(data1, for: gunCControl, type: CBCharacteristicWriteType.withResponse))")
With this, data1 just prints 0 bytes, and I can see that laserTagGun.writeValue isn't actually doing anything by reading data from the other characteristics. How can I convert my UInt8 array to Data in Swift? Also, please let me know if there is a better way to handle 20 bytes of data than a UInt8 array. Thank you for your help!
It looks like you're really trying to avoid a copy of the bytes; if not, then just init a new Data with your bytes array:
let data2 = Data(bytes)
print("data2: \(data2)")
If you really want to avoid the copy, what about something like this?
let data1 = Data(bytesNoCopy: UnsafeMutableRawPointer(mutating: bytes), count: bytes.count, deallocator: .none)
print("data1: \(data1)")

AVAssetWriter startSessionAtSourceTime not accepting CMTime value

My app is designed to record video and analyze the frames generated under iOS 11.4, using Xcode 10.0 as the IDE. I succeeded in recording video using AVCaptureMovieFileOutput, but I need to analyze frames, so I transitioned to AVAssetWriter and modeled my code after RosyWriter [ https://github.com/WildDylan/appleSample/tree/master/RosyWriter ]. The code is written in ObjC.
I am stuck on a problem inside the captureOutput:didOutputSampleBuffer:fromConnection: delegate method. After capturing the first frame, the AVAssetWriter is configured along with its inputs (video and audio), using settings extracted from that first frame. Once the user selects record, the captured sampleBuffer is analyzed and written. I tried to use AVAssetWriter startSessionAtSourceTime:, but there is clearly something wrong with the way CMSampleBufferGetPresentationTimeStamp is returning CMTime from the sample buffer. The sampleBuffer log seems to show a CMTime with valid values.
If I implement:
CMTime sampleTime = CMSampleBufferGetPresentationTimeStamp(sampleBuffer);
[self->assetWriter startSessionAtSourceTime: sampleTime]
the error generated is '*** -[AVAssetWriter startSessionAtSourceTime:] invalid parameter not satisfying: CMTIME_IS_NUMERIC(startTime)' .
If I use [self->assetWriter startSessionAtSourceTime:kCMTimeZero], the warning "warning: could not execute support code to read Objective-C class data in the process. This may reduce the quality of type information available." is generated.
When I log sampleTime I read - value=0, timescale=0, epoch=0 & flags=0. I also log the sampleBuffer and show it below, followed by the relevant code:
SampleBuffer Content =
2018-10-17 12:07:04.540816+0300 MyApp[10664:2111852] -[CameraCaptureManager captureOutput:didOutputSampleBuffer:fromConnection:] : sampleBuffer - CMSampleBuffer 0x100e388c0 retainCount: 1 allocator: 0x1c03a95e0
invalid = NO
dataReady = YES
makeDataReadyCallback = 0x0
makeDataReadyRefcon = 0x0
buffer-level attachments:
Orientation(P) = 1
{Exif} (P) = <CFBasicHash 0x28161ce80 [0x1c03a95e0]>{type = mutable dict, count = 24,
entries => .....A LOT OF CAMERA DATA HERE.....
}
DPIWidth (P) = 72
{TIFF} (P) = <CFBasicHash 0x28161c540 [0x1c03a95e0]>{type = mutable dict, count = 7,
entries => .....MORE CAMERA DATA HERE.....
}
DPIHeight (P) = 72
{MakerApple}(P) = {
1 = 3;
10 = 0;
14 = 0;
3 = {
epoch = 0;
flags = 1;
timescale = 1000000000;
value = 390750488472916;
};
4 = 0;
5 = 221;
6 = 211;
7 = 1;
8 = (
"-0.04894018",
"-0.6889497",
"-0.7034443"
);
9 = 0;
}
formatDescription = <CMVideoFormatDescription 0x280ddc780 [0x1c03a95e0]> {
mediaType:'vide'
mediaSubType:'BGRA'
mediaSpecific: {
codecType: 'BGRA' dimensions: 720 x 1280
}
extensions: {<CFBasicHash 0x28161f880 [0x1c03a95e0]>{type = immutable dict, count = 5,
entries =>
0 : <CFString 0x1c0917068 [0x1c03a95e0]>{contents = "CVImageBufferYCbCrMatrix"} = <CFString 0x1c09170a8 [0x1c03a95e0]>{contents = "ITU_R_601_4"}
1 : <CFString 0x1c09171c8 [0x1c03a95e0]>{contents = "CVImageBufferTransferFunction"} = <CFString 0x1c0917088 [0x1c03a95e0]>{contents = "ITU_R_709_2"}
2 : <CFString 0x1c093f348 [0x1c03a95e0]>{contents = "CVBytesPerRow"} = <CFNumber 0x81092876519e5903 [0x1c03a95e0]>{value = +2880, type = kCFNumberSInt32Type}
3 : <CFString 0x1c093f3c8 [0x1c03a95e0]>{contents = "Version"} = <CFNumber 0x81092876519eed23 [0x1c03a95e0]>{value = +2, type = kCFNumberSInt32Type}
5 : <CFString 0x1c0917148 [0x1c03a95e0]>{contents = "CVImageBufferColorPrimaries"} = <CFString 0x1c0917088 [0x1c03a95e0]>{contents = "ITU_R_709_2"}
}
}
}
sbufToTrackReadiness = 0x0
numSamples = 1
sampleTimingArray[1] = {
{PTS = {390750488483992/1000000000 = 390750.488}, DTS = {INVALID}, duration = {INVALID}},
}
imageBuffer = 0x2832ad2c0
====================================================
//AVCaptureVideoDataOutput Delegates
- (void)captureOutput:(AVCaptureOutput *)captureOutput didOutputSampleBuffer:(CMSampleBufferRef)sampleBuffer fromConnection:(AVCaptureConnection *)connection
{
    if (connection == videoConnection)
    {
        if (self.outputVideoFormatDescription == NULL)
        {
            self.outputVideoFormatDescription = CMSampleBufferGetFormatDescription(sampleBuffer);
            [self setupVideoRecorder];
        }
        else if (self.status == RecorderRecording)
        {
            NSLog(@"%s : self.outputVideoFormatDescription - %@", __FUNCTION__, self.outputVideoFormatDescription);
            [self.cmDelegate manager:self capturedFrameBuffer:sampleBuffer];
            NSLog(@"%s : sampleBuffer - %@", __FUNCTION__, sampleBuffer);
            dispatch_async(vidWriteQueue, ^
            {
                if (!self->wroteFirstFrame)
                {
                    CMTime sampleTime = CMSampleBufferGetPresentationTimeStamp(sampleBuffer);
                    NSLog(@"%s : sampleTime value - %lld, timescale - %i, epoch - %lli, flags - %u", __FUNCTION__, sampleTime.value, sampleTime.timescale, sampleTime.epoch, sampleTime.flags);
                    [self->assetWriter startSessionAtSourceTime:sampleTime];
                    self->wroteFirstFrame = YES;
                }
                if (self->videoAWInput.readyForMoreMediaData)
                //else if (self->videoAWInput.readyForMoreMediaData)
                {
                    BOOL appendSuccess = [self->videoAWInput appendSampleBuffer:sampleBuffer];
                    NSLog(@"%s : appendSuccess - %i", __FUNCTION__, appendSuccess);
                    if (!appendSuccess) NSLog(@"%s : failed to append video buffer - %@", __FUNCTION__, self->assetWriter.error.localizedDescription);
                }
            });
        }
        else if (connection == audioConnection)
        {
        }
    }
}
My bad... my problem was that I was spawning off the frame capture work onto the queue that I had already set as the sample buffer delegate queue via AVCaptureVideoDataOutput setSampleBufferDelegate:queue:, i.e. recursively putting work onto the very queue I was already running on. Posting the answer in case another idiot, like me, makes the same stupid mistake...
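For reference, here is a rough Swift sketch of that corrected shape (the property names mirror the question's ObjC ivars, and the recording-state checks are omitted). The delegate is already invoked on the queue you pass to setSampleBufferDelegate(_:queue:), so the buffer can be read and appended synchronously:

import AVFoundation

final class CameraCaptureManager: NSObject, AVCaptureVideoDataOutputSampleBufferDelegate {
    // Assumed state, mirroring the question's ObjC ivars.
    var assetWriter: AVAssetWriter!
    var videoAWInput: AVAssetWriterInput!
    var videoConnection: AVCaptureConnection!
    var wroteFirstFrame = false

    func captureOutput(_ output: AVCaptureOutput,
                       didOutput sampleBuffer: CMSampleBuffer,
                       from connection: AVCaptureConnection) {
        guard connection == videoConnection else { return }   // (recording-state checks omitted)
        if !wroteFirstFrame {
            // Read the PTS synchronously, while the buffer is still valid; it is numeric at this point.
            assetWriter.startSession(atSourceTime: CMSampleBufferGetPresentationTimeStamp(sampleBuffer))
            wroteFirstFrame = true
        }
        if videoAWInput.isReadyForMoreMediaData, !videoAWInput.append(sampleBuffer) {
            print("failed to append video buffer: \(String(describing: assetWriter.error))")
        }
    }
}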

Can't append CMSampleBuffer to AVAssetWriterInput (error -12780)

I'm manually decoding an H.264 RTSP stream using ffmpeg and trying to save the uncompressed frames using AVAssetWriter and AVAssetWriterInput.
I'm getting the following error when calling AVAssetWriterInput appendSampleBuffer: -
Error Domain=AVFoundationErrorDomain Code=-11800 "The operation could not be completed" UserInfo={NSUnderlyingError=0x170059530 {Error Domain=NSOSStatusErrorDomain Code=-12780 "(null)"}, NSLocalizedFailureReason=An unknown error occurred (-12780), NSLocalizedDescription=The operation could not be completed}
The CMSampleBuffer contains BGRA frames and looks like this -
CMSampleBuffer 0x159d12900 retainCount: 1 allocator: 0x1b3aa3bb8
invalid = NO
dataReady = YES
makeDataReadyCallback = 0x0
makeDataReadyRefcon = 0x0
formatDescription = <CMVideoFormatDescription 0x17405bd50 [0x1b3aa3bb8]> {
mediaType:'vide'
mediaSubType:'BGRA'
mediaSpecific: {
codecType: 'BGRA'
dimensions: 720 x 1280
}
extensions: {<CFBasicHash 0x1742652c0 [0x1b3aa3bb8]>{type = immutable dict, count = 4,
entries =>
0 : <CFString 0x1addb17c8 [0x1b3aa3bb8]>{contents = "CVImageBufferYCbCrMatrix"} = <CFString 0x1addb1808 [0x1b3aa3bb8]>{contents = "ITU_R_601_4"}
1 : <CFString 0x1addb1928 [0x1b3aa3bb8]>{contents = "CVImageBufferTransferFunction"} = <CFString 0x1addb17e8 [0x1b3aa3bb8]>{contents = "ITU_R_709_2"}
2 : <CFString 0x1adde3800 [0x1b3aa3bb8]>{contents = "CVBytesPerRow"} = <CFNumber 0xb00000000000b402 [0x1b3aa3bb8]>{value = +2880, type = kCFNumberSInt32Type}
3 : <CFString 0x1adde3880 [0x1b3aa3bb8]>{contents = "Version"} = <CFNumber 0xb000000000000022 [0x1b3aa3bb8]>{value = +2, type = kCFNumberSInt32Type}
}
}
}
sbufToTrackReadiness = 0x0
numSamples = 1
sampleTimingArray[1] = {
{PTS = {3000/90000 = 0.033}, DTS = {INVALID}, duration = {INVALID}},
}
imageBuffer = 0x17413ebe0
I've looked at the following question and its answers as well, but it doesn't seem to explain the issue I'm having (the format I used is a supported pixel format):
Why won't AVFoundation accept my planar pixel buffers on an iOS device?
Any help would be greatly appreciated!
FYI - when I save BGRA CMSampleBuffers I get from the iPhone camera, it just works; if needed, I can paste an example CMSampleBuffer as well.
I'll answer myself as I've found the issue -
The CMSampleBuffer wasn't IOSurface-backed. I had used CVPixelBufferCreateWithBytes, which creates a CVPixelBuffer without IOSurface backing; as soon as I used CVPixelBufferCreate and passed the kCVPixelBufferIOSurfacePropertiesKey key, it worked.
https://developer.apple.com/library/content/qa/qa1781/_index.html has all the information about creating IOSurface-backed CVPixelBuffers.
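In Swift terms, the working allocation path looks roughly like this (a sketch; srcBytes and srcBytesPerRow stand in for ffmpeg's decoded BGRA frame data):

import CoreVideo
import Foundation

func makeIOSurfaceBackedPixelBuffer(width: Int,
                                    height: Int,
                                    srcBytes: UnsafeRawPointer,
                                    srcBytesPerRow: Int) -> CVPixelBuffer? {
    // Asking for (empty) IOSurface properties makes CoreVideo allocate an
    // IOSurface-backed buffer, which AVAssetWriterInput will accept.
    let attrs: [String: Any] = [kCVPixelBufferIOSurfacePropertiesKey as String: [:]]
    var pixelBuffer: CVPixelBuffer?
    let status = CVPixelBufferCreate(kCFAllocatorDefault,
                                     width,
                                     height,
                                     kCVPixelFormatType_32BGRA,
                                     attrs as CFDictionary,
                                     &pixelBuffer)
    guard status == kCVReturnSuccess, let buffer = pixelBuffer else { return nil }

    CVPixelBufferLockBaseAddress(buffer, [])
    defer { CVPixelBufferUnlockBaseAddress(buffer, []) }
    let dst = CVPixelBufferGetBaseAddress(buffer)!
    let dstBytesPerRow = CVPixelBufferGetBytesPerRow(buffer)
    // Copy row by row, since the IOSurface-backed buffer may pad its rows.
    for row in 0..<height {
        memcpy(dst + row * dstBytesPerRow,
               srcBytes + row * srcBytesPerRow,
               min(srcBytesPerRow, dstBytesPerRow))
    }
    return buffer
}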

Swift 3 AVAudioEngine set microphone input format

I want to process the bytes read from the microphone using Swift 3 on iOS. I currently use AVAudioEngine.
print(inputNode.inputFormat(forBus: bus).settings)
print(inputNode.inputFormat(forBus: bus).formatDescription)
This gives me the following output:
["AVNumberOfChannelsKey": 1, "AVLinearPCMBitDepthKey": 32, "AVSampleRateKey": 16000, "AVLinearPCMIsNonInterleaved": 1, "AVLinearPCMIsBigEndianKey": 0, "AVFormatIDKey": 1819304813, "AVLinearPCMIsFloatKey": 1]
<CMAudioFormatDescription 0x14d5bbb0 [0x3a5fb7d8]> {
mediaType:'soun'
mediaSubType:'lpcm'
mediaSpecific: {
ASBD: {
mSampleRate: 16000.000000
mFormatID: 'lpcm'
mFormatFlags: 0x29
mBytesPerPacket: 4
mFramesPerPacket: 1
mBytesPerFrame: 4
mChannelsPerFrame: 1
mBitsPerChannel: 32 }
cookie: {(null)}
ACL: {(null)}
FormatList Array: {(null)}
}
extensions: {(null)}
}
The problem is that the server I want to send the data to does not expect 32-bit floats but 16-bit unsigned ints. I think I have to change the mFormatFlags. Does anybody know how I can do this and what value would be the right one?
The resulting byte stream should be equivalent to the one I get on Android using
AudioRecord recorder = new AudioRecord(MediaRecorder.AudioSource.MIC, SAMPLES_PER_SECOND,
AudioFormat.CHANNEL_IN_MONO, AudioFormat.ENCODING_PCM_16BIT,
recordSegmentSizeBytes);
I tried this:
let cfmt = AVAudioCommonFormat.pcmFormatInt16
inputNode.inputFormat(forBus: bus) = AVAudioFormat(commonFormat: cfmt, sampleRate: 16000.0, channels: 1, interleaved: false)
but got this error
Cannot assign to value: function call returns immutable value
Any ideas?
Oh my god, I think I got it. I was too blind to see that you can specify the format of the installTap callback. This seems to work:
let audioEngine = AVAudioEngine()

func startRecording() {
    let inputNode = audioEngine.inputNode!
    let bus = 0
    let format = AVAudioFormat(commonFormat: AVAudioCommonFormat.pcmFormatInt16, sampleRate: 16000.0, channels: 1, interleaved: false)
    inputNode.installTap(onBus: bus, bufferSize: 2048, format: format) { // instead of inputNode.inputFormat(forBus: bus)
        (buffer: AVAudioPCMBuffer!, time: AVAudioTime!) -> Void in
        let values = UnsafeBufferPointer(start: buffer.int16ChannelData![0], count: Int(buffer.frameLength))
        let arr = Array(values)
        print(arr)
    }
    audioEngine.prepare()
    do {
        try audioEngine.start()
    } catch {
        print("Error info: \(error)")
    }
}
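Inside that tap callback, if you then need to ship the samples to the server as raw 16-bit little-endian PCM, one option looks like this (a sketch; send(_:) stands in for whatever transport you use):

let pcmData = arr.withUnsafeBufferPointer { Data(buffer: $0) }   // [Int16] -> raw bytes
send(pcmData)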

VTDecompressionSessionDecodeFrame ERROR -12916 (kVTFormatDescriptionChangeNotSupportedErr)

I got a problem I can't get my head around.
First I create a compression session with VTCompressionSessionCreate (H.264); then, in my compression callback, once I start feeding images, I get a CMSampleBufferRef sampleBuffer as expected.
Just for debugging the coded stream, I then create a decompression session with VTDecompressionSessionCreate and feed the 'sampleBuffer' containing the H.264 stream to VTDecompressionSessionDecodeFrame, and I would expect a CVImageBufferRef imageBuffer in my decompression callback.
Now to the problem:
If I create the VTDecompressionSession using the format description taken from the 'sampleBuffer' delivered by the compression callback, like this:
CMFormatDescriptionRef format = CMSampleBufferGetFormatDescription(sampleBuffer);
Everything works as expected and I get CVImageBufferRef's in my decompression callback.
However, my intention is to send the data over a network, so I need to get my format description from the in-stream SPS and PPS information.
So then I must 'fake' getting the SPS and PPS by first extracting them and then using them like this:
CMFormatDescriptionRef format = CMSampleBufferGetFormatDescription(sampleBuffer);
size_t spsSize, ppsSize;
size_t parmCount;
const uint8_t* sps, *pps;
CMVideoFormatDescriptionGetH264ParameterSetAtIndex(format, 0, &sps, &spsSize, &parmCount, NULL );
CMVideoFormatDescriptionGetH264ParameterSetAtIndex(format, 1, &pps, &ppsSize, &parmCount, NULL );
const uint8_t* const parameterSetPointers[2] = {sps, pps};
const size_t parameterSetSizes[2] = {spsSize, ppsSize};
CMFormatDescriptionRef format2;
status = CMVideoFormatDescriptionCreateFromH264ParameterSets(kCFAllocatorDefault, 2, parameterSetPointers, parameterSetSizes, 4, &format2);
I would expect format and format2 to contain the same information but:
format = <CMVideoFormatDescription 0x17004fd50 [0x19483ac80]> {
mediaType:'vide'
mediaSubType:'avc1'
mediaSpecific: {
codecType: 'avc1' dimensions: 1280 x 720
}
extensions: {<CFBasicHash 0x170270cc0 [0x19483ac80]>{type = immutable dict, count = 2,
entries =>
0 : <CFString 0x194935fa0 [0x19483ac80]>{contents = "SampleDescriptionExtensionAtoms"} = <CFBasicHash 0x170270c40 [0x19483ac80]>{type = immutable dict, count = 1,
entries =>
2 : <CFString 0x194939fa0 [0x19483ac80]>{contents = "avcC"} = <CFData 0x1700c9920 [0x19483ac80]>{length = 35, capacity = 35, bytes = 0x0164001fffe100106764001fac56c050 ... 28ee3cb0fdf8f800}
}
2 : <CFString 0x194936000 [0x19483ac80]>{contents = "FormatName"} = <CFString 0x17003a160 [0x19483ac80]>{contents = "H.264"}
}
}
}
format2:
format2 = <CMVideoFormatDescription 0x174051c70 [0x19483ac80]> {
mediaType:'vide'
mediaSubType:'avc1'
mediaSpecific: {
codecType: 'avc1' dimensions: 1280 x 720
}
extensions: {<CFBasicHash 0x17426f9c0 [0x19483ac80]>{type = immutable dict, count = 5,
entries =>
0 : <CFString 0x19499a608 [0x19483ac80]>{contents = "CVImageBufferChromaLocationBottomField"} = <CFString 0x19499a648 [0x19483ac80]>{contents = "Center"}
1 : <CFString 0x19499a328 [0x19483ac80]>{contents = "CVFieldCount"} = <CFNumber 0xb000000000000012 [0x19483ac80]>{value = +1, type = kCFNumberSInt32Type}
3 : <CFString 0x194935fa0 [0x19483ac80]>{contents = "SampleDescriptionExtensionAtoms"} = <CFBasicHash 0x17426b100 [0x19483ac80]>{type = immutable dict, count = 1,
entries =>
2 : <CFString 0x174031560 [0x19483ac80]>{contents = "avcC"} = <CFData 0x1740c4910 [0x19483ac80]>{length = 35, capacity = 35, bytes = 0x0164001fffe100106764001fac56c050 ... 28ee3cb0fdf8f800}
}
5 : <CFString 0x19499a5e8 [0x19483ac80]>{contents = "CVImageBufferChromaLocationTopField"} = <CFString 0x19499a648 [0x19483ac80]>{contents = "Center"}
6 : <CFString 0x1949360e0 [0x19483ac80]>{contents = "FullRangeVideo"} = <CFBoolean 0x19483b030 [0x19483ac80]>{value = false}
}
}
}
format works; format2 doesn't, and VTDecompressionSessionDecodeFrame throws error -12916.
Thank you for helping.
Solved it. It was the way I created the CMFormatDescriptionRef for the coded stream that was causing the error.
The SPS and PPS were taken from a CMSampleBuffer, and then I created a format description from them with CMVideoFormatDescriptionCreateFromH264ParameterSets; so far so good. But in the same application I turned the stream around and decoded the picture using that same CMSampleBuffer. That wasn't working and was causing the error. I had to copy the payload into an NSData first and then create a new CMSampleBuffer from that NSData. Then it worked.
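In rough modern-Swift terms, the working path looks something like this (a sketch, not the exact original code; payload, formatDescription and presentationTime stand in for the copied AVCC frame bytes, the format description built from the SPS/PPS, and the frame's timestamp):

import CoreMedia
import Foundation

func makeSampleBuffer(payload: Data,
                      formatDescription: CMVideoFormatDescription,
                      presentationTime: CMTime) -> CMSampleBuffer? {
    // 1. Allocate a fresh CMBlockBuffer and copy the encoded bytes into it,
    //    so the sample buffer owns its own memory.
    var blockBuffer: CMBlockBuffer?
    var status = CMBlockBufferCreateWithMemoryBlock(allocator: kCFAllocatorDefault,
                                                    memoryBlock: nil,   // let CoreMedia allocate
                                                    blockLength: payload.count,
                                                    blockAllocator: kCFAllocatorDefault,
                                                    customBlockSource: nil,
                                                    offsetToData: 0,
                                                    dataLength: payload.count,
                                                    flags: 0,
                                                    blockBufferOut: &blockBuffer)
    guard status == kCMBlockBufferNoErr, let block = blockBuffer else { return nil }
    status = payload.withUnsafeBytes {
        CMBlockBufferReplaceDataBytes(with: $0.baseAddress!,
                                      blockBuffer: block,
                                      offsetIntoDestination: 0,
                                      dataLength: payload.count)
    }
    guard status == kCMBlockBufferNoErr else { return nil }

    // 2. Wrap the block buffer in a CMSampleBuffer tagged with the
    //    parameter-set-derived format description.
    var timing = CMSampleTimingInfo(duration: .invalid,
                                    presentationTimeStamp: presentationTime,
                                    decodeTimeStamp: .invalid)
    var sampleSize = payload.count
    var sampleBuffer: CMSampleBuffer?
    status = CMSampleBufferCreate(allocator: kCFAllocatorDefault,
                                  dataBuffer: block,
                                  dataReady: true,
                                  makeDataReadyCallback: nil,
                                  refcon: nil,
                                  formatDescription: formatDescription,
                                  sampleCount: 1,
                                  sampleTimingEntryCount: 1,
                                  sampleTimingArray: &timing,
                                  sampleSizeEntryCount: 1,
                                  sampleSizeArray: &sampleSize,
                                  sampleBufferOut: &sampleBuffer)
    return status == noErr ? sampleBuffer : nil
}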
