I am trying to write a video decoder using the hardware-supported VideoToolbox decoder. But if I try to initialize the decoding session as in the example posted below, I get error -8971 while calling VTDecompressionSessionCreate. Can anyone tell me what I am doing wrong here?
Thank you and best regards,
Oliver
OSStatus status;
int tmpWidth = sps.EncodedWidth();
int tmpHeight = sps.EncodedHeight();
NSLog(@"Got new Width and Height from SPS - %dx%d", tmpWidth, tmpHeight);

const VTDecompressionOutputCallbackRecord callback = { ReceivedDecompressedFrame, self };

status = CMVideoFormatDescriptionCreate(NULL,
                                        kCMVideoCodecType_H264,
                                        tmpWidth,
                                        tmpHeight,
                                        NULL,
                                        &decoderFormatDescription);
if (status == noErr)
{
    // Set the pixel attributes for the destination buffer
    CFMutableDictionaryRef destinationPixelBufferAttributes = CFDictionaryCreateMutable(
        NULL, // CFAllocatorRef allocator
        0,    // CFIndex capacity
        &kCFTypeDictionaryKeyCallBacks,
        &kCFTypeDictionaryValueCallBacks);

    SInt32 destinationPixelType = kCVPixelFormatType_420YpCbCr8BiPlanarVideoRange;
    CFDictionarySetValue(destinationPixelBufferAttributes, kCVPixelBufferPixelFormatTypeKey, CFNumberCreate(NULL, kCFNumberSInt32Type, &destinationPixelType));
    CFDictionarySetValue(destinationPixelBufferAttributes, kCVPixelBufferWidthKey, CFNumberCreate(NULL, kCFNumberSInt32Type, &tmpWidth));
    CFDictionarySetValue(destinationPixelBufferAttributes, kCVPixelBufferHeightKey, CFNumberCreate(NULL, kCFNumberSInt32Type, &tmpHeight));
    CFDictionarySetValue(destinationPixelBufferAttributes, kCVPixelBufferOpenGLCompatibilityKey, kCFBooleanTrue);

    // Set the decoder parameters
    CFMutableDictionaryRef decoderParameters = CFDictionaryCreateMutable(
        NULL, // CFAllocatorRef allocator
        0,    // CFIndex capacity
        &kCFTypeDictionaryKeyCallBacks,
        &kCFTypeDictionaryValueCallBacks);

    CFDictionarySetValue(decoderParameters, kVTDecompressionPropertyKey_RealTime, kCFBooleanTrue);

    // Create the decompression session
    // Throws Error -8971 (codecExtensionNotFoundErr)
    status = VTDecompressionSessionCreate(NULL, decoderFormatDescription, decoderParameters, destinationPixelBufferAttributes, &callback, &decoderDecompressionSession);

    // Release the dictionaries
    CFRelease(destinationPixelBufferAttributes);
    CFRelease(decoderParameters);

    // Check the status
    if (status != noErr)
    {
        NSLog(@"Error %d while creating Video Decompression Session.", (int)status);
        continue;
    }
}
else
{
    NSLog(@"Error %d while creating Video Format Description.", (int)status);
    continue;
}
I also stumbled over kVTVideoDecoderBadDataErr. In my case the problem was that I was replacing the 0x00000001 start code with a NAL size that included those 4 header bytes. I changed the size so that it does not include the 4 bytes (frame_size = sizeof(NAL) - 4). This size must be encoded in big-endian.
You need to create the CMFormatDescriptionRef from your SPS and PPS like this:

CMFormatDescriptionRef decoderFormatDescription;
const uint8_t* const parameterSetPointers[2] = { (const uint8_t*)[currentSps bytes], (const uint8_t*)[currentPps bytes] };
const size_t parameterSetSizes[2] = { [currentSps length], [currentPps length] };

status = CMVideoFormatDescriptionCreateFromH264ParameterSets(NULL,
                                                             2,
                                                             parameterSetPointers,
                                                             parameterSetSizes,
                                                             4,
                                                             &decoderFormatDescription);
Also, if you are getting your video data in Annex-B format, you need to remove the start code and replace it with the 4-byte size information so the decoder recognizes it as AVCC-formatted data (that is what the 5th parameter, the NAL unit header length, of CMVideoFormatDescriptionCreateFromH264ParameterSets is for).
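As a rough sketch of that conversion (assuming annexBData is an NSData holding exactly one NAL unit preceded by a 4-byte 0x00000001 start code; the variable names here are just illustrative):

// Replace the 4-byte Annex-B start code with the big-endian NAL unit length (AVCC).
NSMutableData *avccData = [annexBData mutableCopy];
uint32_t nalLength = (uint32_t)(avccData.length - 4);        // length excludes the 4 header bytes
uint32_t bigEndianLength = CFSwapInt32HostToBig(nalLength);  // AVCC lengths are big-endian
[avccData replaceBytesInRange:NSMakeRange(0, 4)
                    withBytes:&bigEndianLength
                       length:4];
// avccData can now be wrapped in a CMBlockBuffer/CMSampleBuffer and handed to the decoder.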
@Joride
refer to http://www.szatmary.org/blog/25
It explains that the header (first) byte of each buffer within a NALU describes the buffer's type. You need to mask off these bits and compare them to the table provided. Note the comment about the bit fields: you need to mask the byte with 0x1F to get the type value.
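For example, the type check can be as simple as this (the NAL unit type values are from the H.264 spec; nalUnit is assumed to point at the first byte after the start code or length prefix):

uint8_t nalType = nalUnit[0] & 0x1F;  // lower 5 bits hold the NAL unit type
switch (nalType) {
    case 5:  /* IDR slice */          break;
    case 7:  /* SPS parameter set */  break;
    case 8:  /* PPS parameter set */  break;
    default: /* other NAL types */    break;
}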
I want to send ARFrame pixel buffer data over the network; I have listed my setup below. With this setup, if I try to send the frames, the app crashes in GStreamer's C code after a few frames, but if I send the camera's AVCaptureVideoDataOutput pixel buffer instead, the stream works fine. I have set the AVCaptureSession's pixel format type to
kCVPixelFormatType_420YpCbCr8BiPlanarFullRange
so it replicates the type that ARFrame gives. Please help, I am unable to find any solution. I am sorry if my English is bad or I have missed something; just ask me if anything is unclear.
My Setup
Get pixelBuffer of ARFrame from didUpdateFrame delegate of ARKit
Encode to h264 using VTCompressionSession
- (void)SendFrames:(CVPixelBufferRef)pixelBuffer :(NSTimeInterval)timeStamp
{
    size_t width = CVPixelBufferGetWidth(pixelBuffer);
    size_t height = CVPixelBufferGetHeight(pixelBuffer);

    if (session == NULL)
    {
        [self initEncoder:width height:height];
    }

    CMTime presentationTimeStamp = CMTimeMake(0, 1);
    OSStatus statusCode = VTCompressionSessionEncodeFrame(session, pixelBuffer, presentationTimeStamp, kCMTimeInvalid, NULL, NULL, NULL);
    if (statusCode != noErr) {
        // End the session
        VTCompressionSessionInvalidate(session);
        CFRelease(session);
        session = NULL;
        return;
    }
    VTCompressionSessionEndPass(session, NULL, NULL);
}
- (void)initEncoder:(size_t)width height:(size_t)height
{
    OSStatus status = VTCompressionSessionCreate(NULL, (int)width, (int)height, kCMVideoCodecType_H264, NULL, NULL, NULL, OutputCallback, NULL, &session);
    NSLog(@"VTCompressionSessionCreate %d", (int)status);
    if (status != noErr)
    {
        NSLog(@"Unable to create a H264 session");
        return;
    }

    VTSessionSetProperty(session, kVTCompressionPropertyKey_RealTime, kCFBooleanTrue);
    VTSessionSetProperty(session, kVTCompressionPropertyKey_ProfileLevel, kVTProfileLevel_H264_Baseline_AutoLevel);
    VTCompressionSessionPrepareToEncodeFrames(session);
}
Get the sampleBuffer from the callback and convert it to an elementary stream
void OutputCallback(void *outputCallbackRefCon, void *sourceFrameRefCon, OSStatus status, VTEncodeInfoFlags infoFlags, CMSampleBufferRef sampleBuffer)
{
    if (status != noErr) {
        NSLog(@"Error encoding video, err=%lld", (int64_t)status);
        return;
    }
    if (!CMSampleBufferDataIsReady(sampleBuffer))
    {
        NSLog(@"didCompressH264 data is not ready");
        return;
    }

    // In this example we will use a NSMutableData object to store the
    // elementary stream.
    NSMutableData *elementaryStream = [NSMutableData data];

    // This is the start code that we will write to
    // the elementary stream before every NAL unit
    static const size_t startCodeLength = 4;
    static const uint8_t startCode[] = {0x00, 0x00, 0x00, 0x01};

    // Write the SPS and PPS NAL units to the elementary stream
    CMFormatDescriptionRef description = CMSampleBufferGetFormatDescription(sampleBuffer);

    // Find out how many parameter sets there are
    size_t numberOfParameterSets;
    int AVCCHeaderLength;
    CMVideoFormatDescriptionGetH264ParameterSetAtIndex(description,
                                                       0, NULL, NULL,
                                                       &numberOfParameterSets,
                                                       &AVCCHeaderLength);

    // Write each parameter set to the elementary stream
    for (int i = 0; i < numberOfParameterSets; i++) {
        const uint8_t *parameterSetPointer;
        int NALUnitHeaderLengthOut = 0;
        size_t parameterSetLength;
        CMVideoFormatDescriptionGetH264ParameterSetAtIndex(description,
                                                           i,
                                                           &parameterSetPointer,
                                                           &parameterSetLength,
                                                           NULL, &NALUnitHeaderLengthOut);

        // Write the parameter set to the elementary stream
        [elementaryStream appendBytes:startCode length:startCodeLength];
        [elementaryStream appendBytes:parameterSetPointer length:parameterSetLength];
    }

    // Get a pointer to the raw AVCC NAL unit data in the sample buffer
    size_t blockBufferLength;
    uint8_t *bufferDataPointer = NULL;
    size_t lengthAtOffset = 0;
    size_t bufferOffset = 0;
    CMBlockBufferGetDataPointer(CMSampleBufferGetDataBuffer(sampleBuffer),
                                bufferOffset,
                                &lengthAtOffset,
                                &blockBufferLength,
                                (char **)&bufferDataPointer);

    // Loop through all the NAL units in the block buffer
    // and write them to the elementary stream with
    // start codes instead of AVCC length headers
    while (bufferOffset < blockBufferLength - AVCCHeaderLength) {
        // Read the NAL unit length
        uint32_t NALUnitLength = 0;
        memcpy(&NALUnitLength, bufferDataPointer + bufferOffset, AVCCHeaderLength);

        // Convert the length value from big-endian to host byte order
        NALUnitLength = CFSwapInt32BigToHost(NALUnitLength);

        // Write start code to the elementary stream
        [elementaryStream appendBytes:startCode length:startCodeLength];

        // Write the NAL unit without the AVCC length header to the elementary stream
        [elementaryStream appendBytes:bufferDataPointer + bufferOffset + AVCCHeaderLength
                               length:NALUnitLength];

        // Move to the next NAL unit in the block buffer
        bufferOffset += AVCCHeaderLength + NALUnitLength;
    }

    char *bytePtr = (char *)[elementaryStream mutableBytes];
    long maxSize = (long)elementaryStream.length;

    CMTime presentationtime = CMSampleBufferGetPresentationTimeStamp(sampleBuffer);
    vidplayer_stream(bytePtr, maxSize, (long)presentationtime.value);
}
For OS X and iOS, I have streams of real-time encoded video (H.264) and audio (AAC) data coming in, and I want to be able to mux these together into an mp4.
I'm using an AVAssetWriter to perform the muxing.
I have video working, but my audio still sounds like jumbled static. Here's what I'm trying right now (skipping some of the error checks here for brevity):
I initialize the writer:
NSURL *url = [NSURL fileURLWithPath:mContext->filename];
NSError* err = nil;
mContext->writer = [AVAssetWriter assetWriterWithURL:url fileType:AVFileTypeMPEG4 error:&err];
I initialize the audio input:
NSDictionary* settings;
AudioChannelLayout acl;
bzero(&acl, sizeof(acl));
acl.mChannelLayoutTag = kAudioChannelLayoutTag_Stereo;
settings = nil; // set output to nil so it becomes a pass-through
CMAudioFormatDescriptionRef audioFormatDesc = nil;
{
    AudioStreamBasicDescription absd = {0};
    absd.mSampleRate = mParameters.audioSampleRate; // known sample rate
    absd.mFormatID = kAudioFormatMPEG4AAC;
    absd.mFormatFlags = kMPEG4Object_AAC_Main;
    CMAudioFormatDescriptionCreate(NULL, &absd, 0, NULL, 0, NULL, NULL, &audioFormatDesc);
}
mContext->aacWriterInput = [AVAssetWriterInput assetWriterInputWithMediaType:AVMediaTypeAudio outputSettings:settings sourceFormatHint:audioFormatDesc];
mContext->aacWriterInput.expectsMediaDataInRealTime = YES;
[mContext->writer addInput:mContext->aacWriterInput];
And start the writer:
[mContext->writer startWriting];
[mContext->writer startSessionAtSourceTime:kCMTimeZero];
Then, I have a callback where I receive a packet with a timestamp (milliseconds), and a std::vector<uint8_t> with the data containing 1024 compressed samples. I make sure isReadyForMoreMediaData is true. Then, if this is our first time receiving the callback, I set up the CMAudioFormatDescription:
OSStatus error = 0;
AudioStreamBasicDescription streamDesc = {0};
streamDesc.mSampleRate = mParameters.audioSampleRate;
streamDesc.mFormatID = kAudioFormatMPEG4AAC;
streamDesc.mFormatFlags = kMPEG4Object_AAC_Main;
streamDesc.mChannelsPerFrame = 2; // always stereo for us
streamDesc.mBitsPerChannel = 0;
streamDesc.mBytesPerFrame = 0;
streamDesc.mFramesPerPacket = 1024; // Our AAC packets contain 1024 samples per frame
streamDesc.mBytesPerPacket = 0;
streamDesc.mReserved = 0;
AudioChannelLayout acl;
bzero(&acl, sizeof(acl));
acl.mChannelLayoutTag = kAudioChannelLayoutTag_Stereo;
error = CMAudioFormatDescriptionCreate(kCFAllocatorDefault, &streamDesc, sizeof(acl), &acl, 0, NULL, NULL, &mContext->audioFormat);
And finally, I create a CMSampleBufferRef and send it along:
CMSampleBufferRef buffer = NULL;
CMBlockBufferRef blockBuffer;
CMBlockBufferCreateWithMemoryBlock(kCFAllocatorDefault, NULL, packet.data.size(), kCFAllocatorDefault, NULL, 0, packet.data.size(), kCMBlockBufferAssureMemoryNowFlag, &blockBuffer);
CMBlockBufferReplaceDataBytes((void*)packet.data.data(), blockBuffer, 0, packet.data.size());
CMTime duration = CMTimeMake(1024, mParameters.audioSampleRate);
CMTime pts = CMTimeMake(packet.timestamp, 1000);
CMSampleTimingInfo timing = {duration , pts, kCMTimeInvalid };
size_t sampleSizeArray[1] = {packet.data.size()};
error = CMSampleBufferCreate(kCFAllocatorDefault, blockBuffer, true, NULL, nullptr, mContext->audioFormat, 1, 1, &timing, 1, sampleSizeArray, &buffer);
// First input buffer must have an appropriate kCMSampleBufferAttachmentKey_TrimDurationAtStart since the codec has encoder delay
if (mContext->firstAudioFrame)
{
    CFDictionaryRef dict = NULL;
    dict = CMTimeCopyAsDictionary(CMTimeMake(1024, 44100), kCFAllocatorDefault);
    CMSetAttachment(buffer, kCMSampleBufferAttachmentKey_TrimDurationAtStart, dict, kCMAttachmentMode_ShouldNotPropagate);
    // we must trim the start time on the first audio frame...
    mContext->firstAudioFrame = false;
}
CMSampleBufferMakeDataReady(buffer);
BOOL ret = [mContext->aacWriterInput appendSampleBuffer:buffer];
I guess the part I'm most suspicious of is my call to CMSampleBufferCreate. It seems I have to pass in a sample sizes array, otherwise I get this error message immediately when checking my writer's status:
Error Domain=AVFoundationErrorDomain Code=-11800 "The operation could not be completed" UserInfo={NSLocalizedFailureReason=An unknown error occurred (-12735), NSLocalizedDescription=The operation could not be completed, NSUnderlyingError=0x604001e50770 {Error Domain=NSOSStatusErrorDomain Code=-12735 "(null)"}}
Where the underlying error appears to be kCMSampleBufferError_BufferHasNoSampleSizes.
I did notice an example in Apple's documentation for creating the buffer with AAC data:
https://developer.apple.com/documentation/coremedia/1489723-cmsamplebuffercreate?language=objc
In their example, they specify a long sampleSizeArray with an entry for every single sample. Is that necessary? I don't have that information with this callback. And in our Windows implementation we didn't need that data. So I tried sending in packet.data.size() as the sample size but that doesn't seem right and it certainly doesn't produce pleasant audio.
Any ideas? Either tweaks to my calls here or different APIs I should be using to mux together streams of encoded data.
Thanks!
If you don't want to transcode, do not pass an outputSettings dictionary. You should pass nil there:
mContext->aacWriterInput = [AVAssetWriterInput assetWriterInputWithMediaType:AVMediaTypeAudio outputSettings:nil sourceFormatHint:audioFormatDesc];
It is explained somewhere in this article:
https://developer.apple.com/library/archive/documentation/AudioVideo/Conceptual/AVFoundationPG/Articles/05_Export.html
I am trying to take a video file, read it in using AVAssetReader, and pass the audio off to Core Audio for processing (adding effects and stuff) before saving it back out to disk using AVAssetWriter. I would like to point out that if I set the componentSubType on the AudioComponentDescription of my output node to RemoteIO, things play correctly through the speakers. This makes me confident that my AUGraph is properly set up, as I can hear things working. I am setting the subType to GenericOutput, though, so I can do the rendering myself and get back the adjusted audio.
I am reading in the audio and I pass the CMSampleBufferRef off to copyBuffer, which puts the audio into a circular buffer that will be read later.
- (void)copyBuffer:(CMSampleBufferRef)buf {
    if (_readyForMoreBytes == NO)
    {
        return;
    }

    AudioBufferList abl;
    CMBlockBufferRef blockBuffer;
    CMSampleBufferGetAudioBufferListWithRetainedBlockBuffer(buf, NULL, &abl, sizeof(abl), NULL, NULL, kCMSampleBufferFlag_AudioBufferList_Assure16ByteAlignment, &blockBuffer);

    UInt32 size = (unsigned int)CMSampleBufferGetTotalSampleSize(buf);
    BOOL bytesCopied = TPCircularBufferProduceBytes(&circularBuffer, abl.mBuffers[0].mData, size);

    if (!bytesCopied){
        // The circular buffer is full; stash the data in a rescue buffer instead
        _readyForMoreBytes = NO;

        if (size > kRescueBufferSize){
            NSLog(@"Unable to allocate enough space for rescue buffer, dropping audio frame");
        } else {
            if (rescueBuffer == nil) {
                rescueBuffer = malloc(kRescueBufferSize);
            }

            rescueBufferSize = size;
            memcpy(rescueBuffer, abl.mBuffers[0].mData, size);
        }
    }

    CFRelease(blockBuffer);
    if (!self.hasBuffer && bytesCopied > 0)
    {
        self.hasBuffer = YES;
    }
}
Next I call processOutput. This does a manual render on the outputUnit. When AudioUnitRender is called it invokes the playbackCallback below, which is what is hooked up as the input callback on my first node. playbackCallback pulls the data off the circular buffer and feeds it into the audioBufferList passed in. Like I said before, if the output is set as RemoteIO this causes the audio to be played correctly on the speakers. When AudioUnitRender finishes, it returns noErr and the bufferList object contains valid data. When I call CMSampleBufferSetDataBufferFromAudioBufferList, though, I get kCMSampleBufferError_RequiredParameterMissing (-12731).
-(CMSampleBufferRef)processOutput
{
    if(self.offline == NO)
    {
        return NULL;
    }

    AudioUnitRenderActionFlags flags = 0;
    AudioTimeStamp inTimeStamp;
    memset(&inTimeStamp, 0, sizeof(AudioTimeStamp));
    inTimeStamp.mFlags = kAudioTimeStampSampleTimeValid;
    UInt32 busNumber = 0;
    UInt32 numberFrames = 512;
    inTimeStamp.mSampleTime = 0;
    UInt32 channelCount = 2;

    AudioBufferList *bufferList = (AudioBufferList*)malloc(sizeof(AudioBufferList)+sizeof(AudioBuffer)*(channelCount-1));
    bufferList->mNumberBuffers = channelCount;
    for (int j=0; j<channelCount; j++)
    {
        AudioBuffer buffer = {0};
        buffer.mNumberChannels = 1;
        buffer.mDataByteSize = numberFrames*sizeof(SInt32);
        buffer.mData = calloc(numberFrames,sizeof(SInt32));
        bufferList->mBuffers[j] = buffer;
    }

    CheckError(AudioUnitRender(outputUnit, &flags, &inTimeStamp, busNumber, numberFrames, bufferList), @"AudioUnitRender outputUnit");

    CMSampleBufferRef sampleBufferRef = NULL;
    CMFormatDescriptionRef format = NULL;
    CMSampleTimingInfo timing = { CMTimeMake(1, 44100), kCMTimeZero, kCMTimeInvalid };
    AudioStreamBasicDescription audioFormat = self.audioFormat;
    CheckError(CMAudioFormatDescriptionCreate(kCFAllocatorDefault, &audioFormat, 0, NULL, 0, NULL, NULL, &format), @"CMAudioFormatDescriptionCreate");
    CheckError(CMSampleBufferCreate(kCFAllocatorDefault, NULL, false, NULL, NULL, format, numberFrames, 1, &timing, 0, NULL, &sampleBufferRef), @"CMSampleBufferCreate");
    CheckError(CMSampleBufferSetDataBufferFromAudioBufferList(sampleBufferRef, kCFAllocatorDefault, kCFAllocatorDefault, 0, bufferList), @"CMSampleBufferSetDataBufferFromAudioBufferList");

    return sampleBufferRef;
}
static OSStatus playbackCallback(void *inRefCon,
                                 AudioUnitRenderActionFlags *ioActionFlags,
                                 const AudioTimeStamp *inTimeStamp,
                                 UInt32 inBusNumber,
                                 UInt32 inNumberFrames,
                                 AudioBufferList *ioData)
{
    int numberOfChannels = ioData->mBuffers[0].mNumberChannels;
    SInt16 *outSample = (SInt16 *)ioData->mBuffers[0].mData;

    // Zero the output buffer before filling it from the circular buffer
    memset(outSample, 0, ioData->mBuffers[0].mDataByteSize);

    MyAudioPlayer *p = (__bridge MyAudioPlayer *)inRefCon;

    if (p.hasBuffer){
        int32_t availableBytes;
        SInt16 *bufferTail = TPCircularBufferTail([p getBuffer], &availableBytes);

        int32_t requestedBytesSize = inNumberFrames * kUnitSize * numberOfChannels;

        int bytesToRead = MIN(availableBytes, requestedBytesSize);
        memcpy(outSample, bufferTail, bytesToRead);
        TPCircularBufferConsume([p getBuffer], bytesToRead);

        if (availableBytes <= requestedBytesSize*2){
            [p setReadyForMoreBytes];
        }

        if (availableBytes <= requestedBytesSize) {
            p.hasBuffer = NO;
        }
    }
    return noErr;
}
The CMSampleBufferRef I pass in looks valid (below is a dump of the object from the debugger)
CMSampleBuffer 0x7f87d2a03120 retainCount: 1 allocator: 0x103333180
invalid = NO
dataReady = NO
makeDataReadyCallback = 0x0
makeDataReadyRefcon = 0x0
formatDescription = <CMAudioFormatDescription 0x7f87d2a02b20 [0x103333180]> {
mediaType:'soun'
mediaSubType:'lpcm'
mediaSpecific: {
ASBD: {
mSampleRate: 44100.000000
mFormatID: 'lpcm'
mFormatFlags: 0xc2c
mBytesPerPacket: 2
mFramesPerPacket: 1
mBytesPerFrame: 2
mChannelsPerFrame: 1
mBitsPerChannel: 16 }
cookie: {(null)}
ACL: {(null)}
}
extensions: {(null)}
}
sbufToTrackReadiness = 0x0
numSamples = 512
sampleTimingArray[1] = {
{PTS = {0/1 = 0.000}, DTS = {INVALID}, duration = {1/44100 = 0.000}},
}
dataBuffer = 0x0
The buffer list looks like this
Printing description of bufferList:
(AudioBufferList *) bufferList = 0x00007f87d280b0a0
Printing description of bufferList->mNumberBuffers:
(UInt32) mNumberBuffers = 2
Printing description of bufferList->mBuffers:
(AudioBuffer [1]) mBuffers = {
[0] = (mNumberChannels = 1, mDataByteSize = 2048, mData = 0x00007f87d3008c00)
}
Really at a loss here, hoping someone can help. Thanks,
In case it matters, I am debugging this in the iOS 8.3 simulator, and the audio is coming from an mp4 that I shot on my iPhone 6 and then saved to my laptop.
I have read the following questions, but still to no avail; things are not working.
How to convert AudioBufferList to CMSampleBuffer?
Converting an AudioBufferList to a CMSampleBuffer Produces Unexpected Results
CMSampleBufferSetDataBufferFromAudioBufferList returning error 12731
core audio offline rendering GenericOutput
UPDATE
I poked around some more and noticed that my AudioBufferList, right before AudioUnitRender runs, looks like this:
bufferList->mNumberBuffers = 2,
bufferList->mBuffers[0].mNumberChannels = 1,
bufferList->mBuffers[0].mDataByteSize = 2048
mDataByteSize is numberFrames*sizeof(SInt32), which is 512 * 4. When I look at the AudioBufferList passed to playbackCallback, the list looks like this:
bufferList->mNumberBuffers = 1,
bufferList->mBuffers[0].mNumberChannels = 1,
bufferList->mBuffers[0].mDataByteSize = 1024
I'm not really sure where that other buffer is going, or where the other 1024 bytes went...
If, when I am finished calling AudioUnitRender, I do something like this
AudioBufferList newbuff;
newbuff.mNumberBuffers = 1;
newbuff.mBuffers[0] = bufferList->mBuffers[0];
newbuff.mBuffers[0].mDataByteSize = 1024;
and pass newbuff off to CMSampleBufferSetDataBufferFromAudioBufferList, the error goes away.
If I try setting the bufferList to have 1 mNumberBuffers, or set its mDataByteSize to numberFrames*sizeof(SInt16), I get a -50 when calling AudioUnitRender.
UPDATE 2
I hooked up a render callback so I can inspect the output when I play the sound over the speakers. I noticed that the output that goes to the speakers also has an AudioBufferList with 2 buffers, and that the mDataByteSize during the input callback is 1024, while in the render callback it's 2048, which is the same as I have been seeing when manually calling AudioUnitRender. When I inspect the data in the rendered AudioBufferList I notice that the bytes in the 2 buffers are the same, which means I can just ignore the second buffer. But I am not sure how to handle the fact that the data is 2048 in size after being rendered instead of the 1024 it's being taken in as. Any ideas on why that could be happening? Is it in more of a raw form after going through the audio graph, and that is why the size is doubling?
Sounds like the issue you're dealing with is because of a discrepancy in the number of channels. The reason you're seeing data in blocks of 2048 instead of 1024 is because it is feeding you back two channels (stereo). Check to make sure all of your audio units are properly configured to use mono throughout the entire audio graph, including the Pitch Unit and any audio format descriptions.
One thing to especially beware of is that calls to AudioUnitSetProperty can fail - so be sure to wrap those in CheckError() as well.
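As a minimal sketch of what that configuration might look like (the unit variable and bus number below are placeholders, and the ASBD assumes 44.1 kHz, 16-bit, packed signed-integer mono to match the format in the question's dump):

// Force a mono client format on one unit's input scope and check the result.
AudioStreamBasicDescription monoFormat = {0};
monoFormat.mSampleRate       = 44100.0;
monoFormat.mFormatID         = kAudioFormatLinearPCM;
monoFormat.mFormatFlags      = kAudioFormatFlagIsSignedInteger | kAudioFormatFlagIsPacked;
monoFormat.mChannelsPerFrame = 1;  // mono, not stereo
monoFormat.mBitsPerChannel   = 16;
monoFormat.mFramesPerPacket  = 1;
monoFormat.mBytesPerFrame    = 2;  // 16-bit mono
monoFormat.mBytesPerPacket   = 2;

CheckError(AudioUnitSetProperty(someUnit,
                                kAudioUnitProperty_StreamFormat,
                                kAudioUnitScope_Input,
                                0,  // bus
                                &monoFormat,
                                sizeof(monoFormat)),
           @"Set mono stream format on input scope");

Repeat that for every connection in the graph that is still negotiating stereo.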
I am working on H263 decompression with VideoToolbox, but when decoding a 4CIF video stream the output pixel data are all 0, and there is no error info.
I don't know why this happens, as a video stream with CIF resolution is decompressed correctly.
Has anyone had the same problem?
This is a piece of my code:
CMFormatDescriptionRef newFmtDesc = nil;
OSStatus status = CMVideoFormatDescriptionCreate(kCFAllocatorDefault,
                                                 kCMVideoCodecType_H263,
                                                 width,
                                                 height,
                                                 NULL,
                                                 &_videoFormatDescription);
if (status)
{
    return -1;
}

CFMutableDictionaryRef dpba = CFDictionaryCreateMutable(kCFAllocatorDefault,
                                                        2,
                                                        &kCFTypeDictionaryKeyCallBacks,
                                                        &kCFTypeDictionaryValueCallBacks);
CFDictionarySetValue(dpba,
                     kCVPixelBufferOpenGLCompatibilityKey,
                     kCFBooleanFalse);
VTDictionarySetInt32(dpba,
                     kCVPixelBufferPixelFormatTypeKey,
                     kCVPixelFormatType_420YpCbCr8BiPlanarVideoRange); // use NV12
VTDictionarySetInt32(dpba,
                     kCVPixelBufferWidthKey,
                     dimension.width);
VTDictionarySetInt32(dpba,
                     kCVPixelBufferHeightKey,
                     dimension.height);
VTDictionarySetInt32(dpba,
                     kCVPixelBufferBytesPerRowAlignmentKey,
                     dimension.width);

// Set up the decoder callback record
VTDecompressionOutputCallbackRecord decoderCallbackRecord;
decoderCallbackRecord.decompressionOutputCallback = onDecodeCallback;
decoderCallbackRecord.decompressionOutputRefCon = this;

// Create the decompression session
status = VTDecompressionSessionCreate(kCFAllocatorDefault,
                                      _videoFormatDescription,
                                      nil,
                                      dpba,
                                      &decoderCallbackRecord,
                                      &_session);

// Do the decode
CMSampleBufferRef sampleBuffer;
sampleBuffer = VTSampleBufferCreate(_videoFormatDescription, (void*)data_start, data_len, ts);

VTDecodeFrameFlags flags = 0;
VTDecodeInfoFlags flagOut = 0;
OSStatus decodeStatus = VTDecompressionSessionDecodeFrame(_session,
                                                          sampleBuffer,
                                                          flags,
                                                          nil,
                                                          &flagOut);
I also tried compressing H263 with VideoToolbox: I initialized the session with a resolution of 4CIF and pushed 4CIF NV12 images to the compression session, but the output H263 stream is in CIF resolution!
Can VideoToolbox not support 4CIF H263 video for either compression or decompression?
I'm trying to decode a raw stream of .H264 video data, but I can't find a way to create a proper CMFormatDescription for the decompression session.
- (void)decodeFrameWithNSData:(NSData*)data presentationTime:(CMTime)presentationTime
{
    @autoreleasepool {
        CMSampleBufferRef sampleBuffer = NULL;
        CMBlockBufferRef blockBuffer = NULL;
        VTDecodeInfoFlags infoFlags;
        int sourceFrame;

        if( dSessionRef == NULL )
            [self createDecompressionSession];

        CMSampleTimingInfo timingInfo;
        timingInfo.presentationTimeStamp = presentationTime;
        timingInfo.duration = CMTimeMake(1,100000000);
        timingInfo.decodeTimeStamp = kCMTimeInvalid;

        // Creates block buffer from NSData
        OSStatus status = CMBlockBufferCreateWithMemoryBlock(CFAllocatorGetDefault(), (void*)data.bytes, data.length*sizeof(char), CFAllocatorGetDefault(), NULL, 0, data.length*sizeof(char), 0, &blockBuffer);

        // Creates CMSampleBuffer to feed decompression session
        status = CMSampleBufferCreateReady(CFAllocatorGetDefault(), blockBuffer, self.encoderVideoFormat, 1, 1, &timingInfo, 0, 0, &sampleBuffer);

        status = VTDecompressionSessionDecodeFrame(dSessionRef, sampleBuffer, kVTDecodeFrame_1xRealTimePlayback, &sourceFrame, &infoFlags);

        if(status != noErr) {
            NSLog(@"Decode with data error %d", status);
        }
    }
}
At the end of the call I'm getting a -12911 error from VTDecompressionSessionDecodeFrame, which translates to kVTVideoDecoderMalfunctionErr. After reading this [post], it pointed me to making a VideoFormatDescriptor using CMVideoFormatDescriptionCreateFromH264ParameterSets. But how can I create a new VideoFormatDescription if I don't have the currentSps or currentPps? How can I get that information from my raw H.264 stream?
CMFormatDescriptionRef decoderFormatDescription;
const uint8_t* const parameterSetPointers[2] =
    { (const uint8_t*)[currentSps bytes], (const uint8_t*)[currentPps bytes] };
const size_t parameterSetSizes[2] =
    { [currentSps length], [currentPps length] };

status = CMVideoFormatDescriptionCreateFromH264ParameterSets(NULL,
                                                             2,
                                                             parameterSetPointers,
                                                             parameterSetSizes,
                                                             4,
                                                             &decoderFormatDescription);
Thanks in advance,
Marcos
[post] : Decoding H264 VideoToolkit API fails with Error -8971 in VTDecompressionSessionCreate
You MUST call CMVideoFormatDescriptionCreateFromH264ParameterSets first. The SPS/PPS may be stored/transmitted separately from the video stream, or they may come inline.
Note that for VTDecompressionSessionDecodeFrame your NALUs must be preceded with a size, and not a start code.
You can read more here:
Possible Locations for Sequence/Picture Parameter Set(s) for H.264 Stream
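If the SPS and PPS do come inline, a rough sketch of pulling them out of an Annex-B buffer could look like the following (the helper name and scanning logic are illustrative; a real parser also has to handle 3-byte start codes and emulation-prevention bytes):

// Hypothetical helper: find the first SPS (type 7) and PPS (type 8) NAL units
// in an Annex-B buffer and build a format description from them.
static CMFormatDescriptionRef CreateFormatDescriptionFromAnnexB(const uint8_t *buf, size_t len)
{
    NSData *sps = nil, *pps = nil;
    size_t i = 0;
    while (i + 4 < len && !(sps && pps)) {
        // Look for a 4-byte start code 00 00 00 01
        if (buf[i] == 0 && buf[i+1] == 0 && buf[i+2] == 0 && buf[i+3] == 1) {
            size_t nalStart = i + 4;
            size_t nalEnd = nalStart;
            // The NAL unit ends at the next start code (or the end of the buffer)
            while (nalEnd + 4 <= len &&
                   !(buf[nalEnd] == 0 && buf[nalEnd+1] == 0 && buf[nalEnd+2] == 0 && buf[nalEnd+3] == 1)) {
                nalEnd++;
            }
            if (nalEnd + 4 > len) nalEnd = len;

            uint8_t nalType = buf[nalStart] & 0x1F;
            NSData *nal = [NSData dataWithBytes:buf + nalStart length:nalEnd - nalStart];
            if (nalType == 7 && sps == nil) sps = nal;  // SPS
            if (nalType == 8 && pps == nil) pps = nal;  // PPS
            i = nalEnd;
        } else {
            i++;
        }
    }
    if (sps == nil || pps == nil) return NULL;

    const uint8_t *parameterSetPointers[2] = { (const uint8_t *)sps.bytes, (const uint8_t *)pps.bytes };
    const size_t parameterSetSizes[2] = { sps.length, pps.length };
    CMFormatDescriptionRef formatDescription = NULL;
    CMVideoFormatDescriptionCreateFromH264ParameterSets(kCFAllocatorDefault,
                                                        2,
                                                        parameterSetPointers,
                                                        parameterSetSizes,
                                                        4,  // AVCC NAL unit length field size
                                                        &formatDescription);
    return formatDescription;
}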