I've been getting a libyuv crash recently.
I've tried a lot of things, but nothing has helped.
Please help, or give me some ideas on how to fix this. Thanks!
I have an iOS project (Objective-C). One of its functions encodes a video stream.
My idea is
Step 1: Start a timer(20 FPS)
Step 2: Copy and get the bitmap data
Step 3: Convert the bitmap data to YUV I420 (libyuv)
Step 4: Encode it to H.264 (OpenH264)
Step 5: Send the H.264 data over RTSP
All of this runs in the foreground.
It works well for 3-4 hours.
BUT it always crashes after 4+ hours.
I checked the CPU (39%) and memory (140 MB); both are stable (no memory leak, no CPU spikes, etc.).
I have tried a lot of things with no luck (including adding try-catch to my project and checking the data size before this line runs).
I found that it runs longer if I lower the frame rate (20 FPS -> 15 FPS).
Do I need to add something after encoding each frame?
Could someone help me or give me some ideas about this? Thanks!
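For reference, a 20 FPS GCD timer like the one described in Step 1 can be set up roughly as in the sketch below. This is only an illustration; the queue name and the copyCurrentBitmapData helper are placeholders, not code from the project.
// Illustrative sketch: a 20 FPS GCD timer driving the encode step (Step 1).
dispatch_queue_t encodeQueue = dispatch_queue_create("encoder.queue", DISPATCH_QUEUE_SERIAL);
dispatch_source_t frameTimer = dispatch_source_create(DISPATCH_SOURCE_TYPE_TIMER, 0, 0, encodeQueue);
dispatch_source_set_timer(frameTimer,
                          dispatch_time(DISPATCH_TIME_NOW, 0),
                          NSEC_PER_SEC / 20,    // fire every 1/20 s
                          NSEC_PER_SEC / 100);  // small leeway
dispatch_source_set_event_handler(frameTimer, ^{
    // copyCurrentBitmapData is a placeholder for Step 2 (copy the ARGB bitmap)
    NSData *frame = [self copyCurrentBitmapData];
    [self processSDLFrame:frame];
});
dispatch_resume(frameTimer);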
// This function runs in a GCD timer
- (void)processSDLFrame:(NSData *)_frameData {
if (mH264EncoderPtr == NULL) {
[self initEncoder];
return;
}
int argbSize = mMapWidth * mMapHeight * 4;
NSData *frameData = [[NSData alloc] initWithData:_frameData];
if ([frameData length] == 0 || [frameData length] != argbSize) {
NSLog(#"Incorrect frame with size : %ld\n", [frameData length]);
return;
}
SFrameBSInfo info;
memset(&info, 0, sizeof (SFrameBSInfo));
SSourcePicture pic;
memset(&pic, 0, sizeof (SSourcePicture));
pic.iPicWidth = mMapWidth;
pic.iPicHeight = mMapHeight;
pic.uiTimeStamp = [[NSDate date] timeIntervalSince1970];
@try {
libyuv::ConvertToI420(
static_cast<const uint8 *>([frameData bytes]), // sample
argbSize, // sample_size
mDstY, // dst_y
mStrideY, // dst_stride_y
mDstU, // dst_u
mStrideU, // dst_stride_u
mDstV, // dst_v
mStrideV, // dst_stride_v
0, // crop_x
0, // crop_y
mMapWidth, // src_width
mMapHeight, // src_height
mMapWidth, // crop_width
mMapHeight, // crop_height
libyuv::kRotateNone, // rotation
libyuv::FOURCC_ARGB); // fourcc
} @catch (NSException *exception) {
NSLog(@"libyuv::ConvertToI420 - exception: %@", exception.reason);
return;
}
pic.iColorFormat = videoFormatI420;
pic.iStride[0] = mStrideY;
pic.iStride[1] = mStrideU;
pic.iStride[2] = mStrideV;
pic.pData[0] = mDstY;
pic.pData[1] = mDstU;
pic.pData[2] = mDstV;
if (mH264EncoderPtr == NULL) {
NSLog(#"OpenH264Manager - encoder not initialized");
return;
}
int rv = -1;
@try {
rv = mH264EncoderPtr->EncodeFrame(&pic, &info);
} @catch (NSException *exception) {
NSLog( @"NSException caught - mH264EncoderPtr->EncodeFrame" );
NSLog( @"Name: %@", exception.name);
NSLog( @"Reason: %@", exception.reason );
[self deinitEncoder];
return;
}
if (rv != cmResultSuccess) {
NSLog(#"OpenH264Manager - encode failed : %d", rv);
[self deinitEncoder];
return;
}
if (info.eFrameType == videoFrameTypeSkip) {
NSLog(#"OpenH264Manager - drop skipped frame");
return;
}
// handle buffer data
int size = 0;
int layerSize[MAX_LAYER_NUM_OF_FRAME] = { 0 };
for (int layer = 0; layer < info.iLayerNum; layer++) {
for (int i = 0; i < info.sLayerInfo[layer].iNalCount; i++) {
layerSize[layer] += info.sLayerInfo[layer].pNalLengthInByte[i];
}
size += layerSize[layer];
}
uint8 *output = (uint8 *)malloc(size);
size = 0;
for (int layer = 0; layer < info.iLayerNum; layer++) {
memcpy(output + size, info.sLayerInfo[layer].pBsBuf, layerSize[layer]);
size += layerSize[layer];
}
// alloc new buffer for streaming
NSData *newData = [NSData dataWithBytes:output length:size];
// Send the data with RTSP
sendData( newData );
// free output buffer data
free(output);
}
[Jan/08/2020 Update]
I reported this issue on the Google issue tracker:
https://bugs.chromium.org/p/libyuv/issues/detail?id=853
A Googler gave me the following feedback:
ARGBToI420 does no allocations. It's similar to a memcpy with a source, a destination, and a number of pixels to convert.
The most common issues with it are:
1. the destination buffer has been deallocated. Try adding validation that the YUV buffer is valid. Write to the first and last byte of each layer.
This often occurs on shutdown when threads don't shut down in the order you were hoping. A mutex to guard the memory could help.
2. the destination is an odd size and the allocator did not allocate enough memory. When allocating the UV planes, use (width + 1) / 2 for the width/stride and (height + 1) / 2 for the height of UV. Allocate stride * height bytes. You could also use an allocator that verifies there are no overreads or overwrites, or a sanitizer like asan / msan.
When screen casting, windows are usually a multiple of 2 pixels on Windows and Linux, but I have seen macOS use an odd pixel count.
As a test you could wrap the function with temporary buffers. Copy the ARGB to a temporary ARGB buffer.
Call ARGBToI420 to a temporary I420 buffer.
Copy the I420 result to the final I420 buffer.
That should give you a clue which buffer/function is failing.
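Following that advice, a defensive allocation of the destination planes could look like the sketch below. This is only an illustration: the member names (mDstY, mStrideY, ...) come from the code above, the rounding follows point 2, and the first/last-byte writes are the suggested validation.
// Sketch: allocate I420 planes with odd dimensions rounded up, per the feedback above.
int lumaWidth    = mMapWidth;
int lumaHeight   = mMapHeight;
int chromaWidth  = (mMapWidth + 1) / 2;
int chromaHeight = (mMapHeight + 1) / 2;
mStrideY = lumaWidth;
mStrideU = chromaWidth;
mStrideV = chromaWidth;
mDstY = (uint8 *)malloc(mStrideY * lumaHeight);
mDstU = (uint8 *)malloc(mStrideU * chromaHeight);
mDstV = (uint8 *)malloc(mStrideV * chromaHeight);
// Suggested validation: touch the first and last byte of each plane before
// converting; a crash here points at an invalid or undersized buffer.
mDstY[0] = mDstY[mStrideY * lumaHeight   - 1] = 0;
mDstU[0] = mDstU[mStrideU * chromaHeight - 1] = 0;
mDstV[0] = mDstV[mStrideV * chromaHeight - 1] = 0;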
I will try them.
Related
I am working on a project in which I have to store the data of an ADC stream on a µSD card. However, even though I use a 16-bit buffer, I lose data from the ADC stream. My ADC runs with DMA, and I use FatFs (WITHOUT DMA) and the SDMMC1 peripheral to fill a .bin file with the data.
Do you have any idea how to avoid this loss?
Here is my project : https://github.com/mathieuchene/STM32H743ZI
I use a NUCLEO-H743ZI2 board, CubeIDE, and CubeMX in their latest versions.
EDIT 1
I tried to implement Colin's solution; it's better, but I still get strange things in the middle of my acquisition. Also, when I increase the maximum count value or try to debug, the HardFault_Handler appears. I modified the main.c file by creating 2 blocks (uint16_t blockX[BUFFERLENGTH/2]) and 2 flags for when adcBuffer is half filled or completely filled.
I also changed the while(1) part of the main function like this:
if (flagHlfCplt){
//flagCplt=0;
res = f_write(&SDFile, block1, strlen((char*)block1), (void *)&byteswritten);
memcpy(block2, adcBuffer, BUFFERLENGTH/2);
flagHlfCplt = 0;
count++;
}
if (flagCplt){
//flagHlfCplt=0;
res = f_write(&SDFile, block2, strlen((char*)block2), (void *)&byteswritten);
memcpy(block1, adcBuffer[(BUFFERLENGTH/2)-1], BUFFERLENGTH/2);
flagCplt = 0;
count++;
}
if (count == 10){
f_close(&SDFile);
HAL_ADC_Stop_DMA(&hadc1);
while(1){
HAL_GPIO_TogglePin(LD1_GPIO_Port, LD1_Pin);
HAL_Delay(1000);
}
}
}
EDIT 2
I modified my program. I set block1 and block2 to a length of BUFFERLENGTH and added a pointer (*idx) to switch which buffer is being filled. I don't get the HardFault_Handler anymore, but I still lose some data from my ADC stream.
Here are the modifications I made:
// my pointer and buffers
uint16_t block1[BUFFERLENGTH], block2[BUFFERLENGTH], *idx;
// init of pointer and adc start
idx=block1;
HAL_ADC_Start_DMA(&hadc1, (uint32_t*)idx, BUFFERLENGTH);
// while(1) part
while (1)
{
if (flagCplt){
if (flagToChangeBuffer) {
idx=block1;
res = f_write(&SDFile, block2, strlen((char*)block2), (void *)&byteswritten);
flagCplt = 0;
flagToChangeBuffer=0;
count++;
}
else {
idx=block2;
res = f_write(&SDFile, block1, strlen((char*)block1), (void *)&byteswritten);
flagCplt = 0;
flagToChangeBuffer=1;
count++;
}
}
if (count == 150){
f_close(&SDFile);
HAL_ADC_Stop_DMA(&hadc1);
while(1){
HAL_GPIO_TogglePin(LD1_GPIO_Port, LD1_Pin);
HAL_Delay(1000);
}
}
}
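For comparison, here is a minimal self-contained sketch of the same double-buffer idea (names and sizes are illustrative only); it passes an explicit byte count to f_write rather than strlen(), since raw ADC samples can contain zero bytes that truncate a strlen-based length.
/* Sketch only: circular ADC DMA into adcBuffer, each half copied out in the
 * HAL callbacks and written with a fixed byte count in the main loop. */
uint16_t adcBuffer[BUFFERLENGTH];
uint16_t block1[BUFFERLENGTH / 2], block2[BUFFERLENGTH / 2];
volatile uint8_t flagHlfCplt = 0, flagCplt = 0;

void HAL_ADC_ConvHalfCpltCallback(ADC_HandleTypeDef *hadc)
{
    memcpy(block1, &adcBuffer[0], sizeof(block1));                /* first half  */
    flagHlfCplt = 1;
}

void HAL_ADC_ConvCpltCallback(ADC_HandleTypeDef *hadc)
{
    memcpy(block2, &adcBuffer[BUFFERLENGTH / 2], sizeof(block2)); /* second half */
    flagCplt = 1;
}

/* in the while(1) loop */
if (flagHlfCplt) {
    res = f_write(&SDFile, block1, sizeof(block1), (void *)&byteswritten);
    flagHlfCplt = 0;
}
if (flagCplt) {
    res = f_write(&SDFile, block2, sizeof(block2), (void *)&byteswritten);
    flagCplt = 0;
}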
Does anyone know how to solve this data loss?
Best Regards
Mathieu
Right now I'm investigating the possibility of implementing video streaming through the MultipeerConnectivity framework. For that purpose I'm using NSInputStream and NSOutputStream.
The problem is that I can't receive any picture so far. Right now I'm just trying to pass a simple picture and show it on the receiver. Here's a little snippet of my code:
Sending picture via NSOutputStream:
- (void)sendMessageToStream
{
NSData *imgData = UIImagePNGRepresentation(_testImage);
int img_length = (int)[imgData length];
NSMutableData *msgData = [[NSMutableData alloc] initWithBytes:&img_length length:sizeof(img_length)];
[msgData appendData:imgData];
int msg_length = (int)[msgData length];
uint8_t *readBytes = (uint8_t *)[msgData bytes];
uint8_t buf[msg_length];
(void)memcpy(buf, readBytes, msg_length);
int stream_len = [_stream writeData:(uint8_t*)buf maxLength:msg_length];
//int stream_len = [_stream writeData:(uint8_t *)buf maxLength:data_length];
//NSLog(#"stream_len = %d", stream_len);
_tmpCounter++;
dispatch_async(dispatch_get_main_queue(), ^{
_lblOperationsCounter.text = [NSString stringWithFormat:#"Sent: %ld", (long)_tmpCounter];
});
}
The code above works totally fine. After writing, stream_len equals 29627 bytes, which is the expected value because the image's size is around 25-26 KB.
Receiving the picture via NSInputStream:
- (void)readDataFromStream
{
UInt32 length;
if (_currentFrameSize == 0) {
uint8_t frameSize[4];
length = [_stream readData:frameSize maxLength:sizeof(int)];
unsigned int b = frameSize[3];
b <<= 8;
b |= frameSize[2];
b <<= 8;
b |= frameSize[1];
b <<= 8;
b |= frameSize[0];
_currentFrameSize = b;
}
uint8_t bytes[1024];
length = [_stream readData:bytes maxLength:1024];
[_frameData appendBytes:bytes length:length];
if ([_frameData length] >= _currentFrameSize) {
UIImage *img = [UIImage imageWithData:_frameData];
NSLog(#"SETUP IMAGE!");
_imgView.image = img;
_currentFrameSize = 0;
[_frameData setLength:0];
}
_tmpCounter++;
dispatch_async(dispatch_get_main_queue(), ^{
_lblOperationsCounter.text = [NSString stringWithFormat:#"Received: %ld", (long)_tmpCounter];
});
}
As you can see, I'm trying to receive the picture in several steps, and here's why: when I try to read data from the stream, it always reads a maximum of 1095 bytes, no matter what number I put in the maxLength: parameter. But when I send the picture in the first snippet of code, it sends absolutely fine (29627 bytes). By the way, the image's size is around 29 KB.
That's where my question comes up: why is that? Why does sending 29 KB via NSOutputStream work totally fine while receiving causes problems? And is there a solid way to make video streaming work through NSInputStream and NSOutputStream? I just didn't find much information about this; all I found were some simple things I already knew.
Here's an app I wrote that shows you how:
https://app.box.com/s/94dcm9qjk8giuar08305qspdbe0pc784
Build the project with Xcode 9 and run the app on two iOS 11 devices.
To stream live video, touch the Camera icon on one of two devices.
If you don't have two devices, you can run one app in the Simulator; however, you can only use the camera on the real device (the Simulator will display the video broadcasted).
Just so you know: this is not the ideal way to stream real-time video between devices (it should probably be your last choice). Data packets (versus streaming) are way more efficient and faster.
Regardless, I'm really confused by your NSInputStream-related code. Here's something that makes a little more sense, I think:
case NSStreamEventHasBytesAvailable: {
// len is a global variable set to a non-zero value;
// mdata is a NSMutableData object that is reset when a new input
// stream is created.
// displayImage is a block that accepts the image data and a reference
// to the layer on which the image will be rendered
uint8_t buf[len];
len = [aStream read:buf maxLength:len];
if (len > 0) {
[mdata appendBytes:(const void *)buf length:len];
} else {
displayImage(mdata, wLayer);
}
break;
}
The output stream code should look something like this:
// data is an NSData object that contains the image data from the video
// camera;
// len is a global variable set to a non-zero value
// byteIndex is a global variable set to zero each time a new output
// stream is created
if (data.length > 0 && len >= 0 && (byteIndex <= data.length)) {
len = (data.length - byteIndex) < DATA_LENGTH ? (data.length - byteIndex) : DATA_LENGTH;
uint8_t bytes[len];
[data getBytes:bytes range:NSMakeRange(byteIndex, len)];
byteIndex += [oStream write:(const uint8_t *)bytes maxLength:len];
}
There's a lot more to streaming video than setting up the NSStream classes correctly—a lot more. You'll notice in my app, I created a cache for the input and output streams. This solved a myriad of issues that you would likely encounter if you don't do the same.
I have never seen anyone successfully use NSStreams for video streaming... ever. For one thing, it's highly complex.
There are many different (and better) ways to stream video; I wouldn't go this route. I just took it on because no one else has been able to do it successfully.
I think the problem is your assumption that all of the data will be available in the NSInputStream the whole time you are reading it. An NSInputStream created from an NSURL object has an asynchronous nature and should be accessed accordingly using an NSStreamDelegate. You can look at the example in the README of POSInputStreamLibrary.
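To illustrate, a minimal delegate-driven read loop for the length-prefixed framing used in the question might look like the sketch below. This is only a sketch: _frameData, _currentFrameSize, and _imgView are the placeholder ivars from the question, and the 4-byte prefix is assumed to be in native (little-endian) byte order, as written by the sender.
// Sketch: accumulate bytes as they arrive and cut frames on the 4-byte length prefix.
- (void)stream:(NSStream *)aStream handleEvent:(NSStreamEvent)eventCode
{
    if (eventCode != NSStreamEventHasBytesAvailable) {
        return;
    }
    uint8_t chunk[1024];
    NSInteger read = [(NSInputStream *)aStream read:chunk maxLength:sizeof(chunk)];
    if (read > 0) {
        [_frameData appendBytes:chunk length:read];
    }
    // Parse the 4-byte length prefix once enough bytes have arrived.
    if (_currentFrameSize == 0 && [_frameData length] >= sizeof(uint32_t)) {
        uint32_t frameSize = 0;
        [_frameData getBytes:&frameSize length:sizeof(frameSize)];
        _currentFrameSize = frameSize;
        [_frameData replaceBytesInRange:NSMakeRange(0, sizeof(frameSize)) withBytes:NULL length:0];
    }
    // Once the whole frame is buffered, decode it and reset for the next one.
    if (_currentFrameSize > 0 && [_frameData length] >= _currentFrameSize) {
        NSData *frame = [_frameData subdataWithRange:NSMakeRange(0, _currentFrameSize)];
        UIImage *img = [UIImage imageWithData:frame];
        dispatch_async(dispatch_get_main_queue(), ^{ _imgView.image = img; });
        [_frameData replaceBytesInRange:NSMakeRange(0, _currentFrameSize) withBytes:NULL length:0];
        _currentFrameSize = 0;
    }
}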
I am trying to read an audio file (that is not supported by iOS) with ffmpeg and then play it using AVAudioPlayer. It took me a while to get ffmpeg built inside an iOS project, but I finally did using kewlbear/FFmpeg-iOS-build-script.
This is the snippet I have right now, after a lot of searching on the web, including stackoverflow. One of the best examples I found was here.
I believe this is all the relevant code. I added comments to let you know what I'm doing and where I need something clever to happen.
#import "FFmpegWrapper.h"
#import <AVFoundation/AVFoundation.h>
AVFormatContext *formatContext = NULL;
AVStream *audioStream = NULL;
av_register_all();
avformat_network_init();
avcodec_register_all();
// this is a file locacted on my NAS
int opened = avformat_open_input(&formatContext, "http://192.168.1.70:50002/m/NDLNA/43729.flac", NULL, NULL);
// can't open file (avformat_open_input returns 0 on success, a negative AVERROR on failure)
if (opened < 0) {
avformat_close_input(&formatContext);
}
int streamInfoValue = avformat_find_stream_info(formatContext, NULL);
// can't open stream
if (streamInfoValue < 0)
{
avformat_close_input(&formatContext);
}
// number of streams available
int inputStreamCount = formatContext->nb_streams;
for(unsigned int i = 0; i<inputStreamCount; i++)
{
// I'm only interested in the audio stream
if(formatContext->streams[i]->codec->codec_type == AVMEDIA_TYPE_AUDIO)
{
// found audio stream
audioStream = formatContext->streams[i];
}
}
if(audioStream == NULL) {
// no audio stream
}
AVFrame* frame = av_frame_alloc();
AVCodecContext* codecContext = audioStream->codec;
codecContext->codec = avcodec_find_decoder(codecContext->codec_id);
if (codecContext->codec == NULL)
{
av_free(frame);
avformat_close_input(&formatContext);
// no proper codec found
}
else if (avcodec_open2(codecContext, codecContext->codec, NULL) != 0)
{
av_free(frame);
avformat_close_input(&formatContext);
// could not open the context with the decoder
}
// this is displaying: This stream has 2 channels and a sample rate of 44100Hz
// which makes sense
NSLog(#"This stream has %d channels and a sample rate of %dHz", codecContext->channels, codecContext->sample_rate);
AVPacket packet;
av_init_packet(&packet);
// this is where I try to store in the sound data
NSMutableData *soundData = [[NSMutableData alloc] init];
while (av_read_frame(formatContext, &packet) == 0)
{
if (packet.stream_index == audioStream->index)
{
// Try to decode the packet into a frame
int frameFinished = 0;
avcodec_decode_audio4(codecContext, frame, &frameFinished, &packet);
// Some frames rely on multiple packets, so we have to make sure the frame is finished before
// we can use it
if (frameFinished)
{
// this is where I think something clever needs to be done
// I need to store some bytes, but I can't figure out what exactly and what length?
// should the length be multiplied by the of the number of channels?
NSData *frameData = [[NSData alloc] initWithBytes:packet.buf->data length:packet.buf->size];
[soundData appendData: frameData];
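// Note (untested sketch, not from the original post): the decoded PCM actually
// lives in frame->data rather than packet.buf, and its size can be computed with
// av_samples_get_buffer_size(). For packed sample formats, something like:
//     int dataSize = av_samples_get_buffer_size(NULL, codecContext->channels,
//                                               frame->nb_samples,
//                                               codecContext->sample_fmt, 1);
//     if (dataSize >= 0)
//         [soundData appendBytes:frame->data[0] length:dataSize];
// Planar formats (e.g. AV_SAMPLE_FMT_FLTP) keep each channel in a separate
// frame->data[ch] plane and would need to be interleaved first.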
}
}
// You *must* call av_free_packet() after each call to av_read_frame() or else you'll leak memory
av_free_packet(&packet);
}
// first try to write it to a file, see if that works
// this is indeed writing bytes, but it is unplayable
[soundData writeToFile:@"output.wav" atomically:YES];
NSError *error;
// this is my final goal, playing it with the AVAudioPlayer, but this is giving unclear errors
AVAudioPlayer *player = [[AVAudioPlayer alloc] initWithData:soundData error:&error];
if(player == nil) {
NSLog(@"%@", error.description); // Domain=NSOSStatusErrorDomain Code=1954115647 "(null)"
} else {
[player prepareToPlay];
[player play];
}
// Some codecs will cause frames to be buffered up in the decoding process. If the CODEC_CAP_DELAY flag
// is set, there can be buffered up frames that need to be flushed, so we'll do that
if (codecContext->codec->capabilities & CODEC_CAP_DELAY)
{
av_init_packet(&packet);
// Decode all the remaining frames in the buffer, until the end is reached
int frameFinished = 0;
while (avcodec_decode_audio4(codecContext, frame, &frameFinished, &packet) >= 0 && frameFinished)
{
}
}
av_free(frame);
avcodec_close(codecContext);
avformat_close_input(&formatContext);
I never really found a solution to this specific problem, but I ended up using ap4y/OrigamiEngine instead.
The main reason I wanted to use FFmpeg was to play unsupported audio files (FLAC/OGG) on iOS and tvOS, and OrigamiEngine does the job just fine.
I am working on an iOS app to display an H.264 video stream with AAC audio.
The stream I have is a custom stream that does not use HLS or RTSP/RTMP, so I have my own code to handle the receiving of data.
The data I receive comes in two parts: header data and frame data (for both audio and video). I would like to support iOS 6+, but will adapt if necessary.
My initial idea was to convert my frame data from a byte array to a UIImage and then continuously update a UIImageView with new frames. The problem with this is that the frames still need to be decoded first.
I looked at ffmpeg, but all the examples I have seen need either a URL or a local file, which doesn't work for me. And I read that there might be some licensing problems when using ffmpeg.
I also looked at openh264. I think that might be an option, but since I am developing for iOS, I will still run into those licensing issues.
Edit:
I managed to get this implemented on iOS 8+ using VideoToolbox and the provided sample.
My problem with that was that I was receiving more data from my stream than in the example.
I am still looking for a way to do this on iOS 6 and 7.
So my question is how should I handle the decoding and displaying of my frames?
I eventually got this working with FFmpeg and without using the GPL license.
This is how I set it up:
I downloaded the FFmpeg iOS libraries from SourceForge. (You can also build them from scratch by downloading the build script from: https://github.com/kewlbear/FFmpeg-iOS-build-script )
In code I added a check to see which OS version I am on:
uint8_t *data = (unsigned char*)buf;
float version = [[[UIDevice currentDevice] systemVersion] floatValue];
if (version >= 8.0)
{
[self receivedRawVideoFrame:data withSize:ret ];
}
else if (version >= 6.0 && version < 8.0)
{
[self altDecodeFrame:data withSize:ret isConfigured:configured];
}
You can find the implementation for the VideoToolbox part here.
- (void)altDecodeFrame:(uint8_t *)frame_bytes withSize:(int) frameSize isConfigured:(Boolean) configured
{
if (!configured) {
uint8_t *header = NULL;
// I know what my H.264 data source's NALUs look like so I know start code index is always 0.
// if you don't know where it starts, you can use a for loop similar to how i find the 2nd and 3rd start codes
int startCodeIndex = 0;
int secondStartCodeIndex = 0;
int thirdStartCodeIndex = 0;
int fourthStartCodeIndex = 0;
int nalu_type = (frame_bytes[startCodeIndex + 4] & 0x1F);
// NALU type 7 is the SPS parameter NALU
if (nalu_type == 7)
{
// find where the second PPS start code begins, (the 0x00 00 00 01 code)
// from which we also get the length of the first SPS code
for (int i = startCodeIndex + 4; i < startCodeIndex + 40; i++)
{
if (frame_bytes[i] == 0x00 && frame_bytes[i+1] == 0x00 && frame_bytes[i+2] == 0x00 && frame_bytes[i+3] == 0x01)
{
secondStartCodeIndex = i;
_spsSize = secondStartCodeIndex; // includes the header in the size
break;
}
}
// find what the second NALU type is
nalu_type = (frame_bytes[secondStartCodeIndex + 4] & 0x1F);
}
// type 8 is the PPS parameter NALU
if(nalu_type == 8)
{
// find where the NALU after this one starts so we know how long the PPS parameter is
for (int i = _spsSize + 4; i < _spsSize + 30; i++)
{
if (frame_bytes[i] == 0x00 && frame_bytes[i+1] == 0x00 && frame_bytes[i+2] == 0x00 && frame_bytes[i+3] == 0x01)
{
thirdStartCodeIndex = i;
_ppsSize = thirdStartCodeIndex - _spsSize;
break;
}
}
// allocate enough data to fit the SPS and PPS parameters into our data object.
header = malloc(_ppsSize + _spsSize);
// copy in the actual sps and pps values, again ignoring the 4 byte header
memcpy (header, &frame_bytes[0], _ppsSize + _spsSize);
NSLog(#"refresh codec context");
avcodec_close(instance.codec_context);
int result;
// I know I have an H264 stream, so that is the codex I look for
AVCodec *codec = avcodec_find_decoder(AV_CODEC_ID_H264);
self.codec_context = avcodec_alloc_context3(codec);
//open codec
result = avcodec_open2(self.codec_context, codec,NULL);
if (result < 0) {
NSLog(#"avcodec_open2 returned %i", result);
}
if (header != NULL) {
//set the extra data for decoding
//(copied into an av_malloc'd, padded buffer so the codec context owns memory it can free safely)
self.codec_context->extradata = (uint8_t *)av_mallocz(_spsSize + _ppsSize + FF_INPUT_BUFFER_PADDING_SIZE);
memcpy(self.codec_context->extradata, header, _spsSize + _ppsSize);
self.codec_context->extradata_size = _spsSize+_ppsSize;
self.codec_context->flags |= CODEC_FLAG_GLOBAL_HEADER;
free(header);
}
// allocate the picture data.
// My frame data is in PIX_FMT_YUV420P format, but I will be converting that later on.
avpicture_alloc(&_pictureData, PIX_FMT_RGB24, 1280, 720);
// After my SPS and PPS data I receive a SEI NALU
nalu_type = (frame_bytes[thirdStartCodeIndex + 4] & 0x1F);
}
if(nalu_type == 6)
{
for (int i = _spsSize +_ppsSize + 4; i < _spsSize +_ppsSize + 30; i++)
{
if (frame_bytes[i] == 0x00 && frame_bytes[i+1] == 0x00 && frame_bytes[i+2] == 0x00 && frame_bytes[i+3] == 0x01)
{
fourthStartCodeIndex = i;
_seiSize = fourthStartCodeIndex - (_spsSize + _ppsSize);
break;
}
}
// do stuff here
// [...]
nalu_type = (frame_bytes[fourthStartCodeIndex + 4] & 0x1F);
}
}
//I had some issues with a large build up of memory, so I created an autoreleasepool
@autoreleasepool {
_frm = av_frame_alloc();
int result;
//fill the packet with the frame data
av_init_packet(&_pkt);
_pkt.data = frame_bytes;
_pkt.size = frameSize;
_pkt.flags = AV_PKT_FLAG_KEY;
int got_packet;
//Decode the frames
result = avcodec_decode_video2(self.codec_context, _frm, &got_packet, &_pkt);
if (result < 0) {
NSLog(#"avcodec_decode_video2 returned %i", result);
}
if (_frm == NULL) {
return;
}
else
{
//Here we will convert from YUV420P to RGB24
static int sws_flags = SWS_FAST_BILINEAR;
struct SwsContext *img_convert_ctx = sws_getContext(self.codec_context->width, self.codec_context->height, self.codec_context->pix_fmt, 1280, 720, PIX_FMT_RGB24, sws_flags, NULL, NULL, NULL);
sws_scale(img_convert_ctx, (const uint8_t* const*)_frm->data, _frm->linesize, 0, _frm->height, _pictureData.data, _pictureData.linesize);
sws_freeContext(img_convert_ctx);
self.lastImage = [self imageFromAVPicture:_pictureData width:_frm->width height:_frm->height];
av_frame_unref(_frm);
}
if (!self.lastImage) {
return;
}
//Normally we render on the AVSampleBufferDisplayLayer, so hide that.
//Add a UIImageView and display the image there.
dispatch_sync(dispatch_get_main_queue(), ^{
if (![[[self viewController] avSbdLayer] isHidden]) {
[[[self viewController] avSbdLayer] setHidden:true];
self.imageView = [[UIImageView alloc] initWithFrame:[[[self viewController] view] bounds]] ;
[[[self viewController] view] addSubview: self.imageView];
}
[[self imageView] setImage: self.lastImage];
});
// Free the allocated data
av_free_packet(&_pkt);
av_frame_free(&_frm);
av_free(_frm);
// free(bckgrnd);
}
}
And this is how I made a UIImage from an AVPicture
-(UIImage *)imageFromAVPicture:(AVPicture)pict width:(int)width height:(int)height {
CGBitmapInfo bitmapInfo = kCGBitmapByteOrderDefault;
CFDataRef data = CFDataCreateWithBytesNoCopy(kCFAllocatorDefault, pict.data[0], pict.linesize[0]*height,kCFAllocatorNull);
CGDataProviderRef provider = CGDataProviderCreateWithCFData(data);
CGColorSpaceRef colorSpace = CGColorSpaceCreateDeviceRGB();
CGImageRef cgImage = CGImageCreate(width,
height,
8,
24,
pict.linesize[0],
colorSpace,
bitmapInfo,
provider,
NULL,
NO,
kCGRenderingIntentDefault);
CGColorSpaceRelease(colorSpace);
UIImage *image = [UIImage imageWithCGImage:cgImage];
CGImageRelease(cgImage);
CGDataProviderRelease(provider);
CFRelease(data);
return image;
}
If someone has another (or better) solution, please let me know.
I have a project where I need to decode h264 video from a live network stream and eventually end up with a texture I can display in another framework (Unity3D) on iOS devices. I can successfully decode the video using VTDecompressionSession and then grab the texture with CVMetalTextureCacheCreateTextureFromImage (or the OpenGL variant). It works great when I use a low-latency encoder and the image buffers come out in display order; however, when I use the regular encoder the image buffers do not come out in display order, and reordering the image buffers is apparently far more difficult than I expected.
The first attempt was to set the VTDecodeFrameFlags with kVTDecodeFrame_EnableAsynchronousDecompression and kVTDecodeFrame_EnableTemporalProcessing... However, it turns out that VTDecompressionSession can choose to ignore the flag and do whatever it wants... and in my case, it chooses to ignore the flag and still outputs the buffer in encoder order (not display order). Essentially useless.
The next attempt was to associate the image buffers with the presentation time stamp and then throw them into a vector which would allow me to grab the image buffer I needed when I create the texture. The problem seems to be that the image buffer that goes into the VTDecompressionSession, which is associated with a time stamp, is no longer the same buffer that comes out, essentially making the time stamp useless.
For example, going into the decoder...
VTDecodeFrameFlags flags = kVTDecodeFrame_EnableAsynchronousDecompression;
VTDecodeInfoFlags flagOut;
// Presentation time stamp to be passed with the buffer
NSNumber *nsPts = [NSNumber numberWithDouble:pts];
VTDecompressionSessionDecodeFrame(_decompressionSession, sampleBuffer, flags,
(void*)CFBridgingRetain(nsPts), &flagOut);
On the callback side...
void decompressionSessionDecodeFrameCallback(void *decompressionOutputRefCon, void *sourceFrameRefCon, OSStatus status, VTDecodeInfoFlags infoFlags, CVImageBufferRef imageBuffer, CMTime presentationTimeStamp, CMTime presentationDuration)
{
// The presentation time stamp...
// No longer seems to be associated with the buffer that it went in with!
NSNumber* pts = CFBridgingRelease(sourceFrameRefCon);
}
When ordered, the time stamps on the callback side increase monotonically at the expected rate, but the buffers are not in the right order. Does anyone see where I am making an error here? Or know how to determine the order of the buffers on the callback side? At this point I have tried just about everything I can think of.
In my case, the problem wasn't with VTDecompressionSession; it was the demuxer getting the wrong PTS. While I couldn't get VTDecompressionSession to put out the frames in temporal (display) order with the kVTDecodeFrame_EnableAsynchronousDecompression and kVTDecodeFrame_EnableTemporalProcessing flags, I could sort the frames myself based on PTS with a small vector.
First, make sure you associate all of your timing information with your CMSampleBuffer along with the block buffer so you receive it in the VTDecompressionSession callback.
// Wrap our CMBlockBuffer in a CMSampleBuffer...
CMSampleBufferRef sampleBuffer;
CMTime duration = ...;
CMTime presentationTimeStamp = ...;
CMTime decompressTimeStamp = ...;
CMSampleTimingInfo timingInfo{duration, presentationTimeStamp, decompressTimeStamp};
_sampleTimingArray[0] = timingInfo;
_sampleSizeArray[0] = nalLength;
// Wrap the CMBlockBuffer...
status = CMSampleBufferCreate(kCFAllocatorDefault, blockBuffer, true, NULL, NULL, _formatDescription, 1, 1, _sampleTimingArray, 1, _sampleSizeArray, &sampleBuffer);
Then, decode the frame... It is worth trying to get the frames out in display order with the flags.
VTDecodeFrameFlags flags = kVTDecodeFrame_EnableAsynchronousDecompression | kVTDecodeFrame_EnableTemporalProcessing;
VTDecodeInfoFlags flagOut;
VTDecompressionSessionDecodeFrame(_decompressionSession, sampleBuffer, flags,
(void*)CFBridgingRetain(NULL), &flagOut);
On the callback side of things, we need a way of sorting the CVImageBufferRefs we receive. I use a struct that contains the CVImageBufferRef and the PTS. Then a vector with a size of two that will do the actual sorting.
struct Buffer
{
CVImageBufferRef imageBuffer = NULL;
double pts = 0;
};
std::vector <Buffer> _buffer;
We also need a way to sort the Buffers. Always writing to and reading from the index with the lowest PTS works well.
-(int) getMinIndex
{
if(_buffer[0].pts > _buffer[1].pts)
{
return 1;
}
return 0;
}
In the callback, we need to fill the vector with Buffers...
void decompressionSessionDecodeFrameCallback(void *decompressionOutputRefCon, void *sourceFrameRefCon, OSStatus status, VTDecodeInfoFlags infoFlags, CVImageBufferRef imageBuffer, CMTime presentationTimeStamp, CMTime presentationDuration)
{
StreamManager *streamManager = (__bridge StreamManager *)decompressionOutputRefCon;
@synchronized(streamManager)
{
if (status != noErr)
{
NSError *error = [NSError errorWithDomain:NSOSStatusErrorDomain code:status userInfo:nil];
NSLog(#"Decompressed error: %#", error);
}
else
{
// Get the PTS
double pts = CMTimeGetSeconds(presentationTimeStamp);
// Fill our buffer initially
if(!streamManager->_bufferReady)
{
Buffer buffer;
buffer.pts = pts;
buffer.imageBuffer = imageBuffer;
CVBufferRetain(buffer.imageBuffer);
streamManager->_buffer[streamManager->_bufferIndex++] = buffer;
}
else
{
// Push new buffers to the index with the lowest PTS
int index = [streamManager getMinIndex];
// Release the old CVImageBufferRef
CVBufferRelease(streamManager->_buffer[index].imageBuffer);
Buffer buffer;
buffer.pts = pts;
buffer.imageBuffer = imageBuffer;
// Retain the new CVImageBufferRef
CVBufferRetain(buffer.imageBuffer);
streamManager->_buffer[index] = buffer;
}
// Wrap around the buffer when initialized
// _bufferWindow = 2
if(streamManager->_bufferIndex == streamManager->_bufferWindow)
{
streamManager->_bufferReady = YES;
streamManager->_bufferIndex = 0;
}
}
}
}
Finally we need to drain the Buffers in temporal (display) order...
- (void)drainBuffer
{
@synchronized(self)
{
if(_bufferReady)
{
// Drain buffers from the index with the lowest PTS
int index = [self getMinIndex];
Buffer buffer = _buffer[index];
// Do something useful with the buffer now in display order
}
}
}
I would like to improve upon that answer a bit. While the outlined solution works, it requires knowledge of the number of frames needed to produce an output frame. The example uses a buffer size of 2, but in my case I needed a buffer size of 3.
To avoid having to specify this in advance, one can make use of the fact that frames (in display order) align exactly in terms of pts/duration, i.e. the end of one frame is exactly the beginning of the next. Thus one can simply accumulate frames until there is no "gap" at the beginning, then pop the first frame, and so on. One can also take the pts of the first frame (which is always an I-frame) as the initial "head" (as it does not have to be zero...).
Here is some code that does this:
#include <cassert>
#include <CoreMedia/CMTime.h>
#include <CoreVideo/CVImageBuffer.h>
#include <boost/container/flat_set.hpp>
#include <boost/optional.hpp>
inline bool operator<(const CMTime& left, const CMTime& right)
{
return CMTimeCompare(left, right) < 0;
}
inline bool operator==(const CMTime& left, const CMTime& right)
{
return CMTimeCompare(left, right) == 0;
}
inline CMTime operator+(const CMTime& left, const CMTime& right)
{
return CMTimeAdd(left, right);
}
class reorder_buffer_t
{
public:
struct entry_t
{
// CFGuard is an RAII retain/release wrapper (not shown here) around the CVImageBufferRef
CFGuard<CVImageBufferRef> image;
CMTime pts;
CMTime duration;
bool operator<(const entry_t& other) const
{
return pts < other.pts;
}
};
private:
typedef boost::container::flat_set<entry_t> buffer_t;
public:
reorder_buffer_t()
{
}
void push(entry_t entry)
{
if (!_head)
_head = entry.pts;
_buffer.insert(std::move(entry));
}
bool empty() const
{
return _buffer.empty();
}
bool ready() const
{
return !empty() && _buffer.begin()->pts == _head;
}
entry_t pop()
{
assert(ready());
auto entry = *_buffer.begin();
_buffer.erase(_buffer.begin());
_head = entry.pts + entry.duration;
return entry;
}
void clear()
{
_buffer.clear();
_head = boost::none;
}
private:
boost::optional<CMTime> _head;
buffer_t _buffer;
};
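A hypothetical usage sketch follows, assuming CFGuard (not shown above) can be constructed from a raw CVImageBufferRef and retains/releases it:
// Sketch: feed decoded frames into the reorder buffer from the
// VTDecompressionSession output callback and pop them in display order.
static reorder_buffer_t s_reorderBuffer;

void decompressionOutputCallback(void *decompressionOutputRefCon, void *sourceFrameRefCon,
                                 OSStatus status, VTDecodeInfoFlags infoFlags,
                                 CVImageBufferRef imageBuffer,
                                 CMTime presentationTimeStamp, CMTime presentationDuration)
{
    if (status != noErr || imageBuffer == NULL)
        return;
    s_reorderBuffer.push({ CFGuard<CVImageBufferRef>(imageBuffer),
                           presentationTimeStamp, presentationDuration });
    // Emit every frame whose pts lines up with the current head of the timeline.
    while (s_reorderBuffer.ready())
    {
        reorder_buffer_t::entry_t entry = s_reorderBuffer.pop();
        // ... hand entry.image off for rendering, now in display order ...
    }
}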
Here's a solution that works with any required buffer size, and also does not need any 3rd party libraries. My C++ code might not be the best, but it works.
We create a Buffer struct to identify the buffers by pts:
struct Buffer
{
CVImageBufferRef imageBuffer = NULL;
uint64_t pts = 0;
};
In our decoder, we need to keep track of the buffers, and what pts we want to release next:
@property (nonatomic) std::vector <Buffer> buffers;
@property (nonatomic, assign) uint64_t nextExpectedPts;
Now we are ready to handle the buffers coming in. In my case the buffers were provided asynchronously. Make sure you provide the correct duration and presentation timestamp values to the decompression session to be able to sort them properly:
-(void)handleImageBuffer:(CVImageBufferRef)imageBuffer pts:(CMTime)presentationTimeStamp duration:(uint64_t)duration {
//Situation 1, we can directly pass over this buffer
if (self.nextExpectedPts == presentationTimeStamp.value || duration == 0) {
[self sendImageBuffer:imageBuffer duration:duration];
return;
}
//Situation 2, we got this buffer too fast. We will store it, but first we check if we have already stored the expected buffer
Buffer futureBuffer = [self bufferWithImageBuffer:imageBuffer pts:presentationTimeStamp.value];
int smallestPtsInBufferIndex = [self getSmallestPtsBufferIndex];
if (smallestPtsInBufferIndex >= 0 && self.nextExpectedPts == self.buffers[smallestPtsInBufferIndex].pts) {
//We found the next buffer, lets store the current buffer and return this one
Buffer bufferWithSmallestPts = self.buffers[smallestPtsInBufferIndex];
[self sendImageBuffer:bufferWithSmallestPts.imageBuffer duration:duration];
CVBufferRelease(bufferWithSmallestPts.imageBuffer);
[self setBuffer:futureBuffer atIndex:smallestPtsInBufferIndex];
} else {
//We dont have the next buffer yet, lets store this one to a new slot
[self setBuffer:futureBuffer atIndex:self.buffers.size()];
}
}
-(Buffer)bufferWithImageBuffer:(CVImageBufferRef)imageBuffer pts:(uint64_t)pts {
Buffer futureBuffer = Buffer();
futureBuffer.pts = pts;
futureBuffer.imageBuffer = imageBuffer;
CVBufferRetain(futureBuffer.imageBuffer);
return futureBuffer;
}
- (void)sendImageBuffer:(CVImageBufferRef)imageBuffer duration:(uint64_t)duration {
//Send your buffer to wherever you need it here
self.nextExpectedPts += duration;
}
-(int) getSmallestPtsBufferIndex
{
int minIndex = -1;
uint64_t minPts = 0;
for(int i=0;i<_buffers.size();i++) {
if (_buffers[i].pts < minPts || minPts == 0) {
minPts = _buffers[i].pts;
minIndex = i;
}
}
return minIndex;
}
- (void)setBuffer:(Buffer)buffer atIndex:(int)index {
if (_buffers.size() <= index) {
_buffers.push_back(buffer);
} else {
_buffers[index] = buffer;
}
}
Do not forget to release all the buffers in the vector when deallocating your decoder, and if you're working with a looping file for example, keep track of when the file has fully looped to reset the nextExpectedPts and such.
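For example, the cleanup could be as simple as the sketch below (illustrative only; adapt it to your own decoder class):
// Release every CVImageBufferRef still retained in the reorder vector.
- (void)dealloc {
    for (size_t i = 0; i < _buffers.size(); i++) {
        CVBufferRelease(_buffers[i].imageBuffer);
    }
    _buffers.clear();
    _nextExpectedPts = 0;
}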