iOS 11 Objective-C - Processing Image Buffers From ReplayKit Using AVAssetWriterInputPixelBufferAdaptor

I'm trying to record my app's screen using ReplayKit, cropping out some parts of it while recording the video. It is not going well.
ReplayKit captures the entire screen, so I decided to receive each frame from ReplayKit (as a CMSampleBuffer via startCaptureWithHandler), crop it there, and feed it to a video writer via AVAssetWriterInputPixelBufferAdaptor. But I am having trouble hard-copying the image buffer before cropping it.
This is my working code that records the entire screen:
// Starts recording with a completion/error handler
-(void)startRecordingWithHandler: (RPHandler)handler
{
// Sets up AVAssetWriter that will generate a video file from the recording.
self.writer = [AVAssetWriter assetWriterWithURL:self.outputFileURL
fileType:AVFileTypeQuickTimeMovie
error:nil];
NSDictionary* outputSettings =
@{
    AVVideoWidthKey : @(screen.size.width), // The whole width of the entire screen.
    AVVideoHeightKey : @(screen.size.height), // The whole height of the entire screen.
    AVVideoCodecKey : AVVideoCodecTypeH264,
};
// Sets up AVAssetWriterInput that will feed ReplayKit's frame buffers to the writer.
self.videoInput = [AVAssetWriterInput assetWriterInputWithMediaType:AVMediaTypeVideo
outputSettings:outputSettings];
// Lets it know that the input will be realtime using ReplayKit.
[self.videoInput setExpectsMediaDataInRealTime:YES];
NSDictionary* sourcePixelBufferAttributes =
@{
    (NSString*) kCVPixelBufferPixelFormatTypeKey: @(kCVPixelFormatType_32BGRA),
    (NSString*) kCVPixelBufferWidthKey : @(screen.size.width),
    (NSString*) kCVPixelBufferHeightKey : @(screen.size.height),
};
// Adds the video input to the writer.
[self.writer addInput:self.videoInput];
// Sets up ReplayKit itself.
self.recorder = [RPScreenRecorder sharedRecorder];
// Arranges the pipeline from ReplayKit to the input.
RPBufferHandler bufferHandler = ^(CMSampleBufferRef sampleBuffer, RPSampleBufferType bufferType, NSError* error) {
[self captureSampleBuffer:sampleBuffer withBufferType:bufferType];
};
RPHandler errorHandler = ^(NSError* error) {
if (error) handler(error);
};
// Starts ReplayKit's recording session.
// Sample buffers will be sent to `captureSampleBuffer` method.
[self.recorder startCaptureWithHandler:bufferHandler completionHandler:errorHandler];
}
// Receives a sample buffer from ReplayKit every frame.
-(void)captureSampleBuffer:(CMSampleBufferRef)sampleBuffer withBufferType:(RPSampleBufferType)bufferType
{
// Uses a queue in sync so that the writer-starting logic won't be invoked twice.
dispatch_sync(dispatch_get_main_queue(), ^{
// Starts the writer if not started yet. We do this here in order to get the proper source time later.
if (self.writer.status == AVAssetWriterStatusUnknown) {
[self.writer startWriting];
return;
}
// Receives a sample buffer from ReplayKit.
switch (bufferType) {
case RPSampleBufferTypeVideo:{
// Initializes the source time when a video frame buffer is received the first time.
// This prevents the output video from starting with blank frames.
if (!self.startedWriting) {
NSLog(@"self.writer startSessionAtSourceTime");
[self.writer startSessionAtSourceTime:CMSampleBufferGetPresentationTimeStamp(sampleBuffer)];
self.startedWriting = YES;
}
// Appends a received video frame buffer to the writer.
[self.videoInput appendSampleBuffer:sampleBuffer];
break;
}
}
});
}
// Stops the current recording session, and saves the output file to the user photo album.
-(void)stopRecordingWithHandler:(RPHandler)handler
{
// Closes the input.
[self.videoInput markAsFinished];
// Finishes up the writer.
[self.writer finishWritingWithCompletionHandler:^{
handler(self.writer.error);
// Saves the output video to the user photo album.
[[PHPhotoLibrary sharedPhotoLibrary] performChanges: ^{ [PHAssetChangeRequest creationRequestForAssetFromVideoAtFileURL: self.outputFileURL]; }
completionHandler: ^(BOOL s, NSError* e) { }];
}];
// Stops ReplayKit's recording.
[self.recorder stopCaptureWithHandler:nil];
}
Here each sample buffer from ReplayKit is fed directly to the writer (in the captureSampleBuffer method), so the entire screen is recorded.
Then I replaced that part with equivalent logic using AVAssetWriterInputPixelBufferAdaptor, which also works fine:
...
case RPSampleBufferTypeVideo:{
... // Initializes source time.
// Gets the timestamp of the sample buffer.
CMTime time = CMSampleBufferGetPresentationTimeStamp(sampleBuffer);
// Extracts the pixel image buffer from the sample buffer.
CVPixelBufferRef imageBuffer = CMSampleBufferGetImageBuffer(sampleBuffer);
// Appends a received sample buffer as an image buffer to the writer via the adaptor.
[self.videoAdaptor appendPixelBuffer:imageBuffer withPresentationTime:time];
break;
}
...
where the adaptor is set up as:
NSDictionary* sourcePixelBufferAttributes =
@{
    (NSString*) kCVPixelBufferPixelFormatTypeKey: @(kCVPixelFormatType_32BGRA),
    (NSString*) kCVPixelBufferWidthKey : @(screen.size.width),
    (NSString*) kCVPixelBufferHeightKey : @(screen.size.height),
};
self.videoAdaptor = [AVAssetWriterInputPixelBufferAdaptor assetWriterInputPixelBufferAdaptorWithAssetWriterInput:self.videoInput
sourcePixelBufferAttributes:sourcePixelBufferAttributes];
So the pipeline is working.
Then I created a hard copy of the image buffer in main memory and fed it to the adaptor:
...
case RPSampleBufferTypeVideo:{
... // Initializes source time.
// Gets the timestamp of the sample buffer.
CMTime time = CMSampleBufferGetPresentationTimeStamp(sampleBuffer);
// Extracts the pixel image buffer from the sample buffer.
CVPixelBufferRef imageBuffer = CMSampleBufferGetImageBuffer(sampleBuffer);
// Hard-copies the image buffer.
CVPixelBufferRef copiedImageBuffer = [self copy:imageBuffer];
// Appends a received video frame buffer to the writer via the adaptor.
[self.videoAdaptor appendPixelBuffer:copiedImageBuffer withPresentationTime:time];
break;
}
...
// Hard-copies the pixel buffer.
-(CVPixelBufferRef)copy:(CVPixelBufferRef)inputBuffer
{
// Locks the base address of the buffer
// so that GPU won't change the data until unlocked later.
CVPixelBufferLockBaseAddress(inputBuffer, 0); //-------------------------------
char* baseAddress = (char*)CVPixelBufferGetBaseAddress(inputBuffer);
size_t bytesPerRow = CVPixelBufferGetBytesPerRow(inputBuffer);
size_t width = CVPixelBufferGetWidth(inputBuffer);
size_t height = CVPixelBufferGetHeight(inputBuffer);
size_t length = bytesPerRow * height;
// Mallocs the same length as the input buffer for copying.
char* outputAddress = (char*)malloc(length);
// Copies the input buffer's data to the malloced space.
for (int i = 0; i < length; i++) {
outputAddress[i] = baseAddress[i];
}
// Create a new image buffer using the copied data.
CVPixelBufferRef outputBuffer;
CVPixelBufferCreateWithBytes(kCFAllocatorDefault,
width,
height,
kCVPixelFormatType_32BGRA,
outputAddress,
bytesPerRow,
&releaseCallback, // Releases the malloced space.
NULL,
NULL,
&outputBuffer);
// Unlocks the base address of the input buffer
// So that GPU can restart using the data.
CVPixelBufferUnlockBaseAddress(inputBuffer, 0); //-------------------------------
return outputBuffer;
}
// Releases the malloced space.
void releaseCallback(void *releaseRefCon, const void *baseAddress)
{
free((void *)baseAddress);
}
This doesn't work -- the saved video looks like the screenshot on the right:
It seems the bytes per row and the color format are wrong. I have researched and experimented with the following, to no avail:
Hard-coding 4 * width for bytes per row -> "bad access".
Using int and double instead of char -> some weird debugger-terminating exceptions.
Using other image formats -> either "not supported" or access errors.
Additionally, the releaseCallback is never called -- the RAM runs out within 10 seconds of recording.
What are potential causes from the look of this output?

You could first save the video as it is.
Then, using AVMutableComposition together with AVMutableVideoComposition, you can crop the video by adding instructions and layer instructions to it.

In my case, ReplayKit delivers sample buffers in the 420YpCbCr8BiPlanarFullRange format, not in an RGBA format, so you need to work with two planes. Your screenshot shows the Y plane on top and the UV plane on the bottom. The UV plane is half the size of the Y plane.
To get the base address of each of the two planes:
CVPixelBufferGetBaseAddressOfPlane(imageBuffer, 0)
CVPixelBufferGetBaseAddressOfPlane(imageBuffer, 1)
You also need to get the width, height, and bytes per row of each plane with these APIs:
CVPixelBufferGetWidthOfPlane(imageBuffer, 0 or 1)
CVPixelBufferGetHeightOfPlane(imageBuffer, 0 or 1)
CVPixelBufferGetBytesPerRowOfPlane(imageBuffer, 0 or 1)
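Once the per-plane base address, height, and bytes per row are known, each plane can be copied row by row so that any row padding in the source is not mistaken for pixel data. The following is only a minimal sketch in plain C (which compiles inside Objective-C); `copy_plane` and its parameters are hypothetical names, and it assumes the caller passes the number of meaningful bytes per row (for a bi-planar 4:2:0 buffer, typically the width in bytes for the Y plane and two bytes per CbCr pair for the chroma plane).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Copies one plane row by row.
   src_stride / dst_stride are the bytes per row of each buffer (padding included);
   row_bytes is the number of meaningful bytes in each row. */
void copy_plane(uint8_t *dst, size_t dst_stride,
                const uint8_t *src, size_t src_stride,
                size_t row_bytes, size_t rows)
{
    /* A row cannot hold more meaningful bytes than its stride. */
    assert(dst_stride >= row_bytes && src_stride >= row_bytes);
    for (size_t y = 0; y < rows; y++) {
        memcpy(dst + y * dst_stride, src + y * src_stride, row_bytes);
    }
}
```

Under the same assumptions, a crop could reuse this routine by starting `src` at the first byte of the cropped region (row offset times `src_stride` plus the column offset in bytes) and passing the cropped row length and row count. For the chroma plane of a 4:2:0 buffer, the offsets would be half of the luma offsets, so even crop origins are the simplest choice.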

Related

How do I draw onto a CVPixelBufferRef that is planar/ycbcr/420f/yuv/NV12/not rgb?

I have received a CMSampleBufferRef from a system API that contains CVPixelBufferRefs that are not RGBA (linear pixels). The buffer contains planar pixels (such as 420f aka kCVPixelFormatType_420YpCbCr8BiPlanarVideoRange aka yCbCr aka YUV).
I would like to do some manipulation of this video data before sending it off to VideoToolbox to be encoded to H.264 (drawing some text, overlaying a logo, rotating the image, etc.), but I'd like it to be efficient and real-time. Buuuut planar image data looks suuuper messy to work with -- there's the chroma plane and the luma plane and they're different sizes and... Working with this on a byte level seems like a lot of work.
I could probably use a CGContextRef and just paint right on top of the pixels, but from what I can gather it only supports RGBA pixels. Any advice on how I can do this with as little data copying as possible, yet as few lines of code as possible?
CGBitmapContextRef can only paint into something like 32ARGB, correct. This means that you will want to create ARGB (or RGBA) buffers, and then find a way to very quickly transfer YUV pixels onto this ARGB surface. This recipe includes using CoreImage, a home-made CVPixelBufferRef through a pool, a CGBitmapContextRef referencing your home made pixel buffer, and then recreating a CMSampleBufferRef resembling your input buffer, but referencing your output pixels. In other words,
1. Fetch the incoming pixels into a CIImage.
2. Create a CVPixelBufferPool with the pixel format and output dimensions you are creating. You don't want to create CVPixelBuffers without a pool in real time: you will run out of memory if your producer is too fast; you'll fragment your RAM as you won't be reusing buffers; and it's a waste of cycles.
3. Create a CIContext with the default constructor that you'll share between buffers. It contains no external state, but documentation says that recreating it on every frame is very expensive.
4. On incoming frame, create a new pixel buffer. Make sure to use an allocation threshold so you don't get runaway RAM usage.
5. Lock the pixel buffer.
6. Create a bitmap context referencing the bytes in the pixel buffer.
7. Use CIContext to render the planar image data into the linear buffer.
8. Perform your app-specific drawing in the CGContext!
9. Unlock the pixel buffer.
10. Fetch the timing info of the original sample buffer.
11. Create a CMVideoFormatDescriptionRef by asking the pixel buffer for its exact format.
12. Create a sample buffer for the pixel buffer. Done!
Here's a sample implementation, where I have chosen 32ARGB as the image format to work with, as that's something that both CGBitmapContext and CoreVideo enjoy working with on iOS:
{
    CVPixelBufferPoolRef _pool;
    CGSize _poolBufferDimensions;
    CIContext *_imageContext; // 3. Shared context, created once elsewhere and reused for every frame.
}
- (void)_processSampleBuffer:(CMSampleBufferRef)inputBuffer
{
// 1. Input data
CVPixelBufferRef inputPixels = CMSampleBufferGetImageBuffer(inputBuffer);
CIImage *inputImage = [CIImage imageWithCVPixelBuffer:inputPixels];
// 2. Create a new pool if the old pool doesn't have the right format.
CGSize bufferDimensions = {CVPixelBufferGetWidth(inputPixels), CVPixelBufferGetHeight(inputPixels)};
if(!_pool || !CGSizeEqualToSize(bufferDimensions, _poolBufferDimensions)) {
if(_pool) {
CFRelease(_pool);
}
OSStatus ok0 = CVPixelBufferPoolCreate(NULL,
NULL, // pool attrs
(__bridge CFDictionaryRef)(@{
    (id)kCVPixelBufferPixelFormatTypeKey: @(kCVPixelFormatType_32ARGB),
    (id)kCVPixelBufferWidthKey: @(bufferDimensions.width),
    (id)kCVPixelBufferHeightKey: @(bufferDimensions.height),
}), // buffer attrs
&_pool
);
_poolBufferDimensions = bufferDimensions;
assert(ok0 == noErr);
}
// 4. Create pixel buffer
CVPixelBufferRef outputPixels;
OSStatus ok1 = CVPixelBufferPoolCreatePixelBufferWithAuxAttributes(NULL,
_pool,
(__bridge CFDictionaryRef)@{
    // Opt to fail buffer creation in case of slow buffer consumption
    // rather than to exhaust all memory.
    (__bridge id)kCVPixelBufferPoolAllocationThresholdKey: @20
}, // aux attributes
&outputPixels
);
if(ok1 == kCVReturnWouldExceedAllocationThreshold) {
// Dropping frame because consumer is too slow
return;
}
assert(ok1 == noErr);
// 5, 6. Graphics context to draw in
CGColorSpaceRef deviceColors = CGColorSpaceCreateDeviceRGB();
OSStatus ok2 = CVPixelBufferLockBaseAddress(outputPixels, 0);
assert(ok2 == noErr);
CGContextRef cg = CGBitmapContextCreate(
CVPixelBufferGetBaseAddress(outputPixels), // bytes
CVPixelBufferGetWidth(inputPixels), CVPixelBufferGetHeight(inputPixels), // dimensions
8, // bits per component
CVPixelBufferGetBytesPerRow(outputPixels), // bytes per row
deviceColors, // color space
kCGImageAlphaPremultipliedFirst // bitmap info
);
CFRelease(deviceColors);
assert(cg != NULL);
// 7
[_imageContext render:inputImage toCVPixelBuffer:outputPixels];
// 8. DRAW
CGContextSetRGBFillColor(cg, 0.5, 0, 0, 1);
CGContextSetTextDrawingMode(cg, kCGTextFill);
NSAttributedString *text = [[NSAttributedString alloc] initWithString:@"Hello world" attributes:nil];
CTLineRef line = CTLineCreateWithAttributedString((__bridge CFAttributedStringRef)text);
CTLineDraw(line, cg);
CFRelease(line);
// 9. Unlock and stop drawing
CFRelease(cg);
CVPixelBufferUnlockBaseAddress(outputPixels, 0);
// 10. Timings
CMSampleTimingInfo timingInfo;
OSStatus ok4 = CMSampleBufferGetSampleTimingInfo(inputBuffer, 0, &timingInfo);
assert(ok4 == noErr);
// 11. Video format
CMVideoFormatDescriptionRef videoFormat;
OSStatus ok5 = CMVideoFormatDescriptionCreateForImageBuffer(NULL, outputPixels, &videoFormat);
assert(ok5 == noErr);
// 12. Output sample buffer
CMSampleBufferRef outputBuffer;
OSStatus ok3 = CMSampleBufferCreateForImageBuffer(NULL, // allocator
outputPixels, // image buffer
YES, // data ready
NULL, // make ready callback
NULL, // make ready refcon
videoFormat,
&timingInfo, // timing info
&outputBuffer // out
);
assert(ok3 == noErr);
[_consumer consumeSampleBuffer:outputBuffer];
CFRelease(outputPixels);
CFRelease(videoFormat);
CFRelease(outputBuffer);
}

How do I control AVAssetWriter to write at the correct FPS

Let me see if I understood it correctly.
On the most advanced hardware currently available, iOS allows me to record at the following frame rates: 30, 60, 120 and 240 fps.
But these frame rates behave differently. If I shoot at 30 or 60 fps, I expect the video files created to play at 30 and 60 fps respectively.
But if I shoot at 120 or 240 fps, I expect the video files created to play at 30 fps, or I will not see the slow motion.
A few questions:
1. Am I right?
2. Is there a way to shoot at 120 or 240 fps and play at 120 and 240 fps respectively? I mean, play at the frame rate the videos were shot at, without slo-mo?
3. How do I control that frame rate when I write the file?
I am creating the AVAssetWriter input like this...
NSDictionary *videoCompressionSettings = @{AVVideoCodecKey : AVVideoCodecH264,
    AVVideoWidthKey : @(videoWidth),
    AVVideoHeightKey : @(videoHeight),
    AVVideoCompressionPropertiesKey : @{ AVVideoAverageBitRateKey : @(bitsPerSecond),
        AVVideoMaxKeyFrameIntervalKey : @(1)}
};
_assetWriterVideoInput = [AVAssetWriterInput assetWriterInputWithMediaType:AVMediaTypeVideo outputSettings:videoCompressionSettings];
and there is no apparent way to control that.
NOTE: I have tried different numbers where that 1 is. I have tried 1.0/fps, I have tried fps and I have removed the key. No difference.
This is how I set up AVAssetWriter:
AVAssetWriter *newAssetWriter = [[AVAssetWriter alloc] initWithURL:_movieURL fileType:AVFileTypeQuickTimeMovie
error:&error];
_assetWriter = newAssetWriter;
_assetWriter.shouldOptimizeForNetworkUse = NO;
CGFloat videoWidth = size.width;
CGFloat videoHeight = size.height;
NSUInteger numPixels = videoWidth * videoHeight;
NSUInteger bitsPerSecond;
// Assume that lower-than-SD resolutions are intended for streaming, and use a lower bitrate
// if ( numPixels < (640 * 480) )
// bitsPerPixel = 4.05; // This bitrate matches the quality produced by AVCaptureSessionPresetMedium or Low.
// else
double bitsPerPixel = 11.4; // This bitrate matches the quality produced by AVCaptureSessionPresetHigh. (A double keeps the fractional part.)
bitsPerSecond = numPixels * bitsPerPixel;
NSDictionary *videoCompressionSettings = @{AVVideoCodecKey : AVVideoCodecH264,
    AVVideoWidthKey : @(videoWidth),
    AVVideoHeightKey : @(videoHeight),
    AVVideoCompressionPropertiesKey : @{ AVVideoAverageBitRateKey : @(bitsPerSecond)}
};
if (![_assetWriter canApplyOutputSettings:videoCompressionSettings forMediaType:AVMediaTypeVideo]) {
NSLog(@"Couldn't add asset writer video input.");
return;
}
_assetWriterVideoInput = [AVAssetWriterInput assetWriterInputWithMediaType:AVMediaTypeVideo
outputSettings:videoCompressionSettings
sourceFormatHint:formatDescription];
_assetWriterVideoInput.expectsMediaDataInRealTime = YES;
NSDictionary *adaptorDict = @{
    (id)kCVPixelBufferPixelFormatTypeKey : @(kCVPixelFormatType_32BGRA),
    (id)kCVPixelBufferWidthKey : @(videoWidth),
    (id)kCVPixelBufferHeightKey : @(videoHeight)
};
_pixelBufferAdaptor = [[AVAssetWriterInputPixelBufferAdaptor alloc]
initWithAssetWriterInput:_assetWriterVideoInput
sourcePixelBufferAttributes:adaptorDict];
// Add asset writer input to asset writer
if (![_assetWriter canAddInput:_assetWriterVideoInput]) {
return;
}
[_assetWriter addInput:_assetWriterVideoInput];
The captureOutput method is very simple. I get the image from the filter and write it to file using:
if (videoJustStartWriting)
[_assetWriter startSessionAtSourceTime:presentationTime];
CVPixelBufferRef renderedOutputPixelBuffer = NULL;
OSStatus err = CVPixelBufferPoolCreatePixelBuffer(nil,
_pixelBufferAdaptor.pixelBufferPool,
&renderedOutputPixelBuffer);
if (err) return; // NSLog(@"Cannot obtain a pixel buffer from the buffer pool");
//_ciContext is a metal context
[_ciContext render:finalImage
toCVPixelBuffer:renderedOutputPixelBuffer
bounds:[finalImage extent]
colorSpace:_sDeviceRgbColorSpace];
[self writeVideoPixelBuffer:renderedOutputPixelBuffer
withInitialTime:presentationTime];
- (void)writeVideoPixelBuffer:(CVPixelBufferRef)pixelBuffer withInitialTime:(CMTime)presentationTime
{
if ( _assetWriter.status == AVAssetWriterStatusUnknown ) {
// If the asset writer status is unknown, implies writing hasn't started yet, hence start writing with start time as the buffer's presentation timestamp
if ([_assetWriter startWriting]) {
[_assetWriter startSessionAtSourceTime:presentationTime];
}
}
if ( _assetWriter.status == AVAssetWriterStatusWriting ) {
// If the asset writer status is writing, append sample buffer to its corresponding asset writer input
if (_assetWriterVideoInput.readyForMoreMediaData) {
if (![_pixelBufferAdaptor appendPixelBuffer:pixelBuffer withPresentationTime:presentationTime]) {
NSLog(@"error: %@", [_assetWriter.error localizedFailureReason]);
}
}
}
if ( _assetWriter.status == AVAssetWriterStatusFailed ) {
NSLog(@"failed");
}
}
I set the whole thing up to shoot at 240 fps. These are the presentation times of the frames being appended:
time ======= 113594.311510508
time ======= 113594.324011508
time ======= 113594.328178716
time ======= 113594.340679424
time ======= 113594.344846383
If you do some calculation between them, you will see that the frame rate is about 240 fps. So the frames are being stored with the correct times.
But when I watch the video, the movement is not in slow motion and QuickTime says the video is 30 fps.
Note: this app grabs frames from the camera, the frames go into CIFilters, and the result of those filters is converted back to a sample buffer that is stored to file and displayed on the screen.
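The arithmetic on the logged presentation times can be checked with a short sketch. The helper names `average_fps` and `peak_fps` are hypothetical and not part of any Apple API; the code only assumes an array of timestamps in seconds, in increasing order. For the five timestamps listed above, the gaps alternate between roughly 4.2 ms (1/240 s) and 12.5 ms, so the smallest gap corresponds to about 240 fps while the average over these samples is closer to 120 fps. Whether that reflects frames being dropped somewhere upstream cannot be settled from the listing alone.

```c
#include <stddef.h>

/* Average frame rate over n timestamps (in seconds):
   number of intervals divided by the elapsed time. */
double average_fps(const double *t, size_t n)
{
    if (n < 2 || t[n - 1] <= t[0]) {
        return 0.0;
    }
    return (double)(n - 1) / (t[n - 1] - t[0]);
}

/* Frame rate implied by the smallest positive gap between consecutive timestamps. */
double peak_fps(const double *t, size_t n)
{
    double min_gap = 0.0;
    for (size_t i = 1; i < n; i++) {
        double gap = t[i] - t[i - 1];
        if (gap > 0.0 && (min_gap == 0.0 || gap < min_gap)) {
            min_gap = gap;
        }
    }
    return min_gap > 0.0 ? 1.0 / min_gap : 0.0;
}
```

Logging both numbers in the capture callback over a longer window would give a more reliable picture than five samples.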
I'm reaching here, but I think this is where you're going wrong. Think of your video capture as a pipeline.
(1) Capture buffer -> (2) Do Something With buffer -> (3) Write buffer as frames in video.
Sounds like you've successfully completed (1) and (2), you're getting the buffer fast enough and you're processing them so you can vend them as frames.
The problem is almost certainly in (3) writing the video frames.
https://developer.apple.com/reference/avfoundation/avmutablevideocomposition
Check out the frameDuration setting in your AVMutableVideoComposition; you'll need something like CMTimeMake(1, 60) // 60 FPS or CMTimeMake(1, 240) // 240 FPS to get what you're after (telling the video to WRITE this many frames and encode at this rate).
Using AVAssetWriter, it's exactly the same principle, but you set the frame rate by adding AVVideoExpectedSourceFrameRateKey to the AVAssetWriterInput outputSettings.
NSDictionary *videoCompressionSettings = @{AVVideoCodecKey : AVVideoCodecH264,
    AVVideoWidthKey : @(videoWidth),
    AVVideoHeightKey : @(videoHeight),
    AVVideoCompressionPropertiesKey : @{ AVVideoAverageBitRateKey : @(bitsPerSecond),
        AVVideoExpectedSourceFrameRateKey : @(60), // expected source frame rate, in frames per second
        AVVideoMaxKeyFrameIntervalKey : @(1)}
};
To expand a little more - you can't strictly control or sync your camera capture exactly to the output / playback rate, the timing just doesn't work that way and isn't that exact, and of course the processing pipeline adds overhead. When you capture frames they are time stamped, which you've seen, but in the writing / compression phase, it's using only the frames it needs to produce the output specified for the composition.
It goes both ways: you could capture only 30 FPS and write out at 240 FPS, and the video would display fine; you'd just have a lot of frames "missing" and being filled in by the algorithm. You can even vend only 1 frame per second and play back at 30 FPS. The two are separate from each other (how fast I capture vs. how many frames and what I present per second).
As to how to play it back at a different speed, you just need to tweak the playback speed - slow it down as needed.
If you've correctly set the time base (frameDuration), it will always play back "normal" - you're telling it "playback is X frames per second". Of course, your eye may notice a difference (almost certainly between low FPS and high FPS), and the screen may not refresh that fast (above 60 FPS), but regardless the video will be at a "normal" 1x speed for its timebase. By slowing the video, if my timebase is 120 and I slow it to 0.5x, I now effectively see 60 FPS and one second of playback takes two seconds.
You control the playback speed by setting the rate property on AVPlayer https://developer.apple.com/reference/avfoundation/avplayer
The iOS screen refresh is locked at 60fps, so the only way to "see" the extra frames is, as you say, to slow down the playback rate, a.k.a slow motion.
So:
1. Yes, you are right.
2. The screen refresh rate (and perhaps limitations of the human visual system, assuming you're human?) means that you cannot perceive 120 & 240 fps frame rates. You can play them at normal speed by downsampling to the screen refresh rate. Surely this is what AVPlayer already does, although I'm not sure if that's the answer you're looking for.
3. You control the frame rate of the file when you write it with the CMSampleBuffer presentation timestamps. If your frames are coming from the camera, you're probably passing the timestamps straight through, in which case check that you really are getting the frame rate you asked for (a log statement in your capture callback should be enough to verify this). If you're procedurally creating frames, then you choose the presentation timestamps so that they're spaced 1.0/desiredFrameRate seconds apart!
Is 3. not working for you?
p.s. you can discard & ignore AVVideoMaxKeyFrameIntervalKey - it's a quality setting and has nothing to do with playback framerate.
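For procedurally generated frames, one way to space timestamps exactly 1.0/desiredFrameRate seconds apart is to keep them as integer frame counts over the frame rate rather than accumulating floating-point seconds. The sketch below illustrates the idea in plain C; `RationalTime`, `frame_time`, and `to_seconds` are hypothetical stand-ins used only for illustration (in an app, CMTime and CMTimeMake play this role).

```c
#include <stdint.h>

/* Hypothetical stand-in for a rational timestamp: value / timescale seconds. */
typedef struct {
    int64_t value;
    int32_t timescale;
} RationalTime;

/* Presentation time of frame number frame_index at a constant rate of fps.
   Using fps as the timescale makes consecutive frames exactly one unit apart. */
RationalTime frame_time(int64_t frame_index, int32_t fps)
{
    RationalTime t = { frame_index, fps };
    return t;
}

/* Converts a rational timestamp to seconds, for logging or comparison. */
double to_seconds(RationalTime t)
{
    return (double)t.value / (double)t.timescale;
}
```

With a timescale of 240, frame 240 lands exactly on the one-second mark, with no drift from repeated addition of 1.0/240.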

Why won't AVFoundation accept my planar pixel buffers on an iOS device?

I've been struggling to figure out what the problem is with my code. I'm creating a planar CVPixelBufferRef to write to an AVAssetWriter. This pixel buffer is created manually through some other process (i.e., I'm not getting these samples from the camera or anything like that). On the iOS Simulator, it has no problem appending the samples and creating a valid output movie.
But on the device, it immediately fails at the first sample and provides less than useless error information:
AVAssetWriterError: Error Domain=AVFoundationErrorDomain Code=-11800 "The operation could not be completed" UserInfo={NSUnderlyingError=0x12fd2c670 {Error Domain=NSOSStatusErrorDomain Code=-12780 "(null)"}, NSLocalizedFailureReason=An unknown error occurred (-12780), NSLocalizedDescription=The operation could not be completed}
I'm very new to pixel formats, and I wouldn't be surprised if I've somehow created invalid pixel buffers, but the fact that it works just fine on the Simulator (i.e., OS X) leaves me confused.
Here's my code:
const int pixelBufferWidth = img->get_width();
const int pixelBufferHeight = img->get_height();
size_t planeWidths[3];
size_t planeHeights[3];
size_t planeBytesPerRow[3];
void* planeBaseAddresses[3];
for (int c=0;c<3;c++) {
int stride;
const uint8_t* p = de265_get_image_plane(img, c, &stride);
int width = de265_get_image_width(img,c);
int height = de265_get_image_height(img, c);
planeWidths[c] = width;
planeHeights[c] = height;
planeBytesPerRow[c] = stride;
planeBaseAddresses[c] = const_cast<uint8_t*>(p);
}
void* descriptor = calloc(1, sizeof(CVPlanarPixelBufferInfo_YCbCrPlanar));
CVPixelBufferRef pixelBufferRef;
CVReturn result = CVPixelBufferCreateWithPlanarBytes(NULL,
pixelBufferWidth,
pixelBufferHeight,
kCVPixelFormatType_420YpCbCr8Planar,
NULL,
0,
3,
planeBaseAddresses,
planeWidths,
planeHeights,
planeBytesPerRow,
&pixelBufferReleaseCallback,
NULL,
NULL,
&pixelBufferRef);
CMFormatDescriptionRef formatDescription = NULL;
CMVideoFormatDescriptionCreateForImageBuffer(NULL, pixelBufferRef, &formatDescription);
if (assetWriter == nil) {
// ... create output file path in Caches directory
assetWriter = [AVAssetWriter assetWriterWithURL:fileOutputURL fileType:AVFileTypeMPEG4 error:nil];
NSDictionary *videoSettings = @{AVVideoCodecKey : AVVideoCodecH264,
    AVVideoWidthKey : @(pixelBufferWidth),
    AVVideoHeightKey : @(pixelBufferHeight),
    AVVideoCompressionPropertiesKey : @{AVVideoMaxKeyFrameIntervalKey : @1}};
assetWriterInput = [AVAssetWriterInput assetWriterInputWithMediaType:AVMediaTypeVideo outputSettings:videoSettings sourceFormatHint:formatDescription];
[assetWriter addInput:assetWriterInput];
NSDictionary *pixelBufferAttributes = @{(id)kCVPixelBufferPixelFormatTypeKey : @(kCVPixelFormatType_420YpCbCr8Planar),
    (id)kCVPixelBufferWidthKey : @(pixelBufferWidth),
    (id)kCVPixelBufferHeightKey : @(pixelBufferHeight)};
pixelBufferAdaptor = [AVAssetWriterInputPixelBufferAdaptor assetWriterInputPixelBufferAdaptorWithAssetWriterInput:assetWriterInput sourcePixelBufferAttributes:pixelBufferAttributes];
[assetWriter startWriting];
[assetWriter startSessionAtSourceTime:kCMTimeZero];
}
samplePresentationTime = CMTimeMake(frameIndex++, framesPerSecond);
BOOL success = [pixelBufferAdaptor appendPixelBuffer:pixelBufferRef withPresentationTime:samplePresentationTime];
success is always NO, and the error from the asset writer is what I pasted above.
I also tried creating the sample buffers manually instead of using AVAssetWriterInputPixelBufferAdaptor just to eliminate that as a possible problem, but the results are the same.
Again, this does work on the Simulator, so I know my pixel buffers do contain the right data.
Also, I verified that I can write to the file destination. I tried creating a dummy file at that location, and it succeeded.
I would like to avoid converting my buffer to RGB since I shouldn't have to. I have Y'CbCr buffers to begin with, and I want to just encode them into an H.264 video, which supports Y'CbCr.
The source that is creating these buffers states the following:
The image is currently always 3-channel YCbCr, with 4:2:0 chroma.
I confirmed that it always enters its loop logic that deals with 8-bit YUV channels.
What am I doing wrong?
So, I can't confirm this officially, but it appears that AVAssetWriter doesn't like 3-plane pixel formats (i.e., kCVPixelFormatType_420YpCbCr8Planar) on iOS. On OS X, it appears to work with pretty much anything. When I converted my 3-plane buffers to a bi-planar pixel buffer format, this worked on iOS. This is unsurprising since the camera natively captures in kCVPixelFormatType_420YpCbCr8BiPlanarVideoRange pixel format, so AV Foundation would likely also work with that format.
Still, it'd be nice if I didn't have to do this explicit conversion step myself, though vImageConvert_PlanarToChunky8 helps to interleave the Cb and Cr planes into a single plane.
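The conversion from three planes to two amounts to leaving the Y plane as it is and interleaving the separate Cb and Cr planes into one CbCr plane. The following plain-C loop is only a sketch of that interleaving step, not the code used above; `interleave_cbcr` and its parameters are hypothetical names, and it assumes 8-bit samples with the Cb byte first in each pair, as in the bi-planar 4:2:0 formats.

```c
#include <stddef.h>
#include <stdint.h>

/* Interleaves separate Cb and Cr planes (chroma_w x chroma_h samples each)
   into a single CbCr plane. Strides are bytes per row, padding included;
   dst_stride must be at least 2 * chroma_w. */
void interleave_cbcr(uint8_t *dst, size_t dst_stride,
                     const uint8_t *cb, size_t cb_stride,
                     const uint8_t *cr, size_t cr_stride,
                     size_t chroma_w, size_t chroma_h)
{
    for (size_t y = 0; y < chroma_h; y++) {
        uint8_t *out = dst + y * dst_stride;
        const uint8_t *cb_row = cb + y * cb_stride;
        const uint8_t *cr_row = cr + y * cr_stride;
        for (size_t x = 0; x < chroma_w; x++) {
            out[2 * x] = cb_row[x];     /* Cb comes first in each pair */
            out[2 * x + 1] = cr_row[x]; /* followed by Cr */
        }
    }
}
```

A vectorized routine such as the vImage function mentioned above would likely be faster on large frames; the loop is mainly useful for seeing the memory layout the bi-planar format expects.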

Rotating video without rotating AVCaptureConnection and in the middle of AVAssetWriter session

I'm using PBJVision to implement tap-to-record video functionality. The library doesn't support orientation yet so I'm in the process of trying to engineer it in. From what I see, there are three ways to rotate the video - I need help on deciding the best way forward and how to implement it. Note that rotation can happen between tap-to-record segments. So in a recording session, the orientation is locked to what it was when the user tapped the button. The next time the user taps the button to record, it should re-set the orientation to whatever the device's orientation is (so the resulting video shows right-side-up).
The approaches are outlined in the issue page on GitHub as well
Method 1
Rotate the AVCaptureConnection using setVideoOrientation: - this causes the video preview to flicker every time it's switched, since this switches the actual hardware it seems. Not cool, not acceptable.
Method 2
Set the transform property on the AVAssetWriterInput object used to write the video. The problem is, once the asset writer starts writing, the transform property can't be changed, so this only works for the first segment of the video.
Method 3
Rotate the image buffer being appended using something like this: How to directly rotate CVImageBuffer image in IOS 4 without converting to UIImage? but it keeps crashing and I'm not even sure if I'm barking up the right tree. There's an exception that is thrown and I can't really trace it back to much more than the fact that I'm using the vImageRotate90_ARGB8888 function incorrectly.
The explanation is a bit more detailed on the GitHub issue page I linked to above. Any suggestions would be welcome - to be honest, I'm not hugely experienced at AVFoundation and so I'm hoping that there's some miraculous way to do this that I don't even know about!
Method 1 isn't the preferred method according to Apple's documentation ("Physically rotating buffers does come with a performance cost, so only request rotation if it's necessary"). Method 2 worked for me but if I played my video on an app that doesn't support the transformation "metadata", the video isn't rotated properly. Method 3 is what I did.
I think it's crashing for you because you're trying to pass the image data directly from vImageRotate... to the AVAssetWriterInputPixelBufferAdaptor. You have to create a CVPixelBufferRef first. Here's my code:
Inside of captureOutput:didOutputSampleBuffer:fromConnection: I rotate the frame before writing it into the adaptor:
if ([self.videoWriterInput isReadyForMoreMediaData])
{
// Rotate buffer first and then write to adaptor
CMTime sampleTime = CMSampleBufferGetPresentationTimeStamp(sampleBuffer);
CVPixelBufferRef rotatedBuffer = [self correctBufferOrientation:sampleBuffer];
[self.videoWriterInputAdaptor appendPixelBuffer:rotatedBuffer withPresentationTime:sampleTime];
CVBufferRelease(rotatedBuffer);
}
The referenced function that performs the vImage rotation is:
/* rotationConstant:
* 0 -- rotate 0 degrees (simply copy the data from src to dest)
* 1 -- rotate 90 degrees counterclockwise
* 2 -- rotate 180 degrees
* 3 -- rotate 270 degrees counterclockwise
*/
- (CVPixelBufferRef)rotateBuffer:(CMSampleBufferRef)sampleBuffer withConstant:(uint8_t)rotationConstant
{
CVImageBufferRef imageBuffer = CMSampleBufferGetImageBuffer(sampleBuffer);
CVPixelBufferLockBaseAddress(imageBuffer, 0);
OSType pixelFormatType = CVPixelBufferGetPixelFormatType(imageBuffer);
NSAssert(pixelFormatType == kCVPixelFormatType_32ARGB, @"Code works only with 32ARGB format. Test/adapt for other formats!");
const size_t kAlignment_32ARGB = 32;
const size_t kBytesPerPixel_32ARGB = 4;
size_t bytesPerRow = CVPixelBufferGetBytesPerRow(imageBuffer);
size_t width = CVPixelBufferGetWidth(imageBuffer);
size_t height = CVPixelBufferGetHeight(imageBuffer);
BOOL rotatePerpendicular = (rotationConstant == 1) || (rotationConstant == 3); // Use enumeration values here
const size_t outWidth = rotatePerpendicular ? height : width;
const size_t outHeight = rotatePerpendicular ? width : height;
size_t bytesPerRowOut = kBytesPerPixel_32ARGB * ceil(outWidth * 1.0 / kAlignment_32ARGB) * kAlignment_32ARGB;
const size_t dstSize = bytesPerRowOut * outHeight * sizeof(unsigned char);
void *srcBuff = CVPixelBufferGetBaseAddress(imageBuffer);
unsigned char *dstBuff = (unsigned char *)malloc(dstSize);
vImage_Buffer inbuff = {srcBuff, height, width, bytesPerRow};
vImage_Buffer outbuff = {dstBuff, outHeight, outWidth, bytesPerRowOut};
uint8_t bgColor[4] = {0, 0, 0, 0};
vImage_Error err = vImageRotate90_ARGB8888(&inbuff, &outbuff, rotationConstant, bgColor, 0);
if (err != kvImageNoError)
{
NSLog(@"vImageRotate90_ARGB8888 failed: %ld", (long)err);
}
CVPixelBufferUnlockBaseAddress(imageBuffer, 0);
CVPixelBufferRef rotatedBuffer = NULL;
CVPixelBufferCreateWithBytes(NULL,
outWidth,
outHeight,
pixelFormatType,
outbuff.data,
bytesPerRowOut,
freePixelBufferDataAfterRelease,
NULL,
NULL,
&rotatedBuffer);
return rotatedBuffer;
}
void freePixelBufferDataAfterRelease(void *releaseRefCon, const void *baseAddress)
{
// Free the memory we malloced for the vImage rotation
free((void *)baseAddress);
}
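To make the stride arithmetic and the rotation itself easier to follow, here is a minimal plain C sketch that does not depend on vImage or CoreVideo. The names padded_bytes_per_row and rotate90_ccw_4bpp are hypothetical helpers written for this illustration, not part of any Apple API. It only mirrors what the code above asks vImageRotate90_ARGB8888 to do for rotation constant 1, assuming 4 bytes per pixel.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: round a row of `width` pixels up to a multiple of
 * `alignment` pixels and return the stride in bytes. This is the same
 * arithmetic as bytesPerRowOut above, done with integers instead of ceil(). */
static size_t padded_bytes_per_row(size_t width, size_t bytes_per_pixel, size_t alignment)
{
    size_t padded_width = ((width + alignment - 1) / alignment) * alignment;
    return padded_width * bytes_per_pixel;
}

/* Hypothetical helper: rotate a 4-byte-per-pixel image 90 degrees
 * counterclockwise. The source is src_w x src_h pixels with row stride
 * src_stride; the destination is src_h pixels wide and src_w pixels high
 * with row stride dst_stride. */
static void rotate90_ccw_4bpp(const uint8_t *src, size_t src_w, size_t src_h, size_t src_stride,
                              uint8_t *dst, size_t dst_stride)
{
    for (size_t y = 0; y < src_h; y++) {
        for (size_t x = 0; x < src_w; x++) {
            /* Source pixel (x, y) lands at column y, row (src_w - 1 - x). */
            const uint8_t *s = src + y * src_stride + x * 4;
            uint8_t *d = dst + (src_w - 1 - x) * dst_stride + y * 4;
            memcpy(d, s, 4);
        }
    }
}
```

The point to take away is that width and height swap for 90 and 270 degree rotations, and that every row access has to use the buffer's own stride rather than width * 4.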
Note: You may want to use an enumeration for rotationConstant, something like this (don't call the function with MOVRotateDirectionUnknown):
typedef NS_ENUM(uint8_t, MOVRotateDirection)
{
MOVRotateDirectionNone = 0,
MOVRotateDirectionCounterclockwise90,
MOVRotateDirectionCounterclockwise180,
MOVRotateDirectionCounterclockwise270,
MOVRotateDirectionUnknown
};
Note: If you need IOSurface support, you should use CVPixelBufferCreate instead of CVPixelBufferCreateWithBytes and copy the rotated bytes into the new buffer:
NSDictionary *pixelBufferAttributes = @{ (NSString *)kCVPixelBufferIOSurfacePropertiesKey : @{} };
CVPixelBufferCreate(kCFAllocatorDefault,
outWidth,
outHeight,
pixelFormatType,
(__bridge CFDictionaryRef)(pixelBufferAttributes),
&rotatedBuffer);
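CVPixelBufferLockBaseAddress(rotatedBuffer, 0);
uint8_t *dest = CVPixelBufferGetBaseAddress(rotatedBuffer);
size_t destBytesPerRow = CVPixelBufferGetBytesPerRow(rotatedBuffer);
// Copy row by row: the stride CoreVideo picked may differ from bytesPerRowOut.
for (size_t row = 0; row < outHeight; row++)
{
memcpy(dest + row * destBytesPerRow, (uint8_t *)outbuff.data + row * bytesPerRowOut, outWidth * kBytesPerPixel_32ARGB);
}
CVPixelBufferUnlockBaseAddress(rotatedBuffer, 0);
// No release callback is used on this path, so free the malloc'ed buffer here.
free(outbuff.data);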
There is an easy and safe way.
#define degreeToRadian(x) (M_PI * (x) / 180.0)
// Must be set before the asset writer starts writing.
self.assetWriterInputVideo.transform = CGAffineTransformMakeRotation(degreeToRadian(-90));
Method 3 does work for rotating the video frames.
However, I found that it can cause a memory leak. To fix it, I moved the rotation function onto the same thread that appends the video frames, and that worked.
Keep this in mind if you run into the same issue.

How to set timestamp of CMSampleBuffer for AVWriter writing

I'm working with AVFoundation for capturing and recording audio. There are some issues I don't quite understand.
Basically, I want to capture audio from an AVCaptureSession and write it using AVAssetWriter, but I need to shift the timestamps of the CMSampleBuffers I get from the AVCaptureSession. Reading the CMSampleBuffer documentation, I see two different timestamp terms: 'presentation timestamp' and 'output presentation timestamp'. What is the difference between the two?
Let's say I get a CMSampleBuffer (for audio) from the AVCaptureSession and want to write it to a file using AVAssetWriter. What function should I use to 'inject' a CMTime into the buffer so that it becomes the buffer's presentation timestamp in the resulting file?
Thanks.
Use CMSampleBufferGetPresentationTimeStamp. It returns the time at which the buffer was captured, which is also when it should be "presented" on playback to stay in sync. To quote session 520 at WWDC 2012: "Presentation time is the time at which the first sample in the buffer was picked up by the microphone".
If you start the AVAssetWriter with
[videoWriter startWriting];
[videoWriter startSessionAtSourceTime:CMSampleBufferGetPresentationTimeStamp(sampleBuffer)];
and then append samples with
if(videoWriterInput.readyForMoreMediaData) [videoWriterInput appendSampleBuffer:sampleBuffer];
the frames in the finished video will be consistent with CMSampleBufferGetPresentationTimeStamp (I have checked). If you want to modify the time when adding samples, you have to use AVAssetWriterInputPixelBufferAdaptor.
Chunk of sample code from here: http://www.gdcl.co.uk/2013/02/20/iPhone-Pause.html
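Here sample is your input CMSampleBufferRef, sout is the output CMSampleBufferRef, and newTimeStamp is the timestamp you want to set.
CMItemCount count;
CMTime newTimeStamp = CMTimeMake(YOURTIME_GOES_HERE); // placeholder: CMTimeMake takes (value, timescale)
// First call only queries how many timing entries the buffer has.
CMSampleBufferGetSampleTimingInfoArray(sample, 0, NULL, &count);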
CMSampleTimingInfo* pInfo = malloc(sizeof(CMSampleTimingInfo) * count);
CMSampleBufferGetSampleTimingInfoArray(sample, count, pInfo, &count);
for (CMItemCount i = 0; i < count; i++)
{
pInfo[i].decodeTimeStamp = newTimeStamp; // kCMTimeInvalid if in sequence
pInfo[i].presentationTimeStamp = newTimeStamp;
}
CMSampleBufferRef sout;
CMSampleBufferCreateCopyWithNewTiming(kCFAllocatorDefault, sample, count, pInfo, &sout);
free(pInfo);
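If what you need is a constant shift rather than one absolute timestamp, the arithmetic inside that loop becomes a subtraction per entry. Below is a minimal plain C sketch of that idea. RationalTime is a hypothetical stand-in for CMTime (seconds = value / timescale) so the example compiles without CoreMedia, and the rt_* helpers are made up for illustration. In real code you would most likely reach for CMTimeSubtract instead.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for CMTime: seconds = value / timescale. */
typedef struct {
    int64_t value;
    int32_t timescale;
} RationalTime;

/* Re-express t in another timescale (truncating division, for brevity). */
static RationalTime rt_convert(RationalTime t, int32_t timescale)
{
    RationalTime out = { t.value * timescale / t.timescale, timescale };
    return out;
}

/* Return t - offset, expressed in t's own timescale. */
static RationalTime rt_subtract(RationalTime t, RationalTime offset)
{
    RationalTime o = rt_convert(offset, t.timescale);
    RationalTime out = { t.value - o.value, t.timescale };
    return out;
}

/* Shift every timestamp in the array by the same offset, the way the loop
 * over pInfo would adjust each presentationTimeStamp. */
static void rt_shift_all(RationalTime *times, size_t count, RationalTime offset)
{
    for (size_t i = 0; i < count; i++) {
        times[i] = rt_subtract(times[i], offset);
    }
}
```

For example, with audio timestamps in a 44100 timescale, shifting by an offset of 500/1000 (half a second) subtracts 22050 from each value while leaving the timescale untouched.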

Resources