I want to capture video at 60 fps using minimal resources, because I have at most 16-20 ms per frame and most of that time will be occupied by heavy computation on the frame.
I am currently using the preset AVCaptureSessionPreset1280x720, but I want to do the computation at 640x480, otherwise the device will not keep up. Here is where the problem starts: Apple does not let developers capture directly at 640x480@60fps, and my current resizing is very slow.
I am resizing the image with Metal kernel shaders, but 98% of the time (as seen in Instruments) is spent in these two lines:
[_inputTexture replaceRegion:region mipmapLevel:0 withBytes:inputImage.data bytesPerRow:inputImage.channels() * inputImage.cols];
...
[_outputTexture getBytes:outputImage.data bytesPerRow:inputImage.channels() * outputImage.cols fromRegion:outputRegion mipmapLevel:0];
basically in memory load/store instructions. This puzzles me, since in theory memory is shared between the CPU and GPU on the iPhone.
Shouldn't the Metal code be faster? I also do video presentation with Metal and it doesn't break a sweat (~1 ms on the GPU, while resizing takes up to ~20 ms).
Is there a faster way of resizing an image? What is your experience with Image I/O?
UPDATE:
When I increased my work-group size to 22x22x1, the image resizing improved from ~20 ms on the GPU to ~8 ms. Still not quite what I want, but better.
UPDATE 2:
I switched to CoreGraphics and it goes fast enough. See this post.
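For reference, the kind of CoreGraphics downscale meant here looks roughly like the sketch below. This is illustrative only (it is not the code from the linked post) and assumes a BGRA pixel buffer coming out of the capture session:
CVPixelBufferLockBaseAddress(pixelBuffer, kCVPixelBufferLock_ReadOnly);
CGColorSpaceRef colorSpace = CGColorSpaceCreateDeviceRGB();
// Wrap the full-resolution pixels in a bitmap context and snapshot them as a CGImage...
CGContextRef src = CGBitmapContextCreate(CVPixelBufferGetBaseAddress(pixelBuffer),
                                         CVPixelBufferGetWidth(pixelBuffer), CVPixelBufferGetHeight(pixelBuffer),
                                         8, CVPixelBufferGetBytesPerRow(pixelBuffer), colorSpace,
                                         kCGBitmapByteOrder32Little | kCGImageAlphaPremultipliedFirst);
CGImageRef fullImage = CGBitmapContextCreateImage(src);
// ...then draw that image into a 640x480 context with cheap interpolation.
CGContextRef dst = CGBitmapContextCreate(NULL, 640, 480, 8, 0, colorSpace,
                                         kCGBitmapByteOrder32Little | kCGImageAlphaPremultipliedFirst);
CGContextSetInterpolationQuality(dst, kCGInterpolationLow);
CGContextDrawImage(dst, CGRectMake(0, 0, 640, 480), fullImage);
// The downscaled pixels are now available via CGBitmapContextGetData(dst).
CGImageRelease(fullImage);
CGContextRelease(src);
CGContextRelease(dst);
CGColorSpaceRelease(colorSpace);
CVPixelBufferUnlockBaseAddress(pixelBuffer, kCVPixelBufferLock_ReadOnly);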
I think you are creating a UIImage from the capture session and then converting it to a Metal texture (MTLTexture). You shouldn't do that. Here is an example of how to get an MTLTexture directly from the captureOutput delegate:
- (void)captureOutput:(AVCaptureOutput *)captureOutput didOutputSampleBuffer:(CMSampleBufferRef)sampleBuffer fromConnection:(AVCaptureConnection *)connection {
CVPixelBufferRef pixelBuffer = CMSampleBufferGetImageBuffer(sampleBuffer);
CVPixelBufferLockBaseAddress(pixelBuffer, 0);
id<MTLTexture> inTexture = nil;
size_t width = CVPixelBufferGetWidth(pixelBuffer);
size_t height = CVPixelBufferGetHeight(pixelBuffer);
MTLPixelFormat pixelFormat = MTLPixelFormatBGRA8Unorm;
CVMetalTextureRef metalTextureRef = NULL;
CVReturn status = CVMetalTextureCacheCreateTextureFromImage(NULL, _textureCache, pixelBuffer, NULL, pixelFormat, width, height, 0, &metalTextureRef);
if(status == kCVReturnSuccess) {
inTexture = CVMetalTextureGetTexture(metalTextureRef);
CFRelease(metalTextureRef);
}
CVPixelBufferUnlockBaseAddress(pixelBuffer,0);
}
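Once you have the MTLTexture, the downscale itself can also stay on the GPU. The sketch below uses MetalPerformanceShaders, which is not part of the answer above; it assumes an existing device and commandQueue, a pre-created 640x480 outTexture with MTLTextureUsageShaderWrite, and that MPSSupportsMTLDevice(device) returns true:
#import <MetalPerformanceShaders/MetalPerformanceShaders.h>
// Scale inTexture (e.g. 1280x720 from the camera) down to outTexture (640x480).
MPSImageLanczosScale *scaler = [[MPSImageLanczosScale alloc] initWithDevice:device];
MPSScaleTransform transform = {
    .scaleX = (double)outTexture.width  / inTexture.width,
    .scaleY = (double)outTexture.height / inTexture.height,
    .translateX = 0, .translateY = 0
};
scaler.scaleTransform = &transform;
id<MTLCommandBuffer> commandBuffer = [commandQueue commandBuffer];
[scaler encodeToCommandBuffer:commandBuffer sourceTexture:inTexture destinationTexture:outTexture];
[commandBuffer commit];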
I have created a ReplayKit Broadcast Extension, so the maximum amount of memory I can use is 50 MB.
I am taking samples of the broadcasted stream to send those images with a CFMessagePortSendRequest call. As that function accepts only CFData type, I need to convert my multi-plane image to Data.
NSKeyedArchiver.archivedObject() seems to exceed this 50 MB. Breaking on the line before the call, I can see a memory consumption of ~6 MB. Then, executing the archivedObject call, my extension crashes because it exceeds the memory limit.
Is there a less memory-eating way to convert the CIImage of a CVPixelBuffer to Data? And then back, of course.
I was able to convert a CMSampleBufferRef to NSData in the following way. This method uses roughly 1-5 MB of RAM. I hope this solves your problem.
CVImageBufferRef imageBuffer = CMSampleBufferGetImageBuffer(sampleBuffer);
CVPixelBufferLockBaseAddress(imageBuffer, 0);
// Copy plane 0 (the luma plane of a bi-planar YUV buffer).
UInt8 *bap0 = CVPixelBufferGetBaseAddressOfPlane(imageBuffer, 0);
size_t byteperrow = CVPixelBufferGetBytesPerRowOfPlane(imageBuffer, 0);
size_t height = CVPixelBufferGetHeightOfPlane(imageBuffer, 0);
NSData *data = [NSData dataWithBytes:bap0 length:byteperrow * height];
CVPixelBufferUnlockBaseAddress(imageBuffer, 0);
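Since the question is about a multi-plane image, note that the snippet above only grabs plane 0 (luma). A sketch of the same idea extended to every plane, assuming the receiver also gets the per-plane dimensions so it can rebuild the buffer on the other side, could look like this:
CVPixelBufferLockBaseAddress(imageBuffer, kCVPixelBufferLock_ReadOnly);
NSMutableData *allPlanes = [NSMutableData data];
for (size_t plane = 0; plane < CVPixelBufferGetPlaneCount(imageBuffer); plane++) {
    const UInt8 *base = CVPixelBufferGetBaseAddressOfPlane(imageBuffer, plane);
    size_t bytesPerRow = CVPixelBufferGetBytesPerRowOfPlane(imageBuffer, plane);
    size_t planeHeight = CVPixelBufferGetHeightOfPlane(imageBuffer, plane);
    // Append the raw plane bytes; row padding is included, so send bytesPerRow along too.
    [allPlanes appendBytes:base length:bytesPerRow * planeHeight];
}
CVPixelBufferUnlockBaseAddress(imageBuffer, kCVPixelBufferLock_ReadOnly);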
I have received a CMSampleBufferRef from a system API that contains CVPixelBufferRefs that are not RGBA (linear pixels). The buffer contains planar pixels (such as 420f aka kCVPixelFormatType_420YpCbCr8BiPlanarVideoRange aka yCbCr aka YUV).
I would like to do some manipulation of this video data before sending it off to VideoToolbox to be encoded to h264 (drawing some text, overlaying a logo, rotating the image, etc.), but I'd like it to be efficient and real-time. Buuuut planar image data looks suuuper messy to work with -- there's the chroma plane and the luma plane, they're different sizes, and... Working with this on a byte level seems like a lot of work.
I could probably use a CGContextRef and just paint right on top of the pixels, but from what I can gather it only supports RGBA pixels. Any advice on how I can do this with as little data copying as possible, yet as few lines of code as possible?
Correct: a CGBitmapContext can only paint into something like 32ARGB. This means that you will want to create ARGB (or RGBA) buffers, and then find a way to transfer YUV pixels onto that ARGB surface very quickly. This recipe uses CoreImage, a home-made CVPixelBufferRef obtained through a pool, a CGBitmapContext referencing your home-made pixel buffer, and then recreating a CMSampleBufferRef resembling your input buffer, but referencing your output pixels. In other words:
Fetch the incoming pixels into a CIImage.
Create a CVPixelBufferPool with the pixel format and output dimensions you are creating. You don't want to create CVPixelBuffers without a pool in real time: you will run out of memory if your producer is too fast; you'll fragment your RAM as you won't be reusing buffers; and it's a waste of cycles.
Create a CIContext with the default constructor that you'll share between buffers. It contains no external state, but documentation says that recreating it on every frame is very expensive.
On incoming frame, create a new pixel buffer. Make sure to use an allocation threshold so you don't get runaway RAM usage.
Lock the pixel buffer
Create a bitmap context referencing the bytes in the pixel buffer
Use CIContext to render the planar image data into the linear buffer
Perform your app-specific drawing in the CGContext!
Unlock the pixel buffer
Fetch the timing info of the original sample buffer
Create a CMVideoFormatDescriptionRef by asking the pixel buffer for its exact format
Create a sample buffer for the pixel buffer. Done!
Here's a sample implementation, where I have chosen 32ARGB as the image format to work with, as that's something that both CGBitmapContext and CoreVideo enjoy working with on iOS:
{
CVPixelBufferPoolRef _pool;
CGSize _poolBufferDimensions;
CIContext *_imageContext; // shared CIContext (step 3), created once, e.g. with [CIContext contextWithOptions:nil]
}
- (void)_processSampleBuffer:(CMSampleBufferRef)inputBuffer
{
// 1. Input data
CVPixelBufferRef inputPixels = CMSampleBufferGetImageBuffer(inputBuffer);
CIImage *inputImage = [CIImage imageWithCVPixelBuffer:inputPixels];
// 2. Create a new pool if the old pool doesn't have the right format.
CGSize bufferDimensions = {CVPixelBufferGetWidth(inputPixels), CVPixelBufferGetHeight(inputPixels)};
if(!_pool || !CGSizeEqualToSize(bufferDimensions, _poolBufferDimensions)) {
if(_pool) {
CFRelease(_pool);
}
OSStatus ok0 = CVPixelBufferPoolCreate(NULL,
NULL, // pool attrs
(__bridge CFDictionaryRef)(@{
(id)kCVPixelBufferPixelFormatTypeKey: @(kCVPixelFormatType_32ARGB),
(id)kCVPixelBufferWidthKey: @(bufferDimensions.width),
(id)kCVPixelBufferHeightKey: @(bufferDimensions.height),
}), // buffer attrs
&_pool
);
_poolBufferDimensions = bufferDimensions;
assert(ok0 == noErr);
}
// 4. Create pixel buffer
CVPixelBufferRef outputPixels;
OSStatus ok1 = CVPixelBufferPoolCreatePixelBufferWithAuxAttributes(NULL,
_pool,
(__bridge CFDictionaryRef)@{
// Opt to fail buffer creation in case of slow buffer consumption
// rather than to exhaust all memory.
(__bridge id)kCVPixelBufferPoolAllocationThresholdKey: @20
}, // aux attributes
&outputPixels
);
if(ok1 == kCVReturnWouldExceedAllocationThreshold) {
// Dropping frame because consumer is too slow
return;
}
assert(ok1 == noErr);
// 5, 6. Graphics context to draw in
CGColorSpaceRef deviceColors = CGColorSpaceCreateDeviceRGB();
OSStatus ok2 = CVPixelBufferLockBaseAddress(outputPixels, 0);
assert(ok2 == noErr);
CGContextRef cg = CGBitmapContextCreate(
CVPixelBufferGetBaseAddress(outputPixels), // bytes
CVPixelBufferGetWidth(inputPixels), CVPixelBufferGetHeight(inputPixels), // dimensions
8, // bits per component
CVPixelBufferGetBytesPerRow(outputPixels), // bytes per row
deviceColors, // color space
kCGImageAlphaPremultipliedFirst // bitmap info
);
CFRelease(deviceColors);
assert(cg != NULL);
// 7
[_imageContext render:inputImage toCVPixelBuffer:outputPixels];
// 8. DRAW
CGContextSetRGBFillColor(cg, 0.5, 0, 0, 1);
CGContextSetTextDrawingMode(cg, kCGTextFill);
NSAttributedString *text = [[NSAttributedString alloc] initWithString:@"Hello world" attributes:nil];
CTLineRef line = CTLineCreateWithAttributedString((__bridge CFAttributedStringRef)text);
CTLineDraw(line, cg);
CFRelease(line);
// 9. Unlock and stop drawing
CFRelease(cg);
CVPixelBufferUnlockBaseAddress(outputPixels, 0);
// 10. Timings
CMSampleTimingInfo timingInfo;
OSStatus ok4 = CMSampleBufferGetSampleTimingInfo(inputBuffer, 0, &timingInfo);
assert(ok4 == noErr);
// 11. Video format
CMVideoFormatDescriptionRef videoFormat;
OSStatus ok5 = CMVideoFormatDescriptionCreateForImageBuffer(NULL, outputPixels, &videoFormat);
assert(ok5 == noErr);
// 12. Output sample buffer
CMSampleBufferRef outputBuffer;
OSStatus ok3 = CMSampleBufferCreateForImageBuffer(NULL, // allocator
outputPixels, // image buffer
YES, // data ready
NULL, // make ready callback
NULL, // make ready refcon
videoFormat,
&timingInfo, // timing info
&outputBuffer // out
);
assert(ok3 == noErr);
[_consumer consumeSampleBuffer:outputBuffer];
CFRelease(outputPixels);
CFRelease(videoFormat);
CFRelease(outputBuffer);
}
Have you run into the problem where, for the same video, copyPixelBufferForItemTime returns incorrect data on iOS?
I have an AVPlayerItemVideoOutput linked to the appropriate AVPlayerItem.
I call copyPixelBufferForItemTime, receive a CVPixelBufferRef, and then retrieve an OpenGL texture from it.
CVPixelBufferRef pb = [_playerVideoOutput copyPixelBufferForItemTime:currentTime itemTimeForDisplay:nil];
For this sample video there's a bug with the CVPixelBufferRef:
int bpr = (int)CVPixelBufferGetBytesPerRow(pb);
int width_real = (int)CVPixelBufferGetWidth(pb);
int width_working = (int)CVPixelBufferGetBytesPerRow(pb)/4;
Mac output:
bpr = 2400
width_real = 596
width_working = 600
iOS output:
bpr = 2432
width_real = 596
width_working = 608
How it's rendered on iOS:
How it's rendered on Mac:
CVPixelBufferGetPixelFormatType returns BGRA on both platforms.
Edit
When creating the texture on iOS, I read data from the pixel buffer via CVPixelBufferGetBaseAddress and use the size provided by CVPixelBufferGetWidth/CVPixelBufferGetHeight:
- (GLuint)createTextureFromMovieFrame:(CVPixelBufferRef)movieFrame
{
int bufferWidth = (int) CVPixelBufferGetWidth(movieFrame);
int bufferHeight = (int) CVPixelBufferGetHeight(movieFrame);
// Upload to texture
CVPixelBufferLockBaseAddress(movieFrame, 0);
CVOpenGLTextureRef texture=0;
GLuint tex = 0;
#if TARGET_OS_IOS==1
void * data = CVPixelBufferGetBaseAddress(movieFrame);
CVReturn err = 0;
tex = algotest::MyGL::createRGBATexture(bufferWidth, bufferHeight, data, algotest::MyGL::KLinear);
#else
CVReturn err = CVOpenGLTextureCacheCreateTextureFromImage(kCFAllocatorDefault,
getGlobalTextureCache(), movieFrame, 0, &texture);
#endif
CVPixelBufferUnlockBaseAddress(movieFrame, 0);
return tex;
}
So width_working is just for debugging. As it mismatches width_real, and neither passing width_working nor width_real works, I suppose it's a bug with the pixel buffer.
The pixel buffers have per-line padding pixels on both iOS and Mac, presumably for alignment reasons. The difference is that the Mac CVOpenGLTextureCacheCreateTextureFromImage function understands this, while the iOS createRGBATexture function cannot, at least not without a bytes-per-row argument.
You could either include the padding pixels in the width, and crop them out later:
tex = algotest::MyGL::createRGBATexture(CVPixelBufferGetBytesPerRow(movieFrame)/4, bufferHeight, data, algotest::MyGL::KLinear);
Or you could use CVOpenGLESTextureCache, the iOS equivalent of CVOpenGLTextureCache, and replace createRGBATexture() with CVOpenGLESTextureCacheCreateTextureFromImage(). Then your Mac and iOS code would be similar, and the iOS code might even run faster, since texture caches on iOS can avoid redundant copying of texture data.
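A rough sketch of that texture-cache route on iOS, assuming an EAGLContext named glContext and a BGRA pixel buffer (the cache should be created once and reused across frames, not per frame):
CVOpenGLESTextureCacheRef textureCache = NULL;
CVOpenGLESTextureCacheCreate(kCFAllocatorDefault, NULL, glContext, NULL, &textureCache);

CVOpenGLESTextureRef cvTexture = NULL;
CVReturn err = CVOpenGLESTextureCacheCreateTextureFromImage(kCFAllocatorDefault,
                                                            textureCache,
                                                            movieFrame,       // CVPixelBufferRef
                                                            NULL,             // texture attributes
                                                            GL_TEXTURE_2D,
                                                            GL_RGBA,          // internal format
                                                            (GLsizei)CVPixelBufferGetWidth(movieFrame),
                                                            (GLsizei)CVPixelBufferGetHeight(movieFrame),
                                                            GL_BGRA,          // source format (matches the buffer)
                                                            GL_UNSIGNED_BYTE,
                                                            0,                // plane index
                                                            &cvTexture);
if (err == kCVReturnSuccess) {
    glBindTexture(CVOpenGLESTextureGetTarget(cvTexture), CVOpenGLESTextureGetName(cvTexture));
    // ... set filtering/wrapping, draw, then CFRelease(cvTexture) once the frame is no longer needed ...
}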
I'm using AVFoundation to get the camera stream, and this code to get MTLTextures from it:
- (void)captureOutput:(AVCaptureOutput *)captureOutput didOutputSampleBuffer:(CMSampleBufferRef)sampleBuffer fromConnection:(AVCaptureConnection *)connection
{
CVPixelBufferRef pixelBuffer = CMSampleBufferGetImageBuffer(sampleBuffer);
id<MTLTexture> texture = nil;
{
size_t width = CVPixelBufferGetWidth(pixelBuffer);
size_t height = CVPixelBufferGetHeight(pixelBuffer);
MTLPixelFormat pixelFormat = MTLPixelFormatBGRA8Unorm;
CVMetalTextureRef metalTextureRef = NULL;
CVReturn status = CVMetalTextureCacheCreateTextureFromImage(NULL, _textureCache, pixelBuffer, NULL, pixelFormat, width, height, 0, &metalTextureRef);
if(status == kCVReturnSuccess)
{
texture = CVMetalTextureGetTexture(metalTextureRef);
if (self.delegate){
[self.delegate textureUpdated:texture];
}
CFRelease(metalTextureRef);
}
}
}
It works fine, except that the generated MTLTexture object is not mipmapped (it has only one mip level).
In this call:
CVMetalTextureCacheCreateTextureFromImage(NULL, _textureCache, pixelBuffer, NULL, pixelFormat, width, height, 0, &metalTextureRef);
There is a parameter called "textureAttributes"; I think it should be possible to specify there that I want a mipmapped texture, but I haven't found anything in the documentation about what exactly goes there. Nor have I found any source code in which something other than NULL is passed.
In OpenGL ES for iOS there is a similar method, with the same parameter, and also nothing in the documentation.
Just received an answer from a Metal engineer here. Here's a quote:
No, it is not possible to generate a mipmapped texture from a CVPixelBuffer directly.
CVPixelBuffer images typically have a linear/stride layout, as non-GPU hardware blocks might be interacting with those, and most GPU hardware only supports mipmapping from tiled textures. You'll need to issue a blit to copy from the linear MTLTexture to a private MTLTexture of your own creation, then generate mipmaps.
As for textureAttributes, there is only one key supported: kCVMetalTextureCacheMaximumTextureAgeKey
There isn't a method to get a mipmapped texture directly, but you can generate one yourself easily enough.
First use your Metal device to create an empty Metal texture that is the same size and format as your existing texture, but has a full mipmap chain. See newTexture documentation
Use your MTLCommandBuffer object to create a blitEncoder object. See blitCommandEncoder documentation
Use the blitEncoder to copy from your camera texture to your empty texture. destinationLevel should be zero as you are only copying the top level mipmap. See copyFromTexture documentation
Finally use the blitEncoder to generate all the mip levels by calling generateMipmapsForTexture. See generateMipmapsForTexture documentation
At the end of this you have a metal texture from the camera with a full mip chain.
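A minimal sketch of those four steps, assuming an existing device, commandQueue, and the cameraTexture obtained from the texture cache (the names are illustrative, not from the answer above):
MTLTextureDescriptor *desc =
    [MTLTextureDescriptor texture2DDescriptorWithPixelFormat:cameraTexture.pixelFormat
                                                        width:cameraTexture.width
                                                       height:cameraTexture.height
                                                    mipmapped:YES];
desc.storageMode = MTLStorageModePrivate; // the private texture suggested above
id<MTLTexture> mippedTexture = [device newTextureWithDescriptor:desc];

id<MTLCommandBuffer> commandBuffer = [commandQueue commandBuffer];
id<MTLBlitCommandEncoder> blit = [commandBuffer blitCommandEncoder];

// Copy the camera image into mip level 0 of the private texture.
[blit copyFromTexture:cameraTexture
          sourceSlice:0
          sourceLevel:0
         sourceOrigin:MTLOriginMake(0, 0, 0)
           sourceSize:MTLSizeMake(cameraTexture.width, cameraTexture.height, 1)
            toTexture:mippedTexture
     destinationSlice:0
     destinationLevel:0
    destinationOrigin:MTLOriginMake(0, 0, 0)];

// Fill in the remaining mip levels.
[blit generateMipmapsForTexture:mippedTexture];
[blit endEncoding];
[commandBuffer commit];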
I'm using PBJVision to implement tap-to-record video functionality. The library doesn't support orientation yet so I'm in the process of trying to engineer it in. From what I see, there are three ways to rotate the video - I need help on deciding the best way forward and how to implement it. Note that rotation can happen between tap-to-record segments. So in a recording session, the orientation is locked to what it was when the user tapped the button. The next time the user taps the button to record, it should re-set the orientation to whatever the device's orientation is (so the resulting video shows right-side-up).
The approaches are outlined in the issue page on GitHub as well
Method 1
Rotate the AVCaptureConnection using setVideoOrientation: - this causes the video preview to flicker every time it's switched, since this switches the actual hardware it seems. Not cool, not acceptable.
Method 2
Set the transform property on the AVAssetWriterInput object used to write the video. The problem is, once the asset writer starts writing, the transform property can't be changed, so this only works for the first segment of the video.
Method 3
Rotate the image buffer being appended using something like this: How to directly rotate CVImageBuffer image in IOS 4 without converting to UIImage? but it keeps crashing and I'm not even sure if I'm barking up the right tree. There's an exception that is thrown and I can't really trace it back to much more than the fact that I'm using the vImageRotate90_ARGB8888 function incorrectly.
The explanation is a bit more detailed on the GitHub issue page I linked to above. Any suggestions would be welcome - to be honest, I'm not hugely experienced at AVFoundation and so I'm hoping that there's some miraculous way to do this that I don't even know about!
Method 1 isn't the preferred method according to Apple's documentation ("Physically rotating buffers does come with a performance cost, so only request rotation if it's necessary"). Method 2 worked for me but if I played my video on an app that doesn't support the transformation "metadata", the video isn't rotated properly. Method 3 is what I did.
I think it's crashing for you because you're trying to pass the image data directly from vImageRotate... to the AVAssetWriterInputPixelBufferAdaptor. You have to create a CVPixelBufferRef first. Here's my code:
Inside of captureOutput:didOutputSampleBuffer:fromConnection: I rotate the frame before writing it into the adaptor:
if ([self.videoWriterInput isReadyForMoreMediaData])
{
// Rotate buffer first and then write to adaptor
CMTime sampleTime = CMSampleBufferGetPresentationTimeStamp(sampleBuffer);
CVPixelBufferRef rotatedBuffer = [self correctBufferOrientation:sampleBuffer];
[self.videoWriterInputAdaptor appendPixelBuffer:rotatedBuffer withPresentationTime:sampleTime];
CVBufferRelease(rotatedBuffer);
}
The referenced function that performs the vImage rotation is:
/* rotationConstant:
* 0 -- rotate 0 degrees (simply copy the data from src to dest)
* 1 -- rotate 90 degrees counterclockwise
* 2 -- rotate 180 degrees
* 3 -- rotate 270 degrees counterclockwise
*/
- (CVPixelBufferRef)rotateBuffer:(CMSampleBufferRef)sampleBuffer withConstant:(uint8_t)rotationConstant
{
CVImageBufferRef imageBuffer = CMSampleBufferGetImageBuffer(sampleBuffer);
CVPixelBufferLockBaseAddress(imageBuffer, 0);
OSType pixelFormatType = CVPixelBufferGetPixelFormatType(imageBuffer);
NSAssert(pixelFormatType == kCVPixelFormatType_32ARGB, @"Code works only with 32ARGB format. Test/adapt for other formats!");
const size_t kAlignment_32ARGB = 32;
const size_t kBytesPerPixel_32ARGB = 4;
size_t bytesPerRow = CVPixelBufferGetBytesPerRow(imageBuffer);
size_t width = CVPixelBufferGetWidth(imageBuffer);
size_t height = CVPixelBufferGetHeight(imageBuffer);
BOOL rotatePerpendicular = (rotationConstant == 1) || (rotationConstant == 3); // Use enumeration values here
const size_t outWidth = rotatePerpendicular ? height : width;
const size_t outHeight = rotatePerpendicular ? width : height;
size_t bytesPerRowOut = kBytesPerPixel_32ARGB * ceil(outWidth * 1.0 / kAlignment_32ARGB) * kAlignment_32ARGB;
const size_t dstSize = bytesPerRowOut * outHeight * sizeof(unsigned char);
void *srcBuff = CVPixelBufferGetBaseAddress(imageBuffer);
unsigned char *dstBuff = (unsigned char *)malloc(dstSize);
vImage_Buffer inbuff = {srcBuff, height, width, bytesPerRow};
vImage_Buffer outbuff = {dstBuff, outHeight, outWidth, bytesPerRowOut};
uint8_t bgColor[4] = {0, 0, 0, 0};
vImage_Error err = vImageRotate90_ARGB8888(&inbuff, &outbuff, rotationConstant, bgColor, 0);
if (err != kvImageNoError)
{
NSLog(@"%ld", err);
}
CVPixelBufferUnlockBaseAddress(imageBuffer, 0);
CVPixelBufferRef rotatedBuffer = NULL;
CVPixelBufferCreateWithBytes(NULL,
outWidth,
outHeight,
pixelFormatType,
outbuff.data,
bytesPerRowOut,
freePixelBufferDataAfterRelease,
NULL,
NULL,
&rotatedBuffer);
return rotatedBuffer;
}
void freePixelBufferDataAfterRelease(void *releaseRefCon, const void *baseAddress)
{
// Free the memory we malloced for the vImage rotation
free((void *)baseAddress);
}
Note: You may want to use an enumeration for rotationConstant. Something like this (don't call the function with MOVRotateDirectionUnknown):
typedef NS_ENUM(uint8_t, MOVRotateDirection)
{
MOVRotateDirectionNone = 0,
MOVRotateDirectionCounterclockwise90,
MOVRotateDirectionCounterclockwise180,
MOVRotateDirectionCounterclockwise270,
MOVRotateDirectionUnknown
};
Note: If you need IOSurface support, you should use CVPixelBufferCreate instead of CVPixelBufferCreateWithBytes and copy the bytes into the new buffer:
NSDictionary *pixelBufferAttributes = @{ (NSString *)kCVPixelBufferIOSurfacePropertiesKey : @{} };
CVPixelBufferCreate(kCFAllocatorDefault,
outWidth,
outHeight,
pixelFormatType,
(__bridge CFDictionaryRef)(pixelBufferAttributes),
&rotatedBuffer);
CVPixelBufferLockBaseAddress(rotatedBuffer, 0);
uint8_t *dest = CVPixelBufferGetBaseAddress(rotatedBuffer);
memcpy(dest, outbuff.data, bytesPerRowOut * outHeight);
CVPixelBufferUnlockBaseAddress(rotatedBuffer, 0);
There is an easy and safe way:
#define degreeToRadian(x) (M_PI * (x) / 180.0)
self.assetWriterInputVideo.transform =
CGAffineTransformMakeRotation(degreeToRadian(-90));
Method 3 does work to rotate the video frames, but I found it can cause a memory leak. To avoid it, I moved the rotation onto the same thread that merges the video frames, and that worked.
Keep this in mind if you run into the issue.