Efficient use of Core Image with AV Foundation - ios

I'm writing an iOS app that applies filters to existing video files and outputs the results to new ones. Initially, I tried using Brad Larson's nice framework, GPUImage. Although I was able to output filtered video files without much effort, the output wasn't perfect: the videos were the proper length, but some frames were missing, and others were duplicated (see Issue 1501 for more info). I plan to learn more about OpenGL ES so that I can better investigate the dropped/skipped frames issue. However, in the meantime, I'm exploring other options for rendering my video files.
I'm already familiar with Core Image, so I decided to leverage it in an alternative video-filtering solution. Within a block passed to AVAssetWriterInput requestMediaDataWhenReadyOnQueue:usingBlock:, I filter and output each frame of the input video file like so:
CMSampleBufferRef sampleBuffer = [self.assetReaderVideoOutput copyNextSampleBuffer];
if (sampleBuffer != NULL)
{
    CMTime presentationTimeStamp = CMSampleBufferGetOutputPresentationTimeStamp(sampleBuffer);
    CVPixelBufferRef inputPixelBuffer = CMSampleBufferGetImageBuffer(sampleBuffer);
    CIImage* frame = [CIImage imageWithCVPixelBuffer:inputPixelBuffer];
    // a CIFilter created outside the "isReadyForMoreMediaData" loop
    [screenBlend setValue:frame forKey:kCIInputImageKey];
    CVPixelBufferRef outputPixelBuffer;
    CVReturn result = CVPixelBufferPoolCreatePixelBuffer(NULL, assetWriterInputPixelBufferAdaptor.pixelBufferPool, &outputPixelBuffer);
    // verify that everything's gonna be ok
    NSAssert(result == kCVReturnSuccess, @"CVPixelBufferPoolCreatePixelBuffer failed with error code");
    NSAssert(CVPixelBufferGetPixelFormatType(outputPixelBuffer) == kCVPixelFormatType_32BGRA, @"Wrong pixel format");
    [self.coreImageContext render:screenBlend.outputImage toCVPixelBuffer:outputPixelBuffer];
    BOOL success = [assetWriterInputPixelBufferAdaptor appendPixelBuffer:outputPixelBuffer withPresentationTime:presentationTimeStamp];
    CVPixelBufferRelease(outputPixelBuffer);
    CFRelease(sampleBuffer);
    sampleBuffer = NULL;
    completedOrFailed = !success;
}
This works well: the rendering seems reasonably fast, and the resulting video file doesn't have any missing or duplicated frames. However, I'm not confident that my code is as efficient as it could be. Specifically, my questions are:
1. Does this approach allow the device to keep all frame data on the GPU, or are there any methods (e.g. imageWithCVPixelBuffer: or render:toCVPixelBuffer:) that prematurely copy pixels to the CPU?
2. Would it be more efficient to use CIContext's drawImage:inRect:fromRect: to draw to an OpenGLES context?
3. If the answer to #2 is yes, what's the proper way to pipe the results of drawImage:inRect:fromRect: into a CVPixelBufferRef so that it can be appended to the output video file?
I've searched for an example of how to use CIContext drawImage:inRect:fromRect: to render filtered video frames, but haven't found any. Notably, the source for GPUImageMovieWriter does something similar, but since a) I don't really understand it yet, and b) it's not working quite right for this use case, I'm wary of copying its solution.

Related

WebRTC iOS: Filtering camera stream from RTCCameraVideoCapturer. Conversion from RTCFrame to CVPixelBuffer

I found that the gist below is a simple and efficient approach: it uses func capturer(_ capturer: RTCVideoCapturer, didCapture frame: RTCVideoFrame) of RTCVideoCapturerDelegate. You get an RTCVideoFrame and then convert it to a CVPixelBuffer to modify.
https://gist.github.com/lyokato/d041f16b94c84753b5e877211874c6fc
However, Chromium says that nativeHandle, which was used to get the pixel buffer, is no longer available (link below). I tried frame.buffer.pixelbuffer..., but, looking at framework > Headers > RTCVideoFrameBuffer.h, I found that CVPixelBuffer is gone from there too!
https://codereview.webrtc.org/2990253002
Is there any good way to convert RTCVideoFrame to CVPixelBuffer?
Or is there a better way to modify the captured video from RTCCameraVideoCapturer?
The link below suggests modifying the SDK directly, but hopefully we can achieve this in Xcode.
How to modify (add filters to) the camera stream that WebRTC is sending to other peers/server
Can you specify what your expectation is? You can easily get the pixel buffer from an RTCVideoFrame, but if you want to filter the video buffer before it is sent to WebRTC, I feel there is a better solution: you should work with RTCVideoSource.
You can get the buffer as follows:
RTCCVPixelBuffer *buffer = (RTCCVPixelBuffer *)frame.buffer;
CVPixelBufferRef imageBuffer = buffer.pixelBuffer;
(This works with the latest SDK, and only for the local camera buffer.)
In the sample, however, I can see that the filter will not work for remote video.
I have attached a screenshot; this is how you can check the preview as well.

Importance of AVAssetWriterInputPixelBufferAdaptor in AVAssetWriter

I'm trying to output video captured from the camera using AVAssetWriter.
I'm following some examples that don't use AVAssetWriterInputPixelBufferAdaptor (Record video with AVAssetWriter), and some that do (AVCaptureSession only got video buffer).
Based on the Apple references, I've interpreted the purpose of AVAssetWriterInputPixelBufferAdaptor (or CVPixelBuffer, CVPixelBufferPool) in general to be an efficient way to buffer incoming pixels in memory. In practice, how important is it to use this when writing video output using AVAssetWriter? I seem to be able to get a basic version working without using the adaptor just fine, but I wanted to understand a bit more the benefit/intent of using AVAssetWriterInputPixelBufferAdaptor in general.
I have been using video recording without the PixelBufferAdaptor for several years without any problems. I essentially use this code:
- (void)captureOutput:(AVCaptureOutput *)captureOutput didOutputSampleBuffer:(CMSampleBufferRef)sampleBuffer fromConnection:(AVCaptureConnection *)connection {
    if (videoWriterInput.readyForMoreMediaData) {
        [videoWriterInput appendSampleBuffer:sampleBuffer];
    }
}
My take is that since the CMSampleBufferRef contains timing information, it can be written directly, whereas if you have a CVPixelBuffer you must add the timing information through the adaptor. So if you are doing some image processing before writing, you will end up with a CVPixelBuffer and have to use the adaptor. The adaptor might also add some buffering capabilities for the CVPixelBuffer if your processing takes time.

React native: Real time camera data without image save and preview

I have started working on my first non-demo react-native app. I hope it will be an iOS/Android app, but for now I'm focused on iOS only.
I have one problem. How can I get data (base64, an array of pixels, ...) in real time from the camera without saving to the camera roll?
There is this module: https://github.com/lwansbrough/react-native-camera but its base64 option is deprecated and is useless for me, because I want to render a processed image to the user (e.g. with changed picture colors), not the real picture from the camera, as the react-native-camera module does.
(I know how to communicate with Swift code, but I don't know what the options are in native code; I come from web development.)
Thanks a lot.
This may not be optimal but is what I have been using. If anyone can give a better solution, I would appreciate your help, too!
My basic idea is simply to loop (not with a simple for-loop, see below), taking still pictures in YUV/RGB format at max resolution, which is reasonably fast (~x0ms with normal exposure duration), and process them. Basically, you set up an AVCaptureStillImageOutput linked to your camera (following the tutorials found everywhere), then set the format to kCVPixelFormatType_420YpCbCr8BiPlanarFullRange (if you want YUV) or kCVPixelFormatType_32BGRA (if you prefer RGBA), like so:
bool usingYUVFormat = true;
NSDictionary *outputFormat = [NSDictionary dictionaryWithObject:
    [NSNumber numberWithInt:usingYUVFormat ? kCVPixelFormatType_420YpCbCr8BiPlanarFullRange : kCVPixelFormatType_32BGRA]
    forKey:(id)kCVPixelBufferPixelFormatTypeKey];
[yourAVCaptureStillImageOutput setOutputSettings:outputFormat];
When you are ready, you can start calling
AVCaptureConnection *captureConnection = [yourAVCaptureStillImageOutput connectionWithMediaType:AVMediaTypeVideo];
[yourAVCaptureStillImageOutput captureStillImageAsynchronouslyFromConnection:captureConnection completionHandler:^(CMSampleBufferRef imageDataSampleBuffer, NSError *error) {
    if (imageDataSampleBuffer) {
        CVImageBufferRef imageBuffer = CMSampleBufferGetImageBuffer(imageDataSampleBuffer);
        CVPixelBufferLockBaseAddress(imageBuffer, 0);
        // do your magic with the data buffer imageBuffer
        // use CVPixelBufferGetBaseAddressOfPlane(imageBuffer, 0/1) to get each plane (bi-planar YUV has two: Y and CbCr)
        // use CVPixelBufferGetWidth/CVPixelBufferGetHeight to get dimensions
        // if you want more, please google
        CVPixelBufferUnlockBaseAddress(imageBuffer, 0); // balance the lock once processing is done
    }
}];
Additionally, use NSNotificationCenter to register your photo-taking action and post a notification after you have processed each frame (with some delay perhaps, to cap your throughput and reduce power consumption) so the loop will keep going.
A quick precaution: the Android counterpart is a much worse headache. Few hardware manufacturers implement an API for max-resolution uncompressed photos; most offer only 1080p for preview/video, as I have raised in my question. I am still looking for solutions but have given up most hope. JPEG images are just too slow.
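As a rough illustration of the "do your magic" step in the completion handler above, here is a minimal plain-C sketch (it also compiles inside an Objective-C file) of how one pixel might be read from a bi-planar 4:2:0 buffer. The names YCbCrSample and sample_biplanar420 are hypothetical. It assumes the plane pointers come from CVPixelBufferGetBaseAddressOfPlane(imageBuffer, 0) and (imageBuffer, 1), the strides from CVPixelBufferGetBytesPerRowOfPlane, and that the buffer is still locked while it runs.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper type: one luma value plus the chroma pair that covers it. */
typedef struct { uint8_t y, cb, cr; } YCbCrSample;

/* Reads the luma and chroma values covering pixel (x, y) of a bi-planar
 * 4:2:0 image: plane 0 holds one Y byte per pixel, plane 1 holds
 * interleaved Cb,Cr pairs, one pair per 2x2 block of pixels.
 * Strides are in bytes and may be larger than the visible width,
 * so rows are always addressed through the stride, never the width. */
YCbCrSample sample_biplanar420(const uint8_t *yPlane, size_t yStride,
                               const uint8_t *cbcrPlane, size_t cbcrStride,
                               size_t x, size_t y)
{
    const uint8_t *lumaRow = yPlane + y * yStride;
    const uint8_t *chromaRow = cbcrPlane + (y / 2) * cbcrStride;
    YCbCrSample s;
    s.y = lumaRow[x];
    s.cb = chromaRow[(x / 2) * 2];      /* Cb comes first in each pair */
    s.cr = chromaRow[(x / 2) * 2 + 1];  /* Cr follows it */
    return s;
}
```

The caller is responsible for keeping x and y inside the image dimensions; the sketch does no bounds checking.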

How to decode a live555 rtsp stream (h.264) MediaSink data using iOS8's VideoToolbox?

OK, I know that this question is almost the same as get-rtsp-stream-from-live555-and-decode-with-avfoundation, but VideoToolbox has now become public in iOS 8, and although I know that it can be done using this framework, I have no idea how to do it.
My goals are:
1. Connect to a WiFi camera using the RTSP protocol and receive stream data (done with live555)
2. Decode the data and convert it to UIImages to display on the screen (motionJPEG-like)
3. Save the streamed data to a .mov file
I reached all these goals using ffmpeg, but unfortunately I can't use it due to my company's policy.
I know that I can display on the screen using OpenGL too, but this time I have to convert to UIImages. I also tried to use the libraries below:
ffmpeg: can't use it this time due to company policy (don't ask me why).
libVLC: the display lags by about 2 seconds, and I don't have access to the stream data to save it into a .mov file...
gstreamer: same as above.
I believe that live555 + VideoToolbox will do the job; I just can't figure out how to make this happen...
I did it. VideoToolbox is still poorly documented and there is not much information about video programming (without using ffmpeg), so it cost me more time than I expected.
For the stream from live555, I got the SPS and PPS info to create the CMVideoFormatDescription like this:
const uint8_t *props[] = {[spsData bytes], [ppsData bytes]};
size_t sizes[] = {[spsData length], [ppsData length]};
OSStatus result = CMVideoFormatDescriptionCreateFromH264ParameterSets(NULL, 2, props, sizes, 4, &videoFormat);
Now, the difficult part (because I'm a noob at video programming): replace the NAL unit header with a 4-byte length code, as described here:
int headerEnd = 23; //where the real data starts
uint32_t hSize = (uint32_t)([rawData length] - headerEnd - 4);
uint32_t bigEndianSize = CFSwapInt32HostToBig(hSize);
NSMutableData *videoData = [NSMutableData dataWithBytes:&bigEndianSize length:sizeof(bigEndianSize)];
[videoData appendData:[rawData subdataWithRange:NSMakeRange(headerEnd + 4, [rawData length] - headerEnd - 4)]];
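The same header replacement can be sketched in plain C, which also compiles inside an Objective-C file. This is only an illustration under stated assumptions: the hypothetical function annexb_to_avcc expects the buffer to begin exactly at a 4-byte start code (00 00 00 01), whereas the code above first skips a stream-specific header of headerEnd bytes. It overwrites the start code in place with the big-endian payload length, matching the 4-byte length size passed to CMVideoFormatDescriptionCreateFromH264ParameterSets earlier.

```c
#include <stddef.h>
#include <stdint.h>

/* Rewrites a 4-byte start code (00 00 00 01) at the front of `nal`
 * into a 4-byte big-endian length prefix, in place.
 * Returns 0 on success, or -1 if the buffer is too short, does not begin
 * with that start code, or the payload does not fit in 32 bits. */
int annexb_to_avcc(uint8_t *nal, size_t len)
{
    if (nal == NULL || len <= 4)
        return -1;
    if (nal[0] != 0 || nal[1] != 0 || nal[2] != 0 || nal[3] != 1)
        return -1;

    size_t payload = len - 4;  /* bytes that follow the start code */
    if (payload > UINT32_MAX)
        return -1;

    /* Write the length byte by byte so the result is big-endian on any host. */
    nal[0] = (uint8_t)(payload >> 24);
    nal[1] = (uint8_t)(payload >> 16);
    nal[2] = (uint8_t)(payload >> 8);
    nal[3] = (uint8_t)payload;
    return 0;
}
```

Streams that use 3-byte start codes, or that pack several NAL units into one buffer, would need extra handling that this sketch leaves out.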
With that, I was able to create a CMBlockBuffer successfully from this raw data and pass the buffer to VTDecompressionSessionDecodeFrame. From here it is easy to convert the resulting CVImageBufferRef to a UIImage... I used this Stack Overflow thread as a reference.
And finally, I saved the stream data, converted to UIImages, following the explanation given in How do I export UIImage array as a movie?
I have posted only a little of my code because I believe this is the important part; in other words, it is where I was having problems.

How to use AVAssetWriter instead of AVAssetExportSession to re-encode existing video

I'm trying to re-encode videos on an iPad which were recorded on that device but with the "wrong" orientation. This is because when the file is converted to an MP4 file and uploaded to a web server for use with the "video" HTML5 tag, only Safari seems to render the video with the correct orientation.
Basically, I've managed to implement what I wanted by using an AVMutableVideoCompositionLayerInstruction, and then using AVAssetExportSession to create the resultant video with audio. However, the problem is that the file sizes jump up considerably after doing this, e.g. correcting an original file of 4.1MB results in a final file size of 18.5MB! All I've done is rotate the video through 180 degrees!! Incidentally, the video instance that I'm trying to process was originally created by the UIImagePicker during "compression" using videoQuality = UIImagePickerControllerQualityType640x480, which actually results in videos of 568 x 320 on an iPad mini.
I experimented with the various presetName settings on AVAssetExportSession but I couldn't get the desired result. The closest I got filesize-wise was 4.1MB (ie exactly the same as source!) by using AVAssetExportPresetMediumQuality, BUT this also reduced the dimensions of the resultant video to 480 x 272 instead of the 568 x 320 that I had explicitly set.
So, this led me to look into other options, and hence using AVAssetWriter instead. The problem is, I can't get any code that I have found to work! I tried the code found on this SO post (Video Encoding using AVAssetWriter - CRASHES), but can't get it to work. For a start, I get a compilation error for this line:
NSDictionary *videoOptions = [NSDictionary dictionaryWithObject:[NSNumber numberWithInt:kCVPixelFormatType_420YpCbCr8BiPlanarVideoRange] forKey:(id)kCVPixelBufferPixelFormatTypeKey];
The resultant compilation error being:
Undefined symbols for architecture armv7: "_kCVPixelBufferPixelFormatTypeKey"
This aside, I tried passing in nil for the AVAssetReaderTrackOutput's outputSettings, which should be OK according to the header info:
A value of nil for outputSettings configures the output to vend samples in their original format as stored by the specified track.
However, I then get a crash happening at this line:
BOOL result = [videoWriterInput appendSampleBuffer:sampleBuffer];
In short, I've not been able to get any code to work with AVAssetWriter, so I REALLY need some help here. Are there any other ways to achieve my desired results? Incidentally, I'm using Xcode 4.6 and I'm targeting everything from iOS5 upwards, using ARC.
I have solved similar problems related to your questions. This might help someone who has similar problems:
Assuming writerInput is your object instance of AVAssetWriterInput and assetTrack is the instance of your AVAssetTrack, then your transform problem is solved by simply:
writerInput.transform = assetTrack.preferredTransform;
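To make concrete what a 180-degree transform like the one in the question does to frame coordinates, here is a small plain-C sketch. The Affine and Point2D types and both function names are hypothetical; the point-mapping rule is assumed to follow the CGAffineTransform convention (x' = a*x + c*y + tx, y' = b*x + d*y + ty), and the 568 x 320 size in the usage note is the one mentioned in the question. The actual preferredTransform of a given track may well differ.

```c
/* Hypothetical stand-ins for CGAffineTransform and CGPoint. */
typedef struct { double a, b, c, d, tx, ty; } Affine;
typedef struct { double x, y; } Point2D;

/* Builds a 180-degree rotation that keeps the image inside a
 * width x height frame: negate both axes, then translate back. */
Affine rotate180_in_frame(double width, double height)
{
    Affine t = { -1.0, 0.0, 0.0, -1.0, width, height };
    return t;
}

/* Maps a point through the transform using the assumed convention. */
Point2D apply_affine(Affine t, Point2D p)
{
    Point2D q;
    q.x = t.a * p.x + t.c * p.y + t.tx;
    q.y = t.b * p.x + t.d * p.y + t.ty;
    return q;
}
```

With rotate180_in_frame(568.0, 320.0), the origin maps to the opposite corner (568, 320) and that corner maps back to the origin, which is the flip the question is after.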
You have to release sampleBuffer after appending it, so you would have something like:
CMSampleBufferRef sampleBuffer = [asset_reader_output copyNextSampleBuffer];
if (sampleBuffer != NULL) {
    BOOL result = [writerInput appendSampleBuffer:sampleBuffer];
    CFRelease(sampleBuffer); // Release sampleBuffer!
}
The compilation error was caused by my not including CoreVideo.framework. As soon as I had included and imported it, I could get the code to compile. The code also worked and generated a resultant video, but I uncovered 2 new problems:
1. I can't get the transform to work using the transform property on AVAssetWriterInput. This means that I'm stuck with using an AVMutableVideoCompositionInstruction and AVAssetExportSession for the transformation.
2. If I use AVAssetWriter to just handle compression (seeing as I don't have many options with AVAssetExportSession), I still have a bad memory leak. I've tried everything I can think of, starting with the solution in this link ( Help Fix Memory Leak release ) and also with @autoreleasepool blocks at key points. But it seems that the following line will cause a leak, no matter what I try:
CMSampleBufferRef sampleBuffer = [asset_reader_output copyNextSampleBuffer];
I could really do with some help.

Resources