How to use AVAssetWriter instead of AVAssetExportSession to re-encode existing video - ios

I'm trying to re-encode videos on an iPad which were recorded on that device but with the "wrong" orientation. This is because when the file is converted to an MP4 file and uploaded to a web server for use with the "video" HTML5 tag, only Safari seems to render the video with the correct orientation.
Basically, I've managed to implement what I wanted by using an AVMutableVideoCompositionLayerInstruction, and then using AVAssetExportSession to create the resultant video with audio. However, the problem is that the file sizes jump up considerably after doing this, e.g. correcting an original file of 4.1MB results in a final file size of 18.5MB! All I've done is rotate the video through 180 degrees! Incidentally, the video instance that I'm trying to process was originally created by the UIImagePicker during "compression" using videoQuality = UIImagePickerControllerQualityType640x480, which actually results in videos of 568 x 320 on an iPad mini.
I experimented with the various presetName settings on AVAssetExportSession but I couldn't get the desired result. The closest I got filesize-wise was 4.1MB (i.e. exactly the same as the source!) by using AVAssetExportPresetMediumQuality, BUT this also reduced the dimensions of the resultant video to 480 x 272 instead of the 568 x 320 that I had explicitly set.
So, this led me to look into other options, and hence using AVAssetWriter instead. The problem is, I can't get any code that I have found to work! I tried the code found on this SO post (Video Encoding using AVAssetWriter - CRASHES), but can't get it to work. For a start, I get a compilation error for this line:
NSDictionary *videoOptions = [NSDictionary dictionaryWithObject:[NSNumber numberWithInt:kCVPixelFormatType_420YpCbCr8BiPlanarVideoRange] forKey:(id)kCVPixelBufferPixelFormatTypeKey];
The resultant compilation error being:
Undefined symbols for architecture armv7: "_kCVPixelBufferPixelFormatTypeKey"
This aside, I tried passing in nil for the AVAssetReaderTrackOutput's outputSettings, which should be OK according to the header info:
A value of nil for outputSettings configures the output to vend samples in their original format as stored by the specified track.
However, I then get a crash happening at this line:
BOOL result = [videoWriterInput appendSampleBuffer:sampleBuffer];
In short, I've not been able to get any code to work with AVAssetWriter, so I REALLY need some help here. Are there any other ways to achieve my desired results? Incidentally, I'm using Xcode 4.6 and I'm targeting everything from iOS5 upwards, using ARC.

I have solved similar problems related to your questions. This might help someone who has similar problems:
Assuming writerInput is your object instance of AVAssetWriterInput and assetTrack is the instance of your AVAssetTrack, then your transform problem is solved by simply:
writerInput.transform = assetTrack.preferredTransform;
You have to release sampleBuffer after appending it, because copyNextSampleBuffer returns a buffer that the caller owns. So you would have something like:
CMSampleBufferRef sampleBuffer = [asset_reader_output copyNextSampleBuffer];
if (sampleBuffer != NULL) {
    BOOL result = [writerInput appendSampleBuffer:sampleBuffer];
    CFRelease(sampleBuffer); // Release sampleBuffer!
}
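The two points above can be put together in a fuller read/write loop. The following is only a sketch under stated assumptions: the helper name ReencodeVideoTrack, the queue label, and the 1 Mbit/s average bit rate are hypothetical, audio is not handled, and finishWritingWithCompletionHandler: needs iOS 6 or later. Note that the transform set on the writer input is stored as display metadata on the output track; the pixels themselves are not rotated, so a player that ignores this metadata may still show the original orientation.

```objective-c
#import <AVFoundation/AVFoundation.h>
#import <CoreVideo/CoreVideo.h>

// Hypothetical helper: re-encodes the first video track of `asset` into a new
// MP4 file at `outputURL` (the file must not already exist).
static void ReencodeVideoTrack(AVAsset *asset, NSURL *outputURL,
                               void (^completion)(BOOL success))
{
    NSError *error = nil;
    NSArray *videoTracks = [asset tracksWithMediaType:AVMediaTypeVideo];
    if (videoTracks.count == 0) { completion(NO); return; }
    AVAssetTrack *assetTrack = [videoTracks objectAtIndex:0];

    AVAssetReader *reader = [AVAssetReader assetReaderWithAsset:asset error:&error];
    AVAssetWriter *writer = [AVAssetWriter assetWriterWithURL:outputURL
                                                     fileType:AVFileTypeMPEG4
                                                        error:&error];
    if (reader == nil || writer == nil) { completion(NO); return; }

    // Ask the reader for decoded frames so that the writer can re-compress them.
    NSDictionary *readerSettings = @{
        (id)kCVPixelBufferPixelFormatTypeKey :
            @(kCVPixelFormatType_420YpCbCr8BiPlanarVideoRange)
    };
    AVAssetReaderTrackOutput *readerOutput =
        [AVAssetReaderTrackOutput assetReaderTrackOutputWithTrack:assetTrack
                                                   outputSettings:readerSettings];
    [reader addOutput:readerOutput];

    // Keep the source dimensions; the bit rate is an assumed value to tune.
    NSDictionary *writerSettings = @{
        AVVideoCodecKey : AVVideoCodecH264,
        AVVideoWidthKey : @(assetTrack.naturalSize.width),
        AVVideoHeightKey : @(assetTrack.naturalSize.height),
        AVVideoCompressionPropertiesKey : @{ AVVideoAverageBitRateKey : @1000000 }
    };
    AVAssetWriterInput *writerInput =
        [AVAssetWriterInput assetWriterInputWithMediaType:AVMediaTypeVideo
                                           outputSettings:writerSettings];
    // Must be set before writing starts; stored as track display metadata.
    writerInput.transform = assetTrack.preferredTransform;
    [writer addInput:writerInput];

    if (![reader startReading] || ![writer startWriting]) { completion(NO); return; }
    [writer startSessionAtSourceTime:kCMTimeZero];

    dispatch_queue_t queue = dispatch_queue_create("reencode.queue", DISPATCH_QUEUE_SERIAL);
    [writerInput requestMediaDataWhenReadyOnQueue:queue usingBlock:^{
        while (writerInput.isReadyForMoreMediaData) {
            @autoreleasepool {
                CMSampleBufferRef sampleBuffer = [readerOutput copyNextSampleBuffer];
                if (sampleBuffer == NULL) {
                    // End of track, or the reader failed: check its status.
                    BOOL readOK = (reader.status == AVAssetReaderStatusCompleted);
                    [writerInput markAsFinished];
                    [writer finishWritingWithCompletionHandler:^{
                        completion(readOK && writer.status == AVAssetWriterStatusCompleted);
                    }];
                    return;
                }
                BOOL appended = [writerInput appendSampleBuffer:sampleBuffer];
                CFRelease(sampleBuffer); // balances copyNextSampleBuffer
                if (!appended) {
                    [reader cancelReading];
                    [writerInput markAsFinished];
                    [writer cancelWriting];
                    completion(NO);
                    return;
                }
            }
        }
    }];
}
```

Each buffer returned by copyNextSampleBuffer is released exactly once, right after it has been appended, and the @autoreleasepool inside the loop keeps temporary objects from accumulating while a long file is processed.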

The compilation error was caused by me not including the CoreVideo.framework. As soon as I had included that and imported it, I could get the code to compile. Also, the code would work and generate a resultant video, but I uncovered 2 new problems:
1. I can't get the transform to work using the transform property on AVAssetWriterInput. This means that I'm stuck with using an AVMutableVideoCompositionInstruction and AVAssetExportSession for the transformation.
2. If I use AVAssetWriter to just handle compression (seeing as I don't have many options with AVAssetExportSession), I still have a bad memory leak. I've tried everything I can think of, starting with the solution in this link ( Help Fix Memory Leak release ) and also with @autoreleasepool blocks at key points. But it seems that the following line will cause a leak, no matter what I try:
CMSampleBufferRef sampleBuffer = [asset_reader_output copyNextSampleBuffer];
I could really do with some help.

Related

UIVideoEditorController videoQuality setting not working

I'm currently trying to use UIVideoEditorController to trim a video file previously selected from the Photos app by using UIImagePickerController. I verified that after choosing the file and before creating the UIVideoEditorController the file has the original resolution. However, after trimming the video, the output is always in 360p resolution despite setting the videoQuality property to high. All the applicable settings seem to be ignored for this property and the video ends up being compressed unnecessarily.
I have found multiple people reporting similar issues but have yet to find a workaround for this.
let editor = UIVideoEditorController()
editor.delegate = self
editor.videoMaximumDuration = 0 // No limit for now
// TODO: This should really work, however while testing it seems like the imported videos are always scaled down
// to 320p, which is equivalent to the default value of .low.
editor.videoQuality = .typeHigh
editor.videoPath = internalURL.path
Any help would be greatly appreciated.

WebRTC iOS: Filtering camera stream from RTCCameraVideoCapturer. Conversion from RTCFrame to CVPixelBuffer

I found that the gist below is simple and efficient: it uses func capturer(_ capturer: RTCVideoCapturer, didCapture frame: RTCVideoFrame) of RTCVideoCapturerDelegate. You get an RTCVideoFrame and then convert it to a CVPixelBuffer to modify.
https://gist.github.com/lyokato/d041f16b94c84753b5e877211874c6fc
However, I found that Chromium says nativeHandle, which was used to get the pixel buffer, is no longer available (link below). I tried frame.buffer.pixelbuffer..., but, looking at framework > Headers > RTCVideoFrameBuffer.h, I found CVPixelBuffer is also gone from here!
https://codereview.webrtc.org/2990253002
Is there any good way to convert an RTCVideoFrame to a CVPixelBuffer?
Or is there a better way to modify the captured video from RTCCameraVideoCapturer?
The link below suggests modifying the SDK directly, but hopefully we can achieve this in Xcode.
How to modify (add filters to) the camera stream that WebRTC is sending to other peers/server
Can you specify what you expect? You can get the pixel buffer from an RTCVideoFrame easily, but if you want to filter the video buffer before it is sent to WebRTC, I feel there may be a better solution: you should work with RTCVideoSource.
You can get the buffer like this:
RTCCVPixelBuffer *buffer = (RTCCVPixelBuffer *)frame.buffer;
CVPixelBufferRef imageBuffer = buffer.pixelBuffer;
(This works with the latest SDK, and only for the local camera buffer.)
However, in the sample I can see that the filter will not work for remote video.
I have attached a screenshot that shows how you can check the preview as well.
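Since frame.buffer is declared as id<RTCVideoFrameBuffer> and is not guaranteed to be backed by a CVPixelBuffer (which matches the remark above that this applies to local camera frames only), it may be safer to check the class before casting. A minimal sketch, assuming the WebRTC framework's umbrella header is available; the helper name PixelBufferFromFrame is hypothetical:

```objective-c
#import <WebRTC/WebRTC.h>
#import <CoreVideo/CoreVideo.h>

// Hypothetical helper: returns the CVPixelBuffer behind `frame`, or NULL when
// the frame is backed by another buffer type (for example an I420 buffer).
// The returned buffer is not retained; retain it if it must outlive the frame.
static CVPixelBufferRef PixelBufferFromFrame(RTCVideoFrame *frame)
{
    id<RTCVideoFrameBuffer> buffer = frame.buffer;
    if ([buffer isKindOfClass:[RTCCVPixelBuffer class]]) {
        return ((RTCCVPixelBuffer *)buffer).pixelBuffer;
    }
    return NULL;
}
```

A caller would typically invoke this from the capturer delegate callback and skip filtering when NULL comes back.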

React native: Real time camera data without image save and preview

I have started working on my first non-demo react-native app. I hope it will be an iOS/Android app, but for now I'm focused on iOS only.
I have one problem. How can I get data (base64, an array of pixels, ...) from the camera in real time without saving it to the camera roll?
There is this module: https://github.com/lwansbrough/react-native-camera but base64 is deprecated there and it is useless for me, because I want to render a processed image to the user (e.g. with changed picture colors), not the real picture from the camera, which is what the react-native-camera module does.
(I know how to communicate with Swift code, but I don't know what the options are in native code; I come from web development.)
Thanks a lot.
This may not be optimal but is what I have been using. If anyone can give a better solution, I would appreciate your help, too!
My basic idea is simply to loop (but not with a simple for-loop, see below), taking still pictures in YUV/RGB format at max resolution, which is reasonably fast (~x0ms with normal exposure duration), and process them. Basically you set up an AVCaptureStillImageOutput that is linked to your camera (following the tutorials that are everywhere), then set the format to kCVPixelFormatType_420YpCbCr8BiPlanarFullRange (if you want YUV) or kCVPixelFormatType_32BGRA (if you prefer RGBA), like this:
bool usingYUVFormat = true;
NSDictionary *outputFormat = [NSDictionary dictionaryWithObject:
[NSNumber numberWithInt:usingYUVFormat?kCVPixelFormatType_420YpCbCr8BiPlanarFullRange:kCVPixelFormatType_32BGRA]
forKey:(id)kCVPixelBufferPixelFormatTypeKey];
[yourAVCaptureStillImageOutput setOutputSettings:outputFormat];
When you are ready, you can start calling
AVCaptureConnection *captureConnection = [yourAVCaptureStillImageOutput connectionWithMediaType:AVMediaTypeVideo];
[yourAVCaptureStillImageOutput captureStillImageAsynchronouslyFromConnection:captureConnection completionHandler:^(CMSampleBufferRef imageDataSampleBuffer, NSError *error) {
    if (imageDataSampleBuffer) {
        CVImageBufferRef imageBuffer = CMSampleBufferGetImageBuffer(imageDataSampleBuffer);
        CVPixelBufferLockBaseAddress(imageBuffer, 0);
        // do your magic with the data buffer imageBuffer
        // use CVPixelBufferGetBaseAddressOfPlane(imageBuffer,0/1/2); to get each plane
        // use CVPixelBufferGetWidth/CVPixelBufferGetHeight to get dimensions
        // if you want more, please google
        CVPixelBufferUnlockBaseAddress(imageBuffer, 0); // balance the lock once processing is done
    }
}];
Additionally, use NSNotificationCenter to register your photo-taking action and post a notification after you have processed each frame (with some delay perhaps, to cap your throughput and reduce power consumption) so the loop will keep going.
A quick precaution: the Android counterpart is a much worse headache. Few hardware manufacturers implement an API for max-resolution uncompressed photos; most offer only 1080p for preview/video, as I have raised in my question. I am still looking for solutions but have given up most hope. JPEG images are just too slow.

Efficient use of Core Image with AV Foundation

I'm writing an iOS app that applies filters to existing video files and outputs the results to new ones. Initially, I tried using Brad Larson's nice framework, GPUImage. Although I was able to output filtered video files without much effort, the output wasn't perfect: the videos were the proper length, but some frames were missing, and others were duplicated (see Issue 1501 for more info). I plan to learn more about OpenGL ES so that I can better investigate the dropped/skipped frames issue. However, in the meantime, I'm exploring other options for rendering my video files.
I'm already familiar with Core Image, so I decided to leverage it in an alternative video-filtering solution. Within a block passed to AVAssetWriterInput requestMediaDataWhenReadyOnQueue:usingBlock:, I filter and output each frame of the input video file like so:
CMSampleBufferRef sampleBuffer = [self.assetReaderVideoOutput copyNextSampleBuffer];
if (sampleBuffer != NULL)
{
    CMTime presentationTimeStamp = CMSampleBufferGetOutputPresentationTimeStamp(sampleBuffer);
    CVPixelBufferRef inputPixelBuffer = CMSampleBufferGetImageBuffer(sampleBuffer);
    CIImage* frame = [CIImage imageWithCVPixelBuffer:inputPixelBuffer];
    // a CIFilter created outside the "isReadyForMoreMediaData" loop
    [screenBlend setValue:frame forKey:kCIInputImageKey];
    CVPixelBufferRef outputPixelBuffer;
    CVReturn result = CVPixelBufferPoolCreatePixelBuffer(NULL, assetWriterInputPixelBufferAdaptor.pixelBufferPool, &outputPixelBuffer);
    // verify that everything's gonna be ok
    NSAssert(result == kCVReturnSuccess, @"CVPixelBufferPoolCreatePixelBuffer failed with error code");
    NSAssert(CVPixelBufferGetPixelFormatType(outputPixelBuffer) == kCVPixelFormatType_32BGRA, @"Wrong pixel format");
    [self.coreImageContext render:screenBlend.outputImage toCVPixelBuffer:outputPixelBuffer];
    BOOL success = [assetWriterInputPixelBufferAdaptor appendPixelBuffer:outputPixelBuffer withPresentationTime:presentationTimeStamp];
    CVPixelBufferRelease(outputPixelBuffer);
    CFRelease(sampleBuffer);
    sampleBuffer = NULL;
    completedOrFailed = !success;
}
This works well: the rendering seems reasonably fast, and the resulting video file doesn't have any missing or duplicated frames. However, I'm not confident that my code is as efficient as it could be. Specifically, my questions are:
1. Does this approach allow the device to keep all frame data on the GPU, or are there any methods (e.g. imageWithCVPixelBuffer: or render:toCVPixelBuffer:) that prematurely copy pixels to the CPU?
2. Would it be more efficient to use CIContext's drawImage:inRect:fromRect: to draw to an OpenGLES context?
3. If the answer to #2 is yes, what's the proper way to pipe the results of drawImage:inRect:fromRect: into a CVPixelBufferRef so that it can be appended to the output video file?
I've searched for an example of how to use CIContext drawImage:inRect:fromRect: to render filtered video frames, but haven't found any. Notably, the source for GPUImageMovieWriter does something similar, but since a) I don't really understand it yet, and b) it's not working quite right for this use case, I'm wary of copying its solution.

AVFoundation: Video to OpenGL texture working - How to play and sync audio?

I've managed to load a video-track of a movie frame by frame into an OpenGL texture with AVFoundation. I followed the steps described in the answer here: iOS4: how do I use video file as an OpenGL texture?
and took some code from the GLVideoFrame sample from WWDC2010 which can be downloaded here.
How do I play the audio-track of the movie synchronously to the video? I think it would not be a good idea to play it in a separate player, but to use the audio-track of the same AVAsset.
AVAssetTrack* audioTrack = [[asset tracksWithMediaType:AVMediaTypeAudio] objectAtIndex:0];
I retrieve a video frame and its timestamp in the CADisplayLink callback via
CMSampleBufferRef sampleBuffer = [self.readerOutput copyNextSampleBuffer];
CMTime timestamp = CMSampleBufferGetPresentationTimeStamp( sampleBuffer );
where readerOutput is of type AVAssetReaderTrackOutput*
How to get the corresponding audio-samples?
And how to play them?
Edit:
I've looked around a bit and I think the best option would be to use AudioQueue from the AudioToolbox.framework, using the approach described here: AVAssetReader and Audio Queue streaming problem
There is also an audio-player in the AVFoundation: AVAudioPlayer. But I don't know exactly how I should pass data to its initWithData-initializer which expects NSData. Furthermore, I don't think it's the best choice for my case because a new AVAudioPlayer-instance would have to be created for every new chunk of audio samples, as I understand it.
Any other suggestions?
What's the best way to play the raw audio samples which I get from the AVAssetReaderTrackOutput?
You want to do an AV composition. You can merge multiple media sources, synchronized temporally, into one output.
http://developer.apple.com/library/ios/#DOCUMENTATION/AVFoundation/Reference/AVComposition_Class/Reference/Reference.html
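As a rough illustration of what such a composition might look like, the sketch below places the first video track and the first audio track of one asset on a shared timeline. It is only a sketch under assumptions: the helper name MakeSyncedComposition is hypothetical, and both source tracks are assumed to start at time zero.

```objective-c
#import <AVFoundation/AVFoundation.h>

// Hypothetical helper: builds a composition that holds the first video track
// and the first audio track of `asset`, both inserted at time zero so that
// they stay aligned. Returns nil and fills `outError` if an insert fails.
static AVMutableComposition *MakeSyncedComposition(AVAsset *asset, NSError **outError)
{
    AVMutableComposition *composition = [AVMutableComposition composition];

    for (NSString *mediaType in @[AVMediaTypeVideo, AVMediaTypeAudio]) {
        NSArray *tracks = [asset tracksWithMediaType:mediaType];
        if (tracks.count == 0) {
            continue; // the asset has no track of this type
        }
        AVAssetTrack *source = [tracks objectAtIndex:0];
        AVMutableCompositionTrack *target =
            [composition addMutableTrackWithMediaType:mediaType
                                     preferredTrackID:kCMPersistentTrackID_Invalid];
        BOOL inserted = [target insertTimeRange:source.timeRange
                                        ofTrack:source
                                         atTime:kCMTimeZero
                                          error:outError];
        if (!inserted) {
            return nil;
        }
    }
    return composition;
}
```

Because AVMutableComposition is an AVAsset, the result can be handed to an AVPlayerItem and played with AVPlayer, which takes care of audio playback and timing. On iOS 6 or later, one possible way to keep feeding frames to OpenGL is an AVPlayerItemVideoOutput attached to that item, although the answer above does not say which route it had in mind.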