Scale and crop CMSampleBufferRef - iOS

I am using AVFoundation and the AVCaptureVideoDataOutputSampleBufferDelegate to record a video.
I need to implement zoom functionality in the video being recorded. I am using the following delegate method:
- (void)captureOutput:(AVCaptureOutput *)captureOutput didOutputSampleBuffer:(CMSampleBufferRef)sampleBuffer fromConnection:(AVCaptureConnection *)connection
I am using this to get video frames because I need to add text and images to them before appending them to the AVAssetWriterInput, using
[assetWriterVideoIn appendSampleBuffer:sampleBuffer]
The only way I can think of to perform the zoom is to scale and crop the (CMSampleBufferRef)sampleBuffer that I get from the delegate method.
Please help me out on this. I need to know the possible ways to scale and crop a CMSampleBufferRef.

One solution is to convert the CMSampleBufferRef to a CIImage, scale that, render it back into a CVPixelBufferRef, and append that.
You can see how to do that, including the code structure, here:
Adding filters to video with AVFoundation (OSX) - how do I write the resulting image back to AVWriter?
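A rough sketch of that approach inside the delegate method, assuming you append through an AVAssetWriterInputPixelBufferAdaptor (here called pixelBufferAdaptor) rather than appendSampleBuffer:, and keep a reusable CIContext (ciContext); the zoomFactor property is illustrative:
CVPixelBufferRef srcPixelBuffer = CMSampleBufferGetImageBuffer(sampleBuffer);
CIImage *image = [CIImage imageWithCVPixelBuffer:srcPixelBuffer];

// Scale up around the centre, then crop back to the original extent to simulate a zoom.
CGRect originalExtent = image.extent;
CGFloat zoom = self.zoomFactor; // e.g. 2.0
CGAffineTransform t = CGAffineTransformMakeTranslation(CGRectGetMidX(originalExtent), CGRectGetMidY(originalExtent));
t = CGAffineTransformScale(t, zoom, zoom);
t = CGAffineTransformTranslate(t, -CGRectGetMidX(originalExtent), -CGRectGetMidY(originalExtent));
image = [[image imageByApplyingTransform:t] imageByCroppingToRect:originalExtent];

// Render into a pixel buffer from the adaptor's pool and append it
// (the pool is only available once the asset writer has started writing).
CVPixelBufferRef dstPixelBuffer = NULL;
CVPixelBufferPoolCreatePixelBuffer(kCFAllocatorDefault, self.pixelBufferAdaptor.pixelBufferPool, &dstPixelBuffer);
if (dstPixelBuffer) {
    [self.ciContext render:image toCVPixelBuffer:dstPixelBuffer];
    CMTime pts = CMSampleBufferGetPresentationTimeStamp(sampleBuffer);
    [self.pixelBufferAdaptor appendPixelBuffer:dstPixelBuffer withPresentationTime:pts];
    CVPixelBufferRelease(dstPixelBuffer);
}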
An alternative is to scale the video using layer instructions, like this:
AVMutableVideoCompositionLayerInstruction *layerInstruction =
[AVMutableVideoCompositionLayerInstruction videoCompositionLayerInstruction];
layerInstruction.trackID = mutableCompositionTrack.trackID;
[layerInstruction setTransform:CGAffineTransformMakeScale(2.0f,2.0f) atTime:kCMTimeZero];
This tells the composition to scale the mutableCompositionTrack (or whatever variable name you use for the track) by a factor of 2.0, starting at the beginning of the video.
Now when you composite the video, add the array of layer instructions and you'll get your scaling without needing to worry about manipulating the CMSampleBuffer (it will also be a lot faster).
AVMutableVideoComposition *videoComposition = [AVMutableVideoComposition videoComposition];
videoComposition.renderSize = CGSizeMake(1280, 720);
videoComposition.frameDuration = CMTimeMake(1, 30);
videoComposition.instructions = @[_instructions];
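For reference, AVMutableVideoComposition expects AVVideoCompositionInstruction objects rather than bare layer instructions, so the _instructions object used above would be built roughly like this (a sketch, assuming mutableComposition is your AVMutableComposition):
AVMutableVideoCompositionInstruction *instruction = [AVMutableVideoCompositionInstruction videoCompositionInstruction];
instruction.timeRange = CMTimeRangeMake(kCMTimeZero, mutableComposition.duration);
instruction.layerInstructions = @[layerInstruction];
// _instructions in the snippet above corresponds to this instruction object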

Related

AVPlayer plays video composition result incorrectly

I need a simple thing: play a video while rotating it and applying a CIFilter to it.
First, I create the player item:
AVPlayerItem *item = [AVPlayerItem playerItemWithURL:videoURL];
// DEBUG LOGGING
AVAssetTrack *track = [[item.asset tracksWithMediaType:AVMediaTypeVideo] objectAtIndex:0];
NSLog(#"Natural size is: %#", NSStringFromCGSize(track.naturalSize));
NSLog(#"Preffered track transform is: %#", NSStringFromCGAffineTransform(track.preferredTransform));
NSLog(#"Preffered asset transform is: %#", NSStringFromCGAffineTransform(item.asset.preferredTransform));
Then I need to apply the video composition. Originally, I planned to create an AVVideoComposition with two instructions: an AVVideoCompositionLayerInstruction for the rotation, and a second instruction for the CIFilter application. However, an exception was thrown saying "Expecting video composition to contain only AVCoreImageFilterVideoCompositionInstruction", which means Apple doesn't allow combining those two instruction types. As a result, I combined both into the filtering step; here is the code:
AVAsset *asset = playerItem.asset;
CGAffineTransform rotation = [self transformForItem:playerItem];
AVVideoComposition *composition = [AVVideoComposition videoCompositionWithAsset:asset applyingCIFiltersWithHandler:^(AVAsynchronousCIImageFilteringRequest * _Nonnull request) {
// Step 1: get the input frame image (screenshot 1)
CIImage *sourceImage = request.sourceImage;
// Step 2: rotate the frame
CIFilter *transformFilter = [CIFilter filterWithName:@"CIAffineTransform"];
[transformFilter setValue:sourceImage forKey: kCIInputImageKey];
[transformFilter setValue: [NSValue valueWithCGAffineTransform: rotation] forKey: kCIInputTransformKey];
sourceImage = transformFilter.outputImage;
CGRect extent = sourceImage.extent;
CGAffineTransform translation = CGAffineTransformMakeTranslation(-extent.origin.x, -extent.origin.y);
[transformFilter setValue:sourceImage forKey: kCIInputImageKey];
[transformFilter setValue: [NSValue valueWithCGAffineTransform: translation] forKey: kCIInputTransformKey];
sourceImage = transformFilter.outputImage;
// Step 3: apply the custom filter chosen by the user
extent = sourceImage.extent;
sourceImage = [sourceImage imageByClampingToExtent];
[filter setValue:sourceImage forKey:kCIInputImageKey];
sourceImage = filter.outputImage;
sourceImage = [sourceImage imageByCroppingToRect:extent];
// Step 4: finish processing the frame (screenshot 2)
[request finishWithImage:sourceImage context:nil];
}];
playerItem.videoComposition = composition;
The screenshots I made during debugging show that the image is successfully rotated and the filter is applied (in this example it was an identity filter, which doesn't change the image). Here are screenshot 1 and screenshot 2, taken at the points marked in the comments above:
As you can see, the rotation is successful and the extent of the resulting frame is also correct.
The problem starts when I try to play this video in a player. Here is what I get:
So it seems like all the frames are scaled and shifted down. The green area is empty frame space; when I clamp to the extent to make the frame infinite in size, it shows border pixels instead of green. I have a feeling that the player still takes some pre-rotation size info from the AVPlayerItem, which is why I was logging the sizes and transforms in the first code snippet above. Here are the logs:
Natural size is: {1920, 1080}
Preferred track transform is: [0, 1, -1, 0, 1080, 0]
Preferred asset transform is: [1, 0, 0, 1, 0, 0]
The player is set up like this:
layer.videoGravity = AVLayerVideoGravityResizeAspectFill;
layer.needsDisplayOnBoundsChange = YES;
PLEASE NOTE the most important thing: this only happens to videos that were recorded by the app itself with the camera in landscape iPhone [6s] orientation and saved to device storage previously. Videos that the app records in portrait mode are totally fine (by the way, the portrait videos produce exactly the same size and transform logs as the landscape videos! Strange... maybe the iPhone puts rotation info into the video and corrects for it). So the zooming and shifting of the video seems like a combination of "aspect fill" and the old, pre-rotation resolution info. By the way, the portrait video frames are shown only partially because of scaling to fill a player area with a different aspect ratio, but this is expected behavior.
Let me know your thoughts on this, and if you know a better way to accomplish what I need, it would be great to hear it.
UPDATE: It turns out there is an easier way to "change" the AVPlayerItem video dimensions during playback: set the renderSize property of the video composition (which can be done using the AVMutableVideoComposition class).
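For example (a minimal sketch; the render size values are illustrative and should match the rotated dimensions of your video):
AVMutableVideoComposition *mutableComposition = [composition mutableCopy];
mutableComposition.renderSize = CGSizeMake(1080.0, 1920.0); // swap width/height for a rotated landscape recording
playerItem.videoComposition = mutableComposition;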
MY OLD ANSWER BELOW:
After a lot of debugging I understood the problem and found a solution. My initial guess that AVPlayer still considers the video to be its original size was correct. The image below explains what was happening:
As for the solution, I couldn't find a way to change the video size inside AVAsset or AVPlayerItem. So I manipulated the video to fit the size and scale that AVPlayer was expecting, and then, when playing in a player with the correct aspect ratio and the flag to scale and fill the player area, everything looks good. Here is the graphical explanation:
And here is the additional code that needs to be inserted into the applyingCIFiltersWithHandler block mentioned in the question:
// ... continues after Step 3 in the code from the question above
// make the frame the same aspect ratio as the original input frame
// by adding empty spaces at the top and the bottom of the extent rectangle
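// originalExtent is assumed to be the extent of request.sourceImage, captured before the rotation in Step 2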
CGFloat newHeight = originalExtent.size.height * originalExtent.size.height / extent.size.height;
CGFloat inset = (extent.size.height - newHeight) / 2;
extent = CGRectInset(extent, 0, inset);
sourceImage = [sourceImage imageByCroppingToRect:extent];
// scale down to the original frame size
CGFloat scale = originalExtent.size.height / newHeight;
CGAffineTransform scaleTransform = CGAffineTransformMakeScale(scale, scale);
[transformFilter setValue:sourceImage forKey: kCIInputImageKey];
[transformFilter setValue: [NSValue valueWithCGAffineTransform: scaleTransform] forKey: kCIInputTransformKey];
sourceImage = transformFilter.outputImage;
// translate the frame so its origin starts at (0, 0)
CGAffineTransform translation = CGAffineTransformMakeTranslation(0, -inset * scale);
[transformFilter setValue:sourceImage forKey: kCIInputImageKey];
[transformFilter setValue: [NSValue valueWithCGAffineTransform: translation] forKey: kCIInputTransformKey];
sourceImage = transformFilter.outputImage;

Change camera resolution if I have width and height in iOS

I need to change the live video resolution to a width and height entered by the user. Sorry for the question, but I have never done this before.
Please help.
You can change the video resolution by using AVMutableVideoComposition and AVAssetExportSession.
First, create an AVMutableVideoComposition object as shown below.
AVMutableVideoComposition* videoComposition = [AVMutableVideoComposition videoComposition];
videoComposition.frameDuration = CMTimeMake(1, 30);
videoComposition.renderSize = CGSizeMake(YOUR_WIDTH, YOUR_HEIGHT);
Then, create an AVAssetExportSession object:
exporter = [[AVAssetExportSession alloc] initWithAsset:asset presetName:AVAssetExportPresetHighestQuality];
exporter.videoComposition = videoComposition;
And write a completion block for the exporter.
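For example (the output URL and error handling are illustrative):
exporter.outputURL = outputURL;
exporter.outputFileType = AVFileTypeQuickTimeMovie;
[exporter exportAsynchronouslyWithCompletionHandler:^{
    if (exporter.status == AVAssetExportSessionStatusCompleted) {
        NSLog(@"Export finished: %@", outputURL);
    } else {
        NSLog(@"Export failed: %@", exporter.error);
    }
}];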
Hope this helps.
If you are using OpenTok, you can use a custom video capturer that is mostly identical to the one found in this sample. The only difference is that you would additionally need to write code to scale the image from the CVPixelBuffer (called imageBuffer) to the size your user is setting.
One technique to scale the image would be to use the Core Image APIs, as shown here: https://stackoverflow.com/a/8494304/305340
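A rough sketch of that kind of scaling with Core Image (targetWidth comes from the user's input and ciContext is a CIContext created once and reused; both names are illustrative):
CIImage *image = [CIImage imageWithCVPixelBuffer:imageBuffer];
CGFloat scale = targetWidth / image.extent.size.width;
image = [image imageByApplyingTransform:CGAffineTransformMakeScale(scale, scale)];

CVPixelBufferRef scaledBuffer = NULL;
CVPixelBufferCreate(kCFAllocatorDefault,
                    (size_t)image.extent.size.width,
                    (size_t)image.extent.size.height,
                    kCVPixelFormatType_32BGRA,
                    NULL,
                    &scaledBuffer);
if (scaledBuffer) {
    [ciContext render:image toCVPixelBuffer:scaledBuffer];
    // hand scaledBuffer to the custom video capturer, then CVPixelBufferRelease(scaledBuffer)
}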

insertEmptyTimeRange to AVMutableCompositionTrack not working

I'm stitching videos together in an AVMutableCompositionTrack, using this:
AVMutableVideoCompositionLayerInstruction *passThroughLayer = [AVMutableVideoCompositionLayerInstruction videoCompositionLayerInstructionWithAssetTrack:videoTrack];
I'm also adding a CALayer with text and images to the composition, using an animationLayer.
At the beginning, I add 5 seconds of nothing to insert a title using insertEmptyTimeRange.
Up to here, everything's working fine.
Now I want to add some «nothing» to the end of the video, using insertEmptyTimeRange again - but that fails miserably.
CMTime creditsDuration = CMTimeMakeWithSeconds(5, 600);
CMTimeRange creditsRange = CMTimeRangeMake([[compositionVideoTrack asset] duration], creditsDuration);
[compositionVideoTrack insertEmptyTimeRange:creditsRange];
[compositionAudioTrack insertEmptyTimeRange:creditsRange];
NSLog(#"credit-range %f from %f", CMTimeGetSeconds(creditsRange.duration), CMTimeGetSeconds(creditsRange.start));
NSLog(#"Total duration %f", CMTimeGetSeconds([[compositionVideoTrack asset] duration]));
The insert-points are correct (first NSLog), but the total duration won't get extended...
Any ideas what I could be doing wrong?
It turns out that it seems to be impossible to add an empty time range to the end of an AVMutableComposition.
This answer saved my life: AVMutableComposition of a Solid Color with No AVAsset

How does one apply different frame rates to videoDataOutput and session's previewLayer in AVFoundation?

I am developing an augmented reality application with AVFoundation. Basically, I need to start up the camera, provide an instant preview, and get image samples every second. Currently I am using AVCaptureVideoPreviewLayer for the camera preview and AVCaptureVideoDataOutput to get sample frames.
But the problem is that a frame rate reasonable for the AVCaptureVideoPreviewLayer is way too high for the AVCaptureVideoDataOutput. How can I apply different frame rates to them?
Thanks.
There is no answer yet, so I'll put my temporary solution here:
First, add a property:
@property (assign, nonatomic) NSTimeInterval lastFrameTimestamp;
And the delegate method:
- (void)captureOutput:(AVCaptureOutput *)captureOutput didOutputSampleBuffer:(CMSampleBufferRef)sampleBuffer fromConnection:(AVCaptureConnection *)connection
{
    CMTime timestamp = CMSampleBufferGetPresentationTimeStamp(sampleBuffer);
    NSTimeInterval currentFrameTimestamp = (double)timestamp.value / timestamp.timescale;
    if (currentFrameTimestamp - self.lastFrameTimestamp > secondsBetweenSampling) {
        self.lastFrameTimestamp = currentFrameTimestamp;
        // deal with the sampleBuffer here
    }
}
The idea is pretty simple. As we can see, I just keep the frame rate high so the preview layer is satisfied. But upon receiving an output, I check the timestamp every time to decide whether to deal with that frame or just ignore it.
Still looking for better ideas ;)

AV Foundation: AVCaptureVideoPreviewLayer and frame duration

I am using AV Foundation to process frames from the video camera (iPhone 4S, iOS 6.1.2). I am setting up an AVCaptureSession, AVCaptureDeviceInput, and AVCaptureVideoDataOutput per the AV Foundation programming guide. Everything works as expected and I am able to receive frames in the captureOutput:didOutputSampleBuffer:fromConnection: delegate.
I also have a preview layer set like this:
AVCaptureVideoPreviewLayer *videoPreviewLayer = [[AVCaptureVideoPreviewLayer alloc] initWithSession:_captureSession];
[videoPreviewLayer setFrame:self.view.bounds];
videoPreviewLayer.videoGravity = AVLayerVideoGravityResizeAspectFill;
[self.view.layer insertSublayer:videoPreviewLayer atIndex:0];
The thing is, I don't need 30 frames per second in my frame handling, and I am not able to process them that fast anyway. So I am using this code to limit the frame duration:
// videoOutput is AVCaptureVideoDataOutput set earlier
AVCaptureConnection *conn = [videoOutput connectionWithMediaType:AVMediaTypeVideo];
[conn setVideoMinFrameDuration:CMTimeMake(1, 10)];
[conn setVideoMaxFrameDuration:CMTimeMake(1, 2)];
This works fine and limits the frames received by the captureOutput delegate.
However, this also limits the frames per second on the preview layer and preview video becomes very unresponsive.
I understand from the documentation that the frame duration is set independently on the connection, and the preview layer indeed has a different AVCaptureConnection. Checking the min/max frame durations on [videoPreviewLayer connection] shows that they are indeed set to the defaults (1/30 and 1/24) and different from the durations set on the connection of the AVCaptureVideoDataOutput.
So, is it possible to limit the frame duration only on the frame capturing output and still see a 1/24-1/30 frame duration on the preview video? How?
Thanks.
While you're correct that there are two AVCaptureConnections, that doesn't mean they can have their minimum and maximum frame durations set independently. This is because they share the same physical hardware.
If connection #1 is activating the rolling shutter at a rate of (say) five frames/sec with a frame duration of 1/5 sec, there is no way that connection #2 can simultaneously activate the shutter 30 times/sec with a frame duration of 1/30 sec.
To get the effect you want would require two cameras!
The only way to get close to what you want is to follow an approach along the lines of that outlined by Kaelin Colclasure in the answer of 22 March.
You do have options of being a little more sophisticated within that approach, however. For example, you can use a counter to decide which frames to drop, rather than making the thread sleep. You can make that counter respond to the actual frame-rate that's coming through (which you can get from the metadata that comes in to the captureOutput:didOutputSampleBuffer:fromConnection: delegate along with the image data, or which you can calculate yourself by manually timing the frames). You can even do a very reasonable imitation of a longer exposure by compositing frames rather than dropping them—just as a number of "slow shutter" apps in the App Store do (leaving aside details—such as differing rolling shutter artefacts—there's not really that much difference between one frame scanned at 1/5 sec and five frames each scanned at 1/25 sec and then glued together).
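For example, the counter-based variant of the delegate might look like this (kProcessEveryNthFrame and _frameCounter are illustrative names):
- (void)captureOutput:(AVCaptureOutput *)captureOutput didOutputSampleBuffer:(CMSampleBufferRef)sampleBuffer fromConnection:(AVCaptureConnection *)connection
{
    // Return most frames immediately so the capture pipeline (and the preview) keeps its full rate,
    // and only process every Nth frame.
    if (_frameCounter++ % kProcessEveryNthFrame != 0) {
        return;
    }
    // ...process sampleBuffer here...
}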
Yes, it's a bit of work, but you are trying to make one video camera behave like two, in real time—and that's never going to be easy.
Think of it this way:
You ask the capture device to limit frame duration, so you get better exposure.
Fine.
You want to preview at higher frame rate.
If you were to preview at the higher rate, the capture device (the camera) would NOT have enough time to expose the frames, so you would lose the better exposure on the captured frames.
It is like asking to see different frames in the preview than the ones being captured.
I think that, if it were possible, it would also be a negative user experience.
I had the same issue for my Cocoa (Mac OS X) application. Here's how I solved it:
First, make sure to process the captured frames on a separate dispatch queue. Also make sure any frames you're not ready to process are discarded; this is the default, but I set the flag below anyway just to document that I'm depending on it.
videoQueue = dispatch_queue_create("com.ohmware.LabCam.videoQueue", DISPATCH_QUEUE_SERIAL);
videoOutput = [[AVCaptureVideoDataOutput alloc] init];
[videoOutput setAlwaysDiscardsLateVideoFrames:YES];
[videoOutput setSampleBufferDelegate:self queue:videoQueue];
[session addOutput:videoOutput];
Then when processing the frames in the delegate, you can simply have the thread sleep for the desired time interval. Frames that the delegate is not awake to handle are quietly discarded. I implement the optional method for counting dropped frames below just as a sanity check; my application never logs dropping any frames using this technique.
- (void)captureOutput:(AVCaptureOutput *)captureOutput
  didDropSampleBuffer:(CMSampleBufferRef)sampleBuffer
       fromConnection:(AVCaptureConnection *)connection
{
    OSAtomicAdd64(1, &videoSampleBufferDropCount);
}

- (void)captureOutput:(AVCaptureOutput *)captureOutput
didOutputSampleBuffer:(CMSampleBufferRef)sampleBuffer
       fromConnection:(AVCaptureConnection *)connection
{
    int64_t savedSampleBufferDropCount = videoSampleBufferDropCount;
    if (savedSampleBufferDropCount && OSAtomicCompareAndSwap64(savedSampleBufferDropCount, 0, &videoSampleBufferDropCount)) {
        NSLog(@"Dropped %lld video sample buffers!!!", savedSampleBufferDropCount);
    }
    // NSLog(@"%s", __func__);
    @autoreleasepool {
        CVImageBufferRef imageBuffer = CMSampleBufferGetImageBuffer(sampleBuffer);
        CIImage *cameraImage = [CIImage imageWithCVImageBuffer:imageBuffer];
        CIImage *faceImage = [self faceImage:cameraImage];
        dispatch_sync(dispatch_get_main_queue(), ^{
            [_imageView setCIImage:faceImage];
        });
    }
    [NSThread sleepForTimeInterval:0.5]; // Only want ~2 frames/sec.
}
Hope this helps.

Resources