WebRTC iOS: Filtering camera stream from RTCCameraVideoCapturer. Conversion from RTCFrame to CVPixelBuffer - ios

I found the gist below, which is simple and efficient: it uses func capturer(_ capturer: RTCVideoCapturer, didCapture frame: RTCVideoFrame) from RTCVideoCapturerDelegate. You get an RTCVideoFrame and then convert it to a CVPixelBuffer to modify.
https://gist.github.com/lyokato/d041f16b94c84753b5e877211874c6fc
However, I found that Chromium says nativeHandle, used to get the pixel buffer, is no longer available (link below). I tried frame.buffer.pixelBuffer..., but looking at framework > Headers > RTCVideoFrameBuffer.h, I found that CVPixelBuffer is gone from there as well!
https://codereview.webrtc.org/2990253002
Is there any good way to convert RTCVideoFrame to CVPixelBuffer?
Or do we have better way to modify captured video from RTCCameraVideoCapturer?
The link below suggests modifying the SDK directly, but hopefully we can achieve this from within Xcode.
How to modify (add filters to) the camera stream that WebRTC is sending to other peers/server

Can you specify what your expectation is? You can get the pixel buffer from an RTCVideoFrame easily, but I feel there may be a better solution: if you want to filter the video buffer before it is sent to WebRTC, you should work with RTCVideoSource.
You can get the buffer as follows:
RTCCVPixelBuffer *buffer = (RTCCVPixelBuffer *)frame.buffer;
CVPixelBufferRef imageBuffer = buffer.pixelBuffer;
(This works with the latest SDK, and for the local camera's video buffer only.)
In the sample, however, I can see that the filter will not work for the remote stream.
I have attached a screenshot showing how you can check the preview as well.
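As a rough sketch of the RTCVideoSource approach suggested above (this assumes Google's WebRTC Objective-C SDK, where RTCVideoSource itself conforms to RTCVideoCapturerDelegate; self.videoSource and the filtering step are placeholders, not code from the question):

```objc
// Act as the RTCCameraVideoCapturer's delegate, filter the local frame,
// then forward the rebuilt frame to the RTCVideoSource.
- (void)capturer:(RTCVideoCapturer *)capturer
    didCaptureVideoFrame:(RTCVideoFrame *)frame {
    RTCCVPixelBuffer *rtcBuffer = (RTCCVPixelBuffer *)frame.buffer;
    CVPixelBufferRef pixelBuffer = rtcBuffer.pixelBuffer;

    // ... apply your filter (Core Image, Metal, ...) to pixelBuffer here ...

    RTCCVPixelBuffer *filtered =
        [[RTCCVPixelBuffer alloc] initWithPixelBuffer:pixelBuffer];
    RTCVideoFrame *outFrame =
        [[RTCVideoFrame alloc] initWithBuffer:filtered
                                     rotation:frame.rotation
                                  timeStampNs:frame.timeStampNs];
    // RTCVideoSource conforms to RTCVideoCapturerDelegate, so it can
    // consume the filtered frame directly.
    [self.videoSource capturer:capturer didCaptureVideoFrame:outFrame];
}
```

Because only the local capture path is intercepted here, this matches the caveat above: remote frames are not affected.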

Related

React native: Real time camera data without image save and preview

I started working on my first non-demo React Native app. I hope it will become an iOS/Android app, but right now I'm focused on iOS only.
I have one problem at the moment: how can I get data (base64, an array of pixels, ...) in real time from the camera, without saving to the camera roll?
There is this module: https://github.com/lwansbrough/react-native-camera but its base64 output is deprecated and useless for me, because I want to render a processed image to the user (e.g. with the colors changed), not the raw camera picture as the react-native-camera module provides.
(I know how to communicate with Swift code, but I don't know what the options are in native code; I come from web development.)
Thanks a lot.
This may not be optimal, but it is what I have been using. If anyone can give a better solution, I would appreciate your help, too!
My basic idea is simply to loop taking still pictures (though not via a simple for-loop; see below) in YUV/RGB format at max resolution, which is reasonably fast (~x0ms with normal exposure duration), and to process them. Basically you set up an AVCaptureStillImageOutput linked to your camera (following tutorials found everywhere), then set the format to kCVPixelFormatType_420YpCbCr8BiPlanarFullRange (if you want YUV) or kCVPixelFormatType_32BGRA (if you prefer RGBA), like so:
bool usingYUVFormat = true;
NSDictionary *outputFormat = [NSDictionary dictionaryWithObject:
    [NSNumber numberWithInt:usingYUVFormat ? kCVPixelFormatType_420YpCbCr8BiPlanarFullRange
                                           : kCVPixelFormatType_32BGRA]
    forKey:(id)kCVPixelBufferPixelFormatTypeKey];
[yourAVCaptureStillImageOutput setOutputSettings:outputFormat];
When you are ready, you can start calling
AVCaptureConnection *captureConnection = [yourAVCaptureStillImageOutput connectionWithMediaType:AVMediaTypeVideo];
[yourAVCaptureStillImageOutput captureStillImageAsynchronouslyFromConnection:captureConnection completionHandler:^(CMSampleBufferRef imageDataSampleBuffer, NSError *error) {
    if (imageDataSampleBuffer) {
        CVImageBufferRef imageBuffer = CMSampleBufferGetImageBuffer(imageDataSampleBuffer);
        CVPixelBufferLockBaseAddress(imageBuffer, 0);
        // do your magic with the data in imageBuffer
        // use CVPixelBufferGetBaseAddressOfPlane(imageBuffer, 0/1/2) to get each plane
        // use CVPixelBufferGetWidth/CVPixelBufferGetHeight to get the dimensions
        CVPixelBufferUnlockBaseAddress(imageBuffer, 0); // balance the lock above
    }
}];
Additionally, use NSNotificationCenter to register your photo-taking action, and post a notification after you have processed each frame (perhaps with some delay, to cap your throughput and reduce power consumption) so the loop keeps going.
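A minimal sketch of that notification-driven loop (the notification name and the -takeNextPicture method here are made-up placeholders, not part of any API):

```objc
// Register once: each "frame processed" notification triggers the next capture.
[[NSNotificationCenter defaultCenter] addObserver:self
                                         selector:@selector(takeNextPicture)
                                             name:@"MyAppFrameProcessed"
                                           object:nil];

// ... then, at the end of your per-frame processing:
[[NSNotificationCenter defaultCenter]
    postNotificationName:@"MyAppFrameProcessed" object:nil];
// Post after a short delay instead if you want to cap throughput.
```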
A quick caution: the Android counterpart is a much worse headache. Few hardware manufacturers implement an API for max-resolution uncompressed photos, offering only 1080p for preview/video, as I raised in my question. I am still looking for solutions but have given up most hope. JPEG images are just too slow.

How to decode a H.264 frame on iOS by hardware decoding?

I have been using ffmpeg to decode every single frame that I receive from my IP cam. The brief code looks like this:
-(void)decodeFrame:(unsigned char *)frameData frameSize:(int)frameSize {
    AVFrame frame;
    AVPicture picture;
    AVPacket pkt;
    AVCodecContext *context; // assumed to be opened/configured elsewhere
    int got_picture = 0;
    av_init_packet(&pkt);
    pkt.data = frameData;
    pkt.size = frameSize; // was "pat.size", a typo
    avcodec_get_frame_defaults(&frame);
    avpicture_alloc(&picture, PIX_FMT_RGB24, targetWidth, targetHeight);
    avcodec_decode_video2(context, &frame, &got_picture, &pkt); // pass the pointer, not its address
}
The code works fine, but it's software decoding. I want to improve decoding performance with hardware decoding. After lots of research, I learned this may be achievable via the AVFoundation framework.
The AVAssetReader class may help, but I can't figure out what the next step is. Could anyone point out the following steps for me? Any help would be appreciated.
iOS does not provide any public access directly to the hardware decode engine, because hardware is always used to decode H.264 video on iOS.
Therefore, session 513 gives you all the information you need to allow frame-by-frame decoding on iOS. In short, per that session:
Generate individual network abstraction layer units (NALUs) from your H.264 elementary stream. There is much information on how this is done online. VCL NALUs (IDR and non-IDR) contain your video data and are to be fed into the decoder.
Re-package those NALUs according to the "AVCC" format, removing NALU start codes and replacing them with a 4-byte NALU length header.
Create a CMVideoFormatDescriptionRef from your SPS and PPS NALUs via CMVideoFormatDescriptionCreateFromH264ParameterSets()
Package NALU frames as CMSampleBuffers per session 513.
Create a VTDecompressionSessionRef, and feed VTDecompressionSessionDecodeFrame() with the sample buffers
Alternatively, use AVSampleBufferDisplayLayer, whose -enqueueSampleBuffer: method obviates the need to create your own decoder.
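The steps above can be sketched with Video Toolbox roughly as follows (an untested sketch: error handling is omitted, and spsBytes/ppsBytes/spsSize/ppsSize/avccSampleBuffer are assumed to come from your own NALU parsing and CMSampleBuffer packaging):

```objc
#import <VideoToolbox/VideoToolbox.h>

// Called by the session with each decoded frame.
static void didDecompress(void *decompressionOutputRefCon, void *sourceFrameRefCon,
                          OSStatus status, VTDecodeInfoFlags infoFlags,
                          CVImageBufferRef imageBuffer, CMTime presentationTimeStamp,
                          CMTime presentationDuration) {
    // imageBuffer holds the decoded frame (typically a CVPixelBuffer)
}

// Step 3: build a format description from the SPS/PPS parameter sets.
CMVideoFormatDescriptionRef formatDesc = NULL;
const uint8_t *paramSets[] = { spsBytes, ppsBytes };
const size_t paramSizes[] = { spsSize, ppsSize };
CMVideoFormatDescriptionCreateFromH264ParameterSets(kCFAllocatorDefault,
    2, paramSets, paramSizes, 4 /* AVCC length-header size */, &formatDesc);

// Step 5: create the decompression session with the output callback.
VTDecompressionOutputCallbackRecord callback = { didDecompress, NULL };
VTDecompressionSessionRef session = NULL;
VTDecompressionSessionCreate(kCFAllocatorDefault, formatDesc,
    NULL, NULL, &callback, &session);

// For each AVCC-packaged access unit wrapped in a CMSampleBuffer (step 4):
VTDecompressionSessionDecodeFrame(session, avccSampleBuffer,
    kVTDecodeFrame_EnableAsynchronousDecompression, NULL, NULL);
```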
Edit:
This link provides a more detailed explanation of how to decode H.264 step by step: stackoverflow.com/a/29525001/3156169
Original answer:
I watched session 513, "Direct Access to Video Encoding and Decoding", from WWDC 2014 yesterday, and got the answer to my own question.
The speaker says:
We have Video Toolbox (in iOS 8). Video Toolbox has been there on OS X for a while, but now it's finally populated with headers on iOS. This provides direct access to encoders and decoders.
So, there is no way to do hardware decoding frame by frame in iOS 7, but it can be done in iOS 8.
Has anyone figured out how to get direct access to video encoding and decoding, frame by frame, in iOS 8?

Efficient use of Core Image with AV Foundation

I'm writing an iOS app that applies filters to existing video files and outputs the results to new ones. Initially, I tried using Brad Larson's nice framework, GPUImage. Although I was able to output filtered video files without much effort, the output wasn't perfect: the videos were the proper length, but some frames were missing, and others were duplicated (see Issue 1501 for more info). I plan to learn more about OpenGL ES so that I can better investigate the dropped/skipped frames issue. However, in the meantime, I'm exploring other options for rendering my video files.
I'm already familiar with Core Image, so I decided to leverage it in an alternative video-filtering solution. Within a block passed to AVAssetWriterInput requestMediaDataWhenReadyOnQueue:usingBlock:, I filter and output each frame of the input video file like so:
CMSampleBufferRef sampleBuffer = [self.assetReaderVideoOutput copyNextSampleBuffer];
if (sampleBuffer != NULL)
{
CMTime presentationTimeStamp = CMSampleBufferGetOutputPresentationTimeStamp(sampleBuffer);
CVPixelBufferRef inputPixelBuffer = CMSampleBufferGetImageBuffer(sampleBuffer);
CIImage* frame = [CIImage imageWithCVPixelBuffer:inputPixelBuffer];
// a CIFilter created outside the "isReadyForMoreMediaData" loop
[screenBlend setValue:frame forKey:kCIInputImageKey];
CVPixelBufferRef outputPixelBuffer = NULL;
CVReturn result = CVPixelBufferPoolCreatePixelBuffer(NULL, assetWriterInputPixelBufferAdaptor.pixelBufferPool, &outputPixelBuffer);
// verify that everything's gonna be ok
NSAssert(result == kCVReturnSuccess, @"CVPixelBufferPoolCreatePixelBuffer failed with error code %d", result);
NSAssert(CVPixelBufferGetPixelFormatType(outputPixelBuffer) == kCVPixelFormatType_32BGRA, @"Wrong pixel format");
[self.coreImageContext render:screenBlend.outputImage toCVPixelBuffer:outputPixelBuffer];
BOOL success = [assetWriterInputPixelBufferAdaptor appendPixelBuffer:outputPixelBuffer withPresentationTime:presentationTimeStamp];
CVPixelBufferRelease(outputPixelBuffer);
CFRelease(sampleBuffer);
sampleBuffer = NULL;
completedOrFailed = !success;
}
This works well: the rendering seems reasonably fast, and the resulting video file doesn't have any missing or duplicated frames. However, I'm not confident that my code is as efficient as it could be. Specifically, my questions are:
Does this approach allow the device to keep all frame data on the GPU, or are there any methods (e.g. imageWithCVPixelBuffer: or render:toCVPixelBuffer:) that prematurely copy pixels to the CPU?
Would it be more efficient to use CIContext's drawImage:inRect:fromRect: to draw to an OpenGLES context?
If the answer to #2 is yes, what's the proper way to pipe the results of drawImage:inRect:fromRect: into a CVPixelBufferRef so that it can be appended to the output video file?
I've searched for an example of how to use CIContext drawImage:inRect:fromRect: to render filtered video frames, but haven't found any. Notably, the source for GPUImageMovieWriter does something similar, but since a) I don't really understand it yet, and b) it's not working quite right for this use case, I'm wary of copying its solution.

How to use AVAssetWriter instead of AVAssetExportSession to re-encode existing video

I'm trying to re-encode videos on an iPad which were recorded on that device but with the "wrong" orientation. This is because when the file is converted to an MP4 file and uploaded to a web server for use with the "video" HTML5 tag, only Safari seems to render the video with the correct orientation.
Basically, I've managed to implement what I wanted by using an AVMutableVideoCompositionLayerInstruction and then using AVAssetExportSession to create the resultant video with audio. However, the problem is that the file sizes jump up considerably after doing this, e.g. correcting an original file of 4.1MB results in a final file size of 18.5MB! All I've done is rotate the video through 180 degrees!! Incidentally, the video instance that I'm trying to process was originally created by the UIImagePicker during "compression" using videoQuality = UIImagePickerControllerQualityType640x480, which actually results in videos of 568 x 320 on an iPad mini.
I experimented with the various presetName settings on AVAssetExportSession but I couldn't get the desired result. The closest I got filesize-wise was 4.1MB (ie exactly the same as source!) by using AVAssetExportPresetMediumQuality, BUT this also reduced the dimensions of the resultant video to 480 x 272 instead of the 568 x 320 that I had explicitly set.
So, this led me to look into other options, and hence to using AVAssetWriter instead. The problem is, I can't get any code that I have found to work! I tried the code from this SO post (Video Encoding using AVAssetWriter - CRASHES), but can't get it to work. For a start, I get a compilation error for this line:
NSDictionary *videoOptions = [NSDictionary dictionaryWithObject:[NSNumber numberWithInt:kCVPixelFormatType_420YpCbCr8BiPlanarVideoRange] forKey:(id)kCVPixelBufferPixelFormatTypeKey];
The resultant compilation error being:
Undefined symbols for architecture armv7: "_kCVPixelBufferPixelFormatTypeKey"
That aside, I tried passing nil for the AVAssetReaderTrackOutput's outputSettings, which should be OK according to the header info:
A value of nil for outputSettings configures the output to vend samples in their original format as stored by the specified track.
However, I then get a crash happening at this line:
BOOL result = [videoWriterInput appendSampleBuffer:sampleBuffer];
In short, I've not been able to get any code to work with AVAssetWriter, so I REALLY need some help here. Are there any other ways to achieve my desired results? Incidentally, I'm using Xcode 4.6 and I'm targeting everything from iOS5 upwards, using ARC.
I have solved similar problems related to your questions. This might help someone who has similar problems:
Assuming writerInput is your object instance of AVAssetWriterInput and assetTrack is the instance of your AVAssetTrack, then your transform problem is solved by simply:
writerInput.transform = assetTrack.preferredTransform;
You have to release sampleBuffer after appending your sample buffer, so you would have something like:
if (sampleBuffer = [asset_reader_output copyNextSampleBuffer]) {
BOOL result = [writerInput appendSampleBuffer:sampleBuffer];
CFRelease(sampleBuffer); // Release sampleBuffer!
}
The compilation error was caused by me not including CoreVideo.framework. As soon as I included and imported it, the code compiled. The code also runs and generates a resultant video, but I uncovered two new problems:
I can't get the transform to work using the transform property on AVAssetWriterInput. This means that I'm stuck with using a AVMutableVideoCompositionInstruction and AVAssetExportSession for the transformation.
If I use AVAssetWriter just to handle compression (seeing as I don't have many options with AVAssetExportSession), I still have a bad memory leak. I've tried everything I can think of, starting with the solution in this link ( Help Fix Memory Leak release ) and also with @autoreleasepool blocks at key points. But it seems that the following line causes a leak, no matter what I try:
CMSampleBufferRef sampleBuffer = [asset_reader_output copyNextSampleBuffer];
I could really do with some help.

get yuv planar format image from camera - iOS

I am using AVFoundation to capture still images from the camera (still images, not video frames) using captureStillImageAsynchronouslyFromConnection:. This gives me a buffer of type CMSampleBuffer, which I am calling imageDataSampleBuffer.
As far as I have understood, this buffer can contain any type of data related to media, and the type of data is determined when I am configuring the output settings.
For the output settings, I make a dictionary with the value AVVideoCodecJPEG for the key AVVideoCodecKey.
There is no other codec option. But when I read the AVFoundation Programming Guide > Media Capture, I can see that 420f, 420v, BGRA, and JPEG are the available encoded formats supported on the iPhone 3GS (which I am using).
I want to get the yuv420 (i.e. 420v) formatted image data into the imageSampleBuffer. Is that possible?
if I print the availableImageDataCodecTypes, I get only JPEG
if I print availableImageDataCVPixelFormatTypes, I get three numbers 875704422, 875704438, 1111970369.
Is it possible that these three numbers map to 420f, 420v, BGRA?
If yes, which key should I modify in my output settings?
I tried putting the value: [NSNumber numberWithInt:875704438] for key: (id)kCVPixelBufferPixelFormatTypeKey.
Would it work?
If yes, how do I extract this data from the imageSampleBuffer?
Also, in which format is a UIImage stored? Can it be any format? Is it just NSData with some extra info that makes it interpreted as an image?
I have been trying to use the method from:
Raw image data from camera like "645 PRO"
I am saving the data using writeToFile: and have been trying to open it with IrfanView.
But I am unable to verify whether or not the saved file is in YUV format, because IrfanView gives an error that it is unable to read the headers. (A raw YUV dump has no header at all, so a viewer must be told the dimensions and pixel format explicitly.)
