Android MediaCodec: how to request a key frame when encoding

In Android 4.1, a key frame often needs to be requested in a real-time encoding application. But how can this be done using a MediaCodec object? The current Android 4.2 SDK does not seem to support it.

You can produce a random keyframe by specifying MediaCodec.BUFFER_FLAG_SYNC_FRAME when queuing input buffers:
MediaCodec codec = MediaCodec.createEncoderByType(type);
codec.configure(format, ...);
codec.start();
ByteBuffer[] inputBuffers = codec.getInputBuffers();
for (;;) {
    int inputBufferIndex = codec.dequeueInputBuffer(timeoutUs);
    if (inputBufferIndex >= 0) {
        // fill inputBuffers[inputBufferIndex] with valid data
        ...
        codec.queueInputBuffer(inputBufferIndex, 0, inputBuffers[inputBufferIndex].limit(), presentationTime,
                isKeyFrame ? MediaCodec.BUFFER_FLAG_SYNC_FRAME : 0);
    }
}
I stumbled upon the need to insert a random keyframe when encoding video on a Galaxy Nexus.
On that device, MediaCodec didn't automatically produce a keyframe at the start of the video.

MediaCodec has a method called setParameters which comes to the rescue.
In Kotlin you can do it like:
fun yieldKeyFrame(): Boolean {
    val param = Bundle()
    param.putInt(MediaCodec.PARAMETER_KEY_REQUEST_SYNC_FRAME, 0)
    try {
        videoEncoder.setParameters(param)
        return true
    } catch (e: IllegalStateException) {
        return false
    }
}
In the above snippet, videoEncoder is an instance of MediaCodec configured as an encoder.

You can request a periodic key frame by setting the KEY_I_FRAME_INTERVAL key when configuring the encoder. In the example below I am requesting one every two seconds. I've omitted the other keys like frame rate or color format for the sake of clarity, but you will still want to include them.
encoder = MediaCodec.createByCodecName(codecInfo.getName());
MediaFormat inputFormat = MediaFormat.createVideoFormat(mimeType, width, height);
/* ..... set various format options here ..... */
inputFormat.setInteger(MediaFormat.KEY_I_FRAME_INTERVAL, 2);
encoder.configure(inputFormat, null, null, MediaCodec.CONFIGURE_FLAG_ENCODE);
encoder.start();
I suspect, however, that what you are really asking is how to request a random key frame while encoding, like at the start of a cut scene. Unfortunately I haven't seen an interface for that. It is possible that stopping and restarting the encoder would have the effect of creating a new key frame at the restart. When I have the opportunity to try that, I'll post the result here.
I hope this was helpful.
Thad Phetteplace - GLACI, Inc.

Related

Getting video Rotation metadata value using JavaCV

It seems there are lots of related topics out there, but I was not able to find the answer I'm searching for.
I'm using JavaCV and FFmpegFrameGrabber to get an image from the middle of the video. If an mp4 file has a "Rotation" metadata field (like 90 or 270), I get an image that is not oriented correctly. I wanted to get the orientation from FFmpegFrameGrabber, but could not find a way to do so.
Is there a way to tell FFmpegFrameGrabber to respect orientation, or is there a way to get this value somehow using JavaCV?
Just in case, here is the code that I have so far:
FFmpegFrameGrabber g = new FFmpegFrameGrabber(input);
g.start();
g.getVideoMetadata(); // <-- this thing is empty
try {
    g.setFrameNumber(g.getLengthInFrames() / 2);
    Java2DFrameConverter converter = new Java2DFrameConverter();
    Frame frame = g.grabImage();
    BufferedImage bufferedImage = converter.convert(frame);
    ImageIO.write(bufferedImage, "jpeg", output);
} finally {
    g.stop();
}

A method to resize frame after decoding on client side

I've just joined a project to build a real-time video streaming application using ffmpeg/opencv/c++ over a UDP socket. The server transmits video at 640x480 to the client; to reduce the amount of data sent over the network, I resize the video to 320x240 and send those frames. On the client side, after receiving a frame, we upscale it back to 640x480. We use H.265 for encoding/decoding.
As I am just a beginner with video encoding, I would like to understand how to downsample and upsample the frames on the server and client side in a way that works with the video encoder/decoder.
A simple idea that came to mind is that after decoding (AVFrame -> Mat), I upsample the frame and then display it.
I am not sure whether my idea is right or wrong. I would appreciate advice from anyone with experience in this area. Thank you very much!
static void updateFrameCallback(AVFrame *avframe, void* userdata) {
    VideoStreamUDPClient* streamer = static_cast<VideoStreamUDPClient*> (userdata);
    TinyClient* client = static_cast<TinyClient*> (streamer->userdata);
    // Update Frame
    pthread_mutex_lock(&client->mtx_updateFrame);
    if (streamer->irect.width == client->frameSize.width
            && streamer->irect.height == client->frameSize.height) {
        cvtAVFrameYUV4202Frame(&avframe, client->frame);
        printf("TinyClient: Received Full Frame\n");
    } else {
        Mat block;
        cvtAVFrameYUV4202Frame(&avframe, block);
        block.copyTo(client->frame(streamer->irect));
    }
    // How to resize frame before display it!!!
    imshow("Frame", client->frame);
    waitKey(1);
    pthread_mutex_unlock(&client->mtx_updateFrame);
}
From what I understand, you just want to resize the frame after decoding. In OpenCV you can do it like this:
if (...)
{
    ...
}
else
{
    Mat block;
    cvtAVFrameYUV4202Frame(&avframe, block);
    Mat temp;
    resize(block, temp, Size(640, 480));
    temp.copyTo(client->frame(streamer->irect));
}
// client->frame will always be 640x480
imshow("Frame", client->frame);
I don't have much experience in video decoding, but from what I know you cannot resize a video (or a single frame) without decoding it first.
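For the server-side half of the pipeline, here is a minimal OpenCV sketch of downscaling before encoding; the helper name and the assumption that the captured frame is a BGR cv::Mat are mine, not from the original project:
#include <opencv2/imgproc/imgproc.hpp>

// Hypothetical helper: shrink a captured 640x480 frame to 320x240 before it is
// converted to YUV420/AVFrame and handed to the H.265 encoder.
cv::Mat prepareForEncoding(const cv::Mat& captured)
{
    cv::Mat small;
    cv::resize(captured, small, cv::Size(320, 240), 0, 0, cv::INTER_AREA); // INTER_AREA is well suited to downscaling
    return small;
}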

Received msg_Image getting distorted while displaying in OpenCV

I have published an image from one node, and I want to subscribe to that image in my second node. But after subscribing in the second node, when I try to store it in a cv::Mat image, it gets distorted.
The patchImage in the following code is distorted: there are some horizontal lines, and four copies of the same image appear merged together.
An overview of my code follows.
first_node_publisher
{
    im.header.stamp = time;
    im.width = width;
    im.height = height;
    im.step = 3*width;
    im.encoding = "rgb8";
    image_pub.publish(im);
}
second_node_imageCallBack(const sensor_msgs::ImageConstPtr& msg)
{
    cv::Mat patchImage;
    cv_bridge::CvImagePtr cv_ptr;
    try
    {
        cv_ptr = cv_bridge::toCvCopy(msg, sensor_msgs::image_encodings::RGB8);
    }
    catch (cv_bridge::Exception& e)
    {
        ROS_ERROR("cv_bridge exception: %s", e.what());
    }
    patchImage = cv_ptr->image;
    imshow("Received Image", patchImage); // This patchImage is distorted
}
I believe the problem is with your encoding setting. Are you sure the encoding is actually rgb8? That is unlikely, because OpenCV stores images by default in the BGR format (such as CV_8UC3). It is also possible that your images are not even stored as unsigned characters, but as shorts, floats, doubles, etc.
I always include assert(image.type() == CV_8UC3) in my publishers to make sure the encoding is correct.
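As a rough sketch (the function and variable names below are hypothetical, not from the original nodes), the check could sit in the publisher like this:
#include <cassert>
#include <ros/ros.h>
#include <sensor_msgs/Image.h>
#include <opencv2/core/core.hpp>

void publishPatch(ros::Publisher& image_pub, const cv::Mat& image)
{
    assert(image.type() == CV_8UC3);   // "rgb8" implies 8-bit data with 3 channels
    assert(image.isContinuous());      // otherwise row padding breaks step = 3 * width

    sensor_msgs::Image im;
    im.header.stamp = ros::Time::now();
    im.width = image.cols;
    im.height = image.rows;
    im.step = 3 * image.cols;
    im.encoding = "rgb8";              // must match the actual channel order (OpenCV's default is BGR)
    im.data.assign(image.data, image.data + im.step * im.height);
    image_pub.publish(im);
}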

How to know the size of one AVPacket in ffmpeg?

I am using the FFmpeg library. I want to know how much memory one packet can take.
I debugged and checked the members of one AVPacket, and none of them seem reasonable, such as AVPacket.size, etc.
If you provide your own data buffer, it needs to have a size of at least FF_MIN_BUFFER_SIZE. You would then set AVPacket.size to the allocated size, and AVPacket.data to the memory you've allocated.
Note that all FFmpeg decoding routines will simply fail if you provide your own buffer and it's too small.
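As a rough sketch (using the older FFmpeg API that matches avcodec_encode_audio2 below; newer releases name the constant AV_INPUT_BUFFER_MIN_SIZE), providing your own buffer would look something like:
AVPacket pkt;
av_init_packet(&pkt);
pkt.data = (uint8_t *) av_malloc(FF_MIN_BUFFER_SIZE); // caller-owned buffer
pkt.size = FF_MIN_BUFFER_SIZE;                        // tell the codec how big it is
// ... pass &pkt to the encode/decode call ...
av_free(pkt.data);                                    // the caller frees its own buffer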
The other possibility is to let FFmpeg calculate the optimal size for you.
Then do something like:
AVPacket pkt;
av_init_packet(&pkt); // initialize the remaining fields to their defaults
pkt.size = 0;
pkt.data = NULL; // <-- the critical part is there
int got_output = 0;
ret = avcodec_encode_audio2(ctx, &pkt, NULL, &got_output);
and provide this AVPacket to the encoding codec. Memory will be allocated automatically.
You will have to call av_free_packet upon return from the encoder if got_output is set to 1.
FFmpeg will automatically free the AVPacket content in case of error.
AVPacket::size holds the size of the referenced data. Because it is a generic container for data, there can be no definite answer to the question
how much memory one packet can take
It can actually take from zero to a lot. Everything depends on data type, codec and other related parameters.
From FFmpeg examples:
static void audio_encode_example(const char *filename)
{
    // ...
    AVPacket pkt;
    // ...
    ret = avcodec_encode_audio2(c, &pkt, NULL, &got_output);
    // ...
    if (got_output) {
        fwrite(pkt.data, 1, pkt.size, f); // <<--- AVPacket.size
        av_free_packet(&pkt);
    }
    // ...
}

opencv cvGrabFrame frame rate on iOS?

I'm using OpenCV to split a video into frames. For that I need the fps and duration. Both of these values return 1 when queried via cvGetCaptureProperty.
I've made a hack where I use AVURLAsset to get the fps and duration, but when I combine that with OpenCV I get only a partial video. It seems like it's missing frames.
This is my code right now:
while (cvGrabFrame(capture)) {
    frameCounter++;
    if (frameCounter % (int)(videoFPS / MyDesiredFramesPerSecond) == 0) {
        IplImage *frame = cvCloneImage(cvRetrieveFrame(capture));
        // Do Stuff
    }
    if (frameCounter > duration*fps)
        break; // this is here because the loop never stops on its own
}
How can I get all the frames of a video using openCV on iOS? (opencv 2.3.2)
According to the documentation, you should check the value returned by cvRetrieveFrame(); if a null pointer is returned, you're at the end of the video sequence. Break out of the loop when that happens, instead of relying on the accuracy of FPS * frame_number.
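A minimal sketch of the loop restructured that way (keeping the variable names from the question, untested on iOS) might look like:
while (cvGrabFrame(capture)) {
    IplImage *retrieved = cvRetrieveFrame(capture);
    if (retrieved == NULL)
        break; // end of the video sequence
    frameCounter++;
    if (frameCounter % (int)(videoFPS / MyDesiredFramesPerSecond) == 0) {
        IplImage *frame = cvCloneImage(retrieved);
        // Do Stuff
        cvReleaseImage(&frame);
    }
}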
