I am having trouble getting a UIImage out of the frames I am reading into my iOS FFmpeg project. I need to be able to read a frame in, and then convert this to a UIImage in order to display the frame in a UIImageView. My code appears to be reading in the frames, but I am lost as to how to convert them as there is little documentation on how to do this. Can anyone help?
while (!finished) {
    if (av_read_frame(_formatContext, &packet) >= 0) {
        if (packet.stream_index == _videoStream) {
            int ret = avcodec_send_packet(_codecContext, &packet);
            if (ret < 0 || ret == AVERROR(EAGAIN) || ret == AVERROR_EOF) {
                printf("av_codec_send_packet error ");
            }
            while (ret >= 0) {
                ret = avcodec_receive_frame(_codecContext, _frame);
                if (ret == AVERROR(EAGAIN) || ret == AVERROR_EOF) {
                    printf("avcodec_receive_frame error ");
                }
                finished = true;
            }
        }
        av_packet_unref(&packet);
    }
}
You should know about pixel formats such as RGB and YUV. Video almost always uses a YUV format such as yuv420p. Then study the AVFrame structure. Here is some info:
AVFrame.format : Pixel format of the current frame, e.g. AV_PIX_FMT_YUV420P
AVFrame.width : Width of the current frame, in pixels
AVFrame.height : Height of the current frame, in pixels
Now, where is the actual frame buffer, you might ask? It is in AVFrame.data[n].
n can be 0-3. Depending on the format, the first entry alone may hold the whole frame, or all four may be used; yuv420p, for example, uses 0, 1, and 2. Their line sizes (aka strides) can be read from the corresponding AVFrame.linesize[n] value.
As for yuv420p:
data[0] is Y plane
data[1] is U plane
data[2] is V plane
If you multiply linesize[0] with AVFrame.height, you'll get size of that plane (Y) as number of bytes.
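As a rough illustration of the size arithmetic above, the sketch below computes per-plane byte counts for a yuv420p frame. `FrameLayout` and `plane_bytes` are hypothetical stand-ins, not FFmpeg types. The sketch assumes only that the chroma planes have half the luma height (rounded up) and that a plane's size is its linesize times its row count.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the few AVFrame fields used here. */
typedef struct {
    int height;        /* luma height in pixels */
    int linesize[3];   /* bytes per row for Y, U, V (may include padding) */
} FrameLayout;

/* Bytes occupied by plane n (0 = Y, 1 = U, 2 = V) of a yuv420p frame. */
size_t plane_bytes(const FrameLayout *f, int n)
{
    assert(n >= 0 && n < 3);
    /* Chroma is subsampled vertically by 2; round up for odd heights. */
    int rows = (n == 0) ? f->height : (f->height + 1) / 2;
    return (size_t)f->linesize[n] * (size_t)rows;
}
```

Note that the U and V planes are smaller than the Y plane, so multiplying their linesize by the full frame height would overstate their size.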
I don't know the UIImage structure (or whatever it is). If it requires a specific format such as RGB, you need to convert your AVFrame to that format using swscale.
Here are some examples: https://github.com/FFmpeg/FFmpeg/blob/master/doc/examples/scaling_video.c
In libav (FFmpeg), scaling (resizing) and pixel format conversion are done by the same function.
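To make the conversion step concrete without pulling in libswscale, here is a hand-written sketch of what a yuv420p to RGB24 conversion does per pixel. It is not sws_scale and not how FFmpeg implements it. It assumes 8-bit limited-range BT.601 video and uses a common integer approximation, and the function name and parameters are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

static uint8_t clamp_u8(int v)
{
    return (uint8_t)(v < 0 ? 0 : (v > 255 ? 255 : v));
}

/* Convert a yuv420p image to packed RGB24 (R, G, B byte order).
 * Assumes limited-range BT.601; each linesize is in bytes per row. */
void yuv420p_to_rgb24(const uint8_t *y_plane, int y_linesize,
                      const uint8_t *u_plane, int u_linesize,
                      const uint8_t *v_plane, int v_linesize,
                      int width, int height,
                      uint8_t *rgb, int rgb_linesize)
{
    for (int row = 0; row < height; row++) {
        for (int col = 0; col < width; col++) {
            /* One U and one V sample cover a 2x2 block of luma pixels. */
            int c = y_plane[row * y_linesize + col] - 16;
            int d = u_plane[(row / 2) * u_linesize + col / 2] - 128;
            int e = v_plane[(row / 2) * v_linesize + col / 2] - 128;

            uint8_t *px = rgb + row * rgb_linesize + col * 3;
            px[0] = clamp_u8((298 * c + 409 * e + 128) / 256);           /* R */
            px[1] = clamp_u8((298 * c - 100 * d - 208 * e + 128) / 256); /* G */
            px[2] = clamp_u8((298 * c + 516 * d + 128) / 256);           /* B */
        }
    }
}
```

In a real project sws_scale would do this work and would also handle other ranges, color matrices, and target formats. The sketch only shows why the linesize of each plane matters when addressing pixels.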
Hope this helps.
I'm trying to average every 30 frames of a video to create a blurred timelapse. I got the video reading and video writing working, but something is wrong, because I'm only seeing the blue channel! (or one channel that is being written to blue).
Any ideas? Or better ways to do this? I'm new to OpenCV. The code is in Kotlin, but I think it should be the same issue if this was Java or python or whatever.
val videoCapture = VideoCapture(parsedArgs.inputFile)
val frameSize = Size(
    videoCapture.get(Videoio.CV_CAP_PROP_FRAME_WIDTH),
    videoCapture.get(Videoio.CV_CAP_PROP_FRAME_HEIGHT))
val fps = videoCapture.get(Videoio.CAP_PROP_FPS)
val videoWriter = VideoWriter(parsedArgs.outputFile, VideoWriter.fourcc('M', 'J', 'P', 'G'), fps, frameSize)
val image = Mat(frameSize, CV_8UC3)
val blended = Mat(frameSize, CV_64FC3)
println("Size: $frameSize fps:$fps over $frameCount frames")
try {
    while (videoCapture.read(image)) {
        val frameNumber = videoCapture.get(Videoio.CAP_PROP_POS_FRAMES).toInt()
        Core.flip(image, image, -1) // I shot the video upside down
        Imgproc.accumulate(image, blended)
        if (frameNumber > 0 && frameNumber % parsedArgs.windowSize == 0) {
            Core.multiply(blended, Scalar(1.0/parsedArgs.windowSize), blended)
            blended.convertTo(image, CV_8UC3);
            videoWriter.write(image)
            blended.setTo(Scalar(0.0, 0.0, 0.0))
            println(frameNumber.toDouble()/frameCount)
        }
    }
} finally {
    videoCapture.release()
    videoWriter.release()
}
Martin Beckett led me to the right answer (thank you!). I was multiplying by a Scalar(double), which should have been my hint, because I wasn't multiplying by a plain double.
Core.multiply expected a Scalar with a value for each channel, so it was happily multiplying my first channel by the double and the rest by 0.
Imgproc.accumulate(image, blended64)
if (frameNumber > 0 && frameNumber % parsedArgs.windowSize == 0) {
    val blendDivisor = 1.0 / parsedArgs.windowSize
    // One factor per channel, so every channel is scaled.
    Core.multiply(blended64, Scalar(blendDivisor, blendDivisor, blendDivisor), blended64)
    // ... convert, write, and reset as before
}
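The effect described in this answer can be reproduced in plain C. The sketch assumes, as OpenCV's Scalar does, that a four-component scalar built from a single value leaves the remaining components at zero. `Scalar4`, `scalar1`, `scalar3`, and `scale_pixel` are hypothetical names used only for this illustration.

```c
#include <assert.h>

/* Hypothetical four-component scalar, mimicking how OpenCV's Scalar behaves. */
typedef struct { double v[4]; } Scalar4;

/* Build a scalar from one value: the other components stay 0. */
Scalar4 scalar1(double a)
{
    Scalar4 s = { { a, 0.0, 0.0, 0.0 } };
    return s;
}

/* Build a scalar with the same value in the first three components. */
Scalar4 scalar3(double a)
{
    Scalar4 s = { { a, a, a, 0.0 } };
    return s;
}

/* Multiply one BGR pixel channel-by-channel by the scalar, in place. */
void scale_pixel(double bgr[3], Scalar4 s)
{
    for (int ch = 0; ch < 3; ch++)
        bgr[ch] *= s.v[ch];
}
```

With `scalar1`, only the first channel (blue, in BGR order) survives the multiplication, which matches the symptom in the question. With `scalar3`, all three channels are scaled.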
My guess would be that the types differ in Imgproc.accumulate(image, blended). Try converting image to match blended before combining them.
If it were writing the entire 8-bit x 3 pixel data into one float, that would explain why only blue shows up: the first channel in an OpenCV image is blue (it uses BGR order).
Hi, I'm trying to write some camera calibration code and I'm having a hard time using MatVector in JavaCV, which should be the equivalent of std::vector in C++.
This is how I generate my image and object points:
Mat objectPoints = new Mat(allImagePoints.rows(),1,opencv_core.CV_32FC3);
float x = 0;
float y = 0;
for (int h=0;h<patternHeight;h++) {
    y = h*rectangleSize;
    for (int w=0;w<patternWidth;w++) {
        x = w*rectangleSize;
        objectPoints.getFloatBuffer().put(3*(patternWidth*h+w), x);
        objectPoints.getFloatBuffer().put(3*(patternWidth*h+w)+1, y);
        objectPoints.getFloatBuffer().put(3*(patternWidth*h+w)+2, 0);
    }
}
MatVector allObjectPointsVec = new MatVector(allImagePoints.cols());
MatVector allImagePointsVec = new MatVector(allImagePoints.cols());
for (int i=0;i<allImagePoints.cols();i++) {
    allObjectPointsVec.put(i,objectPoints);
    allImagePointsVec.put(i,allImagePoints.col(i));
}
My image points are given in the Mat allImagePoints, and as you can see I create the corresponding vectors allObjectPointsVec and allImagePointsVec from it. When I try to do a camera calibration with these points I get the following error:
OpenCV Error: Assertion failed (ni > 0 && ni == ni1) in cv::collectCalibrationData, file ..\..\..\..\opencv\modules\calib3d\src\calibration.cpp, line 3193
java.lang.reflect.InvocationTargetException
...
which suggests that the lengths of the image and object points don't match, but I'm pretty sure that I got this right. Printing the MatVector objects gives
org.bytedeco.javacpp.opencv_core$MatVector[address=0x2237b8a0,position=0,limit=1,capacity=1,deallocator=org.bytedeco.javacpp.Pointer$NativeDeallocator#4d353a7a]
org.bytedeco.javacpp.opencv_core$MatVector[address=0x2237acd0,position=0,limit=1,capacity=1,deallocator=org.bytedeco.javacpp.Pointer$NativeDeallocator#772f4d0]
which also confuses me, as I would have expected the capacity to correspond to the length (the number of matrices in the vector). If I print the size field I get the expected value. If I access a random element in the vector (e.g. allObjectPointsVec.get(i)) and print it as a string, I receive the following:
AbstractArray[width=1,height=77,depth=32,channels=3] (for object points)
AbstractArray[width=1,height=77,depth=32,channels=2] (for image points)
which is what I would expect... Any ideas? To me this looks like a bug, also because I don't understand what the capacity represents if not the vector length...
I have an FFMPEG AVFrame in YUVJ420P and I want to convert it to a CVPixelBufferRef with CVPixelBufferCreateWithBytes. The reason I want to do this is to use AVFoundation to show/encode the frames.
I selected kCVPixelFormatType_420YpCbCr8BiPlanarVideoRange and tried converting to it, since the AVFrame has the data in three planes: Y480, Cb240, Cr240. According to what I've researched, this matches the selected kCVPixelFormatType. Because the format is biplanar, I need to convert the data into a buffer that contains Y480 and interleaved CbCr480.
I tried to create a buffer with 2 planes:
frame->data[0] on the first plane,
frame->data[1] and frame->data[2] interleaved on the second plane.
However, I'm getting return error -6661 (invalid argument) from CVPixelBufferCreateWithBytes:
"Invalid function parameter. For example, out of range or the wrong type."
I don't have expertise on image processing at all, so any pointers to documentation that can get me started in the right approach to this problem are appreciated. My C skills aren't top of the line either so maybe I'm making a basic mistake here.
uint8_t **buffer = malloc(2*sizeof(int *));
buffer[0] = frame->data[0];
buffer[1] = malloc(frame->linesize[0]*sizeof(int));
for(int i = 0; i<frame->linesize[0]; i++){
    if(i%2){
        buffer[1][i]=frame->data[1][i/2];
    }else{
        buffer[1][i]=frame->data[2][i/2];
    }
}
int ret = CVPixelBufferCreateWithBytes(NULL, frame->width, frame->height, kCVPixelFormatType_420YpCbCr8BiPlanarVideoRange, buffer, frame->linesize[0], NULL, 0, NULL, cvPixelBufferSample)
The frame is the AVFrame with the rawData from FFMPEG Decoding.
My C skills aren't top of the line either so maybe I'm making a basic mistake here.
You're making several:
You should be using CVPixelBufferCreateWithPlanarBytes(). I do not know if CVPixelBufferCreateWithBytes() can be used to create a planar video frame; if so, it will require a pointer to a "plane descriptor block" (I can't seem to find the struct in the docs).
frame->linesize[0] is the bytes per row, not the size of the whole image. The docs are unclear, but the usage is fairly unambiguous.
frame->linesize[0] refers to the Y plane; you care about the UV planes.
Where is sizeof(int) from?
You're passing in cvPixelBufferSample; you might mean &cvPixelBufferSample.
You're not passing in a release callback. The documentation does not say that you can pass NULL.
Try something like this:
// The chroma planes of a 4:2:0 frame have half the luma height.
size_t srcPlaneSize = frame->linesize[1] * (frame->height / 2);
size_t dstPlaneSize = srcPlaneSize * 2;
uint8_t *dstPlane = malloc(dstPlaneSize);
void *planeBaseAddress[2] = { frame->data[0], dstPlane };
// This loop is very naive and assumes that the line sizes are the same.
// It also copies padding bytes.
assert(frame->linesize[1] == frame->linesize[2]);
for (size_t i = 0; i < srcPlaneSize; i++) {
    // The biplanar 4:2:0 format stores Cb (U) first, then Cr (V).
    dstPlane[2*i    ] = frame->data[1][i];
    dstPlane[2*i + 1] = frame->data[2][i];
}
// This assumes the width and height are even (it's 420 after all).
assert(frame->width % 2 == 0 && frame->height % 2 == 0);
size_t planeWidth[2] = { frame->width, frame->width / 2 };
size_t planeHeight[2] = { frame->height, frame->height / 2 };
// I'm not sure where you'd get this.
size_t planeBytesPerRow[2] = { frame->linesize[0], frame->linesize[1] * 2 };
int ret = CVPixelBufferCreateWithPlanarBytes(
    NULL,
    frame->width,
    frame->height,
    kCVPixelFormatType_420YpCbCr8BiPlanarVideoRange,
    NULL,
    0,
    2,
    planeBaseAddress,
    planeWidth,
    planeHeight,
    planeBytesPerRow,
    YOUR_RELEASE_CALLBACK,
    YOUR_RELEASE_CALLBACK_CONTEXT,
    NULL,
    &cvPixelBufferSample);
Memory management is left as an exercise to the reader, but for test code you might get away with passing in NULL instead of a release callback.
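A row-by-row variant of the interleave can avoid copying padding. The sketch below assumes the destination is an NV12-style CbCr plane (Cb first, then Cr) and that the chroma dimensions are already known. Copying per row means source padding is skipped and the two chroma planes may have different line sizes. `interleave_chroma` and its parameters are illustrative names, not part of FFmpeg or Core Video.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Interleave separate Cb (U) and Cr (V) planes into one CbCr plane.
 * chroma_w / chroma_h are the chroma dimensions (luma / 2 for 4:2:0).
 * dst_stride must be at least 2 * chroma_w bytes. */
void interleave_chroma(const uint8_t *cb, int cb_stride,
                       const uint8_t *cr, int cr_stride,
                       int chroma_w, int chroma_h,
                       uint8_t *dst, int dst_stride)
{
    assert(dst_stride >= 2 * chroma_w);
    for (int row = 0; row < chroma_h; row++) {
        const uint8_t *cb_row = cb + (size_t)row * cb_stride;
        const uint8_t *cr_row = cr + (size_t)row * cr_stride;
        uint8_t *dst_row = dst + (size_t)row * dst_stride;
        for (int col = 0; col < chroma_w; col++) {
            dst_row[2 * col]     = cb_row[col]; /* Cb comes first */
            dst_row[2 * col + 1] = cr_row[col]; /* then Cr */
        }
    }
}
```

With this layout the bytes-per-row value for the second plane would be whatever `dst_stride` was chosen, rather than twice the source linesize.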
I'm writing an OpenCV 2.1 program with Visual C++ 2008 Express. I want to read the color data of each pixel and modify it pixel by pixel.
I understand that frmSource.channels() returns the number of color channels of the Mat frmSource, but it always returns 1, not 3 or 4, even though the video is definitely in color.
Am I wrong?
If I'm wrong, please show me how to get each color component of each pixel.
Also, the total frame count from get(CV_CAP_PROP_FRAME_COUNT) is much larger than I expected, so I divided get(CV_CAP_PROP_FRAME_COUNT) by get(CV_CAP_PROP_FPS) (the frame rate) and got the result I expected.
I understand that a frame is like a single still of a movie, at 30 frames per second. Is that right?
My code is as follows:
void fEditMain()
{
    VideoCapture vdoCap("C:/Users/Public/Videos/Sample Videos/WildlifeTest.wmv");
    // this video file is provided in window7
    if( !vdoCap.isOpened() )
    {
        printf("failed to open!\n");
        return;
    }
    Mat frmSource;
    vdoCap >> frmSource;
    if(! frmSource.data) return;
    VideoWriter vdoRec(vRecFIleName, CV_FOURCC('W','M','V','1'), 30, frmSource.size(), true);
    namedWindow("video",1);
    // record video
    int vFrmCntNo=1;
    for(;;)
    {
        int vDepth = frmSource.depth();
        vChannel = frmSource.channels();
        // here! vChannel is always 1, i expect 3 or 4 because it is color image
        imshow("video", frmSource);// frmSource Show
        vdoRec << frmSource;
        vdoCap >> frmSource;
        if(! frmSource.data)
            return;
    }
    return;
}
I am not sure this will answer your question, but if you use IplImage it is very easy to get the correct number of channels and to manipulate the image. Try using:
IplImage *frm = cvQueryFrame(cap);
int numOfChannels = frm->nChannels;
A video is composed of frames, and get(CV_CAP_PROP_FPS) tells you how many frames pass per second. If you divide the frame count by the FPS, you get the length of the clip in seconds.
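The division described here is simple enough to write down directly. This is only a sketch of the arithmetic: `clip_seconds` is a hypothetical helper, and it assumes the two inputs are the values reported for the frame count and frame rate properties.

```c
#include <assert.h>

/* Clip length in seconds from a frame count and a frame rate.
 * Returns -1.0 when the frame rate is not positive. */
double clip_seconds(double frame_count, double fps)
{
    if (fps <= 0.0)
        return -1.0;
    return frame_count / fps;
}
```

For example, 900 frames at 30 frames per second gives 30 seconds.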
I'm trying to get information from an image using the function cvGet2D in OpenCV.
I created an array of 10 IplImage pointers:
IplImage *imageArray[10];
and I'm saving 10 images from my webcam:
imageArray[numPicture] = cvQueryFrame(capture);
when I call the function:
info = cvGet2D(imageArray[0], 250, 100);
where info:
CvScalar info;
I got the error:
OpenCV Error: Bad argument (unrecognized or unsupported array type) in cvPtr2D, file /build/buildd/opencv-2.1.0/src/cxcore/cxarray.cpp, line 1824
terminate called after throwing an instance of 'cv::Exception'
what(): /build/buildd/opencv-2.1.0/src/cxcore/cxarray.cpp:1824: error: (-5) unrecognized or unsupported array type in function cvPtr2D
If I use the function cvLoadImage to initialize an IplImage pointer and then I pass it to the cvGet2D function, the code works properly:
IplImage* imagen = cvLoadImage("test0.jpg");
info = cvGet2D(imagen, 250, 100);
however, I want to use the information already stored in my array.
Do you know how I can solve it?
Even though it's a very late response, I guess someone might still be searching for a solution with cvGet2D. Here it is.
For cvGet2D, we need to pass the arguments in the order Y first and then X.
Example:
CvScalar s = cvGet2D(img, Y, X);
It's not mentioned anywhere in the documentation; you only find it inside core.h / core_c.h. Go to the declaration of cvGet2D(), and above the function prototypes there are a few comments that explain this.
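The (Y, X) ordering follows from row-major storage: the row index selects a line of widthStep bytes, and the column index selects a pixel within that line. The sketch below mimics that lookup on a raw buffer. `pixel_channel` is a hypothetical helper, not an OpenCV function, and it assumes an interleaved 8-bit image.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Return channel ch of the pixel at (row, col) in an interleaved 8-bit image.
 * width_step is the number of bytes per row, including any padding. */
uint8_t pixel_channel(const uint8_t *data, int width_step, int channels,
                      int row, int col, int ch)
{
    assert(ch >= 0 && ch < channels);
    return data[(size_t)row * width_step + (size_t)col * channels + ch];
}
```

Swapping row and col reads a different pixel, or runs past the buffer when the image is not square, so the argument order matters.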
Yeah, the message is correct.
If you want to store a pixel value, you need to do something like this:
int value = 0;
value = ((uchar *)(img->imageData + i*img->widthStep))[j*img->nChannels +0];
cout << "pixel value for Blue Channel and (i,j) coordinates: " << value << endl;
To summarize: to print or store a pixel value, you must first copy it into an integer (pixel values range from 0 to 255). If you only want to test the value (in an if clause or similar), you can read it directly without the intermediate integer.
I think that's a little bit weird when you start, but after working with it two or three times you will manage without difficulty.
Sorry, but cvGet2D is not the best way to obtain a pixel value. I know it is the shortest and clearest way, because a single line of code gives you the pixel value from its coordinates.
I suggest this option instead. When you see this code you will think it is complicated, but it is more efficient.
#include <stdio.h>
#include <opencv/cv.h>      // OpenCV C API headers (1.x / 2.x layout)
#include <opencv/highgui.h>

int main()
{
    // Acquire the image (I'm reading it from a file);
    IplImage* img = cvLoadImage("image.bmp",1);
    if (img == NULL) return 1; // the file could not be loaded
    int i,j,k;
    // Variables to store image properties
    int height,width,step,channels;
    uchar *data;
    // Variables to store the number of white pixels and a flag
    int WhiteCount,bWhite;
    // Acquire image info
    height = img->height;
    width = img->width;
    step = img->widthStep;
    channels = img->nChannels;
    data = (uchar *)img->imageData;
    // Begin
    WhiteCount = 0;
    for(i=0;i<height;i++)
    {
        for(j=0;j<width;j++)
        { // Go through each channel of the pixel (B, G, and R) to see if it's equal to 255
            bWhite = 0;
            for(k=0;k<channels;k++)
            { // This checks if the pixel's kth channel is 255 - it can be faster.
                if (data[i*step+j*channels+k]==255) bWhite = 1;
                else
                {
                    bWhite = 0;
                    break;
                }
            }
            if(bWhite == 1) WhiteCount++;
        }
    }
    printf("Percentage: %f%%",100.0*WhiteCount/(height*width));
    cvReleaseImage(&img);
    return 0;
}
This code counts the white pixels and gives you the percentage of white pixels in the image.