I'm trying to average every 30 frames of a video to create a blurred timelapse. Reading and writing the video both work, but something is wrong: I only see the blue channel (or one channel that ends up being written to blue).
Any ideas, or better ways to do this? I'm new to OpenCV. The code is in Kotlin, but I think the issue would be the same in Java, Python, or any other binding.
val videoCapture = VideoCapture(parsedArgs.inputFile)
val frameSize = Size(
    videoCapture.get(Videoio.CV_CAP_PROP_FRAME_WIDTH),
    videoCapture.get(Videoio.CV_CAP_PROP_FRAME_HEIGHT))
val fps = videoCapture.get(Videoio.CAP_PROP_FPS)
val videoWriter = VideoWriter(parsedArgs.outputFile, VideoWriter.fourcc('M', 'J', 'P', 'G'), fps, frameSize)
val image = Mat(frameSize, CV_8UC3)
val blended = Mat(frameSize, CV_64FC3)
println("Size: $frameSize fps:$fps over $frameCount frames")
try {
    while (videoCapture.read(image)) {
        val frameNumber = videoCapture.get(Videoio.CAP_PROP_POS_FRAMES).toInt()
        Core.flip(image, image, -1) // I shot the video upside down
        Imgproc.accumulate(image, blended)
        if (frameNumber > 0 && frameNumber % parsedArgs.windowSize == 0) {
            Core.multiply(blended, Scalar(1.0 / parsedArgs.windowSize), blended)
            blended.convertTo(image, CV_8UC3)
            videoWriter.write(image)
            blended.setTo(Scalar(0.0, 0.0, 0.0))
            println(frameNumber.toDouble() / frameCount)
        }
    }
} finally {
    videoCapture.release()
    videoWriter.release()
}
Martin Beckett led me to the right answer (thank you!). I was multiplying by Scalar(double), which should have been my hint, because I wasn't multiplying by a plain double.
Core.multiply expects a Scalar with one value per channel, so it was happily multiplying my first channel by the divisor and the remaining channels by 0.
Imgproc.accumulate(image, blended64)
if (frameNumber > 0 && frameNumber % parsedArgs.windowSize == 0) {
val blendDivisor = 1.0 / parsedArgs.windowSize
Core.multiply(blended64, Scalar(blendDivisor, blendDivisor, blendDivisor), blended64)
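To make the failure mode concrete, here is a small pure-Python sketch (no OpenCV involved) of what happens when a one-value scalar is padded with zeros before a per-channel multiply. The helper names make_scalar and multiply_pixel and the sample sums are made up for illustration; they only mimic the behaviour described above.

```python
def make_scalar(*values):
    """Mimic a Scalar: missing components are padded with 0 (up to 4 values)."""
    padded = list(values) + [0.0] * 4
    return padded[:4]


def multiply_pixel(pixel, scalar):
    """Multiply each channel of a pixel by the matching scalar component."""
    return [channel * factor for channel, factor in zip(pixel, scalar)]


window_size = 30
# Hypothetical accumulated B, G, R sums of one pixel over 30 frames.
bgr_sum = [3000.0, 1500.0, 600.0]

# One-value scalar: only the first (blue) channel survives, the rest become 0.
blue_only = multiply_pixel(bgr_sum, make_scalar(1.0 / window_size))

# One value per channel: every channel is averaged.
divisor = 1.0 / window_size
averaged = multiply_pixel(bgr_sum, make_scalar(divisor, divisor, divisor))

print(blue_only)
print(averaged)
```

The second call corresponds to the corrected Scalar(blendDivisor, blendDivisor, blendDivisor) above.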
My guess would be the different types passed to Imgproc.accumulate(image, blended); try converting image to match blended before combining them.
If the entire 8-bit x 3 pixel data were being written into one float, note that the first field in an OpenCV image is blue (it uses BGR order).
I am having trouble getting a UIImage out of the frames I am reading into my iOS FFmpeg project. I need to be able to read a frame in, and then convert this to a UIImage in order to display the frame in a UIImageView. My code appears to be reading in the frames, but I am lost as to how to convert them as there is little documentation on how to do this. Can anyone help?
while (!finished) {
    if (av_read_frame(_formatContext, &packet) >= 0) {
        if (packet.stream_index == _videoStream) {
            int ret = avcodec_send_packet(_codecContext, &packet);
            if (ret < 0 || ret == AVERROR(EAGAIN) || ret == AVERROR_EOF) {
                printf("av_codec_send_packet error ");
            }
            while (ret >= 0) {
                ret = avcodec_receive_frame(_codecContext, _frame);
                if (ret == AVERROR(EAGAIN) || ret == AVERROR_EOF) {
                    printf("avcodec_receive_frame error ");
                }
                finished = true;
            }
        }
        av_packet_unref(&packet);
    }
}
You should know about pixel formats such as RGB and YUV. Videos almost always use a YUV format like yuv420p. Then study the AVFrame structure; here is some info:
AVFrame.format : the current frame's pixel format, e.g. AV_PIX_FMT_YUV420P
AVFrame.width : horizontal length of the current frame (hence width), in pixels
AVFrame.height : vertical length of the current frame (hence height), in pixels
Now, where is the actual frame buffer, you might ask? It is in AVFrame.data[n].
n can be 0-3. Depending on the format, the first one alone may contain the whole frame, or all 4 of them may be used. E.g. yuv420p uses 0, 1, and 2. Their linesizes (aka strides) can be obtained by reading the corresponding AVFrame.linesize[n] value.
As for yuv420p:
data[0] is Y plane
data[1] is U plane
data[2] is V plane
If you multiply linesize[0] with AVFrame.height, you'll get size of that plane (Y) as number of bytes.
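As a rough illustration of that arithmetic, the sketch below computes the byte size of all three yuv420p planes from the linesizes and the frame height. It is plain Python, not FFmpeg code; the function name and the sample linesize values are assumptions, and real linesizes may include alignment padding. It relies on the fact that in yuv420p the U and V planes have half the vertical resolution, rounded up.

```python
def yuv420p_plane_sizes(linesize, height):
    """Return the byte sizes of the Y, U and V planes of a yuv420p frame.

    linesize: the three strides in bytes (as in linesize[0..2])
    height:   the frame height in pixels
    """
    # In yuv420p the chroma planes are subsampled by 2 vertically (rounded up).
    chroma_height = (height + 1) // 2
    return [
        linesize[0] * height,         # Y plane
        linesize[1] * chroma_height,  # U plane
        linesize[2] * chroma_height,  # V plane
    ]


# Hypothetical 1920x1080 frame without padding.
print(yuv420p_plane_sizes([1920, 960, 960], 1080))
```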
I don't know the UIImage structure (or whatever it is); if it requires a specific format like RGB, you need to convert your AVFrame to that format using swscale.
Here are some examples: https://github.com/FFmpeg/FFmpeg/blob/master/doc/examples/scaling_video.c
In libav (ffmpeg), scaling (resizing) and pixel format conversion are done by the same function.
Hope this helps.
I have been stuck on this problem for about 20 hours.
The quality is not very good because, on a 1080p video, the minimap is less than 300px x 300px.
I want to detect the 10 hero circles in this image:
Like this:
For background removal, I can use this:
The hero portrait circles have a radius between 8 and 12 px, because a hero portrait is about 21x21px.
With this code:
Mat minimapMat = Imgcodecs.imread("minimap.png");
Mat minimapCleanMat = Imgcodecs.imread("minimapClean.png");
Mat minimapDiffMat = new Mat();
Core.subtract(minimapMat, minimapCleanMat, minimapDiffMat);
I obtain this:
Now I apply circle detection to it:
findCircles(minimapDiffMat);
public static void findCircles(Mat imgSrc) {
    Mat img = imgSrc.clone();
    Mat gray = new Mat();
    Imgproc.cvtColor(img, gray, Imgproc.COLOR_BGR2GRAY);
    Imgproc.blur(gray, gray, new Size(3, 3));
    Mat edges = new Mat();
    int lowThreshold = 40;
    int ratio = 3;
    Imgproc.Canny(gray, edges, lowThreshold, lowThreshold * ratio);
    Mat circles = new Mat();
    Vector<Mat> circlesList = new Vector<Mat>();
    Imgproc.HoughCircles(edges, circles, Imgproc.CV_HOUGH_GRADIENT, 1, 10, 5, 20, 7, 15);
    double x = 0.0;
    double y = 0.0;
    int r = 0;
    for (int i = 0; i < circles.rows(); i++) {
        for (int k = 0; k < circles.cols(); k++) {
            double[] data = circles.get(i, k);
            for (int j = 0; j < data.length; j++) {
                x = data[0];
                y = data[1];
                r = (int) data[2];
            }
            Point center = new Point(x, y);
            // circle center
            Imgproc.circle(img, center, 3, new Scalar(0, 255, 0), -1);
            // circle outline
            Imgproc.circle(img, center, r, new Scalar(0, 255, 0), 1);
        }
    }
    HighGui.imshow("cirleIn", img);
}
The result is not good: only 2 of the 10 circles are detected:
I have tried KNN background subtraction too:
With less success.
Any tips? Thanks a lot in advance.
The problem is that your minimap contains highlighted parts (possibly around active players), which renders your background removal inoperable. Why not threshold the highlight colors out of the image? From what I see there are just a few of them. I do not use OpenCV, so I gave it a shot in C++; here is the result:
int x,y;
color c0,c1,c;
picture pic0,pic1,pic2;
// pic0 - source background
// pic1 - source map
// pic2 - output
// ensure all images are the same size
pic1.resize(pic0.xs,pic0.ys);
pic2.resize(pic0.xs,pic0.ys);
// process all pixels
for (y=0;y<pic2.ys;y++)
for (x=0;x<pic2.xs;x++)
{
// get both colors without alpha
c0.dd=pic0.p[y][x].dd&0x00FFFFFF;
c1.dd=pic1.p[y][x].dd&0x00FFFFFF; c=c1;
// threshold 0xAARRGGBB distance^2
if (distance2(c1,color(0x00EEEEEE))<2000) c.dd=0; // white-ish rectangle
if (distance2(c1,color(0x00889971))<2000) c.dd=0; // gray-ish path
if (distance2(c1,color(0x005A6443))<2000) c.dd=0; // gray-ish path
if (distance2(c1,color(0x0021A2C2))<2000) c.dd=0; // aqua water
if (distance2(c1,color(0x002A6D70))<2000) c.dd=0; // aqua water
if (distance2(c1,color(0x00439D96))<2000) c.dd=0; // aqua water
if (distance2(c1,c0 )<2500) c.dd=0; // close to background
pic2.p[y][x]=c;
}
pic2.save("out0.png");
pic2.pixel_format(_pf_u); // convert to gray scale
pic2.smooth(); // blur a little
pic2.save("out1.png");
pic2.threshold(0,80,765,0x00000000); // set dark pixels (<80) to black (0) and rest to white (3*255)
pic2.pixel_format(_pf_rgba);// convert back to RGB
pic2.save("out2.png");
So you need to find the OpenCV counterparts to this. The thresholds are squared color distances (so I do not need sqrt), and it looks like 50^2 is ideal for a <0,255> per-channel RGB vector.
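For readers who do not use the picture class, the per-pixel rule can be sketched in plain Python as follows. This is only an approximation of the loop above under the assumption that pixels are (R, G, B) tuples; the function names are made up, and the color list is the set of 0xAARRGGBB constants from the code rewritten as RGB triples.

```python
# Highlight colors from the C++ code, converted from 0xAARRGGBB to (R, G, B).
KNOWN_COLORS = [
    (0xEE, 0xEE, 0xEE),  # white-ish rectangle
    (0x88, 0x99, 0x71),  # gray-ish path
    (0x5A, 0x64, 0x43),  # gray-ish path
    (0x21, 0xA2, 0xC2),  # aqua water
    (0x2A, 0x6D, 0x70),  # aqua water
    (0x43, 0x9D, 0x96),  # aqua water
]


def distance2(a, b):
    """Squared Euclidean distance between two colors (no sqrt needed)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def filter_pixel(pixel, background_pixel, color_limit=2000, background_limit=2500):
    """Return black if the pixel is close to a known color or to the background."""
    if any(distance2(pixel, known) < color_limit for known in KNOWN_COLORS):
        return (0, 0, 0)
    if distance2(pixel, background_pixel) < background_limit:
        return (0, 0, 0)
    return pixel


# A water-colored pixel is removed, a strongly red one is kept.
print(filter_pixel((35, 160, 190), (60, 100, 70)))
print(filter_pixel((200, 30, 30), (60, 100, 70)))
```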
I use my own picture class for images so some members are:
xs,ys is size of image in pixels
p[y][x].dd is pixel at (x,y) position as 32 bit integer type
clear(color) clears entire image with color
resize(xs,ys) resizes image to new resolution
bmp is VCL encapsulated GDI Bitmap with Canvas access
pf holds actual pixel format of the image:
enum _pixel_format_enum
{
_pf_none=0, // undefined
_pf_rgba, // 32 bit RGBA
_pf_s, // 32 bit signed int
_pf_u, // 32 bit unsigned int
_pf_ss, // 2x16 bit signed int
_pf_uu, // 2x16 bit unsigned int
_pixel_format_enum_end
};
color and pixels are encoded like this:
union color
{
DWORD dd; WORD dw[2]; byte db[4];
int i; short int ii[2];
color(){}; color(color& a){ *this=a; }; ~color(){}; color* operator = (const color *a) { dd=a->dd; return this; }; /*color* operator = (const color &a) { ...copy... return this; };*/
};
The bands are:
enum{
_x=0, // dw
_y=1,
_b=0, // db
_g=1,
_r=2,
_a=3,
_v=0, // db
_s=1,
_h=2,
};
Here also the distance^2 between colors I used for thresholding:
DWORD distance2(color &a,color &b)
{
DWORD d,dd;
d=DWORD(a.db[0])-DWORD(b.db[0]); dd =d*d;
d=DWORD(a.db[1])-DWORD(b.db[1]); dd+=d*d;
d=DWORD(a.db[2])-DWORD(b.db[2]); dd+=d*d;
d=DWORD(a.db[3])-DWORD(b.db[3]); dd+=d*d;
return dd;
}
As input I used your images:
pic0:
pic1:
And here the (sub) results:
out0.png:
out1.png:
out2.png:
Now just remove noise (by blurring or by erosion) a bit and apply your circle fitting or hough transform...
[Edit1] circle detector
I gave it a bit of thought and implemented a simple detector. I just check the circumference points around every pixel position with a constant radius (the player circle), and if the number of set points is above a threshold, I have found a potential circle. This is better than using the whole disc area, as some of the players contain holes and there would also be more pixels to test. Then I average close circles together and render the output. Here is the updated code:
int i,j,x,y,xx,yy,x0,y0,r=10,d;
List<int> cxy; // circle circumference points
List<int> plr; // player { x,y } list
color c0,c1,c;
picture pic0,pic1,pic2;
// pic0 - source background
// pic1 - source map
// pic2 - output
// ensure all images are the same size
pic1.resize(pic0.xs,pic0.ys);
pic2.resize(pic0.xs,pic0.ys);
// process all pixels
for (y=0;y<pic2.ys;y++)
for (x=0;x<pic2.xs;x++)
{
// get both colors without alpha
c0.dd=pic0.p[y][x].dd&0x00FFFFFF;
c1.dd=pic1.p[y][x].dd&0x00FFFFFF; c=c1;
// threshold 0xAARRGGBB distance^2
if (distance2(c1,color(0x00EEEEEE))<2000) c.dd=0; // white-ish rectangle
if (distance2(c1,color(0x00889971))<2000) c.dd=0; // gray-ish path
if (distance2(c1,color(0x005A6443))<2000) c.dd=0; // gray-ish path
if (distance2(c1,color(0x0021A2C2))<2000) c.dd=0; // aqua water
if (distance2(c1,color(0x002A6D70))<2000) c.dd=0; // aqua water
if (distance2(c1,color(0x00439D96))<2000) c.dd=0; // aqua water
if (distance2(c1,c0 )<2500) c.dd=0; // close to background
pic2.p[y][x]=c;
}
// pic2.save("out0.png");
pic2.pixel_format(_pf_u); // convert to gray scale
pic2.smooth(); // blur a little
// pic2.save("out1.png");
pic2.threshold(0,80,765,0x00000000); // set dark pixels (<80) to black (0) and rest to white (3*255)
// compute player circle circumference points mask
x0=r-1; y0=r; x0*=x0; y0*=y0;
for (x=-r,xx=x*x;x<=r;x++,xx=x*x)
for (y=-r,yy=y*y;y<=r;y++,yy=y*y)
{
d=xx+yy;
if ((d>=x0)&&(d<=y0))
{
cxy.add(x);
cxy.add(y);
}
}
// get all potential player circles
x0=(5*cxy.num)/20;
for (y=r;y<pic2.ys-r;y+=2) // no need to step by single pixel ...
for (x=r;x<pic2.xs-r;x+=2)
{
for (d=0,i=0;i<cxy.num;)
{
xx=x+cxy.dat[i]; i++;
yy=y+cxy.dat[i]; i++;
if (pic2.p[yy][xx].dd>100) d++;
}
if (d>=x0) { plr.add(x); plr.add(y); }
}
// pic2.pixel_format(_pf_rgba);// convert back to RGB
// pic2.save("out2.png");
// average all circles too close together
pic2=pic1; // use original image again
pic2.bmp->Canvas->Pen->Color=TColor(0x0000FF00);
pic2.bmp->Canvas->Pen->Width=3;
pic2.bmp->Canvas->Brush->Style=bsClear;
for (i=0;i<plr.num;i+=2) if (plr.dat[i]>=0)
{
x0=plr.dat[i+0]; x=x0;
y0=plr.dat[i+1]; y=y0; d=1;
for (j=i+2;j<plr.num;j+=2) if (plr.dat[j]>=0)
{
xx=plr.dat[j+0];
yy=plr.dat[j+1];
if ((((x0-xx)*(x0-xx))+((y0-yy)*(y0-yy)))*10<=20*r*r) // if close (distance^2 <= 2*r^2)
{
x+=xx; y+=yy; d++; // add to average
plr.dat[j+0]=-1; // mark as deleted
plr.dat[j+1]=-1;
}
}
x/=d; y/=d;
plr.dat[i+0]=x;
plr.dat[i+1]=y;
pic2.bmp->Canvas->Ellipse(x-r,y-r,x+r,y+r);
}
pic2.bmp->Canvas->Pen->Width=1;
pic2.bmp->Canvas->Brush->Style=bsSolid;
// pic2.save("out3.png");
As you can see, the core of the code is the same; I just added the detector at the end.
I also use my own dynamic list template, so:
List<double> xxx; is the same as double xxx[];
xxx.add(5); adds 5 to end of the list
xxx[7] access array element (safe)
xxx.dat[7] access array element (unsafe but fast direct access)
xxx.num is the actual used size of the array
xxx.reset() clears the array and set xxx.num=0
xxx.allocate(100) preallocate space for 100 items
And here is the final result, out3.png:
As you can see, it gets a bit messed up when players are very close together (due to the circle averaging); with some tweaking you might get better results. On second thought, it might be due to that small red circle nearby...
I used VCL/GDI to render the circles, so just ignore the pic2.bmp->Canvas-> stuff or port it to whatever you use.
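If the VCL-specific parts get in the way, the detector idea itself (vote on circumference points, then average candidates that lie close together) can be sketched in plain Python. This is a loose port under a few assumptions: the mask is a list of rows of 0/1 values, at least half of the ring points must be set (which is what the threshold in the code above works out to, since cxy stores x and y separately), and the merge radius is taken as sqrt(2)*r. All function names are made up.

```python
def ring_offsets(r):
    """Offsets (dx, dy) whose squared distance lies in [(r-1)^2, r^2]."""
    lo, hi = (r - 1) ** 2, r ** 2
    return [(dx, dy)
            for dx in range(-r, r + 1)
            for dy in range(-r, r + 1)
            if lo <= dx * dx + dy * dy <= hi]


def find_circle_candidates(mask, r, min_fraction=0.5, step=2):
    """Return centers (x, y) where enough circumference points are set."""
    height, width = len(mask), len(mask[0])
    ring = ring_offsets(r)
    needed = min_fraction * len(ring)
    hits = []
    for y in range(r, height - r, step):      # no need to test every pixel
        for x in range(r, width - r, step):
            votes = sum(1 for dx, dy in ring if mask[y + dy][x + dx])
            if votes >= needed:
                hits.append((x, y))
    return hits


def merge_close(points, r):
    """Average candidates whose squared distance is at most 2*r^2."""
    merged = []
    used = [False] * len(points)
    for i, (x0, y0) in enumerate(points):
        if used[i]:
            continue
        sum_x, sum_y, count = x0, y0, 1
        for j in range(i + 1, len(points)):
            if used[j]:
                continue
            xx, yy = points[j]
            if (x0 - xx) ** 2 + (y0 - yy) ** 2 <= 2 * r * r:
                sum_x, sum_y, count = sum_x + xx, sum_y + yy, count + 1
                used[j] = True
        merged.append((sum_x // count, sum_y // count))
    return merged


# Synthetic 41x41 mask with one ring of radius 10 around (20, 20).
size, radius = 41, 10
mask = [[1 if 81 <= (x - 20) ** 2 + (y - 20) ** 2 <= 100 else 0
         for x in range(size)] for y in range(size)]
print(merge_close(find_circle_candidates(mask, radius), radius))
```

On a real minimap the mask would be the thresholded image (out2.png in the answer), and min_fraction and step are the knobs to tune.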
As the populated image is lighter in the blue areas around the heroes, your background subtraction is of virtually no use.
I tried to improve by applying a gain of 3 to the clean image before subtraction and here is the result.
The background has disappeared, but the outlines of the heroes are severely damaged.
I looked at your case with other approaches and I consider that it is a very difficult one.
What I do when I want to do image processing is first open the image in a paint editor (I use Gimp). Then I manipulate the image until I end up with something that defines the parts I want to detect.
Generally, RGB is bad for a lot of computer vision tasks, and making it gray scale solves only a part of the problem.
A good start is trying to decompose the image to HSL instead.
Doing so on the first image, and only looking at the Hue channel gives me this:
Several of the blobs are quite well defined.
Playing a bit with the contrast and brightness of the Hue and Luminance layers and multiplying them gives me this:
It enhances the ring around the markers, which might be useful.
These methods all have corresponding functionality in OpenCV.
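As a rough, library-free illustration of the decomposition, the sketch below splits an RGB image into hue and lightness layers with Python's standard colorsys module and multiplies them. It assumes the image is a list of rows of (R, G, B) tuples with 0-255 values; the function names are made up, and the contrast and brightness tweaks done in Gimp are not reproduced.

```python
import colorsys


def hue_and_lightness(rgb_image):
    """Split an RGB image into hue and lightness layers scaled to 0-255."""
    hue_layer, light_layer = [], []
    for row in rgb_image:
        hue_row, light_row = [], []
        for r, g, b in row:
            h, l, _s = colorsys.rgb_to_hls(r / 255.0, g / 255.0, b / 255.0)
            hue_row.append(round(h * 255))
            light_row.append(round(l * 255))
        hue_layer.append(hue_row)
        light_layer.append(light_row)
    return hue_layer, light_layer


def multiply_layers(a, b):
    """Pixel-wise product of two 0-255 layers, rescaled back to 0-255."""
    return [[(x * y) // 255 for x, y in zip(row_a, row_b)]
            for row_a, row_b in zip(a, b)]


# Tiny hypothetical 1x3 image: red, green, blue.
image = [[(255, 0, 0), (0, 255, 0), (0, 0, 255)]]
hue, light = hue_and_lightness(image)
print(hue, light, multiply_layers(hue, light))
```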
It's a tricky task and you will most likely require several different filters and techniques to succeed. Hope this helps a bit. Good luck.
Right now I am working on an OCR algorithm with Template Matching, using the opencv library. I am comparing pixel by pixel, and till now I have obtained good results. The problem comes when the area I want to match is of different size.
Ex: Template size = 70x100 while ROI = 140x200.
Is there any function that I can use in order adapt the required size and end up with the same amount of rows and columns?
Thanks
Robert Grech
Usually one builds an image scale pyramid and then scans only with the 70x100 window across all scales, e.g. as in the OpenCV HOGDescriptor:
double scale = 1.;
double scale0 = 1.05;
int maxLevels = 64;
int nLevels;
Size templateSize(70,100);
cv::Mat testImage = cv::imread("test1.jpg");
vector<double> levelScale;
for( nLevels = 0; nLevels < maxLevels; nLevels++ )
{
    levelScale.push_back(scale);
    if( cvRound(testImage.cols/scale) < templateSize.width ||
        cvRound(testImage.rows/scale) < templateSize.height ||
        scale0 <= 1 )
        break;
    scale *= scale0;
}
nLevels = std::max(nLevels, 1);
levelScale.resize(nLevels);
int level;
for(level = 0; level < nLevels; level++)
{
    cv::Mat testAtScale;
    Size sz(cvRound(testImage.cols/levelScale[level]),
            cvRound(testImage.rows/levelScale[level]));
    resize(testImage,testAtScale,sz);
    //result = match(template,testAtScale);
    //cv::imshow("scale",testAtScale);
    //cv::waitKey();
}
You would then need to post-process your results back to the original scale. This is simple with a box, but if you have a heat map / response map / probability map, then resizing it back up may be somewhat hacky.
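The box case can be sketched in a few lines of plain Python. This is not OpenCV code: the function names are made up, sizes are (width, height) tuples, and unlike the C++ loop above the helper returns an empty list when the template does not even fit at scale 1. Because each level shrinks the image by its scale factor, a box found at that level maps back by multiplying its coordinates by the same factor.

```python
def pyramid_scales(image_size, template_size, scale_step=1.05, max_levels=64):
    """List the scale factors at which the template still fits in the image."""
    image_w, image_h = image_size
    template_w, template_h = template_size
    scales = []
    scale = 1.0
    for _ in range(max_levels):
        if (round(image_w / scale) < template_w or
                round(image_h / scale) < template_h):
            break
        scales.append(scale)
        if scale_step <= 1:
            break  # a step of 1 or less would never shrink the image
        scale *= scale_step
    return scales


def box_to_original(box, scale):
    """Map an (x, y, w, h) box found at a pyramid level back to the original image."""
    x, y, w, h = box
    return (round(x * scale), round(y * scale), round(w * scale), round(h * scale))


# The 140x200 region from the question, halved at each level for brevity.
print(pyramid_scales((140, 200), (70, 100), scale_step=2.0))
print(box_to_original((10, 20, 70, 100), 2.0))
```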
I'm coding an OpenCV 2.1 program with Visual C++ 2008 Express. I want to get the color data of each pixel and modify it pixel by pixel.
I understand that frmSource.channels() returns the number of color channels of the Mat frmSource, but it always returns 1, not 3 or 4, even though the video is definitely in color.
Am I wrong?
If I'm wrong, please guide me on how to get each color component of each pixel.
Also, the total frame count from get(CV_CAP_PROP_FRAME_COUNT) is much larger than the frame count I expected, so I divide get(CV_CAP_PROP_FRAME_COUNT) by get(CV_CAP_PROP_FPS) (the frame rate) and get the result I expected.
I understand that a frame is like a single picture of a movie, with 30 frames per second. Is that right?
My code is as follows:
void fEditMain()
{
    VideoCapture vdoCap("C:/Users/Public/Videos/Sample Videos/WildlifeTest.wmv");
    // this video file is provided in window7
    if( !vdoCap.isOpened() )
    {
        printf("failed to open!\n");
        return;
    }
    Mat frmSource;
    vdoCap >> frmSource;
    if(! frmSource.data) return;
    VideoWriter vdoRec(vRecFIleName, CV_FOURCC('W','M','V','1'), 30, frmSource.size(), true);
    namedWindow("video",1);
    // record video
    int vFrmCntNo=1;
    for(;;)
    {
        int vDepth = frmSource.depth();
        vChannel = frmSource.channels();
        // here! vChannel is always 1, i expect 3 or 4 because it is color image
        imshow("video", frmSource);// frmSource Show
        vdoRec << frmSource;
        vdoCap >> frmSource;
        if(! frmSource.data)
            return;
    }
    return;
}
I am not sure if this will answer your question but if you use IplImage it will be very easy to get the correct number of channels as well as manipulate the image. Try using:
IplImage *frm = cvQueryFrame(cap);
int numOfChannels = frm->nChannels;
A video is composed of frames, and you can find out how many frames pass per second with get(CV_CAP_PROP_FPS). If you divide the frame count by the FPS, you get the length of the clip in seconds.
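The relation is simple enough to write down directly. The sketch below is plain Python with a made-up function name and made-up numbers; in a real program the two inputs would come from the capture's frame-count and FPS properties.

```python
def clip_duration_seconds(frame_count, fps):
    """Length of a clip in seconds, given its total frame count and frame rate."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    return frame_count / fps


# Hypothetical clip: 900 frames at 30 frames per second.
print(clip_duration_seconds(900, 30.0))
```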
I'm looking for a way to automatically remove (=make transparent) a "green screen" portrait background from a lot of pictures.
My own attempts thus far have been... ehum... less successful.
I'm looking around for any hints or solutions or papers on the subject. Commercial solutions are just fine, too.
And before you comment and say that it is impossible to do this automatically: no it isn't. There actually exists a company which offers exactly this service, and if I fail to come up with a different solution we're going to use them. The problem is that they guard their algorithm with their lives, and therefore won't sell/license their software. Instead we have to FTP all pictures to them where the processing is done and then we FTP the result back home. (And no, they don't have an underpaid staff hidden away in the Philippines which handles this manually, since we're talking several thousand pictures a day...) However, this approach limits its usefulness for several reasons. So I'd really like a solution where this could be done instantly while being offline from the internet.
EDIT: My "portraits" depict people, who do have hair, which is a really tricky part since the green background will bleed into the hair. Another tricky part is whether it is possible to distinguish between the green in the background and the same green in people's clothes. The company I'm talking about above claims that they can do it by figuring out whether the green area is in focus (sharp vs. blurred).
Since you didn't provide any image, I selected one from the web having a chroma key with different shades of green and a significant amount of noise due to JPEG compression.
No technology was specified, so I used Java and the Marvin Framework.
input image:
Step 1 simply converts green pixels to transparency. Basically, it applies a filtering rule in the HSV color space.
As you mentioned, the hair and some boundary pixels have colors mixed with green. To reduce this problem, in step 2 these pixels are filtered and balanced to reduce their green proportion.
before:
after:
Finally, in step 3, a gradient transparency is applied to all boundary pixels. The result will be even better with high-quality images.
final output:
Source code:
import static marvin.MarvinPluginCollection.*;
public class ChromaToTransparency {
public ChromaToTransparency(){
MarvinImage image = MarvinImageIO.loadImage("./res/person_chroma.jpg");
MarvinImage imageOut = new MarvinImage(image.getWidth(), image.getHeight());
// 1. Convert green to transparency
greenToTransparency(image, imageOut);
MarvinImageIO.saveImage(imageOut, "./res/person_chroma_out1.png");
// 2. Reduce remaining green pixels
reduceGreen(imageOut);
MarvinImageIO.saveImage(imageOut, "./res/person_chroma_out2.png");
// 3. Apply alpha to the boundary
alphaBoundary(imageOut, 6);
MarvinImageIO.saveImage(imageOut, "./res/person_chroma_out3.png");
}
private void greenToTransparency(MarvinImage imageIn, MarvinImage imageOut){
for(int y=0; y<imageIn.getHeight(); y++){
for(int x=0; x<imageIn.getWidth(); x++){
int color = imageIn.getIntColor(x, y);
int r = imageIn.getIntComponent0(x, y);
int g = imageIn.getIntComponent1(x, y);
int b = imageIn.getIntComponent2(x, y);
double[] hsv = MarvinColorModelConverter.rgbToHsv(new int[]{color});
if(hsv[0] >= 60 && hsv[0] <= 130 && hsv[1] >= 0.4 && hsv[2] >= 0.3){
imageOut.setIntColor(x, y, 0, 127, 127, 127);
}
else{
imageOut.setIntColor(x, y, color);
}
}
}
}
private void reduceGreen(MarvinImage image){
for(int y=0; y<image.getHeight(); y++){
for(int x=0; x<image.getWidth(); x++){
int r = image.getIntComponent0(x, y);
int g = image.getIntComponent1(x, y);
int b = image.getIntComponent2(x, y);
int color = image.getIntColor(x, y);
double[] hsv = MarvinColorModelConverter.rgbToHsv(new int[]{color});
if(hsv[0] >= 60 && hsv[0] <= 130 && hsv[1] >= 0.15 && hsv[2] > 0.15){
if((r*b) !=0 && (g*g) / (r*b) >= 1.5){
image.setIntColor(x, y, 255, (int)(r*1.4), (int)g, (int)(b*1.4));
} else{
image.setIntColor(x, y, 255, (int)(r*1.2), g, (int)(b*1.2));
}
}
}
}
}
public static void main(String[] args) {
new ChromaToTransparency();
}
}
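For readers without Marvin, the step 1 rule can be approximated with Python's standard colorsys module. This is only a sketch of the same thresholds (hue between 60 and 130 degrees, saturation at least 0.4, value at least 0.3), assuming hue is measured in degrees and saturation and value lie in 0..1; the function names are made up and pixels are plain (R, G, B) tuples.

```python
import colorsys


def is_chroma_green(r, g, b):
    """True if an 8-bit RGB pixel falls inside the green-screen HSV range."""
    h, s, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
    hue_degrees = h * 360.0
    return 60.0 <= hue_degrees <= 130.0 and s >= 0.4 and v >= 0.3


def green_to_transparency(pixels):
    """Map (R, G, B) pixels to (R, G, B, A), making green-screen pixels transparent."""
    result = []
    for r, g, b in pixels:
        if is_chroma_green(r, g, b):
            result.append((127, 127, 127, 0))   # fully transparent gray
        else:
            result.append((r, g, b, 255))       # unchanged, fully opaque
    return result


# Pure green is removed; a skin tone and a very dark green are kept.
print(green_to_transparency([(0, 255, 0), (200, 150, 120), (0, 50, 0)]))
```

Steps 2 and 3 (green reduction and boundary alpha) are not covered by this sketch.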
Take a look at this thread:
http://www.wizards-toolkit.org/discourse-server/viewtopic.php?f=2&t=14394&start=0
and the link within it to the tutorial at:
http://tech.natemurray.com/2007/12/convert-white-to-transparent.html
Then it's just a matter of writing some scripts to look through the directory full of images. Pretty simple.
If you know the "green color", you can write a small program with OpenCV in C/C++/Python to extract that color and replace it with transparent pixels.
There is 123 Video Magic Green Screen Background Software, and there are a few more programs made just to remove green screen backgrounds. Hope this helps.
PaintShop Pro allows you to remove backgrounds based on picking a color. They also have a Remove Background wand that will remove whatever you touch (converting those pixels to transparent). You can tweak the "tolerance" for the wand, such that it takes out pixels that are similar to the ones you are touching. This has worked pretty well for me in the past.
To automate it, you'd program a script in PSP that does what you want and then call it from your program. This might be a kludgy way to do automatic replacement, but it would be the cheapest, fastest solution without having to write a bunch of C#/C++ imaging code or pay a commercial agency.
That being said, you get what you pay for.