character image thinning - image-processing

im doing an ocr application . im confusing that how to do skew an image like this :
Second ,i have a character image with many font size. the problem is : how to thin them to the same size like this

For your first point: find the angle by which the text is rotated, and rotate your image by that angle. In your sample you can do this by finding the angles of the lines between the large black patches on the edges and the white areas. Look into edge detection and hough transform to help you find the lines, and then help you find their angle. OpenCV has a good implementation of both algorithms.
For your second point: that is the morphological operation binary skeleton in action.

you can use the following code for detecting and correcting skew but i need your help if you get any thinning algorithms...asume the input image is on the picture box....
try
{
//Check if there exists an image on the picture box
if (pictureBox1.Image == null)
{
MessageBox.Show("Please load an image first.", "Error", MessageBoxButtons.OK, MessageBoxIcon.Error);
uploadImageToolStripMenuItem.PerformClick();
return;
}
Bitmap image = new Bitmap(pictureBox1.Image);
BitmapData imageData = image.LockBits(new Rectangle(0, 0, image.Width, image.Height),
ImageLockMode.ReadWrite, PixelFormat.Format8bppIndexed);
//document image skew detection starts here
DocumentSkewChecker skewChecker = new DocumentSkewChecker();
// get documents skew angle
double angle = skewChecker.GetSkewAngle(imageData);
// create rotation filter and rotate image applying the filter
RotateBilinear rotationFilter = new RotateBilinear(-angle);
rotationFilter.FillColor = Color.White;
image.UnlockBits(imageData);
//if the angle is more 90 or 180, consider it as a normal image or if it is not, perform a skew correction
if (-angle == 90 || -angle == 180)
{
pictureBox1.Image = image;
pictureBox1.SizeMode = PictureBoxSizeMode.Zoom;
return;
}
//Bitmap rotatedImage = rotationFilter.Apply();
//draw a bitmap based on the skew angle...
Bitmap returnBitmap = new Bitmap(image.Width, image.Height);
Graphics g = Graphics.FromImage(returnBitmap);
g.TranslateTransform((float)image.Width / 2, (float)image.Height / 2);
g.RotateTransform(((float)angle));
g.TranslateTransform(-(float)image.Width / 2, -(float)image.Height / 2);
g.DrawImage(image, new Point(0, 0));
pictureBox1.Image = returnBitmap;
pictureBox1.SizeMode = PictureBoxSizeMode.Zoom;
}
catch (Exception ex)
{
MessageBox.Show(ex.Message);
}

Related

Detectings small circles on game minimap

i am stuck on this problem for like 20h.
The quality is not every good because on 1080p video, the minimap is less than 300px / 300px
I want to detect the 10 heros circles on this images:
Like this:
For background removal, i can use this:
The heroes portrait circle radius are between 8 to 12 because a hero portrait is like 21x21px.
With this code
Mat minimapMat = mgcodecs.imread("minimap.png");
Mat minimapCleanMat = Imgcodecs.imread("minimapClean.png");
Mat minimapDiffMat = new Mat();
Core.subtract(minimapMat, minimapCleanMat, minimapDiffMat);
I obtain this:
Now i apply circles detection on it:
findCircles(minimapDiffMat);
public static void findCircles(Mat imgSrc) {
Mat img = imgSrc.clone();
Mat gray = new Mat();
Imgproc.cvtColor(img, gray, Imgproc.COLOR_BGR2GRAY);
Imgproc.blur(gray, gray, new Size(3, 3));
Mat edges = new Mat();
int lowThreshold = 40;
int ratio = 3;
Imgproc.Canny(gray, edges, lowThreshold, lowThreshold * ratio);
Mat circles = new Mat();
Vector<Mat> circlesList = new Vector<Mat>();
Imgproc.HoughCircles(edges, circles, Imgproc.CV_HOUGH_GRADIENT, 1, 10, 5, 20, 7, 15);
double x = 0.0;
double y = 0.0;
int r = 0;
for (int i = 0; i < circles.rows(); i++) {
for (int k = 0; k < circles.cols(); k++) {
double[] data = circles.get(i, k);
for (int j = 0; j < data.length; j++) {
x = data[0];
y = data[1];
r = (int) data[2];
}
Point center = new Point(x, y);
// circle center
Imgproc.circle(img, center, 3, new Scalar(0, 255, 0), -1);
// circle outline
Imgproc.circle(img, center, r, new Scalar(0, 255, 0), 1);
}
}
HighGui.imshow("cirleIn", img);
}
Results is not ok, detecting only 2 on 10:
I have tried with knn background too:
With less success.
Any tips ? Thanks a lot in advance.
The problem is that your minimap contains highlighted parts (possibly around active players) rendering your background removal inoperable. Why not threshold the highlighted color out from the image? From what I see there are just few of them. I do not use OpenCV so I gave it a shot in C++ here is the result:
int x,y;
color c0,c1,c;
picture pic0,pic1,pic2;
// pic0 - source background
// pic1 - source map
// pic2 - output
// ensure all images are the same size
pic1.resize(pic0.xs,pic0.ys);
pic2.resize(pic0.xs,pic0.ys);
// process all pixels
for (y=0;y<pic2.ys;y++)
for (x=0;x<pic2.xs;x++)
{
// get both colors without alpha
c0.dd=pic0.p[y][x].dd&0x00FFFFFF;
c1.dd=pic1.p[y][x].dd&0x00FFFFFF; c=c1;
// threshold 0xAARRGGBB distance^2
if (distance2(c1,color(0x00EEEEEE))<2000) c.dd=0; // white-ish rectangle
if (distance2(c1,color(0x00889971))<2000) c.dd=0; // gray-ish path
if (distance2(c1,color(0x005A6443))<2000) c.dd=0; // gray-ish path
if (distance2(c1,color(0x0021A2C2))<2000) c.dd=0; // aqua water
if (distance2(c1,color(0x002A6D70))<2000) c.dd=0; // aqua water
if (distance2(c1,color(0x00439D96))<2000) c.dd=0; // aqua water
if (distance2(c1,c0 )<2500) c.dd=0; // close to background
pic2.p[y][x]=c;
}
pic2.save("out0.png");
pic2.pixel_format(_pf_u); // convert to gray scale
pic2.smooth(); // blur a little
pic2.save("out1.png");
pic2.threshold(0,80,765,0x00000000); // set dark pixels (<80) to black (0) and rest to white (3*255)
pic2.pixel_format(_pf_rgba);// convert back to RGB
pic2.save("out2.png");
So you need to find OpenCV counter parts to this. The thresholds are color distance^2 (so I do not need sqrt) and looks like 50^2 is ideal for <0,255> per channel RGB vector.
I use my own picture class for images so some members are:
xs,ys is size of image in pixels
p[y][x].dd is pixel at (x,y) position as 32 bit integer type
clear(color) clears entire image with color
resize(xs,ys) resizes image to new resolution
bmp is VCL encapsulated GDI Bitmap with Canvas access
pf holds actual pixel format of the image:
enum _pixel_format_enum
{
_pf_none=0, // undefined
_pf_rgba, // 32 bit RGBA
_pf_s, // 32 bit signed int
_pf_u, // 32 bit unsigned int
_pf_ss, // 2x16 bit signed int
_pf_uu, // 2x16 bit unsigned int
_pixel_format_enum_end
};
color and pixels are encoded like this:
union color
{
DWORD dd; WORD dw[2]; byte db[4];
int i; short int ii[2];
color(){}; color(color& a){ *this=a; }; ~color(){}; color* operator = (const color *a) { dd=a->dd; return this; }; /*color* operator = (const color &a) { ...copy... return this; };*/
};
The bands are:
enum{
_x=0, // dw
_y=1,
_b=0, // db
_g=1,
_r=2,
_a=3,
_v=0, // db
_s=1,
_h=2,
};
Here also the distance^2 between colors I used for thresholding:
DWORD distance2(color &a,color &b)
{
DWORD d,dd;
d=DWORD(a.db[0])-DWORD(b.db[0]); dd =d*d;
d=DWORD(a.db[1])-DWORD(b.db[1]); dd+=d*d;
d=DWORD(a.db[2])-DWORD(b.db[2]); dd+=d*d;
d=DWORD(a.db[3])-DWORD(b.db[3]); dd+=d*d;
return dd;
}
As input I used your images:
pic0:
pic1:
And here the (sub) results:
out0.png:
out1.png:
out2.png:
Now just remove noise (by blurring or by erosion) a bit and apply your circle fitting or hough transform...
[Edit1] circle detector
I gave it a bit of taught and implemented simple detector. I just check circumference points around any pixel position with constant radius (player circle) and if number of set point is above threshold I found potential circle. It is better than use whole disc area as some of the players contain holes and there are more pixels to test also ... Then I average close circles together and render the output ... Here updated code:
int i,j,x,y,xx,yy,x0,y0,r=10,d;
List<int> cxy; // circle circumferece points
List<int> plr; // player { x,y } list
color c0,c1,c;
picture pic0,pic1,pic2;
// pic0 - source background
// pic1 - source map
// pic2 - output
// ensure all images are the same size
pic1.resize(pic0.xs,pic0.ys);
pic2.resize(pic0.xs,pic0.ys);
// process all pixels
for (y=0;y<pic2.ys;y++)
for (x=0;x<pic2.xs;x++)
{
// get both colors without alpha
c0.dd=pic0.p[y][x].dd&0x00FFFFFF;
c1.dd=pic1.p[y][x].dd&0x00FFFFFF; c=c1;
// threshold 0xAARRGGBB distance^2
if (distance2(c1,color(0x00EEEEEE))<2000) c.dd=0; // white-ish rectangle
if (distance2(c1,color(0x00889971))<2000) c.dd=0; // gray-ish path
if (distance2(c1,color(0x005A6443))<2000) c.dd=0; // gray-ish path
if (distance2(c1,color(0x0021A2C2))<2000) c.dd=0; // aqua water
if (distance2(c1,color(0x002A6D70))<2000) c.dd=0; // aqua water
if (distance2(c1,color(0x00439D96))<2000) c.dd=0; // aqua water
if (distance2(c1,c0 )<2500) c.dd=0; // close to background
pic2.p[y][x]=c;
}
// pic2.save("out0.png");
pic2.pixel_format(_pf_u); // convert to gray scale
pic2.smooth(); // blur a little
// pic2.save("out1.png");
pic2.threshold(0,80,765,0x00000000); // set dark pixels (<80) to black (0) and rest to white (3*255)
// compute player circle circumference points mask
x0=r-1; y0=r; x0*=x0; y0*=y0;
for (x=-r,xx=x*x;x<=r;x++,xx=x*x)
for (y=-r,yy=y*y;y<=r;y++,yy=y*y)
{
d=xx+yy;
if ((d>=x0)&&(d<=y0))
{
cxy.add(x);
cxy.add(y);
}
}
// get all potential player circles
x0=(5*cxy.num)/20;
for (y=r;y<pic2.ys-r;y+=2) // no need to step by single pixel ...
for (x=r;x<pic2.xs-r;x+=2)
{
for (d=0,i=0;i<cxy.num;)
{
xx=x+cxy.dat[i]; i++;
yy=y+cxy.dat[i]; i++;
if (pic2.p[yy][xx].dd>100) d++;
}
if (d>=x0) { plr.add(x); plr.add(y); }
}
// pic2.pixel_format(_pf_rgba);// convert back to RGB
// pic2.save("out2.png");
// average all circles too close together
pic2=pic1; // use original image again
pic2.bmp->Canvas->Pen->Color=TColor(0x0000FF00);
pic2.bmp->Canvas->Pen->Width=3;
pic2.bmp->Canvas->Brush->Style=bsClear;
for (i=0;i<plr.num;i+=2) if (plr.dat[i]>=0)
{
x0=plr.dat[i+0]; x=x0;
y0=plr.dat[i+1]; y=y0; d=1;
for (j=i+2;j<plr.num;j+=2) if (plr.dat[j]>=0)
{
xx=plr.dat[j+0];
yy=plr.dat[j+1];
if (((x0-xx)*(x0-xx))+((y0-yy)*(y0-yy))*10<=20*r*r) // if close
{
x+=xx; y+=yy; d++; // add to average
plr.dat[j+0]=-1; // mark as deleted
plr.dat[j+1]=-1;
}
}
x/=d; y/=d;
plr.dat[i+0]=x;
plr.dat[i+1]=y;
pic2.bmp->Canvas->Ellipse(x-r,y-r,x+r,y+r);
}
pic2.bmp->Canvas->Pen->Width=1;
pic2.bmp->Canvas->Brush->Style=bsSolid;
// pic2.save("out3.png");
As you can see the core of code is the same I just added the detector in the end.
I also use mine dynamic list template so:
List<double> xxx; is the same as double xxx[];
xxx.add(5); adds 5 to end of the list
xxx[7] access array element (safe)
xxx.dat[7] access array element (unsafe but fast direct access)
xxx.num is the actual used size of the array
xxx.reset() clears the array and set xxx.num=0
xxx.allocate(100) preallocate space for 100 items
And here the final result out3.png:
As you can see it is a bit messed up when the players are very near (due to circle averaging) with some tweaking you might get better results. But on second taught it might be due to that small red circle nearby ...
I used VCL/GDI for the circles render so just ignore/port the pic2.bmp->Canvas-> stuff to what ever you use.
As the populated image is lighter in the blue areas around the heroes, your background subtraction is of virtually no use.
I tried to improve by applying a gain of 3 to the clean image before subtraction and here is the result.
The background has disappeared, but the outlines of the heroes are severely damaged.
I looked at your case with other approaches and I consider that it is a very difficult one.
What I do when I want to do image processing is first open the image in a paint editor (I use Gimp). Then I manipulate the image the until I end up with something that defines the parts I want to detect.
Generally, RGB is bad for a lot of computer vision tasks, and making it gray scale solves only a part of the problem.
A good start is trying to decompose the image to HSL instead.
Doing so on the first image, and only looking at the Hue channel gives me this:
Several of the blobs are quite well defined.
Playing a bit with the contrast and brightness of the Hue and Luminance layers and multiplying them gives me this:
It enhances the ring around the markers, which might be useful.
These methods all have corresponding functionality in OpenCV.
It's a tricky task and you will most likely require several different filters and techniques to succeed. Hope this helps a bit. Good luck.

Canvas zoomIn/zoomOut: how to avoid loss of image quality?

I have this code that I used for scaling images. To zoomIn and zoomOut is use the code scalePicture(1.10, drawingContext); and scalePicture(0.90, drawingContext);. I perform that operations on a off screen canvas and then copy the image back to the original screen.
I make use of the offscreen processing since the browser optimizes the image operations by using double buffering. I am still having the issue that when I zoomIn by around 400% and then zoomOut back to the original size, there is a significant loss of image quality.
I am not depending on the original image because the user can perform many operations such as clip, crop, rotate, annotate and I need to stack all the operations on the original image.
Can anyone throw some advice/suggestions around any means to preserve the quality of the image while not sacrificing the performance and quality.
scalePicture : function(scalePercent, operatingCanvasContext) {
var w = operatingCanvasContext.canvas.width,
h = operatingCanvasContext.canvas.height,
sw = w * scalePercent,
sh = h * scalePercent,
operatingCanvas = operatingCanvasContext.canvas;
var canvasPic = new Image();
operatingCanvasContext.save();
canvasPic.src = operatingCanvas.toDataURL();
operatingCanvasContext.clearRect (0,0, operatingCanvas.width, operatingCanvas.height);
operatingCanvasContext.translate(operatingCanvas.width/2, operatingCanvas.height/2);
canvasPic.onload = function () {
operatingCanvasContext.drawImage(canvasPic, -sw/2 , -sh/2 , sw, sh);
operatingCanvasContext.translate(-operatingCanvas.width/2, -operatingCanvas.height/2);
operatingCanvasContext.restore();
};
}
Canvas is draw and forget. There is no way to preserve original quality without referencing the original source.
I would suggest to reconstruct the recorded stack but using a transformation matrix for the changes in scale, rotation etc. Then apply the accumulated matrix on the original image. This will preserve the optimal quality as well as provide some gain in performance (as you only draw the last and current state).
Similar for clipping, calculate and merge the clipping regions using the same matrix and apply clip before drawing in the original image in the final step. And similar with text etc.
It's a bit too broad to show an example that does all these steps, but here is an example showing how to use accumulated matrix transforms on the original image preserving optimal quality. You can see that you can zoom in and out, rotate and the image will in each instance render at optimal quality.
Example of Concept
var ctx = c.getContext("2d"), img = new Image; // these lines just for demo init.
img.onload = demo;
ctx.fillText("Loading image...", 20, 20);
ctx.globalCompositeOperation = "copy";
img.src = "http://i.imgur.com/sPrSId0.jpg";
function demo() {
render();
zin.onclick = zoomIn; // accumulates transform, but render
zout.onclick = zoomOut; // based on original image using.
zrot.onclick = rotate; // current transformation matrix
}
function render() {ctx.drawImage(img, 0, 0)} // render original image
function zoomIn() {
ctx.translate(c.width * 0.5, c.height * 0.5); // pivot = center
ctx.scale(1.05, 1.05);
ctx.translate(-c.width * 0.5, -c.height * 0.5);
render();
}
function zoomOut() {
ctx.translate(c.width * 0.5, c.height * 0.5);
ctx.scale(1/1.05, 1/1.05);
ctx.translate(-c.width * 0.5, -c.height * 0.5);
render();
}
function rotate() {
ctx.translate(c.width * 0.5, c.height * 0.5);
ctx.rotate(0.3);
ctx.translate(-c.width * 0.5, -c.height * 0.5);
render();
}
<button id=zin>Zoom in</button>
<button id=zout>Zoom out</button>
<button id=zrot>Rotate</button><br>
<canvas id=c width=640 height=378></canvas>

How to determine the distance between upper lip and lower lip by using webcam in Processing?

Where should I start? I can see plenty of face recognition and analysis using Python, Java script but how about Processing ?
I want to determine the distance by using 2 points between upper and lower lip at their highest and lowest point via webcam to use it in further project.
any help would be appreciated
If you want to do it in Processing alone you can use Greg Borenstein's OpenCV for Processing library:
You can start with the Face Detection example
Once you detect a face, you can detect a mouth within the face rectangle using OpenCV.CASCADE_MOUTH.
Once you have mouth detected maybe you can get away with using the mouth bounding box height. For more detail you use OpenCV to threshold that rectangle. Hopefully the open mouth will segment nicely from the rest of the skin. Finding contours should give you lists of points you can work with.
For something a lot more exact, you can use Jason Saragih's CLM FaceTracker, which is available as an OpenFrameworks addon. OpenFrameworks has similarities to Processing. If you do need this sort of accuracy in Processing you can run FaceOSC in the background and read the mouth coordinates in Processing using oscP5
Update
For the first option, using HAAR cascade classifiers, turns out there are a couple of issues:
The OpenCV Processing library can load one cascade and a second instance will override the first.
The OpenCV.CASCADE_MOUTH seems to work better for closed mouths, but not very well with open mouths
To get past the 1st issue, you can use the OpenCV Java API directly, bypassing OpenCV Processing for multiple cascade detection.
There are couple of parameters that can help the detection, such as having idea of the bounding box of the mouth before hand to pass as a hint to the classifier.
I've done a basic test using a webcam on my laptop and measure the bounding box for face and mouth at various distances. Here's an example:
import gab.opencv.*;
import org.opencv.core.*;
import org.opencv.objdetect.*;
import processing.video.*;
Capture video;
OpenCV opencv;
CascadeClassifier faceDetector,mouthDetector;
MatOfRect faceDetections,mouthDetections;
//cascade detections parameters - explanations from Mastering OpenCV with Practical Computer Vision Projects
int flags = Objdetect.CASCADE_FIND_BIGGEST_OBJECT;
// Smallest object size.
Size minFeatureSizeFace = new Size(50,60);
Size maxFeatureSizeFace = new Size(125,150);
Size minFeatureSizeMouth = new Size(30,10);
Size maxFeatureSizeMouth = new Size(120,60);
// How detailed should the search be. Must be larger than 1.0.
float searchScaleFactor = 1.1f;
// How much the detections should be filtered out. This should depend on how bad false detections are to your system.
// minNeighbors=2 means lots of good+bad detections, and minNeighbors=6 means only good detections are given but some are missed.
int minNeighbors = 4;
//laptop webcam face rectangle
//far, small scale, ~50,60px
//typing distance, ~83,91px
//really close, ~125,150
//laptop webcam mouth rectangle
//far, small scale, ~30,10
//typing distance, ~50,25px
//really close, ~120,60
int mouthHeightHistory = 30;
int[] mouthHeights = new int[mouthHeightHistory];
void setup() {
opencv = new OpenCV(this,320,240);
size(opencv.width, opencv.height);
noFill();
frameRate(30);
video = new Capture(this,width,height);
video.start();
faceDetector = new CascadeClassifier(dataPath("haarcascade_frontalface_alt2.xml"));
mouthDetector = new CascadeClassifier(dataPath("haarcascade_mcs_mouth.xml"));
}
void draw() {
//feed cam image to OpenCV, it turns it to grayscale
opencv.loadImage(video);
opencv.equalizeHistogram();
image(opencv.getOutput(), 0, 0 );
//detect face using raw Java OpenCV API
Mat equalizedImg = opencv.getGray();
faceDetections = new MatOfRect();
faceDetector.detectMultiScale(equalizedImg, faceDetections, searchScaleFactor, minNeighbors, flags, minFeatureSizeFace, maxFeatureSizeFace);
Rect[] faceDetectionResults = faceDetections.toArray();
int faces = faceDetectionResults.length;
text("detected faces: "+faces,5,15);
if(faces >= 1){
Rect face = faceDetectionResults[0];
stroke(0,192,0);
rect(face.x,face.y,face.width,face.height);
//detect mouth - only within face rectangle, not the whole frame
Rect faceLower = face.clone();
faceLower.height = (int) (face.height * 0.65);
faceLower.y = face.y + faceLower.height;
Mat faceROI = equalizedImg.submat(faceLower);
//debug view of ROI
PImage faceImg = createImage(faceLower.width,faceLower.height,RGB);
opencv.toPImage(faceROI,faceImg);
image(faceImg,width-faceImg.width,0);
mouthDetections = new MatOfRect();
mouthDetector.detectMultiScale(faceROI, mouthDetections, searchScaleFactor, minNeighbors, flags, minFeatureSizeMouth, maxFeatureSizeMouth);
Rect[] mouthDetectionResults = mouthDetections.toArray();
int mouths = mouthDetectionResults.length;
text("detected mouths: "+mouths,5,25);
if(mouths >= 1){
Rect mouth = mouthDetectionResults[0];
stroke(192,0,0);
rect(faceLower.x + mouth.x,faceLower.y + mouth.y,mouth.width,mouth.height);
text("mouth height:"+mouth.height+"~px",5,35);
updateAndPlotMouthHistory(mouth.height);
}
}
}
void updateAndPlotMouthHistory(int newHeight){
//shift older values by 1
for(int i = mouthHeightHistory-1; i > 0; i--){
mouthHeights[i] = mouthHeights[i-1];
}
//add new value at the front
mouthHeights[0] = newHeight;
//plot
float graphWidth = 100.0;
float elementWidth = graphWidth / mouthHeightHistory;
for(int i = 0; i < mouthHeightHistory; i++){
rect(elementWidth * i,45,elementWidth,mouthHeights[i]);
}
}
void captureEvent(Capture c) {
c.read();
}
One very imortant note to make: I've copied cascade xml files from the OpenCV Processing library folder (~/Documents/Processing/libraries/opencv_processing/library/cascade-files) to the sketch's data folder. My sketch is OpenCVMouthOpen, so the folder structure looks like this:
OpenCVMouthOpen
├── OpenCVMouthOpen.pde
└── data
├── haarcascade_frontalface_alt.xml
├── haarcascade_frontalface_alt2.xml
├── haarcascade_frontalface_alt_tree.xml
├── haarcascade_frontalface_default.xml
├── haarcascade_mcs_mouth.xml
└── lbpcascade_frontalface.xml
If you don't copy the cascades files and use the code as it is you won't get any errors, but the detection simply won't work. If you want to check, you can do
println(faceDetector.empty())
at the end of the setup() function and if you get false, the cascade has been loaded and if you get true, the cascade hasn't been loaded.
You may need to play with the minFeatureSize and maxFeatureSize values for face and mouth for your setup. The second issue, cascade not detecting wide open mouth very well is tricky. There might be an already trained cascade for open mouths, but you'd need to find it. Otherwise, with this method you may need to train one yourself and that can be a bit tedious.
Nevertheless, notice that there is an upside down plot drawn on the left when a mouth is detected. In my tests I noticed that the height isn't super accurate, but there are noticeable changes in the graph. You may not be able to get a steady mouth height, but by comparing current to averaged previous height values you should see some peaks (values going from positive to negative or vice-versa) which give you an idea of a mouth open/close change.
Although searching through the whole image for a mouth as opposed to a face only can be a bit slower and less accurate, it's a simpler setup. It you can get away with less accuracy and more false positives on your project this could be simpler:
import gab.opencv.*;
import java.awt.Rectangle;
import org.opencv.objdetect.Objdetect;
import processing.video.*;
Capture video;
OpenCV opencv;
Rectangle[] faces,mouths;
//cascade detections parameters - explanations from Mastering OpenCV with Practical Computer Vision Projects
int flags = Objdetect.CASCADE_FIND_BIGGEST_OBJECT;
// Smallest object size.
int minFeatureSize = 20;
int maxFeatureSize = 150;
// How detailed should the search be. Must be larger than 1.0.
float searchScaleFactor = 1.1f;
// How much the detections should be filtered out. This should depend on how bad false detections are to your system.
// minNeighbors=2 means lots of good+bad detections, and minNeighbors=6 means only good detections are given but some are missed.
int minNeighbors = 6;
void setup() {
size(320, 240);
noFill();
stroke(0, 192, 0);
strokeWeight(3);
video = new Capture(this,width,height);
video.start();
opencv = new OpenCV(this,320,240);
opencv.loadCascade(OpenCV.CASCADE_MOUTH);
}
void draw() {
//feed cam image to OpenCV, it turns it to grayscale
opencv.loadImage(video);
opencv.equalizeHistogram();
image(opencv.getOutput(), 0, 0 );
Rectangle[] mouths = opencv.detect(searchScaleFactor,minNeighbors,flags,minFeatureSize, maxFeatureSize);
for (int i = 0; i < mouths.length; i++) {
text(mouths[i].x + "," + mouths[i].y + "," + mouths[i].width + "," + mouths[i].height,mouths[i].x, mouths[i].y);
rect(mouths[i].x, mouths[i].y, mouths[i].width, mouths[i].height);
}
}
void captureEvent(Capture c) {
c.read();
}
I was mentioning segmenting/thresholding as well. Here's a rough example using the lower part of a detected face just a basic threshold, then some basic morphological filters (erode/dilate) to cleanup the thresholded image a bit:
import gab.opencv.*;
import org.opencv.core.*;
import org.opencv.objdetect.*;
import org.opencv.imgproc.Imgproc;
import java.awt.Rectangle;
import java.util.*;
import processing.video.*;
Capture video;
OpenCV opencv;
CascadeClassifier faceDetector,mouthDetector;
MatOfRect faceDetections,mouthDetections;
//cascade detections parameters - explanations from Mastering OpenCV with Practical Computer Vision Projects
int flags = Objdetect.CASCADE_FIND_BIGGEST_OBJECT;
// Smallest object size.
Size minFeatureSizeFace = new Size(50,60);
Size maxFeatureSizeFace = new Size(125,150);
// How detailed should the search be. Must be larger than 1.0.
float searchScaleFactor = 1.1f;
// How much the detections should be filtered out. This should depend on how bad false detections are to your system.
// minNeighbors=2 means lots of good+bad detections, and minNeighbors=6 means only good detections are given but some are missed.
int minNeighbors = 4;
//laptop webcam face rectangle
//far, small scale, ~50,60px
//typing distance, ~83,91px
//really close, ~125,150
float threshold = 160;
int erodeAmt = 1;
int dilateAmt = 5;
void setup() {
opencv = new OpenCV(this,320,240);
size(opencv.width, opencv.height);
noFill();
video = new Capture(this,width,height);
video.start();
faceDetector = new CascadeClassifier(dataPath("haarcascade_frontalface_alt2.xml"));
mouthDetector = new CascadeClassifier(dataPath("haarcascade_mcs_mouth.xml"));
}
void draw() {
//feed cam image to OpenCV, it turns it to grayscale
opencv.loadImage(video);
opencv.equalizeHistogram();
image(opencv.getOutput(), 0, 0 );
//detect face using raw Java OpenCV API
Mat equalizedImg = opencv.getGray();
faceDetections = new MatOfRect();
faceDetector.detectMultiScale(equalizedImg, faceDetections, searchScaleFactor, minNeighbors, flags, minFeatureSizeFace, maxFeatureSizeFace);
Rect[] faceDetectionResults = faceDetections.toArray();
int faces = faceDetectionResults.length;
text("detected faces: "+faces,5,15);
if(faces > 0){
Rect face = faceDetectionResults[0];
stroke(0,192,0);
rect(face.x,face.y,face.width,face.height);
//detect mouth - only within face rectangle, not the whole frame
Rect faceLower = face.clone();
faceLower.height = (int) (face.height * 0.55);
faceLower.y = face.y + faceLower.height;
//submat grabs a portion of the image (submatrix) = our region of interest (ROI)
Mat faceROI = equalizedImg.submat(faceLower);
Mat faceROIThresh = faceROI.clone();
//threshold
Imgproc.threshold(faceROI, faceROIThresh, threshold, width, Imgproc.THRESH_BINARY_INV);
Imgproc.erode(faceROIThresh, faceROIThresh, new Mat(), new Point(-1,-1), erodeAmt);
Imgproc.dilate(faceROIThresh, faceROIThresh, new Mat(), new Point(-1,-1), dilateAmt);
//find contours
Mat faceContours = faceROIThresh.clone();
List<MatOfPoint> contours = new ArrayList<MatOfPoint>();
Imgproc.findContours(faceContours, contours, new Mat(), Imgproc.RETR_EXTERNAL , Imgproc.CHAIN_APPROX_SIMPLE);
//draw contours
for(int i = 0 ; i < contours.size(); i++){
MatOfPoint contour = contours.get(i);
Point[] points = contour.toArray();
stroke(map(i,0,contours.size()-1,32,255),0,0);
beginShape();
for(Point p : points){
vertex((float)p.x,(float)p.y);
}
endShape();
}
//debug view of ROI
PImage faceImg = createImage(faceLower.width,faceLower.height,RGB);
opencv.toPImage(faceROIThresh,faceImg);
image(faceImg,width-faceImg.width,0);
}
text("Drag mouseX to control threshold: " + threshold+
"\nHold 'e' and drag mouseX to control erodeAmt: " + erodeAmt+
"\nHold 'd' and drag mouseX to control dilateAmt: " + dilateAmt,5,210);
}
void mouseDragged(){
if(keyPressed){
if(key == 'e') erodeAmt = (int)map(mouseX,0,width,1,6);
if(key == 'd') dilateAmt = (int)map(mouseX,0,width,1,10);
}else{
threshold = mouseX;
}
}
void captureEvent(Capture c) {
c.read();
}
This could be improved a bit by using YCrCb colour space to segment skin better, but overall you notice that there are quite a few variables to get right which doesn't make this a very flexible setup.
You will be much better results using FaceOSC and reading the values you need in Processing via oscP5. Here is a slightly simplified version of the FaceOSCReceiver Processing example focusing mainly on mouth:
import oscP5.*;
OscP5 oscP5;
// num faces found
int found;
// pose
float poseScale;
PVector posePosition = new PVector();
// gesture
float mouthHeight;
float mouthWidth;
void setup() {
size(640, 480);
frameRate(30);
oscP5 = new OscP5(this, 8338);
oscP5.plug(this, "found", "/found");
oscP5.plug(this, "poseScale", "/pose/scale");
oscP5.plug(this, "posePosition", "/pose/position");
oscP5.plug(this, "mouthWidthReceived", "/gesture/mouth/width");
oscP5.plug(this, "mouthHeightReceived", "/gesture/mouth/height");
}
void draw() {
background(255);
stroke(0);
if(found > 0) {
translate(posePosition.x, posePosition.y);
scale(poseScale);
noFill();
ellipse(0, 20, mouthWidth* 3, mouthHeight * 3);
}
}
// OSC CALLBACK FUNCTIONS
public void found(int i) {
println("found: " + i);
found = i;
}
public void poseScale(float s) {
println("scale: " + s);
poseScale = s;
}
public void posePosition(float x, float y) {
println("pose position\tX: " + x + " Y: " + y );
posePosition.set(x, y, 0);
}
public void mouthWidthReceived(float w) {
println("mouth Width: " + w);
mouthWidth = w;
}
public void mouthHeightReceived(float h) {
println("mouth height: " + h);
mouthHeight = h;
}
// all other OSC messages end up here
void oscEvent(OscMessage m) {
if(m.isPlugged() == false) {
println("UNPLUGGED: " + m);
}
}
On OSX you can simply download the compiled FaceOSC app.
On other operating systems you may need to setup OpenFrameworks, download ofxFaceTracker and compile FaceOSC yourself.
It's really hard to answer general "how do I do this" type questions. Stack Overflow is designed for specific "I tried X, expected Y, but got Z instead" type questions. But I'll try to answer in a general sense:
You need to break your problem down into smaller pieces.
Step 1: Can you get a webcam feed showing in your sketch? Don't worry about the computer vision stuff for a second. Just get the camera connected. Do some research and try something out.
Step 2: Can you detect facial features in that video? You might try doing it yourself, or you might use one of the many libraries listed in the Videos and Vision section of the Processing libraries page.
Step 3: Read the documentation on those libraries. Try them out. You might have to make a bunch of little example sketches using each library until you find one you like. We can't do this for you, as which one is right for you depends on you. If you're confused about something specific we can try to help you, but we can't really help you with picking out a library.
Step 4: Once you've done a bunch of example programs and picked out a library, start working towards your goal. Can you detect facial features using the library? Get just that part working. Once you have that working, can you detect changes like opening or closing a mouth?
Work on one small step at a time. If you get stuck, post an MCVE along with a specific technical question, and we'll go from there. Good luck.

Converting contours found using EMGU.CV

I am new to EMGU.CV and I am struggling a bit. Let me start by giving some background of the project, i am trying to track a users fingers, i.e. calculate the users finger tips, but i am struggling a bit. I have created a set of code which filters the depth information to only a certain range and I generate a Bitmap image, tempBitmap, i then convert this image to a greyscale image using EMGU.CV which can be used by cvCanny. Once this is done i apply dilate filter to the canny image to thicken up the outline of the hand to better improve the chance of generating a successful contour, I then try to get the contours of the hand. Now what i have managed to do is to draw a box around the hand, but i am struggling to find a way to convert the contours generated by FindContours to a set of Points i can use to draw the contour with. the variable depthImage2 is a Bitmap image variable i use to draw on before assinging it to the picturebox variable on my C# form based application. any information or guidance you can provide me with will be greatly appreciated, also if my code isnt correct maybe guiding me in a direction where i can calculate the finger tips. I think i am almost there i am just missing something, so any help of any kind will be appreciated.
Image<Bgr, Byte> currentFrame = new Image<Bgr, Byte>(tempBitmap);
Image<Gray, Byte> grayImage = currentFrame.Convert<Gray, Byte>().PyrDown().PyrUp();
Image<Gray, Byte> cannyImage = new Image<Gray, Byte>(grayImage.Size);
CvInvoke.cvCanny(grayImage, cannyImage, 10, 60, 3);
StructuringElementEx kernel = new StructuringElementEx(
3, 3, 1, 1, Emgu.CV.CvEnum.CV_ELEMENT_SHAPE.CV_SHAPE_ELLIPSE);
CvInvoke.cvDilate(cannyImage, cannyImage, kernel, 1);
IntPtr cont = IntPtr.Zero;
Graphics graphicsBitmap = Graphics.FromImage(depthImage2);
using (MemStorage storage = new MemStorage()) //allocate storage for contour approximation
for (Contour<Point> contours =
cannyImage.FindContours(Emgu.CV.CvEnum.CHAIN_APPROX_METHOD.CV_CHAIN_APPROX_SIMPLE,
Emgu.CV.CvEnum.RETR_TYPE.CV_RETR_EXTERNAL);
contours != null; contours = contours.HNext)
{
IntPtr seq = CvInvoke.cvConvexHull2(contours, storage.Ptr, Emgu.CV.CvEnum.ORIENTATION.CV_CLOCKWISE, 0);
IntPtr defects = CvInvoke.cvConvexityDefects(contours, seq, storage);
Seq<Point> tr = contours.GetConvexHull(Emgu.CV.CvEnum.ORIENTATION.CV_CLOCKWISE);
Seq<Emgu.CV.Structure.MCvConvexityDefect> te = contours.GetConvexityDefacts(
storage, Emgu.CV.CvEnum.ORIENTATION.CV_CLOCKWISE);
graphicsBitmap.DrawRectangle(
new Pen(new SolidBrush(Color.Red)), tr.BoundingRectangle);
}
Contour contours = cannyImage.FindContours(Emgu.CV.CvEnum.CHAIN_APPROX_METHOD.CV_CHAIN_APPROX_NONE) //to return all points
then:
List<Point[]> convertedContours = new List<Point[]>();
while(cotours!=null)
{
var contourPoints = contours.ToArray(); //put Seq<Point> to Point[], ToList() is also available ?
convertedContours.Add(contourpoints);
contours = contours.HNext;
}
you can draw contour by image Draw functon overload.
just find signature that contains parameter Seq<>
....

mouse handler in opencv for large images, wrong x,y coordinates?

i am using images that are 2048 x 500 and when I use cvShowImage, I only see half the image. This is not a big deal because the interesting part is on the top half of the image. Now, when I use the mouseHandler to get the x,y coordinates of my clicks, I noticed that the coordinate for y (the dimension that doesnt fit in the screen) is wrong.
It seems OpenCV think this is the whole image and recalibrates the coordinate system although we are only effectively showing half the image.
I would need to know how to do 2 things:
- display a resized image that would fit in the screen
get the proper coordinate.
Did anybody encounter similar problems?
Thanks!
Update: it seems the y coordinate is divided by 2 of what it is supposed to be
code:
EXPORT void click_rect(uchar * the_img, int size_x, int size_y, int * points)
{
CvSize size;
size.height = size_y ;
size.width = size_x;
IplImage * img;
img = cvCreateImageHeader(size, IPL_DEPTH_8U, 1);
img->imageData = (char *)the_img;
img->imageDataOrigin = img->imageData;
img1 = cvCreateImage(cvSize((int)((size.width)) , (int)((size.height)) ),IPL_DEPTH_8U, 1);
cvNamedWindow("mainWin",CV_WINDOW_AUTOSIZE);
cvMoveWindow("mainWin", 100, 100);
cvSetMouseCallback( "mainWin", mouseHandler_rect, NULL );
cvShowImage("mainWin", img1 );
//// wait for a key
cvWaitKey(0);
points[0] = x_1;
points[1] = x_2;
points[2] = y_1;
points[3] = y_2;
//// release the image
cvDestroyWindow("mainWin");
cvReleaseImage(&img1 );
cvReleaseImage(&img);
}
You should create a window with the CV_WINDOW_KEEPRATIO flag instead of the CV_WINDOW_AUTOSIZE flag. This temporarily fixes the problem with your y values being wrong.
I use OpenCV2.1 and visual studio C++ compiler. I fix this problem with another flag CV_WINDOW_NORMAL and work properly and returns correct coordinates, this flag enables you to resize the image window.
cvNamedWindow("Box Example", CV_WINDOW_NORMAL);
I am having the same problem with OpenCV 2.1 using it with Windows and mingw compiler. It took me forever to find out what was wrong. As you describe it, cvSetMouseCallback gets too large y coordinates. This is apparently due to the image and the cvNamedWindow it is shown in being bigger than my screen resolution; thus I cannot see the bottom of the image.
As a solution I resize the images to a fixed size, such that they fit on the screen (in this case with resolution 800x600, which can be any other values:
// g_input_image, g_output_image and g_resized_image are global IplImage* pointers.
int img_w = cvGetSize(g_input_image).width;
int img_h = cvGetSize(g_input_image).height;
// If the height/width ratio is greater than 6/8 resize height to 600.
if (img_h > (img_w*6)/8) {
g_resized_image = cvCreateImage(cvSize((img_w*600)/img_h, 600), 8, 3);
}
// else adjust width to 800.
else {
g_resized_image = cvCreateImage(cvSize(800, (img_h*800)/img_w), 8, 3);
}
cvResize(g_output_image, g_resized_image);
Not a perfect solution, but works for me...
Cheers,
Linus
How are you building the window? You are not passing CV_WINDOW_AUTOSIZE to cvNamedWindow(), are you?
Share some source, #Denis.

Resources