I have a problem with OpenCV: I need to detect and track grapes with a camera using Processing. How do I do it? Can I have an example? Thank you.
This code is an example that detects faces:
import gab.opencv.*;
import processing.video.*;
import java.awt.*;

Capture video;
OpenCV opencv;

void setup() {
  size(640, 480);
  // capture and process at half resolution, then scale up by 2 when drawing
  video = new Capture(this, 640/2, 480/2);
  opencv = new OpenCV(this, 640/2, 480/2);
  opencv.loadCascade(OpenCV.CASCADE_FRONTALFACE);
  video.start();
}

void draw() {
  scale(2);
  opencv.loadImage(video);
  image(video, 0, 0);

  noFill();
  stroke(0, 255, 0);
  strokeWeight(3);

  // run the Haar cascade and outline each detected face
  Rectangle[] faces = opencv.detect();
  println(faces.length);
  for (int i = 0; i < faces.length; i++) {
    println(faces[i].x + "," + faces[i].y);
    rect(faces[i].x, faces[i].y, faces[i].width, faces[i].height);
  }
}

void captureEvent(Capture c) {
  c.read();
}
The code you're using is for detecting faces.
As a basic breakdown, you will need to segment the object you're trying to detect (grapes in this case) from the background. I recommend starting simple:
1) Try simply using threshold() and see if the highlights of each grape can be isolated. Hopefully they'll be the brightest spots in the image (as long as the camera isn't looking directly at a light source).
2) If method 1 isn't effective, try colour detection: if you know what kind of grapes you want to detect, you can select a range of colours to detect and ignore the rest. Run the HSVColorTracking example and have a play with the ranges. Swap the marbles image with an image of grapes and see what you can get.
3) OpenCV has a function built specifically for detecting circles: HoughCircles. Unfortunately Greg's OpenCV Processing library doesn't wrap this function yet as it does with HoughLines, but it does provide functions to convert between OpenCV's Mat and Processing's PImage. If you're just getting started with Processing and don't have experience with plain Java, this may be more convoluted.
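For reference, the raw OpenCV call looks roughly like this in C++ (shown in C++ since the Processing wrapper doesn't expose it; frame is an assumed BGR input image and the parameter values are only starting points to tune):

cv::Mat gray;
cv::cvtColor(frame, gray, cv::COLOR_BGR2GRAY);
cv::GaussianBlur(gray, gray, cv::Size(9, 9), 2);   // smooth to reduce spurious circles
std::vector<cv::Vec3f> circles;
// dp = 1 (accumulator at full resolution), min centre distance 20 px,
// Canny high threshold 100, accumulator threshold 30, radius 5-50 px
cv::HoughCircles(gray, circles, cv::HOUGH_GRADIENT, 1, 20, 100, 30, 5, 50);
for (const cv::Vec3f& c : circles)
    cv::circle(frame, cv::Point(cvRound(c[0]), cvRound(c[1])), cvRound(c[2]), cv::Scalar(0, 255, 0), 2);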
Try basic thresholding and HSB range thresholding first. Once you have a good-looking binary image (where the background is completely black and the grapes are white), you can run findContours, get the centroid of each contour, compute the minEnclosingCircle(), etc.
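In plain OpenCV C++, that last part of the pipeline would look roughly like this (bw is assumed to be the binary mask from your threshold/inRange step, and display an image to draw on):

std::vector<std::vector<cv::Point> > contours;
cv::findContours(bw, contours, cv::RETR_EXTERNAL, cv::CHAIN_APPROX_SIMPLE);
for (const std::vector<cv::Point>& contour : contours)
{
    // centroid from the contour moments
    cv::Moments m = cv::moments(contour);
    if (m.m00 <= 0) continue;
    cv::Point2f centroid((float)(m.m10 / m.m00), (float)(m.m01 / m.m00));

    // smallest circle enclosing the contour
    cv::Point2f center;
    float radius = 0;
    cv::minEnclosingCircle(contour, center, radius);

    cv::circle(display, center, (int)radius, cv::Scalar(0, 255, 0), 2);
    cv::circle(display, centroid, 3, cv::Scalar(0, 0, 255), -1);
}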
Another option might be to train a support vector machine to distinguish between two classes: grapes and not-grapes. This is a more advanced topic, but luckily Greg Borenstein, the author of the OpenCV Processing library, wrote a nice article with videos and example code on the topic. Check out PSVM: Support Vector Machines for Processing.
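Just to give a flavour of what that involves in raw OpenCV C++ (the feature layout here is only an illustrative assumption; PSVM wraps all of this much more conveniently for Processing):

#include <opencv2/opencv.hpp>
#include <opencv2/ml.hpp>

// trainData: one row per sample (e.g. a flattened colour histogram), type CV_32F
// labels:    one CV_32S label per sample, e.g. 1 for "grape", 0 for "not grape"
cv::Ptr<cv::ml::SVM> trainGrapeSvm(const cv::Mat& trainData, const cv::Mat& labels)
{
    cv::Ptr<cv::ml::SVM> svm = cv::ml::SVM::create();
    svm->setType(cv::ml::SVM::C_SVC);
    svm->setKernel(cv::ml::SVM::RBF);
    svm->train(trainData, cv::ml::ROW_SAMPLE, labels);
    return svm;
}

// later, for a candidate image patch turned into the same kind of feature row:
// float response = svm->predict(featureRow);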
Here's a mashup of the HueRangeSelection and FindContours examples using a Google image search result:
import gab.opencv.*;

PImage img;
OpenCV opencv;
Histogram histogram;

int lowerb = 50;
int upperb = 100;

ArrayList<Contour> contours;
ArrayList<Contour> polygons;

void setup() {
  size(800, 400);
  img = loadImage("grape-harvest-inside.jpg");
  opencv = new OpenCV(this, img);
  opencv.useColor(HSB);
}

void draw() {
  opencv.loadImage(img);
  image(img, 0, 0);

  opencv.setGray(opencv.getH().clone());
  opencv.inRange(lowerb, upperb);
  histogram = opencv.findHistogram(opencv.getH(), 255);

  image(opencv.getOutput(), width/2, height/2, width/2, height/2);

  noStroke();
  fill(0);
  histogram.draw(10, height - 230, 400, 200);
  noFill();
  stroke(0);
  line(10, height - 30, 410, height - 30);

  text("Hue", 10, height - (textAscent() + textDescent()));

  float lb = map(lowerb, 0, 255, 0, 400);
  float ub = map(upperb, 0, 255, 0, 400);

  stroke(255, 0, 0);
  fill(255, 0, 0);
  strokeWeight(2);
  line(lb + 10, height - 30, ub + 10, height - 30);
  ellipse(lb + 10, height - 30, 3, 3);
  text(lowerb, lb - 10, height - 15);
  ellipse(ub + 10, height - 30, 3, 3);
  text(upperb, ub + 10, height - 15);

  contours = opencv.findContours();
  for (Contour contour : contours) {
    stroke(0, 255, 0);
    noFill();
    contour.draw();
  }
}

void mouseMoved() {
  if (keyPressed) {
    upperb += mouseX - pmouseX;
  } else {
    if (upperb < 255 || (mouseX - pmouseX) < 0) {
      lowerb += mouseX - pmouseX;
    }
    if (lowerb > 0 || (mouseX - pmouseX) > 0) {
      upperb += mouseX - pmouseX;
    }
  }
  upperb = constrain(upperb, lowerb, 255);
  lowerb = constrain(lowerb, 0, upperb - 1);
}
Here's a preview of selecting a range closer to the grapes' colour:
You'll already notice that this is easy to use but not foolproof, and it should get you on the right track to asking yourself the right kind of questions.
For example:
what environments are you supporting? (indoors/outdoors, natural lighting, artificial lighting, daytime, nighttime, several of these? etc.) - light controls what your input images will look like and is therefore crucial
how many different kinds of grapes will you support? (can you get away with a single type/colour range? are there elements that may trigger a false positive?)
etc.
I am trying to detect a National ID of the type below and get its details. For example, the location of the signature should be found at the top right corner of the person's image, in this case "BC".
I need to do this application on iPhone. I thought of using OpenCV for it, but how can I extract the marked details? Do I need to train the application with similar kinds of cards, or could OCR help?
Are there any implementations specific to mobile applications?
I have also gone through card.io, which detects credit card details; does card.io detect other card details as well?
Update:
I have used Tesseract for text detection. Tesseract works well if the image contains text alone, so I cropped the red-marked regions and gave them as input to Tesseract; it works well with the MRZ part.
There is an iOS implementation of Tesseract, which I have tested.
What do I need to do?
Now I am trying to automate the text-detection part. I am planning to automate the following items:
1) Cropping the face (I have done this using the Viola-Jones face detector).
2) Taking the initials, in this example "BC", from the photo.
3) Extracting/detecting the MRZ region from the ID card.
I am trying to do 2 & 3; any ideas or code snippets would be great.
Assuming these IDs are prepared according to a standard template with specific widths, heights, offsets, spacing, etc., you can try a template-based approach.
The MRZ would be easy to detect. Once you detect it in the image, find the transformation that maps the MRZ in your template to it. Once you know this transformation, you can map any region of your template (for example, the photo of the individual) to the image and extract that region.
Below is a very simple program that follows the happy path. You will have to do more processing to locate the MRZ in general (for example, if there are perspective distortions or rotations). I prepared the template just by measuring the image, so it won't work for your case; I just wanted to convey the idea. The image was taken from Wikipedia.
Mat rgb = imread(INPUT_FILE);
Mat gray;
cvtColor(rgb, gray, CV_BGR2GRAY);

// morphological gradient to highlight text-like regions
Mat grad;
Mat morphKernel = getStructuringElement(MORPH_ELLIPSE, Size(3, 3));
morphologyEx(gray, grad, MORPH_GRADIENT, morphKernel);

Mat bw;
threshold(grad, bw, 0.0, 255.0, THRESH_BINARY | THRESH_OTSU);

// connect horizontally oriented regions
Mat connected;
morphKernel = getStructuringElement(MORPH_RECT, Size(9, 1));
morphologyEx(bw, connected, MORPH_CLOSE, morphKernel);

// find contours
Mat mask = Mat::zeros(bw.size(), CV_8UC1);
vector<vector<Point>> contours;
vector<Vec4i> hierarchy;
findContours(connected, contours, hierarchy, CV_RETR_CCOMP, CV_CHAIN_APPROX_SIMPLE, Point(0, 0));

vector<Rect> mrz;
double r = 0;
// filter contours
for (int idx = 0; idx >= 0; idx = hierarchy[idx][0])
{
    Rect rect = boundingRect(contours[idx]);
    r = rect.height ? (double)rect.width / rect.height : 0; // cast before dividing to keep the fraction
    if ((rect.width > connected.cols * .7) && /* filter on rect width */
        (r > 25) && /* filter on width:height ratio */
        (r < 36)    /* filter on width:height ratio */
       )
    {
        mrz.push_back(rect);
        rectangle(rgb, rect, Scalar(0, 255, 0), 1);
    }
    else
    {
        rectangle(rgb, rect, Scalar(0, 0, 255), 1);
    }
}
if (2 == mrz.size())
{
    // just assume we have found the two data strips of the MRZ and combine them
    // (Rect's operator| gives the minimum rectangle containing both)
    Rect max = mrz[0] | mrz[1];
    rectangle(rgb, max, Scalar(255, 0, 0), 2); // draw the MRZ

    vector<Point2f> mrzSrc;
    vector<Point2f> mrzDst;

    // MRZ region in our image
    mrzDst.push_back(Point2f((float)max.x, (float)max.y));
    mrzDst.push_back(Point2f((float)(max.x + max.width), (float)max.y));
    mrzDst.push_back(Point2f((float)(max.x + max.width), (float)(max.y + max.height)));
    mrzDst.push_back(Point2f((float)max.x, (float)(max.y + max.height)));

    // MRZ in our template
    mrzSrc.push_back(Point2f(0.23f, 9.3f));
    mrzSrc.push_back(Point2f(18.0f, 9.3f));
    mrzSrc.push_back(Point2f(18.0f, 10.9f));
    mrzSrc.push_back(Point2f(0.23f, 10.9f));

    // find the transformation
    Mat t = getPerspectiveTransform(mrzSrc, mrzDst);

    // photo region in our template
    vector<Point2f> photoSrc;
    photoSrc.push_back(Point2f(0.0f, 0.0f));
    photoSrc.push_back(Point2f(5.66f, 0.0f));
    photoSrc.push_back(Point2f(5.66f, 7.16f));
    photoSrc.push_back(Point2f(0.0f, 7.16f));

    // surname region in our template
    vector<Point2f> surnameSrc;
    surnameSrc.push_back(Point2f(6.4f, 0.7f));
    surnameSrc.push_back(Point2f(8.96f, 0.7f));
    surnameSrc.push_back(Point2f(8.96f, 1.2f));
    surnameSrc.push_back(Point2f(6.4f, 1.2f));

    vector<Point2f> photoDst(4);
    vector<Point2f> surnameDst(4);

    // map the regions from our template to the image
    perspectiveTransform(photoSrc, photoDst, t);
    perspectiveTransform(surnameSrc, surnameDst, t);

    // draw the mapped regions
    for (int i = 0; i < 4; i++)
    {
        line(rgb, photoDst[i], photoDst[(i + 1) % 4], Scalar(0, 128, 255), 2);
    }
    for (int i = 0; i < 4; i++)
    {
        line(rgb, surnameDst[i], surnameDst[(i + 1) % 4], Scalar(0, 128, 255), 2);
    }
}
Result: photo and surname regions in orange. MRZ in blue.
Card.io is designed specifically for embossed credit cards. It won't work for this use case.
There is now the PassportEye library available for this purpose. It's not perfect, but works quite well in my experience: https://pypi.python.org/pypi/PassportEye/
I am doing a project in which I have to inspect pharmaceutical blister packs for missing tablets.
I am trying to use OpenCV's matchTemplate function. Let me show the code and then some results.
int match(string filename, string templatename)
{
    Mat ref = cv::imread(filename + ".jpg");
    Mat tpl = cv::imread(templatename + ".jpg");
    if (ref.empty() || tpl.empty())
    {
        cout << "Error reading file(s)!" << endl;
        return -1;
    }
    imshow("file", ref);
    imshow("template", tpl);

    Mat res_32f(ref.rows - tpl.rows + 1, ref.cols - tpl.cols + 1, CV_32FC1);
    matchTemplate(ref, tpl, res_32f, CV_TM_CCOEFF_NORMED);

    Mat res;
    res_32f.convertTo(res, CV_8U, 255.0);
    imshow("result", res);

    int size = ((tpl.cols + tpl.rows) / 4) * 2 + 1; // force size to be odd
    adaptiveThreshold(res, res, 255, ADAPTIVE_THRESH_MEAN_C, THRESH_BINARY, size, -128);
    imshow("result_thresh", res);

    while (true)
    {
        double minval, maxval, threshold = 0.8;
        Point minloc, maxloc;
        minMaxLoc(res, &minval, &maxval, &minloc, &maxloc);

        if (maxval >= threshold)
        {
            rectangle(ref, maxloc, Point(maxloc.x + tpl.cols, maxloc.y + tpl.rows), CV_RGB(0, 255, 0), 2);
            floodFill(res, maxloc, 0); // mark drawn blob
        }
        else
            break;
    }

    imshow("final", ref);
    waitKey(0);
    return 0;
}
And here are some pictures.
The "sample" image of a good blister pack:
The template cropped from "sample" image:
Result with "sample" image:
Missing tablet from this pack is detected:
But here are the problems:
I currently don't have any idea why this happens. Any suggestion and/or help is appreciated.
The original code that I followed and modified is here: http://opencv-code.com/quick-tips/how-to-handle-template-matching-with-multiple-occurences/
I found a solution to my own question: I just need to apply the Canny edge detector to both the image and the template before passing them to the matchTemplate function. The full working code:
int match(string filename, string templatename)
{
    Mat ref = cv::imread(filename + ".jpg");
    Mat tpl = cv::imread(templatename + ".jpg");
    if (ref.empty() || tpl.empty())
    {
        cout << "Error reading file(s)!" << endl;
        return -1;
    }

    Mat gref, gtpl;
    cvtColor(ref, gref, CV_BGR2GRAY);
    cvtColor(tpl, gtpl, CV_BGR2GRAY);

    const int low_canny = 110;
    Canny(gref, gref, low_canny, low_canny * 3);
    Canny(gtpl, gtpl, low_canny, low_canny * 3);

    imshow("file", gref);
    imshow("template", gtpl);

    Mat res_32f(ref.rows - tpl.rows + 1, ref.cols - tpl.cols + 1, CV_32FC1);
    matchTemplate(gref, gtpl, res_32f, CV_TM_CCOEFF_NORMED);

    Mat res;
    res_32f.convertTo(res, CV_8U, 255.0);
    imshow("result", res);

    int size = ((tpl.cols + tpl.rows) / 4) * 2 + 1; // force size to be odd
    adaptiveThreshold(res, res, 255, ADAPTIVE_THRESH_MEAN_C, THRESH_BINARY, size, -64);
    imshow("result_thresh", res);

    while (true)
    {
        double minval, maxval;
        Point minloc, maxloc;
        minMaxLoc(res, &minval, &maxval, &minloc, &maxloc);

        if (maxval > 0)
        {
            rectangle(ref, maxloc, Point(maxloc.x + tpl.cols, maxloc.y + tpl.rows), Scalar(0, 255, 0), 2);
            floodFill(res, maxloc, 0); // mark drawn blob
        }
        else
            break;
    }

    imshow("final", ref);
    waitKey(0);
    return 0;
}
Any suggestions for improvement are appreciated. I am strongly concerned about the performance and robustness of my code, so I am looking for all ideas.
There are two things that bother me right now: the lower Canny threshold and the negative constant in the adaptiveThreshold function.
Edit: Here is the result, as you asked :)
Template:
Test image, missing 2 tablets:
Canny results of template and test image:
matchTemplate result (converted to CV_8U):
After adaptiveThreshold:
Final result:
I don't think the adaptive threshold is a good choice.
What you need here is called non-maximum suppression: you have an image with multiple local maxima, and you want to remove all pixels that are not local maxima. For example:
cv::Mat res_dilated, non_maxima_mask;
// replace each pixel by the maximum in its neighbourhood (5 dilation iterations with the default 3x3 kernel)
cv::dilate(res_32f, res_dilated, cv::Mat(), cv::Point(-1, -1), 5);
// mark every pixel that is smaller than its neighbourhood maximum, i.e. not a local maximum
cv::compare(res_32f, res_dilated, non_maxima_mask, cv::CMP_LT);
// zero out everything that is not a local maximum
res_32f.setTo(0, non_maxima_mask);
Now all pixels in the res_32f image that are not local maxima are set to zero. All the maximum pixels are still at their original value, so you can adjust the threshold later in the line
double minval, maxval, threshold = 0.8;
All local maxima should now also be surrounded by enough zeroes that the floodFill will not extend too far.
Now I think you should be able to adjust the threshold to exclude all false positives.
If this is not enough, here is another suggestion:
Instead of just one template, I would run the search with multiple templates: your current one, plus one with a tablet from the right side and one from the left side of the pack. Due to perspective, these tablets look quite a bit different. Keep track of the found tablets so you do not detect the same tablet multiple times.
With these multiple templates you can raise the threshold even higher.
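A rough, self-contained sketch of that idea, working directly on the raw float response rather than the thresholded 8-bit one (the helper name and the 0.5 overlap ratio used to merge duplicate detections are only illustrative):

#include <opencv2/opencv.hpp>
#include <vector>
#include <algorithm>

// run a match-and-floodFill loop for several templates and merge the results,
// skipping detections that overlap an earlier one (same tablet seen twice)
std::vector<cv::Rect> matchAllTemplates(const cv::Mat& ref, const std::vector<cv::Mat>& templates, double threshold)
{
    std::vector<cv::Rect> found;
    for (const cv::Mat& tpl : templates)
    {
        cv::Mat res;
        cv::matchTemplate(ref, tpl, res, cv::TM_CCOEFF_NORMED);
        while (true)
        {
            double minval, maxval;
            cv::Point minloc, maxloc;
            cv::minMaxLoc(res, &minval, &maxval, &minloc, &maxloc);
            if (maxval < threshold)
                break;

            cv::Rect candidate(maxloc, tpl.size());
            cv::floodFill(res, maxloc, 0); // suppress this peak before the next iteration

            bool duplicate = false;
            for (const cv::Rect& existing : found)
            {
                // treat strongly overlapping detections as the same tablet
                if ((candidate & existing).area() > 0.5 * std::min(candidate.area(), existing.area()))
                {
                    duplicate = true;
                    break;
                }
            }
            if (!duplicate)
                found.push_back(candidate);
        }
    }
    return found;
}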
One further refinement: if the detection is still too erratic, try blurring your template and search image with a Gaussian blur. This will remove fine details and noise that may throw off the matchTemplate function, while leaving the larger structures intact.
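For instance, a light blur before matching could look like this (ref, tpl and res_32f as in your code; the 5x5 kernel is just a starting point to experiment with):

cv::Mat ref_blur, tpl_blur;
cv::GaussianBlur(ref, ref_blur, cv::Size(5, 5), 0);   // sigma derived from the kernel size
cv::GaussianBlur(tpl, tpl_blur, cv::Size(5, 5), 0);
cv::matchTemplate(ref_blur, tpl_blur, res_32f, cv::TM_CCOEFF_NORMED);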
Using a Canny filter instead seems unreliable to me: it seems to rely on the fact that a removed-tablet region will have more edges at the centre. But I am not sure this will always be the case, and you discard a lot of information about colour and brightness with the Canny filter, so I would expect worse results.
(That said, if it works for you, it works.)
Have you tried the SURF algorithm in order to get more detailed descriptors? You could try to collect descriptors for both the full and the empty sample images, and perform a different action for each type of object detected.
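A minimal sketch of that idea (note that SURF lives in the opencv_contrib xfeatures2d module and is patented/non-free, so treat this only as a starting point; ORB would be a free drop-in alternative, and the distance cutoff below is purely illustrative):

#include <opencv2/opencv.hpp>
#include <opencv2/xfeatures2d.hpp>

// compare a candidate region against a precomputed reference descriptor set
int countGoodMatches(const cv::Mat& sample, const cv::Mat& referenceDescriptors)
{
    cv::Ptr<cv::xfeatures2d::SURF> surf = cv::xfeatures2d::SURF::create(400);
    std::vector<cv::KeyPoint> keypoints;
    cv::Mat descriptors;
    surf->detectAndCompute(sample, cv::noArray(), keypoints, descriptors);

    if (descriptors.empty())
        return 0;

    cv::BFMatcher matcher(cv::NORM_L2);
    std::vector<cv::DMatch> matches;
    matcher.match(descriptors, referenceDescriptors, matches);

    int good = 0;
    for (const cv::DMatch& m : matches)
        if (m.distance < 0.25f)   // purely illustrative cutoff
            ++good;
    return good;
}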
I'm trying to improve face detection from a camera capture, so I thought it would be better if, before the face detection process, I removed the background from the image.
I'm using BackgroundSubtractorMOG and CascadeClassifier with lbpcascade_frontalface for face detection.
My question is: how can I grab the foreground image in order to use it as the input to face detection? This is what I have so far:
while (true) {
    capture.retrieve(image);
    mog.apply(image, fgMaskMOG, training ? LEARNING_RATE : 0);
    if (counter++ > LEARNING_LIMIT) {
        training = false;
    }

    // I think something should be done HERE to 'apply' the foreground mask
    // to the original image before passing it to the classifier..

    MatOfRect faces = new MatOfRect();
    classifier.detectMultiScale(image, faces);

    // draw faces rect
    for (Rect rect : faces.toArray()) {
        Core.rectangle(image, new Point(rect.x, rect.y), new Point(rect.x + rect.width, rect.y + rect.height), new Scalar(255, 0, 0));
    }

    // show capture in JFrame
    frame.update(image);
    frameFg.update(fgMaskMOG);
    Thread.sleep(1000 / FPS);
}
Thanks
I can answer in C++ using the BackgroundSubtractorMOG2:
You can either use erosion or pass a higher threshold value to the MOG background subtractor to remove the noise. In order to completely get rid of the noise and false positives, you can also blur the mask image and then apply a threshold:
// Blur the mask image
blur(fgMaskMOG2, fgMaskMOG2, Size(5,5), Point(-1,-1));
// Remove the shadow parts and the noise
threshold(fgMaskMOG2, fgMaskMOG2, 128, 255, 0);
Now you can easily find the rectangle bounding the foreground region and pass this area to the cascade classifier:
// Find the foreground bounding rectangle
Mat fgPoints;
findNonZero(fgMaskMOG2, fgPoints);
Rect fgBoundRect = boundingRect(fgPoints);
// Crop the foreground ROI
Mat fgROI = image(fgBoundRect);
// Detect the faces
vector<Rect> faces;
face_cascade.detectMultiScale(fgROI, faces, 1.3, 3, 0|CV_HAAR_SCALE_IMAGE, Size(32, 32));
// Display the face ROIs
for (size_t i = 0; i < faces.size(); ++i)
{
    Point center(fgBoundRect.x + faces[i].x + faces[i].width*0.5, fgBoundRect.y + faces[i].y + faces[i].height*0.5);
    circle(image, center, faces[i].width*0.5, Scalar(255, 255, 0), 4, 8, 0);
}
In this way, you will reduce the search area for the cascade classifier, which not only makes it faster but also reduces the false positive faces.
If you have the input image and the foreground mask, this is straightforward.
In C++, I would simply add (just where you put your comment): image.copyTo(fgimage, fgMaskMOG);
I'm not familiar with the Java interface, but it should be quite similar. Just don't forget to correctly initialize fgimage and reset it each frame.
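A minimal C++ sketch of that, with image and fgMaskMOG as in your loop (fgimage is just an illustrative name):

cv::Mat fgimage = cv::Mat::zeros(image.size(), image.type());

// inside the capture loop, after fgMaskMOG has been computed:
fgimage.setTo(cv::Scalar::all(0));   // clear the previous frame's foreground
image.copyTo(fgimage, fgMaskMOG);    // copy only the masked (foreground) pixels
// ...then pass fgimage instead of image to detectMultiScale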
I wrote a digit OCR for iOS.
I have a test PNG image with two digits, 5 and 4.
I find the contours. How do I pass the contours to Tesseract one at a time?
Init Tesseract:
tess = new tesseract::TessBaseAPI();
tess->Init([dataPath cStringUsingEncoding:NSUTF8StringEncoding], "eng");
tess->SetPageSegMode(tesseract::PSM_SINGLE_CHAR); //<-- !!!!
tess->tesseract::TessBaseAPI::SetVariable("tessedit_char_whitelist", "0123456789");
Function for detecting contours:
- (std::vector<std::vector<cv::Point> >)findSquaresInImage:(cv::Mat)_image {
    std::vector<std::vector<cv::Point> > squares;
    cv::Mat pyr, timg, gray0(_image.size(), CV_8U), gray;
    int thresh = 50, N = 11;

    cv::pyrDown(_image, pyr, cv::Size(_image.cols/2, _image.rows/2));
    cv::pyrUp(pyr, timg, _image.size());

    std::vector<std::vector<cv::Point> > contours;
    int ch[] = {0, 0};
    mixChannels(&timg, 1, &gray0, 1, ch, 1);

    for (int l = 0; l < N; l++) {
        if (l == 0) {
            cv::Canny(gray0, gray, 0, thresh, 5);
            cv::dilate(gray, gray, cv::Mat(), cv::Point(-1,-1));
        }
        else {
            gray = gray0 >= (l+1)*255/N;
        }

        cv::findContours(gray, contours, CV_RETR_EXTERNAL, CV_CHAIN_APPROX_SIMPLE);

        std::vector<cv::Point> approx;
        CvRect rec1;
        std::string str;
        std::map<int,IplImage*> pic_list;
        for (size_t i = 0; i < contours.size(); i++)
        {
            rec1 = cv::boundingRect(contours[i]);
            if (rec1.height > 0.5*gray.rows && rec1.width < 0.756*gray.cols) {
                NSLog(@"%d %d %d %d", rec1.width, rec1.height, rec1.x, rec1.y);
                cv::approxPolyDP(cv::Mat(contours[i]), approx, arcLength(cv::Mat(contours[i]), true)*0.02, true);
                squares.push_back(approx);
            }
        }
    }
    return squares;
}
Function for drawing contours:
cv::Mat debugSquares( std::vector<std::vector<cv::Point> > squares, cv::Mat image )
{
    for (int i = 0; i < squares.size(); i++) {
        // draw contour
        cv::drawContours(image, squares, i, cv::Scalar(255,0,0), 1, 8, std::vector<cv::Vec4i>(), 0, cv::Point());

        // draw bounding rect
        cv::Rect rect = boundingRect(cv::Mat(squares[i]));
        cv::rectangle(image, rect.tl(), rect.br(), cv::Scalar(0,255,0), 2, 8, 0);

        // draw rotated rect
        cv::RotatedRect minRect = minAreaRect(cv::Mat(squares[i]));
        cv::Point2f rect_points[4];
        minRect.points(rect_points);
        for (int j = 0; j < 4; j++) {
            cv::line(image, rect_points[j], rect_points[(j+1)%4], cv::Scalar(0,0,255), 1, 8); // blue
        }
    }
    return image;
}
Method for the button click handler:
- (IBAction)onMath:(id)sender {
    UIImage *image = [UIImage imageNamed:@"test1.png"];

    cv::Mat iMat = [self cvMatFromUIImage:image];

    std::vector<std::vector<cv::Point> > sq = [self findSquaresInImage:iMat];

    cv::Mat hui = debugSquares(sq, iMat);

    image = [self UIImageFromCVMat:hui];
    self.imView.image = image;
}
image after:
link to project on github: https://github.com/MaxPatsy/iORC
Can you check this answer here
I described some tips for preparing images for Tesseract here: Using tesseract to recognize license plates
In your example, there are several things going on...
You need to get the text to be black and the rest of the image white (not the reverse). That's what character recognition is tuned for. Grayscale is OK, as long as the background is mostly full white and the text mostly full black; the edges of the text may be gray (antialiased) and that may help recognition (but not necessarily - you'll have to experiment).
One of the issues you're seeing is that in some parts of the image the text is really "thin" (and gaps in the letters show up after thresholding), while in other parts it is really "thick" (and letters start merging). Tesseract won't like that :) It happens because the input image is not evenly lit, so a single threshold doesn't work everywhere. The solution is to do "locally adaptive thresholding", where a different threshold is calculated for each neighbourhood of the image. There are many ways of doing that; check out, for example (a short sketch follows after this list):
Adaptive gaussian thresholding in OpenCV with cv2.adaptiveThreshold(...,cv2.ADAPTIVE_THRESH_GAUSSIAN_C,...)
Local Otsu's method
Local adaptive histogram equalization
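As a rough illustration of the first option in OpenCV C++ (the file name, block size and constant are only assumed starting values to tune):

cv::Mat gray = cv::imread("plate.png", cv::IMREAD_GRAYSCALE);   // hypothetical input image
cv::Mat bw;
// each pixel is thresholded against a Gaussian-weighted mean of its 31x31 neighbourhood
cv::adaptiveThreshold(gray, bw, 255, cv::ADAPTIVE_THRESH_GAUSSIAN_C, cv::THRESH_BINARY, 31, 15);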
Another problem you have is that the lines aren't straight. In my experience Tesseract can handle a very limited degree of non-straight lines (a few percent of perspective distortion, tilt or skew), but it doesn't really work with wavy lines. If you can, make sure the source images have straight lines :) Unfortunately, there is no simple off-the-shelf answer for this; you'd have to look into the research literature and implement one of the state-of-the-art algorithms yourself (and open-source it if possible - there is a real need for an open-source solution to this). A Google Scholar search for "curved line OCR extraction" will get you started, for example:
Text line Segmentation of Curved Document Images
Lastly: I think you would do much better working with the Python ecosystem (ndimage, skimage) than with OpenCV in C++. The OpenCV Python wrappers are OK for simple stuff, but for what you're trying to do they won't do the job; you will need to grab many pieces that aren't in OpenCV (of course you can mix and match). Implementing something like curved-line detection in C++ will take an order of magnitude longer than in Python (and this is true even if you don't know Python).
Good luck!