OpenCV + Processing: detect and track grapes

I have a problem with OpenCV: I must detect and track grapes with a camera using Processing. How do I do it? Can I have an example? Thank you.
This code is an example that detects faces:
import gab.opencv.*;
import processing.video.*;
import java.awt.*;

Capture video;
OpenCV opencv;

void setup() {
  size(640, 480);
  video = new Capture(this, 640/2, 480/2);
  opencv = new OpenCV(this, 640/2, 480/2);
  opencv.loadCascade(OpenCV.CASCADE_FRONTALFACE);
  video.start();
}

void draw() {
  scale(2);
  opencv.loadImage(video);
  image(video, 0, 0);
  noFill();
  stroke(0, 255, 0);
  strokeWeight(3);
  Rectangle[] faces = opencv.detect();
  println(faces.length);
  for (int i = 0; i < faces.length; i++) {
    println(faces[i].x + "," + faces[i].y);
    rect(faces[i].x, faces[i].y, faces[i].width, faces[i].height);
  }
}

void captureEvent(Capture c) {
  c.read();
}

The code you're using is trying to detect faces.
As a basic breakdown you will need to segment the object you're trying to detect (grapes in this case) from the background. I recommend starting simple:
try simply using threshold() and see if the highlights of each grape can be isolated. Hopefully they'll be the brightest spots in the image (if the camera isn't looking directly at a light source).
if method 1 isn't effective, try using colour detection: if you know what kind of grapes you want to detect you can select a range of colours to detect and ignore the rest. Run the HSVColorTracking example and have a play with the ranges. Swap the marbles image with an image of grapes and see what you can get.
OpenCV has a function specifically built for detecting circles: HoughCircles. Unfortunately Greg's OpenCV Processing library doesn't wrap this function as he does with HoughLines yet, but it provides functions to convert between OpenCV's Mat and Processing's PImage. If you're just getting started with Processing and don't have experience with plain Java, this may be more convoluted (see the sketch below for what the call looks like).
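To give an idea of what that call looks like, here is a minimal raw-OpenCV C++ sketch (not Processing code; the file name and every parameter value are placeholders you would have to tune):

#include <opencv2/opencv.hpp>
#include <vector>

int main() {
    // Hypothetical still image of grapes; blur first so texture doesn't create false circles.
    cv::Mat src = cv::imread("grapes.jpg");
    cv::Mat gray;
    cv::cvtColor(src, gray, cv::COLOR_BGR2GRAY);
    cv::GaussianBlur(gray, gray, cv::Size(9, 9), 2, 2);

    // dp, minDist, Canny threshold, accumulator threshold and radius limits are all guesses.
    std::vector<cv::Vec3f> circles;
    cv::HoughCircles(gray, circles, cv::HOUGH_GRADIENT, 1, gray.rows / 16, 100, 30, 5, 60);

    // Draw whatever was found so the parameters can be judged visually.
    for (const cv::Vec3f& c : circles) {
        cv::circle(src, cv::Point(cvRound(c[0]), cvRound(c[1])), cvRound(c[2]), cv::Scalar(0, 255, 0), 2);
    }
    cv::imwrite("grapes_circles.jpg", src);
    return 0;
}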
Try basic thresholding and HSB range thresholding first. Once you have a good-looking binary image (where the background is completely black and the grapes are white) you can findContours, get the centroid of each contour, compute the minEnclosingCircle(), etc., as sketched below.
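Here's a rough sketch of that pipeline in plain OpenCV C++ (the fixed threshold value is a placeholder; in practice it would come from the HSB calibration, and in Processing you'd use the wrapped findContours() as in the mashup below):

#include <opencv2/opencv.hpp>
#include <vector>

int main() {
    cv::Mat src = cv::imread("grapes.jpg");   // hypothetical input image
    cv::Mat gray, binary;
    cv::cvtColor(src, gray, cv::COLOR_BGR2GRAY);

    // Binary image: grapes white, background black (threshold value is a guess to tune).
    cv::threshold(gray, binary, 200, 255, cv::THRESH_BINARY);

    // One contour per connected blob.
    std::vector<std::vector<cv::Point>> contours;
    cv::findContours(binary, contours, cv::RETR_EXTERNAL, cv::CHAIN_APPROX_SIMPLE);

    for (const std::vector<cv::Point>& contour : contours) {
        // Centroid from image moments.
        cv::Moments m = cv::moments(contour);
        if (m.m00 == 0) continue;
        cv::Point centroid(cvRound(m.m10 / m.m00), cvRound(m.m01 / m.m00));

        // Smallest circle that encloses the contour - a reasonable per-grape estimate.
        cv::Point2f center;
        float radius = 0;
        cv::minEnclosingCircle(contour, center, radius);

        cv::circle(src, cv::Point(cvRound(center.x), cvRound(center.y)), cvRound(radius), cv::Scalar(0, 255, 0), 2);
        cv::circle(src, centroid, 3, cv::Scalar(0, 0, 255), -1);
    }
    cv::imwrite("grapes_contours.jpg", src);
    return 0;
}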
Another option might be to train a support vector machine to distinguish between two classes: grapes and not grapes. This is a more advanced topic, but luckily Greg Borenstein, author of the OpenCV Processing library, wrote a nice article with videos and example code on the topic. Check out PSVM: Support Vector Machines for Processing.
Here's a mashup of the HueRangeSelection and FindContours examples using a Google image search result:
import gab.opencv.*;

PImage img;
OpenCV opencv;
Histogram histogram;

int lowerb = 50;
int upperb = 100;

ArrayList<Contour> contours;
ArrayList<Contour> polygons;

void setup() {
  size(800, 400);
  img = loadImage("grape-harvest-inside.jpg");
  opencv = new OpenCV(this, img);
  opencv.useColor(HSB);
}

void draw() {
  opencv.loadImage(img);
  image(img, 0, 0);

  opencv.setGray(opencv.getH().clone());
  opencv.inRange(lowerb, upperb);
  histogram = opencv.findHistogram(opencv.getH(), 255);
  image(opencv.getOutput(), width/2, height/2, width/2, height/2);

  noStroke();
  fill(0);
  histogram.draw(10, height - 230, 400, 200);
  noFill();
  stroke(0);
  line(10, height - 30, 410, height - 30);
  text("Hue", 10, height - (textAscent() + textDescent()));

  float lb = map(lowerb, 0, 255, 0, 400);
  float ub = map(upperb, 0, 255, 0, 400);

  stroke(255, 0, 0);
  fill(255, 0, 0);
  strokeWeight(2);
  line(lb + 10, height - 30, ub + 10, height - 30);
  ellipse(lb + 10, height - 30, 3, 3);
  text(lowerb, lb - 10, height - 15);
  ellipse(ub + 10, height - 30, 3, 3);
  text(upperb, ub + 10, height - 15);

  contours = opencv.findContours();
  for (Contour contour : contours) {
    stroke(0, 255, 0);
    noFill();
    contour.draw();
  }
}

void mouseMoved() {
  if (keyPressed) {
    upperb += mouseX - pmouseX;
  } else {
    if (upperb < 255 || (mouseX - pmouseX) < 0) {
      lowerb += mouseX - pmouseX;
    }
    if (lowerb > 0 || (mouseX - pmouseX) > 0) {
      upperb += mouseX - pmouseX;
    }
  }
  upperb = constrain(upperb, lowerb, 255);
  lowerb = constrain(lowerb, 0, upperb - 1);
}
Here's a preview of selecting a range closer to the grapes' colour:
You'll notice this is easy to use but also not foolproof. It should get you on the right track to asking yourself the right kind of questions.
For example:
what environments are you supporting? (indoors/outdoors, natural lighting, artificial lighting, daytime, nighttime, both? etc.) - light controls what your input images will look like and is therefore crucial
how many different grapes will you support? (can you get away with a single type (colour range)? are there elements that may trigger a false positive?)
etc.

Related

Detecting Glare in image

I am adding a card scanning feature to my iOS app and trying to add glare detection, so that when glare is detected while scanning, the user is prompted to reposition the card.
I have tried implementing it with OpenCV using
https://www.pyimagesearch.com/2016/10/31/detecting-multiple-bright-spots-in-an-image-with-python-and-opencv/
https://www.pyimagesearch.com/2014/09/29/finding-brightest-spot-image-using-python-opencv/
Finding bright spots in a image using opencv
But considering I have very little background knowledge in OpenCV and computer vision, I am unable to get the desired results.
Kindly guide me in the right direction on how I can achieve glare detection using OpenCV/CoreImage/GPUImage.
My code so far:
+ (bool) imageHavingGlare:(CMSampleBufferRef)buffer {
    cv::Mat matImage = [OpenCVWrapper matFromBuffer:buffer];
    cv::Mat matImageGrey;
    cv::cvtColor(matImage, matImageGrey, CV_BGRA2GRAY);
    GaussianBlur(matImageGrey, matImageGrey, cvSize(11, 11), 0);

    cv::Mat matImageBinarized;
    cv::threshold(matImageGrey, matImageBinarized, 30, 255, CV_THRESH_BINARY | CV_THRESH_OTSU);

    double min, max;
    cv::Point min_loc, max_loc;
    cv::minMaxLoc(matImageBinarized, &min, &max, &min_loc, &max_loc);

    if ((max_loc.x > 0) && (max_loc.y > 0)) {
        return true;
    }
    return false;
}
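For what it's worth, here is one way the bright-spot idea from those pyimagesearch posts could be turned into a simple glare score, written as a plain OpenCV C++ sketch (the 240 intensity cut-off and the 2% area ratio are made-up values to tune, not a tested iOS solution):

#include <opencv2/opencv.hpp>

// Rough glare check: flag the frame when a noticeable fraction of it is near-saturated.
bool hasGlare(const cv::Mat& bgr) {
    cv::Mat gray, blurred, bright;
    cv::cvtColor(bgr, gray, cv::COLOR_BGR2GRAY);
    cv::GaussianBlur(gray, blurred, cv::Size(11, 11), 0);

    // Keep only near-white pixels (candidate glare highlights).
    cv::threshold(blurred, bright, 240, 255, cv::THRESH_BINARY);

    // Remove tiny speckles so isolated bright pixels don't count as glare.
    cv::Mat kernel = cv::getStructuringElement(cv::MORPH_ELLIPSE, cv::Size(5, 5));
    cv::morphologyEx(bright, bright, cv::MORPH_OPEN, kernel);

    double brightRatio = (double)cv::countNonZero(bright) / (double)(bright.rows * bright.cols);
    return brightRatio > 0.02;   // more than ~2% of the frame saturated -> likely glare
}

The difference from the Otsu + minMaxLoc attempt above is that the decision is based on how much of the frame is saturated rather than on the location of the single brightest pixel.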

Blob Detection with light-colored blobs

I am having some issues with detecting specific "blobs" in a set of images. Not all images are the same, but I suppose the same parameters would be used to detect them anyway.
If you zoom in, you will see small, yellow aphids on the leaf. My goal is to single these out and count them. I don't really need to do much to the image, just obtain a count of them.
Right now, I have this:
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading.Tasks;
using Emgu.CV;
using Emgu.CV.Features2D;
using Emgu.CV.Structure;
using Emgu.CV.Util;
namespace AphidCounter
{
class Program
{
static void Main(string[] args)
{
// Read image
Mat im_in = CvInvoke.Imread("myimage1.jpg", Emgu.CV.CvEnum.LoadImageType.Grayscale);
//Mat im_in = CvInvoke.Imread("myimage2.png", Emgu.CV.CvEnum.LoadImageType.Color);
Mat im = im_in;
CvInvoke.Threshold(im_in, im, 40, 255, Emgu.CV.CvEnum.ThresholdType.BinaryInv); // 60, 255, 1
//CvInvoke.NamedWindow("Blob Detector", Emgu.CV.CvEnum.NamedWindowType.AutoSize);
DetectBlobs(im, 0);
CvInvoke.WaitKey(0);
}
static void DetectBlobs(Mat im, int c)
{
int maxT = 50;
int minA = 125; // Minimum area in pixels
int maxA = 550; // Maximum area in pixels
SimpleBlobDetectorParams EMparams = new SimpleBlobDetectorParams();
SimpleBlobDetector detector;
EMparams.MinThreshold = 0;
EMparams.MaxThreshold = 100;
if (minA < 1) minA = 1;
EMparams.FilterByArea = true;
EMparams.MinArea = minA;
EMparams.MaxArea = maxA;
if (maxT < 1) maxT = 1;
EMparams.MinConvexity = (float)maxT / 1000.0F; // 0.67
EMparams.FilterByInertia = true;
EMparams.MinInertiaRatio = 0.01F;
EMparams.FilterByColor = true;
EMparams.blobColor = 0;
VectorOfKeyPoint keyPoints = new VectorOfKeyPoint();
detector = new SimpleBlobDetector(EMparams);
detector.DetectRaw(im, keyPoints);
Mat im_with_keypoints = new Mat();
Bgr color = new Bgr(0, 0, 255);
Features2DToolbox.DrawKeypoints(im, keyPoints, im_with_keypoints, color, Features2DToolbox.KeypointDrawType.DrawRichKeypoints);
// Show blobs
CvInvoke.Imwrite("keypoints1.jpg", im_with_keypoints);
CvInvoke.Imshow("Blob Detector " + keyPoints.Size, im_with_keypoints);
System.Console.WriteLine("Number of keypoints: " + keyPoints.Size);
}
}
}
However, this is the result:
Am I not getting the parameters right? Or is there something else that I'm missing?
It is not because of wrong parameters; the image segmentation part itself has its limitations.
Grayscale-based thresholding may not work when the contrast between the blob and the background is very low. A threshold value around 160 is quite tolerable in this example, but not very accurate.
I would suggest going for colour-based thresholding since there is a decent colour gap.
Here is a C++ implementation of colour-based thresholding. Blobs are filtered using the same SimpleBlobDetector.
I have converted the image from RGB to Lab for better segmentation.
As the image provided is very large, it took more time to process. So I cropped a key part of the image and tuned the blob params for it. I provide the cropped image too (755 x 494 px).
Colour based thresholding and blob filtering:
#include "opencv2/imgproc/imgproc.hpp"
#include "opencv2/highgui/highgui.hpp"
#include "opencv2/features2d/features2d.hpp"
using namespace cv;
using namespace std;
int main()
{
char image_path[] = "E:/Coding/media/images/leaf_small.jpg";
Mat img_color, img_lab, img_thresh, img_open, img_close, img_keypoints;
img_color = imread(image_path, IMREAD_ANYCOLOR);
//Convert image to CIE Lab colorspace for better colour based segmentation
cvtColor(img_color, img_lab, CV_BGR2Lab);
//create window before creating trackbar
namedWindow("win_thresh", WINDOW_NORMAL);
namedWindow("win_blob", WINDOW_NORMAL);
//Using trackbars, calculate the range of L,a,b values to separate blobs
int low_L = 150, low_A = 0, low_B = 155,
high_L = 255, high_A = 255, high_B = 255;
//*Use trackbars to calibrate colour thresholding
createTrackbar("low_L", "win_thresh", &low_L, 255);
createTrackbar("low_A", "win_thresh", &low_A, 255);
createTrackbar("low_B", "win_thresh", &low_B, 255);
createTrackbar("high_L", "win_thresh", &high_L, 255);
createTrackbar("high_A", "win_thresh", &high_A, 255);
createTrackbar("high_B", "win_thresh", &high_B, 255);
int minArea = 35, maxArea = 172, minCircularity = 58, minConvexity = 87, minInertiaRatio = 21;
//Use trackbar and set Blob detector parameters
createTrackbar("minArea", "win_blob", &minArea, 200);
createTrackbar("maxArea", "win_blob", &maxArea, 200);
createTrackbar("minCircular", "win_blob", &minCircularity, 99);
createTrackbar("minConvex", "win_blob", &minConvexity, 99);
createTrackbar("minInertia", "win_blob", &minInertiaRatio, 99);
SimpleBlobDetector::Params params;
vector<KeyPoint> keypoints;
while (waitKey(1) != 27) //press 'esc' to quit
{
//inRange thresholds based on the Scalar boundaries provided
inRange(img_lab, Scalar(low_L, low_A, low_B), Scalar(high_L, high_A, high_B), img_thresh);
//Morphological filling
Mat strucElement = getStructuringElement(CV_SHAPE_ELLIPSE, Size(5, 5), Point(2, 2));
morphologyEx(img_thresh, img_close, MORPH_CLOSE, strucElement);
imshow("win_thresh", img_close);
//**SimpleBlobDetector works only in inverted binary images
//i.e.blobs should be in black and background in white.
bitwise_not(img_close, img_close); // inverts matrix
//Code crashes if minArea or any min value is set to zero
//since trackbar starts from 0, it is adjusted here by adding 1
params.filterByArea = true;
params.minArea = minArea + 1;
params.maxArea = maxArea + 1;
params.filterByCircularity = true;
params.filterByConvexity = true;
params.filterByInertia = true;
params.minCircularity = (minCircularity + 1) / 100.0;
params.minConvexity = (minConvexity + 1) / 100.0;
params.minInertiaRatio = (minInertiaRatio + 1) / 100.0;
SimpleBlobDetector detector(params);
detector.detect(img_close, keypoints);
drawKeypoints(img_color, keypoints, img_keypoints, Scalar(0, 0, 255), DrawMatchesFlags::DEFAULT);
stringstream displayText;
displayText = stringstream();
displayText << "Blob_count: " << keypoints.size();
putText(img_keypoints, displayText.str(), Point(0, 50), CV_FONT_HERSHEY_PLAIN, 2, Scalar(0, 0, 255), 2);
imshow("win_blob", img_keypoints);
}
return 0;
}
Output Screenshot
Tune the blob parameters according to the actual HD image.
Since the veins of the leaf are almost of the same colour and intensity as the aphids, this method may also utterly fail when an aphid sits close to or exactly on top of a vein.
This can be an ad-hoc fix but is not robust enough.
There has got to be a simple and robust method to achieve the result, using some filters, transformation or edge detection. Please share any other optimal solution if available.
EDIT: Opting for grayscale thresholding as the previous approach failed
The colour thresholding approach failed for this_image.
Colour-based thresholding has a very narrow bandwidth: if the image falls within the bandwidth the accuracy will be really good, but colour shifts totally ruin the accuracy.
Since you will be processing hundreds of images, colour thresholding may not be suitable.
I tried normal grayscale thresholding with some morphological erosion and filling, and got decent accuracy. Grayscale thresholding also has better immunity to colour shifts.
Additionally, we have an auto-thresholding option using OTSU thresholding, which selects the threshold value based on the image.
Code snippet:
threshold(img_gray, img_thresh, 0, 255, THRESH_OTSU);
Mat strucElement = getStructuringElement(CV_SHAPE_ELLIPSE, Size(3, 3), Point(1, 1));
morphologyEx(img_thresh, img_open, MORPH_OPEN, strucElement);
Rest of the code remains the same.
Parameter values:
minArea = 75, maxArea = 1000, minCircularity = 50, minConvexity = 20, minInertiaRatio = 15
The white ants are hard to differentiate from aphids as we are not using colour information. So the min_area has to be carefully tuned in order to exclude them.
Processed images can be found here img_1, img_2.
Tweak the morphology methods and blob parameters to obtain an optimal average count.

Detecting HeartBeat Using WebCam?

I am trying to create an application which can detect your heartbeat using your computer's webcam. I have been working on the code for 2 weeks and developed this so far.
How does it work? Illustrated below...
Detecting the face using OpenCV
Getting the image of the forehead
Applying a filter to convert it into a grayscale image [you can skip it]
Finding the average intensity of green pixels per frame
Saving the averages into an array
Applying FFT (I have used the minim library)
Extracting the heartbeat from the FFT spectrum (here, I need some help)
Here, I need help extracting the heartbeat from the FFT spectrum. Can anyone help me? Here is a similar application developed in Python, but I am not able to understand that code, so I am developing the same in Processing. Can anyone help me understand the part of the Python code where it extracts the heartbeat?
//---------import required library -----------
import gab.opencv.*;
import processing.video.*;
import java.awt.*;
import java.util.*;
import ddf.minim.analysis.*;
import ddf.minim.*;
//----------create objects---------------------------------
Capture video; // camera object
OpenCV opencv; // opencv object
Minim minim;
FFT fft;
//IIRFilter filt;
//--------- Create ArrayList--------------------------------
ArrayList<Float> poop = new ArrayList();
float[] sample;
int bufferSize = 128;
int sampleRate = 512;
int bandWidth = 20;
int centerFreq = 80;
//---------------------------------------------------
void setup() {
size(640, 480); // size of the window
minim = new Minim(this);
fft = new FFT( bufferSize, sampleRate);
video = new Capture(this, 640/2, 480/2); // initializing video object
opencv = new OpenCV(this, 640/2, 480/2); // initializing opencv object
opencv.loadCascade(OpenCV.CASCADE_FRONTALFACE); // loading haar cascade file for face detection
video.start(); // start video
}
void draw() {
background(0);
// image(video, 0, 0 ); // show video in the background
opencv.loadImage(video);
Rectangle[] faces = opencv.detect();
video.loadPixels();
//------------ Finding faces in the video -----------
float gavg = 0;
for (int i = 0; i < faces.length; i++) {
noFill();
stroke(#FFB700); // yellow rectangle
rect(faces[i].x, faces[i].y, faces[i].width, faces[i].height); // creating rectangle around the face (YELLOW)
stroke(#0070FF); //blue rectangle
rect(faces[i].x, faces[i].y, faces[i].width, faces[i].height-2*faces[i].height/3); // creating a blue rectangle around the forehead
//-------------------- storing forehead white rectangle part into an image -------------------
stroke(0, 255, 255);
rect(faces[i].x+faces[i].width/2-15, faces[i].y+15, 30, 15);
PImage img = video.get(faces[i].x+faces[i].width/2-15, faces[i].y+15, 30, 15); // storing the forehead area into an image
img.loadPixels();
img.filter(GRAY); // converting capture image rgb to gray
img.updatePixels();
int numPixels = img.width*img.height;
for (int px = 0; px < numPixels; px++) { // For each pixel in the video frame...
final color c = img.pixels[px];
final color luminG = c>>010 & 0xFF;
final float luminRangeG = luminG/255.0;
gavg = gavg + luminRangeG;
}
//--------------------------------------------------------
gavg = gavg/numPixels;
if (poop.size()< bufferSize) {
poop.add(gavg);
}
else poop.remove(0);
}
sample = new float[poop.size()];
for (int i=0;i<poop.size();i++) {
Float f = (float) poop.get(i);
sample[i] = f;
}
if (sample.length>=bufferSize) {
//fft.window(FFT.NONE);
fft.forward(sample, 0);
// bpf = new BandPass(centerFreq, bandwidth, sampleRate);
// in.addEffect(bpf);
float bw = fft.getBandWidth(); // returns the width of each frequency band in the spectrum (in Hz).
println(bw); // returns 21.5332031 Hz for spectrum [0] & [512]
for (int i = 0; i < fft.specSize(); i++)
{
// println( " Freq" + max(sample));
stroke(0, 255, 0);
float x = map(i, 0, fft.specSize(), 0, width);
line( x, height, x, height - fft.getBand(i)*100);
// text("FFT FREQ " + fft.getFreq(i), width/2-100, 10*(i+1));
// text("FFT BAND " + fft.getBand(i), width/2+100, 10*(i+1));
}
}
else {
println(sample.length + " " + poop.size());
}
}
void captureEvent(Capture c) {
c.read();
}
The FFT is applied to a window of 128 samples.
int bufferSize = 128;
During the draw() method the samples are stored in an array until the buffer for the FFT is filled. After that the buffer is kept full: to insert a new sample, the oldest one is removed. gavg is the average gray channel colour.
gavg = gavg/numPixels;
if (poop.size()< bufferSize) {
poop.add(gavg);
}
else poop.remove(0);
Copying poop to sample:
sample = new float[poop.size()];
for (int i=0;i < poop.size();i++) {
Float f = (float) poop.get(i);
sample[i] = f;
}
Now it is possible to apply the FFT to the sample array:
fft.forward(sample, 0);
The code only shows the spectrum result; the heartbeat frequency must still be calculated.
For each band in the FFT you have to find the maximum, and the position of that maximum is the frequency of the heartbeat.
int peakBand = 0;
for (int i = 0; i < fft.specSize(); i++) {
  // keep the band (position) with the strongest amplitude
  if (fft.getBand(i) > fft.getBand(peakBand)) peakBand = i;
}
Then get the bandwidth to know the frequency.
float bw = fft.getBandWidth();
Converting the peak band position to a frequency:
heartBeatFrequency = fft.getBandWidth() * peakBand; // in Hz; multiply by 60 for beats per minute
Once you have bufferSize (128) samples or more, forward the FFT with the samples array and then take the peak of the spectrum, which gives the heart rate.
The following papers explain the same:
Measuring Heart Rate from Video - Isabel Bush - Stanford - link (Page 4 paragraphs below Figure 2 explain this.)
Real Time Heart Rate Monitoring From Facial RGB Color Video Using Webcam - H. Rahman, M.U. Ahmed, S. Begum, P. Funk - link (Page 4)
After looking at your question, I thought I would get my hands on this and tried making a repository for it.
Well, I am having some issues, if someone could have a look at it.
Thank you David Clifte for this answer, it helped a lot.

Detect basket ball Hoops and ball tracking

Detect the hoop (basket). See the samples of "hoop".
Count the number of successful attempts (shots) and the failed attempts.
I am using opencv.
Input:
Camera position will be static.
Portrait-mode videos from any mobile device.
ref:
What have I tried:
I am able to track the basketball. Still, I am seeking a better solution.
results:
My code:
int main () {
VideoCapture vid(path);
if (!vid.isOpened())
exit(-1);
int i_frame_height = vid.get(CV_CAP_PROP_FRAME_HEIGHT);
i_height_basketball = i_height_basketball * I_HEIGHT / i_frame_height;
int fps = vid.get(CV_CAP_PROP_FPS);
Mat mat_black(640, 480, CV_8UC3, Scalar(0, 0, 0));
vector <Mat> vec_frames;
for (int i_push = 0; i_push < I_NO_FRAMES_STORE; i_push++)
vec_frames.push_back(mat_black);
vector <Mat> vec_mat_result;
for (int i_push = 0; i_push < I_RESULT_STORE; i_push++)
vec_mat_result.push_back(mat_black);
int count_frame = 0;
while (true) {
int clk_start = clock();
Mat image, result;
vid >> image;
if (image.empty())
break;
resize(image, image, Size(I_WIDTH, I_HEIGHT));
image.copyTo(vec_mat_result[count_frame % I_RESULT_STORE]);
if (count_frame >= 1)
vec_mat_result[(count_frame - 1) % I_RESULT_STORE].copyTo(result);
GaussianBlur(image, image, Size(9, 9), 2, 2);
image.copyTo(vec_frames[count_frame % I_NO_FRAMES_STORE]);
if (count_frame >= I_NO_FRAMES_STORE - 1) {
Mat mat_diff_temp(I_HEIGHT, I_WIDTH, CV_32S, Scalar(0));
for (int i_diff = 0; i_diff < I_NO_FRAMES_STORE; i_diff++) {
Mat mat_rgb_diff_temp = abs(vec_frames[ (count_frame - 1) % I_NO_FRAMES_STORE ] - vec_frames[ (count_frame - i_diff) % I_NO_FRAMES_STORE ]);
cvtColor(mat_rgb_diff_temp, mat_rgb_diff_temp, CV_BGR2GRAY);
mat_rgb_diff_temp = mat_rgb_diff_temp > I_THRESHOLD;
mat_rgb_diff_temp.convertTo(mat_rgb_diff_temp, CV_32S);
mat_diff_temp = mat_diff_temp + mat_rgb_diff_temp;
}
mat_diff_temp = mat_diff_temp > I_THRESHOLD_2;
// mat_diff_temp.convertTo(mat_diff_temp, CV_8U);
Mat mat_roi = mat_diff_temp.rowRange(0, i_height_basketball);
// imshow("ROI", mat_roi);
Moments mm = cv::moments(mat_roi, true);
Point p_center = Point(mm.m10 / mm.m00, mm.m01 / mm.m00);
circle(result, p_center, 3, CV_RGB(0, 255, 0), -1);
line(result, Point(0, i_height_basketball), Point(result.cols, i_height_basketball), Scalar(225, 0, 0), 1);
}
count_frame = count_frame + 1;
int clk_processing_time = (clock() - clk_start);
if (count_frame > 1)
imshow("image", result);
// waitKey(0);
int delay = (1000 / fps) - clk_processing_time;
if (delay <= 0)
delay = 2;
if (waitKey(delay) >= 27)
break;
}
vid.release();
return 0;
}
Questions:
How to detect the hoop? I thought of using square detection to find the square regions around the hoop.
What is the best way of counting the successful shots? Or how to count them?
I have what I suspect will be a fairly strong baseline: once the ball has commenced its downward arc, if the ball demonstrates significant upward movement again, it's a miss. Otherwise, it's a basket. This won't catch airballs, but I suspect they're relatively few anyway.
I think you could get a whole lot of mileage out of learning the ball trajectory of a successful shot and not worry too much about the hoop. Furthermore, didn't you say the camera was fixed-position? Doesn't that mean the hoop is always in the same place, so you could just specify its location?
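A minimal sketch of that baseline, assuming the ball centre is already tracked per frame (as in the motion-difference code above); the swing threshold is a guess to tune:

#include <vector>

enum ShotResult { MADE, MISSED };

// Classify one shot from the tracked vertical ball positions (image y grows downward).
ShotResult classifyShot(const std::vector<int>& ballY, int minSwing = 15) {
    bool descending = false;
    int lowestY = 0;   // lowest point reached on screen (largest y) once the ball starts falling
    for (size_t i = 1; i < ballY.size(); ++i) {
        int dy = ballY[i] - ballY[i - 1];
        if (dy > 0) {                                   // moving down the image
            descending = true;
            if (ballY[i] > lowestY) lowestY = ballY[i];
        } else if (descending && lowestY - ballY[i] > minSwing) {
            return MISSED;                              // bounced back up after the downward arc
        }
    }
    return descending ? MADE : MISSED;                  // never came down at all -> call it a miss
}

You would reset and re-run this per shot attempt, e.g. between the frame where the ball first rises above the rim line and the frame where it leaves the lower part of the image.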
EDIT:
If you absolutely did have to find the hoop, I'd look for an object (sub-region of the image) of about the same size as the ball (which you say you can track) that's orange. More generally, you could learn a classifier for the hoop based on the training images you linked to, and apply it at a mixture of locations and scales, searching for the best match. You should know its approximate location, i.e. that it's in the upper portion of the image and likely to be to one side or the other. Then you could use proximity features to this identified region, in addition to trajectory features, to build a classifier for whether the shot succeeded or not.
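If you'd rather start with something simpler than a learned classifier, multi-scale template matching against a cropped hoop image (a hypothetical hoop_template.png) over the upper half of the frame is an easy baseline to try:

#include <opencv2/opencv.hpp>

// Search the top half of the frame for the best match of a hoop template at several scales.
// Returns the best bounding box (frame coordinates, since the search ROI starts at the origin);
// a real implementation would also return the score so weak matches can be rejected.
cv::Rect findHoop(const cv::Mat& frame, const cv::Mat& hoopTemplate) {
    cv::Mat searchArea = frame(cv::Rect(0, 0, frame.cols, frame.rows / 2));
    double bestScore = -1.0;
    cv::Rect bestBox;

    for (double scale = 0.5; scale <= 1.5; scale += 0.1) {
        cv::Mat scaled;
        cv::resize(hoopTemplate, scaled, cv::Size(), scale, scale);
        if (scaled.cols >= searchArea.cols || scaled.rows >= searchArea.rows) break;

        cv::Mat result;
        cv::matchTemplate(searchArea, scaled, result, cv::TM_CCOEFF_NORMED);

        double maxVal;
        cv::Point maxLoc;
        cv::minMaxLoc(result, nullptr, &maxVal, nullptr, &maxLoc);
        if (maxVal > bestScore) {
            bestScore = maxVal;
            bestBox = cv::Rect(maxLoc, scaled.size());
        }
    }
    return bestBox;
}

Since the camera is static you would run this once on the first frame and reuse the returned box as the hoop region for the proximity features mentioned above.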

iOS + Tesseract Ocr + OpenCV

I wrote a digit OCR for iOS.
I have a test PNG image with two digits, 5 and 4.
I find the contours. How do I transfer the contours one at a time to Tesseract?
Init Tesseract:
tess = new tesseract::TessBaseAPI();
tess->Init([dataPath cStringUsingEncoding:NSUTF8StringEncoding], "eng");
tess->SetPageSegMode(tesseract::PSM_SINGLE_CHAR); //<-- !!!!
tess->tesseract::TessBaseAPI::SetVariable("tessedit_char_whitelist", "0123456789");
Function to detect contours:
- (std::vector<std::vector<cv::Point> >)findSquaresInImage:(cv::Mat)_image {
std::vector<std::vector<cv::Point> > squares;
cv::Mat pyr, timg, gray0(_image.size(), CV_8U), gray;
int thresh = 50, N = 11;
cv::pyrDown(_image, pyr, cv::Size(_image.cols/2, _image.rows/2));
cv::pyrUp(pyr, timg, _image.size());
std::vector<std::vector<cv::Point> > contours;
int ch[] = {0, 0};
mixChannels(&timg, 1, &gray0, 1, ch, 1);
for( int l = 0; l < N; l++ ) {
if( l == 0 ) {
cv::Canny(gray0, gray, 0, thresh, 5);
cv::dilate(gray, gray, cv::Mat(), cv::Point(-1,-1));
}
else {
gray = gray0 >= (l+1)*255/N;
}
cv::findContours(gray, contours, CV_RETR_EXTERNAL, CV_CHAIN_APPROX_SIMPLE);
std::vector<cv::Point> approx;
CvRect rec1;
std::string str;
std::map<int,IplImage*> pic_list;
for( size_t i = 0; i < contours.size(); i++ )
{
rec1 = cv::boundingRect(contours[i]);
if (rec1.height > 0.5*gray.rows && rec1.width < 0.756*gray.cols) {
NSLog(@"%d %d %d %d", rec1.width, rec1.height, rec1.x, rec1.y);
cv::approxPolyDP(cv::Mat(contours[i]), approx, arcLength(cv::Mat(contours[i]), true)*0.02, true);
squares.push_back(approx);
}
}
}
return squares;
}
Function to draw contours:
cv::Mat debugSquares( std::vector<std::vector<cv::Point> > squares, cv::Mat image ) {
for ( int i = 0; i< squares.size(); i++ ) {
// draw contour
cv::drawContours(image, squares, i, cv::Scalar(255,0,0), 1, 8, std::vector<cv::Vec4i>(), 0, cv::Point());
// draw bounding rect
cv::Rect rect = boundingRect(cv::Mat(squares[i]));
cv::rectangle(image, rect.tl(), rect.br(), cv::Scalar(0,255,0), 2, 8, 0);
// draw rotated rect
cv::RotatedRect minRect = minAreaRect(cv::Mat(squares[i]));
cv::Point2f rect_points[4];
minRect.points( rect_points );
for ( int j = 0; j < 4; j++ ) {
cv::line( image, rect_points[j], rect_points[(j+1)%4], cv::Scalar(0,0,255), 1, 8 ); // blue
}
}
return image;
}
Method for the button click:
- (IBAction)onMath:(id)sender {
UIImage *image = [UIImage imageNamed:@"test1.png"];
cv::Mat iMat = [self cvMatFromUIImage:image];
std::vector<std::vector<cv::Point> > sq = [self findSquaresInImage:iMat];
cv::Mat hui = debugSquares(sq, iMat);
image = [self UIImageFromCVMat:hui];
self.imView.image = image;
}
image after:
link to project on github: https://github.com/MaxPatsy/iORC
Can you check this answer here
I described some tips for preparing images for Tesseract here: Using tesseract to recognize license plates
In your example, there are several things going on...
You need to get the text to be black and the rest of the image white (not the reverse). That's what character recognition is tuned for. Grayscale is OK, as long as the background is mostly full white and the text mostly full black; the edges of the text may be gray (antialiased) and that may help recognition (but not necessarily - you'll have to experiment).
One of the issues you're seeing is that in some parts of the image the text is really "thin" (and gaps in the letters show up after thresholding), while in other parts it is really "thick" (and letters start merging). Tesseract won't like that :) It happens because the input image is not evenly lit, so a single threshold doesn't work everywhere. The solution is to do "locally adaptive thresholding", where a different threshold is calculated for each neighbourhood of the image. There are many ways of doing that, but check out for example:
Adaptive gaussian thresholding in OpenCV with cv2.adaptiveThreshold(...,cv2.ADAPTIVE_THRESH_GAUSSIAN_C,...)
Local Otsu's method
Local adaptive histogram equalization
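For example, the adaptive threshold call looks like this (shown with the C++ API to match the rest of your code; the 31-pixel block size and the offset of 10 are values you'd have to tune for your images):

#include <opencv2/opencv.hpp>

int main() {
    cv::Mat gray = cv::imread("digits.png", cv::IMREAD_GRAYSCALE);   // hypothetical input
    cv::Mat binary;

    // A separate threshold is computed for each 31x31 neighbourhood,
    // so unevenly lit regions are binarised independently.
    cv::adaptiveThreshold(gray, binary, 255,
                          cv::ADAPTIVE_THRESH_GAUSSIAN_C,
                          cv::THRESH_BINARY, 31, 10);

    cv::imwrite("digits_binary.png", binary);   // dark text ends up black on a white background
    return 0;
}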
Another problem you have is that the lines aren't straight. In my experience Tesseract can handle a very limited degree of non-straight lines (a few percent of perspective distortion, tilt or skew), but it doesn't really work with wavy lines. If you can, make sure that the source images have straight lines :) Unfortunately, there is no simple off-the-shelf answer for this; you'd have to look into the research literature and implement one of the state of the art algorithms yourself (and open-source it if possible - there is a real need for an open source solution to this). A Google Scholar search for "curved line OCR extraction" will get you started, for example:
Text line Segmentation of Curved Document Images
Lastly: I think you would do much better to work with the python ecosystem (ndimage, skimage) than with OpenCV in C++. OpenCV python wrappers are ok for simple stuff, but for what you're trying to do they won't do the job, you will need to grab many pieces that aren't in OpenCV (of course you can mix and match). Implementing something like curved line detection in C++ will take an order of magnitude longer than in python (* this is true even if you don't know python).
Good luck!
