Detect the hoop(basket).To see the samples of "hoop".
Count the no of successful attempts(shoot) and the failure attempts.
I am using opencv.
Input:
Camera position will be static.
The Portrait mode videos from any mobile device.
ref:
What have i tried:
Able to track the basket ball. Still, seeking for a better solution.
results:
My code:
int main () {
VideoCapture vid(path);
if (!vid.isOpened())
exit(-1);
int i_frame_height = vid.get(CV_CAP_PROP_FRAME_HEIGHT);
i_height_basketball = i_height_basketball * I_HEIGHT / i_frame_height;
int fps = vid.get(CV_CAP_PROP_FPS);
Mat mat_black(640, 480, CV_8UC3, Scalar(0, 0, 0));
vector <Mat> vec_frames;
for (int i_push = 0; i_push < I_NO_FRAMES_STORE; i_push++)
vec_frames.push_back(mat_black);
vector <Mat> vec_mat_result;
for (int i_push = 0; i_push < I_RESULT_STORE; i_push++)
vec_mat_result.push_back(mat_black);
int count_frame = 0;
while (true) {
int clk_start = clock();
Mat image, result;
vid >> image;
if (image.empty())
break;
resize(image, image, Size(I_WIDTH, I_HEIGHT));
image.copyTo(vec_mat_result[count_frame % I_RESULT_STORE]);
if (count_frame >= 1)
vec_mat_result[(count_frame - 1) % I_RESULT_STORE].copyTo(result);
GaussianBlur(image, image, Size(9, 9), 2, 2);
image.copyTo(vec_frames[count_frame % I_NO_FRAMES_STORE]);
if (count_frame >= I_NO_FRAMES_STORE - 1) {
Mat mat_diff_temp(I_HEIGHT, I_WIDTH, CV_32S, Scalar(0));
for (int i_diff = 0; i_diff < I_NO_FRAMES_STORE; i_diff++) {
Mat mat_rgb_diff_temp = abs(vec_frames[ (count_frame - 1) % I_NO_FRAMES_STORE ] - vec_frames[ (count_frame - i_diff) % I_NO_FRAMES_STORE ]);
cvtColor(mat_rgb_diff_temp, mat_rgb_diff_temp, CV_BGR2GRAY);
mat_rgb_diff_temp = mat_rgb_diff_temp > I_THRESHOLD;
mat_rgb_diff_temp.convertTo(mat_rgb_diff_temp, CV_32S);
mat_diff_temp = mat_diff_temp + mat_rgb_diff_temp;
}
mat_diff_temp = mat_diff_temp > I_THRESHOLD_2;
// mat_diff_temp.convertTo(mat_diff_temp, CV_8U);
Mat mat_roi = mat_diff_temp.rowRange(0, i_height_basketball);
// imshow("ROI", mat_roi);
Moments mm = cv::moments(mat_roi, true);
Point p_center = Point(mm.m10 / mm.m00, mm.m01 / mm.m00);
circle(result, p_center, 3, CV_RGB(0, 255, 0), -1);
line(result, Point(0, i_height_basketball), Point(result.cols, i_height_basketball), Scalar(225, 0, 0), 1);
}
count_frame = count_frame + 1;
int clk_processing_time = (clock() - clk_start);
if (count_frame > 1)
imshow("image", result);
// waitKey(0);
int delay = (1000 / fps) - clk_processing_time;
if (delay <= 0)
delay = 2;
if (waitKey(delay) >= 27)
break;
}
vid.release();
return 0;
}
Questions:
How to detect the hoop? I thought of doing with Square detection to detect the square regions around the hoop.
What is the best way of counting the successful shoots? Or How to count ?
I have what I suspect will be a fairly strong baseline: once the ball has commenced its downward arc, if the ball demonstrates significant upward movement again, its a miss. Otherwise, its a basket. This won't catch airballs, but I suspect they're relatively few anyway.
I think you could get a whole lot of mileage out of learning the ball trajectory of a successful shot and not worry too much about the hoop. Furthermore, didn't you say the camera was fixed-position? Doesn't that mean the hoop's always in the same place, and so you could just specify its location?
EDIT:
If you absolutely did have to find the hoop, I'd look for an object (sub-region of the image) of about the same size as the ball (which you say you can track) that's orange. More generally, you could learn a classifier for the hoop based on the training images you linked to, and apply it at a mixture of locations and scales, searching for the best match. You should know its approximate location, i.e. that it's in the upper portion of the image and likely to be to one side or the other. Then you could use proximity features to this identified region in addition to trajectory features to build a classifier for whether the shot succeeded or not.
Related
Background:
I am looking to align images for a focus stacking application using a smartphone. Links to images:
First in stack: 1, Last in stack: 2, Final stacked images: 3
I.e. images are nominally the same, BUT contain:
Systematic change in FOCUS as the focal plane shifts between images
Magnification changes slightly (smartphone feature as focus changes!)
Camera moves slightly due to random vibrations.
Images need to be aligned for the focus-stacking APP to work.
Progress to date:
I use OpenCV's findTransformECC() to get alignment. It works well after some experimentation i.e. see cv2.MOTION_EUCLIDEAN for the warp_mode in ECC image alignment method which was useful to improve the initialization of the Warp matrix:
Images aligned at pixel level
60secs to process 8Mpix image (1sec for 0.5Mpix image) (on 3 year old portable PC with OpenCV release libraries)
See stacked image link above.
I briefly investigated a feature detector (SIFT). It did not align the images well, presumably due to the change in focus between images.
Code:
int scale = 1;
int scaleSmall = 4;
float scaleDiff = scaleSmall / scale;
for (i = 0; i< numImages; i++) {
file = dir + image + to_string(i) + ".jpg";
col[i] = imread(file);
resize(col[i], z[i], Size(col[i].cols/scale, col[i].rows/scale));
cvtColor(z[i], zg[i], CV_BGR2GRAY);
resize(zg[i], zgSmall[i], Size(col[i].cols / scaleSmall, col[i].rows / scaleSmall));
}
// Set a 2x3 or 3x3 warp matrix depending on the motion model.
// See https://www.learnopencv.com/image-alignment-ecc-in-opencv-c-python/
// Define the motion model
const int warp_mode = MOTION_HOMOGRAPHY;
// Initialize the matrix to identity
if (warp_mode == MOTION_HOMOGRAPHY) {
warp_init = Mat::eye(3, 3, CV_32F);
warp_matrix = Mat::eye(3, 3, CV_32F);
warp_matrix_prev = Mat::eye(3, 3, CV_32F);
scaleTX = (Mat_<float>(3, 3) << 1, 1, scaleDiff, 1, 1, scaleDiff, 1 / scaleDiff, 1 / scaleDiff, 1);
}
else {
warp_init = Mat::eye(2, 3, CV_32F);
scaleTX = Mat::eye(2, 3, CV_32F);
warp_matrix = Mat::eye(2, 3, CV_32F);
warp_matrix_prev = Mat::eye(2, 3, CV_32F);
scaleTX = (Mat_<float>(2, 3) << 1, 1, scaleDiff, 1, 1, scaleDiff);
}
// Specify the number of iterations.
int number_of_iterations = 5000;
// Specify the threshold of the increment
// in the correlation coefficient between two iterations
double termination_eps = 1e-8;
// Define termination criteria
TermCriteria criteria(TermCriteria::COUNT + TermCriteria::EPS, number_of_iterations, termination_eps);
for (i = 1; i < numImages; i++) {
// Check images right size
if (zg[0].rows < 10 || zg[1].rows < 10)
return;
// Run the ECC algorithm at start to get an initial guess. The results are stored in warp_matrix.
if (i == 1) {
findTransformECC(zgSmall[0], zgSmall[i], warp_init, warp_mode, criteria );
// See https://stackoverflow.com/questions/45997891/cv2-motion-euclidean-for-the-warp-mode-in-ecc-image-alignment-method
warp_matrix = warp_init * scaleTX;
}
// Warp Matrix from previous iteration is used as initialisation
findTransformECC(zg[0], zg[i], warp_matrix, warp_mode, criteria);
if (warp_mode != MOTION_HOMOGRAPHY) {
warpAffine(zg[i], ag[i], warp_matrix, zg[i].size(), INTER_LINEAR + WARP_INVERSE_MAP);
warpAffine(z[i], acol[i], warp_matrix, zg[i].size(), INTER_LINEAR + WARP_INVERSE_MAP);
}
else {
// Use warpPerspective for Homography
warpPerspective(z[i], acol[i], warp_matrix, z[i].size(), INTER_LINEAR + WARP_INVERSE_MAP);
warpPerspective(zg[i], ag[i], warp_matrix, zg[i].size(), INTER_LINEAR + WARP_INVERSE_MAP);
}
}
}
Question:
Can the image registration speed be significantly improved (using the same hardware)?
there are at least 3 improvements that can be done:
5000 iterations may be unnecessary. Try to limit it to 500. Moreover transforming images to gradient domain may help. See GetGradient function from this tutorial.
You can assume that perspective effects are negligible so you can change warp_mode to MOTION_AFFINE to limit the degrees of freedom from 8 to 6.
You can also try another, much faster approach that is based on phase correlation (frequency domain). In the standard way it only estimates translation between images but you can transfer them to the log-polar space to get translation, rotation and scale invariance. This code implements the third approach.
I am using an iPhone camera to detect a TV screen. My current approach is to compare subsequent frames pixel by pixel and keep track of cumulative differences. The result is binary a image as shown in image.
For me this looks like a rectangle but OpenCV does not think so. It's sides are not perfectly straight and sometimes there is even more color bleed to make detection difficult. Here is my OpenCV code trying to detect rectangle, since I am not very familiar with OpenCV it is copied from some example I found.
uint32_t *ptr = (uint32_t*)CVPixelBufferGetBaseAddress(buffer);
cv::Mat image((int)width, (int)height, CV_8UC4, ptr); // unsigned 8-bit values for 4 channels (ARGB)
cv::Mat image2 = [self matFromPixelBuffer:buffer];
std::vector<std::vector<cv::Point>>squares;
// blur will enhance edge detection
cv::Mat blurred(image2);
GaussianBlur(image2, blurred, cvSize(3,3), 0);//change from median blur to gaussian for more accuracy of square detection
cv::Mat gray0(blurred.size(), CV_8U), gray;
std::vector<std::vector<cv::Point> > contours;
// find squares in every color plane of the image
for (int c = 0; c < 3; c++) {
int ch[] = {c, 0};
mixChannels(&blurred, 1, &gray0, 1, ch, 1);
// try several threshold levels
const int threshold_level = 2;
for (int l = 0; l < threshold_level; l++) {
// Use Canny instead of zero threshold level!
// Canny helps to catch squares with gradient shading
if (l == 0) {
Canny(gray0, gray, 10, 20, 3); //
// Dilate helps to remove potential holes between edge segments
dilate(gray, gray, cv::Mat(), cv::Point(-1,-1));
} else {
gray = gray0 >= (l+1) * 255 / threshold_level;
}
// Find contours and store them in a list
findContours(gray, contours, CV_RETR_LIST, CV_CHAIN_APPROX_SIMPLE);
// Test contours
std::vector<cv::Point> approx;
int biggestSize = 0;
for (size_t i = 0; i < contours.size(); i++) {
// approximate contour with accuracy proportional
// to the contour perimeter
approxPolyDP(cv::Mat(contours[i]), approx, arcLength(cv::Mat(contours[i]), true)*0.02, true);
if (approx.size() != 4)
continue;
// Note: absolute value of an area is used because
// area may be positive or negative - in accordance with the
// contour orientation
int areaSize = fabs(contourArea(cv::Mat(approx)));
if (approx.size() == 4 && areaSize > biggestSize)
biggestSize = areaSize;
cv::RotatedRect boundingRect = cv::minAreaRect(approx);
float aspectRatio = boundingRect.size.width / boundingRect.size.height;
cv::Rect boundingRect2 = cv::boundingRect(approx);
float aspectRatio2 = (float)boundingRect2.width / (float)boundingRect2.height;
bool convex = isContourConvex(cv::Mat(approx));
if (approx.size() == 4 &&
fabs(contourArea(cv::Mat(approx))) > minArea &&
(aspectRatio >= minAspectRatio && aspectRatio <= maxAspectRatio) &&
isContourConvex(cv::Mat(approx))) {
double maxCosine = 0;
for (int j = 2; j < 5; j++) {
double cosine = fabs(angle(approx[j%4], approx[j-2], approx[j-1]));
maxCosine = MAXIMUM(maxCosine, cosine);
}
double area = fabs(contourArea(cv::Mat(approx)));
if (maxCosine < 0.3) {
squares.push_back(approx);
}
}
}
}
After Canny-step the image looks like this:
It seems fine to me but for some reason rectangle is not detected. Can anyone explain if there is something wrong with my parameters?
My second approach was to use OpenCV Hough line detection, basically using the same code as above, for Canny image I then call HoughLines function. It gives me quite a few lines as I had to lower threshold to detect vertical lines. The result looks like this:
The problem is that there are some many lines. How can I find out the lines that are touching the sides of blue rectangle as shown in first image?
Or is there a better approach to detect a screen?
First of all, find maximal area contour reference, then compure min area rectangle reference, divide contour area by rectangle area, if it close enough to 1 then your contour similar to rectangle. This will be your required contour and rectangle.
I am struggling with finding the appropriate contour algorithm for a low quality image. The example image shows a rock scene:
What I am trying to achieve is to find contours arround features such as:
light areas
dark areas
grey1 areas
grey2 areas
etc. until grey-n areas
(The number of areas shall be a parameter of choice)
I do not want to take a simple binary-threshold but rather use some sort of contour-finding (for example watershed or other). The major feature-lines shall be kept, noise within a feature-are can be flattened.
The result of my code can be seen on the images to the right.
Unfortunately, as you can easily tell, the colors do not really represent the original large-scale image features! For example: check out the two areas that I circled with red - these features are almost completely flooded with another color. What I imagine is that at least the very light and the very dark areas are covered by its own color.
cv::Mat cv_src = cv::imread(argv[1]);
cv::Mat output;
cv::Mat cv_src_gray;
cv::cvtColor(cv_src, cv_src_gray, cv::COLOR_RGB2GRAY);
double clipLimit = 0.1;
cv::Size titleGridSize = cv::Size(8,8);
cv::Ptr<cv::CLAHE> clahe = cv::createCLAHE(clipLimit, titleGridSize);
clahe->apply(cv_src_gray, output);
cv::equalizeHist(output, output);
cv::cvtColor(output, cv_src, cv::COLOR_GRAY2RGB);
// Create binary image from source image
cv::Mat bw;
cv::cvtColor(cv_src, bw, cv::COLOR_BGR2GRAY);
cv::threshold(bw, bw, 180, 255, cv::THRESH_BINARY);
// Perform the distance transform algorithm
cv::Mat dist;
cv::distanceTransform(bw, dist, cv::DIST_L2, CV_32F);
// Normalize the distance image for range = {0.0, 1.0}
cv::normalize(dist, dist, 0, 1., cv::NORM_MINMAX);
// Threshold to obtain the peaks
cv::threshold(dist, dist, .2, 1., cv::THRESH_BINARY);
// Create the CV_8U version of the distance image
cv::Mat dist_8u;
dist.convertTo(dist_8u, CV_8U);
// Find total markers
std::vector<std::vector<cv::Point> > contours;
cv::findContours(dist_8u, contours, cv::RETR_EXTERNAL, cv::CHAIN_APPROX_SIMPLE);
int ncomp = contours.size();
// Create the marker image for the watershed algorithm
cv::Mat markers = cv::Mat::zeros(dist.size(), CV_32S);
// Draw the foreground markers
for (int i = 0; i < ncomp; i++)
cv::drawContours(markers, contours, i, cv::Scalar::all(i+1), -1);
// Draw the background marker
cv::circle(markers, cv::Point(5,5), 3, CV_RGB(255,255,255), -1);
// Perform the watershed algorithm
cv::watershed(cv_src, markers);
// Generate random colors
std::vector<cv::Vec3b> colors;
for (int i = 0; i < ncomp; i++)
{
int b = cv::theRNG().uniform(0, 255);
int g = cv::theRNG().uniform(0, 255);
int r = cv::theRNG().uniform(0, 255);
colors.push_back(cv::Vec3b((uchar)b, (uchar)g, (uchar)r));
}
// Create the result image
cv::Mat dst = cv::Mat::zeros(markers.size(), CV_8UC3);
// Fill labeled objects with random colors
for (int i = 0; i < markers.rows; i++)
{
for (int j = 0; j < markers.cols; j++)
{
int index = markers.at<int>(i,j);
if (index > 0 && index <= ncomp)
dst.at<cv::Vec3b>(i,j) = colors[index-1];
else
dst.at<cv::Vec3b>(i,j) = cv::Vec3b(0,0,0);
}
}
// Show me what you got
imshow("final_result", dst);
I think you can use a simple clustering such as k-means for this, then examine the cluster centers (or the mean and standard deviations of each cluster). I quickly tried it in matlab.
im = imread('tvBqt.jpg');
gr = rgb2gray(im);
x = double(gr(:));
idx = kmeans(x, 4);
cl = reshape(idx, 600, 472);
figure,
subplot(1, 2, 1), imshow(gr, []), title('original')
subplot(1, 2, 2), imshow(label2rgb(cl), []), title('clustered')
The result:
You could try using SLIC Superpixels. I tried it and showed some good results. You could vary the parameters to get better clustering.
SLIC Superpixels
SLIC Superpixels with OpenCV C++
SLIC Superpixels with OpenCV Python
I wrote a digital OCR for ios.
I have a test image png with two digits 5 and 4.
I find the contours. How do I transfer the contour one at tesseract?
init tesseract:
tess = new tesseract::TessBaseAPI();
tess->Init([dataPath cStringUsingEncoding:NSUTF8StringEncoding], "eng");
tess->SetPageSegMode(tesseract::PSM_SINGLE_CHAR); //<-- !!!!
tess->tesseract::TessBaseAPI::SetVariable("tessedit_char_whitelist", "0123456789");
Function for detect contours:
- (std::vector<std::vector<cv::Point> >)findSquaresInImage:(cv::Mat)_image {
std::vector<std::vector<cv::Point> > squares;
cv::Mat pyr, timg, gray0(_image.size(), CV_8U), gray;
int thresh = 50, N = 11;
cv::pyrDown(_image, pyr, cv::Size(_image.cols/2, _image.rows/2));
cv::pyrUp(pyr, timg, _image.size());
std::vector<std::vector<cv::Point> > contours;
int ch[] = {0, 0};
mixChannels(&timg, 1, &gray0, 1, ch, 1);
for( int l = 0; l < N; l++ ) {
if( l == 0 ) {
cv::Canny(gray0, gray, 0, thresh, 5);
cv::dilate(gray, gray, cv::Mat(), cv::Point(-1,-1));
}
else {
gray = gray0 >= (l+1)*255/N;
}
cv::findContours(gray, contours, CV_RETR_EXTERNAL, CV_CHAIN_APPROX_SIMPLE);
std::vector<cv::Point> approx;
CvRect rec1;
std::string str;
std::map<int,IplImage*> pic_list;
for( size_t i = 0; i < contours.size(); i++ )
{
rec1 = cv::boundingRect(contours[i]);
if (rec1.height > 0.5*gray.rows && rec1.width < 0.756*gray.cols) {
NSLog(#"%d %d %d %d", rec1.width, rec1.height, rec1.x, rec1.y);
cv::approxPolyDP(cv::Mat(contours[i]), approx, arcLength(cv::Mat(contours[i]), true)*0.02, true);
squares.push_back(approx);
}
}
}
return squares; }
function for draw contours:
cv::Mat debugSquares( std::vector<std::vector<cv::Point> > squares, cv::Mat image ) {
for ( int i = 0; i< squares.size(); i++ ) {
// draw contour
cv::drawContours(image, squares, i, cv::Scalar(255,0,0), 1, 8, std::vector<cv::Vec4i>(), 0, cv::Point());
// draw bounding rect
cv::Rect rect = boundingRect(cv::Mat(squares[i]));
cv::rectangle(image, rect.tl(), rect.br(), cv::Scalar(0,255,0), 2, 8, 0);
// draw rotated rect
cv::RotatedRect minRect = minAreaRect(cv::Mat(squares[i]));
cv::Point2f rect_points[4];
minRect.points( rect_points );
for ( int j = 0; j < 4; j++ ) {
cv::line( image, rect_points[j], rect_points[(j+1)%4], cv::Scalar(0,0,255), 1, 8 ); // blue
}
}
return image;
}
method for btn Click:
- (IBAction)onMath:(id)sender {
UIImage *image = [UIImage imageNamed:#"test1.png"];
cv::Mat iMat = [self cvMatFromUIImage:image];
std::vector<std::vector<cv::Point> > sq = [self findSquaresInImage:iMat];
cv::Mat hui = debugSquares(sq, iMat);
image = [self UIImageFromCVMat:hui];
self.imView.image = image;
}
image after:
link to project on github: https://github.com/MaxPatsy/iORC
Can you check this answer here
I described some tips for preparing images for Tesseract here: Using tesseract to recognize license plates
In your example, there are several things going on...
You need to get the text to be black and the rest of the image white (not the reverse). That's what character recognition is tuned on. Grayscale is ok, as long as the background is mostly full white and the text mostly full black; the edges of the text may be gray (antialiased) and that may help recognition (but not necessarily - you'll have to experiment)
One of the issues you're seeing is that in some parts of the image, the text is really "thin" (and gaps in the letters show up after thresholding), while in other parts it is really "thick" (and letters start merging). Tesseract won't like that :) It happens because the input image is not evenly lit, so a single threshold doesn't work everywhere. The solution is to do "locally adaptive thresholding" where a different threshold is calculated for each neighbordhood of the image. There are many ways of doing that, but check out for example:
Adaptive gaussian thresholding in OpenCV with cv2.adaptiveThreshold(...,cv2.ADAPTIVE_THRESH_GAUSSIAN_C,...)
Local Otsu's method
Local adaptive histogram equalization
Another problem you have is that the lines aren't straight. In my experience Tesseract can handle a very limited degree of non-straight lines (a few percent of perspective distortion, tilt or skew), but it doesn't really work with wavy lines. If you can, make sure that the source images have straight lines :) Unfortunately, there is no simple off-the-shelf answer for this; you'd have to look into the research literature and implement one of the state of the art algorithms yourself (and open-source it if possible - there is a real need for an open source solution to this). A Google Scholar search for "curved line OCR extraction" will get you started, for example:
Text line Segmentation of Curved Document Images
Lastly: I think you would do much better to work with the python ecosystem (ndimage, skimage) than with OpenCV in C++. OpenCV python wrappers are ok for simple stuff, but for what you're trying to do they won't do the job, you will need to grab many pieces that aren't in OpenCV (of course you can mix and match). Implementing something like curved line detection in C++ will take an order of magnitude longer than in python (* this is true even if you don't know python).
Good luck!
I would like to detect a movement in a tile of grids defined by N*N, I've tried a way which is done by https://stackoverflow.com/users/724461/andrey-kamaev
and it shown in the following code, but the result isn't accurate at all, I would like to do a more accurate approach.
cv::Sobel(input, sobel, CV_32F, 1, 1);
int h = input.rows / NUM_BLOCK_ROWS;
int w = input.rows / NUM_BLOCK_COLUMNS;
float pos=0;
for (int r = 0; r<NUM_BLOCK_ROWS; r++)
for(int c=0; c<NUM_BLOCK_COLUMNS; c++)
{
cv::Scalar weight = cv::sum(sobel(cv::Range(h*r, (r+1)*h), cv::Range(c*w, (c+1)*w)));
if (weight[0] + weight[1] > 60) {
input(cv::Range(h*r, (r+1)*h-1), cv::Range(c*w, (c+1)*w-1)).setTo(cv::Scalar(0,0,255));
}
}
I used Frame Differencing approach and it worked.
What about optical flow? OpenCV's implementation is here