OpenCV - Determine position of wrist

I need to determine the position of the wrist in a frame showing part of a human forearm and the matching hand.
So far I have isolated the hand & arm and I'm able to draw a polygon & hull curve around it:
I achieve this result by simple binary thresholding and automatic contour fitting.
Based on this I want to extract the location of the wrist. This needs to work for all orientations of the hand/wrist.
However, being fairly new to OpenCV, it is unclear to me what the best way is to determine/isolate the location of the wrist. I have various ideas for this:
The arm section is fairly straight. Maybe a simple line detection over the contour polygon would do the job of getting straight lines for the forearm.
Somehow split the contour polygon into multiple sections. It is fair to assume that the wrist is where the distance between the two contour edges running along the forearm is smallest. Is there a way to find that point along the polygon and then "cut" or "split" the polygon into two? From there I'd have one polygon representing a rectangle, which should be easy to work with.
Use an approach that iterates along the main axis of the polygon fitted with fitLine(), measuring the distance between two opposing points of the polygon and finding the shortest distance (see the sketch after this list).
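For the third idea, a minimal sketch of walking along the fitLine() axis and measuring the contour width at each step (the bin size is an assumption, and in practice you would skip the bins near the two ends of the axis, where the width naturally shrinks):

#include <opencv2/imgproc.hpp>
#include <algorithm>
#include <cfloat>
#include <map>
#include <vector>
using namespace cv;

// Sketch: project every contour point onto the main axis, bin the points by
// their position along the axis, and measure the spread perpendicular to the
// axis per bin; the narrowest bin is a wrist candidate.
Point2f findNarrowestSection(const std::vector<Point>& contour)
{
    Vec4f lineParams;
    fitLine(contour, lineParams, DIST_L2, 0, 0.01, 0.01);
    Point2f dir(lineParams[0], lineParams[1]);    // unit direction of the main axis
    Point2f origin(lineParams[2], lineParams[3]); // a point on the axis

    const float binSize = 5.0f; // pixels along the axis per bin (tune experimentally)
    std::map<int, std::pair<float, float>> widthPerBin; // bin -> (min, max) offset from axis

    for (const Point& p : contour) {
        Point2f d = Point2f(p) - origin;
        float along  = d.x * dir.x + d.y * dir.y;   // position along the axis
        float across = -d.x * dir.y + d.y * dir.x;  // signed distance from the axis
        int bin = cvRound(along / binSize);
        auto it = widthPerBin.find(bin);
        if (it == widthPerBin.end())
            widthPerBin[bin] = {across, across};
        else {
            it->second.first  = std::min(it->second.first,  across);
            it->second.second = std::max(it->second.second, across);
        }
    }

    // Pick the bin with the smallest width and map it back to a point on the axis.
    float bestWidth = FLT_MAX;
    int bestBin = 0;
    for (const auto& kv : widthPerBin) {
        float width = kv.second.second - kv.second.first;
        if (width < bestWidth) { bestWidth = width; bestBin = kv.first; }
    }
    return origin + dir * (bestBin * binSize); // approximate wrist location
}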
Unfortunately I lack the experience to make the correct choice here - or even come up with a better idea.
I'd appreciate any kind of ideas and pointers towards achieving this. I could find a lot of valuable research material on hand detection and tracking and basic body-part matching using Haar cascades. Unfortunately, I couldn't find a way to apply those techniques to my use case.
Here's some raw material (images & videos) to work with: (Google Drive Link!): https://drive.google.com/drive/folders/1hU4hGw5dYtVrcXTq8TYWCWfcLWjT-ZJU?usp=sharing

Approach: I took advantage of the arm side: the thickness of the arm is almost the same until it reaches the hand.
Assumption: I wrote the code assuming the arm enters the frame vertically; otherwise my code may not work. I tried all of the images you shared and it works properly for all of them.
My steps:
Apply a simple segmentation to get only the needed part of the image.
Count the non-black pixels for each column, beginning from the arm side.
As long as a column's count is similar to the previous columns' counts, you are still on the arm; when you hit a column that differs, you have reached the wrist.
Note: I chose the threshold experimentally.
Here are the results and code:
Input Image:
After segmentation:
Output after algorithm:
Code:
#include "opencv2/highgui/highgui.hpp"
#include "opencv2/imgproc/imgproc.hpp"
#include <iostream>
#include <cstdlib>
using namespace cv;
using namespace std;
int main() {
Mat src, gray, blur_image, threshold_output;
// take input image
src = imread("/ur/image/directory/image_01.jpg", 1);
// convert to grayscale
cvtColor(src, gray, COLOR_BGR2GRAY);
// add blurring to the input image
medianBlur(gray,gray,9);
// Apply a segmentation to arm
for(int i=0; i<gray.rows; i++)
for(int j=0;j<gray.cols; j++)
if(gray.at<uchar>(Point(j,i))<110)
gray.at<uchar>(Point(j,i)) = 0;
//Creat a bgr mat to show the results clearly
Mat copy_gray = gray;
cvtColor(copy_gray,copy_gray,CV_GRAY2BGR);
double sum = 0;
int loop_cnt = 0,enter = 1;
Point first,second;
for(int j=gray.cols-1; j>=0; j--)
{
loop_cnt++;
int counter = 0,ff=1,enter2 = 1;
for(int i=0;i<gray.rows; i++)
{
if(gray.at<uchar>(Point(j,i))!=0 && enter)
{
if(ff)
first = Point(j,i);
counter++;
ff = 0;
}
if(!ff && gray.at<uchar>(Point(j,i))==0 && enter2)
{
second = Point(j,i);
enter2 = 0;
}
}
sum += (double)counter;
double average = sum/(double)(loop_cnt);
if(abs(average-counter)>20.0 && enter)
{
line(copy_gray,Point(j,0),Point(j,500),Scalar(0,255,0),5);
enter = 0;
}
}
int distance = norm(second-first)/2;
circle(copy_gray,Point(first.x,first.y+distance),20,Scalar(0,0,255),5);
imshow("Result",copy_gray);
waitKey(0);
return 0;
}

Related

Detecting a hand above a chessboard using opencv

I am developing an Android application for analyzing chess games based on a series of photos. To process the images, I am using OpenCV. My question is: how can I detect that there is a player's hand in a picture? I would like to filter those photos and analyze only the ones showing just the chessboard.
So far I managed to get the Canny output. From an image like this
original image
I am able to get this Canny result.
But I have no idea what I can do next...
The code I used to get Canny:
Mat gray, blur, cannyed;
cvtColor(img, gray, CV_BGR2GRAY);
GaussianBlur(gray, blur, Size(7, 7), 0, 0);
Canny(blur, cannyed, 50, 100, 3);
I would highly appreciate any ideas and advice on what to do next and what OpenCV functions I can use.
You have a very nice spectrum in the chess board. A hand in it messes up the frequencies built up by the regular transitions between the black and white squares. Try moving a bigger square (say, 4.5 x 4.5 board squares in size) around and see what happens to the frequencies.
Another approach, if you have the sequence of pictures taken as a movie, is to analyse the motion. Take the difference of consecutive frames (low-pass filter them a bit first) to detect motion. Filter the motion in time (over several frames). Then threshold the motion to get a binary image. Erode the binary shapes to filter out small moving objects (noise, chess pieces) so you can detect whether any larger moving shape (e.g. a hand) is on the board.
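A rough sketch of that motion-based idea (the video source, blur size and both thresholds are assumptions to tune; the temporal filtering over several frames is left out here):

#include <opencv2/opencv.hpp>
using namespace cv;

int main()
{
    VideoCapture cap("game.avi"); // assumed input; use your own source
    Mat frame, gray, prevGray, diff, motionMask;

    while (cap.read(frame)) {
        cvtColor(frame, gray, COLOR_BGR2GRAY);
        GaussianBlur(gray, gray, Size(7, 7), 0); // low-pass filter first

        if (!prevGray.empty()) {
            absdiff(gray, prevGray, diff);                        // difference of consecutive frames
            threshold(diff, motionMask, 25, 255, THRESH_BINARY);  // binarize the motion
            erode(motionMask, motionMask, Mat(), Point(-1, -1), 3); // drop small movers (noise, pieces)

            // If a large moving blob remains, assume a hand is over the board
            // and skip this frame for chess analysis.
            if (countNonZero(motionMask) > 2000) { // area threshold chosen experimentally
                prevGray = gray.clone();
                continue;
            }
        }
        prevGray = gray.clone();
    }
    return 0;
}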
Here, after Canny edge detection, I tried morphological operations to extract the horizontal and vertical lines.
Mat horizontal = cannyed.clone();
// Specify size on horizontal axis
int horizontalsize = horizontal.cols / 60;
// Create structure element for extracting horizontal lines through morphology operations
Mat horizontalStructure = getStructuringElement(MORPH_RECT, Size(horizontalsize,1));
erode(horizontal, horizontal, horizontalStructure, Point(-1, -1),2);
dilate(horizontal, horizontal, horizontalStructure, Point(-1, -1),1);
imshow("horizontal",horizontal);
Mat vertical = cannyed.clone();
// Specify size on vertical axis
int verticalsize = vertical.cols / 60;
// Create structure element for extracting vertical lines through morphology operations
Mat verticalStructure = getStructuringElement(MORPH_RECT, Size(1,verticalsize));
erode(vertical, vertical, verticalStructure, Point(-1, -1));
dilate(vertical, vertical, verticalStructure, Point(-1, -1),2);
imshow("vertical",vertical);
The results are:
Horizontal lines in the chess board
Then, from the figure you can see there is a regular interval between the lines. In the area where the hand is present, the interval between the lines is larger.
In that location, if contour detection is done, the hand (or any object) over the chess board can be detected.
This works for any object placed over the chess board.
Thank you all very much for your suggestions.
So I solved the problem mostly using Gowthaman's method. First I use his code to generate vertical and horizontal lines. Then I combine them like this:
Mat combined = vertical + horizontal;
So I get something like this when there is no hand,
or like this when there is a hand.
Next I count white pixels using the code:
int GetPixelCount(Mat image, uchar color)
{
    int result = 0;
    for (int i = 0; i < image.rows; i++)
    {
        for (int j = 0; j < image.cols; j++)
        {
            if (image.at<uchar>(Point(j, i)) == color)
                result++;
        }
    }
    return result;
}
I do that for every photo in the series. The first photo is always without a hand, so I use it as a template. If the current photo has less than 98% of the template's white pixels, I deduce there is a hand (or something else) in it; a rough sketch of that check is below.
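The comparison described could then look roughly like this (templateImage stands for the first, hand-free photo; the variable names are illustrative, not the poster's actual code):

// Computed once from the first (hand-free) photo of the series.
int templateCount = GetPixelCount(templateImage, 255);

// For every later photo, "combined" is vertical + horizontal as above.
int currentCount = GetPixelCount(combined, 255);
bool handPresent = currentCount < 0.98 * templateCount;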
Most likely this is not an optimal method and has lots of weaknesses, but it is very simple and works for me just fine :)

directional edge detection in OpenCV

I would like to detect edges that have a certain angle/orientation.
Adapting from a post on SO, I've figured out how to use the OpenCV magnitude, phase and Sobel functions to filter out unwanted edge points, then use the magnitude image (with the phase image as a condition) to output the edge points.
However, the result is not similar to the Canny edge function. It's good that the edges with unwanted angles are filtered out, but the detected edges are blobs of points, not thin line edges.
The left edge image is also plotted after findContours is used, but this barely helps.
1) what else should be added to mimic Canny processing?
2) As for the purpose of directional edge detection, is this approach more robust than using a directional kernel other than typical Sobel ones?
Thank you!
Edit 01:
forgot to put my code link
Alternatively, you can try LSD (http://www.ipol.im/pub/art/2012/gjmr-lsd/). It outputs lines as two point pairs, so directional filtering is also possible.
There's also another line segment implementation at http://sourceforge.net/projects/lswms/ though the LSD link above has better results.
If you want a single-pixel edge, you would need to do skeletonization/thinning.
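If you go the thinning route, a classic morphological skeleton is only a few lines (a generic sketch, independent of the LSD code below; cv::ximgproc::thinning in opencv_contrib is another option if you have the contrib modules):

#include <opencv2/imgproc.hpp>
using namespace cv;

// Morphological skeletonization of a binary image (non-zero = foreground).
Mat skeletonize(const Mat& binary)
{
    Mat img = binary.clone();
    Mat skel = Mat::zeros(img.size(), CV_8UC1);
    Mat element = getStructuringElement(MORPH_CROSS, Size(3, 3));
    Mat eroded, opened, temp;

    while (countNonZero(img) > 0) {
        erode(img, eroded, element);
        dilate(eroded, opened, element);   // opening of the current image
        subtract(img, opened, temp);       // pixels removed by the opening
        bitwise_or(skel, temp, skel);      // accumulate them into the skeleton
        eroded.copyTo(img);
    }
    return skel;
}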
Edit:
Rename lsd.c to lsd.cpp when you are compiling. I used version 1.6 attached in the URL. Code and results below; you can tweak the thresholds to suppress the small segments as well.
#include "opencv2/opencv.hpp"
using namespace cv;
#include "lsd.h"
void lsd_call(Mat& im)
{
Mat gray;
cvtColor(im,gray,CV_BGR2GRAY);
Mat imgdouble;
gray.convertTo(imgdouble,CV_64FC1);
double * image;
double * out;
int x,y,i,j,n;
out = lsd(&n,(double*)imgdouble.data,imgdouble.cols,imgdouble.rows);
Mat lines = im.clone();
Mat lines_binary = Mat::zeros(gray.size(),CV_8UC1);
for(i=0;i<n;i++)
{
double x1,y1,x2,y2,w;
x1 = out[7*i+0];
y1 = out[7*i+1];
x2 = out[7*i+2];
y2 = out[7*i+3];
w = out[7*i+4];
double length = sqrt(pow(x1-x2,2)+pow(y1-y2,2));
double angle = atan2(y2 - y1, x2 - x1) * 180 / CV_PI;
if(angle<180 && angle>90)
{
line(lines,Point2d(out[7*i+0],out[7*i+1]),Point2d(out[7*i+2],out[7*i+3]),Scalar (0,0,255));
line(lines_binary,Point2d(out[7*i+0],out[7*i+1]),Point2d(out[7*i+2],out[7*i+3]) ,Scalar(255));
}
if(length>75)
{
//line(todraw,Point2d(out[7*i+0],out[7*i+1]),Point2d(out[7*i+2],out[7*i+3]), Scalar(0,0,255),out[7*i+4]);
}
}
imshow("lines",lines);
imshow("lines_binary",lines_binary);
imwrite("c:/data/lines.jpg",lines);
imwrite("c:/data/linesbinary.jpg",lines_binary);
free( (void *) out );
}
int main(int argc,char** argv )
{
Mat im = imread("c:/data/lines.png");
lsd_call(im);
waitKey(0);
}
1)
The Canny edge detector produces thin edges because of non-maximum suppression over the neighbours along the gradient direction.
In order to mimic that, you need to keep only the edge pixels with the maximum edge response along that direction; blobs of points can be prevented this way.
As you can probably guess, the weaker edges can then be suppressed with a threshold you define.
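To make that concrete, here is a rough sketch of Sobel magnitude/phase followed by a crude non-maximum suppression along the gradient direction (the angle range, magnitude threshold and 3x3 Sobel size are assumptions):

#include <opencv2/opencv.hpp>
#include <cmath>
using namespace cv;

// Keep only edge pixels whose gradient direction falls in [loDeg, hiDeg]
// (folded to 0..180) and which are local maxima along that direction.
Mat directionalEdges(const Mat& gray, float loDeg, float hiDeg, float minMag)
{
    Mat dx, dy, mag, ang;
    Sobel(gray, dx, CV_32F, 1, 0, 3);
    Sobel(gray, dy, CV_32F, 0, 1, 3);
    magnitude(dx, dy, mag);
    phase(dx, dy, ang, true); // angles in degrees, 0..360

    Mat edges = Mat::zeros(gray.size(), CV_8UC1);
    for (int y = 1; y < gray.rows - 1; y++) {
        for (int x = 1; x < gray.cols - 1; x++) {
            float m = mag.at<float>(y, x);
            float a = ang.at<float>(y, x);
            if (m < minMag) continue;
            float folded = (a >= 180.f) ? a - 180.f : a;   // direction modulo 180
            if (folded < loDeg || folded > hiDeg) continue;

            // step one pixel along the gradient direction and keep local maxima only
            int sx = cvRound(cos(a * CV_PI / 180.0));
            int sy = cvRound(sin(a * CV_PI / 180.0));
            if (m >= mag.at<float>(y + sy, x + sx) &&
                m >= mag.at<float>(y - sy, x - sx))
                edges.at<uchar>(y, x) = 255;
        }
    }
    return edges;
}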
2) I can't give a definite answer to that, sadly. For the angle given, the kernels might be limited by discretization, so for many different angles this approach 'should' be better.

Automatic color calibration for object tracker

This is my first post, so forgive me if I miss something.
I have been playing around with OpenCV 2 in Visual Studio C++. I have a basic object tracker working: it applies a Gaussian blur, converts to HSV, thresholds using trackbars, then erodes and dilates. Now I want to set up some way of easily calibrating the color to be thresholded without using the trackbars.
I've tried setting up an area of interest and taking the average BGR or HSV values (I've tried both ways), then, if needed, using trackbars to make finer adjustments, but it does not seem to work. Am I on the right track, or is there a better way?
I have basically followed this video to get where I am.
https://www.youtube.com/watch?v=bSeFrPrqZ2A
I am not looking for a code to copy and paste. I am just looking for an Algorithm or explanation of a way to do it. Cheers
EDIT
Sorry, I'll try and clear it up. What I have done is write an object-tracking program for a home robot-vision project. I just want to make it easier to calibrate which color is to be thresholded. At the moment I use trackbars to set the min and max HSV values for thresholding, then use erode and dilate to clean up the binary image, before using cv::findContours and cv::moments to find the centroid of the largest contour.
What I have tried is setting a small 40x40 pixel square in the center of the screen. When, for example, I hold a green ball in this square and hit the spacebar, I cycle through each pixel in the square and get each separate Hue, Saturation and Value um...value. Then I take the mode of each and use that to set the min and max threshold values.
Here is a segment of the code
if(cv::waitKey(20) == 32){ // wait for spacebar
    int count = 0;
    cv::Mat roi_Crop = frame_HSV(roi); //create cropped image from frame_HSV
    for(int i=0; i<roi_Crop.rows; i++) // cycle through each pixel
    {
        for(int j=0; j<roi_Crop.cols; j++)
        {
            Hue[count] = roi_Crop.at<cv::Vec3b>(i,j)[0];
            Sat[count] = roi_Crop.at<cv::Vec3b>(i,j)[1];
            Val[count] = roi_Crop.at<cv::Vec3b>(i,j)[2];
            count++;
        }
    }
    HSV_Mode[0] = findMode(Hue);
    HSV_Mode[1] = findMode(Sat);
    HSV_Mode[2] = findMode(Val);
}
I hope this helps.
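One way to turn the sampled modes into thresholding bounds is to add a fixed margin around each mode and feed the result to inRange (a sketch; the margins are illustrative starting points, not calibrated values):

#include <opencv2/imgproc.hpp>
#include <algorithm>

// Build an inRange() mask from the sampled HSV modes.
cv::Mat maskFromModes(const cv::Mat& frame_HSV, const int HSV_Mode[3])
{
    const int hMargin = 10, sMargin = 60, vMargin = 60; // tune per object/lighting
    cv::Scalar lower(std::max(HSV_Mode[0] - hMargin, 0),
                     std::max(HSV_Mode[1] - sMargin, 0),
                     std::max(HSV_Mode[2] - vMargin, 0));
    cv::Scalar upper(std::min(HSV_Mode[0] + hMargin, 179),
                     std::min(HSV_Mode[1] + sMargin, 255),
                     std::min(HSV_Mode[2] + vMargin, 255));
    cv::Mat mask;
    cv::inRange(frame_HSV, lower, upper, mask); // then erode/dilate as before
    return mask;
}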

Reshaping noisy coin into a circle form

I'm doing coin detection using JavaCV (an OpenCV wrapper), but I have a little problem when the coins are connected. If I try to erode them to separate the coins, they lose their circular form, and if I try to count the pixels inside each coin there can be problems, so that some connected coins can be miscounted as one bigger coin. What I want to do is first reshape them into circles (with the radius of each coin) and then count the pixels inside them.
Here is my thresholded image:
And here is eroded image:
Any suggestions? Or is there any better way to break bridges between coins?
It looks similar to a problem I recently had, separating bacterial colonies growing on agar plates.
I performed a distance transform on the thresholded image (in your case you will need to invert it).
Then I found the peaks of the distance map (by calculating the difference between the dilated distance map and the distance map and finding the zero values).
Then, I assumed each peak to be the centre of a circle (coin) and the value of the peak in the distance map to be the radius of the circle.
Here is the result of your image after this pipeline:
I am new to OpenCV and C++, so my code is probably very messy, but I did this:
int main( int argc, char** argv ){
    cv::Mat objects, distance, peaks, results;
    std::vector<std::vector<cv::Point> > contours;

    objects = cv::imread("CUfWj.jpg");
    objects.copyTo(results);
    cv::cvtColor(objects, objects, CV_BGR2GRAY);

    //THIS IS THE LINE TO BLUR THE IMAGE CF COMMENTS OF THIS POST
    cv::blur(objects, objects, cv::Size(3,3));
    cv::threshold(objects, objects, 125, 255, cv::THRESH_BINARY_INV);

    /* Applies a distance transform to "objects".
     * The result is saved in "distance" */
    cv::distanceTransform(objects, distance, CV_DIST_L2, CV_DIST_MASK_5);

    /* In order to find the local maxima, "distance"
     * is subtracted from the result of the dilatation of
     * "distance". All the peaks keep the same value */
    cv::dilate(distance, peaks, cv::Mat(), cv::Point(-1,-1), 3);
    cv::dilate(objects, objects, cv::Mat(), cv::Point(-1,-1), 3);

    /* Now all the peaks should be exactly 0 */
    peaks = peaks - distance;

    /* And the non-peaks 255 */
    cv::threshold(peaks, peaks, 0, 255, cv::THRESH_BINARY);
    peaks.convertTo(peaks, CV_8U);

    /* Only the zero values of "peaks" that are non-zero
     * in "objects" are the real peaks */
    cv::bitwise_xor(peaks, objects, peaks);

    /* The peaks that are distant from less than
     * 2 pixels are merged by dilatation */
    cv::dilate(peaks, peaks, cv::Mat(), cv::Point(-1,-1), 1);

    /* In order to map the peaks, findContours() is used.
     * The results are stored in "contours" */
    cv::findContours(peaks, contours, CV_RETR_CCOMP, CV_CHAIN_APPROX_SIMPLE);
    cv::imwrite("CUfWj2.jpg", peaks);

    /* The next steps are applied only if, at least,
     * one contour exists */
    if(contours.size() > 0){
        /* Defines vectors to store the moments of the peaks, the centers
         * and the theoretical circles of the objects of interest */
        std::vector<cv::Moments> moms(contours.size());
        std::vector<cv::Point> centers(contours.size());
        std::vector<cv::Vec3f> circles(contours.size());
        float rad, x, y;

        /* Calculates the moments of each peak and then the center of the peak,
         * which is approximately the center of each object of interest */
        for(unsigned int i = 0; i < contours.size(); i++) {
            moms[i] = cv::moments(contours[i]);
            centers[i] = cv::Point(moms[i].m10/moms[i].m00, moms[i].m01/moms[i].m00);
            x = (float)(centers[i].x);
            y = (float)(centers[i].y);
            if(x > 0 && y > 0){
                /* The distance-map value at the peak is (roughly) the radius */
                rad = (float)(distance.at<float>((int)y, (int)x) + 1);
                circles[i][0] = x;
                circles[i][1] = y;   // Vec3f only has indices 0..2
                circles[i][2] = rad;
                cv::circle(results, centers[i], rad+1, cv::Scalar(255, 0, 0), 2, 4, 0);
            }
        }
        cv::imwrite("CUfWj2.jpg", results);
    }
    return 0;
}
You don't need to erode, just a good set of params for cvHoughCircles():
The code used to generate this image came from my other post: Detecting Circles, with these parameters:
CvSeq* circles = cvHoughCircles(gray, storage, CV_HOUGH_GRADIENT, 1, gray->height/12, 80, 26);
OpenCV has a function called HoughCircles() that can be applied to your case, without separating the different circles. Can you call it from JavaCV ? If so, it will do what you want (detecting and counting circles), bypassing your separation problem.
The main point is to detect the circles accurately without separating them first. Other algorithms (such as template matching) can be used instead of the generalized Hough transform, but you have to take into account the different sizes of the coins.
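For reference, with the C++ API the call could look roughly like this (the parameters mirror the answer above and still need tuning for your images):

#include <opencv2/opencv.hpp>
#include <vector>
using namespace cv;

// gray: 8-bit single-channel input; a light blur helps suppress false circles.
std::vector<Vec3f> detectCoins(const Mat& gray)
{
    Mat blurred;
    GaussianBlur(gray, blurred, Size(9, 9), 2);

    std::vector<Vec3f> circles; // each element is (x, y, radius)
    HoughCircles(blurred, circles, HOUGH_GRADIENT,
                 1,              // accumulator resolution (same as input)
                 gray.rows / 12, // minimum distance between circle centres
                 80, 26);        // Canny high threshold, accumulator threshold
    return circles;
}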
The usual approach for erosion-based object recognition is to label continuous regions in the eroded image and then re-grow them until they match the regions in the original image. Hough circles is a better idea in your case, though.
After detecting the joined coins, I recommend applying morphological operations to classify areas as "definitely coin" and "definitely not coin", applying a distance transform, then running the watershed to determine the boundaries. This scenario is actually the demonstration example for the watershed algorithm in OpenCV; perhaps it was created in response to this question.
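A rough sketch of that pipeline, following the standard OpenCV watershed example (the dilation count and the 0.5 threshold factor are assumptions to tune):

#include <opencv2/opencv.hpp>
using namespace cv;

// binary: coins white on black; colorImage: the original 3-channel image.
int countCoinsWatershed(const Mat& binary, const Mat& colorImage)
{
    // "Definitely not coin": everything outside the dilated coin regions.
    Mat sureBg;
    dilate(binary, sureBg, Mat(), Point(-1, -1), 3);

    // "Definitely coin": strong peaks of the distance transform.
    Mat dist, sureFg;
    distanceTransform(binary, dist, DIST_L2, 5);
    double maxVal;
    minMaxLoc(dist, nullptr, &maxVal);
    threshold(dist, sureFg, 0.5 * maxVal, 255, THRESH_BINARY);
    sureFg.convertTo(sureFg, CV_8U);

    // Unknown region = dilated area minus sure foreground; label the sure coins.
    Mat unknown;
    subtract(sureBg, sureFg, unknown);
    Mat markers;
    int nLabels = connectedComponents(sureFg, markers);
    markers += 1;              // shift labels so background is 1
    markers.setTo(0, unknown); // unknown pixels get label 0 for watershed to fill

    watershed(colorImage, markers); // boundaries end up as -1 in "markers"
    return nLabels - 1;             // number of coin markers found
}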

track eye pupil in a video

I am working on a project aimed at tracking the eye pupil. For this I have made a head-mounted system that captures images of the eye. Having completed the hardware portion, I am stuck on the software part. I am using OpenCV. Please let me know what would be the most efficient way to track the pupil; HoughCircles didn't perform well.
After that I also tried an HSV filter; here are the code and a
link to a screenshot of the raw image and the processed one. Please help me resolve this issue. The link also contains a video of the eye pupil that I am using in this code.
https://picasaweb.google.com/118169326982637604860/16November2011?authuser=0&authkey=Gv1sRgCPKwwrGTyvX1Aw&feat=directlink
Code:
include "cv.h"
include"highgui.h"
IplImage* GetThresholdedImage(IplImage* img)
{
IplImage *imgHSV=cvCreateImage(cvGetSize(img),8,3);
cvCvtColor(img,imgHSV,CV_BGR2HSV);
IplImage *imgThresh=cvCreateImage(cvGetSize(img),8,1);
cvInRangeS(imgHSV,cvScalar(0, 84, 0, 0),cvScalar(179, 256, 11, 0),imgThresh);
cvReleaseImage(&imgHSV);
return imgThresh;
}
void main(int *argv,char **argc)
{
IplImage *imgScribble= NULL;
char c=0;
CvCapture *capture;
capture=cvCreateFileCapture("main.avi");
if(!capture)
{
printf("Camera could not be initialized");
exit(0);
}
cvNamedWindow("Simple");
cvNamedWindow("Thresholded");
while(c!=32)
{
IplImage *img=0;
img=cvQueryFrame(capture);
if(!img)
break;
if(imgScribble==NULL)
imgScribble=cvCreateImage(cvGetSize(img),8,3);
IplImage *timg=GetThresholdedImage(img);
CvMoments *moments=(CvMoments*)malloc(sizeof(CvMoments));
cvMoments(timg,moments,1);
double moment10 = cvGetSpatialMoment(moments, 1, 0);
double moment01 = cvGetSpatialMoment(moments, 0, 1);
double area = cvGetCentralMoment(moments, 0, 0);
static int posX = 0;
static int posY = 0;
int lastX = posX;
int lastY = posY;
posX = moment10/area;
posY = moment01/area;
// Print it out for debugging purposes
printf("position (%d,%d)\n", posX, posY);
// We want to draw a line only if its a valid position
if(lastX>0 && lastY>0 && posX>0 && posY>0)
{
// Draw a yellow line from the previous point to the current point
cvLine(imgScribble, cvPoint(posX, posY), cvPoint(lastX, lastY), cvScalar(0,255,255), 5);
}
// Add the scribbling image and the frame...
cvAdd(img, imgScribble, img);
cvShowImage("Simple",img);
cvShowImage("Thresholded",timg);
c=cvWaitKey(3);
cvReleaseImage(&timg);
delete moments;
}
//cvReleaseImage(&img);
cvDestroyWindow("Simple");
cvDestroyWindow("Thresholded");
}
I am able to track the eye and find the center coordinates of the pupil precisely.
First I thresholded the image taken by the head-mounted camera. After that I used a contour-finding algorithm and then computed the centroid of all the contours. This gives me the center coordinates of the eye pupil; this method works fine in real time and also detects eye blinking with very good accuracy.
Now my aim is to embed this feature into a game (a racing game), in which if I look left/right the car moves left/right, and if I blink the car slows down. How could I proceed now? Would I need a game engine to do that?
I have heard of some open-source game engines compatible with Visual Studio 2010 (Unity etc.). Is this feasible? If yes, how should I proceed?
I am one of the developers of SimpleCV. We maintain an open-source python library for computer vision. You can download it at SimpleCV.org. SimpleCV is great for solving these types of problems by hacking on the command line. I was able to extract the pupil in only a couple lines of code. Here you go:
img = Image("eye4.jpg") # load the image
bm = BlobMaker() # create the blob extractor
# invert the image so the pupil is white, threshold the image, and invert again
# and then extract the information from the image
blobs = bm.extractFromBinary(img.invert().binarize(thresh=240).invert(),img)
if(len(blobs)>0): # if we got a blob
blobs[0].draw() # the zeroth blob is the largest blob - draw it
locationStr = "("+str(blobs[0].x)+","+str(blobs[0].y)+")"
# write the blob's centroid to the image
img.dl().text(locationStr,(0,0),color=Color.RED)
# save the image
img.save("eye4pupil.png")
# and show us the result.
img.show()
Here are the results.
So your next steps are to use some sort of tracker, like a Kalman filter, to track the pupil robustly. You may want to model the eye as a sphere and track the pupil's centroid in spherical coordinates (i.e. theta and phi). You will also want to write a bit of code to detect blink events so the system doesn't go all wonky when the user blinks. I suggest using a Canny edge detector to find the largest horizontal lines in the image and assuming those are the eyelids. I hope this helps, and please let us know how your work progresses.
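As a starting point, a minimal constant-velocity Kalman filter for the pupil centroid could look like this (the noise covariances are placeholder values to tune):

#include <opencv2/core.hpp>
#include <opencv2/video/tracking.hpp>
using namespace cv;

// State: (x, y, vx, vy); measurement: (x, y), the pupil centroid per frame.
KalmanFilter makePupilFilter()
{
    KalmanFilter kf(4, 2, 0);
    kf.transitionMatrix = (Mat_<float>(4, 4) <<
        1, 0, 1, 0,
        0, 1, 0, 1,
        0, 0, 1, 0,
        0, 0, 0, 1);
    setIdentity(kf.measurementMatrix);
    setIdentity(kf.processNoiseCov, Scalar::all(1e-4));
    setIdentity(kf.measurementNoiseCov, Scalar::all(1e-1));
    setIdentity(kf.errorCovPost, Scalar::all(1));
    return kf;
}

// Per frame: predict, then correct with the measured centroid if the eye is open.
Point2f trackStep(KalmanFilter& kf, bool eyeOpen, Point2f measured)
{
    Mat prediction = kf.predict();
    if (eyeOpen) {
        Mat measurement = (Mat_<float>(2, 1) << measured.x, measured.y);
        Mat corrected = kf.correct(measurement);
        return Point2f(corrected.at<float>(0), corrected.at<float>(1));
    }
    // During a blink, just coast on the prediction.
    return Point2f(prediction.at<float>(0), prediction.at<float>(1));
}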
It all depends on how good your system must be. If it's a 2-month university project, it's OK to find and track some blobs or to use a ready-made solution, as Kscottz recommended.
But if you aim to have a more serious system, you must go deeper.
An approach I recommend is to detect facial interest points. A good example is Active Appearance Models, which seem to be the best at tracking faces:
http://www2.imm.dtu.dk/~aam/
and
http://www.youtube.com/watch?v=M1iu__viJN8
It requires a solid understanding of computer vision algorithms, good programming skills, and some work. But the results will be worth the effort.
And do not be fooled by the fact that the demos show whole-face tracking. You can train it to track anything: hands, eyes, flowers or leaves, etc.
(Before starting with AAM, you may want to read more about other face-tracking algorithms. They may be better for you)
This is my solution: I am able to track the eye and find the center coordinates of the pupil precisely.
First I thresholded the image taken by the head-mounted camera. After that I used a contour-finding algorithm and then computed the centroid of all the contours. This gives me the center coordinates of the eye pupil; this method works fine in real time and also detects eye blinking with very good accuracy.
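For completeness, a sketch of that pipeline using the C++ API (the threshold value of 30 is an assumption; the answer's own code above uses the C API and image moments instead of contours):

#include <opencv2/opencv.hpp>
#include <vector>
using namespace cv;

// Returns the pupil centre, or (-1,-1) if nothing dark enough was found
// (which can double as a crude blink detector).
Point findPupil(const Mat& frame)
{
    Mat gray, mask;
    cvtColor(frame, gray, COLOR_BGR2GRAY);
    threshold(gray, mask, 30, 255, THRESH_BINARY_INV); // pupil is the darkest region

    std::vector<std::vector<Point>> contours;
    findContours(mask, contours, RETR_EXTERNAL, CHAIN_APPROX_SIMPLE);
    if (contours.empty())
        return Point(-1, -1);

    // Take the largest contour and return its centroid.
    size_t best = 0;
    for (size_t i = 1; i < contours.size(); i++)
        if (contourArea(contours[i]) > contourArea(contours[best]))
            best = i;
    Moments m = moments(contours[best]);
    if (m.m00 == 0)
        return Point(-1, -1);
    return Point(cvRound(m.m10 / m.m00), cvRound(m.m01 / m.m00));
}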
