Count the number of fingers - OpenCV

This is an image of the convex hull of a hand plus the defect points:
Where:
Blue circles: start points
Red circles: end points
Green circles: depth points.
I want to derive a finger count from the defect points. Does anyone have an idea how to get the number of fingers?

I don't think it's clear-cut. As far as I can see, assuming you can easily find the ROI for the hand, you can turn it into a classification problem with 6 classes (0-5 fingers), and then train a classifier on your data, such as a neural network or an SVM.
By adding further segmentation of the palm and folded fingers, you should get enough accuracy.

if (defectArray[i].depth > ??? )
Then draw marks for all the defects that pass this depth test and count them.
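A minimal sketch of that idea with the OpenCV Python bindings; the depth threshold and the "deep defects + 1" heuristic are assumptions on my part, not part of the answer above:

```python
import cv2
import numpy as np

def count_fingers(mask):
    # mask: binary hand segmentation (uint8, hand pixels = 255) -- hypothetical input.
    contours = cv2.findContours(mask, cv2.RETR_EXTERNAL,
                                cv2.CHAIN_APPROX_SIMPLE)[-2]   # works on OpenCV 3.x and 4.x
    hand = max(contours, key=cv2.contourArea)
    hull = cv2.convexHull(hand, returnPoints=False)            # indices, required by convexityDefects
    defects = cv2.convexityDefects(hand, hull)
    if defects is None:
        return 0
    deep = 0
    for start_idx, end_idx, far_idx, depth in defects[:, 0]:
        # depth is stored as a fixed-point value (distance * 256);
        # keep only the deep valleys between extended fingers.
        if depth / 256.0 > 20:      # threshold is a guess, tune for your image scale
            deep += 1
    # Heuristic: each deep valley separates two fingers.
    return deep + 1 if deep else 0
```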

Related

Is Canny edge detection rotationally invariant?

Suppose that the Canny edge detector successfully detects an edge in an image. The edge is then rotated by θ, where the relationship between a point (x, y) on the original edge and a point (x′, y′) on the rotated edge is defined as x′ = x cos θ, y′ = x sin θ.
Will the rotated edge be detected using the same Canny edge detector?
(I think the answer should take into account that the detection of an edge by the Canny edge detector depends only on the magnitude of its derivative.)
The answer is both yes and no, and which one you go for depends on how literally you take the question.
First of all, we're dealing with a rectangular grid, so given an integer location (x,y), the corresponding point (x',y') in a rotated image is highly likely not an integer location. And considering that the output of Canny is a set of points, and not a smooth function that can be interpolated, it would be difficult to establish a correspondence between the set resulting from the rotated and the one resulting from the original image.
Think for example about the number of pixels on a discrete line of a given length at 0 degrees and at 45 degrees. (Hint: the line at 45 degrees has sqrt(2) times fewer pixels.)
But if you take the question more generally and interpret it as "will an edge that is detected in the original image also be detected after rotating the image by θ degrees?" then the answer is yes, in theory.
Of course practice is always a bit different than theory. The details of the implementation matter here. And there is always numerical imprecision to contend with.
Let's start by assuming the rotation is computed correctly, with a precise interpolation scheme (cubic, Lanczos) and not rounded to uint8 or similar afterwards (i.e. we're computing using floating-point values).
If you read the original paper by Canny, you'll see he proposes using Gaussian derivatives as the best compromise between compact support and computational precision. I have seen few implementations that actually do so. Typically I see a convolution with a Gaussian and then Sobel derivatives. Especially for smaller sigmas (less smoothing), the difference can be quite large. Gaussian derivatives are rotationally invariant, Sobel derivatives are not.
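To make the contrast concrete, here is a small, purely illustrative sketch (Python with SciPy, not code from any particular Canny implementation) of the two gradient schemes:

```python
import numpy as np
from scipy import ndimage

def gradient_gaussian(img, sigma):
    # True Gaussian derivatives: convolve with the derivative of a Gaussian.
    gx = ndimage.gaussian_filter(img, sigma, order=(0, 1))   # d/dx
    gy = ndimage.gaussian_filter(img, sigma, order=(1, 0))   # d/dy
    return np.hypot(gx, gy)

def gradient_sobel(img, sigma):
    # Common shortcut: Gaussian smoothing followed by Sobel derivatives.
    smoothed = ndimage.gaussian_filter(img, sigma)
    gx = ndimage.sobel(smoothed, axis=1)
    gy = ndimage.sobel(smoothed, axis=0)
    return np.hypot(gx, gy)
```

Comparing the two magnitude images on a rotated input is one way to see that the first scheme behaves the same in every direction while the second does not, especially at small sigma.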
The next step in the algorithm is non-maximum suppression. This is where the continuous gradient is converted to a set of points. For each pixel, the algorithm checks whether it is a local maximum in the direction of the gradient. Because this is done per pixel, a different set of locations is tested in the rotated image compared to the original. Nonetheless, it should detect points along the same ridges in both cases.
Next, a hysteresis threshold is applied. This is a two-threshold operation that keeps pixels above one threshold as long as at least one pixel above a second threshold is present in the same connected component. This is where differences could occur between the rotated and the original image. Remember we're dealing with a set of pixels: we have sampled the continuous gradient function at discrete points. There could be an edge that has one pixel above the second threshold in one version of the image, but not in the other. This would only occur for edges very close to the chosen threshold, of course.
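A sketch of that two-threshold hysteresis step (SciPy; `low` and `high` are placeholder thresholds, not values from the text):

```python
import numpy as np
from scipy import ndimage

def hysteresis(magnitude, low, high):
    # Keep weak pixels (above `low`) only if their connected component
    # also contains at least one strong pixel (above `high`).
    weak = magnitude > low
    strong = magnitude > high
    labels, n = ndimage.label(weak)
    keep = np.zeros(n + 1, dtype=bool)
    keep[np.unique(labels[strong])] = True
    keep[0] = False                  # background label is never kept
    return keep[labels]
```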
Next comes a thinning. Because the non-maximum suppression can yield points along a thicker line, a thinning operation is applied that removes pixels from the set that are not needed to maintain connectivity of the lines. Which pixels are selected here will also differ between rotated and original images, but this does not change the geometry of the solution, so we still have the same set of points.
So, the answer is yes and no. :)
Note that the same logic applies to translation.
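If you want to check this empirically, a rough experiment along these lines (OpenCV Python; the file name and angle are illustrative) shows that the two edge maps overlap along the same ridges without matching pixel for pixel:

```python
import cv2
import numpy as np

img = cv2.imread("input.png", cv2.IMREAD_GRAYSCALE)   # hypothetical input image

edges = cv2.Canny(img, 100, 200)

# Rotate by 30 degrees about the image center with cubic interpolation,
# then run the same detector on the rotated image.
h, w = img.shape
M = cv2.getRotationMatrix2D((w / 2, h / 2), 30, 1.0)
rotated = cv2.warpAffine(img, M, (w, h), flags=cv2.INTER_CUBIC)
edges_rotated = cv2.Canny(rotated, 100, 200)

# Rotate the original edge map the same way and compare the pixel sets:
# they overlap along the same ridges but do not match exactly.
edges_warped = cv2.warpAffine(edges, M, (w, h), flags=cv2.INTER_NEAREST)
overlap = np.count_nonzero(edges_warped & edges_rotated)
print(overlap, np.count_nonzero(edges_rotated))
```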

Calculating the neighborhood distance

What method would you use to compute a distance that represents the number of "jumps" one has to do to go from one area to another area in a given 2D map?
Let's take the following map for instance:
[map image omitted; source: free.fr]
The end result of the computation would be a triangle like this:
A B C D E F
A
B 1
C 2 1
D 2 1 1
E . . . .
F 3 2 2 1 .
Which means that going from A to D, it takes 2 jumps.
However, to go from anywhere to E, it's impossible because the "gap" is too big, and so the value is "infinite", represented here as a dot for simplification.
As you can see on the example, the polygons may share points, but most often they are simply close together and so a maximum gap should be allowed to consider two polygons to be adjacent.
This, obviously, is a simplified example, but in the real case I'm faced with about 60000 polygons and am only interested in jump values up to 4.
As input data, I have the polygon vertices as an array of coordinates, from which I already know how to calculate the centroid.
My initial approach would be to "paint" the polygons on a white background canvas, each with its own color, and then walk the line between two candidate polygons' centroids. Counting the colors I encounter along the way could give me the number of jumps.
However, this is not really reliable as it does not take into account concave arrangements where one has to walk around the "notch" to go from one polygon to the other as can be seen when going from A to F.
I have tried looking for reference material on this subject but could not find any because I have a hard time figuring what the proper terms are for describing this kind of problem.
My target language is Delphi XE2, but any example would be most welcome.
You can create an inflated polygon with a small offset for every initial polygon, then check it for intersection with the neighbouring (inflated) polygons. Offsetting compensates for small gaps between polygons.
Both the inflating and the intersection problems can be solved with the Clipper library.
How you find candidate neighbours depends on the actual data. A simple method: divide the plane into square cells and only test polygons that have vertices in the same cell or in adjacent cells.
Every pair of intersecting polygons gives an edge in an (unweighted, undirected) graph. You want all paths of length <= 4, so just run a depth-limited BFS from every vertex (polygon), assuming the graph is sparse.
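A sketch of that depth-limited BFS in Python (the adjacency list is assumed to have been built from the inflated-polygon intersection test; a Delphi port would follow the same structure):

```python
from collections import deque

def jumps_up_to(adjacency, source, max_jumps=4):
    # adjacency: dict mapping polygon id -> iterable of adjacent polygon ids,
    # built from the inflated-polygon intersection test above.
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if dist[node] == max_jumps:
            continue                      # depth limit: don't expand further
        for neighbour in adjacency[node]:
            if neighbour not in dist:
                dist[neighbour] = dist[node] + 1
                queue.append(neighbour)
    dist.pop(source)
    return dist                           # polygons reachable in 1..max_jumps jumps
```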
You could try single-link clustering or Voronoi diagrams. You could also brute-force it, or try density-based spatial clustering of applications with noise (DBSCAN) or k-means clustering.
I would try this:
1) Do a Delaunay triangulation of all the points of all polygons.
2) Remove from the Delaunay graph all triangles that have their 3 points in the same polygon.
Two polygons are neighbours by point if at least one remaining triangle has at least one point in each polygon (or, obviously, if the polygons share a point).
Two polygons are neighbours by side if each polygon contributes at least two adjacent points to the same quad, i.e. two adjacent triangles (or, obviously, if they share two adjacent points).
Once the gaps are filled with these new polygons (triangles, possibly combined), use Dijkstra's algorithm, weighted by the distance from the nearest points (or the polygon centroids), to compute the paths.
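A rough sketch of steps 1) and 2) with SciPy's Delaunay triangulation; the polygon bookkeeping and the point-neighbour test are my interpretation of the answer, not code from it:

```python
import numpy as np
from scipy.spatial import Delaunay

def point_neighbours(polygons):
    # polygons: list of (n_i, 2) arrays of vertex coordinates (hypothetical input).
    points = np.vstack(polygons)
    owner = np.repeat(np.arange(len(polygons)), [len(p) for p in polygons])
    tri = Delaunay(points)
    neighbours = set()
    for simplex in tri.simplices:
        owners = set(owner[simplex])
        if len(owners) > 1:               # triangle spans more than one polygon
            for a in owners:
                for b in owners:
                    if a < b:
                        neighbours.add((a, b))
    return neighbours                     # pairs of polygons that are "neighbours by point"
```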

Comparison metric for two open contours

I'm validating an image segmentation algorithm applied to 2D images. The algorithm generates a contour segment, i.e. a set of connected pixels that form a free curve in 2D space. The idea is to compare this set of pixels with a ground truth, in my case another contour segment manually traced by an expert. An image showing what a segmentation result and the corresponding manual (ground-truth) segmentation would look like is shown below:
I'm trying to think of an adequate comparison metric to validate the segmentation results. Ideally the best metric would be the point-to-point Euclidean distance between corresponding pairs of pixels on each segment; however (as seen in the previous figure), the segments don't have the same length (i.e. they differ in the total number of pixels), so pixel-to-pixel comparisons have to be discarded.
Can you suggest an adequate metric for validating my algorithm? Thanks for any suggestion!
For each pixel in the ground truth, take the distance to the nearest pixel in the segmentation result. Then take the sum of that for all ground truth pixels as the total error.
That's basically recall weighted by distance. If you start with the pixels in the result, it would resemble precision instead.
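A sketch of that metric using a KD-tree (SciPy; the function and argument names are illustrative):

```python
import numpy as np
from scipy.spatial import cKDTree

def contour_distance(ground_truth, result):
    # ground_truth, result: (N, 2) arrays of pixel coordinates on each contour.
    tree = cKDTree(result)
    d, _ = tree.query(ground_truth)   # distance from each GT pixel to the nearest result pixel
    return d.sum()                    # or d.mean() to make it length-independent
```

Swapping the roles of the two point sets gives the precision-like variant mentioned above.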
If the curves are closed, you can compute the area between the curves. If you can tell which pixels belong to a segment, that is as easy as computing the XOR (symmetric difference) of the two pixel sets.
Here is an example I created using Matlab (image omitted):
You could divide each line into n segments of equal length, then compute the Euclidean distance between each segment and its pair on the other line.
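As a sketch of that idea (NumPy, assuming both contours are given as ordered point lists), resample each curve by arc length to the same number of points and average the pairwise distances:

```python
import numpy as np

def resample(points, n):
    # points: (m, 2) ordered pixel coordinates along the curve.
    seg = np.linalg.norm(np.diff(points, axis=0), axis=1)
    t = np.concatenate([[0.0], np.cumsum(seg)])       # cumulative arc length
    s = np.linspace(0.0, t[-1], n)
    x = np.interp(s, t, points[:, 0])
    y = np.interp(s, t, points[:, 1])
    return np.column_stack([x, y])

def mean_pairwise_distance(a, b, n=100):
    a_r, b_r = resample(a, n), resample(b, n)
    return np.linalg.norm(a_r - b_r, axis=1).mean()
```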

How does cv::contourArea() deal with non-closed curves?

After using cv::Canny(), it seems that there are some non-closed curves in the image. So my question is, how will cv::contourArea() deal with them? Will it compute the area by closing the curve first, or just ignore them?
From the contourArea reference:
Calculates a contour area.
So it just calculates the area of the contour as a polygon, using the Green formula over the point sequence; a non-closed curve is effectively treated as if it were closed by joining the last point back to the first. Note that the result can therefore differ from the number of non-zero pixels you would get by drawing the contour.
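A quick way to see this behaviour (OpenCV Python): pass an open three-point polyline; the returned value is the area of the triangle you would get by closing it.

```python
import cv2
import numpy as np

# Three points of an open polyline; contourArea evaluates the Green formula
# over the point sequence, implicitly joining the last point back to the first.
open_curve = np.array([[0, 0], [10, 0], [10, 10]], dtype=np.float32)
print(cv2.contourArea(open_curve))   # 50.0, the area of the closed triangle
```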

Detection of a pattern of circles using OpenCV

I have to detect a pattern of 6 circles using OpenCV. I have detected the circles and their centroids by using thresholding and the contour functions in OpenCV.
Now I have to define the relation between these circles in a way that should be invariant to scale and rotation. With this I would be able to detect this pattern in various views. I have to use this pattern for determining the object pose.
How can I achieve scale/rotation invariance? Do you have any reference I could read about it?
To make your pattern invariant to rotation and scale, you have to normalize the direction and the scale when detecting the pattern. Here is a simple algorithm to achieve this:
detect centers and circle size (you say you have already achieved this - good!)
compute the average center using a simple mean, and express all the centers relative to this mean
find the farthest center using a simple norm (Euclidean is good enough)
scale the center position and the circle sizes so that this maximum distance is 1.0
rotate the centers so that coordinates of the farthest one is (1.0, 0)
you're done. You are now the proud owner of a scale/rotation invariant pattern detector!! Congratulations!
Now you can find patterns, transform them as suggested, and compare center position & circle sizes.
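A compact sketch of those normalization steps (NumPy; `centers` and `radii` are assumed to come from your detection stage):

```python
import numpy as np

def normalize_pattern(centers, radii):
    # centers: (N, 2) array of circle centers, radii: (N,) array of circle radii.
    centered = centers - centers.mean(axis=0)          # express centers relative to the mean
    dist = np.linalg.norm(centered, axis=1)
    scale = dist.max()                                  # distance of the farthest center
    centered = centered / scale                         # farthest distance becomes 1.0
    angle = np.arctan2(*centered[dist.argmax()][::-1])  # angle of the farthest center
    c, s = np.cos(-angle), np.sin(-angle)
    rot = np.array([[c, -s], [s, c]])                   # rotate by -angle
    return centered @ rot.T, radii / scale              # farthest center now at (1, 0)
```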
It is not entirely clear to me if you need to find the rotation, or merely get rid of it, or detect if the circles actually form the pattern you linked. Either way, the answer is much the same.
I would start by finding the two circles that have only one neighbour. For each circle centroid, calculate the distance to the closest two neighbours. If the distances differ by more than, say, 10%, the centroid belongs to an "end" circle (one of the top ones in your link).
Now that you have found the two end circles, rotate the pattern so that these two circles lie on a horizontal line. If the other centroids end up above them, rotate another 180 degrees so that the pattern is in the orientation you want.
Now you can calculate the scaling from the average inter-centroid distance.
Hope that helps.
Your question sounds exactly like what the SURF algorithm does. It finds points of interest and describes them in a way that is invariant to rotation and scale, and can find the same object in other pictures.
Just search for OpenCV and SURF.
