How is a Convexity Defect calculated in OpenCV?

What is the algorithm used in OpenCV function convexityDefects() to calculate the convexity defects of a contour?
Please, describe and illustrate the high-level operation of the algorithm, along with its inputs and outputs.

Based on the documentation, the inputs are two lists of coordinates:
contour, defining the original contour (red on the image below)
convexhull, defining the convex hull corresponding to that contour (blue on the image below)
The algorithm works in the following manner:
If the contour or the hull contains 3 or fewer points, then the contour is always convex, and no more processing is needed. The algorithm ensures that both the contour and the hull are accessed in the same orientation.
N.B.: In the following explanation I assume they are in the same orientation, and ignore the details regarding representation of the floating-point depth as an integer.
Then, for each pair of adjacent hull points (H[i], H[i+1]) defining one edge of the convex hull, calculate the distance from the edge for each point on the contour C[n] that lies between H[i] and H[i+1] (excluding C[n] == H[i+1]). If the distance is greater than zero, then a defect is present. When a defect is present, record i, i+1, the maximum distance, and the index n of the contour point where the maximum is located.
Distance is calculated in the following manner:
dx0 = H[i+1].x - H[i].x
dy0 = H[i+1].y - H[i].y
if (dx0 is 0) and (dy0 is 0) then
    scale = 0
else
    scale = 1 / sqrt(dx0 * dx0 + dy0 * dy0)
dx = C[n].x - H[i].x
dy = C[n].y - H[i].y
distance = abs(-dy0 * dx + dx0 * dy) * scale
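As a sanity check, the pseudocode translates directly into Python; hull_p0, hull_p1 and pt are illustrative names for H[i], H[i+1] and C[n]:
import math

def defect_depth(hull_p0, hull_p1, pt):
    # Distance from pt to the hull edge running from hull_p0 to hull_p1,
    # computed exactly as in the pseudocode above.
    dx0 = hull_p1[0] - hull_p0[0]
    dy0 = hull_p1[1] - hull_p0[1]
    if dx0 == 0 and dy0 == 0:
        scale = 0.0
    else:
        scale = 1.0 / math.sqrt(dx0 * dx0 + dy0 * dy0)
    dx = pt[0] - hull_p0[0]
    dy = pt[1] - hull_p0[1]
    return abs(-dy0 * dx + dx0 * dy) * scale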
It may be easier to visualize in terms of vectors:
C: defect vector from H[i] to C[n]
H: hull edge vector from H[i] to H[i+1]
H_rot: hull edge vector H rotated 90 degrees
U_rot: unit vector in direction of H_rot
H components are [dx0, dy0], so rotating 90 degrees gives [-dy0, dx0].
scale is used to find U_rot from H_rot, but because divisions are more computationally expensive than multiplications, the inverse is used as an optimization. It is also pre-calculated before the loop over C[n] to avoid recomputing it on each iteration.
|H| = sqrt(dx0 * dx0 + dy0 * dy0)
U_rot = H_rot / |H| = H_rot * scale
Then, a dot product between C and U_rot gives the perpendicular distance from the defect point to the hull edge, and abs() is used to get a positive magnitude in any orientation.
distance = abs(U_rot.C) = abs(-dy0 * dx + dx0 * dy) * scale
In the scenario depicted in the above image, in the first iteration the edge is defined by H[0] and H[1]. The contour points to examine for this edge are C[0], C[1], and C[2] (since C[3] == H[1]).
There are defects at C[1] and C[2]. The defect at C[1] is the deepest, so the algorithm will record (0, 1, 1, 50).
The next edge is defined by H[1] and H[2], and corresponding contour point C[3]. No defect is present, so nothing is recorded.
The next edge is defined by H[2] and H[3], and corresponding contour point C[4]. No defect is present, so nothing is recorded.
Since C[5] == H[3], the last contour point can be ignored -- there can't be a defect there.
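In practice you rarely have to implement this by hand. A minimal Python sketch of calling the OpenCV function directly, assuming a single contour cnt obtained beforehand (e.g. from cv2.findContours):
import cv2

hull = cv2.convexHull(cnt, returnPoints=False)   # hull as indices into cnt
defects = cv2.convexityDefects(cnt, hull)        # N x 1 x 4 array, or None

if defects is not None:
    for start_idx, end_idx, far_idx, depth in defects[:, 0]:
        # depth is the fixed-point distance; divide by 256 to get pixels
        print(start_idx, end_idx, far_idx, depth / 256.0)
The four values per row correspond to the (i, i+1, n, distance) record described above, with the distance stored in fixed point.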

Related

How do you calculate the average gradient direction and average gradient strength/magnitude

In OpenCV how do you calculate the average gradient strength in a Mat and the average gradient direction?
I have sourced the methods below by googling, but I want to confirm I am actually doing this correctly before moving on to the next step.
Is this correct?
Mat src = imread("foo.png", IMREAD_GRAYSCALE); // read image as a grayscale, single-channel Mat
// Calculate the mean intensity and the std deviation
// Any errors here or am I doing this correctly?
Scalar sMean, sStdDev;
meanStdDev(src, sMean, sStdDev);
double mean = sMean[0];
double stddev = sStdDev[0];
// Calculate the average gradient magnitude/strength across the image
// Any errors here or am I doing this correctly?
Mat dX, dY, mag;
Sobel(src, dX, CV_32F, 1, 0, 1);
Sobel(src, dY, CV_32F, 0, 1, 1);
magnitude(dX, dY, mag);
Scalar sMMean, sMStdDev;
meanStdDev(mag, sMMean, sMStdDev);
double magnitudeMean = sMMean[0];
double magnitudeStdDev = sMStdDev[0];
// Calculate the average gradient direction across the image
// Any errors here or am I doing this correctly?
Scalar avgHorizDir = mean(dX);
Scalar avgVertDir = mean(dY);
double avgDir = atan2(-avgVertDir[0], avgHorizDir[0]);
float blurriness = cv::videostab::calcBlurriness(src); // low values = sharper. High values = blurry
Technically those are the correct ways of obtaining the two averages.
The way you compute mean direction uses weighted directional statistics, meaning that pixels without a strong gradient have less influence on the average.
However, for most images this average direction is not very meaningful, as edges exist in all directions and cancel out.
If your image is of a single edge, then this will work great.
If your image has lines in it, it contains edges in opposite directions, and this will not work. In this case, you want to average the double angle (i.e. average orientations). The obvious way of doing this is to compute the direction per pixel as an angle, double it, then use directional statistics to average (i.e. convert back to vectors and average those). Doubling the angle causes opposite directions to be mapped to the same value, so averaging does not cancel them out.
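A minimal NumPy sketch of that double-angle averaging, assuming dx and dy are the Sobel derivatives computed above as NumPy arrays; the magnitude weighting is optional:
import numpy as np

def mean_orientation(dx, dy):
    angle = np.arctan2(dy, dx)      # direction per pixel
    weight = np.hypot(dx, dy)       # weight by gradient magnitude
    # Double the angle so opposite directions map to the same vector
    c = np.sum(weight * np.cos(2 * angle))
    s = np.sum(weight * np.sin(2 * angle))
    return 0.5 * np.arctan2(s, c)   # halve to get back an orientation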
Another simple way to average orientations is to take the average of the tensor field obtained by the outer product of the gradient field with itself, and determine the direction of the eigenvector corresponding to the largest eigenvalue. The tensor field is obtained as follows:
Mat Sxx = dX.mul(dX);
Mat Syy = dY.mul(dY);
Mat Sxy = dX.mul(dY);
This should then be averaged:
Scalar mSxx = mean(Sxx);
Scalar mSyy = mean(Syy);
Scalar mSxy = mean(Sxy);
These values form a 2x2 real-valued symmetric matrix:
| mSxx mSxy |
| mSxy mSyy |
It is relatively straightforward to determine its eigendecomposition, and this can be done analytically. I don't have the equations on hand right now, so I'll leave it as an exercise to the reader. :)
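For completeness, the standard closed-form result for a 2x2 symmetric matrix: the orientation of the eigenvector with the largest eigenvalue can be sketched as below (this is the usual structure-tensor identity, not code from the original answer):
import math

def dominant_orientation(mSxx, mSyy, mSxy):
    # Angle of the eigenvector belonging to the largest eigenvalue
    # of the 2x2 symmetric matrix [[mSxx, mSxy], [mSxy, mSyy]]
    return 0.5 * math.atan2(2.0 * mSxy, mSxx - mSyy)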

Estimating distance from camera to ground plane point

How can I calculate distance from camera to a point on a ground plane from an image?
I have the intrinsic parameters of the camera and the position (height, pitch).
Is there any OpenCV function that can estimate that distance?
You can use undistortPoints to compute the rays backprojecting the pixels, but that API is rather hard to use for your purpose. It may be easier to do the calculation "by hand" in your code. Doing it at least once will also help you understand what exactly that API is doing.
Express your "position (height, pitch)" of the camera as a rotation matrix R and a translation vector t, representing the coordinate transform from the origin of the ground plane to the camera. That is, given a point in ground plane coordinates Pg = [Xg, Yg, Zg], its coordinates in camera frame are given by
Pc = R * Pg + t
The camera center is Cc = [0, 0, 0] in camera coordinates. In ground coordinates it is then:
Cg = inv(R) * (-t) = -R' * t
where inv(R) is the inverse of R, R' is its transpose, and the last equality is due to R being an orthogonal matrix.
Let's assume, for simplicity, that the ground plane is Zg = 0.
Let K be the matrix of intrinsic parameters. Given a pixel q = [u, v], write it in homogeneous image coordinates Q = [u, v, 1]. Its location in camera coordinates is
Qc = Ki * Q
where Ki = inv(K) is the inverse of the intrinsic parameters matrix. The same point in world coordinates is then
Qg = R' * Qc + Cg
All the points Pg = [Xg, Yg, Zg] that belong to the ray from the camera center through that pixel, expressed in ground coordinates, are then on the line
Pg = Cg + lambda * (Qg - Cg)
for lambda going from 0 to positive infinity. This last formula represents three equations in the ground XYZ coordinates, and you want to find the values of Xg, Yg, Zg and lambda where the ray intersects the ground plane. Since that intersection satisfies Zg = 0, you have only 3 unknowns. Solve for them (recover lambda from the third equation, then substitute it into the first two), and you get the Xg and Yg of the solution to your problem.
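A minimal NumPy sketch of the whole chain, under the assumptions above; K, R and t are placeholders you would fill in from your calibration and camera pose:
import numpy as np

def pixel_to_ground(u, v, K, R, t):
    # Camera center in ground coordinates: Cg = -R' * t
    Cg = -R.T @ t
    # Back-project the pixel into camera coordinates, then into ground coordinates
    Qc = np.linalg.inv(K) @ np.array([u, v, 1.0])
    Qg = R.T @ Qc + Cg
    # Ray: Pg = Cg + lambda * (Qg - Cg); intersect with the plane Zg = 0
    d = Qg - Cg
    lam = -Cg[2] / d[2]        # solves Cg_z + lambda * d_z = 0
    Pg = Cg + lam * d          # Pg[2] is 0 up to rounding
    return Pg
The camera-to-point distance you asked about is then simply np.linalg.norm(Pg - Cg).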

How to check if a point is inside a set of contours

Hello to everyone. The above image is the sum of two images in which I did feature matching and drew all matching points. I also found the contours of the PCB parts in the first image (left half of the image, 3 contours). The question is, how could I draw only the matching points that are inside those contours in the first image instead of this blue mess? I'm using Python 2.7 and OpenCV 2.4.12.
I wrote a function to draw matches because in OpenCV 2.4.12 there isn't any implemented method for that. If I didn't include something, please tell me. Thank you in advance!
import numpy as np
import cv2
def drawMatches(img1, kp1, img2, kp2, matches):
    # Create a new output image that concatenates the two images
    # (a.k.a) a montage
    rows1 = img1.shape[0]
    cols1 = img1.shape[1]
    rows2 = img2.shape[0]
    cols2 = img2.shape[1]
    # Create the output image
    # The rows of the output are the largest between the two images
    # and the columns are simply the sum of the two together
    # The intent is to make this a colour image, so make this 3 channels
    out = np.zeros((max([rows1, rows2]), cols1 + cols2, 3), dtype='uint8')
    # Place the first image to the left
    out[:rows1, :cols1] = np.dstack([img1, img1, img1])
    # Place the next image to the right of it
    out[:rows2, cols1:] = np.dstack([img2, img2, img2])
    # For each pair of points we have between both images
    # draw circles, then connect a line between them
    for mat in matches:
        # Get the matching keypoints for each of the images
        img1_idx = mat.queryIdx
        img2_idx = mat.trainIdx
        # x - columns
        # y - rows
        (x1, y1) = kp1[img1_idx].pt
        (x2, y2) = kp2[img2_idx].pt
        # Draw a small circle at both co-ordinates
        # radius 4, colour blue, thickness = 1
        cv2.circle(out, (int(x1), int(y1)), 4, (255, 0, 0), 1)
        cv2.circle(out, (int(x2) + cols1, int(y2)), 4, (255, 0, 0), 1)
        # Draw a line in between the two points
        # thickness = 1, colour blue
        cv2.line(out, (int(x1), int(y1)), (int(x2) + cols1, int(y2)), (255, 0, 0), 1)
    # Show the image
    cv2.imshow('Matched Features', out)
    cv2.imwrite("shift_points.png", out)
    cv2.waitKey(0)
    cv2.destroyWindow('Matched Features')
    # Also return the image if you'd like a copy
    return out
img1 = cv2.imread('pic3.png', 0)  # Original image - ensure grayscale
img2 = cv2.imread('pic1.png', 0)  # Rotated image - ensure grayscale
sift = cv2.SIFT()
# find the keypoints and descriptors with SIFT
kp1, des1 = sift.detectAndCompute(img1, None)
kp2, des2 = sift.detectAndCompute(img2, None)
# Create matcher
bf = cv2.BFMatcher()
# Perform KNN matching
matches = bf.knnMatch(des1, des2, k=2)
# Apply ratio test
good = []
for m, n in matches:
    if m.distance < 0.75 * n.distance:
        # Add first matched keypoint to list
        # if ratio test passes
        good.append(m)
# Draw the good matches - also save a copy for use later
out = drawMatches(img1, kp1, img2, kp2, good)
Based on what you are asking I am assuming you mean you have some sort of closed contour outlining the areas you want to bound your data point pairs to.
This is fairly simple for polygonal contours; more math is required for complex curved lines, but the solution is the same.
You draw a line from the point in question to infinity. Most people draw out a line to +x infinity, but any direction works. If there are an odd number of line intersections, the point is inside the contour.
See this article:
http://www.geeksforgeeks.org/how-to-check-if-a-given-point-lies-inside-a-polygon/
For point pairs, only pairs where both points are inside the contour are fully inside the contour. For complex contour shapes with concave sections, if you also want to test that the linear path between the points does not cross the contour, perform a similar test with just the line segment between the two points: if there are any intersections, the direct path between the points crosses outside the contour.
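Since you are already using OpenCV, you don't have to implement the ray-casting test yourself: cv2.pointPolygonTest does it for you. A rough sketch of filtering the matches, assuming contours is the list you obtained from cv2.findContours on the first image, and good/kp1 come from the code above:
filtered = []
for m in good:
    x, y = kp1[m.queryIdx].pt
    # measureDist=False: returns +1 inside, -1 outside, 0 on the edge
    if any(cv2.pointPolygonTest(c, (x, y), False) >= 0 for c in contours):
        filtered.append(m)
out = drawMatches(img1, kp1, img2, kp2, filtered)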
Edit:
Since your contours are rectangles, a simpler approach will suffice for determining if your points are inside the rectangle.
If your rectangles are axis aligned (they are straight and not rotated), then you can use your values for top,left and bottom,right to check.
Let point A = Top,Left, point B = Bottom,Right, and point C = your test point.
I am assuming an image based coordinate system where 0,0 is the left,top of the image, and width,height is the bottom right. (I'm writing in C#)
bool PointIsInside(Point A, Point B, Point C)
{
    if (A.X <= C.X && B.X >= C.X && A.Y <= C.Y && B.Y >= C.Y)
        return true;
    return false;
}
If your rectangle is NOT axis aligned, then you can perform four half-space tests to determine if your point is inside the rectangle.
Let Point A = Top,Left, Point B = Bottom,Right, double W = Width, double H = Height, double N = rotation angle, and Point C = test point.
For an axis-aligned rectangle, Top,Right can be calculated by taking the vector (1,0), multiplying it by Width, and adding that vector to Top,Left. For Bottom,Right we take the vector (0,1), multiply it by Height, and add it to Top,Right.
(1,0) is the equivalent of a unit vector (length of 1) at angle 0. Similarly, (0,1) is a unit vector at angle 90 degrees. These vectors can also be considered the direction the line is pointing. This also means these same vectors can be used to go from Bottom,Left to Bottom,Right, and from Top,Left to Bottom,Left as well.
We need to use different unit vectors, at the angle provided. To do this we simply need to take the Cosine and Sine of the angle provided.
Let Vector X = direction from Top,Left to Top,Right, Vector Y = direction from Top,Right to Bottom,Right.
I am using angles in degrees for this example, so the angle N has to be converted to radians before it is passed to Math.Cos and Math.Sin.
Vector X = new Vector();
Vector Y = new Vector();
double rad = N * Math.PI / 180.0;
X.X = Math.Cos(rad);
X.Y = Math.Sin(rad);
Y.X = Math.Cos(rad + Math.PI / 2);
Y.Y = Math.Sin(rad + Math.PI / 2);
Since we started with Top,Left, we can find Bottom,Right by scaling the two direction vectors by the rectangle's width and height and adding them to Top,Left:
Point B = A + X * W + Y * H;
We now want to do a half-space test using the dot product for our test point. The first two tests will use the test point and Top,Left; the other two will use the test point and Bottom,Right.
The half-space test is inherently based on directionality: is the point in front of, behind, or perpendicular to a given direction? We have the two directions we need, but they are based on the top,left point of the rectangle, not the full space of the image, so we need a vector from top,left to the point in question, and another from bottom,right, since those are the two points we test against.
This is simple to calculate, as it is just Destination - Origin.
Let Vector D = Top,Left to test point C, and Vector E = Bottom,Right to test point.
Vector D = C - A;
Vector E = C - B;
The dot product of the two vectors is x1 * x2 + y1 * y2. If the result is positive, the two directions have an absolute angle of less than 90 degrees, or are roughly going in the same direction. A result of zero means they are perpendicular; in our case it means the test point is directly on a side of the rectangle we are testing against. Less than zero means an absolute angle of greater than 90 degrees, or they are roughly going in opposite directions.
If a point is inside the rectangle, then the dot products from top left will be >= 0, while the dot products from bottom right will be <= 0. In essence, the test point lies ahead of top,left along both directions, but taking the same directions from bottom,right it lies behind, back toward top,left.
double DotProd(Vector V1, Vector V2)
{
    return V1.X * V2.X + V1.Y * V2.Y;
}
and so our test ends up as:
if( DotProd(X, D) >= 0 && DotProd(Y, D) >= 0 && DotProd(X, E) <= 0 && DotProd(Y, E) <= 0)
then the point is inside the rectangle. Do this for both points, if both are true then the line is inside the rectangle.
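For reference, a compact Python sketch of the same rotated-rectangle test; the rectangle is described by its top-left corner a, width w, height h and rotation angle in degrees, and all names are illustrative:
import math

def point_in_rotated_rect(c, a, w, h, angle_deg):
    rad = math.radians(angle_deg)
    x_dir = (math.cos(rad), math.sin(rad))                              # Top,Left -> Top,Right
    y_dir = (math.cos(rad + math.pi / 2), math.sin(rad + math.pi / 2))  # Top,Right -> Bottom,Right
    b = (a[0] + x_dir[0] * w + y_dir[0] * h,                            # Bottom,Right corner
         a[1] + x_dir[1] * w + y_dir[1] * h)
    d = (c[0] - a[0], c[1] - a[1])                                      # Top,Left -> test point
    e = (c[0] - b[0], c[1] - b[1])                                      # Bottom,Right -> test point
    dot = lambda u, v: u[0] * v[0] + u[1] * v[1]
    return (dot(x_dir, d) >= 0 and dot(y_dir, d) >= 0 and
            dot(x_dir, e) <= 0 and dot(y_dir, e) <= 0)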

How to calculate quantized angle?

I am looking at the code for the Hough transform in image segmentation. The following code is from Computer Vision by Linda Shapiro. Can somebody tell me what quantize_angle is and how I can compute it?
The Hough transform looks for straight lines (or other features) in an image and represents these features as points in a different 2D coordinate system, where one axis represents the angle θ of a detected line, and the other represents the distance δ from this line to the centre of the image.
Source: Wikipedia
To produce a Hough transform of finite dimensions, both θ and δ have to be quantized. For example, if θ lies in the range (0 ≤ θ < 2π), then you could map it to the range 0–255 by a function such as the following:
int quantize_angle(float theta) {
    int q = floor(theta * 128.0 / 3.141592654 + 0.5);
    return q % 256;
}
This will result in a Hough transform that is 256 pixels wide.
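The distance δ has to be quantized in the same way to get the transform's height. A rough Python sketch, where the bin counts and the distance range are illustrative choices rather than anything from the book:
import math

def quantize_angle(theta, bins=256):
    # Map theta in [0, 2*pi) onto integer bins 0..bins-1
    return int(math.floor(theta * bins / (2 * math.pi) + 0.5)) % bins

def quantize_distance(delta, max_dist, bins=256):
    # Map a distance in [-max_dist, max_dist] onto integer bins 0..bins-1
    return max(0, min(bins - 1, int((delta + max_dist) * (bins - 1) / (2 * max_dist))))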

Circle estimation from 2D data set

I am doing some computer vision based hand gesture recognition. Here, I want to detect a circle (a circular motion) made by my hand. My initial stages are working fine and I am able to get a blob, whose centroid I am plotting from each frame. This is essentially my data set: a collection of 2D coordinate points. Now I want to detect a circular type of motion and generate a call to a function which says "Circle Detected". The circle detector will give a YES / NO boolean output.
Here is a sample of the data set I am generating in 40 frames
The x, y values are just plotted to a bitmap image using MATLAB.
My initial hand movement was slow, and later I picked up speed to complete the circle within the stipulated time (40 frames). There is no hard and fast rule about the number of frames, but for now I am using a 40-frame sliding window for circle detection: (0-39), then (1-40), then (2-41), etc.
I am also calculating the arc-tangent between successive points using:
angle = atan2(prev_y - y, prev_x - x) * 180 / pi;
Now what approach should I take for detecting a circle? (This sample image should result in a YES.) The angle, as I am noticing, is not steadily increasing from 0 to 360. It does increase, but with jumps here and there.
If you are only interested in full or nearly full circles:
I think that the standard parameter estimation approach: Hough/RANSAC won't work very well in this case.
Since you have the frame order, and therefore the distances between consecutive blob centers, you can create a nearly uniform sub-sample of the data (say, pick 20 points spaced roughly evenly), calculate the center, and measure the distance of all points from that center.
If it is nearly a circle all points will have similar distance from the center.
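A rough NumPy sketch of that check; the roundness threshold tol is an illustrative parameter, not something from the answer:
import numpy as np

def looks_like_circle(points, tol=0.25):
    # points: N x 2 array of blob centroids from the sliding window
    center = points.mean(axis=0)
    dist = np.linalg.norm(points - center, axis=1)
    radius = dist.mean()
    # Nearly a circle if all points sit at a similar distance from the center
    return (dist.max() - dist.min()) < tol * radius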
If you want to do something slightly more robust, you can:
Compute center (mean) of all points.
Perform gradient descent to update the center: this should be fairly easy and you won't have local minima. The error term I would probably use is max(D) - min(D), where D is the vector of distances between the blob centers and the estimated circle center (but you can use robust statistics instead of max & min).
Evaluate the circle
I would use a least squares estimation. Numerically you can use the Nelder-Mead method. You get the circle that best approximates your points, and on the basis of the residual error value you decide whether to consider the circle valid or not.
With points being the array of points, xc, yc the coordinates of the center, and r the radius, this could be an example of the error to minimize:
class Circle
{
    private PointF[] _points;

    public Circle(PointF[] points)
    {
        _points = points;
    }

    public double MinimizeFunction(double xc, double yc, double r)
    {
        double d, d2, dx, dy, sum;
        sum = 0;
        foreach (PointF p in _points)
        {
            dx = p.X - xc;
            dy = p.Y - yc;
            d2 = dx * dx + dy * dy;
            // sum += d2 - r * r;
            d = Math.Sqrt(d2) - r;
            sum += d * d;
        }
        return sum;
    }

    public double ResidualError(double xc, double yc, double r)
    {
        return Math.Sqrt(MinimizeFunction(xc, yc, r)) / (_points.Length - 3);
    }
}
There is a slight difference between the commented-out functional and the uncommented one, but for practical purposes this difference is meaningless. From a theoretical point of view, however, the difference is important.
Since you need to supply an initial set of values (xc, yc, r), you can calculate the circle given three points, choosing three points far from each other.
If you need more details on "circle given three points" or Nelder-Mead you can google or ask me here.
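For that initial guess, a minimal sketch of the circle through three points (this is the standard circumcenter formula, not code from the original answer):
def circle_from_three_points(p1, p2, p3):
    # Returns (xc, yc, r) of the circle through the three points,
    # or None if they are (nearly) collinear.
    ax, ay = p1
    bx, by = p2
    cx, cy = p3
    d = 2 * (ax * (by - cy) + bx * (cy - ay) + cx * (ay - by))
    if abs(d) < 1e-12:
        return None
    ux = ((ax * ax + ay * ay) * (by - cy) +
          (bx * bx + by * by) * (cy - ay) +
          (cx * cx + cy * cy) * (ay - by)) / d
    uy = ((ax * ax + ay * ay) * (cx - bx) +
          (bx * bx + by * by) * (ax - cx) +
          (cx * cx + cy * cy) * (bx - ax)) / d
    r = ((ax - ux) ** 2 + (ay - uy) ** 2) ** 0.5
    return ux, uy, r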
