Rectangle detection with Hough transform - image-processing

I'm trying to implement rectangle detection using the Hough transform, based on
this paper.
I programmed it in MATLAB, but after detecting the pairs of parallel lines and the orthogonal pairs, I must detect the intersections of these pairs. My question is about the quality of the intersection of two lines in Hough space.
I found the intersection points by solving four systems of equations. Do these intersection points lie in Cartesian or polar coordinate space?

For those of you wondering about the paper, it's:
Rectangle Detection based on a Windowed Hough Transform by Cláudio Rosito Jung and Rodrigo Schramm.
Now according to the paper, the intersection points are expressed in polar coordinates; obviously your implementation may be different (the only way to tell is to show us your code).
Assuming you are being consistent with the paper's notation, your peaks should be expressed as (\rho, \theta) pairs.
You must then perform the peak pairing given by equation (3) in section 4.3 of the paper, which uses an angular threshold to decide whether two peaks correspond to parallel lines and a normalized threshold to decide whether they correspond to lines of similar length.

The accuracy of the Hough space should depend on two main factors.
The first is the accumulator resolution. The accumulator maps onto Hough space, and looping through the accumulator array requires that Hough space be divided into a discrete grid.
The second factor in the accuracy of linear Hough space is the location of the origin in the original image. Look for a moment at what happens if you do a sweep of \theta for any given change in \rho. Near the origin, one of these sweeps will cover far fewer pixels than a sweep out near the edges of the image. As a consequence, near the edges of the image you need a much higher \rho \theta resolution in your accumulator to achieve the same level of accuracy when transforming back to Cartesian.
The problem with increasing the resolution, of course, is that it requires more computational power and memory. Also, if you uniformly increase the accumulator resolution, you waste resolution near the origin, where it is not needed.
Some ideas to help with this:
- Place the origin right at the center of the image, as opposed to using the natural bottom left or top left of an image in code.
- Try using the closest image you can get to a square. The more elongated an image is for a given area, the more pronounced the resolution trap becomes at the edges.
- Try dividing your image into 4/9/16 etc. different accumulators, each with an origin in the center of that sub-image. It will require a little overhead to link the results of each accumulator together for rectangle detection, but it should help spread the resolution more evenly.
- The ultimate solution would be to increase the resolution linearly depending on the distance from the origin. This can be achieved using the circle equation
  (x-a)^2 + (y-b)^2 = \rho^2
  where
  - x,y are the current pixel
  - a,b are your chosen origin
  - \rho is the radius
  Once the radius is known, adjust your accumulator resolution accordingly. You will have to keep track of the center of each \rho \theta bin for transforming back to Cartesian.

The link to the referenced paper does not work, but if you used the standard Hough transform, then the four intersection points will be expressed in Cartesian coordinates. In fact, the four lines detected with the Hough transform will be expressed using the "normal parametrization":
rho = x cos(theta) + y sin(theta)
so you will have four pairs (rho_i, theta_i) that identify your four lines. After checking for orthogonality (for example, just by comparing the angles theta_i), you solve four systems of equations, each of the form:
rho_j = x cos(theta_j) + y sin(theta_j)
rho_k = x cos(theta_k) + y sin(theta_k)
where x and y are the unknowns that represent the Cartesian coordinates of the intersection point.
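As a minimal sketch of that last step (assuming NumPy; the helper name `intersect_lines` and the `eps` tolerance are made up for illustration), each pair of lines in normal form gives a 2x2 linear system whose solution is the Cartesian intersection point:

```python
import numpy as np

def intersect_lines(rho_j, theta_j, rho_k, theta_k, eps=1e-9):
    """Intersect two lines given as rho = x cos(theta) + y sin(theta).

    Returns (x, y) in Cartesian coordinates, or None if the lines are
    (nearly) parallel. The helper name and eps are illustrative only.
    """
    a = np.array([[np.cos(theta_j), np.sin(theta_j)],
                  [np.cos(theta_k), np.sin(theta_k)]])
    b = np.array([rho_j, rho_k], dtype=float)
    # The determinant equals sin(theta_k - theta_j); it vanishes for parallel lines.
    if abs(np.linalg.det(a)) < eps:
        return None
    x, y = np.linalg.solve(a, b)
    return float(x), float(y)

# Vertical line x = 3 (theta = 0) and horizontal line y = 4 (theta = pi/2).
print(intersect_lines(3.0, 0.0, 4.0, np.pi / 2))
```

Calling it once per orthogonal pair yields the four corners; the inputs are polar (rho, theta) line parameters, while the outputs are Cartesian points.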

I am not a mathematician. I am willing to stand corrected...
From Hough 2) ... any line on the xy plane can be described as p = x cos theta + y sin theta. In this representation, p is the normal distance and theta is the normal angle of a straight line, ... In practical applications, the angles theta and distances p are quantized, and we obtain an array C(p, theta).
From the CRC Standard Math Tables, Analytic Geometry, Polar Coordinates in a Plane section ...
Such an ordered pair of numbers (r, theta) are called polar coordinates of the point p.
Straight lines: let p = distance of line from O, w = counterclockwise angle from OX to the perpendicular through O to the line. Normal form: r cos(theta - w) = p.
From this I conclude that the points lie in polar coordinate space.

Related

How to obtain the orientation of a square in the real world from an image knowing the centroid position in the real world and the image projection

I'm trying to obtain the orientation of a square in the real world from an image. I know the projection of each vertex in the image and with this and a depth camera I can obtain the position of the centroid in the real world.
I need the orientation of the square (actually, the normal vector to the plane) and the depth camera has not enough resolution. The camera parameters are also known.
I've searched, and I've only found estimation algorithms that are overkill, designed for problems with much less information. In this case I have a lot of data about the shape, distance, camera, image, etc., but I have not been able to get the orientation.
Thanks in advance.
I assume the image is captured with an ordinary camera, and that your "square" is well approximated by an actual geometrical rectangle, with parallel opposite sides and orthogonal adjacent ones.
If you only need the square's normal, and the camera is calibrated (in particular, the nonlinear lens distortion is removed from the image), then it can trivially be obtained from the vanishing points and the center. The algorithm is as follows:
Express the images of the four vertices p_i, i=1..4, in homogeneous coordinates: p_i = (u_i, v_i, 1). The ordering of i is unimportant, but in the following I assume it's clockwise starting from any one vertex. Also, for convenience, where in the following I write, say, i + n, it's assumed that the addition is modulo 4, so that, e.g., i + 1 = 1 when i = 4.
Compute the equations of the lines covering the square sides: l_i = p_(i+1) X p_i, where X represents the cross product.
Compute the equations of the diagonals: d_13 = p_1 X p_3, d_24 = p_2 X p_4.
Compute the center: c = d_13 X d_24.
Compute the vanishing points of the pairs of parallel sides: v_13 = l_1 X l_3, v_24 = l_2 X l_4. They represent the directions of the images of two lines which, in 3D, are orthogonal to each other.
Compute the images of the axes of the 3D orthogonal coordinate frame rooted at the square center, and with two of the axes parallel to the square sides: x = c X v_13, y = c X v_24.
Lastly, the plane normal, in 3D camera coordinate frame, is their cross product: z = x X y .
Note that removing the distortion is important, because even a small amount of distortion can greatly affect the location of the vanishing points when the square sides are nearly parallel.
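The vanishing-point part of these steps can be sketched in NumPy as follows. This is an illustration under stated assumptions, not the answer's exact recipe: the intrinsic matrix `K` is assumed known, the function name `square_normal` is hypothetical, and the normal is taken as the cross product of the two back-projected vanishing directions K^-1 v (a standard formulation). That replaces the final x X y step, since two image lines that both pass through c intersect at c itself.

```python
import numpy as np

def square_normal(vertices_px, K):
    """Estimate the unit normal of a planar square from its four image vertices.

    vertices_px: four (u, v) pixel positions, ordered around the square.
    K: 3x3 intrinsic matrix (assumed known; the image is assumed undistorted).
    The sign of the returned normal is ambiguous.
    """
    p = np.column_stack([np.asarray(vertices_px, dtype=float), np.ones(4)])
    # Lines covering the sides: l_i = p_(i+1) x p_i, indices taken modulo 4.
    l = [np.cross(p[(i + 1) % 4], p[i]) for i in range(4)]
    # Vanishing points of the two pairs of parallel sides.
    v_a = np.cross(l[0], l[2])
    v_b = np.cross(l[1], l[3])
    # Back-project each vanishing point to a 3D direction: d = K^-1 v.
    K_inv = np.linalg.inv(K)
    d_a = K_inv @ v_a
    d_b = K_inv @ v_b
    d_a /= np.linalg.norm(d_a)
    d_b /= np.linalg.norm(d_b)
    # The two side directions span the plane, so their cross product is the normal.
    n = np.cross(d_a, d_b)
    return n / np.linalg.norm(n)
```

Because the vanishing points stay in homogeneous form, the sketch also copes with a fronto-parallel square, where they lie at infinity.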
If you want to know why this works, the following excerpt from Hartley and Zisserman's "Multiple View Geometry in Computer Vision" should suffice:

Finding edges in a height map

I want to find sharp edges in a heightmap image, while ignoring shallow edges.
OpenCV offers multiple approaches to finding edges in a 2d Image: Canny, Sobel, etc.
However, all these approaches work by comparing the intensity values on both sides of the edge.
If the 2D Image represents a height map of a 3D object, then this results in some weird behaviour.
In a height map, the height of a 3D object at a given X/Y coordinate is represented as the intensity of the 2D Pixel at that X/Y coordinate:
In the above picture, at the edge B the intensity changes only slightly between the left and right side, even though it is a sharp corner.
At the edge A, there is a big change in the intensity between pixels on the left side of the edge and the right, even though it is only a shallow angle.
So there is no threshold for Canny or Sobel that will preserve the sharp edge but filter the shallow edge.
(In the above example, the edge B has one side with an ascending slope, and one side with a descending slope. I could filter for this feature; but that would remove the edges C and D as well)
How can I get a binary edge image, containing only edges above a certain angle? (e.g. edge B, C, and D, but not A)
Or alternatively, how can I get a gradient derivative image, where the intensity of each pixel is proportional to the angle of the edge at that pixel?
Probably you'll want to use second derivative instead of first for this task.
Here's my intuition: the derivative of height (intensity in your case) at each position on an evenly spaced grid is, up to the grid-spacing scale, the tangent of the surface slope angle between sampling points (or at the sampling points if you use a two-sided derivative approximation), so its arctan gives the slope angle. But since you want to detect sharp edges, you are looking for the derivative of that slope angle at the sampling points. This means that you can set a threshold on the derivative of the arctan of the derivative of intensity to achieve your goal (luckily there's no "need to go deeper" :) )
You will have to be extra careful when taking the derivative of the "slope angles" that you get: depending on the coordinate system, you may come across an ambiguity in the angle difference (there are two ways to get from one angle to another, which differ in the general case; you're looking for the "shorter" one). You can look for a possible solution here
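A minimal NumPy sketch of this idea, assuming the height values are expressed in the same units as the pixel spacing (rescale the intensities first if they are not). The function names and the threshold are hypothetical. Working along one axis at a time keeps each slope angle inside (-pi/2, pi/2), which sidesteps the angle-difference ambiguity in this simple setup:

```python
import numpy as np

def bend_angle_map(height):
    """Per-pixel change of slope angle (radians per pixel) of a height map.

    height: 2D array whose values are assumed to be in pixel-spacing units.
    """
    h = np.asarray(height, dtype=float)
    bends = []
    for axis in (0, 1):
        # First derivative -> slope, arctan -> slope angle along this axis.
        slope_angle = np.arctan(np.gradient(h, axis=axis))
        # Derivative of the slope angle: large where the surface bends sharply.
        bends.append(np.abs(np.gradient(slope_angle, axis=axis)))
    # Keep the stronger of the row-wise and column-wise responses.
    return np.maximum(bends[0], bends[1])

def sharp_edges(height, min_bend):
    """Binary mask of pixels whose bend exceeds min_bend (hypothetical threshold)."""
    return bend_angle_map(height) > min_bend
```

With central differences a single crease is smeared over about three pixels, so the mask may need thinning before it is used as an edge image.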
I have a rather simple approach that I came across while reading a blog post.
It involves computing the median value of the gray scale image. Using this value we can now set two threshold values:
lower: max(0, (1.0 - 0.33) * v)
upper: min(255, (1.0 + 0.33) * v)
Now pass these two values as parameters into the cv2.Canny() function.
You will now be able to perform an optimized edge detection given any image. The crux of this answer depends on the median value of the image which varies for different images.
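The threshold computation can be sketched with NumPy alone. The function name `auto_canny_thresholds` and the `sigma` parameter name are assumptions, and the values are rounded to integers here as a design choice:

```python
import numpy as np

def auto_canny_thresholds(gray, sigma=0.33):
    """Derive the two Canny hysteresis thresholds from the image median."""
    v = float(np.median(gray))
    # Clamp to the 8-bit range, then round to the nearest integer.
    lower = int(round(max(0.0, (1.0 - sigma) * v)))
    upper = int(round(min(255.0, (1.0 + sigma) * v)))
    return lower, upper
```

With OpenCV available, the result can then be passed on as `cv2.Canny(gray, lower, upper)`. Note that this adapts the thresholds to overall image brightness; it does not by itself distinguish sharp from shallow edges in a height map.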
If I understand your question correctly, "what you need is basically a corner with high intensity values".
If that is so, then look at the Harris corner detector, which would help you find points with a high gradient change in both directions.
http://docs.opencv.org/2.4/doc/tutorials/features2d/trackingmotion/harris_detector/harris_detector.html
Once you detect the corners you can filter the corners which have high intensity by using a suitable threshold.

calculate the distance of a contour point to contour at a specific angle

Now I have a set of contour points. I have a ray L which starts at Pn and makes an angle of ALPHA, measured clockwise, with the horizontal axis. I want to calculate the length of the line that starts at Pn and ends at the point where ray L intersects the contour, which in this case is a point between Pn-2 and Pn-3. How can I calculate this length quickly and efficiently?
No algorithm can solve this in faster than linear time, since the number of intersections may be linear, and so is the size of the output. I can suggest the following algorithm, which is quite convenient and efficient to implement:
Transform the points to a coordinate system x',y' whose origin is Pn and whose x' axis is parallel to L. (In practice only the y' coordinate needs to be calculated. This requires 2 multiplications and 2 additions per point.)
Now find all the intersecting segments by searching for adjacent indices where the y' coordinate changes sign.
Calculate the intersection and length only for these segments.
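These steps can be sketched as follows, assuming NumPy, a closed contour, and a ray direction of (cos ALPHA, sin ALPHA) in the coordinate system of the points (with image coordinates, where y grows downward, that corresponds to a clockwise angle on screen). The function name and the `eps` tolerance are hypothetical:

```python
import math
import numpy as np

def ray_contour_distance(contour, start_idx, alpha, eps=1e-9):
    """Distance from contour[start_idx] along a ray to the nearest contour hit.

    contour: (N, 2) sequence of points of a closed polyline.
    alpha: ray angle in radians, direction taken as (cos, sin) in point coordinates.
    Returns None if the ray does not hit the contour.
    """
    pts = np.asarray(contour, dtype=float)
    rel = pts - pts[start_idx]
    c, s = math.cos(alpha), math.sin(alpha)
    x_rot = rel[:, 0] * c + rel[:, 1] * s    # coordinate along the ray (x')
    y_rot = -rel[:, 0] * s + rel[:, 1] * c   # signed offset from the ray (y')
    n = len(pts)
    best = None
    for i in range(n):
        j = (i + 1) % n
        if start_idx in (i, j):
            continue  # skip the two segments that touch Pn itself
        if y_rot[i] * y_rot[j] > 0 or y_rot[i] == y_rot[j]:
            continue  # both end points lie on the same side of the ray
        # Linear interpolation to the point where y' crosses zero.
        t = y_rot[i] / (y_rot[i] - y_rot[j])
        x_hit = x_rot[i] + t * (x_rot[j] - x_rot[i])
        if x_hit > eps and (best is None or x_hit < best):
            best = x_hit  # keep the nearest hit in front of Pn
    return None if best is None else float(best)
```

The sketch returns the nearest hit; for concave contours with several intersections, a different selection rule may be the right one.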
You could just compute the intersection of ray L with all line segments consisting of any pair of neighbouring contour points.
Of course you might want to optimize this process by sorting by distance to Pn or whatever. Depending on the contour (concave shape?) there could be multiple intersections, so you have to choose the right one (inner, outer, ...).
Instead of computing the intersection, you could also draw the contour and the ray (e.g. using OpenCV) and find the point of intersection by using a logical AND.

Detect aligned points in set of points OpenCV

Given a set of points on an image, I want to detect groups of aligned points as shown in the figure:
How can I do this? Any help will be appreciated.
This is a good potential application of the Hough Transform. The Hough space for lines is (r, \theta) where r is the distance from origin to closest point on line and \theta is its orientation.
Each point in x-y space becomes a sinusoid in Hough space as shown in the Wiki article.
The place where all the sinusoids intersect corresponds to a single line that passes through all the points. If the points are not perfectly collinear, the intersection will be "fuzzy".
The simplest algorithm to fit lines to points is to make a rectangular (r, \theta) accumulator array set to zero initially. Then trace a sinusoid for each point into this discrete (r, \theta) space, incrementing each accumulator element by a fixed amount. Find prospective line fits by looking for large array elements. The element coordinates give (r, \theta) for the fit.
Tracing the sinusoid is straightforward. If you have T accumulator bins on the \theta axis, then each corresponds to an angle k\pi/T for some 0 <= k < T. So for each k in this range, calculate the distance from the origin to the closest point of a line with this orientation passing through the point. This provides an r value. If there are R bins on the r axis and rMax is the maximum value of r, then increment bin (floor(r/rMax*R), k).
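A minimal NumPy sketch of this accumulator, with a hypothetical function name and default bin counts. One deliberate difference from the indexing above: r = x cos(\theta) + y sin(\theta) can be negative when \theta is limited to [0, \pi), so the sketch shifts r by rMax and bins over [-rMax, rMax]:

```python
import numpy as np

def hough_accumulate(points, n_theta=180, n_r=200):
    """Vote each point's sinusoid into an (r, theta) accumulator."""
    pts = np.asarray(points, dtype=float)
    r_max = np.hypot(pts[:, 0], pts[:, 1]).max()
    thetas = np.arange(n_theta) * np.pi / n_theta    # k * pi / T
    cols = np.arange(n_theta)
    acc = np.zeros((n_r, n_theta), dtype=int)
    for x, y in pts:
        r = x * np.cos(thetas) + y * np.sin(thetas)  # signed distance per theta
        # Shift r from [-r_max, r_max] to a row index in [0, n_r - 1].
        rows = np.floor((r + r_max) / (2 * r_max) * (n_r - 1)).astype(int)
        rows = np.clip(rows, 0, n_r - 1)
        acc[rows, cols] += 1
    return acc, thetas, r_max
```

Large entries of `acc` are the candidate lines; the row and column of a peak map back to r and \theta. The sketch assumes at least one point lies away from the origin, so that `r_max` is nonzero.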
As a start, you can try this:
List all lines that can be formed by selecting any two of these points (n(n-1)/2 lines for n points).
For any two of these lines, check if they are aligned (i.e. slope difference within, say, 10 degrees).
For each aligned pair of lines, you can easily check whether other points are also aligned on these lines. These points will be the aligned points you need.

Getting corners from convex points

I have written an algorithm to extract the points shown in the image. They form a convex shape and I know their order. How do I extract the corners (top 3 and bottom 3) from such points?
I'm using OpenCV.
If you already have the convex hull of the object, and that hull includes the corner points, then all you need to do is simplify the hull until it only has 6 points.
There are many ways to simplify polygons, for example you could just use this simple algorithm used in this answer: How to find corner coordinates of a rectangle in an image
do
    for each point P on the convex hull:
        measure its distance to the line AB
        between the point A before P and the point B after P
    remove the point with the smallest distance
repeat until 6 points left
If you do not know the exact number of points, then you could remove points until the minimum distance rises above a certain threshold.
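The pseudocode can be sketched in plain Python as follows. The function name `simplify_hull` and the `n_keep` parameter are hypothetical, and the hull is assumed to be an ordered list of vertices without a repeated closing point:

```python
import math

def simplify_hull(points, n_keep=6):
    """Drop the least significant hull vertex until n_keep vertices remain.

    points: hull vertices in order (closed polygon, first point not repeated).
    """
    pts = [tuple(map(float, p)) for p in points]

    def dist_to_neighbour_line(i):
        ax, ay = pts[i - 1]                  # A: the point before P (wraps around)
        px, py = pts[i]
        bx, by = pts[(i + 1) % len(pts)]     # B: the point after P
        length = math.hypot(bx - ax, by - ay)
        if length == 0.0:
            return math.hypot(px - ax, py - ay)
        # |cross(B - A, P - A)| / |B - A| is the distance from P to line AB.
        return abs((bx - ax) * (py - ay) - (by - ay) * (px - ax)) / length

    while len(pts) > max(n_keep, 3):
        i_min = min(range(len(pts)), key=dist_to_neighbour_line)
        del pts[i_min]
    return pts
```

Each pass recomputes all distances, so the sketch is quadratic in the number of hull points, which is usually acceptable for hulls of modest size. For the threshold variant, the loop condition would compare the smallest distance against the chosen threshold instead of counting points.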
You could also use Ramer-Douglas-Peucker to simplify the polygon; OpenCV already has that implemented in cv::approxPolyDP.
Just modify the OpenCV squares sample to use 6 points instead of 4.
Instead of trying to directly determine which of your feature points correspond to corners, how about applying a corner detection algorithm to the entire image and then looking for which of your feature points appear close to peaks in the corner detector?
I'd suggest starting with a Harris corner detector. The OpenCV implementation is cv::cornerHarris.
Essentially, the Harris algorithm applies both a horizontal and a vertical Sobel filter to the image (or some other approximation of the partial derivatives of the image in the x and y directions).
It then constructs a 2 by 2 structure matrix at each image pixel, looks at the eigenvalues of that matrix, and calls points corners if both eigenvalues are above some threshold.
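To make the structure-matrix idea concrete, here is a NumPy-only sketch. It is a simplification and not OpenCV's implementation: it uses central differences instead of Sobel filters, a plain box window, and the "both eigenvalues above a threshold" test described above rather than the det - k*trace^2 response that cv::cornerHarris computes. The function names are hypothetical:

```python
import numpy as np

def min_eigenvalue_map(image, radius=1):
    """Smaller eigenvalue of the 2x2 structure matrix at every pixel."""
    img = np.asarray(image, dtype=float)
    iy, ix = np.gradient(img)  # derivatives along rows (y) and columns (x)

    def box_sum(a):
        # Sum over a (2*radius+1)^2 window; np.roll wraps around at the borders.
        out = np.zeros_like(a)
        for dy in range(-radius, radius + 1):
            for dx in range(-radius, radius + 1):
                out += np.roll(a, (dy, dx), axis=(0, 1))
        return out

    sxx, syy, sxy = box_sum(ix * ix), box_sum(iy * iy), box_sum(ix * iy)
    # Eigenvalues of [[sxx, sxy], [sxy, syy]] are mean +/- sqrt(diff^2 + sxy^2).
    mean = 0.5 * (sxx + syy)
    diff = 0.5 * (sxx - syy)
    return mean - np.sqrt(diff * diff + sxy * sxy)

def corner_mask(image, threshold):
    """Pixels where both eigenvalues exceed threshold (hypothetical helper)."""
    return min_eigenvalue_map(image) > threshold
```

Along a straight edge only one gradient direction is present, so the smaller eigenvalue stays near zero; at a corner both directions contribute and it rises. Feature points that fall on or next to the mask are the corner candidates.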
