In the below picture, I have the 2D locations of the green points and I want to calculate the locations of the red points, or, as an intermediate step, I want to calculate the locations of the blue points. All in 2D.
Of course, I do not only want to find those locations for the picture above. In the end, I want an automated algorithm which takes a set of checkerboard corner points to calculate the outer corners.
I need the resulting coordinates to be as accurate as possible, so I think that I need a solution which does not only take the outer green points into account, but which also uses all the other green points' locations to calculate a best fit for the outer corners (red or blue).
If OpenCV can do this, please point me in that direction.
In general, if all you have is the detection of some, but not all, of the inner corners, the problem cannot be solved. This is because the configuration is invariant to translation: shifting the physical checkerboard by whole squares would produce the same detected corner positions in the image, but from different physical corners.
Further, the configuration is also invariant to rotations by 180 deg in the checkerboard plane and, unless you are careful to distinguish between the colors of the squares adjacent to each corner, to rotations by 90 deg and reflections with respect to the center and the midlines.
This means that, in addition to detecting the corners, you need to extract from the image some features of the physical checkerboard that can be used to break the above invariance. The simplest break is to detect all 9 corners of one row and one column, or at least their end-corners. They can be used directly to rectify the image by imposing the condition that their lines be at a 90 deg angle. However, this may turn out to be impossible due to occlusions or detector failure, and more sophisticated methods may be necessary.
For example, you can try to directly detect the chessboard edges, i.e. the fat black lines at the boundary. One way to do that, for example, would be to detect the letters and numbers nearby, and use those locations to constrain a line detector to nearby areas.
By the way, if the photo you posted is just a red herring, and you are interested in detecting general checkerboard-like patterns, and can control the kind of pattern, there are way more robust methods of doing it. My personal favorite is the "known 2D crossratios" pattern of Matsunaga and Kanatani.
I solved it robustly, but not accurately, with the following solution:
Find lines with at least 3 green points closely matching the line. (thin red lines in pic)
Keep bounding lines: From these lines, keep those with points only to one side of the line or very close to the line.
Filter bounding lines: From the bounding lines, take the 4 best ones/those with most points on them. (bold white lines in pic)
Calculate the intersections of the 4 remaining bounding lines (none of the lines are perfectly parallel, so this results in 6 intersections, of which we want only 4).
From the intersections, remove the one farthest from the average position of the intersections until only 4 of them are left.
That's the 4 blue points.
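The last two steps (intersecting the 4 bounding lines and dropping the farthest intersections) could look roughly like the following Python sketch. The line representation a*x + b*y = c and the helper names intersect and outer_corners are assumptions made for illustration, not my original implementation.

```python
from itertools import combinations

def intersect(l1, l2, eps=1e-12):
    """Intersect two lines given as (a, b, c) with a*x + b*y = c.

    Returns None if the lines are (numerically) parallel.
    """
    a1, b1, c1 = l1
    a2, b2, c2 = l2
    det = a1 * b2 - a2 * b1
    if abs(det) < eps:
        return None
    # Cramer's rule for the 2x2 system.
    return ((c1 * b2 - c2 * b1) / det, (a1 * c2 - a2 * c1) / det)

def outer_corners(lines):
    """Intersect all pairs of lines, then drop the intersection farthest
    from the mean position until only 4 points are left."""
    pts = []
    for l1, l2 in combinations(lines, 2):
        p = intersect(l1, l2)
        if p is not None:
            pts.append(p)
    while len(pts) > 4:
        cx = sum(x for x, _ in pts) / len(pts)
        cy = sum(y for _, y in pts) / len(pts)
        farthest = max(pts, key=lambda p: (p[0] - cx) ** 2 + (p[1] - cy) ** 2)
        pts.remove(farthest)
    return pts
```

With 4 almost-but-not-quite parallel pairs of lines, the two unwanted intersections lie far away from the board, which is why removing the farthest points first tends to leave the 4 board corners.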
You can then feed these 4 points into OpenCV's getPerspectiveTransform function to find a perspective transform (aka a homography):
// Source quad: the 4 detected outer corners (blue points) in image coordinates.
// They must be in the same order as dstPoints below.
std::vector<Point2f> detectedCorners = CheckDet::getOuterCheckerboardCorners(srcImg);
CV_Assert(detectedCorners.size() >= 4); // fewer corners would leave srcPoints uninitialized
Point2f srcPoints[4]; // stack arrays instead of malloc, so nothing has to be freed
for (int i = 0; i < 4; i++) {
    srcPoints[i] = detectedCorners[i];
}
// Destination quad: a square inset by 1/8 of the output image on each side.
const int dstImgSize = 400;
Point2f dstPoints[4];
dstPoints[0] = Point2f(dstImgSize * 1/8, dstImgSize * 1/8);
dstPoints[1] = Point2f(dstImgSize * 7/8, dstImgSize * 1/8);
dstPoints[2] = Point2f(dstImgSize * 7/8, dstImgSize * 7/8);
dstPoints[3] = Point2f(dstImgSize * 1/8, dstImgSize * 7/8);
Mat m = getPerspectiveTransform(srcPoints, dstPoints);
For our example image, the input and output of getPerspectiveTransform look like this:
input
(349.1, 383.9) -> ( 50.0, 50.0)
(588.9, 243.3) -> (350.0, 50.0)
(787.9, 404.4) -> (350.0, 350.0)
(506.0, 593.1) -> ( 50.0, 350.0)
output
( 1.6 -1.1 -43.8 )
( 1.4 2.4 -1323.8 )
( 0.0 0.0 1.0 )
You can then transform the image's perspective to board coordinates:
Mat plainBoardImg;
warpPerspective(srcImg, plainBoardImg, m, Size(dstImgSize, dstImgSize));
Results in the following image:
For my project, the red points that you can see on the board in the question are not needed anymore, but I'm sure they can be calculated easily from the homography by inverting it and then using the inverse to back-transform the points (0, 0), (0, dstImgSize), (dstImgSize, dstImgSize), and (dstImgSize, 0).
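I have not needed this step myself, so the following is only a minimal Python sketch of the idea, using plain lists instead of OpenCV types. The function names invert3x3, apply_homography and board_outline_in_image are made up for illustration. In OpenCV itself, m.inv() together with perspectiveTransform should do the same job.

```python
def invert3x3(m):
    """Invert a 3x3 matrix (list of 3 rows) via the adjugate formula."""
    (a, b, c), (d, e, f), (g, h, i) = m
    det = a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)
    if abs(det) < 1e-12:
        raise ValueError("matrix is singular")
    return [
        [(e * i - f * h) / det, (c * h - b * i) / det, (b * f - c * e) / det],
        [(f * g - d * i) / det, (a * i - c * g) / det, (c * d - a * f) / det],
        [(d * h - e * g) / det, (b * g - a * h) / det, (a * e - b * d) / det],
    ]

def apply_homography(m, pt):
    """Map a 2D point through a 3x3 homography, including the divide by w."""
    x, y = pt
    w = m[2][0] * x + m[2][1] * y + m[2][2]
    return ((m[0][0] * x + m[0][1] * y + m[0][2]) / w,
            (m[1][0] * x + m[1][1] * y + m[1][2]) / w)

def board_outline_in_image(m, dst_img_size):
    """Back-transform the corners of the rectified image into the source image."""
    inv = invert3x3(m)
    corners = [(0, 0), (0, dst_img_size), (dst_img_size, dst_img_size), (dst_img_size, 0)]
    return [apply_homography(inv, p) for p in corners]
```

Note that the matrix printed above is rounded to one decimal, so its third row shows as (0.0 0.0 1.0). For the back-transform you would want the full-precision values.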
The algorithm works surprisingly reliably. However, it does not use all the available information, because it uses only the outer points (those which are connected by the white lines). It does not use any data from the inner points for additional accuracy. I would still like to find an even better solution, which uses the data of the inner points.
Related
I am trying to find corners of a square, potentially rotated shape, to determine the direction of its primary axes (horizontal and vertical) and be able to do a perspective transform (straighten it out).
From a prior processing stage I obtain the coordinates of a point (red dot in image) belonging to the shape. Next I do a flood-fill of the shape on a thresholded version of the image to determine its center (not shown) and area, by summing up X and Y of all filled pixels and dividing them by the area (number of pixels filled).
Given this information, what is an easy and reliable way to determine the corners of the shape (blue arrows)?
I was thinking about keeping track of P1, P2, P3, P4 where P1 is (minX, minY), P2 is (minX, maxY), P3 is (maxX, minY) and P4 is (maxX, maxY), so P1 is the point with the smallest value of X encountered and, of all those points, the one where Y is smallest too. Then sort them to get a clockwise ordering. But I'm not sure whether this is correct in all cases, or efficient.
PS: I can't use OpenCV.
Looking at your image, the directions of the 2 axes of the 2D pattern coordinate system can be estimated from a histogram of gradient directions.
When you build such a histogram, 4 peaks will show up clearly.
If the image was captured from the front (an image without perspective; your image looks like this case), the angles between adjacent peaks are ideally all 90 degrees.
The directions of the 2 axes of the pattern coordinate system can then be estimated directly from those peaks.
After that, the 4 corners can be estimated simply from an "axis-aligned bounding box" (aligned with the estimated axes, of course).
If not (when the image has perspective), the 4 peaks indicate which edge lines run along the axes of the pattern coordinates.
So, for example, you can estimate a corner location as the intersection of the 2 lines that run along the adjacent edges.
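For the frontal (no perspective) case, the bounding-box step could be sketched in Python like this. It assumes the axis direction has already been estimated from the histogram peaks and is passed in as axis_angle (in radians). The function name oriented_box_corners is only illustrative.

```python
import math

def oriented_box_corners(points, axis_angle):
    """Bounding box of `points`, aligned with an axis rotated by `axis_angle`.

    Projects every point onto the estimated axis (u) and its perpendicular (v),
    takes min/max along both, and maps the 4 box corners back to image
    coordinates.
    """
    ux, uy = math.cos(axis_angle), math.sin(axis_angle)  # first pattern axis
    vx, vy = -uy, ux                                     # second axis, 90 degrees away
    us = [x * ux + y * uy for x, y in points]
    vs = [x * vx + y * vy for x, y in points]
    u0, u1, v0, v1 = min(us), max(us), min(vs), max(vs)
    return [(u * ux + v * vx, u * uy + v * vy)
            for u, v in ((u0, v0), (u1, v0), (u1, v1), (u0, v1))]
```

The points can be all filled pixels of the shape or just its border pixels; the result is the same because only the extremes along each axis matter.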
What I eventually ended up doing is the following:
Trace the edges of the contour using Moore-Neighbour Tracing --> this gives me a sequence of points lying on the border of the rectangle.
During the trace, I observe changes in rectangular distance between the first and last points in a sliding window. The idea is inspired by the paper "The outline corner filter" by C. A. Malcolm (https://spie.org/Publications/Proceedings/Paper/10.1117/12.939248?SSO=1).
This is giving me accurate results for low computational overhead and little space.
I have an array of data from a grayscale image that I have segmented sets of contiguous points of a certain intensity value from.
Currently I am doing a naive bounding box routine where I find the minimum and maximum (x,y) [row, col] points. This obviously does not provide the smallest possible box that contains the set of points which is demonstrable by simply rotating a rectangle so the longest axis is no longer aligned with a principal axis.
What I wish to do is find the minimum sized oriented bounding box. This seems to be possible using an algorithm known as rotating calipers, however the implementations of this algorithm seem to rely on the idea that you have a set of vertices to begin with. Some details on this algorithm: https://www.geometrictools.com/Documentation/MinimumAreaRectangle.pdf
My main issue is in finding the vertices within the data that I currently have. I believe I need to at least find candidate vertices in order to reduce the amount of iterations I am performing, since the amount of points is relatively large and treating the interior points as if they are vertices is unnecessary if I can figure out a way to not include them.
Here is some example data that I am working with:
Here's the segmented scene using the naive algorithm, where it segments out the central objects relatively well due to the objects mostly being aligned with the image axes:
In red, you can see the current bounding boxes that I am drawing utilizing 2 vertices: top-left and bottom-right corners of the groups of points I have found.
The rotation part is where my current approach fails, as I am only defining the bounding box using two points, anything that is rotated and not axis-aligned will occupy much more area than necessary to encapsulate the points.
Here's an example with rotated objects in the scene:
Here's the current naive segmentation's performance on that scene, which is drawing larger than necessary boxes around the rotated objects:
Ideally the result would be bounding boxes aligned with the longest axis of the points that are being segmented, which is what I am having trouble implementing.
Here's an image roughly showing what I am really looking to accomplish:
You can also notice unnecessary segmentation done in the image around the borders as well as some small segments, which should be removed with some further heuristics that I have yet to develop. I would also be open to alternative segmentation algorithm suggestions that provide a more robust detection of the objects I am interested in.
I am not sure if this question will be completely clear, therefore I will try my best to clarify if it is not obvious what I am asking.
It's late, but that might still help. This is what you need to do:
expand pixels to make small segments connect larger bodies
find connected bodies
select a sample of pixels from each body
find the MBR ([oriented] minimum bounding rectangle) for selected set
For the first step you can perform dilation. It's somewhat like DBSCAN clustering. For step 3 you can simply select random pixels from a uniform distribution. Obviously the more pixels you keep, the more accurate the MBR will be. I tested this in MATLAB:
% import image as a matrix of 0s and 1s
oI = ~im2bw(rgb2gray(imread('vSb2r.png'))); % original image
% expand pixels
dI = imdilate(oI,strel('disk',4)); % dilated
% find connected bodies of pixels
CC = bwconncomp(dI);
L = labelmatrix(CC) .* uint8(oI); % labeled
% mark some random pixels
rI = rand(size(oI))<0.3;
sI = L.* uint8(rI) .* uint8(oI); % sampled
% find MBR for a set of connected pixels
for i=1:CC.NumObjects
[Y,X] = find(sI == i);
mbr(i) = getMBR( X, Y );
end
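The getMBR function is not shown above. As a rough, hypothetical stand-in (written in Python rather than MATLAB, with the names convex_hull and min_area_rect chosen for illustration), the sketch below takes the convex hull of the sampled pixels and tries a rectangle aligned with each hull edge, relying on the known result that a minimum-area enclosing rectangle has a side collinear with a hull edge. It is a simple O(h^2) variant over h hull vertices, not a true rotating-calipers implementation.

```python
import math

def convex_hull(points):
    """Andrew's monotone chain; hull vertices in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]

def min_area_rect(points):
    """Return (area, corners) of the smallest rectangle with a side on a hull edge."""
    hull = convex_hull(points)
    if len(hull) < 3:
        raise ValueError("need at least 3 non-collinear points")
    best = None
    for i in range(len(hull)):
        (x0, y0), (x1, y1) = hull[i], hull[(i + 1) % len(hull)]
        length = math.hypot(x1 - x0, y1 - y0)
        ux, uy = (x1 - x0) / length, (y1 - y0) / length  # unit vector along the edge
        us = [x * ux + y * uy for x, y in hull]           # coordinates along the edge
        vs = [-x * uy + y * ux for x, y in hull]          # coordinates across the edge
        area = (max(us) - min(us)) * (max(vs) - min(vs))
        if best is None or area < best[0]:
            corners = [(u * ux - v * uy, u * uy + v * ux)
                       for u, v in ((min(us), min(vs)), (max(us), min(vs)),
                                    (max(us), max(vs)), (min(us), max(vs)))]
            best = (area, corners)
    return best
```

Because only hull vertices matter, interior pixels never need to be treated as candidate vertices, which also addresses the concern in the question about the large number of points.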
You can also remove some ineffective pixels using some more processing and morphological operations:
remove holes
find boundaries
find skeleton
In MATLAB:
I = imfill(I, 'holes');
I = bwmorph(I,'remove');
I = bwmorph(I,'skel');
I have a photo of a Go-board, which is basically a grid with n*n squares, each of size a.
Depending on how the image was taken, the grid can have either one vanishing point like this (n = 15, board size b = 15*a):
or two vanishing points like this (n = 9, board size b = 9*a):
So what is available to me are the four screen space coordinates of the four corners of the flat board: p1, p2, p3, p4.
What I would like to do is to calculate the corresponding four screen space coordinates q1, q2, q3, q4 of the corners of the board, if the board was moved 'upward' (perpendicular to the plane of the board) in world space by a, or in other words the coordinates on top of the board, if the board had a thickness of a.
Is the information about the four points even sufficient to calculate this?
If this is not enough information, maybe it would help to make the assumption that the distance of the camera to the center of the board is typically of the order of 1.5 or 2 times the board size b?
From my understanding, the four lines p1-q1, p2-q2, p3-q3, p4-q4 would all go through the same (yet unknown) vanishing point, located somewhere below the board.
Maybe a sufficient approximation (because typically for a Go board n=18 and therefore square size a is small in comparison to the board size) for the direction of each of the lines p1-q1, p2-q2, ... in screen space would be to simply choose a line perpendicular to the horizon (given by the two vanishing points vp1-vp2 or by p1-p2 in the case of only one vanishing point)?
Having made this approximation, still the length of the four lines p1-q1, p2-q2, p3-q3, p4-q4 would need to be calculated ...
Any hints are highly appreciated!
PS: I am using Objective-C & OpenCV
Not yet a full answer, but this might help to move forward. As MvG pointed out, 4 points alone are not enough. Luckily we know the board is a square, so even with perspective distortion the diagonals in 2D should intersect at the board center (unless serious fish-eye or other distortions are present in the image). Here is a test image (rendered with OpenGL) that I used as test input:
The grayish surface is a 2D QUAD drawn using the 2D perspective-distorted corner points (your input). The aqua/bluish grid is the 3D OpenGL grid that I created the 2D corner points from (to see if they match). The green lines are the 2D diagonals, and the orange points are the 2D corner points and the intersection of the diagonals. As you can see, the 2D diagonal intersection corresponds exactly to the center of the 3D board's middle cell.
Now we can use the ratio between the half-diagonal lengths to fit the perspective. If we handle cell coordinates in the range <0,9>, we want to achieve a further division of the half diagonals like this:
I am still not sure exactly how (the linear ratio l0/(l0+l1) is not working), so I need to inspect the perspective mapping equations to find the relative ratio dependence and compute the inverse (when I have the time and mood for this).
If that succeeds, then we can compute any point along the diagonals (we want the cell edges). Once that is done, we can easily compute the visual size of any cell of size a and use the vanishing point without any 3D transform matrices at all.
In case this is not doable there is still the option to use DIP/CV techniques to detect the cell crossings like this:
OpenCV Birdseye view without loss of data
using just bullet #2, but for that you need to take into account the type of images you will have and adjust the detector or add preprocessing for it ...
Now back to your offsetting: you can simply offset your cells up by the visual size of the cell like this:
And handle the left side points (either interpolate the size or use the same size as the neighboring cell). That should work unless too weird angles of the board are used.
I want to find sharp edges in a heightmap image, while ignoring shallow edges.
OpenCV offers multiple approaches to finding edges in a 2d Image: Canny, Sobel, etc.
However, all these approaches work by comparing the intensity values on both sides of the edge.
If the 2D Image represents a height map of a 3D object, then this results in some weird behaviour.
In a height map, the height of a 3D object at a given X/Y coordinate is represented as the intensity of the 2D Pixel at that X/Y coordinate:
In the above picture, at the edge B the intensity changes only slightly between the left and right side, even though it is a sharp corner.
At the edge A, there is a big change in the intensity between pixels on the left side of the edge and the right, even though it is only a shallow angle.
So there is no threshold for Canny or Sobel that will preserve the sharp edge but filter the shallow edge.
(In the above example, the edge B has one side with an ascending slope, and one side with a descending slope. I could filter for this feature; but that would remove the edges C and D as well)
How can I get a binary edge image, containing only edges above a certain angle? (e.g. edge B, C, and D, but not A)
Or alternatively, how can I get a gradient derivative image, where the intensity of each pixel is proportional to the angle of the edge at that pixel?
You'll probably want to use the second derivative instead of the first for this task.
Here's my intuition: the derivative of height (intensity in your case) at each position on an evenly spaced grid is the tangent of the surface slope angle between sampling points (or at sampling points if you use a 2-sided derivative approximation), so the slope angle is the arctan of that derivative, provided height and grid spacing are expressed in the same units. But since you want to detect sharp edges, you are looking for the derivative of the slope angle at the sampling points. This means that you can set a threshold on the derivative of the arctan of the derivative of intensity to achieve your goal (luckily there's no "need to go deeper" :) )
You will have to be extra careful with taking a derivative of the "slope angles" that you get: depending on the coordinate system you may come across an ambiguity in the angle difference (there are 2 ways to get from one angle to another, which are different in the general case; you're looking for the "shorter" one). You can look for a possible solution here
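To make the idea concrete, here is a minimal Python sketch on a single row of the height map (a 1D profile). The function names and the spacing parameter are assumptions for illustration. In this 1D case the slope angle always stays between -90 and 90 degrees, so the plain difference is enough; the ambiguity mentioned above shows up once you work with 2D gradient directions.

```python
import math

def slope_angles(heights, spacing=1.0):
    """Slope angle (radians) of each segment between neighbouring samples.

    `spacing` is the horizontal distance between samples, in the same
    units as the heights.
    """
    return [math.atan2(heights[i + 1] - heights[i], spacing)
            for i in range(len(heights) - 1)]

def sharp_edges(heights, min_angle_deg, spacing=1.0):
    """Indices of samples where the slope angle changes by at least min_angle_deg."""
    angles = slope_angles(heights, spacing)
    threshold = math.radians(min_angle_deg)
    # angles[i] belongs to the segment (i, i+1), so the change between
    # angles[i] and angles[i+1] happens at sample i+1.
    return [i + 1 for i in range(len(angles) - 1)
            if abs(angles[i + 1] - angles[i]) >= threshold]

# Flat, then a shallow ramp (kink of about 6 degrees at index 2),
# then a steep ramp (kink of about 39 degrees at index 5).
profile = [0, 0, 0, 0.1, 0.2, 0.3, 1.3, 2.3, 3.3]
edges = sharp_edges(profile, min_angle_deg=30)  # only the kink at index 5 passes
```

Applying the same function along rows and columns and combining the two results would be one simple way to extend this to a full 2D height map.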
I have a rather simple approach that I came across while reading a blog post.
It involves computing the median value of the gray scale image. Using this value we can now set two threshold values:
lower: max(0, (1.0 - 0.33) * v)
upper: min(255, (1.0 + 0.33) * v)
Now pass these two values as parameters into the cv2.Canny() function.
You will now be able to perform an optimized edge detection given any image. The crux of this answer depends on the median value of the image which varies for different images.
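As a small sketch of the threshold computation (the function name auto_canny_thresholds and the flat pixel list are assumptions for illustration; with a NumPy image you would take the median of the array instead):

```python
import statistics

def auto_canny_thresholds(pixels, sigma=0.33):
    """Compute the lower/upper Canny thresholds from the median gray value.

    `pixels` is any flat sequence of gray values in the range 0..255.
    """
    v = statistics.median(pixels)
    lower = max(0, (1.0 - sigma) * v)
    upper = min(255, (1.0 + sigma) * v)
    return lower, upper

# Usage with OpenCV (not executed here):
#   lower, upper = auto_canny_thresholds(gray.ravel().tolist())
#   edges = cv2.Canny(gray, lower, upper)
```

Keep in mind that this only adapts the thresholds to the overall brightness of the image. It still thresholds on intensity differences, so on its own it does not distinguish a sharp corner from a shallow one in a height map.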
If I understand your question correctly, "what you need is basically a corner with high intensity values".
If that is so, then look at the Harris corner detector, which would help you find points with a high gradient change in both directions.
http://docs.opencv.org/2.4/doc/tutorials/features2d/trackingmotion/harris_detector/harris_detector.html
Once you detect the corners you can filter the corners which have high intensity by using a suitable threshold.
(This is a follow-up from this previous question).
I was able to successfully use OpenCV / Hough transforms to detect lines in pictures (scanned text); at first it would detect many many lines (at least one line per line of text), but by adjusting the 'threshold' parameter via trial-and-error, it now only detects "real" lines.
(The 'threshold' parameter is dependent on image size, which is a bit of a problem if one has to deal with images of different resolutions, but that's another story).
My problem is that the Hough transform sometimes detects two lines where there is only one; those two lines are very near one another and (apparently) parallel.
=> How can I identify that two lines are almost parallel and very near one another? (so that I can keep only one).
If you use the standard or multiscale Hough transform, you will end up with the rho and theta coordinates of the lines in polar coordinates. Rho is the distance to the origin, and theta is normally the angle between the detected line and the Y axis. Without looking into the details of the Hough transform in OpenCV, this is a general rule in those coordinates: two lines will be almost parallel and very near one another when:
- their thetas are nearly identical AND their rhos are nearly identical
OR
- their thetas are near 180 degrees apart AND their rhos are near each other's negative
I hope that makes sense.
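As a rough Python sketch of that rule (the function name and the tolerance values are assumptions that you would have to tune for your image size):

```python
import math

def nearly_same_line(line1, line2, rho_tol=10.0, theta_tol=math.radians(3)):
    """Check whether two (rho, theta) lines are almost parallel and close together.

    theta is in radians; rho may be negative.
    """
    rho1, theta1 = line1
    rho2, theta2 = line2
    # Angular distance between the two thetas, folded into [0, pi].
    d = abs(theta1 - theta2) % (2 * math.pi)
    d = min(d, 2 * math.pi - d)
    # Case 1: nearly identical thetas and nearly identical rhos.
    if d <= theta_tol and abs(rho1 - rho2) <= rho_tol:
        return True
    # Case 2: thetas about 180 degrees apart and rhos near each other's negative.
    if abs(d - math.pi) <= theta_tol and abs(rho1 + rho2) <= rho_tol:
        return True
    return False
```

You can then loop over the detected lines and keep a line only if it does not match any line you have already kept. Note that two lines with similar rho and slightly different theta are only guaranteed to be close near the origin; far away from it they can still drift apart.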
That's interesting about the theta being the angle between the line and the y-axis.
Generally, the rho and theta values are visualized as being the angle from the x-axis to the line perpendicular to the line in question. The rho is then the length of this perpendicular line. Thus, a theta = 90 and rho = 20 would mean a horizontal line 20 pixels up from the origin.
A nice image is shown on Hough Transform question
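To illustrate that convention, here is a small Python sketch that turns (rho, theta) into two points on the line. The function name and the half_length parameter are made up for this example.

```python
import math

def hough_line_points(rho, theta, half_length=1000.0):
    """Two points on the line whose normal has length rho and angle theta.

    (x0, y0) is the foot of the perpendicular from the origin; the line
    itself runs along the direction (-sin(theta), cos(theta)).
    """
    c, s = math.cos(theta), math.sin(theta)
    x0, y0 = rho * c, rho * s
    p1 = (x0 - half_length * s, y0 + half_length * c)
    p2 = (x0 + half_length * s, y0 - half_length * c)
    return p1, p2

# theta = 90 degrees, rho = 20: a horizontal line at y = 20.
p1, p2 = hough_line_points(20.0, math.radians(90))
```

Whether y = 20 is "up" or "down" from the origin depends on whether the y-axis of your coordinate system points up (math convention) or down (image convention).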