I'm trying to calculate the size of a bounding box after rotating a square. I've attached an image that hopefully describes what I'm looking to do.
After rotating by x degrees, the bounding box becomes bigger. Is there a way to calculate this new size, given the angle and the dimensions of the original square? Thank you.
This can be solved with two applications of basic right-triangle trigonometry.
Each side of your larger square is split into two sections by a corner of your small blue square. Let's call the larger of these two sections a and the smaller b (although if x > 45 degrees then b will be the larger one), with l the side length of the blue square and L the side length of the black square.
We can calculate the first section as cos(x) = a/l, so a = l * cos(x).
And the second as sin(x) = b/l, so b = l * sin(x).
Since L = a + b, we have L = (sin(x) + cos(x)) * l.
Edit: Area is of course side length squared in both cases.
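For reference, here is a minimal C++ sketch of that formula (the side length and rotation angle are made-up example values):
#include <cmath>
#include <cstdio>

int main() {
    const double pi = std::acos(-1.0);
    double l = 10.0;                 // side length of the original (blue) square, example value
    double xDeg = 30.0;              // rotation angle in degrees, example value
    double x = xDeg * pi / 180.0;
    double L = (std::sin(x) + std::cos(x)) * l;   // side length of the bounding square
    std::printf("bounding side = %.3f, bounding area = %.3f\n", L, L * L);
    return 0;
}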
This works only if you have the co-ordinates. If you can get the co-ordinates of the four vertices, it is easy. Let's call the point at the top-left corner of the bound A, and the top two vertices of the square sq_a and sq_b. The vertex A is then (sq_a.x, sq_b.y). By symmetry, all four small triangles formed between the bound and the square have the same area. Calculate the area of the triangle formed by A, sq_a and sq_b (which is easy: 1/2 * base * height), multiply it by 4, and add the area of the square itself to get the total area of the bound. Sorry, I couldn't post detailed pics.
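A small sketch of this coordinate-based computation, assuming the two top vertices of the rotated square are already known (the coordinates below are made-up example values):
#include <cmath>
#include <cstdio>

int main() {
    double sq_ax = 2.0, sq_ay = 6.0;   // left-most vertex of the square (example)
    double sq_bx = 8.0, sq_by = 10.0;  // top-most vertex of the square (example)
    double Ax = sq_ax, Ay = sq_by;     // top-left corner A of the bound
    double triangle = 0.5 * std::fabs(sq_bx - Ax) * std::fabs(sq_ay - Ay); // 1/2 * base * height
    double squareArea = (sq_bx - sq_ax) * (sq_bx - sq_ax)
                      + (sq_by - sq_ay) * (sq_by - sq_ay);                 // l^2 from the two vertices
    std::printf("bound area = %.3f\n", squareArea + 4.0 * triangle);       // square plus 4 triangles
    return 0;
}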
Related
I want to be able to interpolate any point in an image grid with uniform pixel distances. So far, I have succeeded in interpolating inner points surrounded by 4 existing pixel centers. However, I am now stuck on interpolating points that lie at the boundaries of the image.
In the image example below, the dark red dots represent the positions of the pixel values (i.e., the positions from which the X and Y factors/weights are calculated for the point we need to interpolate). So image[0] contains the actual value (not position) of the dark red dot in the upper-left pixel, image[1] has the value of the dark red dot in the upper-right pixel, image[2] has the value of the dark red dot in the lower-left pixel, and so on. The grid is laid out so that vertex (0,0) is the upper-left corner of the image and (1,1) is the bottom-right corner. All positions, from the point we need to interpolate to the vertices and the red pixel centers, are given as 2D coordinates with X and Y always between 0 and 1. Each pixel has 4 vertices, colored yellow, with their corresponding coordinates.
Now, I know how to calculate point p3 since it has 4 pixel centers around it. But how can we interpolate the points p1 and p2 if there are not 4 pixel centers around them to use in the bilinear interpolation formula?
Example of 2x2 image:
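Setting the boundary case aside, the interior-point formula the question refers to (for a point like p3 with 4 surrounding pixel centers) looks roughly like this sketch; the function name and the 2x2 layout are only illustrative:
#include <cstdio>

// image[0..3] = values at the upper-left, upper-right, lower-left, lower-right pixel centers.
double bilinear(const double image[4], double wx, double wy) {
    // wx, wy are the fractional distances (0..1) of the point from the upper-left center.
    double top    = image[0] * (1.0 - wx) + image[1] * wx;
    double bottom = image[2] * (1.0 - wx) + image[3] * wx;
    return top * (1.0 - wy) + bottom * wy;
}

int main() {
    double image[4] = {10.0, 20.0, 30.0, 40.0};      // example pixel values
    std::printf("%f\n", bilinear(image, 0.5, 0.5));  // midpoint of the four centers -> 25.0
    return 0;
}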
I am trying to find the corners of a square (potentially rotated) shape, to determine the direction of its primary axes (horizontal and vertical) and to be able to do a perspective transform (straighten it out).
From a prior processing stage I obtain the coordinates of a point (red dot in image) belonging to the shape. Next I do a flood-fill of the shape on a thresholded version of the image to determine its center (not shown) and area, by summing up X and Y of all filled pixels and dividing them by the area (number of pixels filled).
Given this information, what is an easy and reliable way to determine the corners of the shape (blue arrows)?
I was thinking about keeping track of P1, P2, P3, P4, where P1 is (minX, minY), P2 is (minX, maxY), P3 is (maxX, minY) and P4 is (maxX, maxY); so P1 is the point with the smallest value of X encountered and, of all those, the one where Y is smallest too. Then sort them to get a clockwise ordering. But I'm not sure if this is correct in all cases and efficient.
PS: I can't use OpenCV.
Looking at your image, the directions of the 2 axes of the 2D pattern coordinate system can be estimated from a histogram of gradient directions.
When creating such a histogram, 4 peaks will be clearly visible.
If the image is captured from the front (an image without perspective, which your image appears to be), then ideally the angles between adjacent peaks are all 90 degrees.
The directions of the 2 axes of the pattern coordinate system can then be estimated directly from those peaks.
After that, the 4 corners can simply be estimated from an axis-aligned bounding box (aligned to the estimated axes, of course).
If not (when the image is a picture with perspective), the 4 peaks indicate which edge lines run along the axes of the pattern coordinates.
So, for example, you can estimate each corner location as the intersection of 2 lines fitted along the edges.
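A rough sketch of how such a gradient-direction histogram could be built without OpenCV (the image layout, magnitude threshold and bin count are assumptions made for the example):
#include <cmath>
#include <cstdint>
#include <vector>

// img is a grayscale image stored row-major as width*height bytes.
std::vector<int> gradientDirectionHistogram(const std::vector<uint8_t>& img,
                                            int width, int height, int bins = 180) {
    const double pi = std::acos(-1.0);
    std::vector<int> hist(bins, 0);
    for (int y = 1; y < height - 1; ++y) {
        for (int x = 1; x < width - 1; ++x) {
            // Central differences as a cheap gradient estimate.
            double gx = img[y * width + (x + 1)] - img[y * width + (x - 1)];
            double gy = img[(y + 1) * width + x] - img[(y - 1) * width + x];
            if (std::hypot(gx, gy) < 20.0) continue;      // skip flat regions (arbitrary threshold)
            double angle = std::atan2(gy, gx);            // -pi..pi
            int bin = (int)((angle + pi) / (2.0 * pi) * bins) % bins;
            hist[bin]++;                                  // could also weight by gradient magnitude
        }
    }
    return hist;  // the 4 dominant peaks give the two axis directions (each axis appears twice)
}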
What I eventually ended up doing is the following:
Trace the edges of the contour using Moore-Neighbour Tracing --> this gives me a sequence of points lying on the border of the rectangle.
During the trace, I observe changes in rectangular distance between the first and last points in a sliding window. The idea is inspired by the paper "The outline corner filter" by C. A. Malcolm (https://spie.org/Publications/Proceedings/Paper/10.1117/12.939248?SSO=1).
This gives me accurate results with low computational overhead and little memory use.
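One way the sliding-window idea could look in code; this is a simplified sketch that uses the Euclidean chord length rather than the paper's exact "rectangular distance" filter, and the window size is a guess that would need tuning:
#include <cmath>
#include <vector>

struct Pt { int x, y; };

// contour holds the ordered border points produced by the Moore-Neighbour trace.
std::vector<int> cornerCandidates(const std::vector<Pt>& contour, int window = 15) {
    std::vector<int> corners;
    int n = (int)contour.size();
    if (n == 0) return corners;
    // Chord length between the first and last point of each window.
    std::vector<double> chord(n);
    for (int i = 0; i < n; ++i) {
        const Pt& a = contour[i];
        const Pt& b = contour[(i + window) % n];    // last point of the window (wraps around)
        chord[i] = std::hypot(a.x - b.x, a.y - b.y);
    }
    // Corners show up as local minima: on a straight edge the window end points stay
    // far apart, around a corner they pull closer together.
    for (int i = 0; i < n; ++i) {
        bool isMin = true;
        for (int k = -window; k <= window; ++k)
            if (chord[((i + k) % n + n) % n] < chord[i]) { isMin = false; break; }
        if (isMin) corners.push_back((i + window / 2) % n);  // middle of the window as candidate
    }
    return corners;  // plateaus can yield several adjacent hits; keep the 4 strongest
}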
In OpenCV and in object detection models, a bounding box is represented as 4 numbers, e.g. x,y,width,height or x1,y1,x2,y2.
These numbers seem to be ill-defined, but that's fine when the resolution is large.
But it makes me wonder: when the image has very low resolution, e.g. 8x8, a one-pixel error can cause things to go very wrong.
So I want to know, what exactly does it mean when you say that a bounding box has x1=0, x2=100?
Specifically, these are the confusions I want to clear up:
Does the bounding box border occupy the 0th pixel, or does it surround the 0th pixel (with its border at x=-1)?
Where is the exact end of the bounding box? If the image has shape=(8,8), would the end be at 7 or 8?
If you want to represent a bounding box that occupies the entire image, what should its values be?
So I think the right question is: how should I think about bounding boxes intuitively so that these points are no longer confusing?
OK. After many days working with bounding boxes, I have my own intuition on how to think about bounding box coordinates now.
I divide coordinates into 2 categories: continuous and discrete. The mental problems usually arise when you try to convert between them.
Suppose the image has width=100, height=100; then you can have a continuous point with x,y taking any real value in the range [0,100].
It means that points like (0,0), (0.5, 7.1), (39.83, 99.9999) are valid points.
Now you can convert a continuous point to a discrete point on the image by taking the floor of each number. E.g. (5.5, 8.9) gets mapped to pixel (5,8) of the image. It's very important to understand that you should not use the ceiling or rounding operation to convert it to the discrete version. Suppose you have a continuous point (0.9,0.9): this point lies inside the (0,0) pixel, so it belongs to pixel (0,0), not pixel (1,1).
From this foundation, let's try to answer my question:
So I want to know, what exactly does it mean when you say that a bounding box has x1=0, x2=100?
It means that continuous point 1 has x = 0 and continuous point 2 has x = 100. A continuous point has zero size; it's not a pixel.
Does the bounding box border occupy the 0th pixel, or does it surround the 0th pixel (with its border at x=-1)?
In continuous space, the bounding box border occupies zero space; the border is infinitesimally slim. But when we want to draw it onto an image, the border will be at least 1 pixel thick. So if we have a continuous point (0,0), it will occupy the 0th pixel of the image. But theoretically, it represents an infinitesimally thin border at the left and top sides of the 0th pixel.
Where is the exact end of the bounding box? If the image has shape=(8,8), would the end be at 7 or 8?
The biggest x,y value you can have is 7.999..., but when converted to the discrete version you are left with 7, which represents the last pixel.
If you want to represent a bounding box that occupies the entire image, what should its values be?
You should represent bounding box coordinates in continuous space rather than discrete space because of the precision you keep. That means the largest bounding box starts at (0,0) and ends at (100,100). But if you want to draw this box, you need to convert it to the discrete version and draw it from pixel (0,0) to pixel (99,99).
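A tiny sketch of this conversion for a 100x100 image (the clamp handles the continuous end value 100, which would otherwise floor to the out-of-range pixel index 100):
#include <algorithm>
#include <cmath>
#include <cstdio>

int main() {
    int width = 100, height = 100;
    double x1 = 0.0, y1 = 0.0, x2 = 100.0, y2 = 100.0;    // continuous box covering the whole image
    int px1 = (int)std::floor(x1);
    int py1 = (int)std::floor(y1);
    int px2 = std::min((int)std::floor(x2), width - 1);   // 100.0 clamps to the last pixel, 99
    int py2 = std::min((int)std::floor(y2), height - 1);
    std::printf("draw from (%d,%d) to (%d,%d)\n", px1, py1, px2, py2);  // (0,0) to (99,99)
    return 0;
}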
In OpenCV the bounding rectangle can be defined in many ways. One way is by its top-left and bottom-right corners, via the constructor Rect(Point pt1, Point pt2); the integer constructor Rect(int x, int y, int width, int height) instead takes the top-left corner plus the size. The rectangle starts exactly on that pixel and coordinate. For subpixel rectangles there are also variants holding floating-point coordinates (Rect2f, Rect2d).
So I want to know, what exactly does it mean when you say that a bounding box has x1=0, x2=100?
That means the top-left corner's x-coordinate is 0 and the bottom-right corner's x-coordinate is 100.
Does the bounding box border occupy the 0th pixel, or does it surround the 0th pixel (with its border at x=-1)?
The border starts exactly on the 0th pixel, meaning that a rectangle with a width and height of 1px, when drawn, is just a single dot (1px).
Where is the exact end of the bounding box? If the image has shape=(8,8), would the end be at 7 or 8?
The end would be at 7, see below.
If you want to represent a bounding box that occupies the entire image, what should its values be?
Let's take an image of size 100x100. The rectangle around the whole image, defined by starting point and size, is Rect(0, 0, 100, 100); defined by two points it is Rect(Point(0,0), Point(100,100)), since the second point is exclusive. The last pixel inside that rectangle is (99,99).
The basic thing to know is that an image of size X,Y has its minimum (top-left) pixel coordinate at (0,0) and its maximum (bottom-right) pixel coordinate at (X-1,Y-1).
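A short OpenCV sketch of the full-image rectangle for a 100x100 image, illustrating the inclusive/exclusive behaviour described above:
#include <opencv2/core.hpp>
using namespace cv;

int main() {
    Mat img(100, 100, CV_8UC1, Scalar(0));
    Rect full(0, 0, 100, 100);                       // x, y, width, height
    // br() is exclusive: full.br() == Point(100, 100), but the last pixel
    // actually inside the rectangle is (99, 99).
    bool inside  = full.contains(Point(99, 99));     // true
    bool outside = full.contains(Point(100, 100));   // false
    Mat view = img(full);                            // a view covering every pixel of the image
    return inside && !outside ? 0 : 1;
}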
In the below picture, I have the 2D locations of the green points and I want to calculate the locations of the red points, or, as an intermediate step, I want to calculate the locations of the blue points. All in 2D.
Of course, I do not only want to find those locations for the picture above. In the end, I want an automated algorithm which takes a set of checkerboard corner points to calculate the outer corners.
I need the resulting coordinates to be as accurate as possible, so I think that I need a solution which does not only take the outer green points into account, but which also uses all the other green points' locations to calculate a best fit for the outer corners (red or blue).
If OpenCV can do this, please point me into that direction.
In general, if all you have is the detection of some, but not all, the inner corners, the problem cannot be solved. This is because the configuration is invariant to translation - shifting the physical checkerboard by whole squares would produce the same detected corner position on the image, but due to different physical corners.
Further, the configuration is also invariant to rotations by 180 deg in the checkerboard plane and, unless you are careful to distinguish between the colors of the squares adjacent to each corner, to rotations by 90 deg and to reflections with respect to the center and the midlines.
This means that, in addition to detecting the corners, you need to extract from the image some features of the physical checkerboard that can be used to break the above invariance. The simplest break is to detect all 9 corners of one row and one column, or at least their end-corners. These can be used directly to rectify the image by imposing the condition that their lines be at a 90 deg angle. However, this may turn out to be impossible due to occlusions or detector failure, and more sophisticated methods may be necessary.
For example, you can try to directly detect the chessboard edges, i.e. the fat black lines at the boundary. One way to do that, for example, would be to detect the letters and numbers nearby, and use those locations to constrain a line detector to nearby areas.
By the way, if the photo you posted is just a red herring, and you are interested in detecting general checkerboard-like patterns, and can control the kind of pattern, there are way more robust methods of doing it. My personal favorite is the "known 2D crossratios" pattern of Matsunaga and Kanatani.
I solved it robustly, but not accurately, with the following solution:
Find lines with at least 3 green points closely matching the line. (thin red lines in pic)
Keep bounding lines: From these lines, keep those with points only to one side of the line or very close to the line.
Filter bounding lines: From the bounding lines, take the 4 best ones/those with most points on them. (bold white lines in pic)
Calculate the intersections of the 4 remaining bounding lines (none of the lines are perfectly parallel, so this results in 6 intersections, of which we want only 4).
From the intersections, remove the one farthest from the average position of the intersections until only 4 of them are left.
That's the 4 blue points.
You can then feed these 4 points into OpenCV's getPerspectiveTransform function to find a perspective transform (aka a homography):
// Requires <opencv2/imgproc.hpp> and using namespace cv; srcImg is the input photo.
// CheckDet::getOuterCheckerboardCorners is the detection step described above and
// returns the 4 outer (blue) points.
std::vector<Point2f> detectedCorners = CheckDet::getOuterCheckerboardCorners(srcImg);
Point2f srcPoints[4];
for (int i = 0; i < MIN(4, (int)detectedCorners.size()); i++) {
    srcPoints[i] = detectedCorners[i];
}

// Destination points: the corners of the board in a 400x400 target image, inset by
// one board square (1/8 of the image size) on every side.
int dstImgSize = 400;
Point2f dstPoints[4] = {
    Point2f(dstImgSize * 1 / 8, dstImgSize * 1 / 8),
    Point2f(dstImgSize * 7 / 8, dstImgSize * 1 / 8),
    Point2f(dstImgSize * 7 / 8, dstImgSize * 7 / 8),
    Point2f(dstImgSize * 1 / 8, dstImgSize * 7 / 8)
};

Mat m = getPerspectiveTransform(srcPoints, dstPoints);
For our example image, the input and output of getPerspectiveTransform look like this:
input
(349.1, 383.9) -> ( 50.0, 50.0)
(588.9, 243.3) -> (350.0, 50.0)
(787.9, 404.4) -> (350.0, 350.0)
(506.0, 593.1) -> ( 50.0, 350.0)
output
( 1.6 -1.1 -43.8 )
( 1.4 2.4 -1323.8 )
( 0.0 0.0 1.0 )
You can then transform the image's perspective to board coordinates:
Mat plainBoardImg;
warpPerspective(srcImg, plainBoardImg, m, Size(dstImgSize, dstImgSize));
This results in the following image:
For my project, the red points that you can see on the board in the question are not needed anymore, but I'm sure they can be calculated easily from the homography by inverting it and then using the inverse for back-transforming the points (0, 0), (0, dstImgSize), (dstImgSize, dstImgSize), and (dstImgSize, 0).
The algorithm works surprisingly reliably; however, it does not use all the available information, because it relies only on the outer points (those connected by the white lines). It does not use the inner points for additional accuracy. I would still like to find an even better solution that also uses the data of the inner points.
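That back-transformation could look roughly like this (a sketch, not part of the original project code), reusing the homography m and dstImgSize from above:
std::vector<Point2f> boardCorners = {
    Point2f(0.f, 0.f),
    Point2f(0.f, (float)dstImgSize),
    Point2f((float)dstImgSize, (float)dstImgSize),
    Point2f((float)dstImgSize, 0.f)
};
std::vector<Point2f> imageCorners;
Mat mInv = m.inv();                                      // board coordinates -> image coordinates
perspectiveTransform(boardCorners, imageCorners, mInv);  // imageCorners now hold the red points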
I'm drawing squares along a circular path for an iOS application. However, at certain points along the circle, the squares start to go out of the circle's circumference. How do I make sure that the squares stay inside?
Here's an illustration I made. The green squares represent the positions I need the squares to actually be in. The red squares are where they actually appear given the following values for each square's upper-left corner:
x = origin.x + radius * cos(DEGREES_TO_RADIANS(angle));
y = origin.y + radius * sin(DEGREES_TO_RADIANS(angle));
Origin refers to the center of the circle. I have a loop that repeats this for every angle from 1 to 360 degrees.
EDIT: I've changed my design to position the centers of the squares along the circular path rather than their upper left corners.
Why not just draw the centers of the squares along a smaller circle inside of the bigger one?
You could do the math to figure out exactly what the radius would have to be to ensure an exact fit, but you could probably trial and error your way there quickly too.
Doing it this way ensures that your objects would end up laid out in an actual circle too, which is not the case if you were merely making sure that one and only one corner of each square touched the larger bounding circle (that would create a slightly octagonal shape instead of a circle)
ryan cumley's answer made me realize how dumb I was all along. I just needed to change each square's anchor point to its center & that solved it. Now every calculated value for x & y would position every square's center exactly on the circular path.
Option 1) You could always find the diameter of the circle and then, using the Pythagorean theorem, create a square that fits perfectly within the circle. You could then loop through the square that was just made inside the circle to create smaller squares, but I doubt this is what you are aiming for.
Option 2) Find out what half the length of one of the squares' diagonals should be, and create a ring within the first ring. Then lay down squares at key points (like every 30 degrees or 15 degrees, etc.) along the inner path. Ex: http://i.imgur.com/1XYhoQ0.png
As you can see, the smaller (inner) circle is in the center of each green square, and that ensures that the corners of each square just touches the larger (outer) circle. Obviously my cheaply made picture in paint is not perfect, but mathematically it will work.
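In code, reusing the names from the snippet in the question plus a made-up side length, the idea from both answers boils down to shrinking the radius by half the square's diagonal and placing the squares' centers on that inner circle:
double side = 40.0;                                   // square side length (example value)
double innerRadius = radius - side * sqrt(2.0) / 2.0; // outer radius minus half the square's diagonal
// Place each square's center (not its corner) on the inner circle:
x = origin.x + innerRadius * cos(DEGREES_TO_RADIANS(angle));
y = origin.y + innerRadius * sin(DEGREES_TO_RADIANS(angle));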