Given a bounding box defined by 8 points, test whether a point is within the bounds of this box

Given a bounding box made up of 8 points in 3D space, not aligned with any axis (example), how can I test whether a point p is within the bounds of the box?
All questions I could find are either only about 2D, or axis-aligned bounding boxes, or creating a bounding box from points.
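One possible approach, sketched here under the assumption that the 8 points really are the corners of a rectangular box (given in any order): pick one corner, find its three edge vectors, and check that the projection of p onto each edge lies between 0 and that edge's squared length. The function names `box_axes` and `point_in_box` and the tolerance `tol` are hypothetical, chosen only for this illustration.

```python
from itertools import combinations


def sub(a, b):
    """Component-wise a - b for 3D tuples."""
    return tuple(x - y for x, y in zip(a, b))


def dot(a, b):
    """Dot product of two 3D tuples."""
    return sum(x * y for x, y in zip(a, b))


def box_axes(corners, tol=1e-9):
    """Return (origin, edges): one corner and its three edge vectors.

    Assumes `corners` holds the 8 corners of a rectangular box, in any order.
    """
    origin = corners[0]
    vectors = [sub(c, origin) for c in corners[1:]]
    # Among the 7 vectors leaving one corner, only the three edges are
    # pairwise perpendicular; every diagonal shares a component with another.
    for u, v, w in combinations(vectors, 3):
        scale = max(dot(u, u), dot(v, v), dot(w, w))
        pairs = ((u, v), (u, w), (v, w))
        if all(abs(dot(a, b)) <= tol * scale for a, b in pairs):
            return origin, (u, v, w)
    raise ValueError("the points do not form a rectangular box")


def point_in_box(p, corners, tol=1e-9):
    """True if p lies inside or on the boundary of the box."""
    origin, edges = box_axes(corners, tol)
    d = sub(p, origin)
    for e in edges:
        length_sq = dot(e, e)
        t = dot(d, e)  # projection of p onto this edge, scaled by |e|
        if t < -tol * length_sq or t > (1 + tol) * length_sq:
            return False
    return True


# Example: a 2 x 1 x 3 box rotated 45 degrees about the z axis.
s = 0.5 ** 0.5


def rotate(p):
    x, y, z = p
    return ((x - y) * s, (x + y) * s, z)


corners = [rotate((x, y, z)) for x in (0, 2) for y in (0, 1) for z in (0, 3)]
inside = point_in_box(rotate((1, 0.5, 1.5)), corners)   # centre of the box
outside = point_in_box(rotate((3, 0.5, 1.5)), corners)  # past the long edge
```

If the corner ordering is already known, the search in `box_axes` can be skipped and the three edge vectors taken directly from the adjacent corners.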

Related

Calculate the Correct Translation and Scaling Transformations for Segmentation Polygons

I have an image with a list of segmentation polygons. Getting the bounding box from the polygons is easy enough. I used the bounding box to crop out the objects of interest.
Now, I want to scale the polygons so that they wrap around the cropped object. I scaled the polygons using the ratio of the original image size to the crop size. However, I do not know how to apply the correct offsets from the bounding boxes. As a result, the polygon points do not wrap around the object correctly.
For object crops, what is the correct transformation for scaling and translating arbitrary points within the crops?
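A minimal sketch of one common way to do this, assuming the crop box is given as (x0, y0, x1, y1) in original-image pixels: subtract the crop's top-left corner first, then scale by the ratio of the output size to the crop size (not the original image size). The function name `to_crop_coords` and the optional `out_size` parameter are hypothetical.

```python
def to_crop_coords(points, crop_box, out_size=None):
    """Map (x, y) points from original-image coordinates into a crop.

    crop_box is (x0, y0, x1, y1) in original-image pixels.
    out_size is (width, height) of the crop after resizing; if omitted,
    the crop is assumed to keep its native size (scale factor 1).
    """
    x0, y0, x1, y1 = crop_box
    crop_w, crop_h = x1 - x0, y1 - y0
    out_w, out_h = out_size if out_size else (crop_w, crop_h)
    sx, sy = out_w / crop_w, out_h / crop_h
    # Translate so the crop's top-left corner becomes the origin, then scale.
    return [((x - x0) * sx, (y - y0) * sy) for x, y in points]


polygon = [(10, 20), (30, 40)]
in_crop = to_crop_coords(polygon, (10, 20, 50, 60))
in_resized_crop = to_crop_coords(polygon, (10, 20, 50, 60), out_size=(80, 80))
```

The order matters: translating after scaling would require scaling the offset as well.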

Coordinates of bounding box in an image

I am doing object detection to count penguins in a georeferenced UAV dataset; for practical purposes, let's say they appear as dots in the images. After running the object detection model, I get the inferred images with a bounding box for each detected penguin.
I need to extract the coordinates of the center of each bounding box (something like x,y) so that, since the image is georeferenced, I can convert the bounding-box center from image coordinates into GPS coordinates.
This picture is a good example. Here, the authors are counting banana plants. After detecting the plants in 3 differently treated pictures of the same area, they see that up to three boxes appear around some of the plants (left). To count each plant only once, even when it has up to 3 bboxes, this is what they do (quoted from the original article):
1. Collect bounding boxes of detection from each ROI tiles.
2. Calculate centroid of each bounding box.
3. Add the tile number information on x and y-value of centroids to overlay them on original ROI image.
This is exactly what I am looking for, step number 3: how to calculate the centroid of each bbox and obtain its x,y coordinates, so that I can then transform those coordinates into real-world ones (the image is georeferenced) and display each real coordinate on a mosaic.
Thank you very much in advance.
You could use Intersection over Union (IoU) to select one of the overlapping boxes, and then use the coordinates of the selected box to plot the output circle or box over the detected objects.
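As a rough sketch of both ideas, assuming boxes are given as (x1, y1, x2, y2) in tile pixels: the centroid is the midpoint of the two corners, shifted by the tile's offset in the full image, and IoU is the overlap area divided by the union area. The names `bbox_centroid`, `iou` and `tile_origin` are hypothetical; the conversion from image pixels to GPS depends on the dataset's georeferencing and is not shown.

```python
def bbox_centroid(box, tile_origin=(0, 0)):
    """Centre of an (x1, y1, x2, y2) box, shifted into full-image coordinates.

    tile_origin is the (x, y) pixel offset of the tile inside the full image.
    """
    x1, y1, x2, y2 = box
    ox, oy = tile_origin
    return ((x1 + x2) / 2 + ox, (y1 + y2) / 2 + oy)


def iou(a, b):
    """Intersection over Union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    # Width and height of the overlap, clamped at zero when boxes are disjoint.
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union else 0.0


center = bbox_centroid((10, 20, 30, 40), tile_origin=(100, 200))
overlap = iou((0, 0, 10, 10), (5, 0, 15, 10))
```

Boxes whose IoU exceeds a chosen threshold can be treated as duplicates of the same object, so that only one centroid per object is kept.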

Where exactly does the bounding box start or end?

OpenCV and object detection models represent a bounding box as 4 numbers, e.g. x,y,width,height or x1,y1,x2,y2.
These numbers seem ill-defined, which does not matter much when the resolution is high.
But when the image has a very low resolution, e.g. 8x8, a one-pixel error can make things go very wrong.
So I want to know, what exactly does it mean when you say that a bounding box has x1=0, x2=100?
Specifically, I would like to clear up the following points of confusion:
Does the bounding box border occupy the 0th pixel, or does it surround the 0th pixel (i.e. its border is at x=-1)?
Where is the exact end of the bounding box? If the image has shape=(8,8), would the end be at 7 or 8?
If you want to represent a bounding box that occupies the entire image, what should its values be?
So I think the right question is: how should I think about bounding boxes intuitively so that these points stop being confusing?
OK. After many days working with bounding boxes, I have my own intuition on how to think about bounding box coordinates now.
I divide coordinates in 2 categories: continuous and discrete. The mental problems usually arise when you try to convert between them.
Suppose the image has width=100, height=100; then a continuous point x,y can take any real value in the range [0,100].
This means that points like (0,0), (0.5,7.1) and (39.83,99.9999) are valid points.
Now you can convert a continuous point to a discrete point on the image by taking the floor of the number. E.g. (5.5, 8.9) gets mapped to pixel number (5,8) on the image. It's very important to understand that you should not use the ceiling or rounding operation to convert it to the discrete version. Suppose you have a continuous point (0.9,0.9) this point lies in the (0,0) pixel so it's closest to (0,0) pixel, not (1,1) pixel.
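The floor rule above can be written in a couple of lines. This is only an illustration; the helper name `to_pixel` is hypothetical.

```python
import math


def to_pixel(x, y):
    """Map a continuous point to the index of the pixel that contains it."""
    # Floor, not round: (0.9, 0.9) still lies inside pixel (0, 0).
    return (math.floor(x), math.floor(y))


pixel = to_pixel(5.5, 8.9)
```

Using `round` instead would send (0.9, 0.9) to pixel (1, 1), which is the off-by-one mistake the paragraph warns about.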
From this foundation, let's try to answer my question:
So I want to know, what exactly does it mean when you say that a bounding box has x1=0, x2=100?
It means that continuous point 1 has x value = 0 and continuous point 2 has x value = 100. A continuous point has zero size. It's not a pixel.
Does the bounding box border occupy the 0th pixel, or does it surround the 0th pixel (i.e. its border is at x=-1)?
In continuous space, the bounding box border occupies zero space. The border is infinitesimally thin. But when we want to draw it onto an image, the border will be at least 1 pixel thick. So if we have a continuous point (0,0), it will occupy the 0th pixel of the image. Theoretically, though, it represents a thin border along the left and top sides of the 0th pixel.
Where is the exact end of the bounding box? If the image has shape=(8,8), would the end be at 7 or 8?
The largest x,y value that still falls inside a pixel is 7.999...; converted to the discrete version, it gives 7, which is the last pixel.
If you want to represent a bounding box that occupies the entire image, what should its values be?
You should represent bounding box coordinates in continuous space rather than discrete space, because of the precision it gives you. This means the largest bounding box starts at (0,0) and ends at (100,100). But if you want to draw this box, you need to convert it to the discrete version and draw the bounding box from (0,0) to (99,99).
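A small sketch of that last conversion, under the assumption that a pixel counts as covered when the continuous box overlaps any part of it: the first pixel is the floor of the start coordinate and the last pixel is the ceiling of the end coordinate minus one, clamped to the image. The name `covered_pixels` is hypothetical.

```python
import math


def covered_pixels(box, image_size):
    """Inclusive pixel range (first_col, first_row, last_col, last_row)
    covered by a continuous (x1, y1, x2, y2) box."""
    x1, y1, x2, y2 = box
    width, height = image_size
    first_col = max(math.floor(x1), 0)
    first_row = max(math.floor(y1), 0)
    # An end coordinate of exactly 100 touches pixel 99, not pixel 100.
    last_col = min(math.ceil(x2) - 1, width - 1)
    last_row = min(math.ceil(y2) - 1, height - 1)
    return first_col, first_row, last_col, last_row


full_image = covered_pixels((0, 0, 100, 100), (100, 100))
```

For the 100x100 image, the continuous box from (0,0) to (100,100) maps to pixels 0 through 99 on both axes.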
In OpenCV the bounding rectangle can be defined in several ways. One way is by its top-left and bottom-right corners, as in the constructor Rect(Point pt1, Point pt2); the four-integer constructor is Rect(int x, int y, int width, int height), i.e. a top-left corner plus a size. The rectangle starts exactly on that pixel coordinate. For subpixel rectangles there are also variants that hold floating-point coordinates.
So I want to know, what exactly does it mean when you say that a bounding box has x1=0, x2=100?
That means the x-coordinate of the top-left corner is 0 and the x-coordinate of the bottom-right corner is 100.
Does the bounding box border occupy the 0th pixel, or does it surround the 0th pixel (i.e. its border is at x=-1)?
The border starts exactly on the 0th pixel. This means that a rectangle with a width and height of 1px is drawn as just a single dot (1px).
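Where is the exact end of the bounding box? If the image has shape=(8,8), would the end be at 7 or 8?
The end would be at 7, see below.
If you want to represent a bounding box that occupies the entire image, what should its values be?
Take an image of size 100,100. The rectangle around the whole image has its first pixel at (0,0) and its last pixel at (99,99); defined by starting point and size, it is Rect(0, 0, 100, 100). Note that OpenCV's two-point constructor treats the bottom-right point as exclusive, so Rect(Point(0,0), Point(99,99)) is only 99 pixels wide and high; the two-point form covering the full image is Rect(Point(0,0), Point(100,100)).
The key point is that an image of size X,Y has its minimum pixel coordinate at the top-left, (0,0), and its maximum at the bottom-right, (X-1,Y-1).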
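The relationship between the two descriptions (first and last pixel versus starting point and size) can be sketched without any library. The helpers below are hypothetical and assume integer pixel indices with both corners inclusive.

```python
def corners_to_xywh(x1, y1, x2, y2):
    """Inclusive pixel corners -> (x, y, width, height)."""
    # Both end pixels belong to the box, hence the +1.
    return (x1, y1, x2 - x1 + 1, y2 - y1 + 1)


def xywh_to_corners(x, y, w, h):
    """(x, y, width, height) -> inclusive pixel corners (x1, y1, x2, y2)."""
    return (x, y, x + w - 1, y + h - 1)


whole_image = corners_to_xywh(0, 0, 99, 99)
single_pixel = xywh_to_corners(3, 4, 1, 1)
```

A 1x1 box has identical first and last pixels, which matches the remark that such a rectangle is drawn as a single dot.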

Does the surface area divided by bounding box feature have a name?

I'm writing a connected component system and one of the descriptors I can easily compute is the surface area along with the component's rectangular bounding box.
What is surface area divided by bounding area called? (or any mixture of these two parameters).
For example, if my object were a rectangle, this parameter would be 1.0.
Extent or rectangularity, apparently:
Extent of an image object is defined as area of the image object
divided by the area of its bounding rectangle.
Source: Question text in https://dsp.stackexchange.com/questions/49026/what-is-the-application-difference-between-extent-and-solidity-in-image-processi
Rectangularity is the ratio of the object to the area of the minimum
bounding rectangle.
Source: Page 45 in http://www.cyto.purdue.edu/cdroms/micro2/content/education/wirth10.pdf
But the definitions I've run across do not always fully specify the rectangle. The ambiguity is related to the concept of the "Feret box". The edges of a Feret box do not have to be parallel to the image axes, unlike those of a good old bounding box. So depending on which one you choose, your "extent" value might change.
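For the axis-aligned case, the extent can be computed directly from a binary mask. This is a minimal sketch that assumes the component is given as a nested list of 0/1 values; the function name `extent` is chosen for illustration only.

```python
def extent(mask):
    """Object area divided by the area of its axis-aligned bounding box.

    mask is a nested list (rows of 0/1 values) holding one component.
    """
    coords = [(r, c) for r, row in enumerate(mask)
              for c, value in enumerate(row) if value]
    if not coords:
        return 0.0
    rows = [r for r, _ in coords]
    cols = [c for _, c in coords]
    # Pixel counts are inclusive on both ends, hence the +1.
    box_height = max(rows) - min(rows) + 1
    box_width = max(cols) - min(cols) + 1
    return len(coords) / (box_height * box_width)


rectangle = [[0, 0, 0, 0],
             [0, 1, 1, 0],
             [0, 1, 1, 0]]
l_shape = [[1, 0],
           [1, 1]]
rectangle_extent = extent(rectangle)
l_shape_extent = extent(l_shape)
```

A filled rectangle gives 1.0, as stated in the question, while the L-shape fills three of the four pixels in its box.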

scilab - Drawing bounding box

In Scilab I ran analyzeblobs on my image and got a feature called BoundingBox, which describes the rectangle around my object.
When I read this BoundingBox I get 4 numbers, which I suppose are related to the corners of the rectangle.
What I don't know is what these numbers represent. Are they pixel indices, or something else?
Basically I want to calculate the width of the rectangle of my bounding box, so I need the coordinates of those four corners, but I don't know how to get them.
So I got the answer:
The four elements are, in order, (x, y, width, height).
x,y are the coordinates of the top-left corner,
and the next two are the width and height of the rectangle.
So my second question has also been answered.
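Given that layout, the four corners follow from simple addition. The sketch below is written in Python rather than Scilab, treats the values as geometric coordinates (so the right edge is x + width), and uses a hypothetical helper name; whether the far edge should be reduced by one to get the last pixel index depends on the toolbox's convention, which is an assumption here.

```python
def box_corners(bounding_box):
    """Corners of an (x, y, width, height) box, clockwise from top-left."""
    x, y, width, height = bounding_box
    return [
        (x, y),                    # top-left
        (x + width, y),            # top-right
        (x + width, y + height),   # bottom-right
        (x, y + height),           # bottom-left
    ]


corners_of_box = box_corners((10, 20, 30, 40))
box_width = (10, 20, 30, 40)[2]  # the width is simply the third element
```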
