I am trying to put thresholds on the aspect ratios of rotated rectangles obtained around certain objects in the image using OpenCV. To compare the aspect ratio of a rotated rectangle with the threshold, I need to take the ratio of the longer dimension to the shorter dimension of the rotated rectangle.
I am confused in this regard: what is the convention in OpenCV? Is rotatedRectangle.size.width always smaller than rotatedRectangle.size.height? In other words, is the width of a rotated rectangle always assigned to the smaller of its two dimensions in OpenCV?
I tried running some code to find an answer, and it seems that rotatedRectangle.size.width is actually the smaller dimension of a RotatedRect. But I would still like confirmation from anyone who has encountered something similar.
EDIT: I am using fitEllipse to get the rotated rectangle and my version of OpenCV is 2.4.1.
Please help!
There is no convention for a rotated rectangle per se; as the documentation says:
The class represents rotated (i.e. not up-right) rectangles on a plane. Each rectangle is specified by the center point (mass center), length of each side (represented by cv::Size2f structure) and the rotation angle in degrees.
However, you don't specify what function or operation is creating your rotated rects - for example, if you used fitEllipse it may be that there is some internal detail of the algorithm that prefers to use the larger (or smaller) dimension as the width (or height).
Perhaps you could comment or edit your question with more information. As it stands, if you want the ratio of the longer:shorter dimensions, you will need to specifically test which is longer first.
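In the meantime, a small helper like this sidesteps the convention entirely (a minimal sketch; it only assumes you already have the RotatedRect):

#include <opencv2/core/core.hpp>
#include <algorithm>

// Aspect ratio as longer/shorter side, independent of which of the two
// dimensions OpenCV happens to store in width or height.
static float aspectRatio(const cv::RotatedRect& box)
{
    float longer  = std::max(box.size.width, box.size.height);
    float shorter = std::min(box.size.width, box.size.height);
    return shorter > 0.f ? longer / shorter : 0.f;
}

You can then compare aspectRatio(rect) against your threshold without caring which field holds the larger value.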
EDIT
After looking at the OpenCV source code, the fitEllipse function contains the following code
if( box.size.width > box.size.height )
{
    float tmp;
    CV_SWAP( box.size.width, box.size.height, tmp );
    box.angle = (float)(90 + rp[4]*180/CV_PI);
}
So, at least for this implementation, it seems that width is always taken as the shorter dimension. However, I wouldn't rely on that staying true in a future implementation.
Related
I want to fit an image of a clown like face into a contour of another face (a person).
I am detecting the person's face and getting an elliptical-like contour.
I can figure out the center, radius, highest, lowest, left-most and right-most points.
How do I fit the clown face (a square image, which I can make elliptical by cutting the face out of the empty background of a PNG and then detecting the contour) into the person's face?
Or, at the least, how do I fit a polygon into another polygon?
I can fit a rectangular image into a rectangular contour with ease, but faces aren't that shape.
Python is preferable, but C++ is also manageable. Thank you.
Edit: Visual representation as requested:
I have this: [input photo] and I want to make it like this: [desired result], but I want the clown face to stretch over the guy's face and fit within the blue contour.
I think the keyword you are looking for is Active Appearance Models. First, you need to fit a model to the first face (such as this one), which lies inside the contour. Then, you should fit the same model to the clown face. After that, since you have fitted the same model to both faces, you can stretch it as you need.
I haven't used AAM myself and I'm not an expert on it, so my explanation might not be enough or might not be exactly correct, but I'm sure it will give you some insight.
A simple and good answer to this question is to find the extreme top, bottom, left, and right points of your contour (the head), then resize your mask to match the aspect ratio and place it so that it covers those 4 points.
Because human heads are elliptical, you can use fitEllipse() to give you those 4 points. This will automagically fix any problems with the person tilting their head, because regardless of the angle you will know which point is the top, bottom, left, and right.
The relevant code for finding the ellipse is:
vector<Point> contour;
// Do whatever you are doing to populate this vector
// (note that fitEllipse needs at least 5 points)
RotatedRect ellipse = fitEllipse(Mat(contour));
There is also an example as well as documentation for RotatedRect.
// Resize your mask with these sizes for optimum fit
ellipse.size.width
ellipse.size.height
You can rotate your image like this.
UPDATE:
You may also want to find the contour's extreme points to know how much you need to scale your image to ensure that all of the face is covered.
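Something along these lines could tie it together (a rough sketch; clown is assumed to be your clown-face image and ellipse the RotatedRect returned by fitEllipse above):

#include <opencv2/core/core.hpp>
#include <opencv2/imgproc/imgproc.hpp>

// Sketch: scale the clown image to the ellipse's size, rotate it by the
// ellipse angle, and return the result, which would then be pasted so that
// its centre lands on ellipse.center. Corner clipping during rotation is
// ignored here for brevity.
cv::Mat fitMaskToEllipse(const cv::Mat& clown, const cv::RotatedRect& ellipse)
{
    cv::Mat scaled;
    cv::resize(clown, scaled, cv::Size(cvRound(ellipse.size.width),
                                       cvRound(ellipse.size.height)));

    cv::Point2f centre(scaled.cols / 2.0f, scaled.rows / 2.0f);
    cv::Mat R = cv::getRotationMatrix2D(centre, ellipse.angle, 1.0);

    cv::Mat rotated;
    cv::warpAffine(scaled, rotated, R, scaled.size());
    return rotated;
}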
I need to find the rotation angle between two binary images, so I can correct the rotation by rotating one image by that angle. Can someone help, please?
I already tried the principal-axis rotation angle, but it doesn't give an accurate result. Can someone suggest a better method? The image can be anything; it need not be the one I uploaded here. But all the images are binary.
Threshold the source.
Apply a thinning algorithm as described here.
Find the contour and simplify it with approxPolyDP.
Now, for each pair of consecutive points, calculate the angle:
double angle = atan2(p1.y - p2.y, p1.x - p2.x);
Do the same for the second image and calculate the difference in angle.
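A rough sketch of the contour/angle steps (assuming the image has already been thresholded and thinned, and is a single-channel 8-bit Mat):

#include <opencv2/core/core.hpp>
#include <opencv2/imgproc/imgproc.hpp>
#include <cmath>
#include <vector>

// Sketch: angles (radians) of the segments of a simplified contour.
std::vector<double> segmentAngles(const cv::Mat& thinned)
{
    std::vector<std::vector<cv::Point> > contours;
    cv::findContours(thinned.clone(), contours,
                     cv::RETR_EXTERNAL, cv::CHAIN_APPROX_SIMPLE);

    std::vector<double> angles;
    if (contours.empty())
        return angles;

    std::vector<cv::Point> poly;
    cv::approxPolyDP(contours[0], poly, 3.0, false);  // epsilon = 3 px, open curve

    for (size_t i = 1; i < poly.size(); ++i)
    {
        const cv::Point& p1 = poly[i - 1];
        const cv::Point& p2 = poly[i];
        angles.push_back(std::atan2((double)(p1.y - p2.y),
                                    (double)(p1.x - p2.x)));
    }
    return angles;
}

Running this on both images and comparing corresponding angles gives the rotation difference.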
For each image
Threshold the image so that object pixels are non-zero and background pixels are zero
Find the convex hull of the non-zero pixels (you may use any method to reduce the number of points used to calculate the convex hull, such as first finding contours; the main idea is to get the convex hull)
Calculate the minimum-area rectangle using minAreaRect; it will return a RotatedRect object (in C++). This object contains the rotation angle
Take the difference
Note: this approach will not work if the resulting min-area rectangle happens to return the same angle even though the object's rotation is different. Therefore, I feel it's better to use other measures, such as the moments of the filled convex hull, to calculate the rotation: http://en.wikipedia.org/wiki/Image_moment
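For what it's worth, a rough sketch of the min-area-rect approach described above (findNonZero needs OpenCV 2.4.4 or later; otherwise collect the contour points instead):

#include <opencv2/core/core.hpp>
#include <opencv2/imgproc/imgproc.hpp>
#include <vector>

// Sketch: orientation (degrees) of a binary object via the minimum-area
// rectangle of the convex hull of its non-zero pixels.
float objectAngle(const cv::Mat& binary)
{
    std::vector<cv::Point> points;
    cv::findNonZero(binary, points);      // object pixels are non-zero

    std::vector<cv::Point> hull;
    cv::convexHull(points, hull);

    return cv::minAreaRect(hull).angle;
}

// The rotation between the two images is then roughly
// objectAngle(img1) - objectAngle(img2), subject to the ambiguity noted above.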
What is the distance transform? What is the theory behind it? If I have 2 similar images but in different positions, how does the distance transform help in overlapping them? The results that the distance transform function produces look as if they are divided down the middle; is it meant to find the center of one image so that the other can be overlapped halfway? I have looked into the OpenCV documentation, but it's still not clear.
Look at the picture below (you may want to increase your monitor brightness to see it better). The picture shows the distance from the red contour depicted with pixel intensities, so in the middle of the image, where the distance is maximum, the intensities are highest. This is a manifestation of the distance transform. Here is an immediate application: the green shape is a so-called active contour, or snake, which moves according to the gradient of distances from the contour (while also following some other constraints) and curls around the red outline. Thus one application of the distance transform is shape processing.
Another application is text recognition: one of the powerful cues for text is a stable stroke width. The distance transform run on segmented text can confirm this. The corresponding method is called the stroke width transform (SWT).
As for aligning two rotated shapes, I am not sure how you can use the DT. You can find the center of a shape in order to rotate it, but you can rotate it about any other point as well; the difference will just be a translation, which is irrelevant if you run matchTemplate to match them in the correct orientation.
Perhaps if you upload your images it will be clearer what to do. In general you can match them as a whole, or by features (which is more robust to various deformations or perspective distortions), or even using outlines/silhouettes if there are only a few features. Finally, you can figure out the orientation of your object (if it has a dominant orientation) by running PCA or fitting an ellipse (as a rotated rectangle):
cv::RotatedRect rect = cv::fitEllipse(points2D);
float angle_to_rotate = rect.angle;
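If you go the PCA route instead, a sketch could look like this (points2D is assumed to be a CV_32F matrix with one (x, y) point per row):

#include <opencv2/core/core.hpp>
#include <cmath>

// Sketch: dominant orientation of a point set from its first principal axis.
double orientationByPCA(const cv::Mat& points2D)
{
    cv::Mat mean, eigenvectors;
    cv::PCACompute(points2D, mean, eigenvectors);

    // The first eigenvector points along the direction of largest variance.
    double vx = eigenvectors.at<float>(0, 0);
    double vy = eigenvectors.at<float>(0, 1);
    return std::atan2(vy, vx) * 180.0 / CV_PI;   // degrees
}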
The distance transform is an operation that works on a single binary image and fundamentally measures, for every empty point (zero pixel), the distance to the nearest boundary point (non-zero pixel).
An example is provided here and here.
The measurement can be based on various definitions, calculated discretely or precisely: e.g. Euclidean, Manhattan, or Chessboard. Indeed, the parameters in the OpenCV implementation allow some of these, and control their accuracy via the mask size.
The function can return the output distance image (floating point) as well as a labelled connected-components image (a Voronoi diagram). There is an example of it in operation here.
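A minimal usage sketch (note that OpenCV measures the distance from each non-zero pixel to the nearest zero pixel, so to get distances from the empty region to the boundary as described above, invert the binary image first):

#include <opencv2/core/core.hpp>
#include <opencv2/imgproc/imgproc.hpp>

// Sketch: distance transform plus the optional label (Voronoi) output.
// 'binary' is assumed to be a CV_8UC1 image with non-zero boundary pixels.
void runDistanceTransform(const cv::Mat& binary)
{
    cv::Mat inverted;
    cv::bitwise_not(binary, inverted);     // boundary pixels become zero

    cv::Mat dist, labels;
    cv::distanceTransform(inverted, dist, labels,
                          CV_DIST_L2, CV_DIST_MASK_5);
    // In OpenCV 3+ the constants are cv::DIST_L2 and cv::DIST_MASK_5.

    // 'dist' holds CV_32F distances; 'labels' is the Voronoi-style labelling.
}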
I see from another question you asked recently that you are looking to register two images together. I don't think the distance transform is really what you are looking for here. If you are looking to align a set of points, I would instead suggest you look at techniques like Procrustes analysis, Iterative Closest Point, or RANSAC.
I have a digital image, and I want to make some calculations based on distances in it, so I need the millimeter/pixel ratio. What I'm doing right now is marking two points whose real-world distance I know, calculating the Euclidean distance between them in pixels, and then obtaining the ratio.
The question is: with only two points, can I obtain the correct millimeter/pixel ratio, or do I need to use 4 points, 2 for the X-axis and 2 for the Y-axis?
If your image is of a flat surface and the camera direction is perpendicular to that surface, then your scale factor should be the same in both directions.
If your image is of a flat surface, but it is tilted relative to the camera, then marking out a rectangle of known proportions on that surface would allow you to compute a perspective transform. (See for example this question)
If your image is of a 3D scene, then of course there is no way in general to convert pixels to distances.
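For the tilted flat-surface case, a sketch of the perspective-transform idea (all coordinates and the 200 x 100 mm rectangle below are made-up placeholders):

#include <opencv2/core/core.hpp>
#include <opencv2/imgproc/imgproc.hpp>
#include <vector>

// Sketch: map the 4 marked corners of a rectangle of known real size to
// metric coordinates, then convert any other image point to millimetres.
int main()
{
    std::vector<cv::Point2f> imageCorners;
    imageCorners.push_back(cv::Point2f(102.f,  85.f));
    imageCorners.push_back(cv::Point2f(410.f,  98.f));
    imageCorners.push_back(cv::Point2f(398.f, 270.f));
    imageCorners.push_back(cv::Point2f( 95.f, 255.f));

    std::vector<cv::Point2f> worldCorners;      // millimetres on the surface
    worldCorners.push_back(cv::Point2f(  0.f,   0.f));
    worldCorners.push_back(cv::Point2f(200.f,   0.f));
    worldCorners.push_back(cv::Point2f(200.f, 100.f));
    worldCorners.push_back(cv::Point2f(  0.f, 100.f));

    cv::Mat H = cv::getPerspectiveTransform(imageCorners, worldCorners);

    std::vector<cv::Point2f> in(1, cv::Point2f(250.f, 180.f)), out;
    cv::perspectiveTransform(in, out, H);        // out[0] is now in millimetres
    return 0;
}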
If you know the distance between points A and B as measured on the picture (say, in inches) and you also know the number of pixels between the points, you can easily calculate the pixels/inch ratio by dividing <pixels>/<inches>.
I suggest taking the points on the picture such that the line through them is either horizontal or vertical, so that the calculation is not thrown off by the pixels having a rectangular (non-square) shape.
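A minimal sketch of that calculation (the coordinates and the 250 mm real-world distance are made-up values):

#include <opencv2/core/core.hpp>
#include <cmath>
#include <cstdio>

// Sketch: millimetres per pixel from two marked points whose real-world
// separation is known.
int main()
{
    cv::Point2f a(120.f, 310.f), b(840.f, 310.f);   // marked pixel positions
    double knownDistanceMm = 250.0;                  // measured in the real world

    double dx = a.x - b.x, dy = a.y - b.y;
    double pixelDistance = std::sqrt(dx * dx + dy * dy);
    double mmPerPixel = knownDistanceMm / pixelDistance;

    std::printf("%f mm per pixel\n", mmPerPixel);
    return 0;
}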
I have a photograph containing multiple rectangles of various sizes and orientations. I am currently trying to find the distance from the camera to any rectangles present in the image. What is the best way to accomplish this?
For example, a photograph might look similar to this (although this is probably very out of proportion):
I can find the pixel coordinates of the corners of any of the rectangles in the image, along with the camera FOV and resolution. I also know beforehand the length and width of any rectangle that could be in the image (but not what angle they face the camera). The ratio of length to width of each rectangular target that could be in the image is guaranteed to be unique. The rectangles and the camera will always be parallel to the ground.
What I've tried:
I hacked out a solution based on some example code I found on the internet. I'm basically iterating through each rectangle and finding the average pixel length and height.
I then use this to find the ratio of length vs. height and compare it against a list of the ratios of all known rectangular targets, so I can find the actual height of the target in inches. I then use this information to find the distance:
...where actual_height is the real height of the target in inches, IMAGE_HEIGHT is how tall the image is (in pixels), pixel_height is the average height of the rectangle in the image (in pixels), and VERTICAL_FOV is the angle the camera sees along the vertical axis in degrees (about 39.75 degrees on my camera).
I found this formula on the internet, and while it seems to work reasonably well, I don't really understand how it works, and it always seems to undershoot the actual distance by a bit.
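For reference, the standard pinhole-camera relation that matches the variables described above looks like this; this is a reconstruction from those descriptions, not necessarily the exact expression the original code used:

#include <cmath>

// Sketch: distance from the camera under a simple pinhole model.
// The target spans (pixel_height / IMAGE_HEIGHT) of the vertical field of
// view, and similar triangles give the distance.
double estimateDistance(double actual_height,   // real target height (inches)
                        double pixel_height,    // target height in the image (pixels)
                        double IMAGE_HEIGHT,    // image height (pixels)
                        double VERTICAL_FOV)    // vertical FOV (degrees)
{
    const double PI = 3.14159265358979323846;
    double halfFov = (VERTICAL_FOV * PI / 180.0) / 2.0;
    return actual_height * IMAGE_HEIGHT / (2.0 * pixel_height * std::tan(halfFov));
}

One thing that may partly explain a consistent undershoot is that a relation like this gives the distance along the camera's optical axis, which is slightly shorter than the straight-line distance to a target that sits off-centre in the frame.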
In addition, I'm not sure how to go about modifying the formula so that it can deal with rectangles that are heavily skewed from being viewed at an angle. Since my algorithm works by finding the proportion of the length and height, it works OK for rectangles 1 and 2 (which aren't too skewed), but it doesn't work for rectangle 3, since that one is very skewed, which throws the ratios completely off.
I considered finding the ratio using the method outlined in this StackOverflow question regarding the proportions of a perspective-deformed rectangle, but I wasn't sure how well that would work with what I have, and was wondering if it's overkill or if there's a simpler solution I could try.
FWIW I once did something similar with triangles (full 6DoF pose, not just distance).