MinAreaRect angles - Unsure about the angle returned - opencv

From the functions for MinAreaRect, does it return angles in the range of 0-360 degrees?
I am unsure as i have an object that is oriented at 90 degrees or so but I keep getting either -1 or -15 degrees. Could this be an openCV error?
Any guidance much appreciated.
Thanks

I'm going to assume you're using C++, but the answer should be the same if you're using C or Python.
The function minAreaRect seems to give angles ranging from -90 to 0 degrees, not including zero, so an interval of [-90, 0).
The function gives -90 degrees if the rectangle it outputs isn't rotated, i.e. the rectangle has two sides exactly horizontal and two sides exactly vertical. As the rectangle rotates clockwise, the angle increases (goes towards zero). When zero is reached, the angle given by the function ticks back over to -90 degrees again.
So if you have a long rectangle from minAreaRect, and it's lying down flat, minAreaRect will call the angle -90 degrees. If you rotate the image until the rectangle given by minAreaRect is perfectly upright, then the angle will say -90 degrees again.
I didn't actually know any of this (I procrastinated from my OpenCV project to find out how it works :/). Anyway, here's an OpenCV program that demonstrates minAreaRect if I haven't explained it clear enough already:
#include <stdio.h>
#include <opencv\cv.h>
#include <opencv\highgui.h>
using namespace cv;
int main() {
float angle = 0;
Mat image(200, 400, CV_8UC3, Scalar(0));
RotatedRect originalRect;
Point2f vertices[4];
vector<Point2f> vertVect;
RotatedRect calculatedRect;
while (waitKey(5000) != 27) {
// Create a rectangle, rotating it by 10 degrees more each time.
originalRect = RotatedRect(Point2f(100,100), Size2f(100,50), angle);
// Convert the rectangle to a vector of points for minAreaRect to use.
// Also move the points to the right, so that the two rectangles aren't
// in the same place.
originalRect.points(vertices);
for (int i = 0; i < 4; i++) {
vertVect.push_back(vertices[i] + Point2f(200, 0));
}
// Get minAreaRect to find a rectangle that encloses the points. This
// should have the exact same orientation as our original rectangle.
calculatedRect = minAreaRect(vertVect);
// Draw the original rectangle, and the one given by minAreaRect.
for (int i = 0; i < 4; i++) {
line(image, vertices[i], vertices[(i+1)%4], Scalar(0, 255, 0));
line(image, vertVect[i], vertVect[(i+1)%4], Scalar(255, 0, 0));
}
imshow("rectangles", image);
// Print the angle values.
printf("---\n");
printf("Original angle: %7.2f\n", angle);
printf("Angle given by minAreaRect: %7.2f\n", calculatedRect.angle);
printf("---\n");
// Reset everything for the next frame.
image = Mat(200, 400, CV_8UC3, Scalar(0));
vertVect.clear();
angle+=10;
}
return 0;
}
This lets you easily see how the angle, and shape, of a manually drawn rectangle compares to the minAreaRect interpretation of the same rectangle.

Improving on the answer of #Adam Goodwin i want to add my little code that changes the behaviour a little bit:
I wanted to have the angle between the longer side and vertical (to me it is the most natural way to think about rotated rectangles):
If you need the same, just use this code:
void printAngle(RotatedRect calculatedRect){
if(calculatedRect.size.width < calculatedRect.size.height){
printf("Angle along longer side: %7.2f\n", calculatedRect.angle+180);
}else{
printf("Angle along longer side: %7.2f\n", calculatedRect.angle+90);
}
}
To see it in action just insert it in Adam Goodwins code:
printf("Angle given by minAreaRect: %7.2f\n", calculatedRect.angle);
printAngle(calculatedRect);
printf("---\n");

After experiment, I find that if the long side is in the left of the bottom Point, the angle value is between long side and Y+ axis, but if the long side is in the right of the bottom Point, the angle value is between long side and X+ axis.
So I use the code like this(java):
rRect = Imgproc.minAreaRect(mop2f);
if(rRect.size.width<rRect.size.height){
angle = 90 -rRect.angle;
}else{
angle = -rRect.angle;
}
The angle is from 0 to 180.

After much experiment, I have found that the relationship between the rectangle orientation and output angle of minAreaRect(). It can be summarized in the following image
The following description assume that we have a rectangle with unequal height and width length, i.e., it is not square.
If the rectangle lies vertically (width < height), then the detected angle is -90. If the rectangle lies horizontally, then the detected angle is also -90 degree.
If the top part of the rectangle is in first quadrant, then the detected angle decreases as the rectangle rotate from horizontal to vertical position, until the detected angle becomes -90 degrees. In first quadrant, the width of detected rectangle is longer than its height.
If the top part of the detected rectangle is in second quadrant, then the angle decreases as the rectangle rotate from vertical to horizontal position. But there is a difference between second and first quadrant. If the rectangle approaches vertical position but has not been in vertical position, its angle approaches 0. If the rectangle approaches horizontal position but has not been in horizontal position, its angle approaches -90 degrees.
This post here is also good in explaining this.

It depends on the version of opencv, at least for Python.
For opencv-python='4.5.4.60'. The angle is that between positive x-axis and the first line the axis meets when it rotates anti-clock wise. The following is the code to snippet.
import cv2
import numpy as np
box1 = [[0, 0], [1, 0], [1, 2], [0, 2]]
cv2.minAreaRect(np.asarray(box1)) # angel = 90.0
box2 = [[0, 0], [2, 0], [2, 1], [0, 1]]
cv2.minAreaRect(np.asarray(box2)) # angel = 90.0
box3 = [[0, 0], [2**0.5, 2**0.5], [0.5*2**0.5, 1.5*2**0.5], [-0.5*2**0.5, 0.5*2**0.5]]
cv2.minAreaRect(np.asarray(box3, dtype=np.float32)) # angle = 44.999
box4 = [[0, 0], [-2**0.5, 2**0.5], [-0.5*2**0.5, 1.5*2**0.5], [0.5*2**0.5, 0.5*2**0.5]]
cv2.minAreaRect(np.asarray(box4, dtype=np.float32)) # angle = 45.0
box5 = [[0, 0], [-0.5*2**0.5, 0.5*2**0.5], [-2**0.5, 0], [-0.5*2**0.5, -0.5*2**0.5]]
cv2.minAreaRect(np.asarray(box5, dtype=np.float32)) # angle = 45.0
For opencv-python='3.4.13.47'. The angle is that between positive x-axis and the first line the axis meets when it rotates clock wise. The following is the code to snippet.
import cv2
import numpy as np
box1 = [[0, 0], [1, 0], [1, 2], [0, 2]]
cv2.minAreaRect(np.asarray(box1)) # angel = -90.0
box2 = [[0, 0], [2, 0], [2, 1], [0, 1]]
cv2.minAreaRect(np.asarray(box2)) # angel = -90.0
box3 = [[0, 0], [2**0.5, 2**0.5], [0.5*2**0.5, 1.5*2**0.5], [-0.5*2**0.5, 0.5*2**0.5]]
cv2.minAreaRect(np.asarray(box3, dtype=np.float32)) # angle = -44.999
box4 = [[0, 0], [-2**0.5, 2**0.5], [-0.5*2**0.5, 1.5*2**0.5], [0.5*2**0.5, 0.5*2**0.5]]
cv2.minAreaRect(np.asarray(box4, dtype=np.float32)) # angle = -45.0
box5 = [[0, 0], [-0.5*2**0.5, 0.5*2**0.5], [-2**0.5, 0], [-0.5*2**0.5, -0.5*2**0.5]]
cv2.minAreaRect(np.asarray(box5, dtype=np.float32)) # angle = -45.0

Related

Why do we need `rotateToARCamera` in Apple's `Visualizing a Point Cloud Using Scene Depth` sample code?

Sample code: https://developer.apple.com/documentation/arkit/visualizing_a_point_cloud_using_scene_depth
In the code, when unprojecting depthmap into world point, we are using a positive z value(depth value). But in my understanding, ARKit uses right-handed coordinate system which means points with positive z value are behind the camera. So maybe we need to do some extra work to align the coordinate system(using rotateToARCamera matrix?). But I cannot understand why we need to flip both Y and Z plane.
static func makeRotateToARCameraMatrix(orientation: UIInterfaceOrientation) -> matrix_float4x4 {
// flip to ARKit Camera's coordinate
let flipYZ = matrix_float4x4(
[1, 0, 0, 0],
[0, -1, 0, 0],
[0, 0, -1, 0],
[0, 0, 0, 1] )
let rotationAngle = Float(cameraToDisplayRotation(orientation: orientation)) * .degreesToRadian
return flipYZ * matrix_float4x4(simd_quaternion(rotationAngle, Float3(0, 0, 1)))
}
Update: I guess the key point is the coordinate system used for camera intrinsics matrix's pin-hole model has an inverse direction compared to the normal camera space in ARKit.
Depth Map is a coordinate system where the Y coordinate is smaller at the top and larger at the bottom like image data, but ARKit is a coordinate system where the Y coordinate is smaller from the bottom and larger at the top.
For this reason, I think it is necessary to invert the Y coordinate.

What is difference between (CountNonZero) and (Moment M00) and (ContourArea) in OpenCV?

If I have a 3x3 binary image and there is a contour in locations(x,y): (0,0), (0,1), (1,0), (1,1)
I get the contour via findContours method.
I want to get this contour's area:
with CountNonZero: 4
with contourArea: 1
with Moment M00: 1
What is the correct answer and what is the difference between them?
This contour is square so the area is 2*2 = 4
So why is ContourArea equal to 1?
I am using EmguCV and this is my code:
VectorOfVectorOfPoint cont = new VectorOfVectorOfPoint();
Image<Gray, byte> img = new Image<Gray, byte>(3,3);
img[0, 0] = new Gray(255);
img[0, 1] = new Gray(255);
img[1, 0] = new Gray(255);
img[1, 1] = new Gray(255);
CvInvoke.FindContours(img, cont, null, Emgu.CV.CvEnum.RetrType.External, Emgu.CV.CvEnum.ChainApproxMethod.ChainApproxSimple);
Moments m = CvInvoke.Moments(cont[0], true);
Console.WriteLine(CvInvoke.ContourArea(cont[0]));
CvInvoke.Imshow("ss", img);
CvInvoke.WaitKey(0);
I’m not aware of implementation details, but I would suspect that the “contour” is a polygon that goes from pixel center to pixel center around the object. This polygon is smaller than the set of pixels, each edge is moved inwards by half a pixel distance.
This is consistent with the area of a 2x2 pixel block being measured as 1 pixel.
If you want to measure area, don’t use the contour functionality. Use connected component analysis (object labeling) and count the number of pixels in each connected component.
OpenCV is not meant for precise quantification, and there are lots of things in it that don’t make sense to me.
In correlation with Cris's answer, the contour in your case is a square whose side length is 1 pixel ==> area = 1 pixel squared.
This is how the image and the contour would look like:
image:
[[255 255 0]
[255 255 0]
[ 0 0 0]]
contour:
[[[0 0]]
[[0 1]]
[[1 1]]
[[1 0]]]
area:
1.0

Inverse Perspective Transform?

I am trying to find the bird's eye image from a given image. I also have the rotations and translations (also intrinsic matrix) required to convert it into the bird's eye plane. My aim is to find an inverse homography matrix(3x3).
rotation_x = np.asarray([[1,0,0,0],
[0,np.cos(R_x),-np.sin(R_x),0],
[0,np.sin(R_x),np.cos(R_x),0],
[0,0,0,1]],np.float32)
translation = np.asarray([[1, 0, 0, 0],
[0, 1, 0, 0 ],
[0, 0, 1, -t_y/(dp_y * np.sin(R_x))],
[0, 0, 0, 1]],np.float32)
intrinsic = np.asarray([[s_x * f / (dp_x ),0, 0, 0],
[0, 1 * f / (dp_y ) ,0, 0 ],
[0,0,1,0]],np.float32)
#The Projection matrix to convert the image coordinates to 3-D domain from (x,y,1) to (x,y,0,1); Not sure if this is the right approach
projection = np.asarray([[1, 0, 0],
[0, 1, 0],
[0, 0, 0],
[0, 0, 1]], np.float32)
homography_matrix = intrinsic # translation # rotation # projection
inv = cv2.warpPerspective(source_image, homography_matrix,(w,h),flags = cv2.INTER_CUBIC | cv2.WARP_INVERSE_MAP)
My question is, Is this the right approach, as I can manual set a suitable ty,rx, but not for the one (ty,rx) which is provided.
First premise: your bird's eye view will be correct only for one specific plane in the image, since a homography can only map planes (including the plane at infinity, corresponding to a pure camera rotation).
Second premise: if you can identify a quadrangle in the first image that is the projection of a rectangle in the world, you can directly compute the homography that maps the quad into the rectangle (i.e. the "birds's eye view" of the quad), and warp the image with it, setting the scale so the image warps to a desired size. No need to use the camera intrinsics. Example: you have the image of a building with rectangular windows, and you know the width/height ratio of these windows in the world.
Sometimes you can't find rectangles, but your camera is calibrated, and thus the problem you describe comes into play. Let's do the math. Assume the plane you are observing in the given image is Z=0 in world coordinates. Let K be the 3x3 intrinsic camera matrix and [R, t] the 3x4 matrix representing the camera pose in XYZ world frame, so that if Pc and Pw represent the same 3D point respectively in camera and world coordinates, it is Pc = R*Pw + t = [R, t] * [Pw.T, 1].T, where .T means transposed. Then you can write the camera projection as:
s * p = K * [R, t] * [Pw.T, 1].T
where s is an arbitrary scale factor and p is the pixel that Pw projects onto. But if Pw=[X, Y, Z].T is on the Z=0 plane, the 3rd column of R only multiplies zeros, so we can ignore it. If we then denote with r1 and r2 the first two columns of R, we can rewrite the above equation as:
s * p = K * [r1, r2, t] * [X, Y, 1].T
But K * [r1, r2, t] is a 3x3 matrix that transforms points on a 3D plane to points on the camera plane, so it is a homography.
If the plane is not Z=0, you can repeat the same argument replacing [R, t] with [R, t] * inv([Rp, tp]), where [Rp, tp] is the coordinate transform that maps a frame on the plane, with the plane normal being the Z axis, to the world frame.
Finally, to obtain the bird's eye view, you select a rotation R whose third column (the components of the world's Z axis in camera frame) is opposite to the plane's normal.

OpenCV : wrapPerspective on whole image

I'm detecting markers on images captured by my iPad. Because of that I want to calculate translations and rotations between them, I want to change change perspective on images these image, so it would look like I'm capturing them directly above markers.
Right now I'm using
points2D.push_back(cv::Point2f(0, 0));
points2D.push_back(cv::Point2f(50, 0));
points2D.push_back(cv::Point2f(50, 50));
points2D.push_back(cv::Point2f(0, 50));
Mat perspectiveMat = cv::getPerspectiveTransform(points2D, imagePoints);
cv::warpPerspective(*_image, *_undistortedImage, M, cv::Size(_image->cols, _image->rows));
Which gives my these results (look at the right-bottom corner for result of warpPerspective):
As you probably see result image contains recognized marker in left-top corner of the result image. My problem is that I want to capture whole image (without cropping) so I could detect other markers on that image later.
How can I do that? Maybe I should use rotation/translation vectors from solvePnP function?
EDIT:
Unfortunatelly changing size of warped image don't help much, because image is still translated so left-top corner of marker is in top-left corner of image.
For example when I've doubled size using:
cv::warpPerspective(*_image, *_undistortedImage, M, cv::Size(2*_image->cols, 2*_image->rows));
I've recieved these images:
Your code doesn't seem to be complete, so it is difficult to say what the problem is.
In any case the warped image might have completely different dimensions compared to the input image so you will have to adjust the size paramter you are using for warpPerspective.
For example try to double the size:
cv::warpPerspective(*_image, *_undistortedImage, M, 2*cv::Size(_image->cols, _image->rows));
Edit:
To make sure the whole image is inside this image, all corners of your original image must be warped to be inside the resulting image. So simply calculate the warped destination for each of the corner points and adjust the destination points accordingly.
To make it more clear some sample code:
// calculate transformation
cv::Matx33f M = cv::getPerspectiveTransform(points2D, imagePoints);
// calculate warped position of all corners
cv::Point3f a = M.inv() * cv::Point3f(0, 0, 1);
a = a * (1.0/a.z);
cv::Point3f b = M.inv() * cv::Point3f(0, _image->rows, 1);
b = b * (1.0/b.z);
cv::Point3f c = M.inv() * cv::Point3f(_image->cols, _image->rows, 1);
c = c * (1.0/c.z);
cv::Point3f d = M.inv() * cv::Point3f(_image->cols, 0, 1);
d = d * (1.0/d.z);
// to make sure all corners are in the image, every position must be > (0, 0)
float x = ceil(abs(min(min(a.x, b.x), min(c.x, d.x))));
float y = ceil(abs(min(min(a.y, b.y), min(c.y, d.y))));
// and also < (width, height)
float width = ceil(abs(max(max(a.x, b.x), max(c.x, d.x)))) + x;
float height = ceil(abs(max(max(a.y, b.y), max(c.y, d.y)))) + y;
// adjust target points accordingly
for (int i=0; i<4; i++) {
points2D[i] += cv::Point2f(x,y);
}
// recalculate transformation
M = cv::getPerspectiveTransform(points2D, imagePoints);
// get result
cv::Mat result;
cv::warpPerspective(*_image, result, M, cv::Size(width, height), cv::WARP_INVERSE_MAP);
I implemented littleimp's answer in python in case anyone needs it. It should be noted that this will not work properly if the vanishing points of the polygons are falling within the image.
import cv2
import numpy as np
from PIL import Image, ImageDraw
import math
def get_transformed_image(src, dst, img):
# calculate the tranformation
mat = cv2.getPerspectiveTransform(src.astype("float32"), dst.astype("float32"))
# new source: image corners
corners = np.array([
[0, img.size[0]],
[0, 0],
[img.size[1], 0],
[img.size[1], img.size[0]]
])
# Transform the corners of the image
corners_tranformed = cv2.perspectiveTransform(
np.array([corners.astype("float32")]), mat)
# These tranformed corners seems completely wrong/inverted x-axis
print(corners_tranformed)
x_mn = math.ceil(min(corners_tranformed[0].T[0]))
y_mn = math.ceil(min(corners_tranformed[0].T[1]))
x_mx = math.ceil(max(corners_tranformed[0].T[0]))
y_mx = math.ceil(max(corners_tranformed[0].T[1]))
width = x_mx - x_mn
height = y_mx - y_mn
analogy = height/1000
n_height = height/analogy
n_width = width/analogy
dst2 = corners_tranformed
dst2 -= np.array([x_mn, y_mn])
dst2 = dst2/analogy
mat2 = cv2.getPerspectiveTransform(corners.astype("float32"),
dst2.astype("float32"))
img_warp = Image.fromarray((
cv2.warpPerspective(np.array(image),
mat2,
(int(n_width),
int(n_height)))))
return img_warp
# image coordingates
src= np.array([[ 789.72, 1187.35],
[ 789.72, 752.75],
[1277.35, 730.66],
[1277.35,1200.65]])
# known coordinates
dst=np.array([[0, 1000],
[0, 0],
[1092, 0],
[1092, 1000]])
# Create the image
image = Image.new('RGB', (img_width, img_height))
image.paste( (200,200,200), [0,0,image.size[0],image.size[1]])
draw = ImageDraw.Draw(image)
draw.line(((src[0][0],src[0][1]),(src[1][0],src[1][1]), (src[2][0],src[2][1]),(src[3][0],src[3][1]), (src[0][0],src[0][1])), width=4, fill="blue")
#image.show()
warped = get_transformed_image(src, dst, image)
warped.show()
There are two things you need to do:
Increase the size of the output of cv2.warpPerspective
Translate the warped source image such that the center of the warped source image matches with the center of cv2.warpPerspective output image
Here is how code will look:
# center of source image
si_c = [x//2 for x in image.shape] + [1]
# find where center of source image will be after warping without comepensating for any offset
wsi_c = np.dot(H, si_c)
wsi_c = [x/wsi_c[2] for x in wsi_c]
# warping output image size
stitched_frame_size = tuple(2*x for x in image.shape)
# center of warping output image
wf_c = image.shape
# calculate offset for translation of warped image
x_offset = wf_c[0] - wsi_c[0]
y_offset = wf_c[1] - wsi_c[1]
# translation matrix
T = np.array([[1, 0, x_offset], [0, 1, y_offset], [0, 0, 1]])
# translate tomography matrix
translated_H = np.dot(T.H)
# warp
stitched = cv2.warpPerspective(image, translated_H, stitched_frame_size)

How to find corners on a Image using OpenCv

I´m trying to find the corners on a image, I don´t need the contours, only the 4 corners. I will change the perspective using 4 corners.
I´m using Opencv, but I need to know the steps to find the corners and what function I will use.
My images will be like this:(without red points, I will paint the points after)
EDITED:
After suggested steps, I writed the code: (Note: I´m not using pure OpenCv, I´m using javaCV, but the logic it´s the same).
// Load two images and allocate other structures (I´m using other image)
IplImage colored = cvLoadImage(
"res/scanteste.jpg",
CV_LOAD_IMAGE_UNCHANGED);
IplImage gray = cvCreateImage(cvGetSize(colored), IPL_DEPTH_8U, 1);
IplImage smooth = cvCreateImage(cvGetSize(colored), IPL_DEPTH_8U, 1);
//Step 1 - Convert from RGB to grayscale (cvCvtColor)
cvCvtColor(colored, gray, CV_RGB2GRAY);
//2 Smooth (cvSmooth)
cvSmooth( gray, smooth, CV_BLUR, 9, 9, 2, 2);
//3 - cvThreshold - What values?
cvThreshold(gray,gray, 155, 255, CV_THRESH_BINARY);
//4 - Detect edges (cvCanny) -What values?
int N = 7;
int aperature_size = N;
double lowThresh = 20;
double highThresh = 40;
cvCanny( gray, gray, lowThresh*N*N, highThresh*N*N, aperature_size );
//5 - Find contours (cvFindContours)
int total = 0;
CvSeq contour2 = new CvSeq(null);
CvMemStorage storage2 = cvCreateMemStorage(0);
CvMemStorage storageHull = cvCreateMemStorage(0);
total = cvFindContours(gray, storage2, contour2, Loader.sizeof(CvContour.class), CV_RETR_CCOMP, CV_CHAIN_APPROX_NONE);
if(total > 1){
while (contour2 != null && !contour2.isNull()) {
if (contour2.elem_size() > 0) {
//6 - Approximate contours with linear features (cvApproxPoly)
CvSeq points = cvApproxPoly(contour2,Loader.sizeof(CvContour.class), storage2, CV_POLY_APPROX_DP,cvContourPerimeter(contour2)*0.005, 0);
cvDrawContours(gray, points,CvScalar.BLUE, CvScalar.BLUE, -1, 1, CV_AA);
}
contour2 = contour2.h_next();
}
}
So, I want to find the cornes, but I don´t know how to use corners function like cvCornerHarris and others.
First, check out /samples/c/squares.c in your OpenCV distribution. This example provides a square detector, and it should be a pretty good start on how to detect corner-like features. Then, take a look at OpenCV's feature-oriented functions like cvCornerHarris() and cvGoodFeaturesToTrack().
The above methods can return many corner-like features - most will not be the "true corners" you are looking for. In my application, I had to detect squares that had been rotated or skewed (due to perspective). My detection pipeline consisted of:
Convert from RGB to grayscale (cvCvtColor)
Smooth (cvSmooth)
Threshold (cvThreshold)
Detect edges (cvCanny)
Find contours (cvFindContours)
Approximate contours with linear features (cvApproxPoly)
Find "rectangles" which were structures that: had polygonalized contours possessing 4 points, were of sufficient area, had adjacent edges were ~90 degrees, had distance between "opposite" vertices was of sufficient size, etc.
Step 7 was necessary because a slightly noisy image can yield many structures that appear rectangular after polygonalization. In my application, I also had to deal with square-like structures that appeared within, or overlapped the desired square. I found the contour's area property and center of gravity to be helpful in discerning the proper rectangle.
At a first glance, for a human eye there are 4 corners. But in computer vision, a corner is considered to be a point that has large gradient change in intensity across its neighborhood. The neighborhood can be a 4 pixel neighborhood or an 8 pixel neighborhood.
In the equation provided to find the gradient of intensity, it has been considered for 4-pixel neighborhood SEE DOCUMENTATION.
Here is my approach for the image in question. I have the code in python as well:
path = r'C:\Users\selwyn77\Desktop\Stack\corner'
filename = 'env.jpg'
img = cv2.imread(os.path.join(path, filename))
gray = cv2.cvtColor(img,cv2.COLOR_BGR2GRAY) #--- convert to grayscale
It is a good choice to always blur the image to remove less possible gradient changes and preserve the more intense ones. I opted to choose the bilateral filter which unlike the Gaussian filter doesn't blur all the pixels in the neighborhood. It rather blurs pixels which has similar pixel intensity to that of the central pixel. In short it preserves edges/corners of high gradient change but blurs regions that have minimal gradient changes.
bi = cv2.bilateralFilter(gray, 5, 75, 75)
cv2.imshow('bi',bi)
To a human it is not so much of a difference compared to the original image. But it does matter. Now finding possible corners:
dst = cv2.cornerHarris(bi, 2, 3, 0.04)
dst returns an array (the same 2D shape of the image) with eigen values obtained from the final equation mentioned HERE.
Now a threshold has to be applied to select those corners beyond a certain value. I will use the one in the documentation:
#--- create a black image to see where those corners occur ---
mask = np.zeros_like(gray)
#--- applying a threshold and turning those pixels above the threshold to white ---
mask[dst>0.01*dst.max()] = 255
cv2.imshow('mask', mask)
The white pixels are regions of possible corners. You can find many corners neighboring each other.
To draw the selected corners on the image:
img[dst > 0.01 * dst.max()] = [0, 0, 255] #--- [0, 0, 255] --> Red ---
cv2.imshow('dst', img)
(Red colored pixels are the corners, not so visible)
In order to get an array of all pixels with corners:
coordinates = np.argwhere(mask)
UPDATE
Variable coor is an array of arrays. Converting it to list of lists
coor_list = [l.tolist() for l in list(coor)]
Converting the above to list of tuples
coor_tuples = [tuple(l) for l in coor_list]
I have an easy and rather naive way to find the 4 corners. I simply calculated the distance of each corner to every other corner. I preserved those corners whose distance exceeded a certain threshold.
Here is the code:
thresh = 50
def distance(pt1, pt2):
(x1, y1), (x2, y2) = pt1, pt2
dist = math.sqrt( (x2 - x1)**2 + (y2 - y1)**2 )
return dist
coor_tuples_copy = coor_tuples
i = 1
for pt1 in coor_tuples:
print(' I :', i)
for pt2 in coor_tuples[i::1]:
print(pt1, pt2)
print('Distance :', distance(pt1, pt2))
if(distance(pt1, pt2) < thresh):
coor_tuples_copy.remove(pt2)
i+=1
Prior to running the snippet above coor_tuples had all corner points:
[(4, 42),
(4, 43),
(5, 43),
(5, 44),
(6, 44),
(7, 219),
(133, 36),
(133, 37),
(133, 38),
(134, 37),
(135, 224),
(135, 225),
(136, 225),
(136, 226),
(137, 225),
(137, 226),
(137, 227),
(138, 226)]
After running the snippet I was left with 4 corners:
[(4, 42), (7, 219), (133, 36), (135, 224)]
UPDATE 2
Now all you have to do is just mark these 4 points on a copy of the original image.
img2 = img.copy()
for pt in coor_tuples:
cv2.circle(img2, tuple(reversed(pt)), 3, (0, 0, 255), -1)
cv2.imshow('Image with 4 corners', img2)
Here's an implementation using cv2.goodFeaturesToTrack() to detect corners. The approach is
Convert image to grayscale
Perform canny edge detection
Detect corners
Optionally perform 4-point perspective transform to get top-down view of image
Using this starting image,
After converting to grayscale, we perform canny edge detection
Now that we have a decent binary image, we can use cv2.goodFeaturesToTrack()
corners = cv2.goodFeaturesToTrack(canny, 4, 0.5, 50)
For the parameters, we give it the canny image, set the maximum number of corners to 4 (maxCorners), use a minimum accepted quality of 0.5 (qualityLevel), and set the minimum possible Euclidean distance between the returned corners to 50 (minDistance). Here's the result
Now that we have identified the corners, we can perform a 4-point perspective transform to obtain a top-down view of the object. We first order the points clockwise then draw the result onto a mask.
Note: We could have just found contours on the Canny image instead of doing this step to create the mask, but pretend we only had the 4 corner points to work with
Next we find contours on this mask and filter using cv2.arcLength() and cv2.approxPolyDP(). The idea is that if the contour has 4 points, then it must be our object. Once we have this contour, we perform a perspective transform
Finally we rotate the image depending on the desired orientation. Here's the result
Code for only detecting corners
import cv2
image = cv2.imread('1.png')
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
canny = cv2.Canny(gray, 120, 255, 1)
corners = cv2.goodFeaturesToTrack(canny,4,0.5,50)
for corner in corners:
x,y = corner.ravel()
cv2.circle(image,(x,y),5,(36,255,12),-1)
cv2.imshow('canny', canny)
cv2.imshow('image', image)
cv2.waitKey()
Code for detecting corners and performing perspective transform
import cv2
import numpy as np
def rotate_image(image, angle):
# Grab the dimensions of the image and then determine the center
(h, w) = image.shape[:2]
(cX, cY) = (w / 2, h / 2)
# grab the rotation matrix (applying the negative of the
# angle to rotate clockwise), then grab the sine and cosine
# (i.e., the rotation components of the matrix)
M = cv2.getRotationMatrix2D((cX, cY), -angle, 1.0)
cos = np.abs(M[0, 0])
sin = np.abs(M[0, 1])
# Compute the new bounding dimensions of the image
nW = int((h * sin) + (w * cos))
nH = int((h * cos) + (w * sin))
# Adjust the rotation matrix to take into account translation
M[0, 2] += (nW / 2) - cX
M[1, 2] += (nH / 2) - cY
# Perform the actual rotation and return the image
return cv2.warpAffine(image, M, (nW, nH))
def order_points_clockwise(pts):
# sort the points based on their x-coordinates
xSorted = pts[np.argsort(pts[:, 0]), :]
# grab the left-most and right-most points from the sorted
# x-roodinate points
leftMost = xSorted[:2, :]
rightMost = xSorted[2:, :]
# now, sort the left-most coordinates according to their
# y-coordinates so we can grab the top-left and bottom-left
# points, respectively
leftMost = leftMost[np.argsort(leftMost[:, 1]), :]
(tl, bl) = leftMost
# now, sort the right-most coordinates according to their
# y-coordinates so we can grab the top-right and bottom-right
# points, respectively
rightMost = rightMost[np.argsort(rightMost[:, 1]), :]
(tr, br) = rightMost
# return the coordinates in top-left, top-right,
# bottom-right, and bottom-left order
return np.array([tl, tr, br, bl], dtype="int32")
def perspective_transform(image, corners):
def order_corner_points(corners):
# Separate corners into individual points
# Index 0 - top-right
# 1 - top-left
# 2 - bottom-left
# 3 - bottom-right
corners = [(corner[0][0], corner[0][1]) for corner in corners]
top_r, top_l, bottom_l, bottom_r = corners[0], corners[1], corners[2], corners[3]
return (top_l, top_r, bottom_r, bottom_l)
# Order points in clockwise order
ordered_corners = order_corner_points(corners)
top_l, top_r, bottom_r, bottom_l = ordered_corners
# Determine width of new image which is the max distance between
# (bottom right and bottom left) or (top right and top left) x-coordinates
width_A = np.sqrt(((bottom_r[0] - bottom_l[0]) ** 2) + ((bottom_r[1] - bottom_l[1]) ** 2))
width_B = np.sqrt(((top_r[0] - top_l[0]) ** 2) + ((top_r[1] - top_l[1]) ** 2))
width = max(int(width_A), int(width_B))
# Determine height of new image which is the max distance between
# (top right and bottom right) or (top left and bottom left) y-coordinates
height_A = np.sqrt(((top_r[0] - bottom_r[0]) ** 2) + ((top_r[1] - bottom_r[1]) ** 2))
height_B = np.sqrt(((top_l[0] - bottom_l[0]) ** 2) + ((top_l[1] - bottom_l[1]) ** 2))
height = max(int(height_A), int(height_B))
# Construct new points to obtain top-down view of image in
# top_r, top_l, bottom_l, bottom_r order
dimensions = np.array([[0, 0], [width - 1, 0], [width - 1, height - 1],
[0, height - 1]], dtype = "float32")
# Convert to Numpy format
ordered_corners = np.array(ordered_corners, dtype="float32")
# Find perspective transform matrix
matrix = cv2.getPerspectiveTransform(ordered_corners, dimensions)
# Return the transformed image
return cv2.warpPerspective(image, matrix, (width, height))
image = cv2.imread('1.png')
original = image.copy()
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
canny = cv2.Canny(gray, 120, 255, 1)
corners = cv2.goodFeaturesToTrack(canny,4,0.5,50)
c_list = []
for corner in corners:
x,y = corner.ravel()
c_list.append([int(x), int(y)])
cv2.circle(image,(x,y),5,(36,255,12),-1)
corner_points = np.array([c_list[0], c_list[1], c_list[2], c_list[3]])
ordered_corner_points = order_points_clockwise(corner_points)
mask = np.zeros(image.shape, dtype=np.uint8)
cv2.fillPoly(mask, [ordered_corner_points], (255,255,255))
mask = cv2.cvtColor(mask, cv2.COLOR_BGR2GRAY)
cnts = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
cnts = cnts[0] if len(cnts) == 2 else cnts[1]
for c in cnts:
peri = cv2.arcLength(c, True)
approx = cv2.approxPolyDP(c, 0.015 * peri, True)
if len(approx) == 4:
transformed = perspective_transform(original, approx)
result = rotate_image(transformed, -90)
cv2.imshow('canny', canny)
cv2.imshow('image', image)
cv2.imshow('mask', mask)
cv2.imshow('transformed', transformed)
cv2.imshow('result', result)
cv2.waitKey()
find contours with RETR_EXTERNAL option.(gray -> gaussian filter -> canny edge -> find contour)
find the largest size contour -> this will be the edge of the rectangle
find corners with little calculation
Mat m;//image file
findContours(m, contours_, hierachy_, RETR_EXTERNAL);
auto it = max_element(contours_.begin(), contours_.end(),
[](const vector<Point> &a, const vector<Point> &b) {
return a.size() < b.size(); });
Point2f xy[4] = {{9000,9000}, {0, 1000}, {1000, 0}, {0,0}};
for(auto &[x, y] : *it) {
if(x + y < xy[0].x + xy[0].y) xy[0] = {x, y};
if(x - y > xy[1].x - xy[1].y) xy[1] = {x, y};
if(y - x > xy[2].y - xy[2].x) xy[2] = {x, y};
if(x + y > xy[3].x + xy[3].y) xy[3] = {x, y};
}
xy[4] will be the four corners.
I was able to extract four corners this way.
Apply houghlines to the canny image - you will get a list of points
apply convex hull to this set of points

Resources