I detected the ArUco marker and estimated the pose; see the image below. However, the Xt (X translation) I get is a positive value. According to the drawAxis function, the positive direction is away from the image center, so I thought it was supposed to be negative. Why am I getting a positive value instead?
My camera is about 120 mm away from the imaging surface, but I am getting Zt (Z translation) in the range of 650 mm. Is pose estimation giving the pose of the marker with respect to the physical camera or the image plane center? I don't understand why Zt is so high.
I kept measuring the pose while changing Z and obtained roll, pitch, and yaw. I noticed that roll (rotation w.r.t. the camera X axis) keeps flipping its sign back and forth while its magnitude stays between 166 and 178, but the sign of Xt does not change with the sign change in roll. Any thoughts on why it behaves like that?
Any suggestions for getting more consistent data?
import math
import cv2 as cv
import numpy as np
from scipy.spatial.transform import Rotation as R

# fname, mtx (camera matrix), dst (distortion coefficients),
# aruco_marker_side_length and euler_from_quaternion() are defined elsewhere in the script
image = cv.imread(fname)
arucoDict = cv.aruco.Dictionary_get(cv.aruco.DICT_4X4_1000)
arucoParams = cv.aruco.DetectorParameters_create()
(corners, ids, rejected) = cv.aruco.detectMarkers(image, arucoDict, parameters=arucoParams)
print(corners, ids, rejected)

if len(corners) > 0:
    # flatten the ArUco IDs list
    ids = ids.flatten()
    # only the first detected marker is used below (the loop over all markers is omitted)
    # extract the marker corners (which are always returned in
    # top-left, top-right, bottom-right, and bottom-left order)
    (topLeft, topRight, bottomRight, bottomLeft) = corners[0][0][0], corners[0][0][1], corners[0][0][2], corners[0][0][3]
    # convert each of the (x, y)-coordinate pairs to integers
    topRight = (int(topRight[0]), int(topRight[1]))
    bottomRight = (int(bottomRight[0]), int(bottomRight[1]))
    bottomLeft = (int(bottomLeft[0]), int(bottomLeft[1]))
    topLeft = (int(topLeft[0]), int(topLeft[1]))
    # draw the bounding box of the ArUco detection
    cv.line(image, topLeft, topRight, (0, 255, 0), 2)
    cv.line(image, topRight, bottomRight, (0, 255, 0), 2)
    cv.line(image, bottomRight, bottomLeft, (0, 255, 0), 2)
    cv.line(image, bottomLeft, topLeft, (0, 255, 0), 2)
    # compute and draw the center (x, y)-coordinates of the ArUco marker
    cX = int((topLeft[0] + bottomRight[0]) / 2.0)
    cY = int((topLeft[1] + bottomRight[1]) / 2.0)
    cv.circle(image, (cX, cY), 4, (0, 0, 255), -1)
    # rough in-plane rotation estimate from the corner positions
    if topLeft[1] != topRight[1] or topLeft[0] != bottomLeft[0]:
        rot1 = np.degrees(np.arctan((topLeft[0] - bottomLeft[0]) / (bottomLeft[1] - topLeft[1])))
        rot2 = np.degrees(np.arctan((topRight[1] - topLeft[1]) / (topRight[0] - topLeft[0])))
        rot = (np.round(rot1, 3) + np.round(rot2, 3)) / 2
        print(rot1, rot2, rot)
    else:
        rot = 0
    # draw the ArUco marker ID on the image
    rotS = ",rotation:" + str(np.round(rot, 3))
    cv.putText(image, ("position: " + str(cX) + "," + str(cY)),
               (100, topLeft[1] - 15), cv.FONT_HERSHEY_SIMPLEX, 0.5, (255, 0, 80), 2)
    cv.putText(image, rotS,
               (400, topLeft[1] - 15), cv.FONT_HERSHEY_SIMPLEX, 0.5, (255, 0, 80), 2)
    print("[INFO] ArUco marker ID: {}".format(ids))
    d = np.round((math.dist(topLeft, bottomRight) + math.dist(topRight, bottomLeft)) / 2, 3)

    # Get the rotation and translation vectors
    rvecs, tvecs, obj_points = cv.aruco.estimatePoseSingleMarkers(corners, aruco_marker_side_length, mtx, dst)
    # The pose of the marker is with respect to the camera lens frame.
    # Imagine you are looking through the camera viewfinder;
    # the camera lens frame's:
    #   x-axis points to the right
    #   y-axis points straight down towards your toes
    #   z-axis points straight ahead, away from your eye, out of the camera
    # Store the translation (i.e. position) information
    transform_translation_x = tvecs[0][0][0]
    transform_translation_y = tvecs[0][0][1]
    transform_translation_z = tvecs[0][0][2]
    # Store the rotation information
    rotation_matrix = np.eye(4)
    rotation_matrix[0:3, 0:3] = cv.Rodrigues(np.array(rvecs[0]))[0]
    r = R.from_matrix(rotation_matrix[0:3, 0:3])
    quat = r.as_quat()
    # Quaternion format
    transform_rotation_x = quat[0]
    transform_rotation_y = quat[1]
    transform_rotation_z = quat[2]
    transform_rotation_w = quat[3]
    # Euler angle format in radians
    roll_x, pitch_y, yaw_z = euler_from_quaternion(transform_rotation_x, transform_rotation_y,
                                                   transform_rotation_z, transform_rotation_w)
    roll_x = math.degrees(roll_x)
    pitch_y = math.degrees(pitch_y)
    yaw_z = math.degrees(yaw_z)
Disclaimer: this goes for OpenCV v4.5.5 and corresponding aruco module (contrib repo). They redid a lot of aruco stuff for v4.6.0 and v4.7.0, so best check everything I say here.
Without checking all the code (looks roughly okay), a few basics about OpenCV and aruco:
Both use right-handed coordinate systems. Thumb X, index Y, middle Z.
OpenCV uses X right, Y down, Z far, for screen/camera frames. Origin for screens and pictures is the top left corner. For cameras, the origin is the center of the pinhole model, which would be the center of the aperture. I can't comment on lenses or lens systems. Assume the lens center is the origin. That's probably close enough.
Aruco uses X right, Y far, Z up, if the marker is lying flat on a table. Origin is in the center of the marker. The top left corner of the marker is considered the "first" corner.
The marker can be considered to have its own coordinate system/frame.
The pose given by rvec and tvec is the pose of the marker in the camera frame. That means np.linalg.norm(tvec) gives you the direct distance from the camera to the marker's center. tvec's Z is just the component parallel to optical axis.
If the marker is in the right half of the picture ("half" defined by camera matrix's cx,cy), you'd expect tvec's X to grow. Lower half, Y positive/growing.
Conversely, that transformation transforms marker-local coordinates to camera-local. Try transforming some marker-local points, such as origin or points on the axes. I believe that cv::transform can help with that. Using OpenCV's projectPoints to map 3D space points to 2D image points, you can then draw the marker's axes, or a cube on top of it, or anything you like.
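For instance, a minimal sketch (reusing the question's rvecs, tvecs, mtx, dst, aruco_marker_side_length and image; this is not the only way to do it) that projects the marker's origin and axis endpoints into the picture:
import cv2 as cv
import numpy as np
# Sketch only: rvecs, tvecs, mtx, dst, aruco_marker_side_length and image
# are assumed to come from the question's code above.
half = aruco_marker_side_length / 2
axis_pts = np.float32([[0, 0, 0], [half, 0, 0], [0, half, 0], [0, 0, half]])  # origin, X, Y, Z in the marker frame
img_pts, _ = cv.projectPoints(axis_pts, rvecs[0], tvecs[0], mtx, dst)
p0, px, py, pz = [tuple(map(int, p.ravel())) for p in img_pts]
cv.line(image, p0, px, (0, 0, 255), 2)  # marker X axis, drawn red
cv.line(image, p0, py, (0, 255, 0), 2)  # marker Y axis, drawn green
cv.line(image, p0, pz, (255, 0, 0), 2)  # marker Z axis, drawn blue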
Say the marker sits upright and faces the camera dead-on. When you consider the frame triads of the marker and the camera in space ("world" space), both would be X "right", but one's Y and Z are opposite the other's Y and Z, so you'd expect to see a rotation around the X axis by half a turn (rotating Z and Y).
You could imagine the transformation to happen like this:
Initially the camera looks through the marker, from the marker's back out into the world. The camera would be "upside down". The camera sees marker-space.
The pose's rotation component rotates the whole marker-local world around the camera's origin. Seen from the world frame (point of reference), the camera rotates into an attitude you'd find natural.
The pose's translation moves the marker's world out in front of the camera (Z being positive), or equivalently, the camera backs away from the marker.
If you get implausible values, check aruco_marker_side_length and camera matrix. f would be around 500-3000 for typical resolutions (VGA-4k) and fields of view (60-80 degrees).
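As a rough sanity check (my own sketch; the width and field-of-view numbers below are assumptions, not taken from the question), you can estimate what f should be and compare it against mtx[0, 0]:
import math
width_px = 1920   # assumed image width in pixels
hfov_deg = 70     # assumed horizontal field of view in degrees
f_px = width_px / (2 * math.tan(math.radians(hfov_deg) / 2))
print(f_px)       # compare against fx = mtx[0, 0] from your calibration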
I have some code, largely taken from the various sources linked at the bottom of this post and written in Python, that takes an image of shape [height, width] and some bounding boxes in [x_min, y_min, x_max, y_max] format (both numpy arrays) and rotates the image and its bounding boxes counterclockwise. After rotation the bounding box becomes more of a "diamond shape", i.e. not axis aligned, so I perform some calculations to make it axis aligned again. The purpose of this code is data augmentation for training an object detection neural network on rotated data (flipping horizontally or vertically is common). Rotations by other angles seem common for image classification, where there are no bounding boxes, but when boxes are involved the resources on how to rotate the boxes along with the images are relatively sparse.
When I input an angle of 45 degrees, I get some less-than-tight bounding boxes: the four corners are not a very good annotation, whereas the original one was close to perfect.
The image shown below is the first training image in the MS COCO 2014 object detection dataset, together with its first bounding box annotation. My code is as follows:
import math
import cv2
import numpy as np

# angle assumed to be in degrees
# bbs: a list of bounding boxes in x_min, y_min, x_max, y_max format
def rotateImageAndBoundingBoxes(im, bbs, angle):
    h, w = im.shape[0], im.shape[1]
    (cX, cY) = (w // 2, h // 2)  # original image center
    M = cv2.getRotationMatrix2D((cX, cY), angle, 1.0)  # 2-by-3 rotation matrix
    cos = np.abs(M[0, 0])
    sin = np.abs(M[0, 1])
    # compute the dimensions of the rotated image
    nW = int((h * sin) + (w * cos))
    nH = int((h * cos) + (w * sin))
    # adjust the rotation matrix to take into account the translation to the new centre
    M[0, 2] += (nW / 2) - cX
    M[1, 2] += (nH / 2) - cY
    rotated_im = cv2.warpAffine(im, M, (nW, nH))

    rotated_bbs = []
    for bb in bbs:
        # get the four rotated corners of the bounding box
        vec1 = np.matmul(M, np.array([bb[0], bb[1], 1], dtype=np.float64))  # top left corner transformed
        vec2 = np.matmul(M, np.array([bb[2], bb[1], 1], dtype=np.float64))  # top right corner transformed
        vec3 = np.matmul(M, np.array([bb[0], bb[3], 1], dtype=np.float64))  # bottom left corner transformed
        vec4 = np.matmul(M, np.array([bb[2], bb[3], 1], dtype=np.float64))  # bottom right corner transformed
        x_vals = [vec1[0], vec2[0], vec3[0], vec4[0]]
        y_vals = [vec1[1], vec2[1], vec3[1], vec4[1]]
        x_min = math.ceil(np.min(x_vals))
        x_max = math.floor(np.max(x_vals))
        y_min = math.ceil(np.min(y_vals))
        y_max = math.floor(np.max(y_vals))
        bb = [x_min, y_min, x_max, y_max]
        rotated_bbs.append(bb)

    # my function to resize the image and bbs back to the original image size
    rotated_im, rotated_bbs = resizeImageAndBoxes(rotated_im, w, h, rotated_bbs)
    return rotated_im, rotated_bbs
The good bounding box looks like:
The not-so-good bounding box looks like this:
I am trying to determine whether this is an error in my code or expected behavior. The problem seems less apparent at integer multiples of pi/2 radians (90 degrees), but I would like to achieve tight bounding boxes at any angle of rotation. Any insights appreciated.
Sources:
[OpenCV documentation] https://docs.opencv.org/3.4/da/d54/group__imgproc__transform.html#gafbbc470ce83812914a70abfb604f4326
[Data Augmentation Discussion] https://blog.paperspace.com/data-augmentation-for-object-detection-rotation-and-shearing/
[Mathematics of rotation around an arbitrary point in 2 dimensions] https://math.stackexchange.com/questions/2093314/rotation-matrix-of-rotation-around-a-point-other-than-the-origin
It seems for the most part this is expected behavior as per the comments. I do have a kind of hacky solution to this problem, where you can write a function like
# assuming box coords = [x_min, y_min, x_max, y_max]
def cropBoxByPercentage(box_coords, image_width, image_height, x_percentage=0.05, y_percentage=0.05):
    box_xmin = box_coords[0]
    box_ymin = box_coords[1]
    box_xmax = box_coords[2]
    box_ymax = box_coords[3]
    box_width = box_xmax - box_xmin + 1
    box_height = box_ymax - box_ymin + 1
    dx = int(x_percentage * box_width)
    dy = int(y_percentage * box_height)
    box_xmin = max(0, box_xmin - dx)
    box_xmax = min(image_width - 1, box_xmax + dx)
    box_ymin = max(0, box_ymin - dy)
    box_ymax = min(image_height - 1, box_ymax + dy)
    return np.array([box_xmin, box_xmax, box_ymin, box_ymax])
where x_percentage and y_percentage can be fixed values or can be computed using some heuristic.
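For example, one possible heuristic (just a sketch I have not validated; cropPercentagesFromAngle is a hypothetical helper): a w-by-h box rotated by theta has an axis-aligned bounding box of size (w|cos theta| + h|sin theta|) by (h|cos theta| + w|sin theta|), so the relative slack on each side can be estimated from the angle and fed into the percentage parameters:
import math

def cropPercentagesFromAngle(box_width, box_height, angle):
    # estimate how much the axis-aligned box inflates when its contents rotate by `angle` degrees
    theta = math.radians(angle)
    c, s = abs(math.cos(theta)), abs(math.sin(theta))
    aabb_w = box_width * c + box_height * s    # width of the rotated box's AABB
    aabb_h = box_height * c + box_width * s    # height of the rotated box's AABB
    x_percentage = (aabb_w - box_width) / (2 * aabb_w)   # slack per side, as a fraction
    y_percentage = (aabb_h - box_height) / (2 * aabb_h)
    return x_percentage, y_percentage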
I want to perform matrix transformation operations like rotation, scaling, and translation on an image without calling any external library (PIL, CV2) functions.
I am trying the basic matrix transformation approach, but I am facing some issues: while rotating, there are holes and the image gets cropped.
I want to perform all of the above operations about the image center. I need some guidance here.
Rotation Logic:
newImage = np.zeros((2*row, 2*column, 3), dtype=np.uint8)
radAngle = math.radians(int(input("angle: ")))
c, s = math.cos(radAngle), math.sin(radAngle)
mid_coords = np.floor(0.5 * np.array(image.shape))
for i in range(0, row-1):
    for j in range(0, column-1):
        x2 = mid_coords[0] + (i - mid_coords[0])*c - (j - mid_coords[1])*s
        y2 = mid_coords[1] + (i - mid_coords[0])*s + (j - mid_coords[1])*c
        newImage[int(math.ceil(x2)), int(math.ceil(y2))] = image[i, j]
The reason for holes in your rotated image is you are assigning each pixel of your original image to the computed pixel location in the rotated image. Due to this, some pixels in the rotated image are left out.
You should do the other way round.
For each pixel in the rotated image, find its value from the original image.
Please refer to this answer.
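For illustration, a minimal sketch of that inverse mapping (nearest-neighbour sampling, output kept the same size as the input, no interpolation):
import math
import numpy as np

def rotate_inverse(image, angle_deg):
    rows, cols = image.shape[:2]
    out = np.zeros_like(image)
    cx, cy = (cols - 1) / 2.0, (rows - 1) / 2.0
    c, s = math.cos(math.radians(angle_deg)), math.sin(math.radians(angle_deg))
    for i in range(rows):
        for j in range(cols):
            # rotate the destination pixel *back* to find where it came from in the source
            src_i = cy + (i - cy) * c + (j - cx) * s
            src_j = cx - (i - cy) * s + (j - cx) * c
            si, sj = int(round(src_i)), int(round(src_j))
            if 0 <= si < rows and 0 <= sj < cols:
                out[i, j] = image[si, sj]
    return out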
I want to read the text on the object, but the OCR program can't recognize it. When I give it just a small part, it can. I have to transform the circular text into linear text. How can I do this? Thanks.
You can transform the image from the Cartesian coordinate system to the polar coordinate system to prepare a circular-path text image for the OCR program. The function logPolar() can help.
Here are some steps to prepare circle path text image:
Find the circles' centers using HoughCircles().
Get the mean and do some offset, so get the center.
(Optional) Crop a square region of the image around the center.
Do logPolar(), then rotate it if necessary.
After detecting the circles, taking the mean of the centers, and applying the offset:
The cropped image:
After logPolar() and rotate():
My Python3-OpenCV3.3 code is presented here, maybe it helps.
#!/usr/bin/python3
# 2017.10.10 12:44:37 CST
# 2017.10.10 14:08:57 CST
import cv2
import numpy as np

## (1) Read and resize the original image (too big)
img = cv2.imread("circle.png")
H, W = img.shape[:2]
img = cv2.resize(img, (W//4, H//4))
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

## (2) Detect circles
circles = cv2.HoughCircles(gray, method=cv2.HOUGH_GRADIENT, dp=1, minDist=3, circles=None,
                           param1=200, param2=100, minRadius=200, maxRadius=0)

## make canvas
canvas = img.copy()

## (3) Get the mean of centers and do offset
circles = np.int0(np.array(circles))
x, y, r = 0, 0, 0
for ptx, pty, radius in circles[0]:
    cv2.circle(canvas, (ptx, pty), radius, (0, 255, 0), 1, 16)
    x += ptx
    y += pty
    r += radius
cnt = len(circles[0])
x = x//cnt
y = y//cnt
r = r//cnt
x += 5
y -= 7

## (4) Draw the labels in red
for r in range(100, r, 20):
    cv2.circle(canvas, (x, y), r, (0, 0, 255), 3, cv2.LINE_AA)
cv2.circle(canvas, (x, y), 3, (0, 0, 255), -1)

## (5) Crop the image
dr = r + 20
croped = img[y-dr:y+dr+1, x-dr:x+dr+1].copy()

## (6) logPolar and rotate
polar = cv2.logPolar(croped, (dr, dr), 80, cv2.WARP_FILL_OUTLIERS)
rotated = cv2.rotate(polar, cv2.ROTATE_90_COUNTERCLOCKWISE)

## (7) Display the result
cv2.imshow("Canvas", canvas)
cv2.imshow("croped", croped)
cv2.imshow("polar", polar)
cv2.imshow("rotated", rotated)
cv2.waitKey()
cv2.destroyAllWindows()
I'm trying to find the corners in an image. I don't need the contours, only the 4 corners. I will change the perspective using the 4 corners.
I'm using OpenCV, but I need to know the steps to find the corners and what functions to use.
My images will be like this (without the red points; I will paint the points afterwards):
EDITED:
After the suggested steps, I wrote the code. (Note: I'm not using pure OpenCV, I'm using JavaCV, but the logic is the same.)
// Load two images and allocate other structures (I'm using another image)
IplImage colored = cvLoadImage(
        "res/scanteste.jpg",
        CV_LOAD_IMAGE_UNCHANGED);
IplImage gray = cvCreateImage(cvGetSize(colored), IPL_DEPTH_8U, 1);
IplImage smooth = cvCreateImage(cvGetSize(colored), IPL_DEPTH_8U, 1);

// Step 1 - Convert from RGB to grayscale (cvCvtColor)
cvCvtColor(colored, gray, CV_RGB2GRAY);

// 2 - Smooth (cvSmooth)
cvSmooth(gray, smooth, CV_BLUR, 9, 9, 2, 2);

// 3 - cvThreshold - What values?
cvThreshold(gray, gray, 155, 255, CV_THRESH_BINARY);

// 4 - Detect edges (cvCanny) - What values?
int N = 7;
int aperature_size = N;
double lowThresh = 20;
double highThresh = 40;
cvCanny(gray, gray, lowThresh*N*N, highThresh*N*N, aperature_size);

// 5 - Find contours (cvFindContours)
int total = 0;
CvSeq contour2 = new CvSeq(null);
CvMemStorage storage2 = cvCreateMemStorage(0);
CvMemStorage storageHull = cvCreateMemStorage(0);
total = cvFindContours(gray, storage2, contour2, Loader.sizeof(CvContour.class), CV_RETR_CCOMP, CV_CHAIN_APPROX_NONE);
if (total > 1) {
    while (contour2 != null && !contour2.isNull()) {
        if (contour2.elem_size() > 0) {
            // 6 - Approximate contours with linear features (cvApproxPoly)
            CvSeq points = cvApproxPoly(contour2, Loader.sizeof(CvContour.class), storage2,
                    CV_POLY_APPROX_DP, cvContourPerimeter(contour2)*0.005, 0);
            cvDrawContours(gray, points, CvScalar.BLUE, CvScalar.BLUE, -1, 1, CV_AA);
        }
        contour2 = contour2.h_next();
    }
}
So, I want to find the corners, but I don't know how to use corner functions like cvCornerHarris and the others.
First, check out /samples/c/squares.c in your OpenCV distribution. This example provides a square detector, and it should be a pretty good start on how to detect corner-like features. Then, take a look at OpenCV's feature-oriented functions like cvCornerHarris() and cvGoodFeaturesToTrack().
The above methods can return many corner-like features - most will not be the "true corners" you are looking for. In my application, I had to detect squares that had been rotated or skewed (due to perspective). My detection pipeline consisted of:
Convert from RGB to grayscale (cvCvtColor)
Smooth (cvSmooth)
Threshold (cvThreshold)
Detect edges (cvCanny)
Find contours (cvFindContours)
Approximate contours with linear features (cvApproxPoly)
Find "rectangles" which were structures that: had polygonalized contours possessing 4 points, were of sufficient area, had adjacent edges were ~90 degrees, had distance between "opposite" vertices was of sufficient size, etc.
Step 7 was necessary because a slightly noisy image can yield many structures that appear rectangular after polygonalization. In my application, I also had to deal with square-like structures that appeared within, or overlapped the desired square. I found the contour's area property and center of gravity to be helpful in discerning the proper rectangle.
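For reference, a rough sketch of that pipeline with the modern Python API (the blur, Canny, and area thresholds here are assumptions, the ~90-degree and diagonal-length checks are omitted, and 'input.png' stands in for your image):
import cv2
import numpy as np

img = cv2.imread('input.png')
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
blur = cv2.GaussianBlur(gray, (9, 9), 2)
edges = cv2.Canny(blur, 20, 40)
# OpenCV 4.x returns (contours, hierarchy)
contours, _ = cv2.findContours(edges, cv2.RETR_CCOMP, cv2.CHAIN_APPROX_SIMPLE)
quads = []
for c in contours:
    approx = cv2.approxPolyDP(c, 0.02 * cv2.arcLength(c, True), True)
    if len(approx) == 4 and cv2.isContourConvex(approx) and cv2.contourArea(approx) > 1000:
        quads.append(approx.reshape(4, 2))  # candidate "rectangle" corners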
At first glance, to a human eye there are 4 corners. But in computer vision, a corner is considered to be a point with a large gradient change in intensity across its neighborhood. The neighborhood can be a 4-pixel neighborhood or an 8-pixel neighborhood.
In the equation provided to find the gradient of intensity, a 4-pixel neighborhood has been considered (see the documentation).
Here is my approach for the image in question. I have the code in python as well:
import os
import cv2
import numpy as np

path = r'C:\Users\selwyn77\Desktop\Stack\corner'
filename = 'env.jpg'
img = cv2.imread(os.path.join(path, filename))
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)   #--- convert to grayscale
It is a good choice to always blur the image to remove minor gradient changes and preserve the more intense ones. I opted for the bilateral filter, which unlike the Gaussian filter does not blur all the pixels in the neighborhood; it only blurs pixels whose intensity is similar to that of the central pixel. In short, it preserves edges/corners of high gradient change but blurs regions with minimal gradient change.
bi = cv2.bilateralFilter(gray, 5, 75, 75)
cv2.imshow('bi',bi)
To a human eye it does not look much different from the original image, but it does matter. Now, finding possible corners:
dst = cv2.cornerHarris(bi, 2, 3, 0.04)
dst is an array (with the same 2D shape as the image) holding the eigenvalue-based corner scores obtained from the final equation mentioned in the documentation.
Now a threshold has to be applied to select those corners beyond a certain value. I will use the one in the documentation:
#--- create a black image to see where those corners occur ---
mask = np.zeros_like(gray)
#--- applying a threshold and turning those pixels above the threshold to white ---
mask[dst>0.01*dst.max()] = 255
cv2.imshow('mask', mask)
The white pixels are regions of possible corners. You can find many corners neighboring each other.
To draw the selected corners on the image:
img[dst > 0.01 * dst.max()] = [0, 0, 255] #--- [0, 0, 255] --> Red ---
cv2.imshow('dst', img)
(Red colored pixels are the corners, not so visible)
In order to get an array of all pixels with corners:
coor = np.argwhere(mask)
UPDATE
The variable coor is an array of arrays. Converting it to a list of lists:
coor_list = [l.tolist() for l in list(coor)]
Converting the above to a list of tuples:
coor_tuples = [tuple(l) for l in coor_list]
I have an easy and rather naive way to find the 4 corners. I simply calculated the distance of each corner to every other corner. I preserved those corners whose distance exceeded a certain threshold.
Here is the code:
import math

thresh = 50

def distance(pt1, pt2):
    (x1, y1), (x2, y2) = pt1, pt2
    dist = math.sqrt((x2 - x1)**2 + (y2 - y1)**2)
    return dist

# note: this is an alias, not a copy, so removals below also shrink coor_tuples
coor_tuples_copy = coor_tuples

i = 1
for pt1 in coor_tuples:
    print(' I :', i)
    for pt2 in coor_tuples[i::1]:
        print(pt1, pt2)
        print('Distance :', distance(pt1, pt2))
        if distance(pt1, pt2) < thresh:
            coor_tuples_copy.remove(pt2)
    i += 1
Prior to running the snippet above coor_tuples had all corner points:
[(4, 42),
(4, 43),
(5, 43),
(5, 44),
(6, 44),
(7, 219),
(133, 36),
(133, 37),
(133, 38),
(134, 37),
(135, 224),
(135, 225),
(136, 225),
(136, 226),
(137, 225),
(137, 226),
(137, 227),
(138, 226)]
After running the snippet I was left with 4 corners:
[(4, 42), (7, 219), (133, 36), (135, 224)]
UPDATE 2
Now all you have to do is just mark these 4 points on a copy of the original image.
img2 = img.copy()
for pt in coor_tuples:
    cv2.circle(img2, tuple(reversed(pt)), 3, (0, 0, 255), -1)
cv2.imshow('Image with 4 corners', img2)
Here's an implementation using cv2.goodFeaturesToTrack() to detect corners. The approach is
Convert image to grayscale
Perform canny edge detection
Detect corners
Optionally perform 4-point perspective transform to get top-down view of image
Using this starting image,
After converting to grayscale, we perform canny edge detection
Now that we have a decent binary image, we can use cv2.goodFeaturesToTrack()
corners = cv2.goodFeaturesToTrack(canny, 4, 0.5, 50)
For the parameters, we give it the canny image, set the maximum number of corners to 4 (maxCorners), use a minimum accepted quality of 0.5 (qualityLevel), and set the minimum possible Euclidean distance between the returned corners to 50 (minDistance). Here's the result
Now that we have identified the corners, we can perform a 4-point perspective transform to obtain a top-down view of the object. We first order the points clockwise then draw the result onto a mask.
Note: We could have just found contours on the Canny image instead of doing this step to create the mask, but pretend we only had the 4 corner points to work with
Next we find contours on this mask and filter using cv2.arcLength() and cv2.approxPolyDP(). The idea is that if the contour has 4 points, then it must be our object. Once we have this contour, we perform a perspective transform
Finally we rotate the image depending on the desired orientation. Here's the result
Code for only detecting corners
import cv2

image = cv2.imread('1.png')
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
canny = cv2.Canny(gray, 120, 255, 1)

corners = cv2.goodFeaturesToTrack(canny, 4, 0.5, 50)

for corner in corners:
    x, y = corner.ravel()
    cv2.circle(image, (x, y), 5, (36, 255, 12), -1)

cv2.imshow('canny', canny)
cv2.imshow('image', image)
cv2.waitKey()
Code for detecting corners and performing perspective transform
import cv2
import numpy as np
def rotate_image(image, angle):
    # Grab the dimensions of the image and then determine the center
    (h, w) = image.shape[:2]
    (cX, cY) = (w / 2, h / 2)

    # grab the rotation matrix (applying the negative of the
    # angle to rotate clockwise), then grab the sine and cosine
    # (i.e., the rotation components of the matrix)
    M = cv2.getRotationMatrix2D((cX, cY), -angle, 1.0)
    cos = np.abs(M[0, 0])
    sin = np.abs(M[0, 1])

    # Compute the new bounding dimensions of the image
    nW = int((h * sin) + (w * cos))
    nH = int((h * cos) + (w * sin))

    # Adjust the rotation matrix to take into account translation
    M[0, 2] += (nW / 2) - cX
    M[1, 2] += (nH / 2) - cY

    # Perform the actual rotation and return the image
    return cv2.warpAffine(image, M, (nW, nH))

def order_points_clockwise(pts):
    # sort the points based on their x-coordinates
    xSorted = pts[np.argsort(pts[:, 0]), :]

    # grab the left-most and right-most points from the sorted
    # x-coordinate points
    leftMost = xSorted[:2, :]
    rightMost = xSorted[2:, :]

    # now, sort the left-most coordinates according to their
    # y-coordinates so we can grab the top-left and bottom-left
    # points, respectively
    leftMost = leftMost[np.argsort(leftMost[:, 1]), :]
    (tl, bl) = leftMost

    # now, sort the right-most coordinates according to their
    # y-coordinates so we can grab the top-right and bottom-right
    # points, respectively
    rightMost = rightMost[np.argsort(rightMost[:, 1]), :]
    (tr, br) = rightMost

    # return the coordinates in top-left, top-right,
    # bottom-right, and bottom-left order
    return np.array([tl, tr, br, bl], dtype="int32")

def perspective_transform(image, corners):
    def order_corner_points(corners):
        # Separate corners into individual points
        # Index 0 - top-right
        #       1 - top-left
        #       2 - bottom-left
        #       3 - bottom-right
        corners = [(corner[0][0], corner[0][1]) for corner in corners]
        top_r, top_l, bottom_l, bottom_r = corners[0], corners[1], corners[2], corners[3]
        return (top_l, top_r, bottom_r, bottom_l)

    # Order points in clockwise order
    ordered_corners = order_corner_points(corners)
    top_l, top_r, bottom_r, bottom_l = ordered_corners

    # Determine width of new image which is the max distance between
    # (bottom right and bottom left) or (top right and top left) x-coordinates
    width_A = np.sqrt(((bottom_r[0] - bottom_l[0]) ** 2) + ((bottom_r[1] - bottom_l[1]) ** 2))
    width_B = np.sqrt(((top_r[0] - top_l[0]) ** 2) + ((top_r[1] - top_l[1]) ** 2))
    width = max(int(width_A), int(width_B))

    # Determine height of new image which is the max distance between
    # (top right and bottom right) or (top left and bottom left) y-coordinates
    height_A = np.sqrt(((top_r[0] - bottom_r[0]) ** 2) + ((top_r[1] - bottom_r[1]) ** 2))
    height_B = np.sqrt(((top_l[0] - bottom_l[0]) ** 2) + ((top_l[1] - bottom_l[1]) ** 2))
    height = max(int(height_A), int(height_B))

    # Construct new points to obtain top-down view of image in
    # top_l, top_r, bottom_r, bottom_l order
    dimensions = np.array([[0, 0], [width - 1, 0], [width - 1, height - 1],
                           [0, height - 1]], dtype="float32")

    # Convert to Numpy format
    ordered_corners = np.array(ordered_corners, dtype="float32")

    # Find perspective transform matrix
    matrix = cv2.getPerspectiveTransform(ordered_corners, dimensions)

    # Return the transformed image
    return cv2.warpPerspective(image, matrix, (width, height))

image = cv2.imread('1.png')
original = image.copy()

gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
canny = cv2.Canny(gray, 120, 255, 1)

corners = cv2.goodFeaturesToTrack(canny, 4, 0.5, 50)

c_list = []
for corner in corners:
    x, y = corner.ravel()
    c_list.append([int(x), int(y)])
    cv2.circle(image, (x, y), 5, (36, 255, 12), -1)

corner_points = np.array([c_list[0], c_list[1], c_list[2], c_list[3]])
ordered_corner_points = order_points_clockwise(corner_points)
mask = np.zeros(image.shape, dtype=np.uint8)
cv2.fillPoly(mask, [ordered_corner_points], (255, 255, 255))
mask = cv2.cvtColor(mask, cv2.COLOR_BGR2GRAY)

cnts = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
cnts = cnts[0] if len(cnts) == 2 else cnts[1]

for c in cnts:
    peri = cv2.arcLength(c, True)
    approx = cv2.approxPolyDP(c, 0.015 * peri, True)

    if len(approx) == 4:
        transformed = perspective_transform(original, approx)
        result = rotate_image(transformed, -90)

cv2.imshow('canny', canny)
cv2.imshow('image', image)
cv2.imshow('mask', mask)
cv2.imshow('transformed', transformed)
cv2.imshow('result', result)
cv2.waitKey()
Find contours with the RETR_EXTERNAL option (gray -> Gaussian filter -> Canny edge -> find contours).
Find the largest contour - this will be the edge of the rectangle.
Find the corners with a little calculation.
Mat m; // image file
findContours(m, contours_, hierachy_, RETR_EXTERNAL);
auto it = max_element(contours_.begin(), contours_.end(),
    [](const vector<Point> &a, const vector<Point> &b) {
        return a.size() < b.size(); });

Point2f xy[4] = {{9000, 9000}, {0, 1000}, {1000, 0}, {0, 0}};
for (auto &[x, y] : *it) {
    if (x + y < xy[0].x + xy[0].y) xy[0] = {x, y};
    if (x - y > xy[1].x - xy[1].y) xy[1] = {x, y};
    if (y - x > xy[2].y - xy[2].x) xy[2] = {x, y};
    if (x + y > xy[3].x + xy[3].y) xy[3] = {x, y};
}
The xy array will then hold the four corners.
I was able to extract four corners this way.
Apply HoughLines to the Canny image - you will get a list of points.
Apply a convex hull to this set of points.
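A rough sketch of this idea, using the probabilistic Hough transform so we get segment endpoints (the thresholds are assumptions, and 'input.png' stands in for your image):
import cv2
import numpy as np

img = cv2.imread('input.png')
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
canny = cv2.Canny(gray, 120, 255)
# each detected segment is returned as (x1, y1, x2, y2)
lines = cv2.HoughLinesP(canny, 1, np.pi / 180, threshold=60, minLineLength=40, maxLineGap=10)
if lines is not None:
    points = lines.reshape(-1, 2)     # both endpoints of every segment
    hull = cv2.convexHull(points)     # convex hull of all endpoints
    # reduce the hull to a small polygon, which should be roughly the 4 corners
    corners = cv2.approxPolyDP(hull, 0.02 * cv2.arcLength(hull, True), True)
    print(corners.reshape(-1, 2))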