Assume an image I of dimension (2, 2). Graphical coordinates C are given as:
C = [[0, 0], [1, 0],
[0, 1], [1, 1]]
Objective: rotate I by 90 degrees about the centre (NOT the origin).
Transformation Matrix:
TRotate(90) = [[0, 1], [-1, 0]]
(Assume each coordinate pair can be transformed in lockstep (e.g. on a GPU).)
Method:
Convert graphical coordinates to mathematical coordinates with the origin as the centre of the image.
Apply the transformation matrix.
Convert back to graphical coordinates.
E.g.:
Convert to mathematical coordinates:
tx' = tx - width / 2
ty' = ty - height / 2
C' =[[-1, -1], [0, -1],
[-1, 0], [0, 0]]
Apply the transformation matrix:
C" = [[-1, 1], [-1, -0],
[0, 1], [0, 0]]
Convert back:
C" = [[0, 2], [0, 1],
[1, 2], [1, 1]]
The converted-back coordinates are out of bounds...
I'm really battling to get a proper rotation about a 'centre of gravity' working. I think that my conversion to 'mathematical coordinates' is wrong.
I had better luck by instead converting the coordinates to the following:
C' =[[-1, -1], [1, -1],
[-1, 1], [1, 1]]
I arrived at this transformation by observing that if the origin lay in between the four pixels, with the +ve y-axis pointing down and the +ve x-axis to the right, then the point (0, 0) would become (-1, -1), and so on for the rest. (The resulting rotation and conversion give the desired result.)
However, I can't find the right kind of transform to apply to the coordinates to place the origin at the centre. I've tried a transformation matrix using homogeneous coordinates, but this does not work.
Edit
Following Malcolm's advice:
position vector = [0, 0, 1]^T
Translate by subtracting width/2 == 1 from x and y (the homogeneous component w stays 1):
[-1, -1, 1]^T
Rotate by multiplying by the transformation matrix:
| 0 1 0|   |-1|   |-1|
|-1 0 0| X |-1| = | 1|
| 0 0 1|   | 1|   | 1|
You need an extra row (or column, depending on whether you use row or column vectors) in your matrix, for the translation by x and the translation by y. You then add an extra component to your position vector, call it w, which is hard-coded to 1. This is a trick to ensure that the translation can be performed with standard matrix multiplication.
Since you need a translation followed by a rotation, you need to set up the translation matrix, then multiply it by the rotation matrix (make them both 3x3s, with the last column ignored if you're shaky on matrix multiplication). The translations and rotations will then be interleaved with each other.
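A minimal sketch of that recipe in Python/NumPy, assuming the 2x2 image from the question. One caveat worth noting: the centre of the grid of pixel centres of an N x N image is at (N-1)/2 = 0.5, not N/2 = 1, which is why the earlier conversion drifted out of bounds.
import numpy as np

w = h = 2  # image dimensions from the question
cx, cy = (w - 1) / 2, (h - 1) / 2  # centre of the pixel grid: (0.5, 0.5)

# Homogeneous 3x3 matrices: move the centre to the origin, rotate, move back.
T_fwd  = np.array([[1, 0, -cx], [0, 1, -cy], [0, 0, 1]], float)
R90    = np.array([[0, 1, 0], [-1, 0, 0], [0, 0, 1]], float)
T_back = np.array([[1, 0, cx], [0, 1, cy], [0, 0, 1]], float)

M = T_back @ R90 @ T_fwd  # composed once, applied to every pixel in lockstep

C = np.array([[0, 0], [1, 0], [0, 1], [1, 1]], float)
C_h = np.column_stack((C, np.ones(len(C))))  # append the hard-coded w = 1
print((M @ C_h.T).T[:, :2])  # [[0, 1], [0, 0], [1, 1], [1, 0]] -- all in bounds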
Related
I am trying to convert Canny edges to geometry in OpenCV (which I can then manipulate with Shapely). I am using findContours, which works great when there is a polygon in the middle of the image, but if the edges intersect the border, it returns a polygon which follows the length of the edge and back again, e.g.
[[1, 1], [2, 2], [3, 3], [2, 2], [1, 1]]
Currently my existing workflow looks something like this:
edges = cv2.Canny(frame, 100, 200)
contours, hierarchy = cv2.findContours(edges, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)
Example of input image:
I am trying to extract rotation and translation from the fundamental matrix. I used the intrinsic matrices to get the essential matrix, but the SVD is not giving the expected results. So I composed the essential matrix and tried my SVD code to get the rotation and translation matrices back, and found that the SVD code is wrong.
I created the essential matrix using rotation and translation matrices:
import numpy as np

R = np.array([[ 0.99965657,  0.02563432, -0.00544263],
              [-0.02596087,  0.99704732, -0.07226806],
              [ 0.00357402,  0.07238453,  0.9973704 ]])
T = np.array([-0.1679611706725666, 0.1475313058767286, -0.9746915198833979])
# Skew-symmetric [t]_x, so that E = R [t]_x
tx = np.array([[0, -T[2], T[1]], [T[2], 0, -T[0]], [-T[1], T[0], 0]])
E = R.dot(tx)
# E output:
# [[-0.02418259,  0.97527093,  0.15178621],
#  [-0.96115177, -0.01316561,  0.16363519],
#  [-0.21769595, -0.16403593,  0.01268507]]
Now I am trying to get them back using SVD:
U, S, V = np.linalg.svd(E)
diag_110 = np.array([[1, 0, 0], [0, 1, 0], [0, 0, 0]])
newE = U.dot(diag_110).dot(V.T)
U, S, V = np.linalg.svd(newE)
W = np.array([[0, -1, 0], [1, 0, 0], [0, 0, 1]])
Z = np.array([[0, 1, 0], [-1, 0, 0], [0, 0, 0]])
R1 = U.dot(W).dot(V.T)
R2 = U.dot(W.T).dot(V.T)
T = U.dot(Z).dot(U.T)
T = [T[1, 0], -T[2, 0], T[2, 1]]
'''
Output
R1 : [[-0.99965657, -0.00593909, 0.02552386],
[ 0.02596087, -0.35727319, 0.93363906],
[-0.00357402, -0.93398105, -0.35730468]]
R2 : [[-0.90837444, -0.20840016, -0.3625262 ],
[ 0.26284261, 0.38971602, -0.8826297 ],
[-0.32522244, 0.89704559, 0.29923163]]
T : [-0.1679611706725666, 0.1475313058767286, -0.9746915198833979],
'''
What is wrong with the SVD code? I referred to the code here and here.
Your R1 output is a left-handed and axis-permuted version of your initial (ground-truth) rotation matrix: notice that the first column is opposite to the ground-truth, and the second and third are swapped, and that the determinant of R1 is ~= -1 (i.e. it's a left-handed frame).
The reason this happens is that the SVD returns unitary matrices U and V with no guaranteed parity (their determinants can be +1 or -1). In addition, you multiplied by an axis-permutation matrix W. It is up to you to flip or permute the axes so that the rotation has the correct handedness. You do so by enforcing constraints from the images and the scene, and the known order of the cameras (i.e. knowing which camera is the left one).
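A minimal sketch of such a parity fix, reusing E from the question's snippet (one caveat: np.linalg.svd returns V already transposed, so the V.T in the question actually un-transposes it):
import numpy as np

# np.linalg.svd returns (U, S, Vt) with Vt = V transposed, so use Vt
# directly in R = U W Vt.
U, S, Vt = np.linalg.svd(E)

# U and Vt are only unitary; their determinants may be -1. Flip signs to
# enforce right-handed frames before composing rotations (E is only
# defined up to sign anyway).
if np.linalg.det(U) < 0:
    U = -U
if np.linalg.det(Vt) < 0:
    Vt = -Vt

W = np.array([[0, -1, 0], [1, 0, 0], [0, 0, 1]])
R1 = U @ W @ Vt      # det(R1) == +1
R2 = U @ W.T @ Vt    # det(R2) == +1
t = U[:, 2]          # translation direction, up to scale and sign

# Of the four (R, +-t) combinations, the physically valid one is chosen
# by a cheirality check (points in front of both cameras), e.g. with
# cv2.recoverPose.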
I have an image something like the image below (on the left):
I want to extract only the pixels in red on the right: the pixels that belong to a 1px vertical line, but not to any thicker line or other region with more than 1 adjacent black pixel. The image is bitonal.
I have so far tried a morphological OPEN with a vertical kernel (10 px, which is fine for my purposes) and a horizontal kernel, and taken the difference, but this needs an awkward shift and leaves some "speckles":
vertical_kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (1, 10))
vertical_mask1 = cv2.morphologyEx(thresh, cv2.MORPH_OPEN, vertical_kernel, iterations=1)
horz_kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (2, 1))
horz_mask = cv2.morphologyEx(thresh, cv2.MORPH_OPEN, horz_kernel, iterations=1)
M = np.float32([[1, 0, -1], [0, 1, 1]])
rows, cols = horz_mask.shape
vertical_mask = cv2.warpAffine(horz_mask, M, (cols, rows))
result = cv2.bitwise_and(thresh, cv2.bitwise_not(horz_mask))
What is the correct way to isolate the 1px lines (and only the 1px lines)?
In the general case, for other kernels, this question is: how do I find all pixels in the image that are in regions that the kernel "fits inside" (and then a subtraction to get my desired result)?
That's basically (binary) template matching. You need to derive proper templates from your "kernels". For larger "kernels", that might involve using masks for these templates, too, cf. cv2.matchTemplate.
What's the most important feature for a single pixel vertical line? The left and right neighbour of the current pixel must be 0. So, the template to match is [0, 1, 0]. By using the TemplateMatchMode cv2.TM_SQDIFF_NORMED, perfect matches will lead to close to 0 values in the result array.
You can mask those locations, and dilate according to the size of your template. Then, you use bitwise_and to extract the actual pixels that belong to your template.
Here's some code with a few templates ("kernels"):
import cv2
import numpy as np

img = cv2.imread('AapJk.png', cv2.IMREAD_GRAYSCALE)[:, :50]

vert_line = np.array([[0, 1, 0]], np.uint8)
cross = np.array([[0, 1, 0], [1, 1, 1], [0, 1, 0]], np.uint8)
corner = np.array([[0, 0, 1], [0, 0, 1], [1, 1, 1]], np.uint8)

for i_k, k in enumerate([vert_line, cross, corner]):
    m, n = k.shape

    # Invert the bitonal image to {0, 1}: lines become 1, background 0.
    img_tmp = 1 - img // 255

    # Perfect matches give ~0 with TM_SQDIFF_NORMED; threshold near zero.
    mask = cv2.matchTemplate(img_tmp, k, cv2.TM_SQDIFF_NORMED) < 10e-6

    # Grow the match locations to the full template size.
    mask = cv2.dilate(mask.astype(np.uint8), np.ones((m, n)), anchor=(n-1, m-1))

    # Keep only the actual line pixels inside the matched regions.
    m, n = mask.shape
    mask = cv2.bitwise_and(img_tmp[:m, :n], mask)

    # Visualize the extracted pixels in red.
    out = cv2.cvtColor(img, cv2.COLOR_GRAY2BGR)
    roi = out[:m, :n]
    roi[mask.astype(bool), :] = [0, 0, 255]
    cv2.imwrite('{}.png'.format(i_k), out)
Vertical line:
Cross:
Bottom right corner 3 x 3:
Larger templates ("kernels") most likely will require additional masks, depending on how many or which neighbouring pixels should be considered or not.
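As a hedged illustration of such a mask, cv2.matchTemplate has an optional mask argument (supported for TM_SQDIFF and TM_CCORR_NORMED) that marks "don't care" positions in the template; the tiny synthetic image below is purely illustrative:
import cv2
import numpy as np

# A 1 px vertical line in inverted {0, 1} form, as in the answer above.
img_tmp = np.zeros((10, 10), np.uint8)
img_tmp[2:8, 4] = 1

template = np.array([[0, 1, 0],
                     [0, 1, 0],
                     [0, 1, 0]], np.uint8)
tmpl_mask = np.array([[0, 1, 0],
                      [1, 1, 1],
                      [0, 1, 0]], np.uint8)  # 1 = compare, 0 = don't care

# The four corner positions are ignored; only the cross-shaped
# neighbourhood is compared. Perfect matches give 0 with TM_SQDIFF.
res = cv2.matchTemplate(img_tmp, template, cv2.TM_SQDIFF, mask=tmpl_mask)
print(np.argwhere(res < 1e-6))  # top-left corners of perfect matches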
----------------------------------------
System information
----------------------------------------
Platform: Windows-10-10.0.19041-SP0
Python: 3.9.1
PyCharm: 2021.1.3
NumPy: 1.20.3
OpenCV: 4.5.2
----------------------------------------
I am trying to find the bird's eye image from a given image. I also have the rotations and translations (and the intrinsic matrix) required to convert it into the bird's eye plane. My aim is to find an inverse homography matrix (3x3).
rotation_x = np.asarray([[1, 0, 0, 0],
                         [0, np.cos(R_x), -np.sin(R_x), 0],
                         [0, np.sin(R_x),  np.cos(R_x), 0],
                         [0, 0, 0, 1]], np.float32)
translation = np.asarray([[1, 0, 0, 0],
                          [0, 1, 0, 0],
                          [0, 0, 1, -t_y / (dp_y * np.sin(R_x))],
                          [0, 0, 0, 1]], np.float32)
intrinsic = np.asarray([[s_x * f / dp_x, 0, 0, 0],
                        [0, f / dp_y, 0, 0],
                        [0, 0, 1, 0]], np.float32)
#The Projection matrix to convert the image coordinates to 3-D domain from (x,y,1) to (x,y,0,1); Not sure if this is the right approach
projection = np.asarray([[1, 0, 0],
[0, 1, 0],
[0, 0, 0],
[0, 0, 1]], np.float32)
homography_matrix = intrinsic @ translation @ rotation_x @ projection
inv = cv2.warpPerspective(source_image, homography_matrix, (w, h), flags=cv2.INTER_CUBIC | cv2.WARP_INVERSE_MAP)
My question is: is this the right approach? I can manually set a suitable t_y and R_x, but it does not work for the (t_y, R_x) that is provided.
First premise: your bird's eye view will be correct only for one specific plane in the image, since a homography can only map planes (including the plane at infinity, corresponding to a pure camera rotation).
Second premise: if you can identify a quadrangle in the first image that is the projection of a rectangle in the world, you can directly compute the homography that maps the quad into the rectangle (i.e. the "bird's eye view" of the quad), and warp the image with it, setting the scale so the image warps to a desired size. No need to use the camera intrinsics. Example: you have the image of a building with rectangular windows, and you know the width/height ratio of these windows in the world.
Sometimes you can't find rectangles, but your camera is calibrated, and thus the problem you describe comes into play. Let's do the math. Assume the plane you are observing in the given image is Z=0 in world coordinates. Let K be the 3x3 intrinsic camera matrix and [R, t] the 3x4 matrix representing the camera pose in XYZ world frame, so that if Pc and Pw represent the same 3D point respectively in camera and world coordinates, it is Pc = R*Pw + t = [R, t] * [Pw.T, 1].T, where .T means transposed. Then you can write the camera projection as:
s * p = K * [R, t] * [Pw.T, 1].T
where s is an arbitrary scale factor and p is the pixel that Pw projects onto. But if Pw=[X, Y, Z].T is on the Z=0 plane, the 3rd column of R only multiplies zeros, so we can ignore it. If we then denote with r1 and r2 the first two columns of R, we can rewrite the above equation as:
s * p = K * [r1, r2, t] * [X, Y, 1].T
But K * [r1, r2, t] is a 3x3 matrix that transforms points on a 3D plane to points on the camera plane, so it is a homography.
If the plane is not Z=0, you can repeat the same argument replacing [R, t] with [R, t] * inv([Rp, tp]), where [Rp, tp] is the coordinate transform that maps a frame on the plane, with the plane normal being the Z axis, to the world frame.
Finally, to obtain the bird's eye view, you select a rotation R whose third column (the components of the world's Z axis in camera frame) is opposite to the plane's normal.
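A hedged sketch of that construction, assuming a calibrated camera and the observed plane at Z=0 in world coordinates; all numbers below are illustrative placeholders, not values from the question:
import numpy as np

K = np.array([[800.0,   0.0, 320.0],
              [  0.0, 800.0, 240.0],
              [  0.0,   0.0,   1.0]])

rx = np.deg2rad(60.0)                         # camera pitched down (assumed)
R = np.array([[1.0, 0.0,          0.0],
              [0.0, np.cos(rx), -np.sin(rx)],
              [0.0, np.sin(rx),  np.cos(rx)]])
t = np.array([0.0, 0.0, 5.0])                 # 5 world units from the plane

# Drop the third column of R (it multiplies Z=0) and append t:
# H maps world-plane points (X, Y, 1) to pixels, H = K [r1, r2, t].
H = K @ np.column_stack((R[:, 0], R[:, 1], t))

# For the bird's eye view, warp with the inverse, plus a similarity S that
# sets the output scale and framing (100 px per world unit, assumed).
S = np.array([[100.0,   0.0, 250.0],
              [  0.0, 100.0, 250.0],
              [  0.0,   0.0,   1.0]])
print(S @ np.linalg.inv(H))
# birds_eye = cv2.warpPerspective(source_image, S @ np.linalg.inv(H), (500, 500))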
Does the function minAreaRect return angles in the range of 0-360 degrees?
I am unsure, as I have an object that is oriented at 90 degrees or so, but I keep getting either -1 or -15 degrees. Could this be an OpenCV error?
Any guidance much appreciated.
Thanks
I'm going to assume you're using C++, but the answer should be the same if you're using C or Python.
The function minAreaRect seems to give angles ranging from -90 to 0 degrees, not including zero, so an interval of [-90, 0).
The function gives -90 degrees if the rectangle it outputs isn't rotated, i.e. the rectangle has two sides exactly horizontal and two sides exactly vertical. As the rectangle rotates clockwise, the angle increases (goes towards zero). When zero is reached, the angle given by the function ticks back over to -90 degrees again.
So if you have a long rectangle from minAreaRect, and it's lying down flat, minAreaRect will call the angle -90 degrees. If you rotate the image until the rectangle given by minAreaRect is perfectly upright, then the angle will say -90 degrees again.
I didn't actually know any of this (I procrastinated from my OpenCV project to find out how it works :/). Anyway, here's an OpenCV program that demonstrates minAreaRect, if I haven't explained it clearly enough already:
#include <cstdio>
#include <vector>
#include <opencv2/opencv.hpp>

using namespace cv;

int main() {
    float angle = 0;
    Mat image(200, 400, CV_8UC3, Scalar(0));
    RotatedRect originalRect;
    Point2f vertices[4];
    std::vector<Point2f> vertVect;
    RotatedRect calculatedRect;

    while (waitKey(5000) != 27) {
        // Create a rectangle, rotating it by 10 degrees more each time.
        originalRect = RotatedRect(Point2f(100, 100), Size2f(100, 50), angle);

        // Convert the rectangle to a vector of points for minAreaRect to use.
        // Also move the points to the right, so that the two rectangles aren't
        // in the same place.
        originalRect.points(vertices);
        for (int i = 0; i < 4; i++) {
            vertVect.push_back(vertices[i] + Point2f(200, 0));
        }

        // Get minAreaRect to find a rectangle that encloses the points. This
        // should have the exact same orientation as our original rectangle.
        calculatedRect = minAreaRect(vertVect);

        // Draw the original rectangle, and the one given by minAreaRect.
        for (int i = 0; i < 4; i++) {
            line(image, vertices[i], vertices[(i + 1) % 4], Scalar(0, 255, 0));
            line(image, vertVect[i], vertVect[(i + 1) % 4], Scalar(255, 0, 0));
        }
        imshow("rectangles", image);

        // Print the angle values.
        printf("---\n");
        printf("Original angle:             %7.2f\n", angle);
        printf("Angle given by minAreaRect: %7.2f\n", calculatedRect.angle);
        printf("---\n");

        // Reset everything for the next frame.
        image = Mat(200, 400, CV_8UC3, Scalar(0));
        vertVect.clear();
        angle += 10;
    }

    return 0;
}
This lets you easily see how the angle, and shape, of a manually drawn rectangle compares to the minAreaRect interpretation of the same rectangle.
Improving on the answer of Adam Goodwin, I want to add my little code that changes the behaviour a little bit:
I wanted to have the angle between the longer side and the vertical (to me it is the most natural way to think about rotated rectangles).
If you need the same, just use this code:
void printAngle(RotatedRect calculatedRect) {
    if (calculatedRect.size.width < calculatedRect.size.height) {
        printf("Angle along longer side: %7.2f\n", calculatedRect.angle + 180);
    } else {
        printf("Angle along longer side: %7.2f\n", calculatedRect.angle + 90);
    }
}
To see it in action, just insert it into Adam Goodwin's code:
printf("Angle given by minAreaRect: %7.2f\n", calculatedRect.angle);
printAngle(calculatedRect);
printf("---\n");
After experimenting, I find that if the long side is to the left of the bottom point, the angle value is between the long side and the Y+ axis, but if the long side is to the right of the bottom point, the angle value is between the long side and the X+ axis.
So I use code like this (Java):
rRect = Imgproc.minAreaRect(mop2f);
if (rRect.size.width < rRect.size.height) {
    angle = 90 - rRect.angle;
} else {
    angle = -rRect.angle;
}
The angle is from 0 to 180.
After much experimentation, I have found the relationship between the rectangle orientation and the output angle of minAreaRect(). It can be summarized in the following image:
The following description assumes that we have a rectangle with unequal height and width, i.e., it is not square.
If the rectangle lies vertically (width < height), then the detected angle is -90 degrees. If the rectangle lies horizontally, then the detected angle is also -90 degrees.
If the top part of the rectangle is in the first quadrant, then the detected angle decreases as the rectangle rotates from horizontal to vertical position, until the detected angle becomes -90 degrees. In the first quadrant, the width of the detected rectangle is longer than its height.
If the top part of the detected rectangle is in the second quadrant, then the angle decreases as the rectangle rotates from vertical to horizontal position. But there is a difference between the second and first quadrants: if the rectangle approaches the vertical position but has not reached it, its angle approaches 0; if the rectangle approaches the horizontal position but has not reached it, its angle approaches -90 degrees.
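To observe this behaviour directly, a small sketch like the following sweeps a synthetic rectangle through known orientations and prints what minAreaRect reports (the centre, size, and step are arbitrary; the exact angle convention depends on your OpenCV version, as the next answer explains):
import cv2
import numpy as np

# Sweep a 60x20 rectangle through known orientations and print the
# (width, height, angle) that minAreaRect reports for each.
for deg in range(0, 181, 15):
    pts = cv2.boxPoints(((100.0, 100.0), (60.0, 20.0), float(deg)))
    (cx, cy), (w, h), angle = cv2.minAreaRect(pts.astype(np.float32))
    print("input {:3d} deg -> size ({:.0f}, {:.0f}), angle {:6.1f}".format(deg, w, h, angle))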
This post here is also good in explaining this.
It depends on the version of OpenCV, at least for Python.
For opencv-python==4.5.4.60, the angle is the angle between the positive x-axis and the first side of the rectangle that the axis meets as it rotates anticlockwise. The following is a code snippet:
import cv2
import numpy as np

box1 = [[0, 0], [1, 0], [1, 2], [0, 2]]
cv2.minAreaRect(np.asarray(box1, dtype=np.float32))  # angle = 90.0
box2 = [[0, 0], [2, 0], [2, 1], [0, 1]]
cv2.minAreaRect(np.asarray(box2, dtype=np.float32))  # angle = 90.0
box3 = [[0, 0], [2**0.5, 2**0.5], [0.5*2**0.5, 1.5*2**0.5], [-0.5*2**0.5, 0.5*2**0.5]]
cv2.minAreaRect(np.asarray(box3, dtype=np.float32))  # angle = 44.999
box4 = [[0, 0], [-2**0.5, 2**0.5], [-0.5*2**0.5, 1.5*2**0.5], [0.5*2**0.5, 0.5*2**0.5]]
cv2.minAreaRect(np.asarray(box4, dtype=np.float32))  # angle = 45.0
box5 = [[0, 0], [-0.5*2**0.5, 0.5*2**0.5], [-2**0.5, 0], [-0.5*2**0.5, -0.5*2**0.5]]
cv2.minAreaRect(np.asarray(box5, dtype=np.float32))  # angle = 45.0
For opencv-python==3.4.13.47, the angle is the angle between the positive x-axis and the first side of the rectangle that the axis meets as it rotates clockwise. The following is a code snippet:
import cv2
import numpy as np

box1 = [[0, 0], [1, 0], [1, 2], [0, 2]]
cv2.minAreaRect(np.asarray(box1, dtype=np.float32))  # angle = -90.0
box2 = [[0, 0], [2, 0], [2, 1], [0, 1]]
cv2.minAreaRect(np.asarray(box2, dtype=np.float32))  # angle = -90.0
box3 = [[0, 0], [2**0.5, 2**0.5], [0.5*2**0.5, 1.5*2**0.5], [-0.5*2**0.5, 0.5*2**0.5]]
cv2.minAreaRect(np.asarray(box3, dtype=np.float32))  # angle = -44.999
box4 = [[0, 0], [-2**0.5, 2**0.5], [-0.5*2**0.5, 1.5*2**0.5], [0.5*2**0.5, 0.5*2**0.5]]
cv2.minAreaRect(np.asarray(box4, dtype=np.float32))  # angle = -45.0
box5 = [[0, 0], [-0.5*2**0.5, 0.5*2**0.5], [-2**0.5, 0], [-0.5*2**0.5, -0.5*2**0.5]]
cv2.minAreaRect(np.asarray(box5, dtype=np.float32))  # angle = -45.0