I detected an ArUco marker and estimated its pose (see the image below). However, the Xt (X translation) I get is positive. According to the drawAxis function, the positive direction points away from the image center, so I expected a negative value. Why am I getting a positive value instead?
My camera is about 120 mm away from the imaging surface, but I am getting Zt (Z translation) in the range of 650 mm. Does pose estimation give the pose of the marker with respect to the physical camera or to the image plane center? I don't understand why Zt is so high.
I kept measuring the pose while changing Z and obtained roll, pitch and yaw. I noticed that roll (rotation about the camera X axis) keeps flipping its sign while its magnitude stays within 166-178 degrees, but the sign of Xt did not change along with the sign of roll. Any thoughts on why it behaves like that?
Any suggestions for getting more consistent data?
import math

import cv2 as cv
import numpy as np
from scipy.spatial.transform import Rotation as R

# fname, mtx, dst, aruco_marker_side_length and euler_from_quaternion
# are defined elsewhere (not shown in this excerpt).
image=cv.imread(fname)
arucoDict = cv.aruco.Dictionary_get(cv.aruco.DICT_4X4_1000)
arucoParams = cv.aruco.DetectorParameters_create()
(corners, ids, rejected) = cv.aruco.detectMarkers(image, arucoDict,
    parameters=arucoParams)
print(corners, ids, rejected)
if len(corners) > 0:
    # flatten the ArUco IDs list
    ids = ids.flatten()
    # loop over the detected ArUCo corners
    #for (markerCorner, markerID) in zip(corners, ids):
    #(markerCorner, markerID)=(corners, ids)
    # extract the marker corners (which are always returned in
    # top-left, top-right, bottom-right, and bottom-left order)
    #corners = corners.reshape((4, 2))
    (topLeft, topRight, bottomRight, bottomLeft) = corners[0][0][0],corners[0][0][1],corners[0][0][2],corners[0][0][3]
    # convert each of the (x, y)-coordinate pairs to integers
    topRight = (int(topRight[0]), int(topRight[1]))
    bottomRight = (int(bottomRight[0]), int(bottomRight[1]))
    bottomLeft = (int(bottomLeft[0]), int(bottomLeft[1]))
    topLeft = (int(topLeft[0]), int(topLeft[1]))
    # draw the bounding box of the ArUCo detection
    cv.line(image, topLeft, topRight, (0, 255, 0), 2)
    cv.line(image, topRight, bottomRight, (0, 255, 0), 2)
    cv.line(image, bottomRight, bottomLeft, (0, 255, 0), 2)
    cv.line(image, bottomLeft, topLeft, (0, 255, 0), 2)
    # compute and draw the center (x, y)-coordinates of the ArUco
    # marker
    cX = int((topLeft[0] + bottomRight[0]) / 2.0)
    cY = int((topLeft[1] + bottomRight[1]) / 2.0)
    cv.circle(image, (cX, cY), 4, (0, 0, 255), -1)
    if topLeft[1]!=topRight[1] or topLeft[0]!=bottomLeft[0]:
        rot1=np.degrees(np.arctan((topLeft[0]-bottomLeft[0])/(bottomLeft[1]-topLeft[1])))
        rot2=np.degrees(np.arctan((topRight[1]-topLeft[1])/(topRight[0]-topLeft[0])))
        rot=(np.round(rot1,3)+np.round(rot2,3))/2
        print(rot1,rot2,rot)
    else:
        rot=0
    # draw the ArUco marker ID on the image
    rotS=",rotation:"+str(np.round(rot,3))
    cv.putText(image, ("position: "+str(cX) +","+str(cY)),
        (100, topLeft[1] - 15), cv.FONT_HERSHEY_SIMPLEX,0.5, (255, 0, 80), 2)
    cv.putText(image, rotS,
        (400, topLeft[1] -15), cv.FONT_HERSHEY_SIMPLEX,0.5, (255, 0, 80), 2)
    print("[INFO] ArUco marker ID: {}".format(ids))
    d=np.round((math.dist(topLeft,bottomRight)+math.dist(topRight,bottomLeft))/2,3)
    # Get the rotation and translation vectors
    rvecs, tvecs, obj_points = cv.aruco.estimatePoseSingleMarkers(corners,aruco_marker_side_length,mtx,dst)
    # Print the pose for the ArUco marker
    # The pose of the marker is with respect to the camera lens frame.
    # Imagine you are looking through the camera viewfinder,
    # the camera lens frame's:
    # x-axis points to the right
    # y-axis points straight down towards your toes
    # z-axis points straight ahead away from your eye, out of the camera
    #for i, marker_id in enumerate(marker_ids):
    # Store the translation (i.e. position) information
    transform_translation_x = tvecs[0][0][0]
    transform_translation_y = tvecs[0][0][1]
    transform_translation_z = tvecs[0][0][2]
    # Store the rotation information
    rotation_matrix = np.eye(4)
    rotation_matrix[0:3, 0:3] = cv.Rodrigues(np.array(rvecs[0]))[0]
    r = R.from_matrix(rotation_matrix[0:3, 0:3])
    quat = r.as_quat()
    # Quaternion format
    transform_rotation_x = quat[0]
    transform_rotation_y = quat[1]
    transform_rotation_z = quat[2]
    transform_rotation_w = quat[3]
    # Euler angle format in radians
    roll_x, pitch_y, yaw_z = euler_from_quaternion(transform_rotation_x,transform_rotation_y,transform_rotation_z,transform_rotation_w)
    roll_x = math.degrees(roll_x)
    pitch_y = math.degrees(pitch_y)
    yaw_z = math.degrees(yaw_z)
Disclaimer: this goes for OpenCV v4.5.5 and corresponding aruco module (contrib repo). They redid a lot of aruco stuff for v4.6.0 and v4.7.0, so best check everything I say here.
Without checking all the code (looks roughly okay), a few basics about OpenCV and aruco:
Both use right-handed coordinate systems. Thumb X, index Y, middle Z.
OpenCV uses X right, Y down, Z far, for screen/camera frames. Origin for screens and pictures is the top left corner. For cameras, the origin is the center of the pinhole model, which would be the center of the aperture. I can't comment on lenses or lens systems. Assume the lens center is the origin. That's probably close enough.
Aruco uses X right, Y far, Z up, if the marker is lying flat on a table. Origin is in the center of the marker. The top left corner of the marker is considered the "first" corner.
The marker can be considered to have its own coordinate system/frame.
The pose given by rvec and tvec is the pose of the marker in the camera frame. That means np.linalg.norm(tvec) gives you the direct distance from the camera to the marker's center. tvec's Z is just the component parallel to the optical axis.
If the marker is in the right half of the picture ("half" defined by the camera matrix's cx, cy), you'd expect tvec's X to be positive and to grow as the marker moves further right. In the lower half, you'd expect Y to be positive.
Conversely, that transformation transforms marker-local coordinates to camera-local. Try transforming some marker-local points, such as origin or points on the axes. I believe that cv::transform can help with that. Using OpenCV's projectPoints to map 3D space points to 2D image points, you can then draw the marker's axes, or a cube on top of it, or anything you like.
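As a rough illustration of that transformation, here is a dependency-free sketch of what cv.Rodrigues followed by p_cam = R * p_marker + tvec does. The pose values (a half turn about X, 0.65 m in front of the camera) and the helper names rodrigues and marker_to_camera are made up for this example; with OpenCV you would take rvec and tvec from estimatePoseSingleMarkers instead.

```python
import math

def rodrigues(rvec):
    """Rotation vector (unit axis * angle in radians) -> 3x3 matrix (nested lists)."""
    theta = math.sqrt(sum(c * c for c in rvec))
    if theta < 1e-12:
        return [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    kx, ky, kz = [c / theta for c in rvec]
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    v = 1.0 - cos_t
    return [
        [cos_t + kx * kx * v, kx * ky * v - kz * sin_t, kx * kz * v + ky * sin_t],
        [ky * kx * v + kz * sin_t, cos_t + ky * ky * v, ky * kz * v - kx * sin_t],
        [kz * kx * v - ky * sin_t, kz * ky * v + kx * sin_t, cos_t + kz * kz * v],
    ]

def marker_to_camera(point, rvec, tvec):
    """Map a marker-local 3D point into the camera frame."""
    rot = rodrigues(rvec)
    return [sum(rot[i][j] * point[j] for j in range(3)) + tvec[i] for i in range(3)]

# Assumed pose: marker facing the camera dead-on, slightly right of center.
rvec = [math.pi, 0.0, 0.0]
tvec = [0.05, 0.0, 0.65]

origin_cam = marker_to_camera([0.0, 0.0, 0.0], rvec, tvec)  # equals tvec
y_tip_cam = marker_to_camera([0.0, 0.1, 0.0], rvec, tvec)   # marker Y points "up" in the image
z_tip_cam = marker_to_camera([0.0, 0.0, 0.1], rvec, tvec)   # marker Z points toward the camera
distance = math.sqrt(sum(c * c for c in tvec))              # camera-to-marker distance

print(origin_cam, y_tip_cam, z_tip_cam, distance)
```

The marker's Z tip ends up closer to the camera than its origin, and its Y tip has a negative camera Y, which is what you would expect for a marker that faces the camera.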
Say the marker sits upright and faces the camera dead-on. When you consider the frame triads of the marker and the camera in space ("world" space), both would be X "right", but one's Y and Z are opposite the other's Y and Z, so you'd expect to see a rotation around the X axis by half a turn (rotating Z and Y).
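Because that rotation is close to a half turn, a roll reading that alternates between about +178 and -178 degrees is not a real jump: both values sit a couple of degrees on either side of 180. A minimal sketch, assuming roll is reported in degrees within [-180, 180]; the helper name angle_diff_deg is hypothetical:

```python
def angle_diff_deg(a, b):
    """Smallest signed difference a - b in degrees, wrapped to [-180, 180)."""
    return (a - b + 180.0) % 360.0 - 180.0

# Two consecutive readings with opposite signs are only 4 degrees apart.
jump = angle_diff_deg(178.0, -178.0)

# Deviation of each reading from the "facing the camera dead-on" half turn.
deviations = [angle_diff_deg(roll, 180.0) for roll in (166.0, 178.0, -178.0, -166.0)]

print(jump, deviations)
```

Working with the deviation from 180 degrees (or with rotation matrices or quaternions directly) avoids the apparent sign flips. The translation is unaffected by this wrap-around, which is consistent with Xt keeping its sign.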
You could imagine the transformation to happen like this:
1. Initially the camera looks through the marker, from the marker's back out into the world. The camera would be "upside down". The camera sees marker-space.
2. The pose's rotation component rotates the whole marker-local world around the camera's origin. Seen from the world frame (point of reference), the camera rotates into an attitude you'd find natural.
3. The pose's translation moves the marker's world out in front of the camera (Z being positive), or equivalently, the camera backs away from the marker.
If you get implausible values, check aruco_marker_side_length and camera matrix. f would be around 500-3000 for typical resolutions (VGA-4k) and fields of view (60-80 degrees).
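For a quick plausibility check, the pinhole model relates focal length in pixels to image width and horizontal field of view, and relates depth to the marker's physical and apparent size. This is only a sketch under the ideal pinhole assumption with a marker that is roughly parallel to the image plane; the function names and the sample numbers are illustrative, not taken from the question.

```python
import math

def focal_length_px(image_width_px, horizontal_fov_deg):
    """Pinhole focal length in pixels for a given width and horizontal FOV."""
    return (image_width_px / 2.0) / math.tan(math.radians(horizontal_fov_deg) / 2.0)

def approx_depth(focal_px, marker_side, marker_side_px):
    """Approximate Z of a fronto-parallel marker, in the units of marker_side."""
    return focal_px * marker_side / marker_side_px

f_vga = focal_length_px(640, 60)    # roughly 554 px
f_4k = focal_length_px(3840, 80)    # roughly 2288 px

# Hypothetical numbers: a 20 mm marker that spans 100 px with f = 600 px.
z_mm = approx_depth(600.0, 20.0, 100.0)

print(f_vga, f_4k, z_mm)
```

Since the estimated depth scales linearly with both the focal length and the marker side length, a wrong value for either one scales Zt by the same factor.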
I am using cv::solvePnP to get a camera pose. I run PnP, and it returns the rvec and tvec values (rotation vector and position).
I then run this function to convert the values to the camera pose:
void GetCameraPoseEigen(cv::Vec3d tvecV, cv::Vec3d rvecV, Eigen::Vector3d &Translate, Eigen::Quaterniond &quats)
{
    Mat R;
    Mat tvec, rvec;
    tvec = DoubleMatFromVec3b(tvecV);
    rvec = DoubleMatFromVec3b(rvecV);
    cv::Rodrigues(rvec, R); // R is 3x3
    R = R.t(); // rotation of inverse
    tvec = -R*tvec; // translation of inverse
    Eigen::Matrix3d mat;
    cv2eigen(R, mat);
    Eigen::Quaterniond EigenQuat(mat);
    quats = EigenQuat;
    double x_t = tvec.at<double>(0, 0);
    double y_t = tvec.at<double>(1, 0);
    double z_t = tvec.at<double>(2, 0);
    Translate.x() = x_t * 10;
    Translate.y() = y_t * 10;
    Translate.z() = z_t * 10;
}
This works, but at some rotation angles the converted rotation values flip randomly between positive and negative, while the source rvecV value does not. I assume this means something is wrong in my conversion. How can I get a stable quaternion from the cv::Vec3d that PnP returns?
EDIT: This seems to be quaternion flipping, as mentioned here:
Quaternion is flipping sign for very similar rotations?
Based on that, I have tried adding:
if(quat.w() < 0)
{
    quat = quat.Inverse();
}
But I see the same flipping.
Both quat and -quat represent the same rotation. You can check that by taking a unit quaternion, converting it to a rotation matrix, then doing
quat.coeffs() = -quat.coeffs();
and converting that to a rotation matrix as well.
If for some reason you always want a positive w value, negate all coefficients if w is negative.
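A minimal sketch of that check in plain Python rather than Eigen. The (w, x, y, z) component order and the helper names quat_to_matrix and canonical are assumptions of this example (Eigen's coeffs() stores x, y, z, w):

```python
import math

def quat_to_matrix(w, x, y, z):
    """Unit quaternion -> 3x3 rotation matrix (nested lists)."""
    return [
        [1 - 2 * (y * y + z * z), 2 * (x * y - w * z), 2 * (x * z + w * y)],
        [2 * (x * y + w * z), 1 - 2 * (x * x + z * z), 2 * (y * z - w * x)],
        [2 * (x * z - w * y), 2 * (y * z + w * x), 1 - 2 * (x * x + y * y)],
    ]

def canonical(q):
    """Negate all four coefficients if w is negative; the rotation is unchanged."""
    return tuple(-c for c in q) if q[0] < 0 else tuple(q)

# 200 degrees about Z gives a negative w (cos(100 deg) < 0).
half = math.radians(200.0) / 2.0
q = (math.cos(half), 0.0, 0.0, math.sin(half))
neg_q = tuple(-c for c in q)

same_rotation = quat_to_matrix(*q) == quat_to_matrix(*neg_q)
q_positive_w = canonical(q)

print(same_rotation, q_positive_w)
```

Every matrix entry is built from products of two quaternion components, so flipping all four signs cancels out.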
The sign should not matter...
... rotation-wise, as long as all four fields of the 4D quaternion are getting flipped. There's more to it explained here:
Quaternion to EulerXYZ, how to differentiate the negative and positive quaternion
Think of it this way:
Angle/axis both flipped mean the same thing
and mind the clockwise to counterclockwise transition much like in a mirror image.
There may be a convention to keep the quat.w() (or quat[0]) component positive and flip the other components accordingly. Assuming w = cos(angle/2), requiring w > 0 just means that you want the angle to lie within the (-pi, pi) range, so that a rotation of -270 degrees becomes a rotation of +90 degrees.
Calling quat.Inverse() is probably not what you want, because it creates a rotation in the opposite direction. That is, -quat != quat.Inverse().
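To make the difference concrete, here is a small plain-Python sketch (not Eigen) that rotates the X axis with q, with -q and with the conjugate, which is the inverse for a unit quaternion. The (w, x, y, z) order and all helper names are assumptions of this example.

```python
import math

def from_axis_angle(axis, angle_deg):
    """Unit axis and angle in degrees -> quaternion (w, x, y, z)."""
    half = math.radians(angle_deg) / 2.0
    s = math.sin(half)
    return (math.cos(half), axis[0] * s, axis[1] * s, axis[2] * s)

def rotate(q, v):
    """Rotate vector v by the unit quaternion q via its rotation matrix."""
    w, x, y, z = q
    m = [
        [1 - 2 * (y * y + z * z), 2 * (x * y - w * z), 2 * (x * z + w * y)],
        [2 * (x * y + w * z), 1 - 2 * (x * x + z * z), 2 * (y * z - w * x)],
        [2 * (x * z - w * y), 2 * (y * z + w * x), 1 - 2 * (x * x + y * y)],
    ]
    return [sum(m[i][j] * v[j] for j in range(3)) for i in range(3)]

z_axis = (0.0, 0.0, 1.0)
q = from_axis_angle(z_axis, 90.0)
negated = tuple(-c for c in q)               # same rotation as q
inverse = (q[0], -q[1], -q[2], -q[3])        # conjugate: opposite rotation

by_q = rotate(q, [1.0, 0.0, 0.0])            # about (0, 1, 0)
by_negated = rotate(negated, [1.0, 0.0, 0.0])  # about (0, 1, 0)
by_inverse = rotate(inverse, [1.0, 0.0, 0.0])  # about (0, -1, 0)

# -270 degrees has a negative w; negating it yields the +90 degree quaternion.
q_m270 = from_axis_angle(z_axis, -270.0)
q_m270_flipped = tuple(-c for c in q_m270)

print(by_q, by_negated, by_inverse, q_m270_flipped)
```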
Also: check that both systems have the same handedness (chirality)! Test if your rotation matrix determinant is +1 or -1.
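A quick way to run that test, sketched in plain Python with a hypothetical det3 helper (with Eigen you could call determinant() on the 3x3 matrix instead):

```python
def det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (
        m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
        - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
        + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0])
    )

# A proper rotation (90 degrees about Z) keeps handedness: determinant +1.
rotation_z_90 = [[0.0, -1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]]

# Flipping a single axis changes handedness: determinant -1.
mirror_z = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, -1.0]]

print(det3(rotation_z_90), det3(mirror_z))
```

A matrix with determinant -1 is not a rotation, so converting it to a quaternion gives meaningless results.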
I just started learning Metal and can best show my frustration with the following series of screenshots. From top to bottom we have
(1) My model where the model matrix is the identity matrix
(2) My model rotated 60 deg about the x axis with orthogonal projection
(3) My model rotated 60 deg about the y axis with orthogonal projection
(4) My model rotated 60 deg about the z axis
So I use the following function for conversion into normalized device coordinates:
- (CGPoint)normalizedDevicePointForViewPoint:(CGPoint)point
{
CGPoint p = [self convertPoint:point toCoordinateSpace:self.window.screen.fixedCoordinateSpace];
CGFloat halfWidth = CGRectGetMidX(self.window.screen.bounds);
CGFloat halfHeight = CGRectGetMidY(self.window.screen.bounds);
CGFloat px = ( p.x - halfWidth ) / halfWidth;
CGFloat py = ( p.y - halfHeight ) / halfHeight;
return CGPointMake(px, -py);
}
The following rotates and orthogonally projects the model:
- (matrix_float4x4)zRotation
{
self.rotationZ = M_PI / 3;
const vector_float3 zAxis = { 0, 0, 1 };
const matrix_float4x4 zRot = matrix_float4x4_rotation(zAxis, self.rotationZ);
const matrix_float4x4 modelMatrix = zRot;
return matrix_multiply( matrix_float4x4_orthogonal_projection_on_z_plane(), modelMatrix );
}
As you can see, when I use the exact same method for rotating about the other two axes, it looks fine, not distorted. What am I doing wrong? Is there some sort of scaling/aspect ratio thing I should be setting somewhere? What could it be? I've been staring at this for an embarrassingly long time, so any help or ideas that can lead me in the right direction are much appreciated. Thank you in advance.
There's nothing wrong with your rotation or projection matrices. The visual oddity arises from the fact that you move your vertices into NDC space prior to rotation. A rectangle doesn't preserve its aspect ratio when rotating in NDC space, because the mapping from NDC back to screen coordinates is not 1:1.
I would recommend not working in NDC until the very end of the vertex pipeline (i.e., pass vertices into your vertex function in "world" space, and out to the rasterizer as NDC). You can do this with a classic construction of the orthographic projection matrix that scales and biases the vertices, correctly accounting for the non-square aspect ratio of window coordinates.
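The effect can be reproduced with a few lines of 2D math. The sketch below is plain Python rather than Metal, assumes a 400x200 drawable, and uses made-up helper names; it compares "rotate in world space, project last" with "convert to NDC first, then rotate" by measuring the on-screen length of a 100-pixel edge.

```python
import math

WIDTH, HEIGHT = 400.0, 200.0  # assumed drawable size in pixels (aspect 2:1)

def rotate_z(p, angle):
    """Rotate a 2D point about the origin by angle (radians)."""
    c, s = math.cos(angle), math.sin(angle)
    return (c * p[0] - s * p[1], s * p[0] + c * p[1])

def ortho(p):
    """Aspect-aware orthographic projection: centered pixel units -> NDC."""
    return (2.0 * p[0] / WIDTH, 2.0 * p[1] / HEIGHT)

def to_pixels(ndc):
    """What the viewport transform does: NDC -> centered pixel units."""
    return (ndc[0] * WIDTH / 2.0, ndc[1] * HEIGHT / 2.0)

def dist(a, b):
    return math.hypot(a[0] - b[0], a[1] - b[1])

a, b = (0.0, 0.0), (100.0, 0.0)  # an edge that is 100 px long
angle = math.pi / 3

# Rotate in world space, project to NDC at the very end: length preserved.
good = dist(to_pixels(ortho(rotate_z(a, angle))), to_pixels(ortho(rotate_z(b, angle))))

# Convert to NDC first, rotate there: the edge shrinks on screen.
bad = dist(to_pixels(rotate_z(ortho(a), angle)), to_pixels(rotate_z(ortho(b), angle)))

print(good, bad)
```

Rotations about the X and Y axes with an orthographic projection onto the Z plane only squash the shape along one screen axis, which is why the distortion is only obvious for the Z rotation.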
I'm new to SceneKit, coming from 2D SpriteKit, and was trying to figure out how to adjust the camera so that it's at the top of the world facing down. I have the location part right; however, I'm getting stuck on the rotation. If I adjust the X, Y or Z axis, nothing seems to happen, but on the W axis the slightest change (even 0.1 higher or lower) seems to move the camera in an unknown direction. What am I doing wrong?
cameraNode.position = SCNVector3Make(0, 10, 0)
cameraNode.rotation = SCNVector4Make(0, 0, 0, 0.5)
The rotation vector is decomposed as (x_axis, y_axis, z_axis, angle).
Setting a rotation axis with a null angle is the identity (no effective rotation). Setting an angle with a null rotation axis does not actually define a rotation.
As for why a small change of the angle has a huge effect: angles are expressed in radians.
A rotation of 90º around the x axis can be achieved as follows
node.rotation = SCNVector4Make(1, 0, 0, M_PI_2)
But you can also use Euler angles (see SCNNode.eulerAngles) if you find it easier:
node.eulerAngles = SCNVector3Make(M_PI_2, 0, 0)
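To get a feel for the numbers, here is a small language-neutral sketch in Python (not SceneKit API; the helper axis_angle is hypothetical) that builds an (x, y, z, angle) tuple from an axis and an angle in degrees, and shows how large 0.5 radians actually is.

```python
import math

def axis_angle(axis, angle_deg):
    """Return (x, y, z, angle) with a normalized axis and the angle in radians."""
    length = math.sqrt(sum(c * c for c in axis))
    if length == 0.0:
        raise ValueError("a zero axis does not define a rotation")
    x, y, z = [c / length for c in axis]
    return (x, y, z, math.radians(angle_deg))

quarter_turn_about_x = axis_angle((1.0, 0.0, 0.0), 90.0)  # angle is pi / 2
half_radian_in_degrees = math.degrees(0.5)                # about 28.6 degrees

print(quarter_turn_about_x, half_radian_in_degrees)
```

So a change of 0.1 in the fourth component is already a rotation of almost 6 degrees, while the values from the question, (0, 0, 0, 0.5), have no axis to rotate about at all.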
From my experiments, the angle returned by RotatedRect's angle variable goes from -90 to 0 degrees, which is not sufficient to determine if the object is leaned to the left or right.
For example, if the angle is -45 degrees, we cannot say if we need to rotate +45 or -45 degrees to deskew it.
An excerpt of the code I'm using:
RotatedRect rotated_rect = minAreaRect(contour);
float blob_angle_deg = rotated_rect.angle;
Mat mapMatrix = getRotationMatrix2D(center, blob_angle_deg, 1.0);
Leaning the object in one direction I get angles from 0 to -90 degrees, while leaning the object to the other direction I get angles from -90 to 0 degrees.
How can I find the angle by which I should rotate my image to deskew it?
After learning from Sebastian Schmitz's and Michael Burdinov's answers, this is how I solved it:
RotatedRect rotated_rect = minAreaRect(contour);
float blob_angle_deg = rotated_rect.angle;
if (rotated_rect.size.width < rotated_rect.size.height) {
    blob_angle_deg = 90 + blob_angle_deg;
}
Mat mapMatrix = getRotationMatrix2D(center, blob_angle_deg, 1.0);
So, in fact, RotatedRect's angle does not provide enough information for knowing an object's angle, you must also use RotatedRect's size.width and size.height.
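The same rule as a tiny Python sketch, assuming the [-90, 0) angle range described in the question; the function name deskew_angle and the sample rectangles are made up for illustration.

```python
def deskew_angle(width, height, angle_deg):
    """Apply the width/height rule: add 90 degrees when the rect is reported 'tall'."""
    if width < height:
        return 90.0 + angle_deg
    return angle_deg

# A 100x20 blob leaning one way is reported with width > height.
lean_one_way = deskew_angle(100.0, 20.0, -10.0)

# The same blob leaning the other way is reported with the sides swapped.
lean_other_way = deskew_angle(20.0, 100.0, -80.0)

print(lean_one_way, lean_other_way)
```

The two cases produce angles of opposite sign, which is the left/right information the raw angle alone does not carry.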
I explained how you can convert the angle of the rectangle into [0-180] in this thread.
The angle is always calculated along the longer side.
Switching the width and height of a rectangle is the same as rotating it by 90 degrees. So if the range of angles were 180 degrees instead of 90, then the same rectangle would have two representations: (width, height, angle) and (height, width, angle+90). With a range of 90 degrees you can represent every rectangle, and you can do so in only one way.
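That equivalence is easy to verify numerically. The sketch below is pure geometry in plain Python, independent of OpenCV's image-coordinate conventions; rect_corners is a hypothetical helper that returns the four corners of a rotated rectangle.

```python
import math

def rect_corners(center, size, angle_deg):
    """Sorted corner list of a rectangle with the given center, (w, h) and rotation."""
    cx, cy = center
    w, h = size
    a = math.radians(angle_deg)
    ux, uy = math.cos(a), math.sin(a)  # direction of the width side
    vx, vy = -uy, ux                   # direction of the height side
    corners = []
    for sw, sh in ((-1, -1), (1, -1), (1, 1), (-1, 1)):
        x = cx + sw * w / 2.0 * ux + sh * h / 2.0 * vx
        y = cy + sw * w / 2.0 * uy + sh * h / 2.0 * vy
        corners.append((round(x, 6), round(y, 6)))
    return sorted(corners)

first = rect_corners((0.0, 0.0), (100.0, 20.0), -30.0)
second = rect_corners((0.0, 0.0), (20.0, 100.0), 60.0)  # sides swapped, angle + 90

print(first)
print(second)
```

Both calls describe the same four corners, so a 180-degree angle range would be redundant once width and height are allowed to swap.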
This is what I use (c is my contour). Basically, I get the longest line's du and dv, and then use atan2()
import math

import cv2

rect = cv2.minAreaRect(c)
box = cv2.boxPoints(rect)
origin = box[0]
rect_width, rect_height = rect[1]
# pick the corner at the far end of the longer side
if rect_width > rect_height:
    target = box[3]
else:
    target = box[1]
dv = target[1] - origin[1]
du = target[0] - origin[0]
angle_rads = math.atan2(dv, du)