Is there a function in OpenCV that, given:
the coordinates of a marker in the image plane,
the extrinsic parameters,
the intrinsic parameters,
the z coordinate (the distance between the marker and the camera; I use a Kinect sensor),
provides the corresponding world coordinates of the marker?
Any help is much appreciated. Thanks!
To find the world coordinates of the marker you first need its coordinates relative to the camera. If you know the camera pose P relative to the origin and the marker pose M relative to the camera, then to get the marker pose relative to the origin you simply multiply them together:
final = [P]*[M]
It sounds like you are just struggling with finding M. All you need to do is multiply your pixel position by the inverse of the intrinsic camera matrix and then scale by your Z coordinate; note that this gives the marker's position in the camera frame, not the world frame:
Z*cam_mat.inv()*[x_image, y_image, 1] = [x_cam, y_cam, z_cam]
M = [1, 0, 0, x_cam,
     0, 1, 0, y_cam,
     0, 0, 1, z_cam,
     0, 0, 0, 1]
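A minimal NumPy sketch of those two steps (the intrinsic values and the camera pose below are illustrative placeholders):
import numpy as np

def marker_world_pose(u, v, Z, cam_mat, cam_pose):
    """u, v: pixel coordinates of the marker; Z: depth from the Kinect;
    cam_mat: 3x3 intrinsic matrix; cam_pose: 4x4 camera pose P in the world."""
    # Z * inv(K) * [u, v, 1]^T gives the marker position in the camera frame
    p_cam = Z * np.linalg.inv(cam_mat) @ np.array([u, v, 1.0])
    M = np.eye(4)          # marker pose M relative to the camera (translation only)
    M[:3, 3] = p_cam
    return cam_pose @ M    # final = P * M: marker pose relative to the world origin

# Example: camera at the world origin, marker at pixel (320, 240), 1.5 m away
K = np.array([[525.0, 0.0, 319.5],
              [0.0, 525.0, 239.5],
              [0.0,   0.0,   1.0]])  # illustrative Kinect-like intrinsics
print(marker_world_pose(320, 240, 1.5, K, np.eye(4))[:3, 3])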
Context: I have a big 2 m x 2 m arena on which 4 ArUco markers are printed; their positions from one corner of the arena are known and fixed. Now I have another ArUco marker on a robot moving over this arena. P.S. The known positions are in 2D.
Problem: I want to find the position of the robot in the arena (with respect to the known corner of the arena). I am using Python: first I detect the markers in the image using detectMarkers(), then I estimate the pose of each marker. The tvec values returned by the pose estimation function give the position of a marker with respect to the camera coordinate system. That works fine when the camera is perpendicular to the arena, but when the camera is kept at an angle there is a large error in the position.
Is my approach right? Assuming the camera is calibrated well, what is the source of the error?
import numpy as np
import cv2
from cv2 import aruco

rvec, tvec, _ = aruco.estimatePoseSingleMarkers(corners, actual_size, mtx, dist)
cv2.imshow('img', img)

index = np.where(ids == 0)  # locate the known reference marker
rotation_matrix = np.eye(4)  # 4x4 homogeneous transform: reference marker -> camera
rotation_matrix[:3, :3], _ = cv2.Rodrigues(rvec[index])  # rotation part
rotation_matrix[:3, 3] = tvec[index].flatten()  # translation part
inverse_rot = np.linalg.inv(rotation_matrix)  # camera -> reference marker

pt = np.ones((4, 1))  # homogeneous point
for i, j, k in zip(ids, tvec, rvec):
    print(i, 'POS:', j)  # marker id and its tvec in the camera frame
    pt[:3] = j.reshape(3, 1)
    rot_point = np.dot(inverse_rot, pt)  # express the point in the reference marker frame
    print(rot_point[:3])  # the new position, relative to the known marker
    print(np.sqrt(rot_point[0]**2 + rot_point[1]**2))  # planar distance from the reference marker
rotation_matrix is a 4x4 homogeneous matrix containing the rotation and translation. It is used to transfer the coordinate system from the camera to one of the markers (the known marker on the arena, so that marker becomes the origin), converting the other points (tvecs in the camera system) into the marker's coordinate system.
Homogeneous Coordinate transformation
How can I calculate the distance of an object of known size (e.g. an ArUco marker of 0.14 m printed on paper) from the camera? I know the camera matrix (camMatx), and fx, fy ~= 600 px, assuming no distortion. From this data I am able to calculate the pose of the ArUco marker and have obtained [R|t]. Now the task is to get the distance of the ArUco marker from the camera. I also know the height of the camera above the ground plane (15 m).
How should I go about solving this problem? Any help would be appreciated. Also please note I have seen the similar-triangles approach, but that works only when the distance to the object is already known, which doesn't apply in my case as the distance is what I have to calculate.
N.B.: I don't know the camera sensor height, but I do know how high the camera is located above the ground.
I know the dimensions of the area in which my object is moving (70 m x 45 m). In the end I would like to plot the coordinates of the moving object on a 2D map drawn to scale.
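For what it's worth, once a pose estimate gives you tvec in the camera frame, its Euclidean norm is already the camera-to-marker distance. A small sketch (the example tvec is arbitrary, and the ground-distance step assumes the marker lies on the ground plane):
import numpy as np

def marker_distance(tvec, cam_height=15.0):
    """Distance to a marker given tvec from e.g. aruco.estimatePoseSingleMarkers."""
    d = np.linalg.norm(tvec)  # straight-line camera-to-marker distance
    # Horizontal ground distance via Pythagoras, if the marker is on the ground
    ground = np.sqrt(max(d**2 - cam_height**2, 0.0))
    return d, ground

print(marker_distance(np.array([3.0, -2.0, 40.0])))  # example tvec, in metres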
I have a set of known 3D points in the world coordinate system, and I know the corresponding 2D points in the image.
Now, for a new 3D coordinate (x, y, z), I need to find the 2D image coordinate (u, v). How can I find that in OpenCV? How can I find the transformation matrix (camera matrix, rotation, translation) using OpenCV?
First you need to read about the fundamental matrix and epipolar geometry, and understand how the projection of world coordinates onto the image plane is done.
From the first part of your question it seems you already have this projection matrix. For any new world coordinate, just apply this matrix.
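One concrete route in OpenCV is cv2.solvePnP on the known 3D-2D correspondences, followed by cv2.projectPoints for any new point; a sketch with made-up correspondences and intrinsics:
import numpy as np
import cv2

# Known 3D world points and their observed 2D image points (illustrative values)
object_points = np.array([[0, 0, 0], [1, 0, 0], [1, 1, 0],
                          [0, 1, 0], [0, 0, 1], [1, 1, 1]], dtype=np.float32)
image_points = np.array([[100, 100], [300, 110], [310, 290],
                         [95, 280], [120, 80], [330, 270]], dtype=np.float32)
K = np.array([[800, 0, 320], [0, 800, 240], [0, 0, 1]], dtype=np.float32)
dist = np.zeros(5)  # assuming no lens distortion

# Recover rotation and translation from the correspondences
ok, rvec, tvec = cv2.solvePnP(object_points, image_points, K, dist)

# Project a new 3D point (x, y, z) to its image coordinate (u, v)
uv, _ = cv2.projectPoints(np.array([[0.5, 0.5, 0.0]], dtype=np.float32),
                          rvec, tvec, K, dist)
print(uv.ravel())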
I am trying to determine the camera position in world coordinates, relative to a fiducial position, based on a fiducial marker found in a scene.
My methodology for determining the viewMatrix is described here:
Determine camera pose?
I have the rotation and translation, [R|t], from the trained marker to the scene image. Given the camera calibration, and thus the camera intrinsics, I should be able to discern the camera's position in world coordinates based on the perspective and orientation of the marker found in the scene image.
Can anybody direct me to a discussion or example similar to this? I'd like to know my camera's position based on the fiducial marker; I'm sure something similar to this has been done before, I'm just not searching with the correct keywords.
Appreciate your guidance.
What do you mean by world coordinates? If you mean object coordinates, then you should use the inverse transformation of solvePnP's result.
Given a view matrix [R|t], we have that inv([R|t]) = [R'|-R'*t], where R' is the transpose of R. In OpenCV:
cv::Mat rvec, tvec;
cv::solvePnP(objectPoints, imagePoints, intrinsics, distortion, rvec, tvec);
cv::Mat R;
cv::Rodrigues(rvec, R);
R = R.t(); // inverse rotation
tvec = -R * tvec; // translation of inverse
// camPose is a 4x4 matrix with the pose of the camera in the object frame
cv::Mat camPose = cv::Mat::eye(4, 4, R.type());
R.copyTo(camPose.rowRange(0, 3).colRange(0, 3)); // copies R into camPose
tvec.copyTo(camPose.rowRange(0, 3).colRange(3, 4)); // copies tvec into camPose
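For readers working in Python, the same inversion can be sketched with NumPy (the example rvec/tvec are arbitrary placeholders):
import numpy as np
import cv2

def camera_pose_from_pnp(rvec, tvec):
    """Invert solvePnP's [R|t] to get the 4x4 camera pose in the object frame."""
    R, _ = cv2.Rodrigues(rvec)
    R_inv = R.T                          # inverse rotation (transpose)
    t_inv = -R_inv @ tvec.reshape(3, 1)  # translation of the inverse
    cam_pose = np.eye(4)
    cam_pose[:3, :3] = R_inv
    cam_pose[:3, 3] = t_inv.ravel()
    return cam_pose

print(camera_pose_from_pnp(np.array([0.1, 0.2, 0.3]), np.array([0.5, 0.0, 2.0])))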
Update #1:
Result of solvePnP
solvePnP estimates the object pose given a set of object points (model coordinates), their corresponding image projections (image coordinates), as well as the camera matrix and the distortion coefficients.
The object pose is given by two vectors, rvec and tvec. rvec is a compact representation of a rotation matrix for the pattern view seen on the image. That is, rvec together with the corresponding tvec brings the fiducial pattern from the model coordinate space (in which object points are specified) to the camera coordinate space.
That is, we end up in the camera coordinate space: it moves with the camera, and the camera is always at its origin. The camera axes have the same directions as the image axes, so
the x-axis points to the right as seen from the camera,
the y-axis points down,
and the z-axis points in the direction of the camera's view.
The same applies to the model coordinate space, so if you specified the origin in the upper right corner of the fiducial pattern, then
the x-axis points to the right (e.g. along the longer side of your pattern),
the y-axis points to the other side (e.g. along the shorter one),
and the z-axis points toward the ground.
You can specify the world origin as the first of the object points, i.e. the first object point is set to (0, 0, 0) and all other points have z = 0 (in the case of planar patterns). Then tvec (combined with rvec) points to the origin of the world coordinate space in which you placed the fiducial pattern. solvePnP's output has the same units as the object points.
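For example, the object points for a square planar fiducial, with the first corner chosen as the world origin, might be defined like this (the side length is a placeholder; solvePnP's tvec will come back in the same units):
import numpy as np

side = 0.1  # marker side length, e.g. in metres
object_points = np.array([
    [0.0,  0.0,  0.0],   # first point: the world origin
    [side, 0.0,  0.0],   # along the x-axis
    [side, side, 0.0],
    [0.0,  side, 0.0],   # all points have z = 0 (planar pattern)
], dtype=np.float32)
# retval, rvec, tvec = cv2.solvePnP(object_points, image_points, K, dist)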
Take a look at the following: 6dof positional tracking. I think this is very similar to what you need.
I know that in the general case making this conversion is impossible, since depth information is lost going from 3D to 2D.
However, I have a fixed camera and I know its camera matrix. I also have a planar calibration pattern of known dimensions - let's say that in world coordinates it has corners (0,0,0) (2,0,0) (2,1,0) (0,1,0). Using OpenCV I can estimate the pattern's pose, giving the translation and rotation matrices needed to project a point on the object to a pixel in the image.
Now: this 3D-to-image projection is easy, but how about the other way? If I pick a pixel in the image that I know is part of the calibration pattern, how can I get the corresponding 3D point?
I could iteratively choose some random 3D point on the calibration pattern, project to 2D, and refine the 3D point based on the error. But this seems pretty horrible.
Given that this unknown point has world coordinates of the form (x, y, 0) - since it must lie on the z = 0 plane - it seems like there should be some transformation that I can apply, instead of doing this iterative refinement. My maths isn't very good though - can someone work out this transformation and explain how you derive it?
Here is a closed-form solution that I hope can help someone. Using the conventions in the image from your comment above, you can use the centered-normalized pixel coordinates u and v (usually obtained after distortion correction) and the extrinsic calibration data, like this:
|Tx|   |r11 r21 r31|   |-t1|
|Ty| = |r12 r22 r32| . |-t2|
|Tz|   |r13 r23 r33|   |-t3|

|dx|   |r11 r21 r31|   |u|
|dy| = |r12 r22 r32| . |v|
|dz|   |r13 r23 r33|   |1|
With these intermediate values, the coordinates you want are:
X = (-Tz/dz)*dx + Tx
Y = (-Tz/dz)*dy + Ty
Explanation:
The vector [t1, t2, t3]^T is the position of the origin of the world coordinate system (the (0,0) of your calibration pattern) with respect to the camera optical center; by reversing the signs and inverting the rotation transformation we obtain the vector T = [Tx, Ty, Tz]^T, which is the position of the camera center in the world reference frame.
Similarly, [u, v, 1]^T is the vector along which the observed point lies in the camera reference frame (starting from the camera center). By inverting the rotation transformation we obtain the vector d = [dx, dy, dz]^T, which represents the same direction in the world reference frame.
To invert the rotation transformation we take advantage of the fact that the inverse of a rotation matrix is its transpose.
Now we have a line with direction vector d starting from point T; the intersection of this line with the plane Z = 0 is given by the second set of equations. Note that it would be similarly easy to find the intersection with the X = 0 or Y = 0 planes, or with any plane parallel to them.
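For completeness, a NumPy sketch of this closed form (assuming rvec/tvec from solvePnP; the intrinsic matrix K is used to turn raw pixels into centered-normalized coordinates, and the example pose is arbitrary):
import numpy as np
import cv2

def backproject_to_z0(u_px, v_px, rvec, tvec, K):
    """Intersect the viewing ray through pixel (u_px, v_px) with the world plane Z = 0."""
    R, _ = cv2.Rodrigues(rvec)
    uv1 = np.linalg.inv(K) @ np.array([u_px, v_px, 1.0])  # centered-normalized [u, v, 1]
    T = R.T @ -tvec.reshape(3)  # camera center in world coordinates
    d = R.T @ uv1               # ray direction in world coordinates
    s = -T[2] / d[2]            # parameter where the ray crosses Z = 0
    return T + s * d            # [X, Y, ~0]

# Example with arbitrary pose and intrinsics
K = np.array([[800.0, 0, 320], [0, 800, 240], [0, 0, 1]])
print(backproject_to_z0(350, 260, np.array([0.1, -0.2, 0.05]), np.array([0.3, 0.1, 2.0]), K))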
Yes, you can. If you have a transformation matrix that maps a point in the 3D world to the image plane, you can just use the inverse of this transformation matrix to map an image-plane point back to a 3D world point. If you already know that z = 0 for the 3D world point, this results in a single solution for the point; there is no need to iteratively choose some random 3D point. I had a similar problem where I had a camera mounted on a vehicle with a known position and camera calibration matrix, and I needed to know the real-world location of a lane marking captured on the image plane of the camera.
If Z = 0 for your points in world coordinates (which should be true for a planar calibration pattern), then instead of inverting the rotation transformation you can compute a homography between your camera image and the calibration pattern.
Once you have the homography, you can select a point in the image and get its location in world coordinates using the inverse homography.
This is true as long as the point in world coordinates lies on the same plane as the points used for calculating this homography (in this case Z = 0).
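A short sketch of that route (the correspondences are illustrative; computing the homography directly from image to world is equivalent to inverting the world-to-image one):
import numpy as np
import cv2

# Pattern corners in the image (pixels) and in the world (Z = 0, so only x, y)
img_pts = np.array([[120, 90], [520, 100], [510, 300], [130, 310]], dtype=np.float32)
world_pts = np.array([[0, 0], [2, 0], [2, 1], [0, 1]], dtype=np.float32)

H, _ = cv2.findHomography(img_pts, world_pts)  # image plane -> Z = 0 world plane

# Map any pixel lying on the pattern's plane to world coordinates
pixel = np.array([[[300.0, 200.0]]], dtype=np.float32)  # shape (1, 1, 2)
print(cv2.perspectiveTransform(pixel, H))               # (x, y) on the Z = 0 plane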
This approach to this problem was also discussed below this question on SO: Transforming 2D image coordinates to 3D world coordinates with z = 0