Suppose I have a 240x180 camera with the following distortion coefficients,
k1 = -0.138592767408, k2 = 0.0933736664192, k3 = 0.0, p1 = -0.000335586987532, p2 = 0.000173720158228
The camera center is given as (u0, v0) = (129.924663379, 99.1864303447).
In the formulae above, p1 = k4 and p2 = k5; u and v are the undistorted coordinates and ud, vd the corresponding distorted ones.
My understanding is that undistortion is a mapping from every undistorted coordinate to a distorted one. Therefore, the ranges of both the undistorted and the distorted coordinates should be (0-240, 0-180). However, going by the above formula, I get extremely large numbers for the distorted coordinates when I input undistorted coordinates in the range (0-240, 0-180), which goes against this understanding. Have I done something wrong here?
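For reference, here is a minimal sketch of the standard OpenCV radial/tangential model that these coefficients normally plug into; it operates on normalized coordinates rather than raw pixel coordinates, and the focal lengths fx, fy below are placeholders rather than values from the question:

import numpy as np

# Coefficients and camera center from the question
k1, k2, k3 = -0.138592767408, 0.0933736664192, 0.0
p1, p2 = -0.000335586987532, 0.000173720158228
u0, v0 = 129.924663379, 99.1864303447
fx, fy = 200.0, 200.0  # placeholder focal lengths, not given in the question

def distort(u, v):
    # Normalize first: the model acts on x = (u - u0)/fx, y = (v - v0)/fy
    x, y = (u - u0) / fx, (v - v0) / fy
    r2 = x * x + y * y
    radial = 1 + k1 * r2 + k2 * r2**2 + k3 * r2**3
    xd = x * radial + 2 * p1 * x * y + p2 * (r2 + 2 * x * x)
    yd = y * radial + p1 * (r2 + 2 * y * y) + 2 * p2 * x * y
    return xd * fx + u0, yd * fy + v0  # back to pixel coordinates

print(distort(10.0, 10.0))  # stays in a sensible range for mild distortion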
I am following the OpenCV camera calibration tutorial https://opencv-python-tutroals.readthedocs.io/en/latest/py_tutorials/py_calib3d/py_calibration/py_calibration.html to run camera calibration
ret, mtx, dist, rvecs, tvecs = cv2.calibrateCamera(objpoints, imgpoints, gray.shape[::-1], None, None)
What I want to do next is to reconstruct the 3D location of some feature points. The feature points are defined in image space. Here is what I am planning to do:
Find the new camera matrix:
h, w = img.shape[:2]  # my image dimensions
newcameramtx, roi = cv2.getOptimalNewCameraMatrix(mtx, dist, (w, h), 1, (w, h))
Undistort the feature point location:
new_points = cv2.undistortPoints(my_feature_points, mtx, dist, P=newcameramtx)
Reconstruct the 3D coordinate of the feature points for a given Z. I have two problems here. First, I do not know how to reconstruct the 3D coordinate. Second, when I do, should I use the original camera matrix "mtx" or the new camera matrix "newcameramtx"? And what about "roi"? Where should I apply it?
Thank you very much.
Take a look at this version of the docs, which I find easier to read. The key equation is this one:

[x, y, w]^T = [[fx, 0, cx], [0, fy, cy], [0, 0, 1]] * [X, Y, Z]^T

Once you have undistorted your image this equation applies. The matrix with fx, fy, cx, and cy is your camera matrix, often called M.
This equation tells you how to go from 2D pixel locations on the left, (x, y) with scale w, to 3D locations in the world on the right, [X, Y, Z].
First, I do not know how to reconstruct the 3D coordinate
To do that, we can apply the equation. Given a pixel location (x, y) and a range (plug in as w), we have:

[X, Y, Z]^T = M^-1 * [x*w, y*w, w]^T
which we can do in code like so:
import numpy as np
pixels = np.array([x, y, 1.0]) * w    # homogeneous pixel location scaled by the range w
XYZ = np.linalg.inv(mtx) @ pixels     # [X, Y, Z] in the camera frame
I'm not sure that you want to be calling getOptimalNewCameraMatrix, because that is cropping out pixels that may not be valid. I'd skip that for the moment until you have a better understanding of the system. The ROI is telling you where the undistorted image won't have any blank pixels.
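To make that concrete, here is a rough sketch of how the feature points from the question could be lifted to 3D for a known depth Z, skipping getOptimalNewCameraMatrix as suggested (the matrix, distortion, point values, and Z below are made-up stand-ins for your calibration outputs):

import numpy as np
import cv2

# Stand-ins for the calibration outputs and inputs from the question.
mtx = np.array([[500.0, 0.0, 320.0], [0.0, 500.0, 240.0], [0.0, 0.0, 1.0]])
dist = np.zeros(5)
my_feature_points = np.array([[[300.0, 200.0]], [[100.0, 80.0]]], dtype=np.float32)
Z = 1.0  # assumed known depth of the feature points, in the camera frame

# With no P argument, undistortPoints returns normalized coordinates (x, y),
# i.e. the ray [x, y, 1] through each pixel with lens distortion removed.
norm = cv2.undistortPoints(my_feature_points, mtx, dist).reshape(-1, 2)

# Scale each ray by the known depth to get 3D points in the camera frame.
XYZ = np.column_stack([norm * Z, np.full(len(norm), Z)])
print(XYZ)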
I really recommend the book Learning OpenCV (or the new version 3 one); it helped me a huge amount. It took me from getting really frustrated reading the docs (which assume a lot of prior knowledge) to actually understanding what was going on.
From my understanding, undistortPoints takes a set of points on a distorted image, and calculates where their coordinates would be on an undistorted version of the same image. Likewise, projectPoints maps a set of object coordinates to their corresponding image coordinates.
However, I am unsure whether projectPoints maps the object coordinates to a set of image points on the distorted image (i.e. the original image) or on one that has been undistorted (straight lines).
Furthermore, the OpenCV documentation for undistortPoints states that 'the function performs a reverse transformation to projectPoints()'. Could you please explain how this is so?
Quote from the 3.2 documentation for projectPoints():
Projects 3D points to an image plane.
The function computes projections of 3D points to the image plane given intrinsic and extrinsic camera parameters.
You have the parameter distCoeffs:
If the vector is empty, the zero distortion coefficients are assumed.
With no distortion, the equation is:

s * [u, v, 1]^T = K * [R | t] * [X, Y, Z, 1]^T

with K the intrinsic matrix and [R | t] the extrinsic matrix, i.e. the transformation that maps a point in the object or world frame to the camera frame.
For undistortPoints(), you have the parameter R:
Rectification transformation in the object space (3x3 matrix). R1 or R2 computed by cv::stereoRectify can be passed here. If the matrix is empty, the identity transformation is used.
The reverse transformation is the operation where you compute for a 2D image point ([u, v]) the corresponding 3D point in the normalized camera frame ([x, y, z=1]) using the intrinsic parameters.
With the extrinsic matrix, you can get the point in the camera frame:

[Xc, Yc, Zc]^T = R * [Xw, Yw, Zw]^T + t

The normalized camera frame is obtained by dividing by the depth:

x = Xc / Zc, y = Yc / Zc

Assuming no distortion, the image point is:

u = fx * x + cx, v = fy * y + cy

And the "reverse transformation", assuming no distortion:

x = (u - cx) / fx, y = (v - cy) / fy
Related posts:
Exact definition of the matrices in OpenCv StereoRectify
1. What is the camera frame of the rvec and tvec calculated by cv::calibrateCamera? Is it the original (distorted) camera or the undistorted one? Does the camera coordinate system change when the image is undistorted (not rectified)?
2. What is the R1 from cv::stereoRectify()? To my understanding, R1 rotates the left camera frame (O_c) to a frontal-parallel camera frame (O_cr) so that the image is rectified (rows aligned with the right image). In other words, applying R1 to 3D points in O_cr will result in points in O_c (or is it the other way around?).
A few posts and the OpenCV book try to explain this, but I just want to confirm that I understand it correctly, as the explanation in terms of rotating the image plane is confusing to me.
Thanks!
I can only reply to 1)
rvec and tvec describe the camera pose expressed in your calibration pattern's coordinate system. For each calibration pose you get an individual rvec and tvec.
Undistortion does not influence the camera position. Pixel positions are modified by using the radial and tangential distortion parameters (distCoeffs) resulting from the camera calibration.
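For example, one rvec/tvec pair can be converted into a rotation matrix and the camera position in the pattern's coordinate system like this (the values are placeholders for a single entry of the rvecs/tvecs returned by calibrateCamera):

import numpy as np
import cv2

# Placeholder values for one calibration pose.
rvec = np.array([0.01, -0.02, 0.03])
tvec = np.array([[0.1], [0.2], [1.5]])

# R and tvec map pattern coordinates into the camera frame: X_cam = R @ X_pattern + tvec
R, _ = cv2.Rodrigues(rvec)

# Camera center expressed in the pattern's coordinate system.
cam_pos = -R.T @ tvec
print(cam_pos.ravel())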
I have been searching through lots of resources on the internet for many days, but I couldn't solve the problem.
I have a project in which I am supposed to detect the position of a circular object on a plane. Since the object is on a plane, all I need are the x and y positions (not z). For this purpose I have chosen to go with image processing. The camera (single view, not stereo) position and orientation are fixed and known with respect to a reference coordinate system on the plane.
I have detected the image pixel coordinates of the circle centers using OpenCV. All I need now is to convert these coordinates to the real world.
http://www.packtpub.com/article/opencv-estimating-projective-relations-images
On this site and others, a homographic transformation is given as:
p = C[R|T]P, where P is the real-world coordinates and p is the pixel coordinates (in homogeneous coordinates), C is the camera matrix representing the intrinsic parameters, R is the rotation matrix, and T is the translation matrix. I have followed a tutorial on calibrating the camera with OpenCV (applied the cameraCalibration source file); I have 9 good chessboard images, and as output I have the intrinsic camera matrix and the translation and rotation parameters for each of the images.
I have the 3x3 intrinsic camera matrix (focal lengths and center pixel) and a 3x4 extrinsic matrix [R|T], in which R is the left 3x3 block and T the right 3x1 column. According to the p = C[R|T]P formula, I assume that by multiplying these parameter matrices with P (world) we get p (pixel). But what I need is to project the p (pixel) coordinates back to P (world coordinates) on the ground plane.
I am studying electrical and electronics engineering and did not take image processing or advanced linear algebra classes. As I remember from my linear algebra course, we can rearrange the transformation as P = [R|T]^-1 * C^-1 * p. However, this is in a Euclidean coordinate system, and I don't know whether such a thing is possible with homogeneous coordinates; moreover, the 3x4 matrix [R|T] is not invertible, and I don't know whether this is the correct way to go.
The intrinsic and extrinsic parameters are known; all I need is the real-world coordinate of the point on the ground plane. Since the point is on a plane, the coordinates have only 2 dimensions (depth is not important, which sidesteps the usual objection to single-view geometry). The camera is fixed (position and orientation). How should I find the real-world coordinates of a point in an image captured by a single-view camera?
EDIT
I have been reading "Learning OpenCV" by Gary Bradski & Adrian Kaehler. On page 386, under the Calibration -> Homography section, it is written: q = s*M*W*Q, where M is the camera intrinsic matrix, W is the 3x4 matrix [R|T], s is an "up to" scale factor (which I assume is related to the homography concept; I don't understand it clearly), q is the pixel coordinate, and Q is the real-world coordinate. It is said that in order to get the real-world coordinate (on the chessboard plane) of an object detected on the image plane, we set Z = 0, so the third column of W (which multiplies Z) can also be dropped; trimming these unnecessary parts leaves W as a 3x3 matrix. H = M*W is then a 3x3 homography matrix. Now we can invert the homography matrix and left-multiply it with q to get Q = [X, Y, 1], where the Z coordinate was trimmed.
I applied the mentioned algorithm and got results that cannot lie between the image corners (the object plane was parallel to the image plane, only ~30 cm in front of the camera, and I got results like 3000; the chessboard square sizes were entered in millimeters, so I assume the output real-world coordinates are in millimeters as well). Anyway, I am still trying things. By the way, the raw results were very, very large at first, but I divide all values in Q by the third component of Q to get (X, Y, 1).
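For reference, here is a sketch of the book's recipe as described above (the intrinsics and pose are placeholders, not the actual calibration output):

import numpy as np
import cv2

# Placeholder calibration results; translation in millimeters.
M = np.array([[700.0, 0.0, 320.0], [0.0, 700.0, 240.0], [0.0, 0.0, 1.0]])
rvec = np.array([0.0, 0.0, 0.0])
tvec = np.array([0.0, 0.0, 300.0])

R, _ = cv2.Rodrigues(rvec)

# Points on the chessboard plane have Z = 0, so the third column of [R|T]
# can be dropped: H maps plane coordinates [X, Y, 1] to pixels [u, v, 1] up to scale.
H = M @ np.column_stack([R[:, 0], R[:, 1], tvec])

# Go the other way: pixel -> plane coordinates.
q = np.array([400.0, 300.0, 1.0])
Q = np.linalg.inv(H) @ q
X, Y = Q[:2] / Q[2]  # divide by the third component, as described above
print(X, Y)          # plane coordinates, in the same units as the square size (mm)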
FINAL EDIT
I could not get the camera calibration methods to work. Anyway, I should have started with perspective projection and transforms. This way I made very good estimates with a perspective transform between the image plane and the physical plane (having generated the transform from 4 pairs of corresponding coplanar points on the two planes), then simply applied the transform to the image pixel points.
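For anyone trying the same thing, here is a sketch of that final approach with made-up correspondences:

import numpy as np
import cv2

# Four made-up pairs of corresponding coplanar points: image pixels -> plane coordinates (mm).
img_pts = np.float32([[100, 100], [500, 120], [480, 400], [120, 380]])
world_pts = np.float32([[0, 0], [400, 0], [400, 300], [0, 300]])

# 3x3 perspective transform from the image plane to the physical plane.
H = cv2.getPerspectiveTransform(img_pts, world_pts)

# Map a detected circle center from pixel coordinates to plane coordinates.
center = np.float32([[[300, 250]]])   # shape (1, 1, 2), as perspectiveTransform expects
world = cv2.perspectiveTransform(center, H)
print(world.ravel())                  # (X, Y) on the plane, in mm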
You said "i have the intrinsic camera matrix, and translational and rotational params of each of the image.” but these are translation and rotation from your camera to your chessboard. These have nothing to do with your circle. However if you really have translation and rotation matrices then getting 3D point is really easy.
Apply the inverse intrinsic matrix to your screen points in homogeneous notation: C^-1 * [u, v, 1], where u = col - w/2 and v = h/2 - row, with col, row being the image column and row and w, h the image width and height. As a result you will obtain a 3D point in so-called normalized camera coordinates, p = [x, y, z]^T. All you need to do now is to subtract the translation and apply the transposed rotation: P = R^T * (p - T). The order of operations is the inverse of the original one, which was rotate then translate; note that the transposed rotation performs the inverse of the original rotation but is much faster to compute than R^-1.
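A rough sketch of these steps in code, assuming the common convention where the intrinsic matrix from calibration already contains the principal point (so the centering above is folded into C), and picking a hypothetical depth to fix the scale:

import numpy as np
import cv2

# Placeholder calibration outputs for illustration.
C = np.array([[800.0, 0.0, 320.0], [0.0, 800.0, 240.0], [0.0, 0.0, 1.0]])
rvec = np.array([0.0, 0.1, 0.0])
T = np.array([10.0, 0.0, 500.0])
R, _ = cv2.Rodrigues(rvec)

# Screen point in homogeneous notation; with an OpenCV-style C the principal
# point is already inside the matrix, so raw (col, row) can be used directly.
uv1 = np.array([350.0, 260.0, 1.0])

# Ray in normalized camera coordinates (defined only up to scale).
p = np.linalg.inv(C) @ uv1

# Pick a point on that ray at a hypothetical depth, then move it to world coordinates.
Z = 500.0
p_cam = p * Z          # 3D point in the camera frame (p[2] == 1, so its depth is Z)
P = R.T @ (p_cam - T)  # subtract the translation, apply the transposed rotation
print(P)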
I have a pair of matched 2D features extracted from a rectified stereo image pair. Using the cvPerspectiveTransform function in OpenCV, I attempted to reconstruct those features in 3D. The result is not consistent with the actual object dimensions in the real world. I realize there is a function in the MATLAB calibration toolbox that converts 2D stereo features into a 3D point cloud; however, those features are lifted from the original images.
If I want to work with rectified images, is it possible to reconstruct the 3D locations from the 2D feature locations and the disparity information?
If you know the focal length (f) and the baseline width (b, the distance of the projection axis of both cameras) as well as the disparity (d) in a rectified stereo image pair, you can calculate the distance (Z) with the following formula:
Z = f*(b/d);
This follows from the following equations:
x_l = f*(X/Z); // projecting a 3D point onto the left image
x_r = f*((X+b)/Z); // projecting the same 3D point onto the right image
d = x_r - x_l = f * (b/Z); // calculating the disparity
Solving the last equation for Z should lead to the formula given above.
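In code, with made-up values (f in pixels, b in metres, and the matched columns measured relative to the principal point):

# Made-up values for illustration.
f = 700.0                  # focal length of the rectified cameras, in pixels
b = 0.12                   # baseline, in metres
x_l, x_r = 310.0, 345.0    # matched feature columns, relative to the principal point

d = x_r - x_l              # disparity, in pixels
Z = f * (b / d)            # depth, in metres

# From the first projection equation, x_l = f*(X/Z), the lateral position follows:
X = x_l * Z / f
print(X, Z)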