I'm doing a mobile augmented reality app. I need to calibrate my camera to get the intrinsic and extrinsic parameters using chessboard calibration.
Can I assum that if I calibrate my nexus 4, all nexus will have the same focal length, skew factor and distortion matrix ?
Thanks
Well, the answer can be both YES and NO. As you say, in real life none camera is exactly the same with another one, not even if they came from the same manufacturer. But, in order to make our lifes easier, yes we use this simplification, even for photogrammetric/computer vision projects, were the accuracy demands are quite high.
Most of the cameras come with undistortion operation coded into a camera pipeline so you most likely don't need to search for distortion parameters at all. Just check that straight lines at the image periphery are really straight. I expect the skew to be close to zero and fx=fy since pixels are square.
Apart from the parameters you mentioned there is also two for principal points Cx, Cy (intersection of an optical axis with the sensor that is often close to w/2, h/2). So overall you have only 3 parameters: F, Cx, Cy with the first one being the most variable among phones of the same model (from my experience). If you aren't using your phone to figure a relative position of another camera most likely you need to know only focal length accurately.
Obviously when you need to worry about a single parameter there are easier ways to get it than using a chessboard rig and trying to find extrinsic parameters in addition to the intrinsic ones. You can figure it out even without measurements - just quire a camera field of view (such as getHorizontalViewAngle()) and use
tan(fov) = image_width/2 / f
Alternatively you can do a simple measurement keeping your phone parallel to the target: for a vertical target of size H that produces image of h pixels you get f as
f/z = h/H
Well... if this camera has a built-in autofocus, the focal length will be changed all the time
Related
I am trying to find the coordinates of an object, which is detected from single camera, by using OpenCV. The camera will be mounted on the drone, looking through directly to the surface.
I have:
-Camera's coordinates from GPS sensor on the drone.
-Camera's height .
-Camera's intrinsic parameters.
3D Reconstruction formula
According to this formula, I need to find the extrinsic parameters to find the real world coordinates. I suppose to be use OpenCV’s solvePnP method to find extrinsic parameters. As I know, extrinsic parameters are about the camera location but my camera will be on the drone and the location will be change. Is the extrinsic parameters are constant just like the intrinsic parameters?
Is there any other way to do this calculation?
What you're trying to do is what is called monoplotting. In order to estimate XY real world coordinates from a single image you need to know the following:
X0, Y0, Z0 real world coordinates of your camera
Exterior orientation of your camera (Pitch, Roll and Yaw)
Interior orientation of your camera (focal length, principle point in x and y direction, radial and tangential distortion parameters)
So the XYZ of your camera and the exterior orientations are typically stored within the exif data of your image. This does however depend on the drone. There are some good python modules for extracting this information in the images. I use exifread.
The interior orientation can be more difficult to get, as you need to perform at camera calibration, which you can do with OpenCV. You should be able to use the tutorial Yunus linked in his comment. However there is a shortcut if you use photogrammetry software from pix4D, you can use their camera database which stores information about many different drones and their interior orientations. They are not perfect but should be alright for many use cases, see link.
When you have all of these parameters, you need to do the following:
Undistort your images
Create image coordinates of the points you wish to know the real world coordinate of, use undistorted image.
Create rotation matrices with rotations along X,Y,Z axis and do matrix multiplikation on them (RXRYRZ)
Apply collinearity equations
Regarding 1. you can use cv2.undistort for this, the link Yunus provided have a tutorial for this as well.
Regarding 3, I'm sure OpenCV probably can provide this matrix for you, but creating the function for your self is quite easy and good for understanding what is going on. See wikipedia: link It can be a little confusing which of the pitch, roll, yaw angles to use for which matrices. This all depends on how your cameras exterior coordinate system is. The typical convention is that pitch is around x, roll is around y and yaw is around z.
Regarding 4. The collinearity equations depends on the rotation matrices you created, see: link the term Z-Z0 just means the negative relative height above the object you try to find coordinates for. So if you do not have the relative height of your drone you need to know the height of your object and subtract the drones height from the objects height (you'll get a negative number).
I hope this helps you and have pointed you towards the right direction.
I need to find the intrinsic parameters of a CCTV camera using a set of historic footage images (That is all I got, no control on the environment, thus no chessboard calibration).
The good news is that I have the access to some ground-truth real-world coordinates, visible in most of the images.
Just wondering if there is any solid approach to come up with the camera intrinsic parameters.
P.S. I already found the homography matrix using cv2.findHomography in Python.
P.S. I have already tested QTcalib on two machines, but it is unable to visualize the images in the first place. Not sure what is wrong with it.
Thanks in advance.
intrinsic parameters contain both fx fy cx cy and skew with additional distortion parameters k1-k5 r1-r2.
Assuming you have no distortion and cx and cy are perfectly in the center. Image origin at top left as a normal understanding of the image. As you say you know some ground truth level 3D points.3D measurements are with respect to camera optical axis. Then this 3D point P can be projected into camera image plane called p. The P p O(the camera optical center) with center lines forms isosceles triangle.
fx / (p_x-cx) = P_z / P_x
fx = (p_x-cx) * P_z / P_x
The same goes for the fy. and usually fx and fy are the same.
This is under the perfect assumption that you don't have distortion on camera. If you start to have distortion, then you need to find enough sample points all over the image to form distortion understanding as shown below. One or 2 points won't give you the whole picture understanding.
There are some cheats in some papers that using sea vanishing lines(see ref, it is a series of works) or perfect 3D building vanishing points to detect the distortion. We start from extrinsic to intrinsic and it can get some good guess after some trial eventually. But it is very much in research and can not apply to general cases.
Ref: Han Wang, Wei Mou, Xiaozheng Mou, Shenghai Yuan, Soner Ulun, Shuai Yang and Bok-Suk Shin, An Automatic Self-Calibration Approach for Wide Baseline Stereo Cameras Using Sea Surface Images, unmanned system
If all you have is a video and a few 3d points, your best bet is probably to matchmove it, that is, do a manually assisted bundle adjustment using a 3D computer graphics environment, e.g. Blender. There are a lot of tutorials online on how to do it (example). To add the 3d points as constraints, you build some shapes representing them in the virtual world (e.g. some small spheres) and place them so that their relative positions match the ground truth you have, then add them to the tracker solution.
Assuming the static scene, with a single camera moving exactly sideways at small distance, there are two frames and a following computed optic flow (I use opencv's calcOpticalFlowFarneback):
Here scatter points are detected features, which are painted in pseudocolor with depth values (red is little depth, close to the camera, blue is more distant). Now, I obtain those depth values by simply inverting optic flow magnitude, like d = 1 / flow. Seems kinda intuitive, in a motion-parallax-way - the brighter the object, the closer it is to the observer. So there's a cube, exposing a frontal edge and a bit of a side edge to the camera.
But then I'm trying to project those feature points from camera plane to the real-life coordinates to make a kind of top view map (where X = (x * d) / f and Y = d (where d is depth, x is pixel coordinate, f is focal length, and X and Y are real-life coordinates). And here's what I get:
Well, doesn't look cubic to me. Looks like the picture is skewed to the right. I've spent some time thinking about why, and it seems that 1 / flow is not an accurate depth metric. Playing with different values, say, if I use 1 / power(flow, 1 / 3), I get a better picture:
But, of course, power of 1 / 3 is just a magic number out of my head. The question is, what is the relationship between optic flow in depth in general, and how do I suppose to estimate it for a given scene? We're just considering camera translation here. I've stumbled upon some papers, but no luck trying to find a general equation yet. Some, like that one, propose a variation of 1 / flow, which isn't going to work, I guess.
Update
What bothers me a little is that simple geometry points me to 1 / flow answer too. Like, optic flow is the same (in my case) as disparity, right? Then using this formula I get d = Bf / (x2 - x1), where B is distance between two camera positions, f is focal length, x2-x1 is precisely the optic flow. Focal length is a constant, and B is constant for any two given frames, so that leaves me with 1 / flow again multiplied by a constant. Do I misunderstand something about what optic flow is?
for a static scene, moving a camera precisely sideways a known amount, is exactly the same as a stereo camera setup. From this, you can indeed estimate depth, if your system is calibrated.
Note that calibration in this sense is rather broad. In order to get real accurate depth, you will need to in the end supply a scale parameter on top of the regular calibration stuff you have in openCV, or else there is a single uniform ambiguity of the 3D (This last step is often called going to the "metric" reconstruction from only the "Euclidean").
Another thing which is apart of broad calibration is lens distortion compensation. Before anything else, you probably want to force your cameras to behave like pin-hole cameras (which real-world cameras usually dont).
With that said, optical flow is definetely very different from a metric depth map. If you properly calibraty and rectify your system first, then optical flow is still not equivalent to disparity estimation. If your system is rectified, there is no point in doing a full optical flow estimation (such as Farnebäck), because the problem is thereafter constrained along the horizontal lines of the image. Doing a full optical flow estimation (giving 2 d.o.f) will introduce more error after said rectification likely.
A great reference for all this stuff is the classic "Multiple View Geometry in Computer Vision"
I'm using openCV the calibrateCamera function to calibrate my camera. I started from the tutorial implementation, but there seems something wrong.
The camera is looking down on a table and i use a chessboard with an area that covers about 1/2 or 1/4 of my total image. Since I aim to track a flat object that slides over this table, I also slide my chessboard over this table.
So my first question is: is it ok that i move my chessboard over this table? Or do I have to make some 3D movements in order to get some good result?
Because I was wondering: how does the function guesses the distance between the table and the camera? He has only a guess of his focal point, and he has only one "eye", so there is no depth vision.
My second question: how does the bloody thing work? :p Can anyone show me some implementation of this function?
Thx!
the camera calibration needs a seed of points to calculate the camera matrix and the the position of the central point of the camera, and the distortion matrices , if you want to use a chessboard you have to take in consideration its dimension(I never used the circles function because the detection of chessboard is easier ) , the dimension of the chessboard should be pair X unpair number so you can get a correct rotation matrix ! the calibration function needs minimum 8x set of chessboardCorners and ( I use 30 tell 50) it depends on how precise you want to be .the return value of the calibration function is the re-projection error this should be near to zero if the calibration is good.
the cameraCalibration take a the size of used chessboards ( you can use different chessboardSize) and the dimension ( in mm or cm or even m etc.. ) your result will depend on your given dimension.
By the way after getting the chessboardCorners you have to refines them with the function CornerSubPix you can set how good the refinement is in the function parameter.
In the internet you can find a lot docs about this subject.
http://www.ics.uci.edu/~majumder/vispercep/cameracalib.pdf
I hope it helps !
regarding the chessboard positions, I got best results with 25-30 images
First I do 3 -4 images that show the chessboard at different distances full frame half 1/3 1/4
then I make sure to go to each corner, each center of each edge plus 4 rotation on each axis XYZ. When using a 640x480 sensor my reprojection error was mostly around 0.1 or even better
here a few links that got me in the right direction:
How to verify the correctness of calibration of a webcam?
My problem statement is very simple. But I am unable to get the opencv calibration work for me. I am using the code from here : source code.
I have to take images parallel to the camera at a fixed distance. I tried taking test images (about 20 of them) only parallel to the camera as well as at different planes. Also I changed the size and the no of squares.
What would be the best way to calibrate in this scenario?
The undistorted image is cropped later, that's why it looks smaller.
After going through the images closely, the pincushion distortion seems to have been corrected. But the "trapezoidal" distortion still remains. Since the camera is mounted in a closed box, the planes at which I can take images is limited.
To simplify what Vlad already said: It is theoretically impossible to calibrate your camera with test images only parallel to the camera. You have to change your calibration board's orientation. In fact, you should have different orientation in each test image.
Check out the first two images in the link below to see how the calibration board should be slanted (or tilted):
http://www.vision.caltech.edu/bouguetj/calib_doc/
think about calibration problem as finding a projection matrix P:
image_points = P * 3d_points, where P = intrinsic * extrinsic
Now just bear with me:
You basically are interested in intrinsic part but the calibration algorithm has to find both intrinsic and extrinsic. Now, each column of projection matrix can be obtained if you select a 3D point at infinity, for example xInf = [1, 0, 0, 0]. This point is at infinity because when you transform it from homogeneous coordinates to Cartesian you get
[1/0, 0, 0]. If you multiply a projection matrix with a point at infinity you will get its corresponding column (1st for Xinf, 2nd for yInf, 3rd for zInf and 4th for camera center).
Thus the conclusion is simple - to get a projection matrix (that is a successful calibration) you have to clearly see points at infinity or vanishing points from the converging extensions of lines in your chessboard rig (aka end of the railroad tracks at the horizon). Your images don’t make it easy to detect vanishing points since you don’t slant your chessboard, nor rotate nor scale it by stepping back. Thus your calibration will always fail.