Camera calibration focal length twice as large as expected - opencv

I used the CameraCalibration node in Meshroom 2021.1.0 using a checkerboard grid to do a camera calibration. From what I understand, Meshroom is using OpenCV, so this question indirectly relates to the calibration process in OpenCV as well.
The lens I'm using is advertised as an 8 mm lens, so I was expecting a focal length somewhere between 7 and 9 mm. However, the calibration returned an fx value of 2541.273 and an fy value of 2641.111 (in pixels). I know the sensor pixel size is 6 microns, so when converting from pixels to mm I'm getting focal lengths of 15.247 mm and 15.847 mm respectively, which is right around double what I would expect.
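The pixel-to-millimetre conversion described above can be sketched as follows. This is a minimal illustration that assumes square 6 micron pixels and that fx and fy are reported in pixels; the function name is made up for the example.

```python
def focal_px_to_mm(focal_px: float, pixel_size_um: float) -> float:
    """Convert a focal length in pixels to millimetres, given the pixel pitch in microns."""
    return focal_px * pixel_size_um / 1000.0


PIXEL_SIZE_UM = 6.0  # sensor pixel size quoted in the question

fx_mm = focal_px_to_mm(2541.273, PIXEL_SIZE_UM)
fy_mm = focal_px_to_mm(2641.111, PIXEL_SIZE_UM)

# The reverse direction: what fx (in pixels) a nominal 8 mm lens would give.
expected_fx_px = 8.0 * 1000.0 / PIXEL_SIZE_UM

print(f"fx = {fx_mm:.3f} mm, fy = {fy_mm:.3f} mm")
print(f"expected fx for an 8 mm lens: {expected_fx_px:.1f} px")
```

Comparing the calibrated fx against the value expected from the nominal focal length, in pixels, is a quick sanity check before looking any deeper.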
The checkerboard I'm using has 50 mm squares. I specified the size of the square in the camera calibration, and I double checked the printed dimensions with calipers. I also verified that my images were full resolution compared to the expected size based on the sensor specifications, so it wasn't a case where the resolution was half or double the original sensor size or something like that.
Curious if there is anything obvious I may have missed that would cause the focal length in the calibration to come out double what is expected.
I went through a similar calibration process with my smartphone and the camera I was testing with advertised a focal length of 7 mm and the camera calibration returned in that case an fx of 7.21 mm and an fy of 7.20 mm. The only difference was the grid I was using in that test was using 30 mm squares and was 7 x 5 instead of 4 x 3, but the process to get those values was essentially the same.
Update:
I reran a camera calibration with a different set of images, and this time I got an fx of 23.07 mm and fy of 23.23 mm, so it would seem that the previous run that was off by a factor of 2 may have been a coincidence that it was off by 2. Given how inconsistent the focal length values are from one run to the next and how far off they are from expected values, I'm guessing that the errors that I'm seeing are due to poor calibration images being used in the process? The camera is fixed, so I'm moving the checkerboard on a surface, so mostly in a single plane. To get a good calibration do I just need a better variety of orientations that the checkerboard is captured in like different distances and different angles?
Is the size of the grid just too small for the field of view to get good calibration values from it? I calibrated with 80 calibration shots similar to the two above moving the board from one edge to the other.
I got a larger calibration target using the ChArUco pattern, and it looks like the values are more stable now, but every now and then, if I repeat the calibration, I can get numbers that are very far off. Should the board below be large enough to get stable calibration values?

Related

Will bad camera calibration affect pixel coordinates?

I am working with Turtlebots and ROS, and using a camera to find the pixel positions of a marker in the camera. I've moved over from simulations to a physical system. The issue I'm having is that the pixel positions in my physical system did not match the pixel positions in the simulated system, despite the marker and everything else being in the same position as in the simulations. There was a shift in the vertical pixel position of about 40 pixels when everything else, like the height between the camera and marker, the marker position, and the distance between the marker and camera, was the same in both the physical and simulated system. The simulated system does not need a camera calibration matrix; it is assumed to be ideal.
The resolution I'm using is 640x480, so the center pixels should be cx=320 and cy=240, but what I noticed in the camera calibration matrix I was using in the physical system was that the cx was around 318, which is pretty accurate, but the cy was around 202, which is far from what it should be. This also made me think that the shift in pixel positions in the vertical direction is shifted with about the same amount of pixels that I'm getting as an error.
So is it right to assume that the error in the center pixel in the calibration could be causing the error in the pixel positions?
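For reference, in the ideal pinhole model the vertical pixel coordinate is v = fy * Y / Z + cy, so any difference in cy moves every projected point by the same number of pixels. The sketch below only illustrates that relationship; the focal length and the 3D point are made-up values, and it does not say which of the two cy values is the correct one for the physical camera.

```python
def project_v(y_cam: float, z_cam: float, fy: float, cy: float) -> float:
    """Vertical pixel coordinate of a point (Y, Z) in the camera frame, pinhole model."""
    return fy * y_cam / z_cam + cy


FY = 600.0            # hypothetical focal length in pixels
Y, Z = 0.10, 1.50     # hypothetical marker position in the camera frame (metres)

v_ideal = project_v(Y, Z, FY, cy=240.0)       # ideal centre for a 640x480 image
v_calibrated = project_v(Y, Z, FY, cy=202.0)  # cy reported by the calibration

shift_px = v_ideal - v_calibrated
print(f"vertical shift: {shift_px:.1f} px")
```

The shift equals the difference between the two cy values and does not depend on where the marker is.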
I have been trying to calibrate a USB camera (Logitech C920 I think) and I've been using the camera_calibrator ROS package found here http://wiki.ros.org/camera_calibration to calibrate the camera. I think the camera calibration did not go that well, seeing as I always have a pretty big error in either cx or cy. Here are the calibration matrices.
First calibration matrix, used 15x10 vertices with size 0.25
Recalibrated but did not actually use this yet, calibrated with 8x6 size 0.25
Same as previous, some difference between the two
The checkerboards were on A4 papers.
Thanks in advance.
I believe the answer to your question is to answer how to perform a better camera calibration.
Quoting from Calib.io:
Choose the right size calibration target.
Perform calibration at the approximate working distance (WD) of your final application.
The target should have a high feature count.
Collect images from different areas and tilts.
Use good lighting.
Calibration is only as accurate as the calibration target used. Use laser or inkjet printed targets only to validate and test.
Per sample, proper mounting of calibration target and camera.
Remove bad observations. Carefully inspect reprojection errors.
Obtaining a low re-projection error does not equal a good camera calibration. Be careful of over fitting.

Required tolerance for camera calibration target

In reading about and experimenting with camera calibration I haven't seen any mention of the required tolerance for the placement of calibration targets. For example say I have a field of view of 200mm x 30mm and I want to be able to measure the position of objects in this field to within 1mm. I will calibrate my camera using a grid pattern and the OpenCV calibrateCamera flow. Say my calibration target is a printed chessboard grid with 5mm pitch. What is the tolerance on that 5mm spacing between corners on my target? Does a tighter tolerance result in more accurate pixel to real-world transformation? Does a tighter tolerance result in better distortion removal?
Note I'm measuring objects on a 2D plane, no depth measurement, and unfortunately I don't have the ability to move the calibration targets around and take multiple views of it. So I'm talking specifically about calibrating using a single view.
Calibration using a single view is a poor idea, generally speaking, because of the small number of independent samples it entails, so it is possible that the manufacturing tolerance of the calibration grid will be the least of your worries. But if you must...
The controlling factor here is the sensor's dot pitch. Given the nominal focal length of your lens, and that you want your calibration RMSE to be order of a few tenths of pixel, you can work out the angle spanned by, say, 1/10 of a pixel along the sensor's horizontal axis. Back projecting that at the nominal distance between the lens's exit pupil and the target will give you a length in 3D world that measures the uncertainty in a target's corner location at the calibration optimum. Your physical target points should be known at least as accurately, and normally better.
Example:
Setup: Dot pitch 5um, 16mm focal lens, 200mm working distance to target.
Backprojected 1/10 pixel: 200/16*0.5um =~ 6um.
Backprojected 1/2 pixel : 200/16*2.5um =~ 31um.
You can loosen that if you assume perfect chi-square scaling of the errors with the square root of the number of data points. If you have, say, 100 corners, you can multiply that by 10, i.e. ~300um for 1/2 pixel.
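The arithmetic in this example can be reproduced with a short script. It is only a sketch of the back-projection rule described above; the function and parameter names are made up for the illustration, and the square-root scaling is the idealized assumption stated in the answer.

```python
import math


def target_tolerance_um(pixel_pitch_um: float, focal_mm: float,
                        working_distance_mm: float, pixel_fraction: float,
                        n_points: int = 1) -> float:
    """Back-project a sub-pixel error onto the target plane, in microns.

    n_points > 1 applies the idealized sqrt(N) loosening from the answer.
    """
    scale = working_distance_mm / focal_mm  # target size per unit sensor size
    return scale * pixel_fraction * pixel_pitch_um * math.sqrt(n_points)


# Setup from the example: 5 um dot pitch, 16 mm lens, 200 mm working distance.
tenth_px_um = target_tolerance_um(5.0, 16.0, 200.0, 0.1)
half_px_um = target_tolerance_um(5.0, 16.0, 200.0, 0.5)
half_px_100_corners_um = target_tolerance_um(5.0, 16.0, 200.0, 0.5, n_points=100)

print(f"1/10 pixel: {tenth_px_um:.2f} um")
print(f"1/2 pixel:  {half_px_um:.2f} um")
print(f"1/2 pixel with 100 corners: {half_px_100_corners_um:.1f} um")
```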
Note that with this kind of tolerance, temperature control (for camera and target) may become a factor to take into account.

Is there a way to find mm per pixel value for a camera?

I need to implement dimension inspection of an object with a tolerance of 20 microns using image processing. To measure the dimension in mm, I need the mm per pixel value for pixel to mm conversion.
Camera and lens Specifications:
5 MP Matrix vision camera (2592 x 1944)
25 mm lens
How I tried to do it:
I used a 30 cm ruler to get the actual field of view in mm covered by the camera. I got a plot of the image using Matplotlib, as shown in the fig.
Image for scaling
From the image I got 31 mm as the actual width covered by the camera, and the camera resolution is 2592 x 1944. So I obtained mm/pixel = 31/2592 = 0.011959876.
But I want to know if this is the correct way to find the mm/pixel value using a centimeter scale, especially when a tolerance of 20 microns is needed in dimension inspection. If this is not the correct way, then a procedure for finding the mm/pixel value would be really helpful.
I believe what you are doing is really borderline. First of all, to be as precise as possible, I would use the right (or left) edge of the leftmost and rightmost ruler ticks, like I sketched here:
and then use this distance in pixels to calculate the mm/pixel calibration value. Even using this method, 20 µm is really tough to achieve. Let's say we can determine the ruler tick edge position with a precision of 2 pixels (very optimistic); then you would have an error of about 31 mm / 2580 * 2, which is about 25 µm.
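The error estimate above can be checked numerically. This is a small sketch using the figures quoted in this answer (31 mm spanning roughly 2580 pixels, 2 pixels of edge uncertainty); the variable names are made up for the illustration.

```python
SPAN_MM = 31.0             # ruler length visible between the two tick edges
SPAN_PX = 2580             # the same distance measured in the image
EDGE_UNCERTAINTY_PX = 2.0  # assumed precision of locating a tick edge

mm_per_px = SPAN_MM / SPAN_PX
error_um = mm_per_px * EDGE_UNCERTAINTY_PX * 1000.0

print(f"scale: {mm_per_px * 1000.0:.2f} um/px")
print(f"error from edge uncertainty: {error_um:.1f} um")
```

The result is slightly above 24 µm, which is already larger than the 20 µm tolerance being asked for.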
If you really need the 20 µm calibration precision, I would go for a microscope calibration target. I have always used one of those for this kind of calibration task.
20 microns over a field of view of 31 mm = 31000 µm corresponds to 1.7 pixel, so your measurement error must be smaller than that. This is a stringent requirement. Your ruler and manual operation are not appropriate.
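The conversion of the tolerance into pixels can be sketched as follows, assuming the 31 mm field of view spans the full 2592 pixel image width as stated in the question.

```python
FOV_UM = 31000.0     # 31 mm field of view, in microns
WIDTH_PX = 2592      # horizontal resolution of the camera
TOLERANCE_UM = 20.0  # required inspection tolerance

um_per_px = FOV_UM / WIDTH_PX
tolerance_px = TOLERANCE_UM / um_per_px

print(f"scale: {um_per_px:.2f} um/px")
print(f"tolerance: {tolerance_px:.2f} px")
```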
In the first place, you should check the magnitude of the lens distortion, which could very well exceed these 1.7 pixels. You will need a precise calibration procedure that can fit a deformation model to the image. For this purpose you should use a certified calibration target such as grid of dots or a chessboard pattern.
At the same time as the calibration software measures and compensates the distortion, it will provide the scale factor between physical units (knowing the grid spacing) and pixels. You can measure feature location on the target by blob analysis or gauging techniques, then use least-squares fitting of a model.
Software packages made for machine vision applications do contain such tools.
Also be aware that there can be a bias in the dimensional measurement of the object due to mis-location of the edges. Simply moving the light source can result in variations of the measured size.
If your objects are always the same and at the same place in the field of view, a cheap solution is to establish a repeatable measurement procedure in pixels, and physically measure one of the parts. This will give you a scale factor valid in the same conditions.
But simply moving the object will have a noticeable effect, both by changing the light reflection/shadows on edges and by having a different distortion.

Same intrinsic parameter for same camera?

I'm doing a mobile augmented reality app. I need to calibrate my camera to get the intrinsic and extrinsic parameters using chessboard calibration.
Can I assume that if I calibrate my Nexus 4, all Nexus 4 devices will have the same focal length, skew factor and distortion matrix?
Thanks
Well, the answer can be both YES and NO. As you say, in real life no camera is exactly the same as another, not even if they come from the same manufacturer. But, in order to make our lives easier, yes, we use this simplification, even for photogrammetric/computer vision projects, where the accuracy demands are quite high.
Most of the cameras come with undistortion operation coded into a camera pipeline so you most likely don't need to search for distortion parameters at all. Just check that straight lines at the image periphery are really straight. I expect the skew to be close to zero and fx=fy since pixels are square.
Apart from the parameters you mentioned there are also two for the principal point, Cx and Cy (the intersection of the optical axis with the sensor, which is often close to w/2, h/2). So overall you have only 3 parameters: F, Cx, Cy, with the first one being the most variable among phones of the same model (from my experience). If you aren't using your phone to figure out the relative position of another camera, most likely you only need to know the focal length accurately.
Obviously, when you need to worry about a single parameter there are easier ways to get it than using a chessboard rig and trying to find extrinsic parameters in addition to the intrinsic ones. You can figure it out even without measurements: just query the camera's field of view (such as getHorizontalViewAngle()) and use
tan(fov/2) = (image_width/2) / f
Alternatively you can do a simple measurement keeping your phone parallel to the target: for a vertical target of size H at distance z that produces an image of h pixels, you get f from
f/z = h/H
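Both estimates can be written out in a few lines. This is a sketch under the pinhole model with the half-angle relation tan(fov/2) = (image_width/2) / f; the function names and the example numbers (image width, field of view, target size, distance) are made up for illustration, and the focal length comes out in pixels.

```python
import math


def focal_from_fov(image_width_px: float, horizontal_fov_deg: float) -> float:
    """Focal length in pixels from the full horizontal field of view."""
    half_angle = math.radians(horizontal_fov_deg) / 2.0
    return (image_width_px / 2.0) / math.tan(half_angle)


def focal_from_target(h_px: float, target_size: float, distance: float) -> float:
    """Focal length in pixels from f / z = h / H (H and z in the same units)."""
    return h_px * distance / target_size


# Hypothetical example values.
f_from_fov = focal_from_fov(image_width_px=1280, horizontal_fov_deg=60.0)
f_from_target = focal_from_target(h_px=500, target_size=0.2, distance=0.5)

print(f"f from field of view: {f_from_fov:.1f} px")
print(f"f from target measurement: {f_from_target:.1f} px")
```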
Well... if this camera has a built-in autofocus, the focal length will be changed all the time

Relative Camera Pose Estimation using OpenCV

I'm trying to estimate the relative camera pose using OpenCV. The cameras in my case are calibrated (I know the intrinsic parameters of the camera).
Given the images captured at two positions, I need to find the relative rotation and translation between the two cameras. Typical translation is about 5 to 15 meters, and the yaw angle rotation between the cameras ranges between 0 and 20 degrees.
For achieving this, following steps are adopted.
a. Finding point correspondences using SIFT/SURF
b. Fundamental Matrix Identification
c. Estimation of Essential Matrix by E = K'FK and modifying E for singularity constraint
d. Decomposition Essential Matrix to get the rotation, R = UWVt or R = UW'Vt (U and Vt are obtained SVD of E)
e. Obtaining the real rotation angles from rotation matrix
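Step c above can be sketched with NumPy as follows. This is not the poster's code: it assumes both images share the same intrinsic matrix K, it enforces the singularity constraint by replacing the singular values with (s, s, 0), and the intrinsics and pose below are synthetic values used only to exercise the function.

```python
import numpy as np


def essential_from_fundamental(F: np.ndarray, K: np.ndarray) -> np.ndarray:
    """Compute E = K^T F K and project it onto singular values (s, s, 0)."""
    E = K.T @ F @ K
    U, S, Vt = np.linalg.svd(E)
    s = (S[0] + S[1]) / 2.0
    return U @ np.diag([s, s, 0.0]) @ Vt


# Synthetic test case: hypothetical intrinsics, 10 degree yaw, unit baseline along x.
K = np.array([[800.0, 0.0, 320.0],
              [0.0, 800.0, 240.0],
              [0.0, 0.0, 1.0]])
yaw = np.radians(10.0)
R_true = np.array([[np.cos(yaw), 0.0, np.sin(yaw)],
                   [0.0, 1.0, 0.0],
                   [-np.sin(yaw), 0.0, np.cos(yaw)]])
t_true = np.array([1.0, 0.0, 0.0])
t_cross = np.array([[0.0, -t_true[2], t_true[1]],
                    [t_true[2], 0.0, -t_true[0]],
                    [-t_true[1], t_true[0], 0.0]])
E_true = t_cross @ R_true          # essential matrix for x2 ~ R x1 + t

K_inv = np.linalg.inv(K)
F = K_inv.T @ E_true @ K_inv       # the fundamental matrix this pose would produce

E = essential_from_fundamental(F, K)
singular_values = np.linalg.svd(E, compute_uv=False)
print(singular_values)
```

With a fundamental matrix estimated from noisy matches, the two nonzero singular values of K^T F K will generally differ, which is why the projection step is there.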
Experiment 1: Real Data
For real data experiment, I captured images by mounting a camera on a tripod. Images captured at Position 1, then moved to another aligned Position and changed yaw angles in steps of 5 degrees and captured images for Position 2.
Problems/Issues:
The sign of the estimated yaw angle does not match the ground truth yaw angle. Sometimes 5 deg is estimated as 5 deg, but 10 deg as -10 deg and again 15 deg as 15 deg.
In the experiment only the yaw angle is changed; however, the estimated roll and pitch angles have nonzero values close to 180/-180 degrees.
Precision is very poor; in some cases the error between the estimated and ground truth angles is around 2-5 degrees.
How to find out the scale factor to get the translation in real world measurement units?
The behavior is same on simulated data also.
Has anybody experienced similar problems? Any clue on how to resolve them?
Any help from anybody would be highly appreciated.
(I know there are already many posts on similar problems, but going through all of them has not helped me. Hence posting one more time.)
In chapter 9.6 of Hartley and Zisserman, they point out that, for a particular essential matrix, if one camera is held in the canonical position/orientation, there are four possible solutions for the second camera matrix: [UWV' | u3], [UWV' | -u3], [UW'V' | u3], and [UW'V' | -u3].
The difference between the first and third (and second and fourth) solutions is that the orientation is rotated by 180 degrees about the line joining the two cameras, called a "twisted pair", which sounds like what you are describing.
The book says that in order to choose the correct combination of translation and orientation from the four options, you need to test a point in the scene and make sure that the point is in front of both cameras.
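A minimal NumPy sketch of this selection is shown below. It builds the four candidates from the SVD of E, triangulates one correspondence with a linear (DLT) method, and keeps the candidate for which the point has positive depth in both cameras. The helper names and the synthetic pose and point are made up for the illustration; image points are assumed to be in normalized coordinates (already multiplied by the inverse of K).

```python
import numpy as np

W = np.array([[0.0, -1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]])


def candidate_poses(E):
    """Return the four (R, t) candidates for the second camera."""
    U, _, Vt = np.linalg.svd(E)
    if np.linalg.det(U) < 0:   # keep both factors proper so R has det = +1
        U = -U
    if np.linalg.det(Vt) < 0:
        Vt = -Vt
    R1, R2, u3 = U @ W @ Vt, U @ W.T @ Vt, U[:, 2]
    return [(R1, u3), (R1, -u3), (R2, u3), (R2, -u3)]


def triangulate(P1, P2, x1, x2):
    """Linear triangulation of one point from two 3x4 camera matrices."""
    A = np.array([x1[0] * P1[2] - P1[0], x1[1] * P1[2] - P1[1],
                  x2[0] * P2[2] - P2[0], x2[1] * P2[2] - P2[1]])
    X = np.linalg.svd(A)[2][-1]
    return X[:3] / X[3]


def pick_pose(E, x1, x2):
    """Choose the candidate that puts the test point in front of both cameras."""
    P1 = np.hstack([np.eye(3), np.zeros((3, 1))])
    for R, t in candidate_poses(E):
        X = triangulate(P1, np.hstack([R, t.reshape(3, 1)]), x1, x2)
        if X[2] > 0 and (R @ X + t)[2] > 0:
            return R, t
    return None


# Synthetic check: 10 degree yaw, unit baseline along x, one point 5 units ahead.
yaw = np.radians(10.0)
R_true = np.array([[np.cos(yaw), 0.0, np.sin(yaw)],
                   [0.0, 1.0, 0.0],
                   [-np.sin(yaw), 0.0, np.cos(yaw)]])
t_true = np.array([1.0, 0.0, 0.0])
t_cross = np.array([[0.0, 0.0, 0.0], [0.0, 0.0, -1.0], [0.0, 1.0, 0.0]])  # [t_true]_x
E = t_cross @ R_true
X_world = np.array([0.3, -0.2, 5.0])
X_cam2 = R_true @ X_world + t_true
x1, x2 = X_world[:2] / X_world[2], X_cam2[:2] / X_cam2[2]

R_est, t_est = pick_pose(E, x1, x2)
```

In practice the test is usually run over many correspondences and the candidate with the most points in front of both cameras is kept, since a single noisy match can fail the check.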
For problems 1 and 2,
Look for "Euler angles" in wikipedia or any good math site like Wolfram Mathworld. You would find out the different possibilities of Euler angles. I am sure you can figure out why you are getting sign changes in your results based on literature reading.
For problem 3,
It should mostly have to do with the accuracy of your individual camera calibration.
For problem 4,
Not sure. How about, measuring a point from camera using a tape and comparing it with the translation norm to get the scale factor.
Possible reasons for bad accuracy:
1) There is a difference between getting reasonable and precise accuracy in camera calibration. See this thread.
2) The accuracy with which you are moving the tripod. How are you ensuring that there is no rotation of tripod around an axis perpendicular to surface during change in position.
I did not get your simulation concept. But, I would suggest the below test.
Take images without moving the camera or object. Now if you calculate relative camera pose, rotation should be identity matrix and translation should be null vector. Due to numerical inaccuracies and noise, you might see rotation deviation in arc minutes.
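One way to quantify the deviation in this test is to convert the estimated rotation matrix to a single rotation angle via its trace and express it in arc minutes. The sketch below assumes R is a 3x3 rotation matrix given as nested lists; the function name and the 0.05 degree example are made up for the illustration.

```python
import math


def rotation_angle_arcmin(R) -> float:
    """Rotation angle of a 3x3 rotation matrix, in arc minutes."""
    trace = R[0][0] + R[1][1] + R[2][2]
    # Clamp to guard against values slightly outside [-1, 1] from rounding.
    cos_theta = max(-1.0, min(1.0, (trace - 1.0) / 2.0))
    return math.degrees(math.acos(cos_theta)) * 60.0


identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]

# A small yaw of 0.05 degrees (3 arc minutes), standing in for estimation noise.
a = math.radians(0.05)
small_yaw = [[math.cos(a), 0.0, math.sin(a)],
             [0.0, 1.0, 0.0],
             [-math.sin(a), 0.0, math.cos(a)]]

angle_identity = rotation_angle_arcmin(identity)
angle_small_yaw = rotation_angle_arcmin(small_yaw)
print(f"{angle_identity:.3f} arcmin, {angle_small_yaw:.3f} arcmin")
```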
