Stereo calibration, do extrinsics change if the lens changes? - opencv

I have a stereo camera setup. Typically I would calibrate the intrinsics of each camera first, and then use this result to calibrate the extrinsics, i.e. the baseline between the cameras.
What happens if I now change, for example, the focus or zoom of the lenses? Of course I will have to re-calibrate the intrinsics, but what about the extrinsics?
My first thought would be no, since the actual camera bodies didn't move. But on second thought, doesn't the focal point inside the camera change when the focus changes? And isn't the extrinsic calibration actually the calibration between the two focal points of the cameras?
In short: should I re-calibrate the extrinsics of my setup after changing the intrinsics?
Thanks for any help!

Yes, you should.
It's about the optical center of each camera. Different lenses (and lens settings) put that center in different places (but hopefully still along the optical axis).

Related

Difference between stereo camera calibration vs two single camera calibrations using OpenCV

I have a vehicle with two cameras, left and right. Is there a difference between calibrating each camera separately and performing a "stereo calibration"? I am asking because I noticed that the OpenCV documentation has a stereoCalibrate function, and that there is also a stereo calibration tool for MATLAB. If I do a separate camera calibration on each and then perform a depth calculation using the undistorted images of each camera, will the results be the same?
I am not sure what the difference is between the two methods. I performed normal camera calibration for each camera separately.
For intrinsics, it doesn't matter. The added information ("pair of cameras") might make the calibration a little better though.
Stereo calibration gives you the extrinsics, i.e. transformation matrices between cameras. That's for... stereo vision. If you don't perform stereo calibration, you would lack the extrinsics, and then you can't do any depth estimation at all, because that requires the extrinsics.
TL;DR
You need stereo calibration if you want 3D points.
Long answer
There is a huge difference between single and stereo camera calibration.
The output of single camera calibration is the intrinsic parameters only (i.e. the 3x3 camera matrix and a number of distortion coefficients, depending on the model used). In OpenCV this is accomplished with cv2.calibrateCamera. You may check my custom library, which helps reduce the boilerplate.
When you do stereo calibration, the output is the intrinsics of both cameras plus the extrinsic parameters.
In OpenCV this is done with cv2.stereoCalibrate. OpenCV fixes the world origin in the first camera and then you get a rotation matrix R and translation vector t to go from the first camera (origin) to the second one.
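A minimal sketch of that pipeline, assuming a standard chessboard target (the file name patterns, board geometry and variable names below are illustrative, not taken from the question):

    import glob
    import cv2
    import numpy as np

    # Chessboard geometry (illustrative): 9x6 inner corners, 25 mm squares.
    pattern = (9, 6)
    objp = np.zeros((pattern[0] * pattern[1], 3), np.float32)
    objp[:, :2] = np.mgrid[0:pattern[0], 0:pattern[1]].T.reshape(-1, 2) * 25.0

    objpoints, imgpoints_l, imgpoints_r = [], [], []
    for fl, fr in zip(sorted(glob.glob("left_*.png")), sorted(glob.glob("right_*.png"))):
        img_l = cv2.imread(fl, cv2.IMREAD_GRAYSCALE)
        img_r = cv2.imread(fr, cv2.IMREAD_GRAYSCALE)
        ok_l, corners_l = cv2.findChessboardCorners(img_l, pattern)
        ok_r, corners_r = cv2.findChessboardCorners(img_r, pattern)
        if ok_l and ok_r:  # keep only pairs where both cameras see the board
            objpoints.append(objp)
            imgpoints_l.append(corners_l)
            imgpoints_r.append(corners_r)

    image_size = img_l.shape[::-1]  # (width, height)

    # Single-camera calibration: intrinsics only (camera matrix + distortion).
    _, K_l, D_l, _, _ = cv2.calibrateCamera(objpoints, imgpoints_l, image_size, None, None)
    _, K_r, D_r, _, _ = cv2.calibrateCamera(objpoints, imgpoints_r, image_size, None, None)

    # Stereo calibration: R, T map points from the first camera (world origin)
    # to the second. CALIB_FIX_INTRINSIC keeps K/D as computed above; drop the
    # flag to let stereoCalibrate refine the intrinsics as well.
    rms, K_l, D_l, K_r, D_r, R, T, E, F = cv2.stereoCalibrate(
        objpoints, imgpoints_l, imgpoints_r,
        K_l, D_l, K_r, D_r, image_size,
        flags=cv2.CALIB_FIX_INTRINSIC)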
So, why do we need the extrinsics? If you are using a stereo system for 3D scanning, you need them (and the intrinsics) to do triangulation and obtain 3D points in space: if you know the projections of a generic point p onto both cameras, you can calculate its position.
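As a rough illustration of that last point, here is a triangulation sketch with cv2.triangulatePoints; it assumes the K_l, K_r, R, T from the sketch above and a single matched, already-undistorted pixel pair (the coordinates are made up):

    import cv2
    import numpy as np

    # Projection matrices: camera 1 sits at the world origin, camera 2 is placed by R, T.
    P1 = K_l @ np.hstack([np.eye(3), np.zeros((3, 1))])
    P2 = K_r @ np.hstack([R, T.reshape(3, 1)])

    # One matched pixel per camera, as 2xN arrays (N = 1 here), already undistorted.
    pt_l = np.array([[640.0], [360.0]])
    pt_r = np.array([[610.0], [358.0]])

    X_h = cv2.triangulatePoints(P1, P2, pt_l, pt_r)  # homogeneous 4x1 result
    X = (X_h[:3] / X_h[3]).ravel()                   # 3D point in camera-1 coordinates
    print(X)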
To add something to what @Christoph correctly answered before: the intrinsics should be almost the same; however, cv2.stereoCalibrate may improve the calculation of the intrinsics if the flag CALIB_FIX_INTRINSIC is not set. This happens because the system composed of the two cameras and the calibration board is solved as a whole by numerical optimization.

How to rectify my own image to the cameras of the KITTI dataset using OpenCV

Based on the documentation of stereoRectify from OpenCV, one can rectify an image given the two camera matrices, their distortion coefficients, and a rotation-translation from one camera to the other.
I would like to rectify an image I took using my own camera to the stereo setup from the KITTI dataset. From their calibration files, I know the camera matrix and size of images before rectification of all the cameras. All their data is calibrated to their camera_0.
From this PNG, I know the position of each of their cameras relative to the front wheels of the car and relative to ground.
I can also do a monocular calibration on my camera and get a camera matrix and distortion coefficients.
I am having trouble coming up with the rotation and translation matrix/vector between the coordinate systems of the first and the second cameras, i.e. from their camera to mine or vice-versa.
I positioned my camera on top of my car at almost exactly the same height and almost exactly the same distance from the center of the front wheels, as shown in the PNG.
However, now I am at a loss as to how to create the joint rotation-translation matrix. In a normal stereo calibration, these are returned by the stereoCalibrate function.
I looked at some references about coordinate transformation but I don't have sufficient practice in them to figure it out on my own.
Any suggestions or references are appreciated!
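Not an answer to the missing R and t, but for reference, a minimal sketch of the stereoRectify call mentioned at the top of the question; every numeric value below is a placeholder, and R/t are exactly the unknowns being asked about:

    import cv2
    import numpy as np

    # Placeholders: K0/D0 would come from the KITTI calibration files,
    # K1/D1 from the monocular calibration of my own camera.
    K0 = np.array([[700.0, 0.0, 620.0], [0.0, 700.0, 190.0], [0.0, 0.0, 1.0]])
    D0 = np.zeros(5)
    K1 = np.array([[1000.0, 0.0, 640.0], [0.0, 1000.0, 360.0], [0.0, 0.0, 1.0]])
    D1 = np.zeros(5)
    image_size = (1242, 375)  # size before rectification, also a placeholder

    R = np.eye(3)                   # unknown rotation between the cameras (placeholder)
    t = np.array([0.5, 0.0, 0.0])   # unknown translation in metres (placeholder)

    R0, R1, P0, P1, Q, roi0, roi1 = cv2.stereoRectify(K0, D0, K1, D1, image_size, R, t)

    # Rectification maps, to be applied with cv2.remap on each image.
    map0x, map0y = cv2.initUndistortRectifyMap(K0, D0, R0, P0, image_size, cv2.CV_32FC1)
    map1x, map1y = cv2.initUndistortRectifyMap(K1, D1, R1, P1, image_size, cv2.CV_32FC1)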

Can I move my camera after intrinsic calibration?

I have a setup with 2 cameras where the extrinsic properties between the two cameras do not matter. Generally, I start my work by calibrating each camera intrinsically and then move on to image processing.
I was just thinking: the intrinsic calibration gives me a camera matrix that contains information on the focal length, the optical centre, etc., as well as the distortion coefficients. From my understanding, these parameters do not change as long as the camera lenses are not adjusted. Therefore, maybe I am able to move the cameras after all?
Maybe this idea just comes from my shallow understanding of camera calibration. Please share your opinions on this matter. Thanks!
Yes, you have the correct understanding of camera calibration.
A camera's intrinsic parameters do not change if you move the camera, that is what separates the intrinsic parameters from the extrinsic ones. As you point out, the intrinsic parameters may change if you adjust the lens. Careful: depending on the lens type, simply focusing could be such a change to the lens.
There may be small influences on the intrinsic parameters from moving the camera (as the camera is not perfectly rigid) or from changing surroundings (e.g. temperature), but they are small enough to be disregarded for most use cases.

Where the origin of the camera system really is?

When we compute the pose of the camera with respect to a primitive like a marker or a 3D model, etc., the origin of that primitive is usually precisely known, like the origin of a chessboard or a marker (in blue).
Now the question is: where is the origin of the camera (in black)? With respect to which reference is the translation vector of the pose expressed? How can we determine where it is?
The optical center is meant to be on the optical axis (ideally it projects to the center of the image), at a distance from the sensor equal to the focal length, which can be expressed in pixel units (knowing the pixel size).
You can see where the optical axis lies (it is the symmetry axis of the lens), but the optical center is somewhere inside the camera.
OpenCV uses the pinhole camera model to model cameras. The origin of the 3D coordinate system used in OpenCV, for camera calibration and other purposes, is the camera itself, or more specifically, the pinhole of the camera model. It is the point where all light rays that enter the camera converge to a point, and is also called the "centre of projection".
Real cameras with lenses do not actually have a pinhole. But by analysing images taken with the camera, it is possible to calculate a pinhole model which models the real camera's optics very closely. That is what OpenCV does when it calibrates your camera. As @Yves Daoust said, the pinhole of this model (and hence the 3D coordinate origin) will be a 3D point somewhere inside your camera (or possibly behind it, depending on its focal length), but it is not possible for OpenCV to say exactly where it is relative to your camera's body, because OpenCV knows nothing about the physical size or shape of your camera or its sensor.
Even if you knew exactly where the origin is relative to your camera's body, it probably would not be of much use, because you can't take any physical measurements with respect to a point that is located inside your camera without taking it apart! Really, you can do everything you need to do in OpenCV without knowing this detail.
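To make the reference frame concrete, here is a small sketch assuming a chessboard pose obtained with cv2.solvePnP (objp, corners, K and D are assumed to come from a standard calibration setup): the returned tvec is the board origin expressed in a frame whose own origin is that centre of projection, and the centre's position in board coordinates follows as -R^T t.

    import cv2
    import numpy as np

    # Assumed inputs: board corner coordinates objp (N, 3), their detections
    # corners (N, 1, 2), camera matrix K and distortion coefficients D.
    ok, rvec, tvec = cv2.solvePnP(objp, corners, K, D)

    R, _ = cv2.Rodrigues(rvec)   # rotation, board frame -> camera frame
    # A board point X_board maps to the camera frame as X_cam = R @ X_board + tvec,
    # where the camera frame's origin is the centre of projection (the "pinhole").

    # Position of that centre of projection, expressed in board coordinates:
    camera_center = -R.T @ tvec
    print(camera_center.ravel())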

Camera rotation and translation from tracking?

What is the best way to find camera rotation and translation from tracking the scene without calibrating the camera?
I'm not sure how you will skip calibration because you get the translation and rotation from the calibration process. Maybe instead of using two cameras you'll compare frame 0 to frame 1000 from the same (moving) camera, but the calibration process is the same.
Check out Chapter 11 in the Learning OpenCV book. If you want to skip lens correction, check the section heading "Computing extrinsics only." The function is cvFindExtrinsicCameraParams2.
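cvFindExtrinsicCameraParams2 is the legacy C name; in the current Python API the same "computing extrinsics only" step is cv2.solvePnP. A minimal sketch, assuming known intrinsics K, D and 3D-2D correspondences of tracked scene points (all names here are illustrative):

    import cv2

    # scene_points: (N, 3) float32 known 3D points of the tracked scene,
    # image_points: (N, 2) float32 their detections in the current frame.
    ok, rvec, tvec = cv2.solvePnP(scene_points, image_points, K, D)
    R, _ = cv2.Rodrigues(rvec)  # camera rotation; tvec is the translation
    print(R, tvec.ravel())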
