OpenCV Stereo Calibration of IR-Cameras

I have 2 webcams with removed IR blocking filters and applied visible light blocking filters. Thus, both cameras can only see IR light, so I cannot calibrate the stereo cameras by observing points on a chessboard (because I can't see the chessboard). Instead, my idea is to use a number of IR LEDs as a tracking pattern. I could attach the LEDs to a chessboard, for instance. AFAIK, the OpenCV stereoCalibrate function expects the objectPoints as well as imagePoints1 and imagePoints2, and returns both camera matrices and distortion coefficients as well as the fundamental matrix.
How many points in my images do I need to detect in order to get the function running properly? For the fundamental matrix I know the eight-point algorithm. So, are 8 points enough? The problem is, I don't want to use a huge number of IR LEDs as a tracking pattern.
Are there some better ways to do so?
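One way to obtain the imagePoints mentioned above is to detect the bright LED spots directly, for example with cv2.SimpleBlobDetector. A minimal sketch, assuming an IR frame where the LEDs show up as saturated blobs (threshold values and the file name are illustrative assumptions):

```python
import cv2
import numpy as np

# Detect bright IR-LED blobs in a single frame (parameters need tuning for a real rig).
params = cv2.SimpleBlobDetector_Params()
params.filterByColor = True
params.blobColor = 255          # LEDs appear as bright spots
params.filterByArea = True
params.minArea = 5
params.maxArea = 500
detector = cv2.SimpleBlobDetector_create(params)

frame = cv2.imread("ir_frame.png", cv2.IMREAD_GRAYSCALE)   # placeholder file name
_, binary = cv2.threshold(frame, 200, 255, cv2.THRESH_BINARY)
keypoints = detector.detect(binary)
led_centers = np.array([kp.pt for kp in keypoints], dtype=np.float32)
print(len(led_centers), "LED candidates found")
```

The detected centers from both cameras, matched against the known 3D LED positions on the rig, would then serve as imagePoints1/imagePoints2 and objectPoints for stereoCalibrate.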

Why not remove the filters, calibrate and then replace them?
(For pure curiosity's sake, what are you working on?)

Related

Difference between stereo camera calibration vs two single camera calibrations using OpenCV

I have a vehicle with two cameras, left and right. Is there a difference between me calibrating each camera separately vs me performing "stereo calibration"? I am asking because I noticed in the OpenCV documentation that there is a stereoCalibrate function, and also a stereo calibration tool for MATLAB. If I do separate camera calibration on each and then perform a depth calculation using the undistorted images of each camera, will the results be the same?
I am not sure what the difference is between the two methods. I performed normal camera calibration for each camera separately.
For intrinsics, it doesn't matter. The added information ("pair of cameras") might make the calibration a little better though.
Stereo calibration gives you the extrinsics, i.e. transformation matrices between cameras. That's for... stereo vision. If you don't perform stereo calibration, you would lack the extrinsics, and then you can't do any depth estimation at all, because that requires the extrinsics.
TL;DR
You need stereo calibration if you want 3D points.
Long answer
There is a huge difference between single and stereo camera calibration.
The output of single-camera calibration is the intrinsic parameters only (i.e. the 3x3 camera matrix and a number of distortion coefficients, depending on the model used). In OpenCV this is accomplished by cv2.calibrateCamera. You may check my custom library that helps reduce the boilerplate.
When you do stereo calibration, the output is the intrinsics of both cameras plus the extrinsic parameters.
In OpenCV this is done with cv2.stereoCalibrate. OpenCV fixes the world origin in the first camera and then you get a rotation matrix R and translation vector t to go from the first camera (origin) to the second one.
So, why do we need extrinsics? If you are using a stereo system for 3D scanning, then you need those (and the intrinsics) to do triangulation and obtain 3D points in space: if you know the projections of a general point p onto both cameras, then you can calculate its position.
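To make that concrete, here is a small, self-contained triangulation sketch with cv2.triangulatePoints; the intrinsics, extrinsics and the 3D test point are made-up placeholders standing in for real calibration output:

```python
import cv2
import numpy as np

# Illustrative values; in practice K1, K2, R, t come from cv2.stereoCalibrate.
K1 = K2 = np.array([[700.0, 0, 320], [0, 700, 240], [0, 0, 1]])
R = np.eye(3)                      # camera 2 parallel to camera 1
t = np.array([[-60.0], [0], [0]])  # ~60 mm baseline

P1 = K1 @ np.hstack([np.eye(3), np.zeros((3, 1))])  # world origin = camera 1
P2 = K2 @ np.hstack([R, t])

# Fake a 3D point, project it into both views, then recover it by triangulation.
X = np.array([[100.0], [50.0], [2000.0], [1.0]])    # homogeneous, in mm
x1 = P1 @ X
x2 = P2 @ X
x1, x2 = x1[:2] / x1[2], x2[:2] / x2[2]             # pixel coordinates in each camera

X_hat = cv2.triangulatePoints(P1, P2, x1, x2)
print((X_hat[:3] / X_hat[3]).ravel())               # ~ [100, 50, 2000]
```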
To add something to what @Christoph correctly answered before: the intrinsics should be almost the same; however, cv2.stereoCalibrate may improve the calculation of the intrinsics if the flag CALIB_FIX_INTRINSIC is not set. This happens because the system composed of the two cameras and the calibration board is solved as a whole by numerical optimization.
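A sketch of that two-step flow under the usual assumptions; the point lists are placeholders that a detector must fill, only the OpenCV calls themselves are the point here:

```python
import cv2
import numpy as np

image_size = (640, 480)          # (width, height), illustrative
# obj_pts: list of (N,3) float32 arrays with the board/LED points for each view;
# img_pts1 / img_pts2: matching lists of detected (N,1,2) image points per camera.

# 1) Single-camera calibration: intrinsics (K) and distortion coefficients only.
rms1, K1, d1, _, _ = cv2.calibrateCamera(obj_pts, img_pts1, image_size, None, None)
rms2, K2, d2, _, _ = cv2.calibrateCamera(obj_pts, img_pts2, image_size, None, None)

# 2) Stereo calibration: adds the extrinsics R, T (camera 1 -> camera 2),
#    plus the essential and fundamental matrices E and F.
flags = cv2.CALIB_FIX_INTRINSIC   # drop this flag to let stereoCalibrate refine K1/K2 jointly
rms, K1, d1, K2, d2, R, T, E, F = cv2.stereoCalibrate(
    obj_pts, img_pts1, img_pts2, K1, d1, K2, d2, image_size, flags=flags)
```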

How to improve accuracy of camera extrinsics calibration

I have a multi-camera system where the fields of view are mostly non-overlapping. I have been researching methods to calibrate the camera extrinsics, and the first thing I'm going to try is to take a picture of a chessboard at a known location and use solvePnP from OpenCV to find the extrinsic rotation and translation vectors for each camera separately (following the method described in the answer here).
My problem is that this method uses only one measurement, and like every measurement it is prone to errors. I assume that by taking multiple measurements, either by changing the position or the orientation of the chessboard, the accuracy can be improved. But what would be the best way to combine the rotation and translation obtained from the different measurements? A simple average?
In theory I would think that an option could be using solvePnP on all the points at the same time. Since I am calculating extrinsics, the camera can't be moved, so I would have to change the position and/or orientation of the board for each picture and measure the 3D point positions as accurately as possible each time.
I'm also wondering if using two chessboards in the same picture would be a possible solution, even if OpenCV doesn't seem to support multiple chessboard detection.
Is there a better way to measure extrinsics or anything that I'm missing?
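For reference, a minimal sketch of the single-shot approach described in the question: detect the board, feed its known world coordinates to cv2.solvePnP and convert the result into a camera pose. Board size, square size, intrinsics and the file name are illustrative assumptions:

```python
import cv2
import numpy as np

pattern_size = (9, 6)            # inner corners of the chessboard (illustrative)
square_size = 25.0               # square edge length in mm (illustrative)
K = np.array([[800.0, 0, 640], [0, 800, 360], [0, 0, 1]])   # intrinsics from a prior calibration
dist = np.zeros(5)               # assuming distortion was already estimated

# Board corner coordinates in the world frame (here: board lying in the z=0 plane;
# add the measured offset/rotation that places the board at its known world position).
obj_pts = np.zeros((pattern_size[0] * pattern_size[1], 3), np.float32)
obj_pts[:, :2] = np.mgrid[0:pattern_size[0], 0:pattern_size[1]].T.reshape(-1, 2) * square_size

img = cv2.imread("cam0_board.png", cv2.IMREAD_GRAYSCALE)    # placeholder file name
found, corners = cv2.findChessboardCorners(img, pattern_size)
if found:
    corners = cv2.cornerSubPix(
        img, corners, (11, 11), (-1, -1),
        (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER, 30, 1e-3))
    ok, rvec, tvec = cv2.solvePnP(obj_pts, corners, K, dist)
    R, _ = cv2.Rodrigues(rvec)
    cam_pos = -R.T @ tvec        # camera position in world coordinates
```

With several board placements, one option in the spirit of the question would be to concatenate all boards' 3D/2D correspondences and run a single solvePnP over them, rather than averaging per-shot poses.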

Live 360° Panorama Image Stitching implementation

I am planning to implement a live 360° panorama stitcher having 6 cameras of the same model.
I came across the stitching_detailed.cpp implementation from OpenCV. The problem is that it takes around 1 second to stitch only 2 images together using my desired parameters, which is fairly slow.
As my application should run in real time, I need to be able to stitch 6 images together in around 100 ms for it to be "acceptable". The output resolution should be around 0.2 megapixels. Therefore, I am starting my own implementation in C++, based pretty much on what is done in stitching_detailed. I am aiming to use the CUDA functions in OpenCV as much as possible (some of them are not even used in stitching_detailed).
I have been carefully studying the stitching pipeline on which the previous algorithm is based, as described in Images stitching by OpenCV and in the paper Automatic Panoramic Image Stitching using Invariant Features.
As the stitching pipeline is very general, I have made several assumptions in order to simplify it and speed it up, and I would like some feedback on whether they are valid:
All the images I provide to the algorithm are guaranteed to be part of the panorama image, so I do not have to do any extra checking on that.
The 6 cameras will be fixed in position and orientation. Therefore, I know beforehand the order in which the cameras need to be stitched into the panorama picture. I can therefore avoid trying to match images from cameras that are not contiguous.
As the cameras are going to remain static, it would be valid to perform the registration step, in order to get the camera orientation matrix R, only once (as a kind of initialization). Afterwards, I could perform only the compositing block for subsequent frames (again, all this assuming the cameras remain completely static).
I also have the following questions...
I can indeed calibrate the cameras prior to my application and obtain each camera's intrinsic parameter matrix K and its respective distortion parameters. Could I plug K into the stitching pipeline and therefore avoid the K calculation in the registration step?
What other thing (if any) could camera calibration bring into the pipeline? Distortion correction?
If my previous assumption about executing only the compositing block is correct... could I still take out some parts of it? My guess is that maybe the seam finder should be run only once (in the initialization of the algorithm).
Is exposure compensation needed at all for my application case? (As the cameras are literally the same model.)
Any lead would be deeply appreciated, thanks!
The first thing you can do to reduce your processing time is to calibrate your cameras so that you don't need to process images to find homography matrices based on features. Find them beforehand so that they are constant matrices.
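A sketch of that idea for a fixed rig: estimate each pairwise homography once at start-up from matched features, then only warp (and blend) per frame. The helper below is an assumption for illustration, not part of the stitching pipeline itself:

```python
import cv2
import numpy as np

def estimate_homography(img_ref, img_src):
    """One-off registration between two adjacent, static cameras (ORB + RANSAC)."""
    orb = cv2.ORB_create(2000)
    k_ref, d_ref = orb.detectAndCompute(img_ref, None)
    k_src, d_src = orb.detectAndCompute(img_src, None)
    matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
    matches = matcher.match(d_src, d_ref)
    src = np.float32([k_src[m.queryIdx].pt for m in matches]).reshape(-1, 1, 2)
    dst = np.float32([k_ref[m.trainIdx].pt for m in matches]).reshape(-1, 1, 2)
    H, _ = cv2.findHomography(src, dst, cv2.RANSAC, 3.0)
    return H

# Initialization (once): the cameras are static, so H can be treated as constant.
# H = estimate_homography(ref_frame, src_frame)

# Per frame (compositing only): warp the new frame into the reference view and blend.
# warped = cv2.warpPerspective(new_src_frame, H, (out_width, out_height))
```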

How to take stereo images using single camera?

I want to find the depth map for stereo images. At present I am working on images from the internet, but I want to take stereo images myself so that I can work on my own data. How do I take good stereo images without much noise? I have a single camera. Is it necessary to do rectification? How much distance must be kept between the cameras?
Not sure I've understood your problem correctly - will try anyway.
I guess you're currently working with images from Middlebury or something similar. If you want to use similar algorithms you have to rectify your images, because they are based on the assumption that corresponding pixels are on the same line in all images. If you actually want depth images (!= disparity images) you also need the camera extrinsics.
Your setup should have two cameras, and you have to make sure that they don't change their relative position/orientation - otherwise your rectification will break apart. In the first step you have to calibrate your system to get the intrinsic and extrinsic camera parameters. For that you can either use some tool or roll your own with (for example) OpenCV (calib module). Print out a calibration board to calibrate your system. Afterwards you can take images and use the calibration to rectify them.
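A hedged sketch of that calibrate-then-rectify flow in OpenCV; the calibration values and frames below are stand-ins for what cv2.stereoCalibrate and your cameras would actually give you:

```python
import cv2
import numpy as np

# Illustrative placeholders; in practice K1, d1, K2, d2, R, T come from cv2.stereoCalibrate.
image_size = (1280, 720)                    # (width, height)
K1 = K2 = np.array([[700.0, 0, 640], [0, 700, 360], [0, 0, 1]])
d1 = d2 = np.zeros(5)
R = np.eye(3)                               # relative rotation between the cameras
T = np.array([-60.0, 0.0, 0.0])             # relative translation (~60 mm baseline)

# Compute rectification transforms once...
R1, R2, P1, P2, Q, roi1, roi2 = cv2.stereoRectify(K1, d1, K2, d2, image_size, R, T)
map1x, map1y = cv2.initUndistortRectifyMap(K1, d1, R1, P1, image_size, cv2.CV_32FC1)
map2x, map2y = cv2.initUndistortRectifyMap(K2, d2, R2, P2, image_size, cv2.CV_32FC1)

# ...then remap every image pair so corresponding points end up on the same row,
# which is what block-matching stereo algorithms assume.
left_raw = right_raw = np.zeros((720, 1280), np.uint8)   # stand-ins for real frames
left_rect = cv2.remap(left_raw, map1x, map1y, cv2.INTER_LINEAR)
right_rect = cv2.remap(right_raw, map2x, map2y, cv2.INTER_LINEAR)
```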
Regarding color-noise:
You could make your aperture very small and use long exposure times. In my opinion this is useless because real-world situations have to deal with such things anyway.
In short, there are plenty of stereo images on the internet that are already rectified. If you want to take your own stereo images you have to follow these three steps:
The relationship between the distance to the object z (mm) and the disparity in pixels D is inverse: z = fb/D, where f is the focal length in pixels and b is the camera separation in mm. Select b such that you have at least several pixels of disparity (see the quick numeric check after this list);
If you know the camera intrinsic matrix and have compensated for radial distortion, you still have to rectify your images in order to ensure that matches are located in the same row. For this you need to find a fundamental matrix, recover the essential matrix, apply rectifying homographies and update your intrinsic camera parameters... or use stereo pairs from the Internet.
A low level of noise in the camera image is helped by brightly illuminated scenes, a large aperture, large pixel size, etc.; however, depending on your setup you can still end up with a very noisy disparity map. The way to reduce this noise is to trade off accuracy and use larger correlation windows. Another way to clean up a disparity map is to use various validation techniques such as:
error validation;
uniqueness validation or back-and-forth validation;
blob-noise suppression, etc.
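As a quick numeric check of the z = fb/D relation from step 1 (numbers are purely illustrative):

```python
# z = f*b/D  =>  D = f*b/z
f = 700.0      # focal length in pixels (illustrative)
b = 60.0       # baseline in mm (illustrative)
for z in (500.0, 1000.0, 3000.0):
    print(f"z = {z:5.0f} mm  ->  disparity = {f * b / z:5.1f} px")
# 84 px at 0.5 m, 42 px at 1 m, only 14 px at 3 m: widen the baseline (or use a
# longer focal length) if you need usable depth resolution at larger distances.
```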
In my experience:
-I did the rectification, so I had to obtain the fundamental matrix, and this may not be correct with some image pairs.
-Higher camera resolution is better for the matching. I use OpenCV, which has an implementation of the BRISK descriptor; it was useful for me.
-Try to cover the same area and try not to do unnecessary rotations.
-Once you understand the theory, OpenCV is a good friend. Here are some results, but I am still working on it:
Depth map:
Rectified images:

OpenCV: Camera Pose Estimation

I am trying to match two overlapping images captured with a camera. To do this, I'd like to use OpenCV. I already extracted the features with the SurfFeatureDetector. Now I am trying to compute the rotation and translation vector between the two images.
As far as I know, I should use cvFindExtrinsicCameraParams2(). Unfortunately, this method requires objectPoints as an argument. These objectPoints are the world coordinates of the extracted features. These are not known in the current context.
Can anybody give me a hint how to solve this problem?
The problem of simultaneously computing relative pose between two images and the unknown 3d world coordinates has been treated here:
Berthold K. P. Horn. Relative orientation revisited. Artificial Intelligence Laboratory, Massachusetts Institute of Technology.
EDIT: here is a link to the paper:
http://citeseer.ist.psu.edu/viewdoc/summary?doi=10.1.1.64.4700
Please see my answer to a related question where I propose a solution to this problem:
OpenCV extrinsic camera from feature points
EDIT: You may want to take a look at bundle adjustment too:
http://en.wikipedia.org/wiki/Bundle_adjustment
That assumes an initial estimate is available.
EDIT: I found some code resources you might want to take a look at:
Resource I:
http://www.maths.lth.se/vision/downloads/
Two View Geometry Estimation with Outliers
C++ code for finding the relative orientation of two calibrated cameras in the presence of outliers. The obtained solution is optimal in the sense that the number of inliers is maximized.
Resource II:
http://lear.inrialpes.fr/people/triggs/src/
Relative orientation from 5 points: a somewhat more polished C routine implementing the minimal solution for relative orientation of two calibrated cameras from unknown 3D points. 5 points are required and there can be as many as 10 feasible solutions (but 2-5 is more common). Also requires a few CLAPACK routines for linear algebra. There's also a short technical report on this (included with the source).
Resource III:
http://www9.in.tum.de/praktika/ppbv.WS02/doc/html/reference/cpp/toc_tools_stereo.html
vector_to_rel_pose: compute the relative orientation between two cameras given image point correspondences and known camera parameters, and reconstruct 3D space points.
There is a theoretical solution; however, the OpenCV implementation of camera pose estimation lacks the needed tools.
The theoretical approach:
Step 1: extract the homography (the matrix describing the geometric transform between the images). Use findHomography().
Step 2: decompose the resulting matrix into rotation and translation. Use cv::solvePnP().
Problem: findHomography() returns a 3x3 matrix, corresponding to a projection from one plane to another, while solvePnP() deals with a full 3D rotation/translation (a 3x4 [R|t] matrix). I think that with some approximations you can modify solvePnP to give you some results, but it requires a lot of math and a very good understanding of 3D geometry.
Read more at http://en.wikipedia.org/wiki/Transformation_matrix
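As a side note, later OpenCV releases added cv2.findEssentialMat and cv2.recoverPose, which address this relative-pose-from-matches problem directly for a calibrated camera. A hedged sketch (pts1/pts2 are the matched feature coordinates and K the intrinsic matrix, all assumed to be available):

```python
import cv2
import numpy as np

# pts1, pts2: Nx2 float arrays of matched feature coordinates in the two images.
# K: 3x3 intrinsic matrix (the same camera took both images).
E, inlier_mask = cv2.findEssentialMat(pts1, pts2, K, method=cv2.RANSAC,
                                      prob=0.999, threshold=1.0)
# Decompose E and keep the physically valid (R, t); t is only recovered up to scale.
n_inliers, R, t, pose_mask = cv2.recoverPose(E, pts1, pts2, K, mask=inlier_mask)
```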
