OpenCV ChArUco calibration board

I have an issue with ChArUco calibration. I have a large pre-printed board lying around that I would like to use, but my calibration results are useless. When generating my own board, I found out that it uses the correct markers but that the checkerboard is inverted, leading to different corner positions on the board. I use the OpenCV example (create_board_charuco.cpp) with the following parameters: ./create_board -d=0 -ml=120 -sl=150 -w=9 -h=4 -bb=1 -si=1 out_board.jpg
I assume this difference is also why my calibration results are bad. The resulting calibration patterns are below:
What the board I have looks like:
https://calib.io/pages/camera-calibration-pattern-generator
What the layout I generated with OpenCV looks like:

This is due to a change made in v4.6.0 of OpenCV. A legacy flag (default on) has already been added and should be present in the next minor release.
https://github.com/opencv/opencv_contrib/issues/3291
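If your OpenCV build already exposes that flag (in the reworked aruco API it appears as CharucoBoard::setLegacyPattern; availability depends on your exact version), regenerating the board with the old layout could look roughly like the Python sketch below. The dictionary and square/marker sizes mirror the command line above; the output size and margin are placeholder choices of mine.

```python
import cv2

# Assumes a build where cv2.aruco.CharucoBoard exposes setLegacyPattern
# (added after the 4.6.0 pattern change discussed above).
dictionary = cv2.aruco.getPredefinedDictionary(cv2.aruco.DICT_4X4_50)  # -d=0 in the sample
board = cv2.aruco.CharucoBoard((9, 4), 150.0, 120.0, dictionary)       # -w, -h, -sl, -ml
board.setLegacyPattern(True)   # reproduce the pre-4.6.0 checkerboard layout

# 9*150 x 4*150 board plus a 20 px margin on each side; -bb=1 -> borderBits=1.
img = board.generateImage((1390, 640), marginSize=20, borderBits=1)
cv2.imwrite("out_board.jpg", img)
```

If you go this route, remember to set the same flag on the board object you use during detection/calibration, so that corner IDs match the printed pattern.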


Should a chessboard/charuco board take up most of the image when performing camera calibration?

I've spent some time trying to calibrate two similar cameras (ExCam IPQ1715 and ExCam IPQ1765) to varying degrees of success, with the eventual goal of using them for short-range photogrammetry. I've been using a charuco board, along with the OpenCV Charuco calibration library, and have noticed that the quality of my calibration is closely tied to how much of the images is taken up by the board. (I measure calibration quality by RMS reprojection error given by OpenCV, and also by just seeing if the undistorted images appear to have straighter lines on the board than the originals).
I'm still pretty inexperienced, and there have been other factors messing with my calibration (leaving autofocus on, OpenCV charuco identification sometimes getting strange false positives on some images without me noticing), so my question is less about my results and more about best practice for camera calibration in general:
How crucial is it that the board (charuco, chessboard) take up most of the image space? Is there generally a minimum amount that it should cover? Is this even an issue at all, or am I likely mistaking it for another cause of bad calibration?
I've seen lots of calibration tutorials online where the board seems to take up a small portion of the image, but then have also found other people experiencing similar issues. In short, I'm horribly lost.
Any guidance would be awesome - thanks!
Consider that camera calibration is a model-fitting problem,
i.e. the model parameters are optimized against the measurements.
So you should pay attention to the following:
If the board appears so small in the image that its distortion is not visible,
is it possible to optimize the distortion parameters from such an image?
If the pattern is only distributed near the center of the image,
is it possible to estimate valid parameter values for regions far from the center?
(This would be an extrapolation.)
If the pattern distribution is not uniform, the density of the data can affect the results,
e.g. with least-squares optimization, errors in regions with little data can effectively be ignored.
Therefore, my suggestions are:
Pattern images that are extremely small are useless.
The data should cover the entire field of view of the camera image, and the distribution should be as uniform as possible.
Use enough data; too few images can cause overfitting.
Check the pattern recognition results of all images (sample code often omits this); see the sketch below.
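To make the last two suggestions concrete, here is a minimal Python sketch (the file pattern and board size are placeholders, and it uses a plain chessboard for simplicity) that reports which images fail detection and accumulates all detected corners into a single coverage image, so sparsely covered regions of the field of view are easy to spot:

```python
import cv2
import glob
import numpy as np

pattern_size = (9, 6)          # interior corners; adjust to your board
coverage = None

for path in sorted(glob.glob("calib_*.png")):   # placeholder file pattern
    img = cv2.imread(path)
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    found, corners = cv2.findChessboardCorners(gray, pattern_size)
    print(path, "detected" if found else "FAILED")
    if not found:
        continue
    if coverage is None:
        coverage = np.zeros(gray.shape, dtype=np.uint8)
    # Mark every detected corner so the combined map shows where data exists.
    for (x, y) in corners.reshape(-1, 2):
        cv2.circle(coverage, (int(x), int(y)), 4, 255, -1)

if coverage is not None:
    cv2.imwrite("corner_coverage.png", coverage)  # sparse areas = poorly constrained regions
```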

AR With External Tracking - Alignment is wrong, values are right

I recently managed to get my augmented reality application up and running close to what is expected. However, I'm having an issue where, even though the values are correct, the augmentation is still off by some translation! It would be wonderful to get this solved as I'm so close to having this done.
The system utilizes an external tracking system (Polaris Spectra stereo optical tracker) with IR-reflective markers to establish global and reference frames. I have a LEGO structure with a marker attached which is the target of the augmentation, a 3D model of the LEGO structure created using CAD with the exact specs of its real-world counterpart, a tracked pointer tool, and a camera with a world reference marker attached to it. The virtual space was registered to the real world using a toolset in 3D Slicer, a medical imaging software which is the environment I'm developing in. Below are a couple of photos just to clarify exactly the system I'm dealing with (May or may not be relevant to the issue).
So a brief overview of exactly what each marker/component does (Markers are the black crosses with four silver balls):
The world marker (1st image on right) is the reference frame for all other markers' transformations. It is fixed to the LEGO model so that a single registration can be done for the LEGO's virtual equivalent.
The camera marker (1st image, attached to camera) tracks the camera. The camera is registered to this marker by an extrinsic calibration performed using cv::solvePnP().
The checkerboard is used to acquire data for extrinsic calibration using a tracked pointer (not shown) and cv::findChessboardCorners().
Up until now I've been smashing my face against the mathematics behind the system until everything finally lined up. When I move my estimate of the camera origin to the reference origin, the translation vector between the two is about [0; 0; 0], so all of the registration appears to work correctly. However, when I run my application, I get the following results:
As you can see, there's a strange offset in the augmentation. I've tried removing distortion correction on the image (currently done with cv::undistort()), but it just makes the issue worse. The rotations are all correct and, as I said before, the translations all seem fine. I'm at a loss for what could be causing this. Of course, there's so much that can go wrong during implementation of the rendering pipeline, so I'm mostly posting this here under the hope that someone has experienced a similar issue. I already performed this project using a webcam-based tracking method and experienced no issues like this even though I used the same rendering process.
I've been purposefully a little ambiguous in this post to avoid bogging down readers with the minutia of the situation as there are so many different details I could include. If any more information is needed I can provide it. Any advice or insight would be massively appreciated. Thanks!
Here are a few tests that you could do to validate that each module works well.
First verify your extrinsic and intrinsic calibrations:
Check that the position of the virtual scene-marker with respect to the virtual lego scene accurately corresponds to the position of the real scene-marker with respect to the real lego scene (e.g. the real scene-marker may have moved since you last measured its position).
Same for the camera-marker, which may have moved since you last calibrated its position with respect to the camera optical center.
Check that the calibration of the camera is still accurate. For such a camera, prefer a camera matrix of the form [fx,0,cx;0,fy,cy;0,0,1] (i.e. with a skew fixed to zero) and estimate the camera distortion coefficients (NB: OpenCV's undistort functions do not support camera matrices with non-zero skews; using such matrices may not raise any exception but will result in erroneous undistortions).
Check that the marker tracker does not need to be recalibrated.
Then verify the rendering pipeline, e.g. by checking that the scene-marker reprojects correctly into the camera image when moving the camera around.
If it does not reproject correctly, there is probably an error with the way you map the OpenCV camera matrix into the OpenGL projection matrix, or with the way you map the OpenCV camera pose into the OpenGL model view matrix. Try to determine which one is wrong using toy examples with simple 3D points and simple projection and modelview matrices.
If it reprojects correctly, then there probably is a calibration problem (see above).
Beyond that, it is hard to guess what could be wrong without directly interacting with the system. If I were you and I still had no idea where the problem could be after doing the tests above, I would try to start back from scratch and validate each intermediate step using toy examples.
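As a concrete form of the reprojection test mentioned above, a minimal Python sketch is given below. Every value in it is a placeholder; substitute your calibrated intrinsics, the scene-marker geometry, and the camera-from-marker pose produced by your tracker/registration chain, then compare the printed pixel coordinates with where the marker actually appears in the image.

```python
import cv2
import numpy as np

# Placeholder intrinsics -- substitute your calibrated values.
K = np.array([[800.0, 0.0, 640.0],
              [0.0, 800.0, 360.0],
              [0.0,   0.0,   1.0]])          # zero skew, as recommended above
dist = np.zeros(5)                           # distortion coefficients

# Placeholder pose of the scene marker in the camera frame,
# i.e. the output of your tracking/registration chain.
rvec = np.zeros(3)
tvec = np.array([0.0, 0.0, 0.5])

# 3D corners of the scene marker in its own frame (placeholder geometry).
marker_pts = np.array([[-0.05, -0.05, 0.0],
                       [ 0.05, -0.05, 0.0],
                       [ 0.05,  0.05, 0.0],
                       [-0.05,  0.05, 0.0]])

img_pts, _ = cv2.projectPoints(marker_pts, rvec, tvec, K, dist)
print(img_pts.reshape(-1, 2))   # compare against the marker's actual image position
```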

Undistorting/rectify images with OpenCV

I took the example of code for calibrating a camera and undistorting images from this book: shop.oreilly.com/product/9780596516130.do
As far as I understand, the usual camera calibration methods of OpenCV work perfectly for "normal" cameras.
When it comes to fisheye lenses, though, we have to use a vector of 8 calibration parameters instead of 5 and also the flag CV_CALIB_RATIONAL_MODEL in the method cvCalibrateCamera2.
At least, that's what it says in the OpenCV documentation.
So, when I use this on an array of images like this (Sample images from OCamCalib) I get the following results using cvInitUndistortMap: abload.de/img/rastere4u2w.jpg
Since the resulting images are cut out of the whole undistorted image, I went ahead and used cvInitUndistortRectifyMap (like it's described here stackoverflow.com/questions/8837478/opencv-cvremap-cropping-image). So I got the following results: abload.de/img/rasterxisps.jpg
And now my question is: why is not the whole image undistorted? In some of my later results you can see that the laptop, for example, is still totally distorted. How can I accomplish even better results using the standard OpenCV methods?
I'm new to stackoverflow and I'm new to OpenCV as well, so please excuse any of my shortcomings when it comes to expressing my problems.
All chessboard corners should be visible to be found. The algorithm expects a chessboard of a certain size, such as 4x3 or 7x6 (for example). The white border around the chessboard should be visible too, or the dark squares may not be located precisely.
You still have high distortions at the image periphery after undistort() because distortions are radial (that is, they increase with the radius) and the coefficients you found are wrong. They are wrong because the calibration process minimizes the sum of squared errors in pixel coordinates and you did not represent the periphery with enough samples.
To do: you should have 20-40 chessboard pattern images if you use 8 distortion coefficients (distCoeffs). Slant your boards at different angles, put them at different distances and spread them around, especially at the periphery. Remember, the success of calibration depends on sampling and also on seeing vanishing points clearly from your chessboard (hence the slanting and tilting).
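For reference, a minimal Python sketch of the calibration/undistortion pipeline being discussed, assuming the per-image object and image points have already been collected (all names here are placeholders): the rational model enables the 8-coefficient distortion vector, and getOptimalNewCameraMatrix with alpha=1 keeps the full undistorted field of view instead of the cropped result.

```python
import cv2

def calibrate_and_undistort(obj_points, img_points, image_size, img):
    """obj_points/img_points: per-image 3D board points and detected 2D corners."""
    # Rational model -> 8 distortion coefficients instead of 5.
    rms, K, dist, rvecs, tvecs = cv2.calibrateCamera(
        obj_points, img_points, image_size, None, None,
        flags=cv2.CALIB_RATIONAL_MODEL)

    # alpha=1 keeps the whole undistorted image (with black borders),
    # alpha=0 crops to only valid pixels.
    new_K, roi = cv2.getOptimalNewCameraMatrix(K, dist, image_size, 1)
    map1, map2 = cv2.initUndistortRectifyMap(K, dist, None, new_K,
                                             image_size, cv2.CV_16SC2)
    return cv2.remap(img, map1, map2, cv2.INTER_LINEAR), rms
```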

Object tracking openCV, questions, advices?

Short introduction: an augmented reality project.
Goal: load 3D hairstyle templates onto someone's head.
So I am using OpenCV to track the face of the person, then I have to track the user's cap (we assume that the user wears a cap, and we can choose whatever landmark we need on the cap to detect it). Once I have detected the landmark, I have to get its coordinates and then send them to the 3D engine to load/update the 3D object.
So, to detect the landmark(s) of the cap precisely, I first tested a few methods:
cvFindChessBoardCorner ... very efficient on a planar surface but not on a cap (http://dsynflo.blogspot.com/2010/06/simplar-augmented-reality-for-opencv.html)
color detection (http://www.aishack.in/2010/07/tracking-colored-objects-in-opencv/) ... not really reliable: if the lighting changes, the colors change ...
I came to you today to think about it with you. Do I need a special landmark on the cap? (If yes, which one? If not, how should I proceed?)
Is it a good idea to combine color detection and shape detection?
.... Am I on the right track? ^^
I would appreciate any advice regarding the use of a cap to target the user's head, and the different functions I should use from the OpenCV library.
Sorry for my english if it's not perfect.
Thank you a lot !
A quick method, off the top of my head, is to combine both methods.
Color tracking using histograms and mean shift
Here is an alternative color detection method using histogram:
Robust Hand Detection via Computer Vision
The idea is this:
For a cap of known color, say bright green/blue (like the kind of colors you see on a chroma-key matting screen), you can pre-compute a histogram using just the hue and saturation color channels. We deliberately exclude the lightness channel to make it more robust to lighting variations. Now, with the histogram, you can create a back projection map, i.e. a mask with a value at each pixel indicating the probability that the color there is the color of the cap.
Now, after obtaining the probability map, you can run the meanshift or camshift algorithms (available in OpenCV) on this probability map (NOT the image), with the initial window placed somewhere above the face you detected using OpenCV's algorithm. This window will eventually end up at the mode of the probability distribution i.e. the cap.
Details are in the Robust Hand Detection link I gave above. For more, you should consider getting the official OpenCV book or borrowing it from your local library; there is a very nice chapter on using meanshift and camshift for tracking objects. Alternatively, just search the web with queries along the lines of "meanshift/camshift for object tracking".
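A rough Python sketch of the back-projection plus CamShift idea, assuming a webcam as the video source and a hand-picked cap ROI for the reference histogram (in a real setup you would seed the search window from the detected face instead):

```python
import cv2
import numpy as np

cap = cv2.VideoCapture(0)                      # placeholder video source
ok, frame = cap.read()
if not ok:
    raise SystemExit("no video source")

# Placeholder ROI on the cap, used once to build the reference H-S histogram.
x, y, w, h = 300, 50, 80, 60
roi_hsv = cv2.cvtColor(frame[y:y+h, x:x+w], cv2.COLOR_BGR2HSV)
hist = cv2.calcHist([roi_hsv], [0, 1], None, [30, 32], [0, 180, 0, 256])
cv2.normalize(hist, hist, 0, 255, cv2.NORM_MINMAX)

track_window = (x, y, w, h)                    # in practice: a window above the detected face
term = (cv2.TERM_CRITERIA_EPS | cv2.TERM_CRITERIA_COUNT, 10, 1)

while True:
    ok, frame = cap.read()
    if not ok:
        break
    hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)
    # Back projection: per-pixel probability of belonging to the cap's colour.
    backproj = cv2.calcBackProject([hsv], [0, 1], hist, [0, 180, 0, 256], 1)
    rot_rect, track_window = cv2.CamShift(backproj, track_window, term)
    cv2.polylines(frame, [np.int32(cv2.boxPoints(rot_rect))], True, (0, 255, 0), 2)
    cv2.imshow("cap tracking", frame)
    if cv2.waitKey(1) == 27:                   # Esc to quit
        break
```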
Detect squares/circles to get orientation of head
If, in addition, you wish to further confirm this final location, you can add 4 small squares/circles to the front of the cap and use OpenCV's built-in algorithms to detect them only in this region of interest (ROI). It is sort of like detecting the finder squares in a QR code. This step also gives you information on the orientation of the cap and hence the head, which might be useful when you render the hair. E.g. after locating 2 adjacent squares/circles, you can compute the angle between them and the horizontal/vertical line.
You can detect squares/corners using the standard corner detectors etc in OpenCV.
For circles, you can try using the HoughCircle algorithm: http://docs.opencv.org/modules/imgproc/doc/feature_detection.html#houghcircles
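A hedged sketch of that circle check in Python; the ROI coordinates and all HoughCircles parameters are placeholders that would need tuning for your camera and cap:

```python
import cv2
import numpy as np

frame = cv2.imread("frame.png")                     # placeholder input
x, y, w, h = 280, 40, 120, 80                       # ROI on the front of the cap (placeholder)
roi = cv2.cvtColor(frame[y:y+h, x:x+w], cv2.COLOR_BGR2GRAY)
roi = cv2.medianBlur(roi, 5)

circles = cv2.HoughCircles(roi, cv2.HOUGH_GRADIENT, dp=1, minDist=15,
                           param1=100, param2=20, minRadius=3, maxRadius=15)
if circles is not None:
    c = circles[0]                                  # N x (x, y, radius)
    if len(c) >= 2:
        # The angle between two adjacent circles gives a rough head roll estimate.
        dx, dy = c[1][0] - c[0][0], c[1][1] - c[0][1]
        print("roll angle (deg):", np.degrees(np.arctan2(dy, dx)))
```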
Speeding this up
Make extensive use of Region of Interests (ROIs)
To speed things up, you should, as often as possible, run your algorithms on small regions of the image (ROIs), and likewise on the corresponding region of the probability map. You can extract ROIs from an OpenCV image, which are themselves images, and run OpenCV's algorithms on them the same way you would on whole images. For example, you could compute the probability map only for a ROI around the detected face. Similarly, the meanshift/camshift algorithm should only be run on this smaller map, and likewise the additional step to detect squares or circles. Details can be found in the OpenCV book as well as with a quick search online.
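In the Python API this amounts to NumPy slicing, e.g. (coordinates are placeholders):

```python
import cv2

img = cv2.imread("frame.png")                  # placeholder input
fx, fy, fw, fh = 200, 220, 180, 180            # face rectangle from the detector (placeholder)

# A ROI is just a slice (a view, not a copy); OpenCV functions accept it like a full image.
cap_roi = img[max(fy - fh // 2, 0):fy, fx:fx + fw]   # band just above the face, where the cap is
hsv_roi = cv2.cvtColor(cap_roi, cv2.COLOR_BGR2HSV)   # e.g. compute the back projection only here
```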
Compile OpenCV with TBB and CUDA
A number of OpenCV's algorithms can achieve significant speed-ups, WITHOUT the programmer needing to do any additional work, simply by compiling your OpenCV library with TBB (Threading Building Blocks) and CUDA support switched on. In particular, the face detection algorithm in OpenCV (Viola-Jones) will run a couple of times faster.
You can switch on these options only after you have installed the packages for TBB and CUDA.
TBB: http://threadingbuildingblocks.org/download
CUDA: https://developer.nvidia.com/cuda-downloads
And then compile OpenCV from source: http://docs.opencv.org/doc/tutorials/introduction/windows_install/windows_install.html#windows-installation
Lastly, I'm not sure whether you are using the "C version" of OpenCV. Unless strictly necessary (for compatibility issues etc.), I recommend using the C++ interface of OpenCV simply because it is more convenient (from my personal experience at least). Now let me state upfront that I don't intend this statement to start a flame war on the merits of C vs C++.
Hope this helps.

Why are object boundaries not clear in OpenCV stereo correspondence?

I have two pictures which are almost parallel and not positioned very far from each other.
I'm using OpenCV to try to create a disparity map (stereo correspondence).
Because I'm trying to use it in a real-world scenario, chessboard calibration is a bit impractical.
Because of that, I'm using stereoRectifyUncalibrated().
I tried to compare the results, using 2 different sets of corresponding points for the rectification:
Points manually selected (point & click)
Points generated from SURF and filtered with RANSAC
Input image1:
http://i.stack.imgur.com/2TCi3.jpg
Input image2:
http://i.stack.imgur.com/j1fFA.jpg
(Note that I undistort the images before using them for rectification, etc.)
Rectified images with SURF and RANSAC:(1 and 2 in that order):
http://i.stack.imgur.com/pwbUw.jpg
http://i.stack.imgur.com/mb7TM.jpg
Rectified images using the manually selected points(which is more inaccurate!):
http://i.stack.imgur.com/Bodeg.jpg
Now, the thing is, looking at the results we see that the SURF version is almost perfectly rectified (the epipolar lines are quite well aligned).
The manually selected point version, on the other hand, is quite badly rectified: the epipolar lines are nowhere near aligned.
But when we look at the results of OpenCV's StereoSGBM using both of our rectifications:
Manual point result:
http://i.stack.imgur.com/N8Cyp.png
SURF point result:
http://i.stack.imgur.com/tGsCN.jpg
The disparity/depth shown is more accurate/correct with the SURF points (the well-rectified version). No surprise there.
However, the actual detected object pixels and object boundaries are actually a lot better in the badly rectified version.
For instance, you can see that the pen is actually a pen and has the shape of a pen in the badly rectified disparity map, but not in the well-rectified map.
Question is, why?
And how can I fix it?
(I tried fooling around with the StereoSGBM parameters, making sure they are the same for both, etc., but it does not have any effect. It is the different rectifications alone that make the badly rectified image look good for some reason, with respect to object boundaries.)
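For reference, a rough Python sketch of the pipeline described in the question (feature matching, RANSAC filtering via the fundamental matrix, stereoRectifyUncalibrated, warping, then semi-global matching). ORB is used here as a stand-in for SURF, which lives in the non-free module, and all matcher/SGBM parameters are placeholder values that would need tuning for these images:

```python
import cv2
import numpy as np

img1 = cv2.imread("left.jpg", cv2.IMREAD_GRAYSCALE)     # placeholder inputs,
img2 = cv2.imread("right.jpg", cv2.IMREAD_GRAYSCALE)    # already undistorted

# Feature matching (ORB here as a stand-in for SURF).
orb = cv2.ORB_create(2000)
k1, d1 = orb.detectAndCompute(img1, None)
k2, d2 = orb.detectAndCompute(img2, None)
matches = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True).match(d1, d2)

pts1 = np.float32([k1[m.queryIdx].pt for m in matches])
pts2 = np.float32([k2[m.trainIdx].pt for m in matches])

# RANSAC filtering via the fundamental matrix, then uncalibrated rectification.
F, inliers = cv2.findFundamentalMat(pts1, pts2, cv2.FM_RANSAC, 1.0, 0.99)
pts1, pts2 = pts1[inliers.ravel() == 1], pts2[inliers.ravel() == 1]
h, w = img1.shape
_, H1, H2 = cv2.stereoRectifyUncalibrated(pts1, pts2, F, (w, h))
rect1 = cv2.warpPerspective(img1, H1, (w, h))
rect2 = cv2.warpPerspective(img2, H2, (w, h))

# Semi-global block matching on the rectified pair.
sgbm = cv2.StereoSGBM_create(minDisparity=0, numDisparities=128, blockSize=5,
                             P1=8 * 5 * 5, P2=32 * 5 * 5, uniquenessRatio=10,
                             speckleWindowSize=100, speckleRange=2)
disparity = sgbm.compute(rect1, rect2).astype(np.float32) / 16.0
disp_vis = cv2.normalize(disparity, None, 0, 255, cv2.NORM_MINMAX).astype(np.uint8)
cv2.imwrite("disparity.png", disp_vis)
```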
