opencv calibrateCamera function yielding bad results - opencv

I'm trying to get opencv camera calibration working but having trouble getting it to output valid data. I have an uncalibrated camera that I would like to calibrate, but to test my code I am using an Azure Kinect camera (the color camera), since the SDK supplies the correct intrinsics for it and I can verify them. I've collected 30 images of a chessboard from slightly different angles, which I understand should be sufficient, and run the calibration function, but no matter what flags I pass in I get values for fx and fy that are pretty different from the correct fx and fy, and distortion coefficients that are WILDLY different. Am I doing something wrong? Do I need more or better data?
A sample of the images I'm using can be found here: https://www.dropbox.com/sh/9pa94uedoe5mlxz/AABisSvgWwBT-bY65lfzp2N3a?dl=0
Save them in c:\calibration_test to run the code below.
#include <filesystem>
#include <iostream>
#include <opencv2/core.hpp>
#include <opencv2/calib3d/calib3d.hpp>
#include <opencv2/features2d/features2d.hpp>
#include <opencv2/imgproc/imgproc.hpp>
#include <opencv2/imgcodecs.hpp>
using namespace std;
namespace fs = experimental::filesystem;
static bool extractCorners(cv::Mat colorImage, vector<cv::Point3f>& corners3d, vector<cv::Point2f>& corners)
{
// Each square is 20x20mm
const float kSquareSize = 0.020f;
const cv::Size boardSize(7, 9);
const cv::Point3f kCenterOffset((float)(boardSize.width - 1) * kSquareSize, (float)(boardSize.height - 1) * kSquareSize, 0.f);
cv::Mat image;
cv::cvtColor(colorImage, image, cv::COLOR_BGRA2GRAY);
int chessBoardFlags = cv::CALIB_CB_ADAPTIVE_THRESH | cv::CALIB_CB_NORMALIZE_IMAGE;
if (!cv::findChessboardCorners(image, boardSize, corners, chessBoardFlags))
{
return false;
}
cv::cornerSubPix(image, corners, cv::Size(11, 11), cv::Size(-1, -1),
cv::TermCriteria(cv::TermCriteria::EPS + cv::TermCriteria::COUNT, 30, 0.1));
// Construct the corners
for (int i = 0; i < boardSize.height; ++i)
for (int j = 0; j < boardSize.width; ++j)
corners3d.push_back(cv::Point3f(j * kSquareSize, i * kSquareSize, 0) - kCenterOffset);
return true;
}
int main()
{
vector<cv::Mat> frames;
for (const auto& p : fs::directory_iterator("c:\\calibration_test\\"))
{
frames.push_back(cv::imread(p.path().string()));
}
int numFrames = (int)frames.size();
vector<vector<cv::Point2f>> corners(numFrames);
vector<vector<cv::Point3f>> corners3d(numFrames);
int framesWithCorners = 0;
for (int i = 0; i < numFrames; ++i)
{
if (extractCorners(frames[i], corners3d[framesWithCorners], corners[framesWithCorners]))
{
++framesWithCorners;
}
}
numFrames = framesWithCorners;
corners.resize(numFrames);
corners3d.resize(numFrames);
// Camera intrinsics come from the Azure Kinect API
cv::Matx33d cameraMatrix(
914.111755f, 0.f, 960.887390f,
0.f, 913.880615f, 551.566528f,
0.f, 0.f, 1.f);
vector<float> distCoeffs = { 0.576340079f, -2.71203661f, 0.000563957903f, -0.000239689150f, 1.54344523f, 0.454746544f, -2.53860712f, 1.47272563f };
cv::Size imageSize = frames[0].size();
vector<cv::Point3d> rotations;
vector<cv::Point3d> translations;
int flags = cv::CALIB_USE_INTRINSIC_GUESS | cv::CALIB_FIX_PRINCIPAL_POINT | cv::CALIB_RATIONAL_MODEL;
double result = cv::calibrateCamera(corners3d, corners, imageSize, cameraMatrix, distCoeffs, rotations, translations,
flags);
// After this call, cameraMatrix has different values for fx and fy, and WILDLY different distortion coefficients.
cout << "fx: " << cameraMatrix(0, 0) << endl;
cout << "fy: " << cameraMatrix(1, 1) << endl;
cout << "cx: " << cameraMatrix(0, 2) << endl;
cout << "cy: " << cameraMatrix(1, 2) << endl;
for (size_t i = 0; i < distCoeffs.size(); ++i)
{
cout << "d" << i << ": " << distCoeffs[i] << endl;
}
return 0;
}
Some sample output is:
fx: 913.143
fy: 917.965
cx: 960.887
cy: 551.567
d0: 0.327596
d1: -73.1837
d2: -0.00125972
d3: 0.002805
d4: -7.93086
d5: 0.295437
d6: -73.481
d7: -3.25043
d8: 0
d9: 0
d10: 0
d11: 0
d12: 0
d13: 0
Any idea what I'm doing wrong?
Bonus question: Why do I get 14 distortion coefficients back instead of 8? If I leave off CALIB_RATIONAL_MODEL then I only get 5 (three radial and two tangential).

You need to take images across the whole field of view of the camera to correctly capture the lens distortion characteristics. The images you provide only show the chessboard in one position, slightly angled.
Ideally you should have images of the chessboard evenly distributed over the x and y axes of the image plane, right up to the edges of the image. Make sure a sufficient white border around the board is always visible, though, for detection robustness.
You should also try to capture images where the chessboard is nearer to the camera and farther away, not just at a uniform distance. The different angles you provide look good, on the other hand.
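One rough way to check the "evenly distributed" advice before calibrating is to lay a coarse grid over the image and count how many cells contain at least one detected corner. The sketch below uses only the standard library; `Pixel`, `cornerCoverage` and the grid size are hypothetical names chosen for illustration and are not part of OpenCV.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// A detected corner position in pixel coordinates (hypothetical helper type).
struct Pixel { float x; float y; };

// Fraction of cells in a cols x rows grid laid over the image that contain
// at least one corner, pooled over all calibration frames.
double cornerCoverage(const std::vector<std::vector<Pixel>>& cornersPerFrame,
                      int imageWidth, int imageHeight, int cols, int rows)
{
    assert(imageWidth > 0 && imageHeight > 0 && cols > 0 && rows > 0);
    std::vector<bool> hit(static_cast<std::size_t>(cols) * rows, false);
    for (const auto& frame : cornersPerFrame) {
        for (const Pixel& p : frame) {
            if (p.x < 0 || p.y < 0 || p.x >= imageWidth || p.y >= imageHeight)
                continue; // ignore points outside the image
            // Map the pixel to a grid cell; clamp to guard against rounding at the edge.
            const int cx = std::min(cols - 1, static_cast<int>(static_cast<double>(p.x) * cols / imageWidth));
            const int cy = std::min(rows - 1, static_cast<int>(static_cast<double>(p.y) * rows / imageHeight));
            hit[static_cast<std::size_t>(cy) * cols + cx] = true;
        }
    }
    std::size_t covered = 0;
    for (bool h : hit) covered += h ? 1 : 0;
    return static_cast<double>(covered) / static_cast<double>(hit.size());
}
```

Feeding it the per-frame output of findChessboardCorners (converted to `Pixel`) gives a single number between 0 and 1. A low value suggests the board stayed in one region of the image; what counts as "enough" is a judgment call, not something this sketch decides.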
An extensive guide to ensuring good calibration results is in this answer: How to verify the correctness of calibration of a webcam?
Comparing your camera matrix to the one from the Azure Kinect API, it doesn't look so bad. The principal point is pretty much spot on (your flags include CALIB_FIX_PRINCIPAL_POINT, so it was held at the initial guess) and the focal length is in a reasonable range. If you improve the quality of the input with my tips and the SO answer I linked, the results should be even closer. Comparing sets of distortion coefficients by their distance doesn't really work that well: the error function is not convex, so there can be lots of local minima that produce relatively good results while being far from the global minimum that would yield the best results. I hope that explanation makes sense to you.
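Instead of comparing coefficient values one by one, it can be more telling to compare what the two coefficient sets do to a point. The sketch below evaluates only the radial part of the rational model, using OpenCV's coefficient order (k1, k2, p1, p2, k3, k4, k5, k6), and ignores the tangential terms. `DistCoeffs`, `radialFactor` and `maxRadialDifference` are hypothetical names for this illustration.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cmath>

// OpenCV ordering for the 8-coefficient rational model: k1 k2 p1 p2 k3 k4 k5 k6.
using DistCoeffs = std::array<double, 8>;

// Radial scaling applied to a normalized image point at distance r from the
// optical axis. The tangential terms p1 and p2 are ignored in this sketch.
double radialFactor(const DistCoeffs& d, double r)
{
    assert(r >= 0.0);
    const double r2 = r * r, r4 = r2 * r2, r6 = r4 * r2;
    const double num = 1.0 + d[0] * r2 + d[1] * r4 + d[4] * r6;
    const double den = 1.0 + d[5] * r2 + d[6] * r4 + d[7] * r6;
    return num / den;
}

// Largest absolute difference between the radial factors of two models,
// sampled at `samples` evenly spaced radii in [0, rMax].
double maxRadialDifference(const DistCoeffs& a, const DistCoeffs& b, double rMax, int samples)
{
    assert(samples >= 2);
    double worst = 0.0;
    for (int i = 0; i < samples; ++i) {
        const double r = rMax * i / (samples - 1);
        worst = std::max(worst, std::fabs(radialFactor(a, r) - radialFactor(b, r)));
    }
    return worst;
}
```

For the camera in the question, the image corner sits at a normalized radius of about 1.2 (half the 1920x1080 frame divided by a focal length near 914 px), so that would be a reasonable `rMax`. Two sets whose coefficients look very different can still give a small value here over the radii actually covered by the data.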
Regarding your bonus question: only the first 8 values in the output you posted are filled in; the rest are 0, so they have no influence. I'm not sure whether the function is expected to return anything different.

Related

Letter inside letter, pattern recognition

I would like to detect this pattern
As you can see it's basically the letter C, inside another, with different orientations. My pattern can have multiple C's inside one another, the one I'm posting with 2 C's is just a sample. I would like to detect how many C's there are, and the orientation of each one. For now I've managed to detect the center of such pattern, basically I've managed to detect the center of the innermost C. Could you please provide me with any ideas about different algorithms I could use?
And here we go! A high level overview of this approach can be described as the sequential execution of the following steps:
Load the input image;
Convert it to grayscale;
Threshold it to generate a binary image;
Use the binary image to find contours;
Fill each area of contours with a different color (so we can extract each letter separately);
Create a mask for each letter found to isolate them in separate images;
Crop the images to the smallest possible size;
Figure out the center of the image;
Figure out the width of the letter's border to identify the exact center of the border;
Scan along the border (in a circular fashion) for discontinuity;
Figure out an approximate angle for the discontinuity, thus identifying the amount of rotation of the letter.
I don't want to get into too much detail since I'm sharing the source code, so feel free to test and change it in any way you like.
Let's start, Winter Is Coming:
#include <iostream>
#include <vector>
#include <cmath>
#include <opencv2/highgui/highgui.hpp>
#include <opencv2/imgproc/imgproc.hpp>
cv::RNG rng(12345);
float PI = std::atan(1) * 4;
void isolate_object(const cv::Mat& input, cv::Mat& output)
{
if (input.channels() != 1)
{
std::cout << "isolate_object: !!! input must be grayscale" << std::endl;
return;
}
// Store the set of points in the image before assembling the bounding box
std::vector<cv::Point> points;
cv::Mat_<uchar>::const_iterator it = input.begin<uchar>();
cv::Mat_<uchar>::const_iterator end = input.end<uchar>();
for (; it != end; ++it)
{
if (*it) points.push_back(it.pos());
}
// Compute minimal bounding box
cv::RotatedRect box = cv::minAreaRect(cv::Mat(points));
// Set Region of Interest to the area defined by the box
cv::Rect roi;
roi.x = box.center.x - (box.size.width / 2);
roi.y = box.center.y - (box.size.height / 2);
roi.width = box.size.width;
roi.height = box.size.height;
// Crop the original image to the defined ROI
output = input(roi);
}
For more details on the implementation of isolate_object() please check this thread. cv::RNG is used later on to fill each contour with a different color, and PI, well... you know PI.
int main(int argc, char* argv[])
{
// Load input (colored, 3-channel, BGR)
cv::Mat input = cv::imread("test.jpg");
if (input.empty())
{
std::cout << "!!! Failed imread() #1" << std::endl;
return -1;
}
// Convert colored image to grayscale
cv::Mat gray;
cv::cvtColor(input, gray, CV_BGR2GRAY);
// Execute a threshold operation to get a binary image from the grayscale
cv::Mat binary;
cv::threshold(gray, binary, 128, 255, cv::THRESH_BINARY);
The binary image looks exactly like the input because it only had 2 colors (B&W):
// Find the contours of the C's in the thresholded image
std::vector<std::vector<cv::Point> > contours;
cv::findContours(binary, contours, cv::RETR_LIST, cv::CHAIN_APPROX_SIMPLE);
// Fill the contours found with unique colors to isolate them later
cv::Mat colored_contours = input.clone();
std::vector<cv::Scalar> fill_colors;
for (size_t i = 0; i < contours.size(); i++)
{
std::vector<cv::Point> cnt = contours[i];
double area = cv::contourArea(cv::Mat(cnt));
//std::cout << "* Area: " << area << std::endl;
// Fill each C found with a different color.
// If the area is larger than 100k it's probably the white background, so we ignore it.
if (area > 10000 && area < 100000)
{
cv::Scalar color = cv::Scalar(rng.uniform(0, 255), rng.uniform(0,255), rng.uniform(0,255));
cv::drawContours(colored_contours, contours, i, color,
CV_FILLED, 8, std::vector<cv::Vec4i>(), 0, cv::Point());
fill_colors.push_back(color);
//cv::imwrite("test_contours.jpg", colored_contours);
}
}
What colored_contours looks like:
// Create a mask for each C found to isolate them from each other
for (int i = 0; i < fill_colors.size(); i++)
{
// After inRange() single_color_mask stores a single C letter
cv::Mat single_color_mask = cv::Mat::zeros(input.size(), CV_8UC1);
cv::inRange(colored_contours, fill_colors[i], fill_colors[i], single_color_mask);
//cv::imwrite("test_mask.jpg", single_color_mask);
Since this for loop is executed twice, once for each color that was used to fill the contours, I want you to see all images that were generated by this stage. So the following images are the ones that were stored by single_color_mask (one for each iteration of the loop):
// Crop image to the area of the object
cv::Mat cropped;
isolate_object(single_color_mask, cropped);
//cv::imwrite("test_cropped.jpg", cropped);
cv::Mat orig_cropped = cropped.clone();
These are the ones that were stored by cropped (by the way, the smaller C looks fat because the image is rescaled by this page to have the same size as the larger C, don't worry):
// Figure out the center of the image
cv::Point obj_center(cropped.cols/2, cropped.rows/2);
//cv::circle(cropped, obj_center, 3, cv::Scalar(128, 128, 128));
//cv::imwrite("test_cropped_center.jpg", cropped);
To make it clearer what obj_center is for, I painted a little gray circle at that location for educational purposes:
// Figure out the exact center location of the border
std::vector<cv::Point> border_points;
for (int y = 0; y < cropped.cols; y++)
{
if (cropped.at<uchar>(obj_center.x, y) != 0)
border_points.push_back(cv::Point(obj_center.x, y));
if (border_points.size() > 0 && cropped.at<uchar>(obj_center.x, y) == 0)
break;
}
if (border_points.size() == 0)
{
std::cout << "!!! Oops! No border detected." << std::endl;
return 0;
}
// Figure out the exact center location of the border
cv::Point border_center = border_points[border_points.size() / 2];
//cv::circle(cropped, border_center, 3, cv::Scalar(128, 128, 128));
//cv::imwrite("test_border_center.jpg", cropped);
The procedure above scans a single vertical line from the top/middle of the image to find the borders of the circle so that its width can be calculated. Again, for educational purposes I painted a small gray circle in the middle of the border. This is what cropped looks like:
// Scan the border of the circle for discontinuities
int radius = obj_center.y - border_center.y;
if (radius < 0)
radius *= -1;
std::vector<cv::Point> discontinuity_points;
std::vector<int> discontinuity_angles;
for (int angle = 0; angle <= 360; angle++)
{
int x = obj_center.x + (radius * cos((angle+90) * (PI / 180.f)));
int y = obj_center.y + (radius * sin((angle+90) * (PI / 180.f)));
if (cropped.at<uchar>(x, y) < 128)
{
discontinuity_points.push_back(cv::Point(y, x));
discontinuity_angles.push_back(angle);
//cv::circle(cropped, cv::Point(y, x), 1, cv::Scalar(128, 128, 128));
}
}
//std::cout << "Discontinuity size: " << discontinuity_points.size() << std::endl;
if (discontinuity_points.size() == 0 && discontinuity_angles.size() == 0)
{
std::cout << "!!! Oops! No discontinuity detected. It's a perfect circle, dang!" << std::endl;
return 0;
}
Great, so the piece of code above scans along the middle of the circle's border looking for discontinuity. I'm sharing a sample image to illustrate what I mean. Every gray dot on the image represents a pixel that is tested. When the pixel is black it means we found a discontinuity:
// Figure out the approximate angle of the discontinuity:
// the first angle found will suffice for this demo.
int approx_angle = discontinuity_angles[0];
std::cout << "#" << i << " letter C is rotated approximately at: " << approx_angle << " degrees" << std::endl;
// Figure out the central point of the discontinuity
cv::Point discontinuity_center;
for (int a = 0; a < discontinuity_points.size(); a++)
discontinuity_center += discontinuity_points[a];
discontinuity_center.x /= discontinuity_points.size();
discontinuity_center.y /= discontinuity_points.size();
cv::circle(orig_cropped, discontinuity_center, 2, cv::Scalar(128, 128, 128));
cv::imshow("Original crop", orig_cropped);
cv::waitKey(0);
}
return 0;
}
Very well... This last piece of code is responsible for figuring out the approximate angle of the discontinuity as well as indicate the central point of discontinuity. The following images are stored by orig_cropped. Once again I added a gray dot to show the exact positions detected as the center of the gaps:
When executed, this application prints the following information to the screen:
#0 letter C is rotated approximately at: 49 degrees
#1 letter C is rotated approximately at: 0 degrees
I hope it helps.
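The circular scan described above can also be sketched without OpenCV, which makes it easy to test on a synthetic image. The version below differs from the demo in one respect: instead of taking the first background angle, it reports the circular mean of all background angles, so a gap that straddles 0/360 degrees is still handled. `Binary`, `gapAngles` and `gapDirection` are hypothetical names, and angle 0 points along +x with rows growing downward.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <vector>

// image[row][col], nonzero = letter pixel, zero = background.
using Binary = std::vector<std::vector<std::uint8_t>>;

// Walk a circle of the given radius around (centerCol, centerRow) in one-degree
// steps and return every angle at which the sampled pixel is background.
std::vector<int> gapAngles(const Binary& image, int centerCol, int centerRow, int radius)
{
    assert(!image.empty() && radius > 0);
    const double kPi = std::acos(-1.0);
    std::vector<int> angles;
    for (int angle = 0; angle < 360; ++angle) {
        const double rad = angle * kPi / 180.0;
        const int col = centerCol + static_cast<int>(std::lround(radius * std::cos(rad)));
        const int row = centerRow + static_cast<int>(std::lround(radius * std::sin(rad)));
        if (row < 0 || row >= static_cast<int>(image.size()) ||
            col < 0 || col >= static_cast<int>(image[row].size()))
            continue; // sample falls outside the image
        if (image[row][col] == 0) angles.push_back(angle);
    }
    return angles;
}

// Circular mean of the gap angles, in degrees within [0, 360).
double gapDirection(const std::vector<int>& angles)
{
    assert(!angles.empty());
    const double kPi = std::acos(-1.0);
    double sx = 0.0, sy = 0.0;
    for (int a : angles) {
        sx += std::cos(a * kPi / 180.0);
        sy += std::sin(a * kPi / 180.0);
    }
    const double deg = std::atan2(sy, sx) * 180.0 / kPi;
    return deg < 0.0 ? deg + 360.0 : deg;
}
```

The radius passed in plays the same role as the border-center radius computed in the demo: it should land in the middle of the letter's stroke so that only the opening of the C reads as background.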
For a start you could use the Hough transform. This algorithm is not very fast, but it's quite robust, especially with such clear images.
The general approach would be:
1) preprocessing - suppress noise, convert to grayscale / binary
2) run edge detector
3) run Hough transform - IIRC it's `cv::HoughCircles` in OpenCV
4) do some postprocessing - remove surplus circles, decide which ones correspond to shape of letter C, and so on
My approach will give you 2 Hough circles per letter C: one on the inner boundary and one on the outer boundary. If you want only one circle per letter you can use a skeletonization algorithm. More info here: http://homepages.inf.ed.ac.uk/rbf/HIPR2/skeleton.htm
Given that we have nested C structures, and that you know the centres of the Cs and would like to evaluate their orientations, one simply needs to observe the distribution of pixels along the radius of the concentric Cs in all directions.
This can be done by performing a simple morphological dilation operation from the centre. As we reach the right radius for the innermost C, we will reach a maximum number of pixels covered for the innermost C. The difference between the disc and the C will give us the location of the gap in the whole and one can perform an ultimate erosion to get the centroid of the gap in the C. The angle between the centre and this point is the orientation of the C. This step is iterated till all Cs are covered.
This can also be done quickly using the Distance function from the centre point of the Cs.

3D Mapping depth to RGB (Kinect OpenNI Depthmap to OpenCV RGB Cam)

I'm trying to map my OpenNI (1.5.4.0) Kinect for Windows depth map to an OpenCV RGB image.
I have the 640x480 depth map with depth in mm and was trying to do the mapping like Burrus:
http://burrus.name/index.php/Research/KinectCalibration
I skipped the distortion part, but otherwise I think I did everything:
//with depth camera intrinsics, each pixel (x_d,y_d) of depth camera can be projected
//to metric 3D space. with fx_d, fy_d, cx_d and cy_d the intrinsics of the depth camera.
P3D.at<Vec3f>(y,x)[0] = (x - cx_ir) * depth/fx_ir;
P3D.at<Vec3f>(y,x)[1] = (y - cy_ir) * depth/fy_ir;
P3D.at<Vec3f>(y,x)[2] = depth;
//P3D' = R.P3D + T:
RTMat = (Mat_<float>(4,4) << 0.999388, -0.00796202, -0.0480646, -3.96963,
0.00612322, 0.9993536, 0.0337474, -22.8512,
0.0244427, -0.03635059, 0.999173, -15.6307,
0,0,0,1);
perspectiveTransform(P3D, P3DS, RTMat);
//reproject each 3D point on the color image and get its color:
depth = P3DS.at<Vec3f>(y,x)[2];
x_rgb = (P3DS.at<Vec3f>(y,x)[0] * fx_rgb / depth) + cx_rgb;
y_rgb = (P3DS.at<Vec3f>(y,x)[1] * fy_rgb / depth) + cy_rgb;
But with my estimated calibration values for the RGB Camera and the IR Camera of the Kinect my result fails in every direction and cannot be fixed only with changing the extrinsic T Parameters.
I have a few suspicions:
Does OpenNI already map the IR depth map to the RGB camera of the Kinect?
Should I use depth in meters and/or transform the pixels into mm? (I tried multiplying by pixel_size * 0.001 but got the same results.)
Really hope someone can help me.
Thx in advance.
AFAIK OpenNI does its own registration (factory setting), and you can toggle registration as well. If you've built OpenCV with OpenNI support, it's as simple as this:
capture.set(CV_CAP_PROP_OPENNI_REGISTRATION,1);
As explained here and there's a minimal OpenNI/OpenCV example here.
So a minimal working sample would look like so:
#include "opencv2/core/core.hpp"
#include "opencv2/highgui/highgui.hpp"
#include <iostream>
using namespace cv;
using namespace std;
int main(){
VideoCapture capture;
capture.open(CV_CAP_OPENNI);
if( !capture.isOpened() ){
cout << "Can not open a capture object." << endl;
return -1;
}
//registration: only query/set the property once the device is known to be open
if(capture.get( CV_CAP_PROP_OPENNI_REGISTRATION ) == 0) capture.set(CV_CAP_PROP_OPENNI_REGISTRATION,1);
cout << "ready" << endl;
for(;;){
Mat depthMap,depthShow;
if( !capture.grab() ){
cout << "Can not grab images." << endl;
return -1;
}else{
if( capture.retrieve( depthMap, CV_CAP_OPENNI_DEPTH_MAP ) ){
const float scaleFactor = 0.05f;
depthMap.convertTo( depthShow, CV_8UC1, scaleFactor );
imshow("depth",depthShow);
}
}
if( waitKey( 30 ) == 27 ) break;//esc to exit
}
}
If you don't have OpenCV built with OpenNI support, you should be able to use GetAlternativeViewPointCap()
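If you do end up mapping by hand as in the question, the three steps (back-project the depth pixel, apply the rigid transform, project into the RGB image) can be sketched without OpenCV as below. All type and function names here are hypothetical. The one hard requirement the sketch makes explicit is that the translation must be in the same unit as the depth values (for example both in mm); no claim is made about which unit the values in the question use.

```cpp
#include <array>
#include <cassert>

struct Intrinsics { double fx, fy, cx, cy; };
struct Vec3 { double x, y, z; };
struct Pixel2 { double u, v; };

// Rigid transform from the depth camera frame to the RGB camera frame:
// row-major 3x3 rotation plus a translation in the same unit as the depth.
struct Extrinsics { std::array<double, 9> R; Vec3 t; };

// Back-project depth pixel (u, v) with measured depth into depth-camera space.
Vec3 backProject(const Intrinsics& k, double u, double v, double depth)
{
    assert(depth > 0.0);
    return { (u - k.cx) * depth / k.fx, (v - k.cy) * depth / k.fy, depth };
}

// P' = R * P + T
Vec3 toRgbFrame(const Extrinsics& e, const Vec3& p)
{
    return { e.R[0] * p.x + e.R[1] * p.y + e.R[2] * p.z + e.t.x,
             e.R[3] * p.x + e.R[4] * p.y + e.R[5] * p.z + e.t.y,
             e.R[6] * p.x + e.R[7] * p.y + e.R[8] * p.z + e.t.z };
}

// Pinhole projection into the RGB image (lens distortion ignored).
Pixel2 project(const Intrinsics& k, const Vec3& p)
{
    assert(p.z > 0.0);
    return { p.x * k.fx / p.z + k.cx, p.y * k.fy / p.z + k.cy };
}

// Full chain for one depth pixel.
Pixel2 depthToRgb(const Intrinsics& depthCam, const Intrinsics& rgbCam,
                  const Extrinsics& e, double u, double v, double depth)
{
    return project(rgbCam, toRgbFrame(e, backProject(depthCam, u, v, depth)));
}
```

A quick sanity check: with an identity rotation, a 100 mm shift along x, a depth of 1000 mm and fx = 500, the pixel should move by 100 * 500 / 1000 = 50 px horizontally and not at all vertically.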

OpenCV::solvePNP() - Assertion failed

I am trying to get the pose of the camera with the help of solvePNP() from OpenCV.
After running my program I get the following errors:
OpenCV Error: Assertion failed (npoints >= 0 && npoints == std::max(ipoints.checkVector(2, CV_32F), ipoints.checkVector(2, CV_64F))) in solvePnP, file /opt/local/var/macports/build/_opt_local_var_macports_sources_rsync.macports.org_release_tarballs_ports_graphics_opencv/opencv/work/OpenCV-2.4.2/modules/calib3d/src/solvepnp.cpp, line 55
libc++abi.dylib: terminate called throwing an exception
I tried to search how to solve these errors, but I couldn't resolve it unfortunately!
Here is my code, all comment/help is much appreciated:
enum Pattern { NOT_EXISTING, CHESSBOARD, CIRCLES_GRID, ASYMMETRIC_CIRCLES_GRID };
void calcBoardCornerPositions(Size boardSize, float squareSize, vector<Point3f>& corners,
Pattern patternType)
{
corners.clear();
switch(patternType)
{
case CHESSBOARD:
case CIRCLES_GRID:
for( int i = 0; i < boardSize.height; ++i )
for( int j = 0; j < boardSize.width; ++j )
corners.push_back(Point3f(float( j*squareSize ), float( i*squareSize ), 0));
break;
case ASYMMETRIC_CIRCLES_GRID:
for( int i = 0; i < boardSize.height; i++ )
for( int j = 0; j < boardSize.width; j++ )
corners.push_back(Point3f(float((2*j + i % 2)*squareSize), float(i*squareSize), 0));
break;
}
}
int main(int argc, char* argv[])
{
float squareSize = 50.f;
Pattern calibrationPattern = CHESSBOARD;
//vector<Point2f> boardCorners;
vector<vector<Point2f> > imagePoints(1);
vector<vector<Point3f> > boardPoints(1);
Size boardSize;
boardSize.width = 9;
boardSize.height = 6;
vector<Mat> intrinsics, distortion;
string filename = "out_camera_xml.xml";
FileStorage fs(filename, FileStorage::READ);
fs["camera_matrix"] >> intrinsics;
fs["distortion_coefficients"] >> distortion;
fs.release();
vector<Mat> rvec, tvec;
Mat img = imread(argv[1], CV_LOAD_IMAGE_GRAYSCALE); // an image has to be passed in here
bool found = findChessboardCorners(img, boardSize, imagePoints[0], CV_CALIB_CB_ADAPTIVE_THRESH);
calcBoardCornerPositions(boardSize, squareSize, boardPoints[0], calibrationPattern);
boardPoints.resize(imagePoints.size(),boardPoints[0]);
//***Debug start***
cout << imagePoints.size() << endl << boardPoints.size() << endl << intrinsics.size() << endl << distortion.size() << endl;
//***Debug end***
solvePnP(Mat(boardPoints), Mat(imagePoints), intrinsics, distortion, rvec, tvec);
for(int i=0; i<rvec.size(); i++) {
cout << rvec[i] << endl;
}
return 0;
}
EDIT (some debug info):
I debugged it line by line and stepped into all of the functions. I am getting the Assertion failed in solvePnP(...). You can see below what I see when I step into the solvePnP function. First it jumps over the first if statement /if(vec.empty())/ and goes into the second if statement /if( !copyData )/. There, when it executes the last line /*datalimit = dataend = datastart + rows*step[0]*/, it jumps back to the first if statement and returns, and then I get the Assertion failed error.
template<typename _Tp> inline Mat::Mat(const vector<_Tp>& vec, bool copyData)
: flags(MAGIC_VAL | DataType<_Tp>::type | CV_MAT_CONT_FLAG),
dims(2), rows((int)vec.size()), cols(1), data(0), refcount(0),
datastart(0), dataend(0), allocator(0), size(&rows)
{
if(vec.empty())
return;
if( !copyData )
{
step[0] = step[1] = sizeof(_Tp);
data = datastart = (uchar*)&vec[0];
datalimit = dataend = datastart + rows*step[0];
}
else
Mat((int)vec.size(), 1, DataType<_Tp>::type, (uchar*)&vec[0]).copyTo(*this);
}
Step into the function in a debugger and see exactly which assertion is failing. (It probably requires values in double (CV_64F) rather than float.)
OpenCV's new "InputArray" wrapper is supposed to allow you to call functions with any shape of Mat, vector of points, etc., and it will sort it out. But a lot of functions assume a particular input format or have obsolete assertions enforcing a particular format.
The stereo/calibration systems are the worst for requiring a specific layout, and successive operations frequently require a different layout.
The types don't seem right; at least in the code that worked for me I used different types (as mentioned in the documentation).
objectPoints – Array of object points in the object coordinate space, 3xN/Nx3 1-channel or 1xN/Nx1 3-channel, where N is the number of points. vector<Point3f> can also be passed here.
imagePoints – Array of corresponding image points, 2xN/Nx2 1-channel or 1xN/Nx1 2-channel, where N is the number of points. vector<Point2f> can also be passed here.
cameraMatrix – Input camera matrix A = [fx 0 cx; 0 fy cy; 0 0 1].
distCoeffs – Input vector of distortion coefficients (k_1, k_2, p_1, p_2[, k_3[, k_4, k_5, k_6]]) of 4, 5, or 8 elements. If the vector is NULL/empty, zero distortion coefficients are assumed.
rvec – Output rotation vector (see Rodrigues()) that, together with tvec, brings points from the model coordinate system to the camera coordinate system.
tvec – Output translation vector.
useExtrinsicGuess – If true (1), the function uses the provided rvec and tvec values as initial approximations of the rotation and translation vectors, respectively, and further optimizes them.
Documentation from here.
vector<Mat> rvec, tvec should be Mat rvec, tvec instead.
vector<vector<Point2f> > imagePoints(1) should be vector<Point2f> imagePoints(1) instead.
vector<vector<Point3f> > boardPoints(1) should be vector<Point3f> boardPoints(1) instead.
Note: I encountered the exact same problem, and this worked for me. (It is a little bit confusing, since calibrateCamera uses vectors of vectors.) I tried the change for rvec and tvec myself. I haven't tried it for imagePoints or boardPoints, but since the documentation quoted above says vector<Point3f> and vector<Point2f> should work, I thought I'd better mention it.
I ran into exactly the same problem with solvePnP and OpenCV 3. I tried to isolate the problem in a single test case. It seems that passing a std::vector to cv::InputArray does not do what is expected. The following small test works with OpenCV 2.4.9 but not with 3.2.
And this is exactly the problem when passing a std::vector of points to solvePnP; it causes the assert at line 63 in solvepnp.cpp to fail!
Generating a cv::Mat from the vector before passing it to solvePnP works.
//create list with 3 points
std::vector<cv::Point3f> vectorList;
vectorList.push_back(cv::Point3f(1.0, 1.0, 1.0));
vectorList.push_back(cv::Point3f(1.0, 1.0, 1.0));
vectorList.push_back(cv::Point3f(1.0, 1.0, 1.0));
//to input array
cv::InputArray inputArray(vectorList);
cv::Mat mat = inputArray.getMat();
cv::Mat matDirect = cv::Mat(vectorList);
LOG_INFO("Size vector: %d mat: %d matDirect: %d", vectorList.size(), mat.checkVector(3, CV_32F), matDirect.checkVector(3, CV_32F));
QVERIFY(vectorList.size() == mat.checkVector(3, CV_32F));
Result opencv 2.4.9 macos:
TestObject: OpenCV
Size vector: 3 mat: 3 matDirect: 3
Result opencv 3.2 win64:
TestObject: OpenCV
Size vector: 3 mat: 9740 matDirect: 3
I faced the same issue. In my case, (in python) converted the input array type as float.
It worked fine afterwards.

Testing a fundamental matrix

My questions are:
How do I figure out if my fundamental matrix is correct?
Is the code I posted below a good effort toward that?
My end goal is to do some sort of 3D reconstruction. Right now I'm trying to calculate the fundamental matrix so that I can estimate the difference between the two cameras. I'm doing this within openFrameworks, using the ofxCv addon, but for the most part it's just pure OpenCV. It's difficult to post code which isolates the problem since ofxCv is also in development.
My code basically reads in two 640x480 frames taken by my webcam from slightly different positions (basically just sliding the laptop a little bit horizontally). I already have a calibration matrix for it, obtained from ofxCv's calibration code, which uses findChessboardCorners. The undistortion example code seems to indicate that the calibration matrix is accurate. It calculates the optical flow between the pictures (either calcOpticalFlowPyrLK or calcOpticalFlowFarneback), and feeds those point pairs to findFundamentalMatrix.
To test if the fundamental matrix is valid, I decomposed it to a rotation and translation matrix. I then multiplied the rotation matrix by the points of the second image, to see what the rotation difference between the cameras was. I figured that any difference should be small, but I'm getting big differences.
Here's the fundamental and rotation matrix of my last code, if it helps:
fund: [-8.413948689969405e-07, -0.0001918870646474247, 0.06783422344973795;
0.0001877654679452431, 8.522397812179886e-06, 0.311671691674232;
-0.06780237856576941, -0.3177275967586101, 1]
R: [0.8081771697692786, -0.1096128431920695, -0.5786490187247098;
-0.1062963539438068, -0.9935398408215166, 0.03974506055610323;
-0.5792674230456705, 0.02938723035105822, -0.8146076621848839]
t: [0, 0.3019063882496216, -0.05799044915951077;
-0.3019063882496216, 0, -0.9515721940769112;
0.05799044915951077, 0.9515721940769112, 0]
Here's my portion of the code, which occurs after the second picture is taken:
const ofImage& image1 = images[images.size() - 2];
const ofImage& image2 = images[images.size() - 1];
std::vector<cv::Point2f> points1 = flow->getPointsPrev();
std::vector<cv::Point2f> points2 = flow->getPointsNext();
std::vector<cv::KeyPoint> keyPoints1 = convertFrom(points1);
std::vector<cv::KeyPoint> keyPoints2 = convertFrom(points2);
std::cout << "points1: " << points1.size() << std::endl;
std::cout << "points2: " << points2.size() << std::endl;
fundamentalMatrix = (cv::Mat)cv::findFundamentalMat(points1, points2);
cv::Mat cameraMatrix = (cv::Mat)calibration.getDistortedIntrinsics().getCameraMatrix();
cv::Mat cameraMatrixInv = cameraMatrix.inv();
std::cout << "fund: " << fundamentalMatrix << std::endl;
essentialMatrix = cameraMatrix.t() * fundamentalMatrix * cameraMatrix;
cv::SVD svd(essentialMatrix);
Matx33d W(0,-1,0, //HZ 9.13
1,0,0,
0,0,1);
cv::Mat_<double> R = svd.u * Mat(W).inv() * svd.vt; //HZ 9.19
std::cout << "R: " << (cv::Mat)R << std::endl;
Matx33d Z(0, -1, 0,
1, 0, 0,
0, 0, 0);
cv::Mat_<double> t = svd.vt.t() * Mat(Z) * svd.vt;
std::cout << "t: " << (cv::Mat)t << std::endl;
Vec3d tVec = Vec3d(t(1,2), t(2,0), t(0,1));
Matx34d P1 = Matx34d(R(0,0), R(0,1), R(0,2), tVec(0),
R(1,0), R(1,1), R(1,2), tVec(1),
R(2,0), R(2,1), R(2,2), tVec(2));
ofMatrix4x4 ofR(R(0,0), R(0,1), R(0,2), 0,
R(1,0), R(1,1), R(1,2), 0,
R(2,0), R(2,1), R(2,2), 0,
0, 0, 0, 1);
ofRs.push_back(ofR);
cv::Matx34d P(1,0,0,0,
0,1,0,0,
0,0,1,0);
for (int y = 0; y < image1.height; y += 10) {
for (int x = 0; x < image1.width; x += 10) {
Vec3d vec(x, y, 0);
Point3d point1(vec.val[0], vec.val[1], vec.val[2]);
Vec3d result = (cv::Mat)((cv::Mat)R * (cv::Mat)vec);
Point3d point2 = result;
mesh.addColor(image1.getColor(x, y));
mesh.addVertex(ofVec3f(point1.x, point1.y, point1.z));
mesh.addColor(image2.getColor(x, y));
mesh.addVertex(ofVec3f(point2.x, point2.y, point2.z));
}
}
Any ideas? Does my fundamental matrix look correct, or do I have the wrong idea in testing it?
If you want to find out whether your fundamental matrix is correct, you should compute its error.
Using the epipolar constraint equation, you can check how close the detected features in one image lie to the epipolar lines of the other image. Ideally, these dot products should sum to 0, and thus the calibration error is computed as the sum of absolute distances (SAD). The mean of the SAD is reported as the stereo calibration error. Basically, you are computing the SAD of the computed features in image_left (could be chessboard corners) from the corresponding epipolar lines. This error is measured in pixel^2; anything below 1 is acceptable.
OpenCV has code examples; look at the stereo calibration sample (stereo_calib.cpp), which shows how to compute this error.
https://code.ros.org/trac/opencv/browser/trunk/opencv/samples/c/stereo_calib.cpp?rev=2614
Look at "avgErr", lines 260-269.
Ankur
I think you did not remove the incorrect matches before using them to calculate F.
Also, I have an idea for how to validate F: from x'Fx = 0, you can substitute several x' and x pairs into the formula.
KyleFan
I wrote a python function to do this:
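import numpy as np

def Ferror(F, pts1, pts2):
    # pts1, pts2: Nx3 arrays of homogeneous coordinates, one matched pair per row.
    # Measures how well F satisfies pt1^T * F * pt2 == 0 for each pair.
    # (If F comes from cv2.findFundamentalMat(pts1, pts2), OpenCV's convention is
    # pt2^T * F * pt1 == 0, so swap the two point arguments.)
    vals = np.sum(pts1.dot(F) * pts2, axis=1)  # one value per pair, not all NxN combinations
    err = np.abs(vals)
    print("avg Ferror:", np.mean(err))
    return np.mean(err)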
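The same check can be expressed as a distance in pixels from each point to its epipolar line, which is easier to interpret than the raw algebraic residual. Below is a minimal C++ sketch using only the standard library; `Mat3`, `Point2`, `epipolarDistance` and `meanEpipolarDistance` are hypothetical names, and the convention assumed is p2^T * F * p1 = 0.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

using Mat3 = std::array<std::array<double, 3>, 3>;
struct Point2 { double x, y; };

// Distance in pixels from p2 to the epipolar line F * p1 in the second image,
// assuming the convention p2^T * F * p1 = 0.
double epipolarDistance(const Mat3& F, const Point2& p1, const Point2& p2)
{
    // Line coefficients (a, b, c) = F * (p1.x, p1.y, 1).
    const double a = F[0][0] * p1.x + F[0][1] * p1.y + F[0][2];
    const double b = F[1][0] * p1.x + F[1][1] * p1.y + F[1][2];
    const double c = F[2][0] * p1.x + F[2][1] * p1.y + F[2][2];
    const double norm = std::sqrt(a * a + b * b);
    assert(norm > 0.0);
    return std::fabs(a * p2.x + b * p2.y + c) / norm;
}

// Mean point-to-line distance over all matched pairs.
double meanEpipolarDistance(const Mat3& F,
                            const std::vector<Point2>& pts1,
                            const std::vector<Point2>& pts2)
{
    assert(!pts1.empty() && pts1.size() == pts2.size());
    double sum = 0.0;
    for (std::size_t i = 0; i < pts1.size(); ++i)
        sum += epipolarDistance(F, pts1[i], pts2[i]);
    return sum / static_cast<double>(pts1.size());
}
```

For a pure sideways camera shift the epipolar lines are horizontal, so with F = [0 0 0; 0 0 -1; 0 1 0] the distance reduces to the difference in y between the two matched points, which makes a handy test case.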

kissfft scaling

I am looking to compute a fast correlation using FFTs and the kissfft library, and scaling needs to be precise. What scaling is necessary (forward and backwards) and what value do I use to scale my data?
The 3 most common FFT scaling factors are:
1.0 forward FFT, 1.0/N inverse FFT
1.0/N forward FFT, 1.0 inverse FFT
1.0/sqrt(N) in both directions, FFT & IFFT
Given any ambiguity in the documentation, and whatever scaling you consider "correct" for your purposes, it is best to feed the FFT (and/or IFFT) in question a pure sine wave of known amplitude (1.0 float or 255 integer) that is exactly periodic in the FFT length, and check whether the scaling matches one of the above, differs from one of them by 2X or sqrt(2), or is something completely different.
E.g., write a unit test for kissfft in your environment for your data types.
Multiply each frequency response by 1/sqrt(N), for an overall scaling of 1/N.
In pseudocode:
ifft( fft(x)*conj( fft(y) )/N ) == circular_correlation(x,y)
At least this is true for kissfft with floating-point types.
The output of the following c++ example code should be something like
the circular correlation of [1, 3i, 0 0 ....] with itself = (10,0),(1.19796e-10,3),(-4.91499e-08,1.11519e-15),(1.77301e-08,-1.19588e-08) ...
#include <complex>
#include <cstring>
#include <iostream>
#include "kiss_fft.h"
using namespace std;
int main()
{
const int nfft=256;
kiss_fft_cfg fwd = kiss_fft_alloc(nfft,0,NULL,NULL);
kiss_fft_cfg inv = kiss_fft_alloc(nfft,1,NULL,NULL);
std::complex<float> x[nfft];
std::complex<float> fx[nfft];
memset(x,0,sizeof(x));
x[0] = 1;
x[1] = std::complex<float>(0,3);
kiss_fft(fwd,(kiss_fft_cpx*)x,(kiss_fft_cpx*)fx);
for (int k=0;k<nfft;++k) {
fx[k] = fx[k] * conj(fx[k]);
fx[k] *= 1./nfft;
}
kiss_fft(inv,(kiss_fft_cpx*)fx,(kiss_fft_cpx*)x);
cout << "the circular correlation of [1, 3i, 0 0 ....] with itself = ";
cout
<< x[0] << ","
<< x[1] << ","
<< x[2] << ","
<< x[3] << " ... " << endl;
kiss_fft_free(fwd);
kiss_fft_free(inv);
return 0;
}
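To check the 1/N rule without kissfft installed, the identity can be reproduced with a naive O(N^2) DFT that, like the pseudocode above assumes for kissfft, applies no scaling in either direction. This is only a stand-in for illustration: `Signal`, `dft` and `circularCorrelation` are hypothetical names, not kissfft API.

```cpp
#include <cassert>
#include <cmath>
#include <complex>
#include <cstddef>
#include <vector>

using Signal = std::vector<std::complex<double>>;

// Naive O(N^2) DFT with no scaling in either direction.
// inverse = true only flips the sign of the exponent.
Signal dft(const Signal& in, bool inverse)
{
    const std::size_t n = in.size();
    const double kPi = std::acos(-1.0);
    const double sign = inverse ? 1.0 : -1.0;
    Signal out(n);
    for (std::size_t k = 0; k < n; ++k)
        for (std::size_t t = 0; t < n; ++t)
            out[k] += in[t] * std::polar(1.0, sign * 2.0 * kPi * static_cast<double>((k * t) % n) / static_cast<double>(n));
    return out;
}

// Circular cross-correlation r[k] = sum_n x[n + k] * conj(y[n]), indices mod N,
// computed as ifft( fft(x) * conj(fft(y)) / N ) with the unscaled transforms above.
Signal circularCorrelation(const Signal& x, const Signal& y)
{
    assert(!x.empty() && x.size() == y.size());
    Signal fx = dft(x, false);
    const Signal fy = dft(y, false);
    const double n = static_cast<double>(x.size());
    for (std::size_t i = 0; i < fx.size(); ++i)
        fx[i] = fx[i] * std::conj(fy[i]) / n;
    return dft(fx, true);
}
```

With x = [1, 3i, 0, 0] correlated with itself, the zero-lag term is |1|^2 + |3i|^2 = 10 and the lag-1 term is 3i, matching the first two values of the kissfft output quoted above. Dropping the division by N would scale every term by N, which is the effect to look for when testing an unfamiliar FFT library.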
