kissfft scaling - signal-processing

I am looking to compute a fast correlation using FFTs and the kissfft library, and scaling needs to be precise. What scaling is necessary (forward and backwards) and what value do I use to scale my data?

The 3 most common FFT scaling factors are:
1.0 forward FFT, 1.0/N inverse FFT
1.0/N forward FFT, 1.0 inverse FFT
1.0/sqrt(N) in both directions, FFT & IFFT
Given any possible ambiguity in the documentation, and whatever scaling the user expects to be "correct" for their purposes, it is best to just feed the FFT (and/or IFFT) in question a pure sine wave of known amplitude (1.0 float or 255 integer) that is exactly periodic in the FFT length. Then check whether the scaling matches one of the above, differs from one of the above by a factor of 2 or sqrt(2), or is something else entirely.
e.g. Write a unit test for kissfft in your environment for your data types.
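For example, here is a minimal sketch of such a check (assumptions: kissfft built for float, and that printing one bin magnitude is enough to read off the scaling):
#include <cmath>
#include <complex>
#include <cstdio>
#include "kiss_fft.h"

int main()
{
    const int nfft = 64;
    const float kTwoPi = 6.2831853f;
    kiss_fft_cfg fwd = kiss_fft_alloc(nfft, 0, NULL, NULL);

    // Unit-amplitude complex exponential, exactly one cycle per frame,
    // so the forward FFT should put all of the energy in bin 1.
    std::complex<float> in[nfft], out[nfft];
    for (int n = 0; n < nfft; ++n)
        in[n] = std::polar(1.0f, kTwoPi * n / nfft);

    kiss_fft(fwd, (kiss_fft_cpx*)in, (kiss_fft_cpx*)out);

    // |X[1]| ~= N       -> unscaled forward transform (factor 1.0)
    // |X[1]| ~= 1       -> the forward transform already applies 1/N
    // |X[1]| ~= sqrt(N) -> 1/sqrt(N) scaling
    printf("|X[1]| = %f (N = %d)\n", std::abs(out[1]), nfft);

    kiss_fft_free(fwd);
    return 0;
}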

Scale each of the two frequency responses by 1/sqrt(N) (or, equivalently, scale their product once by 1/N), for an overall scaling of 1/N.
In pseudocode:
ifft( fft(x)*conj( fft(y) )/N ) == circular_correlation(x,y)
At least this is true for kissfft with floating point types.
The output of the following C++ example code should be something like
the circular correlation of [1, 3i, 0 0 ....] with itself = (10,0),(1.19796e-10,3),(-4.91499e-08,1.11519e-15),(1.77301e-08,-1.19588e-08) ...
#include <complex>
#include <cstring>   // for memset
#include <iostream>
#include "kiss_fft.h"

using namespace std;

int main()
{
    const int nfft = 256;
    // Forward and inverse configurations (second argument: 0 = forward, 1 = inverse)
    kiss_fft_cfg fwd = kiss_fft_alloc(nfft, 0, NULL, NULL);
    kiss_fft_cfg inv = kiss_fft_alloc(nfft, 1, NULL, NULL);

    std::complex<float> x[nfft];
    std::complex<float> fx[nfft];
    memset(x, 0, sizeof(x));
    x[0] = 1;
    x[1] = std::complex<float>(0, 3);

    // Forward FFT of x
    kiss_fft(fwd, (kiss_fft_cpx*)x, (kiss_fft_cpx*)fx);

    // Multiply by the conjugate (auto-correlation) and apply the 1/N scaling
    for (int k = 0; k < nfft; ++k) {
        fx[k] = fx[k] * conj(fx[k]);
        fx[k] *= 1.0f / nfft;
    }

    // Inverse FFT gives the circular auto-correlation of x
    kiss_fft(inv, (kiss_fft_cpx*)fx, (kiss_fft_cpx*)x);

    cout << "the circular correlation of [1, 3i, 0 0 ....] with itself = ";
    cout << x[0] << ","
         << x[1] << ","
         << x[2] << ","
         << x[3] << " ... " << endl;

    kiss_fft_free(fwd);
    kiss_fft_free(inv);
    return 0;
}

Related

drake gurobi cannot handle quadratic constraint

I am currently working to implement a simple quadratic constraint on an optimization problem. Gurobi's website says that quadratic constraints can be implemented.
Does drake not have an interface with this constraint using Gurobi?
The code is as follows.
#include <Eigen/Dense>
#include <math.h>
#include <iostream>
#include "drake/solvers/mathematical_program.h"
#include "drake/solvers/solve.h"
#include "drake/solvers/gurobi_solver.h"
#include "drake/solvers/scs_solver.h"

using namespace std;
using namespace Eigen;
using namespace drake;

int main() {
  solvers::MathematicalProgram prog;
  // Instantiate the decision variables
  solvers::VectorXDecisionVariable x = prog.NewContinuousVariables(2, "x");
  // Define constraints
  for (int i = 0; i < 2; i++) {
    prog.AddConstraint(x[i] * x[i] <= 2); // Replacing this with a linear constraint
                                          // such as prog.AddLinearConstraint(-5 <= x[i] && x[i] <= 5);
                                          // will work
  }
  // Define the cost
  MatrixXd Q(2, 2);
  Q << 2, 1,
       1, 4;
  VectorXd b(2);
  b << 0, 3;
  double c = 1.0;
  prog.AddQuadraticCost(Q, b, c, x);

  solvers::GurobiSolver solver;
  cout << "Gurobi available? " << solver.is_enabled() << endl;
  auto result = solver.Solve(prog);
  cout << "Is optimization successful?" << result.is_success() << endl;
  cout << "Optimal x: " << result.GetSolution().transpose() << endl;
  cout << "solver is: " << result.get_solver_id().name() << endl;
  cout << "computation time is: " << result.get_solver_details<solvers::GurobiSolver>().optimizer_time;
  return 0;
}
Drake currently (intentionally) doesn't support quadratic constraints. For better solver performance, I would recommend reformulating the optimization problem without quadratic constraints, using second-order cone constraints instead.
First, a quadratic constraint
xᵀx <= a²
can be regarded as the Lorentz cone constraint
(y, x) is in the Lorentz cone, with y = a.
In Drake, you could add this constraint as
prog.AddLorentzConeConstraint((drake::VectorX<drake::symbolic::Expression>(x.rows() + 1) << a, x).finished());
For the more general quadratic constraint 0.5xᵀQx + bᵀx + c <= 0, we also provide AddQuadraticAsRotatedLorentzConeConstraint, which imposes a rotated Lorentz cone constraint.
prog.AddQuadraticAsRotatedLorentzConeConstraint(Q, b, c, x);
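Applied to the constraint in the question (x[i]*x[i] <= 2, i.e. a = sqrt(2)), a minimal sketch of the replacement loop, assuming the Expression-vector AddLorentzConeConstraint overload shown above, could be:
// Replace prog.AddConstraint(x[i]*x[i] <= 2) with the equivalent Lorentz cone
// constraint: (sqrt(2), x[i]) in the Lorentz cone, i.e. x[i]^2 <= 2.
for (int i = 0; i < 2; i++) {
  prog.AddLorentzConeConstraint(
      (drake::VectorX<drake::symbolic::Expression>(2) << std::sqrt(2.0), x[i]).finished());
}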
For the quadratic cost
min 0.5xᵀQx + bᵀx + c
you could also reformulate it as a linear cost with a rotated Lorentz cone constraint
min z
z >= 0.5xᵀQx + bᵀx + c
In Drake, the code is
auto z = prog.NewContinuousVariables<1>()(0);
prog.AddLinearCost(z);
prog.AddRotatedLorentzConeConstraint(symbolic::Expression(z), symbolic::Expression(1), 0.5 * x.dot(Q * x) + b.dot(x) + c);
The reason we prefer a Lorentz cone constraint over a quadratically-constrained quadratic program (QCQP) is that with Lorentz cone constraints you end up with a second-order cone program (SOCP), which has a lot of nice features in terms of its dual form. You can read Alizadeh's paper for more details. Mosek also recommends SOCP over QCQP: https://docs.mosek.com/9.2/rmosek/prob-def-quadratic.html#a-recommendation
Possibly the (as-yet unimplemented) feature request at https://github.com/RobotLocomotion/drake/issues/6341 is what you're looking for?

opencv calibrateCamera function yielding bad results

I'm trying to get opencv camera calibration working but having trouble getting it to output valid data. I have an uncalibrated camera that I would like to calibrate, but to test my code I am using an Azure Kinect camera (the color camera), since the SDK supplies the correct intrinsics for it and I can verify them. I've collected 30 images of a chessboard from slightly different angles, which I understand should be sufficient, and run the calibration function, but no matter what flags I pass in I get values for fx and fy that are pretty different from the correct fx and fy, and distortion coefficients that are WILDLY different. Am I doing something wrong? Do I need more or better data?
A sample of the images I'm using can be found here: https://www.dropbox.com/sh/9pa94uedoe5mlxz/AABisSvgWwBT-bY65lfzp2N3a?dl=0
Save them in c:\calibration_test to run the code below.
#include <filesystem>
#include <iostream>
#include <opencv2/core.hpp>
#include <opencv2/calib3d/calib3d.hpp>
#include <opencv2/features2d/features2d.hpp>
#include <opencv2/imgproc/imgproc.hpp>
#include <opencv2/imgcodecs.hpp>

using namespace std;
namespace fs = experimental::filesystem;

static bool extractCorners(cv::Mat colorImage, vector<cv::Point3f>& corners3d, vector<cv::Point2f>& corners)
{
    // Each square is 20x20mm
    const float kSquareSize = 0.020f;
    const cv::Size boardSize(7, 9);
    const cv::Point3f kCenterOffset((float)(boardSize.width - 1) * kSquareSize, (float)(boardSize.height - 1) * kSquareSize, 0.f);

    cv::Mat image;
    cv::cvtColor(colorImage, image, cv::COLOR_BGRA2GRAY);

    int chessBoardFlags = cv::CALIB_CB_ADAPTIVE_THRESH | cv::CALIB_CB_NORMALIZE_IMAGE;
    if (!cv::findChessboardCorners(image, boardSize, corners, chessBoardFlags))
    {
        return false;
    }
    cv::cornerSubPix(image, corners, cv::Size(11, 11), cv::Size(-1, -1),
        cv::TermCriteria(cv::TermCriteria::EPS + cv::TermCriteria::COUNT, 30, 0.1));

    // Construct the corners
    for (int i = 0; i < boardSize.height; ++i)
        for (int j = 0; j < boardSize.width; ++j)
            corners3d.push_back(cv::Point3f(j * kSquareSize, i * kSquareSize, 0) - kCenterOffset);
    return true;
}

int main()
{
    vector<cv::Mat> frames;
    for (const auto& p : fs::directory_iterator("c:\\calibration_test\\"))
    {
        frames.push_back(cv::imread(p.path().string()));
    }

    int numFrames = (int)frames.size();
    vector<vector<cv::Point2f>> corners(numFrames);
    vector<vector<cv::Point3f>> corners3d(numFrames);
    int framesWithCorners = 0;
    for (int i = 0; i < numFrames; ++i)
    {
        if (extractCorners(frames[i], corners3d[framesWithCorners], corners[framesWithCorners]))
        {
            ++framesWithCorners;
        }
    }

    numFrames = framesWithCorners;
    corners.resize(numFrames);
    corners3d.resize(numFrames);

    // Camera intrinsics come from the Azure Kinect API
    cv::Matx33d cameraMatrix(
        914.111755f, 0.f, 960.887390f,
        0.f, 913.880615f, 551.566528f,
        0.f, 0.f, 1.f);
    vector<float> distCoeffs = { 0.576340079f, -2.71203661f, 0.000563957903f, -0.000239689150f, 1.54344523f, 0.454746544f, -2.53860712f, 1.47272563f };

    cv::Size imageSize = frames[0].size();
    vector<cv::Point3d> rotations;
    vector<cv::Point3d> translations;
    int flags = cv::CALIB_USE_INTRINSIC_GUESS | cv::CALIB_FIX_PRINCIPAL_POINT | cv::CALIB_RATIONAL_MODEL;
    double result = cv::calibrateCamera(corners3d, corners, imageSize, cameraMatrix, distCoeffs, rotations, translations,
        flags);

    // After this call, cameraMatrix has different values for fx and fy, and WILDLY different distortion coefficients.
    cout << "fx: " << cameraMatrix(0, 0) << endl;
    cout << "fy: " << cameraMatrix(1, 1) << endl;
    cout << "cx: " << cameraMatrix(0, 2) << endl;
    cout << "cy: " << cameraMatrix(1, 2) << endl;
    for (size_t i = 0; i < distCoeffs.size(); ++i)
    {
        cout << "d" << i << ": " << distCoeffs[i] << endl;
    }
    return 0;
}
Some sample output is:
fx: 913.143
fy: 917.965
cx: 960.887
cy: 551.567
d0: 0.327596
d1: -73.1837
d2: -0.00125972
d3: 0.002805
d4: -7.93086
d5: 0.295437
d6: -73.481
d7: -3.25043
d8: 0
d9: 0
d10: 0
d11: 0
d12: 0
d13: 0
Any idea what I'm doing wrong?
Bonus question: Why do I get 14 distortion coefficients back instead of 8? If I leave off CALIB_RATIONAL_MODEL then I only get 5 (three radial and two tangential).
You need to take images across the whole field of view of the camera to correctly capture the lens distortion characteristics. The images you provide only show the chessboard in one position, slightly angled.
Ideally you should have images of the chessboard evenly distributed over the x and y axes of the image plane, right up to the edges of the image. Make sure a sufficient white border around the board is always visible, though, for detection robustness.
You should also try to capture images where the chessboard is nearer to the camera and farther away, not just at a uniform distance. The different angles you provide look good, on the other hand.
You can find an extensive guide how to ensure good calibration results in this answer: How to verify the correctness of calibration of a webcam?
Comparing your camera matrix to the one coming from the Azure Kinect API, it doesn't look so bad. The principal point is pretty much spot on and the focal length is in a reasonable range. If you improve the quality of the input with my tips and the SO answer I have provided, the results should be even closer. Comparing sets of distortion coefficients by their distance doesn't really work that well; the error function is not convex, so you can have lots of local minima that produce relatively good results but are far from the global minimum that would yield the best results, if that explanation makes sense to you.
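A more meaningful way to compare the two models is to undistort the same pixel grid with each set of coefficients and look at the per-pixel disagreement, rather than comparing the coefficients themselves. A rough sketch (the helper name and the sampling stride are made up for illustration):
#include <algorithm>
#include <cmath>
#include <vector>
#include <opencv2/core.hpp>
#include <opencv2/calib3d.hpp>
#include <opencv2/imgproc.hpp>

// Worst-case disagreement (in pixels) between two distortion models that share
// the same camera matrix K, sampled over a coarse grid of image points.
static double compareDistortionModels(const cv::Matx33d& K,
                                      const std::vector<float>& dist1,
                                      const std::vector<float>& dist2,
                                      cv::Size imageSize)
{
    std::vector<cv::Point2f> grid, und1, und2;
    for (int y = 0; y < imageSize.height; y += 40)
        for (int x = 0; x < imageSize.width; x += 40)
            grid.emplace_back((float)x, (float)y);

    // Passing K as the new projection matrix P maps the undistorted points back to pixels.
    cv::undistortPoints(grid, und1, K, dist1, cv::noArray(), K);
    cv::undistortPoints(grid, und2, K, dist2, cv::noArray(), K);

    double maxDiff = 0.0;
    for (size_t i = 0; i < grid.size(); ++i)
    {
        cv::Point2f d = und1[i] - und2[i];
        maxDiff = std::max(std::hypot((double)d.x, (double)d.y), maxDiff);
    }
    return maxDiff;
}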
Regarding your bonus question: I only see 8 values filled in in the output you get back; the rest are 0, so they don't have any influence. I'm not sure whether the output of that function is expected to be different.

Testing my SVM Model with different parameters yields exact same results

My goal is to use an SVM w/ HOG features to classify vehicles in traffic under sedans and SUVs.
I've used various kernels (RBF, LINEAR, POLY) and each gives different results, but they give the same results no matter what parameters are changed. For example, if I am using a POLY kernel and the degree is greater than or equal to .65, it will classify everything as an SUV; if it's less than .65, it will classify all my testing images as sedans.
With a LINEAR kernel, the only parameter changed is C. No matter what the parameter C is, I always get 8/10 images classified as sedans and the same 2 classified as SUVs.
Now, I only have about 70 training images and 10 testing images; I haven't been able to find a good dataset of vehicles from the rear and slightly above, like from a bridge, which is the viewpoint I will be using this for. Could the problem be due to this small dataset, or the parameters, or something else? Also, I see that my number of support vectors is usually very high, like 58 out of the 70 training images, so that may be a problem with the dataset? Is there a way for me to visualize the training points somehow--in the SVM examples they always have a nice 2D plot of points and draw a line through it, but is there a way to plot those points with images so I can see if my data is linearly separable and make adjustments accordingly? Are my HOG parameters accurate for a 150x200 image of a car?
Also note that when I use testing images that are the same as training images, then the SVM model predicts perfectly, but obviously that's cheating.
The following image shows the result, and an example of a testing image
Here is my code, I didn't include most of it because I'm not sure the code is the problem. First I take the positive images, extract HOG features, then load them into the training Mat, and then do the same for the negative images in the same way that I do for the included testing part.
//Set SVM Parameters (not sure about these values, but just wanna see something)
Ptr<SVM> svm = SVM::create();
svm->setType(SVM::C_SVC);
svm->setKernel(SVM::POLY);
svm->setC(50);
svm->setGamma(100);
svm->setDegree(.65);
//svm->setTermCriteria(TermCriteria(TermCriteria::MAX_ITER, 100, 1e-6));
cout << "Parameters Set..." << endl;

svm->train(HOGFeat_train, ROW_SAMPLE, labels_mat);

Mat SV = svm->getSupportVectors();
Mat USV = svm->getUncompressedSupportVectors();
cout << "Support Vectors: " << SV.rows << endl;
cout << "Uncompressed Support Vectors: " << USV.rows << endl;
cout << "Training Successful" << endl;
waitKey(0);

//TESTING PORTION
cout << "Begin Testing..." << endl;
int num_test_images = 10;
Mat HOGFeat_test(1, derSize, CV_32FC1); //Creates a 1 x descriptorSize Mat to house the HoG features from the test image

for (int file_count = 1; file_count < (num_test_images + 1); file_count++)
{
    test << nameTest << file_count << type; //'Test_1.jpg' ... 'Test_2.jpg' ... etc ...
    string filenameTest = test.str();
    test.str("");

    Mat test_image = imread(filenameTest, 0); //Read the file folder

    HOGDescriptor hog_test;// (Size(64, 64), Size(32, 32), Size(16, 16), Size(32, 32), 9, 1, -1, 0, .2, 1, 64, false);
    vector<float> descriptors_test;
    vector<Point> locations_test;
    hog_test.compute(test_image, descriptors_test, Size(64, 64), Size(0, 0), locations_test);

    for (int i = 0; i < descriptors_test.size(); i++)
        HOGFeat_test.at<float>(0, i) = descriptors_test.at(i);

    namedWindow("Test Image", CV_WINDOW_NORMAL);
    imshow("Test Image", test_image);

    //Should return a 1 if its an SUV, or a -1 if its a sedan
    float result = svm->predict(HOGFeat_test);

    if (result <= 0)
        cout << "Sedan" << endl;
    else
        cout << "SUV" << endl;

    cout << "Result: " << result << endl;
    waitKey(0);
}
Two things solved this issue:
1) I got a larger dataset of vehicles. I used about 400 SUV images and 400 sedan images for the training portion and then another 50 images for the testing portion.
2) In Mat HOGFeat_test(1, derSize, CV_32FC1), my derSize was about an order of magnitude too large. The actual descriptor size was 15120, but I had given the Mat 113400 columns. Thus, I filled only about 10% of the testing Mat with useful feature data, so it was much harder for the SVM to tell any difference between SUVs and sedans.
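One way to avoid this kind of mismatch is to size the feature Mat from the computed descriptor itself instead of a hard-coded constant. A rough sketch, reusing the variable names from the testing loop in the question:
// Compute the descriptor first, then let its length define the Mat width,
// so HOGFeat_test always matches what hog_test.compute() actually produced.
vector<float> descriptors_test;
hog_test.compute(test_image, descriptors_test, Size(64, 64), Size(0, 0));

Mat HOGFeat_test(1, (int)descriptors_test.size(), CV_32FC1);
for (size_t i = 0; i < descriptors_test.size(); ++i)
    HOGFeat_test.at<float>(0, (int)i) = descriptors_test[i];

float result = svm->predict(HOGFeat_test);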
Now it works great with both the linear and poly kernel (C = 10), and my accuracy is better than I expected at a whopping 96%.

Tuning vlfeat SVM

I have 6 samples of 1-dimensional data as an example and I'm trying to train vlfeat's SVM on it:
data:
[188.00000000;
168.00000000;
191.00000000;
150.00000000;
154.00000000;
124.00000000]
first 3 samples are positive and last 3 samples are negative.
and I get these weights (including the bias):
w: -0.6220197226 -0.0002974511
The problem is that all samples get predicted as negative, but they are clearly linearly separable.
For learning I use solver type VlSvmSolverSgd and lambda 0.01.
I'm using the C API, if it matters.
Minimum working example:
void vlfeat_svm_test()
{
    vl_size const numData = 6;
    vl_size const dimension = 1;
    //double x[dimension * numData] = {188.0, 168.0, 191.0, 150.0, 154.0, 124.0};
    double x[dimension * numData] = {188.0/255, 168.0/255, 191.0/255, 150.0/255, 154.0/255, 124.0/255};
    double y[numData] = {1, 1, 1, -1, -1, -1};
    double lambda = 0.01;

    VlSvm *svm = vl_svm_new(VlSvmSolverSgd, x, dimension, numData, y, lambda);
    vl_svm_train(svm);

    double const *w = vl_svm_get_model(svm);
    double bias = vl_svm_get_bias(svm);

    for (int k = 0; k < numData; ++k)
    {
        double res = 0.0;
        for (int i = 0; i < dimension; ++i)
        {
            res += x[k*dimension + i] * w[i];
        }
        int pred = ((res + bias) > 0) ? 1 : -1;
        cout << pred << endl;
    }

    cout << "w: ";
    for (int i = 0; i < dimension; ++i)
        cout << w[i] << " ";
    cout << bias << endl;

    vl_svm_delete(svm);
}
Update:
Also, I tried to scale the input data by dividing by 255; it had no effect.
Update 2:
An extremely low lambda = 0.000001 seems to solve the problem.
This happens because the SVM solvers in VLFeat do not estimate the model and bias directly, but use the workaround of adding a constant component to the data (as mentioned in http://www.vlfeat.org/api/svm-fundamentals.html) and return the corresponding model weight as the bias.
The bias term is thus a part of the regularizer and models with higher bias are "penalized" in terms of energy. This effect is especially strong in your case, since your data are extremely low dimensional :)
Therefore you need to choose a small value of the regularization parameter LAMBDA to lower the importance of the regularizer.
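A sketch of both knobs, building on the example above: lower lambda, and/or raise the bias multiplier so the implicit constant component is penalized less (assuming vl_svm_set_bias_multiplier from vl/svm.h is available in your VLFeat version):
// Same setup as in the question, with weaker regularization and a larger
// bias multiplier so the bias is a smaller part of the regularized weight vector.
double lambda = 0.000001;
VlSvm *svm = vl_svm_new(VlSvmSolverSgd, x, dimension, numData, y, lambda);
vl_svm_set_bias_multiplier(svm, 10.0);  // make the constant "bias" feature larger
vl_svm_train(svm);

double const *w = vl_svm_get_model(svm);
double bias = vl_svm_get_bias(svm);     // bias of the trained model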

Testing a fundamental matrix

My questions are:
How do I figure out if my fundamental matrix is correct?
Is the code I posted below a good effort toward that?
My end goal is to do some sort of 3D reconstruction. Right now I'm trying to calculate the fundamental matrix so that I can estimate the difference between the two cameras. I'm doing this within openFrameworks, using the ofxCv addon, but for the most part it's just pure OpenCV. It's difficult to post code which isolates the problem since ofxCv is also in development.
My code basically reads in two 640x480 frames taken by my webcam from slightly different positions (basically just sliding the laptop a little bit horizontally). I already have a calibration matrix for it, obtained from ofxCv's calibration code, which uses findChessboardCorners. The undistortion example code seems to indicate that the calibration matrix is accurate. It calculates the optical flow between the pictures (either calcOpticalFlowPyrLK or calcOpticalFlowFarneback), and feeds those point pairs to findFundamentalMatrix.
To test if the fundamental matrix is valid, I decomposed it to a rotation and translation matrix. I then multiplied the rotation matrix by the points of the second image, to see what the rotation difference between the cameras was. I figured that any difference should be small, but I'm getting big differences.
Here's the fundamental and rotation matrix of my last code, if it helps:
fund: [-8.413948689969405e-07, -0.0001918870646474247, 0.06783422344973795;
0.0001877654679452431, 8.522397812179886e-06, 0.311671691674232;
-0.06780237856576941, -0.3177275967586101, 1]
R: [0.8081771697692786, -0.1096128431920695, -0.5786490187247098;
-0.1062963539438068, -0.9935398408215166, 0.03974506055610323;
-0.5792674230456705, 0.02938723035105822, -0.8146076621848839]
t: [0, 0.3019063882496216, -0.05799044915951077;
-0.3019063882496216, 0, -0.9515721940769112;
0.05799044915951077, 0.9515721940769112, 0]
Here's my portion of the code, which occurs after the second picture is taken:
const ofImage& image1 = images[images.size() - 2];
const ofImage& image2 = images[images.size() - 1];

std::vector<cv::Point2f> points1 = flow->getPointsPrev();
std::vector<cv::Point2f> points2 = flow->getPointsNext();
std::vector<cv::KeyPoint> keyPoints1 = convertFrom(points1);
std::vector<cv::KeyPoint> keyPoints2 = convertFrom(points2);
std::cout << "points1: " << points1.size() << std::endl;
std::cout << "points2: " << points2.size() << std::endl;

fundamentalMatrix = (cv::Mat)cv::findFundamentalMat(points1, points2);
cv::Mat cameraMatrix = (cv::Mat)calibration.getDistortedIntrinsics().getCameraMatrix();
cv::Mat cameraMatrixInv = cameraMatrix.inv();
std::cout << "fund: " << fundamentalMatrix << std::endl;

essentialMatrix = cameraMatrix.t() * fundamentalMatrix * cameraMatrix;
cv::SVD svd(essentialMatrix);
Matx33d W(0, -1, 0,   //HZ 9.13
          1,  0, 0,
          0,  0, 1);
cv::Mat_<double> R = svd.u * Mat(W).inv() * svd.vt; //HZ 9.19
std::cout << "R: " << (cv::Mat)R << std::endl;

Matx33d Z(0, -1, 0,
          1,  0, 0,
          0,  0, 0);
cv::Mat_<double> t = svd.vt.t() * Mat(Z) * svd.vt;
std::cout << "t: " << (cv::Mat)t << std::endl;
Vec3d tVec = Vec3d(t(1,2), t(2,0), t(0,1));

Matx34d P1 = Matx34d(R(0,0), R(0,1), R(0,2), tVec(0),
                     R(1,0), R(1,1), R(1,2), tVec(1),
                     R(2,0), R(2,1), R(2,2), tVec(2));
ofMatrix4x4 ofR(R(0,0), R(0,1), R(0,2), 0,
                R(1,0), R(1,1), R(1,2), 0,
                R(2,0), R(2,1), R(2,2), 0,
                0, 0, 0, 1);
ofRs.push_back(ofR);

cv::Matx34d P(1, 0, 0, 0,
              0, 1, 0, 0,
              0, 0, 1, 0);

for (int y = 0; y < image1.height; y += 10) {
    for (int x = 0; x < image1.width; x += 10) {
        Vec3d vec(x, y, 0);

        Point3d point1(vec.val[0], vec.val[1], vec.val[2]);
        Vec3d result = (cv::Mat)((cv::Mat)R * (cv::Mat)vec);
        Point3d point2 = result;

        mesh.addColor(image1.getColor(x, y));
        mesh.addVertex(ofVec3f(point1.x, point1.y, point1.z));

        mesh.addColor(image2.getColor(x, y));
        mesh.addVertex(ofVec3f(point2.x, point2.y, point2.z));
    }
}
Any ideas? Does my fundamental matrix look correct, or do I have the wrong idea in testing it?
If you want to find out whether your fundamental matrix is correct, you should compute its error.
Using the epipolar constraint equation, you can check how close the detected features in one image lie to the epipolar lines of the other image. Ideally, each of these products should be 0, so the calibration error is computed as the sum of absolute distances (SAD) of the computed features in image_left (these could be chessboard corners) from the corresponding epipolar lines. The mean of the SAD is reported as the stereo calibration error. This error is measured in pixel^2; anything below 1 is acceptable.
OpenCV has code examples, look at the Stereo Calibrate cpp file, it shows you how to compute this error.
https://code.ros.org/trac/opencv/browser/trunk/opencv/samples/c/stereo_calib.cpp?rev=2614
Look at "avgErr" Lines 260-269
Ankur
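For reference, a minimal sketch of that check in C++, using the fundamentalMatrix and the points1/points2 vectors from the code in the question (the helper name is made up for illustration; this computes the algebraic residual x2ᵀ F x1, not a normalized point-to-line distance):
#include <cmath>
#include <vector>
#include <opencv2/core.hpp>

// Mean absolute epipolar residual |x2^T F x1| over the matched point pairs.
// findFundamentalMat returns F as a 3x3 CV_64F matrix, so double math is used here.
static double meanEpipolarResidual(const cv::Mat& F,
                                   const std::vector<cv::Point2f>& points1,
                                   const std::vector<cv::Point2f>& points2)
{
    double sum = 0.0;
    for (size_t i = 0; i < points1.size(); ++i) {
        cv::Mat x1 = (cv::Mat_<double>(3, 1) << points1[i].x, points1[i].y, 1.0);
        cv::Mat x2 = (cv::Mat_<double>(3, 1) << points2[i].x, points2[i].y, 1.0);
        sum += std::abs(cv::Mat(x2.t() * F * x1).at<double>(0, 0));
    }
    return points1.empty() ? 0.0 : sum / (double)points1.size();
}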
I think that you did not remove incorrect matches before using them to calculate F.
Also, I have an idea on how to validate F: from x'Fx = 0, you can substitute several pairs of x' and x into the formula and check that the result is close to zero.
KyleFan
I wrote a python function to do this:
import numpy as np

def Ferror(F, pts1, pts2):  # pts are Nx3 arrays of homogeneous coordinates
    # How well F satisfies pt1 * F * pt2 == 0 for each corresponding pair:
    # take only the per-pair products (the diagonal), not the full NxN matrix.
    vals = np.sum(pts1.dot(F) * pts2, axis=1)
    err = np.abs(vals)
    print("avg Ferror:", np.mean(err))
    return np.mean(err)
