Input: an LBP feature vector extracted from an image, with dimension 75520, so the input LBP data contains 1 row and 75520 columns.
Required output: apply PCA on the input to reduce its dimensionality.
Currently my code looks like this:
void PCA_DimensionReduction(Mat &src, Mat &dst){
const int PCA_DIMENSON_VAL = 40;
Mat tmp = src.reshape(1,1); // 1 row x 75520 cols
Mat projection_result;
Mat input_feature_vector;
Mat norm_tmp;
normalize(tmp,input_feature_vector,0,1,NORM_MINMAX,CV_32FC1);
PCA pca(input_feature_vector,Mat(),CV_PCA_DATA_AS_ROW, PCA_DIMENSON_VAL);
pca.project(input_feature_vector,projection_result);
dst = projection_result.reshape(1,1);
}
Basically I am using these features to match the similarity between two images, but I am not getting proper results, compared to matching without applying PCA.
Any help will be appreciated...
Regards
Haris...
You will have to collect feature vectors from a lot of images, compute a single PCA from all of them (offline), and later use the mean & eigenvectors for the projection.
// let's say you have collected 10 feature vectors of 30 elements each.
// flatten each to a single row (reshape(1,1)) and push_back into a big data Mat
Mat D(10,30,CV_32F); // 10 rows (feature vectors) of 30 elements each
randu(D,0,10); // only for the simulation here
cerr << D.size() << endl;
// [30 x 10]
// now make a pca, that will only retain 6 eigenvectors
// so the later projections are shortened to 6 elements:
PCA p(D,Mat(),CV_PCA_DATA_AS_ROW,6);
cerr << p.eigenvectors.size() << endl;
// [30 x 6]
// now that the training step is done, we can use it to
// shorten feature vectors:
// either keep the PCA around for projecting:
// a random test vector,
Mat v(1,30,CV_32F);
randu(v,0,30);
// pca projection:
Mat vp = p.project(v);
cerr << vp.size() << endl;
cerr << vp << endl;
// [6 x 1]
// [-4.7032223, 0.67155731, 15.192059, -8.1542597, -4.5874329, -3.7452228]
// or, maybe, save the pca.mean and pca.eigenvectors only, and do your own projection:
Mat vp2 = (v - p.mean) * p.eigenvectors.t(); // p.mean / p.eigenvectors, or your saved copies
cerr << vp2.size() << endl;
cerr << vp2 << endl;
//[6 x 1]
//[-4.7032223, 0.67155731, 15.192059, -8.1542597, -4.5874329, -3.7452228]
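If you go the route of saving pca.mean and pca.eigenvectors (as the comment above suggests), a minimal sketch with cv::FileStorage could look like this; the file path and node names are just placeholders:
#include <opencv2/opencv.hpp>
using namespace cv;
// save mean & eigenvectors once, after the offline training step
void savePCA(const PCA& p, const std::string& path)
{
    FileStorage fs(path, FileStorage::WRITE);
    fs << "mean" << p.mean;
    fs << "eigenvectors" << p.eigenvectors;
}
// later: load them and do the projection yourself, exactly as above
Mat loadAndProject(const std::string& path, const Mat& v)
{
    FileStorage fs(path, FileStorage::READ);
    Mat mean, eigenvectors;
    fs["mean"] >> mean;
    fs["eigenvectors"] >> eigenvectors;
    return (v - mean) * eigenvectors.t();
}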
Well, here's the downside: calculating a PCA from 4.4k training images of 75k feature elements each will take something like a good day ;)
I'm using OpenCV to detect features and compute descriptors.
For feature detection I'm using FAST:
cv::Ptr<cv::FeatureDetector> _detector = cv::FastFeatureDetector::create(_configuration.threshold,
_configuration.nonmaxSuppression);
For descriptors I'm using BRIEF:
cv::Ptr<cv::DescriptorExtractor> _descriptor_extractor = cv::xfeatures2d::BriefDescriptorExtractor::create();
After that, I'd like to order keypoints based on their response and store just a certain number of them:
typedef std::map<float,cv::KeyPoint,std::greater<float> > ResponseKeypointMap;
// keypoint buffer
std::vector<cv::KeyPoint> keypoints;
cv::Mat descriptors;
// detect keypoints
_detector->detect(rgb_image_, keypoints);
const int keypoints_size = keypoints.size();
if(!keypoints_size){
std::cerr << "warning: [PointDetector] found 0 keypoints!\n";
return;
}
ResponseKeypointMap keypoints_map;
for(int i=0; i < keypoints_size; ++i){
keypoints_map.insert(std::make_pair(keypoints[i].response,keypoints[i]));
}
int iterations = std::min(_configuration.max_keypoints_size,keypoints_size);
std::vector<cv::KeyPoint> filtered_keypoints;
filtered_keypoints.resize(iterations);
int k=0;
for(ResponseKeypointMap::iterator it = keypoints_map.begin();
it != keypoints_map.end();
++it){
filtered_keypoints[k] = it->second;
k++;
if(k>=iterations)
break;
}
std::cerr << "filtered keypoints size: " << filtered_keypoints.size() << std::endl;
_descriptor_extractor->compute(rgb_image_, filtered_keypoints, descriptors);
std::cerr << "Computed " << descriptors.rows << "x" << descriptors.cols << " descriptors" << std::endl;
I don't understand why: I'm giving 100 keypoints to the DescriptorExtractor, but I'm receiving only 55 descriptors.
I'd be very grateful if you could explain to me what is happening.
Thanks.
According to OpenCV documentation https://docs.opencv.org/2.4/modules/features2d/doc/common_interfaces_of_descriptor_extractors.html
DescriptorExtractor::compute(const Mat& image, vector<KeyPoint>& keypoints, Mat& descriptors)
...
keypoints – Input collection of keypoints. Keypoints for which a
descriptor cannot be computed are removed and the remaining ones may
be reordered. Sometimes new keypoints can be added, for example: SIFT
duplicates a keypoint with several dominant orientations (for each
orientation).
...
So, after the compute method executes, your filtered_keypoints vector has been altered, and you end up with a new pair of keypoints and descriptors, both of size 55.
In particular, FAST examines a circle of diameter 7 around each test point to decide whether it is a keypoint, but BRIEF uses 256 points around the test point. I don't know whether BRIEF uses a square area or a circular one, but either way it is bigger, so FAST may find keypoints that are too close to the image boundary for BRIEF to be able to compute the descriptor.
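If it helps to see the bookkeeping, here is a small sketch (reusing the variables from the snippet above, so it is not standalone) showing that the keypoint vector and the descriptor Mat stay aligned after compute(), i.e. descriptors.row(i) describes filtered_keypoints[i]:
// compute() modifies filtered_keypoints in place: keypoints whose descriptor
// cannot be computed (e.g. too close to the image border) are removed
_descriptor_extractor->compute(rgb_image_, filtered_keypoints, descriptors);
// afterwards the two stay aligned, e.g. both of size 55 here
CV_Assert((int)filtered_keypoints.size() == descriptors.rows);
for (int i = 0; i < descriptors.rows; ++i) {
    const cv::KeyPoint& kp = filtered_keypoints[i]; // surviving keypoint
    cv::Mat d = descriptors.row(i);                 // its BRIEF descriptor (1 x 32, CV_8U)
    // ... use kp and d together ...
}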
My goal is to use an SVM with HOG features to classify vehicles in traffic as sedans or SUVs.
I've used various kernels (RBF, LINEAR, POLY) and each gives different results, but they give the same results no matter the parameters changed. For example, if I am using a POLY kernel and the degree is greater than or equal to .65, it will classify everything as an SUV; if it's less than .65, it classifies all my testing images as sedans.
With a LINEAR kernel, the only parameter changed is C. No matter what the parameter C is, I always get 8/10 images classified as sedans and the same 2 classified as SUVs.
Now, I only have about 70 training images and 10 testing images; I haven't been able to find a good dataset of vehicles seen from the rear and above, as from a bridge, which is the viewpoint I will be using this for. Could the problem be due to this small dataset, or to the parameters, or something else? Also, I see that the number of support vectors is usually very high, like 58 out of the 70 training images, so that may point to a problem with the dataset? Is there a way for me to visualize the training points somehow? In the SVM examples they always have a nice 2D plot of points and draw a line through it; is there a way to plot those points with images so I can see if my data is linearly separable and make adjustments accordingly? Are my HOG parameters appropriate for a 150x200 image of a car?
Also note that when I use testing images that are the same as the training images, the SVM model predicts perfectly, but obviously that's cheating.
The following image shows the result, along with an example of a testing image.
Here is my code; I didn't include most of it because I'm not sure the code is the problem. First I take the positive images, extract HOG features, and load them into the training Mat, and then I do the same for the negative images, in the same way as in the testing portion included below.
//Set SVM Parameters (not sure about these values, but just wanna see something)
Ptr<SVM> svm = SVM::create();
svm->setType(SVM::C_SVC);
svm->setKernel(SVM::POLY);
svm->setC(50);
svm->setGamma(100);
svm->setDegree(.65);
//svm->setTermCriteria(TermCriteria(TermCriteria::MAX_ITER, 100, 1e-6));
cout << "Parameters Set..." << endl;
svm->train(HOGFeat_train, ROW_SAMPLE, labels_mat);
Mat SV = svm->getSupportVectors();
Mat USV = svm->getUncompressedSupportVectors();
cout << "Support Vectors: " << SV.rows << endl;
cout << "Uncompressed Support Vectors: " << USV.rows << endl;
cout << "Training Successful" << endl;
waitKey(0);
//TESTING PORTION
cout << "Begin Testing..." << endl;
int num_test_images = 10;
Mat HOGFeat_test(1, derSize, CV_32FC1); //Creates a 1 x descriptorSize Mat to house the HoG features from the test image
for (int file_count = 1; file_count < (num_test_images + 1); file_count++)
{
test << nameTest << file_count << type; //'Test_1.jpg' ... 'Test_2.jpg' ... etc ...
string filenameTest = test.str();
test.str("");
Mat test_image = imread(filenameTest, 0); //Read the file folder
HOGDescriptor hog_test;// (Size(64, 64), Size(32, 32), Size(16, 16), Size(32, 32), 9, 1, -1, 0, .2, 1, 64, false);
vector<float> descriptors_test;
vector<Point> locations_test;
hog_test.compute(test_image, descriptors_test, Size(64, 64), Size(0, 0), locations_test);
for (int i = 0; i < descriptors_test.size(); i++)
HOGFeat_test.at<float>(0, i) = descriptors_test.at(i);
namedWindow("Test Image", CV_WINDOW_NORMAL);
imshow("Test Image", test_image);
//Should return a 1 if its an SUV, or a -1 if its a sedan
float result = svm->predict(HOGFeat_test);
if (result <= 0)
cout << "Sedan" << endl;
else
cout << "SUV" << endl;
cout << "Result: " << result << endl;
waitKey(0);
}
Two things solved this issue:
1) I got a larger dataset of vehicles. I used about 400 SUV images and 400 sedan images for the training portion and then another 50 images for the testing portion.
2) In Mat HOGFeat_test(1, derSize, CV_32FC1), my derSize was too large by about an order of magnitude. The actual descriptor size was 15120, but I had given the Mat 113400 columns. As a result, I filled only about 10% of the testing Mat with useful feature data, so it was much harder for the SVM to tell any difference between SUVs and sedans.
Now it works great with both the linear and poly kernels (C = 10), and my accuracy is better than I expected, at a whopping 96%.
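For what it's worth, derSize never has to be hard-coded; a minimal sketch (assuming the default HOGDescriptor parameters and the 64x64 window stride from the question, on a 150x200 test image) queries the size instead:
#include <opencv2/opencv.hpp>
#include <iostream>
int main()
{
    cv::HOGDescriptor hog; // default parameters, like hog_test in the question
    cv::Mat image = cv::Mat::zeros(200, 150, CV_8UC1); // placeholder 150x200 image
    std::vector<float> descriptors;
    hog.compute(image, descriptors, cv::Size(64, 64), cv::Size(0, 0));
    // one window's descriptor length, and the total length actually returned
    std::cout << "getDescriptorSize(): " << hog.getDescriptorSize() << std::endl;
    std::cout << "descriptors.size():  " << descriptors.size() << std::endl;
    // size the feature Mat from the computed vector instead of a hard-coded derSize
    cv::Mat HOGFeat_test = cv::Mat(descriptors, true).reshape(1, 1); // 1 x descriptors.size(), CV_32F
    return 0;
}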
I'm trying to perform an RGB color mixing operation in OpenCV. I have the image contained in an MxNx3 Mat. I would like to multiply this by a 3x3 matrix. In Matlab I do the following:
* Flatten the image from MxNx3 to MNx3
* Multiply the MNx3 matrix by the 3x3 color mixing matrix
* Reshape back to MxNx3
In Opencv I would like to do the following:
void RGBMixing::mixColors(Mat &imData, Mat &rgbMixData)
{
float rgbmix[] = {1.4237, -0.12364, -0.30003, -0.65221, 2.1936, -0.54141, -0.38854, -0.47458, 1.8631};
Mat rgbMixMat(3, 3, CV_32F, rgbmix);
// Scale the coefficents
multiply(rgbMixMat, 1, rgbMixMat, 256);
Mat temp = imData.reshape(0, 1);
temp = temp.t();
multiply(temp, rgbMixMat, rgbMixData);
}
This compiles but generates an exception:
OpenCV Error: Sizes of input arguments do not match (The operation is neither 'array op array' (where arrays have the same size and the same number of channels), nor 'array op scalar', nor 'scalar op array') in arithm_op, file C:/slave/WinInstallerMegaPack/src/opencv/modules/core/src/arithm.cpp, line 1253
terminate called after throwing an instance of 'cv::Exception'
what(): C:/slave/WinInstallerMegaPack/src/opencv/modules/core/src/arithm.cpp:1253: error: (-209) The operation is neither 'array op array' (where arrays have the same size and the same number of channels), nor 'array op scalar', nor 'scalar op array' in function arithm_op
This application has requested the Runtime to terminate it in an unusual way. Please contact the application's support team for more information.
Update 1:
This is code that appears to work:
void RGBMixing::mixColors(Mat &imData, Mat&rgbMixData)
{
Size tempSize;
uint32_t channels;
float rgbmix[] = {1.4237, -0.12364, -0.30003, -0.65221, 2.1936, -0.54141, -0.38854, -0.47458, 1.8631};
Mat rgbMixMat(3, 3, CV_32F, rgbmix);
Mat flatImage = imData.reshape(1, 3);
tempSize = flatImage.size();
channels = flatImage.channels();
cout << "temp channels: " << channels << " Size: " << tempSize.width << " x " << tempSize.height << endl;
Mat flatFloatImage;
flatImage.convertTo(flatFloatImage, CV_32F);
Mat mixedImage = flatFloatImage.t() * rgbMixMat;
mixedImage = mixedImage.t();
rgbMixData = mixedImage.reshape(3, 1944);
channels = rgbMixData.channels();
tempSize = rgbMixData.size();
cout << "temp channels: " << channels << " Size: " << tempSize.width << " x " << tempSize.height << endl;
}
But the resulting image is distorted. If I skip the multiplication of the two matrices and just assign
mixedImage = flatFloatImage
then the resulting image looks fine (just not color mixed). So I must be doing something wrong, but I'm getting close.
I see a couple of things here:
For scaling the coefficients, OpenCV supports multiplication by a scalar, so instead of multiply(rgbMixMat, 1, rgbMixMat, 256); you can directly write rgbMixMat = 256 * rgbMixMat;.
If that is all your code, you don't properly initialize or assign values to imData, so the line Mat temp = imData.reshape(0, 1); is probably going to crash.
Assuming that imData is MxNx3 (a 3-channel Mat), you want to reshape it into MNx3 (1-channel). According to the documentation, when you write Mat temp = imData.reshape(0, 1); you are saying that you want the number of channels to remain the same and the number of rows to become 1. Instead, it should be:
Mat myData = Mat::ones(100, 100, CV_32FC3); // 100x100x3 matrix
Mat myDataReshaped = myData.reshape(1, myData.rows*myData.cols); // 10000x3 matrix
Why do you take the transpose temp = temp.t(); ?
When you write multiply(temp, rgbMixMat, mixData);, this is the per-element product. You want the matrix product, so you just have to do mixData = myDataReshaped * rgbMixMat; (and then reshape that).
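A tiny sketch of that difference on 2x2 matrices (values chosen purely for illustration):
#include <opencv2/opencv.hpp>
#include <iostream>
int main()
{
    cv::Mat A = (cv::Mat_<float>(2, 2) << 1, 2, 3, 4);
    cv::Mat B = (cv::Mat_<float>(2, 2) << 5, 6, 7, 8);
    cv::Mat elementwise, matrixProduct;
    cv::multiply(A, B, elementwise); // per-element: [5, 12; 21, 32]
    matrixProduct = A * B;           // matrix product: [19, 22; 43, 50]
    std::cout << elementwise << std::endl << matrixProduct << std::endl;
    return 0;
}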
Edit: It crashes if you don't use the transpose because you do imData.reshape(1, 3); instead of imData.reshape(1, imData.rows*imData.cols); (as in the code below).
Try
void RGBMixing::mixColors(Mat &imData, Mat&rgbMixData)
{
Size tempSize;
uint32_t channels;
float rgbmix[] = {1.4237, -0.12364, -0.30003, -0.65221, 2.1936, -0.54141, -0.38854, -0.47458, 1.8631};
Mat rgbMixMat(3, 3, CV_32F, rgbmix);
Mat flatImage = imData.reshape(1, imData.rows*imData.cols);
Mat flatFloatImage;
flatImage.convertTo(flatFloatImage, CV_32F);
Mat mixedImage = flatFloatImage * rgbMixMat;
rgbMixData = mixedImage.reshape(3, imData.rows);
}
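And a hypothetical usage sketch (the file name, the RGBMixing instance and the conversion back to 8-bit are assumptions, not part of the original code). Keep in mind that imread returns BGR channel order, so the mixing matrix may need its rows/columns reordered accordingly:
#include <opencv2/opencv.hpp>
int main()
{
    cv::Mat image = cv::imread("input.jpg"); // placeholder path; 8-bit, 3-channel (BGR)
    cv::Mat mixed;
    RGBMixing mixer;               // assumes a class exposing the mixColors method above
    mixer.mixColors(image, mixed); // mixed is CV_32FC3 after the final reshape
    cv::Mat display;
    mixed.convertTo(display, CV_8UC3); // saturate back to 8-bit for viewing
    cv::imshow("mixed", display);
    cv::waitKey(0);
    return 0;
}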
My questions are:
How do I figure out if my fundamental matrix is correct?
Is the code I posted below a good effort toward that?
My end goal is to do some sort of 3D reconstruction. Right now I'm trying to calculate the fundamental matrix so that I can estimate the difference between the two cameras. I'm doing this within openFrameworks, using the ofxCv addon, but for the most part it's just pure OpenCV. It's difficult to post code which isolates the problem since ofxCv is also in development.
My code basically reads in two 640x480 frames taken by my webcam from slightly different positions (basically just sliding the laptop a little bit horizontally). I already have a calibration matrix for it, obtained from ofxCv's calibration code, which uses findChessboardCorners. The undistortion example code seems to indicate that the calibration matrix is accurate. It calculates the optical flow between the pictures (either calcOpticalFlowPyrLK or calcOpticalFlowFarneback), and feeds those point pairs to findFundamentalMatrix.
To test if the fundamental matrix is valid, I decomposed it to a rotation and translation matrix. I then multiplied the rotation matrix by the points of the second image, to see what the rotation difference between the cameras was. I figured that any difference should be small, but I'm getting big differences.
Here's the fundamental and rotation matrix of my last code, if it helps:
fund: [-8.413948689969405e-07, -0.0001918870646474247, 0.06783422344973795;
0.0001877654679452431, 8.522397812179886e-06, 0.311671691674232;
-0.06780237856576941, -0.3177275967586101, 1]
R: [0.8081771697692786, -0.1096128431920695, -0.5786490187247098;
-0.1062963539438068, -0.9935398408215166, 0.03974506055610323;
-0.5792674230456705, 0.02938723035105822, -0.8146076621848839]
t: [0, 0.3019063882496216, -0.05799044915951077;
-0.3019063882496216, 0, -0.9515721940769112;
0.05799044915951077, 0.9515721940769112, 0]
Here's my portion of the code, which occurs after the second picture is taken:
const ofImage& image1 = images[images.size() - 2];
const ofImage& image2 = images[images.size() - 1];
std::vector<cv::Point2f> points1 = flow->getPointsPrev();
std::vector<cv::Point2f> points2 = flow->getPointsNext();
std::vector<cv::KeyPoint> keyPoints1 = convertFrom(points1);
std::vector<cv::KeyPoint> keyPoints2 = convertFrom(points2);
std::cout << "points1: " << points1.size() << std::endl;
std::cout << "points2: " << points2.size() << std::endl;
fundamentalMatrix = (cv::Mat)cv::findFundamentalMat(points1, points2);
cv::Mat cameraMatrix = (cv::Mat)calibration.getDistortedIntrinsics().getCameraMatrix();
cv::Mat cameraMatrixInv = cameraMatrix.inv();
std::cout << "fund: " << fundamentalMatrix << std::endl;
essentialMatrix = cameraMatrix.t() * fundamentalMatrix * cameraMatrix;
cv::SVD svd(essentialMatrix);
Matx33d W(0,-1,0, //HZ 9.13
1,0,0,
0,0,1);
cv::Mat_<double> R = svd.u * Mat(W).inv() * svd.vt; //HZ 9.19
std::cout << "R: " << (cv::Mat)R << std::endl;
Matx33d Z(0, -1, 0,
1, 0, 0,
0, 0, 0);
cv::Mat_<double> t = svd.vt.t() * Mat(Z) * svd.vt;
std::cout << "t: " << (cv::Mat)t << std::endl;
Vec3d tVec = Vec3d(t(1,2), t(2,0), t(0,1));
Matx34d P1 = Matx34d(R(0,0), R(0,1), R(0,2), tVec(0),
R(1,0), R(1,1), R(1,2), tVec(1),
R(2,0), R(2,1), R(2,2), tVec(2));
ofMatrix4x4 ofR(R(0,0), R(0,1), R(0,2), 0,
R(1,0), R(1,1), R(1,2), 0,
R(2,0), R(2,1), R(2,2), 0,
0, 0, 0, 1);
ofRs.push_back(ofR);
cv::Matx34d P(1,0,0,0,
0,1,0,0,
0,0,1,0);
for (int y = 0; y < image1.height; y += 10) {
for (int x = 0; x < image1.width; x += 10) {
Vec3d vec(x, y, 0);
Point3d point1(vec.val[0], vec.val[1], vec.val[2]);
Vec3d result = (cv::Mat)((cv::Mat)R * (cv::Mat)vec);
Point3d point2 = result;
mesh.addColor(image1.getColor(x, y));
mesh.addVertex(ofVec3f(point1.x, point1.y, point1.z));
mesh.addColor(image2.getColor(x, y));
mesh.addVertex(ofVec3f(point2.x, point2.y, point2.z));
}
}
Any ideas? Does my fundamental matrix look correct, or do I have the wrong idea in testing it?
If you want to find out whether your fundamental matrix is correct, you should compute its error.
Using the epipolar constraint equation, you can check how close the detected features in one image lie to the epipolar lines induced by their matches in the other image. Ideally these dot products should be 0, so the error is computed as the sum of absolute distances (SAD) of the detected features in image_left (these could be chessboard corners, for example) from the corresponding epipolar lines; the mean of this SAD is reported as the stereo calibration error. This error is measured in pixel^2, and anything below 1 is acceptable.
OpenCV has code examples, look at the Stereo Calibrate cpp file, it shows you how to compute this error.
https://code.ros.org/trac/opencv/browser/trunk/opencv/samples/c/stereo_calib.cpp?rev=2614
Look at "avgErr" Lines 260-269
Ankur
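As a rough illustration of that approach (a sketch of my own, not the stereo_calib.cpp code itself), assuming points1/points2 are the matched cv::Point2f correspondences and F is the matrix returned by findFundamentalMat:
#include <opencv2/opencv.hpp>
#include <cmath>
#include <vector>
// average distance (in pixels) of each point to the epipolar line induced
// by its match in the other image
double averageEpipolarError(const std::vector<cv::Point2f>& points1,
                            const std::vector<cv::Point2f>& points2,
                            const cv::Mat& F)
{
    std::vector<cv::Vec3f> lines1, lines2;
    cv::computeCorrespondEpilines(points2, 2, F, lines1); // lines in image 1
    cv::computeCorrespondEpilines(points1, 1, F, lines2); // lines in image 2
    double err = 0.0;
    for (size_t i = 0; i < points1.size(); ++i) {
        const cv::Vec3f& l1 = lines1[i];
        const cv::Vec3f& l2 = lines2[i];
        // distance of point (x, y) to line a*x + b*y + c = 0
        err += std::fabs(l1[0] * points1[i].x + l1[1] * points1[i].y + l1[2])
               / std::sqrt(l1[0] * l1[0] + l1[1] * l1[1]);
        err += std::fabs(l2[0] * points2[i].x + l2[1] * points2[i].y + l2[2])
               / std::sqrt(l2[0] * l2[0] + l2[1] * l2[1]);
    }
    return err / (2.0 * points1.size());
}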
I think that you did not remove incorrect matches before using them to calculate F.
Also, I have an idea on how to validate F: from x'Fx = 0, you can substitute several pairs of x' and x into the formula and check that the result is close to zero.
KyleFan
I wrote a python function to do this:
import numpy as np

def Ferror(F, pts1, pts2):  # pts are Nx3 arrays of homogeneous coordinates
    # how well each matched pair satisfies pt1_i * F * pt2_i == 0
    vals = np.diag(pts1.dot(F).dot(pts2.T))  # keep only the matched (diagonal) pairs
    err = np.abs(vals)
    print("avg Ferror:", np.mean(err))
    return np.mean(err)
I am looking to compute a fast correlation using FFTs and the kissfft library, and scaling needs to be precise. What scaling is necessary (forward and backwards) and what value do I use to scale my data?
The 3 most common FFT scaling factors are:
1.0 forward FFT, 1.0/N inverse FFT
1.0/N forward FFT, 1.0 inverse FFT
1.0/sqrt(N) in both directions, FFT & IFFT
Given any possible ambiguity in the documentation, and whatever scaling the user expects to be "correct" for their purposes, it is best to just feed a pure sine wave of known amplitude (1.0 float or 255 integer), exactly periodic in the FFT length, into the FFT (and/or IFFT) in question, and see whether the scaling matches one of the above, differs from one of the above by 2X or sqrt(2), or is something completely different.
In other words, write a unit test for kissfft in your environment, for your data types.
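For example, a minimal sketch of such a probe with kissfft (my own sketch, not from the kissfft docs): transform one exact period of a unit-amplitude sine and look at the magnitude of bin 1; an unscaled forward FFT gives N/2 there, a 1/N-scaled one gives 0.5, and a 1/sqrt(N)-scaled one gives sqrt(N)/2.
#include <cmath>
#include <complex>
#include <iostream>
#include "kiss_fft.h"
int main()
{
    const int nfft = 64;
    const double pi = 3.14159265358979323846;
    kiss_fft_cfg fwd = kiss_fft_alloc(nfft, 0, NULL, NULL);
    std::complex<float> x[nfft], fx[nfft];
    for (int n = 0; n < nfft; ++n)
        x[n] = (float)std::sin(2.0 * pi * n / nfft); // amplitude 1.0, one exact period
    kiss_fft(fwd, (kiss_fft_cpx*)x, (kiss_fft_cpx*)fx);
    std::cout << "|X[1]| = " << std::abs(fx[1])
              << "  (N/2 = " << nfft / 2.0 << " would mean no forward scaling)" << std::endl;
    kiss_fft_free(fwd);
    return 0;
}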
Multiply each of the two frequency responses by 1/sqrt(N) (or, equivalently, their product once by 1/N), for an overall scaling of 1/N.
In pseudocode:
ifft( fft(x)*conj( fft(y) )/N ) == circular_correlation(x,y)
At least this is true for kissfft with floating-point types.
The output of the following C++ example code should be something like:
the circular correlation of [1, 3i, 0 0 ....] with itself = (10,0),(1.19796e-10,3),(-4.91499e-08,1.11519e-15),(1.77301e-08,-1.19588e-08) ...
#include <complex>
#include <cstring>   // memset
#include <iostream>
#include "kiss_fft.h"
using namespace std;
int main()
{
    const int nfft = 256;
    kiss_fft_cfg fwd = kiss_fft_alloc(nfft, 0, NULL, NULL); // forward FFT
    kiss_fft_cfg inv = kiss_fft_alloc(nfft, 1, NULL, NULL); // inverse FFT
    std::complex<float> x[nfft];
    std::complex<float> fx[nfft];
    memset(x, 0, sizeof(x));
    x[0] = 1;
    x[1] = std::complex<float>(0, 3);
    // forward transform of x
    kiss_fft(fwd, (kiss_fft_cpx*)x, (kiss_fft_cpx*)fx);
    // multiply by the conjugate spectrum (here: of x itself) and scale by 1/N
    for (int k = 0; k < nfft; ++k) {
        fx[k] = fx[k] * conj(fx[k]);
        fx[k] *= 1.0f / nfft;
    }
    // inverse transform gives the circular (auto)correlation
    kiss_fft(inv, (kiss_fft_cpx*)fx, (kiss_fft_cpx*)x);
    cout << "the circular correlation of [1, 3i, 0 0 ....] with itself = ";
    cout << x[0] << "," << x[1] << "," << x[2] << "," << x[3] << " ... " << endl;
    kiss_fft_free(fwd);
    kiss_fft_free(inv);
    return 0;
}