cv::SVM response one class for every sample - opencv

I am new in Match faces , I am trying to learn how to use SVM with HOG descriptors.
I wrote a simple face recognizer with SVM, but when i activate it , code always returns 1
float *getHOG(const cv::Mat &image, int* count)//Compute HOG
{
cv::HOGDescriptor hog;
std::vector<float> res;
cv::Mat img2;
cv::resize(image, img2, cv::Size(64, 128));
hog.compute(img2, res, cv::Size(8, 8), cv::Size(0, 0));
*count = res.size();
float* result = new float[*count];
for(int i = 0; i < res.size(); i++)
{
result[i] = res[i];
}
return result;
}
const int dataSetLength = 10;
float **getTraininigData(int* setlen, int* veclen)//Load some samples of data
{
char *names[dataSetLength] = {
"../faces/s1/1.pgm",
"../faces/s1/2.pgm",
"../faces/s1/3.pgm",
"../faces/s1/4.pgm",
"../faces/s1/5.pgm",
"../faces/cars/1.jpg",
"../faces/cars/2.jpg",
"../faces/cars/3.jpg",
"../faces/cars/4.jpg",
"../faces/cars/5.jpg",
};
float **res = new float* [dataSetLength];
for(int i = 0; i < dataSetLength; i++)
{
std::cout<<names[i]<<"\n";
cv::Mat img = cv::imread(names[i], 0);
res[i] = getHOG(img, veclen);
}
*setlen = dataSetLength;
return res;
}
void test()//Training and activate SVM
{
int setlen, veclen;
float **trainingData = getTraininigData(&setlen, &veclen);
float *labels = new float[dataSetLength];
for(int i = 0; i < dataSetLength; i++)
{
labels[i] = (i < dataSetLength/2)? 0.0 : 1.0;
}
cv::Mat labelsMat(setlen, 1, CV_32FC1, labels);
cv::Mat trainingDataMat(setlen, veclen, CV_32FC1, trainingData);
cv::SVMParams params;
params.svm_type = cv::SVM::C_SVC;
params.kernel_type = cv::SVM::LINEAR;
params.term_crit = cv::TermCriteria(CV_TERMCRIT_ITER, 100, 1e-6);
std::cout<<labelsMat<<"\n";
cv::SVM SVM;
SVM.train(trainingDataMat, labelsMat, cv::Mat(), cv::Mat(), params);
cv::Mat img = cv::imread("../faces/s1/2.pgm", 0);//sample from train data, but ansewer is 1 for every sample
auto desc = getHOG(img, &veclen);
cv::Mat sampleMat(1, veclen, CV_32FC1, desc);
float response = SVM.predict(sampleMat);
std::cout<<"resp "<< response<<"\n";
}
What wrong with my code ?
PS sorry for my writing mistakes. English in not my native language

You don't have much training data. Note how Dalal and Triggs in their original paper on HOG (http://lear.inrialpes.fr/people/triggs/pubs/Dalal-cvpr05.pdf) used thousands of examples to train the SVM, you have just 5 negative and 5 positive.
You haven't set the C parameter (you need to find a good value via cross validation) - you will need more data.
Possibly HOG descriptors for faces and cars are not separable with a linear kernel, try RBF.
But this is unlikely to be an issue since D&L use a linear SVM in their paper.
Read this: http://www.csie.ntu.edu.tw/~cjlin/papers/guide/guide.pdf
If you haven't done this yet, get the SVM working for a simpler case (e.g. just use image patches instead of HOG).

Related

Is there a built-in function to split a 3-channel Mat into three 3-channel Mat rather than into three 1-channel Mat?

As far as I know the built-in split will split one 3-channel Mat into three 1-channel Mat. As a result, those three Mat are just gray scale with some different intensities.
My intent is to get three 3-channel Mat as follows.
void splitTo8UC3(const Mat& input, vector<Mat>& output)
{
Mat blue = input.clone();
Mat green = input.clone();
Mat red = input.clone();
const uint N = input.rows * input.step;
for (uint i = 0; i < N; i += 3)
{
// blue.data[i]
green.data[i] = 0;
red.data[i] = 0;
blue.data[i + 1] = 0;
//green.data[i+1]
red.data[i + 1] = 0;
blue.data[i + 2] = 0;
green.data[i + 2] = 0;
//red.data[i+2]
}
output.push_back(blue);
output.push_back(green);
output.push_back(red);
}
It works but instead of reinventing the wheel, I am looking for the built-in if any.
Edit
The proposed solution must be faster than mine.
EDIT: I incorporated Dan's suggested improvements from his comment.
I can't think of a built-in function exactly doing this, and I also couldn't find one. But while doing some research, I came across the mixChannels function, which might improve your solution. At least, it avoids implementing a loop.
Here are my modifications to your code:
void splitTo8UC3(const cv::Mat& input, std::vector<cv::Mat>& output)
{
// Allocate outputs
cv::Mat b(cv::Mat::zeros(input.size(), input.type()));
cv::Mat g(cv::Mat::zeros(input.size(), input.type()));
cv::Mat r(cv::Mat::zeros(input.size(), input.type()));
// Collect outputs
cv::Mat out[] = { b, g, r };
// Set up index pairs
int from_to[] = { 0,0, 1,4, 2,8 };
cv::mixChannels(&input, 1, out, 3, from_to, 3);
output.assign(std::begin(out), std::end(out));
}
Let's have this test image colors.png:
And, let's have this test code:
cv::Mat img = cv::imread("images/colors.png");
std::vector<cv::Mat> bgr;
splitTo8UC3(img, bgr);
cv::imwrite("images/b.png", bgr[0]);
cv::imwrite("images/g.png", bgr[1]);
cv::imwrite("images/r.png", bgr[2]);
Then, we get the following outputs b.png, g.png, and r.png, which hopefully are the them as for your initial solution:
Hope that helps!

partition a set of images into k clusters with opencv

I have an image data set that I would like to partition into k clusters. I am trying to use the opencv implementation of k-means clustering.
Firstly, I store my Mat images into a vector of Mat and then I am trying to use the kmeans function. However, I am getting an assertion error.
Should the images be stored into a different kind of structure? I have read the k-means documentation and I dont seem to understand what I am doing wrong. This is my code:
Thank you in advance,
vector <Mat> images;
string folder = "D:\\football\\positive_clustering\\";
string mask = "*.bmp";
vector<string> files = getFileList(folder + mask);
for (int i = 0; i < files.size(); i++)
{
Mat img = imread(folder + files[i]);
images.push_back(img);
}
cout << "Vector of positive samples created" << endl;
int k = 10;
cv::Mat bestLabels;
cv::kmeans(images, k, bestLabels, TermCriteria(), 3, KMEANS_PP_CENTERS);
//have a look
vector<cv::Mat> clusterViz(bestLabels.rows);
for (int i = 0; i<bestLabels.rows; i++)
{
clusterViz[bestLabels.at<int>(i)].push_back(cv::Mat(images[bestLabels.at<int>(i)]));
}
namedWindow("clusters", WINDOW_NORMAL);
for (int i = 0; i<clusterViz.size(); i++)
{
cv::imshow("clusters", clusterViz[i]);
cv::waitKey();
}

Algorithm for shrinking/limiting palette of an image

as input data I have a 24 bit RGB image and a palette with 2..20 fixed colours. These colours are in no way spread regularly over the full colour range.
Now I have to modify the colours of input image so that only the colours of the given palette are used - using the colour out of the palette that is closest to the original colour (not closest mathematically but for human's visual impression). So what I need is an algorithm that uses an input colour and finds the colour in target palette that visually fits best to this colour. Please note: I'm not looking for a stupid comparison/difference algorithm but for something that really incorporates the impression a colour has on humans!
Since this is something that already should have been done and because I do not want to re-invent the wheel again: is there some example source code out there that does this job? In best case it is really a piece of code and not a link to a desastrous huge library ;-)
(I'd guess OpenCV does not provide such a function?)
Thanks
You should look at the Lab color space. It was designed so that the distance in the colour space equals the perceptual distance. So once you have converted your image you can compute the distances as you would have done earlier, but should get a better result from a perceptual point of view. In OpenCV you can use the cvtColor(source, destination, CV_BGR2Lab) function.
Another Idea would be to use dithering. The idea is to mix missing colours using neighbouring pixels. A popular algorithm for this is Floyd-Steinberg dithering.
Here is an example of mine, where I combined a optimized palette using k-means with the Lab colourspace and floyd steinberg dithering:
#include <opencv2/opencv.hpp>
#include <iostream>
using namespace cv;
using namespace std;
cv::Mat floydSteinberg(cv::Mat img, cv::Mat palette);
cv::Vec3b findClosestPaletteColor(cv::Vec3b color, cv::Mat palette);
int main(int argc, char** argv)
{
// Number of clusters (colors on result image)
int nrColors = 18;
cv::Mat imgBGR = imread(argv[1],1);
cv::Mat img;
cvtColor(imgBGR, img, CV_BGR2Lab);
cv::Mat colVec = img.reshape(1, img.rows*img.cols); // change to a Nx3 column vector
cv::Mat colVecD;
colVec.convertTo(colVecD, CV_32FC3, 1.0); // convert to floating point
cv::Mat labels, centers;
cv::kmeans(colVecD, nrColors, labels,
cv::TermCriteria(CV_TERMCRIT_ITER, 100, 0.1),
3, cv::KMEANS_PP_CENTERS, centers); // compute k mean centers
// replace pixels by there corresponding image centers
cv::Mat imgPosterized = img.clone();
for(int i = 0; i < img.rows; i++ )
for(int j = 0; j < img.cols; j++ )
for(int k = 0; k < 3; k++)
imgPosterized.at<Vec3b>(i,j)[k] = centers.at<float>(labels.at<int>(j+img.cols*i),k);
// convert palette back to uchar
cv::Mat palette;
centers.convertTo(palette,CV_8UC3,1.0);
// call floyd steinberg dithering algorithm
cv::Mat fs = floydSteinberg(img, palette);
cv::Mat imgPosterizedBGR, fsBGR;
cvtColor(imgPosterized, imgPosterizedBGR, CV_Lab2BGR);
cvtColor(fs, fsBGR, CV_Lab2BGR);
imshow("input",imgBGR); // original image
imshow("result",imgPosterizedBGR); // posterized image
imshow("fs",fsBGR); // floyd steinberg dithering
waitKey();
return 0;
}
cv::Mat floydSteinberg(cv::Mat imgOrig, cv::Mat palette)
{
cv::Mat img = imgOrig.clone();
cv::Mat resImg = img.clone();
for(int i = 0; i < img.rows; i++ )
for(int j = 0; j < img.cols; j++ )
{
cv::Vec3b newpixel = findClosestPaletteColor(img.at<Vec3b>(i,j), palette);
resImg.at<Vec3b>(i,j) = newpixel;
for(int k=0;k<3;k++)
{
int quant_error = (int)img.at<Vec3b>(i,j)[k] - newpixel[k];
if(i+1<img.rows)
img.at<Vec3b>(i+1,j)[k] = min(255,max(0,(int)img.at<Vec3b>(i+1,j)[k] + (7 * quant_error) / 16));
if(i-1 > 0 && j+1 < img.cols)
img.at<Vec3b>(i-1,j+1)[k] = min(255,max(0,(int)img.at<Vec3b>(i-1,j+1)[k] + (3 * quant_error) / 16));
if(j+1 < img.cols)
img.at<Vec3b>(i,j+1)[k] = min(255,max(0,(int)img.at<Vec3b>(i,j+1)[k] + (5 * quant_error) / 16));
if(i+1 < img.rows && j+1 < img.cols)
img.at<Vec3b>(i+1,j+1)[k] = min(255,max(0,(int)img.at<Vec3b>(i+1,j+1)[k] + (1 * quant_error) / 16));
}
}
return resImg;
}
float vec3bDist(cv::Vec3b a, cv::Vec3b b)
{
return sqrt( pow((float)a[0]-b[0],2) + pow((float)a[1]-b[1],2) + pow((float)a[2]-b[2],2) );
}
cv::Vec3b findClosestPaletteColor(cv::Vec3b color, cv::Mat palette)
{
int i=0;
int minI = 0;
cv::Vec3b diff = color - palette.at<Vec3b>(0);
float minDistance = vec3bDist(color, palette.at<Vec3b>(0));
for (int i=0;i<palette.rows;i++)
{
float distance = vec3bDist(color, palette.at<Vec3b>(i));
if (distance < minDistance)
{
minDistance = distance;
minI = i;
}
}
return palette.at<Vec3b>(minI);
}
Try this algorithm (it will reduct color number, but it compute palette by itself):
#include <opencv2/opencv.hpp>
#include "opencv2/legacy/legacy.hpp"
#include <vector>
#include <list>
#include <iostream>
using namespace cv;
using namespace std;
void main(void)
{
// Number of clusters (colors on result image)
int NrGMMComponents = 32;
// Source file name
string fname="D:\\ImagesForTest\\tools.jpg";
cv::Mat SampleImg = imread(fname,1);
//cv::GaussianBlur(SampleImg,SampleImg,Size(5,5),3);
int SampleImgHeight = SampleImg.rows;
int SampleImgWidth = SampleImg.cols;
// Pick datapoints
vector<Vec3d> ListSamplePoints;
for (int y=0; y<SampleImgHeight; y++)
{
for (int x=0; x<SampleImgWidth; x++)
{
// Get pixel color at that position
Vec3b bgrPixel = SampleImg.at<Vec3b>(y, x);
uchar b = bgrPixel.val[0];
uchar g = bgrPixel.val[1];
uchar r = bgrPixel.val[2];
if(rand()%25==0) // Pick not every, bu t every 25-th
{
ListSamplePoints.push_back(Vec3d(b,g,r));
}
} // for (x)
} // for (y)
// Form training matrix
Mat labels;
int NrSamples = ListSamplePoints.size();
Mat samples( NrSamples, 3, CV_32FC1 );
for (int s=0; s<NrSamples; s++)
{
Vec3d v = ListSamplePoints.at(s);
samples.at<float>(s,0) = (float) v[0];
samples.at<float>(s,1) = (float) v[1];
samples.at<float>(s,2) = (float) v[2];
}
cout << "Learning to represent the sample distributions with" << NrGMMComponents << "gaussians." << endl;
// Algorithm parameters
CvEMParams params;
params.covs = NULL;
params.means = NULL;
params.weights = NULL;
params.probs = NULL;
params.nclusters = NrGMMComponents;
params.cov_mat_type = CvEM::COV_MAT_GENERIC; // DIAGONAL, GENERIC, SPHERICAL
params.start_step = CvEM::START_AUTO_STEP;
params.term_crit.max_iter = 1500;
params.term_crit.epsilon = 0.001;
params.term_crit.type = CV_TERMCRIT_ITER|CV_TERMCRIT_EPS;
//params.term_crit.type = CV_TERMCRIT_ITER;
// Train
cout << "Started GMM training" << endl;
CvEM em_model;
em_model.train( samples, Mat(), params, &labels );
cout << "Finished GMM training" << endl;
// Result image
Mat img = Mat::zeros( Size( SampleImgWidth, SampleImgHeight ), CV_8UC3 );
// Ask classifier for each pixel
Mat sample( 1, 3, CV_32FC1 );
Mat means;
means=em_model.getMeans();
for(int i = 0; i < img.rows; i++ )
{
for(int j = 0; j < img.cols; j++ )
{
Vec3b v=SampleImg.at<Vec3b>(i,j);
sample.at<float>(0,0) = (float) v[0];
sample.at<float>(0,1) = (float) v[1];
sample.at<float>(0,2) = (float) v[2];
int response = cvRound(em_model.predict( sample ));
img.at<Vec3b>(i,j)[0]=means.at<double>(response,0);
img.at<Vec3b>(i,j)[1]=means.at<double>(response,1);
img.at<Vec3b>(i,j)[2]=means.at<double>(response,2);
}
}
img.convertTo(img,CV_8UC3);
imshow("result",img);
waitKey();
// Save the result
cv::imwrite("result.png", img);
}
PS: For perceptive color distance measurement it's better to use L*a*b color space. There is converter in opencv for this purpose. For clustering you can use k-means with defined cluster centers (your palette entries). After clustering you'll get points with indexes of palette intries.

Input matrix to opencv kmeans clustering

This question is specific to opencv:
The kmeans example given in the opencv documentation has a 2-channel matrix - one channel for each dimension of the feature vector. But, some of the other example seem to say that it should be a one channel matrix with features along the columns with one row for each sample. Which of these is right?
if I have a 5 dimensional feature vector, what should be the input matrix that I use:
This one:
cv::Mat inputSamples(numSamples, 1, CV32FC(numFeatures))
or this one:
cv::Mat inputSamples(numSamples, numFeatures, CV_32F)
The correct answer is cv::Mat inputSamples(numSamples, numFeatures, CV_32F).
The OpenCV Documentation about kmeans says:
samples – Floating-point matrix of input samples, one row per sample
So it is not a Floating-point vector of n-Dimensional floats as in the other option. Which examples suggested such a behaviour?
Here is also a small example by me that shows how kmeans can be used. It clusters the pixels of an image and displays the result:
#include "opencv2/imgproc/imgproc.hpp"
#include "opencv2/highgui/highgui.hpp"
using namespace cv;
int main( int argc, char** argv )
{
Mat src = imread( argv[1], 1 );
Mat samples(src.rows * src.cols, 3, CV_32F);
for( int y = 0; y < src.rows; y++ )
for( int x = 0; x < src.cols; x++ )
for( int z = 0; z < 3; z++)
samples.at<float>(y + x*src.rows, z) = src.at<Vec3b>(y,x)[z];
int clusterCount = 15;
Mat labels;
int attempts = 5;
Mat centers;
kmeans(samples, clusterCount, labels, TermCriteria(CV_TERMCRIT_ITER|CV_TERMCRIT_EPS, 10000, 0.0001), attempts, KMEANS_PP_CENTERS, centers );
Mat new_image( src.size(), src.type() );
for( int y = 0; y < src.rows; y++ )
for( int x = 0; x < src.cols; x++ )
{
int cluster_idx = labels.at<int>(y + x*src.rows,0);
new_image.at<Vec3b>(y,x)[0] = centers.at<float>(cluster_idx, 0);
new_image.at<Vec3b>(y,x)[1] = centers.at<float>(cluster_idx, 1);
new_image.at<Vec3b>(y,x)[2] = centers.at<float>(cluster_idx, 2);
}
imshow( "clustered image", new_image );
waitKey( 0 );
}
As alternative to reshaping the input matrix manually, you can use OpenCV reshape function to achieve similar result with less code. Here is my working implementation of reducing colors count with K-Means method (in Java):
private final static int MAX_ITER = 10;
private final static int CLUSTERS = 16;
public static Mat colorMapKMeans(Mat img, int K, int maxIterations) {
Mat m = img.reshape(1, img.rows() * img.cols());
m.convertTo(m, CvType.CV_32F);
Mat bestLabels = new Mat(m.rows(), 1, CvType.CV_8U);
Mat centroids = new Mat(K, 1, CvType.CV_32F);
Core.kmeans(m, K, bestLabels,
new TermCriteria(TermCriteria.COUNT | TermCriteria.EPS, maxIterations, 1E-5),
1, Core.KMEANS_RANDOM_CENTERS, centroids);
List<Integer> idx = new ArrayList<>(m.rows());
Converters.Mat_to_vector_int(bestLabels, idx);
Mat imgMapped = new Mat(m.size(), m.type());
for(int i = 0; i < idx.size(); i++) {
Mat row = imgMapped.row(i);
centroids.row(idx.get(i)).copyTo(row);
}
return imgMapped.reshape(3, img.rows());
}
public static void main(String[] args) {
System.loadLibrary(Core.NATIVE_LIBRARY_NAME);
Highgui.imwrite("result.png",
colorMapKMeans(Highgui.imread(args[0], Highgui.CV_LOAD_IMAGE_COLOR),
CLUSTERS, MAX_ITER));
}
OpenCV reads image into 2 dimensional, 3 channel matrix. First call to reshape - img.reshape(1, img.rows() * img.cols()); - essentially unrolls 3 channels into columns. In resulting matrix one row corresponds to one pixel of the input image, and 3 columns corresponds to RGB components.
After K-Means algorithm finished its work, and color mapping has been applied, we call reshape again - imgMapped.reshape(3, img.rows()), but now rolling columns back into channels, and reducing row numbers to the original image row number, thus getting back the original matrix format, but only with reduced colors.

Drawing spectrum of an image in C++ (fftw, OpenCV)

I'm trying to create a program that will draw a 2d greyscale spectrum of a given image. I'm using OpenCV and FFTW libraries. By using tips and codes from the internet and modifying them I've managed to load an image, calculate fft of this image and recreate the image from the fft (it's the same). What I'm unable to do is to draw the fourier spectrum itself. Could you please help me?
Here's the code (less important lines removed):
/* Copy input image */
/* Create output image */
/* Allocate input data for FFTW */
in = (fftw_complex*) fftw_malloc(sizeof(fftw_complex) * N);
dft = (fftw_complex*) fftw_malloc(sizeof(fftw_complex) * N);
/* Create plans */
plan_f = fftw_plan_dft_2d(w, h, in, dft, FFTW_FORWARD, FFTW_ESTIMATE);
/* Populate input data in row-major order */
for (i = 0, k = 0; i < h; i++)
{
for (j = 0; j < w; j++, k++)
{
in[k][0] = ((uchar*)(img1->imageData + i * img1->widthStep))[j];
in[k][1] = 0.;
}
}
/* forward DFT */
fftw_execute(plan_f);
/* spectrum */
for (i = 0, k = 0; i < h; i++)
{
for (j = 0; j < w; j++, k++)
((uchar*)(img2->imageData + i * img2->widthStep))[j] = sqrt(pow(dft[k][0],2) + pow(dft[k][1],2));
}
cvShowImage("iplimage_dft(): original", img1);
cvShowImage("iplimage_dft(): result", img2);
cvWaitKey(0);
/* Free memory */
}
The problem is in the "Spectrum" section. Instead of a spectrum I get some noise. What am I doing wrong? I would be grateful for your help.
You need to draw magnitude of spectrum. here is the code.
void ForwardFFT(Mat &Src, Mat *FImg)
{
int M = getOptimalDFTSize( Src.rows );
int N = getOptimalDFTSize( Src.cols );
Mat padded;
copyMakeBorder(Src, padded, 0, M - Src.rows, 0, N - Src.cols, BORDER_CONSTANT, Scalar::all(0));
// Создаем комплексное представление изображения
// planes[0] содержит само изображение, planes[1] его мнимую часть (заполнено нулями)
Mat planes[] = {Mat_<float>(padded), Mat::zeros(padded.size(), CV_32F)};
Mat complexImg;
merge(planes, 2, complexImg);
dft(complexImg, complexImg);
// После преобразования результат так-же состоит из действительной и мнимой части
split(complexImg, planes);
// обрежем спектр, если у него нечетное количество строк или столбцов
planes[0] = planes[0](Rect(0, 0, planes[0].cols & -2, planes[0].rows & -2));
planes[1] = planes[1](Rect(0, 0, planes[1].cols & -2, planes[1].rows & -2));
Recomb(planes[0],planes[0]);
Recomb(planes[1],planes[1]);
// Нормализуем спектр
planes[0]/=float(M*N);
planes[1]/=float(M*N);
FImg[0]=planes[0].clone();
FImg[1]=planes[1].clone();
}
void ForwardFFT_Mag_Phase(Mat &src, Mat &Mag,Mat &Phase)
{
Mat planes[2];
ForwardFFT(src,planes);
Mag.zeros(planes[0].rows,planes[0].cols,CV_32F);
Phase.zeros(planes[0].rows,planes[0].cols,CV_32F);
cv::cartToPolar(planes[0],planes[1],Mag,Phase);
}
Mat LogMag;
LogMag.zeros(Mag.rows,Mag.cols,CV_32F);
LogMag=(Mag+1);
cv::log(LogMag,LogMag);
//---------------------------------------------------
imshow("Логарифм амплитуды", LogMag);
imshow("Фаза", Phase);
imshow("Результат фильтрации", img);
Can you try to do the IFFT step and see if you recover the original image ? then , you can check step by step where is your problem. Another solution to find the problem is to do this process with a small matrix predefined by you ,and calculate it FFT in MATLAB, and check step by step, it worked for me!

Resources