Compare two histograms with Chi-Square - OpenCV

I want to compare two histograms, each with two dimensions, using the Chi-Square metric.
My comparator looks like this:
double Histogram::compareHistogram(Histogram *hist){
    double result = 0;
    double a = 0;
    double b = 0;
    for (int y = 0; y < bins_1; y++) {
        for (int x = 0; x < bins_2; x++) {
            a = getHistogramValue(x,y) - hist->getHistogramValue(x,y);
            b = getHistogramValue(x,y) + hist->getHistogramValue(x,y);
            if (fabs(b) > 0.0) {
                result += a*a/b;
            }
        }
    }
    return result;
}
I've compared the result with the result of OpenCV's cv::compareHist() function and it is different, and I don't know why.
Before comparing the histograms, I normalize them with the MINMAX norm.
I compared my normalized histogram with OpenCV's normalized histogram and they are equal, so I think the problem is in my compareHist function.
But where?

The relevant section of source code from OpenCV is as follows:
if( method == CV_COMP_CHISQR )
{
    for( j = 0; j < len; j++ )
    {
        double a = h1[j] - h2[j];
        double b = h1[j];
        if( fabs(b) > DBL_EPSILON )
            result += a*a/b;
    }
}
So you can see that the difference in your code is this line:
b=getHistogramValue(x,y)+hist->getHistogramValue(x,y);
which should be
b=getHistogramValue(x,y);
OpenCV's CV_COMP_CHISQR puts only the first histogram in the denominator. (As far as I know, newer OpenCV versions expose the symmetric variant you computed as CV_COMP_CHISQR_ALT, up to a factor of 2.)
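For completeness, a minimal corrected sketch of the comparator matching CV_COMP_CHISQR (assuming, as in the question, that getHistogramValue(x, y) returns the bin value and that *this plays the role of OpenCV's first histogram h1):

#include <cfloat>  // DBL_EPSILON
#include <cmath>   // fabs

double Histogram::compareHistogram(Histogram *hist) {
    double result = 0.0;
    for (int y = 0; y < bins_1; y++) {
        for (int x = 0; x < bins_2; x++) {
            double a = getHistogramValue(x, y) - hist->getHistogramValue(x, y);
            double b = getHistogramValue(x, y);  // first histogram only
            if (fabs(b) > DBL_EPSILON)
                result += a * a / b;
        }
    }
    return result;
}

Note that the order now matters: this metric is asymmetric, so a->compareHistogram(b) and b->compareHistogram(a) will generally differ.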

Related

How to match keypoints in SIFT?
I have calculated a 128-dimensional descriptor vector for each keypoint in an image.
Let I1 be the original image and I2 the same image rotated by 45 degrees.
I got 130 keypoints for I1 and 104 keypoints for I2, i.e. descriptor matrices of size 128x130 and 128x104.
I then calculated the Euclidean distance between each keypoint of I1 and all keypoints of I2, which gives a matrix of pairwise distances.
Now I need to choose the nearest keypoint from this distance matrix. How can I select the minimum-distance match for each 128-dimensional descriptor?
Since you have already calculated the distances between the keypoints, in order to match them, sort the distances in increasing order and keep only those matches whose distance is below constant*min_distance (i.e., select some percentage of the sorted distances) as 'good matches'.
There are also BruteForceMatcher, knnMatch and FlannBasedMatcher in OpenCV (URLs below):
http://docs.opencv.org/2.4/doc/tutorials/features2d/feature_flann_matcher/feature_flann_matcher.html#feature-flann-matcher
and
http://docs.opencv.org/2.4/modules/features2d/doc/common_interfaces_of_descriptor_matchers.html#descriptormatcher-knnmatch
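Just as a sketch of the library route (not from the original answer): BFMatcher with knnMatch plus Lowe's ratio test, assuming descriptors stored one row per keypoint as CV_32F:

#include <opencv2/core/core.hpp>
#include <opencv2/features2d/features2d.hpp>
#include <vector>

std::vector<cv::DMatch> ratioTestMatch(const cv::Mat& desc1, const cv::Mat& desc2)
{
    cv::BFMatcher matcher(cv::NORM_L2);
    std::vector<std::vector<cv::DMatch> > knn;
    matcher.knnMatch(desc1, desc2, knn, 2);   // two nearest neighbours per query

    std::vector<cv::DMatch> good;
    for (size_t i = 0; i < knn.size(); i++)
    {
        // Keep a match only if it is clearly better than the runner-up.
        if (knn[i].size() == 2 && knn[i][0].distance < 0.75f * knn[i][1].distance)
            good.push_back(knn[i][0]);
    }
    return good;
}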
Also, have a look at these questions and their responses.
1) Trying to match two images using sift in OpenCv, but too many matches
2) Efficient way for SIFT descriptor matching
Just for completeness, here is some very rough code for your reference.
If you have:
class SIFTDemo
{
private:
    Mat image;
    vector<cv::KeyPoint> keypoints;
    Mat descriptors;
    Mat sift_output;
    vector<DMatch> matches;
public:
    SIFTDemo();
    ~SIFTDemo();
    SIFTDemo(Mat m);
    void extractSiftFeatures();
    vector<DMatch> FindMatchesEuclidian(SIFTDemo &m2);
};
Then one can have something like this:
void SIFTDemo::extractSiftFeatures()
{
    SIFT siftobject;
    siftobject.operator()(image, Mat(), keypoints, descriptors);
}

vector<DMatch> SIFTDemo::FindMatchesEuclidian(SIFTDemo &m2)
{
    // Calculate the Euclidean distance between keypoints to find the best matching pairs.
    // Create a two-dimensional vector for storing the Euclidean distances.
    vector< vector<float> > vec1, unsortedvec1;
    for (int i = 0; i < this->keypoints.size(); i++)
    {
        vec1.push_back(vector<float>()); // add an empty row
        unsortedvec1.push_back(vector<float>());
    }
    // Create a vector of DMatch for storing matched points.
    vector<DMatch> matches1;
    DMatch dm1;
    // Loop over keypoints1
    for (int i = 0; i < this->keypoints.size(); i++)
    {
        // Get the 128 dimensions in a vector
        vector<float> k1;
        for (int x = 0; x < 128; x++)
        {
            k1.push_back((float)this->descriptors.at<float>(i, x));
        }
        // Loop over keypoints2
        for (int j = 0; j < m2.keypoints.size(); j++)
        {
            double temp = 0;
            // Calculate the Euclidean distance
            for (int x = 0; x < 128; x++)
            {
                temp += pow(k1[x] - (float)m2.descriptors.at<float>(j, x), 2.0);
            }
            vec1[i].push_back((float)sqrt(temp)); // store the distance to each keypoint in image2
            unsortedvec1[i] = vec1[i];
        }
        sort(vec1[i].begin(), vec1[i].end()); // sort the distances, shortest first
        // Find the position of the shortest distance
        int pos = (int)(find(unsortedvec1[i].begin(), unsortedvec1[i].end(), vec1[i][0]) - unsortedvec1[i].begin());
        // Assign that matching feature to the DMatch variable dm1
        dm1.queryIdx = i;
        dm1.trainIdx = pos;
        dm1.distance = vec1[i][0];
        matches1.push_back(dm1);
        this->matches.push_back(dm1);
        //cout << pos << endl;
    }
    // Create a two-dimensional vector for storing the Euclidean distances
    vector< vector<float> > vec2, unsortedvec2;
    for (int i = 0; i < m2.keypoints.size(); i++)
    {
        vec2.push_back(vector<float>()); // add an empty row
        unsortedvec2.push_back(vector<float>());
    }
    // Create a vector of DMatch for storing matched points
    vector<DMatch> matches2;
    DMatch dm2;
    // Loop over keypoints2
    for (int i = 0; i < m2.keypoints.size(); i++)
    {
        // Get the 128 dimensions in a vector
        vector<float> k1;
        for (int x = 0; x < 128; x++)
        {
            k1.push_back((float)m2.descriptors.at<float>(i, x));
        }
        // Loop over keypoints1
        for (int j = 0; j < this->keypoints.size(); j++)
        {
            double temp = 0;
            // Calculate the Euclidean distance
            for (int x = 0; x < 128; x++)
            {
                temp += pow(k1[x] - (float)this->descriptors.at<float>(j, x), 2.0);
            }
            vec2[i].push_back((float)sqrt(temp)); // store the distance to each keypoint in image1
            unsortedvec2[i] = vec2[i];
        }
        sort(vec2[i].begin(), vec2[i].end()); // sort the distances, shortest first
        // Find the position of the shortest distance
        int pos = (int)(find(unsortedvec2[i].begin(), unsortedvec2[i].end(), vec2[i][0]) - unsortedvec2[i].begin());
        // Assign that matching feature to the DMatch variable
        dm2.queryIdx = i;
        dm2.trainIdx = pos;
        dm2.distance = vec2[i][0];
        matches2.push_back(dm2);
        m2.matches.push_back(dm2);
        //cout << pos << endl;
    }
    // Ref : http://docs.opencv.org/2.4/doc/tutorials/features2d/feature_flann_matcher/feature_flann_matcher.html#feature-flann-matcher
    //-- Quick calculation of max and min distances between keypoints1
    double max_dist = 0;
    double min_dist = 500.0;
    for (int i = 0; i < matches1.size(); i++)
    {
        double dist = matches1[i].distance;
        if (dist < min_dist) min_dist = dist;
        if (dist > max_dist) max_dist = dist;
    }
    // Keep only "good" matches1 (i.e. those whose distance is less than 2*min_dist)
    vector<DMatch> good_matches1;
    for (int i = 0; i < matches1.size(); i++)
    {
        if (matches1[i].distance <= 2*min_dist)
        {
            good_matches1.push_back(matches1[i]);
        }
    }
    // Quick calculation of max and min distances between keypoints2 (not used)
    for (int i = 0; i < matches2.size(); i++)
    {
        double dist = matches2[i].distance;
        if (dist < min_dist) min_dist = dist;
        if (dist > max_dist) max_dist = dist;
    }
    // Keep only "good" matches by checking both directions: (ft1 gives ft2) and (ft2 gives ft1)
    vector<DMatch> good_matches;
    for (unsigned int i = 0; i < good_matches1.size(); i++)
    {
        // check ft1=ft2 and ft2=ft1
        if (good_matches1[i].queryIdx == matches2[good_matches1[i].trainIdx].trainIdx)
            good_matches.push_back(good_matches1[i]);
    }
    return good_matches;
}
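Hypothetical usage of the class above (the image file names and the Mat-taking constructor behaviour are my assumptions):

SIFTDemo d1(imread("img1.png", IMREAD_GRAYSCALE));
SIFTDemo d2(imread("img2.png", IMREAD_GRAYSCALE));
d1.extractSiftFeatures();
d2.extractSiftFeatures();
vector<DMatch> good = d1.FindMatchesEuclidian(d2); // cross-checked matches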
Finally, as mentioned in the comments, also look at RANSAC for this. I'm not diving into it here to keep the answer from getting longer, but you can find resources online and on SO.

What algorithm does cv::arcLength use to compute the perimeter?

I am currently doing a project which requires some structural analysis, like finding the perimeter and area. I have successfully obtained the contour of the object.
When I use contour.size() it returns 1108 (in this case), but when I use cv::arcLength(contour) it returns 1200.
Shouldn't the perimeter be the number of points of the contour (the contour being the external boundary of the object)? Which should I trust?
Not necessarily: cv::arcLength sums the Euclidean distances between consecutive points on the curve.
Here is a code snippet from cv::arcLength:
...
const Point2f* ptf = curve.ptr<Point2f>();
...
for( i = 0; i < count; i++ )
{
    Point2f p = ptf[i];
    float dx = p.x - prev.x, dy = p.y - prev.y;
    buf[j] = dx*dx + dy*dy;
    if( ++j == N || i == count-1 )
    {
        Mat bufmat(1, j, CV_32F, buf);
        sqrt(bufmat, bufmat);
        for( ; j > 0; j-- )
            perimeter += buf[j-1];
    }
    prev = p;
}
return perimeter;
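In other words, the perimeter is the summed segment length, not the vertex count. A minimal sketch of the same computation (the function name is mine; OpenCV's version just batches the square roots, as shown above, and includes the closing segment when the curve is marked as closed):

#include <cmath>
#include <vector>
#include <opencv2/core/core.hpp>

double polylineLength(const std::vector<cv::Point2f>& pts, bool closed)
{
    double perimeter = 0.0;
    if (pts.size() < 2)
        return perimeter;
    for (size_t i = 1; i < pts.size(); i++)
    {
        float dx = pts[i].x - pts[i-1].x;   // consecutive segment
        float dy = pts[i].y - pts[i-1].y;
        perimeter += std::sqrt((double)dx*dx + (double)dy*dy);
    }
    if (closed)
    {
        float dx = pts.front().x - pts.back().x;  // closing segment
        float dy = pts.front().y - pts.back().y;
        perimeter += std::sqrt((double)dx*dx + (double)dy*dy);
    }
    return perimeter;
}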

Fast Gaussian Blur image filter with ARM NEON

I'm trying to make a fast mobile version of the Gaussian blur image filter.
I've read other questions, like: Fast Gaussian blur on unsigned char image - ARM NEON Intrinsics - iOS Dev.
For my purposes I need only a fixed-size (7x7), fixed-sigma (2) Gaussian filter.
So, before optimizing for ARM NEON, I'm implementing a 1D Gaussian kernel in C++ and comparing the performance with OpenCV's GaussianBlur() method directly in a mobile environment (Android with the NDK). This way the code will be much simpler to optimize.
However, my implementation is 10 times slower than the OpenCV4Android version. I've read that OpenCV4Tegra has an optimized GaussianBlur implementation, but I don't think standard OpenCV4Android has that kind of optimization, so why is my code so slow?
Here is my implementation (note: reflect101 is used for pixel reflection when applying the filter near the borders):
Mat myGaussianBlur(Mat src){
    Mat dst(src.rows, src.cols, CV_8UC1);
    Mat temp(src.rows, src.cols, CV_8UC1);
    float sum, x1, y1;
    // coefficients of the 1D Gaussian kernel with sigma = 2
    double coeffs[] = {0.06475879783, 0.1209853623, 0.1760326634, 0.1994711402, 0.1760326634, 0.1209853623, 0.06475879783};
    // normalize coeffs
    float coeffs_sum = 0.9230247873f;
    for (int i = 0; i < 7; i++){
        coeffs[i] /= coeffs_sum;
    }
    // filter vertically
    for(int y = 0; y < src.rows; y++){
        for(int x = 0; x < src.cols; x++){
            sum = 0.0;
            for(int i = -3; i <= 3; i++){
                y1 = reflect101(src.rows, y - i);
                sum += coeffs[i + 3]*src.at<uchar>(y1, x);
            }
            temp.at<uchar>(y,x) = sum;
        }
    }
    // filter horizontally
    for(int y = 0; y < src.rows; y++){
        for(int x = 0; x < src.cols; x++){
            sum = 0.0;
            for(int i = -3; i <= 3; i++){
                x1 = reflect101(src.cols, x - i); // reflect against the width (src.cols), not the height
                sum += coeffs[i + 3]*temp.at<uchar>(y, x1);
            }
            dst.at<uchar>(y,x) = sum;
        }
    }
    return dst;
}
A big part of the problem here is that the algorithm is overly precise, as @PaulR pointed out. It's usually best to keep your coefficient table no more precise than your data. In this case, since you appear to be processing uchar data, you would use roughly an 8-bit coefficient table.
Keeping these weights small will matter particularly in your NEON implementation, because the narrower your arithmetic is, the more lanes you can process at once.
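As an illustration (my sketch, not from the original answer), quantizing the question's 7-tap kernel to 8-bit fixed-point weights could look like this; nudging the centre tap makes the weights sum to exactly 256, so the filter keeps unit gain and flat regions pass through unchanged:

#include <cstdio>

int main() {
    const double coeffs[7] = {0.06475879783, 0.1209853623, 0.1760326634,
                              0.1994711402, 0.1760326634, 0.1209853623,
                              0.06475879783};
    double sum = 0.0;
    for (int i = 0; i < 7; i++) sum += coeffs[i];

    int w[7], total = 0;
    for (int i = 0; i < 7; i++) {
        w[i] = (int)(coeffs[i] / sum * 256.0 + 0.5);  // round to nearest
        total += w[i];
    }
    w[3] += 256 - total;  // fold the rounding error into the centre tap

    for (int i = 0; i < 7; i++) printf("%d ", w[i]);  // prints: 18 34 49 54 49 34 18
    printf("\n");
    return 0;
}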
Beyond that, the first major slowdown that stands out is having the image edge reflection code within the main loop. That makes the bulk of the work less efficient, because in the common case it doesn't need to do anything special.
It works out better to use a special version of the loop near the edges, and then, once you're safely past them, a simplified inner loop that doesn't call that reflect101() function.
Second (more relevant to prototype code), it's possible to add the two wings of the window together before applying the weighting function, because the table contains the same coefficients on both sides:
sum = src.at<uchar>(y, x) * coeffs[3]; // centre tap (interior pixels, no reflection needed)
for(int i = -3; i < 0; i++) {
    int tmp = src.at<uchar>(y + i, x) + src.at<uchar>(y - i, x);
    sum += coeffs[i + 3] * tmp;
}
This saves you six multiplies per pixel (three in each of the two passes), and it's a step towards some other optimisations around controlling overflow conditions.
Then there are a couple of other problems related to the memory system.
The two-pass approach is good in principle, because it saves you from performing a lot of recomputation. Unfortunately it can push the useful data out of L1 cache, which can make everything quite a lot slower. It also means that when you write the result out to memory, you're quantising the intermediate sum, which can reduce precision.
When you convert this code to NEON, one of the things you will want to focus on is trying to keep your working set inside the register file, but without discarding calculations before they've been fully utilised.
When people do use two passes, it's usual for the intermediate data to be transposed -- that is, a column of input becomes a row of output.
This is because the CPU really does not like fetching small amounts of data across multiple lines of the input image. It works out much more efficient (because of the way the cache works) to collect together a bunch of horizontal pixels and filter those. If the temporary buffer is transposed, then the second pass also collects together a bunch of horizontal points (which would be vertical in the original orientation), and it transposes its output again so it comes out the right way.
If you optimise to keep your working set localised, then you might not need this transposition trick, but it's worth knowing about so that you can set yourself a healthy baseline performance. Unfortunately, localisation like this does force you back to non-optimal memory fetches, but with the wider data types that penalty can be mitigated.
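A conceptual sketch of that transposed two-pass pipeline (filterRows here is a hypothetical horizontal-only 1-D pass, not a function from this thread):

cv::Mat blurTwoPassTransposed(const cv::Mat& src)
{
    cv::Mat tmp, tmpT, outT, out;
    filterRows(src, tmp);      // pass 1: horizontal, cache-friendly reads
    cv::transpose(tmp, tmpT);  // columns of the original become rows
    filterRows(tmpT, outT);    // pass 2: also horizontal in memory
    cv::transpose(outT, out);  // restore the original orientation
    return out;
}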
If this is specifically for 8-bit images then you really don't want floating-point coefficients, especially not double precision. Also, you don't want to use floats for x1 and y1. Just use integers for coordinates, and use fixed-point (i.e. integer) coefficients to keep all the filter arithmetic in the integer domain, e.g.:
Mat myGaussianBlur(Mat src){
    Mat dst(src.rows, src.cols, CV_8UC1);
    Mat temp(src.rows, src.cols, CV_16UC1); // <<<
    int sum, x1, y1; // <<<
    // coefficients of the 1D Gaussian kernel with sigma = 2
    double coeffs[] = {0.06475879783, 0.1209853623, 0.1760326634, 0.1994711402, 0.1760326634, 0.1209853623, 0.06475879783};
    int coeffs_i[7] = { 0 }; // <<<
    // normalize coeffs and convert to 8.8 fixed point
    float coeffs_sum = 0.9230247873f;
    for (int i = 0; i < 7; i++){
        coeffs_i[i] = (int)(coeffs[i] / coeffs_sum * 256); // <<<
    }
    // filter vertically
    for(int y = 0; y < src.rows; y++){
        for(int x = 0; x < src.cols; x++){
            sum = 0; // <<<
            for(int i = -3; i <= 3; i++){
                y1 = reflect101(src.rows, y - i);
                sum += coeffs_i[i + 3]*src.at<uchar>(y1, x); // <<<
            }
            temp.at<ushort>(y,x) = (ushort)sum; // <<< 16-bit intermediate keeps the 256x scaling
        }
    }
    // filter horizontally
    for(int y = 0; y < src.rows; y++){
        for(int x = 0; x < src.cols; x++){
            sum = 0; // <<<
            for(int i = -3; i <= 3; i++){
                x1 = reflect101(src.cols, x - i); // reflect against the width (src.cols)
                sum += coeffs_i[i + 3]*temp.at<ushort>(y, x1); // <<<
            }
            dst.at<uchar>(y,x) = sum / (256 * 256); // <<< remove both 256x scale factors
        }
    }
    return dst;
}
This is the code after implementing all the suggestions of @PaulR and @sh1, summarized as follows:
1) use only integer arithmetic (with precision to taste);
2) add the values of the pixels at the same distance from the mask center before applying the multiplications, to reduce the number of multiplications;
3) apply only horizontal filters, to take advantage of the row-major storage of the matrices;
4) separate the loops around the edges from those inside the image, to avoid unnecessary calls to reflection functions. I removed the reflection functions entirely, inlining them in the loops along the edges;
5) in addition, as a personal observation: to improve rounding without calling a (slow) round or cvRound function, I add 0.5 (= 32768 in this fixed-point precision) to both the temporary and the final pixel results, which reduces the error/difference compared to OpenCV.
Now the performance is much better: from about 15 times slower than OpenCV to about 6 times slower.
However, the resulting matrix is not perfectly identical to the one obtained with OpenCV's Gaussian blur. This does not appear to be due to insufficient arithmetic precision, as the error remains even with the changes above. Note that the difference is minimal: between 0 and 2 (in absolute value) of pixel intensity between the matrices produced by the two versions. The coefficients are the same ones used by OpenCV, obtained with getGaussianKernel with the same size and sigma.
Mat myGaussianBlur(Mat src){
    // dst is allocated transposed because the second pass runs on the
    // transposed temp image; the final transpose restores the orientation.
    Mat dst(src.cols, src.rows, CV_8UC1);
    Mat temp(src.rows, src.cols, CV_8UC1);
    int sum;
    int x1;
    double coeffs[] = {0.070159, 0.131075, 0.190713, 0.216106, 0.190713, 0.131075, 0.070159};
    int coeffs_i[7] = { 0 };
    for (int i = 0; i < 7; i++){
        coeffs_i[i] = (int)(coeffs[i] * 65536); // 16.16 fixed point
    }
    // filter horizontally - inside the image
    for(int y = 0; y < src.rows; y++){
        uchar *ptr = src.ptr<uchar>(y);
        for(int x = 3; x < (src.cols - 3); x++){
            sum = ptr[x] * coeffs_i[3];
            for(int i = -3; i < 0; i++){
                int tmp = ptr[x+i] + ptr[x-i];
                sum += coeffs_i[i + 3]*tmp;
            }
            temp.at<uchar>(y,x) = (sum + 32768) / 65536;
        }
    }
    // filter horizontally - edges - needs reflection
    for(int y = 0; y < src.rows; y++){
        uchar *ptr = src.ptr<uchar>(y);
        for(int x = 0; x <= 2; x++){
            sum = 0;
            for(int i = -3; i <= 3; i++){
                x1 = x + i;
                if(x1 < 0){
                    x1 = -x1;
                }
                sum += coeffs_i[i + 3]*ptr[x1];
            }
            temp.at<uchar>(y,x) = (sum + 32768) / 65536;
        }
    }
    for(int y = 0; y < src.rows; y++){
        uchar *ptr = src.ptr<uchar>(y);
        for(int x = (src.cols - 3); x < src.cols; x++){
            sum = 0;
            for(int i = -3; i <= 3; i++){
                x1 = x + i;
                if(x1 >= src.cols){
                    x1 = 2*src.cols - x1 - 2;
                }
                sum += coeffs_i[i + 3]*ptr[x1];
            }
            temp.at<uchar>(y,x) = (sum + 32768) / 65536;
        }
    }
    // transpose so the second pass is also horizontal - better cache data locality.
    // After this, temp.rows == src.cols and temp.cols == src.rows, so the loop
    // bounds below follow temp's dimensions (this matters for non-square images).
    transpose(temp, temp);
    // filter horizontally - inside the image
    for(int y = 0; y < temp.rows; y++){
        uchar *ptr = temp.ptr<uchar>(y);
        for(int x = 3; x < (temp.cols - 3); x++){
            sum = ptr[x] * coeffs_i[3];
            for(int i = -3; i < 0; i++){
                int tmp = ptr[x+i] + ptr[x-i];
                sum += coeffs_i[i + 3]*tmp;
            }
            dst.at<uchar>(y,x) = (sum + 32768) / 65536;
        }
    }
    // filter horizontally - edges - needs reflection
    for(int y = 0; y < temp.rows; y++){
        uchar *ptr = temp.ptr<uchar>(y);
        for(int x = 0; x <= 2; x++){
            sum = 0;
            for(int i = -3; i <= 3; i++){
                x1 = x + i;
                if(x1 < 0){
                    x1 = -x1;
                }
                sum += coeffs_i[i + 3]*ptr[x1];
            }
            dst.at<uchar>(y,x) = (sum + 32768) / 65536;
        }
    }
    for(int y = 0; y < temp.rows; y++){
        uchar *ptr = temp.ptr<uchar>(y);
        for(int x = (temp.cols - 3); x < temp.cols; x++){
            sum = 0;
            for(int i = -3; i <= 3; i++){
                x1 = x + i;
                if(x1 >= temp.cols){
                    x1 = 2*temp.cols - x1 - 2;
                }
                sum += coeffs_i[i + 3]*ptr[x1];
            }
            dst.at<uchar>(y,x) = (sum + 32768) / 65536;
        }
    }
    transpose(dst, dst);
    return dst;
}
According to Google's documentation, on Android devices using float/double is roughly twice as slow as using int/uchar.
You may find more ways to speed up your C++ code in the Android performance tips:
https://developer.android.com/training/articles/perf-tips

What's wrong in the following C++ bicubic interpolation code for image resizing?

I am trying to upsample an image using bicubic interpolation. I need values that accurately match the cvResize() function of OpenCV, but the results of the following code do not match the results from cvResize(). Can you take a look and help me fix the bug?
Image *Image::resize_using_Bicubic(int w, int h) {
    float dx, dy;
    float x, y;
    float tx, ty;
    float i, j, m, n;
    Image *result = new Image(w, h);
    tx = (float)this->m_width/(float)w;
    ty = (float)this->m_height/(float)h;
    for(i=0; i < w; i++)
    {
        for(j=0; j < h; j++)
        {
            x = i*tx;
            y = j*ty;
            dx = float(i-x)-(int)(i-x);
            dy = float(j-y)-(int)(j-y);
            float temp = 0.0;
            for(m=-1; m<=2; m++)
            {
                for(n=-1; n<=2; n++)
                {
                    int HIndex, WIndex;
                    HIndex = (y+n);
                    WIndex = (x+m);
                    if (HIndex < 0) {
                        HIndex = 0;
                    }
                    else if(HIndex > this->getHeight())
                    {
                        HIndex = this->getHeight()-1;
                    }
                    if (WIndex < 0) {
                        WIndex = 0;
                    }
                    else if(WIndex > this->getWidth())
                    {
                        WIndex = this->getWidth()-1;
                    }
                    temp += this->getPixel(HIndex,WIndex)*R(m-dx)*R(dy-n);
                }
            }
            result->setPixel(j, i, temp);
        }
    }
    return result;
}
You haven't said how different the results are. If they're very close, say within 1 or 2 in each RGB channel, this could be explained simply by round-off differences.
There is more than one algorithm for bicubic interpolation. Don Mitchell and Arun Netravali did an analysis and came up with a single formula to describe a number of them: http://www.mentallandscape.com/Papers_siggraph88.pdf
Edit: One more thing: the individual filter coefficients should be summed up and used to divide the final value at the end, to normalize it. And I'm not sure why you have m-dx for one axis but dy-n for the other; shouldn't they have the same sign?
r = R(m-dx)*R(dy-n);
r_sum += r;
temp += this->getPixel(HIndex,WIndex)*r;
. . .
result->setPixel(j, i, temp/r_sum);
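Also for reference, since the question never shows R(): a common choice is a Keys-style cubic convolution kernel, sketched below. The constant A is an assumption here: Catmull-Rom uses A = -0.5, while OpenCV's cubic interpolation reportedly uses A = -0.75, and that difference alone can produce mismatches against cvResize().

#include <cmath>

// Keys cubic convolution kernel; A selects the family member.
static float R(float x, float A = -0.75f)
{
    x = std::fabs(x);
    if (x <= 1.0f)
        return ((A + 2.0f)*x - (A + 3.0f))*x*x + 1.0f;   // (A+2)x^3 - (A+3)x^2 + 1
    if (x < 2.0f)
        return ((A*x - 5.0f*A)*x + 8.0f*A)*x - 4.0f*A;   // Ax^3 - 5Ax^2 + 8Ax - 4A
    return 0.0f;
}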
Change:
else if(HIndex>this->getHeight())
to:
else if(HIndex >= this->getHeight())
and change:
else if(WIndex>this->getWidth())
to:
else if(WIndex >= this->getWidth())
EDIT
Also change:
for(m=-1;m<=2;m++)
{
for(n=-1;n<=2;n++)
to:
for(m = -1; m <= 1; m++)
{
for(n = -1; n <= 1; n++)

Accessing the value at (row, col) in a matrix

I'm trying to access a specific row in a matrix but am having a hard time doing so.
I want to get the value at row j, column i, but I don't think my algorithm is correct. I'm using OpenCV's Mat for my matrix and accessing it through the data member.
Here is how I am attempting to access values:
plane.data[i + j*plane.rows]
where i is the column and j is the row. Is this correct? The matrix is one plane of a YUV image.
Any help would be appreciated! Thanks.
No, that is wrong:
plane.data[i + j*plane.rows] is not a good way to access a pixel. The offset must depend on the type (depth) of the matrix, and consecutive rows are plane.step bytes apart, not plane.rows elements.
You should use the at() operator of the matrix.
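For a single-channel 8-bit plane, the two options look like this (a small sketch; note plane.step, the row stride in bytes, which may be larger than plane.cols):

// at<>() does the offset arithmetic (and, in debug builds, type checking) for you:
uchar v1 = plane.at<uchar>(j, i);            // row j, column i

// Raw access must use the row stride in bytes, not the number of rows:
uchar v2 = plane.data[j * plane.step + i];   // valid for CV_8UC1 only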
To make it simple, here is a code sample which accesses each pixel of a matrix and prints it. It works for almost every matrix type and any number of channels:
template <typename T>
void printMatTemplate(const Mat& M, bool isInt = true){
    if (M.empty()){
        printf("Empty Matrix\n");
        return;
    }
    if ((M.elemSize()/M.channels()) != sizeof(T)){
        printf("Wrong matrix type. Cannot print\n");
        return;
    }
    int cols = M.cols;
    int rows = M.rows;
    int chan = M.channels();
    char printf_fmt[20];
    if (isInt)
        sprintf_s(printf_fmt, "%%d,");      // MSVC-specific; use snprintf elsewhere
    else
        sprintf_s(printf_fmt, "%%0.5g,");
    if (chan > 1){
        // Print a multi-channel array
        for (int i = 0; i < rows; i++){
            for (int j = 0; j < cols; j++){
                printf("(");
                const T* Pix = &M.at<T>(i,j);
                for (int c = 0; c < chan; c++){
                    printf(printf_fmt, Pix[c]);
                }
                printf(")");
            }
            printf("\n");
        }
        printf("-----------------\n");
    }
    else {
        // Single channel
        for (int i = 0; i < rows; i++){
            const T* Mi = M.ptr<T>(i);
            for (int j = 0; j < cols; j++){
                printf(printf_fmt, Mi[j]);
            }
            printf("\n");
        }
        printf("\n");
    }
}

void printMat(const Mat& M){
    // Dispatch on the element size of the matrix
    switch ( (M.dataend-M.datastart) / (M.cols*M.rows*M.channels()) ){
    case sizeof(char):
        printMatTemplate<unsigned char>(M, true);
        break;
    case sizeof(float):
        printMatTemplate<float>(M, false);
        break;
    case sizeof(double):
        printMatTemplate<double>(M, false);
        break;
    }
}
I do not think there is anything different between accessing an RGB Mat and a YUV Mat; it's just a different colorspace.
Please refer to http://opencv.willowgarage.com/wiki/faq#Howtoaccessmatrixelements.3F on how to access each pixel.
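For a packed multi-channel Mat (e.g. a CV_8UC3 BGR or YUV image), per-pixel access just uses a vector type; a quick sketch (j and i as in the question, file name assumed):

cv::Mat img = cv::imread("frame.png");     // CV_8UC3, BGR order
cv::Vec3b px = img.at<cv::Vec3b>(j, i);    // row j, column i
uchar c0 = px[0], c1 = px[1], c2 = px[2];  // the three channels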
