Euclidean distance between RGB histogram of two images - image-processing

I have two pictures, each with a histogram of the R, G, B intensities. I am supposed to find the Euclidean distance between the histogram values to measure their similarity.
I know the Euclidean distance formula is:
d = sqrt((R1-R2)^2 + (G1-G2)^2 + (B1-B2)^2)
Since the histogram of R, G and B for each image has several values, am I supposed to take the average of all the intensity values in one histogram and then subtract it from the average of the intensity values of the other histogram?
Example 1:
Image1: R1 histogram has values of 2,3,4
Image2: R2 histogram has values of 2,3,1
Then do I compute R1 = (2+3+4)/3 = 3 and R2 = (2+3+1)/3 = 2,
and use (3-2)^2 as the (R1-R2)^2 term in sqrt((R1-R2)^2 + (G1-G2)^2 + (B1-B2)^2)?
OR
Example 2:
Image1: R1 histogram has values of 2,3,4
Image2: R2 histogram has values of 2,3,1
Then do I compute (2-2)^2 + (3-3)^2 + (4-1)^2 as the (R1-R2)^2 term in sqrt((R1-R2)^2 + (G1-G2)^2 + (B1-B2)^2)?
Please help me out, thanks!

Think of a histogram as a vector (maybe there are 256 bins, so it’s a 256-dimensional vector). Now compute the Euclidean distance between the two vectors:
DR = norm(R1-R2); % same as sqrt(sum((R1-R2).^2))
You can repeat this for the G and B components, and combine the three distances again using the Euclidean norm:
D = sqrt(DR.^2 + DG.^2 + DB.^2);
This is the same as concatenating the 3 color histograms for each image and computing their distance:
H1 = [R1,G1,B1]; % assuming histograms are row vectors
H2 = [R2,G2,B2];
D = norm(H1-H2);
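If you are working in C++ with OpenCV rather than MATLAB, a minimal sketch of the same idea (the per-bin interpretation, as in Example 2) could look like the following; it assumes 8-bit BGR images and hypothetical file names image1.png and image2.png:

#include <opencv2/opencv.hpp>
#include <array>
#include <cmath>
#include <iostream>

// Build a 256-bin histogram for one channel (0 = B, 1 = G, 2 = R) of an 8-bit BGR image.
std::array<double, 256> channelHistogram(const cv::Mat& bgr, int channel)
{
    std::array<double, 256> hist{};
    for (int y = 0; y < bgr.rows; ++y)
        for (int x = 0; x < bgr.cols; ++x)
            hist[bgr.at<cv::Vec3b>(y, x)[channel]] += 1.0;
    return hist;
}

// Euclidean distance between two histograms, bin by bin.
double histDistance(const std::array<double, 256>& h1, const std::array<double, 256>& h2)
{
    double sum = 0.0;
    for (int i = 0; i < 256; ++i)
        sum += (h1[i] - h2[i]) * (h1[i] - h2[i]);
    return std::sqrt(sum);
}

int main()
{
    cv::Mat img1 = cv::imread("image1.png");   // hypothetical file names
    cv::Mat img2 = cv::imread("image2.png");

    double d = 0.0;
    for (int c = 0; c < 3; ++c) {              // combine the B, G and R distances
        double dc = histDistance(channelHistogram(img1, c), channelHistogram(img2, c));
        d += dc * dc;
    }
    d = std::sqrt(d);                          // same as concatenating the three histograms
    std::cout << "distance: " << d << std::endl;
    return 0;
}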

I think you are mixing up normalization with Euclidean distance.
Euclidean Distance = sqrt( Sum( ( a[i][j] - b[i][j] )^2 ) ) for all i = 0..width-1, j = 0..height-1
a[][] and b[][] can be normalized or non-normalized data. If you are using the raw image pixel values, they are non-normalized. You can normalize an image by subtracting its minimum pixel value and dividing by its intensity range (min-max normalization).
So, compute the normalized images anorm[][] and bnorm[][] in a first pass, where:
for (i = 0; i < width; i++) {
    for (j = 0; j < height; j++) {
        anorm[i][j] = (a[i][j] - min_a) / (double)(max_a - min_a);
        bnorm[i][j] = (b[i][j] - min_b) / (double)(max_b - min_b);
    }
}
Now, apply the Euclidean Distance formula on anorm[][] and bnorm[][].
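A self-contained C++ sketch of this two-pass idea (find min/max, min-max normalize, then apply the distance formula), assuming single-channel images of equal size with non-constant intensities:

#include <opencv2/opencv.hpp>

// Sketch: Euclidean distance between two min-max normalized grayscale images.
double normalizedEuclidean(const cv::Mat& a, const cv::Mat& b)
{
    double minA, maxA, minB, maxB;
    cv::minMaxLoc(a, &minA, &maxA);      // first pass: find each image's intensity range
    cv::minMaxLoc(b, &minB, &maxB);

    cv::Mat anorm, bnorm;                // (x - min) / (max - min), as floating point
    a.convertTo(anorm, CV_64F, 1.0 / (maxA - minA), -minA / (maxA - minA));
    b.convertTo(bnorm, CV_64F, 1.0 / (maxB - minB), -minB / (maxB - minB));

    return cv::norm(anorm, bnorm, cv::NORM_L2);   // sqrt of summed squared differences
}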

Related

How Convexity Defect is calculated in OpenCV?

What is the algorithm used in OpenCV function convexityDefects() to calculate the convexity defects of a contour?
Please, describe and illustrate the high-level operation of the algorithm, along with its inputs and outputs.
Based on the documentation, the inputs are two lists of coordinates:
contour defining the original contour (red on the image below)
convexhull defining the convex hull corresponding to that contour (blue on the image below)
The algorithm works in the following manner:
If the contour or the hull contains three or fewer points, then the contour is always convex, and no more processing is needed. The algorithm ensures that both the contour and the hull are traversed in the same orientation.
N.B.: In the further explanation I assume they are in the same orientation, and I ignore the details regarding the representation of the floating-point depth as an integer.
Then, for each pair of adjacent hull points (H[i], H[i+1]) defining one edge of the convex hull, calculate the distance from that edge for each point on the contour C[n] that lies between H[i] and H[i+1] (excluding C[n] == H[i+1]). If the distance is greater than zero, a defect is present. When a defect is present, record i, i+1, the maximum distance, and the index n of the contour point where the maximum is located.
Distance is calculated in the following manner:
dx0 = H[i+1].x - H[i].x
dy0 = H[i+1].y - H[i].y
if (dx0 is 0) and (dy0 is 0) then
    scale = 0
else
    scale = 1 / sqrt(dx0 * dx0 + dy0 * dy0)
dx = C[n].x - H[i].x
dy = C[n].y - H[i].y
distance = abs(-dy0 * dx + dx0 * dy) * scale
It may be easier to visualize in terms of vectors:
C: defect vector from H[i] to C[n]
H: hull edge vector from H[i] to H[i+1]
H_rot: hull edge vector H rotated 90 degrees
U_rot: unit vector in direction of H_rot
H components are [dx0, dy0], so rotating 90 degrees gives [-dy0, dx0].
scale is used to find U_rot from H_rot, but because divisions are more computationally expensive than multiplications, the inverse is used as an optimization. It's also pre-calculated before the loop over C[n] to avoid recomputing each iteration.
|H| = sqrt(dx0 * dx0 + dy0 * dy0)
U_rot = H_rot / |H| = H_rot * scale
Then, a dot product between C and U_rot gives the perpendicular distance from the defect point to the hull edge, and abs() is used to get a positive magnitude in any orientation.
distance = abs(U_rot.C) = abs(-dy0 * dx + dx0 * dy) * scale
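As a small illustration, a direct C++ translation of that distance computation into a standalone function (with h0 = H[i], h1 = H[i+1], c = C[n]):

#include <opencv2/core.hpp>
#include <cmath>

// Perpendicular distance from contour point c to the hull edge (h0, h1),
// following the pseudocode above.
double defectDepth(const cv::Point& h0, const cv::Point& h1, const cv::Point& c)
{
    double dx0 = h1.x - h0.x;
    double dy0 = h1.y - h0.y;
    double len = std::sqrt(dx0 * dx0 + dy0 * dy0);
    double scale = (len == 0.0) ? 0.0 : 1.0 / len;   // precomputed once per hull edge

    double dx = c.x - h0.x;
    double dy = c.y - h0.y;
    return std::abs(-dy0 * dx + dx0 * dy) * scale;   // |U_rot . C|
}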
In the scenario depicted in the above image, in the first iteration the edge is defined by H[0] and H[1]. The contour points to examine for this edge are C[0], C[1], and C[2] (since C[3] == H[1]).
There are defects at C[1] and C[2]. The defect at C[1] is the deepest, so the algorithm will record (0, 1, 1, 50).
The next edge is defined by H[1] and H[2], and corresponding contour point C[3]. No defect is present, so nothing is recorded.
The next edge is defined by H[2] and H[3], and corresponding contour point C[4]. No defect is present, so nothing is recorded.
Since C[5] == H[3], the last contour point can be ignored -- there can't be a defect there.
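In practice you would normally just call the OpenCV function; a hedged usage sketch (assuming a single contour has already been extracted, e.g. with cv::findContours) might look like this:

#include <opencv2/imgproc.hpp>
#include <iostream>
#include <vector>

// contour: one contour previously extracted, e.g. with cv::findContours.
void reportDefects(const std::vector<cv::Point>& contour)
{
    std::vector<int> hullIdx;
    cv::convexHull(contour, hullIdx, false, false);   // hull as indices into the contour

    std::vector<cv::Vec4i> defects;
    cv::convexityDefects(contour, hullIdx, defects);

    for (const cv::Vec4i& d : defects)
        std::cout << "edge (" << d[0] << ", " << d[1] << "), deepest point " << d[2]
                  << ", depth " << d[3] / 256.0 << "\n";   // fixed-point depth back to float
}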

How do you calculate the average gradient direction and average gradient strength/magnitude

In OpenCV how do you calculate the average gradient strength in a Mat and the average gradient direction?
I have sourced the methods below by googling, but I want to confirm I am actually doing this correctly before moving on to the next step.
Is this correct?
Mat src = imread("foo.png", IMREAD_GRAYSCALE); // read the image as a grayscale, single-channel image
// Calculate the mean intensity and the std deviation
// Any errors here or am I doing this correctly?
Scalar sMean, sStdDev;
meanStdDev(src, sMean, sStdDev);
double mean = sMean[0];
double stddev = sStdDev[0];
// Calculate the average gradient magnitude/strength across the image
// Any errors here or am I doing this correctly?
Mat dX, dY, mag;
Sobel(src, dX, CV_32F, 1, 0, 1);
Sobel(src, dY, CV_32F, 0, 1, 1);
magnitude(dX, dY, mag); // per-pixel gradient magnitude sqrt(dX^2 + dY^2)
Scalar sMMean, sMStdDev;
meanStdDev(mag, sMMean, sMStdDev);
double magnitudeMean = sMMean[0];
double magnitudeStdDev = sMStdDev[0];
// Calculate the average gradient direction across the image
// Any errors here or am I doing this correctly?
Scalar avgHorizDir = mean(dX);
Scalar avgVertDir = mean(dY);
double avgDir = atan2(-avgVertDir[0], avgHorizDir[0]);
float blurriness = cv::videostab::calcBlurriness(src); // low values = sharper. High values = blurry
Technically those are the correct ways of obtaining the two averages.
The way you compute mean direction uses weighted directional statistics, meaning that pixels without a strong gradient have less influence on the average.
However, for most images this average direction is not very meaningful, as they contain edges in all directions, which cancel each other out.
If your image is of a single edge, then this will work great.
If your image has lines in it, it contains edges in opposite directions, and this will not work. In this case, you want to average the double angle (i.e. average orientations rather than directions). The obvious way of doing this is to compute the direction per pixel as an angle, double it, then use directional statistics to average (i.e. convert back to vectors and average those). Doubling the angle maps opposite directions to the same value, so averaging no longer cancels them out; see the sketch below.
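A minimal sketch of that, assuming dX and dY are the CV_32F Sobel derivatives from the question (unweighted: every pixel contributes one unit vector regardless of gradient strength):

Mat theta, twoTheta, cos2, sin2;
phase(dX, dY, theta);                          // per-pixel gradient direction in radians
twoTheta = 2 * theta;                          // double the angles
Mat ones = Mat::ones(theta.size(), theta.type());
polarToCart(ones, twoTheta, cos2, sin2);       // unit vectors (cos 2θ, sin 2θ)
double avgOrientation = 0.5 * atan2(mean(sin2)[0], mean(cos2)[0]);  // halve to undo the doubling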
Another simple way to average orientations is to take the average of the tensor field obtained by the outer product of the gradient field with itself, and determine the direction of the eigenvector corresponding to the largest eigenvalue. The per-pixel tensor field is obtained as follows (note the element-wise products):
Mat Sxx = dX.mul(dX);
Mat Syy = dY.mul(dY);
Mat Sxy = dX.mul(dY);
This should then be averaged over the image:
Scalar mSxx = mean(Sxx);
Scalar mSyy = mean(Syy);
Scalar mSxy = mean(Sxy);
These values form a 2x2 real-valued symmetric matrix:
| mSxx mSxy |
| mSxy mSyy |
It is relatively straightforward to determine its eigendecomposition, and this can be done analytically. I don’t have the equations on hand right now, so I’ll leave it as an exercise to the reader. :)
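For reference, a hedged sketch of that analytic step: for a symmetric 2x2 matrix | a b ; b c |, the eigenvector of the largest eigenvalue lies at angle 0.5 * atan2(2b, a - c), so using the averaged values from above:

double a = mSxx[0], b = mSxy[0], c = mSyy[0];
double dominantOrientation = 0.5 * atan2(2.0 * b, a - c);  // orientation of the largest-eigenvalue eigenvector, in radians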

How to calculate quantized angle?

I am looking at the code for the Hough transform in image segmentation. The following code is from Computer Vision by Linda Shapiro. Can somebody tell me what quantize_angle is and how I can compute it?
The Hough transform looks for straight lines (or other features) in an image and represents these features as points in a different 2D coordinate system, where one axis represents the angle θ of a detected line, and the other represents the distance δ from this line to the centre of the image.
Source: Wikipedia
To produce a Hough transform of finite dimensions, both θ and δ have to be quantized. For example, if θ lies in the range (0 ≤ θ < 2π), then you could map it to the range 0–255 by a function such as the following:
int quantize_angle(float theta) {
    int q = floor(theta * 128.0 / 3.141592654 + 0.5);
    return q % 256;
}
This will result in a Hough transform that is 256 pixels wide.
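As a quick sanity check of that mapping, a small hypothetical usage sketch:

#include <cstdio>
#include <cmath>

// quantize_angle() as defined above; a few sample angles in radians.
int main() {
    const float pi = 3.141592654f;
    printf("%d\n", quantize_angle(0.0f));        // 0
    printf("%d\n", quantize_angle(pi / 2));      // 64
    printf("%d\n", quantize_angle(pi));          // 128
    printf("%d\n", quantize_angle(3 * pi / 2));  // 192
    return 0;
}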

Image similarity (histogram matching/euclidean distance)

I have been searching this for days but I can't seem to figure out where to start. I am trying to compare a base image with 10 other images by their color, and I am bound to use either Euclidean distance or histogram matching without using OpenCV functions. I have only tried Euclidean distance. What I want to do is get the distance between each pixel in image1 and image2. I displayed the distances and I am getting very high values. What could be wrong in my code? Please help. :)
for(p=0;p<height;p++) // row
{
    for(p2=0;p2<inputHeight;p2++) // row
    {
        for(u2=0;u2<inputWidth;u2++) // col
        {
            r2 = inputData[p2*inputStep+u2*inputChannels+2];
            g2 = inputData[p2*inputStep+u2*inputChannels+1];
            b2 = inputData[p2*inputStep+u2*inputChannels+0];
        }
    }
    for(p=0;p<height;p++) // row
    {
        for(u=0;u<width;u++) // col
        {
            r = data[p*step+u*channels+2];
            g = data[p*step+u*channels+1];
            b = data[p*step+u*channels+0];
        }
    }
    euclidean=(euclidean+sqrt(pow(b2-b,2) + pow(g2-g, 2) + pow(r2-r,2)));
}
Your program gets very high values because you sum up the Euclidean distances of all pixels:
euclidean=(euclidean+sqrt(pow(b2-b,2) + pow(g2-g, 2) + pow(r2-r,2)));
I suggest you do the following instead:
Compute a color histogram (feature vector) for each image.
Compute the correlation coefficient between these histograms as the measure of how different the images are; see the sketch below.
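For example, a minimal C++ sketch of that suggestion (no OpenCV histogram functions, just raw pixel access; each histogram is the concatenation of 256 bins per channel, and the Pearson correlation coefficient is computed by hand):

#include <vector>
#include <cmath>

// Build a 768-bin histogram (256 bins each for B, G, R) from interleaved BGR data.
std::vector<double> colorHistogram(const unsigned char* data, int width, int height,
                                   int step, int channels)
{
    std::vector<double> hist(768, 0.0);
    for (int y = 0; y < height; ++y)
        for (int x = 0; x < width; ++x)
            for (int c = 0; c < 3; ++c)
                hist[c * 256 + data[y * step + x * channels + c]] += 1.0;
    return hist;
}

// Pearson correlation coefficient between two histograms: values near 1 mean similar shape.
double correlation(const std::vector<double>& h1, const std::vector<double>& h2)
{
    double n = h1.size(), m1 = 0, m2 = 0;
    for (size_t i = 0; i < h1.size(); ++i) { m1 += h1[i]; m2 += h2[i]; }
    m1 /= n; m2 /= n;

    double num = 0, d1 = 0, d2 = 0;
    for (size_t i = 0; i < h1.size(); ++i) {
        num += (h1[i] - m1) * (h2[i] - m2);
        d1  += (h1[i] - m1) * (h1[i] - m1);
        d2  += (h2[i] - m2) * (h2[i] - m2);
    }
    return num / std::sqrt(d1 * d2);
}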

SIFT clustering converting sift features (128 dimensional vector) into a vocabulary

How do I cluster the extracted SIFT descriptors? The aim of the clustering is to use the result for classification.
Approach:
First of all, compute the SIFT descriptors for each image/object and then push_back each descriptor matrix into a single Mat (let's call it featuresUnclustered).
After that, your task is to cluster all the descriptors into some number of groups/clusters (the number is decided by you). That will be the size of your vocabulary/dictionary.
int dictionarySize=200;
And then finally comes the step of clustering them
//define Term Criteria
TermCriteria tc(CV_TERMCRIT_ITER,100,0.001);
//retries number
int retries=1;
//necessary flags
int flags=KMEANS_PP_CENTERS;
//Create the BoW (or BoF) trainer
BOWKMeansTrainer bowTrainer(dictionarySize,tc,retries,flags);
//cluster the feature vectors
Mat dictionary=bowTrainer.cluster(featuresUnclustered);
To cluster, stack the N×128 descriptor matrices (N being the number of descriptors from each image) into a single M×128 array (M being the total number of descriptors over all images), and perform the clustering on this data.
e.g.:
from numpy import zeros, zeros_like, vstack, resize
from scipy.cluster import vq

PRE_ALLOCATION_BUFFER = 1000  # assumed pre-allocation size per key; tune as needed

def dict2numpy(dict):
    # dict maps each image/key to its (n x 128) descriptor array; stack them all
    nkeys = len(dict)
    array = zeros((nkeys * PRE_ALLOCATION_BUFFER, 128))
    pivot = 0
    for key in dict.keys():
        value = dict[key]
        nelements = value.shape[0]
        while pivot + nelements > array.shape[0]:
            padding = zeros_like(array)
            array = vstack((array, padding))
        array[pivot:pivot + nelements] = value
        pivot += nelements
    array = resize(array, (pivot, 128))
    return array

all_features_array = dict2numpy(all_features)
nfeatures = all_features_array.shape[0]
nclusters = 100
codebook, distortion = vq.kmeans(all_features_array, nclusters)
Usually k-means is applied to get K centers; you can then represent each image as a K-dimensional vector, where each dimension counts how many of that image's descriptors fall into the corresponding cluster (see the sketch below).
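A hedged C++ sketch of that last step, assuming descriptors is the N x 128 CV_32F descriptor matrix of one image and dictionary is the K x 128 matrix of cluster centers produced above:

#include <opencv2/core.hpp>
#include <vector>

// Bag-of-words histogram: count how many descriptors fall into each cluster.
std::vector<int> bowHistogram(const cv::Mat& descriptors, const cv::Mat& dictionary)
{
    std::vector<int> hist(dictionary.rows, 0);
    for (int i = 0; i < descriptors.rows; ++i) {
        int best = 0;
        double bestDist = cv::norm(descriptors.row(i), dictionary.row(0), cv::NORM_L2);
        for (int k = 1; k < dictionary.rows; ++k) {
            double d = cv::norm(descriptors.row(i), dictionary.row(k), cv::NORM_L2);
            if (d < bestDist) { bestDist = d; best = k; }
        }
        ++hist[best];                       // nearest center gets the vote
    }
    return hist;
}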
