Image similarity (histogram matching / Euclidean distance) - OpenCV

I have been searching this for days but I can't seem to figure out where to start. I am trying to compare a base image with 10 other images by their color, and I am required to use either Euclidean distance or histogram matching, without using OpenCV functions. So far I have only tried Euclidean distance. What I want to do is get the distance between each pixel in image1 and image2. When I display the distances I get very high values. What could be wrong in my code? Please help. :)
for(p=0;p<height;p++) // row
{
    for(p2=0;p2<inputHeight;p2++) // row
    {
        for(u2=0;u2<inputWidth;u2++) // col
        {
            r2 = inputData[p2*inputStep+u2*inputChannels+2];
            g2 = inputData[p2*inputStep+u2*inputChannels+1];
            b2 = inputData[p2*inputStep+u2*inputChannels+0];
        }
    }
    for(p=0;p<height;p++) // row
    {
        for(u=0;u<width;u++) // col
        {
            r = data[p*step+u*channels+2];
            g = data[p*step+u*channels+1];
            b = data[p*step+u*channels+0];
        }
    }
    euclidean=(euclidean+sqrt(pow(b2-b,2) + pow(g2-g, 2) + pow(r2-r,2)));
}

Your program gets a very high value because you summed the Euclidean distances of all pixels together:
euclidean=(euclidean+sqrt(pow(b2-b,2) + pow(g2-g, 2) + pow(r2-r,2)));
I suggest you do the following:
Compute a color histogram (feature vector) for each image.
Compute the correlation coefficient between these histograms as the measure of difference between the images.
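A rough sketch of that idea in NumPy (not using OpenCV's calcHist; the 32-bin count and the assumption of HxWx3 uint8 BGR arrays are mine):

import numpy as np

def color_histogram(image, bins=32):
    # Concatenate per-channel histograms into one feature vector
    feats = []
    for c in range(3):  # B, G, R channels
        hist, _ = np.histogram(image[:, :, c], bins=bins, range=(0, 256))
        feats.append(hist)
    feats = np.concatenate(feats).astype(float)
    return feats / feats.sum()  # normalize so image size doesn't matter

def histogram_similarity(img1, img2):
    # Correlation coefficient between the two histogram vectors (1.0 = identical)
    h1, h2 = color_histogram(img1), color_histogram(img2)
    return np.corrcoef(h1, h2)[0, 1]

A value close to 1 means the color distributions are similar; you would compare the base image against each of the 10 candidates this way.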

Related

Euclidean distance between RGB histogram of two images

I have two pictures with histograms of the R, G, B intensities for each image. I am supposed to find the Euclidean distance using the histogram values in order to measure similarity.
I know the Euclidean distance formula is:
sqrt((R1-R2)^2 + (G1-G2)^2 + (B1-B2)^2)
Since the R, G and B histograms for each image have several values, am I supposed to take the average of all the intensity values in one histogram and then subtract it from the average of the intensity values of the other histogram?
Example 1:
Image1: R1 histogram has values of 2,3,4
Image2: R2 histogram has values of 2,3,1
Then do I compute R1 = (2+3+4)/3 = 3 and R2 = (2+3+1)/3 = 2,
and then use (3-2)^2 for the (R1-R2)^2 term in sqrt((R1-R2)^2 + (G1-G2)^2 + (B1-B2)^2)?
OR
Example 2:
Image1: R1 histogram has values of 2,3,4
Image2: R2 histogram has values of 2,3,1
Then do I compute (2-2)^2 + (3-3)^2 + (4-1)^2 for the (R1-R2)^2 term in sqrt((R1-R2)^2 + (G1-G2)^2 + (B1-B2)^2)?
Please help me out, thanks!
Think of a histogram as a vector (maybe there are 256 bins, so it’s a 256-dimensional vector). Now compute the Euclidean distance between the two vectors:
DR = norm(R1-R2); % same as sqrt(sum((R1-R2).^2))
You can repeat this for each R, G and B component, and combine the three distances again using the Euclidean norm:
D = sqrt(DR.^2 + DG.^2 + DB.^2);
This is the same as concatenating the 3 color histograms for each image and computing their distance:
H1 = [R1,G1,B1]; % assuming histograms are row vectors
H2 = [R2,G2,B2];
D = norm(H1-H2);
I think you are mixing up normalization with Euclidean distance.
Euclidean Distance = Sqrt( Sum( ( a[i][j] - b[i][j] )^2 ) ) for all i = 0..width, j = 0..height
a[][] and b[][] can be normalized data or non-normalized data. If you are using the raw image pixel values, they are non-normalized. You can normalize the images by dividing by the intensity range of the pixel values (min-max normalization).
So, compute the normalized images anorm[][] and bnorm[][] in the first pass where,
for(i = 0; i < width; i++) {
    for(j = 0; j < height; j++) {
        anorm[i][j] = (a[i][j] - min_a) / (max_a - min_a);
        bnorm[i][j] = (b[i][j] - min_b) / (max_b - min_b);
    }
}
Now, apply the Euclidean Distance formula on anorm[][] and bnorm[][].
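A minimal NumPy sketch of the same two-pass idea (the function name is illustrative; it uses the usual (x - min) / (max - min) form of min-max normalization):

import numpy as np

def normalized_euclidean(a, b):
    a = a.astype(float)
    b = b.astype(float)
    # First pass: min-max normalize each image to [0, 1]
    a_norm = (a - a.min()) / (a.max() - a.min())
    b_norm = (b - b.min()) / (b.max() - b.min())
    # Second pass: Euclidean distance over all pixels
    return np.sqrt(np.sum((a_norm - b_norm) ** 2))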

Line detection in noisy image (and no detection when it is not present)

I have tried to extract the dark line inside very noisy images, without success. Any tips?
My current steps for the first example:
1) CLAHE: with clip_limit = 10 and grid_size = (8,8)
2) Box Filter: with size = (5,5)
3) Inverted Image: 255 - image
4) Threshold: when inverted_image < 64
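For reference, those four steps in OpenCV's Python bindings would look roughly like this (the file name is a placeholder):

import cv2

img = cv2.imread("noisy_line.png", cv2.IMREAD_GRAYSCALE)

# 1) CLAHE with clip_limit = 10 and grid_size = (8, 8)
clahe = cv2.createCLAHE(clipLimit=10.0, tileGridSize=(8, 8))
img = clahe.apply(img)

# 2) Box filter with a (5, 5) kernel
img = cv2.boxFilter(img, -1, (5, 5))

# 3) Invert the image
inverted = 255 - img

# 4) Threshold where inverted_image < 64
mask = (inverted < 64).astype("uint8") * 255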
UPDATE
I have performed some preprocessing steps to improve the quality of the tested images. I adjusted my ROI mask to crop the top and bottom (because they have low intensities) and added an illumination correction to see the line better. The current images are shown below:
Even though the images are noisy, you are only looking for straight lines oriented towards the top (north) of the image. So why not use some kind of matched filter combined with morphological operations?
EDIT: I have modified it.
1) Use a median filter along the x and y axes, and normalize the images.
2) Apply a matched filter with all possible line orientations.
% im = imread('JwXON.png');
% im = imread('Fiy72.png');
% im = imread('Ya9AN.png');
im = imread('OcgaIt8.png');
imOrig = im;
matchesx = fl(im, 1);
matchesy = fl(im, 0);
matches = matchesx + matchesy;
[x, y] = find(matches);
figure(1);
imagesc(imOrig), axis image
hold on, plot(y, x, 'r.', 'MarkerSize', 5)
colormap gray
%----------
function matches = fl(im, direc)
    if size(im,3) ~= 1
        im = double(rgb2gray(im));
    else
        im = double(im);
    end
    [n, m] = size(im);
    mask = bwmorph(imfill(im > 0, 'holes'), 'thin', 10);
    indNaN = find(im == 0); im = 255 - im; im(indNaN) = 0;
    N = n - numel(find(im(:, ceil(m/2)) == 0));
    N = ceil(N * 0.8); % possible line length
    % Normalize the image with a median filter
    if direc
        background = medfilt2(im, [1, 30], 'symmetric');
        thetas = 31:149;
    else
        background = medfilt2(im, [30, 1], 'symmetric');
        thetas = [1:30 150:179];
    end
    normIm = im - background;
    normIm(normIm < 0) = 0;
    % initialize matched filter result
    matches = im * 0;
    % search over different line angles
    for theta = thetas
        normIm2 = imclose(normIm > 0, strel('line', 5, theta));
        normIm3 = imopen(normIm2 > 0, strel('line', N, theta));
        matches = matches + normIm3;
    end
    % eliminate false alarms
    matches = imclose(matches, strel('disk', 2));
    matches = matches > 3 & mask;
    matches = bwareaopen(matches, 100);

SIFT clustering: converting SIFT features (128-dimensional vectors) into a vocabulary

How do I cluster the extracted SIFT descriptors? The aim of the clustering is to use the result for classification.
Approach:
First of all, compute the SIFT descriptors for each image/object and then push_back each descriptor matrix into a single Mat (let's call that Mat featuresUnclustered).
After that, your task is to cluster all the descriptors into some number of groups/clusters (decided by you); that number will be the size of your vocabulary/dictionary.
int dictionarySize=200;
And then finally comes the step of clustering them
//define Term Criteria
TermCriteria tc(CV_TERMCRIT_ITER,100,0.001);
//retries number
int retries=1;
//necessary flags
int flags=KMEANS_PP_CENTERS;
//Create the BoW (or BoF) trainer
BOWKMeansTrainer bowTrainer(dictionarySize,tc,retries,flags);
//cluster the feature vectors
Mat dictionary=bowTrainer.cluster(featuresUnclustered);
To cluster, convert the N x 128 descriptor matrix from each image (N is the number of descriptors in that image) into one M x 128 array (M is the total number of descriptors across all images), and run the clustering on that data.
For example:
from numpy import zeros, zeros_like, vstack, resize
from scipy.cluster import vq

PRE_ALLOCATION_BUFFER = 100  # assumed rough guess at descriptors per image

def dict2numpy(dict):
    nkeys = len(dict)
    array = zeros((nkeys * PRE_ALLOCATION_BUFFER, 128))
    pivot = 0
    for key in dict.keys():
        value = dict[key]
        nelements = value.shape[0]
        # grow the pre-allocated array if it is too small
        while pivot + nelements > array.shape[0]:
            padding = zeros_like(array)
            array = vstack((array, padding))
        array[pivot:pivot + nelements] = value
        pivot += nelements
    array = resize(array, (pivot, 128))
    return array

all_features_array = dict2numpy(all_features)
nfeatures = all_features_array.shape[0]
nclusters = 100
codebook, distortion = vq.kmeans(all_features_array, nclusters)
Usually k-means is applied to get the K centers; you can then turn each image into a K-dimensional vector (each dimension represents how many of that image's patches fall into that cluster).
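A hedged sketch of that last step with SciPy (the function name is mine): quantize each image's descriptors against the codebook and count how many fall into each of the K clusters.

import numpy as np
from scipy.cluster import vq

def bow_histogram(descriptors, codebook):
    # descriptors: N x 128 array for one image; codebook: K x 128 cluster centers
    words, _ = vq.vq(descriptors, codebook)  # nearest cluster index per descriptor
    hist, _ = np.histogram(words, bins=np.arange(codebook.shape[0] + 1))
    return hist.astype(float) / max(len(descriptors), 1)  # normalized K-dim vector

These K-dimensional vectors are what you would feed to a classifier.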

Find distance between two lines (OpenCV)

I have the below image after some conversions.
How can I find a distance between these two lines?
A simple way to do this would be
- Scan across a row until you find a pixel above a threshold.
- Keep scanning until you find a pixel below the threshold.
- Count the pixels until the next pixel above the threshold.
- Take the average across a number of rows sampled from the image (or all rows)
- You'll need to know the image resolution (e.g. dots per inch) to convert the count to an actual distance
An efficient method to scan across rows can be found in the OpenCV documentation
A more complicated approach would use HoughLines to extract the lines. It will give you two points on each line (hopefully you only have two lines). From those points you can work out the distance, assuming the lines are parallel (a rough sketch of this follows the skeleton code below).
A skeleton code (not efficient, just readable so that you know how to do it) would be,
cv::Mat source = cv::imread("source.jpg", CV_LOAD_IMAGE_GRAYSCALE);
std::vector<double> output;
int threshold = 35;   // Change in accordance with data
double DPI = 30.0;    // Dots (pixels) per inch
for (int i = 0; i < source.rows; ++i)        // scan each row
{
    for (int j = 0; j < source.cols; ++j)    // scan across the row
    {
        if (source.at<unsigned char>(i, j) > threshold)
        {
            // Skip the remaining pixels of the first line
            while (j < source.cols && source.at<unsigned char>(i, j) > threshold) ++j;
            int gap_start = j;
            // Count pixels until the next pixel above the threshold
            while (j < source.cols && source.at<unsigned char>(i, j) <= threshold) ++j;
            if (j < source.cols)
                output.push_back((j - gap_start) / DPI); // gap stored in inches
            break; // one measurement per row
        }
    }
}
Afterwards, you could take an average of all the elements in output, etc.
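For the HoughLines approach mentioned above, a rough sketch in OpenCV's Python bindings might look like this (the parameter values and the assumption of two roughly vertical lines are mine):

import cv2
import numpy as np

img = cv2.imread("source.jpg", cv2.IMREAD_GRAYSCALE)
edges = cv2.Canny(img, 50, 150)
lines = cv2.HoughLinesP(edges, 1, np.pi / 180, threshold=80,
                        minLineLength=50, maxLineGap=10)

if lines is not None and len(lines) >= 2:
    # Take the mean x of each detected segment and measure the horizontal gap
    xs = sorted((x1 + x2) / 2.0 for x1, y1, x2, y2 in lines[:, 0])
    print("distance in pixels:", xs[-1] - xs[0])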
HTH
Assumptions:
You have only two continuous lines without any break in between.
No other pixels (noise) apart from the lines
My proposed solution: almost the same as the one given above.
Mark the leftmost line as line 1 and the rightmost line as line 2.
Scan the image (Mat in OpenCV) from the leftmost column and make a list of points matching the pixel value of line 1
Scan the image (Mat in OpenCV) from the rightmost column and make a list of points matching the pixel value of line 2
Calculate the distance between corresponding points from those lists using the code below.
public double euclideanDistance(Point a, Point b){
    double distance = 0.0;
    try{
        if(a != null && b != null){
            double xDiff = a.x - b.x;
            double yDiff = a.y - b.y;
            distance = Math.sqrt(Math.pow(xDiff, 2) + Math.pow(yDiff, 2));
        }
    }catch(Exception e){
        System.err.println("Something went wrong in euclideanDistance function in "+Utility.class+" "+e.getMessage());
    }
    return distance;
}
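Putting the scanning steps together, a rough NumPy sketch of the row-wise gap measurement (assuming a binary image where line pixels are nonzero; names are illustrative):

import numpy as np

def mean_line_gap(binary):
    gaps = []
    for row in binary:
        cols = np.flatnonzero(row)           # columns of line pixels in this row
        if cols.size >= 2:
            gaps.append(cols[-1] - cols[0])  # rightmost minus leftmost line pixel
    return float(np.mean(gaps)) if gaps else 0.0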

Laplacian of gaussian filter use

This is the formula for LoG filtering:
LoG(x, y) = -1/(pi*sigma^4) * (1 - (x^2 + y^2)/(2*sigma^2)) * exp(-(x^2 + y^2)/(2*sigma^2))
(source: ed.ac.uk)
Also, in applications that use LoG filtering, I see that the function is called with only one parameter:
sigma (σ).
I want to try LoG filtering using that formula (my previous attempt was a Gaussian filter followed by a Laplacian filter with some filter-window size).
But looking at that formula I can't understand how the size of the filter is connected to it. Does that mean the filter size is fixed?
Can you explain how to use it?
As you've probably figured out by now from the other answers and links, the LoG filter detects edges and lines in the image. What is still missing is an explanation of what σ is.
σ is the scale of the filter. Is a one-pixel-wide line a line or noise? Is a line 6 pixels wide a line or an object with two distinct parallel edges? Is a gradient that changes from black to white across 6 or 8 pixels an edge or just a gradient? It's something you have to decide, and the value of σ reflects your decision: the larger σ is, the wider the lines, the smoother the edges, and the more noise is ignored.
Do not get confused between the scale of the filter (σ) and the size of the discrete approximation (usually called stencil). In Paul's link σ=1.4 and the stencil size is 9. While it is usually reasonable to use stencil size of 4σ to 6σ, these two quantities are quite independent. A larger stencil provides better approximation of the filter, but in most cases you don't need a very good approximation.
This was something that confused me too, and it wasn't until I had to do the same as you for a uni project that I understood what you were supposed to do with the formula!
You can use this formula to generate a discrete LoG filter. If you write a bit of code to implement that formula, you can then generate a filter for use in image convolution. To generate, say, a 5x5 template, simply call the code with x and y ranging from -2 to +2.
This will generate the values to use in a LoG template. If you graph the values this produces you should see the "mexican hat" shape typical of this filter, like so:
(source: ed.ac.uk)
You can fine tune the template by changing how wide it is (the size) and the sigma value (how broad the peak is). The wider and broader the template the less affected by noise the result will be because it will operate over a wider area.
Once you have the filter, you can apply it to the image by convolving the template with the image. If you've not done this before, have a look at a convolution tutorial (there are Java applet demos as well as more mathematical treatments online).
Essentially, at each pixel location, you "place" your convolution template, centred at that pixel. You then multiply the surrounding pixel values by the corresponding "pixel" in the template and add up the result. This is then the new pixel value at that location (typically you also have to normalise (scale) the output to bring it back into the correct value range).
The code below gives a rough idea of how you might implement this. Please forgive any mistakes / typos etc. as it hasn't been tested.
I hope this helps.
private float LoG(float x, float y, float sigma)
{
    // LoG(x, y) = -1/(pi*sigma^4) * (1 - (x^2+y^2)/(2*sigma^2)) * exp(-(x^2+y^2)/(2*sigma^2))
    float r2 = x * x + y * y;
    float s2 = sigma * sigma;
    return (float) ((-1.0 / (Math.PI * s2 * s2))
            * (1.0 - r2 / (2 * s2))
            * Math.exp(-r2 / (2 * s2)));
}
private void GenerateTemplate(int templateSize, float sigma)
{
    // Make sure it's an odd number for convenience
    if (templateSize % 2 == 1)
    {
        // Create the data array
        float[][] template = new float[templateSize][templateSize];
        // Work out the "min and max" values. LoG is centred around 0, 0
        // so, for a size 5 template (say) we want to get the values from
        // -2 to +2, i.e. -2, -1, 0, +1, +2, and feed those into the formula.
        int min = -(templateSize / 2);
        int max = templateSize / 2;
        // We also need counts to index into the data array...
        int xCount = 0;
        for (int x = min; x <= max; ++x)
        {
            int yCount = 0; // reset for every column of the template
            for (int y = min; y <= max; ++y)
            {
                // Get the LoG value for this (x, y) pair
                template[xCount][yCount] = LoG(x, y, sigma);
                ++yCount;
            }
            ++xCount;
        }
    }
}
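To make the convolution step described above concrete, here is a hedged NumPy sketch (a direct, unoptimized double loop; it assumes a 2-D float image and an odd-sized square template):

import numpy as np

def convolve(image, template):
    k = template.shape[0] // 2
    padded = np.pad(image, k, mode="edge")  # pad so border pixels can be processed
    out = np.zeros_like(image, dtype=float)
    for i in range(image.shape[0]):
        for j in range(image.shape[1]):
            window = padded[i:i + 2 * k + 1, j:j + 2 * k + 1]
            # Multiply the surrounding pixels by the template and add up
            # (the LoG template is symmetric, so no kernel flip is needed)
            out[i, j] = np.sum(window * template)
    return out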
Just for visualization purposes, here is a simple Matlab 3D colored plot of the Laplacian of Gaussian (Mexican Hat) wavelet. You can change the sigma(σ) parameter and see its effect on the shape of the graph:
sigmaSq = 0.5 % Square of σ parameter
[x y] = meshgrid(linspace(-3,3), linspace(-3,3));
z = (-1/(pi*(sigmaSq^2))) .* (1-((x.^2+y.^2)/(2*sigmaSq))) .*exp(-(x.^2+y.^2)/(2*sigmaSq));
surf(x,y,z)
You could also compare the effects of the sigma parameter on the Mexican Hat doing the following:
t = -5:0.01:5;
sigma = 0.5;
mexhat05 = exp(-t.*t/(2*sigma*sigma)) * 2 .*(t.*t/(sigma*sigma) - 1) / (pi^(1/4)*sqrt(3*sigma));
sigma = 1;
mexhat1 = exp(-t.*t/(2*sigma*sigma)) * 2 .*(t.*t/(sigma*sigma) - 1) / (pi^(1/4)*sqrt(3*sigma));
sigma = 2;
mexhat2 = exp(-t.*t/(2*sigma*sigma)) * 2 .*(t.*t/(sigma*sigma) - 1) / (pi^(1/4)*sqrt(3*sigma));
plot(t, mexhat05, 'r', ...
t, mexhat1, 'b', ...
t, mexhat2, 'g');
Or simply use the Wavelet toolbox provided by Matlab as follows:
lb = -5; ub = 5; n = 1000;
[psi,x] = mexihat(lb,ub,n);
plot(x,psi), title('Mexican hat wavelet')
I found this useful when implementing edge detection in computer vision. Although this is not an exact answer, I hope it helps.
It appears to be a continuous circular filter whose radius is sqrt(2) * sigma. If you want to implement this for image processing you'll need to approximate it.
There's an example for sigma = 1.4 here: http://homepages.inf.ed.ac.uk/rbf/HIPR2/log.htm
