I am trying to compare images based on their Euclidean Distance. I have come across this pseudo code:
sqrt((r1-r2)^2 + (g1-g2)^2 + (b1-b2)^2)
What I am trying to figure out is: in the pseudo code above, does (r1-r2) mean subtracting the red value in image 1 from the red value in image 2?
Yeah, this is the most basic form of Euclidean color distance. You compare one pixel's color to another pixel's color by measuring the distance between the corresponding components of the two pixels.
Pixels (usually) have three components in RGB, and you compare the pixels component by component. So for #FFAA00 and #F8A010, you have 0xFF for R1 and 0xF8 for R2.
There are plenty of other distance measures, such as CIE Delta E 2000 (CIEDE2000) in the CIELab color space, but that's the core idea behind color distance.
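To compare two whole images you apply this per pixel (pairing the pixel at (x, y) in image 1 with the pixel at (x, y) in image 2) and then aggregate the per-pixel distances, e.g. by summing or averaging them. A minimal sketch in Python (the function name and the aggregation choice are mine):
import math

def color_distance(c1, c2):
    """Euclidean distance between two (R, G, B) colors."""
    r1, g1, b1 = c1
    r2, g2, b2 = c2
    return math.sqrt((r1 - r2) ** 2 + (g1 - g2) ** 2 + (b1 - b2) ** 2)

# The example from above: R1 = 0xFF, R2 = 0xF8, and so on for G and B.
print(color_distance((0xFF, 0xAA, 0x00), (0xF8, 0xA0, 0x10)))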
I am analyzing a very big number of images and extracting the dominant color codes.
I want to group them into ranges of generic color names, like Green, Dark Green, Light Green, Blue, Dark Blue, Light Blue and so on.
I am looking for a language-agnostic way to implement this myself; if there are examples I can look into in order to achieve this, I would be more than grateful.
In the machine learning field, what you want to do is called classification: the goal is to assign the label of one of the classes (colors) to each of the observations (images).
To do this, classes must be pre-defined. Suppose these are the colors we want to assign to images:
To determine the dominant color of an image, the distance between each of its pixels and all the colors in the table must be calculated. Note that this distance is calculated in RGB color space. To calculate the distance between the ij-th pixel of the image and the k-th color of the table, the following equation can be used:
d_ijk = sqrt((r_ij-r_k)^2+(g_ij-g_k)^2+(b_ij-b_k)^2)
In the next step, for each pixel, the closest color in the table is selected. This is the same concept used to compress an image with indexed colors (except that here the palette is shared by all images and is not calculated per image to minimize the difference between the original and the indexed image). Now, as #jairoar pointed out, we can take the histogram of the indexed image (not to be confused with an RGB or intensity histogram) and pick the color that is repeated most often.
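For illustration, here is a rough NumPy sketch of these two steps (the palette below is only a placeholder; substitute your own color table):
import numpy as np

# Placeholder palette: k predefined colors as (R, G, B) rows
palette = np.array([
    [255,   0,   0],    # red
    [  0, 128,   0],    # green
    [  0,   0, 255],    # blue
    [255, 255, 255],    # white
], dtype=np.float64)

def dominant_color_index(image_rgb):
    # image_rgb: H x W x 3 array
    pixels = image_rgb.reshape(-1, 3).astype(np.float64)
    # d_ijk for every pixel and every palette color (Euclidean distance in RGB)
    dists = np.linalg.norm(pixels[:, None, :] - palette[None, :, :], axis=2)
    nearest = np.argmin(dists, axis=1)                     # index each pixel
    counts = np.bincount(nearest, minlength=len(palette))  # histogram of the indexed image
    return np.argmax(counts)                               # the most repeated color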
To show the result of these steps, I used random crops of this work of art of mine:
This is how images look, before and after indexing (left: original, right: indexed):
And these are the most repeated colors (left: indexed, right: dominant color):
But since you said the number of images is large, you should know that these calculations are relatively time-consuming. The good news is that there are ways to increase the performance. For example, instead of the Euclidean distance (formula above), you can use the City Block (Manhattan) or Chebyshev distance. You can also calculate the distance for only a fraction of the pixels instead of all the pixels in an image: first scale the image down to a much smaller size (for example, 32 by 32) and perform the calculations on the pixels of this reduced image. If you decide to resize the images, do not bother with bilinear or bicubic interpolation; it isn't worth the extra computation. Instead, go for nearest neighbor, which effectively performs a rectangular lattice sampling of the original image, as in the sketch below.
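As a concrete illustration of the downscaling step, with OpenCV in Python (the file name and the 32x32 size are only examples):
import cv2

image = cv2.imread("image.png", cv2.IMREAD_COLOR)
# Nearest-neighbor resize: samples the original on a regular grid, so no new
# interpolated colors are introduced and it is cheap to compute
small = cv2.resize(image, (32, 32), interpolation=cv2.INTER_NEAREST)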
Although the mentioned changes will greatly increase the speed of the calculations, nothing good comes for free; this is a trade-off between performance and accuracy. For example, in the previous two pictures, the image that was initially recognized as orange (code 20) is recognized as pink (code 26) after resizing.
To determine the parameters of the algorithm (distance measure, reduced image size and scaling algorithm), first perform the classification on a number of images with the highest possible accuracy and keep the results as ground truth. Then, with multiple experiments, find a combination of parameters that keeps the classification error below a maximum tolerable value.
#saastn's fantastic answer assumes you have a set of pre-defined colors that you want to sort your images into. The implementation is easier if you just want to classify the images to one color out of some set of X equidistant colors, a la histogram.
To summarize, round the color of each pixel in the image to the nearest color out of some set of equidistant color bins. This reduces the precision of your colors down to whatever amount of colors that you desire. Then count all of the colors in the image and select the most frequent color as your classification for that image.
Here is my implementation of this in Python:
import cv2
import numpy as np
#Set this to the number of colors that you want to classify the images to
number_of_colors = 8
#Verify that the number of colors chosen is between the minimum possible and maximum possible for an RGB image.
assert 8 <= number_of_colors <= 16777216
#Get the cube root of the number of colors to determine how many bins to split each channel into (this works as intended when number_of_colors is a perfect cube, e.g. 8, 27 or 64).
number_of_values_per_channel = number_of_colors ** ( 1 / 3 )
#Each channel value (0-255) will be divided by this divisor and rounded, mapping it onto one of number_of_values_per_channel bins (minus one because the zero bin is included).
divisor = 255 / (number_of_values_per_channel - 1)
#load the image and convert it to float32 for greater precision. cv2 loads the image in BGR (as opposed to RGB) format.
image = cv2.imread("image.png", cv2.IMREAD_COLOR).astype(np.float32)
#Divide each pixel by the divisor defined above, round to the nearest bin, then convert float32 back to uint8.
image = np.round(image / divisor).astype(np.uint8)
#Flatten the columns and rows into just one column per channel so that it will be easier to compare the columns across the channels.
image = image.reshape(-1, image.shape[2])
#Find and count matching rows (pixels), where each row consists of three values spread across the three channels (the Blue, Green and Red columns, since cv2 loads images in BGR order).
uniques = np.unique(image, axis=0, return_counts=True)
#The first of the two arrays returned by np.unique is an array comprising all of the unique colors.
colors = uniques[0]
#The second of the two arrays returned by np.unique is an array comprising the counts of all of the unique colors.
color_counts = uniques[1]
#Get the index of the color with the greatest frequency
most_common_color_index = np.argmax(color_counts)
#Get the color that was the most common
most_common_color = colors[most_common_color_index]
#Multiply the channel values by the divisor to return the values to a range between 0 and 255
most_common_color = most_common_color * divisor
#If you want to name each color, you could also provide a list, sorted from lowest to highest BGR values, comprising
#the name of each possible color, and then use most_common_color_index to retrieve the name.
print(most_common_color)
I have two boolean masks that I got from object detection for two video frames i and i+1. Now I want to "average" them to remove noise. The masks are closed convex curves. So basically I want to find the middle line between them. How can I do this?
Here is an example:
Let's say that we have two masks, red and blue, for two successive frames; after filtering we need to get something like the green line that lies between the two contours.
You can accomplish this using the distance transform.
The core idea is to compute the signed distance to the edge of each mask, and find the zero level set for the average. There is no need to require convex masks for this algorithm. I do assume that the inputs are solid masks (i.e. a filled contour).
The distance transform computes the (Euclidean) distance of each object pixel to the nearest background pixel. The signed distance to the edge is formed by the combination of two distance transforms: the distance transform of the object and the distance transform of the background (i.e. of the inverted mask). The latter, subtracted from the former, gives an image where pixels outside the mask have negative distances to the edge of the mask, whereas pixels inside have positive distances. The edge of the mask is given by the zero crossings.
If you compute the signed distance to the edges of the two mask images and average them together, you will obtain zero crossings exactly halfway between the edges of the two masks. Simply thresholding this result gives you the averaged mask.
Note that, since we're thresholding at 0, there is no difference between the sum of the two signed distances, or their average. The sum is cheaper to compute.
Here is an example, using your color coding (red and blue are the edges of the two inputs, green is the edge of the output):
The code below is MATLAB with DIPimage, which I wrote just to show the result. Just consider it pseudo-code for you to implement with OpenCV. :)
% inputs: mask1, mask2: binary images
d1 = dt(mask1) - dt(~mask1); % dt is the distance transform
d2 = dt(mask2) - dt(~mask2); % ~ is the logical negation
mask = (d1+d2) > 0; % output
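Since the idea is to port this to OpenCV, here is a rough Python/NumPy sketch of the same approach (assuming mask1 and mask2 are binary arrays of the same size; the helper name is mine):
import cv2
import numpy as np

def signed_distance(mask):
    # Positive inside the mask, negative outside, zero on the edge
    m = (mask > 0).astype(np.uint8)
    inside = cv2.distanceTransform(m, cv2.DIST_L2, 5)       # distance to the background
    outside = cv2.distanceTransform(1 - m, cv2.DIST_L2, 5)  # distance to the object
    return inside - outside

# inputs: mask1, mask2: binary images (filled contours)
averaged_mask = (signed_distance(mask1) + signed_distance(mask2)) > 0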
I have a problem plotting a 3D matrix. Assume that I have one image of size 384x384. In a loop I create about 10 images of the same size, store them in a 3D matrix and plot that matrix on each iteration. The slice thickness is 0.69 (the distance between two slices), so I want the z coordinate to reflect this thickness. But it does not work well: the slice distance is not visualized correctly, and everything appears blue. I want to fix the visualization and remove the blue color. Could you help me fix it with MATLAB code? Thank you so much.
for slice = 1 : 10
    Img = getImage(); % get one 2D image.
    if slice == 1
        image3D = Img;
    else
        image3D = cat(3, image3D, Img);
    end
    % Plot image
    figure(1)
    [x,y,z] = meshgrid(1:384,1:384,1:slice);
    scatter3(x(:),y(:),z(:).*0.69,90,image3D(:),'filled')
end
The blue color can be fixed by changing the colormap. Right now you are setting the color of each plotted point to its value in image3D with the default colormap of jet, which shows lower values as blue. Try adding colormap gray; after you plot, or whichever colormap you desire.
I'm not sure what you mean by "The problem is that slice distance visualization is not correct". If each slice has a thickness of 0.69, then the image values are an integral of all the values within each voxel of thickness 0.69. So what you are displaying is a point at the centroid of each voxel that represents the integral of the values within that voxel. Your z scale seems correct, as the voxel centroids will be spaced 0.69 apart, although it won't start at zero.
I think a more accurate z scale would be to use ((0:slice-1)+0.5)*0.69 as your z vector. This would put the edge of the lowest slice at zero and center each point directly on the centroid of its voxel.
I still don't think this will give you the visualization you are looking for. 3D data is most easily viewed by looking at slices of it. You can check out MATLAB's slice, which lets you make nice displays like this one:
slice view http://people.rit.edu/pnveme/pigf/ThreeDGraphics/thrd_threev_slice_1.gif
I am not able to understand the formula.
What do W (window) and intensity in the formula mean?
I found this formula in opencv doc
http://docs.opencv.org/trunk/doc/py_tutorials/py_feature2d/py_features_harris/py_features_harris.html
For a grayscale image, the intensity level (0-255) tells you how bright a pixel is; I hope you already know that.
So, now the explanation of your formula is below:
Aim: We want to find the points that have maximum variation in intensity level in all directions, i.e. the points that are unique in a given image.
I(x,y): This is the intensity value of the current pixel which you are processing at the moment.
I(x+u,y+v): This is the intensity of another pixel which lies at a distance of (u,v) from the current pixel (mentioned above) which is located at (x,y) with intensity I(x,y).
I(x+u,y+v) - I(x,y): This equation gives you the difference between the intensity levels of two pixels.
W(u,v): You don't compare the current pixel with another pixel located at some random position. You prefer to compare the current pixel with its neighbors, so you choose a range of values for "u" and "v", just as you do when applying a Gaussian mask or a mean filter. So, basically, W(u,v) represents the window within which you compare the intensity of the current pixel with that of its neighbors.
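Putting these pieces together, the formula in the linked tutorial is the window-weighted sum of squared intensity differences (in the tutorial's notation the window function is written w(x,y) and (u,v) is the shift):
E(u,v) = sum over (x,y) in the window of [ w(x,y) * (I(x+u,y+v) - I(x,y))^2 ]
A point where E(u,v) is large for every shift direction (u,v) has maximum intensity variation in all directions, i.e. a corner.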
This link explains all your doubts.
For visualizing the algorithm, consider the window function as a BoxFilter, Ix as a Sobel derivative along x-axis and Iy as a Sobel derivative along y-axis.
http://docs.opencv.org/doc/tutorials/imgproc/imgtrans/sobel_derivatives/sobel_derivatives.html will be useful to understand the final equations in the above pdf.
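For instance, here is a rough sketch of that combination in Python with OpenCV (the 5x5 box window and k = 0.04 are common defaults I picked, not values from the tutorial):
import cv2
import numpy as np

gray = cv2.imread("image.png", cv2.IMREAD_GRAYSCALE).astype(np.float32)

# Sobel derivatives along x and y
Ix = cv2.Sobel(gray, cv2.CV_32F, 1, 0, ksize=3)
Iy = cv2.Sobel(gray, cv2.CV_32F, 0, 1, ksize=3)

# Window function as a box filter: average the products over each neighborhood
# (an average is a scaled sum, which does not change where the maxima are)
Sxx = cv2.boxFilter(Ix * Ix, -1, (5, 5))
Syy = cv2.boxFilter(Iy * Iy, -1, (5, 5))
Sxy = cv2.boxFilter(Ix * Iy, -1, (5, 5))

# Harris corner response R = det(M) - k * trace(M)^2
k = 0.04
R = (Sxx * Syy - Sxy * Sxy) - k * (Sxx + Syy) ** 2
# Large positive R marks corners, large negative R marks edges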
When we look at a photo of a group of trees, we are able to identify that the photo is predominantly green and brown, or for a picture of the sea we are able to identify that it is mostly blue.
Does anyone know of an algorithm that can be used to detect the prominent color or colours in a photo?
I can envisage a 3D clustering algorithm in RGB space or something similar. I was wondering if someone knows of an existing technique.
Convert the image from RGB to a color space with brightness and saturation separated (HSL/HSV)
http://en.wikipedia.org/wiki/HSL_and_HSV
Then find the dominant values of the hue component. Make a histogram of the hue values of all pixels and analyze which angular regions the peaks fall into. A large peak in the quadrant between 180 and 270 degrees, for example, means there is a large portion of blue in the image.
There can be several difficulties in determining one dominant color. A pathological example: an image whose left half is blue and right half is red. Also, hue obviously does not deal very well with grayscale, so a chessboard image with 50% white and 50% black suffers from two problems: the hue is arbitrary for a black/white image, and there are two colors that each cover exactly 50% of the image.
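A small sketch of the hue-histogram idea in Python with OpenCV (note that OpenCV stores hue on a 0-179 scale, i.e. degrees divided by two; the saturation threshold of 30 is an arbitrary cutoff for near-grayscale pixels):
import cv2
import numpy as np

image = cv2.imread("photo.jpg", cv2.IMREAD_COLOR)
hsv = cv2.cvtColor(image, cv2.COLOR_BGR2HSV)
hue = hsv[:, :, 0]   # 0-179, i.e. hue angle / 2
sat = hsv[:, :, 1]

# Ignore near-grayscale pixels, whose hue is arbitrary
hue = hue[sat > 30]

# 12 bins of 30 degrees (15 OpenCV units) each
hist, edges = np.histogram(hue, bins=12, range=(0, 180))
dominant_bin = np.argmax(hist)
print("dominant hue range: %d-%d degrees"
      % (edges[dominant_bin] * 2, edges[dominant_bin + 1] * 2))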
It sounds like you want to start by computing an image histogram or color histogram of the image. The predominant color(s) will be related to the peak(s) in the histogram.
You might want to change the image from RGB to indexed; then you could use a regular histogram and detect the peaks (MATLAB does this with rgb2ind(), as you probably already know), and the problem would be reduced to your regular "finding peaks in an array".
Then
n = hist(Y,nbins) bins the elements in vector Y into nbins equally spaced containers and returns the number of elements in each container as a row vector.
The values in n tell you how many elements fall into each bin. Then it's just a matter of fiddling with the number of bins to make them wide enough, deciding how many elements a bin must contain for you to count it as a predominant color, taking the bins that contain that many elements, calculating the index that corresponds to their middle, and converting that index back to RGB.
Whatever you're using for your processing probably has similar functions to those.
1. Average all pixels in the image.
2. Remove all pixels whose color is farther away from the average than one standard deviation.
3. GOTO 1 with the remaining pixels, until arbitrarily few are left (1, or maybe 1%).
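A minimal sketch of this loop in Python with NumPy (treating "standard deviation" as the standard deviation of the per-pixel distances to the average, which is one possible reading, and stopping at 1% of the pixels):
import numpy as np

def dominant_color(image_rgb, stop_fraction=0.01):
    pixels = image_rgb.reshape(-1, 3).astype(np.float64)
    min_count = max(1, int(stop_fraction * len(pixels)))
    while len(pixels) > min_count:
        mean = pixels.mean(axis=0)                       # 1. average all pixels
        dist = np.linalg.norm(pixels - mean, axis=1)
        keep = dist <= dist.std()                        # 2. drop the far-away pixels
        if keep.all() or not keep.any():
            break                                        # nothing more to trim
        pixels = pixels[keep]                            # 3. repeat with the rest
    return pixels.mean(axis=0)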
You might also want to pre-process the image, for example by applying a high-pass filter (removing only very low frequencies) to even out the lighting in the photo (see http://en.wikipedia.org/wiki/Checker_shadow_illusion).