I am unsure about this, but I want to compute features around interest points detected by SURF using an RGB color histogram. I guess the final feature will be 256-dimensional, but I am not sure if this is correct.
The dimension of the RGB color histogram is determined by how many bins you use for each channel. The dimension will be 24 (8+8+8) if you use 8 bins for each of them.
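A minimal sketch of such a 24-dimensional feature (patch size, bin count and normalisation are my own illustrative choices, not from the answer):

```python
import cv2
import numpy as np

def rgb_histogram_feature(image_bgr, keypoint_xy, patch_size=16, bins=8):
    # Cut a small patch around the keypoint and histogram each colour channel.
    x, y = map(int, keypoint_xy)
    half = patch_size // 2
    patch = image_bgr[max(y - half, 0):y + half, max(x - half, 0):x + half]

    feature = []
    for channel in range(3):  # B, G, R in OpenCV's ordering
        hist = cv2.calcHist([patch], [channel], None, [bins], [0, 256]).flatten()
        feature.append(hist)
    feature = np.concatenate(feature)        # 8 + 8 + 8 = 24 values
    return feature / (feature.sum() + 1e-8)  # normalise so patch size does not matter
```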
I know how to apply an image filter like the Sobel filter to an image with 1 byte per pixel. However, I don't know how to do it on an RGB image with 24 bits per pixel.
What is the best way to do that without having to convert the RGB image to an 8-bit (greyscale) one?
For the Sobel operator, as for all other linear filters, one can apply the filter to each of the channels independently. That is, the output red channel is the filtered input red channel, and so on.
Note that this is true only for linear filters. For any nonlinear filter, such a process leads to false colors.
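A minimal sketch of this per-channel approach with OpenCV (the function name and kernel size are my own choices, not from the answer):

```python
import cv2

def sobel_per_channel(image_bgr):
    # Apply the horizontal Sobel operator to each colour channel independently,
    # then recombine the filtered channels into a 3-channel result.
    channels = cv2.split(image_bgr)
    filtered = [cv2.Sobel(c, cv2.CV_16S, 1, 0, ksize=3) for c in channels]
    filtered = [cv2.convertScaleAbs(f) for f in filtered]  # back to 8-bit for display
    return cv2.merge(filtered)
```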
For each color channel, calculate the gradient magnitudes (derivatives) by applying the Sobel operator and dividing by 8.
Unless you're doing this for colorblind viewers, the gradients (and the edges derived from them) in every color channel do count. You could merge the per-channel gradient magnitude matrices into one matrix, keeping the highest magnitude at each pixel.
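A short sketch of that merging step (my interpretation, assuming OpenCV and the 3x3 Sobel kernel; the division by 8 matches the normalisation mentioned above):

```python
import cv2
import numpy as np

def max_gradient_magnitude(image_bgr):
    magnitudes = []
    for channel in cv2.split(image_bgr):
        gx = cv2.Sobel(channel, cv2.CV_32F, 1, 0, ksize=3) / 8.0
        gy = cv2.Sobel(channel, cv2.CV_32F, 0, 1, ksize=3) / 8.0
        magnitudes.append(cv2.magnitude(gx, gy))
    # keep the strongest response across the three channels at every pixel
    return np.max(np.stack(magnitudes), axis=0)
```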
I want to cluster images based on colour similarity. For that I need a good similarity metric between two 3D histograms. A 3D histogram of an image is just a 3-dimensional space where each axis represents one of the base colours. The range of each axis is 0-255, since these are the possible values of the base colours for each pixel.
The histogram is represented as a 256X256X256 matrix and each entry in the matrix represents the count of pixels with that specific colour in the image. For example:
If the value of the matrix element M[0][0][0] = 1150, it means that there are 1150 black pixels in the image (RGB(0,0,0) represents the colour black).
I am looking for the most sensible similarity metric for this kind of problem. The metric will be used in the clustering algorithm (DBSCAN probably) to evaluate image similarity.
Use the L*a*b* (CIELAB) color space, where Euclidean distance really does correspond to perceived similarity, as it is designed to model the non-linearities of human colour perception.
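A small sketch of the idea (my own illustration): in CIELAB, the Euclidean distance between two colours approximates their perceived difference (Delta E), so building the 3D histogram over L*, a*, b* instead of R, G, B would be a natural extension of this answer.

```python
import cv2
import numpy as np

def delta_e(bgr1, bgr2):
    # Approximate perceptual difference between two BGR colours via CIELAB.
    # (OpenCV scales 8-bit Lab values to 0-255, so this is only an approximation.)
    pair = np.uint8([[bgr1, bgr2]])
    lab = cv2.cvtColor(pair, cv2.COLOR_BGR2LAB).astype(np.float32)
    return float(np.linalg.norm(lab[0, 0] - lab[0, 1]))

print(delta_e((0, 0, 255), (0, 0, 200)))  # two similar reds: small distance
print(delta_e((0, 0, 255), (255, 0, 0)))  # red vs. blue: large distance
```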
I know that we take a 16x16 window of "in-between" pixels around the key point. We split that window into sixteen 4x4 windows. From each 4x4 window we generate a histogram of 8 bins, each bin corresponding to 0-44 degrees, 45-89 degrees, etc. Gradient orientations from the 4x4 window are put into these bins. This is done for all 4x4 blocks. Finally, we normalize the 128 values we get.
What I don't understand is where the 128 numbers get their values from. Do they refer to the corresponding magnitudes of the orientation values, or to something else?
I would be grateful if anyone could describe a numerical example. Regards!
In SIFT (Scale-Invariant Feature Transform), the 128 dimensional feature vector is made up of 4x4 samples per window in 8 directions per sample -- 4x4x8 = 128.
For an illustrated guide see A Short introduction to descriptors, and in particular this image, showing 8-direction measurements (cardinal and inter-cardinal) embedded in each of the 4x4 grid squares (center image) and then a histogram of directions (right image).
From your question I believe you are also unclear on what the information inside the descriptor is -- it is called Histograms of Oriented Gradients (HOG). For further reading, Wikipedia has an overview of HOG gradient computation:
Each pixel within the cell casts a weighted vote for an orientation-based histogram channel based on the values found in the gradient computation.
Everything is built on those per-pixel "votes".
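A toy numpy sketch (my own illustration, not the real SIFT implementation) of how the 128 numbers arise: a 16x16 patch of gradients is split into 4x4 cells, each cell accumulates a magnitude-weighted 8-bin orientation histogram, and 4 x 4 x 8 = 128 values come out.

```python
import numpy as np

def toy_descriptor(patch):
    """patch: 16x16 grayscale array."""
    gy, gx = np.gradient(patch.astype(np.float32))
    magnitude = np.hypot(gx, gy)
    orientation = np.degrees(np.arctan2(gy, gx)) % 360       # 0..360 degrees
    bins = (orientation // 45).astype(int)                    # 8 bins of 45 degrees each

    descriptor = []
    for cy in range(4):
        for cx in range(4):
            cell = np.s_[cy * 4:(cy + 1) * 4, cx * 4:(cx + 1) * 4]
            hist = np.zeros(8)
            # each pixel "votes" with its gradient magnitude, as in the HOG quote above
            np.add.at(hist, bins[cell].ravel(), magnitude[cell].ravel())
            descriptor.append(hist)
    descriptor = np.concatenate(descriptor)                   # 16 cells * 8 bins = 128
    return descriptor / (np.linalg.norm(descriptor) + 1e-8)   # final normalisation

print(toy_descriptor(np.random.rand(16, 16)).shape)           # (128,)
```

Real SIFT additionally weights the votes with a Gaussian window, interpolates between neighbouring bins and cells, and rotates the patch to the keypoint's dominant orientation, but the 4x4x8 bookkeeping is the same.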
I want to calculate the perceived brightness of an image and classify the image as dark, neutral, or bright. And I ran into one problem here!
And I quote Lakshmi Narayanan's comment below. I'm confused by this method. What does "the average of the hist values from 0th channel" mean here? Does the 0th channel refer to the gray image or to the value channel of the HSV image? Moreover, what is the theory behind that method?
Well, for such a case, I think HSV would be better. Or try this method #2vision2: compute the Laplacian of the grayscale image, obtain the max value using minMaxLoc and call it maxval. Estimate your sharpness/brightness index as (maxval * average V channel value) / (average of the hist values from the 0th channel), as said above. This would give you certain values: low-brightness images are usually below 30, 30 - 50 can be taken as OK images, and above 50 as bright images.
If you have an RGB color image you can get the brightness by converting it to another color space that separates color from intensity information like HSV or LAB.
Gray images already show local "brightness" so no conversion is necessary.
Whether an image is perceived as bright depends on many things: mainly your display device, reference images, contrast, and the human observer.
Using a few intensity statistics should give you an OK classification for one particular display device.
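A minimal sketch of such a statistics-based classification (the HSV value channel and the 0.35 / 0.65 cut-offs are my own illustrative choices, not from the answer):

```python
import cv2

def classify_brightness(image_bgr):
    # Average the HSV value channel as a crude "perceived brightness" estimate.
    hsv = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2HSV)
    mean_v = hsv[:, :, 2].mean() / 255.0  # normalised to [0, 1]
    if mean_v < 0.35:
        return "dark"
    if mean_v > 0.65:
        return "bright"
    return "neutral"
```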
I would like to know the difference between contrast stretching and histogram equalization.
I have tried both using OpenCV and observed the results, but I still have not understood the main differences between the two techniques. Insights would be of much needed help.
Let's define contrast first.
Contrast is a measure of the "range" of an image, i.e. how spread out its intensities are. It has many formal definitions; one famous one is Michelson's:
contrast = (Imax - Imin) / (Imax + Imin)
Contrast is strongly tied to an image's overall visual quality. Ideally, we'd like images to use the entire range of values available to them.
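A tiny worked example of Michelson's definition, using the intensities 84 and 153 that appear in the mapping example below:

```python
i_max, i_min = 153, 84
contrast = (i_max - i_min) / (i_max + i_min)
print(round(contrast, 2))  # 0.29 -- a fairly low-contrast image
```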
Contrast Stretching and Histogram Equalisation have the same goal: making images use the entire range of values available to them.
But they use different techniques.
Contrast Stretching works like a mapping:
it maps the minimum intensity in the image to the minimum value of the range (for example, 84 ==> 0),
and in the same way it maps the maximum intensity in the image to the maximum value of the range (for example, 153 ==> 255).
This is why Contrast Stretching is unreliable: if the image happens to contain even two pixels with intensities 0 and 255, it is totally useless.
However, a better approach is Histogram Equalisation, which uses the probability distribution of the intensities. You can learn the steps here.
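A hedged sketch comparing the two on a grayscale image with OpenCV (the file name is a placeholder):

```python
import cv2

gray = cv2.imread("input.png", cv2.IMREAD_GRAYSCALE)

# Linear contrast stretch: map the current min/max (e.g. 84..153) to 0..255.
stretched = cv2.normalize(gray, None, 0, 255, cv2.NORM_MINMAX)

# Histogram equalisation: redistribute intensities towards a flat histogram.
equalized = cv2.equalizeHist(gray)
```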
I came across the following points after some reading.
Contrast stretching is all about increasing the difference between the maximum intensity value in an image and the minimum one. All the rest of the intensity values are spread out across this range.
Histogram equalization is about modifying the intensity values of all the pixels in the image such that the histogram is "flattened" (in reality, the histogram can't be exactly flattened, there would be some peaks and some valleys, but that's a practical problem).
In contrast stretching, there exists a one-to-one relationship of the intensity values between the source image and the target image, i.e., the original image can be restored from the contrast-stretched image.
However, once histogram equalization is performed, there is no way of getting back the original image.
In Histogram equalization, you want to flatten the histogram into a uniform distribution.
In contrast stretching, you manipulate the entire range of intensity values. Like what you do in Normalization.
Contrast stretching is a linear normalization that stretches an arbitrary interval of the intensities of an image and fits the interval to another arbitrary interval (usually the target interval is the possible minimum and maximum of the image, like 0 and 255).
Histogram equalization is a nonlinear normalization that stretches the regions of the histogram with highly abundant intensities and compresses the regions with rare intensities.
I think that contrast stretching broadens the histogram of the image intensity levels, so that the input intensity range is mapped to the full intensity range.
Histogram equalization, on the other hand, maps all of the pixel intensities to the full range according to the cumulative distribution function (or probability).
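A short numpy sketch of that CDF-based mapping (my own illustration of the standard algorithm):

```python
import numpy as np

def equalize(gray):
    # gray: 2D uint8 array
    hist, _ = np.histogram(gray.ravel(), bins=256, range=(0, 256))
    cdf = hist.cumsum() / hist.sum()            # cumulative distribution in [0, 1]
    lut = np.round(cdf * 255).astype(np.uint8)  # each intensity maps to its CDF value
    return lut[gray]                            # apply the lookup table pixel-wise
```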
Contrast is the difference between maximum and minimum pixel intensity.
Both methods are used to enhance contrast; more precisely, they adjust image intensities to enhance contrast.
During histogram equalization the overall shape of the histogram changes, whereas in contrast stretching the overall shape of the histogram remains the same.