How can I normalize two histograms in OpenCV?

I'm working on a CBIR image processing project and I need to normalize two histograms such that their values are on the same scale. I'm not quite sure if normalizing is the right term for it though.
Here's what I'm trying to accomplish.
I am slicing each image (single-channel grayscale) into an 8x8 grid. Therefore, for some images the blocks of the grid may be larger and for others smaller. But for each block, I'm extracting a 256-bin histogram as a feature for storage and comparison against other features later, ergo all images have equal-sized feature descriptors, regardless of size differences.
As it stands, this doesn't scale properly: images with larger blocks have higher counts in each bin, even though one image could simply be a 2x scaled version of another.
Because of this I want to store histograms such that each bin value is the percentage of its occurrence over the block's 2D region in the grid. Doing so, I would be able to compare histograms from both images on the same scale.
Can I accomplish this using OpenCV? I'm not sure if it's a common practice and I'm just not aware of the nomenclature.
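For illustration, here is a minimal sketch of the kind of per-block percentage histogram described above, assuming OpenCV's Python bindings; the function name block_histograms and the grid slicing are my own, not an established API:

```python
import cv2
import numpy as np

# Illustrative sketch: per-block histograms normalized so each bin is a
# fraction of the block's pixel count, making blocks of different sizes comparable.
def block_histograms(gray, grid=8, bins=256):
    h, w = gray.shape
    feats = []
    for r in range(grid):
        for c in range(grid):
            block = gray[r * h // grid:(r + 1) * h // grid,
                         c * w // grid:(c + 1) * w // grid]
            hist = cv2.calcHist([block], [0], None, [bins], [0, 256]).ravel()
            hist /= hist.sum() + 1e-9   # each bin is now a percentage (fraction) of the block
            feats.append(hist)
    return np.concatenate(feats)
```

Dividing each histogram by its own sum is plain L1 normalization; cv2.normalize with cv2.NORM_L1 should give the same result.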

Related

What is the difference between Binning and sub-sampling in Image Signal Processing?

As far as I know, there are several such functions in a CMOS image sensor's ISP (Image Signal Processor).
Specifically, I'd like to know the difference between binning and sub-sampling. I think their purpose is the same: to reduce the image size.
However, I'm not sure why both functions exist.
What is the purpose of each?
Binning and sub-sampling both reduce the image size, as you suspected, but they focus on different things. Let's tackle each one separately.
Binning
Binning in image processing deals primarily with quantization. The closest thing I can think of is related to what is known as data binning. Basically, consider breaking up your image into distinct (non-overlapping) M x N tiles, where M and N are the rows and columns of a tile and M and N should be much smaller than the rows and columns of the image.
If you consider any grid of M x N pixels, all of these pixels get replaced with a representative colour. This representative colour can be calculated in many ways... the average is a popular method. Binning is performed primarily as a data pre-processing technique used to reduce the effects of minor observation errors. It effectively reduces the amount of information that represents the image, and so it certainly reduces the image size by reducing the number of unique colours that represent the image.
In addition, binning the data may also reduce the impact of CMOS sensor noise on the final processed image, but at the cost of a lower dynamic range of colours.
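As a rough illustration (my own sketch, not actual ISP code), binning non-overlapping M x N tiles down to their average value could look like this:

```python
import numpy as np

# Illustrative sketch: replace each non-overlapping M x N tile with its mean value.
# The image keeps its dimensions but the number of distinct values drops.
def bin_image(gray, M=2, N=2):
    rows, cols = gray.shape
    gray = gray[:rows - rows % M, :cols - cols % N]   # crop so tiles fit exactly
    r, c = gray.shape
    tiles = gray.reshape(r // M, M, c // N, N)
    binned = tiles.mean(axis=(1, 3))                  # one representative value per tile
    return binned.repeat(M, axis=0).repeat(N, axis=1) # same dimensions, fewer unique values
```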
Sub-sampling
Sub-sampling in the case of image processing mostly deals with image resizing; it's also called image scaling. The goal is to take an image and reduce its dimensions so that you get a smaller image as a result. Binning keeps the image the same size (i.e. the same dimensions as the original) while reducing the number of colours, which ultimately reduces the amount of space the image takes up, whereas subsampling reduces the image size by removing information altogether. Usually when you subsample, you also interpolate or smooth the image so that you reduce aliasing.
Sub-sampling has another application in video processing, especially in MPEG, where video is encoded in YCbCr. Y is the luminance while Cb and Cr are the chrominance pair. We tend to notice changes in luminance rather than chrominance, and so the chrominance is subsampled to reduce the amount of space taken up by the video. Specifically, the human visual system has poorer acuity for colour information than for luminance / intensity. Usually, the chrominance channels are filtered and then subsampled to 1/2 or even 1/4 of the resolution of the intensity. Even with a rather high subsampling rate, we don't notice any difference in perceived image quality.
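A small sketch of the smoothing-then-decimation idea, assuming OpenCV's Python bindings (the function name subsample and the kernel size are mine):

```python
import cv2

# Illustrative sketch: low-pass filter first to limit aliasing, then keep
# every 'factor'-th row and column.
def subsample(gray, factor=2):
    smoothed = cv2.GaussianBlur(gray, (5, 5), 0)   # smooth before decimation
    return smoothed[::factor, ::factor]

# In practice, cv2.resize with INTER_AREA rolls the filtering and decimation into one step:
# small = cv2.resize(gray, None, fx=0.5, fy=0.5, interpolation=cv2.INTER_AREA)
```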
This is obviously a rather rough introduction to the differences between the two, but I hope it gives you enough of what you're after for your purposes.
Good luck!

HOG features on different scales

Suppose we calculate HOG features of image patches of different sizes, ranging from 64 x 64 to 128 x 128. Now, if we want to run k-means on these, should we normalize the patches that belong to different scales? I know HOG features are normalized, but does scale matter?
Normally, HOG representations are normalized. However, you must be careful about the block size. In fact, you must have the same number of blocks whatever the size of the image; otherwise you obtain descriptors of different lengths and k-means cannot be performed. This means that for larger images you will have larger blocks. The resulting histograms will contain information from more gradients, so they aren't scale invariant at this stage. However, by applying the histogram normalization, the scale invariance of the final descriptor is obtained.
Yet, if you are not sure if the histogram normalization is well performed or not, you can extract the descriptor for an image and its resized version and compare them.
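A hedged sketch of that check, assuming OpenCV's Python HOGDescriptor and scaling the cell and block sizes with the patch so the number of blocks stays fixed (the function name hog_descriptor and the gradient test patch are only illustration):

```python
import cv2
import numpy as np

# Sketch: same number of cells/blocks regardless of patch size,
# by scaling cell and block sizes with the (square) patch.
def hog_descriptor(patch, cells_per_side=8, nbins=9):
    side = patch.shape[0]
    cell = side // cells_per_side
    hog = cv2.HOGDescriptor((side, side),          # window = whole patch
                            (2 * cell, 2 * cell),  # block size
                            (cell, cell),          # block stride
                            (cell, cell),          # cell size
                            nbins)
    return hog.compute(patch).ravel()

# Compare a patch with its resized version: the descriptors should have the
# same length and, after block normalization, be close to each other.
patch64 = np.tile(np.linspace(0, 255, 64, dtype=np.uint8), (64, 1))
patch128 = cv2.resize(patch64, (128, 128))
d1, d2 = hog_descriptor(patch64), hog_descriptor(patch128)
print(d1.shape == d2.shape, np.linalg.norm(d1 - d2))
```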
Good luck!

minimum texture image dimension for effective classification

I am a beginner in image mining. I would like to know the minimum dimension required for effective classification of textured images. My feeling is that if an image is too small, the feature extraction step will not extract enough features, and if the image size goes beyond a certain dimension, the processing time will increase exponentially with image size.
This is a complex question that requires a bit of thinking.
Short answer: It depends.
Long answer: It depends on the type of texture you want to classify and the type of feature your classification is based on. If the extracted feature is, say, colour only, you can use a "texture" as small as 1x1 pixel (in that case, using the word "texture" is a bit of an abuse). If you want to classify, say, characters, you can usually extract a lot of local information from edges (Hough transform, Gabor filters, etc.). The image plane just has to be big enough to hold the characters (say 16x16 pixels for the Latin alphabet).
If you want to be able to classify any kind of image in any quantity, you can also base your classification on global information, like entropy, correlogram, energy, inertia, cluster shade, cluster prominence, colour and correlation. Those features are used for content-based image retrieval.
Off the top of my head, I would try textures as small as 32x32 pixels if the kind of texture is a priori unknown. If on the contrary the kind of texture is a priori known, I would choose one or more features that I know would classify the images according to my needs (1x1 pixel for colour only, 16x16 pixels for characters, etc.). Again, it really depends on what you are trying to achieve. There isn't a unique answer to your question.
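For illustration only, two of the global measures mentioned above (entropy and energy) can be computed straight from the normalized grey-level histogram of a patch; the co-occurrence-based measures (inertia, cluster shade, cluster prominence, correlation) would instead be computed from a grey-level co-occurrence matrix. The function name is mine:

```python
import numpy as np

# Rough sketch: two histogram-based global texture measures for a grayscale patch.
def global_texture_features(gray, levels=256):
    hist, _ = np.histogram(gray, bins=levels, range=(0, levels))
    p = hist / hist.sum()
    p = p[p > 0]
    entropy = -np.sum(p * np.log2(p))   # higher for "busier" textures
    energy = np.sum(p ** 2)             # higher for uniform patches
    return entropy, energy
```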

2D subimage detection in Open CV

What's the most sensible algorithm, or combination of algorithms, to be using from OpenCV for the following problem:
I have a set of small 2D images. I want to detect the locations of these subimages in a larger image.
The subimages are usually around 32x32 pixels, and the larger image is around 400x400.
The subimages are not always square, and as such contain an alpha channel.
Optionally, the larger image may be grainy, compressed, rotated in 3D, or otherwise slightly distorted.
I have tried cvMatchTemplate, with very poor results (difficult to match correctly, and large numbers of false positives with all match methods). Some of the problems come from the fact that OpenCV can't seem to handle template matching with an alpha channel.
I have tried a manual search, which seems to work better, and can include the alpha channel, but is very slow.
Thanks for any help.
cvMatchTemplate uses an MSE-like metric (SQDIFF/SQDIFF_NORMED) for the matching. This kind of metric will penalize different alpha values severely (due to the square in the equation). Have you tried normalized cross-correlation? It is known to model linear variations in pixel intensities better.
If NCC does not do the job, you will need to transform the images into a space where intensity differences do not have much effect, e.g. compute an edge-strength image (Canny, Sobel, etc.) and run cvMatchTemplate on those images.
Considering the large difference in scale between the images (~10x), an image pyramid will have to be employed to figure out the correct scale for the matching. I recommend you start with a coarse scale (2^(1/x), x being the correct scale) and propagate the estimate up the pyramid.
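As a rough sketch of the two suggestions combined (edge-strength images plus normalized cross-correlation), assuming OpenCV's Python bindings; the Canny thresholds and the function name find_subimage are mine, and a pyramid search over scales would still be layered on top:

```python
import cv2

# Sketch: match on edge maps with normalized cross-correlation instead of raw intensities.
def find_subimage(big_gray, template_gray):
    big_edges = cv2.Canny(big_gray, 50, 150)
    tpl_edges = cv2.Canny(template_gray, 50, 150)
    result = cv2.matchTemplate(big_edges, tpl_edges, cv2.TM_CCORR_NORMED)
    _, max_val, _, max_loc = cv2.minMaxLoc(result)
    return max_loc, max_val   # top-left corner of the best match and its score
```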
What you need is something like SIFT or SURF.
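For example, a hedged sketch of SIFT matching with a ratio test, assuming a recent OpenCV build where cv2.SIFT_create is available (the function name and the ratio value are mine):

```python
import cv2

# Sketch: match keypoints between the small template and the large image,
# keeping only matches that pass Lowe's ratio test.
def sift_matches(template_gray, big_gray, ratio=0.75):
    sift = cv2.SIFT_create()
    kp1, des1 = sift.detectAndCompute(template_gray, None)
    kp2, des2 = sift.detectAndCompute(big_gray, None)
    matcher = cv2.BFMatcher(cv2.NORM_L2)
    good = []
    for pair in matcher.knnMatch(des1, des2, k=2):
        if len(pair) == 2 and pair[0].distance < ratio * pair[1].distance:
            good.append(pair[0])
    return kp1, kp2, good   # good matches can then be fed to cv2.findHomography
```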

Image Comparison

What is an efficient way to compare two images in Visual C?
Also, in which format should the images be stored (bmp, gif, jpeg, ...)?
Please provide some suggestions.
If the images you are trying to compare have distinctive characteristics that you are trying to differentiate, then PCA is an excellent way to go. The question of which file format you need is really irrelevant; you need to load the image into the program as an array of numbers and do the analysis.
Your question opens a can of worms in terms of complexity.
If you want to compare two images to check whether they are the same, then you need to perform an md5 on the file (removing possible metadata which could distort your result).
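One way to sidestep the metadata problem, as a sketch, is to hash the decoded pixels rather than the file itself; the paths and the function name pixel_md5 are placeholders:

```python
import hashlib
import cv2

# Sketch: hash the decoded pixels instead of the file on disk,
# so metadata and container format cannot distort the result.
def pixel_md5(path):
    img = cv2.imread(path, cv2.IMREAD_UNCHANGED)
    return hashlib.md5(img.tobytes()).hexdigest()

# identical = pixel_md5("a.png") == pixel_md5("b.png")
```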
If you want to compare whether they look the same, then it's a completely different story altogether. "Look the same" is meant in a very loose sense (e.g. they are exactly the same image but stored in two different file formats). For this, you need advanced algorithms, which will give you a probability that two images are the same. Not being an expert in the field, I would try the following "invented off the top of my head" algorithm:
take an arbitrary set of pixel points from the image.
for each pixel "grow" a polygon out of the surrounding pixels which are near in color (according to HSV colorspace)
do the same for the other image
for each polygon of one image, check the geometrical similitude with all the other polygons in the other image, and pick the highest value. Divide this value by the area of the polygon (to normalize).
create a vector out of the highest values obtained
the higher the norm of this vector, the higher the chance that the two images are the same.
This algorithm should be insensitive to color drift and image rotation. Maybe also scaling (you normalize against the area). But I restate: not an expert, there's probably much better, and it could make kittens cry.
I did something similar to detect movement from an MJPEG stream and record images only when movement occurred.
For each decoded image, I compared it to the previous one using the following method.
Resize the image to effectively thumbnail size (I resized fairly hi-res images down by a factor of ten)
Compare the brightness of each pixel to the previous image and flag if it is much lighter or darker (threshold value 1)
Once you've done that for each pixel, you can use the count of different pixels to determine whether the image is the same or different (threshold value 2)
Then it was just a matter of tuning the two threshold values.
I did the comparisons using System.Drawing.Bitmap, but as my source images were jpg, there was some artifacting.
It's a nice simple way to compare images for differences if you're going to roll it yourself.
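For reference, here is a sketch of that two-threshold comparison; the original used System.Drawing.Bitmap in C#, this is just the same idea expressed with OpenCV, and the threshold values are made-up starting points:

```python
import cv2
import numpy as np

# Sketch of the thumbnail-and-two-thresholds comparison described above.
def images_differ(prev_gray, curr_gray, pixel_thresh=25, count_thresh=50, scale=0.1):
    a = cv2.resize(prev_gray, None, fx=scale, fy=scale, interpolation=cv2.INTER_AREA)
    b = cv2.resize(curr_gray, None, fx=scale, fy=scale, interpolation=cv2.INTER_AREA)
    diff = cv2.absdiff(a, b)                              # per-pixel brightness difference
    changed = int(np.count_nonzero(diff > pixel_thresh))  # threshold value 1
    return changed > count_thresh                         # threshold value 2
```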
If you want to determine if 2 images are the same perceptually, I believe the best way to do it is using an Image Hashing algorithm. You'd compute the hash of both images and you'd be able to use the hashes to get a confidence rating of how much they match.
One that I've had some success with is pHash, though I don't know how easy it would be to use with Visual C. Searching for "Geometric Hashing" or "Image Hashing" might be helpful.
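To illustrate the idea (this is not pHash itself), here is a minimal "average hash" sketch: resize to a tiny thumbnail, threshold against the mean, and compare hashes by Hamming distance. Function names are mine:

```python
import cv2
import numpy as np

# Minimal "average hash" sketch -- a much simpler relative of pHash.
def average_hash(gray, hash_size=8):
    small = cv2.resize(gray, (hash_size, hash_size), interpolation=cv2.INTER_AREA)
    return (small > small.mean()).ravel()   # one bit per thumbnail pixel

def hash_distance(h1, h2):
    return int(np.count_nonzero(h1 != h2))  # small Hamming distance -> likely the same image
```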
Testing for strict identity is simple: Just compare every pixel in source image A to the corresponding pixel value in image B. If all pixels are identical, the images are identical.
But I guess you don't want this kind of strict identity. You probably want images to be considered "identical" even if certain transformations have been applied to image B. Examples of such transformations might be:
changing image brightness globally (for every pixel)
changing image brightness locally (for every pixel in a certain area)
changing image saturation globally or locally
gamma correction
applying some kind of filter to the image (e.g. blurring, sharpening)
changing the size of the image
rotation
e.g. printing an image and scanning it again would probably include all of the above.
In a nutshell, you have to decide which transformations you want to treat as "identical" and then find image measures that are invariant to those transformations. (Alternatively, you could try to revert the transformations, but that's not possible if the transformation removes information from the image, like e.g. blurring or clipping the image.)

Resources