OpenCV Histogram Comparison Methods

Looking at the Histogram Documentation, there are four different comparison methods (five flags, since one is a synonym):
CV_COMP_CORREL Correlation
CV_COMP_CHISQR Chi-Square
CV_COMP_INTERSECT Intersection
CV_COMP_BHATTACHARYYA Bhattacharyya distance
CV_COMP_HELLINGER Synonym for CV_COMP_BHATTACHARYYA
They all give different outputs that are read differently as shown in the Compare Histogram Documentation. But I can't find anything that states how effectively each method performs compared against each other. Surely there are Pros and Cons for each method, otherwise why have multiple methods?
Even the OpenCV 2 Computer Vision Application Programming Cookbook has very little to say on the differences:
The call to cv::compareHist is straightforward. You just input the two
histograms and the function returns the measured distance. The
specific measurement method you want to use is specified using a flag.
In the ImageComparator class, the intersection method is used (with
flag CV_COMP_INTERSECT). This method simply compares, for each bin,
the two values in each histogram, and keeps the minimum one. The
similarity measure is then simply the sum of these minimum values.
Consequently, two images having histograms with no colors in common
would get an intersection value of 0, while two identical histograms
would get a value equal to the total number of pixels.
The other methods available are the Chi-Square (flag CV_COMP_CHISQR)
which sums the normalized square difference between the bins, the
correlation method (flag CV_COMP_CORREL) which is based on the
normalized cross-correlation operator used in signal processing to
measure the similarity between two signals, and the Bhattacharyya
measure (flag CV_COMP_BHATTACHARYYA) used in statistics to estimate
the similarity between two probabilistic distributions.
There must be differences between the methods, so my question is: what are they, and under what circumstances does each work best?

CV_COMP_INTERSECT is fast to compute, since you only need the minimum value for each bin, but it will not tell you much about how the differences are distributed. The other methods try to produce a better, more continuous match score, each under different assumptions about the pixel distribution.
You can find the formulae used in different methods, at
http://docs.opencv.org/doc/tutorials/imgproc/histograms/histogram_comparison/histogram_comparison.html
Some references to more details on the matching algorithms can be found at:
http://siri.lmao.sk/fiit/DSO/Prednasky/7%20a%20Histogram%20based%20methods/7%20a%20Histogram%20based%20methods.pdf
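To make the differences concrete, here is a minimal pure-Python sketch of the four measures, written from the formulae in the OpenCV histogram comparison tutorial linked above. The function names are my own and this is not OpenCV's implementation; skipping empty bins in Chi-Square and returning 1.0 for flat histograms in the correlation are choices made for this sketch.

```python
import math

def correlation(h1, h2):
    """Normalized cross-correlation of the mean-centered histograms (1 = identical shape)."""
    n = len(h1)
    m1, m2 = sum(h1) / n, sum(h2) / n
    num = sum((a - m1) * (b - m2) for a, b in zip(h1, h2))
    den = math.sqrt(sum((a - m1) ** 2 for a in h1) * sum((b - m2) ** 2 for b in h2))
    return num / den if den else 1.0  # sketch choice: flat histograms count as a match

def chi_square(h1, h2):
    """Sum of squared bin differences, normalized by the first histogram (0 = identical)."""
    return sum((a - b) ** 2 / a for a, b in zip(h1, h2) if a > 0)

def intersection(h1, h2):
    """Sum of the per-bin minima (larger = more similar)."""
    return sum(min(a, b) for a, b in zip(h1, h2))

def bhattacharyya(h1, h2):
    """Bhattacharyya (Hellinger) distance between the normalized histograms (0 = identical)."""
    coeff = sum(math.sqrt(a * b) for a, b in zip(h1, h2)) / math.sqrt(sum(h1) * sum(h2))
    return math.sqrt(max(0.0, 1.0 - coeff))

same = [1, 2, 3, 4]
left, right = [4, 0, 0, 0], [0, 0, 0, 4]  # no bins in common
for name, fn in [("correlation", correlation), ("chi-square", chi_square),
                 ("intersection", intersection), ("bhattacharyya", bhattacharyya)]:
    print(name, fn(same, same), fn(left, right))
```

Note how the scores read differently: correlation and intersection grow with similarity, while Chi-Square and Bhattacharyya are distances that shrink towards 0. Intersection also depends on the total count, so histograms need to be normalized before scores from different image sizes are compared.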

Related

Determining the number of clusters for kdd99 dataset using k-means

What is the general convention for the number of clusters k when performing k-means on the KDD99 dataset? Three different papers I read use three completely different values of k (25, 20 and 5). I would like to know the general opinion on this, for example what the range of k should be.
Thanks
The k-means clustering algorithm is used to find groups that have not been explicitly labeled in the data.
In general there is no method for determining the exact value of k, but it can be estimated.
To compare candidate values of k, compute the mean distance between the data points and their cluster centroid for each candidate.
The elbow method and the kernel method work more precisely, but the right number of clusters can depend on your problem. (Recommended)
A quick rule of thumb is k ≈ sqrt(n/2): take the square root of half the number of data points and use that as the number of clusters.
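A small sketch of the two heuristics from this answer, assuming the quick rule means k ≈ sqrt(n/2). The function names and the toy points are made up for illustration; the clustering itself is assumed to have been done already.

```python
import math

def rule_of_thumb_k(n_points):
    """Quick heuristic: k is roughly the square root of half the number of points."""
    return max(1, round(math.sqrt(n_points / 2)))

def mean_distance_to_centroid(points, labels, centroids):
    """Mean Euclidean distance between each point and the centroid it is assigned to."""
    total = sum(math.dist(p, centroids[lab]) for p, lab in zip(points, labels))
    return total / len(points)

# Toy data with two obvious groups (hypothetical values).
points = [(0, 0), (2, 0), (10, 0), (12, 0)]

# k = 1: a single centroid in the middle of everything.
d1 = mean_distance_to_centroid(points, [0, 0, 0, 0], [(6, 0)])
# k = 2: one centroid per group.
d2 = mean_distance_to_centroid(points, [0, 0, 1, 1], [(1, 0), (11, 0)])

print(rule_of_thumb_k(200), d1, d2)
```

For the elbow method you would compute this mean distance for a range of k and pick the value after which it stops dropping sharply; here the drop from k = 1 to k = 2 is large because the data really has two groups.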

General theory about OpenCV Descriptor Matching: What does every single step mean?

I'm quite new to OpenCV and image processing, so my questions about the feature matching approach are a bit general. I have read some of the theory, but I have trouble mapping the very specific theory onto these steps.
As I understand it, the sequence can be grouped into the following steps:
Feature detection: Special points from image are found in a very
Feature description: Information about the local neighborhood is collected and one vector per feature point is created
->(1) is this always in the form of a histogram?
Matching: A distance between the descriptors is calculated
->(2) can I determine what kind of distance is used? I read about χ^2 and EMD; even if they are not implemented, are these the right keywords here?
Corresponding matches are determined
->(3) I guess the Hungarian method would be one method?
Transformation estimation: In an optimization problem the best position is estimated
It would be nice if someone could clarify the questions marked (1) to (3).
(1): is this always in the form of a histogram?
No, for example there are binary descriptors such as the ones used with ORB features. In theory, descriptors can be anything. Often they are normalized, and often they are either binary or floating-point vectors. But histograms have some properties which can make them good descriptors.
(2) can I determine what kind of distance is used?
For floating-point descriptors, the sum of squared differences is probably the most commonly used distance metric. For binary descriptors, as far as I know, the Hamming distance is used.
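As a rough illustration of the two distances just mentioned, here is a minimal sketch in plain Python. The helper names and the sample descriptors are made up; real descriptors are much longer (for example 128 floats for SIFT, 32 bytes for ORB).

```python
def squared_l2(a, b):
    """Sum of squared differences between two floating-point descriptors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def hamming(a: bytes, b: bytes):
    """Number of differing bits between two binary descriptors of equal length."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))

print(squared_l2([0.0, 0.0], [3.0, 4.0]))
print(hamming(bytes([0b10110000]), bytes([0b10011000])))
```

The two are not comparable with each other: one is a real-valued quantity whose scale depends on how the descriptor is normalized, the other is an integer bit count bounded by the descriptor length.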
(3) I guess the Hungarian method would be one method?
It could be used, I guess, but it might lead to some problems. Typically nearest-neighbor approaches are used, often just brute force (which is O(n^2) instead of the O(n^3) of the Hungarian method). The "problem" that multiple descriptors of one set might have the same nearest neighbor in the second set is in fact a useful feature: when it happens, you may be able to filter out some "uncertain" matches (often the ratio between the distances of the best two matches is used to filter out even more). You must assume that many descriptors in one set will have no fitting correspondence in the second set, and that the matching itself won't produce perfect matches. Typically some additional steps, such as homography computation, are used to make the matching more robust and to filter out outliers.
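A minimal sketch of brute-force nearest-neighbor matching with a ratio filter, as described above. The function name, the toy 2-D "descriptors" and the 0.75 threshold are assumptions for illustration, not values taken from OpenCV.

```python
import math

def ratio_test_matches(query, train, ratio=0.75):
    """Brute-force matching: keep a match only if the best distance is clearly
    smaller than the second-best one (ambiguous matches are dropped)."""
    matches = []
    for qi, q in enumerate(query):
        # O(len(query) * len(train)) distance computations overall.
        dists = sorted((math.dist(q, t), ti) for ti, t in enumerate(train))
        if len(dists) < 2:
            continue  # the ratio needs two candidates
        (d1, best), (d2, _) = dists[0], dists[1]
        if d1 < ratio * d2:
            matches.append((qi, best, d1))
    return matches

query = [(0.0, 0.0), (5.0, 5.0)]
train = [(0.1, 0.0), (10.0, 10.0), (5.0, 4.0), (5.0, 6.0)]
print(ratio_test_matches(query, train))
```

The first query point has one clearly closest neighbor and is kept. The second has two equally good candidates, so it is discarded as uncertain, which is exactly the filtering effect the answer refers to.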

How to match features when the number of features of both objects is unequal?

I have 2 objects. I get n features from object 1 & m features from object 2.
n!=m
I have to measure the probability that object 1 is similar to object 2.
How can I do this?
There is a nice tutorial in the OpenCV website that does this. Check it out.
The idea is to get the distances between all those descriptors with a FlannBasedMatcher, get the closest ones, and run RANSAC to find some set of consistent features between the two objects. You don't get a probability, but the number of consistent features, from which you may score how good your detection is, but that is up to you.
You can group the features in the regions of the image where features are denser, and build a vector from each group. There may be multiple matches among them, from which you can choose the highest-scoring one.
Are you talking about point feature descriptors, like SIFT, SURF, or FREAK?
In that case there are several strategies. In all cases you need a distance measure. For SIFT or SURF you can use the Euclidean distance between the descriptors, or the L1 norm, or the dot product (correlation). For binary features, like FREAK or BRISK, you typically use the Hamming distance.
Then, one approach, is to simply pick a threshold on the distance. This is likely to give you many-to-many matches. Another way is to use bipartite graph matching to find the minimum-cost or maximum-weight assignment between the two sets. A very practical approach is described by David Lowe, which uses a ratio test to discard ambiguous matches.
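To illustrate why a plain distance threshold tends to give many-to-many matches, here is a small sketch using the L1 norm. The function names, the toy descriptors and the threshold are hypothetical, and the two sets deliberately have different sizes (n != m).

```python
def l1_distance(a, b):
    """L1 norm (sum of absolute differences) between two descriptors."""
    return sum(abs(x - y) for x, y in zip(a, b))

def threshold_matches(set_a, set_b, max_dist):
    """Return every (i, j) pair whose distance is within the threshold."""
    return [(i, j)
            for i, a in enumerate(set_a)
            for j, b in enumerate(set_b)
            if l1_distance(a, b) <= max_dist]

object_1 = [(0, 0), (1, 0), (9, 9)]  # n = 3 features
object_2 = [(0, 1), (5, 5)]          # m = 2 features
print(threshold_matches(object_1, object_2, max_dist=2))
```

Two features of the first object end up matched to the same feature of the second one, and one feature gets no match at all. A bipartite assignment or a ratio test is what you would add on top to resolve that.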
Many of these strategies are implemented in the matchFeatures function in the Computer Vision System Toolbox for MATLAB.

facial expression classification using k-means

My method for classifying facial expressions using k-means is:
Use opencv to detect the face in the image
Use ASM and stasm to get the facial feature points
Calculate the distances between facial features (as shown in the picture). There will be 5 distances.
Calculate the centroid of each distance for each facial expression (e.g. for the distance D1 there are 7 centroids, one per expression: 'happy', 'angry', ...).
Use 5 k-means models, one per distance; each returns the expression whose centroid (calculated in the previous step) is closest to the measured distance.
The final expression is the one that appears most often among the 5 k-means results
However, with this method my results are wrong.
Is my method correct or is it wrong somewhere?
K-means is not a classification algorithm. Once run, it simply finds K centroids, so it splits the data into K parts, but in most cases these parts won't have anything to do with the desired classes. This algorithm (like all clustering methods) should be used when you want to explore data and find some distinguishable groups, distinguishable in any sense. If your task is to build a system which recognizes some given classes, then it is a classification problem, not clustering. One of the simplest methods, easy both to implement and to understand, is KNN (k-nearest neighbours), which roughly does what you are trying to accomplish: it checks which labeled examples are closest to a new sample and assigns it their class.
To better see the difference let us consider your case - you are trying to detect emotional state based on the face features. Running k-means on such data can split your face photos into many groups:
If you use photos of different people, it can cluster photos of particular people together (as their distances differ from others)
it can split the data into, for example, men and women, as there are gender-specific differences in such features
it can even split your data based on the distance from the camera, as the perspective changes your features, creating "clusters".
etc.
As you can see, there are dozens of possible "reasonable" (and even more completely uninterpretable) splits, and k-means (or any other clustering algorithm) will simply find one of them (in most cases an uninterpretable one). Classification methods are used to overcome this issue, to "explain" to the algorithm what you are expecting.
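For comparison, a minimal KNN sketch for the setup in the question, where each face is described by 5 distances. The function name, the training vectors and the labels are invented for illustration; real features would come from your landmark detector and would need labeled examples of every expression.

```python
import math
from collections import Counter

def knn_predict(train_x, train_y, x, k=3):
    """Label x with the majority class among its k nearest labeled examples."""
    nearest = sorted(zip(train_x, train_y), key=lambda p: math.dist(p[0], x))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Hypothetical labeled data: 5 facial distances per sample.
train_x = [
    (1.0, 1.0, 1.0, 1.0, 1.0), (1.1, 1.0, 0.9, 1.0, 1.0), (0.9, 1.0, 1.0, 1.1, 1.0),
    (3.0, 3.0, 3.0, 3.0, 3.0), (3.1, 3.0, 2.9, 3.0, 3.0), (2.9, 3.0, 3.0, 3.0, 3.1),
]
train_y = ["happy", "happy", "happy", "angry", "angry", "angry"]

print(knn_predict(train_x, train_y, (1.0, 1.0, 1.0, 1.0, 1.05)))
print(knn_predict(train_x, train_y, (3.0, 3.0, 3.0, 2.9, 3.0)))
```

The key difference from k-means is that the labels are part of the input, so the algorithm is told which split you care about instead of discovering an arbitrary one.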

Unified distance measure to use in different implementations of OpenCV feature matching?

I'm trying to do some key feature matching in OpenCV, and for now I've been using cv::DescriptorMatcher::match and, as expected, I'm getting quite a few false matches.
Before I start to write my own filter and pruning procedures for the extracted matches, I wanted to try out the cv::DescriptorMatcher::radiusMatch function, which should only return the matches closer to each other than the given float maxDistance.
I would like to write a wrapper for the available OpenCV matching algorithms, so that I can use them through an interface which allows for additional functionality as well as additional external (my own) matching implementations.
Since in my code, there is only one concrete class acting as a wrapper to OpenCV feature matching (similarly as cv::DescriptorMatcher, it takes the name of the specific matching algorithm and constructs it internally through a factory method), I would also like to write a universal method to implement matching utilizing cv::DescriptorMatcher::radiusMatch that would work for all the different matcher and feature choices (I have a similar wrapper that allows me to change between different OpenCV feature detectors and also implement some of my own).
Unfortunately, after looking through the OpenCV documentation and the cv::DescriptorMatcher interface, I just can't find any information about the distance measure used to calculate the actual distance between the matches. I found a pretty good matching example here using SURF features and descriptors, but I did not manage to understand what a specific value of the distance argument actually means.
Since I would like to compare the results I'd get when using different feature/descriptor combinations, I would like to know what kind of distance measure is used (and if it can easily be changed), so that I can use something that makes sense with all the combinations I try out.
Any ideas/suggestions?
Update
I've just printed out the feature distances I get when using cv::DescriptorMatcher::match with various feature/descriptor combinations, and what I got was:
MSER/SIFT order of magnitude: 100
SURF/SURF order of magnitude: 0.1
SURF/SIFT order of magnitude: 50
MSER/SURF order of magnitude: 0.2
From this I can conclude that whichever distance measure is applied to the features, it is definitely not normalized. Since I am using OpenCV's and my own interfaces to work with different feature extraction, descriptor calculation and matching methods, I would like to have a single argument for ::radiusMatch that I could use with all (or most) of the different combinations. (I've tried matching using the BruteForce and FlannBased matchers, and while the matches are slightly different, the distances between the matches are of the same order of magnitude for each of the combinations.)
Some context:
I'm testing this on two pictures acquired from a camera mounted on top of a (slow) moving vehicle. The images should be around 5 frames (1 meter of vehicle motion) apart, so most of the features should be visible, and not much different (especially those that are far away from the camera in both images).
The magnitude of the distance is indeed dependent on the type of feature used. That is because some specialized feature descriptors also come with a specialized feature matcher that makes optimal use of the descriptor. If you want to obtain weights for the match distances of different feature types, your best bet is probably to make a training set of a dozen or more 1:1 matches, unleash each feature detector/matcher on it, and normalize the distances so that each detector has an average distance of 1 over all matches. You can then use the obtained weights on other datasets.
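The normalization suggested here can be sketched in a few lines. The distances below are invented numbers in the same orders of magnitude as the ones reported in the question; the function name and the dictionary layout are my own.

```python
def scale_factor(training_distances):
    """Weight that makes the mean of the training match distances equal to 1."""
    return len(training_distances) / sum(training_distances)

# Hypothetical distances of known-correct 1:1 matches, per detector/descriptor pair.
training = {
    "MSER/SIFT": [100.0, 120.0, 80.0],
    "SURF/SURF": [0.10, 0.12, 0.08],
}
weights = {name: scale_factor(d) for name, d in training.items()}

# A single, unit-free radius can now be applied to every combination.
unified_radius = 1.5
for name, w in weights.items():
    print(name, "raw maxDistance for radiusMatch:", unified_radius / w)
```

With the weights in hand, a unified radius is converted back to the raw scale of each combination before being passed as the maxDistance argument, so the wrapper exposes one parameter while each matcher still sees distances in its native units.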
You should have a look at the following function in features2d.hpp in the OpenCV library:
template<class Distance> void BruteForceMatcher<Distance>::commonRadiusMatchImpl()
Usually the L2 distance is used to measure the distance between matches, but it depends on the descriptor you use. For example, the Hamming distance is the natural choice for the BRIEF descriptor, since it counts the differing bits between two binary strings.

Resources