I have a set of template images against which I need to compare a test image and find the best match. Given SIFT descriptors, I select the best feature match, and all feature matches that lie within 3*distance of the best match are considered good matches. Then I add up the distances of all the good matches. I don't know if this is a good approach, because I think I should also take into account the number of good matches, not just the sum (or average) of the distances between them. I am new to template matching, so I would appreciate your input.
In these test images, is the template you are looking for always in the same perspective (undistorted)? If so, I would recommend a more accurate technique than feature-point matching. OpenCV offers a function called matchTemplate(), and there is even a GPU implementation. Your measure can be based on the pixel-averaged result of that function.
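For what it's worth, a minimal sketch of that approach might look like this (the file names are placeholders, and TM_CCOEFF_NORMED is just one of several available comparison modes):

#include <opencv2/opencv.hpp>
#include <iostream>

int main() {
    // Hypothetical file names; replace with your own test image and template.
    cv::Mat image = cv::imread("test.png", cv::IMREAD_GRAYSCALE);
    cv::Mat templ = cv::imread("template.png", cv::IMREAD_GRAYSCALE);

    // Slide the template over the image; TM_CCOEFF_NORMED produces a
    // normalized correlation score, convenient as a match measure.
    cv::Mat result;
    cv::matchTemplate(image, templ, result, cv::TM_CCOEFF_NORMED);

    // The global maximum of the result map is the best match location.
    double minVal, maxVal;
    cv::Point minLoc, maxLoc;
    cv::minMaxLoc(result, &minVal, &maxVal, &minLoc, &maxLoc);
    std::cout << "best score " << maxVal << " at " << maxLoc << std::endl;
    return 0;
}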
If they are distorted, then using SIFT or SURF might suffice. You should send your point matches through findHomography(), which will use RANSAC to remove outliers. The number of matches that survive this test can be used as a measure to decide whether the object is found.
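A sketch of that inlier-count measure, assuming you already have the matched keypoint coordinates from both images (the names here are illustrative):

#include <opencv2/opencv.hpp>
#include <vector>

// srcPts[i] and dstPts[i] are the locations of the i-th matched keypoint
// pair (e.g., from a SIFT/SURF matcher).
int countInliers(const std::vector<cv::Point2f>& srcPts,
                 const std::vector<cv::Point2f>& dstPts) {
    if (srcPts.size() < 4) return 0;  // findHomography needs >= 4 pairs

    // RANSAC fits a homography and flags the matches that agree with it.
    std::vector<uchar> inlierMask;
    cv::Mat H = cv::findHomography(srcPts, dstPts, cv::RANSAC, 3.0, inlierMask);
    if (H.empty()) return 0;  // no consistent model found

    // The number of surviving matches is the detection measure.
    return cv::countNonZero(inlierMask);
}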
I'm quite new to OpenCV and image processing, so my questions about the feature matching approach are a bit general. I have read some of the theory, but I have trouble mapping that very specific theory onto the following steps.
As I understand it, the sequence can be grouped into the following steps:
Feature detection: special points in the image are found in a very repeatable way
Feature description: information about the local neighborhood of each feature point is collected, and one descriptor vector per feature point is created
->(1) Is this always in the form of a histogram?
Matching: A distance between the descriptors is calculated
->(2) Can I determine what kind of distance is used? I read about χ² and EMD; even if they are not implemented, are these the right keywords in this context?
Corresponding matches are determined
->(3) I guess the Hungarian method would be one option?
Transformation estimation: in an optimization problem, the best position is estimated
It would be nice if someone could clarify the three questions marked (1)-(3) above.
(1): Is this always in the form of a histogram?
No; for example, there are binary descriptors for ORB features. In theory, descriptors can be anything. Often they are normalized, and often they are either binary or floating point. But histograms have some properties which can make them good descriptors.
(2) Can I determine what kind of distance is used?
For floating-point descriptors, the sum of squared differences is probably the most used metric for measuring the distance. For binary descriptors, as far as I know, the Hamming distance is used.
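For illustration, cv::norm can compute both metrics directly; a small self-contained sketch (the descriptor contents below are dummies, not real SIFT/ORB output):

#include <opencv2/opencv.hpp>
#include <iostream>

int main() {
    // Two stand-in SIFT-like descriptors (128-dim float vectors).
    cv::Mat a = cv::Mat::ones(1, 128, CV_32F);
    cv::Mat b = cv::Mat::zeros(1, 128, CV_32F);
    // Floating-point descriptors: L2 distance (square root of SSD).
    std::cout << cv::norm(a, b, cv::NORM_L2) << std::endl;

    // Two stand-in ORB-like descriptors (256 bits packed into 32 bytes).
    cv::Mat p = cv::Mat::zeros(1, 32, CV_8U);
    cv::Mat q = cv::Mat::ones(1, 32, CV_8U) * 255;
    // Binary descriptors: Hamming distance (number of differing bits).
    std::cout << cv::norm(p, q, cv::NORM_HAMMING) << std::endl;
    return 0;
}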
(3) I guess the Hungarian method would be one option?
It could be used, I guess, but it might or might not lead to problems. Typically, nearest-neighbor approaches are used, often just brute force (which is O(n^2) instead of the O(n^3) of the Hungarian method). The "problem" that multiple descriptors of one set might have the same nearest neighbor in the second set is in fact another feature: if that happens, you might be able to filter out some "uncertain" matches (often the ratio of the best n matches is used to filter out even more). You must assume that many descriptors in one set will have no fitting correspondence in the second set, and that the matching itself won't produce perfect matches. Typically, additional steps like homography computation are used to make the matching more robust and filter out outliers.
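One cheap way to exploit that, if you use OpenCV, is the brute-force matcher's cross-check option, which keeps only mutual nearest neighbors; a sketch with random stand-in descriptors:

#include <opencv2/opencv.hpp>
#include <vector>
#include <iostream>

int main() {
    // desc1, desc2 stand in for float descriptor matrices computed elsewhere.
    cv::Mat desc1(500, 128, CV_32F), desc2(450, 128, CV_32F);
    cv::randu(desc1, cv::Scalar::all(0), cv::Scalar::all(1));
    cv::randu(desc2, cv::Scalar::all(0), cv::Scalar::all(1));

    // crossCheck=true keeps a match (i, j) only if j is the nearest
    // neighbor of i AND i is the nearest neighbor of j.
    cv::BFMatcher matcher(cv::NORM_L2, /*crossCheck=*/true);
    std::vector<cv::DMatch> matches;
    matcher.match(desc1, desc2, matches);
    std::cout << matches.size() << " mutual matches" << std::endl;
    return 0;
}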
I have a large image (5400x3600) that has multiple CCTVs that I need to detect.
The detection takes a lot of time (4-7 minutes) with rotation, and it still fails to resolve certain CCTVs.
What is the best method to match a template like this?
I am using skimage; OpenCV is not an option for me, but I am open to suggestions on that too.
For example: in the images below, the template is correctly matched in the second image, but the first image is not matched, I guess due to the noise created by the text "BLDG...".
(Images omitted: template, source image, and match result.)
The fastest method is probably a cascade of boosted classifiers trained with several variations of your logo, possibly a few rotations, and some negative examples too (non-logos). You have to roughly scale your overall image so that the test and training examples are approximately matched in scale. Unlike SIFT or SURF, which spend a lot of time searching for interest points and creating descriptors for both learning and searching, binary classifiers shift most of the burden to a training stage, while your testing or search will be much faster.

In short, the cascade runs in such a way that a very cheap first test discards a large portion of the image. Only if the first test passes do the other stages follow and refine; they are super fast, consisting of just a few intensity comparisons on average around each point. Only a few locations pass the whole cascade, and these can be verified with additional tests such as your rotation-correlation routine.

Thus, the classifiers are effective not only because they quickly detect your object but because they can also quickly discard non-object areas. To read more about boosted classifiers, see the cascade classifier section of the OpenCV documentation.
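Once such a cascade is trained (e.g., with OpenCV's cascade training tool), applying it is only a few lines; a sketch, where logo_cascade.xml and scene.png are placeholder names:

#include <opencv2/opencv.hpp>
#include <vector>
#include <iostream>

int main() {
    // Hypothetical cascade trained on logo variations and negatives.
    cv::CascadeClassifier cascade;
    if (!cascade.load("logo_cascade.xml")) return 1;

    cv::Mat image = cv::imread("scene.png", cv::IMREAD_GRAYSCALE);

    // Each candidate window passes through the cascade stages; most of
    // the image is rejected by the first, cheapest stages.
    std::vector<cv::Rect> detections;
    cascade.detectMultiScale(image, detections, 1.1, 3);
    std::cout << detections.size() << " candidate logo locations" << std::endl;
    return 0;
}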
This problem is in general addressed by logo detection. See this for a similar discussion.
There are many robust methods for template matching. See this or Google for a very detailed discussion.
But from your example I guess that the following approach would work.

Create a feature for your search image. It essentially has a rectangle enclosing the word "CCTV", so the width, height, angle, and individual character features for matching the textual information could be a suitable choice. (Or you may also use the image containing "CCTV"; in that case the method will not be scale invariant.)

When searching, first detect rectangles. Then use the angle to prune your search space, and use an image transformation to align the rectangles parallel to the axes (this should take care of the rotation), as sketched below. Then, according to the feature chosen in the first step, match the text content. If you use individual character features, your template matching step is essentially a classification step. Otherwise, if you use the image for matching, you may use cv::matchTemplate.
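A rough sketch of the rectangle detection and de-rotation steps (the thresholds, sizes, and file names are placeholder assumptions):

#include <opencv2/opencv.hpp>
#include <vector>

int main() {
    cv::Mat img = cv::imread("plan.png", cv::IMREAD_GRAYSCALE);

    // Binarize and find candidate contours.
    cv::Mat bw;
    cv::threshold(img, bw, 128, 255, cv::THRESH_BINARY_INV);
    std::vector<std::vector<cv::Point>> contours;
    cv::findContours(bw, contours, cv::RETR_EXTERNAL, cv::CHAIN_APPROX_SIMPLE);

    for (const auto& c : contours) {
        // Fit a rotated rectangle; its angle tells us how to de-rotate.
        cv::RotatedRect box = cv::minAreaRect(c);
        if (box.size.area() < 100) continue;  // skip tiny blobs

        // Rotate the image so this rectangle is axis-aligned, then crop it.
        cv::Mat rot = cv::getRotationMatrix2D(box.center, box.angle, 1.0);
        cv::Mat aligned;
        cv::warpAffine(img, aligned, rot, img.size());
        cv::Rect roi(cvRound(box.center.x - box.size.width / 2),
                     cvRound(box.center.y - box.size.height / 2),
                     cvRound(box.size.width), cvRound(box.size.height));
        roi &= cv::Rect(0, 0, aligned.cols, aligned.rows);
        cv::Mat candidate = aligned(roi);
        // ... match the text content of `candidate` against the template
        // (character classification or cv::matchTemplate)
    }
    return 0;
}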
Hope it helps.
Symbol spotting is more complicated than logo spotting because interest points hardly work on document images such as architectural plans. Many conferences deal with pattern recognition, and each year there are many new algorithms for symbol spotting, so giving you the best method is not possible. You could check the IAPR conferences: ICPR, ICDAR, DAS, GREC (Workshop on Graphics Recognition), etc. These researchers focus on this topic: M. Rusiñol, J. Lladós, S. Tabbone, J.-Y. Ramel, M. Liwicki, etc. They work on several techniques for improving symbol spotting, such as vectorial signatures, graph-based signatures, and so on (check Google Scholar for more papers).
An easy way to start a new approach is to work with simple shapes such as lines, rectangles, and triangles instead of matching everything at once.
Your example can be recognized by shape matching (contour matching), much faster than 4 minutes.
For a good match, you need decent preprocessing and denoising.
Examples can be found at http://www.halcon.com/applications/application.pl?name=shapematch
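As one concrete starting point along these lines, OpenCV's cv::matchShapes compares contours via Hu moments, which are invariant to translation, scale, and rotation; a minimal sketch with placeholder file names (my own suggestion, not something from the Halcon page above):

#include <opencv2/opencv.hpp>
#include <vector>
#include <iostream>

// Extract the largest external contour of a binarized grayscale image.
static std::vector<cv::Point> largestContour(const cv::Mat& gray) {
    cv::Mat bw;
    cv::threshold(gray, bw, 128, 255, cv::THRESH_BINARY_INV);
    std::vector<std::vector<cv::Point>> contours;
    cv::findContours(bw, contours, cv::RETR_EXTERNAL, cv::CHAIN_APPROX_SIMPLE);
    if (contours.empty()) return {};
    std::size_t best = 0;
    for (std::size_t i = 1; i < contours.size(); ++i)
        if (cv::contourArea(contours[i]) > cv::contourArea(contours[best]))
            best = i;
    return contours[best];
}

int main() {
    cv::Mat tmpl = cv::imread("template.png", cv::IMREAD_GRAYSCALE);
    cv::Mat scene = cv::imread("symbol_candidate.png", cv::IMREAD_GRAYSCALE);

    // Hu-moment based dissimilarity: 0 means identical shapes.
    double d = cv::matchShapes(largestContour(tmpl), largestContour(scene),
                               cv::CONTOURS_MATCH_I1, 0);
    std::cout << "shape dissimilarity: " << d << std::endl;
    return 0;
}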
I have 2 objects. I get n features from object 1 & m features from object 2.
n!=m
I have to measure the probability that object 1 is similar to object 2.
How can I do this?
There is a nice tutorial on the OpenCV website that does this. Check it out.
The idea is to compute the distances between all those descriptors with a FlannBasedMatcher, keep the closest ones, and run RANSAC to find a set of geometrically consistent matches between the two objects. You don't get a probability, but you do get the number of consistent matches, from which you can score how good your detection is; how exactly, though, is up to you.
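A sketch of that pipeline (cv::SIFT::create assumes OpenCV >= 4.4; dividing the inlier count by min(n, m) is just one arbitrary way to squash it into [0, 1]):

#include <opencv2/opencv.hpp>
#include <algorithm>
#include <iostream>
#include <vector>

int main() {
    cv::Mat img1 = cv::imread("object1.png", cv::IMREAD_GRAYSCALE);
    cv::Mat img2 = cv::imread("object2.png", cv::IMREAD_GRAYSCALE);

    // n keypoints from object 1, m from object 2 (n != m is fine).
    cv::Ptr<cv::SIFT> sift = cv::SIFT::create();
    std::vector<cv::KeyPoint> kp1, kp2;
    cv::Mat d1, d2;
    sift->detectAndCompute(img1, cv::noArray(), kp1, d1);
    sift->detectAndCompute(img2, cv::noArray(), kp2, d2);
    if (kp1.empty() || kp2.empty()) return 1;

    // Nearest-neighbor matching with FLANN.
    cv::FlannBasedMatcher matcher;
    std::vector<cv::DMatch> matches;
    matcher.match(d1, d2, matches);

    // Geometric verification: RANSAC keeps only matches consistent
    // with a single homography between the two objects.
    std::vector<cv::Point2f> p1, p2;
    for (const auto& m : matches) {
        p1.push_back(kp1[m.queryIdx].pt);
        p2.push_back(kp2[m.trainIdx].pt);
    }
    int inliers = 0;
    if (p1.size() >= 4) {
        std::vector<uchar> mask;
        cv::findHomography(p1, p2, cv::RANSAC, 3.0, mask);
        inliers = cv::countNonZero(mask);
    }

    // Not a probability, but a similarity score in [0, 1].
    double score = inliers / (double)std::min(kp1.size(), kp2.size());
    std::cout << "consistent matches: " << inliers
              << ", score: " << score << std::endl;
    return 0;
}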
You can group the features in the regions of the image where they are densest and store them in a vector for matching. There may be multiple matches among them, from which you can choose the highest-scoring one.
Are you talking about point feature descriptors, like SIFT, SURF, or FREAK?
In that case there are several strategies. In all cases you need a distance measure. For SIFT or SURF you can use the Euclidean distance between the descriptors, or the L1 norm, or the dot product (correlation). For binary features, like FREAK or BRISK, you typically use the Hamming distance.
Then, one approach is to simply pick a threshold on the distance. This is likely to give you many-to-many matches. Another way is to use bipartite graph matching to find the minimum-cost or maximum-weight assignment between the two sets. A very practical approach, described by David Lowe, uses a ratio test to discard ambiguous matches.
Many of these strategies are implemented in the matchFeatures function in the Computer Vision System Toolbox for MATLAB.
I'm trying to do some key feature matching in OpenCV, and for now I've been using cv::DescriptorMatcher::match and, as expected, I'm getting quite a few false matches.
Before I start writing my own filtering and pruning procedures for the extracted matches, I wanted to try out the cv::DescriptorMatcher::radiusMatch function, which should return only the matches that are closer to each other than the given float maxDistance.
I would like to write a wrapper for the available OpenCV matching algorithms so that I can use them through an interface that allows for additional functionality as well as additional external (my own) matching implementations.
In my code there is only one concrete class acting as a wrapper for OpenCV feature matching (similarly to cv::DescriptorMatcher, it takes the name of the specific matching algorithm and constructs it internally through a factory method). I would therefore like to write a universal method implementing matching via cv::DescriptorMatcher::radiusMatch that works for all the different matcher and feature choices (I have a similar wrapper that allows me to switch between different OpenCV feature detectors and also to implement some of my own).
Unfortunately, after looking through the OpenCV documentation and the cv::DescriptorMatcher interface, I just can't find any information about the distance measure used to calculate the actual distance between matches. I found a pretty good matching example here using SURF features and descriptors, but I did not manage to understand the actual meaning of a specific value of the argument.
Since I would like to compare the results I'd get when using different feature/descriptor combinations, I would like to know what kind of distance measure is used (and if it can easily be changed), so that I can use something that makes sense with all the combinations I try out.
Any ideas/suggestions?
Update
I've just printed out the feature distances I get when using cv::DescriptorMatcher::match with various feature/descriptor combinations, and what I got was:
MSER/SIFT order of magnitude: 100
SURF/SURF order of magnitude: 0.1
SURF/SIFT order of magnitude: 50
MSER/SURF order of magnitude: 0.2
From this I can conclude that, whichever distance measure is applied to the features, it is definitely not normalized. Since I am using OpenCV's and my own interfaces to work with different feature extraction, descriptor calculation, and matching methods, I would like to have some argument for ::radiusMatch that I could use with all (most) of the different combinations. (I've tried matching using the BruteForce and FlannBased matchers, and while the matches are slightly different, the distances between the matches are of the same order of magnitude for each of the combinations.)
Some context:
I'm testing this on two pictures acquired from a camera mounted on top of a (slow) moving vehicle. The images should be around 5 frames (1 meter of vehicle motion) apart, so most of the features should be visible, and not much different (especially those that are far away from the camera in both images).
The magnitude of the distance is indeed dependent on the type of feature used. That is because some specialized feature descriptors also come with a specialized feature matcher that makes optimal use of the descriptor. If you want to obtain weights for the match distances of different feature types, your best bet is probably to make a training set of a dozen or more 1:1 matches, unleash each feature detector/matcher on it, and normalize the distances so that each detector has an average distance of 1 over all matches. You can then use the obtained weights on other datasets.
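That normalization step is simple enough to sketch; distanceWeight below is a hypothetical helper, and the training distances are assumed to be collected elsewhere from known-correct 1:1 matches:

#include <vector>
#include <numeric>

// Given the raw match distances one detector/matcher pair produced on a
// training set of known-correct 1:1 matches, return the weight that
// rescales its distances to an average of 1. Assumes a non-empty input.
double distanceWeight(const std::vector<double>& matchDistances) {
    double sum = std::accumulate(matchDistances.begin(),
                                 matchDistances.end(), 0.0);
    double mean = sum / matchDistances.size();
    return 1.0 / mean;  // multiply future distances by this weight
}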
You should have a look at the following function in features2d.hpp in the OpenCV library:
template<class Distance> void BruteForceMatcher<Distance>::commonRadiusMatchImpl()
Usually we use the L2 distance to measure the distance between matches, but it depends on the descriptor you use. For example, the Hamming distance is useful for the BRIEF descriptor, since it counts the bit differences between two bit strings.
I am using the OpenCV SURF tracker to find matching points in two images.
As you know, SURF returns many feature points in both images. What I want to do is use these feature parameters to find out which matches are exactly correct (true positive matches). In my application I need only true positive matches.
These parameters exist: Hessian, Laplacian, Distance, Size, Dir.
I don't know how to use these parameters.
Do exact matches have a smaller distance or a larger Hessian? Can the Laplacian help? Can Size or Dir help?
How can I find exact matches (true positives)?
You can find very decent matches between descriptors in the query and the image by adopting the following strategy:
Use a 2-NN search for query descriptors among the image descriptors, and the following condition
if distance(1st match) < 0.6 * distance(2nd match), the 1st match is a "good match"
to filter out false positives.
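A sketch of that 2-NN search plus ratio test in OpenCV (the 0.6 factor is taken from the condition above; Lowe's original paper suggests 0.8):

#include <opencv2/opencv.hpp>
#include <vector>

// queryDesc, imageDesc: float descriptor matrices (e.g., SURF), one row
// per descriptor.
std::vector<cv::DMatch> goodMatches(const cv::Mat& queryDesc,
                                    const cv::Mat& imageDesc) {
    cv::BFMatcher matcher(cv::NORM_L2);
    // For each query descriptor, find its 2 nearest neighbors in the image.
    std::vector<std::vector<cv::DMatch>> knn;
    matcher.knnMatch(queryDesc, imageDesc, knn, 2);

    // Keep a match only if the best neighbor is clearly better than the
    // second best; ambiguous matches are likely false positives.
    std::vector<cv::DMatch> good;
    for (const auto& pair : knn)
        if (pair.size() == 2 && pair[0].distance < 0.6 * pair[1].distance)
            good.push_back(pair[0]);
    return good;
}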
It is obvious that you can't be 100% sure which points truly match. You can increase the number of true positives (at the cost of performance) by tuning the SURF parameters (see some links here). Depending on your real task, you can use robust algorithms to eliminate outliers, e.g. RANSAC if you perform some kind of model fitting. Also, as Erfan said, you can use spatial information (check out "Elastic Bunch Graph Matching" and spatial BoW).
The answer I'm about to post is just my guess, because I have not tested it to see whether it works exactly as predicted.
By comparing the relative polar distances between 3 random candidate feature points returned by OpenCV and the corresponding points in the template (allowing a certain error), you can not only compute the probability of a true positive, but also the angle and the scale of your matched pattern.
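To make that guess a bit more concrete, here is one hypothetical way such a check could look: treat the 3 points as a triangle and require every side to be scaled and rotated consistently (all names and tolerances are illustrative):

#include <opencv2/core.hpp>
#include <cmath>

// t[]: 3 points in the template; s[]: their candidate matches in the scene.
// If the match is a true positive (rigid rotated/scaled copy), every side
// of the triangle must be scaled by the same factor and rotated by the
// same angle.
bool consistentTriple(const cv::Point2f t[3], const cv::Point2f s[3],
                      double scaleTol = 0.1, double angleTol = 0.1) {
    double scale0 = 0.0, angle0 = 0.0;
    for (int i = 0; i < 3; ++i) {
        cv::Point2f dt = t[(i + 1) % 3] - t[i];
        cv::Point2f ds = s[(i + 1) % 3] - s[i];
        double scale = std::hypot(ds.x, ds.y) / std::hypot(dt.x, dt.y);
        double angle = std::atan2(ds.y, ds.x) - std::atan2(dt.y, dt.x);
        if (i == 0) { scale0 = scale; angle0 = angle; continue; }
        // Reject if this side's scale or rotation deviates too much.
        if (std::abs(scale - scale0) > scaleTol * scale0) return false;
        if (std::abs(std::remainder(angle - angle0, 2.0 * CV_PI)) > angleTol)
            return false;
    }
    return true;
}

If a triple passes, the common scale and angle recovered in the loop are exactly the scale and angle of the matched pattern mentioned above.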
Cheers!