How to remove similar images based on hog features? - opencv

I have 5000 images, and each image yields a HOG feature vector of about 1000 dimensions. Some of the images are very similar, so I want to remove the near-duplicates. Is there a way to achieve this?
===============================================================
As #thedarkside ofthemoon suggested, let me explain a little more about what I am trying to do. I am using SVM + HOG features for image classification. I have prepared some training data, but some of the training images are very similar, so I want to remove the similar ones to reduce the computation cost. I don't know whether removing similar images has a side effect on the final classification rate, so a good criterion of 'similarity' must be found. That's what I am trying to do.

Another way (not using HOG features) is to compute a color histogram for each image and compare it against the others.
For example:
Take the first image and compute its histogram.
Then, for each of the other images, compute the histogram and compare it with the first one.
If the histograms match closely, you can discard that image. With CV_COMP_CORREL the score lies in the range -1 to 1, where 1 is a perfect match.
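A minimal sketch of the histogram idea above, written with NumPy instead of OpenCV so that it stays self-contained. It uses grayscale histograms for brevity, and the correlation is the Pearson correlation of the bin values, which is the quantity the correlation comparison method computes. The function names, the 32 bins and the 0.95 threshold are assumptions for illustration, not values from the answer.

```python
import numpy as np

def gray_histogram(image, bins=32):
    """Normalized intensity histogram of an 8-bit image (any shape)."""
    hist, _ = np.histogram(image, bins=bins, range=(0, 256))
    return hist.astype(np.float64) / max(hist.sum(), 1)

def histogram_correlation(h1, h2):
    """Pearson correlation between two histograms, in [-1, 1]."""
    d1, d2 = h1 - h1.mean(), h2 - h2.mean()
    denom = np.sqrt((d1 ** 2).sum() * (d2 ** 2).sum())
    return float((d1 * d2).sum() / denom) if denom > 0 else 0.0

def drop_similar(images, threshold=0.95):
    """Greedy filter: keep an image only if it is not too close to any kept one."""
    kept, kept_hists = [], []
    for idx, image in enumerate(images):
        hist = gray_histogram(image)
        if all(histogram_correlation(hist, k) < threshold for k in kept_hists):
            kept.append(idx)
            kept_hists.append(hist)
    return kept
```

Comparing each image against everything already kept, rather than only against the first image, avoids keeping two images that are similar to each other but not to the first one. The threshold would need tuning on real data.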

Well, it depends on what you mean by similar. Currently my favorite image similarity descriptor is the GIST descriptor.
http://people.csail.mit.edu/torralba/code/spatialenvelope/
It is not in OpenCV. However, it is implemented in C here, so it can be added to a C++ project (extern "C") if you're using the C++ OpenCV API. I'm not sure about Python, sorry.
http://people.rennes.inria.fr/Herve.Jegou/software.html
I have found this to be pretty good, and quite efficient.
(Sorry this is not a direct OpenCV solution, but I feel it is a reasonable answer, as the GIST C code can be added to a C++ project and works nicely.)
EDIT:
If you just want to remove the ones with a similar HOG descriptor, you can use a nearest-neighbor search:
http://docs.opencv.org/modules/ml/doc/k_nearest_neighbors.html
or
http://docs.opencv.org/trunk/modules/flann/doc/flann_fast_approximate_nearest_neighbor_search.html
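As a rough illustration of removing images whose HOG descriptors are close to each other, here is a brute-force NumPy sketch. It is not the kNN or FLANN API from the links above; the function name, the L2 normalization and the `max_distance` parameter are assumptions for illustration.

```python
import numpy as np

def deduplicate_descriptors(descriptors, max_distance):
    """Return indices of rows to keep, dropping rows within max_distance of a kept row."""
    descriptors = np.asarray(descriptors, dtype=np.float64)
    # L2-normalize so the distance threshold does not depend on descriptor magnitude.
    norms = np.linalg.norm(descriptors, axis=1, keepdims=True)
    unit = descriptors / np.maximum(norms, 1e-12)
    kept = []
    for i, row in enumerate(unit):
        if kept:
            distances = np.linalg.norm(unit[kept] - row, axis=1)
            if distances.min() <= max_distance:
                continue  # a near-duplicate of something already kept
        kept.append(i)
    return kept
```

For 5000 vectors of about 1000 dimensions, a brute-force pass like this is usually feasible. An approximate nearest-neighbor index would replace the inner distance computation for much larger sets. Whether the removal hurts the classification rate has to be checked by retraining the SVM with and without the removed images.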

Related

How to find the matched SIFT features that are spatially consistent?

I have extracted DenseSIFT from the query and database images and quantized the descriptors with k-means using VLFeat. The challenge is to find the SIFT features that are quantized to the same visual word and are spatially consistent (have a similar position relative to the object centers). I have tried a few techniques:
Using FLANN on the (normal) SIFT coordinates of the query and database images to find the nearest neighbors, and then comparing the visual words (NOTE: this gave a few points that did not work).
Using Coherent Point Drift (CPD) on the SIFT coordinates to find the matched points (I am not sure whether this is the right solution).
I have been struggling with this for many days, and I hope experts can guide me. What solutions or algorithms could I use to solve this?
Neither of the two methods you mentioned achieves what you want to do. The answer depends on the object in your pictures. If it has mostly flat faces, you can rely on estimating a homography; see this tutorial.
If that's not the case, you can use the epipolar constraint to remove outliers and keep geometrically consistent matches; see this tutorial. There are other ways to achieve this if speed is important in your application.
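To make the homography suggestion concrete, here is a small NumPy sketch of RANSAC-based geometric verification for the planar case: given putative matches (for example, features that share a visual word), it keeps those consistent with a single homography. The function names, the 3-pixel threshold and the iteration count are assumptions for illustration. OpenCV provides a ready-made equivalent that would be preferable in practice.

```python
import numpy as np

def fit_homography(src, dst):
    """Direct linear transform: 3x3 H with dst ~ H @ src, from >= 4 point pairs."""
    rows = []
    for (x, y), (u, v) in zip(src, dst):
        rows.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        rows.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
    _, _, vt = np.linalg.svd(np.asarray(rows, dtype=np.float64))
    return vt[-1].reshape(3, 3)

def project(h, pts):
    """Apply homography h to an (N, 2) array of points."""
    homog = np.hstack([pts, np.ones((len(pts), 1))]) @ h.T
    return homog[:, :2] / homog[:, 2:3]

def ransac_inliers(src, dst, threshold=3.0, iterations=500, seed=0):
    """Boolean mask of matches consistent with the best homography found."""
    src, dst = np.asarray(src, float), np.asarray(dst, float)
    rng = np.random.default_rng(seed)
    best = np.zeros(len(src), dtype=bool)
    for _ in range(iterations):
        sample = rng.choice(len(src), size=4, replace=False)
        h = fit_homography(src[sample], dst[sample])
        with np.errstate(divide="ignore", invalid="ignore"):
            errors = np.linalg.norm(project(h, src) - dst, axis=1)
        mask = errors < threshold  # NaN errors compare as False
        if mask.sum() > best.sum():
            best = mask
    return best
```

Here `src` and `dst` are the keypoint coordinates of the matched features in the query and database images. For non-planar objects the same loop applies with a fundamental matrix in place of the homography.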

object detection LEDs in simple scene

I am new to OpenCV, and I am guessing that this problem is fairly simple: I am trying to detect an object of roughly 25 by 15 pixels in an image of 470 by 590 pixels.
I am attaching a zoomed image of this object. I have several options to go with:
1 - Detection of two close circles using the Hough transform
2 - Histogram matching
3 - SURF feature detection
Any advice on which direction I should take? Please consider speed and a real-time application. Thanks
It should go without saying, but there are probably hundreds of things that could be tried, and with only one example image it is quite difficult to advise. For instance, are the LEDs always green? We don't know.
That aside, IMHO two good places to start would be the old faithful template matching, or blob detection.
If that is not robust enough, you will need to look at alternative representations of the template/blob, like the classic HOG (good for shape, maybe a bit heavy for this application), or even a bespoke one that encodes your own domain-specific knowledge of the problem.
If that is still not robust enough, build a dataset of representative positive and negative examples, as big as you can, and then train a classifier such as an SVM or a boosted classifier.
Template Matching:
http://docs.opencv.org/doc/tutorials/imgproc/histograms/template_matching/template_matching.html
Blob detection:
https://code.google.com/p/cvblob/
Machine Learning:
http://docs.opencv.org/modules/ml/doc/ml.html
TIPS:
Add as much domain knowledge as possible. For example, if the LEDs are always green, use color in the representation, such as HOG on the G channel. If they are always circular, try to encode that, for instance by using a log-polar grid in the template rather than a regular grid, and so on.
Machine learning is not magic. A linear classifier essentially weights different points in the feature space, so you still need a good representation. If template matching was a total failure, it is unlikely that simple linear ML will help; but if template matching was okay, ML may well boost the performance to a good level.
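For reference, the template matching starting point suggested above can be sketched in a few lines of NumPy. This is a brute-force zero-mean normalized cross-correlation, similar in spirit to OpenCV's normalized correlation-coefficient mode. The function name and the single-channel input are assumptions for illustration.

```python
import numpy as np

def match_template_ncc(image, template):
    """Return (row, col, score) of the best normalized cross-correlation match."""
    image = np.asarray(image, dtype=np.float64)
    template = np.asarray(template, dtype=np.float64)
    th, tw = template.shape
    t = template - template.mean()
    t_norm = np.sqrt((t ** 2).sum())
    best = (0, 0, -1.0)
    for r in range(image.shape[0] - th + 1):
        for c in range(image.shape[1] - tw + 1):
            window = image[r:r + th, c:c + tw]
            w = window - window.mean()
            denom = t_norm * np.sqrt((w ** 2).sum())
            score = (t * w).sum() / denom if denom > 0 else 0.0
            if score > best[2]:
                best = (r, c, float(score))
    return best
```

Following the color tip, the input could be the green channel only. The double loop is far too slow for real-time use on a 470 by 590 image; it only shows what is being computed, and an optimized library routine would do the actual work.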
Step 1: Remove the black background.
Step 2: A snake (active contour) algorithm can be used to find the boundaries of your object.

What is the best method to template match a image with noise?

I have a large image (5400x3600) that has multiple CCTVs that I need to detect.
The detection takes a lot of time (4-7 minutes) with rotation, and it still fails to find certain CCTVs.
What is the best method to match a template like this?
I am using scikit-image. OpenCV is not an option for me, but I am open to suggestions on that too.
For example, in the images below, the template is correctly matched in the second image but not in the first, I guess due to the noise created by the text "BLDG..."
Template:
Source image:
Match result:
The fastest method is probably a cascade of boosted classifiers trained with several variations of your logo, possibly a few rotations, and some negative examples (non-logos). You have to roughly scale your overall image so that the test and training examples approximately match in scale. Unlike SIFT or SURF, which spend a lot of time searching for interest points and creating descriptors for both learning and searching, binary classifiers shift most of the burden to the training stage, so testing or searching is much faster.
In short, the cascade runs so that the very first test discards a large portion of the image. If the first test passes, the others follow and refine. They are very fast, consisting on average of just a few intensity comparisons around each point. Only a few locations pass the whole cascade, and these can be verified with additional tests such as your rotation-correlation routine.
Thus, the classifiers are effective not only because they quickly detect your object but also because they quickly discard non-object areas. To read more about boosted classifiers, see the following OpenCV section.
This problem is generally addressed as logo detection. See this for a similar discussion.
There are many robust methods for template matching. See this or Google for a very detailed discussion.
But from your example I would guess that the following approach would work.
Create a feature for your search image. It essentially consists of a rectangle enclosing the word "CCTV", so the width, height, angle, and individual character features for matching the text could be a suitable choice. (You could also use the image containing "CCTV" itself; in that case the method will not be scale invariant.)
When searching, first detect rectangles. Then use the angle to prune your search space, and use an image transformation to align the rectangles with the axes (this should remove the need for rotation). Then, according to the feature chosen in step 1, match the text content. If you use individual character features, your template matching step is essentially a classification step. Otherwise, if you use the image for matching, you may use cv::matchTemplate.
Hope it helps.
Symbol spotting is more complicated than logo spotting because interest points work poorly on document images such as architectural plans. Many conferences deal with pattern recognition, and each year there are many new algorithms for symbol spotting, so naming the best method is not possible. You could check the IAPR conferences: ICPR, ICDAR, DAS, GREC (Workshop on Graphics Recognition), etc. These researchers focus on this topic: M Rusiñol, J Lladós, S Tabbone, J-Y Ramel, M Liwicki, etc. They work on several techniques for improving symbol spotting, such as vectorial signatures, graph-based signatures and so on (check Google Scholar for more papers).
An easy way to start a new approach is to work with simple shapes such as lines, rectangles, and triangles instead of matching everything at once.
Your example can be recognized by shape matching (contour matching), much faster than 4 minutes.
For a good match, you need good preprocessing and denoising.
Examples can be found at http://www.halcon.com/applications/application.pl?name=shapematch

sift features for "similar" objects

I found that SIFT features are only good for finding the same object in a scene; they do not seem suitable for "similar" objects.
Maybe I am doing something wrong?
Maybe I must use some other descriptors?
Images and results of the SIFT/ASIFT algorithms:
link
Same problem, no matches:
link
link
I found that SIFT features are only good for finding the same object in a scene; they do not seem suitable for "similar" objects.
That is exactly what they do (and not only they; the task is called "wide baseline matching"):
1) For each feature, find the most similar one. This is called a "tentative" or "putative" correspondence.
2) Use RANSAC or a similar method to find the geometric transformation between the sets of correspondences.
So, if you need to find "similar" objects, you have to use another method, like Viola-Jones http://en.wikipedia.org/wiki/Viola%E2%80%93Jones_object_detection_framework
Or (but it will give you a lot of false positives) you can compare the big image to the small one and skip step 2.
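Step 1 above, finding tentative correspondences, can be sketched with NumPy as a brute-force nearest-neighbor search over descriptors. The ratio test that filters ambiguous matches is an addition not mentioned in the answer (it comes from Lowe's SIFT paper). The function name and the 0.8 ratio are assumptions for illustration, and `desc_b` needs at least two rows.

```python
import numpy as np

def tentative_matches(desc_a, desc_b, ratio=0.8):
    """Pair each descriptor in desc_a with its nearest neighbor in desc_b.

    A pair is kept only if the nearest neighbor is clearly closer than the
    second nearest (Lowe's ratio test).
    """
    desc_a = np.asarray(desc_a, dtype=np.float64)
    desc_b = np.asarray(desc_b, dtype=np.float64)
    # Pairwise Euclidean distances, shape (len(desc_a), len(desc_b)).
    dists = np.linalg.norm(desc_a[:, None, :] - desc_b[None, :, :], axis=2)
    order = np.argsort(dists, axis=1)
    matches = []
    for i, (first, second) in enumerate(order[:, :2]):
        if dists[i, first] < ratio * dists[i, second]:
            matches.append((i, int(first)))
    return matches
```

The returned index pairs are what a RANSAC stage would then verify geometrically. Skipping that verification, as the last suggestion describes, leaves only descriptor similarity, which is why more false positives are expected.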
The basic SIFT algorithm using VLFeat gives me this as a result, which, given the small and not very distinctive target image, is a pretty good result I would say.

Image Descriptors with SIFT/VLFEAT

I want to perform a classification task in which I map a given image of an object to one of a list of predefined constellations that the object can be in (i.e. find the most probable match).
In order to get descriptors of the image (on which I will run machine learning algorithms), it was suggested that I use SIFT with the VLFeat implementation.
First of all, my main question: I would like to skip the keypoint detection part of SIFT and use it only for its descriptors. In the tutorial I saw that there is an option to do exactly that by calling
[f,d] = vl_sift(I,'frames',fc) ;
where fc specifies the keypoints. My problem is that I want to explicitly specify the bounding box around the keypoint in which the descriptors are calculated, but it seems I can only specify a scale parameter, which right now is a bit cryptic to me and doesn't let me specify the bounding box explicitly. Is there a way to achieve this?
The second question: does setting the scale manually and getting the descriptors this way make sense (i.e. result in a good descriptor)? Any other suggestions for better ways of getting descriptors (using SIFT with other implementations, or other non-SIFT descriptors)? I should mention that my object is always the only object in the image, is centered, has constant illumination, and changes by some kinds of rotations of its internal parts. This is why I thought SIFT would work, as I understood that it focuses on the orientation of gradients, which would change with the rotations of the object.
Thanks
Agreed about the fact that the descriptor scale looks a bit cryptic.
See the third image in the VLFeat SIFT tutorial where they overlay the extracted descriptors on the image with the following commands
h3 = vl_plotsiftdescriptor(d(:,sel),f(:,sel)) ;
set(h3,'color','g') ;
You can thus play with the scale and see whether the region where the histogram is extracted matches what you expected.
SIFT sounds like it might be overkill for your application if you have that much control over the imaging environment, but it should work.
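As a starting point for choosing the scale, the VLFeat documentation describes the spatial bin width of the descriptor as the keypoint scale multiplied by a magnification factor (3 by default), with 4 bins per side. Under that reading, the descriptor side is roughly 12 times the scale. The Python helper below only does this arithmetic. Its names are hypothetical, and the result should be treated as an approximation to verify with the plotting commands above, since the Gaussian weighting extends the effective support somewhat.

```python
def scale_for_box(box_side_px, magnif=3.0, num_spatial_bins=4):
    """Keypoint scale whose descriptor support is roughly box_side_px wide."""
    return box_side_px / (magnif * num_spatial_bins)

def frame_for_box(center_x, center_y, box_side_px, orientation=0.0):
    """One [x, y, scale, orientation] frame, the per-column layout of vl_sift frames."""
    return [center_x, center_y, scale_for_box(box_side_px), orientation]
```

For example, a 48-pixel box would correspond to a scale of about 4 under these assumptions. The resulting frame would be passed as a column of fc.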
Hey.
It might help to look through the background chapter of this thesis:
http://www.cs.bris.ac.uk/Publications/pub_master.jsp?id=2001260
It would take time for me to explain the scale, so try reading it and see the relevant citations. By the way, in that work the descriptors are used at base resolution, i.e. scale ~ 1.
Hope this helps.
Maybe I did not understand the problem, but if the query image must be matched against a database of training images, and both training and test images are constant in illumination, scale, and so on, maybe SIFT is not necessary here. You could have a look at correlation. Are you using MATLAB?
Here you can see an example using correlation with OpenCV: http://docs.opencv.org/doc/tutorials/imgproc/histograms/template_matching/template_matching.html#template-matching

Resources