I would like to know if there is any code or good documentation available for implementing HOG features. I tried to read the documentation here, but it is quite difficult to understand and it requires an SVM.
What I need is just to implement a HOG detector for objects, similar to what SIFT or SURF do.
By the way, I'm not interested in this work.
Thank you.
You can take a look at
http://szproxy.blogspot.com/2010/12/testtest.html
He also published a "tutorial" for HOG on SourceForge here:
http://sourceforge.net/projects/hogtrainingtuto/?_test=beta
I know this because I'm having the same problem as you. The tutorial isn't really what I would call a tutorial, though: it's a bunch of source code with no documentation. I assume that it works, and it can at least get you somewhere.
In the end, simplifying a bit, all you need to detect specific objects in an image is:
Localize "points of interest" to extract the patches:
To get points of interest you can use an algorithm such as the Harris corner detector, pick them randomly, or use something as simple as sliding windows.
From these points get patches:
You will have to decide on the patch size.
From these patches compute the feature descriptor (such as HOG).
Instead of HOG you can use another feature descriptor such as SIFT or SURF.
Implementing HOG is not too hard. You calculate the gradients of the extracted patch by applying the Sobel X and Y kernels. After that you divide the patch into NxM cells (8x8, for instance) and compute a histogram of gradients for each cell from the gradient angle and magnitude. The following link gives a more detailed explanation:
HOG Person Detector Tutorial
Check your feature vector with the previously trained classifier
Once you have this vector, check whether it is the desired object or not with a previously trained classifier such as an SVM. Instead of an SVM you could use a neural network, for instance.
Implementing an SVM is more difficult, but there are libraries such as OpenCV that you can use.
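The patch-extraction and descriptor steps above can be sketched roughly as follows. This is a simplified illustration in NumPy, not a reference HOG implementation: it uses central differences instead of Sobel kernels, skips block normalisation, and the function names, cell size and bin count are assumptions chosen for the example.

```python
import numpy as np

def sliding_windows(image, size=16, stride=8):
    """Yield (x, y, patch) for every window position (the simplest patch source)."""
    for y in range(0, image.shape[0] - size + 1, stride):
        for x in range(0, image.shape[1] - size + 1, stride):
            yield x, y, image[y:y + size, x:x + size]

def hog_descriptor(patch, cell=8, bins=9):
    """Simplified HOG: one orientation histogram per cell, no block normalisation."""
    patch = patch.astype(np.float64)
    # Central-difference gradients (a stand-in for the Sobel X and Y kernels).
    gx = np.zeros_like(patch)
    gy = np.zeros_like(patch)
    gx[:, 1:-1] = patch[:, 2:] - patch[:, :-2]
    gy[1:-1, :] = patch[2:, :] - patch[:-2, :]
    magnitude = np.hypot(gx, gy)
    # Unsigned orientation in [0, 180) degrees, quantised into `bins` bins.
    angle = np.degrees(np.arctan2(gy, gx)) % 180.0
    bin_index = np.minimum((angle / (180.0 / bins)).astype(int), bins - 1)

    rows, cols = patch.shape[0] // cell, patch.shape[1] // cell
    hist = np.zeros((rows, cols, bins))
    for r in range(rows):
        for c in range(cols):
            sl = (slice(r * cell, (r + 1) * cell), slice(c * cell, (c + 1) * cell))
            # Each pixel votes for its orientation bin, weighted by its magnitude.
            hist[r, c] = np.bincount(bin_index[sl].ravel(),
                                     weights=magnitude[sl].ravel(),
                                     minlength=bins)
    vec = hist.ravel()
    return vec / (np.linalg.norm(vec) + 1e-9)  # global L2 normalisation
```

Each window from sliding_windows would be passed to hog_descriptor, and the resulting vector handed to the trained classifier.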
There is a function extractHOGFeatures in the Computer Vision System Toolbox for MATLAB.
I have 5000 images, and each image can generate a vector with about 1000 dimensions (HOG features). Some of the images are very similar, so I want to remove the similar ones. Is there a way to achieve this?
===============================================================
As #thedarkside ofthemoon suggested, let me explain a little more about what I am trying to do. I am using SVM + HOG features to do image classification. I have prepared some training data, but some of the training images are very similar, so I want to remove the similar ones to reduce the computation cost. I don't know whether removing similar images has a side effect on the final classification rate, so a good criterion of 'similarity' must be found. That's what I am trying to do.
Another way (not using HOG features) is to compute a color histogram for each image and compare it against the others.
For example:
Take the first image and compute its histogram.
Now, for each of the other images, calculate the histogram and compare it with the first one.
If you find a close match on the histogram, you can discard the image. With CV_COMP_CORREL the score is a correlation, so 1 means a perfect match and lower values (down to -1) mean a worse match.
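A rough sketch of that procedure in plain NumPy is shown below. It uses a grayscale histogram rather than a color one to keep the example short, and the correlation function follows the usual Pearson-style definition that CV_COMP_CORREL is based on. The function names, the bin count and the 0.95 threshold are assumptions for illustration only.

```python
import numpy as np

def gray_histogram(image, bins=32):
    """Intensity histogram of an 8-bit image (stand-in for a color histogram)."""
    hist, _ = np.histogram(image, bins=bins, range=(0, 256))
    return hist.astype(np.float64)

def correlation(h1, h2):
    """Correlation between two histograms; 1.0 means identical shape."""
    d1, d2 = h1 - h1.mean(), h2 - h2.mean()
    denom = np.sqrt((d1 ** 2).sum() * (d2 ** 2).sum())
    return float((d1 * d2).sum() / denom) if denom > 0 else 0.0

def drop_similar(images, threshold=0.95):
    """Return indices of images to keep; later near-duplicates are discarded."""
    kept, kept_hists = [], []
    for idx, img in enumerate(images):
        h = gray_histogram(img)
        # Keep the image only if it is not too close to anything already kept.
        if all(correlation(h, k) < threshold for k in kept_hists):
            kept.append(idx)
            kept_hists.append(h)
    return kept
```

Note that two images with very different content can still share a similar histogram, so the threshold needs to be tuned on real data.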
Well, it depends on what you mean by similar. Currently my favorite image similarity descriptor is the GIST descriptor.
http://people.csail.mit.edu/torralba/code/spatialenvelope/
It is not in OpenCV. However, it is implemented in C here, so it can be added to a C++ project (extern "C") if you're using the C++ OpenCV API. I'm not sure about Python, sorry.
http://people.rennes.inria.fr/Herve.Jegou/software.html
I have found this to be pretty good, and quite efficient.
(Sorry, this is not a direct OpenCV solution, but I feel it is a reasonable answer, as the GIST C code can be added to a C++ project and works nicely.)
EDIT:
If you just want to remove the images with similar HOG descriptors, you can use:
http://docs.opencv.org/modules/ml/doc/k_nearest_neighbors.html
or
http://docs.opencv.org/trunk/modules/flann/doc/flann_fast_approximate_nearest_neighbor_search.html
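To illustrate the idea behind the nearest-neighbour approach without depending on either library, here is a minimal brute-force sketch in NumPy. It is not the OpenCV k-nearest-neighbours or FLANN API; the function name and the distance threshold are hypothetical, and for 5000 descriptors of about 1000 dimensions a brute-force search like this should still be workable.

```python
import numpy as np

def remove_near_duplicates(descriptors, max_distance):
    """Greedy filter: keep a descriptor only if no kept one lies within max_distance."""
    descriptors = np.asarray(descriptors, dtype=np.float64)
    kept = []
    for i, d in enumerate(descriptors):
        if kept:
            # Euclidean distance to every descriptor kept so far.
            dists = np.linalg.norm(descriptors[kept] - d, axis=1)
            if dists.min() <= max_distance:
                continue  # too close to an existing one: treat as a duplicate
        kept.append(i)
    return kept
```

The returned indices identify the images to keep for training. The result depends on the order of the input, since the first image of each similar group is the one that survives.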
I want to train on my own data and use the HOG algorithm to detect pedestrians.
Right now I can use defaultHog.setSVMDetector(HOGDescriptor::getDefaultPeopleDetector()); in OpenCV for detection, but the results are not very good on my test video. So I want to train with my own database.
I have prepared 1000+ positive samples and 1000+ negative samples. They are cropped to 50 * 100, and I have made the list file.
I have read some tutorials on the internet, but they are all complex and sometimes abstruse. Most of them analyze the source code and the HOG algorithm, with few examples and little straightforward analysis.
Some instructions show that libsvm\windows\svm-train.exe can be used for training. Can anyone give an example for 1000+ positive samples of size 50*100?
For example, with haartraining we can do it from OpenCV with something like haartraining.exe –a –b plus some parameters, and get a *.xml file as a result that is then used for people detection.
Or is there any other method for training and detection?
I would prefer to know how to use it and the detailed procedure. The details of the algorithm are not important to me; I just want to implement it.
If anyone knows about this, please give me some tips.
I provided some sample code and instructions to start training your own HOG descriptor using OpenCV:
See https://github.com/DaHoC/trainHOG/wiki/trainHOG-Tutorial.
The algorithm is indeed too complex to describe briefly. The basic idea, however, is to:
Extract HOG features from negative and positive sample images of identical size and type.
Use the extracted feature vectors along with their respective classes to train an SVM classifier. In this step you can use svm-train.exe with a generated file of the correct format containing the feature vectors and their classes (or directly include the libsvm library and call it from your sources).
The resulting SVM model and support vectors are combined into a single descriptor vector that can be used with the OpenCV detector.
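The last step can be illustrated with a small NumPy sketch. For a linear SVM whose decision function is the sum of coef_i * dot(sv_i, x) minus rho (the form libsvm uses), the support vectors collapse into one weight vector plus a bias. The function names and array layout below are hypothetical, and whether a particular detector expects this sign convention is an assumption you should verify against your own training labels.

```python
import numpy as np

def linear_svm_to_detector(support_vectors, dual_coefs, rho):
    """Collapse a linear SVM into one vector: weights followed by the bias term."""
    # w = sum_i coef_i * sv_i, computed as a vector-matrix product.
    w = dual_coefs @ support_vectors
    return np.append(w, -rho)

def decision_from_detector(detector, x):
    """Score a feature vector with the collapsed detector."""
    return float(detector[:-1] @ x + detector[-1])

def decision_from_support_vectors(support_vectors, dual_coefs, rho, x):
    """Same score computed the long way, from every support vector."""
    return float(dual_coefs @ (support_vectors @ x) - rho)
```

Both decision functions give the same score, but the collapsed form needs a single dot product per window, which is why it is the convenient form for a sliding-window detector.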
Best regards
When detecting objects using SURF, how can I plot a graph of false positives and hits using the good matches and several keypoints?
(A) How do I get statistics for the good matches, i.e. an ROC plot or the true positives vs. false positives of detection, from so many of the line descriptors? Can somebody post code for plotting true positive vs. false positive statistics?
(B) Secondly, there are many resources (vdo1, vdo2), implementations, and papers ("Object tracking using improved Camshift with SURF method"; "A Study on Moving Object Tracking Algorithm Using SURF Algorithm and Depth Information") which say that SURF and SIFT can be used for tracking in combination with Camshift or meanshift.
But what I fail to understand is this: we need a prediction algorithm like the Kalman filter, or a tracking algorithm like Camshift, meanshift or template differencing (not sure), for tracking. So how come some video implementations and tutorials say that Lucas-Kanade optical flow, SIFT or SURF is tracking objects, whereas the papers combine them with either Camshift or meanshift? Am I missing some conceptual matter?
I would be obliged for pointers and a detailed explanation of how SURF, SIFT or other feature-based methods can be used for tracking on their own.
Lucas-Kanade with pyramids (pyrLK) is a method that looks for a small shift in a single feature location. It can do this for many features at once. Camshift and meanshift track a statistic for a group of features. You can also just try to use a matcher to find where the features went in the next frame. GoodFeaturesToTrack, SIFT and SURF are algorithms that find points that should be easy to locate and to tell apart from one another. SURF and SIFT also include descriptors, which characterise those features in a way that can ignore a size change, an orientation change, or both.
A Kalman filter is used to refine your results. It can shrink the area where the answer should lie, because the algorithms above are not perfect.
As for the code, I haven't done much tracking except Shi-Tomasi + pyrLK, so I don't think I can help.
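For part (A) of the question, one possible way to get true positive vs. false positive statistics is sketched below. It assumes you have, for each test image, a detection score (for example the number of good matches) and a ground-truth label saying whether the object is really present. Both inputs and the function names are hypothetical, and tied scores are not treated specially.

```python
import numpy as np

def roc_points(scores, labels):
    """Return (fpr, tpr) arrays obtained by sweeping a threshold over the scores."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=bool)
    # Sort from the most confident detection to the least confident one.
    order = np.argsort(-scores, kind="stable")
    labels = labels[order]
    tp = np.cumsum(labels)    # true positives if we accept the top-k detections
    fp = np.cumsum(~labels)   # false positives for the same cut-off
    tpr = np.concatenate(([0.0], tp / max(labels.sum(), 1)))
    fpr = np.concatenate(([0.0], fp / max((~labels).sum(), 1)))
    return fpr, tpr

def area_under_curve(fpr, tpr):
    """Trapezoidal area under the ROC curve."""
    return float(np.sum(np.diff(fpr) * (tpr[1:] + tpr[:-1]) / 2.0))
```

The two arrays can be passed to any plotting library, with fpr on the x-axis and tpr on the y-axis, to draw the ROC curve.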
I am working on a simple playing card detection program.
For now I have a working SIFT algorithm from here.
And I have created some bounding boxes around the cards.
Then I ran SIFT on the card to be searched for and saved the descriptors.
But what should I do next? Do I have to make a mask of the object and move it through the bounding box while running SIFT at every step?
I couldn't find any tutorial on how to do that exactly.
Hope someone can help me!
Greets Max
Edit: I want to recognize every card, so that I can say something like: it's a 7 of hearts.
SIFT is just the beginning.
SIFT is a routine for obtaining interest points on an object. You have to use a bag-of-words approach: cluster the SIFT features you collected and represent each feature in terms of your cluster means. Then represent each card as a histogram over these cluster means (a.k.a. a bag of words).
Once you have the representation of the cards ready (what #nimcap says), you need to do the recognition itself. You can try nearest neighbors, an SVM, etc.
Also, for a better description (more technical) of what to do you might want to look at Lowe's original 2004 SIFT paper.
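The bag-of-words representation described above can be sketched as follows. This is a minimal NumPy illustration with a basic k-means loop; the descriptors are assumed to be already extracted as rows of an array, and the function names, cluster count and iteration count are hypothetical choices.

```python
import numpy as np

def assign(data, centers):
    """Index of the nearest cluster center for every descriptor."""
    d = np.linalg.norm(data[:, None, :] - centers[None, :, :], axis=2)
    return d.argmin(axis=1)

def kmeans(data, k, iterations=20, seed=0):
    """Plain Lloyd's k-means; returns the k cluster means (the 'visual words')."""
    rng = np.random.default_rng(seed)
    centers = data[rng.choice(len(data), size=k, replace=False)].astype(float)
    for _ in range(iterations):
        labels = assign(data, centers)
        for j in range(k):
            members = data[labels == j]
            if len(members):  # leave a center untouched if its cluster is empty
                centers[j] = members.mean(axis=0)
    return centers

def bow_histogram(descriptors, centers):
    """Represent one image as a normalised histogram over the visual words."""
    counts = np.bincount(assign(descriptors, centers), minlength=len(centers))
    return counts / counts.sum()
```

In this setup kmeans would be run once on the descriptors pooled from all training cards, and bow_histogram once per card. The histograms are what you would feed to the nearest-neighbor or SVM classifier.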
Is SIFT the best approach for something like this?
As opposed to Haar classifiers or just simple template matching.
E.g. http://digital.liby.waikato.ac.nz/conferences/ivcnz07/papers/ivcnz07-paper51.pdf
I have a basic understanding of image processing and am now studying the book "Digital Image Processing" by Gonzalez in depth.
Given an image in which the approximate form of the object of interest is known (e.g. circle, triangle),
what is the best algorithm / method to find this object in the image?
The object can be slightly deformed, so a brute-force approach will not help.
You may try using Histograms of Oriented Gradients (also called Edge Orientation Histograms). We have used them for detecting road signs. http://en.wikipedia.org/wiki/Histogram_of_oriented_gradients and the papers by Bill Triggs should get you started.
I recommend you use the Hough transform, which allows you to find any given pattern described by an equation. What's more, the Hough transform also works well for deformed objects.
The algorithm and its implementation are quite simple.
More details can be found here: http://en.wikipedia.org/wiki/Hough_transform. Source code for this algorithm is even included on a referenced page (http://www.rob.cs.tu-bs.de/content/04-teaching/06-interactive/HNF.html).
I hope that helps you.
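To give an idea of how simple the voting scheme is, here is a minimal NumPy sketch of the Hough transform for straight lines only, using the rho = x*cos(theta) + y*sin(theta) parameterisation. Circles or other shapes need a different parameter space. The function names and the one-degree angular resolution are assumptions for the example.

```python
import numpy as np

def hough_lines(edges, n_theta=180):
    """Accumulate votes in (rho, theta) space for every non-zero edge pixel."""
    ys, xs = np.nonzero(edges)
    diag = int(np.ceil(np.hypot(*edges.shape)))  # largest possible |rho|
    thetas = np.deg2rad(np.arange(n_theta) * 180.0 / n_theta)
    acc = np.zeros((2 * diag + 1, n_theta), dtype=np.int64)
    cols = np.arange(n_theta)
    for x, y in zip(xs, ys):
        # One vote per angle: the line through (x, y) with that normal direction.
        rhos = np.round(x * np.cos(thetas) + y * np.sin(thetas)).astype(int) + diag
        acc[rhos, cols] += 1
    return acc, thetas, diag

def strongest_line(edges):
    """Return (rho, theta in degrees) of the accumulator cell with the most votes."""
    acc, thetas, diag = hough_lines(edges)
    r, t = np.unravel_index(acc.argmax(), acc.shape)
    return int(r - diag), float(np.rad2deg(thetas[t]))
```

Because nearby angles can collect the same number of votes, the reported angle is only accurate to about the resolution of the accumulator.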
I would look at your problem in two steps:
first finding your object's outer boundary:
I'm supposing your image has enough contrast that you can easily threshold it to get a binary image of your object. You then need to extract the chain code of the object boundary.
then analyzing the boundary's shape to deduce the form (circle, polygon,...):
You can calculate the curvature in each point of the boundary chain and thus determine how many sharp angles (i.e. high curvature value) there are in your shape. Several sharp angles means you have a polygon, none means you have a circle (constant curvature).
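The second step can be approximated with a turning-angle measure, as in the sketch below. It assumes the boundary is already available as an ordered, closed list of (x, y) points (for example decoded from the chain code), and it uses the angle between the incoming and outgoing directions as a stand-in for curvature. The function names, the step of 5 points and the 45 degree threshold are hypothetical and would need tuning on real, noisy contours.

```python
import numpy as np

def turning_angles(boundary, step=5):
    """Angle (degrees) between the incoming and outgoing direction at each point."""
    pts = np.asarray(boundary, dtype=float)
    back = pts - np.roll(pts, step, axis=0)    # p[i] - p[i - step]
    fwd = np.roll(pts, -step, axis=0) - pts    # p[i + step] - p[i]
    cross = back[:, 0] * fwd[:, 1] - back[:, 1] * fwd[:, 0]
    dot = (back * fwd).sum(axis=1)
    return np.abs(np.degrees(np.arctan2(cross, dot)))

def count_corners(boundary, step=5, min_angle=45.0):
    """Count runs of consecutive high-turning points on the closed boundary."""
    sharp = turning_angles(boundary, step) >= min_angle
    # A run starts where a sharp point follows a non-sharp one (circularly).
    starts = sharp & ~np.roll(sharp, 1)
    return int(starts.sum())
```

A count of zero suggests a circle-like shape, while three or four suggests a triangle or a quadrilateral.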
You can find a description on how to get your object's boundary from the binary image and ways of analysing it in Gonzalez's Digital Image Processing, chapter 11.
I also found this insightful presentation on binary image analysis (PPT) and a MATLAB script that implements some of the techniques that Gonzalez talks about in DIP.
I strongly recommend that you use OpenCV. It's a great library that helps with almost anything related to computer vision. Their website isn't really attractive or helpful, but the API is really powerful.
A book that helped me a lot since there isn't a load of documentation on the web is Learning OpenCV. The documentation that comes with the API is good, but not great for learning how to use it.
For your problem, you could use a Canny edge detector to find the border of your item and then analyse it, or you could proceed with a Hough transform to search for lines and/or circles.
You can look specifically at 'face recognition', because that is a specific, well-covered topic, and likewise at 'face detection', etc. EmguCV can be useful for you. It is a .NET wrapper for the Intel OpenCV image processing library.
It looks like Professor Jean Rouat from the University of Sherbrooke has found a way to find objects in images by processing them with a spiking neural network. His technology, named RN-SPIKES, seems to be available for licensing.