How does HOG feature descriptor training work? - opencv

There don't seem to be any implementations of HOG training in OpenCV, and there are few sources explaining how HOG training works. From what I gathered, HOG training can be done in real time. But what are the requirements for training? How does the training process actually work?

As with most computer vision algorithms, Google Scholar is your friend :) I would suggest reading a few papers on how it works. Here is one of the most referenced papers on HOG for you to start with.
Another tip when researching computer vision is to note the authors of the papers you find interesting and try to find their websites. They tend to have an implementation of their algorithms as well as rules of thumb on how to use them. Also, look up the references that are cited in the paper about your algorithm. This can be very helpful in acquiring the background knowledge to truly understand how the algorithm works and why.

Your terminology is a bit mixed up. HOG is a feature descriptor, so it is not trained itself. You can train a classifier on HOG features, and that classifier can in turn be used for object detection. OpenCV includes a people detector that uses HOG features and an SVM classifier. It also includes CascadeClassifier, which can use HOG, and which is typically used for face detection.
There is a program in OpenCV called opencv_traincascade, which lets you train a cascade object detector, and which gives you the option to use HOG. There is a function in the Computer Vision System Toolbox for MATLAB called trainCascadeObjectDetector, which does the same thing.
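To make the descriptor-versus-classifier distinction concrete, here is a minimal sketch in plain Python. It is not OpenCV's implementation: hog_cell computes a single-cell orientation histogram (real HOG adds a grid of cells and block normalization), and train_perceptron is a toy stand-in for the SVM. All function names and the synthetic patches are made up for illustration.

```python
import math

def hog_cell(patch, bins=9):
    """Magnitude-weighted histogram of unsigned gradient orientations for one cell."""
    h, w = len(patch), len(patch[0])
    hist = [0.0] * bins
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = patch[y][x + 1] - patch[y][x - 1]  # centered differences
            gy = patch[y + 1][x] - patch[y - 1][x]
            mag = math.hypot(gx, gy)
            angle = math.degrees(math.atan2(gy, gx)) % 180.0  # unsigned orientation
            hist[int(angle // (180.0 / bins)) % bins] += mag
    norm = math.sqrt(sum(v * v for v in hist)) or 1.0  # L2 normalization
    return [v / norm for v in hist]

def train_perceptron(samples, labels, epochs=20):
    """Toy linear classifier (labels are +1 / -1), standing in for the SVM."""
    w, b = [0.0] * len(samples[0]), 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            score = sum(wi * xi for wi, xi in zip(w, x)) + b
            if y * score <= 0:  # misclassified: nudge the boundary
                w = [wi + y * xi for wi, xi in zip(w, x)]
                b += y
    return w, b

def predict(w, b, x):
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else -1

# Synthetic 6x6 patches: a vertical edge (positive) and a horizontal edge (negative).
vertical_edge = [[0, 0, 0, 9, 9, 9]] * 6
horizontal_edge = [[0] * 6] * 3 + [[9] * 6] * 3

pos = hog_cell(vertical_edge)    # step 1: describe each patch
neg = hog_cell(horizontal_edge)
w, b = train_perceptron([pos, neg], [1, -1])  # step 2: train on the descriptors
print(predict(w, b, pos), predict(w, b, neg))
```

The "training" happens entirely in the second step. The descriptor is a fixed computation with no learned parameters.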

Related

Using Caffe to classify "hand-crafted" image features

Does it make any sense to perform feature extraction on images using, e.g., OpenCV, then use Caffe for classification of those features?
I am asking this as opposed to the traditional way of passing the images directly to Caffe, and letting Caffe do the extraction and classification procedures.
Yes, it does make sense, but it may not be the first thing you want to try:
If you have already extracted hand-crafted features that are suitable for your domain, there is a good chance you'll get satisfactory results by using an easier-to-use machine learning tool (e.g. libsvm).
Caffe can be used in many different ways with your features. If they are low-level features (e.g. Histogram of Oriented Gradients), then several convolutional layers may be able to extract the appropriate mid-level features for your problem. You may also use Caffe as an alternative non-linear classifier (instead of an SVM). You have the freedom to try (too) many things, but my advice is to first try a machine learning method with a smaller meta-parameter space, especially if you're new to neural nets and Caffe.
Caffe is a tool for training and evaluating deep neural networks. It is quite a versatile tool allowing for both deep convolutional nets as well as other architectures.
Of course it can be used to process pre-computed image features.

How to approach a machine learning programming competition

Many machine learning competitions are held on Kaggle, where you are given a training set with a set of features and a test set whose output labels must be predicted using the training set.
It is pretty clear that supervised learning algorithms like decision trees, SVMs, etc. are applicable here. My question is how I should start to approach such problems: should I start with a decision tree, an SVM, or some other algorithm, or is there another approach altogether? In other words, how do I decide?
I had never heard of Kaggle until reading your post. Thank you so much, it looks awesome. Upon exploring their site, I found a section that will guide you well. On the competitions page (click "All Competitions"), you will see Digit Recognizer and Facial Keypoints Detection. Both are competitions, but they exist for educational purposes, and tutorials are provided (a tutorial isn't available for Facial Keypoints Detection yet, as the competition is in its infancy). In addition to the general forums, each competition has its own forum, which I imagine is very helpful.
If you're interested in the mathematical foundations of machine learning, and are relatively new to it, may I suggest Bayesian Reasoning and Machine Learning. It's no cakewalk, but it's much friendlier than its counterparts, without a loss of rigor.
EDIT:
I found the tutorials page on Kaggle, which seems to be a summary of all of their tutorials. Additionally, scikit-learn, a Python library, offers a ton of descriptions and explanations of machine learning algorithms.
This cheatsheet http://peekaboo-vision.blogspot.pt/2013/01/machine-learning-cheat-sheet-for-scikit.html is a good starting point. In my experience, combining several algorithms can often give better results, e.g. logistic regression and an SVM where the output of each one is given a predefined weight. And test, test, test ;)
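The weighted combination mentioned above can be sketched as follows. The probabilities are hypothetical model outputs, and the 0.6/0.4 weights are arbitrary. In practice you would pick the weights on a held-out validation set.

```python
def weighted_vote(prob_lists, weights):
    """Blend per-model positive-class probabilities using fixed weights."""
    total = sum(weights)
    n = len(prob_lists[0])
    return [
        sum(w * probs[i] for w, probs in zip(weights, prob_lists)) / total
        for i in range(n)
    ]

# Hypothetical positive-class probabilities from two already-trained models.
logreg_probs = [0.9, 0.4, 0.2]
svm_probs = [0.7, 0.6, 0.1]

blended = weighted_vote([logreg_probs, svm_probs], weights=[0.6, 0.4])
labels = [1 if p >= 0.5 else 0 for p in blended]
print(labels)  # [1, 0, 0]
```

Note that the second sample flips to class 0 even though the SVM alone would have voted 1, because the logistic regression carries more weight.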
There is No Free Lunch in data mining. You won't know which methods work best until you try lots of them.
That being said, there is also a trade-off between understandability and accuracy in data mining. Decision Trees and KNN tend to be understandable, but less accurate than SVM or Random Forests. Kaggle looks for high accuracy over understandability.
It also depends on the number of attributes. Some learners can handle many attributes, like SVM, whereas others are slow with many attributes, like neural nets.
You can shrink the number of attributes by using PCA, which has helped in several Kaggle competitions.
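As a rough illustration of what PCA does, here is a sketch that finds only the first principal component by power iteration on the covariance matrix, using plain Python. The function names and data are made up, and a real project would normally use a library implementation instead.

```python
import math

def first_principal_component(rows, iterations=100):
    """Return (column means, unit vector of largest variance) via power iteration."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
           for i in range(d)]
    # Assumes the data has non-zero variance and that the start vector
    # is not orthogonal to the leading eigenvector.
    v = [1.0] * d
    for _ in range(iterations):
        v = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in v))
        v = [x / norm for x in v]
    return means, v

def project(rows, means, component):
    """Reduce each row to a single coordinate along the component."""
    return [sum((r[j] - means[j]) * component[j] for j in range(len(means)))
            for r in rows]

# Two perfectly correlated attributes (the second is twice the first).
data = [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0], [4.0, 8.0]]
means, component = first_principal_component(data)
reduced = project(data, means, component)
print(component, reduced)
```

Here two attributes collapse into one without losing information, because the second attribute is fully determined by the first.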

OpenCV: Training a soft cascade classifier

I've built an algorithm for pedestrian detection using OpenCV tools. To perform classification I use a boosted classifier trained with the CvBoost class.
The problem with this implementation is that I need to feed my classifier the whole set of features I used for training. This makes the algorithm extremely slow, so much so that each image takes around 20 seconds to be fully analysed.
I need a different detection structure, and OpenCV has a Soft Cascade class that seems like exactly what I need. Its basic principle is that there is no need to examine all the features of a test sample, since a detector can reject most negative samples using a small number of features. The problem is that I have no idea how to train one given a fully labeled set of negative and positive examples.
I can find no information about this online, so I am looking for any tips you can give me on how to use this soft cascade to perform classification.
Best regards
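The early-rejection principle described in the question can be sketched in plain Python. This does not use OpenCV's soft cascade class and does not cover training: the decision stumps and rejection thresholds below are hand-picked, hypothetical values, whereas in a real soft cascade they would be calibrated from the labeled training set.

```python
def soft_cascade_score(sample, weak_learners, rejection_thresholds):
    """Accumulate weak-learner votes, stopping as soon as the running score
    drops below the rejection threshold for that stage.
    Returns (accepted, score, number_of_features_evaluated)."""
    score = 0.0
    stages = zip(weak_learners, rejection_thresholds)
    for n, (learner, threshold) in enumerate(stages, start=1):
        score += learner(sample)
        if score < threshold:
            return False, score, n  # early exit: remaining features are skipped
    return True, score, len(weak_learners)

def stump(index, cut, vote):
    """Decision stump: +vote if the feature exceeds the cut, otherwise -vote."""
    return lambda features: vote if features[index] > cut else -vote

# Hypothetical boosted classifier with three weak learners.
weak_learners = [stump(0, 0.5, 1.0), stump(1, 0.5, 0.8), stump(2, 0.5, 0.6)]
rejection_thresholds = [-0.5, -0.2, 0.0]

rejected = soft_cascade_score([0.1, 0.9, 0.9], weak_learners, rejection_thresholds)
accepted = soft_cascade_score([0.9, 0.9, 0.9], weak_learners, rejection_thresholds)
print(rejected[0], rejected[2])  # False 1
print(accepted[0], accepted[2])  # True 3
```

The first sample is rejected after a single feature, which is where the speed-up comes from. Only samples that keep passing have to pay for the full feature set.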

methods of face detection?

I want to know the best method of face detection because I'm working on an application that predicts emotion from faces.
Before analyzing the facial expression of a face, whether still or moving, the face must be detected or tracked so that the relevant information can be extracted.
Several detection methods exist, but which is best in my case?
A fast and easy way to get started with face detection is to use OpenCV's Haar detection methods (a slightly modified version of the Viola-Jones face detection algorithm, IIRC). OpenCV ships pre-trained Haar cascade classifiers for entire faces and for individual face components, e.g. eyes, nose, etc. You can also train your own if you feel so inclined. Haar features also have the advantage of being very fast to compute, so they are quite usable with video (which it sounds like you'll be using). Also, having the individual face components classified may simplify your emotion detection/prediction algorithm.
You can find the OpenCV documentation detailing Haar feature-based object recognition at http://docs.opencv.org/modules/objdetect/doc/cascade_classification.html#viola01
and an example of performing face detection at http://code.opencv.org/projects/opencv/repository/revisions/master/entry/samples/cpp/dbt_face_detection.cpp
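To give a feel for why Haar features are fast, here is a sketch of the integral-image trick in plain Python. It is an illustration of the idea, not OpenCV's code, and the function names and tiny image are made up. Once the integral image is built, the sum over any rectangle costs four lookups regardless of its size.

```python
def integral_image(img):
    """ii[y][x] holds the sum of all pixels above and to the left of (x, y)."""
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii

def rect_sum(ii, x, y, w, h):
    """Sum of the w-by-h rectangle with top-left corner (x, y): four lookups."""
    return ii[y + h][x + w] - ii[y][x + w] - ii[y + h][x] + ii[y][x]

def two_rect_feature(ii, x, y, w, h):
    """Horizontal two-rectangle Haar-like feature: left half minus right half."""
    half = w // 2
    return rect_sum(ii, x, y, half, h) - rect_sum(ii, x + half, y, half, h)

# Tiny image with a dark left half and a bright right half.
img = [[1, 1, 5, 5],
       [1, 1, 5, 5]]
ii = integral_image(img)
print(rect_sum(ii, 0, 0, 4, 2))          # 24
print(two_rect_feature(ii, 0, 0, 4, 2))  # -16
```

A strongly negative or positive response marks an intensity edge at that position. The cascade combines many such cheap responses at different positions and scales.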
As for the emotion detection, that's an open research question, so anything you try will likely be fairly involved. If you're into that sort of thing, some good papers to look over might be http://www.utdallas.edu/dept/eecs/research/researchlabs/msp-lab/publications/Busso_2004.pdf and http://humansensing.cs.cmu.edu/papers/Automated.pdf

Is Haar Cascade the only available technique for image recognition in OpenCV

I know that there are many detection techniques in OpenCV, such as SURF, STAR, ORB, etc., but those techniques are for feature detection in a new video feed, not for dealing with specific instances of objects that require prior learning. OpenCV's documentation isn't easy to flip through, and I have yet to find anything besides Haar, which I know works best for face recognition.
So are there any other techniques besides Haar? The Haar technique dates back to research 10 years ago, so ideally I hope that there have been some more advances since then that have been implemented in OpenCV.
If you are looking for OpenCV machine learning type algorithms, check out this link.
For a state-of-the-art on-the-fly object detection algorithm, have a look at OpenTLD. It uses bounding boxes and random forests to learn about an object over time. Check out the demo video here.
Also check out the matching_to_many_images.cpp sample from OpenCV. It uses feature descriptors to match objects, much like Google Goggles works. A related example is the bagofwords_classification.cpp sample, which may be what you are looking for in this case. It uses feature detectors (SURF, SIFT, etc.) to describe an image, quantizes the descriptors against a learned vocabulary of "visual words", and classifies the resulting histogram. Have a look also at this tutorial from MIT.
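The bag-of-words idea can be sketched in plain Python, independent of the OpenCV sample. The two-dimensional descriptors and the three-word vocabulary below are hypothetical. In a real pipeline the descriptors would come from something like SIFT or SURF, and the vocabulary from clustering the training descriptors.

```python
def nearest_word(descriptor, vocabulary):
    """Index of the vocabulary entry closest to the descriptor (squared L2)."""
    return min(
        range(len(vocabulary)),
        key=lambda k: sum((d - c) ** 2 for d, c in zip(descriptor, vocabulary[k])),
    )

def bow_histogram(descriptors, vocabulary):
    """Normalized count of how often each visual word occurs in one image."""
    hist = [0] * len(vocabulary)
    for d in descriptors:
        hist[nearest_word(d, vocabulary)] += 1
    total = sum(hist) or 1
    return [count / total for count in hist]

# Hypothetical vocabulary (cluster centers) and descriptors from one image.
vocabulary = [[0.0, 0.0], [1.0, 1.0], [0.0, 1.0]]
descriptors = [[0.1, 0.0], [0.9, 1.1], [1.0, 0.9], [0.1, 0.8]]

hist = bow_histogram(descriptors, vocabulary)
print(hist)  # [0.25, 0.5, 0.25]
```

The fixed-length histogram is what gets passed to a classifier, so images with different numbers of keypoints still produce feature vectors of the same size.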
The latentsvmdetect.cpp may also be a good starting point for you.
Hope that helps!
