I am working on a project at my school to detect how many students are in the classroom, like in this picture.
I have been trying to use a Haar cascade in OpenCV for face detection in order to detect people, but the results are very bad, like this:
I took thousands of pictures in the classroom and cropped the people out manually. There are about 4000 positive samples and 12000 negative samples. I was wondering what I did wrong?
When I crop the images, should I crop only the head, like this?
Or like this, with the body?
I think I had enough training samples, and I followed the exact procedure from this post, which should work:
http://note.sonots.com/SciSoftware/haartraining.html#v6f077ba
Or should I use a different algorithm, like HOG + SVM? Any suggestion would be great; I have been stuck on this for months and don't have any clue. Thanks a lot!
Haar is better suited to human faces. HOG with SVM is the classic approach for human detection, and there are lots of sources and blog posts about it; it's not hard to train a classifier. For your scene, I think "head and shoulders" is better than "head alone". But your multi-view samples increase the difficulty; a front-facing camera would be better. Add more hard negative samples if you keep getting many false positives.
This paper may help:
http://irip.buaa.edu.cn/~zxzhang/papers/icip2009-1.pdf
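For a quick baseline, OpenCV ships a pre-trained HOG + linear SVM people detector you can run in a few lines. A minimal sketch, assuming "classroom.jpg" is one of your photos (note the built-in model is trained on full, upright pedestrians, so seated students will likely still need a custom-trained detector):

```python
import cv2

# Built-in HOG descriptor with OpenCV's default pedestrian SVM weights.
hog = cv2.HOGDescriptor()
hog.setSVMDetector(cv2.HOGDescriptor_getDefaultPeopleDetector())

img = cv2.imread("classroom.jpg")
# winStride, padding, and scale trade speed against recall; tune per scene.
rects, weights = hog.detectMultiScale(img, winStride=(8, 8),
                                      padding=(8, 8), scale=1.05)
for (x, y, w, h) in rects:
    cv2.rectangle(img, (x, y), (x + w, y + h), (0, 255, 0), 2)
cv2.imwrite("detections.jpg", img)
```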
Normally, with a Haar cascade, the results are very different when you change the training parameters. In my case, the object was very simple, but it could not be detected either:
opencv haar cascade for Varikont detection
When I changed the parameters, it detected very nicely.
You can have a look here: https://inspirit.github.io/jsfeat/sample_haar_face.html
Or, for something more specialized and more professional, you can research Bag of Visual Words (SIFT/SURF + SVM or SGD).
Personally, I don't think you need such a complex method for person detection.
Regards,
I am working on a project that detects people and identifies whether they are wearing protective goggles. Right now I am using traditional HOG features to detect the human body, based on Dalal's algorithm. After testing my data (80% used for training and 20% for testing), my application gave me a confusion matrix like this:
confusion matrix.
The result seems to be good but when I use my detector to detect human, It gave me a result like this:
result of human detection.
The detector performs even worse on other pictures.
May I ask where the problem is: is it in the classifier or in my detector?
Sorry, I do not have the privilege to post an image here.
You are on the right track here, and you can improve the results further. You may get an accuracy of around 90%, and if you really push hard, 93-94% (again, depending on the number of images you have for training and how similar they are to the actual use case).
Okay, back to the answer. You have to use hard negative mining to reduce false positives (i.e., detecting a person when there is none): take all the false positives, add them to the negative class, and retrain the classifier. This will improve the results.
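A minimal sketch of that loop, where `detect`, `hog_features`, `train_svm`, and the feature lists are placeholders for your own detector, feature extractor, and training code, and `negative_images` are scenes guaranteed to contain no people:

```python
import cv2

hard_negatives = []
for img in negative_images:
    # On a person-free image, every detection is by definition a false positive.
    for (x, y, w, h) in detect(img):
        patch = cv2.resize(img[y:y + h, x:x + w], (64, 128))  # standard HOG window
        hard_negatives.append(hog_features(patch))

# Fold the mined false positives into the negative set and retrain.
negative_features.extend(hard_negatives)
svm = train_svm(positive_features, negative_features)
```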
Hope this helps.
I have a set of images of a particular object. I want to find out, using a machine learning algorithm, whether some of them have anomalies. For example, if I have many photos of glasses, I want to find out whether one of them is broken or has something anomalous. Something like this:
GOOD!!
BAD!!
(Obviously I will use the same kind of glasses...)
The problem is that I don't know every negative situation, so, for training, I have only positive images.
In other words, I want an algorithm that recognizes whether an image differs from the dataset. Do you have any suggestions?
In particular, is there a way to use a convolutional neural network?
What you are looking for is usually called anomaly, outlier, or novelty detection. You have lots of examples of what your data should look like, and you want to know when something doesn't look like your data.
A good approach for this problem, since you are working with images, is to extract feature vectors using a CNN pre-trained on ImageNet, and then run an anomaly detector on that feature set. Isolation Forest should be an easy one to get working.
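A minimal sketch of that pipeline, assuming `good_image_paths` and `test_image_paths` are lists of file paths you supply (both names are placeholders):

```python
import numpy as np
from tensorflow.keras.applications.resnet50 import ResNet50, preprocess_input
from tensorflow.keras.preprocessing import image
from sklearn.ensemble import IsolationForest

# Pre-trained CNN as a fixed feature extractor (global-average-pooled).
extractor = ResNet50(weights="imagenet", include_top=False, pooling="avg")

def features(paths):
    batch = np.stack([
        image.img_to_array(image.load_img(p, target_size=(224, 224)))
        for p in paths
    ])
    return extractor.predict(preprocess_input(batch))

# Fit the anomaly detector on features of "good" images only.
forest = IsolationForest(contamination=0.01, random_state=0)
forest.fit(features(good_image_paths))

# predict() returns -1 for outliers (possible broken glasses), +1 for inliers.
labels = forest.predict(features(test_image_paths))
```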
This is a typical classification problem; I do not understand why you need a CNN for this. My suggestion would be to build/train a classification model comprising only GOOD images of glasses. Here you would possibly have all kinds of glasses that are intact, with a regular shape. If the model encounters anything other than GOOD images, it will classify those as BAD images. These so-called BAD images may include cracked/broken glasses with an irregular shape.
Another option that might work is to use an autoencoder.
Autoencoders are unsupervised neural networks with a bottleneck architecture that try to reconstruct their own input.
We could train a deep convolutional autoencoder with examples of good glasses so that it becomes specialized in reconstructing that type of image. You don't need to train the autoencoder with bad glasses.
Therefore I would expect the trained autoencoder to produce a low reconstruction error for good glasses and a high error for bad glasses. The error can be measured with MSE, based on the difference between the reconstructed and original values (pixels).
From the trained autoencoder you can plot the MSEs for good vs. bad glasses to help you define the right threshold. Or you can try statistical thresholds such as avg + 2*std, median + 2*MAD, etc.
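A minimal sketch in Keras, assuming `train_images` is a float32 array of shape (N, 64, 64, 1) containing good-glass images scaled to [0, 1] (the image size and layer sizes are illustrative choices):

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

inputs = keras.Input(shape=(64, 64, 1))
x = layers.Conv2D(16, 3, activation="relu", padding="same")(inputs)
x = layers.MaxPooling2D(2)(x)
x = layers.Conv2D(8, 3, activation="relu", padding="same")(x)
x = layers.MaxPooling2D(2)(x)                      # 16x16x8 bottleneck
x = layers.Conv2D(8, 3, activation="relu", padding="same")(x)
x = layers.UpSampling2D(2)(x)
x = layers.Conv2D(16, 3, activation="relu", padding="same")(x)
x = layers.UpSampling2D(2)(x)
outputs = layers.Conv2D(1, 3, activation="sigmoid", padding="same")(x)

autoencoder = keras.Model(inputs, outputs)
autoencoder.compile(optimizer="adam", loss="mse")
# Train to reconstruct good glasses only.
autoencoder.fit(train_images, train_images, epochs=20, batch_size=32)

# Per-image reconstruction MSE on the training set defines the threshold,
# e.g. avg + 2*std; anything above it at test time is flagged as "bad".
recon = autoencoder.predict(train_images)
mse = np.mean((train_images - recon) ** 2, axis=(1, 2, 3))
threshold = mse.mean() + 2 * mse.std()
```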
Autoencoder details:
http://ufldl.stanford.edu/tutorial/unsupervised/Autoencoders/
Deep autoencoder for images:
https://cds.cern.ch/record/2209085/files/Outlier%20detection%20using%20autoencoders.%20Olga%20Lyudchick%20(NMS).pdf
I need to program my bot so that it can find an object it is asked to pick up and bring it to the commanded position. I have tried simple image-processing techniques like filtering and contour finding, but they don't seem to work well. I want to use the ORB feature extractor. Here are some sample images; the object of interest is the ball. In short, how do I train my bot to pick up balls or other objects, and how do I use ORB? Any sample program would be helpful; please provide an example if possible. Thanks in advance.
http://i.stack.imgur.com/spobV.jpg
http://i.stack.imgur.com/JNH1T.jpg
You can try learning-based algorithms like a Haar classifier to detect any object. Thanks to OpenCV, the whole training process is very streamlined. All you have to do is train your classifier with some positive images (images of the object) and negative images (any images that do not contain the object).
Below are some links for your reference.
Haar trainer for ball-pen detection: http://opencvuser.blogspot.com/2011/08/creating-haar-cascade-classifier-aka.html
Haar trainer for banana detection :) : http://coding-robin.de/2013/07/22/train-your-own-opencv-haar-classifier.html
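Once the cascade is trained, running it takes only a few lines. A minimal sketch, assuming you have produced a (hypothetical) `ball_cascade.xml` with OpenCV's cascade training tool:

```python
import cv2

cascade = cv2.CascadeClassifier("ball_cascade.xml")  # your trained cascade
img = cv2.imread("scene.jpg")
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

# scaleFactor and minNeighbors trade recall against false positives.
balls = cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
for (x, y, w, h) in balls:
    cv2.rectangle(img, (x, y), (x + w, y + h), (0, 255, 0), 2)
cv2.imwrite("detections.jpg", img)
```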
I am working on a simple playing-card detection program.
For now I have a working SIFT algorithm from here.
And I have created some bounding boxes around the cards.
Then I ran SIFT on the card to be searched for and saved the descriptors.
But what do I do next? Do I have to make a mask of the object and slide it through the bounding box while running SIFT at every step?
I couldn't find any tutorial on how to do that exactly.
I hope someone can help me!
Greets, Max
Edit: I want to recognize every card, so I can say something like: it's a seven of hearts.
SIFT is just the beginning.
SIFT is a routine to obtain interest points on an object. You have to use the Bag of Words approach: cluster the SIFT features you collected and represent each feature in terms of your cluster means. Then represent each card as a histogram of these cluster means (a.k.a. a bag of words).
Once you have the representation of the cards ready (what #nimcap says), you then need to do the recognition itself. You can try nearest neighbors, an SVM, etc.
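A minimal sketch of the whole pipeline, assuming `card_images` is a list of grayscale card crops (NumPy arrays) and `labels` their card names (e.g. "heart_7"); both names are placeholders:

```python
import numpy as np
import cv2
from sklearn.cluster import KMeans
from sklearn.svm import SVC

sift = cv2.SIFT_create()

def descriptors(img):
    _, desc = sift.detectAndCompute(img, None)
    return desc if desc is not None else np.empty((0, 128), np.float32)

# 1. Pool SIFT descriptors from all training cards and cluster them
#    into a visual vocabulary.
all_desc = np.vstack([descriptors(img) for img in card_images])
k = 100  # vocabulary size (tunable)
vocab = KMeans(n_clusters=k, random_state=0).fit(all_desc)

# 2. Represent each card as a normalized histogram of visual words.
def bow_histogram(img):
    words = vocab.predict(descriptors(img))
    hist = np.bincount(words, minlength=k).astype(np.float32)
    return hist / max(hist.sum(), 1)

X = np.array([bow_histogram(img) for img in card_images])

# 3. Train any classifier on the histograms; here a linear SVM.
clf = SVC(kernel="linear").fit(X, labels)
```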
Also, for a better (more technical) description of what to do, you might want to look at Lowe's original 2004 SIFT paper.
Is SIFT the best approach for something like this, as opposed to Haar classifiers or just simple template matching?
E.g. http://digital.liby.waikato.ac.nz/conferences/ivcnz07/papers/ivcnz07-paper51.pdf
How do you find the negative and positive training data sets of Haar features for the AdaBoost algorithm? Say you have a certain type of blob that you want to locate in an image, and there are several of them in your entire array: how do you go about training for it? I'd appreciate an explanation that is as non-technical as possible. I'm new to this. Thanks.
First, AdaBoost does not necessarily have anything to do with Haar features. AdaBoost is a learning algorithm that combines weak learners to form a strong learner. Haar features are just a type of data on which an AdaBoost algorithm can learn.
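To make that separation concrete, here is a minimal sketch using scikit-learn, whose AdaBoost implementation boosts depth-1 decision trees ("stumps") by default; the synthetic data stands in for any feature vectors, Haar responses included:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier

# Synthetic stand-in for your feature vectors (e.g. Haar responses per window).
X, y = make_classification(n_samples=500, n_features=20, random_state=0)

# Each boosting round fits a weak learner on reweighted data; the ensemble
# of 50 weak learners forms the strong classifier.
clf = AdaBoostClassifier(n_estimators=50, random_state=0)
clf.fit(X, y)
print(clf.score(X, y))
```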
Second, the best way to get them is to prearrange your data. So, if you want to do face detection à la Viola and Jones, you'll want to mark the faces in your images with a mask/overlay image. When you're training, you select samples from the image, along with whether each sample is positive or negative. That positivity/negativity comes from your previous marking of the face (or whatever you are detecting) in the image.
You'll have to write the actual implementation yourself, but you can use existing projects either to guide you or as a base to modify.