I have to make an application which recognize road signs. I saw that in OpenCV folder there are some XML files for facial recognition but I do not know what that numbers in the XML represents or how they obtained those values. I need to understand this so as I can do my own XML files for road sign recognition.
I do not know much about OpenCV, anyhow I have completed my Final Year Project on Face Recognition using neural networks. Basically I used an algorithm to extract the Facial Portion from a given image. Thereafter I fed that new image (containing only the face) to a neural network that I developed using Matlab. After rigorous improvements, it was a success and by using the Simulation Feature of Matlab it was possible to precisely identify the individual.
Therefore I strongly recommend that you follow the same technique in carrying out this task.
I managed to find some interesting articles related to this topic, here, here , here and here.
What you need is two steps:
detection step
recognition step
for the detection, I suggest you to use cascade classifier that is included with opencv. It's robust and more quick than that of haar trainer. By this step you train the traffic signs to be detected. I found this tutorial that may help you how to prepare your training stuff
by this step you detect your signs . it may detect you some additional false objects in the image, for these undesired objects you can eliminate them by some processing like ratio, or color , or even by adding some negative images.
for the recognition I suggest you to use exactly the opencv's tutorial dedicated for face recognition
here you don't need a lot of modification..
Related
The problem statement:
Given two images such as the two images of Brad Pitt below, figure out if the image contains the same person or no. The difficulty is that we have only one reference image for each person and what to figure out if any other incoming image contains the same person or no.
Some research:
There are a few different methods of solving this task, these are
Using color histograms
Keypoint oriented methods
Using deep convolutional neural networks or other ML techniques
The histogram methods involve calculating histograms based on color and defining some sort of metric between them and then deciding upon a threshold. One that I have tried is the Earth Mover's Distance. However this method is lacking in accuracy.
The best approach therefore should be some sort of mix between 2nd and 3rd methods, and some preprocessing.
For preprocessing obvious steps to perform are:
Run a face detection such as Viola-Jones and separate the regions containing faces
Convert the said faces to grayscale
Run eye,mouth,nose detection algorithms perhaps using haar_cascades of opencv
Align the face images according found landmarks
All of this is done using opencv.
Extracting features such as SIFT and MSER generate accuracy of between 73-76%. After some additional research I've come across this paper using fisherfaces. And the fact that opencv has now the ability to create fisherface detectors and train them is great and works fantastically, achieving the accuracy promised by the paper on the Yale datasets.
The complication of the problem is that in my case I don't have a database with several images of the same person, to train the detector on. All I have is a single image corresponding to a single person, and given another image I want to understand whether this is the same person or no.
So what I am interested in knowing is`
Has anyone tried anything of the sort? What are some papers/methods/libraries that I should look into?
Do you have any suggestions on how to tackle problem?
Since you have only one image, you can give this method using DLib a try. I have used 3-4 images per person and it is giving good results.
Detect face (sample_face)
Get face descriptor (128 D vector) using dlib compute_face_descriptor (Check link)
Get the new picture in which you want to recognise the face
detect face and compute the descriptor(lets call test_face).
Compute euclidean distance between test_face descriptor and all sample_faces descriptor
assign the test_face with class(person name) with least euclidean distance.
Give this a whirl, you can play with face aligning if you start getting good results.
This is one of the hot topic for computer visin area. To handle as you have written there are many kind of solutions are available.
But i suggest to look OpenFace which has very high accuracy. There is a implementation of that project at Github.
Thanks
You need to understand that machine learning doesn't work that way, there are intensive training carried out before your model can give some good results.
with the single image of a person you just cannot predict that its the same person, cause you need to train your model over different images of the person under different light intensities, angles and many other varying scenarios.
Still i would like to try this link :
http://hanzratech.in/2015/02/03/face-recognition-using-opencv.html
you may find some match for the image atleast.
So what I am interested in knowing is` Has anyone tried anything of
the sort?
Yes. This is 2017 and facial recognition has been researched for decades.
What are some papers/methods/libraries that I should look into?
Anything google throws at you searching "single image/sample face recognition"
Do you have any suggestions on how to tackle problem?
See above
Extracting features such as SIFT and MSER generate accuracy of between 73-76%.
I doubt humans, who's facial recognition is unmatched perform much better with only 1 image as reference. I mean I couldn't tell for sure if that's Brad Pitt or if one is just a look-alike and I have seen him on houndreds of pictures and hours of movies...
I'm trying to implement a face recognition algorithm using Python. I want to be able to receive a directory of images, and compute pair-wise distances between them, when short distances should hopefully correspond to the images belonging to the same person. The ultimate goal is to cluster images and perform some basic face identification tasks (unsupervised learning).
Because of the unsupervised setting, my approach to the problem is to calculate a "face signature" (a vector in R^d for some int d) and then figure out a metric in which two faces belonging to the same person will indeed have a short distance between them.
I have a face detection algorithm which detects the face, crops the image and performs some basic pre-processing, so the images i'm feeding to the algorithm are gray and equalized (see below).
For the "face signature" part, I've tried two approaches which I read about in several publications:
Taking the histogram of the LBP (Local Binary Pattern) of the entire (processed) image
Calculating SIFT descriptors at 7 facial landmark points (right of mouth, left of mouth, etc.), which I identify per image using an external application. The signature is the concatenation of the square root of the descriptors (this results in a much higher dimension, but for now performance is not a problem).
For the comparison of two signatures, I'm using OpenCV's compareHist function (see here), trying out several different distance metrics (Chi Square, Euclidean, etc).
I know that face recognition is a hard task, let alone without any training, so I'm not expecting great results. But all I'm getting so far seems completely random. For example, when calculating distances from the image on the far right against the rest of the image, I'm getting she is most similar to 4 Bill Clintons (...!).
I have read in this great presentation that it's popular to carry out a "metric learning" procedure on a test set, which should significantly improve results. However it does say in the presentation and elsewhere that "regular" distance measures should also get OK results, so before I try this out I want to understand why what I'm doing gets me nothing.
In conclusion, my questions, which I'd love to get any sort of help on:
One improvement I though of would be to perform LBP only on the actual face, and not the corners and everything that might insert noise to the signature. How can I mask out the parts which are not the face before calculating LBP? I'm using OpenCV for this part too.
I'm fairly new to computer vision; How would I go about "debugging" my algorithm to figure out where things go wrong? Is this possible?
In the unsupervised setting, is there any other approach (which is not local descriptors + computing distances) that could work, for the task of clustering faces?
Is there anything else in the OpenCV module that maybe I haven't thought of that might be helpful? It seems like all the algorithms there require training and are not useful in my case - the algorithm needs to work on images which are completely new.
Thanks in advance.
What you are looking for is unsupervised feature extraction - take a bunch of unlabeled images and find the most important features describing these images.
The state-of-the-art methods for unsupervised feature extraction are all based on (convolutional) neural networks. Have look at autoencoders (http://ufldl.stanford.edu/wiki/index.php/Autoencoders_and_Sparsity) or Restricted Bolzmann Machines (RBMs).
You could also take an existing face detector such as DeepFace (https://www.cs.toronto.edu/~ranzato/publications/taigman_cvpr14.pdf), take only feature layers and use distance between these to group similar faces together.
I'm afraid that OpenCV is not well suited for this task, you might want to check Caffe, Theano, TensorFlow or Keras.
Can anyone advise me way to build effective face classifier that may be able to classify many different faces (~1000)?
And i have only 1-5 examples of each face
I know about opencv face classifier, but it works bad for my task (many classes, a few samples).
It works alright for one face classification with small number of samples. But i think that 1k separate classifier is not good idea
I read a few articles about face recognition but methods from these articles reqiues a lot of samples of each class for work
PS Sorry for my writing mistakes. English in not my native language.
Actually, for giving you a proper answer, I'd be happy to know some details of your task and your data. Face Recognition is a non-trivial problem and there is no general solution for all sorts of image acquisition.
First of all, you should define how many sources of variation (posing, emotions, illumination, occlusions or time-lapse) you have in your sample and testing sets. Then you should choose an appropriate algorithm and, very importantly, preprocessing steps according to the types.
If you don't have any significant variations, then it is a good idea to consider for a small training set one of the Discrete Orthogonal Moments as a feature extraction method. They have a very strong ability to extract features without redundancy. Some of them (Hahn, Racah moments) can also work in two modes - local and global feature extraction. The topic is relatively new, and there are still few articles about it. Although, they are thought to become a very powerful tool in Image Recognition. They can be computed in near real-time by using recurrence relationships. For more information, have a look here and here.
If the pose of the individuals significantly varies, you may try to perform firstly pose correction by Active Appearance Model.
If there are lots of occlusions (glasses, hats) then using one of the local feature extractors may help.
If there is a significant time lapse between train and probe images, the local features of the faces could change over the age, then it's a good option to try one of the algorithms which use graphs for face representation so as to keep the face topology.
I believe that non of the above are implemented in OpenCV, but for some of them you can find MATLAB implementation.
I'm not native speaker as well, so sorry for the grammar
Coming to your problem , it is very unique in its way. As you said there are only few images per class , the model which we train should either have an awesome architecture which can create better features within an image itself , or there should be an different approach which can achieve this task .
I have four things which I can share as of now :
Do data pre-processing and then create a bigger dataset and train on a neural network ideally. Here, we can do pre-processing like:
- image rotation
- image shearing
- image scaling
- image blurring
- image stretching
- image translation
and create atleast 200 images per class. Please checkout opencv documentation which provides many more methods on how you can increase the size of your dataset. Once you do this, then we can apply transfer learning , which is a better approach than training a neural network from scratch.
Transfer learning is a method where we train a network on our own custom classes , and this network is already pre-trained on 1000's of classes. Since our data here is very less, I would prefer transfer learning only. I have written a blog on how you can approach this using tranfer learning after you have the required amount of data. It is linked here. Face recognition also is a classification task itself, where each human is a separate class. So, follow the instructions given in the blog , may be it would help you create your own powerful classifer.
Another suggestion would be , after creating a dataset , encode them properly. This encoding would help you preserve the features in an image and can help you train better networks. VLAD ,Fisher , Bag of Words are few encoding techniques. You can search few repositories online which have implemented these already on ORL database. Once you encode , train the network on the encodings , you will obviously see a better performance.
Even do check out , Siamese network here which is meant for this purpose I feel . Here they compare two images with similar characteristics on different networks and there by achieve better classification accuracies . Git repository is here.
Another standard approach would be using SVM , Random forests since the data is less. If you still prefer neural networks the above methods would serve you the purpose. If you intend to go with encodings , then I would suggest random forests , as it is highly preferrable in learning and flexible too.
Hopefully , this answer would help you proceed in the right direction of achieving things.
You might want to take a look at OpenFace, a Python and Torch implementantion of face recognition with deep neural networks: https://cmusatyalab.github.io/openface/
I know that there are many detection techniques in OpenCV, such as SURF, STAR, ORB etc...but those techniques are for feature detection of new video feed, not for dealing with specific instances of objects that require prior learning. OpenCV's documentation isn't quite as easy to flip through and I've yet been able to find anything besides Haar, which I know deals best with face recognition.
So are there any other techniques besides Haar? The Haar technique dates back to research 10 years ago, so ideally I hope that there have been some more advances since then that have been implemented in OpenCV.
If you are looking for OpenCV machine learning type algorithms, check out this link.
For a state of the art on-the-fly object detection algorithm, have a look at OpenTLD. It uses bounding boxes and random forests to learn about an object over time. Check out the demo video here.
Also check out the matching_to_many_images.cpp sample from OpenCV. It uses feature descriptors to match objects much like Google Goggles works. A related example to this is the bagofwords_classification.cpp sample. It may be what you are looking for in this case. It uses feature detectors (SURF, SIFT, etc...) to detect objects and then classify them by comparing the relative positions of the features to a learned database of features. Have a look also at this tutorial from MIT.
The latentsvmdetect.cpp may also be a good starting point for you.
Hope that helps!
I was trying to build a basic Face Recognition system (PCA-Eigenfaces) using OpenCV 2.2 (from Willow Garage). I understand from many of the previous posts on Face Recognition that there is no standard open source library that can provide all the face recognition for you.
Instead, I would like to know if someone has used the functions(and integrated them):
icvCalcCovarMatrixEx_8u32fR
icvCalcEigenObjects_8u32fR
icvEigenProjection_8u32fR
et.al in the eigenobjects.cpp to form a face recognition system, because the functions seem to provide much of the required functionality along with cvSvd?
I am having a tough time trying to understand to do so since I am new to OpenCV.
Update: OpenCV 2.4.2 now comes with the very new cv::FaceRecognizer. Please see the very detailed documentation at:
http://docs.opencv.org/trunk/tutorial_face_main.html
I worked on a project with CV to recognize facial features. Most people don't understand the difference between Biometrics and Facial Recognition. There is a huge difference based on the fact that Biometrics is mainly based on histogram density matching while Facial Recognition implements this and vector support based on feature recognition from density. Check out the following link. This is the library you want to use if you are pursuing CV and Facial Recognition: www.betaface.com . Oleksander is awesome and based out of Germany, but he answers questions which is nice.
With OpenCV it's easy to get started with face detection. It comes with some predefined sets for feature detection, including face detection.
You might already know this one: OpenCV Wiki, FaceDetection
The important functions in this example are cvLoad and cvHaarDetectObjects. The first one loads the classifier and the second one applies it to an image.
The standard classifiers work pretty well. Of course you can train your own classifiers, if the standard ones don't fit your purpose.
As you said there are a lot of algorithms for face detection. Some of them might provide better results, but OpenCV is definitively a good start.