Looking for scientific article references for the network architecture presented in Deep MNIST for Experts tutorial (https://www.tensorflow.org/versions/r0.9/tutorials/mnist/pros/index.html)
I have a similar image processing data and I'm looking for a good vanilla architecture, any recommendations?
Currently the best solution for this problem are wavelet transform based solutions
You probably don't want to look at Deep MNIST for Experts as an example of a good architecture for MNIST or as a scientific baseline. It's more an example of basic Tensorflow building blocks and a nice introduction to convolutional models.
I.e, you should be able to get equal or better results with a model with 5% of the free parameters and less layers.
Related
Does it make any sense to perform feature extraction on images using, e.g., OpenCV, then use Caffe for classification of those features?
I am asking this as opposed to the traditional way of passing the images directly to Caffe, and letting Caffe do the extraction and classification procedures.
Yes, it does make sense, but it may not be the first thing you want to try:
If you have already extracted hand-crafted features that are suitable for your domain, there is a good chance you'll get satisfactory results by using an easier-to-use machine learning tool (e.g. libsvm).
Caffe can be used in many different ways with your features. If they are low-level features (e.g. Histogram of Gradients), then several convolutional layers may be able to extract the appropriate mid-level features for your problem. You may also use caffe as an alternative non-linear classifier (instead of SVM). You have the freedom to try (too) many things, but my advice is to first try a machine learning method with a smaller meta-parameter space, especially if you're new to neural nets and caffe.
Caffe is a tool for training and evaluating deep neural networks. It is quite a versatile tool allowing for both deep convolutional nets as well as other architectures.
Of course it can be used to process pre-computed image features.
Can anyone advise me way to build effective face classifier that may be able to classify many different faces (~1000)?
And i have only 1-5 examples of each face
I know about opencv face classifier, but it works bad for my task (many classes, a few samples).
It works alright for one face classification with small number of samples. But i think that 1k separate classifier is not good idea
I read a few articles about face recognition but methods from these articles reqiues a lot of samples of each class for work
PS Sorry for my writing mistakes. English in not my native language.
Actually, for giving you a proper answer, I'd be happy to know some details of your task and your data. Face Recognition is a non-trivial problem and there is no general solution for all sorts of image acquisition.
First of all, you should define how many sources of variation (posing, emotions, illumination, occlusions or time-lapse) you have in your sample and testing sets. Then you should choose an appropriate algorithm and, very importantly, preprocessing steps according to the types.
If you don't have any significant variations, then it is a good idea to consider for a small training set one of the Discrete Orthogonal Moments as a feature extraction method. They have a very strong ability to extract features without redundancy. Some of them (Hahn, Racah moments) can also work in two modes - local and global feature extraction. The topic is relatively new, and there are still few articles about it. Although, they are thought to become a very powerful tool in Image Recognition. They can be computed in near real-time by using recurrence relationships. For more information, have a look here and here.
If the pose of the individuals significantly varies, you may try to perform firstly pose correction by Active Appearance Model.
If there are lots of occlusions (glasses, hats) then using one of the local feature extractors may help.
If there is a significant time lapse between train and probe images, the local features of the faces could change over the age, then it's a good option to try one of the algorithms which use graphs for face representation so as to keep the face topology.
I believe that non of the above are implemented in OpenCV, but for some of them you can find MATLAB implementation.
I'm not native speaker as well, so sorry for the grammar
Coming to your problem , it is very unique in its way. As you said there are only few images per class , the model which we train should either have an awesome architecture which can create better features within an image itself , or there should be an different approach which can achieve this task .
I have four things which I can share as of now :
Do data pre-processing and then create a bigger dataset and train on a neural network ideally. Here, we can do pre-processing like:
- image rotation
- image shearing
- image scaling
- image blurring
- image stretching
- image translation
and create atleast 200 images per class. Please checkout opencv documentation which provides many more methods on how you can increase the size of your dataset. Once you do this, then we can apply transfer learning , which is a better approach than training a neural network from scratch.
Transfer learning is a method where we train a network on our own custom classes , and this network is already pre-trained on 1000's of classes. Since our data here is very less, I would prefer transfer learning only. I have written a blog on how you can approach this using tranfer learning after you have the required amount of data. It is linked here. Face recognition also is a classification task itself, where each human is a separate class. So, follow the instructions given in the blog , may be it would help you create your own powerful classifer.
Another suggestion would be , after creating a dataset , encode them properly. This encoding would help you preserve the features in an image and can help you train better networks. VLAD ,Fisher , Bag of Words are few encoding techniques. You can search few repositories online which have implemented these already on ORL database. Once you encode , train the network on the encodings , you will obviously see a better performance.
Even do check out , Siamese network here which is meant for this purpose I feel . Here they compare two images with similar characteristics on different networks and there by achieve better classification accuracies . Git repository is here.
Another standard approach would be using SVM , Random forests since the data is less. If you still prefer neural networks the above methods would serve you the purpose. If you intend to go with encodings , then I would suggest random forests , as it is highly preferrable in learning and flexible too.
Hopefully , this answer would help you proceed in the right direction of achieving things.
You might want to take a look at OpenFace, a Python and Torch implementantion of face recognition with deep neural networks: https://cmusatyalab.github.io/openface/
Many machine learning competitions are held in Kaggle where a training set and a set of features and a test set is given whose output label is to be decided based by utilizing a training set.
It is pretty clear that here supervised learning algorithms like decision tree, SVM etc. are applicable. My question is, how should I start to approach such problems, I mean whether to start with decision tree or SVM or some other algorithm or is there is any other approach i.e. how will I decide?
So, I had never heard of Kaggle until reading your post--thank you so much, it looks awesome. Upon exploring their site, I found a portion that will guide you well. On the competitions page (click all competitions), you see Digit Recognizer and Facial Keypoints Detection, both of which are competitions, but are there for educational purposes, tutorials are provided (tutorial isn't available for the facial keypoints detection yet, as the competition is in its infancy. In addition to the general forums, competitions have forums also, which I imagine is very helpful.
If you're interesting in the mathematical foundations of machine learning, and are relatively new to it, may I suggest Bayesian Reasoning and Machine Learning. It's no cakewalk, but it's much friendlier than its counterparts, without a loss of rigor.
EDIT:
I found the tutorials page on Kaggle, which seems to be a summary of all of their tutorials. Additionally, scikit-learn, a python library, offers a ton of descriptions/explanations of machine learning algorithms.
This cheatsheet http://peekaboo-vision.blogspot.pt/2013/01/machine-learning-cheat-sheet-for-scikit.html is a good starting point. In my experience using several algorithms at the same time can often give better results, eg logistic regression and svm where the results of each one have a predefined weight. And test, test, test ;)
There is No Free Lunch in data mining. You won't know which methods work best until you try lots of them.
That being said, there is also a trade-off between understandability and accuracy in data mining. Decision Trees and KNN tend to be understandable, but less accurate than SVM or Random Forests. Kaggle looks for high accuracy over understandability.
It also depends on the number of attributes. Some learners can handle many attributes, like SVM, whereas others are slow with many attributes, like neural nets.
You can shrink the number of attributes by using PCA, which has helped in several Kaggle competitions.
I'm working on a classification problem where I have data about only One Class, so I wanna classify between that "Target"class against all other possibilities which is the "Outlier" Class in incremental learning. So, I have found some libraries, but none of them support updating classifier.
Do you know any library that supports one-class classifier with updating pre-existed classifier especially in java or matlab?
I can't think of any full pre-existing solution to your question. However, I can suggest two approaches:
Neural networks have been used for various types of anomaly detection (e.g. see here, with the problem framed as "novelty detection"). Depending on the nature of your problem, this might be a suitable solution, as NNs can be incrementally trained and are supported by several widely used libraries. The right one to use would be highly dependent on your problem framing and the network architecture chosen.
Although most SVM libraries do not support incremental training, there are some with such support (e.g. see in Can an SVM learn incrementally?). However, as far as I can see, none of the two libraries suggested in the cited reference supports unary classification. But you could try basing a tailored solution on one of them (their source code seems to be freely available).
PS if you found one of these (or any other) solution to work, please post it as an answer as well :)
I am doing research on machine learning. Now I want to test my algorithms with some famous datasets. Since I am a newbie in this area, I can't find other suitable datasets apart from MNIST. I thing MNIST is quite suitable for our research. Does anyone know some similar datasets with MNIST?
P.S I know another handwritten digit dataset that is often used, called USPS dataset. But I need a dataset with more training examples (typically more than 10000 and comparable to the number of training examples in MNIST), so USPS is out of my selection.
The machine learning archive (http://archive.ics.uci.edu/ml/) contains quite a variety of datasets including those, like MINIST, suitable for classification e.g. (http://archive.ics.uci.edu/ml/datasets/Skin+Segmentation).
I can't say which of them would be suitable without knowing what you're trying to demonstrate with your algorithm but anything inside the UCI archive is well known.
You can try Fashion MNIST or Kuzushiji MNIST that have very similar properties to MNIST, but a bit harder to predict. From Fashion MNIST's page:
Seriously, we are talking about replacing MNIST. Here are some good reasons:
MNIST is too easy. Convolutional nets can achieve 99.7% on MNIST. Classic machine learning algorithms can also achieve 97% easily. Check out our side-by-side benchmark for Fashion-MNIST vs. MNIST, and read "Most pairs of MNIST digits can be distinguished pretty well by just one pixel."
MNIST is overused. In this April 2017 Twitter thread, Google Brain research scientist and deep learning expert Ian Goodfellow calls for people to move away from MNIST.
MNIST can not represent modern CV tasks, as noted in this April 2017 Twitter thread, deep learning expert/Keras author François Chollet.