I'm trying to figure out a question. I'm working with a big dataset of pictures, and almost all of the pictures contain exactly one person. Each class should represent a different person, but for some reason roughly 1 out of 1000 pictures in every class shows a face that does not belong to that class (it is not the same person as in the other pictures of that class); in fact, the mislabeled person doesn't belong to any class at all. Here is my question: what happens during the learning process? Does the convnet learn that this face is not useful for the task, or does it generate some kind of error? I ask because I need to know whether I should remove these "noisy" pictures for better performance, or whether the error would be negligible. Thank you all in advance.
Misleading targets will definitely add noise to your data, and training becomes much more unstable if a significant amount of your data is incorrectly labeled. With a 1/1000 ratio of incorrectly labeled data, though, it won't affect training much unless you are using weighted classes.
By the way, if you are trying to build a model that classifies a person by face image, you might want to create additional features, like eye positions, skin color, etc.
If I am doing a multi-class classification problem, is there a way to essentially add an "unsure" class? For example, if my model doesn't have a very strong prediction, it should default to this class. It's like taking a test: some tests penalize you for wrong answers, some don't. I want to write a custom loss function that doesn't penalize my model for guessing the neutral class, but does penalize it if it makes a prediction that is wrong. Is there a way to do what I am trying to do?
For classifiers using a one-hot encoded softmax output layer, the outputs can be interpreted as the probability that the input falls into each of the categories. E.g. if your model has outputs (cat, dog, frog), then an output of (0.6, 0.2, 0.2) means the input has (according to the classifier) a 60% chance of being a cat and a 20% chance each of being a dog or a frog.
In this case, when the model is uncertain it can (and will) have an output where no one class is particularly likely, e.g. (0.33, 0.33, 0.33). There's no need to add a separate 'Other' category.
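As a rough illustration of reading that output (the model, labels, and threshold below are all hypothetical), you can simply flag inputs where no class is particularly likely:

```python
import numpy as np

CLASSES = ["cat", "dog", "frog"]  # hypothetical labels

def predict_or_unsure(model, x, threshold=0.5):
    """Return the predicted label, or 'unsure' if no class is likely enough."""
    probs = model.predict(x[np.newaxis, ...])[0]  # softmax vector, sums to 1
    best = int(np.argmax(probs))
    if probs[best] < threshold:                   # e.g. (0.33, 0.33, 0.33)
        return "unsure"
    return CLASSES[best]
```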
Separately, it might be difficult to train an "unsure" category unless you have specific input examples that you want the model to classify as "unsure".
I encountered the very same problem.
I tried using a neutral class, but the neural net will either put nothing in it or everything in it, depending on which reduces the loss.
After some searching, it looks like what we are trying to achieve is "neural network uncertainty estimation". One way to achieve it is to run your image through the neural net 100 times with random dropout and see how many times it lands on the same class.
This blog post explains it well: https://www.inovex.de/blog/uncertainty-quantification-deep-learning/
This post does as well: https://medium.com/deeplearningmadeeasy/how-to-add-uncertainty-to-your-neural-network-afb5f855e66a
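For reference, here is a minimal sketch of that idea (Monte Carlo dropout), assuming a tf.keras model that contains dropout layers; the 100 passes come from the description above, everything else is illustrative:

```python
import numpy as np

def mc_dropout_predict(model, x, n_passes=100):
    """Run the same input many times with dropout active and measure
    how often the prediction lands on the same class."""
    # training=True keeps the dropout layers active at inference time
    preds = np.stack([model(x[np.newaxis, ...], training=True).numpy()[0]
                      for _ in range(n_passes)])
    classes = preds.argmax(axis=1)
    top = int(np.bincount(classes).argmax())  # most frequent class
    agreement = float((classes == top).mean())
    return top, agreement                     # low agreement = high uncertainty
```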
I will let you know and publish here if I have some results with that.
I have n classes and one unknown class.
The unknown class is not included in the training set, as I have not explored that route yet.
I trained MobileNet (or Inception-v3) on the n classes. The confusion matrix is very good.
Now if an unknown-class image comes in for prediction, the model predicts it as one of the n classes, which is clearly a misclassification.
The confidence also comes in near 0.998, which makes these images difficult to filter out, since known objects of the n trained classes are classified with the same confidence.
I tried adding a class that does not include any features of the n classes, i.e. a sort of negatively sampled class, as the unknown class. As a result, the confusion matrix gets very bad; bad enough not to go further with that. I am still working through this.
How can a neural network determine an unknown class, i.e. that an input doesn't fall into any of the known classes?
I have dealt with a similar problem in the past. You have to keep the following points in mind:
a. When you introduce the "unknown" class as your (n+1)-th class, keep in mind that this class should represent the same variance you expect during your live/prod run. To elaborate: if you expect images from m different categories that are not part of your training labels, then all such images should be represented in this "unknown" label. This will help bring the confidence score down for these "out-of-scope" images.
b. Additionally, once you have brought the confidence score down with the above method, you can set a threshold on the confidence score.
The above two steps combined can help you filter out the "out-of-scope" images. I hope this helps. If you need further clarification, let me know.
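Putting both steps together, here is a sketch with tf.keras (the directory layout, image size, and threshold value are assumptions):

```python
import tensorflow as tf

# (a) out-of-scope examples live in their own class folder, e.g.
#     data/train/unknown/..., next to the n known class folders
train_ds = tf.keras.utils.image_dataset_from_directory(
    "data/train", image_size=(224, 224), batch_size=32)
class_names = train_ds.class_names  # includes 'unknown'

# (b) after training, additionally apply a confidence threshold
def predict(model, image, threshold=0.7):
    logits = model(image[tf.newaxis, ...], training=False)[0]
    probs = tf.nn.softmax(logits)   # assumes the model outputs logits
    best = int(tf.argmax(probs))
    if float(probs[best]) < threshold:
        return "unknown"
    return class_names[best]
```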
I would like to train this beautiful model to recognize only one type of image. To be clear, in the end I want a model capable of telling whether a new image is part of that class or not. Thank you very much for your help.
You should keep in mind that when you want to recognize a "dog", for example, you need to know what is NOT a "dog" as well. So your classification problem is a two-class problem, not a one-class one. Your two classes will be "My Type" and "Not My Type".
About retraining your model: yes, it is possible. I guess you use a model pretrained on the ImageNet dataset. There are two cases: if the classification problem is close (for example, if your "type" is a class from ImageNet), you can just replace the last layer (replace the fully connected 1x1000 layer with an FC 1x2) and retrain only that layer. If the problem is not the same, you may want to retrain more layers.
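A rough sketch of the "replace the last layer" case with tf.keras (the backbone choice, pooling, and optimizer here are illustrative, not prescriptive):

```python
import tensorflow as tf

# pretrained on ImageNet, without the original 1x1000 classification head
base = tf.keras.applications.InceptionV3(
    include_top=False, weights="imagenet", pooling="avg")
base.trainable = False  # keep the general features; train only the new head

model = tf.keras.Sequential([
    base,
    tf.keras.layers.Dense(2, activation="softmax"),  # "My Type" / "Not My Type"
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
```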
It also depends on the number of samples you have for retraining.
I hope it helps or clarifies your question.
Is it possible to retrain Google's Inception model with one class?
Yes. Just remove the last layer, add a new layer with one (or two) nodes, and train it on your new problem. This way you keep the general features learned on the (probably bigger) ImageNet dataset.
In a particular application I was in need of machine learning (I know the things I studied in my undergraduate course). I used Support Vector Machines and got the problem solved. It's working fine.
Now I need to improve the system. The problems here are:
1. I get additional training examples every week. Right now the system starts training from scratch with the updated examples (old examples + new examples). I want to make this incremental learning: using the previous knowledge (instead of the previous examples) together with the new examples to get a new model (knowledge).
2. Right now my training examples have 3 classes, so every training example is fitted into one of these 3 classes. I want the functionality of an "unknown" class: anything that doesn't fit these 3 classes must be marked as "unknown". But I can't treat "unknown" as a new class and provide examples for it too.
3. Assuming the "unknown" class is implemented: when the class is "unknown", the user of the application inputs what he thinks the class might be. Now I need to incorporate the user input into the learning, and I have no idea how to do this either. Would it make any difference if the user inputs a new class (i.e. a class that is not already in the training set)?
Do I need to choose a new algorithm, or can Support Vector Machines do this?
PS: I'm using libsvm implementation for SVM.
I just wrote my answer using the same organization as your question (1., 2., 3.).
1. Can SVMs do this, i.e., incremental learning? Multi-layer perceptrons of course can, because the subsequent training instances don't affect the basic network architecture; they just cause adjustments in the values of the weight matrices. But SVMs? It seems to me that (in theory) one additional training instance could change the selection of the support vectors. But again, I don't know.
2. I think you can solve this problem quite easily by configuring LIBSVM in one-against-many mode, i.e., as a set of one-class-versus-rest classifiers. SVMs are fundamentally binary classifiers; applying an SVM to a multi-class problem means that it has been coded to perform multiple, step-wise one-against-many classifications, but again the algorithm is trained (and tested) one class at a time. If you do this, then what's left after step-wise execution against the test set is "unknown": in other words, whatever data is not classified after performing multiple, sequential one-against-many classifications is by definition in that 'unknown' class.
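A sketch of that scheme with scikit-learn's SVC (which wraps libsvm); the data shapes, kernel, and decision rule below are placeholders:

```python
import numpy as np
from sklearn.svm import SVC

def train_one_vs_rest(X, y, classes):
    """One binary SVM per class, each trained one-against-many."""
    return {c: SVC(kernel="rbf").fit(X, (y == c).astype(int)) for c in classes}

def predict_or_unknown(models, x):
    """Whatever no per-class SVM claims is, by definition, 'unknown'."""
    for c, clf in models.items():
        if clf.predict(x.reshape(1, -1))[0] == 1:
            return c
    return "unknown"
```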
3. Why not make the user's guess a feature (i.e., just another input variable)? The only other option is to make it the class label itself, and you don't want that. So you would, for instance, add a column to your data matrix, "user class guess", and populate it with some value most likely to have no effect for the data points that are not in the 'unknown' category and for which the user will therefore not offer a guess (this value could be '0' or '1', but really it depends on how your data is scaled and normalized).
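One way to sketch that column (the fill value and the guess format are made up):

```python
import numpy as np

def add_user_guess_column(X, guesses, fill_value=0.0):
    """Append a 'user class guess' column; rows without a guess get a neutral
    fill value (pick one that matches your scaling/normalization)."""
    col = np.full((X.shape[0], 1), fill_value)
    for row, guess in guesses.items():  # e.g. {17: 2} = row 17 guessed as class 2
        col[row, 0] = guess
    return np.hstack([X, col])
```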
Your first item will likely be the most difficult, since there are essentially no good incremental SVM implementations in existence.
A few months ago, I also researched online/incremental SVM algorithms. Unfortunately, the current state of implementations is quite sparse. All I found was a Matlab example, OnlineSVR (a thesis project implementing only regression support), and SVMHeavy (only binary class support).
I haven't used any of them personally. They all appear to be at the "research toy" stage. I couldn't even get SVMHeavy to compile.
For now, you can probably get away with doing periodic batch training to incorporate updates. I also use LibSVM, and it's quite fast, so it should be a good substitute until a proper incremental version is implemented.
I also don't think SVMs can model the concept of an "unknown" sample by default. They typically work as a series of boolean classifiers, so a sample ends up positively classified as something, even if that sample is drastically different from anything seen previously. A possible workaround would be to model the ranges of your features, randomly generate samples that fall outside of these ranges, and then add these to your training set.
For example, if you have an attribute called "color" with a minimum value of 4 and a maximum value of 123, then you could add these to your training set:
[({'color':3},'unknown'),({'color':125},'unknown')]
to give your SVM an idea of what an "unknown" color means.
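More generally, a sketch of generating such out-of-range samples (the margin and sample count are arbitrary choices):

```python
import numpy as np

def make_unknown_samples(X, n_samples=100, margin=0.1):
    """Draw random points just outside the per-feature min/max of the
    training data; label all of them 'unknown'."""
    lo, hi = X.min(axis=0), X.max(axis=0)
    span = hi - lo
    below = lo - margin * span * np.random.rand(n_samples, X.shape[1])
    above = hi + margin * span * np.random.rand(n_samples, X.shape[1])
    side = np.random.rand(n_samples, X.shape[1]) < 0.5
    return np.where(side, below, above)
```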
There are algorithms for training an SVM incrementally, but I don't think libSVM implements this. I think you should consider whether you really need this feature. I see no problem with your current approach, unless the training process is really too slow. If it is, could you retrain in batches (i.e. after every 100 new examples)?
You can get libSVM to produce probabilities of class membership. I think this can be done for multiclass classification, but I'm not entirely sure about that. You will need to decide on a threshold below which the classification is not certain enough, and output 'Unknown' in that case. Something like setting a threshold on the difference between the most likely and the second most likely class should achieve this.
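For instance, with scikit-learn's libsvm wrapper (the training data and the gap threshold below are placeholders):

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X_train = rng.normal(size=(60, 2))     # placeholder features
y_train = rng.integers(0, 3, size=60)  # 3 placeholder classes

clf = SVC(probability=True).fit(X_train, y_train)  # enables predict_proba

def predict_class_or_unknown(clf, x, min_gap=0.2):
    """Output 'Unknown' when the top two class probabilities are too close."""
    probs = clf.predict_proba(x.reshape(1, -1))[0]
    top2 = np.sort(probs)[-2:]
    if top2[1] - top2[0] < min_gap:
        return "Unknown"
    return clf.classes_[int(np.argmax(probs))]
```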
I think libSVM scales to any number of new classes. The accuracy of your model may well suffer as you add new classes, however.
Even though this question is probably out of date, I feel obliged to give some additional thoughts.
Since your first question has been answered by others (there is no production-ready SVM that implements incremental learning, even though it would be possible), I will skip it. ;)
Adding 'Unknown' as a class is not a good idea. Depending on its use, the reasons differ.
If you are using the 'Unknown' class as a tag for "this instance has not been classified, but belongs to one of the known classes", then your SVM is in deep trouble. The reason is that libsvm builds several binary classifiers and combines them. So if you have three classes, let's say A, B and C, the SVM builds the first binary classifier by splitting the training examples into "classified as A" and "any other class". The latter will obviously contain all examples from the 'Unknown' class. When trying to build a hyperplane, examples in 'Unknown' (which really belong to the class 'A') will probably cause the SVM to build a hyperplane with a very small margin, and it will poorly recognize future instances of A, i.e. its generalization performance will diminish. That's due to the fact that the SVM will try to build a hyperplane that separates most instances of A (those officially labeled 'A') onto one side of the hyperplane and some instances (those officially labeled 'Unknown') onto the other side.
Another problem occurs if you are using the 'Unknown' class to store all examples whose class is not yet known to the SVM. For example, the SVM knows the classes A, B and C, but you recently got example data for two new classes D and E. Since these examples are not classified and the new classes are not known to the SVM, you may want to temporarily store them in 'Unknown'. In that case the 'Unknown' class may cause trouble, since it possibly contains examples with enormous variation in the values of its features. That will make it very hard to create good separating hyperplanes, and therefore the resulting classifier will poorly recognize new instances of D or E as 'Unknown'. The classification of new instances belonging to A, B or C will probably be hindered as well.
To sum up: introducing an 'Unknown' class that contains examples of known classes or examples of several new classes will result in a poor classifier. I think it's best to ignore all unclassified instances when training the classifier.
I would recommend that you solve this issue outside the classification algorithm. I was asked for this feature myself and implemented a single webpage which shows an image of the object in question and a button for each known class. If the object in question belongs to a class that is not known yet, the user can fill out another form to add a new class. When he goes back to the classification page, another button for that class will magically appear. After the instances have been classified, they can be used for training the classifier. (I used a database to store the known classes and to reference which example belongs to which class, and implemented an export function to make the data SVM-ready.)