I built a CNN to classify 10 different classes. It performs well on most of the classes, giving approximately 80-85% accuracy per class. Currently, each class has 10,000+ images.
But in the future, there is a possibility that I might get more data for each class.
For instance, if I get more data for some class, how should I retrain the model?
Should I retrain the entire thing, i.e. with both old and new data?
Should I retrain the model with only the new data? Here, I fear that because the model would see new data for a single class only, it might forget what it has already learned, or the accuracy on the other classes might suffer.
If anyone has worked on this problem before, please help.
Related
I split my dataset into training and testing sets. At the end, after finding the best hyperparameters on the training set, should I fit the model again using all the data? The point is to reach the highest possible score on new data.
Yes, that would help to generalize your model, as more data generally means better generalization.
I don't think so. If you do that, you will no longer have a valid test set. What happens when you come back to improve the model later? You will need a new test set for each model improvement, which means more labeling. You also won't be able to compare experiments across model versions, because the test set won't be identical.
If you consider this model finished forever, then ok.
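One possible way to keep old test samples in the test set as more labeled data arrives is to decide membership from a stable hash of each sample's identifier rather than from a fresh random split. The following is a minimal sketch, assuming every sample has a unique, persistent ID; the function names, the ID format, and the 20% ratio are illustrative and not part of the discussion above.

```python
import hashlib


def in_test_set(sample_id: str, test_ratio: float = 0.2) -> bool:
    """Return True if this sample belongs to the held-out test set.

    The decision depends only on the ID, so it is the same on every run
    and does not change when more samples are collected later.
    """
    digest = hashlib.md5(sample_id.encode("utf-8")).digest()
    # Map the first 4 bytes of the hash to a number in [0, 1).
    bucket = int.from_bytes(digest[:4], "big") / 2**32
    return bucket < test_ratio


def split_by_id(sample_ids):
    """Split an iterable of IDs into (train_ids, test_ids)."""
    train_ids, test_ids = [], []
    for sample_id in sample_ids:
        (test_ids if in_test_set(sample_id) else train_ids).append(sample_id)
    return train_ids, test_ids


# Hypothetical IDs: an initial batch, then a later batch of new data.
old_ids = [f"img_{i:05d}" for i in range(1000)]
new_ids = [f"img_{i:05d}" for i in range(1000, 1500)]

train_v1, test_v1 = split_by_id(old_ids)
train_v2, test_v2 = split_by_id(old_ids + new_ids)
```

With this scheme no sample ever moves from the test set into the training set. The test set does grow as data is added, so for a strict comparison with an earlier model version you would still evaluate on the original subset (test_v1 here).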
I would like to train this beautiful model to recognize only one type of image. To be clear, in the end the model should be able to tell whether a new image is part of that class or not. Thank you very much for your help.
One thing to keep in mind is that when you want to recognize a "dog", for example, you also need to know what is NOT a "dog". So your classification problem is a two-class problem, not a one-class problem. Your two classes will be "My Type" and "Not My Type".
About retraining your model: yes, it is possible. I guess you are using a model pretrained on the ImageNet dataset. There are two cases. If the classification problem is close (for example, if your "type" is a class from ImageNet), you can just replace the last layer (swap the fully connected layer with 1000 outputs for one with 2 outputs) and retrain only that layer. If the problem is not close, you may want to retrain more layers.
It also depends on the number of samples you have for retraining.
I hope it helps or clarifies your question.
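The "replace the last layer and retrain only that layer" case can be sketched as follows. This is a minimal PyTorch illustration, not the asker's actual setup: the tiny network stands in for a real ImageNet-pretrained model, and the layer sizes, learning rate, and dummy batch are assumptions made only to keep the example self-contained.

```python
import torch
from torch import nn

torch.manual_seed(0)

# Stand-in for a pretrained network: a feature extractor followed by a
# 1000-way classification layer (as in ImageNet models).
backbone = nn.Sequential(
    nn.Conv2d(3, 8, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.AdaptiveAvgPool2d(1),
    nn.Flatten(),
)
model = nn.Sequential(backbone, nn.Linear(8, 1000))

# Freeze everything that counts as "pretrained".
for param in model.parameters():
    param.requires_grad = False

# Replace the 1000-way layer with a 2-way layer ("My Type" / "Not My Type").
# A freshly created layer has requires_grad=True by default.
model[1] = nn.Linear(8, 2)

# Only the new layer's parameters are handed to the optimizer.
trainable = [p for p in model.parameters() if p.requires_grad]
optimizer = torch.optim.SGD(trainable, lr=0.1)
loss_fn = nn.CrossEntropyLoss()

# Dummy batch: 4 RGB images of 32x32 pixels with made-up labels.
images = torch.randn(4, 3, 32, 32)
labels = torch.tensor([0, 1, 0, 1])

frozen_before = backbone[0].weight.clone()
for _ in range(5):
    optimizer.zero_grad()
    logits = model(images)
    loss = loss_fn(logits, labels)
    loss.backward()
    optimizer.step()
```

For the second case, where the new problem is further from the original one, the same pattern applies with requires_grad set back to True on some of the later backbone layers, so that they are updated as well.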
Is it possible to retrain Google's Inception model with one class?
Yes. Just remove the last layer, add a new layer with one (or two) nodes, and train it on your new problem. This way you keep the general features learned on the (probably bigger) ImageNet dataset.
I am new to Doc2Vec. If I could get some advice before I start, it would save a LOT of time.
My data is a stream of text (such as tweets) arriving continuously over time. To cluster these tweets, I was thinking of using Doc2Vec to reduce each text to a fixed-size vector and using that vector to compare documents.
Since the text data accumulates over time, can Doc2Vec still be used in this case? I may have to train the model again and again, or could I instead train the Doc2Vec model once on some large corpus such as Wikipedia or a large news corpus?
Any suggestions will help!
Thanks in Advance.
The gensim Doc2Vec class does not support adjusting the model with new documents, but it can 'infer' and report a vector for new documents, based on the model learned from an earlier bulk training.
So, you can use that new inferred vector to compare the new document to older ones, or feed it to a trained classifier, etc.
If new documents continue to arrive, and especially if the balance of topics/meaning in your documents drifts over time, you would likely at some point want to discard a model based on older data, and create a new model based on your larger (or more recent) data.
(Note that vectors from the old model and the new model won't be directly comparable. Training sessions involve a lot of randomness, and the meanings of dimensions/directions in any one model are somewhat arbitrary. It's the relative positions of vectors within the same model that have some interpretive power.)
I'm currently using partial_fit with SGDClassifier to fit a model to predict the hashtags on images.
The problem I'm having is that SGDClassifier requires specifying classes upfront. This is ok to fit a model offline but I'd like to add new classes online when observing new hashtags. Currently, I need to retrain a new model from scratch to accommodate the new classes.
Is there a way to have SGDClassifier accept new classes without having to retrain a new model? Or would I be better off training a separate binary SGDClassifier for each hashtag?
Thanks
Hashtags are usually just tags, so one object can have many of them. In such a setting there is no multiclass scenario, and you should simply have one binary SGD classifier per tag. You can of course fit more complex models that take relationships between tags into account, but SGDClassifier does not do so, so using it as a single multiclass model here gains nothing over having N distinct classifiers.
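A minimal sketch of the one-binary-classifier-per-tag idea with scikit-learn, where a classifier is created the first time a tag is observed. The wrapper class PerTagClassifiers and the random feature vectors are hypothetical, made up for illustration. Note one limitation of this simple version: a classifier created for a new tag only learns from batches that arrive after the tag first appears.

```python
import numpy as np
from sklearn.linear_model import SGDClassifier


class PerTagClassifiers:
    """One binary SGDClassifier per tag; unseen tags get a new classifier."""

    def __init__(self):
        self.models = {}

    def partial_fit(self, X, tag_sets):
        # Create a classifier for every tag not seen before.
        for tag in set().union(*tag_sets):
            if tag not in self.models:
                self.models[tag] = SGDClassifier(random_state=0)
        # Update every classifier: 1 if the sample carries the tag, else 0.
        for tag, clf in self.models.items():
            y = np.array([int(tag in tags) for tags in tag_sets])
            clf.partial_fit(X, y, classes=np.array([0, 1]))

    def predict(self, X):
        # For each sample, return the set of tags whose classifier fires.
        per_tag = {tag: clf.predict(X) for tag, clf in self.models.items()}
        return [
            {tag for tag, pred in per_tag.items() if pred[i] == 1}
            for i in range(len(X))
        ]


rng = np.random.default_rng(0)
tagger = PerTagClassifiers()

# First batch: only "cat" and "dog" have been observed.
X1 = rng.normal(size=(20, 5))
tags1 = [{"cat"} if x[0] > 0 else {"dog"} for x in X1]
tagger.partial_fit(X1, tags1)

# Second batch: a new tag "cute" shows up; no existing model is rebuilt.
X2 = rng.normal(size=(20, 5))
tags2 = [{"cat", "cute"} if x[0] > 0 else {"dog"} for x in X2]
tagger.partial_fit(X2, tags2)

predictions = tagger.predict(X2)
```

Because each classifier is always given classes=[0, 1], the set of tags never has to be declared up front, which is the restriction the question runs into with a single multiclass SGDClassifier.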
I am using the classical SIFT - BOW - SVM for image classification. My classifiers are created using the 1vsAll paradigm.
Let's say that I currently have 100 classes.
Later, I would like to add new classes OR I would like to improve the recognition for some specific classes using additional training sets.
What would be the best approach to do it ?
Of course, the best way would be to re-execute every step of the training stage.
But would it make sense to compute only the additional (or modified) classes using the same vocabulary as the previous model, in order to avoid recomputing the vocabulary and training ALL the classes again?
Thanks
In short: no. If you add a new class, its samples should be added as negatives to each of the "old" classifiers so that "one vs. all" still makes sense. If you expect new classes to appear over time, consider using one-class classifiers instead, such as a one-class SVM. This way, once you get new samples for a particular class, you only retrain that class's model, or add a completely new model to the system.
Furthermore, for a large number of classes, one-vs-all SVM works quite badly, and the one-class approach is usually much better.
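The one-model-per-class layout suggested above can be sketched with scikit-learn's OneClassSVM. This is only an illustration under several assumptions: 2-D Gaussian blobs stand in for the BOW histograms, the gamma and nu values are arbitrary, and choosing the class with the highest decision score is a simple heuristic, since scores from independently trained one-class models are not calibrated against each other.

```python
import numpy as np
from sklearn.svm import OneClassSVM

rng = np.random.default_rng(0)


def make_class(center, n=100):
    """Toy stand-in for the BOW histograms of one class."""
    return rng.normal(loc=center, scale=1.0, size=(n, 2))


def fit_one_class(X):
    # One model per class, trained only on that class's samples.
    return OneClassSVM(kernel="rbf", gamma=0.5, nu=0.1).fit(X)


def predict(models, X):
    """Pick, for each sample, the class whose model scores it highest."""
    names = list(models)
    scores = np.column_stack(
        [models[name].decision_function(X) for name in names]
    )
    return [names[i] for i in scores.argmax(axis=1)]


models = {
    "class_a": fit_one_class(make_class([0.0, 0.0])),
    "class_b": fit_one_class(make_class([10.0, 0.0])),
}
existing = dict(models)

# A new class appears later: train one new model, leave the others alone.
models["class_c"] = fit_one_class(make_class([0.0, 10.0]))

centers = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0]])
labels = predict(models, centers)
```

Improving a single class works the same way: refit only that class's entry in the dictionary on its enlarged sample set, while the shared vocabulary and all other models stay as they are.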