How to train a caffemodel on our own dataset? - machine-learning

I tried using the pre-trained bvlc_reference_caffenet.caffemodel for object recognition in images. I got good results for images containing only a single object. For images with multiple objects, I removed the argmax() term from the prediction, which otherwise returns only the class label with the maximum probability.
Still, the accuracy of the labels I get is quite low. So I am thinking of training the same caffemodel on my own dataset (containing images with multiple objects). How should I proceed? Is there any way to retrain a pre-trained caffemodel on a different dataset?

What you are after is called "finetuning": taking a deep net trained for task A, reusing its weights, and re-training it to accomplish task B.
You can start with this tutorial, but you will find much more information simply by googling "finetune caffe model".
You may also be interested in this post regarding training caffe with multiple categories per input image.
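A minimal pycaffe sketch of that workflow, assuming you have already copied CaffeNet's train_val.prototxt, pointed its data layers at your own LMDB, and renamed and resized the last fully connected layer (fc8) to match your number of classes; the paths below are placeholders for your own files:

# Fine-tuning sketch with pycaffe; all paths are placeholders.
import caffe

caffe.set_mode_gpu()
solver = caffe.SGDSolver('models/my_finetune/solver.prototxt')
# Copy weights from the pre-trained model by layer name; layers whose names or
# shapes changed (e.g. the renamed fc8) are skipped and start from fresh initialization.
solver.net.copy_from('models/bvlc_reference_caffenet/bvlc_reference_caffenet.caffemodel')
solver.solve()

Lowering the base learning rate in the solver and giving the new layer a higher lr_mult than the copied layers is a common way to keep the pre-trained features mostly intact.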

Related

Finding the suitable CNN architecture for the classification

I want to use a convolutional neural network (CNN) to classify between two classes of images. I built several CNN architectures, but I always get the same result: the network classifies every case as the second class. Therefore, I always get 50% accuracy in leave-one-out evaluation. The data is balanced in terms of the number of samples per class (16 from the 1st and 16 from the 2nd). Could you please clarify what this means?
With such a small number of training samples, your CNN model is very likely to overfit the data, giving good training accuracy and poor test accuracy.
Alternatively, your model may simply be skewed toward predicting the same class at all times.
Below are some of the solutions you can try:
1) As you have commented, if you cannot get any more images, then try creating new images by modifying the ones already available. For example, say you have 16 images of a cat (cat is the class). You can crop the cat and paste it onto different backgrounds, vary the brightness and intensity, and apply rotations, translations, and so on (see the augmentation sketch after this list).
This will help you create a good training set.
2) Try creating a smaller model (with one or two layers) and check if it improves your accuracy.
3) You can also do transfer learning with a good pre-trained model, since a pre-trained network can learn your task much better than a model built from scratch on so little data.
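A minimal sketch of point 1 using Keras' ImageDataGenerator (the transform ranges, image size, and directory layout are only illustrative assumptions):

# Data augmentation sketch; parameter values and the data/train layout are illustrative.
from tensorflow.keras.preprocessing.image import ImageDataGenerator

augmenter = ImageDataGenerator(
    rotation_range=20,              # random rotations
    width_shift_range=0.1,          # random horizontal translation
    height_shift_range=0.1,         # random vertical translation
    brightness_range=(0.7, 1.3),    # random brightness changes
    horizontal_flip=True,
)

# Assumes images are arranged as data/train/<class_name>/*.jpg
train_batches = augmenter.flow_from_directory(
    "data/train", target_size=(128, 128), batch_size=8, class_mode="binary"
)

Each epoch then sees randomly perturbed versions of your 32 images instead of the same pixels every time.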

Image classification, narrow domain with custom labels

Let's suppose I would like to classify motorbikes by model.
There are a couple of hundred motorbike models I'm interested in.
I do have tens, sometimes hundreds of pictures of each motorbike model.
Can you please point me to the practical example that demonstrates how to train model on your data and then use it to classify images? It needs to be a deep learning model, not simple logistic regression.
I'm not sure about it, but it seems like I can't use a pre-trained neural net, because it has been trained on a wide range of objects like cats, humans, cars, etc. It may not be good at distinguishing the motorbike nuances I'm interested in.
I found a couple of such examples (TensorFlow has one), but sadly, all of them were using a pre-trained model. None of them had an example of how to train on your own dataset.
In cases like yours you use either transfer learning or fine-tuning. If you have more than a thousand images of motorbikes I would use fine-tuning, and if you have fewer, transfer learning.
Fine-tuning means taking a pre-trained model and replacing its classifier part. The new classifier part, and maybe the last 1-2 layers of the pre-trained model, are then trained on your dataset.
Transfer learning means using a pre-trained model as a fixed feature extractor: it outputs features for an input image, and you then train a new classifier on those features, maybe an SVM or a logistic regression.
An example of this can be seen on slide 33 here: https://github.com/cpra/dlvc2016/blob/master/lectures/lecture10.pdf
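A rough sketch of that transfer-learning route, using a pre-trained CNN as a fixed feature extractor with a linear SVM on top (the model choice, image size, and the load_motorbike_dataset() helper are assumptions for illustration, not part of the linked lecture):

# Pre-trained CNN features + linear SVM; load_motorbike_dataset() is a hypothetical helper.
import numpy as np
from tensorflow.keras.applications import ResNet50
from tensorflow.keras.applications.resnet50 import preprocess_input
from sklearn.svm import LinearSVC

extractor = ResNet50(weights="imagenet", include_top=False, pooling="avg")

def features(images):
    # images: float array of shape (n, 224, 224, 3)
    return extractor.predict(preprocess_input(images))

X_train, y_train, X_test, y_test = load_motorbike_dataset()  # hypothetical helper
clf = LinearSVC().fit(features(X_train), y_train)
print("test accuracy:", clf.score(features(X_test), y_test))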
The paper Quick, Draw! Doodle Recognition, from a Kaggle challenge, may be similar enough to what you are doing. The code is on GitHub. You may need some data augmentation if you only have a few hundred images for each category.
What you want is pretty easy. Follow the darknet YOLO implementation.
Instructions: https://pjreddie.com/darknet/yolo/
Code: https://github.com/pjreddie/darknet
Training YOLO on COCO
You can train YOLO from scratch if you want to play with different training regimes, hyper-parameters, or datasets. Here's how to get it working on the COCO dataset.
Get The COCO Data
To train YOLO you will need all of the COCO data and labels. The script scripts/get_coco_dataset.sh will do this for you. Figure out where you want to put the COCO data and download it, for example:
cp scripts/get_coco_dataset.sh data
cd data
bash get_coco_dataset.sh
Add your own data inside and make sure it is in the same format as the existing samples.
Now you should have all the data and the labels generated for Darknet.
Then call the training script with the pre-trained weights.
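As an example, the darknet YOLO page's COCO instructions use the framework's detector training command with pre-trained convolutional weights; the cfg and weight file names below follow that page and are assumptions that need to match whatever version you actually use:

wget https://pjreddie.com/media/files/darknet53.conv.74
./darknet detector train cfg/coco.data cfg/yolov3.cfg darknet53.conv.74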
Keep in mind that training only on your motorbike images may not give a good result; the model can come out biased, as I read somewhere before.
The rest is all inside the link. Good luck.

Image similarity detection with TensorFlow

Recently I started to play with TensorFlow. While trying to learn the popular algorithms, I am in a situation where I need to find the similarity between images.
Image A is supplied to the system by me, and userx supplies an image B; the system should return image A to userx if image B is similar (in color and class).
Now I have a few questions:
Do we consider this scenario to be supervised learning? I am asking because I don't see it as a classification problem (confused!).
What algorithms should I use to train, etc.?
Re-training would need to be done quite often; how should I tackle this problem so I don't train every time from scratch (fine-tuning?)
Do we consider this scenario to be supervised learning?
It is supervised learning when you have labels to optimize your model. So for most neural networks, it is supervised.
However, you might also look at the complete task. I guess you don't have any ground truth for image pairs and the "desired" similarity value your model should output?
One way to solve this problem, which sounds inherently unsupervised, is to take a CNN (convolutional neural network) trained (in a supervised way) on the 1000 classes of ImageNet. To get the similarity of two images, you could then simply take the Euclidean distance between the output probability distributions. This will not lead to excellent results, but it is probably a good starting point.
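A minimal sketch of that idea, assuming a Keras-style pre-trained ImageNet classifier (the choice of MobileNetV2 and the input image shape are illustrative assumptions):

# Similarity as Euclidean distance between ImageNet class distributions.
import numpy as np
from tensorflow.keras.applications import MobileNetV2
from tensorflow.keras.applications.mobilenet_v2 import preprocess_input

model = MobileNetV2(weights="imagenet")  # outputs a 1000-way softmax

def class_distribution(image):
    # image: float array of shape (224, 224, 3)
    return model.predict(preprocess_input(image[np.newaxis]))[0]

def similarity(image_a, image_b):
    # Smaller distance means more similar; map it to a score in (0, 1].
    dist = np.linalg.norm(class_distribution(image_a) - class_distribution(image_b))
    return 1.0 / (1.0 + dist)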
What algorithms should I use to train, etc.?
First, you should define what "similar" means for you. Are two images similar when they contain the same object (classes)? Are they similar if the general color of the image is the same?
For example, how similar are the following 3 pairs of images?
Have a look at FaceNet and search for "Content based image retrieval" (CBIR):
Wikipedia
Google Scholar
This can be treated as supervised learning. You can classify the images into categories; if two images are in the same category (or in close categories), you can think of them as similar.
You can use a deep convolutional neural network trained on ImageNet, such as the Inception model. The Inception model outputs a probability map over 1000 classes (a vector whose values sum to 1). You can calculate the distance between the vectors of two images to get their similarity.
On the same page as the Inception model, you will also find instructions for retraining the model: https://github.com/tensorflow/models/tree/master/inception#how-to-fine-tune-a-pre-trained-model-on-a-new-task

Type of recognition for a convolutional neural network

I was trying to create a convolutional neural network for the recognition of animals, vehicles, buildings, trees, and plants from a large dataset containing a combination of these objects.
While training, I had a doubt about the way the network should be trained: could I train the network with the dataset of all animals as a single class, or should I train each animal separately?
That is, one group for lions, one for tigers, one for elephants, etc., and at test time I can code it to output "animal" if any one of its subcategories is predicted.
I had this doubt because I have read that there should be a consistent pattern in the dataset for efficient detection, and such a pattern exists only if we train on subcategories of objects rather than on the vast combined dataset.
I have attached a figure showing a sample dataset (only logically correct). I want to know whether there should be separate datasets or a single dataset.
Whether to train on separate datasets or a single dataset will depend on a variety of factors. If you want the convolutional neural network to classify the images in your test dataset just as animals, without subdividing them further, then training on a single dataset is enough. However, if you plan to further sub-classify the images into tigers and lions, then training needs to be done on separate datasets of tigers and lions.
The type of dataset you use for training depends largely on what you need to classify in the test dataset.
Moreover, make sure that you normalize the images before using them for training.
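A small sketch of the "train on subcategories, report the coarse class" idea from the question (the label names and the predict_subcategory() function are hypothetical placeholders for your own trained network):

# Map fine-grained CNN predictions to coarse categories at test time.
COARSE_CLASS = {
    "lion": "animal", "tiger": "animal", "elephant": "animal",
    "car": "vehicle", "bus": "vehicle",
    "oak": "tree", "rose": "plant", "house": "building",
}

def predict_coarse(image):
    fine_label = predict_subcategory(image)  # hypothetical: your trained CNN's output
    return COARSE_CLASS.get(fine_label, "unknown")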

Algorithm for Multi-Class Classification of News Article

I want to classify news articles into the category they belong to. I have 4 categories of news, e.g. Technology, Sports, Politics, and Health, and I have collected around 50 documents for each category as a training set.
Is the training data enough for classification? And which algorithm should I use for classification: SVM, Random Forest, kNN?
I am using the scikit-learn (http://scikit-learn.org/) Python library for my task.
Thanks
There are many ways to attack this problem, from CRFs to Random Forests.
With your limited training data, I would suggest going with a high-bias model such as a linear SVM. Start by training one-vs-all models for each class and predicting the class with the highest score. This will give you a baseline for how hard your problem is with the given training data.
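A baseline sketch in scikit-learn along those lines, with TF-IDF features and a linear SVM (LinearSVC trains one-vs-rest classifiers internally; the train_texts and train_labels variables are placeholders for your 4 x 50 documents):

# TF-IDF + linear SVM baseline; train_texts / train_labels are placeholders.
from sklearn.pipeline import make_pipeline
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.svm import LinearSVC
from sklearn.model_selection import cross_val_score

baseline = make_pipeline(TfidfVectorizer(stop_words="english"), LinearSVC())
scores = cross_val_score(baseline, train_texts, train_labels, cv=5)
print("cross-validated accuracy:", scores.mean())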
I would suggest you use Naive Bayes classification. There is a tool called LingPipe where this is already implemented. Just refer to
http://alias-i.com/lingpipe/demos/tutorial/classify/read-me.html
There you have a small sample program, Classifynews.java. Run that program by training on the data and then applying the test data. A sample training set is given as "20 Newsgroups":
http://qwone.com/~jason/20Newsgroups/
Training is done on the training data; if needed, you can build an intermediate model and then apply the test data to that model. Naive Bayes is good for cases where the training data is small.
Its accuracy increases as the size of the training data grows, so try to include more newsgroups. Good luck. Try this and let me know.
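If you would rather stay in scikit-learn than use LingPipe, a roughly equivalent Naive Bayes baseline on the 20 Newsgroups data could look like this (the four categories are just an example mapping onto your topics):

# Naive Bayes baseline on 20 Newsgroups with scikit-learn; categories chosen as an example.
from sklearn.datasets import fetch_20newsgroups
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

categories = ["sci.med", "rec.sport.hockey", "talk.politics.misc", "sci.electronics"]
train = fetch_20newsgroups(subset="train", categories=categories)
test = fetch_20newsgroups(subset="test", categories=categories)

model = make_pipeline(TfidfVectorizer(stop_words="english"), MultinomialNB())
model.fit(train.data, train.target)
print("test accuracy:", model.score(test.data, test.target))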
