How to train an object detection model with objects in xml files? - machine-learning

I have a few images and the XML annotation files describing the objects in each image.
I want to train an object detection model on those images and their XML files.
How can I do this using Keras or TensorFlow?
Links to Jupyter notebooks or GitHub repositories would be highly appreciated.

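This question is somewhat vague. You may be thinking of an object detection algorithm such as YOLO. XML annotations are usually in Pascal VOC format, whereas YOLO under Darknet reads plain-text label files with one line per object: class x_center y_center width height, with coordinates normalized to the image size. The XML therefore has to be converted first; the Darknet YOLO page describes a voc_label.py script for that. If this is your setup, go through the YOLO paper.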
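As a rough illustration of what such XML annotations contain, here is a minimal sketch that parses one Pascal VOC style file and emits YOLO-style label lines. It assumes the usual VOC layout (a size element plus object/bndbox elements with xmin, ymin, xmax, ymax); the function name voc_to_yolo, the sample XML and the class list are made up for the example and are not taken from any library.

```python
import xml.etree.ElementTree as ET


def voc_to_yolo(xml_text, class_names):
    """Convert one Pascal VOC annotation (XML string) to YOLO label lines.

    Hypothetical helper: assumes <size> and <object>/<bndbox> elements exist.
    """
    root = ET.fromstring(xml_text)
    size = root.find("size")
    img_w = float(size.find("width").text)
    img_h = float(size.find("height").text)

    lines = []
    for obj in root.iter("object"):
        name = obj.find("name").text.strip()
        if name not in class_names:
            continue  # skip classes we are not training on
        box = obj.find("bndbox")
        xmin = float(box.find("xmin").text)
        ymin = float(box.find("ymin").text)
        xmax = float(box.find("xmax").text)
        ymax = float(box.find("ymax").text)
        # YOLO labels use the box centre and size, normalized by image size
        x_c = (xmin + xmax) / 2.0 / img_w
        y_c = (ymin + ymax) / 2.0 / img_h
        w = (xmax - xmin) / img_w
        h = (ymax - ymin) / img_h
        class_id = class_names.index(name)
        lines.append(f"{class_id} {x_c:.6f} {y_c:.6f} {w:.6f} {h:.6f}")
    return lines


# Made-up annotation for a 200x100 image containing one "ball"
SAMPLE_XML = """
<annotation>
  <filename>img_001.jpg</filename>
  <size><width>200</width><height>100</height><depth>3</depth></size>
  <object>
    <name>ball</name>
    <bndbox><xmin>50</xmin><ymin>20</ymin><xmax>150</xmax><ymax>80</ymax></bndbox>
  </object>
</annotation>
"""

if __name__ == "__main__":
    for line in voc_to_yolo(SAMPLE_XML, ["ball"]):
        print(line)
```

Each returned line would go into a .txt file named after the image. Other detection frameworks expect other formats, so treat this only as a sketch of the parsing step.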

Related

How to extract weights from trained tensorflow object detection api model

I am using the TensorFlow Object Detection API to train a couple of models (SSD and Faster R-CNN) on a custom dataset. Everything works well, but I want to know how to extract the convolutional and classification weights so that I can load them into a corresponding external (for instance Keras) convolutional and fully connected model. I have read about the meta-architectures (SSDMetaArch and FasterRCNNMetaArch) and about restoring checkpoints, but I am still not sure how to do this for my purpose.
I ask because I want to use something like CAM or Grad-CAM to visually check what the model learns for each class in my dataset.
Thank you

Pre-trained model to extract image features in TensorFlow?

Could someone please point me to models available for extracting image features in TensorFlow or Keras? I have been looking for pre-trained models that extract features from an image. I will then build a vector for each image and apply nearest-neighbor search to find similar images.
Any ordinary pre-trained classification model such as VGG or ResNet extracts different image features at each layer. The earlier layers respond to basic, simple features like edges, while the deeper layers respond to more specific features. If you want particular features extracted from your images, you have to label some data and train your model on that dataset.
For that, you can use the first couple of layers of a pre-trained model as an encoder.
But I would guess a CNN-only solution will get you better results. Here is a nice read on the subject: https://arxiv.org/ftp/arxiv/papers/1709/1709.08761.pdf
Keras includes some applications with pre-trained weights, including VGG16: https://github.com/fchollet/keras/blob/master/keras/applications/vgg16.py
There you can find the link to the weights for this VGG16 model (pre-trained on ImageNet):
https://github.com/fchollet/deep-learning-models/releases/download/v0.1/vgg16_weights_tf_dim_ordering_tf_kernels.h5
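The nearest-neighbor step mentioned in the question can be sketched without any deep learning library, assuming each image has already been reduced to a feature vector by a pre-trained network. The three-dimensional vectors and file names below are toy values invented for the example (real VGG features have hundreds or thousands of dimensions), and the helper names are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat an all-zero vector as similar to nothing
    return dot / (norm_a * norm_b)


def nearest_images(query, features, k=2):
    """Return the k image names whose vectors are most similar to the query."""
    scored = [(cosine_similarity(query, vec), name) for name, vec in features.items()]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [name for _, name in scored[:k]]


# Toy feature vectors standing in for the output of a pre-trained model
FEATURES = {
    "bike_a.jpg": [0.9, 0.1, 0.0],
    "bike_b.jpg": [0.8, 0.2, 0.1],
    "cat.jpg": [0.0, 0.1, 0.9],
}

if __name__ == "__main__":
    print(nearest_images([1.0, 0.0, 0.0], FEATURES, k=2))
```

For large collections a brute-force scan like this becomes slow, and a dedicated nearest-neighbor index is the usual choice.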

Image classification, narrow domain with custom labels

Let's suppose I would like to classify motorbikes by model.
There are a couple of hundred motorbike models I'm interested in.
I have tens, sometimes hundreds, of pictures of each model.
Can you please point me to a practical example that demonstrates how to train a model on your own data and then use it to classify images? It needs to be a deep learning model, not simple logistic regression.
I'm not sure, but it seems I can't use a pre-trained neural net, because it has been trained on a wide range of objects such as cats, humans and cars. It may not be good at distinguishing the motorbike nuances I'm interested in.
I found a couple of such examples (TensorFlow has one), but sadly all of them used a pre-trained model. None of them showed how to train on your own dataset.
In cases like yours you would use either transfer learning or fine-tuning. If you have more than a thousand images of motorbikes I would use fine-tuning; with fewer, transfer learning.
Fine-tuning means taking a pre-trained model and replacing its classifier part. The new classifier, and perhaps the last one or two layers of the pre-trained model, are then trained on your dataset.
Transfer learning means taking a pre-trained model and letting it output features for an input image. You then train a new classifier on those features, for example an SVM or logistic regression.
An example can be seen on slide 33 of https://github.com/cpra/dlvc2016/blob/master/lectures/lecture10.pdf
The paper "Quick, Draw! Doodle Recognition", from a Kaggle challenge, may be similar enough to what you are doing. The code is on GitHub. You may need some data augmentation if you only have a few hundred images per category.
What you want is fairly easy. Follow the Darknet YOLO implementation.
Instructions: https://pjreddie.com/darknet/yolo/
Code: https://github.com/pjreddie/darknet
Training YOLO on COCO
You can train YOLO from scratch if you want to play with different training regimes, hyper-parameters, or datasets. Here's how to get it working on the COCO dataset.
Get The COCO Data
To train YOLO you will need all of the COCO data and labels. The script scripts/get_coco_dataset.sh will do this for you. Figure out where you want to put the COCO data and download it, for example:
cp scripts/get_coco_dataset.sh data
cd data
bash get_coco_dataset.sh
Add your own data to that directory and make sure it follows the same format as the downloaded samples.
Now you should have all the data and the labels generated for Darknet.
Then call the training script with the pre-trained weights.
Keep in mind that training only on your motorcycle images may not give good estimates. The results can come out biased; I read that somewhere before.
The rest is all in the link. Good luck.

How to convert (same-size, categorized) images into a dataset for TensorFlow

I am learning to build a model using TensorFlow.
I have successfully run the MNIST tutorial and would now like to test the model with my own images. They are same-size images (224x224) sorted into folders by class.
I would like to use those images as input for my model, as in the MNIST example. I tried to open the MNIST dataset but it is unreadable; I guess it has been converted into some binary format. From the example, I think the MNIST dataset has a structure like this:
mnist
    test
        images
        labels
    train
        images
        labels
How can I build a dataset like the MNIST data from my own image files?
Thank you very much!
MNIST is not stored in an image format. On the MNIST website (http://yann.lecun.com/exdb/mnist/) you can see that it uses a specific binary format that is already close to a tensor or numpy array, so it can be used in TensorFlow with minimal adjustments. It is essentially a matrix of numbers.
To work with ordinary images (.jpg, for instance), use any Python image-processing library to convert them into an np.array. For example, PIL together with numpy will work.
Another option is to use TensorFlow's built-in functions to convert your images straight to tensors supported by TensorFlow. Check this out:
https://www.tensorflow.org/versions/r0.9/api_docs/python/image.html
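Before any pixel conversion, the folder layout itself has to be turned into (image path, label) pairs. The following is a minimal sketch of that bookkeeping step using only the standard library; it assumes one sub-folder per class directly under a root directory, and the function name list_labeled_images and the extension list are choices made for this example.

```python
from pathlib import Path


def list_labeled_images(root_dir, extensions=(".jpg", ".jpeg", ".png")):
    """Map a root/<class_name>/<image> folder layout to (path, label) pairs.

    Hypothetical helper: labels are integer indices into the sorted class names.
    """
    root = Path(root_dir)
    # One class per sub-folder; sorting keeps the label indices stable
    class_names = sorted(p.name for p in root.iterdir() if p.is_dir())
    class_to_index = {name: i for i, name in enumerate(class_names)}

    samples = []
    for name in class_names:
        for path in sorted((root / name).iterdir()):
            if path.suffix.lower() in extensions:
                samples.append((str(path), class_to_index[name]))
    return class_names, samples


if __name__ == "__main__":
    import sys

    # Usage: python list_images.py /path/to/image/root
    names, samples = list_labeled_images(sys.argv[1])
    print(names)
    print(samples[:5])
```

Each path can then be opened with an image library and converted to an array, and the integer labels stacked into a second array, which gives the same images/labels split as the MNIST data.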

ORB feature descriptor

I need to program my bot so that it can find an object it is asked to pick up and bring it to a commanded position. I have tried simple image-processing techniques such as filtering and contour finding, but they don't seem to work well. I want to use the ORB feature extractor. Here are some sample images; the object of interest is the ball. In short, how do I train my bot to pick up balls or other objects? Any sample program showing how to use ORB would be helpful. Thanks in advance.
http://i.stack.imgur.com/spobV.jpg
http://i.stack.imgur.com/JNH1T.jpg
You can try learning-based algorithms such as a Haar classifier to detect any object. Thanks to OpenCV, the whole training process is very streamlined. All you have to do is train your classifier with some positive images (images of the object) and negative images (any images that do not contain the object).
Below are some links for your reference.
Haar trainer for ball-pen detection: http://opencvuser.blogspot.com/2011/08/creating-haar-cascade-classifier-aka.html
Haar trainer for banana detection :) : http://coding-robin.de/2013/07/22/train-your-own-opencv-haar-classifier.html
