Using TensorFlow Object Detection on only 1 class - opencv

I am using the TensorFlow Object Detection API on my own dataset (drone recognition) with only one class, named 'drone'. After about 30000 training steps, the resulting model detects drones with very high accuracy, but I have a problem. I fine-tuned the ssd_inception_v2_coco model from the model zoo using its fine_tune_checkpoint, and in real-time detection it sometimes detects a human face as a drone, even though the two objects look very different. I think the old checkpoint is the cause.
How can I prevent detections of objects that look very different from my drone, such as humans, dogs, cats...? Or can someone explain what the problem is here?
Sorry for my bad English.

Even if you train an SSD for one class, it automatically creates another class called background. The background is trained using the regions of the training images that are not labeled as the desired classes (in your case, drone).
An easy way out is to add training images that contain, in the same scene, both drones and the things that you don't want recognized as drones. Doing this and then training for more epochs should improve the precision.
If certain objects frequently occur alongside drones in your application, another possibility is to train the network to detect those objects too. This will increase your training workload, but should improve the accuracy.
Some implementations of SSD have an option for hard negative mining, so that the false positives the model produces are specifically fed back into training. If you are familiar with the code, you might want to check whether this is available.
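To make the hard negative mining idea concrete, here is a minimal sketch of one common form of it: rank the background boxes by their loss and keep only the hardest ones, at most three per positive box (the ratio used in the original SSD paper). The function name `mine_hard_negatives` and the loss values are made up for illustration; this is not the code of any particular SSD implementation.

```python
def mine_hard_negatives(losses, is_positive, neg_pos_ratio=3):
    """Return indices of boxes kept in the loss: all positives plus the hardest negatives."""
    positives = [i for i, pos in enumerate(is_positive) if pos]
    negatives = [i for i, pos in enumerate(is_positive) if not pos]
    # Rank background boxes by loss, highest (most confidently wrong) first.
    negatives.sort(key=lambda i: losses[i], reverse=True)
    # Keep at most neg_pos_ratio negatives for every positive box.
    max_negatives = neg_pos_ratio * len(positives)
    return sorted(positives + negatives[:max_negatives])


# Hypothetical per-box classification losses for one training image.
losses = [0.2, 2.5, 0.1, 1.7, 0.05, 0.9, 0.3]
is_positive = [True, False, False, False, False, False, False]
kept = mine_hard_negatives(losses, is_positive)
print(kept)
```

With one drone box in the image, only the three background boxes with the largest loss (for example a face scored as a drone) contribute to the update, so easy background does not swamp the mistakes you care about.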

Related

What is the appropriate way to train a face detector / classifier?

I want to build a face detector/classifier to generate a network that detects whether a face is present in an image/video.
I understand the basic concept, but what I have problems with is the choice of the number of classes.
Initially, I thought that two classes (with face / without face) would be sufficient. However, I was unsure which data I should use for the class 'without face'. So I threw together datasets of equipment and plants and animals, whereupon the classes were very unbalanced, which is apparently not good.
Then I thought it would be better to use as many classes as possible.
But again, I am unsure what would be the best/common approach to the problem?
You can experiment with any number of samples and different images for the negative class. If the datasets with equipment/plant/places you have are imbalanced, you can try to subsample, e.g. pick 100 images from each.
Just don't make the negative class too large relative to the number of images with faces that you have. The rest is up to experimentation.
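The subsampling suggestion could look roughly like the sketch below. It assumes the negative images are already grouped by category in a dictionary; the category names, file names and the helper `subsample_negatives` are hypothetical.

```python
import random


def subsample_negatives(images_by_category, per_category=100, seed=0):
    """Pick at most `per_category` images from each negative category."""
    rng = random.Random(seed)
    sampled = []
    for category, images in sorted(images_by_category.items()):
        # Small categories contribute everything they have.
        k = min(per_category, len(images))
        sampled.extend(rng.sample(images, k))
    return sampled


# Hypothetical, imbalanced pools of images without faces.
negatives = {
    "equipment": [f"equipment_{i}.jpg" for i in range(1000)],
    "plants": [f"plants_{i}.jpg" for i in range(250)],
    "animals": [f"animals_{i}.jpg" for i in range(60)],
}
no_face = subsample_negatives(negatives)
print(len(no_face))  # 100 + 100 + 60 = 260
```

The cap per category is the knob to experiment with; compare the resulting total against the number of face images you have.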

Finding a suitable CNN architecture for classification

I want to use a convolutional neural network (CNN) to classify images into two classes. I built several CNN architectures, but I always get the same result: the network classifies every case as the second class. Therefore, I always get 50% accuracy in leave-one-out cross-validation. The data is balanced in terms of the number of samples per class (16 from the 1st and 16 from the 2nd). Could you please clarify what this means?
With such a small number of training samples, your CNN model is very likely to overfit the data, giving good training accuracy and poor test accuracy.
Alternatively, your model may be skewed and predict the same class every time.
Below are some of the solutions you can try:
1) As you have commented, if you cannot get any more images, then try creating new images by modifying the ones already available. For example, say you have 16 images of a cat (cat is the class). You can crop the cat and paste it onto different backgrounds, vary the brightness, intensity, etc., and apply rotation, translation and similar operations.
This will help you create a good training set.
2) Try creating a smaller model (with one or two layers) and check if it improves your accuracy.
3) You can do transfer learning with a good pre-trained model, as it can learn quite well compared to training a model from scratch.
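As an illustration of suggestion 1), here is a dependency-free sketch of a few augmentations on a grayscale image stored as nested lists of 0-255 values. The helper names and parameter ranges are assumptions chosen for the example; in practice you would more likely use NumPy or an image library for the same operations.

```python
import random


def flip_horizontal(image):
    """Mirror each row left to right."""
    return [row[::-1] for row in image]


def adjust_brightness(image, factor):
    """Scale pixel values and clamp them to the 0-255 range."""
    return [[min(255, max(0, round(p * factor))) for p in row] for row in image]


def translate_right(image, shift, fill=0):
    """Shift pixels right by `shift` columns, padding the left edge with `fill`."""
    width = len(image[0])
    shift = min(shift, width)
    return [[fill] * shift + row[:width - shift] for row in image]


def augment(image, rng):
    """Apply a random combination of the transformations above."""
    out = image
    if rng.random() < 0.5:
        out = flip_horizontal(out)
    out = adjust_brightness(out, rng.uniform(0.7, 1.3))
    out = translate_right(out, rng.randint(0, 1))
    return out


image = [[10, 20, 30], [40, 50, 60]]  # tiny stand-in for a real image
rng = random.Random(0)
augmented = [augment(image, rng) for _ in range(5)]
```

Each original image yields several variants with the same label, which is the point of the exercise when only 16 samples per class are available.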

Neural Network gets stuck

I am experimenting with classification using neural networks (I am using tensorflow).
And unfortunately the training of my neural network gets stuck at 42% accuracy.
I have 4 classes, into which I try to classify the data.
And unfortunately, my data set is not well balanced, meaning that:
43% of the data belongs to class 1 (and yes, my network gets stuck predicting only this)
37% to class 2
13% to class 3
7% to class 4
The optimizer I am using is AdamOptimizer and the cost function is tf.nn.softmax_cross_entropy_with_logits.
I was wondering if the reason for my training getting stuck at 42% is really the fact that my data set is not well balanced, or because the nature of the data is really random, and there are really no patterns to be found.
Currently my NN consists of:
input layer
2 convolution layers
7 fully connected layers
output layer
I tried changing this structure of the network, but the result is always the same.
I also tried Support Vector Classification, and the result is pretty much the same, with small variations.
Did somebody else encounter similar problems?
Could anybody please provide me some hints how to get out of this issue?
Thanks,
Gerald
I will assume that you have already double, triple and quadruple checked that the data going in is matching what you expect.
The question is quite open-ended, and even a topic for research. But there are some things that can help.
In terms of better training, there are two common ways in which people train neural networks with an unbalanced dataset:
1) Oversample the examples with lower frequency, such that the proportion of examples for each class that the network sees is equal, e.g. in every batch, enforce that 1/4 of the examples are from class 1, 1/4 from class 2, etc.
2) Weight the error for misclassifying each class by the inverse of its proportion, e.g. incorrectly classifying an example of class 1 is worth 100/43, while incorrectly classifying an example of class 4 is worth 100/7.
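Both approaches can be sketched in a few lines of plain Python. The class shares are the ones from the question; the helpers `weighted_cross_entropy` and `balanced_batch` are hypothetical names for this illustration and are not TensorFlow functions.

```python
import math
import random

# Share of each class in the data set, in percent (from the question).
class_share = {1: 43, 2: 37, 3: 13, 4: 7}

# Loss weighting: a mistake on a rare class costs more (100/43 vs. 100/7).
class_weight = {c: 100 / share for c, share in class_share.items()}


def weighted_cross_entropy(probs, label):
    """Cross-entropy of one example, scaled by the weight of its true class.

    `probs` maps each class to the predicted probability for that class.
    """
    return -class_weight[label] * math.log(probs[label])


def balanced_batch(examples_by_class, per_class, rng):
    """Oversampling: draw the same number of examples from every class."""
    batch = []
    for c in sorted(examples_by_class):
        # Sampling with replacement lets rare classes repeat within a batch.
        batch.extend(rng.choices(examples_by_class[c], k=per_class))
    rng.shuffle(batch)
    return batch


# Hypothetical data: (example id, label) pairs, sized like the class shares.
examples_by_class = {
    c: [(f"x{c}_{i}", c) for i in range(share)] for c, share in class_share.items()
}
batch = balanced_batch(examples_by_class, per_class=8, rng=random.Random(0))

uniform = {c: 0.25 for c in class_share}
print(weighted_cross_entropy(uniform, 1), weighted_cross_entropy(uniform, 4))
```

With weighting, a network that always answers class 1 is penalized heavily for every class 4 example it gets wrong, so the constant prediction stops being a comfortable minimum.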
That being said, if your learning rate is good, neural networks will often eventually (after many hours of just sitting there) jump out of only predicting for one class, but they still rarely end well with a badly skewed dataset.
If you want to know whether or not there are patterns in your data which can be determined, there is a simple way to do that.
Create a new dataset by randomly selecting elements from all of your classes such that you have an equal number of each (i.e. if there are 700 examples of class 4, then construct a dataset by randomly selecting 700 examples from every class).
Then you can use all of your techniques on this new dataset.
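A minimal sketch of building such a balanced dataset, assuming the examples are available as (features, label) pairs; the function name `balanced_subset` and the toy data are made up for illustration.

```python
import random
from collections import defaultdict


def balanced_subset(examples, seed=0):
    """Return a shuffled subset with the same number of examples per class.

    `examples` is a list of (features, label) pairs; every class is cut
    down to the size of the rarest class.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for example in examples:
        by_class[example[1]].append(example)
    n = min(len(group) for group in by_class.values())
    subset = []
    for label in sorted(by_class):
        subset.extend(rng.sample(by_class[label], n))
    rng.shuffle(subset)
    return subset


# Hypothetical data set with the 43/37/13/7 split from the question.
counts = {1: 4300, 2: 3700, 3: 1300, 4: 700}
examples = [(f"x{c}_{i}", c) for c, n in counts.items() for i in range(n)]
subset = balanced_subset(examples)
print(len(subset))  # 4 classes * 700 examples = 2800
```

On this subset, chance level is 25%, so accuracy clearly above that indicates there is something to learn.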
Note, though, that this paper suggests that even with random labels, the network should still be able to find some pattern that it understands.
Firstly, you should check whether your model is overfitting or underfitting, both of which could cause low accuracy. Compare the accuracy on the training set and the dev set: if accuracy on the training set is much higher than on the dev/test set, the model may be overfitting; if accuracy on the training set is as low as it is on the dev/test set, it could be underfitting.
For overfitting, more data or a simpler model may help, while a more complex model and longer training may solve an underfitting problem.
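The check described above can be written down as a small rule of thumb. The thresholds (`gap_tol` and `target_acc`) are arbitrary assumptions for this sketch and need to be chosen for your problem.

```python
def diagnose_fit(train_acc, dev_acc, target_acc, gap_tol=0.05):
    """Rough diagnosis from training and dev accuracy (values in 0..1)."""
    if train_acc - dev_acc > gap_tol:
        # Much better on training data than on held-out data.
        return "overfitting"
    if train_acc < target_acc:
        # Poor even on the data the model was trained on.
        return "underfitting"
    return "ok"


print(diagnose_fit(train_acc=0.95, dev_acc=0.45, target_acc=0.80))
print(diagnose_fit(train_acc=0.42, dev_acc=0.42, target_acc=0.80))
```

The second call mirrors the situation in the question: training accuracy stuck at 42% and no better on held-out data, which points to underfitting (or to data with little signal) rather than overfitting.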

Can inception model be used for object counting in an image?

I have already gone through the image classification part of the Inception model, but I need to count the objects in the image.
Considering the flowers data-set, one image can have multiple instances of a flower, so how can I get that count?
What you describe is known in the research community as instance-level segmentation.
In the last year alone there has been a significant spike in papers addressing this problem.
Here are some of the papers:
https://arxiv.org/pdf/1412.7144v4.pdf
https://arxiv.org/pdf/1511.08498v3.pdf
https://arxiv.org/pdf/1607.03222v2.pdf
https://arxiv.org/pdf/1607.04889v2.pdf
https://arxiv.org/pdf/1511.08250v3.pdf
https://arxiv.org/pdf/1611.07709v1.pdf
https://arxiv.org/pdf/1603.07485v2.pdf
https://arxiv.org/pdf/1611.08303v1.pdf
https://arxiv.org/pdf/1611.08991v2.pdf
https://arxiv.org/pdf/1611.06661v2.pdf
https://arxiv.org/pdf/1612.03129v1.pdf
https://arxiv.org/pdf/1605.09410v4.pdf
As you can see in these papers, a simple object classification network won't solve the problem.
If you search GitHub you will find a few repositories with basic frameworks that you can build on:
https://github.com/daijifeng001/MNC (caffe)
https://github.com/bernard24/RIS/blob/master/RIS_infer.ipynb (torch)
https://github.com/jr0th/segmentation (keras, tensorflow)
indraforyou has answered how to solve the problem you are having. I want to add something about the Inception model specifically. In https://arxiv.org/pdf/1312.6229.pdf the authors propose a regressor network trained on the output of a model that was trained on the ImageNet dataset, such as the Inception model. This regressor is then used to propose object boundaries, which you can use for counting. The advantage of this approach is that you do not have to annotate any training examples and can just use the ImageNet dataset for training.
If you do not want to train anything, I would propose using a heuristic to find object boundaries. The literature on image segmentation (https://en.wikipedia.org/wiki/Image_segmentation) should help you find a suitable heuristic. I do think using a heuristic will decrease your accuracy, though.
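One very simple heuristic of this kind is to threshold the image into a foreground mask and count the connected regions. The sketch below assumes the mask already exists as nested lists of 0/1 and uses 4-connectivity; `count_objects` is a hypothetical helper, and touching or overlapping flowers would be merged into one region, which is exactly where the accuracy loss comes from.

```python
from collections import deque


def count_objects(mask):
    """Count 4-connected regions of truthy cells in a 2D binary mask."""
    rows, cols = len(mask), len(mask[0])
    seen = [[False] * cols for _ in range(rows)]
    count = 0
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c] or seen[r][c]:
                continue
            # Found an unvisited foreground pixel: flood-fill its region.
            count += 1
            seen[r][c] = True
            queue = deque([(r, c)])
            while queue:
                y, x = queue.popleft()
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and mask[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
    return count


# Hypothetical foreground mask with three separate blobs.
mask = [
    [1, 1, 0, 0, 0],
    [1, 0, 0, 1, 1],
    [0, 0, 0, 1, 0],
    [0, 1, 0, 0, 0],
]
print(count_objects(mask))  # 3
```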
Last but not least, this is an open problem in computer vision research. You should not expect to get 100% accuracy, or even 95% accuracy, on counting. Many very smart people have tried this and reported mixed results. Still, some very cool things can be accomplished.
Any classification model, such as the Inception model, will identify the object (a flower in your case). However, when there are multiple items in the image, classification alone won't work (put simply, it gets confused).
Thus:
You have to segment the main image into sub-images with one object per image and run classification on each segment. This is known as image segmentation in image processing.

Null Classes in Machine Learning

How do we handle null classes at test time in a machine learning system? If I train my model on, let's say, 10 classes, and then I observe a class that does not belong to any of the 10 classes, is there a way to detect this occurrence? It is needed for activity recognition in a sliding window approach where each time step yields one of the 10 classes; however, there are actually time steps where nothing happens, and so the algorithm should not classify.
This would be what's called outlier or novelty detection. Some basic info here. You would want to use an outlier detection algorithm first (where all 10 classes are your inliers) to filter out new classes that you haven't seen before. Then, if a sample passes the outlier detector, you feed it into your classifier. There will be some false positives/negatives at the outlier stage, which will have an impact on what fraction of the data you classify correctly.
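The two-stage idea can be sketched as follows. As a deliberately simple stand-in for a real outlier detector, a sample counts as an outlier when it is farther than `max_distance` from every class centroid, and the classifier is nearest-centroid. The class names, the data and the threshold are assumptions for this illustration only.

```python
import math


def fit_centroids(samples_by_class):
    """Compute the mean feature vector of each known class."""
    centroids = {}
    for label, points in samples_by_class.items():
        dims = len(points[0])
        centroids[label] = [sum(p[d] for p in points) / len(points) for d in range(dims)]
    return centroids


def is_outlier(x, centroids, max_distance):
    """Stage 1: reject samples that are far from every known class."""
    return min(math.dist(x, c) for c in centroids.values()) > max_distance


def classify(x, centroids):
    """Stage 2: nearest-centroid classifier over the known classes."""
    return min(centroids, key=lambda label: math.dist(x, centroids[label]))


def predict(x, centroids, max_distance):
    """Return a class label, or None when the sample looks like a null class."""
    if is_outlier(x, centroids, max_distance):
        return None
    return classify(x, centroids)


# Hypothetical 2-D features for two of the known activities.
training = {
    "walking": [[0, 0], [0, 2], [2, 0], [2, 2]],
    "running": [[10, 10], [10, 12], [12, 10], [12, 12]],
}
centroids = fit_centroids(training)
print(predict([1.5, 1.0], centroids, max_distance=3.0))  # walking
print(predict([50, 50], centroids, max_distance=3.0))    # None
```

The threshold controls the trade-off mentioned above: too tight and real activities are rejected, too loose and unseen classes slip through to the classifier.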
however, actually, there are time-steps where nothing happens and so the algorithm should not classify.
Perhaps, then, what you really should consider is an 11th class of "no activity". If it's real data that occurs regularly, you should treat it as such.
