Extract Bottleneck Features in Batch Mode - machine-learning

I am using the Inception model to extract features for transfer learning in TensorFlow. The major issue is that it only extracts features for one image at a time. How can I run it in batch mode to make it faster?
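One possible approach, sketched here under assumptions rather than taken from the original question: if the model accepts a leading batch dimension, group the images into fixed-size chunks and run the model once per chunk instead of once per image. In the sketch below, `extract_batch` is a hypothetical stand-in for whatever call runs your model on a list of inputs; the helper names and the batch size of 32 are illustrative as well.

```python
from typing import Callable, Iterator, List, Sequence


def iter_batches(items: Sequence, batch_size: int) -> Iterator[Sequence]:
    """Yield consecutive slices of `items`, each with at most `batch_size` entries."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


def extract_in_batches(items: Sequence,
                       extract_batch: Callable[[Sequence], List],
                       batch_size: int = 32) -> List:
    """Call `extract_batch` once per chunk and collect one feature vector per item."""
    features: List = []
    for batch in iter_batches(items, batch_size):
        features.extend(extract_batch(batch))
    return features


# Stand-in extractor (hypothetical): records how many items each call received
# and returns a fake 2-dimensional "feature vector" per item.
calls = []


def fake_extract_batch(batch):
    calls.append(len(batch))
    return [[float(x), float(x) * 2.0] for x in batch]


features = extract_in_batches(list(range(70)), fake_extract_batch, batch_size=32)
print(len(features), calls)  # 70 [32, 32, 6]
```

In a real pipeline the stand-in would be replaced by a function that loads and preprocesses the images in the chunk, stacks them into one array, and runs the network on that array in a single call.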

Related

Classifying image documents using Machine learning

I want to classify image documents (like passports, driving licences, etc.) using machine learning.
Does anybody have any links or documents where I can get an idea of how to do this task?
What I am thinking of is first converting the document to text format and then extracting the information from the text file. But I can only do this with one file at a time.
I want to know how I can perform this on millions of documents.
You don't need to convert documents to text, you can do this with images directly.
To do image classification you can build a basic CNN with the Keras library.
https://towardsdatascience.com/building-a-convolutional-neural-network-cnn-in-keras-329fbbadc5f5
This basic CNN will be enough for you to train an image classifier. But if you want state-of-the-art accuracy, I recommend taking a pretrained ResNet50 and training it as your image classifier. Besides accuracy, there is another major advantage to using a pretrained network: you'll need less data to train a robust image classifier.
https://engmrk.com/kerasapplication-pre-trained-model/?utm_campaign=News&utm_medium=Community&utm_source=DataCamp.com
The only thing you'll need to change is the number of output classes, from 1000 to the number of classes you want.

MobileNet vs SqueezeNet vs ResNet50 vs Inception v3 vs VGG16

I have recently been looking into incorporating the machine learning release for iOS developers into my app. Since this is my first time using anything ML related, I was very lost when I started reading the different model descriptions that Apple has made available. They have the same purpose/description, the only difference being the actual file size. What is the difference between these models, and how would you know which one is the best fit?
The models Apple makes available are just for simple demo purposes. Most of the time, these models are not sufficient for use in your own app.
The models on Apple's download page are trained for a very specific purpose: image classification on the ImageNet dataset. This means they can take an image and tell you what the "main" object is in the image, but only if it's one of the 1,000 categories from the ImageNet dataset.
Usually, this is not what you want to do in your own apps. If your app wants to do image classification, typically you want to train a model on your own categories (like food or cars or whatever). In that case you can take something like Inception-v3 (the original, not the Core ML version) and re-train it on your own data. That gives you a new model, which you then need to convert to Core ML again.
If your app wants to do something other than image classification, you can use these pretrained models as "feature extractors" in a larger neural network structure. But again this involves training your own model (usually from scratch) and then converting the result to Core ML.
So only in a very specific use case -- image classification using the 1,000 ImageNet categories -- are these Apple-provided models useful to your app.
If you do want to use any of these models, the difference between them is speed vs. accuracy. The smaller models are fastest but also least accurate. (In my opinion, VGG16 shouldn't be used on mobile. It's just too big and it's no more accurate than Inception or even MobileNet.)
SqueezeNets are fully convolutional and use Fire modules, which have a squeeze layer of 1x1 convolutions that vastly decreases the parameter count by restricting the number of input channels seen by the layers that follow. This, together with the fact that they have no dense layers, makes SqueezeNets extremely low latency.
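To make the parameter saving concrete, here is a rough count for a single Fire module compared with a plain 3x3 convolution producing the same number of output channels. This is only a sketch: the channel sizes are illustrative assumptions, biases are ignored, and the function names are made up for this example.

```python
def conv_params(kernel: int, in_channels: int, out_channels: int) -> int:
    """Weights in a kernel x kernel convolution (biases ignored)."""
    return kernel * kernel * in_channels * out_channels


def fire_module_params(in_channels: int, squeeze: int,
                       expand_1x1: int, expand_3x3: int) -> int:
    """Weights in a Fire module: 1x1 squeeze, then parallel 1x1 and 3x3 expand."""
    squeeze_params = conv_params(1, in_channels, squeeze)
    expand_params = (conv_params(1, squeeze, expand_1x1)
                     + conv_params(3, squeeze, expand_3x3))
    return squeeze_params + expand_params


# Illustrative sizes: 96 input channels squeezed to 16, expanded to 64 + 64 = 128.
fire = fire_module_params(in_channels=96, squeeze=16, expand_1x1=64, expand_3x3=64)
plain = conv_params(kernel=3, in_channels=96, out_channels=128)
print(fire, plain)  # 11776 110592
```

The 3x3 filters in the expand layer only see the 16 squeezed channels rather than all 96, which is where most of the saving comes from.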
MobileNets utilise depth-wise separable convolutions, very similar to inception towers in Inception. These also reduce the number of parameters and hence latency. MobileNets also have useful model-shrinking parameters that you can set before training to make the model exactly the size you want. The Keras implementation can use ImageNet pre-trained weights too.
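As a rough illustration of why depth-wise separable convolutions have fewer parameters, the sketch below counts the weights of a standard convolution and of its depth-wise separable counterpart (a per-channel k x k filter followed by a 1x1 pointwise convolution). The layer sizes are assumed for the example and biases are ignored.

```python
def standard_conv_params(kernel: int, in_channels: int, out_channels: int) -> int:
    """Weights in a standard kernel x kernel convolution (biases ignored)."""
    return kernel * kernel * in_channels * out_channels


def depthwise_separable_params(kernel: int, in_channels: int, out_channels: int) -> int:
    """Weights in a depth-wise convolution followed by a 1x1 pointwise convolution."""
    depthwise = kernel * kernel * in_channels   # one k x k filter per input channel
    pointwise = in_channels * out_channels      # 1x1 convolution that mixes channels
    return depthwise + pointwise


# Illustrative layer: 3x3 kernel, 128 input channels, 256 output channels.
standard = standard_conv_params(3, 128, 256)
separable = depthwise_separable_params(3, 128, 256)
print(standard, separable)             # 294912 33920
print(f"{standard / separable:.1f}x")  # 8.7x
```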
The other models are very deep, large models. The reduced number of parameters / style of convolution is not used for low latency but just for the ability to train very deep models, essentially. ResNet introduced residual connections between layers which were originally believed to be key in training very deep models. These aren't seen in the previously mentioned low latency models.

How intensive is training a machine learning algorithm?

I'd like to make an app using iOS's new Core ML framework that does image recognition. To do so I'd probably have to train my own model, and I'm wondering exactly how much data and compute power it would require. Is it something I could feasibly accomplish on a dual-core i5 MacBook Pro using Google Images for source data, or would it be much more involved?
It depends on what sort of images you want to train your model to recognize.
What is often done is fine-tuning an existing model. You take a pretrained version of Inception-v3 (let's say) and then replace the final layer with your own. You train this last layer on your own images.
You still need a fair number of training images (a few hundred per category, but more is better), but you can do this on your MacBook Pro in anywhere from 30 minutes to a few hours.
TensorFlow comes with a script that makes it really easy to do this. Keras has a great blog post on how to do this. I used the TensorFlow script to re-train Inception-v3 to tell apart my two cats, from 50 or so images of each cat.
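The "train only the last layer" idea can be sketched without any deep learning framework: treat the pretrained network's output as fixed feature vectors and fit a small softmax classifier on top of them. The example below is not the TensorFlow script mentioned above. It uses NumPy, and the random 8-dimensional vectors are a stand-in for real bottleneck features; the learning rate and epoch count are arbitrary assumptions.

```python
import numpy as np


def train_last_layer(features, labels, num_classes, lr=0.5, epochs=200):
    """Fit a softmax layer on fixed feature vectors with plain gradient descent."""
    n, d = features.shape
    weights = np.zeros((d, num_classes))
    bias = np.zeros(num_classes)
    one_hot = np.eye(num_classes)[labels]
    for _ in range(epochs):
        logits = features @ weights + bias
        logits -= logits.max(axis=1, keepdims=True)   # numerical stability
        probs = np.exp(logits)
        probs /= probs.sum(axis=1, keepdims=True)
        grad = (probs - one_hot) / n                  # gradient of mean cross-entropy
        weights -= lr * (features.T @ grad)
        bias -= lr * grad.sum(axis=0)
    return weights, bias


def predict(features, weights, bias):
    """Return the most likely class index for each feature vector."""
    return np.argmax(features @ weights + bias, axis=1)


# Stand-in "bottleneck features": two well-separated clusters, 50 samples each.
rng = np.random.default_rng(0)
class_a = rng.normal(loc=1.0, scale=0.3, size=(50, 8))
class_b = rng.normal(loc=-1.0, scale=0.3, size=(50, 8))
features = np.vstack([class_a, class_b])
labels = np.array([0] * 50 + [1] * 50)

weights, bias = train_last_layer(features, labels, num_classes=2)
accuracy = (predict(features, weights, bias) == labels).mean()
print(f"training accuracy: {accuracy:.2f}")
```

Because the expensive convolutional layers are never updated, the features can be computed once and cached, which is a large part of why this kind of training is feasible on a laptop.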
If you want to train from scratch you probably want to do this in the cloud using AWS, Google's Cloud ML Engine, or something easy like FloydHub.

Does Keras add any delay/overhead compared to pure TensorFlow?

I'm working on a project that requires object detection and recognition on images fed by a live camera.
For it to work well, it should perform the evaluation for a single frame as quickly as possible.
Will going straight to TensorFlow instead of using Keras with the TF backend improve the performance of the whole model during evaluation?

Using Weka on Images

I am new to Weka, and from the examples on how to use it, I have only seen text problems. Can I use images in Weka with the machine learning classifiers?
You can do pixel classification directly using the Trainable Weka Segmentation plugin (formerly the Advanced Weka Segmentation plugin) from Fiji/ImageJ.
The plugin is designed for segmentation via interactive learning. This means the user is expected to select a set of features (edge detectors, texture filters, etc.), choose the number of classes (by default there are 2) and interactively draw (with the ROI tools) samples of all classes. After training the classifier based on those samples, the whole image pixels will be classified and the segmentation result will be displayed overlaying the original image. The idea is to repeat this process (drawing + training) until obtaining a satisfying segmentation.
The plugin provides as well a set of tools to save/load the samples in ARFF format and save/load the classifier in .model format, so it's completely compatible with the latest version of WEKA.
If what you want to do is image classification, you might be able to reuse some of the plugin's methods as well.
You can use an open-source image processing application such as ImageJ or Fiji to extract features from your images and use them in Weka.
Fiji has a plugin called Advanced Weka Segmentation, which should be very useful for applying Weka classifiers to images.
Weka's machine learning classifiers work with numerical and categorical features. Before using Weka with images, you need to extract features from your images.
Depending on your needs, simple features like the average or maximum pixel value may be enough. Or you may need to apply other algorithms to your images.
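As a minimal sketch of the "simple features" route, the code below computes a few per-image statistics and writes them in Weka's ARFF format. Assumptions: each image is already available as a 2D list of grayscale values, the minimum is added here as one more illustrative statistic, and the relation name, class names and helper functions are made up for this example.

```python
def pixel_features(image):
    """Return simple intensity statistics for a 2D list of grayscale pixel values."""
    pixels = [value for row in image for value in row]
    return {
        "mean": sum(pixels) / len(pixels),
        "max": max(pixels),
        "min": min(pixels),
    }


def to_arff(relation, rows, class_names):
    """Format (features, label) pairs as the text of an ARFF file."""
    lines = [f"@relation {relation}", ""]
    for name in ("mean", "max", "min"):
        lines.append(f"@attribute {name} numeric")
    lines.append("@attribute class {" + ",".join(class_names) + "}")
    lines += ["", "@data"]
    for features, label in rows:
        lines.append(
            f"{features['mean']:.2f},{features['max']},{features['min']},{label}"
        )
    return "\n".join(lines) + "\n"


# Two tiny stand-in "images" (2x2 grayscale), one per class.
dark = [[0, 10], [20, 30]]
bright = [[200, 210], [220, 250]]
rows = [(pixel_features(dark), "dark"), (pixel_features(bright), "bright")]
arff_text = to_arff("image_stats", rows, class_names=["dark", "bright"])
print(arff_text)
```

Saving `arff_text` to a `.arff` file gives a dataset that can be opened in Weka and used with its standard classifiers.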
Below are some of the feature extraction algorithms listed on Wikipedia:
Low-level
Edge detection
Corner detection
Blob detection
Ridge detection
Scale-invariant feature transform
I suggest reading an optical character recognition survey to understand how they are used. OCR is a pretty simple example to start with. Standard data sets and algorithms exist for OCR, so it is very instructive to learn about.
