I'd like to make an app using iOS's new Core ML framework that does image recognition. To do so I'd probably have to train my own model, and I'm wondering exactly how much data and compute power that would require. Is it something I could feasibly accomplish on a dual-core i5 MacBook Pro using Google Images for source data, or would it be much more involved?
It depends on what sort of images you want to train your model to recognize.
What is often done is fine-tuning an existing model. You take a pretrained version of Inception-v3 (let's say) and then replace the final layer with your own. You train this last layer on your own images.
You still need a fair number of training images (a few hundred per category, but more is better), but you can do this on your MacBook Pro in anywhere from 30 minutes to a few hours.
TensorFlow comes with a script that makes it really easy to do this. Keras has a great blog post on how to do this. I used the TensorFlow script to re-train Inception-v3 to tell apart my two cats, from 50 or so images of each cat.
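To make the "replace the final layer" idea concrete, here is a minimal sketch in Keras. It trains only a new classification layer on top of a frozen Inception-v3; the folder name cats/train and the class count are placeholders for your own data.

```python
from tensorflow import keras

# Load Inception-v3 without its 1000-class ImageNet head and freeze it,
# so only the new final layer gets trained.
base = keras.applications.InceptionV3(weights="imagenet", include_top=False, pooling="avg")
base.trainable = False

num_classes = 2  # e.g. two cats -- replace with your own category count
model = keras.Sequential([
    keras.layers.Rescaling(1.0 / 127.5, offset=-1),  # Inception expects inputs in [-1, 1]
    base,
    keras.layers.Dense(num_classes, activation="softmax"),
])
model.compile(optimizer="adam", loss="categorical_crossentropy", metrics=["accuracy"])

# "cats/train" is a placeholder folder with one subfolder of images per category.
train_ds = keras.utils.image_dataset_from_directory(
    "cats/train", image_size=(299, 299), label_mode="categorical")
model.fit(train_ds, epochs=10)
```

With only the final Dense layer trainable, this runs comfortably on a laptop CPU for a few hundred images.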
If you want to train from scratch you probably want to do this in the cloud using AWS, Google's Cloud ML Engine, or something easy like FloydHub.
Related
I am trying to create a personal project with 4-5 guns from PUBG Mobile along with their different skins. I want to create an image classifier that classifies all of these guns separately. Can you please help me with how I should start and proceed? For example: how do I create the dataset and take the images? What data augmentation should I apply (scaling, shifting, rotating, etc.)? Which model should I use: AlexNet? A VGG model? What key points should I keep in mind, which Python libraries -- everything.
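As a starting point for the augmentation part of this question, here is a minimal sketch using Keras's ImageDataGenerator; the guns/train folder name is just a placeholder for a directory with one subfolder per class.

```python
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Random shifts, rotations and zoom (scaling) applied on the fly while training.
datagen = ImageDataGenerator(
    rescale=1.0 / 255,
    rotation_range=20,        # random rotations
    width_shift_range=0.1,    # horizontal shifts
    height_shift_range=0.1,   # vertical shifts
    zoom_range=0.2,           # scaling
    horizontal_flip=True,
)

# "guns/train" is a hypothetical folder: one subfolder per weapon/skin class.
train_gen = datagen.flow_from_directory(
    "guns/train", target_size=(224, 224), class_mode="categorical")
```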
I have a project in which I should classify data coming from several sensors (time-series data), like a gyroscope, into several classes. I have used several classifiers including SVM, decision trees, neural networks, KNN, ... in a batch scenario. My ultimate goal is to find a real-time classifier that is accurate, lightweight, and also able to improve itself, so I can implement it on my device, which has limited resources (CPU, RAM, ...). I was thinking of a semi-supervised classifier, since I can store a few labeled data points on my device and use future data points to improve the classifier. Does anyone have any recommendations or experience in this regard?
Online learning is very challenging. I recommend you steer away from it for now and use batch learning. You can always update the model as you update the mobile app, or just make the app look for a new updated model on your server every x days.
Now, how do you run a machine learning algorithm efficiently on a phone with limited resources? First, you have to identify which platform you are using; I assume you want a platform-agnostic answer. Most ML algorithms (except lazy-learning ones) can run efficiently on a smartphone; have a look at this benchmarking experiment.
You have several options here:
iOS: Here's a list of all machine learning libraries available publicly.
Android: Weka for Android, this lib has a huge number of ML algorithms.
Platform-agnostic deep learning: TensorFlow, which lets you export your models to TensorFlow Lite (tutorial) and deploy them on any mobile OS, and Caffe2, which lets you train deep learning models and export them to any smartphone OS.
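As a rough illustration of the TensorFlow Lite route, converting an already-trained Keras model looks like the sketch below; the file names are placeholders.

```python
import tensorflow as tf

# Load a trained model ("my_model.h5" is a hypothetical saved model file).
model = tf.keras.models.load_model("my_model.h5")

# Convert it to a TensorFlow Lite flatbuffer that can be bundled with a mobile app.
converter = tf.lite.TFLiteConverter.from_keras_model(model)
tflite_model = converter.convert()

with open("my_model.tflite", "wb") as f:
    f.write(tflite_model)
```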
I have recently been looking into incorporating the machine learning release for iOS developers into my app. Since this is my first time ever using anything ML-related, I was very lost when I started reading the different model descriptions that Apple has made available. They have the same purpose/description, the only difference being the actual file size. What is the difference between these models, and how would you know which one is the best fit?
The models Apple makes available are just for simple demo purposes. Most of the time, these models are not sufficient for use in your own app.
The models on Apple's download page are trained for a very specific purpose: image classification on the ImageNet dataset. This means they can take an image and tell you what the "main" object is in the image, but only if it's one of the 1,000 categories from the ImageNet dataset.
Usually, this is not what you want to do in your own apps. If your app wants to do image classification, typically you want to train a model on your own categories (like food or cars or whatever). In that case you can take something like Inception-v3 (the original, not the Core ML version) and re-train it on your own data. That gives you a new model, which you then need to convert to Core ML again.
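For that last conversion step, a minimal sketch using coremltools' unified converter (coremltools 4 or later) might look like this; the model and output file names are placeholders.

```python
import coremltools as ct
from tensorflow import keras

# "my_retrained_model.h5" is a hypothetical Keras model re-trained on your own categories.
model = keras.models.load_model("my_retrained_model.h5")

# Convert to Core ML, declaring the input as an image so Vision/Core ML can feed it directly.
mlmodel = ct.convert(model, inputs=[ct.ImageType(shape=(1, 299, 299, 3))])
mlmodel.save("MyClassifier.mlmodel")
```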
If your app wants to do something other than image classification, you can use these pretrained models as "feature extractors" in a larger neural network structure. But again this involves training your own model (usually from scratch) and then converting the result to Core ML.
So only in a very specific use case -- image classification using the 1,000 ImageNet categories -- are these Apple-provided models useful to your app.
If you do want to use any of these models, the difference between them is speed vs. accuracy. The smaller models are fastest but also least accurate. (In my opinion, VGG16 shouldn't be used on mobile. It's just too big and it's no more accurate than Inception or even MobileNet.)
SqueezeNets are fully convolutional and use Fire modules, which have a squeeze layer of 1x1 convolutions that vastly decreases the parameter count by restricting the number of input channels to each layer. This makes SqueezeNets extremely low-latency, in addition to the fact that they don't have dense layers.
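As an illustration, a Fire module can be sketched in a few lines of Keras; the filter counts here are just example values.

```python
from tensorflow import keras

def fire_module(x, squeeze=16, expand=64):
    # Squeeze: 1x1 convolutions that limit the number of channels fed to the expand layers.
    s = keras.layers.Conv2D(squeeze, 1, activation="relu")(x)
    # Expand: parallel 1x1 and 3x3 convolutions whose outputs are concatenated.
    e1 = keras.layers.Conv2D(expand, 1, activation="relu")(s)
    e3 = keras.layers.Conv2D(expand, 3, padding="same", activation="relu")(s)
    return keras.layers.Concatenate()([e1, e3])
```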
MobileNets utilise depth-wise separable convolutions, very similar to the Inception towers in Inception. These also reduce the number of parameters and hence latency. MobileNets also have useful model-shrinking parameters that you can set before training to make the model exactly the size you want. The Keras implementation can use ImageNet pre-trained weights too.
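For example, the Keras MobileNet exposes the width multiplier alpha (and a depth_multiplier) directly; alpha=0.5 halves every layer's channel count and roughly quarters the convolutional compute.

```python
from tensorflow.keras.applications import MobileNet

# alpha shrinks every layer's channel count; 0.25 / 0.50 / 0.75 / 1.0
# all have ImageNet pre-trained weights available in Keras.
small_mobilenet = MobileNet(alpha=0.5, weights="imagenet", include_top=True)
small_mobilenet.summary()
```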
The other models are very deep, large models. There the reduced number of parameters / style of convolution is not used for low latency but essentially just for the ability to train very deep models. ResNet introduced residual connections between layers, which were originally believed to be key to training very deep models. These aren't seen in the previously mentioned low-latency models.
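A residual connection itself is tiny; in Keras it is just an element-wise addition of a block's input to its output. The sketch below is illustrative only, and the filter count must match the input's channel count for the addition to work.

```python
from tensorflow import keras

def residual_block(x, filters=64):
    # Two convolutions, then add the original input back in (the "skip" / residual connection).
    y = keras.layers.Conv2D(filters, 3, padding="same", activation="relu")(x)
    y = keras.layers.Conv2D(filters, 3, padding="same")(y)
    out = keras.layers.Add()([x, y])
    return keras.layers.Activation("relu")(out)
```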
I'm new to machine learning and trying to figure out where to start and how to apply it to my app.
My app is pulling a bunch of health metrics and, based on all of them, suggesting a dose of medication (some abstract medication, it doesn't matter) to take. Taking a medication affects the health metrics, and I can see whether my suggestion was right or whether it needs adjustments to be more precise the next time. Medications are being taken constantly, so I have a lot of results and data to work with.
Does that seem like a good case for machine learning and for using some kind of neural network to train and make better predictions? If so, could you recommend an example for TensorFlow or Keras?
So far I only found image recognition examples and not sure how to apply similar algorithms to my problem.
I'm also a beginner in machine learning, but based on my knowledge, one way would be to use supervised learning with Keras, which uses TensorFlow as a backend. Keras is a lot easier to program than TensorFlow, but plain TensorFlow could also do the trick (depending on your familiarity with machine learning libraries).
You mentioned that your algorithm suggests medication based on data (from the patient).
One way to predict medication is to store all your preexisting data in a CSV file, and use the CSV module to read it. This tutorial covers the basics of reading CSV files (https://pythonprogramming.net/reading-csv-files-python-3/).
Next, you can store the data in a multi-dimensional array and run a neural network through it. Just make sure that you have sufficient data (the more the better) relative to the size of your neural network.
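To make that concrete, here is a minimal sketch. It assumes the health metrics sit in the first columns of a headerless CSV file and the dose that worked is in the last column; doses.csv and that layout are placeholders for your own data.

```python
import csv
import numpy as np
from tensorflow import keras

# Read the preexisting records with the csv module (assumes no header row).
with open("doses.csv") as f:
    rows = [list(map(float, row)) for row in csv.reader(f)]
data = np.array(rows)
X, y = data[:, :-1], data[:, -1]  # metrics as inputs, dose as the target

# A small fully connected network that regresses the dose from the metrics.
model = keras.Sequential([
    keras.layers.Dense(32, activation="relu", input_shape=(X.shape[1],)),
    keras.layers.Dense(16, activation="relu"),
    keras.layers.Dense(1),  # single continuous output: the suggested dose
])
model.compile(optimizer="adam", loss="mse")
model.fit(X, y, epochs=50, validation_split=0.2)
```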
Another way, as you mentioned, would be to use Convolutional Neural Networks, which theoretically could and should work, but I have very little experience programming them, so I'm afraid I can't give you any advice there (you can program CNNs in both Keras and TensorFlow).
I do wish you good luck in your project!
I'm trying to utilize a pre-trained model like Inception v3 (trained on the 2012 ImageNet data set) and expand it in several missing categories.
I have TensorFlow built from source with CUDA on Ubuntu 14.04, and the examples like transfer learning on flowers are working great. However, the flowers example strips away the final layer and removes all 1,000 existing categories, which means it can now identify 5 species of flowers, but can no longer identify pandas, for example. https://www.tensorflow.org/versions/r0.8/how_tos/image_retraining/index.html
How can I add the 5 flower categories to the existing 1,000 categories from ImageNet (and add training for those 5 new flower categories) so that I have 1,005 categories that a test image can be classified as? In other words, be able to identify both those pandas and sunflowers?
I understand one option would be to download the entire ImageNet training set and the flowers example set and to train from scratch, but given my current computing power, it would take a very long time, and wouldn't allow me to add, say, 100 more categories down the line.
One idea I had was to set the parameter fine_tune to false when retraining with the 5 flower categories so that the final layer is not stripped: https://github.com/tensorflow/models/blob/master/inception/README.md#how-to-retrain-a-trained-model-on-the-flowers-data , but I'm not sure how to proceed, and not sure if that would even result in a valid model with 1,005 categories. Thanks for your thoughts.
After much learning and working in deep learning professionally for a few years now, here is a more complete answer:
The best way to add categories to an existing model (e.g. Inception trained on the ImageNet LSVRC 1000-class dataset) is to perform transfer learning on a pre-trained model.
If you are just trying to adapt the model to your own dataset (e.g. 100 different kinds of automobiles), simply perform retraining/fine-tuning by following one of the myriad online tutorials for transfer learning, including the official one for TensorFlow.
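As a rough sketch of what such fine-tuning looks like in Keras, the snippet below unfreezes only the top of a pre-trained Inception-v3 and trains it together with a new head at a low learning rate. The class count and the number of unfrozen layers are arbitrary example values, not a recipe.

```python
from tensorflow import keras

num_classes = 100  # e.g. 100 kinds of automobiles -- an example value
base = keras.applications.InceptionV3(weights="imagenet", include_top=False, pooling="avg")

# Freeze most of the network; fine-tune only the last ~30 layers plus a new head,
# using a low learning rate so the pre-trained weights aren't destroyed.
for layer in base.layers[:-30]:
    layer.trainable = False

model = keras.Sequential([base, keras.layers.Dense(num_classes, activation="softmax")])
model.compile(optimizer=keras.optimizers.Adam(1e-4),
              loss="categorical_crossentropy", metrics=["accuracy"])
```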
While the resulting model can potentially perform well, please keep in mind that the tutorial classifier code is highly unoptimized (perhaps intentionally), and you can increase performance several times over by optimizing it for production or simply improving the code.
However, if you're trying to build a general-purpose classifier that includes the default LSVRC dataset (1,000 categories of everyday images) and expands it with your own additional categories, you'll need access to the existing 1,000 LSVRC categories' images and will have to append your own dataset to that set. You can download the ImageNet dataset online, but access is getting spottier as time rolls on. In many cases the images are also highly outdated (check out the images for computers or phones for a trip down memory lane).
Once you have that LSVRC dataset, perform transfer learning as above, but including the 1,000 default categories along with your own images. For your own images, a minimum of 100 appropriate images per category is generally recommended (the more the better). You can get better results if you enable distortions, but this will dramatically increase retraining time, especially if you don't have a GPU, because the bottleneck files cannot be reused for each distortion. (Personally I think this is pretty lame; there's no reason why distortions couldn't also be cached as bottleneck files, but that's a different discussion, and the caching can be added to your code manually.)
Using these methods and incorporating error analysis, we've trained general purpose classifiers on 4000+ categories to state-of-the-art accuracy and deployed them on tens of millions of images. We've since moved on to proprietary model design to overcome existing model limitations, but transfer learning is a highly legitimate way to get good results and has even made its way to natural language processing via BERT and other designs.
Hopefully, this helps.
Unfortunately, you cannot add categories to an existing graph; you'll basically have to save a checkpoint and train that graph from that checkpoint onward.
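In current TensorFlow, that checkpoint workflow can be sketched roughly as below; the model, optimizer and paths are placeholders standing in for whatever you are already training.

```python
import tensorflow as tf

# Placeholder model/optimizer standing in for your own training setup
# (here a 1005-class classifier trained from scratch is assumed).
model = tf.keras.applications.MobileNet(weights=None, classes=1005)
optimizer = tf.keras.optimizers.Adam()

ckpt = tf.train.Checkpoint(model=model, optimizer=optimizer)
ckpt.save("checkpoints/ckpt")  # save the training state after some training

# Later (or in a new process): restore the latest checkpoint and keep training from it.
ckpt.restore(tf.train.latest_checkpoint("checkpoints"))
```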