Two-level classification using a single model - image-processing

First, I want to binary-classify a "fire" event in 5,000 images. Second, if fire is present in an image, I want to further classify it as either an urban area (buildings) or a rural one (forest). I am using transfer learning with different models, including VGG16, fine-tuning a few of its last layers.
I have already tried training and testing both classification steps separately, but this incurs a large penalty when the pipeline reports fire in a rural area for an image that does not contain fire at all.
I want the transfer-learning model to perform both binary classifications and produce the results of both steps like this:
img    fire      rural/urban
1      No-fire   No-rural/no-urban
2      Fire      Urban
3      Fire      Rural
So can I retrain VGG16 in a way that provides both levels of classification, i.e. step one (fire/no-fire) and step two (rural/urban)?

I have found the answer: the same model is first trained to binary-classify the presence of fire, and its weights are saved.
In the next step, those weights are used as the starting point for the second classifier, which identifies whether the fire is in a rural or an urban area. Thanks.
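For readers looking for the concrete mechanics, here is a minimal Keras sketch of the two-stage approach described above. It is one reading of the answer, not the asker's actual code; the dataset objects (all_images_ds, fire_only_ds) and the weights filename are hypothetical.

    import tensorflow as tf

    def build_binary_classifier():
        # VGG16 backbone with a fresh one-unit sigmoid head
        base = tf.keras.applications.VGG16(weights="imagenet", include_top=False,
                                           input_shape=(224, 224, 3), pooling="avg")
        base.trainable = False  # train only the new head
        out = tf.keras.layers.Dense(1, activation="sigmoid")(base.output)
        return tf.keras.Model(base.input, out)

    # Stage 1: fire vs. no-fire on all images
    fire_model = build_binary_classifier()
    fire_model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
    # fire_model.fit(all_images_ds, ...)        # hypothetical tf.data pipeline
    fire_model.save_weights("fire_weights.h5")

    # Stage 2: rural vs. urban, warm-started from the stage-1 weights,
    # trained only on images that actually contain fire
    area_model = build_binary_classifier()
    area_model.load_weights("fire_weights.h5")
    area_model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
    # area_model.fit(fire_only_ds, ...)         # hypothetical tf.data pipeline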

Related

how much available credit for roboflow training

I'm a beginner in this field and I'm trying to use Roboflow for labeling. I found that my available credit for training is 2. What does that mean?
I'm trying to train on my dataset, but I'm worried that I can only do 2 trainings.
New workspaces come with 3 training credits to start. If you have 2, you've either used a credit to train a model with Roboflow Train, or, if you hadn't trained one before, a model was auto-trained for you in the background the first time you exported, to give you a Label Assist checkpoint and access to the Deploy tab.
https://docs.roboflow.com/annotate/model-assisted-labeling
https://blog.roboflow.com/deploy-tab
Credits apply to Roboflow Train, the AutoML training from Roboflow.
training documentation: https://docs.roboflow.com/train
If you’re training custom or open-source model architectures, such as Detectron2, YOLOv7, or YOLOv8, you don’t use credits; custom and open-source model training doesn’t require any.
custom models: https://github.com/roboflow/notebooks
Weights uploads from custom models (YOLOv5 and YOLOv8 object detection, YOLOv7 instance segmentation currently supported): https://docs.roboflow.com/upload-weights
In short, you can train 2 more times with Roboflow Train, and as many times as you want with custom models, so long as the workspace's source and generated image limits are not exceeded.
Here’s how to ask for workspace upgrades such as paid plans, or sponsorships for upgrades: https://docs.roboflow.com/roboflow-workspaces#workspace-upgrades

Supervised learning by creating own labels

Scenario: I have data that does not have labels, but I can create a function to label the data based on its behavior and then deploy the model so I don't have to keep labeling data by hand. Is this considered machine learning?
Objective: classify accounts with volume spikes into high/medium/low labels, to deploy on big data (trillions of lines of data).
Data: the data I have includes the following attributes:
Account, Time, Date, Volume amount.
Method:
Create a new feature column called "spike" and write a pandas function to flag a spike greater than 5 (sketched below). Is this feature engineering?
Next, I create my label column and classify each row as a low, medium, or high spike.
Next, I train a machine-learning classifier and deploy it to label future accounts with similar patterns in big data.
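For concreteness, here is a toy pandas sketch of the feature and label steps just described. The column names, the per-account ratio definition, and the low/medium/high thresholds are illustrative assumptions, not the asker's actual rules.

    import numpy as np
    import pandas as pd

    # Toy data in the shape described above (names are illustrative)
    df = pd.DataFrame({
        "account": ["A", "A", "A", "B", "B"],
        "date": pd.to_datetime(["2024-01-01", "2024-01-02", "2024-01-03",
                                "2024-01-01", "2024-01-02"]),
        "volume": [100, 120, 900, 50, 55],
    })

    # "Feature engineering": per-account volume ratio vs. the previous day
    df["ratio"] = df.sort_values("date").groupby("account")["volume"].pct_change() + 1
    df["spike"] = df["ratio"] > 5             # the "greater than 5" rule

    # Hand-written labels: low / medium / high spike (thresholds illustrative)
    df["label"] = pd.cut(df["ratio"].fillna(1.0),
                         bins=[0, 5, 20, np.inf],
                         labels=["low", "medium", "high"])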
Thoughts on this process? Is this the right approach to machine learning?
1st question:
If your algorithm makes the decision, that is, puts a label on a sample based on the set of samples you have, I'd say it's a machine-learning algorithm. But if you write code that encodes your own experience with the data, I'd say it's not an ML method. In brief, ML looks at the data to extract patterns and insights from it. I don't know why you're doing this, but does it need to be an ML algorithm? Sometimes you can solve the problem in a very simple way, without ML.
2nd question: I'm afraid not. Selecting your data attributes (e.g. Account, Time, Date, Volume amount), checking their correlations, trying to figure out whether one is dominant, etc., is pre-ML work. Feature engineering selects the best features to present to the algorithm so that it can perform the classification (in your case).
3rd question: I think it's fair to start playing with some ML algorithms, such as KNN, SVM, neural networks, decision trees, etc.
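As a starting point, a toy scikit-learn sketch of that "start playing" step might look like the following; the synthetic features and thresholds are stand-ins for real engineered features.

    import numpy as np
    from sklearn.model_selection import train_test_split
    from sklearn.tree import DecisionTreeClassifier

    rng = np.random.default_rng(0)
    X = rng.random((500, 2))                  # e.g. [volume_ratio, hour_of_day]
    y = np.where(X[:, 0] > 0.8, "high",
                 np.where(X[:, 0] > 0.5, "medium", "low"))

    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
    clf = DecisionTreeClassifier(max_depth=3).fit(X_train, y_train)
    print("test accuracy:", clf.score(X_test, y_test))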

MobileNet vs SqueezeNet vs ResNet50 vs Inception v3 vs VGG16

I have recently been looking into incorporating Apple's machine-learning release for iOS developers into my app. Since this is my first time using anything ML-related, I was very lost when I started reading the different model descriptions Apple has made available. They all have the same purpose/description; the only difference is the actual file size. What is the difference between these models, and how would you know which one is the best fit?
The models Apple makes available are just for simple demo purposes. Most of the time, these models are not sufficient for use in your own app.
The models on Apple's download page are trained for a very specific purpose: image classification on the ImageNet dataset. This means they can take an image and tell you what the "main" object is in the image, but only if it's one of the 1,000 categories from the ImageNet dataset.
Usually, this is not what you want to do in your own apps. If your app wants to do image classification, typically you want to train a model on your own categories (like food or cars or whatever). In that case you can take something like Inception-v3 (the original, not the Core ML version) and re-train it on your own data. That gives you a new model, which you then need to convert to Core ML again.
If your app wants to do something other than image classification, you can use these pretrained models as "feature extractors" in a larger neural network structure. But again this involves training your own model (usually from scratch) and then converting the result to Core ML.
So only in a very specific use case -- image classification using the 1,000 ImageNet categories -- are these Apple-provided models useful to your app.
If you do want to use any of these models, the difference between them is speed vs. accuracy. The smaller models are fastest but also least accurate. (In my opinion, VGG16 shouldn't be used on mobile. It's just too big and it's no more accurate than Inception or even MobileNet.)
SqueezeNets are fully convolutional and use Fire modules, which have a squeeze layer of 1x1 convolutions that vastly decreases the parameter count by restricting the number of input channels to each layer. This, plus the absence of dense layers, makes SqueezeNets extremely low-latency.
MobileNets use depth-wise separable convolutions, very similar to the inception towers in Inception. These also reduce the number of parameters and hence latency. MobileNets also have useful model-shrinking hyperparameters that you can set before training to get exactly the size you want. The Keras implementation can use ImageNet pre-trained weights too.
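To illustrate the model-shrinking point, here is a minimal Keras sketch (assuming TensorFlow is installed) comparing a width-reduced MobileNet against the full one:

    from tensorflow.keras.applications import MobileNet

    # alpha < 1.0 uniformly shrinks the number of filters in every layer;
    # pretrained ImageNet weights exist for alpha in {0.25, 0.5, 0.75, 1.0}
    small = MobileNet(alpha=0.25, weights="imagenet")
    full = MobileNet(alpha=1.0, weights="imagenet")
    print(small.count_params(), "vs", full.count_params())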
The other models are very deep, large models. There, the reduced parameter count / style of convolution is not aimed at low latency, but essentially at making it possible to train very deep models at all. ResNet introduced residual connections between layers, which were originally believed to be key to training very deep models. These aren't present in the low-latency models mentioned above.

How do you add new categories and training to a pretrained Inception v3 model in TensorFlow?

I'm trying to utilize a pre-trained model like Inception v3 (trained on the 2012 ImageNet dataset) and expand it with several missing categories.
I have TensorFlow built from source with CUDA on Ubuntu 14.04, and the examples, like transfer learning on flowers, work great. However, the flowers example strips away the final layer and removes all 1,000 existing categories, which means it can now identify 5 species of flowers but can no longer identify pandas, for example. https://www.tensorflow.org/versions/r0.8/how_tos/image_retraining/index.html
How can I add the 5 flower categories to the existing 1,000 categories from ImageNet (and add training for those 5 new flower categories) so that I have 1,005 categories that a test image can be classified as? In other words, be able to identify both those pandas and sunflowers?
I understand one option would be to download the entire ImageNet training set and the flowers example set and to train from scratch, but given my current computing power, it would take a very long time, and wouldn't allow me to add, say, 100 more categories down the line.
One idea I had was to set the parameter fine_tune to false when retraining with the 5 flower categories, so that the final layer is not stripped: https://github.com/tensorflow/models/blob/master/inception/README.md#how-to-retrain-a-trained-model-on-the-flowers-data , but I'm not sure how to proceed, and I'm not sure that would even result in a valid model with 1,005 categories. Thanks for your thoughts.
After much learning and working in deep learning professionally for a few years now, here is a more complete answer:
The best way to add categories to an existing model (e.g. Inception trained on the ImageNet LSVRC 1000-class dataset) is to perform transfer learning on the pre-trained model.
If you are just trying to adapt the model to your own dataset (e.g. 100 different kinds of automobiles), simply retrain/fine-tune by following one of the many online tutorials for transfer learning, including the official one for TensorFlow.
While the resulting model can perform well, keep in mind that the tutorial classifier code is highly unoptimized (perhaps intentionally); you can increase performance several times over when deploying to production, or simply by improving the code.
However, if you're trying to build a general-purpose classifier that includes the default LSVRC dataset (1,000 categories of everyday images) and expands it with your own additional categories, you'll need access to the existing 1,000 LSVRC classes of images so that you can append your own dataset to that set. You can download the ImageNet dataset online, but access is getting spottier as time rolls on. In many cases, the images are also highly outdated (check out the images for computers or phones for a trip down memory lane).
Once you have that LSVRC dataset, perform transfer learning as above, but include the 1,000 default categories along with your own images. For your own images, a minimum of 100 appropriate images per category is generally recommended (the more the better), and you can get better results if you enable distortions. Be warned that distortions dramatically increase retraining time, especially without a GPU, because the bottleneck files cannot be reused for each distortion. (Personally, I think this is pretty lame; there's no reason distortions couldn't also be cached as bottleneck files, but that's a different discussion, and you can add it to your code manually.)
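To make the "expand to 1,005 categories" idea concrete, here is a minimal Keras sketch (TensorFlow 2.x assumed; an illustration, not the answerer's code). It widens InceptionV3's final softmax layer to 1,005 units and copies the original 1,000-class weights into it, so training on the combined dataset starts from the pretrained classifier:

    import tensorflow as tf

    base = tf.keras.applications.InceptionV3(weights="imagenet")
    old_head = base.layers[-1]                 # the original 1000-way softmax layer
    features = base.layers[-2].output          # 2048-d pooled features

    new_head = tf.keras.layers.Dense(1005, activation="softmax",
                                     name="predictions_1005")
    outputs = new_head(features)
    model = tf.keras.Model(base.input, outputs)

    # Copy the old classifier weights into the first 1000 slots of the new head
    old_w, old_b = old_head.get_weights()
    new_w, new_b = new_head.get_weights()
    new_w[:, :1000] = old_w
    new_b[:1000] = old_b
    new_head.set_weights([new_w, new_b])

    # Freeze the convolutional base; train only the widened head on the
    # combined (ImageNet + custom) dataset
    base.trainable = False
    model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")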
Using these methods and incorporating error analysis, we've trained general purpose classifiers on 4000+ categories to state-of-the-art accuracy and deployed them on tens of millions of images. We've since moved on to proprietary model design to overcome existing model limitations, but transfer learning is a highly legitimate way to get good results and has even made its way to natural language processing via BERT and other designs.
Hopefully, this helps.
Unfortunately, you cannot add categories to an existing graph; you'll basically have to save a checkpoint and train that graph from that checkpoint onward.

how to train a classifier using video datasets

If I have a video dataset of a specific action, how could I use it to train a classifier that could later be used to classify that action?
The question is very generic. In general, there is no foolproof way of training a classifier that will work for everything; it depends heavily on the data you are working with.
Here is the 'generic' pipeline:
extract features from the video
label your features (positive for the action you are looking for; negative otherwise)
split your data into 2 (or 3) sets: one for training, one for testing, and optionally one for validation
train a classifier on the labeled examples (e.g. SVM, neural network, nearest neighbor, ...); see the sketch after this list
validate the results on the validation data, if that is appropriate for the algorithm
test on data you haven't used for training.
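Here is a minimal scikit-learn sketch of the split/train/test steps above, assuming features have already been extracted from the videos; the random arrays are placeholders for real per-clip feature vectors and labels.

    import numpy as np
    from sklearn.model_selection import train_test_split
    from sklearn.svm import SVC
    from sklearn.metrics import accuracy_score

    rng = np.random.default_rng(0)
    X = rng.random((200, 128))             # placeholder per-clip feature vectors
    y = rng.integers(0, 2, 200)            # 1 = action present, 0 = absent

    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3,
                                                        random_state=0)
    clf = SVC(kernel="rbf").fit(X_train, y_train)
    print("test accuracy:", accuracy_score(y_test, clf.predict(X_test)))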
You can start with some machine learning tools here http://www.cs.waikato.ac.nz/ml/weka/
Make sure you never touch the test data for any purpose other than testing.
Good luck
Almost 10 years later, here's an updated answer.
Set up a camera and collect raw video data
Save it somewhere in the form of single frames. Do this yourself locally, use a cloud bucket, or use a service like Sieve API. Helpful repo linked here.
Export from Sieve or the cloud bucket to get the data labeled. Do this yourself or use a service like Scale Rapid.
Split your dataset into train, test, and validation.
Train a classifier on the labeled samples. Use transfer learning over an existing model and fine-tune just the last few layers (see the sketch after this list).
Run your model over the validation set after each training epoch and save the checkpoint with the best validation performance.
Evaluate your final model on the held-out test set.
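A minimal sketch of the transfer-learning step, assuming TensorFlow 2.x, per-frame 224x224 inputs, and a hypothetical number of action classes; the choice of base model and the number of unfrozen layers are illustrative, not prescribed by the answer above:

    import tensorflow as tf

    N_CLASSES = 4                              # hypothetical number of actions
    base = tf.keras.applications.ResNet50(weights="imagenet", include_top=False,
                                          input_shape=(224, 224, 3), pooling="avg")
    for layer in base.layers[:-10]:            # freeze all but the last few layers
        layer.trainable = False

    model = tf.keras.Sequential([
        base,
        tf.keras.layers.Dense(N_CLASSES, activation="softmax"),
    ])
    model.compile(optimizer=tf.keras.optimizers.Adam(1e-4),
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    # model.fit(train_ds, validation_data=val_ds)  # hypothetical per-frame datasets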
There are many repos that can help you get started: https://github.com/weiaicunzai/awesome-image-classification
The two things that most help ensure good results are 1. high-quality labeled data and 2. a diverse, curated dataset. That's what Sieve can help with!
