How to do multi-head learning - machine-learning

I have about 5 models that each work pretty well when trained individually, but I want to fuse them together into one big model.
I'm looking into this because one big model is easier to update in production than many small models.
This is an image of what I want to achieve.
My questions are: is it OK to do it like this?
With one dataset per head, how am I supposed to train the whole model?

My questions are: is it OK to do it like this?
Sure, you can do that. This approach is called multi-task learning. Depending on your datasets and what you are trying to do, it may even improve performance. Microsoft used a multi-task model to achieve good results on the GLUE NLP benchmark, but they also noted that you can improve performance further by fine-tuning the joint model for each individual task.
With one dataset per head, how am I supposed to train the whole model?
All you need is a PyTorch module container such as nn.ModuleDict (nn.ModuleList works too if you index the heads by position):
# Please note this is just a sketch and I'm not well versed in computer vision,
# so check that the resnet50 import is correct for your torchvision version.
# Depth and Denseflow stand for your own task-specific head modules.
from torch import nn
from torchvision.models import resnet50
class MultiTaskModel(nn.Module):
    def __init__(self):
        super().__init__()  # must run before submodules are registered
        # shared part
        self.resnet50 = resnet50()
        # task-specific heads, keyed by task tag
        self.tasks = nn.ModuleDict({
            'depth': Depth(),
            'denseflow': Denseflow(),
            # ...
        })
    def forward(self, tasktag, x):
        # shared part
        resnet_output = self.resnet50(x)
        # task-specific part: pick the head that matches the tag
        return self.tasks[tasktag](resnet_output)
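For the training question, one common pattern is to alternate batches from the per-task datasets, so that each step updates the shared trunk and the head belonging to that batch. The following is a minimal sketch of that idea, not a definitive recipe: the tiny linear trunk, the head sizes, the random tensors, and the names `TinyMultiTask`, `loaders`, and `criteria` are all placeholders standing in for the ResNet backbone and your real data.

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset

torch.manual_seed(0)

class TinyMultiTask(nn.Module):
    """Stand-in for the ResNet-based model: shared trunk plus one head per task."""
    def __init__(self):
        super().__init__()
        self.shared = nn.Sequential(nn.Linear(8, 16), nn.ReLU())
        self.tasks = nn.ModuleDict({
            'depth': nn.Linear(16, 1),      # placeholder head
            'denseflow': nn.Linear(16, 2),  # placeholder head
        })

    def forward(self, tasktag, x):
        return self.tasks[tasktag](self.shared(x))

# One synthetic dataset per head (assumption: each task has its own inputs and targets).
loaders = {
    'depth': DataLoader(TensorDataset(torch.randn(32, 8), torch.randn(32, 1)), batch_size=8),
    'denseflow': DataLoader(TensorDataset(torch.randn(32, 8), torch.randn(32, 2)), batch_size=8),
}
criteria = {'depth': nn.MSELoss(), 'denseflow': nn.MSELoss()}

model = TinyMultiTask()
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

steps = 0
for epoch in range(2):
    # zip() interleaves the loaders: one batch per task in turn.
    for batches in zip(*loaders.values()):
        for tasktag, (x, y) in zip(loaders.keys(), batches):
            optimizer.zero_grad()
            loss = criteria[tasktag](model(tasktag, x), y)
            loss.backward()  # gradients reach the shared trunk and this task's head
            optimizer.step()
            steps += 1
```

Note that `zip` stops at the shortest loader. If your datasets differ a lot in size, you may prefer to cycle the smaller ones or to sample the task for each step in proportion to dataset size.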

For ideas, you may want to look at the Detectron2 project, and in particular at how its models are joined together.
Chances are you can reuse some of the ideas they used.
Fusing the models together means defining the inputs and outputs of the main model (the one containing the submodels).

Related

Converting a pytorch model to nn.Module for exporting to onnx for lens studio

I am trying to convert pix2pix to a pb or ONNX model that can run in Lens Studio. Lens Studio has strict requirements for models. I am trying to export this PyTorch model to ONNX using this guide provided by Lens Studio. The issue is that the PyTorch model found here uses its own base class, whereas the example uses nn.Module, so it doesn't have the methods/variables that torch.onnx.export needs to run. So far I've found that it is missing a variable called training and a method called train.
Would it be worth trying to modify the base model, or should I try to build it from scratch using nn.Module? Is there a way to make the pix2pix model inherit from both the abstract base class and nn.Module? Am I misunderstanding the situation? The reason I want to follow the Lens Studio tutorial is that I have managed to export ONNX in other ways, but Lens Studio won't accept those models for various reasons.
Also, this is my first time asking an SO question (after 6 years of coding); let me know if I made any mistakes and I can correct them. Thank you.
This is the important code from the tutorial creating a pytorch model for Lens Studio:
import torch
import torch.nn as nn
class Model(nn.Module):
    def __init__(self):
        super().__init__()
        self.layer = nn.Conv2d(in_channels=3, out_channels=1,
                               kernel_size=3, stride=2, padding=1)
    def forward(self, x):
        out = self.layer(x)
        out = nn.functional.interpolate(out, scale_factor=2,
                                        mode='bilinear', align_corners=True)
        out = torch.nn.functional.softmax(out, dim=1)
        return out
I'm not going to include all the code from the PyTorch model because it's large, but the beginning of base_model.py is
import os
import torch
from collections import OrderedDict
from abc import ABC, abstractmethod
from . import networks
class BaseModel(ABC):
    """This class is an abstract base class (ABC) for models.
    To create a subclass, you need to implement the following five functions:
        -- <__init__>: initialize the class; first call BaseModel.__init__(self, opt).
        -- <set_input>: unpack data from dataset and apply preprocessing.
        -- <forward>: produce intermediate results.
        -- <optimize_parameters>: calculate losses, gradients, and update network weights.
        -- <modify_commandline_options>: (optionally) add model-specific options and set default options.
    """
    def __init__(self, opt):
        """Initialize the BaseModel class.
        Parameters:
            opt (Option class)-- stores all the experiment flags; needs to be a subclass of BaseOptions
        When creating your custom class, you need to implement your own initialization.
        In this function, you should first call <BaseModel.__init__(self, opt)>
        Then, you need to define four lists:
            -- self.loss_names (str list): specify the training losses that you want to plot and save.
            -- self.model_names (str list): define networks used in our training.
            -- self.visual_names (str list): specify the images that you want to display and save.
            -- self.optimizers (optimizer list): define and initialize optimizers. You can define one optimizer for each network. If two networks are updated at the same time, you can use itertools.chain to group them. See cycle_gan_model.py for an example.
        """
        self.opt = opt
        self.gpu_ids = opt.gpu_ids
        self.isTrain = opt.isTrain
        self.device = torch.device('cuda:{}'.format(self.gpu_ids[0])) if self.gpu_ids else torch.device('cpu')  # get device name: CPU or GPU
        self.save_dir = os.path.join(opt.checkpoints_dir, opt.name)  # save all the checkpoints to save_dir
        if opt.preprocess != 'scale_width':  # with [scale_width], input images might have different sizes, which hurts the performance of cudnn.benchmark.
            torch.backends.cudnn.benchmark = True
        self.loss_names = []
        self.model_names = []
        self.visual_names = []
        self.optimizers = []
        self.image_paths = []
        self.metric = 0  # used for learning rate policy 'plateau'
and for pix2pix_model.py
import torch
from .base_model import BaseModel
from . import networks
class Pix2PixModel(BaseModel):
    """ This class implements the pix2pix model, for learning a mapping from input images to output images given paired data.
    The model training requires '--dataset_mode aligned' dataset.
    By default, it uses a '--netG unet256' U-Net generator,
    a '--netD basic' discriminator (PatchGAN),
    and a '--gan_mode' vanilla GAN loss (the cross-entropy objective used in the orignal GAN paper).
    pix2pix paper: https://arxiv.org/pdf/1611.07004.pdf
    """
    @staticmethod
    def modify_commandline_options(parser, is_train=True):
        """Add new dataset-specific options, and rewrite default values for existing options.
        Parameters:
            parser -- original option parser
            is_train (bool) -- whether training phase or test phase. You can use this flag to add training-specific or test-specific options.
        Returns:
            the modified parser.
        For pix2pix, we do not use image buffer
        The training objective is: GAN Loss + lambda_L1 * ||G(A)-B||_1
        By default, we use vanilla GAN loss, UNet with batchnorm, and aligned datasets.
        """
        # changing the default values to match the pix2pix paper (https://phillipi.github.io/pix2pix/)
        parser.set_defaults(norm='batch', netG='unet_256', dataset_mode='aligned')
        if is_train:
            parser.set_defaults(pool_size=0, gan_mode='vanilla')
            parser.add_argument('--lambda_L1', type=float, default=100.0, help='weight for L1 loss')
        return parser
    def __init__(self, opt):
        """Initialize the pix2pix class.
        Parameters:
            opt (Option class)-- stores all the experiment flags; needs to be a subclass of BaseOptions
        """
(Also, a side note: if you see this and it looks like there is no easy way out, let me know. I know what it's like to watch someone who is just getting started go in too deep too early.)
You can definitely have your model inherit from both the base class and torch.nn.Module (Python allows multiple inheritance). However, you should watch out for conflicts when both inherited classes define functions with identical names (I can see at least one: their base class provides an eval function, and so does nn.Module).
However, since you do not need CycleGAN, and a lot of the code exists only for compatibility with their training environment, you would probably be better off re-implementing pix2pix. Just take the code, have it inherit from nn.Module, copy over the useful/mandatory functions from the base class, and translate everything into clean PyTorch code. You already have the forward function (which is the only requirement for a PyTorch module).
All the subnetworks they use (like the ResNet blocks) seem to inherit from nn.Module already, so there is nothing to change there (double check that, though).
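As a rough illustration of the re-implementation route, the sketch below wraps only the generator in a plain nn.Module, which is what gives the exporter the `training` attribute and `train()` method the question found missing. `TinyGenerator` and `ExportablePix2Pix` are hypothetical names; the tiny generator is a stand-in for the repository's real netG, whose architecture is not reproduced here.

```python
import torch
from torch import nn

class TinyGenerator(nn.Module):
    """Hypothetical stand-in for the pix2pix generator network (netG)."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 8, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.Conv2d(8, 3, kernel_size=3, padding=1),
            nn.Tanh(),
        )

    def forward(self, x):
        return self.net(x)

class ExportablePix2Pix(nn.Module):
    """Plain nn.Module that keeps only what inference needs: the generator."""
    def __init__(self, generator):
        super().__init__()
        self.netG = generator

    def forward(self, x):
        return self.netG(x)

model = ExportablePix2Pix(TinyGenerator())
model.eval()  # provided by nn.Module, together with .train() and .training

dummy = torch.randn(1, 3, 256, 256)
with torch.no_grad():
    out = model(dummy)

# With the onnx package installed, the export call would then look like:
# torch.onnx.export(model, dummy, "pix2pix.onnx")
```

In practice you would load the trained generator weights into the real netG before wrapping it. Whether Lens Studio accepts the resulting file still depends on the operators the generator uses.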

Image Classification with single class dataset using Transfer Learning [closed]

I only have around 1000 images of computers. I need to train a model that can identify whether an image shows a computer or not. I do not have a dataset for the not-computer class, as it could be anything.
I guess the best method for this would be transfer learning. I am trying to train on top of a pre-trained VGG19 model, but I still don't know how to train a model with only computer images and no non-computer images.
I am new to ML overall, so sorry if the question is not to the point.
No way, I'm sorry. You'll need a lot of non-computer images (at least another 1000). You can take them from anywhere; the more they vary, the easier it is for your model to extract the features that characterize a computer.
Imagine a baby who is trained to always say "yes" in front of something: the next time it sees something, it will say "yes" no matter what is in front of it.
The same holds for machine learning models: you need positive and negative examples, or your model will reach 100% accuracy by always predicting "yes".
If you want to see it mathematically/geometrically, you can think of each sample (in your case an image) as a point in the feature space: imagine drawing an axis for each attribute you have (x, y, z and so on); an image is then a point in that space.
For simplicity, let's consider a 2-dimensional space, meaning each image is described by 2 attributes (this is not the case for real images, which usually have many features, but imagine feature_1 = number of colors and feature_2 = number of angles). In this example we can simply draw a point on a Cartesian graph, one for each image:
The objective of a classifier is to draw the line that best separates the red dots from the blue dots, that is, the positive examples from the negative examples.
If you give the model only positive samples (which is what you were going to do), there are infinitely many models with 100% accuracy, because you can put the line wherever you want; the only requirement is that it does not "cut" your dataset.
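To make the "always yes" point concrete, here is a small sketch in plain Python. The file names and the `always_yes` and `accuracy` helpers are made up for illustration; no real images are involved.

```python
def always_yes(_image):
    """Degenerate classifier: calls everything a computer."""
    return 1

def accuracy(classifier, samples):
    """Fraction of (image, label) pairs the classifier labels correctly."""
    correct = sum(1 for image, label in samples if classifier(image) == label)
    return correct / len(samples)

# Labels: 1 = computer, 0 = not a computer. The image names are placeholders.
positives_only = [(f"computer_{i}.jpg", 1) for i in range(1000)]
balanced = positives_only + [(f"other_{i}.jpg", 0) for i in range(1000)]

acc_positives_only = accuracy(always_yes, positives_only)  # 1.0
acc_balanced = accuracy(always_yes, balanced)              # 0.5
```

On the positives-only set the useless classifier looks perfect. Only once negatives are present does the evaluation reveal that it has learned nothing.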
Since I suppose you are a beginner, I'll just tell you what to do, not how, because that would take years ;)
1) Collect data - as I said, including negative examples, at least another 1000 samples
2) Split the data into train/test - a good split is 2/3 of the samples in the training set and 1/3 in the test set. [REMEMBER] Keep the class distribution consistent, i.e. if you have 50%-50% of the classes "Computer"-"Non computer", keep that percentage in both the train set and the test set
3) Train a model - have a look at this link for a guided example; it uses the MNIST dataset, a famous image classification dataset, but you should use your own data
4) Test the model on the test set and look at its performance
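Step 2 can be sketched in plain Python as follows. The `stratified_split` helper and the file names are illustrative only; the point is that each class is shuffled and split separately, so both sets keep the same class proportions.

```python
import random
from collections import defaultdict

def stratified_split(samples, test_fraction=1/3, seed=0):
    """Split (item, label) pairs so each class keeps its share in both sets."""
    by_label = defaultdict(list)
    for item, label in samples:
        by_label[label].append((item, label))

    rng = random.Random(seed)
    train, test = [], []
    for group in by_label.values():
        rng.shuffle(group)
        n_test = round(len(group) * test_fraction)
        test.extend(group[:n_test])
        train.extend(group[n_test:])
    return train, test

samples = ([(f"computer_{i}.jpg", "computer") for i in range(1000)]
           + [(f"other_{i}.jpg", "non-computer") for i in range(1000)])
train, test = stratified_split(samples)
```

If you already use scikit-learn, `train_test_split` with its `stratify` argument does the same job.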
While it is not impossible to take data belonging to only one class and then use methods to classify whether other data belong to the same class or not, you usually do not end up with very good accuracy that way.
One way to do this is to use something called an "autoencoder". The point here is that you use the same image as both input and target, and you make sure that the model (usually a neural network) is forced to compress the image in some way, so that it only stores what is important for recreating images of computers. Ideally, this leads to a model that is good at recreating images of computers and bad at everything else. You can then measure the reconstruction loss on the output, and if it is higher than a threshold you have decided on, you deem the image to be something else. Again, you're probably not going to get anything close to 90% accuracy doing this, but it is an approach to your problem.
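The thresholding idea can be illustrated with a deliberately simplified sketch. Note the swap: instead of a neural autoencoder, it uses a linear one (PCA computed with NumPy's SVD), and synthetic 20-dimensional vectors stand in for images. The 99th-percentile threshold rule is an assumption chosen for the example, not a recommendation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic "computer" samples: 20-dimensional vectors that live in a 3-dimensional subspace.
basis = rng.normal(size=(3, 20))
computers = rng.normal(size=(500, 3)) @ basis

# "Training": keep the top 3 principal directions as the bottleneck.
mean = computers.mean(axis=0)
_, _, vt = np.linalg.svd(computers - mean, full_matrices=False)
components = vt[:3]

def reconstruction_error(x):
    """Encode into the bottleneck, decode, and return the squared error per sample."""
    centered = x - mean
    decoded = centered @ components.T @ components
    return ((centered - decoded) ** 2).sum(axis=1)

# Threshold derived from the training errors (assumption: 99th percentile plus a small margin).
threshold = np.percentile(reconstruction_error(computers), 99) + 1e-6

new_computers = rng.normal(size=(5, 3)) @ basis  # same structure as the training data
other_things = rng.normal(size=(5, 20))          # unstructured data

is_computer_new = reconstruction_error(new_computers) <= threshold
is_computer_other = reconstruction_error(other_things) <= threshold
```

Real images are far messier than this toy data, which is why the accuracy caveat above still applies.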
A better approach is to go hunting for a model that has been pre-trained on a dataset that includes computers. Take that dataset, put all computers (plus your own images, making sure they adhere to the dataset format) in one class, and put a selection of the other images in the other class. Make sure the classes are not too unbalanced, otherwise your model will suffer from it. Extend the pre-trained model with a couple of layers (fully connected ones should do fine) and make the pre-trained part of the model non-trainable, so you don't mess up the good weights there when you are effectively telling it to ignore everything that is not a computer.
This is probably your best bet, but it is going to require a bit more effort on your side to find all the parts you need and to understand how to integrate that code into yours.
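A minimal PyTorch sketch of "freeze the pre-trained part, train a couple of new layers" could look like this. The small `backbone` here is a hypothetical stand-in so the example runs without downloading weights; in practice it would be the trunk of a real pre-trained network, and the batch would come from your dataset.

```python
import torch
from torch import nn

# Hypothetical stand-in for a pre-trained feature extractor (e.g. a VGG or ResNet trunk).
backbone = nn.Sequential(
    nn.Conv2d(3, 8, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.AdaptiveAvgPool2d(1),
    nn.Flatten(),
)

# Freeze the pre-trained part so training cannot disturb its weights.
for param in backbone.parameters():
    param.requires_grad = False

# New trainable layers: two fully connected layers ending in one logit
# (computer vs. not-computer).
head = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 1))
model = nn.Sequential(backbone, head)

# Only the head's parameters are handed to the optimizer.
trainable = [p for p in model.parameters() if p.requires_grad]
optimizer = torch.optim.Adam(trainable, lr=1e-3)

images = torch.randn(4, 3, 32, 32)  # placeholder batch
labels = torch.tensor([[1.0], [0.0], [1.0], [0.0]])
loss = nn.BCEWithLogitsLoss()(model(images), labels)
loss.backward()
optimizer.step()
```

After the backward pass, only the new layers carry gradients, which is the behaviour described above.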
You can use transfer learning with a model pretrained on the ImageNet dataset. As mentioned in another answer, ImageNet contains a number of classes close to computers and electronic devices (such as monitors, CD players, laptops, speakers, etc.). So you can fine-tune the model on your dataset and train it to predict computers (train on around 750 images and test on the remaining 250).
Alternatively, you can manually collect images of objects other than computers, preferably a lot of electronic devices (because they are close to computers) and a number of other household things (there is a home objects dataset by Caltech). You should collect about 1000 such images to keep the classes balanced. You can train your own custom model once you have this dataset.
No problem!
Step one: install a deep-learning toolkit of your choice. They all come with nice tutorials these days.
Step two: grab a pre-trained ImageNet model. That model already has a few computer classes built in ("desktop_computer", "laptop", "notebook", and a class for hand-held computers, "hand-held_computer").
Step three: use the model to predict. For this, your images need to be the correct size.
More steps: fine-tune the model further. This is a bit more advanced but will give you some gains.
Something to think about: what is your goal? Accuracy? Limiting false positives or false negatives? It's always good to know what you need to accomplish from the start.
EDIT: probably the easiest way to get started (if you don't have libraries, a GPU, etc.) is to go to Google Colab ( https://colab.research.google.com/notebooks/welcome.ipynb ), make a notebook in your browser, and run the following code.
# some code taken and modified from https://www.learnopencv.com/keras-tutorial-using-pre-trained-imagenet-models/
import keras
import numpy as np
from keras.applications import vgg16
from keras.preprocessing.image import load_img
from keras.preprocessing.image import img_to_array
from keras.applications.imagenet_utils import decode_predictions
import matplotlib.pyplot as plt
from PIL import Image
import requests
from io import BytesIO
%matplotlib inline
vgg_model = vgg16.VGG16(weights='imagenet')
def predict_image(image_url, model):
    response = requests.get(image_url)
    # force 3 channels in case the image is grayscale or has an alpha channel
    original = Image.open(BytesIO(response.content)).convert('RGB')
    newsize = (224, 224)
    original = original.resize(newsize)
    # convert the PIL image to a numpy array
    # In PIL - image is in (width, height, channel)
    # In Numpy - image is in (height, width, channel)
    numpy_image = img_to_array(original)
    # Convert the image / images into batch format
    # expand_dims will add an extra dimension to the data at a particular axis
    # We want the input matrix to the network to be of the form (batchsize, height, width, channels)
    # Thus we add the extra dimension to the axis 0.
    image_batch = np.expand_dims(numpy_image, axis=0)
    plt.imshow(np.uint8(image_batch[0]))
    plt.show()
    # prepare the image for the VGG model
    processed_image = vgg16.preprocess_input(image_batch.copy())
    # get the predicted probabilities for each class
    predictions = model.predict(processed_image)
    # convert the probabilities to class labels
    # We will get top 5 predictions which is the default
    label = decode_predictions(predictions)
    print(label[0][0:2])  # just display top 2
urls = ['https://4.imimg.com/data4/CO/YS/MY-29352968/samsung-desktop-computer-500x500.jpg', 'https://cdn.britannica.com/77/170477-050-1C747EE3/Laptop-computer.jpg']
for u in urls:
    predict_image(u, vgg_model)
This should be a good starting point. Oh, and if the top predicted label is not in the computer, laptop, etc set, then it's NOT a computer!
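The final rule ("if the top label is not a computer class, it's not a computer") can be written as a small helper. This sketch assumes the input is one image's entry from decode_predictions, i.e. a list of (class_id, label, score) tuples sorted by descending score. The `is_computer` function is hypothetical, and the example predictions below are written by hand with made-up scores and illustrative class ids.

```python
# Class names taken from the list above; extend the set if other devices should count.
COMPUTER_LABELS = {"desktop_computer", "laptop", "notebook", "hand-held_computer"}

def is_computer(decoded, top_k=1):
    """Return True if any of the top_k predicted labels is a computer class.

    decoded: list of (class_id, label, score) tuples, highest score first.
    """
    return any(label in COMPUTER_LABELS for _, label, _ in decoded[:top_k])

# Hand-written example outputs (ids and scores are for illustration only).
laptop_prediction = [("n03642806", "laptop", 0.81), ("n03832673", "notebook", 0.12)]
cat_prediction = [("n02123045", "tabby", 0.64), ("n03180011", "desktop_computer", 0.02)]
```

Raising `top_k` makes the check more lenient (fewer missed computers, more false alarms), which ties back to the question of what your goal is.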

How to "Iterate" on Computer Vision machine learning model?

I've created a model using Google Cloud's Vision API. I spent countless hours labeling data and trained a model. After almost 20 hours of "training", it's still hit and miss.
How can I iterate on this model? I don't want to lose the "learning" it has done so far. It works about 3 out of 5 times.
My best guess is that I should loop over the objects again, find where the model is wrong, and label accordingly. But I'm not sure of the best method for that. Should I be labeling all images where it "misses" as TEST data images? Are there best practices or resources I can read on this topic?
I'm by no means an expert, but here's what I'd suggest in order of most to least important:
1) Add more data if possible. More data is always a good thing, and helps develop robustness with your network's predictions.
2) Add dropout layers to prevent over-fitting
3) Have a tinker with kernel and bias initialisers
4) [The most relevant answer to your question] Save the training weights of your model and reload them into a new model prior to training.
5) Change up the type of model architecture you're using. Then, have a tinker with epoch numbers, validation splits, loss evaluation formulas, etc.
Hope this helps!
EDIT: More information about number 4
So you can save and load your model weights during or after the model has trained. See here for some more in-depth information about saving.
Broadly, let's cover the basics. I'm assuming you're going through keras but the same applies for tf:
Saving the model after training
Simply call:
model_json = model.to_json()
with open("{Your_Model}.json", "w") as json_file:
    json_file.write(model_json)
# serialize weights to HDF5
model.save_weights("{Your_Model}.h5")
print("Saved model to disk")
Loading the model
You can load the model structure from json like so:
from keras.models import model_from_json
# read the architecture back from the JSON file written above
with open('{Your_Model}.json', 'r') as json_file:
    loaded_model_json = json_file.read()
model = model_from_json(loaded_model_json)
And load the weights if you want to:
model.load_weights('{Your_Weights}.h5', by_name=True)
Then compile the model and you're ready to retrain/predict. by_name for me was essential to re-load the weights back into the same model architecture; leaving this out may cause an error.
Checkpointing the model during training
import tensorflow as tf
# write the weights to the checkpoint path during training
cp_callback = tf.keras.callbacks.ModelCheckpoint(filepath="{checkpoint_path}",
                                                 save_weights_only=True,
                                                 verbose=1)
# Train the model with the new callback
model.fit(train_images,
          train_labels,
          epochs=10,
          validation_data=(test_images, test_labels),
          callbacks=[cp_callback])  # Pass callback to training

Access Clients Loss while having keras tff NN models

I'm trying to obtain the losses of all clients in a TensorFlow Federated model, without luck. The answer to the post "how to print local outputs in tensorflow federated?" suggests creating the NN model from scratch. However, I already have my Keras NN model. Is there a way to access the local client losses without having to build the NN from scratch?
I tried to use tff.federated_collect(), but I'm not sure how to make that work.
This is part of my attempt:
trainer_Itr_Process = tff.learning.build_federated_averaging_process(
    model_fn_Federated,
    server_optimizer_fn=(lambda: tf.keras.optimizers.SGD(learning_rate=learn_rate)),
    client_weight_fn=None)
FLstate = trainer_Itr_Process.initialize()
@tff.learning.Model
def federated_output_computation():
    return {
        'num_examples': tff.federated_sum(metrics.num_examples),
        'loss': tff.federated_mean(metrics.loss, metrics.num_examples),
        'accuracy': tff.federated_mean(metrics.accuracy, metrics.num_examples),
        'per_client/num_examples': tff.federated_collect(metrics.num_examples),
        'per_client/loss': tff.federated_collect(metrics.loss),
        'per_client/accuracy': tff.federated_collect(metrics.accuracy),
    }
This is the error I received:
@tff.learning.Model
TypeError: object() takes no parameters
tff.learning.Model is not a decorator for functions, it is the class interface used by the tff.learning module.
Probably the best way to change the implementation of tff.learning.Model.federated_output_computation (which is what is recommended in "how to print local outputs in tensorflow federated?") is to create your own subclass of tff.learning.Model that implements a different federated_output_computation property. This would be close to re-implementing tff.learning.from_keras_model(), except for providing the custom metric aggregation. Looking at that implementation (here) can therefore be useful, but ingesting Keras models is non-trivial at the moment.

can we save a partially trained Machine Learning model, reload it again and train from the point it was saved?

I want to know whether there is any way to partially save a scikit-learn machine learning model and reload it later to continue training from the point where it was saved.
For scikit-learn models applied to sentiment analysis, I suspect you need to save two important things: 1) your model, 2) your vectorizer.
Remember that after training your model, your words are represented by a vector of length N, which is determined by the total number of words in your vocabulary.
Below is a piece of code where my test model and test vectorizer are saved in order to be used later.
SAVING THE MODEL
import pickle
pickle.dump(vectorizer, open("model5vectorizer.pickle", "wb"))
pickle.dump(classifier_fitted, open("model5.pickle", "wb"))
LOADING THE MODEL IN A NEW SCRIPT (.py)
import pickle
model = pickle.load(open("model5.pickle", "rb"))
vectorizer = pickle.load(open("model5vectorizer.pickle", "rb"))
TEST YOUR MODEL
sentence_test = ["Results by Andutta et al (2013), were completely wrong and unrealistic."]
USING THE VECTORIZER (model5vectorizer.pickle) !!
sentence_test_data = vectorizer.transform(sentence_test)
print("### sentence_test ###")
print(sentence_test)
print("### sentence_test_data ###")
print(sentence_test_data)
# OBS-1: VECTOR HERE WILL HAVE SAME LENGTH AS BEFORE :)
# OBS-2: If you load the default vectorizer or a different one, then you may see the following problems
# sklearn.exceptions.NotFittedError: TfidfVectorizer - Vocabulary wasn't fitted.
# # ValueError: X has 8 features per sample; expecting 11
result1 = model.predict(sentence_test_data) # using saved vectorizer from calibrated model
print("### RESULT ###")
print(result1)
Hope that helps.
Regards,
Andutta
When a data set is fitted to a scikit-learn machine learning model, the model is trained and supposedly ready to be used for prediction. If you train a model with, let's say, 100 samples, use it, and then go back and call fit with another 50 samples, you will not make it better; you will rebuild it.
If your purpose is to build a model that becomes more powerful as it sees more samples, you are thinking of a real-time setting, such as a mobile robot mapping an environment with a Kalman filter.
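For completeness: some scikit-learn estimators (for example SGDClassifier and MultinomialNB) expose a partial_fit method that updates an existing model with new samples instead of rebuilding it. The sketch below uses synthetic data, and the `make_batch` helper is made up for the example; it only shows the call pattern and makes no claim about how well the model ends up fitting.

```python
import numpy as np
from sklearn.linear_model import SGDClassifier

rng = np.random.default_rng(0)

def make_batch(n):
    """Synthetic two-class data: the label is 1 when the first feature is positive."""
    X = rng.normal(size=(n, 2))
    y = (X[:, 0] > 0).astype(int)
    return X, y

clf = SGDClassifier(random_state=0)

# First training session: the full list of classes must be given on the first call.
X1, y1 = make_batch(100)
clf.partial_fit(X1, y1, classes=np.array([0, 1]))

# Later session: update the same model with 50 more samples instead of refitting.
X2, y2 = make_batch(50)
clf.partial_fit(X2, y2)

predictions = clf.predict(X2)
```

Between the two sessions the classifier can be pickled and reloaded exactly as in the answer above, which gives the "save, reload, continue training" workflow for estimators that support it.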

Resources