I'm importing a pre-trained VGG model in Keras, with
from keras.applications.vgg16 import VGG16
I've noticed that a standard model's type is keras.models.Sequential, while a pre-trained model's type is keras.engine.training.Model. With sequential models I usually add and remove layers with add and pop respectively; however, I can't seem to use pop with pre-trained models.
Is there an alternative to pop for this type of model?
It depends on what you want to remove. If you want to remove the last softmax layer and use the model for transfer learning, you can pass the include_top=False kwarg into the model like so:
from keras.applications.vgg16 import VGG16
IN_SHAPE = (256, 256, 3) # image dimensions and RGB channels
pretrained_model = VGG16(
    include_top=False,
    input_shape=IN_SHAPE,
    weights='imagenet'
)
I wrote a blog post on this use case recently that has some code examples and goes into a bit more detail: http://innolitics.com/10x/pretrained-models-with-keras/
If you want to modify the model architecture more than that, you can access the pop() method via pretrained_model.layers.pop(), as explained in the link @indraforyou posted.
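If you need finer control than include_top=False gives you, another option (a sketch I'm adding here, not taken from the linked post) is to rebuild a new functional Model that ends at whichever intermediate layer you want, instead of mutating model.layers:
from keras.applications.vgg16 import VGG16
from keras.models import Model

full_model = VGG16(weights='imagenet')
# 'block5_pool' is the last pooling layer in VGG16; substitute whichever
# layer you want the truncated model to end at.
truncated_model = Model(inputs=full_model.input,
                        outputs=full_model.get_layer('block5_pool').output)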
Side note: when you're modifying layers in a pre-trained model, it can be especially helpful to have a visualization of the structure and the input/output shapes. pydot and graphviz are particularly useful for this:
import pydot
# Workaround: some Keras versions call pydot.find_graphviz(), which newer
# pydot releases removed; stubbing it out lets plot_model run.
pydot.find_graphviz = lambda: True
from keras.utils import plot_model
plot_model(model, show_shapes=True, to_file='../model_pdf/{}.pdf'.format(model_name))
Related
I am trying to load a pretrained model from a resnet_18.pth file into PyTorch. The online documentation suggested loading it like so:
weights = torch.load("resnet_18.pth")
When I print the output of weights, it gives something like the following:
('module.layer4.1.bn2.running_mean', tensor([ 9.1797e+01, -2.4204e+02, 5.6480e+01, -2.0762e+02, 4.5270e+01,
-3.2356e+02, 1.8662e+02, -1.4498e+02, -2.3701e+02, 3.2354e+01,
...
All of the tutorials mention loading the weights into a base model:
model = TheModelClass(*args, **kwargs)
model.load_state_dict(torch.load(PATH))
model.eval()
I want to apply the weights to a default ResNet-18 model, but the resnet18 from torchvision does not have a load_state_dict function. Help is appreciated.
from torchvision.models import resnet18
resnet18.load_state_dict(torch.load("resnet_18.pth"))
# 'function' object has no attribute 'load_state_dict'
resnet18 is itself a function that returns a ResNet18 model. What you can do to load your own pretrained weights is to use
model = resnet18()
model.load_state_dict(torch.load("resnet_18.pth"))
Note that load_state_dict(...) loads the weights in-place and does not return model itself.
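For completeness, a short usage sketch (the file name comes from the question; the rest is an assumption on my part). Note that the keys you printed start with module., which usually means the checkpoint was saved from a DataParallel-wrapped model, so that prefix may need stripping before the keys line up with a plain resnet18:
import torch
from torchvision.models import resnet18

model = resnet18()
state_dict = torch.load("resnet_18.pth")
# Strip the "module." prefix left over from DataParallel, if present.
state_dict = {k.replace("module.", "", 1): v for k, v in state_dict.items()}
model.load_state_dict(state_dict)  # loads in-place; does not return the model
model.eval()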
I am trying to convert pix2pix to a .pb or ONNX model that can run in Lens Studio. Lens Studio has strict requirements for the models. I am trying to export this PyTorch model to ONNX using this guide provided by Lens Studio. The issue is that the PyTorch model found here uses its own base class, whereas the example uses nn.Module, and therefore it doesn't have the methods/variables that the torch.onnx.export function needs to run. So far I've found that it's missing a variable called training and a method called train.
Would it be worth it to try to modify the base model, or should I try to build it from scratch using nn.Module? Is there a way to make the pix2pix model inherit from both the abstract base class and nn.Module? Am I not understanding the situation? The reason I want to do it using the Lens Studio tutorial is that I have gotten it to export to ONNX in other ways, but Lens Studio won't accept those for various reasons.
Also, this is my first time asking an SO question (after 6 years of coding), so let me know if I make any mistakes and I can correct them. Thank you.
This is the important code from the tutorial for creating a PyTorch model for Lens Studio:
import torch
import torch.nn as nn

class Model(nn.Module):
    def __init__(self):
        super().__init__()
        self.layer = nn.Conv2d(in_channels=3, out_channels=1,
                               kernel_size=3, stride=2, padding=1)

    def forward(self, x):
        out = self.layer(x)
        out = nn.functional.interpolate(out, scale_factor=2,
                                        mode='bilinear', align_corners=True)
        out = torch.nn.functional.softmax(out, dim=1)
        return out
I'm not going to include all the code from the PyTorch model because it's large, but the beginning of base_model.py is:
import os
import torch
from collections import OrderedDict
from abc import ABC, abstractmethod
from . import networks


class BaseModel(ABC):
    """This class is an abstract base class (ABC) for models.

    To create a subclass, you need to implement the following five functions:
        -- <__init__>: initialize the class; first call BaseModel.__init__(self, opt).
        -- <set_input>: unpack data from dataset and apply preprocessing.
        -- <forward>: produce intermediate results.
        -- <optimize_parameters>: calculate losses, gradients, and update network weights.
        -- <modify_commandline_options>: (optionally) add model-specific options and set default options.
    """

    def __init__(self, opt):
        """Initialize the BaseModel class.

        Parameters:
            opt (Option class) -- stores all the experiment flags; needs to be a subclass of BaseOptions

        When creating your custom class, you need to implement your own initialization.
        In this function, you should first call <BaseModel.__init__(self, opt)>
        Then, you need to define four lists:
            -- self.loss_names (str list): specify the training losses that you want to plot and save.
            -- self.model_names (str list): define networks used in our training.
            -- self.visual_names (str list): specify the images that you want to display and save.
            -- self.optimizers (optimizer list): define and initialize optimizers. You can define one optimizer for each network. If two networks are updated at the same time, you can use itertools.chain to group them. See cycle_gan_model.py for an example.
        """
        self.opt = opt
        self.gpu_ids = opt.gpu_ids
        self.isTrain = opt.isTrain
        self.device = torch.device('cuda:{}'.format(self.gpu_ids[0])) if self.gpu_ids else torch.device('cpu')  # get device name: CPU or GPU
        self.save_dir = os.path.join(opt.checkpoints_dir, opt.name)  # save all the checkpoints to save_dir
        if opt.preprocess != 'scale_width':  # with [scale_width], input images might have different sizes, which hurts the performance of cudnn.benchmark.
            torch.backends.cudnn.benchmark = True
        self.loss_names = []
        self.model_names = []
        self.visual_names = []
        self.optimizers = []
        self.image_paths = []
        self.metric = 0  # used for learning rate policy 'plateau'
and for pix2pix_model.py:
import torch
from .base_model import BaseModel
from . import networks


class Pix2PixModel(BaseModel):
    """This class implements the pix2pix model, for learning a mapping from input images to output images given paired data.

    The model training requires '--dataset_mode aligned' dataset.
    By default, it uses a '--netG unet256' U-Net generator,
    a '--netD basic' discriminator (PatchGAN),
    and a '--gan_mode' vanilla GAN loss (the cross-entropy objective used in the original GAN paper).

    pix2pix paper: https://arxiv.org/pdf/1611.07004.pdf
    """

    @staticmethod
    def modify_commandline_options(parser, is_train=True):
        """Add new dataset-specific options, and rewrite default values for existing options.

        Parameters:
            parser          -- original option parser
            is_train (bool) -- whether training phase or test phase. You can use this flag to add training-specific or test-specific options.

        Returns:
            the modified parser.

        For pix2pix, we do not use image buffer.
        The training objective is: GAN Loss + lambda_L1 * ||G(A)-B||_1
        By default, we use vanilla GAN loss, UNet with batchnorm, and aligned datasets.
        """
        # changing the default values to match the pix2pix paper (https://phillipi.github.io/pix2pix/)
        parser.set_defaults(norm='batch', netG='unet_256', dataset_mode='aligned')
        if is_train:
            parser.set_defaults(pool_size=0, gan_mode='vanilla')
            parser.add_argument('--lambda_L1', type=float, default=100.0, help='weight for L1 loss')
        return parser

    def __init__(self, opt):
        """Initialize the pix2pix class.

        Parameters:
            opt (Option class) -- stores all the experiment flags; needs to be a subclass of BaseOptions
        """
(Also, a side note: if you look at this and it seems like there's no easy way out, let me know. I know what it's like to see someone who's just getting started go in too deep too early.)
You can definitely have your model inherit from both the base class and torch.nn.Module (Python allows multiple inheritance). However, you should watch out for conflicts if both parent classes define methods with identical names (I can see at least one: their base class provides an eval function, and so does nn.Module).
However, since you do not need CycleGAN, and a lot of the code is there for compatibility with their training environment, you'd probably be better off just re-implementing pix2pix. Take the code, have it inherit from nn.Module, copy-paste the useful/mandatory functions from the base class, and translate everything into clean PyTorch code. You already have the forward function (which is the only requirement for a PyTorch module).
All the subnetworks they use (like the ResNet blocks) already seem to inherit from nn.Module, so there is nothing to change there (double-check that, though).
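As a rough sketch of that route (my own example, not code from their repo): since the generator created by networks.define_G is already an nn.Module, you can wrap just that in a thin nn.Module and export it; the input size below (1x3x256x256 for unet_256) is an assumption you should adjust to your model:
import torch
import torch.nn as nn

class Pix2PixExportWrapper(nn.Module):
    """Thin nn.Module wrapper around the trained pix2pix generator."""
    def __init__(self, netG):
        super().__init__()
        self.netG = netG

    def forward(self, x):
        return self.netG(x)

# Hypothetical usage: `netG` is the trained generator loaded from a checkpoint.
# wrapper = Pix2PixExportWrapper(netG).eval()
# dummy = torch.randn(1, 3, 256, 256)
# torch.onnx.export(wrapper, dummy, "pix2pix.onnx", opset_version=11)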
I have pre-trained word embeddings for two different languages from MUSE. Now suppose I have an encoder-decoder architecture, and I created an embedding layer from one of these embeddings. Where do I pass it in the model?
The model is trying to translate from one language to another. I have created an embedding_layer. Where do I pass it in the code below?
"""
model = Model([encoder_inputs, decoder_inputs], decoder_outputs)
# Run training
model.compile(optimizer='rmsprop', loss='categorical_crossentropy')
model.fit([encoder_input_data, decoder_input_data], decoder_target_data,
batch_size=batch_size,
epochs=epochs,
validation_split=0.2)
"""
Look at the Keras docs: https://keras.io/getting-started/faq/
If you have the whole model saved, you can load the model using the command
keras.models.load_model(filepath)
This is the code example from the Keras docs:
from keras.models import load_model
model.save('my_model.h5') # creates a HDF5 file 'my_model.h5'
del model # deletes the existing model
# returns a compiled model
# identical to the previous one
model = load_model('my_model.h5')
If you have only the weights, you can use this command:
model.load_weights('my_model_weights.h5')
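A small sketch of the weights-only round trip, in case it helps (the file name is just an example):
# Save only the weights, then load them back into a model with the same architecture.
model.save_weights('my_model_weights.h5')
model.load_weights('my_model_weights.h5')
# by_name=True loads weights only into layers whose names match, which is
# useful when the architecture has changed (e.g. for transfer learning).
model.load_weights('my_model_weights.h5', by_name=True)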
I am looking for a way to re-initialize a layer's weights in an existing, pre-trained Keras model.
I am using Python with Keras and need to use transfer learning.
I use the following code to load the pre-trained Keras models:
from keras.applications import vgg16, inception_v3, resnet50, mobilenet
vgg_model = vgg16.VGG16(weights='imagenet')
I read that when using a dataset that is very different from the original dataset, it might be beneficial to create new layers on top of the lower-level features that we have in the trained net.
I found how to allow fine-tuning of parameters, and now I am looking for a way to reset a selected layer so it can be retrained. I know I can create a new model that uses layer n-1 as input and adds layer n on top, but I am looking for a way to reset the parameters of an existing layer in an existing model.
For whatever reason you may want to re-initialize the weights of a single layer k, here is a general way to do it:
from keras.applications import vgg16
from keras import backend as K
vgg_model = vgg16.VGG16(weights='imagenet')
sess = K.get_session()
initial_weights = vgg_model.get_weights()
from keras.initializers import glorot_uniform  # Or your initializer of choice

k = 30  # index of the weight array to re-initialize (get_weights() returns one array per weight tensor, not per layer)
new_weights = [glorot_uniform()(initial_weights[i].shape).eval(session=sess) if i == k else initial_weights[i]
               for i in range(len(initial_weights))]
vgg_model.set_weights(new_weights)
You can easily verify that initial_weights[k] == new_weights[k] returns an array of False values, while initial_weights[i] == new_weights[i] for any other i returns an array of True values.
I'm looking to do some image classification on PDF documents that I convert to images. I'm using the TensorFlow Inception v3 pre-trained model and trying to retrain the last layer with my own categories, following the TensorFlow tutorial. I have ~1000 training images per category and only 4 categories. With 200k iterations I can reach up to 90% successful classification, which is not bad but still needs some work:
The issue here is that this pre-trained model only takes 300x300 px images as input. Obviously this destroys a lot of the detail in the characters involved in the features I'm trying to recognize in the documents.
Would it be possible to alter the model's input layer so I can give it higher-resolution images?
Would I get better results with a homemade and much simpler model?
If so, where should I start to build a model for such image classification ?
If you want to use an image resolution different from the one the pre-trained model was built for, you should keep only the convolutional blocks and add a new set of fully connected layers sized for the new input. Using a higher-level library like Keras will make this a lot easier. Below is an example of how to do it in Keras.
import keras
from keras.layers import Flatten, Dense, GlobalAveragePooling2D
from keras.models import Model
from keras.applications.inception_v3 import InceptionV3

base_model = InceptionV3(include_top=False, input_shape=(600, 600, 3), weights='imagenet')
x = base_model.output
x = GlobalAveragePooling2D()(x)
x = Dense(1024, activation='relu')(x)
# Add as many Dense / fully connected layers as required
pred = Dense(10, activation='softmax')(x)
model = Model(base_model.input, pred)

for l in model.layers[:-3]:
    l.trainable = False
include_top=False gives you only the convolutional blocks. You can use input_shape=(600,600,3) to set the input shape you need, and you can add as many Dense (fully connected) layers as required. The last layer should have the required number of categories; here, 10 represents the number of classes. With this approach you reuse all the weights of the pre-trained model's convolutional layers and train only the final dense layers.
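A minimal sketch of training just those new top layers, assuming hypothetical arrays train_images of shape (N, 600, 600, 3) and one-hot train_labels of shape (N, 10):
# Only the unfrozen Dense layers at the top will be updated during training.
model.compile(optimizer='adam',
              loss='categorical_crossentropy',
              metrics=['accuracy'])
model.fit(train_images, train_labels,
          batch_size=16,
          epochs=10,
          validation_split=0.2)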