I am trying to load a pretrained model resnet_18.pth file into pytorch. Online documentation suggested importing like so:
weights = torch.load("resnet_18.pth")
When I print the output of weights, it gives something like the following:
('module.layer4.1.bn2.running_mean', tensor([ 9.1797e+01, -2.4204e+02, 5.6480e+01, -2.0762e+02, 4.5270e+01,
-3.2356e+02, 1.8662e+02, -1.4498e+02, -2.3701e+02, 3.2354e+01,
...
All of the tutorials mentioned loading weights using a base model:
model = TheModelClass(*args, **kwargs)
model.load_state_dict(torch.load(PATH))
model.eval()
I want to use a default resnet-18 model to apply the weights on, but I the resent18 from tensorflow vision does not have the load_state_dict function. Help is appreciated.
from torchvision.models import resnet18
resnet18.load_state_dict(torch.load("resnet_18.pth"))
# 'function' object has no attribute 'load_state_dict'
resnet18 is itself a function that returns a ResNet18 model. What you can do to load your own pretrained weights is to use
model = resnet18()
model.load_state_dict(torch.load("resnet_18.pth"))
Note that load_state_dict(...) loads the weights in-place and does not return model itself.
Related
I need to extract features from a pretrained (fine-tuned) BERT model.
I fine-tuned a pretrained BERT model in Pytorch using huggingface transformer. All the training/validation is done on a GPU in cloud.
At the end of the training, I save the model and tokenizer like below:
best_model.save_pretrained('./saved_model/')
tokenizer.save_pretrained('./saved_model/')
This creates below files in the saved_model directory:
config.json
added_token.json
special_tokens_map.json
tokenizer_config.json
vocab.txt
pytorch_model.bin
I save the saved_model directory in my computer and load the model and tokenizer like below
model = torch.load('./saved_model/pytorch_model.bin',map_location=torch.device('cpu'))
tokenizer = BertTokenizer.from_pretrained('./saved_model/')
Now to extract features, I do below
input_ids = torch.tensor([tokenizer.encode("Here is some text to encode", add_special_tokens=True)])
last_hidden_states = model(input_ids)[0][0]
But for the last line, it throws me error TypeError: 'collections.OrderedDict' object is not callable
It seems like I am not loading the model properly. Instead of loading the entire model in itself, I think my model=torch.load(....) line is loading a ordered dictionary.
What am I missing here? Am I even saving the model in the right way? Please suggest.
torch.load() returns a collections.OrderedDict object. Checkout the recommended way of saving and loading a model's state dict.
Save:
torch.save(model.state_dict(), PATH)
Load:
model = TheModelClass(*args, **kwargs)
model.load_state_dict(torch.load(PATH))
model.eval()
So, in your case, it should be:
model = BertModel(config)
model.load_state_dict('./saved_model/pytorch_model.bin',
map_location=torch.device('cpu'))
model.eval() # to disable dropouts
I have pre-trained word embeddings from two different languages using MUSE. Now suppose I have a encoder-decoder architecture. And I created a embedding layer from one of this embedding. But where do I pass it in the model?
The model is trying to translate from one language to another. I have created a embedding_layer. Where do I pass it in in the below code?
"""
model = Model([encoder_inputs, decoder_inputs], decoder_outputs)
# Run training
model.compile(optimizer='rmsprop', loss='categorical_crossentropy')
model.fit([encoder_input_data, decoder_input_data], decoder_target_data,
batch_size=batch_size,
epochs=epochs,
validation_split=0.2)
"""
Look at the docs of keras: https://keras.io/getting-started/faq/
If you have the whole model saved, you can load the model using the command
keras.models.load_model(filepath)
This is the code example from the Kera docs:
from keras.models import load_model
model.save('my_model.h5') # creates a HDF5 file 'my_model.h5'
del model # deletes the existing model
# returns a compiled model
# identical to the previous one
model = load_model('my_model.h5')
If you have just only the weights, You can use this command:
model.load_weights('my_model_weights.h5')
I am looking for a way to re initialize layer's weights in an existing keras pre trained model.
I am using python with keras and need to use transfer learning,
I use the following code to load the pre trained keras models
from keras.applications import vgg16, inception_v3, resnet50, mobilenet
vgg_model = vgg16.VGG16(weights='imagenet')
I read that when using a dataset that is very different than the original dataset it might be beneficial to create new layers over the lower level features that we have in the trained net.
I found how to allow fine tuning of parameters and now I am looking for a way to reset a selected layer for it to re train. I know I can create a new model and use layer n-1 as input and add layer n to it, but I am looking for a way to reset the parameters in an existing layer in an existing model.
For whatever reason you may want to re-initalize the weights of a single layer k, here is a general way to do it:
from keras.applications import vgg16
from keras import backend as K
vgg_model = vgg16.VGG16(weights='imagenet')
sess = K.get_session()
initial_weights = vgg_model.get_weights()
from keras.initializers import glorot_uniform # Or your initializer of choice
k = 30 # say for layer 30
new_weights = [glorot_uniform()(initial_weights[i].shape).eval(session=sess) if i==k else initial_weights[i] for i in range(len(initial_weights))]
vgg_model.set_weights(new_weights)
You can easily verify that initial_weights[k]==new_weights[k] returns an array of False, while initial_weights[i]==new_weights[i] for any other i returns an array of True.
I have trained a constitutional net using transfer learning from ResNet50 in keras as given below.
base_model = applications.ResNet50(weights='imagenet', include_top=False, input_shape=(333, 333, 3))
## set model architechture
x = base_model.output
x = GlobalAveragePooling2D()(x)
x = Dense(1024, activation='relu')(x)
x = Dense(256, activation='relu')(x)
predictions = Dense(y_train.shape[1], activation='softmax')(x)
model = Model(input=base_model.input, output=predictions)
model.compile(loss='categorical_crossentropy', optimizer=optimizers.SGD(lr=1e-4, momentum=0.9),
metrics=['accuracy'])
model.summary()
After training the model as given below I want to save the model.
history = model.fit_generator(
train_datagen.flow(x_train, y_train, batch_size=batch_size),
steps_per_epoch=600,
epochs=epochs,
callbacks=callbacks_list
)
I can't use save_model() function from models of keras as model is of type Model here. I used save() function to save the model. But later when i loaded the model and validated the model it behaved like a untrained model. I think the weights were not saved. What was wrong.? How to save this model properly.?
As per Keras official docs,
If you only need to save the architecture of a model you can use
model_json = model.to_json()
with open("model_arch.json", "w") as json_file:
json_file.write(model_json)
To save weights
model.save_weights("my_model_weights.h5")
You can later load the json file and use
from keras.models import model_from_json
model = model_from_json(json_string)
And similarly, for weights you can use
model.load_weights('my_model_weights.h5')
I am using the same approach and this works perfectly well.
I don't know what happens with my models, but I've never been able to use save_model() and load_model(), there is always an error associated. But these functions exist.
What I usually do is to save and load weights (it's enough for using the model, but may cause a little problem for further training, as the "optimizer" state was not saved, but it was never a big problem, soon a new optimizer finds its way)
model.save_weights(fileName)
model.load_weights(fileName)
Another option us using numpy for saving - this one never failed:
np.save(fileName,model.get_weights())
model.set_weights(np.load(fileName))
For this to work, just create your model again (keep the code you use to create it) and set its weights.
I'm importing a pre-trained VGG model in Keras, with
from keras.applications.vgg16 import VGG16
I've noticed that the type of a standard model is keras.models.Sequential, while a pre-trained model is keras.engine.training.Model. I usually add and remove layers with add and pop for sequential models respectively, however, I cannot seem to use pop with pre-trained models.
Is there an alternative to pop for these type of models?
Depends on what you're wanting to remove. If you want to remove the last softmax layer and use the model for transfer learning, you can pass the include_top=False kwarg into the model like so:
from keras.applications.vgg16 import VGG16
IN_SHAPE = (256, 256, 3) # image dimensions and RGB channels
pretrained_model = VGG16(
include_top=False,
input_shape=IN_SHAPE,
weights='imagenet'
)
I wrote a blog post on this use case recently that has some code examples and goes into a bit more detail: http://innolitics.com/10x/pretrained-models-with-keras/
If you're wanting to modify the model architecture more than that, you can access the pop() method via pretrained_model.layers.pop(), as is explained in the link #indraforyou posted.
Side note: When you're modifying layers in a pretrained model, it can be especially helpful to have a visualization of the structure and input/output shapes. pydot and graphviz is particularly useful for this:
import pydot
pydot.find_graphviz = lambda: True
from keras.utils import plot_model
plot_model(model, show_shapes=True, to_file='../model_pdf/{}.pdf'.format(model_name))