I need to extract features from a fine-tuned BERT model.
I fine-tuned a pretrained BERT model in PyTorch using the Hugging Face transformers library. All the training/validation was done on a GPU in the cloud.
At the end of the training, I save the model and tokenizer like below:
best_model.save_pretrained('./saved_model/')
tokenizer.save_pretrained('./saved_model/')
This creates below files in the saved_model directory:
config.json
added_token.json
special_tokens_map.json
tokenizer_config.json
vocab.txt
pytorch_model.bin
I copy the saved_model directory to my computer and load the model and tokenizer like below:
model = torch.load('./saved_model/pytorch_model.bin',map_location=torch.device('cpu'))
tokenizer = BertTokenizer.from_pretrained('./saved_model/')
Now to extract features, I do below
input_ids = torch.tensor([tokenizer.encode("Here is some text to encode", add_special_tokens=True)])
last_hidden_states = model(input_ids)[0][0]
But the last line throws this error: TypeError: 'collections.OrderedDict' object is not callable
It seems like I am not loading the model properly. Instead of loading the entire model, I think my model=torch.load(....) line is loading an ordered dictionary.
What am I missing here? Am I even saving the model in the right way? Please suggest.
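torch.load() returns a collections.OrderedDict object here because pytorch_model.bin holds only the model's state dict (its weights), not the model itself. Check out the recommended way of saving and loading a model's state dict.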
Save:
torch.save(model.state_dict(), PATH)
Load:
model = TheModelClass(*args, **kwargs)
model.load_state_dict(torch.load(PATH))
model.eval()
So, in your case, it should be:
config = BertConfig.from_pretrained('./saved_model/')
model = BertModel(config)
state_dict = torch.load('./saved_model/pytorch_model.bin', map_location=torch.device('cpu'))
model.load_state_dict(state_dict)
model.eval() # to disable dropouts
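An alternative that avoids handling the state dict by hand is from_pretrained, which reads both config.json and the weights from the directory written by save_pretrained. Below is a minimal sketch, assuming torch and transformers are installed. A tiny, randomly initialised BertModel with made-up sizes stands in for the fine-tuned model, and hand-written token ids replace the tokenizer so that nothing has to be downloaded:

```python
import tempfile

import torch
from transformers import BertConfig, BertModel

# Hypothetical tiny configuration; a real fine-tuned model would bring its own config.json.
config = BertConfig(
    vocab_size=100,
    hidden_size=32,
    num_hidden_layers=2,
    num_attention_heads=2,
    intermediate_size=64,
)
best_model = BertModel(config)  # stand-in for the fine-tuned model

with tempfile.TemporaryDirectory() as save_dir:
    # Writes config.json plus the weights file into save_dir
    best_model.save_pretrained(save_dir)
    # Rebuilds the architecture from config.json and loads the weights into it
    model = BertModel.from_pretrained(save_dir)

model.eval()  # disable dropout for inference

# Made-up token ids standing in for tokenizer.encode(...)
input_ids = torch.tensor([[1, 5, 6, 7, 2]])
with torch.no_grad():
    last_hidden_states = model(input_ids)[0][0]

print(last_hidden_states.shape)
```

With a real checkpoint the call would be BertModel.from_pretrained('./saved_model/'). If the model was fine-tuned with a task head, for example BertForSequenceClassification, loading it with BertModel should keep only the encoder weights, which is usually what feature extraction needs.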
I am trying to load a pretrained model resnet_18.pth file into pytorch. Online documentation suggested importing like so:
weights = torch.load("resnet_18.pth")
When I print the output of weights, it gives something like the following:
('module.layer4.1.bn2.running_mean', tensor([ 9.1797e+01, -2.4204e+02, 5.6480e+01, -2.0762e+02, 4.5270e+01,
-3.2356e+02, 1.8662e+02, -1.4498e+02, -2.3701e+02, 3.2354e+01,
...
All of the tutorials mentioned loading weights using a base model:
model = TheModelClass(*args, **kwargs)
model.load_state_dict(torch.load(PATH))
model.eval()
I want to use a default resnet-18 model to apply the weights to, but the resnet18 from torchvision does not seem to have a load_state_dict function. Help is appreciated.
from torchvision.models import resnet18
resnet18.load_state_dict(torch.load("resnet_18.pth"))
# 'function' object has no attribute 'load_state_dict'
resnet18 is itself a function that returns a ResNet18 model. What you can do to load your own pretrained weights is to use
model = resnet18()
model.load_state_dict(torch.load("resnet_18.pth"))
Note that load_state_dict(...) loads the weights in-place and does not return model itself.
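One more thing to watch for: the printed keys in the question start with "module.", which is what a checkpoint saved from a model wrapped in nn.DataParallel looks like. A plain model will not accept those keys unless the prefix is removed first. Here is a minimal sketch, assuming only torch is installed; a small nn.Sequential stands in for resnet18() so that it runs without torchvision, and the "module." checkpoint is simulated rather than read from resnet_18.pth:

```python
from collections import OrderedDict

import torch
import torch.nn as nn


def make_model():
    # Small stand-in for resnet18(); the real call would be torchvision.models.resnet18()
    return nn.Sequential(nn.Linear(4, 3), nn.ReLU(), nn.Linear(3, 2))


model = make_model()

# Simulate a checkpoint saved from nn.DataParallel: every key starts with "module."
wrapped = OrderedDict(
    ("module." + key, value.clone()) for key, value in model.state_dict().items()
)

# Strip the prefix so the keys match a plain, unwrapped model
prefix = "module."
cleaned = OrderedDict(
    (key[len(prefix):] if key.startswith(prefix) else key, value)
    for key, value in wrapped.items()
)

fresh = make_model()
# load_state_dict copies the weights in place and returns a report of
# missing and unexpected keys, not the model
result = fresh.load_state_dict(cleaned)
print(result.missing_keys, result.unexpected_keys)
```

For the real file, cleaned would be built from torch.load("resnet_18.pth") in the same way.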
The motivation behind this question is I had saved a Keras model using Matterport's MaskRCNN and in the tf.keras.callbacks.ModelCheckpoint() had very explicitly set the save_weights_only argument to False, so that the entire model would be saved (not just the weights).
Turns out there's a bug in the ModelCheckpoint() callback where it sometimes does not save the full model.
This is obviously a problem when you go to load the model after closing your TF session, as the Graph, architecture, and optimizer state are gone, making it hard (if not impossible) to reload that saved model.
Therefore, I am asking whether it is possible to somehow extract the TF session retroactively, from just the .h5 weights file, after the session has closed (resulting from, for example, your Notebook kernel crashing).
Not much code to go on, but there it is:
Given a .h5 file that was saved after each epoch of training a model in Keras, is it possible to extract the Graph session from that .h5 file, and if so, how?
I have several models saved in .h5 format but never called tf.get_session() during the saving of the model weights in h5 format.
with tf.Session() as sess:
How do I load this model using TensorFlow?
TF 2.0 makes this a cinch, but how to solve this on Tensorflow version 1.14?
The end goal of this is to take a model saved with Keras as a .h5 file and do inference with it on Tensorflow Serving, which needs, to my knowledge, a protobuf file in .pb format.
https://medium.com/#pipidog/how-to-convert-your-keras-models-to-tensorflow-e471400b886a
I've tried keras_to_tensorflow:
https://github.com/amir-abdi/keras_to_tensorflow
The code to convert ModelCheckPoint saved in .h5 format to .pb format is shown below:
import tensorflow as tf
# The export path contains the name and the version of the model
tf.keras.backend.set_learning_phase(0) # Ignore dropout at inference
model = tf.keras.models.load_model('./model.h5')
export_path = './PlanetModel/1'
# Fetch the Keras session and save the model
# The signature definition is defined by the input and output tensors
# And stored with the default serving key
with tf.keras.backend.get_session() as sess:
    tf.saved_model.simple_save(
        sess,
        export_path,
        inputs={'input_image': model.input},
        outputs={t.name: t for t in model.outputs})
For more information, please refer to this article.
For other ways to do it, please refer to this Stack Overflow answer.
I fine-tuned a pretrained BERT model in PyTorch using the Hugging Face transformers library. All the training/validation was done on a GPU in the cloud.
At the end of the training, I save the model and tokenizer like below:
best_model.save_pretrained('./saved_model/')
tokenizer.save_pretrained('./saved_model/')
This creates below files in the saved_model directory:
config.json
added_token.json
special_tokens_map.json
tokenizer_config.json
vocab.txt
pytorch_model.bin
Now I download the saved_model directory to my computer and want to load the model and tokenizer. I can load the model like below:
model = torch.load('./saved_model/pytorch_model.bin',map_location=torch.device('cpu'))
But how do I load the tokenizer? I am new to pytorch and not sure because there are multiple files. Probably I am not saving the model in the right way?
If you look at the syntax, it is the directory of the pre-trained model that you are supposed to pass. Hence, the correct way to load tokenizer must be:
tokenizer = BertTokenizer.from_pretrained(<Path to the directory containing pretrained model/tokenizer>)
In your case:
tokenizer = BertTokenizer.from_pretrained('./saved_model/')
./saved_model here is the directory where you'll be saving your pretrained model and tokenizer.
I have saved my Inception model in PyCharm using the TensorFlow library. Every time I run the project, it starts training on the dataset again. I want to skip training on every run, because once the model has been saved there is no need to train it again and again. How do I know whether my model has been saved successfully? How can I use the saved model in the same file?
You can save/restore/load your model using TensorFlow:
Save:
builder = tf.saved_model.builder.SavedModelBuilder(export_dir)
with tf.Session(graph=tf.Graph()) as sess:
    ...
    builder.add_meta_graph_and_variables(sess,
                                         [tag_constants.TRAINING],
                                         signature_def_map=foo_signatures,
                                         assets_collection=foo_assets,
                                         strip_default_attrs=True)
...
builder.save()
Load:
with tf.Session(graph=tf.Graph()) as sess:
    tf.saved_model.loader.load(sess, [tag_constants.TRAINING], export_dir)
    ...
For further reference: TensorFlow Guide on Saving a Model
Once you have saved your model, files with the extension .yaml, .h5 or .meta (for the graph) will appear in your directory. As a sanity check, you can restore the model from the saved file and check its accuracy.
There is nice tutorial on this:
https://www.tensorflow.org/guide/saved_model
http://cv-tricks.com/tensorflow-tutorial/save-restore-tensorflow-models-quick-complete-tutorial/
If you use the Keras API to build the model, then this link will be useful for saving and restoring: https://keras.io/getting-started/faq/#how-can-i-save-a-keras-model
To save a model in Keras, what are the differences between the output files of:
model.save()
model.save_weights()
ModelCheckpoint() in the callback
The saved file from model.save() is larger than the model from model.save_weights(), but significantly larger than a JSON or Yaml model architecture file. Why is this?
Restating this: Why is size(model.save()) + size(something) = size(model.save_weights()) + size(model.to_json()), what is that "something"?
Would it be more efficient to just model.save_weights() and model.to_json(), and load from these than to just do model.save() and load_model()?
What are the differences?
save() saves the weights and the model structure to a single HDF5 file. I believe it also includes things like the optimizer state. Then you can use that HDF5 file with load_model() to reconstruct the whole model, including weights.
save_weights() only saves the weights to HDF5 and nothing else. You need extra code to reconstruct the model from a JSON file.
model.save_weights(): Will only save the weights so if you need, you are able to apply them on a different architecture
model.save(): Will save the architecture of the model + the weights + the training configuration + the state of the optimizer
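The size difference asked about in the question can be checked directly. Below is a minimal sketch, assuming TensorFlow 2.x with h5py and NumPy installed; the layer sizes, file names and random training data are made up for illustration. It trains a small model for one epoch so that the Adam optimizer has state, then writes the same model with save() and save_weights() and compares the file sizes:

```python
import os
import tempfile

import numpy as np
import tensorflow as tf

# Hypothetical small model; any compiled Keras model would do
model = tf.keras.Sequential([
    tf.keras.Input(shape=(64,)),
    tf.keras.layers.Dense(256, activation="relu"),
    tf.keras.layers.Dense(1),
])
model.compile(optimizer="adam", loss="mse")

# One short training run on random data so the optimizer creates its state
x = np.random.rand(32, 64).astype("float32")
y = np.random.rand(32, 1).astype("float32")
model.fit(x, y, epochs=1, verbose=0)

with tempfile.TemporaryDirectory() as tmp:
    full_path = os.path.join(tmp, "full_model.h5")
    weights_path = os.path.join(tmp, "only.weights.h5")

    model.save(full_path)             # architecture + weights + training config + optimizer state
    model.save_weights(weights_path)  # weights only

    full_size = os.path.getsize(full_path)
    weights_size = os.path.getsize(weights_path)

architecture_json = model.to_json()   # architecture only, as a JSON string

print("model.save():        ", full_size, "bytes")
print("model.save_weights():", weights_size, "bytes")
print("model.to_json():     ", len(architecture_json), "characters")
```

The gap between the two files is mostly the optimizer state (Adam keeps two extra tensors per weight) plus the stored training configuration, which is the "something" in the question.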
To add what ModelCheckpoint's output is, in case it's relevant for anyone else: used as a callback during model training, it saves either the whole model or just the weights, depending on the save_weights_only argument. If it is True, only the weights are saved, akin to calling model.save_weights(). If it is False (the default), the whole model is saved, as in calling model.save().
Adding to the answers above: as of tf.keras version 2.7.0, model.save() can write two formats, the TensorFlow SavedModel format and the older Keras H5 format. SavedModel is the recommended format and the default when model.save() is called. To save in .h5 (HDF5) format, use model.save('my_model', save_format='h5').