Saving an LSTM language model in Torch - Lua

I am using the LSTM language model from https://github.com/wojzaremba/lstm/blob/master/main.lua
I want to save the model at the end of training for later use, so I added the following line at the end of training:
torch.save(params.model_file, model)
This seems to save the model successfully. However, when I load that model and test it, I get a very large perplexity. Just for testing, I ran a small training instance, which resulted in a test set perplexity of 134, then saved the model. I then loaded the saved model and applied exactly the same testing method (the run_test function) on the same test set, but got a huge perplexity of 71675.134 (even random weights give a much lower perplexity than that!). I tried saving and loading only the weights, converting them to float() before saving, and saving them as CudaTensors, and all gave the same result.
Here is the code for loading and testing after saving the whole model; I only modified the main method from the original main.lua:
local function main()
  g_init_gpu(arg)
  print('loading model from file ' .. params.model_file)
  model = torch.load(params.model_file)
  state_test = {data=transfer_data(ptb.testdataset(params.batch_size))}
  reset_state(state_test)
  run_test()
end

Related

Keras loaded model always trains instead of predicting

I am doing a deep learning project. After training, I save the model as an h5 file. In another file, I load the saved model and use it to predict. However, when I run the code in PyCharm, the model starts training again. I restarted my laptop and ran it again, but the same thing still happens. Is PyCharm running the wrong file?
model.save('model_10000.h5')
Then in another file
model = load_model('model_10000.h5')
model.compile(loss='binary_crossentropy',
              optimizer='adam',
              metrics=['accuracy'])
# predict for test set
pred = model.predict(testX)
This is what I got.
You don't need to compile your saved model; maybe that has something to do with it.
model.save('model_10000.h5')
model = load_model('model_10000.h5')
pred = model.predict(testX)
Check this for more detail: https://www.tensorflow.org/tutorials/keras/save_and_load
You need not re-compile the model. Your saved model is stored in a compiled state, and when you load it back, load_model() always returns the compiled model (see the Keras FAQ).
So, just remove the model.compile step and you will be good to go.
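As a quick sanity check (a minimal sketch; testX and testY stand in for your held-out data), you can call evaluate() on the loaded model: it works only because the compile configuration was stored in the .h5 file, and neither evaluate() nor predict() triggers any training.
from tensorflow.keras.models import load_model

model = load_model('model_10000.h5')                 # restored already compiled
loss, acc = model.evaluate(testX, testY, verbose=0)  # uses the saved loss/metrics; no training happens
pred = model.predict(testX)                          # training only happens if model.fit() is called
print(loss, acc)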

How to extract features from a PyTorch pretrained fine-tuned model

I need to extract features from a pretrained (fine-tuned) BERT model.
I fine-tuned a pretrained BERT model in PyTorch using the Hugging Face transformers library. All the training/validation was done on a GPU in the cloud.
At the end of the training, I save the model and tokenizer like below:
best_model.save_pretrained('./saved_model/')
tokenizer.save_pretrained('./saved_model/')
This creates the following files in the saved_model directory:
config.json
added_token.json
special_tokens_map.json
tokenizer_config.json
vocab.txt
pytorch_model.bin
I copy the saved_model directory to my computer and load the model and tokenizer like below:
model = torch.load('./saved_model/pytorch_model.bin',map_location=torch.device('cpu'))
tokenizer = BertTokenizer.from_pretrained('./saved_model/')
Now, to extract features, I do the following:
input_ids = torch.tensor([tokenizer.encode("Here is some text to encode", add_special_tokens=True)])
last_hidden_states = model(input_ids)[0][0]
But the last line throws this error: TypeError: 'collections.OrderedDict' object is not callable
It seems like I am not loading the model properly. Instead of loading the entire model, I think my model = torch.load(....) line is only loading an ordered dictionary.
What am I missing here? Am I even saving the model in the right way? Please suggest.
torch.load() is returning a collections.OrderedDict here because pytorch_model.bin contains only the model's state dict, not the model object. Check out the recommended way of saving and loading a model's state dict:
Save:
torch.save(model.state_dict(), PATH)
Load:
model = TheModelClass(*args, **kwargs)
model.load_state_dict(torch.load(PATH))
model.eval()
So, in your case, it should be:
config = BertConfig.from_pretrained('./saved_model/')  # config.json was saved alongside the weights
model = BertModel(config)
model.load_state_dict(torch.load('./saved_model/pytorch_model.bin',
                                 map_location=torch.device('cpu')))
model.eval()  # to disable dropouts
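Alternatively, since the model was saved with save_pretrained(), it can be restored in one step with from_pretrained(), which reads config.json and pytorch_model.bin from that directory. A hedged sketch (assuming a reasonably recent transformers version, where the first element of the output is the last hidden state):
import torch
from transformers import BertModel, BertTokenizer

model = BertModel.from_pretrained('./saved_model/')        # loads config.json + pytorch_model.bin
tokenizer = BertTokenizer.from_pretrained('./saved_model/')
model.eval()                                               # disable dropout for feature extraction

input_ids = torch.tensor([tokenizer.encode("Here is some text to encode", add_special_tokens=True)])
with torch.no_grad():
    last_hidden_states = model(input_ids)[0]               # shape: (batch, seq_len, hidden_size)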

I applied an Inception model and my model has been saved, but how do I avoid training on the dataset again and again?

I have saved my Inception model in PyCharm using the TensorFlow library. Every time I run the project, it starts training on the dataset. I want to skip the training on later runs, because once the model has been saved there is no need to train on the data again and again. How do I know my model has been saved successfully? How can I use the saved model in the same file?
You can save/restore/load your model using TensorFlow:
Save:
builder = tf.saved_model.builder.SavedModelBuilder(export_dir)
with tf.Session(graph=tf.Graph()) as sess:
  ...
  builder.add_meta_graph_and_variables(sess,
                                       [tag_constants.TRAINING],
                                       signature_def_map=foo_signatures,
                                       assets_collection=foo_assets,
                                       strip_default_attrs=True)
...
builder.save()
Load:
with tf.Session(graph=tf.Graph()) as sess:
  tf.saved_model.loader.load(sess, [tag_constants.TRAINING], export_dir)
  ...
For further reference: TensorFlow Guide on Saving a Model
Actually, once you have saved your model, some files will be written to your directory with the extension .yaml, .h5, or .meta (for the graph). You can check the accuracy of the model by restoring it from the saved file, just as a sanity check.
There are nice tutorials on this:
https://www.tensorflow.org/guide/saved_model
http://cv-tricks.com/tensorflow-tutorial/save-restore-tensorflow-models-quick-complete-tutorial/
If you use the Keras API to build the model, then this link will be useful for saving and restoring: https://keras.io/getting-started/faq/#how-can-i-save-a-keras-model
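With the Keras API, the usual pattern for skipping retraining is to save once and, on later runs, load the saved file if it already exists. A minimal sketch (the file name and the build_and_train_model() helper are hypothetical placeholders for your own code):
import os
import tensorflow as tf

MODEL_PATH = 'inception_model.h5'   # hypothetical file name

if os.path.exists(MODEL_PATH):      # the file existing is also your sign that the save succeeded
    model = tf.keras.models.load_model(MODEL_PATH)   # skip training entirely
else:
    model = build_and_train_model() # your existing training code
    model.save(MODEL_PATH)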

Difference between Keras model.save() and model.save_weights()?

To save a model in Keras, what are the differences between the output files of:
model.save()
model.save_weights()
ModelCheckpoint() used as a callback
The file saved by model.save() is larger than the one from model.save_weights(), and both are significantly larger than a JSON or YAML model-architecture file. Why is this?
Restating this: why is size(model.save()) = size(model.save_weights()) + size(model.to_json()) + size(something), and what is that "something"?
Would it be more efficient to use model.save_weights() plus model.to_json() and load from those, rather than using model.save() and load_model()?
What are the differences?
save() saves the weights and the model structure to a single HDF5 file. I believe it also includes things like the optimizer state. Then you can use that HDF5 file with load_model() to reconstruct the whole model, including the weights.
save_weights() only saves the weights to HDF5 and nothing else. You need extra code to reconstruct the model from a JSON file.
model.save_weights(): will save only the weights, so if you need to, you can apply them to a different architecture
model.save(): will save the architecture of the model + the weights + the training configuration + the state of the optimizer
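A hedged sketch contrasting the two round-trips (file names are hypothetical; model is assumed to be an already-built Keras model):
from tensorflow.keras.models import load_model, model_from_json

# (1) model.save(): architecture + weights + training config + optimizer state in one file
model.save('full_model.h5')
restored = load_model('full_model.h5')            # ready to predict or resume training

# (2) model.save_weights() + model.to_json(): weights and architecture stored separately
model.save_weights('weights_only.h5')
with open('architecture.json', 'w') as f:
    f.write(model.to_json())

with open('architecture.json') as f:
    rebuilt = model_from_json(f.read())
rebuilt.load_weights('weights_only.h5')
rebuilt.compile(loss='binary_crossentropy', optimizer='adam')  # training config was not saved, so re-compile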
Just to add what ModelCheckpoint's output is, in case it's relevant for anyone else: used as a callback during model training, it can either save the whole model or just the weights, depending on the save_weights_only argument. With True, only the weights are saved, akin to calling model.save_weights(). With False (the default), the whole model is saved, as in calling model.save().
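For illustration, a small sketch of that flag (the paths and the training data trainX/trainY are hypothetical):
from tensorflow.keras.callbacks import ModelCheckpoint

ckpt_full = ModelCheckpoint('ckpt_full.h5')                                 # default: whole model, like model.save()
ckpt_weights = ModelCheckpoint('ckpt_weights.h5', save_weights_only=True)   # weights only, like model.save_weights()

model.fit(trainX, trainY, epochs=10, callbacks=[ckpt_full, ckpt_weights])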
Adding to the answers above: as of tf.keras version 2.7.0, the model can be saved in two formats using model.save(), i.e. the TensorFlow SavedModel format and the older Keras H5 format. The recommended format is SavedModel, and it is the default when model.save() is called. To save in the .h5 (HDF5) format, use model.save('my_model', save_format='h5').

Predicting text data labels in test data set with Weka?

I am using the Weka GUI to train an SVM classifier (using libSVM) on a dataset. The data in the .arff file is:
@relation Expandtext
@attribute message string
@attribute Class {positive, negative, objective}
@data
I turn it into a bag of words with the StringToWordVector filter, run the SVM, and get a decent classification rate. Now I have my test data and I want to predict its labels, which I do not know. Its header information is the same, but every class value is marked with a question mark (?), i.e.
'Musical awareness: Great Big Beautiful Tomorrow has an ending\u002c Now is the time does not', ?
Again I pre-processed it with StringToWordVector; the class attribute is in the same position as in the training data.
I go to the "classify" menu, load up my trained SVM model, select "supplied test data", load in the test data, and right-click on the model to choose "Re-evaluate model on current test set", but it gives me the error that test and train are not compatible. I am not sure why.
Am I going about this the wrong way to label the test data? What am I doing wrong?
For almost any machine learning algorithm, the training data and the test data need to have the same format. That means both must have the same features (attributes, in Weka terms), in the same format, including the class.
The problem is probably that you pre-process the training set and the test set independently, so the StringToWordVector filter creates different features for each set. Hence the model, trained on the training set, is incompatible with the test set.
What you want to do instead is initialize the filter on the training set and then apply it to both the training and the test set.
The question Weka: ReplaceMissingValues for a test file deals with this issue, but I'll repeat the relevant part here:
Instances train = ... // from somewhere
Instances test = ... // from somewhere
Filter filter = new StringToWordVector(); // could be any filter
filter.setInputFormat(train); // initializing the filter once with training set
Instances newTrain = Filter.useFilter(train, filter); // configures the Filter based on train instances and returns filtered instances
Instances newTest = Filter.useFilter(test, filter); // create new test set
Now, you can train the SVM and apply the resulting model on the test data.
If training and testing have to happen in separate runs or programs, it should be possible to serialize the initialized filter together with the model. When you load (deserialize) the model, you can also load the filter and apply it to the test data. They should be compatible then.

Resources