Cannot save best weights using Keras during training - machine-learning

I'm new to Keras. I want to save the model with the best weights, like this:
model1.compile(loss="mean_squared_error", optimizer="RMSprop")
model1.summary()
mcp_save = ModelCheckpoint('best_model.h5', save_best_only=True, monitor='val_accuracy', mode='auto', verbose=2)
callbacks_list = [mcp_save]
epochs = 5000
batch_size = 50
# fit the model
history = model1.fit(x_train, y_train,
                     batch_size=batch_size,
                     epochs=epochs,
                     callbacks=callbacks_list,
                     validation_data=(x_test, y_test),
                     verbose=2)
I don't get any warning or error message in PyCharm 2019 Community Edition, but I cannot find 'best_model.h5' in the project folder or anywhere else on my computer after training finishes. Could you give me some advice? What am I doing wrong?

Your code looks fine to me. I use this callback often. All I can suggest is that you use a full path to designate where to save the model rather than a relative path.
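For example, something along these lines (a minimal sketch assuming tf.keras; the directory is just a placeholder, so point it at a folder that definitely exists on your machine):
import os
from tensorflow.keras.callbacks import ModelCheckpoint

save_dir = 'C:/full/path/to/project'  # placeholder; use a real absolute path
mcp_save = ModelCheckpoint(os.path.join(save_dir, 'best_model.h5'),
                           save_best_only=True, monitor='val_accuracy',
                           mode='auto', verbose=2)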

Related

How do I test a Keras CNN model on images

First time learning Keras, and I wanted to make a model that can classify pictures as either chickens or nature.
import tensorflow as tf
from tensorflow.keras.preprocessing.image import ImageDataGenerator
from tensorflow.keras.optimizers import RMSprop

train = ImageDataGenerator(rescale=1/255)
validation = ImageDataGenerator(rescale=1/255)
train_data = train.flow_from_directory('./train', target_size=(200, 200), batch_size=3, class_mode='binary')
validation_data = validation.flow_from_directory('./testchickens', target_size=(200, 200), batch_size=3, class_mode='binary')
model = tf.keras.models.Sequential([
    tf.keras.layers.Conv2D(16, (3, 3), activation='relu', input_shape=(200, 200, 3)),
    tf.keras.layers.MaxPool2D(2, 2),
    tf.keras.layers.Conv2D(32, (3, 3), activation='relu'),
    tf.keras.layers.MaxPool2D(2, 2),
    tf.keras.layers.Conv2D(64, (3, 3), activation='relu'),
    tf.keras.layers.MaxPool2D(2, 2),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(512, activation='relu'),
    tf.keras.layers.Dense(1, activation='sigmoid')
])
model.compile(loss='binary_crossentropy', optimizer=RMSprop(learning_rate=0.001), metrics=['accuracy'])
model_fit = model.fit(train_data, steps_per_epoch=3, epochs=10, validation_data = validation_data)
The highest score I got was 88.89% with my model. While coding it, I realized that I needed three separate data sets or folders for my images (training, validation, test).
To test my model I downloaded a few pictures off Google that weren't in either my training or validation folders, but I am not sure how to test my model on these sample pictures. I figured I could convert each image to an array, but the output feels off.
import glob
import numpy as np
from tensorflow.keras.preprocessing import image

sample = glob.glob('./sample/**/*.jpg', recursive=True)
x = sample[0]
img1 = image.load_img(x, target_size=(200, 200, 3))
X = image.img_to_array(img1)
X = np.expand_dims(X, axis=0)
image = np.vstack([X])
model.predict(image)
For some reason my model classifies every single picture as a chicken, even though my sample/test folder clearly contains pictures of trees and forests. Before the last block of code I ran into a lot of issues with my input not matching the expected dimensions/shapes/layers, so I think that if I understood how the structure of my model works I might be able to fix it. If someone could explain what my issue is, or point me to a resource that explains the layering/arrays of Keras models, I would greatly appreciate it.
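For illustration, a minimal sketch of predicting on one downloaded image while applying the same 1/255 rescaling the training generators used (the file path is a placeholder, and the 0/1 class mapping depends on your folders, so check train_data.class_indices):
import numpy as np
from tensorflow.keras.preprocessing import image

img = image.load_img('./sample/example.jpg', target_size=(200, 200))  # placeholder path
x = image.img_to_array(img) / 255.0        # match the rescale=1/255 used by the generators
x = np.expand_dims(x, axis=0)              # shape (1, 200, 200, 3)
prob = model.predict(x)[0][0]              # sigmoid output between 0 and 1
print(train_data.class_indices)            # which label maps to 0 and which to 1
print('predicted class:', int(prob > 0.5))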

My CNN accuracy goes down after adding one more feature

So I made a CNN that classifies two types of birds, and it worked fine. After that, I tried adding one more type, but I got weird results. I already posted this on the AI Stack Exchange, but they said it's better to ask it here, so I am providing a link to that post.
https://ai.stackexchange.com/q/11444/23452
Here is the model code:
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Dropout, Activation, Flatten, Conv2D, MaxPooling2D
from tensorflow.keras.callbacks import TensorBoard
import pickle
import time as time
gpu_options = tf.GPUOptions(per_process_gpu_memory_fraction = 0.333)
sess = tf.Session(config=tf.ConfigProto(gpu_options=gpu_options))
pickle_in = open("C:/Users/Recep/Desktop/programlar/python/X.pickle","rb")
X = pickle.load(pickle_in)
pickle_in = open("C:/Users/Recep/Desktop/programlar/python/Y.pickle","rb")
Y = pickle.load(pickle_in)
X = X/255.0
node_size = 64
model_name = "agi_vs_golden-{}".format(time.time())
tensorboard = TensorBoard(log_dir='C:/Users/Recep/Desktop/programlar/python/logs/{}'.format(model_name))
file_writer = tf.summary.FileWriter('C:/Users/Recep/Desktop/programlar/python/logs/{}'.format(model_name), sess.graph)
model = Sequential()
model.add(Conv2D(node_size,(3,3),input_shape = X.shape[1:]))
#idk what that shape does except that and validation i have no problem
model.add(Activation("relu"))
model.add(MaxPooling2D(pool_size=(2,2)))
model.add(Conv2D(node_size,(3,3)))
model.add(Activation("relu"))
model.add(MaxPooling2D(pool_size=(2,2)))
model.add(Flatten())
model.add(Dense(node_size))
model.add(Activation("relu"))
model.add(Dense(1))
model.add(Activation("sigmoid"))
model.compile(loss="binary_crossentropy",optimizer="adam",metrics=["accuracy"])
model.fit(X,Y,batch_size=25,epochs=8,validation_split=0.1,callbacks=[tensorboard])
# idk what the validation is and how its used but dont think it caused the problem
model.save("agi_vs_gouldian.model")
By the way, as I said in the comments of my original post, I think maybe the network isn't trained enough, or I don't have enough data, so I tried increasing the number of epochs. That more or less fixed the problem, but the part I'm curious about is what was happening with the lower epoch count.
Can anyone help me?
I am giving the TensorBoard graphs below.
BTW, is my data array RGB?
And how can I get rid of this local maximum of 70%?
And since I'm a beginner to this, I don't really know how validation works, but I saw that the validation graphs stayed the same in the first training run that I had issues with.
You are trying to classify three bird types with a sigmoid output. Sigmoid is good for binary classification; try a softmax activation layer instead and see how it goes. I suggest replacing
model.add(Dense(1))
model.add(Activation("sigmoid"))
with
model.add(Dense(3, activation='softmax'))
where 3 is the number of bird types you want to classify.
Have a look at this very good tutorial on using softmax as the output activation for multi-class classification:
https://machinelearningmastery.com/multi-class-classification-tutorial-keras-deep-learning-library/
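Note that switching the output layer to softmax usually also means changing the loss and the label encoding; a minimal sketch, assuming Y contains integer class labels 0, 1, 2:
model.add(Dense(3, activation='softmax'))
# sparse_categorical_crossentropy works with integer labels;
# alternatively, one-hot encode Y and use categorical_crossentropy
model.compile(loss='sparse_categorical_crossentropy', optimizer='adam', metrics=['accuracy'])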

Proper way to save Transfer Learning model in Keras

I have trained a convolutional net using transfer learning from ResNet50 in Keras, as given below.
base_model = applications.ResNet50(weights='imagenet', include_top=False, input_shape=(333, 333, 3))
## set model architecture
x = base_model.output
x = GlobalAveragePooling2D()(x)
x = Dense(1024, activation='relu')(x)
x = Dense(256, activation='relu')(x)
predictions = Dense(y_train.shape[1], activation='softmax')(x)
model = Model(input=base_model.input, output=predictions)
model.compile(loss='categorical_crossentropy', optimizer=optimizers.SGD(lr=1e-4, momentum=0.9),
              metrics=['accuracy'])
model.summary()
After training the model as shown below, I want to save it.
history = model.fit_generator(
    train_datagen.flow(x_train, y_train, batch_size=batch_size),
    steps_per_epoch=600,
    epochs=epochs,
    callbacks=callbacks_list
)
I can't use the save_model() function from keras.models, as the model is of type Model here. I used the save() function to save the model, but later when I loaded it and validated it, it behaved like an untrained model. I think the weights were not saved. What went wrong, and how do I save this model properly?
As per the official Keras docs:
If you only need to save the architecture of a model, you can use
model_json = model.to_json()
with open("model_arch.json", "w") as json_file:
    json_file.write(model_json)
To save the weights:
model.save_weights("my_model_weights.h5")
You can later load the JSON file and rebuild the architecture:
from keras.models import model_from_json
with open("model_arch.json", "r") as json_file:
    model = model_from_json(json_file.read())
And similarly, for the weights you can use
model.load_weights('my_model_weights.h5')
I am using the same approach and this works perfectly well.
I don't know what happens with my models, but I've never been able to use save_model() and load_model(); there is always an error associated with them. But these functions do exist.
What I usually do is save and load the weights. That's enough for using the model, but it may cause a little problem for further training, as the optimizer state is not saved; it was never a big problem for me, though, since a new optimizer soon finds its way.
model.save_weights(fileName)
model.load_weights(fileName)
Another option is using numpy for saving - this one has never failed for me:
np.save(fileName,model.get_weights())
model.set_weights(np.load(fileName))
For this to work, just create your model again (keep the code you use to create it) and set its weights.
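For reference, a rough sketch of that numpy round trip (file name is a placeholder; note that recent numpy versions may require allow_pickle=True when loading, because get_weights() returns a list of arrays of different shapes):
import numpy as np

np.save('my_weights.npy', model.get_weights())          # stored as an object array of per-layer arrays
# ... later: recreate the model with the same code, then restore the weights
weights = np.load('my_weights.npy', allow_pickle=True)  # allow_pickle may be required on recent numpy
model.set_weights(list(weights))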

Extracting weights values from a tensorflow model checkpoint

I am training a model in TensorFlow and I am saving checkpoints for my model. In the checkpoints directory I have four files, namely:
checkpoint
model.cpkt-0.data-00000-of-00001
model.cpkt-0.index
model.cpkt-0.meta
Now I want to extract the weight values for each layer in my graph. How can I do that?
I tried this:
import tensorflow as tf
sess = tf.InteractiveSession()
saver = tf.train.import_meta_graph('model.cpkt-0.meta')
w = saver.restore(sess, 'model.cpkt-0.data-00000-of-00001')
But I am getting the following error:
Unable to open table file ./model.cpkt-0.data-00000-of-00001: Data loss: not an sstable (bad magic number): perhaps your file is in a different file format and you need to use a different restore operator?
You are restoring in the wrong way; pass the checkpoint prefix rather than the .data file:
saver.restore(sess, 'model.cpkt-0')
# get the graph
g = tf.get_default_graph()
w1 = g.get_tensor_by_name('some_variable_name as per your definition in the model')
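If you don't know the variable names, you can also list everything stored in the checkpoint directly; a minimal sketch using the checkpoint prefix from the question (the variable name is still a placeholder):
import tensorflow as tf

# print the name and shape of every variable saved in the checkpoint
for name, shape in tf.train.list_variables('model.cpkt-0'):
    print(name, shape)

# load one variable's value as a numpy array (use a name printed above)
value = tf.train.load_variable('model.cpkt-0', 'some_variable_name')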

Tensorflow: save the model with smallest validation error

I ran a training job with TensorFlow and got the following curve for the loss on the validation set. The net starts to overfit after the 6000th iteration, so I'd like to get the model from before it starts overfitting.
My training code is something like below:
train_step = ......
summary = tf.scalar_summary(l1_loss.op.name, l1_loss)
summary_writer = tf.train.SummaryWriter("checkpoint", sess.graph)
saver = tf.train.Saver()
for i in xrange(20000):
    batch = get_next_batch(batch_size)
    sess.run(train_step, feed_dict={x: batch.x, y: batch.y})
    if (i+1) % 100 == 0:
        saver.save(sess, "checkpoint/net", global_step=i+1)
        summary_str = sess.run(summary, feed_dict=validation_feed_dict)
        summary_writer.add_summary(summary_str, i+1)
        summary_writer.flush()
After training finishes, only five checkpoints are kept (19600, 19700, 19800, 19900, 20000). Is there any way to make TensorFlow save checkpoints according to the validation error?
P.S. I know that tf.train.Saver has a max_to_keep argument, which in principle could save all the checkpoints, but that's not what I want (unless it's the only option). I want the saver to keep the checkpoint with the smallest validation loss so far. Is that possible?
You need to calculate the classification accuracy on the validation set, keep track of the best one seen so far, and only write a checkpoint when the validation accuracy improves.
If the data set and/or model is large, you may have to split the validation set into batches to fit the computation in memory.
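A minimal sketch of that pattern, reusing the saver, l1_loss, validation_feed_dict, and training loop from the question (here tracking the validation loss rather than accuracy):
best_val_loss = float('inf')
for i in xrange(20000):
    batch = get_next_batch(batch_size)
    sess.run(train_step, feed_dict={x: batch.x, y: batch.y})
    if (i+1) % 100 == 0:
        val_loss = sess.run(l1_loss, feed_dict=validation_feed_dict)
        if val_loss < best_val_loss:
            best_val_loss = val_loss
            # only write a checkpoint when the validation loss improves
            saver.save(sess, "checkpoint/best_net", global_step=i+1)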
This tutorial shows exactly how to do what you want:
https://github.com/Hvass-Labs/TensorFlow-Tutorials/blob/master/04_Save_Restore.ipynb
It is also available as a short video:
https://www.youtube.com/watch?v=Lx8JUJROkh0
This can be done with checkpoints. In tensorflow 1:
# you should import other functions/libs as needed to build the model
from keras.callbacks.callbacks import ModelCheckpoint
# add checkpoint to save model with lowest val loss
filepath = 'tf1_mnist_cnn.hdf5'
save_checkpoint = ModelCheckpoint(filepath, monitor='val_loss', verbose=1,
                                  save_best_only=True, save_weights_only=False,
                                  mode='auto', period=1)
model.fit(x_train, y_train,
          batch_size=batch_size,
          epochs=epochs,
          verbose=1,
          validation_data=(x_test, y_test),
          callbacks=[save_checkpoint])
Tensorflow 2:
# import other libs as needed for building model
from tensorflow.keras.callbacks import ModelCheckpoint
# add a checkpoint to save the lowest validation loss
filepath = 'tf2_mnist_model.hdf5'
checkpoint = ModelCheckpoint(filepath, monitor='val_loss', verbose=1,
                             save_best_only=True, save_weights_only=False,
                             mode='auto', save_freq='epoch')
model.fit(x_train, y_train,
          batch_size=batch_size,
          epochs=epochs,
          verbose=1,
          validation_data=(x_test, y_test),
          callbacks=[checkpoint])
Complete demo files are here: https://github.com/nateGeorge/slurm_gpu_ubuntu/tree/master/demo_files.
In your session.run you'll need to explicitly ask for the loss. Then keep a list of your recent eval losses and only create a checkpoint if the current eval loss is smaller than, e.g., the last two saved losses.
