There are two Python files. The first one saves a TensorFlow
model. The second one restores the saved model.
Question:
When I run the two files one after another, it works.
When I run the first one, restart the editor, and run the second one, it
tells me that w1 is not defined.
What I want to do is:
Save a tensorflow model
Restore the saved model
What's wrong with it? Thanks for your kind help.
model_save.py
import tensorflow as tf
w1 = tf.Variable(tf.random_normal(shape=[2]), name='w1')
w2 = tf.Variable(tf.random_normal(shape=[5]), name='w2')
saver = tf.train.Saver()
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    saver.save(sess, 'SR\\my-model')
model_restore.py
import tensorflow as tf
with tf.Session() as sess:
    saver = tf.train.import_meta_graph('SR\\my-model.meta')
    saver.restore(sess,'SR\\my-model')
    print (sess.run(w1))
Briefly, you should use
print (sess.run(tf.get_default_graph().get_tensor_by_name('w1:0')))
instead of print (sess.run(w1)) in your model_restore.py file.
model_save.py
import tensorflow as tf
w1_node = tf.Variable(tf.random_normal(shape=[2]), name='w1')
w2_node = tf.Variable(tf.random_normal(shape=[5]), name='w2')
saver = tf.train.Saver()
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    print(w1_node.eval()) # [ 0.43350926 1.02784836]
    #print(w1.eval()) # NameError: name 'w1' is not defined
    saver.save(sess, 'my-model')
w1_node is a Python name that exists only in model_save.py, so model_restore.py cannot see it. What the checkpoint stores is the graph name given by name='w1', not the Python variable name.
To fetch a tensor by its graph name, use get_tensor_by_name, as the post "Tensorflow: How to get a tensor by name?" suggests.
model_restore.py
import tensorflow as tf
with tf.Session() as sess:
    saver = tf.train.import_meta_graph('my-model.meta')
    saver.restore(sess,'my-model')
    print (sess.run(tf.get_default_graph().get_tensor_by_name('w1:0')))
    # [ 0.43350926 1.02784836]
    print(tf.global_variables()) # print tensor variables
    # [<tf.Variable 'w1:0' shape=(2,) dtype=float32_ref>,
    # <tf.Variable 'w2:0' shape=(5,) dtype=float32_ref>]
    for op in tf.get_default_graph().get_operations():
        print(op.name) # print the name of every operation node
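The round trip can also be checked in a single script. This is only a minimal sketch, not the asker's code: it assumes a TensorFlow build that ships tf.compat.v1 (the tf1 alias is introduced here), writes to a temporary directory instead of SR\\my-model, and restores into a separate tf.Graph to imitate a fresh process. It looks w1 up by its graph name through global_variables() rather than get_tensor_by_name, a swapped-in variant chosen because on TensorFlow 2.x 'w1:0' may refer to a resource handle rather than the value.

```python
import os
import tempfile

import numpy as np
import tensorflow as tf

tf1 = tf.compat.v1  # assumption: a TensorFlow build that ships tf.compat.v1


def save_model(prefix):
    """Build a two-variable graph, save it, and return the value of w1."""
    with tf.Graph().as_default():
        w1 = tf1.Variable(tf1.random_normal(shape=[2]), name="w1")
        tf1.Variable(tf1.random_normal(shape=[5]), name="w2")
        saver = tf1.train.Saver()
        with tf1.Session() as sess:
            sess.run(tf1.global_variables_initializer())
            saver.save(sess, prefix)
            return sess.run(w1)


def restore_w1(prefix):
    """Restore into a fresh graph, where no Python name `w1` exists."""
    with tf.Graph().as_default():
        with tf1.Session() as sess:
            saver = tf1.train.import_meta_graph(prefix + ".meta")
            saver.restore(sess, prefix)
            # Look the variable up by its graph name, not by a Python name.
            by_name = {v.name: v for v in tf1.global_variables()}
            return sess.run(by_name["w1:0"])


prefix = os.path.join(tempfile.mkdtemp(), "my-model")  # hypothetical location
saved = save_model(prefix)
restored = restore_w1(prefix)
print(np.allclose(saved, restored))
```

The point is the same as in the answer: after restoring, the only handle on w1 is its name in the graph.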
I'm running this code:
model = CIFAR10Classifier()
trainer = pl.Trainer(max_epochs=50, gpus=1, default_root_dir="..", enable_checkpointing=False)
# trainer.fit(model, train_dataloader, valid_dataloader)
model = CIFAR10Classifier.load_from_checkpoint("../lightning_logs/cifar10_classifier/checkpoints/epoch=49-step=35150.ckpt")
model.eval()
# preds = trainer.predict(model, dataloaders=test_dataloader, return_predictions=True)
p = trainer.test(model, dataloaders=test_dataloader)
print(p)
When I run trainer.test, it creates additional version_x folders inside the lightning_logs folder, which I don't want. Can I reuse the existing ones in any way? If not, is there a way to stop it from creating them?
Also, when I'm experimenting with the training loop, I don't want to save any checkpoints. Is there a workaround for that too?
If enable_checkpointing=False does not help, try also setting logger=False:
trainer = Trainer(enable_checkpointing=False, logger=False)
You can disable checkpointing with the Trainer option enable_checkpointing:
trainer = Trainer(enable_checkpointing=False)
Or load a previously saved checkpoint for inference with load_from_checkpoint:
model = MyLightningModule.load_from_checkpoint("/path/to/checkpoint.ckpt")
# disable randomness, dropout, etc...
model.eval()
# predict with the model
y_hat = model(x)
I have a frozen inference graph stored in a .pb file, which was obtained from a trained Tensorflow model by the freeze_graph function.
Suppose, for simplicity, that I would like to change some of the sigmoid activations in the model to tanh activations (and let's not discuss whether this is a good idea).
How can this be done with access only to the frozen graph in the .pb file, and without the possibility to retrain the model?
I am aware of the Graph Editor library in tf.contrib, which should be able to do this kind of job, but I wasn't able to figure out a simple way to do this in the documentation.
The solution is to use import_graph_def:
import tensorflow as tf
sess = tf.Session()
def load_graph(frozen_graph_filename):
    # Parse the serialized GraphDef from the frozen .pb file
    with tf.gfile.GFile(frozen_graph_filename, "rb") as f:
        graph_def = tf.GraphDef()
        graph_def.ParseFromString(f.read())
    # Import it into a fresh graph, keeping the original node names
    with tf.Graph().as_default() as graph:
        tf.import_graph_def(graph_def, name='')
    return graph
graph_model = load_graph("frozen_inference_graph.pb")
graph_model_def = graph_model.as_graph_def()
graph_new = tf.Graph()
# as_default() only takes effect when used as a context manager
with graph_new.as_default():
    my_new_tensor = ...  # whatever tensor should replace the original; build it here
    tf.import_graph_def(graph_model_def, name='', input_map={"tensor_to_replace": my_new_tensor})
    # do something with your new graph
Here I wrote a post about it
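To make the input_map idea concrete, here is a small self-contained sketch. It is not the answerer's code: it assumes a TensorFlow build that ships tf.compat.v1, and it builds a toy graph in place of frozen_inference_graph.pb. The node names x, pre, act, out, new_x and new_pre are made up for the illustration.

```python
import math

import tensorflow as tf

tf1 = tf.compat.v1  # assumption: a TensorFlow build that ships tf.compat.v1

# Toy stand-in for the frozen graph: out = identity(sigmoid(2 * x)).
with tf.Graph().as_default() as original:
    x = tf1.placeholder(tf.float32, shape=[], name="x")
    pre = tf.multiply(x, 2.0, name="pre")
    act = tf.sigmoid(pre, name="act")
    tf.identity(act, name="out")
graph_def = original.as_graph_def()

with tf.Graph().as_default():
    # Build the replacement tensor first, in the graph we import into.
    new_x = tf1.placeholder(tf.float32, shape=[], name="new_x")
    new_act = tf.tanh(tf.multiply(new_x, 2.0, name="new_pre"))
    # Every consumer of "act:0" in the imported graph now reads new_act.
    (new_out,) = tf.import_graph_def(
        graph_def,
        name="imported",
        input_map={"act:0": new_act},
        return_elements=["out:0"],
    )
    with tf1.Session() as sess:
        value = sess.run(new_out, feed_dict={new_x: 0.5})

print(value, math.tanh(1.0))
```

Note that the replacement tensor has to exist before the import, so it cannot be built from tensors of the graph that is being imported in the same call.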
Can you try this:
graph = load_graph(filename)
graph_def = graph.as_graph_def()
# if the op to replace is at node 161
graph_def.node[161].op="Tanh"  # op type names are case-sensitive
tf.train.write_graph(graph_def, path2savfrozn, "altered_frozen.pb", False)
Please let me know if it works.
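If the *.pb file contains a SavedModel protocol buffer, you should be able to load it using a SavedModel loader. You can also inspect it with the SavedModel CLI. The full documentation on SavedModels is here. Note that a graph produced by freeze_graph is a serialized GraphDef rather than a SavedModel, so this only applies to files exported as a SavedModel.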
Something along these lines should work:
import tensorflow as tf
graph_def = tf.GraphDef()
with open('frozen_inference.pb', 'rb') as f:
    graph_def.ParseFromString(f.read())
with tf.Graph().as_default() as graph:
    tf.import_graph_def(graph_def, name='')
new_model = tf.GraphDef()
with tf.Session(graph=graph) as sess:
    for n in sess.graph_def.node:
        if n.op == 'Sigmoid':
            # Rebuild the node as Tanh, keeping its name and inputs
            nn = new_model.node.add()
            nn.op = 'Tanh'
            nn.name = n.name
            for i in n.input:
                nn.input.extend([i])
            # Tanh needs the same dtype attribute ("T") that Sigmoid carried
            nn.attr['T'].CopyFrom(n.attr['T'])
        else:
            # Every other node is copied unchanged
            nn = new_model.node.add()
            nn.CopyFrom(n)
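One way to convince yourself that such a rewrite does what you expect is to try it on a tiny graph first. The sketch below assumes a TensorFlow build that ships tf.compat.v1 and uses a made-up two-node graph (x and act) in place of the frozen model. It copies the whole GraphDef and flips the op type in place, which keeps every attribute of the node.

```python
import math

import tensorflow as tf

tf1 = tf.compat.v1  # assumption: a TensorFlow build that ships tf.compat.v1

# Toy stand-in for the frozen graph: act = sigmoid(x).
with tf.Graph().as_default() as original:
    x = tf1.placeholder(tf.float32, shape=[], name="x")
    tf.sigmoid(x, name="act")

# Copy the GraphDef, then switch every Sigmoid node to Tanh in the copy.
new_def = tf1.GraphDef()
new_def.CopyFrom(original.as_graph_def())
for node in new_def.node:
    if node.op == "Sigmoid":
        node.op = "Tanh"  # same inputs and attributes, different op type

# Import the edited definition and evaluate it.
with tf.Graph().as_default():
    (act,) = tf.import_graph_def(new_def, name="", return_elements=["act:0"])
    with tf1.Session() as sess:
        value = sess.run(act, feed_dict={"x:0": 1.0})

print(value, math.tanh(1.0))
```

This works here because Sigmoid and Tanh take the same inputs and attributes. Swapping between ops with different signatures would need more care.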
Whenever I try to use tf.reset_default_graph(), I get this error: IndexError: list index out of range. In which part of my code should I call it? When should I be using it?
Edit:
I updated the code, but the error still occurs.
def evaluate():
    with tf.name_scope("loss"):
        global x # x is a tf.placeholder()
        xentropy = tf.nn.softmax_cross_entropy_with_logits(labels=y, logits=neural_network(x))
        loss = tf.reduce_mean(xentropy, name="loss")
    with tf.name_scope("train"):
        optimizer = tf.train.AdamOptimizer()
        training_op = optimizer.minimize(loss)
    with tf.name_scope("exec"):
        with tf.Session() as sess:
            for i in range(1, 2):
                sess.run(tf.global_variables_initializer())
                sess.run(training_op, feed_dict={x: np.array(train_data).reshape([-1, 1]), y: label})
                print("Training " + str(i))
            saver = tf.train.Saver()
            saver.save(sess, "saved_models/testing")
            print("Model Saved.")
def predict():
    with tf.name_scope("predict"):
        tf.reset_default_graph()
        with tf.Session() as sess:
            saver = tf.train.import_meta_graph("saved_models/testing.meta")
            saver.restore(sess, "saved_models/testing")
            output_ = tf.get_default_graph().get_tensor_by_name('output_layer:0')
            print(sess.run(output_, feed_dict={x: np.array([12003]).reshape([-1, 1])}))
def main():
    print("Starting Program...")
    evaluate()
    writer = tf.summary.FileWriter("mygraph/logs", tf.get_default_graph())
    predict()
If I remove the tf.reset_default_graph() from the updated code, I get this error: ValueError: cannot add op with name hidden_layer1/kernel/Adam as that name is already used
From my current understanding, tf.reset_default_graph() removes all graphs, which is how I avoided the error mentioned above (ValueError: cannot add op with name hidden_layer1/kernel/Adam as that name is already used).
This is probably how you use it:
import tensorflow as tf
a = tf.constant(1)
with tf.Session() as sess:
    tf.reset_default_graph()
You get an error because you use it in a session. From the tf.reset_default_graph() documentation:
Calling this function while a tf.Session or tf.InteractiveSession is
active will result in undefined behavior. Using any previously created
tf.Operation or tf.Tensor objects after calling this function will
result in undefined behavior
tf.reset_default_graph() can be helpful (at least for me) during the testing phase while I experiment in jupyter notebook. However, I have never used it in production and do not see how it would be helpful there.
Here is an example that could be in a notebook:
import tensorflow as tf
# create some graph
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    print(sess.run(...))
Now I do not need this stuff anymore, but if I create another graph and visualize it in tensorboard I will see old nodes and the new nodes. To solve this, I could restart the kernel and run only the next cell. However, I can just do:
tf.reset_default_graph()
# create a new graph
with tf.Session() as sess:
    print(sess.run(...))
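The accumulation of old nodes can be seen by counting operations in the default graph. The sketch below is an illustration under assumptions, not part of the original answer: it uses the tf.compat.v1 API (mentioned further down for TensorFlow 2.0) and two helper functions, add_nodes and node_count, that are invented here.

```python
import tensorflow as tf

tf1 = tf.compat.v1  # assumption: TF 1.15 or 2.x, where tf.compat.v1 exists
tf1.disable_eager_execution()  # only matters on TF 2.x


def add_nodes():
    """Add three constant ops to the current default graph."""
    for i in range(3):
        tf.constant(i)


def node_count():
    """Number of operations in the current default graph."""
    return len(tf1.get_default_graph().get_operations())


add_nodes()
first = node_count()   # nodes from the first "cell"
add_nodes()
second = node_count()  # old nodes are still there, plus the new ones

# Called at the top level: no session is active and no scope is open.
tf1.reset_default_graph()
add_nodes()
third = node_count()   # only the nodes created after the reset

print(first, second, third)
```

The reset is issued between graph constructions, never from inside a with tf.Session() or with tf.name_scope() block.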
Edit after OP added his code:
with tf.name_scope("predict"):
    tf.reset_default_graph()
Here is roughly what happens. Your code fails because tf.name_scope has already added something to the graph. While still inside that scope, you tell TF to remove the graph completely, but it can't, because the scope is still open on the old graph. Call tf.reset_default_graph() before opening the name scope instead.
For some reason, I need to build a new graph many times, and I have just tested the following, which works. Many thanks to Salvador Dali for his answer :-)
import tensorflow as tf
from my_models import Classifier
for i in range(10):
    tf.reset_default_graph()
    # build the graph
    global_step = tf.get_variable('global_step', [], initializer=tf.constant_initializer(0), trainable=False)
    classifier = Classifier(global_step)
    with tf.Session() as sess:
        # tf.initialize_all_variables() is deprecated in favor of this call
        sess.run(tf.global_variables_initializer())
        print("do sth here.")
With TensorFlow 2.0 out, it is now better to use tf.compat.v1.reset_default_graph() to avoid the deprecation warning. Link to the documentation: https://www.tensorflow.org/api_docs/python/tf/compat/v1/reset_default_graph
Simply put,
use it to clear the placeholders (and other nodes) that you created for earlier sess.run() calls.
I am using the TensorFlow for Poets code lab to guide me as I retrain the Inceptionv3 CNN to classify a list of images. I have successfully trained the model, and it works when I use the given code to classify individual images. But when I try to use it on a large batch of images, I get the error "GraphDef cannot be larger than 2GB". Please advise.
import pandas as pd
import os, sys
import tensorflow as tf
test_images = pd.read_csv('test_images.csv')
testid = test_images['Id']
listx= list(range(4320))
predlist=[]
output = pd.DataFrame({'Id': listx})
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2'
for x in listx:
    path = 'test/'+str(x+1)+'.jpg'
    # change this as you see fit
    image_path = path
    # Read in the image_data
    image_data = tf.gfile.FastGFile(image_path, 'rb').read()
    # Loads label file, strips off carriage return
    label_lines = [line.rstrip() for line
                   in tf.gfile.GFile("retrained_labels.txt")]
    # Unpersists graph from file
    with tf.gfile.FastGFile("retrained_graph.pb", 'rb') as f:
        graph_def = tf.GraphDef()
        graph_def.ParseFromString(f.read())
        tf.import_graph_def(graph_def, name='')
    with tf.Session() as sess:
        # Feed the image_data as input to the graph and get first prediction
        with tf.Graph().as_default():
            softmax_tensor = sess.graph.get_tensor_by_name('final_result:0')
            predictions = sess.run(softmax_tensor, \
                                   {'DecodeJpeg/contents:0': image_data})
            # Sort to show labels of first prediction in order of confidence
            top_k = predictions[0].argsort()[-len(predictions[0]):][::-1]
            # print('the top result is' + label_lines[node_id])
            flag = 0
            for node_id in top_k:
                while flag == 0:
                    human_string = label_lines[node_id]
                    score = predictions[0][node_id]
                    predlist.append(int(human_string[:3]))
                    print('%s' % (human_string))
                    flag = 1 # we only want the top prediction
output['Prediction']=predlist
output.to_csv('outputtest.csv')
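One way to solve this error is to place
with tf.Graph().as_default():
directly inside the for loop, so that each iteration imports the model into a fresh graph instead of growing the same default graph.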
This is the piece of code that worked for me while trying to read bulk image:
for filename in os.listdir(image_path):
    with tf.Graph().as_default():
        # Read in the image_data
        image_data = tf.gfile.FastGFile(image_path + filename, 'rb').read()
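A different way to avoid the growth, sketched here as an alternative rather than as what the answer above does, is to import the model once and reuse one session for every input. The sketch assumes a TensorFlow build that ships tf.compat.v1 and uses a toy graph with made-up node names x and out in place of retrained_graph.pb.

```python
import tensorflow as tf

tf1 = tf.compat.v1  # assumption: a TensorFlow build that ships tf.compat.v1

# Toy stand-in for the retrained model: out = 2 * x.
with tf.Graph().as_default() as toy:
    x = tf1.placeholder(tf.float32, shape=[], name="x")
    tf.multiply(x, 2.0, name="out")
graph_def = toy.as_graph_def()

with tf.Graph().as_default() as graph:
    tf.import_graph_def(graph_def, name="")  # imported exactly once
    ops_before = len(graph.get_operations())
    with tf1.Session() as sess:
        # Only the fed value changes per input; the graph is left untouched.
        results = [
            float(sess.run("out:0", feed_dict={"x:0": value}))
            for value in (1.0, 2.0, 3.0)
        ]
    ops_after = len(graph.get_operations())

print(results, ops_before, ops_after)
```

Because nothing is added to the graph inside the loop, its size stays constant no matter how many images are classified, and the model file is parsed only once.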
I just edited https://github.com/tensorflow/tensorflow/blob/r0.10/tensorflow/examples/tutorials/mnist/mnist_softmax.py to enable logging by using a validation monitor:
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
# Import data
from tensorflow.examples.tutorials.mnist import input_data
import tensorflow as tf
flags = tf.app.flags
FLAGS = flags.FLAGS
flags.DEFINE_string('data_dir', '/tmp/data/', 'Directory for storing data')
mnist = input_data.read_data_sets(FLAGS.data_dir, one_hot=True)
sess = tf.InteractiveSession()
# Create the model
x = tf.placeholder(tf.float32, [None, 784])
W = tf.Variable(tf.zeros([784, 10]))
b = tf.Variable(tf.zeros([10]))
y = tf.nn.softmax(tf.matmul(x, W) + b)
validation_metrics = {"accuracy": tf.contrib.metrics.streaming_accuracy,
                      "precision": tf.contrib.metrics.streaming_precision,
                      "recall": tf.contrib.metrics.streaming_recall}
validation_monitor = tf.contrib.learn.monitors.ValidationMonitor(
    mnist.test.images,
    mnist.test.labels,
    every_n_steps=50, metrics=validation_metrics,
    early_stopping_metric="loss",
    early_stopping_metric_minimize=True,
    early_stopping_rounds=200)
# Define loss and optimizer
y_ = tf.placeholder(tf.float32, [None, 10])
cross_entropy = tf.reduce_mean(-tf.reduce_sum(y_ * tf.log(y), reduction_indices=[1]))
train_step = tf.train.GradientDescentOptimizer(0.5).minimize(cross_entropy)
# Train
tf.initialize_all_variables().run()
for i in range(1000):
    batch_xs, batch_ys = mnist.train.next_batch(100)
    train_step.run({x: batch_xs, y_: batch_ys})
# Test trained model
correct_prediction = tf.equal(tf.argmax(y, 1), tf.argmax(y_, 1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
print(accuracy.eval({x: mnist.test.images, y_: mnist.test.labels}))
But I am confused about how to use validation_monitor in this program. With DNNClassifier, I have learned that validation_monitor is used in the following way:
# Fit model.
classifier.fit(x=training_set.data,
               y=training_set.target,
               steps=2000, monitors=[validation_monitor])
So, how can I use validation_monitor with the softmax classifier?
I don't think there's an easy way to do that, since ValidationMonitor is part of tf.contrib, i.e. contributed code that is not supported by the TensorFlow team. So unless you are using a higher-level API from tf.contrib (like DNNClassifier), you might not be able to simply pass a ValidationMonitor instance to an optimizer's minimize method.
I believe your options are:
Check how DNNClassifier's fit method is implemented and use the same approach by manually handling a ValidationMonitor instance in your graph and session.
Implement your own validation routine for logging and/or early stopping or whatever it is you intend to use ValidationMonitor for.
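For the second option, a hand-rolled early-stopping check can be quite small. The sketch below is plain Python and does not use ValidationMonitor at all. The EarlyStopper class, its rounds and minimize parameters, and the loss values are all made up for the illustration, loosely mirroring early_stopping_rounds and early_stopping_metric_minimize from the question.

```python
class EarlyStopper:
    """Tracks a validation metric and signals when it has stopped improving."""

    def __init__(self, rounds, minimize=True):
        self.rounds = rounds        # steps to wait without improvement
        self.minimize = minimize    # True for a loss, False for e.g. accuracy
        self.best_value = None
        self.best_step = None

    def should_stop(self, step, value):
        """Record the metric at `step`; return True if training should stop."""
        if self.best_value is None:
            improved = True
        elif self.minimize:
            improved = value < self.best_value
        else:
            improved = value > self.best_value
        if improved:
            self.best_value, self.best_step = value, step
            return False
        return step - self.best_step >= self.rounds


stopper = EarlyStopper(rounds=100)
validation_losses = [0.9, 0.7, 0.8, 0.75, 0.72]  # made-up numbers
stopped_at = None
for i, loss in enumerate(validation_losses, start=1):
    step = i * 50  # pretend validation runs every 50 training steps
    print("step %d: validation loss %.2f" % (step, loss))
    if stopper.should_stop(step, loss):
        stopped_at = step
        break
```

In the MNIST script, the loop body would be the place to evaluate cross_entropy on the validation set every 50 iterations and pass the result to should_stop.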