TensorFlow: tf.trainable_variables() does not show my model weights

I am trying to extract my model weights so that I can run my first pre-trained model. However, I am unable to extract the weights, since executing tf.trainable_variables() gives me the following output:
[<tf.Variable 'VGGNet/B1C1/kernel:0' shape=(3, 1, 3, 32) dtype=float32_ref>, <tf.Variable 'VGGNet/B1C1/bias:0' shape=(32,) dtype=float32_ref>, <tf.Variable 'VGGNet/B1C2/kernel:0' shape=(1, 3, 32, 32)....
It shows the shape, but not the numpy array that I am expecting. What am I missing?

You should not need to extract the weights by hand to reuse your pre-trained model; instead, use the model.save_weights() and model.load_weights() utility methods provided by the API. The "Save and load models" guide in the documentation covers this in more detail.
As for why you are not seeing the weights: tf.trainable_variables() returns the variable objects, not their current values (the weights themselves).
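To make the distinction concrete, here is a minimal sketch. It assumes TensorFlow 2.x with eager execution and uses a hypothetical one-layer toy model (the layer sizes are illustrative and not taken from the question). Each variable is a handle carrying shape and dtype metadata; calling .numpy() on it reads the current value as a NumPy array. In TF 1.x graph mode, the rough equivalent would be sess.run(tf.trainable_variables()) after the variables have been initialized.

```python
import numpy as np
import tensorflow as tf

# Hypothetical toy model, used only for illustration.
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(4,)),
    tf.keras.layers.Dense(3),
])

# trainable_variables holds variable objects (handles), not arrays ...
variables = model.trainable_variables

# ... while .numpy() reads each variable's current value as a NumPy array.
values = [v.numpy() for v in variables]

for value in values:
    print(type(value).__name__, value.shape)
```

The list holds one array for the kernel and one for the bias of the Dense layer.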

Related

Why does training a tensorflow model with a tf.data pipeline yield radically different results than directly feeding with EagerTensors?

I am trying to set up a pipeline to train a model. To get started, I am using 'training wheels'.
I preprocess all of my data into 5 EagerTensors -- 3 for features, 2 for targets.
For the sake of argument, let's call the feature tensors "in_a, in_b, in_c" and the target tensors "tgt_1, tgt_2".
The shapes of the input tensors are as follows:
in_a.shape (67473, 132, 5)
in_b.shape (67473, 132)
in_c.shape (67473, 132)
Target tensors are:
tgt_1.shape (67473, 132)
tgt_2.shape (67473, 132)
If I feed these tensors into my model using the .fit method in the following way:
training_model.fit(x=[in_a, in_b, in_c],y=[tgt_1, tgt_2],batch_size = 32, shuffle = True, epochs = 20)
I get wonderful results 100% of the time that I run the fit (input data is identical in all cases)
HOWEVER, I have more data than I can fit in memory, so I am trying to figure out the tf.data.Dataset flow, and this is where I have problems.
I take the exact same tensors and create a zipped dataset in the following way:
feature_ds = tf.data.Dataset.from_tensor_slices((in_a, in_b, in_c))
target_ds = tf.data.Dataset.from_tensor_slices((tgt_1, tgt_2))
full_dataset = tf.data.Dataset.zip((feature_ds,target_ds)).shuffle(buffer_size=320).batch(32).prefetch(tf.data.experimental.AUTOTUNE)
This yields the following element_spec:
((TensorSpec(shape=(None, 132, 5), dtype=tf.float64, name=None), TensorSpec(shape=(None, 132), dtype=tf.float32, name=None), TensorSpec(shape=(None, 132), dtype=tf.float32, name=None)), (TensorSpec(shape=(None, 132), dtype=tf.float32, name=None), TensorSpec(shape=(None, 132), dtype=tf.float32, name=None)))
Now, when I feed the dataset into the exact same model, I get radically different results every time I train the model.
training_model.fit(full_dataset, epochs = 20)
One fit of 20 epochs gives good results; another run, mediocre; another, awful.
What could I be doing wrong? Any ideas how to troubleshoot this? I mean, the data source doesn't change between the two ways of feeding the model, just the method used to get it there.
Many thanks in advance!
Reefmo
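Sorted it out. It turns out that model.fit(shuffle=True) shuffles the entire dataset at each epoch when it is given in-memory tensors.
The way I built full_dataset above used .shuffle(buffer_size=320).
The problem was that the dataset is some 67k records long, and shuffle on a tf.data.Dataset only shuffles within the buffer: elements are drawn at random from the buffer, and the buffer is backfilled from the stream as they are read out (as far as I understand it). With a 320-element buffer, each batch only ever mixes records that sit close together in the original order. I had also assumed that the default behavior is to NOT reshuffle at the end of every epoch, although the tf.data documentation lists reshuffle_each_iteration as defaulting to True, so the small buffer was most likely the real culprit.
Changing the line to
full_dataset = tf.data.Dataset.zip((feature_ds, target_ds)).shuffle(buffer_size=67000, reshuffle_each_iteration=True).batch(32).prefetch(tf.data.experimental.AUTOTUNE)
fixed my issue.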
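The effect of a small shuffle buffer can be illustrated without TensorFlow. The sketch below is a plain-Python approximation of a buffered shuffle (fill a buffer, emit a random element, replace it with the next incoming element); the function name buffered_shuffle and the sizes used are hypothetical, and this is not tf.data's actual implementation. It shows that with a small buffer, the first batch can only contain records from the very start of the dataset.

```python
import random


def buffered_shuffle(items, buffer_size, seed=0):
    """Approximate a streaming shuffle that only mixes within a buffer."""
    rng = random.Random(seed)
    buffer = []
    for item in items:
        if len(buffer) < buffer_size:
            # Still filling the buffer; nothing is emitted yet.
            buffer.append(item)
            continue
        # Emit a random buffered element and backfill its slot.
        idx = rng.randrange(buffer_size)
        yield buffer[idx]
        buffer[idx] = item
    # Input exhausted: drain what is left in random order.
    rng.shuffle(buffer)
    yield from buffer


n = 1000
small = list(buffered_shuffle(range(n), buffer_size=10))
full = list(buffered_shuffle(range(n), buffer_size=n))

# With a 10-element buffer, the first 32 outputs all come from the
# first few dozen records; with a full-size buffer they can come from anywhere.
print("small buffer, first batch max index:", max(small[:32]))
print("full buffer,  first batch max index:", max(full[:32]))
```

In this model, an element can move forward by at most buffer_size - 1 positions, which is why a buffer close to the dataset size is needed to approximate a full shuffle.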

Keras Same Feature Extraction from Different Images

I'm using a Keras pre-trained model for feature extraction on two images; however, they give the same outcome (array_equal = True). I've tried other models such as VGG16 and ResNet50, but the results are the same. Am I writing the code wrong, or is it a limitation of the pre-trained model? Is there anything I can do to extract different features? Thanks!
import cv2
import numpy as np
from keras.applications.inception_v3 import InceptionV3
from keras.applications.inception_v3 import preprocess_input
model = InceptionV3(weights='imagenet', include_top=False)
def get_img_vector(path):
    # Load the image, resize it, add a batch dimension, then preprocess
    im = cv2.imread(path)
    im = cv2.resize(im,(224,224))
    img = preprocess_input(np.expand_dims(im.copy(), axis=0))
    resnet_feature = model.predict(img)
    return np.array(resnet_feature)
arr1 = get_img_vector('image1.png')
arr2 = get_img_vector('image2.png')
np.array_equal(arr1,arr2)
Below are my two images:
I think the PNG file format is causing the image loading problem. Currently, reading the PNG file with cv2.imread and displaying it with cv2.imshow results in a black screen, which would make the two images identical. Try saving the file as JPG instead of PNG and running it again.
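One way to check this hypothesis before running the model is to inspect the loaded arrays directly. The helper below is a hypothetical diagnostic (describe_loaded_image is not part of OpenCV or Keras) that works on whatever array the loader returned; it relies only on the fact that cv2.imread returns None when it cannot read a file. The arrays here are synthetic stand-ins for real images.

```python
import numpy as np


def describe_loaded_image(im):
    """Return a short diagnosis for an array returned by an image loader."""
    if im is None:
        return "not loaded"   # e.g. cv2.imread could not read the file
    if im.size == 0 or im.min() == im.max():
        return "constant"     # every pixel identical, e.g. an all-black image
    return "ok"


# Synthetic stand-ins for two loaded images (assumed shapes, for illustration).
black_a = np.zeros((224, 224, 3), dtype=np.uint8)
black_b = np.zeros((224, 224, 3), dtype=np.uint8)
gradient = np.tile(np.arange(224, dtype=np.uint8), (224, 1))

print(describe_loaded_image(None))
print(describe_loaded_image(black_a))
print(describe_loaded_image(gradient))

# Two all-black inputs are identical before the network ever sees them.
print(np.array_equal(black_a, black_b))
```

If both inputs come back as "constant" and compare equal, identical features are expected regardless of the model used.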
If you run the code, you should see some warning like this,
WARNING: TensorFlow:Model was constructed with shape (None, 299, 299, 3)
for input Tensor("input_3:0", shape=(None, 299, 299, 3), dtype=float32),
but it was called on an input with incompatible shape (None, 224, 224, 3).
Change your code to
im = cv2.resize(im,(299,299))
Now, about the identical features: the ImageNet pre-trained model can only classify 1000 classes, and the given pictures do not belong to any of them. If you decode the predictions, you'll see that both images give the same output. Even for the top 5 predictions the confidence is very low, and the most similar class is a nematode.
[[('n01930112', 'nematode', 0.11086103), ('n03729826', 'matchstick', 0.08173305), ('n03196217', 'digital_clock', 0.034744), ('n03590841', "jack-o'-lantern", 0.017616412), ('n04286575', 'spotlight', 0.016781498)]]
However, if you want to train a model that can differentiate these two images then you can use the pre-trained models for transfer learning with your own dataset.

Is it possible to train a sklearn model (eg SVM) incrementally? [duplicate]

I'm trying to perform sentiment analysis on the Twitter dataset "Sentiment140", which consists of 1.6 million labelled tweets. I'm constructing my feature vectors using a bag-of-words (unigram) model, so each tweet is represented by about 20000 features. To train my sklearn model (SVM, Logistic Regression, Naive Bayes) on this dataset, I have to load the entire 1.6m x 20000 feature matrix into one variable and then feed it to the model. Even on my server machine, which has a total of 115GB of memory, this causes the process to be killed.
So I wanted to know whether I can train the model instance by instance, rather than loading the entire dataset into one variable.
If sklearn does not have this flexibility, are there any other libraries you could recommend (which support sequential learning)?
It is not really necessary (let alone efficient) to go to the other extreme and train instance by instance; what you are looking for is actually called incremental or online learning, and it is available in scikit-learn's SGDClassifier for linear SVM and logistic regression, which indeed contains a partial_fit method.
Here is a quick example with dummy data:
import numpy as np
from sklearn import linear_model
X = np.array([[-1, -1], [-2, -1], [1, 1], [2, 1]])
Y = np.array([1, 1, 2, 2])
clf = linear_model.SGDClassifier(max_iter=1000, tol=1e-3)
clf.partial_fit(X, Y, classes=np.unique(Y))
X_new = np.array([[-1, -1], [2, 0], [0, 1], [1, 1]])
Y_new = np.array([1, 1, 2, 1])
clf.partial_fit(X_new, Y_new)
The default values for the loss and penalty arguments ('hinge' and 'l2' respectively) are those of a LinearSVC, so the above code essentially fits a linear SVM classifier with L2 regularization incrementally; these settings can of course be changed - check the docs for more details.
It is necessary to include the classes argument in the first call, which should contain all the existing classes in your problem (even though some of them might not be present in some of the partial fits); it can be omitted in subsequent calls of partial_fit - again, see the linked documentation for more details.
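For a dataset that does not fit in memory, the same idea extends to a loop over mini-batches. The sketch below assumes a hypothetical generator, synthetic_batches, that stands in for reading chunks from disk (here it just produces random data with a simple labelling rule); only SGDClassifier and partial_fit come from scikit-learn. The full list of classes is passed on every call, which is allowed as long as it stays the same.

```python
import numpy as np
from sklearn.linear_model import SGDClassifier


def synthetic_batches(n_batches, batch_size, n_features, seed=0):
    """Hypothetical stand-in for reading one chunk of the dataset at a time."""
    rng = np.random.default_rng(seed)
    for _ in range(n_batches):
        X = rng.normal(size=(batch_size, n_features))
        y = (X[:, 0] > 0).astype(int)  # toy labelling rule, for illustration
        yield X, y


all_classes = np.array([0, 1])  # every class that can ever appear
clf = SGDClassifier(loss="hinge", penalty="l2", random_state=0)

# Only one batch is held in memory at any time.
for X_batch, y_batch in synthetic_batches(20, 100, 5):
    clf.partial_fit(X_batch, y_batch, classes=all_classes)

X_test, y_test = next(synthetic_batches(1, 200, 5, seed=1))
print("held-out accuracy:", clf.score(X_test, y_test))
```

In a real pipeline, the generator would read and vectorize one slice of the tweets per iteration instead of generating random numbers.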

How do I obtain the layer names for use in the iOS sample app? (Tensorflow)

I'm very new to Tensorflow, and I'm trying to train something using the inception v3 network for use in an iPhone app. I managed to export my graph as a protocolbuffer file, manually remove the dropout nodes (correctly, I hope), and have placed that .pb file in my iOS project, but now I am receiving the following error:
Running model failed:Not found: FeedInputs: unable to find feed output input
which seems to indicate that my input_layer_name and output_layer_name variables in the iOS app are misconfigured.
I see in various places that it should be Mul and softmax respectively, for inception v3, but these values don't work for me.
My question is: what is a layer (with regards to this context), and how do I find out what mine are?
This is the exact definition of the model that I trained, but I don't see "Mul" or "softmax" present.
This is what I've been able to learn about layers, but it seems to be a different concept, since "Mul" isn't present in that list.
I'm worried that this might be a duplicate of this question but "layers" aren't explained (are they tensors?) and graph.get_operations() seems to be deprecated, or maybe I'm using it wrong.
As MohamedEzz wrote, there are no layers in TensorFlow graphs. There are only operations, which can be placed under the same name scope.
Usually the operations of a single layer are placed under the same scope, and applications that are aware of the name scope concept can display them grouped.
One such application is TensorBoard. I believe that using TensorBoard is the easiest way to find node names.
Consider the following example:
import tensorflow as tf
import tensorflow.contrib.slim.nets as nets
input_placeholder = tf.placeholder(tf.float32, shape=(None, 224, 224, 3))
network = nets.inception.inception_v3(input_placeholder)
writer = tf.summary.FileWriter('.', tf.get_default_graph())
writer.close()
It creates a placeholder for the input data, then creates the Inception v3 network and saves the event data (with the graph) in the current directory.
Launching TensorBoard in the same directory makes it possible to view the graph structure.
tensorboard --logdir .
TensorBoard prints the UI URL to the console:
Starting TensorBoard 41 on port 6006
(You can navigate to http://192.168.128.73:6006)
Below is an image of this graph.
Locate the node you are interested in and select it to find its name (in the upper-left information pane).
Input:
Output:
Please note that you usually need tensor names rather than node names. In most cases it is enough to append :0 to the node name to get the tensor name.
For example, to run the Inception v3 network created above using names from the graph, use the following code (a continuation of the code above):
import numpy as np
data = np.random.randn(1, 224, 224, 3) # just random data
session = tf.InteractiveSession()
session.run(tf.global_variables_initializer())
result = session.run('InceptionV3/Predictions/Softmax:0', feed_dict={'Placeholder:0': data})
# result.shape = (1, 1000)
In the core of tensorflow, there are ops (operations) and tensors (n-dimensional arrays). Each op takes tensors and gives back tensors. Layers are just convenience wrappers around a number of ops that represent a neural network layer.
For example a convolution layer is composed of mainly 3 ops :
conv2d op : this is what slides a kernel over the input tensor and does element-wise multiplication between the kernel and the underlying input window.
bias_add op : adds the biases to the tensor coming out of the conv2d op
activation op : applies an activation function element-wise to the output tensor of the bias_add op
To run a tensorflow model, you provide feeds (inputs) and fetches (desired outputs). These are tensors, or tensor names.
From this line of code Inception_model, it seems that what you need is a tensor named 'predictions' which has the n_class output probabilities.
What you observed (softmax) is the type of the op that produced the predictions tensor
As for the input tensor name, the inception_model.py code does not show the input tensor name, since it's an argument to the function. So it depends on what name you have given to that input tensor.
When you create your layers or variables, pass the name parameter:
with tf.name_scope("output"):
    W2 = tf.Variable(tf.truncated_normal([num_filters, num_classes], stddev=0.1), name="W2")
    b2 = tf.Variable(tf.constant(0.1, shape=[num_classes]), name="b2")
    scores = tf.nn.xw_plus_b(h_pool_flat, W2, b2, name="scores")
    pred_y = tf.nn.softmax(scores,name="pred_y")
In this case I can access the final predicted values by using "output/pred_y". If you don't have a name_scope, you can just use "pred_y" to get the values.
# conv will have dimensions [batch_size, out_width, num_filters];
# out_width is a function of max_words, filter_size and stride_size
conv = tf.nn.conv1d(word_embeddedings,
                    W1,
                    stride=stride_size,
                    padding="VALID",
                    name="conv")
# Apply nonlinearity
h = tf.nn.relu(tf.nn.bias_add(conv, b1), name="relu")
I called the layer "conv" and used it in the next layer. Paste your snippet like I have done here

Name of input and output tensors when loading Keras model to TensorFlow

I'm trying to use Keras' model in "pure" TensorFlow (I want to use it in Android app). I've successfully exported Keras model to protobuf and imported it to Tensorflow. However running tensorflow model requires providing input and output tensors' names and I don't know how to find them. My model looks like this:
seq = Sequential()
seq.add(Convolution2D(32, 3, 3, input_shape=(3, 15, 15), name="Conv1"))
....
seq.add(Activation('softmax', name="Act4"))
seq.compile()
When I'm printing tensors in TensorFlow I can find:
Tensor("Conv1_W/initial_value:0", shape=(32, 3, 3, 3), dtype=float32)
Tensor("Conv1_W:0", dtype=float32_ref)
Tensor("Conv1_W/Assign:0", shape=(32, 3, 3, 3), dtype=float32_ref)
Tensor("Conv1_W/read:0", dtype=float32)
Tensor("Act4_sample_weights:0", dtype=float32)
Tensor("Act4_target:0", dtype=float32)
However, there are no tensors that have shape (3,15,15).
I've seen here that I can add "my_input_tensor" as input; however, I don't know which type it should be. I've tried TensorFlow's and Keras' placeholders and they gave me this error:
/XXXXXXXXX/lib/python2.7/site-packages/keras/engine/topology.pyc in __init__(self, input, output, name)
1599 # check that x is an input tensor
1600 layer, node_index, tensor_index = x._keras_history
-> 1601 if len(layer.inbound_nodes) > 1 or (layer.inbound_nodes and layer.inbound_nodes[0].inbound_layers):
1602 cls_name = self.__class__.__name__
1603 warnings.warn(cls_name + ' inputs must come from '
AttributeError: 'NoneType' object has no attribute 'inbound_nodes'
As of TensorFlow 2.0 (unfortunately they seem to change this often), you can export the model to the SavedModel format from Python using
model.save('MODEL-FOLDER')
and then inspect the model using the saved_model_cli tool (found in <yourenv>/bin/saved_model_cli, in Anaconda at least):
saved_model_cli show --dir /path/to/model/MODEL-FOLDER/ --tag_set serve --signature_def serving_default
The output will be something like:
The given SavedModel SignatureDef contains the following input(s):
inputs['graph_input'] tensor_info:
dtype: DT_DOUBLE
shape: (-1, 28, 28)
name: serving_default_graph_input:0
The given SavedModel SignatureDef contains the following output(s):
outputs['graph_output'] tensor_info:
dtype: DT_FLOAT
shape: (-1, 10)
name: StatefulPartitionedCall:0
Method name is: tensorflow/serving/predict
By inspecting the output, you can see the name of the input and output tensors in this case to be, respectively: serving_default_graph_input and StatefulPartitionedCall
This is how you find the tensor names.
The right way to do this, though, is to define a graph path and its input and output tensors on the model using SignatureDefs. You would then load those SignatureDefs instead of having to deal with the tensor names directly.
This is a nice reference that explains this better than the official docs, imo:
https://sthalles.github.io/serving_tensorflow_models/
Call model.summary() in Keras to see all the layers.
An input tensor will often be called input_1, input_2, etc. Check the summary for the correct name.
When you use input_shape=(3,15,15) in Keras, you're actually using tensors that have shape (None, 3, 15, 15), where None will be replaced by the batch size in training or prediction.
Often, for these unknown dimensions, you use -1, such as in (-1, 3, 15, 15). But I cannot assure you that it will work like this. It works perfectly for reshaping tensors, but I've never tested it for creating them.
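The leading None dimension can be checked directly on a model. This is a minimal sketch, assuming TensorFlow 2.x and a hypothetical stand-in model (a Flatten and a Dense layer instead of the convolutional model from the question); input_shape and output_shape report the batch dimension as None until an actual batch is passed in.

```python
import numpy as np
import tensorflow as tf

# Hypothetical stand-in model with the same per-sample input shape (3, 15, 15).
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(3, 15, 15)),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(10, activation="softmax"),
])

# The batch dimension is left unspecified in the model's declared shapes.
print(model.input_shape)
print(model.output_shape)

# It is filled in by whatever batch size is actually fed to the model.
batch = np.zeros((4, 3, 15, 15), dtype="float32")
out = model(batch)
print(out.shape)
```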
To get the input and output tensors of your Keras models, do the following:
input_tensor = seq.inputs[0]
output_tensor = seq.outputs[0]
print("Inputs: "+str(input_tensor))
print("Outputs: "+str(output_tensor))
The above assumes that there is only 1 input tensor and 1 output tensor. If you have more, then you would have to use the appropriate index to get those tensors.
Note that there is a difference between layer output shapes and tensor output shapes. The two are usually the same, but not always.
You can try calling summary() on the loaded model object, as suggested in one of the answers. If you can't find the input and output names in the model summary, try reading the input_names and output_names attributes of the model object, as below:
from tensorflow.keras.models import load_model
model = load_model("./model/00001")
print(model.input_names)
print(model.output_names)
Tried out on TensorFlow version : 2.3.1
