def try_and_error(layers, activation):
    model = Sequential()
    for i, nodes in enumerate(layers):
        if i == 0:
            model.add(Dense(nodes, input_dim=train_X.shape[1]))  # input layer
            model.add(Activation(activation))  # activation layer
        else:
            model.add(Dense(nodes))  # hidden layer
            model.add(Activation(activation))  # activation layer
    model.add(Dense(1))  # output layer
    model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
    return model
layers = [[150], [160, 100], [140, 100, 500]]
activations = ['sigmoid', 'relu']
param_grid = dict(layers=layers, activation=activations, batch_size=[500, 800, 1000])
grid = RandomizedSearchCV(
    estimator=KerasClassifier(build_fn=try_and_error, epochs=100, verbose=0),
    param_distributions=param_grid)
grid_result = grid.fit(train_X, train_y)
This is the error encountered (I have also tried GridSearchCV, but the result is still the same):
RuntimeError: Cannot clone object <keras.wrappers.scikit_learn.KerasClassifier object at 0x7f3d7959c390>, as the constructor either does not set or modifies parameter layers
Try replacing
layers=[[150], [160,100], [140,100,500]]
with
layers=[(150,), (160,100), (140,100,500)]
Don't forget the trailing comma in (150,); without it, Python evaluates (150) as the int 150 rather than a one-element tuple, and you will get an error like TypeError: 'int' object is not iterable. The switch from lists to tuples is what resolves the clone error: scikit-learn's clone checks parameter identity after copying, and copying a tuple of plain numbers returns the very same object, whereas copying a list always creates a new one.
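A quick check in a Python interpreter makes the tuple/int distinction explicit:
>>> type((150))
<class 'int'>
>>> type((150,))
<class 'tuple'>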
I have a pretrained model which was saved by
torch.save(net, 'lenet5_mnist_model')
Now I am loading it back and trying to calculate the Fisher information matrix like this:
precision_matrices = {}
batch_size = 32
my_model = torch.load('lenet5_mnist_model')
my_model.eval()  # I tried to comment this out, but still no luck

for n, p in deepcopy({n: p for n, p in my_model.named_parameters()}).items():
    p = torch.tensor(p, requires_grad=True)
    p.data.zero_()
    precision_matrices[n] = Variable(p.data)

for idx in range(int(images.shape[0] / batch_size)):
    x = images[idx * batch_size : (idx + 1) * batch_size]
    my_model.zero_grad()
    x = Variable(x.cuda(), requires_grad=True)
    output = my_model(x).view(1, -1)
    label = output.max(1)[1].view(-1)
    loss = F.nll_loss(F.log_softmax(output, dim=1), label)
    loss = Variable(loss, requires_grad=True)
    loss.backward()
    for n, p in my_model.named_parameters():
        precision_matrices[n].data += p.grad.data ** 2
Finally, the above code crashes at the last line, because p.grad is None. The error is:
AttributeError: 'NoneType' object has no attribute 'data'.
Could someone provide some guidance on what caused the NoneType grad for the parameters? How should I fix this?
Your loss does not backpropagate the gradients through the model, because you are creating a new loss tensor with the value of the actual loss, which is a leaf of the computational graph, meaning that there is no history to backpropagate through.
loss.backward() needs to be called on the output of loss = F.nll_loss(F.log_softmax(output, dim=1), label).
I'm assuming that you thought you need to create a tensor with requires_grad=True, to be able to calculate the gradients. That is not the case. Tensors created with requires_grad=True are the leaves of the computational graph (they start the graph) and every operation performed on any tensor that is part of the graph is tracked such that the gradients can flow through the intermediate results to the leaves. Only tensors that need to be optimised (i.e. learnable parameters) should set requires_grad=True manually (the model's parameters do that automatically), everything else regarding the gradients is inferred. Neither x nor the loss are learnable parameters.
This confusion presumably arose due to the use of Variable. It was deprecated in PyTorch 0.4.0, which was released over 2 years ago, and all of its functionality has been merged into the tensors. Please do not use Variable.
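To see this concretely, here is a minimal sketch (plain PyTorch, independent of your model) of why wrapping the loss in a fresh tensor breaks backprop:
import torch

w = torch.ones(1, requires_grad=True)  # a learnable leaf of the graph
loss = (w * 2).sum()                   # carries grad history back to w
wrapped = torch.tensor(loss.item(), requires_grad=True)  # new leaf, no history

wrapped.backward()
print(w.grad)  # None: backprop never reaches w

loss.backward()
print(w.grad)  # tensor([2.]): backprop through the original graph works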
With the redundant Variable wrappers removed, the loop body becomes:
x = images[idx*batch_size : (idx+1)*batch_size]
my_model.zero_grad()
x = x.cuda()
output = my_model(x).view(1,-1)
label = output.max(1)[1].view(-1)
loss = F.nll_loss(F.log_softmax(output, dim=1), label)
loss.backward()
I have trained a model and deployed it to tensorflow-serving for inference.
I am getting this error when making a request:
<Response [400]>
{'error': 'transpose expects a vector of size 5. But input(1) is a vector of size 3\n\t [[{{node bidirectional_1/transpose}} = Transpose[T=DT_FLOAT, Tperm=DT_INT32, _class=["loc:#bidirectional_1/TensorArrayUnstack/TensorArrayScatter/TensorArrayScatterV3"], _output_shapes=[[50,?,512]], _device="/job:localhost/replica:0/task:0/device:CPU:0"](embedding_1/embedding_lookup, Attention/transpose/perm)]]'}
The notable difference between this model and the first one I deployed (which worked without issue) is that it contains a custom Keras layer, whereas my successful attempt contained only standard Keras layers.
This is how I am testing the POST request to my tf-serving model:
with open("CNN_last_test_set.pkl", "rb") as fp:
x_arr_test, y_test = pickle.load(fp)
out = x_arr_test[:1, :]
out = out.tolist()
payload = {
"instances": [{'input': [out]}]
}
r = requests.post('http://localhost:9000/v1/models/prod_mod:predict', json=payload)
pred = json.loads(r.content.decode('utf-8'))
To create the TensorFlow model object for use with tf-serving, I am using this function:
def export_model_custom_layer(filename, export_path_base):
    # set the model to test-time (inference) mode
    K.set_learning_phase(0)
    model = keras.models.load_model(filename, custom_objects={"Attention": Attention})
    sess = K.get_session()
    # set the path to save the model and the model version
    export_version = 1
    export_path = os.path.join(
        tf.compat.as_bytes(export_path_base),
        tf.compat.as_bytes(str(export_version)))
    tf.saved_model.simple_save(
        sess,
        export_path,
        inputs={'input': model.input},
        outputs={t.name.split(':')[0]: t for t in model.outputs},
        legacy_init_op=tf.tables_initializer())
Here I've registered my custom layer as a custom object. In order for this to work, I've added this method to my custom layer:
def get_config(self):
    config = {
        'name': "Attention"
    }
    base_config = super(Attention, self).get_config()
    return dict(list(base_config.items()) + list(config.items()))
When I predict with the model using standard Keras model.predict(), with the same data format the tf-serving model receives, it works as intended:
class Attention(Layer): ...

with open("CNN_last_test_set.pkl", "rb") as fp:
    x_arr_test, y_test = pickle.load(fp)

model = keras.models.load_model(r"Data/modelCNN.model", custom_objects={"Attention": Attention})
out = x_arr_test[:1, :]
test1 = out.shape
out = out.tolist()
test = model.predict([out])

>> print(test)
>> [[0.21351092]]
This leads me to believe that the issue is happening either when I export the model from keras to the .pb file or in some way the model is being run in the docker container.
I am not sure what to make of this error, but I'm assuming it is related to my custom layer object, considering that my previous model, which contained only standard Keras layers, worked.
Any help is greatly appreciated, thanks!
EDIT: I solved the issue. The problem was that my input data had two more dimensions than necessary. I realized this because, when I removed the brackets from around the variable "out", my error changed from 'transpose expects a vector of size 5' to 'transpose expects a vector of size 4'. So I reshaped "out" from (1, 50) to (50,), removed the brackets, and the problem resolved itself.
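For reference, a minimal sketch of the corrected request (assuming, per the edit, that each test example is a length-50 vector):
out = x_arr_test[:1, :]          # shape (1, 50)
out = out.reshape(50,).tolist()  # drop the extra batch dimension
payload = {
    "instances": [{'input': out}]  # no extra brackets around out
}
r = requests.post('http://localhost:9000/v1/models/prod_mod:predict', json=payload)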
TensorFlow provides the possibility of combining ValidationMonitors with several predefined estimators like tf.contrib.learn.DNNClassifier.
But I want to use a ValidationMonitor with my own estimator, which I have created based on 1.
For my own estimator I initialize first a ValidationMonitor:
validation_monitor = tf.contrib.learn.monitors.ValidationMonitor(testX,testY,every_n_steps=50)
estimator = tf.contrib.learn.Estimator(model_fn=model,model_dir=direc,config=tf.contrib.learn.RunConfig(save_checkpoints_secs=1))
input_fn = tf.contrib.learn.io.numpy_input_fn({"x": x}, y, 4, num_epochs=1000)
Here I pass the monitor as shown in 2 for tf.contrib.learn.DNNClassifier:
estimator.fit(input_fn=input_fn, steps=1000,monitors=[validation_monitor])
This fails, and the following error is printed:
ValueError: Features are incompatible with given information. Given features: Tensor("input:0", shape=(?, 1), dtype=float64), required signatures: {'x': TensorSignature(dtype=tf.float64, shape=TensorShape([Dimension(None)]), is_sparse=False)}.
How can I use monitors for my own estimators?
Thanks.
The problem is solved by passing an input_fn containing testX and testY to the ValidationMonitor, instead of passing the tensors testX and testY directly.
For the record, your error was caused by the fact that ValidationMonitor expects x to be a dictionary like { 'feature_name_as_a_string' : feature_tensor }, which in your input_fn is done internally by the call to tf.contrib.learn.io.numpy_input_fn(...).
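A minimal sketch of the fix, assuming testX and testY are NumPy arrays like x and y above:
# Wrap the test data in an input_fn so the features arrive as a dict,
# exactly as numpy_input_fn already does for the training data.
test_input_fn = tf.contrib.learn.io.numpy_input_fn({"x": testX}, testY, batch_size=4, num_epochs=1)
validation_monitor = tf.contrib.learn.monitors.ValidationMonitor(input_fn=test_input_fn, every_n_steps=50)
estimator.fit(input_fn=input_fn, steps=1000, monitors=[validation_monitor])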
More information about how to build features dictionaries can be found in the Building Input Functions with tf.contrib.learn article of the documentation.
I'm trying to use a placeholder in my graph as a Variable (so I can later optimise something with respect to it), but I don't know the best way to do that. I've tried this:
x = tf.placeholder(tf.float32, shape = [None,1])
x_as_variable = tf.Variable(x, validate_shape = False)
but every time I build my graph, I get an error when I try to add my loss function:
train = tf.train.AdamOptimizer().minimize(MSEloss)
The error is:
ValueError: as_list() is not defined on an unknown TensorShape.
Even if you're not entirely familiar with the error, I'd really appreciate it if you could guide me on how to build a replica Variable that takes on the value of my placeholder.
Thanks!
As you've noticed, TensorFlow optimizers (i.e. subclasses of tf.train.Optimizer) operate on tf.Variable objects because they need to be able to assign new values to those objects, and in TensorFlow only variables support an assign operation. If you use a tf.placeholder(), there's nothing to update, because the value of a placeholder is immutable within each step.
So how do you optimize with respect to a fed-in value? I can think of two options:
Instead of feeding a tf.placeholder(), you could first assign a fed-in value to a variable and then optimize with respect to it:
var = tf.Variable(...)
set_var_placeholder = tf.placeholder(tf.float32, ...)
set_var_op = var.assign(set_var_placeholder)
# ...
train_op = tf.train.AdamOptimizer(...).minimize(mse_loss, var_list=[var, ...])
# ...
initial_val = ... # A NumPy array.
sess.run(set_var_op, feed_dict={set_var_placeholder: initial_val})
sess.run(train_op)
updated_val = sess.run(var)
You could use the lower-level tf.gradients() function to get the gradient of the loss with respect to a placeholder in a single step. You could then use that gradient in Python:
var = tf.placeholder(tf.float32, ...)
# ...
mse_loss = ...
var_grad, = tf.gradients(mse_loss, [var])
var_grad_val = sess.run(var_grad, feed_dict={var: ...})
PS. The code in your question, where you define a tf.Variable(tf.placeholder(...), ...) is just defining a variable whose initial value is fed by the placeholder. This probably isn't what you want, because the training op that the optimizer creates will only use the value assigned to the variable, and ignore whatever you feed to the placeholder (after the initialization step).
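A small sketch of that behaviour (TF 1.x, assuming a fresh session):
ph = tf.placeholder(tf.float32, shape=[None, 1])
v = tf.Variable(ph, validate_shape=False)

sess = tf.Session()
# The placeholder is only consulted when the variable is initialized:
sess.run(v.initializer, feed_dict={ph: [[1.0], [2.0]]})
print(sess.run(v))  # [[1.], [2.]]
# Later feeds of ph have no effect on v; only assignments change it.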
I have a graph as follows, where the input x has two paths to reach y. They are combined with a gModule that uses CMulTable. Now if I do gModule:backward(x, y), I get a table of two values. Do they correspond to the error derivatives coming from the two paths?
But since path2 contains other nn layers, I suppose I need to derive the derivatives along this path in a stepwise fashion. But why did I get a table of two values for dy/dx?
To make things clearer, code to test this is as follows:
input1 = nn.Identity()()
input2 = nn.Identity()()
score = nn.CAddTable()({nn.Linear(3, 5)(input1), nn.Linear(3, 5)(input2)})
g = nn.gModule({input1, input2}, {score})  -- gModule

mlp = nn.Linear(3, 3)  -- path2 layer

x = torch.rand(3, 3)
x_p = mlp:forward(x)
result = g:forward({x, x_p})
error = torch.rand(result:size())
gradient1 = g:backward(x, error)    -- this is a table of 2 tensors
gradient2 = g:backward(x_p, error)  -- this is also a table of 2 tensors
So what is wrong with my steps?
P.S. Perhaps I have found the reason: g:backward({x, x_p}, error) results in the same table, so I guess the two values stand for dy/dx and dy/dx_p respectively.
I think you simply made a mistake constructing your gModule. gradInput of every nn.Module has to have exactly the same structure as its input - that is the way backprop works.
Here's an example how to create a module like yours using nngraph:
require 'torch'
require 'nn'
require 'nngraph'
function CreateModule(input_size)
local input = nn.Identity()() -- network input
local nn_module_1 = nn.Linear(input_size, 100)(input)
local nn_module_2 = nn.Linear(100, input_size)(nn_module_1)
local output = nn.CMulTable()({input, nn_module_2})
-- pack a graph into a convenient module with standard API (:forward(), :backward())
return nn.gModule({input}, {output})
end
input = torch.rand(30)
my_module = CreateModule(input:size(1))
output = my_module:forward(input)
criterion_err = torch.rand(output:size())
gradInput = my_module:backward(input, criterion_err)
print(gradInput)
UPDATE
As I said, the gradInput of every nn.Module has to have exactly the same structure as its input. So, if you define your module as nn.gModule({input1, input2}, {score}), your gradInput (the result of the backward pass) will be a table of gradients w.r.t. input1 and input2, which in your case are x and x_p.
The only question that remains is: why on earth don't you get an error when you call:
gradient1 = g:backward(x, error)
gradient2 = g:backward(x_p, error)
One would expect an exception to be raised, because the first argument must be a table of two tensors, not a single tensor. The reason is that most (perhaps all) Torch modules don't actually use the input argument when computing :backward(input, gradOutput); they usually store a copy of the input from the last :forward(input) call. In fact, this argument is used so rarely that modules don't even bother to validate it.