Why when using this simple model with multiple outputs does Keras complain about a lack of gradients? - machine-learning

So this problem occurs for in the context of a larger project, but I've assembled a minimal working example. Consider the following:
input_1 = Input((5,))
hidden_a = Dense(2)(input_1)
hidden_b = Dense(2)(input_1)
m1 = Model(input_1, [hidden_a, hidden_b])
input_2 = Input((2,))
output = Dense(1)(input_2)
m2 = Model(input_2, output)
m3 = Model(input_1, m2(m1(input_1)[0]))
print(m3.summary())
m3.compile(optimizer='adam', loss='mse')
x = np.random.random(size=(10,5))
y = np.random.random(size=(10,1))
m3.fit(x,y)
My expectation is that when evaluating this network, the output of hidden_b will simply be discarded and I'll effectively have a simple feed-forward neural network that goes input_1 -> hidden_a -> input_2 -> output. Instead, I get a cryptic error:
Traceback (most recent call last):
File "test.py", line 37, in <module>
m3.fit(x,y)
File "/home/thomas/.local/lib/python3.5/site-packages/keras/engine/training.py", line 1013, in fit
self._make_train_function()
File "/home/thomas/.local/lib/python3.5/site-packages/keras/engine/training.py", line 497, in _make_train_function
loss=self.total_loss)
File "/home/thomas/.local/lib/python3.5/site-packages/keras/legacy/interfaces.py", line 91, in wrapper
return func(*args, **kwargs)
File "/home/thomas/.local/lib/python3.5/site-packages/keras/optimizers.py", line 445, in get_updates
grads = self.get_gradients(loss, params)
File "/home/thomas/.local/lib/python3.5/site-packages/keras/optimizers.py", line 80, in get_gradients
raise ValueError('An operation has `None` for gradient. '
ValueError: An operation has `None` for gradient. Please make sure that all of your ops have a gradient defined (i.e. are differentiable). Common ops without gradient: K.argmax, K.round, K.eval.
Any idea what might be causing this? Thanks!
Update: If passing input_1 to m1 is the problem, then why does this work?
input_1 = Input((5,))
hidden_a = Dense(2)(input_1)
hidden_b = Dense(2)(input_1)
def sampling (args):
hidden_a, hidden_b = args
return hidden_a + hidden_b
z = Lambda(sampling)([hidden_a, hidden_b])
m1 = Model(input_1, [hidden_a, hidden_b, z])
input_2 = Input((2,))
output = Dense(1)(input_2)
m2 = Model(input_2, output)
m3 = Model(input_1, m2(m1(input_1)[2]))
m3.compile(optimizer='adam', loss='mse')
x = np.random.random(size=(10,5))
y = np.random.random(size=(10,1))
m3.fit(x,y)

You're passing an input to model 1 that is already the input of model 1.
m3 = Model(input_1, m2(m1.outputs[0]))

Related

Resource Exhausted Error while Creating Image captioning model

I have used pre_trained vgg16 for cnn_part to get features of image (which I am not training) and defining the decoder class, which is trained through model. I don't know how resources are getting exhausted in just training decoder part, which I think is not too complex as vgg16. Here I am attaching all the relevant code .
Here is code for vgg16 -->
image_model = tf.keras.applications.VGG16(include_top=False,weights='imagenet' )
image_model.trainable = False
new_input = image_model.input # Any arbitrary shapes with 3 channels
hidden_layer = image_model.layers[-1].output
image_features_extract_model = tf.keras.Model(new_input, hidden_layer)
# https://www.tensorflow.org/tutorials/text/image_captioning
class VGG16_Encoder(tf.keras.Model):
# This encoder passes the features through a Fully connected layer
def __init__(self , cnn_model ):
super(VGG16_Encoder, self).__init__()
# shape after fc : (batch_size, 49, embedding_dim)
self.conv_base = cnn_model
#self.fc = tf.keras.layers.Dense(embedding_dim)
#self.dropout = tf.keras.layers.Dropout(0.5, noise_shape=None, seed=None)
def call(self, x):
#x = self.fc(x)
#x = tf.nn.relu(x)
x = self.conv_base(x)
x = tf.reshape(x , (BATCH_SIZE, 49 , 512))
return x
Here is the code of decoder --->
def rnn_type(units):
# If you have a GPU, we recommend using CuDNNGRU(provides a 3x speedup than GRU)
# the code automatically does that.
if tf.test.is_gpu_available():
return tf.compat.v1.keras.layers.CuDNNGRU(units,
return_sequences=True,
return_state=True,
recurrent_initializer='glorot_uniform')
else:
return tf.keras.layers.GRU(units,
return_sequences=True,
return_state=True,
recurrent_activation='sigmoid',
'''The encoder_output(i.e. 'features'), hidden_state(initialized to 0)(i.e. 'hidden') and
the decoder_input (which is the start token)(i.e. 'x') is passed to the decoder.'''
class Rnn_Local_Decoder(tf.keras.Model):
def __init__(self, embedding_dim, units, vocab_size):
super().__init__()
self.units = units
self.embedding = tf.keras.layers.Embedding(vocab_size, embedding_dim)
self.gru = tf.keras.layers.GRU(self.units,
return_sequences=True,
return_state=True,
recurrent_initializer='glorot_uniform')
self.fc1 = tf.keras.layers.Dense(self.units)
self.dropout = tf.keras.layers.Dropout(0.5, noise_shape=None, seed=None)
self.batchnormalization = tf.keras.layers.BatchNormalization(axis=-1, momentum=0.99, epsilon=0.001, center=True, scale=True, beta_initializer='zeros', gamma_initializer='ones', moving_mean_initializer='zeros', moving_variance_initializer='ones', beta_regularizer=None, gamma_regularizer=None, beta_constraint=None, gamma_constraint=None)
self.fc2 = tf.keras.layers.Dense(vocab_size)
# Implementing Attention Mechanism
self.U_attn = tf.keras.layers.Dense(units)
self.W_attn = tf.keras.layers.Dense(units)
self.V_attn = tf.keras.layers.Dense(1)
#_________________________________________________________________________________________________________________________
def call(self, x, features, hidden):
# features : (batch_size,49,512) (Output from ENCODER)
# hidden : (batch_size, hidden_size) <==> (64,512)
# hidden_with_time_axis : (batch_size, 1, hidden_size) <==> (64,1,512)
hidden_with_time_axis = tf.expand_dims(hidden, 1)
# score shape : (64, 49, 1)
# Attention Function
'''e_ij = f( s_(t-1) , h_j )
e_ij = V_attn(T)*tanh(U_attn * h_j + W_attn * s_t )'''
score = self.V_attn(tf.nn.tanh(self.U_attn(features) + self.W_attn(hidden_with_time_axis)))
# self.Uattn(features) : (64,49,512)
# self.Wattn(hidden_with_time_axis) : (64,1,512)
# tf.nn.tanh(self.Uattn(features) + self.Wattn(hidden_with_time_axis)) : (64,49,512)
# self.Vattn(tf.nn.tanh(self.Uattn(features) + self.Wattn(hidden_with_time_axis))) : (64,49,1) ==> score
# you get 1 at the last axis because you are applying score to self.Vattn
# Then find Probability using Softmax
'''attention_weights(alpha_ij) = softmax(e_ij)'''
attention_weights = tf.nn.softmax(score, axis=1)
# attention_weights : (64, 49, 1)
# Give weights to the different pixels in the image
''' C(t) = Summation(j=1 to T) (attention_weights * VGG-16 features) '''
context_vector = attention_weights * features
context_vector = tf.reduce_sum(context_vector, axis=1)
# Context Vector(64,256) = AttentionWeights(64,49,1) * features(64,49,256)
# context_vector shape after sum : (64, 256) ---> doing ele_wise sum of features_vec (axis=1)
# x shape after passing through embedding : (64, 1, 256)
x = self.embedding(x)
# x shape after concatenation : (64, 1, 512)
x = tf.concat([tf.expand_dims(context_vector, 1), x], axis=-1)
# passing the concatenated vector to the GRU
output, state = self.gru(x)
# shape == (batch_size, max_length, hidden_size)
x = self.fc1(output)
# x : (batch_size * max_length, hidden_size)
x = tf.reshape(x, (-1, x.shape[2]))
# Adding Dropout and BatchNorm Layers
x= self.dropout(x)
x= self.batchnormalization(x)
# output : (64 * 512)
x = self.fc2(x)
# shape : (64 * 8329(vocab))
return x, state, attention_weights
#_______________________________________________________________________________________________________________________
def reset_state(self, batch_size):
return tf.zeros((batch_size, self.units)) recurrent_initializer='glorot_uniform')
encoder = VGG16_Encoder(image_features_extract_model)
decoder = Rnn_Local_Decoder(embedding_dim, units, vocab_size)
Here is the training code --->
def train_step(img_tensor, target):
loss = 0
# initializing the hidden state for each batch
# because the captions are not related from image to image
hidden = decoder.reset_state(batch_size=target.shape[0])
dec_input = tf.expand_dims([tokenizer.word_index['<start>']] * BATCH_SIZE, 1)
features = encoder(img_tensor)
with tf.GradientTape() as tape:
for i in range(1, max_len):
# passing the features through the decoder
predictions, hidden, _ = decoder(dec_input, features, hidden)
loss += loss_function(target[:, i], predictions)
# using teacher forcing
dec_input = tf.expand_dims(target[:, i], 1)
total_loss = (loss / int(target.shape[1]))
trainable_variables = decoder.trainable_variables
gradients = tape.gradient(loss, trainable_variables)
optimizer.apply_gradients(zip(gradients, trainable_variables))
return loss, total_loss
Here is the error --->
ResourceExhaustedError: Graph execution error:
Detected at node 'gradient_tape/rnn__local__decoder_1/dense_6/MatMul_3/MatMul_1' defined at (most recent call last):
File "/usr/lib/python3.8/runpy.py", line 194, in _run_module_as_main
return _run_code(code, main_globals, None,
File "/usr/lib/python3.8/runpy.py", line 87, in _run_code
exec(code, run_globals)
File "/usr/local/lib/python3.8/dist-packages/ipykernel_launcher.py", line 16, in <module>
app.launch_new_instance()
File "/usr/local/lib/python3.8/dist-packages/traitlets/config/application.py", line 992, in launch_instance
app.start()
File "/usr/local/lib/python3.8/dist-packages/ipykernel/kernelapp.py", line 612, in start
self.io_loop.start()
File "/usr/local/lib/python3.8/dist-packages/tornado/platform/asyncio.py", line 149, in start
self.asyncio_loop.run_forever()
File "/usr/lib/python3.8/asyncio/base_events.py", line 570, in run_forever
self._run_once()
File "/usr/lib/python3.8/asyncio/base_events.py", line 1859, in _run_once
handle._run()
File "/usr/lib/python3.8/asyncio/events.py", line 81, in _run
self._context.run(self._callback, *self._args)
File "/usr/local/lib/python3.8/dist-packages/tornado/ioloop.py", line 690, in <lambda>
lambda f: self._run_callback(functools.partial(callback, future))
File "/usr/local/lib/python3.8/dist-packages/tornado/ioloop.py", line 743, in _run_callback
ret = callback()
File "/usr/local/lib/python3.8/dist-packages/tornado/gen.py", line 787, in inner
self.run()
File "/usr/local/lib/python3.8/dist-packages/tornado/gen.py", line 748, in run
yielded = self.gen.send(value)
File "/usr/local/lib/python3.8/dist-packages/ipykernel/kernelbase.py", line 381, in dispatch_queue
yield self.process_one()
File "/usr/local/lib/python3.8/dist-packages/tornado/gen.py", line 225, in wrapper
runner = Runner(result, future, yielded)
File "/usr/local/lib/python3.8/dist-packages/tornado/gen.py", line 714, in __init__
self.run()
File "/usr/local/lib/python3.8/dist-packages/tornado/gen.py", line 748, in run
yielded = self.gen.send(value)
File "/usr/local/lib/python3.8/dist-packages/ipykernel/kernelbase.py", line 365, in process_one
yield gen.maybe_future(dispatch(*args))
File "/usr/local/lib/python3.8/dist-packages/tornado/gen.py", line 209, in wrapper
yielded = next(result)
File "/usr/local/lib/python3.8/dist-packages/ipykernel/kernelbase.py", line 268, in dispatch_shell
yield gen.maybe_future(handler(stream, idents, msg))
File "/usr/local/lib/python3.8/dist-packages/tornado/gen.py", line 209, in wrapper
yielded = next(result)
File "/usr/local/lib/python3.8/dist-packages/ipykernel/kernelbase.py", line 543, in execute_request
self.do_execute(
File "/usr/local/lib/python3.8/dist-packages/tornado/gen.py", line 209, in wrapper
yielded = next(result)
File "/usr/local/lib/python3.8/dist-packages/ipykernel/ipkernel.py", line 306, in do_execute
res = shell.run_cell(code, store_history=store_history, silent=silent)
File "/usr/local/lib/python3.8/dist-packages/ipykernel/zmqshell.py", line 536, in run_cell
return super(ZMQInteractiveShell, self).run_cell(*args, **kwargs)
File "/usr/local/lib/python3.8/dist-packages/IPython/core/interactiveshell.py", line 2854, in run_cell
result = self._run_cell(
File "/usr/local/lib/python3.8/dist-packages/IPython/core/interactiveshell.py", line 2881, in _run_cell
return runner(coro)
File "/usr/local/lib/python3.8/dist-packages/IPython/core/async_helpers.py", line 68, in _pseudo_sync_runner
coro.send(None)
File "/usr/local/lib/python3.8/dist-packages/IPython/core/interactiveshell.py", line 3057, in run_cell_async
has_raised = await self.run_ast_nodes(code_ast.body, cell_name,
File "/usr/local/lib/python3.8/dist-packages/IPython/core/interactiveshell.py", line 3249, in run_ast_nodes
if (await self.run_code(code, result, async_=asy)):
File "/usr/local/lib/python3.8/dist-packages/IPython/core/interactiveshell.py", line 3326, in run_code
exec(code_obj, self.user_global_ns, self.user_ns)
File "<ipython-input-82-94347d84883d>", line 11, in <module>
batch_loss, t_loss = train_step(img_tensor, target)
File "<ipython-input-63-9f15c0ea6d9d>", line 30, in train_step
gradients = tape.gradient(loss, trainable_variables)
Node: 'gradient_tape/rnn__local__decoder_1/dense_6/MatMul_3/MatMul_1'
Sorry for uploading so much of code , but I feel that all is necessary to sort this issue.
Thanks in advance !!!
I tried to reduce the data from 40000 images to just 500 images , but then also same error stayed. I even tried to reduce batch size, embedding dim of decoder (512-->128) but all in vain.
Kindly help me fix this issue.

ValueError: "input_length" is 47, but received input has shape (None, 47, 18704)

input = Input(shape=(47,vocab_length))
print(input.shape)
X = Embedding(input_dim=vocab_length,output_dim = embedding_size,input_length=47)(input)
print(X.shape)
X = Reshape([47,embedding_size,1])(X)
Above given is the part of my code, here I am getting an error in Embedding layer, as shown-
ValueError: "input_length" is 47, but received input has shape (None, 47, 18704)
Note here that vocab_length and embedding_size are integers.
Please help me out!
Thanks
So, further I tried making my own customed layer similar to the Embedding layer. But in it I am getting some error as described below.
'''
from keras import backend as K
from keras.engine.topology import Layer
class MyLayer(Layer):
def __init__(self,shape_0=1,shape_1=1):
super(MyLayer, self).__init__()
self.w = self.add_weight(
shape=(shape_0,shape_1), initializer="random_normal", trainable=True
)
def call(self, inputs):
return tf.matmul(inputs, self.w)
embedding_size = 300
vocab_length = 18704
input = Input(shape=(47,vocab_length))
X = MyLayer(vocab_length,embedding_size)(input)
X = Reshape([47,embedding_size,1])(X)
'''
Here it gives error in Reshape layer saying the target length should be same as the original length of the input. But it is same i.e. input shape = [47,embedding_size] and target size = [47,embedding_size,1]
'''
ValueError: total size of new array must be unchanged
'''

How can I fix the “TypeError: 'Tensor' object is not callable” error in Pytorch?

I am trying to compute a linear function an image's pixels, followed by log softmax (it's for a classification task). I am not sure how to do this without getting errors. Here is the relevant code:
...
torch.nn.functional.nll_loss(output, target) # error happens here
...
def __init__(self):
super(NetLin, self).__init__()
self.in_out = torch.nn.Linear(28, 2)
def forward(self, input):
out_sum = self.in_out(input)
output = torch.nn.LogSoftmax(out_sum)
return output
and the full error message I get is:
Traceback (most recent call last):
File "copy.py", line 98, in <module>
main()
File "copy.py", line 94, in main
train(args, net, device, train_loader, optimizer, epoch)
File "copy.py", line 21, in train
loss = torch.nn.functional.nll_loss(output, target)
File "/usr/local/lib/python3.7/site-packages/torch/nn/functional.py", line 2107, in nll_loss
dim = input.dim()
TypeError: 'Tensor' object is not callable
I have tried a few different solutions to this based on other answers online but they just result in different error messages. Clearly I am doing something fundamentally wrong here but I haven't used Pytorch before so I'm not sure what it is. Thank you
Edit:
My code is now:
def train(args, model, device, train_loader, optimizer, epoch):
if args.net == 'lin':
model = NetLin()
model.train()
loss = nn.NLLLoss()
for batch_idx, (data, target) in enumerate(train_loader):
data.requires_grad = True
data, target = data.to(device), target.to(device)
optimizer.zero_grad()
output = loss(model(input), target)
F.nll_loss(output, target)
loss.backward()
optimizer.step()
if batch_idx % 100 == 0:
print('Train Epoch: {} [{}/{} ({:.0f}%)]\tLoss: {:.6f}'.format(
epoch, batch_idx * len(data), len(train_loader.dataset),
100. * batch_idx / len(train_loader), loss.item()))
class NetLin(nn.Module):
def __init__(self):
super(NetLin, self).__init__()
self.in_out = torch.nn.Linear(28 * 28, 2)
def forward(self, input):
input = input.view(-1, 28 * 28)
out_sum = self.in_out(input)
output = torch.nn.LogSoftmax(out_sum, dim=1)
return output
and my error message is now:
Traceback (most recent call last):
File "copy.py", line 102, in <module>
main()
File "copy.py", line 98, in main
train(args, net, device, train_loader, optimizer, epoch)
File "copy.py", line 24, in train
output = loss(model(input), target)
File "/usr/local/lib/python3.7/site-packages/torch/nn/modules/module.py", line 550, in __call__
result = self.forward(*input, **kwargs)
File "/Users/.../copy.py", line 15, in forward
input = input.view(-1, 28 * 28)
AttributeError: 'builtin_function_or_method' object has no attribute 'view'
As you can kind of see the data and target are read in from a file (they are from KMNIST actually) so I can't control their format exactly, but I do know the image sizes are all [1,28,28], i.e. a 28*28 greyscale image. Also the batch size is 64 in case that matters.
Did you remember to set your model to training mode in your train loop with model.train()? Also, nll_loss takes in 2 tensors, but the first entry (the input tensor) needs to have requires_grad=True before it goes through the model, which is also why you need to set model.train() before training.
So you would have something like this:
model = NetLin()
model.train()
loss = nn.NLLLoss()
input = torch.randn(7, 4, requires_grad=True) # your input image (tensor)
target = torch.tensor([1, 0]) # image label for image belonging to first class
output = loss(model(input), target)
I am also a bit concerned about your self.in_out = torch.nn.Linear(28, 2). This says that your linear layer is expecting 28 features, implying that your input images are either 7x4, 14x2 or 28x1, which doesn't seem right in my opinion? Aren't you using images of size 28x28 (very typical size in this context)? In which case, you would have your linear layer modified as self.in_out = torch.nn.Linear(28*28, 2), and your forward pass will have to be modified as follows:
def forward(self, input):
input = input.view(-1, 28*28)
out_sum = self.in_out(input)
output = torch.nn.LogSoftmax(out_sum)
return output

Neural Network Dense Layer Error in Shape attribute

I have created a feed forward neural network but but it is giving a Type Error despite changing the datatype of the parameter. I am really new to keras and Machine Learning so I would appreciate as detailed help as possible. I am attaching the code snippet and the error log below. CODE-
num_of_features = X_train.shape[1]
nb_classes = Y_train.shape[1]
def baseline_model():
def branch2(x):
x = Dense(np.floor(num_of_features*50), activation='sigmoid')(x)
x = Dropout(0.75)(x)
x = Dense(np.floor(num_of_features*20), activation='sigmoid')(x)
x = Dropout(0.5)(x)
x = Dense(np.floor(num_of_features), activation='sigmoid')(x)
x = Dropout(0.1)(x)
return x
main_input = Input(shape=(num_of_features,), name='main_input')
x = main_input
x = branch2(x)
main_output = Dense(nb_classes, activation='softmax')(x)
model = Model(input=main_input, output=main_output)
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy', 'categorical_crossentropy'])
return model
model = baseline_model()
ERROR-
Traceback (most recent call last):
File "h2_fit_neural.py", line 143, in <module>
model = baseline_model()
File "h2_fit_neural.py", line 137, in baseline_model
x = branch2(x)
File "h2_fit_neural.py", line 124, in branch2
x = Dense(np.floor(num_of_features*50), activation='sigmoid')(x)
File "/home/shashank/tensorflow/lib/python3.6/site-packages/keras/engine/base_layer.py", line 432, in __call__
self.build(input_shapes[0])
File "/home/shashank/tensorflow/lib/python3.6/site-packages/keras/layers/core.py", line 872, in build
constraint=self.kernel_constraint)
File "/home/shashank/tensorflow/lib/python3.6/site-packages/keras/legacy/interfaces.py", line 91, in wrapper
return func(*args, **kwargs)
File "/home/shashank/tensorflow/lib/python3.6/site-packages/keras/engine/base_layer.py", line 249, in add_weight
weight = K.variable(initializer(shape),
File "/home/shashank/tensorflow/lib/python3.6/site-packages/keras/initializers.py", line 218, in __call__
dtype=dtype, seed=self.seed)
File "/home/shashank/tensorflow/lib/python3.6/site-packages/keras/backend/tensorflow_backend.py", line 4077, in random_uniform
dtype=dtype, seed=seed)
File "/home/shashank/tensorflow/lib/python3.6/site-packages/tensorflow/python/ops/random_ops.py", line 242, in random_uniform
rnd = gen_random_ops.random_uniform(shape, dtype, seed=seed1, seed2=seed2)
File "/home/shashank/tensorflow/lib/python3.6/site-packages/tensorflow/python/ops/gen_random_ops.py", line 674, in random_uniform
name=name)
File "/home/shashank/tensorflow/lib/python3.6/site-packages/tensorflow/python/framework/op_def_library.py", line 609, in _apply_op_helper
param_name=input_name)
File "/home/shashank/tensorflow/lib/python3.6/site-packages/tensorflow/python/framework/op_def_library.py", line 60, in _SatisfiesTypeConstraint
", ".join(dtypes.as_dtype(x).name for x in allowed_list)))
TypeError: Value passed to parameter 'shape' has DataType float32 not in list of allowed values: int32, int64
Why are you using np.floor for the shape in your Dense layers? This will produce a float, you need an int there. Removing np.floor should solve your problem.

Invalid Argument error Expected begin[0] = 0

I am currently developing a neural network, and I got all the data and I got the code to the point that an image is being fed to the CNN for training. However, in the training process, for the first image an error pops up with the following code.
def convolutional_neural_network(x):
weights = {'W_conv1':tf.Variable(tf.random_normal([5,5,1,32])),
'W_conv2':tf.Variable(tf.random_normal([5,5,32,64])),
'W_fc':tf.Variable(tf.random_normal([7*7*64,1024])),
'out':tf.Variable(tf.random_normal([1024, n_classes]))}
biases = {'b_conv1':tf.Variable(tf.random_normal([32])),
'b_conv2':tf.Variable(tf.random_normal([64])),
'b_fc':tf.Variable(tf.random_normal([1024])),
'out':tf.Variable(tf.random_normal([n_classes]))}
x = tf.reshape(x, shape=[-1, 28, 28, 1])
conv1 = tf.nn.relu(conv2d(x, weights['W_conv1']) + biases['b_conv1'])
conv1 = maxpool2d(conv1)
conv2 = tf.nn.relu(conv2d(conv1, weights['W_conv2']) + biases['b_conv2'])
conv2 = maxpool2d(conv2)
fc = tf.reshape(conv2,[-1, 7*7*64])
fc = tf.nn.relu(tf.matmul(fc, weights['W_fc'])+biases['b_fc'])
fc = tf.nn.dropout(fc, keep_rate)
output = tf.matmul(fc, weights['out'])+biases['out']
print("hi")
return output
def shuffle_unison(images, labels):
shuffleLabel = []
shuffleImage = []
shuffleVector = []
for i in range(0, len(images)-1):
shuffleVector.append(i)
random.shuffle(shuffleLabel)
for i in range(0, len(shuffleVector)-1):
shuffleImage.append(images[shuffleVector[i]])
shuffleLabel.append(labels[shuffleVector[i]])
return shuffleImage, shuffleLabel
def train_neural_network(x):
prediction = convolutional_neural_network(x)
cost = tf.reduce_mean( tf.nn.softmax_cross_entropy_with_logits(prediction,y) )
optimizer = tf.train.AdamOptimizer().minimize(cost)
hm_epochs = 10
# step 4: Batching
with tf.Session() as sess:
init = tf.initialize_all_variables()
sess.run(init)
tf.train.start_queue_runners()
#array of strings and corresponding values
image_list, label_list = readImageLables()
for epoch in range(hm_epochs):
epoch_loss = 0
#shuffle every epoch
shuffle_image_list, shuffle_label_list = shuffle_unison(image_list, label_list)
sampleList = ['/home/sciencefair/Desktop/OrchardData/MachineLearningTesting/RottenOranges/result1.jpg']
for i in range(0,7683):
#filename_queue = tf.train.string_input_producer(sampleList)
file_contents = tf.read_file(shuffle_image_list[i])
image = tf.image.decode_jpeg(file_contents, channels=1)
resized_image = tf.image.resize_images(image, [28,28])
#image_batch, label_batch = tf.train.batch([resized_image, shuffle_label_list[i]], batch_size=batch_size) # does train.batch take individual images or final tensors
#if(i>batch_size):
#print(label_batch.eval())
a = tf.reshape(resized_image,[1, 784])
print(a.eval())
_, c = sess.run([optimizer, cost], feed_dict={x: tf.reshape(resized_image,[1, 784]).eval(), y: shuffle_label_list[i]})
epoch_loss += c
print("ok")
print('Epoch', epoch, 'completed out of',hm_epochs,'loss:',epoch_loss)
sess.close()
The stack trace looked like this
Caused by op 'Slice_1', defined at:
File "revisednet.py", line 128, in <module>
train_neural_network(x)
File "revisednet.py", line 87, in train_neural_network
cost = tf.reduce_mean( tf.nn.softmax_cross_entropy_with_logits(prediction,y) )
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/ops/nn_ops.py", line 670, in softmax_cross_entropy_with_logits
labels = _flatten_outer_dims(labels)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/ops/nn_ops.py", line 472, in _flatten_outer_dims
array_ops.shape(logits), [math_ops.sub(rank, 1)], [1])
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/ops/array_ops.py", line 431, in slice
return gen_array_ops._slice(input_, begin, size, name=name)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/ops/gen_array_ops.py", line 2234, in _slice
name=name)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/op_def_library.py", line 749, in apply_op
op_def=op_def)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/ops.py", line 2380, in create_op
original_op=self._default_original_op, op_def=op_def)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/ops.py", line 1298, in __init__
self._traceback = _extract_stack()
InvalidArgumentError (see above for traceback): Expected begin[0] == 0 (got -1) and size[0] == 0 (got 1) when input.dim_size(0) == 0
[[Node: Slice_1 = Slice[Index=DT_INT32, T=DT_INT32, _device="/job:localhost/replica:0/task:0/cpu:0"](Shape_2, Slice_1/begin, Slice_1/size)]]
This error seems to originate from the data causing some confliction with the softmax function. However I have absolutely no idea what is causing this problem.
I followed this tutorial: Sentdex,
First pass through Data w/ 3D ConvNet
to build a 3D CNN and got the same error as yours here.
This error occurs because the dimension of the label vector of my input data (for example, the location of the first label vector in Sentdex's train data is train_data[0][1]) should be the same number as n_classes which in the tutorial is 2.
In my wrong try, I just use a binary value 0 or 1 to represent it, whose dimension is 1 where should be 2. So the tf.nn.softmax_cross_entropy_with_logits() function was confused by the wrong size of label vector.
Try expand your label vectors' dimension to be equal to your n_classes.

Resources