ValueError: 'logits' and 'labels' must have the same shape for NLP sentiment multi-class classifier - machine-learning

I am trying to make an NLP multi-class sentiment classifier that takes sentences as input and classifies them into three classes (negative, neutral and positive). However, when training the model, I run into an error because my logits (None, 3) are not the same shape as my labels (None, 1), and the model can't begin training.
My model is a multi-class classifier and not a multi-label classifier since it only predicts one label per object. I made sure that my last layer has an output of 3 and activation = 'softmax'. This should be correct from what I have found online, so I think the problem lies with my labels.
Currently, my labels have a dimension of (None, 1) since I mapped each class to a unique integer and passed this as my test and train y values (which are in the form of a one-dimensional numpy array).
Right now I am unsure whether I have to change the dimensions of this array to match the output dimensions, and how to go about doing it.
import os
import sys
import tensorflow as tf
import numpy as np
import pandas as pd
from tensorflow import keras
from tensorflow.keras.preprocessing.text import Tokenizer
from tensorflow.keras.preprocessing.sequence import pad_sequences
from keras.optimizers import SGD
device_name = tf.test.gpu_device_name()
if len(device_name) > 0:
    print("Found GPU at: {}".format(device_name))
else:
    device_name = "/device:CPU:0"
    print("No GPU, using {}.".format(device_name))
# Load dataset into a dataframe
train_data_path = "/content/drive/MyDrive/ML Datasets/tweet_sentiment_analysis/train.csv"
test_data_path = "/content/drive/MyDrive/ML Datasets/tweet_sentiment_analysis/test.csv"
train_df = pd.read_csv(train_data_path, encoding='unicode_escape')
test_df = pd.read_csv(test_data_path, encoding='unicode_escape').dropna()
sentiment_types = ('neutral', 'negative', 'positive')
train_df['sentiment'] = train_df['sentiment'].astype('category')
test_df['sentiment'] = test_df['sentiment'].astype('category')
train_df['sentiment_cat'] = train_df['sentiment'].cat.codes
test_df['sentiment_cat'] = test_df['sentiment'].cat.codes
train_y = np.array(train_df['sentiment_cat'])
test_y = np.array(test_df['sentiment_cat'])
# Function to convert df into a list of strings
def convert_to_list(df, x):
    selected_text_list = []
    labels = []
    for index, row in df.iterrows():
        selected_text_list.append(str(row[x]))
        labels.append(str(row['sentiment']))
    return np.array(selected_text_list), np.array(labels)
train_sentences, train_labels = convert_to_list(train_df, 'selected_text')
test_sentences, test_labels = convert_to_list(test_df, 'text')
# Instantiate tokenizer and create word_index
tokenizer = Tokenizer(num_words=1000, oov_token='<oov>')
tokenizer.fit_on_texts(train_sentences)
word_index = tokenizer.word_index
# Convert sentences into a sequence
train_sequence = tokenizer.texts_to_sequences(train_sentences)
test_sequence = tokenizer.texts_to_sequences(test_sentences)
# Padding sequences
pad_test_seq = pad_sequences(test_sequence, padding='post')
max_len = pad_test_seq[0].size
pad_train_seq = pad_sequences(train_sequence, padding='post', maxlen=max_len)
model = tf.keras.Sequential([
    tf.keras.layers.Embedding(10000, 64, input_length=max_len),
    tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(64, return_sequences=True)),
    tf.keras.layers.GlobalAveragePooling1D(),
    tf.keras.layers.Dense(64, activation='relu'),
    tf.keras.layers.Dense(3, activation='softmax')
])
with tf.device(device_name):
    model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
num_epochs = 10
with tf.device(device_name):
    history = model.fit(pad_train_seq, train_y, epochs=num_epochs, validation_data=(pad_test_seq, test_y), verbose=2)
Here is the error:
ValueError Traceback (most recent call last)
<ipython-input-28-62f3c6445887> in <module>
2
3 with tf.device(device_name):
----> 4 history = model.fit(pad_train_seq, train_y, epochs=num_epochs, validation_data=(pad_test_seq, test_y), verbose=2)
1 frames
/usr/local/lib/python3.8/dist-packages/keras/engine/training.py in tf__train_function(iterator)
13 try:
14 do_return = True
---> 15 retval_ = ag__.converted_call(ag__.ld(step_function), (ag__.ld(self), ag__.ld(iterator)), None, fscope)
16 except:
17 do_return = False
ValueError: in user code:
File "/usr/local/lib/python3.8/dist-packages/keras/engine/training.py", line 1051, in train_function *
return step_function(self, iterator)
File "/usr/local/lib/python3.8/dist-packages/keras/engine/training.py", line 1040, in step_function **
outputs = model.distribute_strategy.run(run_step, args=(data,))
File "/usr/local/lib/python3.8/dist-packages/keras/engine/training.py", line 1030, in run_step **
outputs = model.train_step(data)
File "/usr/local/lib/python3.8/dist-packages/keras/engine/training.py", line 890, in train_step
loss = self.compute_loss(x, y, y_pred, sample_weight)
File "/usr/local/lib/python3.8/dist-packages/keras/engine/training.py", line 948, in compute_loss
return self.compiled_loss(
File "/usr/local/lib/python3.8/dist-packages/keras/engine/compile_utils.py", line 201, in __call__
loss_value = loss_obj(y_t, y_p, sample_weight=sw)
File "/usr/local/lib/python3.8/dist-packages/keras/losses.py", line 139, in __call__
losses = call_fn(y_true, y_pred)
File "/usr/local/lib/python3.8/dist-packages/keras/losses.py", line 243, in call **
return ag_fn(y_true, y_pred, **self._fn_kwargs)
File "/usr/local/lib/python3.8/dist-packages/keras/losses.py", line 1930, in binary_crossentropy
backend.binary_crossentropy(y_true, y_pred, from_logits=from_logits),
File "/usr/local/lib/python3.8/dist-packages/keras/backend.py", line 5283, in binary_crossentropy
return tf.nn.sigmoid_cross_entropy_with_logits(labels=target, logits=output)
ValueError: `logits` and `labels` must have the same shape, received ((None, 3) vs (None, 1)).

my logits (None, 3) are not the same size as my labels (None, 1)
I made sure that my last layer had an output of 3 and had the activation = 'softmax'
my labels have a dimension of (None, 1) since I mapped each class to a unique integer
The key concept you are missing is that you need to one-hot encode your labels (after assigning integers to them - see below).
So your model, after the softmax, is spitting out three values: how probable each of your labels is. E.g. it might say A is 0.6, B is 0.1, and C is 0.3. If the correct answer is C, then it needs to see that correct answer as 0, 0, 1. It can then say that its prediction for A is 0.6 - 0 = +0.6 wrong, B is 0.1 - 0 = +0.1 wrong, and C is 0.3 - 1 = -0.7 wrong.
In theory you can go from a string label directly to a one-hot encoding, but it seems TensorFlow needs the labels to be encoded as integers first, and those integers are then one-hot encoded.
https://www.tensorflow.org/api_docs/python/tf/keras/layers/CategoryEncoding#examples says to use:
tf.keras.layers.CategoryEncoding(num_tokens=3, output_mode="one_hot")
Also see https://stackoverflow.com/a/69791457/841830 (the higher-voted answer there is from 2019, so applies to TensorFlow v1 I think). And searching for "tensorflow one-hot encoding" will bring up plenty of tutorials and examples.
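To make the comparison between the softmax output and a one-hot label concrete, here is a minimal NumPy sketch. It is not the TensorFlow layer mentioned above; the labels array and the use of np.eye are assumptions made purely for illustration, and the prediction values are the ones from the A/B/C example.

```python
import numpy as np

# Hypothetical integer labels for four samples (0 = A, 1 = B, 2 = C).
labels = np.array([2, 0, 1, 2])
num_classes = 3

# Row i of the identity matrix is the one-hot vector for class i.
one_hot = np.eye(num_classes)[labels]
print(one_hot.shape)  # (4, 3)

# Softmax output for the first sample, taken from the example in the text.
prediction = np.array([0.6, 0.1, 0.3])

# Per-class difference between the prediction and the one-hot target.
error = prediction - one_hot[0]
print(error)  # approximately [0.6, 0.1, -0.7]
```

After encoding, the labels have shape (None, 3), which is the shape the (None, 3) model output is compared against.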

The issue here was indeed due to the shape of my labels not matching the shape of the logits. Each row of logits has shape (3,) since it contains a float for the probability of each of the three classes that I wanted to predict. Each label originally had shape (1,) since it contained only one int.
To solve this, I used one-hot encoding, which turned every label into a vector of shape (3,), and this solved the problem. I used the keras.utils.to_categorical() function to do so.
from tensorflow.keras.utils import to_categorical
sentiment_types = ('negative', 'neutral', 'positive')
train_df['sentiment'] = train_df['sentiment'].astype('category')
test_df['sentiment'] = test_df['sentiment'].astype('category')
# Turning labels from strings to int
train_sentiment_cat = train_df['sentiment'].cat.codes
test_sentiment_cat = test_df['sentiment'].cat.codes
# One-hot encoding
train_y = to_categorical(train_sentiment_cat)
test_y = to_categorical(test_sentiment_cat)
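As a side note, one-hot labels are what a categorical cross-entropy expects, while a sparse categorical cross-entropy (Keras offers one under the name 'sparse_categorical_crossentropy') takes the integer codes directly. The NumPy sketch below is only an illustration with made-up probabilities; it shows that the two formulations give the same per-sample loss.

```python
import numpy as np

# Hypothetical softmax outputs for three samples over three classes.
probs = np.array([
    [0.7, 0.2, 0.1],
    [0.1, 0.8, 0.1],
    [0.2, 0.3, 0.5],
])
int_labels = np.array([0, 1, 2])   # integer class codes, shape (3,)
one_hot = np.eye(3)[int_labels]    # the same labels one-hot encoded, shape (3, 3)

# Categorical cross-entropy: uses one-hot labels.
categorical = -np.sum(one_hot * np.log(probs), axis=1)

# Sparse categorical cross-entropy: picks the probability of the true class by index.
sparse = -np.log(probs[np.arange(len(int_labels)), int_labels])

print(np.allclose(categorical, sparse))  # True
```

Either route should work as long as the label format matches the loss that the model is compiled with.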

Related

How can I fix the “TypeError: 'Tensor' object is not callable” error in Pytorch?

I am trying to compute a linear function of an image's pixels, followed by log softmax (it's for a classification task). I am not sure how to do this without getting errors. Here is the relevant code:
...
torch.nn.functional.nll_loss(output, target) # error happens here
...
def __init__(self):
    super(NetLin, self).__init__()
    self.in_out = torch.nn.Linear(28, 2)
def forward(self, input):
    out_sum = self.in_out(input)
    output = torch.nn.LogSoftmax(out_sum)
    return output
and the full error message I get is:
Traceback (most recent call last):
File "copy.py", line 98, in <module>
main()
File "copy.py", line 94, in main
train(args, net, device, train_loader, optimizer, epoch)
File "copy.py", line 21, in train
loss = torch.nn.functional.nll_loss(output, target)
File "/usr/local/lib/python3.7/site-packages/torch/nn/functional.py", line 2107, in nll_loss
dim = input.dim()
TypeError: 'Tensor' object is not callable
I have tried a few different solutions to this based on other answers online but they just result in different error messages. Clearly I am doing something fundamentally wrong here but I haven't used Pytorch before so I'm not sure what it is. Thank you
Edit:
My code is now:
def train(args, model, device, train_loader, optimizer, epoch):
    if args.net == 'lin':
        model = NetLin()
    model.train()
    loss = nn.NLLLoss()
    for batch_idx, (data, target) in enumerate(train_loader):
        data.requires_grad = True
        data, target = data.to(device), target.to(device)
        optimizer.zero_grad()
        output = loss(model(input), target)
        F.nll_loss(output, target)
        loss.backward()
        optimizer.step()
        if batch_idx % 100 == 0:
            print('Train Epoch: {} [{}/{} ({:.0f}%)]\tLoss: {:.6f}'.format(
                epoch, batch_idx * len(data), len(train_loader.dataset),
                100. * batch_idx / len(train_loader), loss.item()))
class NetLin(nn.Module):
    def __init__(self):
        super(NetLin, self).__init__()
        self.in_out = torch.nn.Linear(28 * 28, 2)
    def forward(self, input):
        input = input.view(-1, 28 * 28)
        out_sum = self.in_out(input)
        output = torch.nn.LogSoftmax(out_sum, dim=1)
        return output
and my error message is now:
Traceback (most recent call last):
File "copy.py", line 102, in <module>
main()
File "copy.py", line 98, in main
train(args, net, device, train_loader, optimizer, epoch)
File "copy.py", line 24, in train
output = loss(model(input), target)
File "/usr/local/lib/python3.7/site-packages/torch/nn/modules/module.py", line 550, in __call__
result = self.forward(*input, **kwargs)
File "/Users/.../copy.py", line 15, in forward
input = input.view(-1, 28 * 28)
AttributeError: 'builtin_function_or_method' object has no attribute 'view'
As you can kind of see the data and target are read in from a file (they are from KMNIST actually) so I can't control their format exactly, but I do know the image sizes are all [1,28,28], i.e. a 28*28 greyscale image. Also the batch size is 64 in case that matters.
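Did you remember to set your model to training mode in your train loop with model.train()? Also, nll_loss takes two tensors: the model output (log-probabilities) and the target class indices. Note that torch.nn.LogSoftmax is a module class, so torch.nn.LogSoftmax(out_sum) builds a layer object instead of computing anything. nll_loss then receives that object rather than a tensor, which is what leads to the "'Tensor' object is not callable" error.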
So you would have something like this:
model = NetLin()
model.train()
loss = nn.NLLLoss()
input = torch.randn(7, 4, requires_grad=True) # your input image (tensor)
target = torch.tensor([1, 0]) # image label for image belonging to first class
output = loss(model(input), target)
I am also a bit concerned about your self.in_out = torch.nn.Linear(28, 2). This says that your linear layer is expecting 28 features, implying that your input images are either 7x4, 14x2 or 28x1, which doesn't seem right in my opinion? Aren't you using images of size 28x28 (very typical size in this context)? In which case, you would have your linear layer modified as self.in_out = torch.nn.Linear(28*28, 2), and your forward pass will have to be modified as follows:
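def forward(self, input):
    input = input.view(-1, 28*28)
    out_sum = self.in_out(input)
    # Apply log-softmax as a function; torch.nn.LogSoftmax(out_sum) would only construct a layer object
    output = torch.nn.functional.log_softmax(out_sum, dim=1)
    return output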
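To see what the linear layer, log-softmax and NLL loss compute together, and which shapes each step expects, here is a rough NumPy stand-in. It is not PyTorch code; the random batch, the targets and the weight initialisation are all assumptions for illustration, with the batch size of 64 and the two output classes taken from the question.

```python
import numpy as np

rng = np.random.default_rng(0)

batch = rng.normal(size=(64, 1, 28, 28))   # stand-in for one batch of 28x28 greyscale images
targets = rng.integers(0, 2, size=64)      # hypothetical class indices (0 or 1)

weights = rng.normal(scale=0.01, size=(28 * 28, 2))
bias = np.zeros(2)

flat = batch.reshape(-1, 28 * 28)          # same role as input.view(-1, 28*28)
scores = flat @ weights + bias             # linear layer output, shape (64, 2)

# Log-softmax along the class axis, shifted by the row max for numerical stability.
shifted = scores - scores.max(axis=1, keepdims=True)
log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))

# Negative log-likelihood: mean of -log p(correct class) over the batch.
loss = -log_probs[np.arange(len(targets)), targets].mean()
print(log_probs.shape, loss)
```

The point to take away is that the loss needs a (batch, classes) array of log-probabilities and a (batch,) array of integer targets.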

Layer concatenation with Keras

I'm planning to have the following design:
However, my code doesn't seem to work:
import numpy as np
from keras.models import Model
from keras.layers import Dense, Input, Concatenate
from keras import optimizers
trainX1 = np.array([[1,2],[3,4],[5,6],[7,8]]) # fake training data
trainY1 = np.array([[1],[2],[3],[4]]) # fake label
trainX2 = np.array([[2,3],[4,5],[6,7]])
trainY2 = np.array([[1],[2],[3]])
trainX3 = np.array([[0,1],[2,3]])
trainY3 = np.array([[1],[2]])
numFeatures = 2
trainXList = [trainX1, trainX2, trainX3]
trainYStack = np.vstack((trainY1,trainY2,trainY3))
inputList = []
modelList = []
for i,_ in enumerate(trainXList):
    tempInput= Input(shape = (numFeatures,))
    m = Dense(10, activation='tanh')(tempInput)
    inputList.append(tempInput)
    modelList.append(m)
mAll = Concatenate()(modelList)
out = Dense(1, activation='tanh')(mAll)
model = Model(inputs=inputList, outputs=out)
rmsp = optimizers.rmsprop(lr=0.00001)
model.compile(optimizer=rmsp,loss='mse', dropout = 0.1)
model.fit(trainXList, trainYStack, epochs = 1, verbose=0)
The error message says that my input data sets do not have the same shape, but after I padded my training sets so that all 3 have 4 samples, I still get errors saying the dimension is not right. May I know how I can design this network properly? Thanks!
p.s. Here is the error message before padding:
ValueError: All input arrays (x) should have the same number of samples. Got array shapes: [(4, 2), (3, 2), (2, 2)]
Here is the error message after padding (happens on the last line of code):
ValueError: Input arrays should have the same number of samples as target arrays. Found 4 input samples and 12 target samples.
Your input shape does not match the data you pass in.
You give each input a size of numFeatures, but your arrays are two-dimensional and differ in their first dimension: (4, 2), (3, 2) and (2, 2). I am not sure about your problem, but the number of samples and the number of features seem to be reversed.
tempInput= Input(shape = (numFeatures,))
Furthermore, your y also looks odd. Usually X has shape (number of samples, number of features) and y has shape (number of samples, number of labels), so stacking the three label arrays into 12 rows cannot match inputs with 4 samples each.
Use model.summary() to see what your network looks like.
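Both error messages quoted in the question come down to sample counts: every input array must have the same number of rows, and the target array must have that same number of rows. A small NumPy check along these lines can catch the problem before calling fit. The helper name check_multi_input and the zero-filled arrays are hypothetical, not part of Keras.

```python
import numpy as np

def check_multi_input(x_list, y):
    """Return the shared sample count, or raise if the arrays disagree."""
    counts = {x.shape[0] for x in x_list}
    if len(counts) != 1:
        raise ValueError(f"inputs disagree on sample count: {sorted(counts)}")
    n = counts.pop()
    if y.shape[0] != n:
        raise ValueError(f"{n} input samples but {y.shape[0]} target samples")
    return n

# Three hypothetical inputs, each padded to 4 samples with 2 features.
x_list = [np.zeros((4, 2)), np.zeros((4, 2)), np.zeros((4, 2))]

y_stacked = np.zeros((12, 1))   # like np.vstack of three label sets: too many rows
y_single = np.zeros((4, 1))     # one target row per sample

n = check_multi_input(x_list, y_single)
print(n)  # 4
```

Calling check_multi_input(x_list, y_stacked) raises a ValueError, which mirrors the "4 input samples and 12 target samples" message.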

Issue in LSTM Input Dimensions in Keras

I am trying to implement a multi-input LSTM model using keras. The code is as follows:
# data_1 -> shape (1150,50)
# data_2 -> shape (1150,50)
# y_train -> shape (1150,50)
input_1 = Input(shape=data_1.shape)
LSTM_1 = LSTM(100)(input_1)
input_2 = Input(shape=data_2.shape)
LSTM_2 = LSTM(100)(input_2)
concat = Concatenate(axis=-1)
x = concat([LSTM_1, LSTM_2])
dense_layer = Dense(1, activation='sigmoid')(x)
model = keras.models.Model(inputs=[input_1, input_2], outputs=[dense_layer])
model.compile(loss='binary_crossentropy',
              optimizer='adam',
              metrics=['acc'])
model.fit([data_1, data_2], y_train, epochs=10)
When I run this code, I get a ValueError:
ValueError: Error when checking model input: expected input_1 to have 3 dimensions, but got array with shape (1150, 50)
Does anyone have a solution to this problem?
Use data_1 = np.expand_dims(data_1, axis=2) before you define the model. LSTM expects inputs with dimensions (batch_size, timesteps, features). In your case I am guessing you have 1 feature, 50 time steps and 1150 samples, so you need to add a dimension at the end of your array.
This needs to be done before you define the model. Otherwise, when you set input_1 = Input(shape=data_1.shape), you are telling Keras that your input has 1150 timesteps and 50 features, so it will expect inputs of shape (None, 1150, 50) (the None stands for "any dimension will be accepted"). Since the shape argument excludes the batch dimension, pass shape=data_1.shape[1:], i.e. (50, 1), after expanding.
The same holds for input_2.
Hope this helps.
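The reshaping step can be checked with NumPy alone. This is a minimal sketch using a zero-filled array of the size given in the question; the variable input_shape is introduced here only for illustration.

```python
import numpy as np

data_1 = np.zeros((1150, 50))            # 1150 samples, 50 time steps

data_1 = np.expand_dims(data_1, axis=2)  # add a trailing feature axis
print(data_1.shape)                      # (1150, 50, 1)

# The shape given to a Keras Input excludes the batch axis, so drop the first entry.
input_shape = data_1.shape[1:]
print(input_shape)                       # (50, 1)
```

The resulting array has the three dimensions (samples, timesteps, features) that the error message asks for.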

Invalid Argument error Expected begin[0] = 0

I am currently developing a neural network, and I got all the data and I got the code to the point that an image is being fed to the CNN for training. However, in the training process, for the first image an error pops up with the following code.
def convolutional_neural_network(x):
    weights = {'W_conv1':tf.Variable(tf.random_normal([5,5,1,32])),
               'W_conv2':tf.Variable(tf.random_normal([5,5,32,64])),
               'W_fc':tf.Variable(tf.random_normal([7*7*64,1024])),
               'out':tf.Variable(tf.random_normal([1024, n_classes]))}
    biases = {'b_conv1':tf.Variable(tf.random_normal([32])),
              'b_conv2':tf.Variable(tf.random_normal([64])),
              'b_fc':tf.Variable(tf.random_normal([1024])),
              'out':tf.Variable(tf.random_normal([n_classes]))}
    x = tf.reshape(x, shape=[-1, 28, 28, 1])
    conv1 = tf.nn.relu(conv2d(x, weights['W_conv1']) + biases['b_conv1'])
    conv1 = maxpool2d(conv1)
    conv2 = tf.nn.relu(conv2d(conv1, weights['W_conv2']) + biases['b_conv2'])
    conv2 = maxpool2d(conv2)
    fc = tf.reshape(conv2,[-1, 7*7*64])
    fc = tf.nn.relu(tf.matmul(fc, weights['W_fc'])+biases['b_fc'])
    fc = tf.nn.dropout(fc, keep_rate)
    output = tf.matmul(fc, weights['out'])+biases['out']
    print("hi")
    return output
def shuffle_unison(images, labels):
    shuffleLabel = []
    shuffleImage = []
    shuffleVector = []
    for i in range(0, len(images)-1):
        shuffleVector.append(i)
    random.shuffle(shuffleLabel)
    for i in range(0, len(shuffleVector)-1):
        shuffleImage.append(images[shuffleVector[i]])
        shuffleLabel.append(labels[shuffleVector[i]])
    return shuffleImage, shuffleLabel
def train_neural_network(x):
    prediction = convolutional_neural_network(x)
    cost = tf.reduce_mean( tf.nn.softmax_cross_entropy_with_logits(prediction,y) )
    optimizer = tf.train.AdamOptimizer().minimize(cost)
    hm_epochs = 10
    # step 4: Batching
    with tf.Session() as sess:
        init = tf.initialize_all_variables()
        sess.run(init)
        tf.train.start_queue_runners()
        #array of strings and corresponding values
        image_list, label_list = readImageLables()
        for epoch in range(hm_epochs):
            epoch_loss = 0
            #shuffle every epoch
            shuffle_image_list, shuffle_label_list = shuffle_unison(image_list, label_list)
            sampleList = ['/home/sciencefair/Desktop/OrchardData/MachineLearningTesting/RottenOranges/result1.jpg']
            for i in range(0,7683):
                #filename_queue = tf.train.string_input_producer(sampleList)
                file_contents = tf.read_file(shuffle_image_list[i])
                image = tf.image.decode_jpeg(file_contents, channels=1)
                resized_image = tf.image.resize_images(image, [28,28])
                #image_batch, label_batch = tf.train.batch([resized_image, shuffle_label_list[i]], batch_size=batch_size) # does train.batch take individual images or final tensors
                #if(i>batch_size):
                #print(label_batch.eval())
                a = tf.reshape(resized_image,[1, 784])
                print(a.eval())
                _, c = sess.run([optimizer, cost], feed_dict={x: tf.reshape(resized_image,[1, 784]).eval(), y: shuffle_label_list[i]})
                epoch_loss += c
                print("ok")
            print('Epoch', epoch, 'completed out of',hm_epochs,'loss:',epoch_loss)
        sess.close()
The stack trace looked like this
Caused by op 'Slice_1', defined at:
File "revisednet.py", line 128, in <module>
train_neural_network(x)
File "revisednet.py", line 87, in train_neural_network
cost = tf.reduce_mean( tf.nn.softmax_cross_entropy_with_logits(prediction,y) )
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/ops/nn_ops.py", line 670, in softmax_cross_entropy_with_logits
labels = _flatten_outer_dims(labels)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/ops/nn_ops.py", line 472, in _flatten_outer_dims
array_ops.shape(logits), [math_ops.sub(rank, 1)], [1])
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/ops/array_ops.py", line 431, in slice
return gen_array_ops._slice(input_, begin, size, name=name)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/ops/gen_array_ops.py", line 2234, in _slice
name=name)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/op_def_library.py", line 749, in apply_op
op_def=op_def)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/ops.py", line 2380, in create_op
original_op=self._default_original_op, op_def=op_def)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/ops.py", line 1298, in __init__
self._traceback = _extract_stack()
InvalidArgumentError (see above for traceback): Expected begin[0] == 0 (got -1) and size[0] == 0 (got 1) when input.dim_size(0) == 0
[[Node: Slice_1 = Slice[Index=DT_INT32, T=DT_INT32, _device="/job:localhost/replica:0/task:0/cpu:0"](Shape_2, Slice_1/begin, Slice_1/size)]]
This error seems to originate from the data causing some conflict with the softmax function. However, I have absolutely no idea what is causing this problem.
I followed this tutorial by Sentdex, "First pass through Data w/ 3D ConvNet", to build a 3D CNN and got the same error as yours.
This error occurs because the dimension of the label vector of my input data (for example, the first label vector in Sentdex's train data is at train_data[0][1]) should equal n_classes, which is 2 in the tutorial.
In my failed attempt, I used a single binary value, 0 or 1, as the label, so its dimension was 1 where it should have been 2. The tf.nn.softmax_cross_entropy_with_logits() function was therefore confused by the wrong size of the label vector.
Try expanding your label vectors' dimension to be equal to your n_classes.
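For a single image fed as a batch of one, the label then needs the same [1, n_classes] layout as the network output. A minimal NumPy sketch, assuming n_classes is 2 and the label is stored as a scalar 0 or 1 (both values are assumptions taken from the tutorial setup described above):

```python
import numpy as np

n_classes = 2
label = 1                                   # hypothetical scalar label: 0 or 1

# One-hot vector of length n_classes, then a leading batch axis of size 1.
label_vec = np.zeros(n_classes, dtype=np.float32)
label_vec[label] = 1.0
label_batch = label_vec[np.newaxis, :]

print(label_batch.shape)  # (1, 2)
```

An array like label_batch is what would be passed for y in the feed_dict, in place of the bare scalar.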

keras model fit_generator ValueError: Error when checking model target: expected cropping2d_4 to have 4 dimensions, but got array with shape (32, 1)

I'm trying to use keras model.fit_generator() to fit a model, below is my definition of the generator:
from sklearn.utils import shuffle
IMG_PATH_PREFIX = "./data/IMG/"
def generator(samples, batch_size=64):
    num_samples = len(samples)
    while 1: # Loop forever so the generator never terminates
        shuffle(samples)
        for offset in range(0, num_samples, batch_size):
            batch_samples = samples[offset:offset+batch_size]
            images = []
            angles = []
            for batch_sample in batch_samples:
                name = IMG_PATH_PREFIX + batch_sample[0].split('/')[-1]
                center_image = cv2.imread(name)
                center_angle = float(batch_sample[3])
                images.append(center_image)
                angles.append(center_angle)
            X_train = np.array(images)
            y_train = np.array(angles)
            #X_train = np.expand_dims(X_train, axis=0)
            #y_train = np.expand_dims(y_train, axis=1)
            print("X_train shape: ", X_train.shape, " y_train shape:", y_train.shape)
            #print("X train: ", X_train)
            yield X_train, y_train
train_generator = generator(train_samples, batch_size = 32)
validation_generator = generator(validation_samples, batch_size = 32)
Here the output shape is:
X_train shape: (32, 160, 320, 3) y_train shape: (32,)
The model fit code is:
model = Sequential()
#cropping layer
model.add(Cropping2D(cropping=((50,20), (1,1)), input_shape=(160,320,3), dim_ordering='tf'))
model.compile(loss = "mse", optimizer="adam")
model.fit_generator(train_generator, samples_per_epoch= len(train_samples), validation_data=validation_generator, nb_val_samples=len(validation_samples), nb_epoch=3)
Then I get the error message:
ValueError: Error when checking model target: expected cropping2d_6 to have 4 dimensions, but got array with shape (32, 1)
Could someone help me understand what the issue is?
The big question here is: do you know what you are trying to do?
1) According to the documentation for Cropping2D, the input is a 4D tensor and the output is ALSO a 4D tensor. Your target is a 2D tensor of shape (batch_size, 1). So of course, when Keras tries to compute the error between the output, which has 3 dimensions (without the batch dimension), and the target, which has 1 dimension (without the batch dimension), it cannot make sense of that. Outputs and targets must have the same dimensions.
2) Do you know what Cropping2D is actually doing? It crops your images, removing values at the beginning and end of the cropped dimensions. In your case you are outputting images of shape (90, 318, 3). This is not a prediction, and there are no weights to train in this layer, so there is no reason to fit the "model". Your model just crops images; no training is needed for that.
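The output size of the cropping step can be worked out with plain array slicing. This NumPy sketch is not the Keras layer itself; it assumes a channels-last image of the size used in the question and applies the same ((50, 20), (1, 1)) crop by hand.

```python
import numpy as np

image = np.zeros((160, 320, 3))     # one input image, channels last

top, bottom = 50, 20                # rows removed, from cropping=((50, 20), (1, 1))
left, right = 1, 1                  # columns removed

cropped = image[top:image.shape[0] - bottom, left:image.shape[1] - right, :]
print(cropped.shape)  # (90, 318, 3)
```

The result is still an image, whereas the generator yields one angle per sample, so the two cannot be compared by a loss until further layers reduce the output to a single value.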
