I want to use PyTorch's CrossEntropyLoss, but somehow my code only works with batch size 2, so I am assuming there is something wrong with the shapes of the target and the output.
I get the following error:
ValueError: Expected target size (50, 2), got torch.Size([50, 3])
My target size is (N=50, batch size=3) and the output of my model is (N=50, batch size=3, number of classes=2). Before the output layer my shape is (N=50, batch size=3, dimensions=64).
How do I need to change the shapes so that CrossEntropyLoss works?
Without further information about your model, here's what I would do. You have a many-to-many RNN which outputs (seq_len, batch_size, nb_classes) and the target is (seq_len, batch_size). The nn.CrossEntropyLoss module can take additional dimensions (batch_size, nb_classes, d1, d2, ..., dK) as input.
You could make it work by permuting the axes so that the output tensor has shape (batch_size, nb_classes, seq_len). This should make it happen:
output = output.permute(1, 2, 0)
Additionally, your target will also have to change to be (batch_size, seq_len):
target = target.permute(1, 0)
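For reference, here is a minimal self-contained sketch (using the sizes from the question) showing that nn.CrossEntropyLoss accepts these permuted shapes:
import torch
import torch.nn as nn

seq_len, batch_size, nb_classes = 50, 3, 2
output = torch.randn(seq_len, batch_size, nb_classes)       # e.g. RNN output
target = torch.randint(nb_classes, (seq_len, batch_size))   # class indices

criterion = nn.CrossEntropyLoss()
loss = criterion(output.permute(1, 2, 0),   # (batch_size, nb_classes, seq_len)
                 target.permute(1, 0))      # (batch_size, seq_len)
print(loss.item())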
Problem Statement: I have an image, and a pixel of the image can belong to only (either) one of 'Band5', 'Band6', or 'Band7' (see below for details). Hence, I have a PyTorch multi-class problem, but I am unable to understand how to set the targets, which need to be in the form [batch, w, h].
My dataloader returns two values:
x = chips.loc[:, :, :, self.input_bands]
y = chips.loc[:, :, :, self.output_bands]
x = x.transpose('chip','channel','x','y')
y_ohe = y.transpose('chip','channel','x','y')
Also, I have defined:
input_bands = ['Band1','Band2', 'Band3', 'Band3', 'Band4'] # input classes
output_bands = ['Band5','Band6', 'Band7'] #target classes
model = ModelName(num_classes = 3, depth=default_depth, in_channels=5, merge_mode='concat').to(device)
loss_new = nn.CrossEntropyLoss()
In my training function:
#get values from dataloader
X = normalize_zero_to_one(X) #input
y = normalize_zero_to_one(y) #target
images = Variable(torch.from_numpy(X)).to(device) # [batch, channel, H, W]
masks = Variable(torch.from_numpy(y)).to(device)
optim.zero_grad()
outputs = model(images)
loss = loss_new(outputs, masks) # (preds, target)
loss.backward()
optim.step() # Update weights
I know that the target (here masks) should be [batch_size, w, h]. However, it is currently [batch_size, channels, w, h].
I read a lot of posts, including 1 and 2, and they say the target should only contain the target class indices. I don't understand how I can combine the indices of three classes and still set the target as [batch_size, w, h].
Right now, I get the error:
RuntimeError: only batches of spatial targets supported (3D tensors) but got targets of dimension: 4
To the best of my understanding, I don't need to do any one-hot encoding. Similar errors and explanations I found on the internet are here:
Reference 1
Reference 2
Reference 3
Reference 4
Any help will be appreciated! Thank you.
If I understand correctly, your current "target" is [batch_size, channels, w, h] with channels==3 as you have three possible targets.
What do the values in your target represent? You basically have a 3-vector target for each pixel: are these the expected class probabilities? Are they one-hot vectors indicating the correct "band"?
If so, you can get the target indices by simply taking the argmax along the target channel dimension:
proper_target = torch.argmax(masks, dim=1) # make sure keepdim=False
loss = loss_new(outputs, proper_target)
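If the masks really are one-hot along the channel dimension, here is a small self-contained sketch of that conversion (the batch size and spatial sizes below are made up):
import torch
import torch.nn as nn
import torch.nn.functional as F

batch_size, num_classes, H, W = 4, 3, 32, 32
outputs = torch.randn(batch_size, num_classes, H, W)                # model logits
labels = torch.randint(num_classes, (batch_size, H, W))             # ground-truth class ids
masks = F.one_hot(labels, num_classes).permute(0, 3, 1, 2).float()  # [batch, 3, H, W] one-hot

proper_target = torch.argmax(masks, dim=1)                          # back to [batch, H, W] indices
loss = nn.CrossEntropyLoss()(outputs, proper_target)
print(proper_target.shape, loss.item())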
Hi, I'm trying to use a CNN on the Fashion-MNIST data.
There are 5200 images of 28*28 in grayscale, so I used a 2D CNN.
Here is my code:
fashion_mnist=keras.datasets.fashion_mnist
(xtrain,ytrain),(xtest,ytest)=fashion_mnist.load_data()
xvalid,xtrain=xtrain[:5000]/255.0,xtrain[5000:]/255.0
yvalid,ytrain=ytrain[:5000],ytrain[5000:]
defaultcon=partial(keras.layers.Conv2D,kernel_size=3,activation='relu',padding="SAME")
model=keras.models.Sequential([
defaultcon(filters=64,kernel_size=7,input_shape=[28,28,1]),
keras.layers.MaxPooling2D(pool_size=2),
defaultcon(filters=128),
defaultcon(filters=128),
keras.layers.MaxPooling2D(pool_size=2),
defaultcon(filters=256),
defaultcon(filters=256),
keras.layers.MaxPooling2D(pool_size=2),
keras.layers.Flatten(),
keras.layers.Dense(units=128,activation='relu'),
keras.layers.Dropout(0.5),
keras.layers.Dense(units=64,activation='relu'),
keras.layers.Dropout(0.5),
keras.layers.Dense(units=10,activation='softmax'),
])
model.compile(optimizer='sgd',loss="sparse_categorical_crossentropy", metrics=["accuracy"])
history=model.fit(xtrain,ytrain,epochs=30,validation_data=(xvalid,yvalid))
But I get: Error when checking input: expected conv2d_27_input to have 4 dimensions, but got array with shape (55000, 28, 28)
How is it expected to get 4D input?
In the input layer:
defaultcon(filters=64,kernel_size=7,input_shape=[28,28,1])
you defined the shape (28, 28, 1), so for a task with m samples the model will expect data with dimensions (m, 28, 28, 1), which is 4D.
Apparently your inputs are in the shape of (m, 28, 28), where m is the number of samples, i.e. the channel axis is missing from the data rather than from the model. So you can solve your problem by adding that axis to the arrays before fitting (keep input_shape=[28,28,1] as it is, since Conv2D needs a channel dimension):
xtrain = xtrain.reshape(-1, 28, 28, 1)
xvalid = xvalid.reshape(-1, 28, 28, 1)
and hopefully you will be all set.
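For completeness, the same fix can be written with numpy's expand_dims (equivalent to the reshape above, applied to the question's arrays), plus a quick shape check:
import numpy as np

xtrain = np.expand_dims(xtrain, axis=-1)   # (55000, 28, 28) -> (55000, 28, 28, 1)
xvalid = np.expand_dims(xvalid, axis=-1)   # (5000, 28, 28)  -> (5000, 28, 28, 1)
print(xtrain.shape, xvalid.shape)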
My current code using sparse_softmax_cross_entropy works fine.
loss_normal = (
tf.reduce_mean(tf.losses
.sparse_softmax_cross_entropy(labels=labels,
logits=logits,
weights=class_weights))
)
However, when I try to use the hinge_loss:
loss_normal = (
tf.reduce_mean(tf.losses
.hinge_loss(labels=labels,
logits=logits,
weights=class_weights))
)
It reported an error saying:
ValueError: Shapes (1024, 2) and (1024,) are incompatible
The error seems to be originated from this function in the losses_impl.py file:
with ops.name_scope(scope, "hinge_loss", (logits, labels)) as scope:
...
logits.get_shape().assert_is_compatible_with(labels.get_shape())
...
I modified my code as below to just extract 1 column of the logits tensor:
loss_normal = (
tf.reduce_mean(tf.losses
.hinge_loss(labels=labels,
logits=logits[:,1:],
weights=class_weights
))
)
But it still reports a similar error:
ValueError: Shapes (1024, 1) and (1024,) are incompatible.
Can someone please help point out why my code works fine with sparse_softmax_cross_entropy loss but not hinge_loss?
The tensor labels has shape [1024] and the tensor logits has shape [1024, 2]. This works fine for tf.nn.sparse_softmax_cross_entropy_with_logits:
labels: Tensor of shape [d_0, d_1, ..., d_{r-1}] (where r is rank of
labels and result) and dtype int32 or int64. Each entry in labels must
be an index in [0, num_classes). Other values will raise an exception
when this op is run on CPU, and return NaN for corresponding loss and
gradient rows on GPU.
logits: Unscaled log probabilities of shape
[d_0, d_1, ..., d_{r-1}, num_classes] and dtype float32 or float64.
But tf.hinge_loss requirements are different:
labels: The ground truth output tensor. Its shape should match the
shape of logits. The values of the tensor are expected to be 0.0 or
1.0.
logits: The logits, a float tensor.
You can resolve this in two ways:
Reshape the labels to [1024, 1] and use just one column of logits, like you did - logits[:,1:]:
labels = tf.reshape(labels, [-1, 1])
hinge_loss = (
tf.reduce_mean(tf.losses.hinge_loss(labels=labels,
logits=logits[:,1:],
weights=class_weights))
)
I think you'll also need to reshape the class_weights the same way.
Use all of the learned logits features via tf.reduce_sum, which will produce a flat (1024,) tensor:
logits = tf.reduce_sum(logits, axis=1)
hinge_loss = (
tf.reduce_mean(tf.losses.hinge_loss(labels=labels,
logits=logits,
weights=class_weights))
)
This way you don't need to reshape labels or class_weights.
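For intuition, here is a small numpy sketch (not the TF API, and ignoring weights and TF's reduction details) of what hinge_loss computes once the labels are reshaped to a (1024, 1) column matching the selected logits column:
import numpy as np

logits = np.random.randn(1024, 2).astype(np.float32)
labels = np.random.randint(0, 2, size=(1024,))          # class ids 0/1, shape (1024,)

labels_col = labels.reshape(-1, 1).astype(np.float32)   # (1024, 1), values 0.0 or 1.0
scores = logits[:, 1:]                                  # (1024, 1), one logits column
signed = 2.0 * labels_col - 1.0                         # map {0, 1} -> {-1, +1}
hinge = np.maximum(0.0, 1.0 - signed * scores).mean()   # elementwise, so shapes must match
print(hinge)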
After looking at the following gist and doing some basic tests, I am trying to create an NER system using an LSTM in Keras. I am using a generator and calling fit_generator.
Here is my basic keras model:
model = Sequential([
Embedding(input_dim=max_features, output_dim=embedding_size, input_length=maxlen, mask_zero=True),
Bidirectional(LSTM(hidden_size, return_sequences=True)),
TimeDistributed(Dense(out_size)),
Activation('softmax')
])
model.compile(loss='binary_crossentropy', optimizer='adam')
My input dimension seem right:
>>> generator = generate()
>>> i,t = next(generator)
>>> print( "Inputs: {}".format(model.input_shape))
>>> print( "Outputs: {}".format(model.output_shape))
>>> print( "Actual input: {}".format(i.shape))
Inputs: (None, 3949)
Outputs: (None, 3949, 1)
Actual input: (45, 3949)
However when I call:
model.fit_generator(generator, steps_per_epoch=STEPS_PER_EPOCH, epochs=EPOCHS)
I seem to get the following error:
ValueError:
Error when checking target:
expected activation_1 to have 3 dimensions,
but got array with shape (45, 3949)
I have seen a few other examples of similar issues, which leads me to believe I need to Flatten() my inputs before the Activation(), but if I do so I get the following error:
Layer flatten_1 does not support masking,
but was passed an input_mask:
Tensor("embedding_37/NotEqual:0", shape=(?, 3949), dtype=bool)
As per previous questions, my generator is functionally equivalent to:
def generate():
maxlen=3949
while True:
inputs = np.random.randint(55604, size=maxlen)
targets = np.random.randint(2, size=maxlen)
yield inputs, targets
I am not assuming that I need to Flatten, and I am open to additional suggestions.
You either need to return only the last element of the sequence (return_sequences=False):
model = Sequential([
Embedding(input_dim=max_features, output_dim=embedding_size, input_length=maxlen, mask_zero=True),
Bidirectional(LSTM(hidden_size)),
Dense(out_size),
Activation('softmax')
])
Or remove the masking (mask_zero=False) to be able to use Flatten:
model = Sequential([
Embedding(input_dim=max_features, output_dim=embedding_size, input_length=maxlen),
Bidirectional(LSTM(hidden_size, return_sequences=True)),
TimeDistributed(Dense(out_size)),
Flatten(),
Activation('softmax')
])
*Be careful that the output will be out_size x maxlen.
And I think you want the first option.
Edit 1: Looking at the example diagram, it makes a prediction at every timestep, so the softmax activation also needs to be TimeDistributed. The target dimensions should be (None, maxlen, out_size):
model = Sequential([
Embedding(input_dim=max_features, output_dim=embedding_size, input_length=maxlen, mask_zero=True),
Bidirectional(LSTM(hidden_size, return_sequences=True)),
TimeDistributed(Dense(out_size)),
TimeDistributed(Activation('softmax'))
])
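To match that target shape, the generator also has to yield batched, 3D targets. A minimal sketch, reusing the (assumed) sizes from the question:
import numpy as np

def generate(batch_size=45, maxlen=3949, max_features=55604, out_size=1):
    while True:
        inputs = np.random.randint(max_features, size=(batch_size, maxlen))
        targets = np.random.randint(2, size=(batch_size, maxlen, out_size))
        yield inputs, targets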
I have trained a classifier and I now want to pass any single image through.
I'm using the keras library with Tensorflow as the backend.
I'm getting an error I can't seem to get past.
img_path = '/path/to/my/image.jpg'
import numpy as np
from keras.preprocessing import image
x = image.load_img(img_path, target_size=(250, 250))
x = image.img_to_array(x)
x = np.expand_dims(x, axis=0)
preds = model.predict(x)
Do I need to reshape my data to have None as the first dimension? I'm confused about why TensorFlow would expect None as the first dimension.
Error when checking : expected convolution2d_input_1 to have shape (None, 250, 250, 3) but got array with shape (1, 3, 250, 250)
I'm wondering if there has been an issue with the architecture of my trained model?
Edit: if I call model.summary(), it gives convolution2d_input_1 as...
Edit: I did play around with the suggestion below, but used numpy to transpose instead of tf. I still seem to be hitting the same issue!
None matches any number. Usually, when you pass some data to a model, you are expected to pass a tensor of dimensions None x data_size, meaning the first dimension can be anything and denotes the batch size. In your case, the problem is that you pass an array laid out as 3 x 250 x 250 (channels first), while 250 x 250 x 3 (channels last) is expected. Try:
x = image.load_img(img_path, target_size=(250, 250))
x = image.img_to_array(x)
x_trans = tf.transpose(x, perm=[1, 2, 0])
x_expanded = np.expand_dims(x_trans, axis=0)
preds = model.predict(x_expanded)
OK, so using feedback from Sygi I think I have half solved it.
The error was actually telling me I needed to pass in my dimensions as [1, 250, 250, 3], so that was an easy fix; I must say I'm not sure why TF is expecting the dimensions in this order, as looking at the docs it doesn't seem right, so more research is required here.
Moving ahead, I'm not sure transpose is the way to go, as if I use a different input image the dimensions may not be in the same order, meaning the transpose doesn't work properly.
Instead of transpose I'm probably going to call x_reshape = img.reshape((1, 250, 250, 3)), depending on what I find out about dimension order in reshaping for TF.
Thanks for the hints, Sygi :)
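For what it's worth, here is a small hedged sketch of the batch preparation discussed above; which branch you hit depends on what img_to_array returns on this particular Keras setup (an assumption, not something verified here):
import numpy as np
from keras.preprocessing import image

arr = image.img_to_array(image.load_img(img_path, target_size=(250, 250)))
if arr.shape == (250, 250, 3):                              # already channels-last
    batch = np.expand_dims(arr, axis=0)                     # (1, 250, 250, 3)
else:                                                       # channels-first (3, 250, 250)
    batch = np.expand_dims(arr.transpose(1, 2, 0), axis=0)  # move channels to the end
preds = model.predict(batch)
Note that calling reshape on a channels-first array would keep the values but scramble their layout, so transposing (or fixing image_data_format in keras.json) is the safer route.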