How can I create a deeplearning4j Lambda Layer taking the mean over one dimension for a Keras model imported from Python? - deeplearning4j

I am trying to load a Keras model trained in Python into Java using deeplearning4j. The model contains a Lambda layer that takes the average over one dimension. I found out that I have to define the Lambda layer as a class so that deeplearning4j knows what to expect. However, I need help with correctly setting the output dimensions of that layer in Java.
I have tried to construct an InputType with the shape I want. However, it is an abstract class, and the derived classes have no constructor that allows setting the shape. I also didn't find any resize options.
The only thing I found was inferInputTypes, but it didn't help: it again returns an array, while I need a single type with 2 dimensions:
long[] shape = inputType.getShape();
INDArray inputShapeA = Nd4j.zeros(shape[0]);
INDArray inputShapeB = Nd4j.zeros(shape[2]);
INDArray[] inputShape = {inputShapeA,inputShapeB};
InputType[] returnTypes = InputType.InputTypeFeedForward.inferInputTypes(inputShape);
from keras import backend as K
from keras.layers import Input, Lambda, Dense
from keras.models import Model
from keras.optimizers import Adam

nb_features = 2 + 20 + 100 # (#temporal, #user, #doc)
##### Main part #####
inputs = Input(shape=(None, nb_features), name="Capture_Input")
##### Classifier #####
z = Lambda(lambda x: K.mean(x, axis=1), output_shape=lambda s: (s[0], s[2]), name="Lambda_Mean_Layer")(inputs) # (None, nb_score)
outputs = Dense(1, activation='tanh', name="classification_Layer")(z)
##### Model #####
model = Model(input=[inputs], output=outputs, name="minimal_Import_Error_Model")
##### Compile #####
adam = Adam(lr=0.0001, beta_1=0.9, beta_2=0.999, epsilon=1e-08, decay=0.0)
model.compile(optimizer=adam, loss='binary_crossentropy')
# training; could be done with random values, as the error is independent of the actual classification performance
model.summary()
model.save("model_name.h5")
Model Summary from Python, this is what I expected:
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
Capture_Input (InputLayer)   (None, None, 122)         0
_________________________________________________________________
Lambda_Mean_Layer (Lambda)   (None, 122)               0
_________________________________________________________________
classification_Layer (Dense) (None, 1)                 123
=================================================================
Total params: 123
Trainable params: 123
Non-trainable params: 0
_________________________________________________________________
Java Code defining Lambda Layer (Error source), loading Model and getting random predictions.
import org.apache.log4j.BasicConfigurator;
import org.deeplearning4j.nn.conf.inputs.InputType;
import org.deeplearning4j.nn.conf.layers.samediff.SameDiffLambdaLayer;
import org.deeplearning4j.nn.graph.ComputationGraph;
import org.deeplearning4j.nn.modelimport.keras.KerasLayer;
import org.deeplearning4j.nn.modelimport.keras.KerasModelImport;
import org.deeplearning4j.nn.multilayer.MultiLayerNetwork;
import org.nd4j.autodiff.samediff.SDVariable;
import org.nd4j.autodiff.samediff.SameDiff;
import org.nd4j.linalg.api.ndarray.INDArray;
import org.nd4j.linalg.factory.Nd4j;
import org.nd4j.linalg.io.ClassPathResource;
class OurLambdaLayer extends SameDiffLambdaLayer {
    // Python Keras version: Lambda(lambda x: K.mean(x, axis=1), output_shape=lambda s: (s[0], s[2]))(sub_h)
    @Override
    public SDVariable defineLayer(SameDiff sd, SDVariable x) {
        // visited only once, right before the error
        System.out.println(x.toString());
        for (int i = 0; i < x.getShape().length; i++) {
            System.out.print("dim " + i + " :" + x.getShape()[i] + " ");
        }
        System.out.println("");
        SDVariable returnvalue = x.mean(1);
        System.out.println(returnvalue.toString());
        return returnvalue;
    }
    @Override
    public InputType getOutputType(int layerIndex, InputType inputType) {
        // visited 2 times during graph construction
        return inputType;
    }
}
class Main {
    public static void main(String[] argv) {
        BasicConfigurator.configure(); // some logging library initialization
        try {
            // prepare for lambda layer
            KerasLayer.registerLambdaLayer("Lambda_Mean_Layer", new OurLambdaLayer());
            // load the model
            ClassPathResource cpr = new ClassPathResource("model_name.h5");
            String simpleMlp = cpr.getFile().getPath();
            ComputationGraph model = KerasModelImport.importKerasModelAndWeights(simpleMlp);
            // make a random sample
            int inputs = 122;
            int[] inputA = {1, 1, inputs};
            INDArray features = Nd4j.rand(inputA);
            // get the prediction
            INDArray[] prediction = model.output(features);
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}
I get the following error. I think it is caused by the Lambda layer producing an output whose dimensions differ from what its getOutputType promised. I want to change getOutputType so that it corresponds to the output I create.
SDVariable(name="input",variableType=VARIABLE,dtype=FLOAT)
dim 0 :1 dim 1 :1 dim 2 :122
SDVariable(name="reduce_mean_1",variableType=ARRAY,dtype=FLOAT)
java.lang.IllegalArgumentException: Invalid input: expect NDArray with rank 3 (i.e., activations for RNN layer)
at org.deeplearning4j.nn.conf.preprocessor.RnnToFeedForwardPreProcessor.preProcess(RnnToFeedForwardPreProcessor.java:56)
at org.deeplearning4j.nn.graph.vertex.impl.LayerVertex.applyPreprocessorAndSetInput(LayerVertex.java:118)
at org.deeplearning4j.nn.graph.vertex.impl.LayerVertex.setInput(LayerVertex.java:170)
at org.deeplearning4j.nn.graph.ComputationGraph.outputOfLayersDetached(ComputationGraph.java:2388)
at org.deeplearning4j.nn.graph.ComputationGraph.output(ComputationGraph.java:1737)
at org.deeplearning4j.nn.graph.ComputationGraph.output(ComputationGraph.java:1693)
at org.deeplearning4j.nn.graph.ComputationGraph.output(ComputationGraph.java:1623)
at Main.main(main.java:74)
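The shape mismatch described above can be reproduced outside DL4J. The NumPy sketch below is only an illustration, not DL4J code: it applies the same mean over axis 1 that the Keras Lambda uses to an array shaped like the random sample in Main, and shows that the result has rank 2. This is consistent with the stack trace, where a preprocessor that expects rank-3 RNN activations receives the rank-2 output. It suggests, as an assumption rather than a confirmed fix, that getOutputType should describe a feed-forward output with 122 features instead of returning the recurrent input type unchanged.

```python
import numpy as np


def lambda_mean(x: np.ndarray) -> np.ndarray:
    """Mirror of the Keras lambda: mean over axis 1 (the time axis in Keras layout)."""
    return x.mean(axis=1)


# Same shape as the random sample built in Main: (batch, time, features)
features = np.random.rand(1, 1, 122)
pooled = lambda_mean(features)

print("input :", features.ndim, features.shape)  # rank 3
print("output:", pooled.ndim, pooled.shape)      # rank 2
```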

Related

ValueError: 'logits' and 'labels' must have the same shape for NLP sentiment multi-class classifier

I am trying to build an NLP multi-class sentiment classifier that takes sentences as input and classifies them into three classes (negative, neutral and positive). However, when training the model, I run into an error because my logits (None, 3) do not have the same shape as my labels (None, 1), and the model can't begin training.
My model is a multi-class classifier and not a multi-label classifier, since it predicts only one label per object. I made sure that my last layer had an output of 3 and had the activation = 'softmax'. This should be correct from what I have found online, so I think that the problem lies with my labels.
Currently, my labels have a dimension of (None, 1), since I mapped each class to a unique integer and passed these as my train and test y values (which are one-dimensional numpy arrays).
Right now I am unsure whether I have to change the dimensions of this array to match the output dimensions, and how to go about doing it.
import os
import sys
import tensorflow as tf
import numpy as np
import pandas as pd
from tensorflow import keras
from tensorflow.keras.preprocessing.text import Tokenizer
from tensorflow.keras.preprocessing.sequence import pad_sequences
from keras.optimizers import SGD
device_name = tf.test.gpu_device_name()
if len(device_name) > 0:
    print("Found GPU at: {}".format(device_name))
else:
    device_name = "/device:CPU:0"
    print("No GPU, using {}.".format(device_name))
# Load dataset into a dataframe
train_data_path = "/content/drive/MyDrive/ML Datasets/tweet_sentiment_analysis/train.csv"
test_data_path = "/content/drive/MyDrive/ML Datasets/tweet_sentiment_analysis/test.csv"
train_df = pd.read_csv(train_data_path, encoding='unicode_escape')
test_df = pd.read_csv(test_data_path, encoding='unicode_escape').dropna()
sentiment_types = ('neutral', 'negative', 'positive')
train_df['sentiment'] = train_df['sentiment'].astype('category')
test_df['sentiment'] = test_df['sentiment'].astype('category')
train_df['sentiment_cat'] = train_df['sentiment'].cat.codes
test_df['sentiment_cat'] = test_df['sentiment'].cat.codes
train_y = np.array(train_df['sentiment_cat'])
test_y = np.array(test_df['sentiment_cat'])
# Function to convert df into a list of strings
def convert_to_list(df, x):
    selected_text_list = []
    labels = []
    for index, row in df.iterrows():
        selected_text_list.append(str(row[x]))
        labels.append(str(row['sentiment']))
    return np.array(selected_text_list), np.array(labels)
train_sentences, train_labels = convert_to_list(train_df, 'selected_text')
test_sentences, test_labels = convert_to_list(test_df, 'text')
# Instantiate tokenizer and create word_index
tokenizer = Tokenizer(num_words=1000, oov_token='<oov>')
tokenizer.fit_on_texts(train_sentences)
word_index = tokenizer.word_index
# Convert sentences into a sequence
train_sequence = tokenizer.texts_to_sequences(train_sentences)
test_sequence = tokenizer.texts_to_sequences(test_sentences)
# Padding sequences
pad_test_seq = pad_sequences(test_sequence, padding='post')
max_len = pad_test_seq[0].size
pad_train_seq = pad_sequences(train_sequence, padding='post', maxlen=max_len)
model = tf.keras.Sequential([
    tf.keras.layers.Embedding(10000, 64, input_length=max_len),
    tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(64, return_sequences=True)),
    tf.keras.layers.GlobalAveragePooling1D(),
    tf.keras.layers.Dense(64, activation='relu'),
    tf.keras.layers.Dense(3, activation='softmax')
])
with tf.device(device_name):
    model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
num_epochs = 10
with tf.device(device_name):
    history = model.fit(pad_train_seq, train_y, epochs=num_epochs, validation_data=(pad_test_seq, test_y), verbose=2)
Here is the error:
ValueError Traceback (most recent call last)
<ipython-input-28-62f3c6445887> in <module>
2
3 with tf.device(device_name):
----> 4 history = model.fit(pad_train_seq, train_y, epochs=num_epochs, validation_data=(pad_test_seq, test_y), verbose=2)
1 frames
/usr/local/lib/python3.8/dist-packages/keras/engine/training.py in tf__train_function(iterator)
13 try:
14 do_return = True
---> 15 retval_ = ag__.converted_call(ag__.ld(step_function), (ag__.ld(self), ag__.ld(iterator)), None, fscope)
16 except:
17 do_return = False
ValueError: in user code:
File "/usr/local/lib/python3.8/dist-packages/keras/engine/training.py", line 1051, in train_function *
return step_function(self, iterator)
File "/usr/local/lib/python3.8/dist-packages/keras/engine/training.py", line 1040, in step_function **
outputs = model.distribute_strategy.run(run_step, args=(data,))
File "/usr/local/lib/python3.8/dist-packages/keras/engine/training.py", line 1030, in run_step **
outputs = model.train_step(data)
File "/usr/local/lib/python3.8/dist-packages/keras/engine/training.py", line 890, in train_step
loss = self.compute_loss(x, y, y_pred, sample_weight)
File "/usr/local/lib/python3.8/dist-packages/keras/engine/training.py", line 948, in compute_loss
return self.compiled_loss(
File "/usr/local/lib/python3.8/dist-packages/keras/engine/compile_utils.py", line 201, in __call__
loss_value = loss_obj(y_t, y_p, sample_weight=sw)
File "/usr/local/lib/python3.8/dist-packages/keras/losses.py", line 139, in __call__
losses = call_fn(y_true, y_pred)
File "/usr/local/lib/python3.8/dist-packages/keras/losses.py", line 243, in call **
return ag_fn(y_true, y_pred, **self._fn_kwargs)
File "/usr/local/lib/python3.8/dist-packages/keras/losses.py", line 1930, in binary_crossentropy
backend.binary_crossentropy(y_true, y_pred, from_logits=from_logits),
File "/usr/local/lib/python3.8/dist-packages/keras/backend.py", line 5283, in binary_crossentropy
return tf.nn.sigmoid_cross_entropy_with_logits(labels=target, logits=output)
ValueError: `logits` and `labels` must have the same shape, received ((None, 3) vs (None, 1)).
"my logits (None, 3) are not the same size as my labels (None, 1)"
"I made sure that my last layer had an output of 3 and had the activation = 'softmax'"
"my labels have a dimension of (None, 1) since I mapped each class to a unique integer"
The key concept you are missing is that you need to one-hot encode your labels (after assigning integers to them - see below).
So your model, after the softmax, is spitting out three values: how probable each of your labels is. E.g. it might say A is 0.6, B is 0.1, and C is 0.3. If the correct answer is C, then it needs to see that correct answer as 0, 0, 1. It can then say that its prediction for A is 0.6 - 0 = +0.6 wrong, B is 0.1 - 0 = +0.1 wrong, and C is 0.3 - 1 = -0.7 wrong.
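The arithmetic in the paragraph above can be written out as a small sketch. It uses plain Python and the same numbers (0.6, 0.1, 0.3 with C as the correct class); the helper names `one_hot` and `prediction_error` are made up for this illustration. The signed difference is only the intuition used here; the loss Keras actually computes for this setup is a cross-entropy, not this difference.

```python
def one_hot(index, num_classes):
    """Return a list with 1.0 at `index` and 0.0 elsewhere."""
    return [1.0 if i == index else 0.0 for i in range(num_classes)]


def prediction_error(probs, target_index):
    """Per-class difference between predicted probability and one-hot target."""
    target = one_hot(target_index, len(probs))
    return [p - t for p, t in zip(probs, target)]


probs = [0.6, 0.1, 0.3]               # softmax output for classes A, B, C
errors = prediction_error(probs, 2)   # the correct class is C (index 2)

print(one_hot(2, 3))                  # [0.0, 0.0, 1.0]
print([round(e, 2) for e in errors])  # [0.6, 0.1, -0.7]
```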
Theoretically you can go from a string label directly to a one-hot encoding, but it seems TensorFlow needs the labels to be encoded as integers first, and those integers are then one-hot encoded.
https://www.tensorflow.org/api_docs/python/tf/keras/layers/CategoryEncoding#examples says to use:
tf.keras.layers.CategoryEncoding(num_tokens=3, output_mode="one_hot")
Also see https://stackoverflow.com/a/69791457/841830 (the higher-voted answer there is from 2019, so applies to TensorFlow v1 I think). And searching for "tensorflow one-hot encoding" will bring up plenty of tutorials and examples.
The issue here was indeed that the shape of my labels was not the same as the shape of the logits. Logits were of shape (3), since they contained a float for the probability of each of the three classes that I wanted to predict. Labels were originally of shape (1), since each contained only one int.
To solve this, I used one-hot encoding, which turned every label into shape (3), and this solved the problem. I used the keras.utils.to_categorical() function to do so.
from tensorflow.keras.utils import to_categorical
sentiment_types = ('negative', 'neutral', 'positive')
train_df['sentiment'] = train_df['sentiment'].astype('category')
test_df['sentiment'] = test_df['sentiment'].astype('category')
# Turning labels from strings to int
train_sentiment_cat = train_df['sentiment'].cat.codes
test_sentiment_cat = test_df['sentiment'].cat.codes
# One-hot encoding
train_y = to_categorical(train_sentiment_cat)
test_y = to_categorical(test_sentiment_cat)
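To check the resulting label shape without loading the dataset, the conversion can be sketched in NumPy. This is a stand-in for what the one-hot step produces, not the Keras implementation; `to_one_hot` and the example codes are hypothetical.

```python
import numpy as np


def to_one_hot(codes, num_classes):
    """Turn integer class codes of shape (N,) into a one-hot matrix of shape (N, num_classes)."""
    codes = np.asarray(codes, dtype=int)
    return np.eye(num_classes, dtype="float32")[codes]


codes = np.array([1, 0, 2, 2])   # hypothetical codes: neutral, negative, positive, positive
train_y = to_one_hot(codes, 3)

print(train_y.shape)             # (4, 3), which now matches the (None, 3) model output
print(train_y)
```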

How to change the head of a model to accept variable-size input?

I have a language model:
from transformers import RobertaTokenizer
from transformers import RobertaModel
import torch.nn as nn
import torch
checkpoint = 'roberta-base'
test_question = ['this is a string', 'this is another string but longer']
tokenizer = RobertaTokenizer.from_pretrained(checkpoint)
I'm trying to change the head of the model to have 4 linear layers with 512 neurons each:
class QModel(nn.Module):
    def __init__(self):
        super(QModel, self).__init__()
        self.base_model = RobertaModel.from_pretrained(checkpoint)
        self.dropout = nn.Dropout(0.5)
        self.linear1 = nn.Linear(12288, 512)
        self.linear2 = nn.Linear(512, 512)
        self.linear3 = nn.Linear(512, 512)
        self.linear4 = nn.Linear(512, 512)
    def forward(self, x):
        input_ids, attn_mask = torch.tensor(x['input_ids']), torch.tensor(x['attention_mask'])
        outputs = self.base_model(input_ids, attention_mask=attn_mask)
        # new head
        outputs = self.dropout(outputs[0])
        outputs = outputs.view(-1, 12288)
        outputs = self.linear1(outputs)
        outputs = self.dropout(outputs)
        outputs = self.linear2(outputs)
        outputs = self.dropout(outputs)
        outputs = self.linear3(outputs)
        outputs = self.dropout(outputs)
        outputs = self.linear4(outputs)
        return outputs
model = QModel()
model(tokenizer(test_question, padding=True))
But if I change the input size:
test_question = ['this is a string', 'this is another string but longer', 'another input']
I get the error:
RuntimeError: shape '[-1, 12288]' is invalid for input of size 18432
I understand that it arises from the 12288 value in linear1, but I'm not sure how to flatten the output in a way that accepts a varying number of inputs.
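The numbers in the error are informative. roberta-base has a hidden size of 768, so 12288 = 2 × 8 × 768 and 18432 = 3 × 8 × 768: flattening everything mixes the batch and sequence axes, and the total changes with the number (and padded length) of inputs. The thread contains no answer; the sketch below shows one common option, pooling over the sequence axis so that each input yields a fixed 768-dimensional vector. It uses random NumPy arrays as stand-ins for the model's last hidden state, and `masked_mean_pool` is a hypothetical helper, not part of transformers.

```python
import numpy as np

HIDDEN = 768  # hidden size of roberta-base


def masked_mean_pool(hidden_states, attention_mask):
    """Average token vectors over the sequence axis, ignoring padded positions.

    hidden_states: (batch, seq_len, hidden); attention_mask: (batch, seq_len) of 0/1.
    Returns an array of shape (batch, hidden).
    """
    mask = attention_mask[:, :, None].astype(hidden_states.dtype)
    summed = (hidden_states * mask).sum(axis=1)
    counts = np.clip(mask.sum(axis=1), 1.0, None)  # avoid division by zero
    return summed / counts


for batch, seq_len in [(2, 8), (3, 8), (3, 11)]:
    hidden = np.random.rand(batch, seq_len, HIDDEN)  # stand-in for outputs[0]
    mask = np.ones((batch, seq_len), dtype=int)
    flat_size = hidden.reshape(-1).size              # what view(-1, 12288) has to divide
    pooled = masked_mean_pool(hidden, mask)
    print(batch, seq_len, flat_size, pooled.shape)
```

With a pooled representation like this, the first linear layer would take 768 inputs instead of 12288, independent of batch size and sequence length; that is an assumption about how the head could be adapted, not the asker's final solution.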

ValueError: "input_length" is 47, but received input has shape (None, 47, 18704)

input = Input(shape=(47,vocab_length))
print(input.shape)
X = Embedding(input_dim=vocab_length,output_dim = embedding_size,input_length=47)(input)
print(X.shape)
X = Reshape([47,embedding_size,1])(X)
The above is part of my code. I get an error in the Embedding layer, as shown:
ValueError: "input_length" is 47, but received input has shape (None, 47, 18704)
Note here that vocab_length and embedding_size are integers.
Please help me out!
Thanks
Next, I tried writing my own custom layer similar to the Embedding layer, but it produces the error described below.
'''
import tensorflow as tf
from keras import backend as K
from keras.engine.topology import Layer
from keras.layers import Input, Reshape
class MyLayer(Layer):
    def __init__(self,shape_0=1,shape_1=1):
        super(MyLayer, self).__init__()
        self.w = self.add_weight(
            shape=(shape_0,shape_1), initializer="random_normal", trainable=True
        )
    def call(self, inputs):
        return tf.matmul(inputs, self.w)
embedding_size = 300
vocab_length = 18704
input = Input(shape=(47,vocab_length))
X = MyLayer(vocab_length,embedding_size)(input)
X = Reshape([47,embedding_size,1])(X)
'''
Here the Reshape layer raises an error saying that the total size of the new array must match the size of the input. But it does match: the input shape is [47, embedding_size] and the target shape is [47, embedding_size, 1].
'''
ValueError: total size of new array must be unchanged
'''
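The first error is consistent with how the Keras Embedding layer is meant to be used: it expects integer token indices of shape (batch, 47), not one-hot vectors of shape (batch, 47, vocab_length). The NumPy sketch below, with small made-up sizes, shows that an index lookup into a weight matrix gives the same result as multiplying one-hot rows by that matrix, which is what the custom MyLayer computes.

```python
import numpy as np

vocab_length, embedding_size, seq_len = 10, 4, 5   # small hypothetical sizes
rng = np.random.default_rng(0)
weights = rng.normal(size=(vocab_length, embedding_size))

token_ids = rng.integers(0, vocab_length, size=(2, seq_len))  # (batch, seq_len) integers
one_hot = np.eye(vocab_length)[token_ids]                     # (batch, seq_len, vocab_length)

looked_up = weights[token_ids]    # what an embedding lookup does with integer ids
multiplied = one_hot @ weights    # what a matmul on one-hot input does

print(looked_up.shape, multiplied.shape)
print(np.allclose(looked_up, multiplied))
```

The Reshape error in the custom-layer version would be consistent with Keras assuming that the layer's output shape equals its input shape when compute_output_shape is not overridden; that is an assumption about the standalone Keras version used here, not something confirmed in the thread.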

keras neural network predicts the same number for every handwritten digit

I am new to machine learning, so as a first project I've tried to build a handwritten digit recognition neural network based on the mnist dataset. When I test it with the test images provided by the data set itself, it seems to work pretty well (that's what the function test_predict is for). Now I would like to step it up and have the network recognise some actual handwritten digits that I've taken photos of.
The function partial_img_rec takes an image containing multiple digits and is called by multiple_digits. I know it might seem weird that I use recursion here, and I'm sure there are more efficient ways to do this, but that's not the point. In order to test partial_img_rec I provide some photos of individual digits that are stored in the folder .\individual_test, and they all look something like this:
The problem is: my neural network's prediction for every single one of my test images is "5". The probability is always about 22%, no matter the actual digit displayed. I totally get why the results are not as good as those achieved with the mnist dataset's test images, but I certainly didn't expect this. Do you have any idea why this is happening? Any advice is welcome.
Thank you in advance.
Here's my code (edited, now working):
# import keras and the MNIST dataset
from tensorflow.keras.datasets import mnist
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
from keras.utils import np_utils
# numpy is necessary since keras uses numpy arrays
import numpy as np
# imports for pictures
from PIL import Image
from PIL import ImageOps
# imports for tests
import random
import os
class mnist_network():
    def __init__(self):
        """ load data, create and train model """
        # load data
        (X_train, y_train), (X_test, y_test) = mnist.load_data()
        # flatten 28*28 images to a 784 vector for each image
        num_pixels = X_train.shape[1] * X_train.shape[2]
        X_train = X_train.reshape((X_train.shape[0], num_pixels)).astype('float32')
        X_test = X_test.reshape((X_test.shape[0], num_pixels)).astype('float32')
        # normalize inputs from 0-255 to 0-1
        X_train = X_train / 255
        X_test = X_test / 255
        # one hot encode outputs
        y_train = np_utils.to_categorical(y_train)
        y_test = np_utils.to_categorical(y_test)
        num_classes = y_test.shape[1]
        # create model
        self.model = Sequential()
        self.model.add(Dense(num_pixels, input_dim=num_pixels, kernel_initializer='normal', activation='relu'))
        self.model.add(Dense(num_classes, kernel_initializer='normal', activation='softmax'))
        # Compile model
        self.model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
        # train the model
        self.model.fit(X_train, y_train, validation_data=(X_test, y_test), epochs=10, batch_size=200, verbose=2)
        self.train_img = X_train
        self.train_res = y_train
        self.test_img = X_test
        self.test_res = y_test
    def test_all(self):
        """ evaluates the success rate using all the test data """
        scores = self.model.evaluate(self.test_img, self.test_res, verbose=0)
        print("Baseline Error: %.2f%%" % (100-scores[1]*100))
    def predict_result(self, img, num_pixels = None, show=False):
        """ predicts the number in a picture (vector) """
        assert type(img) == np.ndarray and img.shape == (784,)
        """if show:
            # show the picture!!!! some problem here
            plt.imshow(img, cmap='Greys')
            plt.show()"""
        num_pixels = img.shape[0]
        # the actual number
        res_number = np.argmax(self.model.predict(img.reshape(-1,num_pixels)), axis = 1)
        # the probabilities
        res_probabilities = self.model.predict(img.reshape(-1,num_pixels))
        return (res_number[0], res_probabilities.tolist()[0]) # we only need the first element since they only have one
    def test_predict(self, amount_test = 100):
        """ test some random numbers from the test part of the data set """
        assert type(amount_test) == int and amount_test <= 10000
        cnt_right = 0
        cnt_wrong = 0
        for i in range(amount_test):
            ind = random.randrange(0,10000) # there are 10000 images in the test part of the data set
            """ correct_res is the actual result stored in the data set
            It's represented as a list of 10 elements one of which being 1, the rest 0 """
            correct_list = self.test_res.tolist()
            correct_list = correct_list[ind] # the correct sublist
            correct_res = correct_list.index(1.0)
            predicted_res = self.predict_result(self.test_img[ind])[0]
            if correct_res != predicted_res:
                cnt_wrong += 1
                print("Error in predict ! index = ", ind, " predicted result = ", predicted_res, " correct result = ", correct_res)
            else:
                cnt_right += 1
        print("The machine predicted correctly ",cnt_right," out of ",amount_test," examples. That is a success rate of ", (cnt_right/amount_test)*100,"%.")
    def partial_img_rec(self, image, upper_left, lower_right, results=None):
        """ partial is a part of an image """
        # use None instead of a mutable default so results are not shared between calls
        if results is None:
            results = []
        left_x, left_y = upper_left
        right_x, right_y = lower_right
        print("current test part: ", upper_left, lower_right)
        print("results: ", results)
        # condition to stop recursion: we've reached the full width of the picture
        width, height = image.size
        if right_x > width:
            return results
        partial = image.crop((left_x, left_y, right_x, right_y))
        # rescale image to 28 *28 dimension
        partial = partial.resize((28,28), Image.ANTIALIAS)
        partial.show()
        # transform to vector
        partial = ImageOps.invert(partial)
        partial = np.asarray(partial, "float32")
        partial = partial / 255.
        partial[partial < 0.5] = 0.
        # flatten image to 28*28 = 784 vector
        num_pixels = partial.shape[0] * partial.shape[1]
        partial = partial.reshape(num_pixels)
        step = height // 10
        # is there a number in this part of the image?
        res, prop = self.predict_result(partial)
        print("result: ", res, ". probabilities: ", prop)
        # only count this result if the network is >= 50% sure
        if prop[res] >= 0.5:
            results.append(res)
            # step is 80% of the partial image's size (which is equivalent to the original image's height)
            step = int(height * 0.8)
            print("found valid result")
        else:
            # if there is no number found we take smaller steps
            step = height // 20
        print("step: ", step)
        # recursive call with modified positions ( move on step variables )
        return self.partial_img_rec(image, (left_x+step, left_y), (right_x+step, right_y), results=results)
    def test_individual_digits(self):
        """ test partial_img_rec with some individual digits (square shaped images)
        saved in the folder 'individual_test' following the pattern 'number_digit.jpg' """
        cnt_right, cnt_wrong = 0,0
        folder_content = os.listdir(".\\individual_test")
        for imageName in folder_content:
            # image file must be a jpg or png
            assert imageName[-4:] == ".jpg" or imageName[-4:] == ".png"
            correct_res = int(imageName[0])
            image = Image.open(".\\individual_test\\" + imageName).convert("L")
            # only square images in this test
            if image.size[0] != image.size[1]:
                print(imageName, " has the wrong proportions: ", image.size,". It has to be a square.")
                continue
            predicted_res = self.partial_img_rec(image, (0,0), (image.size[0], image.size[1]), results=[])
            if predicted_res == []:
                print("No prediction possible for ", imageName)
            else:
                predicted_res = predicted_res[0]
            if predicted_res != correct_res:
                print("error in partial_img-rec! Predicted ", predicted_res, ". The correct result would have been ", correct_res)
                cnt_wrong += 1
            else:
                cnt_right += 1
                print("correctly predicted ",imageName)
        print(cnt_right, " out of ", cnt_right + cnt_wrong," digits were correctly recognised. The success rate is therefore ", (cnt_right / (cnt_right + cnt_wrong)) * 100," %.")
    def multiple_digits(self, img):
        """ takes as input an image without unnecessary whitespace surrounding the digits """
        #assert type(img) == myImage
        width, height = img.size
        # start with the first quadratic part of the image
        res_list = self.partial_img_rec(img, (0,0),(height ,height))
        res_str =""
        for elem in res_list:
            res_str += str(elem)
        return res_str
network = mnist_network()
network.test_individual_digits()
EDIT
@Geecode's answer was very helpful and the network now correctly predicts some of the pictures, including the one shown above. Yet the overall success rate is lower than 50%. Do you have any ideas how to improve this?
Examples for images returning bad results:
Nothing wrong with your image in itself, your model can correctly classify it.
The issue is that you made a Floor Division on your partial:
partial = partial // 255
which results in 0 for every pixel below the maximum value of 255. So you get an almost entirely black image.
You have to do a "normal" division and some preparation, because your model was trained on negative images, i.e. images with a black (0-valued) background:
# transform to vector
partial = ImageOps.invert(partial)
partial = np.asarray(partial, "float32")
partial = partial / 255.
partial[partial < 0.5] = 0.
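The difference between the two divisions is easy to see on a handful of values. The sketch below uses a small made-up array of grey values in place of a real image and compares floor division with true division, followed by the same thresholding step as in the snippet above.

```python
import numpy as np

# hypothetical grey values of an inverted image, in the range 0..255
pixels = np.array([0, 64, 128, 200, 255], dtype="float32")

floored = pixels // 255   # floor division: everything below 255 collapses to 0
scaled = pixels / 255.    # true division keeps the grey levels in [0, 1]

cleaned = scaled.copy()
cleaned[cleaned < 0.5] = 0.   # suppress faint background pixels

print(floored.tolist())       # [0.0, 0.0, 0.0, 0.0, 1.0]
print(np.round(scaled, 3).tolist())
print(np.round(cleaned, 3).tolist())
```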
After that your model will classify correctly:
Out:
result: 1 . probabilities: [0.000431705528171733, 0.7594985961914062, 0.0011404436081647873, 0.00018972357793245465, 0.03162384033203125, 0.008697531186044216, 0.0014472954208031297, 0.18429973721504211, 0.006838776171207428, 0.005832481198012829]
found valid result
Note that you can of course refine the image preparation further, but that was not the purpose of this answer.
Update:
For my detailed answer on how to achieve better performance in this task, see here.

Custom loss function for U-net in keras using class weights: `class_weight` not supported for 3+ dimensional targets

Here's the code I'm working with (pulled from Kaggle mostly):
inputs = Input((IMG_HEIGHT, IMG_WIDTH, IMG_CHANNELS))
...
outputs = Conv2D(4, (1, 1), activation='sigmoid') (c9)
model = Model(inputs=[inputs], outputs=[outputs])
model.compile(optimizer='adam', loss='dice', metrics=[mean_iou])
results = model.fit(X_train, Y_train, validation_split=0.1, batch_size=8, epochs=30, class_weight=class_weights)
I have 4 classes that are very imbalanced. Class A equals 70%, class B = 15%, class C = 10%, and class D = 5%. However, I care most about class D. So I did the following type of calculation: D_weight = A/D = 70/5 = 14, and so on for the weights of the other classes. (If there are better methods to select these weights, feel free to suggest them.)
In the last line, I'm trying to properly set class_weights and I'm doing it as so: class_weights = {0: 1.0, 1: 6, 2: 7, 3: 14}.
However, when I do this, I get the following error.
class_weight not supported for 3+ dimensional targets.
Is it possible that I add a dense layer after the last layer and just use it as a dummy layer so I can pass the class_weights and then only use the output of the last conv2d layer to do the prediction?
If this is not possible, how would I modify the loss function (I'm aware of this post, however, just passing in the weights in to the loss function won't cut it, because the loss function is called separately for each class) ? Currently, I'm using the following loss function:
def dice_coef(y_true, y_pred):
    smooth = 1.
    y_true_f = K.flatten(y_true)
    y_pred_f = K.flatten(y_pred)
    intersection = K.sum(y_true_f * y_pred_f)
    return (2. * intersection + smooth) / (K.sum(y_true_f) + K.sum(y_pred_f) + smooth)
def bce_dice_loss(y_true, y_pred):
    return 0.5 * binary_crossentropy(y_true, y_pred) - dice_coef(y_true, y_pred)
But I don't see any way in which I can input class weights. If someone wants the full working code see this post. But remember to change the final conv2d layer's num classes to 4 instead of 1.
You can always apply the weights yourself.
The originalLossFunc below you can import from keras.losses.
The weightsList is your list with the weights ordered by class.
def weightedLoss(originalLossFunc, weightsList):
    def lossFunc(true, pred):
        axis = -1 #if channels last
        #axis= 1 #if channels first
        #argmax returns the index of the element with the greatest value
        #done in the class axis, it returns the class index
        classSelectors = K.argmax(true, axis=axis)
        #if your loss is sparse, use only true as classSelectors
        #considering weights are ordered by class, for each class
        #true(1) if the class index is equal to the weight index
        classSelectors = [K.equal(i, classSelectors) for i in range(len(weightsList))]
        #casting boolean to float for calculations
        #each tensor in the list contains 1 where ground true class is equal to its index
        #if you sum all these, you will get a tensor full of ones.
        classSelectors = [K.cast(x, K.floatx()) for x in classSelectors]
        #for each of the selections above, multiply their respective weight
        weights = [sel * w for sel,w in zip(classSelectors, weightsList)]
        #sums all the selections
        #result is a tensor with the respective weight for each element in predictions
        weightMultiplier = weights[0]
        for i in range(1, len(weights)):
            weightMultiplier = weightMultiplier + weights[i]
        #make sure your originalLossFunc only collapses the class axis
        #you need the other axes intact to multiply the weights tensor
        loss = originalLossFunc(true,pred)
        loss = loss * weightMultiplier
        return loss
    return lossFunc
For using this in compile:
model.compile(loss= weightedLoss(keras.losses.categorical_crossentropy, weights),
optimizer=..., ...)
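To see what weightMultiplier ends up containing, the selection logic can be mirrored in NumPy on a tiny example. This is an illustration with a made-up 2x2 mask, not Keras backend code; `class_weight_map` is a hypothetical helper and the weights are the ones from the question.

```python
import numpy as np


def class_weight_map(y_true_one_hot, weights_list):
    """Per-pixel weight: the weight of whichever class is 1 in the one-hot target."""
    class_index = np.argmax(y_true_one_hot, axis=-1)  # channels last
    return np.asarray(weights_list, dtype="float32")[class_index]


weights = [1.0, 6.0, 7.0, 14.0]          # class weights from the question
labels = np.array([[0, 3],
                   [2, 1]])              # hypothetical 2x2 mask of class ids
y_true = np.eye(4)[labels][None, ...]    # one-hot, shape (1, 2, 2, 4)

weight_map = class_weight_map(y_true, weights)
print(weight_map.shape)                  # (1, 2, 2): class axis collapsed, other axes kept
print(weight_map[0])
```

The weight map has the same shape as a loss that only collapses the class axis, which is why the two can be multiplied elementwise.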
Changing the class balance directly on the input data
You can change the balance of the input samples too.
For instance, if you have 5 samples from class 1 and 10 samples from class 2, pass the samples for class 1 twice in the input arrays.
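A minimal NumPy sketch of that idea, using made-up arrays with 5 samples of class 1 and 10 samples of class 2, appends the minority-class samples a second time so that both classes are equally represented:

```python
import numpy as np

# hypothetical data: 15 samples with one feature each
x = np.arange(15).reshape(15, 1)
y = np.array([1] * 5 + [2] * 10)

minority = y == 1                              # boolean mask for the under-represented class
x_balanced = np.concatenate([x, x[minority]])  # class 1 samples now appear twice
y_balanced = np.concatenate([y, y[minority]])

print(np.sum(y_balanced == 1), np.sum(y_balanced == 2))
```

In practice the balanced arrays would usually be shuffled before training.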
Using the sample_weight argument.
Instead of working "by class", you can also work "by sample".
Create an array of weights for each sample in your input array: len(x_train) == len(weights)
And fit passing this array to the sample_weight argument.
(If it's fit_generator, the generator will have to return the weights along with the train/true pairs: return/yield inputs, targets, weights)
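As a rough sketch of the per-sample approach, the weight array can be built by looking up each sample's class in a weight table. The class ids below are made up, and the example assumes one class label per sample; the final fit call is shown only as a comment because no model is defined here.

```python
import numpy as np

class_weights = {0: 1.0, 1: 6.0, 2: 7.0, 3: 14.0}   # weights from the question
sample_classes = np.array([0, 0, 3, 1, 2, 0])        # hypothetical class id per sample

# one weight per sample, in the same order as the training inputs
sample_weights = np.array(
    [class_weights[c] for c in sample_classes.tolist()], dtype="float32"
)

print(len(sample_weights) == len(sample_classes))
print(sample_weights)
# model.fit(x_train, y_train, sample_weight=sample_weights, ...)
```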
