Can't figure out how to solve this image segmentation problem - image-processing

My training images are the blue channels extracted from the ELAs (Error Level Analysis) of some spliced images, and the labels are their corresponding ground truth masks.
I have constructed a simple encoder-decoder CNN, given below, to do the segmentation, and I have also tested it on the cell membrane segmentation task. There it performs well and produces outputs close to the ground truth, so I think the network itself is capable enough.
However, it does not work on the spliced images from the CASIA1 + CASIA1GroundTruth dataset. Please help me fix it; I have spent many days trying different architectures and pre-processing steps, with no luck.
Input Image
Ground Truth
Output/Generated Image
For one, it reports very high accuracy (98%) and low losses, yet the output image is clearly wrong. If you look carefully it does roughly pick up the wanted mask, but there are also many regions splattered with white. It seems unable to separate the pixel intensities of the wanted region from the background. Please help me fix it :(
Preparation
from PIL import Image
import numpy as np
from sklearn.model_selection import train_test_split

def process(img):
    # keep only the blue channel of the ELA image
    img = img.getchannel('B')
    return img

X, Y = [], []
for i in splicedIMG:
    img = process(Image.open('ELAs/' + str(i)))
    X.append(np.array(img) / np.max(img))
for i in splicedGT:
    lbl = Image.open('SGTResized/' + str(i))
    Y.append(np.array(lbl) / np.max(lbl))

X = np.array(X)
Y = np.array(Y)
X = X.reshape(-1, 256, 256, 1)
Y = Y.reshape(-1, 256, 256, 1)
X_train, X_val, Y_train, Y_val = train_test_split(X, Y, test_size=0.2)
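A quick sanity check on the prepared arrays (my own suggestion, not part of the original post) can confirm that both the inputs and the masks really end up in the 0-1 range:
print(X_train.shape, Y_train.shape)   # (N, 256, 256, 1) each
print(X_train.min(), X_train.max())   # should stay within [0, 1]
print(np.unique(Y_train)[:10])        # masks should be (close to) binary 0 / 1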
Segmenter Model
model = Sequential()
model.add(Conv2D(filters = 16, kernel_size = (3,3),padding = 'same',
activation ='relu', input_shape = (256,256,1)))
model.add(BatchNormalization())
model.add(Conv2D(filters = 16, kernel_size = (3,3),padding = 'same',
activation ='relu'))
model.add(BatchNormalization())
model.add(MaxPooling2D(pool_size=(2,2)))
model.add(Conv2D(filters = 32, kernel_size = (3,3),padding = 'same',
activation ='relu'))
model.add(BatchNormalization())
model.add(Conv2D(filters = 32, kernel_size = (3,3),padding = 'same',
activation ='relu'))
model.add(BatchNormalization())
model.add(MaxPooling2D(pool_size=(2,2)))
model.add(Conv2D(filters = 64, kernel_size = (3,3),padding = 'same',
activation ='relu'))
model.add(BatchNormalization())
model.add(Conv2D(filters = 64, kernel_size = (3,3),padding = 'same',
activation ='relu'))
model.add(BatchNormalization())
model.add(MaxPooling2D(pool_size=(2,2)))
model.add(Conv2D(filters = 128, kernel_size = (3,3),padding = 'same',
activation ='relu'))
model.add(BatchNormalization())
model.add(Conv2D(filters = 128, kernel_size = (3,3),padding = 'same',
activation ='relu'))
model.add(BatchNormalization())
model.add(MaxPooling2D(pool_size=(2,2)))
model.add(UpSampling2D(size = (2,2)))
model.add(Conv2D(filters = 128, kernel_size = (3,3),padding = 'same',
activation ='relu'))
model.add(BatchNormalization())
model.add(Conv2D(filters = 128, kernel_size = (3,3),padding = 'same',
activation ='relu'))
model.add(BatchNormalization())
model.add(UpSampling2D(size = (2,2)))
model.add(Conv2D(filters = 64, kernel_size = (3,3),padding = 'same',
activation ='relu'))
model.add(BatchNormalization())
model.add(Conv2D(filters = 64, kernel_size = (3,3),padding = 'same',
activation ='relu'))
model.add(BatchNormalization())
model.add(UpSampling2D(size = (2,2)))
model.add(Conv2D(filters = 32, kernel_size = (3,3),padding = 'same',
activation ='relu'))
model.add(BatchNormalization())
model.add(Conv2D(filters = 32, kernel_size = (3,3),padding = 'same',
activation ='relu'))
model.add(BatchNormalization())
model.add(UpSampling2D(size = (2,2)))
model.add(Conv2D(filters = 16, kernel_size = (3,3),padding = 'same',
activation ='relu'))  # note: input_shape is only needed on the first layer, so it is dropped here
model.add(BatchNormalization())
model.add(Conv2D(filters = 16, kernel_size = (3,3),padding = 'same',
activation ='relu'))
model.add(BatchNormalization())
model.add(Conv2D(filters = 1, kernel_size = (1,1), activation = 'sigmoid'))
model.summary()
Training
model.compile(optimizer = Adam(lr = 0.0001), loss = 'binary_crossentropy', metrics = ['accuracy'])
model_checkpoint = ModelCheckpoint('segmenter_weights.h5', monitor='loss',verbose=1, save_best_only=True)
model.fit(X_train, Y_train, validation_data = (X_val, Y_val), batch_size=4, epochs=200, verbose=1, callbacks=[PlotLossesCallback(),model_checkpoint])

Oops, I made a silly mistake. To view the sample I had picked from the X array, I multiplied that array by 255, because PIL doesn't display arrays in the 0-1 range. Then I mistakenly passed the same modified variable into test/prediction.
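In other words, the model was fed 0-255 inputs at prediction time even though it was trained on 0-1 inputs. A minimal sketch of the safer pattern (the variable names are illustrative, not from the original code):
# keep the prediction input in the same 0-1 range the model was trained on
sample = X_val[0:1]                                   # shape (1, 256, 256, 1), values in [0, 1]
pred = model.predict(sample)

# make a separate copy for display instead of modifying the array used for prediction
display = (sample[0, :, :, 0] * 255).astype(np.uint8)
Image.fromarray(display).show()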

Related

Input 0 of layer conv1d_2 is incompatible with the layer: expected min_ndim=3, found ndim=2. Full shape received: (None, 128) (while developing an LSTM-CNN)

I am trying to incorporate a CNN layer into the LSTM network as shown.
model = Sequential()
model.add(LSTM(64, return_sequences=True, input_shape=(X_train.shape[1], X_train.shape[2]), activation='relu'))
model.add(Dropout(0.1))
model.add(LSTM(128, activation='relu'))
model.add(Conv1D(32, kernel_size=3, activation='relu'))
model.add(Flatten())
model.add(Dense(1))
model.compile(loss='mean_squared_error', optimizer='adam')
But it is giving the following error about the input shape. Please help to resolve the issue.
Try this (the second LSTM needs return_sequences=True so that its output stays 3-D, which is what Conv1D expects):
model = Sequential()
model.add(LSTM(64, return_sequences=True, input_shape=(X_train.shape[1], X_train.shape[2]), activation='relu'))
model.add(Dropout(0.1))
model.add(LSTM(128, activation='relu', return_sequences=True))
model.add(Conv1D(32, kernel_size=1, activation='relu'))  # no input_shape needed here; Keras infers it
model.add(Flatten())
model.add(Dense(1))
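A quick way to confirm the shapes line up is to run the same stack on dummy data; the sizes below (8 samples, 10 timesteps, 4 features) are made up purely for illustration:
import numpy as np
from keras.models import Sequential
from keras.layers import LSTM, Dropout, Conv1D, Flatten, Dense

X_dummy = np.random.rand(8, 10, 4)
y_dummy = np.random.rand(8, 1)

m = Sequential()
m.add(LSTM(64, return_sequences=True, input_shape=(10, 4), activation='relu'))
m.add(Dropout(0.1))
m.add(LSTM(128, activation='relu', return_sequences=True))
m.add(Conv1D(32, kernel_size=1, activation='relu'))
m.add(Flatten())
m.add(Dense(1))
m.compile(loss='mean_squared_error', optimizer='adam')
m.summary()                              # Conv1D now receives a 3-D tensor: (None, 10, 128)
m.fit(X_dummy, y_dummy, epochs=1, verbose=0)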

CNN incompatible

My data has the following shapes:
X_train, X_test, Y_train, Y_test = train_test_split(X, Y, test_size=0.2, random_state=0)
print(X_train.shape, X_test.shape, Y_train.shape, Y_test.shape)
(942, 32, 32, 1) (236, 32, 32, 1) (942, 3, 3) (236, 3, 3)
And whenever I try to run my CNN I get the following error:
from tensorflow.keras import layers
from tensorflow.keras import Model
img_input = layers.Input(shape=(32, 32, 1))
x = layers.Conv2D(16, (3,3), activation='relu', strides = 1, padding = 'same')(img_input)
x = layers.Conv2D(32, (3,3), activation='relu', strides = 2)(x)
x = layers.Conv2D(128, (3,3), activation='relu', strides = 2)(x)
x = layers.MaxPool2D(pool_size=2)(x)
x = layers.Conv2D(3, 3, activation='linear', strides = 2)(x)
output = layers.Flatten()(x)
model = Model(img_input, output)
model.summary()
model.compile(loss='mean_squared_error',optimizer= 'adam', metrics=['mse'])
history = model.fit(X_train,Y_train,validation_data=(X_test, Y_test), epochs = 100,verbose=1)
Error:
InvalidArgumentError: Incompatible shapes: [32,3] vs. [32,3,3]
[[node BroadcastGradientArgs_2 (defined at /usr/local/lib/python3.7/site-packages/tensorflow_core/python/framework/ops.py:1751) ]] [Op:__inference_distributed_function_7567]
Function call stack:
distributed_function
What am I missing here?
You don't handle the dimensionality inside your network properly. First, expand the dimensions of your y so that it has the format (n_sample, 3, 3, 1). Then adjust the network (I removed the Flatten and MaxPooling layers and adjusted the last conv output):
import numpy as np
from tensorflow.keras.layers import Input, Conv2D
from tensorflow.keras import Model

# create dummy data
n_sample = 10
X = np.random.uniform(0, 1, (n_sample, 32, 32, 1))
y = np.random.uniform(0, 1, (n_sample, 3, 3))
# expand y dim so the targets match the network's 4-D output
y = y[..., np.newaxis]
print(X.shape, y.shape)

img_input = Input(shape=(32, 32, 1))
x = Conv2D(16, (3,3), activation='relu', strides=1, padding='same')(img_input)
x = Conv2D(32, (3,3), activation='relu', strides=2)(x)
x = Conv2D(128, (3,3), activation='relu', strides=2)(x)
x = Conv2D(1, (3,3), activation='linear', strides=2)(x)   # output shape (None, 3, 3, 1)
model = Model(img_input, x)
model.summary()
model.compile(loss='mean_squared_error', optimizer='adam', metrics=['mse'])
model.fit(X, y, epochs=3)
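As a side note, if you prefer to keep the labels in their original (n_sample, 3, 3) shape, one alternative head (my own variation, not part of the original answer) is to flatten and map to 9 values, then reshape:
from tensorflow.keras.layers import Flatten, Dense, Reshape

img_input = Input(shape=(32, 32, 1))
x = Conv2D(16, (3,3), activation='relu', strides=1, padding='same')(img_input)
x = Conv2D(32, (3,3), activation='relu', strides=2)(x)
x = Conv2D(128, (3,3), activation='relu', strides=2)(x)
x = Flatten()(x)
x = Dense(9, activation='linear')(x)     # 9 values = one per cell of the 3x3 target
x = Reshape((3, 3))(x)                   # output shape (None, 3, 3) matches y directly
model = Model(img_input, x)
model.compile(loss='mean_squared_error', optimizer='adam', metrics=['mse'])
model.fit(X, y.squeeze(-1), epochs=3)    # y without the extra channel axis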

ValueError: Found input variables with inconsistent numbers of samples: [4162, 3]

I am facing an error when plotting my confusion matrix. I pass the test labels and my predicted labels to the confusion_matrix function, but it raises a ValueError about an inconsistent number of samples.
The shapes of my data are below.
Training Data Shape (4162, 224, 224, 3)
Training Data Labels Shape (4162, 5)
Testing Data Shape (3921, 224, 224, 3)
Testing Data Labels Shape (3921, 5)
The predicted labels are a bit ugly because I only ran 2 epochs; I just wanted to get the confusion matrix plotted first.
predictingimage = "D:/compCarsThesisData/data/image/78/3/2010/0ba8d018cdc994.jpg" #67/1698/2010/6805eb92ac6c70.jpg"
predictImageRead = mpg.imread(predictingimage)
resizingImage = cv2.cv2.resize(predictImageRead,(224,224))
reshapedFinalImage = np.expand_dims(resizingImage, axis=0)
npimage = np.asarray(reshapedFinalImage)
m = model.predict(npimage)
print(m)
[array([[0.02502811, 0.01959323, 0.6556284 , 0.26472655, 0.03502375]],
dtype=float32), array([[5.8234303e-04, 3.1917400e-04, 9.4957882e-01, 1.8873921e-02,
3.0645736e-02]], dtype=float32), array([[0.02581117, 0.04752538, 0.81816435, 0.04812173, 0.06037736]],
dtype=float32)]
cm = confusion_matrix(train_labels_Encode,m)
plt.imshow(cm)
plt.show()
ERROR
Traceback (most recent call last):
File "d:/ThesisWork/seriouswork/Inception_SVM_CompCarsGoogleNetArchitecture.py", line 299, in <module>
cm = confusion_matrix(train_labels_hotEncode,n)
File "C:\Users\zeele\Miniconda3\lib\site-packages\sklearn\metrics\classification.py", line 253, in confusion_matrix
y_type, y_true, y_pred = _check_targets(y_true, y_pred)
File "C:\Users\zeele\Miniconda3\lib\site-packages\sklearn\metrics\classification.py", line 71, in _check_targets
check_consistent_length(y_true, y_pred)
File "C:\Users\zeele\Miniconda3\lib\site-packages\sklearn\utils\validation.py", line 235, in check_consistent_length
" samples: %r" % [int(l) for l in lengths])
ValueError: Found input variables with inconsistent numbers of samples: [4162, 3]
Classifier Code:
X_train = np.load('D:/Inception_preprocessed_data_Labels_2004/Top5/TrainingData_Top5.npy')#('D:/ThesisWork/S_224_Training_data.npy')#training_images
X_test = np.load('D:/Inception_preprocessed_data_Labels_2004/Top5/TrainingLabels_Top5.npy')#('D:/ThesisWork/S_224_Training_labels.npy')#training_labels
y_train = np.load('D:/Inception_preprocessed_data_Labels_2004/Top5/TestingData_Top5.npy')#('D:/ThesisWork/S_224_Testing_data.npy')#testing_images
y_test = np.load('D:/Inception_preprocessed_data_Labels_2004/Top5/TestingLabels_Top5.npy')#('D:/ThesisWork/S_224_Testing_labels.npy')#testing_labels
print(X_test)
le = preprocessing.LabelEncoder()
le.fit(X_test)
transform_trainLabels = le.transform(X_test)
print(transform_trainLabels)
print(le.inverse_transform(transform_trainLabels))
train_labels_hotEncode = np_utils.to_categorical(transform_trainLabels,len(set(transform_trainLabels)))
shuffle(X_train)
shuffle(train_labels_hotEncode)
le2 = preprocessing.LabelEncoder()
le2.fit(y_test)
transform_testLabels = le2.transform(y_test)
test_labels_hotEncode = np_utils.to_categorical(transform_testLabels,len(set(transform_testLabels)))
print(test_labels_hotEncode.shape)
shuffle(y_train)
shuffle(test_labels_hotEncode)
# print(train_labels_hotEncode[3000])
# exit()
# X_train = np.asarray(X_train / 255.0)
# y_train = np.asarray(y_train / 255.0)
# print("X_Training" ,X_train.shape, X_train)
# print("X_TEST", X_test.shape)
# print("Y_train", y_train.shape)
# print("y_test", y_test.shape)
# exit()
# plt.imshow(X_train[1])
# print(X_test)
# plt.imshow(y_train[1])
# print(y_test)
# plt.show()
print("Trainig Data Shape",X_train.shape)
print("Training Data Labels Shape",train_labels_hotEncode.shape)
print("Testing Data Shape", y_train.shape)
print("Testing Data Labels Shape", test_labels_hotEncode.shape)
# X_train = np.array(X_train).astype(np.float32)
# y_train = np.array(y_train).astype(np.float32)
def inception_module(image,
filters_1x1,
filters_3x3_reduce,
filter_3x3,
filters_5x5_reduce,
filters_5x5,
filters_pool_proj,
name=None):
conv_1x1 = Conv2D(filters_1x1, (1,1), padding='same', activation='relu', kernel_initializer=kernel_init, bias_initializer= bias_init)(image)
conv_3x3 = Conv2D(filters_3x3_reduce, (1,1), padding='same', activation='relu', kernel_initializer=kernel_init, bias_initializer= bias_init)(image)
conv_3x3 = Conv2D(filter_3x3,(3,3), padding='same', activation='relu', kernel_initializer=kernel_init, bias_initializer=bias_init)(conv_3x3)
conv_5x5 = Conv2D(filters_5x5_reduce,(1,1), padding='same', activation='relu',kernel_initializer=kernel_init, bias_initializer= bias_init)(image)
conv_5x5 = Conv2D(filters_5x5, (3,3), padding='same', activation='relu',kernel_initializer=kernel_init, bias_initializer=bias_init)(conv_5x5)
pool_proj = MaxPool2D((3,3), strides=(1,1), padding='same')(image)
pool_proj = Conv2D(filters_pool_proj, (1,1), padding='same', activation='relu', kernel_initializer=kernel_init, bias_initializer= bias_init)(pool_proj)
output = concatenate([conv_1x1, conv_3x3, conv_5x5, pool_proj], axis=3, name=name)
return output
kernel_init = keras.initializers.glorot_uniform()
bias_init = keras.initializers.Constant(value=0.2)
# IMG_SIZE = 64
input_layer = Input(shape=(224,224,3))
image = Conv2D(64,(7,7),padding='same', strides=(2,2), activation='relu', name='conv_1_7x7/2', kernel_initializer=kernel_init, bias_initializer=bias_init)(input_layer)
image = MaxPool2D((3,3), padding='same', strides=(2,2), name='max_pool_1_3x3/2')(image)
image = Conv2D(64, (1,1), padding='same', strides=(1,1), activation='relu', name='conv_2a_3x3/1' )(image)
image = Conv2D(192, (3,3), padding='same', strides=(1,1), activation='relu', name='conv_2b_3x3/1')(image)
image = MaxPool2D((3,3), padding='same', strides=(2,2), name='max_pool_2_3x3/2')(image)
image = inception_module(image,
filters_1x1= 64,
filters_3x3_reduce= 96,
filter_3x3 = 128,
filters_5x5_reduce=16,
filters_5x5= 32,
filters_pool_proj=32,
name='inception_3a')
image = inception_module(image,
filters_1x1=128,
filters_3x3_reduce=128,
filter_3x3=192,
filters_5x5_reduce=32,
filters_5x5=96,
filters_pool_proj=64,
name='inception_3b')
image = MaxPool2D((3,3), padding='same', strides=(2,2), name='max_pool_3_3x3/2')(image)
image = inception_module(image,
filters_1x1=192,
filters_3x3_reduce=96,
filter_3x3=208,
filters_5x5_reduce=16,
filters_5x5=48,
filters_pool_proj=64,
name='inception_4a')
image1 = AveragePooling2D((5,5), strides=3)(image)
image1 = Conv2D(128, (1,1), padding='same', activation='relu')(image1)
image1 = Flatten()(image1)
image1 = Dense(1024, activation='relu')(image1)
image1 = Dropout(0.7)(image1)
image1 = Dense(5, activation='softmax', name='auxilliary_output_1')(image1)
image = inception_module(image,
filters_1x1 = 160,
filters_3x3_reduce= 112,
filter_3x3= 224,
filters_5x5_reduce= 24,
filters_5x5= 64,
filters_pool_proj=64,
name='inception_4b')
image = inception_module(image,
filters_1x1= 128,
filters_3x3_reduce = 128,
filter_3x3= 256,
filters_5x5_reduce= 24,
filters_5x5=64,
filters_pool_proj=64,
name='inception_4c')
image = inception_module(image,
filters_1x1=112,
filters_3x3_reduce=144,
filter_3x3= 288,
filters_5x5_reduce= 32,
filters_5x5=64,
filters_pool_proj=64,
name='inception_4d')
image2 = AveragePooling2D((5,5), strides=3)(image)
image2 = Conv2D(128, (1,1), padding='same', activation='relu')(image2)
image2 = Flatten()(image2)
image2 = Dense(1024, activation='relu')(image2)
image2 = Dropout(0.7)(image2) #Changed from 0.7
image2 = Dense(5, activation='softmax', name='auxilliary_output_2')(image2)
image = inception_module(image,
filters_1x1=256,
filters_3x3_reduce=160,
filter_3x3=320,
filters_5x5_reduce=32,
filters_5x5=128,
filters_pool_proj=128,
name= 'inception_4e')
image = MaxPool2D((3,3), padding='same', strides=(2,2), name='max_pool_4_3x3/2')(image)
image = inception_module(image,
filters_1x1=256,
filters_3x3_reduce=160,
filter_3x3= 320,
filters_5x5_reduce=32,
filters_5x5= 128,
filters_pool_proj=128,
name='inception_5a')
image = inception_module(image,
filters_1x1=384,
filters_3x3_reduce=192,
filter_3x3=384,
filters_5x5_reduce=48,
filters_5x5=128,
filters_pool_proj=128,
name='inception_5b')
image = GlobalAveragePooling2D(name='avg_pool_5_3x3/1')(image)
image = Dropout(0.7)(image)
image = Dense(5, activation='softmax', name='output')(image)
model = Model(input_layer, [image,image1,image2], name='inception_v1')
model.summary()
epochs = 2
initial_lrate = 0.001 # Changed From 0.01
def decay(epoch, steps=100):
initial_lrate = 0.01
drop = 0.96
epochs_drop = 8
lrate = initial_lrate * math.pow(drop,math.floor((1+epoch)/epochs_drop))#
return lrate
sgd = keras.optimizers.SGD(lr=0.01, decay=1e-6, momentum=0.9, nesterov=True)
# nadam = keras.optimizers.Nadam(lr= 0.002, beta_1=0.9, beta_2=0.999, epsilon=None)
# keras
lr_sc = LearningRateScheduler(decay)
# rms = keras.optimizers.RMSprop(lr = initial_lrate, rho=0.9, epsilon=1e-08, decay=0.0)
# ad = keras.optimizers.adam(lr=initial_lrate)
model.compile(loss=['categorical_crossentropy', 'categorical_crossentropy','categorical_crossentropy'],loss_weights=[1,0.3,0.3], optimizer='sgd', metrics=['accuracy'])
# loss = 'categorical_crossentropy', 'categorical_crossentropy','categorical_crossentropy'
history = model.fit(X_train, [train_labels_hotEncode,train_labels_hotEncode,train_labels_hotEncode], validation_split=0.3,shuffle=True,epochs=epochs, batch_size= 32, callbacks=[lr_sc]) # batch size changed from 256 or 64 to 16(y_train,[y_test,y_test,y_test])
# validation_data=(y_train,[test_labels_hotEncode,test_labels_hotEncode,test_labels_hotEncode]), validation_data= (X_train, [train_labels_hotEncode,train_labels_hotEncode,train_labels_hotEncode]),
print(history.history.keys())
plt.plot(history.history['output_acc'])
plt.plot(history.history['val_output_acc'])
plt.title('Model accuracy')
plt.ylabel('accuracy')
plt.xlabel('epoch')
plt.legend(['train', 'test'],loc = 'upper left')
plt.show()
# predictingimage = "D:/compCarsThesisData/data/image/78/3/2010/0ba8d018cdc994.jpg" #67/1698/2010/6805eb92ac6c70.jpg"
predictImageRead = X_train
# resizingImage = cv2.cv2.resize(predictImageRead,(224,224))
# reshapedFinalImage = np.expand_dims(predictImageRead, axis=0)
# print(reshapedFinalImage.shape)
# npimage = np.array(reshapedFinalImage)
m = model.predict(predictImageRead)
print(m)
print(predictImageRead.shape)
print(train_labels_hotEncode)
# print(m.shape)
plt.imshow(predictImageRead[1])
plt.show()
# n = np.argmax(m,axis=-1)
# n = np.array(m)
print(confusion_matrix(X_test,m[0]))
cm = confusion_matrix(X_test,m[0])
plt.imshow(cm)
plt.show()
Please guide me through this.
Thanks!
If you want to compute a confusion matrix of your training data, you have to make your model predict all of your training examples, roughly like this:
m = model.predict(train_data) # train_data should have the shape (4162, 224, 224, 3)
m should then have a length of 4162 and you can plot the confusion matrix like this:
cm = confusion_matrix(train_labels_Encode, m)
plt.imshow(cm)
plt.show()
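Note that sklearn's confusion_matrix expects 1-D class labels on both sides, so the one-hot labels and the predicted probabilities need an argmax first. Since this model has three outputs, predict returns a list of three arrays; a sketch using only the main output, assuming the same variable names as in the question's code:
import numpy as np
from sklearn.metrics import confusion_matrix
import matplotlib.pyplot as plt

preds = model.predict(X_train)                          # list of 3 arrays, one per output head
main_pred = np.argmax(preds[0], axis=1)                 # class index per sample, shape (4162,)
true_cls = np.argmax(train_labels_hotEncode, axis=1)    # undo the one-hot encoding

cm = confusion_matrix(true_cls, main_pred)
plt.imshow(cm)
plt.show()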

How to train Siamese network in Keras?

I have a pandas dataframe containing filenames of positive and negative examples as below
img1 img2 y
001.jpg 002.jpg 1
003.jpg 004.jpg 0
003.jpg 002.jpg 1
I want to train my Siamese network using Keras ImageDataGenerator and flow_from_dataframe. How do I set up my training so that the code feeds in 2 images with 1 label simultaneously?
Below is the code for my model
def siamese_model(input_shape):
    left = Input(input_shape)
    right = Input(input_shape)

    model = Sequential()
    model.add(Conv2D(32, (3,3), activation='relu', input_shape=input_shape))
    model.add(BatchNormalization())
    model.add(Conv2D(64, (3,3), activation='relu'))
    model.add(BatchNormalization())
    model.add(Conv2D(128, (3,3), activation='relu'))
    model.add(BatchNormalization())
    model.add(Conv2D(256, (3,3), activation='relu'))
    model.add(BatchNormalization())
    model.add(Conv2D(256, (3,3), activation='relu'))
    model.add(MaxPooling2D())
    model.add(BatchNormalization())
    model.add(Flatten())
    model.add(Dense(512, activation='sigmoid'))

    left_encoded = model(left)
    right_encoded = model(right)

    L1_layer = Lambda(lambda tensors: K.abs(tensors[0] - tensors[1]))
    L1_distance = L1_layer([left_encoded, right_encoded])
    prediction = Dense(1, activation='sigmoid')(L1_distance)

    siamese_net = Model(inputs=[left, right], outputs=prediction)
    return siamese_net
model = siamese_model((224,224,3))
model.compile(loss="binary_crossentropy",optimizer="adam", metrics=['accuracy'])
datagen_left = ImageDataGenerator(rotation_range=10,
width_shift_range=0.2,
height_shift_range=0.2,
shear_range=0.2,
zoom_range=0.2,
vertical_flip = True)
datagen_right = ImageDataGenerator(rotation_range=10,
width_shift_range=0.2,
height_shift_range=0.2,
shear_range=0.2,
zoom_range=0.2,
vertical_flip = True)
Join the generators in a custom generator: make one of them output the desired labels and discard the labels of the other.
from keras.utils import Sequence

class DoubleGenerator(Sequence):
    def __init__(self, gen1, gen2):
        self.gen1 = gen1
        self.gen2 = gen2

    def __len__(self):
        return len(self.gen1)

    def __getitem__(self, i):
        x1, y = self.gen1[i]     # keep the labels from the first generator
        x2, y2 = self.gen2[i]    # discard the labels from the second generator
        return (x1, x2), y
Use it:
double_gen = DoubleGenerator(datagen_left.flow_from_directory(...),
datagen_right.flow_from_directory(...))
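Since the question uses a dataframe rather than class folders, the same idea can also be fed from flow_from_dataframe; a sketch assuming df is the dataframe from the question and that the images live in an images/ directory (both are my assumptions, not from the answer):
# shuffle=False keeps the rows of the two generators aligned so the pairs stay matched
gen_left = datagen_left.flow_from_dataframe(
    df, directory='images/', x_col='img1', y_col='y',
    class_mode='raw',  # called 'other' in older Keras versions
    target_size=(224, 224), batch_size=32, shuffle=False)
gen_right = datagen_right.flow_from_dataframe(
    df, directory='images/', x_col='img2', y_col='y',
    class_mode='raw',
    target_size=(224, 224), batch_size=32, shuffle=False)

double_gen = DoubleGenerator(gen_left, gen_right)
model.fit_generator(double_gen, epochs=10)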

Keras: Which loss function for grayscale image as label

I am new to Deep Learning, Keras, and image processing. I am working on a project in which I try to compensate for motion artifacts in grayscale images using CNNs. Thus, I have a grayscale image without motion artifacts as the label.
But now I am not sure which loss function and what kind of error metric to use. Maybe I need some kind of 2D cross-correlation loss function? Or does a loss function like mean squared error make sense? A first training with 'mean squared logarithmic error' produced visually good results (the prediction looked a lot like the label image), but the accuracy of the CNN was around 0%.
Does someone have experience in this area who can recommend some literature or suggest a suitable loss function and error metric?
If I need to provide more detailed information, just let me know and I am more than happy to do so.
The CNN I used (somewhat like a U-Net):
input_1 = Input((X_train.shape[1],X_train.shape[2], X_train.shape[3]))
conv1 = Conv2D(16, (3,3), strides=(2,2), activation='relu', padding='same')(input_1)
batch1 = BatchNormalization(axis=3)(conv1)
conv2 = Conv2D(32, (3,3), strides=(2,2), activation='relu', padding='same')(batch1)
batch2 = BatchNormalization(axis=3)(conv2)
conv3 = Conv2D(64, (3,3), strides=(2,2), activation='relu', padding='same')(batch2)
batch3 = BatchNormalization(axis=3)(conv3)
conv4 = Conv2D(128, (3,3), strides=(2,2), activation='relu', padding='same')(batch3)
batch4 = BatchNormalization(axis=3)(conv4)
conv5 = Conv2D(256, (3,3), strides=(2,2), activation='relu', padding='same')(batch4)
batch5 = BatchNormalization(axis=3)(conv5)
conv6 = Conv2D(512, (3,3), strides=(2,2), activation='relu', padding='same')(batch5)
drop1 = Dropout(0.25)(conv6)
upconv1 = Conv2DTranspose(256, (3,3), strides=(1,1), padding='same')(drop1)
upconv2 = Conv2DTranspose(128, (3,3), strides=(2,2), padding='same')(upconv1)
upconv3 = Conv2DTranspose(64, (3,3), strides=(2,2), padding='same')(upconv2)
upconv4 = Conv2DTranspose(32, (3,3), strides=(2,2), padding='same')(upconv3)
upconv5 = Conv2DTranspose(16, (3,3), strides=(2,2), padding='same')(upconv4)
upconv5_1 = concatenate([upconv5,conv2], axis=3)
upconv6 = Conv2DTranspose(8, (3,3), strides=(2,2), padding='same')(upconv5_1)
upconv6_1 = concatenate([upconv6,conv1], axis=3)
upconv7 = Conv2DTranspose(1, (3,3), strides=(2,2), activation='linear', padding='same')(upconv6_1)
model = Model(outputs=upconv7, inputs=input_1)
Thanks for your help!
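For an image-to-image regression target like this, accuracy is not a meaningful metric, which is consistent with the ~0% figure above. A minimal sketch of compiling with a pixel-wise loss and an image-similarity metric; MSE and PSNR here are illustrative choices on my part, not something from the post:
import tensorflow as tf

def psnr_metric(y_true, y_pred):
    # assumes the grayscale images are scaled to [0, 1]
    return tf.image.psnr(y_true, y_pred, max_val=1.0)

model.compile(optimizer='adam', loss='mean_squared_error', metrics=[psnr_metric])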
