I am new to deep learning, Keras, and image processing. I am working on a project in which I try to compensate for motion artifacts in grayscale images using CNNs. Thus, the label is a grayscale image that has no motion artifacts.
But now I am not sure which loss function and what kind of error metric to use. Maybe I need some kind of 2D cross-correlation loss function? Or does a loss function like mean squared error make sense? A first training run with 'mean squared logarithmic error' produced visually good results (the prediction looked a lot like the label image), but the reported accuracy of the CNN was around 0%.
Does anyone have experience in this area and can recommend some literature or suggest a suitable loss function and error metric?
If I need to provide more detailed information, just let me know and I am more than happy to do so.
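For reference, my understanding is that a custom image-to-image loss in Keras could be defined roughly as in the sketch below, combining pixel-wise MSE with a structural similarity (SSIM) term. This is only an illustration of the kind of loss I am asking about, assuming images scaled to [0, 1]; it is not something I have validated.

import tensorflow as tf

def mse_ssim_loss(y_true, y_pred):
    # Pixel-wise error plus a structural term; assumes images in [0, 1]
    mse = tf.reduce_mean(tf.square(y_true - y_pred))
    ssim = tf.reduce_mean(tf.image.ssim(y_true, y_pred, max_val=1.0))
    return mse + (1.0 - ssim)

# model.compile(optimizer='adam', loss=mse_ssim_loss, metrics=['mae'])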
The CNN I am using (somewhat like a U-Net):
from tensorflow.keras.layers import (Input, Conv2D, Conv2DTranspose,
                                     BatchNormalization, Dropout, concatenate)
from tensorflow.keras.models import Model

# Encoder: stride-2 convolutions halve the spatial resolution at each step
input_1 = Input((X_train.shape[1], X_train.shape[2], X_train.shape[3]))
conv1 = Conv2D(16, (3,3), strides=(2,2), activation='relu', padding='same')(input_1)
batch1 = BatchNormalization(axis=3)(conv1)
conv2 = Conv2D(32, (3,3), strides=(2,2), activation='relu', padding='same')(batch1)
batch2 = BatchNormalization(axis=3)(conv2)
conv3 = Conv2D(64, (3,3), strides=(2,2), activation='relu', padding='same')(batch2)
batch3 = BatchNormalization(axis=3)(conv3)
conv4 = Conv2D(128, (3,3), strides=(2,2), activation='relu', padding='same')(batch3)
batch4 = BatchNormalization(axis=3)(conv4)
conv5 = Conv2D(256, (3,3), strides=(2,2), activation='relu', padding='same')(batch4)
batch5 = BatchNormalization(axis=3)(conv5)
conv6 = Conv2D(512, (3,3), strides=(2,2), activation='relu', padding='same')(batch5)
drop1 = Dropout(0.25)(conv6)

# Decoder: transposed convolutions upsample back to the input resolution,
# with skip connections to conv2 and conv1
upconv1 = Conv2DTranspose(256, (3,3), strides=(1,1), padding='same')(drop1)
upconv2 = Conv2DTranspose(128, (3,3), strides=(2,2), padding='same')(upconv1)
upconv3 = Conv2DTranspose(64, (3,3), strides=(2,2), padding='same')(upconv2)
upconv4 = Conv2DTranspose(32, (3,3), strides=(2,2), padding='same')(upconv3)
upconv5 = Conv2DTranspose(16, (3,3), strides=(2,2), padding='same')(upconv4)
upconv5_1 = concatenate([upconv5, conv2], axis=3)
upconv6 = Conv2DTranspose(8, (3,3), strides=(2,2), padding='same')(upconv5_1)
upconv6_1 = concatenate([upconv6, conv1], axis=3)
upconv7 = Conv2DTranspose(1, (3,3), strides=(2,2), activation='linear', padding='same')(upconv6_1)
model = Model(outputs=upconv7, inputs=input_1)
Thanks for your help!
Related
I am new to deep learning and neural networks, so I need help understanding why this is happening and how I can fix it.
I have a training set of 7,500 images.
This is my model:
from tensorflow.keras import layers, models, optimizers

img_size = 50
model = models.Sequential()
model.add(layers.Conv2D(32, (3, 3), activation='relu',
                        input_shape=(img_size, img_size, 3)))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(64, (3, 3), activation='relu'))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(128, (3, 3), activation='relu'))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(128, (3, 3), activation='relu'))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Flatten())
model.add(layers.Dropout(0.5))
model.add(layers.Dense(512, activation='relu'))
model.add(layers.Dense(1, activation='sigmoid'))
model.summary()
model.compile(loss='binary_crossentropy',
              optimizer=optimizers.RMSprop(learning_rate=2e-4),
              metrics=['acc'])
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Data processing
# All images will be rescaled by 1./255
train_datagen = ImageDataGenerator(
    rescale=1./255,
    rotation_range=40,
    width_shift_range=0.2,
    height_shift_range=0.2,
    shear_range=0.2,
    zoom_range=0.2,
    horizontal_flip=True)
test_datagen = ImageDataGenerator(rescale=1./255)

train_generator = train_datagen.flow_from_directory(
    # This is the target directory
    train_dir,
    target_size=(img_size, img_size),
    batch_size=20,
    class_mode='binary')
validation_generator = test_datagen.flow_from_directory(
    validation_dir,
    target_size=(img_size, img_size),
    batch_size=20,
    class_mode='binary')

# Train the model
history = model.fit(
    train_generator,
    steps_per_epoch=375,  # train_sample_size / batch_size
    epochs=100,
    validation_data=validation_generator,
    validation_steps=50)
I have tried changing the parameters, such as adding dropout, changing the batch size, etc., but I still get a really high loss. The loss is around negative 20 million and just keeps increasing.
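One sanity check I still plan to run (a hypothetical snippet of my own, assuming NumPy is available as np and train_generator is defined as above) is to inspect the value ranges of a single batch, since as far as I understand binary cross-entropy should not go negative when the labels are 0/1:

import numpy as np

x_batch, y_batch = next(train_generator)
print("input range:", x_batch.min(), x_batch.max())  # expected roughly [0, 1] after rescale
print("label values:", np.unique(y_batch))           # expected only 0. and 1.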
My training images are the blue channels extracted from the ELAs (Error Level Analysis) of some spliced images, and the labels are their corresponding ground-truth masks.
I have constructed the simple encoder-decoder CNN given below to do the segmentation and have also tested it on the cell membrane segmentation task. There it performs well and produces images close to the ground truth, so I guess the network I created is capable enough.
However, it is not working on the spliced images of the CASIA1 + CASIA1GroundTruth dataset. Please help me fix it; I have spent too many days on it trying different architectures and image pre-processing with no luck.
(Images from the original post omitted: input image, ground truth, output/generated image.)
For one, it claims such high accuracy (98%) and low loss, but the output image is so wrong. It is sort of getting the wanted mask if you look carefully, but along with it a lot of regions are splattered with white. It seems unable to capture the difference in pixel intensity between the wanted region and the background. Please help me fix it :(
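Since the masks are mostly background, I suspect plain accuracy may be a misleading metric here. A Dice coefficient metric along the lines of the hypothetical sketch below (my own naming, not code from my project) might be a more honest measure of overlap:

from tensorflow.keras import backend as K

def dice_coef(y_true, y_pred, smooth=1.0):
    # Measures overlap between predicted and true masks; robust to class imbalance
    y_true_f = K.flatten(y_true)
    y_pred_f = K.flatten(y_pred)
    intersection = K.sum(y_true_f * y_pred_f)
    return (2.0 * intersection + smooth) / (K.sum(y_true_f) + K.sum(y_pred_f) + smooth)

# model.compile(..., metrics=['accuracy', dice_coef])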
Preparation
import numpy as np
from PIL import Image
from sklearn.model_selection import train_test_split

def process(img):
    # Keep only the blue channel of the ELA image
    return img.getchannel('B')

X, Y = [], []
for i in splicedIMG:
    img = process(Image.open('ELAs/' + str(i)))
    X.append(np.array(img) / np.max(img))
for i in splicedGT:
    lbl = Image.open('SGTResized/' + str(i))
    Y.append(np.array(lbl) / np.max(lbl))

X = np.array(X)
Y = np.array(Y)
X = X.reshape(-1, 256, 256, 1)
Y = Y.reshape(-1, 256, 256, 1)
X_train, X_val, Y_train, Y_val = train_test_split(X, Y, test_size=0.2)
Segmenter Model
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, BatchNormalization, MaxPooling2D, UpSampling2D
from tensorflow.keras.optimizers import Adam
from tensorflow.keras.callbacks import ModelCheckpoint

model = Sequential()
# Encoder
model.add(Conv2D(filters=16, kernel_size=(3,3), padding='same',
                 activation='relu', input_shape=(256,256,1)))
model.add(BatchNormalization())
model.add(Conv2D(filters=16, kernel_size=(3,3), padding='same', activation='relu'))
model.add(BatchNormalization())
model.add(MaxPooling2D(pool_size=(2,2)))
model.add(Conv2D(filters=32, kernel_size=(3,3), padding='same', activation='relu'))
model.add(BatchNormalization())
model.add(Conv2D(filters=32, kernel_size=(3,3), padding='same', activation='relu'))
model.add(BatchNormalization())
model.add(MaxPooling2D(pool_size=(2,2)))
model.add(Conv2D(filters=64, kernel_size=(3,3), padding='same', activation='relu'))
model.add(BatchNormalization())
model.add(Conv2D(filters=64, kernel_size=(3,3), padding='same', activation='relu'))
model.add(BatchNormalization())
model.add(MaxPooling2D(pool_size=(2,2)))
model.add(Conv2D(filters=128, kernel_size=(3,3), padding='same', activation='relu'))
model.add(BatchNormalization())
model.add(Conv2D(filters=128, kernel_size=(3,3), padding='same', activation='relu'))
model.add(BatchNormalization())
model.add(MaxPooling2D(pool_size=(2,2)))
# Decoder
model.add(UpSampling2D(size=(2,2)))
model.add(Conv2D(filters=128, kernel_size=(3,3), padding='same', activation='relu'))
model.add(BatchNormalization())
model.add(Conv2D(filters=128, kernel_size=(3,3), padding='same', activation='relu'))
model.add(BatchNormalization())
model.add(UpSampling2D(size=(2,2)))
model.add(Conv2D(filters=64, kernel_size=(3,3), padding='same', activation='relu'))
model.add(BatchNormalization())
model.add(Conv2D(filters=64, kernel_size=(3,3), padding='same', activation='relu'))
model.add(BatchNormalization())
model.add(UpSampling2D(size=(2,2)))
model.add(Conv2D(filters=32, kernel_size=(3,3), padding='same', activation='relu'))
model.add(BatchNormalization())
model.add(Conv2D(filters=32, kernel_size=(3,3), padding='same', activation='relu'))
model.add(BatchNormalization())
model.add(UpSampling2D(size=(2,2)))
model.add(Conv2D(filters=16, kernel_size=(3,3), padding='same', activation='relu'))
model.add(BatchNormalization())
model.add(Conv2D(filters=16, kernel_size=(3,3), padding='same', activation='relu'))
model.add(BatchNormalization())
model.add(Conv2D(filters=1, kernel_size=(1,1), activation='sigmoid'))
model.summary()
Training
model.compile(optimizer = Adam(lr = 0.0001), loss = 'binary_crossentropy', metrics = ['accuracy'])
model_checkpoint = ModelCheckpoint('segmenter_weights.h5', monitor='loss',verbose=1, save_best_only=True)
model.fit(X_train, Y_train, validation_data = (X_val, Y_val), batch_size=4, epochs=200, verbose=1, callbacks=[PlotLossesCallback(),model_checkpoint])
Oops, I made a silly mistake. To see what I had picked for testing from the X array, I multiplied that array by 255, because PIL doesn't display arrays in the 0-1 range. By mistake, I then used that same modified variable and passed it to test/prediction.
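In other words, the scaling for display should happen on a copy, not on the array that is fed to the model. A hypothetical snippet of the safe pattern (the index i is just for illustration):

# Scale only a copy for display; keep the model input in the 0-1 range
display_img = (X_val[i] * 255).astype('uint8').squeeze()
Image.fromarray(display_img).show()

# Predict on the untouched 0-1 array
pred = model.predict(X_val[i:i+1])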
I am using this code for a CNN:
from tensorflow.keras.preprocessing.image import ImageDataGenerator

train_batches = ImageDataGenerator().flow_from_directory(
    'dice_sklearn/train',
    target_size=(IMG_WIDTH, IMG_HEIGHT),
    classes=['1', '2', '3', '4', '5', '6'],
    batch_size=cv_opt['batch'],
    color_mode="grayscale")
test_batches = ImageDataGenerator().flow_from_directory(
    'dice_sklearn/test',
    target_size=(IMG_WIDTH, IMG_HEIGHT),
    class_mode='categorical',
    batch_size=cv_opt['batch'],
    shuffle=False)
train_num = len(train_batches)
test_num = len(test_batches)
import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Dropout, Flatten, Dense
from tensorflow.keras.optimizers import Adam

model = Sequential([
    Conv2D(32, (3, 3), padding='same', activation='relu', input_shape=(IMG_WIDTH, IMG_HEIGHT, 1)),
    Conv2D(32, (3, 3), activation='relu'),
    MaxPooling2D(pool_size=(2, 2)),
    Dropout(0.30),
    Conv2D(64, (3, 3), padding='same', activation='relu'),
    Conv2D(64, (3, 3), activation='relu'),
    MaxPooling2D(pool_size=(2, 2)),
    Dropout(0.30),
    Conv2D(64, (3, 3), padding='same', activation='relu'),
    Conv2D(64, (3, 3), activation='relu'),
    MaxPooling2D(pool_size=(2, 2)),
    Dropout(0.30),
    Flatten(),
    Dense(512, activation='relu'),
    Dropout(0.5),
    Dense(6, activation='softmax'),
])
print(model.summary())

model.compile(Adam(lr=cv_opt['lr']), loss='categorical_crossentropy', metrics=['accuracy'])
history = model.fit(train_batches, steps_per_epoch=train_num,
                    epochs=cv_opt['epoch'], verbose=2)
model.save('cnn-keras.h5')

test_batches.reset()
prediction = model.predict(test_batches, steps=test_num, verbose=1)
predicted_class = np.argmax(prediction, axis=1)
classes = test_batches.classes[test_batches.index_array]
accuracy = (predicted_class == classes).mean()
print("Final accuracy:", accuracy * 100)
where:
- cv_opt['batch'] is set to 50
- cv_opt['lr'] is set to 0.0003
- cv_opt['epoch'] is set to 50
The output from the training phase (with model.fit) for the last epoch is:
192/192 [==============================] - 98s 510ms/step - loss: 0.0514 - accuracy: 0.9818 - val_loss: 0.0369 - val_accuracy: 0.9833
But when I run this part of code:
test_batches.reset()
prediction = model.predict(test_batches, steps=test_num, verbose=1)
predicted_class = np.argmax(prediction, axis=1)
classes = test_batches.classes[test_batches.index_array]
accuracy = (predicted_class == classes).mean()
print("Final accuracy:", accuracy * 100)
I get a very low accuracy score (0.16).
But if I plot the learning curves, I can see that both the test and validation curves (whether in testing or in parameter tuning) reach accuracies near 90%.
Am I using model.predict in the wrong way?
Your model is not overfitting. Steps 1 and 2 do not have to be implemented at all to solve your problem. In fact, that advice is even more wrong, since its author states that in case of overfitting you need to add more layers, which is strongly advised against: when a model is overfitting, it needs to be made simpler, not more complex.
The solution to your issue lies in Dr.Snoopy's answer: the order of the classes does not match.
My recommendation is to iterate manually through the entire test set, get the ground truth and the prediction for each image (making sure the exact same preprocessing used on the training images is applied to your test images) before feeding them to your model.
Then calculate your metrics. This will solve your problem.
For example, you could use the idea below:
import os
import cv2
import numpy as np

correctly_predicted = 0
for image in os.scandir(path_to_my_test_directory):
    image_path = image.path
    image = cv2.imread(image_path)
    image = apply_the_same_preprocessing_like_in_training(image)
    # Transform from (H, W, 3) to (1, H, W, 3) because TF + Keras predict on batches
    image = np.expand_dims(image, axis=0)
    prediction_label = np.argmax(model.predict(image))
    # ground_truth_label must be obtained per image, e.g. from its folder name or a labels file
    if prediction_label == ground_truth_label:
        correctly_predicted += 1
I am writing code to run an autoencoder on the CIFAR10 dataset and see the reconstructed images.
The requirement is to create:
Encoder, first layer
  - Input shape: (32, 32, 3)
  - Conv2D layer with 64 filters of (3,3)
  - BatchNormalization layer
  - ReLU activation
  - 2D MaxPooling layer with (2,2) filter
Encoder, second layer
  - Conv2D layer with 16 filters of (3,3)
  - BatchNormalization layer
  - ReLU activation
  - 2D MaxPooling layer with (2,2) filter
Final encoded output: MaxPooling with (2,2), on top of all previous layers
Decoder, first layer
  - Input shape: encoder output
  - Conv2D layer with 16 filters of (3,3)
  - BatchNormalization layer
  - ReLU activation
  - UpSampling2D with (2,2) filter
Decoder, second layer
  - Conv2D layer with 32 filters of (3,3)
  - BatchNormalization layer
  - ReLU activation
  - UpSampling2D with (2,2) filter
Final decoded output: Sigmoid, on top of all previous layers
I understand that when we create a convolutional autoencoder (or any AE), we need to pass the output of the previous layer to the next layer.
So, when I create the first Conv2D layer with ReLU and then apply BatchNormalization, I pass the Conv2D output to it, right?
But when I then apply MaxPooling2D, what should I pass: the BatchNormalization output or the Conv2D output?
Also, is there an order in which I should be performing these operations?
Conv2D --> BatchNormalization --> MaxPooling2D
OR
Conv2D --> MaxPooling2D --> BatchNormalization
I am attaching my code below. I have attempted it in two different ways and hence get different outputs (in terms of the model summary and also the model training graph).
Can someone please explain which is the correct method (Method 1 or Method 2)? Also, how do I tell which graph shows better model performance?
Method - 1
from tensorflow.keras.layers import Input, Conv2D, BatchNormalization, MaxPooling2D, UpSampling2D
from tensorflow.keras.models import Model

input_image = Input(shape=(32, 32, 3))
### Encoder
conv1_1 = Conv2D(64, (3, 3), activation='relu', padding='same')(input_image)
bnorm1_1 = BatchNormalization()(conv1_1)
mpool1_1 = MaxPooling2D((2, 2), padding='same')(conv1_1)
conv1_2 = Conv2D(16, (3, 3), activation='relu', padding='same')(mpool1_1)
borm1_2 = BatchNormalization()(conv1_2)
encoder = MaxPooling2D((2, 2), padding='same')(conv1_2)
### Decoder
conv2_1 = Conv2D(16, (3, 3), activation='relu', padding='same')(encoder)
bnorm2_1 = BatchNormalization()(conv2_1)
up1_1 = UpSampling2D((2, 2))(conv2_1)
conv2_2 = Conv2D(32, (3, 3), activation='relu', padding='same')(up1_1)
bnorm2_2 = BatchNormalization()(conv2_2)
up2_1 = UpSampling2D((2, 2))(conv2_2)
decoder = Conv2D(3, (3, 3), activation='sigmoid', padding='same')(up2_1)
model = Model(input_image, decoder)
model.compile(optimizer='adam', loss='binary_crossentropy')
model.summary()
history = model.fit(trainX, trainX,
                    epochs=50,
                    batch_size=1000,
                    shuffle=True,
                    verbose=2,
                    validation_data=(testX, testX))
As the output of the model summary, I get this:
Total params: 18,851
Trainable params: 18,851
Non-trainable params: 0
import matplotlib.pyplot as plt

plt.plot(history.history['loss'])
plt.plot(history.history['val_loss'])
plt.title('model loss')
plt.ylabel('loss')
plt.xlabel('epoch')
plt.legend(['train', 'test'], loc='upper right')
plt.show()
Method - 2
input_image = Input(shape=(32, 32, 3))
### Encoder
x = Conv2D(64, (3, 3), activation='relu', padding='same')(input_image)
x = BatchNormalization()(x)
x = MaxPooling2D((2, 2), padding='same')(x)
x = Conv2D(16, (3, 3), activation='relu', padding='same')(x)
x = BatchNormalization()(x)
encoder = MaxPooling2D((2, 2), padding='same')(x)
### Decoder
x = Conv2D(16, (3, 3), activation='relu', padding='same')(encoder)
x = BatchNormalization()(x)
x = UpSampling2D((2, 2))(x)
x = Conv2D(32, (3, 3), activation='relu', padding='same')(x)
x = BatchNormalization()(x)
x = UpSampling2D((2, 2))(x)
decoder = Conv2D(3, (3, 3), activation='sigmoid', padding='same')(x)
model = Model(input_image, decoder)
model.compile(optimizer='adam', loss='binary_crossentropy')
model.summary()
history = model.fit(trainX, trainX,
                    epochs=50,
                    batch_size=1000,
                    shuffle=True,
                    verbose=2,
                    validation_data=(testX, testX))
As the output of the model summary, I get this:
Total params: 19,363
Trainable params: 19,107
Non-trainable params: 256
plt.plot(history.history['loss'])
plt.plot(history.history['val_loss'])
plt.title('model loss')
plt.ylabel('loss')
plt.xlabel('epoch')
plt.legend(['train', 'test'], loc='upper right')
plt.show()
In Method 1, the BatchNormalization layers do not exist in the compiled model, as the outputs of these layers are not used anywhere. You can check this by running model.summary() on the Method 1 model.
Method 2 is perfectly alright.
Order of the operations:
Conv2D --> BatchNormalization --> MaxPooling2D is usually the common approach.
Though either order would work, since BatchNorm is just mean and variance normalization.
Edit:
For Conv2D --> BatchNormalization --> MaxPooling2D :
conv1_1 = Conv2D(64, (3, 3), activation='relu', padding='same')(input_image)
bnorm1_1 = BatchNormalization()(conv1_1)
mpool1_1 = MaxPooling2D((2, 2), padding='same')(bnorm1_1)
and then use mpool1_1 as the input for the next layer.
For Conv2D --> MaxPooling2D --> BatchNormalization:
conv1_1 = Conv2D(64, (3, 3), activation='relu', padding='same')(input_image)
mpool1_1 = MaxPooling2D((2, 2), padding='same')(conv1_1)
bnorm1_1 = BatchNormalization()(mpool1_1)
and then use bnorm1_1 as the input for the next layer.
To use the BatchNormalization layer effectively, you should use it before the activation.
Instead of:
conv1_1 = Conv2D(64, (3, 3), activation='relu', padding='same')(input_image)
bnorm1_1 = BatchNormalization()(conv1_1)
mpool1_1 = MaxPooling2D((2, 2), padding='same')(bnorm1_1)
Use it like this:
conv1_1 = Conv2D(64, (3, 3), padding='same')(input_image)
bnorm1_1 = BatchNormalization()(conv1_1)
act_1 = Activation('relu')(bnorm1_1)
mpool1_1 = MaxPooling2D((2, 2), padding='same')(act_1)
For more details, check here:
Where do I call the BatchNormalization function in Keras?
I have a pandas dataframe containing the filenames of positive and negative examples, as below:
img1 img2 y
001.jpg 002.jpg 1
003.jpg 004.jpg 0
003.jpg 002.jpg 1
I want to train my Siamese network using the Keras ImageDataGenerator and flow_from_dataframe. How do I set up my training so that the code inputs 2 images with 1 label simultaneously?
Below is the code for my model:
from tensorflow.keras.models import Sequential, Model
from tensorflow.keras.layers import (Input, Conv2D, BatchNormalization, MaxPooling2D,
                                     Flatten, Dense, Lambda)
from tensorflow.keras.preprocessing.image import ImageDataGenerator
from tensorflow.keras import backend as K

def siamese_model(input_shape):
    left = Input(input_shape)
    right = Input(input_shape)

    # Shared encoder applied to both inputs
    model = Sequential()
    model.add(Conv2D(32, (3,3), activation='relu', input_shape=input_shape))
    model.add(BatchNormalization())
    model.add(Conv2D(64, (3,3), activation='relu'))
    model.add(BatchNormalization())
    model.add(Conv2D(128, (3,3), activation='relu'))
    model.add(BatchNormalization())
    model.add(Conv2D(256, (3,3), activation='relu'))
    model.add(BatchNormalization())
    model.add(Conv2D(256, (3,3), activation='relu'))
    model.add(MaxPooling2D())
    model.add(BatchNormalization())
    model.add(Flatten())
    model.add(Dense(512, activation='sigmoid'))

    left_encoded = model(left)
    right_encoded = model(right)

    # L1 distance between the two embeddings
    L1_layer = Lambda(lambda tensors: K.abs(tensors[0] - tensors[1]))
    L1_distance = L1_layer([left_encoded, right_encoded])
    prediction = Dense(1, activation='sigmoid')(L1_distance)
    siamese_net = Model(inputs=[left, right], outputs=prediction)
    return siamese_net
model = siamese_model((224, 224, 3))
model.compile(loss="binary_crossentropy", optimizer="adam", metrics=['accuracy'])

datagen_left = ImageDataGenerator(rotation_range=10,
                                  width_shift_range=0.2,
                                  height_shift_range=0.2,
                                  shear_range=0.2,
                                  zoom_range=0.2,
                                  vertical_flip=True)
datagen_right = ImageDataGenerator(rotation_range=10,
                                   width_shift_range=0.2,
                                   height_shift_range=0.2,
                                   shear_range=0.2,
                                   zoom_range=0.2,
                                   vertical_flip=True)
Join the generators in a custom generator: make one of them output the desired labels and discard the label of the other.
from tensorflow.keras.utils import Sequence

class DoubleGenerator(Sequence):
    def __init__(self, gen1, gen2):
        self.gen1 = gen1
        self.gen2 = gen2

    def __len__(self):
        return len(self.gen1)

    def __getitem__(self, i):
        x1, y = self.gen1[i]
        x2, y2 = self.gen2[i]  # the second label is discarded
        return (x1, x2), y
Use it:
double_gen = DoubleGenerator(datagen_left.flow_from_directory(...),
                             datagen_right.flow_from_directory(...))
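Since the question uses a dataframe rather than class folders, here is a hedged sketch (untested; the dataframe df, the images/ directory, and the sizes are assumptions on my part) of how the same idea could be wired up with flow_from_dataframe. shuffle=False keeps the two streams row-aligned so each pair gets its correct label:

left_flow = datagen_left.flow_from_dataframe(
    df, directory='images/', x_col='img1', y_col='y',
    class_mode='raw',  # 'other' on older Keras versions
    target_size=(224, 224), batch_size=32, shuffle=False)
right_flow = datagen_right.flow_from_dataframe(
    df, directory='images/', x_col='img2', y_col='y',
    class_mode='raw',
    target_size=(224, 224), batch_size=32, shuffle=False)

double_gen = DoubleGenerator(left_flow, right_flow)
model.fit(double_gen, epochs=10)  # use fit_generator on older Keras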