Use SMOTE to oversample image data - image-processing

I'm doing binary classification with CNNs and the data is imbalanced: the ratio of positive to negative medical images is 0.4 : 0.6. So I want to use SMOTE to oversample the positive medical images before training.
However, the data is 4-dimensional (761, 64, 64, 3), which causes the error:
Found array with dim 4. Estimator expected <= 2
So, I reshape my train_data:
X_res, y_res = smote.fit_sample(X_train.reshape(X_train.shape[0], -1), y_train.ravel())
And it works fine. Before feeding it to the CNN, I reshape it back with:
X_res = X_res.reshape(X_res.shape[0], 64, 64, 3)
Now, I'm not sure whether this is a correct way to oversample, and whether the reshape operation will change the images' structure.

I had a similar issue. I used the reshape function to reshape the images (basically flattening them):
X_train.shape
# (8000, 250, 250, 3)
ReX_train = X_train.reshape(8000, 250 * 250 * 3)
ReX_train.shape
# (8000, 187500)

from imblearn.over_sampling import SMOTE

smt = SMOTE()
Xs_train, ys_train = smt.fit_sample(ReX_train, y_train)  # fit_resample() in newer imblearn versions
Although this approach is painfully slow, it helped to improve the performance.

As soon as you flatten an image you are losing localized information; this is one of the reasons why convolutions are used in image-based machine learning.
8000x250x250x3 has an inherent meaning: 8000 samples of images, each of width 250 and height 250, all with 3 channels. When you reshape to 8000 x (250*250*3), it becomes just a bunch of numbers; unless you use some kind of sequence network, there is nothing left to learn that structure from, and that is a bad idea anyway.
Oversampling like this is a poor fit for image data; instead you can use image augmentations (crops, added noise, Gaussian blur, rotations, translations, etc.), as sketched below.
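For example, a minimal augmentation setup with Keras' ImageDataGenerator could look like the following sketch (X_train and y_train are placeholders for your own image array and labels):
from keras.preprocessing.image import ImageDataGenerator

# Random geometric perturbations applied on the fly to each batch
augmenter = ImageDataGenerator(
    rotation_range=15,       # small random rotations
    width_shift_range=0.1,   # horizontal translations
    height_shift_range=0.1,  # vertical translations
    zoom_range=0.1,          # random zoom (crop-like effect)
    horizontal_flip=True     # mirror images
)

# Yields augmented batches indefinitely; pass the generator to model.fit(...)
train_generator = augmenter.flow(X_train, y_train, batch_size=32)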

First, flatten the image data.
Apply SMOTE to the flattened image data and its labels.
Reshape the flattened images back to RGB images.
from imblearn.over_sampling import SMOTE

sm = SMOTE(random_state=42)

train_rows = len(X_train)
X_train = X_train.reshape(train_rows, -1)    # e.g. (80, 30000)

X_train, y_train = sm.fit_resample(X_train, y_train)
X_train = X_train.reshape(-1, 100, 100, 3)   # e.g. (>80, 100, 100, 3)
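As a self-contained illustration of the same flatten → resample → reshape workflow (the data here is synthetic, so the shapes and names are only for demonstration; note that flattening and reshaping back is lossless by itself):
import numpy as np
from imblearn.over_sampling import SMOTE

rng = np.random.default_rng(0)
X = rng.random((100, 64, 64, 3), dtype=np.float32)  # 100 fake RGB images
y = np.array([0] * 60 + [1] * 40)                   # imbalanced labels

X_flat = X.reshape(len(X), -1)                      # (100, 12288)
assert np.array_equal(X_flat.reshape(-1, 64, 64, 3), X)  # reshape round-trip is lossless

X_res, y_res = SMOTE(random_state=42).fit_resample(X_flat, y)
X_res = X_res.reshape(-1, 64, 64, 3)                # back to image shape

print(X_res.shape, np.bincount(y_res))              # classes are now balanced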

Related

What happens if I set input size to 32,32 for MNIST?

I want to train MNIST on VGG16.
MNIST images are 28*28 and I set the input size to 32*32 in the Keras VGG16. When I train I get good metrics, but I'm not sure what really happens. Is Keras filling in with empty space, or is the image being expanded linearly, like in a zoom function? Can anyone explain how I can get a test accuracy of over 95% after 60 epochs?
Here I define target size:
target_size = (32, 32)
This is where I define my flow_from_dataframe generator:
train_df = pd.read_csv("cv1_train.csv", quoting=3)

train_df_generator = train_image_datagen.flow_from_dataframe(
    dataframe=train_df,
    directory="../../../MNIST",
    target_size=target_size,
    class_mode='categorical',
    batch_size=batch_size,
    shuffle=False,
    color_mode="rgb",
    classes=["zero", "one", "two", "three", "four", "five", "six", "seven", "eight", "nine"]
)
Here I define my input size:
model_base = VGG16(weights=None, include_top=False,
                   input_shape=(32, 32, 3), classes=10)
The images would be simply resized to the specified target_size. This has been clearly stated in the documentation:
target_size: tuple of integers (height, width), default: (256, 256). The dimensions to which all images found will be resized.
You can also inspect the source code and find the relevant part in the load_img function. The default interpolation method used to resize the images is nearest. You can find more information about the various interpolation methods in the MATLAB or PIL documentation.
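In other words, the 28x28 digits are simply rescaled to 32x32 rather than padded with empty space. A small sketch of the equivalent PIL call (using a random array as a stand-in for a real digit):
import numpy as np
from PIL import Image

digit = np.random.randint(0, 256, size=(28, 28), dtype=np.uint8)  # fake 28x28 digit
img = Image.fromarray(digit)

# Nearest-neighbour interpolation duplicates pixels instead of padding
resized = img.resize((32, 32), resample=Image.NEAREST)
print(np.array(resized).shape)  # (32, 32)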

Use pretrained model with different input shape and class model

I am working on a classification problem using a CNN where my input image size is 64x64, and I want to use a pretrained model such as VGG16, COCO or any other. But the problem is that the input image size of the pretrained model is 224x224. How do I sort out this issue? Is there any data augmentation approach for the input image size?
If I resize my input image to 224x224 then there is a very high chance that the image will get blurred, and that may impact the training. Please correct me if I am wrong.
Another question is related to the pretrained model. If I am using transfer learning, generally how many layers do I have to freeze from the pretrained model, considering my classification is very different from the pretrained model's classes? I guess we can freeze the first few layers to capture the edges, curves, etc. of the images, which are common to all images.
But the problem is input image size of pretrained model is 224X224.
I assume you work with Keras/TensorFlow (it's the same for other DL frameworks). According to the docs for Keras Applications:
input_shape: optional shape tuple, only to be specified if include_top is False (otherwise the input shape has to be (224, 224, 3) (with 'channels_last' data format) or (3, 224, 224) (with 'channels_first' data format). It should have exactly 3 input channels, and width and height should be no smaller than 48. E.g. (200, 200, 3) would be one valid value.
So there are two options to solve your issue:
Resize your input image to 224*224 with an existing library and use the VGG classifier [include_top=True].
Train your own classifier on top of the VGG model. As mentioned in the Keras documentation above, if your image size is different from 224*224, you should train your own classifier [include_top=False]. You can do such things easily with:
import keras
from keras.applications.vgg19 import VGG19

inp = keras.layers.Input(shape=(64, 64, 3), name='image_input')
vgg_model = VGG19(weights='imagenet', include_top=False)
vgg_model.trainable = False

x = vgg_model(inp)                                    # run the frozen VGG base on the input
x = keras.layers.Flatten(name='flatten')(x)
x = keras.layers.Dense(512, activation='relu', name='fc1')(x)
x = keras.layers.Dense(512, activation='relu', name='fc2')(x)
x = keras.layers.Dense(10, activation='softmax', name='predictions')(x)

new_model = keras.models.Model(inputs=inp, outputs=x)
new_model.compile(optimizer='adam', loss='categorical_crossentropy',
                  metrics=['accuracy'])
If I am using transfer learning then generally how many layers do I have to freeze from the pretrained model?
It really depends on what your new task is, how many training examples you have, what your pretrained model is, and lots of other things. If I were you, I would first throw away the pretrained model's classifier. Then, if that did not work, I would also remove some of the convolutional layers, and proceed step by step until I got good performance.
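For example, a common starting point is to freeze the whole convolutional base and then selectively unfreeze the last block for fine-tuning. A rough sketch (layer names follow the standard Keras VGG16 naming; adjust the model and input size to your own setup):
from keras.applications.vgg16 import VGG16

base = VGG16(weights='imagenet', include_top=False, input_shape=(64, 64, 3))

# Start with the entire convolutional base frozen ...
for layer in base.layers:
    layer.trainable = False

# ... then unfreeze only the last convolutional block (block5_*) for fine-tuning
for layer in base.layers:
    if layer.name.startswith('block5'):
        layer.trainable = True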
The following code works for me for image size 128*128*3:
from keras.applications.vgg16 import VGG16
from keras.models import Model, Sequential
from keras.layers import Flatten

vgg_model = VGG16(include_top=False, weights='imagenet')
vgg_model.summary()

# Get the config dictionary for VGG16 and change the expected input shape
vgg_config = vgg_model.get_config()
vgg_config["layers"][0]["config"]["batch_input_shape"] = (None, 128, 128, 3)

# Rebuild the model from the modified config and copy over the pretrained weights
vgg_updated = Model.from_config(vgg_config)
vgg_updated.set_weights(vgg_model.get_weights())
vgg_updated.trainable = False

model = Sequential()
# Add the VGG convolutional base model
model.add(vgg_updated)
# A Flatten layer must be added before any Dense classifier layers
model.add(Flatten())

vgg_updated.summary()
model.summary()

keras vgg 16 shape error

I'm trying to fit data with the following shapes to the pretrained Keras VGG16 model.
image input shape is (32383, 96, 96, 3)
label shape is (32383, 17)
and I got this error:
expected block5_pool to have 4 dimensions, but got array with shape (32383, 17)
at this line:
model.fit(x=X_train, y=Y_train, validation_data=(X_valid, Y_valid),
          batch_size=64, verbose=2, epochs=epochs, callbacks=callbacks, shuffle=True)
Here's how I define my model
model = VGG16(include_top=False, weights='imagenet', input_tensor=None, input_shape=(96,96,3),classes=17)
How did max pooling give me a 2-D tensor and not a 4-D tensor? I'm using the original model from keras.applications.vgg16. How can I fix this error?
Your problem comes from VGG16(include_top=False, ...), as this makes your model load only the convolutional part of VGG. This is why Keras complains that it got a 2-dimensional array instead of a 4-dimensional one (the 4 dimensions come from the fact that the convolutional output has shape (nb_of_examples, width, height, channels)). To overcome this issue you need to either set include_top=True or add additional layers that squash the convolutional part to a 2-D one (e.g. using Flatten, GlobalMaxPooling2D, or GlobalAveragePooling2D, followed by a set of Dense layers, including a final Dense of size 17 with a softmax activation function).
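A rough sketch of the second option, assuming the shapes from your question (96x96x3 inputs, 17 classes):
from keras.applications.vgg16 import VGG16
from keras.layers import GlobalAveragePooling2D, Dense
from keras.models import Model

base = VGG16(include_top=False, weights='imagenet', input_shape=(96, 96, 3))

# Squash the 4-D convolutional output to 2-D and add a 17-way classifier
x = GlobalAveragePooling2D()(base.output)
out = Dense(17, activation='softmax')(x)

model = Model(inputs=base.input, outputs=out)
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])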

Output dimensions of convolutional layer with Keras

The Keras tutorial gives the following code example (with comments):
# apply a convolution 1d of length 3 to a sequence with 10 timesteps,
# with 64 output filters
model = Sequential()
model.add(Convolution1D(64, 3, border_mode='same', input_shape=(10, 32)))
# now model.output_shape == (None, 10, 64)
I am confused about the output size. Shouldn't it create 10 timesteps with a depth of 64 and a width of 32 (stride defaults to 1, no padding)? So (10, 32, 64) instead of (None, 10, 64)?
In k-dimensional convolution you have filters that preserve the structure of the first k dimensions and squash the information from all other dimensions by convolving them with the filter weights. So basically every filter in your network has dimensions (3 x 32), and all information from the last dimension (the one with size 32) is squashed to a single real number, with the first dimension preserved. This is the reason why you get a shape like this.
You could imagine a similar situation in the 2-D case when you have a colour image. Your input then has a 3-dimensional structure (picture_length, picture_width, colour). When you apply a 2-D convolution with respect to the first two dimensions, all information about colours is squashed by your filter and is not preserved in your output structure. The same happens here.
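You can verify the resulting shape yourself; a small sketch using the current Keras API (where border_mode has been renamed padding and Convolution1D is now Conv1D):
from tensorflow.keras import Sequential
from tensorflow.keras.layers import Conv1D

model = Sequential()
model.add(Conv1D(64, 3, padding='same', input_shape=(10, 32)))

# Each of the 64 filters spans 3 timesteps x all 32 input channels, so the
# channel dimension is consumed and replaced by the number of filters.
print(model.output_shape)  # (None, 10, 64)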

How to classify and localize noisy circles with convolutional neural networks [tensorflow]

I've got the task of finding and locating circles (markers) in medical pictures (X-ray) with a neural network.
The pictures (800x600, grayscale) are very noisy and contain structures like bones as well as the circles/markers.
First of all, I build a dataset of pictures for training the classifier.
From the 800x600 pictures I save the 28x28 areas which contain circles (label = Marker). The circles themselves have different intensity levels, and sometimes they are partly overlaid by bones. Furthermore, I save some areas which do not contain circles (label = NoMarker).
Each of these training images has the size 28x28.
(The database contains 100,000 pictures of size 28x28, half of them with a circle, the other half without.) I save all of these 28x28 images and labels in the same format as the MNIST dataset (idx3/idx1).
I started to build a classifier based on the TensorFlow mnist_softmax example and got an accuracy of about 90 percent.
The code looks like this:
import tensorflow as tf
from tensorflow.examples.tutorials.mnist.input_data import read_data_sets

# Read my datasets, code adapted from tensorflow.examples.tutorials.mnist:
# training images are [28x28] pixels, plus the labels
mnist = read_data_sets(FLAGS.data_dir, one_hot=True)
sess = tf.InteractiveSession()

# Create the model
x = tf.placeholder(tf.float32, [None, 784])
W = tf.Variable(tf.zeros([784, 10]))
b = tf.Variable(tf.zeros([10]))
y = tf.nn.softmax(tf.matmul(x, W) + b)

# Define loss and optimizer
y_ = tf.placeholder(tf.float32, [None, 10])
cross_entropy = tf.reduce_mean(-tf.reduce_sum(y_ * tf.log(y), reduction_indices=[1]))
train_step = tf.train.GradientDescentOptimizer(0.1).minimize(cross_entropy)

# Train
tf.initialize_all_variables().run()
for i in range(2000):
    batch_xs, batch_ys = mnist.train.next_batch(2000)
    train_step.run({x: batch_xs, y_: batch_ys})

# Test trained model
correct_prediction = tf.equal(tf.argmax(y, 1), tf.argmax(y_, 1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
print("Accuracy: " + str(accuracy.eval({x: mnist.test.images, y_: mnist.test.labels})))
After this, I try to localize the pixels belonging to circles.
For localization I take a sliding-window approach. That means I run over an 800x600 picture and always take the area in the sliding window (28x28) as input for my trained neural network. The result of the NN is the probability that this window contains a circle. These results are saved in a matrix/picture:
Result of localisation:
[Edited]: I have marked in red the circles which were correctly classified in the picture. The intensity of the gray level stands for the probability of being a circle (black pixel = 100% inside a circle, white pixel = 0%, not inside a circle). So in the best case there should only be red points and the rest of the picture should be white.
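For reference, a stripped-down sketch of the sliding-window scoring loop described above (NumPy only; predict_patch stands for whatever function returns the trained network's circle probability for a 28x28 patch, and the step size is arbitrary):
import numpy as np

def probability_map(image, predict_patch, win=28, step=4):
    """Slide a win x win window over a 2-D grayscale image and collect circle scores."""
    h, w = image.shape
    heatmap = np.zeros((h - win + 1, w - win + 1), dtype=np.float32)
    for r in range(0, h - win + 1, step):
        for c in range(0, w - win + 1, step):
            patch = image[r:r + win, c:c + win]
            heatmap[r, c] = predict_patch(patch)  # probability of the "Marker" class
    return heatmap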
It seems to me that the geometry/curves of the circles are not involved in the learning process.
I'm very new to the field of machine learning. Is there a good approach/model I could use for this task?
Best regards,
Kater
