Spatio temporal 2D-CNN-LSTM - time-series

train input shape : (13974, 100, 6, 5)
train output shape : (13974, 1, 6, 5)
test input shape : (3494, 100, 6, 5)
test output shape : (3494, 1, 6, 5)
model = Sequential()
model.add(TimeDistributed(Conv2D(32, (3, 3),
padding='same'),
input_shape=(100, 6, 5,1)))
model.add(TimeDistributed(Activation('relu')))
model.add(TimeDistributed(Conv2D(32, (3, 3))))
model.add(TimeDistributed(Activation('relu')))
model.add(TimeDistributed(MaxPooling2D(pool_size=(2, 2))))
model.add(TimeDistributed(Dropout(0.25)))
model.add(TimeDistributed(Flatten()))
model.add(TimeDistributed(Dense(512)))
model.add(TimeDistributed(Dense(35, name="first_dense_flow" )))
model.add(LSTM(20, return_sequences=True, name="lstm_layer_flow"));
model.add(TimeDistributed(Dense(101), name=" time_distr_dense_one_ flow "))
model.add(GlobalAveragePooling1D(name="global_avg_flow"))
model.compile(loss='mae', optimizer='adam', metrics=['accuracy']) model.fit(train_input,train_output,epochs=50,batch_size=60)
I am trying to build a CNN-LSTM model capable of detecting the future. Input is 13974 sequence, each sequence consists of 100 time stamps, each containing 6 locations and 5 features (variables) and so the input is (13974,100,6,5) and the output is (13974,1,6,5).
How I can change my model so that the spatio temporal prediction can be done

Related

Keras wrong accuracy using model.predict

I am using this code for a CNN
train_batches = ImageDataGenerator().flow_from_directory('dice_sklearn/train', target_size=(IMG_WIDTH, IMG_HEIGHT),
classes=['1', '2', '3', '4', '5', '6'],
batch_size=cv_opt['batch'],
color_mode="grayscale")
test_batches = ImageDataGenerator().flow_from_directory('dice_sklearn/test', target_size=(IMG_WIDTH, IMG_HEIGHT),
class_mode='categorical',
batch_size=cv_opt['batch'],
shuffle=False)
train_num = len(train_batches)
test_num = len(test_batches)
model = Sequential([
Conv2D(32, (3, 3), padding='same', activation='relu', input_shape=(IMG_WIDTH, IMG_HEIGHT, 1)),
Conv2D(32, (3, 3), activation='relu'),
MaxPooling2D(pool_size=(2, 2)),
Dropout(0.30),
Conv2D(64, (3, 3), padding='same', activation='relu'),
Conv2D(64, (3, 3), activation='relu'),
MaxPooling2D(pool_size=(2, 2)),
Dropout(0.30),
Conv2D(64, (3, 3), padding='same', activation='relu'),
Conv2D(64, (3, 3), activation='relu'),
MaxPooling2D(pool_size=(2, 2)),
Dropout(0.30),
Flatten(),
Dense(512, activation='relu'),
Dropout(0.5),
Dense(6, activation='softmax'),
])
print(model.summary())
model.compile(Adam(lr=cv_opt['lr']), loss='categorical_crossentropy', metrics=['accuracy'])
history = model.fit(train_batches, steps_per_epoch=train_num,
epochs=cv_opt['epoch'], verbose=2)
model.save('cnn-keras.h5')
test_batches.reset()
prediction = model.predict(test_batches, steps=test_num, verbose=1)
predicted_class = np.argmax(prediction, axis=1)
classes = test_batches.classes[test_batches.index_array]
accuracy = (predicted_class == classes).mean()
print("Final accuracy:", accuracy * 100)
Where
cv_opt['batch'] is set to 50
cv_opt['lr'] is set to 0.0003
cv_opt['epoch'] is set to 50
The output from the training phase (with model.fit) on the last line (last epoch) returns:
192/192 [==============================] - 98s 510ms/step - loss: 0.0514 - accuracy: 0.9818 - val_loss: 0.0369 - val_accuracy: 0.9833
But when I run this part of code:
test_batches.reset()
prediction = model.predict(test_batches, steps=test_num, verbose=1)
predicted_class = np.argmax(prediction, axis=1)
classes = test_batches.classes[test_batches.index_array]
accuracy = (predicted_class == classes).mean()
print("Final accuracy:", accuracy * 100)
I get an accuracy score very very low: (0.16).
But if a plot the learning curves I can see that the test/validation curve (if in testing or in parameter tuning) both reach accuracies near 90%.
Am I using the model.predict in the wrong way?
Your model is not overfitting. Steps 1 and 2 do not have to be implemented at all in order to solve your problem. In fact, it is even more wrong since the author states that in case of overfitting you need to add more layers, which is strongly advised against: when one has an overfitting model, the model needs to be made simpler, not more complex.
The solution to your issue lies in #Dr.Snoopy's answer : the order of the classes do not match.
My recommendation is to iterate manually through the entire test set, get the ground truth, get the prediction (ensure the same exact preprocessing on images like in the training set is applied on your test set images) before you feed them to your model.
Then, calculate your metrics. This will solve your problem.
For example, you could use the idea below:
correctly_predicted = 0
for image in os.scandir(path_to_my_test_directory):
image_path = image.path
image = cv2.imread(image_path)
image = apply_the_same_preprocessing_like_in_training(image)
#transform from (H,W,3) to (1,H,W,3) because TF + Keras predict only on batches
image = np.expand_dims(image,axis=0)
prediction_label = np.argmax(model.predict(image))
if prediction_label == ground_truth_label:
correctly_predicted+=1

How to fix Error when checking input error?

I'm trying to train a GAN for generating samples of images and ground truths for a semantic segmentation task, however I'm getting an error regarding the shape of my input. From the errors it seems that it expects my input arrays are 4 dimensions however I believe that the shape I need (5 dimensions) is required for the problem at hand.
The shape of my tensors is (64, 2, 128, 128, 3)
64 for the images in the batch, 2 for either the image or ground truth, 128, 128 for the image dimension, and 3 for RGB channels.
My generator looks like this:
def build_generator(input_dim, output_size):
"""
生成器を構築
# 引数
input_dim : Integer, 入力ノイズの次元
output_size : List, 出力画像サイズ
# 戻り値
model : Keras model, 生成器
"""
model = Sequential()
model.add(Dense(256, input_dim=(input_dim)))
unit_size = 128 * output_size[0] // 8 * output_size[1] // 8
model.add(Dense(unit_size))
model.add(BatchNormalization())
model.add(LeakyReLU(alpha=0.2))
shape = (output_size[0] // 8, output_size[1] // 8, 128)
model.add(Reshape(shape))
model.add(UpSampling2D(size=(2, 2)))
model.add(Conv2D(64, (3, 3), padding='same'))
model.add(BatchNormalization())
model.add(LeakyReLU(alpha=0.2))
model.add(UpSampling2D(size=(2, 2)))
model.add(Conv2D(32, (3, 3), padding='same'))
model.add(BatchNormalization())
model.add(LeakyReLU(alpha=0.2))
model.add(UpSampling2D(size=(2, 2)))
model.add(Conv2D(3, (5, 5), padding='same'))
model.add(Activation('sigmoid'))
return model
ValueError: Error when checking input: expected conv2d_9_input to have 4 dimensions, but got array with shape (64, 2, 128, 128, 3)
Is there anywhere I can change so that my generator will accept these 5 dimensional array inputs?

Why implementation of Resnet50 in Keras forbids images smaller than 32x32x3?

I am trying to understand why the implementation of ResNet50 in Keras forbids images smaller than 32x32x3.
Based on their implementation: https://github.com/keras-team/keras-applications/blob/master/keras_applications/resnet50.py
The function that catches that is _obtain_input_shape
To overcome this problem, I made my own implementation based on their code and I removed the code that forbids minimal size. In my implementation I also add the possibility to work with pre-trained model with more than three channels by replicating the RGB weights for the first conv1 layer.
def ResNet50(load_weights=True,
input_shape=None,
pooling=None,
classes=1000):
img_input = Input(shape=input_shape, name='tuned_input')
x = ZeroPadding2D(padding=(3, 3), name='conv1_pad')(img_input)
# Stage 1 (conv1_x)
x = Conv2D(64, (7, 7),
strides=(2, 2),
padding='valid',
kernel_initializer=KERNEL_INIT,
name='tuned_conv1')(x)
x = BatchNormalization(axis=CHANNEL_AXIS, name='bn_conv1')(x)
x = Activation('relu')(x)
x = ZeroPadding2D(padding=(1, 1), name='pool1_pad')(x)
x = MaxPooling2D((3, 3), strides=(2, 2))(x)
# Stage 2 (conv2_x)
x = _convolution_block(x, 3, [64, 64, 256], stage=2, block='a', strides=(1, 1))
for block in ['b', 'c']:
x = _identity_block(x, 3, [64, 64, 256], stage=2, block=block)
# Stage 3 (conv3_x)
x = _convolution_block(x, 3, [128, 128, 512], stage=3, block='a')
for block in ['b', 'c', 'd']:
x = _identity_block(x, 3, [128, 128, 512], stage=3, block=block)
# Stage 4 (conv4_x)
x = _convolution_block(x, 3, [256, 256, 1024], stage=4, block='a')
for block in ['b', 'c', 'd', 'e', 'f']:
x = _identity_block(x, 3, [256, 256, 1024], stage=4, block=block)
# Stage 5 (conv5_x)
x = _convolution_block(x, 3, [512, 512, 2048], stage=5, block='a')
for block in ['b', 'c']:
x = _identity_block(x, 3, [512, 512, 2048], stage=5, block=block)
# Condition on the last layer
if pooling == 'avg':
x = layers.GlobalAveragePooling2D()(x)
elif pooling == 'max':
x = layers.GlobalMaxPooling2D()(x)
inputs = img_input
# Create model.
model = models.Model(inputs, x, name='resnet50')
if load_weights:
weights_path = get_file(
'resnet50_weights_tf_dim_ordering_tf_kernels_notop.h5',
WEIGHTS_PATH_NO_TOP,
cache_subdir='models',
md5_hash='a268eb855778b3df3c7506639542a6af')
model.load_weights(weights_path, by_name=True)
f = h5py.File(weights_path, 'r')
d = f['conv1']
# Used to work with more than 3 channels with pre-trained model
if input_shape[2] % 3 == 0:
model.get_layer('tuned_conv1').set_weights([d['conv1_W_1:0'][:].repeat(input_shape[2] / 3, axis=2),
d['conv1_b_1:0']])
else:
m = (3 * int(input_shape[2] / 3)) + 3
model.get_layer('tuned_conv1').set_weights(
[d['conv1_W_1:0'][:].repeat(m, axis=2)[:, :, 0:input_shape[2], :],
d['conv1_b_1:0']])
return model
I run my implementation with a 10x10x3 images and it seems to work. Thus I do not understand why they put this minimal bound.
They do not provide any information about this choice. I also check the original paper and I did not found any restriction mentioned about a minimal input shape. I suppose there is a reason for this bound but I do not know this one.
Thus I would like to know why such restriction has been done for the Resnet implementation.
ResNet50 has 5 stages of downsampling, between MaxPooling of 2x2 and Strided Convolution with strides of 2 px in each direction. This means that the minimum input size is 2^5 = 32, and this value is also the size of the receptive field.
There is not much point of using smaller images than 32x32, since then downsampling is not doing anything, and this will change the behavior of the network. For such small images then its better to use another network with less downsampling (like DenseNet) or with less depth.

Rewriting Sequential model using Functional API

I am trying to rewrite a Sequential model of Network In Network CNN using Functional API. I use it with CIFAR-10 dataset. The Sequential model trains without a problem, but Functional API model gets stuck. I probably missed something when rewriting the model.
Here's a reproducible example:
Dependencies:
from keras.models import Model, Input, Sequential
from keras.layers import Conv2D, MaxPooling2D, GlobalAveragePooling2D, Dropout, Activation
from keras.utils import to_categorical
from keras.losses import categorical_crossentropy
from keras.optimizers import Adam
from keras.datasets import cifar10
Loading the dataset:
(x_train, y_train), (x_test, y_test) = cifar10.load_data()
x_train = x_train / 255.
x_test = x_test / 255.
y_train = to_categorical(y_train, num_classes=10)
y_test = to_categorical(y_test, num_classes=10)
input_shape = x_train[0,:,:,:].shape
Here's the working Sequential model:
model = Sequential()
#mlpconv block1
model.add(Conv2D(32, (5, 5), activation='relu',padding='valid',input_shape=input_shape))
model.add(Conv2D(32, (1, 1), activation='relu'))
model.add(Conv2D(32, (1, 1), activation='relu'))
model.add(MaxPooling2D((2,2)))
model.add(Dropout(0.5))
#mlpconv block2
model.add(Conv2D(64, (3, 3), activation='relu',padding='valid'))
model.add(Conv2D(64, (1, 1), activation='relu'))
model.add(Conv2D(64, (1, 1), activation='relu'))
model.add(MaxPooling2D((2,2)))
model.add(Dropout(0.5))
#mlpconv block3
model.add(Conv2D(128, (3, 3), activation='relu',padding='valid'))
model.add(Conv2D(32, (1, 1), activation='relu'))
model.add(Conv2D(10, (1, 1), activation='relu'))
model.add(GlobalAveragePooling2D())
model.add(Activation('softmax'))
Compile and train:
model.compile(loss=categorical_crossentropy, optimizer=Adam(), metrics=['acc'])
_ = model.fit(x=x_train, y=y_train, batch_size=32,
epochs=200, verbose=1,validation_split=0.2)
In three epochs the model gets close to 50% validation accuracy.
Here's the same model rewritten using Functional API:
model_input = Input(shape=input_shape)
#mlpconv block1
x = Conv2D(32, (5, 5), activation='relu',padding='valid')(model_input)
x = Conv2D(32, (1, 1), activation='relu')(x)
x = Conv2D(32, (1, 1), activation='relu')(x)
x = MaxPooling2D((2,2))(x)
x = Dropout(0.5)(x)
#mlpconv block2
x = Conv2D(64, (3, 3), activation='relu',padding='valid')(x)
x = Conv2D(64, (1, 1), activation='relu')(x)
x = Conv2D(64, (1, 1), activation='relu')(x)
x = MaxPooling2D((2,2))(x)
x = Dropout(0.5)(x)
#mlpconv block3
x = Conv2D(128, (3, 3), activation='relu',padding='valid')(x)
x = Conv2D(32, (1, 1), activation='relu')(x)
x = Conv2D(10, (1, 1), activation='relu')(x)
x = GlobalAveragePooling2D()(x)
x = Activation(activation='softmax')(x)
model = Model(model_input, x, name='nin_cnn')
This model is then compiled using the same parameters as the Sequential model. When trained, the training accuracy gets stuck at 0.10, meaning the model doesn't get better and randomly chooses one of 10 classes.
What did I miss when rewriting the model? When calling model.summary() the models look identical except for the explicit Input layer in the Functional API model.
Removing activation in the final conv layer solves the problem:
x = Conv2D(10, (1, 1))(x)
Still not sure why the Sequential model works fine with activation in that layer.

Keras model predict train/test shape

I am training a CNN with Keras but with 30x30 patches from an image. I want to test the network with a full image but I get the following error:
ValueError: GpuElemwise. Input dimension mis-match. Input 2 (indices
start at 0) has shape[1] == 30, but the output's size on that axis is
100. Apply node that caused the error: GpuElemwise{Composite{((i0 + i1) - i2)}}[(0, 0)](GpuDimShuffle{0,2,3,1}.0, GpuReshape{4}.0,
GpuFromHost.0) Toposort index: 79 Inputs types:
[CudaNdarrayType(float32, 4D), CudaNdarrayType(float32, (True, True,
True, False)), CudaNdarrayType(float32, 4D)] Inputs shapes: [(10, 100,
100, 3), (1, 1, 1, 3), (10, 30, 30, 3)] Inputs strides: [(30000, 100,
1, 10000), (0, 0, 0, 1), (2700, 90, 3, 1)] Inputs values: ['not
shown', CudaNdarray([[[[ 0.01060364 0.00988821 0.00741314]]]]), 'not
shown'] Outputs clients:
[[GpuCAReduce{pre=sqr,red=add}{0,1,1,1}(GpuElemwise{Composite{((i0 +
i1) - i2)}}[(0, 0)].0)]]
This is my model.predict:
predict_image = model.predict(np.array([test_images[1]]), batch_size=1)[0]
It's seems like the issue is that the input size cannot be anything other than 30x30 but the first input shape for the first layer of my network is none, none, 3.
model.add(Convolution2D(n1, f1, f1, border_mode='same', input_shape=(None, None, 3), activation='relu'))
Is it simply not possible to test an image with different dimensions to the ones I trained with?
As fchollet himself described here, you should be able to define the input as so:
input_shape=(1, None, None)
However this will fail if you have layers that use the Flatten operation.
This suggests that you should be able to accomplish your goal with a fully convolutional NN.

Resources