I'm trying to train a GAN to generate samples of images and ground truths for a semantic segmentation task, but I'm getting an error about the shape of my input. From the error it seems the model expects my input arrays to have 4 dimensions, whereas I believe the 5-dimensional shape I'm using is required for the problem at hand.
The shape of my tensors is (64, 2, 128, 128, 3):
64 for the images in the batch, 2 for either the image or the ground truth, 128 x 128 for the image dimensions, and 3 for the RGB channels.
My generator looks like this:
def build_generator(input_dim, output_size):
    """
    Build the generator.
    # Arguments
        input_dim : Integer, dimensionality of the input noise
        output_size : List, output image size
    # Returns
        model : Keras model, the generator
    """
    model = Sequential()
    model.add(Dense(256, input_dim=input_dim))
    unit_size = 128 * output_size[0] // 8 * output_size[1] // 8
    model.add(Dense(unit_size))
    model.add(BatchNormalization())
    model.add(LeakyReLU(alpha=0.2))
    shape = (output_size[0] // 8, output_size[1] // 8, 128)
    model.add(Reshape(shape))
    model.add(UpSampling2D(size=(2, 2)))
    model.add(Conv2D(64, (3, 3), padding='same'))
    model.add(BatchNormalization())
    model.add(LeakyReLU(alpha=0.2))
    model.add(UpSampling2D(size=(2, 2)))
    model.add(Conv2D(32, (3, 3), padding='same'))
    model.add(BatchNormalization())
    model.add(LeakyReLU(alpha=0.2))
    model.add(UpSampling2D(size=(2, 2)))
    model.add(Conv2D(3, (5, 5), padding='same'))
    model.add(Activation('sigmoid'))
    return model
ValueError: Error when checking input: expected conv2d_9_input to have 4 dimensions, but got array with shape (64, 2, 128, 128, 3)
Is there anything I can change so that my generator will accept these 5-dimensional array inputs?
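A minimal sketch of one possible workaround, assuming the 5-dimensional batch is a NumPy array (batch_data below is a placeholder name): Conv2D-based models only accept (batch, height, width, channels) input, so the image/ground-truth axis can either be split into two 4-dimensional arrays or stacked along the channel axis before being fed to the network; the second option would also require the generator's final Conv2D to output 6 channels instead of 3.

import numpy as np

# Placeholder batch standing in for the real data, shaped (64, 2, 128, 128, 3).
batch_data = np.random.rand(64, 2, 128, 128, 3).astype("float32")

# Option 1: slice axis 1 into image (index 0) and ground-truth mask (index 1).
images = batch_data[:, 0]  # shape (64, 128, 128, 3)
masks = batch_data[:, 1]   # shape (64, 128, 128, 3)

# Option 2: stack image and mask along the channel axis.
pairs = np.concatenate([images, masks], axis=-1)  # shape (64, 128, 128, 6)

print(images.shape, masks.shape, pairs.shape)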
Related
I am working with the KDDTrain+ dataset and trying to implement a CNN-LSTM model.
I have converted the given dataset to shape (125973, 121) and then reshaped it to (125973, 11, 11, 1), which I named "X_Train_new".
The following is the model I am trying to write for the CNN-LSTM:
model = Sequential()
model.add(TimeDistributed(Convolution2D(64, (3,3), strides=(1,1), padding="same", activation="relu"),input_shape=(10,11,11,1)))
model.add(TimeDistributed(MaxPooling2D(pool_size=(2,2), strides=(1,1), padding="same")))
model.add(TimeDistributed(Convolution2D(64,kernel_size=(3,3),strides=(1,1), padding="same",activation="relu")))
model.add(TimeDistributed(MaxPooling2D(pool_size=(2,2),strides=(1,1),padding="same")))
model.add(TimeDistributed(Flatten()))
model.add(LSTM(35,dropout=0.1,stateful=True, return_sequences=True))
model.add(Dropout(0.1))
model.add(LSTM(25,dropout=0.1))
model.add(Dropout(0.1))
model.add(Flatten())
model.add(Dense(256, activation="relu"))
model.add(Dropout(0.5))
model.add(Dense(5, activation="softmax"))
model.compile(optimizer ='adam',loss = 'sparse_categorical_crossentropy', metrics = ['accuracy'])
model.fit(X_Train_new, Y_Train, epochs = 10, batch_size = 32)
But I keep getting this error:
ValueError: Input 0 of layer "sequential_24" is incompatible with the layer:
expected shape=(None, 10, 11, 11, 1), found shape=(None, 11, 11, 1)
How can I solve this error?
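A minimal sketch of one way to resolve the shape mismatch, under the assumption that consecutive samples can be grouped into non-overlapping windows of 10 time steps (the 10 comes from the declared input_shape=(10, 11, 11, 1)); the windowing and the way each window is labelled are assumptions, not part of the original question. Note that stateful=True on the first LSTM would additionally require a fixed batch size.

import numpy as np

timesteps = 10
n_windows = X_Train_new.shape[0] // timesteps  # 125973 // 10 = 12597

# Group consecutive samples into windows of 10 so the data matches
# the (None, 10, 11, 11, 1) input the TimeDistributed layers expect.
X_seq = X_Train_new[:n_windows * timesteps].reshape(n_windows, timesteps, 11, 11, 1)

# One possible labelling (an assumption): use the label of the last
# sample in each window as the label of that window.
Y_seq = Y_Train[timesteps - 1:n_windows * timesteps:timesteps]

model.fit(X_seq, Y_seq, epochs=10, batch_size=32)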
I am writing code to run an autoencoder on the CIFAR10 dataset and see the reconstructed images.
The requirement is to create:
Encoder with First Layer
Input shape: (32,32,3)
Conv2D Layer with 64 Filters of (3,3)
BatchNormalization layer
ReLu activation
2D MaxpoolingLayer with (2,2) filter
Encoder with Second Layer
Conv2D layer with 16 filters (3,3)
BatchNormalization layer
ReLu activation
2D MaxpoolingLayer with (2,2) filter
Final Encoded as MaxPool with (2,2) with all previous layers
Decoder with First Layer
Input shape: encoder output
Conv2D Layer with 16 Filters of (3,3)
BatchNormalization layer
ReLu activation
UpSampling2D with (2,2) filter
Decoder with Second Layer
Conv2D Layer with 32 Filters of (3,3)
BatchNormalization layer
ReLu activation
UpSampling2D with (2,2) filter
Final Decoded as Sigmoid with all previous layers
I understand that when we are creating a convolutional autoencoder (or any AE), we need to pass the output of the previous layer to the next layer.
So, when I create the first Conv2D layer with ReLU and then perform BatchNormalization, I pass it the Conv2D layer's output, right?
But when I do MaxPooling2D, what should I pass: the BatchNormalization output or the Conv2D layer output?
Also, is there any order in which I should be performing these operations?
Conv2D --> BatchNormalization --> MaxPooling2D
OR
Conv2D --> MaxPooling2D --> BatchNormalization
I am attaching my code below. I have attempted it in two different ways and hence get different outputs (in terms of the model summary and also the model training graph).
Can someone please help me by explaining which is the correct method (Method-1 or Method-2)? Also, how do I tell which graph shows better model performance?
Method - 1
input_image = Input(shape=(32, 32, 3))
### Encoder
conv1_1 = Conv2D(64, (3, 3), activation='relu', padding='same')(input_image)
bnorm1_1 = BatchNormalization()(conv1_1)
mpool1_1 = MaxPooling2D((2, 2), padding='same')(conv1_1)
conv1_2 = Conv2D(16, (3, 3), activation='relu', padding='same')(mpool1_1)
borm1_2 = BatchNormalization()(conv1_2)
encoder = MaxPooling2D((2, 2), padding='same')(conv1_2)
### Decoder
conv2_1 = Conv2D(16, (3, 3), activation='relu', padding='same')(encoder)
bnorm2_1 = BatchNormalization()(conv2_1)
up1_1 = UpSampling2D((2, 2))(conv2_1)
conv2_2 = Conv2D(32, (3, 3), activation='relu', padding='same')(up1_1)
bnorm2_2 = BatchNormalization()(conv2_2)
up2_1 = UpSampling2D((2, 2))(conv2_2)
decoder = Conv2D(3, (3, 3), activation='sigmoid', padding='same')(up2_1)
model = Model(input_image, decoder)
model.compile(optimizer='adam', loss='binary_crossentropy')
model.summary()
history = model.fit(trainX, trainX,
epochs=50,
batch_size=1000,
shuffle=True,
verbose=2,
validation_data=(testX, testX)
)
As an output of the model summary, I get this
Total params: 18,851
Trainable params: 18,851
Non-trainable params: 0
plt.plot(history.history['loss'])
plt.plot(history.history['val_loss'])
plt.title('model loss')
plt.ylabel('loss')
plt.xlabel('epoch')
plt.legend(['train', 'test'], loc='upper right')
plt.show()
Method - 2
input_image = Input(shape=(32, 32, 3))
### Encoder
x = Conv2D(64, (3, 3), activation='relu', padding='same')(input_image)
x = BatchNormalization()(x)
x = MaxPooling2D((2, 2), padding='same')(x)
x = Conv2D(16, (3, 3), activation='relu', padding='same')(x)
x = BatchNormalization()(x)
encoder = MaxPooling2D((2, 2), padding='same')(x)
### Decoder
x = Conv2D(16, (3, 3), activation='relu', padding='same')(encoder)
x = BatchNormalization()(x)
x = UpSampling2D((2, 2))(x)
x = Conv2D(32, (3, 3), activation='relu', padding='same')(x)
x = BatchNormalization()(x)
x = UpSampling2D((2, 2))(x)
decoder = Conv2D(3, (3, 3), activation='sigmoid', padding='same')(x)
model = Model(input_image, decoder)
model.compile(optimizer='adam', loss='binary_crossentropy')
model.summary()
history = model.fit(trainX, trainX,
epochs=50,
batch_size=1000,
shuffle=True,
verbose=2,
validation_data=(testX, testX)
)
As an output of the model summary, I get this
Total params: 19,363
Trainable params: 19,107
Non-trainable params: 256
plt.plot(history.history['loss'])
plt.plot(history.history['val_loss'])
plt.title('model loss')
plt.ylabel('loss')
plt.xlabel('epoch')
plt.legend(['train', 'test'], loc='upper right')
plt.show()
In Method 1, the BatchNormalization layers do not exist in the compiled model, as their outputs are not used anywhere. You can check this by running model.summary().
Method 2 is perfectly alright.
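The parameter counts above reflect this: Method 2 reports 256 non-trainable parameters, which are the moving mean and variance of its four BatchNormalization layers (2 x (64 + 16 + 16 + 32) = 256), while Method 1 reports none because those layers were never connected to the output.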
Order of the operations :
Conv2D --> BatchNormalization --> MaxPooling2D is usually the common approach.
Though either order would work, since BatchNorm is just mean and variance normalization.
Edit:
For Conv2D --> BatchNormalization --> MaxPooling2D :
conv1_1 = Conv2D(64, (3, 3), activation='relu', padding='same')(input_image)
bnorm1_1 = BatchNormalization()(conv1_1)
mpool1_1 = MaxPooling2D((2, 2), padding='same')(bnorm1_1)
and then use mpool1_1 as input for next layer.
For Conv2D --> MaxPooling2D --> BatchNormalization:
conv1_1 = Conv2D(64, (3, 3), activation='relu', padding='same')(input_image)
mpool1_1 = MaxPooling2D((2, 2), padding='same')(conv1_1)
bnorm1_1 = BatchNormalization()(mpool1_1)
and then use bnorm1_1 as input for next layer.
To use the BatchNormalization layer effectively, you should always place it before the activation.
Instead of:
conv1_1 = Conv2D(64, (3, 3), activation='relu', padding='same')(input_image)
bnorm1_1 = BatchNormalization()(conv1_1)
mpool1_1 = MaxPooling2D((2, 2), padding='same')(bnorm1_1)
Use it like this:
conv1_1 = Conv2D(64, (3, 3), padding='same')(input_image)
bnorm1_1 = BatchNormalization()(conv1_1)
act_1 = Activation('relu')(bnorm1_1)
mpool1_1 = MaxPooling2D((2, 2), padding='same')(act_1)
For more details, check here:
Where do I call the BatchNormalization function in Keras?
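As an illustrative sketch (the helper name below is made up, not taken from the linked answer), the recommended Conv2D --> BatchNormalization --> Activation --> MaxPooling2D pattern can be wrapped in a small function and reused for each encoder block:

from keras.layers import Conv2D, BatchNormalization, Activation, MaxPooling2D

def conv_bn_relu_pool(x, filters):
    # Conv -> BatchNorm -> ReLU -> MaxPool, with BatchNorm before the activation.
    x = Conv2D(filters, (3, 3), padding='same')(x)
    x = BatchNormalization()(x)
    x = Activation('relu')(x)
    return MaxPooling2D((2, 2), padding='same')(x)

# Usage in the encoder:
# x = conv_bn_relu_pool(input_image, 64)
# encoder = conv_bn_relu_pool(x, 16)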
train input shape : (13974, 100, 6, 5)
train output shape : (13974, 1, 6, 5)
test input shape : (3494, 100, 6, 5)
test output shape : (3494, 1, 6, 5)
model = Sequential()
model.add(TimeDistributed(Conv2D(32, (3, 3),
padding='same'),
input_shape=(100, 6, 5,1)))
model.add(TimeDistributed(Activation('relu')))
model.add(TimeDistributed(Conv2D(32, (3, 3))))
model.add(TimeDistributed(Activation('relu')))
model.add(TimeDistributed(MaxPooling2D(pool_size=(2, 2))))
model.add(TimeDistributed(Dropout(0.25)))
model.add(TimeDistributed(Flatten()))
model.add(TimeDistributed(Dense(512)))
model.add(TimeDistributed(Dense(35, name="first_dense_flow")))
model.add(LSTM(20, return_sequences=True, name="lstm_layer_flow"))
model.add(TimeDistributed(Dense(101), name="time_distr_dense_one_flow"))
model.add(GlobalAveragePooling1D(name="global_avg_flow"))
model.compile(loss='mae', optimizer='adam', metrics=['accuracy'])
model.fit(train_input, train_output, epochs=50, batch_size=60)
I am trying to build a CNN-LSTM model capable of predicting future values. The input consists of 13974 sequences, each with 100 time stamps, each containing 6 locations and 5 features (variables), so the input is (13974, 100, 6, 5) and the output is (13974, 1, 6, 5).
How can I change my model so that this spatio-temporal prediction can be done?
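A minimal sketch of one possible starting point, assuming each of the 100 time steps is meant to be treated as a 6x5 single-channel "image": the declared input_shape=(100, 6, 5, 1) has a trailing channel axis that the (13974, 100, 6, 5) data lacks, so a channel dimension has to be added; how the (N, 1, 6, 5) targets are matched to the network's output (for example by flattening them and ending the model with a Dense(30) plus Reshape((1, 6, 5)) head) is an additional assumption, not something stated in the question.

import numpy as np

# Add the channel axis the model declares but the data lacks.
train_input_5d = np.expand_dims(train_input, axis=-1)  # (13974, 100, 6, 5, 1)

# The targets are (N, 1, 6, 5); flattening them to (N, 30) is one way to
# match them against a Dense(30) output head (an assumption, see above).
train_output_flat = train_output.reshape(len(train_output), -1)  # (13974, 30)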
I am doing image classification, but I got an error when calculating the accuracy. Please help me figure out how to do it.
This is my model:
model = Sequential()
model.add(Conv2D(32, (3, 3), input_shape=input_shape))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Conv2D(32, (3, 3)))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Conv2D(64, (3, 3)))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Flatten())
model.add(Dense(64))
model.add(Activation('relu'))
model.add(Dropout(0.5))
model.add(Dense(6))
model.add(Activation('softmax'))
I want to classify the images like this; these are my classification labels:
label_dict = {'0': 'buildings',
              '1': 'forest',
              '2': 'glacier',
              '3': 'mountain',
              '4': 'sea',
              '5': 'street'}
I am using categorical_crossentropy:
model.compile(loss=keras.losses.categorical_crossentropy,
optimizer=keras.optimizers.Adam(),
metrics=['accuracy'])
I am predicting classes:
pred=model.predict_classes(test)
I am calculating the test accuracy, but I get some errors:
print('Test loss:', pred[0])
print('Test accuracy:',pred[1])
Test loss: 5
---------------------------------------------------------------------------
IndexError Traceback (most recent call last)
<ipython-input-28-b74afa5e2da9> in <module>
1 print('Test loss:', pred[0])
----> 2 print('Test accuracy:',pred[1])
IndexError: index 1 is out of bounds for axis 0 with size 1
If the size of an array is n, the maximum index value is n-1.
So you can only access pred[0].
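A short sketch of how the test metrics are usually obtained instead, assuming the one-hot test labels are available in a variable (here called test_labels, a placeholder name): predict_classes returns predicted class indices rather than metrics, whereas model.evaluate returns the loss and the accuracy because the model was compiled with metrics=['accuracy'].

# evaluate returns [loss, accuracy] for this compile configuration.
score = model.evaluate(test, test_labels, verbose=0)
print('Test loss:', score[0])
print('Test accuracy:', score[1])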
I am trying to use the U-Net network architecture for stereo vision.
I have datasets with 3 different image sizes (1240x368, 1224x368 and 1384x1104).
Here is my whole script:
import pickle
from keras.models import Sequential
from keras.layers import Convolution2D, MaxPooling2D, UpSampling2D, Conv2DTranspose
from keras.utils import np_utils
import sys, numpy as np
import keras
import cv2
pkl_file = open('data.p', 'rb')
dict = pickle.load(pkl_file)
X_data = dict['images']
Y_data = dict['disparity']
data_num = len(X_data)
train_num = int(data_num * 0.8)
X_train = X_data[:train_num]
X_test = X_data[train_num:]
Y_train = Y_data[:train_num]
Y_test = Y_data[train_num:]
def gen(X, Y):
while True:
for x, y in zip(X, Y):
yield x, y
model = Sequential()
model.add(Convolution2D(6, (2, 2), input_shape=(None, None, 6), activation='relu', padding='same'))
model.add(Convolution2D(64, (3, 3), activation='relu'))
model.add(Convolution2D(64, (3, 3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Convolution2D(64, (3, 3), activation='relu'))
model.add(Convolution2D(128, (3, 3), activation='relu'))
model.add(Convolution2D(128, (3, 3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Convolution2D(128, (3, 3), activation='relu'))
model.add(Convolution2D(256, (3, 3), activation='relu'))
model.add(Convolution2D(256, (3, 3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(UpSampling2D(size=(2, 2)))
model.add(Conv2DTranspose(256, (3, 3), activation='relu'))
model.add(Conv2DTranspose(256, (3, 3), activation='relu'))
model.add(Conv2DTranspose(128, (3, 3), activation='relu'))
model.add(UpSampling2D(size=(2, 2)))
model.add(Conv2DTranspose(128, (3, 3), activation='relu'))
model.add(Conv2DTranspose(128, (3, 3), activation='relu'))
model.add(Conv2DTranspose(64, (3, 3), activation='relu'))
model.add(UpSampling2D(size=(2, 2)))
model.add(Conv2DTranspose(64, (3, 3), activation='relu'))
model.add(Conv2DTranspose(64, (3, 3), activation='relu'))
model.add(Conv2DTranspose(3, (3, 3), activation='relu'))
model.compile(loss=['mse'], optimizer='adam', metrics=['accuracy'])
model.fit_generator(gen(X_train, Y_train), steps_per_epoch=len(X_train), epochs=5)
scores = model.evaluate(X_test, Y_test, verbose=0)
When I try to run this code, I get an error that says:
Incompatible shapes: [1,370,1242,3] vs. [1,368,1240,3]
I resized the pictures to be divisible by 8 since I have 3 maxpool layers.
As input I put 2 images (I am doing stereo vision) and as output I get the disparity map for the first image. I am concatenating the 2 images by stacking the second one along the third (channel) dimension: np.concatenate((img1, img2), axis=-1).
Can somebody tell me what I am doing wrong?
Here is my trace:
Traceback (most recent call last):
File "C:\Users\Ivan\AppData\Local\Programs\Python\Python36\lib\site-packages\tensorflow\python\client\session.py", line 1322, in _do_call
return fn(*args)
File "C:\Users\Ivan\AppData\Local\Programs\Python\Python36\lib\site-packages\tensorflow\python\client\session.py", line 1307, in _run_fn
options, feed_dict, fetch_list, target_list, run_metadata)
File "C:\Users\Ivan\AppData\Local\Programs\Python\Python36\lib\site-packages\tensorflow\python\client\session.py", line 1409, in _call_tf_sessionrun
run_metadata)
tensorflow.python.framework.errors_impl.InvalidArgumentError: Incompatible shapes: [1,370,1242,3] vs. [1,368,1240,3]
[[Node: loss/conv2d_transpose_9_loss/sub = Sub[T=DT_FLOAT, _class=["loc:#training/Adam/gradients/loss/conv2d_transpose_9_loss/sub_grad/Reshape"], _device="/job:localhost/replica:0/task:0/device:GPU:0"](conv2d_transpose_9/Relu-1-0-TransposeNCHWToNHWC-LayoutOptimizer, _arg_conv2d_transpose_9_target_0_2/_303)]]
[[Node: loss/mul/_521 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_2266_loss/mul", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"]()]]
I tried resizing the pictures and training works, but since the result is a disparity map, resizing is not a good option. Does anybody have any advice?
If the picture coming out of the Conv2DTranspose layers is too big, you can use a Cropping2D layer to crop it to the desired size. This works if the input picture has an even number of pixels.
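A minimal sketch of that suggestion, based on the shapes in the error message ([1, 370, 1242, 3] produced vs. [1, 368, 1240, 3] expected): cropping one pixel from each border with a Cropping2D layer added after the last Conv2DTranspose brings the output down to the target size.

from keras.layers import Cropping2D

# ((top, bottom), (left, right)) cropping: 370 - 1 - 1 = 368, 1242 - 1 - 1 = 1240.
model.add(Cropping2D(cropping=((1, 1), (1, 1))))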