Related
X and Y is of shape (89362, 5) and (89362,) repectively.
x_train, x_test, y_train, y_test = train_test_split(X, Y,
test_size = 0.3,
random_state = 1)
x_train.shape, y_train.shape = ((62553, 5), (62553,))
x_test.shape, y_test.shape = ((26809, 5), (26809,))
Reshaped the vectors to:
torch.Size([1, 62553, 5]), torch.Size([1, 62553])
torch.Size([1, 26809, 5]), torch.Size([1, 26809])
The model is defined as
n_steps = 62553
n_features = 5
model = Sequential()
model.add(Conv1D(filters=64, kernel_size=2, activation='relu', input_shape=(n_steps, n_features)))
model.add(MaxPooling1D(pool_size=2))
model.add(Flatten())
model.add(Dense(50, activation='relu'))
model.add(Dense(62553))
model.compile(optimizer='adam', loss='mse')
model.fit(x_train, y_train, epochs=10, verbose=0)
While predicting with x_test, it throws value error
yhat = model.predict(x_test, verbose=0)
print(yhat)
ValueError: Error when checking input: expected conv1d_4_input to have shape (62553, 5) but got array with shape torch.Size([26809, 5])
This is happening because you are specifying a fixed size here:
model.add(Conv1D(filters=64, kernel_size=2, activation='relu', input_shape=(n_steps, n_features)))
Once you pass something else to the model, the model is still expecting that fixed size with dimensions:
n_steps = 62553
n_features = 5
Removing the input_shape parameter should correct this issue:
model.add(Conv1D(filters=64, kernel_size=2, activation='relu'))
I hope that this helps you.
I am writing a code for running autoencoder on CIFAR10 dataset and see the reconstructed images.
The requirement is to create
Encoder with First Layer
Input shape: (32,32,3)
Conv2D Layer with 64 Filters of (3,3)
BatchNormalization layer
ReLu activation
2D MaxpoolingLayer with (2,2) filter
Encoder with Second Layer
Conv2D layer with 16 filters (3,3)
BatchNormalization layer
ReLu activation
2D MaxpoolingLayer with (2,2) filter
Final Encoded as MaxPool with (2,2) with all previous layers
Decoder with First Layer
Input shape: encoder output
Conv2D Layer with 16 Filters of (3,3)
BatchNormalization layer
ReLu activation
UpSampling2D with (2,2) filter
Decoder with Second Layer
Conv2D Layer with 32 Filters of (3,3)
BatchNormalization layer
ReLu activation
UpSampling2D with (2,2) filter
Final Decoded as Sigmoid with all previous layers
I understand that
When we are creating Convolutional Autoencoder (or any AE), we need to pass the output of the previous layer to the next layer.
So, when I create the first Conv2D layer with ReLu and then perform BatchNormalization .. in which I pass the Conv2D layer .. right?
But when I do MaxPooling2D .. what should I pass .. BatchNormalization output or Conv2D layer output?
Also, is there any order in which I should be performing these operations?
Conv2D --> BatchNormalization --> MaxPooling2D
OR
Conv2D --> MaxPooling2D --> BatchNormalization
I am attaching my code below ... I have attempted it to two different ways and hence getting different outputs (in terms of model summary and also model training graph)
Can someone please help me by explaining which is the correct method (Method-1 or Method-2)? Also, how do I understand which graph shows better model performance?
Method - 1
input_image = Input(shape=(32, 32, 3))
### Encoder
conv1_1 = Conv2D(64, (3, 3), activation='relu', padding='same')(input_image)
bnorm1_1 = BatchNormalization()(conv1_1)
mpool1_1 = MaxPooling2D((2, 2), padding='same')(conv1_1)
conv1_2 = Conv2D(16, (3, 3), activation='relu', padding='same')(mpool1_1)
borm1_2 = BatchNormalization()(conv1_2)
encoder = MaxPooling2D((2, 2), padding='same')(conv1_2)
### Decoder
conv2_1 = Conv2D(16, (3, 3), activation='relu', padding='same')(encoder)
bnorm2_1 = BatchNormalization()(conv2_1)
up1_1 = UpSampling2D((2, 2))(conv2_1)
conv2_2 = Conv2D(32, (3, 3), activation='relu', padding='same')(up1_1)
bnorm2_2 = BatchNormalization()(conv2_2)
up2_1 = UpSampling2D((2, 2))(conv2_2)
decoder = Conv2D(3, (3, 3), activation='sigmoid', padding='same')(up2_1)
model = Model(input_image, decoder)
model.compile(optimizer='adam', loss='binary_crossentropy')
model.summary()
history = model.fit(trainX, trainX,
epochs=50,
batch_size=1000,
shuffle=True,
verbose=2,
validation_data=(testX, testX)
)
As an output of the model summary, I get this
Total params: 18,851
Trainable params: 18,851
Non-trainable params: 0
plt.plot(history.history['loss'])
plt.plot(history.history['val_loss'])
plt.title('model loss')
plt.ylabel('loss')
plt.xlabel('epoch')
plt.legend(['train', 'test'], loc='upper right')
plt.show()
Method - 2
input_image = Input(shape=(32, 32, 3))
### Encoder
x = Conv2D(64, (3, 3), activation='relu', padding='same')(input_image)
x = BatchNormalization()(x)
x = MaxPooling2D((2, 2), padding='same')(x)
x = Conv2D(16, (3, 3), activation='relu', padding='same')(x)
x = BatchNormalization()(x)
encoder = MaxPooling2D((2, 2), padding='same')(x)
### Decoder
x = Conv2D(16, (3, 3), activation='relu', padding='same')(encoder)
x = BatchNormalization()(x)
x = UpSampling2D((2, 2))(x)
x = Conv2D(32, (3, 3), activation='relu', padding='same')(x)
x = BatchNormalization()(x)
x = UpSampling2D((2, 2))(x)
decoder = Conv2D(3, (3, 3), activation='sigmoid', padding='same')(x)
model = Model(input_image, decoder)
model.compile(optimizer='adam', loss='binary_crossentropy')
model.summary()
history = model.fit(trainX, trainX,
epochs=50,
batch_size=1000,
shuffle=True,
verbose=2,
validation_data=(testX, testX)
)
As an output of the model summary, I get this
Total params: 19,363
Trainable params: 19,107
Non-trainable params: 256
plt.plot(history.history['loss'])
plt.plot(history.history['val_loss'])
plt.title('model loss')
plt.ylabel('loss')
plt.xlabel('epoch')
plt.legend(['train', 'test'], loc='upper right')
plt.show()
In method 1, BatchNormalization layers does not exist in the compiled model, as the output of these layers are not used anywhere. You can check this by running model1.summary()
Method 2 is perfectly alright.
Order of the operations :
Conv2D --> BatchNormalization --> MaxPooling2D is usually the common approach.
Though either order would work since, since BatchNorm is just mean and variance normalization.
Edit:
For Conv2D --> BatchNormalization --> MaxPooling2D :
conv1_1 = Conv2D(64, (3, 3), activation='relu', padding='same')(input_image)
bnorm1_1 = BatchNormalization()(conv1_1)
mpool1_1 = MaxPooling2D((2, 2), padding='same')(bnorm1_1)
and then use mpool1_1 as input for next layer.
For Conv2D --> MaxPooling2D --> BatchNormalization:
conv1_1 = Conv2D(64, (3, 3), activation='relu', padding='same')(input_image)
mpool1_1 = MaxPooling2D((2, 2), padding='same')(conv1_1)
bnorm1_1 = BatchNormalization()(mpool1_1)
and then use bnorm1_1 as input for next layer.
To effectively use BatchNormalization layer, you should always use it before activation.
Instead of:
conv1_1 = Conv2D(64, (3, 3), activation='relu', padding='same')(input_image)
bnorm1_1 = BatchNormalization()(conv1_1)
mpool1_1 = MaxPooling2D((2, 2), padding='same')(bnorm1_1)
Use it like this:
conv1_1 = Conv2D(64, (3, 3), padding='same')(input_image)
bnorm1_1 = BatchNormalization()(conv1_1)
act_1 = Activation('relu')(bnorm1_1)
mpool1_1 = MaxPooling2D((2, 2), padding='same')(act_1)
For more details, check here:
Where do I call the BatchNormalization function in Keras?
I have a pandas dataframe containing filenames of positive and negative examples as below
img1 img2 y
001.jpg 002.jpg 1
003.jpg 004.jpg 0
003.jpg 002.jpg 1
I want to train my Siamese network using Keras ImageDataGenerator and flow_from_dataframe. How do I set up my training so that the code inputs 2 images with 1 label simultaneously.
Below is the code for my model
def siamese_model(input_shape) :
left = Input(input_shape)
right = Input(input_shape)
model = Sequential()
model.add(Conv2D(32, (3,3), activation='relu', input_shape=input_shape))
model.add(BatchNormalization())
model.add(Conv2D(64, (3,3), activation='relu'))
model.add(BatchNormalization())
model.add(Conv2D(128, (3,3), activation='relu'))
model.add(BatchNormalization())
model.add(Conv2D(256, (3,3), activation='relu')
model.add(BatchNormalization())
model.add(Conv2D(256, (3,3), activation='relu')
model.add(MaxPooling2D())
model.add(BatchNormalization())
model.add(Flatten())
model.add(Dense(512, activation='sigmoid'))
left_encoded = model(left)
right_encoded = model(right)
L1_layer = Lambda(lambda tensors:K.abs(tensors[0] - tensors[1]))
L1_distance = L1_layer([left_encoded, right_encoded])
prediction = Dense(1,activation='sigmoid')(L1_distance)
siamese_net = Model(inputs=[left,right],outputs=prediction)
return siamese_net
model = siamese_model((224,224,3))
model.compile(loss="binary_crossentropy",optimizer="adam", metrics=['accuracy'])
datagen_left = ImageDataGenerator(rotation_range=10,
width_shift_range=0.2,
height_shift_range=0.2,
shear_range=0.2,
zoom_range=0.2,
vertical_flip = True)
datagen_right = ImageDataGenerator(rotation_range=10,
width_shift_range=0.2,
height_shift_range=0.2,
shear_range=0.2,
zoom_range=0.2,
vertical_flip = True)
Join the generators in a custom generator.
Make one of them output the desired labels, discard the label of the other.
class DoubleGenerator(Sequence):
def __init__(self, gen1, gen2):
self.gen1 = gen1
self.gen2 = gen2
def __len__(self):
return len(self.gen1)
def __getitem__(self, i):
x1,y = self.gen1[i]
x2,y2 = self.gen2[i]
return (x1,x2), y
Use it:
double_gen = DoubleGenerator(datagen_left.flow_from_directory(...),
datagen_right.flow_from_directory(...))
I am trying to use U-net network architeture for stereo vision.
I have datasets with 3 different image sizes (1240x368, 1224x368 and 1384x1104).
Here is My whole class:
import pickle
from keras.models import Sequential
from keras.layers import Convolution2D, MaxPooling2D, UpSampling2D, Conv2DTranspose
from keras.utils import np_utils
import sys, numpy as np
import keras
import cv2
pkl_file = open('data.p', 'rb')
dict = pickle.load(pkl_file)
X_data = dict['images']
Y_data = dict['disparity']
data_num = len(X_data)
train_num = int(data_num * 0.8)
X_train = X_data[:train_num]
X_test = X_data[train_num:]
Y_train = Y_data[:train_num]
Y_test = Y_data[train_num:]
def gen(X, Y):
while True:
for x, y in zip(X, Y):
yield x, y
model = Sequential()
model.add(Convolution2D(6, (2, 2), input_shape=(None, None, 6), activation='relu', padding='same'))
model.add(Convolution2D(64, (3, 3), activation='relu'))
model.add(Convolution2D(64, (3, 3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Convolution2D(64, (3, 3), activation='relu'))
model.add(Convolution2D(128, (3, 3), activation='relu'))
model.add(Convolution2D(128, (3, 3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Convolution2D(128, (3, 3), activation='relu'))
model.add(Convolution2D(256, (3, 3), activation='relu'))
model.add(Convolution2D(256, (3, 3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(UpSampling2D(size=(2, 2)))
model.add(Conv2DTranspose(256, (3, 3), activation='relu'))
model.add(Conv2DTranspose(256, (3, 3), activation='relu'))
model.add(Conv2DTranspose(128, (3, 3), activation='relu'))
model.add(UpSampling2D(size=(2, 2)))
model.add(Conv2DTranspose(128, (3, 3), activation='relu'))
model.add(Conv2DTranspose(128, (3, 3), activation='relu'))
model.add(Conv2DTranspose(64, (3, 3), activation='relu'))
model.add(UpSampling2D(size=(2, 2)))
model.add(Conv2DTranspose(64, (3, 3), activation='relu'))
model.add(Conv2DTranspose(64, (3, 3), activation='relu'))
model.add(Conv2DTranspose(3, (3, 3), activation='relu'))
model.compile(loss=['mse'], optimizer='adam', metrics=['accuracy'])
model.fit_generator(gen(X_train, Y_train), steps_per_epoch=len(X_train), epochs=5)
scores = model.evaluate(X_test, Y_test, verbose=0)
When I try to run this code, I get an error in which it says:
Incompatible shapes: [1,370,1242,3] vs. [1,368,1240,3]
I resized the pictures to be divisible by 8 since I have 3 maxpool layers.
As input I put 2 images (I am doing stereo vision) and as an output I get disparity map for the first image. I am concatenating 2 images by putting the second one in third dimension (np.concatenate((img1,img2), axis=-1).
Can somebody tell me what I am doing wrong?
Here is my trace:
Traceback (most recent call last):
File "C:\Users\Ivan\AppData\Local\Programs\Python\Python36\lib\site-packages\tensorflow\python\client\session.py", line 1322, in _do_call
return fn(*args)
File "C:\Users\Ivan\AppData\Local\Programs\Python\Python36\lib\site-packages\tensorflow\python\client\session.py", line 1307, in _run_fn
options, feed_dict, fetch_list, target_list, run_metadata)
File "C:\Users\Ivan\AppData\Local\Programs\Python\Python36\lib\site-packages\tensorflow\python\client\session.py", line 1409, in _call_tf_sessionrun
run_metadata)
tensorflow.python.framework.errors_impl.InvalidArgumentError: Incompatible shapes: [1,370,1242,3] vs. [1,368,1240,3]
[[Node: loss/conv2d_transpose_9_loss/sub = Sub[T=DT_FLOAT, _class=["loc:#training/Adam/gradients/loss/conv2d_transpose_9_loss/sub_grad/Reshape"], _device="/job:localhost/replica:0/task:0/device:GPU:0"](conv2d_transpose_9/Relu-1-0-TransposeNCHWToNHWC-LayoutOptimizer, _arg_conv2d_transpose_9_target_0_2/_303)]]
[[Node: loss/mul/_521 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_2266_loss/mul", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"]()]]
I tried resizing pictures and learning works, but since as a result I get disparity maps, resizing is not a good option. Does anybody have any advice?
If the picture is too big to fit in conv2dTransponse, you can use Cropping2d layer so it crops the picture on wished size. This works if input picture has even number of pixels.
I am trying to rewrite a Sequential model of Network In Network CNN using Functional API. I use it with CIFAR-10 dataset. The Sequential model trains without a problem, but Functional API model gets stuck. I probably missed something when rewriting the model.
Here's a reproducible example:
Dependencies:
from keras.models import Model, Input, Sequential
from keras.layers import Conv2D, MaxPooling2D, GlobalAveragePooling2D, Dropout, Activation
from keras.utils import to_categorical
from keras.losses import categorical_crossentropy
from keras.optimizers import Adam
from keras.datasets import cifar10
Loading the dataset:
(x_train, y_train), (x_test, y_test) = cifar10.load_data()
x_train = x_train / 255.
x_test = x_test / 255.
y_train = to_categorical(y_train, num_classes=10)
y_test = to_categorical(y_test, num_classes=10)
input_shape = x_train[0,:,:,:].shape
Here's the working Sequential model:
model = Sequential()
#mlpconv block1
model.add(Conv2D(32, (5, 5), activation='relu',padding='valid',input_shape=input_shape))
model.add(Conv2D(32, (1, 1), activation='relu'))
model.add(Conv2D(32, (1, 1), activation='relu'))
model.add(MaxPooling2D((2,2)))
model.add(Dropout(0.5))
#mlpconv block2
model.add(Conv2D(64, (3, 3), activation='relu',padding='valid'))
model.add(Conv2D(64, (1, 1), activation='relu'))
model.add(Conv2D(64, (1, 1), activation='relu'))
model.add(MaxPooling2D((2,2)))
model.add(Dropout(0.5))
#mlpconv block3
model.add(Conv2D(128, (3, 3), activation='relu',padding='valid'))
model.add(Conv2D(32, (1, 1), activation='relu'))
model.add(Conv2D(10, (1, 1), activation='relu'))
model.add(GlobalAveragePooling2D())
model.add(Activation('softmax'))
Compile and train:
model.compile(loss=categorical_crossentropy, optimizer=Adam(), metrics=['acc'])
_ = model.fit(x=x_train, y=y_train, batch_size=32,
epochs=200, verbose=1,validation_split=0.2)
In three epochs the model gets close to 50% validation accuracy.
Here's the same model rewritten using Functional API:
model_input = Input(shape=input_shape)
#mlpconv block1
x = Conv2D(32, (5, 5), activation='relu',padding='valid')(model_input)
x = Conv2D(32, (1, 1), activation='relu')(x)
x = Conv2D(32, (1, 1), activation='relu')(x)
x = MaxPooling2D((2,2))(x)
x = Dropout(0.5)(x)
#mlpconv block2
x = Conv2D(64, (3, 3), activation='relu',padding='valid')(x)
x = Conv2D(64, (1, 1), activation='relu')(x)
x = Conv2D(64, (1, 1), activation='relu')(x)
x = MaxPooling2D((2,2))(x)
x = Dropout(0.5)(x)
#mlpconv block3
x = Conv2D(128, (3, 3), activation='relu',padding='valid')(x)
x = Conv2D(32, (1, 1), activation='relu')(x)
x = Conv2D(10, (1, 1), activation='relu')(x)
x = GlobalAveragePooling2D()(x)
x = Activation(activation='softmax')(x)
model = Model(model_input, x, name='nin_cnn')
This model is then compiled using the same parameters as the Sequential model. When trained, the training accuracy gets stuck at 0.10, meaning the model doesn't get better and randomly chooses one of 10 classes.
What did I miss when rewriting the model? When calling model.summary() the models look identical except for the explicit Input layer in the Functional API model.
Removing activation in the final conv layer solves the problem:
x = Conv2D(10, (1, 1))(x)
Still not sure why the Sequential model works fine with activation in that layer.