I built an autoencoder using my own data set of about 32k images. I did a 75/25 split for training/testing, and I was able to get results I'm happy with.
Now I want to extract the feature space and map it to every image in my dataset, as well as to new data that wasn't in the training or test sets. I couldn't find a tutorial online that delved into using the encoder as a feature extractor; all I could find was how to build the full network.
My code:
from keras.layers import Input, Conv2D, MaxPooling2D, UpSampling2D, Cropping2D
from keras.models import Model

input_img = Input(shape=(200, 200, 1))

# encoder part of the model (filter count increases after each layer)
x = Conv2D(16, (3, 3), activation='relu', padding='same')(input_img)
x = MaxPooling2D((2, 2), padding='same')(x)
x = Conv2D(32, (3, 3), activation='relu', padding='same')(x)
x = MaxPooling2D((2, 2), padding='same')(x)
x = Conv2D(64, (3, 3), activation='relu', padding='same')(x)
x = MaxPooling2D((2, 2), padding='same')(x)
x = Conv2D(128, (3, 3), activation='relu', padding='same')(x)
encoded = MaxPooling2D((2, 2), padding='same')(x)

# decoder part of the model (mirrors the encoder)
x = Conv2D(128, (3, 3), activation='relu', padding='same')(encoded)
x = UpSampling2D((2, 2))(x)
x = Conv2D(64, (3, 3), activation='relu', padding='same')(x)
x = UpSampling2D((2, 2))(x)
x = Conv2D(32, (3, 3), activation='relu', padding='same')(x)
x = UpSampling2D((2, 2))(x)
x = Conv2D(16, (3, 3), activation='relu', padding='same')(x)
x = UpSampling2D((2, 2))(x)
x = Conv2D(1, (3, 3), activation='sigmoid', padding='same')(x)
# crop the 208x208 output back to the 200x200 input size
decoded = Cropping2D(cropping=((8, 0), (8, 0)))(x)

autoencoder = Model(input_img, decoded)
autoencoder.compile(optimizer='adam', loss='binary_crossentropy')
autoencoder.summary()
Here's my net setup if anybody is interested:
Model: "model_22"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
input_23 (InputLayer) (None, 200, 200, 1) 0
_________________________________________________________________
conv2d_186 (Conv2D) (None, 200, 200, 16) 160
_________________________________________________________________
max_pooling2d_83 (MaxPooling (None, 100, 100, 16) 0
_________________________________________________________________
conv2d_187 (Conv2D) (None, 100, 100, 32) 4640
_________________________________________________________________
max_pooling2d_84 (MaxPooling (None, 50, 50, 32) 0
_________________________________________________________________
conv2d_188 (Conv2D) (None, 50, 50, 64) 18496
_________________________________________________________________
max_pooling2d_85 (MaxPooling (None, 25, 25, 64) 0
_________________________________________________________________
conv2d_189 (Conv2D) (None, 25, 25, 128) 73856
_________________________________________________________________
max_pooling2d_86 (MaxPooling (None, 13, 13, 128) 0
_________________________________________________________________
conv2d_190 (Conv2D) (None, 13, 13, 128) 147584
_________________________________________________________________
up_sampling2d_82 (UpSampling (None, 26, 26, 128) 0
_________________________________________________________________
conv2d_191 (Conv2D) (None, 26, 26, 64) 73792
_________________________________________________________________
up_sampling2d_83 (UpSampling (None, 52, 52, 64) 0
_________________________________________________________________
conv2d_192 (Conv2D) (None, 52, 52, 32) 18464
_________________________________________________________________
up_sampling2d_84 (UpSampling (None, 104, 104, 32) 0
_________________________________________________________________
conv2d_193 (Conv2D) (None, 104, 104, 16) 4624
_________________________________________________________________
up_sampling2d_85 (UpSampling (None, 208, 208, 16) 0
_________________________________________________________________
conv2d_194 (Conv2D) (None, 208, 208, 1) 145
_________________________________________________________________
cropping2d_2 (Cropping2D) (None, 200, 200, 1) 0
=================================================================
Total params: 341,761
Trainable params: 341,761
Non-trainable params: 0
Then my training:
autoencoder.fit(train, train,
                epochs=3,
                batch_size=128,
                shuffle=True,
                validation_data=(test, test))
My results:
Train on 23412 samples, validate on 7805 samples
Epoch 1/3
23412/23412 [==============================] - 773s 33ms/step - loss: 0.0620 - val_loss: 0.0398
Epoch 2/3
23412/23412 [==============================] - 715s 31ms/step - loss: 0.0349 - val_loss: 0.0349
Epoch 3/3
23412/23412 [==============================] - 753s 32ms/step - loss: 0.0314 - val_loss: 0.0319
I'd rather not share the images, but they look well reconstructed.
Thank you for all and any help!
I'm not sure I fully understand your question, but do you want to get the resulting feature space for every image you trained on, as well as for new images? If so, why not just do this?
Name the encoded layer in your autoencoder architecture 'embedding'. Then create the encoder the following way:
embedding_layer = autoencoder.get_layer(name='embedding').output
encoder = Model(input_img, embedding_layer)
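For example, applied to the autoencoder above (a sketch: 'embedding' is just the label we attach to the bottleneck layer, and new_images is a placeholder for any array of unseen 200x200x1 images):

# when building the encoder half, give the bottleneck layer a name
encoded = MaxPooling2D((2, 2), padding='same', name='embedding')(x)

# ... build and train the autoencoder exactly as before ...

# then wrap the trained encoder half in its own Model
embedding_layer = autoencoder.get_layer(name='embedding').output
encoder = Model(input_img, embedding_layer)

features_train = encoder.predict(train)       # shape (23412, 13, 13, 128), per the summary
features_new = encoder.predict(new_images)    # new_images is a placeholder for unseen data

Since the encoder shares its layers (and hence its trained weights) with the autoencoder, no retraining is needed.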
I am trying to get a neural network model like this:
input
|
hidden
/ \
hidden output2
|
output1
Here is a simple example in code:
model = Sequential()
model.add(Conv2D(32, kernel_size=(3, 3),
                 activation='relu',
                 input_shape=input_shape))
model.add(Conv2D(64, (3, 3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Flatten())  # from here I would like to add a new branch
model.add(Dense(128, activation='relu'))
model.add(Dense(num_classes, activation='softmax'))
How can I get the model I expect?
Sorry if this is a stupid question; I am a complete beginner in artificial intelligence.
You could use the Keras functional API instead of the Sequential API to get this done, as shown below:
from keras.models import Model
from keras.layers import Input
from keras.layers import Dense
from keras.layers import Flatten
from keras.layers.convolutional import Conv2D
from keras.layers.pooling import MaxPooling2D
num_classes = 10
inp= Input(shape=input_shape)
conv1 = Conv2D(32, kernel_size=(3,3), activation='relu')(inp)
conv2 = Conv2D(64, (3, 3), activation='relu')(conv1)
max_pool = MaxPooling2D(pool_size=(2, 2))(conv2)
flat = Flatten()(max_pool)
hidden1 = Dense(128, activation='relu')(flat)
output1 = Dense(num_classes, activation='softmax')(hidden1)
hidden2 = Dense(10, activation='relu')(flat)  # specify the number of hidden units
output2 = Dense(3, activation='softmax')(hidden2)  # specify the number of classes
model = Model(inputs=inp, outputs=[output1, output2])
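A multi-output model like this is compiled with one loss per output and trained by passing a list of target arrays. A minimal sketch, where the optimizer, the loss choices, and the x_train/y1/y2 names are assumptions:

model.compile(optimizer='adam',
              loss=['categorical_crossentropy', 'categorical_crossentropy'],
              loss_weights=[1.0, 1.0])  # relative weight of each head's loss

# y1 has num_classes columns and y2 has 3 columns (one-hot); both are placeholders
model.fit(x_train, [y1, y2], epochs=10, batch_size=32)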
Your network then looks like this:
Model: "model_1"
__________________________________________________________________________________________________
Layer (type) Output Shape Param # Connected to
==================================================================================================
input_7 (InputLayer) (None, 64, 256, 256) 0
__________________________________________________________________________________________________
conv2d_10 (Conv2D) (None, 62, 254, 32) 73760 input_7[0][0]
__________________________________________________________________________________________________
conv2d_11 (Conv2D) (None, 60, 252, 64) 18496 conv2d_10[0][0]
__________________________________________________________________________________________________
max_pooling2d_4 (MaxPooling2D) (None, 30, 126, 64) 0 conv2d_11[0][0]
__________________________________________________________________________________________________
flatten_4 (Flatten) (None, 241920) 0 max_pooling2d_4[0][0]
__________________________________________________________________________________________________
dense_6 (Dense) (None, 128) 30965888 flatten_4[0][0]
__________________________________________________________________________________________________
dense_8 (Dense) (None, 10) 2419210 flatten_4[0][0]
__________________________________________________________________________________________________
dense_7 (Dense) (None, 10) 1290 dense_6[0][0]
__________________________________________________________________________________________________
dense_9 (Dense) (None, 3) 33 dense_8[0][0]
==================================================================================================
Total params: 33,478,677
Trainable params: 33,478,677
Non-trainable params: 0
I'm attempting a single-label classification problem with num_classes = 73.
Here's my simplified Keras model:
num_classes = 73
batch_size = 4
train_data_list = [training_file_names list here..]
validation_data_list = [ validation_file_names list here..]
training_generator = DataGenerator(train_data_list, batch_size, num_classes)
validation_generator = DataGenerator(validation_data_list, batch_size, num_classes)
model = Sequential()
model.add(Conv1D(32, 3, strides=1, input_shape=(15, 120), activation="relu"))
model.add(Conv1D(16, 3, strides=1, activation="relu"))
model.add(Flatten())
model.add(Dense(num_classes, activation='softmax'))

sgd = SGD(lr=0.01, decay=1e-6, momentum=0.9, nesterov=True)
model.compile(loss="categorical_crossentropy", optimizer=sgd, metrics=['accuracy'])
model.fit_generator(generator=training_generator, epochs=100,
                    validation_data=validation_generator)
Here's my DataGenerator's __getitem__ method:

def __getitem__(self, index):
    X = np.zeros((self.batch_size, 15, 120))
    y = np.zeros((self.batch_size, 1, self.n_classes))
    for i in range(self.batch_size):
        X_row = some_method_that_gives_X_of_15x120_dim()
        target = some_method_that_gives_target()
        one_hot = keras.utils.to_categorical(target, num_classes=self.n_classes)
        X[i] = X_row
        y[i] = one_hot
    return X, y
Since my X values are correctly returned with shape (batch_size, 15, 120), I am not showing them here. My issue is with the returned y value.
The y returned from this generator has shape (batch_size, 1, 73), a one-hot encoding of the labels for the 73 classes, which I think is the correct shape to return.
However Keras gives the following error for the last layer:
ValueError: Error when checking target: expected dense_1 to have 2
dimensions, but got array with shape (4, 1, 73)
Since the batch size is 4, I think the target batch should also be 3-dimensional, (4, 1, 73). Why, then, is Keras expecting the target of the last layer to have 2 dimensions?
Your model's summary shows that the output layer has only 2 dimensions, (None, 73):
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
conv1d_7 (Conv1D)            (None, 13, 32)            11552
_________________________________________________________________
conv1d_8 (Conv1D)            (None, 11, 16)            1552
_________________________________________________________________
flatten_5 (Flatten)          (None, 176)               0
_________________________________________________________________
dense_4 (Dense)              (None, 73)                12921
=================================================================
Total params: 26,025
Trainable params: 26,025
Non-trainable params: 0
_________________________________________________________________
Since your target has shape (batch_size, 1, 73), you just need to change it to (batch_size, 73) for your model to run.
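For example, in the generator above it is enough to allocate y without the singleton axis (a sketch of the changed method, keeping the question's placeholder helper names):

def __getitem__(self, index):
    X = np.zeros((self.batch_size, 15, 120))
    y = np.zeros((self.batch_size, self.n_classes))  # (4, 73) instead of (4, 1, 73)
    for i in range(self.batch_size):
        X[i] = some_method_that_gives_X_of_15x120_dim()
        target = some_method_that_gives_target()
        # to_categorical on a scalar target returns a (n_classes,) vector
        y[i] = keras.utils.to_categorical(target, num_classes=self.n_classes)
    return X, y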
I'm trying to adapt the 2D convolutional autoencoder example from the Keras blog, https://blog.keras.io/building-autoencoders-in-keras.html,
to my own case, where I use 1D inputs:
from keras.layers import Input, Dense, Conv1D, MaxPooling1D, UpSampling1D
from keras.models import Model
from keras import backend as K
import scipy.io
import numpy as np

mat = scipy.io.loadmat('edata.mat')
emat = mat['edata']
input_img = Input(shape=(64,1)) # adapt this if using `channels_first` image data format
x = Conv1D(32, (9), activation='relu', padding='same')(input_img)
x = MaxPooling1D((4), padding='same')(x)
x = Conv1D(16, (9), activation='relu', padding='same')(x)
x = MaxPooling1D((4), padding='same')(x)
x = Conv1D(8, (9), activation='relu', padding='same')(x)
encoded = MaxPooling1D(4, padding='same')(x)
x = Conv1D(8, (9), activation='relu', padding='same')(encoded)
x = UpSampling1D((4))(x)
x = Conv1D(16, (9), activation='relu', padding='same')(x)
x = UpSampling1D((4))(x)
x = Conv1D(32, (9), activation='relu')(x)
x = UpSampling1D((4))(x)
decoded = Conv1D(1, (9), activation='sigmoid', padding='same')(x)
autoencoder = Model(input_img, decoded)
autoencoder.compile(optimizer='adadelta', loss='binary_crossentropy')
x_train = emat[:,0:80000]
x_train = np.reshape(x_train, (x_train.shape[1], 64, 1))
x_test = emat[:,80000:120000]
x_test = np.reshape(x_test, (x_test.shape[1], 64, 1))
from keras.callbacks import TensorBoard
autoencoder.fit(x_train, x_train,
                epochs=50,
                batch_size=128,
                shuffle=True,
                validation_data=(x_test, x_test),
                callbacks=[TensorBoard(log_dir='/tmp/autoencoder')])
However, I receive this error when I try to run autoencoder.fit():
ValueError: Error when checking target: expected conv1d_165 to have
shape (None, 32, 1) but got array with shape (80000, 64, 1)
I know I'm probably doing something wrong when I set up my layers; I just changed the MaxPooling and Conv2D sizes to a 1D form. I have very little experience with Keras or autoencoders. Can anyone see what I'm doing wrong?
Thanks
EDIT:
The error when I run it on a fresh console:
ValueError: Error when checking target: expected conv1d_7 to have
shape (None, 32, 1) but got array with shape (80000, 64, 1)
Here is the output of autoencoder.summary():
Layer (type)                 Output Shape              Param #
=================================================================
input_1 (InputLayer)         (None, 64, 1)             0
_________________________________________________________________
conv1d_1 (Conv1D)            (None, 64, 32)            320
_________________________________________________________________
max_pooling1d_1 (MaxPooling1 (None, 16, 32)            0
_________________________________________________________________
conv1d_2 (Conv1D)            (None, 16, 16)            4624
_________________________________________________________________
max_pooling1d_2 (MaxPooling1 (None, 4, 16)             0
_________________________________________________________________
conv1d_3 (Conv1D)            (None, 4, 8)              1160
_________________________________________________________________
max_pooling1d_3 (MaxPooling1 (None, 1, 8)              0
_________________________________________________________________
conv1d_4 (Conv1D)            (None, 1, 8)              584
_________________________________________________________________
up_sampling1d_1 (UpSampling1 (None, 4, 8)              0
_________________________________________________________________
conv1d_5 (Conv1D)            (None, 4, 16)             1168
_________________________________________________________________
up_sampling1d_2 (UpSampling1 (None, 16, 16)            0
_________________________________________________________________
conv1d_6 (Conv1D)            (None, 8, 32)             4640
_________________________________________________________________
up_sampling1d_3 (UpSampling1 (None, 32, 32)            0
_________________________________________________________________
conv1d_7 (Conv1D)            (None, 32, 1)             289
=================================================================
Total params: 12,785
Trainable params: 12,785
Non-trainable params: 0
_________________________________________________________________
Since the autoencoder output should reconstruct the input, a minimum requirement is that their dimensions should match, right?
Looking at your autoencoder.summary(), it is easy to confirm that this is not the case: your input is of shape (64,1), while the output of your last convolutional layer conv1d_7 is (32,1) (we ignore the None in the first dimension, since they refer to the batch size).
Let's have a look at the example in the Keras blog you link to (it is a 2D autoencoder, but the idea is the same):
from keras.layers import Input, Dense, Conv2D, MaxPooling2D, UpSampling2D
from keras.models import Model
from keras import backend as K
input_img = Input(shape=(28, 28, 1)) # adapt this if using `channels_first` image data format
x = Conv2D(16, (3, 3), activation='relu', padding='same')(input_img)
x = MaxPooling2D((2, 2), padding='same')(x)
x = Conv2D(8, (3, 3), activation='relu', padding='same')(x)
x = MaxPooling2D((2, 2), padding='same')(x)
x = Conv2D(8, (3, 3), activation='relu', padding='same')(x)
encoded = MaxPooling2D((2, 2), padding='same')(x)
# at this point the representation is (4, 4, 8) i.e. 128-dimensional
x = Conv2D(8, (3, 3), activation='relu', padding='same')(encoded)
x = UpSampling2D((2, 2))(x)
x = Conv2D(8, (3, 3), activation='relu', padding='same')(x)
x = UpSampling2D((2, 2))(x)
x = Conv2D(16, (3, 3), activation='relu')(x)
x = UpSampling2D((2, 2))(x)
decoded = Conv2D(1, (3, 3), activation='sigmoid', padding='same')(x)
autoencoder = Model(input_img, decoded)
autoencoder.compile(optimizer='adadelta', loss='binary_crossentropy')
Here is the result of autoencoder.summary() in this case:
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
input_1 (InputLayer)         (None, 28, 28, 1)         0
_________________________________________________________________
conv2d_1 (Conv2D)            (None, 28, 28, 16)        160
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 (None, 14, 14, 16)        0
_________________________________________________________________
conv2d_2 (Conv2D)            (None, 14, 14, 8)         1160
_________________________________________________________________
max_pooling2d_2 (MaxPooling2 (None, 7, 7, 8)           0
_________________________________________________________________
conv2d_3 (Conv2D)            (None, 7, 7, 8)           584
_________________________________________________________________
max_pooling2d_3 (MaxPooling2 (None, 4, 4, 8)           0
_________________________________________________________________
conv2d_4 (Conv2D)            (None, 4, 4, 8)           584
_________________________________________________________________
up_sampling2d_1 (UpSampling2 (None, 8, 8, 8)           0
_________________________________________________________________
conv2d_5 (Conv2D)            (None, 8, 8, 8)           584
_________________________________________________________________
up_sampling2d_2 (UpSampling2 (None, 16, 16, 8)         0
_________________________________________________________________
conv2d_6 (Conv2D)            (None, 14, 14, 16)        1168
_________________________________________________________________
up_sampling2d_3 (UpSampling2 (None, 28, 28, 16)        0
_________________________________________________________________
conv2d_7 (Conv2D)            (None, 28, 28, 1)         145
=================================================================
Total params: 4,385
Trainable params: 4,385
Non-trainable params: 0
It is easy to confirm that here the dimensions of the input and the output (last convolutional layer conv2d_7) are indeed both (28, 28, 1).
So, the summary() method is your friend when building autoencoders; you should experiment with the parameters until you are sure that you produce an output of the same dimensionality as your input. I managed to do so with your autoencoder simply by changing the size argument of the last UpSampling1D layer from 4 to 8:
input_img = Input(shape=(64,1))
x = Conv1D(32, (9), activation='relu', padding='same')(input_img)
x = MaxPooling1D((4), padding='same')(x)
x = Conv1D(16, (9), activation='relu', padding='same')(x)
x = MaxPooling1D((4), padding='same')(x)
x = Conv1D(8, (9), activation='relu', padding='same')(x)
encoded = MaxPooling1D(4, padding='same')(x)
x = Conv1D(8, (9), activation='relu', padding='same')(encoded)
x = UpSampling1D((4))(x)
x = Conv1D(16, (9), activation='relu', padding='same')(x)
x = UpSampling1D((4))(x)
x = Conv1D(32, (9), activation='relu')(x)
x = UpSampling1D((8))(x) ## <-- change here (was 4)
decoded = Conv1D(1, (9), activation='sigmoid', padding='same')(x)
autoencoder = Model(input_img, decoded)
autoencoder.compile(optimizer='adadelta', loss='binary_crossentropy')
In which case, the autoencoder.summary() becomes:
Layer (type)                 Output Shape              Param #
=================================================================
input_1 (InputLayer)         (None, 64, 1)             0
_________________________________________________________________
conv1d_1 (Conv1D)            (None, 64, 32)            320
_________________________________________________________________
max_pooling1d_1 (MaxPooling1 (None, 16, 32)            0
_________________________________________________________________
conv1d_2 (Conv1D)            (None, 16, 16)            4624
_________________________________________________________________
max_pooling1d_2 (MaxPooling1 (None, 4, 16)             0
_________________________________________________________________
conv1d_3 (Conv1D)            (None, 4, 8)              1160
_________________________________________________________________
max_pooling1d_3 (MaxPooling1 (None, 1, 8)              0
_________________________________________________________________
conv1d_4 (Conv1D)            (None, 1, 8)              584
_________________________________________________________________
up_sampling1d_1 (UpSampling1 (None, 4, 8)              0
_________________________________________________________________
conv1d_5 (Conv1D)            (None, 4, 16)             1168
_________________________________________________________________
up_sampling1d_2 (UpSampling1 (None, 16, 16)            0
_________________________________________________________________
conv1d_6 (Conv1D)            (None, 8, 32)             4640
_________________________________________________________________
up_sampling1d_3 (UpSampling1 (None, 64, 32)            0
_________________________________________________________________
conv1d_7 (Conv1D)            (None, 64, 1)             289
=================================================================
Total params: 12,785
Trainable params: 12,785
Non-trainable params: 0
with the dimensionality of your input and output matched, as it should be...
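If you prefer to automate that check rather than eyeball the summary, a small sketch using the model's input_shape and output_shape attributes (available on any single-input, single-output Keras model):

# sanity check: an autoencoder must reproduce its input shape
assert autoencoder.input_shape == autoencoder.output_shape, \
    "output shape {} does not match input shape {}".format(
        autoencoder.output_shape, autoencoder.input_shape)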
I am trying to create a model to fit data from the CIFAR-10 dataset. I have a working convolutional neural network from an example, but when I try to create a multilayer perceptron I keep getting a shape mismatch problem.
#https://gist.github.com/fchollet/0830affa1f7f19fd47b06d4cf89ed44d
#https://blog.keras.io/building-powerful-image-classification-models-using-very-little-data.html
from keras.preprocessing.image import ImageDataGenerator
from keras.models import Sequential
from keras.layers import Activation, Dropout, Flatten, Dense
from keras import backend as K
from keras.optimizers import RMSprop
# dimensions of our images.
img_width, img_height = 32, 32
train_data_dir = 'pro-data/train'
validation_data_dir = 'pro-data/test'
nb_train_samples = 2000
nb_validation_samples = 800
epochs = 50
batch_size = 16
if K.image_data_format() == 'channels_first':
    input_shape = (3, img_width, img_height)
else:
    input_shape = (img_width, img_height, 3)
model = Sequential()
model.add(Dense(512, activation='relu', input_shape=input_shape))
model.add(Dropout(0.2))
model.add(Dense(512, activation='relu'))
model.add(Dropout(0.2))
model.add(Dense(10, activation='softmax'))
model.summary()
model.compile(loss='categorical_crossentropy',
              optimizer=RMSprop(),
              metrics=['accuracy'])

# this is the augmentation configuration we will use for training
train_datagen = ImageDataGenerator(
    rescale=1./255,
    shear_range=0.2,
    zoom_range=0.2,
    horizontal_flip=True)

# this is the augmentation configuration we will use for testing:
# only rescaling
test_datagen = ImageDataGenerator(rescale=1./255)

train_generator = train_datagen.flow_from_directory(
    train_data_dir,
    target_size=(img_width, img_height),
    batch_size=batch_size,
    class_mode='binary')

validation_generator = test_datagen.flow_from_directory(
    validation_data_dir,
    target_size=(img_width, img_height),
    batch_size=batch_size,
    class_mode='binary')

model.fit_generator(
    train_generator,
    steps_per_epoch=nb_train_samples // batch_size,
    epochs=epochs,
    validation_data=validation_generator,
    validation_steps=nb_validation_samples // batch_size)

score = model.evaluate_generator(validation_generator, 1000)
print("Accuracy = ", score[1])
The error I am getting is this:
ValueError: Error when checking target: expected dense_3 to have 4 dimensions, but got array with shape (16, 1)
But if I change the input_shape of the input layer to the incorrect value "(784,)", I get this error:
ValueError: Error when checking input: expected dense_1_input to have 2 dimensions, but got array with shape (16, 32, 32, 3)
This is where I got a working CNN model using flow_from_directory:
https://gist.github.com/fchollet/0830affa1f7f19fd47b06d4cf89ed44d
In case anyone is curious, I am getting an accuracy of only 10% on CIFAR-10 with the convolutional neural network model. Pretty poor, I think.
According to your model definition, the model summary is:
dense_1 (Dense)              (None, 32, 32, 512)       2048
dropout_1 (Dropout)          (None, 32, 32, 512)       0
dense_2 (Dense)              (None, 32, 32, 512)       262656
dropout_2 (Dropout)          (None, 32, 32, 512)       0
dense_3 (Dense)              (None, 32, 32, 10)        5130
Total params: 269,834
Trainable params: 269,834
Non-trainable params: 0
Your output shape is (32, 32, 10), but in the CIFAR-10 dataset you want to classify into 10 labels, i.e. an output of shape (10,).
Try adding
model.add(Flatten())
before your last dense layer.
Now your model summary is:
Layer (type)                 Output Shape              Param #
dense_1 (Dense)              (None, 32, 32, 512)       2048
dropout_1 (Dropout)          (None, 32, 32, 512)       0
dense_2 (Dense)              (None, 32, 32, 512)       262656
dropout_2 (Dropout)          (None, 32, 32, 512)       0
flatten_1 (Flatten)          (None, 524288)            0
dense_3 (Dense)              (None, 10)                5242890
Total params: 5,507,594
Trainable params: 5,507,594
Non-trainable params: 0
Also, you've only used Dense and Dropout layers in your model. To get better accuracy you should look into the various CNN architectures, which consist of convolutional and pooling layers.
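As a starting point, a minimal architecture of that kind might look like this (a sketch, not tuned for CIFAR-10; it reuses input_shape and the imports from the question, plus Conv2D and MaxPooling2D):

from keras.layers import Conv2D, MaxPooling2D

model = Sequential()
# two conv/pool blocks extract spatial features before classification
model.add(Conv2D(32, (3, 3), activation='relu', input_shape=input_shape))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Conv2D(64, (3, 3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Flatten())
model.add(Dense(128, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(10, activation='softmax'))
model.compile(loss='categorical_crossentropy',
              optimizer=RMSprop(),
              metrics=['accuracy'])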