Implementation of AlexNet in Keras on cifar-10 gives poor accuracy - machine-learning

I tried implementing AlexNet as explained in this video. Pardon me if I have implemented it wrong, this is the code for my implementation it in keras.
Edit : The cifar-10 ImageDataGenerator
cifar_generator = ImageDataGenerator()
cifar_data = cifar_generator.flow_from_directory('datasets/cifar-10/train',
batch_size=32,
target_size=input_size,
class_mode='categorical')
The Model described in Keras:
model = Sequential()
model.add(Convolution2D(filters=96, kernel_size=(11, 11), input_shape=(227, 227, 3), strides=4, activation='relu'))
model.add(MaxPool2D(pool_size=(3 ,3), strides=2))
model.add(Convolution2D(filters=256, kernel_size=(5, 5), strides=1, padding='same', activation='relu'))
model.add(MaxPool2D(pool_size=(3 ,3), strides=2))
model.add(Convolution2D(filters=384, kernel_size=(3, 3), strides=1, padding='same', activation='relu'))
model.add(Convolution2D(filters=384, kernel_size=(3, 3), strides=1, padding='same', activation='relu'))
model.add(Convolution2D(filters=256, kernel_size=(3, 3), strides=1, padding='same', activation='relu'))
model.add(MaxPool2D(pool_size=(3 ,3), strides=2))
model.add(Flatten())
model.add(Dense(units=4096))
model.add(Dense(units=4096))
model.add(Dense(units=10, activation='softmax'))
model.compile(optimizer = 'adam', loss = 'categorical_crossentropy', metrics = ['accuracy'])
I have used an ImageDataGenerator to train this network on the cifar-10 data set. However, I am only able to get an accuracy of about .20. I cannot figure out what I am doing wrong.

For starters, you need to extend the relu activation to your two intermediate dense layers, too; as they are now:
model.add(Dense(units=4096))
model.add(Dense(units=4096))
i.e. with linear activation (default), it can be shown that they are equivalent to a simple linear unit each (Andrew Ng devotes a whole lecture in his first course on the DL specialization explaining this). Change them to:
model.add(Dense(units=4096, activation='relu'))
model.add(Dense(units=4096, activation='relu'))
Check the SO thread Why must a nonlinear activation function be used in a backpropagation neural network?, as well as the AlexNet implementations here and here to confirm this.

Related

Before training the CNN network score on the testing data

I have a simple binary image classification CNN network. Below is the code
model = Sequential()
model.add(Conv2D(32, kernel_size=(3,3), padding='same',
kernel_initializer=gabor_init, input_shape=(32, 32, 1)))
model.add(BatchNormalization())
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2,2)))
model.add(Dropout(0.2))
model.add(Conv2D(32, kernel_size=(3,3), padding='same', kernel_initializer=gabor_init))
model.add(BatchNormalization())
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2,2)))
model.add(Dropout(0.2))
model.add(Conv2D(64, kernel_size=(3,3), padding='same'))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2,2)))
model.add(Dropout(0.2))
model.add(Flatten())
model.add(Dense(128,activation='relu'))
model.add(Dropout(0.4))
model.add(Dense(2,input_dim=128,activation='sigmoid'))
model.compile(loss='binary_crossentropy',optimizer='rmsprop',metrics=['accuracy'])
model.summary()
from sklearn.model_selection import train_test_split
trainX,testX,trainY,testY=train_test_split(Xdata,Ytarget,test_size=.3)
history=model.fit(trainX,trainY,epochs=70,batch_size = 64,
verbose = 1,validation_split=.3)
print(model.evaluate(testX,testY))
Here I am training the model then validating the model. My question is
I want to check the model on the test data before training; as I am using the Gabor Kernel Initializer, I want to see how this filter works before training. In that case, do I need to add `model.fit()? I am little confused.
Any suggestion or modification for the last part of the code so the model can be tested on test data before training?
After you have defined your model in keras, you are only required to compile it using the model.compile() in order to be able to invoke predictions on the initial untrained weights. model.fit() only updates the weights as the model is trained and does not contribute to any configuration setup.

How to get a good binary classification deep neural model where negative data is more on dataset?

I wanted to make a binary image classification using Cifar-10 dataset. Where I modified Cifar-10 such a way that class-0 as class-True(1) and all other class as class-False(0). Now there is only two classes in my dataset - True(1) and False(0).
while I am doing training using the following Keras model(Tensorflow as backend) I am getting almost 99% accuracy.
But in the test I am finding that all the False is predicted as False and all True are also predicted as False - and getting 99% accuracy.
But I do not wanted that all True are predicted as False.
I was expecting that all True are predicted as True.
How can I resolve this problem?
model = Sequential()
model.add(Conv2D(32, (3, 3), activation='relu', input_shape=(32, 32, 3)))
model.add(Conv2D(32, (3, 3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))
model.add(Conv2D(64, (3, 3), activation='relu'))
model.add(Conv2D(64, (3, 3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))
model.add(Flatten())
model.add(Dense(256, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(2, activation='softmax'))
model.compile(loss='sparse_categorical_crossentropy', optimizer='sgd', metrics=['accuracy'])
output=model.fit(x_train, y_train, batch_size=32, epochs=10)
You have a few options here:
Get more data with True label. However in most scenarios this is not easily possible.
Use only a small amount of the data that is labeled False. Maybe it is enough to train your model?
Use weights for the loss function during training. In Kerasyou can do this using the class_weight option of fit. The class True should have a higher weight than the class False in your example.
As mentioned in the comments this is a huge problem in the ML field. These are just a few very simple things you could try.

Run Multiple Keras Models In A Cluster Like OAR2

I want to build an algorithm like automl, but, don't know how to train multiple keras models simultaneously in a cluster like OAR2.
Assume that i have two different keras models like this :
model1 = Sequential()
model1.add(Conv2D(32, kernel_size=(3, 3),
activation='relu',
input_shape=input_shape))
model1.add(Conv2D(64, (3, 3), activation='relu'))
model1.add(MaxPooling2D(pool_size=(2, 2)))
model1.add(Dropout(0.25))
model1.add(Flatten())
model1.add(Dense(128, activation='relu'))
model1.add(Dropout(0.5))
model1.add(Dense(num_classes, activation='softmax'))
model2 = Sequential()
model2.add(Conv2D(32, kernel_size=(3, 3),
activation='relu',
input_shape=input_shape))
model2.add(Conv2D(128, (3, 3), activation='selu'))
model2.add(MaxPooling2D(pool_size=(2, 2)))
model2.add(Dropout(0.25))
model2.add(Flatten())
model2.add(Dense(128, activation='relu'))
model2.add(Dropout(0.5))
model2.add(Dense(num_classes, activation='softmax'))
How could train these two models simultaneously in a cluster?

Image classifier with Keras not converging

all. I am trying to build an image classifier with Keras (Tensorflow as backend). The objective is to separate memes from other images.
I am using the structure convolutional layers + fully connected layers with max pooling and dropouts.
The code is as following:
model = Sequential()
model.add(Conv2D(64, (3,3), activation='relu', input_shape=conv_input_shape))
model.add(Conv2D(64, (3,3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.5))
model.add(Conv2D(128, (3, 3), activation='relu'))
model.add(Conv2D(128, (3, 3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.5))
model.add(Flatten())
model.add(Dense(256, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(2, activation='softmax'))
sgd = SGD(lr=0.1, decay=1e-6, momentum=0.9, nesterov=True)
model.
compile(loss='categorical_crossentropy', optimizer=sgd, metrics=['accuracy'])
The input is a matrix of shape (n, 100, 100, 3). n RGB images with resolution 100 x 100, and output labels are [1, 0] for meme and [0, 1] otherwise.
However, when I train the model, the loss won't ever decrease from the first iteration.
Is there anything off in the code?
I am thinking that meme is actually not that different from other images in many ways except that some of them have some sort of captions together with some other features.
What are some better architectures to solve a problem like this?

Keras model.fit() - which training algorithm is used?

I am using Keras on top of Theano to create a MLP which I train and use to predict time series. Independently of the structure and depth of my network I cannot figure out (Keras documentation, StackOverflow, searching the net...) which training algorithm (Backpropagation,...) Keras' model.fit() function is using.
Within Theano (used without Keras before) I could define the way the parameters are adjusted myself with
self.train_step = theano.function(inputs=[u_in, t_in, lrate], outputs=[cost, y],
on_unused_input='warn',
updates=[(p, p - lrate * g) for p, g in zip(self.parameters, self.gradients)],
allow_input_downcast=True)
Not finding any information causes a certain fear that I am missing something essential and that this may be a totally stupid question.
Can anybody help me out here? Thanks a lot in advance.
Look at the example here:
...
model = Sequential()
model.add(Conv2D(32, kernel_size=(3, 3),
activation='relu',
input_shape=input_shape))
model.add(Conv2D(64, (3, 3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))
model.add(Flatten())
model.add(Dense(128, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(num_classes, activation='softmax'))
model.compile(loss=keras.losses.categorical_crossentropy,
optimizer=keras.optimizers.Adadelta(),
metrics=['accuracy'])
model.fit(x_train, y_train,
batch_size=batch_size,
epochs=epochs,
verbose=1,
validation_data=(x_test, y_test))
score = model.evaluate(x_test, y_test, verbose=0)
...
model.fit does not use an algorithm to predict the outcome, rather it uses the model you describe. The optimiser algorithm is then specified in model.compile
e.g.
model.compile(loss=keras.losses.categorical_crossentropy,
optimizer=**keras.optimizers.Adadelta()**,
metrics=['accuracy'])
You can find out more about the available optimisers here : https://keras.io/optimizers/

Resources