I am using Keras on top of Theano to create a MLP which I train and use to predict time series. Independently of the structure and depth of my network I cannot figure out (Keras documentation, StackOverflow, searching the net...) which training algorithm (Backpropagation,...) Keras' model.fit() function is using.
Within Theano (used without Keras before) I could define the way the parameters are adjusted myself with
self.train_step = theano.function(inputs=[u_in, t_in, lrate], outputs=[cost, y],
on_unused_input='warn',
updates=[(p, p - lrate * g) for p, g in zip(self.parameters, self.gradients)],
allow_input_downcast=True)
Not finding any information causes a certain fear that I am missing something essential and that this may be a totally stupid question.
Can anybody help me out here? Thanks a lot in advance.
Look at the example here:
...
model = Sequential()
model.add(Conv2D(32, kernel_size=(3, 3),
activation='relu',
input_shape=input_shape))
model.add(Conv2D(64, (3, 3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))
model.add(Flatten())
model.add(Dense(128, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(num_classes, activation='softmax'))
model.compile(loss=keras.losses.categorical_crossentropy,
optimizer=keras.optimizers.Adadelta(),
metrics=['accuracy'])
model.fit(x_train, y_train,
batch_size=batch_size,
epochs=epochs,
verbose=1,
validation_data=(x_test, y_test))
score = model.evaluate(x_test, y_test, verbose=0)
...
model.fit does not use an algorithm to predict the outcome, rather it uses the model you describe. The optimiser algorithm is then specified in model.compile
e.g.
model.compile(loss=keras.losses.categorical_crossentropy,
optimizer=**keras.optimizers.Adadelta()**,
metrics=['accuracy'])
You can find out more about the available optimisers here : https://keras.io/optimizers/
Related
I have a simple binary image classification CNN network. Below is the code
model = Sequential()
model.add(Conv2D(32, kernel_size=(3,3), padding='same',
kernel_initializer=gabor_init, input_shape=(32, 32, 1)))
model.add(BatchNormalization())
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2,2)))
model.add(Dropout(0.2))
model.add(Conv2D(32, kernel_size=(3,3), padding='same', kernel_initializer=gabor_init))
model.add(BatchNormalization())
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2,2)))
model.add(Dropout(0.2))
model.add(Conv2D(64, kernel_size=(3,3), padding='same'))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2,2)))
model.add(Dropout(0.2))
model.add(Flatten())
model.add(Dense(128,activation='relu'))
model.add(Dropout(0.4))
model.add(Dense(2,input_dim=128,activation='sigmoid'))
model.compile(loss='binary_crossentropy',optimizer='rmsprop',metrics=['accuracy'])
model.summary()
from sklearn.model_selection import train_test_split
trainX,testX,trainY,testY=train_test_split(Xdata,Ytarget,test_size=.3)
history=model.fit(trainX,trainY,epochs=70,batch_size = 64,
verbose = 1,validation_split=.3)
print(model.evaluate(testX,testY))
Here I am training the model then validating the model. My question is
I want to check the model on the test data before training; as I am using the Gabor Kernel Initializer, I want to see how this filter works before training. In that case, do I need to add `model.fit()? I am little confused.
Any suggestion or modification for the last part of the code so the model can be tested on test data before training?
After you have defined your model in keras, you are only required to compile it using the model.compile() in order to be able to invoke predictions on the initial untrained weights. model.fit() only updates the weights as the model is trained and does not contribute to any configuration setup.
I wanted to make a binary image classification using Cifar-10 dataset. Where I modified Cifar-10 such a way that class-0 as class-True(1) and all other class as class-False(0). Now there is only two classes in my dataset - True(1) and False(0).
while I am doing training using the following Keras model(Tensorflow as backend) I am getting almost 99% accuracy.
But in the test I am finding that all the False is predicted as False and all True are also predicted as False - and getting 99% accuracy.
But I do not wanted that all True are predicted as False.
I was expecting that all True are predicted as True.
How can I resolve this problem?
model = Sequential()
model.add(Conv2D(32, (3, 3), activation='relu', input_shape=(32, 32, 3)))
model.add(Conv2D(32, (3, 3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))
model.add(Conv2D(64, (3, 3), activation='relu'))
model.add(Conv2D(64, (3, 3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))
model.add(Flatten())
model.add(Dense(256, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(2, activation='softmax'))
model.compile(loss='sparse_categorical_crossentropy', optimizer='sgd', metrics=['accuracy'])
output=model.fit(x_train, y_train, batch_size=32, epochs=10)
You have a few options here:
Get more data with True label. However in most scenarios this is not easily possible.
Use only a small amount of the data that is labeled False. Maybe it is enough to train your model?
Use weights for the loss function during training. In Kerasyou can do this using the class_weight option of fit. The class True should have a higher weight than the class False in your example.
As mentioned in the comments this is a huge problem in the ML field. These are just a few very simple things you could try.
I tried implementing AlexNet as explained in this video. Pardon me if I have implemented it wrong, this is the code for my implementation it in keras.
Edit : The cifar-10 ImageDataGenerator
cifar_generator = ImageDataGenerator()
cifar_data = cifar_generator.flow_from_directory('datasets/cifar-10/train',
batch_size=32,
target_size=input_size,
class_mode='categorical')
The Model described in Keras:
model = Sequential()
model.add(Convolution2D(filters=96, kernel_size=(11, 11), input_shape=(227, 227, 3), strides=4, activation='relu'))
model.add(MaxPool2D(pool_size=(3 ,3), strides=2))
model.add(Convolution2D(filters=256, kernel_size=(5, 5), strides=1, padding='same', activation='relu'))
model.add(MaxPool2D(pool_size=(3 ,3), strides=2))
model.add(Convolution2D(filters=384, kernel_size=(3, 3), strides=1, padding='same', activation='relu'))
model.add(Convolution2D(filters=384, kernel_size=(3, 3), strides=1, padding='same', activation='relu'))
model.add(Convolution2D(filters=256, kernel_size=(3, 3), strides=1, padding='same', activation='relu'))
model.add(MaxPool2D(pool_size=(3 ,3), strides=2))
model.add(Flatten())
model.add(Dense(units=4096))
model.add(Dense(units=4096))
model.add(Dense(units=10, activation='softmax'))
model.compile(optimizer = 'adam', loss = 'categorical_crossentropy', metrics = ['accuracy'])
I have used an ImageDataGenerator to train this network on the cifar-10 data set. However, I am only able to get an accuracy of about .20. I cannot figure out what I am doing wrong.
For starters, you need to extend the relu activation to your two intermediate dense layers, too; as they are now:
model.add(Dense(units=4096))
model.add(Dense(units=4096))
i.e. with linear activation (default), it can be shown that they are equivalent to a simple linear unit each (Andrew Ng devotes a whole lecture in his first course on the DL specialization explaining this). Change them to:
model.add(Dense(units=4096, activation='relu'))
model.add(Dense(units=4096, activation='relu'))
Check the SO thread Why must a nonlinear activation function be used in a backpropagation neural network?, as well as the AlexNet implementations here and here to confirm this.
I wanna train a CNN using SVM to classify at the last layer. I understand that the categorical_hinge is the best loss function for that . I have 6 classes to classify .
My model is as shown below:
model = Sequential()
model.add(Conv2D(50, 3, 3, activation = 'relu', input_shape = train_data.shape[1:]))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Conv2D(50, 3, 3, activation = 'relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Conv2D(50, 3, 3, activation = 'relu'))
model.add(Flatten())
model.add(Dense(400, activation = 'relu'))
model.add(Dropout(0.5))
model.add(Dense(128, activation = 'relu'))
model.add(Dropout(0.5))
model.add(Dense(1, activation = 'sigmoid'))
Is there a problem with the network , data processing , or the loss function?
The model does not learn anything after a point as shown in the image
What should I do?
Your model has a single output neuron, there is no way this will work with 6 classes. The output of your model should have 6 neurons. Also the output of your model should have no activation function in order to produce logits that the categorical hinge can use.
Note that the categorical hinge was added recently (2-3 weeks ago) so its quite new and probably not many people have tested it.
Use hinge loss in and linear activation in last layer.
model.add(Dense(nb_classes), W_regularizer=l2(0.01))
model.add(Activation('linear'))
model.compile(loss='hinge',
optimizer='adadelta',
metrics=['accuracy'])
for more information visit https://github.com/keras-team/keras/issues/6090
I have a trained model that I've exported the weights and want to partially load into another model.
My model is built in Keras using TensorFlow as backend.
Right now I'm doing as follows:
model = Sequential()
model.add(Conv2D(32, (3, 3), input_shape=input_shape, trainable=False))
model.add(Activation('relu', trainable=False))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Conv2D(32, (3, 3), trainable=False))
model.add(Activation('relu', trainable=False))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Conv2D(64, (3, 3), trainable=True))
model.add(Activation('relu', trainable=True))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Flatten())
model.add(Dense(64))
model.add(Activation('relu'))
model.add(Dropout(0.5))
model.add(Dense(1))
model.add(Activation('sigmoid'))
model.compile(loss='binary_crossentropy',
optimizer='rmsprop',
metrics=['accuracy'])
model.load_weights("image_500.h5")
model.pop()
model.pop()
model.pop()
model.pop()
model.pop()
model.pop()
model.add(Conv2D(1, (6, 6),strides=(1, 1), trainable=True))
model.add(Activation('relu', trainable=True))
model.compile(loss='binary_crossentropy',
optimizer='rmsprop',
metrics=['accuracy'])
I'm sure it's a terrible way to do it, although it works.
How do I load just the first 9 layers?
If your first 9 layers are consistently named between your original trained model and the new model, then you can use model.load_weights() with by_name=True. This will update weights only in the layers of your new model that have an identically named layer found in the original trained model.
The name of the layer can be specified with the name keyword, for example:
model.add(Dense(8, activation='relu',name='dens_1'))
This call:
weights_list = model.get_weights()
will return a list of all weight tensors in the model, as Numpy arrays.
All what you have to do next is to iterate over this list and apply:
for i, weights in enumerate(weights_list[0:9]):
model.layers[i].set_weights(weights)
where model.layers is a flattened list of the layers comprising the model. In this case, you reload the weights of the first 9 layers.
More information is available here:
https://keras.io/layers/about-keras-layers/
https://keras.io/models/about-keras-models/