Saliency maps in Keras in a network with an embedding layer - machine-learning

I want to implement saliency maps using an LSTM with an embedding layer in Keras. Essentially, I want to differentiate the output of the final layer with respect to the input. Here is the code where I define my model:
model = Sequential()
model.add(Embedding(max_words, 32, dropout=0.5))
model.add(LSTM(32, dropout_W=0.5, dropout_U=0.5))
model.add(Dense(1))
model.add(Activation('sigmoid'))
model.compile(loss='binary_crossentropy',
              optimizer='rmsprop',
              metrics=['accuracy'])
I would be grateful if someone could help with this task.
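One common workaround (a minimal sketch of my own, not from any answer here, assuming the Keras 1 backend API matching the code above): gradients cannot be taken with respect to the integer word indices, so the saliency is computed with respect to the output of the Embedding layer instead and then reduced to one score per token.
import numpy as np
from keras import backend as K

emb_output = model.layers[0].output      # embedding activations, shape (batch, timesteps, 32)
final_output = model.layers[-1].output   # sigmoid score, shape (batch, 1)
grads = K.gradients(final_output, emb_output)[0]

# learning_phase=0 switches the dropout layers to test mode
saliency_fn = K.function([model.input, K.learning_phase()], [grads])
grad_values = saliency_fn([x_batch, 0])[0]              # x_batch: padded (batch, timesteps) int array (assumed)
token_saliency = np.linalg.norm(grad_values, axis=-1)   # L2 norm over the embedding dimension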

Related

How to improve accuracy with keras multi class classification?

I am trying to do multi-class classification with tf.keras. I have 20 labels in total and 63952 data points in total, and I have tried the following code
features = features.astype(float)
labels = df_test["label"].values
encoder = LabelEncoder()
encoder.fit(labels)
encoded_Y = encoder.transform(labels)
dummy_y = np_utils.to_categorical(encoded_Y)
Then
def baseline_model():
    model = Sequential()
    model.add(Dense(50, input_dim=3, activation='relu'))
    model.add(Dense(40, activation='softmax'))
    model.add(Dense(30, activation='softmax'))
    model.add(Dense(20, activation='softmax'))
    model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
    return model
finally
history = model.fit(data, dummy_y,
                    epochs=5000,
                    batch_size=50,
                    validation_split=0.3,
                    shuffle=True,
                    callbacks=[ch]).history
I get very poor accuracy with this. How can I improve it?
Softmax activations in the intermediate layers do not make any sense at all. Change all of them to relu and keep softmax only in the last layer.
Having done that, should you still be getting unsatisfactory accuracy, experiment with different architectures (different numbers of layers and nodes) with a small number of epochs (say ~50), in order to get a feel for how your model behaves, before going for a full fit with your 5,000 epochs.
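For concreteness, a minimal sketch of the suggested change (mine, keeping the dimensions from the question): relu in the hidden layers and softmax only in the final 20-way layer.
def baseline_model():
    model = Sequential()
    model.add(Dense(50, input_dim=3, activation='relu'))   # hidden layers use relu
    model.add(Dense(40, activation='relu'))
    model.add(Dense(20, activation='softmax'))              # softmax only in the output layer
    model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
    return model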
You did not give us vital information, but here are some guidelines:
1. Reduce the number of Dense layers - you have a complicated model for a small amount of data (63k samples is somewhat small), so you might be overfitting the training data.
2. Did you check that the test set has the same distribution as your training set?
3. Avoid softmax in the middle Dense layers - softmax should only be used in the final layer; use sigmoid or relu instead.
4. Plot the loss as a function of epoch and check whether it decreases - this will tell you whether your learning rate is too high or too low.

Convert sklearn.svm SVC classifier to Keras implementation

I'm trying to convert some old code from using sklearn to Keras implementation. Since it is crucial to maintain the same way of operation, I want to understand if I'm doing it correctly.
I've converted most of the code already, however I'm having trouble with sklearn.svm SVC classifier conversion. Here is how it looks right now:
from sklearn.svm import SVC
model = SVC(kernel='linear', probability=True)
model.fit(X, Y_labels)
Super easy, right? However, I couldn't find an analog of the SVC classifier in Keras. So, what I've tried is this:
from keras.models import Sequential
from keras.layers import Dense
model = Sequential()
model.add(Dense(64, activation='relu'))
model.add(Dense(1, activation='softmax'))
model.compile(loss='squared_hinge',
optimizer='adadelta',
metrics=['accuracy'])
model.fit(X, Y_labels)
But, I think that it is not correct by any means. Could you, please, help me find an alternative of the SVC classifier from sklearn in Keras?
Thank you.
If you are making a classifier, you need the squared_hinge loss and a regularizer to get the complete SVM loss function, as can be seen here. So you will also need to split your last layer in two, so that the regularization parameter is added before the activation is applied; I have added the code here.
These changes should give you the output:
from keras.regularizers import l2
from keras.models import Sequential
from keras.layers import Dense, Activation

model = Sequential()
model.add(Dense(64, activation='relu'))
model.add(Dense(1, kernel_regularizer=l2(0.01)))
model.add(Activation('softmax'))
model.compile(loss='squared_hinge',
              optimizer='adadelta',
              metrics=['accuracy'])
model.fit(X, Y_labels)
Also, hinge is implemented in Keras for binary classification, so if you are working on a binary classification model, use the code below.
from keras.regularizers import l2
from keras.models import Sequential
from keras.layers import Dense, Activation

model = Sequential()
model.add(Dense(64, activation='relu'))
model.add(Dense(1, kernel_regularizer=l2(0.01)))
model.add(Activation('linear'))
model.compile(loss='hinge',
              optimizer='adadelta',
              metrics=['accuracy'])
model.fit(X, Y_labels)
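A side note of my own (not part of the answer above): with a linear output and hinge loss, the labels are usually encoded as -1/+1, and the predicted class is the sign of the raw model output, for example:
import numpy as np

scores = model.predict(X)                    # raw SVM-style decision values
predicted = np.where(scores >= 0, 1, -1)     # sign of the score gives the class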
If you cannot understand the article or have issues with the code, feel free to comment.
I had this same issue a while back, and this GitHub thread helped me understand it; maybe go through it too, as some of the ideas here come directly from it: https://github.com/keras-team/keras/issues/2588
If you are using Keras 2.0, then you need to change the following lines of anand v sing's answer:
W_regularizer -> kernel_regularizer
model.add(Dense(nb_classes, kernel_regularizer=regularizers.l2(0.0001)))
model.add(Activation('linear'))
model.compile(loss='squared_hinge',
optimizer='adadelta', metrics=['accuracy'])
Or you can use the following:
top_model = bottom_model.output
top_model = Flatten()(top_model)
top_model = Dropout(0.5)(top_model)
top_model = Dense(64, activation='relu')(top_model)
top_model = Dense(2, kernel_regularizer=l2(0.0001))(top_model)
top_model = Activation('linear')(top_model)
model = Model(bottom_model.input, top_model)
model.compile(loss='squared_hinge',
optimizer='adadelta', metrics=['accuracy'])
You can use an SVM with a Keras implementation using SciKeras. It is a Scikit-Learn API wrapper for Keras, first released in May 2020. Below I have attached the official documentation link for it; I hope you will find your answer there.
https://pypi.org/project/scikeras/#description
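For illustration only, here is a rough sketch (mine, not from the SciKeras documentation) of the wrapping mechanics it provides; it shows the scikit-learn-style fit/predict calls with an ordinary sigmoid classifier rather than the hinge-loss variant above, and the layer sizes and input shape are placeholders:
from tensorflow import keras
from scikeras.wrappers import KerasClassifier

def build_model():
    # input_shape is a placeholder; set it to the number of features in X
    model = keras.Sequential([
        keras.layers.Dense(64, activation='relu', input_shape=(30,)),
        keras.layers.Dense(1, activation='sigmoid'),
    ])
    return model

clf = KerasClassifier(model=build_model, loss='binary_crossentropy',
                      optimizer='adadelta', epochs=20, batch_size=32, verbose=0)
clf.fit(X, Y_labels)        # same call pattern as the sklearn SVC in the question
preds = clf.predict(X)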

Machine learning approach to facial recognition

First of all, I'm very new to the field, so maybe my question is a bit naive or even trivial.
I'm currently trying to understand how I can go about recognizing different faces.
Here is what I have tried so far and the main issues with each approach:
1) Haar Cascade -> HOG -> SVM:
The main issue is that the algorithm becomes very indecisive when more than 4 people are trained. The same occurs when we swap the Haar Cascade for a pre-trained CNN to detect faces.
2) dlib facial landmarks -> distance between points -> SVM or Simple Neural Network Classification:
This is the current approach and it behaves very well when 4 people are trained. When more people are trained it becomes very messy, jumping from decision to decision and never settling on a choice.
I've read online that triplet loss is the way to go, but I'm very confused as to how I'd go about implementing it. Can I use the current distance vectors found using dlib, or should I scrap everything and train my own CNN?
If I can use the distance vectors, how would I pass the data to the algorithm? Is triplet loss just an ordinary neural network with only its loss function altered?
I've taken the liberty of showing exactly how the distance vectors are being calculated:
The green lines represent the distances being calculated.
A list of 33 floats is returned, which is then fed to the classifier.
Here is the relevant code for the classifier (Keras):
def fit_classifier(self):
    x_train, y_train = self._get_data(self.train_data_path)
    x_test, y_test = self._get_data(self.test_data_path)

    encoding_train_y = np_utils.to_categorical(y_train)
    encoding_test_y = np_utils.to_categorical(y_test)

    model = Sequential()
    model.add(Dense(10, input_dim=33, activation='relu'))
    model.add(Dense(20, activation='relu'))
    model.add(Dense(30, activation='relu'))
    model.add(Dense(40, activation='relu'))
    model.add(Dense(30, activation='relu'))
    model.add(Dense(20, activation='relu'))
    model.add(Dense(10, activation='relu'))
    model.add(Dense(max(y_train)+1, activation='softmax'))
    model.compile(loss='mse', optimizer='adam', metrics=['accuracy'])
    model.fit(x_train, encoding_train_y, epochs=100, batch_size=10)
I think this is a more theoretical question than anything else. If someone with good experience in the field could help me out, I'd be very happy!
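For reference, a rough sketch (mine, with assumed layer sizes) of what a triplet-loss setup can look like on top of the 33-float distance vectors: the embedding network itself stays an ordinary feed-forward net, and only the training objective changes, pulling an anchor and a positive (same person) example together while pushing a negative (different person) example away.
import keras.backend as K
from keras.layers import Input, Dense, Lambda, concatenate
from keras.models import Model, Sequential

def build_embedder(input_dim=33, emb_dim=16):
    # maps one 33-float landmark-distance vector to a small, L2-normalised embedding
    return Sequential([
        Dense(32, activation='relu', input_dim=input_dim),
        Dense(emb_dim),
        Lambda(lambda x: K.l2_normalize(x, axis=-1)),
    ])

def triplet_loss(_, y_pred, margin=0.2, emb_dim=16):
    a = y_pred[:, :emb_dim]              # anchor embedding
    p = y_pred[:, emb_dim:2*emb_dim]     # positive embedding (same person)
    n = y_pred[:, 2*emb_dim:]            # negative embedding (different person)
    pos = K.sum(K.square(a - p), axis=-1)
    neg = K.sum(K.square(a - n), axis=-1)
    return K.mean(K.maximum(pos - neg + margin, 0.0))

embedder = build_embedder()
anchor, positive, negative = Input((33,)), Input((33,)), Input((33,))
merged = concatenate([embedder(anchor), embedder(positive), embedder(negative)])
trainer = Model([anchor, positive, negative], merged)
trainer.compile(optimizer='adam', loss=triplet_loss)
# trainer.fit([A, P, N], dummy_targets, ...)   # A/P/N are assumed arrays of triplets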

Binary Classification vs. Multi Class Classification

I have a machine learning classification problem with 3 possible classes (Class A, Class B and Class C). Please let me know which would be the better approach:
- Train a single multi-class classifier that predicts one of the three classes directly, or
- Split the problem into 2 binary classifications: first identify whether a sample is Class A or Class 'Not A'; then, if it is Class 'Not A', run another binary classification to decide between Class B and Class C.
Binary classification typically uses a sigmoid function in the last layer (it goes smoothly from 0 to 1); this is how the model separates the two classes.
from keras.models import Sequential
from keras.layers import Dense

model = Sequential()
model.add(Dense(1, input_dim=8, kernel_initializer='uniform', activation='relu'))
model.add(Dense(1, kernel_initializer='uniform', activation='relu'))
model.add(Dense(1, kernel_initializer='uniform', activation='sigmoid'))
For multi-class classification you would typically use softmax in the very last layer; in the next example that layer has 10 neurons, meaning 10 possible classes.
from keras.models import Sequential
from keras.layers import Dense, Dropout

model = Sequential()
model.add(Dense(512, activation='relu', input_shape=(784,)))
model.add(Dropout(0.2))
model.add(Dense(10, activation='softmax'))
However, you can also use softmax with 2 neurons in the last layer for the binary classification as well:
model.add(Dense(2, activation='softmax'))
Hope this provides a little intuition about classifiers.
What you describe is one method used for multi-class classification.
It is called One vs. All / One vs. Rest.
The best approach is to choose a classifier framework that supports both options and pick the better one using cross-validation.
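To illustrate that last point (a hedged sketch with assumed data X, y labelled 'A'/'B'/'C', using scikit-learn rather than Keras for brevity), the two schemes can be compared directly with cross-validation:
from sklearn.linear_model import LogisticRegression
from sklearn.multiclass import OneVsRestClassifier
from sklearn.model_selection import cross_val_score

direct = LogisticRegression(multi_class='multinomial', max_iter=1000)   # one multi-class model
one_vs_rest = OneVsRestClassifier(LogisticRegression(max_iter=1000))    # one binary model per class

print(cross_val_score(direct, X, y, cv=5).mean())
print(cross_val_score(one_vs_rest, X, y, cv=5).mean())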

Keras: How to feed input directly into other hidden layers of the neural net than the first?

I have a question about using Keras, to which I'm rather new. I'm using a convolutional neural net that feeds its results into a standard perceptron layer, which generates my output. This CNN is fed with a series of images. So far this is quite normal.
Now I would like to pass a short non-image input vector directly into the last perceptron layer without sending it through all the CNN layers. How can this be done in Keras?
My code looks like this:
# last CNN layer before perceptron layer
model.add(Convolution2D(200, 2, 2, border_mode='same'))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2), strides=(2, 2)))
model.add(Dropout(0.25))
# perceptron layer
model.add(Flatten())
# here I like to add to the input from the CNN an additional vector directly
model.add(Dense(1500, W_regularizer=l2(1e-3)))
model.add(Activation('relu'))
model.add(Dropout(0.5))
model.add(Dense(1))
Any answers are greatly appreciated, thanks!
You didn't show me which kind of model you use, but I assume that you initialized your model as Sequential. In a Sequential model you can only stack one layer after another, so adding a "short-cut" connection is not possible.
For this reason the authors of Keras added the option of building "graph" models. In this case you can build a graph (DAG) of your computations. It's more complicated than designing a stack of layers, but still quite easy.
Check the documentation site for more details.
Provided your Keras backend is Theano, you can do the following:
import theano
import numpy as np
d = Dense(1500, W_regularizer=l2(1e-3), activation='relu') # I've joined activation and dense layers, based on assumption you might be interested in post-activation values
model.add(d)
model.add(Dropout(0.5))
model.add(Dense(1))
c = theano.function([d.get_input(train=False)], d.get_output(train=False))
layer_input_data = np.random.random((1,20000)).astype('float32') # refer to d.input_shape to get proper dimensions of layer's input, in my case it was (None, 20000)
o = c(layer_input_data)
The answer here works. It is more high-level and also works with the TensorFlow backend:
input_1 = Input(input_shape)
input_2 = Input(input_shape)
merged = merge([input_1, input_2], mode="concat")  # could also be "sum", "dot", etc.
hidden = Dense(hidden_dims)(merged)
classify = Dense(output_dims, activation="softmax")(hidden)
model = Model(input=[input_1, input_2], output=classify)
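For reference, a hedged sketch (mine, using the current Keras functional API with assumed shapes and names) of the same idea applied to the original question: the CNN branch processes the image, and the extra non-image vector is concatenated in just before the final Dense layers.
from keras.layers import (Input, Conv2D, MaxPooling2D, Flatten, Dense,
                          Dropout, Concatenate)
from keras.models import Model

image_in = Input(shape=(64, 64, 3))    # assumed image size
aux_in = Input(shape=(10,))            # assumed length of the extra vector

x = Conv2D(200, (2, 2), padding='same', activation='relu')(image_in)
x = MaxPooling2D(pool_size=(2, 2), strides=(2, 2))(x)
x = Dropout(0.25)(x)
x = Flatten()(x)

x = Concatenate()([x, aux_in])         # inject the non-image vector here
x = Dense(1500, activation='relu')(x)
x = Dropout(0.5)(x)
out = Dense(1)(x)

model = Model(inputs=[image_in, aux_in], outputs=out)
model.compile(optimizer='adam', loss='mse')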
