Confusion about domain adaptation method: "Encoder" versus "prediction head" - machine-learning

In regard to this article:
Chen, D., Zhou, R., Pan, Y., & Liu, F. (2022). A Simple Baseline for Adversarial Domain Adaptation-based Unsupervised Flood Forecasting. arXiv preprint arXiv:2206.08105.
The authors describe two models. The first model is a 1D-CNN "encoder" with three layers. The second model is a "prediction head". It is also a 1D-CNN with three layers.
How would I implement this?
For example, I would start by creating two Sequential() models, each with three Conv1D layers, using the number of filters and kernel size specified in the paper. The next step would be to train the encoder on the source and target datasets. But what comes after that?
For example:
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv1D, Dropout

# Encoder model
encoder_model = Sequential()
encoder_model.add(Conv1D(filters=30, kernel_size=2, activation='relu',
                         input_shape=(n_timesteps, n_features)))
encoder_model.add(Dropout(0.2))
encoder_model.add(Conv1D(filters=30, kernel_size=2, activation='relu'))
encoder_model.add(Dropout(0.2))
encoder_model.add(Conv1D(filters=30, kernel_size=2, activation='relu'))
encoder_model.add(Dropout(0.2))

# Prediction head model
ph_model = Sequential()
ph_model.add(Conv1D(filters=36, kernel_size=2, activation='relu',
                    input_shape=(n_timesteps, n_features)))
ph_model.add(Conv1D(filters=36, kernel_size=2, activation='relu'))
ph_model.add(Conv1D(filters=1, kernel_size=3))
There is also a "residual connection" in the prediction head. How would I add that?
The article includes a diagram of the architecture (not reproduced here).
How would this look when programmed, for example, in Keras?
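For what it's worth, here is a minimal sketch of how the two parts could be wired together with the Keras functional API, including a residual (skip) connection in the prediction head. This is not claimed to be the authors' exact architecture: the filter counts, the use of 'same' padding (so the skip branch and the main branch have matching shapes), and the placement of the skip connection are assumptions for illustration, and the adversarial part of the paper (the domain discriminator) is not shown.
from tensorflow import keras
from tensorflow.keras import layers

n_timesteps, n_features = 30, 5  # placeholder shapes

# Encoder: three Conv1D blocks mapping the input series to a feature sequence.
enc_in = keras.Input(shape=(n_timesteps, n_features))
x = layers.Conv1D(30, 2, padding='same', activation='relu')(enc_in)
x = layers.Dropout(0.2)(x)
x = layers.Conv1D(30, 2, padding='same', activation='relu')(x)
x = layers.Dropout(0.2)(x)
x = layers.Conv1D(30, 2, padding='same', activation='relu')(x)
encoder = keras.Model(enc_in, x, name='encoder')

# Prediction head: three Conv1D layers plus a residual connection that adds a
# 1x1-projected copy of the head's input to the output of its second layer.
head_in = keras.Input(shape=(n_timesteps, 30))
h = layers.Conv1D(36, 2, padding='same', activation='relu')(head_in)
h = layers.Conv1D(36, 2, padding='same', activation='relu')(h)
skip = layers.Conv1D(36, 1, padding='same')(head_in)  # match the channel count
h = layers.Add()([h, skip])                           # the residual connection
out = layers.Conv1D(1, 3, padding='same')(h)
prediction_head = keras.Model(head_in, out, name='prediction_head')

# Chain them: the prediction head consumes the encoder's features.
full_in = keras.Input(shape=(n_timesteps, n_features))
model = keras.Model(full_in, prediction_head(encoder(full_in)))
model.compile(optimizer='adam', loss='mse')
Keeping the encoder and the head as two separate Model objects makes it easy to reuse the encoder's weights on the target domain later, which is the usual reason for splitting the network this way.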

Related

Merge Sequential models in Keras

I want to concatenate these two models in my project. I am quite new to this field, so please don't judge me harshly. Here is the code.
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
from tensorflow.keras.callbacks import EarlyStopping

model2 = Sequential()
model2.add(Dense(10, input_dim=df2_x.shape[1], activation='relu'))
model2.add(Dense(50, activation='relu'))
model2.add(Dense(10, activation='relu'))
model2.add(Dense(1, kernel_initializer='normal'))
model2.add(Dense(df2_y.shape[1], activation='softmax'))
model2.compile(loss='categorical_crossentropy', optimizer='adam')
monitor2 = EarlyStopping(monitor='val_loss', min_delta=1e-3,
                         patience=5, verbose=1, mode='auto',
                         restore_best_weights=True)
model2.fit(df2_x_train, df2_y_train, validation_data=(df2_x_test, df2_y_test),
           callbacks=[monitor2], verbose=2, epochs=1000)

model = Sequential()
model.add(Dense(10, input_dim=df_x.shape[1], activation='relu'))
model.add(Dense(50, activation='relu'))
model.add(Dense(10, activation='relu'))
model.add(Dense(1, kernel_initializer='normal'))
model.add(Dense(df_y.shape[1], activation='softmax'))
model.compile(loss='categorical_crossentropy', optimizer='adam')
monitor = EarlyStopping(monitor='val_loss', min_delta=1e-3,
                        patience=5, verbose=1, mode='auto',
                        restore_best_weights=True)
model.fit(df_x_train, df_y_train, validation_data=(df_x_test, df_y_test),
          callbacks=[monitor], verbose=2, epochs=1000)
After the models are trained, I want to make predictions.
I have two datasets, one for DOS-portmap attacks and one for DOS-UDP attacks.
If I want to predict something, how can I distinguish between these two?
You can use the Keras functional API for this. A Model is basically the same as a layer: it is callable and can be included in another model.
For example, this is a greatly simplified version of concatenating the results from two models:
import tensorflow as tf
from tensorflow import keras

inputs = keras.Input(input_shape)
y_1 = model1(inputs)
y_2 = model2(inputs)
# axis=0 stacks the two outputs along the batch axis; use axis=-1 instead if
# you want to combine the two models' outputs per sample.
outputs = tf.concat([y_1, y_2], axis=0)
new_model = keras.Model(inputs, outputs)
Of course, you want to make sure that the output is your desired result, so the key is the last step: what kind of operation you apply to the values you get from y_1 and y_2.
Some people use a custom layer for this, and you can do that as well. You can find more material about the functional API and custom layers in the TensorFlow documentation. A small follow-up sketch is shown below.
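As a hedged follow-up to the snippet above: if the goal is to decide between the two attack types per sample, the two outputs would typically be concatenated along the feature axis and a small trainable head added on top. Here model1 and model2 are assumed to be the two already-built models from the question, and the input shape is a placeholder.
from tensorflow import keras
from tensorflow.keras import layers

input_shape = (20,)  # placeholder: number of input features

inputs = keras.Input(shape=input_shape)
y_1 = model1(inputs)  # e.g. scores from the DOS-portmap model
y_2 = model2(inputs)  # e.g. scores from the DOS-UDP model

# Combine the per-sample outputs along the feature axis and learn the decision.
merged = layers.Concatenate(axis=-1)([y_1, y_2])
decision = layers.Dense(2, activation='softmax')(merged)  # which attack type

new_model = keras.Model(inputs, decision)
new_model.compile(optimizer='adam', loss='categorical_crossentropy')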

Machine learning approach to facial recognition

First of all, I'm very new to the field, so maybe my question is a bit naive or even trivial.
I'm currently trying to understand how I can go about recognizing different faces.
Here is what I have tried so far, and the main issues with each approach:
1) Haar cascade -> HOG -> SVM:
The main issue is that the algorithm becomes very indecisive once more than 4 people are trained. The same occurs when the Haar cascade is replaced by a pre-trained CNN to detect faces.
2) dlib facial landmarks -> distances between points -> SVM or simple neural network classification:
This is the current approach, and it behaves very well when 4 people are trained. When more people are trained it becomes very messy, jumping from decision to decision and never settling on a choice.
I've read online that triplet loss is the way to go, but I am very confused as to how I'd go about implementing it. Can I use the current distance vectors found using dlib, or should I scrap everything and train my own CNN?
If I can use the distance vectors, how would I pass the data to the algorithm? Is triplet loss just an ordinary neural network with only its loss function altered?
I've taken the liberty of showing exactly how the distance vectors are calculated (image not reproduced here): the green lines represent the distances being measured, and a list of 33 floats is returned, which is then fed to the classifier.
Here is the relevant code for the classifier (Keras):
def fit_classifier(self):
    x_train, y_train = self._get_data(self.train_data_path)
    x_test, y_test = self._get_data(self.test_data_path)
    encoding_train_y = np_utils.to_categorical(y_train)
    encoding_test_y = np_utils.to_categorical(y_test)

    model = Sequential()
    model.add(Dense(10, input_dim=33, activation='relu'))
    model.add(Dense(20, activation='relu'))
    model.add(Dense(30, activation='relu'))
    model.add(Dense(40, activation='relu'))
    model.add(Dense(30, activation='relu'))
    model.add(Dense(20, activation='relu'))
    model.add(Dense(10, activation='relu'))
    model.add(Dense(max(y_train) + 1, activation='softmax'))
    model.compile(loss='mse', optimizer='adam', metrics=['accuracy'])
    model.fit(x_train, encoding_train_y, epochs=100, batch_size=10)
I think this is a more theoretical question than anything else. If someone with good experience in the field could help me out, I'd be very happy!
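Since the question asks about triplet loss, here is a minimal, heavily hedged sketch of the idea applied to fixed-length vectors such as the 33-value distance vectors. It only illustrates the mechanics (a shared embedding network trained so that same-person pairs end up closer than different-person pairs); in practice strong face recognizers learn the embedding from raw images, and the layer sizes and margin below are arbitrary assumptions.
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers

EMB = 32  # embedding size (an assumption)

# Shared embedding network: the same weights map every 33-value vector to EMB dims.
embedder = keras.Sequential([
    layers.Dense(64, activation='relu', input_shape=(33,)),
    layers.Dense(EMB),
])

anchor_in = keras.Input(shape=(33,))
positive_in = keras.Input(shape=(33,))
negative_in = keras.Input(shape=(33,))

# Pack the three embeddings side by side so a custom loss can slice them apart.
packed = layers.Concatenate()([embedder(anchor_in),
                               embedder(positive_in),
                               embedder(negative_in)])
model = keras.Model([anchor_in, positive_in, negative_in], packed)

def triplet_loss(_, y_pred, margin=0.2):
    a, p, n = y_pred[:, :EMB], y_pred[:, EMB:2 * EMB], y_pred[:, 2 * EMB:]
    pos_d = tf.reduce_sum(tf.square(a - p), axis=-1)  # anchor-positive distance
    neg_d = tf.reduce_sum(tf.square(a - n), axis=-1)  # anchor-negative distance
    return tf.reduce_mean(tf.maximum(pos_d - neg_d + margin, 0.0))

model.compile(optimizer='adam', loss=triplet_loss)
# Train on triplets you mine yourself; the targets are a dummy array of zeros:
# model.fit([anchors, positives, negatives], dummy_targets, epochs=50)
So in that sense triplet loss is indeed "an ordinary network with its loss function altered", but the data fed to fit() has to be organized as (anchor, positive, negative) triplets rather than (sample, label) pairs, and classification is then done by nearest-neighbour search in the embedding space.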

How to use categorical_hinge loss in keras in order to train with an SVM in the last layer?

I want to train a CNN with an SVM as the classifier in the last layer. I understand that categorical_hinge is the best loss function for that. I have 6 classes to classify.
My model is as shown below:
model = Sequential()
model.add(Conv2D(50, (3, 3), activation='relu', input_shape=train_data.shape[1:]))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Conv2D(50, (3, 3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Conv2D(50, (3, 3), activation='relu'))
model.add(Flatten())
model.add(Dense(400, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(128, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(1, activation='sigmoid'))
Is there a problem with the network, the data processing, or the loss function?
The model does not learn anything after a certain point, as shown in the training curve (image not reproduced here).
What should I do?
Your model has a single output neuron; there is no way this will work with 6 classes. The output of your model should have 6 neurons. Also, the output layer should have no activation function, so that it produces logits that the categorical hinge can use.
Note that the categorical hinge was added recently (2-3 weeks ago), so it's quite new and probably not many people have tested it.
Use the hinge loss and a linear activation in the last layer.
model.add(Dense(nb_classes, W_regularizer=l2(0.01)))
model.add(Activation('linear'))
model.compile(loss='hinge',
              optimizer='adadelta',
              metrics=['accuracy'])
For more information, visit https://github.com/keras-team/keras/issues/6090
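Putting the two answers together in current Keras syntax, here is a hedged sketch of how the model above could end for the 6-class case. The input shape is a placeholder, the regularization strength is taken from the snippet above, and the hidden layers are trimmed for brevity.
from tensorflow import keras
from tensorflow.keras import layers, regularizers

num_classes = 6

model = keras.Sequential([
    layers.Conv2D(50, (3, 3), activation='relu', input_shape=(64, 64, 3)),  # placeholder shape
    layers.MaxPooling2D(pool_size=(2, 2)),
    layers.Conv2D(50, (3, 3), activation='relu'),
    layers.MaxPooling2D(pool_size=(2, 2)),
    layers.Flatten(),
    layers.Dense(400, activation='relu'),
    layers.Dropout(0.5),
    # SVM-style output: one linear score per class, with an L2 weight penalty.
    layers.Dense(num_classes, activation='linear',
                 kernel_regularizer=regularizers.l2(0.01)),
])

model.compile(loss='categorical_hinge', optimizer='adadelta', metrics=['accuracy'])
# Labels must be one-hot encoded, e.g. keras.utils.to_categorical(y, num_classes).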

Keras model.fit() - which training algorithm is used?

I am using Keras on top of Theano to create an MLP which I train and use to predict time series. Independently of the structure and depth of my network, I cannot figure out (from the Keras documentation, StackOverflow, or searching the net) which training algorithm (backpropagation, ...) Keras' model.fit() function is using.
Within Theano (which I used without Keras before), I could define the way the parameters are adjusted myself with
self.train_step = theano.function(inputs=[u_in, t_in, lrate], outputs=[cost, y],
                                  on_unused_input='warn',
                                  updates=[(p, p - lrate * g) for p, g in zip(self.parameters, self.gradients)],
                                  allow_input_downcast=True)
Not finding any information causes a certain fear that I am missing something essential and that this may be a totally stupid question.
Can anybody help me out here? Thanks a lot in advance.
Look at the example here:
...
model = Sequential()
model.add(Conv2D(32, kernel_size=(3, 3),
                 activation='relu',
                 input_shape=input_shape))
model.add(Conv2D(64, (3, 3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))
model.add(Flatten())
model.add(Dense(128, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(num_classes, activation='softmax'))
model.compile(loss=keras.losses.categorical_crossentropy,
              optimizer=keras.optimizers.Adadelta(),
              metrics=['accuracy'])
model.fit(x_train, y_train,
          batch_size=batch_size,
          epochs=epochs,
          verbose=1,
          validation_data=(x_test, y_test))
score = model.evaluate(x_test, y_test, verbose=0)
...
model.fit() does not choose the training algorithm by itself; it trains the model you describe. The optimizer algorithm is specified in model.compile(), e.g.
model.compile(loss=keras.losses.categorical_crossentropy,
              optimizer=keras.optimizers.Adadelta(),
              metrics=['accuracy'])
You can find out more about the available optimizers here: https://keras.io/optimizers/
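To make the connection with the Theano snippet in the question explicit: model.fit() always computes gradients by backpropagation, and the optimizer passed to compile() decides how those gradients update the weights. Plain SGD reproduces the update rule p <- p - lrate * g from the question. A small sketch in current tf.keras syntax (the toy MLP and its shapes are placeholders):
from tensorflow import keras
from tensorflow.keras import layers

model = keras.Sequential([
    layers.Dense(64, activation='relu', input_shape=(10,)),  # placeholder MLP
    layers.Dense(1),
])

# SGD applies exactly p <- p - learning_rate * gradient, like the Theano update.
model.compile(loss='mse', optimizer=keras.optimizers.SGD(learning_rate=0.01))

# Adadelta, Adam, RMSprop, ... only change how each gradient step is scaled;
# the gradients themselves always come from backpropagation.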

Keras: How to feed input directly into other hidden layers of the neural net than the first?

I have a question about Keras, which I'm rather new to. I'm using a convolutional neural net that feeds its results into a standard perceptron layer, which generates my output. This CNN is fed with a series of images. So far this is quite normal.
Now I would like to pass a short non-image input vector directly into the last perceptron layer, without sending it through all the CNN layers. How can this be done in Keras?
My code looks like this:
# last CNN layer before perceptron layer
model.add(Convolution2D(200, 2, 2, border_mode='same'))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2), strides=(2, 2)))
model.add(Dropout(0.25))
# perceptron layer
model.add(Flatten())
# here I like to add to the input from the CNN an additional vector directly
model.add(Dense(1500, W_regularizer=l2(1e-3)))
model.add(Activation('relu'))
model.add(Dropout(0.5))
model.add(Dense(1))
Any answers are greatly appreciated, thanks!
You didn't show me which kind of model you use, but I assume you initialized your model as Sequential. In a Sequential model you can only stack one layer after another, so adding a "short-cut" connection is not possible.
For this reason, the authors of Keras added the option of building "graph" models, in which you can build a graph (DAG) of your computations. It's more complicated than designing a stack of layers, but still quite easy.
Check the documentation site for more details.
Provided your Keras backend is Theano, you can do the following:
import theano
import numpy as np
d = Dense(1500, W_regularizer=l2(1e-3), activation='relu') # I've joined activation and dense layers, based on assumption you might be interested in post-activation values
model.add(d)
model.add(Dropout(0.5))
model.add(Dense(1))
c = theano.function([d.get_input(train=False)], d.get_output(train=False))
layer_input_data = np.random.random((1,20000)).astype('float32') # refer to d.input_shape to get proper dimensions of layer's input, in my case it was (None, 20000)
o = c(layer_input_data)
This approach works, is more high-level, and also works with the TensorFlow backend:
input_1 = Input(input_shape)
input_2 = Input(input_shape)
merged = merge([input_1, input_2], mode="concat")  # could also be "sum", "dot", etc.
hidden = Dense(hidden_dims)(merged)
classify = Dense(output_dims, activation="softmax")(hidden)
model = Model(input=[input_1, input_2], output=classify)
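For completeness, here is the same idea in current Keras syntax (the merge() helper used above is from Keras 1.x). The image shape, vector length, and layer sizes below are placeholders; the point is only where the extra vector is injected, right before the Dense(1500) layer from the question.
from tensorflow import keras
from tensorflow.keras import layers, regularizers

img_in = keras.Input(shape=(32, 32, 3))   # placeholder image shape
extra_in = keras.Input(shape=(5,))        # the short non-image vector

x = layers.Conv2D(200, (2, 2), padding='same', activation='relu')(img_in)
x = layers.MaxPooling2D(pool_size=(2, 2), strides=(2, 2))(x)
x = layers.Dropout(0.25)(x)
x = layers.Flatten()(x)

merged = layers.Concatenate()([x, extra_in])  # inject the extra vector here
h = layers.Dense(1500, activation='relu',
                 kernel_regularizer=regularizers.l2(1e-3))(merged)
h = layers.Dropout(0.5)(h)
out = layers.Dense(1)(h)

model = keras.Model([img_in, extra_in], out)
model.compile(optimizer='adam', loss='mse')
# model.fit([images, vectors], targets, ...)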
