How can I convert this Keras CNN model to a PyTorch version? - machine-learning

This is the example Keras code that I want to convert to PyTorch. My input dataset has shape 10000*1*102 (with two dimensions for labels). The dataset includes 10000 samples, and each sample contains one row with 102 features. I am thinking of using a 1D CNN for regression.
PS: the hyper-parameters (e.g. filters, kernel_size, stride, padding) could be adjusted based on my 10000*1*102 dataset.
model = Sequential()
model.add(Conv1D(filters=64, kernel_size=3, activation='relu', input_shape=(n_timesteps,n_features)))
model.add(Conv1D(filters=64, kernel_size=3, activation='relu'))
model.add(Dropout(0.5))
model.add(MaxPooling1D(pool_size=2))
model.add(Flatten())
model.add(Dense(100, activation='relu'))
model.add(Dense(n_outputs, activation='softmax'))

Welcome to PyTorch. :)
I am really glad you decided to switch from Keras to PyTorch. It was an important step for me to understand how NNs work in more detail. If you have any specific questions about the code, or if it isn't working, please let me know.
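import torch.nn as nn
# Conv1d expects input shaped (batch, channels, length), so the Keras feature
# axis becomes in_channels and n_timesteps is the length axis.
a0 = nn.Conv1d(n_features, 64, 3)
a1 = nn.ReLU()
b0 = nn.Conv1d(64, 64, 3)
b1 = nn.ReLU()
c0 = nn.Dropout(p=0.5)
d0 = nn.MaxPool1d(2)
e0 = nn.Flatten()
# Two unpadded kernel-3 convolutions remove 4 steps; pooling then halves the length.
e1 = nn.Linear(64 * ((n_timesteps - 4) // 2), 100)
e2 = nn.ReLU()
e3 = nn.Linear(100, n_outputs)
# Softmax mirrors the Keras model; for regression this last layer would normally be dropped.
f0 = nn.Softmax(dim=1)
model = nn.Sequential(a0,a1,b0,b1,c0,d0,e0,e1,e2,e3,f0)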

Name: torch
Version: 1.11.0.dev20211231+cu113
This article has been very helpful. But the code seems to have changed a little as of the current standards, so I'm leaving a reply.
import torch.nn as nn
from torchsummary import summary as summary_
n_timesteps = 10000
n_features = 102
n_outputs = 1
a0 = nn.Conv1d(n_features, 64, 3)
a1 = nn.ReLU()
b0 = nn.Conv1d(64, 64, 3)
b1 = nn.ReLU()
c0 = nn.Dropout(p=0.5)
d0 = nn.MaxPool1d(2)
e0 = nn.Flatten()
e1 = nn.Linear(319872,100)
e2 = nn.ReLU()
e3 = nn.Linear(100,n_outputs)
f0 = nn.Softmax(dim=1)
model = nn.Sequential(a0,a1,b0,b1,c0,d0,e0,e1,e2,e3,f0)
model.to('cuda')
summary_(model,(n_features,n_timesteps),batch_size=1)
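The hard-coded 319872 passed to nn.Linear above does not have to be found by trial and error. Below is a minimal sketch of the length arithmetic, assuming no padding and dilation 1 as in the code above. The helper names conv1d_out_len and flattened_features are made up for this illustration and are not part of PyTorch.

```python
def conv1d_out_len(length, kernel_size, stride=1, padding=0):
    """Output length of a 1D convolution or pooling layer (dilation 1)."""
    return (length + 2 * padding - kernel_size) // stride + 1


def flattened_features(length, channels=64):
    """Number of inputs the first Linear layer sees after Flatten."""
    length = conv1d_out_len(length, kernel_size=3)            # first Conv1d
    length = conv1d_out_len(length, kernel_size=3)            # second Conv1d
    length = conv1d_out_len(length, kernel_size=2, stride=2)  # MaxPool1d(2)
    return channels * length


print(flattened_features(10000))  # 319872, the value used above
print(flattened_features(102))    # 3136
```

The second call shows what the number would be if the 102 features were treated as the length axis, i.e. if the data were fed as a (10000, 1, 102) tensor with in_channels=1. That layout is only one possible reading of the question, not something either answer states.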

Related

Poor predictions on second dataset from trained LSTM model

I've trained an LSTM model with 8 features and 1 output. I have one dataset and split it into two separate files to train and predict with the first half of the set, and then attempt to predict the second half of the set using the trained model from the first part of my dataset. My model predicts the trained and testing sets from the dataset I used to train the model pretty well (RMSE of around 5-7), however when I attempt to predict using the second half of the set I get very poor predictions (RMSE of around 50-60). How can I get my trained model to predict outside datasets well?
dataset at this link
file = r'/content/drive/MyDrive/only_force_pt1.csv'
df = pd.read_csv(file)
df.head()
X = df.iloc[:, 1:9]
y = df.iloc[:,9]
print(X.shape)
print(y.shape)
plt.figure(figsize = (20, 6), dpi = 100)
plt.plot(y)
WINDOW_LEN = 50
def window_size(size, inputdata, targetdata):
    X = []
    y = []
    i=0
    while(i + size) <= len(inputdata)-1:
        X.append(inputdata[i: i+size])
        y.append(targetdata[i+size])
        i+=1
    assert len(X)==len(y)
    return (X,y)
X_series, y_series = window_size(WINDOW_LEN, X, y)
print(len(X))
print(len(X_series))
print(len(y_series))
X_train, X_val, y_train, y_val = train_test_split(np.array(X_series),np.array(y_series),test_size=0.3, shuffle = True)
X_val, X_test,y_val, y_test = train_test_split(np.array(X_val),np.array(y_val),test_size=0.3, shuffle = False)
n_timesteps, n_features, n_outputs = X_train.shape[1], X_train.shape[2],1
[verbose, epochs, batch_size] = [1, 300, 32]
input_shape = (n_timesteps, n_features)
model = Sequential()
# LSTM
model.add(LSTM(64, input_shape=input_shape, return_sequences = False))
model.add(Dropout(0.2))
model.add(Dense(64, activation='relu', kernel_regularizer=keras.regularizers.l2(0.001)))
#model.add(Dropout(0.2))
model.add(Dense(32, activation='relu', kernel_regularizer=keras.regularizers.l2(0.001)))
model.add(Dense(1, activation='relu'))
earlystopper = EarlyStopping(monitor='val_loss', min_delta=0, patience = 30, verbose =1, mode = 'auto')
model.summary()
model.compile(loss = 'mse', optimizer = Adam(learning_rate = 0.001), metrics=[tf.keras.metrics.RootMeanSquaredError()])
history = model.fit(X_train, y_train, batch_size = batch_size, epochs = epochs, verbose = verbose, validation_data=(X_val,y_val), callbacks = [earlystopper])
Second dataset:
tests = r'/content/drive/MyDrive/only_force_pt2.csv'
df_testing = pd.read_csv(tests)
X_testing = df_testing.iloc[:4038,1:9]
torque = df_testing.iloc[:4038,9]
print(X_testing.shape)
print(torque.shape)
plt.figure(figsize = (20, 6), dpi = 100)
plt.plot(torque)
X_testing = X_testing.to_numpy()
X_testing_series, y_testing_series = window_size(WINDOW_LEN, X_testing, torque)
X_testing_series = np.array(X_testing_series)
y_testing_series = np.array(y_testing_series)
scores = model.evaluate(X_testing_series, y_testing_series, verbose =1)
X_prediction = model.predict(X_testing_series, batch_size = 32)
If your model works fine on the training data but performs badly on validation data, then your model did not learn the "true" connection between input and output variables but simply memorized the output that corresponds to each input. To tackle this you can do several things:
Typically you would use 80% of your data to train and 20% to test. This presents more data to the model, which should help it learn more of the true underlying function.
If your model is too complex, it will have neurons that are only used to memorize input-output pairs. Try to reduce the complexity of your model (layers, neurons) so that the remaining layers really learn instead of memorizing.
Look into more detail on training performance here
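As a rough illustration of the 80/20 split mentioned above, here is a minimal sketch in plain Python. The function name holdout_split is hypothetical, and keeping the samples in their original order is my own assumption (chosen because the data in the question is a time series); the answer itself does not say how to split.

```python
def holdout_split(samples, targets, train_fraction=0.8):
    """Split paired sequences into train and test parts, keeping their order."""
    if len(samples) != len(targets):
        raise ValueError("samples and targets must have the same length")
    cut = int(len(samples) * train_fraction)
    return samples[:cut], samples[cut:], targets[:cut], targets[cut:]


X_demo = list(range(10))
y_demo = [2 * x for x in X_demo]
X_tr, X_te, y_tr, y_te = holdout_split(X_demo, y_demo)
print(len(X_tr), len(X_te))  # 8 2
```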

Unable to get output using CNN model

I am trying to use a CNN-LSTM model on this dataset. I've stored the dataset in a dataframe named df. There are 11 columns in total, but I am only showing 9 of them here. All columns have numerical values only.
Area  book_hotel  votes  location  hotel_type  Total_Price  Facilities  Dine  rate
6     0           0      1         163         400          22          7     4.4
19    1           2      7         122         220          28          11    4.6
X=df.drop(['rate'],axis=1)
Y=df['rate']
x_train, x_test, y_train, y_test = train_test_split(np.asarray(X), np.asarray(Y), test_size=0.33, shuffle= True)
x_train has shape (3350,10) and
x_test has shape (1650, 10)
# The known number of output classes.
num_classes = 10
# Input image dimensions
input_shape = (10,)
# Convert class vectors to binary class matrices. This uses 1 hot encoding.
y_train_binary = keras.utils.to_categorical(y_train, num_classes)
y_test_binary = keras.utils.to_categorical(y_test, num_classes)
x_train = x_train.reshape(3350, 10,1)
x_test = x_test.reshape(1650, 10,1)
input_layer = Input(shape=(10, 1))
conv1 = Conv1D(filters=32,
               kernel_size=8,
               strides=1,
               activation='relu',
               padding='same')(input_layer)
lstm1 = LSTM(32, return_sequences=True)(conv1)
output_layer = Dense(1, activation='sigmoid')(lstm1)
model = Model(inputs=input_layer, outputs=output_layer)
model.summary()
model.compile(loss='mse',optimizer='adam')
Finally, when I try to fit the model with the input, I get this error:
model.fit(x_train,y_train)
ValueError Traceback (most recent call last)
<ipython-input-170-4719cf73997a> in <module>()
----> 1 model.fit(x_train,y_train)
2 frames
/usr/local/lib/python3.6/dist-packages/keras/engine/training_utils.py in standardize_input_data(data, names, shapes, check_batch_axis, exception_prefix)
133 ': expected ' + names[i] + ' to have ' +
134 str(len(shape)) + ' dimensions, but got array '
--> 135 'with shape ' + str(data_shape))
136 if not check_batch_axis:
137 data_shape = data_shape[1:]
ValueError: Error when checking target: expected dense_2 to have 3 dimensions, but got array with shape (3350, 1)
Can someone please help me resolve this error?
I see some problems in your code:
the output dimension of the last layer must equal the number of classes, and for multiclass tasks you need to apply a softmax activation: Dense(num_classes, activation='softmax')
you must set return_sequences=False in your last LSTM layer because you need a 2D output, not a 3D one
you must use categorical_crossentropy as the loss function with a one-hot encoded target
Here is a complete dummy example:
num_classes = 10
n_sample = 1000
X = np.random.uniform(0,1, (n_sample,10,1))
y = tf.keras.utils.to_categorical(np.random.randint(0,num_classes, n_sample))
input_layer = Input(shape=(10, 1))
conv1 = Conv1D(filters=32,
               kernel_size=8,
               strides=1,
               activation='relu',
               padding='same')(input_layer)
lstm1 = LSTM(32, return_sequences=False)(conv1)
output_layer = Dense(num_classes, activation='softmax')(lstm1)
model = Model(inputs=input_layer, outputs=output_layer)
model.compile(loss='categorical_crossentropy',optimizer='adam')
model.fit(X,y, epochs=5)
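To see why return_sequences matters for the "expected dense_2 to have 3 dimensions" error, it can help to trace the shapes by hand. The sketch below only models the shape rules in plain Python; lstm_output_shape and dense_output_shape are hypothetical helpers written for this illustration, not Keras functions.

```python
def lstm_output_shape(input_shape, units, return_sequences):
    """Shape produced by an LSTM layer for input (batch, timesteps, features)."""
    batch, timesteps, _ = input_shape
    return (batch, timesteps, units) if return_sequences else (batch, units)


def dense_output_shape(input_shape, units):
    """Dense acts on the last axis only and leaves the leading axes unchanged."""
    return input_shape[:-1] + (units,)


conv_out = (None, 10, 32)  # Conv1D(filters=32, padding='same') on a (None, 10, 1) input
print(dense_output_shape(lstm_output_shape(conv_out, 32, True), 1))    # (None, 10, 1)
print(dense_output_shape(lstm_output_shape(conv_out, 32, False), 10))  # (None, 10)
```

With return_sequences=True the model output stays 3D, so Keras expects a 3D target. With return_sequences=False the output is (batch, num_classes), which matches a one-hot encoded target.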

How to use Recursive Feature elimination?

I am new to ML and have been trying feature selection with the RFE approach. My dataset has 5K records and it's a binary classification problem. This is the code that I am following, based on an online tutorial:
#no of features
nof_list=np.arange(1,13)
high_score=0
#Variable to store the optimum features
nof=0
score_list =[]
for n in range(len(nof_list)):
    X_train, X_test, y_train, y_test = train_test_split(X,y, test_size = 0.3, random_state = 0)
    model = RandomForestClassifier()
    rfe = RFE(model,nof_list[n])
    X_train_rfe = rfe.fit_transform(X_train,y_train)
    X_test_rfe = rfe.transform(X_test)
    model.fit(X_train_rfe,y_train)
    score = model.score(X_test_rfe,y_test)
    score_list.append(score)
    if(score>high_score):
        high_score = score
        nof = nof_list[n]
print("Optimum number of features: %d" %nof)
print("Score with %d features: %f" % (nof, high_score))
I encounter the error below. Can someone please help?
TypeError Traceback (most recent call last)
<ipython-input-332-a23dfb331001> in <module>
9 model = RandomForestClassifier()
10 rfe = RFE(model,nof_list[n])
---> 11 X_train_rfe = rfe.fit_transform(X_train,y_train)
12 X_test_rfe = rfe.transform(X_test)
13 model.fit(X_train_rfe,y_train)
~\AppData\Local\Continuum\anaconda3\lib\site-packages\sklearn\base.py in fit_transform(self, X, y, **fit_params)
554 Training set.
555
--> 556 y : numpy array of shape [n_samples]
557 Target values.
558
~\AppData\Local\Continuum\anaconda3\lib\site-packages\sklearn\feature_selection\_base.py in transform(self, X)
75 X = check_array(X, dtype=None, accept_sparse='csr',
76 force_all_finite=not tags.get('allow_nan', True))
---> 77 mask = self.get_support()
78 if not mask.any():
79 warn("No features were selected: either the data is"
~\AppData\Local\Continuum\anaconda3\lib\site-packages\sklearn\feature_selection\_base.py in get_support(self, indices)
44 values are indices into the input feature vector.
45 """
---> 46 mask = self._get_support_mask()
47 return mask if not indices else np.where(mask)[0]
48
~\AppData\Local\Continuum\anaconda3\lib\site-packages\sklearn\feature_selection\_rfe.py in _get_support_mask(self)
269
270 def _get_support_mask(self):
--> 271 check_is_fitted(self)
272 return self.support_
273
TypeError: check_is_fitted() missing 1 required positional argument: 'attributes'
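What is your sklearn version? The traceback shows _rfe.py calling check_is_fitted(self) with no attributes argument, while the installed check_is_fitted still requires one, which suggests that files from two different scikit-learn releases may be mixed in your environment.
The following (using artificial data) should work fine: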
from sklearn.model_selection import train_test_split
import numpy as np
from sklearn.feature_selection import RFE
from sklearn.ensemble import RandomForestClassifier
X = np.random.rand(100,20)
y = np.ones((X.shape[0]))
#no of features
nof_list=np.arange(1,13)
high_score=0
#Variable to store the optimum features
nof=0
score_list =[]
for n in range(len(nof_list)):
    X_train, X_test, y_train, y_test = train_test_split(X,y, test_size = 0.3, random_state = 0)
    model = RandomForestClassifier()
    # Passing n_features_to_select by keyword also works on newer sklearn releases
    rfe = RFE(model, n_features_to_select=nof_list[n])
    X_train_rfe = rfe.fit_transform(X_train,y_train)
    X_test_rfe = rfe.transform(X_test)
    model.fit(X_train_rfe,y_train)
    score = model.score(X_test_rfe,y_test)
    score_list.append(score)
    if(score>high_score):
        high_score = score
        nof = nof_list[n]
print("Optimum number of features: %d" %nof)
print("Score with %d features: %f" % (nof, high_score))
Optimum number of features: 1
Score with 1 features: 1.000000
Versions tested:
sklearn.__version__
'0.20.4'
sklearn.__version__
'0.21.3'
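For readers who want to see what recursive feature elimination does under the hood, here is a toy sketch using only NumPy. It is not sklearn's implementation: it swaps the random forest for a least-squares linear model and ranks features by coefficient magnitude, and the name rfe_linear and the synthetic data are assumptions made for this example.

```python
import numpy as np


def rfe_linear(X, y, n_features_to_select):
    """Toy recursive feature elimination with a least-squares linear model."""
    remaining = list(range(X.shape[1]))
    while len(remaining) > n_features_to_select:
        # Refit on the surviving columns (standardized so coefficients are comparable).
        Xs = X[:, remaining]
        Xs = (Xs - Xs.mean(axis=0)) / Xs.std(axis=0)
        coef, *_ = np.linalg.lstsq(Xs, y - y.mean(), rcond=None)
        # Drop the feature whose coefficient has the smallest magnitude.
        del remaining[int(np.argmin(np.abs(coef)))]
    return remaining


rng = np.random.default_rng(0)
X_demo = rng.normal(size=(200, 5))
# Only columns 0 and 2 carry signal; the rest are pure noise.
y_demo = 3.0 * X_demo[:, 0] - 2.0 * X_demo[:, 2] + 0.01 * rng.normal(size=200)
print(rfe_linear(X_demo, y_demo, 2))  # [0, 2]
```

The key point is that the model is refit after every removal, which is what distinguishes RFE from ranking the features once and keeping the top ones.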

How to concatenate embeddings with variable length inputs in Keras?

Here is the network diagram I am working on. The data is tabular and structured.
On the left, we have some abilities, which are continuous features, and on the right, we can have 'N' modifiers. Each modifier has a modifier_type, which is categorical, and some statistics, which are continuous features.
If there were only one modifier, here is the code that works just fine:
import keras.backend as K
from keras.models import Model
from keras.layers import Input, Embedding, concatenate
from keras.layers import Dense, GlobalMaxPooling1D, Reshape
from keras.optimizers import Adam
K.clear_session()
# Using embeddings for categorical features
modifier_type_embedding_in=[]
modifier_type_embedding_out=[]
# sample categorical features
categorical_features = ['modifier_type']
modifier_input_ = Input(shape=(1,), name='modifier_type_in')
# Let's assume 10 unique type of modifiers and let's have embedding dimension as 6
modifier_output_ = Embedding(input_dim=10, output_dim=6, name='modifier_type')(modifier_input_)
modifier_output_ = Reshape(target_shape=(6,))(modifier_output_)
modifier_type_embedding_in.append(modifier_input_)
modifier_type_embedding_out.append(modifier_output_)
# sample continuous features
statistics = ['duration']
statistics_inputs =[Input(shape=(len(statistics),), name='statistics')] # Input(shape=(1,))
# sample continuous features
abilities = ['buyback_cost', 'cooldown', 'number_of_deaths', 'ability', 'teleport', 'team', 'level', 'max_mana', 'intelligence']
abilities_inputs=[Input(shape=(len(abilities),), name='abilities')] # Input(shape=(9,))
concat = concatenate(modifier_type_embedding_out + statistics_inputs)
FC_relu = Dense(128, activation='relu', name='fc_relu_1')(concat)
FC_relu = Dense(128, activation='relu', name='fc_relu_2')(FC_relu)
model = concatenate(abilities_inputs + [FC_relu])
model = Dense(64, activation='relu', name='fc_relu_3')(model)
model_out = Dense(1, activation='sigmoid', name='fc_sigmoid')(model)
model_in = abilities_inputs + modifier_type_embedding_in + statistics_inputs
model = Model(inputs=model_in, outputs=model_out)
model.compile(loss='binary_crossentropy', optimizer=Adam(lr=2e-05, decay=1e-3), metrics=['accuracy'])
However, when building the model for 'N' modifiers I get the error below. These are the changes I made to the code:
modifier_input_ = Input(shape=(None, 1,), name='modifier_type_in')
statistics_inputs =[Input(shape=(None, len(statistics),), name='statistics')] # Input(shape=(None, 1,))
FC_relu = Dense(128, activation='relu', name='fc_relu_2')(FC_relu)
max_pool = GlobalMaxPooling1D()(FC_relu)
model = concatenate(abilities_inputs + [max_pool])
Here is what I get,
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-3-7703088b1d24> in <module>
22 abilities_inputs=[Input(shape=(len(abilities),), name='abilities')] # Input(shape=(9,))
23
---> 24 concat = concatenate(modifier_type_embedding_out + statistics_inputs)
25 FC_relu = Dense(128, activation='relu', name='fc_relu_1')(concat)
26 FC_relu = Dense(128, activation='relu', name='fc_relu_2')(FC_relu)
e:\Miniconda3\lib\site-packages\keras\layers\merge.py in concatenate(inputs, axis, **kwargs)
647 A tensor, the concatenation of the inputs alongside axis `axis`.
648 """
--> 649 return Concatenate(axis=axis, **kwargs)(inputs)
650
651
e:\Miniconda3\lib\site-packages\keras\engine\base_layer.py in __call__(self, inputs, **kwargs)
423 'You can build it manually via: '
424 '`layer.build(batch_input_shape)`')
--> 425 self.build(unpack_singleton(input_shapes))
426 self.built = True
427
e:\Miniconda3\lib\site-packages\keras\layers\merge.py in build(self, input_shape)
360 'inputs with matching shapes '
361 'except for the concat axis. '
--> 362 'Got inputs shapes: %s' % (input_shape))
363
364 def _merge_function(self, inputs):
ValueError: A `Concatenate` layer requires inputs with matching shapes except for the concat axis. Got inputs shapes: [(None, 6), (None, None, 1)]
How do I use the embedding layer within a neural network designed to accept variable input length features?
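The answer is to declare the modifier input as Input(shape=(None,)) and to drop the Reshape layer, so that the embedding output keeps its sequence axis with shape (batch, None, 6) and matches the (batch, None, 1) statistics input on every axis except the last: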
import keras.backend as K
from keras.models import Model
from keras.layers import Input, Embedding, concatenate
from keras.layers import Dense, GlobalMaxPooling1D, Reshape
from keras.optimizers import Adam
K.clear_session()
# Using embeddings for categorical features
modifier_type_embedding_in=[]
modifier_type_embedding_out=[]
# sample categorical features
categorical_features = ['modifier_type']
modifier_input_ = Input(shape=(None,), name='modifier_type_in')
# Let's assume 10 unique type of modifiers and let's have embedding dimension as 6
modifier_output_ = Embedding(input_dim=10, output_dim=6, name='modifier_type')(modifier_input_)
modifier_type_embedding_in.append(modifier_input_)
modifier_type_embedding_out.append(modifier_output_)
# sample continuous features
statistics = ['duration']
statistics_inputs =[Input(shape=(None, len(statistics),), name='statistics')] # Input(shape=(None, 1,))
# sample continuous features
abilities = ['buyback_cost', 'cooldown', 'number_of_deaths', 'ability', 'teleport', 'team', 'level', 'max_mana', 'intelligence']
abilities_inputs=[Input(shape=(len(abilities),), name='abilities')] # Input(shape=(9,))
concat = concatenate(modifier_type_embedding_out + statistics_inputs)
FC_relu = Dense(128, activation='relu', name='fc_relu_1')(concat)
FC_relu = Dense(128, activation='relu', name='fc_relu_2')(FC_relu)
max_pool = GlobalMaxPooling1D()(FC_relu)
model = concatenate(abilities_inputs + [max_pool])
model = Dense(64, activation='relu', name='fc_relu_3')(model)
model_out = Dense(1, activation='sigmoid', name='fc_sigmoid')(model)
model_in = abilities_inputs + modifier_type_embedding_in + statistics_inputs
model = Model(inputs=model_in, outputs=model_out)
model.compile(loss='binary_crossentropy', optimizer=Adam(lr=2e-05, decay=1e-3), metrics=['accuracy'])
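The shape bookkeeping behind this fix can be checked with plain NumPy arrays. This is only a sketch of the tensor shapes, not a Keras model: the Dense layers between the concatenation and the pooling are left out, and the batch size and number of modifiers are arbitrary values picked for the demo.

```python
import numpy as np

batch, n_modifiers = 2, 5                           # n_modifiers may differ between batches
embedded = np.random.rand(batch, n_modifiers, 6)    # Embedding output: (batch, N, 6)
statistics = np.random.rand(batch, n_modifiers, 1)  # statistics input: (batch, N, 1)
abilities = np.random.rand(batch, 9)                # fixed-length abilities input

# Shapes agree on every axis except the last, so concatenation works.
per_modifier = np.concatenate([embedded, statistics], axis=-1)  # (batch, N, 7)
# Global max pooling removes the variable-length axis.
pooled = per_modifier.max(axis=1)                                # (batch, 7)
merged = np.concatenate([abilities, pooled], axis=-1)            # (batch, 16)

print(per_modifier.shape, pooled.shape, merged.shape)  # (2, 5, 7) (2, 7) (2, 16)
```

Because the pooling step collapses the N axis, the result can be concatenated with the 2D abilities input regardless of how many modifiers a sample has.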

Caffe's configuration for Cohn-Kanade dataset

I'm trying to make a facial expression recognizer using Caffe and the Cohn-Kanade database.
This is my train prototxt configuration:
def configureTrainProtoTxt(lmdb, batch_size):
    n = caffe.NetSpec()
    n.data, n.label = L.Data(batch_size=batch_size, backend=P.Data.LMDB, source=lmdb,
                             transform_param=dict(scale= 1 / 126.0), ntop=2)
    n.conv1 = L.Convolution(n.data, kernel_size=2, pad=1, param=dict(lr_mult=1), num_output=10, weight_filler=dict(type='xavier'))
    n.conv1 = L.Convolution(n.data, kernel_size=2, pad=1, param=dict(lr_mult=1), num_output=10, weight_filler=dict(type='xavier'))
    n.conv1 = L.Convolution(n.data, kernel_size=5, pad=0, num_output=20, weight_filler=dict(type='xavier'))
    n.pool1 = L.Pooling(n.conv1, kernel_size=5, stride=2, pool=P.Pooling.MAX)
    n.conv2 = L.Convolution(n.pool1, kernel_size=3, num_output=10, weight_filler=dict(type='xavier'))
    n.pool2 = L.Pooling(n.conv2, kernel_size=2, stride=2, pool=P.Pooling.MAX)
    n.ip1 = L.InnerProduct(n.pool2, num_output=100, weight_filler=dict(type='xavier'))
    n.relu1 = L.ReLU(n.ip1, in_place=True)
    n.ip2 = L.InnerProduct(n.relu1, num_output=2, weight_filler=dict(type='xavier'))
    n.loss = L.SoftmaxWithLoss(n.ip2, n.label)
    return n.to_proto()
This is my training function, which I took from the LeNet example:
def train(solver):
    niter = 200
    test_interval = 10
    train_loss = zeros(niter)
    test_acc = zeros(int(np.ceil(niter / test_interval)))
    output = zeros((niter, 32, 2))
    for it in range(niter):
        solver.step(1)
        train_loss[it] = solver.net.blobs['loss'].data
        solver.test_nets[0].forward(start='conv1')
        output[it] = solver.test_nets[0].blobs['ip2'].data[:32]
        if it % test_interval == 0:
            print 'Iteration', it, 'testing...'
            correct = 0
            for test_it in range(100):
                solver.test_nets[0].forward()
                correct += sum(solver.test_nets[0].blobs['ip2'].data.argmax(1) == solver.test_nets[0].blobs['label'].data)
            test_acc[it // test_interval] = correct / 1e4
    _, ax1 = subplots()
    ax2 = ax1.twinx()
    ax1.plot(arange(niter), train_loss)
    ax2.plot(test_interval * arange(len(test_acc)), test_acc, 'r')
    ax1.set_xlabel('iteration')
    ax1.set_ylabel('train loss')
    ax2.set_ylabel('test accuracy')
    show()
I'm only using the Neutral and Surprise faces (if I solve my problem I'll use more emotions). However, my net reaches only 28% accuracy. I'd like to know whether the low accuracy has to do with a problem in the network configuration, with the logic inside my train function, or with my training database being too small. This is my dataset description:
Train dataset: 56 images of Neutral faces. 60 images of Surprise faces.
Test dataset: 15 images of Neutral faces. 15 images of Surprise faces.
All images are 32x32 and in grayscale.
And my batch_size is 32.
Please, can someone help me find out what my problem is?
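One thing that may be worth checking in the train function is the accuracy denominator. The sketch below just redoes the arithmetic in plain Python, assuming the test net also uses a batch size of 32 (the question does not show the test configuration, so this is a guess). The constant 1e4 equals 100 iterations of 100 images, which is presumably what the original example was written for.

```python
test_iterations = 100  # range(100) in the test loop
batch_size = 32        # assumed to be the test net's batch size as well
divisor = 1e4          # constant used in `correct / 1e4`

# Number of predictions actually compared against labels per test pass.
predictions_scored = test_iterations * batch_size
# Highest value test_acc could take even if every prediction were right.
max_reported_accuracy = predictions_scored / divisor
print(predictions_scored, max_reported_accuracy)  # 3200 0.32

# What a reported 28% would correspond to with the matching denominator.
reported = 0.28
rescaled = reported * divisor / predictions_scored
print(round(rescaled, 3))  # 0.875
```

If that assumption holds, the reported value can never exceed 0.32, so dividing by the real number of scored predictions would give a more meaningful figure. Note also that with only 30 test images, 100 forward passes score the same images many times.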
