Sequential model testing in Keras

I'm using a Sequential model because order is important in my actual data, so I'm testing it on a very simple problem, using this data structure because my actual project uses it. The Keras model is supposed to find the sum of the elements of each 4-element array and return an array of the results.
I'm new to machine learning, and I keep getting the same low accuracy in my results. Would it make a difference if I switched to PyTorch? Are my layers wrong?
Code:
import numpy as np
from tensorflow import keras
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Dropout, Activation
from tensorflow.keras import activations
from tensorflow.keras.optimizers import Adam

x_train = np.random.randint(2, size=(200, 50, 4))
y_train = []
for i in x_train:
    b = []
    for j in i:
        b.append(sum(j))
    y_train.append(b)
y_train = np.asarray(y_train)
model = Sequential()
model.add(Dense(50, input_shape=(50,4), activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(50, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(1))
model.compile(optimizer=Adam(), loss='mean_squared_error', metrics=['accuracy'])
model.fit(x_train, y_train, batch_size=50, epochs=150)
x_test = np.random.randint(2, size=(2,50, 4))
y_test = model.predict(x_test)
Updated sample of results, with accuracy 0.3 (left: predicted sum, right: input row):
[1.8860947] [1 1 0 0]
[1.8860947] [1 1 0 0]
[2.8001838] [1 0 1 1]
[0.9009073] [0 0 1 0]
[1.8860947] [1 1 0 0]
[0.9719887] [0 1 0 0]
[1.0507569] [1 0 0 0]
[1.8607054] [0 0 1 1]
[2.8148863] [0 1 1 1]
[3.6628482] [1 1 1 1]
[0.9719887] [0 1 0 0]
[3.6628482] [1 1 1 1]
[1.8607054] [0 0 1 1]
[0.9009073] [0 0 1 0]
[1.9050169] [0 1 1 0]

As far as I understand, you are trying to predict continuous values, but you have a non-linear activation at the output layer. Softmax should be used when you have more than two classes in your dataset, i.e. when building a classification model, and its outputs always sum to 1. Changing the following lines can help (a single linear output unit, and no accuracy metric, since accuracy is not meaningful for regression):
model.add(Dense(1))
model.compile(optimizer=Adam(), loss='mean_squared_error')
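For reference, here is a minimal end-to-end sketch of the corrected setup. The vectorized target computation and the removal of the Dropout layers are my own additions, not part of the answer above; heavy dropout makes this easy regression needlessly hard to fit.
import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
from tensorflow.keras.optimizers import Adam

x_train = np.random.randint(2, size=(200, 50, 4))
# Vectorized target: the sum over the last axis, kept as shape (200, 50, 1)
# so it matches the model's per-step output.
y_train = x_train.sum(axis=-1, keepdims=True).astype('float32')

model = Sequential()
model.add(Dense(50, input_shape=(50, 4), activation='relu'))
model.add(Dense(50, activation='relu'))
model.add(Dense(1))  # linear output for regression
model.compile(optimizer=Adam(), loss='mean_squared_error')
model.fit(x_train, y_train, batch_size=50, epochs=150, verbose=0)
With a linear output and MSE loss the network can recover the sum exactly, since sum(x) is just the linear map x · [1, 1, 1, 1]. Accuracy is omitted from compile() because it is not a meaningful metric for continuous targets.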

Related

Is it necessary to use cross-validation after data is split using StratifiedShuffleSplit?

I used StratifiedShuffleSplit to split the data, and now I am wondering whether I need to use cross-validation again as I go on to build the classification model (logistic regression, KNN, random forest, etc.). I am confused because, reading the scikit-learn documentation, I get the impression that StratifiedShuffleSplit is a mix of splitting the data and cross-validating at the same time.
StratifiedShuffleSplit just provides a list of train/test indices; how they are used is up to you.
You can fit the model on each train set, predict on the corresponding test set, and calculate the score manually, thereby implementing cross-validation yourself.
Or you can pass StratifiedShuffleSplit() to cross_val_score, and cross_val_score will do the same thing for you.
Example:
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import StratifiedShuffleSplit, cross_val_score
X = np.array([[1, 2], [3, 4], [1, 2], [3, 4], [1, 2], [3, 4]])
y = np.array([0, 0, 0, 1, 1, 1])
sss = StratifiedShuffleSplit(n_splits=5, test_size=0.5, random_state=0)
model = RandomForestClassifier(n_estimators=1, random_state=1)
# Calculate scores automatically
accuracy_per_split = cross_val_score(model, X, y, scoring="accuracy", cv=sss, n_jobs=1)
print(f"Accuracies per splits: {accuracy_per_split}")
# Calculate scores manually
accuracy_per_split = []
for train_index, test_index in sss.split(X, y):
    X_train, X_test = X[train_index], X[test_index]
    y_train, y_test = y[train_index], y[test_index]
    model.fit(X_train, y_train)
    y_pred = model.predict(X_test)
    acc = accuracy_score(y_test, y_pred)
    accuracy_per_split.append(acc)
print(f"Accuracies per split: {accuracy_per_split}")

CNN for non-image data

I am trying to create a model from this https://machinelearningmastery.com/cnn-models-for-human-activity-recognition-time-series-classification/ example. It takes 3 inputs for now (for debugging; eventually there will be thousands), each an array of dimension (17, 40):
[[[0 0 0 ... 0 0 0]
[0 0 0 ... 0 0 0]
[0 0 0 ... 5 5 5]
...
[0 0 0 ... 0 0 0]
[0 0 0 ... 0 0 0]
[0 0 0 ... 0 0 0]]
[[0 0 0 ... 0 0 0]
[0 0 0 ... 0 0 0]
[0 0 0 ... 0 0 0]
...
[0 0 0 ... 0 0 0]
[0 0 0 ... 0 0 0]
[0 0 0 ... 0 0 0]]
[[0 0 0 ... 0 0 0]
[0 0 0 ... 0 0 0]
[0 0 0 ... 0 0 0]
...
[0 0 0 ... 0 0 0]
[0 0 0 ... 0 0 0]
[0 0 0 ... 0 0 0]]]
the output is a single integer between 0 and 8:
[[6]
[3]
[1]]
I use a CNN as follows:
from sklearn.model_selection import train_test_split
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv1D, MaxPooling1D, Dropout, Flatten, Dense

X_train, X_test, y_train, y_test = train_test_split(Xo, Yo)
print("Xtrain", X_train)
print("Y_train", y_train)
print("Xtest", X_test)
print("Y_test", y_test)
print("X_train.shape[1]", X_train.shape[1])
print("X_train.shape[2]", X_train.shape[2])
#print("y_train.shape[1]", y_train.shape[1])
verbose, epochs, batch_size = 1, 10, 10
n_timesteps, n_features, n_outputs = X_train.shape[1], X_train.shape[2], 1
model = Sequential()
model.add(Conv1D(filters=64, kernel_size=2, activation='relu', input_shape=(n_timesteps,n_features)))
model.add(Conv1D(filters=64, kernel_size=2, activation='relu'))
model.add(Dropout(0.5))
model.add(MaxPooling1D(pool_size=2))
model.add(Flatten())
model.add(Dense(10, activation='relu'))
model.add(Dense(1, activation='softmax'))
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
it gives me the following error:
ValueError: You are passing a target array of shape (6, 1)
when in fact it should produce only one value as output. Why do I get this error when the model is only supposed to output a single value?
The softmax layer's size should equal the number of classes, but your softmax layer has only one output.
For this classification problem, first convert your targets to a one-hot encoded format, then set the size of the softmax layer to the number of classes.
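As a sketch of that fix, assuming the labels really span the nine integers 0 to 8 (so nine classes):
from tensorflow.keras.utils import to_categorical

num_classes = 9                                   # integer labels 0..8
y_train_oh = to_categorical(y_train, num_classes)
y_test_oh = to_categorical(y_test, num_classes)

# ...same Conv1D/MaxPooling1D/Flatten stack as above, but the head becomes:
model.add(Dense(10, activation='relu'))
model.add(Dense(num_classes, activation='softmax'))
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
model.fit(X_train, y_train_oh, epochs=epochs, batch_size=batch_size, verbose=verbose)
Alternatively, keep the integer targets and use loss='sparse_categorical_crossentropy', which achieves the same thing without the one-hot step.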

Meta classifier based on "or" logic in scikit-learn

How can I build a meta-classifier in scikit-learn out of N binary classifiers which will return 1 if any of the classifiers returns 1?
So far I've tried VotingClassifier, but it lacks the logic I need with both hard and soft voting, and Pipeline seems to be oriented towards sequential computation.
I can write the logic myself, but I am wondering if there is anything built-in.
The only built-in options are soft and hard voting. As you mentioned, though, we can subclass VotingClassifier and override its predict method with OR logic, based on the source code here. This custom meta-classifier can be used in a Pipeline as well.
import numpy as np
from sklearn.ensemble import VotingClassifier
from sklearn.utils.validation import check_is_fitted

class CustomMetaClassifier(VotingClassifier):
    def predict(self, X):
        """Predict class labels for X.

        Parameters
        ----------
        X : {array-like, sparse matrix}, shape = [n_samples, n_features]
            The input samples.

        Returns
        -------
        maj : array-like, shape = [n_samples]
            Predicted class labels.
        """
        check_is_fitted(self, 'estimators_')
        # OR logic: the row-wise maximum over the label-encoded votes is 1
        # as soon as any base estimator predicts the positive class.
        maj = np.max(self._predict(X), axis=1)
        maj = self.le_.inverse_transform(maj)
        return maj
>>> import numpy as np
>>> from sklearn.linear_model import LogisticRegression
>>> from sklearn.naive_bayes import GaussianNB
>>> from sklearn.ensemble import RandomForestClassifier, VotingClassifier
>>> clf1 = LogisticRegression(solver='lbfgs', multi_class='multinomial',
... random_state=1)
>>> clf2 = RandomForestClassifier(n_estimators=50, random_state=1)
>>> clf3 = GaussianNB()
>>> X = np.array([[-1, -1], [-2, -1], [-3, -2], [1, 1], [2, 1], [3, 2]])
>>> y = np.array([1, 1, 1, 2, 2, 2])
>>> eclf1 = CustomMetaClassifier(estimators=[
... ('lr', clf1), ('rf', clf2), ('gnb', clf3)])
>>> eclf1 = eclf1.fit(X, y)
>>> eclf1.predict(X)
array([1, 1, 1, 2, 2, 2])

Decrease loss in Keras training using LSTM

I have an input like this:
x_train = [
[0,0,0,1,-1,-1,1,0,1,0,...,0,1,-1],
[-1,0,0,-1,-1,0,1,1,1,...,-1,-1,0],
...
[1,0,0,1,1,0,-1,-1,-1,...,-1,-1,0]
]
where 1 means an increase in a metric, -1 means a decrease, and 0 means no change. Each array has 83 items for 83 fields, and the output (label) for each array is a categorical array that shows the effect of these metrics on a single target metric:
[[ 0. 0. 1.]
[ 1. 0. 0.],
[ 0. 0. 1.],
...
[ 0. 0. 1.],
[ 1. 0. 0.]]
I used keras and lstm in the following code:
def train(x, y, x_test, y_test):
    x_train = np.array(x)
    y_train = np.array(y)
    y_train = to_categorical(y_train, 3)

    model = Sequential()
    model.add(Embedding(x_train.shape[0], output_dim=256))
    model.add(LSTM(128))
    model.add(Dropout(0.5))
    model.add(Dense(3, activation='softmax'))

    opt = optimizers.SGD(lr=0.001)
    model.compile(loss='categorical_crossentropy',
                  optimizer=opt,
                  metrics=['accuracy'])
    model.fit(x_train, y_train, batch_size=128, nb_epoch=100)

    y_test = to_categorical(y_test, 3)
    score = model.evaluate(x_test, y_test, batch_size=128)
    prediction = model.predict(x_test, batch_size=128)
    print score
    print prediction
but the loss after 100 epochs is:
1618/1618 [==============================] - 0s - loss: 0.7328 - acc: 0.5556
How can I decrease this loss?

Is F1 micro the same as Accuracy?

I have tried many examples with F1 micro and Accuracy in scikit-learn and in all of them, I see that F1 micro is the same as Accuracy. Is this always true?
Script
from sklearn import svm
from sklearn import metrics
from sklearn.model_selection import train_test_split
from sklearn.datasets import load_iris
from sklearn.metrics import f1_score, accuracy_score

# prepare dataset
iris = load_iris()
X = iris.data[:, :2]
y = iris.target
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)

# svm classification
clf = svm.SVC(kernel='rbf', gamma=0.7, C=1.0).fit(X_train, y_train)
y_predicted = clf.predict(X_test)

# performance
print("Classification report for %s" % clf)
print(metrics.classification_report(y_test, y_predicted))
print("F1 micro: %1.4f\n" % f1_score(y_test, y_predicted, average='micro'))
print("F1 macro: %1.4f\n" % f1_score(y_test, y_predicted, average='macro'))
print("F1 weighted: %1.4f\n" % f1_score(y_test, y_predicted, average='weighted'))
print("Accuracy: %1.4f" % accuracy_score(y_test, y_predicted))
Output
Classification report for SVC(C=1.0, cache_size=200, class_weight=None, coef0=0.0,
decision_function_shape=None, degree=3, gamma=0.7, kernel='rbf',
max_iter=-1, probability=False, random_state=None, shrinking=True,
tol=0.001, verbose=False)
             precision    recall  f1-score   support

          0       1.00      0.90      0.95        10
          1       0.50      0.88      0.64         8
          2       0.86      0.50      0.63        12

avg / total       0.81      0.73      0.74        30
F1 micro: 0.7333
F1 macro: 0.7384
F1 weighted: 0.7381
Accuracy: 0.7333
F1 micro = Accuracy
In classification tasks for which every test case is guaranteed to be assigned to exactly one class, micro-F is equivalent to accuracy. It won't be the case in multi-label classification.
This is because we are dealing with multi-class, single-label classification, where every test sample belongs to exactly one class. In that setting every misclassification counts simultaneously as a false positive for the predicted class and a false negative for the true class, so the micro-averaged counts reduce to plain accuracy.
Formula-wise: F1 = 2 * precision * recall / (precision + recall).
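The multi-label caveat mentioned above is easy to verify. A small sketch with made-up label indicator matrices shows micro-F1 and accuracy diverging, because accuracy_score on multi-label data is the exact-match (subset) ratio, while micro-F1 pools true positives across all label slots:
>>> import numpy as np
>>> from sklearn.metrics import accuracy_score, f1_score
>>> y_true = np.array([[1, 0, 1], [0, 1, 0], [1, 1, 0]])  # multi-label indicators
>>> y_pred = np.array([[1, 0, 0], [0, 1, 0], [1, 0, 0]])
>>> f1_score(y_true, y_pred, average='micro')
0.75
>>> accuracy_score(y_true, y_pred)  # subset accuracy: exact-match rate
0.3333333333333333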
Micro-averaged precision, recall, F1, and accuracy are all equal for cases in which every instance must be classified into one (and only one) class. A simple way to see this is by looking at the formulas precision = TP/(TP+FP) and recall = TP/(TP+FN). The numerators are the same, and every FN for one class is another class's FP, which makes the denominators the same as well. If precision = recall, then F1 will also be equal.
For any such inputs, you should be able to show that:
from sklearn.metrics import accuracy_score as acc
from sklearn.metrics import f1_score as f1
f1(y_true, y_pred, average='micro') == acc(y_true, y_pred)
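For instance, on a made-up three-class example, all the micro-averaged metrics coincide with accuracy:
>>> from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score
>>> y_true = [0, 1, 2, 2, 1, 0, 2, 1]
>>> y_pred = [0, 2, 2, 1, 1, 0, 2, 0]
>>> accuracy_score(y_true, y_pred)
0.625
>>> precision_score(y_true, y_pred, average='micro')
0.625
>>> recall_score(y_true, y_pred, average='micro')
0.625
>>> f1_score(y_true, y_pred, average='micro')
0.625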
I had the same question, so I investigated and came up with this: just from the theory, it is impossible for accuracy and the f1-score to be the very same for every single dataset. The reason is that the (default, binary) f1-score is independent of the true negatives, while accuracy is not.
By taking a dataset where f1 = acc and adding true negatives to it, you get f1 != acc.
>>> from sklearn.metrics import accuracy_score as acc
>>> from sklearn.metrics import f1_score as f1
>>> y_pred = [0, 1, 1, 0, 1, 0]
>>> y_true = [0, 1, 1, 0, 0, 1]
>>> acc(y_true, y_pred)
0.6666666666666666
>>> f1(y_true,y_pred)
0.6666666666666666
>>> y_true = [0, 1, 1, 0, 1, 0, 0, 0, 0]
>>> y_pred = [0, 1, 1, 0, 0, 1, 0, 0, 0]
>>> acc(y_true, y_pred)
0.7777777777777778
>>> f1(y_true,y_pred)
0.6666666666666666
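Note that the f1 calls above use scikit-learn's default average='binary', which scores only the positive class. With average='micro', the equality with accuracy still holds even on the second dataset:
>>> f1(y_true, y_pred, average='micro')
0.7777777777777778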
