Keras pretrained Xception model always gives the prediction 'sewing_machine'

I'm using Keras pretrained model 'Xception' to do image recognition. However, no matter what picture I give Xception, the predictions are always:
Predicted: [[('n04179913', 'sewing_machine', 1.0), ('n15075141', 'toilet_tissue', 0.0), ('n02317335', 'starfish', 0.0), ('n02389026', 'sorrel', 0.0), ('n02364673', 'guinea_pig', 0.0)]]
Is there anything wrong with my code?
My code is:
from tensorflow.contrib.keras import applications as app
from tensorflow.contrib.keras import preprocessing as pp
import numpy as np
model = app.Xception(weights='imagenet', include_top=True)
img_path = 'test123.jpg'
img = pp.image.load_img(path=img_path, target_size=(299, 299))
x = pp.image.img_to_array(img)
x = np.expand_dims(x, axis=0)
x = app.xception.preprocess_input(x)
preds = model.predict(x)
print('Predicted:', app.xception.decode_predictions(preds))

Normalize the image by x/255. just before the predict call.
As far as I understand, the Xception model is trained on normalized intensities. I faced the same problem, so I normalized the pixel intensities by dividing them by 255. You can try the same. I hope it helps.
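A minimal sketch of that suggestion applied to the question's code, replacing preprocess_input with a plain rescale (note: xception.preprocess_input normally maps pixels to [-1, 1] already, so apply only one of the two normalizations):
img = pp.image.load_img(path=img_path, target_size=(299, 299))
x = pp.image.img_to_array(img)
x = np.expand_dims(x, axis=0)
x = x / 255.0  # manual normalization instead of preprocess_input
preds = model.predict(x)
print('Predicted:', app.xception.decode_predictions(preds))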

Related

Unexpected behaviour (inflated results on random data) in scikit-learn with nested cross-validation

When trying to train/evaluate a support vector machine in scikit-learn, I am experiencing some unexpected behaviour and I am wondering whether I am doing something wrong or whether this is a possible bug.
In a very specific subset of circumstances, nested cross-validation using GridSearchCV and SVM produces inflated predictive results, even with randomly generated data.
For instance, see this code:
from sklearn import svm
from sklearn.linear_model import LogisticRegression
import numpy as np
from sklearn.model_selection import GridSearchCV, StratifiedKFold, LeaveOneOut
from sklearn.metrics import roc_auc_score, brier_score_loss
from tqdm import tqdm
import pandas as pd
N = 20
N_FEATURES = 50
param_grid = {'C': [1e-5, 1e-3, 1, 1e3, 1e5]}
scores = []
for z in tqdm(range(100)):
    X = np.random.uniform(size=(N, N_FEATURES))
    y = np.random.binomial(1, 0.5, size=N)
    if z < 10:
        y = np.array([0, 1] * int(N / 2))
        y = np.random.permutation(y)
    for skf_outer in [StratifiedKFold(n_splits=5), LeaveOneOut()]:
        for skf_inner in [5, LeaveOneOut()]:
            for model in [svm.SVC(probability=True), LogisticRegression()]:
                y_pred, y_real = [], []
                for train_index, test_index in skf_outer.split(X, y):
                    X_train, X_test = X[train_index], X[test_index, :]
                    y_train, y_test = y[train_index], y[test_index]
                    clf = GridSearchCV(
                        model, param_grid, cv=skf_inner, n_jobs=-1,
                        scoring='neg_brier_score'
                    )
                    clf.fit(X_train, y_train)
                    predictions = clf.predict_proba(X_test)[:, 1]
                    y_pred.extend(predictions)
                    y_real.extend(y_test)
                scores.append([
                    str(skf_outer), str(skf_inner), str(model), np.mean(y),
                    brier_score_loss(np.array(y_real), np.array(y_pred)),
                    roc_auc_score(np.array(y_real), np.array(y_pred)),
                ])
df_scores = pd.DataFrame(scores)
df_scores.columns = ['skf_outer', 'skf_inner', 'model', 'y_label', 'brier', 'auc']
df_scores['y_0.5'] = df_scores['y_label'] == 0.5
df_scores = df_scores.groupby(['skf_outer', 'skf_inner', 'model', 'y_0.5']).mean()
print(df_scores)
In the following circumstances:
LeaveOneOut() is used in both the inner and outer loop of the CV
The SVM is used
The y labels are balanced (i.e. the mean of y is 0.5)
The predictions are much better than expected by random chance (AUC > 0.9, sometimes even 1; Brier of 0.15 or lower). I can replicate this when generating more samples, more features, etc. - the issue stays the same. Swapping the SVM for LogisticRegression (as shown in the analysis above) leads to the expected results (AUC 0.5, Brier of 0.25). And for the other scenarios (no LOO-CV in either the inner or outer loop, or a different distribution of y labels), the results are as expected.
Can anyone replicate this? Am I missing something obvious?
I've replicated this with an older version of sklearn (0.24.0) and the newest one (1.2.0).
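For reference, a minimal sketch isolating just the configuration described above (LOO in both loops, SVC, perfectly balanced random labels); this is a restriction of the question's own code, not new analysis:
import numpy as np
from sklearn import svm
from sklearn.model_selection import GridSearchCV, LeaveOneOut
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
X = rng.uniform(size=(20, 50))        # pure noise features
y = rng.permutation([0, 1] * 10)      # balanced labels, mean(y) == 0.5

y_pred, y_real = [], []
for train_index, test_index in LeaveOneOut().split(X, y):
    clf = GridSearchCV(svm.SVC(probability=True),
                       {'C': [1e-5, 1e-3, 1, 1e3, 1e5]},
                       cv=LeaveOneOut(), scoring='neg_brier_score')
    clf.fit(X[train_index], y[train_index])
    y_pred.extend(clf.predict_proba(X[test_index])[:, 1])
    y_real.extend(y[test_index])

# On random data this should hover around 0.5; the question reports > 0.9
print('AUC:', roc_auc_score(y_real, y_pred))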

sklearn GP returns zero std dev for predictions where it should be large

I am trying regression using the scikit-learn Gaussian process package. The standard deviation of the predictions is zero where it should be large.
kernel = ConstantKernel() + 1.0 * DotProduct() ** 0.3 + 1.0 * WhiteKernel()
gpr = GaussianProcessRegressor(
    kernel=kernel,
    alpha=0.3,
    normalize_y=True,
    random_state=123,
    n_restarts_optimizer=0
)
gpr.fit(X_train, y_train)
Here I have shown the samples from the posterior after training the model. They clearly show the standard deviation increasing along the x-axis.
This is the output I got. As the value increases along the x-axis the std dev should increase, whereas it is showing zero std dev.
Actual results should look something like this.
Is it a bug?
Full code to reproduce the issue:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import ConstantKernel, WhiteKernel, DotProduct
df = pd.read_csv('train.csv')
X_train = df.iloc[:, 0].to_numpy().reshape(-1, 1)
y_train = df.iloc[:, 1].to_numpy()
X_pred = np.linspace(0.01, 8.5, 1000).reshape(-1,1)
# Instantiate a Gaussian Process model
kernel = ConstantKernel() + 1.0 * DotProduct() ** 0.3 + 1.0 * WhiteKernel()
gpr = GaussianProcessRegressor(
    kernel=kernel,
    alpha=0.3,
    normalize_y=True,
    random_state=123,
    n_restarts_optimizer=0
)
gpr.fit(X_train, y_train)
print(
f"Kernel parameters before fit:\n{kernel} \n"
f"Kernel parameters after fit: \n{gpr.kernel_} \n"
f"Log-likelihood: {gpr.log_marginal_likelihood(gpr.kernel_.theta):.3f} \n"
f"Score = {gpr.score(X_train,y_train)}"
)
n_samples = 10
y_samples = gpr.sample_y(X_pred, n_samples)
for idx, single_prior in enumerate(y_samples.T):
    plt.plot(
        X_pred,
        single_prior,
        linestyle="--",
        alpha=0.7,
        label=f"Sampled function #{idx + 1}",
    )
plt.title('Sample from posterior distribution')
plt.show()
y_pred, sigma = gpr.predict(X_pred, return_std=True)
plt.figure(figsize=(10,6))
plt.plot(X_train, y_train, 'r.', markersize=3, label='Observations')
plt.plot(X_pred, y_pred, 'b-', label='Prediction',)
plt.fill_between(X_pred[:, 0], y_pred - 1 * sigma, y_pred + 1 * sigma,
                 alpha=.4, fc='b', ec='None', label='68% confidence interval')
plt.fill_between(X_pred[:, 0], y_pred - 2 * sigma, y_pred + 2 * sigma,
                 alpha=.3, fc='b', ec='None', label='95% confidence interval')
plt.fill_between(X_pred[:, 0], y_pred - 3 * sigma, y_pred + 3 * sigma,
                 alpha=.1, fc='b', ec='None', label='99% confidence interval')
plt.legend()
plt.show()
Not really an answer, but something to look out for that might help: I was having the same problem and got some results after changing the alpha, some kernel parameters, or normalizing the data.
It was probably a matter of scale (with big numbers, the std dev is too small in proportion).
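A minimal sketch of that suggestion, standardizing the question's X_train/y_train before fitting (StandardScaler and the smaller alpha are my additions for illustration, not part of the original answer):
from sklearn.preprocessing import StandardScaler
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import ConstantKernel, WhiteKernel, DotProduct

# Standardize inputs and targets so kernel scales are not dwarfed by the data
x_scaler = StandardScaler()
y_scaler = StandardScaler()
X_train_s = x_scaler.fit_transform(X_train)
y_train_s = y_scaler.fit_transform(y_train.reshape(-1, 1)).ravel()

kernel = ConstantKernel() + 1.0 * DotProduct() ** 0.3 + 1.0 * WhiteKernel()
gpr = GaussianProcessRegressor(kernel=kernel, alpha=1e-2,  # alpha is worth tuning too
                               random_state=123)
gpr.fit(X_train_s, y_train_s)

y_pred_s, sigma_s = gpr.predict(x_scaler.transform(X_pred), return_std=True)
# Undo the target scaling; the std dev scales by the target's standard deviation
y_pred = y_scaler.inverse_transform(y_pred_s.reshape(-1, 1)).ravel()
sigma = sigma_s * y_scaler.scale_[0]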

How can I plot a confusion matrix for image dataset from directory?

I've built my own neural network model, trained it, and got 99.58% accuracy. But I am facing a problem with plotting the confusion matrix. There are some examples available for flow_from_directory, but none exist for image_dataset_from_directory. Can anyone help me?
See the post How to plot confusion matrix for prefetched dataset in Tensorflow, which uses
true_categories = tf.concat([y for x, y in val_ds], axis=0)
to get the true labels for the validation set. Then you can plot the confusion matrix with something like this:
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.metrics import confusion_matrix
cm = confusion_matrix(true_categories, predicted_id)
fig = plt.figure(figsize = (8,8))
ax1 = fig.add_subplot(1,1,1)
sns.set(font_scale=1.4) #for label size
sns.heatmap(cm, annot=True, annot_kws={"size": 12},
            cbar=False, cmap='Purples')
ax1.set_ylabel('True Values',fontsize=14)
ax1.set_xlabel('Predicted Values',fontsize=14)
plt.show()
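In that snippet, predicted_id is assumed to hold the model's predicted class indices for the same validation set; one way to build it (my sketch, assuming a trained model and the val_ds from above) is:
import tensorflow as tf

# Concatenate the argmax of the model's predictions over each batch
predicted_id = tf.concat(
    [tf.argmax(model.predict(x), axis=1) for x, y in val_ds], axis=0
)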
Here is the code I created to assemble the confusion matrix.
Note:
test_dataset is a tf.data.Dataset variable.
I used validation_dataset = tf.keras.preprocessing.image_dataset_from_directory()
import tensorflow as tf
y_true = []
y_pred = []
for x, y in validation_dataset:
    # labels are one-hot encoded; argmax recovers the class index
    y = tf.argmax(y, axis=1)
    y_true.append(y)
    y_pred.append(tf.argmax(model.predict(x), axis=1))
y_pred = tf.concat(y_pred, axis=0)
y_true = tf.concat(y_true, axis=0)
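From there, y_true and y_pred can go straight into the confusion-matrix code from the first answer, for example:
from sklearn.metrics import confusion_matrix

cm = confusion_matrix(y_true, y_pred)  # y_true/y_pred from the loop above
print(cm)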

How to extract coefficients from fitted pipeline for penalized logistic regression?

I have a set of training data that consists of X, which is a set of n columns of data (features), and Y, which is a single column containing the target variable.
I am trying to train my model with logistic regression using the following pipeline:
pipeline = sklearn.pipeline.Pipeline([
    ('logistic_regression', LogisticRegression(penalty='none', C=10))
])
My goal is to obtain the values of each of the n coefficients corresponding to the features, under the assumption of a linear model (y = coeff_0 + coeff_1*x1 + ... + coeff_n*xn).
What I tried was to train this pipeline on my data with model = pipeline.fit(X, Y). So I think I now have a model that contains the coefficients I want. However, I don't know how to access them. I'm looking for something like model.best_params_('logistic_regression').
Does anyone know how to extract the fitted coefficients from a model like this?
Have a look at the scikit-learn documentation for Pipeline; this example is inspired by it:
from sklearn import svm
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest
from sklearn.feature_selection import f_regression
from sklearn.pipeline import Pipeline
# generate some data to play with
X, y = make_classification(n_informative=5, n_redundant=0, random_state=42)
# ANOVA SVM-C
anova_filter = SelectKBest(f_regression, k=5)
clf = svm.SVC(kernel='linear')
anova_svm = Pipeline([('anova', anova_filter), ('svc', clf)])
anova_svm.set_params(anova__k=10, svc__C=.1).fit(X, y)
# access coefficients
print(anova_svm['svc'].coef_)
model.coef_ does the job; .best_params_ is usually associated with GridSearch, i.e. hyperparameter optimization.
In your specific case try: model['logistic_regression'].coef_ (note the attribute is coef_, not coefs_).
An example of getting the coefficients from a pipeline:
from sklearn.linear_model import LogisticRegression
from sklearn.datasets import load_iris
from sklearn.pipeline import Pipeline
X, y = load_iris(return_X_y=True)
pipeline = Pipeline([('lr', LogisticRegression(penalty='l2', C=10))])
pipeline.fit(X, y)
pipeline['lr'].coef_
array([[-0.42923513, 2.08235619, -4.28084811, -1.97174699],
[ 1.06321671, -0.08077595, -0.46911772, -2.3221883 ],
[-0.63398158, -2.00158024, 4.74996583, 4.29393529]])
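If the goal is to line each coefficient up with its feature name, a small sketch building on the iris example above (coef_ has one row per class in the multiclass case):
from sklearn.datasets import load_iris

feature_names = load_iris().feature_names
for cls, coefs in zip(pipeline['lr'].classes_, pipeline['lr'].coef_):
    print(cls, dict(zip(feature_names, coefs.round(3))))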
Here is how to visualize the coefficients and measure model accuracy. I used baby weight, height, and gestation period to predict preterm birth.
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
from sklearn.pipeline import Pipeline
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_curve, confusion_matrix

# df is assumed to be an already-loaded DataFrame with the baby-weight data
pipeline = Pipeline([('lr', LogisticRegression(penalty='l2', C=10))])
scaler = StandardScaler()
# X = np.array(df['gestation_wks']).reshape(-1, 1)
X = scaler.fit_transform(df[['bwt_lbs', 'height_ft', 'gestation_wks']])
y = np.array(df['PreTerm'])
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=123)
pipeline.fit(X_train, y_train)
y_pred_prob = pipeline.predict_proba(X_test)
predictions = pipeline.predict(X_test)
print(predictions)
sns.countplot(x=predictions)
plt.show()
print(pipeline['lr'].coef_)
print(pipeline['lr'].intercept_)
print('Coefficients close to zero will contribute little to the end result')
num_err = np.sum(y != pipeline.predict(X))
print("Number of errors:", num_err)

def my_loss(y, w):
    # Sum of squared differences between true and predicted targets
    s = 0
    for i in range(y.size):
        y_i_true = y[i]
        y_i_pred = w[i]
        s = s + (y_i_true - y_i_pred)**2
    return s

print("Loss:", my_loss(y_test, predictions))
fpr, tpr, thresholds = roc_curve(y_test, y_pred_prob[:, 1])
plt.plot([0, 1], [0, 1], 'k--')
plt.plot(fpr, tpr)
plt.xlabel('False Positive Rate')
plt.ylabel('True Positive Rate')
plt.title('ROC Curve')
plt.show()
accuracy = round(pipeline['lr'].score(X_train, y_train) * 100, 2)
print("Model Accuracy={accuracy}".format(accuracy=accuracy))
cm = confusion_matrix(y_test, predictions)
print(cm)

What should the image dimensions be in a convolutional neural network?

I'm a beginner in deep learning.
I'm trying to identify slums using satellite images (Google Maps) for Pune city. So, in the training dataset I have provided about 100 images of slums and 100 images of other areas. But my model is not able to classify input images properly, even though the accuracy rate is high.
I think this might be because of the dimensions of the image.
I'm resizing all images to 128*128 pixels.
The kernel size is 3*3.
Link to the map:
https://www.google.co.in/maps/#18.5129661,73.822531,286m/data=!3m1!1e3?hl=en
Following is the code
import os,cv2
import glob
import numpy as np
from keras.utils import plot_model
from keras.utils.np_utils import to_categorical
from sklearn.utils import shuffle
from sklearn.model_selection import train_test_split
from keras.models import Model
from keras.layers import Input, Convolution2D, MaxPooling2D, Flatten, Dense, Dropout
PATH = os.getcwd()
data_path = PATH + '/dataset/*'
files = glob.glob(data_path)
X = []
for myFiles in files:
    image = cv2.imread(myFiles)
    image_resize = cv2.resize(image, (256, 256))
    X.append(image_resize)
image_data = np.array(X)
image_data = image_data.astype('float32')
image_data /= 255
print("Image_data shape ", image_data.shape)
no_of_classes = 2
no_of_samples = image_data.shape[0]
label = np.ones(no_of_samples, dtype='int64')
label[0:86] = 0 #Slum
label[87:] = 1 #noSlum
Y = to_categorical(label, no_of_classes)
#shuffle dataset
x,y = shuffle(image_data , Y, random_state = 2)
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.2, random_state = 2)
#print(x_train)
#print(y_train)
input_shape = image_data[0].shape
input = Input(input_shape)
conv_1 = Convolution2D(32,(3,3), padding='same', activation='relu')(input)
conv_2 = Convolution2D(32,(3,3), padding = 'same', activation = 'relu')(conv_1)
pool_1 = MaxPooling2D(pool_size = (2,2))(conv_2)
drop_1 = Dropout(0.5)(pool_1)
conv_3 = Convolution2D(64,(3,3), padding='same', activation='relu')(drop_1)
conv_4 = Convolution2D(64,(3,3), padding='same', activation = 'relu')(conv_3)
pool_2 = MaxPooling2D(pool_size = (2,2))(conv_4)
drop_2 = Dropout(0.5)(pool_2)
flat_1 = Flatten()(drop_2)
hidden = Dense(64,activation='relu')(flat_1)
drop_3 = Dropout(0.5)(hidden)
out = Dense(no_of_classes,activation = 'softmax')(drop_3)
model = Model(inputs = input, outputs = out)
model.compile(loss = 'categorical_crossentropy', optimizer = 'rmsprop', metrics= ['accuracy'])
model.fit(x_train, y_train, batch_size=10, epochs=20, verbose=1, validation_data=(x_test, y_test))
model.save('model.h5')
score = model.evaluate(x_test,y_test,verbose=1)
print('Test Loss: ',score[0])
print('Test Accuracy: ',score[1])
test_image = x_test[0:1]
print(test_image.shape)
print (model.predict(test_image))
Usually, the behaviour you've described resembles the inability of a NN to identify small objects in the input images. Just imagine you give it a 128*128 image of rough noise where nothing can be seen - would you expect the NN to classify the objects correctly?
What to do?
1) Manually convert some input images from your dataset to 128*128 and look at what data you are truly training your NN on. This will give you more insight - maybe you need a larger image dimension.
2) Add more Conv layers with more filters, which will give you the ability to detect smaller and more sophisticated objects by adding more non-linearity to the output function. Look up network architectures such as ResNet.
3) Add more training data; 100 images isn't enough for an appropriate result.
4) Add data augmentation as well (rotations seem especially useful in your case); a minimal sketch follows below.
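For point 4, a rough sketch using Keras' ImageDataGenerator, wired into the question's x_train/y_train and model (the parameter values are illustrative assumptions, not tuned for this dataset):
from keras.preprocessing.image import ImageDataGenerator

# Illustrative augmentation settings; rotation is the key one suggested above
datagen = ImageDataGenerator(
    rotation_range=90,        # satellite tiles have no canonical orientation
    horizontal_flip=True,
    vertical_flip=True,
    width_shift_range=0.1,
    height_shift_range=0.1,
)
datagen.fit(x_train)

# Stands in for the plain model.fit call from the question
model.fit_generator(datagen.flow(x_train, y_train, batch_size=10),
                    steps_per_epoch=len(x_train) // 10,
                    epochs=20,
                    validation_data=(x_test, y_test))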
And don't give up :) Eventually, you'll figure it out. Good luck!
