Why are my inception-v3 (in Keras) predictions all wrong? - machine-learning

I am an ML beginner and simply implementing inception-v3 using the ImageNet weights. This is my first run at it. My implementation is in Keras. My predictions are all wrong and I need a little foot up, to see what the problem is. It is actaully pretty difficult to find an example of inception-v3 used from top to bottom using Keras online. Most are tutorials on transfer learning. Here is my code.
import keras as k
from keras.applications.inception_v3 import InceptionV3
from keras.applications.imagenet_utils import preprocess_input, decode_predictions
from keras.preprocessing import image
import cv2
import numpy as np
model = k.applications.inception_v3.InceptionV3(include_top=True, weights='imagenet', input_tensor=None, input_shape=None)
im = 'images/cat.jpg'
cv2.imread(im).shape
(168, 299, 3)
im = cv2.resize(cv2.imread(im), (299, 299)).astype(np.float32)
im = np.expand_dims(im, axis=0)
im.shape
(1, 299, 299, 3)
preds = model.predict(im)
print('Predicted:', decode_predictions(preds
Predicted: [[('n03047690', 'clog', 1.0), ('n01924916', 'flatworm', 7.0789714e-11), ('n03950228', 'pitcher', 2.1705252e-11), ('n02841315', 'binoculars', 4.1622389e-13), ('n06359193', 'web_site', 3.8697981e-16)]]
Could someone suggest how this most basic of implementations is wrong. Perhaps my input shape is incorrect?

The inception-v3 model requires that you run your image through preprocess_input() before predicting.
add:
im=preprocess_input(im)
Also, you should import preprocess_input from keras.applications.inception_v3 and not from keras.applications.imagenet_utils

Related

How does the training happen in LBPH recognizer in open-cv?

import os
import cv2
import numpy as np
from PIL import Image
class training:
def train():
reconizer = cv2.face.LBPHFaceRecognizer_create()
\\ some code part
reconizer.train(faces,Id) \\\\ <----- this line
reconizer.save('reconizer/traindata.yml')
cv2.destroyAllWindows()
What is exactly getting trained in this step. is it some machine learning algorithm going behind or else ?

Flatten layer incompatible with input

I am trying to run the code
import data_processing as dp
import numpy as np
test_set = dp.read_data("./data2019-12-01.csv")
import tensorflow as tf
import keras
def train_model():
autoencoder = keras.Sequential([
keras.layers.Flatten(input_shape=[400]),
keras.layers.Dense(150,name='bottleneck'),
keras.layers.Dense(400,activation='sigmoid')
])
autoencoder.compile(optimizer='adam',loss='mse')
return autoencoder
trained_model=train_model()
trained_model.load_weights('./weightsfile.h5')
trained_model.evaluate(test_set,test_set)
The test_set in line 3 is of numpy array of shape (3280977,400). I am using keras 2.1.4 and tensorflow 1.5.
However, this puts out the following error
ValueError: Input 0 is incompatible with layer flatten_1: expected min_ndim=3, found ndim=2
How can I solve it? I tried changing the input_shape in flatten layer and also searched on the internet for possible solutions but none of them worked out. Can anyone help me out here? Thanks
After much trial and error, I was able to run the code. This is the code which runs:-
import data_processing as dp
import numpy as np
test_set = np.array(dp.read_data("./datanew.csv"))
print(np.shape(test_set))
import tensorflow as tf
from tensorflow import keras
# import keras
def train_model():
autoencoder = keras.Sequential([
keras.layers.Flatten(input_shape=[400]),
keras.layers.Dense(150,name='bottleneck'),
keras.layers.Dense(400,activation='sigmoid')
])
autoencoder.compile(optimizer='adam',loss='mse')
return autoencoder
trained_model=train_model()
trained_model.load_weights('./weightsfile.h5')
trained_model.evaluate(test_set,test_set)
The change I made is I replaced
import keras
with
from tensorflow import keras
This may work for others also, who are using old versions of tensorflow and keras. I used tensorflow 1.5 and keras 2.1.4 in my code.
Keras and TensorFlow only accept batch input data for prediction.
You must 'simulate' the batch index dimension.
For example, if your data is of shape (M x N), you need to feed at the prediction step a tensor of form (K x M x N), where K is the batch_dimension.
Simulating the batch axis is very easy, you can use numpy to achieve that:
Using: np.expand_dims(axis = 0), for an input tensor of shape M x N, you now have the shape 1 x M x N. This why you get that error, that missing '1' or 'K', the third dimension is that batch_index.

How to know if my data has been scaled by StandardScaler?

"I have scaled my dataset by using Standard Scaler , Now how to know it has been scaled, I am sure it has been scaled but how to see it"
As #Coderji said you can always find out the mean and standard deviation, which should be equal to 0 and 1 respectively.
However, there is another method to visualize it.
from sklearn import datasets
import numpy as np
from sklearn.preprocessing import StandardScaler
I am using iris dataset for this example.
iris = datasets.load_iris()
X = iris.data
sc = StandardScaler()
sc.fit(X)
x = sc.transform(X)
import matplotlib.pyplot as plt
import seaborn as sns
sns.distplot(x[:,1])
See this Output for sepel length
Similarly you can see for all variables or a simple pairplot will do the job.
This gives an idea that the data is standardised visually.

how to get more accuracy on CNN with less number of images

currently I am working on flower Classification dataset of kaggle which has only 210 images, with this set of image I am getting accuracy of only 11% on validation set.
enter code here
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import cv2
#from tqdm import tqdm
import os
import warnings
warnings.filterwarnings('ignore')
flower_img = r'C:\Users\asus\Downloads\flower_images\flower_images'
data = pd.read_csv(r'C:\Users\asus\Downloads\flower_images\flower_labels.csv')
img = os.listdir(flower_img)[1]
image_name = [img.split('.')[-2] for img in os.listdir(flower_img)]
label_array = np.array(data['label'])
label_unique = np.unique(label_array)
names = [' phlox','rose','calendula','iris','leucanthemum maximum','bellflower','viola','rudbeckia laciniata','peony','aquilegia']
Flower_names = {}
for i in range(10):
Flower_names[i] = names[i]
print(Flower_names)
Flower_names.get(8)
x = data['label'][2]
Flower_names.get(x)
i=0
for img in os.listdir(flower_img):
#print(img)
path = os.path.join(flower_img,img)
#img = cv2.imread(path,cv2.IMREAD_GRAYSCALE)
img = cv2.imread(path)
#print(img.shape)
img = cv2.resize(img,(128,128))
data['file'][i] = np.array(img)
i+=1
data['file'][0].shape
plt.imshow(data['file'][0])
plt.show()
import keras
from keras.models import Sequential
from keras.layers import Dense,Conv2D,Activation,MaxPool2D,Dropout,Flatten
model = Sequential()
model.add(Conv2D(32,kernel_size=3,activation='relu',input_shape=(128,128,3)))
model.add(MaxPool2D(pool_size=(2,2)))
model.add(Conv2D(64,kernel_size=3,activation='relu'))
model.add(MaxPool2D(pool_size=(2,2)))
model.add(Conv2D(128,kernel_size=3,activation='relu'))
model.add(MaxPool2D(pool_size=(2,2)))
#model.add(Conv2D(512,kernel_size=3,activation='relu'))
#model.add(MaxPool2D(pool_size=(2,2)))
model.add(Flatten())
model.add(Dense(512,activation='relu'))
model.add(Dense(10,activation='softmax'))
model.add(Dropout(0.25))
from keras.optimizers import Adam
model.compile(loss='categorical_crossentropy',optimizer=Adam(lr=0.002),metrics=['accuracy'])
model.summary()
x = np.array([i for i in data['file']]).reshape(-1,128,128,3)
y = np.array([i for i in data['label']])
from keras.utils import to_categorical
y = to_categorical(y)
from sklearn.model_selection import train_test_split
x_train,x_test,y_train,y_test = train_test_split(x,y)
model.fit(x_train,y_train,validation_data=(x_test,y_test),epochs=10)
model.evaluate(x_test,y_test)
model.evaluate(x_train,y_train)
how can I increase accuracy only using this dataset also how can I predict classes for any input image.
Link of Flower color images dataset : https://www.kaggle.com/olgabelitskaya/flower-color-images
Your dataset size is very small. Convolutional neural networks are optimal when trained using very large data sets. You really want to have thousands of images (or more!) in your data set.
You can try to enhance your current data set by using various image processing techniques to increase the size of the data set. These techniques will take the original images, skew them, rotate them and do other modification to bolster the size of the training data. These techniques can be helpful, but increasing the natural size of the data set is preferred.
If you cannot increase the size of the dataset, you should examine why you need to use a CNN. There are other algorithms that may give better results when trained with a smaller data set. Take a look at Support Vector Machines or k-NN.
If you must use a CNN, Transfer Learning is a good solution. You can use the features from a trained model and apply them to your problem. I have had great success with this approach.
The things you can do:
Progressive resizing link
Image augmentation link
Transfer learning link
To be honest, there are much and much more techniques could be utilized to enhance the effectiveness of used data. Try to search about this topic. These ones are the ones that I remember in a minute. These ones that I've given link are just major example ones. You can dig better with a dedicated research.

Accuracy matrics isn't working on linear regression

Kindly help here:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
X = [[1.1],[1.3],[1.5],[2],[2.2],[2.9],[3],[3.2],[3.2],[3.7],[3.9],[4],[4],[4.1],[4.5],[4.9],[5.1],[5.3],[5.9],[6],[6.8],[7.1],[7.9],[8.2],[8.7],[9],[9.5],[9.6],[10.3],[10.5]]
y = [39343,46205,37731,43525,39891,56642,60150,54445,64445,57189,63218,55794,56957,57081,61111,67938,66029,83088,81363,93940,91738,98273,101302,113812,109431,105582,116969,112635,122391,121872]
#implement the dataset for train & test
from sklearn.cross_validation import train_test_split
X_train,X_test,y_train,y_test = train_test_split(X,y,test_size = 1/3,random_state=0)
#implement our classifier based on Simple Linear Regression
from sklearn.linear_model import LinearRegression
SimpleLinearRegression = LinearRegression()
SimpleLinearRegression.fit(X_train,y_train)
y_predict= SimpleLinearRegression.predict(X_test)
from sklearn.metrics import accuracy_score
print(accuracy_score(y_test,y_predict))
I'm sure I'm missing something here, is there some other way to calculate accuracy score for regression? Thanks in advance :)
Accuracy as a metric is applicable to a classification problem, as it is defined as a fraction of labels that is correctly predicted. In your case you do a regression (LinearRegression), i.e. your target variable is continuous. So either you picked a wrong model my mistake or accuracy is a wrong metric for your problem.
You can use mean absolute error and mean squared error.
from sklearn.metrics import mean_absolute_error, mean_squared_error
import numpy as np
MAE = mean_absolute_error(y_test, y_predict)
RMSE = np.sqrt(mean_squared_error(y_test, y_predict))
We can't use accuracy for regression problems, Its only used in classification problem.
You can use MSE,RMSE,MAPE,MAE as the matrix to determine how good your regression model is.
These values tell us how far we are from the correct predictions. The lower values are better for these cases.

Resources