In a KNN classifier with K=7, suppose I have classified image pixels based on Euclidean distance and got an array of size 7 with 3 neighbours belonging to class 11, 3 to class 12 and 1 to class 2. To which class should I assign that object? (At the moment I simply exclude such cases.) What other logic could I add to handle ties? Please suggest something using scikit-learn or any other Python library. This is the code for that array:
# storing class numbers
new_dist = mat_dist[np.argsort(mat_dist[:, 0])]  # sort rows by distance
counter = np.zeros((17,))
for x in range(k):
    counter[int(new_dist[x][1])] += 1
if np.count_nonzero(counter == max(counter)) > 1:
    class_index = -1
    exclusion += 1  # cases with more than one class at max occurrence are excluded
else:
    class_index = np.argmax(counter)
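One way to handle a tie instead of excluding it is to break it by distance: among the tied classes, keep the one whose neighbours are closest in total. A minimal sketch of that idea, assuming mat_dist holds (distance, class) rows as in your code (the function name and the n_classes parameter are just placeholders):

import numpy as np

def classify_knn(mat_dist, k, n_classes=17):
    # Vote among the k nearest neighbours; break ties by summed distance.
    nearest = mat_dist[np.argsort(mat_dist[:, 0])][:k]
    votes = np.zeros(n_classes)
    dist_sum = np.zeros(n_classes)
    for d, c in nearest:
        votes[int(c)] += 1
        dist_sum[int(c)] += d
    tied = np.flatnonzero(votes == votes.max())
    if len(tied) == 1:
        return tied[0]
    # Tie: prefer the tied class whose neighbours are closest overall
    return tied[np.argmin(dist_sum[tied])]

Alternatively, scikit-learn's KNeighborsClassifier(n_neighbors=7, weights='distance') weights each vote by the inverse distance, so exact ties essentially disappear without any exclusion logic.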
Here's my code:
# Load libraries
import numpy as np
from sklearn.naive_bayes import MultinomialNB
from sklearn.feature_extraction.text import CountVectorizer
# Create text
text_data = np.array(['Tim is smart!',
'Joy is the best',
'Lisa is dumb',
'Fred is lazy',
'Lisa is lazy'])
# Create target vector
y = np.array([1,1,0,0,0])
# Create bag of words
count = CountVectorizer()
bag_of_words = count.fit_transform(text_data)
# Create feature matrix
X = bag_of_words.toarray()
mnb = MultinomialNB(alpha=1, fit_prior=True, class_prior=None)
mnb.fit(X, y)
print(count.get_feature_names())
# output:['best', 'dumb', 'fred', 'is', 'joy', 'lazy', 'lisa', 'smart', 'the', 'tim']
print(mnb.feature_log_prob_)
# output
[[-2.94443898 -2.2512918 -2.2512918 -1.55814462 -2.94443898 -1.84582669
-1.84582669 -2.94443898 -2.94443898 -2.94443898]
[-2.14006616 -2.83321334 -2.83321334 -1.73460106 -2.14006616 -2.83321334
-2.83321334 -2.14006616 -2.14006616 -2.14006616]]
My question is:
Let's say, for the word "best", the probability for class 1 is -2.14006616.
What is the formula used to calculate this score?
I am using log(P(best | y = class 1)) -> log(1/2), which does not give -2.14006616.
From the documentation we can infer that feature_log_prob_ corresponds to the empirical log probability of features given a class. Let's take the feature "best" for the purpose of this illustration: the log probability of this feature for class 1 is -2.14006616 (as you pointed out). Converting it back into an actual probability score gives np.exp(-2.14006616) = 0.11764. Let's take one more step back to see how and why the probability of "best" in class 1 is 0.11764. As per the documentation of Multinomial Naive Bayes, these probabilities are computed using the formula below:
theta_yi = (N_yi + alpha) / (N_y + alpha * n)

Here the numerator N_yi roughly corresponds to the number of times the feature "best" appears in class 1 (the class of interest in this example) in the training set, and the denominator N_y corresponds to the total count of all features for class 1. We also add a small smoothing value alpha to prevent the probabilities from going to zero, and n corresponds to the total number of features, i.e. the size of the vocabulary. Computing these numbers for the example we have:
N_yi = 1   # "best" appears only once in class `1`
N_y = 7    # total count of all words in class `1`
alpha = 1  # default smoothing value in sklearn
n = 10     # size of the vocabulary
Required_probability = (1 + 1) / (7 + 1 * 10) = 0.11764
You can do the math in a similar fashion for any given feature and class.
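To verify this end to end, here is a small sketch that recomputes the smoothed log probabilities by hand from the count matrix X and target y defined above, and compares them with what sklearn reports:

import numpy as np

alpha = 1.0
n = X.shape[1]  # vocabulary size
for cls in (0, 1):
    N_yi = X[y == cls].sum(axis=0)  # per-word counts within the class
    N_y = N_yi.sum()                # total word count in the class
    log_prob = np.log((N_yi + alpha) / (N_y + alpha * n))
    print(np.allclose(log_prob, mnb.feature_log_prob_[cls]))  # True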
I have 8 classes in a dataset of handwritten text and symbols, but the model only ever predicts one of the 8 classes. I'm using the Matterport implementation of Mask R-CNN.
class SymbolConfig(Config):
    # Give the configuration a recognizable name
    NAME = "symbols"

    IMAGES_PER_GPU = 2

    # Number of classes (including background)
    NUM_CLASSES = 1 + 7  # Background + symbols

    # Number of training steps per epoch
    STEPS_PER_EPOCH = 100

    # Skip detections with < 90% confidence
    DETECTION_MIN_CONFIDENCE = 0.9
Because you also need to add the classes in the load_custom() function, like:
self.add_class("datasetName", 1, "class1")
self.add_class("datasetName", 2, "class2")
......
self.add_class("datasetName", 8, "class8")
What I want to do is to generate a score of 0-100 based on the predictions of a three class classification model.
For example, the predict_proba of a 3-class logistic regression model gives me 3 probabilities x, y, z, as shown below:
  0    1    2
  x    y    z
Now, I want to generate a score of 0-100 based on these probabilities, where 0 is closer to class 0 and 100 is closer to class 2.
Try this:
prob['P'] = (prob['1']*1 + prob['2']*2) / 2
prob['0'] is multiplied by 0, so you don't need it. This is just the expected class number divided by 2, which maps the result onto [0, 1]; multiply by 100 to get your 0-100 score.
examples:
prob['0']=0.5, prob['1']=0.5, prob['2']=0==>prob['P']=0.25
prob['0']=0.75, prob['1']=0.25, prob['2']=0==>prob['P']=0.125
prob['0']=0.1, prob['1']=0.2, prob['2']=0.7==>prob['P']=0.8
prob['0']=0, prob['1']=0, prob['2']=1==>prob['P']=1
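Putting it together with predict_proba, a minimal sketch (clf and X_test stand in for your fitted model and data):

import numpy as np

# proba has shape (n_samples, 3); columns follow clf.classes_, i.e. [0, 1, 2]
proba = clf.predict_proba(X_test)

# Expected class number in [0, 2], rescaled to a 0-100 score
score = 100 * (proba[:, 1] * 1 + proba[:, 2] * 2) / 2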
I am trying to create a convolutional neural network for image classification, using one of the open-access GitHub codebases. I have two classes of images, but when I run one part of the code I keep getting this error:
/Users/user/anaconda/envs/tensorflow/lib/python3.5/site-packages/ipykernel/__main__.py:46: DeprecationWarning: elementwise == comparison failed; this will raise an error in the future.
This is the part of the code that raises the warning (although the origin of the error is probably somewhere else; my intuition tells me it lies in the labelling of the images, but I am not sure how to fix that. I tried relabelling multiple times and nothing worked).
def print_test_accuracy(show_example_errors=False,
                        show_confusion_matrix=False):
    # Number of images in the test-set.
    num_test = len(test_images)

    # Allocate an array for the predicted classes which
    # will be calculated in batches and filled into this array.
    cls_pred = np.zeros(shape=num_test, dtype=int)

    # Now calculate the predicted classes for the batches.
    # We will just iterate through all the batches.
    # There might be a more clever and Pythonic way of doing this.

    # The starting index for the next batch is denoted i.
    i = 0
    while i < num_test:
        # The ending index for the next batch is denoted j.
        j = min(i + test_batch_size, num_test)

        # Get the images from the test-set between index i and j.
        images = test_images[i:j, :]

        # Get the associated labels.
        labels = test_labels[i:j, :]

        # Create a feed-dict with these images and labels.
        feed_dict = {x: images,
                     y_true: labels}

        # Calculate the predicted class using TensorFlow.
        cls_pred[i:j] = session.run(y_pred_cls, feed_dict=feed_dict)

        # Set the start-index for the next batch to the
        # end-index of the current batch.
        i = j

    # Convenience variable for the true class-numbers of the test-set.
    cls_true = test_class_labels

    # Create a boolean array whether each image is correctly classified.
    correct = (cls_true == cls_pred)

    # Calculate the number of correctly classified images.
    # When summing a boolean array, False means 0 and True means 1.
    correct_sum = sum(correct)

    # Classification accuracy is the number of correctly classified
    # images divided by the total number of images in the test-set.
    acc = float(correct_sum) / num_test

    # Print the accuracy.
    msg = "Accuracy on Test-Set: {0:.1%} ({1} / {2})"
    print(msg.format(acc, correct_sum, num_test))

    # Plot some examples of mis-classifications, if desired.
    if show_example_errors:
        print("Example errors:")
        plot_example_errors(cls_pred=cls_pred, correct=correct)

    # Plot the confusion matrix, if desired.
    if show_confusion_matrix:
        print("Confusion Matrix:")
        plot_confusion_matrix(cls_pred=cls_pred)
Try tf.equal:
correct = tf.equal(cls_pred, cls_true)
or, if it is a probability distribution rather than just the argmax already:
correct = tf.equal(tf.argmax(cls_pred, 1), tf.argmax(cls_true, 1))
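For what it's worth, that DeprecationWarning usually means the two arrays handed to == cannot be broadcast against each other, e.g. one-hot labels of shape (n, num_classes) compared with a flat vector of class indices of shape (n,). A small sketch of the mismatch and a plain NumPy fix, assuming test_class_labels is one-hot encoded:

import numpy as np

cls_pred = np.array([0, 1, 1, 0])          # shape (4,): class indices
cls_true_onehot = np.eye(2)[[0, 1, 0, 0]]  # shape (4, 2): one-hot labels

# Shapes (4,) and (4, 2) cannot be broadcast, so older NumPy emits the
# "elementwise == comparison failed" DeprecationWarning and returns False:
# correct = (cls_true_onehot == cls_pred)

# Comparing class indices with class indices works as intended:
cls_true = np.argmax(cls_true_onehot, axis=1)  # shape (4,)
correct = (cls_true == cls_pred)               # array([ True,  True, False,  True])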
I have a one-vs-all classifier set. This set consists of, let's say, 3 classifiers (LibSVM SVMs), each trained on the data of one class against the data of all other classes. The current setup for a sample is that the classifier of the 3 classes that gives the highest score determines the matching class.
This setup gives a FAR and FRR result. The issue is that a single FAR/FRR pair is not enough to construct an ROC curve, which I need. I am wondering what I can do to produce an ROC curve.
This can be done using "multiclass ROC curves" (see e.g. this answer for more details). Usually, you either look at each class individually, or even at each pair of classes individually. I'll provide a short R example of how the first option could look; it is less complicated and still gives a good feeling for how well individual classes can be recognized.
You first need to obtain some class probabilities (for reproducibility, this is what you already have):
# Computing some class probabilities for a 3 class problem using repeated cross validation
library(caret)
model <- train(x = iris[,1:2], y = iris[,5], method = 'svmLinear',
               trControl = trainControl(method = 'repeatedcv', number = 10,
                                        repeats = 10, classProbs = T,
                                        savePredictions = T))
# those are the class probabilities for each sample
> model$pred
pred obs setosa versicolor virginica rowIndex C Resample
[...]
11 virginica virginica 1.202911e-02 0.411723759 0.57624713 116 1 Fold01.Rep01
12 versicolor virginica 4.970032e-02 0.692146087 0.25815359 122 1 Fold01.Rep01
13 virginica virginica 5.258769e-03 0.310586094 0.68415514 125 1 Fold01.Rep01
14 virginica virginica 4.321882e-05 0.202372698 0.79758408 131 1 Fold01.Rep01
15 versicolor virginica 1.057353e-03 0.559993337 0.43894931 147 1 Fold01.Rep01
[...]
Now you can look at the ROC curve for each class individually. For each curve, the FRR indicates how often samples of this class were predicted as some other class, while the FAR indicates how often a sample of some other class was predicted as this class:
library(pROC)  # the roc() function used below comes from the pROC package
plot(roc(predictor = model$pred$setosa, response = model$pred$obs=='setosa'), xlab = 'FAR', ylab = '1-FRR')
plot(roc(predictor = model$pred$versicolor, response = model$pred$obs=='versicolor'), add=T, col=2)
plot(roc(predictor = model$pred$virginica, response = model$pred$obs=='virginica'), add=T, col=3)
legend('bottomright', legend = c('setosa', 'versicolor', 'virginica'), col=1:3, lty=1)
As mentioned before, you could instead also look at the ROC curve for each pair of classes, but in my opinion that conveys much more information at once, hence it is more complicated and takes longer to grasp.
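If you are working in Python rather than R, a roughly equivalent one-vs-rest sketch with scikit-learn (same iris data, a linear SVM with probability estimates, out-of-fold predictions via cross_val_predict) would be:

import matplotlib.pyplot as plt
from sklearn.datasets import load_iris
from sklearn.metrics import roc_curve
from sklearn.model_selection import cross_val_predict
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)

# Out-of-fold class probabilities, analogous to caret's savePredictions
proba = cross_val_predict(SVC(kernel='linear', probability=True), X[:, :2], y,
                          cv=10, method='predict_proba')

# One ROC curve per class (one-vs-rest)
for cls, name in enumerate(['setosa', 'versicolor', 'virginica']):
    far, tpr, _ = roc_curve(y == cls, proba[:, cls])
    plt.plot(far, tpr, label=name)  # x-axis: FAR, y-axis: 1-FRR
plt.xlabel('FAR')
plt.ylabel('1-FRR')
plt.legend(loc='lower right')
plt.show()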