How to get the middle point of contour heatmap(s) - opencv

I want to use the GradCAM activation to infer the object location in the image. It is not really a problem if there is only one contour heatmap, since I can simply use argmax to get that. But I want to be able to grab more than one heatmap, hoping that with more than one heatmap we can pinpoint the location more accurately.
Here is the example
import matplotlib.pyplot as plt
activation_map = [[0.0724, 0.0615, 0.0607, 0.0710, 0.0000, 0.0000, 0.0154],
[0.1111, 0.0835, 0.0923, 0.0409, 0.0000, 0.0000, 0.0000],
[0.0986, 0.0860, 0.1138, 0.0706, 0.0144, 0.0000, 0.0000],
[0.1134, 0.1109, 0.2244, 0.3414, 0.2652, 0.2708, 0.1664],
[0.1165, 0.1620, 0.5605, 0.7064, 0.4593, 0.6628, 0.6103],
[0.0852, 0.2324, 1.0000, 0.8605, 0.5095, 0.8457, 0.8332],
[0.0349, 0.2422, 0.9287, 0.5717, 0.2054, 0.4749, 0.6983]]
plt.imshow(activation_map)
After I rescale this activation map to proper scaling it looks something like this
import cv2
import numpy as np
activation_map_resized = cv2.resize(np.array(activation_map), (64, 64))
plt.imshow(activation_map_resized)
I want to be able to get the points around (22, 50) and (55, 50).
Is there any method I can use to find the centers of those two contour heatmaps efficiently? I can imagine using gradients or some clustering, but I'm not sure whether that is the most efficient method to use.

What you are looking for is called peak detection. In your case you'll have to threshold the data first, as it has some "low" peaks in it that you would like to ignore.
import numpy as np
from scipy.ndimage import maximum_filter, generate_binary_structure, binary_erosion
import matplotlib.pyplot as pp

def detect_peaks(image):
    # a pixel is a candidate peak if it equals the maximum of its 3x3 neighbourhood
    neighborhood = generate_binary_structure(2, 2)
    local_max = maximum_filter(image, footprint=neighborhood) == image
    # the flat zero background also satisfies that condition, so remove it
    background = (image == 0)
    eroded_background = binary_erosion(background, structure=neighborhood, border_value=1)
    detected_peaks = local_max ^ eroded_background
    return detected_peaks
activation_map = np.array([[0.0724, 0.0615, 0.0607, 0.0710, 0.0000, 0.0000, 0.0154],
[0.1111, 0.0835, 0.0923, 0.0409, 0.0000, 0.0000, 0.0000],
[0.0986, 0.0860, 0.1138, 0.0706, 0.0144, 0.0000, 0.0000],
[0.1134, 0.1109, 0.2244, 0.3414, 0.2652, 0.2708, 0.1664],
[0.1165, 0.1620, 0.5605, 0.7064, 0.4593, 0.6628, 0.6103],
[0.0852, 0.2324, 1.0000, 0.8605, 0.5095, 0.8457, 0.8332],
[0.0349, 0.2422, 0.9287, 0.5717, 0.2054, 0.4749, 0.6983]])
#Thresholding - remove this line to expose "low" peaks
activation_map[activation_map<0.3] = 0
detected_peaks = detect_peaks(activation_map)
pp.subplot(4,2,1)
pp.imshow(activation_map)
pp.subplot(4,2,2)
pp.imshow(detected_peaks)
pp.show()
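If you also need the coordinates of those peaks rather than just a boolean mask, one option is to take the indices of the detected peaks and rescale them to the resized map. A minimal sketch (not part of the original answer), assuming detected_peaks and the numpy activation_map from above and the roughly uniform 7x7 -> 64x64 scaling used in the question:
import numpy as np

# (row, col) indices of the peaks in the original 7x7 map;
# for this data it should be roughly [[5, 2], [5, 5]]
peaks = np.argwhere(detected_peaks)

# map cell indices to cell centres on the 64x64 resized map
scale = 64 / np.array(activation_map.shape)
peak_centres = (peaks + 0.5) * scale
print(peak_centres)   # approximate (row, col) centres, close to the points in the question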
Reference/more detailed info about peak detection:
Peak detection in a 2D array

Related

Find the optimal k (hyperparameter) of KNN given a k vs AUC plot, such that cv_auc is maximum and the gap between train_auc and cv_auc is minimum

I have a task to find the optimal hyperparameter(k) of KNN. I plotted the k vs AUC curve using roc_auc_score. I am supposed to find k such that cv_auc is maximum and the gap between train_auc and cv_auc is minimum. How can I achieve that?
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import roc_auc_score
import matplotlib.pyplot as plt
import numpy as np

train_auc = []
cv_auc = []
k = [i for i in range(1, 50, 5)]
for i in k:
    knn = KNeighborsClassifier(n_neighbors=i)
    knn.fit(x_train_bow, y_train)
    y_train_pred = knn.predict_proba(x_train_bow)[:, 1]
    y_cv_pred = knn.predict_proba(x_cv_bow)[:, 1]
    train_auc.append(roc_auc_score(y_train, y_train_pred))
    cv_auc.append(roc_auc_score(y_cv, y_cv_pred))

# plot train and CV AUC against k
plt.plot(k, train_auc, label="Train AUC")
plt.plot(k, cv_auc, label="CV AUC")
plt.legend()
plt.xlabel('K: hyperparameter')
plt.ylabel('AUC')
plt.title("Error plot")
plt.show()
[picture of the AUC vs k curve]
print(cv_auc)
print(cv_auc.index(max(cv_auc)))
array1 = np.array(train_auc)
array2 = np.array(cv_auc)
subtracted_array = np.subtract(array1, array2)
subtracted = list(subtracted_array)
print(subtracted)
subtracted.index(min(subtracted))
Output:
[0.6241694315220194, 0.6985803616697652, 0.7222662029418654, 0.7429448007376901, 0.7433472984472336, 0.7492335494812746, 0.7499829512940709, 0.7594353468596283, 0.757365782209453, 0.7518153165574067]
7
[0.3758305684779806, 0.1995133667387895, 0.1433755719502956, 0.10953834255228179, 0.09624883964242126, 0.08236753388538032, 0.07710481774180344, 0.06538756093043141, 0.05998659695603492, 0.06576356656762017]
8
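One possible way to combine the two criteria (an illustrative sketch, not from the original post): score each k by its CV AUC minus the train/CV gap and take the argmax, reusing the lists computed above.
import numpy as np

# heuristic: maximise CV AUC while penalising the train/CV gap
# assumes the k, train_auc and cv_auc lists from the code above
train_arr = np.array(train_auc)
cv_arr = np.array(cv_auc)
score = cv_arr - np.abs(train_arr - cv_arr)   # high CV AUC, small gap -> high score
best_k = k[int(np.argmax(score))]
print(best_k)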

How can I plot a confusion matrix for image dataset from directory?

I've built my own neural network model, trained it, and got 99.58% accuracy. But I am facing a problem with plotting the confusion matrix. There are some examples available for flow_from_directory, but no examples exist for image_dataset_from_directory. Can anyone help me?
See the post How to plot confusion matrix for prefetched dataset in Tensorflow. Use
true_categories = tf.concat([y for x, y in val_ds], axis=0)
to get the true labels for the validation set. Then you can plot the confusion matrix with something like this:
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.metrics import confusion_matrix

cm = confusion_matrix(true_categories, predicted_id)
fig = plt.figure(figsize=(8, 8))
ax1 = fig.add_subplot(1, 1, 1)
sns.set(font_scale=1.4)  # for label size
sns.heatmap(cm, annot=True, annot_kws={"size": 12},
            cbar=False, cmap='Purples')
ax1.set_ylabel('True Values', fontsize=14)
ax1.set_xlabel('Predicted Values', fontsize=14)
plt.show()
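Note that predicted_id is not defined in the snippet above. Assuming a multi-class softmax model evaluated on the same (unshuffled) val_ds, it could be obtained roughly like this:
import tensorflow as tf

# class predictions for the whole validation set;
# val_ds must not be reshuffled between this pass and the one that built
# true_categories, otherwise labels and predictions will not line up
predicted_probs = model.predict(val_ds)
predicted_id = tf.argmax(predicted_probs, axis=1)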
Here is the code I created to be able to assemble the confusion matrix.
Note:
validation_dataset is a tf.data.Dataset variable; I created it with validation_dataset = tf.keras.preprocessing.image_dataset_from_directory().
import tensorflow as tf

y_true = []
y_pred = []
for x, y in validation_dataset:
    # labels come back one-hot encoded, so take the argmax to get the class id
    y = tf.argmax(y, axis=1)
    y_true.append(y)
    y_pred.append(tf.argmax(model.predict(x), axis=1))
y_pred = tf.concat(y_pred, axis=0)
y_true = tf.concat(y_true, axis=0)
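With y_true and y_pred assembled this way, the matrix itself can then be computed exactly as in the first snippet, for example:
from sklearn.metrics import confusion_matrix

# y_true and y_pred are tensors here; converting to numpy makes the intent explicit
cm = confusion_matrix(y_true.numpy(), y_pred.numpy())
print(cm)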

Total of correctly predicted in binary classification of images with CNN in keras

I have succeeded in building a binary image classification CNN model using Keras and made predictions using model.predict_classes(). Here is my code:
import numpy as np
import os, sys
from keras.models import load_model
import PIL
from PIL import Image
import matplotlib.pyplot as plt
%matplotlib inline

model = load_model('./potholes16_2.h5')
model.compile(loss='binary_crossentropy',
              optimizer='adam',
              metrics=['accuracy'])

path = os.path.abspath("./potholes14/test/positive")
extensions = ['JPG']

if __name__ == "__main__":
    for f in os.listdir(path):
        if os.path.isfile(os.path.join(path, f)):
            f_text, f_ext = os.path.splitext(f)
            f_ext = f_ext[1:].upper()
            if f_ext in extensions:
                print(f)
                img = Image.open(os.path.join(path, f))
                new_width = 200
                new_height = 200
                img = img.resize((new_width, new_height), Image.ANTIALIAS)
                # width, height = image.size
                img = np.reshape(img, [1, new_width, new_height, 3])
                classes = model.predict_classes(img)
                print(classes)
Now I want to count the total number of correctly predicted images, for example how many predictions belong to class 0 and how many to class 1?
You need to invoke the model.evaluate function; supposing you want to evaluate the data in x_test with the ground truth labels in y_test, then:
score = model.evaluate(x_test, y_test, verbose=0)
score[0] will give you the loss (binary cross entropy in your case), while score[1] contains the required binary accuracy.
See the docs for more details (scroll down looking for evaluate).
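For example (a sketch, assuming x_test and y_test are already prepared in a format the model accepts):
score = model.evaluate(x_test, y_test, verbose=0)
# score[0] is the binary cross-entropy loss, score[1] the binary accuracy
print('loss: %.4f  accuracy: %.4f' % (score[0], score[1]))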
You must have a sample array of the ground truth for the data you are predicting on, correct? Well, you could load that data as well. Keep the code you have,
classes = model.predict_classes(img)
yields
array([[ 0.94981687],[ 0.57888238],[ 0.58651019],[ 0.30058956],[ 0.21879381]])
and your class data looks like this
class_validation = np.array([[1],[0],[0],[0],[1]])
Then just find where they are equal after rounding classes:
np.where(np.round(classes,0)==class_validation)[0].shape[0]
Note: there are many ways to write the last line; this assumes your numpy array has shape (number_of_samples, 1).
Another way to check:
totalCorrect = class_validation[((np.round(classes,0) - class_validation)==0)]
print('Correct in Class 1 = ',np.count_nonzero(totalCorrect),'Correct in Class 0 = ',abs(len(totalCorrect)-np.count_nonzero(totalCorrect)))

Earth Mover Distance between numpy 1-D histograms

I am trying to calculate the Earth Mover's Distance between two 1-dimensional numpy histograms, like:
(array([ 0.53586639, 0.71448852, 1.22534781, 1.68262046, 1.20391316]),
 array([ 0. , 0.18648936, 0.37297871, 0.55946807, 0.74595742, 0.93244678]),
 <a list of 5 Patch objects>)
and
(array([ 0.05986936, 0.41133267, 1.0449142 , 2.43569242, 2.50891394]),
 array([ 0.17373296, 0.32851441, 0.48329586, 0.63807731, 0.79285876, 0.9476402 ]),
 <a list of 5 Patch objects>)
I want to do it for 1-dimensional arrays, not for images. I want a simple solution.
A simple Python implementation:
import numpy as np

def wasserstein_distance(A, B):
    # 1-D EMD between two histograms defined over the same bins:
    # accumulate the running difference of the bin heights and sum its absolute values
    n = len(A)
    dist = np.zeros(n)
    for x in range(n - 1):
        dist[x + 1] = A[x] - B[x] + dist[x]
    return np.sum(abs(dist))
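For example, with the bin heights and edges from the two histograms in the question (a sketch: the simple function above implicitly assumes both histograms share the same bins and total mass, which is not quite true here). scipy.stats.wasserstein_distance is a library alternative that takes the bin centres as the support and the heights as weights:
import numpy as np
from scipy.stats import wasserstein_distance as scipy_emd

# bin heights and edges taken from the two plt.hist outputs in the question
A = np.array([0.53586639, 0.71448852, 1.22534781, 1.68262046, 1.20391316])
edges_A = np.array([0., 0.18648936, 0.37297871, 0.55946807, 0.74595742, 0.93244678])
B = np.array([0.05986936, 0.41133267, 1.0449142, 2.43569242, 2.50891394])
edges_B = np.array([0.17373296, 0.32851441, 0.48329586, 0.63807731, 0.79285876, 0.9476402])

# the simple function above: only meaningful if both histograms use the same bins
print(wasserstein_distance(A, B))

# scipy alternative: bin centres as the support, bin heights as weights
centres_A = (edges_A[:-1] + edges_A[1:]) / 2
centres_B = (edges_B[:-1] + edges_B[1:]) / 2
print(scipy_emd(centres_A, centres_B, u_weights=A, v_weights=B))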

How do you use PyTorch PackedSequence in code?

Can someone give a full working code (not a snippet, but something that runs on a variable-length recurrent neural network) on how would you use the PackedSequence method in PyTorch?
There do not seem to be any examples of this in the documentation, github, or the internet.
https://github.com/pytorch/pytorch/releases/tag/v0.1.10
Not the most beautiful piece of code, but this is what I gathered for my personal use after going through the PyTorch forums and docs. There are certainly better ways to handle the sorting/restoring part, but I chose to do it in the network itself.
EDIT: See the answer from @tusonggao below, which lets the torch utils take care of the sorting/restoring part.
import numpy as np
import torch
from torch import nn
from torch.autograd import Variable

class Encoder(nn.Module):
    def __init__(self, vocab_size, embedding_size, embedding_vectors=None, tune_embeddings=True, use_gru=True,
                 hidden_size=128, num_layers=1, bidirectional=True, dropout=0.6):
        super(Encoder, self).__init__()
        self.embed = nn.Embedding(vocab_size, embedding_size, padding_idx=0)
        self.embed.weight.requires_grad = tune_embeddings
        if embedding_vectors is not None:
            assert embedding_vectors.shape[0] == vocab_size and embedding_vectors.shape[1] == embedding_size
            self.embed.weight = nn.Parameter(torch.FloatTensor(embedding_vectors))
        cell = nn.GRU if use_gru else nn.LSTM
        self.rnn = cell(input_size=embedding_size, hidden_size=hidden_size, num_layers=num_layers,
                        batch_first=True, bidirectional=True, dropout=dropout)

    def forward(self, x, x_lengths):
        # sort the batch by decreasing length, as required by pack_padded_sequence
        sorted_seq_lens, original_ordering = torch.sort(torch.LongTensor(x_lengths), dim=0, descending=True)
        ex = self.embed(x[original_ordering])
        pack = torch.nn.utils.rnn.pack_padded_sequence(ex, sorted_seq_lens.tolist(), batch_first=True)
        out, _ = self.rnn(pack)
        unpacked, unpacked_len = torch.nn.utils.rnn.pad_packed_sequence(out, batch_first=True)
        # pick the last valid time step of every sequence
        indices = Variable(torch.LongTensor(np.array(unpacked_len) - 1).view(-1, 1)
                           .expand(unpacked.size(0), unpacked.size(2))
                           .unsqueeze(1))
        last_encoded_states = unpacked.gather(dim=1, index=indices).squeeze(dim=1)
        # undo the sorting so the output lines up with the original batch order
        scatter_indices = Variable(original_ordering.view(-1, 1).expand_as(last_encoded_states))
        encoded_reordered = last_encoded_states.clone().scatter_(dim=0, index=scatter_indices, src=last_encoded_states)
        return encoded_reordered
Actually, there is no need to handle the sorting/restoring problem yourself; let the torch.nn.utils.rnn.pack_padded_sequence function do all the work by setting the parameter enforce_sorted=False.
The returned PackedSequence object will then carry the sorting-related info in its sorted_indices and unsorted_indices attributes, which are used by the following nn.GRU or nn.LSTM to restore the original index order.
Runnable code example:
import torch
from torch import nn
from torch.nn.utils.rnn import pad_sequence, pack_padded_sequence, pad_packed_sequence
data = [torch.tensor([1]),
torch.tensor([2, 3, 4, 5]),
torch.tensor([6, 7]),
torch.tensor([8, 9, 10])]
lengths = [d.size(0) for d in data]
padded_data = pad_sequence(data, batch_first=True, padding_value=0)
embedding = nn.Embedding(20, 5, padding_idx=0)
embeded_data = embedding(padded_data)
packed_data = pack_padded_sequence(embeded_data, lengths, batch_first=True, enforce_sorted=False)
lstm = nn.LSTM(5, 5, batch_first=True)
o, (h, c) = lstm(packed_data)
# (h, c) is the needed final hidden and cell state, with index already restored correctly by LSTM.
# but o is a PackedSequence object, to restore to the original index:
unpacked_o, unpacked_lengths = pad_packed_sequence(o, batch_first=True)
# now unpacked_o, (h, c) is just like the normal output you expected from a lstm layer.
print(unpacked_o, unpacked_lengths)
The output for unpacked_o and unpacked_lengths looks something like this:
# output (unpacked_o, unpacked_lengths):
tensor([[[ 1.5230, -1.7530, 0.5462, 0.6078, 0.9440],
[ 0.0000, 0.0000, 0.0000, 0.0000, 0.0000],
[ 0.0000, 0.0000, 0.0000, 0.0000, 0.0000],
[ 0.0000, 0.0000, 0.0000, 0.0000, 0.0000]],
[[ 1.8888, -0.5465, 0.5404, 0.4132, -0.3266],
[ 0.1657, 0.5875, 0.4556, -0.8858, 1.1443],
[ 0.8957, 0.8676, -0.6614, 0.6751, -1.2377],
[-1.8999, 2.8260, 0.1650, -0.6244, 1.0599]],
[[ 0.0637, 0.3936, -0.4396, -0.2788, 0.1282],
[ 0.5443, 0.7401, 1.0287, -0.1538, -0.2202],
[ 0.0000, 0.0000, 0.0000, 0.0000, 0.0000],
[ 0.0000, 0.0000, 0.0000, 0.0000, 0.0000]],
[[-0.5008, 2.1262, -0.3623, 0.5864, 0.9871],
[-0.6996, -0.3984, 0.4890, -0.8122, -1.0739],
[ 0.3392, 1.1305, -0.6669, 0.5054, -1.7222],
[ 0.0000, 0.0000, 0.0000, 0.0000, 0.0000]]],
grad_fn=<IndexSelectBackward>) tensor([1, 4, 2, 3])
Comparing it with the original data and lengths, we can see that the sorting/restoring problem has been taken care of neatly.
