I try to calculate the Earth Mover Distance between two 1-dimensional numpy histograms, like:
(array([ 0.53586639, 0.71448852, 1.22534781, 1.68262046, 1.20391316]), array([ 0. , 0.18648936, 0.37297871, 0.55946807, 0.74595742,
0.93244678]), <a list of 5 Patch objects>)
and
(array([ 0.05986936, 0.41133267, 1.0449142 , 2.43569242, 2.50891394]), array([ 0.17373296, 0.32851441, 0.48329586, 0.63807731, 0.79285876,
0.9476402 ]), <a list of 5 Patch objects>)
I want to do it for 1-dimensional arrays, not for images. I want a simple solution.
A simple python code:
import numpy as np
def wasserstein_distance(A,B):
n = len(A)
dist = np.zeros(n)
for x in range(n-1):
dist[x+1] = A[x]-B[x]+dist[x]
return np.sum(abs(dist))
Related
I have a task to find the optimal hyperparameter(k) of KNN. I plotted the k vs AUC curve using roc_auc_score. I am supposed to find k such that cv_auc is maximum and the gap between train_auc and cv_auc is minimum. How can I achieve that?
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import roc_auc_score
import matplotlib.pyplot as plt
train_auc=[]
cv_auc=[]
k=[i for i in range(1,50,5)]
for i in k:
knn=KNeighborsClassifier(n_neighbors=i)
knn.fit(x_train_bow,y_train)
y_train_pred=knn.predict_proba(x_train_bow)[:,1]
y_cv_pred=knn.predict_proba(x_cv_bow)[:,1]
train_auc.append(roc_auc_score(y_train,y_train_pred))
cv_auc.append(roc_auc_score(y_cv,y_cv_pred))
#plot the roc curve
plt.plot(k,train_auc,label="Train AUC")
plt.plot(k,cv_auc,label="CV AUC")
plt.legend()
plt.xlabel('K:hyperparameter')
plt.ylabel('AUC')
plt.title("Error plot")
plt.show()
picture of the roc curve
print(cv_auc)
print(cv_auc.index(max(cv_auc)))
array1 = np.array(train_auc)
array2 = np.array(cv_auc)
subtracted_array = np.subtract(array1, array2)
subtracted = list(subtracted_array)
print(subtracted)
subtracted.index(min(subtracted))
Output:
[0.6241694315220194, 0.6985803616697652, 0.7222662029418654, 0.7429448007376901, 0.7433472984472336, 0.7492335494812746, 0.7499829512940709, 0.7594353468596283, 0.757365782209453, 0.7518153165574067]
7
[0.3758305684779806, 0.1995133667387895, 0.1433755719502956, 0.10953834255228179, 0.09624883964242126, 0.08236753388538032, 0.07710481774180344, 0.06538756093043141, 0.05998659695603492, 0.06576356656762017]
8
I've built up my own neural model, trained it, and got 99.58% accuracy. But I am facing a problem with plotting the confusion matrix. There are some examples available for flow_from_directory but no examples exist for image_dataset_from_directory. Can anyone help me?
See the post How to plot confusion matrix for prefetched dataset in Tensorflow using
true_categories = tf.concat([y for x, y in val_ds], axis=0)
to get the true labels for the validation set. Then you can plot the confusion matrix with something like this
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn import confusion_matrix
cm = confusion_matrix(true_categories, predicted_id)
fig = plt.figure(figsize = (8,8))
ax1 = fig.add_subplot(1,1,1)
sns.set(font_scale=1.4) #for label size
sns.heatmap(cm, annot=True, annot_kws={"size": 12},
cbar = False, cmap='Purples');
ax1.set_ylabel('True Values',fontsize=14)
ax1.set_xlabel('Predicted Values',fontsize=14)
plt.show()
Here is the code I created to be able to assemble the matrix of confusion
Note:
test_dataset is a tf.data.Dataset variable.
I used validation_dataset = tf.keras.preprocessing.image_dataset_from_directory()
import tensorflow as tf
y_true = []
y_pred = []
for x,y in validation_dataset:
y= tf.argmax(y,axis=1)
y_true.append(y)
y_pred.append(tf.argmax(model.predict(x),axis = 1))
y_pred = tf.concat(y_pred, axis=0)
y_true = tf.concat(y_true, axis=0)
I'm following this paper:
http://robotics.stanford.edu/users/shoham/www%20papers/Empirical%20Hardness.pdf
and I try to predict the running time for the Traveling salesman problem on a blackbox solver.
I get some weird results during regression that I'd love to consult about:
I find it hard to believe that in XGBOOST or at any regessor the number of cities is irrelevant as a feature? as seen in XGBOOST feature importance image.
In the RIDGE and LINEAR REGRESSION results graphs you can see that for some problem instances the graphs you can see that the predicted value is negative (when we talk about run time), I saw in other question here that this is because "Linear regression does not respect the bounds of 0" and that I should put a natural log on it, but I don't know exactly where. So I'd love help with that also.
I'd also love to be reccomended on other regression models that may fit my problem.
Thanks a lot!
Here is my code pieces (google colab), followed by the results I got:
1
# Import the standard libraries of pandas.
import pandas as pd
import matplotlib.pyplot as plt
%matplotlib inline
import seaborn as sns
import warnings
warnings. filterwarnings("ignore")
sns.set_style('whitegrid')
from google.colab import files
2
# Install the solver and import its libraries, in addition import all the
# libraries with which we will prepare the features.
!pip3 install ortools
!pip install python-igraph
from ortools.constraint_solver import routing_enums_pb2
from ortools.constraint_solver import pywrapcp
import numpy as np
import time
import random
from random import randrange
from scipy import stats
from scipy.stats import skew
from scipy.sparse import csr_matrix
from scipy.sparse.csgraph import minimum_spanning_tree
from scipy.sparse.csgraph import depth_first_tree
from igraph import Graph, mean
import igraph
import itertools
import math
3
# Simple travelling salesman problem between cities - solver OR Tools By Google.
def create_data_model():
# Stores the data for the problem.
data = {}
# dim will be the number of Vertices\Cities in the Traveling Salesman Problem.
# Randomly select the matrix dimension in unifom distribution.
dim = np.random.randint(10, 350)
# Generate a square symmetric matrix It will be the distance matrix that the solver will solve.
square_matrice = [[0 for row in range(dim)] for col in range(dim)]
for i in range(dim):
for j in range(dim):
if i == j:
square_matrice[i][j] = 0
else:
# Randomly fill the matrix in unifom distribution.
square_matrice[i][j] = square_matrice[j][i] = np.random.randint(1, 1000)
data['distance_matrix'] = square_matrice # yapf: disable
data['num_vehicles'] = 1
data['depot'] = 0
return data
def main():
# Start measuring solution time.
start_time = time.time()
# Instantiate the data problem.
data = create_data_model()
# Create the routing index manager.
manager = pywrapcp.RoutingIndexManager(len(data['distance_matrix']),
data['num_vehicles'], data['depot'])
# Create Routing Model.
routing = pywrapcp.RoutingModel(manager)
def distance_callback(from_index, to_index):
# Returns the distance between the two nodes.
# Convert from routing variable Index to distance matrix NodeIndex.
from_node = manager.IndexToNode(from_index)
to_node = manager.IndexToNode(to_index)
return data['distance_matrix'][from_node][to_node]
transit_callback_index = routing.RegisterTransitCallback(distance_callback)
# Define cost of each arc.
routing.SetArcCostEvaluatorOfAllVehicles(transit_callback_index)
# Setting first solution heuristic.
search_parameters = pywrapcp.DefaultRoutingSearchParameters()
search_parameters.first_solution_strategy = (
routing_enums_pb2.FirstSolutionStrategy.PATH_CHEAPEST_ARC)
# Solve the problem.
solution = routing.SolveWithParameters(search_parameters)
solution_time = time.time() - start_time
'''In this part of the code we will create the following features on the distance matrix of the problem.
* Mean - Average weights of the distance matrix.
* Std - Standard Deviation of the distance matrix.
* Skewness - What is the tendency of the weights in the distance matrix.
* Noc - Number of cities we have in the distance matrix [matrix dimension].
* Td - The total distance of the solution rout.
* Dmft - Distance matrix features time, That is how long it took us to calculate all these features.
'''
dmt_start_time = time.time()
mat = np.array(data['distance_matrix'])
mean = mat.mean()
std = mat.std()
merged = list(itertools.chain(*mat))
skewness = skew(merged)
noc = len(data['distance_matrix'])
td = solution.ObjectiveValue() if solution else -1
dmft = time.time() - dmt_start_time
'''In this part of the code we will create from the distance matrix of the problem an MST and than
on the MST we take the following features.
* MST_Mean - Average weights of the MST.
* MST_Std - Standard Deviation of the MST.
* MST_Skewness - What is the tendency of the weights in the MST.
* MST_ft - MST features time, That is how long it took us to calculate the MST & all these features.
'''
spt_start_time = time.time()
X = csr_matrix(mat)
Tcsr = minimum_spanning_tree(X)
mat_st = np.array(Tcsr.toarray().astype(int))
mst_mean = mat_st.mean()
mst_std = mat_st.std()
merged_st = list(itertools.chain(*mat_st))
mst_skewness = skew(merged_st)
mst_ft = time.time() - spt_start_time
'''In this part of the code we calculate features from the MST that are considered to be
related to the rank and depth of the tracks in it.
* D_Mean - Average degree of the MST.
* D_Std - Standard Deviation of the MST degrees.
* D_Skewness - What is the tendency of the degrees in the MST.
* DFT_Mean - The average weight of the deepest track in MST.
* DFT_Std - Standard Deviation of the deepest track in MST.
* DFT_Max - The heaviest arch on the longest route in MST.
* DDFT_ft - Degree & DFT features time, That is how long it took us to calculate all these features.
'''
dstt_start_time = time.time()
g = Graph.Weighted_Adjacency(mat_st.tolist())
d_mean = igraph.statistics.mean(g.degree())
d_std = igraph.statistics.sd(g.degree())
d_skewness = skew(g.degree())
d_t = depth_first_tree(X, 0, directed=False)
mat_dt = np.array(d_t.toarray().astype(int))
dft_mean = mat_dt.mean()
dft_std = mat_dt.std()
dft_max = np.amax(mat_dt)
ddft_ft = time.time() - dstt_start_time
# In this map we will hold all the features and their results.
features_map = {'Mean': mean, 'Std': std, 'Skewness': skewness, 'Noc': noc, 'Td': td, 'Dmft': dmft,
'MST_Mean': mst_mean, 'MST_Std': mst_std, 'MST_Skewness': mst_skewness, 'MST_ft': mst_ft,
'D_Mean': d_mean, 'D_Std': d_std, 'D_Skewness': d_skewness, 'DFT_Mean': dft_mean,'DFT_Std': dft_std,
'DFT_Max': dft_max, 'DDFT_ft': ddft_ft, 'Solution_time': solution_time}
return features_map
# Main
# Create dataFrame.
data_TSP = pd.DataFrame()
# Fill the dataFrame.
for i in range(10000):
#print(i)
features_map = main()
data_TSP = data_TSP.append(features_map, ignore_index=True)
# Show data frame.
data_TSP.head()
data_TSP.to_csv('data_10000.csv')
files.download('data_10000.csv')
Regression models:
# Import the standard libraries of pandas.
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline
import seaborn as sns
import warnings
warnings. filterwarnings("ignore")
sns.set_style('whitegrid')
2
# Neaded for opening data file in drive.
from google.colab import files
uploaded = files.upload()
import io
df = pd.read_csv(io.BytesIO(uploaded['data_10000_clean.csv']))
try:
df.drop(['Unnamed: 0'], axis=1, inplace=True)
except:
pass
df.head()
from sklearn.model_selection import train_test_split
# Split the data to training set and test set (70%, 30%)
features = list(df.drop('Solution_time', axis = 1, inplace = False))
y = df['Solution_time']
X = df[features]
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 0.3)
Import models which we predicted with tham the solution,
And scoring methods to evaluate these models.
from sklearn.ensemble import RandomForestRegressor
from sklearn.neighbors import KNeighborsRegressor
from sklearn.tree import DecisionTreeRegressor
from sklearn.linear_model import Ridge
from sklearn.linear_model import RidgeCV
from sklearn.model_selection import GridSearchCV
from sklearn import linear_model
!pip3 install xgboost
from xgboost import XGBRegressor
from sklearn.metrics import r2_score
from sklearn.model_selection import cross_val_score
!pip install scikit-plot
import scikitplot as skplt
import matplotlib as mpl
########################################## Several functions for different regression models. ##########################################
class Score:
r2 = 0.0 # This score determines how close the predictions really are to the real data.
cross_vali_score = 0.0 # How true is our algorithm if it going to well predict new data.
class Regressor:
def __init__(self, name):
self.name = name
self.score = Score()
self.y_pred = None
self.reg = None
# Map between the name of a model and the model itself.
models_map = {'Random Forest': RandomForestRegressor(), 'Xgboost': XGBRegressor(), 'Ridge': Ridge(),
'Kneighbors': KNeighborsRegressor(), 'Linear Regressor': linear_model.LinearRegression()}
# This function return a map that maps between each model and its Regressor class.
def get_models():
result_map = {}
for key, val in models_map.items():
result = Regressor(key)
reg = val
result.score.cross_vali_score = np.mean(cross_val_score(reg, X_train, y_train, cv=5))
result.reg = reg.fit(X_train, y_train)
result.y_pred = reg.predict(X_test)
result.score.r2 = r2_score(y_test, result.y_pred)
result_map[key] = result
return result_map
# This function print a graph for models of the features that influenced their decision making.
def print_influence_graph(map):
for key, val in map.items():
if key == 'Random Forest' or key == 'Xgboost':
# The parameters that most influenced the decision.
feature_imp = pd.Series(val.reg.feature_importances_,index=features).sort_values(ascending=False)
sns.barplot(x=feature_imp, y=feature_imp.index)
# Add labels to your graph
plt.xlabel('Feature Importance Score')
plt.ylabel('Features')
plt.title(val.name.upper() +" - Visualizing Important Features")
plt.show()
# This function print a graph for models that show the real results against the model predictions.
def show_predicted_vs_actual(map):
for key, val in map.items():
fig, ax = plt.subplots()
ax.scatter(y_test, val.y_pred, edgecolors=(0, 0, 1))
ax.plot([y_test.min(), y_test.max()], [y_test.min(), y_test.max()], 'r--', lw=3)
ax.set_xlabel('Predicted')
ax.set_ylabel('Actual')
ax.title.set_text(val.name.upper() +" - Predicted time vs actual time")
plt.show()
# This function print numerical scores for the models.
def print_scores(map):
for key, val in map.items():
print(val.name.upper() + ' SCORE: ')
print('R2score' + ' = ', val.score.r2)
print('Cross_val_score' + ' = ', val.score.cross_vali_score)
print('------------------------------------------\n')
# This function print a graph showing the differences between the scores of the models.
def show_models_differences_graph(map):
comp_df = pd.DataFrame(columns = ('Method', 'R2 Score', 'Cross val score'))
for i in map:
row = {'Method': i, 'R2 Score': map[i].score.r2, 'Cross val score': map[i].score.cross_vali_score}
comp_df = comp_df.append(row, ignore_index=True)
ax = comp_df.plot.bar(x='Method', rot=30, figsize=(12,6))
ax.set_title('Comparison graph')
#########################################################################################################################################
models = get_models()
print_influence_graph(models)
show_predicted_vs_actual(models)
print_scores(models)
show_models_differences_graph(models)
And here are the results:
So this question is about GANs.
I am trying to do a trivial example for my own proof of concept; namely, generate images of hand written digits (MNIST). While most will approach this via deep convolutional gans (dgGANs), I am just trying to achieve this via the 1D array (i.e. instead of 28x28 gray-scale pixel values, a 28*28 1d array).
This git repo features a "vanilla" gans which treats the MNIST dataset as a 1d array of 784 values. Their output values look pretty acceptable so I wanted to do something similar.
Import statements
from __future__ import print_function
import matplotlib as mpl
from matplotlib import pyplot as plt
import mxnet as mx
from mxnet import nd, gluon, autograd
from mxnet.gluon import nn, utils
import numpy as np
import os
from math import floor
from random import random
import time
from datetime import datetime
import logging
ctx = mx.gpu()
np.random.seed(3)
Hyper parameters
batch_size = 100
epochs = 100
generator_learning_rate = 0.001
discriminator_learning_rate = 0.001
beta1 = 0.5
latent_z_size = 100
Load data
mnist = mx.test_utils.get_mnist()
# convert imgs to arrays
flattened_training_data = mnist["test_data"].reshape(10000, 28*28)
define models
G = nn.Sequential()
with G.name_scope():
G.add(nn.Dense(300, activation="relu"))
G.add(nn.Dense(28 * 28, activation="tanh"))
D = nn.Sequential()
with D.name_scope():
D.add(nn.Dense(128, activation="relu"))
D.add(nn.Dense(64, activation="relu"))
D.add(nn.Dense(32, activation="relu"))
D.add(nn.Dense(2, activation="tanh"))
loss = gluon.loss.SoftmaxCrossEntropyLoss()
init stuff
G.initialize(mx.init.Normal(0.02), ctx=ctx)
D.initialize(mx.init.Normal(0.02), ctx=ctx)
trainer_G = gluon.Trainer(G.collect_params(), 'adam', {"learning_rate": generator_learning_rate, "beta1": beta1})
trainer_D = gluon.Trainer(D.collect_params(), 'adam', {"learning_rate": discriminator_learning_rate, "beta1": beta1})
metric = mx.metric.Accuracy()
dynamic plot (for juptyer notebook)
import matplotlib.pyplot as plt
import time
def dynamic_line_plt(ax, y_data, colors=['r', 'b', 'g'], labels=['Line1', 'Line2', 'Line3']):
x_data = []
y_max = 0
y_min = 0
x_min = 0
x_max = 0
for y in y_data:
x_data.append(list(range(len(y))))
if max(y) > y_max:
y_max = max(y)
if min(y) < y_min:
y_min = min(y)
if len(y) > x_max:
x_max = len(y)
ax.set_ylim(y_min, y_max)
ax.set_xlim(x_min, x_max)
if ax.lines:
for i, line in enumerate(ax.lines):
line.set_xdata(x_data[i])
line.set_ydata(y_data[i])
else:
for i in range(len(y_data)):
l = ax.plot(x_data[i], y_data[i], colors[i], label=labels[i])
ax.legend()
fig.canvas.draw()
train
stamp = datetime.now().strftime('%Y_%m_%d-%H_%M')
logging.basicConfig(level=logging.DEBUG)
# arrays to store data for plotting
loss_D = nd.array([0], ctx=ctx)
loss_G = nd.array([0], ctx=ctx)
acc_d = nd.array([0], ctx=ctx)
labels = ['Discriminator Loss', 'Generator Loss', 'Discriminator Acc.']
%matplotlib notebook
fig, ax = plt.subplots(1, 1)
ax.set_xlabel('Time')
ax.set_ylabel('Loss')
dynamic_line_plt(ax, [loss_D.asnumpy(), loss_G.asnumpy(), acc_d.asnumpy()], labels=labels)
for epoch in range(epochs):
tic = time.time()
data_iter.reset()
for i, batch in enumerate(data_iter):
####################################
# Update Disriminator: maximize log(D(x)) + log(1-D(G(z)))
####################################
# extract batch of real data
data = batch.data[0].as_in_context(ctx)
# add noise
# Produce our noisey input to the generator
latent_z = mx.nd.random_normal(0,1,shape=(batch_size, latent_z_size), ctx=ctx)
# soft and noisy labels
# real_label = mx.nd.ones((batch_size, ), ctx=ctx) * nd.random_uniform(.7, 1.2, shape=(1)).asscalar()
# fake_label = mx.nd.ones((batch_size, ), ctx=ctx) * nd.random_uniform(0, .3, shape=(1)).asscalar()
# real_label = nd.random_uniform(.7, 1.2, shape=(batch_size), ctx=ctx)
# fake_label = nd.random_uniform(0, .3, shape=(batch_size), ctx=ctx)
real_label = mx.nd.ones((batch_size, ), ctx=ctx)
fake_label = mx.nd.zeros((batch_size, ), ctx=ctx)
with autograd.record():
# train with real data
real_output = D(data)
errD_real = loss(real_output, real_label)
# train with fake data
fake = G(latent_z)
fake_output = D(fake.detach())
errD_fake = loss(fake_output, fake_label)
errD = errD_real + errD_fake
errD.backward()
trainer_D.step(batch_size)
metric.update([real_label, ], [real_output,])
metric.update([fake_label, ], [fake_output,])
####################################
# Update Generator: maximize log(D(G(z)))
####################################
with autograd.record():
output = D(fake)
errG = loss(output, real_label)
errG.backward()
trainer_G.step(batch_size)
####
# Plot Loss
####
# append new data to arrays
loss_D = nd.concat(loss_D, nd.mean(errD), dim=0)
loss_G = nd.concat(loss_G, nd.mean(errG), dim=0)
name, acc = metric.get()
acc_d = nd.concat(acc_d, nd.array([acc], ctx=ctx), dim=0)
# plot array
dynamic_line_plt(ax, [loss_D.asnumpy(), loss_G.asnumpy(), acc_d.asnumpy()], labels=labels)
name, acc = metric.get()
metric.reset()
logging.info('Binary training acc at epoch %d: %s=%f' % (epoch, name, acc))
logging.info('time: %f' % (time.time() - tic))
output
img = G(mx.nd.random_normal(0,1,shape=(100, latent_z_size), ctx=ctx))[0].reshape((28, 28))
plt.imshow(img.asnumpy(),cmap='gray')
plt.show()
Now this doesn't get nearly as good as the repo's example from above. Although fairly similar.
Thus I was wondering if you could take a look and figure out why:
the colors are inverted
why the results are sub par
I have been fiddling around with this trying a lot of various things to improve the results (I will list this in a second), but for the MNIST dataset this really shouldn't be needed.
Things I have tried (and I have also tried a host of combinations):
increasing the generator network
increasing the discriminator network
using soft labeling
using noisy labeling
batch norm after every layer in the generator
batch norm of the data
normalizing all values between -1 and 1
leaky relus in the generator
drop out layers in the generator
increased learning rate of discriminator compared to generator
decreased learning rate of i compared to generator
Please let me know if you have any ideas.
1) If you look into original dataset:
training_set = mnist["train_data"].reshape(60000, 28, 28)
plt.imshow(training_set[10,:,:], cmap='gray')
you will notice that the digits are white on a black background. So, technically speaking, your results are not inversed - they match the pattern of original images you used as a real data.
If you want to invert colors for visualization purposes, you can easily do that by changing the pallete to reversed one by adding '_r' (it works for all color palletes):
plt.imshow(img.asnumpy(), cmap='gray_r')
You also can play with ranges of colors by changing vmin and vmax parameters. They control how big the difference between colors should be. By default it is calculated automatically based on provided set.
2) "Why the results are sub par" - I think this is exactly the reason why the community started to use dcGANs. To me the results in the git repo you provided are quite noisy. Surely, they are different from what you receive, and you can achieve the same quality just by changing your activation functions from tanh to sigmoid as in the example on github:
G = nn.Sequential()
with G.name_scope():
G.add(nn.Dense(300, activation="relu"))
G.add(nn.Dense(28 * 28, activation="sigmoid"))
D = nn.Sequential()
with D.name_scope():
D.add(nn.Dense(128, activation="relu"))
D.add(nn.Dense(64, activation="relu"))
D.add(nn.Dense(32, activation="relu"))
D.add(nn.Dense(2, activation="sigmoid"))
Sigmoid never goes below zero and it works better in this scenario. Here is a sample picture I get if I train updated model for 30 epochs (the rest of the hyperparameters are same).
If you decide to explore dcGAN to get even better results, take a look here - https://mxnet.incubator.apache.org/tutorials/unsupervised_learning/gan.html It is a well explained tutorial on how to build dcGAN with Mxnet and Gluon. By using dcGAN you will get way better results than that.
I am trying to use DBSCAN from scikitlearn to segment an image based on color. The results I'm getting are . As you can see there are 3 clusters. My goal is to separate the buoys in the picture into different clusters. But obviously they are showing up as the same cluster. I've tried a wide range of eps values and min_samples but those two things always cluster together. My code is:
img= cv2.imread("buoy1.jpg)
labimg = cv2.cvtColor(img, cv2.COLOR_BGR2LAB)
n = 0
while(n<4):
labimg = cv2.pyrDown(labimg)
n = n+1
feature_image=np.reshape(labimg, [-1, 3])
rows, cols, chs = labimg.shape
db = DBSCAN(eps=5, min_samples=50, metric = 'euclidean',algorithm ='auto')
db.fit(feature_image)
labels = db.labels_
plt.figure(2)
plt.subplot(2, 1, 1)
plt.imshow(img)
plt.axis('off')
plt.subplot(2, 1, 2)
plt.imshow(np.reshape(labels, [rows, cols]))
plt.axis('off')
plt.show()
I assume this is taking the euclidean distance and since its in lab space euclidean distance would be different between different colors. If anyone can give me guidance on this I'd really appreciate it.
Update:
The below answer works. Since DBSCAN requires an array with no more then 2 dimensions I concatenated the columns to the original image and reshaped to produce a n x 5 matrix where n is the x dimension times the y dimension. This seems to work for me.
indices = np.dstack(np.indices(img.shape[:2]))
xycolors = np.concatenate((img, indices), axis=-1)
np.reshape(xycolors, [-1,5])
You need to use both color and position.
Right now, you are using colors only.
Could you please add the enitre code in the answer? Im not able to understand where do I add the those 3 lines which have worked for you – user8306074 Sep 4 at 8:58
Let me answer for you, and here is the full version of the code:
import numpy as np
import cv2
import matplotlib.pyplot as plt
from sklearn.cluster import DBSCAN
img= cv2.imread('your image')
labimg = cv2.cvtColor(img, cv2.COLOR_BGR2LAB)
n = 0
while(n<4):
labimg = cv2.pyrDown(labimg)
n = n+1
feature_image=np.reshape(labimg, [-1, 3])
rows, cols, chs = labimg.shape
db = DBSCAN(eps=5, min_samples=50, metric = 'euclidean',algorithm ='auto')
db.fit(feature_image)
labels = db.labels_
indices = np.dstack(np.indices(labimg.shape[:2]))
xycolors = np.concatenate((labimg, indices), axis=-1)
feature_image2 = np.reshape(xycolors, [-1,5])
db.fit(feature_image2)
labels2 = db.labels_
plt.figure(2)
plt.subplot(2, 1, 1)
plt.imshow(img)
plt.axis('off')
# plt.subplot(2, 1, 2)
# plt.imshow(np.reshape(labels, [rows, cols]))
# plt.axis('off')
plt.subplot(2, 1, 2)
plt.imshow(np.reshape(labels2, [rows, cols]))
plt.axis('off')
plt.show()