I want to find the optimal number of clusters k using the elbow method. I'm not using the scikit-learn library; I have k-means coded from scratch, and now I'm having a difficult time figuring out how to code the elbow method in Python. I'm a total beginner.
This is my k-means code:
import numpy as np

def cluster_init(array, k):
    # Randomly assign every point an initial cluster, making sure each of the k clusters appears at least once
    initial_assgnm = np.append(np.arange(k), np.random.randint(0, k, size=(len(array))))[:len(array)]
    np.random.shuffle(initial_assgnm)
    zero_arr = np.zeros((len(initial_assgnm), 1))
    for indx, cluster_assgnm in enumerate(initial_assgnm):
        zero_arr[indx] = cluster_assgnm
    # Append the cluster assignment as an extra column
    upd_array = np.append(array, zero_arr, axis=1)
    return upd_array
def kmeans(array, k):
    cluster_array = cluster_init(array, k)
    while True:
        unique_clusters = np.unique(cluster_array[:, -1])
        # Compute the centroid of every cluster
        centroid_dictonary = {}
        for cluster in unique_clusters:
            centroid_dictonary[cluster] = np.mean(cluster_array[np.where(cluster_array[:, -1] == cluster)][:, :-1], axis=0)
        start_array = np.copy(cluster_array)
        # Reassign every point to its nearest centroid
        for row in range(len(cluster_array)):
            cluster_array[row, -1] = unique_clusters[np.argmin(
                [np.linalg.norm(cluster_array[row, :-1] - centroid_dictonary.get(cluster)) for cluster in unique_clusters])]
        # Stop when the assignments no longer change
        if np.array_equal(cluster_array, start_array):
            break
    return centroid_dictonary
This is what I have tried for the elbow method:
cost = []
K = range(1, 239)
for k in K:
    KM = kmeans(x, k)
    print(k)
    KM.fit(x)
    cost.append(KM.inertia_)
But I get the following error
KM.fit(x)
AttributeError: 'dict' object has no attribute 'fit'
If you want to compute the elbow values from scratch, you need to compute the inertia for the current clustering assignment. To do this, you can sum the per-point inertias. The inertia of a data point is the (squared) distance from its position to the closest center. If you have a function that computes this for you (in scikit-learn this corresponds to pairwise_distances_argmin_min), you could do
labels, mindist = pairwise_distances_argmin_min(
    X=X, Y=centers, metric='euclidean', metric_kwargs={'squared': True})
inertia = mindist.sum()
If you actually wanted to write this function yourself, you would loop over every row x in X, find the minimum over all centers y in Y of dist(x, y), and that minimum would be the inertia contribution of x. This naive method of computing the per-point inertias is O(nk), so you might consider using the library function instead.
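For example, here is a minimal sketch of that naive approach, assuming X is your data array and that kmeans is your from-scratch function above, which returns the centroid dictionary (the helper name compute_inertia and the range of k values are just illustrative):

import numpy as np

def compute_inertia(X, centroids):
    # Sum of squared distances from each point to its nearest centroid
    inertia = 0.0
    for x in X:
        inertia += min(np.sum((x - center) ** 2) for center in centroids.values())
    return inertia

cost = []
K = range(1, 10)                      # try a modest range of k values
for k in K:
    centers = kmeans(X, k)            # your from-scratch k-means
    cost.append(compute_inertia(X, centers))
# Plot cost against K and look for the "elbow" where the curve stops dropping sharply.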
I am trying to implement Group Lasso on the weight matrices of a neural network in PyTorch.
I have written the code below, but I am unsure whether it is correct; confirmation or correction of my code would be very helpful.
def gl_norm(model, gl_lambda, num_blk):
    gl_reg = torch.tensor(0., dtype=torch.float32).cuda()
    for key in model:
        for param in model[key].parameters():
            dim = param.size()
            if len(dim) > 1 and not model[key].skip_regularization:
                # Split the weight matrix into num_blk x num_blk blocks
                div1 = list(torch.chunk(param, int(num_blk), 1))
                all_blks = []
                for div2 in div1:
                    temp = list(torch.chunk(div2, int(num_blk), 0))
                    for blk in temp:
                        all_blks.append(blk)
                # Sum the L2 norms of the blocks (the group-lasso penalty)
                for l2_param in all_blks:
                    gl_reg += torch.norm(l2_param, 2)
    return gl_reg * float(gl_lambda)
I expect the torch.chunk function to break up the weight matrix into small blocks, which then go through an L2 norm per block and an L1 norm (a plain sum) across the blocks.
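As a quick sanity check, here is a small self-contained sketch (made-up sizes, CPU only) of the same block decomposition; it lets you verify that torch.chunk produces the blocks you expect and that the penalty is the sum of per-block L2 norms:

import torch

# Toy 6x6 weight matrix split into 2x2 = 4 blocks of size 3x3 (sizes are illustrative)
param = torch.randn(6, 6)
num_blk = 2

blocks = []
for col_chunk in torch.chunk(param, num_blk, dim=1):       # split along columns first
    for blk in torch.chunk(col_chunk, num_blk, dim=0):     # then along rows
        blocks.append(blk)

print([b.shape for b in blocks])                  # expect four blocks of torch.Size([3, 3])
penalty = sum(torch.norm(b, 2) for b in blocks)   # L2 within each block, summed (L1) across blocks
print(penalty)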
I am learning to implement the Factorization Machine in PyTorch, and it requires some feature-crossing operations.
For example, I have three features [A, B, C]; after embedding they are [vA, vB, vC], so the feature crossings are [vA·vB], [vA·vC], [vB·vC].
I know this operation can be simplified and implemented with matrix operations, but that only gives a final aggregated result, say, a single value.
The question is: how can I get all the cross_vec values in the following code without a for loop?
# note: size of "feature_emb" is [batch_size x feature_len x embedding_size]
g_feature = 0
for i in range(self.featurn_len):
    for j in range(self.featurn_len):
        if j <= i: continue
        cross_vec = feature_emb[:, i, :] * feature_emb[:, j, :]
        g_feature += torch.sum(cross_vec, dim=1)
You can
cross_vec = (feature_emb[:, None, ...] * feature_emb[..., None, :]).sum(dim=-1)
This should give you cross_vec of shape (batch_size, feature_len, feature_len).
Alternatively, you can use torch.bmm
cross_vec = torch.bmm(feature_emb, feature_emb.transpose(1, 2))
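If you then need the asker's g_feature, i.e. the sum over only the pairs with j > i, one option is to mask the strict upper triangle of that matrix. This is a sketch, assuming feature_emb has shape (batch_size, feature_len, embedding_size) and that the sizes below are made up:

import torch

batch_size, feature_len, embedding_size = 4, 3, 8
feature_emb = torch.randn(batch_size, feature_len, embedding_size)

# All pairwise dot products: shape (batch_size, feature_len, feature_len)
cross_vec = torch.bmm(feature_emb, feature_emb.transpose(1, 2))

# Keep only pairs with j > i (strict upper triangle), then sum them per sample
mask = torch.triu(torch.ones(feature_len, feature_len), diagonal=1).bool()
g_feature = cross_vec[:, mask].sum(dim=1)          # shape: (batch_size,)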
For a classification problem using BernoulliNB, how do I calculate the joint log-likelihood? The joint likelihood is to be calculated by the formula below, where y(d) is the array of actual outputs (not predicted values) and x(d) is the data set of features.
I read this answer and read the documentation, but it didn't exactly serve my purpose. Can somebody please help?
Looking at the code, there is a hidden, undocumented _joint_log_likelihood(self, X) method in BernoulliNB which computes the joint log-likelihood. Its implementation is somewhat consistent with what you ask.
The solution is then to pick, for each sample, the log-likelihood of its actual class y(d): if the label is True, take column [1] of the returned array (data[idx][1]), otherwise take column [0] (data[idx][0]).
The first block of code below fits a model and calls _joint_log_likelihood.
The second block defines a count helper that performs this per-sample selection and sums the result.
The third block applies it over several alpha values on a Bernoulli Naive Bayes dataset.
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import BernoulliNB
import numpy as np

# Xs, ys, alphas, r, train_jil and test_jil come from the surrounding notebook
train, test, train_labels, test_labels = train_test_split(Xs[0], ys[0],
                                                           test_size=1./3, random_state=r)
naive = BernoulliNB(alpha=10**-7)
model = naive.fit(train, train_labels)
joint_log_train = model._joint_log_likelihood(train)
l = [np.append(x, y) for x, y in zip(train, train_labels)]

def count(data, label):
    # Sum the joint log-likelihood of the true class over all samples
    x = 0
    for idx, l in enumerate(label):
        if l == True:
            x += data[idx][1]
        else:
            x += data[idx][0]
    return x

# Write your code below this line.
for i, (x, y) in enumerate(zip(Xs, ys)):
    train, test, train_labels, test_labels = train_test_split(x, y, test_size=1./3, random_state=r)
    for j, a in enumerate(alphas):
        naive = BernoulliNB(alpha=a)
        model = naive.fit(train, train_labels)
        joint_log_train = model._joint_log_likelihood(train)
        joint_log_test = model._joint_log_likelihood(test)
        train_jil[i][j] = count(joint_log_train, train_labels)
        test_jil[i][j] = count(joint_log_test, test_labels)
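As an aside, the per-sample selection inside count can also be done without the Python loop. A sketch, assuming the labels are boolean or 0/1 (the helper name is made up):

import numpy as np

def count_vectorized(jll, labels):
    # jll has shape (n_samples, 2); take column 1 where the label is True, column 0 otherwise
    idx = np.asarray(labels).astype(int)
    return jll[np.arange(len(idx)), idx].sum()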
I am new to ML and am trying my hand at linear regression. I am using this dataset. The data and my "optimized" model look like this:
I am modifying the data like this:
X = np.vstack((np.ones((X.size)),X,X**2))
Y = np.log10 (Y)
# have tried roots of Y and a degree-3 feature as well
Initial cost: 0.8086672720475084
Optimized cost: 0.7282965408177141
I am unable to optimize further no matter the number of runs, and increasing the learning rate causes the cost to increase.
The rest of my algorithm is fine, as I am able to optimize a simpler dataset (shown below).
Sorry if this is something basic, but I can't seem to find a way to optimize my model for the original data.
EDIT:
Please have a look at my code; I don't know why it's not working.
def GradientDescent(X, Y, theta, alpha):
    m = X.shape[1]                     # number of training examples
    h = Predict(X, theta)
    gradient = np.dot(X, (h - Y))
    gradient.shape = (gradient.size, 1)
    gradient = gradient / m
    theta = theta - alpha * gradient
    cost = CostFunction(X, Y, theta)
    return theta, cost

def CostFunction(X, Y, theta):
    m = X.shape[1]
    h = Predict(X, theta)
    cost = h - Y
    cost = np.sum(np.square(cost)) / (2 * m)
    return cost

def Predict(X, theta):
    h = np.transpose(X).dot(theta)
    return h
X has shape (2, 333) and Y has shape (333, 1).
I tried debugging it again but I can't find the problem. Please help me.
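To sanity-check that the shapes fit together, here is a minimal sketch that runs the three functions above on synthetic data with the shapes you describe (X of shape (2, n), Y of shape (n, 1), theta of shape (2, 1)); the data, iteration count, and learning rate are made up:

import numpy as np

np.random.seed(0)
n = 333
X = np.vstack((np.ones(n), np.random.rand(n)))   # shape (2, 333): bias row plus one feature
Y = (3 + 2 * X[1]).reshape(n, 1)                 # shape (333, 1): synthetic linear target
theta = np.zeros((2, 1))

for _ in range(5000):
    theta, cost = GradientDescent(X, Y, theta, alpha=0.1)
print(theta.ravel(), cost)   # should approach [3, 2] with a cost near 0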
I am trying to implement a custom Keras objective function from 'Direct Intrinsics: Learning Albedo-Shading Decomposition by Convolutional Regression' by Narihira et al.
The loss is the sum of equations (4) and (6) of the paper, where Y* is the ground truth, Y is the prediction map, and y = Y* - Y.
This is my code:
def custom_objective(y_true, y_pred):
    # Eq. (4): scale-invariant L2 loss
    y = y_true - y_pred
    h = 0.5  # lambda
    term1 = K.mean(K.sum(K.square(y)))
    term2 = K.square(K.mean(K.sum(y)))
    sca = term1 - h * term2
    # Eq. (6): gradient L2 loss
    gra = K.mean(K.sum((K.square(K.gradients(K.sum(y[:, 1]), y)) + K.square(K.gradients(K.sum(y[1, :]), y)))))
    return (sca + gra)
However, I suspect that the equation (6) is not correctly implemented because the results are not good. Am I computing this right?
Thank you!
Edit:
I am trying to approximate (6) by convolving with Prewitt filters. It works when my input is a chunk of images, i.e. y[batch_size, channels, rows, cols], but not with y_true and y_pred (which are of type TensorType(float32, 4D)).
My code:
def cconv(image, g_kernel, batch_size):
    g_kernel = theano.shared(g_kernel)
    M = T.dtensor3()
    conv = theano.function(
        inputs=[M],
        outputs=conv2d(M, g_kernel, border_mode='full'),
    )
    accum = 0
    for curr_batch in range(batch_size):
        accum = accum + conv(image[curr_batch])
    return accum / batch_size

def gradient_loss(y_true, y_pred):
    y = y_true - y_pred
    batch_size = 40
    # Direction i
    pw_x = np.array([[-1, 0, 1], [-1, 0, 1], [-1, 0, 1]]).astype(np.float64)
    g_x = cconv(y, pw_x, batch_size)
    # Direction j
    pw_y = np.array([[-1, -1, -1], [0, 0, 0], [1, 1, 1]]).astype(np.float64)
    g_y = cconv(y, pw_y, batch_size)
    gra_l2_loss = K.mean(K.square(g_x) + K.square(g_y))
    return gra_l2_loss
The crash is produced in:
accum = accum + conv(image[curr_batch])
...and the error description is the following:
TypeError: ('Bad input argument to theano function with name "custom_models.py:836" at index 0 (0-based)', 'Expected an array-like object, but found a Variable: maybe you are trying to call a function on a (possibly shared) variable instead of a numeric array?')
How can I use y (y_true - y_pred) as a numpy array, or how can I solve this issue?
SIL2 (scale-invariant L2)
term1 = K.mean(K.square(y))
term2 = K.square(K.mean(y))
[...]
One mistake spread across the code: wherever you see (1/n * sum()) in the equations, it is a mean, not the mean of a sum.
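Putting that correction together, the scale-invariant part of the loss would look something like this (a sketch only; h = 0.5 is the lambda value from your snippet):

from keras import backend as K

def scale_invariant_l2(y_true, y_pred):
    y = y_true - y_pred
    h = 0.5                          # lambda, as in the original code
    term1 = K.mean(K.square(y))      # (1/n) * sum(y_i^2): a mean, not the mean of a sum
    term2 = K.square(K.mean(y))      # ((1/n) * sum(y_i))^2
    return term1 - h * term2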
Gradient
After reading your comment and giving it more thought, I think there is a confusion about the gradient. At least I got confused.
There are two ways of interpreting the gradient symbol:
1. The gradient of a vector, where y is differentiated with respect to the parameters of your model (usually the weights of the neural net). In previous edits I started writing in this direction because that is the sort of gradient used to train the model (e.g. gradient descent), but I think that was wrong here.
2. The pixel-intensity gradient in a picture, as you mentioned in your comment: the difference between each pixel and its neighbour in each direction. In that case you have to translate the example you gave into Keras, as sketched below.
To sum up, K.gradients() and numpy.gradient() are not used in the same way, because numpy implicitly treats (i, j) (the row and column indices) as the two input variables, whereas when you feed a 2D image to a neural net, every single pixel is an input variable. I hope that's clear.
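If interpretation 2 is what you want, one way to express the pixel-intensity gradient directly in the Keras backend is with finite differences along the row and column axes, instead of calling a compiled Theano function inside the loss. This is only a sketch, assuming y_true and y_pred are 4D tensors laid out as (batch, channels, rows, cols):

from keras import backend as K

def gradient_l2_loss(y_true, y_pred):
    y = y_true - y_pred
    # Finite differences between neighbouring pixels (a simple stand-in for the Prewitt filters)
    dy_rows = y[:, :, 1:, :] - y[:, :, :-1, :]   # gradient along the row (i) direction
    dy_cols = y[:, :, :, 1:] - y[:, :, :, :-1]   # gradient along the column (j) direction
    return K.mean(K.square(dy_rows)) + K.mean(K.square(dy_cols))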